Culturomics and Google’s Ngram Viewer: More Noise?

The other day, a few minutes of wilfing led us to Technium’s post on Google’s latest project, the Ngram Viewer. Is Google making us stupid again? But this is serious stuff, as evidenced by the Ngram Viewer introduction in last December’s Science. The Ngram Viewer is a tool that lets users search for keywords in a corpus of millions of books and plot the results quantitatively. So what? A TED video helps explain the development and potential uses. Commentary on the Science article, and on the claims made in the TED video, questions the usefulness of the Google project.
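What does it mean to plot a keyword quantitatively? As we understand it, the Viewer charts, year by year, a word’s share of all the words printed that year. Here is a minimal sketch of that idea in Python, not Google’s code. The four-line “corpus,” the years attached to it, and the function names yearly_share and first_overtake are all our own inventions for illustration; the sentences were written so that one word overtakes the other, and they say nothing about real books.

```python
from collections import Counter

# Toy corpus: (year, text) pairs. Entirely invented for illustration;
# these are NOT real Ngram Viewer data.
TOY_BOOKS = [
    (1959, "silence fell and silence stayed while noise waited"),
    (1960, "silence and noise and silence again"),
    (1961, "noise and more noise then silence"),
    (1962, "noise noise noise and a little silence"),
]


def yearly_share(books, word):
    """Return {year: occurrences of word / total words printed that year}."""
    hits, totals = Counter(), Counter()
    for year, text in books:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def first_overtake(books, rising, falling):
    """Return the first year `rising` has a larger share than `falling`, else None."""
    up = yearly_share(books, rising)
    down = yearly_share(books, falling)
    for year in sorted(up):
        if up[year] > down[year]:
            return year
    return None


if __name__ == "__main__":
    print(yearly_share(TOY_BOOKS, "noise"))
    print(first_overtake(TOY_BOOKS, "noise", "silence"))
```

Dividing by the year’s total is the point: more books are printed in some years than in others, so raw counts alone would mostly chart the growth of publishing.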

Is the Ngram Viewer an electronic Tower of Babel? We’re not sure; what are its implications, its practical uses? It appears to be an interesting cultural anthropological tool. The corpus contains “over 500 billion words,” and “cannot be read by a human.” But anyone can access it at the Culturomics site. In the Science paper, “Quantitative Analysis of Culture Using Millions of Digitized Books,” the authors provide this takeaway: “Cultural change guides the concepts we discuss (such as ‘slavery’). Linguistic change – which, of course, has cultural roots – affects the words we use for those concepts (‘the Great War’ vs. ‘World War I’). In this paper, we will examine both linguistic changes, such as changes in the lexicon and grammar; and cultural phenomena, such as how we remember people and events.”

Closing the paper is a concise definition of culturomics with a touching comment on its limitations: “Culturomics is the application of high-throughput data collection and analysis to the study of human culture. Books are a beginning, but we must also incorporate newspapers (29), manuscripts (30), maps (31), artwork (32), and a myriad of other human creations (33, 34). Of course, many voices – already lost to time – lie forever beyond our reach.” (Not to mention the trunk of writing, molding in our basement for over twenty years, that we finally threw out – the poems were beginning to crawl out of the trunk, climb up the basement stairs, and haunt our dreams.) The Science paper concludes with examples of how culturomics might be used as “a new type of evidence in the humanities.” Yet some of the paper’s conclusions seem obvious: “People, too, rise to prominence, only to be forgotten.” Surely, that “One generation passeth away, and another generation cometh” is not a new concept. But their discussion of the impact of censorship is interesting. In any case, the humanities currently need all the help they can get.

We played around a bit with the Ngram Viewer. In one experiment we plotted “silence” against “noise,” and found that noise overtook silence around 1961, even though 1961 is the year Wesleyan first published Silence, by John Cage. Cage would have enjoyed the Ngram Viewer. Our Ngram Viewer chart plotting silence and noise is shown below:

All Stung Over By Links of Googled Grace

We are stung by it, in Flannery O’Connor’s world, where grace is a holy bee attracted to the colors of the soul’s peacock-like feathers, or we are brushed by a mere grace singing like a wind, stirring Wallace Stevens’s “gold-feathered bird” in “The palm at the end of the mind”; its “fire-fangled feathers dangle down,” and we become grace when we are satisfied to merely be. In any case, we cannot know if grace will, like Portia’s mercy, “droppeth as the gentle rain from heaven,” or if grace, like Flannery’s wooden leg, will smack us between the eyes as we roll casually under a mellow blue wave.

So it seemed when we were close to rest last evening, checking our Gmail, and noticed, in the sidebar, links, to ads, whose words appeared pulled directly from our text. After a few clicks, we got to the bottom of this, for Google explains: “Ad targeting in Gmail is fully automated, and no humans read your email in order to target advertisements or related information.” As if we should be comforted by the fact that no humans read our email; it’s not the humans we are worried about, we thought, and thought again of Richard Brautigan’s (1967) “All Watched Over by Machines of Loving Grace.” We are living with the machines now, their grace as palpable as bees whose dance would show us the way to an immortal light, which is to say a mere mortal light, but which might be enough to light us a new path to an old palm.

Gaggle Me-Researcher Project Spilled on WankiLeaks

Gaggle, a new Internet start-up whose IPO and purpose have been double-secret rumors for months (it’s not yet clear if Gaggle portends a new great vowel shift or if there’s a schism in the works), has just had its cover blown by WankiLeaks, the surreptitious, hole-and-corner whistle-blowing site.

According to the story just leaked, Gaggle’s primary project is called “Gaggle Me-Researcher.” You enter your information in the Gaggle Me-Researcher tool, and it reveals “thyself,” which you can then come to know.

Using a kind of sic et non computer code, Gaggle Me-Researcher collects all the data from your computer, from your email, your social networking sites, your documents, your Excel files, your photos – any program, file, or folder beginning with “My.” It also collects all of the data from all of your friends’ computers, from anyone ever connected to your computer in any way, including spammers – the information, the data, from anything you’ve ever touched using your computer. Gaggle Me-Researcher then compiles a comprehensive profile of you, called “Meself.”

It’s not yet clear how Gaggle Me-Researcher collects your DNA, but it apparently does. This allows Gaggle Me-Researcher to trace the individual user’s “self-data” all the way back to Mitochondrial Eve or Y-chromosomal Adam. Thus, ultimately, according to Professor Emeritus Stephen Jama of South Santa Monica Bay College, insider consultant to the Gaggle Me-Researcher project, to “know thyself” is to know everyone else.

According to WankiLeaks, Gaggle’s introductory offer will include the catchy claim, “It’s never been easier to ‘know thyself.’”

“Ask not for whom the whistle blows,” Professor Jama concluded, somewhat cryptically, “it blows for thee.”