From the NY Times: “The warehouse of words makes it possible to analyze cultural influences statistically in a way previously not possible. Cultural references tend to appear in print much less frequently than everyday words, said Mr. Michel, whose expertise is in applied math and systems biology. An accurate picture needs a huge sample. Checking if “sasquatch” has infiltrated the culture requires a supply of at least a billion words a year, he said.”
“So far, Google has scanned more than 11 percent of the entire corpus of published books, about two trillion words. The data analyzed in the paper contains about 4 percent of the corpus.”
So go to it!