The Cuckoo’s Calling

In 2013 the Sunday Times ‘outed’ J.K. Rowling as the author of the detective novel “The Cuckoo’s Calling”, published under her nom de plume Robert Galbraith. It did so partly with the help of a technique called ‘forensic stylometry’. Stylometry is the study of linguistic style. It is usually applied to written language, but it has also been used successfully on music and fine-art paintings, and it is often used to attribute authorship to anonymous or disputed documents. In this case journalists used it to show that the author of the world-famous Harry Potter novels was very likely also the author of the then-new novel “The Cuckoo’s Calling”.


Background: Forensic Stylometry


The basic theory is pretty simple: language is a set of choices, and speakers and writers tend to fall into habitual, or at least common, choices. Some choices come from dialect: for example, an Englishman drives a lorry but an American drives a truck. For any given statement there are myriad ways in which the same meaning could be expressed. However, much of this apparently free variation is actually rather static, at least at the level of the individual. So by studying two or more documents a person has written, a model of the kind of choices that person makes can be built and similarities noted.

Some of the tests that can be performed are:

» The distribution of word lengths.

» The distribution of the 100 most commonly occurring words across the different works. These will mostly be lowly “function words” like prepositions, conjunctions, and articles. Even if you try to mask your usual writing style by choosing different vocabulary, it’s hard to fake your typical palette of function words.

» “Character 4-grams”: every string of four adjacent characters in the text is analysed, excluding the spaces between words. Although this seems like a rather “dumb” feature to examine, especially as it doesn’t even take word boundaries into account, some recent studies have shown that it can identify authorship with surprising accuracy.

» The frequency of pairs of adjacent words.
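To make the four tests above a little more concrete, here is a minimal sketch in Python of how each feature might be counted. It is an illustration only, not the software used in the study: the function names, the very simple tokenizer, and the choice to lowercase everything are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed, crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def relative_frequencies(counts):
    """Turn raw counts into proportions that sum to 1."""
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()} if total else {}


def word_length_distribution(text):
    """Proportion of words of each length."""
    return relative_frequencies(Counter(len(w) for w in tokenize(text)))


def most_common_words(text, n=100):
    """Relative frequency of the n most common words (mostly function words)."""
    words = tokenize(text)
    total = len(words)
    return {w: c / total for w, c in Counter(words).most_common(n)}


def character_ngrams(text, n=4):
    """Counts of every run of n adjacent characters, with whitespace removed first."""
    chars = re.sub(r"\s+", "", text.lower())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def word_bigrams(text):
    """Counts of each pair of adjacent words."""
    words = tokenize(text)
    return Counter(zip(words, words[1:]))
```

Each function returns a dictionary-like profile of one document. Profiles of two documents can then be compared to see how alike the writing habits behind them are.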


The Study of “The Cuckoo’s Calling”


The catalyst for the investigation was an anonymous tip, received by the Sunday Times via Twitter, that Galbraith was the pen name of J.K. Rowling. A Sunday Times journalist then approached two academics who had developed software specifically to examine questions of authorship: Peter Millican, a teacher of philosophy and computing at Oxford University, and Patrick Juola, a computer science professor at Duquesne University in Pittsburgh.

They were given machine-readable texts of “The Cuckoo’s Calling” and Rowling’s previous novel, “The Casual Vacancy,” together with novels by three British women who specialize in crime fiction: Ruth Rendell, P.D. James, and Val McDermid.

The analyses didn’t take long to yield an answer: “Cuckoo” was stylistically more similar to “The Casual Vacancy” than it was to the work of any of the three other novelists. When the academics requested an additional book by each of the writers, they found that Rowling’s “Harry Potter and the Deathly Hallows,” despite being in a genre far removed from detective fiction, came in second place, ahead of the six non-Rowling novels they analyzed.
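The kind of comparison described here, ranking candidate books by how stylistically close they are to a disputed text, can be sketched in a few lines of Python. This is not the academics’ actual software or method. It is a simplified illustration that assumes cosine similarity over the relative frequencies of the disputed text’s most common words, and the names `word_profile`, `cosine_similarity` and `rank_candidates` are made up for the example.

```python
import math
import re
from collections import Counter


def word_profile(text):
    """Relative frequency of every word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse frequency vectors (0 to 1)."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def rank_candidates(disputed_text, candidate_texts, top_n=100):
    """Sort candidate labels from most to least similar to the disputed text."""
    disputed = word_profile(disputed_text)
    # Assumption: compare only on the disputed text's top_n most frequent words.
    vocab = sorted(disputed, key=disputed.get, reverse=True)[:top_n]
    target = {w: disputed[w] for w in vocab}
    scores = {}
    for label, text in candidate_texts.items():
        profile = word_profile(text)
        restricted = {w: profile.get(w, 0.0) for w in vocab}
        scores[label] = cosine_similarity(target, restricted)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates(disputed, {"Author A": text_a, "Author B": text_b})` returns the labels ordered by similarity score. As with the real study, a high ranking only suggests a likely author; it does not prove one.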

Nothing in the analysis, however, constituted ‘definitive proof’ of Rowling’s authorship; it merely indicated that she was the likely author. It was certainly enough, though, for the Sunday Times to approach Rowling’s agent and ask, directly, “Did J.K. Rowling write The Cuckoo’s Calling?” Less than a day later, Rowling confirmed through a spokesman that she had indeed written the novel.


For Further Interest…


[Image: Ngram Viewer chart]

Google has a free online app called “Ngram Viewer”. You enter phrases, and it searches through Google Books and displays a graph showing how often those phrases have occurred in a corpus of books (e.g., “British English”, “English Fiction”, “French”) over a selection of years.

You can check it out here >>>.