WARNING!

This website contains spoilers for Andy Weir’s Project Hail Mary.
It is recommended you read the book before exploring this site.

#22 λ+

The definite article “the” occurs 8,131 times in Project Hail Mary.

That’s 4.24% of tokens in the novel overall.

The definite article makes up 5.0% of the tokens from the narrator on Earth and 5.07% in Space.

Among the speakers who say “the” more than 20 times, it accounts for 4.5% of Bob Redell’s spoken tokens, 4.0% of Dr. Lokken’s, 3.7% of DuBois’s, 4.0% of LeClerc’s, and 3.9% of Steve Hatch’s.

In Stratt’s speech it makes up only 3.2%, and in Grace’s it is lower still: 2.4% on Earth and 2.2% in Space.

Only 0.1% of Rocky’s tokens are the definite article, a measure of his broken English on which we will have more to say in a later beanbag.

Here’s a visualization of the relative frequencies across all sections and chapters of the book.

[Chart: relative frequency of “the” by section and chapter, with chapters 1 to 30 marked along the x-axis.]

Each coloured bar represents a section: green for Earth sections and purple for Space sections.

The grey bars represent the chapters, which are marked with ticks and numbers.

The y-axis is the relative frequency, i.e. the proportion of tokens in that section (or chapter) that are “the” (so a bar might be higher, even with fewer occurrences, if the section or chapter is shorter).
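To make the parenthetical concrete, here is a minimal sketch of the relative-frequency calculation. The function name, the case-insensitive matching, and the two toy sections are assumptions for illustration, not the code or data behind the chart.

```python
def relative_frequency(tokens, target="the"):
    """Share of tokens equal to `target`, ignoring case (an assumption)."""
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok.lower() == target)
    return hits / len(tokens)


# Two made-up sections: the long one has more occurrences of "the",
# but the short one has the higher relative frequency.
long_section = ["the"] * 3 + ["filler"] * 97   # 3 of 100 tokens
short_section = ["the"] * 2 + ["filler"] * 18  # 2 of 20 tokens

print(relative_frequency(long_section))   # 0.03
print(relative_frequency(short_section))  # 0.1
```

The shorter section gets the taller bar despite having fewer occurrences, which is the effect described above.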

Tokenization is from spaCy, which treats punctuation marks as separate tokens.
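Counting punctuation as tokens matters because it enlarges the denominator. The sketch below uses a simple regular expression as a rough stand-in for a tokenizer that separates punctuation. It is not spaCy and does not reproduce spaCy's rules (for contractions and possessives, for instance), and the sentence is made up for illustration.

```python
import re


def tokenize(text):
    # Each run of word characters is one token; each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", text)


sentence = "The cat sat on the mat, and the dog barked."
tokens = tokenize(sentence)
words_only = [t for t in tokens if t[0].isalnum()]
the_count = sum(1 for t in tokens if t.lower() == "the")

print(len(tokens), len(words_only), the_count)  # 12 10 3
print(the_count / len(tokens))      # 0.25 with punctuation counted as tokens
print(the_count / len(words_only))  # 0.3 with punctuation excluded
```

With punctuation in the token count, the same three occurrences make up a smaller share, so percentages computed this way will run lower than ones based on words alone.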