WARNING!

This website contains spoilers for Andy Weir’s Project Hail Mary.
It is recommended you read the book before exploring this site.

#15

The top ten most common words in the novel are:

the: 8,131
I: 6,750
to: 4,001
a: 3,793
it: 3,241
and: 3,035
of: 2,882
in: 1,830
that: 1,705
is: 1,670

These counts rely on the tokenization performed by spaCy, an NLP library.

Note that under spaCy’s tokenization, the count for “is” does not include the contracted form “’s”. The count for “that” does include occurrences where it forms part of “that’s”.
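
To illustrate why the two counts behave differently, here is a minimal sketch using only the Python standard library. It is not spaCy and not the code used to produce the figures above: the regular expression, the function names tokenize and word_counts, the sample sentence, and the choice to lowercase every token are all assumptions made for illustration. It only mimics the one behavior described here, namely that a contraction such as “that’s” is split into “that” and “’s”.

```python
import re
from collections import Counter

# Hypothetical stand-in tokenizer (NOT spaCy). It emits, in order of preference:
#   1. an apostrophe followed by "s" (the contracted form), as its own token
#   2. a run of ASCII letters
#   3. any other single non-space character (punctuation, digits)
TOKEN_RE = re.compile(r"[’']s\b|[A-Za-z]+|[^\sA-Za-z]")


def tokenize(text):
    """Split text into tokens, separating a trailing 's from its host word."""
    return TOKEN_RE.findall(text)


def word_counts(text):
    """Count tokens case-insensitively (lowercasing is an assumption)."""
    return Counter(token.lower() for token in tokenize(text))


# Made-up sample text, not a quotation from the novel.
sample = "That’s odd. Is that what it is? It’s what that is."
counts = word_counts(sample)

# "that" is counted for "That’s" as well as for the two bare uses,
# while the "’s" tokens are tallied separately from "is".
print(counts["that"], counts["is"], counts["’s"])
```

In this sketch the host word of a contraction is counted under its own entry, while the contracted “’s” gets a separate entry and is never merged into “is”. That mirrors the note above.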

We may refine these counts as manual corrections are made to the automated tokenization.