#79 VII
Chapter 14 Facts
Number of paragraphs: 241
Number of sentences: 664
Number of tokens: 5,434
Number of unique tokens: 1,518
Number of speakers: 9
Leclerc : 989 tokens
Stratt : 505 tokens
Grace : 472 tokens
Rocky : 329 tokens
Dr. Lokken : 7 tokens
Destroyer One : 1 token
Destroyer Two : 1 token
Submarine One : 1 token
Submarine Two : 1 token
Direct speech: 42.45% of tokens
Earth: 2 sections; 42.79% of tokens
Space: 2 sections; 57.21% of tokens
Words unusually frequent for Earth sections:
Leclerc, methane, nineteen, he, climate.
Words unusually infrequent or lacking for Earth sections:
they, Astrophage, I, light, their.
Words unusually frequent for Space sections:
field, radiation, magnetic, Eridians, Eridian.
Words unusually infrequent or lacking for Space sections:
I, I’m, the, my, Taumoeba.
For the sentence count, segmentation was performed using spaCy. Tokenization is based only on whitespace, em-dash, en-dash, and ellipsis delimiters, so punctuation such as commas and periods stays attached to the adjacent token. Unique tokens are counted case-insensitively.
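The tokenization rule described above can be sketched roughly as follows. This is an illustration, not the script actually used for these counts: the function names `tokenize` and `unique_token_count` are hypothetical, and treating a run of three or more ASCII dots as an ellipsis is an assumption.

```python
import re

# Delimiters named in the note: whitespace, em dash (U+2014),
# en dash (U+2013), and the ellipsis character (U+2026).
# Assumption: "..." (three or more ASCII dots) also counts as an ellipsis.
_DELIMITERS = re.compile(r"(?:\s|\u2014|\u2013|\u2026|\.{3,})+")


def tokenize(text):
    """Split text on the delimiters and drop empty strings."""
    return [tok for tok in _DELIMITERS.split(text) if tok]


def unique_token_count(tokens):
    """Count distinct tokens, ignoring case."""
    return len({tok.lower() for tok in tokens})
```

Under this rule, other punctuation is not a delimiter, so "yes." and "yes" would count as two different tokens.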
Speaker identification was done manually.
Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).
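A log-likelihood keyness score can be sketched as below. The note does not say which formulation or reference corpus was used, so this assumes the widely used two-corpus G2 statistic, with the sign added to separate overuse from underuse. The function name and its parameters are hypothetical.

```python
import math


def signed_log_likelihood(count_target, count_ref, total_target, total_ref):
    """Two-corpus log-likelihood (G2) for one lemma.

    count_target / count_ref: occurrences of the lemma in each corpus.
    total_target / total_ref: total token counts of each corpus.
    Assumption: the result is positive when the lemma is relatively more
    frequent in the target corpus, negative when it is less frequent.
    """
    combined = count_target + count_ref
    total = total_target + total_ref
    # Expected counts if the lemma were equally frequent in both corpora.
    expected_target = total_target * combined / total
    expected_ref = total_ref * combined / total

    g2 = 0.0
    if count_target > 0:
        g2 += count_target * math.log(count_target / expected_target)
    if count_ref > 0:
        g2 += count_ref * math.log(count_ref / expected_ref)
    g2 *= 2

    overused = count_target / total_target >= count_ref / total_ref
    return g2 if overused else -g2
```

Ranking lemmas by this score would put candidates for "unusually frequent" at the top and "unusually infrequent or lacking" at the bottom. A lemma that never appears in the target sections still gets a score, because only the nonzero term contributes.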