#122 λVV
Chapter 21 Facts
Number of paragraphs: 290
Number of sentences: 840
Number of tokens: 6,295
Number of unique tokens: 1,598
Number of speakers: 9
Grace : 1,021 tokens
Rocky : 472 tokens
Stratt : 194 tokens
DuBois : 182 tokens
Ilyukhina : 145 tokens
Yáo : 72 tokens
Dimitri : 26 tokens
Baikonur Radio Voice : 4 tokens
Radio Transport Voice : 1 token
Direct speech: 33.63% of tokens
Space: 4 sections; 60.78% of tokens
Earth: 2 sections; 39.22% of tokens
Words unusually frequent for Earth sections:
heroin, dose, nitrogen, bunker, DuBois.
Words unusually infrequent or lacking for Earth sections:
you, think, energy, he, ask.
Words unusually frequent for Space sections:
intelligence, animal, Threeworld, heal, smart.
Words unusually infrequent or lacking for Space sections:
ship, the, fuel, hull, panel.
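As a quick consistency check, the direct speech share can be recomputed from the per-speaker counts listed above. This is a minimal sketch that assumes the share is simply the sum of all speaker tokens divided by the chapter's total token count; the variable names are illustrative only.

```python
# Per-speaker token counts, copied from the list above.
speaker_tokens = {
    "Grace": 1021,
    "Rocky": 472,
    "Stratt": 194,
    "DuBois": 182,
    "Ilyukhina": 145,
    "Yáo": 72,
    "Dimitri": 26,
    "Baikonur Radio Voice": 4,
    "Radio Transport Voice": 1,
}
total_tokens = 6295  # chapter total, from the list above

# Assumption: direct speech share = spoken tokens / all tokens.
spoken_tokens = sum(speaker_tokens.values())
direct_speech_pct = round(100 * spoken_tokens / total_tokens, 2)

print(spoken_tokens, direct_speech_pct)
```

Under that assumption the result agrees with the 33.63% reported above.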
Sentence counts use spaCy's sentence segmentation. Tokenization splits only on whitespace, em dashes, en dashes, and ellipses. Unique tokens are counted case-insensitively.
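The tokenization rule described above could be sketched roughly as follows. This is an illustration under assumptions, not the script actually used: the function names are hypothetical, an ellipsis is assumed to be either the single ellipsis character or three consecutive periods, and other punctuation is left attached to its token because the notes do not say otherwise.

```python
import re

# Delimiters: runs of whitespace, em dash (U+2014), en dash (U+2013),
# and ellipsis (U+2026 or three periods). The ellipsis forms are an assumption.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3}")

def tokenize(text):
    """Split text on the delimiters and drop empty strings."""
    return [tok for tok in _DELIMITERS.split(text) if tok]

def token_counts(text):
    """Return (number of tokens, number of case-insensitive unique tokens)."""
    tokens = tokenize(text)
    unique = {tok.lower() for tok in tokens}
    return len(tokens), len(unique)

# Hypothetical sample sentence, not taken from the chapter.
sample = "Rocky is smart\u2014very smart\u2026 Yes"
print(tokenize(sample))
print(token_counts(sample))
```

Because punctuation stays attached in this sketch, "smart" and "smart," would count as different unique tokens; whether the original counts behave that way is not stated.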
Speaker identification was done manually.
Unusually frequent or infrequent words are identified by a log-likelihood score computed over lemmas (lemmatization by spaCy).
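For readers unfamiliar with log-likelihood keyness, the sketch below shows one common two-corpus formulation (the G2 statistic often used in corpus linguistics). The notes only say "log-likelihood of lemmas", so the exact formula, the sign convention, and the function names here are assumptions, and the counts in the example are made up.

```python
import math

def log_likelihood(a, b, c, d):
    """G2 score for one lemma.

    a: occurrences in the target sections
    b: occurrences in the reference sections
    c: total lemma count of the target sections
    d: total lemma count of the reference sections
    """
    # Expected counts if the lemma were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / e1)
    if b > 0:
        score += b * math.log(b / e2)
    return 2 * score

def signed_keyness(a, b, c, d):
    """Positive if the lemma is overused in the target, negative if underused."""
    score = log_likelihood(a, b, c, d)
    return score if a / c >= b / d else -score

# Hypothetical counts: a lemma seen 5 times in 100 target lemmas
# and never in 100 reference lemmas.
print(signed_keyness(5, 0, 100, 100))
# The mirror case is reported as underuse (negative score).
print(signed_keyness(0, 5, 100, 100))
```

Ranking lemmas by the signed score would give "unusually frequent" words at the top and "unusually infrequent or lacking" words at the bottom, which matches the shape of the lists above.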