#90 Vλℓ
Chapter 16 Facts
Number of paragraphs: 211
Number of sentences: 596
Number of tokens: 4,642
Number of unique tokens: 1,342
Number of speakers: 7
Grace : 649 tokens
Rocky : 309 tokens
Stratt : 304 tokens
DuBois : 141 tokens
Yáo : 25 tokens
Dimitri : 17 tokens
Ilyukhina : 10 tokens
Direct speech: 31.32% of tokens
Space: 2 sections; 70.40% of tokens
Earth: 1 section; 29.60% of tokens
Words unusually frequent for Earth sections:
woman, astronaut, candidate, DuBois, Russian.
Words unusually infrequent or lacking for Earth sections:
light, work, back, see, that.
Words unusually frequent for Space sections:
he, his, name, mate, Rocky.
Words unusually infrequent or lacking for Space sections:
they, Taumoeba, it, wall, the.
For the sentence count, segmentation was performed using spaCy. Tokens are split only on whitespace, em dashes, en dashes, and ellipses, so punctuation stays attached to the neighboring word. Unique tokens are counted case-insensitively.
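The tokenization rule above can be sketched in a few lines of Python. This is only an illustration of the stated delimiters, not the script actually used for these counts; the function names `tokenize` and `count_tokens` are hypothetical, and treating three ASCII periods as an ellipsis is an assumption.

```python
import re

# Delimiters named in the note: whitespace, em dash (U+2014),
# en dash (U+2013), and ellipsis (U+2026).
# Assumption: three ASCII periods in a row also count as an ellipsis.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3}")


def tokenize(text):
    """Split text on the delimiters and drop empty pieces."""
    return [piece for piece in _DELIMITERS.split(text) if piece]


def count_tokens(text):
    """Return (number of tokens, number of case-insensitive unique tokens)."""
    tokens = tokenize(text)
    unique = {token.lower() for token in tokens}
    return len(tokens), len(unique)
```

Because other punctuation is not a delimiter, "Good." and "Good" would count as two different unique tokens under this sketch.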
Speaker identification was done manually.
Unusually frequent or infrequent words are identified by the log-likelihood of their lemmas (lemmatization by spaCy).
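For readers curious how a log-likelihood comparison of this kind can work, here is a minimal sketch. It assumes the common two-corpus log-likelihood (G2) statistic, which may differ from the exact formula used for the lists above; the function name `signed_log_likelihood` and the sign convention are hypothetical choices made for illustration.

```python
import math


def signed_log_likelihood(count_a, count_b, total_a, total_b):
    """Two-corpus log-likelihood (G2) for one lemma.

    count_a, count_b: occurrences of the lemma in section groups A and B.
    total_a, total_b: total number of tokens in A and B (both must be > 0).
    Assumption: the result is made positive when the lemma is relatively
    more frequent in A and negative when it is more frequent in B.
    """
    combined = count_a + count_b
    # Expected counts if the lemma were equally likely in both groups.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)

    g2 = 0.0
    if count_a > 0:
        g2 += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        g2 += count_b * math.log(count_b / expected_b)
    g2 *= 2

    more_frequent_in_a = count_a / total_a >= count_b / total_b
    return g2 if more_frequent_in_a else -g2
```

Under this sketch, sorting the lemmas by score would put the words unusually frequent for group A at the top and the words unusually infrequent or lacking for A at the bottom.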