#133 λ+I
Chapter 23 Facts
Number of paragraphs: 146
Number of sentences: 497
Number of tokens: 4,408
Number of unique tokens: 1,225
Number of speakers: 7
Stratt : 1,258 tokens
Grace : 643 tokens
Dr. Lokken : 263 tokens
Yáo : 88 tokens
Dimitri : 56 tokens
Rocky : 51 tokens
Ilyukhina : 20 tokens
Direct speech: 53.97% of tokens
Earth: 2 sections; 71.94% of tokens
Space: 1 section; 28.06% of tokens
Words unusually frequent for Earth sections:
you, Cáceres, Meknikov, Yáo, generator.
Words unusually infrequent or lacking for Earth sections:
they, sun, Venus, why, two.
Words unusually frequent for Space sections:
we, Paul, John, 4.5, 1.1.
Words unusually infrequent or lacking for Space sections:
two, some, need, room, around.
For the sentence count, segmentation was performed using spaCy. Tokenization is a simple split on whitespace, em-dash, en-dash, and ellipsis delimiters. Unique tokens are counted case-insensitively.
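The tokenization rule described above can be sketched as follows. This is a minimal illustration, not the script actually used for these counts: the names `tokenize` and `count_unique` are hypothetical, and treating three literal dots as an ellipsis is an assumption. Punctuation other than the listed delimiters stays attached to its token, since the note names no other split points.

```python
import re

# Delimiters named in the note: whitespace, em-dash (U+2014), en-dash (U+2013),
# and ellipsis (U+2026). Matching three literal dots as well is an assumption.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.\.\.")


def tokenize(text):
    """Split text on the delimiters and drop the empty strings left between them."""
    return [tok for tok in _DELIMITERS.split(text) if tok]


def count_unique(tokens):
    """Count distinct tokens, ignoring case."""
    return len({tok.lower() for tok in tokens})


sample = "Wait\u2014what? Oh\u2026 no. No!"
tokens = tokenize(sample)
print(tokens)
print(len(tokens), count_unique(tokens))
```

Under this rule "no." and "No!" count as different unique tokens because their punctuation differs, even though case is ignored.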
Speaker identification was done manually.
Unusually frequent or infrequent words are identified by the log-likelihood of their lemmas (lemmatization by spaCy).
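The note does not say which log-likelihood formulation was used. One common choice for this kind of keyword comparison is the two-corpus log-likelihood (G2) statistic, sketched below under that assumption. The function name `signed_log_likelihood` and the sign convention (positive for over-represented, negative for under-represented) are illustrative choices, not taken from the source.

```python
import math


def signed_log_likelihood(a, b, c, d):
    """Two-corpus log-likelihood (G2) for one lemma.

    a: count of the lemma in the target sections (e.g. Earth)
    b: count of the lemma in the reference sections (e.g. everything else)
    c: total lemma count of the target sections
    d: total lemma count of the reference sections
    """
    # Expected counts if the lemma were spread evenly across both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:  # a zero count contributes nothing to the sum
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    g2 *= 2
    # Assumed convention: negative when the lemma is rarer in the target.
    return g2 if a / c >= b / d else -g2


# Hypothetical counts, for illustration only.
print(signed_log_likelihood(20, 5, 1000, 1000))
print(signed_log_likelihood(0, 12, 1000, 1000))
```

Sorting lemmas by this score would put "unusually frequent" words at the top and "unusually infrequent or lacking" words at the bottom, which matches the two kinds of lists reported for each setting.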