#104 VV̶V
Chapter 18 Facts
Number of paragraphs: 217
Number of sentences: 664
Number of tokens: 4,988
Number of unique tokens: 1,337
Number of speakers: 8
Steve Hatch : 981 tokens
Grace : 714 tokens
Rocky : 434 tokens
BBC Reporter : 78 tokens
Ilyukhina : 59 tokens
DuBois : 57 tokens
Bob Redell : 8 tokens
Yáo : 6 tokens
Direct speech: 46.85% of tokens
Earth: 2 sections; 49.10% of tokens
Space: 2 sections; 50.90% of tokens
Words unusually frequent for Earth sections:
he, office, Beatles, Pete, beetle.
Words unusually infrequent or lacking for Earth sections:
Stratt, she, very, put, sun.
Words unusually frequent for Space sections:
altitude, predator, much, ocean, cloud.
Words unusually infrequent or lacking for Space sections:
I, now, Taumoeba, want.
For the sentence count, segmentation was performed using spaCy. Tokenization splits only on whitespace, em-dash, en-dash, and ellipsis delimiters. Unique tokens are counted case-insensitively.
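The tokenization rule can be sketched roughly as follows. This is a minimal illustration, not the script used to produce the figures above: the names tokenize and count_tokens are hypothetical, and treating three consecutive periods as an ellipsis (in addition to the single ellipsis character) is an assumption.

```python
import re

# Delimiters named in the note: whitespace, em-dash (U+2014), en-dash (U+2013),
# and ellipsis (U+2026). Treating "..." as an ellipsis too is an assumption.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3}")


def tokenize(text):
    """Split text on the delimiters and drop empty strings."""
    return [tok for tok in _DELIMITERS.split(text) if tok]


def count_tokens(text):
    """Return (token count, case-insensitive unique token count)."""
    tokens = tokenize(text)
    unique = {tok.lower() for tok in tokens}
    return len(tokens), len(unique)
```

Under this rule, punctuation other than the listed delimiters stays attached to its word, so "now." and "now" would count as different unique tokens.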
Speaker identification was done manually.
Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).
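The note does not say which log-likelihood formulation was used. As a sketch, assuming the common two-corpus G2 statistic (observed versus expected counts in a target and a reference set of sections), the ranking could look like the code below. The function names are hypothetical, the inputs are assumed to be non-empty lemma counts already produced by a lemmatizer, and the sign convention (positive for unusually frequent, negative for unusually infrequent) is a choice made here for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, total_a, total_b):
    """G2 for a lemma seen a times in the target and b times in the reference."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_a)
    if b > 0:
        g2 += b * math.log(b / expected_b)
    return 2.0 * g2


def keyness(target, reference):
    """Signed G2 per lemma: positive if relatively more frequent in target."""
    total_t = sum(target.values())
    total_r = sum(reference.values())
    scores = {}
    for lemma in set(target) | set(reference):
        a, b = target[lemma], reference[lemma]
        g2 = log_likelihood(a, b, total_t, total_r)
        overused = a / total_t > b / total_r
        scores[lemma] = g2 if overused else -g2
    return scores
```

Sorting the scores in descending order would put candidates for the "unusually frequent" lists first and candidates for the "unusually infrequent or lacking" lists last.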