#74 VℓV
Chapter 13 Facts
Number of paragraphs: 249
Number of sentences: 667
Number of tokens: 5,252
Number of unique tokens: 1,340
Number of speakers: 6
Grace : 1125 tokens
Dr. Lokken : 996 tokens
Bob Redell : 994 tokens
Stratt : 279 tokens
Rocky : 114 tokens
Easton : 101 tokens
Direct speech: 68.74% of tokens
Earth: 2 sections; 84.54% of tokens
Space: 1 section; 15.46% of tokens
Words unusually frequent for Earth sections:
neutrino, proton, New, blackpanel, particle.
Words unusually infrequent or lacking for Earth sections:
Venus, the, who, DuBois, kid.
Words unusually frequent for Space sections:
sick, cell, very, hydrogen, die.
Words unusually infrequent or lacking for Space sections:
it, just, from, one, as.
For the sentence count, segmentation was performed using spaCy. Tokenization is based only on whitespace, em-dash, en-dash, and ellipsis delimiters. Unique tokens are counted case-insensitively.
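The tokenization rule above can be sketched roughly as follows. This is a minimal illustration, not the script actually used for these counts: the names `tokenize` and `unique_tokens` are hypothetical, and treating a run of three or more periods as an ellipsis is an assumption.

```python
import re

# Delimiters named in the note: whitespace, em-dash (U+2014), en-dash (U+2013),
# and the ellipsis character (U+2026). Treating three or more periods as an
# ellipsis is an assumption made for this sketch.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3,}")


def tokenize(text):
    """Split text on the delimiters above and drop empty strings."""
    return [tok for tok in _DELIMITERS.split(text) if tok]


def unique_tokens(tokens):
    """Case-insensitive set of tokens."""
    return {tok.lower() for tok in tokens}


# Made-up sample sentence, not taken from the chapter.
sample = "The probe\u2014the Probe\u2026 works"
tokens = tokenize(sample)
print(len(tokens), len(unique_tokens(tokens)))
```

Because only these delimiters split the text, punctuation such as commas and question marks stays attached to the neighboring word under this sketch.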
Speaker identification was done manually.
Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).
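For readers unfamiliar with log-likelihood keyness, the sketch below shows one common form of it: the two-corpus G2 statistic used in corpus linguistics. Whether the lists above were computed with exactly this formula is an assumption. The function names, the sign convention (positive means over-represented in the target sections), and the counts are all hypothetical, and the lemma counts are taken as given rather than produced by spaCy here.

```python
import math
from collections import Counter


def signed_g2(a, b, c, d):
    """Two-corpus log-likelihood (G2) for one lemma.

    a, b: frequency of the lemma in the target and reference sections.
    c, d: total lemma counts of the target and reference sections.
    The sign is an illustrative convention: positive when the lemma is
    relatively more frequent in the target, negative otherwise.
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    g2 *= 2
    return g2 if a / c >= b / d else -g2


def keywords(target, reference, top=5):
    """Lemmas ranked from most over-represented in target downward."""
    c = sum(target.values())
    d = sum(reference.values())
    vocab = set(target) | set(reference)
    scores = {w: signed_g2(target[w], reference[w], c, d) for w in vocab}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Hypothetical lemma counts, not the chapter's real data.
earth = Counter({"neutrino": 8, "the": 40, "ship": 2})
space = Counter({"the": 50, "ship": 10})
print(keywords(earth, space, top=1))
```

Reading the same ranking from the bottom gives the unusually infrequent lemmas for the target sections.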