#50 IVV
Chapter 09 Facts
Number of paragraphs: 171
Number of sentences: 534
Number of tokens: 4,657
Number of unique tokens: 1,228
Number of speakers: 5
Dimitri : 520 tokens
Grace : 168 tokens
Stratt : 55 tokens
Russian Technician 1 : 12 tokens
Russian Technician 2 : 9 tokens
Direct speech: 16.41% of tokens
Space: 3 sections; 68.69% of tokens
Earth: 1 sections; 31.31% of tokens
Words unusually frequent for Earth sections:
Dimitri, revolver, chamber, thrust, gram.
Words unusually infrequent or lacking for Earth sections:
what, ’re, her, not, as.
Words unusually frequent for Space sections:
hull, burrito, they, tunnel, airlock.
Words unusually infrequent or lacking for Space sections:
he, Rocky, his, Earth, Taumoeba.
For the sentences count, segmentation was performed using spaCy. Tokenization is just based on whitespace, em-dash, en-dash, and ellipsis delimiters. Unique tokens are case-insensitive.
Speaker identification was done manually.
Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).