#68 IV̶V
Chapter 12 Facts
Number of paragraphs: 253
Number of sentences: 596
Number of tokens: 4,577
Number of unique tokens: 1,114
Number of speakers: 3
Grace : 304 tokens
Rocky : 210 tokens
Computer : 14 tokens
Direct speech: 11.47% of tokens
Space: 3 sections; 100.00% of tokens
Words unusually frequent for Space sections:
he, ball, observe, sleep, word.
Words unusually infrequent or lacking for Space sections:
Taumoeba, fuel, hull, light, see.
For the sentence count, segmentation was performed using spaCy. Tokens are split only on whitespace, em-dashes, en-dashes, and ellipses. Unique tokens are counted case-insensitively.
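As an illustration of the tokenization rule above, here is a minimal sketch in Python. It is not the script used to produce these figures. The function names are hypothetical, and two details are assumptions: an ellipsis is taken to be either the single character "…" or a run of three or more periods, and other punctuation stays attached to its token.

```python
import re

# Delimiters named in the note: whitespace, em-dash, en-dash, ellipsis.
# Assumption: an ellipsis is "\u2026" or three or more consecutive periods.
_DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3,}")


def tokenize(text):
    """Split text on the delimiters and drop empty strings."""
    return [tok for tok in _DELIMITERS.split(text) if tok]


def token_counts(text):
    """Return (number of tokens, number of unique tokens).

    Unique tokens are compared case-insensitively. Punctuation other
    than the delimiters stays attached, so "ball" and "ball." differ
    (an assumption of this sketch).
    """
    tokens = tokenize(text)
    unique = {tok.lower() for tok in tokens}
    return len(tokens), len(unique)


if __name__ == "__main__":
    sample = "Wait\u2014what? The ball\u2026 the Ball."
    print(tokenize(sample))
    print(token_counts(sample))
```

Because no punctuation stripping is done, counts from a sketch like this would differ from those of a linguistic tokenizer such as spaCy's.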
Speaker identification was done manually.
Unusually frequent or infrequent words are identified by the log-likelihood of their lemmas (lemmatization by spaCy).
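The notes do not say which log-likelihood formula was used. As a rough sketch, the following assumes the two-term keyness statistic common in corpus linguistics, comparing a lemma's count in the target sections against its count in a reference corpus (for instance, the rest of the book, which is itself an assumption). The function name, its parameters, and the sign convention are hypothetical.

```python
import math


def log_likelihood(count_target, count_ref, total_target, total_ref):
    """Signed log-likelihood (G2) keyness score for one lemma.

    count_target / count_ref: occurrences of the lemma in each corpus.
    total_target / total_ref: total token counts of each corpus.
    The result is positive when the lemma is relatively more frequent
    in the target, negative when it is relatively less frequent.
    """
    combined = count_target + count_ref
    # Expected counts if the lemma were equally frequent in both corpora.
    expected_target = total_target * combined / (total_target + total_ref)
    expected_ref = total_ref * combined / (total_target + total_ref)

    g2 = 0.0
    if count_target > 0:
        g2 += count_target * math.log(count_target / expected_target)
    if count_ref > 0:
        g2 += count_ref * math.log(count_ref / expected_ref)
    g2 *= 2.0

    # Sign convention (an assumption of this sketch): overuse is positive.
    overused = count_target / total_target >= count_ref / total_ref
    return g2 if overused else -g2


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(log_likelihood(20, 5, 1000, 1000))
    print(log_likelihood(5, 20, 1000, 1000))
```

Under this convention, lemmas with large positive scores would be listed as unusually frequent and lemmas with large negative scores as unusually infrequent or lacking.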