Here's the idea: I feed a large set of science fiction short stories into a pattern recognition algorithm and then see if it can correctly identify the era of origin for a separate set of short stories. The three eras I have in mind are the "Golden Age" (1934-1955), the "New Wave" (1964-1980) and "Post-Cyberpunk" (1990-present). The question is, after I train the algorithm on core sf texts from each of these identifiable eras, would it then be able to correctly place, say, "All You Zombies" as being Golden Age? I've had good luck using this technique to distinguish non-fiction articles from short stories (92% accuracy), and I'd like to expand the approach.
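To make the train-then-test idea concrete, here is a minimal sketch of one possible approach. The post does not say which pattern recognition algorithm is used, so this swaps in a plain bag-of-words multinomial naive Bayes classifier as an assumption. The class name, the helper functions, and the short training snippets are all hypothetical placeholders invented for illustration, not real story text.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesEraClassifier:
    """Bag-of-words multinomial naive Bayes (an assumed stand-in algorithm)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # era -> word -> count
        self.doc_counts = Counter()              # era -> number of training stories
        self.vocab = set()                       # every word seen in training

    def train(self, labeled_stories):
        for text, era in labeled_stories:
            tokens = tokenize(text)
            self.word_counts[era].update(tokens)
            self.doc_counts[era] += 1
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_era, best_score = None, float("-inf")
        for era in self.doc_counts:
            # Log prior: share of training stories that belong to this era.
            score = math.log(self.doc_counts[era] / total_docs)
            # Add-one (Laplace) smoothing so unseen words do not zero out an era.
            denom = sum(self.word_counts[era].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[era][tok] + 1) / denom)
            if score > best_score:
                best_era, best_score = era, score
        return best_era


def accuracy(clf, labeled_stories):
    """Fraction of held-out stories whose predicted era matches the label."""
    correct = sum(1 for text, era in labeled_stories if clf.predict(text) == era)
    return correct / len(labeled_stories)


# Made-up one-line stand-ins for full stories (hypothetical data).
training = [
    ("the rocket ship landed on mars and the atomic engineer repaired the robot", "Golden Age"),
    ("the captain of the rocket fired the atomic blaster at the robot", "Golden Age"),
    ("entropy and inner space dissolved his consciousness in the dying city", "New Wave"),
    ("her consciousness drifted through inner space as the city burned with entropy", "New Wave"),
    ("the uploaded mind forked a copy across the network before the singularity", "Post-Cyberpunk"),
    ("nanotech swarms rewrote the network while the uploaded posthuman watched", "Post-Cyberpunk"),
]
held_out = [
    ("the robot pilot steered the rocket toward mars", "Golden Age"),
    ("the network uploaded her mind", "Post-Cyberpunk"),
]

clf = NaiveBayesEraClassifier()
clf.train(training)
print("held-out accuracy:", accuracy(clf, held_out))
```

With real data, the held-out stories should come from authors and collections that do not appear in the training set, otherwise the classifier may be learning to recognise individual writers rather than eras.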
So here's what I need help with: first off, please attack my premises! How legitimate are these categories? How reasonable are the cut-off dates? Do you think that this sort of classification will be too hard or too easy for a poor little computer program? Are there more interesting questions I could be asking using this sort of technique? Contrariwise, is this approach too reductive?
Next up, I need help tracking down about 100 short stories for each time period. Ideally the stories would be purely sf, no slipstream or other fuzzy genre stories (trying to eliminate variables for the poor little algorithm). They would also be less than 10,000 words long and available in full text online (for ease of data collection).
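The selection criteria above (era by publication date, under 10,000 words) could be applied mechanically once candidate texts are in hand. The following is a rough sketch under stated assumptions: the function names and the candidate tuples are hypothetical, the "stories" are dummy filler text, and word count is approximated by splitting on whitespace. The era boundaries are the ones given earlier in this post.

```python
from collections import defaultdict

# Era boundaries as stated in the post; None marks the open-ended "present".
ERAS = {
    "Golden Age": (1934, 1955),
    "New Wave": (1964, 1980),
    "Post-Cyberpunk": (1990, None),
}
MAX_WORDS = 10_000


def era_for_year(year):
    """Return the era containing the year, or None if it falls in a gap."""
    for name, (start, end) in ERAS.items():
        if year >= start and (end is None or year <= end):
            return name
    return None


def word_count(text):
    """Approximate length as the number of whitespace-separated tokens."""
    return len(text.split())


def select_stories(candidates):
    """Group titles by era, keeping only stories under the length limit.

    candidates: iterable of (title, publication_year, full_text) tuples.
    """
    selected = defaultdict(list)
    for title, year, text in candidates:
        era = era_for_year(year)
        if era is not None and word_count(text) < MAX_WORDS:
            selected[era].append(title)
    return dict(selected)


# Hypothetical placeholder candidates with dummy filler text.
candidates = [
    ("Placeholder story A", 1941, "word " * 5000),
    ("Placeholder story B", 1972, "word " * 12000),  # too long
    ("Placeholder story C", 1959, "word " * 3000),   # falls between eras
    ("Placeholder story D", 2003, "word " * 8000),
]
selected = select_stories(candidates)
print(selected)
```

Note that the stated date ranges leave gaps (1956-1963 and 1981-1989), so a story published in those years is dropped by this filter rather than assigned to the nearest era.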
I've already got some initial ideas of course:
- Asimov Robot stories
- Heinlein's Future History stories
- Bradbury's Martian Chronicles
- Stanley Weinbaum's "A Martian Odyssey"
- Stories like those found in "Adventures in Time and Space"
- Tom Godwin's "The Cold Equations"
- Philip K. Dick
- James Tiptree Jr.
- "Dangerous Visions" and "Again, Dangerous Visions"
- Barrington J. Bayley
- Philip José Farmer
- Cory Doctorow
- Charles Stross
- Ted Chiang
- Stephen Baxter
Thanks in advance for all comments and suggestions!