At the moment I'm interested in large datasets, so I'm collecting some interesting links to what's out there:
- theinfo.org: a site for discussing large datasets, although it seems quiet
- Comprehensive Knowledge Archive Network
- UK government data site
- The London Datastore
- Datawrangling: some datasets available on the web
- Ordnance Survey OpenData
- Wikipedia database dumps
- Project Gutenberg mirroring HOWTO
- Amazon Public Data Sets
- data.gov
- USPS Address Information Systems products (okay, this is commercial)
There's obviously some duplication here, since some of these sites link to each other; I'm just highlighting what I thought was interesting. I'll probably update this post as I find new data worth a look.