Data sets

Questions for answers looking for questions

See also musical corpora for some specialised music data sets.

Generic tools for construction thereof

Miscellaneous data sets

Collected open data sets at cloud providers

Various providers host data sets conveniently close to their cloud platforms.

Social network-ey ones

I’m no longer in this area, so I won’t say much on this.

Point clouds/spatial data

Stashed at 3D data.