Contact us

GeenStijl.nl embeddings (TGTR-4)

The trained word embeddings (approximately 150 MB) are released free of charge and may be useful for further research on toxic online discourse.

We indexed more than 8 million messages from the controversial Dutch websites GeenStijl and Dumpert and used them to train a word embedding model that represents the toxic language found in the dataset.
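As a rough illustration of how word embeddings like these are commonly explored, the sketch below ranks words by cosine similarity to a query word. It is a minimal sketch, not the released model or its tooling: the file format of the embeddings is not stated here, so the example uses a small hypothetical dictionary of made-up three-dimensional vectors (`toy_vectors`, with placeholder words) in place of the real data, and the helper names are assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbours(word, vectors, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors; NOT taken from the GeenStijl/Dumpert embeddings.
toy_vectors = {
    "word_a": [1.0, 0.0, 0.0],
    "word_b": [0.9, 0.1, 0.0],
    "word_c": [0.0, 1.0, 0.0],
    "word_d": [0.0, 0.0, 1.0],
}

neighbours = nearest_neighbours("word_a", toy_vectors, k=2)
for other, score in neighbours:
    print(other, round(score, 3))
```

With real embeddings, the same nearest-neighbour lookup is one way to inspect which terms the model places close to a known toxic word.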

Available upon request