‘Bad language’ in the Nordics: profanity and gender in a social media corpus
Abstract
This study examines the relative frequency of ‘bad language’ by gender in Nordic languages and in English, using a 210-million-token corpus of messages by 18,686 Nordic Twitter users. For the Nordic languages, more than 19,000 ‘bad-language’ word forms were compiled on the basis of usage note annotations in major Nordic-language dictionaries. The most frequent terms overall are swear words, and while males use more of these items on average, the gender...