The whole article is quite funny, especially the lists of most used tankie words, or the branding of foreignpolicy as a left-wing news source.
Yeah, any paper worth its weight in flour would at the very least include an appendix with illustrative examples of the comments they find interesting or rate as "high toxicity". By only discussing the content in the abstract, with random asspull metrics, they get to claim objectivity while presenting zero actual information. Typical of the kind of people who like to reduce countries to their GDP (per capita if you're lucky).
Edit: I hadn't noticed that they did include some examples in their appendix, after 5 pages with 135 citations. So much bloat, and there are even a couple of Washington Post articles in there. They definitely didn't read a lot of those beyond the abstract. Either way, the examples are just strewn around in the text without their "toxicity level", so the point still stands. Actually the worst "qualitative analysis" I've ever read tbh, and that's in data science, where it's usually already the worst. More like "pseudoscientific cherrypicking".
"If you look here at figure X you can see a selection of the most frequent vocabulary. In figure Y you can see several possibilities of our own design that arrange the words from figure X into some rather mean and hurtful sentences. Disgraceful. Coincidentally, when we sent this paper for peer review, both reviewer 1 and reviewer 2 came to similar conclusions and used a mixture of the words in our paper, the vocabulary database, and some choice additions to say, in similarly mean and hurtful tones, that our work was shit. We can't work out what it means right now, but we're going to run the reviews through our system for a meta-analysis before concluding that the reviewers were Lemmygrad users."