simonw
8 hours ago
I'm surprised this story didn't mention the scandal with Scots Wikipedia: https://www.theguardian.com/uk-news/2020/aug/26/shock-an-aw-...
> an American teenager – who does not speak Scots, the language of Robert Burns – has been revealed as responsible for almost half of the entries on the Scots language version of Wikipedia
It wasn't malicious either; it was someone who started editing Wikipedia at 12 and naively failed to recognise the damage they were doing.
TZubiri
5 hours ago
The Cebuano wiki is a similar case. The language is not often spoken, but the wiki was the personal project of an editor who was mad at political articles and started making animal articles there.
The solution is to differentiate and tag inputs and outputs, so that outputs can't be fed back in as inputs recursively. Funnily enough, Wikipedia's sourcing policy does this perfectly. Not only are sources the input and page content just an output, but page content is a tertiary source, and sources should by policy be secondary (and sometimes primary). So the system is even protected against cross-pollution between tertiary sources (say, an encyclopedia feeding off Wikipedia and vice versa).
A recursive quality loss can occur only when articles posing as secondary sources fail to cite Wikipedia as their source; see [[citogenesis]].
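A rough sketch of the tagging idea, in case it helps. This is not how Wikipedia implements anything; `Tier`, `Source`, `declared_inputs` and `can_cite` are hypothetical names made up for illustration. Each source carries a tier tag, and a check refuses anything that is tertiary or that openly builds on a tertiary source.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PRIMARY = 1
    SECONDARY = 2
    TERTIARY = 3  # encyclopedias, including Wikipedia itself


@dataclass(frozen=True)
class Source:
    # Hypothetical record: the tier is what the source presents itself as,
    # and declared_inputs are only the sources it openly cites.
    name: str
    tier: Tier
    declared_inputs: tuple = ()


def can_cite(source: Source) -> bool:
    """Accept a source as input only if it is not an output of the system.

    Rejects tertiary sources and anything that declares a tertiary input.
    """
    if source.tier is Tier.TERTIARY:
        return False
    return all(inp.tier is not Tier.TERTIARY for inp in source.declared_inputs)


wikipedia = Source("Wikipedia", Tier.TERTIARY)
# An article that admits it worked from Wikipedia is caught by the check.
honest = Source("article citing Wikipedia", Tier.SECONDARY, (wikipedia,))
# An article that copied Wikipedia without saying so carries no such tag.
silent = Source("article copying Wikipedia uncited", Tier.SECONDARY)

for s in (wikipedia, honest, silent):
    print(s.name, can_cite(s))
```

The `silent` case is the loophole: the check can only see declared inputs, so an uncited copy passes as a clean secondary source.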
galagawinkle489
3 hours ago
Many sources for Wikipedia articles draw on Wikipedia without citing it. Many journalists work from Wikipedia, and most of Wikipedia's sources are journalistic articles. Often this goes unnoticed because the information obtained this way is true and uncontroversial. Citogenesis only documents the examples where, by bad luck, the result is untrue information.