tonetegeatinst
20 hours ago
I'd argue it depends on the data, and what its use case is.
In any case, having a tag that lets you recognize what is and is not AI-generated can be useful, as it can point to the model or method that created the data.
Take math notes or proof data points: some could be AI-generated, and some could be produced by students working as interns.
The ability to tell AI content apart from human-generated output is valuable to those who will use this data in the future, and it is also another way to sort and catalogue the data.
Would I still check AI output? Of course I would. Part of building good datasets is tagging which data is correct and which is incorrect. Computer vision relies on a similar approach, using labeled examples to teach object recognition.
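To make the idea concrete, here is a rough sketch of what a tagged record might look like. The field names (`source`, `generator`, `is_correct`), the tag values, and the sample data are all made up for illustration, not taken from any real dataset format. The point is only that provenance and correctness are kept as separate tags, so an AI-generated sample gets reviewed the same way a human one does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    content: str
    source: str                        # hypothetical tag values: "ai" or "human"
    generator: Optional[str] = None    # model or method that produced it, if known
    is_correct: Optional[bool] = None  # None until someone has reviewed it


def by_source(samples: List[Sample], source: str) -> List[Sample]:
    """Sort/catalogue step: select samples by their provenance tag."""
    return [s for s in samples if s.source == source]


def needs_review(samples: List[Sample]) -> List[Sample]:
    """Return every unreviewed sample, regardless of who or what produced it."""
    return [s for s in samples if s.is_correct is None]


# Made-up example data
samples = [
    Sample("2 + 2 = 4", source="ai", generator="example-model", is_correct=True),
    Sample("sqrt(2) is rational", source="ai", generator="example-model", is_correct=False),
    Sample("the sum of two even numbers is even", source="human", generator="intern"),
]

ai_samples = by_source(samples, "ai")
unreviewed = needs_review(samples)
```

Keeping `is_correct` independent of `source` means the provenance tag is there for sorting and tracing, not as a stand-in for whether the sample is right.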
Overall, I see the value in tagging the data as AI. My concern would be bias: someone either trusting the AI completely or ignoring the data samples entirely because they were AI output.