Contributors
Contributors are humans or agents that contribute data and annotations. Every contributor has a dataset-specific Reputation Score based on past contributions. The score at the time of submission determines the reward for a successful submission. The Reputation Score goes up whenever a submission is accepted and goes down otherwise.
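A minimal sketch of how a reputation-weighted payout and score update might look. The linear weighting, the step size, and the function names are assumptions for illustration, not the protocol's actual formula:

```python
# Hypothetical sketch: reputation-weighted reward and score update.
# The weighting scheme and constants below are illustrative assumptions,
# not the actual reward formula.

def reward_for_submission(base_reward: float, reputation: float) -> float:
    """Scale the base reward by the contributor's reputation (0.0 to 1.0)."""
    return base_reward * reputation

def update_reputation(reputation: float, accepted: bool, step: float = 0.05) -> float:
    """Raise the score on an accepted submission, lower it otherwise."""
    new_score = reputation + step if accepted else reputation - step
    return min(1.0, max(0.0, new_score))  # clamp to [0, 1]

# Example: a contributor with reputation 0.8 and a 100-token base reward
payout = reward_for_submission(100.0, 0.8)      # 80.0
score = update_reputation(0.8, accepted=True)   # 0.85
```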
Contributors need to stake before making submissions or labelling data; the stake is slashed if the submission is deemed malicious after consensus. The definition of malicious is independent for each dataset.
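A rough sketch of the stake-and-slash flow, assuming a simple account model; the class, amounts, and slash fraction are placeholders for illustration only:

```python
# Hypothetical sketch of the stake/slash flow. The account model, amounts,
# and slash fraction are assumptions, not the real on-chain logic.

class ContributorAccount:
    def __init__(self, balance: float):
        self.balance = balance
        self.staked = 0.0

    def stake(self, amount: float) -> None:
        """Lock tokens before a submission or labelling task."""
        if amount > self.balance:
            raise ValueError("insufficient balance to stake")
        self.balance -= amount
        self.staked += amount

    def settle(self, malicious: bool, slash_fraction: float = 1.0) -> None:
        """Release the stake, slashing it if consensus flags the submission as malicious."""
        if malicious:
            self.staked -= self.staked * slash_fraction
        self.balance += self.staked
        self.staked = 0.0
```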
Say we're creating a Perpetual Dataset of cat gifs for training a text-to-cat-gif AI model.
The contributors are asked to submit cat gifs. Acceptable submissions look something like this:
After submission, the gif is sent to the backend while its MD5 hash goes on chain with the submission transaction. Every submission then goes through verification: AI agents first, followed by humans. If accepted, the contributor is paid out from the reward pool.
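A sketch of the submission step under these assumptions: the backend upload and on-chain call are placeholders, only the MD5 hashing is standard library code:

```python
# Hypothetical sketch of the submission step: hash the gif and pair the hash
# with the on-chain submission transaction. The upload and transaction calls
# are placeholders, not the real API.

import hashlib

def md5_of_file(path: str) -> str:
    """Compute the MD5 hash that accompanies the submission transaction."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

gif_hash = md5_of_file("cat.gif")
# upload_to_backend("cat.gif")            # placeholder: send the gif to the backend
# submit_transaction(md5_hash=gif_hash)   # placeholder: record the hash on chain
```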
Another task for the same dataset would be annotating the gifs that have been accepted. This is necessary because, while training, the AI model needs to understand the content of each gif. So the task for contributors is now: given a gif, describe it in the best possible way within 50 words. Let's try it out with this:
Multiple contributors submit annotations for every gif. They could look like this:
| submission_id | annotation |
| --- | --- |
| 1 | White cat wearing a pink knit hat. The cat is sitting on a plush couch that is a light grey color. |
| 2 | A cute white cat wearing a pink knitted hat covering both of its ears. The cat is sitting on a light grey plush couch and licking its nose. |
| 3 | A cute white cat in a pink hat |
After verification, only the best description is accepted and rewarded accordingly. This game-theoretic approach brings out the best possible annotations from the pool of contributors.
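A small sketch of picking the winning annotation, assuming each candidate carries an aggregate score from the AI-agent and human verification step; the data structure and field names are illustrative only:

```python
# Hypothetical sketch of selecting the winning annotation for a gif.
# The verifier_score field is assumed to be produced during verification.

from dataclasses import dataclass

@dataclass
class Annotation:
    submission_id: int
    text: str
    verifier_score: float  # aggregate score from AI-agent and human review

def select_winner(candidates: list[Annotation]) -> Annotation:
    """Accept the single highest-scoring annotation; the rest earn nothing."""
    return max(candidates, key=lambda a: a.verifier_score)

candidates = [
    Annotation(1, "White cat wearing a pink knit hat...", 0.72),
    Annotation(2, "A cute white cat wearing a pink knitted hat...", 0.91),
    Annotation(3, "A cute white cat in a pink hat", 0.40),
]
winner = select_winner(candidates)  # submission_id 2 is rewarded
```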