Perpetual Datasets
Last updated
Last updated
Each Perpetual Dataset (or simply called Dataset) corresponds to a specific domain: labelled anime videos, images of pets, Japanese audio etc. For all datasets, only the hash of each data sample goes on-chain while the actual data (content + labels) is stored off-chain. The benefits of this are two-fold:
Keeping data like images, text and annotations offchain: enable high storage and throughput requirements for the datasets
Keeping all data contribution and verification records on-chain along with hash of each data sample: ensure complete transparency
Keeping hash of the each data sample on-chain ensures that user can always calculate hash on their end to verify against the on-chain value, similar to checksum
Contributors and Verifiers need to stake FRAC to participate in the data augmentation process. They either earn rewards in FRAC or their stake gets slashed based on quality of their submission as decided by the consensus protocol
Each dataset has a reward pool, funded by the Dataset Creator at the genesis, to compensate contributors and verifiers. At the end of each epoch, of the current reward pool is distributed to that epoch's participants, with being a hyperparameter defined by the Dataset Creator.