How a mobile app is transforming decentralised AI data collection

By Dashveenjit Kaur 

Decentralised AI data collection is getting a significant boost with Ta-da, a mobile application that aims to solve one of artificial intelligence’s most pressing challenges: acquiring high-quality, diverse training data. 

The platform, which emerged from the voice AI company Vivoka, has attracted around 85,000 users and the company currently works with 50 clients to generate millions of data points weekly.

Training robust AI models requires vast amounts of diverse yet high-quality data, particularly for speech recognition, image classification, and natural language processing. However, traditional data collection methods often prove expensive, time-consuming, and prone to bias. 

Ta-da’s approach to decentralised AI data collection addresses these challenges through a combination of mobile accessibility, blockchain technology, and user incentivisation.

How Ta-da’s decentralised AI data collection works

The platform operates on a straightforward yet effective principle: users can download the mobile app on iOS or Android devices and contribute data by recording voice clips or capturing images. 

The Ta-da ecosystem uses a two-tier validation system. While some users contribute data, others act as validators, reviewing submissions to ensure they meet the required quality standards.

This peer-review mechanism serves a crucial purpose in helping maintain data integrity. 
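The contributor and validator roles described above can be pictured with a short Python sketch. It is illustrative only: the article does not say how Ta-da counts reviews, so the class name, the three-vote minimum, and the simple-majority rule are all assumptions rather than the platform's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """A hypothetical contribution awaiting peer review."""
    submission_id: str
    contributor: str
    # Maps validator name -> True (approve) or False (reject).
    votes: dict = field(default_factory=dict)

    def add_vote(self, validator: str, approved: bool) -> None:
        # Assumption: contributors may not review their own work.
        if validator == self.contributor:
            raise ValueError("contributors cannot validate their own submission")
        self.votes[validator] = approved

    def status(self, min_votes: int = 3) -> str:
        # Assumption: a decision needs at least `min_votes` reviews,
        # and a strict majority of approvals to be accepted.
        if len(self.votes) < min_votes:
            return "pending"
        approvals = sum(1 for approved in self.votes.values() if approved)
        return "accepted" if approvals * 2 > len(self.votes) else "rejected"


clip = Submission("clip-001", contributor="alice")
clip.add_vote("bob", True)
clip.add_vote("carol", True)
clip.add_vote("dave", False)
print(clip.status())
```

In this sketch a submission stays pending until enough validators have weighed in, which mirrors the idea that data is only passed on once peers have checked it.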

By implementing blockchain technology, Ta-da ensures that all submitted data comes with verifiable metadata, providing AI companies with transparent information about each contribution’s origin and collection conditions.

The platform began mid-2022 and launched as a beta in mid-2023, initially attracting 20,000 early adopters. Following a successful private fundraising round at the end of 2023, Ta-da officially launched its app into production mid-2024, which led, the company says, to rapid community growth.

Blockchain integration and quality assurance

Rather than relying solely on internal metrics, Ta-da employs an onchain approach that allows clients to review key metadata for each submission. For instance, when users submit voice recordings, the platform stores details about the contributor and the recording conditions in a verifiable format on the blockchain.

This transparency gives AI companies visibility into the origins of their training data. Payments for submissions are processed only after successful validation, which addresses concerns about unverified work and helps maintain high data quality standards.
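A rough Python sketch shows how verifiable metadata and validation-gated payment could fit together. It rests on stated assumptions: the article does not describe Ta-da's storage format or which blockchain it uses, so the in-memory `Ledger` class, the SHA-256 fingerprint of the metadata, and the `payout` helper are hypothetical stand-ins, not the company's implementation.

```python
import hashlib
import json


def metadata_fingerprint(metadata: dict) -> str:
    """Hash the metadata in a canonical form so any later change is detectable."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Ledger:
    """In-memory stand-in for a blockchain (an assumption, not a real chain client)."""

    def __init__(self) -> None:
        self.records = {}

    def record(self, submission_id: str, metadata: dict) -> str:
        digest = metadata_fingerprint(metadata)
        self.records[submission_id] = digest
        return digest

    def verify(self, submission_id: str, metadata: dict) -> bool:
        # A client recomputes the fingerprint and compares it with the stored one.
        return self.records.get(submission_id) == metadata_fingerprint(metadata)


def payout(validated: bool, reward: int) -> int:
    """Release the reward only for validated submissions (hypothetical rule)."""
    return reward if validated else 0


ledger = Ledger()
meta = {"contributor": "user-123", "device": "android", "environment": "quiet room"}
ledger.record("clip-001", meta)

print(ledger.verify("clip-001", meta))                              # unchanged metadata
print(ledger.verify("clip-001", {**meta, "environment": "street"}))  # altered metadata
print(payout(validated=True, reward=10), payout(validated=False, reward=10))
```

Storing only a fingerprint is one common design choice: it lets a client confirm that the metadata it received matches what was recorded at submission time without putting the full record in public view.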

Ta-da’s roadmap includes several key developments aimed at enhancing user accessibility and expanding functionality. One planned feature is wallet abstraction, which will simplify the onboarding process for new users. The company also plans to introduce more sophisticated tasks beyond voice recording and social media engagement.

While Ta-da incorporates Web3 elements for payments and transparency, it primarily serves Web2 clients seeking large volumes of quality, pre-vetted data. This hybrid approach demonstrates a practical use case for blockchain technology that extends beyond cryptocurrency speculation, showing how decentralised systems can solve real-world problems in AI development.

The platform’s gamified, incentive-driven environment helps maintain user engagement and encourages regular contributions that can benefit AI developers. As the industry recognises the importance of diverse, carefully vetted training data, solutions that combine crowd participation with secure and transparent technology are becoming more valuable.

Current impact and performance

Ta-da’s impact on the AI training data landscape is already being felt. The platform processes an estimated two to three million data points weekly, which points to the efficiency of its decentralised AI data collection model. This volume of data, combined with the platform’s quality control mechanisms, should provide AI companies with a reliable source of diverse training materials.

The success of Ta-da’s approach suggests a shift in how the industry thinks about data collection for AI training, given that the internet’s contents have already been well-scraped for data. By combining mobile accessibility, blockchain verification, and user incentives, the platform hopes to create a sustainable ecosystem that benefits data contributors and AI developers.

Ta-da’s model could serve as a blueprint for future developments in decentralised AI data collection, particularly as demand for high-quality training data grows with advances in artificial intelligence technology while the supply of publicly available data remains finite.
