In two recent presentations, at MIT and Stanford, Andrew Ng, Chairman of Coursera and Chief Scientist at Baidu, lays out a challenge for AI startups: How will you get the huge amount of data needed to build and train your models? He points out that a data acquisition strategy is key to building a protected competitive position. We agree.
But every rule has its exception, and WattzOn's Mr Bill is that exception. Mr Bill is a proprietary machine learning software system that needs very little training data to get economically important results. Mr Bill extracts text data from static documents, such as PDFs, images and scans. It delivers highly valuable semi-structured data, ready for use in analytics and AI.
Mr Bill needs only 20 to 30 exemplars of a document to train. Not 10,000. And it is robust to changes in document layout.
This very powerful result changes the economics of AI. Instead of spending months assembling a large data set and training models, one can use Mr Bill to extract data that has value today. Feed that data into AI models, increase product value, and store the data for future use with other AI models.
With Mr Bill, you get to have your [data] cake and eat it too.
What happens when valuable data becomes available in hours, not months? Find out for yourself!
Liberate data from static docs. Mr Bill returns a data triplet: File_ID, Field_Label, Data. What can you do with that?
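One possible answer, as a minimal sketch: because every triplet carries its File_ID, a few lines of Python can regroup the output into one record per document, ready for a table or a model. The sketch assumes the triplets arrive as plain string values; the Triplet class, the pivot_by_file helper, and the sample values are all made up for illustration and are not part of Mr Bill's actual interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Triplet(NamedTuple):
    """Hypothetical container for one (File_ID, Field_Label, Data) triplet."""
    file_id: str
    field_label: str
    data: str


def pivot_by_file(triplets: Iterable[Triplet]) -> Dict[str, Dict[str, str]]:
    """Group triplets into one {field_label: data} record per file.

    If a label repeats within a file, the last value seen wins.
    """
    records: Dict[str, Dict[str, str]] = defaultdict(dict)
    for t in triplets:
        records[t.file_id][t.field_label] = t.data
    return dict(records)


# Invented sample values, for illustration only.
sample = [
    Triplet("bill_001", "account_number", "12345"),
    Triplet("bill_001", "total_kwh", "642"),
    Triplet("bill_002", "account_number", "67890"),
]

records = pivot_by_file(sample)
for file_id, fields in records.items():
    print(file_id, fields)
```

Each record is an ordinary dictionary, so it can be loaded into a database row, a spreadsheet, or a feature vector without further parsing.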