Reality

According to a top post from KDnuggets:

Data is more important than fancy AI architectures

Let’s say that you have two AI startup founders, Alice and Bob. Their companies raised around the same amount of money, and are fiercely competing over the same market. Alice invests in the best engineers, PhDs with a good track record in AI research. Bob hires mediocre but competent engineers, and invests her (“Bob” is short for Roberta!) money in securing better data. On which company would you bet your money?

My money would be squarely on Bob. Why? At its essence, machine learning works by extracting information from a dataset and transferring it to the model weights. A better model is more efficient at this process (in terms of time and/or overall quality), but assuming some baseline of adequacy (that is, the model is actually learning something) better data will trump a better architecture.
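The claim can be made concrete with a toy experiment. The sketch below is only an illustration under assumptions that are not in the quoted post: a one-dimensional task whose true label is 1 when x > 0.5, a 30% label-flip rate standing in for "worse data", a threshold rule as the simple model, and 1-nearest-neighbour as the more flexible one. All function names are hypothetical, and the outcome depends on the chosen noise level, so it is not evidence for the general claim.

```python
import random

def make_data(n, noise, rng):
    """Sample n points on [0, 1); the true label is 1 when x > 0.5.
    Each label is flipped with probability `noise` (assumed noise model)."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def fit_threshold(train):
    """Simple model: choose the cut point t with the lowest training error
    for the rule 'predict 1 if x > t'."""
    best_t, best_err = 0.0, len(train) + 1
    for t in sorted(x for x, _ in train):
        err = sum(int(x > t) != y for x, y in train)
        if err < best_err:
            best_t, best_err = t, err
    return lambda x: int(x > best_t)

def fit_nearest_neighbour(train):
    """More flexible model: 1-nearest-neighbour, which memorises every
    training label, including the flipped ones."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def accuracy(model, test):
    """Fraction of test points the model labels correctly."""
    return sum(model(x) == y for x, y in test) / len(test)

rng = random.Random(0)
clean_train = make_data(300, noise=0.0, rng=rng)  # "Bob": better data
noisy_train = make_data(300, noise=0.3, rng=rng)  # "Alice": noisier data
test = make_data(1000, noise=0.0, rng=rng)        # evaluated on clean labels

simple_on_clean = accuracy(fit_threshold(clean_train), test)
flexible_on_noisy = accuracy(fit_nearest_neighbour(noisy_train), test)
print(f"simple model, clean labels:   {simple_on_clean:.3f}")
print(f"flexible model, noisy labels: {flexible_on_noisy:.3f}")
```

In this setup the flexible model faithfully reproduces the label noise it was given, while the simple model trained on clean labels recovers the boundary almost exactly.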

According to Cassie Kozyrkov:

machine learning specialists know that they won’t find the perfect solution in a textbook. Instead, they’ll be engaged in a marathon of trial-and-error. Having great intuition for how long it’ll take them to try each new option is a huge plus and is more valuable than an intimate knowledge of how the algorithms work (though it’s nice to have both).
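One way to build that intuition is to time a candidate on a small slice of the data before committing to a full run. The sketch below assumes that training cost grows roughly linearly with dataset size, which is often not true in practice; `estimate_full_runtime` and `toy_train` are hypothetical names, and `toy_train` is a stand-in for a real training function.

```python
import time

def estimate_full_runtime(train_fn, data, sample_size, repeats=3,
                          timer=time.perf_counter):
    """Time train_fn on the first sample_size items and scale to len(data).

    Assumes cost is linear in the number of items (a simplification).
    """
    if not 0 < sample_size <= len(data):
        raise ValueError("sample_size must be between 1 and len(data)")
    sample = data[:sample_size]
    fastest = float("inf")
    for _ in range(repeats):
        start = timer()
        train_fn(sample)
        # Keep the fastest run to reduce the effect of background noise.
        fastest = min(fastest, timer() - start)
    return fastest * len(data) / sample_size

def toy_train(rows):
    """Hypothetical stand-in for a real training run."""
    return sum(x * x for x in rows)

data = list(range(200_000))
estimate = estimate_full_runtime(toy_train, data, sample_size=20_000)
print(f"estimated seconds for the full dataset: {estimate:.4f}")
```

The `timer` parameter exists only so the function can be checked with a fake clock; for real use the default is enough.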