The author makes a good point. Unfortunately, many companies are led by people who just don't get data, its analysis, or its potential for driving change.
Many executives got to where they are by either lingering around for a long time and waiting for competent people to leave or by mastering the skills their company needed 10 years ago, but which are now irrelevant.
This is leading to ample opportunities for startups and growth companies to beat the living daylights out of bigger competitors. No amount of MBA-speak and PowerPoint presentations can help you understand what to do with the data deluge. These leaders can try to hire smart analysts, but how are they going to differentiate between data scientists who can make a real contribution and those who can't?
A lot of bigger, more established companies won't make the cut, no matter how much data they accumulate.
There's ample opportunity to disrupt established markets with data-driven strategies, but this disruption probably won't come from larger, more established companies.
While I agree that "science" (as defined in the article) is a very good way to make use of data, I do not think the data without the science is useless.
For one, if you collect the data now you can always apply the "science" later--as long as you have the data and can query it, you're always ready. This alone means that collecting data now could be useful even if you have no idea of what to do with it. Storage is relatively cheap, and you--or somebody else--might come up with a clever way to analyze it in the future.
Additionally, I think the author underestimates the potential of machine learning. I am by no means an expert in the field (I'm taking an AI class--that has to count for something! ;) I hope) but even the simple techniques we've covered can get some interesting information without much domain knowledge involved. As ML evolves, I suspect there will be more and more technology that can find interesting trends and relationships in data regardless of what the data is actually modelling. The article does mention ML a bit, but I think it will be much more significant in the near future.
Ultimately, figuring out what questions to ask about data and harnessing human curiosity will give you much more than just hoarding data. I just think it's possible that sufficiently abstract and generic approaches in the near future will make it easier to get similar--or perhaps orthogonal--results using techniques from ML and AI. I firmly believe that hoarding the data without doing anything to it is still much better than not collecting it at all.
Machine learning is no silver bullet. You still have to understand the algorithms and the assumptions they make. For example, if you're using a support vector machine, do you have reason to believe your data is linearly separable? If you're using a kernel, does the kernel make sense for your data? You can't just blindly try methods until something seems to work, because you'll end up with something that works by accident on your test data but doesn't apply to the real problem you're trying to solve. Understand the data and everything else follows.
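To make the "works by accident on your test data" point concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than anything from the article: the labels are coin flips with no relation to the features, the "methods" are just random linear classifiers, and names like `blind_search` are made up for this example. Picking whichever candidate scores best on a small test set tends to produce a flattering number that does not hold up on fresh data.

```python
import random


def make_data(rng, n, dim):
    """Random features with coin-flip labels, so there is no real signal to find."""
    xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    ys = [rng.choice([-1, 1]) for _ in range(n)]
    return xs, ys


def predict(weights, x):
    """Linear classifier: which side of the hyperplane does x fall on?"""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score >= 0 else -1


def accuracy(weights, xs, ys):
    """Fraction of points the classifier labels correctly."""
    return sum(predict(weights, x) == y for x, y in zip(xs, ys)) / len(ys)


def blind_search(seed=0, dim=20, n_test=40, n_fresh=5000, n_candidates=300):
    """Try many arbitrary models, keep the best on the test set, then re-check it."""
    rng = random.Random(seed)
    test_x, test_y = make_data(rng, n_test, dim)

    # "Blindly trying methods": hypothetical candidates are random weight vectors.
    candidates = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_candidates)]
    best = max(candidates, key=lambda w: accuracy(w, test_x, test_y))

    # Fresh data from the same (signal-free) process exposes the accident.
    fresh_x, fresh_y = make_data(rng, n_fresh, dim)
    return accuracy(best, test_x, test_y), accuracy(best, fresh_x, fresh_y)


if __name__ == "__main__":
    test_acc, fresh_acc = blind_search()
    print(f"accuracy on the test set used for selection: {test_acc:.2f}")
    print(f"accuracy on fresh data:                      {fresh_acc:.2f}")
```

The selected model looks clearly better than chance on the small test set, while on fresh data it falls back to roughly a coin flip, since there was never anything to learn.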
I agree with you that collecting data can be a good thing, and it can be analyzed at some later point. But in the case of the online world, the useful lifetime of data is very short (this might not be true for other industries). Trends change very rapidly, and using old data to make inferences might be unprofitable, if not an outright loss.
Actually, that's exactly the opposite of true. In fact, you -need- old data to validate that your mathematical model actually works, by applying it to whatever historical timespan of data you are trying to derive intelligence from. At least if any sort of unit of time is applicable to any of your models.
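As a rough illustration of validating a model against historical data, here is a minimal walk-forward backtest in plain Python. The moving-average "model" and the function names are assumptions picked for brevity, not anything the article or any particular library prescribes. The idea is to step through the old data, forecast each point using only what came before it, and measure how far off you were.

```python
from statistics import mean


def moving_average_forecast(history, window):
    """Hypothetical stand-in model: predict the mean of the last `window` values."""
    return mean(history[-window:])


def walk_forward_backtest(series, window):
    """Mean absolute error of one-step-ahead forecasts over the historical series.

    At each step the model only sees data from before the point it predicts,
    which mimics how it would have performed had it been running at the time.
    """
    if len(series) <= window:
        raise ValueError("need more historical points than the window size")

    errors = []
    for t in range(window, len(series)):
        forecast = moving_average_forecast(series[:t], window)
        errors.append(abs(series[t] - forecast))
    return mean(errors)


if __name__ == "__main__":
    # A steadily rising series: a 2-point moving average always lags by 1.5.
    print(walk_forward_backtest([1, 2, 3, 4, 5, 6], window=2))  # 1.5
```

Without the old data there is nothing to run this against, so you would have no way of knowing whether the model is any good before betting on it.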
Of course, just sitting on the data doesn't do anyone any good at all, and randomly picking at stuff isn't likely to do much either unless you go in knowing what you want to find.
I think his main point was that people are already hoarding data but, as with most knowledge, don't know what they're missing without some good statistical analysts.
My employer has a whole department of these analysts, and after working with them closely it is obvious to me that they are the most valuable asset in the whole company, both to us and to our customers. Engineers can pick up on some of these things, but these guys aren't just pulling stuff out of their ass or anything. I'd have to agree with the author that any company storing or collecting any significant amount of data should give this stuff some serious thought.