No. Data is not the new oil, the new gold or the new bacon. No one is trading data futures in Hong Kong; data is not a commodity. It can certainly look like one, and the metaphor is a convenient way to describe some of its characteristics. For example, data can be bought, sold or used to produce value, but before that it needs to be harvested (or mined, or grown), and this is where the metaphor falls apart. One may argue that, on the contrary, there are data mines, and if you drill and mine there for long enough, you eventually strike gold in the form of magical insights. Whoever argues that is usually about to jump in and sell you big data, blockchain, AI or machine learning.
I would argue that unless you are a national (or a private) bureau of statistics organizing polls and focus groups, data is not harvested, mined or grown but rather accumulated as a side effect of doing something else. For example, if you are in the business of making payments or running a search engine or collecting and hosting public resumes, then you’ve probably accumulated enormous amounts of data.
Data is an inevitable byproduct of creating value somewhere else where a network effect exists.
Byproducts are frequently mistreated as a burden, the unavoidable evil you have to deal with only to prove transactional consistency to settle a dispute between the parties. This is shortsighted, excessively defensive and doesn’t recognize an opportunity to leverage the network effect. Issues like this are easy to detect — you have massive amounts of data, and the cost to maintain it is higher than the revenue it generates. It may be very noticeable and even look embarrassing from the consumer's point of view. Let me give you several examples.
I’m a customer of a well-respected, “Top Five” bank in Canada. At least once a month when I open my (physical) mailbox, it explodes with about five envelopes from this reputable bank. Three of them are identical promotional mailings addressed to three different people who previously lived at my address and happened to be clients of that bank.
It baffles me that nobody in charge of these mailings has ever wondered how three different clients could live at the same address at the same time. Sometimes the envelopes include sensitive information like bank account balances or line of credit transactions. The first five times, I took the envelopes back to the post office to return them to the sender, the bank. Three years have passed, and I still receive them. As we say on the internets: smh (shaking my head).
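For the curious, spotting this kind of thing does not take much. Below is a minimal sketch, assuming the bank can export (customer ID, mailing address) pairs. The function names, the sample data and the crude normalization rule are all made up for illustration; real address matching is considerably messier.

```python
from collections import defaultdict


def normalize_address(address: str) -> str:
    """Crude normalization (an assumption): lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in address.lower())
    return " ".join(cleaned.split())


def shared_addresses(customers, threshold=2):
    """Return normalized addresses that are shared by at least `threshold` customers."""
    by_address = defaultdict(set)
    for customer_id, address in customers:
        by_address[normalize_address(address)].add(customer_id)
    return {
        address: sorted(ids)
        for address, ids in by_address.items()
        if len(ids) >= threshold
    }


# Hypothetical sample data: three "different" clients at one mailbox.
customers = [
    ("C001", "12 Maple St., Unit 4"),
    ("C002", "12 maple st unit 4"),
    ("C003", "12 Maple St, Unit 4"),
    ("C004", "98 Oak Ave"),
]

flagged = shared_addresses(customers, threshold=3)
for address, ids in flagged.items():
    print(f"{len(ids)} customers share '{address}': {ids}")
```

A flagged address is not proof of stale records (families and roommates exist), but it is a cheap signal that someone should check before mailing account balances to it.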
Screen Scraping Must Die
Another example is what happens when I want to use my own data. My well-respected bank stores this data to “make my life easier”, but it doesn’t let me access it, and when I find a way, it threatens me, claiming that I have violated the terms of service and other electronic access agreements.
I’m talking about personal finance management for consumers and cash management for small businesses. Considering that my respected bank has access to all of my transactions (within this bank), it seems logical to use this data to learn more about my behavioural patterns (read: spending (read: overspending)). As a small business owner, I could also use this data to keep my accounting books fresh, my resource planning agile and my cash flow positive.
There are apps for that. On first use, they ask me for my online banking credentials, then use them to log in as if they were me. Then they perform their magic: they fetch all of the available data (transactions and balances), enrich it with accounting codes and expense categories (powered by AI™), and provide me with a plethora of insights within minutes. I’m happy and ready to become a better, more conscious over-spender with some unconscious guilt. Also within minutes!
Turns out, I should not be. The standard fifty-thousand-page terms of service agreement with my respectable bank proclaims that sharing logins and passwords is a violation of the aforementioned agreement, for which I shall be condemned and punished.
Long Live Screen Scraping
I’m not saying anybody is a data-hogging extremist here. After all, it doesn’t look like the banks are trying to enforce these rules or react to violations. However, I hope that the banks' annoyance with screen scraping will eventually lead to better outcomes. In my previous post, I wrote about how banks could solve this problem by exposing APIs and stir up their business development in the process. FTW (for the win), as we say on the internets.
I fancy maintaining the metaphorical mysticism of commodities like gold, but I stand by my statement that data is a byproduct. This makes gold metaphors either awkward or bland. A bland metaphor is not even a metaphor; it is just a fact. An interesting one, though: one of the largest gold-producing mines in the world, the Grasberg mine in Papua, Indonesia, is primarily a copper mine. And let's not get into awkward metaphors at all.
So if data is not a commodity, like gold or copper or bitcoin, that you need to mine, and is a byproduct of doing something else, then what's it like? Data is like the botulinum toxin that once threatened the existence of a billion-dollar industry of food canning, then became the most lethal biological weapon of mass destruction until an insight and strategy magically converted this universally despised thing into a beloved billion-dollar industry of Botox applications. Another reason I like my metaphor more is that “Botox apps” sounds absurd enough to sound funny.
Notice how it used to be a colossal operational burden associated with war and disease, inflicting financial damage? It is still an operational inconvenience, but it has become a product associated with the beauty industry and celebrities.
I have tried to expose the problems with comparing data to gold. It is just as easy to critique my beautiful Botox metaphor; I only hope that my critic will look more ridiculous doing so than I did inventing it.
Apart from the fact that my good metaphor was quite silly, there is the fact that metaphors can obscure ideas and create wrong perceptions and inclinations. By the way, if you haven't yet, read our blog post "Netflix is a terrible analogy for banking disruption".
Focusing on the wrong part of the data-as-a-commodity metaphor can lead to misplaced effort. For example, look at the many AI startups struggling to differentiate themselves and find product-market fit. I believe the root cause is treating data as a commodity rather than as a byproduct. One side of the market reacted to this common misconception by offering data-processing services like Artificial Intelligence/Machine Learning (AI/ML), which is merely a tool, while the other side reacted by feeding whatever data they had into whatever ML tools they could get.
Treating data like a commodity leads to overlooking its other significant properties. Data is not only a byproduct of you doing your business; it is also a byproduct of your clients doing business with you. The odds are that the data you have accumulated belongs to them, and you cannot use it without their explicit consent. Ignoring this is how companies landed themselves in half of the data scandals you hear about in the media (the other half were about negligence in securing the data).
However, there is no need to be defensive and avoid storing data at all costs. Remember, your clients do business with you because you help them succeed at what they do. The better you know and understand your clients, the better you can serve them. How to get users' consent for storing their data and how to store it securely are no longer open problems; they were solved years ago. There is nothing wrong with getting to know your clients. Simply be open about it: let them know what you want to store and why, let them opt out if they prefer, and finally, let them use their own data that you collect. Why not give them an API for this?
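To make that last point a little more concrete, here is a minimal sketch of what "be open, let them opt out, let them use their own data" might look like in code. Everything in it (`ClientRecord`, `ConsentError`, the purpose names) is hypothetical and invented for illustration; it is not any particular bank's API, and a real service would add authentication, auditing and persistence.

```python
from dataclasses import dataclass, field


class ConsentError(Exception):
    """Raised when data is requested for a purpose the client has not agreed to."""


@dataclass
class ClientRecord:
    client_id: str
    transactions: list = field(default_factory=list)
    # Purposes the client has explicitly agreed to, e.g. "spending_insights".
    consented_purposes: set = field(default_factory=set)


def opt_in(record: ClientRecord, purpose: str) -> None:
    record.consented_purposes.add(purpose)


def opt_out(record: ClientRecord, purpose: str) -> None:
    record.consented_purposes.discard(purpose)


def transactions_for(record: ClientRecord, purpose: str) -> list:
    """Internal use of client data: allowed only for consented purposes."""
    if purpose not in record.consented_purposes:
        raise ConsentError(f"{record.client_id} has not consented to '{purpose}'")
    return list(record.transactions)


def export_own_data(record: ClientRecord) -> dict:
    """The client can always read their own data and see what they agreed to."""
    return {
        "client_id": record.client_id,
        "transactions": list(record.transactions),
        "consented_purposes": sorted(record.consented_purposes),
    }


# Hypothetical usage.
client = ClientRecord("client-1", transactions=[{"merchant": "Cafe", "amount": 4.5}])
opt_in(client, "spending_insights")
print(transactions_for(client, "spending_insights"))
opt_out(client, "spending_insights")
print(export_own_data(client))
```

The design choice worth noticing is the asymmetry: the business needs a stated purpose and consent to touch the data, while the client needs neither to get their own data back.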