Thought Leaders

Every Second of Every Day, 1.7MB of Data is Created for Every Person – Here’s How We Can Get Control of it

Published June 17, 2020

Updated December 9, 2022

Itzhak Assaraf

Today, AI is everywhere.

From our digital voice assistants that help us stay on top of tasks, to our reliance on Google Maps for directions, to the recommendation engines that help us decide what to watch on Netflix, AI has become an integral part of our lives. Though some may posit that the term has become an almost meaningless cliche, in truth, it’s more important than ever.

Novelty uses aside, such as the $220 AI-powered toothbrush that’ll shine your pearly whites to perfection, AI is more often than not being put to use in incredible and impactful ways. It helps banks determine whether transactions are fraudulent or legitimate, it enables hospitals to improve patient care and, incredibly, it is helping identify people with suicidal tendencies and get them the necessary help before they cause harm to themselves or others, among other important uses.

How AI Creates So. Much. Data

But as AI becomes more commonplace, the amount of personal data organizations hold grows at an exponential rate. In fact, that data is exactly what AI trains on: the more data it’s given, the more it learns and the better it performs. The result? Currently, more than 1.7MB of data is created for each person every second of every day. This is a staggering number, and AI-enabled technologies are a major contributing factor.
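To put that per-second figure in perspective, here is a quick back-of-the-envelope calculation. It takes the 1.7MB number at face value and assumes decimal units (1GB = 1,000MB); the variable names are just for illustration.

```python
# Rough scale check: what does 1.7 MB per second add up to over a day?
MB_PER_SECOND = 1.7               # figure quoted in the article
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

mb_per_day = MB_PER_SECOND * SECONDS_PER_DAY
gb_per_day = mb_per_day / 1000    # assumption: decimal gigabytes

print(f"{gb_per_day:.1f} GB per person per day")  # prints "146.9 GB per person per day"
```

In other words, if the figure holds, that is roughly 147GB per person, every single day.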

Interestingly though, in recent years, consumers have grown more aware of, and concerned about, the ways in which their data is used, and sometimes abused. In part thanks to AI, but by way of many other tools as well, personal data has become the lifeblood of razor-targeted marketing efforts. Data helps organizations understand buying patterns, customer behavior, click-through rates, and more, to reach a slew of new insights.

Consider how, for example, our beloved recommendation engines work. After a long day at the office (or kitchen table-turned-makeshift workplace), you want to stretch out and unwind with a good show. You turn on your favorite streaming service, waiting to see your myriad viewing options.

Just how do those smart people over at your streaming service know what you’re interested in? The data science team collects thousands of data points per user, such as how long you spent watching any particular show, the time of day you typically watch, the devices you use, etc. The more you watch, the more personal data the AI collects on you, thus enabling it to make better, more accurate predictions about what you’ll be interested in. This cycle of collecting personal data, and thus creating even more personal data, never ceases, resulting in the mind-boggling amount of data that’s “born” each moment.

The Problem With Too Much Information

But now that your streaming service has collected and created all this data, that data needs to be stored, managed, and kept safe. This is an expensive proposition. Moreover, the nature of data is to be spread out. Sure, there’s one database that is likely thought of as the central bucket for all data, but the reality is so much more complicated and messy: data science teams constantly create copies in various formats as part of training and testing their models. Employees also unintentionally create copies by sending personal information by email, generating reports, and more.

The result is a huge amount of personal data over which there is little supervision and even less control. More than that, most of it is of no further use to the organization and could be deleted once it has served its purpose, but who really remembers, or even knows, that it exists? This leaves organizations wide open to censure and penalties, as well as security risks. So how can all this data, which seems to be just about uncontainable, be reconciled with the need to adhere to privacy regulations such as GDPR and CCPA?

Turns out the problem is also the solution.

AI To Rein In All Your Data

Manual attempts to impose order on the unending personal data problem obviously fall short. Very short. That’s because getting a handle on everything you’ve got requires knowing that you have it in the first place, which we’ve established is well-nigh impossible. But AI, which is all about scale, speed, accuracy, and automation, is perfectly suited to keep personal data in check.

To start, AI is a whole lot faster at sorting and organizing massive volumes of information than humans (sorry, humans). It can read data far more precisely and quickly than we can. It can automatically categorize data into GDPR- or CCPA-sensitive categories, extract PII from both structured and unstructured data, merge duplicate PII records, and identify potentially sensitive documents in images – and it never gets tired of doing this.
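As a rough illustration of what extracting PII from unstructured text and merging duplicates can look like, here is a minimal sketch. It deliberately swaps the AI for two simple regular expressions, so it is a toy stand-in and not how a production discovery engine works. The patterns, function names, and sample documents are all hypothetical.

```python
import re

# Hypothetical, deliberately simple patterns. Real PII discovery needs far
# more than regexes (names, addresses, context, other languages, and so on).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def normalize(category, value):
    """Put a raw match into a canonical form so duplicates can be merged."""
    if category == "email":
        return value.lower()
    if category == "phone":
        return re.sub(r"\D", "", value)  # keep digits only
    return value


def extract_pii(text):
    """Return the set of (category, normalized value) pairs found in text."""
    found = set()
    for category, pattern in PII_PATTERNS.items():
        for match in pattern.findall(text):
            found.add((category, normalize(category, match)))
    return found


def build_inventory(documents):
    """Map each distinct PII item to every location where it appears."""
    inventory = {}
    for location, text in documents.items():
        for item in extract_pii(text):
            inventory.setdefault(item, set()).add(location)
    return inventory


# Made-up sample data: the same person appears in two places, formatted differently.
documents = {
    "crm.csv": "Jane Doe,jane.doe@example.com,+1 (555) 010-9999",
    "email_export.txt": "Please contact Jane.Doe@Example.com or call 1-555-010-9999.",
}

for (category, value), locations in sorted(build_inventory(documents).items()):
    print(category, value, sorted(locations))
```

Because values are normalized before being stored, the differently formatted email address and phone number collapse into one entry each, along with a list of every place they were found.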

AI can also identify data in places it shouldn’t be and can track and control all data movements, enabling it to monitor for risk. Speaking of risk, by automatically discovering unknown uses of sensitive data and eliminating all unneeded copies, AI enables you to drastically reduce your attack surface.
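Finding unneeded copies is the easiest part of this to picture. The sketch below groups files by a hash of their contents, which only catches exact byte-for-byte duplicates; spotting reformatted or partial copies is where smarter techniques come in. The function name and the sample file locations are assumptions made up for the example.

```python
import hashlib
from collections import defaultdict


def find_duplicate_copies(files):
    """Group locations whose contents are byte-for-byte identical.

    `files` maps a location (hypothetical paths here) to its raw bytes.
    Returns a list of location groups that share the same SHA-256 digest.
    """
    by_digest = defaultdict(list)
    for location, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(location)
    return [sorted(locations) for locations in by_digest.values() if len(locations) > 1]


# Made-up example: an export that was copied to a shared folder.
files = {
    "db/export.csv": b"name,email\nJane Doe,jane.doe@example.com\n",
    "share/tmp/export_copy.csv": b"name,email\nJane Doe,jane.doe@example.com\n",
    "reports/q1.txt": b"Quarterly summary",
}

print(find_duplicate_copies(files))
```

Each group returned is a candidate for cleanup: keep the copy that is actually needed and review the rest for deletion.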

So for example, let’s say you have an AI engine that can perform entity extraction, understand entity relationships and the meaning of data elements, and recognize categories of information such as health-related information or criminal information. With AI, you can analyze endless copies across different kinds of data, whether in motion or at rest, structured or unstructured, to actually gain greater control and management of that data. Lastly, with AI, organizations can perform large-scale multilingual data analysis to draw out unique business insights.

The Disease Is The Cure

In one of my all-time favorite movies, The Incredibles, Mr. Incredible realizes that the only thing powerful enough to destroy the robot is the robot itself. AI is an incredibly powerful tool. And as we continue to feed the great monster, it will only grow more powerful. Now’s the time to ensure it’s being harnessed properly and put to good use, by using it to enable organizations to gain far greater control over their most precious asset: their data.

Itzhak Assaraf is CTO and cofounder of 1touch.io and has more than 20 years of experience in all aspects of technology, software, network, security and hardware.

Unite.AI
