When most IT pros think about the data their business holds, they focus on a few specific areas. Most will name a range of databases, treating structured data as the core of what they hold, while others will go further and include the important business information stored in file shares, email systems, and other secure places. But these unstructured data sources are rarely exploited to their full potential.
It’s this unstructured data, also known as dark data, that can provide you with valuable business insights you can’t gain anywhere else. Let’s dive into everything you need to know about this data source, so you can head back to your business and start leveraging it as soon as possible.
What is dark data?
According to Gartner, dark data is data that’s processed and stored during normal business activities, but it isn’t used for other purposes, like analytics or marketing. This may not seem significant on the surface, but several studies have found that around 80 percent of enterprise data falls under this category—in other words, this valuable data is likely sitting somewhere on your servers going unused.
If most businesses only utilise a small proportion of their data effectively, this data represents an opportunity to gain the front foot by boosting your operational efficiency and learning more about your business, your customers, and the markets in which you operate. Just keep in mind that harnessing this unused data can also come with its fair share of challenges.
Overcome the challenges
Unstructured data is typically spread out across multiple locations, such as file shares, email, social media, and logs. Generally, this data isn’t stored using a set of rules, like when you add data to a database, so its accuracy is questionable. When you start trying to make use of it, its sheer volume can make it feel like looking for a needle in a haystack.
This data also lacks neat divisions. For example, a structured database would likely store each customer’s name, address, state, postcode, and country as separate fields. But the same information in an email or document would appear as one long, unstructured string of characters. While it’s easy to query a database for all the customers in one suburb, it’s far more difficult to run the same search across thousands of employee inboxes or documents. However, if you could query this unstructured data and analyse the word choices in emails, you’d gain contextual information about how happy your customers really are.
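To make the contrast concrete, here is a minimal Python sketch. The customer records, email text, and address pattern are all invented for illustration, and the pattern assumes addresses end in a suburb, an Australian state abbreviation, and a four-digit postcode. Real inboxes would be far messier.

```python
import re

# Hypothetical structured records: each detail lives in its own field.
customers = [
    {"name": "Ana Lee", "suburb": "Newtown", "postcode": "2042"},
    {"name": "Raj Patel", "suburb": "Carlton", "postcode": "3053"},
]

# Hypothetical unstructured text: similar details buried in email bodies.
emails = [
    "Hi, please send the invoice to Ana Lee, 12 King St, Newtown NSW 2042. Thanks!",
    "Raj here. I've moved to 8 Elgin St, Carlton VIC 3053, so update my file.",
    "Can someone call me back about my order?",
]


def customers_in_suburb(records, suburb):
    """Structured query: an exact match on a dedicated field."""
    return [r["name"] for r in records if r["suburb"] == suburb]


# Assumed address format: suburb name, state abbreviation, four-digit postcode.
ADDRESS_TAIL = re.compile(
    r"([A-Z][a-z]+(?: [A-Z][a-z]+)*) (NSW|VIC|QLD|SA|WA|TAS|NT|ACT) (\d{4})"
)


def emails_mentioning_suburb(texts, suburb):
    """Unstructured query: pattern-match free text and return matching indexes."""
    hits = []
    for index, text in enumerate(texts):
        for match in ADDRESS_TAIL.finditer(text):
            if match.group(1) == suburb:
                hits.append(index)
                break
    return hits


print(customers_in_suburb(customers, "Newtown"))
print(emails_mentioning_suburb(emails, "Newtown"))
```

The structured lookup is a single comparison on a known field. The free-text search only works while every writer happens to follow the assumed address format, which is exactly why unstructured data is harder to query.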
Solve real problems
The best way to look for a needle in a haystack is to use a magnet: Rather than trying to go all in from the start, focus your efforts on answering specific questions, such as, “How do you know customers are happy with your new product?”
Traditional data sources will provide some of the answers by tracking metrics, such as sales volumes. Then, you can dig deeper and layer that data with analysis of customer service logs from your call centre and insights from social media through positive and negative language—it’s about context, as well as content.
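As a rough illustration of reading positive and negative language, the Python sketch below counts words from two small, hypothetical word lists in a handful of invented messages. It is a deliberately simple stand-in: real sentiment analysis usually relies on much larger lexicons or trained models, and word counting alone misses sarcasm and negation.

```python
import re

# Hypothetical word lists; a real project would use a far larger lexicon.
POSITIVE = {"love", "great", "happy", "easy", "fast"}
NEGATIVE = {"broken", "slow", "refund", "angry", "disappointed"}


def sentiment_score(text):
    """Return the positive word count minus the negative word count."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for word in words if word in POSITIVE)
    negatives = sum(1 for word in words if word in NEGATIVE)
    return positives - negatives


# Hypothetical call-centre notes and social posts about a new product.
messages = [
    "Love the new model, setup was easy and fast.",
    "Arrived broken. I want a refund.",
    "It works.",
]

scores = [sentiment_score(message) for message in messages]
print(scores)
```

Scores like these only become meaningful when set beside traditional metrics such as sales volumes, which is where the context comes from.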
Melissa McCormack, from predictive analytics research firm Software Advice, says, “Evaluate the data you currently collect and pinpoint data sources that may correlate and/or add value to those data. It’s important to be aware of what light data you already have, and how it can answer business questions, before probing into the dark data.”
You should also avoid trying to use every piece of data you have—focus on what’s most relevant to solving your business problem.
Know what you have and what you can access
Before you can start asking questions about your data, you need to know what data you have. While databases are traditionally managed by the IT department, dark data often resides within business units. In other words, your IT team may need to work with the people in those units to learn what data they’re storing and then find ways to gain access to it without compromising security and trust.
You must also understand that the data you can use might not sit purely inside your own servers and applications. Government departments often make data available to the public. In addition, social media platforms may host comments and other information that is publicly available.
Focus on your customers
Take a customer-centric approach to learn as much as you can about your customers. A recent report from Deloitte points to several examples from the healthcare and retail sectors, where the use of different types of data allows businesses to learn about the specific habits and behaviours of patients and customers. The report’s authors say, “We believe that nontraditional data holds the key to creating advanced intelligent response capabilities to solve problems, potentially without human intervention, before they happen.”
This is key: By understanding the needs of your customers, you can use data to anticipate issues before they become severe. This type of proactive monitoring is now appearing in many forms, especially within IT security.
Don’t try to do it alone
Dark data is not easy to analyse. The volume, disparity of the formats, and diversity of the sources mean traditional analytics tools aren’t suitable. The journey to success will likely require investment in artificial intelligence and machine learning technologies.
Finding all your data sources is just part of the challenge. Look for tools that support your analysis, such as data collection tools or text analysis software that not only finds specific words but can also tell you about the words around your search terms. This will help you find the data you’re looking for and the context around it.
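Finding the words around a search term is sometimes called keyword-in-context search. The Python sketch below shows the basic idea on an invented customer note. The function name, the `window` parameter, and the sample text are assumptions made for illustration, not features of any particular product.

```python
import re


def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of keyword with up to `window` words either side."""
    words = re.findall(r"[\w']+", text)
    target = keyword.lower()
    results = []
    for index, word in enumerate(words):
        if word.lower() == target:
            start = max(0, index - window)
            results.append(" ".join(words[start:index + window + 1]))
    return results


# Hypothetical customer service note.
note = "Customer says the delivery was late again and the delivery driver was rude."

for snippet in keyword_in_context(note, "delivery", window=2):
    print(snippet)
```

Commercial text analysis tools go much further, but even this small window shows whether a term such as "delivery" tends to appear next to complaints or praise.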
Even if you have great tools for analysing your structured data, it’s likely you are only seeing part of the story. With so much of your business data sitting around unused within your repositories, finding ways to use dark data will allow you to learn more about your customers, boost operational efficiency, and find ways to improve every aspect of your company’s operations.