Data Abstraction - Chapter 7
Here we still leave our sensors out of the discussion. When we get to data abstraction, transformation, Big Data and analytics, we shift our focus to data availability, validity and usefulness. Our concern is what we have, how we bring it in, what it looks like and what it can tell us. Data from the physical world of sensors is just one flavor in a hearty soup of information. Our job here is to extract all the flavors and bring them together. This allows analysts to find the right combinations, subtle or solid, recognizing the patterns and capturing the value in them.
There is a data explosion going on. We know it’s worth something, just not exactly for what or for whom.
Data Abstraction collects the different pieces and makes them available for analysis. These days every enterprise has some level of Business Intelligence (BI) skills and our practice is to engage these resources in the early stages of your project discovery. By identifying the gaps, bridging them and then helping clients to expand on their existing structure we can augment the solutions already in place. Once those engaged recognize this, breaking down data silos can get easier.
The biggest problem we see is in data silos. Breaking those down so that the information can be shared across the enterprise and used in an analytic process for a broader audience is where the value is found.
The list of data sources keeps expanding: sensor, network, transaction, production, user, market, customer, compliance, safety, operations and financial data. We may also need to build a system of systems using external sources such as GPS or social media. Some sources may need transformation to create a usable format, or to strip or hash sensitive information while preserving value. Making all the data accessible means traversing the silos, each with its own view of the data it owns. These conversations often require finesse and expertise to share the solutions and possibilities and to calm the fears.
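As a rough illustration of what "strip or hash sensitive information while preserving value" can look like in practice, the sketch below drops some fields outright and replaces others with a keyed hash, so records can still be joined on the hashed value without exposing the original. The field names, the key and the function names are all hypothetical, chosen only for this example; a real project would pick them with the data owners and keep the key in a secrets manager.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical field names, chosen for illustration only.
HASH_FIELDS = {"customer_id"}      # kept as a stable, non-reversible token
STRIP_FIELDS = {"email", "ssn"}    # removed entirely


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest so equal inputs map to equal tokens."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitize(record: dict) -> dict:
    """Drop STRIP_FIELDS, hash HASH_FIELDS, pass everything else through."""
    clean = {}
    for name, value in record.items():
        if name in STRIP_FIELDS:
            continue  # sensitive and not needed for analysis
        if name in HASH_FIELDS:
            clean[name] = pseudonymize(str(value))
        else:
            clean[name] = value
    return clean


if __name__ == "__main__":
    row = {"customer_id": "C-1001", "email": "a@example.com", "amount": 42.5}
    print(sanitize(row))
```

Because the same customer always hashes to the same token, analysts can still count, group and join by customer across silos, which is the "preserving value" part; the keyed hash (rather than a plain one) makes it harder to recover the original by guessing inputs.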
How much data do you have? Companies generally take one of two paths: the ‘7 years and we dump it’ or the ‘keep it all, we might need it’ approach. Often, we see some combination of both depending on the data in question.
Clients come into the process with a specific idea and then find more pieces that are equally compelling as analysis progresses.
We find our solutions lead to new, unanswered questions…every time.
We know there is value in the data. It all tells a story; we just haven’t figured out where the value is yet. Analysis, trends and direction can change, and we need to know that story. Further regulations about data retention are coming and may affect our ability to tell it. If we are to keep the opportunity to find what we don’t yet know about our data, we need to keep the data itself. We have seen instances where disposed data became critical to a current business need. Sometimes we have even been able to reconstruct it from collateral sources…but not always. More data is often better, even when it doesn’t seem of much value today. With inexpensive Cloud storage now widely available, it is easier than ever to keep your data.
Nobel laureate economist Ronald Coase said, “Torture the data and it will confess to anything.” We seek truth, not confession. This requires a clear explanation of where the data comes from and how it is transformed, defining it as a ‘single source of truth’. Then everyone can understand and agree with what it is saying. Documenting data preparation methods is a critical step to consensus and adoption; transparency in data manipulation should always be an expectation and is ingrained into our practice.
Presentation of data is a different beast. The right view of the data for the right end user is critical: no assumptions.
- There is standardized reporting with a published style of data, often applied to executive dashboards and reports.
- Most data scientists trust no one else’s data handling and want to get as close to the raw information as possible.
- Front-line users need access to as much information as is relevant to their activities but no more.
- External users want something else entirely.
- Is there a revenue stream hiding in your data?
At Sirius, we have seen many situations: everything from data owners who know they need help with untapped value in their data to those who believe it is being fully used and require no assistance. We are often called in to evaluate an issue that is really a symptom betraying an underlying problem. We have found business value in all of them. It is for this reason that we seek executive ownership of the endeavor, to push ownership down into the organization instead of simply addressing the data sprouts from within departments.
Knowing the available methods, options and tools, combined with experience navigating these issues, is where the Sirius Data and Analytic Solutions team has a unique level of depth and expertise. This team has been together since 2005 and focuses solely on data analytics. They understand the difference between the speed of business and the speed of IT. Business and IT come at the topic from different perspectives, and a guide in the middle can be an asset to both. With Sirius you have a known and trusted partner in the field, with tested methods and practice.