You need data unification

"Everything Starts Out Looking Like a Toy" #97

Jun 13, 2022

Hi, I’m Greg 👋! I publish this newsletter on finding data products and interesting data observations with the goal of finding patterns and future product insights. (Also, it’s fun.) If you need a background on how we got here, check out What is Data Operations?

This week’s toy: if Guitar Hero and Tetris had a baby, it might look like Bemuse.ninja. I’m about as good at this as I am at Dance Dance Revolution (SPOILER: not good). Edition 97 of this newsletter is here - it’s June 13, 2022.

The Big Idea

A short long-form essay about data things

⚙️ You need data unification

You’re probably familiar with the idea of a network effect, aka Metcalfe’s law.

from https://en.wikipedia.org/wiki/Metcalfe%27s_law

If not, it’s the idea that a network becomes more useful as more things connect to it using a shared standard. For example, a single cell phone provides little utility on its own, while a network where everyone and everything has access provides a lot. When each object can talk to the network, the number of possible connections between objects grows roughly with the square of the number of objects (quickly, though not exponentially). Connecting a thing to a network makes it possible to access lots of data about the connections between objects.
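
To make the growth concrete, here is a minimal sketch of the pairwise-connection count that Metcalfe’s law is usually stated in terms of: n objects can form n(n-1)/2 distinct links. The function name is made up for illustration.

```python
def pairwise_connections(n: int) -> int:
    """Count the distinct links possible among n nodes when any pair can connect."""
    return n * (n - 1) // 2


for n in (2, 10, 100):
    # Going from 10 to 100 nodes (10x) yields 110x as many possible links.
    print(n, pairwise_connections(n))
# 2 1
# 10 45
# 100 4950
```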

The effect of networks on your data

The second order effect of having networked objects is the metadata connected to their relationships. Metadata is the information about the data, meaning descriptive information, time series, and other things that help you learn how your data is performing and interacting in the network of connections we described above.

For example, understanding how nodes connect in a graph is essential. A graph implies a series of different pieces of information that must have at least one connection to another thing in the graph. With respect to your business data, that means it’s important to map how similar types of data (think company and people data) are treated in different applications in your business.

Take a typical SaaS business as an example. A seller at the company needs to share valuable information with a customer success rep at the same company. However, they are using different systems. Without a unified concept of the related data (the company as it exists in both systems), it’s tough for the seller and the CS rep to work together. The network of information in your business has a broken link.

What’s an ops leader to do?

Whether you implement the Modern Data Stack, adopt a no-code data automation platform, or match up all of your people and company data by hand, you need unified data. At first, the end product is not as exciting as the illustrations you might see from a vendor, like this one that looks like a sci-fi version of everything you might know about a customer.

An illustration of the touch points that create a user experience. — Mixpanel describes the customer 360 view in futuristic terms

Any of these systems is great at connecting information if you tell it what to do. They can do lots of amazing things, but they all need one critical item: the logic to map and connect systems.

It’s all about the lowest common denominator

Data unification starts with a mapping exercise where you look for commonalities in data. Usually this is one key field (in a database, this would be the unique identifier for the object, like a UUID or an email address). You will use this key to find other data that relates to this object.

Once you decide on a key for your data, think about a corresponding key that will map to that same data in another system. When these two fields are mapped, you can reference fields from that other object and know that they are related to the same person or company. For example, this arrangement would allow you to find the proper company name connected to a contact, even though the contact record might not mention the company’s name. Account ID, the unique identifier of the account, lets you find any data related to the account and know it is also related to that contact. (Note: although person and company data are the example here, the same strategy works for mapping any kind of data together.)
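
Here is a minimal sketch of that mapping, assuming two hypothetical systems that both store an `account_id` field. The record shapes, field names, and values are invented for illustration and are not taken from any particular CRM or support tool.

```python
# Hypothetical export from a CRM: one record per account.
crm_accounts = [
    {"account_id": "A-100", "company_name": "Acme Corp"},
    {"account_id": "A-200", "company_name": "Globex"},
]

# Hypothetical export from a support tool: contacts carry only the account key.
support_contacts = [
    {"email": "pat@example.com", "account_id": "A-100"},
    {"email": "sam@example.com", "account_id": "A-200"},
    {"email": "lee@example.com", "account_id": "A-999"},  # no matching account
]

# Index accounts by the shared key so each lookup is a single dictionary access.
accounts_by_id = {account["account_id"]: account for account in crm_accounts}


def company_for_contact(contact):
    """Return the company name for a contact, or None if the key has no match."""
    account = accounts_by_id.get(contact["account_id"])
    return account["company_name"] if account else None


# Unified view: each contact record plus the company name found via the key.
unified = [
    {**contact, "company_name": company_for_contact(contact)}
    for contact in support_contacts
]

for row in unified:
    print(row["email"], "->", row["company_name"])
```

The contact with no matching account comes back with `None` rather than raising an error, which makes broken links visible instead of hiding them.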

Why map data together in a graph?

There are a few good reasons to link data together into a unified view.

Document the connections between entities (questions you might ask: How does changing a lead or contact alter an account? How might we link customers to open software issues?)
Gain better visibility when designing a process (if we wanted to improve one part of the business, where would we need to focus?)
Be a better consultant for the business. Knowing how data fits together allows you to suggest realistic solutions (how can we improve, and do we understand the dependent variables in our analysis?)

What benefits does unification offer?

So what? You’ve found the data in your organization. You mapped it to other, similar data. What now?

By creating a unified view of your data, you’ve unlocked new capabilities, including:

Creating a new object that counts how often things happen (engagements per account, product events per person, number of escalations)
Deduping information when you receive new lead lists or account lists
Aggregating statistics from child objects to parent objects
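
The three capabilities above can be sketched in a few lines, assuming the data is already keyed by a shared `account_id` or a normalized email address. All records, field names, and helper functions here are hypothetical.

```python
from collections import Counter, defaultdict

# 1. Count how often things happen: engagements per account.
events = [
    {"account_id": "A-100", "type": "engagement"},
    {"account_id": "A-100", "type": "engagement"},
    {"account_id": "A-200", "type": "escalation"},
    {"account_id": "A-200", "type": "engagement"},
]
engagements_per_account = Counter(
    event["account_id"] for event in events if event["type"] == "engagement"
)


# 2. Dedupe a new lead list against emails you already have.
def normalize(email):
    """Trim whitespace and lowercase so trivially different emails compare equal."""
    return email.strip().lower()


def dedupe(leads, known_emails):
    seen = {normalize(email) for email in known_emails}
    fresh = []
    for lead in leads:
        key = normalize(lead["email"])
        if key not in seen:
            seen.add(key)
            fresh.append({**lead, "email": key})
    return fresh


new_leads = [
    {"email": "Pat@Example.com"},   # already known once normalized
    {"email": "new@example.com"},
    {"email": " new@example.com"},  # duplicate within the same list
]
fresh_leads = dedupe(new_leads, known_emails=["pat@example.com"])

# 3. Aggregate a statistic from child objects (contacts) to the parent (account).
contacts = [
    {"account_id": "A-100", "open_tickets": 2},
    {"account_id": "A-100", "open_tickets": 1},
    {"account_id": "A-200", "open_tickets": 0},
]
open_tickets_by_account = defaultdict(int)
for contact in contacts:
    open_tickets_by_account[contact["account_id"]] += contact["open_tickets"]

print(dict(engagements_per_account))
print(fresh_leads)
print(dict(open_tickets_by_account))
```

None of this works without the shared key: each step either groups by it or compares on it.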

These are only a few examples. The point of the exercise is to build a positive link between the business outcomes you want to drive and the design of the data in your operational data hub. To drive those outcomes and measure them, you need to build data structures and linkages that make them possible.

Now, put it all together

You started with LEGO pieces: examples of data in your organization that conform to similar standards and can be connected. But sharing this as a data product requires you to make the equivalent of a LEGO set.

Data unification doesn’t mean much until you document how putting the pieces together delivers a better result for the organization. Just adding “the Modern Data Stack” to your list of technology does not automatically give you unified, cleaned, accurate, and relevant data. You need to think about how it fits together to build the logic to make it go.

What’s the takeaway? The modern data stack (and most other data automation platforms) won’t build the logic that answers your business questions and improves your operation. You need to use the data unification process to model the data you are using, understand how it fits together, and derive the insights you need to get better. All of these systems can facilitate data collection, analysis, and workflow to make it happen, but you need to start by building the logic.

Links for Reading and Sharing

These are links that caught my 👀

1/ How does a robot swish a 🏀 - Shane Wighton made something amazing: a backboard and hoop that never lets you miss a shot. Don’t take it from me: spend an enjoyable 23 minutes watching this YouTube video.

2/ Products are Functions - Ryan Singer shares an engaging story on how to solve for the real value of a product. It’s a twist on the Jobs To Be Done idea, where you are comparing a before and after situation using a flow that doesn’t include the product and one that does. Determining what’s different and why it’s better (or worse) derives the value (and the perception of that value).

3/ There is no data - Ben Evans writes: “There is no such thing as ‘data’, it isn’t worth anything, and it doesn’t really belong to you anyway.” The real issue is that data is not exactly interchangeable, or if it is, it needs a translation layer to compare one bit of data to another. Perhaps Ben should write “your first party data is hard to compare to other first party data without a secret decoder ring.”

What to do next

Hit reply if you’ve got links to share, data stories, or want to say hello.

I’m grateful you read this far. Thank you. If you found this useful, consider sharing with a friend.

Want more essays? Read on Data Operations or other writings at gregmeyer.com.

The next big thing always starts out being dismissed as a “toy.” - Chris Dixon

Arpit Choudhury

Jun 14, 2022

Data Unification is the key!

This post reminds me of an article I'd written for the HubSpot blog titled How to Validate Data Between HubSpot CRM and an Email Marketing Tool: https://blog.hubspot.com/customers/validate-data-hubspot-crm-email-marketing-tool
