Creating Harmony: Collaborative Database Schema Design for Better Data Teamwork
The path to product-led growth (PLG) success depends on Data and GTM teams getting along. Read the second in a series: "Everything Starts Out Looking Like a Toy" #146
A HUGE thank you to our newsletter sponsor for your support.
Brought to you by Pocus, a Revenue Data Platform built for go-to-market teams to analyze, visualize, and action data about their prospects and customers without needing engineers. Pocus helps companies like Miro, Webflow, Loom, and Superhuman save 10+ hours/week digging through data to surface millions in new revenue opportunities.
Hi, I’m Greg 👋! I write weekly product essays, including system “handshakes”, the expectations for workflow, and the jobs to be done for data. What is Data Operations? A discussion that grew into Data & Ops, a fractional product team.
This week’s toy: an e-ink device that looks a lot like a BlackBerry, but is an open source device. It’s a lot easier to type on a tactile mobile keyboard than a screen any day - and surprising that manufacturers have abandoned this form factor. Maybe it will come back! Edition 146 of this newsletter is here - it’s May 22, 2023.
The Big Idea
A short long-form essay about data things
⚙️ The Path to PLG Success: Data and GTM teams join forces to improve data and next actions
The first post in the Path to PLG Success series discussed challenges that data and go-to-market (GTM) teams face in B2B SaaS companies, especially when it comes to working together. It's all about making sure everyone's rowing toward the same goal with product-led growth to accelerate revenue.
TL;DR: the three challenges we highlighted were GTM tools not having all the product data they need, people not agreeing on what certain metrics mean, and the data in warehouses being fresher than what's in GTM tools.
We’re all about finding ways to help data and GTM teams work better together and drive growth for their companies. In this post, we’re building on tactics to address these challenges based on the way that data is stored - the metadata and schema of our shared data work.
Creating Harmony: Collaborative Database Schema Design for Better Data Teamwork
A database schema is a blueprint for organizing and managing data in a database, consisting of concepts like tables, fields, and relationships to define how information flows and is connected. In a lot of ways, the schema of a database lays out the organizational logic of a company, codifying business logic and processes into a digital structure. That means that the schema design is an important prerequisite for aligning company processes and data.
These schemas, by design, tend to be a bit rigid. Once you create a field or a table, you are probably not going to take it away. To do so requires you to evaluate whether something that depends on this might break. Field values, on the other hand, update frequently. As long as you don’t remove prior values, it’s usually easy to add new data values to your fields to update and adapt to new situations.
Field values may be easy to update, but the logic that depends on them is not. When business logic depends on specific values (for example, a picklist in Salesforce that determines which team receives a lead) and the team’s structure or name changes, the business logic has to change too. You need to understand when a field, or a data value in a field, drives downstream logic, and you might need help from a data team to make sure the output matches what you expect.
A show of virtual hands please: who has found a picklist data value in your GTM systems and had no idea where this came from? This is often a sign of data “drift”, or the movement of data from an intended purpose to something new that isn’t captured in institutional memory.
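To make the dependency concrete, here is a minimal sketch of a picklist value driving lead routing, with a check for values that have drifted away from any routing rule. The segment names, team names, and the "needs-review" fallback are all hypothetical; this is not how Salesforce or any particular CRM implements routing.

```python
# Hypothetical mapping from a lead "segment" picklist value to the owning team.
LEAD_ROUTING = {
    "Enterprise": "enterprise-sales",
    "Mid-Market": "commercial-sales",
    "Self-Serve": "growth",
}


def route_lead(segment: str) -> str:
    """Return the owning team, or flag the lead when the value is unmapped."""
    # An unmapped value is a likely sign of drift, so surface it instead of guessing.
    return LEAD_ROUTING.get(segment, "needs-review")


def find_drifted_values(values_in_crm) -> set:
    """List picklist values that exist in the CRM but have no routing rule."""
    return set(values_in_crm) - set(LEAD_ROUTING)


print(route_lead("Enterprise"))
print(find_drifted_values({"Enterprise", "SMB (legacy)"}))
```

Renaming a team means editing one mapping, and running the drift check on a schedule gives both teams a shared list of values nobody remembers creating.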
Database schemas are an implementation of company policy and logic
You need database schemas to define how information is stored, how changes are recorded and updated, and to track the progress of objects through the flows that define your business logic. Quite literally, the database schema and the records form the workflow of your business.
That means that the schema should reinforce the ideas that GTM and data teams collaborate on to ensure they are aligned. Some of these ideas require custom-built objects to map the items that exist between systems.
Picking up on the challenges we outlined in the first post of this series, there are some specific areas of conflict and collaboration that need to be addressed when building data schemas for GTM and data teams:
Building the database (and the information) to support the needs of fast-changing product data
Creating a structure that lets you define metrics on the data that make sense to the rest of the organization
Updating the data quickly to support fresh data in GTM (go-to-market) tools when the data changes outside of those tools
This is not an exhaustive list - it’s a few examples that show up often because when they are not solved, they cause conflict.
Fast-Changing Product Data
What happens when a user logs in to your app? At the very least, the last_seen_at timestamp is updated for that login, recording the time when the user took that action. In many cases, each action sends a “heartbeat” signal letting the app know the action, the user, the browser or app used, and the data sent.
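As a rough sketch of that heartbeat, the snippet below models one event and the last_seen_at update it causes. The Heartbeat fields and the record_heartbeat helper are illustrative names, not a real product schema, and the rule of ignoring late-arriving events is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Heartbeat:
    user_id: str
    action: str            # what the user did
    client: str            # browser or app used
    occurred_at: datetime  # when the action happened
    payload: dict = field(default_factory=dict)  # the data sent with the action


def record_heartbeat(last_seen_at: dict, event: Heartbeat) -> None:
    """Advance last_seen_at for the user, ignoring events that arrive late."""
    previous = last_seen_at.get(event.user_id)
    if previous is None or event.occurred_at > previous:
        last_seen_at[event.user_id] = event.occurred_at


last_seen_at = {}
login = Heartbeat("u1", "login", "web", datetime(2023, 5, 22, 9, 0, tzinfo=timezone.utc))
late = Heartbeat("u1", "page_view", "web", datetime(2023, 5, 22, 8, 0, tzinfo=timezone.utc))
record_heartbeat(last_seen_at, login)
record_heartbeat(last_seen_at, late)  # older event does not move the timestamp back
print(last_seen_at["u1"].isoformat())
```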
Product data is first-party data generated by users when they engage with your product. This data is voluminous – you might have hundreds or thousands of events happening every minute – so you need to be able to insert it into a table as fast as possible. Whether you are adding a new value to a dimension table or counting the number of events for a person, product data is critical data.
As a GTM leader, you need to easily see which individual product activity is associated with an account: not just the first time something happens, but also data over time, so you can uncover trends and patterns. That means your data team needs to make it easy to relate a user action in the app to user and account records in your GTM systems.
Both Data and GTM teams also need to be able to define a brand new event easily (a page visited, a combination of actions, or a value entered) and pass that event to other systems. That means that the name of the event needs to be changeable while still having a fixed ID for the system to track.
It’s important because if you’re waiting for a user to take an action in order to run a flow or add them to a segment, it’s this event that needs to trigger that process.
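One way to picture a renamable event with a fixed ID is a small registry where flows subscribe to the ID rather than the label. Everything here (EventRegistry, the "evt_001" ID, the segment list) is a hypothetical illustration under that assumption, not a reference to a specific tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EventDefinition:
    event_id: str  # fixed identifier that other systems key on
    name: str      # display name that teams are free to change


class EventRegistry:
    def __init__(self) -> None:
        self._definitions = {}
        self._handlers = defaultdict(list)

    def define(self, event_id: str, name: str) -> None:
        if event_id in self._definitions:
            raise ValueError(f"{event_id} is already defined")
        self._definitions[event_id] = EventDefinition(event_id, name)

    def rename(self, event_id: str, new_name: str) -> None:
        self._definitions[event_id].name = new_name

    def name_of(self, event_id: str) -> str:
        return self._definitions[event_id].name

    def on(self, event_id: str, handler) -> None:
        """Register a flow or segment update to run when the event fires."""
        self._handlers[event_id].append(handler)

    def emit(self, event_id: str, user_id: str) -> None:
        for handler in self._handlers[event_id]:
            handler(user_id)


registry = EventRegistry()
registry.define("evt_001", "Viewed pricing page")
trial_nurture_segment = []
registry.on("evt_001", trial_nurture_segment.append)

registry.rename("evt_001", "Pricing page visit")  # the label changes, the ID does not
registry.emit("evt_001", "user_42")
print(trial_nurture_segment)
```

Because the segment subscribed to the ID, the rename does not break the trigger.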
Creating a structure to support a metrics layer for GTM
Now that we’ve got an understanding of updating and maintaining product data, there’s another area where the data team and the GTM team need to work together: metrics.
That may not sound surprising, but it creates friction nonetheless. To define the meaning of a “Sales Qualified Lead” or a “Product Qualified Lead” you need to define the trigger or signal (taking action on a particular page, requesting a demo, or starting a trial and taking certain product actions). Each of these conditions likely touches product data, so the GTM team aligns with the data team to make sure the signals are present. The GTM team is ultimately responsible for managing the cohorts or audiences or lists that result from tracking important metrics.
A metric tracks something that changes from X to Y over a time period. That means that when defining a metric, the GTM team needs to be able to name the metric, identify the value that changes to inform it, and set the time period during which that value is checked.
Example: you want to know if first-time users take more than three actions in their first product session. To understand this, you need to know when they logged in, the date of their first login, and an accurate count of their product actions in the session. Once a user crosses that threshold, set an attribute for that person and alert other systems with a webhook so that your other teams can take action.
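Here is a hedged sketch of that first-session metric. It assumes a session ends after 30 minutes of inactivity, which is a common convention but not something defined in this post, and it only builds the payload a webhook might carry rather than sending anything. The function and field names are made up for illustration.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumption: inactivity that ends a session
ACTION_THRESHOLD = 3                 # "more than three actions"


def first_session_action_count(timestamps, gap=SESSION_GAP) -> int:
    """Count actions from the first login until the first gap longer than `gap`."""
    ordered = sorted(timestamps)
    if not ordered:
        return 0
    count = 1
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > gap:
            break  # the first session ended here
        count += 1
    return count


def build_attribute_update(user_id: str, timestamps) -> dict:
    """Build the attribute payload another system could receive via webhook."""
    count = first_session_action_count(timestamps)
    return {
        "user_id": user_id,
        "metric": "first_session_actions",
        "value": count,
        "engaged_first_session": count > ACTION_THRESHOLD,
    }


first_login = datetime(2023, 5, 22, 9, 0)
actions = [first_login + timedelta(minutes=m) for m in (0, 5, 10, 15)]
actions.append(first_login + timedelta(hours=3))  # a later, separate session
payload = build_attribute_update("user_42", actions)
print(payload)
```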
As a RevOps leader, you’re going to set the threshold for action. Is there a lower bound for your metric that requires activity? For example, an active trial may not have enough sales tasks completed. Is there an upper bound that requires activity, e.g. after 3 or 4 sales touches you should know if the lead is sales-qualified? Setting the threshold for data is important, and the data needs to be there to support it. The Data team provides that data, and GTM teams determine how best to use it to inform the next actions.
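The lower and upper bounds can be written down as data so that both teams can read them. This is a small sketch with made-up metric names and limits; the actual numbers are the RevOps leader’s call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    metric: str
    lower: Optional[float] = None  # below this, activity is required
    upper: Optional[float] = None  # at or above this, a decision is due

    def check(self, value: float) -> str:
        if self.lower is not None and value < self.lower:
            return "below_lower_bound"
        if self.upper is not None and value >= self.upper:
            return "reached_upper_bound"
        return "ok"


# Hypothetical thresholds: an active trial should have at least 2 sales tasks,
# and after 4 sales touches the lead should be qualified or released.
trial_tasks = Threshold("sales_tasks_completed", lower=2)
sales_touches = Threshold("sales_touches", upper=4)

print(trial_tasks.check(1))
print(sales_touches.check(4))
```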
Quickly updating data for the needs of the business
GTM teams depend upon the freshness of product data to know what to do next. This means that product data needs to be updated both on a cadence and on demand. Most people – when asked – would say they need real-time data on product actions like a user raising their hand to request information from inside the product because they’re stuck or want to learn more.
Scheduled updates run several times daily and update GTM systems with up-to-date information from the data warehouse. But GTM systems sometimes need more frequent updates. That’s why data teams also need to provide updates on demand – often with webhooks – at the same time an event occurs.
For example, think about a demo request from the website that gets submitted from a form. You want two things to happen that are in potential opposition:
An almost instantaneous response from the sales team to keep the prospect engaged
The best possible data enrichment for that prospect/account before the engagement happens
If the enrichment happens on a schedule, you might not have the information you need as a seller to make the best impression and align the product with the prospect’s needs. This is a case that calls for a webhook or another inline process to trigger enrichment, effectively letting that record jump the line.
Does everything need to be updated in real-time? It depends, of course. You need to structure your tool stack to handle both real-time and delayed action. Ask yourself if the record needs to be updated instantly or in a few minutes or hours. A request for a demo or for help? Instant. An aggregated count of activities on an account? That might be fine if it waited an hour.
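A sketch of handling both paths might look like the router below: some event types are enriched and sent immediately, and everything else waits for the next scheduled sync. The event type names, the SyncRouter class, and the send_now and enrich callables are assumptions for illustration; a real setup would call your enrichment provider and GTM tool here.

```python
# Hypothetical event types that should skip the scheduled sync.
REALTIME_EVENTS = {"demo_requested", "help_requested"}


class SyncRouter:
    def __init__(self, send_now, enrich):
        self.send_now = send_now  # e.g. a function that posts to a webhook
        self.enrich = enrich      # inline enrichment for urgent records
        self.batch = []           # records waiting for the scheduled sync

    def handle(self, event: dict) -> str:
        if event["type"] in REALTIME_EVENTS:
            # Jump the line: enrich inline, then deliver right away.
            self.send_now(self.enrich(event))
            return "sent_now"
        self.batch.append(event)
        return "queued"

    def flush(self) -> list:
        """Called by the scheduled job; returns and clears the pending batch."""
        pending, self.batch = self.batch, []
        return pending


sent = []
router = SyncRouter(send_now=sent.append, enrich=lambda e: {**e, "enriched": True})
print(router.handle({"type": "demo_requested", "email": "pat@example.com"}))
print(router.handle({"type": "activity_count_updated", "account_id": "acct_1"}))
print(len(router.flush()))
```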
The contract that data teams and GTM teams make to work together should be designed to help the business thrive.
Ideas for helping data teams and GTM teams
Here are some additional ideas to help Data Teams and GTM Teams to work together:
Keep the schema flexible - it’s better to be able to add new events and ideas than it is to have a perfect system because it’s bound to change
Record decisions about changes. When the business agreement changes, write it down somewhere. Whether you store this change in GitHub or in Miro or Google Docs or Notion, make it easy to find.
What sort of data should organizations be tracking?
If you start with the key common metrics that you need (my go-to metrics often come from Jason Lemkin, who suggests ARR, ARR Growth Rate, Burn Rate, and Net Retention Rate), there are some attributes that need to be available to calculate these items.
At a minimum, you should have tables for users, accounts, teams, or any other ‘nouns’ within your product that have an impact. For example, a developer product may have ‘instances’ as a table in their database, and Slack lets you know the channels and messages in your workspace. Align your data tables to the way your company views the business.
Within those tables, you need to be able to count things like users, active users, accounts, and activities (major actions in your app), and measure changes to these data points over time. Also, you need to detail how much money a customer spends in a month (or the cumulative amount that the customer has spent, sliced by time stamp) and allocate that spend to particular products.
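For illustration, the counts described above can be sketched with plain Python over in-memory rows. The row shapes (user_id and date, or account, product, date, and amount in cents) are assumptions; in practice this logic would more likely live in SQL in the warehouse.

```python
from collections import defaultdict
from datetime import date


def monthly_active_users(activities) -> dict:
    """activities: iterable of (user_id, date). Distinct users per (year, month)."""
    users_by_month = defaultdict(set)
    for user_id, day in activities:
        users_by_month[(day.year, day.month)].add(user_id)
    return {month: len(users) for month, users in users_by_month.items()}


def monthly_spend_by_product(payments) -> dict:
    """payments: iterable of (account_id, product, date, amount_cents)."""
    totals = defaultdict(int)
    for account_id, product, day, amount_cents in payments:
        totals[(account_id, product, day.year, day.month)] += amount_cents
    return dict(totals)


activities = [
    ("u1", date(2023, 5, 1)),
    ("u1", date(2023, 5, 9)),
    ("u2", date(2023, 5, 3)),
    ("u1", date(2023, 6, 2)),
]
payments = [
    ("acct_1", "core", date(2023, 5, 1), 5000),
    ("acct_1", "core", date(2023, 5, 20), 2500),
    ("acct_1", "add_on", date(2023, 5, 20), 1000),
]
print(monthly_active_users(activities))
print(monthly_spend_by_product(payments))
```

Amounts are kept as integer cents to avoid floating-point rounding when summing.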
You’ll also need to build custom objects that represent other ideas in your business. An example of this might be an object to track the sales cycle if you want to group product actions into a customer journey. You might also need to build views against information that spans tables, like a company view that incorporates account data, product activity, and sales history.
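A company view like the one described can be sketched with Python’s built-in sqlite3 module. The table and column names below are hypothetical, and a warehouse would use its own SQL dialect, but the idea of a view that spans account, product activity, and sales tables carries over.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE accounts (account_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE users (user_id TEXT PRIMARY KEY,
                        account_id TEXT REFERENCES accounts(account_id));
    CREATE TABLE product_events (user_id TEXT REFERENCES users(user_id),
                                 event_id TEXT, occurred_at TEXT);
    CREATE TABLE opportunities (account_id TEXT REFERENCES accounts(account_id),
                                stage TEXT, amount INTEGER);

    -- Correlated subqueries keep each count independent, so joining
    -- several child tables does not multiply rows.
    CREATE VIEW company_view AS
    SELECT
        a.account_id,
        a.name,
        (SELECT COUNT(*) FROM users u
          WHERE u.account_id = a.account_id) AS user_count,
        (SELECT COUNT(*) FROM product_events e
           JOIN users u ON u.user_id = e.user_id
          WHERE u.account_id = a.account_id) AS event_count,
        (SELECT COALESCE(SUM(o.amount), 0) FROM opportunities o
          WHERE o.account_id = a.account_id
            AND o.stage = 'closed_won') AS closed_won_amount
    FROM accounts a;
    """)
    conn.execute("INSERT INTO accounts VALUES ('acct_1', 'Acme')")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("u1", "acct_1"), ("u2", "acct_1")])
    conn.executemany("INSERT INTO product_events VALUES (?, ?, ?)",
                     [("u1", "evt_001", "2023-05-01"),
                      ("u1", "evt_002", "2023-05-02"),
                      ("u2", "evt_001", "2023-05-02")])
    conn.executemany("INSERT INTO opportunities VALUES (?, ?, ?)",
                     [("acct_1", "closed_won", 5000), ("acct_1", "open", 2000)])
    return conn


conn = build_demo_db()
rows = conn.execute("SELECT * FROM company_view").fetchall()
print(rows)
```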
One more thing - it helps to have information about higher levels of abstraction like audience or segment definitions based on tenure, spending, activity, or company size. If you’d like some details on calculating these additional audiences and metrics, check out this post on SaaS metrics.
Next time, we’ll talk about building accurate and reliable reporting in a data-driven organization.
What’s the takeaway? Data teams are responsible for delivering clean, consumable data views, and GTM teams need to ask for what they need to run the business and make decisions. The structure, metrics, and cadence of refreshing data all feed into this process.
Links for Reading and Sharing
These are links that caught my 👀
1/ A very satisfying click - I found this story charming. Go to this page and click the toggle in the upper right-hand side of the page. What’s interesting is the amount of engineering that goes into a simple interaction. From the physics of the click and the reaction to the sound design, there’s a lot going on here.
2/ What’s in a prompt? - The team at Brex has written a guide to help demystify the process of writing prompts for LLMs and Chatbots. If you didn’t know it already, giving proper context is a critical step to getting a good response from ChatGPT and other bots. By explaining an example, asking for specific results, and making the bot show its work, you have a better chance of getting the right answer. You also will have a better idea of how the model got to the result.
3/ A simple idea - To make something simple, make it predictable. Making things predictable gives control and certainty to the user.
What to do next
Hit reply if you’ve got links to share, data stories, or want to say hello.
Want to book a discovery call to talk about how we can work together?
The next big thing always starts out being dismissed as a “toy.” - Chris Dixon