Your Data Means Nothing Without The Story

Jeremy Willets
Oct 18, 2022

Introduction

I don’t need to see research from the New York Times to make the following assumption: we are surrounded by more data than at any point in human history. Whether it’s baseball stats, the weather forecast, or information on how a team did over the past two weeks, data is everywhere. All the time. It’s on our televisions, smartphones, e-mail, books. There is even a litany of new-ish terms to describe our data: metadata, database, data warehouse, big data, data lake. These are terms that I’ve encountered in the course of my career. But there’s something that’s often missing from the conversation about data: the story behind it.

An Example

Before we dig deeper into the relevance of stories to data, let’s look at an example where we have data… but no story.

One of my hobbies is making music. I’ve been doing it for years — the last half dozen or so have been exclusively on my iPad under the alias TLNGO. I recorded a song called “Endless Desire” in 2019 and released it through the same distribution channels as all of my other music. And then I moved on to writing and recording additional songs.

Fast forward to some point in 2021. I opened the Apple Music for Artists app to check out the streaming stats for my catalog. Streaming services provide artists with special dashboards that show stream counts, as well as locations of streamers. I rarely get much traffic for my music, but I still open up these dashboards every once in a while to see how my tracks are doing. It can be a nice ego boost, and I’m always curious to see where the streams are coming from (rarely anywhere near home).

When I looked at the Shazam stats for “Endless Desire,” I was amazed. They were far higher than anything else in my catalog. (Shazam is the music discovery service that Apple purchased in 2018. An oversimplified version of how it works: you hear a song you like in a coffee shop, grocery store, or TV commercial, you open the app, and Shazam listens to the audio, matches it with the actual song, and links you to where you can hear it or buy it. So now you can listen to the entire song, add it to a playlist, win a wager with a friend that it was/wasn’t a certain artist, etc.)

Picture: TLNGO Shazam stats as of 10/10/2022.

My initial reaction was “huh?” Nothing in the streaming stats for my catalog on any of the streaming providers led me to believe that people were actually listening to the track. But clearly a lot of people were using Shazam to match it.

I had Shazam activity in quite a few major cities — Paris, Mexico City, New York City, London, and even Moscow.

Naturally, my next step was to make up my own story about what might be going on.

Here are some of the thoughts that went through my head:

Someone’s using my track in a commercial (without my permission).
A lot of folks have downloaded it and aren’t streaming it.
The streaming services aren’t properly tracking my music.
There’s a network of Shazam bots in the world and they’ve latched on to this track.
I’m huge in Paris and I just don’t know it yet.

As you can see, some of these are more far-fetched than others. But these are the kinds of assumptions a person can make without a shred of story to fall back on.

Signal and Noise

“Signal” and “noise” are two concepts I’d like to briefly unpack. We’ll use them again in a minute.

The signal is the meaningful data that you’re trying to find. The noise is what gets in the way of seeing the signal.

In the “Endless Desire” example above, I see a really loud “noise” (look at all of those Shazams!). But I’m clueless about the “signal.” I just don’t have a way to contextualize the numbers that are showing up on my dashboard, or sort through the proverbial noise. Maybe I need to take a trip to Paris to figure out what’s going on, because only there might I uncover the true story of what’s actually happening.

Data in the Context of Agile

Here’s a (very incomplete) list of the types of data I’ve encountered over my career while working with Agile teams:

Throughput
Predictability
Business Value (Projected and/or Delivered)
WIP
Story Points Completed
Work Items Completed
Blocked Items
Percentiles
Cycle Time

You get the idea. There’s a lot here. It’s enough to make your head spin and your pulse quicken.

But how much of this data is signal and how much is noise?

As with most things in life, it depends. In this case, it depends on the story behind the data.

Let’s take a look at an example.

Pretend you’ve been asked to work with an existing Scrum team. Before your first “meet and greet” with them, you take a look at some of the data they’ve been tracking, and all of it looks good. They’re getting work done within the sprint boundaries, and their throughput seems consistent. Your “meet and greet” is just prior to the team’s Sprint Review. The “meet and greet” goes well, and you’re really optimistic that this is going to be the greatest Sprint Review you’ve ever seen. The Sprint Review begins and the team demonstrates their “Done” work to the handful of stakeholders who are present. After the demonstration finishes, the team asks for feedback from the stakeholders. After a few moments of silence, one of the attendees from marketing speaks up and says, “It’s great that you finished off everything, but our customers can’t use this feature.”

In this example, noise is everywhere. The team’s data is the primary culprit. It’s front and center and visible for anyone to check out. The signal, though, is much harder to see. Here, the meaningful data (the signal) is whether the customer can use the increment that the team delivered in the sprint. And as we read above, customers are not going to be able to use the feature that the team developed.

Differentiating Between Signal and Noise in Agile

Agile is predicated on delivering valuable product to customers at a sustainable pace. Neither of these is necessarily easy to see.

Let’s look at how to spot whether the work being done is delivering value to customers. There’s undoubtedly a lot of noise in this picture. In fact, there’s probably more noise than signal. Everyone from Scrum Masters to Senior Management can get infatuated with looking at things like burndown/burnup charts and obsessing about ebbs and flows in throughput. These are only tangentially related to delivering value to customers. They’re indicators that a team is progressing through work, and of the pace at which they’re doing it. But they assume that the work the team is doing is going to be embraced and used by customers. Many times, that’s a big assumption. And whether customers actually embrace the work is the signal. So how to spot delivery of customer value? In many cases, it’s going to be a lagging indicator like product/feature adoption and overall sales.

Sustainable pace is the next item to unpack. It can be tricky to quantify because Agile tends to be full of lightweight methods that are focused on doing the highest value thing at all times, and don’t include things like time tracking. In the traditional project management paradigm, it’s easy to see how much people are working because every few days, someone goes into the tool that’s used to track time and queries to see how much time people are logging to the project. That assumes that people are tracking their time appropriately. Because so much of Agile is focused on delivering valuable product to customers, it’s possible that people and teams could be working at an unsustainable pace. But how would someone notice this? A few thoughts:

If the team is using Scrum, check out their throughput for a handful of recent sprints. Is work getting done consistently throughout? Or is there a pattern of closing lots of work on the final day?
If the team is using Kanban, check out their WIP. Do they have WIP limits on their board? How often do they exceed them? How long does exceeding the WIP limit last? Is there an “expedite” concept? How often is it used? By whom?
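
For anyone who wants to put rough numbers behind those two checks, here’s a minimal Python sketch. It rests on a few assumptions of mine: that you can export closed work items as (sprint, close date) pairs, that you know each sprint’s end date, and that you have a daily WIP count for the board. The function names and the sample data are made up for illustration and don’t come from any particular tool.

```python
from collections import Counter
from datetime import date


def final_day_share(closed_items, sprint_end):
    """Return {sprint: fraction of its closed items that closed on the last day}."""
    totals = Counter()
    last_day = Counter()
    for sprint, closed_on in closed_items:
        totals[sprint] += 1
        if closed_on == sprint_end[sprint]:
            last_day[sprint] += 1
    return {sprint: last_day[sprint] / totals[sprint] for sprint in totals}


def wip_breaches(daily_wip, limit):
    """Return (number of days over the limit, longest consecutive run over it)."""
    days_over = 0
    longest = 0
    current = 0
    for wip in daily_wip:
        if wip > limit:
            days_over += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return days_over, longest


# Hypothetical sample data, invented for this sketch.
sprint_end = {"S1": date(2022, 9, 16), "S2": date(2022, 9, 30)}
closed_items = [
    ("S1", date(2022, 9, 8)),
    ("S1", date(2022, 9, 12)),
    ("S1", date(2022, 9, 14)),
    ("S1", date(2022, 9, 16)),
    ("S2", date(2022, 9, 22)),
    ("S2", date(2022, 9, 30)),
    ("S2", date(2022, 9, 30)),
    ("S2", date(2022, 9, 30)),
]

shares = final_day_share(closed_items, sprint_end)
breaches = wip_breaches([3, 4, 6, 7, 5, 6, 4], limit=5)

print(shares)    # share of each sprint's work that closed on its final day
print(breaches)  # (days over the WIP limit, longest streak over the limit)
```

A high final-day share or a long streak over the WIP limit doesn’t prove the pace is unsustainable. It’s a prompt to go ask the team for the story behind the number.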

Why The Story Matters

Humans are natural storytellers. In the workplace, we’re surrounded by data points that we think tell a story. But it’s often an incomplete story. And when we try to make assumptions about the story, we just get further down the proverbial rabbit hole. Why did sales of a given product spike at the end of the quarter? We’d like to think that it was because of the new feature we just added, but was it really because we got a large order at the very end of the quarter?

When pairing stories with data, we must strive to find the true story. After all, it’s only with this context that you can truly identify continuous improvement opportunities. Take, for example, the story above where the team’s data looked great, but the product they were developing wasn’t going to be usable by customers. If the only window into the team was the data, they’d be passing with flying colors. But when the usability of the new feature to customers comes into play, the view changes considerably. Instead of marveling at great-looking data, the conversation becomes more about getting closer to delivering what customers actually need. That’s a great place to be for any team, whether they’re using Agile or not.