Interpret Models and Question Assumptions

In this unit, you’ll use concepts from the HBR Guide to Data Analytics Basics for Managers to develop the practical skill of looking beneath the surface of analytics—so you can spot the strengths and weaknesses in any model or dashboard before making decisions. The ability to question assumptions is what separates analytics-savvy managers from those who simply accept results at face value.

Spotting Assumptions in Models and Dashboards

Every model or dashboard is built on a foundation of assumptions, whether about the data, the business environment, or how people behave. For example, a sales forecast might quietly assume "last year’s seasonality will repeat," while a lead scoring tool could be based on the idea that "website activity always signals purchase intent." If these assumptions don’t hold, the results can quickly become misleading.

To get to the heart of a model, ask what data it uses, what patterns it expects to continue, and whether any shortcuts or “rules of thumb” are built in. You don’t need to be a statistician—just curious and willing to probe. It is vital to recognize that all models are intentional simplifications of a complex reality; as the saying goes, "all models are wrong, but some are useful."

When spotting assumptions, distinguish between "looking-back" data and "looking-forward" needs. Many dashboards rely on historical data that describes the past well but cannot account for shifting conditions, such as a new competitor entering the market or a change in customer behavior. A simple question like "What would happen if this key assumption changed?" can reveal a lot about the model's reliability.
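To make that question concrete, here is a minimal Python sketch that applies it to the seasonality assumption mentioned earlier. Every number and the "flattened demand" scenario are made up purely for illustration; the point is to compare the forecast with and without the assumption, not to prescribe a forecasting method.

```python
# "What would happen if this key assumption changed?" applied to seasonality.
# All numbers below are illustrative.

baseline_monthly_sales = 100_000  # assumed average monthly sales
last_year_factors = [0.8, 0.9, 1.0, 1.1, 1.3, 1.2,
                     1.0, 0.9, 1.0, 1.1, 1.4, 1.3]  # last year's monthly seasonality

def annual_forecast(seasonality_factors):
    """Forecast annual sales as the baseline scaled by each month's factor."""
    return sum(baseline_monthly_sales * factor for factor in seasonality_factors)

# Scenario 1: the model's built-in assumption holds (last year's pattern repeats).
repeats = annual_forecast(last_year_factors)

# Scenario 2: the assumption breaks (e.g., a new competitor flattens demand).
flattens = annual_forecast([1.0] * 12)

print(f"Forecast if seasonality repeats:     {repeats:,.0f}")
print(f"Forecast if seasonality flattens:    {flattens:,.0f}")
print(f"Swing driven by this one assumption: {repeats - flattens:,.0f}")
```

If the swing between scenarios is large relative to the decision at stake, the assumption deserves a closer look before you trust the forecast.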

Judging Data Quality and Sensitivity

Trustworthy analysis depends on the quality and representativeness of the underlying data. It’s important to consider whether the sample is large and diverse enough to reflect reality. For instance, if a customer survey only includes responses from power users, it may not represent the broader customer base.

You should also investigate where and how the data was created. Was it captured automatically, or entered manually by a tired employee on a Friday afternoon? A quick "manual audit" of 10 to 20 random records can tell you a lot: obvious errors, such as misspelled names, missing prices, or nonsensical dates, are a sign that the broader dataset may be unreliable.
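If the data lives in a spreadsheet or CSV export, that spot check can be scripted in a few lines. In this sketch the file name and column names (name, price, order_date) are hypothetical stand-ins for whatever your own export contains:

```python
import pandas as pd

# Quick "manual audit": sample 10-20 records and flag obvious data-entry problems.
# The file name and columns (name, price, order_date) are hypothetical.
records = pd.read_csv("crm_export.csv")
sample = records.sample(n=15, random_state=42)

suspect = sample[
    sample["name"].isna()                                           # missing names
    | sample["price"].isna() | (sample["price"] <= 0)               # missing or impossible prices
    | pd.to_datetime(sample["order_date"], errors="coerce").isna()  # unparseable dates
]

print(f"{len(suspect)} of {len(sample)} sampled records look suspect:")
print(suspect)
```

Even a handful of suspect records in a small sample is a cue to ask how the data was collected before building decisions on top of it.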

Sensitivity is another key concept. Outliers—unusual data points—can distort results, such as when "one huge deal last quarter made our average deal size look much bigger than normal." If small changes in the data or assumptions lead to big swings in results, the model may be fragile. For example, "If we exclude the top 5% of spenders, does our revenue forecast drop sharply?" If so, you’ll want to be cautious about relying on the model for major decisions.
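Here is a minimal sketch of that kind of sensitivity check in Python. The deal values are made up; the idea is simply to compare the headline metric with and without the largest deals:

```python
import numpy as np

# Sensitivity check: does one huge deal distort "average deal size"?
# Deal values are made up for illustration.
deals = np.array([8_000, 9_500, 7_200, 10_000, 8_800, 9_100, 250_000])  # last deal is an outlier

mean_all = deals.mean()
cutoff = np.percentile(deals, 95)          # boundary for the top 5% of deals
mean_trimmed = deals[deals <= cutoff].mean()

print(f"Average deal size, all deals:       {mean_all:,.0f}")
print(f"Average deal size, top 5% excluded: {mean_trimmed:,.0f}")
# A large gap between the two numbers means the metric is fragile:
# one outlier, not a real trend, is driving the headline figure.
```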

  • Chris: The new lead scoring dashboard looks impressive, but I’m curious—what data did you use to build it?
  • Victoria: Mostly last quarter’s leads and their outcomes. We focused on website activity and email engagement as the main predictors.
  • Chris: Did you check if those predictors work for all customer segments, or just the ones who are most active online?
  • Victoria: That’s a good point. Most of our data is from tech-savvy clients, so it might not represent everyone.
  • Chris: Also, what happens if we get a batch of leads who don’t fit that pattern—like referrals or event signups? Does the model still hold up?
  • Victoria: Honestly, I’m not sure. We haven’t tested it on those groups yet.
  • Chris: Maybe we should run a quick check before rolling it out to the whole team, just to be safe.

In this exchange, Chris demonstrates how to probe for assumptions about data sources, representativeness, and sensitivity. Notice how the questions are practical and non-confrontational, helping the team spot potential weaknesses before making a big decision.
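If the underlying lead data is available, the "quick check" Chris suggests might look something like the sketch below. The file name, the columns (source, score, converted), and the scoring threshold of 70 are all assumptions made for illustration:

```python
import pandas as pd

# Segment check before rollout: does the lead score predict conversion
# equally well for every lead source? File name, columns (source, score,
# converted), and the threshold of 70 are illustrative assumptions.
leads = pd.read_csv("scored_leads.csv")
leads["predicted_hot"] = leads["score"] >= 70

hot = leads[leads["predicted_hot"]]
summary = pd.DataFrame({
    "leads": leads.groupby("source").size(),
    "conversion_rate": leads.groupby("source")["converted"].mean(),
    "hot_lead_precision": hot.groupby("source")["converted"].mean(),
})

print(summary)
# If hot_lead_precision drops sharply for referrals or event signups, the
# model's assumptions about online activity do not transfer to those segments.
```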

Recognizing Overconfidence and Communicating Limits

Precise numbers and polished dashboards may be impressive, but that doesn't necessarily mean their results are solid. Be alert for signs of overconfidence, such as results presented as “certain” without any mention of limitations. Beware of using data for "advocacy"—trying to win an argument—rather than "inquiry"—trying to find the truth. If an analysis seems to perfectly support a pre-existing belief, that’s exactly when you should look for evidence that might disprove it.

A responsible approach is to restate findings with appropriate caution and context. Avoid the "statistical methods story" where you get bogged down in how the numbers were crunched; instead, focus on the "business impact story" while highlighting the range of possible outcomes. Instead of "This campaign will increase sales by 20%", you might say, "Based on current data, we expect sales to rise, but results could vary depending on market conditions."
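One lightweight way to produce such a range, rather than a single precise number, is to resample the results you already have. The sketch below uses made-up per-region uplift figures and a simple bootstrap; it illustrates the idea of reporting a plausible range, not a prescribed method:

```python
import numpy as np

# Turning a point estimate into a range: bootstrap the observed lift from a
# campaign test. The per-region uplift figures are made up for illustration.
rng = np.random.default_rng(0)
observed_uplift = np.array([0.18, 0.25, 0.12, 0.30, 0.15, 0.22, 0.08, 0.27])

boot_means = [
    rng.choice(observed_uplift, size=len(observed_uplift), replace=True).mean()
    for _ in range(5_000)
]
low, high = np.percentile(boot_means, [5, 95])

print(f"Point estimate:  {observed_uplift.mean():.0%} lift")
print(f"Plausible range: {low:.0%} to {high:.0%} lift")
# "We expect sales to rise by roughly this much, though anywhere in this range
# is plausible" communicates the same finding with honest uncertainty.
```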

By questioning assumptions, checking data quality, and communicating limits, you’ll help your team avoid surprises and make decisions that stand up to real-world complexity. In the upcoming roleplay, you’ll get to practice these skills by challenging a new lead-scoring tool and deciding how much trust to place in its results.
