Let's bring two big ideas together: combining data sources and ensuring your sample is representative. Getting the join right is only half the battle: if the underlying sample is biased, the combined dataset inherits that bias.
Engagement Message
What's a risk of combining a biased customer dataset with a complete sales dataset?
Type
Fill In The Blanks
Markdown With Blanks
To answer the question "Do customers from our loyalty program buy more expensive products?", you need to join the Loyalty table with the Sales table and the Products table.
The Loyalty and Sales tables can be joined on [[blank:Customer ID]]. The Sales and Products tables can be joined on [[blank:Product ID]].
Suggested Answers
- Customer ID
- Product ID
- Order ID
- Name
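The three-table join described above can be sketched in code. This is a minimal illustration using Python's built-in `sqlite3` module, not a definitive implementation: the table layouts, the column names (`customer_id`, `product_id`, `order_id`, `price`), the helper name `average_price_by_membership`, and all sample rows are assumptions invented for the example.

```python
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loyalty  (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE sales    (order_id INTEGER PRIMARY KEY,
                           customer_id INTEGER,
                           product_id INTEGER);
""")
conn.executemany("INSERT INTO loyalty VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(10, 5.0), (11, 20.0), (12, 50.0)])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (100, 1, 12), (101, 1, 11), (102, 2, 12),   # loyalty members
    (103, 3, 10), (104, 4, 11), (105, 3, 10),   # non-members
])


def average_price_by_membership(connection):
    """Return the average purchased price for members and non-members."""
    query = """
        SELECT CASE WHEN l.customer_id IS NULL
                    THEN 'non_member' ELSE 'member' END AS grp,
               AVG(p.price)
        FROM sales AS s
        -- Sales to Products on Product ID: attach a price to every sale.
        JOIN products AS p ON s.product_id = p.product_id
        -- Sales to Loyalty on Customer ID: LEFT JOIN keeps non-member
        -- sales so there is a comparison group.
        LEFT JOIN loyalty AS l ON s.customer_id = l.customer_id
        GROUP BY grp
    """
    return dict(connection.execute(query).fetchall())


result = average_price_by_membership(conn)
print(result)
```

The `LEFT JOIN` is a design choice for this sketch: an inner join to the loyalty table would drop every non-member sale, leaving nothing to compare the members against.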
Type
Multiple Choice
Practice Question
You analyze survey data and find that 95% of respondents love your new feature. However, the survey was only sent to your top 10% of most active users. What is the main problem here?
A. The data has missing values
B. The sample is biased and not representative
C. The datasets were combined incorrectly
D. The correlation is spurious
Suggested Answers
