Section 1 - Instruction

You've learned the fundamentals of tabular data: features, labels, and data types. Now, let's put that knowledge into practice by identifying these components in different scenarios.

Engagement Message

Ready to test your skills?

Section 2 - Practice

Type

Multiple Choice

Practice Question

Imagine you're building a model to predict whether a customer will renew their subscription. Which of the following would be the label?

A. Customer Age
B. Monthly Bill Amount
C. Subscription Status (Renewed/Canceled)
D. Last Login Date

Suggested Answers

  • A
  • B
  • C - Correct
  • D
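
To see the feature/label split from this question in code, here is a minimal Python sketch. The records, column names, and values are hypothetical and invented purely for illustration; a real project would more likely load the table from a file.

```python
# Hypothetical customer records; column names and values are made up for illustration.
customers = [
    {"age": 34, "monthly_bill": 29.99, "last_login": "2024-05-01", "status": "Renewed"},
    {"age": 51, "monthly_bill": 49.99, "last_login": "2024-02-17", "status": "Canceled"},
    {"age": 27, "monthly_bill": 19.99, "last_login": "2024-05-20", "status": "Renewed"},
]

LABEL_COLUMN = "status"  # the value the model should learn to predict

# Features: every column except the label.
features = [
    {name: value for name, value in row.items() if name != LABEL_COLUMN}
    for row in customers
]

# Labels: only the label column, one value per row.
labels = [row[LABEL_COLUMN] for row in customers]

print(features[0])
print(labels)
```

The model is trained on `features` to predict `labels`, so the label column has to be kept out of the feature set.
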
Section 3 - Practice

Type

Sort Into Boxes

Practice Question

Sort these attributes into the correct feature type.

Labels

  • First Box Label: Numerical Feature
  • Second Box Label: Categorical Feature

First Box Items

  • Age
  • Account Balance
  • Years as Customer

Second Box Items

  • City
  • Product Category
  • Job Title
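
One rough way to sort attributes like these automatically is to check the type of each value. The sketch below assumes a single hypothetical record with invented values, and the helper `feature_type` is an illustrative name, not part of any library.

```python
# Hypothetical record; the values are invented for illustration.
record = {
    "Age": 42,
    "Account Balance": 1530.75,
    "Years as Customer": 6,
    "City": "Lisbon",
    "Product Category": "Electronics",
    "Job Title": "Engineer",
}


def feature_type(value):
    """Rough heuristic: numbers are numerical, everything else is categorical."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "Numerical Feature"
    return "Categorical Feature"


boxes = {"Numerical Feature": [], "Categorical Feature": []}
for name, value in record.items():
    boxes[feature_type(value)].append(name)

print(boxes)
```

This is only a heuristic: a column stored as numbers, such as a ZIP code, can still be categorical in meaning, so the result is worth reviewing by hand.
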
Section 4 - Practice

Type

Fill In The Blanks

Markdown With Blanks

Let's spot some data quality issues. Fill in the blanks to identify the problems in this dataset description.

In the 'Age' column, one entry is blank, which is a [[blank:missing value]]. In the 'Country' column, we see "USA", "U.S.A.", and "United States", which is an [[blank:inconsistent format]] issue.

Suggested Answers

  • missing value
  • inconsistent format
  • error
  • label
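
Both issues from this exercise can be checked with a few lines of Python. The sketch below is a minimal illustration under stated assumptions: the rows are hypothetical, `None` stands in for the blank 'Age' entry, and the `COUNTRY_ALIASES` mapping is something you would have to define yourself for your own data.

```python
# Hypothetical rows; None stands in for the blank 'Age' entry.
rows = [
    {"Age": 31, "Country": "USA"},
    {"Age": None, "Country": "U.S.A."},
    {"Age": 45, "Country": "United States"},
]

# Assumed mapping from lowercase variant spellings to one canonical name.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
}

# Missing values: indices of rows where 'Age' is blank.
missing_age_rows = [i for i, row in enumerate(rows) if row["Age"] is None]

# Inconsistent format: count distinct spellings before and after normalizing.
raw_countries = {row["Country"] for row in rows}
clean_countries = {
    COUNTRY_ALIASES.get(row["Country"].strip().lower(), row["Country"])
    for row in rows
}

print(missing_age_rows)
print(len(raw_countries), len(clean_countries))
```

Three distinct spellings collapsing into one after normalization is a sign that the column had a formatting problem rather than three different countries.
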
Section 5 - Practice

Type

Swipe Left or Right

Practice Question

When working with real estate data, some attributes work well as features while others could serve as labels for different prediction models. Sort these attributes based on their typical role.

Labels

  • Left Label: Good Feature
  • Right Label: Potential Label

Left Label Items

  • Square Footage
  • Number of Bedrooms
  • Neighborhood
  • Year Built

Right Label Items

  • Sale Price
  • Rental Income
  • Time on Market
  • Assessed Tax Value
Section 6 - Practice

Type

Multiple Choice

Practice Question

Look at this small dataset. What is the most obvious data quality problem?

| Name    | Age | City     |
|---------|-----|----------|
| Alice   | 28  | New York |
| Bob     | 35  | London   |
| Charlie | 250 | Paris    |

A. Missing values
B. Inconsistent formatting
C. An outlier or error in the 'Age' column
D. The 'Name' column should be numerical

Suggested Answers

  • A
  • B
  • C - Correct
  • D
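
A simple range check is one way to catch a value like this in code. The sketch below uses the three rows from the question; the bounds `MIN_AGE` and `MAX_AGE` are assumed values chosen for illustration, not a standard.

```python
people = [
    {"Name": "Alice", "Age": 28, "City": "New York"},
    {"Name": "Bob", "Age": 35, "City": "London"},
    {"Name": "Charlie", "Age": 250, "City": "Paris"},
]

# Assumed plausible range for a human age; adjust to the data at hand.
MIN_AGE, MAX_AGE = 0, 120

# Collect the names of rows whose age falls outside the plausible range.
suspicious = [p["Name"] for p in people if not (MIN_AGE <= p["Age"] <= MAX_AGE)]

print(suspicious)
```

A flagged row is a candidate for review, not proof of an error: the check only says the value falls outside the range you chose.
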