Getting Started with Amazon SageMaker

Introduction & Lesson Overview

Welcome to the next step in your machine learning journey: moving from training models locally to building and deploying them in the cloud with Amazon SageMaker. In the previous course, you built machine learning workflows on your local machine, including data preprocessing, model training, evaluation, and deploying models as REST APIs. Moving these workflows to the cloud gives you on-demand compute, which makes it easier to scale your machine learning projects and collaborate with others.

In this lesson, you will get an overview of what Amazon SageMaker is, what it can do for you, and how to set up your environment so you are ready to start building and deploying models in the cloud.

What Is Amazon SageMaker?

Amazon SageMaker is a fully managed service from AWS that helps you build, train, and deploy machine learning models at scale. Instead of managing your own servers or worrying about infrastructure, SageMaker provides everything you need in one place. You can use it to prepare your data, train models using powerful cloud hardware, deploy models as APIs, and monitor their performance — all from a single platform.

SageMaker is designed to support the entire machine learning lifecycle, making it easier for you to go from an idea to a production-ready model.

Core Capabilities of SageMaker

SageMaker offers several key features that cover every stage of the machine learning process. For data preparation, you can use built-in tools to clean and transform your data or connect to data stored in AWS services like S3. For model training, SageMaker lets you choose from popular built-in algorithms, bring your own code, or use pre-built containers for frameworks like scikit-learn, TensorFlow, and PyTorch.

Once your model is trained, SageMaker makes it easy to deploy it as a REST API endpoint so you can make predictions in real time. It also provides tools for monitoring your deployed models, tracking their performance, and detecting issues like data drift. For example, you might use SageMaker to train a model on a large dataset stored in S3, deploy it as an endpoint, and then monitor its accuracy as new data comes in.
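The train-then-deploy flow described above could look roughly like the following with the SageMaker Python SDK. This is a minimal sketch, assuming version 2 of the sagemaker package. The helper name train_and_deploy, the script name train.py, the instance type, and the framework version are illustrative assumptions, not values from this lesson. Calling it requires valid AWS credentials and would create billable resources.

```python
def train_and_deploy(role_arn, train_s3_uri, entry_point="train.py"):
    """Train a scikit-learn model on data in S3 and deploy it as an endpoint.

    Hypothetical helper: the argument names and defaults are assumptions.
    """
    # Fail early on obviously wrong input before touching AWS.
    if not train_s3_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {train_s3_uri!r}")

    # Imported lazily so this module loads even where the SDK is not installed.
    from sagemaker.sklearn.estimator import SKLearn

    estimator = SKLearn(
        entry_point=entry_point,      # your training script (assumed name)
        role=role_arn,                # IAM role SageMaker uses on your behalf
        instance_type="ml.m5.large",  # assumed instance type
        instance_count=1,
        framework_version="1.2-1",    # assumed scikit-learn container version
        py_version="py3",
    )
    # Starts a managed training job that reads the "train" channel from S3.
    estimator.fit({"train": train_s3_uri})
    # Creates a real-time endpoint and returns a predictor bound to it.
    return estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```

The S3 URI check runs before any AWS call, so a mistyped path fails immediately instead of after a training job has been requested.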

How Amazon SageMaker Can Help

To help you get a better sense of what SageMaker is, please watch the official AWS overview video below. This video provides a high-level introduction to SageMaker's capabilities and shows how it can simplify your machine learning workflow.

This video will give you a visual understanding of the platform and help you see how the concepts we discuss fit together in practice.

Get Your AWS Credentials in Place

You don't need to worry about configuring AWS credentials while working in this course—everything is already set up for you in the CodeSignal environment. All the necessary permissions and credentials are handled behind the scenes, so you can focus on learning SageMaker without any extra setup.

However, if you ever want to use SageMaker from your own computer, you'll need to provide AWS credentials so that AWS can verify your identity and permissions. There are multiple ways to configure these credentials depending on your setup and preferences, and the AWS SDK automatically looks for them in several standard locations, such as environment variables and the shared credentials file (~/.aws/credentials).

If you don't already have AWS credentials, you can create them through the AWS Management Console. For detailed information about the various ways to configure your credentials and region settings, refer to the AWS documentation on configuring credentials and how to get credentials. Remember, you only need to worry about this setup if you're working on your own machine—in this course, you can skip these steps and get started right away!
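To see which of the standard credential locations are populated on your own machine, a small check like the one below can help. It is a simplified sketch that looks at only two locations (the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the default ~/.aws/credentials file). The SDK itself consults more sources than this, and the helper name find_credential_sources is an assumption made for illustration.

```python
import os
from pathlib import Path


def find_credential_sources(env=None, home=None):
    """Return the names of credential sources that appear to be configured.

    Hypothetical helper: checks only two of the locations the SDK searches.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    sources = []
    # Both variables must be set for environment credentials to be usable.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")
    # Default location of the shared credentials file.
    if (home / ".aws" / "credentials").is_file():
        sources.append("shared credentials file")
    return sources


print(find_credential_sources())
```

An empty list here does not prove that no credentials exist, since other sources (for example, a role attached to a cloud instance) are not inspected.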

Understanding AWS Permissions (IAM Basics)

When working with AWS services like SageMaker, it's important to understand that AWS uses a permission system called IAM (Identity and Access Management) to control who can do what. Think of IAM as a security system that ensures only authorized people and services can access your cloud resources.

Here's a simple analogy: imagine AWS as a large office building. Your credentials are like your ID badge that gets you in the door. But once inside, IAM permissions determine which rooms you can enter and what you can do in each room. Without the right permissions, you might be able to enter the building but not access the specific services you need.

In the context of SageMaker, here are the key permissions concepts you should know:

IAM Users: This is your identity in AWS—like your employee profile. When you log into AWS or use your access keys, you're acting as an IAM user.
IAM Roles: These are like temporary badges that services can wear. When SageMaker needs to access your files or save results, it uses a role that gives it the necessary permissions.
IAM Policies: These are the actual rules that say what's allowed. For example, a policy might say "allow reading files from this storage location" or "allow creating machine learning models."
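To make the idea of a policy concrete, here is a sketch of what "allow reading files from this storage location" can look like as an IAM policy document, built as a Python dictionary and printed as JSON. The bucket name example-training-data and the helper name read_only_bucket_policy are placeholders chosen for illustration.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build a policy document that allows reading objects from one S3 bucket."""
    return {
        "Version": "2012-10-17",  # current IAM policy language version
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Applies to every object inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


policy = read_only_bucket_policy("example-training-data")  # placeholder name
print(json.dumps(policy, indent=2))
```

A policy like this does nothing on its own. It takes effect only once it is attached to a user or a role.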

Don't worry if this seems abstract right now—it will become clearer as we work through examples in the course!

What Permissions Would You Need?

While you don't need to set up any permissions in this course (everything is already configured for you), it's helpful to understand what would be required if you were setting up SageMaker in your own AWS account. This knowledge will help you understand error messages and work effectively with cloud infrastructure teams in the future.

To use SageMaker effectively, you would need two distinct sets of permissions:

1. Permissions for Your User Account

Your IAM user needs permissions to interact with SageMaker and related services. This includes:

Creating and managing SageMaker resources (training jobs, endpoints, etc.)
Uploading data to and downloading results from S3 (AWS's storage service)
Granting SageMaker permission to act on your behalf (the iam:PassRole permission)

AWS provides a managed policy called AmazonSageMakerFullAccess that includes these permissions. While convenient for learning and development, production environments typically use more restrictive custom policies.

2. Permissions for SageMaker (The Execution Role)

When SageMaker performs tasks for you, it needs its own permissions through an IAM role. This execution role must be able to:

Access your training data in S3
Write trained models and other outputs back to S3
Send logs to CloudWatch for monitoring and debugging
Pull machine learning framework containers from ECR (AWS's container registry)
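The four needs listed above map onto IAM actions roughly as sketched below. This is an illustrative outline, not a policy recommended by this course: the bucket name is a placeholder, the helper name execution_role_permissions is an assumption, and the "*" resources for logging and ECR are deliberately broad for brevity. The trust policy is an addition not mentioned in the list. It is the part that lets the SageMaker service assume the role in the first place.

```python
import json

# Trust policy: states WHO may assume the role (here, the SageMaker service).
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def execution_role_permissions(bucket_name):
    """Build a permissions policy covering the four needs listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {  # read training data and write model artifacts
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {  # list the bucket's contents
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {  # send logs to CloudWatch
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",  # broad for brevity; narrow this in practice
            },
            {  # pull framework containers from ECR
                "Effect": "Allow",
                "Action": [
                    "ecr:GetAuthorizationToken",
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
                "Resource": "*",  # broad for brevity; narrow this in practice
            },
        ],
    }


print(json.dumps(execution_role_permissions("example-training-data"), indent=2))
```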

You'll often see this role referenced in SageMaker operations as an ARN (Amazon Resource Name) that looks like: arn:aws:iam::123456789012:role/SageMakerExecutionRole.

In this course, both sets of permissions are pre-configured in your environment. However, if you work with SageMaker outside this course and encounter errors like "AccessDenied" or "not authorized to perform: iam:PassRole", it typically indicates that one of these permission sets needs adjustment.
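An ARN like the one shown above is a colon-separated string of the form arn:partition:service:region:account-id:resource. The sketch below splits one into its parts, which can help when reading error messages. The names Arn and parse_arn are assumptions made for illustration and do not come from any AWS library.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str      # empty for global services such as IAM
    account_id: str
    resource: str


def parse_arn(arn):
    """Split an ARN string into its five named components."""
    # Limit the split so colons inside the resource part are preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


role = parse_arn("arn:aws:iam::123456789012:role/SageMakerExecutionRole")
print(role.service)     # iam
print(role.account_id)  # 123456789012
print(role.resource)    # role/SageMakerExecutionRole
```

The region field is empty in this example because IAM roles are not tied to a single region.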

The key point: AWS enforces strict permission boundaries to keep your resources secure. While this adds some initial complexity, it ensures that your data and models remain protected in the cloud.

Installing Required Python Packages

Just like with credentials, you don't need to install any packages in the CodeSignal environment—everything you need is already installed and ready to use. But if you want to set up SageMaker on your own computer, you'll need two main Python packages:

boto3: The official AWS Software Development Kit (SDK) for Python. It lets your Python code talk to any AWS service, including S3 (for storage), EC2 (for compute), and SageMaker itself. You'll use boto3 whenever you want to interact with AWS resources directly from your code.
sagemaker: The official SageMaker Python SDK. It provides higher-level, user-friendly functions specifically for building, training, and deploying machine learning models with SageMaker. The sagemaker package makes it much easier to manage machine learning workflows in SageMaker compared to using only boto3.

To install both packages on your own machine, you would run:

pip install boto3 sagemaker

In summary, you can skip these setup steps while working in this course, but now you know how to get started if you want to use SageMaker outside of CodeSignal.
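If you do set things up on your own machine, a quick way to confirm that both packages are importable is sketched below. It uses only the Python standard library, and the helper name missing_packages is an assumption made for illustration.

```python
import importlib.util


def missing_packages(names=("boto3", "sagemaker")):
    """Return the subset of package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("boto3 and sagemaker are both available.")
```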

Lesson Summary & Next Steps

In this lesson, you were introduced to Amazon SageMaker as a fully managed service that streamlines the entire machine learning workflow in the cloud. You learned what SageMaker is, what it can do, and how it fits into the machine learning lifecycle. You also discovered the basics of AWS permissions (IAM) and why they're important for giving SageMaker access to your resources. Finally, you found out how to set up your environment to use SageMaker, including configuring AWS credentials and installing the necessary Python packages if working outside this course.

Next, you'll take your first step into the cloud by uploading your data to Amazon S3—getting everything ready so you can start training models with SageMaker.

Next Lesson: Moving Your Data to the Cloud
