Welcome to Google Cloud Run, a fully managed serverless platform that runs containers directly on Google's infrastructure. With Cloud Run, you don't worry about servers, clusters, or orchestration. You simply provide a container image, and Cloud Run handles everything else, from provisioning and scaling to networking and security. It automatically scales your containers from zero to thousands based on traffic, and you pay only for the compute time you use, rounded up to the nearest 100 milliseconds.
Cloud Run is built on Knative, an open-source Kubernetes-based platform, which means your containers are portable and can run anywhere Knative is supported. By abstracting away all infrastructure, Cloud Run lets you focus entirely on your application code. This lesson introduces Cloud Run's unique serverless model, its core concepts, and the prerequisites for deploying your first service.
Cloud Run represents a fundamentally different approach to running containers. There are no clusters to create, no capacity to provision, and no infrastructure to manage. You work directly with services: each service runs your container image and responds to requests.
When you deploy to Cloud Run, you're working with a serverless model. This means:
- Automatic scaling: Your service scales automatically from zero instances (when there's no traffic) to as many instances as needed to handle load. You don't need to define scaling policies or thresholds; Cloud Run adds and removes instances based on incoming requests.
- Pay-per-use pricing: You're charged only for the CPU, memory, and networking resources your container uses while processing requests. When your service isn't handling traffic, you pay nothing for compute (though you may incur small storage costs for the container image).
- Fully managed infrastructure: Google handles all the underlying infrastructure, including networking, load balancing, SSL certificates, health checks, and logging. You never see or manage any servers.
- Request-based execution: Each container instance handles one or more concurrent requests. Cloud Run automatically routes traffic to healthy instances and creates new instances when needed.
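The pay-per-use point can be made concrete with a little arithmetic. The sketch below assumes that billable time for a request is rounded up to the next 100 millisecond increment, the granularity mentioned in the introduction. The function name billable_ms and the sample durations are made up for illustration, and actual prices are not modeled.

```python
import math

BILLING_INCREMENT_MS = 100  # granularity mentioned in the lesson introduction


def billable_ms(request_duration_ms: float) -> int:
    """Round a duration up to the next billing increment (assumed behavior)."""
    if request_duration_ms <= 0:
        return 0
    increments = math.ceil(request_duration_ms / BILLING_INCREMENT_MS)
    return increments * BILLING_INCREMENT_MS


if __name__ == "__main__":
    # Illustrative durations only; they are not measurements.
    for duration in (42, 100, 101, 350):
        print(duration, "ms is billed as", billable_ms(duration), "ms")
```

Under this assumption a 42 ms request and a 100 ms request cost the same, which is one reason short handlers fit the model well.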
Cloud Run services are stateless by design. This is a critical architectural principle: each request must be independent and self-contained. You cannot rely on data stored in a container's memory or local filesystem between requests, as any instance can be terminated at any time. For persistent data, you must use external services like Cloud Storage, Cloud SQL, or Firestore. This statelessness is what enables Cloud Run to scale rapidly and efficiently.
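To illustrate the statelessness rule, here is a minimal sketch of a request handler that keeps its counter in an external store rather than in process memory. CounterStore, InMemoryStore, and handle_visit are hypothetical names. In a real service the store would be backed by something like Firestore or Cloud SQL; the in-memory class only stands in for it so the example runs on its own.

```python
from typing import Protocol


class CounterStore(Protocol):
    """Hypothetical interface for an external, persistent store."""

    def increment(self, key: str) -> int: ...


class InMemoryStore:
    """Stand-in for an external service, used only to make the sketch runnable."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def handle_visit(store: CounterStore, page: str) -> str:
    """Handle one request; all state lives in the store, none in the instance."""
    count = store.increment(page)
    return f"{page} has been visited {count} time(s)"


if __name__ == "__main__":
    shared_store = InMemoryStore()
    # Two calls model two requests that may land on different instances.
    print(handle_visit(shared_store, "/home"))
    print(handle_visit(shared_store, "/home"))
```

Because the handler receives the store as an argument, any instance can serve any request, and terminating an instance loses nothing.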
The platform provides two execution models: Cloud Run (fully managed), where Google handles everything, and Cloud Run for Anthos, which runs on your own Google Kubernetes Engine (GKE) clusters. The fully managed version offers the simplest experience, while Cloud Run for Anthos provides more control over the underlying network and compute environment, making it suitable for workloads with specific compliance or networking requirements. In this course, we'll focus on the fully managed version.
Before deploying services to Cloud Run, you need to understand the naming conventions and requirements. Service names in Cloud Run must follow specific rules to ensure compatibility with DNS and Google Cloud's infrastructure.
Here are the naming rules for Cloud Run services:
- Service names must start with a letter.
- Names can contain lowercase letters, numbers, and hyphens.
- Names cannot end with a hyphen.
- Maximum length is 63 characters.
- Each service name must be unique within your Google Cloud project and region.
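The naming rules above can be checked locally before you deploy. This is a minimal sketch, not an official validator: is_valid_service_name is a hypothetical helper, "a letter" is read as "a lowercase letter" because only lowercase letters are allowed in names, and uniqueness within a project and region cannot be checked offline.

```python
import re

MAX_NAME_LENGTH = 63  # limit as stated in the rules above

# A lowercase letter first, then lowercase letters, digits, or hyphens;
# the last character must not be a hyphen.
_NAME_PATTERN = re.compile(r"[a-z]([a-z0-9-]*[a-z0-9])?")


def is_valid_service_name(name: str) -> bool:
    """Check a candidate name against the listed rules (uniqueness is not checked)."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-first-service", "1service", "service-", "My-Service"):
        print(candidate, is_valid_service_name(candidate))
```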
For this lesson, we'll use my-first-service as our service name. In real projects, choose names that clearly describe the service's purpose, such as api-gateway, user-service, or payment-processor. Good naming helps you manage multiple services and understand your architecture at a glance.
Additionally, you'll need:
- A Google Cloud project with billing enabled.
- The gcloud CLI installed and configured.
- Docker installed locally (for building container images in later lessons).
- Appropriate IAM permissions to deploy Cloud Run services.
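You can verify your gcloud setup by checking your current project with gcloud config get-value project.
If you need to set a project, use gcloud config set project PROJECT_ID, replacing PROJECT_ID with your own project ID.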
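You should also ensure you're working in a specific region. Cloud Run is available in multiple regions worldwide. This course uses a single region throughout, but you can choose one closer to your users and set it as your default with gcloud config set run/region REGION, replacing REGION with the region name.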
Cloud Run's serverless model means you think about your application differently than with traditional container platforms. Instead of managing clusters, capacity, and orchestration, you focus on three key concepts:
Services are the primary resource in Cloud Run. A service represents your containerized application and includes the container image, configuration, and routing rules. Each service gets a unique HTTPS URL that you can use to access your application. Services are long-lived resources that persist until you delete them.
Revisions are immutable snapshots of your service configuration. Every time you deploy a new version of your container or change configuration settings, Cloud Run creates a new revision. This immutability is a core principle that ensures deployments are predictable and repeatable. It allows you to roll back to previous versions instantly or split traffic between multiple revisions for gradual rollouts and A/B testing.
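The immutability of revisions can be modeled with a frozen data class. This is only an illustration of the idea, not how Cloud Run stores revisions: the Revision fields, the deploy helper, the revision names, and the image references are all hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Revision:
    """Hypothetical model of a revision: a frozen snapshot of image and settings."""

    name: str
    image: str
    memory: str


def deploy(previous: Revision, new_name: str, **changes: str) -> Revision:
    """Return a new revision with the changes applied; the previous one is untouched."""
    return replace(previous, name=new_name, **changes)


if __name__ == "__main__":
    rev1 = Revision(name="rev-1", image="example/app:v1", memory="512Mi")
    rev2 = deploy(rev1, "rev-2", image="example/app:v2")
    print(rev1)  # still describes v1, so a rollback only needs to route traffic back to it
    print(rev2)
    try:
        rev1.image = "example/app:v3"  # type: ignore[misc]
    except FrozenInstanceError:
        print("revisions cannot be modified in place")
```

Because every change yields a new object and old ones never change, rolling back is a routing decision rather than a rebuild.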
Instances are the running containers that handle requests. Cloud Run automatically creates and destroys instances based on traffic. When a request arrives and no instances are running, Cloud Run starts a new one, a process known as a "cold start." This introduces a small amount of latency for the first request. Subsequent requests are served by "warm" instances. Each instance can handle multiple concurrent requests (you configure the concurrency limit), which is a key factor in how efficiently your service scales. When traffic increases, Cloud Run creates more instances. When traffic decreases, instances are terminated. You never manage instances directly — Cloud Run handles this automatically.
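The effect of the concurrency limit on scaling can be approximated with simple division. The sketch below is a deliberate simplification: estimate_instances is a hypothetical helper that only computes how many instances are needed so that none exceeds its concurrency limit. The real autoscaler is more involved and is not modeled here.

```python
import math


def estimate_instances(concurrent_requests: int, concurrency_limit: int) -> int:
    """Minimum instances needed so no instance exceeds its concurrency limit."""
    if concurrency_limit < 1:
        raise ValueError("concurrency_limit must be at least 1")
    if concurrent_requests <= 0:
        return 0  # scale to zero when there is no traffic
    return math.ceil(concurrent_requests / concurrency_limit)


if __name__ == "__main__":
    # Illustrative numbers: the same load with two different concurrency limits.
    print(estimate_instances(250, 80))
    print(estimate_instances(250, 1))
```

A higher concurrency limit means fewer instances for the same load, provided each instance can actually handle that many requests at once.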
Put together, these three concepts form a hierarchy: a service has one or more revisions, and each revision is served by zero or more instances.
This model eliminates many operational concerns. You don't worry about:
- How many instances to run.
- When to scale up or down.
Understanding the deployment workflow helps you see how Cloud Run fits into your development process. The typical workflow follows these steps:
- Build your container: Package your application into a container image using Docker or Cloud Build. Your container must listen on a port (defined by the PORT environment variable) and respond to HTTP requests.
- Push to a registry: Store your container image in a registry like Google Container Registry (GCR) or Artifact Registry. The registry acts as a centralized, versioned repository for your images, decoupling the build process from the deployment process. Cloud Run pulls images from these registries when deploying.
- Deploy to Cloud Run: Use the gcloud run deploy command to create or update a service. You specify the container image, service name, and any configuration options. This command is declarative; it describes the desired state of your service, and Cloud Run's control plane works to achieve that state.
- Access your service: Cloud Run provides an HTTPS URL immediately. Your service is accessible worldwide with automatic SSL and load balancing.
- Monitor and update: View logs, metrics, and traces in the Google Cloud Console. Deploy updates by running the deploy command again with a new container image or configuration.
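The first step of the workflow requires a container that listens on the port given by the PORT environment variable. Here is a minimal sketch of such an application using only the Python standard library. The names resolve_port, HelloHandler, and build_server are hypothetical, and the fallback to port 8080 when PORT is unset is an assumption made for local runs.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

DEFAULT_PORT = 8080  # assumed fallback for local runs when PORT is not set


def resolve_port(environ: Mapping[str, str] = os.environ) -> int:
    """Read the listening port from the PORT environment variable."""
    return int(environ.get("PORT", DEFAULT_PORT))


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text body."""

    def do_GET(self) -> None:
        body = b"Hello from my-first-service\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def build_server(port: int) -> HTTPServer:
    """Bind on all interfaces so traffic from outside the container can reach it."""
    return HTTPServer(("0.0.0.0", port), HelloHandler)
```

As the container's entrypoint you would call build_server(resolve_port()).serve_forever(). Reading the port from the environment instead of hard-coding it lets the platform decide where the container listens.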
Each deployment creates a new revision, and Cloud Run automatically routes traffic to the latest revision by default. You can also configure traffic splitting to gradually roll out changes or run multiple versions simultaneously.
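Traffic splitting can be pictured as a weighted choice between revisions. The sketch below is a conceptual model only, not Cloud Run's routing implementation: route is a hypothetical function, the revision names stable and canary are invented, and the percentages are assumed to be whole numbers that sum to 100.

```python
import random


def route(split: dict[str, int], roll: float) -> str:
    """Pick a revision for one request.

    split maps revision names to integer percentages that sum to 100;
    roll is a number in the half-open range [0, 100).
    """
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    if not 0 <= roll < 100:
        raise ValueError("roll must be in [0, 100)")
    upper = 0
    for revision, percent in split.items():
        upper += percent
        if roll < upper:
            return revision
    raise AssertionError("unreachable: percentages sum to 100")


if __name__ == "__main__":
    split = {"stable": 90, "canary": 10}  # hypothetical revision names
    rng = random.Random(0)
    picks = [route(split, rng.random() * 100) for _ in range(1000)]
    print(picks.count("canary"), "of 1000 simulated requests went to the canary")
```

Shifting the percentages step by step, for example from 10 to 50 to 100, models a gradual rollout; setting the older revision back to 100 models a rollback.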
The platform integrates with Google Cloud's identity and access management, allowing you to control who can invoke your services. You can make services publicly accessible or restrict them to authenticated users and service accounts.
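For a service restricted to authenticated callers, the caller is expected to present an identity token as a bearer credential in the Authorization header. The sketch below only builds such a request and does not send it. build_authenticated_request is a hypothetical helper, the URL and token are placeholders, and obtaining a real token is outside the scope of the example.

```python
import urllib.request


def build_authenticated_request(service_url: str, id_token: str) -> urllib.request.Request:
    """Attach an identity token as a bearer credential to a GET request."""
    request = urllib.request.Request(service_url, method="GET")
    request.add_header("Authorization", f"Bearer {id_token}")
    return request


if __name__ == "__main__":
    # Placeholder values; a real call needs your service URL and a valid token.
    req = build_authenticated_request("https://example.invalid/", "PLACEHOLDER_TOKEN")
    print(req.get_method(), req.full_url)
```

A publicly accessible service needs no such header; the request can be sent as-is.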
In this lesson, you've learned that Google Cloud Run is a fully managed serverless platform that runs containers without requiring you to manage any infrastructure. You work directly with services that automatically scale based on traffic and charge only for actual usage. We covered Cloud Run's core concepts: services (your deployed applications), revisions (immutable snapshots of configuration), and instances (automatically managed containers that handle requests).
You've also seen how Cloud Run's serverless model changes the way you think about containers. By handling all operational concerns automatically, it allows you to focus entirely on building and deploying your application. In the upcoming practice exercises, you'll deploy your first Cloud Run services and begin working hands-on with the platform. Later lessons will cover building custom images, configuring resources, and managing production deployments.
