Welcome to your first lesson on Google Cloud databases! In this course, you'll learn how to work with various database services offered by Google Cloud Platform (GCP), starting with one of the most popular and flexible options: Cloud Firestore.
Cloud Firestore is a fully managed NoSQL document database provided by GCP. Unlike traditional relational databases such as MySQL or PostgreSQL, which organize data into tables with fixed schemas, NoSQL databases like Firestore offer a more flexible, schema-less approach. You don't need to define every field in advance, and you can store different types of data in the same collection. This flexibility makes Firestore especially well-suited for applications that need to scale quickly or handle varying data structures.
Firestore is serverless, meaning you don't have to manage servers, infrastructure, or database maintenance. Google Cloud automatically handles scaling, replication, and high availability for you. Firestore provides fast, consistent performance for both small and large workloads, making it ideal for applications that require real-time data access and updates.
In this lesson, you'll learn the fundamental building blocks of Firestore by creating two different types of collections. First, you'll create a simple Users collection to demonstrate the basic structure of a Firestore collection. Then, you'll build a more sophisticated UserPosts collection to show how to organize related data efficiently. By the end of this lesson, you'll understand how to create collections, choose appropriate document IDs, and add data to your Firestore database.
Let's start by creating your first Firestore collection. We'll use the google-cloud-firestore library, which is the official Python SDK for Firestore. If you were working on your own computer, you would need to install it using pip install google-cloud-firestore, but on CodeSignal, all necessary libraries come pre-installed, so you can start coding right away.
Here's the code to create a simple Users collection and add documents to it:
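A minimal sketch of the collection-creation step might look like the following. The function name `create_users_collection` and the `db` parameter are assumptions made for illustration; `db` is expected to be a `firestore.Client` instance.

```python
import time


def create_users_collection(db):
    """Return a reference to a uniquely named Users collection.

    `db` is assumed to be a google.cloud.firestore.Client instance.
    """
    # Append a Unix timestamp so each run uses a fresh collection name.
    collection_name = f"Users_{int(time.time())}"
    # This only builds a local reference; the collection appears in the
    # database once the first document is written to it.
    return db.collection(collection_name)


# Usage (requires google-cloud-firestore and valid credentials):
# from google.cloud import firestore
# db = firestore.Client()
# users_ref = create_users_collection(db)
# print(users_ref.id)
```

Passing the client in as a parameter, rather than creating it inside the function, keeps the sketch easy to test and reuse.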
This function creates a reference to a new collection named Users_<timestamp>, ensuring a unique collection name each time you run the code. In Firestore, you don't need to explicitly create a collection before adding documents — collections are created automatically when you add your first document.
To add documents (users) to this collection, you can use the following function:
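One way to write such a function is sketched below. The three users and their field names (`name`, `email`) are hypothetical sample data, and `users_ref` is assumed to be a Firestore collection reference.

```python
def add_users(users_ref):
    """Write three sample users to the collection, keyed by user_id.

    `users_ref` is assumed to be a Firestore CollectionReference.
    Returns the list of document IDs that were written.
    """
    # Hypothetical sample data; replace with your own users.
    users = [
        {"user_id": "user_001", "name": "Alice", "email": "alice@example.com"},
        {"user_id": "user_002", "name": "Bob", "email": "bob@example.com"},
        {"user_id": "user_003", "name": "Charlie", "email": "charlie@example.com"},
    ]
    written_ids = []
    for user in users:
        # document(<id>) selects the document ID explicitly;
        # set() creates the document, or overwrites it if it already exists.
        users_ref.document(user["user_id"]).set(user)
        written_ids.append(user["user_id"])
    return written_ids
```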
In this function, we add three user documents to the collection, using each user's user_id as the document ID. Firestore allows you to specify your own document IDs, which makes it easy to retrieve documents by a known key.
Before we dive deeper into Firestore's data model, let's look at how Firestore handles data types. Firestore supports a variety of types, and knowing how Python values map to them helps you predict what is actually stored when you write a document.
Firestore supports several data types, including strings, numbers (integers and floating-point), booleans, timestamps, arrays (lists), maps (dictionaries), and more. When you use the Firestore Python SDK, most standard Python types are automatically converted to their Firestore equivalents:
- Strings: Python str maps to Firestore string.
- Integers: Python int maps to Firestore integer.
- Floating-point numbers: Python float maps to Firestore double.
- Booleans: Python bool maps to Firestore boolean.
- Lists: Python list maps to Firestore array.
- Dictionaries: Python dict maps to Firestore map.
- Timestamps: Python datetime.datetime maps to Firestore timestamp.
Unlike some other NoSQL databases, Firestore does not require you to use a special type for decimal numbers — Python's float type is supported directly. However, keep in mind that floating-point numbers can have precision issues, so if you need exact decimal representation (such as for currency), you may want to store values as integers (e.g., the number of cents) or as strings.
For example, to store a product with a price:
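The sketch below builds a product document that stores its price as integer cents and touches each of the type mappings listed above. The helper names, the field names, and the `Products` collection in the usage comment are all assumptions for illustration; `price_to_cents` also assumes prices have at most two decimal places.

```python
from datetime import datetime, timezone
from decimal import Decimal


def price_to_cents(price):
    """Convert a decimal price string such as "19.99" to integer cents."""
    # Decimal avoids the rounding error that float arithmetic can introduce.
    return int(Decimal(price) * 100)


def build_product(name, price, tags):
    """Build a document dict that uses only directly supported Python types."""
    return {
        "name": name,                                    # str      -> string
        "price_cents": price_to_cents(price),            # int      -> integer
        "weight_kg": 0.35,                               # float    -> double
        "in_stock": True,                                # bool     -> boolean
        "tags": list(tags),                              # list     -> array
        "dimensions": {"width_cm": 10, "height_cm": 4},  # dict     -> map
        "created_at": datetime.now(timezone.utc),        # datetime -> timestamp
    }


product = build_product("Notebook", "19.99", ["stationery", "paper"])
print(product["price_cents"])  # 1999

# Writing it (requires google-cloud-firestore and valid credentials):
# from google.cloud import firestore
# db = firestore.Client()
# db.collection("Products").document("prod_001").set(product)
```

Storing 1999 instead of 19.99 means that sums and comparisons on the price stay exact.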
Every document in Firestore is uniquely identified by a document ID within its collection. The document ID is the most fundamental concept in Firestore for organizing and accessing your data. When you store a document, you can either let Firestore generate a random unique ID, or you can specify your own (such as a user ID or another unique identifier).
Choosing a good document ID is important for efficient lookups and data organization. For example, using user_id as the document ID in a Users collection allows you to quickly retrieve a user's document by their ID.
Let's see how this works in practice:
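A lookup by document ID could be sketched as follows. The function name `get_user_by_id` is an assumption, and `users_ref` is again assumed to be a Firestore collection reference.

```python
def get_user_by_id(users_ref, user_id):
    """Return the user's document as a dict, or None if it does not exist.

    `users_ref` is assumed to be a Firestore CollectionReference whose
    document IDs are user IDs.
    """
    # get() fetches a snapshot of the document with this exact ID.
    snapshot = users_ref.document(user_id).get()
    if snapshot.exists:
        return snapshot.to_dict()
    return None


# Usage (requires google-cloud-firestore and valid credentials):
# user = get_user_by_id(users_ref, "user_001")
# if user is None:
#     print("No such user")
```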
This function retrieves a user document by its user_id. If the document exists, it returns the document data as a dictionary; otherwise, it returns None. Firestore's flexible schema means you can add additional fields to documents at any time, and each document in a collection can have different fields if needed.
While using document IDs works well for simple lookups, many applications need to organize related items together and query them in a specific order. In Firestore, you can model related data using subcollections or by storing related documents in the same collection with fields that allow for efficient queries and ordering.
For example, let's create a UserPosts collection where each post document contains a user_id, a post_date, and a title. We can then query for all posts by a specific user and order them by date.
Here's how to create and populate such a collection:
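A possible version is sketched below. The function name, the sample posts, and the choice to pass the `post_date` value in as a parameter are assumptions made so the sketch stays importable without the SDK; in real use you would pass `firestore.SERVER_TIMESTAMP`, as the usage comment shows.

```python
def add_user_posts(posts_ref, post_date):
    """Add sample posts to the collection with auto-generated document IDs.

    `posts_ref` is assumed to be a Firestore CollectionReference.
    `post_date` is the value stored in each post's post_date field; pass
    firestore.SERVER_TIMESTAMP to have the server fill it in at write time.
    """
    # Hypothetical sample data.
    posts = [
        {"user_id": "user_001", "title": "Getting Started with Firestore"},
        {"user_id": "user_001", "title": "Choosing Document IDs"},
        {"user_id": "user_002", "title": "Modeling Related Data"},
    ]
    for post in posts:
        # add() lets Firestore generate a random, unique document ID.
        posts_ref.add({**post, "post_date": post_date})
    return len(posts)


# Usage (requires google-cloud-firestore and valid credentials):
# import time
# from google.cloud import firestore
# db = firestore.Client()
# posts_ref = db.collection(f"UserPosts_{int(time.time())}")
# add_user_posts(posts_ref, firestore.SERVER_TIMESTAMP)
```

Auto-generated IDs suit posts because a user can have many of them, so the user ID alone cannot serve as the document ID.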
Notice that we're using firestore.SERVER_TIMESTAMP for the post_date field. When storing timestamps in Firestore, you have two options: you can use Python's datetime.now() to generate a timestamp on the client side, or you can use firestore.SERVER_TIMESTAMP to let Firestore generate the timestamp on the server. Using firestore.SERVER_TIMESTAMP is generally the better choice because every timestamp then comes from the same clock source (Google's servers), which avoids problems with client-side clock skew or time zone differences. This matters most when you order documents by time or compare timestamps across documents. When you use firestore.SERVER_TIMESTAMP in a document, Firestore automatically replaces it with the actual server timestamp when the document is written.
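With post_date populated, the query described earlier (all posts by one user, ordered by date) might be sketched like this. The function name is an assumption. Two caveats apply: recent SDK versions prefer the keyword form `where(filter=FieldFilter(...))` and may warn about the positional form used here, and combining an equality filter with ordering on a different field typically requires a composite index in Firestore.

```python
def get_posts_for_user(posts_ref, user_id):
    """Return one user's posts as dicts, ordered by post_date (oldest first).

    `posts_ref` is assumed to be a Firestore CollectionReference whose
    documents contain user_id and post_date fields.
    """
    # Filter on user_id, then sort the matching posts by their timestamp.
    query = posts_ref.where("user_id", "==", user_id).order_by("post_date")
    # stream() yields document snapshots; to_dict() extracts the fields.
    return [snapshot.to_dict() for snapshot in query.stream()]


# Usage (requires google-cloud-firestore and valid credentials):
# for post in get_posts_for_user(posts_ref, "user_001"):
#     print(post["post_date"], post["title"])
```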
Once you've created collections and added data, you may want to check on your collections' status and properties. While Firestore is schema-less and does not provide table metadata in the same way as some other databases, you can still retrieve useful information about your collections and documents.
Here's a function that retrieves basic information about a collection:
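Such a function could look roughly like the sketch below. The function name and the keys of the returned dictionary are assumptions for illustration.

```python
def get_collection_info(collection_ref):
    """Return the collection's name, document count, and one sample document.

    `collection_ref` is assumed to be a Firestore CollectionReference.
    """
    count = 0
    sample = None
    # stream() reads every document in the collection, one snapshot at a time.
    for snapshot in collection_ref.stream():
        if sample is None:
            # Keep the first document seen as the sample.
            sample = snapshot.to_dict()
        count += 1
    return {
        "collection_name": collection_ref.id,
        "document_count": count,
        "sample_document": sample,
    }
```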
This function streams all documents in the collection, counts them, and returns the collection name and a sample document (if any exist). Because it reads every document, this approach is only practical for small collections. Firestore collections are created automatically when you add documents, and you can add or remove documents at any time.
Let's see how all these pieces work together in a complete example:
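The whole workflow can be condensed into a single sketch. Everything here is illustrative: the `run_demo` name, the sample users and posts, and the returned summary dictionary are assumptions, and the steps are inlined so the block stands on its own.

```python
import time


def run_demo(db, post_date):
    """Create both collections, add sample data, and return document counts.

    `db` is assumed to be a google.cloud.firestore.Client; `post_date` is
    the value written to each post (for example firestore.SERVER_TIMESTAMP).
    """
    suffix = int(time.time())
    users_ref = db.collection(f"Users_{suffix}")
    posts_ref = db.collection(f"UserPosts_{suffix}")

    # Step 1: add users, using user_id as the document ID (hypothetical data).
    for user_id, name in [("user_001", "Alice"), ("user_002", "Bob")]:
        users_ref.document(user_id).set({"user_id": user_id, "name": name})

    # Step 2: add posts with auto-generated document IDs.
    for user_id, title in [("user_001", "First Post"), ("user_002", "Hello")]:
        posts_ref.add({"user_id": user_id, "title": title, "post_date": post_date})

    # Step 3: verify the result by counting the documents in each collection.
    summary = {}
    for ref in (users_ref, posts_ref):
        summary[ref.id] = sum(1 for _ in ref.stream())
    return summary


# Usage (requires google-cloud-firestore and valid credentials):
# from google.cloud import firestore
# db = firestore.Client()
# print(run_demo(db, firestore.SERVER_TIMESTAMP))
```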
When you run this code, you'll see output similar to:
The timestamps in the collection names will be different each time you run the code, reflecting the exact moment when each collection was created. This main block demonstrates the complete workflow: create a collection, add data, and verify the results.
You've now learned the fundamental concepts of Google Cloud Firestore and created your first collections. Let's recap the key points. Every Firestore document is uniquely identified by a document ID within its collection. Choosing meaningful document IDs, such as user IDs, allows for efficient lookups and organization.
Firestore supports a wide range of data types, and most standard Python types are supported directly. You can store strings, numbers, booleans, lists, dictionaries, and more, without needing to define a schema in advance.
To model related or ordered data, you can use fields such as user_id and post_date and take advantage of Firestore's powerful querying and ordering capabilities. Subcollections and nested data structures are also available for more complex data models.
You've seen two collection patterns in this lesson. The first pattern uses a simple collection with user documents identified by user IDs. The second pattern uses a collection of posts, where each post includes a user ID and a date, allowing you to query and order posts efficiently.
In the upcoming practice exercises, you'll create your own collections using these patterns. You'll experiment with different data models and queries to see how they affect your application's performance and flexibility. Understanding how to structure your data and choose document IDs is a skill that develops with practice, and these exercises will help you build that intuition.
