Getting Started with Firestore

Introduction to Google Cloud Firestore

Welcome to your first lesson on Google Cloud databases! In this course, you'll learn how to work with various database services offered by Google Cloud Platform (GCP), starting with one of the most popular and flexible options: Cloud Firestore.

Cloud Firestore is a fully managed NoSQL document database provided by GCP. Unlike traditional relational databases such as MySQL or PostgreSQL, which organize data into tables with fixed schemas, NoSQL databases like Firestore offer a more flexible, schema-less approach. You don't need to define every field in advance, and you can store different types of data in the same collection. This flexibility makes Firestore especially well-suited for applications that need to scale quickly or handle varying data structures.

Firestore is serverless, meaning you don't have to manage servers, infrastructure, or database maintenance. Google Cloud automatically handles scaling, replication, and high availability for you. Firestore provides fast, consistent performance for both small and large workloads, making it ideal for applications that require real-time data access and updates.

In this lesson, you'll learn the fundamental building blocks of Firestore by creating two different types of collections. First, you'll create a simple Users collection to demonstrate the basic structure of a Firestore collection. Then, you'll build a more sophisticated UserPosts collection to show how to organize related data efficiently. By the end of this lesson, you'll understand how to create collections, choose appropriate document IDs, and add data to your Firestore database.

Your First Firestore Collection

Let's start by creating your first Firestore collection. We'll use the google-cloud-firestore library, which is the official Python SDK for Firestore. If you were working on your own computer, you would need to install it using pip install google-cloud-firestore, but on CodeSignal, all necessary libraries come pre-installed, so you can start coding right away.

Here's the code that creates a reference to a simple Users collection:

    from google.cloud import firestore
    import time

    def create_simple_collection():
        """Return a reference to a uniquely named Users collection"""
        db = firestore.Client()
        collection_name = f"Users_{int(time.time())}"
        users_ref = db.collection(collection_name)
        return users_ref

This function creates a reference to a new collection named Users_<timestamp>, so each run gets a distinct collection name (provided the runs are at least a second apart). In Firestore, you don't need to explicitly create a collection before adding documents; a collection is created automatically when you add its first document.
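The timestamp in the collection name has one-second resolution, so two runs started within the same second would share a name. The following is a minimal sketch of one way to reduce that risk by appending a short random suffix. The helper unique_collection_name is hypothetical: it is not part of the lesson's code or of the Firestore SDK.

```python
import time
import uuid


def unique_collection_name(prefix: str) -> str:
    """Build a name like 'Users_1699564823_a1b2c3d4'.

    Hypothetical helper for illustration: the timestamp shows roughly when
    the collection was created, and the random suffix makes a collision
    between two runs in the same second very unlikely.
    """
    timestamp = int(time.time())
    suffix = uuid.uuid4().hex[:8]  # 8 random hexadecimal characters
    return f"{prefix}_{timestamp}_{suffix}"


name = unique_collection_name("Users")
print(name)
```

The resulting string could be passed to db.collection() in place of the f-string used in create_simple_collection.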
To add documents (users) to this collection, you can use the following function:

    def add_sample_users(users_ref):
        """Add sample users to the Firestore collection"""
        users = [
            {'user_id': 'user_001', 'name': 'Alice', 'email': 'alice@example.com'},
            {'user_id': 'user_002', 'name': 'Bob', 'email': 'bob@example.com'},
            {'user_id': 'user_003', 'name': 'Carol', 'email': 'carol@example.com'}
        ]
        for user in users:
            # Use user_id as the document ID for easy lookup
            users_ref.document(user['user_id']).set(user)
        return len(users)

In this function, we add three user documents to the collection, using each user's user_id as the document ID. Firestore allows you to specify your own document IDs, which makes it easy to retrieve documents by a known key. Note that set() creates the document if it doesn't exist and overwrites it if it does.

Firestore Data Types

Before we dive deeper into Firestore's data model, let's discuss how Firestore handles different data types. Firestore supports several data types, including strings, numbers (integers and floating-point), booleans, timestamps, arrays (lists), maps (dictionaries), and more. When you use the Firestore Python SDK, most standard Python types are automatically converted to their Firestore equivalents:

- Strings: Python str maps to Firestore string.
- Integers: Python int maps to Firestore integer.
- Floating-point numbers: Python float maps to Firestore double.
- Booleans: Python bool maps to Firestore boolean.
- Lists: Python list maps to Firestore array.
- Dictionaries: Python dict maps to Firestore map.
- Timestamps: Python datetime.datetime maps to Firestore timestamp.

Unlike some other NoSQL databases, Firestore does not require you to use a special type for decimal numbers: Python's float type is supported directly. However, keep in mind that floating-point numbers can have precision issues, so if you need exact decimal representation (such as for currency), you may want to store values as integers (e.g., the number of cents) or as strings.

For example, to store a product with a price:

    product = {
        'product_id': 'PROD_001',
        'name': 'Laptop',
        'price': 999.99  # float is supported
    }

Firestore will store the price as a double-precision floating-point number. You can also store arrays and nested objects.
For instance, if you want to store multiple product categories and detailed specifications:

    product_with_details = {
        'product_id': 'PROD_002',
        'name': 'Gaming Laptop',
        'price': 1499.99,
        'categories': ['Electronics', 'Computers', 'Gaming'],  # list/array
        'specs': {  # nested dictionary/map
            'cpu': 'Intel i7',
            'ram': '16GB',
            'storage': '512GB SSD'
        }
    }

This demonstrates how Firestore handles complex, nested data structures naturally using Python's native list and dictionary types.
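The advice about storing currency as an integer number of cents can be sketched in plain Python. The helpers to_cents and format_cents and the field name price_cents are assumptions made for this illustration; they do not come from the lesson's code or from the Firestore SDK.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_cents(price: str) -> int:
    """Convert a decimal price string such as '999.99' to integer cents."""
    cents = (Decimal(price) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


def format_cents(cents: int) -> str:
    """Render integer cents as a two-decimal string for display."""
    return f"{Decimal(cents) / 100:.2f}"


product = {
    "product_id": "PROD_001",
    "name": "Laptop",
    "price_cents": to_cents("999.99"),  # hypothetical field, stored as an integer
}

print(product["price_cents"])                # 99999
print(format_cents(product["price_cents"]))  # 999.99
print(0.1 + 0.2 == 0.3)                      # False: the float precision issue
```

Because price_cents is a plain Python int, it would be stored as a Firestore integer, and arithmetic on it stays exact.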

Document IDs: The Foundation of Firestore

Every document in Firestore is uniquely identified by a document ID within its collection. The document ID is the most fundamental concept in Firestore for organizing and accessing your data. When you store a document, you can either let Firestore generate a random unique ID, or you can specify your own (such as a user ID or another unique identifier).

Choosing a good document ID is important for efficient lookups and data organization. For example, using user_id as the document ID in a Users collection allows you to quickly retrieve a user's document by their ID. Let's see how this works in practice:

    def get_user(users_ref, user_id):
        """Retrieve a user document by user_id"""
        doc = users_ref.document(user_id).get()
        if doc.exists:
            return doc.to_dict()
        else:
            return None

This function retrieves a user document by its user_id. If the document exists, it returns the document data as a dictionary; otherwise, it returns None. Firestore's flexible schema means you can add additional fields to documents at any time, and each document in a collection can have different fields if needed.
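When you supply your own document IDs, it can help to check them before writing. The sketch below encodes the constraints on document IDs as they are commonly listed in Firestore's documentation (no forward slash, not exactly "." or "..", not of the form __.*__, at most 1,500 bytes of UTF-8). Treat these rules as an assumption to verify against the current documentation; is_valid_document_id is a hypothetical helper, not an SDK function.

```python
import re

MAX_ID_BYTES = 1500  # assumed limit; confirm against current Firestore docs
RESERVED_PATTERN = re.compile(r"__.*__")  # IDs of this form are reserved


def is_valid_document_id(doc_id: str) -> bool:
    """Return True if doc_id passes the assumed Firestore ID constraints."""
    if not doc_id:
        return False  # an empty ID cannot form a document path
    if "/" in doc_id:
        return False  # slashes separate path segments
    if doc_id in (".", ".."):
        return False
    if RESERVED_PATTERN.fullmatch(doc_id):
        return False
    try:
        encoded = doc_id.encode("utf-8")
    except UnicodeEncodeError:
        return False  # not representable as valid UTF-8
    return len(encoded) <= MAX_ID_BYTES


for candidate in ["user_001", "users/001", "..", "__meta__"]:
    print(candidate, is_valid_document_id(candidate))
```

A check like this could run before users_ref.document(user_id) so that a malformed ID is caught in your own code rather than surfacing as an error from the client library.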

Modeling Related and Ordered Data in Firestore

While using document IDs works well for simple lookups, many applications need to organize related items together and query them in a specific order. In Firestore, you can model related data using subcollections or by storing related documents in the same collection with fields that allow for efficient queries and ordering.

For example, let's create a UserPosts collection where each post document contains a user_id, a post_date, and a title. We can then query for all posts by a specific user and order them by date. Here's how to create and populate such a collection:

    def create_posts_collection():
        """Create a Firestore collection for user posts"""
        db = firestore.Client()
        collection_name = f"UserPosts_{int(time.time())}"
        posts_ref = db.collection(collection_name)
        return posts_ref

    def add_sample_posts(posts_ref):
        """Add sample posts to the Firestore collection"""
        posts = [
            {'user_id': 'user_001', 'post_date': firestore.SERVER_TIMESTAMP, 'title': 'First Post'},
            {'user_id': 'user_001', 'post_date': firestore.SERVER_TIMESTAMP, 'title': 'Second Post'},
            {'user_id': 'user_001', 'post_date': firestore.SERVER_TIMESTAMP, 'title': 'Third Post'},
            {'user_id': 'user_002', 'post_date': firestore.SERVER_TIMESTAMP, 'title': 'Bob Post'}
        ]
        for post in posts:
            # Let Firestore generate a unique document ID for each post
            posts_ref.add(post)
        return len(posts)

Notice that we're using firestore.SERVER_TIMESTAMP for the post_date field. When storing timestamps in Firestore, you have two options: you can use Python's datetime.datetime.now() to generate a timestamp on the client side, or you can use firestore.SERVER_TIMESTAMP to let Firestore generate the timestamp on the server. Using firestore.SERVER_TIMESTAMP is generally the better choice because it ensures all timestamps are generated by the same clock source (Google's servers), avoiding potential issues with client-side clock skew or time zone differences. Server timestamps are particularly important for ordering operations and for scenarios where you need consistent, reliable time measurements across all documents. When you use firestore.SERVER_TIMESTAMP in a document, Firestore replaces it with the actual server timestamp when the document is written.

To retrieve all posts by a specific user, ordered by post_date, you can use a query:

    def get_posts_by_user(posts_ref, user_id):
        """Query posts by user_id, ordered by post_date"""
        query = (
            posts_ref
            .where(filter=firestore.FieldFilter('user_id', '==', user_id))
            .order_by('post_date')
        )
        return [doc.to_dict() for doc in query.stream()]

This approach retrieves all posts for a user, sorted by date. Firestore automatically creates single-field indexes, which cover simple queries. A query like this one, which combines an equality filter on one field with ordering on another, generally needs a composite index as well; if that index is missing, the query fails with an error message that includes a link for creating it.
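To make the semantics of the query concrete, here is a plain-Python sketch of the same filter-then-sort logic applied to an in-memory list. It only illustrates which documents come back and in what order; it says nothing about how Firestore executes queries. The sample dates are invented for the example, and posts_by_user is a hypothetical stand-in for get_posts_by_user.

```python
from datetime import datetime, timezone

# Invented sample data; in Firestore, post_date would be set by the server.
posts = [
    {"user_id": "user_001", "post_date": datetime(2024, 1, 3, tzinfo=timezone.utc), "title": "Third Post"},
    {"user_id": "user_002", "post_date": datetime(2024, 1, 2, tzinfo=timezone.utc), "title": "Bob Post"},
    {"user_id": "user_001", "post_date": datetime(2024, 1, 1, tzinfo=timezone.utc), "title": "First Post"},
    {"user_id": "user_001", "post_date": datetime(2024, 1, 2, tzinfo=timezone.utc), "title": "Second Post"},
]


def posts_by_user(all_posts, user_id):
    """Keep posts whose user_id matches, then sort them by post_date ascending."""
    matching = [post for post in all_posts if post["user_id"] == user_id]
    return sorted(matching, key=lambda post: post["post_date"])


titles = [post["title"] for post in posts_by_user(posts, "user_001")]
print(titles)  # ['First Post', 'Second Post', 'Third Post']
```

The timestamps here are timezone-aware (UTC), which avoids the ambiguity of naive datetimes if you ever do generate timestamps on the client.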

Working with Your Collections

Once you've created collections and added data, you may want to check on your collections' status and properties. While Firestore is schema-less and does not provide table metadata in the same way as some other databases, you can still retrieve useful information about your collections and documents.

Here's a function that retrieves basic information about a collection:

    def get_collection_info(collection_ref):
        """Get basic information about the collection"""
        docs = list(collection_ref.stream())
        return {
            'name': collection_ref.id,
            'document_count': len(docs),
            'sample_document': docs[0].to_dict() if docs else None
        }

This function streams all documents in the collection, counts them, and returns the collection name and a sample document (if any exist). Because it reads every document, it is best suited to small collections like the ones in this lesson. Firestore collections are created automatically when you add documents, and you can add or remove documents at any time.
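The function above loads every document into a list before counting. A minimal sketch of an alternative, under the assumption that you only need the count and one sample, walks the documents once and keeps just the first one in memory. The helper summarize_documents is hypothetical and works on any iterable of dictionaries, so it can be tried without a Firestore connection.

```python
from typing import Iterable, Optional


def summarize_documents(name: str, docs: Iterable[dict]) -> dict:
    """Count documents in a single pass, keeping only the first as a sample."""
    count = 0
    sample: Optional[dict] = None
    for doc in docs:
        if sample is None:
            sample = doc  # remember the first document only
        count += 1
    return {"name": name, "document_count": count, "sample_document": sample}


# A generator stands in for a stream of documents.
fake_stream = ({"user_id": f"user_{i:03d}"} for i in range(1, 4))
info = summarize_documents("Users_demo", fake_stream)
print(info)
```

With a real collection, the second argument could be a generator such as (snap.to_dict() for snap in collection_ref.stream()). This still reads every document from Firestore; it only reduces how much is held in memory at once.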
Let's see how all these pieces work together in a complete example:

    if __name__ == "__main__":
        # Create and populate simple collection
        users_ref = create_simple_collection()
        user_count = add_sample_users(users_ref)
        info = get_collection_info(users_ref)
        print(f"Simple collection: {info['name']} with {user_count} users")

        # Create and populate posts collection
        posts_ref = create_posts_collection()
        post_count = add_sample_posts(posts_ref)
        info = get_collection_info(posts_ref)
        print(f"Posts collection: {info['name']} with {post_count} posts")

When you run this code, you'll see output similar to:

    Simple collection: Users_1699564823 with 3 users
    Posts collection: UserPosts_1699564825 with 4 posts

The timestamps in the collection names will be different each time you run the code, reflecting the moment when each collection name was generated. This main block demonstrates the complete workflow: create a collection, add data, and verify the results.

Summary and What's Next

You've now learned the fundamental concepts of Google Cloud Firestore and created your first collections. Let's recap the key points.

Every Firestore document is uniquely identified by a document ID within its collection. Choosing meaningful document IDs, such as user IDs, allows for efficient lookups and organization.

Firestore supports a wide range of data types, and most standard Python types are supported directly. You can store strings, numbers, booleans, lists, dictionaries, and more, without needing to define a schema in advance.

To model related or ordered data, you can use fields such as user_id and post_date and take advantage of Firestore's querying and ordering capabilities. Subcollections and nested data structures are also available for more complex data models.

You've seen two collection patterns in this lesson. The first pattern uses a simple collection with user documents identified by user IDs. The second pattern uses a collection of posts, where each post includes a user ID and a date, allowing you to query and order posts efficiently.

In the upcoming practice exercises, you'll create your own collections using these patterns. You'll experiment with different data models and queries to see how they affect your application's performance and flexibility. Understanding how to structure your data and choose document IDs is a skill that develops with practice, and these exercises will help you build that intuition.

Next Lesson: Firestore CRUD and Queries
