Inner Joins with Firebase Functions

Jonathan Gamble

jdgamble555 on Sunday, June 2, 2024 (last modified on Sunday, June 2, 2024)

Because Firestore is a NoSQL database, you can’t use joins like in SQL. You must plan your queries when you write your data, or you must over-fetch. There are no other options unless you use a third-party database alongside Firestore.

TL;DR#

You can’t have a serious Firebase app without needing Firebase Functions to perform joins for you. Luckily, BulkWriter makes these fan-out updates very simple.

Connections#

In SQL, you use joins to fetch related data. To return that data as a nested JSON object, you may have to use a type like jsonb.

One-to-One#

A 1:1 example would be a user and a profile. The user data may be secure, while the profile data is visible to everyone. This is similar to linking a user’s profile data from the Firebase Auth database to the Firestore users or profiles collection.

One-to-Many and Many-to-One#

One-to-many is the most common example in Firestore. If there are many authors, and each author only writes one book, that is one-to-one. However, in reality, each author could write several books. An author or user could write several posts, comments, or have many bookmarks. From the author's perspective, this is a one-to-many relationship. Each user can have multiple items. From the item perspective, this is a many-to-one. Each item can only have one user.

🔍 If you query a list of posts and each post needs its own query for the user, you pay up to 2x the Firestore read cost. Every many-to-one query can get charged double.
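To make the read cost concrete, here is a rough sketch. The `estimateReads` helper is hypothetical and only does the arithmetic; it assumes the worst case, where every post needs a separate user read and nothing is cached or de-duplicated.

```typescript
// Hypothetical helper: estimates Firestore document reads for a list of posts.
// joinedOnWrite = true means the user data was already copied into each post.
function estimateReads(postCount: number, joinedOnWrite: boolean): number {
    // Without the join, every post costs one extra read for its author.
    return joinedOnWrite ? postCount : postCount * 2;
}

console.log(estimateReads(20, false)); // 40
console.log(estimateReads(20, true));  // 20
```

If many posts share the same author, a client-side cache would lower the first number, but the join on write removes the extra reads entirely.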

Many-to-Many#

For completeness, many-to-many relationships are like a book with many authors. Or, a student can belong to many classes, and a class can have many students. I have other posts on these.

Join with Reference Type#

If you don’t care about reads or the extra latency of several queries in one, you may want to look into the Reference Type for Joins. This has advantages when you need a quick mock-up or don’t want to deal with Firebase Functions.

Firebase Functions#

Before deploying any functions during the creation and testing phases, I suggest you test with Firebase Emulators on your local machine. This will save you time trying to redeploy for every change and can keep you from managing a secondary cloud database.

Initialization#

First, update your Firebase CLI to the latest version.

	 npm i -g firebase-tools

Next, add Firebase to your project.

	firebase init

You will be asked to log in and link to a current Firebase project. I highly suggest you use TypeScript and avoid ESLint unless you’re a pro at it. The default configuration is out-of-date and too strict.

Generation 2#

I will use only Generation 2 Functions for this example, as they are faster and allow you to trigger from databases other than the default one.

Inner Joins#

To perform joins, you must copy data into the document you want to query. You also need to know your queries when you write your data, before any query occurs. Joins will take the form of nested JSON objects in a Firestore document.
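As a rough sketch, a joined document could look like the shape below. The `createdBy` object with `uid`, `displayName`, and `photoURL` matches the fields used later in this post; the `text` and `createdAt` fields and the `buildComment` helper are assumptions for illustration only.

```typescript
// The nested user data copied into each comment (the "join")
interface UserSummary {
    uid: string;
    displayName: string;
    photoURL: string;
}

// Assumed comment shape: only createdBy comes from this post
interface CommentDoc {
    text: string;
    createdAt: number;
    createdBy: UserSummary;
}

// Hypothetical helper that builds the document you would write to Firestore
function buildComment(
    text: string,
    user: UserSummary,
    now: number = Date.now()
): CommentDoc {
    return {
        text,
        createdAt: now,
        // Copy only the fields you plan to display
        createdBy: {
            uid: user.uid,
            displayName: user.displayName,
            photoURL: user.photoURL
        }
    };
}
```

Reading a comment now returns the author's name and photo in the same document, so no second query is needed.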

User Operations#

Let’s say you want to include the user’s (author’s) information on each post. The join involves 6 operations, but depending on your application, you may need only 3 triggers. It is common to use only two of them.

  1. Add a User
    1. You can’t add a post without a user, so there is nothing to do.
  2. Modify a User
    1. ❗This is the expensive operation. You must update every post by that user with the new user data only if the user’s data has changed. We use an onDocumentUpdated trigger on the users collection for this.
  3. Delete a User
    1. Most modern apps use a soft delete, which requires marking every post by that user as hidden. You could add a filter or copy the posts to an archive collection. Use an onDocumentDeleted trigger on the users collection, then grab all posts and mark them as private, or use some other field name for the filter.
    2. You could also emulate a Cascade Delete, and delete every post by that user. Every post could have comments, and you could delete every comment with a separate Cascade Delete emulation. Use an onDocumentDeleted trigger on the users collection, then grab all posts and delete them individually. You could have the onDocumentDeleted trigger on the posts collection to do the same for the comments.
  4. Create a Post
    1. You can add the user information directly when the post is created, and this can be enforced with Firestore Rules. There may be edge cases where you have to add an onDocumentCreated trigger to your posts collection to enforce the correct data, but I am only mentioning it for awareness.
  5. Update a Post
    1. Again, using Firestore Security Rules you can easily prevent the nested user data from being edited.
  6. Delete a Post
    1. Everything gets deleted, so there is nothing to do.
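The soft delete from step 3 could be sketched like this. To keep the sketch runnable without the Admin SDK, `DocRefLike` and `WriterLike` are stand-in types I made up; they only mimic the `set(ref, data, { merge: true })` call used with BulkWriter later in this post. The `hidden` field name is also an assumption.

```typescript
// Stand-in types (assumptions), not the real Admin SDK types
interface DocRefLike {
    path: string;
}

interface WriterLike {
    set(
        ref: DocRefLike,
        data: Record<string, unknown>,
        options: { merge: boolean }
    ): unknown;
}

// Mark every post by a deleted user as hidden instead of deleting it
function softDeletePosts(postRefs: DocRefLike[], writer: WriterLike): number {
    for (const ref of postRefs) {
        // merge keeps the rest of the post document untouched
        writer.set(ref, { hidden: true }, { merge: true });
    }
    return postRefs.length;
}
```

Your queries would then need to filter out hidden posts, which is the trade-off of a soft delete.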

Comment Page Example#

I made an example app where users can add comments to a page. I am using comments and profiles collections. We will need 3 Triggers.

  1. Update a Profile
    1. Make sure all comments by that user get updated
  2. Create a Comment
    1. Add nested user data automatically
  3. Delete a Profile
    1. Cascade Delete all comments

Update a Profile#

We update all comments with the latest user information. This app displays the photoURL and the displayName. Although not strictly necessary, I also update the Firebase Auth database.

⚠️ Remember that the Firebase Auth profile must also be updated on the client for the change to appear immediately in the current session. There are lots of moving parts.

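	import { initializeApp } from 'firebase-admin/app';
import { getAuth } from 'firebase-admin/auth';
import { onDocumentUpdated } from 'firebase-functions/v2/firestore';

// Assumed setup: initialize the Admin SDK once and grab the Auth instance
initializeApp();
const adminAuth = getAuth();

export const updateComments = onDocumentUpdated(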
    'profiles/{docId}',
    async (event) => {

        const eventData = event.data;

        if (!eventData) {
            return null;
        }

        const userId = event.params.docId;

        // Update user in Firebase Auth
        const { displayName, photoURL } = eventData.after.data();

        await adminAuth.updateUser(userId, {
            displayName,
            photoURL
        });

        // Update all comments
        const db = eventData.after.ref.firestore;

        const bulkWriter = db.bulkWriter();

        const comments = await db.collection('comments')
            .where('createdBy.uid', '==', userId)
            .get();

        // Bulk Update by looping
        comments.forEach((comment) => {
            bulkWriter.set(comment.ref, {
                createdBy: {
                    displayName,
                    photoURL
                }
            }, { merge: true });
        });

        bulkWriter.onWriteError((error) => {
            const MAX_RETRIES = 3;
            if (error.failedAttempts < MAX_RETRIES) {
                return true;
            }
            // Handle errors here
            return false;
        });

        await bulkWriter.close();

        return event;
    });

Bulk Writer#

Using a batch is faster than consecutive writes but not faster than simultaneous writes. For that, we use BulkWriter. It can retry failed writes up to 10 times (the example above caps this at 3), but we are still stuck with the Firebase Functions limit of 9 minutes per function call. This should not be a problem. Copying the data into every related document like this is known as the fan-out method, or aggregating the data. If we need more processing power than Firebase Functions can provide, we should use the trigger function to offload the bulk updates to a faster server.
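The retry check inside onWriteError can be pulled out into a small function. This is only a sketch: `makeRetryPolicy` and `WriteErrorLike` are hypothetical names, and the only property used is `failedAttempts`, the same one the handlers in this post read.

```typescript
// Stand-in for the error passed to onWriteError (only the field we need)
interface WriteErrorLike {
    failedAttempts: number;
}

// Returns a handler that retries until maxRetries attempts have failed
function makeRetryPolicy(
    maxRetries: number
): (error: WriteErrorLike) => boolean {
    return (error) => error.failedAttempts < maxRetries;
}

const shouldRetry = makeRetryPolicy(3);

console.log(shouldRetry({ failedAttempts: 1 })); // true
console.log(shouldRetry({ failedAttempts: 3 })); // false
```

Assuming the handler signature matches, you could then pass the policy straight in, as in `bulkWriter.onWriteError(shouldRetry)`, and reuse it across triggers.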

Create a Comment#

We trigger a function when a comment is created, but only rewrite the comment document if the user data was not properly added. Normally, we want to add the data on the client.

You can pick and choose any field(s). Since it costs money to read a users document, I am getting the data from the Firebase Auth database. You must ALWAYS think about reads. However, most of the time, you should read another document to join data.

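	import { onDocumentCreated } from 'firebase-functions/v2/firestore';

// adminAuth is the Admin SDK Auth instance, e.g. getAuth() from 'firebase-admin/auth'
export const createComment = onDocumentCreated(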
    'comments/{commentId}',
    async (event) => {

        const eventData = event.data;

        if (!eventData) {
            return null;
        }

        // createdBy should be enforced in Security Rules
        const docData = eventData.data();

        const {
            displayName,
            photoURL
        } = await adminAuth.getUser(docData.createdBy.uid);

        // Compare against the nested createdBy data, not top-level fields
        if (
            docData.createdBy.displayName !== displayName
            || docData.createdBy.photoURL !== photoURL
        ) {
            return eventData.ref.set({
                createdBy: {
                    displayName,
                    photoURL
                }
            }, { merge: true });
        }

        return event;
    });

Cascade Delete#

This is just an example. The concept for deleting is exactly the same as for updating: query every comment by the user, then loop through the results with BulkWriter.

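	import { onDocumentDeleted } from 'firebase-functions/v2/firestore';

// adminAuth is the Admin SDK Auth instance, e.g. getAuth() from 'firebase-admin/auth'
export const deleteProfile = onDocumentDeleted(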
    { document: 'profiles/{docId}' },
    async (event) => {

        const eventData = event.data;

        if (!eventData) {
            return null;
        }

        const userId = event.params.docId;

        // Delete in Firebase Auth
        await adminAuth.deleteUser(userId);

        const db = eventData.ref.firestore;

        // Delete all comments...

        const bulkWriter = db.bulkWriter();

        const comments = await db.collection('comments')
            .where('createdBy.uid', '==', userId)
            .get();

        // Bulk Delete by looping
        comments.forEach((comment) => {
            bulkWriter.delete(comment.ref);
        });

        bulkWriter.onWriteError((error) => {
            const MAX_RETRIES = 3;
            if (error.failedAttempts < MAX_RETRIES) {
                return true;
            }
            // Handle errors here
            return false;
        });

        await bulkWriter.close();

        // Delete all posts...

        // ...

        return event;
    });

Trigger from a Different Database#

Firebase Functions Generation 2 allows you to select a different database. Instead of using a string for the first input of the trigger, you could use an object with a database key.

	export const deleteProfile = onDocumentDeleted(
    { document: 'profiles/{docId}', database: 'cars' },
    async (event) => {
        // ... same handler body as the Cascade Delete example
    });

Update Only Changed#

There could be situations where a user is updated, and some of the nested data has already been updated. You could save writes by checking for the updates first, although each conditional could slow down write time. This probably won’t be significant.

	// Bulk Update by looping
comments.forEach((comment) => {

    const { createdBy } = comment.data();

    // only update if data has changed
    if (
        createdBy.displayName !== displayName
        || createdBy.photoURL !== photoURL
    ) {

        bulkWriter.set(comment.ref, {
            createdBy: {
                displayName,
                photoURL
            }
        }, { merge: true });
    }
});

Even Faster Reads#

The Firebase REST API allows you to query only the document IDs for a query. This still costs you one document read per document, but it could save you time and money by not downloading all the document data. However, the setup time will be longer since converting a query to the REST API is cumbersome.
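As a rough idea of what that conversion involves, the sketch below builds a request body for the REST `documents:runQuery` endpoint that selects only `__name__` (the document name). The `buildIdOnlyQuery` helper is hypothetical, and you should check the body shape against the Firestore REST reference before relying on it.

```typescript
// Hypothetical helper: builds a structured query that returns only document names
function buildIdOnlyQuery(
    collectionId: string,
    fieldPath: string,
    value: string
) {
    return {
        structuredQuery: {
            // Selecting only __name__ skips downloading the document fields
            select: { fields: [{ fieldPath: '__name__' }] },
            from: [{ collectionId }],
            where: {
                fieldFilter: {
                    field: { fieldPath },
                    op: 'EQUAL',
                    value: { stringValue: value }
                }
            }
        }
    };
}

// Mirrors the comments query used in the triggers above
const idOnlyBody = buildIdOnlyQuery('comments', 'createdBy.uid', 'some-user-id');
console.log(JSON.stringify(idOnlyBody, null, 2));
```

You would send this body as JSON in an authenticated POST request, then parse the document names out of the response to get the references to write to.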

How to Avoid the Joins#

  1. Update a Profile
    1. Don’t display the user information.
    2. Don’t allow user information to be changed (Ex. use a username). This is why you can’t edit a tweet on Twitter; it is an expensive operation.
    3. Grab the data on the client, and cache it.
  2. Create a Comment
    1. Add the User information on the client, and enforce it with Firestore Rules (you should do this anyway)
  3. Delete a Profile
    1. Don’t allow a profile to be deleted.
    2. Disable the profile but not the posts or comments.
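For the "grab the data on the client, and cache it" option, a minimal sketch could look like this. The `createProfileCache` helper and the `ProfileFetcher` type are hypothetical; in a real app, the fetcher would read the profile document from Firestore.

```typescript
interface Profile {
    displayName: string;
    photoURL: string;
}

// Assumed signature for whatever reads a profile from Firestore
type ProfileFetcher = (uid: string) => Promise<Profile>;

// Returns a getter that fetches each profile at most once per session
function createProfileCache(fetchProfile: ProfileFetcher) {
    const cache = new Map<string, Promise<Profile>>();

    return (uid: string): Promise<Profile> => {
        let cached = cache.get(uid);
        if (!cached) {
            cached = fetchProfile(uid);
            cache.set(uid, cached);
            // Drop failed lookups so a later call can try again
            cached.catch(() => {
                cache.delete(uid);
            });
        }
        return cached;
    };
}
```

With this, a page of 20 comments from 3 authors costs 3 profile reads instead of 20, but users may see stale data until the cache is cleared.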

More on One-to-Many#

This method could also save you read charges and give you a better developer experience when reading several documents. You don’t want to make 10 API calls if you have one post with 9 comments. You could aggregate the latest 10 comments to your post document and use an infinite scroll or a load button to load the rest of the comments.
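Keeping the latest 10 comments on the post document could be sketched as below. The `CommentPreview` shape and the `addLatestComment` helper are assumptions for illustration; in practice, you would run this inside a trigger on the comments collection and write the resulting array back to the post.

```typescript
// Assumed shape for the comment copies stored on the post document
interface CommentPreview {
    id: string;
    text: string;
    createdAt: number;
}

// Returns a new newest-first array, capped at `limit` entries
function addLatestComment(
    latest: CommentPreview[],
    incoming: CommentPreview,
    limit: number = 10
): CommentPreview[] {
    return [incoming, ...latest.filter((c) => c.id !== incoming.id)]
        .sort((a, b) => b.createdAt - a.createdAt)
        .slice(0, limit);
}
```

The post document then carries its own preview list, and only the infinite scroll or load button needs to query the comments collection.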

Demo: Vercel Serverless

Repo: GitHub

For more, see Cloud Functions.
