Because Firestore is a NoSQL database, you can't use joins like in SQL. You must plan your queries when you write your data, or you must over-fetch. There are no other options unless you use a third-party database alongside Firestore.
TL;DR#
You can't have a serious Firebase app without needing Firebase Functions to perform joins for you. Luckily, they make these aggregations very simple with BulkWriter.
Connections#
In SQL, you use joins to get nested data. You may have to use a jsonb type and return the data as a nested JSON object.
One-to-One#
A 1:1 example would be a user and a profile. The user data may be secure, while the profile data is visible to everyone. This is similar to linking a user's profile data from the Firebase Auth database to the Firestore users or profiles collection.
One-to-Many and Many-to-One#
One-to-many is the most common example in Firestore. If there are many authors, and each author only writes one book, that is one-to-one. However, in reality, each author could write several books. An author or user could write several posts or comments, or have many bookmarks. From the author's perspective, this is a one-to-many relationship: each user can have multiple items. From the item's perspective, this is many-to-one: each item can only have one user.
Note: If you query a list of posts, each post needs to query a user. You have 2x the Firestore read cost. This means for every many-to-one query, you get charged double.
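To make the doubling concrete, here is a small sketch of the arithmetic. It assumes one read per post plus one uncached read per author lookup, and it ignores the case where several posts share an author. The function names are made up for illustration.

```typescript
// Reads charged when each post must also fetch its author document.
// Assumes every author lookup is a separate, uncached document read.
function readsWithClientJoin(postCount: number): number {
  const postReads = postCount;
  const authorReads = postCount; // one extra read per post
  return postReads + authorReads;
}

// Reads charged when the author data is already copied into each post.
function readsWithEmbeddedAuthor(postCount: number): number {
  return postCount;
}
```

For a page of 25 posts, that is 50 reads with the extra lookups versus 25 when the author data lives inside the post.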
Many-to-Many#
For completeness, many-to-many relationships are like a book with many authors. Or, a student can belong to many classes, and a class can have many students. I have other posts on these.
Join with Reference Type#
If you don't care about reads or the extra latency of several queries in one, you may want to look into the Reference Type for Joins. This has advantages when you need a quick mock-up or don't want to deal with Firebase Functions.
Firebase Functions#
Before deploying any functions during the creation and testing phases, I suggest you test with Firebase Emulators on your local machine. This will save you time trying to redeploy for every change and can keep you from managing a secondary cloud database.
Initialization#
First, update your Firebase CLI to the latest version.
npm i -g firebase-tools
Next, add Firebase to your project.
firebase init
You will be asked to log in and link to a current Firebase project. I highly suggest you use TypeScript and avoid ESLint unless you're a pro at it. The default configuration is out-of-date and too strict.
Generation 2#
I will use only Generation 2 Functions for this example, as they are faster and allow you to select more than one database.
Inner Joins#
To perform joins, you must copy data into the document you want to query. You also need to know your queries when you write your data, before any query occurs. Joins will take the form of nested JSON objects in a Firestore document.
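As a rough sketch of what such a joined document could look like, the types below nest a copy of the author's public data inside each post. The createdBy shape mirrors the comment example later in this post; the Post fields and the buildPost helper are assumptions for illustration only.

```typescript
// Hypothetical shapes for illustration; the field names are assumptions.
interface UserSummary {
  uid: string;
  displayName: string;
  photoURL: string;
}

interface Post {
  title: string;
  content: string;
  // The "join": a copy of the author's public data stored on the post.
  createdBy: UserSummary;
}

// Build the post document with the author data nested inside it.
function buildPost(title: string, content: string, author: UserSummary): Post {
  return {
    title,
    content,
    createdBy: {
      uid: author.uid,
      displayName: author.displayName,
      photoURL: author.photoURL
    }
  };
}
```

Because the author data is copied rather than referenced, a single read of the post returns everything needed to render it.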
User Operations#
Let's say you want to include the user's (author's) information on each post. While the join involves 6 operations, depending on your application, you may need only 3 triggers. It is common to use only two of them.
- Add a User
  - You can't add a post without a user, so there is nothing to do.
- Modify a User
  - This is the expensive operation. You must update every post by that user with the new user data, but only if the user's data has changed. We use an onDocumentUpdated trigger on the users collection for this.
- Delete a User
  - Most modern apps use a soft delete, which requires marking every one of the user's posts as hidden. You could add a filter or copy the posts to an archive collection. Use an onDocumentDeleted trigger on the users collection, then grab all posts and mark them as private or use some other name for a filter.
  - You could also emulate a Cascade Delete, and delete every post by that user. Every post could have comments, and you could delete every comment with a separate Cascade Delete emulation. Use an onDocumentDeleted trigger on the users collection, then grab all posts and delete them individually. You could have an onDocumentDeleted trigger on the posts collection do the same for the comments.
- Create a Post
  - You can add the user information directly when the post is created, and this can be enforced with Firestore Rules. There may be edge cases where you have to add an onDocumentCreated trigger to your posts collection to enforce the correct data, but I am only mentioning it for awareness.
- Update a Post
  - Again, using Firestore Security Rules you can easily prevent the nested user data from being edited.
- Delete a Post
  - Everything gets deleted, so there is nothing to do.
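The soft delete option in the list above could be sketched roughly as follows. To keep the snippet independent of the SDK, it uses two minimal stand-in interfaces; in a real function the writer would be a Firestore BulkWriter and the refs would be the document references of the user's posts. The hidden field name is an assumption.

```typescript
// Minimal structural stand-ins so the sketch does not depend on the SDK.
// In a real function these would be a Firestore BulkWriter and
// DocumentReference objects from the Admin SDK.
interface RefLike {
  path: string;
}

interface WriterLike {
  update(ref: RefLike, data: Record<string, unknown>): unknown;
}

// Soft delete: flag every post as hidden instead of removing it.
// The `hidden` field name is an assumption.
function hidePosts(writer: WriterLike, postRefs: RefLike[]): number {
  for (const ref of postRefs) {
    writer.update(ref, { hidden: true });
  }
  // Number of updates queued on the writer
  return postRefs.length;
}
```

Queries that list posts would then need to filter on the same flag so hidden posts never reach the client.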
Comment Page Example#
I made an example app where users can add comments to a page. I am using comments and profiles collections. We will need 3 Triggers.
- Update a Profile
  - Make sure all comments by that user get updated
- Create a Comment
  - Add nested user data automatically
- Delete a Profile
  - Cascade Delete all comments
Update a Profile#
We update all comments with the latest user information. This app displays the photoURL and the displayName. Although not strictly necessary, I also update the Firebase Auth database.
⚠️ Remember that the Firebase Auth database must be updated on the client to be updated immediately in the session. There are lots of moving parts.
import { initializeApp } from 'firebase-admin/app';
import { getAuth } from 'firebase-admin/auth';
import { onDocumentUpdated } from 'firebase-functions/v2/firestore';

initializeApp();
const adminAuth = getAuth();

export const updateComments = onDocumentUpdated(
  'profiles/{docId}',
  async (event) => {
    const eventData = event.data;
    if (!eventData) {
      return null;
    }
    const userId = event.params.docId;

    // Update user in Firebase Auth
    const { displayName, photoURL } = eventData.after.data();
    await adminAuth.updateUser(userId, {
      displayName,
      photoURL
    });

    // Update all comments
    const db = eventData.after.ref.firestore;
    const bulkWriter = db.bulkWriter();

    // Register the retry handler before queuing any writes
    bulkWriter.onWriteError((error) => {
      const MAX_RETRIES = 3;
      if (error.failedAttempts < MAX_RETRIES) {
        return true;
      }
      // Handle errors here
      return false;
    });

    const comments = await db.collection('comments')
      .where('createdBy.uid', '==', userId)
      .get();

    // Bulk Update by looping
    comments.forEach((comment) => {
      bulkWriter.set(comment.ref, {
        createdBy: {
          displayName,
          photoURL
        }
      }, { merge: true });
    });

    // Flush the queued writes and wait for them to finish
    await bulkWriter.close();
    return event;
  }
);
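The trigger fires on any change to the profile document, including fields the comments never display. One way to skip the fan-out in that case is a small comparison helper, sketched below. It assumes the only copied fields are displayName and photoURL, as in this example; the helper name is made up.

```typescript
// Fields copied into each comment; taken from the example above.
interface AuthorFields {
  displayName?: string;
  photoURL?: string;
}

// True when at least one denormalized field differs between snapshots.
function authorFieldsChanged(before: AuthorFields, after: AuthorFields): boolean {
  return before.displayName !== after.displayName
    || before.photoURL !== after.photoURL;
}
```

Inside the trigger, you could call it with eventData.before.data() and eventData.after.data() and return early when it yields false, which avoids both the query and the writes.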
Bulk Writer#
Using a batch is faster than consecutive writes but not faster than simultaneous writes. For that, we use BulkWriter. We can retry up to 10 times but are still stuck with the Firebase Functions limit of 9 minutes per function call. This should not be a problem. This is known as the fan-out method or aggregating the data. If we need more processing power than Firebase Functions can provide, we should use the trigger function to offload the bulk updates to a faster server.
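Since the same retry logic appears in more than one trigger, it could be pulled into a small factory. This is only a sketch: the retryUpTo name and the WriteErrorLike interface are made up, and the interface lists just the failedAttempts field that the triggers in this post already read from the error.

```typescript
// Only the field this sketch needs; the SDK's error object carries more.
interface WriteErrorLike {
  failedAttempts: number;
}

// Returns a callback that asks for a retry until maxRetries is reached.
function retryUpTo(maxRetries: number): (error: WriteErrorLike) => boolean {
  return (error) => error.failedAttempts < maxRetries;
}
```

A trigger could then register it with bulkWriter.onWriteError(retryUpTo(3)) instead of repeating the inline handler.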
Create a Comment#
We trigger the comments function, but only re-update the comment document if the data has not been properly added. Normally, we want to add the data on the client.
You can pick and choose any field(s). Since it costs money to read a users document, I am getting the data from the Firebase Auth database. You must ALWAYS think about reads. However, most of the time, you will have to read another document to join the data.
import { onDocumentCreated } from 'firebase-functions/v2/firestore';

// adminAuth is the Firebase Admin Auth instance,
// e.g. getAuth() from 'firebase-admin/auth'
export const createComment = onDocumentCreated(
  'comments/{commentId}',
  async (event) => {
    const eventData = event.data;
    if (!eventData) {
      return null;
    }

    // createdBy should be enforced in Security Rules
    const docData = eventData.data();
    const {
      displayName,
      photoURL
    } = await adminAuth.getUser(docData.createdBy.uid);

    // Only rewrite the comment if the nested user data is wrong
    if (
      docData.createdBy.displayName !== displayName
      || docData.createdBy.photoURL !== photoURL
    ) {
      return eventData.ref.set({
        createdBy: {
          displayName,
          photoURL
        }
      }, { merge: true });
    }
    return event;
  }
);
Cascade Delete#
This is just an example. The concept for deleting is exactly the same as for updating.
import { onDocumentDeleted } from 'firebase-functions/v2/firestore';

// adminAuth is the Firebase Admin Auth instance,
// e.g. getAuth() from 'firebase-admin/auth'
export const deleteProfile = onDocumentDeleted(
  { document: 'profiles/{docId}' },
  async (event) => {
    const eventData = event.data;
    if (!eventData) {
      return null;
    }
    const userId = event.params.docId;

    // Delete in Firebase Auth
    await adminAuth.deleteUser(userId);
    const db = eventData.ref.firestore;

    // Delete all comments...
    const bulkWriter = db.bulkWriter();

    // Register the retry handler before queuing any deletes
    bulkWriter.onWriteError((error) => {
      const MAX_RETRIES = 3;
      if (error.failedAttempts < MAX_RETRIES) {
        return true;
      }
      // Handle errors here
      return false;
    });

    const comments = await db.collection('comments')
      .where('createdBy.uid', '==', userId)
      .get();

    // Bulk Delete by looping
    comments.forEach((comment) => {
      bulkWriter.delete(comment.ref);
    });

    // Flush the queued deletes and wait for them to finish
    await bulkWriter.close();

    // Delete all posts...
    // ...
    return event;
  }
);
Trigger from a Different Database#
Firebase Functions Generation 2 allows you to select a different database. Instead of using a string for the first input of the trigger, you could use an object with a database key.
export const deleteProfile = onDocumentDeleted(
  { document: 'profiles/{docId}', database: 'cars' },
  async (event) => {
    // ...same handler body as before
  }
);
Update Only Changed#
There could be situations where a user is updated, and some of the nested data has already been updated. You could save writes by checking for the updates first, although each conditional could slow down write time. This probably won't be significant.
// Bulk Update by looping
comments.forEach((comment) => {
  const { createdBy } = comment.data();
  // only update if data has changed
  if (
    createdBy.displayName !== displayName
    || createdBy.photoURL !== photoURL
  ) {
    bulkWriter.set(comment.ref, {
      createdBy: {
        displayName,
        photoURL
      }
    }, { merge: true });
  }
});
Even Faster Reads#
The Firebase REST API allows you to query only the document IDs for a query. This still costs you one document read per document, but it could save you time and money by not downloading all the document data. However, the setup time will be longer since converting a query to the REST API is cumbersome.
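As a rough idea of what the converted query could look like, the function below builds a request body for the REST runQuery method that selects only the document name. This is a sketch based on my understanding of the StructuredQuery format, reusing the comments collection and createdBy.uid filter from the triggers above; the function name is made up. The body would be sent as a POST to the documents:runQuery endpoint of your project and database, with the usual authorization header.

```typescript
// Body for the Firestore REST `runQuery` method (a StructuredQuery).
// Selecting only `__name__` asks for document names without field data.
function buildIdOnlyQuery(userId: string) {
  return {
    structuredQuery: {
      select: { fields: [{ fieldPath: '__name__' }] },
      from: [{ collectionId: 'comments' }],
      where: {
        fieldFilter: {
          field: { fieldPath: 'createdBy.uid' },
          op: 'EQUAL',
          value: { stringValue: userId }
        }
      }
    }
  };
}
```

Each result still counts as a document read, but the response carries only the document paths, which is all the bulk update or delete needs.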
How to Avoid the Joins#
- Update a Profile
  - Don't display the user information.
  - Don't allow user information to be changed (Ex. use a username). This is why you can't edit a tweet on Twitter; it is an expensive operation.
  - Grab the data on the client, and cache it.
- Create a Comment
  - Add the User information on the client, and enforce it with Firestore Rules (you should do this anyway)
- Delete a Profile
  - Don't allow a profile to be deleted.
  - Disable the profile but not the posts or comments.
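For the "grab the data on the client, and cache it" option in the list above, a minimal in-memory cache might look like the sketch below. The memoizeByUid name is made up, and load stands in for whatever reads the profile; on a real client it would typically return a Promise for the profile document, so concurrent callers would share one request.

```typescript
// Wraps any per-user loader with an in-memory cache keyed by uid.
// `load` is a placeholder for whatever fetches the profile.
function memoizeByUid<T>(load: (uid: string) => T): (uid: string) => T {
  const cache = new Map<string, T>();
  return (uid) => {
    if (!cache.has(uid)) {
      // First request for this uid: load it once and remember the result
      cache.set(uid, load(uid));
    }
    return cache.get(uid) as T;
  };
}
```

With this in place, a list of comments by the same author costs one profile read instead of one per comment, at the price of possibly showing stale data until the cache is cleared.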
More on One-to-Many#
This method could also save you read charges and give you a better developer experience when reading several documents. You don't want to make 10 API calls if you have one post with 9 comments. You could aggregate the latest 10 comments to your post document and use an infinite scroll or a load button to load the rest of the comments.
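Keeping the latest comments on the post comes down to maintaining a small capped array. Here is a sketch of that step as a pure function; the CommentPreview shape, the createdAt timestamp in milliseconds, and the function name are assumptions for illustration.

```typescript
// Hypothetical preview shape stored on the post document.
interface CommentPreview {
  id: string;
  text: string;
  createdAt: number; // milliseconds since epoch
}

// Returns a new array with the comment added, newest first, capped at `limit`.
function addToLatest(
  latest: CommentPreview[],
  comment: CommentPreview,
  limit = 10
): CommentPreview[] {
  return [comment, ...latest]
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, limit);
}
```

A create trigger on the comments collection could run this against the post's current array and write the result back, so the post and its first page of comments arrive in a single read.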
Demo: Vercel Serverless
Repo: GitHub
For more, see Cloud Functions.