#web #graphql #juniper

juniper-eager-loading

Eliminate N+1 query bugs when using Juniper

4 releases

✓ Uses Rust 2018 edition

new 0.1.1 Jun 13, 2019
0.1.0 Jun 10, 2019
0.0.2 Jun 10, 2019
0.0.1 May 24, 2019

#2 in #juniper

23 downloads per month

MIT license

59KB
462 lines

juniper-eager-loading

🚨 This library is still experimental and everything is subject to change 🚨

This is a library for avoiding N+1 query bugs designed to work with Juniper and juniper-from-schema.

It is designed to make the most common association setups easy to handle while remaining flexible and allowing you to customize things as needed. It is also 100% data store agnostic, so whether your API is backed by an SQL database or another API, you can still use this library.

See the crate documentation for usage examples and more info.


lib.rs:

juniper-eager-loading is a library for avoiding N+1 query bugs designed to work with Juniper and juniper-from-schema.

🚨 **This library is still experimental and everything is subject to change** 🚨

It is designed to make the most common association setups easy to handle while remaining flexible and allowing you to customize things as needed. It is also 100% data store agnostic, so whether your API is backed by an SQL database or another API, you can still use this library.

If you're familiar with N+1 queries in GraphQL and eager loading, feel free to skip forward to "A real example".

NOTE: Since this library requires juniper-from-schema it is best if you're first familiar with that.


What are N+1 query bugs?

Imagine you have the following GraphQL schema:

schema {
    query: Query
}

type Query {
    allUsers: [User!]!
}

type User {
    id: Int!
    country: Country!
}

type Country {
    id: Int!
}

And someone executes the following query:

query SomeQuery {
    allUsers {
        country {
            id
        }
    }
}

If you resolve that query naively with an SQL database as your data store, you will see something like this in your logs:

select * from users
select * from countries where id = ?
select * from countries where id = ?
select * from countries where id = ?
select * from countries where id = ?
...

This happens because you first load all the users and then, for each user in a loop, you load that user's country. That is 1 query to load the users and N additional queries to load the countries, hence the name "N+1 query". These kinds of bugs can really hurt your app's performance since you're doing many more database calls than necessary.

One possible solution to this is called "eager loading". The idea is to load all countries up front, before looping over the users. So instead of doing N+1 queries you do 2:

select * from users
select * from countries where id in (?, ?, ?, ?)

Since you're loading the countries up front, this strategy is called "eager loading".
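To make the difference concrete, here is a minimal sketch that counts queries for both strategies. `FakeDb` and its methods are hypothetical stand-ins for a real data store and are not part of this library; each method call simply counts as one query.

```rust
use std::cell::Cell;
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct User {
    id: i32,
    country_id: i32,
}

#[derive(Clone, Debug, PartialEq)]
struct Country {
    id: i32,
}

/// Hypothetical in-memory data store that counts how many "queries" it serves.
struct FakeDb {
    users: Vec<User>,
    countries: Vec<Country>,
    queries: Cell<usize>,
}

impl FakeDb {
    fn new(users: Vec<User>, countries: Vec<Country>) -> Self {
        FakeDb { users, countries, queries: Cell::new(0) }
    }

    // Stands in for: select * from users
    fn all_users(&self) -> Vec<User> {
        self.queries.set(self.queries.get() + 1);
        self.users.clone()
    }

    // Stands in for: select * from countries where id = ?
    fn country_by_id(&self, id: i32) -> Option<Country> {
        self.queries.set(self.queries.get() + 1);
        self.countries.iter().find(|c| c.id == id).cloned()
    }

    // Stands in for: select * from countries where id in (...)
    fn countries_by_ids(&self, ids: &[i32]) -> Vec<Country> {
        self.queries.set(self.queries.get() + 1);
        self.countries.iter().filter(|c| ids.contains(&c.id)).cloned().collect()
    }
}

/// Naive resolution: one query for the users plus one query per user.
fn load_naive(db: &FakeDb) -> Vec<(User, Option<Country>)> {
    db.all_users()
        .into_iter()
        .map(|user| {
            let country = db.country_by_id(user.country_id);
            (user, country)
        })
        .collect()
}

/// Eager loading: one query for the users plus one for all of their countries.
fn load_eager(db: &FakeDb) -> Vec<(User, Option<Country>)> {
    let users = db.all_users();
    let ids: Vec<i32> = users.iter().map(|u| u.country_id).collect();
    let by_id: HashMap<i32, Country> = db
        .countries_by_ids(&ids)
        .into_iter()
        .map(|c| (c.id, c))
        .collect();
    users
        .into_iter()
        .map(|user| {
            let country = by_id.get(&user.country_id).cloned();
            (user, country)
        })
        .collect()
}
```

With three users, `load_naive` issues four queries while `load_eager` issues two, and both return the same pairs.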

N+1s in GraphQL

If you're not careful when implementing a GraphQL API you'll have lots of these N+1 query bugs. Whenever a field returns a list of types and those types perform queries in their resolvers, you'll have N+1 query bugs.

This is also a problem in REST APIs. However, because the responses are fixed, we can more easily set up the necessary eager loads because we know the types needed to compute the response.

However, in GraphQL the responses are not fixed. They depend on the incoming queries, which are not known ahead of time. So setting up the correct amount of eager loading requires inspecting the queries before executing them and eager loading the types requested, such that the actual resolvers won't need to run queries. That is exactly what this library does.
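As a rough illustration of "inspecting the query first", the sketch below walks a simplified query tree and lists which nested selections would need an eager load. The `Field` type and `eager_load_plan` function are hypothetical and much simpler than what the library and juniper-from-schema actually do; the sketch also assumes that every field with a sub-selection is an association.

```rust
/// Hypothetical, simplified view of an incoming query: a field name plus the
/// sub-fields selected on it.
struct Field {
    name: &'static str,
    children: Vec<Field>,
}

impl Field {
    /// A field with no sub-selection, such as `id`.
    fn leaf(name: &'static str) -> Self {
        Field { name, children: Vec::new() }
    }

    /// A field with a sub-selection, such as `country { id }`.
    fn with(name: &'static str, children: Vec<Field>) -> Self {
        Field { name, children }
    }
}

/// Return the dotted path of every nested field that has its own selection.
/// Under the assumption above, these are the associations to eager load.
fn eager_load_plan(root: &Field) -> Vec<String> {
    let mut plan = Vec::new();
    collect(root, root.name, &mut plan);
    plan
}

fn collect(field: &Field, path: &str, plan: &mut Vec<String>) {
    for child in &field.children {
        if !child.children.is_empty() {
            let child_path = format!("{}.{}", path, child.name);
            plan.push(child_path.clone());
            // Recurse so that nested associations are planned as well.
            collect(child, &child_path, plan);
        }
    }
}
```

For the query `allUsers { id country { id } }` the plan contains `allUsers.country`; for `allUsers { id }` it is empty, so no country query would be needed at all.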

How this library works at a high level

If you have a GraphQL type like this

type User {
    id: Int!
    country: Country!
}

You might create the corresponding Rust model type like this:

struct User {
    id: i32,
    country_id: i32
}

However, this approach has one big issue: how are you going to resolve the field User.country without doing a database query? All the resolver has access to is a User with a country_id field. It can't get the country without loading it from the database.

Fundamentally these kinds of model structs don't work for eager loading with GraphQL. So this library takes a different approach.

What if we created separate structs for the database models and the GraphQL models? Something like this:

mod models {
    pub struct User {
        id: i32,
        country_id: i32
    }

    pub struct Country {
        id: i32,
    }
}

struct User {
    user: models::User,
    country: HasOne<Country>,
}

struct Country {
    country: models::Country
}

enum HasOne<T> {
    Loaded(T),
    NotLoaded,
}

Now we're able to resolve the query with the following steps:

  1. Load all the users (first query).
  2. Map the users to a list of country ids.
  3. Load all the countries with those ids (second query).
  4. Pair up the users with the country with the correct id, so change User.country from HasOne::NotLoaded to HasOne::Loaded(matching_country).
  5. When resolving the GraphQL field User.country simply return the loaded country.
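The steps above can be sketched in plain Rust as follows. This is an illustration only, not the code the library generates: `eager_load_countries`, the `load_countries` callback, the local `try_unwrap` and the `field_country` method are hypothetical stand-ins, and step 1 (loading the users) is assumed to have happened already.

```rust
use std::collections::HashMap;

mod models {
    #[derive(Clone, Debug, PartialEq)]
    pub struct User {
        pub id: i32,
        pub country_id: i32,
    }

    #[derive(Clone, Debug, PartialEq)]
    pub struct Country {
        pub id: i32,
    }
}

#[derive(Debug, PartialEq)]
enum HasOne<T> {
    Loaded(T),
    NotLoaded,
}

impl<T> HasOne<T> {
    /// Return the loaded value, or an error message if nothing was loaded.
    fn try_unwrap(&self) -> Result<&T, &'static str> {
        match self {
            HasOne::Loaded(value) => Ok(value),
            HasOne::NotLoaded => Err("association was not loaded"),
        }
    }
}

#[derive(Debug, PartialEq)]
struct Country {
    country: models::Country,
}

#[derive(Debug, PartialEq)]
struct User {
    user: models::User,
    country: HasOne<Country>,
}

impl User {
    /// Step 5: the resolver only returns what was loaded up front.
    fn field_country(&self) -> Result<&Country, &'static str> {
        self.country.try_unwrap()
    }
}

/// Steps 2 to 4, given the user models from step 1 and a function that loads
/// countries by id in a single call.
fn eager_load_countries<F>(user_models: Vec<models::User>, load_countries: F) -> Vec<User>
where
    F: FnOnce(&[i32]) -> Vec<models::Country>,
{
    // Step 2: map the users to a deduplicated list of country ids.
    let mut ids: Vec<i32> = user_models.iter().map(|u| u.country_id).collect();
    ids.sort_unstable();
    ids.dedup();

    // Step 3: load all the countries with those ids (the second query).
    let countries: HashMap<i32, models::Country> =
        load_countries(&ids).into_iter().map(|c| (c.id, c)).collect();

    // Step 4: pair each user with the country that has the matching id.
    user_models
        .into_iter()
        .map(|user| {
            let country = match countries.get(&user.country_id) {
                Some(c) => HasOne::Loaded(Country { country: c.clone() }),
                None => HasOne::NotLoaded,
            };
            User { user, country }
        })
        .collect()
}
```

Because `load_countries` is called exactly once regardless of how many users there are, the total stays at two queries.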

A real example

use juniper::{Executor, FieldResult};
use juniper_eager_loading::{prelude::*, EagerLoading, HasOne};
use juniper_from_schema::graphql_schema;
use std::error::Error;

// Define our GraphQL schema.
graphql_schema! {
    schema {
        query: Query
    }

    type Query {
        allUsers: [User!]! @juniper(ownership: "owned")
    }

    type User {
        id: Int!
        country: Country!
    }

    type Country {
        id: Int!
    }
}

// Our model types.
mod models {
    use std::error::Error;
    use juniper_eager_loading::LoadFrom;

    #[derive(Clone)]
    pub struct User {
        pub id: i32,
        pub country_id: i32
    }

    #[derive(Clone)]
    pub struct Country {
        pub id: i32,
    }

    // This trait is required for eager loading countries.
    // It defines how to load a list of countries from a list of ids.
    // Notice that `Connection` is generic and can be whatever you want.
    // This is how this library can be data store agnostic.
    impl LoadFrom<i32> for Country {
        type Error = Box<dyn Error>;
        type Connection = super::DbConnection;

        fn load(
            ids: &[i32],
            db: &Self::Connection,
        ) -> Result<Vec<Self>, Self::Error> {
            // ...
            unimplemented!()
        }
    }
}

// Our sample database connection type.
pub struct DbConnection;

impl DbConnection {
    // Function that will load all the users.
    fn load_all_users(&self) -> Vec<models::User> {
        // ...
        unimplemented!()
    }
}

// Our Juniper context type.
pub struct Context {
    db: DbConnection,
}

impl juniper::Context for Context {}

// Our GraphQL user type.
// `#[derive(EagerLoading)]` takes care of all the heavy lifting.
#[derive(Clone, EagerLoading)]
// You need to set the connection and error type.
#[eager_loading(connection = "DbConnection", error = "Box<dyn Error>")]
pub struct User {
    // This user model is used to resolve `User.id`
    user: models::User,

    // Setup a "has one" association between a user and a country.
    // `default` will use all the default attribute values.
    // Exactly what they are is explained below.
    #[has_one(default)]
    country: HasOne<Country>,
}

// And the GraphQL country type.
#[derive(Clone, EagerLoading)]
#[eager_loading(connection = "DbConnection", error = "Box<dyn Error>")]
pub struct Country {
    country: models::Country,
}

// The root query GraphQL type.
pub struct Query;

impl QueryFields for Query {
    // The resolver for `Query.allUsers`.
    fn field_all_users(
        &self,
        executor: &Executor<'_, Context>,
        trail: &QueryTrail<'_, User, Walked>,
    ) -> FieldResult<Vec<User>> {
        let db = &executor.context().db;
        // Load the model users.
        let user_models = db.load_all_users();

        // Turn the model users into GraphQL users.
        let mut users = User::from_db_models(&user_models);

        // Perform the eager loading.
        // `trail` is used to only eager load the fields that are requested. Because
        // we're using `QueryTrail`s from "juniper_from_schema" it would be a compile
        // error if we eager loaded too much.
        User::eager_load_all_children_for_each(&mut users, &user_models, db, trail)?;

        Ok(users)
    }
}

impl UserFields for User {
    fn field_id(
        &self,
        executor: &Executor<'_, Context>,
    ) -> FieldResult<&i32> {
        Ok(&self.user.id)
    }

    fn field_country(
        &self,
        executor: &Executor<'_, Context>,
        trail: &QueryTrail<'_, Country, Walked>,
    ) -> FieldResult<&Country> {
        // This will unwrap the country from the `HasOne` or return an error if the
        // country wasn't loaded, or wasn't found in the database.
        Ok(self.country.try_unwrap()?)
    }
}

impl CountryFields for Country {
    fn field_id(
        &self,
        executor: &Executor<'_, Context>,
    ) -> FieldResult<&i32> {
        Ok(&self.country.id)
    }
}

fn main() {}
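The body of `load` is left out in the example above. As a sketch of what it might look like, the code below fills it in against an in-memory table. The `LoadFrom` trait here is a local stand-in that only mirrors the shape of the impl shown above, not the library's actual trait definition, and the `countries` field on `DbConnection` is hypothetical.

```rust
use std::error::Error;

/// Local stand-in mirroring the shape of the `LoadFrom` impl shown above.
/// This is an assumption for illustration, not the library's definition.
trait LoadFrom<Id>: Sized {
    type Error;
    type Connection;

    fn load(ids: &[Id], db: &Self::Connection) -> Result<Vec<Self>, Self::Error>;
}

#[derive(Clone, Debug, PartialEq)]
pub struct Country {
    pub id: i32,
}

/// Hypothetical connection type backed by an in-memory table.
pub struct DbConnection {
    countries: Vec<Country>,
}

impl LoadFrom<i32> for Country {
    type Error = Box<dyn Error>;
    type Connection = DbConnection;

    fn load(ids: &[i32], db: &Self::Connection) -> Result<Vec<Self>, Self::Error> {
        // Stands in for: select * from countries where id in (...)
        Ok(db
            .countries
            .iter()
            .filter(|country| ids.contains(&country.id))
            .cloned()
            .collect())
    }
}
```

The important property is that `load` receives all the ids at once, so a real implementation can fetch them with a single query instead of one per id.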

#[derive(EagerLoading)]

For a type to support eager loading it needs to implement the following traits:

Implementing these traits involves lots of boilerplate, therefore you should use #[derive(EagerLoading)] to derive implementations as much as possible.

Sometimes you might need customized eager loading for a specific association. In that case you should still have #[derive(EagerLoading)] on your struct, but implement EagerLoadChildrenOfType yourself for the field that requires a custom setup. An example of how to do that can be found here.

Attributes

#[derive(EagerLoading)] supports the following attributes. Those without a default (connection and error) must be provided:

| Name | Description | Default | Example |
|---|---|---|---|
| connection | The type of connection your app uses. This could be a database connection or a connection to another web service. | N/A | connection = "diesel::pg::PgConnection" |
| error | The type of error eager loading might result in. | N/A | error = "diesel::result::Error" |
| model | The model type behind your GraphQL struct. | models::{name of struct} | model = "crate::db::models::User" |
| id | The id type your app uses. | i32 | id = "UUID" |
| root_model_field | The name of the field that holds the backing model. | {name of struct} in snake case | root_model_field = "user" |

Associations

Associations are things like "user has one country". These are the fields that need to be eager loaded to avoid N+1s. Each association works for different kinds of foreign key setups and has to be eager loaded differently. They should fit most kinds of associations you have in your app. Click on each for more detail.

The documentation for each association assumes that you're using an SQL database, but it should be straightforward to adapt to other kinds of data stores.

For each field of your GraphQL struct that is one of these four types the trait EagerLoadChildrenOfType will be implemented by #[derive(EagerLoading)].

Attributes supported on all associations

These are the attributes that are supported on all associations. None of these attributes take arguments.

skip

Skip implementing EagerLoadChildrenOfType for the field. This is useful if you need to provide a custom implementation.

print

This will cause the implementation of EagerLoadChildrenOfType for the field to be printed while compiling. This is useful when combined with skip. It will print a good starting place for you to customize.

The resulting code won't be formatted. We recommend formatting it with rustfmt.

Diesel helper

Implementing LoadFrom for lots of model types might involve lots of boilerplate. If you're using Diesel you can use the impl_LoadFrom_for_diesel macro to hide all of that boilerplate.

When your GraphQL schema doesn't match your database schema

This library supports eager loading most kinds of association setups, however it probably doesn't support all that might exist in your app. It also works best when your database schema closely matches your GraphQL schema.

If you find yourself having to implement something that isn't directly supported, remember that you're still free to implement your resolver functions exactly as you want. So if doing queries in a resolver is the only way to get the behaviour you need, then so be it. Avoiding some N+1 queries is better than avoiding none.

However, if you have a setup that you think this library should support, please don't hesitate to open an issue.

Dependencies

~9.5MB
~190K SLoC