13 releases (5 breaking)

0.5.1	Mar 4, 2020
0.5.0	Nov 27, 2019
0.4.2	Nov 14, 2019
0.2.0	Jun 30, 2019

#12 in #juniper

111 downloads per month

MIT license

70KB
475 lines

juniper-eager-loading

This is a library for avoiding N+1 query bugs designed to work with Juniper and juniper-from-schema.

It is designed to make the most common association setups easy to handle while remaining flexible and letting you customize things as needed. It is also 100% data store agnostic, so whether your API is backed by an SQL database or another API, you can still use this library.

See the crate documentation for usage examples and more info.

`lib.rs`:

juniper-eager-loading is a library for avoiding N+1 query bugs designed to work with Juniper and juniper-from-schema.

It is designed to make the most common association setups easy to handle while remaining flexible and letting you customize things as needed. It is also 100% data store agnostic, so whether your API is backed by an SQL database or another API, you can still use this library.

If you're familiar with N+1 queries in GraphQL and eager loading, feel free to skip forward to "A real example".

NOTE: Since this library requires juniper-from-schema it is best if you're first familiar with that.

What are N+1 query bugs?
- N+1s in GraphQL
How this library works at a high level
A real example
#[derive(EagerLoading)]
- Attributes
Associations
- Attributes supported on all associations
Eager loading interfaces or unions
Eager loading fields that take arguments
Diesel helper
When your GraphQL schema doesn't match your database schema

What are N+1 query bugs?

Imagine you have the following GraphQL schema:

schema {
    query: Query
}

type Query {
    allUsers: [User!]!
}

type User {
    id: Int!
    country: Country!
}

type Country {
    id: Int!
}

And someone executes the following query:

query SomeQuery {
    allUsers {
        country {
            id
        }
    }
}

If you resolve that query naively with an SQL database as your data store, you will see something like this in your logs:

select * from users
select * from countries where id = ?
select * from countries where id = ?
select * from countries where id = ?
select * from countries where id = ?
...

This happens because you first load all the users and then, for each user in a loop, you load that user's country. That is 1 query to load the users and N additional queries to load the countries, hence the name "N+1 query". These kinds of bugs can really hurt the performance of your app since you're doing many more database calls than necessary.

One possible solution to this is called "eager loading". The idea is to load all countries up front, before looping over the users. So instead of doing N+1 queries you do 2:

select * from users
select * from countries where id in (?, ?, ?, ?)

Since you're loading the countries up front, this strategy is called "eager loading".
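The difference between the two strategies can be made concrete with a small, self-contained sketch. It does not use this library or a real database: `FakeDb`, its methods and `sample_db` are hypothetical stand-ins that only count how many "queries" are issued.

```rust
use std::cell::Cell;
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct User { id: i32, country_id: i32 }

#[derive(Clone, Debug, PartialEq)]
struct Country { id: i32 }

/// Hypothetical in-memory data store that counts the "queries" it serves.
struct FakeDb {
    users: Vec<User>,
    countries: Vec<Country>,
    queries: Cell<usize>,
}

impl FakeDb {
    fn count_query(&self) {
        self.queries.set(self.queries.get() + 1);
    }

    fn query_count(&self) -> usize {
        self.queries.get()
    }

    // Stands in for: select * from users
    fn all_users(&self) -> Vec<User> {
        self.count_query();
        self.users.clone()
    }

    // Stands in for: select * from countries where id = ?
    fn country_by_id(&self, id: i32) -> Option<Country> {
        self.count_query();
        self.countries.iter().find(|c| c.id == id).cloned()
    }

    // Stands in for: select * from countries where id in (...)
    fn countries_by_ids(&self, ids: &[i32]) -> Vec<Country> {
        self.count_query();
        self.countries.iter().filter(|c| ids.contains(&c.id)).cloned().collect()
    }
}

/// Naive resolution: one query for the users, then one more per user.
fn resolve_naively(db: &FakeDb) -> Vec<(User, Option<Country>)> {
    db.all_users()
        .into_iter()
        .map(|user| {
            let country = db.country_by_id(user.country_id);
            (user, country)
        })
        .collect()
}

/// Eager loading: one query for the users, one for all of their countries.
fn resolve_eagerly(db: &FakeDb) -> Vec<(User, Option<Country>)> {
    let users = db.all_users();
    let ids: Vec<i32> = users.iter().map(|u| u.country_id).collect();
    let by_id: HashMap<i32, Country> = db
        .countries_by_ids(&ids)
        .into_iter()
        .map(|c| (c.id, c))
        .collect();
    users
        .into_iter()
        .map(|user| {
            let country = by_id.get(&user.country_id).cloned();
            (user, country)
        })
        .collect()
}

/// Three users spread over two countries.
fn sample_db() -> FakeDb {
    FakeDb {
        users: vec![
            User { id: 1, country_id: 10 },
            User { id: 2, country_id: 20 },
            User { id: 3, country_id: 10 },
        ],
        countries: vec![Country { id: 10 }, Country { id: 20 }],
        queries: Cell::new(0),
    }
}
```

With the three users in `sample_db`, `resolve_naively` issues 1 + 3 queries while `resolve_eagerly` issues 2, and both return the same pairs.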

N+1s in GraphQL

If you're not careful when implementing a GraphQL API you'll have lots of these N+1 query bugs. Whenever a field returns a list of types and those types perform queries in their resolvers, you'll have N+1 query bugs.

This is also a problem in REST APIs. However, because the responses are fixed, we can more easily set up the necessary eager loads: we know the types needed to compute the response.

However, in GraphQL the responses are not fixed. They depend on the incoming queries, which are not known ahead of time. So setting up the right eager loads requires inspecting each query before executing it and eager loading the requested types so that the actual resolvers won't need to run queries. That is exactly what this library does.
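To illustrate what "inspecting the query" means, here is a toy sketch. The `Selection` type is a hypothetical, heavily simplified stand-in for a parsed selection set; in practice this role is played by the `QueryTrail` type from juniper-from-schema, not by anything shown here.

```rust
/// Hypothetical, simplified stand-in for a parsed GraphQL selection set.
struct Selection {
    name: &'static str,
    children: Vec<Selection>,
}

impl Selection {
    fn leaf(name: &'static str) -> Self {
        Selection { name, children: Vec::new() }
    }

    fn node(name: &'static str, children: Vec<Selection>) -> Self {
        Selection { name, children }
    }

    /// Returns the child selection with the given field name, if the query asked for it.
    fn child(&self, name: &str) -> Option<&Selection> {
        self.children.iter().find(|c| c.name == name)
    }
}

/// Decide which associations of `User` must be eager loaded for this query.
/// Only `country` is an association in the example schema; `id` is a plain field.
fn associations_to_load(all_users: &Selection) -> Vec<&'static str> {
    let mut to_load = Vec::new();
    if all_users.child("country").is_some() {
        to_load.push("country");
    }
    to_load
}
```

For `allUsers { country { id } }` the function reports that `country` must be loaded; for `allUsers { id }` it reports nothing, so the second query can be skipped entirely.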

How this library works at a high level

If you have a GraphQL type like this

type User {
    id: Int!
    country: Country!
}

You might create the corresponding Rust model type like this:

struct User {
    id: i32,
    country_id: i32
}

However this approach has one big issue. How are you going to resolve the field User.country without doing a database query? All the resolver has access to is a User with a country_id field. It can't get the country without loading it from the database...

Fundamentally these kinds of model structs don't work for eager loading with GraphQL. So this library takes a different approach.

What if we created separate structs for the database models and the GraphQL models? Something like this:

mod models {
    pub struct User {
        id: i32,
        country_id: i32
    }

    pub struct Country {
        id: i32,
    }
}

struct User {
    user: models::User,
    country: HasOne<Country>,
}

struct Country {
    country: models::Country
}

enum HasOne<T> {
    Loaded(T),
    NotLoaded,
}

Now we're able to resolve the query with steps like these:

1. Load all the users (first query).
2. Map the users to a list of country ids.
3. Load all the countries with those ids (second query).
4. Pair up each user with the country that has the matching id, changing User.country from HasOne::NotLoaded to HasOne::Loaded(matching_country).
5. When resolving the GraphQL field User.country, simply return the loaded country.
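The steps above can be sketched in plain Rust, without the library. Everything here is an illustration under assumptions: `eager_load_countries`, `new_user` and this `try_unwrap` are hypothetical local helpers, and the `load_countries` closure stands in for the second database query.

```rust
use std::collections::HashMap;

mod models {
    #[derive(Clone, Debug, PartialEq)]
    pub struct User { pub id: i32, pub country_id: i32 }

    #[derive(Clone, Debug, PartialEq)]
    pub struct Country { pub id: i32 }
}

#[derive(Clone, Debug, PartialEq)]
enum HasOne<T> {
    Loaded(T),
    NotLoaded,
}

impl<T> HasOne<T> {
    /// Local stand-in: borrow the loaded value or report that it is missing.
    fn try_unwrap(&self) -> Result<&T, &'static str> {
        match self {
            HasOne::Loaded(value) => Ok(value),
            HasOne::NotLoaded => Err("association was not loaded"),
        }
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Country { country: models::Country }

#[derive(Clone, Debug, PartialEq)]
struct User { user: models::User, country: HasOne<Country> }

/// Hypothetical helper: wrap a model user with its association not yet loaded.
fn new_user(id: i32, country_id: i32) -> User {
    User { user: models::User { id, country_id }, country: HasOne::NotLoaded }
}

/// Steps 2 to 4: collect the ids, load the countries once, pair them up.
fn eager_load_countries<F>(users: &mut [User], load_countries: F)
where
    F: FnOnce(&[i32]) -> Vec<models::Country>,
{
    // Step 2: map the users to a list of country ids.
    let ids: Vec<i32> = users.iter().map(|u| u.user.country_id).collect();

    // Step 3: load all the countries with a single call.
    let by_id: HashMap<i32, models::Country> = load_countries(&ids)
        .into_iter()
        .map(|c| (c.id, c))
        .collect();

    // Step 4: flip `NotLoaded` to `Loaded` where a matching country was found.
    for user in users.iter_mut() {
        if let Some(country) = by_id.get(&user.user.country_id) {
            user.country = HasOne::Loaded(Country { country: country.clone() });
        }
    }
}
```

A resolver for User.country (step 5) would then only call `try_unwrap` on the field. A user whose country was not found stays `NotLoaded`, so the resolver returns an error instead of silently issuing another query.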

A real example

use juniper::{Executor, FieldResult};
use juniper_eager_loading::{prelude::*, EagerLoading, HasOne};
use juniper_from_schema::graphql_schema;
use std::error::Error;

// Define our GraphQL schema.
graphql_schema! {
    schema {
        query: Query
    }

    type Query {
        allUsers: [User!]! @juniper(ownership: "owned")
    }

    type User {
        id: Int!
        country: Country!
    }

    type Country {
        id: Int!
    }
}

// Our model types.
mod models {
    use std::error::Error;
    use juniper_eager_loading::LoadFrom;

    #[derive(Clone)]
    pub struct User {
        pub id: i32,
        pub country_id: i32
    }

    #[derive(Clone)]
    pub struct Country {
        pub id: i32,
    }

    // This trait is required for eager loading countries.
    // It defines how to load a list of countries from a list of ids.
    // Notice that `Context` is generic and can be whatever you want.
    // It will normally be your Juniper context which would contain
    // a database connection.
    impl LoadFrom<i32> for Country {
        type Error = Box<dyn Error>;
        type Context = super::Context;

        fn load(
            ids: &[i32],
            field_args: &(),
            ctx: &Self::Context,
        ) -> Result<Vec<Self>, Self::Error> {
            // ...
            unimplemented!()
        }
    }
}

// Our sample database connection type.
pub struct DbConnection;

impl DbConnection {
    // Function that will load all the users.
    fn load_all_users(&self) -> Vec<models::User> {
        // ...
        unimplemented!()
    }
}

// Our Juniper context type which contains a database connection.
pub struct Context {
    db: DbConnection,
}

impl juniper::Context for Context {}

// Our GraphQL user type.
// `#[derive(EagerLoading)]` takes care of generating all the boilerplate code.
#[derive(Clone, EagerLoading)]
// You need to set the context and error type.
#[eager_loading(
    context = Context,
    error = Box<dyn Error>,

    // These match the default so you wouldn't have to specify them
    model = models::User,
    id = i32,
    root_model_field = user,
)]
pub struct User {
    // This user model is used to resolve `User.id`
    user: models::User,

    // Setup a "has one" association between a user and a country.
    //
    // We could also have used `#[has_one(default)]` here.
    #[has_one(
        foreign_key_field = country_id,
        root_model_field = country,
        graphql_field = country,
    )]
    country: HasOne<Country>,
}

// And the GraphQL country type.
#[derive(Clone, EagerLoading)]
#[eager_loading(context = Context, error = Box<dyn Error>)]
pub struct Country {
    country: models::Country,
}

// The root query GraphQL type.
pub struct Query;

impl QueryFields for Query {
    // The resolver for `Query.allUsers`.
    fn field_all_users(
        &self,
        executor: &Executor<'_, Context>,
        trail: &QueryTrail<'_, User, Walked>,
    ) -> FieldResult<Vec<User>> {
        let ctx = executor.context();

        // Load the model users.
        let user_models = ctx.db.load_all_users();

        // Turn the model users into GraphQL users.
        let mut users = User::from_db_models(&user_models);

        // Perform the eager loading.
        // `trail` is used to only eager load the fields that are requested. Because
        // we're using `QueryTrail`s from "juniper_from_schema" it would be a compile
        // error if we eager loaded associations that aren't requested in the query.
        User::eager_load_all_children_for_each(&mut users, &user_models, ctx, trail)?;

        Ok(users)
    }
}

impl UserFields for User {
    fn field_id(
        &self,
        executor: &Executor<'_, Context>,
    ) -> FieldResult<&i32> {
        Ok(&self.user.id)
    }

    fn field_country(
        &self,
        executor: &Executor<'_, Context>,
        trail: &QueryTrail<'_, Country, Walked>,
    ) -> FieldResult<&Country> {
        // This will unwrap the country from the `HasOne` or return an error if the
        // country wasn't loaded, or wasn't found in the database.
        Ok(self.country.try_unwrap()?)
    }
}

impl CountryFields for Country {
    fn field_id(
        &self,
        executor: &Executor<'_, Context>,
    ) -> FieldResult<&i32> {
        Ok(&self.country.id)
    }
}

`#[derive(EagerLoading)]`

For a type to support eager loading it needs to implement the following traits:

- GraphqlNodeForModel
- EagerLoadAllChildren
- Each association field must implement EagerLoadChildrenOfType

Implementing these traits involves lots of boilerplate, so you should use #[derive(EagerLoading)] to derive implementations as much as possible.

Sometimes you might need customized eager loading for a specific association. In that case you should still have #[derive(EagerLoading)] on your struct, but implement EagerLoadChildrenOfType yourself for the field that requires a custom setup. An example of how to do that can be found here.

If you're interested in seeing full examples without any macros look here.

Attributes

#[derive(EagerLoading)] has a few attributes you need to provide:

Name	Description	Default	Example
`context`	The type of your Juniper context. This will often hold your database connection or something else that can be used to load data.	N/A	`context = Context`
`error`	The type of error eager loading might result in.	N/A	`error = diesel::result::Error`
`model`	The model type behind your GraphQL struct	`models::{name of struct}`	`model = crate::db::models::User`
`id`	Which id type does your app use?	`i32`	`id = UUID`
`root_model_field`	The name of the field that holds the backing model	`{name of struct}` in snake case.	`root_model_field = user`
`primary_key_field`	The field that holds the primary key of the model. This field is only used by code generated for `#[has_many]` and `#[has_many_through]` associations.	`id`	`primary_key_field = identifier`
`print`	If set it will print the generated implementation of `GraphqlNodeForModel` and `EagerLoadAllChildren`	Not set	`print`

Associations

Associations are things like "user has one country". These are the fields that need to be eager loaded to avoid N+1s. Each association works for different kinds of foreign key setups and has to be eager loaded differently. They should fit most kinds of associations you have in your app. Click on each for more detail.

The documentation for each association assumes that you're using an SQL database, but it should be straightforward to adapt to other kinds of data stores.

For each field of your GraphQL struct that is one of the four association types (HasOne, OptionHasOne, HasMany, HasManyThrough), #[derive(EagerLoading)] will implement the trait EagerLoadChildrenOfType.

Attributes supported on all associations

These are the attributes that are supported on all associations.

`skip`

Skip implementing EagerLoadChildrenOfType for the field. This is useful if you need to provide a custom implementation.

`print`

This will cause the implementation of EagerLoadChildrenOfType for the field to be printed while compiling. This is useful when combined with skip. It will print a good starting place for you to customize.

The resulting code won't be formatted. We recommend you do that with rustfmt.

`fields_arguments`

Used to specify the type that'll be used for EagerLoadChildrenOfType::FieldArguments. More info here.

For example #[has_one(fields_arguments = CountryUsersArgs)]. You can find a complete example here.

The code generation defaults EagerLoadChildrenOfType::FieldArguments to (). That works for fields that don't take arguments.

Eager loading interfaces or unions

Eager loading interfaces or unions is possible, but it requires calling .downcast() on the QueryTrail. See the juniper-from-schema docs for more info.

Eager loading fields that take arguments

If you have a GraphQL field that takes arguments you probably have to consider them for eager loading purposes.

If you're relying on code generation for such fields you have to specify the type on the association field. More info here.

If you implement EagerLoadChildrenOfType manually you have to set EagerLoadChildrenOfType::FieldArguments to the type of the arguments struct generated by juniper-from-schema. You can find more info here.

You also have to implement LoadFrom<T, ArgumentType> for your model. You can find a complete example here.

If you see a type error like:

error[E0308]: mismatched types
   --> src/main.rs:254:56
    |
   254 | #[derive(Clone, Eq, PartialEq, Debug, Ord, PartialOrd, EagerLoading)]
    |                                                           ^^^^^^^^^^^^ expected (), found struct `query_trails::CountryUsersArgs`
    |
    = note: expected type `&()`
               found type `&query_trails::CountryUsersArgs<'_>`

This happens because your GraphQL field Country.users takes arguments. The code generation defaults to using () for the type of the arguments, so you get this type error. The neat bit is that the compiler won't let you forget about handling arguments.
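The mechanism behind that error can be shown with a small sketch that does not depend on the library. `LoadWithArgs` is a hypothetical local trait modelled loosely on the `LoadFrom<T, ArgumentType>` shape mentioned above; it leaves out the error and context types. The `active_only` argument and `all_users` helper are invented for illustration.

```rust
/// Hypothetical stand-in trait: the argument type defaults to `()`.
trait LoadWithArgs<Id, Args = ()>: Sized {
    fn load(ids: &[Id], args: &Args) -> Vec<Self>;
}

#[derive(Debug, PartialEq)]
struct Country { id: i32 }

#[derive(Debug, PartialEq)]
struct User { id: i32, country_id: i32, active: bool }

/// Hypothetical arguments struct for a field like `Country.users`.
struct CountryUsersArgs { active_only: bool }

// A field without arguments: the default `Args = ()` applies.
impl LoadWithArgs<i32> for Country {
    fn load(ids: &[i32], _args: &()) -> Vec<Self> {
        ids.iter().map(|&id| Country { id }).collect()
    }
}

// A field with arguments: the argument type has to be named explicitly.
// Code that still passes `&()` here fails with a mismatched types error.
impl LoadWithArgs<i32, CountryUsersArgs> for User {
    fn load(country_ids: &[i32], args: &CountryUsersArgs) -> Vec<Self> {
        all_users()
            .into_iter()
            .filter(|u| country_ids.contains(&u.country_id))
            .filter(|u| !args.active_only || u.active)
            .collect()
    }
}

/// Hypothetical data standing in for a users table.
fn all_users() -> Vec<User> {
    vec![
        User { id: 1, country_id: 10, active: true },
        User { id: 2, country_id: 10, active: false },
        User { id: 3, country_id: 20, active: true },
    ]
}
```

Because the arguments are part of the trait's type, a loader written for `()` cannot be called with `CountryUsersArgs` by accident, which is the same guarantee the error message above reflects.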

Diesel helper

Implementing LoadFrom for lots of model types might involve lots of boilerplate. If you're using Diesel it is recommended that you use one of the macros to generate implementations.

When your GraphQL schema doesn't match your database schema

This library supports eager loading most kinds of association setups, however it probably doesn't support all that might exist in your app. It also works best when your database schema closely matches your GraphQL schema.

If you find yourself having to implement something that isn't directly supported, remember that you're still free to implement your resolver functions exactly as you want. So if doing queries in a resolver is the only way to get the behaviour you need, then so be it. Avoiding some N+1 queries is better than avoiding none.

However, if you have a setup that you think this library should support, please don't hesitate to open an issue.

Dependencies

~7–15MB
~198K SLoC