Performance Issues with larger schemas

We built a larger schema with the new module and faced some severe performance issues.

I spent several hours profiling and found multiple issues that could easily be fixed.

As a starting point, I profiled a fully cached GraphQL request with development mode turned off. The result looked like this:

![image](https://user-images.githubusercontent.com/432045/200847650-bcd7d107-6448-4b0b-9a48-c7bac8ee7da9.png)

On every request, the schema is validated by the webonyx library even though it comes from the cache. This can easily be turned off:

**File `SdlSchemaPluginBase::getSchema()`**
```
    $options = ['assumeValid' => true];
    $schema = BuildSchema::build($document, function ($config, TypeDefinitionNode $type) use ($resolver) {
      if ($type instanceof InterfaceTypeDefinitionNode || $type instanceof UnionTypeDefinitionNode) {
        $config['resolveType'] = $resolver;
      }

      return $config;
    }, $options);
```
`BuildSchema::build()` accepts an `assumeValid` option; when it is passed, validation is skipped. This saved me approximately 300ms per request.
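
To check a saving like this on your own schema, a minimal timing sketch can help. `average_ms()` is a hypothetical helper, not part of the graphql module or webonyx/graphql-php, and the `usleep()` workload is only a stand-in so the snippet runs on its own.

```php
<?php

/**
 * Hypothetical helper: returns the average wall-clock time of $fn in
 * milliseconds over $runs calls.
 */
function average_ms(callable $fn, int $runs = 20): float {
  // Warm-up call so one-time costs such as class autoloading do not skew
  // the measurement.
  $fn();
  $start = hrtime(TRUE);
  for ($i = 0; $i < $runs; $i++) {
    $fn();
  }
  // hrtime(TRUE) returns nanoseconds; convert to milliseconds per run.
  return (hrtime(TRUE) - $start) / $runs / 1e6;
}

// Stand-in workload (assumption). In the module you would instead pass
// closures that call BuildSchema::build() on the same cached $document,
// once with ['assumeValid' => TRUE] and once without.
$workload = function (): void {
  usleep(2000);
};

printf("%.2f ms per call\n", average_ms($workload, 5));
```

Comparing the two averages on the same document gives a rough figure for the validation cost alone, independent of the rest of the request.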

Additionally, to support larger ASTs, we also had to pass the `noLocation` option to `Parser::parse()` to avoid recursion loops.
This also gives a performance boost because the resulting AST is substantially smaller.
```
protected function getSchemaDocument(array $extensions = []) {

...

    $options = ['noLocation' => TRUE];
    $ast = Parser::parse(implode("\n\n", $schema), $options);
    if (empty($this->inDevelopment)) {
      $this->astCache->set($cid, $ast, CacheBackendInterface::CACHE_PERMANENT, ['graphql']);
    }

...
```
I went ahead and found that performance was still an issue. I think there is a serious flaw in the implementation of the SchemaExtender. Here is a timeline of extending the schema after fixing the issues above:

![image](https://user-images.githubusercontent.com/432045/200850169-24cc2175-740d-4c25-9953-a3dce12e2611.png)

As we can see, the function `\GraphQL\Type\Schema::getTypeMap()` is called. Its method description looks like this:
```
    /**
     * Returns array of all types in this schema. Keys of this array represent type names, values are instances
     * of corresponding type definitions
     *
     * This operation requires full schema scan. Do not use in production environment.
     *
     * @return array<string, Type>
     *
     * @api
     */
    public function getTypeMap() : array
```
The maintainer warns us not to run this function in production. But we run it on every request if schema extensions are turned on. :-)

Therefore, I implemented caching of the extended schema:
```
...
    if ($extendSchema = $this->getExtensionDocument($extensions)) {
      // Generate the AST from the extended schema and save it to the cache.
      // This is important, because the Drupal graphql module is not caching the extended schema.
      // During schema extension, a very expensive function \GraphQL\Type\Schema::getTypeMap() is called.
      $document = $this->getExtensionSchemaAST($schema, $extendSchema);
      $options = ['assumeValid' => TRUE];
      $extended_schema = BuildSchema::build($document, function ($config, TypeDefinitionNode $type) use ($resolver) {
        if ($type instanceof InterfaceTypeDefinitionNode || $type instanceof UnionTypeDefinitionNode) {
          $config['resolveType'] = $resolver;
        }
        return $config;
      }, $options);
      return $extended_schema;
    }

...

  public function getExtensionSchemaAST($schema, $extendSchema) {
    $cid = "schema_extension:{$this->getPluginId()}";
    if (empty($this->inDevelopment) && $cache = $this->astCache->get($cid)) {
      return $cache->data;
    }

    $schema = SchemaExtender::extend($schema, $extendSchema);
    $schema_string = SchemaPrinter::doPrint($schema);
    $options = ['noLocation' => TRUE];
    $ast = Parser::parse($schema_string, $options);
    if (empty($this->inDevelopment)) {
      $this->astCache->set($cid, $ast, CacheBackendInterface::CACHE_PERMANENT, ['graphql']);
    }

    return $ast;
  }
```

The first time the schema is extended, we save the resulting AST to the cache. On any subsequent request, we can get the AST from the cache and pass it to `BuildSchema::build()`. This is very fast, because `BuildSchema::build()` lazy-loads our types and does not scan the whole schema.
See the documentation here: https://webonyx.github.io/graphql-php/schema-definition-language/#performance-considerations

These changes result in a massive performance improvement. In total, we saved more than 400ms on each cached request.

![image](https://user-images.githubusercontent.com/432045/200852014-33bae4d5-35a3-42ef-8d0f-7cca3d306a0b.png)

I will create a pull request to address these issues.
