Zod schema guideline

Renovate uses Zod for schema validation, so any new manager or datasource should use Zod as well. This document explains how and why you should use Zod features.

When writing schema validation you want a balance between strictness of explicit contracts between separately developed systems, and the permissiveness of Renovate. We want Renovate to be only as strict as it needs to be.

Renovate should not assume a field is always present, because that assumption may lead to run-time errors when a field turns out to be missing. For example: if Renovate assumes an optional field from a public registry will always be used, it may run into trouble when a self-hosted implementation does not use this field.
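As a rough illustration of the point above, the sketch below marks a field as optional so that a response without it still parses. The `Release` schema and its `sourceUrl` field are hypothetical and are not taken from any Renovate datasource.

```typescript
import { z } from 'zod';

// Hypothetical registry response: `sourceUrl` is usually present on a
// public registry, but a self-hosted registry might omit it.
const Release = z.object({
  version: z.string(),
  sourceUrl: z.string().optional(),
});

// A public-registry style response that includes the field.
const full = Release.parse({
  version: '1.2.3',
  sourceUrl: 'https://example.com/repo',
});

// A self-hosted style response without the field still parses.
const minimal = Release.parse({ version: '1.2.3' });

console.log(full.sourceUrl); // 'https://example.com/repo'
console.log(minimal.sourceUrl); // undefined
```

Because the inferred type of `sourceUrl` is `string | undefined`, the compiler forces downstream code to handle the missing case.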

When and where to use Zod

You should use Zod to validate:

  • Data received from external APIs and data sources, particularly the lib/modules/datasource/* section of Renovate
  • Data parsed from files in the repository, particularly the lib/modules/manager/* section of Renovate

The cdnjs datasource is a good example of using Zod schema validations on API responses from external sources.

The composer manager is a good example of using Zod schema validation in a manager to validate parsed files in a repository.

Technical guide

Use schema.ts files for Zod schemas

Keep Zod schema code in a predictable place: schemas usually go in schema.ts files, and their tests go in schema.spec.ts files. You should write tests for Zod schemas.

Creating or extending Zod schemas on the fly reduces Renovate's performance. Only create or extend Zod schemas in this way if you really need to.
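The difference between the two approaches can be sketched as follows. The `Package` schema and both function names are hypothetical, and the example only shows the structure, not a measurement of the cost.

```typescript
import { z } from 'zod';

// Built once, when the module is loaded (for example in `schema.ts`).
const Package = z.object({
  name: z.string(),
  version: z.string(),
});
type Package = z.infer<typeof Package>;

// Preferred: every call reuses the same schema instance.
function parsePackage(input: unknown): Package {
  return Package.parse(input);
}

// Avoid when possible: a new schema object is constructed on every call.
function parsePackageOnTheFly(input: unknown): Package {
  return z.object({ name: z.string(), version: z.string() }).parse(input);
}

console.log(parsePackage({ name: 'zod', version: '3.0.0' }));
```

Both functions return the same data; the first simply avoids rebuilding the schema each time it runs.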

Name schemas without any Schema suffix

Schema names must start with a capital letter:

const ComplexNumber = z.object({
  re: z.number(),
  im: z.number(),
});

Do not add Schema to the end of the schema name. Avoid names like ComplexNumberSchema.

Inferred types

Create inferred types after schemas if they're needed somewhere in the code. Place such inferred types just after the schema definition using the same name.

IDEs may sometimes confuse the schema name with the type name, but the syntactic context makes it obvious which is which.

Example:

export const User = z.object({
  firstName: z.string(),
  lastName: z.string(),
});
export type User = z.infer<typeof User>;

Specify only necessary fields

The external data that Renovate queries can be very complex, but Renovate may only need some of those fields. Avoid over-specifying schemas: only extract the fields Renovate really needs. This reduces the surface of the contract between the external data source and Renovate, which means fewer errors to fix in the future.

For example, say you want Renovate to know about the width, height and length of a box. You should avoid code like this:

const Box = z.object({
  width: z.number(),
  height: z.number(),
  length: z.number(),
  color: z.object({
    red: z.number(),
    green: z.number(),
    blue: z.number(),
  }),
  weight: z.number(),
});

const { width, height, length } = Box.parse(input);
const volume = width * height * length;

The code above refers to the color and weight fields, which Renovate does not need to do its job. Here's the correct code:

const Box = z.object({
  width: z.number(),
  height: z.number(),
  length: z.number(),
});

const { width, height, length } = Box.parse(input);
const volume = width * height * length;
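The minimal schema keeps working when the input has extra fields, because `z.object()` ignores keys it does not declare unless it is told otherwise. A small sketch, using a made-up input object:

```typescript
import { z } from 'zod';

const Box = z.object({
  width: z.number(),
  height: z.number(),
  length: z.number(),
});

// Hypothetical input with extra fields the schema does not mention.
// Note that `weight` is not even a number here.
const input: unknown = {
  width: 2,
  height: 3,
  length: 4,
  color: { red: 255, green: 0, blue: 0 },
  weight: 'unknown',
};

const box = Box.parse(input);
const volume = box.width * box.height * box.length;

console.log(volume); // 24
console.log('color' in box); // false: undeclared keys are dropped
```

Had the schema declared `weight: z.number()`, this input would have been rejected even though the volume calculation never uses the weight.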

Use Json, Yaml and Toml for string parsing

You may need to perform extra steps like JSON.parse() before you can validate the data structure. Use the helpers in schema-utils.ts for this purpose.

The wrong way to parse from string:

const ApiResults = z.array(
  z.object({
    id: z.number(),
    value: z.string(),
  }),
);
type ApiResults = z.infer<typeof ApiResults>;

let results: ApiResults | null = null;
try {
  const json = JSON.parse(input);
  results = ApiResults.parse(json);
} catch (e) {
  results = null;
}

The correct way to parse from string:

const ApiResults = Json.pipe(
  z.array(
    z.object({
      id: z.number(),
      value: z.string(),
    }),
  ),
);

const results = ApiResults.parse(input);
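To show the idea behind such a helper, here is a simplified stand-in written with plain Zod. `JsonString` is a hypothetical name, and this is not Renovate's actual `Json` implementation from schema-utils.ts; it only assumes a Zod version that supports `.pipe()`.

```typescript
import { z } from 'zod';

// Hypothetical stand-in for a `Json`-style helper: it accepts a string and
// outputs the parsed value. Malformed JSON becomes a validation issue
// instead of a raw SyntaxError.
const JsonString = z.string().transform((str, ctx): unknown => {
  try {
    return JSON.parse(str);
  } catch (err) {
    ctx.addIssue({ code: 'custom', message: 'Invalid JSON' });
    return z.NEVER;
  }
});

// The parsed value is then piped into the structural schema.
const ApiResults = JsonString.pipe(
  z.array(
    z.object({
      id: z.number(),
      value: z.string(),
    }),
  ),
);

const good = ApiResults.safeParse('[{"id": 1, "value": "a"}]');
const notJson = ApiResults.safeParse('{oops');
const wrongShape = ApiResults.safeParse('{"id": 1}');

console.log(good.success); // true
console.log(notJson.success); // false
console.log(wrongShape.success); // false
```

String decoding and structure validation fail through the same channel, so callers need only one error path.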

Use .transform() method to process validated data

Schema validation lets you be more confident about types during downstream data transformation.

If the validated data contains everything you need to transform it, you can apply transformations as part of the schema itself.

This is an example of undesired data transformation:

const Box = z.object({
  width: z.number(),
  height: z.number(),
  length: z.number(),
});

const { width, height, length } = Box.parse(input);
const volume = width * height * length;

Instead, use the idiomatic .transform() method:

const BoxVolume = z
  .object({
    width: z.number(),
    height: z.number(),
    length: z.number(),
  })
  .transform(({ width, height, length }) => width * height * length);

const volume = BoxVolume.parse({
  width: 10,
  height: 20,
  length: 125,
}); // => 25000

Rename and move fields at the top level transform

When you need to rename or move object fields, place the code in the top-level transform.

The wrong way is to make cascading transformations:

const SourceUrl = z
  .object({
    meta: z
      .object({
        links: z.object({
          Github: z.string().url(),
        }),
      })
      .transform(({ links }) => links.Github),
  })
  .transform(({ meta: sourceUrl }) => sourceUrl);

The correct way is to rename at the top-level:

const SourceUrl = z
  .object({
    meta: z.object({
      links: z.object({
        Github: z.string().url(),
      }),
    }),
  })
  .transform(({ meta }) => meta.links.Github);

Stick to permissive behavior when possible

Zod schemas are strict: if a single field is wrong or missing, the whole dataset is considered malformed. Renovate would then abort processing, even in cases where we want it to continue.

Remember: we want to make sure the incoming data is good enough for Renovate to work. We do not need to validate that the data matches any official specification.
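The strict default can be seen in a small sketch. The `Release` schema and the input are hypothetical; the point is that one unusable field rejects an object whose other fields are fine.

```typescript
import { z } from 'zod';

const Release = z.object({
  version: z.string(),
  releaseTimestamp: z.string(),
});

// Hypothetical input: the version is usable, but the timestamp is null.
const input: unknown = { version: '1.0.0', releaseTimestamp: null };

const strict = Release.safeParse(input);
console.log(strict.success); // false: one bad field rejects the whole object

if (!strict.success) {
  // Only the timestamp is reported, yet the valid version is lost as well.
  console.log(strict.error.issues.map((issue) => issue.path.join('.')));
  // => [ 'releaseTimestamp' ]
}
```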

Here are some techniques to make Zod more permissive about the input data.

Use .catch() to force default values

const Box = z.object({
  width: z.number().catch(10),
  height: z.number().catch(10),
});

const box = Box.parse({ width: 20, height: null });
// => { width: 20, height: 10 }

Use LooseArray and LooseRecord to filter out incorrect values from collections

Suppose you want to parse a list of package releases, with elements that may (or may not) contain a version field. If the version field is missing, you want to filter out such elements. If you only use methods from the zod library, you would need to write something like this:

const Versions = z
  .array(
    z
      .object({
        version: z.string(),
      })
      .nullable()
      .catch(null),
  )
  .transform((releases) =>
    releases.filter((x): x is { version: string } => x !== null),
  );

When trying to achieve permissive behavior, this pattern will emerge quite frequently, but the filtering part of the code is not very readable.

Instead, you should use the LooseArray and LooseRecord helpers from schema-utils.ts to write simpler code:

const Versions = LooseArray(
  z.object({
    version: z.string(),
  }),
);
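To make the behavior concrete, here is a simplified sketch of what a loose array helper could look like. `looseArraySketch` is a hypothetical function written with plain Zod; Renovate's real `LooseArray` lives in schema-utils.ts and may differ in implementation and options.

```typescript
import { z } from 'zod';

// Hypothetical simplified helper: accept any array, validate each element
// separately, and keep only the elements that pass.
function looseArraySketch<T extends z.ZodTypeAny>(element: T) {
  return z.array(z.unknown()).transform((items) => {
    const valid: z.infer<T>[] = [];
    for (const item of items) {
      const result = element.safeParse(item);
      if (result.success) {
        valid.push(result.data);
      }
    }
    return valid;
  });
}

const Versions = looseArraySketch(
  z.object({
    version: z.string(),
  }),
);

const versions = Versions.parse([
  { version: '1.0.0' },
  { name: 'no version here' },
  null,
  { version: '2.0.0' },
]);

console.log(versions); // [ { version: '1.0.0' }, { version: '2.0.0' } ]
```

The same idea applies to records: validate each entry separately and drop the ones that fail, rather than rejecting the whole collection.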

Combining with Result class

The Result (and AsyncResult) class represents the result of an operation, like Result.ok(200) or Result.err(404).

It supports the .transform() method, which is similar to zod's.

It also supports .onResult() and .onError() methods for side-effectful result inspection.

After all result manipulations are done, you may call .unwrap(), .unwrapOrElse() or .unwrapOrThrow() methods to get the underlying result value.

You can wrap the schema parsing result into the Result class:

const { val, err } = await Result.parse(url, z.string().url())
  .transform((url) => http.get(url))
  .onError((err) => {
    logger.warn({ err }, 'Failed to fetch something important');
  })
  .transform((res) => res.body)
  .unwrap();

You can use schema parsing in the middle of the Result transform chain:

const UserConfig = z.object({
  /* ... */
});

const config = await Result.wrap(readLocalFile('config.json'))
  .transform((content) => Json.pipe(UserConfig).safeParse(content))
  .unwrapOrThrow();

Combining with Http class

The Http class supports schema validation for the JSON results of methods like .getJson(), .postJson(), etc. Under the hood, the .parseAsync() method is used. An important consequence is that it will throw if the data is invalid.

Provide schema in the last argument of the method:

const User = z.object({
  id: z.number(),
  firstName: z.string(),
  lastName: z.string(),
});

const { body: users } = await http.getJson(
  'https://dummyjson.com/users',
  LooseArray(User),
);

For GET requests, use the .getJsonSafe() method, which returns a Result instance:

const users = await http
  .getJsonSafe('https://dummyjson.com/users', LooseArray(User))
  .onError((err) => {
    logger.warn({ err }, 'Failed to fetch users');
  })
  .unwrapOrElse([]);