mirror of
https://github.com/renovatebot/renovate.git
synced 2024-12-22 21:48:32 +00:00
165 lines
7.4 KiB
Markdown
165 lines
7.4 KiB
Markdown
GraphQL can be used to efficiently retrieve all the tags and releases from GitHub.
|
|
The approach involves fetching items in reverse chronological order by the `created_at` field.
|
|
Items can be retrieved page by page until the first cached item is reached.
|
|
|
|
Although sorting by the `updated_at` field would be more precise, this option is not available in the API.
|
|
As a result, we also fetch relatively all items that are relatively _fresh_.
|
|
The freshness period is equal to the TTL of the entire cache, and this allows for updates or deletions to be reflected in the cache during that time.
|
|
|
|
In most cases, only one page of releases will need to be fetched during a Renovate run.
|
|
While it is possible to reduce this to zero for most runs, the practical implementation is complex and prone to errors.
|
|
|
|
# Components overview
|
|
|
|
```
|
|
lib/util/github/graphql
|
|
│
|
|
├── cache-strategies
|
|
│ ├── abstract-cache-strategy.ts <- common logic: `reconcile()` and `finalize()`
|
|
│ ├── memory-cache-strategy.ts <- single Renovate run (private packages)
|
|
│ └── package-cache-strategy.ts <- long-term persistence (public packages)
|
|
│
|
|
├── query-adapters
|
|
│ ├── releases-query-adapter.ts <- GitHub releases
|
|
│ └── tags-query-adapter.ts <- GitHub tags
|
|
│
|
|
├── datasource-fetcher.ts <- Complex pagination loop
|
|
│
|
|
└── index.ts <- Facade that hides whole thing
|
|
```
|
|
|
|
The datasource-fetcher.ts file contains the core component that implements the logic for looping over paginated GraphQL results.
|
|
This class is meant to be instantiated every time we need to paginate over GraphQL results.
|
|
It is responsible for handling several aspects of the fetch process, including:
|
|
|
|
- Making HTTP requests to the `/graphql` endpoint
|
|
- Handling and aggregating errors that may occur during the fetch process
|
|
- Dynamically adjusting the page size and retrying in the event of server errors
|
|
- Enforcing a maximum limit on the number of queries that can be made
|
|
- Detecting whether a package is private or public, and selecting the right cache strategy (in-memory or long-term) accordingly
|
|
- Ensuring proper concurrency when querying the same package simultaneously.
|
|
|
|
The `cache-strategies/` directory is responsible for caching implementation.
|
|
The core function is `reconcile()` which updates the cache data structure with pages of items one-by-one.
|
|
|
|
The files in `query-adapters/` directory allow for GitHub releases and tags to be fetched according to their specifics and to be transformed to the form suitable for caching.
|
|
For cached items, only `version` and `releaseTimestamp` fields are mandatory.
|
|
Other fields are specific to GitHub tags or GitHub releases.
|
|
|
|
# Process overview
|
|
|
|
## Initial fetch
|
|
|
|
Let's suppose we perform fetching for the first time.
|
|
For simplicity, this example assumes that we are retrieving items in small batches of 5 at a time.
|
|
The cache TTL is assumed to be 30 days.
|
|
|
|
```js
|
|
// Page 1
|
|
[
|
|
{ "version": "3.1.1", "releaseTimestamp": "2022-12-18" },
|
|
{ "version": "3.1.0", "releaseTimestamp": "2022-12-15" },
|
|
{ "version": "3.0.2", "releaseTimestamp": "2022-12-09" },
|
|
{ "version": "3.0.1", "releaseTimestamp": "2022-12-08" },
|
|
{ "version": "3.0.0", "releaseTimestamp": "2022-12-05" },
|
|
]
|
|
|
|
// Page 2
|
|
[
|
|
{ "version": "2.2.2", "releaseTimestamp": "2022-11-23" },
|
|
{ "version": "2.2.1", "releaseTimestamp": "2022-10-17" },
|
|
{ "version": "2.2.0", "releaseTimestamp": "2022-10-13" },
|
|
{ "version": "2.1.1", "releaseTimestamp": "2022-10-07" },
|
|
{ "version": "2.1.0", "releaseTimestamp": "2022-09-21" },
|
|
]
|
|
|
|
// Page 3
|
|
[
|
|
{ "version": "2.0.1", "releaseTimestamp": "2022-09-18" },
|
|
{ "version": "2.0.0", "releaseTimestamp": "2022-09-01" },
|
|
]
|
|
```
|
|
|
|
As we retrieve items during the fetch process, we gradually construct a data structure in the following form:
|
|
|
|
```js
|
|
{
|
|
"items": {
|
|
"3.1.1": { "version": "3.1.1", "releaseTimestamp": "2022-12-18" },
|
|
"3.1.0": { "version": "3.1.0", "releaseTimestamp": "2022-12-15" },
|
|
"3.0.2": { "version": "3.0.2", "releaseTimestamp": "2022-12-09" },
|
|
"3.0.1": { "version": "3.0.1", "releaseTimestamp": "2022-12-08" },
|
|
"3.0.0": { "version": "3.0.0", "releaseTimestamp": "2022-12-05" },
|
|
"2.2.2": { "version": "2.2.2", "releaseTimestamp": "2022-11-23" },
|
|
"2.2.1": { "version": "2.2.1", "releaseTimestamp": "2022-10-17" },
|
|
"2.2.0": { "version": "2.2.0", "releaseTimestamp": "2022-10-13" },
|
|
"2.1.1": { "version": "2.1.1", "releaseTimestamp": "2022-10-07" },
|
|
"2.1.0": { "version": "2.1.0", "releaseTimestamp": "2022-09-21" },
|
|
"2.0.1": { "version": "2.0.1", "releaseTimestamp": "2022-09-18" },
|
|
"2.0.0": { "version": "2.0.0", "releaseTimestamp": "2022-09-01" },
|
|
},
|
|
"createdAt": "2022-12-20",
|
|
}
|
|
```
|
|
|
|
Internally, we index each release by version name for quicker access.
|
|
When the fetch process is complete, we return the values of the items object.
|
|
If the repository is public, we also persist this data structure in a long-term cache for future use.
|
|
|
|
## Recurring fetches
|
|
|
|
In the case where we already have items stored in the cache, we can model the fetch process as follows.
|
|
Suppose we have a new release that changes the pagination of our items.
|
|
Also note that versions `3.0.1` and `3.0.2` are deleted since last fetch.
|
|
The resulting pagination would look like this:
|
|
|
|
```js
|
|
// Page 1 --- FETCHED AND RECONCILED ---
|
|
[
|
|
{ "version": "4.0.0", "releaseTimestamp": "2022-12-30" }, // new <- item cached
|
|
{ "version": "3.1.1", "releaseTimestamp": "2022-12-18" }, // fresh <- item updated
|
|
{ "version": "3.1.0", "releaseTimestamp": "2022-12-15" }, // fresh <- item updated
|
|
//{ "version": "3.0.2", "releaseTimestamp": "2022-12-09" }, // fresh <- item deleted
|
|
//{ "version": "3.0.1", "releaseTimestamp": "2022-12-08" }, // fresh <- item deleted
|
|
{ "version": "3.0.0", "releaseTimestamp": "2022-12-05" }, // fresh <- item updated
|
|
{ "version": "2.2.2", "releaseTimestamp": "2022-11-23" }, // old <- fetching stopped
|
|
]
|
|
|
|
// Page 2 --- NOT FETCHED ---
|
|
[
|
|
{ "version": "2.2.1", "releaseTimestamp": "2022-10-17" }, // old
|
|
{ "version": "2.2.0", "releaseTimestamp": "2022-10-13" }, // old
|
|
{ "version": "2.1.1", "releaseTimestamp": "2022-10-07" }, // old
|
|
{ "version": "2.1.0", "releaseTimestamp": "2022-09-21" }, // old
|
|
{ "version": "2.0.1", "releaseTimestamp": "2022-09-18" }, // old
|
|
]
|
|
|
|
// Page 3 --- NOT FETCHED ---
|
|
[
|
|
{ "version": "2.0.0", "releaseTimestamp": "2022-09-01" }, // old
|
|
]
|
|
```
|
|
|
|
Given we performed fetch at the day of latest release, new cache looks like:
|
|
|
|
```js
|
|
{
|
|
"items": {
|
|
"4.0.0": { "version": "4.0.0", "releaseTimestamp": "2022-12-30" },
|
|
"3.1.1": { "version": "3.1.1", "releaseTimestamp": "2022-12-18" },
|
|
"3.1.0": { "version": "3.1.0", "releaseTimestamp": "2022-12-15" },
|
|
"3.0.0": { "version": "3.0.0", "releaseTimestamp": "2022-12-05" },
|
|
"2.2.2": { "version": "2.2.2", "releaseTimestamp": "2022-11-23" },
|
|
"2.2.1": { "version": "2.2.1", "releaseTimestamp": "2022-10-17" },
|
|
"2.2.0": { "version": "2.2.0", "releaseTimestamp": "2022-10-13" },
|
|
"2.1.1": { "version": "2.1.1", "releaseTimestamp": "2022-10-07" },
|
|
"2.1.0": { "version": "2.1.0", "releaseTimestamp": "2022-09-21" },
|
|
"2.0.1": { "version": "2.0.1", "releaseTimestamp": "2022-09-18" },
|
|
"2.0.0": { "version": "2.0.0", "releaseTimestamp": "2022-09-01" },
|
|
},
|
|
"createdAt": "2022-12-20",
|
|
}
|
|
```
|
|
|
|
It will be updated by further fetches until cache reset at `2023-01-20`.
|