3 releases (stable)

new 2.0.0 Apr 8, 2024
1.0.0 Mar 18, 2024
0.0.0 Feb 28, 2024

GPL-3.0-only

2MB
42K SLoC

pallet-migrations

Provides multi-block migrations for FRAME runtimes.

Overview

The pallet takes care of executing a batch of multi-step migrations over multiple blocks. The process starts on each runtime upgrade. Normal and operational transactions are paused while migrations are ongoing.

Example

This example demonstrates a simple mocked walkthrough of a basic success scenario. The pallet is configured with two migrations: the first succeeds after just one step, and the second succeeds after two steps. A runtime upgrade is then enacted and the block number is advanced until all migrations finish executing. Afterwards, the recorded historic migrations are checked and events are asserted.
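
The walkthrough described above can be approximated without any FRAME dependencies. The following is a minimal sketch in plain Rust, not the pallet's actual mock: `ToyMigrator`, `walk_through` and the shape of the `Event` enum are hypothetical names chosen for illustration, and the sketch assumes exactly one migration step per block.

```rust
/// Events loosely modelled on the event names used in this document.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    UpgradeStarted,
    MigrationAdvanced { index: usize, step: u32 },
    MigrationCompleted { index: usize },
    UpgradeCompleted,
}

/// Hypothetical stand-in for the pallet: each migration needs a fixed number of steps.
struct ToyMigrator {
    steps_needed: Vec<u32>,
    /// (index of the active migration, steps already taken); `None` when idle.
    cursor: Option<(usize, u32)>,
    historic: Vec<usize>,
    events: Vec<Event>,
}

impl ToyMigrator {
    fn new(steps_needed: Vec<u32>) -> Self {
        Self { steps_needed, cursor: None, historic: Vec::new(), events: Vec::new() }
    }

    /// A runtime upgrade resets the cursor to the first migration.
    fn on_runtime_upgrade(&mut self) {
        if !self.steps_needed.is_empty() {
            self.cursor = Some((0, 0));
            self.events.push(Event::UpgradeStarted);
        }
    }

    fn ongoing(&self) -> bool {
        self.cursor.is_some()
    }

    /// One block: advance the active migration by one step (an assumption of this sketch).
    fn on_initialize(&mut self) {
        let Some((index, taken)) = self.cursor else { return };
        let taken = taken + 1;
        if taken < self.steps_needed[index] {
            self.events.push(Event::MigrationAdvanced { index, step: taken });
            self.cursor = Some((index, taken));
            return;
        }
        self.events.push(Event::MigrationCompleted { index });
        self.historic.push(index);
        if index + 1 < self.steps_needed.len() {
            self.cursor = Some((index + 1, 0));
        } else {
            self.cursor = None;
            self.events.push(Event::UpgradeCompleted);
        }
    }
}

/// Two migrations: one step, then two steps. Returns (historic, events, blocks used).
fn walk_through() -> (Vec<usize>, Vec<Event>, u32) {
    let mut m = ToyMigrator::new(vec![1, 2]);
    m.on_runtime_upgrade();
    let mut blocks = 0;
    while m.ongoing() {
        m.on_initialize();
        blocks += 1;
    }
    (m.historic, m.events, blocks)
}
```

Calling `walk_through()` runs the whole scenario; the returned historic list and event log are what a test would then assert on.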

Pallet API

See the pallet module for more information about the interfaces this pallet exposes, including its configuration trait, dispatchables, storage items, events and errors.

Another noteworthy part of this pallet's API is its implementation of the MultiStepMigrator trait. This must be plugged into frame_system::Config::MultiBlockMigrator for the pallet to function properly.

The API contains some calls for emergency management. They are all prefixed with force_ and should normally not be needed. Pay special attention prior to using them.

Design Goals

  1. Must automatically execute migrations over multiple blocks.
  2. Must expose information about whether migrations are ongoing.
  3. Must respect pessimistic weight bounds of migrations.
  4. Must execute migrations in order. Skipping is not allowed; migrations are run on an all-or-nothing basis.
  5. Must prevent re-execution of past migrations.
  6. Must provide transactional storage semantics for migrations.
  7. Must guarantee progress.

Design

Migrations are provided to the pallet through the associated type Config::Migrations of type SteppedMigrations. This allows multiple migrations to be aggregated through a tuple. It simplifies the trait bounds since all associated types of the trait must be provided by the pallet. The actual progress of the pallet is stored in the Cursor storage item. This can either be MigrationCursor::Active or MigrationCursor::Stuck. In the active case it points to the currently active migration and stores its inner cursor. The inner cursor can then be used by the migration to store its inner state and advance. Each time the migration returns Some(cursor), it signals to the pallet that it is not done yet.
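
The relationship between the pallet-level cursor and a migration's inner cursor can be sketched as a small state machine. This is a simplified model under stated assumptions, not the pallet's real types: the inner cursor is represented as raw bytes, and `StepResult`, `advance` and `count_to_three` are hypothetical names.

```rust
/// Simplified mirror of the two cursor states named above.
#[derive(Debug, Clone, PartialEq)]
enum MigrationCursor {
    /// Points at the active migration and carries that migration's inner cursor.
    Active { index: u32, inner_cursor: Option<Vec<u8>> },
    Stuck,
}

/// Hypothetical outcome of one step: `Ok(Some(_))` means "not done yet",
/// `Ok(None)` means "finished", `Err(())` means "failed".
type StepResult = Result<Option<Vec<u8>>, ()>;

/// Compute the pallet-level cursor after one step of the active migration.
/// `total` is the number of configured migrations; `None` means all are done.
fn advance(cursor: MigrationCursor, step: StepResult, total: u32) -> Option<MigrationCursor> {
    match (cursor, step) {
        // Once stuck, nothing moves without outside intervention.
        (MigrationCursor::Stuck, _) => Some(MigrationCursor::Stuck),
        // Any error moves the pallet into the stuck state.
        (MigrationCursor::Active { .. }, Err(())) => Some(MigrationCursor::Stuck),
        // The migration handed back an inner cursor: stay on the same migration.
        (MigrationCursor::Active { index, .. }, Ok(Some(inner))) => {
            Some(MigrationCursor::Active { index, inner_cursor: Some(inner) })
        }
        // The migration finished: move to the next one, or finish entirely.
        (MigrationCursor::Active { index, .. }, Ok(None)) => {
            if index + 1 < total {
                Some(MigrationCursor::Active { index: index + 1, inner_cursor: None })
            } else {
                None
            }
        }
    }
}

/// Hypothetical migration that needs three steps and keeps its progress
/// in a one-byte inner cursor.
fn count_to_three(inner: Option<Vec<u8>>) -> StepResult {
    let done = inner.map_or(0, |c| c[0]) + 1;
    if done < 3 { Ok(Some(vec![done])) } else { Ok(None) }
}
```

Feeding the result of `count_to_three` into `advance` three times in a row walks a single migration from a fresh cursor to completion.
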
The cursor is reset on each runtime upgrade. This ensures that it starts to execute at the first migration in the vector. The pallet's cursor is only ever incremented, or set to Stuck once it encounters an error (Goal 4). Once in the stuck state, the pallet will stay stuck until it is fixed through manual governance intervention.
As soon as the cursor of the pallet becomes Some(_), MultiStepMigrator::ongoing returns true (Goal 2). This can be used by upstream code to possibly pause transactions. In on_initialize the pallet will load the current migration and check whether it was already executed in the past by checking for membership of its ID in the Historic set. Historic migrations are skipped without causing an error. Each successfully executed migration is added to this set (Goal 5).
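
Both ideas in this paragraph, skipping historic migrations and gating transactions on the ongoing flag, fit in a few lines. The sketch below is illustrative only: `next_pending`, `call_allowed` and `CallClass` are hypothetical, migration IDs are plain strings, and letting mandatory calls through is an assumption of the sketch rather than something stated here.

```rust
use std::collections::BTreeSet;

/// Returns the index of the first migration at or after `start` whose ID is
/// not yet in `historic`. `None` means nothing is left to run.
fn next_pending(ids: &[&str], historic: &BTreeSet<String>, start: usize) -> Option<usize> {
    (start..ids.len()).find(|&i| !historic.contains(ids[i]))
}

/// Hypothetical dispatch classes, for illustration.
#[derive(Debug, PartialEq)]
enum CallClass {
    Normal,
    Operational,
    Mandatory,
}

/// Upstream code could consult the ongoing flag like this. Only mandatory
/// calls pass while migrations are ongoing (an assumption of this sketch).
fn call_allowed(ongoing: bool, class: CallClass) -> bool {
    !ongoing || class == CallClass::Mandatory
}
```

Because already-executed IDs are simply skipped, redeploying a runtime that still lists old migrations does not run them a second time.
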
This proceeds until no more migrations remain. At that point, the event UpgradeCompleted is emitted (Goal 1).
The execution of each migration happens by calling SteppedMigration::transactional_step. This function wraps the inner step function into a transactional layer to allow rollback in the error case (Goal 6).
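
The rollback behaviour can be illustrated with a toy storage map. This is only a sketch of the idea, assuming storage is an in-memory `BTreeMap` that is cheap to copy; the real transactional layer works on runtime storage and is not implemented this way. The `Storage` alias and the free function `transactional_step` below are hypothetical.

```rust
use std::collections::BTreeMap;

/// Toy storage: string keys mapped to integer values.
type Storage = BTreeMap<String, u64>;

/// Run `step` against a scratch copy of `storage` and commit the copy only
/// if the step succeeds. On error the original storage is left untouched.
fn transactional_step<F, T, E>(storage: &mut Storage, step: F) -> Result<T, E>
where
    F: FnOnce(&mut Storage) -> Result<T, E>,
{
    let mut scratch = storage.clone();
    let out = step(&mut scratch)?; // early return drops the scratch copy
    *storage = scratch;
    Ok(out)
}
```

A step that writes a value and then returns an error leaves no trace in `storage`, while a successful step has all of its writes applied at once.
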
Weight limits must be checked by the migration itself. The pallet provides a WeightMeter for that purpose. A migration may return SteppedMigrationError::InsufficientWeight at any point. In that scenario, one of two things will happen: if that migration was exclusively executed in this block, and therefore required more than the maximum amount of weight possible, the process becomes Stuck. Otherwise, one re-attempt is executed with the same logic in the next block (Goal 3).
Progress through the migrations is guaranteed by providing a timeout for each migration via SteppedMigration::max_steps. The pallet ONLY guarantees progress if this is set to sensible limits (Goal 7).
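
The weight and timeout rules above amount to a small decision table. The following sketch models them with a one-dimensional toy meter; `ToyMeter`, `StepError`, `Outcome`, `step` and `handle` are hypothetical names, and treating an exceeded step limit as Stuck is an assumption of this sketch.

```rust
/// Toy one-dimensional stand-in for a weight meter.
struct ToyMeter {
    limit: u64,
    consumed: u64,
}

impl ToyMeter {
    fn with_limit(limit: u64) -> Self {
        Self { limit, consumed: 0 }
    }

    /// Consume `w` if it still fits under the limit, otherwise change nothing.
    fn try_consume(&mut self, w: u64) -> Result<(), ()> {
        let new = self.consumed.checked_add(w).ok_or(())?;
        if new > self.limit {
            return Err(());
        }
        self.consumed = new;
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
enum StepError {
    InsufficientWeight { required: u64 },
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Progressed,
    RetryNextBlock,
    Stuck,
}

/// A migration step that checks its own weight before doing any work.
fn step(meter: &mut ToyMeter, required: u64) -> Result<(), StepError> {
    meter
        .try_consume(required)
        .map_err(|_| StepError::InsufficientWeight { required })
}

/// Decide what happens after a step. `first_in_block` is true when no other
/// migration work ran in this block, so the full block weight was available.
fn handle(
    result: Result<(), StepError>,
    first_in_block: bool,
    steps_taken: u32,
    max_steps: Option<u32>,
) -> Outcome {
    // Timeout: a migration that exceeds its step budget cannot keep going.
    if let Some(max) = max_steps {
        if steps_taken > max {
            return Outcome::Stuck;
        }
    }
    match result {
        Ok(()) => Outcome::Progressed,
        // Even a whole block was not enough, so retrying cannot help.
        Err(StepError::InsufficientWeight { .. }) if first_in_block => Outcome::Stuck,
        // Earlier work used up the block; try again with a fresh meter.
        Err(StepError::InsufficientWeight { .. }) => Outcome::RetryNextBlock,
    }
}
```

With `max_steps` set to `None` the timeout branch never fires, which is why progress is only guaranteed when a sensible limit is configured.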

Scenario: Governance cleanup

Every now and then, governance can make use of the clear_historic call. This ensures that no old migrations pile up in the Historic set. This can be done very rarely, since the storage should not grow quickly and the lookup weight does not suffer much. Another possibility would be to have a synchronous single-block migration perpetually deployed that cleans them up before the MBMs start.

Scenario: Successful upgrade

The standard procedure for a successful runtime upgrade can look like this:

  1. Migrations are configured in the Migrations config item. All migrations expose max_steps, are error tolerant, check their weight bounds and have a unique identifier.
  2. The runtime upgrade is enacted. An UpgradeStarted event is followed by lots of MigrationAdvanced and MigrationCompleted events. Finally UpgradeCompleted is emitted.
  3. Cleanup as described in the governance scenario can be executed at any time after the migrations have completed.

Advice: Failed upgrades

Failed upgrades cannot be recovered from automatically and require governance intervention. Set up monitoring for UpgradeFailed events to be made aware of any failures. The hook FailedMigrationHandler::failed should be set up in a way that allows governance to act, but still prevents other transactions from interacting with the inconsistent storage state. Note that this is paramount, since the inconsistent state might contain a faulty balance amount or similar that could cause great harm if user transactions don't remain suspended. One way to implement this would be to use the SafeMode or TxPause pallets that can prevent most user interactions but still allow a whitelisted set of governance calls.
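
The whitelist idea can be sketched as a simple call filter. This is a toy model and does not use the SafeMode or TxPause APIs: `Chain`, `Origin`, `Call`, `on_failed` and `call_allowed` are hypothetical, and the choice of which calls are whitelisted is an assumption made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Origin {
    Governance,
    User,
}

/// Hypothetical call set: one emergency call, one cleanup call, one user call.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Call {
    Force,
    ClearHistoric,
    Transfer,
}

struct Chain {
    safe_mode: bool,
    /// Index of each failed migration, if known, kept for monitoring.
    failures: Vec<Option<u32>>,
}

impl Chain {
    fn new() -> Self {
        Self { safe_mode: false, failures: Vec::new() }
    }

    /// Hypothetical failure hook: record the failure and suspend user calls.
    fn on_failed(&mut self, migration: Option<u32>) {
        self.failures.push(migration);
        self.safe_mode = true;
    }

    /// While in safe mode, only whitelisted governance calls are let through.
    fn call_allowed(&self, origin: Origin, call: Call) -> bool {
        if !self.safe_mode {
            return true;
        }
        origin == Origin::Governance && matches!(call, Call::Force | Call::ClearHistoric)
    }
}
```

The filter stays closed until governance has repaired the state, so user transactions cannot touch storage that a failed migration left half-converted.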

Remark: Failed migrations

Failed migrations are not added to the Historic set. This means that an erroneous migration must be removed and fixed manually. This would be the case even without the Historic set.

Remark: Transactional processing

You can see the transactional semantics for migration steps as mostly useless, since in the stuck case the state is already messed up. This just prevents it from becoming even more messed up, but doesn't prevent it in the first place.
