34 major breaking releases
37.0.0 | Sep 26, 2024 |
---|---|
36.0.0 | Jul 18, 2024 |
35.0.0 | Jul 12, 2024 |
34.0.0 | Jun 24, 2024 |
0.0.0 | Nov 21, 2022 |
Release: Polkadot SDK stable2409
lib.rs:
Multi phase, offchain election provider pallet.
Currently, this election-provider has two distinct phases (see Phase), signed and unsigned.
Phases
At each block,
frame_election_provider_support::ElectionDataProvider::next_election_prediction is used to
estimate the time remaining until the next call to
frame_election_provider_support::ElectionProvider::elect. Based on this, a phase is chosen.
The timeline is as follows.
                                                                    elect()
                 +   <--T::SignedPhase-->  +  <--T::UnsignedPhase-->   +
   +-------------------------------------------------------------------+
    Phase::Off   +       Phase::Signed     +      Phase::Unsigned      +
Note that the unsigned phase starts pallet::Config::UnsignedPhase blocks before the
next_election_prediction, but only ends when a call to ElectionProvider::elect happens. If
no elect happens, the signed phase is extended.
Given this, it is rather important for the user of this pallet to ensure it always terminates the
election via elect before requesting a new one.
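The phase selection described above can be sketched in plain Rust. This is a minimal, standalone model and not the pallet's actual code: the Phase enum is simplified (no inner data, no Emergency variant), and the function name next_phase and the use of u32 block counts are assumptions made for illustration.

```rust
/// Simplified stand-in for the pallet's `Phase` (assumption: no inner data,
/// no `Emergency` variant).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Off,
    Signed,
    Unsigned,
}

/// Pick the phase for the current block.
///
/// `remaining` is the estimated number of blocks until the next election.
/// `signed_len` and `unsigned_len` play the role of `T::SignedPhase` and
/// `T::UnsignedPhase`; a length of zero disables that phase.
fn next_phase(current: Phase, remaining: u32, signed_len: u32, unsigned_len: u32) -> Phase {
    let unsigned_deadline = unsigned_len;
    let signed_deadline = signed_len.saturating_add(unsigned_len);

    match current {
        // The signed window opens `signed_len + unsigned_len` blocks before the election.
        Phase::Off if remaining <= signed_deadline && remaining > unsigned_deadline => {
            Phase::Signed
        }
        // The unsigned window opens `unsigned_len` blocks before the election.
        Phase::Off | Phase::Signed if remaining <= unsigned_deadline && remaining > 0 => {
            Phase::Unsigned
        }
        // Otherwise the phase is left unchanged; in this model only a call to
        // `elect` would reset it to `Off`.
        other => other,
    }
}
```

In this model, setting signed_len to zero makes the first arm unreachable, and setting unsigned_len to zero makes the second arm unreachable, which matches the idea of disabling a phase by giving it zero length.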
Each phase can be disabled by setting its length to zero. If both phases have length zero, the
pallet runs only the fallback strategy, denoted by Config::Fallback.
Signed Phase
In the signed phase, solutions (of type RawSolution) are submitted and queued on chain. A
deposit is reserved, based on the size of the solution, for the cost of keeping this solution
on-chain for a number of blocks, and the potential weight of the solution upon being checked. A
maximum of pallet::Config::SignedMaxSubmissions solutions are stored. The queue is always
sorted based on score (worst to best).
Upon arrival of a new solution:
- If the queue is not full, it is stored in the appropriate sorted index.
- If the queue is full but the submitted solution is better than one of the queued ones, the worse solution is discarded, the bond of the outgoing solution is returned, and the new solution is stored in the correct index.
- If the queue is full and the solution is not an improvement compared to any of the queued ones, it is instantly rejected and no additional bond is reserved.
A signed solution cannot be reversed, taken back, updated, or retracted. In other words, the origin cannot bail out in any way once their solution is queued.
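The three arrival cases can be sketched as a bounded, sorted insertion. This is only an illustration under simplifying assumptions: the score is a single u64 rather than the pallet's real score type, and the names Submission, InsertOutcome and insert_submission are hypothetical, not the pallet's API.

```rust
/// Hypothetical, simplified submission: an account id, a score and the reserved deposit.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Submission {
    who: u32,
    score: u64,
    deposit: u64,
}

/// What happened to an incoming submission.
#[derive(Debug, PartialEq, Eq)]
enum InsertOutcome {
    /// Stored; the queue was not full.
    Inserted,
    /// Stored; the contained (worst) submission was ejected and its deposit
    /// should be returned.
    InsertedEjecting(Submission),
    /// Not stored; no deposit is reserved.
    Rejected,
}

/// Insert into a queue that is kept sorted from worst to best score and
/// holds at most `max` entries.
fn insert_submission(queue: &mut Vec<Submission>, new: Submission, max: usize) -> InsertOutcome {
    // Index of the first queued entry whose score is not lower than the new one.
    let at = queue.partition_point(|s| s.score < new.score);

    if queue.len() < max {
        queue.insert(at, new);
        return InsertOutcome::Inserted;
    }

    // The queue is full. `at == 0` means no queued score is strictly lower,
    // so the new solution improves on nothing and is rejected.
    if at == 0 {
        return InsertOutcome::Rejected;
    }

    // Eject the worst entry (index 0); every later index shifts down by one.
    let ejected = queue.remove(0);
    queue.insert(at - 1, new);
    InsertOutcome::InsertedEjecting(ejected)
}
```

Keeping the worst entry at index 0 means ejection always takes the first element, while pop() yields the best entry first, which is the order used at the end of the signed phase.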
Upon the end of the signed phase, the solutions are examined from best to worst (i.e. pop()ed
until drained). Each solution undergoes an expensive Pallet::feasibility_check, which ensures
the score claimed by this solution was correct, and that it is valid based on the election data
(i.e. votes and targets). At each step, if the current best solution passes the feasibility
check, it is considered to be the best one. The sender of that solution is rewarded, and the
rest of the queued solutions get their deposit back and are discarded, without being checked.
If it fails the check, its sender is slashed and the next solution is examined.
The following example covers all of the cases at the end of the signed phase:
Queue
+-------------------------------+
|Solution(score=20, valid=false)| +--> Slashed
+-------------------------------+
|Solution(score=15, valid=true )| +--> Rewarded, Saved
+-------------------------------+
|Solution(score=10, valid=true )| +--> Discarded
+-------------------------------+
|Solution(score=05, valid=false)| +--> Discarded
+-------------------------------+
| None |
+-------------------------------+
Note that both of the bottom solutions end up being discarded and get their deposit back, despite one of them being invalid.
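The example above can be reproduced with a small standalone sketch. It assumes a simplified model: valid stands in for the result of the feasibility check, and the names Queued, Outcome and finalize_signed_phase are hypothetical.

```rust
/// Fate of a queued solution at the end of the signed phase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Slashed,
    Rewarded,
    Discarded,
}

/// Hypothetical queued solution; `valid` stands in for the feasibility check result.
#[derive(Debug, Clone, Copy)]
struct Queued {
    score: u64,
    valid: bool,
}

/// Drain a queue sorted from worst to best, examining the best solution first.
/// Returns `(score, outcome)` pairs in the order the solutions were examined.
fn finalize_signed_phase(mut queue: Vec<Queued>) -> Vec<(u64, Outcome)> {
    let mut outcomes = Vec::with_capacity(queue.len());
    let mut winner_found = false;

    while let Some(solution) = queue.pop() {
        let outcome = if winner_found {
            // Not checked at all: deposit returned, solution dropped.
            Outcome::Discarded
        } else if solution.valid {
            winner_found = true;
            Outcome::Rewarded
        } else {
            Outcome::Slashed
        };
        outcomes.push((solution.score, outcome));
    }
    outcomes
}
```

Note that valid is never read once a winner has been found, which mirrors the fact that the remaining solutions are discarded without being checked.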
Unsigned Phase
The unsigned phase will always follow the signed phase, with the specified duration. In this phase, only validator nodes can submit solutions. A validator node that has offchain workers enabled will start to mine a solution in this phase and submit it back to the chain as an unsigned transaction, hence the name unsigned phase. This unsigned transaction can never be valid if propagated, and it acts similarly to an inherent.
Validators will only submit solutions if the one that they have computed is strictly better than
the best queued one and will limit the weight of the solution to MinerConfig::MaxWeight.
The unsigned phase can be made passive depending on how the previous signed phase went, by
setting the first inner value of Phase to false. For now, the unsigned phase is always
active.
Fallback
If we reach the end of both phases (i.e. call to ElectionProvider::elect happens) and no
good solution is queued, then the fallback strategy pallet::Config::Fallback is used to
determine what needs to be done. The on-chain election is slow, and contains no balancing or
reduction post-processing. If pallet::Config::Fallback fails, the next phase
Phase::Emergency is enabled, which is a more fail-safe approach.
Emergency Phase
If, for any of the reasons below:
- No signed or unsigned solution is submitted, and no successful Config::Fallback is provided
- Any other unforeseen internal error
a call to T::ElectionProvider::elect is made and Ok(_) cannot be returned, then the pallet
proceeds to Phase::Emergency. During this phase, any solution can be submitted from
Config::ForceOrigin, without any checking, via the Pallet::set_emergency_election_result
transaction. Hence, Config::ForceOrigin should only be set to a trusted origin, such as
the council or root. Once submitted, the forced solution is kept in QueuedSolution until the
next call to T::ElectionProvider::elect, where it is returned and Phase goes back to Off.
This implies that the user of this pallet (i.e. a staking pallet) should re-try calling
T::ElectionProvider::elect in case of error, until Ok(_) is returned.
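The interplay between elect, the fallback and the emergency phase can be modelled roughly as below. This is a sketch under stated assumptions, not the pallet's implementation: Provider, its fields, the Supports alias and no_fallback are hypothetical, the origin check for Config::ForceOrigin is omitted, and the sketch assumes an emergency result is only accepted while in the emergency phase.

```rust
/// Simplified phase: only the two states relevant to this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Off,
    Emergency,
}

/// Hypothetical stand-in for the real supports type: (winner, total backing stake).
type Supports = Vec<(u32, u64)>;

/// Hypothetical model of the election provider's state.
struct Provider {
    phase: Phase,
    queued: Option<Supports>,
    fallback: fn() -> Result<Supports, &'static str>,
}

impl Provider {
    /// Return the queued solution if there is one, else try the fallback.
    /// On failure, move to the emergency phase and report the error.
    fn elect(&mut self) -> Result<Supports, &'static str> {
        let result = match self.queued.take() {
            Some(supports) => Ok(supports),
            None => (self.fallback)(),
        };
        self.phase = if result.is_ok() { Phase::Off } else { Phase::Emergency };
        result
    }

    /// Store a forced solution without any checking.
    /// Assumption in this sketch: only allowed during the emergency phase.
    fn set_emergency_election_result(&mut self, supports: Supports) -> Result<(), &'static str> {
        if self.phase != Phase::Emergency {
            return Err("not in emergency phase");
        }
        self.queued = Some(supports);
        Ok(())
    }
}

/// A fallback that always fails, used to drive the model into the emergency phase.
fn no_fallback() -> Result<Supports, &'static str> {
    Err("fallback failed")
}
```

In this model a caller keeps calling elect until it returns Ok(_): the first call fails and enters the emergency phase, a forced solution is then set, and the next call returns it and resets the phase to Off.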
To generate an emergency solution, one only needs to provide one argument: Supports. This is
essentially a collection of elected winners for the election, and voters who support them. The
supports can be generated by any means. In the simplest case, it could be manual. For example,
in the case of massive network failure or misbehavior, Config::ForceOrigin might decide to
select only a small number of emergency winners (which would greatly restrict the next validator
set, if this pallet is used with pallet-staking). If the failure is for other technical
reasons, then a simple and safe way to generate supports is using the staking-miner binary
provided in the Polkadot repository. This binary has a subcommand named emergency-solution
which is capable of connecting to a live network, generating appropriate supports using a
standard algorithm, and outputting the supports in hex format, ready for submission. Note that
while this binary lives in the Polkadot repository, this particular subcommand of it can work
against any substrate-based chain.
See the staking-miner docs for more information.
Feasible Solution (correct solution)
All submissions must undergo a feasibility check. Signed solutions are checked one by one at the end of the signed phase, and the unsigned solutions are checked on the spot. A feasible solution is as follows:
- all of the used indices must be correct.
- it must present exactly the correct number of winners.
- any assignment is checked to match with RoundSnapshot::voters.
- the claimed score is valid, based on the fixed point arithmetic accuracy.
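The four conditions can be illustrated with a heavily simplified check. Everything here is an assumption made for illustration: each voter backs a single target with their full stake, the score is reduced to the minimum support among winners, and the types Voter, Snapshot, Solution and CheckError are hypothetical rather than the pallet's own.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot entry: a voter's stake and the targets they nominate.
struct Voter {
    stake: u64,
    nominations: Vec<u32>,
}

/// Hypothetical snapshot of the election data.
struct Snapshot {
    voters: Vec<Voter>,
    targets: Vec<u32>,
}

/// Hypothetical index-based solution: `(voter index, target index)` pairs
/// plus the score claimed by the submitter.
struct Solution {
    assignments: Vec<(usize, usize)>,
    claimed_min_support: u64,
}

/// Illustrative error variants, not the pallet's real `FeasibilityError`.
#[derive(Debug, PartialEq, Eq)]
enum CheckError {
    BadVoterIndex,
    BadTargetIndex,
    InvalidVote,
    WrongWinnerCount,
    InvalidScore,
}

fn feasibility_check(
    solution: &Solution,
    snapshot: &Snapshot,
    desired_targets: usize,
) -> Result<BTreeMap<u32, u64>, CheckError> {
    let mut supports: BTreeMap<u32, u64> = BTreeMap::new();

    for &(voter_index, target_index) in &solution.assignments {
        // 1. All used indices must resolve against the snapshot.
        let voter = snapshot.voters.get(voter_index).ok_or(CheckError::BadVoterIndex)?;
        let target = *snapshot.targets.get(target_index).ok_or(CheckError::BadTargetIndex)?;
        // 3. The assignment must match what the voter actually nominated.
        if !voter.nominations.contains(&target) {
            return Err(CheckError::InvalidVote);
        }
        let support = supports.entry(target).or_insert(0);
        *support = support.saturating_add(voter.stake);
    }

    // 2. Exactly the desired number of winners.
    if supports.len() != desired_targets {
        return Err(CheckError::WrongWinnerCount);
    }

    // 4. Recompute the score and compare it with the claimed one.
    let min_support = supports.values().copied().min().unwrap_or(0);
    if min_support != solution.claimed_min_support {
        return Err(CheckError::InvalidScore);
    }
    Ok(supports)
}
```

The key design point carried over from the text is that the score is recomputed from the snapshot rather than trusted, so a submitter cannot claim a better score than their assignments actually produce.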
Accuracy
The accuracy of the election is configured via SolutionAccuracyOf, which is the accuracy that
the submitted solutions must adhere to.
Note that the accuracy is of great importance. The offchain solution should be as small as possible, reducing the solution's size/weight.
Error types
This pallet provides a verbose error system to ease future debugging. The overall hierarchy of errors is as follows:
- pallet::Error: These are the errors that can be returned in the dispatchables of the pallet, either signed or unsigned. Since decomposition with nested enums is not possible here, they are prefixed with the logical sub-system to which they belong.
- ElectionError: These are the errors that can be generated while the pallet is doing something in automatic scenarios, such as offchain_worker or on_initialize. These errors are helpful for logging and are thus nested as:
  - ElectionError::Miner: wraps an unsigned::MinerError.
  - ElectionError::Feasibility: wraps a FeasibilityError.
  - ElectionError::Fallback: wraps a fallback error.
  - ElectionError::DataProvider: wraps a static str.
Note that there could be an overlap between these sub-errors. For example, a
SnapshotUnavailable can happen in both the miner and the feasibility check phases.
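The nesting can be sketched with plain enums and From conversions. This is an illustrative model only: the variant lists are cut down, the fallback error is represented by a static str as a placeholder, and the two step functions are hypothetical.

```rust
/// Cut-down stand-in for the feasibility error type.
#[derive(Debug, PartialEq, Eq)]
enum FeasibilityError {
    SnapshotUnavailable,
    WrongWinnerCount,
}

/// Cut-down stand-in for the miner error type.
#[derive(Debug, PartialEq, Eq)]
enum MinerError {
    SnapshotUnavailable,
}

/// Top-level error for automatic (non-dispatchable) code paths.
#[derive(Debug, PartialEq, Eq)]
enum ElectionError {
    Miner(MinerError),
    Feasibility(FeasibilityError),
    /// Placeholder: a static str stands in for the fallback's own error type.
    Fallback(&'static str),
    DataProvider(&'static str),
}

impl From<MinerError> for ElectionError {
    fn from(e: MinerError) -> Self {
        ElectionError::Miner(e)
    }
}

impl From<FeasibilityError> for ElectionError {
    fn from(e: FeasibilityError) -> Self {
        ElectionError::Feasibility(e)
    }
}

/// Hypothetical step that fails inside the feasibility check.
fn run_feasibility_step() -> Result<(), ElectionError> {
    let checked: Result<(), FeasibilityError> = Err(FeasibilityError::SnapshotUnavailable);
    checked?; // `?` converts via `From<FeasibilityError>`
    Ok(())
}

/// Hypothetical step that fails inside the miner.
fn run_miner_step() -> Result<(), ElectionError> {
    let mined: Result<(), MinerError> = Err(MinerError::SnapshotUnavailable);
    mined?; // `?` converts via `From<MinerError>`
    Ok(())
}
```

Because the wrapper keeps the inner error, a log line can still tell the two SnapshotUnavailable cases apart by their outer variant.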
Future Plans
Emergency-phase recovery script: This script should be taken out of staking-miner in
polkadot and ideally live in substrate/utils/frame/elections.
Challenge Phase. We plan on adding a third phase to the pallet, called the challenge phase. This is a phase in which no further solutions are processed, and the current best solution might be challenged by anyone (signed or unsigned). The main plan here is to enforce the solution to be PJR. Checking PJR on-chain is quite expensive, yet proving that a solution is not PJR is rather cheap. If a queued solution is successfully proven bad:
- We must surely slash whoever submitted that solution (might be a challenge for unsigned solutions).
- We will fall back to the emergency strategy (likely extending the current era).
Bailing out. The ability to bail out of a queued solution would be nice. A miner could submit a solution as soon as they think it is feasible with high probability, do the checks afterwards, and remove their solution if needed (for a small cost of probably just transaction fees, or a portion of the bond).
Conditionally open unsigned phase: Currently, the unsigned phase is always opened. This is useful because an honest validator will run substrate OCW code, which should be good enough to trump a mediocre or malicious signed submission (assuming the absence of honest signed bots). If signed submissions can be checked against an absolute measure (e.g. PJR), then we can open the unsigned phase only in extreme conditions (i.e. "no good signed solution received") to spare some work for the active validators.
Allow smaller solutions and build up: For now we only allow solutions that are exactly
DesiredTargets, no more, no less. Over time, we can change this to a [min, max] where any
solution within this range is acceptable, where bigger solutions are prioritized.
Score based on (byte) size: We should always prioritize small solutions over bigger ones, if
there is a tie. An even harsher measure would be to enforce the bound of the reduce algorithm.
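As a rough illustration of this planned tie-break (it is not something the pallet does today), a comparison could order candidates by score first and, on equal scores, prefer the smaller encoded size. The Candidate type, its fields and the prefer function are hypothetical, and the score is simplified to a single u64.

```rust
use std::cmp::Ordering;

/// Hypothetical candidate: a simplified score and the encoded size in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Candidate {
    score: u64,
    encoded_len: usize,
}

/// Compare two candidates; `Ordering::Greater` means `a` is preferred over `b`.
/// A higher score wins; on a tie, the smaller encoding wins.
fn prefer(a: &Candidate, b: &Candidate) -> Ordering {
    a.score
        .cmp(&b.score)
        // Reversed operands: a smaller `encoded_len` compares as greater.
        .then_with(|| b.encoded_len.cmp(&a.encoded_len))
}
```

Sorting a list with this comparator keeps the worst-to-best order used by the signed queue, with the smaller of two equally scored solutions placed closer to the best end.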
Take into account the encode/decode weight in benchmarks. Currently, we only take into
account the weight of encode/decode in submit_unsigned, given its priority. Nonetheless,
all operations on the solution and the snapshot are worthy of taking this into account.
All of the tests here should be dedicated to only testing the feasibility check and nothing
more. The best way to audit and review these tests is to try and come up with a solution
that is invalid, but gets through the system as valid.