47 stable releases

2.1.1 Nov 16, 2024
2.0.16 Nov 16, 2024
2.0.3 Jul 20, 2024
1.18.26 Oct 11, 2024
0.1.0 Jan 30, 2024

#814 in Magic Beans

454 downloads per month

Apache-2.0

1MB
22K SLoC

The agave-watchtower program is used to monitor the health of a cluster. It periodically polls the cluster over an RPC API to confirm that the transaction count is advancing, new blockhashes are available, and no validators are delinquent. Results are reported as InfluxDB metrics, with an optional push notification on sanity failure.
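The three checks described above can be sketched as a comparison between two consecutive polls. This is only an illustration of the idea, not the program's actual code: the `ClusterSnapshot` and `SanityFailure` types, the `check_sanity` function, and the test names are all hypothetical, and the RPC calls that would fill in a snapshot are left out.

```rust
/// Hypothetical snapshot of the cluster state gathered by one RPC poll.
#[derive(Debug, Clone, PartialEq)]
pub struct ClusterSnapshot {
    pub transaction_count: u64,
    pub latest_blockhash: String,
    pub delinquent_validators: Vec<String>,
}

/// A failed sanity test: which test failed and the failure message.
#[derive(Debug, Clone, PartialEq)]
pub struct SanityFailure {
    pub test: &'static str, // test names below are illustrative assumptions
    pub err: String,
}

/// Compare the previous poll with the current one and report the first failure.
pub fn check_sanity(
    prev: &ClusterSnapshot,
    curr: &ClusterSnapshot,
) -> Result<(), SanityFailure> {
    // 1. The transaction count must be advancing.
    if curr.transaction_count <= prev.transaction_count {
        return Err(SanityFailure {
            test: "transaction-count",
            err: format!(
                "transaction count is not advancing: {} <= {}",
                curr.transaction_count, prev.transaction_count
            ),
        });
    }
    // 2. A new blockhash must be available.
    if curr.latest_blockhash == prev.latest_blockhash {
        return Err(SanityFailure {
            test: "blockhash",
            err: format!("no new blockhash since {}", prev.latest_blockhash),
        });
    }
    // 3. No validator may be delinquent.
    if !curr.delinquent_validators.is_empty() {
        return Err(SanityFailure {
            test: "delinquency",
            err: format!(
                "{} delinquent validator(s): {}",
                curr.delinquent_validators.len(),
                curr.delinquent_validators.join(", ")
            ),
        });
    }
    Ok(())
}
```

A monitoring loop would call such a function once per polling interval, keep the current snapshot as the next iteration's `prev`, and report the result as a metric.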

If you only care about the health of one specific validator, the --validator-identity command-line argument can be used to restrict failure notifications to issues affecting only that validator.
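One way to picture this restriction is as a filter over the list of delinquent validators. The sketch below assumes identities are compared as plain strings; the `relevant_delinquents` function is hypothetical and is not taken from the program's source.

```rust
/// Keep only the delinquent validators the operator cares about.
/// With no identity given, every delinquent validator is relevant.
/// (Hypothetical helper; identities are treated as plain strings.)
pub fn relevant_delinquents<'a>(
    delinquent: &'a [String],
    validator_identity: Option<&str>,
) -> Vec<&'a str> {
    delinquent
        .iter()
        .map(String::as_str)
        .filter(|id| validator_identity.map_or(true, |wanted| *id == wanted))
        .collect()
}
```

If the filtered list is empty, no delinquency notification would be sent, even though other validators in the cluster may be delinquent.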

If you do not want duplicate notifications, for example because you have elected to receive notifications by SMS, the --no-duplicate-notifications command-line argument will suppress identical failure notifications.
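Suppressing identical notifications amounts to remembering the last message sent. The following is a minimal sketch under that assumption; the `DedupNotifier` type is hypothetical, and resetting after a recovery is a design choice made here for illustration, not documented behavior.

```rust
/// Remembers the last notification so identical repeats can be skipped.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Default)]
pub struct DedupNotifier {
    last_sent: Option<String>,
}

impl DedupNotifier {
    /// Returns true if `msg` differs from the last message sent,
    /// and records it as the new last message.
    pub fn should_send(&mut self, msg: &str) -> bool {
        if self.last_sent.as_deref() == Some(msg) {
            return false; // identical to the previous notification
        }
        self.last_sent = Some(msg.to_string());
        true
    }

    /// Assumption: forget the last message once the cluster recovers,
    /// so that a recurrence of the same failure notifies again.
    pub fn reset(&mut self) {
        self.last_sent = None;
    }
}
```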

Metrics

watchtower-sanity

This data point is emitted on every iteration. Its boolean ok field indicates the overall result.

watchtower-sanity-failure

On failure, this data point describes the specific test that failed using the following fields:

  • test: name of the sanity test that failed
  • err: exact sanity failure message
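To give a feel for the shape of these two data points, the sketch below renders them in InfluxDB line protocol. It is an assumption that the points reach InfluxDB in this form; the program's metrics library handles the actual submission and would normally add tags and a timestamp. The function names are hypothetical.

```rust
/// Escape a string field value for InfluxDB line protocol:
/// backslashes and double quotes are backslash-escaped.
fn escape_string_field(value: &str) -> String {
    value.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render the per-iteration point with its boolean `ok` field.
pub fn sanity_point(ok: bool) -> String {
    format!("watchtower-sanity ok={}", ok)
}

/// Render the failure point with its `test` and `err` string fields.
pub fn sanity_failure_point(test: &str, err: &str) -> String {
    format!(
        "watchtower-sanity-failure test=\"{}\",err=\"{}\"",
        escape_string_field(test),
        escape_string_field(err)
    )
}
```

A dashboard or alert rule could then key on ok=false in watchtower-sanity and look up the matching watchtower-sanity-failure point for details.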

Dependencies

~39–56MB
~1M SLoC