Vault
Availability telemetry
[!IMPORTANT]
Documentation Update: Product documentation, which was located in this repository under /website, is now located in hashicorp/web-unified-docs, colocated with all other product documentation. Contributions to this content should be made in the web-unified-docs repo, and not this one. Changes made to /website content in this repo will not be reflected on the developer.hashicorp.com website.
Availability telemetry provides information about standby and active nodes in your Vault instance. Enterprise installations also include replication metrics.
Default metrics
vault.ha.rpc.client.echo
Metric type | Value | Description |
---|---|---|
summary | ms | Time taken to send an echo request from a standby to the active node (also emitted by perf standbys) |
vault.ha.rpc.client.echo.errors
Metric type | Value | Description |
---|---|---|
counter | number | Number of standby echo request failures (also emitted by perf standbys) |
vault.ha.rpc.client.forward
Metric type | Value | Description |
---|---|---|
summary | ms | Time taken to forward a request from a standby to the active node |
vault.ha.rpc.client.forward.errors
Metric type | Value | Description |
---|---|---|
counter | number | Number of standby request forwarding failures |
Merkle tree metrics
vault.merkle.flushDirty
Metric type | Value | Description |
---|---|---|
summary | ms | The average time required to flush dirty pages to storage |
vault.merkle.flushDirty.num_pages
Metric type | Value | Description |
---|---|---|
gauge | pages | Number of pages flushed |
vault.merkle.flushDirty.outstanding_pages
Metric type | Value | Description |
---|---|---|
gauge | pages | Number of dirty pages waiting to be flushed |
vault.merkle.saveCheckpoint
Metric type | Value | Description |
---|---|---|
summary | ms | The average time required to save a checkpoint |
vault.merkle.saveCheckpoint.num_dirty
Metric type | Value | Description |
---|---|---|
gauge | pages | Number of dirty pages at checkpoint |
Write-ahead log (WAL) telemetry
vault.wal.deleteWALs
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to fully delete a write-ahead log |
vault.wal.flushReady
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to fully flush a write-ahead log that is ready for storage |
vault.wal.flushReady.queue_len
Metric type | Value | Description |
---|---|---|
summary | number | Current size of the write queue in the WAL system |
vault.wal.gc.deleted
Metric type | Value | Description |
---|---|---|
gauge | number | Number of write-ahead logs deleted during garbage collection |
vault.wal.gc.total
Metric type | Value | Description |
---|---|---|
gauge | number | Total number of write-ahead logs currently on disk |
vault.wal.loadWAL
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to load a write-ahead log |
vault.wal.persistWALs
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to persist a write-ahead log |
vault.wal.write_controller.d
Metric type | Value | Description |
---|---|---|
gauge | number | Current derivative value computed by the write controller. |
The `vault.wal.write_controller.d` metric has limited production use, but Vault developers may find `vault.wal.write_controller.d` useful for tuning or debugging controller behavior.
vault.wal.write_controller.i
Metric type | Value | Description |
---|---|---|
gauge | number | Current integral value computed by the write controller. |
The `vault.wal.write_controller.i` metric has limited production use, but Vault developers may find `vault.wal.write_controller.i` useful for tuning or debugging controller behavior.
vault.wal.write_controller.p
Metric type | Value | Description |
---|---|---|
gauge | number | Current proportional error value detected by the write controller. |
The `vault.wal.write_controller.p` metric has limited production use, but Vault developers may find `vault.wal.write_controller.p` useful for tuning or debugging controller behavior.
vault.wal.write_controller.reject_fraction
Metric type | Value | Description |
---|---|---|
gauge | number | The estimated fraction of write requests that must be rejected to maintain cluster stability. |
The write controller reject fraction is an estimate between 0 and 1.
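Because the reject fraction is an estimate between 0 and 1, it can be read as a share of incoming writes. The sketch below is a minimal illustration of that arithmetic; the function name `expected_rejections` and the sample numbers are hypothetical and not part of Vault.

```python
def expected_rejections(reject_fraction: float, write_requests: int) -> int:
    """Estimate how many of `write_requests` would be rejected.

    `reject_fraction` is assumed to be a value read from the
    vault.wal.write_controller.reject_fraction gauge.
    """
    if not 0.0 <= reject_fraction <= 1.0:
        raise ValueError("reject_fraction must be between 0 and 1")
    if write_requests < 0:
        raise ValueError("write_requests must not be negative")
    return round(reject_fraction * write_requests)


# Hypothetical reading: a fraction of 0.25 over 200 incoming writes.
print(expected_rejections(0.25, 200))
```

A value of 0 means the controller estimates that no writes need to be rejected, and a value of 1 means it estimates that all of them do.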
Log shipping metrics
vault.logshipper.buffer.length
Metric type | Value | Description |
---|---|---|
gauge | buffer entries | Current length of the log shipper buffer |
vault.logshipper.buffer.max_length
Metric type | Value | Description |
---|---|---|
gauge | buffer entries | Maximum length of the log shipper buffer seen to date |
vault.logshipper.buffer.max_size
Metric type | Value | Description |
---|---|---|
gauge | bytes | Maximum allowable size of the log shipper buffer |
vault.logshipper.buffer.size
Metric type | Value | Description |
---|---|---|
gauge | bytes | Current size of the log shipper buffer |
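The two byte-valued gauges can be combined into a single utilization figure, which may be easier to alert on than raw sizes. The following is a minimal sketch assuming you have already collected both gauge values; the helper name `buffer_utilization` and the sample values are hypothetical.

```python
def buffer_utilization(size_bytes: float, max_size_bytes: float) -> float:
    """Return the fraction of the log shipper buffer currently in use.

    size_bytes:     value of vault.logshipper.buffer.size
    max_size_bytes: value of vault.logshipper.buffer.max_size
    """
    if max_size_bytes <= 0:
        raise ValueError("max_size_bytes must be positive")
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    return size_bytes / max_size_bytes


# Hypothetical readings: 256 bytes used out of a 1024-byte maximum.
print(f"{buffer_utilization(256, 1024):.0%}")
```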
vault.logshipper.streamWALs.guard_found
Metric type | Value | Description |
---|---|---|
counter | number | Number of times Vault began streaming WAL entries and found a starting index in the Merkle tree |
vault.logshipper.streamWALs.missing_guard
Metric type | Value | Description |
---|---|---|
counter | number | Number of times Vault began streaming WAL entries without finding a starting index in the Merkle tree |
vault.logshipper.streamWALs.scanned_entries
Metric type | Value | Description |
---|---|---|
summary | entries | Number of entries scanned in the buffer before Vault found the correct entry |
Replication metrics (Enterprise)
Note
The following metrics only appear in telemetry results when replication is in an unhealthy state:
vault.replication.fetchRemoteKeys
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to fetch keys from a remote cluster participating in replication before Merkle tree delta generation occurs |
vault.replication.fsm.last_remote_wal
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last remote write-ahead log. |
Note
Standby nodes do not emit `last_remote_wal` details.
vault.replication.fsm.last_upstream_remote_wal
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last remote WAL segment received from the upstream cluster by the local cluster leader. |
vault.replication.merkle.commit_index
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last commit to the Merkle tree |
vault.replication.merkleDiff
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to perform a Merkle tree delta comparison among the clusters participating in replication |
vault.replication.merkleSync
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to perform a Merkle tree synchronization with the most recent delta generated by the clusters participating in replication |
vault.replication.rpc.client.conflicting_pages
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a conflicting pages request for the client |
vault.replication.rpc.client.create_token_register_auth_lease
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a register authentication token request for the client |
vault.replication.rpc.client.fetch_keys
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a fetch keys request for the client |
vault.replication.rpc.client.forward
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a forward request for the client |
vault.replication.rpc.client.guard_hash
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a guard hash request for the client |
vault.replication.rpc.client.persist_alias
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to persist an alias for the client |
vault.replication.rpc.client.register_auth
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a register authentication request for the client |
vault.replication.rpc.client.register_lease
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to register a lease for the client |
vault.replication.rpc.client.save_mfa_response_auth
Metric type | Value | Description |
---|---|---|
summary | ms | Time required by the client to save the MFA authentication response |
vault.replication.rpc.client.stream_wals
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to stream write-ahead logs for the client |
vault.replication.rpc.client.sub_page_hashes
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a sub-page hash request for the client |
vault.replication.rpc.client.sync_counter
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a counter sync request for the client |
vault.replication.rpc.client.upsert_group
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a group upsert request for the client |
vault.replication.rpc.client.wrap_in_cubbyhole
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a cubbyhole wrap request for the client |
vault.replication.rpc.dr.server.echo
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete an echo request for disaster recovery |
vault.replication.rpc.dr.server.fetch_keys_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a fetch keys request for disaster recovery |
vault.replication.rpc.server.auth_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete an authentication request |
vault.replication.rpc.server.bootstrap_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a bootstrap request |
vault.replication.rpc.server.conflicting_pages_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a conflicting pages request |
vault.replication.rpc.server.echo
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete an echo operation |
vault.replication.rpc.server.last_heartbeat
Metric type | Value | Description |
---|---|---|
gauge | timestamp | Epoch time (seconds since 1970-01-01) of the last heartbeat received from the connected cluster |
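Since the gauge reports an epoch timestamp rather than a duration, a consumer typically subtracts it from the current time to get the heartbeat age. The sketch below shows that calculation; the function names and the 30-second threshold are illustrative assumptions, not Vault defaults.

```python
import time
from typing import Optional


def heartbeat_age_seconds(last_heartbeat: float, now: Optional[float] = None) -> float:
    """Seconds elapsed since the last heartbeat.

    last_heartbeat: value of vault.replication.rpc.server.last_heartbeat
                    (epoch seconds).
    now:            current epoch time; defaults to time.time().
    """
    if now is None:
        now = time.time()
    # Clamp at zero in case of small clock skew between hosts.
    return max(0.0, now - last_heartbeat)


def is_stale(last_heartbeat: float, threshold_seconds: float = 30.0,
             now: Optional[float] = None) -> bool:
    """True if the heartbeat is older than the (hypothetical) threshold."""
    return heartbeat_age_seconds(last_heartbeat, now) > threshold_seconds


# Hypothetical reading taken 12.5 seconds before "now".
print(heartbeat_age_seconds(1000.0, now=1012.5))
```

Passing `now` explicitly keeps the calculation deterministic, which is convenient when the metric and the clock reading come from the same scrape.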
vault.replication.rpc.server.forwarding_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a forwarding request |
vault.replication.rpc.server.guard_hash_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a guard hash request |
vault.replication.rpc.server.persist_alias_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a request to persist an alias |
vault.replication.rpc.server.persist_persona_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a request to persist a persona |
vault.replication.rpc.server.save_mfa_response_auth
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to save an MFA authentication response |
vault.replication.rpc.server.stream_wals_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a request to stream write-ahead logs |
vault.replication.rpc.server.sub_page_hashes_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a sub-page hashes request |
vault.replication.rpc.server.sync_counter_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a counter sync request |
vault.replication.rpc.server.upsert_group_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete a group upsert request |
vault.replication.rpc.standby.server.create_token_register_auth_lease_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to service a create token request from a standby node |
vault.replication.rpc.standby.server.echo
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to service an echo request from a standby node |
vault.replication.rpc.standby.server.register_auth_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to service a register auth request from a standby node |
vault.replication.rpc.standby.server.register_lease_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to service a register lease request from a standby node |
vault.replication.rpc.standby.server.wrap_token_request
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to service a wrap token request from a standby node |
vault.replication.wal.gc
Metric type | Value | Description |
---|---|---|
summary | ms | Time required to complete one run of the WAL garbage collection process |
vault.replication.wal.last_dr_wal
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last write-ahead log for disaster recovery. Note that this is emitted by all Vault Enterprise clusters, regardless of cluster type. |
vault.replication.wal.last_performance_wal
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last write-ahead log for performance |
vault.replication.wal.last_wal
Metric type | Value | Description |
---|---|---|
gauge | number | Index of the last write-ahead log |
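The WAL index gauges are plain numbers, so two of them can be subtracted to estimate how far one cluster trails another. The sketch below assumes, as a working hypothesis rather than documented behavior, that a secondary's `vault.replication.fsm.last_remote_wal` can be compared directly with the primary's `vault.replication.wal.last_wal`. The function name `wal_lag` and the sample indexes are hypothetical.

```python
def wal_lag(primary_last_wal: int, secondary_last_remote_wal: int) -> int:
    """Number of WAL indexes the secondary trails the primary by.

    primary_last_wal:          vault.replication.wal.last_wal on the primary
    secondary_last_remote_wal: vault.replication.fsm.last_remote_wal on the
                               secondary
    """
    # The two gauges are scraped at slightly different moments, so the
    # secondary can briefly appear ahead; clamp the result at zero.
    return max(0, primary_last_wal - secondary_last_remote_wal)


# Hypothetical readings from one scrape of each cluster.
print(wal_lag(1500, 1420))
```

A lag that stays at or near zero suggests the secondary is keeping up, while a lag that grows across successive scrapes is worth investigating.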