Sourcegraph's metrics include a single high-level metric, alert_count, which indicates the number of level=critical and level=warning alerts that each Sourcegraph service has fired over time. This is the same metric presented on the Overview Grafana dashboard:
alert_count
Description: The number of alerts each service has fired and their severity level. The severity levels are defined as follows:
critical: something is definitively wrong with Sourcegraph.
warning: something could be wrong with Sourcegraph.
Values: alert_count values are floating-point numbers, but only their whole-number part has meaning. For example, 0.5 and 0.7 indicate no alerts are firing, while 1.2 indicates exactly one alert is firing and 3.0 indicates exactly three alerts are firing.
Labels:
level: either critical or warning, as defined above.
service_name: the name of the service that fired the alert, one of the following constants:
"frontend"
"github-proxy"
"gitserver"
"precise-code-intel"
"query-runner"
"replacer"
"repo-updater"
"searcher"
"symbols"
"zoekt-indexserver"
"zoekt-webserver"
"syntect-server"
name: the name of the alert that the service fired (chosen by the service).
description: a human-readable description of the alert.
Examples:
To get examples of how you might consume this metric in your own alerting system, see: Custom consumption of Sourcegraph alerts.
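As a rough illustration of the value and label semantics described above, the following Python sketch keeps only the whole-number part of each alert_count value and groups firing alerts by service_name and level. The shape of the input samples, the function name firing_alerts, and the alert names are assumptions made for this example; they are not part of Sourcegraph's API, and real samples would come from however your alerting system reads the metric.

```python
import math
from collections import defaultdict


def firing_alerts(samples):
    """Return {(service_name, level): count} for series with at least one firing alert.

    `samples` is a hypothetical list of dicts, each with a "labels" dict and a
    float "value", standing in for alert_count series read from your metrics store.
    """
    totals = defaultdict(int)
    for sample in samples:
        # Only the whole-number part of alert_count is meaningful:
        # 0.5 and 0.7 mean no alerts, 1.2 means one, 3.0 means three.
        count = math.floor(sample["value"])
        if count >= 1:
            labels = sample["labels"]
            totals[(labels["service_name"], labels["level"])] += count
    return dict(totals)


# Illustrative data only; the alert names and descriptions are made up.
samples = [
    {"labels": {"service_name": "frontend", "level": "critical",
                "name": "example_alert_a", "description": "example"}, "value": 1.2},
    {"labels": {"service_name": "gitserver", "level": "warning",
                "name": "example_alert_b", "description": "example"}, "value": 0.7},
    {"labels": {"service_name": "searcher", "level": "warning",
                "name": "example_alert_c", "description": "example"}, "value": 3.0},
]

result = firing_alerts(samples)
print(result)
```

With the sample data above, the gitserver series is dropped because 0.7 rounds down to zero, while frontend contributes one critical alert and searcher three warning alerts.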
A complete reference of Sourcegraph's vast set of Prometheus metrics is not yet available. If you are interested in this, please reach out by filing an issue or contacting us at [email protected].