diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc
index 141a4a03eb..facc5795e7 100644
--- a/modules/ROOT/nav.adoc
+++ b/modules/ROOT/nav.adoc
@@ -401,6 +401,8 @@ include::cli:partial$cbcli/nav.adoc[]
 **** xref:rest-api:rest-manage-log-collection.adoc[Collecting Logs]
 **** xref:rest-api:rest-client-logs.adoc[Logging Client-Side Errors]
 
+ *** xref:rest-api:application-telemetry.adoc[]
+
 ** xref:rest-api:rest-bucket-intro.adoc[Buckets API]
 *** xref:rest-api:rest-bucket-create.adoc[Creating and Editing Buckets]
 *** xref:rest-api:setting-minimum-replicas.adoc[Setting a Replica-Minimum]
diff --git a/modules/manage/pages/monitor/set-up-prometheus-for-monitoring.adoc b/modules/manage/pages/monitor/set-up-prometheus-for-monitoring.adoc
index 3e17a48361..82c616d4c6 100644
--- a/modules/manage/pages/monitor/set-up-prometheus-for-monitoring.adoc
+++ b/modules/manage/pages/monitor/set-up-prometheus-for-monitoring.adoc
@@ -139,3 +139,13 @@ The `http_sd_configs` section contains its own copy of the `basic_auth` and `tls
 
 You can change the list that the discovery API returns by adding query parameters to the URL in the `http_sd_configs` section.
 See xref:rest-api:rest-discovery-api.adoc[Prometheus Discovery API].
+
+[#app-telemetry]
+== Include Application Telemetry in Prometheus Metrics
+
+You can enable application telemetry to have Couchbase Server collect metrics from your applications that use the Couchbase SDKs.
+When you enable application telemetry, Couchbase Server collects telemetry data from your applications.
+It then reports the collected data as metrics through the same Prometheus endpoint that it uses to report its own metrics.
+Enabling this feature lets you use Prometheus to monitor the health of both your Couchbase Server cluster and your applications that use the Couchbase SDKs.
+
+See xref:rest-api:application-telemetry.adoc[] to learn how to enable application telemetry.
\ No newline at end of file
diff --git a/modules/rest-api/pages/application-telemetry.adoc b/modules/rest-api/pages/application-telemetry.adoc
new file mode 100644
index 0000000000..2f0489f9ee
--- /dev/null
+++ b/modules/rest-api/pages/application-telemetry.adoc
@@ -0,0 +1,222 @@
+= Application Telemetry
+:description: pass:q[You can enable application telemetry that lets Couchbase Server periodically collect information from your clients that use the Couchbase SDK.]
+:page-topic-type: reference
+:page-toclevels: 3
+
+[abstract]
+{description}
+
+== Description
+
+Having Couchbase Server collect telemetry information from your applications that use the Couchbase SDKs can help you troubleshoot client issues.
+This telemetry data is useful to diagnose issues such as poor performance or timeouts.
+
+When you enable application telemetry, Couchbase Server advertises to SDK clients that it can collect telemetry data.
+When an SDK client connects to a cluster with application telemetry enabled, it opens a WebSocket connection to a node in the cluster.
+Couchbase Server uses this connection to periodically gather telemetry data from the client in Prometheus format.
+
+Couchbase Server reports the collected telemetry data through the same Prometheus metrics endpoint it uses to publish its own metrics.
+See xref:manage:monitor/set-up-prometheus-for-monitoring.adoc[] to learn how to set up Prometheus to collect metrics from your Couchbase Server cluster.
+
+NOTE: A Couchbase Server cluster only supports application telemetry when all of its nodes are running version 8.0 or later.
+Earlier versions of Couchbase Server do not support application telemetry.
+You cannot have the cluster collect application telemetry if the cluster is running in mixed mode with some nodes running a version earlier than 8.0.
+If you enable application telemetry on a cluster running in mixed mode with pre-8.0 nodes, the cluster does not advertise its ability to collect telemetry to clients.
+
+== HTTP Methods
+
+This API endpoint supports the following methods:
+
+* <<#get-status>>
+* <<#configure-telemetry>>
+
+
+[#get-status]
+== Get Application Telemetry Status
+
+The following method gets the current state of application telemetry for the cluster.
+
+----
+GET /settings/appTelemetry
+----
+
+=== curl Syntax
+
+[source,bash]
+----
+curl -sS -u $USER:$PASSWORD \
+    -X GET 'http[s]://{host}:{port}/settings/appTelemetry'
+----
+
+.Path and curl Parameters
+:priv-link: get-privs
+include::partial$user-pw-host-port-params.adoc[]
+
+
+[#get-privs]
+=== Required Privileges
+
+Your user account must have at least 1 of the following roles to get the application telemetry status:
+
+* xref:learn:security/roles.adoc#full-admin[Full Admin]
+* xref:learn:security/roles.adoc#cluster-admin[Cluster Admin]
+* xref:learn:security/roles.adoc#read-only-admin[Read-Only Admin]
+
+[#get-status-responses]
+=== Responses
+
+`200 OK`::
+Returned when the call is successful.
+The response body contains a JSON object with the following fields:
+
++
+* `enabled`: whether application telemetry is enabled.
+* `maxScrapeClientsPerNode`: the maximum number of clients from which a single node can scrape telemetry data at the same time.
+* `scrapeIntervalSeconds`: how often the nodes scrape telemetry data from clients, in seconds.
+
+`403 Forbidden`::
+Returned if you do not have the proper roles to call this API.
+See <<#get-privs>>.
+
+
+[#gey-state-example]
+=== Examples
+
+The following example gets the cluster's current application telemetry setting from the local node and pipes the result through `jq`.
+
+[source,bash]
+----
+curl -sX GET -u Administrator:password \
+    'http://localhost:8091/settings/appTelemetry' | jq
+----
+
+Running the previous command returns a JSON object similar to the following:
+
+[source,json]
+----
+{
+  "enabled": false,
+  "maxScrapeClientsPerNode": 1024,
+  "scrapeIntervalSeconds": 60
+}
+----
+
+
+[#configure-telemetry]
+== Configure Application Telemetry
+
+By sending a POST request to the `/settings/appTelemetry` endpoint, you can:
+
+* Enable or turn off application telemetry.
+* Set the limit on the number of clients a single node can scrape telemetry data from at the same time.
+* Set how often the nodes scrape telemetry data from clients.
+
+.Configure Application Telemetry
+----
+POST /settings/appTelemetry
+----
+
+
+=== curl Syntax
+
+[source,bash]
+----
+curl -sS -u $USER:$PASSWORD \
+    -X POST http://{host}:{port}/settings/appTelemetry \
+    [-d enabled=[true|false]] \
+    [-d maxScrapeClientsPerNode=] \
+    [-d scrapeIntervalSeconds=]
+----
+
+.Path and curl Parameters
+:priv-link: config-privs
+include::partial$user-pw-host-port-params.adoc[]
+
+
+.REST Parameters
+`enabled` (boolean, optional)::
+Set to `true` to enable application telemetry or `false` to disable it.
+
+`maxScrapeClientsPerNode` (integer, optional)::
+Set the maximum number of clients from which a single node can scrape telemetry data at the same time.
+If the number of client telemetry connections reaches this threshold, the node rejects new telemetry connections until the number of connected clients drops.
+
++
+Valid values are from `1` to `1024`.
+
++
+The default value is `1024`.
+You can set `maxScrapeClientsPerNode` to a lower value to reduce potential overhead on your nodes from collecting telemetry data.
+However, lowering it too far could result in clients being unable to find a node for their telemetry connection.
+When clients cannot find a node to connect to, their telemetry data is not collected.
+
+`scrapeIntervalSeconds` (integer, optional)::
+Sets how often the nodes scrape telemetry data from clients, in seconds.
+
++
+Valid values are `60` to `600`.
+
++
+The default value is `60`.
+You can increase this value to reduce the overhead on your nodes of collecting telemetry data.
+However, setting it too high could result in a loss of telemetry data when clients close their connections.
+
+[#config-privs]
+=== Required Privileges
+Your user account must have at least 1 of the following roles to configure application telemetry:
+
+* xref:learn:security/roles.adoc#full-admin[Full Admin]
+* xref:learn:security/roles.adoc#cluster-admin[Cluster Admin]
+
+=== Responses
+
+`200 OK`::
+Returned when the call is successful.
+A successful call also returns a JSON object with the new application telemetry settings.
+This object has the same format as the <<#get-status-responses,response from the GET method>>.
+
+`403 Forbidden`::
+Returned if you do not have the proper roles to call this API.
+See <<#config-privs>> for a list of the required roles.
+
+[#config-examples]
+=== Examples
+
+The following example enables telemetry, sets the maximum number of clients a node can scrape telemetry data from at the same time to `512`, and sets the scrape interval to `90` seconds.
+It pipes the result through `jq` to make it easier to read.
+
+[source,bash]
+----
+curl -X POST -u Administrator:password \
+    http://localhost:8091/settings/appTelemetry \
+    -d enabled=true \
+    -d maxScrapeClientsPerNode=512 \
+    -d scrapeIntervalSeconds=90 | jq
+----
+
+If successful, the previous command returns the following JSON object containing the new state of application telemetry for the cluster:
+
+[source,json]
+----
+{
+  "enabled": true,
+  "maxScrapeClientsPerNode": 512,
+  "scrapeIntervalSeconds": 90
+}
+----
+
+== See Also
+
+* xref:manage:monitor/set-up-prometheus-for-monitoring.adoc[]
+
+* See the SDK Telemetry from the Server section of the Collecting Information and Logging page in the documentation for the SDK you use.
+For example:
+
+** xref:cxx-sdk:howtos:collecting-information-and-logging.adoc#sdk-telemetry-from-the-server[C++ SDK]
+** xref:go-sdk:howtos:collecting-information-and-logging.adoc#sdk-telemetry-from-the-server[Go SDK]
+** xref:java-sdk:howtos:collecting-information-and-logging.adoc#sdk-telemetry-from-the-server[Java SDK]
+** xref:kotlin-sdk:howtos:collecting-information-and-logging.adoc#sdk-telemetry-from-the-server[Kotlin SDK]
+
+
+
+
diff --git a/modules/rest-api/partials/user-pw-host-port-params.adoc b/modules/rest-api/partials/user-pw-host-port-params.adoc
index cdec18b81e..e0d65f9635 100644
--- a/modules/rest-api/partials/user-pw-host-port-params.adoc
+++ b/modules/rest-api/partials/user-pw-host-port-params.adoc
@@ -5,9 +5,9 @@ The name of a user who has one of the roles listed in <<{priv-link}>>.
 `PASSWORD`::
 The password for the `user`.
 
-`HOST`::
-Hostname or IP address of a Couchbase Server.
+`host`::
+Hostname or IP address of a Couchbase Server node.
 
-`PORT`::
+`port`::
 Port number for the REST API.
-Defaults are 8091 for unencrypted and 18901 for encrypted connections.
+Defaults are 8091 for unencrypted and 18091 for encrypted connections.
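
The configuration call documented in the new `application-telemetry.adoc` page can also be sketched as a small client-side helper. The following is a minimal illustration and not part of this change. It assumes the endpoint accepts form-encoded fields exactly as the curl examples send them, and it checks the documented ranges (`1` to `1024` and `60` to `600`) before building the request. The function name `build_app_telemetry_request` and its Python argument names are hypothetical.

```python
import base64
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Ranges copied from the parameter descriptions in application-telemetry.adoc.
MAX_CLIENTS_RANGE = (1, 1024)
SCRAPE_INTERVAL_RANGE = (60, 600)


def build_app_telemetry_request(
    host: str,
    user: str,
    password: str,
    enabled: Optional[bool] = None,
    max_scrape_clients_per_node: Optional[int] = None,
    scrape_interval_seconds: Optional[int] = None,
    port: int = 8091,
) -> Request:
    """Build, but do not send, a POST request for /settings/appTelemetry.

    Hypothetical helper: only the endpoint path, the field names, and the
    value ranges come from the documentation page in this diff.
    """
    fields = {}
    if enabled is not None:
        fields["enabled"] = "true" if enabled else "false"
    if max_scrape_clients_per_node is not None:
        low, high = MAX_CLIENTS_RANGE
        if not low <= max_scrape_clients_per_node <= high:
            raise ValueError(f"maxScrapeClientsPerNode must be {low}..{high}")
        fields["maxScrapeClientsPerNode"] = str(max_scrape_clients_per_node)
    if scrape_interval_seconds is not None:
        low, high = SCRAPE_INTERVAL_RANGE
        if not low <= scrape_interval_seconds <= high:
            raise ValueError(f"scrapeIntervalSeconds must be {low}..{high}")
        fields["scrapeIntervalSeconds"] = str(scrape_interval_seconds)

    # HTTP basic authentication, equivalent to curl's -u user:password.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"http://{host}:{port}/settings/appTelemetry",
        data=urlencode(fields).encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the range checks testable without a running cluster.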