From 36c3142c7d3be36acfa254573342d96cc75cb465 Mon Sep 17 00:00:00 2001
From: Tim Hingston
Date: Fri, 6 Sep 2024 13:43:59 +1000
Subject: [PATCH] Add docs for enhanced OTel tracing in Studio

---
 docs/source/configuration/overview.mdx        | 14 ++++
 .../telemetry/apollo-telemetry.mdx            | 67 ++++++++++++++++++-
 2 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/docs/source/configuration/overview.mdx b/docs/source/configuration/overview.mdx
index 13e6991a6a..c96db9d4fc 100644
--- a/docs/source/configuration/overview.mdx
+++ b/docs/source/configuration/overview.mdx
@@ -865,6 +865,20 @@ You won't see an immediate change in checks behavior when you first turn on exte
 
 
 
+
+
+### Enhanced tracing in Studio via OpenTelemetry
+
+
+
+
+
+Beginning in v1.49.0, the router supports sending traces to Studio via the more detailed OpenTelemetry (OTel) protocol, OTLP.
+Support for OTel traces has historically been available only for third-party APM tools. With this option,
+Studio can now provide a much more granular view of Router internals than is possible with the legacy Apollo tracing protocol.
+
+See [Enhanced tracing in Studio via OTel](./telemetry/apollo-telemetry#enhanced-tracing-in-studio-via-opentelemetry).
+
 ### Safelisting with persisted queries
 
 You can enhance your graph's security with GraphOS Router by maintaining a persisted query list (PQL), an operation safelist made by your first-party apps. As opposed to automatic persisted queries (APQ) where operations are automatically cached, operations must be preregistered to the PQL. Once configured, the router checks incoming requests against the PQL.
diff --git a/docs/source/configuration/telemetry/apollo-telemetry.mdx b/docs/source/configuration/telemetry/apollo-telemetry.mdx
index c5f6c0c609..b3acb828a4 100644
--- a/docs/source/configuration/telemetry/apollo-telemetry.mdx
+++ b/docs/source/configuration/telemetry/apollo-telemetry.mdx
@@ -72,6 +72,70 @@ telemetry:
     field_level_instrumentation_sampler: always_off
 ```
 
+
+
+### Enhanced tracing in Studio via OpenTelemetry
+
+
+
+
+
+Beginning in v1.49.0, the router supports sending traces to Studio via the more detailed OpenTelemetry (OTel) protocol, OTLP.
+Support for OTel traces has historically been available only for third-party APM tools. With this option,
+Studio can now provide a much more granular view of Router internals than is possible with the legacy Apollo tracing protocol.
+
+Benefits include:
+
+- A comprehensive way to visualize the Router execution path in Studio.
+- Additional spans that were previously not included in Studio traces, such as query parsing, planning, execution, and more.
+- Additional attributes including HTTP request details, REST connector details, and more.
+
+It is expected that this will become the default in a future version of Router.
+
+#### Configuration
+
+This change adds a new configuration option, `telemetry.apollo.experimental_otlp_tracing_sampler`. Use this option to send
+a percentage of traces to Studio via OTLP instead of the native Apollo Usage Reporting protocol. Supported values:
+
+- `always_off` (default): send all traces via the legacy Apollo Usage Reporting protocol.
+- `always_on`: send all traces via OTLP.
+- `0.0 - 1.0` (used for testing): the ratio of traces to send via OTLP (for example, `0.5` sends half via OTLP and half via the legacy protocol).
+
+Note that this sampler is applied only _after_ the common tracing sampler. For example:
+
+#### Sample 1% of traces, send all sampled traces via OTLP:
+
+```yaml
+telemetry:
+  apollo:
+    # Send all traces via OTLP
+    experimental_otlp_tracing_sampler: always_on
+
+  exporters:
+    tracing:
+      common:
+        # Sample traces at 1% of all traffic
+        sampler: 0.01
+```
+
+OTel traces sent to Studio will not necessarily be identical to the ones sent to third-party APM tools via OTLP:
+
+- Only specific OTLP attributes will be included, for parity with what is provided in legacy traces today. This ensures that data privacy
+  is maintained in an equivalent manner. The existing Router configuration options for Apollo telemetry, such as forwarding of
+  GraphQL errors, headers, and variables, will continue to function with OTLP traces.
+- Some features of OTLP traces may be available only in Studio and not in third-party APM tools (e.g. resolver-level timing information from
+  [Federated Tracing](../federation/metrics/#enabling-federated-tracing)).
+
+
+
+This change uses a new wire protocol for traces, and some users may experience an increase in tracing traffic
+to GraphOS Studio due to the additional detail being captured. In exceptional situations it may be necessary to send fewer traces.
+This can be achieved by lowering the common sampling rate (`telemetry.exporters.tracing.common.sampler`) or, as a last resort, by falling back
+to the old protocol via `telemetry.apollo.experimental_otlp_tracing_sampler` to send fewer OTLP traces or fully disable them.
+Any performance regressions due to the new tracing protocol should also be reported to the Apollo support team.
+
+
+
 ### Experimental local field metrics
 
 Apollo Router can send field-level metrics to GraphOS without using FTV1 tracing. This feature is experimental and is not yet displayable in GraphOS Studio.
@@ -96,8 +160,7 @@ telemetry:
     #highlight-start
     send_headers:
       only: # Include only headers with these names
-        - referer
-    #highlight-end
+        - referer #highlight-end
 ```
 
 **Supported values:**