Libraries
Learn how to add native instrumentation to your library.
OpenTelemetry provides instrumentation libraries for many libraries, which is typically done through library hooks or monkey-patching library code.
Native library instrumentation with OpenTelemetry provides better observability and developer experience for users, removing the need for libraries to expose and document hooks:
Semantic conventions
Check out the available semantic conventions, which cover web frameworks, RPC clients, databases, messaging clients, infrastructure pieces, and more!
If your library is one of those things, follow the conventions. They are the main source of truth and tell you which information should be included on spans. Conventions make instrumentation consistent: users who work with telemetry don’t have to learn library specifics, and observability vendors can build experiences for a wide variety of technologies (e.g. databases or messaging systems). When libraries follow conventions, many scenarios may be enabled out of the box without the user’s input or configuration.
Semantic conventions are always evolving and new ones are constantly added. If some don’t exist for your library, then please consider adding them. Pay special attention to span names; strive to use meaningful names and consider cardinality when defining them.
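To illustrate the point about span names and cardinality, here is a minimal sketch in plain Java. The SpanNames class and its forQuery helper are hypothetical names introduced only for this example; they are not part of OpenTelemetry. The idea is that the name is built from a small, bounded set of values (operation, database, collection), while unbounded values such as record IDs or raw query text stay out of the name and belong in attributes instead.

```java
import java.util.Locale;

/** Hypothetical helper (not an OpenTelemetry API): builds low-cardinality span names. */
final class SpanNames {
    private SpanNames() {}

    /**
     * Builds a name such as "SELECT shop.orders" from values that come from a
     * small, bounded set. Record IDs and raw query text are deliberately left out.
     */
    static String forQuery(String operation, String dbName, String collectionName) {
        return operation.toUpperCase(Locale.ROOT) + " " + dbName + "." + collectionName;
    }

    public static void main(String[] args) {
        // Queries for different records in the same collection share one span name.
        System.out.println(forQuery("select", "shop", "orders"));
    }
}
```

With a name like this, every query against the same collection groups under one span name, which keeps the number of distinct names bounded.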
There is a schema_url attribute that can be used to record which version of the semantic conventions is being used. Please set this attribute when possible.
If you have any feedback or want to add a new convention, please come and contribute! The Instrumentation Slack or the Specification repository are good places to start!
Defining spans
Think of your library from the perspective of a library user and what the user might be interested in knowing about the behavior and activity of the library. As the library maintainer, you know the internals, but the user will most likely be less interested in the inner workings of the library and more interested in the functionality of their application. Think about what information can be helpful in analyzing the usage of your library, then think about an appropriate way to model that data. Some things to consider are:
For example, if your library is making requests to a database, create spans only for the logical request to the database. The physical requests over the network should be instrumented within the libraries implementing that functionality. You should also favor capturing other activities, like object/data serialization, as span events rather than as additional spans.
Follow the semantic conventions when setting span attributes.
When not to instrument
Some libraries are thin clients wrapping network calls. Chances are that OpenTelemetry has an instrumentation library for the underlying RPC client (check out the registry). In this case, instrumenting the wrapper library may not be necessary. As a general guideline, only instrument your library at its own level.
Don’t instrument if:
If you’re in doubt, don’t instrument; you can always do it later when you see a need.
If you choose not to instrument, it may still be useful to provide a way to configure OpenTelemetry handlers for your internal RPC client instance. This is essential in languages that don’t support fully automatic instrumentation and still useful in others.
The rest of this document gives guidance on what and how to instrument if you decide to do it.
The first step is to take a dependency on the OpenTelemetry API package.
OpenTelemetry has two main modules: API and SDK. The OpenTelemetry API is a set of abstractions and non-operational implementations. Unless your application imports the OpenTelemetry SDK, your instrumentation does nothing and does not impact application performance.
Libraries should only use the OpenTelemetry API.
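The split between an API with non-operational defaults and an implementation that the application opts into can be sketched in plain Java. Everything below (MiniTracer, DemoLibrary, install) is a hypothetical stand-in written for this illustration, not the actual OpenTelemetry API; it only shows why library calls are harmless until the application installs a real implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an API-level abstraction; not the OpenTelemetry API. */
interface MiniTracer {
    void record(String spanName);

    /** Non-operational default: accepts calls and does nothing. */
    static MiniTracer noop() {
        return spanName -> { };
    }
}

/** Hypothetical library that depends only on the abstraction above. */
final class DemoLibrary {
    private static volatile MiniTracer tracer = MiniTracer.noop();

    private DemoLibrary() {}

    /** Called by the application (the "SDK" side) to opt in to telemetry. */
    static void install(MiniTracer implementation) {
        tracer = implementation;
    }

    static String doWork() {
        tracer.record("doWork"); // does nothing unless the application installed a tracer
        return "done";
    }

    public static void main(String[] args) {
        doWork(); // nothing is recorded yet
        List<String> recorded = new ArrayList<>();
        install(recorded::add); // the application opts in
        doWork();
        System.out.println(recorded);
    }
}
```

The library code path is identical in both cases; only the application decides whether anything is recorded.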
You may be rightfully concerned about adding new dependencies. Here are some considerations to help you decide how to minimize dependency hell:
All application configuration is hidden from your library through the Tracer API. Libraries may allow applications to pass instances of TracerProvider to facilitate dependency injection and ease of testing, or obtain it from the global TracerProvider. OpenTelemetry language implementations may have different preferences for passing instances or accessing the global based on what’s idiomatic.
When obtaining the tracer, provide your library (or tracing plugin) name and version. They show up on the telemetry and help users process and filter telemetry, understand where it came from, and debug/report any instrumentation issues.
What to instrument
Public APIs
Public APIs are good candidates for tracing: spans created for public API calls allow users to map telemetry to application code and understand the duration and outcome of library calls. Which calls to trace:
Instrumentation example:
private static Tracer tracer = getTracer(TracerProvider.noop());

public static void setTracerProvider(TracerProvider tracerProvider) {
    tracer = getTracer(tracerProvider);
}

private static Tracer getTracer(TracerProvider tracerProvider) {
    // library name and version identify the instrumentation on emitted telemetry
    return tracerProvider.get("demo-db-client", "0.1.0-beta1");
}

private Response selectWithTracing(Query query) {
    // check out conventions for guidance on span names and attributes
    Span span = tracer.spanBuilder(String.format("SELECT %s.%s", dbName, collectionName))
        .setSpanKind(SpanKind.CLIENT)
        .setAttribute("db.name", dbName)
        ...
        .startSpan();

    // makes span active and allows correlating logs and nested spans
    try (Scope unused = span.makeCurrent()) {
        Response response = query.runWithRetries();
        if (response.isSuccessful()) {
            span.setStatus(StatusCode.OK);
        }

        if (span.isRecording()) {
            // populate response attributes for response codes and other information
        }
        return response;
    } catch (Exception e) {
        span.recordException(e);
        span.setStatus(StatusCode.ERROR, e.getClass().getSimpleName());
        throw e;
    } finally {
        span.end();
    }
}
Follow conventions to populate attributes! If there is no applicable one, check out the general conventions.
Nested network and other spans
Network calls are usually traced with OpenTelemetry auto-instrumentations through the corresponding client implementation.
If OpenTelemetry does not support tracing your network client, use your best judgement. Here are some considerations to help:
If OpenTelemetry already supports tracing your network calls, you probably don’t want to duplicate it. There may be some exceptions:
WARNING: A generic solution to avoid duplication is under construction.
Traces are one kind of signal that your apps can emit. Events (or logs) and traces complement, not duplicate, each other. Whenever you have something that should have a verbosity level, logs are a better choice than traces.
Chances are that your app already uses logging or some similar module. Your module might already have OpenTelemetry integration; to find out, see the registry. Integrations usually stamp the active trace context on all logs, so users can correlate them.
If your language and ecosystem don’t have common logging support, use span events to share additional app details. Events may be more convenient if you want to add attributes as well.
As a rule of thumb, use events or logs for verbose data instead of spans. Always attach events to the span instance that your instrumentation created. Avoid using the active span if you can, since you don’t control what it refers to.
Context propagation
Extracting context
If you work on a library or a service that receives upstream calls, e.g. a web framework or a messaging consumer, you should extract context from the incoming request/message. OpenTelemetry provides the Propagator API, which hides specific propagation standards and reads the trace Context from the wire. In the case of a single response, there is just one context on the wire, which becomes the parent of the new span the library creates.
After you create a span, you should pass the new trace context to the application code (callback or handler) by making the span active; if possible, you should do this explicitly.
// extract the context
Context extractedContext = propagator.extract(Context.current(), httpExchange, getter);
Span span = tracer.spanBuilder("receive")
    .setSpanKind(SpanKind.SERVER)
    .setParent(extractedContext)
    .startSpan();

// make span active so any nested telemetry is correlated
try (Scope unused = span.makeCurrent()) {
    userCode();
} catch (Exception e) {
    span.recordException(e);
    span.setStatus(StatusCode.ERROR);
    throw e;
} finally {
    span.end();
}
Here are the full examples of context extraction in Java. For other languages, check out the OpenTelemetry documentation in your language.
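The Propagator API hides the wire format, so libraries normally never parse headers themselves. Purely to show what reading the trace context from the wire can involve, here is a simplified sketch that parses a W3C Trace Context traceparent header, one of the propagation standards a propagator may implement. The TraceParent class is hypothetical, handles only the version 00 layout, and is not a substitute for the Propagator API.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified parser for a version-00 W3C traceparent header. */
final class TraceParent {
    // Layout: "00" - 32 hex chars (trace-id) - 16 hex chars (parent-id) - 2 hex chars (flags)
    private static final Pattern FORMAT =
        Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    final String traceId;
    final String parentSpanId;
    final boolean sampled;

    private TraceParent(String traceId, String parentSpanId, boolean sampled) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    /** Returns the parsed header, or empty when the value is missing or malformed. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String traceId = m.group(1);
        String parentSpanId = m.group(2);
        // All-zero identifiers are invalid in W3C Trace Context.
        if (isAllZero(traceId) || isAllZero(parentSpanId)) {
            return Optional.empty();
        }
        // The lowest bit of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return Optional.of(new TraceParent(traceId, parentSpanId, sampled));
    }

    private static boolean isAllZero(String hex) {
        return hex.chars().allMatch(c -> c == '0');
    }
}
```

A malformed or missing header yields an empty result, in which case a new span would simply start a new trace rather than fail the request.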
In the case of a messaging system, you may receive more than one message at once. Received messages become links on the span you create. Refer to the messaging conventions for details (WARNING: messaging conventions are under construction).
Injecting context
When you make an outbound call, you will usually want to propagate context to the downstream service. In this case, you should create a new span to trace the outgoing call and use the Propagator API to inject context into the message. There may be other cases where you might want to inject context, e.g. when creating messages for async processing.
Span span = tracer.spanBuilder("send")
    .setSpanKind(SpanKind.CLIENT)
    .startSpan();

// make span active so any nested telemetry is correlated
// even network calls might have nested layers of spans, logs or events
try (Scope unused = span.makeCurrent()) {
    // inject the context
    propagator.inject(Context.current(), transportLayer, setter);
    send();
} catch (Exception e) {
    span.recordException(e);
    span.setStatus(StatusCode.ERROR);
    throw e;
} finally {
    span.end();
}
Here’s the full example of context injection in Java.
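On the outbound side, the propagator writes the current context into a carrier through a setter, which keeps the library independent of the transport's header type. The sketch below mimics that shape in plain Java for the W3C traceparent header. CarrierSetter, TraceContextInjector, and the example IDs are hypothetical and only illustrate the pattern; they are not the OpenTelemetry Propagator API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical setter abstraction: writes one key/value pair into a carrier of type C. */
interface CarrierSetter<C> {
    void set(C carrier, String key, String value);
}

/** Hypothetical injector that formats a version-00 traceparent value. */
final class TraceContextInjector {
    private TraceContextInjector() {}

    static <C> void inject(String traceId, String spanId, boolean sampled,
                           C carrier, CarrierSetter<C> setter) {
        String flags = sampled ? "01" : "00";
        // The injector never touches the carrier directly; the setter knows how.
        setter.set(carrier, "traceparent", "00-" + traceId + "-" + spanId + "-" + flags);
    }

    public static void main(String[] args) {
        // Here the carrier is a plain map; a real transport would supply its own header type.
        Map<String, String> headers = new HashMap<>();
        inject("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true, headers, Map::put);
        System.out.println(headers.get("traceparent"));
    }
}
```

Because the carrier type is generic, the same injection logic can serve HTTP headers, message properties, or any other transport metadata.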
There might be some exceptions:
In-process
Misc
Instrumentation registry
Please add your instrumentation library to the OpenTelemetry registry, so users can find it.
The OpenTelemetry API is no-op and very performant when there is no SDK in the application. When the OpenTelemetry SDK is configured, it consumes bounded resources.
Real-life applications, especially at high scale, frequently have head-based sampling configured. Sampled-out spans are cheap, and you can check whether the span is recording to avoid extra allocations and potentially expensive calculations while populating attributes.
// some attributes are important for sampling, they should be provided at creation time
Span span = tracer.spanBuilder(String.format("SELECT %s.%s", dbName, collectionName))
    .setSpanKind(SpanKind.CLIENT)
    .setAttribute("db.name", dbName)
    ...
    .startSpan();

// other attributes, especially those that are expensive to calculate
// should be added if span is recording
if (span.isRecording()) {
    span.setAttribute("db.statement", sanitize(query.statement()));
}
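The guard above generalizes to any attribute whose value is costly to compute. As a minimal sketch in plain Java, the hypothetical MiniSpan interface and LazyAttributes.setIfRecording helper below (neither is part of OpenTelemetry) defer the computation behind a Supplier, so a sampled-out span never pays for it.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the two span methods used in the example above. */
interface MiniSpan {
    boolean isRecording();

    void setAttribute(String key, String value);
}

/** Hypothetical helper: evaluates an attribute value only for recording spans. */
final class LazyAttributes {
    private LazyAttributes() {}

    static void setIfRecording(MiniSpan span, String key, Supplier<String> expensiveValue) {
        if (span.isRecording()) {
            // The supplier runs only here, so sampled-out spans skip the work entirely.
            span.setAttribute(key, expensiveValue.get());
        }
    }
}
```

A call site would pass the expensive step as a lambda, for example a supplier that sanitizes the query text, instead of computing the value up front.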
Error handling
The OpenTelemetry API is forgiving at runtime: it does not fail on invalid arguments, never throws, and swallows exceptions. This way, instrumentation issues do not affect application logic. Test the instrumentation to notice issues OpenTelemetry hides at runtime.
Testing
Since OpenTelemetry has a variety of auto-instrumentations, it’s useful to try how your instrumentation interacts with other telemetry: incoming requests, outgoing requests, logs, etc. Use a typical application, with popular frameworks and libraries and all tracing enabled, when trying out your instrumentation. Check out how libraries similar to yours show up.
@Test
public void checkInstrumentation() {
    // declared as TestExporter so the exportedSpans field is accessible below
    TestExporter exporter = new TestExporter();

    Tracer tracer = OpenTelemetrySdk.builder()
        .setTracerProvider(SdkTracerProvider.builder()
            .addSpanProcessor(SimpleSpanProcessor.create(exporter)).build()).build()
        .getTracer("test");
    // run test ...

    validateSpans(exporter.exportedSpans);
}

class TestExporter implements SpanExporter {
    public final List<SpanData> exportedSpans = Collections.synchronizedList(new ArrayList<>());

    @Override
    public CompletableResultCode export(Collection<SpanData> spans) {
        exportedSpans.addAll(spans);
        return CompletableResultCode.ofSuccess();
    }
    ...
}