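Contents
- I. Architecture
- II. Approach 1: Automatic Instrumentation
- 2.1 opentelemetry-javaagent.jar (Java 8+)
- 2.2 Attaching opentelemetry-javaagent.jar for automatic instrumentation
- 2.3 Configuring opentelemetry-javaagent.jar
- 2.4 Using annotations (@WithSpan, @SpanAttribute)
- 2.5.1 Using @WithSpan and @SpanAttribute in code
- 2.5.2 Suppressing instrumentation for methods annotated with @WithSpan
- 2.5.3 Creating spans around methods without modifying code
- Setting the agent log level
- III. Approach 2 (Recommended): Automatic Instrumentation plus Manual API
- 3.1 Maven dependencies
- 3.2 Injecting the OTel agent in the Java launch command
- 3.3 Exporters
- 3.4 Adding custom traces and metrics in code
- 3.5 Integrating Logback with OTel
- IV. Approach 3: Manual, with Spring Boot Integration
- 4.1 Maven dependencies
- 4.2 Application configuration
- V. Manual Instrumentation with the OpenTelemetry SDK
- 5.1 Traces
- 5.2 Metrics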
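I. Architecture
Supported integration approaches: automatic instrumentation through the Agent (no code changes required) and manual instrumentation written in code (invasive)
- API: the OTel interface definitions; the implementation is supplied either by depending on the SDK or by agent injection
- annotations: common OTel annotations (such as @WithSpan and @SpanAttribute)
- semconv: common OTel semantic conventions (such as standard attribute names)
- SDK: the concrete OTel implementation
- Instrumentation Libraries: the libraries into which OTel instrumentation is injected
- Resource Detector: detects information about the service itself
Exporter: sends OTel data to the OTel Collector or to a specific monitoring backend
- OTLP Exporter: the exporter for OTLP, the official OTel protocol; it can send data to the OTel Collector or to any other backend that accepts OTLP
- vendor exporters: exporters for specific monitoring backends, each responsible for sending OTel data to its own backend
- Jaeger, Zipkin
- Prometheus
- Skywalking
- …
OTel Collector: the official OTel collector (supports OTLP over gRPC and HTTP)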
II. Approach 1: Automatic Instrumentation
Programming languages that support automatic instrumentation:
- .NET
- Java
- JavaScript
- PHP
- Python
2.1 opentelemetry-javaagent.jar (Java 8+)
The Java agent JAR used for automatic instrumentation can be attached to any Java 8+ application. It dynamically injects bytecode to capture telemetry from many popular libraries and frameworks. It can capture telemetry at the "edges" of an application or service, such as inbound requests, outbound HTTP calls, and database calls.
Supported libraries and frameworks include:
- Spring: Boot, Web, MVC, WebFlux, Cloud Gateway, Batch, Scheduling, Data, MQ (JMS, Kafka, RabbitMQ), Micrometer, RestTemplate
- HTTP clients: HttpClient 2.0+, OkHttp 2.2+, HttpURLConnection, Java Http Client
- Web containers: Servlet 2.2+, Tomcat 7~10, Jetty 9~11, Undertow 1.4+, Netty 3.8+
- Database access: Hibernate 3.3+, JDBC, HikariCP 3.0+, c3p0 0.9.2+, DBCP 2.0+, MongoDB Driver 3.1+, R2DBC 1.0+
- RPC: Dubbo 2.7+, gRPC 1.6+
- Message queues: RabbitMQ 2.7+, Kafka 0.11+, RocketMQ, Pulsar 2.8+, JMS
- Caches: Jedis 1.4+, Lettuce 4.0+, Redisson 3.0+
- Logging: Log4j2 2.11+, Logback 1.0+
- Search engines: ES 5.0+
- Cloud native: AWS Lambda 1.0+, AWS SDK, Azure Core 1.14+, Kubernetes Client 7.0+
- Others: RxJava 1.0+, Reactor 3.1+, Guava, Quartz 2.0+, GraphQL 12.0+, Hystrix, …
For the full list of supported libraries, see:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md
Download: opentelemetry-javaagent.jar
2.2 Attaching opentelemetry-javaagent.jar for automatic instrumentation
Option 1: specify the javaagent in the java launch command:
java -javaagent:path/to/opentelemetry-javaagent.jar -Dotel.service.name=your-service-name -jar myapp.jar
Option 2: configure the javaagent globally through environment variables:
export JAVA_TOOL_OPTIONS="-javaagent:path/to/opentelemetry-javaagent.jar"
export OTEL_SERVICE_NAME="your-service-name"
java -jar myapp.jar
2.3 Configuring opentelemetry-javaagent.jar
Option 1: Java system properties:
java -javaagent:path/to/opentelemetry-javaagent.jar \
     -Dotel.service.name=your-service-name \
     -Dotel.traces.exporter=zipkin \
     -jar myapp.jar
Option 2: environment variables:
OTEL_SERVICE_NAME=your-service-name \
OTEL_TRACES_EXPORTER=zipkin \
java -javaagent:path/to/opentelemetry-javaagent.jar \
     -jar myapp.jar
Option 3: a properties file:
OTEL_JAVAAGENT_CONFIGURATION_FILE=path/to/properties/file.properties \
java -javaagent:path/to/opentelemetry-javaagent.jar \
     -jar myapp.jar
To enable agent debug logging:
-Dotel.javaagent.debug=true
For more opentelemetry-javaagent.jar configuration options, see:
https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md
https://opentelemetry.io/docs/instrumentation/java/automatic/agent-config/
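The system properties and environment variables in 2.3 come in pairs (for example otel.service.name and OTEL_SERVICE_NAME). The environment variable name is the property name in upper case with dots and dashes replaced by underscores. The helper below is a minimal sketch of that naming rule; the class OtelEnvNames is hypothetical and is not part of OpenTelemetry.

```java
import java.util.Locale;

/** Illustrative helper (not part of OpenTelemetry): derives the env var name for a system property. */
final class OtelEnvNames {
    private OtelEnvNames() {}

    /** Upper-cases the property name and replaces '.' and '-' with '_'. */
    static String toEnvVar(String systemProperty) {
        return systemProperty
                .toUpperCase(Locale.ROOT)
                .replace('.', '_')
                .replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(toEnvVar("otel.service.name"));
        System.out.println(toEnvVar("otel.traces.exporter"));
        System.out.println(toEnvVar("otel.javaagent.configuration-file"));
    }
}
```

This is handy when moving a configuration from a launch command into a container or Kubernetes manifest, where environment variables are usually easier to set than JVM flags.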
2.4 Using annotations (@WithSpan, @SpanAttribute)
Add the Maven dependency:
<dependencies>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-instrumentation-annotations</artifactId>
    <version>1.31.0</version>
  </dependency>
</dependencies>
2.5.1 Using @WithSpan and @SpanAttribute in code
import io.opentelemetry.instrumentation.annotations.SpanAttribute;
import io.opentelemetry.instrumentation.annotations.WithSpan;

public class MyClass {
    // A span is created each time this method is called.
    @WithSpan
    public void myMethod() {
        <...>
    }

    // Each annotated parameter is recorded as an attribute on the span.
    @WithSpan
    public void myMethod(@SpanAttribute("parameter1") String parameter1,
                         @SpanAttribute("parameter2") long parameter2) {
        <...>
    }
}
2.5.2 Suppressing instrumentation for methods annotated with @WithSpan
-Dotel.instrumentation.opentelemetry-instrumentation-annotations.exclude-methods=my.package.MyClass1[method1,method2];my.package.MyClass2[method3]
2.5.3 Creating spans around methods without modifying code
Use this when the source code cannot be modified:
-Dotel.instrumentation.methods.include=my.package.MyClass1[method1,method2];my.package.MyClass2[method3]
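Both properties above use the same value format: entries separated by semicolons, each a fully qualified class name followed by a comma-separated method list in square brackets. The sketch below parses that format so a value can be checked before deployment. MethodIncludeSpec is a hypothetical helper written for this article; it is not the agent's own parser and may be stricter or looser than the agent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative parser for values such as "pkg.A[m1,m2];pkg.B[m3]" (not the agent's implementation). */
final class MethodIncludeSpec {
    private MethodIncludeSpec() {}

    /** Returns class name -> method names, in the order they appear in the spec. */
    static Map<String, List<String>> parse(String spec) {
        Map<String, List<String>> byClass = new LinkedHashMap<>();
        for (String entry : spec.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing ';'
            }
            int open = trimmed.indexOf('[');
            if (open <= 0 || !trimmed.endsWith("]")) {
                throw new IllegalArgumentException("Malformed entry: " + trimmed);
            }
            String className = trimmed.substring(0, open).trim();
            String methodList = trimmed.substring(open + 1, trimmed.length() - 1);
            List<String> methods = new ArrayList<>();
            for (String method : methodList.split(",")) {
                if (!method.trim().isEmpty()) {
                    methods.add(method.trim());
                }
            }
            byClass.put(className, methods);
        }
        return byClass;
    }

    public static void main(String[] args) {
        System.out.println(parse("my.package.MyClass1[method1,method2];my.package.MyClass2[method3]"));
    }
}
```

When the property is passed on a shell command line, quote the whole -D argument, because an unquoted ';' ends the command in most shells.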
Setting the agent log level
https://opentelemetry.io/docs/instrumentation/java/automatic/agent-config/#java-agent-logging-output
The agent's logging output can be configured by setting the following property:
System property: otel.javaagent.logging
Description: The Java agent logging mode. The following 3 modes are supported:
- simple: The agent will print out its logs using the standard error stream. Only INFO or higher logs will be printed. This is the default Java agent logging mode.
- none: The agent will not log anything, not even its own version.
- application: The agent will attempt to redirect its own logs to the instrumented application's slf4j logger. This works the best for simple one-jar applications that do not use multiple classloaders; Spring Boot apps are supported as well. The Java agent output logs can be further configured using the instrumented application's logging configuration (e.g. logback.xml or log4j2.xml). Make sure to test that this mode works for your application before running it in a production environment.
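III. Approach 2 (Recommended): Automatic Instrumentation plus Manual API
The application is instrumented by attaching opentelemetry-javaagent.jar as an agent.
The application code depends only on the OTel API (no SDK implementation dependency is needed).
The Logback OTel Appender dependencies are an optional extra and can be removed if not required.
Note:
This is the most recommended approach.
It keeps agent injection and minimizes code intrusion.
The OTel API is used only where custom traces or metrics are needed.
Local development and local runs are unaffected even when the agent is not attached.
In production, the agent is attached through the Docker image, a Kubernetes volume mount, or similar.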
3.1 Maven dependencies
<properties>
  <otel.version>1.32.0</otel.version>
  <otel.springboot.version>1.32.0-alpha</otel.springboot.version>
  <otel.logback.version>1.32.0-alpha</otel.logback.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-bom</artifactId>
      <version>${otel.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-instrumentation-annotations</artifactId>
      <version>${otel.version}</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-appender-1.0</artifactId>
      <version>${otel.logback.version}</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
      <version>${otel.logback.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-api</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-instrumentation-annotations</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-logback-appender-1.0</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
  </dependency>
</dependencies>
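3.2 Injecting the OTel agent in the Java launch command
Attach and configure opentelemetry-javaagent.jar in the launch command:
# -Dotel.traces.exporter=logging   print traces to the console log
# -Dotel.metrics.exporter=logging  print metrics to the console log
# -Dotel.logs.exporter=none        disable log export (with "logging", each line is printed once by the
#                                  logging framework such as Logback and again by the OTel logging exporter,
#                                  which clutters the console, so it is disabled for now)
java -javaagent:D:/programs/Java/OTel/opentelemetry-javaagent.jar \
     -Dotel.traces.exporter=logging \
     -Dotel.metrics.exporter=logging \
     -Dotel.logs.exporter=none \
     -jar myapp.jar
The following configuration exports traces, metrics, and logs to the OTel Collector over OTLP:
java -javaagent:D:/programs/Java/OTel/opentelemetry-javaagent.jar \
     -Dotel.traces.exporter=otlp \
     -Dotel.metrics.exporter=otlp \
     -Dotel.logs.exporter=otlp \
     -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
     -jar myapp.jar
The following configuration exports traces to Jaeger:
Note:
This approach is deprecated. The latest Jaeger versions embed the OTel Collector
and can receive data directly over OTLP.
# -Dotel.traces.exporter=jaeger    export traces to Jaeger (adjust the Jaeger endpoint to your environment)
# -Dotel.metrics.exporter=logging  print metrics to the console log
# -Dotel.logs.exporter=none        disable log export
java -javaagent:D:/programs/Java/OTel/opentelemetry-javaagent.jar \
     -Dotel.traces.exporter=jaeger \
     -Dotel.exporter.jaeger.endpoint=http://10.170.xx.xxx:xxx \
     -Dotel.exporter.jaeger.timeout=10000 \
     -Dotel.metrics.exporter=logging \
     -Dotel.logs.exporter=none \
     -jar myapp.jar
Traces exported to Jaeger, as displayed in the Jaeger UI (screenshot not included).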
3.3 Exporters
Available exporters:
- OTLP exporter
- Logging exporter
- Logging OTLP JSON exporter
- Jaeger exporter
- Zipkin exporter
- Prometheus exporter
For more exporter configuration options, see:
https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk-extensions/autoconfigure/README.md#exporters
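3.4 Adding custom traces and metrics in code
OTel utility class:
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.OpenTelemetry;
import java.util.Objects;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

/**
 * OpenTelemetry utility class
 *
 * @author luohq
 * @date 2023-11-20 10:01
 */
@Component
public class OTelUtils {

    private static final Logger log = LoggerFactory.getLogger(OTelUtils.class);

    /**
     * OTel instance
     */
    private static OpenTelemetry OTEL_INSTANCE;

    /**
     * Compatible with auto-injection by the OTel Spring Boot starter
     *
     * @param openTelemetry the OTel instance provided by Spring Boot
     */
    @Autowired(required = false)
    public void setOpenTelemetry(OpenTelemetry openTelemetry) {
        OTelUtils.OTEL_INSTANCE = openTelemetry;
    }

    /**
     * Get the OTel instance
     *
     * @return the OTel instance
     */
    public static OpenTelemetry openTelemetry() {
        // Prefer the auto-injected OTel instance
        if (Objects.nonNull(OTEL_INSTANCE)) {
            return OTEL_INSTANCE;
        }
        // Otherwise fall back to the global OTel instance (set up by the agent)
        return GlobalOpenTelemetry.get();
    }
}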
Using OTel in business code:
// Create a span with @WithSpan and @SpanAttribute
@WithSpan(value = "Manual::GoodsService::findGoodsPage")
@Override
public RespResult<GoodsVo> findGoodsPage(@SpanAttribute GoodsPageQueryDto goodsPageQueryDto) {
    /** Get the current span */
    Span curSpan = Span.current();
    curSpan.setAttribute("attr.custom", "luohq-test-svc");

    /** Custom metric */
    Meter meter = OTelUtils.openTelemetry()
            .meterBuilder("GoodsService::findGoodsPage")
            .setInstrumentationVersion("v1.0")
            .build();
    // Build a counter
    LongCounter findCounter = meter.counterBuilder("findGoodsPage.count")
            .setDescription("FindGoodsPage Sum Count")
            .setUnit("1")
            .build();
    findCounter.add(1);

    /** Custom span */
    Tracer tracer = OTelUtils.openTelemetry().getTracer(GoodsMapper.class.getSimpleName());
    Span daoSpan = tracer.spanBuilder("Manual::GoodsMapper::findGoodsWithCNamePage").startSpan();
    daoSpan.setAttribute("attr.custom", "luohq-test-dao");
    try (Scope scope = daoSpan.makeCurrent()) {
        // Original business logic
        log.info("findGoodsPage param: {}", goodsPageQueryDto);
        IPage<GoodsVo> goodsPage = this.goodsMapper.findGoodsWithCNamePage(this.toPage(goodsPageQueryDto), goodsPageQueryDto);
        log.info("findGoodsPage result: {}", goodsPage);
        return RespResult.successRows(goodsPage.getTotal(), goodsPage.getRecords());
    } catch (Throwable throwable) {
        // Set the span status
        daoSpan.setStatus(StatusCode.ERROR, "Something bad happened!");
        // Record the exception stack trace
        daoSpan.recordException(throwable);
        return RespResult.failed();
    } finally {
        daoSpan.end();
    }
}
3.5 Integrating Logback with OTel
The logging configuration below relies on spring.profiles.active to activate either the otel or the otel-mdc profile.
If profiles are not needed, remove the springProfile elements and place the corresponding appender configuration directly under configuration.
Both otel and otel-mdc integrate Logback with OTel and export logs to the configured backend (by default, the OTel Collector).
Compared with otel, otel-mdc additionally uses MDC (Mapped Diagnostic Context) to add trace_id, span_id, and trace_flags to the log output.
The logback-spring.xml configuration:
<configuration scan="true" scanPeriod="5 seconds">

  <springProfile name="default">
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>
    <root level="INFO">
      <appender-ref ref="console"/>
    </root>
  </springProfile>

  <springProfile name="otel">
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>
    <appender name="OpenTelemetry"
              class="io.opentelemetry.instrumentation.logback.appender.v1_0.OpenTelemetryAppender">
    </appender>
    <root level="INFO">
      <appender-ref ref="console"/>
      <appender-ref ref="OpenTelemetry"/>
    </root>
  </springProfile>

  <springProfile name="otel-mdc">
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <!--%d{yyyy-MM-dd HH:mm:ss.SSS} trace_id=%X{trace_id} span_id=%X{span_id} trace_flags=%X{trace_flags} [%thread] %-5level %logger{36} - %msg%n-->
        <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{trace_id}:%X{span_id}:%X{trace_flags}] [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>
    <appender name="otel" class="io.opentelemetry.instrumentation.logback.mdc.v1_0.OpenTelemetryAppender">
      <appender-ref ref="console"/>
    </appender>
    <root level="INFO">
      <appender-ref ref="otel"/>
    </root>
  </springProfile>

</configuration>
For the definition of trace_flags, see:
https://www.w3.org/TR/trace-context/#trace-flags
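In the log pattern above, trace_flags appears as a two-digit hexadecimal field. Per the W3C Trace Context specification, its least significant bit is the "sampled" flag, so 01 means the trace was sampled and 00 means it was not. The sketch below decodes that field from a log line; TraceFlagsDemo is a hypothetical helper for illustration, not an OpenTelemetry class.

```java
/** Illustrative decoder for the two-hex-digit trace_flags field (not an OpenTelemetry class). */
final class TraceFlagsDemo {
    /** Bit 0 of trace-flags is the "sampled" flag in W3C Trace Context. */
    private static final int SAMPLED_BIT = 0x01;

    private TraceFlagsDemo() {}

    static boolean isSampled(String traceFlagsHex) {
        if (traceFlagsHex == null || !traceFlagsHex.matches("[0-9a-fA-F]{2}")) {
            throw new IllegalArgumentException("trace_flags must be two hex digits: " + traceFlagsHex);
        }
        int flags = Integer.parseInt(traceFlagsHex, 16);
        return (flags & SAMPLED_BIT) != 0;
    }

    public static void main(String[] args) {
        System.out.println("01 sampled? " + isSampled("01"));
        System.out.println("00 sampled? " + isSampled("00"));
    }
}
```

A log line whose trace_flags is 00 can still carry a valid trace_id and span_id; it only means the corresponding spans were not selected for export.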
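IV. Approach 3: Manual, with Spring Boot Integration
The Spring Boot application depends on opentelemetry-spring-boot-starter.
Because the starter itself depends on the OTel SDK, no agent injection is needed.
The starter bundles the OTel API and SDK and auto-configures OpenTelemetry.
The starter pulls in the following dependencies: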
- opentelemetry-api
- opentelemetry-instrumentation-api-semconv
- opentelemetry-sdk
- opentelemetry-exporter-otlp
- opentelemetry-exporter-logging
- opentelemetry-logback-appender-1.0
- other instrumentation libs
- web
- webmvc
- webflux
- jdbc
- kafka
- logback / log4j appender
- micrometer
- …
4.1 Maven dependencies
<properties>
  <otel.version>1.32.0</otel.version>
  <otel.springboot.version>1.32.0-alpha</otel.springboot.version>
  <otel.logback.version>1.32.0-alpha</otel.logback.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-bom</artifactId>
      <version>${otel.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-instrumentation-annotations</artifactId>
      <version>${otel.version}</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-appender-1.0</artifactId>
      <version>${otel.logback.version}</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
      <version>${otel.logback.version}</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry.instrumentation</groupId>
      <artifactId>opentelemetry-spring-boot-starter</artifactId>
      <version>${otel.springboot.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-logback-appender-1.0</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-jaeger</artifactId>
  </dependency>
</dependencies>
4.2 Application configuration
application.yaml:
otel:
  # Exporter configuration
  exporter:
    # Export over OTLP
    otlp:
      enabled: false
      endpoint: http://localhost:4317
      timeout: 10s
    # Export to Zipkin
    zipkin:
      enabled: false
      endpoint: http://localhost:9411/api/v2/spans
    # Export to Jaeger
    jaeger:
      enabled: true
      endpoint: http://10.170.xx.xxx:xxxx
      timeout: 10s
    # Export to the log
    logging:
      enabled: true
  traces:
    sampler:
      # Sampling probability
      probability: 1.0
For more configuration details, see:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/spring/spring-boot-autoconfigure/README.md#features
For custom traces and metrics and for Logback integration with OTel, see Approach 2.
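The probability setting under otel.traces.sampler in 4.2 controls what fraction of traces is kept: 1.0 keeps everything and 0.0 keeps nothing. The sketch below illustrates one common way such a setting is applied, a trace-ID-ratio decision, under the assumption that the starter uses a sampler of that kind. RatioSamplerSketch is a hypothetical class loosely modelled on that idea; it is not the SDK's sampler.

```java
/** Illustrative trace-ID-ratio sampler (an assumption-based sketch, not the SDK implementation). */
final class RatioSamplerSketch {
    private final double probability;
    private final long idUpperBound;

    RatioSamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0.0, 1.0]");
        }
        this.probability = probability;
        // Trace IDs whose lower 64 bits fall below this bound are sampled.
        this.idUpperBound = (long) (probability * Long.MAX_VALUE);
    }

    /** Decides from the trace ID alone, so every service reaches the same decision for a trace. */
    boolean shouldSample(String traceIdHex) {
        if (traceIdHex == null || !traceIdHex.matches("[0-9a-f]{32}")) {
            throw new IllegalArgumentException("trace id must be 32 lowercase hex digits");
        }
        if (probability == 1.0) {
            return true;
        }
        if (probability == 0.0) {
            return false;
        }
        // Use the lower 64 bits (the last 16 hex digits) as the pseudo-random input.
        long lowerBits = Long.parseUnsignedLong(traceIdHex.substring(16), 16);
        return Math.abs(lowerBits) < idUpperBound;
    }

    public static void main(String[] args) {
        RatioSamplerSketch half = new RatioSamplerSketch(0.5);
        System.out.println(half.shouldSample("4bf92f3577b34da60000000000000001"));
        System.out.println(half.shouldSample("4bf92f3577b34da67fffffffffffffff"));
    }
}
```

Because the decision depends only on the trace ID, a value below 1.0 drops whole traces rather than individual spans, which keeps the remaining traces complete.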
V. Manual Instrumentation with the OpenTelemetry SDK
5.1 Traces
// ...
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
// SemanticAttributes comes from the semconv artifact; its package depends on the semconv version in use.

public class Dice {

    private int min;
    private int max;
    private Tracer tracer;

    // The OpenTelemetry object is injected
    public Dice(int min, int max, OpenTelemetry openTelemetry) {
        this.min = min;
        this.max = max;
        // Obtain a Tracer
        this.tracer = openTelemetry.getTracer(Dice.class.getName(), "0.1.0");
    }

    public Dice(int min, int max) {
        this(min, max, OpenTelemetry.noop());
    }

    public List<Integer> rollTheDice(int rolls) {
        // Create a span through the Tracer
        Span parentSpan = tracer.spanBuilder("parent")
                // Add links to related contexts
                //.addLink(parentSpan1.getSpanContext())
                //.addLink(parentSpan2.getSpanContext())
                //.addLink(remoteSpanContext)
                .startSpan();
        // Add attributes to the span
        parentSpan.setAttribute(SemanticAttributes.HTTP_METHOD, "GET");
        parentSpan.setAttribute(SemanticAttributes.HTTP_URL, "/rolldice");
        // Add an event to the span
        Attributes eventAttributes = Attributes.of(
                AttributeKey.stringKey("key"), "value",
                AttributeKey.longKey("result"), 0L);
        parentSpan.addEvent("End Computation", eventAttributes);
        // Make the span current and do the work
        try (Scope scope = parentSpan.makeCurrent()) {
            List<Integer> results = new ArrayList<Integer>();
            for (int i = 0; i < rolls; i++) {
                // Nested span
                results.add(this.rollOnce());
            }
            return results;
        } catch (Throwable throwable) {
            // Set the span status
            parentSpan.setStatus(StatusCode.ERROR, "Something bad happened!");
            // Record the exception stack trace
            parentSpan.recordException(throwable);
            // Rethrow so the caller still sees the failure
            throw throwable;
        } finally {
            // End the span
            parentSpan.end();
        }
    }

    private int rollOnce() {
        // Child span
        Span childSpan = tracer.spanBuilder("child")
                // NOTE: setParent(...) is not required;
                // `Span.current()` is automatically added as the parent
                .startSpan();
        try (Scope scope = childSpan.makeCurrent()) {
            return ThreadLocalRandom.current().nextInt(this.min, this.max + 1);
        } finally {
            // End the span
            childSpan.end();
        }
    }
}
Context propagation example:
// Reading context (extract context information from request headers, such as the traceparent header)
TextMapGetter<HttpHeaders> getter =
        new TextMapGetter<HttpHeaders>() {
            @Override
            public String get(HttpHeaders headers, String s) {
                assert headers != null;
                return headers.getHeaderString(s);
            }

            @Override
            public Iterable<String> keys(HttpHeaders headers) {
                List<String> keys = new ArrayList<>();
                MultivaluedMap<String, String> requestHeaders = headers.getRequestHeaders();
                requestHeaders.forEach((k, v) -> {
                    keys.add(k);
                });
                return keys;
            }
        };

// Writing context (write context information into the request, such as the traceparent header)
TextMapSetter<HttpURLConnection> setter =
        new TextMapSetter<HttpURLConnection>() {
            @Override
            public void set(HttpURLConnection carrier, String key, String value) {
                // Insert the context as a header
                carrier.setRequestProperty(key, value);
            }
        };

//...
public void handle(<Library Specific Annotation> HttpHeaders headers) throws IOException {
    // Extract the context from the incoming request
    Context extractedContext = openTelemetry.getPropagators().getTextMapPropagator()
            .extract(Context.current(), headers, getter);
    // Use the extracted context
    try (Scope scope = extractedContext.makeCurrent()) {
        // Automatically use the extracted SpanContext as parent.
        Span serverSpan = tracer.spanBuilder("GET /resource")
                .setSpanKind(SpanKind.SERVER)
                .startSpan();
        try (Scope ignored = serverSpan.makeCurrent()) {
            // Add the attributes defined in the Semantic Conventions
            serverSpan.setAttribute(SemanticAttributes.HTTP_METHOD, "GET");
            serverSpan.setAttribute(SemanticAttributes.HTTP_SCHEME, "http");
            serverSpan.setAttribute(SemanticAttributes.HTTP_HOST, "localhost:8080");
            serverSpan.setAttribute(SemanticAttributes.HTTP_TARGET, "/resource");

            HttpURLConnection transportLayer = (HttpURLConnection) url.openConnection();
            // Set the context on the new outgoing request:
            // inject the *current* Context, which contains our current Span.
            openTelemetry.getPropagators().getTextMapPropagator()
                    .inject(Context.current(), transportLayer, setter);
            // Make outgoing call
        } finally {
            serverSpan.end();
        }
    }
}
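To make the inject and extract steps above more concrete, the sketch below shows what travels in the traceparent header, using a plain Map as the carrier. It is a toy stand-in written only with the JDK: TraceParentCodec is a hypothetical class, it handles only version 00 of the header, and it skips checks a real propagator performs (for example rejecting all-zero IDs and matching header names case-insensitively).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a W3C trace-context propagator (not the OpenTelemetry implementation). */
final class TraceParentCodec {
    static final String HEADER = "traceparent";

    private TraceParentCodec() {}

    /** Writes a version-00 header: 00-<trace-id>-<parent-id>-<trace-flags>. */
    static void inject(Map<String, String> carrier, String traceId, String spanId, boolean sampled) {
        carrier.put(HEADER, "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00"));
    }

    /** Returns {traceId, spanId, traceFlags}, or null when the header is absent or malformed. */
    static String[] extract(Map<String, String> carrier) {
        String value = carrier.get(HEADER);
        if (value == null) {
            return null;
        }
        String[] parts = value.trim().split("-");
        boolean valid = parts.length == 4
                && parts[0].equals("00")
                && parts[1].matches("[0-9a-f]{32}")
                && parts[2].matches("[0-9a-f]{16}")
                && parts[3].matches("[0-9a-f]{2}");
        return valid ? new String[] {parts[1], parts[2], parts[3]} : null;
    }

    public static void main(String[] args) {
        // Client side: write the current span's identity into the outgoing headers.
        Map<String, String> headers = new HashMap<>();
        inject(headers, "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
        System.out.println(headers.get(HEADER));

        // Server side: read it back; the extracted span ID becomes the new span's parent.
        String[] context = extract(headers);
        System.out.println(context[0] + " / " + context[1] + " / " + context[2]);
    }
}
```

In the real API the TextMapSetter and TextMapGetter play the role that Map.put and Map.get play here, which is why they only need to know how to write and read a single header on the carrier type.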
5.2 Metrics
Instrument types:
- LongCounter / DoubleCounter – Sync/Async
- LongUpDownCounter / DoubleUpDownCounter – Sync/Async
- LongGauge / DoubleGauge – Async
- LongHistogram / DoubleHistogram – Sync
OpenTelemetry openTelemetry = // obtain instance of OpenTelemetry

// Gets or creates a named meter instance
Meter meter = openTelemetry.meterBuilder("instrumentation-library-name")
        .setInstrumentationVersion("1.0.0")
        .build();

// Build counter e.g. LongCounter
LongCounter counter = meter
        .counterBuilder("processed_jobs")
        .setDescription("Processed jobs")
        .setUnit("1")
        .build();

// It is recommended that the API user keep a reference to Attributes they will record against
Attributes attributes = Attributes.of(AttributeKey.stringKey("Key"), "SomeWork");

// Record data
counter.add(123, attributes);

// Build an asynchronous instrument, e.g. Gauge
meter
        .gaugeBuilder("cpu_usage")
        .setDescription("CPU Usage")
        .setUnit("ms")
        .buildWithCallback(measurement -> {
            measurement.record(getCpuUsage(), Attributes.of(AttributeKey.stringKey("Key"), "SomeWork"));
        });
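The Sync/Async distinction in the list above is about who triggers the measurement. A synchronous instrument such as a counter is updated by application code at the moment something happens; an asynchronous instrument such as a gauge only registers a callback, which is invoked when metrics are collected. The toy model below illustrates that difference with JDK classes only. ToyMeter and its methods are hypothetical and do not mirror the OpenTelemetry API.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.DoubleSupplier;

/** Toy model of sync vs. async instruments (illustration only, not the OpenTelemetry API). */
final class ToyMeter {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
    private final Map<String, DoubleSupplier> gauges = new ConcurrentHashMap<>();

    /** Synchronous: the caller records the value when the event happens. */
    void add(String counterName, long value) {
        counters.computeIfAbsent(counterName, name -> new LongAdder()).add(value);
    }

    /** Asynchronous: only a callback is stored; nothing is measured yet. */
    void registerGauge(String gaugeName, DoubleSupplier callback) {
        gauges.put(gaugeName, callback);
    }

    /** One collection cycle: read counter sums and invoke every gauge callback once. */
    Map<String, Double> collect() {
        Map<String, Double> snapshot = new TreeMap<>();
        counters.forEach((name, adder) -> snapshot.put(name, (double) adder.sum()));
        gauges.forEach((name, callback) -> snapshot.put(name, callback.getAsDouble()));
        return snapshot;
    }

    public static void main(String[] args) {
        ToyMeter meter = new ToyMeter();
        meter.registerGauge("cpu_usage", () -> 0.5); // fixed value stands in for a real reading
        meter.add("processed_jobs", 100);
        meter.add("processed_jobs", 23);
        System.out.println(meter.collect());
    }
}
```

This is why a gauge callback should be cheap and side-effect free: it runs on every collection cycle rather than when the application chooses.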