.NET and Java: Full-Link Tracing Based on Zipkin

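An earlier post, "Comparing the distributed tracing architectures of the major vendors", compared the main frameworks. The free options include Zipkin and Pinpoint, plus one that post forgot to mention: SkyWalking. For an introduction to it, see https://github.com/apache/incubator-skywalking/blob/master/README_ZH.md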

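The requirement is that tracing must cover both the .NET and Java platforms. On Java, every one of these components is supported natively. For .NET, I went through the open-source clients and found that the demos for Pinpoint and SkyWalking all target .NET Core (for SkyWalking, search GitHub for skywalking-netcore; I have no good recommendation for Pinpoint). That demands a newer runtime, and changing the .NET Framework version of the existing platform is not an option. Zipkin has the open-source projects Medidata.zipkinTracerModule, zipkin.net and zipkin-csharp, which are usually recommended online in that order. In my tests, Medidata.zipkinTracerModule and zipkin.net also turned out to target .NET Core, and installing them from NuGet failed with errors. zipkin-csharp (https://github.com/openzipkin-attic/zipkin-csharp) was the one that finally worked. Search NuGet for Zipkin.Core; only one version is currently published.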

Next, look at the code of the bundled demo, zipkin-csharp/examples/ZipkinExample/Program.cs:

using System;
using System.Net;
using System.Threading;
using Zipkin;
using Zipkin.Tracer.Kafka;
namespace ZipkinExample
{
    class Program
    {
        static void Main(string[] args)
        {
            var random = new Random();
            // make sure Zipkin with Scribe client is working
            //var collector = new HttpCollector(new Uri("http://localhost:9411/"));
            var collector = new KafkaCollector(KafkaSettings.Default);
            var traceId = new TraceHeader(traceId: (ulong)random.Next(), spanId: (ulong)random.Next());
            var span = new Span(traceId, new IPEndPoint(IPAddress.Loopback, 9000), "test-service");
            span.Record(Annotations.ClientSend(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ServerReceive(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ServerSend(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ClientReceive(DateTime.UtcNow));
            collector.CollectAsync(span).Wait();
        }
    }
}

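As you can see, both the traceId and the spanId here are randomly generated. I recommend generating the IDs yourself. Note that they are of type ulong. In the helper below, the fractional seconds are formatted to only two digits, because the database column is bigint(20) and a longer value would overflow it. A more robust method can be used instead.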

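        // Shared generator: on .NET Framework, new Random() is time-seeded, so
        // instances created in quick succession can return the same values.
        private static readonly Random IdRandom = new Random();

        /// <summary>
        /// Builds a 19-digit ID: a 16-digit timestamp (yyyyMMddHHmmssff)
        /// followed by a 3-digit random suffix.
        /// </summary>
        /// <returns>An ID that fits both ulong and a signed 64-bit bigint column.</returns>
        private static ulong getRandom()
        {
            int suffix;
            lock (IdRandom)
            {
                suffix = IdRandom.Next(100, 1000); // upper bound is exclusive: 100..999
            }
            return ulong.Parse(DateTime.Now.ToString("yyyyMMddHHmmssff") + suffix);
        }
    }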
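
As one possible "more robust method" on the Java side, an ID can simply be a random 64-bit value. The sketch below is an illustration under stated assumptions, not code from zipkin-csharp or from any Zipkin library: the class name SpanIds and its methods are made up for this example, and skipping zero is a choice made here so that zero can mean "unset". It also shows the two forms such an ID takes: 16 lower-case hex digits, which is how Zipkin writes IDs in JSON, and a signed 64-bit number, which is what a bigint column holds (IDs with the top bit set show up there as negative numbers).

```java
import java.util.concurrent.ThreadLocalRandom;

/** Sketch: random 64-bit span/trace IDs and their hex and signed forms. */
final class SpanIds {

    /** Returns a random non-zero 64-bit ID (zero is skipped so it can mean "unset"). */
    static long nextId() {
        long id;
        do {
            id = ThreadLocalRandom.current().nextLong();
        } while (id == 0L);
        return id;
    }

    /** Formats an ID as 16 lower-case hex digits (two's complement for negative values). */
    static String toLowerHex(long id) {
        return String.format("%016x", id);
    }

    /** Parses a hex ID back into the same 64-bit value. */
    static long fromLowerHex(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    public static void main(String[] args) {
        long id = nextId();
        System.out.println("hex:      " + toLowerHex(id));
        System.out.println("unsigned: " + Long.toUnsignedString(id)); // the ulong view
        System.out.println("signed:   " + id);                        // the bigint view
    }
}
```

Because all 64 bits are random, two IDs colliding within one trace is very unlikely, and nothing depends on the clock or on how many digits a timestamp format produces.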

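For the collector, use HTTP here: comment out the Kafka line and uncomment the HTTP one. Also remove the Wait from collector.CollectAsync(span).Wait();. Keep in mind that in a console program the process can exit before a send that is not waited on has finished, so restore the Wait (or await the task) if spans go missing.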

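Basic Zipkin concepts

Span: the basic unit of work. Each call in the chain (an RPC, a DB query, and so on, with no particular restriction) creates a span, identified by a 64-bit ID. A span usually carries other data as well, such as a description, timestamps, key-value tags (annotations), and a parent-id. The parent-id indicates where in the call chain the span came from. Put simply, a span is the record of one request.

Trace: a tree-like collection of spans that represents one call chain. It has a unique identifier, the TraceId.

Annotation: records information about a specific event in a request (for example, when it happened). There are usually four annotations:

cs - Client Send: the client sends the request

sr - Server Receive: the server receives the request

ss - Server Send: the server finishes processing and sends the result back to the client

cr - Client Receive: the client receives the response from the server

BinaryAnnotation: carries additional information, usually as key-value pairs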
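
To make the four annotations concrete, here is a small sketch of the timings that can be derived from them. It is only an illustration: the class AnnotationTimings and its method names are invented for this example, and the timestamps are assumed to be epoch microseconds. Each difference is taken between two timestamps from the same side (cr and cs from the client, ss and sr from the server), so a clock offset between the two hosts does not distort the result.

```java
/** Sketch: timings derived from the four core annotation timestamps (microseconds). */
final class AnnotationTimings {
    final long cs; // client send
    final long sr; // server receive
    final long ss; // server send
    final long cr; // client receive

    AnnotationTimings(long cs, long sr, long ss, long cr) {
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    /** Total time as seen by the client: cr - cs. */
    long clientDuration() {
        return cr - cs;
    }

    /** Time the server spent handling the request: ss - sr. */
    long serverDuration() {
        return ss - sr;
    }

    /** Time spent outside the server (network and queueing, both directions). */
    long networkTime() {
        return clientDuration() - serverDuration();
    }

    public static void main(String[] args) {
        // Same spacing as the demo: 100 ms between consecutive annotations.
        AnnotationTimings t = new AnnotationTimings(0L, 100_000L, 200_000L, 300_000L);
        System.out.println("client:  " + t.clientDuration() + " us");
        System.out.println("server:  " + t.serverDuration() + " us");
        System.out.println("network: " + t.networkTime() + " us");
    }
}
```

With the demo's spacing, the client sees 300000 microseconds in total, of which 100000 are spent in the server and the remaining 200000 outside it.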

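Starting the server for testing

Download the jar of the latest stable release, release-2.7.1, from https://github.com/openzipkin/zipkin/releases. Records are stored in MySQL here, so create a database named zipkin and then create the tables: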

SET FOREIGN_KEY_CHECKS=0;
-- ----------------------------
-- Table structure for `zipkin_annotations`
-- ----------------------------
DROP TABLE IF EXISTS `zipkin_annotations`;
CREATE TABLE `zipkin_annotations` (
  `trace_id_high` bigint(20) NOT NULL DEFAULT '0' COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` bigint(20) NOT NULL COMMENT 'coincides with zipkin_spans.trace_id',
  `span_id` bigint(20) NOT NULL COMMENT 'coincides with zipkin_spans.id',
  `a_key` varchar(255) NOT NULL COMMENT 'BinaryAnnotation.key or Annotation.value if type == -1',
  `a_value` blob COMMENT 'BinaryAnnotation.value(), which must be smaller than 64KB',
  `a_type` int(11) NOT NULL COMMENT 'BinaryAnnotation.type() or -1 if Annotation',
  `a_timestamp` bigint(20) DEFAULT NULL COMMENT 'Used to implement TTL; Annotation.timestamp or zipkin_spans.timestamp',
  `endpoint_ipv4` int(11) DEFAULT NULL COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_ipv6` binary(16) DEFAULT NULL COMMENT 'Null when Binary/Annotation.endpoint is null, or no IPv6 address',
  `endpoint_port` smallint(6) DEFAULT NULL COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_service_name` varchar(255) DEFAULT NULL COMMENT 'Null when Binary/Annotation.endpoint is null',
  UNIQUE KEY `trace_id_high` (`trace_id_high`,`trace_id`,`span_id`,`a_key`,`a_timestamp`) COMMENT 'Ignore insert on duplicate',
  KEY `trace_id_high_2` (`trace_id_high`,`trace_id`,`span_id`) COMMENT 'for joining with zipkin_spans',
  KEY `trace_id_high_3` (`trace_id_high`,`trace_id`) COMMENT 'for getTraces/ByIds',
  KEY `endpoint_service_name` (`endpoint_service_name`) COMMENT 'for getTraces and getServiceNames',
  KEY `a_type` (`a_type`) COMMENT 'for getTraces',
  KEY `a_key` (`a_key`) COMMENT 'for getTraces',
  KEY `trace_id` (`trace_id`,`span_id`,`a_key`) COMMENT 'for dependencies job'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;
-- ----------------------------
-- Records of zipkin_annotations
-- ----------------------------
-- ----------------------------
-- Table structure for `zipkin_dependencies`
-- ----------------------------
DROP TABLE IF EXISTS `zipkin_dependencies`;
CREATE TABLE `zipkin_dependencies` (
  `day` date NOT NULL,
  `parent` varchar(255) NOT NULL,
  `child` varchar(255) NOT NULL,
  `call_count` bigint(20) DEFAULT NULL,
  `error_count` bigint(20) DEFAULT NULL,
  UNIQUE KEY `day` (`day`,`parent`,`child`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;
-- ----------------------------
-- Records of zipkin_dependencies
-- ----------------------------
-- ----------------------------
-- Table structure for `zipkin_spans`
-- ----------------------------
DROP TABLE IF EXISTS `zipkin_spans`;
CREATE TABLE `zipkin_spans` (
  `trace_id_high` bigint(20) NOT NULL DEFAULT '0' COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` bigint(20) NOT NULL,
  `id` bigint(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `parent_id` bigint(20) DEFAULT NULL,
  `debug` bit(1) DEFAULT NULL,
  `start_ts` bigint(20) DEFAULT NULL COMMENT 'Span.timestamp(): epoch micros used for endTs query and to implement TTL',
  `duration` bigint(20) DEFAULT NULL COMMENT 'Span.duration(): micros used for minDuration and maxDuration query',
  UNIQUE KEY `trace_id_high` (`trace_id_high`,`trace_id`,`id`) COMMENT 'ignore insert on duplicate',
  KEY `trace_id_high_2` (`trace_id_high`,`trace_id`,`id`) COMMENT 'for joining with zipkin_annotations',
  KEY `trace_id_high_3` (`trace_id_high`,`trace_id`) COMMENT 'for getTracesByIds',
  KEY `name` (`name`) COMMENT 'for getTraces and getSpanNames',
  KEY `start_ts` (`start_ts`) COMMENT 'for getTraces ordering and range'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED;
-- ----------------------------
-- Records of zipkin_spans
-- ----------------------------

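Startup

Start the server from the directory that contains the jar, and pay attention to the parameters. To store the data in Elasticsearch instead, change them as described in the official documentation.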

java -jar zipkin-server-2.7.1.jar --STORAGE_TYPE=mysql --MYSQL_DB=zipkin --MYSQL_USER=root --MYSQL_PASS=123456 --MYSQL_HOST=localhost --MYSQL_TCP_PORT=3306

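The startup log in the console indicates that the server has started successfully.

Once it is running, open http://localhost:9411/ in a browser.

At this point the server and its UI are up. The functionality is still quite basic; consult other resources for detailed usage.

The services used for testing here come from source code shared by another developer: mircoservice, a distributed tracing system (zipkin+springboot), https://github.com/dreamerkr/mircoservice. The accompanying article is "Distributed tracing for microservices (springboot+zipkin)", https://blog.csdn.net/qq_21387171/article/details/53787019

Result of running the 4 client services with their default configuration:

(1) Start each service, then call service 1 by opening http://localhost:8081/service1/test in a browser.

(2) Open the Zipkin address to see the list of traces, one per request.

Click a trace to see its tree structure, including the time spent in each service.

Click a span to see its latency details.

The dependencies between the services can be viewed as well.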

Testing a .NET program

Change the demo code to:

static void Main(string[] args)
        {
            var random = new Random();
            // make sure Zipkin with Scribe client is working
            var collector = new HttpCollector(new Uri("http://localhost:9411/"));
            //var collector = new KafkaCollector(KafkaSettings.Default);
            var traceId = new TraceHeader(traceId: (ulong)random.Next(), spanId: (ulong)random.Next());
            var span = new Span(traceId, new IPEndPoint(IPAddress.Loopback, 9000), "zipkinweb");
            span.Record(Annotations.ClientSend(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ServerReceive(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ServerSend(DateTime.UtcNow));
            Thread.Sleep(100);
            span.Record(Annotations.ClientReceive(DateTime.UtcNow));
            collector.CollectAsync(span);
        }

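Run it once and check the UI again: one more entry appears.

Click it to see the request details and the annotations.

The JSON can be viewed from the top-right corner.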
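
For the Java side, or for any client without a tracing library, a span can also be reported as plain JSON. The sketch below is an illustration under assumptions, not what zipkin-csharp does internally: it assumes the Zipkin v2 JSON format and the server's POST /api/v2/spans endpoint, the class V2SpanJson and its methods are invented for this example, and the values passed in are assumed to contain no characters that need JSON escaping.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Sketch: build a one-element Zipkin v2 JSON span list and post it to a server. */
final class V2SpanJson {

    /** Builds a JSON array with one CLIENT span; timestamp and duration are in microseconds. */
    static String clientSpan(String traceIdHex, String spanIdHex, String name,
                             String serviceName, long timestampMicros, long durationMicros) {
        // No escaping is done here: inputs are assumed to be plain ASCII identifiers.
        return String.format(Locale.ROOT,
                "[{\"traceId\":\"%s\",\"id\":\"%s\",\"name\":\"%s\",\"kind\":\"CLIENT\","
                        + "\"timestamp\":%d,\"duration\":%d,"
                        + "\"localEndpoint\":{\"serviceName\":\"%s\"}}]",
                traceIdHex, spanIdHex, name, timestampMicros, durationMicros, serviceName);
    }

    /** Posts the JSON to baseUrl + "/api/v2/spans" and returns the HTTP status code. */
    static int post(String baseUrl, String json) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/api/v2/spans").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        long nowMicros = System.currentTimeMillis() * 1000L;
        String json = clientSpan("000000000000002a", "000000000000002b",
                "test", "zipkinweb", nowMicros, 300_000L);
        System.out.println(json);
        // Requires a running server; the base URL has no trailing slash.
        System.out.println("status: " + post("http://localhost:9411", json));
    }
}
```

The IDs are written as 16 hex digits rather than as decimal numbers, which is why the ulong values used in the .NET demo look different once they appear in JSON.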

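This confirms that the calls work on the .NET platform. It also shows that the Zipkin UI gets its data through API requests, with the front end and back end kept separate. That opens the door to custom development: knowing the data structures, we can call the API, or query the database ourselves, to build a front end with richer business features.

One point to stress: on .NET it is best to use Framework 4.5 or later. Judging from the .NET demo, the library does very little wrapping, which makes it highly flexible, but you have to wrap it further yourself to make it less intrusive in application code and to get better performance. Later on, with performance and data volume in mind, Kafka can be used to receive the spans and ES to store them.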
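
As a starting point for that kind of custom front end, the sketch below builds the URL of a trace search request. It is a hedged illustration: it assumes the server's v2 query endpoint /api/v2/traces with its serviceName, lookback (milliseconds) and limit parameters, and the class TraceQuery and its method names are invented for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build the URL for a trace search against a Zipkin server. */
final class TraceQuery {

    /** URL-encodes one query parameter value as UTF-8. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds a search URL for the most recent traces of one service.
     * baseUrl has no trailing slash; lookbackMillis is how far back to search.
     */
    static String tracesUrl(String baseUrl, String serviceName, long lookbackMillis, int limit) {
        return baseUrl + "/api/v2/traces"
                + "?serviceName=" + enc(serviceName)
                + "&lookback=" + lookbackMillis
                + "&limit=" + limit;
    }

    public static void main(String[] args) {
        // Last hour of traces for the service name used in the .NET demo.
        System.out.println(tracesUrl("http://localhost:9411", "zipkinweb", 3_600_000L, 10));
    }
}
```

The resulting URL can be fetched with any HTTP client; the response is a JSON list of traces, each of which is itself a list of spans.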

Source: https://www.cnblogs.com/zhangs1986/p/8966051.html
