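Traditional big-data collection usually relies on flume to collect nginx logs and then passes the data on through kafka.
With ngx_kafka_module, nginx can send the collected data (user behavior logs) to kafka directly.
It pays to browse GitHub more often; there is a lot to learn there~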
nginx-kafka install script
Note that the dependency packages differ between CentOS and Ubuntu
- install-nginx-kafka.sh
- #!/bin/bash
- # CentOS
- #yum update; yum install -y gcc gcc-c++ pcre-devel zlib-devel make git wget curl vim
- #Ubuntu
- sudo apt-get update; sudo apt-get install -y gcc g++ libpcre3 libpcre3-dev zlib1g-dev libssl-dev make git wget curl vim
- cd /tmp
- git clone https://github.com/edenhill/librdkafka
- git clone https://github.com/brg-liuwei/ngx_kafka_module
- wget http://nginx.org/download/nginx-1.15.5.tar.gz
- cd /tmp/librdkafka
- ./configure; make; sudo make install
- cd /tmp  # the tarball was downloaded to /tmp, not /tmp/librdkafka
- tar -zxvf nginx-1.15.5.tar.gz
- cd /tmp/nginx-1.15.5
- ./configure --prefix=/usr/local/nginx_kafka --add-module=/tmp/ngx_kafka_module; make; sudo make install
- sudo ln -s /usr/local/nginx_kafka/sbin/nginx /usr/local/bin/nginx-kafka
- echo "/usr/local/lib" | sudo tee -a /etc/ld.so.conf  # with "sudo echo ... >>" the redirect would run without root rights
- sudo ldconfig
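Update the package sources and install the dependency libraries and tools
Download the librdkafka, ngx_kafka_module and nginx sources
Build and install librdkafka
Unpack the nginx source, then build and install it with ngx_kafka_module
For convenience, create an nginx-kafka symlink (so it does not conflict with any other nginx)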
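The CentOS/Ubuntu difference in the dependency step can be sketched as a lookup on the distro ID. This is only a minimal sketch: the function names `deps_for` and `install_cmd_for` are hypothetical, the IDs are assumed to match the `ID=` field of /etc/os-release, and the package names are the ones from the install script.

```shell
#!/bin/bash
# Hypothetical helper: print the dependency package list for a distro ID.
deps_for() {
  case "$1" in
    centos)
      echo "gcc gcc-c++ pcre-devel zlib-devel make git wget curl vim" ;;
    ubuntu)
      echo "gcc g++ libpcre3 libpcre3-dev zlib1g-dev libssl-dev make git wget curl vim" ;;
    *)
      echo "unsupported distro: $1" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: print the full install command for a distro ID.
install_cmd_for() {
  local pkgs
  pkgs=$(deps_for "$1") || return 1
  case "$1" in
    centos) echo "yum install -y $pkgs" ;;
    ubuntu) echo "apt-get install -y $pkgs" ;;
  esac
}

# Usage: load the distro ID, then print (not run) the matching command:
#   . /etc/os-release; install_cmd_for "$ID"
```

Printing the command instead of running it keeps the sketch safe to try; the install script could evaluate the result once the output looks right.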
If nginx fails to start because it cannot find librdkafka.so.1:
error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory
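Before touching the linker configuration, it can help to confirm which libraries the binary fails to resolve. A minimal sketch, assuming the usual `ldd` output format (`name => not found`); the function name `missing_libs` is hypothetical and the binary path comes from the --prefix used in the install script.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# shared libraries that the dynamic linker could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage:
#   ldd /usr/local/nginx_kafka/sbin/nginx | missing_libs
```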
Register the library directory with the dynamic linker (run as root)
- echo "/usr/local/lib">> /etc/ld.so.conf; ldconfig
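Running that append more than once leaves duplicate lines in /etc/ld.so.conf. One possible way to keep it idempotent is sketched below; the function name `add_lib_path` is hypothetical, and the sketch only appends when the exact line is missing.

```shell
# Hypothetical helper: append a directory to a linker config file only if
# that exact line is not already present, so repeated runs add nothing.
add_lib_path() {
  local dir="$1" conf="$2"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$dir" "$conf" 2>/dev/null; then
    echo "$dir" >> "$conf"
  fi
}

# Usage (as root):
#   add_lib_path /usr/local/lib /etc/ld.so.conf && ldconfig
```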
- nginx-kafka.conf
- #user nobody;
- worker_processes 1;
- #error_log logs/error.log;
- #error_log logs/error.log notice;
- #error_log logs/error.log info;
- #pid logs/nginx.pid;
- events {
- worker_connections 1024;
- }
- http {
- include mime.types;
- default_type application/octet-stream;
- #log_format main '$remote_addr - $remote_user [$time_local] "$request" '
- # '$status $body_bytes_sent "$http_referer" '
- # '"$http_user_agent" "$http_x_forwarded_for"';
- #access_log logs/access.log main;
- sendfile on;
- #tcp_nopush on;
- #keepalive_timeout 0;
- keepalive_timeout 65;
- #gzip on;
- kafka;
- kafka_broker_list kafka-1:9092 kafka-2:9092 kafka-3:9092;
- server {
- listen 80;
- server_name localhost;
- #charset koi8-r;
- #access_log logs/host.access.log main;
- location = /kafka/log {
- kafka_topic log;
- }
- location = /kafka/user {
- kafka_topic user;
- }
- #error_page 404 /404.html;
- # redirect server error pages to the static page /50x.html
- #
- error_page 500 502 503 504 /50x.html;
- location = /50x.html {
- root html;
- }
- }
- }
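kafka_broker_list specifies the kafka cluster as a space-separated list of ip:port or host:port entries.
Each location maps one URL to one kafka topic, so URLs can be divided up by topic.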
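With the two locations from the config, a client could send a message by POSTing a body to the URL of the topic. The following is a minimal sketch, assuming the module forwards the request body to the topic as the message; the function names `kafka_endpoint` and `post_event` and the sample payload are hypothetical.

```shell
# Hypothetical helper: build the URL that maps to a topic's location block.
kafka_endpoint() {
  local host="$1" topic="$2"
  echo "http://${host}/kafka/${topic}"
}

# Hypothetical helper: POST one message body to the topic's endpoint.
# -sS: hide the progress meter but still print errors; -d: send the body via POST.
post_event() {
  local host="$1" topic="$2" body="$3"
  curl -sS -d "$body" "$(kafka_endpoint "$host" "$topic")"
}

# Usage (sample payload is made up):
#   post_event localhost log '{"event":"click"}'
#   post_event localhost user '{"name":"test"}'
```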
Starting nginx
Start the zookeeper cluster and the kafka cluster (and create the topics)
Omitted...
Test the configuration file
nginx-kafka -c nginx-kafka.conf -t
Start nginx-kafka (if it is already running, append -s reload to reload the configuration instead)
nginx-kafka -c nginx-kafka.conf
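Since `-s reload` only signals an nginx that is already running, a wrapper may want to choose between a fresh start and a reload. A minimal sketch follows; the function name `nginx_action` is hypothetical, and the pid and config paths in the usage comment are assumptions based on --prefix=/usr/local/nginx_kafka.

```shell
# Hypothetical helper: print "reload" if the pid file names a live process,
# otherwise print "start".
nginx_action() {
  local pidfile="$1" pid
  if [ -s "$pidfile" ]; then
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists
    # and that the current user is allowed to signal it
    if kill -0 "$pid" 2>/dev/null; then
      echo reload
      return 0
    fi
  fi
  echo start
}

# Usage (paths are assumptions):
#   conf=/usr/local/nginx_kafka/conf/nginx-kafka.conf
#   nginx-kafka -c "$conf" -t || exit 1
#   case "$(nginx_action /usr/local/nginx_kafka/logs/nginx.pid)" in
#     reload) nginx-kafka -c "$conf" -s reload ;;
#     start)  nginx-kafka -c "$conf" ;;
#   esac
```

Testing the configuration first means a broken file never reaches the running server.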
Enjoy.
Source: http://www.jianshu.com/p/8c4e50206538