Building a ClickHouse cluster with ClickHouse Keeper


Server information

Hostname    IP
my-db01     192.168.1.214
my-db02     192.168.1.215
my-db03     192.168.1.216

  • hosts setup
# Switch to root
sudo -i
# Run on my-db01
echo '192.168.1.215 my-db02' >> /etc/hosts
echo '192.168.1.216 my-db03' >> /etc/hosts
# Run on my-db02
echo '192.168.1.214 my-db01' >> /etc/hosts
echo '192.168.1.216 my-db03' >> /etc/hosts
# Run on my-db03
echo '192.168.1.214 my-db01' >> /etc/hosts
echo '192.168.1.215 my-db02' >> /etc/hosts
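
The manual steps above append lines unconditionally, so running them twice leaves duplicates in /etc/hosts. A minimal sketch of an idempotent variant follows; the function names are hypothetical, and the host/IP pairs are the three servers from the table above.

```shell
#!/usr/bin/env bash
# Host/IP pairs taken from the server table above.
NODES="192.168.1.214:my-db01 192.168.1.215:my-db02 192.168.1.216:my-db03"

# Hypothetical helper: print the hosts lines a node needs for its peers.
peer_hosts_entries() {
    local self="$1" pair ip name
    for pair in $NODES; do
        ip="${pair%%:*}"
        name="${pair##*:}"
        # Skip the node's own entry, as in the manual steps.
        [ "$name" = "$self" ] && continue
        printf '%s %s\n' "$ip" "$name"
    done
}

# Append only the lines that are not already present in the target file.
append_missing_entries() {
    local self="$1" hosts_file="$2" line
    peer_hosts_entries "$self" | while IFS= read -r line; do
        grep -qxF "$line" "$hosts_file" || printf '%s\n' "$line" >> "$hosts_file"
    done
}

# Example (as root): append_missing_entries "$(hostname -s)" /etc/hosts
```

Passing the target file as an argument makes it possible to rehearse against a copy before touching /etc/hosts.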

Installation

Install as the admin user:

  • Add the official repository
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
  • Install clickhouse-server and clickhouse-client
sudo yum install -y clickhouse-server clickhouse-client

Version information:

Operating system: CentOS Linux release 7.9.2009 (Core)

systemd: 219

clickhouse-client: 23.2.4.12-1.x86_64

clickhouse-server: 23.2.4.12-1.x86_64

clickhouse-common-static: 23.2.4.12-1.x86_64

  • Install the nc command, used to check connectivity
sudo yum install -y nc

Configuration changes

Directory changes

# Create the data directory
sudo mkdir -p /data/clickhouse/lib
# Create the log directory
sudo mkdir -p /data/clickhouse/log
# Set ownership and permissions
sudo chown -R clickhouse:clickhouse /data/clickhouse
sudo chmod 777 /data
# Back up the original configuration files
sudo cp /etc/clickhouse-server/users.xml ~
sudo cp /etc/clickhouse-server/config.xml ~
# Update the directory settings
## Change file permissions
sudo chmod 666 /etc/clickhouse-server/config.xml
sudo chmod 666 /etc/clickhouse-server/users.xml
## Replace the log directory
sudo sed -i 's?/var/log/clickhouse-server?/data/clickhouse/log?g' /etc/clickhouse-server/config.xml
## Replace the data directory
sudo sed -i 's?/var/lib/clickhouse?/data/clickhouse/lib?g' /etc/clickhouse-server/config.xml
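
The two sed substitutions above rewrite every matching path in config.xml in place. A small sketch that wraps them in a function (the name relocate_clickhouse_paths is hypothetical) so they can be tried on a copy and reviewed with diff first:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the two sed substitutions above; it takes the
# config path as an argument so it can be run against a copy first.
relocate_clickhouse_paths() {
    local config_file="$1"
    # '?' is used as the sed delimiter so the slashes in the paths need no escaping.
    sed -i \
        -e 's?/var/log/clickhouse-server?/data/clickhouse/log?g' \
        -e 's?/var/lib/clickhouse?/data/clickhouse/lib?g' \
        "$config_file"
}

# Example: rehearse on a copy and review the changes before editing the live file.
# cp /etc/clickhouse-server/config.xml /tmp/config.xml
# relocate_clickhouse_paths /tmp/config.xml
# diff /etc/clickhouse-server/config.xml /tmp/config.xml
```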

Starting and stopping

  • Edit the unit file: sudo vi /usr/lib/systemd/system/clickhouse-server.service (see "Troubleshooting -> Startup timeout")
  • Enable start on boot: sudo systemctl enable clickhouse-server
  • Start: sudo systemctl start clickhouse-server
  • Stop: sudo systemctl stop clickhouse-server
  • Status: sudo systemctl status clickhouse-server

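Parameter tuning

Settings in config.xml (sudo vi /etc/clickhouse-server/config.xml):

  • background_pool_size: default 16; can be raised to twice the number of CPUs. Set to 32 here.

  • max_concurrent_queries: default 100; can be raised to 200 or 300. Set to 200 here.

  • Allow external (IPv4) access: <listen_host>0.0.0.0</listen_host>. Also set interserver_listen_host, because the servers do not support IPv6. Without it, the server fails to start once clickhouse-keeper is configured, with the error "RaftInstance: got exception: open: Address family not supported by protocol": <interserver_listen_host>0.0.0.0</interserver_listen_host>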
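
As an alternative to editing config.xml by hand, the same four settings can be kept in an override file under config.d, the directory this guide later uses for clickhouse-keeper.xml. The sketch below assumes ClickHouse merges config.d files over config.xml; the file name tuning.xml and the function name are hypothetical, and the pool size follows the "twice the CPU count" rule of thumb above.

```shell
#!/usr/bin/env bash
# Sketch: write the tuned settings to an override file instead of editing
# config.xml directly. File and function names are hypothetical.
write_tuning_override() {
    local target_dir="$1"
    # Twice the CPU count, following the rule of thumb above.
    local pool_size=$(( $(nproc) * 2 ))
    cat > "$target_dir/tuning.xml" <<EOF
<clickhouse>
    <background_pool_size>${pool_size}</background_pool_size>
    <max_concurrent_queries>200</max_concurrent_queries>
    <listen_host>0.0.0.0</listen_host>
    <interserver_listen_host>0.0.0.0</interserver_listen_host>
</clickhouse>
EOF
}

# Example (as root): write_tuning_override /etc/clickhouse-server/config.d
```

On a 16-CPU machine this yields the value 32 used in this setup.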

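Settings in users.xml:

  • Password:
# Generate a random password with the following command
PASSWORD=$(base64 < /dev/urandom | head -c12); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
# Plaintext password: z+yJwbcWv6MA
# SHA-256 hash: b53ad819c11d5790655464f2d6ec0e78916551b62141fec0d1342a25138082d2
<password_sha256_hex>b53ad819c11d5790655464f2d6ec0e78916551b62141fec0d1342a25138082d2</password_sha256_hex>

The settings above must be applied on every node.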
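
The password command above removes the "-" that sha256sum prints after the digest but leaves trailing spaces, which are easy to paste into users.xml by accident. A minimal sketch that prints only the digest, already wrapped in the element users.xml expects (both function names are hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: SHA-256 hex digest of a string, without the
# trailing "  -" that sha256sum prints after the digest.
sha256_hex() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Print the users.xml element for a given plaintext password.
password_element() {
    printf '<password_sha256_hex>%s</password_sha256_hex>\n' "$(sha256_hex "$1")"
}

# Example:
# PASSWORD=$(base64 < /dev/urandom | head -c12)
# echo "$PASSWORD"; password_element "$PASSWORD"
```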

Server tuning

  • Do not disable overcommit
echo 0 | sudo tee /proc/sys/vm/overcommit_memory
  • Always disable transparent huge pages. They interfere with the memory allocator and cause a significant performance drop.
# Run as root
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /etc/rc.d/rc.local
echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.d/rc.local
sudo chmod +x /etc/rc.d/rc.local
  • Disable swap (official advice: We recommend to disable the operating system's swap file in production environments.)
1. sudo swapoff -a
2. echo "vm.swappiness = 0" | sudo tee -a /etc/sysctl.conf
3. sudo sysctl -p
4. sudo vi /etc/fstab # comment out the swap line
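
After applying the three settings above, it helps to confirm what the kernel actually reports, especially after a reboot. The following read-only sketch prints the current values; the function names are hypothetical, and it relies on the kernel marking the active transparent huge pages mode with square brackets, as in "always madvise [never]".

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the active mode from a THP status string such
# as "always madvise [never]" (the active value is the one in brackets).
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Report the settings discussed above. Read-only, so no root is needed.
report_memory_settings() {
    local thp=/sys/kernel/mm/transparent_hugepage
    echo "overcommit_memory: $(cat /proc/sys/vm/overcommit_memory)"
    echo "THP enabled: $(thp_active_mode "$(cat "$thp/enabled")")"
    echo "THP defrag: $(thp_active_mode "$(cat "$thp/defrag")")"
    # /proc/swaps has one header line; every further line is an active swap device.
    echo "active swap devices: $(tail -n +2 /proc/swaps | wc -l)"
}

# Example: report_memory_settings
```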

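Cluster setup

  • A cluster needs at least three servers
  • The cluster is built on clickhouse-keeper
  • Before building the cluster, install ClickHouse on all three servers as described above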

clickhouse-keeper configuration

On each ClickHouse server, create clickhouse-keeper.xml in the /etc/clickhouse-server/config.d/ directory with the following content:

<clickhouse>
    <keeper_server>
        <tcp_port>9181</tcp_port>
        <server_id>1</server_id>
        <log_storage_path>/data/clickhouse/lib/coordination/log</log_storage_path>
        <snapshot_storage_path>/data/clickhouse/lib/coordination/snapshots</snapshot_storage_path>
        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <session_timeout_ms>30000</session_timeout_ms>
            <raft_logs_level>warning</raft_logs_level>
        </coordination_settings>
        <raft_configuration>
            <server>
                <id>1</id>
                <hostname>my-db01</hostname>
                <port>9444</port>
            </server>
            <server>
                <id>2</id>
                <hostname>my-db02</hostname>
                <port>9444</port>
            </server>
            <server>
                <id>3</id>
                <hostname>my-db03</hostname>
                <port>9444</port>
            </server>
        </raft_configuration>
    </keeper_server>
    <zookeeper>
        <node>
            <host>my-db01</host>
            <port>9181</port>
        </node>
        <node>
            <host>my-db02</host>
            <port>9181</port>
        </node>
        <node>
            <host>my-db03</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>

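Notes:

  1. server_id must be set correctly on each node and match that node's id in raft_configuration
  2. The log_storage_path and snapshot_storage_path directories must be correct
  3. The ports must be reachable
  4. File ownership: chown clickhouse:clickhouse /etc/clickhouse-server/config.d/clickhouse-keeper.xml

This setup uses the following:

  1. server_id is 1 on my-db01, 2 on my-db02, and 3 on my-db03
  2. Ports 9181 and 9444 are open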
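
Since the keeper file is identical on all nodes except for server_id, one way to avoid editing it by hand is to derive the id from the hostname. This is a sketch under the assumption that hostnames end in the node number (my-db01, my-db02, my-db03), as they do here; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive server_id from the trailing digits of the
# hostname (my-db01 -> 1, my-db02 -> 2, my-db03 -> 3).
keeper_server_id() {
    local digits="${1##*[!0-9]}"
    # 10# forces base 10 so that a leading zero is not read as octal.
    echo $(( 10#$digits ))
}

# Rewrite only the <server_id> element; the <id> entries inside
# raft_configuration are left untouched.
set_keeper_server_id() {
    local file="$1" id="$2"
    sed -i "s?<server_id>[0-9]*</server_id>?<server_id>${id}</server_id>?" "$file"
}

# Example (on each node, as root):
# set_keeper_server_id /etc/clickhouse-server/config.d/clickhouse-keeper.xml \
#     "$(keeper_server_id "$(hostname -s)")"
```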
  • Check that keeper is healthy; a reply of imok means it is running
echo ruok | nc localhost 9181; echo
# imok
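
The check above only covers the local node. A small sketch that sends ruok to all three nodes in turn, which also confirms that port 9181 is reachable between servers (function names are hypothetical; the hostnames and port are the ones used in this setup):

```shell
#!/usr/bin/env bash
# Hypothetical helper: send ruok to one keeper node and succeed only if
# the reply is imok.
keeper_is_ok() {
    local host="$1" port="${2:-9181}" reply
    # -w 2: give up after 2 seconds instead of hanging on an unreachable host.
    reply=$(echo ruok | nc -w 2 "$host" "$port")
    [ "$reply" = "imok" ]
}

# Check every node and return non-zero if any of them fails.
check_all_keepers() {
    local host failed=0
    for host in my-db01 my-db02 my-db03; do
        if keeper_is_ok "$host"; then
            echo "$host: imok"
        else
            echo "$host: no valid reply"
            failed=1
        fi
    done
    return "$failed"
}

# Example: check_all_keepers
```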

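Cluster configuration

The cluster is laid out as 1 shard with 3 replicas.

The configuration (appended to the clickhouse-keeper.xml file) contains the following values; the XML markup around them was lost:

  • shard macro: cluster_1S_3R_01 (recommended form: cluster name + shard name, e.g. cluster_1S_3R_01)
  • replica macro: my-db01 (recommended form: the hostname, e.g. my-db01; differs per node)
  • replicas: my-db01, port 9000, default; my-db02, port 9000, default; my-db03, port 9000, default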
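
For orientation, here is a sketch of what a macros plus remote_servers section for this layout can look like, generated per node. Several things are assumptions rather than part of the original: the cluster name cluster_1S_3R, the separate file name cluster.xml, the function name, and the reading of "default" as the user name. If the default user has a password, each replica entry would also need a password element.

```shell
#!/usr/bin/env bash
# ASSUMPTION: the cluster name is not recoverable from the source.
CLUSTER_NAME="cluster_1S_3R"

# Sketch: generate the macros and remote_servers sections for one node.
# The replica macro is the only per-node difference.
write_cluster_config() {
    local replica_name="$1" target_dir="$2" host
    {
        echo '<clickhouse>'
        echo '    <macros>'
        echo "        <shard>${CLUSTER_NAME}_01</shard>"
        echo "        <replica>${replica_name}</replica>"
        echo '    </macros>'
        echo '    <remote_servers>'
        echo "        <${CLUSTER_NAME}>"
        echo '            <shard>'
        for host in my-db01 my-db02 my-db03; do
            # ASSUMPTION: "default" is the user name for inter-node queries.
            echo "                <replica><host>${host}</host><port>9000</port><user>default</user></replica>"
        done
        echo '            </shard>'
        echo "        </${CLUSTER_NAME}>"
        echo '    </remote_servers>'
        echo '</clickhouse>'
    } > "$target_dir/cluster.xml"
}

# Example (as root, on my-db02):
# write_cluster_config my-db02 /etc/clickhouse-server/config.d
```

After a restart, the cluster definition should be visible in the system.clusters table on each node.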

Troubleshooting

Startup timeout

After installation, sudo systemctl start clickhouse-server fails to start the server. The status output is:

● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
   Loaded: loaded (/usr/lib/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: timeout) since Tue 2023-03-21 16:59:02 CST; 6s ago
  Process: 12585 ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=%t/%p/%p.pid (code=killed, signal=TERM)
 Main PID: 12585 (code=killed, signal=TERM)

Mar 21 16:59:02 my-db02 systemd[1]: Failed to start ClickHouse Server (analytic DBMS for big data).
Mar 21 16:59:02 my-db02 systemd[1]: Unit clickhouse-server.service entered failed state.
Mar 21 16:59:02 my-db02 systemd[1]: clickhouse-server.service failed.

The failure is caused by a timeout. After some research, the cause turned out to be:

  1. The unit file /usr/lib/systemd/system/clickhouse-server.service sets the timeout with TimeoutStartSec=infinity

  2. systemctl --version shows that the systemd version is 219

  3. The infinity value for TimeoutStartSec was only introduced in systemd 229; on earlier versions, set it to 0 to disable the timeout

Here is a modified clickhouse-server.service file for reference:

[Unit]
Description=ClickHouse Server (analytic DBMS for big data)
Requires=network-online.target
# NOTE: that After/Wants=time-sync.target is not enough, you need to ensure
# that the time was adjusted already, if you use systemd-timesyncd you are
# safe, but if you use ntp or some other daemon, you should configure it
# additionally.
After=time-sync.target network-online.target
Wants=time-sync.target

[Service]
Type=notify
# NOTE: we leave clickhouse watchdog process enabled to be able to see OOM/SIGKILL traces in clickhouse-server.log files.
# If you wish to disable the watchdog and rely on systemd logs just add "Environment=CLICKHOUSE_WATCHDOG_ENABLE=0" line.
User=clickhouse
Group=clickhouse
Restart=always
RestartSec=30
# Since ClickHouse is systemd aware default 1m30sec may not be enough
# TimeoutStartSec=infinity
TimeoutStartSec=0
# %p is resolved to the systemd unit name
RuntimeDirectory=%p
ExecStart=/usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=%t/%p/%p.pid
# Minus means that this file is optional.
EnvironmentFile=-/etc/default/%p
# Bring back /etc/default/clickhouse for backward compatibility
EnvironmentFile=-/etc/default/clickhouse
LimitCORE=infinity
LimitNOFILE=500000
CapabilityBoundingSet=CAP_NET_ADMIN CAP_IPC_LOCK CAP_SYS_NICE CAP_NET_BIND_SERVICE

[Install]
# ClickHouse should not start from the rescue shell (rescue.target).
WantedBy=multi-user.target

Note:

If the service has already failed to start, run systemctl daemon-reload after modifying any systemd files.
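
A file under /usr/lib/systemd/system can be overwritten when the package is upgraded. One alternative, sketched here, is a systemd drop-in that overrides only TimeoutStartSec. The drop-in file name timeout.conf and the function names are hypothetical, and the version cutoff follows the finding above that infinity is understood from systemd 229 on.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the TimeoutStartSec value for a systemd version.
# "infinity" is used from systemd 229 on; older versions use 0.
timeout_value_for() {
    if [ "$1" -ge 229 ]; then echo infinity; else echo 0; fi
}

# Parse the version number from the first line of `systemctl --version`,
# which looks like "systemd 219".
systemd_version() {
    systemctl --version | awk 'NR==1 {print $2}'
}

# Write a drop-in that overrides only the start timeout.
write_timeout_dropin() {
    local dropin_dir="$1" version="$2"
    mkdir -p "$dropin_dir"
    printf '[Service]\nTimeoutStartSec=%s\n' "$(timeout_value_for "$version")" \
        > "$dropin_dir/timeout.conf"
}

# Example (as root):
# write_timeout_dropin /etc/systemd/system/clickhouse-server.service.d "$(systemd_version)"
# systemctl daemon-reload
# systemctl restart clickhouse-server
```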

References

Installation: https://clickhouse.com/docs/en/install#from-rpm-packages

Usage recommendations: https://clickhouse.com/docs/en/operations/tips

Disabling swap: https://blog.csdn.net/weixin_43224440/article/details/111556962

Parameter tuning: https://blog.csdn.net/qq_35128600/article/details/125897196

Cluster setup: https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper#clickhouse-keeper-user-guide

IPv6 not supported: https://github.com/ClickHouse/ClickHouse/issues/33381
