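Use cases
I use Consul for two things:
- Service registry: Prometheus discovers its targets automatically through Consul; combined with a CMDB system, new hosts can be added to monitoring automatically.
- Configuration store: Prometheus rules are stored as key/value pairs in Consul, confd watches them in real time and renders them through templates into rule files, so alerting rules can be managed from a single UI.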
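As a rough illustration of the configuration-store use case, a rule file could be written to and read back from Consul's KV store over the HTTP API. This is only a sketch, not the setup described above: the helper names and the key prefix `prometheus/rules/` are assumptions made for the example.

```shell
# kv_put_file AGENT KEY FILE
# Stores the contents of FILE under KEY in Consul's KV store.
kv_put_file() {
    local agent="$1" key="$2" file="$3"
    curl -s --request PUT --data-binary "@${file}" "http://${agent}/v1/kv/${key}"
}

# kv_get_raw AGENT KEY
# Prints the raw value stored under KEY (the ?raw parameter skips the JSON wrapper).
kv_get_raw() {
    local agent="$1" key="$2"
    curl -s "http://${agent}/v1/kv/${key}?raw"
}

# Usage (hypothetical key and file names):
#   kv_put_file 192.168.63.205:8500 prometheus/rules/node_down ./node_down.yml
#   kv_get_raw  192.168.63.205:8500 prometheus/rules/node_down
```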
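This post only covers deployment. When I have time I will write up how the automated monitoring system was built (I did that at my previous company, quite a while ago).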
Hostname | Role | Private IP | Config directory | Data directory |
---|---|---|---|---|
shb-manager-mw-consul-node01 | server, client | 192.168.63.217 | /etc/consul.d/ | /data/consul |
shb-manager-mw-consul-node02 | server, client | 192.168.63.218 | /etc/consul.d/ | /data/consul |
shb-manager-mw-consul-node03 | server, client | 192.168.63.219 | /etc/consul.d/ | /data/consul |
Alibaba Cloud SLB:
Internal load balancer
192.168.63.205:8500
Backend server group:
192.168.63.217:8500
192.168.63.218:8500
192.168.63.219:8500
1. Download the release package
On all hosts
cd /opt
wget -c https://releases.hashicorp.com/consul/1.10.3/consul_1.10.3_linux_amd64.zip
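Before unpacking, the archive can be checked against the checksum list that HashiCorp publishes alongside each release. The sketch below assumes `sha256sum` is installed and that the sums file was fetched from the same release directory; `verify_release` is a hypothetical helper name, not part of any Consul tooling.

```shell
# verify_release ZIP SUMS_FILE
# Succeeds only if SUMS_FILE lists ZIP's file name with a matching SHA-256 digest.
verify_release() {
    local zip="$1" sums="$2"
    local name expected actual
    name=$(basename "$zip")
    # Lines in the sums file look like: "<hex digest>  <file name>"
    expected=$(awk -v f="$name" '$2 == f {print $1}' "$sums")
    if [ -z "$expected" ]; then
        echo "no checksum entry for $name" >&2
        return 1
    fi
    actual=$(sha256sum "$zip" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# Usage:
#   wget -c https://releases.hashicorp.com/consul/1.10.3/consul_1.10.3_SHA256SUMS
#   verify_release consul_1.10.3_linux_amd64.zip consul_1.10.3_SHA256SUMS && echo OK
```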
2. Unpack and install
On all hosts
unzip -q consul_1.10.3_linux_amd64.zip
mv consul /usr/local/bin/
consul -autocomplete-install
complete -C /usr/local/bin/consul consul
3. Create the configuration and data directories
On all hosts
mkdir -p /etc/consul.d/ /data/consul
useradd --system --home /etc/consul.d --shell /bin/false consul
chown --recursive consul:consul /data/consul /etc/consul.d
4. Create the configuration files
Run on node1 to generate the gossip encryption key (the encrypt value)
consul keygen
mVINgJxtdZGy8SMlKYNFFbaBEjVSHChlVPlLjvfjnII=
Run on all hosts; remember to change node_name to each host's own name
cat << EOF > /etc/consul.d/consul.hcl
datacenter = "aliyun"
node_name = "node01"
data_dir = "/data/consul"
enable_syslog = true
log_level = "INFO"
retry_join = ["192.168.63.217", "192.168.63.218", "192.168.63.219"]
start_join = ["192.168.63.217", "192.168.63.218", "192.168.63.219"]
retry_interval = "30s"
rejoin_after_leave = true
client_addr = "0.0.0.0"
bind_addr = "0.0.0.0"
encrypt = "mVINgJxtdZGy8SMlKYNFFbaBEjVSHChlVPlLjvfjnII="
performance {
raft_multiplier = 1
}
limits {
http_max_conns_per_client = 2000
}
EOF
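Since only node_name differs between hosts, it can be derived from the hostname instead of edited by hand. A minimal sketch, assuming the hostnames end in `-nodeNN` as in the table above and that GNU sed is available; both helper names are made up for this example.

```shell
# node_name_from_hostname HOSTNAME
# Prints the part after the last "-", e.g. "node01" for
# "shb-manager-mw-consul-node01".
node_name_from_hostname() {
    printf '%s\n' "${1##*-}"
}

# set_node_name FILE NAME
# Rewrites the node_name line of an existing consul.hcl in place (GNU sed -i).
set_node_name() {
    sed -i "s/^node_name = .*/node_name = \"$2\"/" "$1"
}

# Usage on each host, after consul.hcl has been written:
#   set_node_name /etc/consul.d/consul.hcl "$(node_name_from_hostname "$(hostname -s)")"
```

Once the configuration directory is complete, `consul validate /etc/consul.d/` can check it for errors before the agent is started.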
Run on all hosts
cat << EOF > /etc/consul.d/server.hcl
server = true
bootstrap_expect = 3
ui = true
EOF
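With bootstrap_expect = 3 the servers wait until three server agents have joined before electing a leader. The arithmetic behind choosing three servers is the Raft quorum rule: a majority of floor(N/2) + 1 servers must be reachable. The two helpers below are purely illustrative and only restate that formula.

```shell
# quorum N: number of servers that must be reachable for the cluster to make progress.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: number of servers that can fail while a quorum remains.
fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Usage:
#   quorum 3            # prints 2
#   fault_tolerance 3   # prints 1
```

A fourth server raises the quorum to 3 without tolerating more failures, which is why odd cluster sizes are the usual choice.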
5. systemd service
cat << EOF > /etc/systemd/system/consul.service
[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/consul.d/consul.hcl
[Service]
User=consul
Group=consul
ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d/
ExecReload=/usr/local/bin/consul reload
KillMode=process
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable consul
systemctl start consul
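After the service has been started on all three hosts it can take a moment for a leader to be elected. One way to wait for that in a script is to poll the status endpoint. This is a sketch only: `wait_for_leader` and its default retry values are assumptions, not something the original setup used.

```shell
# wait_for_leader ADDR [TRIES] [SLEEP_SECONDS]
# Polls /v1/status/leader on ADDR until it returns a non-empty leader address.
wait_for_leader() {
    local addr="$1" tries="${2:-30}" pause="${3:-2}"
    local leader i
    i=0
    while [ "$i" -lt "$tries" ]; do
        # The endpoint answers with a JSON string such as "192.168.63.217:8300",
        # or "" while no leader has been elected yet.
        leader=$(curl -s "http://${addr}/v1/status/leader" | tr -d '"[:space:]')
        if [ -n "$leader" ]; then
            echo "$leader"
            return 0
        fi
        sleep "$pause"
        i=$((i + 1))
    done
    echo "no leader after ${tries} attempts" >&2
    return 1
}

# Usage:
#   wait_for_leader 127.0.0.1:8500 && consul members
```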
6. Common commands
# Cluster members
consul members
# List the Raft peers (the HTTP API listens on 8500; 8300 is the server RPC port)
curl "192.168.63.217:8500/v1/status/peers"
# Get the current leader
curl "192.168.63.217:8500/v1/status/leader"
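The peers endpoint returns a JSON array of server addresses, so a quick health check is to count them and compare against the expected three. A small sketch using only grep; `peer_count` is a hypothetical helper that reads the response on stdin.

```shell
# peer_count: reads the JSON array from /v1/status/peers on stdin
# and prints how many quoted entries it contains.
peer_count() {
    grep -o '"[^"]*"' | grep -c ''
}

# Usage:
#   curl -s http://192.168.63.217:8500/v1/status/peers | peer_count
```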
# Register a service
cat << EOF > node_exporter.json
{
"ID": "node-exporter-192.168.1.36",
"Name": "node-exporter-master-36",
"Tags": [
"aliyun",
"bendihua",
"linux"
],
"Address": "192.168.1.36",
"Port": 9100,
"Meta": {
"env": "test",
"hostname": "master-36",
"instance": "192.168.1.36",
"os": "centos7"
},
"EnableTagOverride": false,
"Check": {
"HTTP": "http://192.168.1.36:9100/metrics",
"Interval": "10s"
},
"Weights": {
"Passing": 10,
"Warning": 1
}
}
EOF
curl --request PUT --data @node_exporter.json http://192.168.63.205:8500/v1/agent/service/register
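When many hosts have to be registered, for example from a CMDB export, the JSON can be generated per host instead of written by hand. The sketch below assumes a plain text list with one "ip hostname" pair per line and keeps only a subset of the fields shown above; the function names and the list format are assumptions for illustration.

```shell
# node_exporter_payload IP HOSTNAME
# Prints a registration body for a node_exporter listening on port 9100.
node_exporter_payload() {
    local ip="$1" host="$2"
    cat <<EOF
{
  "ID": "node-exporter-${ip}",
  "Name": "node-exporter-${host}",
  "Tags": ["linux"],
  "Address": "${ip}",
  "Port": 9100,
  "Meta": {"hostname": "${host}", "instance": "${ip}"},
  "Check": {"HTTP": "http://${ip}:9100/metrics", "Interval": "10s"}
}
EOF
}

# register_from_list FILE AGENT
# FILE holds one "ip hostname" pair per line; each is PUT to AGENT.
register_from_list() {
    local list="$1" agent="$2" ip host
    while read -r ip host; do
        [ -z "$ip" ] && continue
        node_exporter_payload "$ip" "$host" |
            curl -s --request PUT --data @- "http://${agent}/v1/agent/service/register"
    done < "$list"
}

# Usage:
#   register_from_list hosts.txt 192.168.63.205:8500
```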
# Deregister a service
curl -X PUT http://192.168.63.205:8500/v1/agent/service/deregister/node-exporter-192.168.1.36
# Updating a service works the same way as registering it: edit the same JSON file and PUT it again
# List all service IDs
curl http://192.168.63.205:8500/v1/health/state/any | python -m json.tool | grep ServiceID | awk '{print $2}' |sed 's/"//g' | sed 's/,//g'
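The pipeline above depends on `python -m json.tool` for pretty-printing, and node-level checks such as serfHealth carry an empty ServiceID, which shows up as blank lines. A variant that works on the compact response with grep and sed alone could look like this; `extract_service_ids` is a hypothetical helper name.

```shell
# extract_service_ids: reads the JSON array returned by
# /v1/health/state/<state> on stdin and prints every non-empty
# ServiceID once, one per line.
extract_service_ids() {
    grep -o '"ServiceID": *"[^"]*"' |
        sed 's/^"ServiceID": *"\(.*\)"$/\1/' |
        grep -v '^$' |
        sort -u
}

# Usage:
#   curl -s http://192.168.63.205:8500/v1/health/state/any | extract_service_ids
```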
# Bulk-deregister unhealthy (critical) services whose ID contains "jvm"
servicelist=$(curl http://192.168.63.205:8500/v1/health/state/critical | python -m json.tool | grep ServiceID | awk '{print $2}' | sed 's/"//g' | sed 's/,//g' | grep jvm)
for i in $servicelist; do
    echo "$i"
    curl -X PUT "http://192.168.63.205:8500/v1/agent/service/deregister/${i}"
done
# Bulk-deregister services in any state whose ID contains "jvm"
servicelist=$(curl http://192.168.63.205:8500/v1/health/state/any | python -m json.tool | grep ServiceID | awk '{print $2}' | sed 's/"//g' | sed 's/,//g' | grep jvm)
for i in $servicelist; do
    echo "$i"
    curl -X PUT "http://192.168.63.205:8500/v1/agent/service/deregister/${i}"
done
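The two loops above differ only in the health state they query, so they can be folded into one function with a dry-run switch to preview what would be removed. This is a sketch under the same assumptions as the loops (IDs are filtered with a grep pattern and deregistered through the agent endpoint); the function name and the `--dry-run` flag are inventions for this example. Note that the agent endpoint only removes a service from the agent that receives the request.

```shell
# deregister_matching AGENT STATE PATTERN [--dry-run]
# Prints every service ID in health state STATE that matches PATTERN and,
# unless --dry-run is given, deregisters it through AGENT.
deregister_matching() {
    local agent="$1" state="$2" pattern="$3" dry="${4:-}"
    local id
    curl -s "http://${agent}/v1/health/state/${state}" |
        grep -o '"ServiceID": *"[^"]*"' |
        sed 's/^"ServiceID": *"\(.*\)"$/\1/' |
        grep -v '^$' |
        grep -- "$pattern" |
        sort -u |
        while read -r id; do
            echo "$id"
            if [ "$dry" != "--dry-run" ]; then
                curl -s -X PUT "http://${agent}/v1/agent/service/deregister/${id}" </dev/null
            fi
        done
}

# Usage:
#   deregister_matching 192.168.63.205:8500 critical jvm --dry-run
#   deregister_matching 192.168.63.205:8500 critical jvm
```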
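# List services from the command line
consul catalog services | grep jvm > /tmp/jvm-list
# Bulk-deregister the services in that list
# (consul catalog services prints service names, so this only works when each service ID equals its name)
for i in $(cat /tmp/jvm-list); do
    echo "$i"
    curl -X PUT "http://127.0.0.1:8500/v1/agent/service/deregister/${i}"
done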
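Because `consul catalog services` lists service names while the deregister endpoint expects a service ID, names may need to be resolved to IDs first when the two differ (as in the node_exporter example, where the ID and Name are not the same). A possible sketch using the catalog endpoint; `service_ids_for_name` is a hypothetical helper.

```shell
# service_ids_for_name AGENT NAME
# Prints the IDs of all instances registered under service NAME.
service_ids_for_name() {
    local agent="$1" name="$2"
    curl -s "http://${agent}/v1/catalog/service/${name}" |
        grep -o '"ServiceID": *"[^"]*"' |
        sed 's/^"ServiceID": *"\(.*\)"$/\1/' |
        sort -u
}

# Usage:
#   for name in $(cat /tmp/jvm-list); do
#       service_ids_for_name 127.0.0.1:8500 "$name"
#   done
```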
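7. Access entry point
Issues
Registering and deregistering services through the agent API has one problem: with an SLB in front of the three agent nodes, each request lands on whichever agent the SLB picks, so the registered services end up scattered across the nodes. I later found that services can also be registered through the catalog API; see: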
https://blog.51cto.com/l0vesql/2489813
https://edgar615.github.io/consul-service-register.html
https://www.cnblogs.com/Qing-840/p/10144184.html
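For reference, the catalog endpoints take a different body from the agent endpoints: the node is named explicitly and the service name goes in a `Service` field rather than `Name`. The sketch below only builds the request bodies; the helper names are hypothetical, and the node name used in the usage lines is an assumption. Entries written this way are not owned by any agent, so agent-run health checks do not apply to them, and it is probably safest to attach them to a node name that no running agent uses.

```shell
# catalog_register_payload NODE IP SERVICE_ID SERVICE_NAME PORT
# Prints a body for PUT /v1/catalog/register.
catalog_register_payload() {
    cat <<EOF
{
  "Node": "$1",
  "Address": "$2",
  "Service": {
    "ID": "$3",
    "Service": "$4",
    "Address": "$2",
    "Port": $5
  }
}
EOF
}

# catalog_deregister_payload NODE SERVICE_ID
# Prints a body for PUT /v1/catalog/deregister.
catalog_deregister_payload() {
    printf '{"Node": "%s", "ServiceID": "%s"}\n' "$1" "$2"
}

# Usage:
#   catalog_register_payload master-36 192.168.1.36 node-exporter-192.168.1.36 node-exporter-master-36 9100 |
#       curl -s --request PUT --data @- http://192.168.63.205:8500/v1/catalog/register
#   catalog_deregister_payload master-36 node-exporter-192.168.1.36 |
#       curl -s --request PUT --data @- http://192.168.63.205:8500/v1/catalog/deregister
```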
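Please credit the source when reposting. You are welcome to verify the references cited in this article and to point out any errors or unclear wording by email to chinaops666@gmail.com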