Common etcd Operations

Official site: https://github.com/etcd-io/etcd

Download the release tarball, which contains the etcdctl command-line tool:

wget https://github.com/etcd-io/etcd/releases/download/v3.5.3/etcd-v3.5.3-linux-amd64.tar.gz

# Alternatively, download it from Baidu Netdisk
Link: https://pan.baidu.com/s/1eRrzyKk0VWBe8jSd0qdmYg (extraction code: 6c7f)

List the members of the etcd cluster:

$ export ETCDCTL_API=3
$ etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member list -w table

$ alias etcdctl='etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key'

$ etcdctl member list -w table
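
An alias defined in an interactive shell is not visible to scripts or cron jobs. As an alternative sketch, etcdctl (API v3) can also read its flags from `ETCDCTL_`-prefixed environment variables; the values below are the ones used in the commands above. If the alias is still defined in the same shell, remove it first with `unalias etcdctl`, since etcdctl may reject a setting that is supplied both as a flag and as an environment variable.

```shell
# Alternative to the alias: configure etcdctl through environment variables.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS='https://[127.0.0.1]:2379'
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/healthcheck-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/healthcheck-client.key

# After this, plain commands pick up the settings, for example:
#   etcdctl member list -w table
```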

Check the status of the etcd cluster nodes:

$ etcdctl endpoint status -w table

$ etcdctl endpoint health -w table

Set a key:

$ etcdctl put luffy 1
$ etcdctl get luffy

List all keys:

$ etcdctl get / --prefix --keys-only

View the data stored under a specific key:

$ etcdctl get /registry/pods/jenkins/sonar-postgres-7fc5d748b6-gtmsb

list-watch:

$ etcdctl watch /luffy/ --prefix
$ etcdctl put /luffy/key1 val1

Schedule a job to take regular data snapshots (important!):

$ etcdctl snapshot save `hostname`-etcd_`date +%Y%m%d%H%M`.db
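
The snapshot command above can be wrapped in a small script for cron. The following is only a sketch under stated assumptions: `BACKUP_DIR`, `KEEP_DAYS`, and the function names are illustrative and do not come from the original text, and etcdctl is assumed to be configured through its `ETCDCTL_*` environment variables because shell aliases are not available in cron.

```shell
#!/bin/sh
# Sketch of a snapshot job. BACKUP_DIR and KEEP_DAYS are illustrative names.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"
KEEP_DAYS="${KEEP_DAYS:-7}"

# Build the file name with the same pattern as the command above:
# <hostname>-etcd_<YYYYmmddHHMM>.db
snapshot_name() {
    # $1 = host name, $2 = timestamp
    printf '%s-etcd_%s.db\n' "$1" "$2"
}

# Take one snapshot, then delete snapshots older than KEEP_DAYS days.
run_backup() {
    mkdir -p "$BACKUP_DIR" || return 1
    file="$BACKUP_DIR/$(snapshot_name "$(hostname)" "$(date +%Y%m%d%H%M)")"
    # Assumes endpoints and certificates are already set in the environment.
    etcdctl snapshot save "$file" || return 1
    find "$BACKUP_DIR" -name '*-etcd_*.db' -mtime +"$KEEP_DAYS" -exec rm -f {} +
}

# Only do work when called as: <script> run
if [ "${1:-}" = "run" ]; then
    run_backup
fi
```

A crontab entry such as `0 * * * * /usr/local/bin/etcd-backup.sh run` (the script path is hypothetical) would then take one snapshot per hour.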

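Restore a snapshot:

  1. Stop etcd and the apiserver

  2. Move the current data directory out of the way

    $ mv /var/lib/etcd/ /tmp
  3. Restore the snapshot (pass the name of the snapshot file saved earlier; k8s-master-etcd_202211052008.db is an example)

    $ etcdctl snapshot restore k8s-master-etcd_202211052008.db --data-dir=/var/lib/etcd/
  4. Recover the cluster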

    https://github.com/etcd-io/etcd/blob/release-3.3/Documentation/op-guide/recovery.md

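  5. Namespaces stuck on deletion

    A namespace often gets stuck while it is being deleted. In that case you can remove its data by operating on etcd directly:

    # List the namespace-related metadata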
    $ etcdctl get / --prefix --keys-only|grep namespace
    /registry/clusterrolebindings/system:controller:namespace-controller
    /registry/clusterroles/system:controller:namespace-controller
    /registry/namespaces/default
    /registry/namespaces/eladmin
    /registry/namespaces/kube-flannel
    /registry/namespaces/kube-node-lease
    /registry/namespaces/kube-public
    /registry/namespaces/kube-system
    /registry/namespaces/luffy
    /registry/serviceaccounts/kube-system/namespace-controller

    # For example, if the eladmin namespace cannot be deleted, remove its key with the del command
    $ etcdctl del /registry/namespaces/eladmin
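
Deleting the namespace key removes only that one key; objects stored under the namespace stay in etcd. As a hedged sketch, the helper below filters a key listing down to one namespace so you can see what is left before deleting anything. The function name `ns_keys` is illustrative, and the match is a simple heuristic: it assumes namespaced keys end in `/<namespace>/<name>`.

```shell
# Print the keys read from stdin that belong to namespace $1.
ns_keys() {
    awk -v ns="$1" -F/ '
        # The namespace object itself: /registry/namespaces/<ns>
        $0 == "/registry/namespaces/" ns { print; next }
        # Namespaced objects: the second-to-last path segment is <ns>
        NF >= 5 && $(NF-1) == ns { print }
    '
}
```

For example: `etcdctl get / --prefix --keys-only | ns_keys eladmin`.
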
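Tips and Tricks

Because etcd stores all of the k8s cluster's data, consider the worst case: what do you do when the machine hosting the etcd node fails and cannot be recovered?

On a new machine, restore the backup into a data directory: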

export ETCDCTL_API=3
etcdctl snapshot restore k8s-master-etcd_202211052008.db --data-dir=/root/etcd/data
name="etcd-single"
host="172.21.65.227"
cluster="etcd1=http://172.21.65.227:2380"

docker run -d --privileged=true \
-p 2379:2379 \
-p 2380:2380 \
-v /root/etcd/data:/data/etcd \
--name $name \
--net=host \
quay.io/coreos/etcd:v3.5.0 \
/usr/local/bin/etcd \
--name $name \
--data-dir /data/etcd \
--listen-client-urls http://$host:2379 \
--advertise-client-urls http://$host:2379 \
--listen-peer-urls http://$host:2380 \
--initial-advertise-peer-urls http://$host:2380 \
--initial-cluster $cluster \
--initial-cluster-token=luffy \
--initial-cluster-state=new \
--force-new-cluster \
--log-level info \
--logger zap \
--log-outputs stderr


# Verify the cluster
export ETCDCTL_API=3
export ETCD_ENDPOINTS=172.21.65.227:2379
etcdctl --endpoints=$ETCD_ENDPOINTS -w table member list
etcdctl --endpoints=$ETCD_ENDPOINTS -w table endpoint status
etcdctl --endpoints=$ETCD_ENDPOINTS get / --prefix --keys-only
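
The container may need a moment before it answers requests, so the verification commands can fail if they run immediately. A minimal sketch of a retry helper follows; the name `retry` and its arguments are illustrative, not part of etcdctl.

```shell
# Retry a command until it succeeds.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Stop as soon as the command succeeds
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Sleep only if another attempt follows
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 30 2 etcdctl --endpoints=$ETCD_ENDPOINTS endpoint health` polls every 2 seconds, up to 30 times, until the endpoint reports healthy.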

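If the data of certain namespaces in the cluster has been deleted, how can it be restored incrementally?

# Approach: find the lost etcd data in the backup and recreate it incrementally


# Look up the lost data; the value is printed as unreadable binary because Kubernetes stores objects in etcd in protobuf format by default
$ etcdctl --endpoints=$ETCD_ENDPOINTS get /registry/services/specs/luffy/mysql

# Use etcdhelper to return the data as JSON
# https://github.com/openshift/origin/tree/master/tools/etcdhelper
# You can build it from source or download a prebuilt binary

# To build from source: the repository is large and slow to download from GitHub, so get the source from the netdisk link below:
# Link: https://pan.baidu.com/s/1zOv97hGyy-gCEBRqVsz1Yg (extraction code: x8t7)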

$ docker run -d --name go-builder golang:1.17 sleep 30000
$ docker cp origin-master go-builder:/go
$ docker exec -ti go-builder bash
# cd /go/origin-master
# go env -w GOPROXY=https://goproxy.cn,direct
# CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build tools/etcdhelper/etcdhelper.go

$ docker cp go-builder:/go/origin-master/etcdhelper .
$ cp etcdhelper /usr/bin/


## The etcdhelper binary built above can also be downloaded directly from the netdisk:
## Link: https://pan.baidu.com/s/1NJDy08nzuSlImAw5mXJZ-Q (extraction code: 5i55)

How do you use etcdhelper?

export ETCDCTL_API=3
export ETCD_ENDPOINTS=172.21.65.227:2379
etcdhelper -endpoint $ETCD_ENDPOINTS ls
etcdhelper -endpoint $ETCD_ENDPOINTS get /registry/services/specs/luffy/mysql

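How do you process many keys at once?

This walkthrough saves each object as a JSON file and then converts it to YAML before recreating it. (kubectl apply -f also accepts JSON manifests, so the conversion is optional.)

# Goal: restore only the resources in the luffy namespace

# Use etcdctl to collect the keys under which luffy-related resources are stored
export ETCDCTL_API=3
export ETCD_ENDPOINTS=172.21.65.227:2379
etcdctl --endpoints=$ETCD_ENDPOINTS get / --prefix --keys-only|grep luffy >keys.txt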

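# Use a script that calls etcdhelper to dump each key to a JSON file
cat key_to_json.sh
#!/bin/bash

i=0
export ETCDCTL_API=3
export ETCD_ENDPOINTS=172.21.65.227:2379
# Read one key per line and write each object to <index>.json
while read -r line
do
etcdhelper -endpoint "$ETCD_ENDPOINTS" get "$line" > "$i.json"
# Delete the first line of the output, which is not part of the JSON document
sed -i '1d' "$i.json"
let 'i+=1'
done < keys.txt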
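
Numbered files such as `0.json` do not show which resource they hold. As an optional sketch, a small helper can derive the file name from the etcd key instead. The function name `key_to_filename` is illustrative and not part of etcdhelper.

```shell
# Turn an etcd key into a descriptive file name, for example
# /registry/services/specs/luffy/mysql -> services_specs_luffy_mysql.json
key_to_filename() {
    # Strip the /registry/ prefix, replace slashes, append the extension
    printf '%s\n' "$1" | sed -e 's|^/registry/||' -e 's|/|_|g' -e 's|$|.json|'
}
```

In the script above, the output file `$i.json` could then be replaced by `"$(key_to_filename "$line")"`.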

Once you have the JSON files, convert them to YAML:

docker exec -ti go-builder bash
# cd /go/src
# git clone https://gitee.com/agagin/json2yaml-go.git
# cd json2yaml-go
# go env -w GOPROXY=https://goproxy.cn,direct
# CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o json2yaml json2yaml.go
# pwd
# /go/src/json2yaml-go

docker cp go-builder:/go/src/json2yaml-go/json2yaml .

./json2yaml -jp .