etcd 高可用实践

这篇文章记录实践 etcd 的 3节点 HA的过程。分为三个步骤

  • 生成证书
  • 运行 etcd
  • 验证

关于etcd的介绍,请看下一篇文章。

生成证书

SSL/TSL 认证分单向认证和双向认证两种方式。

官方推荐使用 cfssl 来自建 CA 签发证书,以下参考官方文档的步骤https://coreos.com/os/docs/latest/generate-self-signed-certificates.html.

证书类型介绍:

  • client certificate 用于通过服务器验证客户端。例如etcdctl,etcd proxy,fleetctl或docker客户端。
  • server certificate 由服务器使用,并由客户端验证服务器身份。例如docker服务器或kube-apiserver。
  • peer certificate 由 etcd 集群成员使用,供它们彼此之间通信使用。
1,下载 cfssl
echo "下载 cfssl"
curl -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x /usr/local/bin/cfssl*
2,生成 CA 证书
mkdir -p /etc/etcd/ssl
cd /etc/etcd/ssl
cat >ca-config.json <<EOF
{
    "signing": {
        "default": {
            "expiry": "87600h"
        },
        "profiles": {
            "server": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF

cat >ca-csr.json <<EOF
{
    "CN": "etcd",
    "key": {
        "algo": "rsa",
        "size": 2048
    }
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

请务必保证 ca-key.pem 文件的安全,*.csr 文件在整个过程中不会使用。

将会生成以下几个文件:

ca-key.pem ca.csr ca.pem

3,生成服务器端证书
cat >config.json <<EOF
{
    "CN": "etcd-0",
    "hosts": [
        "localhost",
        "10.19.0.55",
        "10.19.0.56",
        "10.19.0.57"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
EOF

echo "生成服务器端证书 server.csr server.pem server-key.pem"
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server config.json | cfssljson -bare server

echo "生成peer端证书 peer.csr peer.pem peer-key.pem"
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer config.json | cfssljson -bare peer

至此,所有证书都已生成完毕。

运行etcd

安装etcd时有一个注意事项——要注意版本号。例如 kubernetes 不用的版本对 etcd 最低版本有要求。我是用 1.10 的 kubernetes,配合使用 3.1.18 的 etcd。

wget https://github.com/coreos/etcd/releases/download/v3.1.8/etcd-v3.1.8-linux-amd64.tar.gz
tar -zxvf etcd-v3.1.8-linux-amd64.tar.gz

ln -s /app/allblue/etcd/etcd-v3.1.18-linux-amd64/etcd /usr/bin/etcd
ln -s /app/allblue/etcd/etcd-v3.1.18-linux-amd64/etcdctl /usr/bin/etcdctl

初始化配置文件:

这个文件不用过多关注,因为后文启动配置时会将大部分内容重写覆盖。

vi /etc/etcd/etcd.conf

ETCD_NAME=node01
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://etcd-0:2380"
ETCD_INITIAL_CLUSTER="node01=https://node01:2380,node02=https://node02:2380,node03=https://node03:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="nodecluster"
ETCD_ADVERTISE_CLIENT_URLS="https://node01:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_STRICT_RECONFIG_CHECK="false"
#ETCD_AUTO_COMPACTION_RETENTION="0"
#
#[proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[security]
ETCD_CERT_FILE="/etc/etcd/ssl/server.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/server-key.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_AUTO_TLS="true"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/peer.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/peer-key.pem"
#ETCD_PEER_CLIENT_CERT_AUTH="false"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_AUTO_TLS="true"
#
#[logging]
#ETCD_DEBUG="false"
# examples for -log-package-levels etcdserver=WARNING,security=DEBUG
#ETCD_LOG_PACKAGE_LEVELS=""
#[profiling]
#ETCD_ENABLE_PPROF="false"
#ETCD_METRICS="basic"

然后对etcd的命令进行配置:

vi /etc/systemd/system/etcd.service

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/app/allblue/etcddata/
EnvironmentFile=/app/allblue/etcd/etcd.conf
User=root
ExecStart=/usr/bin/etcd \
  --name node01  \
  --cert-file=/app/allblue/etcd/ssl/server.pem  \
  --key-file=/app/allblue/etcd/ssl/server-key.pem  \
  --peer-cert-file=/app/allblue/etcd/ssl/peer.pem  \
  --peer-key-file=/app/allblue/etcd/ssl/peer-key.pem  \
  --trusted-ca-file=/app/allblue/etcd/ssl/ca.pem  \
  --peer-trusted-ca-file=/app/allblue/etcd/ssl/ca.pem  \
  --initial-advertise-peer-urls https://10.19.0.55:2380  \
  --listen-peer-urls https://10.19.0.55:2380  \
  --listen-client-urls https://10.19.0.55:2379,http://127.0.0.1:2379  \
  --advertise-client-urls https://10.19.0.55:2379  \
  --initial-cluster-token etcd-cluster-0  \
  --initial-cluster node01=https://10.19.0.55:2380,node02=https://10.19.0.56:2380,node03=https://10.19.0.57:2380  \
  --initial-cluster-state new  \
  --data-dir=/app/allblue/etcddata
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

在配置文件中需要关注[Service] 一栏的配置,包括数据存储目录、环境配置、启动命令等。有以下几个关注点:

  • 将在第一步骤完成的证书放入自定义文件夹 /app/allblue/etcd/ssl/ 进行管理,方便不同 server 间的转移。
  • /app/allblue/etcd/ 作为配置目录和ssl目录,以/app/allblue/etcddata 作为数据目录,便于管理。
  • 配置好不同节点的名字。

然后设置 etcd 为开机自动启动:

systemctl daemon-reload
systemctl enable etcd

systemctl start etcd

在其他机器上做好配置后,几台机器先后启动:

systemctl start etcd

验证

vi check.sh

#!/usr/bin/env bash

etcdctl \
  --ca-file=/app/allblue/etcd/ssl/ca.pem \
  --cert-file=/app/allblue/etcd/ssl/peer.pem \
  --key-file=/app/allblue/etcd/ssl/peer-key.pem \
  --endpoints=https://10.19.0.55:2379,https://10.19.0.56:2379,https://10.19.0.57:2379  \
  cluster-health

执行这个脚本,得到下列显示就是ok:

2018-07-02 16:12:01.988067 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2018-07-02 16:12:01.988388 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
member 2e433331257e2668 is healthy: got healthy result from https://10.19.0.57:2379
member 884de43739502011 is healthy: got healthy result from https://10.19.0.56:2379
member fb3e4a0d91e719a6 is healthy: got healthy result from https://10.19.0.55:2379
cluster is healthy

参考资料


波兰 nazwa 家的 vps 重装系统的密码问题 alpha、beta、rc 各版本区别