Kubernetes Cluster in Practice (6): Creating a Lightweight k8s Cluster with k3s

What is K3s?

K3s is a lightweight Kubernetes distribution, highly optimized for edge computing, IoT, and similar scenarios. To make the most of limited hardware, I chose k3s as the Kubernetes cluster for my embedded devices.

This article describes how to create a lightweight k8s cluster with k3s.

Keywords: k3s

Prerequisites

  • Ubuntu 20.04 LTS amd64, 32 GB RAM, as the master node
  • Ubuntu 18.04 LTS arm64, 8 GB RAM, as the worker node

Creating a virtual LAN with WireGuard

Use WireGuard to build a transparent virtual LAN, which greatly reduces the difficulty of networking across subnets.

For details, see my other article, "Building a Virtual LAN with WireGuard".

Installing the kernel modules required by IPVS

The ipvs proxy mode offers better performance than iptables.

cat <<EOF | sudo tee /etc/modules-load.d/k3s.conf
ip_vs
ip_vs_lc
ip_vs_rr
ip_vs_wrr
ip_vs_sh
EOF

Load the modules immediately:

sudo modprobe ip_vs
sudo modprobe ip_vs_lc
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
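You can confirm the modules were actually loaded by listing them:

```shell
# List the IPVS-related modules currently loaded in the kernel
lsmod | grep ip_vs
```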

Install the package for inspecting the IPVS table:

sudo apt install ipvsadm -y

Once the service is up, you can view the IPVS rules with:

ipvsadm -ln

Installing the master node

Users in mainland China can use:

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -

The URL above went down for unknown reasons; the latest address is:

curl -sfL https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -

International users can use:

curl -sfL https://get.k3s.io | sh -    

Environment variables can be specified after the pipe:

curl -sfL https://get.k3s.io | INSTALL_K3S_CHANNEL=latest sh -

The meanings of the flags are listed below:

FLAG                         Description
--node-ip value, -i value    IP address to advertise for the node
--node-external-ip value     External IP address to advertise for the node
--flannel-iface value        Override the default flannel interface
--no-deploy value            Components not to deploy (valid options: coredns, servicelb, traefik, local-storage, metrics-server)
--kube-proxy-arg value       Custom arguments for the kube-proxy process
--write-kubeconfig-mode      Change the permissions of the kubeconfig file
--flannel-backend value      Change the flannel backend mode
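These flags can also be supplied at install time through the INSTALL_K3S_EXEC environment variable instead of editing the unit file afterwards; a sketch (the IP and interface values are placeholders for your own setup):

```shell
# Pass server flags via INSTALL_K3S_EXEC at install time (values are examples)
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --node-ip 192.168.8.238 --flannel-iface eth0 --write-kubeconfig-mode 644" sh -
```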

Edit the following systemd service unit file:

/etc/systemd/system/k3s.service

as shown below:

....
ExecStart=/usr/local/bin/k3s server --node-ip 192.168.8.238 --node-external-ip 192.168.8.238 --flannel-iface eth0 --write-kubeconfig-mode 644 --kube-proxy-arg 'proxy-mode=ipvs' --kube-proxy-arg 'ipvs-scheduler=rr' --kube-proxy-arg 'masquerade-all=true' --kube-proxy-arg 'metrics-bind-address=0.0.0.0:10249'
...
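systemd only picks up changes to the unit file after a reload, so reload and restart the service afterwards:

```shell
sudo systemctl daemon-reload
sudo systemctl restart k3s
```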

Installing the worker node

Run the following command on the master node to obtain the token for joining the cluster:

cat /var/lib/rancher/k3s/server/node-token

In mainland China, run:

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn K3S_TOKEN="xxx" K3S_URL=https://192.168.7.2:6443 sh -

The environment variable INSTALL_K3S_EXEC can be set with the flags described above.

Alternatively, you can edit the following service unit file:

/etc/systemd/system/k3s-agent.service

as shown below:

....
ExecStart=/usr/local/bin/k3s agent --node-ip 192.168.8.238 --node-external-ip 192.168.8.238 --flannel-iface eth0 --kube-proxy-arg 'proxy-mode=ipvs' --kube-proxy-arg 'ipvs-scheduler=rr' --kube-proxy-arg 'masquerade-all=true' --kube-proxy-arg 'metrics-bind-address=0.0.0.0:10249'
...
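As with the server, reload systemd and restart the agent for the edited unit file to take effect:

```shell
sudo systemctl daemon-reload
sudo systemctl restart k3s-agent
```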

Checking that the cluster was installed successfully

Check whether k3s is running:

sudo systemctl status k3s

Check the node status:

sudo kubectl get nodes
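Because kube-proxy was started with metrics-bind-address=0.0.0.0:10249 above, you can also confirm that IPVS mode actually took effect by querying its metrics endpoint (assuming port 10249 is reachable on the node):

```shell
# Should print "ipvs" when the proxy-mode flag was applied
curl -s http://localhost:10249/proxyMode
```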

Configuring private registry mirrors for k3s

Create the configuration file:

sudo vim /etc/rancher/k3s/registries.yaml

The configuration file format is as follows:

mirrors:
  docker.io:
    endpoint:
      - "https://hub.deepsoft-tech.com"
  k8s.gcr.io:
    endpoint:
      - "https://hub.deepsoft-tech.com"
  registry.k8s.io:
    endpoint:
      - "https://hub.deepsoft-tech.com"
  quay.io:
    endpoint:
      - "https://hub.deepsoft-tech.com"
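registries.yaml is only read at startup, so restart k3s for the mirrors to take effect; you can then check that containerd picked up the configuration (crictl is bundled with k3s):

```shell
sudo systemctl restart k3s
# The containerd configuration should now contain the mirror endpoints
sudo crictl info | grep deepsoft-tech
```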

A highly available K3s cluster

Starting with v1.19.5+k3s1, K3s ships full support for embedded etcd, which makes a highly available cluster with embedded etcd possible.

New cluster

The new cluster must not contain data from a previous cluster; ideally, start from a fresh installation.

To run K3s in this mode, you must have an odd number of server nodes. We recommend starting with three.

To get started, launch the first server node with the cluster-init flag to enable clustering, along with any other desired flags:

curl -sfL https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -s - server --cluster-init --kube-proxy-arg='metrics-bind-address=0.0.0.0:10249' --flannel-backend=host-gw  --write-kubeconfig-mode=644 --bind-address=192.168.31.29

After the first server is up, join the second and third servers to the cluster using the shared token. Note that the --server flag must point to the control plane of the first master node:

curl -sfL https://rancher-mirror.oss-cn-beijing.aliyuncs.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn K3S_TOKEN=xxx sh -s - server --server=https://192.168.31.29:6443 --kube-proxy-arg='metrics-bind-address=0.0.0.0:10249' --flannel-backend=host-gw  --write-kubeconfig-mode=644 --bind-address=192.168.31.113
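Once all three servers have joined, a quick sanity check is to list the nodes; each server should report control-plane and etcd roles:

```shell
kubectl get nodes
# Expect three Ready nodes with roles such as "control-plane,etcd,master"
```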

Enabling CUDA support for K3s

containerd setup

Based on the following guide: Running CUDA workloads - k3d

Copy the template below to /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl:

[plugins.opt]
path = "{{ .NodeConfig.Containerd.Opt }}"

[plugins.cri]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"

{{- if .IsRunningInUserNS }}
disable_cgroup = true
disable_apparmor = true
restrict_oom_score_adj = true
{{end}}

{{- if .NodeConfig.AgentConfig.PauseImage }}
sandbox_image = "{{ .NodeConfig.AgentConfig.PauseImage }}"
{{end}}

{{- if not .NodeConfig.NoFlannel }}
[plugins.cri.cni]
bin_dir = "{{ .NodeConfig.AgentConfig.CNIBinDir }}"
conf_dir = "{{ .NodeConfig.AgentConfig.CNIConfDir }}"
{{end}}

[plugins.cri.containerd.runtimes.runc]
# ---- changed from 'io.containerd.runc.v2' for GPU support
runtime_type = "io.containerd.runtime.v1.linux"

# ---- added for GPU support
[plugins.linux]
runtime = "nvidia-container-runtime"

{{ if .PrivateRegistryConfig }}
{{ if .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors]{{end}}
{{range $k, $v := .PrivateRegistryConfig.Mirrors }}
[plugins.cri.registry.mirrors."{{$k}}"]
endpoint = [{{range $i, $j := $v.Endpoints}}{{if $i}}, {{end}}{{printf "%q" .}}{{end}}]
{{end}}

{{range $k, $v := .PrivateRegistryConfig.Configs }}
{{ if $v.Auth }}
[plugins.cri.registry.configs."{{$k}}".auth]
{{ if $v.Auth.Username }}username = "{{ $v.Auth.Username }}"{{end}}
{{ if $v.Auth.Password }}password = "{{ $v.Auth.Password }}"{{end}}
{{ if $v.Auth.Auth }}auth = "{{ $v.Auth.Auth }}"{{end}}
{{ if $v.Auth.IdentityToken }}identitytoken = "{{ $v.Auth.IdentityToken }}"{{end}}
{{end}}
{{ if $v.TLS }}
[plugins.cri.registry.configs."{{$k}}".tls]
{{ if $v.TLS.CAFile }}ca_file = "{{ $v.TLS.CAFile }}"{{end}}
{{ if $v.TLS.CertFile }}cert_file = "{{ $v.TLS.CertFile }}"{{end}}
{{ if $v.TLS.KeyFile }}key_file = "{{ $v.TLS.KeyFile }}"{{end}}
{{end}}
{{end}}
{{end}}

Installing the NVIDIA device plugin

Apply the following manifest:

# Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      containers:
      - image: nvcr.io/nvidia/k8s-device-plugin:v0.12.2
        name: nvidia-device-plugin-ctr
        env:
        - name: FAIL_ON_INIT_ERROR
          value: "false"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
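Assuming the manifest above is saved as nvidia-device-plugin.yml (the filename is arbitrary), apply it with:

```shell
kubectl apply -f nvidia-device-plugin.yml
```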

To restrict the DaemonSet to specific nodes, add a nodeSelector under the Pod template:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  ...
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-ds
    spec:
      nodeSelector:
        gpu: nvidia-2080-ti
      ...
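For this nodeSelector to match anything, the target node must carry the corresponding label; assuming the GPU node is named pve-2080-ti:

```shell
# Label the GPU node so the DaemonSet's nodeSelector matches it
kubectl label node pve-2080-ti gpu=nvidia-2080-ti
```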

After submitting the manifest to the api-server, check whether the plugin is running.

View the log by pod NAME:

kubectl logs -n kube-system nvidia-device-plugin-daemonset-c7d9m
2022/08/08 10:01:16 Starting FS watcher.
2022/08/08 10:01:16 Starting OS watcher.
2022/08/08 10:01:16 Starting Plugins.
2022/08/08 10:01:16 Loading configuration.
2022/08/08 10:01:16 Initializing NVML.
2022/08/08 10:01:16 Updating config with default resource matching patterns.
2022/08/08 10:01:16
Running with config:
{
  "version": "v1",
  "flags": {
    "migStrategy": "none",
    "failOnInitError": false,
    "nvidiaDriverRoot": "/",
    "plugin": {
      "passDeviceSpecs": false,
      "deviceListStrategy": "envvar",
      "deviceIDStrategy": "uuid"
    }
  },
  "resources": {
    "gpus": [
      {
        "pattern": "*",
        "name": "nvidia.com/gpu"
      }
    ]
  },
  "sharing": {
    "timeSlicing": {}
  }
}
2022/08/08 10:01:16 Retreiving plugins.
2022/08/08 10:01:16 Starting GRPC server for 'nvidia.com/gpu'
2022/08/08 10:01:16 Starting to serve 'nvidia.com/gpu' on /var/lib/kubelet/device-plugins/nvidia-gpu.sock
2022/08/08 10:01:16 Registered device plugin for 'nvidia.com/gpu' with Kubelet

Inspect the node information from the control plane:

kubectl describe nodes pve-2080-ti

Running a GPU workload

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  restartPolicy: Never
  containers:
  - name: cuda-container
    image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
    resources:
      limits:
        nvidia.com/gpu: 1 # requesting 1 GPU
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
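Assuming the Pod spec is saved as gpu-pod.yml (the filename is arbitrary), submit it with:

```shell
kubectl apply -f gpu-pod.yml
```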

Check the logs of the run:

wf09@amd-server ➜  gpu kubectl logs gpu-pod
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

The output above confirms that CUDA support is working.

Troubleshooting k3s

k3s flannel failed (delete): failed to parse netconf: unexpected end of JSON input

This appears to be a problem with the CRI network plugin, but I have not found the root cause.

Solution

Delete the k3s network state and restart.

Clear the ipvsadm rules:

ipvsadm -C

Clear the iptables rules:

iptables -F # flush: clear all defined rules
iptables -X # delete: remove all user-defined chains
iptables -Z # zero: reset the counters and traffic statistics of all chains

Delete the virtual network interface:

ip link delete kube-ipvs0 

Restart the cluster:

sudo systemctl start k3s