Kubernetes 中的组件(如 Kubeadm、Kube-apiserver 等)在初始化启动时,都需要在节点(主机)上找到一个合适的 IP 作为组件的 advertising/listening 的地址。[1]
默认情况下,此 IP 是主机上默认路由对应的网卡所在的 IP。
# ip route show default via 172.31.16.1 dev eth0 10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink 10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink 10.244.3.0/24 via 10.244.3.0 dev flannel.1 onlink 10.244.4.0/24 via 10.244.4.0 dev flannel.1 onlink 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 172.31.16.0/20 dev eth0 proto kernel scope link src 172.31.26.11
$ kubectl get pods NAME READY STATUS RESTARTS AGE test-centos7-7cc5dc6987-jz486 0/1 CrashLoopBackOff 8 (111s ago) 17m
查看 Pod 详细信息
$ kubectl describe pod test-centos7-7cc5dc6987-jz486 ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 18m default-scheduler Successfully assigned default/test-centos7-7cc5dc6987-jz486 to ops-kubernetes3 Normal Pulled 16m (x5 over 18m) kubelet Container image "centos:centos7.9.2009" already present on machine Normal Created 16m (x5 over 18m) kubelet Created container centos7 Normal Started 16m (x5 over 18m) kubelet Started container centos7 Warning BackOff 3m3s (x71 over 18m) kubelet Back-off restarting failed container
定位中也可以使用 kubectl describe pod 命令检查 Pod 的退出状态码。Kubernetes 中的 Pod ExitCode 状态码是容器退出时返回的退出状态码,这个状态码通常用来指示容器的执行结果,以便 Kubernetes 和相关工具可以根据它来采取后续的操作。以下是一些常见的 ExitCode 状态码说明:
$ kubeadm token list TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS 8ca35s.butdpihinkdczvqb 19h 2022-09-14T02:54:55Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
# journalctl -f -u filebeat {"type":"illegal_argument_exception","reason":"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [6924]/[3000] maximum shards open;"}
错误消息 {"type":"illegal_argument_exception","reason":"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [6924]/[3000] maximum shards open;"} 显示当前集群已有 6924 个分片,超过了 3000 个的限制。
filebeat 启动后报错,elasticsearch 上未创建相应的索引,关键错误信息 Failed to connect to backoff(elasticsearch(http://1.57.115.214:9200)): Connection marked as failed because the onConnect callback failed: resource 'filebeat-7.5.2' exists, but it is not an alias
journalctl -f -u filebeat INFO [index-management] idxmgmt/std.go:269 ILM policy successfully loaded. ERROR pipeline/output.go:100 Failed to connect to backoff(elasticsearch(http://1.57.115.214:9200)): Connection marked as failed because the onConnect callback failed: resource 'filebeat-7.5.2' exists, but it is not an alias INFO pipeline/output.go:93 Attempting to reconnect to backoff(elasticsearch(http://1.57.115.214:9200)) with 3 reconnect attempt(s) INFO elasticsearch/client.go:753 Attempting to connect to Elasticsearch version 7.6.2 INFO [index-management] idxmgmt/std.go:256 Auto ILM enable success. INFO [index-management.ilm] ilm/std.go:138 do not generate ilm policy: exists=true, overwrite=false INFO [index-management] idxmgmt/std.go:269 ILM policy successfully loaded. ERROR pipeline/output.go:100 Failed to connect to backoff(elasticsearch(http://1.56.219.122:9200)): Connection marked as failed because the onConnect callback failed: resource 'filebeat-7.5.2' exists, but it is not an alias INFO pipeline/output.go:93 Attempting to reconnect to backoff(elasticsearch(http://1.56.219.122:9200)) with 3 reconnect attempt(s)
# etcd {"level":"warn","ts":"2023-10-05T02:16:52.853273Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."} {"level":"info","ts":"2023-10-05T02:16:52.853914Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]} {"level":"warn","ts":"2023-10-05T02:16:52.853947Z","caller":"etcdmain/etcd.go:105","msg":"'data-dir' was empty; using default","data-dir":"default.etcd"} {"level":"warn","ts":"2023-10-05T02:16:52.853994Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."} {"level":"info","ts":"2023-10-05T02:16:52.854009Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]} ...
常用管理命令
etcd
查看版本信息
# etcd --version etcd Version: 3.5.3 Git SHA: 0452feec7 Go Version: go1.16.15 Go OS/Arch: linux/amd64
$ ctr image ls REF TYPE DIGEST SIZE PLATFORMS LABELS docker.io/library/nginx:alpine application/vnd.docker.distribution.manifest.list.v2+json sha256:455c39afebd4d98ef26dd70284aa86e6810b0485af5f4f222b19b89758cabf1e 9.8 MiB linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/ppc64le,linux/s390x -
将镜像挂载到本地目录
$ ctr image mount docker.io/library/nginx:alpine /mnt $ ls /mnt bin docker-entrypoint.d etc lib mnt proc run srv tmp var dev docker-entrypoint.sh home media opt root sbin sys usr
NAME: rancher LAST DEPLOYED: Wed Oct 12 10:22:25 2022 NAMESPACE: cattle-system STATUS: deployed REVISION: 1 TEST SUITE: None NOTES: Rancher Server has been installed.
NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.
Check out our docs at https://rancher.com/docs/
If you provided your own bootstrap password during installation, browse to https://rancher.my.com to get started.
If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates: