ASCII NUL character (\0) as line separator. Since file names can contain any character except NUL, this option ensures that the output of fswatch can be safely parsed using NUL as the delimiter, for example with xargs -0 or the shell builtin read -d ''.
fswatch -0 will split records using the NUL character.
xargs -0 will split records using the NUL character. This is required to correctly match impedance with fswatch.
xargs -n 1 will invoke the command once for every record. If you want to invoke it every x records, use xargs -n x.
xargs -I {} will substitute occurrences of {} in command with the parsed argument. If the command you are running does not need the event path name, just delete this option. If you prefer using another replacement string, substitute {} with yours.
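Put together, a pipeline combining these options might look like the following sketch; the watched path, the backup path, and the nohup.out file-name pattern are placeholders chosen to match the note below, not the original command:

$ fswatch -0 /path/to/watch | \
    xargs -0 -n 1 -I {} cp {} /backup/nohup.out.$(date +%Y%m%d%H%M%S)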
In the command above, date +%Y%m%d%H%M%S is evaluated only once. If the first execution expands it to nohup.out.20230609132143, then every subsequent time xargs fires, the value is still nohup.out.20230609132143; it is not re-evaluated.
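If the timestamp needs to be recomputed on every event, one workaround (a sketch under that assumption, not necessarily the original command) is to defer the expansion into a shell that xargs starts per record, so the command substitution runs on every invocation:

$ fswatch -0 /path/to/watch | \
    xargs -0 -n 1 -I {} sh -c 'cp "$1" "/backup/nohup.out.$(date +%Y%m%d%H%M%S)"' _ {}

Because $(date ...) is inside single quotes, the parent shell does not expand it; the sh started by xargs does, once per event.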
$ lftp -h
Usage: lftp [OPTS] <site>
`lftp' is the first command executed by lftp after rc files
 -f <file>           execute commands from the file and exit
 -c <cmd>            execute the commands and exit
 --help              print this help and exit
 --version           print lftp version and exit
Other options are the same as in `open' command
 -e <cmd>            execute the command just after selecting
 -u <user>[,<pass>]  use the user/password for authentication
 -p <port>           use the port for connection
 <site>              host name, URL or bookmark name
Kubernetes Metrics Server collects resource usage metrics from the kubelet and exposes them through the Metrics API to the Kubernetes API Server, for use by the HPA (Horizontal Pod Autoscaler) and the VPA (Vertical Pod Autoscaler). kubectl top also uses the Metrics API. [1]
Installing Kubernetes Metrics Server
Before installing Kubernetes Metrics Server, the aggregation layer and authentication/authorization must be enabled on the kube-apiserver. [3]
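On a kubeadm-built cluster these settings can be checked in the kube-apiserver static Pod manifest; the path below is the kubeadm default and the flag values shown are only typical examples, so treat this as a sketch rather than required values:

$ grep -E 'requestheader-|proxy-client-|enable-aggregator-routing' /etc/kubernetes/manifests/kube-apiserver.yaml
    - --enable-aggregator-routing=true
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User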
$ kubectl get pods -A | grep metrics
kube-system   metrics-server-5cdf47479d-rwtd6   0/1     Running   0          5m53s
Check the Pod logs
$ kubectl logs -n kube-system metrics-server-5cdf47479d-rwtd6
1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.31.21.3:10250/metrics/resource\": x509: cannot validate certificate for 172.31.21.3 because it doesn't contain any IP SANs" node="k8smaster3"
1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.31.26.116:10250/metrics/resource\": x509: cannot validate certificate for 172.31.26.116 because it doesn't contain any IP SANs" node="k8smaster1"
1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.31.19.164:10250/metrics/resource\": x509: cannot validate certificate for 172.31.19.164 because it doesn't contain any IP SANs" node="k8smaster2"
Following the reference deployment steps, the Metrics Server Pod stayed in a not-ready state after deployment. Check the Pod logs
$ kubectl logs -n kube-system metrics-server-5cdf47479d-rwtd6
"Failed to scrape node" err="Get \"https://k8smaster1:10250/metrics/resource\": dial tcp: lookup k8smaster1 on 10.96.0.10:53: no such host" node="k8smaster1"
"Failed probe" probe="metric-storage-ready" err="no metrics to serve"
"Failed to scrape node" err="Get \"https://k8smaster1:10250/metrics/resource\": dial tcp: lookup k8smaster1 on 10.96.0.10:53: no such host" node="k8smaster1"
"Failed to scrape node" err="Get \"https://k8smaster3:10250/metrics/resource\": dial tcp: lookup k8smaster3 on 10.96.0.10:53: no such host" node="k8smaster3"
"Failed to scrape node" err="Get \"https://k8sworker1:10250/metrics/resource\": dial tcp: lookup k8sworker1 on 10.96.0.10:53: no such host" node="k8sworker1"
"Failed to scrape node" err="Get \"https://k8sworker2:10250/metrics/resource\": dial tcp: lookup k8sworker2 on 10.96.0.10:53: no such host" node="k8sworker2"
"Failed to scrape node" err="Get \"https://k8smaster2:10250/metrics/resource\": dial tcp: lookup k8smaster2 on 10.96.0.10:53: no such host" node="k8smaster2"
"Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I0524 09:32:43.033797 1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
The logs show a hostname resolution problem: the node hostnames cannot be resolved by the cluster DNS when the Metrics Server scrapes the kubelets. This can be fixed by manually adding the hostname-to-IP mappings to the Metrics Server Pod.
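One way to add these mappings (a sketch, using only the master node names and IPs visible in the earlier logs; entries for the remaining nodes would be added the same way) is to patch hostAliases into the metrics-server Deployment:

$ kubectl -n kube-system patch deployment metrics-server --patch \
    '{"spec":{"template":{"spec":{"hostAliases":[
      {"ip":"172.31.26.116","hostnames":["k8smaster1"]},
      {"ip":"172.31.19.164","hostnames":["k8smaster2"]},
      {"ip":"172.31.21.3","hostnames":["k8smaster3"]}]}}}}'

The entries land in the Pod's /etc/hosts, so the node names resolve without changing the cluster DNS.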
After redeploying, log in to the centos7 container and install the tools needed for testing. Based on the log messages, first test whether the Pod can connect to the kubelet on each node
$ curl -v 172.31.16.124:10250
* About to connect() to 172.31.16.124 port 10250 (#0)
*   Trying 172.31.16.124...
* No route to host
* Failed connect to 172.31.16.124:10250; No route to host
* Closing connection 0
curl: (7) Failed connect to 172.31.16.124:10250; No route to host

$ curl -v 172.31.22.159:10250
* About to connect() to 172.31.22.159 port 10250 (#0)
*   Trying 172.31.22.159...
* Connected to 172.31.22.159 (172.31.22.159) port 10250 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 172.31.22.159:10250
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 400 Bad Request
<
Client sent an HTTP request to an HTTPS server.
* Closing connection 0
From the tests above, the Metrics Server Pod cannot connect to the kubelet (172.31.16.124:10250) on its own node k8s-worker1 (172.31.16.124), while it can connect to the kubelets on the other nodes. This pins down the cause of the problem.
Considering that the Metrics Server Pod cannot reach only the kubelet on its own node but can reach the kubelets on other nodes, tracing the connectivity path shows the following: when accessing a kubelet on another node, the Pod's packets are SNATed to the host node's egress IP before leaving the node, so the source IP is the host node's IP. When accessing the kubelet on its own node, however, the source IP is the Metrics Server Pod's IP and the destination IP is the host node's IP. The suspicion is that iptables on the cluster nodes allows the node IPs to access the kubelet, while Pod IPs are not allowed. To verify this, add an iptables rule on node k8s-worker1 allowing access from the Metrics Server Pod's IP and test again
iptables -I INPUT 7 -s 10.244.4.138 -j ACCEPT
Test connectivity to the kubelet again: it now connects successfully. Checking kubectl top node again, metrics for all nodes are now available
$ curl -v 172.31.16.124:10250
* About to connect() to 172.31.16.124 port 10250 (#0)
*   Trying 172.31.16.124...
* Connected to 172.31.16.124 (172.31.16.124) port 10250 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 172.31.16.124:10250
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 400 Bad Request
<
Client sent an HTTP request to an HTTPS server.
* Closing connection 0

$ kubectl top node
NAME                CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
fm-k8s-c1-master1   286m         7%     3242Mi          42%
fm-k8s-c1-master2   150m         3%     3262Mi          43%
fm-k8s-c1-master3   251m         6%     3247Mi          42%
fm-k8s-c1-worker1   166m         1%     4317Mi          13%
fm-k8s-c1-worker2   2013m        12%    21684Mi         70%
Next, run additional containers in the Pod. This example starts two application containers: an nginx container running the nginx service, and a custom container that contains common tools.
$ docker run --name=pod_nginx_test \
    --network=container:0eec8dd9d164 \
    --restart=no --log-opt max-size=100m \
    --runtime=runc --detach=true \
    nginx

$ docker run --name=pod_centos_test \
    --network=container:0eec8dd9d164 \
    --restart=no --log-opt max-size=100m \
    --runtime=runc --detach=true \
    centos7:my ping 127.1

$ docker ps -a
CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS   NAMES
54d57add44d2   centos7:my               "ping 127.1"             3 seconds ago    Up 2 seconds            pod_centos_test
cd0c661774e2   nginx                    "/docker-entrypoint.…"   2 minutes ago    Up 2 minutes            pod_nginx_test
0eec8dd9d164   k8s.gcr.io/pause:3.4.1   "/pause"                 10 minutes ago   Up 10 minutes           test-pause
Enter the pod_centos_test container and check its network, port, and hostname information
$ hostname
test-pause

$ ps -elf
F S UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY      TIME     CMD
4 S root    1     0  0  80   0 -  6218 skb_wa 17:01 ?        00:00:00 ping 127.1
4 S root    7     0  0  80   0 -  2959 do_wai 17:02 pts/0    00:00:00 bash
0 R root   30     7  0  80   0 - 12935 -      17:08 pts/0    00:00:00 ps -elf

$ netstat -anutp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address   Foreign Address   State    PID/Program name
tcp        0      0 0.0.0.0:80      0.0.0.0:*         LISTEN   -
tcp6       0      0 :::80           :::*              LISTEN   -

$ ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

$ curl localhost:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
The containers in the Pod have the same hostname, all taking the pause container's hostname, but they do not actually share the UTS namespace.
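This can be checked from the Docker host by comparing the namespace identifiers of the pause container and an application container; equal inode values mean a shared namespace. A sketch, using the container names from the example above and placeholder PIDs:

$ docker inspect -f '{{.State.Pid}}' test-pause
$ docker inspect -f '{{.State.Pid}}' pod_centos_test
$ readlink /proc/<pause-pid>/ns/net /proc/<centos-pid>/ns/net   # expected to match (shared network namespace)
$ readlink /proc/<pause-pid>/ns/uts /proc/<centos-pid>/ns/uts   # expected to differ, per the observation above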
# Values over 1000 raise an error
>>> client.get_domains(limit=1001)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/godaddypy/client.py", line 105, in _validate_response_success
    response.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://api.godaddy.com/v1/domains?limit=1001
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/godaddypy/client.py", line 149, in get_domains
    data = self._get_json_from_response(url, params=params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/godaddypy/client.py", line 75, in _get_json_from_response
    return self._request_submit(requests.get, url=url, json=json, **kwargs).json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/godaddypy/client.py", line 97, in _request_submit
    self._validate_response_success(resp)
  File "/usr/local/lib/python3.11/site-packages/godaddypy/client.py", line 107, in _validate_response_success
    raise BadResponse(response.json())
godaddypy.client.BadResponse: Response Data: {'code': 'VALUE_OVER', 'message': 'Limit must have a value no greater than 1000'}
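To list more than 1000 domains, the GoDaddy API can be paged instead. A rough sketch with curl against the endpoint from the traceback, using the documented limit and marker query parameters (API_KEY/API_SECRET are placeholders, and the exact paging behaviour should be verified against GoDaddy's API documentation):

$ curl -s -H "Authorization: sso-key $API_KEY:$API_SECRET" \
    "https://api.godaddy.com/v1/domains?limit=1000"
$ # Next page: pass the last domain name of the previous page as marker
$ curl -s -H "Authorization: sso-key $API_KEY:$API_SECRET" \
    "https://api.godaddy.com/v1/domains?limit=1000&marker=<last-domain-of-previous-page>"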
A Kubernetes Service represents the entry point of a backend service in Kubernetes. It mainly consists of the service's access IP (a virtual IP) and port, and it works at L4.
A Service stores only the service's entry information (IP:PORT); it does not store backend Pod information. The Service selects its matching backend Pods through a label selector. When a Pod selected by the Service is running and able to serve traffic, the Kubernetes Endpoint Controller generates a new Endpoint object pointing to that Pod's IP:PORT. When the Service receives a request, it forwards it to the corresponding Endpoint. In addition, the Service's access IP and the Endpoint/Pod IPs are registered in the Kubernetes DNS service, recording the name-to-IP mappings, so users in the cluster can access Services and Pods by domain name.
After a user creates a Service, Kubernetes assigns it a stable in-cluster access IP from the cluster's pool of available service IPs, called the Cluster IP. Kubernetes also registers a DNS entry, giving the Cluster IP a domain name (hostname). The Cluster IP and domain name are unique within the cluster and remain unchanged for the Service's entire lifetime; Kubernetes releases them only when the Service is deleted from the cluster. In this way, the Service acts as a proxy, giving clients a stable, unchanging entry point to the backend service.
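For example, a Service named api-service (shown later in this section) could be resolved from inside any Pod through the cluster DNS roughly like this, assuming the default cluster.local DNS domain and that the Service lives in the default namespace:

$ nslookup api-service.default.svc.cluster.local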
Besides acting as a proxy, a Service also provides load balancing and high availability. When there are multiple backend Pods, request traffic is by default distributed evenly across them in a round-robin fashion, and when a Pod becomes unavailable, the Service no longer routes requests to it.
Kubernetes uses the kube-proxy component running on each node to manage the connections between Services and their backend Pods. kube-proxy is a load-balancing controller for outbound traffic: it watches Service objects through the Kubernetes API and continuously maps each Service IP (ClusterIP) to healthy backend Pods, implementing this with iptables/IPVS rules on the host. Traffic destined for a Service IP is DNATed by these rules directly to the backend Pod IPs.
$ kubectl get services
NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
api-service   ClusterIP   10.106.126.96   <none>        10302/TCP   145d

$ kubectl get endpoints api-service
NAME          ENDPOINTS                               AGE
api-service   10.244.3.138:10302,10.244.4.120:10302   145d
For api-service, once the backend Pods are ready, the ENDPOINTS are generated and kube-proxy creates the following iptables rules for the Service
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE
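The per-service chains kube-proxy generates for api-service typically look roughly like the following sketch (the KUBE-SVC-/KUBE-SEP- hash suffixes are placeholders, the default namespace is assumed, and exact rules vary with the kube-proxy version):

-A KUBE-SERVICES -d 10.106.126.96/32 -p tcp -m comment --comment "default/api-service cluster IP" -m tcp --dport 10302 -j KUBE-SVC-XXXXXXXXXXXXXXXX
-A KUBE-SVC-XXXXXXXXXXXXXXXX -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-AAAAAAAAAAAAAAAA
-A KUBE-SVC-XXXXXXXXXXXXXXXX -j KUBE-SEP-BBBBBBBBBBBBBBBB
-A KUBE-SEP-AAAAAAAAAAAAAAAA -p tcp -m tcp -j DNAT --to-destination 10.244.3.138:10302
-A KUBE-SEP-BBBBBBBBBBBBBBBB -p tcp -m tcp -j DNAT --to-destination 10.244.4.120:10302

Each KUBE-SEP- chain DNATs to one Endpoint, and the probability match splits traffic across the two backends, which is how the even distribution described above is realized in iptables mode.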
May 02 13:29:44 test systemd[1]: Starting Remote desktop service (VNC)...
May 02 13:29:44 test systemd[1]: Started Remote desktop service (VNC).
May 02 13:29:44 test vncserver_wrapper[28040]: You will require a password to access your desktops.
May 02 13:29:44 test vncserver_wrapper[28040]: getpassword error: Inappropriate ioctl for device
May 02 13:29:44 test vncserver_wrapper[28040]: Password:FATAL: 'runuser -l root' failed!
May 02 13:29:44 test systemd[1]: vncserver@:1.service: main process exited, code=exited, status=2/INVALIDARGUMENT
May 02 13:29:44 test systemd[1]: Unit vncserver@:1.service entered failed state.
May 02 13:29:44 test systemd[1]: vncserver@:1.service failed.
$ /etc/rc.d/init.d/vmware start
Starting vmware (via systemctl):  Job for vmware.service failed because the control process exited with error code. See "systemctl status vmware.service" and "journalctl -xe" for details.
[FAILED]

$ journalctl -xe -u vmware
--
-- Unit vmware.service has failed.
--
-- The result is failed.
May 02 11:13:39 k8s-uat-xhy-mysql-new systemd[1]: Unit vmware.service entered failed state.
May 02 11:13:39 k8s-uat-xhy-mysql-new systemd[1]: vmware.service failed.
May 02 11:19:37 k8s-uat-xhy-mysql-new systemd[1]: Starting SYSV: This service starts and stops VMware services...
-- Subject: Unit vmware.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit vmware.service has begun starting up.
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]: Starting VMware services:
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]:    Virtual machine monitor[FAILED]
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]:    Virtual machine communication interface[  OK  ]
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]:    VM communication interface socket family[  OK  ]
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]:    Virtual ethernet[FAILED]
May 02 11:19:37 k8s-uat-xhy-mysql-new vmware[22226]:    VMware Authentication Daemon[  OK  ]
May 02 11:19:37 k8s-uat-xhy-mysql-new systemd[1]: vmware.service: control process exited, code=exited status=1
May 02 11:19:37 k8s-uat-xhy-mysql-new systemd[1]: Failed to start SYSV: This service starts and stops VMware services.
-- Subject: Unit vmware.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit vmware.service has failed.
--
-- The result is failed.
May 02 11:19:37 k8s-uat-xhy-mysql-new systemd[1]: Unit vmware.service entered failed state.
May 02 11:19:37 k8s-uat-xhy-mysql-new systemd[1]: vmware.service failed.
$ curl -v 127.0.0.1:4369
* About to connect() to 127.0.0.1 port 4369 (#0)
*   Trying 127.0.0.1...
* No route to host
* Failed connect to 127.0.0.1:4369; No route to host
* Closing connection 0
curl: (7) Failed connect to 127.0.0.1:4369; No route to host
$ curl -v 127.0.0.1:4369
* About to connect() to 127.0.0.1 port 4369 (#0)
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 4369 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 127.0.0.1:4369
> Accept: */*
>
* Empty reply from server
* Connection #0 to host 127.0.0.1 left intact
curl: (52) Empty reply from server
A Macvlan interface is a virtual sub-interface of a physical Ethernet interface. Macvlan lets you configure multiple virtual network interfaces on top of a single physical interface; each Macvlan interface has its own MAC address, distinct from the parent interface's, and can be assigned an IP address just like an ordinary physical interface. The effect is that one physical NIC can be bound to multiple IP addresses, each with its own independent MAC address.
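As a quick illustration, a Macvlan interface can be created like this (a minimal sketch, assuming eth0 as the parent interface and an unused address on its subnet):

$ ip link add macvlan1 link eth0 type macvlan mode bridge
$ ip addr add 192.168.1.100/24 dev macvlan1
$ ip link set macvlan1 up
$ ip -d link show macvlan1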
The create, delete, modify, and query operations for network namespaces are integrated into the netns subcommand of the Linux ip tool set, so on Linux, network namespaces are managed mainly with the ip netns command
$ ip netns help
Usage: ip netns list
       ip netns add NAME
       ip netns set NAME NETNSID
       ip [-all] netns delete [NAME]
       ip netns identify [PID]
       ip netns pids NAME
       ip [-all] netns exec [NAME] cmd ...
       ip netns monitor
       ip netns list-id
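The netns1 namespace used in the examples below can be created and verified like this:

$ ip netns add netns1
$ ip netns list
netns1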
After a new network namespace is created, you can use the ip netns exec command to enter it and perform network configuration or inspection.
The ip netns exec command can only enter a network namespace by its name
The following command queries the IP address information of the netns1 network namespace
$ ip netns exec netns1 ip add
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
By default, a network namespace has no network devices other than the lo interface it comes with, and this lo interface is in the DOWN state, so even the loopback address is not reachable.
$ ip netns exec netns1 ping 127.0.0.1
connect: Network is unreachable
In this example, to enable the local loopback address, first enter the namespace and bring the loopback interface UP
$ ip netns exec netns1 ip link set dev lo up

$ ip netns exec netns1 ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

$ ip netns exec netns1 ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.021 ms
^C
--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.021/0.021/0.021/0.000 ms