Troubleshooting a "controller manager and scheduler unavailable" Alert

Background

A user reported frequently receiving the alert: controller manager and scheduler unavailable.

Analysis

  1. The controller manager and the scheduler both read and write data through the API server. If the API server becomes unreachable, the controller manager and the scheduler cannot function.

  2. The API server in turn depends on the etcd service. If etcd is unreachable or cannot serve reads and writes, the API server cannot serve the controller manager, the scheduler, or any other client connecting to it.

  3. Many factors can make etcd unreachable or unable to serve reads and writes, for example:

    • a network outage or a transient network drop;

    • in a three-node etcd cluster, the loss of two nodes leaves the remaining node without a quorum, so the cluster can no longer accept writes (it is effectively read-only);

    • etcd members communicate with each other over port 2380 to elect a leader and to exchange heartbeats. If rising load on a node delays its heartbeat responses beyond the configured election timeout, a new leader election is triggered, and the etcd service is unavailable while the election runs.
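The quorum and election behavior described above can be sketched in a few lines of Python (etcd's defaults, both configurable, are a 100 ms heartbeat interval and a 1000 ms election timeout):

```python
# Sketch of etcd's quorum and election-timeout rules; illustrative, not etcd source.

def quorum(cluster_size: int) -> int:
    """Minimum number of reachable members needed for the cluster to accept writes."""
    return cluster_size // 2 + 1

def failure_tolerance(cluster_size: int) -> int:
    """How many members can be lost before write availability is lost."""
    return cluster_size - quorum(cluster_size)

def triggers_election(ms_since_last_heartbeat: float,
                      election_timeout_ms: float = 1000.0) -> bool:
    """A follower that hears nothing from the leader for longer than the
    election timeout starts a new election; the cluster is briefly
    unavailable while a new leader is chosen."""
    return ms_since_last_heartbeat > election_timeout_ms

print(quorum(3), failure_tolerance(3))  # 2 1  -> a 3-node cluster survives only 1 loss
print(triggers_election(1200.0))        # True -> a heartbeat stalled past 1 s forces re-election
```

With the default 1000 ms election timeout, a disk or CPU stall of just over a second on the leader, such as one caused by a VM snapshot, is already enough to force a re-election.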

Checking the K8S components

Based on the analysis above, the next step is to examine the logs of the K8S control-plane components.

kube-scheduler

 E0321 22:47:48.503411       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
 E0321 22:51:57.230575       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: Get https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-scheduler?timeout=10s: context deadline exceeded
 E0321 22:54:39.196072       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

kube-controller-manager

 E0321 22:47:50.884336       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-controller-manager: Get https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
 I0321 22:47:53.739694       1 leaderelection.go:252] successfully acquired lease kube-system/kube-controller-manager
 I0321 22:47:53.739746       1 event.go:278] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"k

The logs above show that connections to the API server started timing out around 2021-03-21 22:47 UTC. Next, check the API server logs.

kube-apiserver

 E0321 22:47:41.289207       1 authentication.go:53] Unable to authenticate the request due to an error: [invalid bearer token, context canceled]
 E0321 22:47:48.503439       1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
 I0321 22:47:48.503648       1 trace.go:116] Trace[1655495025]: "Get" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:kube-scheduler/v1.18.10 (linux/amd64) kubernetes/62876fc/leader-election,client:127.0.0.1 (started: 2021-03-21 22:47:38.517639233 +0000 UTC m=+1182359.771988386) (total time: 9.985984157s):
 .
 .
 .
 Trace[723022686]: [21.463074932s] [21.463070777s] About to write a response
 W0321 22:52:07.981746       1 lease.go:224] Resetting endpoints for master service "kubernetes" to [10.151.130.137 10.151.130.138 10.151.130.139]
 E0321 22:54:34.697599       1 status.go:71] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: leader changed"}
 

From this we can conclude that an etcd leader change (etcdserver: leader changed) made the controller manager and the scheduler unavailable.
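To line up the timestamps across components, it helps to extract just the error-level lines from the klog-format output. A minimal filter (the regex and sample lines here are illustrative) might look like:

```python
import re

# klog lines start with severity + MMDD HH:MM:SS.micro, e.g. "E0321 22:47:48.503439 ...".
KLOG_RE = re.compile(r"^([EWIF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+\d+\s+\S+\] (.*)")

def errors_matching(lines, pattern):
    """Yield (timestamp, message) for error-level klog lines whose message contains pattern."""
    for line in lines:
        m = KLOG_RE.match(line.strip())
        if m and m.group(1) == "E" and pattern in m.group(4):
            yield f"{m.group(2)} {m.group(3)}", m.group(4)

# Sample lines taken from the logs above.
sample = [
    'E0321 22:54:34.697599       1 status.go:71] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: leader changed"}',
    'I0321 22:47:53.739694       1 leaderelection.go:252] successfully acquired lease kube-system/kube-controller-manager',
]
for ts, msg in errors_matching(sample, "leader changed"):
    print(ts)  # 0321 22:54:34.697599
```

Running this over the kube-scheduler, kube-controller-manager, and kube-apiserver logs makes it easy to see that the timeouts in all three components cluster around the same moment.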

Checking ETCD

Since the API server logs showed etcdserver: leader changed, the next step is to examine the etcd service logs.

ETCD

 2021-03-21 22:47:38.503291 W | rafthttp: lost the TCP streaming connection with peer 496ad58099e4ad83 (stream Message reader)
 2021-03-21 22:47:38.503313 E | rafthttp: failed to read 496ad58099e4ad83 on stream Message (read tcp 10.151.130.138:51268->10.151.130.139:2380: i/o timeout)
 2021-03-21 22:47:38.503317 I | rafthttp: peer 496ad58099e4ad83 became inactive (message send to peer failed)
 2021-03-21 22:47:38.503628 W | rafthttp: lost the TCP streaming connection with peer ee17b2c8bf87d608 (stream Message reader)
 2021-03-21 22:47:38.503641 E | rafthttp: failed to read ee17b2c8bf87d608 on stream Message (read tcp 10.151.130.138:34276->10.151.130.137:2380: i/o timeout)
 2021-03-21 22:47:38.503645 I | rafthttp: peer ee17b2c8bf87d608 became inactive (message send to peer failed)
 2021-03-21 22:47:38.517019 W | rafthttp: lost the TCP streaming connection with peer ee17b2c8bf87d608 (stream MsgApp v2 reader)
 2021-03-21 22:47:38.521398 W | rafthttp: lost the TCP streaming connection with peer 496ad58099e4ad83 (stream MsgApp v2 reader)
 2021-03-21 22:47:38.531432 I | rafthttp: peer ee17b2c8bf87d608 became active
 2021-03-21 22:47:38.531455 I | rafthttp: established a TCP streaming connection with peer ee17b2c8bf87d608 (stream MsgApp v2 reader)
 2021-03-21 22:47:38.531461 I | rafthttp: peer 496ad58099e4ad83 became active
 2021-03-21 22:47:38.531474 I | rafthttp: established a TCP streaming connection with peer 496ad58099e4ad83 (stream MsgApp v2 reader)
 2021-03-21 22:47:38.531599 I | rafthttp: established a TCP streaming connection with peer 496ad58099e4ad83 (stream Message reader)
 2021-03-21 22:47:38.533011 I | rafthttp: established a TCP streaming connection with peer ee17b2c8bf87d608 (stream Message reader)
 2021-03-21 22:47:38.631853 E | rafthttp: failed to write ee17b2c8bf87d608 on stream Message (write tcp 10.151.130.138:2380->10.151.130.137:36700: write: connection reset by peer)
 2021-03-21 22:47:38.631869 I | rafthttp: peer ee17b2c8bf87d608 became inactive (message send to peer failed)
 2021-03-21 22:47:38.632307 W | rafthttp: lost the TCP streaming connection with peer ee17b2c8bf87d608 (stream Message writer)
 2021-03-21 22:47:38.634600 I | rafthttp: peer ee17b2c8bf87d608 became active
 2021-03-21 22:47:40.880842 W | rafthttp: closed an existing TCP streaming connection with peer 496ad58099e4ad83 (stream Message writer)
 2021-03-21 22:47:40.880863 I | rafthttp: established a TCP streaming connection with peer 496ad58099e4ad83 (stream Message writer)
 2021-03-21 22:47:40.881069 W | rafthttp: closed an existing TCP streaming connection with peer ee17b2c8bf87d608 (stream MsgApp v2 writer)
 2021-03-21 22:47:40.881086 I | rafthttp: established a TCP streaming connection with peer ee17b2c8bf87d608 (stream MsgApp v2 writer)
 2021-03-21 22:47:40.882339 W | rafthttp: closed an existing TCP streaming connection with peer 496ad58099e4ad83 (stream MsgApp v2 writer)
 2021-03-21 22:47:40.882354 I | rafthttp: established a TCP streaming connection with peer 496ad58099e4ad83 (stream MsgApp v2 writer)
 2021-03-21 22:47:41.425089 I | rafthttp: established a TCP streaming connection with peer ee17b2c8bf87d608 (stream Message writer)

The etcd logs confirm that at 2021-03-21 22:47:38 UTC the member lost its TCP streaming connections to its peers (lost the TCP streaming connection with peer).

Pinpointing the problem

  • Based on the etcd log messages, the initial suspicion was a transient network drop causing etcd peer heartbeat timeouts. We reviewed the monitoring of the external switches and set up a tool to periodically health-check etcd on ports 2379 and 2380; after several days of observation, no network connectivity issue was found.

  • Disk monitoring showed that disk I/O spiked noticeably during the time windows in which the alerts fired.

  • After talking to the infrastructure team, we learned that the virtual machines take a scheduled snapshot every day, at around 22:50 UTC, which is exactly the alert window.
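The periodic port check mentioned above can be done with any monitoring tool; a bare-bones probe (the member IPs are the ones from this cluster) could be:

```python
import socket

# Minimal TCP reachability probe. A successful connect only proves the port is
# reachable; it does not prove etcd can serve reads/writes, so pair it with
# `etcdctl endpoint health` for a fuller check.

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Illustrative: probe the etcd client (2379) and peer (2380) ports on each member.
for host in ("10.151.130.137", "10.151.130.138", "10.151.130.139"):
    for port in (2379, 2380):
        print(host, port, port_open(host, port))
```

Run on a schedule (e.g. every few seconds from cron or a loop), this is enough to rule network connectivity in or out as a cause, which is how the network was excluded here.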

Follow-up observation

After discussion, the scheduled snapshot was moved two hours earlier. As expected, the alerts, and the lost the TCP streaming connection with peer messages in the etcd logs, also shifted two hours earlier.

kube-scheduler

 E0322 20:48:20.871852       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
 I0322 20:48:31.622309       1 leaderelection.go:252] successfully acquired lease kube-system/kube-scheduler
 E0322 20:52:34.275658       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: etcdserver: leader changed
 E0322 20:55:12.950234       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-scheduler: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
 I0322 20:55:12.950297       1 leaderelection.go:277] failed to renew lease kube-system/kube-scheduler: timed out waiting for the condition
 F0322 20:55:12.950314       1 server.go:244] leaderelection lost

kube-controller-manager

 E0322 20:48:14.476644       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-controller-manager: etcdserver: leader changed
 E0322 20:48:28.187307       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-controller-manager: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
 E0322 20:52:41.226524       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-controller-manager: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
 E0322 20:55:12.902921       1 leaderelection.go:320] error retrieving resource lock kube-system/kube-controller-manager: Get https://127.0.0.1:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)

kube-apiserver

 E0322 20:48:29.475790       1 status.go:71] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: request timed out"}
 E0322 20:48:29.475844       1 status.go:71] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: request timed out"}
 
 E0322 20:52:34.274857       1 status.go:71] apiserver received an error that is not an metav1.Status: rpctypes.EtcdError{code:0xe, desc:"etcdserver: leader changed"}

ETCD

 2021-03-22 20:30:46.088174 I | mvcc: finished scheduled compaction at 23035178 (took 26.164333ms)
 2021-03-22 20:34:52.330573 W | etcdserver: read-only range request "key:\"/registry/persistentvolumeclaims/default/nginx-logs-pvc\" " with result "range_response_count:1 size:1405" took too long (1.669619791s) to execute
 2021-03-22 20:34:52.332649 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" " with result "range_response_count:1 size:615" took too long (1.471652409s) to execute
 2021-03-22 20:34:52.333036 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-scheduler\" " with result "range_response_count:1 size:588" took too long (217.627474ms) to execute
 2021-03-22 20:34:52.333058 W | etcdserver: read-only range request "key:\"/registry/resourcequotas/\" range_end:\"/registry/resourcequotas0\" count_only:true " with result "range_response_count:0 size:9" took too long (630.275951ms) to execute
 2021-03-22 20:34:52.333081 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/local-path-storage/rancher.io-local-path\" " with result "range_response_count:1 size:653" took too long (394.73969ms) to execute
 2021-03-22 20:34:52.333308 W | etcdserver: read-only range request "key:\"/registry/services/specs/default/kubernetes\" " with result "range_response_count:1 size:616" took too long (430.624112ms) to execute
 2021-03-22 20:35:46.067308 I | mvcc: store.index: compact 23036390
 2021-03-22 20:35:46.093276 I | mvcc: finished scheduled compaction at 23036390 (took 25.599134ms)
 2021-03-22 20:40:46.077355 I | mvcc: store.index: compact 23037604
 2021-03-22 20:40:46.104071 I | mvcc: finished scheduled compaction at 23037604 (took 26.305942ms)
 2021-03-22 20:43:58.958177 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" " with result "range_response_count:1 size:615" took too long (702.184455ms) to execute
 2021-03-22 20:43:58.959252 W | etcdserver: read-only range request "key:\"/registry/leases/kube-system/kube-scheduler\" " with result "range_response_count:1 size:491" took too long (687.254ms) to execute
 2021-03-22 20:43:58.959275 W | etcdserver: read-only range request "key:\"/registry/leases/kube-system/kube-controller-manager\" " with result "range_response_count:1 size:518" took too long (651.871463ms) to execute
 2021-03-22 20:43:58.959363 W | etcdserver: read-only range request "key:\"/registry/events/\" range_end:\"/registry/events0\" count_only:true " with result "range_response_count:0 size:9" took too long (647.299563ms) to execute
 2021-03-22 20:45:46.088764 I | mvcc: store.index: compact 23038829
 2021-03-22 20:45:46.117371 I | mvcc: finished scheduled compaction at 23038829 (took 28.090255ms)
 2021-03-22 20:48:13.545676 W | rafthttp: lost the TCP streaming connection with peer 4bd420be9a532df8 (stream Message reader)
 2021-03-22 20:48:13.545703 E | rafthttp: failed to read 4bd420be9a532df8 on stream Message (read tcp 10.151.130.137:54864->10.151.130.138:2380: i/o timeout)
 2021-03-22 20:48:13.545707 I | rafthttp: peer 4bd420be9a532df8 became inactive (message send to peer failed)
 2021-03-22 20:48:13.549798 W | rafthttp: lost the TCP streaming connection with peer 4bd420be9a532df8 (stream MsgApp v2 reader)
 raft2021/03/22 20:48:14 INFO: ee17b2c8bf87d608 [term: 164] received a MsgVote message with higher term from 496ad58099e4ad83 [term: 165]
 raft2021/03/22 20:48:14 INFO: ee17b2c8bf87d608 became follower at term 165
 raft2021/03/22 20:48:14 INFO: ee17b2c8bf87d608 [logterm: 164, index: 29128056, vote: 0] cast MsgVote for 496ad58099e4ad83 [logterm: 164, index: 29128056] at term 165
 raft2021/03/22 20:48:14 INFO: raft.node: ee17b2c8bf87d608 lost leader 4bd420be9a532df8 at term 165
 raft2021/03/22 20:48:14 INFO: raft.node: ee17b2c8bf87d608 elected leader 496ad58099e4ad83 at term 165
 2021-03-22 20:48:14.475506 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" " with result "error:etcdserver: leader changed" took too long (5.299921652s) to execute
 2021-03-22 20:48:14.477440 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/local-path-storage/rancher.io-local-path\" " with result "range_response_count:1 size:653" took too long (4.456789691s) to execute
 2021-03-22 20:48:14.477462 W | etcdserver: read-only range request "key:\"/registry/storageclasses/\" range_end:\"/registry/storageclasses0\" count_only:true " with result "range_response_count:0 size:9" took too long (3.623175886s) to execute
 2021-03-22 20:48:14.477539 W | etcdserver: read-only range request "key:\"/registry/networkpolicies/\" range_end:\"/registry/networkpolicies0\" count_only:true " with result "range_response_count:0 size:7" took too long (478.668022ms) to execute
 2021-03-22 20:48:14.477622 W | etcdserver: read-only range request "key:\"/registry/rolebindings/\" range_end:\"/registry/rolebindings0\" count_only:true " with result "range_response_count:0 size:9" took too long (1.244980828s) to execute
 2021-03-22 20:48:14.477732 W | etcdserver: read-only range request "key:\"/registry/apiregistration.k8s.io/apiservices/\" range_end:\"/registry/apiregistration.k8s.io/apiservices0\" count_only:true " with result "range_response_count:0 size:9" took too long (2.397168285s) to execute
 2021-03-22 20:48:16.618038 I | rafthttp: peer 4bd420be9a532df8 became active
 2021-03-22 20:48:16.618059 I | rafthttp: established a TCP streaming connection with peer 4bd420be9a532df8 (stream MsgApp v2 reader)
 2021-03-22 20:48:16.624716 W | rafthttp: closed an existing TCP streaming connection with peer 4bd420be9a532df8 (stream MsgApp v2 writer)
 2021-03-22 20:48:16.624736 I | rafthttp: established a TCP streaming connection with peer 4bd420be9a532df8 (stream MsgApp v2 writer)
 2021-03-22 20:48:16.624878 I | rafthttp: established a TCP streaming connection with peer 4bd420be9a532df8 (stream Message reader)
 2021-03-22 20:48:16.627175 W | rafthttp: closed an existing TCP streaming connection with peer 4bd420be9a532df8 (stream Message writer)
 2021-03-22 20:48:16.627188 I | rafthttp: established a TCP streaming connection with peer 4bd420be9a532df8 (stream Message writer)
 WARNING: 2021/03/22 20:48:23 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
 2021-03-22 20:50:46.088074 I | mvcc: store.index: compact 23040043

Resolution

After further discussion, we decided to disable the scheduled snapshots on the etcd virtual machines entirely. After a day of operation, lost the TCP streaming connection with peer no longer appeared in the etcd logs.

 2021-03-23 19:58:04.190676 I | etcdserver: saved snapshot at index 29600607
 2021-03-23 19:58:04.190832 I | etcdserver: compacted raft log at 29595607
 2021-03-23 19:58:33.388856 I | pkg/fileutil: purged file /var/lib/rancher/etcd/member/snap/00000000000000a7-0000000001bfdaca.snap successfully
 2021-03-23 20:00:48.065975 I | mvcc: store.index: compact 23379302
 2021-03-23 20:00:48.091487 I | mvcc: finished scheduled compaction at 23379302 (took 25.110115ms)
 2021-03-23 20:02:39.378543 W | etcdserver: read-only range request "key:\"/registry/configmaps/ingress-nginx/ingress-controller-leader-nginx\" " with result "range_response_count:1 size:607" took too long (117.791832ms) to execute
 2021-03-23 20:05:48.073024 I | mvcc: store.index: compact 23380519
 2021-03-23 20:05:48.100228 I | mvcc: finished scheduled compaction at 23380519 (took 26.693346ms)
 2021-03-23 20:10:48.079058 I | mvcc: store.index: compact 23381735
 2021-03-23 20:10:48.106893 I | mvcc: finished scheduled compaction at 23381735 (took 27.437585ms)
 2021-03-23 20:15:30.155491 I | wal: segmented wal file /var/lib/rancher/etcd/member/wal/000000000000015a-0000000001c3c1fc.wal is created
 2021-03-23 20:15:33.414322 I | pkg/fileutil: purged file /var/lib/rancher/etcd/member/wal/0000000000000155-0000000001bcc554.wal successfully
 2021-03-23 20:15:48.095615 I | mvcc: store.index: compact 23382954
 2021-03-23 20:15:48.122662 I | mvcc: finished scheduled compaction at 23382954 (took 26.68459ms)
 2021-03-23 20:20:48.092179 I | mvcc: store.index: compact 23384166
 2021-03-23 20:20:48.118647 I | mvcc: finished scheduled compaction at 23384166 (took 26.050565ms)
 2021-03-23 20:25:48.097468 I | mvcc: store.index: compact 23385389
 2021-03-23 20:25:48.125829 I | mvcc: finished scheduled compaction at 23385389 (took 28.006118ms)
 2021-03-23 20:30:48.103474 I | mvcc: store.index: compact 23386608
 2021-03-23 20:30:48.131849 I | mvcc: finished scheduled compaction at 23386608 (took 27.984451ms)
 2021-03-23 20:35:48.118190 I | mvcc: store.index: compact 23387819
 2021-03-23 20:35:48.144807 I | mvcc: finished scheduled compaction at 23387819 (took 26.220415ms)
 2021-03-23 20:40:48.124117 I | mvcc: store.index: compact 23389036
 2021-03-23 20:40:48.150120 I | mvcc: finished scheduled compaction at 23389036 (took 25.643958ms)
 2021-03-23 20:45:48.130181 I | mvcc: store.index: compact 23390248
 2021-03-23 20:45:48.156873 I | mvcc: finished scheduled compaction at 23390248 (took 26.303108ms)
 2021-03-23 20:50:48.136542 I | mvcc: store.index: compact 23391473
 2021-03-23 20:50:48.168088 I | mvcc: finished scheduled compaction at 23391473 (took 31.146475ms)
 2021-03-23 20:55:48.141884 I | mvcc: store.index: compact 23392692
 2021-03-23 20:55:48.168249 I | mvcc: finished scheduled compaction at 23392692 (took 25.983679ms)
 2021-03-23 21:00:48.148486 I | mvcc: store.index: compact 23393921
 2021-03-23 21:00:48.175167 I | mvcc: finished scheduled compaction at 23393921 (took 26.323769ms)
 2021-03-23 21:05:48.153553 I | mvcc: store.index: compact 23395139
 2021-03-23 21:05:48.179843 I | mvcc: finished scheduled compaction at 23395139 (took 25.890462ms)
 2021-03-23 21:10:48.159706 I | mvcc: store.index: compact 23396363
 2021-03-23 21:10:48.186493 I | mvcc: finished scheduled compaction at 23396363 (took 26.422307ms)
 2021-03-23 21:15:48.165114 I | mvcc: store.index: compact 23397590
 2021-03-23 21:15:48.191193 I | mvcc: finished scheduled compaction at 23397590 (took 25.715209ms)
 2021-03-23 21:20:48.171112 I | mvcc: store.index: compact 23398807
 2021-03-23 21:20:48.198116 I | mvcc: finished scheduled compaction at 23398807 (took 26.516055ms)
 2021-03-23 21:25:48.176894 I | mvcc: store.index: compact 23400025
 2021-03-23 21:25:48.205417 I | mvcc: finished scheduled compaction at 23400025 (took 28.06852ms)
 2021-03-23 21:30:48.182368 I | mvcc: store.index: compact 23401248
 2021-03-23 21:30:48.208162 I | mvcc: finished scheduled compaction at 23401248 (took 25.359328ms)
 2021-03-23 21:35:48.188427 I | mvcc: store.index: compact 23402496
 2021-03-23 21:35:48.214977 I | mvcc: finished scheduled compaction at 23402496 (took 26.130896ms)
 2021-03-23 21:40:48.203757 I | mvcc: store.index: compact 23403714
 2021-03-23 21:40:48.230346 I | mvcc: finished scheduled compaction at 23403714 (took 26.234451ms)
 2021-03-23 21:45:48.203085 I | mvcc: store.index: compact 23404944
 2021-03-23 21:45:48.232110 I | mvcc: finished scheduled compaction at 23404944 (took 28.66739ms)
 2021-03-23 21:50:48.211722 I | mvcc: store.index: compact 23406161
 2021-03-23 21:50:48.238941 I | mvcc: finished scheduled compaction at 23406161 (took 26.80059ms)
 2021-03-23 21:55:48.221035 I | mvcc: store.index: compact 23407381
 2021-03-23 21:55:48.248858 I | mvcc: finished scheduled compaction at 23407381 (took 27.422759ms)
 2021-03-23 22:00:48.229789 I | mvcc: store.index: compact 23408597
 2021-03-23 22:00:48.257402 I | mvcc: finished scheduled compaction at 23408597 (took 27.233652ms)
 2021-03-23 22:05:48.243738 I | mvcc: store.index: compact 23409819
 2021-03-23 22:05:48.269861 I | mvcc: finished scheduled compaction at 23409819 (took 25.728754ms)
 2021-03-23 22:10:48.249241 I | mvcc: store.index: compact 23411044
 2021-03-23 22:10:48.275749 I | mvcc: finished scheduled compaction at 23411044 (took 26.150918ms)
 2021-03-23 22:15:48.254890 I | mvcc: store.index: compact 23412261
 2021-03-23 22:15:48.281103 I | mvcc: finished scheduled compaction at 23412261 (took 25.838699ms)
 2021-03-23 22:20:13.800090 W | etcdserver: read-only range request "key:\"/registry/ranges/serviceips\" " with result "range_response_count:1 size:8223" took too long (208.151939ms) to execute
 2021-03-23 22:20:48.264931 I | mvcc: store.index: compact 23413481
 2021-03-23 22:20:48.290955 I | mvcc: finished scheduled compaction at 23413481 (took 25.550081ms)
 2021-03-23 22:25:38.543370 I | etcdserver: start to snapshot (applied: 29650608, lastsnap: 29600607)
 2021-03-23 22:25:38.547640 I | etcdserver: saved snapshot at index 29650608
 2021-03-23 22:25:38.547810 I | etcdserver: compacted raft log at 29645608
 2021-03-23 22:25:48.270855 I | mvcc: store.index: compact 23414700
 2021-03-23 22:25:48.298973 I | mvcc: finished scheduled compaction at 23414700 (took 27.738717ms)
 2021-03-23 22:26:03.434776 I | pkg/fileutil: purged file /var/lib/rancher/etcd/member/snap/00000000000000a7-0000000001c09e1b.snap successfully
 2021-03-23 22:30:48.275888 I | mvcc: store.index: compact 23415924
 2021-03-23 22:30:48.305413 I | mvcc: finished scheduled compaction at 23415924 (took 29.128337ms)
 2021-03-23 22:35:48.280997 I | mvcc: store.index: compact 23417141
 2021-03-23 22:35:48.307847 I | mvcc: finished scheduled compaction at 23417141 (took 26.42452ms)
 2021-03-23 22:40:48.287720 I | mvcc: store.index: compact 23418356
 2021-03-23 22:40:48.317291 I | mvcc: finished scheduled compaction at 23418356 (took 29.183932ms)
 2021-03-23 22:45:48.295599 I | mvcc: store.index: compact 23419585
 2021-03-23 22:45:48.322346 I | mvcc: finished scheduled compaction at 23419585 (took 26.370195ms)
 2021-03-23 22:50:48.301478 I | mvcc: store.index: compact 23420817
 2021-03-23 22:50:48.327555 I | mvcc: finished scheduled compaction at 23420817 (took 25.656891ms)
 2021-03-23 22:55:48.307901 I | mvcc: store.index: compact 23422034
 2021-03-23 22:55:48.337474 I | mvcc: finished scheduled compaction at 23422034 (took 29.204046ms)
 2021-03-23 23:00:48.314532 I | mvcc: store.index: compact 23423249
 2021-03-23 23:00:48.340979 I | mvcc: finished scheduled compaction at 23423249 (took 26.109985ms)
 2021-03-23 23:05:48.319811 I | mvcc: store.index: compact 23424469
 2021-03-23 23:05:48.346462 I | mvcc: finished scheduled compaction at 23424469 (took 26.255207ms)
 2021-03-23 23:10:48.326553 I | mvcc: store.index: compact 23425683
 2021-03-23 23:10:48.353562 I | mvcc: finished scheduled compaction at 23425683 (took 26.632973ms)
 2021-03-23 23:15:48.333217 I | mvcc: store.index: compact 23426896
 2021-03-23 23:15:48.359319 I | mvcc: finished scheduled compaction at 23426896 (took 25.72745ms)
 2021-03-23 23:20:48.339558 I | mvcc: store.index: compact 23428115
 2021-03-23 23:20:48.366617 I | mvcc: finished scheduled compaction at 23428115 (took 26.658721ms)
 2021-03-23 23:25:48.345722 I | mvcc: store.index: compact 23429337
 2021-03-23 23:25:48.373084 I | mvcc: finished scheduled compaction at 23429337 (took 27.017076ms)
 2021-03-23 23:30:48.351550 I | mvcc: store.index: compact 23430561
 2021-03-23 23:30:48.378308 I | mvcc: finished scheduled compaction at 23430561 (took 26.3218ms)
 2021-03-23 23:35:48.356635 I | mvcc: store.index: compact 23431805
 2021-03-23 23:35:48.382235 I | mvcc: finished scheduled compaction at 23431805 (took 25.195416ms)
 2021-03-23 23:40:48.362534 I | mvcc: store.index: compact 23433019
 2021-03-23 23:40:48.388744 I | mvcc: finished scheduled compaction at 23433019 (took 25.639283ms)
 2021-03-23 23:45:48.368555 I | mvcc: store.index: compact 23434235
 2021-03-23 23:45:48.395256 I | mvcc: finished scheduled compaction at 23434235 (took 26.290222ms)
 2021-03-23 23:50:48.373567 I | mvcc: store.index: compact 23435452
 2021-03-23 23:50:48.405420 I | mvcc: finished scheduled compaction at 23435452 (took 31.411081ms)
 2021-03-23 23:55:48.379145 I | mvcc: store.index: compact 23436674
 2021-03-23 23:55:48.405956 I | mvcc: finished scheduled compaction at 23436674 (took 26.306194ms)
 2021-03-24 00:00:48.384095 I | mvcc: store.index: compact 23437902
Originally published on 大咖说Rancher. Reproduction without permission is prohibited.