Fixing a Docker service startup failure caused by an overlay option in the configuration file [tested successfully]

艺帆风顺, published 2025-04-03


I. Background

Docker on one of our servers suddenly failed to start after the server was rebooted. The service status was as follows:

    [root@Master ~]# systemctl status docker
    ● docker.service - Docker Application Container Engine
       Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
       Active: failed (Result: exit-code) since Sun 2023-10-08 17:04:55 CST; 4s ago
         Docs: https://docs.docker.com
      Process: 1496 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, st>
     Main PID: 1496 (code=exited, status=1/FAILURE)

    10月 08 17:04:55 Master systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
    10月 08 17:04:55 Master systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
    10月 08 17:04:55 Master systemd[1]: Stopped Docker Application Container Engine.
    10月 08 17:04:55 Master systemd[1]: docker.service: Start request repeated too quickly.
    10月 08 17:04:55 Master systemd[1]: docker.service: Failed with result 'exit-code'.
    10月 08 17:04:55 Master systemd[1]: Failed to start Docker Application Container Engine.

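II. Troubleshooting

Rebooting the server repeatedly did not fix the problem.

The status output shows systemd restarting the service in quick succession until it gave up ("Start request repeated too quickly"). In other words, dockerd exits immediately every time it is started, which usually points to a configuration problem or some other startup error.

Start the Docker daemon manually in the foreground to see the actual error message:

    dockerd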

    [root@Master ~]# sudo dockerd
    INFO[2023-10-08T17:05:35.279761600+08:00] Starting up
    INFO[2023-10-08T17:05:35.305377717+08:00] [graphdriver] trying configured driver: overlay2
    failed to start daemon: error initializing graphdriver: overlay2: unknown option overlay2.override_kernel_check: overlay2
    INFO[2023-10-08T17:05:35.334807848+08:00] stopping event stream following graceful shutdown  error="context canceled" module=libcontainerd namespace=plugins.moby
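Because systemd captures the daemon's output, the same "failed to start daemon" line should also be in the journal of the failed unit, so it can be found without running dockerd in the foreground. The sketch below filters any log text on stdin for that line. The helper name extract_daemon_error is made up for illustration and is not part of Docker or systemd.

```shell
# Print the first "failed to start daemon" message found on stdin.
# extract_daemon_error is a hypothetical helper name, not a Docker tool.
extract_daemon_error() {
    grep -o 'failed to start daemon: .*' | head -n 1
}

# Possible use on the affected host (reads the current boot's journal):
#   journalctl -u docker.service -b --no-pager | extract_daemon_error
```

The function prints nothing if the daemon never logged such a line; in that case the full journal output is the place to look.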

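The log shows why Docker failed to start: the overlay2 storage driver was given an option it does not recognize, overlay2.override_kernel_check.

This kind of failure is usually caused by an option in the Docker configuration file that the installed Docker version does not accept. The successful startup log later in this article reports version=24.0.5. According to Docker's list of deprecated features, overlay2.override_kernel_check was deprecated in 19.03 and removed in 24.0, which would explain why a configuration that used to work stopped working once the daemon was restarted on the newer version.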
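A quick way to confirm that the configuration file really contains the option is to search for it. The sketch below assumes the default path /etc/docker/daemon.json. The helper name find_overlay_option is hypothetical. A plain text search matches the option whether it appears as a key of its own or inside a "storage-opts" list.

```shell
# List every line of a daemon.json that mentions the rejected option,
# prefixed with its line number. find_overlay_option is a hypothetical
# helper; the default path is the usual location of the Docker config.
find_overlay_option() {
    grep -n 'overlay2\.override_kernel_check' "${1:-/etc/docker/daemon.json}"
}

# Possible use:
#   find_overlay_option                      # checks /etc/docker/daemon.json
#   find_overlay_option /path/to/other.json  # checks another file
```

No output (and a non-zero exit status) means the option is not in that file, in which case it may be passed some other way, for example on the dockerd command line.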

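III. Fix

1. Open the Docker configuration file, usually /etc/docker/daemon.json.

2. Check whether the file contains the overlay2.override_kernel_check option. If it does, delete it. daemon.json is plain JSON, which has no comment syntax, so the entry cannot simply be commented out.

For example, if the configuration file contains the following:

    {
      "storage-driver": "overlay2",
      "overlay2.override_kernel_check": "false"
    }

open it in an editor:

    vim /etc/docker/daemon.json

and delete the two entries shown above, making a backup first as appropriate. Only overlay2.override_kernel_check is named in the error. With "storage-driver" removed as well, the daemon keeps the driver it was already using, which the log below reports as "using prior storage driver: overlay2".

3. Start Docker again: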

    [root@Master ~]# sudo dockerd
    INFO[2023-10-08T17:07:16.064357110+08:00] Starting up
    INFO[2023-10-08T17:07:16.114444890+08:00] [graphdriver] using prior storage driver: overlay2
    ERRO[2023-10-08T17:07:16.246438638+08:00] not restoring image chainID="sha256:da023cb3c0644a16e5af44524ea0ed00de84e4c3afaf9cf19833f58127476d81" err="layer does not exist" os=linux
    ERRO[2023-10-08T17:07:16.249728045+08:00] not restoring image chainID="sha256:741783d3ef2c8c5466a65a3634c245ff65c93fdb7947b269c7585381ea81f7e2" err="layer does not exist" os=linux
    INFO[2023-10-08T17:07:16.257058064+08:00] Loading containers: start.
    ERRO[2023-10-08T17:07:16.259109012+08:00] failed to load container mount container=9c10ef85eea228d4988bf522e1b07129dad82c855ae5d53901f5f0f6d7b20bfb error="mount does not exist"
    INFO[2023-10-08T17:07:16.800070226+08:00] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
    INFO[2023-10-08T17:07:16.875444586+08:00] Loading containers: done.
    WARN[2023-10-08T17:07:16.978845814+08:00] Not using native diff for overlay2, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_DIR enabled storage-driver=overlay2
    INFO[2023-10-08T17:07:16.979036178+08:00] Docker daemon commit=a61e2b4 graphdriver=overlay2 version=24.0.5
    INFO[2023-10-08T17:07:16.989300325+08:00] Listening for connections addr="[::]:2377" module=node node.id=linr331fv2jrnsau59l2cg3kx proto=tcp
    INFO[2023-10-08T17:07:16.989514535+08:00] Listening for local connections addr=/var/run/docker/swarm/control.sock module=node node.id=linr331fv2jrnsau59l2cg3kx proto=unix
    INFO[2023-10-08T17:07:16.993130617+08:00] manager selected by agent for new session: {linr331fv2jrnsau59l2cg3kx 192.168.3.88:2377} module=node/agent node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:16.994521175+08:00] waiting 0s before registering session module=node/agent node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.051817825+08:00] 7f966dd0d341aa0a switched to configuration voters=() module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.051983703+08:00] 7f966dd0d341aa0a became follower at term 4 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.052022458+08:00] newRaft 7f966dd0d341aa0a [peers: [], term: 4, commit: 214, applied: 0, lastindex: 214, lastterm: 4] module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.052699247+08:00] 7f966dd0d341aa0a switched to configuration voters=(9193656432988367370) module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.055780775+08:00] 7f966dd0d341aa0a is starting a new election at term 4 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.055824165+08:00] 7f966dd0d341aa0a became candidate at term 5 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.055863838+08:00] 7f966dd0d341aa0a received MsgVoteResp from 7f966dd0d341aa0a at term 5 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.055886455+08:00] 7f966dd0d341aa0a became leader at term 5 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.055954780+08:00] raft.node: 7f966dd0d341aa0a elected leader 7f966dd0d341aa0a at term 5 module=raft node.id=linr331fv2jrnsau59l2cg3kx
    ERRO[2023-10-08T17:07:17.177163822+08:00] error creating cluster object error="name conflicts with an existing object" module=node node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.177386542+08:00] leadership changed from not yet part of a raft cluster to linr331fv2jrnsau59l2cg3kx module=node node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.177521636+08:00] dispatcher starting module=dispatcher node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.177729202+08:00] node 73io7znw57sny5kr6lhv9e5w5 was found to be down when marking unknown on dispatcher start method="(*Dispatcher).markNodesUnknown" module=dispatcher node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:17.177808725+08:00] node dhxlr7tbykrkeycw9locitbhk was found to be down when marking unknown on dispatcher start method="(*Dispatcher).markNodesUnknown" module=dispatcher node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:18.094304257+08:00] worker linr331fv2jrnsau59l2cg3kx was successfully registered method="(*Dispatcher).register"
    INFO[2023-10-08T17:07:18.096552162+08:00] Initializing Libnetwork Agent Listen-Addr=0.0.0.0 Local-addr=192.168.3.88 Adv-addr=192.168.3.88 Data-addr= Remote-addr-list=[] MTU=1500
    INFO[2023-10-08T17:07:18.096603695+08:00] initialized VXLAN UDP port to 4789
    INFO[2023-10-08T17:07:18.096662587+08:00] Daemon has completed initialization
    INFO[2023-10-08T17:07:18.096689108+08:00] New memberlist node - Node:Master will use memberlist nodeID:f5cb3e23ea13 with config:&{NodeID:f5cb3e23ea13 Hostname:Master BindAddr:0.0.0.0 AdvertiseAddr:192.168.3.88 BindPort:0 Keys:[[122 241 22 55 115 208 114 76 250 171 155 62 19 180 132 67] [159 77 210 48 165 159 82 116 112 3 90 169 21 0 214 182] [108 48 231 153 35 79 92 81 41 149 209 73 253 97 148 18]] PacketBufferSize:1400 reapEntryInterval:1800000000000 reapNetworkInterval:1825000000000 rejoinClusterDuration:10000000000 rejoinClusterInterval:60000000000 StatsPrintPeriod:5m0s HealthPrintPeriod:1m0s}
    INFO[2023-10-08T17:07:18.097735261+08:00] Node f5cb3e23ea13/192.168.3.88, joined gossip cluster
    INFO[2023-10-08T17:07:18.097934127+08:00] Node f5cb3e23ea13/192.168.3.88, added to nodes list
    ERRO[2023-10-08T17:07:18.145699182+08:00] error reading the kernel parameter net.ipv4.vs.expire_nodest_conn error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
    ERRO[2023-10-08T17:07:18.146547609+08:00] error reading the kernel parameter net.ipv4.vs.expire_quiescent_template error="open /proc/sys/net/ipv4/vs/expire_quiescent_template: no such file or directory"
    ERRO[2023-10-08T17:07:18.146561508+08:00] error reading the kernel parameter net.ipv4.vs.conn_reuse_mode error="open /proc/sys/net/ipv4/vs/conn_reuse_mode: no such file or directory"
    ERRO[2023-10-08T17:07:18.146571878+08:00] error reading the kernel parameter net.ipv4.vs.conn_reuse_mode error="open /proc/sys/net/ipv4/vs/conn_reuse_mode: no such file or directory"
    ERRO[2023-10-08T17:07:18.146579409+08:00] error reading the kernel parameter net.ipv4.vs.expire_nodest_conn error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
    ERRO[2023-10-08T17:07:18.146610468+08:00] error reading the kernel parameter net.ipv4.vs.expire_quiescent_template error="open /proc/sys/net/ipv4/vs/expire_quiescent_template: no such file or directory"
    ERRO[2023-10-08T17:07:18.146632098+08:00] error reading the kernel parameter net.ipv4.vs.expire_quiescent_template error="open /proc/sys/net/ipv4/vs/expire_quiescent_template: no such file or directory"
    ERRO[2023-10-08T17:07:18.146657688+08:00] error reading the kernel parameter net.ipv4.vs.conn_reuse_mode error="open /proc/sys/net/ipv4/vs/conn_reuse_mode: no such file or directory"
    ERRO[2023-10-08T17:07:18.146669182+08:00] error reading the kernel parameter net.ipv4.vs.expire_nodest_conn error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
    ERRO[2023-10-08T17:07:18.146714493+08:00] error reading the kernel parameter net.ipv4.vs.conn_reuse_mode error="open /proc/sys/net/ipv4/vs/conn_reuse_mode: no such file or directory"
    ERRO[2023-10-08T17:07:18.146724582+08:00] error reading the kernel parameter net.ipv4.vs.expire_nodest_conn error="open /proc/sys/net/ipv4/vs/expire_nodest_conn: no such file or directory"
    ERRO[2023-10-08T17:07:18.146733003+08:00] error reading the kernel parameter net.ipv4.vs.expire_quiescent_template error="open /proc/sys/net/ipv4/vs/expire_quiescent_template: no such file or directory"
    INFO[2023-10-08T17:07:18.147282419+08:00] API listen on /var/run/docker.sock
    ^CINFO[2023-10-08T17:07:24.075991560+08:00] Processing signal 'interrupt'
    INFO[2023-10-08T17:07:24.078430331+08:00] Stopping manager module=node node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:24.078488513+08:00] dispatcher stopping method="(*Dispatcher).Stop" module=dispatcher node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:24.078535702+08:00] dispatcher session dropped, marking node linr331fv2jrnsau59l2cg3kx down method="(*Dispatcher).Session" node.id=linr331fv2jrnsau59l2cg3kx node.session=5i8m4gt8594epw6tgwhwhpniy
    ERRO[2023-10-08T17:07:24.078555570+08:00] failed to remove node error="rpc error: code = Aborted desc = dispatcher is stopped" method="(*Dispatcher).Session" node.id=linr331fv2jrnsau59l2cg3kx node.session=5i8m4gt8594epw6tgwhwhpniy
    INFO[2023-10-08T17:07:24.078694963+08:00] shutting down certificate renewal routine module=node/tls node.id=linr331fv2jrnsau59l2cg3kx node.role=swarm-manager
    INFO[2023-10-08T17:07:24.079227281+08:00] Manager shut down module=node node.id=linr331fv2jrnsau59l2cg3kx
    INFO[2023-10-08T17:07:24.079994515+08:00] Node f5cb3e23ea13/192.168.3.88, left gossip cluster
    INFO[2023-10-08T17:07:24.080019026+08:00] Node f5cb3e23ea13 change state NodeActive --> NodeFailed
    INFO[2023-10-08T17:07:24.080038614+08:00] Node f5cb3e23ea13/192.168.3.88, added to failed nodes list
    WARN[2023-10-08T17:07:24.208951254+08:00] Error (Unable to complete atomic operation, key modified) deleting object [endpoint nvgd0wi681flszc093m1ikjsh 9d2593d7d3092f2da64a1defe947cf93f3e9b399a6c820e8771474e8b265b307], retrying....
    INFO[2023-10-08T17:07:24.218223852+08:00] stopping event stream following graceful shutdown error="" module=libcontainerd namespace=moby
    INFO[2023-10-08T17:07:24.218347489+08:00] Daemon shutdown complete
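The foreground run above reaches "Daemon has completed initialization" and is then stopped with Ctrl+C, so the service itself still has to be started through systemd. The steps of this section can be sketched as two small shell helpers: one that makes the backup before editing, and one that clears the "Start request repeated too quickly" state and starts the unit. Both function names and the SYSTEMCTL override are hypothetical, introduced only for this sketch. systemctl reset-failed, start and status are standard systemctl subcommands.

```shell
# Copy a daemon.json to a timestamped backup next to it and print the
# backup's path. backup_config is a hypothetical helper name.
backup_config() {
    local config="${1:-/etc/docker/daemon.json}"
    local backup="${config}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$config" "$backup" && printf '%s\n' "$backup"
}

# Reset systemd's failed/start-limit state for the unit, start it again,
# and show the resulting status. SYSTEMCTL is a hypothetical override
# (for example SYSTEMCTL=echo) that allows a dry run.
restart_docker() {
    local ctl="${SYSTEMCTL:-systemctl}"
    "$ctl" reset-failed docker.service &&
    "$ctl" start docker.service &&
    "$ctl" --no-pager status docker.service
}

# Possible sequence on the affected host, run as root:
#   backup_config                 # before editing
#   vim /etc/docker/daemon.json   # remove the rejected option by hand
#   restart_docker
```

The edit itself is left manual on purpose: removing a line from JSON with text tools can easily leave a trailing comma behind, which would make the file invalid and stop the daemon from starting again.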