Saturday, September 14, 2019

[k8s/Kubernetes] A simple recovery for a hung container (ContainerCreating)

Looks like today is a writing day~!

[root@m-k8s ~]# kubectl get pods --output=wide | grep -v Running
NAME                                READY   STATUS              RESTARTS   AGE    IP                NODE     NOMINATED NODE   READINESS GATES
nginx-deployment-5754944d6c-pwkw5   0/1     ContainerCreating   0          109m   <none>            w1-k8s   <none>           <none>


When you deploy Pods, all sorts of things can go wrong. This time one pod got stuck while its container was being created (ContainerCreating).
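With a couple of hundred replicas, a quick tally per status can be easier to read than a filtered list. A minimal sketch, assuming the default `kubectl get pods` column layout where STATUS is the third column; `count_by_status` is a hypothetical helper name, not something kubectl provides.

```shell
#!/usr/bin/env bash
# Tally pods by STATUS from `kubectl get pods` output read on stdin.
# Assumption: default column layout, STATUS is the third column.
count_by_status() {
  # Skip the header row, count each status, print "STATUS COUNT" sorted by status.
  awk 'NR > 1 { count[$3]++ } END { for (s in count) print s, count[s] }' | sort
}

# Against a live cluster (not run here):
#   kubectl get pods | count_by_status
```

Anything other than Running in the tally is worth a closer look with `kubectl describe`.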

The first step in debugging is... to look at the logs, which for a pod that never started means its events.
[root@m-k8s ~]# kubectl describe pods nginx-deployment-5754944d6c-pwkw5
Name:           nginx-deployment-5754944d6c-pwkw5
Namespace:      default
Priority:       0
Node:           w1-k8s/192.168.1.101
Start Time:     Fri, 13 Sep 2019 23:32:01 +0000
Labels:         app=nginx
                pod-template-hash=5754944d6c
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  ReplicaSet/nginx-deployment-5754944d6c
Containers:
  nginx:
    Container ID:
    Image:          nginx:1.7.9
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jdgfv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-jdgfv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jdgfv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                      From             Message
  ----     ------                  ----                     ----             -------
  Warning  FailedCreatePodSandBox  5m56s (x1135 over 112m)  kubelet, w1-k8s  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e793e21758c4f703653cc24aeeb537416ef286343573b21eccaebc916cf43718" network for pod "nginx-deployment-5754944d6c-pwkw5": NetworkPlugin cni failed to set up pod "nginx-deployment-5754944d6c-pwkw5_default" network: error adding host side routes for interface: cali1923ceab5c4, error: route (Ifindex: 1220, Dst: 192.168.221.131/32, Scope: 253) already exists for an interface other than 'cali1923ceab5c4'
  Normal   SandboxChanged          61s (x1262 over 112m)    kubelet, w1-k8s  Pod sandbox changed, it will be killed and re-created.

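It looks like something went wrong while the CNI plugin was setting up the pod's network: the host-side route for 192.168.221.131/32 already exists on an interface other than the new cali1923ceab5c4.
By its own account it is going to kill the sandbox and re-create it... -_-
But after more than a thousand attempts over 112 minutes, it clearly has no will to follow through...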
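To dig one step further, the conflicting destination can be pulled out of the event message and then looked up on the node. This is only a sketch: `conflicting_route` is a hypothetical helper, and it assumes the message keeps the `Dst: <address>/<prefix>` form seen in the event above.

```shell
#!/usr/bin/env bash
# Extract the conflicting route destination from a CNI error message on stdin.
# Assumption: the message contains "Dst: <address>/<prefix>" as in the event above.
conflicting_route() {
  sed -n 's|.*Dst: \([0-9.]*/[0-9]*\).*|\1|p'
}

# On the affected node (w1-k8s here), the current owner of that route
# could then be inspected (not run here):
#   ip route show 192.168.221.131/32
```

If that route points at a leftover cali interface from a pod that no longer exists, that would match the "already exists for an interface other than" part of the error.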

Well, in that case a human has to step in~!!!
If everything were automated... how would we make a living?!


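So, for now, I recovered by scaling the deployment down by one and then back up. On scale-down the ReplicaSet controller tends to remove pods that are not ready before healthy ones, so the stuck pod is the one that gets deleted, and scaling back up creates a fresh replacement.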

[root@m-k8s ~]# kubectl scale deployment nginx-deployment --replicas=199
deployment.extensions/nginx-deployment scaled
[root@m-k8s ~]# kubectl get rs
NAME                          DESIRED   CURRENT   READY   AGE
nginx-deployment-5754944d6c   199       199       199     15h
[root@m-k8s ~]# kubectl scale deployment nginx-deployment --replicas=200
deployment.extensions/nginx-deployment scaled
[root@m-k8s ~]# kubectl get rs
NAME                          DESIRED   CURRENT   READY   AGE
nginx-deployment-5754944d6c   200       200       200     15h
[root@m-k8s ~]# kubectl get pods --output=wide | grep -v Running
NAME                                READY   STATUS    RESTARTS   AGE    IP                NODE     NOMINATED NODE   READINESS GATES
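An alternative to the scale-down/scale-up trick is to delete the stuck pod directly and let the ReplicaSet create a replacement. A minimal sketch, assuming the default column layout of `kubectl get pods --no-headers` (NAME first, STATUS third); `creating_pods` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the names of pods whose STATUS column is ContainerCreating.
# Reads `kubectl get pods --no-headers` output on stdin.
creating_pods() {
  awk '$3 == "ContainerCreating" { print $1 }'
}

# Against a live cluster (not run here), delete each match so that the
# ReplicaSet replaces it. `xargs -r` (GNU xargs) skips the delete when
# nothing matches:
#   kubectl get pods --no-headers | creating_pods | xargs -r kubectl delete pod
```

Note that this matches every pod in ContainerCreating, including ones that are merely slow to start, so it is worth checking the AGE column before piping the names into a delete.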

Next time I should look into how to debug this kind of error in more detail,
and into monitoring as well.

Bye~
