Kubernetes Expert Blog


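Tuesday, September 17, 2019

[x86/HW] Why doesn't the system restart even when a multibit error occurs in memory?

Posted: September 17, 2019 - 1 comment

Hello~

Today I'm back with a slightly odd topic~
These days, when people say "server", they mostly mean x86 boxes rather than Unix servers, right?
(Surely it's not just me;;;)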


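Haha, anyway.

When you run x86, the failures that hit you most often are disk faults and memory ECC errors.
Put simply, ECC uses extra bits carried alongside the data to correct a bit that has gone bad. Ugh... how do I explain this simply... T_T

The easiest way to picture it these days is to think of it like a RAID5 disk array~~!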


Anyway... if two bits go bad, the error can't be corrected, so the system restarts to protect its current state. At least, that's how it used to work.
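To make the "one bad bit can be fixed, two cannot" point concrete, here is a minimal Python sketch of a SECDED code. It uses a toy extended Hamming(8,4) layout picked purely for illustration; real ECC DIMMs protect 64 data bits with 8 check bits and the exact code is vendor specific, so the function names and bit layout here are assumptions.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword.

    Index 0 holds the overall parity bit; indexes 1..7 are the classic
    Hamming positions (1, 2 and 4 are parity, 3, 5, 6 and 7 are data).
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    parity_broken = sum(code) % 2 == 1

    fixed = list(code)
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        # An odd number of flips: treat it as a single-bit error and repair it.
        # (syndrome 0 here means the overall parity bit itself flipped.)
        fixed[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with intact overall parity: two bits flipped.
        return "uncorrectable", None

    nibble = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return status, nibble


word = encode(0b1011)

one_flip = list(word)
one_flip[5] ^= 1           # single-bit error: silently corrected

two_flips = list(word)
two_flips[2] ^= 1          # double-bit error: detected, not correctable
two_flips[6] ^= 1

print(decode(word))
print(decode(one_flip))
print(decode(two_flips))
```

The double-bit case is exactly the "multibit error" this post is about: the code can tell that something is wrong, but not which bits to flip back.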

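But recently, techniques that handle this differently have appeared, so let me introduce them.
Strictly speaking, they don't correct the error; they are tuned so the system can get past it with as little impact as possible!

Here is how it goes.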

Technically, in many cases the CPU has already fetched and used the affected data, so the error is classified as UC (Uncorrected Error) and the system restarts. If the system has not yet fetched the data, however, the error is classified as UCR (Uncorrected Recoverable Error), a condition under which the system may not have to restart. Among these, errors detected by the DCU (Data Cache Unit) or the IFU (Instruction Fetch Unit) are classified as SRAR (Software Recoverable Action Required). There are two outcomes here: in kernel space a kernel panic occurs, and in user space the affected process is killed.

In the present case, therefore, the error most likely falls under SRAO (Software Recoverable Action Optional), which is detected by the memory patrol scrub feature or during an LLC (Last Level Cache) writeback transaction.
In this case, whether the page belongs to kernel space or user space, the page itself is ignored or isolated.
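The decision flow described above can be sketched as a small Python function. This is only a restatement of the rules in this post, not how firmware or the kernel actually implements them; the argument names and string values are made up for the illustration.

```python
def classify_memory_error(consumed_by_cpu, detected_by, in_kernel_space):
    """Return (error_class, action) following the rules described in this post.

    consumed_by_cpu: True if the CPU already fetched and used the bad data.
    detected_by:     "DCU", "IFU", "patrol_scrub" or "llc_writeback"
                     (hypothetical labels chosen for this sketch).
    in_kernel_space: True if the affected page belongs to the kernel.
    """
    if consumed_by_cpu:
        # Already used by the CPU: plain uncorrected error, system restarts.
        return "UC", "system restart"

    if detected_by in ("DCU", "IFU"):
        # Recoverable, but software is required to act on it.
        if in_kernel_space:
            return "UCR/SRAR", "kernel panic"
        return "UCR/SRAR", "kill process"

    if detected_by in ("patrol_scrub", "llc_writeback"):
        # Recoverable and action is optional: the page is simply taken away.
        return "UCR/SRAO", "isolate page"

    raise ValueError(f"unknown detector: {detected_by!r}")


print(classify_memory_error(True, "DCU", False))
print(classify_memory_error(False, "IFU", False))
print(classify_memory_error(False, "patrol_scrub", True))
```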

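Pretty hard, right -_-?
That's because it is translated almost straight from the English; even after understanding it, this is about as clear as I can make it...
Let me explain it again more simply. Memory is the fastest place the CPU can reach for the data it wants, after its own registers and SRAM (the caches).
So if the data is something the CPU has already fetched and is referencing, the system reboots just as before.
That's the behavior we already know! No problems so far, right?

So then...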


When a multibit error sits in data that the CPU has not yet referenced and is found by 1) the DCU (Data Cache Unit) or 2) the IFU (Instruction Fetch Unit), the system behaves in one of two ways.

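Kernel space = kernel panic, the reboooot we all know

User space = only the process gets killed (though if it's the DB tier -_- OMG)... well, it would roll back to protect itself, or it wouldn't have committed yet anyway..

And when the error is found by the memory patrol scrub feature or during an LLC (Last Level Cache) writeback transaction:

The page itself gets isolated. You know, like fencing off a bad sector on a disk? These days SSDs also retire cells once their bits can no longer flip~~ since they have a finite lifespan.
This has no impact on the system~ And since it's just page isolation, it gets reset anyway the next time the system restarts.

Still, it's a good idea to replace memory that threw a multibit error the next time the system restarts, right?
This time you just got lucky~!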

Summarized as a table, it looks like this.

Isn't that interesting~ :)

If I run into another curious change like this, I'll write about it again~~!

==========================

1) DCU

Data Cache Unit

The Data Cache Unit (DCU) consists of the following sub-blocks:

The Level 1 (L1) data cache controller, that generates the control signals for the associated embedded tag, data, and dirty RAMs, and arbitrates between the different sources requesting access to the memory resources. The data cache is 4-way set associative and uses a Physically Indexed Physically Tagged (PIPT) scheme for lookup that enables unambiguous address management in the system.
The load/store pipeline that interfaces with the DPU and main TLB.
The system controller that performs cache and TLB maintenance operations directly on the data cache and on the instruction cache through an interface with the IFU.
An interface to receive coherency requests from the Snoop Control Unit (SCU).

The data cache has the following features:

Pseudo-random cache replacement policy.
Streaming of sequential data because of multiple word load instructions, for example LDM, LDRD, LDP and VLDM.
Critical word first linefill on a cache miss.

See Chapter 6 Level 1 Memory System for more information.

If the CPU cache protection configuration is implemented, the L1 Data cache tag RAMs and dirty RAMs are protected by parity bits. The L1 Data cache data RAMs are protected using Error Correction Codes (ECC). The ECC scheme is Single Error Correct Double Error Detect (SECDED).

The DCU includes a combined local and global exclusive monitor, used by the Load-Exclusive/Store-Exclusive instructions. See the ARM® Architecture Reference Manual ARMv8, for ARMv8-A architecture profile for information about these instructions.

2) IFU

A2.1.1 Instruction fetch

The Instruction Fetch Unit (IFU) fetches instructions from the L1 instruction cache and delivers up to three instructions per cycle to the instruction decode unit.

The IFU includes:

A 64KB, 4-way, set associative L1 instruction cache with 64-byte cache lines and optional dual-bit parity protection.
A fully associative instruction micro TLB with native support for 4KB, 64KB, and 1MB page sizes.
A 2-level dynamic branch predictor.

3) LLC

In the memory hierarchy, caches sit right below the registers at the top. Caches themselves can be broken down further. When you buy a computer, you will see specs stating how much L1 and L2 cache it has. Modern processors generally have L1 and L2 caches, and some CPUs also have an L3 cache. L1 is the fastest, and each following level is slower in turn. The cache at the final level is called the Last Level Cache (LLC). Accesses beyond the LLC take much longer, so what lies beyond it is drawn separately from the caches in the hierarchy diagram.

And Writeback

In a conventional writeback policy, dirty cache blocks are sent to the write buffer when they are evicted from the last-level cache (LLC). The write buffer is drained following the buffer management policy. Several proposals [16, 18, 13] improve writeback efficiency using an intelligent scheduling algorithm. However, the write buffer only has a small number of entries due to design complexity and power efficiency, limiting the ability to schedule high locality write requests as well as the possibility to flexibly adjust read/write priority.
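As a rough picture of the conventional writeback policy quoted above, here is a toy Python model in which dirty blocks go to a write buffer when they are evicted. The class name, the LRU replacement policy and the tiny capacity are assumptions for this sketch; a real LLC is set associative and its replacement policy differs by CPU.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy fully associative cache with LRU eviction (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> dirty flag, least recent first
        self.write_buffer = []      # dirty blocks waiting to be written to memory

    def access(self, address, write=False):
        # A block stays dirty once written, until it is evicted.
        dirty = self.lines.pop(address, False) or write
        self.lines[address] = dirty  # re-insert as the most recently used block

        if len(self.lines) > self.capacity:
            victim, victim_dirty = self.lines.popitem(last=False)
            if victim_dirty:
                # This eviction is the "writeback transaction" moment.
                self.write_buffer.append(victim)


cache = WriteBackCache(capacity=2)
cache.access(0x10, write=True)  # dirty block
cache.access(0x20)              # clean block
cache.access(0x30)              # evicts 0x10, which is dirty
cache.access(0x40)              # evicts 0x20, which is clean
print([hex(a) for a in cache.write_buffer])
```

The eviction of a dirty block is the point where, according to this post, an SRAO-class error can be noticed, because the data is read out of the cache on its way back to memory.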

Reference:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0500d/CHDDJAFJ.html

https://cesl.tistory.com/entry/Cache-정리 [Embedded Lab]

http://hpca23.cse.tamu.edu/pdfs/p21-wang.pdf

https://lenovopress.com/lp0778.pdf


Saturday, September 14, 2019

[k8s/Kubernetes] A quick recovery for a hung container (ContainerCreating)

Posted: September 14, 2019 - 0 comments


Looks like today is a writing day~!

[root@m-k8s ~]# kubectl get pods --output=wide | grep -v Running
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-5754944d6c-pwkw5 0/1 ContainerCreating 0 109m <none> w1-k8s <none> <none>

All sorts of things happen when you deploy Pods; this time, one got stuck while its container was being created (ContainerCreating).
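If you would rather have the stuck pods as data than as grep output, the table above can be parsed with a few lines of Python. This is a minimal sketch that assumes the whitespace-separated column layout shown above (NAME, READY, STATUS, RESTARTS, AGE, IP, NODE, ...); the function name is made up, and `kubectl get pods -o json` would be a sturdier input for real tooling.

```python
def stuck_pods(kubectl_output):
    """Return name, status and node for every pod that is not Running.

    Expects the text printed by `kubectl get pods --output=wide`,
    including its header line.
    """
    pods = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header
        fields = line.split()
        if len(fields) >= 7 and fields[2] != "Running":
            pods.append(
                {"name": fields[0], "status": fields[2], "node": fields[6]}
            )
    return pods


sample = """\
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-5754944d6c-abcde 1/1 Running 0 109m 192.168.221.130 w1-k8s <none> <none>
nginx-deployment-5754944d6c-pwkw5 0/1 ContainerCreating 0 109m <none> w1-k8s <none> <none>
"""

for pod in stuck_pods(sample):
    print(pod["name"], pod["status"], pod["node"])
```

The first pod row in `sample` is invented so that there is something to filter out; the second is the stuck pod from the output above.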

The first debugging step is... to look at the log.

[root@m-k8s ~]# kubectl describe pods nginx-deployment-5754944d6c-pwkw5
Name:           nginx-deployment-5754944d6c-pwkw5
Namespace:      default
Priority:       0
Node:           w1-k8s/192.168.1.101
Start Time:     Fri, 13 Sep 2019 23:32:01 +0000
Labels:         app=nginx
                pod-template-hash=5754944d6c
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  ReplicaSet/nginx-deployment-5754944d6c
Containers:
  nginx:
    Container ID:
    Image:          nginx:1.7.9
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jdgfv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-jdgfv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-jdgfv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                      From             Message
  ----     ------                  ----                     ----             -------
  Warning  FailedCreatePodSandBox  5m56s (x1135 over 112m)  kubelet, w1-k8s  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e793e21758c4f703653cc24aeeb537416ef286343573b21eccaebc916cf43718" network for pod "nginx-deployment-5754944d6c-pwkw5": NetworkPlugin cni failed to set up pod "nginx-deployment-5754944d6c-pwkw5_default" network: error adding host side routes for interface: cali1923ceab5c4, error: route (Ifindex: 1220, Dst: 192.168.221.131/32, Scope: 253) already exists for an interface other than 'cali1923ceab5c4'
  Normal   SandboxChanged          61s (x1262 over 112m)    kubelet, w1-k8s  Pod sandbox changed, it will be killed and re-created.

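It looks like something went wrong on the CNI side: the network plugin could not set up the pod network because a route for 192.168.221.131/32 already exists on a different interface.
The events themselves say the sandbox will be killed and re-created... -_-
But it clearly has no will to actually do it, none at all...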
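The useful part of that very long event line is the route that is already taken. As a small sketch, a regular expression can pull it out of the message; the pattern below is written against the exact wording of the event above and is an assumption, since other CNI plugin versions may phrase the error differently.

```python
import re

# Matches the "route (...) already exists" fragment of the event message above.
ROUTE_CONFLICT = re.compile(
    r"route \(Ifindex: (?P<ifindex>\d+), Dst: (?P<dst>[\d./]+), "
    r"Scope: (?P<scope>\d+)\) already exists for an interface other than "
    r"'(?P<iface>[^']+)'"
)


def find_route_conflict(message):
    """Return the conflicting route details, or None if the text does not match."""
    match = ROUTE_CONFLICT.search(message)
    if match is None:
        return None
    return {
        "ifindex": int(match["ifindex"]),
        "dst": match["dst"],
        "wanted_by": match["iface"],  # the interface that could not get the route
    }


event = (
    "NetworkPlugin cni failed to set up pod "
    '"nginx-deployment-5754944d6c-pwkw5_default" network: error adding host '
    "side routes for interface: cali1923ceab5c4, error: route (Ifindex: 1220, "
    "Dst: 192.168.221.131/32, Scope: 253) already exists for an interface "
    "other than 'cali1923ceab5c4'"
)

print(find_route_conflict(event))
```

With the destination in hand, running `ip route` on w1-k8s and looking for 192.168.221.131 should show which interface is still holding the route.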

Well then, this is where a human has to step in~!!!
If everything were automated... we'd be out of a job, wouldn't we!!!

So for now, I recovered it by scaling the deployment down and then back up.

[root@m-k8s ~]# kubectl scale deployment nginx-deployment --replicas=199
deployment.extensions/nginx-deployment scaled
[root@m-k8s ~]# kubectl get rs
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 199 199 199 15h
[root@m-k8s ~]# kubectl scale deployment nginx-deployment --replicas=200
deployment.extensions/nginx-deployment scaled
[root@m-k8s ~]# kubectl get rs
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 200 200 200 15h
[root@m-k8s ~]# kubectl get pods --output=wide | grep -v Running
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
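The scale-down-then-up trick above is easy to wrap in a helper. The sketch below only builds the two `kubectl scale` command lines used in this post and leaves running them to the caller; the function name is hypothetical.

```python
def bounce_commands(deployment, replicas):
    """Build the commands that scale a deployment down by one and back up."""
    if replicas < 1:
        raise ValueError("need at least one replica to scale down from")
    base = ["kubectl", "scale", "deployment", deployment]
    return [
        base + [f"--replicas={replicas - 1}"],
        base + [f"--replicas={replicas}"],
    ]


for command in bounce_commands("nginx-deployment", 200):
    # To actually run it: subprocess.run(command, check=True)
    print(" ".join(command))
```

Dropping a single replica was enough here, presumably because the ReplicaSet controller prefers to remove pods that are not yet running when it scales down; I have not verified that this holds in every case.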

Next time I should look into how to debug errors like this in more detail.
And monitoring, too.

Bye!


[k8s/Kubernetes] scale & replicas

Posted: September 14, 2019 - 0 comments


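Hello

The reason I'm writing two days in a row is that I was so impressed by scale~~
Last time I did the pod automation (well, that's what I called it, but come to think of it the only actual pods were... Calico's. --)
and after that I actually deployed some pods.

It turns out the deployed pods can be scaled quite freely.
(Though after this I'll probably need to take another look at monitoring.)

For the test deployment I used this: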

kubectl apply -f https://k8s.io/docs/tasks/run-application/deployment.yaml

Then I scaled it up and down with the following command.

kubectl scale deployment nginx-deployment --replicas=60

You can then check it like this:

[root@m-k8s ~]# kubectl get rs nginx-deployment-5754944d6c
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 60 60 60 13h

And if you expand it to 200 replicas,

[root@m-k8s ~]# kubectl scale deployment nginx-deployment --replicas=200
deployment.extensions/nginx-deployment scaled

it grows like this.

[root@m-k8s ~]# kubectl get rs nginx-deployment-5754944d6c
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 200 200 77 13h

[root@m-k8s ~]# kubectl get rs nginx-deployment-5754944d6c
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 200 200 137 13h

[root@m-k8s ~]# kubectl get rs nginx-deployment-5754944d6c
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 200 200 199 13h
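Instead of re-running the command by hand and eyeballing the READY column, the output can be turned into numbers. This is a minimal sketch that assumes the four-column layout shown above (NAME, DESIRED, CURRENT, READY); the function name is made up for the example.

```python
def replica_progress(kubectl_output):
    """Parse the first data row of `kubectl get rs <name>` output."""
    rows = kubectl_output.strip().splitlines()
    name, desired, current, ready = rows[1].split()[:4]  # rows[0] is the header
    desired, current, ready = int(desired), int(current), int(ready)
    return {
        "name": name,
        "desired": desired,
        "current": current,
        "ready": ready,
        "done": ready == desired,
    }


snapshot = """\
NAME DESIRED CURRENT READY AGE
nginx-deployment-5754944d6c 200 200 77 13h
"""

progress = replica_progress(snapshot)
print(f"{progress['ready']}/{progress['desired']} ready, done={progress['done']}")
```

Calling this in a loop with a short sleep, feeding it fresh `kubectl get rs` output each time until `done` is true, would give a simple wait-for-rollout check.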

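I was moved!

So did they deploy properly? Shall we pick one of them and check?

Oh... widely used tools really are popular for a reason :)

They're only microservice-sized nginx instances, but still... being able to deploy and tear down around 200 of them this easily!!

I should set up monitoring and take another look at the ones that didn't come up.

Bye!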


Friday, September 13, 2019

[k8s/Kubernetes] Pod automation (v0.00000001)

Posted: September 13, 2019 - 0 comments

Hellooo~

It's Chuseok. The day itself, in fact~~
Well... I have nothing to do.... -_-; which somehow makes me feel...


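...well, not quite that gloomy ;) hahaha. I've been studying k8s (Kubernetes) lately.
In the end, I figured I need to do this to make a living T_T (well, I picked up Python and Ansible to make a living too...)

When it comes to studying... personally, I find the best way is to build a lab, test features, run into errors, and dig through the internal code.

Honestly, with Ansible too... I must have run the lab a thousand times....
Anyway, as a first step I've written something that automatically builds a k8s lab, as a very, very, very rough draft.

The chance that this will change again soon is 10000%. :)
There's plenty to improve, and more will be added and changed as I keep studying.

I'm publishing it anyway because... it might be useful to someone right now.

The code is here.
https://github.com/sysnet4admin/PRJ_DevOps/tree/master/0.All_inOne_k8s

Install VirtualBox and Vagrant, run vagrant up in the directory containing the Vagrantfile, and you'll get output like the following,

and a lab like the one below is ready~!

If you log in to the master node and run kubectl get nodes, you'll see something like this. :)

This is just the beginning!!
I'll study hard and fill this blog with better content.

Sincerely, Jo Hoon (조훈).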
