[Paper Review] Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning

2023. 8. 7. 15:18 · 🐬 ML & Data/📘 Paper & Model Review

* This post is a light personal summary of what I read, so it may be rough and inaccurate. Thanks for your understanding :D

 

Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning

Cooling system plays a critical role in a modern data center (DC). Developing an optimal control policy for DC cooling system is a challenging task. The prevailing approaches often rely on approximating system models that are built upon the knowledge of me

arxiv.org

  • Simulation with EnergyPlus
  • A cold aisle is placed between the servers and used to control cooling for all of them

 

Simulation System model

Data center model

  • Data center zones of different sizes and locations, each with an independent cooling system (DX, direct expansion / chiller)
  • Heat is generated by sources such as IT equipment (ITE) and illumination
  • The ITE load is the product of a fixed load per square meter, L (lighting, etc.), and a time-varying load level, a
  • Load density: 4 kW for zone 1, 2 kW for zone 2
  • The workload and temperature are combined into a single tuple and used as the state
  • PUE and the ITE outlet temperature are provided as the reward
    • PUE is minimized; the ITE temperature is kept within a set limit
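A tiny sketch of the quantities in the list above, just to make them concrete. The names `State`, `ite_load_kw`, and `pue` are my own, the ambient temperature and the 0.5 load level are made-up numbers, and PUE is computed with its standard definition (total facility power divided by IT power) rather than anything specific to the paper.

```python
from typing import NamedTuple


class State(NamedTuple):
    """Hypothetical state tuple: temperature and ITE load."""
    ambient_temp_c: float
    ite_load_kw: float


def ite_load_kw(load_density_kw: float, load_level: float) -> float:
    """ITE load as the product of a fixed density L and a time-varying level a."""
    return load_density_kw * load_level


def pue(total_facility_power_kw: float, ite_power_kw: float) -> float:
    """Power usage effectiveness: total facility power over IT power."""
    if ite_power_kw <= 0:
        raise ValueError("ITE power must be positive")
    return total_facility_power_kw / ite_power_kw


# Zone 1 density from the notes (4 kW); the 0.5 load level is made up.
zone1_load = ite_load_kw(4.0, 0.5)
state = State(ambient_temp_c=25.0, ite_load_kw=zone1_load)
print(state)
print(pue(total_facility_power_kw=3.0, ite_power_kw=zone1_load))
```

A PUE close to 1 means almost all power goes to the IT equipment, which is why the controller tries to push it down.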

Cooling system model

  • Action space
    • Both are water-cooled, but they use the water in different ways

 

Problem statement

  • A time-varying tuple of the temperature $T_{amb}$ and the load $H_{ite}$ is given
  • The goal is to control five cooling-water inputs (the Txx values in the figure above - DEC outlet temp, IEC outlet temp, chilled water loop outlet temp, DX cooling coil outlet temp, chiller cooling coil outlet temp)
  • Minimize PUE and penalize server overheating
    • Two objective functions
      • penalty function (to be minimized)
        • λ - penalty coefficient
        • Tzi - average ITE temperature of zone i
        • φ - overheating threshold
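The notes list the symbols of the penalty but not the formula itself, so the following is only a guess at a workable form: a softplus penalty, ln(1 + exp(Tzi - φ)), summed over zones and scaled by λ, then added to PUE to get one scalar cost. The softplus shape, the way the two objectives are combined, and the values λ = 0.1 and φ = 29 are all my assumptions, not taken from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def overheating_penalty(zone_temps, lam: float, phi: float) -> float:
    """Assumed penalty: lam * sum_i softplus(T_zi - phi)."""
    return lam * sum(softplus(t - phi) for t in zone_temps)


def cost(pue: float, zone_temps, lam: float, phi: float) -> float:
    """Scalar cost to minimize: PUE plus the overheating penalty."""
    return pue + overheating_penalty(zone_temps, lam, phi)


# Same PUE, but one zone runs above the (made-up) 29 C threshold.
cool = cost(1.3, [24.0, 25.0], lam=0.1, phi=29.0)
hot = cost(1.3, [33.0, 25.0], lam=0.1, phi=29.0)
print(cool < hot)
```

A smooth penalty like this stays near zero below the threshold and grows roughly linearly above it, so it remains differentiable for gradient-based training.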

 

Neural end to end cooling control algorithm(CCA)

Batch Learning(Offline learning) / On Policy

  • Adding real-time data to training means taking on risk, so offline learning (batch learning) is used here
  • Batch learning itself comes in two kinds: on-policy and off-policy
    • Off-policy is used because simulation time is expensive
  • Off-policy algorithms generally employ a separate behavior policy, which is independent of the policy being estimated, to generate the training trace; while on-policy directly uses control policy being estimated (in the real control practice or more likely in a simulator) to generate training data traces

 

CCA with offline trace

  • In a typical reinforcement learning approach, future reward data is also used for evaluation
  • Here, future reward data is not used; the workload and weather data define the system transition
  • Because it takes time for a result to take effect, time is compressed(?) so that the result observed at the current step is reflected at the next step
  • All data are time series covering N time steps

  • Q-Network
    • Outputs the cost of taking action a in the current state s
    • Attempts recursive decision making → also considers previous states and actions(?)
    • MSE
  • Policy Network
    • Outputs Q for taking action a in the current state s
    • A small validation error early on is not a bug; it happens because training has not progressed far yet
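To make the offline-trace idea a bit more concrete, here is a rough sketch of how training samples for the Q-network could be cut out of a logged time series. It reflects my reading of the one-step delay in the notes above (the action at step t is paired with the cost seen at step t+1); the function name `build_offline_samples` and all the numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], float]


def build_offline_samples(
    states: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
    costs: Sequence[float],
) -> List[Sample]:
    """Pair (state_t, action_t) with the cost observed one step later."""
    if not (len(states) == len(actions) == len(costs)):
        raise ValueError("all series must have the same length N")
    samples = []
    for t in range(len(states) - 1):
        q_input = tuple(states[t]) + tuple(actions[t])
        samples.append((q_input, costs[t + 1]))
    return samples


states = [(25.0, 2.0), (26.0, 2.5), (27.0, 3.0)]   # (T_amb, H_ite), made up
actions = [(18.0,), (17.5,), (17.0,)]               # one setpoint, made up
costs = [1.40, 1.35, 1.30]                          # made-up cost values
samples = build_offline_samples(states, actions, costs)
print(samples[0])
```

A trace of length N yields N - 1 samples this way, since the last step has no following cost to pair with.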

Neural Network Design

  • Q-Network
    • Two hidden layers with tanh activation
    • Linear output layer
    • Outputs a negative reward
    • Trained to shrink the gap between the actual y data and the predicted yr data
  • Policy Network
    • Two hidden layers, using a linear activation function and a tanh activation function
    • Outputs the next control action a
    • Optimizes the Q-Network's loss function
  • All data are normalized to the range (-1, 1) to suit the tanh activation function, and denormalized whenever actual energy and temperature values have to be computed.
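The design above can be sketched in plain Python: min-max scaling into the tanh range, a Q-network forward pass with two tanh hidden layers and a linear output, and denormalization of the result. The layer width (8), the value ranges, and the helper names are my own choices for illustration; the notes do not give the actual sizes.

```python
import math
import random


def normalize(x: float, lo: float, hi: float) -> float:
    """Map x from [lo, hi] to [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def denormalize(z: float, lo: float, hi: float) -> float:
    """Inverse of normalize: map z from [-1, 1] back to [lo, hi]."""
    return (z + 1.0) / 2.0 * (hi - lo) + lo


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights holds one row per output unit."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def q_forward(x, layers):
    """Two tanh hidden layers followed by a linear output layer."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = dense(x, w1, b1, math.tanh)
    h2 = dense(h1, w2, b2, math.tanh)
    return dense(h2, w3, b3)[0]


rng = random.Random(0)
layers = [init_layer(3, 8, rng), init_layer(8, 8, rng), init_layer(8, 1, rng)]
# Made-up ranges: ambient temp 0-40 C, load 0-4 kW, setpoint 10-25 C.
x = [normalize(25.0, 0.0, 40.0), normalize(2.0, 0.0, 4.0), normalize(18.0, 10.0, 25.0)]
q_norm = q_forward(x, layers)
print(q_norm, denormalize(q_norm, 1.0, 3.0))
```

The network here is untrained, so the printed value is meaningless; the point is only the data flow from raw values to normalized inputs and back.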

 

  1. Data
    • Requires state data series, action, and reward data
    • Q-NN input
    • policy network input
    • PUE and temperature data for computing the loss data y
  2. initialize
    • Create the Q network and the policy network
    • Randomly initialize the weight parameters
  3. For each epoch / mini batch
    • Optimize the Q NN parameters
    • Optimize the policy network parameters
    • swap / evaluation
  4. return
    • The Q network and policy network set to the optimal weight parameters
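The four steps above can be tried out on a toy problem. This is a heavily simplified sketch, not the paper's implementation: the Q-network is replaced by a linear model over hand-picked quadratic features, the policy network by a single tanh unit, mini-batches have size 1, the "swap / evaluation" step is reduced to keeping the snapshot with the best validation error, and the data come from a made-up cost function in which the best action is 0.5 times the state.

```python
import math
import random


def features(s, a):
    """Hand-picked quadratic features standing in for a neural Q-network."""
    return [1.0, s, a, s * a, a * a, s * s]


def q_value(w, s, a):
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))


def policy(theta, s):
    """One tanh unit standing in for the policy network."""
    return math.tanh(theta[0] + theta[1] * s)


def mse(w, data):
    return sum((q_value(w, s, a) - y) ** 2 for s, a, y in data) / len(data)


def train(data, val, epochs=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    data = list(data)  # copy so shuffling does not touch the caller's list
    # Step 2: random initialization of both parameter sets.
    w = [rng.uniform(-0.1, 0.1) for _ in range(6)]
    theta = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    best = (mse(w, val), list(w), list(theta))
    for _ in range(epochs):  # Step 3: loop over epochs
        rng.shuffle(data)
        for s, a, y in data:  # mini-batches of size 1
            # 3a: Q update, gradient step on the squared error.
            err = q_value(w, s, a) - y
            w = [wi - lr * err * fi for wi, fi in zip(w, features(s, a))]
            # 3b: policy update, descend Q(s, policy(s)) via the chain rule.
            a_pi = policy(theta, s)
            dq_da = w[2] + w[3] * s + 2.0 * w[4] * a_pi
            da_du = 1.0 - a_pi ** 2
            theta = [theta[0] - lr * dq_da * da_du,
                     theta[1] - lr * dq_da * da_du * s]
        # 3c: evaluation on held-out data; keep the best snapshot.
        val_mse = mse(w, val)
        if val_mse < best[0]:
            best = (val_mse, list(w), list(theta))
    return best  # Step 4: best validation error and both parameter sets


def make_data(n, rng):
    """Synthetic trace: cost is lowest when a = 0.5 * s (made-up dynamics)."""
    rows = []
    for _ in range(n):
        s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append((s, a, (a - 0.5 * s) ** 2 + 0.1))
    return rows


rng = random.Random(1)
train_rows, val_rows = make_data(200, rng), make_data(50, rng)
val_mse, w, theta = train(train_rows, val_rows)
print(val_mse, policy(theta, 0.8))
```

Because the policy is updated through the learned Q model rather than through the simulator, it can only be as good as the Q fit, which is one reason to watch the validation error of the Q model during training.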