ML & Data

    [Reinforcement Learning] Dueling Double Deep Q Learning (DDDQN / Dueling DQN / D3QN)

    Dueling Double DQN https://arxiv.org/pdf/1509.06461.pdf https://arxiv.org/pdf/1511.06581.pdf Double DQN: DQN has a problem of overestimating reward. The Q-values lead the agent to expect a higher return than it actually receives, because the Q-learning update equation contains the maximum Q-value for the next state. The max operation over Q-values maximizes the bias: performance degrades when the true maximum value in the environment is 0 but the agent's estimate of that maximum is positive. To solve this, two networks are used. Q Next: action selection → the best next action ..
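The overestimation described in this excerpt can be reproduced with a few lines of plain Python. The sketch below is only an illustration under stated assumptions: every action has a true value of 0, the estimates are that value plus Gaussian noise, and the function name `max_bias_demo` and its parameters are hypothetical. It compares taking the max of one noisy estimate against picking the action with one estimate and scoring it with a second, independent one, which is the idea behind using two networks.

```python
import random


def max_bias_demo(num_actions=10, trials=5000, noise=1.0, seed=0):
    """Average value reported by a single estimator vs. a double estimator.

    Assumption: the true Q-value of every action is 0, so any positive
    average is pure estimation bias.
    """
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        # Two independent noisy estimates of the same (all-zero) Q-values.
        q_a = [rng.gauss(0.0, noise) for _ in range(num_actions)]
        q_b = [rng.gauss(0.0, noise) for _ in range(num_actions)]
        # Single estimator: the same estimate selects and evaluates the action.
        single_total += max(q_a)
        # Double estimator: q_a selects the action, q_b evaluates it.
        best_action = max(range(num_actions), key=q_a.__getitem__)
        double_total += q_b[best_action]
    return single_total / trials, double_total / trials


single_avg, double_avg = max_bias_demo()
print(f"single estimator: {single_avg:.3f}, double estimator: {double_avg:.3f}")
```

The single estimator reports a clearly positive value even though nothing is worth more than 0, while the double estimator stays close to 0.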

    [Light Deep Learning] 0. Intro

    It has already been three and a half years since I started writing the Light Machine Learning series, which has been steadily loved on my blog for reasons I don't quite understand. When I first wrote it, I was a student who had just finished the first year of a computer engineering degree, a very, very junior developer who knew neither data structures nor algorithms. So even to my own eyes there are clearly parts where I think, 'Ah, this kid wrote this without understanding it….' Three and a half years have passed; I finished my bachelor's degree eight months ago, and I have a little over a year of experience as a machine learning engineer and jack-of-all-trades developer. Honestly, I still don't feel that I know much, but it is also true that I am better than I was back then, whatever I have become. By now I should count as at least a junior-junior developer, right? And although the kid back then didn't know it, since I am considering applying to graduate school for next year's fall term ..

    [Data] Frequency Analysis for Electric Motor Anomaly Detection and Classification

    1. Data acquisition: sampling rate 25.6 kHz; DC motor on a custom-built test rig; 102,400 points per data file. 2. FFT motor frequency analysis. 1. Normal: for a motor in normal condition, amplitude is inversely related to the vibration order (harmonic). The motor in the current test set runs at about 3600 rpm, so the vibration orders are 60 Hz (1st), 120 Hz (2nd), and 180 Hz (3rd). The FFT frequency analysis result above confirms that amplitude decreases in the order of the 1st, 2nd, and 3rd vibration orders. 2. Misalignment: the misalignment condition is divided into Parallel Misalignment (the motor shaft is parallel to the ground, but there is an offset at the bearing) and Angular Misalign..
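The numbers in this excerpt can be checked with a short calculation. The sketch below assumes that one whole file (102,400 points at 25.6 kHz) is passed to a single FFT, so the bin spacing is the sampling rate divided by the number of points; the function names are hypothetical helpers, not part of the original post.

```python
def harmonic_frequencies(rpm, orders=(1, 2, 3)):
    """Frequency in Hz of each vibration order for a shaft turning at `rpm`."""
    base_hz = rpm / 60.0  # revolutions per second
    return [base_hz * order for order in orders]


def fft_resolution(sampling_rate_hz, num_points):
    """Spacing in Hz between adjacent FFT bins for one full-length transform."""
    return sampling_rate_hz / num_points


SAMPLING_RATE_HZ = 25_600
POINTS_PER_FILE = 102_400

harmonics = harmonic_frequencies(3600)
resolution = fft_resolution(SAMPLING_RATE_HZ, POINTS_PER_FILE)
duration_s = POINTS_PER_FILE / SAMPLING_RATE_HZ
# Index of the FFT bin where each harmonic is expected to appear.
bins = [round(f / resolution) for f in harmonics]

print(harmonics)    # harmonic frequencies in Hz
print(resolution)   # Hz per bin
print(duration_s)   # seconds of signal per file
print(bins)         # bin index of each harmonic
```

Under that assumption each file covers 4 seconds of signal and the bins are 0.25 Hz apart, so the 60, 120, and 180 Hz orders land on exact bin indices.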

    [Reinforcement Learning] From Creating a Custom Reinforcement Learning Environment with gym to Training Dueling DDQN

    I searched the whole internet: there are sixty million examples that do reinforcement learning with the game agents gym provides, and exactly one that trains on a customized environment. I am writing this short guide in the hope that it helps people who are just starting to study. 1. A look at Gym's Env structure. It does not have to be done this way (you can also implement everything from scratch), but we will build on the environment structure of the gym library. !pip install gym The env structure of the gym library looks roughly like the following; you can read it directly in site-packages/gym/core.py. class Env(Generic[ObsType, ActType]):m.Generator] = None """ The ma..
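As a rough illustration of what a custom environment needs, here is a minimal sketch that mirrors the `reset`/`step` convention without importing gym, so it runs on its own. Everything in it is an assumption made for the example: the class name `LineWorldEnv`, the 1-D world, the reward values, and the older four-value `step` return. Depending on the installed gym version, `reset` may instead return `(observation, info)` and `step` may return five values; a real environment would subclass `gym.Env` and declare its observation and action spaces.

```python
class LineWorldEnv:
    """Hypothetical 1-D world: start at cell 0, reach the last cell to finish."""

    def __init__(self, size=5, max_steps=20):
        self.size = size
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Put the agent back at the start and return the first observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right) and return the transition."""
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the line.
        self.position = min(max(self.position + move, 0), self.size - 1)
        self.steps += 1
        reached_goal = self.position == self.size - 1
        reward = 1.0 if reached_goal else -0.1
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done, {}


env = LineWorldEnv(size=3)
observation = env.reset()
done = False
while not done:
    observation, reward, done, info = env.step(1)  # always move right
    print(observation, reward, done)
```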

    [Paper Review] Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning

    * I wrote this post to read the paper and summarize it lightly for myself, so it is rough and not precise. Thank you for your understanding :D Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning Cooling system plays a critical role in a modern data center (DC). Developing an optimal control policy for DC cooling system is a challenging task. The prevailing approaches often rely on approximating system models that are built upon the knowled..

    [Reinforcement Learning] DQN (Deep Q-Network)

    Related post: [Model Review] Markov Decision Process & Q-Learning (dnai-deny.tistory.com). Deep Reinforcement Learning: classic Q-learning stores the Q-value for each state and action in a table. As the state space and action space grow, the memory and exploration time needed to store the Q-values increase. ⇒ Generate the Q-table with deep learning: Q..
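To make the table-based approach concrete, the sketch below shows a standard tabular Q-learning update and how quickly the table grows. The function names, the dictionary-based table, and the default `alpha` and `gamma` values are assumptions chosen for illustration, not taken from the post.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state,
                      num_actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step; returns the updated Q(state, action)."""
    # Best value currently stored for the next state.
    best_next = max(q_table[(next_state, a)] for a in range(num_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


def table_size(num_states, num_actions):
    """Number of Q-values a full table has to store."""
    return num_states * num_actions


q_table = defaultdict(float)  # missing entries start at 0.0
print(q_learning_update(q_table, state=0, action=1, reward=1.0,
                        next_state=1, num_actions=2))
print(table_size(num_states=10**6, num_actions=18))
```

A table needs one entry per state-action pair, which is why the excerpt moves to a network that maps a state to its Q-values instead.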

    [Reinforcement Learning] Markov Decision Process & Q-Learning

    1. Markov Decision Process (MDP). Reinforcement Learning from the Ground Up - Markov Decision Process. Markov Process: defined by the states S and the transition probability matrix P. Transitions occur from one state to another, and each state transition has a probability. S4 is the terminal state. Markov property: $$ P[S_{t+1} \mid S_t] = P[S_{t+1} \mid S_1, S_2, \dots, S_t] $$ The path taken to reach a state does not affect the probability calculation. A state is called Markov when the state at a given moment is enough to determine the next state. Counterexample) a photo of someone driving (a photo from a single moment cannot tell you reverse/forward/speed, etc. → the next state cannot be determined). ex) a chess game (at any ..
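A Markov process of this kind can be simulated directly from its transition probabilities. In the sketch below the state names follow the excerpt (S4 is terminal), but the transition probabilities themselves are made up for illustration, and `sample_episode` is a hypothetical helper. The next state is drawn using only the current state, which is exactly the Markov property.

```python
import random

# Hypothetical transition probabilities; each non-terminal row sums to 1.
TRANSITIONS = {
    "S1": {"S1": 0.2, "S2": 0.8},
    "S2": {"S1": 0.1, "S3": 0.9},
    "S3": {"S2": 0.3, "S4": 0.7},
    "S4": {},  # terminal state: no outgoing transitions
}


def sample_episode(transitions, start="S1", terminal="S4",
                   seed=None, max_steps=1000):
    """Walk the chain from `start` until `terminal` (or `max_steps`)."""
    rng = random.Random(seed)
    state = start
    path = [state]
    while state != terminal and len(path) <= max_steps:
        next_states = list(transitions[state])
        weights = list(transitions[state].values())
        # Only the current state is consulted: the history in `path` is unused.
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


print(sample_episode(TRANSITIONS, seed=0))
```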

    [Model Review] TadGAN(Time series Anomaly Detection GAN)

    For a recent assignment on fault diagnosis I wanted to apply a model newer than LSTM AE or CNN, so I picked TadGAN. I am not sure I fully understand it yet, but I will write down what I have learned so far. TadGAN (Time series Anomaly Detection GAN): TadGAN is a model published in 2020 and, as the name says, a GAN model for anomaly detection in time series data. GAN models specialize in reconstruction, image generation, and similar tasks; TadGAN uses this property to learn by reconstructing patterns, much like an LSTM Auto Encoder, and then flags as anomalies the parts of newly arriving data where the prediction error is large. Structure of TadGAN: TadGAN consists of 2 Generators and 2 Critics. Gene..
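The last step in this excerpt, flagging the points where the reconstruction error is large, can be sketched independently of the GAN itself. The example below is a simplified stand-in, not TadGAN's actual scoring: it assumes a reconstructed series is already available and uses a mean plus `k` standard deviations threshold, and the name `flag_anomalies` is hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(actual, reconstructed, k=3.0):
    """Indices whose absolute reconstruction error exceeds mean + k * std."""
    errors = [abs(a - r) for a, r in zip(actual, reconstructed)]
    threshold = mean(errors) + k * pstdev(errors)
    return [i for i, error in enumerate(errors) if error > threshold]


# Toy data: the model reconstructs a flat signal, the real one has a spike.
actual = [0.0] * 20
actual[7] = 5.0
reconstructed = [0.0] * 20

print(flag_anomalies(actual, reconstructed))
```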

    [Model Review] YOLOv5 + Roboflow Annotation

    ! Caution ! This post contains a brief summary of yolo v5, a short usage guide, and my personal opinion of roboflow annotation. 1. YOLOv5 Summary: You Only Look Once - a one stage detection model. Unlike R-CNN or Faster R-CNN, it looks at the image only once, without splitting the image. It integrates the preprocessing model and the neural network, and performs real-time object detection. Backbone: input image → feature map, CSP-Darknet https://keyog.tistory.com/30 Head: predict classes / bounding boxes. Dense Prediction: One stage detector (predict classes + b..

    [Model Review] MobileNet SSD Quick Paper Review

    The quality is not high. Caution! Mobile Object Detection model - based on VGG-16 https://arxiv.org/abs/1704.04861 1. Summary: a base model built on VGG-16. The original VGG-16 model applied a 3x3x3 convolution with 3 filters, for 81 parameters in total; to run on mobile devices, this model combines a depthwise convolution with a pointwise convolution, reducing the count to 3x3x1 x 3 + 1x1x3 x 3 = 27 + 9 = 36 parameters. → This is called a depthwise separable convolution. 2. Architectur..
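The parameter counts in this excerpt can be reproduced with two small formulas. The sketch below assumes square kernels and ignores bias terms; the function names are hypothetical.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard convolution (bias terms ignored)."""
    return kernel * kernel * in_channels * out_channels


def depthwise_separable_params(kernel, in_channels, out_channels):
    """Weights in a depthwise convolution followed by a 1x1 pointwise one."""
    depthwise = kernel * kernel * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels     # 1x1 convolution that mixes channels
    return depthwise + pointwise


print(standard_conv_params(3, 3, 3))        # 3x3x3 kernel, 3 filters
print(depthwise_separable_params(3, 3, 3))  # 27 depthwise + 9 pointwise
```

With a 3x3 kernel, 3 input channels, and 3 output channels this gives the 81 and 36 quoted in the excerpt; the saving grows with the number of output channels.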