[Light Deep Learning] n. Backpropagation: working through and verifying the equations

darly213 2024. 3. 13. 10:49

Source: "A Step by Step Backpropagation Example" (mattmazur.com), https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

 

Feed-forward calculation

1. Computing h1
$$net_{h1} = 0.05 * 0.15 + 0.1 * 0.2 + 0.35 = 0.3775$$
$$out_{h1} = \frac{1}{1 + e^{-0.3775}} = 0.5932699921071872$$

2. Computing h2
$$net_{h2} = 0.05 * 0.25 + 0.1 * 0.3 + 0.35 = 0.39249999999999996$$
$$out_{h2} = \frac{1}{1+e^{-0.3925}} = 0.596884378259767$$

3. Computing o1
$$net_{o1}= out_{h1}* 0.4 + out_{h2}* 0.45 + 0.6 = 1.10590596705977$$
$$out_{o1} = \frac{1}{1 + e^{-1.105906}} = 0.7513650695523157$$

4. Computing o2
$$net_{o2}= out_{h1}* 0.5 + out_{h2}* 0.55 + 0.6 = 1.2249214040964653$$
$$out_{o2}= \frac{1}{1+e^{-1.224921}}=0.7729284653214625$$

5. Computing the error
Here we use the squared error function, summed over the two outputs.
$$E_{total}= \sum\limits \frac12(target - output)^{2}$$
$$E_{o1}= \frac 12 (0.01 - 0.7513650695523157)^{2}= 0.274811083176155$$
$$E_{o2}= \frac 12 (0.99 - 0.7729284653214625)^{2}= 0.023560025583847746$$
$$E_{total}= E_{o1}+ E_{o2} = 0.274811083176155 + 0.023560025583847746 = 0.2983711087600027$$
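The five steps above can be reproduced in a few lines of Python. This is only a minimal sketch: the variable names (`i1`, `t1`, `b1`, and so on) are my own labels for the numbers used in the equations, not something defined in the source article.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic activation used by every neuron in the example."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs, targets, weights and biases taken from the equations above
i1, i2 = 0.05, 0.10
t1, t2 = 0.01, 0.99
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output
b1, b2 = 0.35, 0.60

# Hidden layer
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)

# Output layer
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

# Squared error summed over both outputs
e_total = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2
print(out_h1, out_h2, out_o1, out_o2, e_total)
```

The printed values should agree with the hand calculation up to floating-point rounding.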

 

 

Back propagation

Output layer backward pass

Example

  • To update w5, we need the partial derivative of $E_{total}$ (the cost) with respect to w5.
  • Applying the chain rule gives:

$$\frac{\partial E_{total}}{\partial w_{5}} = \frac{\partial E_{total}}{\partial out_{o1}} * \frac{\partial out_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial w_{5}}$$

 

  1. $\frac{\partial E_{total}}{\partial out_{o1}}$

$$E_{total} = \frac12 (target_{o1} - out_{o1})^{2} + \frac12 (target_{o2} - out_{o2})^{2}$$
$$\frac{\partial E_{total}}{\partial out_{o1}} = 2 * \frac 12 (target_{o1} - out_{o1})^{2-1} * -1 + 0 \ = -target_{o1}+ out_{o1} = -0.01 + 0.7513650695523157 = 0.7413650695523157$$

  • The outer derivative (power rule) contributes the factor 2, the inner derivative contributes -1, and the second term is treated as a constant because it does not depend on $out_{o1}$.

 

  2. $\frac{\partial out_{o1}}{\partial net_{o1}}$
    $$out_{o1}= \frac{1}{1+e^{-net_{o1}}}$$
    $$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1-out_{o1}) = 0.18681560180895948$$
  • This follows from the derivative of the logistic function: if $f(x) = \frac{1}{1+e^{-x}}$, then $f'(x) = f(x)(1-f(x))$.

 

  3. $\frac{\partial net_{o1}}{\partial w_{5}}$
    $$net_{o1}= w_{5} * out_{h1}+ w_{6}* out_{h2} + b_{2} * 1$$
    $$\frac{\partial net_{o1}}{\partial w_{5}} = out_{h1}= 0.5932699921071872$$


  4. $\frac{\partial E_{total}}{\partial w_{5}}$
    $$\frac{\partial E_{total}}{\partial w_{5}} = 0.7413650695523157 * 0.18681560180895948 * 0.5932699921071872 = 0.08216704056423078$$


  5. Updating $w_{5}^{+}$ (learning rate $lr = 0.5$)
    $$w_{5}^{+} = w_{5}- lr * \frac{\partial E_{total}}{\partial w_{5}} = 0.4 - 0.5 * 0.08216704056423078 = 0.35891647971788465$$
  • Summarizing the steps above, the gradient for a hidden-to-output weight is:
    $$-(target_{o} -out_{o}) * out_{o}(1-out_{o}) * out_{h}$$
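The chain-rule steps for $w_5$ can be sketched in Python as follows. The forward-pass values are copied from the calculation above; the variable names are my own and only illustrative.

```python
# Forward-pass values copied from the worked example
target_o1 = 0.01
out_o1 = 0.7513650695523157
out_h1 = 0.5932699921071872
w5 = 0.4
lr = 0.5  # learning rate used in the example

# The three chain-rule factors
dE_dout = out_o1 - target_o1            # dE_total / d out_o1
dout_dnet = out_o1 * (1.0 - out_o1)     # derivative of the logistic function
dnet_dw5 = out_h1                       # d net_o1 / d w5

grad_w5 = dE_dout * dout_dnet * dnet_dw5
w5_new = w5 - lr * grad_w5
print(grad_w5, w5_new)
```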

 

 

Updating the remaining weights

$$\begin{matrix} w_{6}^{+} &=& 0.45 - 0.5 * (-0.01 + out_{o1}) * out_{o1}(1-out_{o1}) * out_{h2}
\\ &=& 0.45 - 0.5 * (-0.01 + 0.7513650695523157) * 0.7513650695523157(1-0.7513650695523157) *0.596884378259767
\\ &=& 0.45 - 0.5 * 0.08266762784753325
\\ &=& 0.4086661860762334 \end{matrix}$$

 

$$\begin{matrix} w_{7}^{+} &=& 0.5 - 0.5 * (-0.99 + out_{o2}) * out_{o2}(1-out_{o2}) * out_{h1}
\\ &=& 0.5 - 0.5 * (-0.99 + 0.7729284653214625) * 0.7729284653214625(1-0.7729284653214625) *0.5932699921071872
\\ &=& 0.5 - 0.5 * -0.02260254047747507
\\ &=& 0.5113012702387375 \end{matrix}$$

 

$$\begin{matrix} w_{8}^{+} &=& 0.55 - 0.5 * (-0.99 + out_{o2}) * out_{o2}(1-out_{o2}) * out_{h2}
\\ &=& 0.55 - 0.5 * (-0.99 + 0.7729284653214625)* 0.7729284653214625(1-0.7729284653214625) *0.596884378259767
\\ &=& 0.55 - 0.5 * -0.022740242215978222
\\ &=& 0.5613701211079891 \end{matrix}$$
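All four hidden-to-output updates share the same pattern, so they can be computed in one loop. The sketch below is an illustration only: the dictionary layout and the name `delta` are assumptions of mine, while the numbers come from the calculations above.

```python
lr = 0.5
targets = {"o1": 0.01, "o2": 0.99}
outs = {"o1": 0.7513650695523157, "o2": 0.7729284653214625}
hidden = {"h1": 0.5932699921071872, "h2": 0.596884378259767}

# weight name -> (hidden unit it reads from, output unit it feeds, current value)
weights = {
    "w5": ("h1", "o1", 0.40),
    "w6": ("h2", "o1", 0.45),
    "w7": ("h1", "o2", 0.50),
    "w8": ("h2", "o2", 0.55),
}

# delta_o = (out_o - target_o) * out_o * (1 - out_o), shared by both weights of an output
delta = {o: (outs[o] - targets[o]) * outs[o] * (1.0 - outs[o]) for o in outs}

# gradient = delta_o * out_h, then one gradient-descent step
updated = {name: w - lr * delta[o] * hidden[h] for name, (h, o, w) in weights.items()}
for name, value in updated.items():
    print(name, value)
```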

 

 

Hidden layer calculation

Example (w1)

 

$$\frac{\partial E_{total}}{\partial w_{1}}= \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$

  • We need to evaluate the expression above. Because $out_{h1}$ feeds both output neurons, its derivative has two parts.
    $$E_{total}= E_{o1} + E_{o2}, \quad \therefore \frac{\partial E_{total}}{\partial out_{h1}} = \frac{\partial E_{o1}}{\partial out_{h1}} + \frac{\partial E_{o2}}{\partial out_{h1}}$$
    $$\frac{\partial E_{o1}}{\partial out_{h1}} = \frac{\partial E_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h1}}$$

 

  • Term-by-term breakdown - 1

$$\frac{\partial E_{o1}}{\partial net_{o1}} = \frac{\partial E_{o1}}{\partial out_{o1}} * \frac{\partial out_{o1}}{\partial net_{o1}}$$

$$ \frac{\partial E_{o1}}{\partial out_{o1}} = 2 * \frac 12 (target_{o1} - out_{o1})^{2-1} * -1 + 0 \ = -target_{o1}+ out_{o1} = 0.7413650695523157$$

$$\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1-out_{o1}) = 0.18681560180895948$$

$$\therefore \frac{\partial E_{o1}}{\partial net_{o1}} = \frac{\partial E_{o1}}{\partial out_{o1}} * \frac{\partial out_{o1}}{\partial net_{o1}} = 0.7413650695523157 * 0.18681560180895948 = 0.13849856162855698$$

 

  • Term-by-term breakdown - 2

$$\begin{matrix} net_{o1} &=& w_{5} * out_{h1}+ w_{6} * out_{h2}+ b_{2}*1
\\ \therefore \frac{\partial net_{o1}}{\partial out_{h1}} &=& w_{5} = 0.4\end{matrix}$$

 

  • Combining

$$\frac{\partial E_{o1}}{\partial out_{h1}} = \frac{\partial E_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h1}} = 0.13849856162855698 * 0.4 = 0.05539942465142279$$

 

 

The other term

$$\frac{\partial E_{o2}}{\partial out_{h1}} = \frac{\partial E_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h1}}$$

 

  • Term-by-term breakdown - 1

$$\frac{\partial E_{o2}}{\partial net_{o2}} = \frac{\partial E_{o2}}{\partial out_{o2}} * \frac{\partial out_{o2}}{\partial net_{o2}}$$


$$ \begin{matrix} \frac{\partial E_{o2}}{\partial out_{o2}} &=& 2 * \frac 12 (target_{o2} - out_{o2})^{2-1} * -1 + 0 \\ &=& -target_{o2}+ out_{o2} = -0.99 + 0.7729284653214625 = -0.21707153467853746 \end{matrix} $$

 

$$\frac{\partial out_{o2}}{\partial net_{o2}} = out_{o2}(1-out_{o2}) = 0.17551005281727122$$


$$\begin{matrix} \frac{\partial E_{o2}}{\partial net_{o2}} &=& \frac{\partial E_{o2}}{\partial out_{o2}} * \frac{\partial out_{o2}}{\partial net_{o2}} \\ &=& -0.21707153467853746 * 0.17551005281727122 = -0.03809823651655623 \end{matrix}$$

 

  • Term-by-term breakdown - 2

$$\frac{\partial net_{o2}}{\partial out_{h1}} = w_{7} = 0.5$$

 

  • Combining

$$\begin{matrix} \frac{\partial E_{o2}}{\partial out_{h1}} &=& \frac{\partial E_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h1}} \\ &=& -0.03809823651655623* 0.5 = -0.019049118258278114\end{matrix}$$

 

 

Full expression

$$\frac{\partial E_{total}}{\partial w_{1}}= \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$


$$\begin{matrix} \frac{\partial E_{total}}{\partial out_{h1}} &=& \frac{\partial E_{o1}}{\partial out_{h1}} + \frac{\partial E_{o2}}{\partial out_{h1}} \\ &=& 0.05539942465142279 + -0.019049118258278114 = 0.03635030639314468 \end{matrix}$$


$$\frac{\partial out_{h1}}{\partial net_{h1}} = out_{h1}(1-out_{h1}) = 0.24130070857232525$$


$$\begin{matrix} net_{h1} &=& input_{1}* w_{1}+ input_{2}* w_{2}+ b_{1}* 1
\\ \therefore \frac{\partial net_{h1}}{\partial w_{1}} &=& input_{1} = 0.05\end{matrix} $$

 

  • Putting it together

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{1}} &=& \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}
\\ & =& 0.03635030639314468 * 0.24130070857232525 * 0.05
\\ & = & 0.00043856773447434685\end{matrix}$$


$$w_{1}^{+}= w_{1}-lr * \frac{\partial E_{total}}{\partial w_1}$$


$$w_{1}^{+} = 0.15 - 0.5 * 0.00043856773447434685 = 0.1497807161327628$$

 

 

Summary

$$\frac{\partial E_{total}}{\partial w_{1}}= \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$

 

$$\frac{\partial E_{total}}{\partial w_{1}}= \left(\frac{\partial E_{o1}}{\partial out_{h1}} + \frac{\partial E_{o2}}{\partial out_{h1}}\right) * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$

 

$$\frac{\partial E_{total}}{\partial w_{1}}=
\left\{\left(\frac{\partial E_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h1}}\right) + \left(\frac{\partial E_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h1}}\right)\right\} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$

 

$$\frac{\partial E_{total}}{\partial w_{1}}=
\left\{\left(\frac{\partial E_{o1}}{\partial out_{o1}} * \frac{\partial out_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h1}}\right) +
\left(\frac{\partial E_{o2}}{\partial out_{o2}} * \frac{\partial out_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h1}}\right)\right\} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{1}} &=& \Big[ (-target_{o1}+ out_{o1}) * \frac{\partial out_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h1}} \\ &+& (-target_{o2}+ out_{o2}) * \frac{\partial out_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h1}} \Big] \\ & * & \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{1}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * \frac{\partial net_{o1}}{\partial out_{h1}} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * \frac{\partial net_{o2}}{\partial out_{h1}} \Big] \\
& * & \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{1}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{1}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * w_{5} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * w_{7} \Big] \\
& * & out_{h1}(1-out_{h1}) * \frac{\partial net_{h1}}{\partial w_{1}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{1}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * w_{5} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * w_{7} \Big] \\
& * & out_{h1}(1-out_{h1}) * input_{1} \end{matrix}$$
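One way to verify a closed-form gradient like the one above is to compare it with a numerical derivative. The sketch below is not from the source article: `total_error` and `numeric_grad` are hypothetical helper names, and the central-difference step `eps=1e-6` is an arbitrary choice.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def total_error(w, x=(0.05, 0.10), t=(0.01, 0.99), b1=0.35, b2=0.60):
    """Forward pass of the 2-2-2 example network; w = [w1, ..., w8]."""
    h1 = sigmoid(w[0] * x[0] + w[1] * x[1] + b1)
    h2 = sigmoid(w[2] * x[0] + w[3] * x[1] + b1)
    o1 = sigmoid(w[4] * h1 + w[5] * h2 + b2)
    o2 = sigmoid(w[6] * h1 + w[7] * h2 + b2)
    return 0.5 * (t[0] - o1) ** 2 + 0.5 * (t[1] - o2) ** 2

def numeric_grad(w, k, eps=1e-6):
    """Central-difference estimate of dE_total / d w[k]."""
    up, down = list(w), list(w)
    up[k] += eps
    down[k] -= eps
    return (total_error(up) - total_error(down)) / (2 * eps)

w0 = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]
num_w1 = numeric_grad(w0, 0)   # compare with the analytic dE_total/dw1
num_w5 = numeric_grad(w0, 4)   # compare with the analytic dE_total/dw5
print(num_w1, num_w5)
```

If the derivation is right, both estimates should match the analytic gradients for $w_1$ and $w_5$ to several significant digits.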

 

 

Remaining weights

$w_{2}$

$$\frac{\partial E_{total}}{\partial w_{2}}= \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{2}}$$

 

  • This has the same form as the expression for $w_1$; only the final input factor differs ($input_{2} = 0.1$ instead of $input_{1} = 0.05$).

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{2}} &=& \frac{\partial E_{total}}{\partial out_{h1}} * \frac{\partial out_{h1}}{\partial net_{h1}} * \frac{\partial net_{h1}}{\partial w_{2}}
\\ & =& 0.03635030639314468 * 0.24130070857232525 * 0.1
\\ & = & 0.0008771354689486937\end{matrix}$$

 

$$w_{2}^{+} = w_{2} - lr * \frac{\partial E_{total}}{\partial w_{2}}$$
$$w_{2}^{+} = 0.2 - 0.5 * 0.0008771354689486937 = 0.19956143226552567$$

 

 

$w_{3}$

$$\frac{\partial E_{total}}{\partial w_{3}}= \frac{\partial E_{total}}{\partial out_{h2}} * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}}$$

 

$$\frac{\partial E_{total}}{\partial w_{3}}= (\frac{\partial E_{o1}}{\partial out_{h2}} + \frac{\partial E_{o2}}{\partial out_{h2}}) * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}}$$

 

$$\frac{\partial E_{total}}{\partial w_{3}}=
\left\{\left(\frac{\partial E_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h2}}\right) + \left(\frac{\partial E_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h2}}\right)\right\} * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}}$$

 

$$\frac{\partial E_{total}}{\partial w_{3}}=
\left\{\left(\frac{\partial E_{o1}}{\partial out_{o1}} * \frac{\partial out_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h2}}\right) +
\left(\frac{\partial E_{o2}}{\partial out_{o2}} * \frac{\partial out_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h2}}\right)\right\} * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& \Big[ (-target_{o1}+ out_{o1}) * \frac{\partial out_{o1}}{\partial net_{o1}} * \frac{\partial net_{o1}}{\partial out_{h2}} \\ &+&
(-target_{o2}+ out_{o2}) * \frac{\partial out_{o2}}{\partial net_{o2}} * \frac{\partial net_{o2}}{\partial out_{h2}} \Big] \\
&* & \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * \frac{\partial net_{o1}}{\partial out_{h2}} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * \frac{\partial net_{o2}}{\partial out_{h2}} \Big] \\
& * & \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * w_{6} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * w_{8} \Big] \\
& * & \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{3}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * w_{6} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * w_{8} \Big] \\
& * & out_{h2}(1-out_{h2}) * \frac{\partial net_{h2}}{\partial w_{3}} \end{matrix}$$

 

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& \Big[ (-target_{o1}+ out_{o1}) * out_{o1}(1-out_{o1}) * w_{6} \\ &+&
(-target_{o2}+ out_{o2}) * out_{o2}(1-out_{o2}) * w_{8} \Big] \\
&*& out_{h2}(1-out_{h2}) * input_{1} \end{matrix}$$

 

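$$\frac{\partial E_{o1}}{\partial out_{h2}} = 0.13849856162855698 * 0.45 = 0.06232435273285064, \quad \frac{\partial E_{o2}}{\partial out_{h2}} = -0.03809823651655623 * 0.55 = -0.020954030084105933$$

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{3}} &=& (0.06232435273285064 + -0.020954030084105933) * 0.01203067086246092 \approx 0.000497712735261 \end{matrix}$$

 

$$w_{3}^{+} = w_{3}- lr * \frac{\partial E_{total}}{\partial w_{3}}$$

 

$$w_{3}^{+} = 0.25 - 0.5 * 0.000497712735261 \approx 0.249751143632$$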

 

 

$w_{4}$

$$\frac{\partial E_{total}}{\partial w_{4}}= \frac{\partial E_{total}}{\partial out_{h2}} * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{4}}$$

 

  • This has the same form as the expression for $w_3$; only the final input factor differs ($input_{2} = 0.1$ instead of $input_{1} = 0.05$).

$$\begin{matrix} \frac{\partial E_{total}}{\partial w_{4}} &=& \frac{\partial E_{total}}{\partial out_{h2}} * \frac{\partial out_{h2}}{\partial net_{h2}} * \frac{\partial net_{h2}}{\partial w_{4}}
\\ & =& 0.04137032264874471 * 0.2406134172492184 * 0.1
\\ & \approx & 0.000995425470522\end{matrix}$$

 

  • Therefore $w_{4}^{+}$ is as follows.

$$w_{4}^{+} = w_{4}- lr * \frac{\partial E_{total}}{\partial w_{4}}$$


$$w_{4}^{+} = 0.3 - 0.5 * 0.000995425470522 \approx 0.299502287265$$
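The four input-to-hidden updates can also be computed in one pass, which makes it easy to cross-check the hand arithmetic for $w_1$ through $w_4$. This is a sketch under my own naming (`delta_o`, `w_out`, `w_in`); the numbers are taken from the forward pass and the output-layer deltas computed earlier.

```python
lr = 0.5
inputs = (0.05, 0.10)
out_h = (0.5932699921071872, 0.596884378259767)        # out_h1, out_h2
delta_o = (0.13849856162855698, -0.03809823651655623)  # dE_o1/dnet_o1, dE_o2/dnet_o2

# Original hidden->output weights leaving each hidden unit: h1 -> (w5, w7), h2 -> (w6, w8)
w_out = ((0.40, 0.50), (0.45, 0.55))
# Input->hidden weights entering each hidden unit: h1 <- (w1, w2), h2 <- (w3, w4)
w_in = ((0.15, 0.20), (0.25, 0.30))

new_w_in = []
for j in range(2):
    # Sum the error arriving from both output neurons
    dE_dout_h = sum(d * w for d, w in zip(delta_o, w_out[j]))
    # Multiply by the logistic derivative of the hidden unit
    delta_h = dE_dout_h * out_h[j] * (1.0 - out_h[j])
    # One gradient-descent step per incoming weight
    new_w_in.append(tuple(w - lr * delta_h * x for w, x in zip(w_in[j], inputs)))

(w1_new, w2_new), (w3_new, w4_new) = new_w_in
print(w1_new, w2_new, w3_new, w4_new)
```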

 

 

Checking the improvement after the update

  • Run the feed-forward pass again with the updated weights (rounded to four decimal places) and compare the cost.

 

feed forwarding

 

$$net_{h1}= 0.05 * 0.1498 + 0.1 * 0.1996 + 0.35 = 0.37744999999999995$$
$$out_{h1}= \frac{1}{1 + e^{-net_{h1}}} = 0.5932579270154956$$
$$net_{h2}= 0.05 * 0.2498 + 0.1 * 0.2996 + 0.35 = 0.39244999999999997$$
$$out_{h2}= \frac{1}{1 + e^{-net_{h2}}} = 0.5968723475306276$$

$$net_{o1}= out_{h1}* 0.3589 + out_{h2} * 0.4087 + 0.6 = 1.056861998441629$$
$$out_{o1} = \frac{1}{1 + e^{-net_{o1}}} = 0.7420904125813247$$
$$net_{o2}= out_{h1}* 0.5113 + out_{h2} * 0.5614 + 0.6 = 1.2384169139867174$$
$$out_{o2} = \frac{1}{1 + e^{-net_{o2}}} = 0.7752883350296511$$

 

Computing the cost

$$E_{total}= \sum\limits \frac12(target - output)^{2}$$
$$E_{o1} = \frac 12 (0.01 - 0.7420904125813247)^{2} = 0.2679781860967471$$
$$E_{o2} = \frac 12 (0.99 - 0.7752883350296511)^{2} = 0.02305054953716967$$
$$\therefore E_{total}= 0.2679781860967471 + 0.02305054953716967 = 0.2910287356339168$$

 

  • Comparing with the previous cost:
    $$E_{prev} = 0.2983711087600027$$
    $$E_{total}= 0.2910287356339168$$
    $$E_{prev} - E_{total} = 0.007342373126085933$$

The cost has clearly gone down after a single update.
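The whole round (forward pass, backward pass, weight update, second forward pass) can be put together as a small script. This is a minimal sketch with hypothetical function names (`forward`, `error`, `train_step`). As in the walkthrough, only the eight weights are updated and the biases stay fixed. Because the script keeps full precision instead of rounding the weights to four decimals, the last digits of the new cost can differ slightly from the hand calculation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w, x, b1=0.35, b2=0.60):
    """Return hidden and output activations for w = [w1, ..., w8]."""
    h1 = sigmoid(w[0] * x[0] + w[1] * x[1] + b1)
    h2 = sigmoid(w[2] * x[0] + w[3] * x[1] + b1)
    o1 = sigmoid(w[4] * h1 + w[5] * h2 + b2)
    o2 = sigmoid(w[6] * h1 + w[7] * h2 + b2)
    return (h1, h2), (o1, o2)

def error(outs, t):
    """Squared error summed over the outputs."""
    return sum(0.5 * (ti - oi) ** 2 for ti, oi in zip(t, outs))

def train_step(w, x, t, lr=0.5):
    """One gradient-descent step on the weights only (biases are not updated)."""
    (h1, h2), (o1, o2) = forward(w, x)
    d1 = (o1 - t[0]) * o1 * (1 - o1)               # dE/dnet_o1
    d2 = (o2 - t[1]) * o2 * (1 - o2)               # dE/dnet_o2
    dh1 = (d1 * w[4] + d2 * w[6]) * h1 * (1 - h1)  # dE/dnet_h1
    dh2 = (d1 * w[5] + d2 * w[7]) * h2 * (1 - h2)  # dE/dnet_h2
    grads = [dh1 * x[0], dh1 * x[1], dh2 * x[0], dh2 * x[1],
             d1 * h1, d1 * h2, d2 * h1, d2 * h2]
    return [wi - lr * g for wi, g in zip(w, grads)]

x, t = (0.05, 0.10), (0.01, 0.99)
w = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]

e_before = error(forward(w, x)[1], t)
w_new = train_step(w, x, t)
e_after = error(forward(w_new, x)[1], t)
print(e_before, e_after, e_before - e_after)
```

Calling `train_step` repeatedly in a loop should keep lowering the cost for this single training example.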
