๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
Coursera/Deep Learning Specialization

[Deep-Special] [Lec1] Week3. Shallow Neural Network

by Steve-Lee 2020. 11. 8.

๐Ÿ“” [Lec1] Neural Networks & Deep Learning

Shallow Neural Network

3์ฃผ ์ฐจ ๋ชฉํ‘œ

  • Describe hidden units and hidden layers
  • Use units with a non-linear activation function, such as tanh
  • Implement forward and backward propagation
  • Apply random initialization to your neural network
  • Increase fluency in Deep Learning notations and Neural Network Representations
  • Implement a 2-class classification neural network with a single hidden layer

3์ฃผ ์ฐจ ํ•™์Šต ๋‚ด์šฉ

  • ์ง€๋‚œ ์‹œ๊ฐ„์— Logistic Regression์— ๋Œ€ํ•ด ํ•™์Šตํ–ˆ๋‹ค
  • ์ด๋ฒˆ ์ฃผ๋Š” ์–•์€ ์‹ ๊ฒฝ๋ง์„ ์Œ“๊ณ  ๊ตฌํ˜„ํ•˜๋Š” ์‹œ๊ฐ„์„ ๊ฐ–๋Š”๋‹ค
  • Neural Network๋ž€ ๋ฌด์—‡์ธ๊ฐ€
  • Weight ์ดˆ๊ธฐํ™”๋ฅผ ์–ด๋–ป๊ฒŒ ํ•ด์•ผํ• ๊นŒ
  • Vectorization์„ ํ†ตํ•ด ์—ฌ๋Ÿฌ ์ƒ˜ํ”Œ๋“ค์„ ์–ด๋–ป๊ฒŒ ํ•™์Šต์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š”๊ฐ€
  • Activation Function์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ ์™€ ์ข…๋ฅ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณธ๋‹ค

 

↓ 2์ฃผ ์ฐจ Logistic Regression์— ๋Œ€ํ•œ ์ •๋ฆฌ๋…ธํŠธ๋Š” ์•„๋ž˜์˜ ๋งํฌ์—์„œ ํ™•์ธํ•ด ๋ณผ ์ˆ˜ ์žˆ๋‹ค

 

[Deep-Special] [Lec1] Week2. Logistic Regression as a Neural Network (Last Update-20.11.02.Mon)

๐Ÿ“” [Lec1] Neural Networks & Deep Learning Deep Learning Specialization Course์˜ ์ฒซ ๋ฒˆ์งธ ๊ฐ•์˜ 'Neural Networks & Deep Learning'์˜ 2์ฃผ์ฐจ ๊ณผ์ •์ž…๋‹ˆ๋‹ค. 2์ฃผ์ฐจ ๋ชฉํ‘œ ์ˆœ์ „ํŒŒ, ์—ญ์ „ํŒŒ ๋กœ์ง€์Šคํ‹ฑ ํšŒ๊ท€๋ถ„์„ ๋น„์šฉ ํ•จ..

deepinsight.tistory.com

 

1. Computing a Neural Network's Output

  • 1๊ฐœ์˜ Feature Vector๊ฐ€ ์ฃผ์–ด์ง€๋ฉด ์ฝ”๋“œ 4์ค„๋กœ Single Neural Network๋ฅผ ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ๋‹ค
  • Logistic Regression์„ ๊ตฌํ˜„ํ–ˆ๋˜ ๊ฒƒ๊ณผ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ Vectorํ™”๋ฅผ ํ†ตํ•ด ํŠธ๋ ˆ์ด๋‹ ์ƒ˜ํ”Œ์„ ํ•™์Šต์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค
  • Vectorization → Whole Training Sample์„ ํ•™์Šต์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค
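As a rough sketch of those four lines in NumPy (the layer sizes and the tanh/sigmoid pairing below are illustrative choices, not taken from the assignment):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative sizes: 3 input features, 4 hidden units, 1 output unit
n_x, n_h, n_y = 3, 4, 1
W1, b1 = np.random.randn(n_h, n_x) * 0.01, np.zeros((n_h, 1))
W2, b2 = np.random.randn(n_y, n_h) * 0.01, np.zeros((n_y, 1))

x = np.random.randn(n_x, 1)  # a single feature vector (column)

# The four lines: two linear steps, two activations
z1 = W1 @ x + b1
a1 = np.tanh(z1)
z2 = W2 @ a1 + b2
a2 = sigmoid(z2)  # prediction for this one sample, shape (1, 1)
```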

2. Vectorizing across multiple examples

multiple example์— ๋Œ€ํ•ด ๊ฒฐ๊ณผ๋ฅผ ๊ตฌํ•ด๋ณด์ž

 

e.g. a[2](i): the activation of layer 2 for the i-th training example

  • m๊ฐœ์˜ training sample์— ๋Œ€ํ•ด Vectorํ™”๋ฅผ ํ•ด๋ณด์ž

์œ„์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์•ž์„œ 1๋ฒˆ์—์„œ ๋งํ–ˆ๋˜ ์ฝ”๋“œ 4์ค„๋กœ Neural Network ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ๊ฒƒ์„ ๋ณด์—ฌ์ค€๋‹ค.

3. Activation Functions

๋‹ค์–‘ํ•œ Activation Function์˜ ํŠน์ง•์„ ์•Œ์•„๋ณด์ž

  • ์ผ๋ฐ˜์ ์œผ๋กœ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋‚ด๋Š” ํ™œ์„ฑํ™” ํ•จ์ˆ˜๊ฐ€ ๋ชจ๋“  ๋ฌธ์ œ์—์„œ ์ตœ์„ ์˜ ๊ฒฐ๊ณผ๋ฅผ ๊ฐ€์ ธ๋‹ค์ฃผ์ง€๋Š” ์•Š๋Š”๋‹ค
  • ํ™œ์„ฑํ™” ํ•จ์ˆ˜๊ฐ€ ์ตœ์ƒ์˜ ๊ฒฐ๊ณผ๋ฅผ ๋‚ด๋Š”์ง€ ๋ชจ๋ฅด๊ฒ ๋‹ค๋ฉด, ์ „๋ถ€ ์‹œ๋„ํ•ด๋ณด๊ณ  ํ‰๊ฐ€ํ•ด๋ณด๋Š” ๊ฒƒ์„ ์ถ”์ฒœํ•œ๋‹ค

Then... Why do we need non-linear activation functions?

4. Why do you need non-linear activation functions

  • Without non-linear activation functions, stacking multiple layers is equivalent to a single linear function
  • Once you understand why the activations are needed, the way a neural network learns becomes much more interesting

5. Backpropagation Intuition
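As a sketch of the intuition: for this week's one-hidden-layer network (tanh hidden layer, sigmoid output, cross-entropy loss), the gradient equations from the lecture come out as below. The variable names follow the course convention; the function packaging is my own sketch.

```python
import numpy as np

def backprop(X, Y, A1, A2, W2, m):
    """Gradients for a tanh-hidden / sigmoid-output network with
    cross-entropy loss (the setting used in this week's lectures)."""
    dZ2 = A2 - Y                                   # (n_y, m)
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)             # tanh'(z) = 1 - tanh(z)^2
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2
```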

6. Random Initialization

  • Why? - Just as before, it is important to understand why we initialize the weight values randomly

โœ… ์™œ weight๋ฅผ ์ž‘์€ ๊ฐ’์œผ๋กœ ์ดˆ๊ธฐํ™” ํ•˜๋Š”๊ฑด๊ฐ€?

  • ๋งŒ์•ฝ w๊ฐ€ ํฌ๋‹ค๋ฉด z๊ฐ’๋„ ์ปค์งˆ ๊ฒƒ์ด๊ณ 
  • ๊ทธ๋ ‡๊ฒŒ ๋˜๋ฉด sigmoid ํ•จ์ˆ˜๋‚˜ tanh ํ•จ์ˆ˜๊ฐ€ Saturation ๋  ๊ฒƒ์ด๋‹ค
  • ๋”ฐ๋ผ์„œ ์ž‘์€ ์ˆ˜๋กœ weight๋ฅผ ์ดˆ๊ธฐํ™”ํ•˜๋Š” ๊ฒƒ์ด ์ผ๋ฐ˜์ ์ด๋‹ค

 

โœ… weight๋ฅผ ํฐ ๊ฐ’์œผ๋กœ ์ดˆ๊ธฐํ™”ํ•˜๋ฉด ์•ˆ ๋˜๋Š” ๊ฑด๊ฐ€? (์ด ๊ฒฝ์šฐ๋„ ํ•œ ๋ฒˆ ์ƒ๊ฐํ•ด๋ณด์ž)

  • weight๊ฐ’์„ ํฐ ๊ฐ’์œผ๋กœ ์ดˆ๊ธฐํ™”ํ•  ๊ฒฝ์šฐ sigmoid function๊ณผ tanh function์˜ ๊ฒฐ๊ณผ ๊ฐ’์ด ํฐ ๊ฐ’์„ ๊ฐ™๊ฒŒ ๋œ๋‹ค
  • ๋”ฐ๋ผ์„œ Backpropagation ์ˆ˜ํ–‰ ์‹œ ๊ธฐ์šธ๊ธฐ๊ฐ€ 0์— weight๊ฐ’์ด ๊ฐ™์ด ์—…๋ฐ์ดํŠธ๋˜๊ธฐ ๋•Œ๋ฌธ์— ํ•™์Šต์ด ๋Š๋ ค์งˆ ๊ฒƒ์ด๋‹ค(์ตœ์ ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์ˆ˜ํ–‰์ด ๋Š๋ ค์ง„๋‹ค)
  • ์ด๋ฅผ Saturation์ด๋ผ๊ณ  ํ•œ๋‹ค 

 

Summary

  • ์ง€๋‚œ ์‹œ๊ฐ„์—๋Š” Deep Learning์˜ ๊ธฐ๋ณธ์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ๋Š” Logistic Regression๊ณผ Cost Function์— ๋Œ€ํ•ด ํ•™์Šตํ–ˆ๋‹ค
  • ์ด๋ฒˆ ์‹œ๊ฐ„์—๋Š” Shallow Neural Network๋ผ๋Š” 1๊ฐœ์˜ hidden layer๋ฅผ ๊ฐ–๋Š” ์–•์€ ์‹ ๊ฒฝ๋ง์„ ์ง์ ‘ ๊ตฌํ˜„ํ•ด ๋ดค๋‹ค
  • ๊ตฌํ˜„ ๊ณผ์ •์—์„œ ํ•„์š”ํ•œ ์ง€์‹์œผ๋กœ๋Š” ์—ฌ๋Ÿฌ ํŠธ๋ ˆ์ด๋‹ ๋ฐ์ดํ„ฐ๋ฅผ ํ•œ ๋ฒˆ์— ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ๋Š” 'Vectorization'์™€ Activation Function์— ๋Œ€ํ•œ ๊ฐœ๋…์ด ํ•„์š”ํ•˜๋‹ค
  • Assignment๋กœ๋Š” Shallow Neural Network๋ฅผ ํ†ตํ•œ Binary Classification ๊ณผ์ œ๊ฐ€ ์ฃผ์–ด์ง„๋‹ค
  • Network๋ฅผ ์ •์˜ํ•˜๊ณ , Weight๋ฅผ ์ดˆ๊ธฐํ™”ํ•˜๊ณ , Forwardprop/Backprop ๊ณ„์‚ฐ์„ ํ•˜๊ณ , Weight๋ฅผ ์—…๋ฐ์ดํŠธํ•ด์ฃผ๋Š” ์ผ๋ จ์˜ ํ”„๋กœ์„ธ์Šค๋ฅผ ๊ธฐ์–ตํ•˜๋„๋ก ํ•˜์ž
  • ๊ณผ์ œ๋ฅผ ๊ตฌํ˜„ํ•˜๋Š” ๊ณผ์ •์—์„œ ์ˆ˜์—…์‹œ๊ฐ„์— ๋ช…ํ™•ํ•˜๊ฒŒ ์ดํ•ด๊ฐ€ ๊ฐ€์ง€ ์•Š์•˜๋˜ ๋ถ€๋ถ„์„ ์ดํ•ดํ•  ์ˆ˜ ์žˆ์—ˆ๋˜ ๊ฒƒ ๊ฐ™๋‹ค
  • ์ด๋ ‡๊ฒŒ 3์ฃผ ์ฐจ๊นŒ์ง€ ๋น ๋ฅด๊ฒŒ ๋‹ฌ๋ ค์™”๋‹ค. ์•ž์œผ๋กœ์˜ ์‹œ๊ฐ„์ด ๋” ๊ธฐ๋Œ€๋œ๋‹ค -20.11.08.Sun.pm5:50- 

 

Reference

 

์‹ฌ์ธต ํ•™์Šต

Learn Deep Learning from deeplearning.ai. If you want to break into Artificial intelligence (AI), this Specialization will help you. Deep Learning is one of the most highly sought after skills in tech. We will help you become good at Deep Learning.

www.coursera.org

 

๋Œ“๊ธ€