[Model Review] MobileNet SSD Paper Quick Review

darly213 2022. 12. 13. 09:09


The quality of this post is not high. Reader beware!

Mobile object detection model, based on VGG-16

1. Summary

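A baseline model based on VGG-16. The original VGG-16 uses standard convolutions: a 3x3 kernel over 3 input channels with 3 output filters costs 3 x 3 x 3 x 3 = 81 parameters. To fit on mobile devices, this model pairs a depthwise convolution with a pointwise convolution, which brings the count down to (3 x 3 x 1) x 3 + (1 x 1 x 3) x 3 = 27 + 9 = 36 parameters.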

→ This is called depthwise separable convolution.
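
To double-check the 81 vs. 36 arithmetic, here is a minimal sketch that counts weights for both layer types. The function names are my own, and biases are ignored (an assumption, since the counts above do not include them).

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution: every output filter spans all input channels."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise conv (one k x k filter per input channel)
    plus a pointwise 1x1 conv that mixes the channels."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# The example from the text: 3x3 kernel, 3 input channels, 3 output channels.
standard = standard_conv_params(3, 3, 3)
separable = depthwise_separable_params(3, 3, 3)
print(standard, separable)  # 81 36
```

With only 3 channels the saving is modest; the gap widens as the channel counts grow, because the standard cost multiplies by the kernel area while the separable cost only adds it.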

2. Architecture

Depthwise Separable Convolution
- Apply a single filter to each input channel (depthwise)
- Apply a pointwise (1x1) convolution to the filter outputs to combine them
A standard convolution filters and combines in one shot; here the two are split into separate steps.
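
The two steps above can be sketched in plain Python on nested lists. This is only an illustration under my own assumptions (no padding, no bias, feature maps shaped `[channels][height][width]`), not the paper's implementation, and the function names are made up for this post.

```python
def depthwise_conv(x, kernels, stride=1):
    """Step 1: filter each channel with its own k x k kernel (no channel mixing).
    x: [C][H][W], kernels: [C][k][k]; no padding."""
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        height, width = len(channel), len(channel[0])
        rows = []
        for i in range(0, height - k + 1, stride):
            row = []
            for j in range(0, width - k + 1, stride):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        acc += channel[i + di][j + dj] * kernel[di][dj]
                row.append(acc)
            rows.append(row)
        out.append(rows)
    return out


def pointwise_conv(x, weights):
    """Step 2: a 1x1 convolution that combines channels at every pixel.
    x: [C][H][W], weights: [C_out][C]."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wc * x[c][i][j] for c, wc in enumerate(w_out)) for j in range(width)]
            for i in range(height)
        ]
        for w_out in weights
    ]


def depthwise_separable_conv(x, kernels, weights, stride=1):
    """Filter per channel first, then combine with the 1x1 convolution."""
    return pointwise_conv(depthwise_conv(x, kernels, stride), weights)


# Two 3x3 channels (all ones, all twos), all-ones kernels, two output channels.
x = [[[1.0] * 3 for _ in range(3)], [[2.0] * 3 for _ in range(3)]]
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
weights = [[1.0, 1.0], [1.0, -1.0]]
filtered = depthwise_conv(x, kernels)
combined = depthwise_separable_conv(x, kernels, weights)
```

Here `filtered` still has one map per input channel, and only `pointwise_conv` lets information cross between channels.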

Comparison with VGG-16

Standard = conv → batch normalization → ReLU
Depthwise separable = depthwise conv → batch normalization → ReLU → pointwise conv → batch normalization → ReLU
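
As a quick way to see the two orderings side by side, here is a tiny sketch that writes each block as a list of layer names. The names are just labels I picked for illustration, not identifiers from any framework.

```python
def standard_block(stride=1):
    """conv -> batch normalization -> ReLU"""
    return [f"conv3x3(stride={stride})", "batch_norm", "relu"]


def depthwise_separable_block(stride=1):
    """depthwise conv -> BN -> ReLU -> pointwise conv -> BN -> ReLU"""
    return [
        f"depthwise_conv3x3(stride={stride})", "batch_norm", "relu",
        "pointwise_conv1x1", "batch_norm", "relu",
    ]


print(" -> ".join(standard_block()))
print(" -> ".join(depthwise_separable_block()))
```

Note that batch normalization and ReLU follow both the depthwise and the pointwise convolution, so one separable block holds two normalization/activation pairs.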

Only the first layer uses a full (standard) convolution; every layer after it is a depthwise separable convolution.
Instead of pooling, the stride is adjusted to shrink the feature map.
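
To see how stride alone can do the downsampling, here is a small sketch using the usual convolution output-size formula. The 224x224 input, the 3x3 kernel with padding 1, and the five stride-2 layers are assumptions I chose for illustration.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224  # assumed input resolution
sizes = [size]
for _ in range(5):  # five stride-2 convolutions, no pooling layers
    size = conv_output_size(size, stride=2)
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]
```

Each stride-2 convolution halves the resolution, which is the job a 2x2 pooling layer would otherwise do.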


License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)