
Load Pretrained model in pytorch

Pretrained model

Load a torch pretrained model (its weights) saved as a .pth file and use it.
It is also possible to load and use only part of the weights.
A .pth file is structured as a dictionary.

Get format

A .pth file stores a dictionary, which can be read with PyTorch's torch.load.

import torch

model = torch.load('model.pth')
print(model.keys())

Calling model.keys() lists the keys, and from them you can work out the structure of the model. The .pth file used as the example here is the pretrained weight file of a mobilenet-ssd-v1 model with mAP 0.675.

Because this is an object detection model, it contains both regression and classification parts. The file stores the following, layer by layer:

base_net: weight, bias, running mean, running var
extras: weight, bias
classification_header: weight, bias
regression_header: weight, bias

(The format may differ from model to model.)
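To see this layout at a glance, the keys can be grouped by their top-level prefix. The following is a minimal sketch, not the actual checkpoint: `TinyDetector` is a hypothetical toy module whose attribute names only mimic the ones listed above, and it is far smaller than a real MobileNet-SSD.

```python
from collections import defaultdict

import torch
from torch import nn


class TinyDetector(nn.Module):
    """Hypothetical stand-in whose attribute names mimic the example checkpoint."""

    def __init__(self, num_classes=3):
        super().__init__()
        self.base_net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
        self.extras = nn.Sequential(nn.Conv2d(8, 8, 1))
        self.classification_headers = nn.ModuleList([nn.Conv2d(8, num_classes, 3)])
        self.regression_headers = nn.ModuleList([nn.Conv2d(8, 4, 3)])


state_dict = TinyDetector().state_dict()

# Group parameter names by the part of the key before the first dot.
groups = defaultdict(list)
for key, tensor in state_dict.items():
    prefix = key.split('.', 1)[0]
    groups[prefix].append((key, tuple(tensor.shape)))

for prefix, entries in groups.items():
    print(prefix, len(entries))
    for key, shape in entries:
        print('   ', key, shape)
```

The BatchNorm layer is what contributes the running mean and running var entries under base_net; the plain convolution layers only have weight and bias.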

To find out how many classes the model predicts, check the number of items in the last output layer:

output_len = len(list(model.values())[-1])

Other checkpoint formats may store the values one level down, under a top-level dictionary key. In that case

output_len = len(list(model['{your-key}'].values())[-1])

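they can be accessed as shown above. For the example model this gives 24 as the number of classes. Note that the expression returns the first-dimension size of whichever tensor happens to be stored last, so it matches the class count only when that tensor belongs to the final classification layer.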
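Because the order of the dictionary depends on how the model was defined, looking the layer up by name can be safer than taking the last value. A minimal sketch, assuming the classification layers have keys that start with `classification`; the state dict below is hypothetical and its shapes are illustrative only.

```python
import torch

# Hypothetical state dict; key names and shapes are illustrative only.
state_dict = {
    'base_net.0.weight': torch.zeros(8, 3, 3, 3),
    'classification_headers.0.weight': torch.zeros(12, 8, 3, 3),
    'classification_headers.0.bias': torch.zeros(12),
    'regression_headers.0.weight': torch.zeros(8, 8, 3, 3),
    'regression_headers.0.bias': torch.zeros(8),
}

# Positional lookup: reports whichever tensor happens to be stored last.
last_len = len(list(state_dict.values())[-1])

# Name-based lookup: first dimension of every classification bias.
cls_out = {
    key: tensor.shape[0]
    for key, tensor in state_dict.items()
    if key.startswith('classification') and key.endswith('.bias')
}

print(last_len)
print(cls_out)
```

In this toy dictionary the positional lookup returns the size of the regression bias, not the classification one. In SSD-style heads the output size is also often the number of anchors per location times the number of classes, so it is worth checking the value against the model definition.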

Extract backbone

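In my case I planned to train on custom data that differs from what the pretrained model used, so the number of classes was different. Still, the base net part has little to do with classification and holds what was learned for bounding boxes, so I figured that carrying it over would cut training time. I decided to slice out only part of the pretrained weights and apply those.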

Extract weights of specific module from pretrained model file

I referred to the PyTorch forum thread linked above.

Entries are added to a new dictionary based on the name of their key.

new_weights = {}
for key, value in model.items():
    if key.startswith('base'):  # the key name starts with "base"
        new_weights[key] = value  # add it to the new dictionary

Save the new dictionary as a .pth file.

torch.save(new_weights, '{save_path}')
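The extract-and-save step can be checked end to end by reloading the file. A minimal sketch, assuming a hypothetical checkpoint with tiny tensors; the file name `backbone.pth` and the temporary directory are only for illustration.

```python
import os
import tempfile

import torch

# Hypothetical checkpoint contents; real tensors would be much larger.
full = {
    'base_net.0.weight': torch.ones(4, 3, 3, 3),
    'base_net.0.bias': torch.zeros(4),
    'extras.0.weight': torch.ones(4, 4, 1, 1),
    'classification_headers.0.bias': torch.zeros(6),
}

# Keep only the entries whose key starts with the backbone prefix.
backbone = {k: v for k, v in full.items() if k.startswith('base_net.')}

with tempfile.TemporaryDirectory() as tmp:
    save_path = os.path.join(tmp, 'backbone.pth')  # hypothetical file name
    torch.save(backbone, save_path)
    reloaded = torch.load(save_path, map_location='cpu')

print(sorted(reloaded.keys()))
```

Matching on `'base_net.'` with the trailing dot, rather than just `'base'`, avoids accidentally picking up other modules whose names happen to begin with the same letters.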

Load model and apply

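The weights are loaded into a model with load_state_dict. This is a method of the model object (an nn.Module instance), not a function in the torch namespace.

load_state_dict takes a dictionary object as its argument, not a path, so the dictionary has to be read with torch.load first and then passed in.

model = CustomModel()
model.load_state_dict(torch.load('{model_path}'))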
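The two-step pattern (torch.load to get the dictionary, then load_state_dict on a model instance) can be sketched as follows. This is an illustration under simple assumptions: a small `nn.Linear` stands in for the custom model, and an in-memory buffer stands in for the .pth file.

```python
import io

import torch
from torch import nn

# A small layer stands in for the custom model (hypothetical example).
source = nn.Linear(4, 2)
target = nn.Linear(4, 2)

# Serialize the state dict to an in-memory buffer instead of a .pth file.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# torch.load returns the dictionary; load_state_dict copies it into the module.
state_dict = torch.load(buffer, map_location='cpu')
result = target.load_state_dict(state_dict)

print(result)  # reports missing and unexpected keys (both empty here)
print(torch.equal(source.weight, target.weight))
```

load_state_dict updates the module in place and returns a report of missing and unexpected keys rather than the model itself, so its return value should not be assigned to the model variable.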

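But if only base_net has been saved as the backbone, where do the weights for the rest of the model (the extras, classification, and regression parts) come from? Are initial values filled in automatically? No. If the layer names do not match when you run this as-is, the keys have to be aligned, and if the sizes of the model's tensors or anything else differ, those have to be made to match as well.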

# Raised when the model's layers do not match
RuntimeError: Error(s) in loading state_dict for ....
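The error can be reproduced on a toy model in two ways: the checkpoint lacks keys the model expects, or the keys match but the tensor shapes differ. A minimal sketch; `TinyModel` and the helper `try_load` are hypothetical names introduced only for this illustration.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical model with a backbone and a classification head."""

    def __init__(self, num_classes):
        super().__init__()
        self.base_net = nn.Linear(4, 4)
        self.head = nn.Linear(4, num_classes)


pretrained = TinyModel(num_classes=24).state_dict()


def try_load(model, state_dict):
    """Return the error text from a default (strict) load, or '' on success."""
    try:
        model.load_state_dict(state_dict)
        return ''
    except RuntimeError as exc:
        return str(exc)


# Case 1: the checkpoint lacks keys that the model expects.
backbone_only = {k: v for k, v in pretrained.items() if k.startswith('base_net.')}
missing_error = try_load(TinyModel(num_classes=24), backbone_only)

# Case 2: the keys match but the head has a different number of outputs.
size_error = try_load(TinyModel(num_classes=5), pretrained)

print(missing_error)
print(size_error)
```

The message text names the offending keys, which makes it a convenient starting point for deciding which entries to drop or rename.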

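When the model has extra or missing layers, or when keys differ, an option can be added that ignores whatever does not match. However, every key whose name differs is simply ignored, so first check whether the keys of the part you need are the same as those in your custom model, and rename them if they are not.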

model = CustomModel()
model.load_state_dict(torch.load('{model_path}'), strict=False)

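With this, the fact that the weights for the extras, classification, and regression layers of the example model cannot be loaded is ignored and loading proceeds. Those layers keep the values they were initialized with.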
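The warning about differing key names can be made concrete. A minimal sketch, assuming a checkpoint that stores the backbone under `base_net.` while the custom model calls it `backbone.`; both names and the `CustomModel` definition below are hypothetical.

```python
import torch
from torch import nn


class CustomModel(nn.Module):
    """Hypothetical model whose backbone attribute is named 'backbone'."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.backbone = nn.Linear(4, 4)
        self.head = nn.Linear(4, num_classes)


# Pretend checkpoint that stores the same backbone under 'base_net'.
checkpoint = {
    'base_net.weight': torch.full((4, 4), 0.5),
    'base_net.bias': torch.zeros(4),
}

model = CustomModel()

# Without renaming, strict=False raises nothing but also loads nothing.
before = model.load_state_dict(checkpoint, strict=False)

# Rename the prefix so the keys line up with the custom model.
old_prefix, new_prefix = 'base_net.', 'backbone.'
renamed = {}
for key, value in checkpoint.items():
    if key.startswith(old_prefix):
        key = new_prefix + key[len(old_prefix):]
    renamed[key] = value

after = model.load_state_dict(renamed, strict=False)

print(before.unexpected_keys)  # checkpoint keys the model did not recognise
print(after.missing_keys)      # model keys the checkpoint did not provide
```

Checking the returned missing and unexpected keys after every strict=False load is a cheap way to confirm that the intended weights were actually applied. Also note that strict=False only relaxes missing and unexpected keys; a key that exists on both sides with a different tensor shape still raises the RuntimeError.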

