[Model Compression] Model Quantization (Model Optimization) with TensorFlow

2024. 5. 21. 13:41

Making a deep learning model lightweight is a necessary step when applying a trained model to a real problem, because it reduces execution time and the resources consumed at prediction time.
As far as I know, there are three ways to make a model lightweight:

  1. Model quantization (reducing the number of bits)
  2. Model pruning (discarding the parts that matter least)
  3. Simply designing the model well

Of these, I decided to start with quantization, which is the easiest to apply to a model that has already been trained. This post records an example of quantizing a TensorFlow-based model.
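To make "reducing the number of bits" concrete, here is a minimal pure-Python sketch of one common scheme: affine 8-bit quantization with a scale and a zero point. It only illustrates the idea and is not claimed to be the exact scheme TensorFlow Lite applies internally. The function names and the sample weights are made up for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes using a scale and a zero point."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0      # 256 levels; fall back to 1.0 if the range is empty
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [-0.5, 0.0, 0.25, 1.5]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each value is stored in 8 bits instead of 32, and the round-trip error stays within half a quantization step (`scale / 2`). This is why accuracy usually drops only slightly.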

Source: https://medium.com/@jan_marcel_kezmann/master-the-art-of-quantization-a-practical-guide-e74d7aad24f9

  • A model built and trained with TensorFlow, with its weights saved to a .h5 file

1. Quantizing the model

a. Loading the model

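import tensorflow as tf

# Rebuild the architecture first, then load the saved .h5 weights into it.
model = your_model(parameter)  # placeholder for your own model-building function
model.load_weights('YOUR_MODEL_PATH')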

b. Quantization

converter = tf.lite.TFLiteConverter.from_keras_model(model)  # for a model not built with Keras, use another constructor (e.g. from_saved_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

# convert() returns bytes, so write the file in binary mode.
with open("SAVE_PATH.tflite", "wb") as f:
    f.write(tflite_quant_model)

If the model uses layers such as LSTM (operators that are not supported by default), add the following before calling convert():

converter.target_spec.supported_ops = [
  tf.lite.OpsSet.TFLITE_BUILTINS, # enable TensorFlow Lite ops.
  tf.lite.OpsSet.SELECT_TF_OPS # enable TensorFlow ops.
]

2. Loading the quantized model

a. Defining the interpreter

import tensorflow as tf
interpreter = tf.lite.Interpreter('YOUR_MODEL_PATH')
interpreter.allocate_tensors()
input_detail = interpreter.get_input_details()
output_detail = interpreter.get_output_details()

b. Test

# data: a NumPy array whose shape and dtype match input_detail[0]['shape'] and input_detail[0]['dtype']
interpreter.set_tensor(input_detail[0]['index'], data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_detail[0]['index'])
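
To evaluate a whole test set rather than a single input, the three calls above can be wrapped in a loop. The helper below is a sketch: `predict_all` is a name invented for this example, and it assumes a model with exactly one input and one output. With a real `tf.lite.Interpreter`, each sample must already be a NumPy array with the shape and dtype the model expects.

```python
def predict_all(interpreter, samples):
    """Run each sample through a TFLite-style interpreter and collect the outputs."""
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    outputs = []
    for sample in samples:
        interpreter.set_tensor(input_index, sample)          # copy the input in
        interpreter.invoke()                                 # run inference
        outputs.append(interpreter.get_tensor(output_index)) # read the result out
    return outputs
```

The function relies only on the interpreter methods already used in this section, so it can be called after `interpreter.allocate_tensors()`.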
  • File size before quantization

  • File size after quantization
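
One way to get these two numbers is to read the file sizes directly. The sketch below uses only the standard library. `size_report` and the paths in the usage comment are placeholders for this example, not names from the original post.

```python
import os


def size_report(original_path, quantized_path):
    """Return both file sizes in bytes and the quantized/original size ratio."""
    before = os.path.getsize(original_path)
    after = os.path.getsize(quantized_path)
    ratio = after / before if before else float("nan")
    return {"before_bytes": before, "after_bytes": after, "ratio": ratio}


# Usage (placeholder paths):
# report = size_report("YOUR_MODEL_PATH", "SAVE_PATH.tflite")
# print(report)
```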

Performance comparison before and after quantization

  • Before quantization
    • F1 Score = 0.9703125 / Accuracy = 97.031 %

  • After quantization
    • F1 Score = 0.9671875 / Accuracy = 96.719 %
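
The post does not say how these scores were computed. As one possible approach, accuracy and a macro-averaged F1 score can be computed from label lists in plain Python, once for the original model's predictions and once for the quantized model's. The labels below are toy data made up for this example, and the averaging mode is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
y_float = [0, 0, 1, 1, 1, 0]   # predictions of the original model
y_quant = [0, 1, 1, 1, 1, 0]   # predictions of the quantized model

for name, y_pred in [("before", y_float), ("after", y_quant)]:
    print(name, "F1 =", macro_f1(y_true, y_pred), "/ Accuracy =", accuracy(y_true, y_pred))
```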

 

Reference

Master the Art of Quantization: A Practical Guide (medium.com)
https://medium.com/@jan_marcel_kezmann/master-the-art-of-quantization-a-practical-guide-e74d7aad24f9
