[On-Device AI] Running llama3.2 with Ollama on a Raspberry Pi

2025. 2. 11. 13:14

Image source: SK hynix

Since the Galaxy S24 series was released, I have seen On-Device AI mentioned several times in Korea, and since I had some spare time I ran a simple experiment with the lightweight llama language model on a Raspberry Pi. I used Ollama, but there seem to be several other tools as well, such as Nexa SDK, Samsung One, and Optimum.

  • The reference overclocks the Raspberry Pi 4, but I ran the tests without overclocking.
  • If you do want to overclock, attach a heatsink and a fan to the Raspberry Pi 4.

Expanding swap memory

  • Increase swap memory so that applications do not crash or conflict when RAM runs short.
    sudo fallocate -l 10G /swapfile    # allocate a 10G swap file
    sudo chmod 600 /swapfile           # restrict access to root
    sudo mkswap /swapfile              # format it as swap
    sudo swapon /swapfile              # enable it
  • To check or undo the swap later:
    sudo swapon --show                 # show active swap areas and their sizes
    sudo swapoff -a                    # disable swap
    sudo rm /swapfile                  # delete the swap file
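
To see at a glance how much RAM plus swap is available before loading a model, the totals in /proc/meminfo can be added up. This is only a minimal sketch: the function name total_mem_mib and its optional file argument are my own illustration and do not come from the reference.

```shell
# Hypothetical helper: print MemTotal + SwapTotal in MiB.
# Reads /proc/meminfo by default; another file can be passed instead,
# which is handy for trying the function out on sample data.
total_mem_mib() {
  awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
       END { printf "%d\n", kb / 1024 }' "${1:-/proc/meminfo}"
}
```

Running total_mem_mib before and after swapon should show the total growing by roughly the size of the swap file.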

Installing Ollama

  • Ollama uses port 11434, so open the port in advance to allow inbound connections. This only matters if other machines need to reach the Pi; by default Ollama listens on 127.0.0.1 only, and the OLLAMA_HOST environment variable changes the bind address.
    # using ufw
    sudo ufw allow 11434/tcp
    # alternative: iptables
    sudo iptables -I INPUT 1 -p tcp --dport 11434 -j ACCEPT
    sudo iptables -nL    # verify
  • Install
    curl -fsSL https://ollama.com/install.sh | sh
  • Ollama has no GPU it can use on the Raspberry Pi, so it runs in CPU-only mode.
  • It can be started with the ollama serve command, but by default it runs as a service. If serve reports that the port is already bound, check the state with systemctl status ollama. It may already be running.
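
A quick way to confirm that the server is actually answering is to request its root URL, which replies with the text "Ollama is running". The sketch below wraps that in a small function; the name ollama_is_up and the 3-second timeout are my own choices for illustration.

```shell
# Hypothetical helper: succeed (exit status 0) if an Ollama server
# answers at the given base URL, defaulting to the local one.
ollama_is_up() {
  local url="${1:-http://localhost:11434}"
  # -f: fail on HTTP errors, -sS: quiet but still report errors,
  # --max-time: give up after 3 seconds instead of hanging
  curl -fsS --max-time 3 "$url/" 2>/dev/null | grep -q "Ollama is running"
}
```

For example, `ollama_is_up && echo "server is up"` on the Pi itself, or `ollama_is_up http://<pi-address>:11434` from another machine once the port is reachable.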

๋ชจ๋ธ ๋‹ค์šด๋กœ๋“œ

  • llama3.2:1b : 10์–ต ๊ฐœ ํŒŒ๋ผ๋ฏธํ„ฐ, 1.3G
    ollama pull llama3.2:1b
  • llama3.2:3b : 30์–ต ๊ฐœ ํŒŒ๋ผ๋ฏธํ„ฐ, 2G
    ollama pull llama3.2:3b
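
Because the downloads are slow on a Pi, it can be worth checking whether a model is already present before pulling it again. A minimal sketch, assuming the usual `ollama list` layout with a header row and the model name in the first column; the function names model_is_pulled and pull_if_missing are hypothetical.

```shell
# Hypothetical helper: succeed if the given model tag appears in the
# first column of `ollama list` (the header row is skipped).
model_is_pulled() {
  ollama list | awk -v m="$1" 'NR > 1 && $1 == m { found = 1 }
                               END { exit !found }'
}

# Pull the model only when it is not available locally yet.
pull_if_missing() {
  model_is_pulled "$1" || ollama pull "$1"
}
```

Usage would look like `pull_if_missing llama3.2:1b`.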

    ๋ชจ๋ธ ์‚ฌ์šฉ

    ollama run llama3.2:3b
    >> ...
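
Besides the interactive prompt, the running server can also be queried over its HTTP API on port 11434. The sketch below posts a single prompt to the /api/generate endpoint with streaming turned off. The function names build_generate_payload and ask_ollama and the OLLAMA_URL variable are my own, and the escaping is deliberately simple, so it only handles single-line prompts.

```shell
# Hypothetical helper: build the JSON body for /api/generate.
# Only backslashes and double quotes are escaped, so prompts must be
# a single line without control characters.
build_generate_payload() {
  escaped=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$escaped"
}

# Send one prompt and print the raw JSON reply; the generated text is
# in its "response" field. OLLAMA_URL is an assumed override variable.
ask_ollama() {
  curl -s "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
    -d "$(build_generate_payload "$1" "$2")"
}
```

For example: `ask_ollama llama3.2:3b "Why is the sky blue?"`.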

Review

  • Since it runs on nothing but the Raspberry Pi's modest CPU it is very slow, but it was still far better than I expected. The 3b model clearly gave answers that fit the request more accurately than the 1b model.
    • e.g. "Summarize one random sci-fi movie in 3 to 4 sequences."
      • llama3.2:1b - summarized 3 to 4 sci-fi movies in one sequence each
      • llama3.2:3b - summarized one sci-fi movie in 4 sequences
  • Korean should be considered unsupported.
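
A comparison like the one above can be repeated by passing the same prompt to each model non-interactively. This is just a sketch of how I would script it, not what I ran for this post; compare_models is a hypothetical name.

```shell
# Hypothetical helper: run one prompt against several models in turn
# and print each answer under a header with the model name.
compare_models() {
  local prompt="$1"; shift
  local model
  for model in "$@"; do
    printf '=== %s ===\n' "$model"
    ollama run "$model" "$prompt"
  done
}
```

For example: `compare_models "Summarize one random sci-fi movie in 3 to 4 sequences." llama3.2:1b llama3.2:3b`. Adding `--verbose` to `ollama run` should also print timing statistics, which would help put a number on how slow the Pi is.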

๊ทธ๋ž˜๋ผ

  • The results were better than I expected. Going by the benchmarks I had not hoped for much, but if it were just a little faster I think it could be genuinely useful. If I get the chance, I would like to try a project that uses other on-device AI models.

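References
How to install and Run Llama 3.2 1B and 3B Large Language Models (LLMs) on Raspberry Pi 4 and Linux Ubuntu, aleksandarhaber.com
https://aleksandarhaber.com/how-to-install-and-run-llama-3-2-1b-and-3b-large-language-models-llms-on-raspberry-pi-4-and-linux-ubuntu/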

 

์ €์ž‘์žํ‘œ์‹œ ๋น„์˜๋ฆฌ ๋ณ€๊ฒฝ๊ธˆ์ง€ (์ƒˆ์ฐฝ์—ด๋ฆผ)

'๐Ÿฌ ML & Data > โ” Q & etc.' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

[Model Compression] ๋ชจ๋ธ ์–‘์žํ™”(Model Optimization) with Tensorflow  (0) 2024.05.21
[Math] Mathematics for Machine Learning 2. Linear Algebra  (0) 2024.01.10
[Data] ์ „๋™ ๋ชจํ„ฐ ์ด์ƒํƒ์ง€ ๋ฐ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•œ ์ฃผํŒŒ์ˆ˜ ๋ถ„์„  (0) 2023.09.26
[PyTorch] pretrained model load/save, pretrained model ํŽธ์ง‘  (0) 2022.09.19
'๐Ÿฌ ML & Data/โ” Q & etc.' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€
  • [Model Compression] ๋ชจ๋ธ ์–‘์žํ™”(Model Optimization) with Tensorflow
  • [Math] Mathematics for Machine Learning 2. Linear Algebra
  • [Data] ์ „๋™ ๋ชจํ„ฐ ์ด์ƒํƒ์ง€ ๋ฐ ๋ถ„๋ฅ˜๋ฅผ ์œ„ํ•œ ์ฃผํŒŒ์ˆ˜ ๋ถ„์„
  • [PyTorch] pretrained model load/save, pretrained model ํŽธ์ง‘
darly213
darly213
ํ˜ธ๋ฝํ˜ธ๋ฝํ•˜์ง€ ์•Š์€ ๊ฐœ๋ฐœ์ž๊ฐ€ ๋˜์–ด๋ณด์ž
  • darly213
    ERROR DENY
    darly213
  • ์ „์ฒด
    ์˜ค๋Š˜
    ์–ด์ œ
    • ๋ถ„๋ฅ˜ ์ „์ฒด๋ณด๊ธฐ (97)
      • ๐Ÿฌ ML & Data (50)
        • ๐ŸŒŠ Computer Vision (2)
        • ๐Ÿ“ฎ Reinforcement Learning (12)
        • ๐Ÿ“˜ ๋…ผ๋ฌธ & ๋ชจ๋ธ ๋ฆฌ๋ทฐ (8)
        • ๐Ÿฆ„ ๋ผ์ดํŠธ ๋”ฅ๋Ÿฌ๋‹ (3)
        • โ” Q & etc. (5)
        • ๐ŸŽซ ๋ผ์ดํŠธ ๋จธ์‹ ๋Ÿฌ๋‹ (20)
      • ๐Ÿฅ Web (21)
        • โšก Back-end | FastAPI (2)
        • โ›… Back-end | Spring (5)
        • โ” Back-end | etc. (9)
        • ๐ŸŽจ Front-end (4)
      • ๐ŸŽผ Project (8)
        • ๐ŸงŠ Monitoring System (8)
      • ๐Ÿˆ Algorithm (0)
      • ๐Ÿ”ฎ CS (2)
      • ๐Ÿณ Docker & Kubernetes (3)
      • ๐ŸŒˆ DEEEEEBUG (2)
      • ๐ŸŒ  etc. (8)
      • ๐Ÿ˜ผ ์‚ฌ๋‹ด (1)
  • ๋ธ”๋กœ๊ทธ ๋ฉ”๋‰ด

    • ํ™ˆ
    • ๋ฐฉ๋ช…๋ก
    • GitHub
    • Notion
    • LinkedIn
  • ๋งํฌ

    • Github
    • Notion
  • ๊ณต์ง€์‚ฌํ•ญ

    • Contact ME!
  • 250x250
  • hELLOยท Designed By์ •์ƒ์šฐ.v4.10.3
darly213
[On-Device AI] ๋ผ์ฆˆ๋ฒ ๋ฆฌํŒŒ์ด์—์„œ Ollama๋กœ llama3.2 ๋™์ž‘์‹œํ‚ค๊ธฐ
์ƒ๋‹จ์œผ๋กœ

ํ‹ฐ์Šคํ† ๋ฆฌํˆด๋ฐ”