
LoRA Fine-Tuning on Apple M-Series Chips with MLX

October 28, 2025 · 1 min · 180 words · Dorianyang


Go Apple, time to take on CUDA

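Data preparation

Clone Apple's example project from GitHub: git clone https://github.com/ml-explore/mlx-examples.git
Put your training, validation, and test sets in the project's lora/data/ directory
Download a model. Here we download a small model, MiniMind2, from ModelScope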

pip install modelscope
modelscope download --model gongjy/MiniMind2

After the download finishes, note the model's local path. It is the value passed to --model in the fine-tuning step.
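For the dataset step above, here is a minimal sketch of writing the three files. It assumes the JSONL layout documented for mlx-lm (train.jsonl, valid.jsonl, and test.jsonl, one JSON object per line, here in the prompt/completion form). The toy arithmetic examples, the helper name write_splits, and the 80/10/10 split are my own placeholders, not something the example project prescribes.

```python
import json
import random
from pathlib import Path


def write_splits(records, out_dir, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle records and write train/valid/test JSONL files into out_dir."""
    records = list(records)
    random.Random(seed).shuffle(records)  # fixed seed keeps the split reproducible
    n_valid = max(1, int(len(records) * valid_frac))
    n_test = max(1, int(len(records) * test_frac))
    splits = {
        "valid": records[:n_valid],
        "test": records[n_valid:n_valid + n_test],
        "train": records[n_valid + n_test:],
    }
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for row in rows:
                # ensure_ascii=False keeps non-ASCII text readable in the file
                f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return {name: len(rows) for name, rows in splits.items()}


# Hypothetical toy data; replace with your own prompt/completion pairs.
examples = [
    {"prompt": f"What is {i} + {i}?", "completion": str(2 * i)}
    for i in range(20)
]
```

Calling write_splits(examples, "lora/data") from the repository root would then produce the three files in the directory that --data points at later.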

Environment setup

Initialize an environment with uv

$ uv init --name mlx . -p3.13 
Initialized project `mlx` at `/Users/dorian/Documents/Programme/NUS311/model_test`

You can also use conda

$ conda create -n mlx python=3.13

Install the required packages

$ uv add mlx-lm transformers torch numpy
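Before moving on, it can help to confirm that the interpreter is running natively on Apple silicon and that the packages are importable. The sketch below uses only the standard library. The helper name check_environment is my own, and the import names (mlx, mlx_lm) are the module names I expect the mlx-lm install to provide.

```python
import importlib.util
import platform
import sys


def check_environment(packages=("mlx", "mlx_lm", "transformers", "torch", "numpy")):
    """Report whether this looks like Apple silicon and which packages are found."""
    report = {
        # A Python running under Rosetta reports x86_64 here, not arm64.
        "apple_silicon": sys.platform == "darwin" and platform.machine() == "arm64",
    }
    for name in packages:
        # find_spec locates the package without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, ok in check_environment().items():
    print(f"{key}: {ok}")
```

Any False in the output points at either a non-native interpreter or a package that did not make it into the environment.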

Fine-tuning

Change into the project directory: cd mlx-examples/lora/. If you are not using swanlab, drop the last argument.

$ mlx_lm.lora --model <path to your base model> --train --data ./data --report-to swanlab

# Output like the following means it is working
# Loading pretrained model
# Loading datasets
# Training
# Trainable parameters: 1.654% (1.720M/104.031M)
# Starting training..., iters: 1000
# Calculating loss...: 100%|██| 25/25 [00:07<00:00,  3.42it/s]
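If you prefer to launch training from a script, the command above can be assembled as an argument list, which makes the optional swanlab flag explicit. This is only a sketch: the helper name lora_command is hypothetical, the model path is a placeholder, and it uses no flags beyond the ones shown in the command above.

```python
import shlex


def lora_command(model_path, data_dir="./data", report_to=None):
    """Build the argument list for the mlx_lm.lora call used above."""
    args = ["mlx_lm.lora", "--model", model_path, "--train", "--data", data_dir]
    if report_to:  # e.g. "swanlab"; leave as None to skip experiment tracking
        args += ["--report-to", report_to]
    return args


# "path/to/base-model" is a placeholder, not a real path.
print(shlex.join(lora_command("path/to/base-model", report_to="swanlab")))
print(shlex.join(lora_command("path/to/base-model")))
```

The returned list can be handed to subprocess.run(args, check=True) so that a failed run raises instead of passing silently.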

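When fine-tuning finishes, the output contains the checkpoints and the final model.
Next, fuse the adapter weights into the base model

$ mlx_lm.fuse --model <path to your base model> --adapter-path adapters --save-path <your model name>

The fused model files are written to the directory given by --save-path.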
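A quick way to sanity-check the fused folder before handing it to Ollama is to look for the pieces a Hugging Face style model directory normally has. The file names checked here (config.json, *.safetensors, tokenizer*) are my assumption about what the fuse step writes, and missing_model_files is a hypothetical helper, so adjust the patterns to whatever you actually see in your output folder.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return a list of things that look missing from a fused model folder."""
    root = Path(model_dir)
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json")
    # Weights may be a single file or several shards; any match is enough.
    if not any(root.glob("*.safetensors")):
        problems.append("*.safetensors weights")
    # Assumed naming: tokenizer.json, tokenizer_config.json, and similar.
    if not any(root.glob("tokenizer*")):
        problems.append("tokenizer files")
    return problems
```

An empty list from missing_model_files("<your model name>") suggests the folder is complete enough to reference from a Modelfile.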

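Model deployment

Download Ollama (please find and install it yourself)
Create a file named Modelfile (note: no file extension) with the following contents (you can also refer to this website)

FROM <absolute path to the model folder>

# Inference parameters (suited to precise, structured reasoning)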
PARAMETER temperature 0.2
PARAMETER top_p 0.8
PARAMETER top_k 50
PARAMETER repeat_penalty 1.05
PARAMETER num_ctx 4096
PARAMETER num_predict 512

# Multi-turn chat template aligned with the Qwen format
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

# Set the system prompt
SYSTEM "You are an expert in causal inference and graph theory."
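To see what the TEMPLATE above actually sends to the model, the substitution can be mimicked in a few lines. This is plain Python string formatting, not Ollama's own template engine, and the function name render_chatml and the sample question are placeholders of mine.

```python
def render_chatml(system, prompt):
    """Mimic the TEMPLATE above: fill .System and .Prompt into the Qwen-style layout."""
    return (
        "<|im_start|>system\n"
        f"{system}<|im_end|>\n"
        "<|im_start|>user\n"
        f"{prompt}<|im_end|>\n"
        # The template stops here so the model continues as the assistant.
        "<|im_start|>assistant\n"
    )


text = render_chatml(
    "You are an expert in causal inference and graph theory.",
    "What is a collider?",  # hypothetical user question
)
print(text)
```

The rendered text ends right after the assistant marker, which is why the model's reply starts there and, for a model trained on this format, normally ends with its own <|im_end|>.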

Create the model with ollama

$ ollama create <model name> -f <path to the Modelfile>

Run it

$ ollama run <model name>
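Once the model runs in the terminal, it can also be queried from code through Ollama's local HTTP API. The sketch below assumes the default address http://localhost:11434 and the /api/generate endpoint with streaming turned off. The helper names build_request and ask are hypothetical, and you would substitute whatever model name you used with ollama create.

```python
import json
import urllib.request

# Default local endpoint; change it if your Ollama server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    """Prepare a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt, timeout=120):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, print(ask("<model name>", "What is a collider?")) should return the same kind of answer as the interactive session, since the SYSTEM prompt and parameters from the Modelfile apply to API calls as well.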

This blog post is also worth reading: https://juejin.cn/post/7426343844595335168