Running ChatGLM2 on an M2 MacBook

July 11, 2023

Introduction

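ChatGLM is an "open bilingual dialogue language model" (the project's own description), in other words one of the "large language models" we have all come to know, or, put simply, (one of) the open-source stand-ins for ChatGPT. ChatGLM is sometimes called "Tsinghua University's large language model", although its authors more likely just collaborate with Tsinghua; either way, it has some connection to the university.

ChatGLM2 is the second version of ChatGLM: it is more capable, supports a longer context, and runs inference more efficiently. ChatGLM2 has been open-sourced, and the model available for download has 6B parameters, hence the name ChatGLM2-6B.

This post records how I got ChatGLM2 running on an M2 MacBook.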

Downloading the model

ChatGLM2-6B can be downloaded from https://huggingface.co/THUDM/chatglm2-6b . On the Files and versions page, find Clone repository and follow the instructions to clone it:

git lfs install
git clone https://huggingface.co/THUDM/chatglm2-6b

The whole model is about 13 GB, so the download takes quite a while.
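As a rough sanity check on that figure, here is a back-of-the-envelope estimate. It assumes the nominal 6 billion parameters from the model name and 2 bytes per weight (an fp16 checkpoint); the exact parameter count and the other files in the repository are not accounted for, so treat it as a sketch rather than a measurement.

```python
# Rough download-size estimate; both inputs are assumptions, not measured values.
NOMINAL_PARAMS = 6_000_000_000  # "6B" taken from the model name (nominal)
BYTES_PER_WEIGHT = 2            # assumes the weights are stored as fp16


def checkpoint_size_gb(params: int, bytes_per_weight: int) -> float:
    """Return the raw weight size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return params * bytes_per_weight / 10**9


if __name__ == "__main__":
    size = checkpoint_size_gb(NOMINAL_PARAMS, BYTES_PER_WEIGHT)
    print(f"about {size:.1f} GB of weights")
```

Under these assumptions the weights alone come to roughly 12 GB, which is in the same ballpark as the size of the clone.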

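Note: if Hugging Face is not directly reachable from your network, you will have to sort out access (a proxy, for example) yourself.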

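Downloading ChatGLM.cpp

The stock model has rather poor inference performance on macOS, so someone reimplemented it on top of ggml as ChatGLM.cpp, written entirely in C++, which performs better.

ChatGLM.cpp lives at https://github.com/li-plus/chatglm.cpp ; clone it as its instructions describe: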

git clone --recursive https://github.com/li-plus/chatglm.cpp.git && cd chatglm.cpp

Installing dependencies

Next, install a few Python dependencies.

If you don't have Python 3 installed, you can get it through Homebrew: brew install python3

pip install protobuf transformers==4.30.2 cpm_kernels torch==2.0 gradio mdtex2html sentencepiece accelerate
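To confirm that the pinned versions actually ended up in the environment, a small check like the one below can help. It is only a sketch: the `installed_version` helper and the `PINNED` list are names made up for this example, and it relies on the standard-library `importlib.metadata` module (Python 3.8 or newer).

```python
from importlib import metadata
from typing import Optional

# The two packages that the pip command above pins to a specific version.
PINNED = ["transformers", "torch"]


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in PINNED:
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports a different version than the one pinned above, the conversion step is the first place I would expect trouble.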

Converting the model

Next, the original model has to be converted into a format that ChatGLM.cpp can use.

python convert.py -i ../chatglm2-6b/ -t q4_0 -o chatglm2-ggml.bin

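Here ../chatglm2-6b is the original model directory and chatglm2-ggml.bin is the converted model file. The -t parameter selects the quantization type; the official description is as follows: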

q4_0 4-bit integer quantization with fp16 scales.
q5_0 5-bit integer quantization with fp16 scales.
q5_1 5-bit integer quantization with fp16 scales and minimum values.
q8_0 8-bit integer quantization with fp16 scales.
f16 half precision floating point weights without quantization.
f32 single precision floating point weights without quantization.
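To make "4-bit integer quantization with fp16 scales" a bit more concrete, here is a toy round trip over one block of weights. This is a simplified illustration, not ggml's actual kernel: the block size of 32, the symmetric rounding scheme, and all function names are assumptions made for the sketch.

```python
import struct
from typing import List, Tuple

BLOCK_SIZE = 32  # weights per block (assumed for this sketch)


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def quantize_block(weights: List[float]) -> Tuple[float, List[int]]:
    """Map one block of floats to (fp16 scale, signed 4-bit integers)."""
    abs_max = max(abs(w) for w in weights)
    scale = to_fp16(abs_max / 7)
    if scale == 0.0:
        scale = 1.0  # all-zero (or vanishingly small) block: any scale works
    # Clamp to the signed 4-bit range [-8, 7].
    quants = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, quants


def dequantize_block(scale: float, quants: List[int]) -> List[float]:
    """Recover approximate floats from the scale and the 4-bit integers."""
    return [q * scale for q in quants]


if __name__ == "__main__":
    block = [((i % 7) - 3) * 0.01 for i in range(BLOCK_SIZE)]
    scale, quants = quantize_block(block)
    restored = dequantize_block(scale, quants)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"scale={scale:.6f}, max abs error={worst:.6f}")
```

The point of the sketch is the trade-off: each weight shrinks to 4 bits plus a share of one scale per block, at the cost of a small reconstruction error.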

In short, q4_0 is the least demanding on the machine, so that is what I went with here.
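A quick estimate shows why. The sketch below assumes ggml's block layout of 32 weights per block and the nominal 6 billion parameters; in practice some tensors may not be quantized, so the real file sizes will differ somewhat. The names `BLOCK_BYTES` and `estimated_size_gb` are made up for this example.

```python
# Bytes needed to store one block of 32 weights, per quantization type.
# Assumed layout: packed integer values plus an fp16 scale (2 bytes),
# and for q5_1 an additional fp16 minimum (2 bytes).
BLOCK_BYTES = {
    "q4_0": 2 + 16,      # 32 values x 4 bits = 16 bytes
    "q5_0": 2 + 20,      # 32 values x 5 bits = 20 bytes
    "q5_1": 2 + 2 + 20,  # scale + minimum + 20 bytes
    "q8_0": 2 + 32,      # 32 values x 8 bits = 32 bytes
    "f16": 64,           # 32 values x 2 bytes, no scale
    "f32": 128,          # 32 values x 4 bytes, no scale
}
WEIGHTS_PER_BLOCK = 32
NOMINAL_PARAMS = 6_000_000_000  # "6B" from the model name (nominal)


def estimated_size_gb(quant_type: str, params: int = NOMINAL_PARAMS) -> float:
    """Estimate the converted file size in decimal gigabytes."""
    return params * BLOCK_BYTES[quant_type] / WEIGHTS_PER_BLOCK / 10**9


if __name__ == "__main__":
    for name in BLOCK_BYTES:
        print(f"{name}: about {estimated_size_gb(name):.1f} GB")
```

Under these assumptions q4_0 needs roughly 3.4 GB, compared with about 12 GB for f16, which is what makes it comfortable on a laptop.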

Building ChatGLM.cpp

Building ChatGLM.cpp requires cmake; if you don't have it, install it with Homebrew: brew install cmake.

cmake -B build
cmake --build build -j

Running ChatGLM.cpp

./build/bin/main -m chatglm2-ggml.bin -p 你好
你好👋！我是人工智能助手 ChatGLM2-6B，很高兴见到你，欢迎问我任何问题。
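If you want to call the binary from a script, the same command can be wrapped in a few lines of Python. This is a minimal sketch that uses only the -m and -p flags shown above; the helper names `build_command` and `ask` are invented for the example, and it assumes the reply is written to standard output.

```python
import subprocess
from typing import List

BINARY = "./build/bin/main"   # path used in the command above
MODEL = "chatglm2-ggml.bin"   # file produced by the conversion step


def build_command(prompt: str, model: str = MODEL, binary: str = BINARY) -> List[str]:
    """Assemble the argument list for a single-prompt run."""
    return [binary, "-m", model, "-p", prompt]


def ask(prompt: str) -> str:
    """Run the binary once and return whatever it prints to stdout."""
    result = subprocess.run(
        build_command(prompt), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only show the command here; call ask(...) once the binary is built.
    print(" ".join(build_command("你好")))
```

Calling `ask("你好")` from the chatglm.cpp directory should then return the model's reply as a string. Passing the arguments as a list avoids any shell quoting problems with non-ASCII prompts.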

The -i flag starts interactive mode:

./build/bin/main -m chatglm2-ggml.bin -i
    ________          __  ________    __  ___
   / ____/ /_  ____ _/ /_/ ____/ /   /  |/  /_________  ____
  / /   / __ \/ __ `/ __/ / __/ /   / /|_/ // ___/ __ \/ __ \
 / /___/ / / / /_/ / /_/ /_/ / /___/ /  / // /__/ /_/ / /_/ /
 \____/_/ /_/\__,_/\__/\____/_____/_/  /_(_)___/ .___/ .___/
                                              /_/   /_/

Welcome to ChatGLM.cpp! Ask whatever you want. Type 'clear' to clear context. Type 'stop' to exit.

Prompt   > 你好
ChatGLM2 > 你好👋！我是人工智能助手 ChatGLM2-6B，很高兴见到你，欢迎问我任何问题。
Prompt   >

I also gave it a quick try on the classic "chickens and rabbits in the same cage" word problem.

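Judging from this, ChatGLM2 running under ChatGLM.cpp understands Chinese very well and its math is passable, but in later tests its grasp of code still left room for improvement and fell short of ChatGPT. Either way, a large language model that runs locally can already do a lot, and I suddenly find myself full of anticipation for where this field is heading.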
