【大模型研究】（1）：从零开始部署书生·浦语2-20B大模型，使用fastchat和webui部署测试，autodl申请2张显卡，占用显存40G可以运行

2024-03-11 1474阅读

温馨提示：这篇文章已超过376天没有更新，请注意相关的内容是否还可用！

1，演示视频

https://www.bilibili.com/video/BV1pT4y1h7Af/

2，书生·浦语2-对话-20B

https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-chat-20b/summary

InternLM2 开源了一个 200 亿参数的基础模型和一个针对实际场景定制的聊天模型。该模型具有以下特点：

200K 上下文窗口：在 200K 长的上下文中几乎能完美地找到针尖，在长上下文任务（如 LongBench 和 L-Eval）上具有领先性能。使用 LMDeploy 尝试 200K 上下文推理。

卓越的全面性能：在所有维度上的表现都显著优于上一代模型，尤其是在推理、数学、编程、聊天体验、指令遵循和创意写作方面，在类似规模的开源模型中性能领先。在某些评估中，InternLM2-Chat-20B 可能与 ChatGPT（GPT-3.5）相匹敌，甚至可能超越。

代码解释器与数据分析：具备代码解释器功能，InternLM2-Chat-20B 在 GSM8K 和 MATH 上的表现与 GPT-4 相当。InternLM2-Chat 也提供数据分析能力。

更强的工具使用：基于在遵循指令、选择工具和反思方面的更好工具利用能力，InternLM2 可以支持更多种类的代理和复杂任务的多步骤工具调用。查看示例。

下载后占空间71G ，需要单独申请磁盘。

3，使用autodl创建环境，安装最新的 fastchat

要是 48G 申请一个就可以，要是 24G 显存需申请2个显卡才可以。

apt update && apt install -y git-lfs net-tools
# 一定要保证有大磁盘空间：
cd /root/autodl-tmp
git clone https://www.modelscope.cn/Shanghai_AI_Laboratory/internlm2-chat-20b.git
# 1，安装 torch 模块，防止依赖多次下载
pip3 install torch==2.1.0
# 最后安装 软件
pip3 install "fschat[model_worker,webui]" einops

安装完成之后就可以使用fastchat启动了。

# 首先启动 controller ：
nohup python3 -m fastchat.serve.controller --host 0.0.0.0 --port 21001 > controller.log 2>&1 &
# 启动 openapi的 兼容服务 地址 8000
nohup python3 -m fastchat.serve.openai_api_server --controller-address http://127.0.0.1:21001 \
  --host 0.0.0.0 --port 8000 > api_server.log 2>&1 &
# 启动 web ui
nohup python -m fastchat.serve.gradio_web_server --controller-address http://127.0.0.1:21001 \
 --host 0.0.0.0 --port 6006 > web_server.log 2>&1 &
nohup python3 -m fastchat.serve.model_worker  --num-gpus 2 --model-names internlm2-chat-20b \
  --model-path ./internlm2-chat-20b --controller-address http://127.0.0.1:21001 \
  --worker-address http://127.0.0.1:8080 --host 0.0.0.0 --port 8080 > model_worker.log 2>&1 &

启动成功：

测试api 接口：

curl http://localhost:8000/v1/chat/completions   -H "Content-Type: application/json" -d '{
     "model": "internlm2-chat-20b",
     "messages": [{"role": "user", "content": "你是谁"}],
     "temperature": 0.7, "stream": true
}'
stream 可以正常返回。但是 非流返回错误：
2024-01-21 21:40:12 | ERROR | stderr |   File "/root/miniconda3/lib/python3.8/site-packages/fastapi/routing.py", line 294, in app
2024-01-21 21:40:12 | ERROR | stderr |     raw_response = await run_endpoint_function(
2024-01-21 21:40:12 | ERROR | stderr |   File "/root/miniconda3/lib/python3.8/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
2024-01-21 21:40:12 | ERROR | stderr |     return await dependant.call(**values)
2024-01-21 21:40:12 | ERROR | stderr |   File "/root/miniconda3/lib/python3.8/site-packages/fastchat/serve/base_model_worker.py", line 206, in api_generate
2024-01-21 21:40:12 | ERROR | stderr |     output = await asyncio.to_thread(worker.generate_gate, params)
2024-01-21 21:40:12 | ERROR | stderr | AttributeError: module 'asyncio' has no attribute 'to_thread'
估计需要安装 python3.10 高版本才可以。

界面展示成功：

 python3 -m fastchat.serve.test_throughput --controller-address http://127.0.0.1:21001 --model-name internlm2-chat-20b --n-thread 1

当版本不对时候，使用python3.10 报错：

"object":"error","message":"**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**\n\n(CUDA error: device-side assert triggered\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.\n)","code":50001}

5，总结

总体上效果还不错。速度也挺快的。需要 40G的显存才可以启动成功。

而且可以使用webui 启动成功。

VPS购买请点击我

免责声明：我们致力于保护作者版权，注重分享，被刊用文章因无法核实真实出处，未能及时与作者取得联系，或有版权异议的，请联系管理员，我们会立即处理! 部分文章是来自自研大数据AI进行生成,内容摘自(百度百科,百度知道,头条百科,中国民法典,刑法,牛津词典,新华词典,汉语词典,国家院校,科普平台)等数据,内容仅供学习参考,不准确地方联系删除处理! 图片声明：本站部分配图来自人工智能系统AI生成,觅知网授权图片,PxHere摄影无版权图库和百度，360，搜狗等多加搜索引擎自动关键词搜索配图，如有侵权的图片，请第一时间联系我们，邮箱：ciyunidc@ciyunshuju.com。本站只作为美观性配图使用,无任何非法侵犯第三方意图,一切解释权归图片著作权方,本站不承担任何责任。如有恶意碰瓷者,必当奉陪到底严惩不贷!

【大模型研究】（1）：从零开始部署书生·浦语2-20B大模型，使用fastchat和webui部署测试，autodl申请2张显卡，占用显存40G可以运行

1，演示视频

2，书生·浦语2-对话-20B

3，使用autodl创建环境，安装最新的 fastchat

5，总结

相关阅读

怎么把织梦的模板替换?

dedecms怎么调用特定的栏目文档?

怎么抓包一个网页?

wap怎么封装app?

目录[+]