CANN/AMCT OFMR Algorithm Example

AMCT Large Model Quantization

AMCT is the model compression toolkit repository provided by CANN, built for Ascend AI processors. Project address: https://gitcode.com/cann/amct

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.

1 Quantization Prerequisites

1.1 Install Dependencies

The dependency packages for this sample are listed in requirements.txt. Note that the torch_npu package version must match the installed Python and torch versions, and that the CANN package must be installed.

1.2 Model and Dataset Preparation

This sample uses the Llama2-7b, qwen2-7b, and qwen3-8b models, with pileval as the calibration data and wikitext2 as the evaluation dataset. Download the models yourself and pass the model path to the script. The datasets are loaded online.

1.3 Simple Quantization Configuration

The quantization configuration used in this sample is built into the tool and can be imported as follows:

    from amct_pytorch import HIFP8_OFMR_CFG

To change the detailed configuration, refer to the documentation and construct the required quantization configuration dict.

The OFMR algorithm supports weight-only quantization and full quantization. The supported quantization types and configuration fields are:

| Field | Type | Description | Value Range | Notes |
|---|---|---|---|---|
| batch_num | uint32 | Number of batches used for quantization | 1 | / |
| skip_layers | str | Layers to skip during quantization | / | Fuzzy matching is supported: when the configured string equals a layer name or is a substring of it, that layer is skipped and no quantization configuration is generated for it. The string must contain numbers or letters. |
| weights.type | str | Quantized weight type | float8_e4m3fn, hifloat8 | / |
| weights.symmetric | bool | Symmetric quantization | TRUE | / |
| weights.strategy | str | Quantization granularity | tensor, channel | / |
| inputs.type | str | Quantized activation type | float8_e4m3fn, hifloat8 | / |
| inputs.symmetric | bool | Symmetric quantization | TRUE | / |
| inputs.strategy | str | Quantization granularity | tensor | / |
| algorithm | dict | Quantization algorithm configuration used | {ofmr} | / |

2 Quantization Example

2.1 Calling Through the Interface

Step 1. Run one of the following commands in the current directory to execute the sample program. Change the model path to match your environment:

    python3 src/run_llama2_samples.py --model_path /data/Llama2_7b_hf/
    python3 src/run_qwen_samples.py --model_path /data/Qwen2-7b/
    python3 src/run_qwen_samples.py --model_path /data/Qwen3-8B/

If output like the following appears, quantization succeeded:

    Test time taken: 1.0 min 59.24865388870239 s
    Score: 5.477707

Score is the perplexity (PPL) of the quantized model. Reference values are listed in the following table:

| Model | Calibration Set | Dataset | Pre-quantization PPL | Post-quantization PPL |
|---|---|---|---|---|
| LLAMA2-7B | pileval | wikitext2 | 5.472 | 5.505 |
| QWEN2-7B | pileval | wikitext2 | 7.137 | 7.196 |
| QWEN3-8B | pileval | wikitext2 | 9.715 | 9.808 |

After inference succeeds, a quantization log file, ./amct_log/amct_pytorch.log, is generated in the current directory.
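The configuration fields and the skip_layers fuzzy-matching rule described in section 1.3 can be illustrated with a small standalone sketch. It does not import amct_pytorch. The nested dict layout, the `validate_cfg` and `should_skip` helpers, the `lm_head` entry, and the treatment of a missing `inputs` section as weight-only quantization are all assumptions made for illustration. The real HIFP8_OFMR_CFG may be organized differently, so consult the AMCT documentation before building a config by hand.

```python
import re

# Hypothetical nested layout mirroring the dotted field names in the table.
# The dict shipped with amct_pytorch may be structured differently.
example_cfg = {
    "batch_num": 1,
    "skip_layers": ["lm_head"],  # illustrative layer-name substring
    "weights": {"type": "hifloat8", "symmetric": True, "strategy": "channel"},
    "inputs": {"type": "hifloat8", "symmetric": True, "strategy": "tensor"},
    "algorithm": {"ofmr": {}},
}

# Value ranges copied from the configuration table.
ALLOWED = {
    "weights": {
        "type": ("float8_e4m3fn", "hifloat8"),
        "symmetric": (True,),
        "strategy": ("tensor", "channel"),
    },
    "inputs": {
        "type": ("float8_e4m3fn", "hifloat8"),
        "symmetric": (True,),
        "strategy": ("tensor",),
    },
}


def validate_cfg(cfg):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    batch_num = cfg.get("batch_num")
    if isinstance(batch_num, bool) or batch_num != 1:
        problems.append(f"batch_num: expected 1, got {batch_num!r}")
    for pattern in cfg.get("skip_layers", []):
        # The table requires each string to contain a number or a letter.
        if not isinstance(pattern, str) or not re.search(r"[A-Za-z0-9]", pattern):
            problems.append(f"skip_layers: invalid entry {pattern!r}")
    for section, fields in ALLOWED.items():
        sub = cfg.get(section)
        if sub is None:
            continue  # assumption: an omitted section means it is not quantized
        for field, allowed in fields.items():
            if sub.get(field) not in allowed:
                problems.append(f"{section}.{field}: {sub.get(field)!r} not in {allowed}")
    unknown = set(cfg.get("algorithm", {})) - {"ofmr"}
    if unknown:
        problems.append(f"algorithm: unsupported entries {sorted(unknown)}")
    return problems


def should_skip(layer_name, skip_layers):
    """Fuzzy match: skip when a pattern equals or is a substring of the layer name."""
    return any(pattern in layer_name for pattern in skip_layers)


print(validate_cfg(example_cfg))
print(should_skip("model.layers.0.mlp.down_proj", ["down_proj"]))
```

A check like this only catches values that fall outside the table; it says nothing about whether the tool accepts the dict layout itself.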
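To put the reference PPL values in context, the sketch below uses the standard definition of perplexity (the exponential of the mean negative log-likelihood per token) and computes the relative PPL increase for each model from the table figures. The helper names are hypothetical, and the sample scripts may compute the score with their own windowing and batching, so this is only a reading aid.

```python
import math


def perplexity(token_log_probs):
    """Standard perplexity: exp of the mean negative natural-log probability."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_increase(before, after):
    """Fractional change from the pre-quantization to the post-quantization value."""
    return (after - before) / before


# (pre-quantization PPL, post-quantization PPL) copied from the reference table.
PPL_TABLE = {
    "LLAMA2-7B": (5.472, 5.505),
    "QWEN2-7B": (7.137, 7.196),
    "QWEN3-8B": (9.715, 9.808),
}

for model, (pre, post) in PPL_TABLE.items():
    print(f"{model}: {pre} -> {post} (+{relative_increase(pre, post):.2%})")
```

For all three models the post-quantization PPL in the table is less than 1% above the pre-quantization value.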
