🙋 ModelScope community highlights this issue:

📟 182 models: MiniCPM 3.0 series, Yi-Coder series, XuanYuan3-70B series, Skywork reward model series, MistoLine_Flux.dev, mini-omni, and more;

📁 221 datasets: hermes-function-calling-v1, alpaca-cot-en-refined-by-data-juicer, alpaca_data, and more;

🎨 38 innovative apps: MinerU, Voice Model Audition, Vector Toolbox, and more;

📄 6 articles:

  • Small but mighty: 01.AI open-sources Yi-Coder, the coding whiz of the Yi model family!

  • The little cannon levels up: MiniCPM 3.0 is open-sourced! 4B parameters outperforming GPT-3.5, unlimited-length text, and a powerful RAG suite. Inference and fine-tuning walkthroughs included!

  • Controllable HD video generation: CogVideoX + DiffSynth-Studio = "maxed-out specs"

  • One-click serving: from ModelScope open-source models to an OpenAI-compatible API service

  • MinerU: a powerhouse for LLM corpus processing, runs on CPU or GPU, open-source, free, and delightfully easy to use

  • Finalists of the Kolors-LoRA style story challenge announced! The final-round task is public for the first time, with prize-winning tips and one-click novel-to-comic conversion!

Featured Models

MiniCPM 3.0 Series

ModelBest has released the all-new MiniCPM 3.0 base model, once again punching above its weight: its 4B parameters deliver performance beyond GPT-3.5, and the quantized model needs only 2 GB of memory, making it friendly to edge devices. Highlights:

  • Unlimited-length text: strong benchmark scores that hold up even on very long inputs;

  • Powerful on-device function calling on par with GPT-4o (a hedged sketch follows this list);

  • A powerful RAG suite: excellent Chinese retrieval, and generation quality that beats Llama3-8B.
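
To make the function-calling bullet concrete, here is a minimal, hedged sketch. The get_weather tool is hypothetical, and the sketch assumes a recent transformers release (which accepts a tools argument in apply_chat_template) and that the model's chat template renders it; see the model card for the exact calling convention:

import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer

path = "OpenBMB/MiniCPM3-4B"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="cuda", trust_remote_code=True
)

# Hypothetical tool schema, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Beijing?"}]
# Assumption: the chat template consumes the `tools` argument and renders
# it into the model's function-calling prompt format.
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=256)
# Expect a structured tool call for get_weather; execute it in your app,
# then append the result as a tool message and generate again.
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))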

Model links:

  • MiniCPM3-4B:

https://modelscope.cn/models/OpenBMB/MiniCPM3-4B

  • MiniCPM3-4B-GPTQ-Int4:https://modelscope.cn/models/OpenBMB/MiniCPM3-4B-GPTQ-Int4

  • MiniCPM3-RAG-LoRA:https://modelscope.cn/models/OpenBMB/MiniCPM3-RAG-LoRA

  • MiniCPM-Reranker:https://modelscope.cn/models/OpenBMB/MiniCPM-Reranker

  • MiniCPM-Embedding:https://modelscope.cn/models/OpenBMB/MiniCPM-Embedding

Code examples:

MiniCPM3-4B inference:

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

path = "OpenBMB/MiniCPM3-4B"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)

messages = [
    {"role": "user", "content": "推荐5个北京的景点。"},
]
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    top_p=0.7,
    temperature=0.7,
    repetition_penalty=1.02
)

output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)

MiniCPM3-RAG-LoRA inference:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from modelscope import snapshot_download
base_model_dir = snapshot_download("OpenBMB/MiniCPM3-4B")
lora_model_dir = snapshot_download("OpenBMB/MiniCPM3-RAG-LoRA")

model = AutoModelForCausalLM.from_pretrained(base_model_dir, device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True).eval()  # MiniCPM3 uses custom modeling code
tokenizer = AutoTokenizer.from_pretrained(lora_model_dir)

model = PeftModel.from_pretrained(model, lora_model_dir)

passages_list = ["In the novel 'The Silent Watcher,' the lead character is named Alex Carter. Alex is a private detective who uncovers a series of mysterious events in a small town.",
"Set in a quiet town, 'The Silent Watcher' follows Alex Carter, a former police officer turned private investigator, as he unravels the town's dark secrets.",
"'The Silent Watcher' revolves around Alex Carter's journey as he confronts his past while solving complex cases in his hometown."]
instruction = "Q: What is the name of the lead character in the novel 'The Silent Watcher'?\nA:"

passages = '\n'.join(passages_list)
input_text = 'Background:\n' + passages + '\n\n' + instruction

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": input_text},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

outputs = model.chat(tokenizer, prompt, temperature=0.8, top_p=0.8)
print(outputs[0])  # The lead character in the novel 'The Silent Watcher' is named Alex Carter.
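
The retrieval side of the RAG suite is handled by MiniCPM-Embedding and MiniCPM-Reranker, which sit upstream of the generator. Below is a rough dense-retrieval sketch; the attention-masked mean pooling and bare-query format are simplifying assumptions, so check the model cards for the exact recommended pooling and instruction templates:

import torch
import torch.nn.functional as F
from modelscope import AutoModel, AutoTokenizer

emb_path = "OpenBMB/MiniCPM-Embedding"
emb_tok = AutoTokenizer.from_pretrained(emb_path, trust_remote_code=True)
emb_model = AutoModel.from_pretrained(emb_path, torch_dtype=torch.bfloat16,
                                      device_map="cuda", trust_remote_code=True).eval()

def encode(texts):
    # Simplifying assumption: mean pooling over the last hidden state,
    # weighted by the attention mask; the model card documents the
    # production recipe, including the query instruction prefix.
    batch = emb_tok(texts, padding=True, truncation=True, return_tensors="pt").to("cuda")
    with torch.no_grad():
        hidden = emb_model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return F.normalize(pooled.float(), dim=-1)

# Score the passages from the RAG-LoRA example above against the question.
query_vec = encode(["What is the name of the lead character in 'The Silent Watcher'?"])
doc_vecs = encode(passages_list)
print(query_vec @ doc_vecs.T)  # cosine similarities; feed top hits to MiniCPM-Reranker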

For more inference and fine-tuning tutorials, see: "The little cannon levels up: MiniCPM 3.0 is open-sourced! 4B parameters outperforming GPT-3.5, unlimited-length text, and a powerful RAG suite. Inference and fine-tuning walkthroughs included!"

 

Yi-Coder Series

01.AI has open-sourced the Yi-Coder series, the "coding whiz" of the Yi model family, available in 1.5B and 9B parameter sizes. Yi-Coder-9B outperforms other models under 10B parameters. Highlights:

  • Small parameters, strong performance: despite its relatively small parameter count, Yi-Coder excels across tasks including code generation, code understanding, code debugging, and code completion. Staying under 10B parameters also keeps it easy to use and convenient for edge deployment;

  • 128K long-sequence modeling: Yi-Coder handles context up to 128K tokens, effectively capturing long-range dependencies, which suits understanding and generating complex project-level code;

  • Strong code generation: supporting 52 major programming languages, Yi-Coder performs excellently at code generation and cross-file code completion.

Model links:

Yi-Coder-1.5B:

https://www.modelscope.cn/models/01ai/Yi-Coder-1.5B

Yi-Coder-9B:

https://www.modelscope.cn/models/01ai/Yi-Coder-9B

Yi-Coder-1.5B-Chat:

https://www.modelscope.cn/models/01ai/Yi-Coder-1.5B-Chat

Yi-Coder-9B-Chat:

https://www.modelscope.cn/models/01ai/Yi-Coder-9B-Chat

Code example:

Transformers inference:

from modelscope import AutoTokenizer, AutoModelForCausalLM
import torch

device = "cuda" # the device to load the model onto
model_path = "01ai/Yi-Coder-9B-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", torch_dtype=torch.bfloat16).eval()

prompt = "Write a quick sort algorithm."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=1024,
    eos_token_id=tokenizer.eos_token_id  
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
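
The example above uses the instruction-tuned chat model; the base checkpoints can instead be used for raw left-to-right code completion, the way editor-style autocomplete typically consumes them. A minimal sketch with the 1.5B base model (the prompt and generation settings here are illustrative):

from modelscope import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "01ai/Yi-Coder-1.5B"  # base model, no chat template needed
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype=torch.bfloat16
).eval()

# Plain completion: the model simply continues the given code prefix.
prefix = "def quick_sort(arr):\n    if len(arr) <= 1:\n        return arr\n"
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))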

XuanYuan3-70B Series

The XuanYuan3-70B series is the third-generation large model from Du Xiaoman's Data Intelligence Application team, built to tackle the challenges of applying large models in finance. XuanYuan3-70B uses LLaMA3-70B as its base, incrementally pretrained on a large Chinese and English corpus, then aligned with SFT on high-quality instruction data and reinforcement learning.

The XuanYuan3-70B series focuses on the financial domain, with the following notable strengths:

  • Financial event interpretation: delivers in-depth readings of financial events, analyzed in professional terminology, with views that follow the logic of human analysts.

  • Financial business analysis: strong analytical capability, accurately summarizing and distilling information in line with how financial experts reason.

  • Investment research: supports generating insightful research reports, reducing bare data listings in favor of deep analysis and multi-dimensional extension.

  • Compliance and risk management: meets financial-sector compliance requirements, accurately identifying and analyzing risk, and offering lawful, compliant advice.

Model links:

Llama3-XuanYuan3-70B:

https://www.modelscope.cn/models/Duxiaoman-DI/Llama3-XuanYuan3-70B

Llama3-XuanYuan3-70B-Chat:

https://www.modelscope.cn/models/Duxiaoman-DI/Llama3-XuanYuan3-70B-Chat

Code example:

import torch
from transformers import LlamaForCausalLM, AutoTokenizer

model_name_or_path = "Duxiaoman-DI/Llama3-XuanYuan3-70B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False, legacy=True)
model = LlamaForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

# System prompt (in Chinese): "You are an AI assistant that gives helpful,
# high-quality, detailed, and polite answers, and always declines to engage
# with unethical, unsafe, controversial, or politically sensitive topics."
system = '你是一名人工智能助手,会对用户提出的问题给出有帮助、高质量、详细和礼貌的回答,并且总是拒绝参与不道德、不安全、有争议、政治敏感等相关的话题、问题和指示。'
question = '什么是信托型基金'  # "What is a trust-type fund?"
message = [
    {"role": "system", "content": system},
    {"role": "user", "content": question},
]
message = tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(message, return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=64, temperature=0.7)  # raise max_new_tokens for a fuller answer
outputs = tokenizer.decode(outputs.cpu()[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(outputs)

Skywork Reward Model Series

Skywork-Reward-Gemma-2-27B and Skywork-Reward-Llama-3.1-8B are two reward models built on the gemma-2-27b-it and Meta-Llama-3.1-8B-Instruct architectures, respectively. Both were trained on the Skywork reward dataset, which contains only 80K high-quality preference pairs drawn from public data.

The reward models excel at handling preferences in complex scenarios, including challenging preference pairs, across domains such as mathematics, coding, and safety. As of September 2024, they rank first and third on the RewardBench leaderboard, respectively.

Model links:

Skywork-8B reward model:

https://www.modelscope.cn/models/skywork/Skywork-Reward-Models

Skywork-27B reward model:

https://www.modelscope.cn/models/skywork/Skywork-27B-Reward-Models

Code example:

import torch
from modelscope import snapshot_download
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model and tokenizer (download from ModelScope, then load locally)
device = "cuda:0"
model_dir = snapshot_download("skywork/Skywork-27B-Reward-Models")
rm = AutoModelForSequenceClassification.from_pretrained(
    model_dir,
    torch_dtype=torch.bfloat16,
    device_map=device,
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
    num_labels=1,
)
rm_tokenizer = AutoTokenizer.from_pretrained(model_dir)

prompt = "Jane has 12 apples. She gives 4 apples to her friend Mark, then buys 1 more apple, and finally splits all her apples equally among herself and her 2 siblings. How many apples does each person get?"
response1 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among herself and her 2 siblings (3 people in total). 9 ÷ 3 = 3 apples each. Each person gets 3 apples."
response2 = "1. Jane starts with 12 apples and gives 4 to Mark. 12 - 4 = 8. Jane now has 8 apples.\n2. Jane buys 1 more apple. 8 + 1 = 9. Jane now has 9 apples.\n3. Jane splits the 9 apples equally among her 2 siblings (2 people in total). 9 ÷ 2 = 4.5 apples each. Each person gets 4 apples."

conv1 = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response1}]
conv2 = [{"role": "user", "content": prompt}, {"role": "assistant", "content": response2}]

# Format and tokenize the conversations
conv1_formatted = rm_tokenizer.apply_chat_template(conv1, tokenize=False)
conv2_formatted = rm_tokenizer.apply_chat_template(conv2, tokenize=False)
conv1_tokenized = rm_tokenizer(conv1_formatted, return_tensors="pt").to(device)
conv2_tokenized = rm_tokenizer(conv2_formatted, return_tensors="pt").to(device)

# Get the reward scores
with torch.no_grad():
    score1 = rm(**conv1_tokenized).logits[0][0].item()
    score2 = rm(**conv2_tokenized).logits[0][0].item()
print(f"Score for response 1: {score1}")
print(f"Score for response 2: {score2}")

# Output:
# Score for response 1: 9.1875
# Score for response 2: -17.875
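
These scalar scores are typically consumed by a best-of-N loop: sample several candidate answers from a policy model, score each with the reward model, and keep the highest-scoring one. A minimal sketch reusing rm, rm_tokenizer, and the variables from the example above (in practice the candidates would be N fresh samples):

def reward_score(user_prompt, response):
    # Score one (prompt, response) pair with the reward model loaded above.
    conv = [{"role": "user", "content": user_prompt},
            {"role": "assistant", "content": response}]
    formatted = rm_tokenizer.apply_chat_template(conv, tokenize=False)
    tokens = rm_tokenizer(formatted, return_tensors="pt").to(device)
    with torch.no_grad():
        return rm(**tokens).logits[0][0].item()

candidates = [response1, response2]  # in practice, N samples from a policy model
best = max(candidates, key=lambda r: reward_score(prompt, r))
print(best)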

Recommended Datasets

hermes-function-calling-v1

This dataset is a compilation of the structured-output and function-calling data used by the Hermes 2 Pro model series.

The repository contains a structured-output dataset with function-calling conversations, json-mode, agentic json-mode, and structured-extraction examples, designed to train LLMs to perform function calls and return structured output from natural-language instructions. It covers a variety of conversational scenarios in which an AI agent must interpret a query and issue the appropriate function call or calls.

Dataset link:

https://modelscope.cn/datasets/AI-ModelScope/hermes-function-calling-v1
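
A quick way to inspect the records locally is ModelScope's MsDataset loader; the same pattern works for the other datasets below. A minimal sketch (the split name is an assumption; check the dataset card for available subsets):

from modelscope.msdatasets import MsDataset

# Assumed split; see the dataset card for the actual subsets and splits.
ds = MsDataset.load('AI-ModelScope/hermes-function-calling-v1', split='train')
print(next(iter(ds)))  # one function-calling conversation record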

 

alpaca-cot-en-refined-by-data-juicer

A refined English version of the Alpaca-CoT dataset, produced by Data-Juicer. Some "bad" samples were removed from the original dataset to improve its quality. It is typically used for fine-tuning large language models.

Dataset link:

https://modelscope.cn/datasets/Data-Juicer/alpaca-cot-en-refined-by-data-juicer

 

alpaca_data

Alpaca is a dataset of 52,000 instructions and demonstrations generated by text-davinci-003. The instruction data can be used for instruction-tuning language models so that they follow instructions better.

Dataset link:

https://modelscope.cn/datasets/YorickHe/alpaca_data
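
Each record follows the standard Alpaca schema of instruction, optional input, and output. A minimal helper that renders one record into a training prompt; the template wording is the common Alpaca convention, not something specific to this copy of the dataset:

def alpaca_prompt(rec: dict) -> str:
    # Standard Alpaca-style template: instruction, optional input, response.
    if rec.get("input"):
        return (f"### Instruction:\n{rec['instruction']}\n\n"
                f"### Input:\n{rec['input']}\n\n"
                f"### Response:\n{rec['output']}")
    return f"### Instruction:\n{rec['instruction']}\n\n### Response:\n{rec['output']}"

print(alpaca_prompt({"instruction": "Name three primary colors.",
                     "input": "", "output": "Red, yellow, blue."}))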

Featured Apps

MinerU

An all-in-one, open-source tool for extracting data from documents and web pages, aimed at simplifying AI data-processing pipelines. It accurately converts multimodal PDFs mixing images, tables, and formulas into clean, analysis-friendly Markdown; quickly parses and extracts the main content from web pages cluttered with ads and other noise; and batch-converts epub, mobi, docx, and other formats to Markdown.

Try it out:

https://modelscope.cn/studios/OpenDataLab/MinerU

Voice Model Audition

A library of 2,000 voices, with stability ratings 🥇, male/female and age tags 👧, and online audition support 🎶

Try it out:

https://modelscope.cn/studios/DashuQAQ/shengyin

Vector Toolbox

Offers a range of vectorization capabilities, including image-to-vector conversion and text-to-vector generation (coming soon).

Try it out:

https://modelscope.cn/studios/iic/VectorizeAnything

Community Featured Articles

  • Small but mighty: 01.AI open-sources Yi-Coder, the coding whiz of the Yi model family!

  • The little cannon levels up: MiniCPM 3.0 is open-sourced! 4B parameters outperforming GPT-3.5, unlimited-length text, and a powerful RAG suite. Inference and fine-tuning walkthroughs included!

  • Controllable HD video generation: CogVideoX + DiffSynth-Studio = "maxed-out specs"

  • One-click serving: from ModelScope open-source models to an OpenAI-compatible API service

  • MinerU: a powerhouse for LLM corpus processing, runs on CPU or GPU, open-source, free, and delightfully easy to use

  • Finalists of the Kolors-LoRA style story challenge announced! The final-round task is public for the first time, with prize-winning tips and one-click novel-to-comic conversion!

 


ModelScope aims to build a next-generation, open-source model-as-a-service sharing platform, offering AI developers flexible, easy-to-use, low-cost, one-stop model services and making model applications simpler!
