Error when fine-tuning a large language model

m0_74816511

Published 2024-07-26 16:01:39

The following error was raised while fine-tuning the model:

ValueError: Expected input batch_size (8) to match target batch_size (256).

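Fix:

In my experience there are two possible causes:

1. Data processing error. The examples passed into def preprocess_function(examples): are in the wrong format.
2. Trainer error. This one was my own carelessness: when creating the trainer I specified the wrong evaluation function and passed the data preprocessing function instead. During training, the evaluation function receives three-dimensional array data, so calling the tokenizer on those three-dimensional token arrays raises the error.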
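For the first case, one way to catch a malformed batch early is to validate it at the top of preprocess_function. The sketch below assumes the dataset is mapped with batched=True, in which case the function receives a mapping from column name to a list of values. The helper name check_batched_examples and the column names "document" and "summary" are hypothetical and not taken from the original code.

```python
from collections.abc import Mapping


def check_batched_examples(examples, required_columns):
    """Return the batch size, or raise if the batch is not a dict of equal-length columns."""
    # A batched map call passes a dict-like object: {column_name: [value, value, ...]}
    if not isinstance(examples, Mapping):
        raise TypeError(f"expected a mapping of columns, got {type(examples).__name__}")

    missing = [col for col in required_columns if col not in examples]
    if missing:
        raise KeyError(f"missing columns: {missing}")

    # Every column must contain one entry per example in the batch
    lengths = {col: len(examples[col]) for col in required_columns}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have different lengths: {lengths}")

    return next(iter(lengths.values()), 0)


# Hypothetical batch with two examples
good_batch = {"document": ["text a", "text b"], "summary": ["sum a", "sum b"]}
batch_size = check_batched_examples(good_batch, ["document", "summary"])
```

Calling the helper as the first line of preprocess_function turns a silent shape problem into an immediate, readable exception instead of a batch size mismatch deep inside the loss computation.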

# Create the trainer
trainer = Seq2SeqTrainer(
    model=model,               # the model
    args=training_args,         # the training arguments object
    train_dataset=tokenized_tarin_datasets, # training data
    eval_dataset=tokenized_validation_dataset,   # evaluation data
    tokenizer=tokenizer, # the tokenizer
    # data_collator=data_collator,
    compute_metrics=compute_metrics, # the evaluation function, called at every evaluation point; it takes the predictions and the true labels and returns a dict of metrics
)
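Since the second case comes down to passing the wrong callable as compute_metrics, a cheap guard is to smoke test the function on a tiny fake batch before building the trainer. This is only a sketch: FakeEvalPrediction is a hypothetical stand-in that mimics the predictions and label_ids attributes of the object the trainer passes in, and smoke_test_compute_metrics, dummy_metrics and wrong_function are illustrative names.

```python
from collections import namedtuple

# Hypothetical stand-in for the trainer's evaluation object: it only mimics
# the two attributes used here (predictions and label_ids).
FakeEvalPrediction = namedtuple("FakeEvalPrediction", ["predictions", "label_ids"])


def smoke_test_compute_metrics(fn):
    """Call fn on a tiny fake batch and check that it returns a metrics dict."""
    fake = FakeEvalPrediction(
        predictions=[[5, 6, 7], [8, 9, 0]],      # token ids, 2 sequences of 3 tokens
        label_ids=[[5, 6, -100], [8, 9, -100]],  # -100 marks ignored positions
    )
    result = fn(fake)
    if not isinstance(result, dict):
        raise TypeError(f"compute_metrics must return a dict, got {type(result).__name__}")
    for name, value in result.items():
        if not isinstance(name, str) or not isinstance(value, (int, float)):
            raise TypeError(f"metric {name!r} is not a name -> number entry")
    return result


def dummy_metrics(eval_pred):
    """Hypothetical metric: share of non-ignored label positions predicted exactly."""
    hits = total = 0
    for pred_row, label_row in zip(eval_pred.predictions, eval_pred.label_ids):
        for pred, label in zip(pred_row, label_row):
            if label != -100:
                total += 1
                hits += int(pred == label)
    return {"token_accuracy": hits / total}


def wrong_function(examples):
    """Mimics the mistake: a preprocessing-style function that expects a dict of columns."""
    return [text.strip() for text in examples["document"]]


checked = smoke_test_compute_metrics(dummy_metrics)
```

Running the same check on wrong_function fails immediately with a TypeError, which is much easier to diagnose than an error raised in the middle of an evaluation step.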

In the end, my problem was that the trainer's evaluation function was wrong. Here is an evaluation function that works well:

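import numpy as np
from evaluate import load

# Load the ROUGE metric from a local copy of the evaluate repository
rouge = load("D:\\project\\llm\\evaluate\\evaluate-main\\metrics\\rouge")

# Evaluation function
def compute_metrics(evaPred):
    # predictions: token ids returned by the model; labels: the expected token ids.
    # Reading the attributes avoids depending on the unpacking order.
    # Assumes predict_with_generate=True, so predictions are token ids rather than logits.
    predictions, labels = evaPred.predictions, evaPred.label_ids

    # Predictions can also be padded with -100, so replace it before decoding
    predictions = np.where(predictions != -100, predictions, tokenizer.pad_token_id)
    decode_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)

    # Replace -100 in labels with the pad token id.
    # np.where(condition, value_if_true, value_if_false)
    # In seq2seq tasks, -100 is the usual placeholder for positions ignored when computing the loss
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decode_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    scores = rouge.compute(predictions=decode_preds, references=decode_labels)
    print(scores)
    return scores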
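To make the -100 replacement step concrete, here is a small standalone sketch. The pad id 0 and the label values are made up for illustration; in the function above the pad id comes from tokenizer.pad_token_id. The helper name replace_ignore_index is hypothetical.

```python
import numpy as np

PAD_TOKEN_ID = 0  # assumed pad id, for illustration only


def replace_ignore_index(token_ids, pad_token_id, ignore_index=-100):
    """Return a copy of token_ids where every ignore_index entry becomes pad_token_id."""
    token_ids = np.asarray(token_ids)
    # np.where keeps the original value where the condition holds, else uses pad_token_id
    return np.where(token_ids != ignore_index, token_ids, pad_token_id)


# Two label sequences padded to the same length with -100
labels = np.array([
    [12, 7, 3, -100, -100],
    [9, 4, -100, -100, -100],
])
cleaned = replace_ignore_index(labels, PAD_TOKEN_ID)
```

After the replacement every entry is a valid token id, and decoding with skip_special_tokens=True then drops the padding from the decoded text.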
