Multi-Agent in Practice, Part 5: Applying RAG Agents: Let AgentScope Introduce Itself

ModelScope Community

ModelScope Community · Published 2024-05-13 13:52:47

Preface

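In the previous articles of this series, we walked step by step through how to use AgentScope to build a conversation with an @-mention feature and how to set up a simple Gomoku game, and we built a ReAct agent that can both reflect dynamically and decide dynamically how to call tools. Besides ReAct, another technique that makes an agent's answers more reliable and better suited to the scenario is Retrieval Augmented Generation (RAG).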

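In this article, we show how to build and use RAG-enabled agents in AgentScope and create a group of AgentScope assistants that answer questions about AgentScope. The complete code can be found at this link (https://github.com/modelscope/agentscope/tree/main/examples/conversation_with_RAG_agents).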

(A disclaimer up front: the RAG agent is currently a beta example and may be added to the framework in a more general, easier-to-use form in the future.)

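You are welcome to follow AgentScope and give us a star on GitHub (https://github.com/modelscope/agentscope). Over the next few days we will keep publishing tutorials that walk you through building interesting multi-agent applications!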

AgentScope and RAG

RAG

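RAG is a family of algorithms that has recently become very popular in the large-model field. What these algorithms have in common is that they retrieve from an external knowledge base to obtain knowledge relevant to the question, and use that knowledge to "influence" the output of the large model. A classic, simple, yet very practical implementation relies on the in-context learning ability of large models: it retrieves domain-specific knowledge from the external knowledge base and places the retrieved text directly into the prompt given to the model. As a result, even if the model never saw the relevant corpus during training, it can still give quite reliable answers under the influence of the retrieved knowledge. Traditional retrieval usually returns several results that match the conditions or are similar enough, and leaves it to a person to extract the useful information and judge whether it is relevant. By comparison, a RAG algorithm can use the large model to integrate the retrieved results automatically, filter out irrelevant information, and give a comprehensive and reasonable answer. The example in this article is implemented along these lines.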
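The retrieve-then-prompt idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the three-sentence `KNOWLEDGE_BASE` is made up for the demo, and the helper names (`retrieve`, `build_prompt`) are hypothetical rather than part of AgentScope or LlamaIndex.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Place the retrieved chunks directly into the prompt (in-context learning)."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved)
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Made-up knowledge base for the demo.
KNOWLEDGE_BASE = [
    "DialogAgent handles general-purpose conversations.",
    "DictDialogAgent parses model responses returned in dictionary format.",
    "The gomoku example shows two agents playing a board game.",
]

question = "What is DictDialogAgent?"
top_chunks = retrieve(question, KNOWLEDGE_BASE, top_k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real application the prompt would then be sent to the language model; the toy similarity function is the part a proper embedding model and vector index would replace.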

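From AgentScope's multi-agent perspective, RAG can give each agent a more customized question-answering ability. In some applications, RAG can be combined with agent techniques such as tool calling and CoT to create more reliable and versatile agents.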

Building RAG Agents in AgentScope

Application Overview

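In the example, we use RAG to build an AgentScope Q&A assistant group. The application consists of four agents. Besides the user agent that receives user input, we set up an agent dedicated to answering questions based on the tutorials (markdown files), the Tutorial Assistant; an agent that answers questions based on the AgentScope code base (.py files), the Code Assistant; and an agent responsible for summarizing what the first two say, the Summarize Assistant. The Tutorial Assistant and the Code Assistant can be created simply by configuring LlamaIndexAgent, the LlamaIndex-based RAG agent wrapper provided with our application. Here we walk through the logic of the main program of the AgentScope Q&A application (https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_RAG_agents/rag_example.py).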

Model Dependencies

A RAG application needs two kinds of models: a large language model and an embedding model. In this example we use Tongyi's qwen-max and text-embedding-v2.

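import os

import agentscope

# Register the chat model and the embedding model; both read the API key
# from the DASHSCOPE_API_KEY environment variable.
agentscope.init(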
    model_configs=[
        {
            "model_type": "dashscope_chat",
            "config_name": "qwen_config",
            "model_name": "qwen-max",
            "api_key": f"{os.environ.get('DASHSCOPE_API_KEY')}",
        },
        {
            "model_type": "dashscope_text_embedding",
            "config_name": "qwen_emb_config",
            "model_name": "text-embedding-v2",
            "api_key": f"{os.environ.get('DASHSCOPE_API_KEY')}",
        },
    ],
)

RAG Agent Configuration

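Compared with an ordinary agent, the RAG agent in the example needs two extra parameters: emb_model_config_name and rag_config. The former names the model configuration used to generate embeddings. In addition, the example asks you to configure the RAG agent's rag_config to decide which LlamaIndex modules are plugged in, so that you can customize your RAG agent for different file types. For instance, in the case below, when configuring the Code Assistant, data loading uses LlamaIndex's SimpleDirectoryReader, and file chunking uses CodeSplitter, which LlamaIndex provides specifically for code files. The contents of each init_args are the parameters needed to initialize SimpleDirectoryReader and CodeSplitter, respectively. The most important of these is input_dir for data loading, which specifies where the files are read from.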

{
  "class": "LlamaIndexAgent",
  "args": {
    "name": "AgentScope Framework Code Assistant",
    "sys_prompt": "You're a helpful assistant about coding. You are very familiar with the framework code of AgentScope.",
    "model_config_name": "qwen_config",
    "emb_model_config_name": "qwen_emb_config",
    "rag_config": {
      "load_data": {
        "loader": {
          "create_object": true,
          "module": "llama_index.core",
          "class": "SimpleDirectoryReader",
          "init_args": {
            "input_dir": "../../src/agentscope",
            "recursive": true,
            "required_exts": [".py"]
          }
        }
      },
      "store_and_index": {
        "transformations": [
          {
            "create_object": true,
            "module": "llama_index.core.node_parser",
            "class": "CodeSplitter",
            "init_args": {
              "language": "python",
              "chunk_lines": 100
            }
          }
        ]
      },
      "chunk_size": 2048,
      "chunk_overlap": 40,
      "similarity_top_k": 10,
      "log_retrieval": false,
      "recent_n_mem": 1
    }
  }
}
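To make the "create_object" / "module" / "class" / "init_args" entries in the configuration more concrete, here is a minimal sketch of how such an entry could be turned into an object with dynamic imports. It is an assumption about the general pattern, not the example's actual implementation: `build_from_spec` is a hypothetical helper, and the demo spec uses the standard-library `fractions.Fraction` as a stand-in so the snippet runs without llama_index installed.

```python
import importlib
from typing import Any


def build_from_spec(spec: dict[str, Any]) -> Any:
    """Instantiate spec["class"] from spec["module"], passing spec["init_args"] as keywords."""
    if not spec.get("create_object", False):
        # Assumption for this sketch: entries without the flag are returned unchanged.
        return spec
    module = importlib.import_module(spec["module"])
    cls = getattr(module, spec["class"])
    return cls(**spec.get("init_args", {}))


# Stand-in spec with the same shape as the "loader" entry in the configuration.
demo_spec = {
    "create_object": True,
    "module": "fractions",
    "class": "Fraction",
    "init_args": {"numerator": 3, "denominator": 4},
}
obj = build_from_spec(demo_spec)
print(type(obj).__name__, obj)
```

With llama_index installed, the same helper applied to the "loader" entry would import `llama_index.core` and call `SimpleDirectoryReader` with the listed `init_args`.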

Creating the Agents

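As in the previous examples, we can create the agents from the configuration file mentioned above. Each of the three assistants has its own configuration.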

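import json

from agentscope.agents import DialogAgent, UserAgent
from rag_agents import LlamaIndexAgent  # assumed: rag_agents.py in the example directory

with open("./agent_config.json", "r", encoding="utf-8") as f: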
    agent_configs = json.load(f)
tutorial_agent = LlamaIndexAgent(**agent_configs[0]["args"])
code_explain_agent = LlamaIndexAgent(**agent_configs[1]["args"])
summarize_agent = DialogAgent(**agent_configs[2]["args"])
user_agent = UserAgent()

Running Logic

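As a Q&A application for AgentScope, we want to extract useful information from the tutorials and from the code separately, and hand it to the Summarize Assistant for processing.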

from agentscope.message import Msg

while True:
    x = user_agent()
    x.role = "user"  # to enforce dashscope requirement on roles
    if len(x["content"]) == 0 or str(x["content"]).startswith("exit"):
        break
    # The Tutorial Assistant answers
    tutorial_response = tutorial_agent(x)
    # The Code Assistant answers
    code_explain = code_explain_agent(x)
    # Combine both answers with the original question
    msg = Msg(
        name="user",
        role="user",
        content=tutorial_response["content"]
        + "\n"
        + code_explain["content"]
        + "\n"
        + x["content"],
    )
    # Summarize
    summarize_agent(msg)
Results

The AgentScope Q&A example can be run with a single command: python ./rag_example.py. Once the data has been processed, we can enter a question through the user agent, for example:

User: what is the difference between DialogAgent and DictDialogAgent?

We then see the answer from the tutorial-based RAG agent:

AgentScope Tutorial Assistant: `DialogAgent` and `DictDialogAgent` are both agents in the AgentScope framework, but they differ in how they handle conversations and the level of customization:

1. **DialogAgent**: This is a general-purpose agent class that manages dialogues by understanding context and generating coherent responses. (omitted in this article...)

2. **DictDialogAgent**: This agent type seems to be specialized for scenarios where the agent's behavior needs to adhere to specific constraints or rules defined in a dictionary or through a structured prompt. (omitted in this article...)

While both agents are capable of holding a conversation, the `DialogAgent` is more versatile and generally (omitted in this article...)

And the answer from the code-based RAG agent:

AgentScope Framework Code Assistant: In the given context, `DictDialogAgent` and `DialogAgent` are both subclasses of `AgentBase`, designed for different use cases within a conversational AI system. Here's a brief comparison:

1. **DictDialogAgent**:
   - This agent specifically deals with responses returned by the model in dictionary format.
   - (omitted in this article...)

2. **DialogAgent**:
   - This is a more general-purpose dialog agent without the assumption that responses are in dictionary format.
   - (omitted in this article...)

Both agents utilize prompts to request specific formats from the model and have mechanisms to handle and process the model's responses. (omitted in this article...)

Then, based on their answers, a third agent gives a summary:

Summarize Assistant: **Summary:** 

The `DialogAgent` and `DictDialogAgent` are both classes in the AgentScope framework, designed for managing conversational AI interactions. They differ primarily in their intended usage and response handling:

1. **DialogAgent**:
   - This is a generic agent class meant for a wide variety of conversation contexts.
   - (omitted in this article...)

2. **DictDialogAgent**:
   - This agent is more specialized and suited for scenarios where the dialogue must adhere to specific rules or constraints encoded in a dictionary or structured prompt.
   - (omitted in this article...)

In summary, `DialogAgent` is more versatile and adaptable to various conversational contexts, while `DictDialogAgent` is tailored for environments where the dialogue is more predictable and rule-bound,

You can also use as_studio to quickly set up a Q&A application with a UI: just run as_studio ./rag_example.py on the command line.

Extensions & Summary

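Our AgentScope assistant group application can also be easily migrated to serve other document or code repositories. Users only need to identify their file types: text data can use the Tutorial Assistant, while code data calls for the Code Assistant. By changing input_dir so that the agent reads the data stored under the specified path, users can customize their own assistants!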
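As a rough sketch of that migration step, the helper below copies an agent configuration and points its loader at a different directory. The function name `retarget_agent_config`, the trimmed-down `base_config`, and the path "./my_project/docs" are all hypothetical; only the nesting of the keys follows the Code Assistant configuration shown earlier.

```python
import copy


def retarget_agent_config(config: dict, input_dir: str, required_exts: list[str]) -> dict:
    """Return a copy of an agent config whose loader reads from a different directory."""
    new_config = copy.deepcopy(config)  # leave the original configuration untouched
    init_args = new_config["args"]["rag_config"]["load_data"]["loader"]["init_args"]
    init_args["input_dir"] = input_dir
    init_args["required_exts"] = required_exts
    return new_config


# Trimmed-down configuration with the same nesting as the Code Assistant config.
base_config = {
    "class": "LlamaIndexAgent",
    "args": {
        "name": "AgentScope Framework Code Assistant",
        "rag_config": {
            "load_data": {
                "loader": {
                    "init_args": {
                        "input_dir": "../../src/agentscope",
                        "recursive": True,
                        "required_exts": [".py"],
                    }
                }
            }
        },
    },
}

# Hypothetical target: markdown documentation of another project.
docs_config = retarget_agent_config(base_config, "./my_project/docs", [".md"])
print(docs_config["args"]["rag_config"]["load_data"]["loader"]["init_args"])
```

For text data, the chunking transformation and the system prompt would likely need adjusting as well, since CodeSplitter is meant for code files.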

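At this point, we have completed the task of using AgentScope to create an AgentScope Q&A assistant group and made a first attempt with it. The complete code and run results of this attempt can be found at this link (https://github.com/modelscope/agentscope/tree/main/examples/conversation_with_RAG_agents). AgentScope will keep adding new algorithms and examples in the future, so stay tuned!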

Further Reading and Resources