QAGenerationChainV2

Modified on August 2, 2024

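QAGenerationChainV2 automatically generates QA pairs. Core principle: a long input document (PDF, Word, or another format) passes through the document parsing module, which yields a sequence of text tokens. The component splits these tokens into chunks internally and uses a large language model to generate a QA pair for each chunk.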

Complete skill

Complete skill export file:

QA Generator.json

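How it works
◦ The Documents component (from [Loaders]) and the Llm component (from [LLMs]) must both be connected.
Core steps:
◦ A long input document (PDF, Word, or another format) passes through the [Loaders] component, which yields a sequence of text tokens. QAGenerationChainV2 splits these tokens into chunks internally and uses a large language model to generate a QA pair for each chunk:
  ▪ The model judges whether each chunk's context is valid and filters out invalid chunks;
  ▪ The large language model generates a question from the chunk's context;
  ▪ The model judges whether the generated question is valid and filters out invalid questions;
  ▪ The model generates the corresponding answer from the chunk's context and the generated question.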
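
The four core steps above can be sketched as a simple loop. This is a minimal illustration only, not the component's actual implementation: the splitter uses fixed-size, non-overlapping windows (an assumption), and `is_valid_context`, `generate_question`, `is_valid_question`, and `generate_answer` are hypothetical callables standing in for the LLM calls.

```python
from typing import Callable, List, Tuple


def split_into_chunks(tokens: List[str], chunk_size: int = 512) -> List[str]:
    # Assumption: fixed-size, non-overlapping windows over the token list.
    return [
        " ".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]


def generate_qa_pairs(
    tokens: List[str],
    is_valid_context: Callable[[str], bool],        # hypothetical LLM check
    generate_question: Callable[[str], str],        # hypothetical LLM call
    is_valid_question: Callable[[str, str], bool],  # hypothetical LLM check
    generate_answer: Callable[[str, str], str],     # hypothetical LLM call
    chunk_size: int = 512,
) -> List[Tuple[str, str]]:
    qa_pairs: List[Tuple[str, str]] = []
    for context in split_into_chunks(tokens, chunk_size):
        # Step 1: drop chunks whose context is judged invalid.
        if not is_valid_context(context):
            continue
        # Step 2: generate a question from the chunk's context.
        question = generate_question(context)
        # Step 3: drop questions judged invalid.
        if not is_valid_question(context, question):
            continue
        # Step 4: answer the question using the same context.
        answer = generate_answer(context, question)
        qa_pairs.append((question, answer))
    return qa_pairs
```

Because each chunk passes through the loop once, a chunk contributes at most one QA pair, and a chunk rejected at step 1 or step 3 contributes none.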

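Upstream connection points
◦ Documents: connects to a component from [Loaders] to receive the files uploaded by the user.
◦ Llm: connects to a component from [LLMs].
◦ Question Prompt: connects to a [ChatPromptTemplate] component and supports customizing the question-generation logic. A custom Question Prompt must include the context variable in its user-role prompt.
◦ Answer Prompt: connects to a [HumanMessagePromptTemplate] component and supports customizing the answer-generation logic. A custom Answer Prompt must include both the context and question variables.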
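
Before wiring in a custom prompt, it can help to check that the required variables are present. The sketch below assumes the templates use brace-style placeholders such as `{context}`; the helper names `template_variables` and `missing_variables` are hypothetical and are not part of the component.

```python
from string import Formatter
from typing import Set


def template_variables(template: str) -> Set[str]:
    # Collect the named placeholders, e.g. "{context}" -> "context".
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_variables(template: str, required: Set[str]) -> Set[str]:
    # Return the required variables that the template does not mention.
    return required - template_variables(template)


# Hypothetical custom prompts, for illustration only.
question_prompt = "Write one question that this passage answers:\n{context}"
answer_prompt = "Passage:\n{context}\n\nQuestion: {question}\nAnswer:"

print(missing_variables(question_prompt, {"context"}))
print(missing_variables(answer_prompt, {"context", "question"}))
```

An empty set means the template satisfies the requirement stated above; any name in the returned set still needs to be added to the prompt.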

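Downstream connection points
◦ No downstream connection is required; the component can be used on its own.
◦ It can be connected to a Tool component.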

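Parameter settings
◦ chunk_size: the size of each chunk produced by splitting; defaults to 512.
◦ k: caps the number of QA pairs generated for the whole document; defaults to 1000. The final number of QA pairs = min(k, number of valid chunks). Each chunk yields at most one QA pair.
◦ filter_lowquality_context: whether to filter out low-quality chunks.
◦ filter_lowquality_question: whether to filter out low-quality questions.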
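
As a rough worked example of how chunk_size and k interact, the sketch below estimates an upper bound on the output size. It assumes chunks do not overlap and that chunk_size is measured in tokens, neither of which is confirmed here; the function names are hypothetical.

```python
import math


def max_chunk_count(total_tokens: int, chunk_size: int = 512) -> int:
    # Assumption: non-overlapping chunks, so the count is a ceiling division.
    return math.ceil(total_tokens / chunk_size)


def final_qa_count(k: int, valid_chunk_count: int) -> int:
    # From the parameter description: min(k, number of valid chunks).
    return min(k, valid_chunk_count)


chunks = max_chunk_count(100_000, chunk_size=512)
print(chunks)                        # 196
print(final_qa_count(1000, chunks))  # 196: the default k does not bind here
print(final_qa_count(50, chunks))    # 50: a smaller k caps the output
```

In practice the number of valid chunks is lower than this bound whenever the low-quality filters reject chunks or questions, so the result is best read as a ceiling rather than a prediction.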
