diff --git a/docs/en/notes/guide/domain_specific_operators/rare_operators.md b/docs/en/notes/guide/domain_specific_operators/rare_operators.md
index 1f47740fc..0ac082a70 100644
--- a/docs/en/notes/guide/domain_specific_operators/rare_operators.md
+++ b/docs/en/notes/guide/domain_specific_operators/rare_operators.md
@@ -1,6 +1,6 @@
 ---
 title: RARE Operators
-createTime: 2025/06/24 11:43:42
+createTime: 2025/09/26 11:47:42
 permalink: /en/guide/RARE_operators/
 ---
 
@@ -20,24 +20,24 @@ The RARE operator workflow systematically generates synthetic data for reasoning
 
 | Name | Application Type | Description | Official Repository or Paper |
 | :--- | :--- | :--- | :--- |
-| Doc2Query✨ | Question Generation | Generates complex reasoning questions and corresponding scenarios based on original documents. | ReasonIR: Training Retrievers for Reasoning Tasks |
-| BM25HardNeg✨ | Hard Negative Mining | Mines hard negative samples that are textually similar but semantically irrelevant to the generated questions to construct challenging retrieval contexts. | ReasonIR: Training Retrievers for Reasoning Tasks |
-| ReasonDistill🚀 | Reasoning Process Generation | Combines the question, positive, and negative documents to prompt a large language model to generate a detailed reasoning process, "distilling" its domain thinking patterns. | RARE: Retrieval-Augmented Reasoning Modeling |
+| RAREDoc2QueryGenerator✨ | Question Generation | Generates complex reasoning questions and corresponding scenarios based on original documents. | ReasonIR: Training Retrievers for Reasoning Tasks |
+| RAREBM25HardNegGenerator✨ | Hard Negative Mining | Mines hard negative samples that are textually similar but semantically irrelevant to the generated questions to construct challenging retrieval contexts. | ReasonIR: Training Retrievers for Reasoning Tasks |
+| RAREReasonDistillGenerator🚀 | Reasoning Process Generation | Combines the question, positive, and negative documents to prompt a large language model to generate a detailed reasoning process, "distilling" its domain thinking patterns. | RARE: Retrieval-Augmented Reasoning Modeling |
 
 ## Operator Interface Usage Instructions
 
 For operators that require specifying storage paths or calling models, we provide encapsulated **model interfaces** and **storage object interfaces**. You can predefine the model API parameters for an operator as follows:
 
 ```python
-from dataflow.llmserving import APILLMServing_request
+from dataflow.serving.api_llm_serving_request import APILLMServing_request
 
 api_llm_serving = APILLMServing_request(
     api_url="your_api_url",
+    key_name_of_api_key="YOUR_API_KEY",
     model_name="model_name",
     max_workers=5
 )
 ```
-
 You can predefine the storage parameters for an operator as follows:
 
 ```python
@@ -57,7 +57,7 @@ Regarding parameter passing, the constructor of an operator object primarily rec
 
 ## Detailed Operator Descriptions
 
-### 1\. Doc2Query
+### 1\. RAREDoc2QueryGenerator
 
 **Functional Description**
 
@@ -77,9 +77,9 @@ This operator is the first step in the RARE data generation workflow. It utilize
 **Usage Example**
 
 ```python
-from dataflow.operators.generate.RARE import Doc2Query
+from dataflow.operators.rare import RAREDoc2QueryGenerator
 
-doc2query_step = Doc2Query(llm_serving=api_llm_serving)
+doc2query_step = RAREDoc2QueryGenerator(llm_serving=api_llm_serving)
 doc2query_step.run(
     storage=self.storage.step(),
     input_key="text",
@@ -88,7 +88,7 @@ doc2query_step.run(
 )
 ```
 
-### 2\. BM25HardNeg
+### 2\. RAREBM25HardNegGenerator
 
 **Functional Description**
 
@@ -96,7 +96,7 @@ This operator uses the classic BM25 algorithm to retrieve and select the most re
 
 **Dependency Installation**
 
-The BM25HardNeg operator depends on pyserini, gensim, and JDK. The configuration method for Linux is as follows:
+The RAREBM25HardNegGenerator operator depends on pyserini, gensim, and JDK. The configuration method for Linux is as follows:
 ```Bash
 sudo apt install openjdk-21-jdk
 pip install pyserini gensim
@@ -116,9 +116,9 @@ pip install pyserini gensim
 **Usage Example**
 
 ```python
-from dataflow.operators.generate.RARE import BM25HardNeg
+from dataflow.operators.rare import RAREBM25HardNegGenerator
 
-bm25hardneg_step = BM25HardNeg()
+bm25hardneg_step = RAREBM25HardNegGenerator()
 bm25hardneg_step.run(
     storage=self.storage.step(),
     input_question_key="question",
@@ -128,11 +128,11 @@ bm25hardneg_step.run(
 )
 ```
 
-### 3\. ReasonDistill
+### 3\. RAREReasonDistillGenerator
 
 **Functional Description**
 
-This operator is the core implementation of the RARE paradigm. It integrates the question and scenario generated by `Doc2Query`, the original positive document, and the hard negatives mined by `BM25HardNeg` to construct a complex context. It then prompts a large language model (the teacher model) to generate a detailed, step-by-step reasoning process based on this context. This process aims to "distill" the teacher model's domain thinking patterns and generate data for training a student model, teaching it how to perform contextualized reasoning rather than relying on parameterized knowledge.
+This operator is the core implementation of the RARE paradigm. It integrates the question and scenario generated by `RAREDoc2QueryGenerator`, the original positive document, and the hard negatives mined by `RAREBM25HardNegGenerator` to construct a complex context. It then prompts a large language model (the teacher model) to generate a detailed, step-by-step reasoning process based on this context. This process aims to "distill" the teacher model's domain thinking patterns and generate data for training a student model, teaching it how to perform contextualized reasoning rather than relying on parameterized knowledge.
 
 **Input Parameters**
 
@@ -149,9 +149,9 @@ This operator is the core implementation of the RARE paradigm. It integrates the
 **Usage Example**
 
 ```python
-from dataflow.operators.generate.RARE import ReasonDistill
+from dataflow.operators.rare import RAREReasonDistillGenerator
 
-reasondistill_step = ReasonDistill(llm_serving=api_llm_serving)
+reasondistill_step = RAREReasonDistillGenerator(llm_serving=api_llm_serving)
 reasondistill_step.run(
     storage=self.storage.step(),
     input_text_key="text",
diff --git a/docs/en/notes/guide/pipelines/RAREPipeline.md b/docs/en/notes/guide/pipelines/RAREPipeline.md
index 05553cd50..81dba7371 100644
--- a/docs/en/notes/guide/pipelines/RAREPipeline.md
+++ b/docs/en/notes/guide/pipelines/RAREPipeline.md
@@ -1,7 +1,7 @@
 ---
 title: RARE Data Synthesis Pipeline
 icon: game-icons:great-pyramid
-createTime: 2025/07/04 15:40:18
+createTime: 2025/09/26 11:54:18
 permalink: /en/guide/rare_pipeline/
 ---
 
@@ -17,7 +17,7 @@
 This pipeline can generate high-quality, knowledge- and reasoning-intensive training data from a given set of documents, enabling even lightweight models to achieve top-tier performance, potentially surpassing large models like GPT-4 and DeepSeek-R1.
 
 ### Dependency Installation
-The `BM25HardNeg` operator in `RAREPipeline` depends on `pyserini`, `gensim`, and `JDK`. The configuration method for Linux is as follows:
+The `RAREBM25HardNegGenerator` operator in `RAREPipeline` depends on `pyserini`, `gensim`, and `JDK`. The configuration method for Linux is as follows:
 ```bash
 sudo apt install openjdk-21-jdk
 pip install pyserini gensim
@@ -44,9 +44,9 @@ self.storage = FileStorage(
 )
 ```
 
-### 2\. Generate Knowledge and Reasoning-Intensive Questions (Doc2Query)
+### 2\. Generate Knowledge and Reasoning-Intensive Questions (RAREDoc2QueryGenerator)
 
-The first step in the pipeline is the **`Doc2Query`** operator. It uses an LLM to generate questions and scenarios based on the input documents that require complex reasoning to answer. These questions are designed to be independent of the original document, but the reasoning process required to answer them relies on the knowledge contained within the document.
+The first step in the pipeline is the **`RAREDoc2QueryGenerator`** operator. It uses an LLM to generate questions and scenarios based on the input documents that require complex reasoning to answer. These questions are designed to be independent of the original document, but the reasoning process required to answer them relies on the knowledge contained within the document.
 
 **Functionality:**
 
@@ -64,9 +64,9 @@ self.doc2query_step1.run(
 )
 ```
 
-### 3\. Mine Hard Negative Samples (BM25HardNeg)
+### 3\. Mine Hard Negative Samples (RAREBM25HardNegGenerator)
 
-The second step uses the **`BM25HardNeg`** operator. After generating the questions, this step utilizes the BM25 algorithm to retrieve and filter "hard negative samples" for each question from the entire dataset. These negative samples are textually similar to the "correct" document (the positive sample) but cannot be logically used to answer the question, thus increasing the challenge for the model in the subsequent reasoning step.
+The second step uses the **`RAREBM25HardNegGenerator`** operator. After generating the questions, this step utilizes the BM25 algorithm to retrieve and filter "hard negative samples" for each question from the entire dataset. These negative samples are textually similar to the "correct" document (the positive sample) but cannot be logically used to answer the question, thus increasing the challenge for the model in the subsequent reasoning step.
 
 **Functionality:**
 
@@ -85,9 +85,9 @@ self.bm25hardneg_step2.run(
 )
 ```
 
-### 4\. Distill the Reasoning Process (ReasonDistill)
+### 4\. Distill the Reasoning Process (RAREReasonDistillGenerator)
 
-The final step is the **`ReasonDistill`** operator. It combines the question, scenario, one positive sample, and multiple hard negative samples to construct a complex prompt. It then leverages a powerful "teacher" LLM (like GPT-4o) to generate a detailed, step-by-step reasoning process (Chain-of-Thought) that demonstrates how to use the provided (mixed true and false) information to arrive at the final answer.
+The final step is the **`RAREReasonDistillGenerator`** operator. It combines the question, scenario, one positive sample, and multiple hard negative samples to construct a complex prompt. It then leverages a powerful "teacher" LLM (like GPT-4o) to generate a detailed, step-by-step reasoning process (Chain-of-Thought) that demonstrates how to use the provided (mixed true and false) information to arrive at the final answer.
 
 **Functionality:**
 
@@ -115,56 +115,60 @@ self.reasondistill_step3.run(
 
 Below is the sample code for running the complete `RAREPipeline`. It executes the three steps described above in sequence, progressively transforming the original documents into high-quality training data that includes a question, a scenario, hard negative samples, and a detailed reasoning process.
 
 ```python
-from dataflow.operators.generate.RARE import (
-    Doc2Query,
-    BM25HardNeg,
-    ReasonDistill,
+from dataflow.operators.rare import (
+    RAREDoc2QueryGenerator,
+    RAREBM25HardNegGenerator,
+    RAREReasonDistillGenerator,
 )
 from dataflow.utils.storage import FileStorage
-from dataflow.llmserving import APILLMServing_request, LocalModelLLMServing
+from dataflow.serving.api_llm_serving_request import APILLMServing_request
+from dataflow.serving.local_model_llm_serving import LocalModelLLMServing_vllm
 
 class RAREPipeline():
     def __init__(self):
+
         self.storage = FileStorage(
-            first_entry_file_name="../example_data/AgenticRAGPipeline/pipeline_small_chunk.json",
+            first_entry_file_name="./dataflow/example/RAREPipeline/pipeline_small_chunk.json",
             cache_path="./cache_local",
             file_name_prefix="dataflow_cache_step",
             cache_type="json",
         )
 
-        # Use an API server as the LLM service
+        # Using an API server as the LLM service, you can switch to `LocalModelLLMServing_vllm` to use a local model.
         llm_serving = APILLMServing_request(
             api_url="https://api.openai.com/v1/chat/completions",
+            key_name_of_api_key="OPENAI_API_KEY",
             model_name="gpt-4o",
             max_workers=1
         )
 
-        self.doc2query_step1 = Doc2Query(llm_serving)
-        self.bm25hardneg_step2 = BM25HardNeg()
-        self.reasondistill_step3 = ReasonDistill(llm_serving)
-
+        self.doc2query_step1 = RAREDoc2QueryGenerator(llm_serving)
+        self.bm25hardneg_step2 = RAREBM25HardNegGenerator()
+        self.reasondistill_step3 = RAREReasonDistillGenerator(llm_serving)
+
     def forward(self):
+
         self.doc2query_step1.run(
-            storage=self.storage.step(),
-            input_key="text",
+            storage = self.storage.step(),
+            input_key = "text",
         )
 
         self.bm25hardneg_step2.run(
-            storage=self.storage.step(),
-            input_question_key="question",
-            input_text_key="text",
-            output_negatives_key="hard_negatives",
+            storage = self.storage.step(),
+            input_question_key = "question",
+            input_text_key = "text",
+            output_negatives_key = "hard_negatives",
        )
 
         self.reasondistill_step3.run(
-            storage=self.storage.step(),
-            input_text_key="text",
-            input_question_key="question",
-            input_scenario_key="scenario",
-            input_hardneg_key="hard_negatives",
-            output_key="reasoning",
+            storage= self.storage.step(),
+            input_text_key = "text",
+            input_question_key = "question",
+            input_scenario_key = "scenario",
+            input_hardneg_key = "hard_negatives",
+            output_key= "reasoning",
         )
-
+
 if __name__ == "__main__":
     model = RAREPipeline()
     model.forward()
diff --git a/docs/zh/notes/guide/domain_specific_operators/rare_operators.md b/docs/zh/notes/guide/domain_specific_operators/rare_operators.md
index e7eca0aab..994464292 100644
--- a/docs/zh/notes/guide/domain_specific_operators/rare_operators.md
+++ b/docs/zh/notes/guide/domain_specific_operators/rare_operators.md
@@ -1,6 +1,6 @@
 ---
 title: RARE算子
-createTime: 2025/06/24 11:43:42
+createTime: 2025/09/26 11:44:42
 permalink: /zh/guide/RARE_operators/
 ---
 
@@ -28,19 +28,19 @@ RARE 算子流程通过三个核心步骤,系统性地生成用于推理能力
 
-      Doc2Query✨
+      RAREDoc2QueryGenerator✨
       问题生成
       基于原始文档,生成需要复杂推理才能解答的问题和相应场景。
       ReasonIR: Training Retrievers for Reasoning Tasks
 
-      BM25HardNeg✨
+      RAREBM25HardNegGenerator✨
       困难负例挖掘
       为生成的问题挖掘文本相似但语义不相关的困难负样本,构建具有挑战性的检索上下文。
       ReasonIR: Training Retrievers for Reasoning Tasks
 
-      ReasonDistill🚀
+      RAREReasonDistillGenerator🚀
       推理过程生成
       结合问题、正负文档,提示大语言模型生成详尽的推理过程,以“蒸馏”其领域思维模式。
       RARE: Retrieval-Augmented Reasoning Modeling
@@ -53,10 +53,11 @@ RARE 算子流程通过三个核心步骤,系统性地生成用于推理能力
 对于指定存储路径或调用模型的算子,我们提供了封装好的**模型接口**和**存储对象接口**。你可以通过如下方式为算子预定义模型 API 参数:
 
 ```python
-from dataflow.llmserving import APILLMServing_request
+from dataflow.serving.api_llm_serving_request import APILLMServing_request
 
 api_llm_serving = APILLMServing_request(
     api_url="your_api_url",
+    key_name_of_api_key="YOUR_API_KEY",
     model_name="model_name",
     max_workers=5
 )
@@ -64,10 +65,10 @@ api_llm_serving = APILLMServing_request(
 
 你可以通过如下方式为算子预定义存储参数:
 
-```
+```python
 from dataflow.utils.storage import FileStorage
 
-    self.storage = FileStorage(
+self.storage = FileStorage(
     first_entry_file_name="your_file_path",
     cache_path="./cache",
     file_name_prefix="dataflow_cache_step",
@@ -75,13 +76,13 @@ from dataflow.utils.storage import FileStorage
 )
 ```
 
-下文中的 `api_llm_serving` 和 `self.storage` 即为此处定义的接口对象。完整的使用示例可见 `rare_pipeline.py`。
+下文中的 `api_llm_serving` 和 `self.storage` 即为此处定义的接口对象。完整的使用示例可见 `test_rare.py`。
 
 参数传递方面,算子对象的构造函数主要传递与算子配置相关的信息(如 `llm_serving` 实例),可一次配置多次调用;而 `X.run()` 函数则传递与 IO 相关的 `key` 信息和运行时参数。具体细节可见下方算子描述示例。
 
 ## 算子详细说明
 
-### 1. Doc2Query
+### 1. RAREDoc2QueryGenerator
 
 **功能描述**
 
@@ -100,10 +101,10 @@
 
 **使用示例**
 
-```
-from dataflow.operators.generate.RARE import Doc2Query
+```python
+from dataflow.operators.rare import RAREDoc2QueryGenerator
 
-doc2query_step = Doc2Query(llm_serving=api_llm_serving)
+doc2query_step = RAREDoc2QueryGenerator(llm_serving=api_llm_serving)
 doc2query_step.run(
     storage = self.storage.step(),
     input_key = "text",
@@ -112,7 +113,7 @@ doc2query_step.run(
 )
 ```
 
-### 2. BM25HardNeg
+### 2. RAREBM25HardNegGenerator
 
 **功能描述**
 
@@ -120,7 +121,7 @@ doc2query_step.run(
 
 **依赖安装**
 
-BM25HardNeg算子依赖于pyserini, gensim和JDK。Linux配置方法如下:
+RAREBM25HardNegGenerator算子依赖于pyserini, gensim和JDK。Linux配置方法如下:
 ```Bash
 sudo apt install openjdk-21-jdk
 pip install pyserini gensim
@@ -139,10 +140,10 @@ pip install pyserini gensim
 
 **使用示例**
 
-```
-from dataflow.operators.generate.RARE import BM25HardNeg
+```python
+from dataflow.operators.rare import RAREBM25HardNegGenerator
 
-bm25hardneg_step = BM25HardNeg()
+bm25hardneg_step = RAREBM25HardNegGenerator()
 bm25hardneg_step.run(
     storage = self.storage.step(),
     input_question_key = "question",
@@ -152,11 +153,11 @@ bm25hardneg_step.run(
 )
 ```
 
-### 3. ReasonDistill
+### 3. RAREReasonDistillGenerator
 
 **功能描述**
 
-该算子是 RARE 范式的核心实现。它将 `Doc2Query` 生成的问题和场景、原始的正面文档以及 `BM25HardNeg` 挖掘出的困难负例整合在一起,构建一个复杂的上下文。然后,它提示大语言模型(教师模型)基于此上下文生成一个详尽的、分步的推理过程。这个过程旨在“蒸馏”出大模型的领域思维模式(domain thinking),并生成用于训练学生模型的数据,使其学会如何进行上下文推理(contextualized reasoning)而非依赖参数化知识。
+该算子是 RARE 范式的核心实现。它将 `RAREDoc2QueryGenerator` 生成的问题和场景、原始的正面文档以及 `RAREBM25HardNegGenerator` 挖掘出的困难负例整合在一起,构建一个复杂的上下文。然后,它提示大语言模型(教师模型)基于此上下文生成一个详尽的、分步的推理过程。这个过程旨在“蒸馏”出大模型的领域思维模式(domain thinking),并生成用于训练学生模型的数据,使其学会如何进行上下文推理(contextualized reasoning)而非依赖参数化知识。
 
 **输入参数**
 
@@ -172,10 +173,10 @@ bm25hardneg_step.run(
 
 **使用示例**
 
-```
-from dataflow.operators.generate.RARE import ReasonDistill
+```python
+from dataflow.operators.rare import RAREReasonDistillGenerator
 
-reasondistill_step = ReasonDistill(llm_serving=api_llm_serving)
+reasondistill_step = RAREReasonDistillGenerator(llm_serving=api_llm_serving)
 reasondistill_step.run(
     storage = self.storage.step(),
     input_text_key = "text",
diff --git a/docs/zh/notes/guide/pipelines/RAREPipeline.md b/docs/zh/notes/guide/pipelines/RAREPipeline.md
index f01070311..2b69d8263 100644
--- a/docs/zh/notes/guide/pipelines/RAREPipeline.md
+++ b/docs/zh/notes/guide/pipelines/RAREPipeline.md
@@ -1,7 +1,7 @@
 ---
 title: RARE数据合成流水线
 icon: game-icons:great-pyramid
-createTime: 2025/07/04 14:35:31
+createTime: 2025/09/26 11:51:31
 permalink: /zh/guide/rare_pipeline/
 ---
 
@@ -16,7 +16,7 @@ permalink: /zh/guide/rare_pipeline/
 该流程可以从给定的文档中,生成高质量的、知识和推理密集型的训练数据,使轻量级模型也能实现顶尖的性能,甚至超越像 GPT-4 和 DeepSeek-R1 这样的大型模型。
 
 ### 依赖安装
-RAREPipeline中的`BM25HardNeg`算子依赖于`pyserini`, `gensim`和`JDK`。Linux配置方法如下:
+RAREPipeline中的`RAREBM25HardNegGenerator`算子依赖于`pyserini`, `gensim`和`JDK`。Linux配置方法如下:
 ```bash
 sudo apt install openjdk-21-jdk
 pip install pyserini gensim
@@ -43,9 +43,9 @@ self.storage = FileStorage(
 )
 ```
 
-### 2. **生成知识和推理密集型问题 (Doc2Query)**
+### 2. **生成知识和推理密集型问题 (RAREDoc2QueryGenerator)**
 
-流程的第一步是使用 **`Doc2Query`** 算子。它会根据输入的文档,利用大语言模型(LLM)生成需要复杂推理才能回答的问题和场景。这些问题被设计为独立于原始文档,但答案的推理过程需要文档中的知识作为支撑。
+流程的第一步是使用 **`RAREDoc2QueryGenerator`** 算子。它会根据输入的文档,利用大语言模型(LLM)生成需要复杂推理才能回答的问题和场景。这些问题被设计为独立于原始文档,但答案的推理过程需要文档中的知识作为支撑。
 
 **功能:**
 
@@ -63,9 +63,9 @@ self.doc2query_step1.run(
 )
 ```
 
-### 3. **挖掘困难负样本 (BM25HardNeg)**
+### 3. **挖掘困难负样本 (RAREBM25HardNegGenerator)**
 
-流程的第二步是使用 **`BM25HardNeg`** 算子。在生成了问题之后,这一步利用 BM25 算法为每个问题从整个数据集中检索并筛选出“困难负样本”。这些负样本在文本上与“正确”的文档(正样本)相似,但在逻辑上无法用于回答问题,从而增加了模型在后续推理步骤中的挑战。
+流程的第二步是使用 **`RAREBM25HardNegGenerator`** 算子。在生成了问题之后,这一步利用 BM25 算法为每个问题从整个数据集中检索并筛选出“困难负样本”。这些负样本在文本上与“正确”的文档(正样本)相似,但在逻辑上无法用于回答问题,从而增加了模型在后续推理步骤中的挑战。
 
 **功能:**
 
@@ -84,9 +84,9 @@ self.bm25hardneg_step2.run(
 )
 ```
 
-### 4. **蒸馏推理过程 (ReasonDistill)**
+### 4. **蒸馏推理过程 (RAREReasonDistillGenerator)**
 
-流程的最后一步是 **`ReasonDistill`** 算子。它将问题、场景、一个正样本和多个困难负样本组合在一起,构建一个复杂的提示(Prompt)。然后,它利用一个强大的“教师”LLM(如 GPT-4o)来生成一个详细的、分步的推理过程(Chain-of-Thought),展示如何利用提供的(真假混合的)信息来最终回答问题。
+流程的最后一步是 **`RAREReasonDistillGenerator`** 算子。它将问题、场景、一个正样本和多个困难负样本组合在一起,构建一个复杂的提示(Prompt)。然后,它利用一个强大的“教师”LLM(如 GPT-4o)来生成一个详细的、分步的推理过程(Chain-of-Thought),展示如何利用提供的(真假混合的)信息来最终回答问题。
 
 **功能:**
 
@@ -113,36 +113,38 @@ self.reasondistill_step3.run(
 
 以下是运行完整 `RAREPipeline` 的示例代码。它依次执行上述三个步骤,将原始文档逐步转化为包含问题、场景、困难负样本和详细推理过程的高质量训练数据。
 
-```
-from dataflow.operators.generate.RARE import (
-    Doc2Query,
-    BM25HardNeg,
-    ReasonDistill,
+```python
+from dataflow.operators.rare import (
+    RAREDoc2QueryGenerator,
+    RAREBM25HardNegGenerator,
+    RAREReasonDistillGenerator,
 )
 from dataflow.utils.storage import FileStorage
-from dataflow.llmserving import APILLMServing_request, LocalModelLLMServing
+from dataflow.serving.api_llm_serving_request import APILLMServing_request
+from dataflow.serving.local_model_llm_serving import LocalModelLLMServing_vllm
 
 class RAREPipeline():
     def __init__(self):
         self.storage = FileStorage(
-            first_entry_file_name="../example_data/AgenticRAGPipeline/pipeline_small_chunk.json",
+            first_entry_file_name="./dataflow/example/RAREPipeline/pipeline_small_chunk.json",
             cache_path="./cache_local",
             file_name_prefix="dataflow_cache_step",
             cache_type="json",
         )
 
-        # 使用 API 服务器作为 LLM 服务
+        # 使用 API 服务器作为 LLM 服务,可以修改为LocalModelLLMServing_vllm以使用本地模型
         llm_serving = APILLMServing_request(
             api_url="https://api.openai.com/v1/chat/completions",
+            key_name_of_api_key="OPENAI_API_KEY",
             model_name="gpt-4o",
             max_workers=1
         )
 
-        self.doc2query_step1 = Doc2Query(llm_serving)
-        self.bm25hardneg_step2 = BM25HardNeg()
-        self.reasondistill_step3 = ReasonDistill(llm_serving)
-
+        self.doc2query_step1 = RAREDoc2QueryGenerator(llm_serving)
+        self.bm25hardneg_step2 = RAREBM25HardNegGenerator()
+        self.reasondistill_step3 = RAREReasonDistillGenerator(llm_serving)
+
     def forward(self):
@@ -165,7 +167,7 @@ class RAREPipeline():
             input_hardneg_key = "hard_negatives",
             output_key= "reasoning",
         )
-
+
 if __name__ == "__main__":
     model = RAREPipeline()
     model.forward()