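A Hands-On Guide to Building Your Own Deep Research and Report Generation Agent

Build Your Own Deep Research Agent

Agentic AI systems are all the rage right now. At their core, they are simply LLMs connected to specific prompts and tools that can autonomously complete tasks for you. You can also build reliable step-by-step workflows that guide the LLM to solve problems more dependably. In February 2025, OpenAI launched Deep Research, an agent that takes a user's topic, automatically runs a large number of searches, and compiles the findings into a polished report. However, it is only available on the $200 ChatGPT Pro plan. In this guide, I will walk you through building your own deep research and report generation agent with LangGraph for less than a dollar!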

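A Brief Overview of OpenAI Deep Research

OpenAI launched Deep Research on February 2, 2025, as an additional capability within its ChatGPT product. They describe it as a new agentic capability that can carry out multi-step research on the internet for complex tasks or queries posed by the user. They claim it can accomplish in tens of minutes what would take a human many hours.

Deep Research executing a task (source: OpenAI)

Deep Research is OpenAI's current agentic AI product that can do work for you autonomously. You give it a task or topic through a prompt, and ChatGPT finds, analyzes, and synthesizes hundreds of online sources to create a comprehensive report at the level of a research analyst. It is powered by a version of the upcoming OpenAI o3 model that is optimized for web browsing and data analysis, and it uses reasoning to search, interpret, and analyze massive amounts of text, images, and PDFs on the internet before compiling a well-structured report.

There is a catch, though: it is only available with the $200 ChatGPT Pro subscription. This is where my agentic AI system comes in, since it can conduct deep research and write a polished report for less than a dollar. Let's get started!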

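Planning Agentic AI System Architecture for Deep Research and Structured Report Generation

The following figure shows the overall architecture of our system, which we will implement with LangGraph, LangChain's open-source framework for building stateful agentic systems with ease.

Deep Research and Report Generation AI Agent

The key components that power the system above are:

- A powerful Large Language Model (LLM) with strong reasoning capabilities. We use GPT-4o, which is fast and not too expensive, but you could also use LLMs such as Llama 3.2 or other open-source alternatives.
- LangGraph, for building our agentic system. It is an excellent framework for building cyclic graph-based systems that maintain state variables throughout the workflow, and it makes agentic feedback loops easy to build.
- Tavily AI, an AI-powered search engine that is well suited to web research and fetching data from websites, which powers our deep research system.

This project focuses on building a planning agent for deep research and structured report generation as an alternative to OpenAI's Deep Research. The agent follows the popular Planning Agent Design Pattern and automates the process of analyzing a user-defined topic, performing deep web research, and generating a well-structured report. The workflow is inspired by LangChain's own Report mAIstro, so full credit goes to them for the workflow: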

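1. Report Planning:

- The agent analyzes the user-provided topic and the default report template to create a custom plan for the report.
- Sections such as the introduction, key sections, and conclusion are defined based on the topic.
- A web search tool is used to collect the required information before the main sections are decided.

2. Parallel Execution of Research and Writing:

The agent uses parallel execution to efficiently perform:
- Web research: Queries are generated for each section and executed through the web search tool to retrieve up-to-date information.
- Section writing: The retrieved data is used to write the content for each section, as follows:
  - The researcher gathers relevant data from the web.
  - The section writer uses this data to generate structured content for the assigned section.

3. Formatting Completed Sections:

Once all sections are written, they are formatted to keep the report structure consistent and coherent.

4. Writing the Introduction and Conclusion:

After the main sections are written and formatted:
- The introduction and conclusion are written based on the content of the remaining sections (in parallel).
- This ensures that these sections align with the overall flow and insights of the report.

5. Final Compilation:

- All completed sections are compiled together to form the final report.
- The final output is a comprehensive and well-organized report in the style of a wiki document.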
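
The five stages above can be sketched in plain Python to make the control flow concrete. This is only an illustrative outline, not the LangGraph implementation built later in this guide: `plan_report`, `research_and_write`, and `write_final_section` are hypothetical stubs standing in for the LLM and search calls, and `asyncio.gather` is used as a stand-in for the parallel execution the agent performs.

```python
import asyncio

# Hypothetical stubs: real versions would call an LLM and a web search tool.
async def plan_report(topic: str) -> list[dict]:
    return [
        {"name": "Introduction", "research": False, "content": ""},
        {"name": f"Background on {topic}", "research": True, "content": ""},
        {"name": f"Applications of {topic}", "research": True, "content": ""},
        {"name": "Conclusion", "research": False, "content": ""},
    ]

async def research_and_write(section: dict) -> dict:
    await asyncio.sleep(0)  # placeholder for search and writing latency
    return {**section, "content": f"[researched text for {section['name']}]"}

async def write_final_section(section: dict, context: str) -> dict:
    await asyncio.sleep(0)
    return {**section, "content": f"[{section['name']} written from {len(context)} chars of context]"}

async def build_report(topic: str) -> str:
    # 1. Report planning
    sections = await plan_report(topic)
    # 2. Research and write the sections that need web research, in parallel
    researched = await asyncio.gather(
        *(research_and_write(s) for s in sections if s["research"])
    )
    # 3. Format the completed sections into one context string
    context = "\n\n".join(f"## {s['name']}\n{s['content']}" for s in researched)
    # 4. Write the introduction and conclusion from that context, in parallel
    finals = await asyncio.gather(
        *(write_final_section(s, context) for s in sections if not s["research"])
    )
    # 5. Compile everything in the originally planned order
    done = {s["name"]: s for s in [*researched, *finals]}
    return "\n\n".join(f"## {s['name']}\n{done[s['name']]['content']}" for s in sections)

report = asyncio.run(build_report("LangGraph"))
print(report)
```

Note that the introduction and conclusion only start once the researched sections exist, which mirrors the ordering described in steps 3 and 4.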

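Now let's build these components step by step using LangGraph and Tavily.

Hands-On Implementation of the Deep Research and Structured Report Generation Planning Agentic AI System

We will now implement the end-to-end workflow of our deep research report generator agentic AI system step by step, with detailed explanations, code, and outputs, following the architecture discussed in the previous section.

Install Dependencies

We start by installing the dependencies we will use to build the system. These include langchain, LangGraph, and rich, which is used to render nicely formatted markdown reports.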

!pip install langchain==0.3.14
!pip install langchain-openai==0.3.0
!pip install langchain-community==0.3.14
!pip install langgraph==0.2.64
!pip install rich
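
After installation, it can be useful to confirm which versions were actually picked up. The following check is a small optional sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Report the version of each package pinned in the install commands above
for name in ["langchain", "langchain-openai", "langchain-community", "langgraph", "rich"]:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```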

Enter the OpenAI API Key

We enter the OpenAI key with the getpass() function so that it is not accidentally exposed in the code.

from getpass import getpass
OPENAI_KEY = getpass('Enter Open AI API Key: ')

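Enter the Tavily Search API Key

We enter the Tavily Search key with the getpass() function so that it is not accidentally exposed in the code. You can obtain a key from Tavily, which also offers a free tier.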

TAVILY_API_KEY = getpass('Enter Tavily Search API Key: ')

Set Environment Variables

Next, we set some system environment variables that will be used later to authenticate the LLM and Tavily Search.

import os
os.environ['OPENAI_API_KEY'] = OPENAI_KEY
os.environ['TAVILY_API_KEY'] = TAVILY_API_KEY
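
A common variation, sketched here as an assumption rather than as part of the original notebook, is to reuse a key that is already present in the environment and to prompt only when it is missing. The helper name `ensure_api_key` and the `DEMO_API_KEY` variable are hypothetical.

```python
import os
from getpass import getpass

def ensure_api_key(var_name: str, prompt: str) -> str:
    """Return the key stored in `var_name`, prompting only if it is unset."""
    key = os.environ.get(var_name)
    if not key:
        key = getpass(prompt)       # hidden input, so the key is not echoed
        os.environ[var_name] = key  # make it visible to client libraries
    return key

# Demo with a dummy value so that no interactive prompt is shown
os.environ.setdefault("DEMO_API_KEY", "dummy-value")
print(ensure_api_key("DEMO_API_KEY", "Enter demo key: "))
```

In a real session you would call it with `OPENAI_API_KEY` and `TAVILY_API_KEY` instead of the demo variable.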

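Define the Agent State Schema

We use LangGraph to build the agentic system as a graph of nodes, where each node represents a specific execution step in the overall workflow. Each specific set of operations (nodes) has its own schema, as defined below. You can customize these further to suit your own style of report generation.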

from typing_extensions import TypedDict
from pydantic import BaseModel, Field
import operator
from typing import  Annotated, List, Optional, Literal
# defines structure for each section in the report
class Section(BaseModel):
    name: str = Field(
        description="Name for a particular section of the report.",
    )
    description: str = Field(
        description="Brief overview of the main topics and concepts to be covered in this section.",
    )
    research: bool = Field(
        description="Whether to perform web search for this section of the report."
    )
    content: str = Field(
        description="The content for this section."
    )
class Sections(BaseModel):
    sections: List[Section] = Field(
        description="All the Sections of the overall report.",
    )
# defines structure for queries generated for deep research
class SearchQuery(BaseModel):
    search_query: str = Field(None, description="Query for web search.")
class Queries(BaseModel):
    queries: List[SearchQuery] = Field(
        description="List of web search queries.",
    )
# consists of input topic and output report generated
class ReportStateInput(TypedDict):
    topic: str # Report topic
class ReportStateOutput(TypedDict):
    final_report: str # Final report
# overall agent state which will be passed and updated in nodes in the graph
class ReportState(TypedDict):
    topic: str # Report topic
    sections: list[Section] # List of report sections
    completed_sections: Annotated[list, operator.add] # Send() API
    report_sections_from_research: str # completed sections to write final sections
    final_report: str # Final report
# defines the key structure for sections written using the agent 
class SectionState(TypedDict):
    section: Section # Report section
    search_queries: list[SearchQuery] # List of search queries
    source_str: str # String of formatted source content from web search
    report_sections_from_research: str # completed sections to write final sections
    completed_sections: list[Section] # Final key in outer state for Send() API
class SectionOutputState(TypedDict):
    completed_sections: list[Section] # Final key in outer state for Send() API
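
The `Annotated[list, operator.add]` annotation on `completed_sections` is what lets several parallel section writers append to the same key: LangGraph treats the second argument as a reducer that combines the existing value with each incoming update. The sketch below imitates that merge rule in plain Python so the behaviour is easy to see. `DemoState` and `apply_update` are hypothetical names written for illustration, and this is not LangGraph's actual implementation.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class DemoState(TypedDict):
    topic: str
    completed_sections: Annotated[list, operator.add]

def apply_update(state: dict, update: dict) -> dict:
    """Merge `update` into `state`, using a reducer where one is annotated."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)  # e.g. list + list
        else:
            merged[key] = value                            # plain overwrite
    return merged

state = {"topic": "LangGraph", "completed_sections": []}
updates = ({"completed_sections": ["Section A"]}, {"completed_sections": ["Section B"]})
for update in updates:
    state = apply_update(state, update)
print(state["completed_sections"])
```

Keys without a reducer, such as `topic`, are simply overwritten by the latest update, which is why only the list that collects parallel results needs the annotation.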

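Utility Functions

We define a few utility functions that will help us run web search queries in parallel and format the results obtained from the web.

1. run_search_queries(…)

This function asynchronously runs Tavily search queries for a given list of queries and returns the search results. Because it is asynchronous, it is non-blocking and the searches can run in parallel.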

import asyncio
from typing import Any, Dict, List, Union
from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper
# SearchQuery below refers to the Pydantic model defined in the state schema.
# It is deliberately not redefined here: a second class with the same name
# would shadow that model and make isinstance checks on LLM output fail.
tavily_search = TavilySearchAPIWrapper()
async def run_search_queries(
    search_queries: List[Union[str, SearchQuery]],
    num_results: int = 5,
    include_raw_content: bool = False
) -> List[Dict]:
    search_tasks = []
    for query in search_queries:
        # Handle both plain strings and SearchQuery objects,
        # just in case the LLM fails to generate queries as:
        # class SearchQuery(BaseModel):
        #     search_query: str
        query_str = query.search_query if hasattr(query, "search_query") else str(query)
        try:
            # get results from tavily async (in parallel) for each search query
            search_tasks.append(
                tavily_search.raw_results_async(
                    query=query_str,
                    max_results=num_results,
                    search_depth='advanced',
                    include_answer=False,
                    include_raw_content=include_raw_content
                )
            )
        except Exception as e:
            print(f"Error creating search task for query '{query_str}': {e}")
            continue
    # Execute all searches concurrently and await results
    try:
        if not search_tasks:
            return []
        search_docs = await asyncio.gather(*search_tasks, return_exceptions=True)
        # Filter out any exceptions from the results
        valid_results = [
            doc for doc in search_docs
            if not isinstance(doc, Exception)
        ]
        return valid_results
    except Exception as e:
        print(f"Error during search queries: {e}")
        return []
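
To see why `return_exceptions=True` matters, the following self-contained sketch replaces the Tavily call with a hypothetical `fake_search` coroutine that fails for one query. With this flag, `asyncio.gather` returns the exception object in place of the failed result rather than raising it, so the remaining results can still be kept.

```python
import asyncio

async def fake_search(query: str) -> dict:
    """Hypothetical stand-in for a web search call."""
    await asyncio.sleep(0.01)
    if not query.strip():
        raise ValueError("empty query")
    return {"query": query, "results": [{"url": f"https://example.com/{query}"}]}

async def search_all(queries: list[str]) -> list[dict]:
    tasks = [fake_search(q) for q in queries]
    # Failed tasks come back as exception objects instead of aborting the batch
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    return [doc for doc in outcomes if not isinstance(doc, Exception)]

docs = asyncio.run(search_all(["langgraph", " ", "tavily"]))
print([doc["query"] for doc in docs])
```

The blank query raises inside its own task and is filtered out, while the other two results are returned in the order the queries were submitted.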

2. format_search_query_results(…)

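This function extracts the context from the Tavily search results, makes sure content from the same URL is not duplicated, and formats it to show the source, URL, and relevant content (and, optionally, the raw content, which can be truncated to a given number of tokens).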

import tiktoken
from typing import List, Dict, Union, Any
def format_search_query_results(
    search_response: Union[Dict[str, Any], List[Any]],
    max_tokens: int = 2000,
    include_raw_content: bool = False
) -> str:
    encoding = tiktoken.encoding_for_model("gpt-4")
    sources_list = []
    # Handle different response formats if search results is a dict
    if isinstance(search_response, dict):
        if 'results' in search_response:
            sources_list.extend(search_response['results'])
        else:
            sources_list.append(search_response)
    # if search results is a list
    elif isinstance(search_response, list):
        for response in search_response:
            if isinstance(response, dict):
                if 'results' in response:
                    sources_list.extend(response['results'])
                else:
                    sources_list.append(response)
            elif isinstance(response, list):
                sources_list.extend(response)
    if not sources_list:
        return "No search results found."
    # Deduplicate by URL and keep unique sources (website urls)
    unique_sources = {}
    for source in sources_list:
        if isinstance(source, dict) and 'url' in source:
            if source['url'] not in unique_sources:
                unique_sources[source['url']] = source
    # Format output
    formatted_text = "Content from web search:\n\n"
    for i, source in enumerate(unique_sources.values(), 1):
        formatted_text += f"Source {source.get('title', 'Untitled')}:\n===\n"
        formatted_text += f"URL: {source['url']}\n===\n"
        formatted_text += f"Most relevant content from source: {source.get('content', 'No content available')}\n===\n"
        if include_raw_content:
            # truncate raw webpage content to a certain number of tokens to prevent exceeding LLM max token window
            raw_content = source.get("raw_content", "")
            if raw_content:
                tokens = encoding.encode(raw_content)
                truncated_tokens = tokens[:max_tokens]
                truncated_content = encoding.decode(truncated_tokens)
                formatted_text += f"Raw Content: {truncated_content}\n\n"
    return formatted_text.strip()

We can test that these functions work as expected, as follows:

docs = await run_search_queries(['langgraph'], include_raw_content=True)
output = format_search_query_results(
    docs, max_tokens=500, include_raw_content=True
)
print(output)

Output

Content from web search:

Source Introduction - GitHub Pages:
===
URL: https://langchain-ai.github.io/langgraphjs/
===
Most relevant content from source: Overview¶. LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows......
===
Raw Content: 🦜🕸️LangGraph.js¶⚡ Building language agents as graphs ⚡Looking for the Python version? Clickhere ( docs).Overview......

Source LangGraph - GitHub Pages:
===
URL: https://langchain-ai.github.io/langgraph/
===
Most relevant content from source: Overview¶. LangGraph is a library for building stateful, multi-actor applications with LLMs, ......
===
Raw Content: 🦜🕸️LangGraph¶⚡ Building language agents as graphs ⚡NoteLooking for the JS version? See the JS repo and the JS docs.Overview¶LangGraph is a library for buildingstateful, multi-actor applications with LLMs, ......
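
The two core ideas in format_search_query_results, keeping only the first result for each URL and capping the length of the raw content, can be tried without any API calls. In the sketch below the sample results are made up, the helper names are hypothetical, and truncation is done on whitespace-separated words purely as a stand-in for the `tiktoken` token-level truncation used in the real function.

```python
def dedupe_by_url(sources: list[dict]) -> list[dict]:
    """Keep the first occurrence of each URL, preserving order."""
    unique = {}
    for source in sources:
        url = source.get("url")
        if url and url not in unique:
            unique[url] = source
    return list(unique.values())

def truncate_words(text: str, max_words: int) -> str:
    """Word-based stand-in for token-based truncation."""
    return " ".join(text.split()[:max_words])

sample = [
    {"url": "https://example.com/a", "raw_content": "one two three four five"},
    {"url": "https://example.com/b", "raw_content": "six seven"},
    {"url": "https://example.com/a", "raw_content": "duplicate entry"},
]
for source in dedupe_by_url(sample):
    print(source["url"], "->", truncate_words(source["raw_content"], 3))
```

A dictionary keyed by URL is enough here because Python dictionaries preserve insertion order, so the first result seen for each URL is the one that survives.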

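Create the Default Report Template

This is the starting point from which the LLM learns how to build a general report; it uses it as a guideline to create a custom report structure based on the topic. Remember that this is not the final report structure but rather a prompt that guides the agent.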

# Structure Guideline
DEFAULT_REPORT_STRUCTURE = """The report structure should focus on breaking-down the user-provided topic
and building a comprehensive report in markdown using the following format:
1. Introduction (no web search needed)
- Brief overview of the topic area
2. Main Body Sections:
- Each section should focus on a sub-topic of the user-provided topic
- Include any key concepts and definitions
- Provide real-world examples or case studies where applicable
3. Conclusion (no web search needed)
- Aim for 1 structural element (either a list or table) that distills the main body sections
- Provide a concise summary of the report
When generating the final response in markdown, if there are special characters in the text,
such as the dollar symbol, ensure they are escaped properly for correct rendering e.g $25.5 should become \\$25.5
"""

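Instruction Prompts for the Report Planner

There are two main instruction prompts:

1. REPORT_PLAN_QUERY_GENERATOR_PROMPT

This prompt helps the LLM generate an initial list of questions based on the topic, so that more information about the topic can be gathered from the web to plan the overall sections and structure of the report.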

REPORT_PLAN_QUERY_GENERATOR_PROMPT = """You are an expert technical report writer, helping to plan a report.
The report will be focused on the following topic:
{topic}
The report structure will follow these guidelines:
{report_organization}
Your goal is to generate {number_of_queries} search queries that will help gather comprehensive information for planning the report sections.
The query should:
1. Be related to the topic
2. Help satisfy the requirements specified in the report organization
Make the query specific enough to find high-quality, relevant sources while covering the depth and breadth needed for the report structure.
"""

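2. REPORT_PLAN_SECTION_GENERATOR_PROMPT

Here we give the LLM the default report template, the topic name, and the search results from the initial queries so that it can create a detailed structure for the report. The LLM generates a structured response with the following fields for each major section of the report (this is only the report structure; no content is created at this step):

- Name: The name of this section of the report.
- Description: A brief overview of the main topics and concepts to be covered in this section.
- Research: Whether to perform a web search for this section of the report.
- Content: The content of the section, which is left blank for now.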

REPORT_PLAN_SECTION_GENERATOR_PROMPT = """You are an expert technical report writer, helping to plan a report.
Your goal is to generate the outline of the sections of the report.
The overall topic of the report is:
{topic}
The report should follow this organizational structure:
{report_organization}
You should reflect on this additional context information from web searches to plan the main sections of the report:
{search_context}
Now, generate the sections of the report. Each section should have the following fields:
- Name - Name for this section of the report.
- Description - Brief overview of the main topics and concepts to be covered in this section.
- Research - Whether to perform web search for this section of the report or not.
- Content - The content of the section, which you will leave blank for now.
Consider which sections require web search.
For example, introduction and conclusion will not require research because they will distill information from other parts of the report.
"""

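Report Planner Node Function

We will now build the logic of the report planner node, whose purpose is to create a structured custom report template, with the names and descriptions of the main sections, based on the user's input topic and the default report template guidelines.

This function uses the two prompts created earlier to:

- First, generate some queries based on the user topic
- Search the web to get some information on those queries
- Use this information to generate the overall structure of the report, with the key sections that need to be created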

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
llm = ChatOpenAI(model_name="gpt-4o", temperature=0)
async def generate_report_plan(state: ReportState):
    """Generate the overall plan for building the report"""
    topic = state["topic"]
    print('--- Generating Report Plan ---')
    report_structure = DEFAULT_REPORT_STRUCTURE
    number_of_queries = 8
    structured_llm = llm.with_structured_output(Queries)
    system_instructions_query = REPORT_PLAN_QUERY_GENERATOR_PROMPT.format(
        topic=topic,
        report_organization=report_structure,
        number_of_queries=number_of_queries
    )
    try:
        # Generate queries
        results = structured_llm.invoke([
            SystemMessage(content=system_instructions_query),
            HumanMessage(content='Generate search queries that will help with planning the sections of the report.')
        ])
        # Convert SearchQuery objects to strings. Checking for the attribute
        # works no matter which SearchQuery class the object was built from.
        query_list = [
            query.search_query if hasattr(query, "search_query") else str(query)
            for query in results.queries
        ]
        # Search web and ensure we wait for results
        search_docs = await run_search_queries(
            query_list,
            num_results=5,
            include_raw_content=False
        )
        if not search_docs:
            print("Warning: No search results returned")
            search_context = "No search results available."
        else:
            search_context = format_search_query_results(
                search_docs,
                include_raw_content=False
            )
        # Generate sections
        system_instructions_sections = REPORT_PLAN_SECTION_GENERATOR_PROMPT.format(
            topic=topic,
            report_organization=report_structure,
            search_context=search_context
        )
        structured_llm = llm.with_structured_output(Sections)
        # The fields listed here match the Section schema defined earlier
        report_sections = structured_llm.invoke([
            SystemMessage(content=system_instructions_sections),
            HumanMessage(content="Generate the sections of the report. Your response must include a 'sections' field containing a list of sections. Each section must have: name, description, research, and content fields.")
        ])
        print('--- Generating Report Plan Completed ---')
        return {"sections": report_sections.sections}
    except Exception as e:
        print(f"Error in generate_report_plan: {e}")
        return {"sections": []}

from langchain_openai import ChatOpenAI from langchain_core.messages import HumanMessage, SystemMessage llm = ChatOpenAI(model_name="gpt-4o", temperature=0) async def generate_report_plan(state: ReportState): """Generate the overall plan for building the report""" topic = state["topic"] print('--- Generating Report Plan ---') report_structure = DEFAULT_REPORT_STRUCTURE number_of_queries = 8 structured_llm = llm.with_structured_output(Queries) system_instructions_query = REPORT_PLAN_QUERY_GENERATOR_PROMPT.format( topic=topic, report_organization=report_structure, number_of_queries=number_of_queries ) try: # Generate queries results = structured_llm.invoke([ SystemMessage(content=system_instructions_query), HumanMessage(content='Generate search queries that will help with planning the sections of the report.') ]) # Convert SearchQuery objects to strings query_list = [ query.search_query if isinstance(query, SearchQuery) else str(query) for query in results.queries ] # Search web and ensure we wait for results search_docs = await run_search_queries( query_list, num_results=5, include_raw_content=False ) if not search_docs: print("Warning: No search results returned") search_context = "No search results available." else: search_context = format_search_query_results( search_docs, include_raw_content=False ) # Generate sections system_instructions_sections = REPORT_PLAN_SECTION_GENERATOR_PROMPT.format( topic=topic, report_organization=report_structure, search_context=search_context ) structured_llm = llm.with_structured_output(Sections) report_sections = structured_llm.invoke([ SystemMessage(content=system_instructions_sections), HumanMessage(content="Generate the sections of the report. Your response must include a 'sections' field containing a list of sections. 
Each section must have: name, description, plan, research, and content fields.") ]) print('--- Generating Report Plan Completed ---') return {"sections": report_sections.sections} except Exception as e: print(f"Error in generate_report_plan: {e}") return {"sections": []}

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(model_name="gpt-4o", temperature=0)

async def generate_report_plan(state: ReportState):
    """Generate the overall plan for building the report"""
    topic = state["topic"]
    print('--- Generating Report Plan ---')

    report_structure = DEFAULT_REPORT_STRUCTURE
    number_of_queries = 8

    structured_llm = llm.with_structured_output(Queries)

    system_instructions_query = REPORT_PLAN_QUERY_GENERATOR_PROMPT.format(
        topic=topic,
        report_organization=report_structure,
        number_of_queries=number_of_queries
    )

    try:
        # Generate queries
        results = structured_llm.invoke([
            SystemMessage(content=system_instructions_query),
            HumanMessage(content='Generate search queries that will help with planning the sections of the report.')
        ])

        # Convert SearchQuery objects to strings
        query_list = [
            query.search_query if isinstance(query, SearchQuery) else str(query)
            for query in results.queries
        ]

        # Search web and ensure we wait for results
        search_docs = await run_search_queries(
            query_list,
            num_results=5,
            include_raw_content=False
        )

        if not search_docs:
            print("Warning: No search results returned")
            search_context = "No search results available."
        else:
            search_context = format_search_query_results(
                search_docs,
                include_raw_content=False
            )

        # Generate sections
        system_instructions_sections = REPORT_PLAN_SECTION_GENERATOR_PROMPT.format(
            topic=topic,
            report_organization=report_structure,
            search_context=search_context
        )

        structured_llm = llm.with_structured_output(Sections)
        report_sections = structured_llm.invoke([
            SystemMessage(content=system_instructions_sections),
            HumanMessage(content="Generate the sections of the report. Your response must include a 'sections' field containing a list of sections. Each section must have: name, description, plan, research, and content fields.")
        ])

        print('--- Generating Report Plan Completed ---')
        return {"sections": report_sections.sections}

    except Exception as e:
        print(f"Error in generate_report_plan: {e}")
        return {"sections": []}
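
To make the control flow of the planner node easier to follow, here is a minimal, dependency-free sketch of its three steps. The helpers `fake_generate_queries`, `fake_search`, and `plan_context` are hypothetical stand-ins for the LLM call, `run_search_queries`, and `format_search_query_results`. They are not part of the original code and only illustrate the query, search, and context sequence, including the fallback when no results come back.

```python
import asyncio

# Hypothetical stand-in for the structured LLM call that returns search queries.
def fake_generate_queries(topic: str, number_of_queries: int) -> list[str]:
    return [f"{topic} aspect {i}" for i in range(1, number_of_queries + 1)]

# Hypothetical stand-in for run_search_queries(); returns one result per query.
async def fake_search(queries: list[str]) -> list[dict]:
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"query": q, "snippet": f"notes about {q}"} for q in queries]

async def plan_context(topic: str, number_of_queries: int = 8) -> str:
    """Run the query -> search -> context steps and apply the empty-result fallback."""
    queries = fake_generate_queries(topic, number_of_queries)
    docs = await fake_search(queries) if queries else []
    if not docs:
        return "No search results available."
    return "\n".join(f"{d['query']}: {d['snippet']}" for d in docs)

context = asyncio.run(plan_context("LangGraph agents", number_of_queries=2))
empty_context = asyncio.run(plan_context("LangGraph agents", number_of_queries=0))
print(context)
```

In the real node, the resulting context string is what gets substituted into `REPORT_PLAN_SECTION_GENERATOR_PROMPT` as `search_context`.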

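Section Builder – Instruction Prompt for the Query Generator

There is one main instruction prompt:

1. REPORT_SECTION_QUERY_GENERATOR_PROMPT

Helps the LLM generate a comprehensive list of search queries on the topic of the specific section that needs to be built.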

REPORT_SECTION_QUERY_GENERATOR_PROMPT = """Your goal is to generate targeted web search queries that will gather comprehensive information for writing a technical report section.
Topic for this section:
{section_topic}
When generating {number_of_queries} search queries, ensure that they:
1. Cover different aspects of the topic (e.g., core features, real-world applications, technical architecture)
2. Include specific technical terms related to the topic
3. Target recent information by including year markers where relevant (e.g., "2024")
4. Look for comparisons or differentiators from similar technologies/approaches
5. Search for both official documentation and practical implementation examples
Your queries should be:
- Specific enough to avoid generic results
- Technical enough to capture detailed implementation information
- Diverse enough to cover all aspects of the section plan
- Focused on authoritative sources (documentation, technical blogs, academic papers)"""
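
The two placeholders in this prompt, `{section_topic}` and `{number_of_queries}`, are filled with `str.format` in the node function that follows. The short sketch below uses a shortened, hypothetical template (not the full prompt) to show the substitution, and how a literal brace would have to be doubled if a prompt ever contained one.

```python
# Shortened, hypothetical template; the real prompt is much longer.
TEMPLATE = """Topic for this section:
{section_topic}
When generating {number_of_queries} search queries, return JSON like {{"queries": [...]}}."""

filled = TEMPLATE.format(
    section_topic="Vector databases for retrieval",
    number_of_queries=5,
)
print(filled)
```

Passing a placeholder name that the template does not define is harmless, but omitting one that it does define raises a `KeyError`, so the keyword arguments need to match the prompt exactly.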

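Section Builder Node Function – Generate Queries (Query Generator)

This function uses the section topic and the instruction prompt above to generate search queries for finding useful information about the section topic on the web.

Query generator node function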

def generate_queries(state: SectionState):
    """ Generate search queries for a specific report section """
    # Get state
    section = state["section"]
    print('--- Generating Search Queries for Section: '+ section.name +' ---')

    # Get configuration
    number_of_queries = 5

    # Bind the Queries schema so the LLM returns structured output
    structured_llm = llm.with_structured_output(Queries)

    # Format system instructions
    system_instructions = REPORT_SECTION_QUERY_GENERATOR_PROMPT.format(
        section_topic=section.description,
        number_of_queries=number_of_queries
    )

    # Generate queries
    user_instruction = "Generate search queries on the provided topic."
    search_queries = structured_llm.invoke([
        SystemMessage(content=system_instructions),
        HumanMessage(content=user_instruction)
    ])

    print('--- Generating Search Queries for Section: '+ section.name +' Completed ---')
    return {"search_queries": search_queries.queries}

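Section Builder Node Function – Search the Web

Takes the queries that generate_queries(…) produced for a specific section, searches the web with the utility functions we defined earlier, and formats the search results.

Web researcher node function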

async def search_web(state: SectionState):
    """ Search the web for each query, then return a formatted string of the sources."""
    # Get state
    search_queries = state["search_queries"]
    print('--- Searching Web for Queries ---')

    # Web search: pass plain query strings, as generate_report_plan does
    query_list = [query.search_query for query in search_queries]
    search_docs = await run_search_queries(query_list, num_results=6, include_raw_content=True)

    # Deduplicate and format sources
    search_context = format_search_query_results(search_docs, max_tokens=4000, include_raw_content=True)

    print('--- Searching Web for Queries Completed ---')
    return {"source_str": search_context}
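
`run_search_queries` and `format_search_query_results` were defined earlier in the article. As a rough illustration of what this search step does, the sketch below runs several queries concurrently with `asyncio.gather`, drops duplicate URLs, and caps the length of each source. `fake_tavily_search`, `search_all`, and `format_sources` are hypothetical names working on canned data, and the character cap is a simplification of the `max_tokens` limit, so treat it as an approximation rather than the article's implementation.

```python
import asyncio

# Hypothetical canned results keyed by query; a real run would call a search API.
CANNED = {
    "q1": [{"url": "https://a.example", "title": "A", "content": "alpha " * 50}],
    "q2": [{"url": "https://a.example", "title": "A", "content": "alpha " * 50},
           {"url": "https://b.example", "title": "B", "content": "beta"}],
}

async def fake_tavily_search(query: str) -> list[dict]:
    await asyncio.sleep(0)  # stands in for network latency
    return CANNED.get(query, [])

async def search_all(queries: list[str]) -> list[dict]:
    """Run every query concurrently and flatten the per-query result lists."""
    per_query = await asyncio.gather(*(fake_tavily_search(q) for q in queries))
    return [doc for docs in per_query for doc in docs]

def format_sources(docs: list[dict], max_chars: int = 100) -> str:
    """Keep the first occurrence of each URL and truncate long content."""
    seen, blocks = set(), []
    for doc in docs:
        if doc["url"] in seen:
            continue
        seen.add(doc["url"])
        blocks.append(f"Source: {doc['title']} ({doc['url']})\n{doc['content'][:max_chars]}")
    return "\n\n".join(blocks)

docs = asyncio.run(search_all(["q1", "q2"]))
source_str = format_sources(docs)
print(source_str)
```

Deduplicating before formatting matters because different queries for the same section often return the same page, and repeated sources would waste the context budget passed to the section writer.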

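Section Builder – Instruction Prompt for Section Writing

There is one main instruction prompt:

1. SECTION_WRITER_PROMPT

Constrains the LLM to generate and write the content of a specific section following specific guidelines on style, structure, length, and approach. The documents retrieved from the web by the search_web(…) function are sent along with it.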

SECTION_WRITER_PROMPT = """You are an expert technical writer crafting one specific section of a technical report.

Title for the section:
{section_title}

Topic for this section:
{section_topic}

Guidelines for writing:

1. Technical Accuracy:
- Include specific version numbers
- Reference concrete metrics/benchmarks
- Cite official documentation
- Use technical terminology precisely

2. Length and Style:
- Strict 150-200 word limit
- No marketing language
- Technical focus
- Write in simple, clear language; do not use complex words unnecessarily
- Start with your most important insight in **bold**
- Use short paragraphs (2-3 sentences max)

3. Structure:
- Use ## for section title (Markdown format)
- Only use ONE structural element IF it helps clarify your point:
  * Either a focused table comparing 2-3 key items (using Markdown table syntax)
  * Or a short list (3-5 items) using proper Markdown list syntax:
    - Use `*` or `-` for unordered lists
    - Use `1.` for ordered lists
    - Ensure proper indentation and spacing
- End with ### Sources that references the below source material formatted as:
  * List each source with title, date, and URL
  * Format: `- Title : URL`

4. Writing Approach:
- Include at least one specific example or case study if available
- Use concrete details over general statements
- Make every word count
- No preamble prior to creating the section content
- Focus on your single most important point

5. Use this source material obtained from web searches to help write the section:
{context}

6. Quality Checks:
- Format should be Markdown
- Exactly 150-200 words (excluding title and sources)
- Careful use of only ONE structural element (table or bullet list) and only if it helps clarify your point
- One specific example / case study if available
- Starts with bold insight
- No preamble prior to creating the section content
- Sources cited at end
- If there are special characters in the text, such as the dollar symbol,
  ensure they are escaped properly for correct rendering e.g $25.5 should become \\$25.5
"""

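LLMs do not always respect a word limit, so it can be useful to verify the 150-200 word constraint after generation. The helper below is a hypothetical addition (`body_word_count` is not part of the original code). It assumes the section starts with a `## ` title line and ends with a `### Sources` block, as the prompt requests, and counts only the words in between.

```python
def body_word_count(section_markdown: str) -> int:
    """Count words in a section, excluding the '## ' title and the '### Sources' block."""
    body_lines = []
    for line in section_markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("### Sources"):
            break  # everything after this heading is the source list
        if stripped.startswith("## "):
            continue  # skip the section title
        body_lines.append(stripped)
    return len(" ".join(body_lines).split())

sample = """## Caching Layer
**Caching cuts median latency.** The service stores hot keys in memory.

### Sources
- Example Docs : https://example.com/docs
"""
count = body_word_count(sample)
within_limit = 150 <= count <= 200
print(count, within_limit)
```

A check like this could be used to log a warning or trigger a rewrite when a generated section falls outside the requested range.
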
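Section Builder Node Function – Write the Section (Section Writer)

Uses the SECTION_WRITER_PROMPT above, fills in the section name, description, and web search documents, and passes it to the LLM, which writes the content for that section.

Section writer node function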

def write_section(state: SectionState):
    """ Write a section of the report """
    # Get state
    section = state["section"]
    source_str = state["source_str"]
    print('--- Writing Section : '+ section.name +' ---')

    # Format system instructions
    system_instructions = SECTION_WRITER_PROMPT.format(
        section_title=section.name,
        section_topic=section.description,
        context=source_str
    )

    # Generate section
    user_instruction = "Generate a report section based on the provided sources."
    section_content = llm.invoke([
        SystemMessage(content=system_instructions),
        HumanMessage(content=user_instruction)
    ])

    # Write content to the section object
    section.content = section_content.content
    print('--- Writing Section : '+ section.name +' Completed ---')

    # Write the updated section to completed sections
    return {"completed_sections": [section]}
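
`write_section` returns its section wrapped in a list. That is convenient when the parent graph's `completed_sections` key is declared with a list-concatenating reducer such as `operator.add`, so that results from parallel sub-agent runs are appended rather than overwritten. The state definition appears earlier in the article, so the reducer is treated as an assumption here. The sketch below only demonstrates the merging behaviour in plain Python, with a hypothetical `merge_updates` helper.

```python
import operator
from functools import reduce

def merge_updates(updates: list[dict], key: str) -> list:
    """Concatenate the list stored under `key` in each partial state update."""
    return reduce(operator.add, (u[key] for u in updates), [])

# Each parallel branch returns a one-element list, as write_section does.
branch_updates = [
    {"completed_sections": ["Background"]},
    {"completed_sections": ["Architecture"]},
    {"completed_sections": ["Evaluation"]},
]
merged = merge_updates(branch_updates, "completed_sections")
print(merged)
```

If each branch returned a bare section instead of a list, concatenation would fail or produce the wrong shape, which is why the node wraps its result.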

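Create the Section Builder Sub-Agent

This agent (or, more precisely, sub-agent) will be called several times in parallel, once per section, to search the web, fetch content, and then write that section. We use LangGraph's Send construct to do this.

Section builder sub-agent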

from langgraph.graph import StateGraph, START, END
# Add nodes and edges
section_builder = StateGraph(SectionState, output=SectionOutputState)
section_builder.add_node("generate_queries", generate_queries)
section_builder.add_node("search_web", search_web)
section_builder.add_node("write_section", write_section)
section_builder.add_edge(START, "generate_queries")
section_builder.add_edge("generate_queries", "search_web")
section_builder.add_edge("search_web", "write_section")
section_builder.add_edge("write_section", END)
section_builder_subagent = section_builder.compile()
# Display the graph
from IPython.display import display, Image
Image(section_builder_subagent.get_graph().draw_mermaid_png())

Output

AI agent workflow
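
Conceptually, this sub-agent is a straight pipeline: each node receives the current state and returns a partial update that is merged into the state before the next node runs. The sketch below emulates that behaviour without LangGraph. The three node functions and `run_pipeline` are simplified, hypothetical stand-ins, not the real nodes, and the plain dictionary merge is only an approximation of how a graph framework applies updates.

```python
from typing import Callable

State = dict
Node = Callable[[State], State]

def generate_queries_stub(state: State) -> State:
    return {"search_queries": [f"{state['section']} overview"]}

def search_web_stub(state: State) -> State:
    return {"source_str": "; ".join(f"results for {q}" for q in state["search_queries"])}

def write_section_stub(state: State) -> State:
    return {"completed_sections": [f"{state['section']}: based on {state['source_str']}"]}

def run_pipeline(nodes: list[Node], state: State) -> State:
    """Run nodes in order, merging each partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state

final_state = run_pipeline(
    [generate_queries_stub, search_web_stub, write_section_stub],
    {"section": "Architecture"},
)
print(final_state["completed_sections"])
```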

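Create the Dynamic Parallelization Node Function – Parallelize Section Writing

Send(…) is used to parallelize the work: it calls section_builder_subagent once for each section that requires research, so that their content is written in parallel.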

from langgraph.constants import Send

def parallelize_section_writing(state: ReportState):
    """ This is the "map" step when we kick off web research for some sections of the report in parallel and then write the section"""
    # Kick off section writing in parallel via Send() API for any sections that require research
    return [
        Send("section_builder_with_web_search",  # name of the subagent node
             {"section": s})
        for s in state["sections"]
        if s.research
    ]
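
The list comprehension above is the "map" half of a map-reduce pattern: one task per section that needs research, with the other sections skipped. The sketch below shows the same filtering and fan-out using `asyncio.gather` in place of LangGraph's `Send`. The `Section` dataclass and the `build_section` coroutine are hypothetical stand-ins for the article's section model and sub-agent.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Section:
    name: str
    research: bool  # True when the section needs web research

async def build_section(section: Section) -> str:
    await asyncio.sleep(0)  # stands in for search + writing
    return f"written: {section.name}"

async def fan_out(sections: list[Section]) -> list[str]:
    """Start one task per research section and wait for all of them."""
    tasks = [build_section(s) for s in sections if s.research]
    return await asyncio.gather(*tasks)

sections = [
    Section("Introduction", research=False),
    Section("Architecture", research=True),
    Section("Benchmarks", research=True),
    Section("Conclusion", research=False),
]
results = asyncio.run(fan_out(sections))
print(results)
```

The introduction and conclusion are filtered out at this stage because they are written later from the completed sections rather than from web research.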

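Create the Format Sections Node Function

This is essentially the step where all the completed sections are formatted and combined into one large document.

Format sections node function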

def format_sections(sections: list[Section]) -> str:
    """ Format a list of report sections into a single text string """
    formatted_str = ""
    for idx, section in enumerate(sections, 1):
        formatted_str += f"""
{'='*60}
Section {idx}: {section.name}
{'='*60}
Description:
{section.description}
Requires Research:
{section.research}
Content:
{section.content if section.content else '[Not yet written]'}
"""
    return formatted_str

def format_completed_sections(state: ReportState):
    """ Gather completed sections from research and format them as context for writing the final sections """
    print('--- Formatting Completed Sections ---')

    # List of completed sections
    completed_sections = state["completed_sections"]

    # Format completed section to str to use as context for final sections
    completed_report_sections = format_sections(completed_sections)

    print('--- Formatting Completed Sections is Done ---')
    return {"report_sections_from_research": completed_report_sections}

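Instruction Prompt for the Final Sections

There is one main instruction prompt:

1. FINAL_SECTION_WRITER_PROMPT

Asks the LLM to generate and write the content of the introduction or the conclusion following certain guidelines on style, structure, length, and approach. The content of the sections already written is sent along with it.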

FINAL_SECTION_WRITER_PROMPT = """You are an expert technical writer crafting a section that synthesizes information from the rest of the report.
Title for the section:
{section_title}
Topic for this section:
{section_topic}
Available report content of already completed sections:
{context}
1. Section-Specific Approach:
For Introduction:
- Use # for report title (Markdown format)
- 50-100 word limit
- Write in simple and clear language
- Focus on the core motivation for the report in 1-2 paragraphs
- Use a clear narrative arc to introduce the report
- Include NO structural elements (no lists or tables)
- No sources section needed
For Conclusion/Summary:
- Use ## for section title (Markdown format)
- 100-150 word limit
- For comparative reports:
* Must include a focused comparison table using Markdown table syntax
* Table should distill insights from the report
* Keep table entries clear and concise
- For non-comparative reports:
* Only use ONE structural element IF it helps distill the points made in the report:
* Either a focused table comparing items present in the report (using Markdown table syntax)
* Or a short list using proper Markdown list syntax:
- Use `*` or `-` for unordered lists
- Use `1.` for ordered lists
- Ensure proper indentation and spacing
- End with specific next steps or implications
- No sources section needed
3. Writing Approach:
- Use concrete details over general statements
- Make every word count
- Focus on your single most important point
4. Quality Checks:
- For introduction: 50-100 word limit, # for report title, no structural elements, no sources section
- For conclusion: 100-150 word limit, ## for section title, only ONE structural element at most, no sources section
- Markdown format
- Do not include word count or any preamble in your response
- If there are special characters in the text, such as the dollar symbol,
ensure they are escaped properly for correct rendering e.g $25.5 should become \$25.5"""

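Creating the Write Final Sections Node Function

This function uses the FINAL_SECTION_WRITER_PROMPT instruction prompt defined above to write the introduction and the conclusion. It is run in parallel through the Send(…) call shown in the next step.

Write final sections node function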

def write_final_sections(state: SectionState):
    """ Write the final sections of the report, which do not require web search and use the completed sections as context"""

    # Get state
    section = state["section"]
    completed_report_sections = state["report_sections_from_research"]

    print('--- Writing Final Section: '+ section.name + ' ---')

    # Format system instructions
    system_instructions = FINAL_SECTION_WRITER_PROMPT.format(section_title=section.name,
                                                             section_topic=section.description,
                                                             context=completed_report_sections)

    # Generate section
    user_instruction = "Craft a report section based on the provided sources."
    section_content = llm.invoke([SystemMessage(content=system_instructions),
                                  HumanMessage(content=user_instruction)])

    # Write content to section
    section.content = section_content.content

    print('--- Writing Final Section: '+ section.name + ' Completed ---')

    # Write the updated section to completed sections
    return {"completed_sections": [section]}

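Since write_final_sections runs once per final section, each call returns a one-element list under completed_sections. Here is a minimal sketch, in plain Python with no LangGraph dependency, of how such partial updates can be merged. It assumes that the completed_sections state key is declared with a list-concatenating reducer such as operator.add, which is not shown in this part of the guide. The Section dataclass and the merge_updates helper are hypothetical stand-ins used only for illustration.

```python
import operator
from dataclasses import dataclass
from functools import reduce


@dataclass
class Section:
    # Hypothetical stand-in for the guide's Section model
    name: str
    content: str = ""


def merge_updates(updates):
    """Concatenate the 'completed_sections' lists returned by parallel node calls."""
    return reduce(operator.add, (u["completed_sections"] for u in updates), [])


# Each parallel write_final_sections call returns a one-element list
updates = [
    {"completed_sections": [Section("Introduction", "# Report title ...")]},
    {"completed_sections": [Section("Conclusion", "## Conclusion ...")]},
]
merged = merge_updates(updates)
print([s.name for s in merged])
```

Returning a list rather than a single object is what lets the results of several parallel branches accumulate under one key instead of overwriting each other.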

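Creating the Dynamic Parallelization Node Function: Parallelizing the Writing of the Final Sections

Send(…) is used for parallelization: it calls write_final_sections once for the introduction and once for the conclusion, so both are written in parallel.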

from langgraph.constants import Send

def parallelize_final_section_writing(state: ReportState):
    """ Write any final sections using the Send API to parallelize the process """

    # Kick off section writing in parallel via Send() API for any sections that do not require research
    return [
        Send("write_final_sections",
             {"section": s, "report_sections_from_research": state["report_sections_from_research"]})
        for s in state["sections"]
        if not s.research
    ]

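To make the fan-out logic easier to see, here is a small sketch that builds the same payloads without LangGraph. It only illustrates the filtering step, assuming a Section object with a boolean research flag as used in the function above. The Section dataclass and the final_section_payloads helper are hypothetical names, and the real function wraps each payload in Send("write_final_sections", …).

```python
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical stand-in for the guide's Section model
    name: str
    research: bool


def final_section_payloads(state):
    """Build one payload per section that needs no web research."""
    return [
        {"section": s, "report_sections_from_research": state["report_sections_from_research"]}
        for s in state["sections"]
        if not s.research
    ]


state = {
    "sections": [
        Section("Introduction", research=False),
        Section("Market Dominance", research=True),
        Section("Conclusion", research=False),
    ],
    "report_sections_from_research": "...formatted research sections...",
}
payloads = final_section_payloads(state)
print([p["section"].name for p in payloads])
```

Every payload carries the same formatted research context, so the introduction and the conclusion are both written from the full body of the report.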

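Creating the Compile Final Report Node Function

This function merges all sections of the report and compiles them into the final report document.

Compile final report node function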

def compile_final_report(state: ReportState):
    """ Compile the final report """

    # Get sections
    sections = state["sections"]
    completed_sections = {s.name: s.content for s in state["completed_sections"]}

    print('--- Compiling Final Report ---')

    # Update sections with completed content while maintaining original order
    for section in sections:
        section.content = completed_sections[section.name]

    # Compile final report
    all_sections = "\n\n".join([s.content for s in sections])

    # Escape unescaped $ symbols to display properly in Markdown
    formatted_sections = all_sections.replace("\\$", "TEMP_PLACEHOLDER")  # Temporarily mark already escaped $
    formatted_sections = formatted_sections.replace("$", "\\$")  # Escape all $
    formatted_sections = formatted_sections.replace("TEMP_PLACEHOLDER", "\\$")  # Restore originally escaped $

    # Now formatted_sections contains the properly escaped Markdown text
    print('--- Compiling Final Report Done ---')

    return {"final_report": formatted_sections}

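The three-step placeholder replacement above works, but it would misbehave if the report text itself ever contained the literal string TEMP_PLACEHOLDER. As an alternative sketch (not what the function above does), the same escaping can be done in one pass with a regular expression that uses a negative lookbehind. The escape_dollars helper name is hypothetical.

```python
import re


def escape_dollars(text: str) -> str:
    """Escape every '$' that is not already preceded by a backslash."""
    return re.sub(r"(?<!\\)\$", r"\\$", text)


sample = r"Revenue rose from $25.5B to \$30B"
print(escape_dollars(sample))
```

Because already escaped dollar signs are skipped, running the function twice on the same text gives the same result as running it once.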

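Building Our Deep Research and Report Writing Agent

We now bring together all the components and sub-agents defined so far to build our main planning agent.

Deep research and report writing agent workflow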

builder = StateGraph(ReportState, input=ReportStateInput, output=ReportStateOutput)

builder.add_node("generate_report_plan", generate_report_plan)
builder.add_node("section_builder_with_web_search", section_builder_subagent)
builder.add_node("format_completed_sections", format_completed_sections)
builder.add_node("write_final_sections", write_final_sections)
builder.add_node("compile_final_report", compile_final_report)

builder.add_edge(START, "generate_report_plan")
builder.add_conditional_edges("generate_report_plan",
                              parallelize_section_writing,
                              ["section_builder_with_web_search"])
builder.add_edge("section_builder_with_web_search", "format_completed_sections")
builder.add_conditional_edges("format_completed_sections",
                              parallelize_final_section_writing,
                              ["write_final_sections"])
builder.add_edge("write_final_sections", "compile_final_report")
builder.add_edge("compile_final_report", END)

reporter_agent = builder.compile()

# view agent structure
display(Image(reporter_agent.get_graph(xray=True).draw_mermaid_png()))

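If you cannot render the Mermaid diagram, the wiring can also be read as a simple chain of stages. The sketch below is only a plain-Python picture of the edges declared above, not LangGraph code. The EDGES dictionary and the execution_order helper are hypothetical, and the two conditional edges are shown as ordinary edges even though each one fans out into several parallel runs.

```python
# Plain-Python picture of the graph wiring; no LangGraph required.
EDGES = {
    "START": ["generate_report_plan"],
    # Conditional edge: one parallel run per section that needs research
    "generate_report_plan": ["section_builder_with_web_search"],
    "section_builder_with_web_search": ["format_completed_sections"],
    # Conditional edge: one parallel run per final section
    "format_completed_sections": ["write_final_sections"],
    "write_final_sections": ["compile_final_report"],
    "compile_final_report": ["END"],
}


def execution_order(edges, start="START"):
    """Follow the single outgoing edge of each node and return the visit order."""
    order, node = [], start
    while node in edges:
        node = edges[node][0]
        order.append(node)
    return order


print(" -> ".join(execution_order(EDGES)))
```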

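Output

Deep research agent workflow diagram

We can now run and test our agentic system!

Running and Testing Our Deep Research and Report Writing Agent

Finally, let's test our deep research and report writing agent! We will create a simple function that streams progress in real time and then displays the final report. Once the agent is working, I recommend turning off all the intermediate print messages!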

from IPython.display import display
from rich.console import Console
from rich.markdown import Markdown as RichMarkdown

async def call_planner_agent(agent, prompt, config={"recursion_limit": 50}, verbose=False):
    events = agent.astream(
        {'topic' : prompt},
        config,
        stream_mode="values",
    )

    async for event in events:
        for k, v in event.items():
            if verbose:
                if k != "__end__":
                    display(RichMarkdown(repr(k) + ' -> ' + repr(v)))
            if k == 'final_report':
                print('='*50)
                print('Final Report:')
                md = RichMarkdown(v)
                display(md)

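The function above consumes a stream of state snapshots and reacts when the final_report key appears. To show that pattern in isolation, here is a small sketch that uses only the standard library. The fake_astream generator is a hypothetical stand-in for agent.astream(…, stream_mode="values"), and collect_final_report is an illustrative helper, not part of the agent.

```python
import asyncio


async def fake_astream(topic):
    """Hypothetical stand-in for agent.astream(..., stream_mode="values")."""
    yield {"topic": topic}
    yield {"topic": topic, "sections": ["Introduction", "Conclusion"]}
    yield {"topic": topic, "final_report": "# " + topic}


async def collect_final_report(events):
    """Return the last 'final_report' value seen in a stream of state snapshots."""
    final_report = None
    async for event in events:
        if "final_report" in event:
            final_report = event["final_report"]
    return final_report


report = asyncio.run(collect_final_report(fake_astream("NVIDIA")))
print(report)
```

In a plain Python script, asyncio.run(…) starts the event loop as shown here. Inside a Jupyter notebook an event loop is already running, so you would use await directly, as the test run in this guide does.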

Test Run

topic = "Detailed report on how is NVIDIA winning the game against its competitors"
await call_planner_agent(agent=reporter_agent,
                         prompt=topic)


Output

--- Generating Report Plan ---
--- Generating Report Plan Completed ---
--- Generating Search Queries for Section: NVIDIA's Market Dominance in GPUs ---
--- Generating Search Queries for Section: Strategic Acquisitions and Partnerships ---
--- Generating Search Queries for Section: Technological Innovations and AI Leadership ---
--- Generating Search Queries for Section: Financial Performance and Growth Strategy ---
--- Generating Search Queries for Section: NVIDIA's Market Dominance in GPUs Completed ---
--- Searching Web for Queries ---
--- Generating Search Queries for Section: Financial Performance and Growth Strategy Completed ---
--- Searching Web for Queries ---
--- Generating Search Queries for Section: Technological Innovations and AI Leadership Completed ---
--- Searching Web for Queries ---
--- Generating Search Queries for Section: Strategic Acquisitions and Partnerships Completed ---
--- Searching Web for Queries ---
--- Searching Web for Queries Completed ---
--- Writing Section : Strategic Acquisitions and Partnerships ---
--- Searching Web for Queries Completed ---
--- Writing Section : Financial Performance and Growth Strategy ---
--- Searching Web for Queries Completed ---
--- Writing Section : NVIDIA's Market Dominance in GPUs ---
--- Searching Web for Queries Completed ---
--- Writing Section : Technological Innovations and AI Leadership ---
--- Writing Section : Strategic Acquisitions and Partnerships Completed ---
--- Writing Section : Financial Performance and Growth Strategy Completed ---
--- Writing Section : NVIDIA's Market Dominance in GPUs Completed ---
--- Writing Section : Technological Innovations and AI Leadership Completed ---
--- Formatting Completed Sections ---
--- Formatting Completed Sections is Done ---
--- Writing Final Section: Introduction ---
--- Writing Final Section: Conclusion ---
--- Writing Final Section: Introduction Completed ---
--- Writing Final Section: Conclusion Completed ---
--- Compiling Final Report ---
--- Compiling Final Report Done ---
==================================================
Final Report:

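Generated test report: Part 1 and Part 2

As the screenshots above show, the agent gives us a fairly comprehensive, well-researched, and well-structured report!

Conclusion

If you are reading this, I commend your effort in staying to the end of this massive guide! We have seen that it is not too difficult to build something similar to a sophisticated commercial product (and not a cheap one!) launched by OpenAI, a company that definitely knows how to ship quality generative AI products and now agentic AI products as well.

We walked through a detailed architecture and workflow for building our own deep research and report generation agentic AI system, and overall, running it costs less than the promised one dollar! If you use open-source components for everything, it is completely free! The system is also fully customizable: you can control how searches are performed as well as the structure, length, and style of the report. One caveat: if you use Tavily, running this agent for deep research can easily trigger a large number of searches, so be mindful and track your usage. This only gives you a foundation; feel free to use the code and the system and customize them to make them even better!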
