How to Build Your Own AI Newsletter Agent

As artificial intelligence grows and fuses with social media, it undeniably helps produce valuable content. But one consequence of this fusion, as AI-mediated feeds reduce genuine human interaction, is that attention spans keep shrinking. The question, then, is: how do you create content that drives engagement while still holding a reader's full attention?

The answer is a newsletter.

In this article, I'll show you how to build your own newsletter AI agent.

Why Send a Newsletter?

A newsletter is a publication distributed at regular intervals, usually by email, to share updates, insights, or curated content with a specific audience. Its main purposes are:

  1. Sharing information: Businesses and organizations use newsletters to announce company news and developments (e.g., product launches, blog updates, events).
  2. Promotion: Companies send newsletters to a targeted audience to promote products, services, or courses in a subtle, relationship-building way.
  3. Engagement: Communities engage their audience first by sharing genuinely relevant content through the newsletter; if they lead with promotional content instead, readers are likely to ignore it.

The Shift: From Manual to Autonomous Content Creation

Since their inception, newsletters have followed the same pattern: someone spends hours collecting links and summarizing content. Beyond the recipient's name, these emails carry little personalization, and they are hard to scale for niche audiences. But that is changing fast.

With agentic workflows, we have not only moved a step closer to generating personalized content, we have also automated it. By combining LLMs with agentic workflows, we can plan a content strategy, make decisions, and execute tasks without constant manual input.

A Project Plan for the Newsletter AI Agent

Let's build an intuition for how an AI-powered newsletter agent works:

(Image: how an AI-powered newsletter agent works)

  1. Prompt: In the first step, you express your intent, for example by asking for a weekly AI roundup.
  2. Goal setting: Next, we set expectations by defining the kind of newsletter we want.
  3. Execution plan: Then the agent takes over: it searches sources, summarizes insights, and formats the result.
  4. Output: Finally, it assembles the finished newsletter, ready to be sent.

Start Building: Your First Newsletter AI Agent

Now that you understand why agentic workflows matter for newsletter strategy and automation, let's move on to the "how". In this article, we'll build a simple AI-powered workflow that automatically creates a newsletter from a CSV dataset of news articles. Let's get started.

Step 1: Ingest a CSV File Containing Multiple News Entries

First, we need to read the CSV file containing the news articles. The CSV should be structured so that each row holds one article's details, such as its title, content, and other metadata. We'll use the Pandas library to load and manipulate this data.

Code to load the CSV:

import pandas as pd

# Load the CSV file containing news articles
def load_news_csv(file_path: str) -> pd.DataFrame:
    df = pd.read_csv(file_path)
    return df

# Example usage
news_data = load_news_csv("news_articles.csv")
print(news_data.head())  # Display first few rows of the dataset

In this step, we read the CSV file and store it in a DataFrame. The dataset can now be processed to extract the articles relevant to our newsletter.
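
If you don't have a dataset on hand, here is a small sketch of the structure the rest of this walkthrough relies on: the later steps use a title and a content column (the extra url column and the sample rows below are purely illustrative, not part of the original dataset). You can generate a tiny test file like this:

import pandas as pd

# Hypothetical sample data; only 'title' and 'content' are required by the later steps
sample = pd.DataFrame({
    "title": [
        "OpenAI announces a new reasoning model",
        "Local bakery wins regional award",
    ],
    "content": [
        "OpenAI has released a new large language model focused on improved reasoning...",
        "A family-run bakery took home the regional prize for its sourdough...",
    ],
    "url": [
        "https://example.com/ai-news",
        "https://example.com/bakery",
    ],
})
sample.to_csv("news_articles.csv", index=False)  # creates the file used in the examples above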

Step 2: Filter and Score Each Article by AI-Related Keywords or Topics

Now that we have the articles, we need to check them for AI-related keywords or topics, such as "machine learning", "artificial intelligence", or "neural network". The AIContentFilter class in the code below serves this purpose: it analyzes each article's content to determine whether it is related to AI/ML/data science.

class AIContentFilter:
    """AI Content Filter with multiple compatibility modes"""

    def __init__(self, openai_api_key: str):
        self.openai_api_key = openai_api_key
        # Setup for LangChain or keyword-based filtering
        self.mode = "keyword"
        self.ai_keywords = [
            'ai', 'artificial intelligence', 'machine learning', 'deep learning',
            'neural network', 'chatgpt', 'claude', 'gemini', 'openai', 'anthropic'
        ]

    def keyword_analysis(self, content: str) -> bool:
        """Keyword-based analysis for AI-related topics"""
        content_lower = content.lower()
        return any(keyword in content_lower for keyword in self.ai_keywords)

    def filter_articles(self, df: pd.DataFrame) -> pd.DataFrame:
        """Filter articles based on AI-related keywords"""
        return df[df['content'].apply(self.keyword_analysis)]

# Filter the articles in the dataset
ai_filter = AIContentFilter(openai_api_key="your-openai-api-key")
filtered_articles = ai_filter.filter_articles(news_data)
print(filtered_articles.head())  # Show filtered AI-related articles
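
The class above only has the keyword mode wired up. If you want a semantic check instead of substring matching, one option, shown here as a rough sketch rather than part of the original class, is to ask the LLM directly whether an article is AI-related (the prompt wording, the 2000-character truncation, and the model choice are all assumptions):

from langchain_openai import ChatOpenAI

def llm_relevance_check(content: str, openai_api_key: str) -> bool:
    """Hypothetical LLM-based alternative to keyword_analysis."""
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=openai_api_key)
    prompt = (
        "Answer with a single word, YES or NO: is the following article about "
        "AI, machine learning, or data science?\n\n" + content[:2000]
    )
    response = llm.invoke(prompt)
    return response.content.strip().upper().startswith("YES")

# Example usage (commented out to avoid extra API calls):
# semantic_filtered = news_data[news_data['content'].apply(
#     lambda x: llm_relevance_check(x, "your-openai-api-key"))]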

Step 3: Apply a Threshold to Identify the Most Relevant Articles

After filtering, we may want to apply a threshold so that only highly relevant articles make the cut. For example, we can compute a confidence score from the number of AI-related keywords an article contains: an article that mentions "machine learning", "neural network", and "openai" scores 3 and passes a threshold of 3, while an article with a single matching keyword does not. The higher the score, the more relevant the article.

Code to apply the threshold:

def apply_relevance_threshold(df: pd.DataFrame, threshold: int = 3) -> pd.DataFrame:
    """Apply threshold to select only the most relevant articles"""
    df = df.copy()  # avoid pandas SettingWithCopyWarning when adding a column to a filtered slice
    df['relevance_score'] = df['content'].apply(
        lambda x: sum(keyword in x.lower() for keyword in ai_filter.ai_keywords)
    )
    return df[df['relevance_score'] >= threshold]

# Apply threshold to filtered articles
relevant_articles = apply_relevance_threshold(filtered_articles, threshold=3)
print(relevant_articles.head())  # Display most relevant articles
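
If you are unsure which threshold to pick, a quick way to see how the scores are spread out, a small usage sketch building on the function above, is to run it with a threshold of zero so every article keeps its score, then inspect the distribution:

# Keep every article but attach its relevance_score, then inspect the distribution
scored = apply_relevance_threshold(filtered_articles, threshold=0)
print(scored['relevance_score'].describe())
print(scored.sort_values('relevance_score', ascending=False)[['title', 'relevance_score']].head(10))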

Step 4: Use a Large Language Model (LLM) to Summarize the Filtered Articles

Now that we have the most relevant articles, the next step is to generate summaries for them. We'll use a large language model (LLM) for this. In the code below, we use the ChatOpenAI class from the LangChain OpenAI package to call an OpenAI model and summarize each article.

Code to generate article summaries:

from langchain_openai import ChatOpenAI

def generate_summary(content: str, openai_api_key: str) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.1, api_key=openai_api_key)
    prompt = f"Summarize the following article:\n\n{content}"
    response = llm.invoke(prompt)  # returns an AIMessage
    return response.content.strip()

# Generate summaries for the filtered articles
relevant_articles['summary'] = relevant_articles['content'].apply(
    lambda x: generate_summary(x, openai_api_key="your-openai-api-key")
)
print(relevant_articles[['title', 'summary']].head())  # Display article summaries

Step 5: Format the Selected Content into a Sendable Newsletter Layout

Finally, we need to format the selected content into a layout suitable for sending as a newsletter. This can be Markdown or HTML, depending on your preference. The example below formats the selected articles and their summaries as Markdown.

Code to format the content:

def format_newsletter(articles_df: pd.DataFrame) -> str:
    """Format the selected articles into a newsletter (Markdown)"""
    newsletter_content = "# AI News Newsletter\n\n"
    for _, row in articles_df.iterrows():
        newsletter_content += f"## {row['title']}\n\n"
        newsletter_content += f"**Summary**: {row['summary']}\n\n"
        newsletter_content += "----\n"
    return newsletter_content

# Format the relevant articles into a newsletter
newsletter = format_newsletter(relevant_articles)
print(newsletter)  # Display the formatted newsletter
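
From here you would hand the Markdown off to whatever delivery channel you use. As a rough illustration only, here is how the text could be sent as a plain-text email with Python's standard library; the SMTP host, port, credentials, and addresses are all placeholders you would replace with your provider's details:

import smtplib
from email.mime.text import MIMEText

def send_newsletter(newsletter_text: str, sender: str, recipient: str, password: str):
    """Send the newsletter as a plain-text email (placeholder SMTP settings)."""
    msg = MIMEText(newsletter_text, "plain")
    msg["Subject"] = "AI News Newsletter"
    msg["From"] = sender
    msg["To"] = recipient
    with smtplib.SMTP("smtp.example.com", 587) as server:  # replace with your SMTP host
        server.starttls()
        server.login(sender, password)
        server.send_message(msg)

# Example usage (placeholders):
# send_newsletter(newsletter, "you@example.com", "subscriber@example.com", "your-app-password")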

With these steps, we have automated the creation of AI-driven newsletter content.
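
To tie it all together, here is a minimal sketch of the whole pipeline run end to end. It assumes the functions and the ai_filter object from the previous steps are defined in the same script; the file name and API key remain placeholders:

# A minimal end-to-end run, reusing the objects and functions defined in the steps above
def build_newsletter(csv_path: str, openai_api_key: str) -> str:
    articles = load_news_csv(csv_path)                           # Step 1: ingest the CSV
    filtered = ai_filter.filter_articles(articles)               # Step 2: keyword filter
    relevant = apply_relevance_threshold(filtered, threshold=3)  # Step 3: score and apply the threshold
    relevant['summary'] = relevant['content'].apply(             # Step 4: LLM summaries
        lambda x: generate_summary(x, openai_api_key=openai_api_key)
    )
    return format_newsletter(relevant)                           # Step 5: Markdown layout

print(build_newsletter("news_articles.csv", "your-openai-api-key"))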

Output:

Here is the newsletter created after filtering all the news:

(Image: the newsletter generated after filtering the news)

Wrapping Up

In a world overflowing with content, newsletters remain a valuable source of curated, meaningful material. What we covered in this article is just the tip of the iceberg: with LLMs and agentic frameworks, manual tasks can be turned into scalable, intelligent systems. As these tools become more accessible, the ability to make them more scalable and personalized is no longer limited to large teams or developers. You can experiment with them and customize them to fit your own needs.
