Google Gen AI Python SDK: A Complete Guide

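Generative AI models are changing how we create content, whether that content is text, images, video, or code. Google's Gen AI Python SDK gives Python applications a single, simpler way to reach Google's generative AI models through either the Gemini Developer API or the Vertex AI API. That makes it easier for developers to build applications such as chatbots, content generators, and creative tools. This article covers what you need to know to get started with the Google Gen AI Python SDK.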

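What Is the Google Gen AI Python SDK?

The Google Gen AI Python SDK is a client library that lets developers use Google's generative AI capabilities from Python. It provides:

Support for the Gemini Developer API (Google's advanced text and multimodal generative models)
Integration with the Vertex AI API for enterprise-scale AI workloads
Generation of text, images, video, embeddings, chat conversations, and more
Utilities for file management, caching, and asynchronous calls
Advanced function calling and schema enforcement

The SDK also abstracts away much of the complexity of raw API calls, so you can focus on building AI-powered applications.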

Installation

Installing the SDK is straightforward. Run:

pip install google-genai

This command uses pip to install the Google Gen AI Python SDK package, together with all of its dependencies, into your Python environment.

Imports and Client Setup

After installing the SDK, create a Python file and import it:

from google import genai
from google.genai import types

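These imports give you two modules: genai and types. The genai module creates the client used for API interaction, while the types module contains data structures and classes that help you build requests and configure request parameters.

Every interaction with Google's generative AI models goes through a client instance. How you instantiate the client depends on which API you use.

For the Gemini Developer API, instantiate the client by passing an API key: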

client = genai.Client(api_key='YOUR_GEMINI_API_KEY')

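Once instantiated with an API key, the client can talk to the Gemini Developer API, and it handles authentication and request management for you.

Optional: Using Google Cloud Vertex AI

client = genai.Client(
    vertexai=True,
    project='your-project-id',
    location='us-central1'
)

To use Google Cloud Vertex AI, initialize the client differently, by specifying a project ID and a location.

Note: Using Vertex AI is optional. The project ID comes from a Google Cloud project, which you can create in the Google Cloud console.

If you are not using Vertex AI, you can simply use the API key method shown above.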

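API Version and Configuration

By default, the SDK uses beta endpoints so that you can access beta features. If you want the stable API instead, specify the API version with the http_options parameter:

from google import genai
from google.genai import types

client = genai.Client(
    vertexai=True,
    project='your-project-id',
    location='us-central1',
    http_options=types.HttpOptions(api_version='v1')
)

How you balance stability against cutting-edge features is entirely up to you.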

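Using Environment Variables (Optional)

Instead of passing credentials directly in code, you can set environment variables first:

Gemini Developer API:

export GEMINI_API_KEY='your-api-key'

Vertex AI:

export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='us-central1'

Then initialize the client with no arguments: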

client = genai.Client()
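
Because a bare genai.Client() depends entirely on the environment, a small pre-flight check can make misconfiguration easier to spot. The sketch below uses only the variable names listed above. The helper name detect_backend and its precedence rule (Vertex AI wins when GOOGLE_GENAI_USE_VERTEXAI is truthy) are assumptions made for illustration, not a description of the SDK's internal logic.

```python
import os
from typing import Mapping, Optional


def detect_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return 'vertexai' or 'gemini' based on the variables from this guide.

    Hypothetical helper: the precedence used here is an assumption and may
    differ from what the SDK does internally.
    """
    env = os.environ if env is None else env
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower()
    if flag in ("1", "true"):
        # Vertex AI needs both a project and a location.
        missing = [
            name
            for name in ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")
            if not env.get(name)
        ]
        if missing:
            raise RuntimeError("Missing Vertex AI settings: " + ", ".join(missing))
        return "vertexai"
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    raise RuntimeError("Set GEMINI_API_KEY or the Vertex AI variables first.")
```

Calling detect_backend() before creating the client turns a missing variable into an explicit error message instead of a failure on the first request.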

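Google Gen AI Python SDK Use Cases

Once setup is complete, you can use the Google Gen AI Python SDK in a number of ways.

Content Generation

The SDK's primary function is generating AI content. You can supply prompts in several forms: a simple string, structured content, or complex multimodal input.

Basic Text Generation

response = client.models.generate_content(
    model='gemini-2.0-flash-001',
    contents='Why does the sun rise in the east?'
)
print(response.text)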

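This sends the prompt to the model and returns the generated answer.

Structured Content Input

You can pass structured content that assigns each message a role, such as user or model. This is useful in chatbots and other multi-turn conversations.

from google.genai import types

content = types.Content(
    role='user',
    parts=[types.Part.from_text(text='Tell me a fun fact about work.')]
)
response = client.models.generate_content(model='gemini-2.0-flash-001', contents=content)
print(response.text)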

Internally, the SDK converts the various input types into the structured data format the model expects.
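
To make the role-based structure concrete, here is a minimal sketch that turns (role, text) pairs into the role/parts shape used by types.Content above, written as plain dictionaries. The helper name build_history is hypothetical, and the idea that such dictionaries can be passed as contents in place of types.Content objects is an assumption to verify against the SDK documentation.

```python
from typing import Dict, List, Tuple

VALID_ROLES = ("user", "model")


def build_history(turns: List[Tuple[str, str]]) -> List[Dict]:
    """Convert (role, text) pairs into role/parts dictionaries."""
    history = []
    for role, text in turns:
        if role not in VALID_ROLES:
            raise ValueError(f"Unsupported role: {role!r}")
        history.append({"role": role, "parts": [{"text": text}]})
    return history


history = build_history([
    ("user", "Tell me a fun fact about work."),
    ("model", "The word 'salary' traces back to the Latin word for salt."),
    ("user", "Give me one more."),
])
# Assumption: this list could be passed as contents=history to
# client.models.generate_content for a multi-turn request.
```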

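File Upload and Usage

The Gemini Developer API lets you upload files for the model to process. This is useful for summarization or content extraction:

# Upload a local file, then reference it alongside a text instruction.
file = client.files.upload(file='/content/sample_file.txt')
response = client.models.generate_content(
    model='gemini-2.0-flash-001',
    contents=[file, 'Please summarize this file.']
)
print(response.text)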

This approach is well suited to adding AI capabilities to document-based tasks.
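
In a longer-running application you may not want uploaded files to accumulate. The following sketch wraps the upload and summarize steps from the example above and removes the uploaded copy afterwards. The function name summarize_file is hypothetical, and the cleanup call (client.files.delete with the uploaded file's name) is an assumption about the SDK's file API that should be checked before relying on it.

```python
def summarize_file(client, path: str, model: str = "gemini-2.0-flash-001") -> str:
    """Upload a file, ask for a summary, then delete the uploaded copy."""
    uploaded = client.files.upload(file=path)
    try:
        response = client.models.generate_content(
            model=model,
            contents=[uploaded, "Please summarize this file."],
        )
        return response.text
    finally:
        # Assumption: files.delete accepts the uploaded file's name.
        client.files.delete(name=uploaded.name)
```

The try/finally block means the cleanup runs even if the generation request raises an exception.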

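Function Calling

A distinctive feature is the ability to pass Python functions to the model as "tools", which the SDK calls automatically while generating a response.

def get_current_weather(location: str) -> str:
    """Return the current weather for the given location."""
    # Stub: a real implementation would query a weather service.
    return 'sunny'

response = client.models.generate_content(
    model='gemini-2.0-flash-001',
    contents='What is the weather like in Ranchi?',
    config=types.GenerateContentConfig(tools=[get_current_weather])
)
print(response.text)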

This makes it possible to bring dynamic, real-time data into AI responses.
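
When a plain Python function is used as a tool, the model can only learn about it from what the function exposes: its name, its parameters and type hints, and its docstring. The sketch below shows a slightly richer version of the weather function and uses the standard inspect module to print that metadata. The unit parameter and the returned string are illustrative additions, and exactly how the SDK turns this metadata into a tool description is not covered here.

```python
import inspect


def get_current_weather(location: str, unit: str = "celsius") -> str:
    """Return the current weather for a city.

    Args:
        location: City name, for example "Ranchi".
        unit: Temperature unit, "celsius" or "fahrenheit".
    """
    # Hard-coded stand-in for a real weather lookup.
    return f"sunny, 25 degrees {unit} in {location}"


# Inspect the metadata a tool-calling layer could read from the function.
signature = inspect.signature(get_current_weather)
print(get_current_weather.__name__)
print(list(signature.parameters))
print(inspect.getdoc(get_current_weather).splitlines()[0])
```

Clear parameter names and a descriptive docstring give the model a better chance of choosing the tool and filling in its arguments correctly.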

Advanced Configuration

You can customize generation with parameters such as temperature and max_output_tokens, along with safety settings, to control randomness, limit length, and filter harmful content.

config = types.GenerateContentConfig(
    temperature=0.3,        # lower values make the output less random
    max_output_tokens=100,  # cap on the length of the response
    safety_settings=[
        types.SafetySetting(
            category='HARM_CATEGORY_HATE_SPEECH',
            threshold='BLOCK_ONLY_HIGH'
        )
    ]
)
response = client.models.generate_content(
    model='gemini-2.0-flash-001',
    contents='Offer some encouraging words for someone starting a new journey.',
    config=config
)
print(response.text)

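This gives you finer control over content quality and safety.

Multimedia Support: Images and Video

The SDK lets you generate and edit images, and generate video (in preview).

Generate images from text prompts.
Upscale or edit generated images.
Generate video from text or images.

Image generation example:

response = client.models.generate_images(
    model='imagen-3.0-generate-002',
    prompt='A tranquil beach with crystal-clear water and colorful seashells on the shore.',
    config=types.GenerateImagesConfig(number_of_images=1)
)
response.generated_images[0].image.show()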

Video generation example:

import time

operation = client.models.generate_videos(
    model='veo-2.0-generate-001',
    prompt='A cat DJ spinning vinyl records at a futuristic nightclub with holographic beats.',
    config=types.GenerateVideosConfig(number_of_videos=1, duration_seconds=5)
)
# Video generation is a long-running operation, so poll until it completes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0].video
video.show()

This enables creative multimodal AI applications.
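
The polling loop in the video example waits indefinitely if the operation never completes. A minimal sketch of the same loop with a timeout follows. The helper name wait_until_done and its default intervals are assumptions chosen for illustration; the only SDK call it relies on is the client.operations.get refresh already used above, passed in as the refresh argument.

```python
import time
from typing import Any, Callable


def wait_until_done(
    operation: Any,
    refresh: Callable[[Any], Any],
    poll_seconds: float = 20.0,
    timeout_seconds: float = 600.0,
) -> Any:
    """Poll `operation` until `operation.done` is true or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish in time.")
        time.sleep(poll_seconds)
        # Fetch the latest state of the long-running operation.
        operation = refresh(operation)
    return operation


# Usage with the SDK call from the video example:
# operation = wait_until_done(operation, client.operations.get)
```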

Chat and Conversations

You can start a chat session that preserves context across turns:

chat = client.chats.create(model='gemini-2.0-flash-001')
response = chat.send_message('Tell me a story')
print(response.text)

response = chat.send_message('Summarize that story in one sentence')
print(response.text)

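This is useful for building conversational AI that remembers earlier exchanges.

Async Support

The main API methods have asynchronous counterparts under client.aio, which makes the SDK easier to integrate into asynchronous Python applications:

# Run this inside an async function or a notebook cell;
# a bare await is not valid at the top level of a script.
response = await client.aio.models.generate_content(
    model='gemini-2.0-flash-001',
    contents='Tell a horror story in 200 words.'
)
print(response.text)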
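
The main payoff of the async interface is concurrency. As a sketch, the function below sends several prompts at once with asyncio.gather and returns the generated texts in the original order. The function name generate_many is hypothetical; it uses only the client.aio.models.generate_content call shown above and assumes each response exposes a text attribute, as in the earlier examples.

```python
import asyncio
from typing import List


async def generate_many(
    client, prompts: List[str], model: str = "gemini-2.0-flash-001"
) -> List[str]:
    """Send several prompts concurrently and return the texts in order."""
    tasks = [
        client.aio.models.generate_content(model=model, contents=prompt)
        for prompt in prompts
    ]
    # gather preserves the order of the awaitables it was given.
    responses = await asyncio.gather(*tasks)
    return [response.text for response in responses]


# In a script (outside a notebook), drive the coroutine with asyncio.run:
# texts = asyncio.run(generate_many(client, ["Prompt one", "Prompt two"]))
```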

Token Counting

Token counting reports how many tokens (pieces of text) an input contains. This helps you stay within model limits and make cost-effective decisions.

token_count = client.models.count_tokens(
    model='gemini-2.0-flash-001',
    contents='Why does the sky have a blue hue instead of other colors?'
)
print(token_count)
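
One practical use of the count is a guard that runs before an expensive request. The sketch below checks a piece of text against a token budget. The helper name fits_token_budget is hypothetical, and reading the count from a total_tokens attribute on the response is an assumption to confirm against the SDK's response type.

```python
def fits_token_budget(
    client, text: str, max_tokens: int, model: str = "gemini-2.0-flash-001"
) -> bool:
    """Return True if `text` is within `max_tokens` according to count_tokens."""
    result = client.models.count_tokens(model=model, contents=text)
    # Assumption: the response exposes the count as `total_tokens`.
    return result.total_tokens <= max_tokens
```

A caller can then shorten or split the input when the function returns False, rather than discovering the limit through a failed request.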

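Embeddings

Embeddings convert text into numeric vectors that represent its meaning. They can be used for search, clustering, and AI evaluation.

embedding = client.models.embed_content(
    model='text-embedding-004',
    contents='Why does the sky have a blue hue instead of other colors?'
)
print(embedding)

With the SDK, you can easily count tokens and create embeddings to improve and extend your AI applications.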
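
For search and clustering, embedding vectors are usually compared with a similarity measure. A minimal sketch of cosine similarity in pure Python follows; it works on any two equal-length lists of numbers. Where the vectors live inside the SDK's embedding response is not shown in this article, so the attribute path mentioned in the code comment is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector.")
    return dot / (norm_a * norm_b)


# Assumption: with the SDK, each vector would be read from something like
# embedding.embeddings[0].values for the corresponding text.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

Values close to 1 indicate texts with similar meaning, while values near 0 indicate unrelated texts.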

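Summary

The Google Gen AI Python SDK is a powerful and versatile tool that gives developers access to Google's leading generative AI models from their Python projects. From text generation and chat to image and video generation, function calling, and more, it offers a rich feature set behind a clean interface. With simple package installation, a straightforward client setup, and support for asynchronous programming and multimedia, the SDK makes building AI-powered applications much easier. Whether you are a beginner or an experienced developer, the SDK is relatively easy to pick up, yet powerful enough to bring generative AI into your workflow.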
