国产成人av在线免播放观看,午夜无码精品一区二区,无码专区中文字幕视频在线

搜索

APP

起點課堂會員權(quán)益

職業(yè)體系課特權(quán)

線下行業(yè)大會特權(quán)

個人IP打造特權(quán)

30+門專項技能課

1300+專題課程

12場職場軟技能直播

12場求職輔導(dǎo)直播

12場專業(yè)技能直播

會員專屬社群

榮耀標(biāo)識

發(fā)布

注冊 | 登錄

看完這篇，你也能做 AI 搜索：論「結(jié)構(gòu)化輸出」

賽博禪心

2024-08-19

1 評論 2653 瀏覽 8 收藏

33 分鐘

隨著AI技術(shù)的不斷進(jìn)步，結(jié)構(gòu)化輸出的應(yīng)用將越來越廣泛，從AI搜索到IoT設(shè)備適配，再到復(fù)雜的對話系統(tǒng)，結(jié)構(gòu)化輸出都扮演著至關(guān)重要的角色。了解和掌握結(jié)構(gòu)化輸出的使用方法，對于AI行業(yè)的從業(yè)者來說，是提升產(chǎn)品性能和用戶體驗的關(guān)鍵。

先說結(jié)論：結(jié)構(gòu)化輸出，是絕大多數(shù) AI 產(chǎn)品和 Agent 的核心

無論是 AI 搜索、有記憶的 bot，還是各類 agent，都是基于結(jié)構(gòu)化輸出搭建的。

在這篇文章里，我講從多角度講一下結(jié)構(gòu)化輸出的用途、起源和示例。

同時，文章會附帶一些代碼片段，方便更直觀的表現(xiàn)（當(dāng)然，完全不看關(guān)系也沒問題）

一、什么是「格式化輸出」

如果我問 GPT：介紹一下喬布斯

通常會收到這樣的回復(fù)：

史蒂夫·喬布斯，1955年2月24日出生，2011年10月5日去世，美國人。他活躍于科技、創(chuàng)新、企業(yè)管理和動畫領(lǐng)域。喬布斯創(chuàng)立了Apple、NeXT和Pixar公司，推出了Mac、iPod、iPhone等具有劃時代意義的產(chǎn)品，重塑了個人電腦、音樂和手機(jī)行業(yè)，奠定了蘋果在全球科技領(lǐng)域的領(lǐng)軍地位。作為Pixar的創(chuàng)辦人之一，他也在動畫領(lǐng)域留下了深遠(yuǎn)的影響。喬布斯是20世紀(jì)末至21世紀(jì)初最具影響力的企業(yè)家和創(chuàng)新者之一。

而在很多 AI 產(chǎn)品中，我們并非看到一串文字，而是這樣的腦圖：

其背后，就是結(jié)構(gòu)化輸出 – 讓 AI 輸出 json，而非文本，比如：

{
“name”: “史蒂夫·喬布斯”,
“birth_date”: “1955-02-24”,
“death_date”: “2011-10-05”,
“nationality”: “美國”,
“fields”: [“科技”, “創(chuàng)新”, “企業(yè)管理”, “動畫”],
“companies_founded”: [“Apple”, “NeXT”, “Pixar”],
“achievements”: [
“創(chuàng)立蘋果公司”,
“推出Mac、iPod、iPhone等產(chǎn)品”,
“重塑個人電腦、音樂、手機(jī)行業(yè)”,
“奠定蘋果全球科技領(lǐng)軍地位”,
“Pixar創(chuàng)辦人之一”
],
“influence”: “20世紀(jì)末至21世紀(jì)初最具影響力的企業(yè)家和創(chuàng)新者之一”
}

二、產(chǎn)品的背后，都是結(jié)構(gòu)化輸出

依然拿「介紹一下喬布斯」這個問題舉例，在不同 AI 產(chǎn)品中，這個問題的內(nèi)部輸出是不同的。

如果是搜索，它的內(nèi)部輸出可能是這樣：

{
“query”: “喬布斯”,
“search_by”: “Google”
}

獲得這個結(jié)果后，再用谷歌搜索「喬布斯」，并將結(jié)果通過 AI 總結(jié)，返回給用戶。

對于 Rag 工具，其數(shù)據(jù)庫為《硅谷縣志》，它的內(nèi)部輸出可能是這樣

{
“rag1”: “喬布斯的家庭”,
“rag2”: “喬布斯的成長”,
“rag3”: “喬布斯的產(chǎn)品”,
“rag4”: “喬布斯的成就”,
}

分別對這幾個信息進(jìn)行 rag 后，在把結(jié)果匯總，通過 AI 總結(jié)，返回給用戶。

對于四格漫畫，其內(nèi)部輸出可能是這樣的：

{
“stories”: [
{
“story”: “喬布斯的家庭”,
“prompt”: “20世紀(jì)70年代復(fù)古風(fēng)格，溫暖的色調(diào)，柔和的線條。在美國加州的一間溫馨家庭住宅，窗外陽光明媚，庭院中充滿綠植和鮮花。年輕的喬布斯與他的養(yǎng)父母在客廳里，其母親在織毛衣，父親在看報紙，喬布斯坐在地上玩著一臺老式計算機(jī)。畫面呈現(xiàn)出和諧溫馨的家庭場景，濃厚的親情氛圍中，喬布斯的眼中充滿了好奇與探索。”,
“caption”:?“家庭的力量塑造了偉大的夢想”},
{
“story”: “喬布斯的成長”,
“prompt”: “1970年代末期的黑白攝影風(fēng)格，帶有強(qiáng)烈的對比效果。在舊金山一所簡樸的高中教室，光線從窗外斜射進(jìn)來，課桌上擺滿了書本和筆記。年輕的喬布斯坐在教室后排，注視著老師手中的物理實驗，身邊的同學(xué)們都在認(rèn)真聽課。畫面體現(xiàn)了喬布斯對知識的渴望，眼神專注，透出不凡的好奇心和思考的深度。”,
“caption”:?“追求知識與個人成長”},
{
“story”: “喬布斯的產(chǎn)品”,
“prompt”: “極簡主義風(fēng)格，采用現(xiàn)代化的色彩搭配，注重設(shè)計感。在蘋果公司現(xiàn)代化的辦公室內(nèi)，簡潔的玻璃桌面上擺放著第一代Macintosh，背景是白色的墻壁和大型蘋果標(biāo)志。喬布斯站在桌前，手指輕觸Macintosh，身后幾位工程師在討論。畫面重點突出喬布斯與他的產(chǎn)品，展示出科技與設(shè)計的完美結(jié)合，喬布斯的神態(tài)自信且充滿遠(yuǎn)見。”,
“caption”:?“通過產(chǎn)品改變世界”},
{
“story”: “喬布斯的成就”,
“prompt”: “超現(xiàn)實主義風(fēng)格，帶有未來感，色彩鮮明且具有沖擊力。在龐大的蘋果公司總部前，未來風(fēng)格的天空中懸浮著喬布斯的頭像，周圍環(huán)繞著iPhone、iPad、Mac等產(chǎn)品。喬布斯的巨大肖像與天空中的科技產(chǎn)品融為一體，象征著他對現(xiàn)代科技的深遠(yuǎn)影響。畫面展現(xiàn)了一幅震撼的圖景，喬布斯的形象如同神話般屹立在現(xiàn)代科技的頂峰。”,
“caption”: “達(dá)到科技的巔峰”
}
]}

然后分別對這幾個信息，進(jìn)行畫圖，在展示給用戶。

三、以「AI 天氣預(yù)報」為例

現(xiàn)在換個例子：我有一個天氣預(yù)報 AI，如果用戶問到了天氣，則進(jìn)行告知。

實際上，這個 AI 并不是真的用 AI 去實時預(yù)測，而是問題，轉(zhuǎn)化成一個請求，去查詢「天氣預(yù)報數(shù)據(jù)庫」，再返回給用戶（當(dāng)然，如果用戶的問題和天氣無關(guān)，則正常回答）。

具體的流程如下：

從程序的角度，執(zhí)行以下：

1）通過結(jié)構(gòu)化輸出，判斷該問題和天氣有關(guān)，并拆解出 location 和 date 兩個字段分別是北京 ?和明天

2）向接口請求信息北京 ?和明天（嚴(yán)謹(jǐn)來說是，Beijing 和 2024-08-16）

3）AI 進(jìn)行結(jié)構(gòu)化返回：

{
“date”: “2024-08-16”,
“location”: “北京”,
“temperature”:
{
“high”: “32°C”,
“low”: “24°C”
},
“weather”: “雷陣雨，云層厚”,
“humidity”: “77%”,
“UV_index”: “高”,
“advice”: “攜帶雨具，注意防曬”
}

4）向接口發(fā)送類似下面的信息

client.chat.completions.create(
model=”gpt-4o”,
messages=[
{“role”: “system”, “content”: “你是天氣預(yù)報機(jī)器人，今天是 2024-08-16”},
{“role”: “user”, “content”: “””
{“date”: “2024-08-16″,”location”: “北京”,”temperature”: {“high”: “32°C”,”low”: “24°C”},”weather”: “雷陣雨，云層厚”,”humidity”: “77%”,”UV_index”: “高”,”advice”: “攜帶雨具，注意防曬”}
北京明天的天氣怎么樣？
“””}
]
)

5）接口返回的結(jié)果，會類似：

明天北京的天氣預(yù)報顯示白天的氣溫將達(dá)到最高約32°C，夜間最低溫度約為24°C。預(yù)計會有雷陣雨，全天云層較厚，濕度較高，達(dá)到77%左右。雷陣雨可能會在上午和下午出現(xiàn)，因此出行時建議攜帶雨具，并注意防曬，因為紫外線指數(shù)較高?？偟膩碚f，天氣悶熱，空氣濕潤，體感溫度可能會比實際溫度更高

也可以利用「結(jié)構(gòu)化輸出」，對設(shè)備進(jìn)行 IoT 適配。

比如，我學(xué)過電工（EE），就可以讓 Coze 變成一個家庭中控，如：

家里的數(shù)據(jù)：

封裝成 Coze Bot：

四、產(chǎn)業(yè)演化史

在 AI 領(lǐng)域，我們通常認(rèn)為，結(jié)構(gòu)化輸出的第一次大規(guī)模使用，是源自去年 5 月 OpenAI 的 Plugin 正式上線：AI 可以通過結(jié)構(gòu)化輸出，來調(diào)用外部工具。

并且，截止到當(dāng)前，OpenAI 在結(jié)構(gòu)化輸出這塊，供進(jìn)行了 4 次迭代，包括 Plugin 方法，F(xiàn)unction Calling，Json Mode 和前兩天新出的 Structured Outputs。

當(dāng)然了，你也可以用 markdown 等 prompt 方法來模擬結(jié)構(gòu)化輸出，但不在本次的討論范圍。

Plugin 方法

在 2023 年 3 月，當(dāng)時參與到 plugin 內(nèi)測的朋友，會看到一份如何讓 ChatGPT 調(diào)用外部工具的文檔，也是結(jié)構(gòu)化輸出的雛形。

流程就和上文一樣，ChatGPT 在獲知用戶的請求后，通過結(jié)構(gòu)化輸出的方式，生成包括插件選擇在內(nèi)的一個 json，插件在接受到這些參數(shù)后開始處理，并給到一個回調(diào)。之后這套東西，變成了 GPTs 的 Action。

注意：這套方法并未通過接口的方式發(fā)布

Function Calling

在 2023 年 6 月，OpenAI 帶來了 0613 年中更新，并發(fā)布了 Function Calling，也是現(xiàn)在看來最廣泛使用的調(diào)用方法，國內(nèi)模型普遍支持。

下面，我們以一個更直觀的例子，來看看 Function Calling 的使用過程。以用戶查詢包裹為例，這個 bot 處理任務(wù)的過程中，總計分 2 步：

1）用戶向 AI 詢問【我的包裹，編號12345，寄了嗎？】的時候，其請求額外帶上字段 tools，在其中定義要獲取的信息 order_id

2）假設(shè)獲取到的信息是 order_12345 ，通過查詢數(shù)據(jù)庫，獲得包裹信息 2024-08-01

3）將這個信息，和歷史提問合并，再交給大模型，獲得最終輸出包裹在 2024-08-01 的時候已經(jīng)寄出去了

如果用代碼的方式，就是：

tools = [
{
“type”: “function”,
“function”: {
“name”: “get_delivery_date”,
“description”: “Get the delivery date for a customer’s order. Call this whenever you need to know the delivery date, for example when a customer asks ‘Where is my package'”,
“parameters”: {
“type”: “object”,
“properties”: {
“order_id”: {
“type”: “string”,
“description”: “The customer’s order ID.”
}
},
“required”: [“order_id”],
“additionalProperties”: False
}
}
}
]

messages = []
messages.append({“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”})
messages.append({“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”})
messages.append({“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”})
messages.append({“role”: “user”, “content”: “i think it is order_12345”})

rsp = client.chat.completions.create(
model=’gpt-4o’,
messages=messages,
tools=tools
)

之后，AI 會返回類似：

ChatCompletion(id=’chatcmpl-9wY3ulTLZswqZLF58L0LQ0sM1EAsG’, choices=[Choice(finish_reason=’tool_calls’, index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role=’assistant’, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id=’call_W1KzfgxvkoxjCAGT3Td9oVPk’, function=Function(arguments='{“order_id”:”order_12345″}’, name=’get_delivery_date’), type=’function’)]))], created=1723740986, model=’gpt-4o-2024-05-13′, object=’chat.completion’, service_tier=None, system_fingerprint=’fp_3aa7262c27′, usage=CompletionUsage(completion_tokens=19, prompt_tokens=140, total_tokens=159))

其中 response.choices[0].message.tool_calls[0].function.arguments 的值，就是 {“order_id”:”order_12345″}

假定查詢到的結(jié)果是 2024-08-01

# Prepare the chat completion call payload
completion_payload = {
“model”: “gpt-4o”,
“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
rsp.choices[0].message,
{“role”: “tool”, “content”: “delivery_date：2024-08-01”, “tool_call_id”: rsp.choices[0].message.tool_calls[0].id},
]
}

# Call the OpenAI API’s chat completions endpoint to send the tool call result back to the model
response = client.chat.completions.create(
model=completion_payload[“model”],
messages=completion_payload[“messages”],
)

# Print the response from the API. In this case it will typically contain a message such as “The delivery date for your order #12345 is xyz. Is there anything else I can help you with?”
print(response)

最終，你會得到

ChatCompletion(id=’chatcmpl-9wYV7Yhkimzlpg3ejNkjRjI0GKqyw’, choices=[Choice(finish_reason=’stop’, index=0, logprobs=None, message=ChatCompletionMessage(content=’Your order with ID “order_12345” is scheduled to be delivered on August 1, 2024. If you have any other questions or need further assistance, feel free to ask!’, refusal=None, role=’assistant’, function_call=None, tool_calls=None))], created=1723742673, model=’gpt-4o-2024-05-13′, object=’chat.completion’, service_tier=None, system_fingerprint=’fp_3aa7262c27′, usage=CompletionUsage(completion_tokens=40, prompt_tokens=111, total_tokens=151))

也就是 ?包裹在 2024-08-01 的時候已經(jīng)寄出去了

回顧一下

上面完成這個對話的時候，用戶給出了一次 prompt: i think it is order_12345，但 AI 實際上是跑了 2 次：

第一次是獲取 order id

第二次才是真正是生成內(nèi)容 ?包裹在 2024-08-01 的時候已經(jīng)寄出去了

同時，在第二次的對話中，結(jié)尾掛著第一次的 response 和數(shù)據(jù)庫查找結(jié)果。

在數(shù)據(jù)庫的查詢結(jié)果中，role 為 tool

還需注意

如果你在某些代碼中，看到 Function Calling 的查詢信息，不是用 tool，而是用 function，這也沒錯。

因為 OpenAI 曾經(jīng)改過 Function Calling 的接口實現(xiàn)：最開始是 ?function 結(jié)構(gòu)，后面改成了 tool 結(jié)構(gòu)。對于 tool ?和 function 這兩種寫法，目前都行，但后續(xù) OpenAI 將只支持 tool 結(jié)構(gòu)

吐槽：我個人更喜歡 function 結(jié)構(gòu)，更優(yōu)雅

使用 tool 結(jié)構(gòu)”messages”:

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
rsp.choices[0].message,
{“role”: “tool”, “content”: “delivery_date：2024-08-01”, “tool_call_id”: rsp.choices[0].message.tool_calls[0].id}]

使用 function 結(jié)構(gòu)

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},

{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}]

使用 function 結(jié)構(gòu)

“messages”:[
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}
]

使用 function 結(jié)構(gòu)

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”: “user”, “content”: “i think it is order_12345”},
{“role”: “function”, “content”: “delivery_date：2024-08-01”, “name”: “delevery_record”}]

另外：也可以兩種結(jié)構(gòu)都不用

“messages”: [
{“role”: “system”, “content”: “You are a helpful customer support assistant. Use the supplied tools to assist the user.”},
{“role”: “user”, “content”: “Hi, can you tell me the delivery date for my order?”},
{“role”: “assistant”, “content”: “Hi there! I can help with that. Can you please provide your order ID?”},
{“role”:?“user”,?“content”:?“i?think?it?is?order_12345. Related record is: delivery_date：2024-08-01”}]

Json Mode

在 2023 年 11 月，OpenAI 在開發(fā)者大會上，帶來了 Json Mode 更新。

仔細(xì)看上面的 Function Calling，其參數(shù)是通過 string 給到的，不夠穩(wěn)定。Json Mode 便是為了解決這一問題：直接輸出 Json。

注意：這種方法仍然不夠穩(wěn)定，并已被 Structured Outputs 取代

調(diào)用的時候，要求：

prompt 里出現(xiàn) json 這個單詞
response_format 設(shè)置為 “type”: “json_object”

比如

completion_payload = {
‘model’: ‘gpt-3.5-turbo’,
‘messages’: [{‘role’: ‘user’, ‘content’: ‘告訴我四大名著分別是什么，以及他們的作者是誰，按這個 json 格式: {{‘書名’:’xxx’，’作者’:’xxx’}…}’}],
‘response_format’: {‘type’: ‘json_object’}
}

# Call the OpenAI API’s chat completions endpoint to send the tool call result back to the model
response = client.chat.completions.create(
model=completion_payload[“model”],
messages=completion_payload[“messages”],
)

得到 resoponse 為Chat

Completion(id=’chatcmpl-9wZ5DHWicaarxccmTBGi8MfJsa6AQ’, choices=[Choice(finish_reason=’stop’, index=0, logprobs=None, message=ChatCompletionMessage(content=”{n ? ?{‘書名’: ‘西游記’, ‘作者’: ‘吳承恩’},n ? ?{‘書名’: ‘紅樓夢’, ‘作者’: ‘曹雪芹’},n ? ?{‘書名’: ‘水滸傳’, ‘作者’: ‘施耐庵’},n ? ?{‘書名’: ‘三國演義’, ‘作者’: ‘羅貫中’}n}”, refusal=None, role=’assistant’, function_call=None, tool_calls=None))], created=1723744911, model=’gpt-3.5-turbo-0125′, object=’chat.completion’, service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=92, prompt_tokens=57, total_tokens=149))

其中，通過 response.choices[0].message.content 可去到 json 信息，如需進(jìn)行后續(xù)處理，依然沿用 function calling 中的方法

Structured Outputs

較之 Function Calling 和 Json Mode，Structured OutPuts 明顯好用了很多，當(dāng)前支持以下模型：gpt-4o-mini, gpt-4o-2024-08-06，當(dāng)然，也包括之后的模型。

簡單調(diào)試測試一下

剛才的四大名著的例子，代碼這么寫

from pydantic import BaseModel

class theBook(BaseModel):
name: str
writer: str

class theFour(BaseModel):
steps: list[theBook]

completion = client.beta.chat.completions.parse(
model=”gpt-4o-2024-08-06″,
messages=[
{“role”: “system”, “content”: “Extract the event information.”},
{“role”: “user”, “content”: “告訴我四大名著分別是什么，以及他們的作者是誰”},
],
response_format = theFour,
)

response = completion.choices[0].message.parsed

得到的結(jié)果是

theFour
(
steps=[
theBook(name=’《紅樓夢》’, writer=’曹雪芹’),
theBook(name=’《西游記》’, writer=’吳承恩’),
theBook(name=’《三國演義》’, writer=’羅貫中’),
theBook(name=’《水滸傳》’, writer=’施耐庵’)])

非常好用！

通過這種方法，還可以完成單次對話的 CoT，比如：

from pydantic import BaseModel

class Step(BaseModel):
explanation: str
output: str

class MathReasoning(BaseModel):
steps: list[Step]
final_answer: str

completion = client.beta.chat.completions.parse(
model=”gpt-4o-2024-08-06″,
messages=[
{“role”: “system”, “content”: “You are a helpful math tutor. Guide the user through the solution step by step.”},
{“role”: “user”, “content”: “how can I solve 8x + 7 = -23”} ? ?], ? ?response_format=MathReasoning,)math_reasoning = completion.choices[0].message.parsed

得到結(jié)果

{
“steps”: [
{
“explanation”: “Start with the equation 8x + 7 = -23.”, ? ? ?“output”: “8x + 7 = -23”
},
{
“explanation”: “Subtract 7 from both sides to isolate the term with the variable.”, ? ? ?“output”: “8x = -23 – 7”
},
{
“explanation”: “Simplify the right side of the equation.”, ? ? ?“output”: “8x = -30”

},

{

“explanation”: “Divide both sides by 8 to solve for x.”, ? ? ?“output”: “x = -30 / 8”

},

{

“explanation”: “Simplify the fraction.”, ? ? ?“output”: “x = -15 / 4”

} ?],

“final_answer”: “x = -15 / 4”

}