Function Calling in Large Language Models

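Function calling is the ability of a large language model to invoke external tools or APIs to carry out specific tasks. It lets the model reach beyond its own limits: it can access real-time information, run complex computations, or operate external systems, which makes it better at solving problems.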

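OpenAI's chat models support function calling through the API. It works as follows:

API parameters: when calling the OpenAI API, you describe the available functions through the tools parameter. Each entry includes the function's name, description, and parameters.
Model output: the model does not execute the function itself. It generates a JSON object containing the function name and arguments, and the developer uses that object to actually call the function.
Function-call response: when the model decides to call a function, the response has a finish_reason of tool_calls, and the message's tool_calls field lists the function to call and its arguments.
Returning the result: after calling the function yourself, append a message with role tool to the message history and call the chat model again. The model then answers based on the function's result.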
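The steps above can be tried without any network access. The sketch below is only an illustration: `sample_choice` is a hand-written stand-in for the relevant part of a response, and `TOOLS`, `build_tool_messages`, and the hard-coded `get_flight_number` are hypothetical names of my own, not part of the OpenAI SDK.

```python
import json


# Hypothetical local tool; a real one would query a flight database.
def get_flight_number(departure: str, destination: str, date: str) -> str:
    return "NH-8743"


# Registry mapping tool names to callables (an assumption of this sketch).
TOOLS = {"get_flight_number": get_flight_number}


def build_tool_messages(choice: dict) -> list[dict]:
    """Turn one response choice into the `tool` messages to send back."""
    if choice["finish_reason"] != "tool_calls":
        return []  # the model answered directly; nothing to execute
    messages = []
    for call in choice["message"]["tool_calls"]:
        func = TOOLS[call["function"]["name"]]
        # `arguments` arrives as a JSON string, not as a dict.
        kwargs = json.loads(call["function"]["arguments"])
        messages.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": str(func(**kwargs)),
            }
        )
    return messages


# Hand-written stand-in for the relevant part of an API response.
sample_choice = {
    "finish_reason": "tool_calls",
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "get_flight_number",
                    "arguments": '{"departure": "北京", "destination": "上海", "date": "2024-01-20"}',
                },
            }
        ],
    },
}

print(build_tool_messages(sample_choice))
```

Each returned message carries the `tool_call_id` of the call it answers, which is how the model matches results to requests when it asks for several calls at once.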

Using Function Calling

import os

import openai

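messages = [
    {
        "role": "system",
        # "Do not assume or guess the argument values passed to functions.
        # If the user's request is ambiguous, ask for the missing information."
        "content": "不要假设或猜测传入函数的参数值。如果用户的描述不明确，请要求用户提供必要信息",
    },
    {
        "role": "user",
        # "Find flights departing from Beijing to Shanghai on January 20, 2024."
        "content": "帮我查询从2024年1月20日，从北京出发前往上海的航班",
    },
]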


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_number",
            # "Look up the flight number for a given origin, destination, and date"
            "description": "根据始发地、目的地和日期，查询对应日期的航班号",
            "parameters": {
                "type": "object",
                "properties": {
                    "departure": {"description": "出发地", "type": "string"},  # "origin"
                    "destination": {"description": "目的地", "type": "string"},  # "destination"
                    "date": {
                        "description": "日期",  # "date"
                        "type": "string",
                    },
                },
                "required": ["departure", "destination", "date"],
            },
        },
    }
]

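client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment

# First call: the model sees the tool definitions and decides whether to call one.
response = client.chat.completions.create(
    model=os.getenv("OPENAI_MODEL"),
    messages=messages,
    tools=tools,
)

# This example assumes the model requested exactly one tool call.
tool_call = response.choices[0].message.tool_calls[0]

print(f"{tool_call=}")

# Send back the assistant message that contains the tool call, followed by
# the tool result (hard-coded here in place of a real flight lookup).
messages.append(response.choices[0].message.model_dump())
messages.append(
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": "NH-8743",
    }
)
# Second call: the model turns the tool result into a natural-language answer.
response = client.chat.completions.create(
    model=os.getenv("OPENAI_MODEL"), messages=messages
)

print(response.choices[0].message.content)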

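I first declare a get_flight_number tool for looking up flights (only its schema; the result is hard-coded in this example). The first call to the chat model returns a response containing a tool call. I then pass the function's result back and call the chat model again, which produces the output below.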

tool_call=ChatCompletionMessageToolCall(id='call_20240816153456e4ebd6501be84e4d', function=Function(arguments='{"date": "2024-01-20", "departure": "北京", "destination": "上海"}', name='get_flight_number'), type='function', index=0)

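根据您提供的信息，查询到的从2024年1月20日从北京出发前往上海的航班号为NH-8743。请注意，航班信息可能会有变动，建议您在出行前再次确认。如果您需要查询其他日期或者有其他需求，请提供详细信息。
(English: Based on the information you provided, the flight from Beijing to Shanghai on January 20, 2024 is NH-8743. Flight information may change, so please confirm again before you travel. If you need to look up other dates or have other requests, please provide the details.)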

In this example, function calling gives the model the ability to look up flights, effectively extending what the AI model can do.

Generating a Function's JSON Schema Automatically

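Writing a function's JSON Schema by hand is tedious, error-prone, and hard to maintain. LangChain and Semantic Kernel can both generate function definitions from Python code automatically, but for simple use cases there is no need for such a heavy framework. The small function-schema project on PyPI can also generate function JSON Schemas that conform to OpenAI's specification.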

import os
from typing import Annotated

import openai
from function_schema import get_function_schema


def get_weather(city: Annotated[str, "city"]) -> str:
    """Get the weather for a given city."""
    return "Sunny, 20 degrees Celsius"


client = openai.OpenAI()

response = client.chat.completions.create(
    model=os.getenv("OPENAI_MODEL"),
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=[{"type": "function", "function": get_function_schema(get_weather)}],
)

print(response.choices[0].message.tool_calls[0])
# ChatCompletionMessageToolCall(id='call_20240816155110429934a7bf4d476d', function=Function(arguments='{"city": "Beijing"}', name='get_weather'), type='function', index=0)
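To give a feel for what this kind of schema generation involves, here is a rough sketch built only on the standard library. It is not how function-schema is implemented; `simple_schema` and `JSON_TYPES` are hypothetical names, only a few primitive types are handled, and the extra `unit` parameter exists just to show how defaults affect `required`.

```python
import inspect
from typing import Annotated, get_args, get_origin, get_type_hints

# Minimal mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def simple_schema(func) -> dict:
    """Build an OpenAI-style function definition from a signature."""
    hints = get_type_hints(func, include_extras=True)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name, str)
        description = None
        if get_origin(hint) is Annotated:
            # Annotated[str, "city"] -> base type str, description "city"
            hint, *extras = get_args(hint)
            description = next((e for e in extras if isinstance(e, str)), None)
        prop = {"type": JSON_TYPES.get(hint, "string")}
        if description:
            prop["description"] = description
        properties[name] = prop
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_weather(city: Annotated[str, "city"], unit: str = "celsius") -> str:
    """Get the weather for a given city."""
    return "Sunny, 20 degrees Celsius"


print(simple_schema(get_weather))
```

The docstring becomes the function description and the `Annotated` metadata becomes the parameter description, so the schema stays in sync with the code it describes.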

Calling Functions Automatically

With a little more logic to call functions automatically, plus some simple wrapping, we get a capable AI agent.

import json
import logging
import os
from collections.abc import Callable
from typing import Annotated

from function_schema import get_function_schema
from openai import OpenAI
from openai.types.chat import ChatCompletionMessageParam

logging.basicConfig(level=logging.INFO)
logging.getLogger("httpx").disabled = True


class Agent:

    def __init__(
        self,
        *,
        model: str,
        client: OpenAI | None = None,
        system: str | None = None,
        tools: dict[str, tuple[Callable, dict]] | None = None,
    ):

        self.messages: list[ChatCompletionMessageParam] = [
            {"role": "system", "content": system}
        ]
        self.client = client or OpenAI()
        self.model = model
        self.tools = tools or {}
        self.tool_definitions = [
            {"type": "function", "function": schema}
            for _, schema in self.tools.values()
        ]

    def __call__(self, content: str) -> str:

        self.messages.append({"role": "user", "content": content})

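        # Pass `tools` only when at least one tool is registered, because some
        # endpoints reject an empty list. The follow-up call inside the loop
        # below only runs after a tool call, so tools are non-empty there.
        tool_kwargs = {"tools": self.tool_definitions} if self.tool_definitions else {}
        response = self.client.chat.completions.create(
            messages=self.messages,
            model=self.model,
            **tool_kwargs,
        )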
        logging.info(response.usage)

        message = response.choices[0].message
        while message.tool_calls:
            self.messages.append(message.model_dump())
            for tool_call in message.tool_calls:
                tool, _ = self.tools[tool_call.function.name]
                tool_result = tool(**json.loads(tool_call.function.arguments))
                self.messages.append(
                    {
                        "role": "tool",
                        "content": str(tool_result),
                        "tool_call_id": tool_call.id,
                    }
                )
            inner_response = self.client.chat.completions.create(
                messages=self.messages,
                model=self.model,
                tools=self.tool_definitions,
            )
            logging.info(inner_response.usage)

            message = inner_response.choices[0].message

        message_content = message.content or ""

        self.messages.append({"role": "assistant", "content": message_content})

        return message_content

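I defined an Agent class that is initialized with a system message and a set of functions. When the returned message contains function calls, it calls those functions with the given arguments and then calls the chat model again. Below is a simple example.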

def get_weather(city: Annotated[str, "city"]) -> str:
    """Get the weather for a given city."""
    logging.info(f"Getting the weather for {city}")
    return "Sunny, 20 degrees Celsius"


def get_location() -> str:
    """Get the current location"""
    logging.info("Getting the current location")
    return "Beijing"


agent = Agent(
    model=os.getenv("OPENAI_MODEL"),
    tools={
        "get_weather": (get_weather, get_function_schema(get_weather)),
        "get_location": (get_location, get_function_schema(get_location)),
    },
)

print(agent("当前位置的天气怎么样？"))  # "What's the weather like at my current location?"
print(agent.messages)

# INFO:root:CompletionUsage(completion_tokens=5, prompt_tokens=222, total_tokens=227)
# INFO:root:Getting the current location
# INFO:root:CompletionUsage(completion_tokens=11, prompt_tokens=237, total_tokens=248)
# INFO:root:Getting the weather for Beijing
# INFO:root:CompletionUsage(completion_tokens=14, prompt_tokens=261, total_tokens=275)
# 当前北京的天气是晴天，气温为20摄氏度。 ("The current weather in Beijing is sunny, with a temperature of 20 degrees Celsius.")
# [{'role': 'system', 'content': None}, {'role': 'user', 'content': '当前位置的天气怎么样？'}, {'content': '', 'role': 'assistant', 'function_call': None, 'tool_calls': [{'id': 'call_20240816155636653661fe15564063', 'function': {'arguments': '{}', 'name': 'get_location'}, 'type': 'function', 'index': 0}]}, {'role': 'tool', 'content': 'Beijing', 'tool_call_id': 'call_20240816155636653661fe15564063'}, {'content': '', 'role': 'assistant', 'function_call': None, 'tool_calls': [{'id': 'call_20240816155637f7ea3c687f564ae4', 'function': {'arguments': '{"city": "Beijing"}', 'name': 'get_weather'}, 'type': 'function', 'index': 0}]}, {'role': 'tool', 'content': 'Sunny, 20 degrees Celsius', 'tool_call_id': 'call_20240816155637f7ea3c687f564ae4'}, {'role': 'assistant', 'content': '当前北京的天气是晴天，气温为20摄氏度。'}]

The logs and the message list show that the model called get_location and then get_weather, and answered the question based on the values they returned.
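The loop in Agent trusts the model completely: an unknown function name or malformed arguments would raise an exception and end the conversation. One possible hardening, sketched below under my own assumptions, is to report the failure back as the tool result so the model can correct itself. `run_tool` is a hypothetical helper, and for brevity its registry maps names directly to callables rather than to the (callable, schema) tuples used by Agent.

```python
import json
from collections.abc import Callable


def run_tool(tools: dict[str, Callable], name: str, arguments: str) -> str:
    """Execute one tool call and always return a string for the model."""
    if name not in tools:
        return f"Error: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments or "{}")
    except json.JSONDecodeError as exc:
        return f"Error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(kwargs, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tools[name](**kwargs))
    except Exception as exc:  # report the failure instead of crashing the loop
        return f"Error: {type(exc).__name__}: {exc}"


def get_location() -> str:
    return "Beijing"


tools = {"get_location": get_location}

print(run_tool(tools, "get_location", "{}"))
print(run_tool(tools, "get_time", "{}"))  # name the model made up
print(run_tool(tools, "get_location", '{"city": "Beijing"}'))  # unexpected argument
```

Inside Agent, the returned string would go into the `content` of the `tool` message in place of `str(tool_result)`.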

Using Function Calling to Improve the Model's Math Ability

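With function calling, we can make a large language model much more reliable at math by delegating the actual computation to code.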

def square(x: int) -> int:
    """Return the square of x"""
    logging.info(f"Calculating the square of {x}")
    return x * x


agent = Agent(
    model=os.getenv("OPENAI_MODEL"),
    # "Answer the user's question briefly and clearly; prefer the provided tool functions"
    system="回答用户的问题，简短明了；优先使用提供的工具函数",
    tools={
        "square": (square, get_function_schema(square)),
    },
)

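print(agent("计算19384的平方"))  # "Compute the square of 19384"
# 19384的平方是375739456。 (correct: "The square of 19384 is 375739456.")

# "Answer the user's question briefly and clearly"
agent = Agent(model=os.getenv("OPENAI_MODEL"), system="回答用户的问题，简短明了")

print(agent("计算19384的平方"))
# 19384的平方是376614144。 (wrong: the correct value is 375739456)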

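In this example the model computes the square of a five-digit number twice, once through a function and once on its own. Delegating the calculation to a dedicated function gives the correct result, whereas the model's own attempt does not.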
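A single-purpose tool like square only covers one operation. As a rough sketch of how the same idea might be generalized, the hypothetical `calculate` tool below evaluates basic arithmetic by walking the expression's syntax tree instead of calling `eval`. The set of supported operators is my own choice, and a real deployment would probably also want to cap exponent sizes.

```python
import ast
import operator

# Operators this sketch allows; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression such as "19384 ** 2"."""

    def ev(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        # Names, calls, attribute access, and so on are not arithmetic.
        raise ValueError("unsupported expression")

    return str(ev(ast.parse(expression, mode="eval").body))


print(calculate("19384 ** 2"))  # 375739456
```

Because the function takes a single string parameter and returns a string, it could be registered with the Agent the same way square is.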

Interacting with the Outside World

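Of course, function calling can also let a large language model interact with the outside world, for example by accessing the network, the file system, or a database.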

import requests
from bs4 import BeautifulSoup


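def ask_wikipedia(title: str) -> str:
    """Get the content of a Wikipedia page."""
    logging.info(f"Getting the content of the Wikipedia page for {title}")
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        # Passing `params` lets requests URL-encode titles with spaces or symbols.
        params={"action": "parse", "page": title, "format": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    # The parse API returns the rendered HTML under parse.text["*"].
    page = BeautifulSoup(resp.json()["parse"]["text"]["*"], "html.parser")
    return page.text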


agent = Agent(
    model=os.getenv("OPENAI_MODEL"),
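    # "You are an internet encyclopedia that can answer questions about anything.
    # Fetch information from English Wikipedia and give a short answer."
    system="你是一个互联网百科全书，可以回答关于任何事物的问题。请从英文维基百科获取信息并给出简短的回答。",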
    tools={
        "ask_wikipedia": (ask_wikipedia, get_function_schema(ask_wikipedia)),
    },
)

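print(agent("什么是大语言模型？"))  # "What is a large language model?"
# INFO:root:CompletionUsage(completion_tokens=17, prompt_tokens=142, total_tokens=159)
# INFO:root:Getting the content of the Wikipedia page for Large language model
# INFO:root:CompletionUsage(completion_tokens=103, prompt_tokens=26906, total_tokens=27009)
# 大语言模型（Large Language Model，LLM）是一种基于人工神经网络的语言模型，具有生成通用语言和其他自然语言处理任务的能力。LLM通过从大量文本中学习统计关系来获得这些能力，通常基于变压器（Transformer）架构。截至2024年，最大的LLM具有数百亿到数千亿个参数，如GPT-3和GPT-4，并在多种语言任务中表现出色。然而，LLM也继承了训练数据中的不准确性和偏见。
# (English: "A large language model (LLM) is a language model based on artificial neural networks, capable of general-purpose language generation and other natural language processing tasks. LLMs acquire these abilities by learning statistical relationships from large amounts of text, and are usually based on the Transformer architecture. As of 2024, the largest LLMs, such as GPT-3 and GPT-4, have tens to hundreds of billions of parameters and perform well on a variety of language tasks. However, LLMs also inherit inaccuracies and biases from their training data.")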

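The example above defines an AI agent that fetches content from Wikipedia to answer the user's question, which greatly reduces how often the AI makes things up. Note that because the Wikipedia content has to be sent to the language model as the function's result, token usage goes up considerably (the second call above used 26,906 prompt tokens), so weigh the cost in real use.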
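One simple way to keep that cost in check is to cap the size of a tool result before it goes back to the model. The sketch below is an assumption of mine rather than part of the original agent: `truncate_for_model` is a hypothetical helper, and it counts characters only as a rough proxy for tokens.

```python
def truncate_for_model(text: str, max_chars: int = 4000) -> str:
    """Keep the start of a long tool result and flag the cut."""
    if len(text) <= max_chars:
        return text
    # Prefer to cut at a line break so the model sees whole paragraphs.
    cut = text.rfind("\n", 0, max_chars)
    if cut <= 0:
        cut = max_chars
    return text[:cut].rstrip() + "\n[truncated]"


# Synthetic stand-in for a long article.
article = "\n".join(f"Paragraph {i}: " + "lorem ipsum " * 20 for i in range(200))
short = truncate_for_model(article, max_chars=1000)

print(len(article), len(short))
```

In ask_wikipedia this would wrap the return value, for example `return truncate_for_model(page.text)`. The trade-off is that anything past the cut is invisible to the model, so answers that depend on later sections of a page may suffer.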