Unlock the Potential of AI by Writing Better Prompts

Image generated with FLUX.1 by Black Forest Labs

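At Emburse, a business travel and expense management company, we are using large language models (LLMs) to automate document extraction and to summarize how those documents support a given expense claim or serve as proof of purchase. LLMs are an incredible tool that helps us create and deliver valuable new features for our customers. However, prompt engineering can be a daunting task when working with an LLM. With experience, the style and prose of a good prompt can become intuitive, but when you are just starting out, you may wonder how anyone uses an LLM effectively. As a data scientist at Emburse, I have picked up a few weapons against garbage responses.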

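Writing a successful prompt can feel like something between building logic gates and coaching a five-year-old, because LLMs swing between being strictly mathematical and being naive and spontaneous. Problems arise when the model does what you said but not what you meant. At the same time, you want the LLM to make decisions independently and predictably. It is called prompt engineering because it takes some planning and structure. With just a few basic components, you can write clear, effective prompts that make the most of everything LLMs have to offer.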

Assume Nothing, Contextualize Everything

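Let's say I am giving scientific papers to ChatGPT (or whichever LLM you prefer), and what I want in return is a short summary of certain information extracted from each paper. Rather than asking for the information outright, it is more important to establish context first. Heston and Khun describe levels of prompting, likening the approach to telling the model "how to solve a math problem step by step." This satisfies the logician and also caters to the easily distracted child inside ChatGPT. Here, we can start with "You are an expert in reading and parsing scientific papers. You will receive various papers on different topics." Don't worry about sounding too formal or professional; the only thing that matters is getting your message across clearly.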

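Although ChatGPT sounds very human and conversational, we have to remember that it is not human, and we cannot assume it has a deep understanding of people. While it is good at interpreting language and repeating information, it has no lived experience or cultural background. If I open with "The information I want is…", it might realize that it will only receive scientific publications, or it might return something unexpected because it is trying to generalize to any type of media. Don't expect it to read any subtext. Don't expect it to understand social conventions that are obvious to you. It can sometimes be very good at this, but you sacrifice consistency. Adopt the philosophy that, for an LLM, nothing is too obvious to point out.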

Break the Task Down Into Steps

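Let's return to the idea of solving a math problem, or better yet, following a recipe. Suppose I am cooking a dish for a friend with allergies. If I take the original recipe to the store and improvise changes, I might accidentally leave out a key ingredient or buy something I should have avoided. Instead, I should decide on substitutions before going to the store and rewrite the recipe step by step. I know I don't want all of the information in a scientific paper; otherwise I wouldn't be using ChatGPT to summarize it for me. The new recipe for the information might look like this: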

1. Extract the title.
2. Extract the authors’ names.
3. Extract the date of publication.
4. Return the text of the abstract.
5. Return three keywords which describe the topic of the paper.

This helps ensure consistent output by making our expectations about what gets extracted explicit, but it could be clearer.
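The context-then-steps structure can also be assembled in code, which keeps the two parts easy to edit separately. The following is a minimal Python sketch, not the way we build prompts at Emburse; `build_prompt`, `CONTEXT`, and `STEPS` are hypothetical names introduced for illustration.

```python
def build_prompt(context: str, steps: list[str]) -> str:
    """Join a context paragraph and a numbered list of steps into one prompt."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{context} You are to:\n\n{numbered}"


# Context first: tell the model what role it plays and what input to expect.
CONTEXT = (
    "You are an expert in reading and parsing scientific papers. "
    "You will receive various papers on different topics."
)

# Then the "recipe": one explicit instruction per step.
STEPS = [
    "Extract the title.",
    "Extract the authors' names.",
    "Extract the date of publication.",
    "Return the text of the abstract.",
    "Return three keywords which describe the topic of the paper.",
]

prompt = build_prompt(CONTEXT, STEPS)
print(prompt)
```

Keeping the steps in a list makes it simple to add, remove, or reorder an instruction without retyping the whole prompt.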

Be Specific About the Desired Output

Here is my prompt:

You are an expert in reading and parsing scientific papers. You will 
receive various papers on different topics. You are to:

1. Extract the title.
2. Extract the authors’ names.
3. Extract the date of publication.
4. Return the text of the abstract.
5. Return three keywords which describe the topic of the paper.

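It has context, it is written simply and clearly, it is broken into steps, and it has everything I need, right? Maybe. Here is the output for the first paper I submitted: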

Extracted Information:
Title:

Prompt Engineering: A Methodology for Optimizing Interactions with 
AI-Language Models in the Field of Engineering

Authors:
* Juan David Velásquez-Henao
* Carlos Jaime Franco-Cardona
* Lorena Cadavid-Higuita

Date of Publication:
November 2023

Abstract:
ChatGPT is a versatile conversational Artificial Intelligence model that 
responds to user input prompts, with applications in academia and various 
sectors. However, crafting effective prompts can be challenging, leading 
to potentially inaccurate or contextually inappropriate responses, 
emphasizing the importance of prompt engineering in achieving accurate 
outcomes across different domains. This study aims to address this void 
by introducing a methodology for optimizing interactions with Artificial 
Intelligence language models, like ChatGPT, through prompts in the field 
of engineering. The approach is called GPEI and relies on the latest 
advancements in this area; and consists of four steps: define the 
objective, design the prompt, evaluate the response, and iterate. Our 
proposal involves two key aspects: data inclusion in prompt design for 
engineering applications and the integration of Explainable Artificial 
Intelligence principles to assess responses, enhancing transparency. It 
combines insights from various methodologies to address issues like 
hallucinations, emphasizing iterative prompt refinement techniques like 
posing opposing questions and using specific patterns for improvement. 
This methodology could improve prompt precision and utility in engineering.

Keywords:
1. ChatGPT
2. Prompt Engineering
3. Large Language Models

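It often happens that you look at your result and realize that, although it is what you asked for, it isn't quite right. This output is clear and easy for a human to read, but what if I want to feed the data into another model or a Python function? Some fields may need to be standardized, the "Extracted Information" heading is redundant, and the overall format, which mixes bullets and numbers, makes the output hard to parse programmatically. (This might be a good use case for the structured outputs recently added to the OpenAI API, but that is another blog post for another day.) Or maybe I don't plan to feed the data anywhere else and simply like to read the names in alphabetical order, last name first, and want a shorter summary rather than the whole abstract. In any case, these details must be specified.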
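To make the standardization problem concrete, here is a rough Python sketch of the cleanup that downstream code would need if the prompt stayed as it is. The helper names are hypothetical. The author handling assumes the last whitespace-separated token is the surname, which holds for this sample but not for names in general.

```python
from datetime import datetime


def normalize_date(raw: str) -> str:
    """Convert a date such as 'November 2023' to MM-YYYY.

    %B matches full month names in the current locale (English by default).
    """
    return datetime.strptime(raw.strip(), "%B %Y").strftime("%m-%Y")


def normalize_authors(lines: list[str]) -> str:
    """Reorder each name as 'Last, First' and sort the result alphabetically.

    Assumption: the final token of each name is the surname.
    """
    reordered = []
    for line in lines:
        name = line.lstrip("* ").strip()  # drop the bullet marker, if any
        first, _, last = name.rpartition(" ")
        reordered.append(f"{last}, {first}" if first else last)
    return "; ".join(sorted(reordered))


authors = [
    "* Juan David Velásquez-Henao",
    "* Carlos Jaime Franco-Cardona",
    "* Lorena Cadavid-Higuita",
]

print(normalize_date("November 2023"))
# 11-2023
print(normalize_authors(authors))
# Cadavid-Higuita, Lorena; Franco-Cardona, Carlos Jaime; Velásquez-Henao, Juan David
```

Every rule like these is a guess about what the model will return next time, which is why it is usually cheaper to state the format in the prompt itself.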

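It is important to tell the model precisely what you want the output to look like. Spell out not only what you expect to receive but also how you expect to receive it. It usually takes several iterations to adjust and refine a prompt, but there is a faster way to work through these details and make sure nothing gets missed. That brings me to my final tip for tweaking and tuning prompts: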

Let the LLM Build the Prompt

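Maybe you have tried everything I have written so far. You are tweaking your prompt and playing whack-a-mole: when you fix one error, two new ones pop up. You just can't seem to speak the same language as the model. You need a translator, and who is better at language models than the language model itself? Let's ask ChatGPT to write a prompt for ChatGPT. Paradoxically, this still requires a prompt, and that prompt should follow the approach outlined above. Luckily, there is plenty of guidance on this. We will use a prompt adapted from ChainBrain AI's Prompt Creator.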

I want you to become my prompt engineer. Your goal is to help me craft the 
best possible prompt for my needs. The Prompt will be used by chatgpt.

This is the context.

You will follow the following process: Your first response should be to 
ask me what the prompt should be about. I will provide my answer, but we 
will need to improve it through continual iterations by going through the 
next steps.

Here we state our expectations and lay out the process step by step.

Based on my input, you will generate two sections. 
  a. Revised prompt (provide your rewritten prompt. It should be clear, 
     and easily understood by ChatGPT.) 

  b. Questions (ask any relevant questions pertaining to what additional 
     information is needed for me to improve the prompt). 

We will continue this iterative process, with me providing additional 
information to you, and you updating the prompt in the revised prompt 
section until I say we are done.

Here we are very clear about what the output should look like.

After I sent this prompt, ChatGPT replied asking what the prompt should be about, and I then sent the original paper-parsing prompt.

Here is the response:

Revised Prompt:
You are an expert in reading and parsing scientific papers. 
Your task is to:
1. Extract the title of the paper.
2. Extract the names of the authors.
3. Extract the publication date.
4. Retrieve the text of the abstract.
5. Identify and return three keywords that best describe the topic of the 
   paper.

Questions:
1. Are the scientific papers all in a similar format (e.g., PDF, text), or 
   should the prompt account for different formats?
2. Do you want the keywords to be directly from the paper, or should they 
   be generated based on the abstract content?
3. Should the extracted data be returned in a specific format (e.g., JSON, 
   plain text, or a structured table)?
4. Is there a preferred maximum length or number of characters for the 
   abstract text that should be returned?
5. Are there any specific disciplines or types of scientific papers that 
   the prompt should focus on, or is it intended to be universal across 
   all fields?

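Now I can simply answer the questions one by one, add any extra requirements I think of, and use ChatGPT to help me craft a detailed, robust, and logically sound prompt. It will keep returning new questions and revising the prompt until I am satisfied with the result.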
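If you run this loop through an API rather than the chat window, it helps to separate the two sections of each reply so the questions can be answered one at a time. The sketch below is one possible way to do that in Python. It assumes the reply contains the literal headings "Revised Prompt:" and "Questions:" in that order, as in the response above; `split_response` is a hypothetical helper, and the call to the model is deliberately left out because it depends on the client you use.

```python
import re


def split_response(reply: str) -> tuple[str, list[str]]:
    """Split a reply into the revised prompt text and a list of questions.

    Assumes the headings 'Revised Prompt:' and 'Questions:' appear in that order.
    """
    head, sep, tail = reply.partition("Questions:")
    if not sep:
        raise ValueError("reply has no 'Questions:' section")
    revised = head.split("Revised Prompt:", 1)[-1].strip()
    # A question starts with "N. " at the beginning of a line;
    # wrapped continuation lines are folded back into a single line.
    parts = re.split(r"(?m)^\s*\d+\.\s+", tail)
    questions = [" ".join(part.split()) for part in parts if part.strip()]
    return revised, questions


reply = """Revised Prompt:
You are an expert in reading and parsing scientific papers.
Your task is to:
1. Extract the title of the paper.

Questions:
1. Are the scientific papers all in a similar format (e.g., PDF, text), or
   should the prompt account for different formats?
2. Should the extracted data be returned in a specific format (e.g., JSON,
   plain text, or a structured table)?
"""

revised, questions = split_response(reply)
for number, question in enumerate(questions, start=1):
    print(number, question)
```

The answers would then be sent back as the next turn of the conversation, and the loop repeats until the revised prompt needs no further changes.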

Results

After some back and forth, this is my final prompt:

You are an expert in reading and parsing scientific papers in various 
formats (PDF, text, etc.). Your task is to:

1. Extract the title of the paper.
2. Extract the names of the authors in the format: Last Name, First Name; 
   and list them alphabetically, separated by semicolons. If no authors 
   are found, use "None."
3. Extract the publication date (format: MM-YYYY). If no date is 
   available, use "None."
4. Extract the full journal name. If the journal name is missing, use 
   "None."
5. Retrieve the text of the abstract and summarize it if it exceeds 200 
   characters, retaining technical terminology. If there is no abstract, 
   use "None."
6. Identify and return three keywords that best describe the topic of the 
   paper, based on the content. If keywords cannot be identified, use 
   "None."


Return the extracted data in the form of a structured table with the 
following fields:

* Title
* Authors (in alphabetical order, formatted as Last Name, First Name, 
  separated by semicolons)
* Publication Date (MM-YYYY format)
* Journal Name (full name, no abbreviations)
* Abstract (summarized to 200 characters if necessary, while retaining 
  technical terminology)
* Keywords (three keywords based on the content)

If multiple papers are processed, each table should be separated by the 
delimiter: ~~~~~~
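Because the prompt now fixes the fields, the date format, and the delimiter, the output can be checked mechanically. The following Python sketch shows one way to do so. The prompt does not say how the "structured table" is rendered, so this code assumes a two-column Markdown-style table with a `| Field | Value |` header. That layout, along with the names `parse_tables` and `check_record`, is an assumption made for illustration.

```python
import re

DELIMITER = "~~~~~~"
DATE_PATTERN = re.compile(r"^(0[1-9]|1[0-2])-\d{4}$")  # MM-YYYY


def parse_tables(output: str) -> list[dict[str, str]]:
    """Split the output on the delimiter and read each two-column table.

    Assumed layout: rows shaped like '| Field | Value |'. A value that itself
    contains '|' would be skipped by this simple parser.
    """
    records = []
    for chunk in output.split(DELIMITER):
        record = {}
        for line in chunk.splitlines():
            cells = [c.strip() for c in line.strip().strip("|").split("|")]
            if len(cells) != 2:
                continue  # blank line or unexpected shape
            if cells[0] == "Field" or set(cells[0]) <= set("-: "):
                continue  # header row or |---|---| separator row
            record[cells[0]] = cells[1]
        if record:
            records.append(record)
    return records


def check_record(record: dict[str, str]) -> list[str]:
    """Return a list of ways the record departs from what the prompt asked for."""
    problems = []
    date = record.get("Publication Date", "None")
    if date not in ("None", "None.") and not DATE_PATTERN.match(date):
        problems.append("Publication Date is not MM-YYYY")
    if len(record.get("Abstract", "")) > 200:
        problems.append("Abstract exceeds 200 characters")
    return problems


sample = """| Field | Value |
|---|---|
| Title | Paper A |
| Publication Date | 11-2023 |
~~~~~~
| Field | Value |
|---|---|
| Title | Paper B |
| Publication Date | November 2023 |
"""

records = parse_tables(sample)
for record in records:
    print(record["Title"], check_record(record))
```

A check like this will not fix a bad response, but it flags the papers where the model drifted from the format so they can be retried or reviewed by hand.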

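This is now much more specific than what I started with, and I can be more confident that I will get comprehensive, consistent, and repeatable output. The approach works with any LLM, whether ChatGPT or an open-source model such as Llama or Mistral, and it is a good starting point for any use case, professional or personal.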

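As we continue integrating LLMs into our workflows at Emburse, effective prompt engineering reveals the true potential of AI. Writing better prompts is the key to unlocking AI's full capabilities and turning complex tasks into streamlined solutions. The process may seem daunting at first, but with a little practice and structure, plus a little help from your favorite LLM, you will be able to start creating powerful, effective prompts. As the technology evolves, so will our ability to interact with it; better prompts are just the beginning of what is possible.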