比AgentGpt还强，一句话，生成一个可运行的工程？

原创

老码小张

修改于 2023-06-21 10:10:46

59500

代码可运行

文章被收录于专栏：玩转全栈玩转全栈

运行总次数：0

代码可运行

背景

你有没有想过，一句话的需求，GPT就可以给你实现这个需求，而且工程都给你创建好？你可能觉得这个事情不可能，你可能之前听说过 AgentGpt，AutoGpt，也有可能玩过，但是似乎最终也没有产出一个完整的可运行的工程给到你呀，但是今天我想和你共同研究的这个开源库做到了，地址在此：https://github.com/AntonOsika/gpt-engineer。

视频效果地址：https://user-images.githubusercontent.com/4467025/243695075-6e362e45-4a94-4b0d-973d-393a31d92d9b.mov

GPT Engineer实现的原理

要说实现原理，一句话概括，就是就是提示词工程，但是关键是人家怎么将这个过程串起来的，这儿是值得研究的点之所在。

我们直接看项目的 main.py 文件，这个是项目的入口文件，里面有这么一段代码：

    for step in STEPS[steps_config]:
        messages = step(ai, dbs)
        dbs.logs[step.__name__] = json.dumps(messages)

可以，看到他是通过做了这么多个步骤生成了一个可以运行的工程。其中steps的配置如下：

# Different configs of what steps to run
STEPS = {
    "default": [simple_gen, gen_entrypoint, execute_entrypoint],
    "benchmark": [simple_gen, gen_entrypoint],
    "simple": [simple_gen, gen_entrypoint, execute_entrypoint],
    "tdd": [gen_spec, gen_unit_tests, gen_code, gen_entrypoint, execute_entrypoint],
    "tdd+": [
        gen_spec,
        gen_unit_tests,
        gen_code,
        fix_code,
        gen_entrypoint,
        execute_entrypoint,
    ],
    "clarify": [clarify, gen_clarified_code, gen_entrypoint, execute_entrypoint],
    "respec": [
        gen_spec,
        respec,
        gen_unit_tests,
        gen_code,
        gen_entrypoint,
        execute_entrypoint,
    ],
    "execute_only": [execute_entrypoint],
    "use_feedback": [use_feedback],
}

很容易理解，他默认的生成一个工程会执行 simple_gen, gen_entrypoint, execute_entrypoint ，这几个步骤，这就相当于一个pipe，将处理结果一步步的走下去，等等，好像和langchain有点像呀，先埋下一个伏笔，等会我们在看看是不是那么回事？

看看simple_gen 做了啥事?

def simple_gen(ai: AI, dbs: DBs):
    """Run the AI on the main prompt and save the results"""
    messages = ai.start(
        setup_sys_prompt(dbs),
        dbs.input["main_prompt"],
    )
    to_files(messages[-1]["content"], dbs.workspace)
    return messages

非常简单，就是把我们那个一句话需求喂给了ChatGPT，让后把应答写入到文件中。

在看看gen_entrypoint做了啥事？

def gen_entrypoint(ai, dbs):
    messages = ai.start(
        system=(
            "You will get information about a codebase that is currently on disk in "
            "the current folder.\n"
            "From this you will answer with code blocks that includes all the necessary "
            "unix terminal commands to "
            "a) install dependencies "
            "b) run all necessary parts of the codebase (in parallell if necessary).\n"
            "Do not install globally. Do not use sudo.\n"
            "Do not explain the code, just give the commands.\n"
        ),
        user="Information about the codebase:\n\n" + dbs.workspace["all_output.txt"],
    )
    print()

    regex = r"```\S*\n(.+?)```"
    matches = re.finditer(regex, messages[-1]["content"], re.DOTALL)
    dbs.workspace["run.sh"] = "\n".join(match.group(1) for match in matches)
    return messages

非常直观，意识是说，让他把运行这个工程的 shell 脚本给些好，这就是这个工程提示词的杀手锏之一，他的输入正是我们simple_gen的输出， dbs.workspace["all_output.txt"]，我们注意一下他这样生成一个shell脚本内容大概长这样：

这里我们发现他需要创建两个文件对比，requirements.txt,以及main.py ，当然这是python工程，如果是node工程，应该是另外shell脚本了，anyway，不论怎么样，我们相信这样肯定可以启动这个工程了。

execute_entrypoint做了什么？

def execute_entrypoint(ai, dbs):
    command = dbs.workspace["run.sh"]

    print("Do you want to execute this code?")
    print()
    print(command)
    print()
    print('If yes, press enter. Otherwise, type "no"')
    print()
    if input() != "":
        print("Ok, not executing the code.")
        return []
    print("Executing the code...")
    print()
    subprocess.run("bash run.sh", shell=True, cwd=dbs.workspace.path)
    return []

也很简单，就是执行脚本，当你直接回车就开始执行脚本了，相当于运行这个ChatGPT一句话生成的工程。

以上就是整个项目框架的流程，至此就生成了一个可以运行的工程，GPT4效果会好一些，乞丐版版本的GPT3.5也是可以跑的。

一些额外的技术细节，如何生成工程里面的其他文件

有些比较复杂的项目肯定不是一个main.py就搞定的，一定有一些文件对来组织对吧，不然代码肯定非常难以维护，这其实就是模块化，作者也考虑到这一点，所以我们可以关注下：

def setup_sys_prompt(dbs):
    return dbs.identity["generate"] + "\nUseful to know:\n" + dbs.identity["philosophy"]


def simple_gen(ai: AI, dbs: DBs):
    """Run the AI on the main prompt and save the results"""
    messages = ai.start(
        setup_sys_prompt(dbs),
        dbs.input["main_prompt"],
    )
    to_files(messages[-1]["content"], dbs.workspace)
    return messages

在 simple_gen是，调用了setup_sys_prompt，我们看看这里面的提示词工程有什么诀窍？在项目中，我们可以找到generate

对应的提示词identity/generate内容如下：

You will get instructions for code to write.
You will write a very long answer. Make sure that every detail of the architecture is, in the end, implemented as code.
Make sure that every detail of the architecture is, in the end, implemented as code.

Think step by step and reason yourself to the right decisions to make sure we get it right.
You will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.

Then you will output the content of each file including ALL code.
Each file must strictly follow a markdown code block format, where the following tokens must be replaced such that
FILENAME is the lowercase file name including the file extension,
LANG is the markup code block language for the code's language, and CODE is the code:

FILENAME
```LANG
CODE
```

You will start with the "entrypoint" file, then go to the ones that are imported by that file, and so on.
Please note that the code should be fully functional. No placeholders.

Follow a language and framework appropriate best practice file naming convention.
Make sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.
Ensure to implement all code, if you are unsure, write a plausible implementation.
Include module dependency or package manager dependency definition file.
Before you finish, double check that all parts of the architecture is present in the files.

这里的提示词才是生成工程文件的关键点，除此之外，这个项目还可以在生成工程时，可以在里面做一些生成测试用例的模式，这依赖于用户传参。不同的传承将执行不同的 STEPS。

总结

事实上，我们完全可以将这样的提示词组合起来，放到 langchain 上去做一个同样的事，只要知晓原理，同样可以达到效果。这个工程本质上是将GPT的响应过程中的内容规范化，已便于可以保存为代码文件，然后是是一shell脚本串起来，让工程可以运行。

比如，我将他的提示词串起来问GPT：

##系统##
您将收到编写代码的指示。
您将编写一个非常长的答案。确保最终将架构的每个细节都实现为代码。
确保最终将架构的每个细节都实现为代码。

逐步思考并理性地做出正确决策，以确保我们正确理解。
首先，您将列出所需的核心类、函数、方法的名称，以及对它们目的的快速注释。

然后，您将输出每个文件的内容，包括所有代码。
每个文件必须严格遵循Markdown代码块格式，其中以下标记必须替换，以便：
FILENAME 是包括文件扩展名的小写文件名，
LANG 是代码语言的标记代码块语言，
CODE 是代码：

FILENAME
```LANG
CODE
```

您将从"入口点"文件开始，然后继续处理该文件导入的文件，依此类推。
请注意，代码应该是完全可用的，不要使用占位符。

遵循适合语言和框架的最佳实践文件命名约定。
确保文件包含所有导入、类型等。确保不同文件中的代码彼此兼容。
确保实现所有代码，如果不确定，请编写一个合理的实现。
包括模块依赖或包管理器依赖定义文件。
在完成之前，请仔细检查文件中是否存在架构的所有部分。

有用的信息：

您几乎总是将不同的类放在不同的文件中。
对于Python，您始终创建一个适当的requirements.txt文件。
对于NodeJS，您始终创建一个适当的package.json文件。
您总是添加一个简要描述函数定义目的的注释。
您尝试添加解释非常复杂的逻辑的注释。
您始终遵循请求语言的最佳实践，以描述代码编写为一个定义的包/项目。

Python工具首选项：
- pytest
- dataclasses

##用户##
使用html+css+javascript 些一个简单的 todo-list，将任务存储在local storage中，允许用户添加任务，删除任务，编辑任务
##助理##