
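CodeBuddy code CLI is an autonomous, self-orchestrating coding agent for developers that delivers AI programming support through a command-line interface. It can directly read and modify local code, call MCP services, execute system commands, and access network resources, so it suits interactive development and also runs reliably in non-interactive environments such as CI/CD pipelines and automation scripts. This article uses a real video generation project to show how CodeBuddy code CLI handles a complex development task and improves developer productivity.
We have a video generation project that needs to produce a video with subtitles and background music from a JSON configuration file. The project's main functions include:
Compositing the video with ffmpeg. The JSON format is as follows: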
{
"bgImage": "https://p9-bot-workflow-sign.byteimg.com/tos-cn-i-mdko3gqilj/7d8db97a3b564eb3a886cc487016598f.png~tplv-mdko3gqilj-image.png?rk3s=c8fe7ad5&x-expires=1791007650&x-signature=u2zG3Og3CVP3IWUBhYbfjDtPMRc%3D",
"captions": [
{
"audioList": [
{
"code": 0,
"data": {
"duration": 1.549,
"link": "https://lf26-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_3e7596e4-a30a-4825-b995-953bfd04e782.mp3?lk3s=da27ec82&x-expires=1760162811&x-signature=OZzBUPG5tXeFdV9669g2qO905oM%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 4.22,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_d1e03581-d372-4f8c-97f5-961b53bef3ee.mp3?lk3s=da27ec82&x-expires=1760162812&x-signature=NytMljDny3Gstboh%2B5SiIk58vPg%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.335,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_3cf4db4b-f595-49e0-aec8-db55546afe99.mp3?lk3s=da27ec82&x-expires=1760162811&x-signature=rWUyiQqF%2FpOEkovUJnZEKIBflOA%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 1.632,
"message": "ok"
},
{
"duration": 4.296,
"message": "ok"
},
{
"duration": 2.424,
"message": "ok"
}
],
"textList": [
"在日常生活中",
"我们常常会注意到某些事物而忽略其他",
"这就是知觉选择性的体现"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 0.788,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_fc079965-1254-4515-ac65-024959bd31d3.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=kw9Y851cYXqnYlQGfhMLK11UY7Q%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.092,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_c3fc98c1-e7c7-42a2-9813-25712efbb7e4.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=dSyuvKtLa1yyObnKpiK9QOgP3%2BI%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.074,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_8f83ee5b-4354-4d07-82f7-43a6d82358c1.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=0WK9jLvo0aqB%2B9medh7ljiz8%2BPo%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 4.046,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_89e0ec74-09d5-45a6-a8c4-fd9118b555f5.mp3?lk3s=da27ec82&x-expires=1760162814&x-signature=iGv9aeo3eHzW3dU9zJc9H%2BcHr8c%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.056,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_a2d70b91-76ad-46a0-989f-1333c01057f6.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=DactTsVL3JIxB%2BCjMI5t%2Fv9Wmw0%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 0.864,
"message": "ok"
},
{
"duration": 2.184,
"message": "ok"
},
{
"duration": 3.168,
"message": "ok"
},
{
"duration": 4.128,
"message": "ok"
},
{
"duration": 2.136,
"message": "ok"
}
],
"textList": [
"比如",
"当你在嘈杂的咖啡馆里",
"却能清晰地听到朋友的声音",
"这是因为你的大脑自动过滤了无关信息",
"专注于你感兴趣的部分"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 4.898,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_b69d9a81-c58c-4a96-a51a-a7e4e3f79b37.mp3?lk3s=da27ec82&x-expires=1760162816&x-signature=eo0i5v4NmF%2FD0tGPgM4sjfb98zs%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 4.992,
"message": "ok"
}
],
"textList": [
"这种现象在心理学中被称为‘鸡尾酒会效应’"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 3.805,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_fdaa9eb1-1885-4aac-87de-8b608213bd2c.mp3?lk3s=da27ec82&x-expires=1760162818&x-signature=vIB1EEHrdgKPmdZmFYf5gkXhDNo%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.898,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_4a64f754-df94-42e2-9ac4-b6e70c87c5ba.mp3?lk3s=da27ec82&x-expires=1760162818&x-signature=SfNyy7SzvdwrCb3prOrj4zdQzPI%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 3.888,
"message": "ok"
},
{
"duration": 2.976,
"message": "ok"
}
],
"textList": [
"知觉选择性不仅帮助我们高效处理信息",
"还影响着我们的情绪和行为"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 0.852,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_ff9d1717-dee1-4014-9a22-0bb616659ce0.mp3?lk3s=da27ec82&x-expires=1760162819&x-signature=RQ89evmqfmRJfZGILjfzdHNQkuE%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.883,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_6c1fb5ec-3b88-4b4a-baa9-6c3f999315df.mp3?lk3s=da27ec82&x-expires=1760162819&x-signature=H8ibNY8SGvSwNj3%2B%2FrE0kDlimp0%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.838,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_e4b7cf3d-fc07-44e1-a676-e7c045499040.mp3?lk3s=da27ec82&x-expires=1760162821&x-signature=viTV1E%2FBbnoZncrhs6uAWndTR18%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 0.936,
"message": "ok"
},
{
"duration": 3.96,
"message": "ok"
},
{
"duration": 3.912,
"message": "ok"
}
],
"textList": [
"例如",
"一个乐观的人更容易注意到积极的信息",
"而悲观的人则可能更关注负面事件"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 2.506,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_edf0936b-fd99-4424-a26d-1a16cc2ff284.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=oMEcImxulDlmD9Xvq234tk1qhEA%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.161,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_a11448c9-bcc5-4518-8467-df84195b94f6.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=dqb9jy%2FKf6ck9EpmXjmdsW4DK%2B8%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 1.609,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_19f898f4-e6ff-4429-b552-66472ae07576.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=7wTlRpjzpPJFI2G5iR9zIReXYO4%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 2.592,
"message": "ok"
},
{
"duration": 3.24,
"message": "ok"
},
{
"duration": 1.704,
"message": "ok"
}
],
"textList": [
"了解知觉选择性",
"可以帮助我们更好地管理注意力",
"提升生活质量"
]
},
{
"audioList": [],
"durationList": [],
"textList": []
}
],
"height": "1080",
"imageList": [
"https://s.coze.cn/t/6TRei6ICGyU/",
"https://s.coze.cn/t/0KdNwd72_yI/",
"https://s.coze.cn/t/O4_HMKWiO3M/",
"https://s.coze.cn/t/ZzGVTfjPGMc/",
"https://s.coze.cn/t/fw6PD6Qumlw/",
"https://s.coze.cn/t/-xsWgj652rQ/",
"https://s.coze.cn/t/DFfldcr0Dw8/"
],
"materialList": [
{
"audioList": [
{
"code": 0,
"data": {
"duration": 1.549,
"link": "https://lf26-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_3e7596e4-a30a-4825-b995-953bfd04e782.mp3?lk3s=da27ec82&x-expires=1760162811&x-signature=OZzBUPG5tXeFdV9669g2qO905oM%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 4.22,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_d1e03581-d372-4f8c-97f5-961b53bef3ee.mp3?lk3s=da27ec82&x-expires=1760162812&x-signature=NytMljDny3Gstboh%2B5SiIk58vPg%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.335,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_3cf4db4b-f595-49e0-aec8-db55546afe99.mp3?lk3s=da27ec82&x-expires=1760162811&x-signature=rWUyiQqF%2FpOEkovUJnZEKIBflOA%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 1.632,
"message": "ok"
},
{
"duration": 4.296,
"message": "ok"
},
{
"duration": 2.424,
"message": "ok"
}
],
"textList": [
"在日常生活中",
"我们常常会注意到某些事物而忽略其他",
"这就是知觉选择性的体现"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 0.788,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_fc079965-1254-4515-ac65-024959bd31d3.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=kw9Y851cYXqnYlQGfhMLK11UY7Q%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.092,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_c3fc98c1-e7c7-42a2-9813-25712efbb7e4.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=dSyuvKtLa1yyObnKpiK9QOgP3%2BI%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.074,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_8f83ee5b-4354-4d07-82f7-43a6d82358c1.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=0WK9jLvo0aqB%2B9medh7ljiz8%2BPo%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 4.046,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_89e0ec74-09d5-45a6-a8c4-fd9118b555f5.mp3?lk3s=da27ec82&x-expires=1760162814&x-signature=iGv9aeo3eHzW3dU9zJc9H%2BcHr8c%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.056,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_a2d70b91-76ad-46a0-989f-1333c01057f6.mp3?lk3s=da27ec82&x-expires=1760162813&x-signature=DactTsVL3JIxB%2BCjMI5t%2Fv9Wmw0%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 0.864,
"message": "ok"
},
{
"duration": 2.184,
"message": "ok"
},
{
"duration": 3.168,
"message": "ok"
},
{
"duration": 4.128,
"message": "ok"
},
{
"duration": 2.136,
"message": "ok"
}
],
"textList": [
"比如",
"当你在嘈杂的咖啡馆里",
"却能清晰地听到朋友的声音",
"这是因为你的大脑自动过滤了无关信息",
"专注于你感兴趣的部分"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 4.898,
"link": "https://lf6-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_b69d9a81-c58c-4a96-a51a-a7e4e3f79b37.mp3?lk3s=da27ec82&x-expires=1760162816&x-signature=eo0i5v4NmF%2FD0tGPgM4sjfb98zs%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 4.992,
"message": "ok"
}
],
"textList": [
"这种现象在心理学中被称为‘鸡尾酒会效应’"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 3.805,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_fdaa9eb1-1885-4aac-87de-8b608213bd2c.mp3?lk3s=da27ec82&x-expires=1760162818&x-signature=vIB1EEHrdgKPmdZmFYf5gkXhDNo%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 2.898,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_4a64f754-df94-42e2-9ac4-b6e70c87c5ba.mp3?lk3s=da27ec82&x-expires=1760162818&x-signature=SfNyy7SzvdwrCb3prOrj4zdQzPI%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 3.888,
"message": "ok"
},
{
"duration": 2.976,
"message": "ok"
}
],
"textList": [
"知觉选择性不仅帮助我们高效处理信息",
"还影响着我们的情绪和行为"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 0.852,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_ff9d1717-dee1-4014-9a22-0bb616659ce0.mp3?lk3s=da27ec82&x-expires=1760162819&x-signature=RQ89evmqfmRJfZGILjfzdHNQkuE%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.883,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_6c1fb5ec-3b88-4b4a-baa9-6c3f999315df.mp3?lk3s=da27ec82&x-expires=1760162819&x-signature=H8ibNY8SGvSwNj3%2B%2FrE0kDlimp0%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.838,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_e4b7cf3d-fc07-44e1-a676-e7c045499040.mp3?lk3s=da27ec82&x-expires=1760162821&x-signature=viTV1E%2FBbnoZncrhs6uAWndTR18%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 0.936,
"message": "ok"
},
{
"duration": 3.96,
"message": "ok"
},
{
"duration": 3.912,
"message": "ok"
}
],
"textList": [
"例如",
"一个乐观的人更容易注意到积极的信息",
"而悲观的人则可能更关注负面事件"
]
},
{
"audioList": [
{
"code": 0,
"data": {
"duration": 2.506,
"link": "https://lf9-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_edf0936b-fd99-4424-a26d-1a16cc2ff284.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=oMEcImxulDlmD9Xvq234tk1qhEA%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 3.161,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_a11448c9-bcc5-4518-8467-df84195b94f6.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=dqb9jy%2FKf6ck9EpmXjmdsW4DK%2B8%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
},
{
"code": 0,
"data": {
"duration": 1.609,
"link": "https://lf3-appstore-sign.oceancloudapi.com/ocean-cloud-tos/VolcanoUserVoice/speech_7426725529681657907_19f898f4-e6ff-4429-b552-66472ae07576.mp3?lk3s=da27ec82&x-expires=1760162822&x-signature=7wTlRpjzpPJFI2G5iR9zIReXYO4%3D"
},
"log_id": "2025100814064194EFAEF17607D0AC4B56",
"msg": "success"
}
],
"durationList": [
{
"duration": 2.592,
"message": "ok"
},
{
"duration": 3.24,
"message": "ok"
},
{
"duration": 1.704,
"message": "ok"
}
],
"textList": [
"了解知觉选择性",
"可以帮助我们更好地管理注意力",
"提升生活质量"
]
},
{
"audioList": [],
"durationList": [],
"textList": []
}
],
"width": "1920"
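}
The structure is actually fairly complex. The JSON file is generated by a workflow, and one more assembly step is needed to get the finished product.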
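To make the nesting concrete, here is a minimal sketch of walking the `captions` array and pairing each `textList` entry with the `audioList` entry at the same index. The function name `iter_cues` is hypothetical and not part of the project, and the sketch assumes the two lists are index-aligned, as they appear to be in the sample above.

```python
from typing import Iterator, Tuple


def iter_cues(data: dict) -> Iterator[Tuple[str, float, str]]:
    """Yield (text, duration_seconds, audio_url) for every caption line."""
    for group in data.get("captions", []):
        texts = group.get("textList", [])
        audios = group.get("audioList", [])
        # zip() stops at the shorter list, so empty or uneven groups are safe.
        for text, audio in zip(texts, audios):
            yield text, float(audio["data"]["duration"]), audio["data"]["link"]
```

Calling `list(iter_cues(data))` on the loaded JSON gives a flat list that is easier to reason about than the nested groups, for example when summing durations or counting audio clips.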
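A common approach is to use the assistant tool (小助手) to generate a draft and then bring it into Jianying (剪映) to render the video.
But having used AI this far, assembling the video by hand feels less than elegant.
Hence this experiment. The main reason, admittedly, is that I use Chrome OS, for which the assistant tool has no client.
So the goal of the program is:
Take a JSON file produced by the workflow as input and produce an mp4 file whose image video stream, audio stream, and subtitle stream all line up.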
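One way to read this goal is as a timeline: each image stays on screen for as long as the narration of its group lasts. The sketch below plans that timeline. It assumes that `imageList[i]` belongs to `materialList[i]` (both have seven entries in the sample, but the source does not state this mapping), and `Segment` and `plan_segments` are hypothetical names.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    image: str    # image URL or path shown during this segment
    start: float  # start time in seconds
    end: float    # end time in seconds


def plan_segments(data: dict) -> List[Segment]:
    """Give each image the combined audio duration of its material group."""
    segments = []
    cursor = 0.0
    for image, group in zip(data["imageList"], data["materialList"]):
        length = sum(float(a["data"]["duration"]) for a in group["audioList"])
        if length == 0:
            continue  # skip groups with no narration, such as a trailing empty one
        segments.append(Segment(image, cursor, cursor + length))
        cursor += length
    return segments
```

With a plan like this, the video, audio, and subtitle streams can all be derived from the same start and end times, which is what keeps them matched.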
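By describing requirements in natural language, you can have CodeBuddy code CLI quickly generate or modify code.
Our first instruction:
Based on the file res.json, use python to turn it into a video, using ffmpeg, and finally give me a gen.py
CodeBuddy code CLI automatically analyzes the code context and completes the changes.
This prompt is not very specific, for two reasons: first, I wanted to test the CLI's analytical ability; second, I am not very familiar with video compositing in ffmpeg myself.
But it really did generate the script. Although the prompt gave little information, res.json carries plenty, and it is AI-friendly structured data.
The first result did not work well when run: the audio in the video was fine, but the picture used only the single background image, and there were no subtitles at all.
However, without my asking explicitly, it wrote code to download the resources and save them locally, which is impressive.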
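To find the cause, I gave it the following prompt:
Do not delete the assets when the task finishes; name the asset directory by timestamp, and name the output mp4 by timestamp as well
The main issue was that without intermediate artifacts it is really hard to locate problems; video compositing is a tedious process.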
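Seeing that the generated clips contained only the background image, I gave another prompt:
You need to place the image sensibly on top of bg.png
With these intermediate artifacts, problems became easy to locate, and a further round of prompts followed.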
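What "sensibly" means for the overlay is left open by the prompt. One possible reading, sketched below, is to scale the image to fit within a fraction of the background while keeping its aspect ratio, then center it. The function `fit_centered` and the default `scale` of 0.8 are assumptions for illustration, not the project's code.

```python
from typing import Tuple


def fit_centered(bg_size: Tuple[int, int], img_size: Tuple[int, int],
                 scale: float = 0.8) -> Tuple[int, int, int, int]:
    """Return (width, height, x, y) for an image scaled to fit inside
    `scale` of the background, keeping its aspect ratio and centered."""
    bg_w, bg_h = bg_size
    img_w, img_h = img_size
    # The smaller ratio guarantees that both dimensions fit in the box.
    ratio = min(bg_w * scale / img_w, bg_h * scale / img_h)
    new_w = max(1, round(img_w * ratio))
    new_h = max(1, round(img_h * ratio))
    return new_w, new_h, (bg_w - new_w) // 2, (bg_h - new_h) // 2
```

The returned size can be passed to Pillow's `Image.resize` and the position to `Image.paste`. Keeping the aspect ratio avoids stretching images whose proportions differ from the background.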
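I can see that /workspace/temp_20251008_034916/processed_image_6.png has both the background and the image, but the final video shows only the background
You said: "Changes: removed the image-sequence concatenation logic and used the first overlaid image directly as input." That does generate a video, but a single image does not meet the requirement. I saw several concatenation errors. Why not generate a video segment from each audio clip and its composited image and then merge the segments into the full video? Or check the latest ffmpeg documentation for a better approach.
The duration of each seg should be determined by its corresponding audio duration, and the segs must finally be merged into the complete video
If I had debugged this myself, I would first have had to read the ffmpeg documentation and might not have figured it out in a day. With CodeBuddy code CLI I only need to write the prompt and it generates the code within minutes, which is very convenient.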
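The segment-then-concatenate approach described in these prompts can be sketched as two kinds of ffmpeg invocations. The helper names below are hypothetical and the flag choices are one reasonable set rather than the exact commands the CLI generated.

```python
from typing import List


def segment_command(image: str, audio: str, duration: float, out: str) -> List[str]:
    """Build an ffmpeg argv that loops one still image under one audio clip."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image,   # repeat the still image as the video input
        "-i", audio,
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-t", f"{duration:.3f}",     # cut the segment at the audio length
        out,
    ]


def concat_list(segments: List[str]) -> str:
    """Return the text of a concat-demuxer list file for the segments."""
    return "".join(f"file '{name}'\n" for name in segments)


def concat_command(list_file: str, out: str) -> List[str]:
    """Build an ffmpeg argv that joins the listed segments without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", out]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Stream copy in the final step only works when all segments share the same codec parameters, which holds here because every segment is produced by the same command.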
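In a video generation project, the subtitle timeline must match the audio durations exactly. With CodeBuddy code CLI we quickly fixed the subtitle generation logic so that each subtitle line uses the duration of its corresponding audio clip, avoiding a misaligned timeline.
Now add a feature: add subtitles, referring to the captions field in res.json; they must be hard subtitles and must support Chinese
This is the code it produced: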
# Generate subtitle file (.srt)
subtitle_file = os.path.join(temp_dir, "subtitles.srt")
with open(subtitle_file, 'w', encoding='utf-8') as f:
    start_time = 0
    for i, caption_data in enumerate(data["captions"]):
        if not caption_data.get("textList"):
            continue
        # Ensure audioList and textList are aligned
        for j, text in enumerate(caption_data["textList"]):
            if j >= len(caption_data["audioList"]):
                break
            duration = float(caption_data["audioList"][j]["data"]["duration"])
            end_time = start_time + duration
            f.write(f"{i + 1}.{j + 1}\n")
            f.write(f"{datetime.utcfromtimestamp(start_time).strftime('%H:%M:%S,%f')[:-3]} --> {datetime.utcfromtimestamp(end_time).strftime('%H:%M:%S,%f')[:-3]}\n")
            f.write(f"{text}\n\n")
            start_time = end_time
This produced the following srt file:
1.1
00:00:00,000 --> 00:00:01,549
在日常生活中

1.2
00:00:01,549 --> 00:00:05,769
我们常常会注意到某些事物而忽略其他

1.3
00:00:05,769 --> 00:00:08,104
这就是知觉选择性的体现

2.1
00:00:08,104 --> 00:00:08,892
比如

2.2
00:00:08,892 --> 00:00:10,984
当你在嘈杂的咖啡馆里

2.3
00:00:10,984 --> 00:00:14,058
却能清晰地听到朋友的声音

2.4
00:00:14,058 --> 00:00:18,104
这是因为你的大脑自动过滤了无关信息

2.5
00:00:18,104 --> 00:00:20,160
专注于你感兴趣的部分

3.1
00:00:20,160 --> 00:00:25,058
这种现象在心理学中被称为‘鸡尾酒会效应’

4.1
00:00:25,058 --> 00:00:28,863
知觉选择性不仅帮助我们高效处理信息

4.2
00:00:28,863 --> 00:00:31,761
还影响着我们的情绪和行为

5.1
00:00:31,761 --> 00:00:32,613
例如

5.2
00:00:32,613 --> 00:00:36,496
一个乐观的人更容易注意到积极的信息

5.3
00:00:36,496 --> 00:00:40,334
而悲观的人则可能更关注负面事件

6.1
00:00:40,334 --> 00:00:42,840
了解知觉选择性

6.2
00:00:42,840 --> 00:00:46,001
可以帮助我们更好地管理注意力

6.3
00:00:46,001 --> 00:00:47,610
提升生活质量
Having the CLI automatically generate project documentation (such as README.md) significantly improves project management efficiency.
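As a possible refinement of the subtitle step, the sketch below formats SRT timestamps with integer millisecond arithmetic and numbers the cues with plain sequential integers, which is the conventional SRT form (the generated file uses labels such as 1.1). The names `srt_timestamp` and `build_srt` are hypothetical. Unlike the `strftime(...)[:-3]` approach, this version rounds to the nearest millisecond instead of truncating and does not wrap after 24 hours.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues: Iterable[Tuple[str, float]]) -> str:
    """Build SRT text from (text, duration) pairs laid end to end."""
    blocks = []
    start = 0.0
    for index, (text, duration) in enumerate(cues, start=1):
        end = start + duration
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
        start = end
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

Feeding it the `(text, duration)` pairs taken from `captions` yields the same timeline as above, with only the cue numbering changed.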
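Executing ffmpeg commands through the CLI completes the video compositing automatically, and the output confirms whether the video was generated successfully.
Just give the order: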
Run gen.py
import json
import os
import time
import requests
from PIL import Image, ImageDraw, ImageFont
import subprocess
from datetime import datetime


def download_file(url, save_path, max_retries=3):
    # Stream the file to disk, retrying a few times on failure
    for attempt in range(max_retries):
        try:
            response = requests.get(url, stream=True, timeout=10)
            response.raise_for_status()
            with open(save_path, 'wb') as f:
                for chunk in response.iter_content(chunk_size=8192):
                    f.write(chunk)
            return
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            print(f"Download failed (attempt {attempt + 1}/{max_retries}): {e}")
            time.sleep(1)


def add_text_to_image(image_path, text, output_path, font_size=40):
    # Draw centered white text on an image (not called below; subtitles are burned in by ffmpeg)
    image = Image.open(image_path)
    draw = ImageDraw.Draw(image)
    font = ImageFont.truetype("/usr/share/fonts/truetype/noto/NotoSansCJK-Regular.ttc", font_size)
    text_width = draw.textlength(text, font=font)
    text_height = font_size
    position = ((image.width - text_width) // 2, (image.height - text_height) // 2)
    draw.text(position, text, font=font, fill="white")
    image.save(output_path)


def generate_video(bg_image_path, image_list, audio_list, output_video_path):
    # Create a temporary directory for intermediate files
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    temp_dir = f"temp_{timestamp}"
    os.makedirs(temp_dir, exist_ok=True)

    # Download background image
    bg_image_local = os.path.join(temp_dir, "bg.png")
    download_file(bg_image_path, bg_image_local)

    # Download and process images
    processed_images = []
    for i, image_url in enumerate(image_list):
        image_local = os.path.join(temp_dir, f"image_{i}.png")
        download_file(image_url, image_local)
        # Overlay image on bg.png
        bg_image = Image.open(bg_image_local)
        overlay_image = Image.open(image_local)
        if overlay_image.mode != 'RGBA':
            overlay_image = overlay_image.convert('RGBA')
        # Resize overlay image to 80% of background image size
        new_width = int(bg_image.width * 0.8)
        new_height = int(bg_image.height * 0.8)
        overlay_image = overlay_image.resize((new_width, new_height), Image.Resampling.LANCZOS)
        # Calculate position to center the overlay image
        position = ((bg_image.width - new_width) // 2, (bg_image.height - new_height) // 2)
        bg_image.paste(overlay_image, position, overlay_image)
        # Skip adding caption to the image directly
        processed_image_path = os.path.join(temp_dir, f"processed_image_{i}.png")
        bg_image.save(processed_image_path)
        processed_images.append(processed_image_path)

    # Download audio files
    audio_files = []
    for i, audio_url in enumerate(audio_list):
        audio_local = os.path.join(temp_dir, f"audio_{i}.mp3")
        download_file(audio_url, audio_local)
        audio_files.append(audio_local)

    # Generate subtitle file (.srt)
    # Note: this reads the module-level `data` loaded in the __main__ block
    subtitle_file = os.path.join(temp_dir, "subtitles.srt")
    with open(subtitle_file, 'w', encoding='utf-8') as f:
        start_time = 0
        for i, caption_data in enumerate(data["captions"]):
            if not caption_data.get("textList"):
                continue
            # Ensure audioList and textList are aligned
            for j, text in enumerate(caption_data["textList"]):
                if j >= len(caption_data["audioList"]):
                    break
                duration = float(caption_data["audioList"][j]["data"]["duration"])
                end_time = start_time + duration
                f.write(f"{i + 1}.{j + 1}\n")
                f.write(f"{datetime.utcfromtimestamp(start_time).strftime('%H:%M:%S,%f')[:-3]} --> {datetime.utcfromtimestamp(end_time).strftime('%H:%M:%S,%f')[:-3]}\n")
                f.write(f"{text}\n\n")
                start_time = end_time

    # Generate video using ffmpeg
    # Create a text file listing all processed images for ffmpeg
    # (this list is written but not used by the commands below)
    concat_file = os.path.join(temp_dir, "concat.txt")
    with open(concat_file, 'w') as f:
        for image_path in processed_images:
            f.write(f"file '{image_path}'\nduration 2\n")

    # Create a text file listing all audio files for ffmpeg
    # (this list is written but not used by the commands below)
    audio_concat_file = os.path.join(temp_dir, "audio_concat.txt")
    with open(audio_concat_file, 'w') as f:
        for audio_path in audio_files:
            f.write(f"file '{audio_path}'\n")

    # Generate video segments for each image and audio pair
    segment_files = []
    for i, image_path in enumerate(processed_images):
        segment_path = os.path.join(temp_dir, f"segment_{i}.mp4")
        # Get audio duration using ffprobe
        # Image i is paired with audio i (wrapping around if there are fewer audio files than images)
        audio_path = audio_files[i % len(audio_files)]
        ffprobe_command = [
            "/usr/bin/ffprobe",
            "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            audio_path
        ]
        duration = subprocess.check_output(ffprobe_command).decode('utf-8').strip()
        # Generate video segment with dynamic duration
        ffmpeg_command = [
            "/usr/bin/ffmpeg",
            "-y",
            "-loop", "1",
            "-framerate", "1",
            "-i", image_path,
            "-i", audio_path,
            "-c:v", "libx264",
            "-c:a", "aac",
            "-strict", "experimental",
            "-t", duration,
            segment_path
        ]
        subprocess.run(ffmpeg_command, check=True)
        segment_files.append(segment_path)

    # Concatenate all segments into the final video
    segments_list_file = os.path.join(temp_dir, "segments.txt")
    with open(segments_list_file, 'w') as f:
        for segment_path in segment_files:
            # Ensure the path is relative to the segments.txt file
            segment_name = os.path.basename(segment_path)
            f.write(f"file '{segment_name}'\n")
    ffmpeg_command = [
        "/usr/bin/ffmpeg",
        "-y",
        "-f", "concat",
        "-safe", "0",
        "-i", segments_list_file,
        "-vf", f"subtitles={subtitle_file}",
        "-c:v", "libx264",
        "-c:a", "copy",
        output_video_path
    ]
    subprocess.run(ffmpeg_command, check=True)
    # Clean up temporary files
    # Keep the asset files; the temporary directory is no longer cleaned up


if __name__ == "__main__":
    # Load JSON file
    with open("/workspace/res.json", "r") as f:
        data = json.load(f)
    # Extract data
    bg_image = data["bgImage"]
    image_list = data["imageList"]
    audio_list = [audio["data"]["link"] for caption in data["captions"] for audio in caption["audioList"]]
    # Generate video
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_video_path = f"output_{timestamp}.mp4"
    generate_video(bg_image, image_list, audio_list, output_video_path)
    print(f"Video generated: {output_video_path}")
FFmpeg commands are quite complex; with the CLI tool, writing them becomes much simpler.
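One way to check that a run succeeded is to compare the duration of the output file with the total narration length from the JSON. The sketch below reuses the ffprobe flags from the script above. The helper names and the 0.5-second tolerance are assumptions for illustration.

```python
import subprocess
from typing import List


def ffprobe_duration_command(path: str) -> List[str]:
    """Build an ffprobe argv that prints only the container duration."""
    return ["ffprobe", "-v", "error", "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1", path]


def probe_duration(path: str) -> float:
    """Run ffprobe (must be installed) and return the duration in seconds."""
    output = subprocess.check_output(ffprobe_duration_command(path))
    return float(output.decode("utf-8").strip())


def expected_duration(data: dict) -> float:
    """Sum the duration of every audio clip listed under captions."""
    return sum(float(audio["data"]["duration"])
               for group in data["captions"]
               for audio in group["audioList"])


def duration_matches(expected: float, actual: float, tolerance: float = 0.5) -> bool:
    """Return True when the two durations differ by at most `tolerance` seconds."""
    return abs(expected - actual) <= tolerance
```

After gen.py finishes, `duration_matches(expected_duration(data), probe_duration(output_video_path))` gives a quick pass or fail signal. A large gap suggests that some audio clips or segments were dropped during compositing.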
Applying CodeBuddy code CLI to this video generation project demonstrates its strong capabilities in code generation, file operations, and command execution.
@CodeBuddy
#CodeBuddy Code
#AI CLI
#无界生成力
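Original content statement: This article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, please contact cloudcommunity@tencent.com for removal.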