2026-04-21-music-generator-design
# AI Music Generator Design
# Goal
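Integrate the MiniMax music generation capability from /Users/ywang/Downloads/minimax-music-package into the blog and offer it to visitors as a free online tool. The first version covers only song generation from a theme or lyrics and instrumental generation; cover-song upload is not offered.
The core constraint is that the site owner's MiniMax token must never be exposed. Every MiniMax API call must go through a server-side proxy, and the token must not appear in the frontend, Markdown, build artifacts, API responses, or logs.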
# Existing Context
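The blog is currently a VuePress 1 + vdoing static site. Tool pages live in docs/07.工具/, and interactive components live in docs/.vuepress/components/.
A similar backend tool already exists:
- Page: docs/07.工具/20.学习工具/20.AI播客生成器.md
- Frontend component: docs/.vuepress/components/PodcastGeneratorTool.vue
- Backend service: pywork/podcast-generator/backend/app.py
- Production proxy: /tools/podcast-generator/api/ -> 127.0.0.1:5002/api/
- Deployment scripts: scripts/deploy_prod.sh syncs pywork/, scripts/remote_restart.sh starts gunicorn, and scripts/remote_configure_podcast_nginx.sh writes the nginx location.
The MiniMax package provides a Flask prototype:
- app.py contains the lyrics generation, music generation, cover, download, and style-list endpoints.
- templates/index.html is a standalone page and is not suitable for embedding directly in VuePress.
- The current task dictionary lives in memory, which does not suit multiple gunicorn workers or service restarts.
- The API token is read from the MINIMAX_API_KEY environment variable, which is the right direction, but rate limiting, persistent tasks, and deployment isolation still need to be added.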
# Recommended Approach
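Add a standalone service at pywork/music-generator/, add a VuePress component and a tool page on the frontend, and deploy with the same nginx + gunicorn pattern as the podcast generator.
This is safer than copying the MiniMax package straight into a static page: the frontend never touches the token, long-running generation tasks are queued and executed by the backend, and the server can apply rate limiting, concurrency control, file cleanup, and error sanitization.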
# Alternatives Considered
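# A. Embed the original Flask page directly
The advantage is speed: the prototype page already works.
The drawbacks are that the page style does not match the blog and that it is hard to reuse VuePress navigation, SEO, analytics, and the existing test patterns; it would also make /templates/index.html a second frontend. This option is rejected.
# B. Call the MiniMax API directly from the frontend
The advantage is that no backend service is needed.
The drawback is that it inevitably exposes the token, which conflicts with the project's core constraint. This option is prohibited.
# C. Reuse the existing podcast backend process
The advantage is one fewer service to run.
The drawback is that the two tools differ in dependencies, task model, output directory, rate-limiting policy, and environment variables, so coupling them would widen the impact of any failure. This option is not adopted for now.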
# Scope
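Version 1 includes:
- New tool page /tools/music-generator/
- Lyrics generation from a theme
- User editing of lyrics
- Song generation from lyrics
- Instrumental generation
- Style presets
- Task status polling
- Online playback of generated results
- MP3 download
- Free-quota notice
- Server-side rate limiting and concurrency control
- Automatic cleanup of output files
Version 1 does not include:
- Cover-song upload
- User login
- Payments or a credit system
- Custom singer voices
- A public gallery of past works
- A persistent multi-user works library
Cover-song upload is deferred because it brings user-uploaded audio, file-size limits, copyright risk, and more complex cost control. It can be designed later as a separate second phase.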
# Architecture
flowchart LR
User["Visitor"] --> Page["/tools/music-generator/ VuePress page"]
Page --> API["/tools/music-generator/api/*"]
API --> Nginx["Nginx reverse proxy"]
Nginx --> Flask["pywork/music-generator/backend Flask app"]
Flask --> DB["SQLite tasks + usage limits"]
Flask --> Worker["single background worker"]
Worker --> MiniMax["MiniMax API"]
Worker --> Output["output/*.mp3"]
Page --> Download["/tools/music-generator/api/download/:filename"]
Download --> Output
The VuePress page only talks to same-origin /tools/music-generator/api/*. Nginx forwards that path to a local-only Flask service. Flask reads MINIMAX_API_KEY from the server environment and calls MiniMax.
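The single background worker in the diagram could be sketched with the standard library alone. This is a minimal illustration, not the prototype's code: the class name `SingleWorker`, its methods, and the `handler` callback are hypothetical, and the queue bound mirrors the max_queue_size of 5 used elsewhere in this design.

```python
import queue
import threading


class SingleWorker:
    """One daemon thread draining a bounded queue (hypothetical sketch)."""

    def __init__(self, handler, max_queue_size=5):
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, task_id):
        """Return False when the queue is full; the route would answer 429."""
        try:
            self._queue.put_nowait(task_id)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            task_id = self._queue.get()
            try:
                self._handler(task_id)
            except Exception:
                # A real worker would mark the task as failed and log a
                # redacted summary; the sketch only keeps the loop alive.
                pass
            finally:
                self._queue.task_done()

    def wait_idle(self):
        """Block until every accepted task has been handled."""
        self._queue.join()
```

Because only one thread ever calls the handler, at most one upstream generation runs at a time, which is what a concurrency limit of 1 asks for.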
# Backend Service
Create:
- pywork/music-generator/backend/app.py
- pywork/music-generator/backend/config.py
- pywork/music-generator/backend/minimax_client.py
- pywork/music-generator/backend/task_store.py
- pywork/music-generator/backend/rate_limit.py
- pywork/music-generator/backend/cleanup.py
- pywork/music-generator/requirements.txt
- pywork/music-generator/README.md
- pywork/music-generator/tests/
Responsibilities:
- config.py: environment variables, output paths, limits, API host.
- minimax_client.py: MiniMax HTTP calls, SSE parsing, audio hex decoding, error normalization.
- task_store.py: SQLite-backed task creation, status update, lookup, usage counters.
- rate_limit.py: global and per-IP daily quota checks plus queue/concurrency checks.
- cleanup.py: delete expired output files and old completed tasks.
- app.py: Flask routes, request validation, worker dispatch, download endpoint.
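As one illustration of these responsibilities, the file-deletion half of cleanup.py might look like the sketch below. The function name and signature are assumptions, and it covers only expired output files, not old task rows; the 24-hour default mirrors MUSIC_OUTPUT_TTL_HOURS.

```python
import time
from pathlib import Path


def delete_expired_outputs(output_dir, ttl_hours=24, now=None):
    """Delete *.mp3 files older than ttl_hours and return their names.

    Hypothetical helper: age is judged by file modification time.
    """
    now = time.time() if now is None else now
    cutoff = now - ttl_hours * 3600
    removed = []
    for path in Path(output_dir).glob("*.mp3"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

It could run from a timer in the worker process or from cron; the design does not say which.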
Use SQLite instead of an in-memory dict because generation can take 60-300 seconds and the production service runs under gunicorn. Process restarts should not make status endpoints return "task not found" for recently created tasks.
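A minimal sketch of what a SQLite-backed store could look like follows. The table layout, the `TaskStore` class, and its method names are assumptions for illustration, not the contents of task_store.py; usage counters are left out.

```python
import sqlite3
import time
import uuid
from contextlib import closing


class TaskStore:
    """Tiny SQLite task store (hypothetical schema)."""

    def __init__(self, db_path):
        self.db_path = db_path
        with closing(self._connect()) as conn, conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS tasks (
                    task_id TEXT PRIMARY KEY,
                    status TEXT NOT NULL,
                    progress INTEGER NOT NULL,
                    created_at REAL NOT NULL,
                    updated_at REAL NOT NULL
                )
                """
            )

    def _connect(self):
        # One short-lived connection per call, so request threads and the
        # background worker never share a connection object.
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        return conn

    def create(self):
        # Eight hex characters, like the example IDs in this document.
        # A real store should retry on the rare primary-key collision.
        task_id = uuid.uuid4().hex[:8]
        now = time.time()
        with closing(self._connect()) as conn, conn:
            conn.execute(
                "INSERT INTO tasks (task_id, status, progress, created_at, updated_at) "
                "VALUES (?, 'pending', 0, ?, ?)",
                (task_id, now, now),
            )
        return task_id

    def update(self, task_id, status, progress):
        with closing(self._connect()) as conn, conn:
            conn.execute(
                "UPDATE tasks SET status = ?, progress = ?, updated_at = ? "
                "WHERE task_id = ?",
                (status, progress, time.time(), task_id),
            )

    def get(self, task_id):
        with closing(self._connect()) as conn:
            row = conn.execute(
                "SELECT task_id, status, progress FROM tasks WHERE task_id = ?",
                (task_id,),
            ).fetchone()
        return dict(row) if row else None
```

Because the state lives in a file, a second `TaskStore` opened on the same path after a restart still finds tasks created earlier.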
# Backend API
All routes are under /api/ internally and exposed online under /tools/music-generator/api/.
# GET /api/config
Returns public tool configuration:
{
"success": true,
"limits": {
"daily_global_limit": 20,
"daily_ip_limit": 2,
"max_queue_size": 5
},
"styles": [
{"id": "pop", "name": "流行", "prompt": "pop, catchy, upbeat, modern production"}
]
}
The response must not include environment variable names, token status details, or any token value.
# POST /api/lyrics
Request:
{
"prompt": "一首关于春天的民谣歌曲",
"title": "",
"mode": "write_full_song"
}
Validation:
- mode must be write_full_song in V1.
- prompt is required, 5-500 Chinese characters or equivalent text length.
- title is optional, max 80 characters.
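These rules could be checked with a small pure function such as the sketch below. The function name and the error category strings are hypothetical. It counts length with `len()`, which in Python counts code points, so one Chinese character counts as one; it also assumes a missing mode is rejected rather than defaulted.

```python
def validate_lyrics_request(payload):
    """Return (cleaned, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "invalid_request"

    # Assumption: a missing mode is rejected, not defaulted.
    if payload.get("mode") != "write_full_song":
        return None, "unsupported_mode"

    prompt = payload.get("prompt")
    if not isinstance(prompt, str):
        return None, "prompt_required"
    prompt = prompt.strip()
    if not 5 <= len(prompt) <= 500:
        return None, "prompt_length"

    title = payload.get("title") or ""
    if not isinstance(title, str) or len(title.strip()) > 80:
        return None, "title_length"

    cleaned = {"mode": "write_full_song", "prompt": prompt, "title": title.strip()}
    return cleaned, None
```

A route would map any non-None error to HTTP 400 with a user-facing message.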
Response:
{
"success": true,
"title": "春日晨曦",
"style": "folk, morning, fresh, spring",
"lyrics": "[Verse]\n..."
}
# POST /api/music/start
Request:
{
"lyrics": "[Verse]\n歌词内容...",
"prompt": "folk, acoustic, warm",
"instrumental": false
}
Validation:
- If instrumental is false, lyrics is required and max 3500 characters.
- prompt is required for instrumental mode and optional for vocal mode, max 500 characters.
- Rate limits and queue limits are checked before task creation.
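The field rules above might translate into a validator like this sketch. The function name and error categories are hypothetical, and it assumes lyrics are simply ignored in instrumental mode. The rate-limit and queue checks are separate and would run after this function succeeds.

```python
def validate_music_request(payload):
    """Return (cleaned, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, "invalid_request"

    instrumental = payload.get("instrumental", False)
    if not isinstance(instrumental, bool):
        return None, "invalid_instrumental"

    lyrics = payload.get("lyrics") or ""
    prompt = payload.get("prompt") or ""
    if not isinstance(lyrics, str) or not isinstance(prompt, str):
        return None, "invalid_request"
    lyrics, prompt = lyrics.strip(), prompt.strip()

    if len(prompt) > 500:
        return None, "prompt_length"
    if instrumental:
        if not prompt:
            return None, "prompt_required"
        lyrics = ""  # assumption: lyrics are ignored for instrumental tracks
    else:
        if not lyrics:
            return None, "lyrics_required"
        if len(lyrics) > 3500:
            return None, "lyrics_length"

    return {"lyrics": lyrics, "prompt": prompt, "instrumental": instrumental}, None
```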
Response:
{
"success": true,
"task_id": "9f4a1d2c",
"status": "pending"
}
# GET /api/music/status/<task_id>
Returns:
{
"success": true,
"task_id": "9f4a1d2c",
"status": "processing",
"progress": 40
}
When complete:
{
"success": true,
"task_id": "9f4a1d2c",
"status": "completed",
"progress": 100,
"download_url": "/tools/music-generator/api/download/9f4a1d2c.mp3",
"duration": 42.8
}
Failed responses expose a short user-facing error category, not raw upstream response bodies.
# GET /api/download/<filename>
Serves generated MP3 files.
Security requirements:
- Only allow filenames matching ^[a-f0-9]{8,32}\\.mp3$ in V1.
- Resolve paths inside the configured output directory.
- Return 404 for missing files.
- Never echo filesystem paths in JSON errors.
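The requirements above can be combined into one lookup helper. The sketch below is an assumption about how the route might do it, with a hypothetical function name; it returns None whenever the route should answer 404, so no path ever reaches an error message.

```python
import re
from pathlib import Path

# Lowercase hex task ID plus ".mp3", nothing else.
FILENAME_RE = re.compile(r"^[a-f0-9]{8,32}\.mp3$")


def resolve_download(output_dir, filename):
    """Return the Path to serve, or None when the answer should be 404."""
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if not FILENAME_RE.fullmatch(filename):
        return None
    base = Path(output_dir).resolve()
    candidate = (base / filename).resolve()
    # Defence in depth: a symlink inside the output directory must not
    # lead the download outside it.
    if candidate.parent != base:
        return None
    if not candidate.is_file():
        return None
    return candidate
```

The allow-list already blocks `../` sequences; the resolved-parent check covers the case where the output directory itself contains unexpected links.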
# Quotas and Abuse Controls
Default environment variables:
- MINIMAX_API_KEY: required in production.
- MINIMAX_API_HOST: default https://api.minimaxi.com.
- MUSIC_DAILY_GLOBAL_LIMIT: default 20.
- MUSIC_DAILY_IP_LIMIT: default 2.
- MUSIC_DAILY_LYRICS_IP_LIMIT: default 20.
- MUSIC_MAX_QUEUE_SIZE: default 5.
- MUSIC_MAX_CONCURRENT_JOBS: default 1.
- MUSIC_OUTPUT_TTL_HOURS: default 24.
The server stores usage by date and IP hash. Hashing means the first version never stores raw IP addresses. The hash can be sha256(date + ip + MUSIC_USAGE_SALT).
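The hash formula can be written out directly. In this sketch the function name and the ISO date format are assumptions; the salt would come from the MUSIC_USAGE_SALT environment variable.

```python
import hashlib
from datetime import date


def usage_key(ip, salt, day=None):
    """Return sha256(date + ip + salt) as a hex string.

    Assumption: the date is formatted as YYYY-MM-DD.
    """
    day = day or date.today()
    raw = f"{day.isoformat()}{ip}{salt}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()
```

Including the date in the hash means the same visitor gets a different key each day, so stored rows cannot be linked across days without the salt.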
Quota rules:
- Lyrics generation counts against MUSIC_DAILY_LYRICS_IP_LIMIT.
- Music generation counts against global and IP music quota when a task is accepted.
- Failed upstream calls still count once accepted, because they consume queue and may consume upstream quota.
- Queue full returns HTTP 429 with a friendly message.
- Missing server token returns HTTP 503 with "服务暂不可用" ("Service temporarily unavailable"), not "MINIMAX_API_KEY missing".
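The admission decision for a music task could be isolated in a pure function, which makes it easy to unit test. In the sketch below the names, the category strings, and the order of the checks are assumptions; the defaults mirror the environment variables listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    daily_global: int = 20  # MUSIC_DAILY_GLOBAL_LIMIT
    daily_ip: int = 2       # MUSIC_DAILY_IP_LIMIT
    max_queue: int = 5      # MUSIC_MAX_QUEUE_SIZE


def check_music_admission(global_used, ip_used, queued, token_configured,
                          limits=Limits()):
    """Return (http_status, category); (200, None) means accept the task."""
    if not token_configured:
        # The category never names the missing secret.
        return 503, "service_unavailable"
    if queued >= limits.max_queue:
        return 429, "queue_full"
    if global_used >= limits.daily_global or ip_used >= limits.daily_ip:
        return 429, "quota_exceeded"
    return 200, None
```

The caller would increment the usage counters only after a (200, None) result, which matches the rule that usage is counted once a task is accepted.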
# Token Safety
Token safety is a release gate.
Rules:
- Do not commit .env files.
- Do not place the token in Vue component constants, Markdown, JSON-LD, scripts, tests, or docs examples.
- Do not return raw MiniMax request headers or upstream error bodies to the browser.
- Do not log Authorization headers.
- Do not expose token presence through /api/config; config can say the service is available/unavailable without naming the secret.
- Deployment loads the token only on the remote server, outside rsync-managed source directories.
Verification should include a repository scan for:
MINIMAX_API_KEY
Bearer
Authorization
sk-
The expected source code may contain the environment variable name MINIMAX_API_KEY, but no token value.
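A plain text search for those four strings will also match legitimate uses of the variable name, so an automated check needs patterns that look for values rather than names. The sketch below is one possible approach; the regular expressions are guesses, since the real token format is not stated in this design, and the function name, file suffixes, and skipped directories are hypothetical.

```python
import re
from pathlib import Path

# Hypothetical value-shaped patterns; tune them to the real token format.
SUSPICIOUS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]{20,}"),
    re.compile(r"\bsk-[A-Za-z0-9]{16,}"),
    re.compile(r"MINIMAX_API_KEY\s*=\s*['\"]?[A-Za-z0-9._\-]{20,}"),
]
SKIP_DIRS = {".git", "node_modules", "dist"}
SUFFIXES = {".py", ".vue", ".md", ".sh", ".ts", ".js", ".json"}


def scan_repo(root):
    """Return sorted (relative_path, line_number) pairs that look like secrets."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if SKIP_DIRS.intersection(rel.parts):
            continue
        if not path.is_file() or path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in SUSPICIOUS):
                hits.append((str(rel), lineno))
    return hits
```

A test or a pre-deploy step could fail whenever `scan_repo` returns a non-empty list.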
# Frontend
Create docs/.vuepress/components/MusicGeneratorTool.vue.
User flow:
- User picks a mode: "主题写歌" (write a song from a theme) or "纯音乐" (instrumental).
- In "主题写歌", user enters a theme and optional title.
- User clicks "生成歌词" (generate lyrics).
- The generated title/style/lyrics appear in editable fields.
- User adjusts style prompt or lyrics.
- User clicks "生成音乐" (generate music).
- The page shows pending/processing progress and polls status.
- On completion, an audio player and download button appear.
UI state:
- idle
- generatingLyrics
- lyricsReady
- startingMusic
- musicProcessing
- musicCompleted
- error
The component should match existing tool page conventions: scoped component styles, no new frontend dependencies, responsive layout, and one page-view tracking event named tools_music_generator_view.
# Page and Navigation
Create page:
- docs/07.工具/20.学习工具/50.AI音乐生成器.md
- permalink: /tools/music-generator/
- title: AI音乐生成器
- article: false
- comment: false
Add links:
- docs/07.工具/00.概览.md under 学习工具.
- docs/.vuepress/config.ts Tools nav.
- Related links from docs/07.工具/20.学习工具/20.AI播客生成器.md and docs/07.工具/10.测试工具/71.样例音频库.md.
The page should clearly say it is free to use, but not promise unlimited generation. Copy should mention that music generation usually takes 1-5 minutes and may be limited by daily quota.
# Deployment
Add:
- scripts/remote_configure_music_nginx.sh
- update scripts/deploy_prod.sh
- update scripts/remote_restart.sh
- update scripts/smoke_test.sh
Online proxy:
location /tools/music-generator/api/ {
proxy_pass http://127.0.0.1:5003/api/;
proxy_connect_timeout 30s;
proxy_send_timeout 300s;
proxy_read_timeout 300s;
send_timeout 300s;
}
Gunicorn:
- bind: 127.0.0.1:5003
- app: app:app
- timeout: 360
- workers: 1 for V1, because an in-process background worker and a global concurrency limit are simpler and more predictable with a single process.
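Gunicorn can read these settings from a Python configuration file. The sketch below assumes a file named gunicorn.conf.py, which is a hypothetical choice: the existing restart script may pass the same values as command-line flags instead. The `wsgi_app` setting needs Gunicorn 20.1 or later.

```python
# gunicorn.conf.py (hypothetical file; values taken from the list above)

# Listen on loopback only, so nginx is the sole public entry point.
bind = "127.0.0.1:5003"

# Module and callable to serve.
wsgi_app = "app:app"

# A single process keeps the in-process worker and the global
# concurrency limit in one place.
workers = 1

# Longer than the 300 s nginx read timeout, so nginx gives up first.
timeout = 360
```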
The remote server must define MINIMAX_API_KEY outside the git repo. The deployment scripts should not overwrite or print it.
# Error Handling
Frontend messages:
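- Missing theme: 请输入歌曲主题 (Please enter a song theme)
- Lyrics generation failed: 歌词生成失败,请稍后重试 (Lyrics generation failed, please try again later)
- Quota exceeded: 今日免费额度已用完,明天再来试试 (Today's free quota is used up, come back tomorrow)
- Queue full: 当前排队人数较多,请稍后再试 (The queue is busy right now, please try again later)
- Music failed: 音乐生成失败,请调整歌词或风格后重试 (Music generation failed, please adjust the lyrics or style and retry)
- Timeout: 生成时间较长,请稍后刷新任务状态 (Generation is taking a while, please refresh the task status later)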
Backend errors:
- Return consistent {success: false, error: "..."} JSON.
- Use HTTP 400 for validation errors.
- Use HTTP 429 for quota/queue limits.
- Use HTTP 503 for missing server token or upstream unavailable.
- Use HTTP 500 only for unexpected local failures.
- Log upstream details server-side after stripping headers and long bodies.
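The status mapping and the log redaction above can live in two small helpers. Everything named in this sketch (the category strings, `error_response`, `summarize_upstream`, the 200-character cap) is an assumption for illustration.

```python
# Hypothetical internal categories mapped to the HTTP statuses listed above.
STATUS_BY_CATEGORY = {
    "validation": 400,
    "quota_exceeded": 429,
    "queue_full": 429,
    "service_unavailable": 503,
    "upstream_unavailable": 503,
}


def error_response(category, message):
    """Build the JSON body and status; unknown categories become 500."""
    status = STATUS_BY_CATEGORY.get(category, 500)
    return {"success": False, "error": message}, status


def summarize_upstream(status_code, body, max_len=200):
    """Reduce an upstream reply to a loggable summary.

    Headers are never passed in, so they cannot leak into the log.
    """
    text = body if isinstance(body, str) else repr(body)
    if len(text) > max_len:
        text = text[:max_len] + "..."
    return {"upstream_status": status_code, "body": text}
```

Flask 1.1 and later accept a (dict, status) tuple as a view's return value, so a route could return `error_response(...)` directly.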
# Testing
Backend tests:
- Config endpoint returns public fields and no token-like strings.
- Lyrics validation rejects missing prompt and overlong prompt.
- Music start rejects missing lyrics when not instrumental.
- Music start accepts instrumental prompt without lyrics.
- Rate limiter rejects over daily IP limit.
- Queue limiter rejects when max queue size is reached.
- Download rejects path traversal.
- SSE parser extracts final audio hex and duration.
- MiniMax client redacts upstream errors.
Frontend/wiring tests:
- Music tool page exists at the planned path.
- Page contains <MusicGeneratorTool />.
- Vue component calls /tools/music-generator/api.
- Vue component does not contain token-like strings.
- Tools overview links to /tools/music-generator/.
- VuePress config nav links to /tools/music-generator/.
Deployment smoke tests:
- https://wangmouren.online/tools/music-generator/ returns page content.
- https://wangmouren.online/tools/music-generator/api/config returns JSON.
- api/config response contains no token-like strings.
Manual verification:
- Generate lyrics from a short Chinese theme.
- Edit lyrics and generate one vocal song.
- Generate one instrumental track.
- Confirm result audio plays and downloads.
- Confirm exceeding the configured test quota returns the quota message.
# Rollout
- Build and test locally with fake MiniMax client responses.
- Start the Flask service locally with MINIMAX_API_KEY unset and confirm safe 503 behavior.
- Start locally with a real key only from shell environment and generate one real track.
- Deploy to the server after setting remote environment variables.
- Run smoke tests.
- Monitor quota and output directory size for the first day.
# Future Work
- Add cover-song upload as a separate phase with file validation, copyright warning, duration checks, and a stricter quota.
- Add admin-only usage dashboard.
- Add optional captcha or honeypot if public abuse appears.
- Store successful generated songs in object storage if local disk retention becomes a problem.