Initial commit: add all skills files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
127
gif-sticker-maker/SKILL.md
Normal file
@@ -0,0 +1,127 @@
---
name: gif-sticker-maker
description: |
  Convert photos (people, pets, objects, logos) into 4 animated GIF stickers with captions.
  Use when: user wants to create cartoon stickers, GIF expressions, emoji packs, animated avatars,
  or convert photos to Funko Pop / Pop Mart blind box style animations.
  Triggers: sticker, GIF, cartoon, emoji, expression pack, avatar animation.
license: MIT
metadata:
  version: "1.2"
  category: creative-tools
  style: Funko Pop / Pop Mart
  output_format: GIF
  output_count: 4
  sources:
    - MiniMax Image Generation API
    - MiniMax Video Generation API
---

# GIF Sticker Maker

Convert user photos into 4 animated GIF stickers (Funko Pop / Pop Mart style).

## Style Spec

- Funko Pop / Pop Mart blind box 3D figurine
- C4D / Octane rendering quality
- White background, soft studio lighting
- Caption: black text + white outline, bottom of image

## Prerequisites

Before starting any generation step, ensure:

1. **Python venv** is activated with dependencies from [requirements.txt](references/requirements.txt) installed
2. **`MINIMAX_API_KEY`** is exported (e.g. `export MINIMAX_API_KEY='your-key'`)
3. **`ffmpeg`** is available on PATH (for Step 3 GIF conversion)

If any prerequisite is missing, set it up first. Do NOT proceed to generation without all three.
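
The three prerequisites can be checked up front, before any generation call. The following is a minimal sketch and not part of the skill's scripts: the function name `missing_prerequisites` and its `env` parameter are hypothetical, and it only tests that `requests` (the single package listed in requirements.txt) is importable rather than inspecting the venv itself.

```python
import importlib.util
import os
import shutil


def missing_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # Prerequisite 1: the Python dependency from requirements.txt is importable.
    if importlib.util.find_spec("requests") is None:
        problems.append("Python package 'requests' is not installed")
    # Prerequisite 2: the API key is exported and non-empty.
    if not env.get("MINIMAX_API_KEY"):
        problems.append("MINIMAX_API_KEY is not exported")
    # Prerequisite 3: ffmpeg can be found on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg is not on PATH")
    return problems
```

If the returned list is non-empty, report the problems and stop instead of proceeding to Step 1.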

## Workflow

### Step 0: Collect Captions

Ask the user (in their language):
> "Would you like to customize the captions for your stickers, or use the defaults?"

- **Custom**: Collect 4 short captions (1–3 words each). Choose each sticker's action to match the meaning of its caption.
- **Default**: Look up the [captions table](references/captions.md) by the **detected user language**. **Never mix languages.**
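
The default lookup can be pictured as a single table access that always returns all four captions from one language column, which is what keeps languages from mixing. This is a rough sketch only: the caption values are copied from references/captions.md, but the two-letter language codes, the name `default_captions`, and the fallback to English for an unlisted language are assumptions made for illustration.

```python
# Caption values copied from references/captions.md; the language codes
# used as keys are an assumption of this sketch.
DEFAULT_CAPTIONS = {
    "en": {"hi": "Hi~", "laugh": "LOL", "cry": "Boo-hoo", "love": "Love ya"},
    "es": {"hi": "¡Hola!", "laugh": "Jajaja", "cry": "Buaaa", "love": "Te quiero"},
    "fr": {"hi": "Salut~", "laugh": "MDR", "cry": "Snif", "love": "Je t'aime"},
    "de": {"hi": "Hallo~", "laugh": "Haha", "cry": "Heul", "love": "Liebe"},
    "zh": {"hi": "嗨~", "laugh": "哈哈哈", "cry": "呜呜呜", "love": "爱你哦"},
    "ja": {"hi": "やあ~", "laugh": "笑", "cry": "えーん", "love": "大好き"},
    "ko": {"hi": "안녕~", "laugh": "ㅋㅋㅋ", "cry": "흑흑", "love": "사랑해"},
}


def default_captions(lang_code):
    """Return the four default captions for one language, keyed by filename ID.

    All four values come from the same column, so languages never mix.
    Falling back to English for an unknown code is an assumption.
    """
    return dict(DEFAULT_CAPTIONS.get(lang_code, DEFAULT_CAPTIONS["en"]))
```

For example, `default_captions("fr")` yields the French column keyed by `hi`, `laugh`, `cry`, and `love`.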

### Step 1: Generate 4 Static Sticker Images

**Tool**: `scripts/minimax_image.py`

1. Analyze the user's photo and identify the subject type (person / animal / object / logo).
2. For each of the 4 stickers, build a prompt from [image-prompt-template.txt](assets/image-prompt-template.txt) by filling `{action}` and `{caption}`.
3. **If the subject is a person**: pass `--subject-ref <user_photo_path>` so the generated figurine preserves the person's actual facial likeness.
4. Generate (all 4 are independent, so **run concurrently**):

```bash
python3 scripts/minimax_image.py "<prompt>" -o output/sticker_hi.png --ratio 1:1 --subject-ref <photo>
python3 scripts/minimax_image.py "<prompt>" -o output/sticker_laugh.png --ratio 1:1 --subject-ref <photo>
python3 scripts/minimax_image.py "<prompt>" -o output/sticker_cry.png --ratio 1:1 --subject-ref <photo>
python3 scripts/minimax_image.py "<prompt>" -o output/sticker_love.png --ratio 1:1 --subject-ref <photo>
```

> `--subject-ref` only works for person subjects (API limitation: type=character).
> For animals/objects/logos, omit the flag and rely on text description.
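
The prompt filling and the person-only `--subject-ref` rule can be sketched as two small helpers. This is an illustrative sketch under stated assumptions: `fill_template` and `image_command` are hypothetical names, the subject type is assumed to arrive as one of the strings `person`, `animal`, `object`, or `logo`, and plain string replacement is used so that any other braces in the template are left untouched.

```python
def fill_template(template, action, caption):
    """Substitute the two placeholders used by image-prompt-template.txt."""
    return template.replace("{action}", action).replace("{caption}", caption)


def image_command(prompt, sticker_id, subject_type, photo_path, out_dir="output"):
    """Build the argv for one Step 1 call.

    --subject-ref is appended only for person subjects, following the
    note above; other subject types rely on the text description alone.
    """
    cmd = [
        "python3", "scripts/minimax_image.py", prompt,
        "-o", f"{out_dir}/sticker_{sticker_id}.png",
        "--ratio", "1:1",
    ]
    if subject_type == "person":
        cmd += ["--subject-ref", photo_path]
    return cmd
```

Building the argv as a list, rather than a shell string, avoids quoting problems when a prompt contains quotes or newlines.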

### Step 2: Animate Each Image → Video

**Tool**: `scripts/minimax_video.py` with `--image` flag (image-to-video mode)

For each sticker image, build a prompt from [video-prompt-template.txt](assets/video-prompt-template.txt), then:

```bash
python3 scripts/minimax_video.py "<prompt>" --image output/sticker_hi.png -o output/sticker_hi.mp4
python3 scripts/minimax_video.py "<prompt>" --image output/sticker_laugh.png -o output/sticker_laugh.mp4
python3 scripts/minimax_video.py "<prompt>" --image output/sticker_cry.png -o output/sticker_cry.mp4
python3 scripts/minimax_video.py "<prompt>" --image output/sticker_love.png -o output/sticker_love.mp4
```

All 4 calls are independent — **run concurrently**.
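
One way to run the four independent calls at once is a small thread pool around `subprocess.run`. This is a sketch rather than the skill's own tooling: `video_command` and `run_concurrently` are hypothetical helpers, and `sys.executable` stands in for `python3` on the assumption that the activated venv interpreter should be reused.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def video_command(sticker_id, prompt, out_dir="output"):
    """Build the argv for one image-to-video call, mirroring the commands above."""
    return [
        sys.executable, "scripts/minimax_video.py", prompt,
        "--image", f"{out_dir}/sticker_{sticker_id}.png",
        "-o", f"{out_dir}/sticker_{sticker_id}.mp4",
    ]


def run_concurrently(commands, max_workers=4):
    """Run each argv in its own worker thread; return exit codes in input order."""
    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

Threads are sufficient here because each worker only waits on a child process. A non-zero exit code at some position identifies which sticker needs to be retried.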

### Step 3: Convert Videos → GIF

**Tool**: `scripts/convert_mp4_to_gif.py`

```bash
python3 scripts/convert_mp4_to_gif.py output/sticker_hi.mp4 output/sticker_laugh.mp4 output/sticker_cry.mp4 output/sticker_love.mp4
```

Outputs GIF files alongside each MP4 (e.g. `sticker_hi.gif`).
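
Before delivering, it can help to confirm that every expected GIF was actually written. The sketch below mirrors the converter's default naming (the MP4 path with its extension swapped to `.gif`); the helper names `gif_path_for` and `missing_gifs` are hypothetical and are not part of `convert_mp4_to_gif.py`.

```python
import os


def gif_path_for(mp4_path):
    """Mirror the converter's default naming: same path, .gif extension."""
    return os.path.splitext(mp4_path)[0] + ".gif"


def missing_gifs(mp4_paths):
    """Return the expected GIF paths that do not exist on disk."""
    return [g for g in map(gif_path_for, mp4_paths) if not os.path.isfile(g)]
```

An empty result from `missing_gifs` means all conversions produced a file and Step 4 can proceed.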

### Step 4: Deliver

Output format (strict order):
1. Brief status line (e.g. "4 stickers created:")
2. `<deliver_assets>` block with all GIF files
3. **NO text after deliver_assets**

```xml
<deliver_assets>
<item><path>output/sticker_hi.gif</path></item>
<item><path>output/sticker_laugh.gif</path></item>
<item><path>output/sticker_cry.gif</path></item>
<item><path>output/sticker_love.gif</path></item>
</deliver_assets>
```
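
The strict ordering can be enforced by assembling the response programmatically, so the block is always the final text. This is a minimal sketch: `deliver_assets_block` and `build_response` are hypothetical names, and XML-escaping the paths is an assumption added for safety, not something the format above specifies.

```python
from xml.sax.saxutils import escape


def deliver_assets_block(gif_paths):
    """Render the deliver_assets block with one item line per GIF path."""
    items = [f"<item><path>{escape(p)}</path></item>" for p in gif_paths]
    return "\n".join(["<deliver_assets>", *items, "</deliver_assets>"])


def build_response(status_line, gif_paths):
    """Status line first, then the block; nothing is appended after it."""
    return f"{status_line}\n{deliver_assets_block(gif_paths)}"
```

Because `build_response` returns a string that ends with the closing tag, no trailing text can slip in after the block.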

## Default Actions

| # | Action | Filename ID | Animation |
|---|--------|-------------|-----------|
| 1 | Happy waving | hi | Wave hand, slight head tilt |
| 2 | Laughing hard | laugh | Shake with laughter, eyes squint |
| 3 | Crying tears | cry | Tears stream, body trembles |
| 4 | Heart gesture | love | Heart hands, eyes sparkle |

See [references/captions.md](references/captions.md) for multilingual caption defaults.

## Rules

- Detect the user's language; all outputs follow it
- Default captions MUST come from [captions.md](references/captions.md), using the column that matches the user's language; never mix languages
- All image prompts must be in **English** regardless of user language (only caption text is localized)
- `<deliver_assets>` must be LAST in response, no text after
23
gif-sticker-maker/assets/image-prompt-template.txt
Normal file
@@ -0,0 +1,23 @@
Transform the subject into a Funko Pop / Pop Mart blind box style 3D figurine.

Style:
- Cute cartoon proportions (large head, small body)
- 3D rendered (C4D/Octane quality), premium plastic/vinyl finish
- Clean white background, soft studio lighting

Subject handling:
- Person: preserve facial features, hairstyle, clothing
- Animal/Pet: preserve species, fur color, markings
- Object: stylize into cute mascot figurine
- Logo/Icon: transform to 3D toy, preserve original colors and shape

Action: {action}
Caption: "{caption}"

Caption rendering (CRITICAL — follow exactly):
- Black bold text with thick white outline stroke
- Large, clear sans-serif font (e.g. Impact, Helvetica Bold)
- MUST be placed at the absolute bottom center of the image as a standalone text banner
- MUST NOT appear on the character's body, clothing, or any accessory
- Leave visible gap between the character's feet and the caption text
- Text must have sharp anti-aliased edges — it must survive video animation without warping
14
gif-sticker-maker/assets/video-prompt-template.txt
Normal file
@@ -0,0 +1,14 @@
Animate this cute 3D cartoon figurine performing: {action}

Requirements:
- Smooth loopable motion, keep action within 6 seconds
- Character stays centered, white background remains static
- Text at bottom must stay sharp and stable — no warping, no blur

Action reference:
- hi: wave hand cheerfully, slight head tilt
- laugh: shake with laughter, eyes squint shut
- cry: tears stream down, body trembles gently
- love: make heart gesture with both hands, eyes sparkle

CRITICAL: The caption text must remain perfectly readable throughout the entire animation. Zero text distortion.
25
gif-sticker-maker/references/captions.md
Normal file
@@ -0,0 +1,25 @@
# Default Captions by Language

Select captions based on user's conversation language.

| Action | English | Spanish | French | German | Chinese | Japanese | Korean |
|--------|---------|---------|--------|--------|---------|----------|--------|
| Waving | Hi~ | ¡Hola! | Salut~ | Hallo~ | 嗨~ | やあ~ | 안녕~ |
| Laughing | LOL | Jajaja | MDR | Haha | 哈哈哈 | 笑 | ㅋㅋㅋ |
| Crying | Boo-hoo | Buaaa | Snif | Heul | 呜呜呜 | えーん | 흑흑 |
| Heart | Love ya | Te quiero | Je t'aime | Liebe | 爱你哦 | 大好き | 사랑해 |

## Filename Convention

| Action | Filename ID |
|--------|-------------|
| Happy waving | hi |
| Laughing hard | laugh |
| Crying tears | cry |
| Heart gesture | love |

## Custom Caption Guidelines

- Keep captions short: 1-3 words work best
- Actions auto-match caption meaning (e.g., "Sleepy" → yawning action)
- Users can provide captions in any language
5
gif-sticker-maker/references/requirements.txt
Normal file
@@ -0,0 +1,5 @@
# Python dependencies
requests>=2.28

# System dependency (install separately):
# ffmpeg — brew install ffmpeg (macOS) / apt install ffmpeg (Ubuntu)
89
gif-sticker-maker/scripts/convert_mp4_to_gif.py
Normal file
@@ -0,0 +1,89 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT
"""
Batch MP4 → GIF converter using ffmpeg.

Usage:
  python convert_mp4_to_gif.py sticker_hi.mp4 sticker_laugh.mp4 sticker_cry.mp4 sticker_love.mp4
  python convert_mp4_to_gif.py *.mp4 --fps 12 --width 320
  python convert_mp4_to_gif.py input.mp4 -o custom_output.gif

Requires: ffmpeg (must be on PATH)
"""

import os
import sys
import argparse
import subprocess
import shutil


def check_ffmpeg():
    if not shutil.which("ffmpeg"):
        raise SystemExit("ERROR: ffmpeg not found. Install via: brew install ffmpeg / apt install ffmpeg")


def mp4_to_gif(input_path: str, output_path: str, fps: int = 15, width: int = 360):
    """Convert a single MP4 to GIF via ffmpeg two-pass (palette for quality)."""
    if not os.path.isfile(input_path):
        print(f"SKIP: {input_path} not found", file=sys.stderr)
        return False

    palette = output_path + ".palette.png"
    scale_filter = f"fps={fps},scale={width}:-1:flags=lanczos"

    try:
        subprocess.run(
            ["ffmpeg", "-y", "-i", input_path,
             "-vf", f"{scale_filter},palettegen=stats_mode=diff",
             palette],
            check=True, capture_output=True,
        )
        subprocess.run(
            ["ffmpeg", "-y", "-i", input_path, "-i", palette,
             "-lavfi", f"{scale_filter} [x]; [x][1:v] paletteuse=dither=bayer:bayer_scale=5:diff_mode=rectangle",
             output_path],
            check=True, capture_output=True,
        )
    except subprocess.CalledProcessError as e:
        print(f"FAIL: {input_path} -> {e.stderr.decode()[-200:]}", file=sys.stderr)
        return False
    finally:
        if os.path.exists(palette):
            os.remove(palette)

    size = os.path.getsize(output_path)
    print(f"OK: {size:,} bytes -> {output_path}")
    return True


def main():
    p = argparse.ArgumentParser(description="Batch MP4 → GIF converter (ffmpeg two-pass palette)")
    p.add_argument("inputs", nargs="+", help="MP4 file(s) to convert")
    p.add_argument("-o", "--output", default=None, help="Output path (only for single file input)")
    p.add_argument("--fps", type=int, default=15, help="GIF frame rate (default: 15)")
    p.add_argument("--width", type=int, default=360, help="GIF width in pixels, height auto-scaled (default: 360)")
    args = p.parse_args()

    if args.output and len(args.inputs) > 1:
        raise SystemExit("ERROR: -o/--output only works with a single input file")

    check_ffmpeg()

    ok, fail = 0, 0
    for mp4 in args.inputs:
        if args.output:
            gif_path = args.output
        else:
            gif_path = os.path.splitext(mp4)[0] + ".gif"

        if mp4_to_gif(mp4, gif_path, fps=args.fps, width=args.width):
            ok += 1
        else:
            fail += 1

    print(f"\nDone: {ok} converted, {fail} failed")


if __name__ == "__main__":
    main()
158
gif-sticker-maker/scripts/minimax_image.py
Normal file
@@ -0,0 +1,158 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT
"""
MiniMax Text-to-Image — synchronous generation with optional character reference.

Usage:
  python3 minimax_image.py "A cat in space" -o cat.png
  python3 minimax_image.py "Mountain landscape" -o bg.png --ratio 16:9
  python3 minimax_image.py "Funko Pop figurine waving" -o sticker.png --subject-ref photo.jpg

Env: MINIMAX_API_KEY (required)
"""

import os
import sys
import json
import base64
import argparse
import requests

API_KEY = os.getenv("MINIMAX_API_KEY")
API_BASE = "https://api.minimax.io/v1"

ASPECT_RATIOS = ["1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"]


def _headers():
    if not API_KEY:
        raise SystemExit("ERROR: MINIMAX_API_KEY is not set.\n export MINIMAX_API_KEY='your-key'")
    return {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }


def _encode_image(image_path: str) -> str:
    """Read local image file and return base64 data URI."""
    ext = os.path.splitext(image_path)[1].lower().lstrip(".")
    mime_map = {"jpg": "jpeg", "jpeg": "jpeg", "png": "png", "webp": "webp"}
    mime = mime_map.get(ext, "jpeg")
    with open(image_path, "rb") as f:
        raw = f.read()
    return f"data:image/{mime};base64,{base64.b64encode(raw).decode()}"


def generate_image(
    prompt: str,
    model: str = "image-01",
    aspect_ratio: str = "1:1",
    n: int = 1,
    response_format: str = "url",
    prompt_optimizer: bool = False,
    seed: int = None,
    subject_reference: list = None,
) -> dict:
    """Generate image(s). Returns API response dict."""
    payload = {
        "model": model,
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "n": n,
        "response_format": response_format,
        "prompt_optimizer": prompt_optimizer,
    }
    if seed is not None:
        payload["seed"] = seed
    if subject_reference:
        payload["subject_reference"] = subject_reference

    resp = requests.post(
        f"{API_BASE}/image_generation",
        headers=_headers(),
        json=payload,
        timeout=120,
    )
    resp.raise_for_status()
    data = resp.json()

    base_resp = data.get("base_resp", {})
    if base_resp.get("status_code", 0) != 0:
        raise SystemExit(f"API Error [{base_resp.get('status_code')}]: {base_resp.get('status_msg')}")

    return data


def download_and_save(url: str, output_path: str):
    """Download image from URL and save."""
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    with open(output_path, "wb") as f:
        f.write(resp.content)
    return len(resp.content)


def main():
    p = argparse.ArgumentParser(description="MiniMax Text-to-Image")
    p.add_argument("prompt", help="Image description (max 1500 chars)")
    p.add_argument("-o", "--output", required=True, help="Output file path (.png/.jpg)")
    p.add_argument("--model", default="image-01", help="Model (default: image-01)")
    p.add_argument("--ratio", default="1:1", choices=ASPECT_RATIOS, help="Aspect ratio (default: 1:1)")
    p.add_argument("-n", "--count", type=int, default=1, choices=range(1, 10), help="Number of images (1-9, default: 1)")
    p.add_argument("--seed", type=int, default=None, help="Random seed for reproducibility")
    p.add_argument("--optimize", action="store_true", help="Enable prompt auto-optimization")
    p.add_argument("--base64", action="store_true", help="Use base64 response instead of URL")
    p.add_argument("--subject-ref", default=None,
                   help="Reference image for character likeness (local path or URL, person only)")
    p.add_argument("--subject-type", default="character",
                   help="Subject reference type (default: character)")
    args = p.parse_args()

    os.makedirs(os.path.dirname(args.output) or ".", exist_ok=True)

    subject_ref = None
    if args.subject_ref:
        ref_value = args.subject_ref
        if not ref_value.startswith(("http://", "https://", "data:")):
            ref_value = _encode_image(ref_value)
        subject_ref = [{"type": args.subject_type, "image_file": ref_value}]

    fmt = "base64" if args.base64 else "url"
    result = generate_image(
        prompt=args.prompt,
        model=args.model,
        aspect_ratio=args.ratio,
        n=args.count,
        response_format=fmt,
        prompt_optimizer=args.optimize,
        seed=args.seed,
        subject_reference=subject_ref,
    )

    meta = result.get("metadata", {})
    print(f"Generated: {meta.get('success_count', '?')} success, {meta.get('failed_count', '?')} failed")

    if args.base64:
        images = result.get("data", {}).get("image_base64", [])
        for i, b64 in enumerate(images):
            path = args.output if len(images) == 1 else _numbered_path(args.output, i)
            raw = base64.b64decode(b64)
            with open(path, "wb") as f:
                f.write(raw)
            print(f"OK: {len(raw)} bytes -> {path}")
    else:
        urls = result.get("data", {}).get("image_urls", [])
        for i, url in enumerate(urls):
            path = args.output if len(urls) == 1 else _numbered_path(args.output, i)
            size = download_and_save(url, path)
            print(f"OK: {size} bytes -> {path}")


def _numbered_path(path: str, index: int) -> str:
    """Insert index before extension: out.png -> out-0.png"""
    base, ext = os.path.splitext(path)
    return f"{base}-{index}{ext}"


if __name__ == "__main__":
    main()
226
gif-sticker-maker/scripts/minimax_video.py
Normal file
@@ -0,0 +1,226 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT
"""
MiniMax Video Generation — supports both Text-to-Video and Image-to-Video.

Usage (T2V):
  python minimax_video.py "A cat playing piano" -o cat.mp4
  python minimax_video.py "Ocean waves [Truck left]" -o waves.mp4 --duration 10

Usage (I2V):
  python minimax_video.py "Character waves cheerfully" --image sticker.png -o sticker.mp4
  python minimax_video.py "Figurine laughing" --image laugh.png -o laugh.mp4 --duration 6

Env: MINIMAX_API_KEY (required)
"""

import os
import sys
import json
import time
import base64
import argparse
import requests

API_KEY = os.getenv("MINIMAX_API_KEY")
API_BASE = "https://api.minimax.io/v1"

I2V_MODELS = [
    "MiniMax-Hailuo-2.3",
    "MiniMax-Hailuo-2.3-Fast",
    "MiniMax-Hailuo-02",
    "I2V-01-Director",
    "I2V-01-live",
    "I2V-01",
]

T2V_MODELS = [
    "MiniMax-Hailuo-2.3",
    "MiniMax-Hailuo-02",
    "T2V-01-Director",
    "T2V-01",
]


def _headers():
    if not API_KEY:
        raise SystemExit("ERROR: MINIMAX_API_KEY is not set.\n export MINIMAX_API_KEY='your-key'")
    return {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }


def _check_resp(data):
    base_resp = data.get("base_resp", {})
    code = base_resp.get("status_code", 0)
    if code != 0:
        msg = base_resp.get("status_msg", "Unknown error")
        raise SystemExit(f"API Error [{code}]: {msg}")


def _encode_image(image_path: str) -> str:
    """Read local image file and return base64 data URI."""
    ext = os.path.splitext(image_path)[1].lower().lstrip(".")
    mime_map = {"jpg": "jpeg", "jpeg": "jpeg", "png": "png", "webp": "webp"}
    mime = mime_map.get(ext, "png")

    with open(image_path, "rb") as f:
        raw = f.read()

    return f"data:image/{mime};base64,{base64.b64encode(raw).decode()}"


def create_task(
    prompt: str,
    model: str = "MiniMax-Hailuo-2.3",
    duration: int = 6,
    resolution: str = "768P",
    prompt_optimizer: bool = True,
    first_frame_image: str = None,
) -> str:
    """Submit a video generation task (T2V or I2V). Returns task_id."""
    payload = {
        "model": model,
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "prompt_optimizer": prompt_optimizer,
    }

    if first_frame_image:
        payload["first_frame_image"] = first_frame_image

    resp = requests.post(
        f"{API_BASE}/video_generation",
        headers=_headers(),
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    _check_resp(data)

    task_id = data.get("task_id")
    if not task_id:
        raise SystemExit(f"No task_id in response: {json.dumps(data, indent=2)}")
    return task_id


def poll_task(task_id: str, interval: int = 10, max_wait: int = 600) -> str:
    """Poll task status until Success. Returns file_id."""
    elapsed = 0
    while elapsed < max_wait:
        resp = requests.get(
            f"{API_BASE}/query/video_generation",
            headers=_headers(),
            params={"task_id": task_id},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        _check_resp(data)

        status = data.get("status", "")
        file_id = data.get("file_id", "")

        if status == "Success":
            if not file_id:
                raise SystemExit("Task succeeded but no file_id returned")
            print(f" Done! file_id={file_id}")
            return file_id
        elif status == "Fail":
            raise SystemExit(f"Video generation failed: {json.dumps(data, indent=2)}")
        else:
            print(f" [{elapsed}s] Status: {status}...")
        time.sleep(interval)
        elapsed += interval

    raise SystemExit(f"Timeout after {max_wait}s. task_id={task_id}, check manually.")


def download_video(file_id: str, output_path: str):
    """Retrieve download URL via file_id and save the video."""
    resp = requests.get(
        f"{API_BASE}/files/retrieve",
        headers=_headers(),
        params={"file_id": file_id},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    _check_resp(data)

    download_url = data.get("file", {}).get("download_url", "")
    if not download_url:
        raise SystemExit(f"No download_url in response: {json.dumps(data, indent=2)}")

    print(f" Downloading from {download_url[:80]}...")
    video_resp = requests.get(download_url, timeout=300)
    video_resp.raise_for_status()

    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    with open(output_path, "wb") as f:
        f.write(video_resp.content)

    print(f"OK: {len(video_resp.content)} bytes -> {output_path}")


def generate(
    prompt: str,
    output_path: str,
    model: str = "MiniMax-Hailuo-2.3",
    duration: int = 6,
    resolution: str = "768P",
    prompt_optimizer: bool = True,
    poll_interval: int = 10,
    max_wait: int = 600,
    image_path: str = None,
):
    """Full pipeline: create task -> poll -> download."""
    mode = "I2V" if image_path else "T2V"
    print(f"Creating {mode} task...")
    print(f" Model: {model} | Duration: {duration}s | Resolution: {resolution}")
    if image_path:
        print(f" Image: {image_path}")
    print(f" Prompt: {prompt[:100]}{'...' if len(prompt) > 100 else ''}")

    first_frame = _encode_image(image_path) if image_path else None
    task_id = create_task(prompt, model, duration, resolution, prompt_optimizer, first_frame)
    print(f" task_id={task_id}")
    print(f"Waiting for generation...")

    file_id = poll_task(task_id, poll_interval, max_wait)
    download_video(file_id, output_path)


def main():
    all_models = sorted(set(T2V_MODELS + I2V_MODELS))
    p = argparse.ArgumentParser(description="MiniMax Video Generation (T2V + I2V)")
    p.add_argument("prompt", help="Video description (max 2000 chars). Use [Camera Command] for camera control.")
    p.add_argument("-o", "--output", required=True, help="Output file path (.mp4)")
    p.add_argument("--image", default=None, help="First frame image path for I2V mode (jpg/png/webp, <20MB)")
    p.add_argument("--model", default="MiniMax-Hailuo-2.3", choices=all_models,
                   help="Model (default: MiniMax-Hailuo-2.3)")
    p.add_argument("--duration", type=int, default=6, choices=[6, 10], help="Duration in seconds (default: 6)")
    p.add_argument("--resolution", default="768P", choices=["720P", "768P", "1080P"], help="Resolution (default: 768P)")
    p.add_argument("--no-optimize", action="store_true", help="Disable prompt auto-optimization")
    p.add_argument("--poll-interval", type=int, default=10, help="Poll interval in seconds (default: 10)")
    p.add_argument("--max-wait", type=int, default=600, help="Max wait time in seconds (default: 600)")
    args = p.parse_args()

    generate(
        prompt=args.prompt,
        output_path=args.output,
        model=args.model,
        duration=args.duration,
        resolution=args.resolution,
        prompt_optimizer=not args.no_optimize,
        poll_interval=args.poll_interval,
        max_wait=args.max_wait,
        image_path=args.image,
    )


if __name__ == "__main__":
    main()