Two ways into a product clip, one API call: text-to-video (describe the shot) and image-to-video (animate a product photo). Image-to-video takes your packshot as first_frame_url and preserves the exact product — the brand workflow. Both 5s 720p clips cost $1.65; pricing is per second by resolution (480p $0.15, 720p $0.33, 1080p $0.823). Prototype motion at 480p, then re-render keepers at higher resolution.
HiAPI Team
Tutorial · Jun 23, 2026 · 6 min read
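
The per-second pricing in the summary above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes only the quoted rates; `clip_cost` and `workflow_cost` are illustrative helper names, not part of any hiapi SDK, and the draft and keeper counts in the usage lines are invented for the example.

```python
from decimal import Decimal

# Per-second rates in USD, as quoted in the summary above.
RATE_PER_SECOND = {
    "480p": Decimal("0.15"),
    "720p": Decimal("0.33"),
    "1080p": Decimal("0.823"),
}

def clip_cost(seconds: int, resolution: str) -> Decimal:
    """Cost of one clip: duration times the per-second rate."""
    return RATE_PER_SECOND[resolution] * seconds

def workflow_cost(drafts: int, keepers: int, seconds: int, final_resolution: str) -> Decimal:
    """Prototype every draft at 480p, then re-render only the keepers."""
    draft_total = drafts * clip_cost(seconds, "480p")
    keeper_total = keepers * clip_cost(seconds, final_resolution)
    return draft_total + keeper_total

print(clip_cost(5, "720p"))  # 1.65
# Hypothetical run: 10 drafts at 480p, 2 keepers re-rendered at 1080p.
print(workflow_cost(10, 2, 5, "1080p"))
```

`Decimal` keeps the dollar figures exact, which matters once the totals feed into a budget or an invoice check.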


Seedance 2.0 handles text-to-video and image-to-video in one model, with native audio and 4–15s clips at 480p/720p/1080p. Every render runs through hiapi's async task API: create a task, poll (or use a callback), then download the time-limited output URL. Pricing is per second by resolution: 480p $0.15, 720p $0.33, 1080p $0.823. Iterate at 480p, then re-render the winner at ship resolution — and divide cost by your acceptance rate.
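
"Divide cost by your acceptance rate" is a one-line formula: the effective price of a usable clip is the render price divided by the fraction of renders you keep. The sketch below assumes the 1080p rate quoted above; the function name and the 25% acceptance rate are hypothetical.

```python
def cost_per_accepted_clip(seconds: float, rate_per_second: float, acceptance_rate: float) -> float:
    """Effective cost of one usable clip, given the share of renders that are kept."""
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return seconds * rate_per_second / acceptance_rate

# A 5s 1080p clip at $0.823/s, assuming one render in four is accepted.
print(round(cost_per_accepted_clip(5, 0.823, 0.25), 2))  # 16.46
```

At a 100% acceptance rate the same clip costs about $4.12, which is why iterating at 480p before the final render pays off.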

Product imagery is the highest-volume image job — and where a generation API pays off fastest. Every shot is one POST /v1/tasks call with gpt-image-2; no fine-tuning or reference images. GPT Image 2 renders on-pack text cleanly, which makes it the safe pick for packaging mockups. Flat $0.03 per 1K image ($0.04 at 2K, $0.06 at 4K) — a 500-SKU packshot run is $15.
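
The flat per-image pricing makes batch budgets easy to estimate. A minimal sketch, assuming the three price tiers quoted above; `batch_cost` is an illustrative helper, not a hiapi function.

```python
from decimal import Decimal

# Flat per-image prices in USD by output size, as quoted above.
PRICE_PER_IMAGE = {
    "1K": Decimal("0.03"),
    "2K": Decimal("0.04"),
    "4K": Decimal("0.06"),
}

def batch_cost(images: int, size: str = "1K") -> Decimal:
    """Total cost of a batch: image count times the flat per-image price."""
    return PRICE_PER_IMAGE[size] * images

print(batch_cost(500))        # 15.00
print(batch_cost(500, "4K"))  # 30.00
```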

Ad creative needs legible on-image text — the one thing most models butcher; GPT Image 2 is the exception. Three templates (hero / social promo / lifestyle banner) with copy-paste placeholders and real outputs. Quote the exact copy and say where it goes — the single biggest factor in correct ad text. Flat $0.03 per 1K image; a 200-variant campaign is $6.

There is no single best image API: the right model depends on the job, and one integration should give you all of them. hiapi's image roster spans a 70× price range, from flux-schnell ($0.005) to gpt-image-2-pro ($0.35). In a same-prompt test, gpt-image-2 and Nano-Banana-2 rendered on-image text cleanly; FLUX 1.1 Pro was weaker on dense text. gpt-image-2 ($0.03) is the best price-to-instruction-following default.

Image-to-image transforms an asset you already own — background swap, restyle, scene change — instead of generating from scratch. On hiapi, gpt-image-2-image-to-image is $0 per call; iterate for free. The only new parameter is input_urls: the source image(s) to transform. Anchor the subject with 'keep the product shape and color unchanged' so the model restyles the scene, not the product.

qwen-image-2.0 is one of the few text-to-image models that renders multi-character Chinese and short English headlines cleanly inside the picture. It's on hiapi at $0.025 per image, flat, with a 2K default output and the full set of common aspect ratios from 1:1 to 21:9. Six tested recipes are included — bilingual storefront, ink landscape with calligraphy, modern poster, Pixar character, editorial illustration, photoreal flat-lay — each with the exact prompt and hiapi input. All seven images in the article (one cover + six recipes) cost a total of $0.175 to produce.

ChatGPT Images 2.0 (consumer) and gpt-image-2 (developer API) draw from the same model family but solve different problems. OpenAI's direct API is token-billed, so the same 1024×1024 canvas can run ~35× more expensive at high vs low quality. hiapi resells gpt-image-2 at a flat $0.03/call at 1K, with gpt-image-2-pro at $0.35 for the polished hero-shot tier. Three decision rules: explore in ChatGPT, ship in the API, escalate to pro only where polish is the product.

gpt-image-2-pro and gpt-image-2-image-to-image-pro are stability-tier entries — built for production jobs where consistency beats cost. Live pricing: $0.35 / $0.42 at 1K, $0.70 / $0.84 at 2K — roughly 12–21× the standard gpt-image-2 tier. Pro earns its keep on customer-facing surfaces, brand-critical sets, and pipelines where downstream review is expensive. Both Pro variants speak the OpenAI Chat Completions shape on hiapi — the call is the same one your existing gpt-image-2 worker already uses.
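
One reading that reproduces the "roughly 12–21×" range is to compare each Pro price with the standard gpt-image-2 price at the same resolution. The sketch below assumes that reading, and takes the standard prices ($0.03 at 1K, $0.04 at 2K) from the packshot summary earlier on this page.

```python
# Prices in USD per image. Pro figures are from the summary above;
# standard figures are assumed from the gpt-image-2 packshot summary.
STANDARD = {"1K": 0.03, "2K": 0.04}
PRO_PRICES = {
    "gpt-image-2-pro": {"1K": 0.35, "2K": 0.70},
    "gpt-image-2-image-to-image-pro": {"1K": 0.42, "2K": 0.84},
}

def pro_multipliers() -> dict:
    """Ratio of each Pro price to the standard price at the same resolution."""
    return {
        (model, size): round(price / STANDARD[size], 1)
        for model, prices in PRO_PRICES.items()
        for size, price in prices.items()
    }

for (model, size), ratio in pro_multipliers().items():
    print(f"{model} @ {size}: {ratio}x standard")
```

The four ratios come out at about 11.7, 17.5, 14.0, and 21.0, which matches the quoted range.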

GPT Image 2 wins on instruction-following, text accuracy, multi-element compositions, and per-image price ($0.03 vs $0.05). FLUX 1.1 Pro wins on raw speed by roughly 9.2× (~6.4s vs ~59s per image in our test) and on photoreal portraits with dramatic lighting. In the text-rendering test, FLUX 1.1 Pro made a spelling error (EYERY for EVERY), while GPT Image 2 rendered all text correctly from the same prompt. On a complex eight-object flat-lay, GPT Image 2 placed every item with the right count; FLUX 1.1 Pro produced a stylish but inaccurate version. Numbers are from hiapi production endpoints as of 2026-05; the sample is small by design: six paired prompts, one image per model.