Media · Content production (Little Tummy)

A video agent that ships character-consistent short-form content at generation cost.

A 4-beat script structure, a cached reference face on every generation, a VLM safety gate for age-appropriate content, and automatic crops for Reels, Shorts, and TikTok. 30+ assets per month for the price of a few lunches.

30+
Brand-safe videos produced per month
$15–60
Total generation cost for a full month
100%
Character consistency across the series
The Challenge

Video production scales linearly with editors. Generic AI video scales instantly — and inconsistently.

A food-content brand needs 30 videos a month. Each editor you hire caps out at 5–10 videos per week. Generic AI video tools produce a character that looks different in every generation and, occasionally, something you can't publish, especially when the audience is kids.

The failures are specific. Every generation picks a different character face, so a five-episode series looks like five different shows. Scripts wander in length and pacing because nothing enforces structure. A 60-second asset needs three platform crops — Reels, Shorts, TikTok — and each one ends up as separate work. A single off-brand frame slips through and you're doing emergency takedown calls.

The root cause isn't the generation model. It's that there's no pipeline between "recipe brief" and "published asset" that enforces structure, consistency, and safety. Without that pipeline, you either pay editors to hand-build every video or you publish output that looks like it came from five different studios.

This agent adds the pipeline: a fixed 4-beat script, a cached reference face passed to every generation, a VLM pass on every key frame, and automatic multi-platform rendering — then schedules the posts.

How the agent handles it

Recipe brief. 4-beat script. Cached face. VLM gate. Three platform crops.

Pipeline stages:

  • SCRIPT: Recipe idea (ingredients, steps, prep time, complexity)
  • STRUCTURE: 4-beat structure (Hook · Steps · Reveal · CTA)
  • GENERATE: Kling 3.0 API with cached reference face, per-beat generation
  • VALIDATE: Content safety (VLM scan, age check)
  • PUBLISH: Multi-platform distribution (Reels · Shorts · TikTok)

Persistent reference: ~/.hermes/image_cache/img_4295fad7b826.jpg, the cached reference face passed to every Kling generation. Character consistency enforced across all episodes. No manual blending or averaging needed. Age-appropriate content verified via VLM.

Output: 60–90s video asset · 3 platform crops (Instagram 9:16, YouTube 16:9, TikTok 9:16) · fully auditable generation log

1. Every script follows the same four-beat structure.

Hook (2s) → quick steps (fast cuts, close-ups) → satisfying reveal → call-to-action. The structure is proven for short-form retention; scripts that deviate from it get rejected before generation, so pacing stops being an editorial judgment call.
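A structure check like the one described can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual validator: the `Beat` type, the `validate_script` helper, and the choice to enforce the 2-second hook and the 60–90s total as hard limits are all assumptions made for the example.

```python
from dataclasses import dataclass

# Beat names and order follow the page; treating them as hard limits is an assumption.
BEAT_ORDER = ("hook", "steps", "reveal", "cta")
MAX_HOOK_SECONDS = 2.0               # "Hook (2s)"
TOTAL_RANGE_SECONDS = (60.0, 90.0)   # "60-90s video asset"


@dataclass(frozen=True)
class Beat:
    name: str
    seconds: float
    text: str


def validate_script(beats: list[Beat]) -> list[str]:
    """Return a list of problems; an empty list means the script may proceed."""
    problems = []
    names = tuple(b.name for b in beats)
    if names != BEAT_ORDER:
        # Wrong or missing beats: stop here, per-beat checks would be misleading.
        return [f"expected beats {BEAT_ORDER}, got {names}"]
    if beats[0].seconds > MAX_HOOK_SECONDS:
        problems.append(f"hook runs {beats[0].seconds}s, limit is {MAX_HOOK_SECONDS}s")
    total = sum(b.seconds for b in beats)
    low, high = TOTAL_RANGE_SECONDS
    if not low <= total <= high:
        problems.append(f"total runtime {total}s is outside {low}-{high}s")
    return problems
```

A script with a 5-second hook or a missing reveal comes back with a non-empty problem list and never reaches the generation step.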

2. One cached reference face. Every generation. No drift.

A high-quality reference face lives in ~/.hermes/image_cache/ and gets passed to every Kling call. The character stays visually consistent across 100 episodes without post-production blending or averaging. That consistency is what turns one-off videos into a recognizable series.
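The idea of passing one reference to every call can be sketched as follows. The `generate` callable is a hypothetical stand-in for whatever client wraps the video-generation API (its real signature is not given here), and logging a SHA-256 of the reference file is an assumption added so the audit trail can show the same face was used throughout.

```python
import hashlib
from pathlib import Path
from typing import Callable


def fingerprint(path: Path) -> str:
    """SHA-256 of the reference image, so a swapped file shows up in the log."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def generate_series(
    beat_prompts: list[str],
    reference: Path,
    generate: Callable[[str, Path], str],
) -> list[dict]:
    """Call `generate` once per beat, always with the same reference image.

    `generate` is hypothetical: it takes (prompt, reference_path) and
    returns an identifier for the generated clip.
    """
    ref_hash = fingerprint(reference)
    log = []
    for prompt in beat_prompts:
        asset_id = generate(prompt, reference)
        log.append({"prompt": prompt, "asset": asset_id, "reference_sha256": ref_hash})
    return log
```

In use, `reference` would point at the cached image under ~/.hermes/image_cache/ (expanded with `Path.expanduser()`), and every entry in the returned log carries the same hash.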

3. A VLM safety gate stands between generation and publish.

A vision-language model (VLM) scans key frames for age-inappropriate content, brand drift, and unsafe elements, which is critical for kids' content. Borderline-confidence results are flagged for human review; confident violations are blocked outright. Nothing auto-publishes without passing the gate.
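The three outcomes of the gate (publish, human review, block) can be sketched as a threshold rule over per-frame scores. The 0-to-1 "violation confidence" score, both threshold values, and the rule of judging a video by its worst frame are assumptions for illustration; the page does not state how the deployed gate scores frames.

```python
from enum import Enum

# Hypothetical thresholds on a 0-1 violation-confidence score per key frame.
REVIEW_THRESHOLD = 0.3
BLOCK_THRESHOLD = 0.8


class Verdict(Enum):
    PUBLISH = "publish"
    REVIEW = "human_review"
    BLOCK = "block"


def gate(frame_scores: list[float]) -> Verdict:
    """Decide on the whole video from its worst-scoring key frame."""
    if not frame_scores:
        return Verdict.BLOCK  # nothing was scanned, so nothing may publish
    worst = max(frame_scores)
    if worst >= BLOCK_THRESHOLD:
        return Verdict.BLOCK
    if worst >= REVIEW_THRESHOLD:
        return Verdict.REVIEW
    return Verdict.PUBLISH
```

Judging by the worst frame keeps the rule conservative: a single borderline frame is enough to route the whole video to a human.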

4. One asset produces three platform crops, automatically.

FFmpeg renders Instagram 9:16, YouTube 16:9, and TikTok 9:16 crops from the same source. Platform APIs then schedule posts on a brand-safe cadence, with retry logic for when Instagram or YouTube is flaky. One brief in, three platforms out, with a full audit trail per video.
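One way to produce the crops is a scale-to-cover followed by a center crop, sketched below as a helper that builds the FFmpeg argument list. The aspect ratios come from the text; the pixel sizes, the output naming, and the `crop_command` helper itself are assumptions, and the exact filters the agent uses are not stated.

```python
# Pixel sizes are assumptions; the aspect ratios come from the page.
PLATFORM_SIZES = {
    "instagram": (1080, 1920),  # 9:16
    "youtube": (1920, 1080),    # 16:9
    "tiktok": (1080, 1920),     # 9:16
}


def crop_command(source: str, platform: str) -> list[str]:
    """Build an ffmpeg argument list: scale to cover the target, then center-crop."""
    width, height = PLATFORM_SIZES[platform]
    video_filter = (
        f"scale={width}:{height}:force_original_aspect_ratio=increase,"
        f"crop={width}:{height}"  # crop centers by default when x/y are omitted
    )
    stem = source.rsplit(".", 1)[0]
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-vf", video_filter,
        "-c:a", "copy",           # keep the audio stream untouched
        f"{stem}_{platform}.mp4",
    ]
```

Each list can be handed to `subprocess.run(..., check=True)`; looping over `PLATFORM_SIZES` yields all three renders from a single source file.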

What you get

Three things change once the pipeline runs.

30+ videos/mo

Production capacity per brand

Recipe brief in, 60–90s asset out. Script, generation, safety check, crops, publishing — all orchestrated end-to-end.

$15–60/mo

Total generation cost for a month

Kling 3.0 at roughly $0.50–$2 per video. Predictable monthly budget that doesn't scale with editor headcount.
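The monthly range follows directly from the per-video cost. A quick check, assuming exactly 30 videos in the month:

```python
VIDEOS_PER_MONTH = 30
COST_PER_VIDEO_USD = (0.50, 2.00)  # rough per-video range quoted above

low, high = (VIDEOS_PER_MONTH * cost for cost in COST_PER_VIDEO_USD)
print(f"${low:.0f}-${high:.0f} per month")  # prints "$15-$60 per month"
```

Retries and regenerated beats would push the real figure above this floor.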

100%

Character consistency across the series

Cached reference face enforced on every generation. No visual drift between episodes, no manual averaging in post.

Numbers observed in Brilworks' internal reference deployment (Little Tummy kids' content). Actual figures on your stack will depend on brand visual complexity, safety tolerance, and platform posting cadence.

Is this right for you?

Honest fit criteria. We'd rather say no than oversell.

Strong fit if

  • You need 20+ short-form videos per month consistently, not one-off campaigns
  • Character or visual consistency is core to your brand identity across a series
  • You're spending $3K+ per month on video production or freelancers today
  • Your category benefits from safety guardrails (kids', health, regulated content)

Not a fit if

  • You publish fewer than five videos a month — the pipeline overhead is wasted
  • Your content depends on real actors, documentarian depth, or narrative complexity
  • Your brand doesn't have a locked visual identity yet — start there, not here
  • Your audience would reject AI-generated content as inauthentic for your category

Book a 30-minute scoping call.

We'll walk through your content calendar, your brand identity, and your safety constraints — then tell you honestly whether a character-consistent video pipeline is the right next step.