KAIMEDIA · AI Video Intelligence
KAIMEDIA's AI engine understands scenes, detects people, reads subtitles, and intelligently reframes every shot — no manual editing required.
The same broadcast clip — landscape for TV, vertical for mobile — automatically reframed by AI.
A fully automated six-stage AI pipeline reframes every frame to vertical with editorial precision.
Scene change detection locates shot boundaries so only keyframes need full analysis.
AI Object Detection runs first, providing pixel-accurate bounding boxes before VLM processing.
The VLM receives detection results as grounding hints, classifies on-screen roles (Anchor, Reporter), and extracts the relationships between them.
An ontology-based scorer selects the optimal split-screen or fullscreen layout.
Bottom captions are detected frame-by-frame and re-rendered with crisp Korean/English text.
Video encoder produces the final 9:16 output at broadcast quality with original audio.
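The six stages above can be sketched as a single orchestration loop. This is an illustrative sketch only: the function names, `Shot` type, and stage signatures are assumptions for demonstration, not the KAIMEDIA API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Shot:
    start: int  # first frame index of the shot
    end: int    # one past the last frame index

def run_pipeline(frames: List[str], detect_shots: Callable, detect_objects: Callable,
                 classify_scene: Callable, pick_layout: Callable,
                 rerender_captions: Callable, encode: Callable):
    """Hypothetical six-stage flow; each stage is injected as a callable."""
    processed = []
    for shot in detect_shots(frames):              # 1. scene change detection
        keyframe = frames[shot.start]              # only the keyframe is analysed
        boxes = detect_objects(keyframe)           # 2. pixel-accurate bounding boxes
        scene = classify_scene(keyframe, boxes)    # 3. VLM roles + relationships
        layout = pick_layout(scene)                # 4. ontology-based layout choice
        for i in range(shot.start, shot.end):
            processed.append(rerender_captions(frames[i], layout))  # 5. captions
    return encode(processed)                       # 6. 9:16 encoding with audio

# Toy stand-ins to show the control flow end to end.
result = run_pipeline(
    ["f0", "f1", "f2"],
    detect_shots=lambda fs: [Shot(0, 2), Shot(2, 3)],
    detect_objects=lambda f: ["person"],
    classify_scene=lambda f, b: "SoloAnchor",
    pick_layout=lambda s: "fullscreen",
    rerender_captions=lambda f, layout: f"{f}:{layout}",
    encode=lambda fs: fs,
)
# result == ["f0:fullscreen", "f1:fullscreen", "f2:fullscreen"]
```

Because the layout is chosen once per shot and then applied to every frame in that shot, the reframe stays stable within a scene while remaining fully automatic across cuts.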
Every scene gets a custom layout — no fixed crop, no black bars forced onto important content.
Person detection ensures anchors, reporters, and guests are never clipped. Safety margins prevent face or body cutoff.
A semantic ontology classifies every shot — SoloAnchor, ConversationScene, ThreePersonScene, MaterialScene — and picks the right layout family.
Multi-person or anchor+background scenes get a two-panel layout: tight portrait crop on top, widescreen context below.
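The geometry of such a two-panel layout is straightforward to sketch. The function below is a minimal illustration, assuming a 1920×1080 source and a 1080×1920 canvas; the panel split and clamping logic are assumptions, not the shipped algorithm.

```python
def split_screen_crops(src_w: int, src_h: int, face_cx: int,
                       out_w: int = 1080, out_h: int = 1920) -> dict:
    """Two-panel 9:16 layout: tight portrait crop on top, full
    widescreen frame scaled into a strip below. Geometry only."""
    # Bottom panel: the whole source frame scaled to the output width.
    bottom_h = round(out_w * src_h / src_w)
    top_h = out_h - bottom_h                   # remaining height for the portrait panel
    # Top panel: crop a region matching the panel's aspect, centred on the face.
    crop_h = src_h                             # use the full source height
    crop_w = round(crop_h * out_w / top_h)
    x0 = min(max(face_cx - crop_w // 2, 0), src_w - crop_w)  # keep crop inside frame
    return {"top_crop": (x0, 0, crop_w, crop_h), "top_h": top_h, "bottom_h": bottom_h}

layout = split_screen_crops(1920, 1080, face_cx=960)
# For a centred face: portrait crop (516, 0, 889, 1080),
# top panel 1312 px tall, bottom context strip 608 px tall.
```

Clamping `x0` to the frame edges is what keeps a speaker near the side of the shot from being cut off, matching the safety-margin behaviour described above.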
Korean and English lower-thirds are OCR'd and re-rendered in the native 9:16 space — no more clipped or shrunk captions.
Scene-level layout caching and temporal smoothing eliminate flickering or jumpy reframes within a single shot.
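One common way to achieve this kind of temporal smoothing (an assumption here, not necessarily KAIMEDIA's exact method) is an exponential moving average over the per-frame crop offsets, with the running state reset at every shot boundary so smoothing never bleeds across a cut:

```python
def smooth_crops(xs, alpha: float = 0.2, shot_starts=frozenset()):
    """EMA over per-frame crop x-offsets. `alpha` trades responsiveness
    for stability; the state resets at each shot boundary."""
    smoothed, state = [], None
    for i, x in enumerate(xs):
        if state is None or i in shot_starts:
            state = float(x)                       # hard reset at a new shot
        else:
            state = alpha * x + (1 - alpha) * state
        smoothed.append(round(state))
    return smoothed

# Jittery detections within one shot are damped; the cut at frame 3 snaps cleanly.
out = smooth_crops([100, 110, 100, 300], shot_starts={3})
# out == [100, 102, 102, 300]
```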
High-quality video encoder with full audio preservation — indistinguishable from native vertical production.
Video and frame comparisons from a real broadcast clip reframed by AI. Left: original 16:9. Right: AI-reframed 9:16.
Every component is best-in-class, running fully on-premises with no cloud dependency.
All AI inference runs locally. No video data ever leaves your infrastructure.
VLM inference runs on GPU for maximum speed; graceful CPU fallback for edge deployments.
Subtitle OCR handles mixed Korean/English lower-thirds with character-level accuracy.
Every parameter — scene threshold, layout ratios, subtitle safe area — is tunable via YAML config.
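A configuration file along these lines shows the kind of knobs exposed; every key name and value below is an illustrative assumption, not the shipped schema.

```yaml
# Illustrative config sketch — key names are assumptions, not the actual schema.
scene_detection:
  threshold: 0.30          # shot-boundary sensitivity (0–1)
layout:
  split_screen:
    top_panel_ratio: 0.68  # share of the 9:16 height given to the portrait panel
  safety_margin_px: 48     # keep faces and bodies clear of panel edges
subtitles:
  safe_area_bottom: 0.12   # fraction of frame height reserved for lower-thirds
  languages: [ko, en]
output:
  resolution: 1080x1920
  preserve_audio: true
```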