2026-01-29
LTX‑2 Fast is the high‑speed, open‑source mode of the LTX‑2 audio‑video foundation model that turns short text prompts into complete video clips with synchronized audio in just a few seconds. It’s built on distilled LTX‑Video weights (Fast / LTXV models), optimized so you can get 6–10 second HD or 4K clips quickly enough for real iterative work, not just one‑off demos.
An open‑source DiT‑based text‑to‑video model variant focused on speed, derived from LTX‑Video/LTX‑2 and released as distilled “Fast” checkpoints.
Supports text‑to‑video (and image‑to‑video via the same stack) with synchronized audio generation—sound effects, ambience, and simple music are generated together with the frames.
Enables near real‑time ideation: drafts render in seconds, so you can iterate on prompts, camera moves, and story beats the way you iterate on still images.
Being open‑source, it can be self‑hosted, fine‑tuned, and wired into ComfyUI or custom pipelines, which is critical when you need control over data, latency, and costs.
Distilled and quantized variants (FP8 / Q8) reduce VRAM and compute needs, making 720p‑class videos (e.g., 1216×704) possible even on mid‑range GPUs.
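To see why FP8/Q8 quantization helps on mid‑range GPUs, here is a back‑of‑the‑envelope estimate of weight memory alone. The parameter count below is a placeholder assumption for illustration, not an official LTX‑2 figure; activations and KV caches add more on top.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate GPU memory (GB) needed just to hold the model weights."""
    return num_params * bytes_per_param / 1e9

# Hypothetical 13B-parameter DiT (illustrative size, not the official one)
params = 13e9
fp16_gb = weight_memory_gb(params, 2.0)  # 16-bit weights
fp8_gb = weight_memory_gb(params, 1.0)   # FP8 / Q8: half the footprint
```

Halving bytes per parameter halves the weight footprint, which is the difference between fitting on a 16 GB consumer card or not.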
You provide a compact prompt describing subject, motion, camera, and mood; LTX‑2 Fast generates a 6–10 second clip, often at 1216×704 or 1080p, with matching audio.
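The four prompt ingredients above (subject, motion, camera, mood) can be composed mechanically. This is a tiny illustrative helper, not part of any LTX‑2 API:

```python
def build_prompt(subject: str, motion: str, camera: str, mood: str) -> str:
    """Join the four suggested prompt fields, skipping any left empty."""
    parts = (subject, motion, camera, mood)
    return ", ".join(p.strip() for p in parts if p.strip())

prompt = build_prompt(
    "a red fox",          # subject
    "leaping over a log", # motion
    "slow dolly-in",      # camera
    "misty dawn light",   # mood
)
```

Keeping each field to a single short phrase matches the "compact prompt" advice and, as noted later, avoids the chaotic output that layered actions tend to produce.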
In ComfyUI or similar, you choose the Fast/distilled sampler and low diffusion steps (around 8) to get fast preview renders, then optionally re‑render in a higher‑quality mode if needed.
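The draft‑then‑refine workflow above can be sketched as two settings presets. The names and exact values here are illustrative assumptions (only the ~8‑step draft count comes from the text), not official LTX‑2 parameters:

```python
# Hypothetical presets for a two-pass workflow: fast draft, then final render.
PREVIEW = {"steps": 8, "width": 1216, "height": 704}    # distilled/Fast pass
FINAL = {"steps": 30, "width": 1920, "height": 1080}    # assumed quality pass

def render_settings(draft: bool) -> dict:
    """Return a copy of the preset so callers can tweak it safely."""
    return dict(PREVIEW if draft else FINAL)
```

Iterating on prompts at ~8 steps and only paying for a high‑step render once the shot works is what makes the Fast mode usable for real iteration.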
Based on user experiences and technical reviews as of January 2026, LTX-2 (and the broader LTX-Video family) is not well suited to fast-moving, complex, or high-action scenes. While it offers high resolution and decent speed, it frequently produces artifacts, distortions, or "melting" effects when tasked with rapid motion.
Here is a breakdown of why it struggles with fast motion and how to improve it:
Why LTX-2 Struggles with Fast Motion
Speed vs. Memory Tradeoff: To maintain high generation speeds, the model often compresses temporal context, causing it to lose track of details during complex or fast motion.
Action Sequence Limitations: Complex, non-linear, or rapid movements (e.g., fight scenes or heavy, fast-paced action) frequently lead to unusable, blurry, or distorted results.
"Melting" Effects: In Image-to-Video (I2V) workflows, fast-motion scenes often result in the initial, high-quality image breaking down into unrealistic, blurry, or distorted ("melting") footage.
Lower Initial Resolution: The base models often operate at lower resolutions, and if the output is not upscaled correctly, fast movement degrades into blurry, unusable footage.
Tips to Improve LTX-2 Motion
Despite these limitations, users have found ways to improve performance:
Increase FPS for Realism: Raise the default FPS from 24 to 48 or 60 to make motion look more realistic.
Use Specific Checkpoints/LoRAs: Use the LTX-2 detailer LoRA on stage 1 and consider using LoRAs specifically designed for camera movements (e.g., dolly-in).
Avoid Complex Prompts: Keep prompts simple. Excessive, layered actions in a single prompt increase the likelihood of chaotic, poor-quality output.
Initial Resolution: Start at 720p or higher to avoid blurry, low-resolution results.
Use Specific Samplers: Some users report better results with the Clownshark Res_2s sampler.
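The tips above can be bundled into a simple pre-flight check. The thresholds are heuristics drawn directly from the tips (48 FPS, 720p minimum, simple prompts), not official limits, and the comma-count proxy for prompt complexity is my own rough assumption:

```python
def check_motion_settings(fps: int, height: int, prompt: str) -> list[str]:
    """Flag settings the tips above suggest will hurt fast-motion quality."""
    warnings = []
    if fps < 48:
        warnings.append("fps below 48: fast motion may look unrealistic")
    if height < 720:
        warnings.append("resolution below 720p: expect blurry motion")
    # Crude complexity proxy: many comma-separated clauses = layered actions.
    if prompt.count(",") > 4:
        warnings.append("prompt may be too complex; simplify layered actions")
    return warnings
```

Running this before committing to a render catches the most common causes of "melting" footage reported by users.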