LongCat for Text to Image

Create cool images using LongCat

LongCat

Text2Image

762

Generates in about 12 seconds

floyoofficial

Nodes & Models

ComfyUI Official

CLIPLoader: qwen_2.5_vl_7b_fp8_scaled.safetensors (Ver Private, Comm Use)

VAELoader: ae.safetensors (Ver Private, Comm Use)

UNETLoader: longcat_image_bf16.safetensors (Ver Private, Comm Use)

ResolutionSelector

WorkflowGraphics

CLIPTextEncode

CFGNorm

EmptySD3LatentImage

FluxGuidance

KSampler

VAEDecode

AddLabel

PreviewImage

ComfyUI-Easy-Use

easy positive
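
As a rough illustration of how the core nodes listed above might connect, here is a minimal sketch of the graph in ComfyUI's API (JSON) format, built as a Python dict. The wiring, the sampler settings (steps, cfg, sampler, scheduler, guidance), the 1024x1024 resolution, and the CLIPLoader type string "longcat_image" are all assumptions for illustration, not values read from this workflow. The custom nodes (ResolutionSelector, WorkflowGraphics, AddLabel, easy positive) are left out because their inputs are not described here.

```python
import json


def build_longcat_graph(prompt, width=1024, height=1024, seed=0):
    """Return a ComfyUI API-format graph: node id -> class_type + inputs.

    Links are written as [source_node_id, output_index].
    All numeric settings below are placeholder assumptions.
    """
    return {
        "1": {"class_type": "UNETLoader",
              "inputs": {"unet_name": "longcat_image_bf16.safetensors",
                         "weight_dtype": "default"}},
        "2": {"class_type": "CLIPLoader",
              "inputs": {"clip_name": "qwen_2.5_vl_7b_fp8_scaled.safetensors",
                         "type": "longcat_image"}},  # assumed type string
        "3": {"class_type": "VAELoader",
              "inputs": {"vae_name": "ae.safetensors"}},
        "4": {"class_type": "CFGNorm",
              "inputs": {"model": ["1", 0], "strength": 1.0}},
        "5": {"class_type": "CLIPTextEncode",  # positive prompt
              "inputs": {"text": prompt, "clip": ["2", 0]}},
        "6": {"class_type": "CLIPTextEncode",  # empty negative prompt
              "inputs": {"text": "", "clip": ["2", 0]}},
        "7": {"class_type": "FluxGuidance",
              "inputs": {"conditioning": ["5", 0], "guidance": 4.0}},
        "8": {"class_type": "EmptySD3LatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "9": {"class_type": "KSampler",
              "inputs": {"model": ["4", 0], "seed": seed, "steps": 20,
                         "cfg": 4.0, "sampler_name": "euler",
                         "scheduler": "simple", "positive": ["7", 0],
                         "negative": ["6", 0], "latent_image": ["8", 0],
                         "denoise": 1.0}},
        "10": {"class_type": "VAEDecode",
               "inputs": {"samples": ["9", 0], "vae": ["3", 0]}},
        "11": {"class_type": "PreviewImage",
               "inputs": {"images": ["10", 0]}},
    }


def dangling_links(graph):
    """List (node_id, input_name) pairs whose link targets a missing node."""
    bad = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and len(value) == 2 and value[0] not in graph:
                bad.append((node_id, name))
    return bad


graph = build_longcat_graph('A neon shop sign that reads "OPEN"')
print(json.dumps(graph["9"], indent=2))
```

A dict in this shape is what a ComfyUI server accepts when wrapped as {"prompt": graph} and posted to its /prompt endpoint. Checking for dangling links first is a cheap way to catch wiring mistakes before queueing a job.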

LongCat‑Image is a 6B‑parameter text‑to‑image (and image‑edit) foundation model focused on photorealism and extremely accurate English/Chinese text rendering in images.

What it is

An open‑source model from Meituan that generates and edits images from text prompts, using a Flux‑style diffusion backbone plus a Qwen2.5‑VL text encoder.
Built to be smaller and faster than many flagship models while still matching or beating them on realism and text‑in‑image benchmarks.
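
To put "smaller" in concrete terms, the following back-of-the-envelope sketch estimates the memory taken by the diffusion model's weights alone. It assumes exactly 6e9 parameters and 2 bytes per parameter for the bf16 checkpoint; the 1-byte row is a hypothetical 8-bit variant shown only for comparison. The text encoder, VAE, and activations are not counted, so real VRAM use will be higher.

```python
def weight_gib(num_params, bytes_per_param):
    """Approximate size of the raw weights in GiB (1 GiB = 2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


LONGCAT_PARAMS = 6e9  # "6B-parameter", taken from the description

# bf16 stores 2 bytes per parameter; the 8-bit row is hypothetical.
for label, nbytes in [("bf16", 2), ("8-bit (hypothetical)", 1)]:
    print(f"{label}: {weight_gib(LONGCAT_PARAMS, nbytes):.2f} GiB")
# bf16: 11.18 GiB
# 8-bit (hypothetical): 5.59 GiB
```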

Key features

Strong photorealism and material rendering (skin, fabric, lighting) for commercial‑quality images.
Bilingual text rendering: handles Chinese and English text in images with high spelling accuracy by treating quoted text at character level.
Unified generation + editing: a paired LongCat‑Image‑Edit variant supports precise inpainting/outpainting and instruction‑based edits while preserving structure and identity.
Efficient 6B architecture yields fast inference and lower VRAM use than larger Flux‑class models, especially in cloud or optimized runtimes.
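
Since the text rendering feature above depends on the text being quoted, a small prompt helper can make sure every string meant to appear in the image is wrapped in double quotes exactly once. This is only a sketch: the function names and the "reads" phrasing are hypothetical, and the only behavior taken from the description is that quoted text gets special handling.

```python
def quote_for_rendering(text):
    """Wrap text in straight double quotes, removing any quotes already present."""
    cleaned = text.strip().strip('"\u201c\u201d')
    return f'"{cleaned}"'


def build_prompt(scene, rendered_texts):
    """Join a scene description with (role, text) pairs to render in the image."""
    parts = [scene.strip().rstrip(".")]
    for role, text in rendered_texts:
        parts.append(f"{role} reads {quote_for_rendering(text)}")
    return ", ".join(parts) + "."


prompt = build_prompt(
    "A minimalist coffee shop poster",
    [("the title", "Morning Brew"), ("the button", "立即下单")],
)
print(prompt)
```

The resulting string can be used as the positive prompt. Keeping the quoting in one place avoids accidentally double-quoting text that a user already wrapped in quotes.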

Best use cases

Posters, ads, and UI mockups that need clean layout plus correctly spelled Chinese/English text (titles, buttons, labels, signage).
Photoreal product and lifestyle images for marketing, where realism and brand‑safe detail matter.
Image editing tasks like background changes, object insertion/removal, or text replacement in existing visuals using natural‑language instructions.
