Happy Horse 1.1 · Reference to Video
Upload up to nine reference images of characters, objects, or scenes, and Happy Horse 1.1 generates a cinematic video that preserves their identity, style, and detail with synchronized audio.
ai video
character consistency
happy horse 1.1
reference to video
Nodes & Models
HappyHorse11ReferenceToVideo_floyo
VideoToFrames
LoadImage
CreateVideo
SaveVideo
ABOUT THE WORKFLOW
Generate Video from References
Upload one or more reference images of the characters, products, or scenes you want in the video. Write a prompt describing the action, camera movement, and dialogue. Happy Horse 1.1 carries each reference subject into the generated video, keeping their face, wardrobe, and visual style consistent throughout the clip. The output is an MP4 with synchronized audio.
Partner node. This workflow calls an external API, so each run uses credits from your API wallet. No API key needed. Floyo handles the connection.
Model
Happy Horse 1.1 by Alibaba. The top-ranked video model on Artificial Analysis. Generates synchronized video and audio in one pass with native lip-sync in seven languages. Reference-to-video mode preserves subject identity across the full clip from up to nine input images.
HOW IT WORKS
Step 1. Upload your reference images
Add one or more images of the characters, products, or environments you want in the video. Each image becomes a visual anchor the model carries into the scene. You can use up to nine.
Works great with: character portraits · product photos · environment shots · AI-generated images
Step 2. Write your prompt
Describe the scene, action, camera movement, and dialogue. Reference each image by its position: "The man from image 1 walks down the street while the woman from image 2 watches from a window." Be specific about what each subject does.
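The by-position convention can be sketched in a few lines of Python. This is an illustration only, not part of the Floyo workflow or of any Happy Horse API: build_prompt and its arguments are hypothetical names. The only details taken from the page are the "image N" wording and the nine-image limit.

```python
MAX_REFERENCES = 9  # limit stated in the workflow description


def build_prompt(subjects, template):
    """Fill a prompt template with by-position references.

    subjects: short descriptions, one per uploaded image, in upload order.
    template: text with placeholders {1}, {2}, ... matching the image numbers.
    Hypothetical helper, for illustration only.
    """
    if not 1 <= len(subjects) <= MAX_REFERENCES:
        raise ValueError("use between 1 and 9 reference images")
    # Image numbering starts at 1, following upload order.
    labels = [f"the {desc} from image {i}" for i, desc in enumerate(subjects, start=1)]
    # The leading None keeps str.format indices aligned with image numbers.
    text = template.format(None, *labels)
    return text[0].upper() + text[1:]


prompt = build_prompt(
    ["man", "woman"],
    "{1} walks down the street while {2} watches from a window.",
)
print(prompt)
```

Building the prompt from a list makes it harder to mention an image number that was never uploaded, since a placeholder with no matching subject raises an error.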
Step 3. Set resolution, aspect ratio, and duration
Pick 1080P for final output or 720P for a faster draft. Choose the aspect ratio for your platform (16:9 widescreen, 9:16 vertical, 1:1 square, or ultrawide 21:9). Duration runs from 3 to 15 seconds.
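If you track settings outside the browser, the options above can be written down as a small validated record. This is a sketch under assumptions: VideoSettings and its field names are hypothetical and do not correspond to the node's real input names. The allowed values and the defaults (1080P, 16:9, 5 seconds) are the ones this page lists.

```python
from dataclasses import dataclass

# Values listed on this page; the names below are illustrative only.
RESOLUTIONS = {"1080P", "720P"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "21:9"}
MIN_SECONDS, MAX_SECONDS = 3, 15


@dataclass(frozen=True)
class VideoSettings:
    resolution: str = "1080P"
    aspect_ratio: str = "16:9"
    duration: int = 5  # seconds

    def __post_init__(self):
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        if not MIN_SECONDS <= self.duration <= MAX_SECONDS:
            raise ValueError("duration must be between 3 and 15 seconds")


draft = VideoSettings(resolution="720P", duration=3)
print(draft)
```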
Step 4. Hit run and download
Happy Horse 1.1 generates the video with synchronized audio and returns an MP4.
Ready for: Premiere · DaVinci Resolve · After Effects · TikTok · Instagram · YouTube
First time? Leave every setting as-is. The defaults (1080P, 16:9, 5 seconds) are the right starting point for almost everyone.
RECOMMENDED SETTINGS
Quick-start guide. Find the goal that matches yours and copy the settings.
Standard reference video (most people) — 1080P, 16:9, 5 seconds, random seed. The right starting point for almost everyone.
Quick preview before committing credits — 720P, 3 seconds. Check the character consistency and motion before running the full clip.
Vertical for social media — 1080P, 9:16, 5 to 10 seconds. Native portrait output for TikTok, Reels, and Shorts.
Multi-character scene — Upload a separate reference image for each character. In the prompt, refer to them by position: "The woman from image 1 sits across from the man from image 2." Distinct references per character keep identities from blending.
Dialogue scene with lip-sync — Write dialogue in quotes and attribute it to the reference. "The man from image 1 says: 'We need to leave now'" triggers native lip-sync. Works in seven languages.
Product in a scene — Upload the product as one reference and a setting or model as another. Describe how they interact. The model preserves product detail, branding, and color.
Reproduce or tweak a result — Lock the seed. Same seed with a small prompt edit lets you adjust motion or framing without regenerating everything.
Prompt: Reference each image by its number ("image 1," "image 2") so the model knows which subject does what. Front-load the action and camera direction. "The man from image 1 walks toward the camera on a rain-soaked street, slow tracking shot from the front, neon reflections on wet pavement" is specific. "A person walks down a street" wastes the reference.
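The first three goals in the guide above can be kept as named presets. The sketch below is hypothetical: the key names, the settings_for helper, and the use of None to mean "random seed" are assumptions, not the workflow's real field names. Where the guide leaves a value open, the preset falls back to the defaults, and the vertical preset picks 5 seconds from the 5 to 10 second range.

```python
# Defaults named on this page; None stands in for "random seed" (an assumption).
DEFAULTS = {"resolution": "1080P", "aspect_ratio": "16:9", "duration": 5, "seed": None}

# Each preset only lists what differs from the defaults.
PRESETS = {
    "standard": {},
    "quick_preview": {"resolution": "720P", "duration": 3},
    "vertical_social": {"aspect_ratio": "9:16", "duration": 5},
}


def settings_for(goal, seed=None):
    """Return the settings for a goal, optionally locking the seed."""
    settings = {**DEFAULTS, **PRESETS[goal]}
    if seed is not None:
        settings["seed"] = seed  # locked seed, for reproducing or tweaking a result
    return settings


print(settings_for("quick_preview"))
print(settings_for("standard", seed=1234))
```

Passing the same seed with a lightly edited prompt mirrors the "Reproduce or tweak a result" entry.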
LEARN
📹 Videos
ComfyUI 101 Free Course ft. Sebastian Kamph
Floyo 101 for Team Collaboration
USE CASES
🎬 Multi-Character Scenes
Cast multiple characters from separate reference photos into the same scene. Each keeps their face, wardrobe, and build throughout the clip. Works for short dramas, ad spots, and storyboard animatics.
🛍️ Product Videos
Turn a product photo into a lifestyle video. Upload the product as a reference and describe the setting and action. The model preserves branding, color, and detail while adding cinematic motion and sound.
👤 Virtual Influencer and Avatar Content
Generate consistent talking-head or full-body videos of a character across clips. The reference image locks the identity so the same face and style appear every time.
🎤 Multilingual Dialogue Scenes
Create speaking characters with native lip-sync in English, Mandarin, Cantonese, Japanese, Korean, German, or French. Upload a portrait, write the dialogue, and the model handles the mouth movement.
📱 Social and Ad Content at Scale
Produce multiple video variations from the same reference images with different prompts, durations, and aspect ratios. Test scenes, angles, and copy without reshooting.
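Because each run uses credits, it helps to count the variations before launching them. A minimal sketch, assuming you plan runs in a script of your own: the prompts, the field names, and the jobs list are made up for illustration, and nothing here submits a run to Floyo.

```python
from itertools import product

# Example values only; swap in your own prompts and platform targets.
prompts = [
    "The woman from image 1 unboxes the product from image 2 at a kitchen table.",
    "The woman from image 1 holds the product from image 2 up to the camera.",
]
durations = [5, 10]
aspect_ratios = ["16:9", "9:16", "1:1"]

# One planned run per combination of prompt, duration, and aspect ratio.
jobs = [
    {"prompt": p, "duration": d, "aspect_ratio": a}
    for p, d, a in product(prompts, durations, aspect_ratios)
]

print(len(jobs))  # 2 prompts x 2 durations x 3 ratios = 12 planned runs
```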
WHAT WORKS BEST / WHAT TO AVOID
✅ Works great
Clear, well-lit reference images with the subject facing the camera
One reference image per character or product
Prompts that name each reference by number ("the woman from image 1")
Dialogue written as direct quotes with speaker attribution
⚠️ May produce softer results
Blurry, cropped, or heavily filtered reference images
Too many references (more than four) in a short clip under 5 seconds
Prompts that do not specify which reference does what
References with very similar-looking subjects (the model may blend them)
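Two of the items above can be checked mechanically before a run. The sketch below is a hypothetical pre-flight helper, not a Floyo feature: check_inputs and its messages are invented for illustration. It flags more than four references in a clip under 5 seconds and references the prompt never names. Image quality and look-alike subjects still need a human eye.

```python
import re


def check_inputs(num_references, duration, prompt):
    """Return a list of warnings based on the guidance above."""
    warnings = []
    if num_references > 4 and duration < 5:
        warnings.append("more than four references in a clip under 5 seconds")
    # Collect every "image N" mentioned in the prompt.
    mentioned = {int(n) for n in re.findall(r"\bimage (\d+)\b", prompt, flags=re.IGNORECASE)}
    missing = [i for i in range(1, num_references + 1) if i not in mentioned]
    if missing:
        names = ", ".join(f"image {i}" for i in missing)
        warnings.append(f"prompt does not mention: {names}")
    return warnings


print(check_inputs(2, 5, "The woman from image 1 sits across from the man from image 2."))
print(check_inputs(5, 3, "The man from image 1 waves."))
```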
FAQ
What is reference-to-video in Happy Horse 1.1?
Reference-to-video lets you upload up to nine images as visual anchors. The model carries the identity, style, and detail of each subject into the generated video. You refer to each image by its position in the prompt ("the person from image 1," "the object from image 3") and the model keeps them distinct throughout the clip. This is different from image-to-video, where the uploaded image becomes the opening frame.
How many reference images can I use at once?
Up to nine. Each image acts as a separate visual anchor. For most scenes, two to four references produce the strongest results. More references in a short clip can dilute the model's attention on each subject.
Does the model keep character identity consistent across the video?
Yes. Happy Horse 1.1 preserves face, wardrobe, and build from the reference image across the full clip. This is one of the model's core strengths, and it holds up across camera cuts and angle changes within a single generation.
Can I use reference-to-video for product shots?
Yes. Upload the product image as a reference and describe the scene in the prompt. The model preserves product branding, color, and physical detail while placing it into a new environment with motion and audio.
What languages does the lip-sync support?
Seven: English, Mandarin, Cantonese, Japanese, Korean, German, and French. Write the dialogue in the language you want spoken and the model matches the lip movement to that language's phonetics.
How is reference-to-video different from image-to-video?
Image-to-video uses your uploaded image as the literal first frame and animates forward from it. Reference-to-video uses the uploaded images as identity anchors. The subjects appear in the video with their look preserved, but the scene composition, camera angle, and framing come from your prompt. Use image-to-video when you want to animate a specific frame. Use reference-to-video when you want specific characters or products in a new scene.
How do I run Happy Horse 1.1 reference-to-video online?
You can run Happy Horse 1.1 reference-to-video online through Floyo. There is nothing to install or set up, and no API key to wire up. Open the workflow in your browser, upload your reference images, write your prompt, and hit run. It is free to try.
WHY FLOYO?
Floyo is the only platform with team collaboration for ComfyUI in the browser. You run workflows with no install. You share run history, assets, and models across your team. You pay only when you generate. Floyo supports open-source and closed-source models.
A designer runs an edit and likes the result. A teammate opens that exact run from shared history and keeps going. No file handoffs. No version confusion.
For studios and enterprise teams, Floyo adds private workspaces, pooled resources, and a team usage dashboard. Other ComfyUI cloud tools run for one person at a time. Floyo runs for the whole team, with transparent per-generation costs.
Ready to try it?
Upload your reference images, describe the scene, and hit run.
Questions? Watch the free course or check the FAQ above.