
How to Make a Cinematic Kissing Scene with Two AI Girls

Step-by-step guide to generating emotional, film-like intimate scenes with AI using text-to-image and image-to-video

ZenCreator Team
Updated December 9, 2025
8 min read
Tags: text-to-image, image-to-video, tutorial, ai-characters, animation

Step 1 — Create the Starting Image

If you don't yet have an image that shows both characters together in the same location, generate one with Text-to-Image.

Open Text-to-Image: https://app.zencreator.pro/tools/generator-by-prompt

For the example video, the starting image was generated with the Nano Banana model.

Here is the sample prompt for the starting frame:

Prompt
A cinematic shot of two young adults standing closely inside a narrow, warm-toned elevator. Soft amber lighting from above creates gentle shadows on their faces. One woman touches the other woman's face with a tender, emotional gesture. They stand very close, facing each other, with an intimate, dramatic atmosphere. Minimalistic red-brown elevator walls, shallow depth of field, film-look color grading, moody lighting, natural poses, expressive eyes, realistic skin texture, high-resolution portrait.

Choose:

  • Your aspect ratio (9:16 for Reels / 16:9 for YouTube / 4:5 for Instagram)
  • The number of image variations

Click Generate.

You can modify the prompt to change:

  • Lighting
  • Mood
  • Characters
  • Environment
  • Emotional tone
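One way to keep these variations organized is to treat each element as a separate field and assemble the prompt from them. The following is a minimal sketch in Python, not part of ZenCreator: the `ScenePrompt` class and its field names are hypothetical, and the resulting text is simply pasted into the Text-to-Image prompt box.

```python
from dataclasses import dataclass, replace


@dataclass
class ScenePrompt:
    """Hypothetical container for the parts of a starting-frame prompt."""
    characters: str
    environment: str
    lighting: str
    mood: str
    emotional_tone: str
    style: str = "Shallow depth of field, film-look color grading, realistic skin texture"

    def render(self) -> str:
        # One sentence per part; empty parts are skipped.
        parts = [
            f"A cinematic shot of {self.characters} {self.environment}",
            self.lighting,
            self.emotional_tone,
            self.mood,
            self.style,
        ]
        return ". ".join(p.strip().rstrip(".") for p in parts if p.strip()) + "."


base = ScenePrompt(
    characters="two young adults standing closely",
    environment="inside a narrow, warm-toned elevator",
    lighting="Soft amber lighting from above creates gentle shadows on their faces",
    mood="Intimate, dramatic atmosphere",
    emotional_tone="One woman touches the other woman's face with a tender gesture",
)

# Change only the environment and lighting; everything else stays the same.
variant = replace(
    base,
    environment="in a dim hallway at night",
    lighting="Cool blue light from a window falls across their faces",
)

print(base.render())
print(variant.render())
```

Changing one field at a time makes it easier to see which part of the prompt caused a difference between two generated images.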

Step 2 — Animate the Scene Using Image-to-Video

Open Image-to-Video: https://app.zencreator.pro/tools/video-generator

Select:

  • Model: Seedance Pro Fast (optimal for natural, emotional, human-like motion)

Upload the image generated in Step 1.

Now describe the movement you want your characters to make.

Here is the sample prompt used for the reference video:

Prompt
Cinematic video of two young adults kissing very passionately and touching each other extremely close inside a narrow warm-toned elevator. The atmosphere is filled with emotional tension. Their faces are inches apart, eyes locked, breathing softly, creating a powerful intimate moment. One woman gently touches the other woman's face. Warm top light casts dramatic shadows on red-brown walls. The camera performs a slow dolly-in, then rotates around them with smooth handheld motion, adding intensity. Shallow depth of field, soft bokeh, warm amber color grading, film grain, natural skin texture, high-end cinematic look.

Click Generate.

When generation finishes, your video is ready.


Customize and Improve Your Result

You can:

  • Extend the video by saving its last frame as an image and using that image as the starting frame for the next Image-to-Video segment
  • Write more detailed or more subtle action prompts
  • Try alternate models for different cinematic styles
  • Change camera movement (dolly in, rotation, slow zoom, handheld, static)
  • Adjust light, mood, and intensity
  • Experiment with different emotional tones (soft, dramatic, romantic, stylized)
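For the first option, a screenshot works, but exporting the final frame from the downloaded file avoids player overlays and keeps the full resolution. The sketch below assumes you have the clip saved locally and have ffmpeg installed; the function name and file paths are hypothetical. It only builds the ffmpeg command, using `-sseof` to seek relative to the end of the file and `-update 1` to keep overwriting a single image so that the last decoded frame is what remains.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def last_frame_command(
    video: str,
    out_image: Optional[str] = None,
    tail_seconds: float = 1.0,
) -> List[str]:
    """Build an ffmpeg command that saves the final frame of `video` as an image."""
    video_path = Path(video)
    if out_image:
        image_path = Path(out_image)
    else:
        # Default: clip.mp4 -> clip_last.png in the same folder.
        image_path = video_path.with_name(video_path.stem + "_last.png")
    return [
        "ffmpeg",
        "-y",                           # overwrite the output image if it exists
        "-sseof", f"-{tail_seconds}",   # start reading this many seconds before the end
        "-i", str(video_path),
        "-update", "1",                 # write every frame to the same file; the last one wins
        str(image_path),
    ]


if __name__ == "__main__":
    cmd = last_frame_command("scene_part1.mp4")
    # Print the command for inspection; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Upload the exported image in Image-to-Video as the starting frame for the next segment, then join the clips in your video editor.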

Tips for Best Cinematic Quality

Use warm, soft lighting

Elevators, hallways, and evening rooms work particularly well.

Describe facial expressions

The model responds strongly to phrases such as "breathing softly," "emotional tension," "gentle," and "natural skin texture."

Add camera motion

Even a simple move, such as a slow dolly-in, makes the scene look more polished.

Keep the environment simple

A minimal location, such as an elevator, gives the model less background detail to render, which tends to make the result look more realistic.

Maintain aspect ratio consistency

Use the same aspect ratio for image generation and video generation, so the starting frame is not cropped or stretched when it is animated.
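If you reuse an existing image, you can check its ratio from its pixel dimensions before uploading it. This is a small illustrative sketch: the three ratios come from Step 1, while the function names, the 1% tolerance, and the pixel sizes in the example are assumptions.

```python
from fractions import Fraction

# Aspect ratios named in Step 1 (width:height).
TARGET_RATIOS = {
    "9:16": Fraction(9, 16),   # Reels
    "16:9": Fraction(16, 9),   # YouTube
    "4:5": Fraction(4, 5),     # Instagram
}


def closest_ratio(width: int, height: int) -> str:
    """Return the name of the target ratio nearest to width/height."""
    actual = Fraction(width, height)
    return min(TARGET_RATIOS, key=lambda name: abs(TARGET_RATIOS[name] - actual))


def matches_ratio(width: int, height: int, name: str, tolerance: float = 0.01) -> bool:
    """True if width/height is within `tolerance` (relative) of the named ratio."""
    target = float(TARGET_RATIOS[name])
    return abs(width / height - target) <= tolerance * target


if __name__ == "__main__":
    # Example dimensions only; use the real size of your starting image.
    width, height = 1080, 1920
    print(closest_ratio(width, height), matches_ratio(width, height, "9:16"))
```

If the check fails for the ratio you plan to use in Image-to-Video, regenerate or crop the image first.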
