How to Create a Video of an AI Girl Eating a Banana or Sucking a Lollipop
Step-by-step guide to generating natural, cinematic eating and licking motions with AI using text-to-image and image-to-video tools
This tutorial shows you how to turn a single AI image into a full video of a girl eating a banana or seductively sucking/licking a Chupa Chups lollipop — using Text-to-Image and Image-to-Video tools inside ZenCreator.
You can use this workflow for any type of food or candy: fruit, snacks, desserts, lollipops, ice cream — whatever fits your creative vision.
Step 1 — Generate Your Starting Image (Text-to-Image)
Open Text-to-Image: https://app.zencreator.pro/tools/generator-by-prompt
Choose a model — for example Nana Banana, which works well for realistic portraits.
Here are ready-to-use prompt examples:
For Banana version:
A 20-year-old woman with red hair tied in a high ponytail, large breasts, and plump lips, holding a banana near her mouth. She is wearing a black leather top and black leather leggings.
For Chupa Chups Lollipop version:
A 20-year-old woman with black hair tied in two ponytails on either side of her head, large breasts, and plump lips, holding a lollipop near her mouth. She is wearing a pink T-shirt and a knee-length skirt. She is kneeling on the bed.
Copy and paste these prompts as-is, or adjust the appearance and clothing to taste.
Next, select:
- Number of image variations
- Aspect ratio (9:16 for Reels, 16:9 for YouTube, 4:5 for Instagram)
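If you plan to crop or re-export the result in an editor, it helps to know the pixel dimensions each aspect ratio implies. The sketch below is only an illustration: the `frame_size` helper and the 1080-pixel short side are assumptions, not settings taken from ZenCreator, and the tool may render at a different resolution.

```python
from fractions import Fraction


def frame_size(aspect: str, short_side: int = 1080) -> tuple[int, int]:
    """Return (width, height) in pixels for a 'W:H' aspect ratio.

    short_side is an assumed export size, not a ZenCreator setting.
    """
    w, h = (int(part) for part in aspect.split(":"))
    ratio = Fraction(w, h)
    if ratio >= 1:
        # Landscape or square: the height is the short side.
        return round(short_side * ratio), short_side
    # Portrait: the width is the short side.
    return short_side, round(short_side / ratio)


for aspect in ("9:16", "16:9", "4:5"):
    print(aspect, frame_size(aspect))
```

With the default short side this gives 1080x1920 for 9:16, 1920x1080 for 16:9, and 1080x1350 for 4:5.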
Click Generate.
Pick the best-looking frame.
Step 2 — Animate the Image (Image-to-Video)
Open Image-to-Video: https://app.zencreator.pro/tools/video-generator
Upload your chosen image.
Select a generation model.
Important: When choosing the video generation model, pay attention to whether the model is marked SFW or NSFW:
- SFW models → will automatically censor or soften explicit licking/sucking motions (safe for TikTok/Instagram)
- NSFW models → allow full uncensored tongue play, visible saliva, deep motions, moaning audio, etc. (perfect for OnlyFans, Twitter/X, spicier platforms)
Choose accordingly depending on how explicit you want the final video to be.
Now paste this exact movement prompt:
For Banana version:
A beautiful woman playfully and seductively teasing a banana, her full glossy lips gliding around it in slow, intimate motions. She toys with it sensually — brushing it with the tip of her tongue, sliding her lips along the length, keeping the rhythm soft, erotic, and inviting. Her expression is flirty, warm, dreamy, filled with quiet sexual tension as she plays with the fruit. And then, after all the teasing, she gently bites off a piece — a slow, deliberate, sensual bite — and eats it with a soft, satisfied smile. Warm diffused lighting, creamy bokeh, velvety atmosphere, fluid and intimate movements.
For Chupa Chups Lollipop version:
A beautiful woman playfully and seductively teasing a lollipop, her full glossy lips gliding around it in slow, intimate motions. She toys with it sensually — brushing it with the tip of her tongue, sliding her lips along the candy, keeping the rhythm soft, erotic, and inviting. Her expression is flirty, warm, dreamy, filled with quiet playful tension as she plays with the treat. Warm diffused lighting, creamy bokeh, velvety atmosphere, fluid and intimate movements.
Customize it:
- Specify the eating action (bite, taste, chew, smile, interact with food)
- Describe camera motion (slow zoom, pan, handheld feel)
- Set emotional tone (playful, cute, natural, cinematic)
- Add environment details (kitchen, café, outdoors)
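If you generate many variations, the four customization points above can be kept as separate fields and joined into one prompt. This is a minimal sketch: `build_motion_prompt` and the "Camera / Mood / Setting" labels are illustrative assumptions, not a format that ZenCreator requires, and the result is simply text to paste into the prompt box.

```python
def build_motion_prompt(subject: str, action: str, camera: str = "",
                        tone: str = "", environment: str = "") -> str:
    """Join the customization fields into a single prompt string.

    Empty optional fields are skipped.
    """
    sentences = [f"{subject} {action}".strip()]
    if camera:
        sentences.append(f"Camera: {camera}")
    if tone:
        sentences.append(f"Mood: {tone}")
    if environment:
        sentences.append(f"Setting: {environment}")
    # Drop trailing periods so the join does not double them.
    return ". ".join(s.rstrip(".") for s in sentences) + "."


prompt = build_motion_prompt(
    "A woman",
    "takes a small bite of a banana and smiles",
    camera="slow zoom in",
    tone="playful, natural",
    environment="sunlit kitchen",
)
print(prompt)
```

Changing one field at a time (for example, only the camera motion) makes it easier to see which part of the prompt affects the result.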
Then select:
- Aspect ratio
- Duration of the video
Click Generate.
Your video is ready!
Video Examples
Banana Eating Example
Watch how the banana animation creates natural, playful motions:
Tips for Best Results
Generate several variations
Try different camera motions and eating gestures — then combine them into a longer scene.
Keep the food visible
The model animates the food more reliably when it is clearly visible and unobstructed in the starting image.
Use soft, flattering lighting
Natural or cinematic lighting produces realistic eating animations.
Choose an expressive facial prompt
Words like "natural expression," "soft smile," "gentle interaction," "realistic skin texture" help.
Audio makes a difference
WAN + Audio can generate ambient sound, or you can upload your own track for realism.
Create a Full Video Sequence
For a longer video:
- Generate multiple short clips
- Use slightly different prompts
- Animate from different image angles
- Edit them together in any video editor (CapCut, Premiere, VN, DaVinci)
This creates a smooth, natural, extended sequence of the girl interacting with food.
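If you prefer the command line to a graphical editor, the clips can also be joined with ffmpeg's concat demuxer. The sketch below only builds the list file contents and the command. The clip and output file names are hypothetical, and `-c copy` assumes all clips share the same codec, resolution, and frame rate (re-encode instead if they differ).

```python
import shlex


def concat_list(clips: list[str]) -> str:
    """Build the text of an ffmpeg concat-demuxer list file."""
    lines = []
    for clip in clips:
        # Each entry is: file '<path>'. A single quote inside the path
        # is written as '\'' per the concat demuxer quoting rules.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output: str) -> list[str]:
    """Return the ffmpeg arguments that join the listed clips without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output]


# Hypothetical file names, used for illustration only.
clips = ["clip_01.mp4", "clip_02.mp4", "clip_03.mp4"]
print(concat_list(clips), end="")
print(shlex.join(concat_command("clips.txt", "sequence.mp4")))
```

Save the printed list as `clips.txt` next to the clips, then run the printed command in the same folder.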