How to Create Photorealistic Interior Design Mockups with Veo 3 — No 3D Modeling Required

Learn how to generate photorealistic interior mockups with Veo 3 — furnished rooms, material swaps, lighting variations, and before/after renovations from a single reference photo.

Hiring a 3D visualization studio to render a kitchen remodel costs anywhere from $500 to $2,000 per image, takes a week, and requires you to explain to someone in great detail exactly what “warm but not too warm” means for a paint color. Veo 3 does a version of this in under two minutes, from a photo you took on your phone. Is it a perfect replacement for a professional CGI pipeline? No. Is it good enough to close deals, pitch clients, and run design concepts past homeowners who can’t visualize anything from a flat floor plan? Absolutely.

This tutorial walks through a complete workflow for generating photorealistic interior design mockups using Veo 3 — covering furniture placement, material swaps, lighting variations, and before/after renovation sequences. No 3D modeling, no SketchUp, no drama. Just a reference photo, a well-structured prompt, and some patience with aspect ratios.

A note on expectations before we dive in: Veo 3 is a video generation model, which means it outputs short clips rather than static renders. For interior design work, you’ll mostly be working with near-static or slowly panning shots — think a 3–5 second dolly through a room, not a full architectural animation. For still exports, you screenshot the best frame. It works better than it sounds.

What You’ll Achieve

By the end of this tutorial, you’ll know how to furnish an empty room from a bare reference photo, swap wall colors and flooring materials on an existing interior, create a before/after renovation sequence using two chained prompts, produce lighting variations (golden hour, overcast, artificial evening) from the same room, and export still frames good enough for client presentations and listings.

What You Need

You need access to Veo 3 through a Google product that exposes it, such as Flow (the Google Labs video tool that succeeded VideoFX) or the Gemini app on a paid Google AI plan. Product and plan names have changed over time, so check what your subscription currently includes. You’ll also want at least one high-quality reference photo of the space you’re working with, ideally shot straight-on with decent natural light, not a fisheye lens nightmare. A second photo showing your desired style reference (a room you like the look of) helps enormously for style-transfer prompts. That’s genuinely it. No plugins, no external tools required for the basic workflow.

Pro tip ✅

Shoot your reference photo in landscape orientation with your phone’s standard camera, stepping back to fit as much of the room as you can, rather than switching to the ultra-wide lens. Veo 3 handles standard perspective well. Extreme wide-angle distortion confuses material mapping and makes floors look like they’re melting.

Step 1: Analyze Your Reference Photo Before Touching a Prompt

Veo 3’s image-to-video pipeline anchors generation to your reference image. The more precisely your prompt describes what’s already in the photo, the more control you have over what changes. Before writing a single word, mentally catalog: floor material (tile, hardwood, carpet — and approximate color), wall color and finish, ceiling height (is it visible?), light sources (windows, fixtures), and any furniture or objects already in frame. You’ll describe these elements as “existing” in your prompt so the model knows to preserve them — or explicitly name them as things to replace.

Note 💡

Veo 3 doesn’t have a dedicated “inpaint this region only” control like Photoshop. Your prompt structure does that job. The phrase “preserve existing [element], replace only [element]” is doing heavy lifting throughout this workflow — don’t skip it.

Step 2: The Empty Room Furnishing Prompt

This is the most requested use case from real estate agents: an empty listing photo transformed into a styled, furnished room. The key is describing furniture in terms of style, material, and placement — not brand names, which Veo 3 ignores anyway.

photorealistic interior design video, slow dolly forward, unfurnished living room with white walls and light oak hardwood floors transformed into a furnished Scandinavian living space, add low-profile beige linen sofa facing a marble-top coffee table, wool area rug in muted sage green, single large fiddle leaf fig plant in terracotta pot near window, warm afternoon sunlight from left-side window, architectural photography style, 4K, no text, no people

That prompt structure — transformation statement, then itemized furniture list, then lighting, then style qualifier — gives the model a clear hierarchy. Change “Scandinavian” to “mid-century modern” and swap “beige linen” for “burnt orange velvet” and you have a completely different room without rewriting the whole prompt. Always end with “no text, no people” unless you specifically want those elements, because Veo 3 will occasionally hallucinate both.

photorealistic interior design video, static camera, unfurnished master bedroom with gray carpet and pale blue walls transformed into a furnished maximalist bedroom, add upholstered emerald green king bed with brass frame, two matching nightstands with ceramic table lamps, heavy velvet curtains in deep burgundy, warm evening lighting, shallow depth of field, architectural photography style, 4K, no text, no people

Notice “static camera” in the second prompt. For still-frame exports, a static or very slow pan gives you cleaner frames than a moving dolly. Use dolly shots when you want to show a client a walkthrough feel; use static when you’re screenshot-hunting for a presentation deck.
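The prompt order described above (transformation statement, furniture list, lighting, style qualifiers) can be captured in a small helper so that swapping one element does not mean retyping the rest. This is a minimal Python sketch, not part of Veo 3 or any Google tooling: the `FurnishingPrompt` class and its field names are invented for illustration, and it only assembles the text you would paste into the prompt box.

```python
from dataclasses import dataclass, field, replace


@dataclass
class FurnishingPrompt:
    """Hypothetical container that mirrors the prompt order used in this tutorial."""
    camera: str            # e.g. "slow dolly forward" or "static camera"
    existing_room: str     # what is already visible in the reference photo
    target_style: str      # e.g. "furnished Scandinavian living space"
    furniture: list = field(default_factory=list)
    lighting: str = ""
    qualifiers: tuple = (
        "architectural photography style", "4K", "no text", "no people",
    )

    def render(self) -> str:
        # Order matters: transformation, furniture, lighting, then qualifiers.
        parts = [
            "photorealistic interior design video",
            self.camera,
            f"{self.existing_room} transformed into a {self.target_style}",
        ]
        if self.furniture:
            parts.append("add " + ", ".join(self.furniture))
        if self.lighting:
            parts.append(self.lighting)
        parts.extend(self.qualifiers)
        return ", ".join(parts)


living_room = FurnishingPrompt(
    camera="slow dolly forward",
    existing_room="unfurnished living room with white walls and light oak hardwood floors",
    target_style="furnished Scandinavian living space",
    furniture=[
        "low-profile beige linen sofa facing a marble-top coffee table",
        "wool area rug in muted sage green",
        "single large fiddle leaf fig plant in terracotta pot near window",
    ],
    lighting="warm afternoon sunlight from left-side window",
)

# Swap the style and go static for screenshot-hunting without retyping the rest.
variant = replace(
    living_room,
    camera="static camera",
    target_style="furnished mid-century modern living space",
)

print(living_room.render())
print(variant.render())
```

The first `print` reproduces the Scandinavian living room prompt from this step; the second keeps the same furniture and lighting but changes the style and camera.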

Pro tip ✅

Generate the same furnishing prompt three times before settling. Veo 3 has meaningful variation between runs: the first output might place the sofa awkwardly against a wall, the second gets the proportions right, the third nails the lighting. Three runs take about six minutes total and dramatically improve your hit rate.

Step 3: Material and Color Swap Prompts

This is where interior designers actually save time. A client can’t decide between white oak flooring and dark walnut. Instead of re-rendering in a 3D tool, you run two prompts from the same reference.

photorealistic interior design video, static camera, existing living room with beige walls and light hardwood floors, replace flooring with dark walnut herringbone hardwood, preserve all existing furniture and wall color, preserve existing window position and natural daylight, architectural photography style, 4K, no text, no people

photorealistic interior design video, static camera, existing living room with beige walls and light hardwood floors, repaint walls in deep forest green matte finish, preserve all existing furniture and flooring, preserve existing window position and natural daylight, architectural photography style, 4K, no text, no people

The phrase “preserve all existing furniture” is doing critical work here. Without it, Veo 3 treats the whole scene as fair game and will rearrange, replace, or remove things seemingly at random. With it, the model treats existing elements as fixed constraints and focuses change on what you’ve specified. It’s not perfect — you’ll occasionally see a couch warp slightly or a lamp migrate — but it’s reliable enough for client-facing work 70–80% of the time.

Warning ⚠️

Avoid prompting for multiple simultaneous changes in a single run — “replace flooring AND repaint walls AND swap the sofa” typically produces worse results than running three separate targeted prompts. Veo 3 handles constrained, focused changes more accurately than wholesale room transformations.
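One way to enforce the one-change-per-run rule is to generate the prompts from a list of changes, so each prompt carries exactly one instruction plus its own "preserve" clause. The sketch below is illustrative Python only; `targeted_prompts` is a hypothetical helper name, and the wording it emits simply follows the swap prompts in this step.

```python
BASE = "photorealistic interior design video"
TAIL = "architectural photography style, 4K, no text, no people"


def targeted_prompts(existing_room, changes, camera="static camera"):
    """Return one prompt per change.

    `changes` is a list of (instruction, elements_to_preserve) pairs, so a
    single prompt never asks for more than one modification.
    """
    prompts = []
    for instruction, preserve in changes:
        prompts.append(", ".join([
            BASE,
            camera,
            existing_room,
            instruction,
            f"preserve all existing {preserve}",
            "preserve existing window position and natural daylight",
            TAIL,
        ]))
    return prompts


swap_prompts = targeted_prompts(
    "existing living room with beige walls and light hardwood floors",
    [
        ("replace flooring with dark walnut herringbone hardwood",
         "furniture and wall color"),
        ("repaint walls in deep forest green matte finish",
         "furniture and flooring"),
    ],
)

for prompt in swap_prompts:
    print(prompt)
    print()
```

Each entry in the list becomes its own run against the same reference photo.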

Step 4: Before/After Renovation Sequences

For renovation pitches, a before/after sequence is the most persuasive deliverable you can hand a client. The technique here is chaining two prompts — one anchored to the existing reference photo (the “before”), one using the first output as a visual reference for the transformed state (the “after”).

Run your reference photo through this prompt first:

photorealistic interior design video, slow rightward pan, dated kitchen with laminate countertops, dark wood cabinets, linoleum floor, fluorescent overhead lighting, existing layout preserved, no modifications, architectural photography style, 4K, no text, no people

Export the best frame as a still. Now use that still as your new reference image for the transformation prompt:

photorealistic interior design video, slow rightward pan matching previous camera movement, same kitchen layout fully renovated, replace laminate countertops with white Calacatta marble, replace dark cabinets with shaker-style white cabinetry, replace linoleum with large-format white hexagonal tile, add pendant lighting over island, warm natural light from added window above sink, architectural photography style, 4K, no text, no people

The instruction “matching previous camera movement” helps maintain spatial consistency between the two clips when you cut them together in any basic video editor. It won’t be frame-perfect, but clients watching a before/after cut don’t need frame-perfect — they need to feel the transformation.
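The chaining logic is easier to see when written out. The following Python sketch deliberately avoids any real Veo 3 API: `generate` and `best_frame` are placeholder callables you would supply yourself (for example, a manual step, or a wrapper around whatever interface you use), and the file names are invented.

```python
from typing import Callable, Tuple


def before_after(
    reference_photo: str,
    before_prompt: str,
    after_prompt: str,
    generate: Callable[[str, str], str],
    best_frame: Callable[[str], str],
) -> Tuple[str, str]:
    """Chain two generations.

    The 'before' clip is anchored to the original photo; the 'after' clip is
    anchored to a still exported from the 'before' clip, not to the photo.
    """
    before_clip = generate(before_prompt, reference_photo)
    anchor_still = best_frame(before_clip)
    after_clip = generate(after_prompt, anchor_still)
    return before_clip, after_clip


# Stand-in callables so the sketch runs without any video service.
def demo_generate(prompt: str, image: str) -> str:
    return f"clip_from_{image.rsplit('.', 1)[0]}.mp4"


def demo_best_frame(clip: str) -> str:
    return clip.replace(".mp4", "_still.png")


clips = before_after(
    "kitchen.jpg",
    "dated kitchen, existing layout preserved, no modifications",
    "same kitchen layout fully renovated",
    demo_generate,
    demo_best_frame,
)
print(clips)
```

The point of the structure is the middle step: the second generation never sees the original photo, only the exported still.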

Pro tip ✅

Add a 0.5-second dissolve transition between before and after clips in any editor (CapCut, DaVinci Resolve, even iMovie). A hard cut between renders usually highlights inconsistencies; a dissolve makes the comparison feel intentional and polished, and it tends to read as more professional, which is just human psychology doing its thing.
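If you would rather script the dissolve than open an editor, ffmpeg's `xfade` filter can do it. This Python sketch only builds the command line; it assumes ffmpeg is installed, that both clips share resolution and frame rate (which `xfade` requires), and that you know the length of the "before" clip. The 5-second duration and the file names are placeholders. Audio is dropped with `-an` to keep the example short.

```python
import shlex


def dissolve_command(before_clip, after_clip, before_duration_s,
                     output="before_after.mp4", dissolve_s=0.5):
    """Build an ffmpeg command that cross-dissolves two clips."""
    # The transition starts this many seconds into the first clip.
    offset = before_duration_s - dissolve_s
    if offset < 0:
        raise ValueError("dissolve is longer than the first clip")
    filter_graph = (
        f"[0:v][1:v]xfade=transition=fade:"
        f"duration={dissolve_s}:offset={offset},format=yuv420p[v]"
    )
    return [
        "ffmpeg", "-y",
        "-i", before_clip,
        "-i", after_clip,
        "-filter_complex", filter_graph,
        "-map", "[v]",
        "-an",  # drop audio; a crossfaded soundtrack needs a separate filter
        output,
    ]


cmd = dissolve_command("kitchen_before.mp4", "kitchen_after.mp4",
                       before_duration_s=5.0)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```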

Step 5: Lighting Variations

Lighting is the fastest way to sell a room. A dining room at golden hour feels completely different from the same room under cool overcast daylight, and both feel different from a candlelit evening scene. Veo 3 handles lighting description well when you’re specific about light color temperature, direction, and intensity.

photorealistic interior design video, static camera, furnished dining room with white walls and natural wood table, golden hour sunlight streaming through west-facing windows, long warm shadows across table, light color temperature 2700K, dust particles visible in light shafts, architectural photography style, 4K, no text, no people

photorealistic interior design video, static camera, same furnished dining room with white walls and natural wood table, overcast afternoon light, soft diffused shadows, cool blue-white daylight, no direct sun, color temperature 5500K, architectural photography style, 4K, no text, no people

photorealistic interior design video, static camera, same furnished dining room with white walls and natural wood table, evening scene, overhead pendant lights on, warm amber glow, candles lit on table, no exterior light, dramatic contrast between lit and shadowed areas, architectural photography style, 4K, no text, no people

Running all three lighting variations from the same reference takes about 15 minutes total once you allow for a rerun or two per variation. The resulting set gives a designer or agent three entirely different emotional registers for the same physical space, which, if you’ve ever tried to explain “cozy but sophisticated” to a client, is worth considerably more than 15 minutes.

Pro tip ✅

Use color temperature numbers (2700K, 4000K, 5500K, 6500K) rather than adjectives like “warm” or “cool.” Veo 3 responds to technical lighting language more precisely than mood words, and the difference between 2700K and 4000K outputs is significant enough to matter for client presentations.
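Keeping the room description fixed and varying only the lighting clause and Kelvin value is easy to script. The Python below is a rough sketch with invented names (`lighting_variants`, `SETUPS`); the 2700K and 5500K values come from the prompts in this step, while the 2700K figure for the evening scene is an assumption, since the original evening prompt gives no number.

```python
def lighting_variants(room, setups):
    """Return one prompt per lighting setup for the same room description.

    `setups` maps a label to (lighting description, color temperature in K).
    """
    prompts = {}
    for name, (description, kelvin) in setups.items():
        if not 1000 <= kelvin <= 10000:
            raise ValueError(f"{name}: {kelvin}K is outside the usual lighting range")
        prompts[name] = ", ".join([
            "photorealistic interior design video",
            "static camera",
            room,
            description,
            f"color temperature {kelvin}K",
            "architectural photography style, 4K, no text, no people",
        ])
    return prompts


SETUPS = {
    "golden_hour": (
        "golden hour sunlight streaming through west-facing windows, "
        "long warm shadows across table", 2700),
    "overcast": (
        "overcast afternoon light, soft diffused shadows, no direct sun", 5500),
    # Assumed value: the tutorial's evening prompt does not state a Kelvin figure.
    "evening": (
        "evening scene, overhead pendant lights on, candles lit on table, "
        "no exterior light", 2700),
}

variants = lighting_variants(
    "furnished dining room with white walls and natural wood table", SETUPS
)
for name, prompt in variants.items():
    print(f"[{name}] {prompt}")
```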

Step 6: Exporting and Presenting Still Frames

Veo 3 outputs video, but for most interior design deliverables (client PDFs, listing photos, mood boards) you need stills. The workflow is simple: play the generated clip, pause at the sharpest, best-composed frame, and screenshot it, or better, pull that frame from the downloaded file. Keep in mind that a screenshot on a 4K monitor measures 3840 × 2160 pixels only because the player upscales the clip; the real detail is capped by the clip’s native resolution, typically 720p or 1080p. That is easily sufficient for digital use and small prints, but large prints such as A3 will look soft unless you run the still through an upscaler first.

For presentation decks, crop the still to 16:9 or square depending on your format. For property listings, check platform specs: many MLS platforms downscale or cap large uploads anyway, so a 1080p-class frame is usually enough. If you need a cleaner still than a screenshot provides, run the same prompt with “single frame, no camera movement”. Veo 3 will generate something closer to a static render, though output quality varies.
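How large a still can print depends on its pixel count and the print resolution you target, and the arithmetic is worth checking before promising a client a poster. This is a small illustrative Python calculation; the function names are made up, 300 dpi is the common target for close-viewed prints, and 150 dpi is a looser assumption for pieces viewed from a distance.

```python
A3_LANDSCAPE_CM = (42.0, 29.7)


def print_size_cm(width_px, height_px, dpi):
    """Largest print (width, height) in cm at the given dots per inch."""
    return (round(width_px / dpi * 2.54, 1), round(height_px / dpi * 2.54, 1))


def covers(width_px, height_px, paper_cm, dpi):
    """True if the image fills the paper at the given dpi without upscaling."""
    width_cm, height_cm = print_size_cm(width_px, height_px, dpi)
    return width_cm >= paper_cm[0] and height_cm >= paper_cm[1]


for label, (w, h) in {"4K frame": (3840, 2160), "1080p frame": (1920, 1080)}.items():
    for dpi in (300, 150):
        size = print_size_cm(w, h, dpi)
        fits = covers(w, h, A3_LANDSCAPE_CM, dpi)
        print(f"{label} at {dpi} dpi: {size[0]} x {size[1]} cm, covers A3: {fits}")
```

A 3840 × 2160 image comes out at about 32.5 × 18.3 cm at 300 dpi, so it only fills an A3 sheet at lower print resolutions.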

Avoid 🚫

Don’t use frames from the first or last 0.5 seconds of a Veo 3 clip for still exports. The model tends to fade in and out, and those frames are softer and more prone to artifacts. Seek frames from the 30–70% mark of the clip duration for the sharpest results.
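The 30–70% rule translates directly into timestamps. The sketch below, in Python, picks evenly spaced candidates inside that window and builds an ffmpeg command to save each one as a PNG; it assumes ffmpeg is installed and that you have the clip as a local file. The helper names and the 5-second duration are illustrative.

```python
def candidate_timestamps(duration_s, count=5, start=0.30, end=0.70):
    """Evenly spaced timestamps (seconds) between the start and end fractions."""
    if count < 2:
        return [round(duration_s * (start + end) / 2, 3)]
    step = (end - start) / (count - 1)
    return [round(duration_s * (start + i * step), 3) for i in range(count)]


def extract_frame_command(clip, timestamp_s, output):
    """Build an ffmpeg command that writes the frame at `timestamp_s` to `output`."""
    return [
        "ffmpeg", "-y",
        "-ss", str(timestamp_s),  # seek before decoding the input
        "-i", clip,
        "-frames:v", "1",         # write exactly one video frame
        output,
    ]


stamps = candidate_timestamps(5.0)
commands = [
    extract_frame_command("living_room.mp4", t, f"still_{i:02d}.png")
    for i, t in enumerate(stamps)
]
print(stamps)
for command in commands:
    print(" ".join(command))
# To actually run one: subprocess.run(commands[0], check=True)
```

From there, compare the exported PNGs side by side and keep the sharpest one.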

Real-World Workflow: A 20-Minute Listing Prep

Here’s what a practical session looks like for a real estate agent prepping a vacant two-bedroom listing. Take three reference photos: living room, master bedroom, kitchen. For the living room, run the furnishing prompt and generate three variations in different furniture styles (contemporary, transitional, cozy/rustic). Pick the best output, screenshot the best frame. For the master bedroom, run a furnishing prompt plus two lighting variations (morning and evening). For the kitchen, run a material swap showing updated countertops and cabinet color. Total prompts: eight to ten. Total time including generation: 20–25 minutes. Total output: a set of styled room images that would have cost $800–1,500 from a staging photographer and virtual staging service — delivered before you’ve finished your second coffee.
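The time budget above is simple arithmetic, sketched here in Python as a sanity check. The run counts follow the walkthrough (three living room styles, one bedroom furnishing plus two lighting variations, one kitchen swap); the two minutes per run is the rough upper figure quoted at the start of this tutorial, and the rerun count is an assumption.

```python
RUNS = {"living room": 3, "master bedroom": 3, "kitchen": 1}


def session_minutes(runs_by_room, reruns=0, minutes_per_run=2.0, review_minutes=0.0):
    """Estimate session length from run counts, reruns, and review time."""
    total_runs = sum(runs_by_room.values()) + reruns
    return total_runs * minutes_per_run + review_minutes


base = session_minutes(RUNS)                    # the seven planned runs only
with_reruns = session_minutes(RUNS, reruns=3)   # ten runs in total
print(base, with_reruns)
```

Seven planned runs come to about 14 minutes of generation; three reruns bring it to 20, before counting any time spent picking frames.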

What This Actually Means for Your Workflow

Veo 3 doesn’t replace architects, doesn’t replace interior designers with real taste, and definitely doesn’t replace professional photography for final listing photos. What it replaces is the expensive, slow middle layer — the visualization work that happens between “client meeting” and “client decision.” That layer has historically required either a 3D modeling budget or a very patient client who can imagine things. Neither is reliable. A prompt-driven mockup that takes 20 minutes and looks photorealistic enough to communicate the idea? That changes the speed of a design conversation fundamentally. The fact that it requires no software expertise beyond writing a coherent sentence is what makes it actually usable by the people who need it most — not the designers who already have the 3D tools, but the agents and decorators and homeowners who never did.

promptyze