How to Generate Architectural Renderings in Veo 3 — No SketchUp Needed
Learn how to generate photorealistic architectural renderings in Veo 3 using text prompts and sketches — no 3D modeling software required.
Architectural renderings used to mean one of two things: hiring a visualization studio that charges like a law firm, or spending three days wrestling with SketchUp, V-Ray, and your own dwindling patience. Veo 3 changes that math considerably. Google’s latest video generation model, available through the Gemini platform, can produce photorealistic exterior and interior shots from nothing but a text prompt — or a rough sketch you photographed with your phone. The results won’t replace a dedicated architectural CGI pipeline for final client deliverables, but for early-stage concept visualization, design iteration, and quick stakeholder presentations, they’re genuinely impressive.
This tutorial walks through the full workflow: setting up Veo 3, structuring prompts that actually control what you get, using sketch inputs to anchor the geometry, and running multiple shots of the same scene with consistent spatial relationships. No 3D modeling software required. A reasonable laptop and a Gemini account will do.
What You’ll Achieve
By the end of this guide you’ll be able to generate a coherent multi-angle architectural visualization — exterior facade, interior shot, and a detail close-up — from a single text description or uploaded sketch. You’ll know how to control lighting conditions, material appearance, and camera position through prompt language alone, and you’ll understand where Veo 3’s spatial consistency tools help and where they still need a human eye to catch problems.
What You Need Before Starting
You need access to Veo 3 through the Gemini platform. As of early 2026, Veo 3 is available via Google AI Studio (aistudio.google.com) and through Gemini Advanced tiers — check your current plan, since access levels vary. You don’t need any 3D software. If you want to use sketch input, a photo of a hand-drawn floor plan or elevation drawing works fine. Keep your sketches high-contrast — black lines on white paper photograph better than pencil on grey.
Note 💡
Veo 3 access through Google AI Studio may require joining a waitlist depending on your region and account tier. Check labs.google.com for the current access status before planning a deadline around this workflow.
Step 1 — Set Up Your Project in AI Studio
Open Google AI Studio and select Veo 3 from the model dropdown. You’ll be working in the video generation interface. Before writing a single prompt, set up a project folder or session label — something like “Riverside Apartment Block — Concept A.” This matters because spatial consistency across multiple shots depends on keeping generations within the same session context, where Veo 3 can maintain reference to previously generated frames.
If you have a sketch, upload it now as your base image input. Use the image upload button in the prompt interface. Veo 3 supports image-to-video generation, meaning your sketch becomes the spatial anchor for everything that follows. A clean elevation drawing gives the model a defined silhouette to work from; a floor plan helps with interior shots.
Pro tip ✅
Photograph your sketch in natural daylight rather than under warm indoor bulbs. Yellow-tinted scans confuse the model’s edge detection and produce softer geometry in the output. High contrast = sharper architectural lines.
Step 2 — Write Your Foundation Prompt
The foundation prompt defines the building, its context, and the visual style for the entire session. Get this right and subsequent shots are much easier to control. The formula that works consistently is: [Building type] + [Architectural style] + [Key materials] + [Setting/context] + [Lighting condition] + [Camera position] + [Visual style].
Here’s a foundation prompt for a contemporary residential building:
A five-story contemporary residential apartment building, minimalist Scandinavian architecture, floor-to-ceiling glazing with thin aluminum frames, white rendered concrete facade with vertical cedar wood cladding panels, situated on a quiet tree-lined urban street, late afternoon golden hour sunlight casting long shadows across the facade, wide-angle exterior shot from street level, slight upward camera tilt, photorealistic architectural visualization, sharp depth of field, 8K quality
This prompt covers every variable that matters: scale, style, materials (concrete, aluminum, cedar), environmental context (urban street, trees), lighting (golden hour, shadow direction), camera position (street level, slight upward tilt), and output quality signal (photorealistic, 8K). Remove any one of these and the model fills the gap with a guess — sometimes a good guess, often not.
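The seven-part formula above can also be kept as structured data so that no slot is left empty by accident. What follows is a minimal Python sketch, not part of any Veo 3 or Gemini API: the `FoundationPrompt` class and its field names are hypothetical, and it only assembles the text you would paste into the prompt box.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FoundationPrompt:
    """One field per slot in the foundation formula (names are illustrative)."""
    building_type: str
    style: str
    materials: str
    setting: str
    lighting: str
    camera: str
    visual_style: str

    def render(self) -> str:
        # Refuse to build a prompt with an empty slot, since the model
        # would otherwise fill that gap with a guess.
        parts = []
        for f in fields(self):
            value = getattr(self, f.name).strip()
            if not value:
                raise ValueError(f"missing prompt component: {f.name}")
            parts.append(value)
        return ", ".join(parts)


foundation = FoundationPrompt(
    building_type="A five-story contemporary residential apartment building",
    style="minimalist Scandinavian architecture",
    materials=(
        "floor-to-ceiling glazing with thin aluminum frames, white rendered "
        "concrete facade with vertical cedar wood cladding panels"
    ),
    setting="situated on a quiet tree-lined urban street",
    lighting="late afternoon golden hour sunlight casting long shadows across the facade",
    camera="wide-angle exterior shot from street level, slight upward camera tilt",
    visual_style="photorealistic architectural visualization, sharp depth of field, 8K quality",
)

print(foundation.render())
```

Rendering the fields in order reproduces the example prompt, and blanking any field raises an error instead of silently producing a thinner prompt.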
Step 3 — Control Lighting with Precision
Lighting is where architectural renderings live or die, and Veo 3 responds well to specific light descriptors. Vague terms like “good lighting” produce mediocre, flat results. You need to name the light source, its position, its color temperature, and its effect on surfaces.
Here are four lighting variants for the same building. Only the lighting description changes between them, yet each one shifts the entire mood:
Overcast soft diffuse daylight, even illumination across the facade, no harsh shadows, matte material surfaces clearly visible, cool grey sky backdrop, photorealistic exterior architectural shot
Dawn light from the east, warm orange sunrise glow raking across the glazed facade, long horizontal shadows, interior lights visible through the windows, atmospheric morning mist at street level, photorealistic exterior architectural shot
Dusk blue hour lighting, deep indigo sky, interior warm artificial lighting glowing from every window, exterior facade lit by recessed ground uplights, high contrast between warm interior and cool exterior, photorealistic exterior architectural shot
Bright midday summer sun from directly overhead, sharp hard-edged shadows, high contrast, intense reflections in the glazing, vivid material colors, clear blue sky with scattered cirrus clouds, photorealistic exterior architectural shot
Each of these will generate a meaningfully different rendering of the same building. The blue hour prompt consistently produces the most dramatic results for client presentations: the contrast between warm interior light and cool exterior sky is striking, and Veo 3 handles it well.
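One way to keep the lighting as the only moving part is to hold the rest of the prompt in a template and substitute the lighting text. This is a minimal Python sketch for assembling prompt strings only. It does not call Veo 3, and the names `BASE`, `LIGHTING`, and `lighting_variants` are illustrative assumptions. The base description is shortened from the Step 2 foundation prompt.

```python
# Everything except the lighting stays fixed across variants.
BASE = (
    "A five-story contemporary residential apartment building, "
    "minimalist Scandinavian architecture, {lighting}, "
    "photorealistic exterior architectural shot"
)

# Lighting descriptors condensed from the four variants above.
LIGHTING = {
    "overcast": "overcast soft diffuse daylight, even illumination across the facade, no harsh shadows",
    "dawn": "dawn light from the east, warm orange sunrise glow raking across the glazed facade",
    "blue_hour": "dusk blue hour lighting, deep indigo sky, interior warm artificial lighting glowing from every window",
    "midday": "bright midday summer sun from directly overhead, sharp hard-edged shadows",
}


def lighting_variants(base: str, lighting: dict[str, str]) -> dict[str, str]:
    """Return one full prompt per lighting condition, keyed by its label."""
    return {name: base.format(lighting=text) for name, text in lighting.items()}


for name, prompt in lighting_variants(BASE, LIGHTING).items():
    print(f"[{name}] {prompt}\n")
```

Keeping the labels alongside the prompts makes it easier to match each generated result back to the lighting text that produced it.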
Pro tip ✅
Add a compass direction to your lighting prompts (“low sun from the southwest”) rather than just saying “afternoon light.” It gives the model a consistent spatial reference that helps maintain shadow direction across multiple shots of the same building.
Step 4 — Specify Materials Without Ambiguity
Material rendering is one of Veo 3’s stronger suits, but you have to be specific. “Stone facade” produces generic results. “Rough-hewn travertine limestone panels with visible fossil inclusions, aged patina, horizontal coursing with 20mm shadow gaps” produces something actually useful. The more physically specific your material description, the more accurately the model renders surface behavior — how light scatters across rough concrete differently than it reflects off polished steel.
Try these material prompt blocks as components you can insert into any foundation prompt:
Facade material: hand-finished board-formed concrete with visible timber grain impressions, dark grey tone, slight surface roughness, damp from recent rain, subtle moss growth at base
Facade material: dark weathered Corten steel panels with natural rust patina, deep orange-brown coloration, slight surface texture variation, sharp shadow lines at panel joints
Facade material: large-format porcelain tiles 1200x600mm, light sand beige color, low-gloss finish, precision-cut joints with 5mm grey grout lines, perfectly regular pattern
Interior floor material: wide-plank white oak hardwood flooring, brushed finish, light grey stain, 200mm plank width, subtle grain variation, soft reflection of overhead light fittings
Warning ⚠️
Veo 3 struggles with highly repetitive small-scale patterns like fine brick courses, small mosaic tiles, or dense mesh screens. It tends to “smooth out” these textures at distance and generate visual noise up close. For these materials, reserve detailed shots for close-up prompts rather than wide establishing shots.
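The material blocks and the warning about fine patterns can be combined into a small lookup that flags risky pairings before you spend a generation on them. This is a sketch under stated assumptions: the function names are hypothetical, the `fine_brick` entry is an invented example added only to exercise the check, and the keyword test is a crude heuristic rather than anything Veo 3 exposes.

```python
# Material blocks shortened from the examples above, keyed by short labels.
MATERIALS = {
    "board_formed_concrete": (
        "hand-finished board-formed concrete with visible timber grain "
        "impressions, dark grey tone, slight surface roughness"
    ),
    "corten": (
        "dark weathered Corten steel panels with natural rust patina, "
        "deep orange-brown coloration, sharp shadow lines at panel joints"
    ),
    # Hypothetical extra entry, used to show the fine-pattern check.
    "fine_brick": "handmade red brick in stretcher bond with fine mortar joints",
}

# Pattern types the warning above says get smoothed out at distance.
FINE_PATTERN_TERMS = ("brick", "mosaic", "mesh")


def facade_prompt(material_key: str, shot: str) -> str:
    """Insert a material block into a shot description."""
    return (
        f"{shot}, facade material: {MATERIALS[material_key]}, "
        "photorealistic architectural visualization"
    )


def needs_close_up(material_key: str, shot: str) -> bool:
    """True when a fine repetitive material is paired with a wide shot."""
    text = MATERIALS[material_key].lower()
    is_fine = any(term in text for term in FINE_PATTERN_TERMS)
    return is_fine and "wide" in shot.lower()


shot = "wide-angle exterior shot from street level"
for key in MATERIALS:
    note = "consider a close-up instead" if needs_close_up(key, shot) else "ok"
    print(f"{key}: {note}")
```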
Step 5 — Generate Interior Shots from the Same Scene
Once your exterior foundation prompt is established, switching to interior shots in the same session is straightforward. The key is referencing the architectural language from your exterior prompt explicitly — this is what spatial consistency depends on. Don’t start a fresh prompt from scratch; extend the established vocabulary.
Interior of the same contemporary minimalist apartment building, open-plan living area on the third floor, floor-to-ceiling glazing from the exterior facade now visible from inside, views of the tree-lined street below, white rendered concrete ceiling with exposed slim steel beams, wide-plank white oak flooring, sparse Scandinavian furniture, late afternoon golden hour light flooding in from the west-facing windows, warm interior ambient lighting from recessed ceiling spots, photorealistic architectural interior visualization, 24mm wide-angle lens perspective, eye-level camera position
The phrase “same contemporary minimalist apartment building” and the repeated references (floor-to-ceiling glazing, white rendered concrete, golden hour light) tell Veo 3 to maintain continuity with the exterior session. You’ll notice the window proportions, material palette, and light quality stay coherent across exterior and interior shots when you cross-reference this way.
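Extending the established vocabulary is easier to enforce when the shared terms live in one place and both prompts are built from them. The sketch below is an assumption about how you might organize this in Python. The helper names are hypothetical, the prompts are shortened versions of the ones in this guide, and nothing here talks to Veo 3.

```python
# Vocabulary shared by every shot of this building. Reusing these exact
# strings keeps the exterior and interior descriptions aligned.
SHARED = {
    "building": "contemporary minimalist apartment building",
    "glazing": "floor-to-ceiling glazing",
    "concrete": "white rendered concrete",
    "light": "late afternoon golden hour light",
}


def exterior_prompt(v: dict[str, str]) -> str:
    """Shortened exterior prompt built from the shared vocabulary."""
    return (
        f"A five-story {v['building']}, {v['glazing']} with thin aluminum frames, "
        f"{v['concrete']} facade, {v['light']} casting long shadows, "
        "wide-angle exterior shot from street level"
    )


def interior_prompt(v: dict[str, str]) -> str:
    """Shortened interior prompt that reuses the same terms verbatim."""
    return (
        f"Interior of the same {v['building']}, open-plan living area on the third floor, "
        f"{v['glazing']} from the exterior facade now visible from inside, "
        f"{v['concrete']} ceiling, {v['light']} flooding in from the west-facing windows, "
        "eye-level camera position"
    )


def shared_terms_present(prompt: str, v: dict[str, str]) -> bool:
    """Check that a prompt still contains every shared term verbatim."""
    return all(term in prompt for term in v.values())


for prompt in (exterior_prompt(SHARED), interior_prompt(SHARED)):
    print(shared_terms_present(prompt, SHARED), prompt)
```

Running the check on a hand-edited prompt is a quick way to catch a shared term that was accidentally reworded between shots.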
Pro tip ✅
Generate your exterior shot first, then use it as an image input for your interior prompt. Uploading the exterior render as a reference image for the interior generation significantly improves material and light consistency — the model can literally see what it’s supposed to be working inside.
Step 6 — Control Camera Angles Systematically
Camera position in architectural visualization isn’t just aesthetic — it communicates spatial information. Use precise camera language borrowed from photography and cinematography. Veo 3 understands these terms reliably.
Bird's eye aerial view from directly above, 90-degree downward angle, showing the building footprint and roof garden layout, surrounding urban block context visible, orthographic-style perspective, photorealistic architectural visualization
Worm's eye view from ground level looking straight up at the building corner, dramatic convergence of vertical lines, glass curtain wall reflecting sky, extreme wide-angle distortion, photorealistic architectural visualization
Three-quarter frontal view from across the street, slight elevation, showing both the main facade and one side elevation, human figures at street level for scale, photorealistic architectural exterior, natural daylight
Interior perspective from the back wall of the living room looking toward the glazed facade, strong one-point perspective, furniture in foreground establishing scale, city view through windows in background, photorealistic architectural interior
Pro tip ✅
Always include a scale reference in exterior shots. Add “pedestrians on the pavement” or “parked cars at street level” to your prompt. Without human scale, even a beautifully rendered building can look like a scale model rather than a real structure.
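The camera language and the scale-reference tip can be captured as presets so that every exterior shot gets a scale cue automatically. This is a rough Python sketch. The `Camera` type, the preset keys, and the `shot_prompt` helper are all hypothetical names, and the descriptions are condensed from the four example prompts.

```python
from typing import NamedTuple


class Camera(NamedTuple):
    description: str
    exterior: bool


# Camera language condensed from the four example prompts above.
CAMERAS = {
    "birds_eye": Camera(
        "bird's eye aerial view from directly above, 90-degree downward angle", True
    ),
    "worms_eye": Camera(
        "worm's eye view from ground level looking straight up at the building corner", True
    ),
    "three_quarter": Camera(
        "three-quarter frontal view from across the street, slight elevation", True
    ),
    "interior_one_point": Camera(
        "interior perspective from the back wall of the living room looking "
        "toward the glazed facade, strong one-point perspective",
        False,
    ),
}

SCALE_REFERENCE = "pedestrians on the pavement for scale"


def shot_prompt(subject: str, camera_key: str) -> str:
    """Combine a subject with a camera preset; exterior shots get a scale cue."""
    camera = CAMERAS[camera_key]
    parts = [subject, camera.description]
    if camera.exterior:
        parts.append(SCALE_REFERENCE)
    parts.append("photorealistic architectural visualization")
    return ", ".join(parts)


subject = "five-story contemporary residential apartment building"
for key in CAMERAS:
    print(shot_prompt(subject, key))
```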
Step 7 — Iterate and Refine
Veo 3 is generative, which means every output is a proposal, not a final answer. Plan for three to five generation attempts per shot type before settling on a result. The most efficient iteration strategy is to isolate the variable you want to change rather than rewriting the whole prompt. If the lighting is right but the material looks wrong, keep your lighting language exactly as-is and swap only the material description. If the camera angle is off but everything else works, adjust only the camera terms.
When a generation comes back with a consistent architectural flaw, such as collapsing window proportions or a distorted facade, add an explicit corrective constraint. Veo 3 responds to framing like “maintain strict vertical lines, no perspective distortion, no curved or warped walls” appended to your prompt.
Avoid 🚫
Don’t try to fix everything in one giant prompt revision. Changing five variables at once makes it impossible to know which change fixed (or broke) the output. Change one thing at a time, generate, evaluate, then move to the next variable.
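The one-change-at-a-time rule can be made mechanical by storing each attempt as named components and refusing revisions that touch more than one. The following Python sketch is one possible way to do that. The function names and the component keys are assumptions for illustration, and the constraint text is treated as just another component that starts out empty.

```python
def render(spec: dict[str, str]) -> str:
    """Join the non-empty prompt components in insertion order."""
    return ", ".join(value for value in spec.values() if value)


def revise(spec: dict[str, str], **changes: str) -> dict[str, str]:
    """Return a copy of spec with exactly one existing component replaced."""
    if len(changes) != 1:
        raise ValueError("change one variable per generation attempt")
    key, value = next(iter(changes.items()))
    if key not in spec:
        raise KeyError(key)
    return {**spec, key: value}


def changed_fields(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List the components whose text differs between two attempts."""
    return [key for key in old if old[key] != new.get(key)]


attempt_1 = {
    "subject": "five-story contemporary apartment building",
    "material": "stone facade",
    "lighting": "low sun from the southwest, long shadows",
    "camera": "wide-angle exterior shot from street level",
    "constraints": "",
}

# Lighting and camera worked, so only the material description is swapped.
attempt_2 = revise(
    attempt_1,
    material="rough-hewn travertine limestone panels, horizontal coursing with 20mm shadow gaps",
)

# The facade came back warped, so the next attempt adds only a constraint.
attempt_3 = revise(
    attempt_2,
    constraints="maintain strict vertical lines, no perspective distortion, no curved or warped walls",
)

print(changed_fields(attempt_1, attempt_2))
print(render(attempt_3))
```

Because every attempt differs from the previous one in a single component, a better or worse result can be attributed to that component alone.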
Realistic Expectations — What Veo 3 Gets Right and Where It Still Slips
Veo 3 is genuinely impressive on massing, lighting atmosphere, and broad material character. A rough concrete wall looks rough. A glass curtain wall reflects the sky. Golden hour light behaves like golden hour light. For early-stage concept visualization and stakeholder presentations, the output quality is more than enough.
Where it still struggles: precise dimensional accuracy, fine repetitive details (brick bonds, tile grids, perforated screens), complex structural connections, and interior furniture layouts with multiple specific pieces. If a client needs to verify that the facade modules are dimensionally coordinated with the structural grid, they still need a human modeler. If they just need to feel what the building might look like on a Tuesday afternoon in autumn — Veo 3 handles that faster than any traditional rendering pipeline.
Note 💡
Save every prompt that produces a result you like in a separate document. Building a personal prompt library for architectural visualization is genuinely valuable — a lighting formula that works for residential glazing will work again next project.
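A prompt library does not need more than a JSON file with names and tags. The sketch below uses only the Python standard library. The file name, the tag scheme, and the helper functions are assumptions for illustration, and the demo writes to a temporary directory so it leaves nothing behind.

```python
import json
import tempfile
from pathlib import Path


def load_library(path: Path) -> dict:
    """Read the library file, or start empty if it does not exist yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def save_prompt(path: Path, name: str, prompt: str, tags: list[str]) -> None:
    """Add or overwrite one named prompt and write the library back to disk."""
    library = load_library(path)
    library[name] = {"prompt": prompt, "tags": sorted(tags)}
    path.write_text(json.dumps(library, indent=2), encoding="utf-8")


def find_by_tag(path: Path, tag: str) -> list[str]:
    """Return the names of all saved prompts carrying the given tag."""
    return sorted(
        name for name, entry in load_library(path).items() if tag in entry["tags"]
    )


# Demo in a temporary directory; point library_file at a real path for actual use.
with tempfile.TemporaryDirectory() as tmp:
    library_file = Path(tmp) / "prompt_library.json"
    save_prompt(
        library_file,
        "blue-hour-exterior",
        "Dusk blue hour lighting, deep indigo sky, interior warm artificial "
        "lighting glowing from every window",
        tags=["lighting", "exterior"],
    )
    save_prompt(
        library_file,
        "corten-facade",
        "Facade material: dark weathered Corten steel panels with natural rust patina",
        tags=["material", "exterior"],
    )
    lighting_names = find_by_tag(library_file, "lighting")
    exterior_names = find_by_tag(library_file, "exterior")

print(lighting_names)
print(exterior_names)
```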
Where This Fits in a Real Workflow
The architects and designers getting the most out of Veo 3 right now are using it at the concept and schematic design stages — not for final client deliverables. The workflow looks like this: quick sketch on paper, photograph it, upload to Veo 3 as image input, generate three to four lighting and material variants, pick the most promising direction, then invest the real modeling time only in the design direction the client responds to. That’s hours saved in the earliest, most iterative phase of the project.
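The "three to four lighting and material variants" step in that workflow amounts to a small grid of prompts. As a rough illustration, the Python sketch below pairs every lighting option with every material option. The labels, the shortened descriptions, and the `variant_matrix` helper are assumptions, and the output is only the prompt text to submit alongside the sketch upload.

```python
from itertools import product

BASE = "Five-story contemporary residential apartment building on a tree-lined urban street"

LIGHTING = {
    "golden_hour": "late afternoon golden hour sunlight casting long shadows",
    "blue_hour": "dusk blue hour lighting, warm interior light glowing from every window",
}

MATERIALS = {
    "concrete": "hand-finished board-formed concrete, dark grey tone",
    "corten": "dark weathered Corten steel panels with natural rust patina",
}


def variant_matrix(
    base: str, lighting: dict[str, str], materials: dict[str, str]
) -> list[tuple[str, str]]:
    """Return (label, prompt) pairs for every lighting and material pairing."""
    return [
        (
            f"{light_name}+{material_name}",
            f"{base}, facade material: {material_text}, {light_text}, "
            "photorealistic exterior architectural shot",
        )
        for (light_name, light_text), (material_name, material_text) in product(
            lighting.items(), materials.items()
        )
    ]


for label, prompt in variant_matrix(BASE, LIGHTING, MATERIALS):
    print(f"[{label}] {prompt}")
```

Two options on each axis already give four variants, which matches the batch size suggested above. Adding a third option to either axis grows the batch quickly, so it is worth pruning before generating.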
Traditional rendering software isn’t going anywhere for final presentation work. But burning two days on a V-Ray render of a concept that a client rejects in the first review meeting is a waste Veo 3 can help you avoid. Use the right tool for the right stage. That’s the actual shift here, and it’s a meaningful one.


