IJCAI 2025

DeepShade: Enable Shade Simulation by Text-conditioned Image Generation

Enabling real-time urban shade prediction through AI-powered diffusion models to combat extreme heat and save lives

Longchao Da · Xiangrui Liu · Mithun Shivakoti · Thirulogasankar Pranav Kutralingam · Yezhou Yang · Hua Wei

Arizona State University

Overview

Abstract

Heatwaves pose a significant threat to public health, especially as global warming intensifies. However, current routing systems (e.g., online maps) fail to incorporate shade information, both because shade is difficult to estimate directly from noisy satellite imagery and because training data for generative models is scarce. In this paper, we address these challenges through two main contributions. First, we build an extensive dataset covering diverse longitude-latitude regions, varying levels of building density, and different urban layouts. Leveraging Blender-based 3D simulations alongside building outlines, we capture building shadows under various solar zenith angles throughout the year and at different times of day. These simulated shadows are spatially aligned with satellite images of the same areas, providing a rich resource for learning shade patterns. Second, we propose DeepShade, a diffusion-based model designed to learn and synthesize shade variations over time. It emphasizes the nuance of edge features by jointly considering the RGB channels and a Canny edge layer, and incorporates contrastive learning to capture the temporal rules by which shade changes. Finally, by conditioning on textual descriptions of known conditions (e.g., time of day, solar angles), our framework provides improved performance in generating shade images.

Live Demo

Shadow Movement Throughout the Day

Watch real-time shade predictions as DeepShade simulates shadow patterns through different times of the day

178K+

Annual Heat Deaths

12

Cities Tested

6

Countries Covered

70/30

Train/Test Split

Data Collection

Comprehensive Global Dataset

A rigorous pipeline covering geographical diversity, urban layout variability, and 3D shade simulation

Figure 1: Dataset Creation Pipeline

Geographical Diversity

Covers 12 cities across Asia, the Americas, Europe, and Africa, including Beijing, Phoenix, São Paulo, Madrid, Cairo, Mumbai, Xi'an, Tempe, Brasília, Seville, Aswan, and Jaipur.

Urban Layout Variability

Captures diverse configurations from dense high-rise areas to sparse flat regions, ensuring model robustness across different building densities and architectural styles.

Simulation Pipeline

Uses Blender-based 3D rendering with OpenStreetMap (OSM) building data, aligned with Google Maps tile level 13, capturing shade at various solar angles and times throughout the year.
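To make the tile alignment concrete, here is a minimal sketch of how a latitude/longitude pair maps to a tile index, assuming "tile level 13" refers to zoom 13 in the standard Web Mercator XYZ tiling scheme. The function names are illustrative and are not taken from the DeepShade codebase.

```python
import math


def latlon_to_tile(lat_deg: float, lon_deg: float, zoom: int = 13) -> tuple[int, int]:
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_width_m(lat_deg: float, zoom: int = 13) -> float:
    """Approximate ground width of one tile in metres at a given latitude."""
    equator_m = 40_075_016.686  # WGS84 equatorial circumference
    return equator_m * math.cos(math.radians(lat_deg)) / 2 ** zoom


# Latitude 0, longitude 0 falls on the centre tile (4096, 4096) at zoom 13.
center_tile = latlon_to_tile(0.0, 0.0)
# Approximate coordinates of downtown Phoenix, used only as an example.
phoenix_tile = latlon_to_tile(33.4484, -112.0740)
```

Under this assumption a zoom-13 tile spans roughly 4.9 km at the equator and about 4 km at the latitude of Phoenix, which gives a sense of the area each simulated shade image would cover.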

Innovation

Edge-Enhanced Diffusion Architecture

DeepShade combines cutting-edge computer vision with temporal learning

Figure 2: DeepShade Architecture

Edge-Enhanced Conditioning

We concatenate the RGB building skeleton with Canny edge features to form a 4-channel input (R, G, B, Edge), enabling the model to capture subtle shade boundaries with precision.

x_cond = [x_sk_R, x_sk_G, x_sk_B, x_edge] ∈ ℝ^(H×W×4)
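The channel stacking above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the authors' code: `simple_edge_map` is a gradient-threshold stand-in for a real Canny detector (for example OpenCV's `cv2.Canny`), and the function names and the threshold value are hypothetical.

```python
import numpy as np


def simple_edge_map(rgb: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Binary edge map from gradient magnitude (a stand-in for Canny)."""
    gray = rgb.astype(np.float32).mean(axis=-1) / 255.0
    gy, gx = np.gradient(gray)  # derivatives along rows, then columns
    return (np.hypot(gx, gy) > threshold).astype(np.float32)


def build_condition(skeleton_rgb: np.ndarray, edge: np.ndarray) -> np.ndarray:
    """Stack R, G, B and the edge layer into an H x W x 4 array in [0, 1]."""
    rgb = skeleton_rgb.astype(np.float32) / 255.0
    return np.concatenate([rgb, edge[..., None].astype(np.float32)], axis=-1)


# Toy 512x512 skeleton: one white building footprint on a black background.
skeleton = np.zeros((512, 512, 3), dtype=np.uint8)
skeleton[200:300, 150:350] = 255
x_cond = build_condition(skeleton, simple_edge_map(skeleton))  # shape (512, 512, 4)
```

The fourth channel is nonzero only along the footprint outline, which is the boundary information the edge layer is meant to expose to the model.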

Contrastive Learning

Our contrastive framework enforces temporal consistency by distinguishing between shade patterns at different times, using InfoNCE loss to learn realistic shade evolution.

L_total = L_ControlNet + 0.1×L_contrastive
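As a rough illustration of the objective above, the sketch below computes an InfoNCE loss for a single anchor and combines it with a placeholder ControlNet loss using the 0.1 weight. The use of cosine similarity, the embedding shapes, and the function names are assumptions; the page does not specify which features are compared.

```python
import numpy as np


def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor with one positive and K negatives.

    anchor, positive: arrays of shape (D,); negatives: array of shape (K, D).
    """
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    a, p, n = normalize(anchor), normalize(positive), normalize(negatives)
    # Cosine similarities scaled by temperature; the positive sits at index 0.
    logits = np.concatenate([[a @ p], n @ a]) / tau
    logits -= logits.max()  # stabilise the softmax
    return float(-(logits[0] - np.log(np.exp(logits).sum())))


def total_loss(l_controlnet, l_contrastive, weight=0.1):
    """L_total = L_ControlNet + weight * L_contrastive."""
    return l_controlnet + weight * l_contrastive


anchor = np.array([1.0, 0.0])
aligned = info_nce(anchor, np.array([1.0, 0.0]), np.array([[-1.0, 0.0], [0.0, 1.0]]))
misaligned = info_nce(anchor, np.array([-1.0, 0.0]), np.array([[1.0, 0.0]]))
```

The loss is close to zero when the positive is the most similar candidate (`aligned`) and large when a negative is more similar than the positive (`misaligned`). A small temperature such as τ = 0.1 sharpens this contrast.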

Performance

State-of-the-Art Results

DeepShade outperforms all baseline methods across 12 cities worldwide

Training Convergence

Figure 3: Training Loss Curves

Ablation Study

Component Contributions

Evaluating the importance of each component in DeepShade (tested on Tempe dataset)

Model Configuration	SSIM ↑	mIoU ↑	B-IoU ↑	MSE ↓	LPIPS ↓
Backbone Model (Direct)	0.4252 ± 0.01	0.0358 ± 0.00	0.0213 ± 0.00	41.2666 ± 1.65	0.7967 ± 0.00
Vanilla ControlNet	0.9690 ± 0.04	0.2736 ± 0.13	0.0812 ± 0.05	18.3388 ± 3.37	0.3304 ± 0.03
+ Edge Conditioning	0.9684 ± 0.01	0.2898 ± 0.04	0.1040 ± 0.01	18.6686 ± 0.70	0.3358 ± 0.01
DeepShade (Full Model)	0.9692 ± 0.04	0.2903 ± 0.20	0.1240 ± 0.07	18.1721 ± 4.09	0.3024 ± 0.29
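For readers unfamiliar with the overlap metrics in the table, here is a minimal sketch of IoU and MSE on a toy pair of shade masks. The exact evaluation protocol (binarization threshold, pixel scale, and how B-IoU defines the boundary band) is not given on this page, so this is only an illustration of the basic definitions.

```python
import numpy as np


def iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection over union of two boolean shade masks."""
    pred_mask, gt_mask = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred_mask, gt_mask).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred_mask, gt_mask).sum() / union)


def mse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean squared error between two images of the same shape."""
    return float(np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2))


# Toy 4x4 masks: ground truth shades rows 0-1, the prediction shades rows 1-2.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[0:2] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3] = 1
toy_iou = iou(pred, gt)  # 4 shared pixels out of 12 in the union
toy_mse = mse(pred, gt)  # 8 of 16 pixels differ by 1
```

A mean IoU would average this score over test images (or classes), which is why a shadow shifted by a single row already loses two thirds of its score in the toy case.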

Key Features

Why DeepShade Works

🎯 Edge-Enhanced Conditioning

Combines RGB satellite imagery with Canny edge detection (4-channel input) to capture precise building boundaries and shade edges, ensuring sharp and accurate shadow generation.

⏱️ Temporal Consistency

Contrastive learning with InfoNCE loss enforces temporal rules: shadows from nearby time steps are more similar than those from distant ones, maintaining realistic shade progression throughout the day.

🌍 Global Generalization

Trained on diverse cities across Asia, the Americas, Europe, and Africa with varying building densities and layouts, the model generalizes to unseen locations with only a satellite image; no LiDAR is required.

Technical Details

Model Architecture & Training

Training Configuration

→Base model: ControlNet with Stable Diffusion backbone
→Input: 4-channel (RGB + Edge), 512×512 resolution
→Loss: L_total = L_ControlNet + 0.1×L_contrastive
→Temperature τ = 0.1 for InfoNCE loss
→Dataset split: 70% train, 30% test
→Training: 50 epochs, converges 3× faster than vanilla ControlNet

Data Pipeline

→Blender-based 3D simulation with OSM building data
→Aligned with Google Maps tile level 13
→Multiple solar angles and times captured per location
→Ground truth: x_gt = x_shade - x_skeleton - I(x_shade ≤ α)
→Positive pairs: same location, 1-hour temporal gap
→Contrastive buffer with balanced pos/neg sampling
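The pairing rule in the list above can be sketched as follows. This assumes a positive pair means two samples from the same location exactly one hour apart and that every other pair counts as a negative; the page does not spell out the negative definition or the buffer mechanics, so `label_pairs` and `balanced_batch` are hypothetical helpers.

```python
import random
from itertools import combinations


def label_pairs(samples, gap_hours=1):
    """Split all sample pairs into positives and negatives.

    samples: list of (location, hour) tuples.
    """
    positives, negatives = [], []
    for a, b in combinations(samples, 2):
        same_place = a[0] == b[0]
        if same_place and abs(a[1] - b[1]) == gap_hours:
            positives.append((a, b))
        else:
            negatives.append((a, b))
    return positives, negatives


def balanced_batch(positives, negatives, rng):
    """Draw equally many positive and negative pairs."""
    k = min(len(positives), len(negatives))
    return rng.sample(positives, k), rng.sample(negatives, k)


samples = [("Tempe", 9), ("Tempe", 10), ("Tempe", 14), ("Phoenix", 10)]
positives, negatives = label_pairs(samples)
pos_batch, neg_batch = balanced_batch(positives, negatives, random.Random(0))
```

In this toy set only the 9:00 and 10:00 Tempe samples form a positive pair, so negatives far outnumber positives. That imbalance is the reason a balanced sampling step is useful.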

Visual Results

Generated Shade Examples

DeepShade generates realistic shadows across diverse urban environments and times of day

Figure 4: Shade orientation changes across different times of the day
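The change in orientation follows from basic solar geometry. The sketch below gives the ground shadow of a single vertical edge on flat terrain, assuming the solar zenith and azimuth angles are already known. It only illustrates the geometry; it is not the Blender simulation used to build the dataset, and the function name is illustrative.

```python
import math


def shadow_vector(height_m: float, zenith_deg: float, azimuth_deg: float):
    """Ground shadow of a vertical edge as an (east, north) offset in metres.

    zenith_deg: solar zenith angle (0 means the sun is directly overhead).
    azimuth_deg: solar azimuth, measured clockwise from north.
    """
    if not 0.0 <= zenith_deg < 90.0:
        raise ValueError("sun must be above the horizon")
    length = height_m * math.tan(math.radians(zenith_deg))
    # The shadow points away from the sun, so rotate the azimuth by 180 degrees.
    away = math.radians(azimuth_deg + 180.0)
    return length * math.sin(away), length * math.cos(away)


# Sun due east at 45 degrees zenith: a 10 m edge casts a 10 m shadow to the west.
morning = shadow_vector(10.0, 45.0, 90.0)
# Sun due west at the same zenith: the shadow flips to the east.
afternoon = shadow_vector(10.0, 45.0, 270.0)
```

As the sun moves from east to west the shadow direction sweeps from west to east, and its length grows as the zenith angle increases toward sunrise and sunset.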

Impact

Why This Matters

🌡️ Public Health Crisis

Over 178,700 people die annually from extreme heat. DeepShade enables shade-aware navigation to reduce heat exposure for vulnerable populations during heatwaves.

🏙️ Urban Planning Tool

City planners can identify areas lacking shade coverage and optimize placement of cooling corridors, shade structures, and green infrastructure.

📱 Scalable Solution

Unlike LiDAR-based methods, DeepShade requires only satellite imagery, enabling city-wide deployment at low cost with real-time updates.

Real-World Impact

Shade-Aware Route Planning

Protecting people from extreme heat through intelligent navigation

Figure 5: Route Planning Demonstration

Real-World Application

DeepShade has been integrated into a prototype routing system designed for deployment in urban environments prone to extreme heat. By combining real-time solar geometry simulation with AI-driven shade inference, the system provides safe and comfortable navigation options for pedestrians, cyclists, and drivers. This demonstrates DeepShade’s potential as a public-facing, data-driven tool for heat mitigation and sustainable urban design.

Input

Real-time GPS coordinates, date, and time

Output

Safe, shade-optimized navigation routes with interactive visual overlays
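One simple way to turn predicted shade into routes is to penalize sunny street segments in a shortest-path search. The sketch below is a hypothetical illustration using Dijkstra's algorithm; the cost function, the `sun_penalty` parameter, and the graph format are assumptions and do not describe the prototype's actual routing logic.

```python
import heapq


def shade_aware_route(graph, start, goal, sun_penalty=1.0):
    """Dijkstra over edges weighted by length and sun exposure.

    graph: {node: [(neighbor, length_m, shade_fraction), ...]}
    cost of an edge = length_m * (1 + sun_penalty * (1 - shade_fraction))
    """
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, length_m, shade in graph.get(node, []):
            new_cost = cost + length_m * (1.0 + sun_penalty * (1.0 - shade))
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in best:
        return None  # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy street graph: a sunny direct segment versus a longer, mostly shaded detour.
streets = {
    "A": [("B", 100.0, 0.0), ("C", 60.0, 0.9)],
    "C": [("B", 60.0, 0.9)],
}
shaded_route = shade_aware_route(streets, "A", "B", sun_penalty=1.0)
shortest_route = shade_aware_route(streets, "A", "B", sun_penalty=0.0)
```

With the penalty enabled the search prefers the shaded detour through C. With `sun_penalty=0.0` it falls back to the plain shortest path, so a single parameter trades distance against sun exposure.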

Public Health Impact

With over 178,700 annual deaths from extreme heat globally, shade-aware navigation can significantly reduce heat exposure for vulnerable populations including the elderly and outdoor workers.

🌡️

Heat Mitigation

Reduce direct sun exposure during peak heat hours

🗺️

Urban Planning

Identify areas needing artificial shade infrastructure

⚡

Real-Time Adaptation

Dynamic routing based on time of day and solar angles

Resources

Get Started

📄 Paper

Read the full IJCAI 2025 paper with technical details and comprehensive experiments.

arXiv PDF

💻 Code

Access the complete codebase, training scripts, and inference notebooks on GitHub.

GitHub Repo

🗂️ Dataset

Download the comprehensive shade dataset covering 12 cities across Asia, the Americas, Europe, and Africa.

HuggingFace

Team

Authors

Longchao Da* · Xiangrui Liu* · Mithun Shivakoti · Thirulogasankar Pranav Kutralingam · Yezhou Yang · Hua Wei

(* Equal contribution)

Arizona State University

Presented at the International Joint Conference on Artificial Intelligence (IJCAI) 2025

Citation

Use This Work

If you find DeepShade useful for your research, please cite our paper

@article{da2025deepshade,
  title={Deepshade: Enable shade simulation by text-conditioned image generation},
  author={Da, Longchao and Liu, Xiangrui and Shivakoti, Mithun and 
          Kutralingam, Thirulogasankar Pranav and Yang, Yezhou and Wei, Hua},
  journal={arXiv preprint arXiv:2507.12103},
  year={2025}
}