EndoGaussian-4D

Real-Time Deformable 4D Gaussian Splatting for Endoscopic Surgery

A research framework for reconstructing dynamic endoscopic scenes using deformable 3D Gaussian Splatting with HexPlane-encoded temporal deformation fields.

Architecture

G_t = G₀ + Δ_θ(t)

Component	Description
G₀	Canonical Gaussians {μ, q, s, α, SH} initialized via Holistic Gaussian Initialization (HGI)
Δ_θ	HexPlane encoder → MLP decoder → per-Gaussian deltas
HexPlane	6 feature planes (XY,XZ,YZ,XT,YT,ZT) × multi-resolution
Decoder	Shared backbone → 4 zero-initialized heads (Δμ, Δq, Δs, Δα)
Renderer	gsplat differentiable rasterization with absgrad

Target performance (EndoNeRF benchmark): 37.9 dB PSNR, 0.97 SSIM, 195 FPS rendering, 2 min training
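
To make the HexPlane row of the table concrete, here is a minimal NumPy sketch of a single-resolution lookup: each (x, y, z, t) query is bilinearly sampled on the six planes and the samples are fused. The fusion rule (elementwise product over all six planes), the normalized [0, 1] coordinates, and the names `bilinear` and `hexplane_features` are assumptions for illustration; the encoder in `models/trainer.py` may differ, and it is multi-resolution.

```python
import numpy as np

# Axis pairs for the six planes: XY, XZ, YZ, XT, YT, ZT (columns of an [N, 4] xyzt array).
PLANE_AXES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]


def bilinear(plane, u, v):
    """Sample plane[C, H, W] at normalized coordinates u (width) and v (height) in [0, 1]."""
    _, H, W = plane.shape
    x = np.clip(u, 0.0, 1.0) * (W - 1)
    y = np.clip(v, 0.0, 1.0) * (H - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = plane[:, y0, x0] * (1 - wx) + plane[:, y0, x1] * wx
    bot = plane[:, y1, x0] * (1 - wx) + plane[:, y1, x1] * wx
    return (top * (1 - wy) + bot * wy).T  # [N, C]


def hexplane_features(planes, xyzt):
    """Fuse the six plane samples by elementwise product (assumed fusion rule)."""
    feat = np.ones((xyzt.shape[0], planes[0].shape[0]))
    for plane, (a, b) in zip(planes, PLANE_AXES):
        feat *= bilinear(plane, xyzt[:, a], xyzt[:, b])
    return feat


rng = np.random.default_rng(0)
planes = [rng.normal(size=(8, 16, 16)) for _ in PLANE_AXES]  # 8 channels, 16x16 grid
xyzt = rng.uniform(size=(5, 4))                              # 5 Gaussian centers plus time
print(hexplane_features(planes, xyzt).shape)                 # (5, 8)
```

The resulting per-Gaussian feature vector is what the decoder would consume to predict the deltas.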

Project Structure

endogaussian4d/
├── Dockerfile                    # CUDA 12.1 + PyTorch 2.3 + gsplat + COLMAP
├── requirements.txt              # Pip dependencies
├── __init__.py
│
├── models/
│   ├── __init__.py
│   ├── trainer.py               # EndoGaussianTrainer + HexPlane + Decoder
│   └── metrics.py               # PSNR, SSIM, LPIPS, D-SSIM, depth metrics
│
├── scripts/
│   ├── download_datasets.py     # Dataset registry + download + unified loader
│   ├── extract_poses.py         # Multi-stage SfM (COLMAP + Depth-Anything + PnP)
│   └── fast_fail_experiment.py  # Static 3DGS baseline + artifact taxonomy
│
├── configs/
│   ├── endonerf.yaml            # Default config for EndoNeRF dataset
│   ├── c3vd.yaml                # Config for C3VD dataset
│   └── fast_fail.yaml           # Config for fast-fail experiment
│
└── docs/
    ├── formalization.tex        # LaTeX mathematical formalization (CVPR-style)
    └── formalization.md         # Markdown version of the math

Week 1 Sprint Roadmap

Phase 1 (Days 1-2): Data Pipeline ✅

Dataset registry with 7 endoscopic datasets (EndoNeRF, EndoSLAM, C3VD, SCARED, StereoMIS, D4D, Hamlyn)
Unified EndoDataset loader supporting LLFF/C3VD/EndoSLAM formats
DatasetOrganizer for download, validation, and manifest generation
Multi-stage SfM pipeline (COLMAP sequential → exhaustive → Depth-Anything+PnP)
Holistic Gaussian Initialization via depth backprojection
Tool mask handling throughout the pipeline
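
The depth backprojection behind Holistic Gaussian Initialization can be sketched as follows. This is a minimal pinhole-camera example, assuming OpenCV-style camera axes (x right, y down, z forward), a camera-to-world matrix, and a boolean mask where `True` marks tissue; the function name `backproject_depth` is hypothetical and not taken from the repository.

```python
import numpy as np


def backproject_depth(depth, K, c2w, tissue_mask=None):
    """Lift a depth map [H, W] to world-space points [N, 3], skipping masked-out pixels."""
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]                 # pixel rows (v) and columns (u)
    valid = depth > 0                         # drop pixels without a depth value
    if tissue_mask is not None:
        valid = valid & tissue_mask.astype(bool)
    z = depth[valid]
    x = (u[valid] - K[0, 2]) / K[0, 0] * z    # (u - cx) / fx * z
    y = (v[valid] - K[1, 2]) / K[1, 1] * z    # (v - cy) / fy * z
    pts_cam = np.stack([x, y, z], axis=1)
    return pts_cam @ c2w[:3, :3].T + c2w[:3, 3]


K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 1.0], [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 2.0)
tissue = np.ones((3, 4), dtype=bool)
tissue[0] = False                             # pretend the top row is covered by a tool
print(backproject_depth(depth, K, np.eye(4), tissue).shape)  # (8, 3)
```

Excluding tool pixels at this stage keeps instrument surfaces out of the canonical Gaussians.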

Phase 2 (Days 3-4): Baseline & Fast-Fail ✅

Full EndoGaussianTrainer with HexPlane deformation
gsplat-based rendering with absgrad densification
Complete metrics suite (PSNR, SSIM, LPIPS, D-SSIM loss, 7 depth metrics)
Fast-fail experiment: static 3DGS on dynamic tissue
Artifact taxonomy (6 failure modes with physics explanations)
Automated artifact detection and structured reporting
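
As an illustration of the metrics suite, the sketch below computes PSNR restricted to tissue pixels. Whether `models/metrics.py` applies the tool mask in exactly this way is an assumption; `masked_psnr` is a hypothetical name.

```python
import numpy as np


def masked_psnr(pred, target, tissue_mask=None, max_val=1.0):
    """PSNR in dB over pixels where tissue_mask is True (all pixels if no mask is given)."""
    err = (pred - target) ** 2
    if tissue_mask is not None:
        err = err[tissue_mask.astype(bool)]   # [H, W, 3] indexed by [H, W] gives [N, 3]
    mse = float(err.mean())
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


target = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)
tissue = np.ones((4, 4), dtype=bool)
tissue[:, :2] = False                         # left half hidden by an instrument
print(round(masked_psnr(pred, target, tissue), 2))  # 20.0
```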

Phase 3 (Day 5): Mathematical Formalization ✅

LaTeX formalization of deformation field G_t = G₀ + Δ_θ(t)
HexPlane encoding with memory complexity analysis
Full loss derivation with physics motivation
Tool occlusion handling (3-stage masking)
Training schedule and adaptive density control
Loss → artifact mapping (which loss term fixes which artifact)
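
The memory argument for HexPlane can be checked with simple arithmetic: six 2D planes grow as O(R² + R·T), while a dense 4D grid grows as O(R³·T). The resolutions and channel count below are illustrative assumptions, not the values used in `docs/formalization.tex`.

```python
def hexplane_params(res, t_res, channels):
    """Parameter count of six planes: three spatial (R x R) and three space-time (R x T)."""
    spatial = 3 * res * res * channels
    space_time = 3 * res * t_res * channels
    return spatial + space_time


def dense_grid_params(res, t_res, channels):
    """Parameter count of a dense 4D feature grid (R x R x R x T)."""
    return res ** 3 * t_res * channels


res, t_res, channels = 64, 32, 16             # illustrative values only
print(hexplane_params(res, t_res, channels))    # 294912
print(dense_grid_params(res, t_res, channels))  # 134217728
```

At these settings the factorized representation needs roughly 455 times fewer parameters than the dense grid.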

Quick Start

1. Environment Setup

# Using Docker (recommended)
docker build -t endogaussian4d .
docker run --gpus all -it endogaussian4d bash

# Or using pip
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

2. Download Data

# List available datasets
python scripts/download_datasets.py --list

# Download EndoNeRF (smallest, auto-downloadable, standard benchmark)
python scripts/download_datasets.py --datasets endonerf --data-root ./data

# Validate installation
python scripts/download_datasets.py --validate --data-root ./data

3. Extract Poses (if needed)

# Auto mode: tries COLMAP first, falls back to Depth-Anything + PnP
python scripts/extract_poses.py --input ./data/endonerf/cutting --mode auto

# Force learning-based pipeline (for texture-less tissue)
python scripts/extract_poses.py --input ./data/endonerf/cutting --mode depth_pnp --hgi
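
The auto mode's fallback order can be pictured as a simple chain that tries each stage until one succeeds. The sketch below uses stub stages in place of real COLMAP or Depth-Anything calls; the success criterion (every frame registered) and all function names are assumptions, not the logic in `scripts/extract_poses.py`.

```python
def run_with_fallback(stages, frames):
    """Try each (name, stage) pair in order; return the first complete set of poses."""
    errors = []
    for name, stage in stages:
        try:
            poses = stage(frames)
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
            continue
        if poses is not None and len(poses) == len(frames):
            return name, poses
        registered = 0 if poses is None else len(poses)
        errors.append(f"{name}: registered {registered}/{len(frames)} frames")
    raise RuntimeError("all pose stages failed: " + "; ".join(errors))


def colmap_sequential(frames):
    raise RuntimeError("too few feature matches")  # stub for a failed COLMAP run


def colmap_exhaustive(frames):
    return ["pose"] * (len(frames) // 2)           # stub: only half the frames registered


def depth_pnp(frames):
    return ["pose"] * len(frames)                  # stub: every frame gets a pose


stages = [
    ("colmap_sequential", colmap_sequential),
    ("colmap_exhaustive", colmap_exhaustive),
    ("depth_pnp", depth_pnp),
]
name, poses = run_with_fallback(stages, frames=list(range(10)))
print(name, len(poses))  # depth_pnp 10
```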

4. Run Fast-Fail Experiment

# Benchmark static 3DGS on dynamic tissue (documents failure modes)
python scripts/fast_fail_experiment.py \
    --data ./data/endonerf/cutting \
    --output ./experiments/fast_fail \
    --max-iters 1000

5. Train EndoGaussian-4D

from models.trainer import EndoGaussianTrainer, EndoGaussianConfig
from scripts.download_datasets import EndoDataset

# Load dataset
dataset = EndoDataset("./data/endonerf/cutting")
points, colors = dataset.get_point_cloud(subsample=0.001)

# Configure and train
config = EndoGaussianConfig(
    total_iters=3000,
    warmup_iters=1000,
    output_dir="./output",
    experiment_name="cutting_v1",
)

trainer = EndoGaussianTrainer(config)
trainer.initialize_from_point_cloud(points, colors, cameras=[])
# Training requires a GPU and a torch Dataset wrapper around `dataset`:
# trainer.train(dataset)

Dataset Summary

Dataset	Sequences	Resolution	Depth	Poses	Access
EndoNeRF	2	320×256	✓ GT	✓	Auto (Dropbox)
C3VD	22	675×540	✓ GT	✓	Request form
SCARED	7	1280×1024	✓ GT	✓	Registration
StereoMIS	11	640×480	Stereo	✗	Auto (Zenodo)
EndoSLAM	35	640×480	Synthetic	✓	Auto (GitHub)
D4D	98	640×480	✓ GT	✓	Auto (DOI)
Hamlyn	20	360×288	✗	✗	Registration

Recommended for initial experiments: EndoNeRF (smallest, auto-download, standard benchmark, includes tool masks)

Key Design Decisions

gsplat over diff-gaussian-rasterization: Modern API, absgrad support, maintained by nerfstudio team
HexPlane over pure MLP: 37.8 dB vs 34.8 dB PSNR and 6× faster training
Zero-initialized decoder heads: Identity deformation at init → stable training
Multi-stage SfM: COLMAP first (accurate), Depth+PnP fallback (robust)
Tool masking in 3 stages: Init exclusion + loss masking + densification exclusion
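
The effect of zero-initialized decoder heads can be shown in a few lines: with head weights and biases at zero, every delta is zero, so G_t equals G₀ at the first iteration. This is a NumPy sketch under assumed layer sizes and head output dimensions (3 for Δμ, 4 for Δq, 3 for Δs, 1 for Δα); the class name `DeformationDecoder` is hypothetical.

```python
import numpy as np


class DeformationDecoder:
    """Shared ReLU backbone followed by four linear heads whose weights start at zero."""

    HEAD_DIMS = {"d_mu": 3, "d_q": 4, "d_s": 3, "d_alpha": 1}  # assumed output sizes

    def __init__(self, feat_dim=8, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.1, size=(feat_dim, hidden))  # backbone: random init
        self.b = np.zeros(hidden)
        self.heads = {
            name: (np.zeros((hidden, dim)), np.zeros(dim))       # heads: zero init
            for name, dim in self.HEAD_DIMS.items()
        }

    def __call__(self, feats):
        h = np.maximum(feats @ self.w + self.b, 0.0)
        return {name: h @ w + b for name, (w, b) in self.heads.items()}


decoder = DeformationDecoder()
feats = np.random.default_rng(1).normal(size=(5, 8))
deltas = decoder(feats)

mu_0 = np.random.default_rng(2).normal(size=(5, 3))  # canonical centers
mu_t = mu_0 + deltas["d_mu"]                         # G_t = G_0 + delta
print(np.array_equal(mu_t, mu_0))                    # True
```

Training therefore starts from the static reconstruction and only departs from it as the heads receive gradient updates.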

Loss Function Summary

Term	Weight	Purpose	Artifact Fixed
L1	1-λ₁=0.8	Pixel reconstruction	General quality
D-SSIM	λ₁=0.2	Structural similarity	Specular artifacts
Depth (SI)	λ₂=0.1	Geometric accuracy	Floaters
Smooth	λ₃=0.01	Temporal coherence	Ghosting, tearing
TV	λ₄=0.001	HexPlane regularization	Scale bloat
Tool mask	—	Exclude instruments	Tool smearing
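
Read as a formula, the table corresponds to a weighted sum of the five terms. The sketch below combines precomputed scalar losses with the listed weights and adds one possible total-variation regularizer for a feature plane; the squared-difference form of TV and the function names are assumptions, not the repository's definitions.

```python
import numpy as np

# Weights from the table above: lambda_1 through lambda_4.
LAMBDA_DSSIM, LAMBDA_DEPTH, LAMBDA_SMOOTH, LAMBDA_TV = 0.2, 0.1, 0.01, 0.001


def total_loss(l1, dssim, depth, smooth, tv):
    """Weighted sum of the loss terms; each argument is an already-computed scalar."""
    return (
        (1.0 - LAMBDA_DSSIM) * l1
        + LAMBDA_DSSIM * dssim
        + LAMBDA_DEPTH * depth
        + LAMBDA_SMOOTH * smooth
        + LAMBDA_TV * tv
    )


def plane_tv(plane):
    """Total variation of one feature plane [C, H, W]: mean squared neighbor difference."""
    dh = plane[:, 1:, :] - plane[:, :-1, :]
    dw = plane[:, :, 1:] - plane[:, :, :-1]
    return float((dh ** 2).mean() + (dw ** 2).mean())


ramp = np.tile(np.arange(4.0), (1, 3, 1))   # [1, 3, 4], increases by 1 along the width
print(plane_tv(ramp))                        # 1.0
```

The tool mask has no weight of its own because it acts on the pixel-wise terms by excluding instrument pixels before they are averaged.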

References

Kerbl et al. "3D Gaussian Splatting." SIGGRAPH 2023.
Liu et al. "EndoGaussian." arXiv:2401.12561, 2024.
Huang et al. "Endo-4DGS." MICCAI 2024.
Cao & Johnson. "HexPlane." CVPR 2023.
Wang et al. "EndoNeRF." MICCAI 2022.
Ye et al. "gsplat." arXiv:2409.06765, 2024.

License

Research use only. See individual dataset licenses for data terms.
