Robotic Dog Training in a Reconstructed Temple

From locomotion demo to spatial testbed

A stair test in an empty scene shows the locomotion but leaves out the architectural context that makes the behavior spatial. A reconstructed temple scene supplies that context; collision and stair-contact questions still need a separate proxy layer.

I split the project into three readable layers: the temple reconstruction for visual context, a simplified Isaac / Omniverse proxy for collision and stair interaction, and Isaac Lab / RobotLab replay clips for the robot motion. Seeing those layers together makes the reasoning inspectable while keeping each layer’s responsibility separate.

**Primary visual.** A robotic dog is replayed in Isaac Lab, climbing stairs inside a reconstructed temple scene. The visual splat gives spatial context; the proxy geometry and locomotion clips explain how the stair interaction is being read.

What is in the scene

Scene context

A captured temple environment gives the robot replay a recognizable architectural setting.

Physics proxy

A separate collision/stair layer carries the contact assumptions behind the visual scene.

Robot replay

Recorded Isaac Lab / RobotLab clips show the dog in controlled stair and temple views.

Current boundary

The current artifact covers simulation replay, scene assembly, and visual/proxy separation.

The direction came from a practical problem in presentation. A single polished replay hides too many assumptions, so I wanted the viewer to see the asset chain itself: visual reconstruction, OpenUSD / Isaac scene assembly, proxy geometry, and robot replay.

How the scene is assembled

The project is organized as a source-to-simulation chain. Each step has a different job.

01 Capture

Temple imagery is reconstructed into a spatial reference. In Isaac, it appears as a `ParticleField3DG`-style visual layer, separate from the simplified collision geometry.

02 Proxy

The robot needs collision geometry, stair dimensions, and stable physics. Those are handled as a simplified proxy layer, separate from the visual reconstruction.

03 Train

Separate locomotion experiments test Go2 stair climbing, rough-terrain control, and teacher/student policy behavior.

04 Replay

The robot is replayed in stair and temple views. Videos are encoded as silent H.264 MP4s.

05 Review

Hero imagery, viewport clips, and UI screenshots are kept separate so the viewer can see which layer they are looking at.
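The proxy step in the chain above can be sketched as plain geometry. This is a minimal sketch under stated assumptions, not the Isaac scene itself: it assumes a straight flight of identical steps and builds one axis-aligned box per step. `Box`, `stair_proxy`, and the example dimensions are hypothetical illustration values, not measurements of the temple stairs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned collision box: center and full size, in meters."""
    center: tuple[float, float, float]
    size: tuple[float, float, float]


def stair_proxy(steps: int, rise: float, run: float, width: float) -> list[Box]:
    """Build one solid box per step, each reaching from the ground to its tread.

    X runs up the stairs, Y runs across them, Z is up.
    """
    if steps < 1 or min(rise, run, width) <= 0:
        raise ValueError("steps must be >= 1 and all dimensions positive")
    boxes = []
    for i in range(steps):
        height = rise * (i + 1)  # tread height of step i above the ground
        boxes.append(Box(
            center=(run * (i + 0.5), 0.0, height / 2.0),
            size=(run, width, height),
        ))
    return boxes


# Hypothetical dimensions, chosen only for illustration.
for box in stair_proxy(steps=3, rise=0.15, run=0.30, width=1.2):
    print(box)
```

Because every box reaches down to the ground, the proxy is solid under each tread. That is a design choice for stable contact, and it is independent of how detailed the visual layer is.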

Locomotion notes

The locomotion side began with a community teacher policy, then moved into smaller checks around stair climbing, rough terrain, stop commands, yaw, and arc motion. I kept the training notes because they explain why the page shows both successful replay clips and unfinished control work.

The useful lesson was practical: for this robot setup, teacher warm-start, behavior cloning, and targeted DAgger (dataset aggregation) exposed the failure modes more clearly than broad random PPO attempts. Creep after stop commands, yaw on rough terrain, and high-speed drift on rough terrain became concrete observations that informed what still needs replay coverage.
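To make the terms concrete: behavior cloning fits a student to states labeled by a teacher, and DAgger keeps relabeling the states the student itself visits. The following is a toy one-dimensional sketch, not the Go2 training code. The teacher, the dynamics, and the linear student are all hypothetical stand-ins, and it uses the simplified variant in which the student alone drives every rollout after the warm start.

```python
from __future__ import annotations

import random


def teacher(x: float) -> float:
    """Hypothetical expert: push toward x = 0 with a saturated action."""
    return max(-1.0, min(1.0, -2.0 * x))


def fit(data: list[tuple[float, float]]) -> float:
    """Least-squares gain w for the student policy a = w * x."""
    den = sum(x * x for x, _ in data)
    return sum(x * a for x, a in data) / den if den > 0 else 0.0


def rollout(policy, x0: float, steps: int = 20) -> list[float]:
    """Return the states visited when `policy` drives x' = x + 0.1 * a."""
    states, x = [], x0
    for _ in range(steps):
        states.append(x)
        x += 0.1 * policy(x)
    return states


def dagger(iterations: int = 5, episodes: int = 10, seed: int = 0) -> tuple[float, int]:
    rng = random.Random(seed)
    # Behavior cloning warm start: states come from the teacher's own rollouts.
    data = [(x, teacher(x))
            for _ in range(episodes)
            for x in rollout(teacher, rng.uniform(-3.0, 3.0))]
    w = fit(data)
    for _ in range(iterations):
        # DAgger step: the student acts, the teacher labels the visited states.
        for _ in range(episodes):
            visited = rollout(lambda x: w * x, rng.uniform(-3.0, 3.0))
            data += [(x, teacher(x)) for x in visited]
        w = fit(data)  # refit on the aggregated dataset
    return w, len(data)


gain, samples = dagger()
print(f"student gain {gain:.3f} fitted on {samples} labeled states")
```

The point of the aggregation is that the training set comes to cover the states the student actually reaches, which is where its failure modes show up.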

Visual and replay material

The page uses three kinds of material: temple images for spatial context, replay videos for the robot motion, and one UI screenshot to show the simulator setup behind the cleaner views.

**Spatial context.** A wide view of the temple reconstruction with the robotic dog on the stairs. The robot is small in this frame, but the environment scale and stair layout are clear.

**Robot interaction view.** A closer side view of the temple stairs keeps the dog's leg pose readable while preserving the temple reconstruction.

Go2 stair test. A separate stair replay used to reason about step height and locomotion stability before placing the robot in the reconstructed temple context.

Nankunshen stair replay — oblique view. A cropped OBS replay showing the dog climbing the Nankunshen temple stairs, placed after the baseline stair test to show the same locomotion question inside the reconstructed scene.

Nankunshen stair replay — high-side view. A second OBS viewport crop emphasizing temple scale, stair geometry, and the dog’s placement inside the reconstructed Nankunshen environment.

**Simulator setup.** This image is intentionally less polished: it keeps the Isaac Lab interface, the temple's `ParticleField3DG` scene entry, and the physics settings visible.
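The replay clips are described as silent H.264 MP4s, some of them cropped from OBS captures. One way to produce such a file is with ffmpeg. The sketch below only builds the argument list; the function name, file names, and crop rectangle are hypothetical, and it assumes an ffmpeg build that includes the libx264 encoder.

```python
from __future__ import annotations


def silent_h264_args(src: str, dst: str,
                     crop: tuple[int, int, int, int] | None = None) -> list[str]:
    """Return an ffmpeg command that re-encodes `src` as a silent H.264 MP4.

    `crop` is an optional (width, height, x, y) rectangle in pixels.
    """
    args = ["ffmpeg", "-y", "-i", src]
    if crop is not None:
        w, h, x, y = crop
        args += ["-vf", f"crop={w}:{h}:{x}:{y}"]  # ffmpeg crop filter: w:h:x:y
    args += [
        "-an",                      # drop the audio stream
        "-c:v", "libx264",          # encode video as H.264
        "-pix_fmt", "yuv420p",      # widely supported pixel format
        "-movflags", "+faststart",  # move the index to the front for web playback
        dst,
    ]
    return args


# Hypothetical file names and crop rectangle.
print(" ".join(silent_h264_args("obs_capture.mkv", "stair_replay.mp4",
                                crop=(1280, 720, 320, 180))))
```

The returned list can be passed directly to `subprocess.run` on a machine where ffmpeg is installed.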

Why I kept the temple context

My background is as much in stage, light, spatial perception, and real-time visual systems as in software. That makes the temple context useful: stairs, scale, atmosphere, and cultural setting all affect how a viewer reads embodied motion.

The observation behind the page is simple: captured places become more useful when they can host behavior, constraints, and reviewable scene layers. Here the temple reconstruction carries visual memory, the proxy geometry carries simulator assumptions, and the robot replay tests how embodied motion reads inside that reconstructed place.

Next steps

The remaining rough spots point directly to the next work:

Better proxy documentation

Show the proxy stair/collision geometry beside the visual splat so reviewers can see what is physical and what is visual.

More scenario replays

Add rough yaw, rough arc, and stop-specialist clips so the failure modes are as visible as the success cases.

Cleaner OpenUSD package

Package the scene layers, robot replay, and simulator captures as a reproducible local demo rather than a set of screenshots.

Related context: the visual reconstruction work connects to the Nankunshen 3DGS capture, while the source-to-response framing connects to the source-aware digital twin case study.