LCDP-Sim: Language-Conditioned Diffusion Policy

Language-Conditioned Diffusion Policy for Robotic Control

*Independent Developer · Austin, Texas · Dec. 2025 – Present*

GitHub Repository: https://github.com/16yunH

Overview

Developed an end-to-end Vision-Language-Action (VLA) system based on Diffusion Policy that maps RGB images and natural language instructions to robotic control signals.
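As a sketch of the system's input/output contract only: the encoder and diffusion-head names below are hypothetical stand-ins (the project's actual modules are CLIP, a vision encoder, and a U-Net diffusion head), and the stubs exist just to make the data flow concrete.

```python
import numpy as np

def lcdp_act(image, instruction, encode_text, encode_image, denoise):
    """Hypothetical end-to-end call: encode both modalities into one
    conditioning vector, then run the diffusion head to produce a
    16-step action chunk."""
    cond = np.concatenate([encode_image(image), encode_text(instruction)])
    return denoise(cond)  # shape: (16, action_dim)

# Toy stubs standing in for CLIP / vision encoder / diffusion head.
txt = lambda s: np.full(8, float(len(s)))       # fake text embedding
img = lambda im: im.mean(axis=(0, 1))           # (3,) channel means
head = lambda c: np.tile(c[:7], (16, 1))        # fake 16 x 7-DoF chunk
chunk = lcdp_act(np.zeros((64, 64, 3)), "pick up the red cube", txt, img, head)
```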

Key Features

  • Vision-Language-Action (VLA): Maps RGB images and natural language instructions directly to robotic control signals.
  • Diffusion Policy: Models the conditional action distribution with a diffusion model, which captures multimodal demonstration data better than direct regression.
  • Action Chunking: Predicts 16-step trajectories to mitigate error accumulation and enhance motion smoothness in long-horizon tasks.
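The action-chunking scheme above can be sketched as receding-horizon execution: predict a 16-step chunk, execute a prefix of it, then replan from the new observation. The toy linear dynamics and `exec_len` split below are illustrative assumptions, not details taken from the project.

```python
import numpy as np

def rollout_with_chunking(policy, obs, horizon=64, chunk_len=16, exec_len=8):
    """Receding-horizon execution: predict a chunk_len-step action chunk,
    execute only the first exec_len actions open-loop, then replan."""
    executed = []
    while len(executed) < horizon:
        chunk = policy(obs)                 # (chunk_len, action_dim)
        assert chunk.shape[0] == chunk_len
        for a in chunk[:exec_len]:
            executed.append(a)
            obs = obs + a                   # toy dynamics stand-in
            if len(executed) >= horizon:
                break
    return np.stack(executed)

# Toy policy: every chunk drifts the state toward the origin.
toy_policy = lambda obs: np.tile(-0.1 * obs, (16, 1))
actions = rollout_with_chunking(toy_policy, np.ones((1, 2)))
```

Executing only a prefix of each chunk keeps the smoothness benefit of chunked prediction while limiting how long the policy runs open-loop.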

Technical Implementation

  • CLIP Text Encoder: Integrated for semantic understanding of language instructions.
  • U-Net Denoiser: Employed a U-Net denoising network trained as a DDPM, with DDIM sampling for fast, high-fidelity action generation.
  • Simulation: Built a complete pipeline for data collection, distributed training, and closed-loop evaluation in the ManiSkill2 simulator.
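A minimal deterministic DDIM sampler (eta = 0) illustrates the sampling side of the DDPM/DDIM setup. The linear beta schedule and step counts here are generic assumptions, and a zero-noise lambda stands in for the conditional U-Net, so the output is just rescaled initial noise rather than a real action chunk.

```python
import numpy as np

def ddim_sample(eps_model, shape, n_train_steps=100, n_sample_steps=10, seed=0):
    """Deterministic DDIM sampling over a linear beta schedule."""
    betas = np.linspace(1e-4, 0.02, n_train_steps)
    alphas_bar = np.cumprod(1.0 - betas)
    steps = np.linspace(n_train_steps - 1, 0, n_sample_steps).astype(int)

    x = np.random.default_rng(seed).standard_normal(shape)
    for i, t in enumerate(steps):
        a_t = alphas_bar[t]
        a_prev = alphas_bar[steps[i + 1]] if i + 1 < len(steps) else 1.0
        eps = eps_model(x, t)
        # Predict the clean sample, then step to the previous noise level.
        x0_hat = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps
    return x

# Zero-noise predictor standing in for the conditional U-Net;
# samples a 16-step chunk of 7-DoF actions.
actions = ddim_sample(lambda x, t: np.zeros_like(x), (16, 7))
```

Using DDIM at inference lets the policy denoise in ~10 steps instead of the full training schedule, which matters for closed-loop control rates.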