Safe Vision Language Action Models via Barrier Enhanced Flow Matching

The XXXXXXXX University
Submitted to IEEE Robotics and Automation Letters (RA-L)
Barrier Enhanced VLA Architecture

Our framework modifies the Flow Matching denoising process within the model to inherently generate safe trajectories using a smooth Log-Sum-Exponential aggregate barrier.

Abstract

This article introduces a modular inference framework that integrates Vision-Language-Action (VLA) foundation models with formal Control Barrier Function (CBF) safety guarantees. Unlike existing methods that apply external safety filters to a model's final output, our approach modifies the Flow Matching denoising process within the model to inherently generate safe trajectories. By utilizing a smooth Log-Sum-Exponential aggregate barrier, we enforce safety at every step of an action chunk without compromising the semantic intent of the generative model. The framework eliminates the need for safety-specific datasets or costly retraining. Hardware experiments demonstrate that our framework achieves reliable safety without degrading the success rate of the baseline model.

Barrier Demonstration

Wall Barrier

Restricting the end-effector from passing a 3D plane to avoid table collisions.
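
A wall constraint of this kind is commonly written as an affine barrier on the end-effector position. The sketch below is a minimal illustration under that assumption; the function names, the plane normal, and the table height are hypothetical and are not taken from the paper.

```python
import numpy as np

def wall_barrier(p, normal, offset):
    """Affine barrier h(p) = n^T p - d, with n the unit normal of the plane.

    h(p) >= 0 means the point p lies on the safe side of the plane.
    """
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    return float(n @ np.asarray(p, dtype=float) - offset)

def wall_barrier_grad(normal):
    """Gradient of h with respect to p; it is constant for a plane."""
    n = np.asarray(normal, dtype=float)
    return n / np.linalg.norm(n)

# Hypothetical table surface at z = 0.02 m with an upward-pointing normal.
table_normal, table_height = [0.0, 0.0, 1.0], 0.02
h_above = wall_barrier([0.10, 0.00, 0.05], table_normal, table_height)  # positive: safe
h_below = wall_barrier([0.10, 0.00, 0.00], table_normal, table_height)  # negative: unsafe
```

Because the gradient is constant, a barrier of this form leads to a linear constraint on the action update, which keeps the resulting QP simple.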

Spherical Barrier

Maintaining safety by keeping the robot frame outside a defined spherical region.
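
A keep-out sphere is often encoded with a squared-distance barrier. The following is a minimal sketch under that assumption; the names, the sphere center, and the radius are hypothetical values chosen for illustration only.

```python
import numpy as np

def sphere_barrier(p, center, radius):
    """h(p) = ||p - c||^2 - r^2; h(p) >= 0 means p is outside the sphere."""
    d = np.asarray(p, dtype=float) - np.asarray(center, dtype=float)
    return float(d @ d - radius ** 2)

def sphere_barrier_grad(p, center):
    """Gradient of h with respect to p: 2 (p - c), pointing away from the center."""
    return 2.0 * (np.asarray(p, dtype=float) - np.asarray(center, dtype=float))

# Hypothetical keep-out region: 5 cm sphere centered at (0.3, 0.0, 0.1) m.
center, radius = [0.3, 0.0, 0.1], 0.05
h_outside = sphere_barrier([0.40, 0.0, 0.1], center, radius)  # positive: safe
h_inside = sphere_barrier([0.31, 0.0, 0.1], center, radius)   # negative: unsafe
```

The squared form avoids the non-differentiable point of the plain distance at the sphere center.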

Policy Comparison

Proposed: CBF-FM

Baseline: E2E-CBF

Performance metrics evaluated across 150+ hardware trials, comparing our proposed CBF-FM architecture against baseline models.

Metric             Base VLA   E2E-CBF   CBF-FM (Ours)
Safety Rate (%)    15.0       68.18     100.0
Success Rate (%)   85.0       68.18     77.41

*Trials included scenarios with objects placed inside and outside of unsafe collision zones. A safety violation is not critical to the robot's health or to the task objective, so an unsafe rollout can still count as successful; the success rate measures task completion only. In these trials, CBF-FM reached a success rate of 77.41%, compared with 85.0% for the base VLA and 68.18% for E2E-CBF, with zero safety violations.

Sensitivity Analysis

Kappa Comparison

Effect of the smoothing parameter κ on the conservativeness of the barrier.
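
To illustrate how κ can control conservativeness, the sketch below assumes the common smooth-minimum form h_agg = -(1/κ) log Σ_i exp(-κ h_i); the exact expression used in the paper is not given on this page, so treat the formula and the function name as assumptions. Under this form, min_i h_i - log(N)/κ ≤ h_agg ≤ min_i h_i, so h_agg ≥ 0 implies every individual barrier is non-negative, and a larger κ tightens the approximation.

```python
import numpy as np

def lse_barrier(h_values, kappa):
    """Smooth under-approximation of min(h_values) via log-sum-exp.

    The minimum is factored out before exponentiating for numerical stability.
    """
    h = np.asarray(h_values, dtype=float)
    h_min = h.min()
    return float(h_min - np.log(np.sum(np.exp(-kappa * (h - h_min)))) / kappa)

# Hypothetical per-step barrier values along one action chunk.
h_steps = [0.10, 0.50, 0.30]
h_loose = lse_barrier(h_steps, kappa=5.0)    # more conservative (further below the minimum)
h_tight = lse_barrier(h_steps, kappa=50.0)   # closer to the true minimum of 0.10
```

A small κ gives a smoother but more conservative aggregate; a large κ approaches the hard minimum at the cost of sharper gradients.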

Step Comparison

Impact of the number of denoising steps with CBF-QP on trajectory safety.
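
One way to picture a CBF-QP applied at every denoising step is an Euler integration of the flow in which each velocity is minimally corrected to satisfy a single barrier condition. The sketch below is an assumption-laden toy, not the paper's implementation: `velocity_fn` is a hypothetical stand-in for the learned velocity field, the QP has one constraint and is solved in closed form, and the barrier is a 2D wall.

```python
import numpy as np

def cbf_project(v_nom, h, grad_h, alpha):
    """Closed-form solution of  min ||v - v_nom||^2  s.t.  grad_h^T v >= -alpha * h."""
    slack = grad_h @ v_nom + alpha * h
    if slack >= 0.0:
        return v_nom  # nominal velocity already satisfies the barrier condition
    return v_nom - slack * grad_h / (grad_h @ grad_h)

def safe_euler_flow(a0, velocity_fn, h_fn, grad_fn, num_steps, alpha):
    """Integrate the flow from tau = 0 to 1, filtering the velocity at every step."""
    a = np.asarray(a0, dtype=float)
    dt = 1.0 / num_steps
    path = [a.copy()]
    for k in range(num_steps):
        v = velocity_fn(a, k * dt)
        v = cbf_project(v, h_fn(a), grad_fn(a), alpha)
        a = a + dt * v
        path.append(a.copy())
    return np.array(path)

# Toy setup: straight-line nominal flow toward a target below the wall y >= 0.
start, target = np.array([0.0, 0.1]), np.array([1.0, -0.2])
path = safe_euler_flow(
    start,
    velocity_fn=lambda a, tau: target - start,   # hypothetical constant velocity field
    h_fn=lambda a: a[1],                         # wall barrier h(a) = y
    grad_fn=lambda a: np.array([0.0, 1.0]),
    num_steps=10,
    alpha=5.0,
)
```

For an affine barrier the discrete update satisfies h_next ≥ (1 - α Δτ) h, so the sample stays safe whenever α Δτ ≤ 1. For nonlinear barriers the continuous-time condition is only approximated by finite steps, which is one reason the number of denoising steps can affect safety.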

Trajectory Smoothing and Velocity Enforcement

Beyond collision avoidance, our formulation enables the direct optimization of the action chunk for physical feasibility. By incorporating a sparse finite-difference matrix D into the barrier function Quadratic Program (QP), we can:

  • Enforce Joint Velocity Limits: Hard constraints ensure that no joint command exceeds the physical limits of the SO-101 arm, preventing motor saturation.
  • Optimize for Smoothness: An additional cost term minimizes the difference between consecutive actions, significantly reducing jerky motions and improving the quality of the generated trajectory.

This integrated approach ensures that the "Safe VLA" trajectories are not only collision-free but also dynamically consistent and smooth for real-world execution.
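
The role of the finite-difference matrix D can be sketched as follows. This is a minimal illustration, assuming an action chunk stored as an H × n array of joint positions; D is built densely for brevity although the text describes it as sparse, and the function names, the limit, and the time step are hypothetical. The hard velocity limits are only assembled in the form G a ≤ b for a generic QP solver, and the smoothing step shown is the unconstrained least-squares solution rather than the full constrained QP.

```python
import numpy as np

def finite_difference_matrix(horizon):
    """(H-1) x H matrix D with (D a)[t] = a[t+1] - a[t]."""
    D = np.zeros((horizon - 1, horizon))
    idx = np.arange(horizon - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def velocity_limit_constraints(D, v_max, dt):
    """Express |D a| <= v_max * dt for one joint as stacked inequalities G a <= b."""
    G = np.vstack([D, -D])
    b = np.full(G.shape[0], v_max * dt)
    return G, b

def smooth_chunk(chunk, D, weight):
    """Unconstrained minimizer of ||A - chunk||^2 + weight * ||D A||^2."""
    H = D.shape[1]
    return np.linalg.solve(np.eye(H) + weight * D.T @ D, chunk)

# Hypothetical chunk: 16 steps, 6 joints, with some step-to-step jitter.
rng = np.random.default_rng(0)
chunk = np.cumsum(0.05 * rng.standard_normal((16, 6)), axis=0)
D = finite_difference_matrix(chunk.shape[0])
G, b = velocity_limit_constraints(D, v_max=1.0, dt=0.05)
smoothed = smooth_chunk(chunk, D, weight=5.0)
roughness_before = np.sum((D @ chunk) ** 2)
roughness_after = np.sum((D @ smoothed) ** 2)
```

Each row of D couples only two consecutive actions, so both the velocity constraints and the smoothness cost stay banded, which is what makes a sparse representation attractive for longer chunks.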

Joint Velocity Profiles
Joint velocity profiles showing strict adherence to maximum limits (red dashed lines) and the resulting smooth path generated by the Action Expert.