Path Planning API Reference#
This page documents ManipulaPy.path_planning, the module for optimized trajectory generation with adaptive GPU/CPU execution, CUDA acceleration, and collision avoidance.
Tip
For conceptual explanations, see Trajectory Planning User Guide.
OptimizedTrajectoryPlanning Class#
- class ManipulaPy.path_planning.OptimizedTrajectoryPlanning(serial_manipulator, urdf_path, dynamics, joint_limits, torque_limits=None, *, use_cuda=None, cuda_threshold=10, memory_pool_size_mb=None, enable_profiling=False, auto_optimize=True, kernel_type='auto', target_speedup=40.0)[source]#
Bases:
objectHighly optimized trajectory planning class with adaptive GPU/CPU execution, memory pooling, and batch processing capabilities for 40x+ speedups.
Highly optimized trajectory planning class with adaptive GPU/CPU execution, memory pooling, and batch processing capabilities.
Constructor
- Parameters:
- __init__(serial_manipulator, urdf_path, dynamics, joint_limits, torque_limits=None, *, use_cuda=None, cuda_threshold=10, memory_pool_size_mb=None, enable_profiling=False, auto_optimize=True, kernel_type='auto', target_speedup=40.0)[source]#
Enhanced trajectory planner with advanced CUDA optimizations.
- Parameters:
serial_manipulator (SerialManipulator)
urdf_path (str)
dynamics (ManipulatorDynamics)
use_cuda (None | bool) β
None β auto-detect (default)
True β force GPU (raise if CUDA absent)
False β force CPU
cuda_threshold (int) β Min. (N Γ joints) before we bother launching the GPU.
memory_pool_size_mb (int | None) β If set, resize the global CUDA memory pool (in MB).
enable_profiling (bool) β Enable CUDA profiling for performance analysis.
auto_optimize (bool) β Automatically setup CUDA environment for maximum performance.
kernel_type (str) β Kernel selection strategy: βautoβ, βstandardβ, βvectorizedβ, βmemory_optimizedβ, βwarp_optimizedβ, βcache_friendlyβ, βauto_tuneβ
target_speedup (float) β Target speedup over CPU (used for recommendations).
- Return type:
None
Parameters:
serial_manipulator (SerialManipulator) β Robot kinematics object
urdf_path (str) β Path to URDF file for collision checking
dynamics (ManipulatorDynamics) β Robot dynamics object
joint_limits (list) β Joint limits as [(min, max), β¦] tuples
torque_limits (list, optional) β Torque limits as [(min, max), β¦] tuples
use_cuda (bool, optional) β Force GPU (True), CPU (False), or auto-detect (None)
cuda_threshold (int) β Minimum problem size before using GPU (default: 50)
memory_pool_size_mb (int, optional) β GPU memory pool size in MB
enable_profiling (bool) β Enable CUDA profiling for performance analysis
Attributes Created:
collision_checker (CollisionChecker) β URDF-based collision detection
potential_field (PotentialField) β Artificial potential field for obstacle avoidance
cuda_available (bool) β Whether CUDA acceleration is available
gpu_properties (dict) β GPU device properties for optimization
performance_stats (dict) β Performance tracking for GPU/CPU usage
Core Trajectory Generation
- joint_trajectory(thetastart, thetaend, Tf, N, method, kernel_type=None, enable_monitoring=None)[source]#
Enhanced joint trajectory generation with advanced CUDA optimizations.
- Parameters:
thetastart (numpy.ndarray) β The starting joint angles.
thetaend (numpy.ndarray) β The ending joint angles.
Tf (float) β The final time for the trajectory.
N (int) β The number of steps in the trajectory.
method (int) β The method to use (3=cubic, 5=quintic).
kernel_type (str, optional) β Override default kernel selection.
enable_monitoring (bool, optional) β Override default monitoring.
- Returns:
A dictionary containing positions, velocities, and accelerations.
- Return type:
Parameters:
thetastart (array_like) β Starting joint angles in radians
thetaend (array_like) β Target joint angles in radians
Tf (float) β Total trajectory time in seconds
N (int) β Number of trajectory points
method (int) β Time scaling method (3=cubic, 5=quintic)
Returns:
trajectory (dict) β Dictionary containing:
positions (numpy.ndarray) β Joint positions (NΓn)
velocities (numpy.ndarray) β Joint velocities (NΓn)
accelerations (numpy.ndarray) β Joint accelerations (NΓn)
Adaptive Execution: Automatically chooses GPU or CPU based on problem size and hardware availability.
CUDA Acceleration: Uses optimized 2D parallelized kernels with shared memory optimization.
Collision Avoidance: Automatically applies potential field-based collision avoidance.
- batch_joint_trajectory(thetastart_batch, thetaend_batch, Tf, N, method, kernel_type=None)[source]#
Enhanced batch trajectory generation with optimal kernel selection.
- Parameters:
thetastart_batch (numpy.ndarray) β Starting angles (batch_size, num_joints)
thetaend_batch (numpy.ndarray) β Ending angles (batch_size, num_joints)
Tf (float) β Final time for all trajectories
N (int) β Number of trajectory points
method (int) β Time scaling method
kernel_type (str, optional) β Override kernel selection
- Returns:
Batch trajectory data with shape (batch_size, N, num_joints)
- Return type:
Parameters:
thetastart_batch (numpy.ndarray) β Starting angles for multiple trajectories (batch_size, num_joints)
thetaend_batch (numpy.ndarray) β Ending angles for multiple trajectories (batch_size, num_joints)
Tf (float) β Total trajectory time in seconds
N (int) β Number of trajectory points
method (int) β Time scaling method (3=cubic, 5=quintic)
Returns:
trajectory (dict) β Batch trajectory data with shape (batch_size, N, num_joints)
3D Parallelization: Uses CUDA 3D grids for (batch, time, joint) parallel computation.
Memory Optimization: Efficient memory pooling for large batch processing.
- cartesian_trajectory(Xstart, Xend, Tf, N, method)[source]#
Enhanced Cartesian trajectory generation with optimal kernel selection.
- Parameters:
- Returns:
Dictionary with positions, velocities, accelerations, and orientations.
- Return type:
Parameters:
Xstart (numpy.ndarray) β Initial SE(3) transformation matrix (4Γ4)
Xend (numpy.ndarray) β Target SE(3) transformation matrix (4Γ4)
Tf (float) β Total trajectory time in seconds
N (int) β Number of trajectory points
method (int) β Time scaling method (3=cubic, 5=quintic)
Returns:
trajectory (dict) β Dictionary containing:
positions (numpy.ndarray) β Cartesian positions (NΓ3)
velocities (numpy.ndarray) β Linear velocities (NΓ3)
accelerations (numpy.ndarray) β Linear accelerations (NΓ3)
orientations (numpy.ndarray) β Rotation matrices (NΓ3Γ3)
Hybrid Computation: Uses GPU for position derivatives, CPU for complex orientation interpolation.
SE(3) Interpolation: Matrix logarithms and exponentials for smooth orientation changes.
Dynamics Integration
- inverse_dynamics_trajectory(thetalist_trajectory, dthetalist_trajectory, ddthetalist_trajectory, gravity_vector=None, Ftip=None)[source]#
Compute joint torques with enhanced CUDA acceleration.
- Parameters:
thetalist_trajectory (np.ndarray) β Array of joint angles over the trajectory.
dthetalist_trajectory (np.ndarray) β Array of joint velocities over the trajectory.
ddthetalist_trajectory (np.ndarray) β Array of joint accelerations over the trajectory.
gravity_vector (np.ndarray, optional) β Gravity vector affecting the system.
Ftip (list, optional) β External forces applied at the end effector.
- Returns:
Array of joint torques required to follow the trajectory.
- Return type:
np.ndarray
Parameters:
thetalist_trajectory (numpy.ndarray) β Joint angle trajectory (NΓn)
dthetalist_trajectory (numpy.ndarray) β Joint velocity trajectory (NΓn)
ddthetalist_trajectory (numpy.ndarray) β Joint acceleration trajectory (NΓn)
gravity_vector (array_like, optional) β Gravity vector [gx, gy, gz] (default: [0, 0, -9.81])
Ftip (array_like, optional) β End-effector wrench [fx, fy, fz, mx, my, mz] (default: zeros)
Returns:
torques (numpy.ndarray) β Required joint torques (NΓn)
Adaptive Execution: GPU with 2D parallelization for large trajectories, CPU for small ones.
Memory Management: Automatic memory pooling and pinned memory transfers.
Robust Fallback: Automatic CPU fallback on GPU errors.
- forward_dynamics_trajectory(thetalist, dthetalist, taumat, g, Ftipmat, dt, intRes)[source]#
Enhanced forward dynamics trajectory computation.
- Parameters:
thetalist (np.ndarray) β Initial joint angles.
dthetalist (np.ndarray) β Initial joint velocities.
taumat (np.ndarray) β Array of joint torques over the trajectory.
g (np.ndarray) β Gravity vector.
Ftipmat (np.ndarray) β Array of external forces.
dt (float) β Time step.
intRes (int) β Integration resolution.
- Returns:
Dictionary containing positions, velocities, and accelerations.
- Return type:
Parameters:
thetalist (numpy.ndarray) β Initial joint angles
dthetalist (numpy.ndarray) β Initial joint velocities
taumat (numpy.ndarray) β Applied torque trajectory (NΓn)
g (numpy.ndarray) β Gravity vector [gx, gy, gz]
Ftipmat (numpy.ndarray) β External force trajectory (NΓ6)
dt (float) β Integration time step
intRes (int) β Integration resolution (sub-steps per dt)
Returns:
simulation (dict) β Dictionary containing:
positions (numpy.ndarray) β Simulated joint positions (NΓn)
velocities (numpy.ndarray) β Simulated joint velocities (NΓn)
accelerations (numpy.ndarray) β Simulated joint accelerations (NΓn)
Parallel Integration: GPU-accelerated numerical integration with automatic joint limit enforcement.
Private GPU/CPU Methods
- _joint_trajectory_gpu(thetastart, thetaend, Tf, N, method, kernel_type, enable_monitoring)[source]#
GPU joint trajectory generation with optimal kernel selection.
Runs the monitored GPU trajectory generator, clips positions to the configured joint limits, optionally applies GPU collision avoidance, updates performance stats, and falls back to the CPU path on any error.
- Parameters:
thetastart β (num_joints,) ndarray of starting joint angles (rad).
thetaend β (num_joints,) ndarray of ending joint angles (rad).
Tf (float) β Total trajectory duration, seconds.
N (int) β Number of trajectory points.
method (int) β Time-scaling method; 3 for cubic, 5 for quintic.
kernel_type (str) β Kernel selection strategy passed to the GPU generator (e.g. βautoβ, βvectorizedβ).
enable_monitoring (bool) β If True, enable CUDA profiling/logging.
- Returns:
Dict with keys
"positions","velocities"and"accelerations", each an(N, num_joints)array.- Return type:
Dict[str, np.ndarray]
GPU-accelerated joint trajectory generation with optimized memory management.
- _joint_trajectory_cpu(thetastart, thetaend, Tf, N, method)[source]#
CPU joint trajectory generation with performance tracking.
Generates the trajectory via the Numba CPU kernel, clips positions to the configured joint limits, optionally applies CPU collision avoidance, and records the elapsed time as the CPU baseline used for speedup reporting.
- Parameters:
- Returns:
Dict with keys
"positions","velocities"and"accelerations", each an(N, num_joints)array.- Return type:
Dict[str, np.ndarray]
CPU-based joint trajectory generation with Numba optimization.
- _apply_collision_avoidance_gpu(traj_pos, thetaend)[source]#
Apply potential-field collision avoidance on the CUDA-enabled path.
Collision avoidance is a joint-space computation and has no GPU kernel (the GPU potential field is 3-D Cartesian, not joint-space), so this delegates to the validated joint-space CPU routine. Kept as a distinct dispatch target so the GPU/CPU planning code paths stay symmetric.
- Parameters:
- Returns:
The (possibly adjusted)
(N, num_joints)trajectory.- Return type:
np.ndarray
Apply GPU-accelerated potential field-based collision avoidance.
- _apply_collision_avoidance_cpu(traj_pos, thetaend)[source]#
Apply CPU-based potential field collision avoidance.
For each colliding trajectory point, iteratively descends the potential-field gradient toward
thetaend(up to 100 iterations) until the configuration is collision-free.- Parameters:
- Returns:
The (possibly adjusted)
(N, num_joints)trajectory.- Return type:
np.ndarray
Apply CPU-based potential field collision avoidance.
- _inverse_dynamics_gpu(thetalist_trajectory, dthetalist_trajectory, ddthetalist_trajectory, gravity_vector, Ftip)[source]#
GPU-accelerated inverse dynamics over a trajectory.
Transfers the trajectory and the manipulatorβs dynamics data (Glist, S_list, M_list) to the device, launches the 2D inverse-dynamics kernel to compute per-point joint torques, clips them to
self.torque_limitsand returns them. Falls back to the CPU implementation if dynamics-data conversion or kernel execution fails.- Parameters:
thetalist_trajectory β (num_points, num_joints) ndarray of joint angles, radians.
dthetalist_trajectory β (num_points, num_joints) ndarray of joint velocities, rad/s.
ddthetalist_trajectory β (num_points, num_joints) ndarray of joint accelerations, rad/s^2.
gravity_vector β (3,) ndarray of gravitational acceleration, m/s^2.
Ftip β length-6 sequence of the external wrench at the end effector.
- Returns:
(num_points, num_joints) array of joint torques, clipped to the configured torque limits.
- Return type:
np.ndarray
GPU-accelerated inverse dynamics computation with optimized memory management.
- _inverse_dynamics_cpu(thetalist_trajectory, dthetalist_trajectory, ddthetalist_trajectory, gravity_vector, Ftip)[source]#
CPU-based inverse dynamics over a trajectory.
Computes per-point joint torques by calling
self.dynamics.inverse_dynamicsat each trajectory point (using zero torques for any point that raises), then clips the result toself.torque_limits.- Parameters:
thetalist_trajectory β (num_points, num_joints) ndarray of joint angles, radians.
dthetalist_trajectory β (num_points, num_joints) ndarray of joint velocities, rad/s.
ddthetalist_trajectory β (num_points, num_joints) ndarray of joint accelerations, rad/s^2.
gravity_vector β (3,) ndarray of gravitational acceleration, m/s^2.
Ftip β length-6 sequence of the external wrench at the end effector.
- Returns:
(num_points, num_joints) array of joint torques, clipped to the configured torque limits.
- Return type:
np.ndarray
CPU-based inverse dynamics computation.
- _forward_dynamics_gpu(thetalist, dthetalist, taumat, g, Ftipmat, dt, intRes)[source]#
GPU-accelerated forward dynamics integration over a trajectory.
Seeds the first step with the initial state, transfers the torque profile, external forces and dynamics data to the device, and launches the forward-dynamics kernel to integrate joint motion with
intRessub-steps per outer step, enforcing joint limits on the device. Falls back to the CPU implementation on any error.- Parameters:
thetalist β (num_joints,) ndarray of initial joint angles, radians.
dthetalist β (num_joints,) ndarray of initial joint velocities, rad/s.
taumat β (num_steps, num_joints) ndarray of applied joint torques.
g β (3,) ndarray of gravitational acceleration, m/s^2.
Ftipmat β (num_steps, 6) ndarray of end-effector wrenches per step.
dt (float) β Outer time step between trajectory points, seconds.
intRes (int) β Number of integration sub-steps per outer step.
- Returns:
Dict with keys
"positions","velocities"and"accelerations", each a(num_steps, num_joints)array.- Return type:
Dict[str, np.ndarray]
GPU-accelerated forward dynamics computation with optimized memory management.
- _forward_dynamics_cpu(thetalist, dthetalist, taumat, g, Ftipmat, dt, intRes)[source]#
CPU-based forward dynamics integration over a trajectory.
Integrates joint motion step by step using
self.dynamics.forward_dynamicswithintResEuler sub-steps per outer step (dt / intReseach), clamping joint angles toself.joint_limitsand using zero acceleration for any step that raises.- Parameters:
thetalist β (num_joints,) ndarray of initial joint angles, radians.
dthetalist β (num_joints,) ndarray of initial joint velocities, rad/s.
taumat β (num_steps, num_joints) ndarray of applied joint torques.
g β (3,) ndarray of gravitational acceleration, m/s^2.
Ftipmat β (num_steps, 6) ndarray of end-effector wrenches per step.
dt (float) β Outer time step between trajectory points, seconds.
intRes (int) β Number of integration sub-steps per outer step.
- Returns:
Dict with keys
"positions","velocities"and"accelerations", each a(num_steps, num_joints)array.- Return type:
Dict[str, np.ndarray]
CPU-based forward dynamics computation.
- _cartesian_trajectory_gpu(pstart, pend, Tf, N, method)[source]#
GPU computation of Cartesian linear velocity and acceleration.
Launches the Cartesian trajectory kernel to compute the time-scaled translational velocity and acceleration of the straight-line path from
pstarttopend. Falls back to the CPU implementation on error.- Parameters:
- Returns:
(traj_vel, traj_acc), each an(N, 3)array of linear velocity (m/s) and acceleration (m/s^2).- Return type:
Tuple[np.ndarray, np.ndarray]
GPU-accelerated Cartesian trajectory computation.
- _cartesian_trajectory_cpu(pstart, pend, Tf, N, method)[source]#
CPU computation of Cartesian linear velocity and acceleration.
Evaluates the time-scaling derivatives at each point and applies them to the straight-line displacement
pend - pstartto obtain the translational velocity and acceleration profiles.- Parameters:
- Returns:
(traj_vel, traj_acc), each an(N, 3)array of linear velocity (m/s) and acceleration (m/s^2).- Return type:
Tuple[np.ndarray, np.ndarray]
CPU-based Cartesian trajectory computation.
- _batch_joint_trajectory_cpu(thetastart_batch, thetaend_batch, Tf, N, method)[source]#
CPU fallback for batch trajectory generation.
Generates each trajectory in the batch sequentially with the Numba CPU kernel and clips positions to the configured joint limits (matching the GPU batch path).
- Parameters:
thetastart_batch β (batch_size, num_joints) ndarray of starting joint angles, radians.
thetaend_batch β (batch_size, num_joints) ndarray of ending joint angles, radians.
Tf (float) β Total trajectory duration, seconds.
N (int) β Number of trajectory points per trajectory.
method (int) β Time-scaling method; 3 for cubic, 5 for quintic.
- Returns:
Dict with keys
"positions","velocities"and"accelerations", each a(batch_size, N, num_joints)array.- Return type:
Dict[str, np.ndarray]
CPU fallback for batch trajectory generation.
- _should_use_gpu(N, num_joints)[source]#
Decide whether to dispatch a problem to the GPU.
Returns
Falsewhen CUDA is unavailable or the total work (N * num_joints) is belowself.cpu_threshold. Otherwise logs a debug hint when the per-SM element count is unlikely to reach the configured target speedup, then approves GPU execution.- Parameters:
- Returns:
Trueif the GPU path should be used,Falseotherwise.- Return type:
Determine if GPU should be used based on problem size and availability.
Parameters:
N (int) β Number of trajectory points
num_joints (int) β Number of joints
Returns:
bool β True if GPU should be used
- _get_or_resize_gpu_array(array_name, shape, dtype=<class 'numpy.float32'>)[source]#
Return a pooled CUDA array with the requested shape / dtype.
Looks up a previously pooled device array by
array_nameand reuses it when its shape and dtype match; otherwise the old array (if any) is returned to the pool and a freshly sized one is acquired and cached.- Parameters:
- Returns:
The pooled CUDA device array of the requested shape/dtype, or
Noneif CUDA is unavailable.- Return type:
Any
Return a pooled CUDA array with the requested shape/dtype.
Parameters:
array_name (str) β Key for array cache
shape (tuple) β Desired array shape
dtype (np.dtype) β Desired data type
Returns:
cuda.device_array or None β GPU array or None if CUDA unavailable
Performance Monitoring
- get_performance_stats()[source]#
Enhanced performance statistics with speedup analysis.
- Returns:
Comprehensive performance statistics
- Return type:
Get comprehensive performance statistics for GPU vs CPU usage.
Returns:
dict β Performance statistics including:
gpu_calls (int) β Number of GPU function calls
cpu_calls (int) β Number of CPU function calls
total_gpu_time (float) β Total GPU execution time
total_cpu_time (float) β Total CPU execution time
avg_gpu_time (float) β Average GPU execution time
avg_cpu_time (float) β Average CPU execution time
gpu_usage_percent (float) β Percentage of operations using GPU
memory_transfers (int) β Number of memory transfers
kernel_launches (int) β Number of CUDA kernel launches
- reset_performance_stats()[source]#
Reset performance statistics.
Reset all performance statistics to zero.
- Return type:
None
- benchmark_performance(test_cases=None, include_cpu_comparison=True)[source]#
Enhanced performance benchmarking with detailed analysis.
- Parameters:
- Returns:
Comprehensive benchmark results
- Return type:
Benchmark the performance of GPU vs CPU implementations.
Parameters:
test_cases (list, optional) β List of test cases to benchmark
Returns:
dict β Benchmark results for each test case
Memory Management
- cleanup_gpu_memory()[source]#
Enhanced GPU memory cleanup.
Clean up GPU memory pools and cached arrays.
- Return type:
None
Visualization Methods
- static plot_trajectory(trajectory_data, Tf, title='Joint Trajectory', labels=None, performance_stats=None)[source]#
Plot joint position, velocity and acceleration trajectories.
Draws a 3-by-num_joints grid of subplots (position, velocity, acceleration per joint) over a time axis spanning
[0, Tf]and appends GPU speedup / kernel info to the title whenperformance_statsis provided. Displays the figure viaplt.show().- Parameters:
trajectory_data (dict) β Trajectory dict with keys
"positions","velocities"and"accelerations", each an(num_steps, num_joints)array.Tf (float) β Total trajectory duration, seconds, defining the time axis.
title (str) β Base figure title. Defaults to
"Joint Trajectory".labels (list, optional) β Per-joint labels; used only when its length equals
num_joints, otherwise generic"Joint i"labels are generated.performance_stats (dict, optional) β Optional stats with keys such as
"speedup_achieved"and"best_kernel_used"used to annotate the title.
- Return type:
None
Parameters:
trajectory_data (dict) β Trajectory data with positions, velocities, accelerations
Tf (float) β Total trajectory time
title (str, optional) β Plot title (default: βJoint Trajectoryβ)
labels (list, optional) β Joint labels for legend
- plot_tcp_trajectory(trajectory, dt)[source]#
Enhanced TCP trajectory plotting with performance monitoring.
- Parameters:
- Returns:
None
- Return type:
None
Parameters:
trajectory (list) β List of joint angle configurations
dt (float) β Time step between trajectory points
- plot_cartesian_trajectory(trajectory_data, Tf, title='Cartesian Trajectory', performance_stats=None)[source]#
Enhanced Cartesian trajectory plotting with performance information.
- Parameters:
- Returns:
None
- Return type:
None
Parameters:
trajectory_data (dict) β Cartesian trajectory data
Tf (float) β Total trajectory time
title (str, optional) β Plot title (default: βCartesian Trajectoryβ)
- plot_ee_trajectory(trajectory_data, Tf, title='End-Effector Trajectory')[source]#
Enhanced end-effector trajectory plotting.
- Parameters:
- Returns:
None
- Return type:
None
Parameters:
trajectory_data (dict) β End-effector trajectory data
Tf (float) β Total trajectory time
title (str, optional) β Plot title (default: βEnd-Effector Trajectoryβ)
Utility Methods
- calculate_derivatives(positions, dt)[source]#
Calculate the velocity, acceleration, and jerk of a trajectory.
- Parameters:
positions (list or numpy.ndarray) β A list or array of positions.
dt (float) β The time step between each position.
- Returns:
An array of velocities. acceleration (numpy.ndarray): An array of accelerations. jerk (numpy.ndarray): An array of jerks.
- Return type:
velocity (numpy.ndarray)
Parameters:
positions (array_like) β Position trajectory
dt (float) β Time step between positions
Returns:
velocity (numpy.ndarray) β Finite difference velocities
acceleration (numpy.ndarray) β Finite difference accelerations
jerk (numpy.ndarray) β Finite difference jerk
- plan_trajectory(start_position, target_position, obstacle_points)[source]#
Enhanced trajectory planning with collision avoidance.
- Parameters:
- Returns:
Joint trajectory as a list of joint configurations.
- Return type:
Parameters:
start_position (list) β Initial joint configuration
target_position (list) β Desired joint configuration
obstacle_points (list) β Environment obstacle points
Returns:
trajectory (list) β Planned joint trajectory waypoints
TrajectoryPlanning Class#
- class ManipulaPy.path_planning.TrajectoryPlanning(*args, **kwargs)[source]#
Bases:
OptimizedTrajectoryPlanningBackward compatibility alias for OptimizedTrajectoryPlanning.
This ensures existing code continues to work while providing access to all optimizations.
Backward compatibility alias for OptimizedTrajectoryPlanning.
This ensures existing code continues to work while providing access to all optimizations.
Utility Functions#
- ManipulaPy.path_planning.create_optimized_planner(serial_manipulator, urdf_path, dynamics, joint_limits, torque_limits=None, target_speedup=40.0, gpu_memory_mb=None, enable_profiling=False, kernel_type='auto')[source]#
Enhanced factory function to create an optimized trajectory planner.
- Parameters:
serial_manipulator β SerialManipulator instance
urdf_path β Path to URDF file
dynamics β ManipulatorDynamics instance
joint_limits β Joint limits
torque_limits β Torque limits (optional)
target_speedup β Target speedup over CPU (default: 40x)
gpu_memory_mb β GPU memory pool size in MB (optional)
enable_profiling β Enable CUDA profiling (optional)
kernel_type β Kernel selection strategy (optional)
- Returns:
Configured planner instance
- Return type:
Factory function to create an optimized trajectory planner with recommended settings.
Parameters:
serial_manipulator (SerialManipulator) β SerialManipulator instance
urdf_path (str) β Path to URDF file
dynamics (ManipulatorDynamics) β ManipulatorDynamics instance
joint_limits (list) β Joint limits
torque_limits (list, optional) β Torque limits
gpu_memory_mb (int, optional) β GPU memory pool size in MB
enable_profiling (bool) β Enable CUDA profiling
Returns:
OptimizedTrajectoryPlanning β Configured planner instance
- ManipulaPy.path_planning.compare_implementations(serial_manipulator, urdf_path, dynamics, joint_limits, test_params=None, detailed_analysis=True)[source]#
Enhanced implementation comparison with detailed kernel analysis.
- Parameters:
serial_manipulator β SerialManipulator instance
urdf_path β Path to URDF file
dynamics β ManipulatorDynamics instance
joint_limits β Joint limits
test_params β Test parameters (optional)
detailed_analysis β Whether to perform detailed kernel comparison
- Returns:
Comprehensive comparison results
- Return type:
Compare performance between CPU and GPU implementations.
Parameters:
serial_manipulator (SerialManipulator) β SerialManipulator instance
urdf_path (str) β Path to URDF file
dynamics (ManipulatorDynamics) β ManipulatorDynamics instance
joint_limits (list) β Joint limits
test_params (dict, optional) β Test parameters
Returns:
dict β Comparison results including timing and accuracy metrics
CPU Optimization Functions#
- ManipulaPy.path_planning._trajectory_cpu_fallback(*args, **kwargs)#
MagicMock is a subclass of Mock with default implementations of most of the magic methods. You can use MagicMock without having to configure the magic methods yourself.
If you use the spec or spec_set arguments then only magic methods that exist in the spec will be created.
Attributes and the return value of a MagicMock will also be MagicMocks.
Numba-optimized CPU trajectory generation with parallel execution.
Parameters:
thetastart (numpy.ndarray) β Starting joint angles
thetaend (numpy.ndarray) β Ending joint angles
Tf (float) β Final time
N (int) β Number of trajectory points
method (int) β Time scaling method
Returns:
tuple β (positions, velocities, accelerations) arrays
Optimization: Uses Numbaβs parallel prange for CPU parallelization.
- ManipulaPy.path_planning._traj_cpu_njit(*args, **kwargs)#
MagicMock is a subclass of Mock with default implementations of most of the magic methods. You can use MagicMock without having to configure the magic methods yourself.
If you use the spec or spec_set arguments then only magic methods that exist in the spec will be created.
Attributes and the return value of a MagicMock will also be MagicMocks.
Thin wrapper for Numba-optimized trajectory computation.
Parameters:
thetastart (numpy.ndarray) β Starting joint angles
thetaend (numpy.ndarray) β Ending joint angles
Tf (float) β Final time
N (int) β Number of trajectory points
method (int) β Time scaling method
Returns:
tuple β (positions, velocities, accelerations) arrays
Usage Examples#
Optimized Trajectory Planning:
from ManipulaPy.path_planning import OptimizedTrajectoryPlanning
from ManipulaPy.urdf_processor import URDFToSerialManipulator
# Setup with optimization options
processor = URDFToSerialManipulator("robot.urdf")
planner = OptimizedTrajectoryPlanning(
processor.serial_manipulator,
"robot.urdf",
processor.dynamics,
joint_limits=[(-np.pi, np.pi)] * 6,
use_cuda=None, # Auto-detect
cuda_threshold=100, # Use GPU for problems > 100*6 = 600 elements
memory_pool_size_mb=256, # 256MB GPU memory pool
enable_profiling=True
)
# Generate trajectory with automatic GPU/CPU selection
trajectory = planner.joint_trajectory(
thetastart=[0, 0, 0, 0, 0, 0],
thetaend=[0.5, -0.3, 0.8, 0.1, -0.2, 0.4],
Tf=3.0,
N=2000, # Large enough to trigger GPU
method=5
)
# Check performance stats
stats = planner.get_performance_stats()
print(f"GPU usage: {stats['gpu_usage_percent']:.1f}%")
print(f"Average GPU time: {stats['avg_gpu_time']*1000:.2f}ms")
Factory Function Usage:
from ManipulaPy.path_planning import create_optimized_planner
# Create planner with auto-optimized settings
planner = create_optimized_planner(
serial_manipulator=processor.serial_manipulator,
urdf_path="robot.urdf",
dynamics=processor.dynamics,
joint_limits=[(-np.pi, np.pi)] * 6,
gpu_memory_mb=512,
enable_profiling=False
)
Batch Trajectory Processing:
# Generate 10 trajectories simultaneously
batch_size = 10
num_joints = 6
starts = np.random.uniform(-np.pi, np.pi, (batch_size, num_joints))
ends = np.random.uniform(-np.pi, np.pi, (batch_size, num_joints))
# Process all trajectories in parallel on GPU
batch_trajectories = planner.batch_joint_trajectory(
thetastart_batch=starts,
thetaend_batch=ends,
Tf=2.0,
N=500,
method=3
)
print(f"Batch shape: {batch_trajectories['positions'].shape}") # (10, 500, 6)
Performance Comparison:
from ManipulaPy.path_planning import compare_implementations
# Compare CPU vs GPU performance
results = compare_implementations(
serial_manipulator=processor.serial_manipulator,
urdf_path="robot.urdf",
dynamics=processor.dynamics,
joint_limits=[(-np.pi, np.pi)] * 6,
test_params={"N": 5000, "Tf": 3.0, "method": 5}
)
print(f"CPU time: {results['cpu']['time']:.4f}s")
if results['gpu']['available']:
print(f"GPU time: {results['gpu']['time']:.4f}s")
print(f"Speedup: {results['gpu']['speedup']:.2f}x")
print(f"Max position difference: {results['accuracy']['max_pos_diff']:.2e}")
Advanced Dynamics Integration:
# Generate trajectory
trajectory = planner.joint_trajectory(
thetastart=[0, 0, 0, 0, 0, 0],
thetaend=[1.0, -0.5, 0.8, 0.2, -0.3, 0.6],
Tf=4.0,
N=1000,
method=5
)
# Compute required torques with GPU acceleration
torques = planner.inverse_dynamics_trajectory(
trajectory["positions"],
trajectory["velocities"],
trajectory["accelerations"],
gravity_vector=[0, 0, -9.81],
Ftip=[0, 0, -10, 0, 0, 0] # 10N downward force
)
# Forward dynamics simulation
simulation = planner.forward_dynamics_trajectory(
thetalist=[0, 0, 0, 0, 0, 0],
dthetalist=[0, 0, 0, 0, 0, 0],
taumat=torques,
g=[0, 0, -9.81],
Ftipmat=np.zeros((1000, 6)),
dt=0.004,
intRes=5
)
Performance Monitoring:
# Run multiple operations
for i in range(5):
trajectory = planner.joint_trajectory(
thetastart=np.random.uniform(-1, 1, 6),
thetaend=np.random.uniform(-1, 1, 6),
Tf=2.0,
N=1000,
method=3
)
# Analyze performance
stats = planner.get_performance_stats()
print(f"Total operations: {stats['gpu_calls'] + stats['cpu_calls']}")
print(f"GPU efficiency: {stats['gpu_usage_percent']:.1f}%")
print(f"Memory transfers: {stats['memory_transfers']}")
# Benchmark different problem sizes
benchmark_results = planner.benchmark_performance([
{"N": 100, "joints": 6, "name": "Small"},
{"N": 2000, "joints": 6, "name": "Large"},
{"N": 1000, "joints": 12, "name": "Many joints"},
])
for name, result in benchmark_results.items():
print(f"{name}: {result['total_time']:.4f}s (GPU: {result['used_gpu']})")
Memory Management:
# Clean up GPU resources explicitly
planner.cleanup_gpu_memory()
# Reset performance stats
planner.reset_performance_stats()
Cartesian Space Planning with Hybrid Computation:
# Define start and end poses
T_start = np.eye(4)
T_start[:3, 3] = [0.5, 0.2, 0.3]
T_end = np.eye(4)
T_end[:3, 3] = [0.3, 0.4, 0.5]
# Add rotation
from ManipulaPy.utils import rotation_matrix_z
T_end[:3, :3] = rotation_matrix_z(np.pi/3)
# Generate Cartesian trajectory (GPU for derivatives, CPU for orientations)
cart_traj = planner.cartesian_trajectory(
Xstart=T_start,
Xend=T_end,
Tf=3.0,
N=1000,
method=5
)
# Visualize
planner.plot_cartesian_trajectory(cart_traj, Tf=3.0)
Adaptive Threshold Tuning:
# The planner automatically adapts thresholds based on performance
initial_threshold = planner.cpu_threshold
# Run several operations
for _ in range(10):
trajectory = planner.joint_trajectory(
thetastart=np.random.uniform(-1, 1, 6),
thetaend=np.random.uniform(-1, 1, 6),
Tf=2.0,
N=800, # Around threshold
method=3
)
# Check if threshold adapted
final_threshold = planner.cpu_threshold
print(f"Threshold adapted: {initial_threshold} -> {final_threshold}")
Key Features#
Adaptive Execution:
Automatic GPU/CPU selection based on problem size
Intelligent threshold adaptation based on performance history
Graceful fallback on GPU errors or memory constraints
CUDA Acceleration:
2D parallelized kernels for optimal GPU utilization
Shared memory optimization for time-scaling computations
Memory pooling to reduce allocation overhead
Pinned memory transfers for maximum PCIe bandwidth
Memory Management:
Per-instance GPU array caching
Global memory pool with automatic cleanup
Configurable memory pool sizes
Explicit memory management controls
Performance Monitoring:
Detailed timing statistics for GPU vs CPU usage
Automatic performance tracking and analysis
Built-in benchmarking capabilities
Adaptive threshold tuning based on empirical performance
Batch Processing:
3D parallelized batch trajectory generation
Efficient memory management for large batches
Automatic load balancing across GPU cores
Robust Operation:
Comprehensive error handling with fallbacks
Automatic recovery from GPU memory issues
Extensive logging for debugging and optimization
Performance Characteristics#
GPU Acceleration Thresholds:
Small problems (N Γ joints < 200): CPU (lower overhead)
Medium problems (200 β€ N Γ joints < 5000): Adaptive selection
Large problems (N Γ joints β₯ 5000): GPU preferred
Memory Usage:
Joint trajectory: ~12 bytes per (time, joint) element
Cartesian trajectory: ~36 bytes per time step
Dynamics computation: ~20 bytes per (time, joint) element
Typical Speedups (on modern GPU):
Joint trajectories: 5-20Γ for N > 1000
Batch processing: 10-50Γ for large batches
Dynamics computation: 3-15Γ for long trajectories
Configuration Guidelines#
GPU Memory Pool Sizing:
Small robots (β€6 DOF): 128-256 MB
Medium robots (7-12 DOF): 256-512 MB
Large robots (>12 DOF): 512+ MB
CUDA Threshold Tuning:
High-end GPUs: Lower threshold (50-100)
Mid-range GPUs: Medium threshold (100-200)
Integrated GPUs: Higher threshold (200-500)
Batch Size Recommendations:
Memory-limited: batch_size β€ 50
Compute-limited: batch_size β€ 200
High-memory systems: batch_size β€ 1000
See Also#
CUDA Kernels API Reference β Low-level CUDA acceleration functions
Kinematics API Reference β Forward and inverse kinematics for trajectory execution
Dynamics API Reference β Dynamics computations for torque calculation
Control API Reference β Controllers for trajectory following
Potential Field API Reference β Collision avoidance and path optimization
Optimized Path Planning User Guide β Conceptual overview and planning strategies