Perception API Reference#
This page documents ManipulaPy.perception, the module for higher-level perception capabilities including obstacle detection, 3D point cloud generation, clustering, and environmental understanding for robotic systems.
Tip
For conceptual explanations, see Perception User Guide.
β
Perception Class#
- class ManipulaPy.perception.Perception(vision_instance=None, logger_name='PerceptionLogger')[source]#
Bases:
objectA higher-level perception module that uses a Vision instance to handle tasks like obstacle detection, 3D point cloud generation, and clustering.
- logger#
Logger for debugging and status messages.
- Type:
Higher-level perception module that integrates Vision capabilities with machine learning algorithms for comprehensive environmental understanding and obstacle analysis.
Constructor
- __init__(vision_instance=None, logger_name='PerceptionLogger')[source]#
Initialize the Perception system with a Vision instance.
Parameters:
vision_instance (Vision) β Required Vision instance for camera operations and low-level processing
logger_name (str, optional) β Logger identifier (default: βPerceptionLoggerβ)
Validation:
Raises ValueError if vision_instance is None
Initializes logging with configurable debug levels
Establishes Vision module dependency relationship
β
Primary Perception Methods#
Obstacle Detection Pipeline#
- Perception.detect_and_cluster_obstacles(camera_index=0, depth_threshold=5.0, step=2, eps=0.1, min_samples=3)[source]#
Capture an image from the Vision instance, detect 3D obstacle points, and then cluster those points using DBSCAN.
- Parameters:
camera_index (int) β Index of the camera to capture from.
depth_threshold (float) β Maximum depth (in meters) to consider for obstacles.
step (int) β Step size for downsampling the depth image.
eps (float) β Maximum distance between two points to be considered neighbors in DBSCAN.
min_samples (int) β Minimum number of points required to form a cluster in DBSCAN.
- Returns:
obstacle_points (np.ndarray of shape (N, 3)) β 3D points representing detected obstacles.
labels (np.ndarray) β Cluster labels for each point (with -1 for noise).
- Return type:
Complete environmental analysis pipeline combining vision-based detection with machine learning clustering.
Parameters:
camera_index (int, optional) β Camera identifier for image capture (default: 0)
depth_threshold (float, optional) β Maximum depth for obstacle consideration in meters (default: 5.0)
step (int, optional) β Depth image downsampling stride (default: 2)
eps (float, optional) β DBSCAN neighbor distance threshold (default: 0.1)
min_samples (int, optional) β DBSCAN minimum cluster size (default: 3)
Returns:
obstacle_points (np.ndarray) β Shape (N, 3) array of 3D obstacle coordinates
labels (np.ndarray) β Shape (N,) cluster labels with -1 indicating noise
Processing Pipeline:
Image Acquisition: vision.capture_image(camera_index)
Depth Validation: Checks depth.ndim >= 2 and non-None status
Obstacle Detection: vision.detect_obstacles() with YOLO integration
Point Validation: Verifies non-None and non-empty point arrays
Clustering Analysis: DBSCAN application to 3D point coordinates
Result Logging: Debug information and cluster statistics
β
Stereo Vision Integration#
Disparity Computation#
- Perception.compute_stereo_disparity(left_img, right_img)[source]#
Compute a stereo disparity map from two images.
- Parameters:
left_img (np.ndarray) β Image from the left camera.
right_img (np.ndarray) β Image from the right camera.
- Returns:
disparity β Computed disparity map.
- Return type:
np.ndarray of type float32
Wrapper for stereo disparity computation with error handling and logging.
Parameters:
left_img (np.ndarray) β Left camera image array
right_img (np.ndarray) β Right camera image array
Returns:
disparity (np.ndarray) β Float32 disparity map
Dependencies:
Requires vision.stereo_enabled = True
Delegates to vision.rectify_stereo_images()
Delegates to vision.compute_disparity()
Point Cloud Generation#
- Perception.get_stereo_point_cloud(left_img, right_img)[source]#
Generate a 3D point cloud from a stereo pair of images.
- Parameters:
left_img (np.ndarray) β Image from the left camera.
right_img (np.ndarray) β Image from the right camera.
- Returns:
point_cloud β The 3D point cloud in world coordinates.
- Return type:
np.ndarray of shape (N, 3)
3D point cloud generation from stereo image pairs with comprehensive error handling.
Parameters:
left_img (np.ndarray) β Left camera image array
right_img (np.ndarray) β Right camera image array
Returns:
point_cloud (np.ndarray) β Shape (N, 3) 3D point coordinates
Error Handling:
Returns empty (0, 3) array if stereo not enabled
Logs point cloud size for debugging
Delegates processing to vision.get_stereo_point_cloud()
β
Machine Learning Components#
Clustering Analysis#
- Perception.cluster_obstacles(points, eps=0.1, min_samples=3)[source]#
Cluster the 3D points using DBSCAN.
- Parameters:
- Returns:
labels (np.ndarray of shape (N,)) β The cluster label for each point (-1 indicates noise).
num_clusters (int) β The number of clusters found (excluding noise).
- Return type:
DBSCAN-based clustering of 3D obstacle points for spatial grouping and noise filtering.
Parameters:
points (np.ndarray) β Shape (N, 3) array of 3D coordinates
eps (float, optional) β Maximum neighbor distance (default: 0.1)
min_samples (int, optional) β Minimum cluster density (default: 3)
Returns:
labels (np.ndarray) β Shape (N,) cluster assignments
num_clusters (int) β Count of identified clusters excluding noise
- Raises:
ImportError β if scikit-learn is not installed. Install the optional ML extra with
pip install "ManipulaPy[ml]".- Parameters:
- Return type:
Changed in version 1.3.2:
import ManipulaPy.perceptionnow succeeds without scikit-learn β onlycluster_obstacles(and pipelines that call it) raiseImportErrorat call time.DBSCAN Implementation:
Uses sklearn.cluster.DBSCAN with L2 distance metric
Noise points labeled as -1
Cluster counting excludes noise category
Comprehensive logging of cluster statistics
Algorithm Details:
Input Validation: Empty array handling with early return
Model Instantiation: DBSCAN(eps=eps, min_samples=min_samples)
Fitting: dbscan_model.fit(points) on 3D coordinates
Label Extraction:
dbscan_model.labels_arrayStatistics Computation: Unique label counting and noise analysis
β
Resource Management#
Cleanup Methods#
- Perception.release()[source]#
Release resources held by the Vision instance.
Explicit resource cleanup for Vision instance dependencies.
Implementation:
Safely calls vision.release() with exception handling
Logs cleanup operations for debugging
Handles None vision instances gracefully
- Return type:
None
- Perception.__del__()[source]#
Destructor to ensure Vision resources are released.
Destructor ensuring automatic resource cleanup on object destruction.
Safety Features:
hasattr() validation before vision access
Silent exception handling to prevent destructor failures
Automatic cleanup without user intervention
- Return type:
None
β
Logging Infrastructure#
Logger Configuration#
- Perception._setup_logger(name)[source]#
Configure and return a logger for this Perception module.
Configures structured logging with consistent formatting across the perception pipeline.
Parameters:
name (str) β Logger identifier for message routing
Returns:
logger (logging.Logger) β Configured logger instance
Configuration Details:
Level: DEBUG for comprehensive information
Handler: StreamHandler for console output
Format: β%(asctime)s - %(name)s - %(levelname)s - %(message)sβ
Duplicate Prevention: Checks existing handlers before addition
β
Error Handling Hierarchy#
Vision Instance Validation: Constructor-level checking
Image Availability: Depth image validation with fallbacks
Detection Results: None-checking with empty array returns
Clustering Input: Empty point array handling
Resource Cleanup: Exception-safe cleanup operations
β
Dependencies and Integration#
Required Dependencies#
Module |
Purpose |
Integration Point |
|---|---|---|
|
Point clustering |
cluster_obstacles() method β install via |
|
Array operations |
All coordinate manipulations |
|
Debug/status output |
Logger configuration and usage |
|
Low-level vision |
vision_instance dependency |
Note
As of v1.3.2, import ManipulaPy.perception no longer requires
scikit-learn β the import is guarded internally. Only
Perception.cluster_obstacles() (and the
Perception.detect_and_cluster_obstacles() pipeline that calls it)
raise ImportError if the ml extra is missing.
Vision Module Interface#
Perception requires Vision instance with specific methods:
capture_image(): RGB/depth image acquisition
detect_obstacles(): YOLO-based object detection
stereo_enabled: Boolean stereo capability flag
rectify_stereo_images(): Stereo rectification
compute_disparity(): Disparity map computation
get_stereo_point_cloud(): 3D reconstruction
release(): Resource cleanup
β
Error Recovery and Robustness#
Vision Failure Handling#
Missing Depth: Empty array return with warning logs
Detection Failures: None-result handling with error logs
Stereo Unavailability: Runtime exception with clear messaging
Clustering Robustness#
Empty Input: Graceful handling with early return
Insufficient Points: DBSCAN handles minimum sample requirements
Parameter Validation: Implicit through sklearn parameter checking
Resource Management Safety#
Constructor Validation: Immediate failure for invalid Vision instances
Cleanup Exceptions: Silent handling in destructor context
Multiple Cleanup Calls: Safe repeated release() invocation
β
See Also#
Vision API Reference β Low-level computer vision and camera management
Path Planning API Reference β Trajectory planning with obstacle avoidance
Potential Field API Reference β Obstacle-aware potential field methods
Control API Reference β Control systems with environmental feedback
External Dependencies#
scikit-learn β DBSCAN clustering algorithm
NumPy β Numerical array operations
Python logging β Structured logging framework