Introduction
Masked arrays are arrays that may have missing or invalid entries. The numpy.ma module provides a nearly work-alike replacement for NumPy that supports data arrays with masks. When performing operations on masked arrays, invalid values are automatically suppressed from computations.
What Are Masked Arrays?
A masked array consists of two components:
- Data array: The underlying NumPy array containing the actual data
- Mask: A boolean array where True indicates invalid/masked values and False indicates valid values
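The two components can be inspected directly. The following is a minimal sketch, assuming NumPy is installed; the values (including the -999.0 placeholder for a bad entry) are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# Illustrative values; the third entry is treated as invalid.
x = ma.masked_array(data=[1.0, 2.0, -999.0, 4.0],
                    mask=[False, False, True, False])

print(x.data)  # underlying ndarray, masked entries still present
print(x.mask)  # boolean mask, True marks the invalid entry
print(x)       # masked entries are shown with a placeholder
```

Note that masking does not erase the underlying value: the data array still holds -999.0, and only the mask marks it as excluded.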
When to Use Masked Arrays
Masked arrays are particularly useful when:
Missing Data
Your dataset has missing values that should be excluded from calculations
Invalid Measurements
Sensor readings contain outliers or errors (NaN, inf) that need to be ignored
Conditional Analysis
You need to temporarily exclude certain values based on conditions without modifying the original data
Data Quality Control
Processing scientific data where quality flags indicate unreliable measurements
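Two of these cases, quality flags and conditional exclusion, can be sketched briefly. The readings, the flag convention (0 = good, 1 = bad), and the 50.0 threshold are assumptions chosen for illustration, not part of any real dataset.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical sensor readings and per-sample quality flags (0 = good, 1 = bad).
readings = np.array([20.1, 20.4, 85.0, 19.9, 20.2])
quality = np.array([0, 0, 1, 0, 0])

# Data quality control: mask samples whose flag marks them as unreliable.
flagged = ma.masked_where(quality != 0, readings)

# Conditional analysis: mask values above an (assumed) plausibility threshold.
capped = ma.masked_greater(readings, 50.0)

print(flagged.mean())  # mean over the four unflagged readings
print(capped.count())  # number of values that passed the threshold
print(readings)        # the original array is left unchanged
```

Both functions copy their input by default, so the original readings stay intact and the exclusion can be dropped by simply going back to the source array.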
Basic Example
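When an array contains NaN values, a plain mean returns NaN, whereas the mean of the corresponding masked array skips the masked entries.
Key Concepts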
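Before going through the individual concepts, the NaN case from the basic example can be sketched in a few lines. This is a minimal illustration with made-up values, not the original example from this page.

```python
import numpy as np
import numpy.ma as ma

# Illustrative data with two missing readings encoded as NaN.
raw = np.array([1.0, 2.0, np.nan, 4.0, np.nan])

# masked_invalid masks NaN and inf entries.
masked = ma.masked_invalid(raw)

print(raw.mean())      # NaN, because NaN propagates through the plain mean
print(masked.mean())   # mean of the three valid entries
print(masked.count())  # number of unmasked entries
```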
The Mask
The mask is a boolean array with the same shape as the data:
- True: Value is masked (invalid/excluded)
- False: Value is unmasked (valid/included)
- nomask: Special value indicating no elements are masked
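The difference between nomask and a full boolean mask can be seen in a short sketch. It uses `ma.getmask` and `ma.getmaskarray`, which exist in `numpy.ma`; the array contents are arbitrary.

```python
import numpy as np
import numpy.ma as ma

a = ma.array([1, 2, 3])  # created without a mask

# With nothing masked, the stored mask is the special nomask value.
no_mask_initially = ma.getmask(a) is ma.nomask
print(no_mask_initially)

# getmaskarray always returns a full boolean array of the data's shape.
full = ma.getmaskarray(a)
print(full)

# Masking one element replaces nomask with a real boolean array.
a[1] = ma.masked
print(a.mask)
print(a.mask.shape == a.shape)
```

Code that needs to index into the mask is usually safer with `ma.getmaskarray`, since nomask is a scalar and cannot be indexed like an array.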
Fill Values
Fill values are used to replace masked values when converting back to a regular array:
| Data Type | Default Fill Value |
|---|---|
| bool | True |
| int | 999999 |
| float | 1.e20 |
| complex | 1.e20+0j |
| object | '?' |
| string | 'N/A' |
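As a rough sketch of how the defaults in the table show up in practice, the snippet below converts a masked float array back to a regular array with `filled()`. The array values and the explicit fill value 0.0 are illustrative choices.

```python
import numpy as np
import numpy.ma as ma

x = ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

print(x.fill_value)   # default fill value for a float array
print(x.filled())     # masked entry replaced by the default fill value
print(x.filled(0.0))  # masked entry replaced by an explicit value instead

# The default for another dtype can be looked up without creating an array.
print(ma.default_fill_value(np.dtype("int64")))
```

An explicit fill value is often preferable when the result feeds into further arithmetic, since the large float default can easily dominate a sum or mean.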
Hard vs. Soft Masks
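The two kinds of mask differ in what happens when a value is assigned to a masked entry. The following is a minimal sketch with made-up values; `hard_mask=True` and `soften_mask()` are existing `numpy.ma` features, while the variable names are only illustrative.

```python
import numpy as np
import numpy.ma as ma

# Soft mask (the default): assigning a value unmasks the entry.
soft = ma.array([1, 2, 3], mask=[False, True, False])
soft[1] = 20
print(soft)

# Hard mask: assignment to a masked entry is ignored and the entry stays masked.
hard = ma.array([1, 2, 3], mask=[False, True, False], hard_mask=True)
hard[1] = 20
hard_still_masked = bool(hard.mask[1])
print(hard_still_masked)

# After softening the mask, the same assignment takes effect.
hard.soften_mask()
hard[1] = 20
print(hard)
```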
- Soft mask (default): Masked values can be unmasked by assigning new values
- Hard mask: Masked values cannot be unmasked by assignment; the mask must first be softened
Common Operations
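A minimal sketch of how a mask carries through arithmetic and reductions, using made-up values:

```python
import numpy as np
import numpy.ma as ma

a = ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])
b = np.array([10.0, 20.0, 30.0, 40.0])

# Arithmetic with a plain array: the result is masked wherever a is masked.
total = a + b
print(total)

# Reductions skip masked entries.
print(a.sum())
print(a.max())

# compressed() returns only the valid values as a plain 1-D array.
print(a.compressed())
```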
Masked arrays support most NumPy operations, including arithmetic, comparisons, and reductions such as sum and mean.
Performance Considerations
Masked arrays have some overhead compared to regular NumPy arrays:
- Additional memory for the mask array
- Mask checks during operations
- More complex indexing and broadcasting
For performance-critical code, consider these alternatives:
- Using NaN for missing values in float arrays
- Filtering data before computation
- Using specialized libraries like pandas
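The first two alternatives can be compared with the masked-array approach in a short sketch. The data is made up, and no timing is shown, so this only illustrates that the three approaches agree on the result; the pandas option is left out to keep the example free of extra dependencies.

```python
import numpy as np
import numpy.ma as ma

raw = np.array([1.0, np.nan, 3.0, 4.0])

# Masked-array approach: mask the NaN entries, then reduce.
masked_mean = ma.masked_invalid(raw).mean()

# Alternative 1: keep NaN as the missing-value marker and use a NaN-aware reduction.
nan_mean = np.nanmean(raw)

# Alternative 2: filter out the invalid entries before computing.
filtered_mean = raw[~np.isnan(raw)].mean()

print(masked_mean, nan_mean, filtered_mean)
```

Which option is faster depends on array size and the operations involved, so it is worth measuring on representative data before switching.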
See Also
- Creation Functions - Create masked arrays
- Operations - Operations on masked arrays
- numpy.ma.MaskedArray - Base class reference
