Welcome to The Coding College, where coding concepts are simplified for everyone! In this article, we’ll discuss splitting arrays in NumPy, a fundamental skill for manipulating data in Python.
What is Splitting Arrays in NumPy?
Splitting arrays involves breaking a larger array into smaller sub-arrays. This is the reverse of joining arrays and is often used in data preprocessing, feature engineering, or dividing datasets into manageable parts.
NumPy provides built-in functions such as split(), array_split(), and others for this purpose.
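1. Using the split() Function
The split() function splits an array into equal parts along a given axis. If the axis length cannot be divided evenly by the requested number of parts, it raises a ValueError.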
Syntax
numpy.split(ary, indices_or_sections, axis=0)
- ary: The array to be split.
- indices_or_sections: Either an integer giving the number of equal parts, or a sorted 1D list of indices giving the positions at which to cut.
- axis: The axis along which the array is split (default is 0).
Example: Splitting a 1D Array
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
splits = np.split(arr, 3)
print(splits)
Output:
[array([1, 2]), array([3, 4]), array([5, 6])]
Example: Splitting a 2D Array
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
splits = np.split(arr, 2, axis=0) # Split along rows
print(splits)
Output:
[array([[1, 2],
[3, 4]]),
array([[5, 6],
[7, 8]])]
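The second argument of split() does not have to be a count. It can also be a list of cut positions, and an integer count that does not divide the axis length evenly is rejected. A minimal sketch of both behaviours (the variable names parts and error_message are only illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# A list of indices marks where each cut happens: arr[:2], arr[2:5], arr[5:]
parts = np.split(arr, [2, 5])
print(parts)  # [array([1, 2]), array([3, 4, 5]), array([6])]

# An integer count must divide the axis length exactly; 6 / 4 does not
try:
    np.split(arr, 4)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

When an uneven division is acceptable, array_split() in the next section is usually the simpler choice.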
2. Using the array_split() Function
The array_split() function is more flexible than split(): when the axis length is not evenly divisible by the number of parts, it returns parts of unequal size instead of raising an error.
Syntax
numpy.array_split(ary, indices_or_sections, axis=0)
- indices_or_sections: The number of parts (sizes may be unequal), or a list of indices at which to cut.
Example: Unequal Splitting of a 1D Array
arr = np.array([1, 2, 3, 4, 5])
splits = np.array_split(arr, 3)
print(splits)
Output:
[array([1, 2]), array([3, 4]), array([5])]
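The sizes of the parts follow a simple rule: for an axis of length l split into n sections, the first l % n parts get l // n + 1 elements and the remaining parts get l // n. A short sketch that checks this on a 10-element array (the array and the name sizes are chosen just for illustration):

```python
import numpy as np

arr = np.arange(10)              # 10 elements: 0..9
parts = np.array_split(arr, 4)

# 10 % 4 = 2 parts of size 10 // 4 + 1 = 3, followed by 2 parts of size 2
sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2, 2]
```

The larger parts always come first, so the last part is never bigger than the others.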
3. Splitting a 2D Array with array_split()
Splitting Rows (Axis = 0)
arr = np.array([[1, 2], [3, 4], [5, 6]])
splits = np.array_split(arr, 2, axis=0)
print(splits)
Output:
[array([[1, 2],
[3, 4]]),
array([[5, 6]])]
Splitting Columns (Axis = 1)
splits = np.array_split(arr, 2, axis=1)
print(splits)
Output:
[array([[1],
[3],
[5]]),
array([[2],
[4],
[6]])]
4. Splitting Using hsplit()
The hsplit() function splits arrays horizontally (column-wise).
Example
arr = np.array([[1, 2, 3], [4, 5, 6]])
splits = np.hsplit(arr, 3)
print(splits)
Output:
[array([[1],
[4]]),
array([[2],
[5]]),
array([[3],
[6]])]
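Like split(), hsplit() also accepts a list of column indices instead of a count. The sketch below assumes a small made-up table and separates its first column from the rest; the names table, first_col, and rest are hypothetical:

```python
import numpy as np

table = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns

# Cut before column index 1: the first column on one side, the other three on the other
first_col, rest = np.hsplit(table, [1])
print(first_col.shape, rest.shape)  # (3, 1) (3, 3)
```

This pattern can be handy when one column holds something different from the others, such as an ID or a label.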
5. Splitting Using vsplit()
The vsplit() function splits arrays vertically (row-wise).
Example
arr = np.array([[1, 2], [3, 4], [5, 6]])
splits = np.vsplit(arr, 3)
print(splits)
Output:
[array([[1, 2]]),
array([[3, 4]]),
array([[5, 6]])]
6. Splitting Using dsplit()
The dsplit() function splits arrays along the third dimension (depth).
Example
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
splits = np.dsplit(arr, 2)
print(splits)
Output:
[array([[[1],
[3]],
[[5],
[7]]]),
array([[[2],
[4]],
[[6],
[8]]])]
Practical Use Cases
- Data Preprocessing: Divide large datasets into smaller parts for analysis.
- Training and Testing: Split data into training and testing sets for machine learning.
- Data Visualization: Manage and visualize segments of data separately.
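As a rough illustration of the training and testing use case, the sketch below shuffles the rows of a small made-up dataset and cuts it at an index. The 80/20 ratio, the seed, and the sample data are all assumptions for the example, not a recommendation:

```python
import numpy as np

data = np.arange(20).reshape(10, 2)    # 10 hypothetical samples, 2 features each

rng = np.random.default_rng(seed=0)    # fixed seed so the shuffle is repeatable
shuffled = rng.permutation(data)       # shuffles the rows and returns a copy

cut = int(0.8 * len(shuffled))         # assumed 80/20 ratio, so cut = 8
train, test = np.split(shuffled, [cut])
print(train.shape, test.shape)  # (8, 2) (2, 2)
```

Passing a one-element index list to split() always yields exactly two parts, so the row count does not need to be evenly divisible.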
Comparison of Splitting Functions
Function | Description | Use Case |
---|---|---|
split() | Splits into equal parts only. | When the axis length is evenly divisible. |
array_split() | Splits into nearly equal parts; sizes may differ. | When the axis length is not evenly divisible. |
hsplit() | Splits arrays horizontally (column-wise). | 2D arrays with columns. |
vsplit() | Splits arrays vertically (row-wise). | 2D arrays with rows. |
dsplit() | Splits arrays along depth (third dimension). | 3D arrays. |
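For arrays with at least three dimensions, the three convenience functions in the table behave like split() with a fixed axis: vsplit() uses axis 0, hsplit() uses axis 1, and dsplit() uses axis 2. A small sketch that compares the results on a 2x2x2 array (the names cube, pairs, and results are illustrative):

```python
import numpy as np

cube = np.arange(8).reshape(2, 2, 2)

# Each pair holds the convenience call and the split() call it should match
pairs = [
    (np.vsplit(cube, 2), np.split(cube, 2, axis=0)),
    (np.hsplit(cube, 2), np.split(cube, 2, axis=1)),
    (np.dsplit(cube, 2), np.split(cube, 2, axis=2)),
]

# Compare the sub-arrays of each pair element by element
results = [
    all(np.array_equal(a, b) for a, b in zip(left, right))
    for left, right in pairs
]
print(results)  # [True, True, True]
```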
Summary
Understanding how to split arrays is essential for efficient data manipulation in Python. Functions like split(), array_split(), hsplit(), and others offer flexibility to divide arrays based on your requirements.
For more insights and tutorials, visit The Coding College and advance your coding skills today!