NumPy Splitting Arrays

Welcome to The Coding College, where coding concepts are simplified for everyone! In this article, we’ll discuss splitting arrays in NumPy, a fundamental skill for manipulating data in Python.

What is Splitting Arrays in NumPy?

Splitting arrays involves breaking a larger array into smaller sub-arrays. This is the reverse of joining arrays and is often used in data preprocessing, feature engineering, or dividing datasets into manageable parts.

NumPy provides built-in functions such as split(), array_split(), and others for this purpose.
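Since splitting reverses joining, a quick way to see the relationship is a round trip: split an array, then join the parts again. The sketch below is a minimal illustration; the array contents and variable names are chosen only for this example.

```python
import numpy as np

arr = np.arange(12)               # [0, 1, ..., 11]
parts = np.split(arr, 3)          # three sub-arrays of four elements each
restored = np.concatenate(parts)  # join the parts back together

print([p.tolist() for p in parts])
print(np.array_equal(arr, restored))  # True: joining the parts restores the original
```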

1. Using the split() Function

The split() function splits an array into equal parts. If the array cannot be divided evenly along the chosen axis, split() raises a ValueError.

Syntax

numpy.split(ary, indices_or_sections, axis=0)
  • ary: The array to be split.
  • indices_or_sections: An integer N splits the array into N equal parts; a sorted list of indices splits the array at those positions instead.
  • axis: The axis along which the array is split (default is 0).

Example: Splitting a 1D Array

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
splits = np.split(arr, 3)
print(splits)

Output:

[array([1, 2]), array([3, 4]), array([5, 6])]
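Besides a number of sections, split() also accepts a list of indices as its second argument, in which case the array is cut at those positions and the parts need not be the same size. A short sketch using the same array (the cut points 2 and 5 are arbitrary choices for illustration):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Cut before index 2 and before index 5: arr[:2], arr[2:5], arr[5:]
parts = np.split(arr, [2, 5])
print([p.tolist() for p in parts])  # [[1, 2], [3, 4, 5], [6]]
```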

Example: Splitting a 2D Array

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
splits = np.split(arr, 2, axis=0)  # Split along rows
print(splits)

Output:

[array([[1, 2],
        [3, 4]]), 
 array([[5, 6],
        [7, 8]])]
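The "equal parts" requirement is strict. As a small sketch of what happens when it is not met, the code below asks split() for three parts of a five-element array and catches the resulting error (the `raised` flag exists only to make the outcome easy to check):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

try:
    np.split(arr, 3)  # 5 elements cannot be divided into 3 equal parts
    raised = False
except ValueError as exc:
    raised = True
    print("split() failed:", exc)
```

When an uneven division is acceptable, array_split() is the usual alternative.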

2. Using the array_split() Function

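The array_split() function is more flexible than split(): when the array cannot be divided evenly, it does not raise an error. Instead, it returns parts of nearly equal size, with the earlier parts receiving one extra element each.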

Syntax

numpy.array_split(ary, indices_or_sections, axis=0)
  • indices_or_sections: The number of parts (sizes may differ by one), or a list of indices at which to split.

Example: Unequal Splitting of a 1D Array

arr = np.array([1, 2, 3, 4, 5])
splits = np.array_split(arr, 3)
print(splits)

Output:

[array([1, 2]), array([3, 4]), array([5])]
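To see how array_split() distributes the leftover elements, it can help to look only at the sizes of the parts. The following sketch uses arbitrary example arrays; it also shows that asking for more parts than there are elements yields empty sub-arrays rather than an error.

```python
import numpy as np

# 10 elements into 3 parts: the first part gets the extra element
sizes = [len(p) for p in np.array_split(np.arange(10), 3)]
print(sizes)  # [4, 3, 3]

# More parts than elements: the trailing part is empty
tiny_sizes = [len(p) for p in np.array_split(np.arange(2), 3)]
print(tiny_sizes)  # [1, 1, 0]
```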

3. Splitting a 2D Array with array_split()

Splitting Rows (Axis = 0)

arr = np.array([[1, 2], [3, 4], [5, 6]])
splits = np.array_split(arr, 2, axis=0)
print(splits)

Output:

[array([[1, 2],
        [3, 4]]), 
 array([[5, 6]])]

Splitting Columns (Axis = 1)

# arr is the 3x2 array defined in the previous example
splits = np.array_split(arr, 2, axis=1)
print(splits)

Output:

[array([[1],
        [3],
        [5]]), 
 array([[2],
        [4],
        [6]])]

4. Splitting Using hsplit()

The hsplit() function splits arrays horizontally (column-wise).

Example

arr = np.array([[1, 2, 3], [4, 5, 6]])
splits = np.hsplit(arr, 3)
print(splits)

Output:

[array([[1],
        [4]]), 
 array([[2],
        [5]]), 
 array([[3],
        [6]])]
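For arrays with two or more dimensions, hsplit() behaves like split() with axis=1. The sketch below checks that equivalence on the same 2x3 array and prints the shape of each part; the helper variable names are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

h_parts = np.hsplit(arr, 3)
s_parts = np.split(arr, 3, axis=1)

# Compare the two results part by part
same = all(np.array_equal(h, s) for h, s in zip(h_parts, s_parts))
shapes = [p.shape for p in h_parts]

print(same)    # True
print(shapes)  # [(2, 1), (2, 1), (2, 1)]
```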

5. Splitting Using vsplit()

The vsplit() function splits arrays vertically (row-wise).

Example

arr = np.array([[1, 2], [3, 4], [5, 6]])
splits = np.vsplit(arr, 3)
print(splits)

Output:

[array([[1, 2]]), 
 array([[3, 4]]), 
 array([[5, 6]])]
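Likewise, vsplit() corresponds to split() with axis=0, with one extra restriction: it only works on arrays with at least two dimensions. The sketch below checks both points; the `one_d_ok` flag is just a convenience for this example.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])

v_parts = np.vsplit(arr, 3)
s_parts = np.split(arr, 3, axis=0)
same = all(np.array_equal(v, s) for v, s in zip(v_parts, s_parts))
print(same)  # True

# vsplit() rejects 1D input
try:
    np.vsplit(np.array([1, 2, 3]), 3)
    one_d_ok = True
except ValueError as exc:
    one_d_ok = False
    print("vsplit() failed:", exc)
```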

6. Splitting Using dsplit()

The dsplit() function splits arrays along the third dimension (depth).

Example

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
splits = np.dsplit(arr, 2)
print(splits)

Output:

[array([[[1],
         [3]],
        [[5],
         [7]]]), 
 array([[[2],
         [4]],
        [[6],
         [8]]])]
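Nested 3D output can be hard to read, so checking shapes is often clearer. As a rough sketch using the same array, the code below confirms that each part keeps the first two dimensions and that dsplit() matches split() with axis=2.

```python
import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # shape (2, 2, 2)

d_parts = np.dsplit(arr, 2)
s_parts = np.split(arr, 2, axis=2)

shapes = [p.shape for p in d_parts]
same = all(np.array_equal(d, s) for d, s in zip(d_parts, s_parts))

print(shapes)  # [(2, 2, 1), (2, 2, 1)]
print(same)    # True
```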

Practical Use Cases

  1. Data Preprocessing: Divide large datasets into smaller parts for analysis.
  2. Training and Testing: Split data into training and testing sets for machine learning.
  3. Data Visualization: Manage and visualize segments of data separately.
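The first two use cases can be sketched with the functions covered above. The dataset here is hypothetical (10 samples with 2 features), as are the 80/20 ratio and the batch count; real workflows typically shuffle the rows before splitting, which this minimal example omits.

```python
import numpy as np

data = np.arange(20).reshape(10, 2)  # hypothetical dataset: 10 samples, 2 features

# Training and testing: cut once at the 80% mark along the rows
split_at = int(0.8 * len(data))
train, test = np.split(data, [split_at])

# Preprocessing: divide the training rows into 3 batches of nearly equal size
batches = np.array_split(train, 3)

print(train.shape, test.shape)    # (8, 2) (2, 2)
print([len(b) for b in batches])  # [3, 3, 2]
```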

Comparison of Splitting Functions

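Function | Description | Use Case
split() | Splits into equal parts only; raises ValueError otherwise. | When the axis length is evenly divisible by the number of parts.
array_split() | Splits into nearly equal parts; sizes may differ by one. | When the axis length is not evenly divisible.
hsplit() | Splits arrays horizontally (column-wise). | Dividing 2D arrays by columns.
vsplit() | Splits arrays vertically (row-wise). | Dividing 2D arrays by rows.
dsplit() | Splits arrays along depth (third dimension). | Arrays with three or more dimensions.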

Summary

Understanding how to split arrays is essential for efficient data manipulation in Python. Functions like split(), array_split(), hsplit(), and others offer flexibility to divide arrays based on your requirements.

For more insights and tutorials, visit The Coding College and advance your coding skills today!
