Introduction to NumPy and Data Manipulation

This lesson introduces NumPy, the fundamental library for numerical computation in Python. You'll learn how to create and manipulate arrays, perform basic mathematical operations, and understand the core functionalities of NumPy that are essential for data science.

Learning Objectives

  • Understand the purpose and benefits of using NumPy.
  • Create and manipulate NumPy arrays of various dimensions.
  • Perform basic mathematical operations on NumPy arrays.
  • Use indexing and slicing to access and modify array elements.

Lesson Content

Introduction to NumPy

NumPy (Numerical Python) is a library for working efficiently with numerical data in Python. Its core data structure is the ndarray (n-dimensional array), a grid of values that all share one data type. Because the elements are stored compactly with a single type and operations on them run in compiled code, NumPy arrays are typically much faster and more memory-efficient than Python lists for numerical computations. To use NumPy, first import it:

import numpy as np

np is the conventional alias for NumPy; it keeps references to the library short in your code.
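
To make the "same type" and "whole-array operations" points concrete, here is a minimal sketch. The variable names and sample values are illustrative assumptions, not part of the lesson's own examples.

```python
import numpy as np

# An ndarray stores one type for all elements, so the integers are upcast to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                 # float64

# Arithmetic on a list needs an explicit loop or comprehension...
prices = [10.0, 20.0, 30.0]        # hypothetical sample data
doubled_list = [p * 2 for p in prices]

# ...while an ndarray applies the operation to every element at once.
doubled_array = np.array(prices) * 2

print(doubled_list)                # [20.0, 40.0, 60.0]
print(doubled_array)               # [20. 40. 60.]
```

Both approaches give the same numbers; the array version avoids the Python-level loop, which is where most of the speed difference comes from on large inputs.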

Creating NumPy Arrays

You can create NumPy arrays in several ways:

  • From Lists:
    import numpy as np
    my_list = [1, 2, 3, 4, 5]
    my_array = np.array(my_list)
    print(my_array)  # Output: [1 2 3 4 5]

  • Using np.zeros(): Creates an array filled with zeros.
    zeros_array = np.zeros(5)  # Creates an array of 5 zeros.
    print(zeros_array)  # Output: [0. 0. 0. 0. 0.]

  • Using np.ones(): Creates an array filled with ones.
    ones_array = np.ones((2, 3))  # Creates a 2x3 array of ones.
    print(ones_array)
    # Output:
    # [[1. 1. 1.]
    #  [1. 1. 1.]]

  • Using np.arange(): Creates an array with a range of values, similar to Python's range().
    range_array = np.arange(0, 10, 2)  # Start, Stop (excluded), Step
    print(range_array)  # Output: [0 2 4 6 8]

  • Using np.linspace(): Creates an array with a specified number of elements, evenly spaced between a start and end value.
    linspace_array = np.linspace(0, 1, 5)  # Start, Stop (included), Number of elements
    print(linspace_array)  # Output: [0.   0.25 0.5  0.75 1.  ]
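
Two details of the creation functions above are easy to miss: np.arange() leaves out the stop value while np.linspace() includes it, and np.zeros() and np.ones() produce floats unless told otherwise. The sketch below illustrates both; the variable names are assumptions chosen for this example.

```python
import numpy as np

# np.arange excludes the stop value, like range().
stepped = np.arange(0, 1, 0.25)
print(stepped)                  # [0.   0.25 0.5  0.75]

# np.linspace includes the stop value by default.
spaced = np.linspace(0, 1, 5)
print(spaced)                   # [0.   0.25 0.5  0.75 1.  ]

# np.zeros and np.ones return floats unless a dtype is given.
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.dtype.kind)     # i (an integer type)
```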

Array Attributes

NumPy arrays have useful attributes:

  • .shape: Returns a tuple representing the dimensions of the array. For a 2x3 array, it would be (2, 3).
  • .dtype: Shows the data type of the elements in the array (e.g., int64, float64).
  • .ndim: Returns the number of dimensions of the array (1 for a vector, 2 for a matrix, etc.).

import numpy as np
my_array = np.array([[1, 2, 3], [4, 5, 6]])
print("Shape:", my_array.shape) # Output: Shape: (2, 3)
print("Data type:", my_array.dtype) # Output: Data type: int64 (int32 on some platforms)
print("Number of dimensions:", my_array.ndim) # Output: Number of dimensions: 2
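
As a further illustration, the sketch below prints the same attributes for arrays of one, two, and three dimensions. It also uses .size (the total number of elements) and .reshape(), which the lesson has not introduced; both are standard ndarray features, and the variable names are assumptions for this example.

```python
import numpy as np

vector = np.arange(6)            # 1-D: six elements
matrix = vector.reshape(2, 3)    # 2-D: the same data arranged as 2 rows x 3 columns
cube = np.zeros((2, 3, 4))       # 3-D: 2 blocks of 3 rows x 4 columns

for name, arr in [("vector", vector), ("matrix", matrix), ("cube", cube)]:
    print(name, arr.shape, arr.ndim, arr.size)
# vector (6,) 1 6
# matrix (2, 3) 2 6
# cube (2, 3, 4) 3 24
```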

Array Indexing and Slicing

You can access elements in a NumPy array using indexing and slicing, similar to Python lists but with added flexibility for multi-dimensional arrays.

  • Indexing: Accessing a single element.
    import numpy as np
    my_array = np.array([10, 20, 30, 40, 50])
    print(my_array[0])  # Output: 10 (first element)
    print(my_array[2])  # Output: 30 (third element)

  • Slicing: Accessing a range of elements.
    import numpy as np
    my_array = np.array([10, 20, 30, 40, 50])
    print(my_array[1:4])  # Output: [20 30 40] (elements from index 1 to 3)
    print(my_array[:3])   # Output: [10 20 30] (elements from the beginning to index 2)
    print(my_array[2:])   # Output: [30 40 50] (elements from index 2 to the end)

  • Multi-dimensional Arrays: For 2D arrays (matrices), you can use comma-separated indexing:
    import numpy as np
    my_matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(my_matrix[0, 1])  # Output: 2 (element in the first row, second column)
    print(my_matrix[1:, :2])  # Rows from index 1 onwards, columns 0 and 1
    # Output:
    # [[4 5]
    #  [7 8]]
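
The same index and slice syntax also works on the left side of an assignment, which is how array elements are modified. The sketch below shows this and one behavior that differs from Python lists: a slice of a NumPy array is a view of the original data, not a copy. The array name and values are assumptions for illustration.

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])   # hypothetical sample data

scores[0] = 99          # replace a single element
scores[1:3] = 0         # assign one value to a whole slice
print(scores)           # [99  0  0 40 50]

# A slice is a view: writing to it changes the original array.
tail = scores[3:]
tail[0] = -1
print(scores)           # [99  0  0 -1 50]

# Use .copy() when an independent array is needed.
safe = scores[3:].copy()
safe[0] = 7
print(scores)           # [99  0  0 -1 50] (unchanged)
```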

Basic Array Operations

NumPy allows you to perform mathematical operations on arrays easily.

  • Arithmetic Operations: Operations are performed element-wise.
    import numpy as np
    array1 = np.array([1, 2, 3])
    array2 = np.array([4, 5, 6])
    print(array1 + array2)  # Output: [5 7 9] (element-wise addition)
    print(array1 * array2)  # Output: [ 4 10 18] (element-wise multiplication)
    print(array1 - array2)  # Output: [-3 -3 -3]
    print(array1 / array2)  # Output: [0.25 0.4  0.5 ]

  • Broadcasting: NumPy can operate on arrays of different shapes as long as the shapes are compatible: compared from the last dimension backward, each pair of sizes must be equal or one of them must be 1. A scalar is compatible with any array.
    import numpy as np
    array1 = np.array([1, 2, 3])
    scalar = 2
    print(array1 * scalar)  # Output: [2 4 6] (scalar is broadcast to array's shape)

  • Aggregate Functions: NumPy provides functions like sum(), mean(), std(), min(), max(), etc.
    import numpy as np
    my_array = np.array([1, 2, 3, 4, 5])
    print("Sum:", my_array.sum())    # Output: Sum: 15
    print("Mean:", my_array.mean())  # Output: Mean: 3.0
    print("Min:", my_array.min())    # Output: Min: 1
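
Broadcasting and aggregation are often used together on 2D data. The sketch below is one possible illustration, assuming a small made-up dataset: it computes per-column means with the axis argument (not covered in the lesson, but a standard parameter of these aggregation methods) and then subtracts them from every row.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) x 2 features (columns).
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

col_means = data.mean(axis=0)   # aggregate down the rows: one mean per column
row_sums = data.sum(axis=1)     # aggregate across the columns: one sum per row
print(col_means)                # [ 2. 20.]
print(row_sums)                 # [11. 22. 33.]

# Shapes (3, 2) and (2,) are compatible, so col_means is broadcast across every row.
centered = data - col_means
print(centered.tolist())        # [[-1.0, -10.0], [0.0, 0.0], [1.0, 10.0]]
```

Without axis, the aggregation runs over all elements and returns a single value, as in the one-dimensional example above.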
