Alright, let's define what we mean by a "vector" in the context of linear algebra and machine learning. You've seen that linear algebra is important for representing data and understanding algorithms, and vectors are one of the most fundamental building blocks for this.
Think of a vector in two primary ways:

1. The geometric view: an arrow in space, defined by a length (magnitude) and a direction.
2. The algebraic view: an ordered list of numbers.
Both perspectives are useful, but the algebraic view is what we typically work with when programming and performing calculations, especially in machine learning.
Imagine an arrow drawn on a piece of graph paper, starting at the origin (the point (0,0)) and pointing to some other location, say (3, 4). This arrow has two defining properties:

1. Its length (magnitude): how far it stretches, here √(3² + 4²) = 5 units.
2. Its direction: which way it points.
A vector represented as an arrow starting from the origin (0,0) and pointing to the coordinate (3, 4) in a 2D plane.
This geometric intuition is helpful for visualizing vectors in 2 or 3 dimensions. You can think of vectors as representing quantities like displacement (moving 3 units right and 4 units up), velocity, or force. While we often draw vectors starting from the origin, what truly defines a vector is its magnitude and direction, not its starting point. Two arrows with the same length and direction represent the same vector, regardless of where they are drawn in space.
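As a quick check of the geometric picture, here is a small NumPy sketch (using the (3, 4) example from above) that computes the arrow's length and captures its direction as a unit vector:

```python
import numpy as np

# The vector pointing from the origin (0, 0) to the point (3, 4)
v = np.array([3.0, 4.0])

# Magnitude (Euclidean length): sqrt(3^2 + 4^2) = 5
magnitude = np.linalg.norm(v)
print(magnitude)  # 5.0

# Direction can be summarized as a unit vector (a vector of length 1)
direction = v / magnitude
print(direction)  # [0.6 0.8]
```

Note that translating the arrow elsewhere on the page would not change either of these numbers, which is why the starting point is irrelevant.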
While arrows are great for visualization, they become difficult to work with mathematically and computationally, especially when we move beyond three dimensions (which happens very often in machine learning).
The algebraic view defines a vector as an ordered list of numbers. These numbers are called the components or elements of the vector.
For the arrow in the diagram above, which points from (0,0) to (3,4), the algebraic representation is simply:
$$\begin{bmatrix} 3 \\ 4 \end{bmatrix}$$

Or sometimes written horizontally as [3, 4]. The important part is that it's a list (the numbers 3 and 4) and that the order matters (3 is the first component, 4 is the second).
This list directly corresponds to the coordinates of the endpoint of the geometric arrow when it starts at the origin.
The number of components (n) is the dimension or size of the vector.
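In NumPy terms (a brief sketch; the arrays below extend the running (3, 4) example), the dimension is simply the number of components, and swapping components produces a different vector:

```python
import numpy as np

v2 = np.array([3, 4])      # a 2-dimensional vector (n = 2)
v3 = np.array([3, 4, 7])   # a 3-dimensional vector (n = 3)

print(v2.shape)  # (2,)
print(len(v3))   # 3

# Order matters: [3, 4] and [4, 3] are different vectors
print(np.array_equal(v2, np.array([4, 3])))  # False
```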
In machine learning, we constantly deal with data that has multiple features. Think about predicting house prices. Your features might include:

- Square footage
- Number of bedrooms
- Age of the house (years)
- Distance to the city center (miles)
We can represent a single house using a vector where each component corresponds to one of these features:
$$\text{house\_data} = \begin{bmatrix} 1500 \\ 3 \\ 25 \\ 1.2 \end{bmatrix}$$

This is a 4-dimensional vector (it lives in $\mathbb{R}^4$). We can't easily visualize this as an arrow in 4D space, but the algebraic representation works perfectly. A whole dataset could then be represented as a collection of such vectors (often organized into a matrix, which we'll discuss later).
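As a sketch (the specific feature values and their ordering are illustrative), such a house can be stored as a NumPy array, and several houses stacked into a matrix where each row is one house and each column is one feature:

```python
import numpy as np

# One house: [square footage, bedrooms, age, distance] (illustrative ordering)
house_data = np.array([1500, 3, 25, 1.2])
print(house_data.shape)  # (4,) -> a 4-dimensional vector

# A small dataset: each row is one house, each column one feature
dataset = np.array([
    [1500, 3, 25, 1.2],
    [2100, 4,  5, 3.8],
    [ 950, 2, 40, 0.7],
])
print(dataset.shape)  # (3, 4) -> 3 houses, 4 features each
```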
This algebraic representation allows us to use the tools of linear algebra, implemented efficiently in libraries like NumPy, to perform calculations, analyze relationships between features, and build predictive models.
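For instance, a simple linear price model multiplies each feature by a weight and sums the results, which is exactly a dot product between two vectors. The weights and bias below are made up purely for illustration, not learned from data:

```python
import numpy as np

house = np.array([1500, 3, 25, 1.2])

# Hypothetical weights: price contribution per unit of each feature
# (positive for square footage and bedrooms, negative for age and distance)
weights = np.array([200.0, 10000.0, -500.0, -2000.0])
bias = 50000.0

predicted_price = np.dot(house, weights) + bias
print(predicted_price)  # 365100.0
```

This one-line `np.dot` call replaces an explicit loop over features, and NumPy performs it efficiently even when vectors have thousands of components.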
Now that we have a conceptual understanding of what vectors are, the next step is to learn the standard mathematical notation used to write them down and how to create them using Python's NumPy library.