# 70 Core Linear Algebra Interview Questions in 2025


#### You can also find all 70 answers here 👉 [Devinterview.io - Linear Algebra](https://devinterview.io/questions/machine-learning-and-data-science/linear-algebra-interview-questions)

## 1. What is a _vector_ and how is it used in _machine learning_?

In machine learning, **vectors** are essential for representing diverse types of data, including numerical, categorical, and text data.

They also support fundamental operations such as addition and multiplication by a scalar.

### What is a Vector?

A **vector** is a tuple of one or more values, known as its components. Each component can be a number, a category, or a more abstract entity. In **machine learning**, vectors are commonly represented as one-dimensional arrays.

#### Types of Vectors

- **Row Vector**: Has only one row.
- **Column Vector**: Has only one column.

Here is a small Python example to experiment with:

```python
# Define a row vector with 3 components
row_vector = [1, 2, 3]

# Define a column vector with 3 components
column_vector = [[1],
                 [2],
                 [3]]

# Print the vectors
print("Row Vector:", row_vector)
print("Column Vector:", column_vector)
```

### Common Vector Operations in Machine Learning

#### Addition

Corresponding elements are added:

$$
\begin{bmatrix}
1 \\
2 \\
3
\end{bmatrix} +
\begin{bmatrix}
4 \\
5 \\
6
\end{bmatrix} =
\begin{bmatrix}
5 \\
7 \\
9
\end{bmatrix}
$$

#### Dot Product

The sum of the products of corresponding elements:

$$
\begin{bmatrix}
1 & 2 & 3
\end{bmatrix}
\cdot
\begin{bmatrix}
4 \\
5 \\
6
\end{bmatrix} =
1 \times 4 + 2 \times 5 + 3 \times 6 = 32
$$

#### Multiplying by a Scalar

Each element is multiplied by the scalar:

$$
2 \times
\begin{bmatrix}
1 \\
2 \\
3
\end{bmatrix} =
\begin{bmatrix}
2 \\
4 \\
6
\end{bmatrix}
$$

#### Length (Magnitude)

The Euclidean length is the square root of the sum of the squares of the components:

$$
\left\|
\begin{bmatrix}
1 \\
2 \\
3
\end{bmatrix}
\right\| =
\sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}
$$
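
To tie these operations together, here is a minimal NumPy sketch (assuming NumPy is installed, as in the later examples) reproducing the four results above:

```python
import numpy as np

# Two example vectors
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Addition:", a + b)               # [5 7 9]
print("Dot product:", np.dot(a, b))     # 32
print("Scalar multiple:", 2 * a)        # [2 4 6]
print("Magnitude:", np.linalg.norm(a))  # sqrt(14) ~= 3.742
```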

## 2. Explain the difference between a _scalar_ and a _vector_.

**Scalars** are single real numbers, often used as coefficients in linear algebra equations.

**Vectors**, on the other hand, are multi-dimensional objects that have both a magnitude and a specific direction in a coordinate space. In machine learning, vectors commonly represent observations or features of the data, such as measurements, feature values, or the coefficients of a linear model.

### Key Distinctions

#### Dimensionality

- **Scalar**: Represents a single value and has no direction.

- **Vector**: Defines a direction and magnitude in a multi-dimensional space.

#### Components

- **Scalar**: Stands alone and has no components; it can be viewed as a 0-dimensional quantity.

- **Vector**: Consists of elements called components, which give the vector's magnitude along each coordinate direction.

#### Mathematical Formulation

- **Scalar**: Denoted by a lower-case italicized letter
  ![equation](https://latex.codecogs.com/gif.latex?x)

- **Vector**: Typically represented by a lowercase bold letter (e.g.,
  ![equation](https://latex.codecogs.com/gif.latex?\mathbf{v})) or with an arrow over the variable (
  ![equation](https://latex.codecogs.com/gif.latex?\vec{v})). Its components can be written as a column matrix
  ![equation](https://latex.codecogs.com/gif.latex?v&space;=&space;\begin{bmatrix}&space;v_1&space;\\&space;v_2&space;\\&space;\ldots&space;\\&space;v_n&space;\end{bmatrix}) or, transposed, as a row vector
  ![equation](https://latex.codecogs.com/gif.latex?v&space;=&space;[v_1,&space;v_2,&space;\ldots,&space;v_n])

#### Visualization in 3D Space

- **Scalar**: Has no spatial extent; it is just a magnitude.

- **Vector**: Extends from the origin to a specific point in 3D space, effectively defining a directed line segment.
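
As a small illustration of the distinction, the following sketch (NumPy assumed) contrasts a scalar and a vector and shows the scalar acting on the vector:

```python
import numpy as np

scalar = 5.0                   # a single value: magnitude only, no direction
vector = np.array([3.0, 4.0])  # magnitude and direction, shape (2,)

print(np.ndim(scalar))         # 0  -> a scalar is zero-dimensional
print(vector.shape)            # (2,)
print(scalar * vector)         # [15. 20.]  -> a scalar scales a vector
print(np.linalg.norm(vector))  # 5.0        -> the vector's magnitude
```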

## 3. What is a _matrix_ and why is it central to _linear algebra_?

At the heart of linear algebra lies the concept of **matrices**, which serve as a compact, efficient way to represent and manipulate linear transformations.

### Essential Matrix Operations

- **Addition and Subtraction**: As in ordinary arithmetic, matrix addition and subtraction are performed element-wise.

- **Scalar Multiplication**: Each element of the matrix is multiplied by the scalar.

- **Matrix Multiplication**: Denoted $C = AB$, where $A$ is $m \times n$ and $B$ is $n \times p$; each element of the $m \times p$ matrix $C$ is the dot product of a row of $A$ with a column of $B$.

- **Transpose**: Flips the matrix over its main diagonal, turning its rows into columns.

- **Inverse**: For a square matrix $A$, if there exists a matrix $B$ such that $AB = BA = I$, then $B$ is the inverse of $A$.

### Two Perspectives on Operations

1. **Machine Perspective**: Matrices are seen as a sequence of transformations, with emphasis on matrix multiplication. This viewpoint is prevalent in computer graphics and related fields.

2. **Data Perspective**: Vectors comprise the individual components of a system, and matrices parameterize how those vectors change.

### Visual Representation

The **Cartesian coordinate system** makes the transformations encoded by matrices easy to visualize. For example:

- **Reflection**: The 2D matrix

  ![equation](https://latex.codecogs.com/gif.latex?\begin{bmatrix}&space;1&space;&&space;0&space;\\&space;0&space;&&space;-1&space;\end{bmatrix}) flips the y-component.

- **Rotation**: The 2D matrix

  ![equation](https://latex.codecogs.com/gif.latex?\begin{bmatrix}&space;\cos(\theta)&space;&&space;-\sin(\theta)&space;\\&space;\sin(\theta)&space;&&space;\cos(\theta)&space;\end{bmatrix}) rotates points by

  ![equation](https://latex.codecogs.com/gif.latex?\theta) radians.

- **Scaling**: The 2D matrix

  ![equation](https://latex.codecogs.com/gif.latex?\begin{bmatrix}&space;k&space;&&space;0&space;\\&space;0&space;&&space;k&space;\end{bmatrix}) scales points by a factor of

  ![equation](https://latex.codecogs.com/gif.latex?k) in both dimensions.

### Applications in Multiple Domains

#### Computer Science

- **Graphics Systems**: Matrices convert vertices from model space to world space and perform perspective projection.

- **Data Science**: Principal Component Analysis (PCA) relies on the eigendecomposition of covariance matrices.

#### Physics

- **Quantum Mechanics**: Operators (such as Hamiltonians) associated with physical observables are represented as matrices.

- **Classical Mechanics**: Systems of linear equations describe atmospheric pressure, fluid dynamics, and more.

#### Engineering

- **Control Systems**: Transmitting electrical signals or managing mechanical loads can be modeled using state-space or transfer-function representations, which rely on matrices.

- **Optimization**: The well-known least-squares method solves linear systems, often written as matrix equations.

#### Business and Economics

- **Markov Chains**: Transition probabilities between states (for example, customer choices or stock performance) are stored and manipulated as matrices.

#### Film and Animation

- **Rotoscoping and CGI**: In hand-drawn animation and modern CGI alike, matrices drive the transformations and movements of characters and objects.
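
As a quick illustration of the rotation matrix from the visual-representation section above, here is a minimal NumPy sketch (assumptions: NumPy is available, and a 90-degree rotation is chosen purely for readability):

```python
import numpy as np

theta = np.pi / 2  # rotate by 90 degrees

# The 2D rotation matrix shown above
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

point = np.array([1.0, 0.0])   # a point on the x-axis
rotated = R @ point

print(rotated)  # approximately [0., 1.]: the point is rotated onto the y-axis
```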

## 4. Explain the concept of a _tensor_ in the context of _machine learning_.

In **machine learning**, a **tensor** is a **generalization** of scalars, vectors, and matrices to higher dimensions. It is the primary data structure you'll work with across frameworks like TensorFlow, PyTorch, and Keras.

### Tensor Basics

- **Scalar**: A single number, often a real or complex value.
- **Vector**: An ordered array of numbers, representing a direction in space. Vectors in $\mathbb{R}^n$ have `n` components.
- **Matrix**: A 2D grid of numbers representing linear transformations and relationships between vectors.
- **Higher-Dimensional Tensors**: Generalize beyond 1D (vectors) and 2D (matrices) and are crucial in deep learning for handling multi-dimensional structured data.

### Key Features of Tensors

- **Data Representation**: Tensors conveniently represent multi-dimensional data, such as time series, text sequences, and images.
- **Flexibility in Operations**: They support algebraic operations such as addition, multiplication, and more, thanks to their defined shape and type.
- **Memory Management**: Modern frameworks manage the underlying memory, facilitating computational efficiency.
- **Speed and Parallel Processing**: Tensor computations can be accelerated on hardware such as GPUs and TPUs.

### Code Example: Tensors in TensorFlow

Here is the Python code:

```python
import tensorflow as tf

# Creating scalars, vectors, and matrices
scalar = tf.constant(3)
vector = tf.constant([1, 2, 3])
matrix = tf.constant([[1, 2], [3, 4]])

# Accessing shapes of the created objects
print(scalar.shape)  # Outputs: ()
print(vector.shape)  # Outputs: (3,)
print(matrix.shape)  # Outputs: (2, 2)

# Element-wise operations
double_vector = vector * 2  # tf.constant([2, 4, 6])

# Reshaping
reshaped_matrix = tf.reshape(matrix, shape=(1, 4))
```

### Real-World Data Use Cases

- **Time-Series Data**: Capture events at distinct time points.
- **Text Sequences**: Model relationships in sentences or documents.
- **Images**: Store and process pixel values, typically as height × width × channel arrays.
- **Videos and Beyond**: Handle multi-dimensional data such as sequences of video frames.

Beyond deep learning, tensors find applications in physics, engineering, and other computational fields due to their ability to represent complex, multi-dimensional phenomena.
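
To make higher-rank tensors concrete, here is a small NumPy sketch; the shapes are illustrative examples, not values prescribed by the text:

```python
import numpy as np

# A batch of 32 grayscale images, each 28x28 pixels: a rank-3 tensor
images = np.zeros((32, 28, 28))

# A batch of 16 sentences, 50 tokens each, embedded in 300 dimensions: rank-3
text_batch = np.zeros((16, 50, 300))

# A short video clip as (frames, height, width, channels): rank-4
video = np.zeros((24, 64, 64, 3))

print(images.ndim, text_batch.ndim, video.ndim)  # 3 3 4
```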

## 5. How do you perform _matrix addition_ and _subtraction_?

**Matrix addition** is an operation between two matrices of the same order $(m \times n)$. The result is a matrix of the same order in which the corresponding elements of the two input matrices are added. **Matrix subtraction** works the same way, with corresponding elements subtracted instead.

### Algebraic Representation

Given two matrices:

$$
A=
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix}
$$

and

$$
B=
\begin{bmatrix}
b_{11} & b_{12} & \ldots & b_{1n} \\
b_{21} & b_{22} & \ldots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m1} & b_{m2} & \ldots & b_{mn}
\end{bmatrix}
$$

the sum of $A$ and $B$, denoted $A + B$, is:

$$
A + B =
\begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \ldots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \ldots & a_{2n} + b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \ldots & a_{mn} + b_{mn}
\end{bmatrix}
$$

In code, using Python and NumPy:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[7, 8, 9], [10, 11, 12]])

# Element-wise addition and subtraction
sum_result = A + B
difference = A - B

print("A + B:\n", sum_result)
print("A - B:\n", difference)
```

## 6. What are the properties of _matrix multiplication_?

**Matrix multiplication** is characterized by several fundamental properties, each playing a role in the practical application of both linear algebra and machine learning.

### Core Properties

#### Closure

The product $AB$ of matrices $A$ and $B$ is defined whenever the number of columns of $A$ equals the number of rows of $B$, and the result is again a matrix:

$$
\begin{aligned}
A & : m \times n \\
B & : n \times p \\
AB & : m \times p
\end{aligned}
$$

#### Associativity

Matrix multiplication is associative; the grouping of the factors does not affect the result:

$$
A(BC) = (AB)C
$$

#### Non-Commutativity

In general, **matrix multiplication** is not commutative:

$$
AB \neq BA
$$

Even when both products are defined and have the same shape, $AB$ and $BA$ usually differ; only special pairs of matrices (for example, two diagonal matrices of the same size, or a matrix and its own powers) commute.

#### Distributivity

Matrix multiplication distributes over addition and subtraction:

$$
A(B \pm C) = AB \pm AC
$$

### Additional Properties

#### Identity Matrix

Multiplying a matrix by the **identity matrix** $I$ returns the original matrix:

$$
AI = IA = A
$$

#### Zero Matrix

Multiplying any matrix by a **zero matrix** yields a zero matrix:

$$
0A = A0 = 0
$$

#### Inverse Matrix

If an inverse exists, $AA^{-1} = A^{-1}A = I$. However, not all matrices have **multiplicative inverses**, and care must be taken when computing them.

#### Transpose

The transpose of a product reverses the order of the factors:

$$
(AB)^T = B^TA^T
$$
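
A brief NumPy sketch (NumPy assumed) that checks a few of these properties numerically:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 2]])

# Associativity: A(BC) == (AB)C
print(np.array_equal(A @ (B @ C), (A @ B) @ C))  # True

# Non-commutativity: AB and BA generally differ
print(np.array_equal(A @ B, B @ A))              # False

# Transpose of a product reverses the order: (AB)^T == B^T A^T
print(np.array_equal((A @ B).T, B.T @ A.T))      # True
```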

## 7. Define the _transpose_ of a _matrix_.

The **transpose** of a matrix is generated by swapping its rows and columns. For any matrix $\mathbf{A}$ with elements $a_{ij}$, the transpose is denoted as $\mathbf{A}^T$ and its elements are $a_{ji}$. In other words, if matrix $\mathbf{A}$ has dimensions $m \times n$, the transpose $\mathbf{A}^T$ will have dimensions $n \times m$.

### Transposition Properties

- **Self-Inverse**: $(\mathbf{A}^T)^T = \mathbf{A}$
- **Operation Consistency**:
  - For a constant $c$: $(c \mathbf{A})^T = c \mathbf{A}^T$
  - For two conformable matrices $\mathbf{A}$ and $\mathbf{B}$: $(\mathbf{A} + \mathbf{B})^T = \mathbf{A}^T + \mathbf{B}^T$

### Code Example: Matrix Transposition

Here is the Python code:

```python
import numpy as np

# Create a sample matrix
A = np.array([[1, 2, 3], [4, 5, 6]])
print("Original Matrix A:\n", A)

# Transpose the matrix using NumPy
A_transpose = np.transpose(A)  # or A.T
print("Transpose of A:\n", A_transpose)
```

## 8. Explain the _dot product_ of two _vectors_ and its significance in _machine learning_.

In machine learning, the **dot product** has numerous applications, ranging from basic data transformations to sophisticated algorithms like PCA and neural networks.

### Visual Representation

The dot product $\mathbf{a} \cdot \mathbf{b}$ measures how far one vector $\mathbf{a}$ "reaches" in the direction of another vector $\mathbf{b}$.

![Dot Product](https://firebasestorage.googleapis.com/v0/b/dev-stack-app.appspot.com/o/linear-algebra%2Fdot-product.jpg?alt=media&token=4a14aa5d-6a70-4c90-a05e-4e56cc1bbde1)

### Notable Matrix and Vector Operations Derived from the Dot Product

#### Norm

The norm or magnitude of a vector can be obtained from the dot product:

$$
\lVert \mathbf{a} \rVert = \sqrt{\mathbf{a} \cdot \mathbf{a}}
$$

This forms the basis for Euclidean distance and algorithms such as k-nearest neighbors.

#### Angle Between Vectors

The angle $\theta$ between two non-zero vectors $\mathbf{a}$ and $\mathbf{b}$ is given by:

$$
\cos \theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \lVert \mathbf{b} \rVert}
$$

#### Projections

The dot product determines the projection of one vector onto another. This is used in tasks like feature extraction in PCA and in computing gradient descent steps in optimization algorithms.

### Code Example: Computing the Dot Product

Here is the Python code:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

dot_product = np.dot(a, b)
print("Dot Product:", dot_product)

# Alternatively, use the @ operator (Python 3.5+)
dot_product_alt = a @ b
print("Dot Product (Alt):", dot_product_alt)
```
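
Building on the formulas above, here is a small sketch (NumPy assumed) that computes the angle between two vectors and the projection of one onto the other:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Angle between the vectors, from cos(theta) = (a . b) / (||a|| ||b||)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.arccos(cos_theta)

# Projection of a onto b: ((a . b) / (b . b)) * b
proj_a_onto_b = (np.dot(a, b) / np.dot(b, b)) * b

print("Angle (radians):", theta)
print("Projection of a onto b:", proj_a_onto_b)
```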

## 9. What is the _cross product_ of _vectors_ and when is it used?

The **cross product** is a well-known operation between two vectors in three-dimensional space. It results in a third vector that's orthogonal to both input vectors. The cross product is extensively used within various domains, including physics and computer graphics.

### Cross Product Formula

For two three-dimensional vectors ![equation](https://latex.codecogs.com/gif.latex?\mathbf{a}&space;=&space;\begin{bmatrix}&space;a_1&space;\\&space;a_2&space;\\&space;a_3&space;\end{bmatrix}) and ![equation](https://latex.codecogs.com/gif.latex?\mathbf{b}&space;=&space;\begin{bmatrix}&space;b_1&space;\\&space;b_2&space;\\&space;b_3&space;\end{bmatrix}), their cross product ![equation](https://latex.codecogs.com/gif.latex?\mathbf{c}) is calculated as:

$$
\mathbf{c} = \mathbf{a} \times \mathbf{b} =
\begin{bmatrix}
a_2b_3 - a_3b_2 \\
a_3b_1 - a_1b_3 \\
a_1b_2 - a_2b_1
\end{bmatrix}
$$

### Key Operational Properties

- **Direction**: The cross product yields a vector that's mutually perpendicular to both input vectors. The direction, as given by the right-hand rule, indicates whether the resulting vector points "up" or "down" relative to the plane formed by the input vectors.

- **Magnitude**: The magnitude of the cross product, denoted by $\lvert \mathbf{a} \times \mathbf{b} \rvert$, is the area of the parallelogram formed by the two input vectors.

### Applications

The cross product is fundamental in many areas, including:

- **Physics**: It's used to determine torque, magnetic moments, and angular momentum.
- **Engineering**: It's essential in mechanics, fluid dynamics, and electric circuits.
- **Computer Graphics**: For tasks like calculating surface normals and implementing numerous 3D manipulations.
- **Geography**: It's utilized, alongside the dot product, for various mapping and navigational applications.
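
Here is a minimal NumPy sketch (NumPy assumed) illustrating both properties above: the result is orthogonal to the inputs, and its magnitude equals the parallelogram area:

```python
import numpy as np

a = np.array([1, 0, 0])
b = np.array([0, 1, 0])

c = np.cross(a, b)

print("Cross product:", c)                                   # [0 0 1]
print("Orthogonal to a and b:", np.dot(c, a), np.dot(c, b))  # 0 0
print("Parallelogram area:", np.linalg.norm(c))              # 1.0
```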

## 10. How do you calculate the _norm_ of a _vector_ and what does it represent?

The **vector norm** quantifies the length or size of a vector. It's a fundamental concept in linear algebra with many applications in machine learning, optimization, and beyond.

The most common norm is the **Euclidean norm** or **L2 norm**, denoted $\lVert \mathbf{x} \rVert_2$. The general formula for the Euclidean norm in $n$ dimensions is:

$$
\lVert \mathbf{x} \rVert_2 = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2}
$$

### Code Example: Euclidean Norm

Here is the Python code:

```python
import numpy as np

vector = np.array([3, 4])
euclidean_norm = np.linalg.norm(vector)

print("Euclidean Norm:", euclidean_norm)  # 5.0
```

### Other Common Vector Norms

1. **L1 Norm (Manhattan Norm)**: The sum of the absolute values of the components.

   $\lVert \mathbf{x} \rVert_1 = |x_1| + |x_2| + \ldots + |x_n|$

2. **L-Infinity Norm (Maximum Norm)**: The largest absolute component value.

   $\lVert \mathbf{x} \rVert_{\infty} = \max_i |x_i|$

3. **L0 Pseudonorm**: The count of nonzero elements in the vector.

### Code Example: Computing L1 and L-Infinity Norms

Here is the Python code:

```python
L1_norm = np.linalg.norm(vector, 1)
L_infinity_norm = np.linalg.norm(vector, np.inf)

print("L1 Norm:", L1_norm)                  # 7.0
print("L-Infinity Norm:", L_infinity_norm)  # 4.0
```

It is worth noting that the L2 norm interacts naturally with inner products and projections, which is why it is so widely used in machine learning.

## 11. Define the concept of _orthogonality_ in _linear algebra_.

In **linear algebra**, vectors are characterized by their direction and magnitude. **Orthogonal vectors** play a significant role in this framework: they are perpendicular to one another.

### Orthogonality in Euclidean Space

In a real vector space, two vectors $\mathbf{v}$ and $\mathbf{w}$ are **orthogonal** if their dot product (also known as inner product) is zero:

$$
\mathbf{v} \cdot \mathbf{w} = 0
$$

This is a geometric relationship, since the dot product measures the projection of one vector onto the other.

### General Orthogonality Criteria

For any two vectors $\mathbf{v}$ and $\mathbf{w}$ in a real inner product space, they are orthogonal if and only if:

$$
\| \mathbf{v} + \mathbf{w} \|^2 = \| \mathbf{v} \|^2 + \| \mathbf{w} \|^2
$$

This relationship embodies the Pythagorean theorem: the sum of the squares of the two legs of a right-angled triangle equals the square of the hypotenuse.

In terms of the dot product, this follows from the expansion

$$
\| \mathbf{v} + \mathbf{w} \|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 + 2\, \mathbf{v} \cdot \mathbf{w}
$$

so the Pythagorean identity holds exactly when $\mathbf{v} \cdot \mathbf{w} = 0$. Equivalently, the dot product can be recovered from norms alone:

$$
\mathbf{v} \cdot \mathbf{w} = \frac{1}{2} \left( \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - \|\mathbf{v} - \mathbf{w}\|^2 \right)
$$

### Practical Applications

1. **Geometry**: Orthogonality defines perpendicularity in geometry.

2. **Machine Learning**: Orthogonal matrices are used in techniques like Principal Component Analysis (PCA) for dimensionality reduction and in whitening operations, which ensure zero covariances between variables.

3. **Signal Processing**: In digital filters and Fast Fourier Transforms (FFTs), orthogonal basis functions are used because their inner products are zero, making their projections independent.

### Code Example: Checking Orthogonality of Two Vectors

Here is the Python code:

```python
import numpy as np

# Initialize two vectors
v = np.array([3, 4])
w = np.array([-4, 3])

# Check orthogonality (use a tolerance for floating-point inputs)
if np.isclose(np.dot(v, w), 0):
    print("Vectors are orthogonal!")
else:
    print("Vectors are not orthogonal.")
```

## 12. What is the _determinant_ of a _matrix_ and what information does it provide?

The **determinant** of a matrix, denoted $\text{det}(A)$ or $|A|$, is a scalar value that provides important geometric and algebraic information about the matrix. It is defined only for square matrices $A$ of size $n \times n$.

### Core Properties

The determinant possesses several key properties:

- **Multilinearity**: It is linear in each row (and each column) when the others are held fixed.
- **Behavior Under Row Operations**: Multiplying a row (or column) by a scalar multiplies the determinant by that scalar; adding a multiple of one row to another leaves the determinant unchanged; swapping two rows flips its sign.

### Calculation Methods

The **Laplace expansion** and the **eigendecomposition** of a matrix are two common approaches for computing the determinant.

#### Laplace Expansion

The determinant of a matrix $A$ can be computed by expanding along any row (or, analogously, any column). Expanding along the $i$-th row:

$$
\text{det}(A) = \sum_{j=1}^{n} (-1)^{i+j} \cdot a_{ij} \cdot M_{ij}
$$

where $a_{ij}$ is the element of $A$ in the $i$-th row and $j$-th column, and $M_{ij}$ is the minor: the determinant of the submatrix obtained by removing the $i$-th row and $j$-th column.

#### Using Eigendecomposition

The determinant of $A$ equals the product of its eigenvalues (counted with multiplicity):

$$
\text{det}(A) = \prod_{i=1}^{n} \lambda_i
$$

where $\lambda_i$ are the eigenvalues of the matrix.

### Geometrical and Physical Interpretations

1. **Orientation of Linear Transformations**: The determinant of the matrix representing a linear transformation indicates whether the transformation preserves orientation (positive determinant), reverses orientation (negative determinant), or collapses space onto a lower-dimensional subspace, as a projection does (determinant of zero).

2. **Volume Scaling**: The absolute value of the determinant is the factor by which volumes are scaled under the transformation. An absolute value of 1 signifies no change in volume, while a determinant of 0 indicates that volumes collapse to zero.

3. **Linear Independence and Invertibility**: A non-zero determinant means the rows (and columns) are linearly independent. If the determinant is zero, the matrix is singular and not invertible.

4. **Conditioning in Optimization Problems**: The determinant of the Hessian matrix (the matrix of second-order partial derivatives) provides insight into the local behavior of the objective function, helping to diagnose convergence issues and the geometry of the cost landscape.

### Code Example: Computing the Determinant

Here is the Python code:

```python
import numpy as np

# Create a random matrix
A = np.random.rand(3, 3)

# Compute the determinant
det_A = np.linalg.det(A)
print("det(A):", det_A)
```
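
To connect a few of these points, here is a small sketch (NumPy assumed) checking that the determinant equals the product of the eigenvalues and that a matrix with linearly dependent rows is singular:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

det_A = np.linalg.det(A)
eigenvalues = np.linalg.eigvals(A)

print("det(A):", det_A)                                 # 5.0 (up to rounding)
print("Product of eigenvalues:", np.prod(eigenvalues))  # also 5.0

# A singular matrix: the second row is a multiple of the first
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print("det(S):", np.linalg.det(S))                      # ~0 -> not invertible
```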

## 13. Can you explain what an _eigenvector_ and _eigenvalue_ are?

**Eigenvectors** and **eigenvalues** are of paramount significance in linear algebra and are fundamental to numerous algorithms, especially in fields like data science, physics, and engineering.

### Key Concepts

- **Eigenvalue**: A scalar (represented by the Greek letter $\lambda$) that indicates how the corresponding eigenvector is scaled by a linear transformation.

- **Eigenvector**: A non-zero vector (denoted $v$) whose direction is preserved (up to reversal) by a linear transformation; the transformation only scales it by the associated eigenvalue.

### Math Definition

Let $A$ be a square matrix. A non-zero vector $v$ is an eigenvector of $A$ if $Av$ is a scalar multiple of $v$.

More formally, for some scalar $\lambda$, the following equation holds:

$$
Av = \lambda v
$$

In this context, $\lambda$ is the eigenvalue. A matrix can have one or more eigenvalues, each with its corresponding eigenvectors.

### Geometric Interpretation

For a geometric perspective, consider a matrix $A$ as a linear transformation on the 2D space $\mathbb{R}^2$.

- The eigenvectors of $A$ keep their direction under the transformation, up to scaling.
- The eigenvalues give the scaling factors.

In 3D or higher-dimensional spaces, the description remains analogous.

### Code Example: Calculating Eigenvalues and Eigenvectors

Here is the Python code:

```python
import numpy as np

# Define the matrix
A = np.array([[2, 1], [1, 3]])

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
print("Eigenvectors:", eigenvectors)
```

### Utility in Machine Learning

- **Principal Component Analysis (PCA)**: Eigenvalues and eigenvectors are pivotal for computing principal components, a technique used for feature reduction.
- **Data Normalization**: Eigenvectors of the covariance matrix give the directions along which the data varies most, influencing the choice of axes for normalization.
- **Singular Value Decomposition (SVD)**: The guiding principle in SVD is akin to that of the eigendecomposition.
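
As a quick check of the defining equation $Av = \lambda v$, this sketch (NumPy assumed) verifies it for the first eigenpair returned by `np.linalg.eig`:

```python
import numpy as np

A = np.array([[2, 1], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column of `eigenvectors` is an eigenvector; check A v = lambda v for the first pair
v = eigenvectors[:, 0]
lam = eigenvalues[0]

print("A v      :", A @ v)
print("lambda v :", lam * v)
print("Match    :", np.allclose(A @ v, lam * v))  # True
```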

## 14. How is the _trace_ of a _matrix_ defined and what is its relevance?

The **trace** of a square matrix, often denoted $\text{tr}(\mathbf{A})$, is the sum of its diagonal elements. In mathematical notation:

$$
\text{tr}(\mathbf{A}) = \sum_{i=1}^{n} A_{ii}
$$

### Properties of the Trace

- **Linearity**: For matrices $\mathbf{A}, \mathbf{B}$ and a scalar $k$: $\text{tr}(k \mathbf{A}) = k \, \text{tr}(\mathbf{A})$ and $\text{tr}(\mathbf{A} + \mathbf{B}) = \text{tr}(\mathbf{A}) + \text{tr}(\mathbf{B})$.

- **Cyclic Invariance**: $\text{tr}(\mathbf{A} \mathbf{B} \mathbf{C}) = \text{tr}(\mathbf{B} \mathbf{C} \mathbf{A}) = \text{tr}(\mathbf{C} \mathbf{A} \mathbf{B})$.

- **Transposition Invariance**: The trace is invariant under transposition: $\text{tr}(\mathbf{A}) = \text{tr}(\mathbf{A}^T)$.

- **Trace and Determinant**: The trace and the determinant both appear as coefficients of the characteristic polynomial of a matrix.

- **Trace and Eigenvalues**: The trace equals the sum of the eigenvalues. This can be shown by putting the matrix in Jordan form, where the diagonal elements are the eigenvalues.

- **Orthogonal Matrices**: For an orthogonal matrix $\mathbf{S}$ of size $n$, $\det(\mathbf{S}) = \pm 1$ and $|\text{tr}(\mathbf{S})| \le n$, with $\text{tr}(\mathbf{S}) = n$ only when $\mathbf{S}$ is the identity.
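
A brief numerical sanity check of these properties (NumPy assumed; the matrices are random and purely illustrative):

```python
import numpy as np

A = np.random.rand(3, 3)
B = np.random.rand(3, 3)
C = np.random.rand(3, 3)

# Trace as the sum of the diagonal elements
print(np.isclose(np.trace(A), np.sum(np.diag(A))))                 # True

# Trace equals the sum of the eigenvalues (real part, up to rounding)
print(np.isclose(np.trace(A), np.sum(np.linalg.eigvals(A)).real))  # True

# Cyclic invariance: tr(ABC) == tr(BCA)
print(np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A)))        # True
```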

## 15. What is a _diagonal matrix_ and how is it used in _linear algebra_?

A **diagonal matrix** is a square matrix in which every element off the principal diagonal is zero. Diagonal matrices appear throughout both applied and theoretical linear algebra.

### Mathematical Representation

A diagonal matrix $D$ has the form:

$$
D =
\begin{bmatrix}
d_1 & 0 & \cdots & 0 \\
0 & d_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & d_n
\end{bmatrix}
$$

where $d_1, \ldots, d_n$ are the **diagonal entries**.

### Matrix Multiplication Shortcut

When a matrix is diagonal, matrix-vector multiplication simplifies. The equation

$$
Dx = y
$$

written out in full is:

$$
\begin{bmatrix}
d_1 & 0 & \cdots & 0 \\
0 & d_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & d_n
\end{bmatrix} \begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix} = \begin{bmatrix}
y_1 \\
y_2 \\
\vdots \\
y_n
\end{bmatrix}
$$

which reduces to:

$$
\begin{bmatrix}
d_1x_1 \\
d_2x_2 \\
\vdots \\
d_nx_n
\end{bmatrix} = \begin{bmatrix}
y_1 \\
y_2 \\
\vdots \\
y_n
\end{bmatrix}
$$

or, equivalently, the system of linear equations:

$$
\begin{aligned}
d_1x_1 &= y_1 \\
d_2x_2 &= y_2 \\
&\vdots \\
d_nx_n &= y_n
\end{aligned}
$$

In other words, multiplying by a diagonal matrix amounts to an element-wise product, which can be computed far more efficiently than a general matrix-vector product.

### Practical Applications

- **Coordinate Transformation**: Diagonal matrices facilitate transforming coordinates in a multi-dimensional space.
- **Component-wise Operations**: They allow operations such as scaling specific dimensions without affecting others.

### Code Example: Matrix-Vector Multiplication

You can use Python to demonstrate matrix-vector multiplication with a diagonal matrix:

```python
import numpy as np

# Define a diagonal matrix
D = np.array([
    [2, 0, 0],
    [0, 3, 0],
    [0, 0, 5]
])

# Define a vector
x = np.array([1, 2, 3])

# Compute the matrix-vector product
y = D.dot(x)

# Display the results
print("D:", D)
print("x:", x)
print("Dx:", y)
```
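
To see the shortcut directly, here is a small sketch (NumPy assumed) comparing the full matrix-vector product with the element-wise form, using `np.diag` to build the diagonal matrix:

```python
import numpy as np

d = np.array([2.0, 3.0, 5.0])  # diagonal entries
D = np.diag(d)                 # build the diagonal matrix from them
x = np.array([1.0, 2.0, 3.0])

full_product = D @ x           # ordinary matrix-vector product
elementwise = d * x            # the diagonal shortcut: one multiply per component

print(full_product)                            # [ 2.  6. 15.]
print(np.allclose(full_product, elementwise))  # True
```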

#### Explore all 70 answers here 👉 [Devinterview.io - Linear Algebra](https://devinterview.io/questions/machine-learning-and-data-science/linear-algebra-interview-questions)