I agree with prior comments about the article (not great). But I love LA, and will indulge a bit.
For me, the most amusing feature of LA is how quickly results and tools escalate from trivial and obvious to surprising and hard. LA is often introduced as slick notation to express systems of linear equations. Gaussian elimination and matrix inverses (when they exist) are natural evolutions of this, but not what I consider surprising.
Matrix multiplication is an elementary next step, but it is not at all clear why one would multiply matrices. My first real mathematical surprise (I was in junior high at the time) was a result of matrix multiplication: there is a matrix A of real values that satisfies A @ A = -I. (The matrix is [[0,1],[-1,0]].) I was aware of complex numbers, but until that moment, I didn't know you could get "minus identity" without first artificially introducing the idea of "square root of -1". But this shows one can construct a "square root of -1" without introducing anything like that.
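That claim is easy to verify numerically; a minimal check in numpy (variable names are my own):

```python
import numpy as np

# A real 2x2 matrix whose square is minus the identity:
# it behaves like "i" without ever invoking sqrt(-1).
A = np.array([[0, 1],
              [-1, 0]])
I = np.eye(2, dtype=int)

assert (A @ A == -I).all()
```

And just as i^4 = 1, squaring again recovers the identity: A @ A @ A @ A == I.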
When one gets to determinants, characteristic polynomials, eigenstructure, fixed subspaces, projection operators, SVD, orthogonal transformations, and so on, the work really begins. That determinants are multiplicative is really wonderful, but not at all obvious. SVD in full generality was only proved in 1936, and modern algorithms to compute it emerged in the 1950s [1].
Finally, check out the LDU decomposition of the Walsh matrix [2].
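For the curious, one way to compute that factorization numerically is a generic Gaussian-elimination sketch (no pivoting is needed here, since the Walsh matrix's leading principal minors are all nonzero); this is my own illustration, not the closed-form factorization from [2]:

```python
import numpy as np

# Walsh (Hadamard) matrix of order 4 via a Kronecker product
H1 = np.array([[1, 1], [1, -1]], dtype=float)
W = np.kron(H1, H1)

# LDU via Gaussian elimination without pivoting
n = W.shape[0]
L = np.eye(n)
U = W.copy()
for k in range(n - 1):
    for i in range(k + 1, n):
        L[i, k] = U[i, k] / U[k, k]
        U[i] -= L[i, k] * U[k]

# Split U into a diagonal D and a unit upper-triangular factor
D = np.diag(np.diag(U))
U_unit = U / np.diag(U)[:, None]

assert np.allclose(L @ D @ U_unit, W)
```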
> it is not at all clear why one would multiply matrices
I disagree. Matrix multiplication is the composition of linear maps: AB is the transformation you get by applying B, then applying A to the result. Furthermore, this provides a great intuition for why matrix multiplication is not, in general, commutative. If your professor didn't make this clear when you learned the definition of matrix multiplication, then they didn't do a great job.
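A quick numerical illustration of both points (a toy example of my own choosing):

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by 90 degrees
B = np.array([[2.0, 0.0], [0.0, 1.0]])    # stretch along x

v = np.array([1.0, 0.0])

# (A @ B) v means: apply B first, then A -- composition of maps
assert np.allclose((A @ B) @ v, A @ (B @ v))

# ...and in general the order matters: A @ B != B @ A
assert not np.allclose(A @ B, B @ A)
```

Rotating after stretching is not the same as stretching after rotating, and the matrices record exactly that.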
>That determinants are multiplicative is really wonderful, but not at all obvious
Again, I think it is kind of obvious. The determinant of a map measures the degree to which it alters the area/volume/measure of the unit cube. So if A doubles it and B triples it, then BA should scale it by a factor of 6, since matrix multiplication is the composition of maps, and you're tripling the size of something that was already doubled.
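The doubling/tripling argument, spelled out in numpy (illustrative values of my choosing):

```python
import numpy as np

A = np.diag([2.0, 1.0, 1.0])   # doubles volume (det = 2)
B = np.diag([1.0, 3.0, 1.0])   # triples volume (det = 3)

# B after A scales volume by 6
assert np.isclose(np.linalg.det(B @ A), 6.0)

# and multiplicativity holds for generic matrices too
rng = np.random.default_rng(0)
M, N = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
assert np.isclose(np.linalg.det(M @ N), np.linalg.det(M) * np.linalg.det(N))
```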
This is why in a first course on linear algebra, it's more important to get the intuition and logical structure down than learn advanced matrix decompositions. Now your complex number example is a little less obvious. I actually think using the 2x2 matrix representation of complex numbers is the best way to introduce them because it sidesteps the part where you take roots of negative numbers, which is the most conceptually problematic part for many people and usually requires some kind of handwave.
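To make that concrete, here is a sketch of the 2x2 representation (the helper name is mine): matrix addition and multiplication mirror the complex operations exactly, with no root of a negative number in sight.

```python
import numpy as np

def as_matrix(z):
    """Represent the complex number z = a + bi as the real 2x2 matrix a*I + b*J."""
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # J @ J = -I
    return z.real * np.eye(2) + z.imag * J

z, w = 2 + 3j, 1 - 4j
assert np.allclose(as_matrix(z) + as_matrix(w), as_matrix(z + w))
assert np.allclose(as_matrix(z) @ as_matrix(w), as_matrix(z * w))

# In particular "i squared is -1" is just a matrix identity:
assert np.allclose(as_matrix(1j) @ as_matrix(1j), as_matrix(-1))
```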
I only recently realized that multiplying a matrix with a column vector is equivalent to a linear combination of the vector's columns. This explains the importance of matrix multiplication since 95% of Linear Algebra is calculating linear combinations of vectors.
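Concretely, in numpy (note it is the matrix's columns being combined):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, -1.0])

# A @ x is the linear combination x[0]*col0 + x[1]*col1 of A's columns
combo = x[0] * A[:, 0] + x[1] * A[:, 1]
assert np.allclose(A @ x, combo)
```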
Indeed, and this leads to another important interpretation: matrix-vector multiplication is function evaluation.
Arbitrary functions which take in vectors and output vectors can be very complex and thus difficult to reason about. A useful simplifying assumption is that of linearity: that f applied to a linear combination is just a linear combination of f applied to each piece of the combination separately. Linear algebra, broadly speaking, is the study of functions of this kind and the properties that emerge from making the linearity assumption.
It turns out that, if we assume a function f is linear, all of the information about that function is contained in what it does to a set of basis vectors. We can in essence "encode" the function by a table of numbers (a matrix), where the kth column contains the result of f applied to the kth basis vector. In this way, given a basis, any linear transformation f has a matrix A which compactly represents it.
Since f is linear, to compute f(v) I could write v in my chosen basis then apply f to each basis vector and recombine. Alternatively, I could write the matrix A representing f in that basis, and then multiply Av. The two are equivalent: that is, Av = f(v). And so matrix-vector multiplication is "just" evaluating the function f.
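A small sketch of this encode-then-multiply equivalence (f here is an arbitrary linear map of my own choosing):

```python
import numpy as np

# Some linear function f taking vectors to vectors
def f(v):
    x, y = v
    return np.array([2.0 * x - y, x + y])

# Encode f as a matrix: column k is f applied to the kth basis vector
e0, e1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
A = np.column_stack([f(e0), f(e1)])

v = np.array([3.0, -2.0])
assert np.allclose(A @ v, f(v))   # matrix-vector product IS evaluating f
```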
Matrix columns, you mean. I think it is Strang who is a staunch proponent of teaching this to students early on. It is indeed helpful to realize that the matrix can be seen as a collection of the images of the basis vectors under the linear transformation the matrix represents.
[1] https://en.wikipedia.org/wiki/Singular_value_decomposition#H...
[2] https://en.wikipedia.org/wiki/Walsh_matrix