Great post, one nitpick -- I wouldn't say that a matrix is a "sparsely defined" function, but rather a function defined on a finite grid. It might also be worth pointing out that the same approach works for any graph, not just a grid.
Also, what's confusing is that linear algebra usually uses matrices to describe linear functions from n-dimensional to m-dimensional vector spaces. The matrix has m rows and n columns; you give it an n-dim vector and after matrix multiplication you get back an m-dim vector.
The author uses a matrix quite differently. You give it two integer coordinates i and j and it gives you the value at position (i, j) back. That's a valid use, but not quite what you'd expect in a math-oriented article.
Thanks for calling this out, I thought it might cause confusion. Matrices are super weird objects because they don't fit nicely into the {scalar, vector, function, operator} classes that maybe we're used to. A matrix is a function in that it can take in a vector and map it to a new vector. It is also an operator in that it can take in some other matrix (a function!) and give you a transformed matrix (a new function). And it is also a function in the sense that it can map coordinates to scalars, where the input pair (x, y) gives the position and the scalar stored there is the output. All of this gets further complicated by the fact that the elements of a matrix can be real numbers, complex numbers, or even matrices! They are really strange objects and maybe I'll write up a whole post just about that strangeness.
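If it helps, here is roughly how those three roles look in NumPy (the matrices and names below are made-up examples for illustration, not anything from the post):

import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, 1.0])

# Role 1: function from vectors to vectors
w = M @ v                      # array([3., 7.])

# Role 2: operator mapping a matrix (a function) to a new matrix (a new function)
N = np.array([[0.0, 1.0],
              [1.0, 0.0]])
K = M @ N                      # composition: K @ x equals M @ (N @ x)

# Role 3: function from integer coordinates to scalars
M[0, 1]                        # 2.0, the value stored at position (0, 1)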
Just writing down the connections here for myself and maybe others:
A matrix represents a linear function taking a vector and returning a vector, written as
w = M v
Matrix multiplication corresponds to function composition.
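A quick NumPy sketch of that correspondence (A, B, and v are arbitrary values I made up):

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
B = np.array([[2.0, 0.0],
              [1.0, 1.0]])
v = np.array([1.0, 3.0])

# (A B) v == A (B v): multiplying the matrices composes the two linear functions
np.allclose((A @ B) @ v, A @ (B @ v))   # True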
Vectors can be indexed, and we can view them as a function i -> v[i] defined on the indexing set. We can also define basis vectors b_i, such that b_i[j] is 1 at index j=i and 0 otherwise. Any vector can be written as a weighted sum of basis vectors, with the vector components as coefficients:
v = Σ_i v[i] b_i
where Σ_i represents summation over the index i.
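In NumPy terms that decomposition might look like this (using the columns of the identity matrix as the basis vectors; the values are just an example):

import numpy as np

v = np.array([2.0, -1.0, 5.0])
basis = np.eye(3)                      # basis[:, i] is the basis vector b_i

# v as a weighted sum of basis vectors, with the components as coefficients
reconstructed = sum(v[i] * basis[:, i] for i in range(3))
np.allclose(reconstructed, v)          # True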
Matrices can be indexed with two indices, and this is closely related to vector indexing: For a matrix M, we have
M[i, j] = (M b_j)[i]
Each column of the matrix represents its output for a certain basis vector as input.
By writing a vector as a sum of basis vectors and using linearity, we get the well-known matrix-vector multiplication formula:
(M v)[i] = Σ_j M[i, j] v[j]
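A small NumPy check of both facts, with an arbitrary example matrix and vector of my own:

import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
v = np.array([7.0, 8.0, 9.0])
b = np.eye(3)                               # b[:, j] is the basis vector b_j

# Column j of M is the image of the j-th basis vector: M[i, j] == (M b_j)[i]
all(np.allclose(M @ b[:, j], M[:, j]) for j in range(3))        # True

# The matrix-vector multiplication formula: (M v)[i] = sum over j of M[i, j] * v[j]
w = np.array([sum(M[i, j] * v[j] for j in range(3)) for i in range(2)])
np.allclose(w, M @ v)                                           # True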
Here's a concrete example. The first matrix in the post is f = [[1, 1, 1], [1, 1, 1], [1, 1, 1]].
In linear algebra, we would interpret this as a linear map. A true equation would be f([1, 2, 3]^T) = [6, 6, 6]^T (where I'm using ^T to mean "transpose to a column vector").
But here, the author means f(1, 2) = 1, i.e. the (1,2) coordinate of the matrix is 1.
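To see both readings side by side, a possible NumPy rendering (np.ones is just my stand-in for the post's all-ones matrix):

import numpy as np

f = np.ones((3, 3))            # the first matrix in the post

# Reading 1: f as a linear map
f @ np.array([1.0, 2.0, 3.0])  # array([6., 6., 6.])

# Reading 2: f as a lookup table on integer coordinates
f[1, 2]                        # 1.0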
And interestingly, both are connected. If d_i somewhat hand-wavingly denotes the vector d_i = (0, ..., 0, 1, 0, ..., 0) with the 1 at position i, then given a matrix M you can do
f(i, j) := d_i^T * M * d_j
The RHS uses classical matrix multiplication, and the function value is the matrix's entry at row i, column j.
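A quick NumPy sanity check of that identity (here d is the identity matrix, so d[i] plays the role of d_i, and M is an arbitrary example):

import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
d = np.eye(3)                  # d[i] is the basis vector with a 1 at position i

i, j = 1, 2
d[i] @ M @ d[j]                # 6.0, which is exactly M[i, j]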