Multiplying “Scalar-Looking” Terms in Matrix Equations by Unit Matrix
- Code for inserting the unit matrix in a Mathematica 5 notebook
In matrix equations, multiplication of “scalar-looking” terms by the identity matrix is understood. How to implement it in Mathematica?
By convention, either the whole expression looks like a scalar (e.g., 3/4
), or there are scalar terms added to matrix ones.
The first possibility is easily handled: if the expression, call it eq
, is FreeQ[eq, List]
, we simply multiply it by the identity matrix of the right dimension.
In the second case, first we have to keep the sheep from the goats. By default, Plus
is Listable
: when adding a scalar to a vector or matrix, Mathematica adds the scalar to all elements of the array. To avoid that, we remove the Listable
attribute: ClearAttributes[Plus, Listable];
Linearise Your Functions
When Plus
is not Listable
, not only sums of scalars and lists like 1 + {2, 3}
, but also sums of lists like {1, 2} + {3, 4}
remain unevaluated.
Therefore the evaluation of functions like trace is inhibited, if they are not defined to be linear in their arguments. Trace makes scalars out of lists; but with unlistable addition, lists remain its subexpressions, rendering useless the patterns below. We have to Unprotect[Tr]; Tr[a_ + b_] := Tr[a] + Tr[b]; Protect[Tr];
A foolproof transformation to insert identity matrices is eq = eq//.x_ + y_ :> x IDM + y /; FreeQ[x, List] && (!FreeQ[y, List]) && FreeQ[x, IDM];
Let us look at it more closely. In a sum, we find two terms: a scalar x
, and a y
containing at least one List
(to avoid matching elements of matrices). To avoid an infinite loop with ReplaceRepeated
(//.
), we require that x
be FreeQ[x, IDM]
. Later, the identity matrix is to be substituted for IDM
.
The y
term cannot just be a list: in the expression a + b (c + M)
, the a
term will not be multiplied by IDM
, as b (c + M)
is not a list; it does, however, contain a list, and the transformation works.
I denote scalars by lower-case and matrices by upper-case letters.
Unfortunately, the transformation gets mired in combinatorics for any real-world case; even for a simple expression like c1 + c2 + c3 + c4 + c5 + c6 + c7 + c8 + c9 + c10 + {{a, b}, {c, d}}
it takes about 0.96 seconds on my computer with Core i5 processor.
We may try to divide the constraint between the terms of the pattern in order to make it fail quicker in case of non-match: (x_ /; FreeQ[
x, List] && FreeQ[x, IDM]) + (y_ /; !FreeQ[y, List]) -> x IDM + y;
It takes 0.61 seconds for the previous expression, an impressive cut by 1/3. But the time is still unacceptable and grows fast with the complexity of the expression. A real-world case would take minutes.
We have to be more specific. Let us try the transformation rule (x_ /; FreeQ[x, List] && FreeQ[x, IDM]) + {X__} -> x IDM + {X};
Here we assume that the second term is a list. It is not foolproof anymore; in fact, it fails on some relevant expressions, as noted above. But it takes only about 0.0000211 seconds to complete on the average for the simple above expression.
We already know one special case the pattern does not match: a + b (A + c)
. The factor b
is essential here — otherwise the brackets would be superfluous.
It is covered by the pattern (x_ /; FreeQ[x, List] && FreeQ[x, IDM]) + c_ ({Y__} + y_) -> x IDM + c({Y} + y);
Of course, the latter also matches more general expressions with more terms like a + b + c (d + {{a, b}, {c, d}} + e + f);
thanks to the associativity of addition (Flat
ness of Plus
). — It matches even expressions like a + b (A + B)
, as we have not specified that y
be a scalar.
In fact, the last two patterns are the special case of (x_ /; FreeQ[x, List] && FreeQ[x, IDM]) + c_. ({Y__} + y_.) ->
where we have made
x IDM + c ({Y} + y);c
and y
optional by placing a period after them. If there is no c
, it is replaced by 1
. If there is no y
, it is replaced by 0
.
Note that adding 0
to a matrix still yields the matrix, even though Plus
is not listable.
There is another class of special cases: like a + c A.(b + B)
with matrix multiplication, matched by (x_ /; FreeQ[x, List] && FreeQ[x, IDM]) + c_.*Dot[Y_, Z_] -> x IDM + c Dot[Y, Z];
The Mathematica parsing engine favours the use of the period to denote an optional variable over its use in the dot product, forcing one to use c_.*Dot[Y_, Z_]
instead of c_.*Y_.Z_
It is crucial that we can always assume that the term with matrix multiplication is a matrix. (Of course, Dot
is also used to denote the dot product of vectors, but by checking first that the initial condition for the equation is given by a square matrix, this possibility is eliminated.) It relieves us of the headache of separately accounting for all reasonable special cases. Dot[Y_, Z_]
even matches expressions like A.B.C
, since Dot
is Flat
just like Plus
.
Using each of the three patterns in a separate repeated replacement, we finally have the complete transformation in eq //.(x_ /; FreeQ[x, List] && FreeQ[x, IDM]) + c_. ({Y__} + y_.) ->
and we are done, having only to restore
x IDM + c ({Y} + y)
//. (x_ /; FreeQ[x, List] && FreeQ[x,
IDM]) + c_.*Dot[Y_, Z_] -> x IDM + c Dot[Y, Z]SetAttributes[Plus, Listable];
The transformation does not work in the case of a scalar in the form of a dot product of two vectors.
How long does it all take? For some real-world cases, 0.29 seconds on the average, ranging from 0.015 to 2.109 seconds. It is not little. But if, for example, one has to solve the equations repeatedly with different initial conditions, and he can construct the equations once and for all, it is tolerable.