Gauss method for solving system of linear equations¶
Given a system of
Formally, the problem is formulated as follows: solve the system:
where the coefficients
This problem also has a simple matrix representation:
where
It is worth noting that the method presented in this article can also be used to solve the equation modulo any number p, i.e.:
Gauss¶
Strictly speaking, the method described below should be called "Gauss-Jordan", or Gauss-Jordan elimination, because it is a variation of the Gauss method, described by Jordan in 1887.
Overview¶
The algorithm is a sequential elimination
of the variables in each equation, until each equation will have only one remaining variable. If
Gaussian elimination is based on two simple transformation:
- It is possible to exchange two equations
- Any equation can be replaced by a linear combination of that row (with non-zero coefficient), and some other rows (with arbitrary coefficients).
In the first step, Gauss-Jordan algorithm divides the first row by
As a result, after the first step, the first column of matrix
Similarly, we perform the second step of the algorithm, where we consider the second column of second row. First, the row is divided by
We continue this process for all columns of matrix
Search for the pivoting element¶
The described scheme left out many details. At the select a pivoting row
: find one row of the matrix where the
Note that, here we swap rows but not columns. This is because if you swap columns, then when you find a solution, you must remember to swap back to correct places. Thus, swapping rows is much easier to do.
In many implementations, when
Degenerate cases¶
In the case where
Now we consider the general case
, where
So, some of the variables in the process can be found to be independent. When the number of variables,
In general, if you find at least one independent variable, it can take any arbitrary value, while the other (dependent) variables are expressed through it. This means that when we work in the field of real numbers, the system potentially has infinitely many solutions. But you should remember that when there are independent variables, SLAE can have no solution at all. This happens when the remaining untreated equations have at least one non-zero constant term. You can check this by assigning zeros to all independent variables, calculate other variables, and then plug in to the original SLAE to check if they satisfy it.
Implementation¶
Following is an implementation of Gauss-Jordan. Choosing the pivot row is done with heuristic: choosing maximum value in the current column.
The input to the function gauss
is the system matrix
The function returns the number of solutions of the system
const double EPS = 1e-9;
const int INF = 2; // it doesn't actually have to be infinity or a big number
int gauss (vector < vector<double> > a, vector<double> & ans) {
int n = (int) a.size();
int m = (int) a[0].size() - 1;
vector<int> where (m, -1);
for (int col=0, row=0; col<m && row<n; ++col) {
int sel = row;
for (int i=row; i<n; ++i)
if (abs (a[i][col]) > abs (a[sel][col]))
sel = i;
if (abs (a[sel][col]) < EPS)
continue;
for (int i=col; i<=m; ++i)
swap (a[sel][i], a[row][i]);
where[col] = row;
for (int i=0; i<n; ++i)
if (i != row) {
double c = a[i][col] / a[row][col];
for (int j=col; j<=m; ++j)
a[i][j] -= a[row][j] * c;
}
++row;
}
ans.assign (m, 0);
for (int i=0; i<m; ++i)
if (where[i] != -1)
ans[i] = a[where[i]][m] / a[where[i]][i];
for (int i=0; i<n; ++i) {
double sum = 0;
for (int j=0; j<m; ++j)
sum += ans[j] * a[i][j];
if (abs (sum - a[i][m]) > EPS)
return 0;
}
for (int i=0; i<m; ++i)
if (where[i] == -1)
return INF;
return 1;
}
Implementation notes:
- The function uses two pointers - the current column
- For each variable
- In this implementation, the current
- After finding a solution, it is inserted back into the matrix - to check whether the system has at least one solution or not. If the test solution is successful, then the function returns 1 or
Complexity¶
Now we should estimate the complexity of this algorithm. The algorithm consists of
- Search and reshuffle the pivoting row. This takes
- If the pivot element in the current column is found - then we must add this equation to all other equations, which takes time
So, the final complexity of the algorithm is
Note that when the SLAE is not on real numbers, but is in the modulo two, then the system can be solved much faster, which is described below.
Acceleration of the algorithm¶
The previous implementation can be sped up by two times, by dividing the algorithm into two phases: forward and reverse:
- Forward phase: Similar to the previous implementation, but the current row is only added to the rows after it. As a result, we obtain a triangular matrix instead of diagonal.
- Reverse phase: When the matrix is triangular, we first calculate the value of the last variable. Then plug this value to find the value of next variable. Then plug these two values to find the next variables...
Reverse phase only takes
Solving modular SLAE¶
For solving SLAE in some module, we can still use the described algorithm. However, in case the module is equal to two, we can perform Gauss-Jordan elimination much more effectively using bitwise operations and C++ bitset data types:
int gauss (vector < bitset<N> > a, int n, int m, bitset<N> & ans) {
vector<int> where (m, -1);
for (int col=0, row=0; col<m && row<n; ++col) {
for (int i=row; i<n; ++i)
if (a[i][col]) {
swap (a[i], a[row]);
break;
}
if (! a[row][col])
continue;
where[col] = row;
for (int i=0; i<n; ++i)
if (i != row && a[i][col])
a[i] ^= a[row];
++row;
}
// The rest of implementation is the same as above
}
Since we use bit compress, the implementation is not only shorter, but also 32 times faster.
A little note about different heuristics of choosing pivoting row¶
There is no general rule for what heuristics to use.
The heuristics used in previous implementation works quite well in practice. It also turns out to give almost the same answers as "full pivoting" - where the pivoting row is search amongst all elements of the whose submatrix (from the current row and current column).
Though, you should note that both heuristics is dependent on how much the original equations was scaled. For example, if one of the equation was multiplied by implicit pivoting
.
Implicit pivoting compares elements as if both lines were normalized, so that the maximum element would be unity. To implement this technique, one need to maintain maximum in each row (or maintain each line so that maximum is unity, but this can lead to increase in the accumulated error).
Improve the solution¶
Despite various heuristics, Gauss-Jordan algorithm can still lead to large errors in special matrices even of size
Therefore, the resulting Gauss-Jordan solution must sometimes be improved by applying a simple numerical method - for example, the method of simple iteration.
Thus, the solution turns into two-step: First, Gauss-Jordan algorithm is applied, and then a numerical method taking initial solution as solution in the first step.