-
Notifications
You must be signed in to change notification settings - Fork 0
/
c-solve.tex
33 lines (31 loc) · 2.14 KB
/
c-solve.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
The command \texttt{python -m transmutagen.gensolve} takes a given sparsity
pattern for $A$ and generates a C function that solves $(A - \theta)x =\alpha
b$. Default sparsity patterns based on data from PyNE~\cite{ationneeded} and
ORIGEN~\cite{ationneeded} are included with Transmutagen. C functions are
generated from the CRAM approximations of given orders (by default, 6, 8, 10,
12, 14, 16, and 18, but any even order can be used), which compute $e^{-A}b$
using the generated solver.
The command generates a C source file and header from a
Jinja~\cite{ationneeded} template based on the pseudocode in
Algorithm~\ref{alg:lu-pseudocode}. The code uses C99 complex numbers for the
arithemetic. Each line that is known to be a no-op from the provided sparsity
pattern is automatically removed. The resulting C source file is 12 MB with
the default orders using the ORIGEN data sparsity pattern and 16 MB for the
PyNE data sparsity pattern. Compilation of this file requires disabling most
optimizations, as otherwise the compiler either does not finish or runs out of
memory. However, certain compilation flags were found to speed up the
performance of the algorithm while still allowing the compiler to finish,
particularly flags to speed up complex number operations. Without
optimizations, complex numbers in C are slow due to NaN checks, but these can
be disabled to make the code faster. Through experimentation, we found the GCC flags \texttt{-O0
-fcx-fortran-rules -fcx-limited-range -ftree-sra -ftree-ter
-fexpensive-optimizations} provided speedups without adversely slowing down
compile times. In fact, with these flags, the compilation times are around
$6\times$ faster than with \texttt{-O0} alone, and the resulting code runs
$2\times$ faster. The performance of this method compared to
\texttt{scipy.\allowbreak{}sparse} solvers is outlined in
Section~\ref{sec:origen-comparison}.
To make the method more accessible to nuclear scientists, the sparsity pattern
input file is formatted as a JSON file with a list of nuclides in the order
they appear in the matrix, and a list of from--to pairs of transitions between
nuclides that may be represented in the input matrix $A$.