 |
Typical Cactus Simulation for Numerical Relativity
This section describes how users typically run Cactus simulations
on their production machines.
- The user has a particular parameter file (for example BenchBSSN_40l.par) which they want to run
on a machine. From this, a (Perl) script generates
a ThornList (for example, BSSN.th) of the Cactus thorns (modules) needed
in the executable. [In practice, users usually just have a default
list of thorns which they compile everywhere.]
- The ThornList contains the locations of the different CVS servers
holding the thorns. The user then runs another Perl script, called
GetCactus, which automatically checks out the Cactus
flesh and the required thorns from CVS.
- The user types gmake commands to build the executable.
The first stage involves configuring for the machine, and some
machine-specific options may be required; see
http://www.cactuscode.org/Documentation/Configurations.html
- The executable is then run with a particular parameter
file. (There could be additional command-line options, but
usually they are not needed.)
- Running an executable interactively (to test a parameter file,
or do a short run) usually uses a command of the form
mpirun -np <number of processors> cactus_exe <parameter file>
although a different launch command is often needed (vmirun, ...).
- Production runs always use MPI and
are submitted to machines with batch scripts. The group has a
script, qs2, which automatically submits to the
batch queue on each of the machines they use. The script also performs
other operations which may be needed, for example:
  - Setting required environment variables.
  - Moving output data to mass storage at the end of the run, to
    avoid it being purged.
- Typical runs use between 32 and 512 processors, require
a total of around 50GB of core memory, and run for between
8 and 24 hours (that is, they require multiple checkpoints
and restarts), in which time they do around 10000
iterations.
- Checkpoint files are usually written every 2 to 4 hours, and
each write is around 20GB. The frequency at which they are written
depends on how fast they can be written. Depending on the
architecture and file system, a checkpoint write can take as long as 1
hour (although this is extreme).
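
To make the figures quoted above more concrete, the following Python
sketch derives a few back-of-envelope numbers from them. It is only an
illustration: the function names are hypothetical, memory is assumed to
be split evenly across processors, the 8 to 24 hours are assumed to be
the wall-clock time for all 10000 iterations, 20GB is assumed to be the
size of a single checkpoint, and 1GB is taken as 1024MB.

```python
# Back-of-envelope figures derived from the typical-run numbers in the text.
# Assumptions (not stated in the text): memory is split evenly across
# processors, 20GB is the size of one checkpoint, and 1GB = 1024MB.

TOTAL_MEMORY_GB = 50      # "around 50GB core memory"
CHECKPOINT_SIZE_GB = 20   # "will write around 20GB"
ITERATIONS = 10000        # "around 10000 iterations"


def memory_per_processor_mb(processors):
    """Average memory per processor in MB for an even split."""
    return TOTAL_MEMORY_GB * 1024 / processors


def seconds_per_iteration(run_hours):
    """Average wall-clock seconds per iteration over the whole run."""
    return run_hours * 3600 / ITERATIONS


def checkpoints_per_run(run_hours, interval_hours):
    """Number of whole checkpoint intervals that fit in the run."""
    return int(run_hours // interval_hours)


def write_bandwidth_mb_per_s(write_minutes):
    """Sustained bandwidth implied by writing one checkpoint in write_minutes."""
    return CHECKPOINT_SIZE_GB * 1024 / (write_minutes * 60)


if __name__ == "__main__":
    for procs in (32, 512):
        print(f"{procs} processors: {memory_per_processor_mb(procs):.0f} MB each")
    for hours in (8, 24):
        print(f"{hours} hour run: {seconds_per_iteration(hours):.2f} s per iteration")
    print(f"24 hour run, checkpoint every 2 hours: {checkpoints_per_run(24, 2)} checkpoints")
    print(f"1 hour checkpoint write: {write_bandwidth_mb_per_s(60):.1f} MB/s")
```

Under these assumptions a run needs between 100MB (512 processors) and
1600MB (32 processors) per processor, each iteration takes roughly 3 to
9 seconds, and a checkpoint that takes the extreme 1 hour to write
corresponds to a sustained rate of under 6MB/s.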
|