.. raw:: latex \clearpage .. _running: Running the Code ================ Whenever you run a new simulation, a similar series of steps must be performed. A summary of the typical Rayleigh work flow is: #. Create a unique directory for storing simulation output #. Create a main_input file #. Copy or soft link the Rayleigh executable into the simulation directory #. Modify main_input as desired #. Run the code #. Examine output and restart simulation as necessary Preparation ----------- Each simulation run using Rayleigh should have its own directory. The code is run from within that directory, and any output is stored in various subdirectories created by Rayleigh at run time. Wherever you create your simulation directory, ensure that you have sufficient space to store the output. **Do not run Rayleigh from within the source code directory. Do not cross the beams: no running two models from within the same directory.** After you create your run directory, you will want to copy (cp) or soft link (ln -s ) the executable from Rayleigh/bin to your run directory. Soft-linking is recommended; if you recompile the code, the executable remains up-to-date. If running on an IBM machine, copy the script named Rayleigh/etc/make_dirs to your run directory and execute the script. This will create the directory structure expected by Rayleigh for its outputs. This step is unnecessary when compiling with the Intel or GNU compilers. Next, you must create a main_input file. This file contains the information that describes how your simulation is run. Rayleigh always looks for a file named main_input in the directory that it is launched from. Copy one of the sample input files from the Rayleigh/input_examples/ into your run directory, and rename it to main_input. The file named *benchmark_diagnostics_input* can be used to generate output for the diagnostics plotting tutorial (see §\ :ref:`diagnostics`). Finally, Rayleigh has some OpenMP-related logic that is still in development. We do not support Rayleigh’s OpenMP mode at this time, but on some systems, it can be important to explicitly disable OpenMP in order to avoid tripping any OpenMP flags used by external libraries, such as Intel’s MKL. Please be sure and run the following command before executing Rayleigh. This command should be precede *each* call to Rayleigh. :: export OMP_NUM_THREADS=1 (bash) setenv OMP_NUM_THREADS 1 (c-shell) Code Execution and Load-Balancing --------------------------------- Rayleigh is parallelized using MPI and a 2-D domain decomposition. The 2-D domain decomposition means that we envision the MPI Ranks as being distributed in rows and columns. The number of MPI ranks within a row is *nprow* and the number of MPI ranks within a column is *npcol*. When Rayleigh is run with N MPI ranks, the following constraint must be satisfied: .. math:: \mathrm{N} = \mathrm{npcol} \times \mathrm{nprow} . If this constraint is not satisfied , the code will print an error message and exit. The values of *nprow* and *npcol* can be specified in *main_input* or on the command line via the syntax: :: mpiexec -np 8 ./rayleigh.opt -nprow 4 -npcol 2 Load Balancing ~~~~~~~~~~~~~~ Rayleigh’s performance is sensitive to the values of *nprow* and *npcol*, as well as the number of radial grid points :math:`N_r` and latitudinal grid points :math:`N_\theta`. If you examine the main_input file, you will see that it is divided into Fortran namelists. The first namelist is the problemsize_namelist. Within this namelist, you will see a place to specify nprow and npcol. Edit main_input so that nprow and npcol agree with the N you intend to use (or use the command-line syntax mentioned above). The dominate effect on parallel scalability is the number of messages sent per iteration. For optimal message counts, nprow and npcol should be as close to one another in value as possible. #. N = nprow :math:`\times` npcol. #. nprow and npcol should be equal or within a factor of two of one another. The value of nprow determines how spherical harmonics are distributed across processors. Spherical harmonics are distributed in high-\ :math:`m`/low-:math:`m` pairs, where :math:`m` is the azimuthal wavenumber. Each process is responsible for all :math:`\ell`-values associated with those :math:`m`\ ’s contained in memory. The value of npcol determines how radial levels are distributed across processors. Radii are distributed uniformly across processes in contiguous chunks. Each process is responsible for a range of radii :math:`\Delta r`. The number of spherical harmonic degrees :math:`N_\ell` is defined by .. math:: N_\ell = \frac{2}{3}N_\theta For optimal load-balancing, *nprow* should divide evenly into :math:`N_r` and *npcol* should divide evenly into the number of high-\ :math:`m`/low-:math:`m` pairs (i.e., :math:`N_\ell/2`). Both *nprow* and *npcol* must be at least 2. In summary, #. :math:`nprow \ge 2`. #. :math:`npcol \ge 2`. #. :math:`n \times npcol = N_r` (for integer :math:`n`). #. :math:`k \times nprow = \frac{1}{3}N_\theta` (for integer :math:`k`). Specifying Resolution & Domain Bounds ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As discussed, the number of radial grid points is denoted by :math:`N_r`, and the number of :math:`\theta` grid points by :math:`N_\theta`. The number of grid points in the :math:`\phi` direction is always :math:`N_\phi=2\times N_\theta`. :math:`N_r` and :math:`N_\theta` may each be defined in the problemsize_namelist of main_input: :: &problemsize_namelist n_r = 48 n_theta = 96 / :math:`N_r` and :math:`N_\theta` may also be specified at the command line (overriding the values in main_input) via: :: mpiexec -np 8 ./rayleigh.opt -nr 48 -ntheta 96 If desired, the number of spherical harmonic degrees :math:`N_\ell` or the maximal spherical harmonic degree :math:`\ell_\mathrm{max}\equiv N_\ell-1` may be specified in lieu of :math:`N_\theta`. The example above may equivalently be written as :: &problemsize_namelist n_r = 48 l_max = 63 / or :: &problemsize_namelist n_r = 48 n_l = 64 / The radial domain bounds are determined by the namelist variables :math:`rmin` (the lower radial boundary) and :math:`rmax` (the upper radial boundary): :: &problemsize_namelist rmin = 1.0 rmax = 2.0 / Alternatively, the user may specify the shell depth (:math:`rmax-rmin`) and aspect ratio (:math:`rmin/rmax`) in lieu of :math:`rmin` and :math:`rmax`. The preceding example may then be written as: :: &problemsize_namelist aspect_ratio = 0.5 shell_depth = 1.0 / Note that the interpretation of :math:`rmin` and :math:`rmax` depends on whether your simulation is dimensional or nondimensional. We discuss these alternative formulations in §\ :ref:`physics` Controlling Run Length & Time Stepping -------------------------------------- A simulation’s runtime and time-step size can be controlled using the **temporal_controls** namelist. The length of time for which a simulation runs before completing is controlled by the namelist variable **max_time_minutes**. The maximum number of time steps that a simulation will run for is determined by the value of the namelist **max_iterations**. The simulation will complete when it has run for *max_time_minutes minutes* or when it has run for *max_iterations time steps* – whichever occurs first. An orderly shutdown of Rayleigh can be manually triggered by creating a file with the name set in **terminate_file** (i.e., running the command *touch terminate* in the default setting). If the file is found, Rayleigh will stop after the next time step and write a checkpoint file. The existence of **terminate_file** is checked every **terminate_check_interval** iterations. The check can be switched off completely by setting **terminate_check_interval** to -1. Both of these options are set in the **io_controls_namelist**. With the appropriate job script this feature can be used to easily restart the code with new settings without losing the current allocation in the queuing system. A **terminate_file** left over from a previous run is automatically deleted when the code starts. Time-step size in Rayleigh is controlled by the Courant-Friedrichs-Lewy condition (CFL; as determined by the fluid velocity and Alfvén speed). A safety factor of **cflmax** is applied to the maximum time step determined by the CFL. Time-stepping is adaptive. An additional variable **cflmin** is used to determine if the time step should be increased. The user may also specify the maximum allowed time-step size through the namelist variable **max_time_step**. The minimum allowable time-step size is controlled through the variable **min_time_step**. If the CFL condition is less than this value, the simulation will exit. Let :math:`\Delta t` be the current time-step size, and let :math:`t_\mathrm{CFL}` be the maximum time-step size as determined by the CFL limit. The following logic is employed by Rayleigh when calculating the time-step size: - IF { :math:`\Delta_t\ge \mathrm{cflmax}\times t_\mathrm{CFL}` } THEN { :math:`\Delta_t` is set to :math:`\mathrm{cflmax}\times t_\mathrm{CFL}` }. - IF { :math:`\Delta_t\le \mathrm{cflmin}\times t_\mathrm{CFL}` } THEN { :math:`\Delta_t` is set to :math:`\mathrm{cflmax}\times t_\mathrm{CFL}` }. - IF{ :math:`t_\mathrm {CFL}\ge \mathrm{max\_time\_step}` } THEN { :math:`\Delta_t` is set to max_time_step } - IF{ :math:`t_\mathrm {CFL}\le \mathrm{min\_time\_step}` } THEN { Rayleigh Exits } The default values for these variables are: :: &temporal_controls_namelist max_iterations = 1000000 max_time_minutes = 1d8 cflmax = 0.6d0 cflmin = 0.4d0 max_time_step = 1.0d0 min_time_step = 1.0d-13 /