Command line options¶
Overview¶
Arguments are processed in hoomd.context.initialize(). Call hoomd.context.initialize() immediately after importing hoomd so that the requested MPI and GPU options can be initialized as early as possible.
There are two ways to specify arguments.
On the command line:

python script.py [options]

import hoomd
hoomd.context.initialize()

Within your script:

import hoomd
hoomd.context.initialize("[options]")
With no arguments, hoomd.context.initialize() will attempt to parse all arguments from the command line, whether it understands them or not. When you pass a string, it ignores the command line (sys.argv) and parses the given string as if it were issued on the command line. In Jupyter notebooks, use context.initialize("") to avoid errors from Jupyter-specific command line arguments.
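For example, a script can request its options directly. A minimal sketch; the option string is illustrative, and any of the options listed below may be combined in it:

import hoomd
# Parse the given string instead of sys.argv.
hoomd.context.initialize("--mode=gpu --notice-level=3")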
Options¶
no options given
hoomd will automatically detect the fastest GPU and run on it, or fall back on the CPU if no GPU is found.
-h, --help
print a description of all the command line options
--mode={cpu | gpu}
force hoomd to run either on the cpu or gpu
--gpu=#
specify the GPU id or comma-separated list of GPU ids (with NVLINK) that hoomd will use. Implies --mode=gpu.
--ignore-display-gpu
prevent hoomd from using any GPU that is attached to a display
--minimize-cpu-usage
minimize the CPU usage of hoomd when it runs on a GPU, at reduced performance
--gpu_error_checking
enable error checks after every GPU kernel call
--notice-level=#
specifies the level of notice messages to print
--msg-file=filename
specifies a file to write messages (the file is overwritten)
--single-mpi
allow single-threaded HOOMD builds in MPI jobs
--user
user options
- MPI-only options
--nx=#
Number of domains along the x-direction
--ny=#
Number of domains along the y-direction
--nz=#
Number of domains along the z-direction
--linear
Force a slab (1D) decomposition along the z-direction
--nrank=#
Number of ranks per partition
--shared-msg-file=prefix
specifies the prefix of files to write per-partition output to (filename: prefix.<partition_id>)
- Option available only when compiled with TBB support
--nthreads=#
Number of TBB threads to use (by default, all CPUs in the system)
Detailed description¶
Control hoomd execution¶
HOOMD-blue can run on the CPU or the GPU. To control which, set the --mode option on the script command line. Valid settings are cpu and gpu:
python script.py --mode=cpu
When --mode is set to gpu and no other options are specified, hoomd will choose a GPU automatically. It prioritizes the GPU choice based on speed and whether it is attached to a display. Unless you take steps to configure your system (see below), running a second instance of HOOMD-blue will place it on the same GPU as the first. HOOMD-blue will run correctly with more than one simulation on a GPU as long as there is enough memory, but at reduced performance.
You can select the GPU on which to run using the --gpu command line option:
python script.py --gpu=1
Note
--gpu implies --mode=gpu. To find out which id is assigned to each GPU in your system, download the CUDA SDK for your system from http://www.nvidia.com/object/cuda_get.html and run the deviceQuery sample.
If you run a script without any options:
python script.py
hoomd first checks if there are any GPUs in the system. If it finds one or more, it makes the same automatic choice described previously. If none are found, it runs on the CPU.
Multi-GPU (and multi-CPU) execution with MPI¶
HOOMD-blue uses MPI domain decomposition for parallel execution. Execute python with mpirun, mpiexec, or whatever the appropriate launcher is on your system. For more information, see MPI domain decomposition:
mpirun -n 8 python script.py
All command line options apply to MPI execution in the same way as single-process runs.
When n > 1 and no explicit GPU is specified, HOOMD uses the local MPI rank to assign GPUs to ranks on each node. This is the default behavior and works on most cluster schedulers.
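As a sketch of how a script can inspect its MPI placement (assuming an MPI-enabled HOOMD 2.x build), the hoomd.comm module reports the rank layout:

import hoomd
hoomd.context.initialize()
# Each rank reports its index; with --mode=gpu and no explicit
# --gpu option, local ranks are mapped to GPUs on each node.
rank = hoomd.comm.get_rank()
num_ranks = hoomd.comm.get_num_ranks()
print("rank {} of {}".format(rank, num_ranks))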
Multi-GPU execution with NVLINK¶
You can run HOOMD on multiple GPUs in the same compute node that are connected with NVLINK. To find out if your node supports it, run:
nvidia-smi topo -m
If the GPUs are connected by NVLINK, launch HOOMD with:
python script.py --gpu=0,1,2
to execute on GPUs 0, 1, and 2. Multi-GPU execution requires that all GPUs have the same compute capability, which must be 6.0 or greater. Not all kernels are currently NVLINK enabled; performance may depend on the subset of features used.
Multi-GPU execution with NVLINK may be combined with MPI parallel execution (see above). It is especially beneficial when further decomposition of the domain using MPI is not feasible or slower, but speed-ups are still possible.
Automatic free GPU selection¶
You can configure your system for HOOMD-blue to choose free GPUs automatically when each instance is run. To utilize this capability, the system administrator (root) must first use the nvidia-smi utility to enable the compute-exclusive mode on all GPUs in the system (for example, nvidia-smi -c EXCLUSIVE_PROCESS). With this mode enabled, running hoomd with no options or with the --mode=gpu option results in an automatic choice of the first free GPU from the prioritized list.
The compute-exclusive mode allows only a single CUDA application to run on each GPU. If you have 4 compute-exclusive GPUs available in the system, executing a fifth instance of hoomd with python script.py will result in the error: ***Error! no CUDA-capable device is available.
Most compute clusters do not support automatic free GPU selection. Instead, the scheduler pins jobs to specific GPUs and binds the host processes to the attached cores. In this case, HOOMD uses the rank-based GPU selection described above. HOOMD only applies exclusive-mode automatic GPU selection when built without MPI support (ENABLE_MPI=off) or when executing on a single rank.
Minimize the CPU usage of HOOMD-blue¶
When hoomd is running on a GPU, it uses 100% of one CPU core by default. This CPU usage can be decreased significantly by specifying the --minimize-cpu-usage command line option:
python script.py --minimize-cpu-usage
Enabling this option incurs a 10% overall performance reduction, but the CPU usage of hoomd is reduced to only 10% of a single CPU core.
Prevent HOOMD-blue from running on the display GPU¶
Running hoomd on the display GPU works just fine, but it does moderately slow the simulation and causes the display to lag. If you wish to prevent hoomd from running on the display, add the --ignore-display-gpu command line flag:
python script.py --ignore-display-gpu
Enable error checking on the GPU¶
Detailed error checking is off by default to enable the best performance. If you have trouble that appears to be caused by the failure of a calculation to run on the GPU, you should run with GPU error checking enabled to check for any errors returned by the GPU.
To do this, run the script with the --gpu_error_checking command line option:
python script.py --gpu_error_checking
Control message output¶
You can adjust the level of messages written to sys.stdout by a running hoomd script. Set the notice level to a high value to help debug where problems occur, or to a low number to suppress messages. Set it to 0 to remove all notices (warnings and errors are still output):
python script.py --notice-level=10
All messages (notices, warnings, and errors) can be redirected to a file. The file is overwritten:
python script.py --msg-file=messages.out
In MPI simulations, messages can be aggregated per partition. To write output for partitions 0, 1, ... in files messages.0, messages.1, etc., use:
mpirun python script.py --shared-msg-file=messages
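The same options may also be changed mid-script. A minimal sketch, assuming the hoomd.option setters set_notice_level() and set_msg_file() available in HOOMD 2.x:

import hoomd
hoomd.context.initialize("--notice-level=10")
# ... verbose setup phase ...
# Quiet down for production; warnings and errors still print.
hoomd.option.set_notice_level(0)
# Redirect subsequent messages to a file (the file is overwritten).
hoomd.option.set_msg_file("messages.out")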
Set the MPI domain decomposition¶
When no MPI options are specified, HOOMD selects the domain decomposition that minimizes the surface area between domains:
mpirun -n 8 python script.py
# 2x2x2 domain
The --linear option forces HOOMD-blue to use a 1D slab domain decomposition, which may be faster than a 3D decomposition when running jobs on a single node:
mpirun -n 4 python script.py --linear
# 1x1x4 domain
You can also override the automatic choices completely:
mpirun -n 4 python script.py --nx=1 --ny=2 --nz=2
# 1x2x2 domain
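The same settings can be passed from within the script. A sketch:

import hoomd
# Equivalent to the command line above: a 1x2x2 decomposition.
hoomd.context.initialize("--nx=1 --ny=2 --nz=2")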
You can group multiple MPI ranks into partitions to simulate independent replicas:
mpirun -n 12 python script.py --nrank=3
This sub-divides the total of 12 MPI ranks into four independent partitions, with 3 ranks (and thus 3 GPUs, in GPU runs) assigned to each.
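Within the script, hoomd.comm.get_partition() returns the index of the partition the current rank belongs to, which can be used to vary parameters across replicas. A sketch (the temperature values are illustrative):

import hoomd
hoomd.context.initialize("--nrank=3")
# One temperature per independent replica (illustrative values).
temperatures = [1.0, 1.2, 1.4, 1.6]
kT = temperatures[hoomd.comm.get_partition()]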
User options¶
User-defined options may be passed to a job script via --user and retrieved by calling hoomd.option.get_user(). For example, if hoomd is executed with:
python script.py --gpu=2 --ignore-display-gpu --user="--N=5 --rho=0.5"
then hoomd.option.get_user() will return ['--N=5', '--rho=0.5'], which is a format suitable for processing by standard tools such as optparse.
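For example, the returned list can be handed straight to a standard parser. The sketch below uses argparse (the modern successor to optparse), with the option names from the command line above:

import argparse
import hoomd

hoomd.context.initialize()
# Parse the options that HOOMD left untouched.
parser = argparse.ArgumentParser()
parser.add_argument('--N', type=int)
parser.add_argument('--rho', type=float)
args = parser.parse_args(hoomd.option.get_user())
print(args.N, args.rho)  # 5 0.5 for the command line above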
Execution with CPU threads (Intel TBB support)¶
Some classes in HOOMD support CPU threads using Intel’s Threading Building Blocks (TBB). TBB can speed up the calculation considerably, depending on the number of CPU cores available in the system. If HOOMD was compiled with support for TBB, the number of threads can be set. On the command line, this is done using:
python script.py --mode=cpu --nthreads=20
Alternatively, the same option can be passed to hoomd.context.initialize(), and the number of threads can be updated at any time using hoomd.option.set_num_threads(). If no number of threads is specified, TBB uses all CPUs in the system by default.
For compatibility with OpenMP, HOOMD also honors a value set in the environment variable OMP_NUM_THREADS.
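A sketch showing both ways of setting the thread count from within a script (requires a TBB-enabled build):

import hoomd
# Request 20 TBB threads at initialization.
hoomd.context.initialize("--mode=cpu --nthreads=20")
# The thread count may be changed at any time afterwards.
hoomd.option.set_num_threads(10)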