Accessing System Configurations With MPI

Overview

Questions

  • How can I access the state of the simulation in parallel simulations?

  • What are the differences between local and global snapshots?

Objectives

  • Describe how to write GSD files in MPI simulations.

  • Show examples using local snapshots.

  • Show examples using global snapshots.

Writing GSD files in parallel jobs

You can write GSD files in parallel jobs just as you do in serial. Saving the simulation trajectory to a file is useful for visualization and analysis after the simulation completes. As mentioned in the previous section, write a single program and add the write.GSD operation with identical parameters on all ranks:

[2]:
%pycat lj_trajectory.py
import hoomd

# Initialize the simulation.
device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

# Set the operations for a Lennard-Jones particle simulation.
integrator = hoomd.md.Integrator(dt=0.005)
cell = hoomd.md.nlist.Cell()
lj = hoomd.md.pair.LJ(nlist=cell)
lj.params[('A', 'A')] = dict(epsilon=1, sigma=1)
lj.r_cut[('A', 'A')] = 2.5
integrator.forces.append(lj)
nvt = hoomd.md.methods.NVT(kT=1.5, filter=hoomd.filter.All(), tau=1.0)
integrator.methods.append(nvt)
sim.operations.integrator = integrator

# Define and add the GSD operation.
gsd_writer = hoomd.write.GSD(filename='trajectory.gsd',
                             trigger=hoomd.trigger.Periodic(1000),
                             mode='xb')
sim.operations.writers.append(gsd_writer)

# Run the simulation.
sim.run(1000)

[3]:
!mpirun -n 4 python3 lj_trajectory.py
notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.

Modifying particle properties with local snapshots

Use snapshots when you need to modify particle properties during a simulation, or perform analysis where the results need to be known as the simulation progresses (e.g. umbrella sampling). Local snapshots provide high performance direct access to the particle data stored in HOOMD-blue. The direct access comes with several costs. Your script can only access particles local to the domain of the current rank. These particles may appear in any order in the local snapshot and a given particle is only present on one rank. To access the properties of a specific particle, find the index given the particle’s tag and handle the condition where it is not present on the rank.

The example below demonstrates this with an example that doubles the mass of all particles, and quadruples the mass of the particle with tag 100:

[4]:
%pycat local_snapshot.py
import hoomd

# Initialize the simulation.
device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

# Access the local snapshot.
with sim.state.cpu_local_snapshot as snapshot:
    N = len(snapshot.particles.position)

    # Double the mass of every particle.
    snapshot.particles.mass *= 2

    # Look up the index of the particle with tag 100.
    idx = snapshot.particles.rtag[100]

    # Modify the particle, but only if it is found.
    # This condition will be true on one rank and false on the others.
    if idx < N:
        snapshot.particles.mass[idx] *= 2

[5]:
!mpirun -n 4 python3 local_snapshot.py
notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.

Notice how the example uses the rtag lookup array to efficiently find the index of the particle with the given tag. When the particle is not present on the local rank, rtag is set to a number greater than the local number of particles.

Any analysis you perform may require MPI communication to combine results across ranks using mpi4py (this is beyond the scope of this tutorial).

Using global snapshots with MPI

Global snapshots collect all particles onto rank 0 and sort them by tag. This removes a number of the inconveniences of the local snapshot API, but at the cost of much slower performance. When you use global snapshots in MPI simulations, you need to add if snapshot.communicator.rank == 0: checks around all the code that accesses the data in the snapshot. The get_snapshot() call itself MUST be made on all ranks. Here is an example that computes the total mass of the system using a global snapshot:

[6]:
%pycat global_snapshot.py
import hoomd
import numpy

# Initialize the simulation.
device = hoomd.device.CPU()
sim = hoomd.Simulation(device=device)
sim.create_state_from_gsd(filename='random.gsd')

# Call get_snapshot on all ranks.
snapshot = sim.state.get_snapshot()

# Access particle data on rank 0 only.
if snapshot.communicator.rank == 0:
    total_mass = numpy.sum(snapshot.particles.mass)
    print(total_mass)

[7]:
!mpirun -n 4 python3 global_snapshot.py
notice(2): Using domain decomposition: n_x = 1 n_y = 2 n_z = 2.
6912.0

Summary

In this section, you have written trajectories to a GSD file, modified the state of the system efficiently using local snapshots, and analyzed the state of the system with a global snapshot - all with conditions that work in both MPI parallel and serial simulations. The next section of this tutorial shows you how to use MPI to run many independent simulations with different inputs.