School of Computer Science

# The Big Dance as seen by Barnes and Hutt

As you can guess from the title of the assignment, you are going to implement the Barnes-Hutt algorithm. The emphasis will be on load balancing, since oct-trees are not very good at knowing where the bodies are located. You should use the BSP (Bulk Synchronous Programming) model to implement the algorithm.

In BSP all of the processors do the following:

• compute
• barrier
• synchronize
• barrier
• repeat

This approach is easy to implement and the serial code will not wind up looking all that different from the parallel code. The implementation should be on a shared memory machine.

Now to the fun part-load balancing. You can implement one of two load balancing algorithms (1) orthogonal recursive bisection (2) cost-zones. The first scheme is the easier of the two, and only works in 2-D. The second scheme will take more work to implement. You should use shared memory for this load balancing algorithm, as it will be much easier to implement this way.

The choice is yours-if you feel adventurous and have the time, try 2. However it is not required (you will clearly get extra consideration for implementing the second method). There will be no loss of marks if you chose 1.

You should use 4 to 6 processors . Assume that the particles all have the same mass (say 1). You should present results for several (at least 3) different initial configurations, i.e. different load distributions in which you show the results for all of the processors. You should run the algorithm without load balancing for one of the initial configurations in order to show the improvements produced by the algorithm.

## References

• The Barnes-Hutt lecture material, Berkeley slides are a good place to start. This will be posted. The material we cover in class will help.
• " A Hierarchical O(n log n) force calculation algorithm", J. Barnes and P. Hut, Nature, v. 324 (1986). This is the original paper.
• Blackston and Suel, Supercomputing 1997 is the article which covers the parallelization scheme
• Warren and Salmon, Supercomputing 1992 covers the orthogonal recursive bisection algorithm
• Warren and Salmon, Supercomputing 1992 for LET and hashed cost zones
• ## The assignment

Your assignment should be submitted in two parts:
• A tar file containing a separate directory for each of your programs. Each directory should contain your source code and a Makefile.