Ten top tips for converting programs from serial to parallel

At yesterday’s BCS Fortran event in London, which included an overview of Fortran standards and an introduction to coarrays for parallel programming, Stephen Blair-Chappell of Intel outlined some techniques for converting serial programs to run on parallel architectures. His presentation was based around a workshop taking a serial program and converting it to run in parallel, but here are the top tips I picked up along the way:

  1. Before starting, work out whether you can solve the problem with a relatively simple hardware upgrade. A faster computer or a memory upgrade might be what you need. Don’t overlook the obvious and then dive into parallel programming just because it’s a sexier technology.
  2. Read the compiler manual to see what optimisations it can offer you. Blair-Chappell said it can be amazing what the compilers can do at a flick of a switch. The automatic vectorisation in the Intel Compiler can deliver a speedup of 2x just by switching it on (depending on the program).
  3. Make good use of the available hardware features. Sometimes you can change the code yourself to take advantage of technologies like SSE instructions, which compilers support through intrinsics. He showed how it was possible to achieve a speedup of 24x by changing 14 lines of code in one program.
  4. Use software like Intel VTune to measure how software performs, but focus on a few key events. VTune registers a couple of thousand events, but there are five key ones you need to know about: cpu_clk_unhalted.core (the time tick), inst_retired.any (the count of completed instructions), bus_trans_any.all_agents (which shows any bottleneck between CPU and memory), rs_uops_dispatched.cycles_none (which measures any cycles where no instructions were dispatched), and mem_load_retired.l2_miss (which measures L2 cache misses; a lot of these suggest the program is poorly configured, compiled or written).
  5. Intel VTune was described as a ‘beast of a tool’. Although it’s not easy to use, everyone who optimises performance uses it, so it’s worth persevering with.
  6. If you join a project mid-way through, make sure you know whether it’s already running in parallel or not. People often say a program is parallel, when it isn’t, and vice versa.
  7. Find hotspots using a tool like Intel Parallel Amplifier. Make busy things parallel. It’s not worth optimising for things that aren’t already busy.
  8. Beware of unnecessary duplication. The demonstration included a progress counter that was updated on screen on every loop iteration, rather than only when there was a change to report. This had a dramatic negative impact on the speed of the application.
  9. Use Amdahl’s Law to calculate the maximum potential speedup, so that you can benchmark progress.
  10. Make sure you validate the correctness of your parallel application. It’s easy to be distracted by a rapid speedup and not notice that the program is only doing half the work it should be because of a data race.
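
To make tip 9 concrete, here is a minimal sketch of Amdahl’s Law in Python. It wasn’t part of the talk: the function names `amdahl_speedup` and `amdahl_limit` are my own, and the 90% parallel fraction in the example is just an assumed figure for illustration. The law says that if a fraction p of the runtime can be made parallel and you have n workers, the best possible speedup is 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the serial
    runtime is spread perfectly across `workers` processors."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Speedup ceiling as the number of workers grows without bound."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Assumed example: 90% of the runtime is parallelisable.
    for n in (2, 4, 8, 16):
        print(f"{n:2d} workers: at most {amdahl_speedup(0.9, n):.2f}x")
    print(f"ceiling: {amdahl_limit(0.9):.1f}x")
```

With those assumed numbers, eight workers give at most about 4.7x, and no amount of hardware gets you past 10x, because the serial 10% always remains. That ceiling is the figure to measure your progress against.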
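
The progress counter problem in tip 8 can be sketched in a few lines of Python. This is not the code from the demonstration, which I don’t have; `run_with_progress` is a hypothetical function of my own that only writes to the screen when the whole-number percentage actually changes, and it returns the number of writes so you can see the difference.

```python
import io
import sys


def run_with_progress(total_steps: int, out=sys.stdout) -> int:
    """Run a loop of `total_steps` iterations, writing progress only
    when the integer percentage changes. Returns the number of writes."""
    writes = 0
    last_reported = -1
    for step in range(total_steps):
        # ... the real work for this iteration would go here ...
        percent = (step + 1) * 100 // total_steps
        if percent != last_reported:
            # Only touch the (slow) output stream when there is news.
            out.write(f"\r{percent}% complete")
            last_reported = percent
            writes += 1
    out.write("\n")
    return writes


if __name__ == "__main__":
    # Discard the progress text itself and just count the updates.
    updates = run_with_progress(100_000, out=io.StringIO())
    print(f"100,000 iterations, {updates} screen updates")
```

A naive version would write to the screen 100,000 times here; this one writes once per percentage point. Screen output is slow compared with most loop bodies, so the saving can be substantial, although how much depends on the program.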
