Simple rules to write efficient parallel programs

It is said that “concurrent programming is still more art than science.”

Here are 8 simple rules for the efficient implementation of your parallel programs.

1. Identify truly independent computations: Operations that cannot run independently of each other cannot run in parallel, so the first step is to identify the independent units of computation.

2. Program using abstraction: Focus on writing parallel code, not code that manages threads and cores. Working directly with raw threads gives you flexibility, but the resulting code takes much longer to write, debug, and maintain. You can take advantage of the features offered by the Task Parallel Library (TPL) in .NET Framework 4 so that your high-level code reflects the problem rather than complex low-level thread management.

3. Program in tasks, not threads: Leave the mapping of tasks to threads or processor cores as a distinctly separate operation in your program, preferably through an abstraction that handles thread and core management for you. TPL allows you to implement task-based designs without worrying about the underlying threads.

4. Make concurrency configurable: Parallel code may also need to run on single-core processors. Making the degree of concurrency configurable also helps when debugging, because you can run the same code with a single worker.

5. Use synchronization mechanisms wisely: Instead of reaching for a lock, try to use classes, methods, and structures designed to eliminate the need for complex synchronization. TPL provides many options for avoiding heavyweight locks in complex scenarios, and it offers new lightweight synchronization mechanisms.

6. Use testing and debugging tools: Visual Studio 2010 offers new tools to debug, test, and examine the behavior of your parallel programs.

7. Use scalable memory allocators: TPL provides scalable memory allocators in the Common Language Runtime (CLR), and it uses them automatically when working with tasks and threads. However, to make the most of cache memory, you must analyze the different partitioning possibilities and avoid consuming excessive memory in each task.

8. Plan early for scalability to take advantage of increasing numbers of cores: If you prepare your design for future scalability, you will be able to write the code to scale as the number of cores increases. Windows 7 and Windows Server 2008 R2 support up to 256 hardware threads or logical processors; therefore, there is room for scalability.
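
Rules 1, 3, and 4 can be sketched together in a few lines. The example below is only an illustration and uses Python's standard concurrent.futures module rather than TPL; the function names word_count and count_all and the sample documents are hypothetical. Each document is an independent unit of work, the code submits tasks to a pool instead of creating threads, and the degree of parallelism is a parameter.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """One independent unit of work: it reads only its own input."""
    return len(text.split())


def count_all(documents, max_workers=None):
    """Count the words in each document with a configurable worker count.

    max_workers=1 makes the tasks run one at a time, which is handy for
    debugging; None lets the executor pick a default for the machine.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The pool, not this code, decides which thread runs which task.
        # map() returns the results in the same order as the inputs.
        return list(pool.map(word_count, documents))


documents = ["a b c", "d e", "", "f g h i"]  # hypothetical sample input
serial = count_all(documents, max_workers=1)
parallel = count_all(documents, max_workers=4)
```

Both calls return the same list, so the single-worker run can serve as a reference while debugging. In standard CPython builds, threads do not speed up CPU-bound work; ProcessPoolExecutor offers the same interface for that case.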

These rules will help you get the most out of multicore processors. Some of them will become more important as the number of cores per processor increases.
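
As a rough illustration of rules 5 and 7, one common way to avoid a lock is to partition the data so that each task produces its own partial result, then combine the partial results in a single final step. The sketch below uses Python's concurrent.futures module, not TPL; the helper names chunked, sum_of_squares, and parallel_sum_of_squares, the chunk size, and the sample data are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, chunk_size):
    """Split items into consecutive slices of at most chunk_size elements."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def sum_of_squares(chunk):
    """Each task works on its own chunk and returns a private partial result."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, chunk_size=1000, max_workers=4):
    """Sum the squares of items without any shared, lock-protected counter."""
    chunks = chunked(items, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(sum_of_squares, chunks)
        # The only combining step is this final reduction in the caller.
        return sum(partials)


numbers = list(range(10_000))  # hypothetical sample data
total = parallel_sum_of_squares(numbers)
```

The chunk size is the partitioning knob: smaller chunks spread the work more evenly, while larger chunks reduce scheduling overhead, so it is worth measuring a few values for your workload.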

Reference:

Dr. Dobb’s Journal article “Rules for Parallel Programming for Multicore” by James Reinders (www.drdobbs.com/hpc-high-performance-computing/201804248)

2 thoughts on “Simple rules to write efficient parallel programs”

    • As the number of processors in the system grows, the allocator's performance must scale linearly with the number of processors to ensure scalable application performance. In the case of TPL, the CLR handles this well.
