Friday, September 23, 2011

New Hero of .NET 4.0: Task Parallel Library


The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces in the .NET Framework version 4. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications.

Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code. However, not all code is suitable for parallelization. For example, if a loop performs only a small amount of work on each iteration, or it doesn't run for many iterations, then the overhead of parallelization can cause the code to run more slowly than the sequential version. Furthermore, parallelization, like any multithreaded code, adds complexity to your program execution. Although the TPL simplifies multithreaded scenarios, we recommend that you have a basic understanding of threading concepts, such as locks, deadlocks, and race conditions, so that you can use the TPL effectively. For more information about basic parallel computing concepts, see http://go.microsoft.com/fwlink/?LinkID=160570
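To make the point about race conditions concrete, here is a minimal sketch of a parallel loop that updates shared state. The class name SharedCounterSketch and the method SumTo are made up for illustration and are not part of the TPL; the only library calls used are Parallel.For and Interlocked.Add.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
class SharedCounterSketch
{
    // Adds the integers 1..n using a parallel loop.
    public static long SumTo(int n)
    {
        long total = 0;

        // Every iteration touches the same variable. A plain "total += i"
        // would be a race condition, because several threads could read
        // and write total at the same time and lose updates.
        // Interlocked.Add performs the update atomically.
        Parallel.For(1, n + 1, i =>
        {
            Interlocked.Add(ref total, i);
        });

        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumTo(1000)); // 1 + 2 + ... + 1000 = 500500
    }
}
```

A loop body this small is also a case where the parallel version is unlikely to be faster than a plain for loop, since the atomic update and the scheduling cost more than the addition itself. The sketch is about correctness, not speed.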

Data Parallelism (Task Parallel Library)

Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on elements in a source collection or array. Data parallelism is supported by several overloads of the For and ForEach methods in the System.Threading.Tasks.Parallel class. In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently. The TPL provides method-based parallel implementations of for and foreach loops. You write the loop logic for a Parallel.For or Parallel.ForEach loop much as you would write a sequential loop. You do not have to create threads or queue work items. In basic loops, where iterations do not share mutable state, you do not have to take locks. The TPL handles all the low-level work for you. The following code example shows a simple foreach loop and its parallel equivalent.


// Sequential version            
foreach (var item in sourceCollection)
{
    Process(item);
}

// Parallel equivalent
Parallel.ForEach(sourceCollection, item => Process(item));

When a parallel loop runs, the TPL partitions the data source so that the loop can operate on multiple parts concurrently. Behind the scenes, the Task Scheduler partitions the task based on system resources and workload. When possible, the scheduler redistributes work among multiple threads and processors if the workload becomes unbalanced.
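As a rough, self-contained sketch of the ideas above, the following program squares the elements of an array in two ways. The class and method names (DataParallelSketch, SquareAll, SquareAllChunked) are invented for this example. The first method uses Parallel.For and lets the TPL partition the index range on its own. The second uses Partitioner.Create to hand each worker a contiguous range of indices, which is the approach usually suggested when the loop body is very small.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
class DataParallelSketch
{
    // Squares every element with Parallel.For. Each iteration writes
    // only to its own slot of the result array, so no lock is needed.
    public static int[] SquareAll(int[] source)
    {
        var result = new int[source.Length];
        Parallel.For(0, source.Length, i =>
        {
            result[i] = source[i] * source[i];
        });
        return result;
    }

    // Same computation, but each delegate call processes a whole range
    // of indices (a Tuple<int, int> of inclusive start, exclusive end).
    // Assumes a non-empty array: Partitioner.Create(0, 0) would throw.
    public static int[] SquareAllChunked(int[] source)
    {
        var result = new int[source.Length];
        var ranges = Partitioner.Create(0, source.Length);
        Parallel.ForEach(ranges, range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                result[i] = source[i] * source[i];
            }
        });
        return result;
    }

    static void Main()
    {
        int[] source = { 1, 2, 3, 4, 5 };
        Console.WriteLine(string.Join(", ", SquareAll(source)));        // 1, 4, 9, 16, 25
        Console.WriteLine(string.Join(", ", SquareAllChunked(source))); // 1, 4, 9, 16, 25
    }
}
```

Both methods produce the same array. Which one is faster depends on the size of the data and the cost of the loop body, so it is worth measuring before choosing.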


Some useful links about TPL are:


How to: Write a Simple Parallel.For Loop


How to: Write a Simple Parallel.ForEach Loop


How to: Stop or Break from a Parallel.For Loop


How to: Speed Up Small Loop Bodies


The most helpful one is:


TPL With Other Asynchronous Patterns


