The .NET multithreading APIs and constructs aimed at leveraging multicore processors
are:
• Parallel LINQ or PLINQ
• The Parallel class
• The task parallelism constructs
• The concurrent collections
• SpinLock and SpinWait
These are new in .NET Framework 4.0 and are collectively known as PFX (Parallel Framework).
The Parallel class together with the task parallelism constructs is called the
Task Parallel Library or TPL.
Why Parallel?
In recent years, CPU clock speed increases have reached a plateau, and manufacturers
have focused instead on increasing core counts. This is an issue for us as developers,
since typical single-threaded code does not automatically take advantage
of the extra cores on the machine it runs on. To really benefit from them,
we have had to partition our work into small chunks,
execute these in parallel via the thread pool, and collate the results in a thread-safe
way. Prior to PFX, this was awkward and error-prone. In addition,
the usual approach of locking on objects for thread safety causes a lot of contention
when many threads are working on the same data.
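To make the manual approach described above concrete, here is a minimal sketch of it, not taken from any particular codebase. The class name ManualPartitioning, the method SumOfSquares and the sum-of-squares workload are all hypothetical and chosen only for illustration. It partitions a range by hand, queues each chunk on the thread pool, and collates the partial sums under a lock.

```csharp
using System;
using System.Threading;

static class ManualPartitioning
{
    // Hypothetical workload: sum the squares of 0..count-1 using hand-rolled partitioning.
    public static long SumOfSquares(int count, int chunks)
    {
        long total = 0;
        object gate = new object();   // guards 'total' during collation
        int pending = chunks;         // number of chunks still running

        using (var done = new ManualResetEvent(false))
        {
            int chunkSize = (count + chunks - 1) / chunks;
            for (int c = 0; c < chunks; c++)
            {
                // Copy the bounds into locals so each closure captures its own values.
                int start = c * chunkSize;
                int end = Math.Min(start + chunkSize, count);

                ThreadPool.QueueUserWorkItem(_ =>
                {
                    long partial = 0;
                    for (int i = start; i < end; i++) partial += (long)i * i;

                    lock (gate) { total += partial; }   // thread-safe collation

                    // The last chunk to finish signals the waiting caller.
                    if (Interlocked.Decrement(ref pending) == 0) done.Set();
                });
            }
            done.WaitOne();   // block until every chunk has reported in
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(1000, 4));
    }
}
```

Even for such a small job, the bookkeeping (chunk bounds, a completion signal, a lock) outweighs the actual work, which is the pain point PFX is meant to remove.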
PFX has two main concepts: data parallelism and task parallelism. When a set of tasks
has to be performed on many data values, you can parallelize by having each thread
do the same set of tasks on a subset of the values. This is data parallelism,
because we are partitioning the data between threads.
Conversely, with task parallelism we partition the tasks, with each thread performing
a different task.
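The difference between the two can be sketched in a few lines. The class ParallelismKinds and its methods SquareAll and MinAndMax are hypothetical names used only for this illustration. The first method applies one operation to every element (data parallelism); the second runs two different operations at the same time (task parallelism).

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class ParallelismKinds
{
    // Data parallelism: the same operation runs on every element,
    // and the index range is split across threads for us.
    public static int[] SquareAll(int[] values)
    {
        var result = new int[values.Length];
        // Each iteration writes to its own slot, so no locking is needed.
        Parallel.For(0, values.Length, i => result[i] = values[i] * values[i]);
        return result;
    }

    // Task parallelism: two different operations run concurrently.
    // Assumes 'values' is non-empty, since Min and Max throw on an empty sequence.
    public static Tuple<int, int> MinAndMax(int[] values)
    {
        Task<int> min = Task.Factory.StartNew(() => values.Min());
        Task<int> max = Task.Factory.StartNew(() => values.Max());
        // Reading Result blocks until the corresponding task has finished.
        return Tuple.Create(min.Result, max.Result);
    }

    static void Main()
    {
        int[] data = { 3, 1, 4, 1, 5, 9, 2, 6 };
        Console.WriteLine(string.Join(", ", SquareAll(data)));
        Console.WriteLine(MinAndMax(data));
    }
}
```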
In .NET Framework 4.0, the two highest-level PFX components are the Parallel class and
PLINQ. With the Parallel class, the work is partitioned for you, but you need to
write the code that collates the results yourself. With PLINQ, all steps are automated; you simply declare
that you want your work parallelized, structure it as a LINQ query, and off you
go.
To use PLINQ, simply call AsParallel() on the input sequence and continue the LINQ
query as you normally would. PLINQ is also useful for queries that are not
CPU-intensive but which involve a blocking call, such as waiting
for a web page to download or for a Ping call to return:
using System;
using System.Linq;
using System.Net.NetworkInformation;

namespace PFX
{
    class Program
    {
        static void Main(string[] args)
        {
            // Ping each site in parallel; Send blocks until a reply or timeout.
            var query = from site in new[]
                        {
                            "www.eggheadcafe.com",
                            "www.microsoft.com",
                            "www.google.com",
                            "www.bing.com",
                            "www.yahoo.com",
                            "www.codeplex.com"
                        }
                        .AsParallel().WithDegreeOfParallelism(6)
                        let p = new Ping().Send(site)
                        select new
                        {
                            site,
                            Result = p.Status,
                            Time = p.RoundtripTime
                        };

            // ToList() runs the query and collates the results into one list.
            foreach (var s in query.ToList())
            {
                Console.WriteLine(s.site + ": " + s.Result + ": " + s.Time);
            }
            Console.ReadKey();
        }
    }
}
WithDegreeOfParallelism forces PLINQ to run the specified number of tasks simultaneously. You need to do
this when calling a blocking method, because PLINQ otherwise assumes that the query
is CPU-intensive and allocates tasks accordingly. On a two-core machine, PLINQ
might default to running only two tasks at once, which is clearly not what we
want in this example, where each task spends most of its time waiting on the network.
Improving performance with ForAll()
A big advantage of PLINQ is that it collates results from parallel operations into
a single output sequence. Often, though, all you end up doing with that sequence is calling some
method on each element. In that case, if you do not care about the order
in which the elements are processed, you can improve performance with PLINQ's
ForAll method, which runs a delegate on each result directly and skips the collation step.
So in the example above, instead of the foreach loop, we could just do something
like this:
query.ForAll(s => Console.WriteLine(s.site + ": " + s.Result + ": " + s.Time));
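Because ForAll invokes its delegate from several threads at once, anything the delegate touches must be thread-safe. The following sketch, which needs no network access, shows one way to do that with a concurrent collection. The class ForAllSketch and the method UpperCaseAll are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

static class ForAllSketch
{
    // Upper-cases every word in parallel and gathers the results in a
    // thread-safe bag. The order of items in the bag is not defined.
    public static ConcurrentBag<string> UpperCaseAll(string[] words)
    {
        var bag = new ConcurrentBag<string>();
        words.AsParallel()
             .Select(w => w.ToUpperInvariant())
             .ForAll(bag.Add);   // called concurrently, so the target must be thread-safe
        return bag;
    }

    static void Main()
    {
        foreach (var w in UpperCaseAll(new[] { "plinq", "forall", "pfx" }))
            Console.WriteLine(w);
    }
}
```

A plain List<string> would not be safe here; ConcurrentBag is one of the concurrent collections that ship with PFX for exactly this kind of use.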
The Parallel Class
Parallel.Invoke executes an array of Action delegates in parallel and waits for
them all to complete.
Here is how we could use Parallel.Invoke to download the default page of each of
the sites above:
// Requires: using System.Net; using System.Threading.Tasks;
// Each delegate creates its own WebClient instance.
Parallel.Invoke(
    () => new WebClient().DownloadFile("http://www.eggheadcafe.com", "one.htm"),
    () => new WebClient().DownloadFile("http://www.microsoft.com", "two.htm"),
    () => new WebClient().DownloadFile("http://www.google.com", "three.htm"),
    () => new WebClient().DownloadFile("http://www.bing.com", "four.htm"),
    () => new WebClient().DownloadFile("http://www.yahoo.com", "five.htm"),
    () => new WebClient().DownloadFile("http://www.codeplex.com", "six.htm")
);
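Parallel.Invoke does not collate anything for you, so each delegate has to put its result somewhere safe. A minimal sketch of one way to do this, using in-memory work instead of downloads, is shown here. The class InvokeSketch and the method SumAndCountEvens are hypothetical names for illustration only.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class InvokeSketch
{
    // Runs two unrelated computations in parallel. Each delegate writes to
    // its own local variable, so no locking is needed.
    public static long[] SumAndCountEvens(int[] values)
    {
        long sum = 0;
        long evens = 0;
        Parallel.Invoke(
            () => sum = values.Sum(v => (long)v),
            () => evens = values.Count(v => v % 2 == 0));
        // Invoke returns only after both delegates have completed.
        return new[] { sum, evens };
    }

    static void Main()
    {
        long[] r = SumAndCountEvens(Enumerable.Range(1, 10).ToArray());
        Console.WriteLine("sum=" + r[0] + ", evens=" + r[1]);
    }
}
```

If several delegates needed to write to the same variable or collection, you would need a lock or one of the concurrent collections instead.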
This short article gives only a quick look at PFX. For more information, see the Parallel Whitepaper, a remarkable PDF whitepaper by Stephen Toub, and it's free. It also contains
a link to the Parallel Samples solution. Another good resource for
PFX is "C# 4.0 in a Nutshell" by the Albahari brothers, which I have reviewed on this site.