Parallelism#
Multithreading = Use of multiple threads Concurrency = Order in which multiple tasks execute is not determined Parallelism = Simultaneous execution (e.g. on multiple cores)
class Printer {
char ch;
int sleepTime
public Printer(char c, int t) {
ch = c;
sleepTime = t;
}
public void Print() {
for (var i = 0; i < 100; i++) {
Console.Write(ch);
Thread.Sleep(sleepTime)
}
}
}
class Test {
static void Main() {
var a = new Printer('.', 60);
var b = new Printer('*', 70);
new Thread(() => a.Print()).Start();
new Thread(() => b.Print()).Start();
}
}
Thread types#
Foreground thread = Program will not terminate as long as at least one foreground thread is running Background thread = Background threads do not prevent the program from terminating
var bgThread = new Thread(…);
bgThread.IsBackground = true;
bgThread.Start();
Threadpooling#
Thread creation requires quite some resources ~1’000’000 clock cycles, about 1MB of memory and also requires kernel interaction. ThreadPool offers automatic thread management and recycling Number of threads is limited, additional requests are queued Used for short running tasks Don’t change thread priority or thread state Background threads only Control over the thread only inside the method given.
Number of Cores: Environment.ProcessorCount
Starting a new thread: var t = new Thread (() => { / ... /}); t.Start();
Wait for another thread to finish: t.Join()
Use a thread from the ThreadPool: ThreadPool.QueueUserWorkItem((o) => { / …/});
Mutual exclusion: lock(someObject){}
Using semaphores: var sem = new Semaphore(0,3); sem.WaitOne(); sem.Release();
Using barriers: var b = new Barrier(7); b.SignalAndWait();
Parallel Library#
Task Parallel Library (TPL) offers abstractions and reuse approach over threads
PLINQ#
PLINQ allows us to parallelize LINQ statements which is based on the TPL.
Partitioning Strategies#
Data parallelism = The simultaneous execution of the same function across split data of a data set. For example processing 100 elements, two cores work on 500 each.
Task parallelism = The simultaneous execution of multiple and different functions across the same or different data sets. For example sharpen and resize 100 pictures. First task sharpens, second task resizes.
Parallel.For#
//sequential execution
for(var i = 0; i < 10; i++)
{
Console.WriteLine(i);
}
// parallel execution
Parallel.For(0, 10, i =>
{
Console.WriteLine(i); // order is unspecified!
});
Parallel.ForEach#
string[] capitals = {"London", "Paris", ...}
//sequential execution
foreach(var city in capitals)
{
Console.WriteLine(city);
}
// parallel execution
Parallel.ForEach(capitals, city =>
{
Console.WriteLine(city); // order is unspecified!
});
Parallel.Invoke#
Parallel.Invoke(()=>Function1(),
()=>Functions2()
);
Task.Run#
Task < double > [] tasks = {
Task.Run(() => DoComputation1()),
Task.Run(() => DoComputation2())
};
var results = new double[tasks.Length];
for (var i = 0; i < tasks.Length; i++)
results[i] = await tasks[i];
Using PLINQ#
var parallelQuery = Enumerable.Range(3, 30)
.Where(n => SomePredicate(n))
.Sum(n => n * n);
// becomes
var parallelQuery = Enumerable.Range(3, 30).AsParallel()
.Where(n => SomePredicate(n))
.Sum(n => n * n);
Limit the number of threads with .WithDegreeOfParallelism(4)
PLINQ only parallelizes work, if it suspects benefits. You can force parallelization with .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
Ordering can be forced with .AsOrdered() Lift ordering requirement with .AsUnordered()
A function has “side effects” when it modifies something outside the function make sure this never happens!