
In other words, more confusion. Don't read if you don't understand words like "asymptotic efficiency" and "nondeterministic composition". Ugh.

Can someone put this in lay terms?




Sure,

So pretty much he is saying that concurrency and parallelism aren't the same thing. (Which is kind of a 'duh' if you think about it).

Parallelism is concerned with efficiency of the algorithm in a parallel environment. His example is that quick-sort is an easily parallelized algorithm (each level of recursion is two completely independent steps). This means that at each level, you can run the independent steps at the same time, meaning that in a parallel environment, you can see a big speed improvement.
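To sketch that idea (in Python, purely for illustration since the article talks about ML): the two recursive calls share no data, so nothing stops them from running at the same time. Python threads won't actually make this faster because of the GIL, but the structure is the point.

```python
# Sketch: quicksort's two recursive calls are completely independent,
# so they can run concurrently without any locking.
from concurrent.futures import ThreadPoolExecutor

def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    lo = [x for x in rest if x < pivot]
    hi = [x for x in rest if x >= pivot]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # the two halves share no data, so no coordination is needed
        left = pool.submit(quicksort, lo)
        right = pool.submit(quicksort, hi)
        return left.result() + [pivot] + right.result()

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```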

Concurrency deals more with the control flow of a program. Concurrency tries to manage resources in a 'nondeterministic' environment.

Pretty much, it's like an OS scheduler. The OS scheduler has to manage all of its resources, like the printer or the hard drive or whatever else, between different concurrently running processes, and prevent things like race conditions or deadlock. This isn't a parallelism issue, because you can have, for example, a single-core processor with a multi-tasking operating system.
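You can see concurrency-without-parallelism in a few lines of Python (a sketch, not how an OS scheduler is actually implemented): one thread, one event loop, two tasks taking turns.

```python
# One thread, many concurrent tasks: the event loop interleaves them,
# like an OS time-slicing processes on a single CPU.
import asyncio

log = []

async def worker(name, steps):
    for i in range(steps):
        log.append(f"{name}:{i}")
        await asyncio.sleep(0)   # yield control back to the scheduler

async def main():
    await asyncio.gather(worker("A", 3), worker("B", 3))

asyncio.run(main())
print(log)   # A and B interleave even though only one thread ever runs
```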

does that help?


Concurrency = managing shared resources. Things like mutual exclusion and transactions. Getting processors to get in line (serialization/composition) for a shared resource instead of just trampling over each other by just grabbing it when they want (non-determinism), so that actions that depend on being done in order and without interruption can be done.
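A minimal sketch of that "getting in line" in Python (illustrative, not tied to any language in the article): a lock serializes access to a shared counter so concurrent updates can't trample each other.

```python
# Mutual exclusion: the lock makes threads line up for the shared
# counter instead of interleaving their read-modify-write steps.
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:               # critical section: one thread at a time
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000, deterministic because of the lock
```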

Parallelism = splitting up subtasks that don't depend on each other. If the tasks don't depend on each other, the things doing them shouldn't need to talk. If you have enough processors, you can do almost everything at once, so the algorithm is only as slow as things that actually need to be done in order (asymptotic efficiency).

In imperative programming, we tell the computer what order to do things in, so it doesn't know what actions don't depend on each other - to parallelize, we have to worry about concurrency. In functional programming, the compiler knows what depends on what (what uses the result of something else), so it can parallelize things that don't depend on each other. It might be implemented with concurrency, but the programmer doesn't need to worry about it.
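A small sketch of that last point (in Python, standing in for a functional runtime): when subtasks are pure and independent, the runtime can farm them out without the programmer writing any locking code at all.

```python
# Pure, independent subtasks: no shared state, so the pool can run
# them in any order and the result is still deterministic.
from concurrent.futures import ThreadPoolExecutor

def f(x):
    return x * x           # pure: depends only on its argument

with ThreadPoolExecutor() as pool:
    results = list(pool.map(f, range(5)))  # result order is preserved

print(results)  # [0, 1, 4, 9, 16]
```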


Parallelism is recognizing the 5 guys should be able to dig a ditch close to 5 times as fast as 1. Concurrency is making sure everyone has a shovel and is digging in the right place.


Determinism is making sure the digging is in the right place. Non-deterministic digging may be possible, but it's harder to get right.

Concurrency could also be having one guy with one shovel, but digging iteratively in 5 separate places and digging a shovelful or two at a time.

Or to put it another way, parallelism is digging two ditches with 4 people 4 times as fast as 1 person digging two ditches; concurrency is 1 person digging two ditches at the same time, alternating; determinism is making sure that the two ditches are being dug correctly, at an even pace, etc.


Great illustration! I can just imagine it: 5 ditches all over the place, going in 5 different directions and/or crossing; or one big hole, because all 5 started at the same spot and kept hitting each other with elbows and shovel handles while digging; or, starting at the correct place and in the correct direction, just 1 guy digging at any given time while the other 4 wait for their chance. (I did my share of digging at student summer construction projects some 20+ years ago :)


He means:

Parallelism is a property of the machine. We are obsessed with the idea of making a program run in less time by better utilizing the hardware.

Concurrency is a property of the program. It is about describing that multiple events can happen independently of each other.

An operating system kernel (an old one, say) is concurrent but not parallel, for instance. Concurrently it will handle multiple users, input from keyboard, network and disk, switch process contexts and so on. Yet only a single CPU is doing the work, so everything happens in a serial stream of events.

The point where the two blend is when you have a concurrent system and want it to execute faster. Then you will often find yourself employing parallel tactics to speed the system up. In the OS case, modern NICs and disk drives have processors on them carrying out tasks in parallel with the CPU. So the system is no longer strictly just concurrent. They also have SMP capability, of course.

Parallel but not concurrent systems exist as well. Perhaps the best example is a GPU, which is massively parallel (1700+ cores is not uncommon), but all the cores are doing the same thing. Hence, it cannot handle different events at different times with independence.


I found the article he links to at the end to be a more transparent explanation:

http://ghcmutterings.wordpress.com/2009/10/06/parallelism-co...


But Bob Harper really makes two points, and Simon Marlow's article only covers one of them. The first point is that parallel functional languages make it possible to write correct parallel programs easily, by not exposing any nondeterminism to the programmer. The second point is that for such a language it's particularly easy to describe what the running time of the program will be, by talking about the depth and work of a computation.

Of course, the reason Marlow doesn't make that point is that it's not true for Haskell! In ML, the work involved in computing f(g(1)) is the work of computing g plus the work of computing f. But in Haskell that's not true, since if f doesn't use its argument, g will never run.
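You can simulate that in a strict language like ML (here sketched in Python) by passing a thunk instead of a value: if f never forces the thunk, g's work simply never happens, which is what makes the cost of f(g(1)) hard to predict under lazy evaluation.

```python
# Simulating Haskell-style laziness in a strict language: pass a
# zero-argument function (a thunk) instead of an evaluated value.
calls = []

def g(x):
    calls.append("g")        # record whether g ever actually ran
    return x + 1

def f_lazy(thunk):
    return 42                # ignores its argument: the thunk is never forced

result = f_lazy(lambda: g(1))
print(result, calls)         # 42 [] -- g never ran, so its work cost nothing
```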


Yes, that's a better article than this one.


Concurrency does not necessarily have anything to do with parallel programming, although the latter may need to be aware of concurrency issues if the parallel algorithm needs to manage shared resources and sequential dependencies.

The gist of the article is that functional languages lend themselves to parallel constructs, and such tooling should be built into languages themselves to free the programmer from worrying about concurrency issues. For more details, read on...

Asymptotic efficiency is just a fancy way of saying algorithmic efficiency. In computer science terms, asymptotic analysis measures how the amount of work grows relative to the size of the input; e.g., sorting an N-element list using Quicksort takes an amount of work proportional to N log N.

Now, when you try to parallelize an algorithm such as Quicksort, you are concerned with your speedup -- how much you will gain by processing in parallel. The speedup is not always linear because of sequential dependencies within the algorithm. Wiki has a decent primer on the subject: http://en.wikipedia.org/wiki/Speedup.
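Amdahl's law makes the "not always linear" part concrete (hypothetical numbers, just a sketch): if a fraction p of the work can be parallelized over n processors, the sequential remainder caps the total speedup.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the
# parallelizable fraction and n the number of processors.
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# With a 10% sequential part, even unlimited processors cap out near 10x.
print(round(speedup(0.9, 4), 2))      # 3.08
print(round(speedup(0.9, 1000), 2))   # 9.91
```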

In this case, the article was looking at the asymptotic efficiency of Quicksort's critical path, i.e., the chain of steps that must run sequentially, which determines just how much the rest can be parallelized.

Concurrency, on the other hand, deals with more classical problems such as scheduling and resource deadlock. It is the OS's job to manage resources, which is why you hear a lot about concurrency in terms of the OS. And since thread/process scheduling by the OS is unpredictable, this is where 'nondeterministic composition' comes from. This manifests in things like race conditions. http://en.wikipedia.org/wiki/Race_condition
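Here is one bad interleaving of a race condition replayed deterministically in Python (a sketch: real races happen when a scheduler interleaves two threads like this unpredictably):

```python
# A lost update, replayed step by step: two tasks both read the
# counter before either writes back, so one increment disappears.
counter = 0
a_read = counter          # task A reads 0
b_read = counter          # task B reads 0 (before A wrote back!)
counter = a_read + 1      # task A writes 1
counter = b_read + 1      # task B writes 1 -- A's increment is lost
print(counter)  # 1, not 2: the result depends on the interleaving
```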

So it may become the parallel application's responsibility to avoid such nondeterministic behavior within resources shared by the application. Note this doesn't just apply to threads running on the same computer/OS; it may also apply to data coming in from the network from a distributed parallel algorithm.

In conventional, procedural programming languages, concurrency has to be addressed explicitly through things like locks and barriers. Functional languages help make this more implicit due to immutability. Languages like Erlang go one step further and absolve you of any responsibility for things like IPC.
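Erlang-style message passing can be sketched in Python (a queue standing in for a process mailbox; this is an illustration, not Erlang's actual API): each worker owns its state outright, so no locks are needed.

```python
# Message passing instead of shared memory: only the worker thread
# ever touches `total`, so there is nothing to lock.
import queue
import threading

inbox = queue.Queue()
results = []

def adder():
    total = 0
    while True:
        msg = inbox.get()
        if msg is None:          # sentinel: shut down and report
            results.append(total)
            return
        total += msg

t = threading.Thread(target=adder)
t.start()
for i in range(1, 6):
    inbox.put(i)                 # send messages 1..5
inbox.put(None)
t.join()
print(results[0])  # 15
```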

HTH.



