Ahhh! I love this one. FP was such an interesting language and point-free programming is still one of my favourite styles.
I like this topic too (breaking away from von Neumann style programming) because I believe that style is useful for some, but NOT all, problems; some problems are better solved with other paradigms. For example, mathematical algorithms and inherently sequential calculations (issuing commands to hardware often has to be strictly sequential) - I'm sure I've missed a load - are very well suited to von Neumann style programming. Other tasks, however, are not so well suited, as we have recently discovered: concurrent programming, for example, is pretty difficult in this style.
I'm sure there are plenty of alternatives, but the one I'm currently interested in is dataflow, which appears to solve concurrency inherently and without programmer input - though obviously, being implemented on von Neumann hardware means the implementations may not make the best use of this.
Dataflow is interesting, to me, because it seems to be a minor step from imperative programming - a step which could just as easily have been taken back when computing was in its infancy. I don't believe that it was; I believe imperative was chosen because it was, indeed, most suited to what early programmers were trying to achieve, and the flaws (mainly around concurrency) only became apparent a long time later. It's interesting to think about what programming would have been like if dataflow were the paradigm of choice, though. Ultimately, a nice balance between both would be best, IMHO.
I see it like this: in imperative programming, you have a stream of instructions which is routed through the data it processes (that is, the next instruction is chosen by the previous instruction, and they operate on a static (non-moving) block of data), whereas in dataflow you have static instructions with streams of data (that is, the data is routed by the instructions, much like logic gates route electrical signals in digital circuits). Conceptually, the difference is very small - but it makes a big difference.
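To make that concrete, here's a toy sketch (Python, everything made up) of the same computation both ways: the imperative version moves an instruction pointer over data sitting still, while the dataflow version wires fixed operations together and streams the data through them.

    # Imperative: fixed data, moving instructions.
    acc = 0
    for x in [1, 2, 3, 4]:
        acc += x * 2              # instructions mutate state in place

    # Dataflow: fixed operations, moving data (modelled with generators).
    def double(xs):
        for x in xs:
            yield x * 2           # values stream through this node

    def total(xs):
        t = 0
        for x in xs:
            t += x
        return t

    assert total(double([1, 2, 3, 4])) == acc == 20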
I think there's value in both approaches and would love a programming model which allowed both of these seamlessly. Oz provides some dataflow constructs, but IMHO they're too limited (and also the implementation should make use of these for implicit and transparent concurrency).
Thoughts?
You're right that dataflow is well suited for concurrent programming. For a certain class of problems, its ability to express your intentions in simple form is outstanding.
I currently work on the compiler team for a dataflow programming language. Our compiler generates machine code for various processors, as well as VHDL for FPGAs. As you would probably expect, the generated code is very different depending on the target. Most CPUs are more suited to imperative programming than dataflow, so we often have to compromise on the granularity of parallelism to get good performance. On FPGAs, the compromise is typically reducing either parallelism or performance to save resources and make the design fit on the device.
Programming for a dataflow environment is definitely a very different experience. Our customers typically do not have a computer science background, and in some ways this is a blessing. Most people with a computer science background initially try to program imperatively when exposed to an inherently parallel dataflow language. It takes some time to adjust, but when it finally clicks you can begin to look at problems from both perspectives simultaneously. I feel like having a mastery of the different styles (imperative, dataflow, functional, etc.) has improved my overall ability to solve problems regardless of which model of computation I decide to use.
Are you able to talk about your dataflow language and/or compilation techniques? I'd love to hear about them, if possible, as it's something I'm very interested in and am looking into myself. I've found it difficult to find information on techniques for compiling a dataflow language to a traditional processor, so I have been making most of it up as I go, but my own work is far from optimal. Any insights you can provide here would be appreciated.
Are you simply converting dataflow code to blocks of imperative code and have these blocks running concurrently (eg, in threads) and communicating with each other? Or is there a more elaborate instruction scheduling mechanism at work?
Without going into too much detail for obvious reasons, we are doing as you mention - converting dataflow into blocks of imperative code and scheduling them on threads - although the scheduling we do, both at the thread level and at the block level, is non-trivial and reasonably optimized. I'm sure most of what I mention below you've already discovered if you're implementing a dataflow language, but just in case:
When implementing/generating code for a dataflow language it really depends on the level of granularity you want for your parallelism. On one extreme, you have super-fine-grained parallelism and can schedule each operation independently. On the other extreme, you have little or no parallelism and pre-schedule all of your operations in one big chunk. In the middle somewhere is typically a sweet spot where you have chunks of operations pre-scheduled, and then dynamically schedule those chunks.
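To illustrate that middle ground with a toy (Python, all names hypothetical): operations are fused into chunks up front, and only the chunks are scheduled dynamically.

    from concurrent.futures import ThreadPoolExecutor

    # Each chunk is a run of operations pre-scheduled at compile time
    # into one callable; the dynamic scheduler only ever sees chunks.
    def chunk_a(x):                   # fused: t = x * 2; t + 1
        return x * 2 + 1

    def chunk_b(x):                   # fused: t = x * x; -t
        return -(x * x)

    pool = ThreadPoolExecutor()
    fa = pool.submit(chunk_a, 10)     # the two chunks are independent,
    fb = pool.submit(chunk_b, 10)     # so they may run in parallel
    print(fa.result() + fb.result())  # join where the dataflow graph joins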
A key to optimal dataflow code generation (while retaining parallelism) is figuring out where to draw that line regarding parallelism granularity. Too granular and you spend all your time scheduling. Too coarse and you lose all of your parallelism. The ideal point somewhere in the middle likely depends on the operations themselves (i.e. the composition of your language). In some cases it is a decision you have to make by trial and error. In other cases you can make more informed decisions based on the nature of the operations and the context of their use within the program.
Communication is really important as well. CPU caches encourage you to minimize data copies and operate on the data in place when possible (to minimize cache line fetches). Thus if your communication is based on a scheme where you need separate copies for each operation then your performance will really suffer. Message passing can work, so long as you are passing things by pointer and aren't having to make copies.
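In Python terms (purely to show the shape of the idea; a real runtime would do this in C or similar): hand references across the queue, never the payload itself.

    from queue import Queue

    q = Queue()
    payload = bytearray(10_000_000)  # big buffer, allocated once
    q.put(payload)                   # enqueue a reference, not a copy
    out = q.get()                    # the consumer sees the same buffer
    assert out is payload            # no bytes were copied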
My own simple implementation currently compiles mathematical expressions with an (optional) pre- and post-condition into a single block and then executes these blocks in a thread pool (ie, they don't have dedicated threads, but share a number of pre-created threads). There is obviously a significant overhead in passing data between the blocks, determining when they are ready to execute and so on. I have not yet looked at merging blocks; at some point I think I'll JIT-merge multiple small blocks into a single large block.
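The readiness test can be as simple as a per-block counter of outstanding inputs. A sketch of the idea (Python, not my actual code):

    import threading
    from concurrent.futures import ThreadPoolExecutor

    class Block:
        def __init__(self, fn, n_inputs, pool):
            self.fn, self.pool = fn, pool
            self.pending = n_inputs       # inputs not yet received
            self.args = {}
            self.lock = threading.Lock()
            self.consumers = []           # (block, port) pairs downstream

        def feed(self, port, value):
            with self.lock:
                self.args[port] = value
                self.pending -= 1
                ready = (self.pending == 0)
            if ready:                     # all inputs arrived: fire
                self.pool.submit(self.run)

        def run(self):
            result = self.fn(**self.args)
            for block, port in self.consumers:
                block.feed(port, result)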
As for data, my current system is statically typed, so it knows ahead of time what the datatype is. Simple values are passed in place and copied as needed. Complex values (structures, lists etc) are (or will be, I still have some work to do here) passed as pointers and the pointers copied as needed. In this case, the data itself uses a copy-on-write scheme, so that if it is only read, then no copies are made.
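Copy-on-write can be sketched with a shared reference count (again hypothetical Python; the real thing lives in the runtime):

    class CowList:
        def __init__(self, items):
            self.items = items
            self.refs = 1                # how many blocks share this buffer

        def share(self):
            self.refs += 1               # read-only passing just bumps the count
            return self

        def write(self, i, value):
            if self.refs > 1:            # someone else can see this buffer:
                self.refs -= 1           # detach, copy, then mutate the copy
                clone = CowList(list(self.items))
                clone.items[i] = value
                return clone
            self.items[i] = value        # sole owner: mutate in place
            return self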
My implementation is currently not very intelligent. Operations are executed and scheduled in a virtual machine of sorts, though I will be JIT-compiling expressions and conditions within the next week or two. There's still loads of room for optimization, though, and I'm still trying to think of ways to make the block scheduling cheaper.
Always interested in hearing how people go about these problems, so thanks again for sharing.
Dataflow programming is exceptional for GUI programming, although I've never tried using it in a large application.
This style of programming does its best if it has support from the underlying language to enforce consistency. Without it, it's just too easy to get the program into an inconsistent state.
I'm aware of only two implementations providing dataflow programming: Cells for Common Lisp and FrTime for PLT Scheme (but I've actually used only the former).
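The spreadsheet-style cell idea in miniature (a toy in Python, nowhere near what Cells or FrTime actually do):

    class Cell:
        def __init__(self, value=None, formula=None, inputs=()):
            self.value, self.formula = value, formula
            self.dependents = []
            for c in inputs:
                c.dependents.append(self)

        def set(self, value):
            self.value = value
            for d in self.dependents:    # keep downstream cells consistent
                d.recompute()

        def recompute(self):
            self.value = self.formula()
            for d in self.dependents:
                d.recompute()

    a, b = Cell(1), Cell(2)
    total = Cell(formula=lambda: a.value + b.value, inputs=(a, b))
    total.recompute()
    a.set(10)                            # total updates automatically
    assert total.value == 12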
"(issuing commands to hardware often has to be strictly sequential, for example)"
Do they, though? I mean, yes, sure, with current hardware this is true, but is this a true necessity, or an artifact of the fact that modern hardware is designed to be used by von Neumann machines?
Honest question, I don't know enough about hardware to have an answer. Can there be "functional hardware", or perhaps more practical, "dataflow hardware"?
It would be hard to create truly "functional" hardware. A piece of hardware is a real physical object with an internal state which is shared by all accessors and which can change over time. Shared, modifiable state is the exact opposite paradigm from the one that pure functional languages use.
I think this is also why pure functional languages are so hard for "normal" people to understand -- their basic premise is the opposite of the real world around us that's filled with stateful physical objects.
I think one other reason why von Neumann-type imperative programming was chosen over dataflow IS because sequential instructions and shared memory makes a lot of sense to us.
Having said that, a lot of non-programmers use graphical dataflow languages all the time, so maybe dataflow is ACTUALLY easier (at least for non-programmers - maybe the mathematical feel of imperative code attracts us programmers to it?) - eg, graphics tools often have dataflow languages for defining animations, materials, lighting and so on. I once used a dataflow tool for generating 3D visualizations and it was very intuitive.
The way I have had it explained to me (which I still don't quite get) is that an object is immutable if there was no time. When you mutate an object, it only happens due to time passing (ie if you were to "go back in time" the object would be as it was) - so changing something creates something else (in time). Confused.
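One way to picture it (a toy in Python, not anyone's real semantics): never overwrite, just append a new snapshot per instant - "going back in time" is then just indexing.

    # Each "mutation" produces a new snapshot; earlier ones still exist.
    history = [{"x": 0}]                     # the object at t=0
    history.append({**history[-1], "x": 5})  # "mutate" it at t=1
    assert history[0]["x"] == 0              # go back in time: unchanged
    assert history[1]["x"] == 5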
If the commands don't have to be executed sequentially, then you don't have to issue them sequentially.
A modern processor will look at the incoming instructions and execute whatever it can without creating a hazard; this is called out-of-order execution.
A very simple example is user input and output: I want to print the prompt text before I let the user type their input.
The same is true for plenty of other hardware: I need to move the robot forward before I turn on the drill; I need to print page 1 before page 2 - etc etc. There are plenty of inherently sequential operations, that's all I was trying to say. ;-)
But you are not incorrect - obviously there IS some hardware which is only sequential because it was designed for von Neumann machines to drive it.
Dataflow hardware can exist (there are, or were, dataflow processors made) and hardware itself is inherently dataflow. In fact, VHDL and Verilog are inherently dataflow languages (possibly) running on FPGAs, which I guess could be seen as a sort of low level dataflow processor.
EDIT: @scott_s: Sure, but the end effect is the same, isn't it?
I think it's inaccurate to characterize VHDL and Verilog as "languages running on FPGAs." FPGAs aren't designed in any way to run those languages; they're designed to emulate hardware. Since those are the languages used to describe hardware - including modern processors - what they produce can be mapped to run on an FPGA.
dkersten, indeed, the effect is the same. But this is a good discussion, and I wanted to make sure we all had the facts right.
Meta talk: responding to my post instead of editing yours makes more sense to me. First, I'm more likely to see it, and second, now I'm in the awkward position of replying to myself. I also find it disheartening that someone downmodded your comment from 2 to 1, presumably because I corrected you, but probably unaware that I was the one who upmodded you to begin with. Your point was valid, I just wanted the record to be accurate.
I'm glad that you are correcting me. I like to be corrected, because it means that I'm not relying on invalid information and I very much like learning new things, especially if related to something which interests me - like dataflow.
Sorry, HN wouldn't allow me to reply to your comment for some reason, so I edited mine.
PG recently implemented a brief "cooling off" period for comments, where there was no reply link. The purpose was to prevent long, nit-picking back-and-forths.
Why is imperative programming still the most widely understood and used form by far (despite its well-known weaknesses)?
Because the world, in most cases, is sequential. Understanding of time and tenses is very important for navigating life. We learn to process in a sequential style from babyhood (and likely over evolutionary time as well).
Mathematics (anything beyond the basics) and functional programming are hard for 99+% of the population and more than half of programmers. (People here are above average, so your mileage may vary.)
Functional programming corresponds to the Present Simple tense in English. The tense is not that common in everyday conversation and we rarely see it in complex sentences apart from technical/scientific writing. No wonder most people have little training for such style of thinking.
I think you're "almost" right. The world is not quite sequential - the world is composed of a vast quantity of concurrently running agents which communicate through a message passing system. Not unlike a many-core system running sequential tasks and communicating via message passing. Or a dataflow system, where the data stream could be seen as messages being passed between concurrently executing operations.
You are right, though: we work and think in a very sequential manner. Once you introduce concurrency to any aspect of everyday life, it becomes more difficult for us to grasp. Even cooking is difficult when you prepare multiple things at once and must remember the timing for each of them and schedule your time, or the work between multiple cooks. With practice, it's not so bad - programming is the same: it gets complicated when you stop using a sequential paradigm, but with practice other paradigms can become natural too.
It is also for this reason that I think the "ultimate language" (tm) would be a mixture of something like dataflow and sequential programming, offering the programmer the choice of whichever style is most natural to the task at hand, without having to put much effort into splitting the program into components of either category. I have started to see languages moving that way, slowly, though I think it'll be a while yet before they're usable enough to replace currently popular languages.
I'm reluctant to characterize the world as being composed of independent agents passing messages. I can't put my finger on it exactly, but it just feels like a very lossy oversimplification.
Can you elaborate at all? I'd be very interested to hear (since this is a topic I find quite intriguing).
For humans, we ourselves may work sequentially, but there are many of us working concurrently - like many, many sequential processing cores. Perhaps your comment was targeted at the "message passing" aspect. It's a bit of shared state mixed with message passing, really. Often we interact directly - I could tell you something (message passing) or I can shout something (multicast message passing). Of course, we also operate in a shared state model: multiple people taking food from a plate have to synchronise so that we don't clash (physically, in this case).
If I were to conceive of the universe as a program, I would think of it as one giant state with a few fundamental particles and a simple functional program that resolves the state over time based on the previous state.
Passing messages to me implies a finite set of messages and pre-conceived interaction, whereas in reality there is no such thing as discrete objects or interactions, everything interacts with everything, and the world we know is emergent from the low level physical properties.
I suppose there could be an infinitely polymorphous OO program behind the universe, but that just doesn't strike me as a good fit with what we know about physics.
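If I had to write that conception down (a toy sketch in Python, nothing more): one big state and a pure step function applied over and over.

    def step(state):               # pure: next state from previous state
        x, v = state
        return (x + v, v)          # trivial "physics": position drifts

    state = (0.0, 1.0)
    for _ in range(3):
        state = step(state)        # time is just repeated application
    assert state == (3.0, 1.0)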
I never meant to imply OOP when I said message passing. I guess it was the wrong term to use, because it's got too much of an OOP ring to it. What I mean is that events or packets of data or what have you are passed between independent agents. The communication is loosely coupled from the agents, and the agents all function independently (they often work together, but they "function" independently).
If I say "Hello, Bob", this "message" or communicatable data is passed from me to Bob. But coupling is also low - Mary and Peter, who are standing nearby, can also hear this - unless I whisper.
Basically, the events or messages or data packets or whatever you want to call them are simply sent OUT for whatever to pick them up and do with them what they will. The messages can be directed though (say to Bob vs simply yell) and the listeners can be restricted (whisper). In this respect, I see the world as independent threads of execution which communicate through some (anonymous, if I hear someone yell "Fire", I don't care who) message dispatch system. Something like this: http://didntread.files.wordpress.com/2009/07/components.png
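In code, the same idea might look like this minimal dispatch sketch (Python, all names invented):

    class Bus:
        def __init__(self):
            self.listeners = []

        def listen(self, agent):
            self.listeners.append(agent)

        def shout(self, msg):            # anonymous broadcast: anyone hears it
            for agent in self.listeners:
                agent(msg)

    bus = Bus()
    bus.listen(lambda m: print("Bob heard:", m))
    bus.listen(lambda m: print("Mary heard:", m))
    bus.shout("Fire!")                   # listeners don't know the sender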
People are made up of sub-agents, which themselves are made up of smaller, independent agents and so on. Cells communicate with nearby cells through proteins and chemical reactions and electrical signals - these are messages being passed (anonymously) between agents.
Now, if each agent was a single thread of execution, then you have discrete points where communication and interaction occurs. You said that this is not how the world works, and you are right.

Imagine instead a network of connected operations, where, when one operation produces some result, it passes it to all the connected operations. When an operation has received all its inputs, it can execute - concurrently, in parallel with any other operations which also have their dependencies met. At some point, these operations too will complete their processing and their output is propagated to connected operations. In such a system, when a message is received or when an interaction occurs, the network of operations (the agent) can process it immediately. No longer does the agent have to wait for a certain discrete point in a sequential processing system to communicate or interact - it can do so as soon as some data is available.

These networks do contain state, just like you said the agents in the real world do, but instead of having some fixed memory buffer contain the state (well, technically, you could have that too..), the state is maintained in feedback loops between operations (think of a flip-flop in electronics: the output of a logic gate is fed back as its input, either directly or indirectly through other logic gates).
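The feedback-loop idea in miniature (a toy Python sketch): wire a node's output back around as one of its inputs, and the state lives on that edge rather than in a memory cell.

    # A counter node: its previous output circulates on a feedback edge,
    # so the running total is held by the loop, not by outside memory.
    def counter(increments):
        looped = 0                    # the value travelling on the loop
        for inc in increments:
            looped = looped + inc     # node fires: new output, fed back
            yield looped

    assert list(counter([1, 1, 1])) == [1, 2, 3]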
What I have just described is a dataflow system and THAT is how I think the world works, if it were a programming system.
Note that some things ARE inherently sequential - the operations are still ordered by their dependencies. Maybe the operations themselves process their data in a number of sequential steps. This is fine, because certain things require ordering to make sense or to work well.
Every verb in what you just wrote is in the present simple, with the single exception of "your mileage may vary". It's probably the commonest of all tenses both in everyday conversation and in complex sentences. I see no reason to think that people in general (or programmers in particular) lack experience of expressing things in the present tense. It's also far from clear to me that there's any real correspondence between the present simple and functional programming.
Apart from that, though, I liked your last paragraph just fine.
Good observation on the irony of my own writing. :-)
I should have added abstract writing in general to technical/scientific.
In general, Present Simple tense is used when referring to abstract concepts, which is rarer in real life than in this forum. (A lot of people here are abstract thinkers.) Most news pieces and conversations refer to actual events and do not use as much Present Simple.
I usually complain about the ever-decreasing quality of the HN frontpage, but today practically every single link is great! I don't know what changed, but I love it.