Julia and Python: a dynamic duo for scientific computing [video] (lanyrd.com)
136 points by luu on Aug 11, 2013 | 18 comments



To be fair, I haven't used Julia, but I have heard it suggested as an open-source replacement for Matlab/Octave. To that end, I hope it turns out well. When I was looking to see where the project was at, I got nervous when I read a recent thread on the -dev mailing list: https://groups.google.com/d/msg/julia-dev/2JcZdFKisis/Ag9rBJ...

In particular, the line "[...] I suspect that isequal should be transitive (it has to be for hashing to work), while == will not be transitive. We still need some coherent theory of what == means." made me wince. If one of the benefits that the video makes for Julia over Python is a type system, then there ought to be a pretty well-developed sense of how promotion and equality between types is going to work. That seems like a piece of the language whose theoretical underpinnings should have been nailed down already.


Most languages don't have a coherent theory of what "==" means. Consider all the well-known problems in JavaScript and PHP, where "==" isn't even symmetric, let alone transitive. Equality isn't transitive in C or Java either, and in Python and Ruby it's a bit of a free-for-all. Most languages can sweep their lack of a theory for "==" under the carpet simply by saying "here are the rules, learn 'em", since you can't create your own numeric types or add behaviors to "==". In Julia, since anyone can add new numeric types that "==" will apply to, we need something better: we need to be able to tell people exactly what "==" means so that they can correctly implement it for their own types. This choice has deep implications for arithmetic (does "x == y" if and only if "x - y == 0"?), hashing, and promotion, among other things.
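As a contrived Python sketch of that free-for-all (the `Sneaky` class is invented here purely for illustration): since any class can define `__eq__` however it likes, the language itself cannot promise that "==" is transitive.

```python
# Any Python class can define __eq__ arbitrarily, so transitivity
# of == is a convention, not a guarantee of the language.
class Sneaky:
    def __eq__(self, other):
        return True  # claims equality with everything

s = Sneaky()
print(s == 0)       # True
print(s == 1)       # True
print(0 == 1)       # False -- transitivity broken
```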

The particular thread you're quoting was about a change (by me) that briefly altered the promotion behavior of rational numbers (and was reverted before any harm was done). It was fairly rapidly concluded that the existing behaviors of ints and floats were correct and that rationals should fall in line with those. Specifically, the correct behavior of "==" is that it must be transitive ("x == y" && "y == z" ==> "x == z"), but "x-y == 0" may occur in cases where "x != y" (e.g. when "x = 2^53+1; y = 2.0^53").
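For what it's worth, Python's int/float comparison is also exact, so the same 2^53 example can be checked there (a small sketch, not from the thread itself):

```python
# Python compares int and float exactly, but subtraction first
# rounds the int to a float -- so x != y while x - y == 0.
x = 2**53 + 1    # the integer 9007199254740993
y = 2.0**53      # the float   9007199254740992.0

print(x == y)       # False: exact comparison sees different values
print(x - y == 0)   # True: x is rounded to 2.0**53 before subtracting
```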


Can you give an example of "==" failing to be transitive in C? The standard defines "==" as a predicate that tests the numerical equality of the values of the (promoted) operands; unless you can hide something very clever in the promotion rules, that's going to make the relation transitive. [Edit: as Stefan points out, there are cases where C's promotion rules cause values to be rounded (which should have been obvious) - int32 to float32 and int64 to float64 in particular; this naturally leads to transitivity failures.]

I’m also curious about your distinction between x == y and x - y == 0. Those are equivalent expressions for all reasonable integral and floating-point types (they fail to be equivalent when floating-point types lack denormals, but that’s “not reasonable”).


The issues come in the mixed type cases. Consider this C program:

    #include <stdio.h>

    int main(void) {
        // note: assumes a 64-bit long, as on typical LP64 systems
        long   x = 9007199254740992;    // 2^53
        double y = 9007199254740992.0;  // 2.0^53
        long   z = 9007199254740993;    // 2^53+1
        printf("x == y: %d\n", x == y);
        printf("y == z: %d\n", y == z);
        printf("x == z: %d\n", x == z);
        return 0;
    }
Its output shows that "==" is not transitive in C:

    x == y: 1
    y == z: 1
    x == z: 0
On the other hand, "==" is transitive in Julia:

    julia> x, y, z = 9007199254740992, 9007199254740992.0, 9007199254740993;

    julia> x == y, y == z, x == z
    (true,false,false)
You have to be careful: if you compare "y" and "z" by converting "z" to Float64, they appear to be equal even though they represent different integer values. Subtraction, however, does promote both operands to Float64, so their difference is zero:

    julia> z - y
    0.0
This kind of issue is why it's so important to have a theory of what "==" means.


Ah, of course.

So what you’ve essentially done is to make it so that comparisons and arithmetic use distinct promotion rules; comparisons are done without conversions that change value, whereas conversions are applied for arithmetic?

Transitivity failures that arise from clearly specified conversion rules are at least easy to understand once you encounter them. If I understand correctly, to avoid losing transitivity, you’ve introduced a divergence between how types behave in arithmetic and comparisons, and broken the equivalence between x - y == 0 and x == y. To my mind as a numericist and low-level programmer, that’s much worse; it’s the path that Excel started to follow a long time ago, and it leads to all sorts of maddening details when corner cases collide.

My preferred solution for high-level languages is “promote the operands to a type that can represent both exactly” (for the case of double and int64, this would be a floating-point type with at least 64 bits of significand and 11 exponent bits, like the Intel 80-bit format or quad). For a compute-oriented language, there are good performance reasons to avoid that, but it’s definitely the solution that complies with the principle of least surprise.
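A rough Python sketch of that idea, using `Fraction` as a stand-in for a sufficiently wide exact type (the `exact_eq` helper is hypothetical, not from any library):

```python
from fractions import Fraction

# "Promote both operands to a type that represents both exactly":
# Fraction(float) is exact, so neither operand changes value
# before the comparison.
def exact_eq(a, b):
    return Fraction(a) == Fraction(b)

x = 2**53 + 1
y = 2.0**53
print(exact_eq(x, y))      # False: the values really differ
print(exact_eq(y, 2**53))  # True
```

An 80-bit or 128-bit float plays the same role for the int64/double case, just with better performance than arbitrary-precision rationals.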


I don't think Julia is viewed by the relevant communities as a replacement for Matlab, Octave, et al.; likewise, I don't think that was the purpose underlying the Julia project.

"Scientific Python"--i.e., Python + NumPy + SciPy + Matplotlib (and perhaps IPython)--has been a de facto open-source replacement for Matlab for some time now. This is apparent in the similarity of syntax: most significantly, NumPy copied Matlab's beautifully dense array-indexing syntax. Performance-wise, across a reasonably complete set of benchmarks reflecting the workflow of a typical user, Scientific Python and Matlab are interchangeable (https://modelingguru.nasa.gov/docs/DOC-1762).

Instead, what the Julia creators wanted to do, as I understand it, is provide an array-oriented language with true compiled-code performance; a level of performance that Scientific Python users can only achieve through additional steps in their workflow, usually rewriting performance-critical portions of the code in another language and compiling them inline: Cython, Weave (inline C++), numexpr (a JIT for a limited set of NumPy functions), direct access to the BLAS (provided via a SciPy module), SWIG, f2py, ....


> Cython, Weave (inline C++), numexpr (a JIT for a limited set of NumPy functions), direct access to the BLAS (provided via a SciPy module), SWIG, f2py, ....

I agree with your comment, I just want to add a few more Python performance options to the mix:

   * Numba (http://numba.pydata.org/)

   * Parakeet (https://github.com/iskandr/parakeet)

   * Pythran (https://github.com/serge-sans-paille/pythran)


Have you considered pointing this out to them? Perhaps the people in the project are unaware that this is a solved problem.


Not really - the discussion on the thread makes it sound like the kind of thing that comes up from time to time anyway. If I had to guess, I'd expect that it happened because the mission was to make something good enough to replace Matlab - starting from the Matlab interface people were used to and working backwards, rather than starting from a concept of what a language ought to be and working forwards. Given my understanding of their aim and the kinds of things I have read about Matlab at http://abandonmatlab.wordpress.com/, I suspect that "close enough to Matlab to have people jump ship" and "first-principles CompSci language" aren't really going to intersect much. Given the low popularity of some of the first-principles programming languages, that may be okay?

That said, I'm encouraged to see so many people interested in it and contributing to it. My perspective, though, is that of a C/Fortran HPC person who is cheering for things that will help people move to a more scalable coding environment. I was looking at Julia to see if it was an alternative to Octave for university users; the impression I got was that it was still too subject to change to recommend at this point.


Hehe, then it'd interest you to know I'm working on writing HPC-grade numerical tools in Haskell. I'm actually a few weeks away from doing a preview release, even (terrifyingly enough).

[edit: and the preview release will include some cute, simple SIMD codes I've written - not the optimal ones, but a really nicely written set nonetheless.]


I (and I'm guessing some of the people involved with Julia) would argue that regardless of how good the numerical programming tools are, you'll never get the average working scientist to use Haskell. That sounds like an interesting project though, I've become curious lately about the prospects for integrating high performance numerical computing with functional programming.


You certainly won't if you don't try!

However, here at Jane Street at least, many of the traders are picking up and using OCaml. If traders can do it, so can scientists. And if it works for OCaml, it'll probably work for Haskell. (Which, honestly, is quite a bit nicer overall.)

Also, the general consensus seems to be that Haskell is easier to learn if you don't already have extensive imperative programming experience. So it's also a good fit for scientists in that sense. If all the needed libraries and tools existed--many do, and more will soon--and it were easy to get up and running, Haskell would be a perfect tool. Happily, it's moving in that direction, becoming simpler and more capable.


The problem is not ==, but == for rationals.
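A hedged illustration of why rationals are the awkward case, using Python's `Fraction` (which compares to floats exactly, mirroring the issue rather than Julia's actual implementation): most rationals have no exact float, so an exact "==" disagrees with "convert to float, then compare".

```python
from fractions import Fraction

# 1/2 has an exact binary float; 1/3 does not, so exact comparison
# and compare-after-conversion give different answers for it.
print(Fraction(1, 2) == 0.5)           # True: exactly representable
print(Fraction(1, 3) == 1/3)           # False: the float is a rounded 1/3
print(float(Fraction(1, 3)) == 1/3)    # True: converting first loses exactness
```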


We're thinking about adding a Julia API: https://plot.ly/api

Hit us up if you'd like to be a part of this or have input. And hey, we're also hiring: https://plot.ly/jobs


Nice. I know plenty of biologists who would love to have a simple-to-use interface for making publication-ready figures. Excel just isn't cutting it anymore with the complexity of the data we get today.

Hope you guys will still be hiring when I graduate next year. This sounds like a really fun project to work on.


I'm very impressed by the smoothness of the Julia-Python interoperation they demonstrate here. The IPython implementation (IJulia) has made a lot of progress since then and is really cool.


Clojure and Incanter have served me well in the place of Python and R lately.


It's interesting. When Julia first came onto the scene, it seemed everyone was gung-ho to rewrite everything in Julia. Now they've capitulated and are simply incorporating Python bits...




