Yeah to be honest one of the things that made me skeptical of the code compression / productivity claim is looking at the implementations of Chez Scheme and Racket (after also looking at 20+ compilers / interpreters, and working on a language for a few years).
I'm pointing to them as very long-lived and valuable codebases written in Lisp dialects. Chez Scheme is a 35 year old codebase and Racket is also decades old.
So I'm not saying there's anything wrong with them, but I am saying that it doesn't appear that they're 10x easier to understand or modify than LLVM or CPython (Chez being a compiler and Racket being an interpreter as far as I remember). Or that you can get a 10x better result.
Basically for the claim to be true, why can't you write something like Racket in Racket 10x faster? Like 3 years instead of 30 years. And why doesn't it do way better things than CPython or Ruby? Those claims might be "slightly" true depending on who you are, but they're not devastatingly true. There's been more than enough time to evaluate the claims empirically.
In other words they would have already proven themselves in the market if that were the case. You would have extraordinarily productive teams using these languages -- along the lines of what PG hypothesized 15+ years ago.
http://www.paulgraham.com/avg.html
In fact the thing I found interesting is that at the core of Racket is a big pile of C code, just like CPython. A year or two ago I watched a talk about them self-hosting more, and moving to Chez Scheme's backend, but I don't recall the details now.
https://github.com/cisco/ChezScheme
https://en.wikipedia.org/wiki/Chez_Scheme
https://github.com/racket/racket/tree/master/racket/src/rack...
(FWIW I also looked at and hacked on femtolisp around the same time, since I was impressed by how Julia uses it.)
correction: it looks like Racket has a JIT too, written in C. Still, the same point applies: it's not magic and looks a lot like similar codebases in C. Chez is more self-hosted AFAIR, but it's also hundreds of thousands of lines of code.
> In other words they would have already proven themselves in the market if that were the case. You would have extraordinarily productive teams using these languages -- along the lines of what PG hypothesized 15+ years ago.
There's K and Kdb, et al. (I think they're on to "Q" now.)
I'd argue very little of what makes K and Kdb special has to do with the language per se, but with the focus on array operations, and with the particular runtime environment of Kdb (heavily focused on distributed services that log (and can replay) operations against a columnar database, with most operations in-memory).
Q to me is a tacit admission that the obtuse syntax of K is not necessary to get the benefits of K, and that K presents too steep a learning curve for most.
They're fascinating, and I think vastly underrated, but they're hard to grasp even without the syntax hurdle...
1. The terse language is actually quite expressive and very well suited to vector programming, so less code does more. It also gets you thinking in a particular way...
2. The database: a mix of transactional, in-memory, and fast analytics (aggregations, joins...) concepts.
3. The runtime framework: Kdb is an ideal building block for a network-distributed system. Opening sockets and connecting to other Kdb instances is simple.
2/3 I buy, but for 1 I think you need to decouple the syntax from the underlying array-programming model. K-level terseness and the semantic structure that makes K well suited to vector programming are pretty much entirely orthogonal.
You can implement the core of K's array manipulation in other languages and end up with similarly terse code if that's what you want. But you can also produce something that will be far more readable to most people without losing the expressiveness, by e.g. naming some of the operations.
> I'd argue very little of what makes K and Kdb special has to do with the language per se, but with the focus on array operations, and with the particular runtime environment of Kdb
I'm also not convinced that K as a language is essential to kdb as an application, but it's undeniable that they have "proven themselves in the market", eh? It's a niche, but a well-paid one, and they don't have any competitors (with e.g. easier syntax.)
Are you familiar with arcfide's Co-dfns compiler? https://github.com/Co-dfns/Co-dfns I ask because it might serve as a counter-example of a "killer app" that relies on the concision of the syntax.
> I think you need to decouple the syntax from the underlying array-programming model. K-level terseness and the semantic structure that makes K well suited to vector programming are pretty much entirely orthogonal.
FWIW, I think so too. You could say that things like Numpy and Pandas are moving towards integrated array processing, eh?
They have proven themselves in the market, but it's impossible to decouple the language-agnostic benefits of KDB from K's syntax in that respect. That said, they have plenty of competitors -- it's just that most of them are run-of-the-mill languages.
But it can't easily (and perhaps essentially) be decoupled and still be usable.
At first, the math notation for "x ^ n" is cryptic, and writing "x * x * ... * x repeated n times" is clearer. But then you grok exponentiation, and everything longer than "x^n" becomes too verbose and needlessly complex; furthermore, once you've grokked it, "x^n" naturally extends to negative, rational, and real n, whereas the expanded form does not.
K is similar: at first, it seems needlessly terse, weird, and opaque. But once you've grokked it, it's clear that it's much more than the underlying array model (which, as you pointed out, can be and has been implemented in other environments).
As supporting evidence, I bring numpy: it has the same kind of array operations and as high-level a language; and yet numpy code written by most people is still an order of magnitude longer (and slower) than code produced by people writing K. There is a selection bias here (everyone uses Python, only K-compatible people use K), but I think it's more than that.
Look at an example from [0] -- the complete function (executable by the entire 300KB K runtime, independent of any library, and with named arguments, which make it 50% longer than some people would write it):
{[t;f;o]{x y z x}/[_n;o;t f]}
This is a K idiom known afaik as "converging multicolumn select"[1]. "t" is a table; "f" is a [list of] columns, e.g. `price`size ; "o" is a [list of] subset-and-ordering functions, one for each column.
To collect all the rows from table "sales", where time is more than 9am, price is less than 100, and ordered by decreasing size, you would call it with the arguments [sales;`time`price`size;9h00<,100>,>:]
Although it is possible to write such a simple, generic, useful idiom in Python/numpy, I have personally never seen it done. R's data.table (and its Python port) come close in a limited subset of cases, but the extreme usefulness of converging multicolumn select does not extend; also notice that the call has exactly the specification, no boilerplate or even keyword (and just minimal punctuation). This example is not unique: e.g. "flatten" is ,// ; yes, that's the entire function implementation, and no, it's not special-cased.
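To be concrete, here is roughly what just that one hardcoded query looks like in numpy (a sketch with invented data; column names taken from the example above). Note that it is the specific query, not the reusable idiom:

    import numpy as np

    # Stand-in for the "sales" table as a structured array (values invented).
    sales = np.array(
        [(8.5, 90.0, 200), (9.5, 90.0, 500), (10.0, 101.0, 300), (9.75, 95.0, 100)],
        dtype=[("time", "f8"), ("price", "f8"), ("size", "i8")],
    )

    # Rows where time > 9 and price < 100, ordered by decreasing size.
    rows = sales[(sales["time"] > 9) & (sales["price"] < 100)]
    rows = rows[np.argsort(rows["size"])[::-1]]
    print(rows)  # the matching rows, largest size first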
K optimizes the use of your eyes and brain as a pattern matching device. It has been my experience that, much like good math notation (e.g. Leibniz's differential notation), good programming notation makes things easier and evident in a way that is not captured by equivalent semantics that aren't human compatible in the same sense.
> But it can't easily (and perhaps essentially) be decoupled and still be usable.
I'd argue it's not usable to most people in its current form at all, which is why they remain tiny niche languages.
> At first, the math notation for "x ^ n" is cryptic, and writing "x * x * ... * x repeated n times" is clearer. But then you grok exponentiation, and everything longer than "x^n" becomes too verbose and needlessly complex;
This is contrived. The argument is not for writing it out like that. Now, of course, x^n is well understood by most, but many languages do offer longer versions, such as e.g. math.pow(x,n). For x^n I don't think that is necessary, but for less common operations I'd much rather have a word that hints at a concept than a symbol that does not.
For most of us, remembering a large number of single-character operators slows us down rather than speeding us up.
> Although it is possible to write such a simple, generic, useful idiom in Python/numpy, I have personally never seen it done.
I don't use either, but e.g. considering Ruby, the main reason you wouldn't tends to be the lack of an equivalent of a "table" type. With a table type that provides suitable methods, such as those provided by typical Ruby ORMs, you might end up with something like this:
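(A sketch in Python, with a plain list of dicts standing in for the table and invented data; the Ruby ORM chain I have in mind would read much the same:)

    sales = [{"time": 9.5, "price": 90.0, "size": 500}, {"time": 8.5, "price": 95.0, "size": 200}]
    rows = sorted((r for r in sales if r["time"] > 9 and r["price"] < 100), key=lambda r: -r["size"])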
Of course we wouldn't write it like that, because it is unreadable.
I think this is typical of an attitude difference. I don't see the value in your example, because to me it is semantic overhead that requires me to slow down when reading code.
> also notice that the call has exactly the specification, no boilerplate or even keyword (and just minimal punctuation).
This is part of the negative aspect of K syntax to me, not a positive.
> K optimizes the use of your eyes and brain as a pattern matching device. It has been my experience that, much like good math notation (e.g. Leibniz's differential notation), good programming notation makes things easier and evident in a way that is not captured by equivalent semantics that aren't human compatible in the same sense.
My experience is that there is a sharp dividing line between people who are happy reading maths, who tend to be OK with syntax like that, and people who rely on visual layout to recognize code, who tend to absolutely hate it; and judging by which programming languages are successful, the latter group seems to be substantially bigger. I'm glad you're happy with a syntax like that.
But to me it is not useful, because I can't skim it and quickly get an idea what it does.
> Basically for the claim to be true, why can't you write something like Racket in Racket 10x faster? Like 3 years instead of 30 years. And why doesn't it do way better things than CPython or Ruby?
Maybe it offers the same level of functionality with a tenth of the contributors/funding/...?
I think it's plausible that writing code in C, Pascal, or languages of a similar level is several times more productive than writing code in assembly language, and writing code in Python, Ruby, JS, Scheme, Lua, or other languages of a similar level is several times more productive than writing code in C.
This is not true for all programs, for a couple of different reasons.
First, the reason writing code in Python is often more efficient than writing code in C is that it allows you to leave things out. Specifically, you can leave out resource lifetimes, specifying low-level types (representation in bits), checking high-level types (conformance to protocols), memory layout, and error handling. The CPython interpreter will improvise something for you in all of these cases. It won't be as good as what you could have come up with if you were doing it yourself in C, but for many programs, other things are more important. A lot of times you need to have some idea about those things in order for your program to work properly, but it can be informal, incomplete, and tacit, rather than formal, complete, and explicit. But for some programs, these things that are being left out are too important to leave out. In those cases using JS instead of C isn't saving you anything.
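A small sketch of what that leaving-out looks like in practice:

    # No declared element types, capacities, lifetimes, or layout;
    # CPython improvises representation and reclamation for you.
    def histogram(words):
        counts = {}                    # no size hint, no key/value types
        for w in words:
            counts[w] = counts.get(w, 0) + 1
        return counts                  # storage reclaimed whenever GC decides

    print(histogram(["a", "b", "a"]))  # {'a': 2, 'b': 1}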
Second, as Alan Kay says, architecture trumps materials at scale. In the taxonomy of SICP, programming languages contain primitive concepts, means of abstracting concepts, and means of combining concepts, and the ways in which a higher-level programming language is more efficient than a lower-level programming language are mostly in the form of primitive concepts. Python has auto-growing lists and dicts ready-made, plus polymorphic method calls with duck typing and garbage collection and whatnot. These reduce your code size by some constant factor, and they also nudge you toward using a certain architecture, an OOP architecture which is more scalable in some ways. Maybe one line of Python does as much as 64 lines of assembly language. But your software's architecture defines its order of growth: does its functionality increase proportional to the amount of effort put into writing code, proportional to its square, or proportional to its square root? And you can implement any architecture in any programming language.
I do think there's something to the claim that Chez Scheme represents a substantially better language implementation than CPython on some axes despite being the work of a much smaller group of people. Chez is mostly Dybvig, actually, though Scheme itself contains important ideas from dozens of people. By contrast, /usr/share/doc/python2.7/ACKS.gz on my system lists almost 1500 people, including, to my surprise, me.
Around that time I'd noticed that the stringly-typed languages of shell, awk, and make have no garbage collection. They're value oriented and not reference oriented. The goal of Oil was basically to "upgrade" shell into something more like Python, and on the surface that looks straightforward, but the reference model vs. value model makes it pretty different.
I've gone back and forth on that... whether I should just slap GC into a shell, or whether I should try to preserve the value model while making it less stringly-typed and more expressive.
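To make the distinction concrete, a minimal Python illustration of the reference model (shell's b=$a, by contrast, copies the string value):

    a = [1, 2]
    b = a            # b is another reference to the same list object
    b.append(3)
    print(a)         # [1, 2, 3] -- the mutation is visible through both names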
I have a more ambitious idea for a language where every value is essentially a pair (Go-like string slice, structured data), but I probably won't get to that in the near term. For a concrete example, I noticed from writing some HTML processors that both DOM and SAX APIs have a bunch of flaws for processing semi-structured text.
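A toy sketch of what such a pair value might look like (all names invented; not a committed design):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Slice:
        buf: str            # shared underlying buffer, Go-style: no copying
        start: int
        end: int
        def text(self) -> str:
            return self.buf[self.start:self.end]

    @dataclass(frozen=True)
    class Value:
        src: Slice          # the exact source bytes this value came from
        data: object        # the structured interpretation, e.g. a tree node

    html = "<p>some <b>text</b></p>"
    v = Value(Slice(html, 3, 19), {"tag": "p"})
    print(v.src.text(), v.data)   # some <b>text</b> {'tag': 'p'}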
-----
Also, on the other point, let's consider these two possibilities about Chez / Dybvig:
1) Dybvig is an average programmer who got "superpowers" from Lisp, i.e. got a 10x boost in productivity.
2) Dybvig and Chez are an outlier, e.g. like Fabrice Bellard and his recent QuickJS (and other projects), or Lattner and LLVM/Clang/Swift. The salient point is that these projects have nothing to do with Lisp.
It's obviously not an either/or thing, but I'd say the answer is closer to #2. The complexity is from the problem domain, and many major software projects are minor miracles that only a few people are qualified for, regardless of language.
I'm delighted that you enjoyed that note! You might enjoy parts of Dercuano, too, then.
As I understand it, shell, awk, and Tcl are more or less linear languages, except that copying and destruction are implicit. It's fairly straightforward to make a linear language that includes more expressive types than just strings; Tcl 8 even did it without breaking backward-compatibility. Linearity is pretty incompatible with the OO worldview in which objects have identity and mutable state, but to the extent that you provide Clojure-like FP-persistent data structures whose operations merely return new states, maybe the OO worldview can just go suck it. Rust takes a different tack in which copying is optionally explicit (though destruction still implicit: "affine typing" rather than "linear typing") and you can use "borrowed references" to avoid the headache of explicitly returning all the argument objects you didn't destruct.
A nice thing about variations like Rust's and Clojure's is that you can preserve the expressiveness of the Lisp object-graph memory model without all of its drawbacks: no aliasing bugs, no circular references complicating the life of the GC, and in Rust's case, no garbage collector at all.
> I have a more ambitious idea for a language where every value is essentially a pair (Go-like string slice, structured data), but I probably won't get to that in the near term. For a concrete example, I noticed from writing some HTML processors that both DOM and SAX APIs have a bunch of flaws for processing semi-structured text
Right, text markup isn't tree-structured, and I think HTML5 actually prescribes fixups for <i>some <b>incorrectly</i> nested</b> markup. The GNU Emacs/XEmacs schism was largely about how to handle this problem for text editing; the XEmacs side added "extents" to which you could attach structured data (such as a font or a link destination), which are more or less the type of values you're describing, while the GNU Emacs side instead added "text properties" which conceptually applied independently to each of the characters of buffers or strings, but under the covers were of course optimized with an extent-like representation. Neither side of the schism considered replacing text buffers with S-expression-like trees like the DOM; that was Microsoft's fault :)
Are you thinking that when you concatenate a couple of such string slices, the operation will be O(1) because it uses ropes behind the scenes? Or are you thinking of not having concatenation at all as a primitive operation, instead using an unboundedly-growing scratch buffer (not, itself, a value) that you can append them both to and then take slices of? Is the text in the slices mutable, and if so, can it also grow and shrink? I think there's a large design space of interesting things you could do.
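For the rope option, a minimal sketch (assuming immutable text): concatenation just allocates a node, so it's O(1), and the bytes are only assembled when someone asks for the whole string:

    class Rope:
        def __init__(self, left, right=None):
            self.left, self.right = left, right  # leaf: (str, None); node: (Rope, Rope)
        def __add__(self, other):
            return Rope(self, other)             # O(1): no bytes are moved
        def __str__(self):
            if self.right is None:
                return self.left
            return str(self.left) + str(self.right)

    s = Rope("hello, ") + Rope("world")
    print(str(s))   # hello, world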
About Dybvig and Chez, sure, Kent Dybvig is a wizard. But so is Guido van Rossum, whatever embarrassing errors he may have made regarding first-class blocks and tail-call optimization in Python. I haven't asked Dybvig, but I don't think he's a Bellard-class wizard, so I don't think that's the answer.
Or are you saying that Python's problem domain is much harder than Chez Scheme's domain?
By the way, my apologies for having delayed so long in responding to your thoughtful notes.
A tentative slogan is "bijections that aren't really bijections", i.e. it describes the "fallacy" of (bytes on disk -> data structure in memory -> bytes).
Hence the (slice, data structure) representation.
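A concrete instance of the non-bijection, using Python's json module (the parsed structure can't reproduce the original bytes):

    import json

    original = b'{ "n": 1.50,\n  "s": "caf\\u00e9" }'
    data = json.loads(original)
    print(json.dumps(data))   # {"n": 1.5, "s": "caf\u00e9"} -- whitespace, layout,
                              # and number spelling from the original bytes are gone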
That's the correctness aspect. I also think there's a big performance aspect, e.g. it appears to me that the performance of many/most parsers is dominated by object allocation.
Anyway I don't think this will make it into Oil any time in the next year, but I think there is room for it in programming languages, and lots of other people are barking up that tree (e.g. FlatBuffers relate to a big part of it). I would be happy to continue batting around ideas in non-nested HN thread :) So as mentioned elsewhere, feel free to send me a mail or ping me on Zulip!
I'm looking at Dercuano but I'm confused why it's a tarball and not a live website :)
Yes I totally agree -- C is way more productive than assembly, and dynamic languages are way more productive than C. I think those are incontrovertible facts that have been proved over and over again in the market, according to the test in my post.
I'm arguing against the OP who is essentially claiming that the homoiconicity and metaprogramming/DSLs of Lisp are a huge advantage over, say, Python.
I would argue the opposite: all the essential innovations of Lisp made their way into other languages like Python and Ruby (and Julia, which has a Lisp-like macro system by way of its femtolisp front end [1]). It was a hugely influential language, but it's not ahead of the state of the art right now.
Another way of saying it: PARSING is the only difference between Lisp metaprogramming and metaprogramming in any other language, and parsing isn't an essential difference. It's a bit of friction. (Though I want to improve parsing, long story ... [2] )
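To illustrate with a toy example: Python pays a parse step up front, then manipulates code as data much like a macro would:

    import ast

    # Parse once (the "friction"), then treat code as data, macro-style.
    tree = ast.parse("x + 1", mode="eval")
    tree.body = ast.BinOp(left=tree.body, op=ast.Mult(), right=ast.Constant(2))
    code = compile(ast.fix_missing_locations(tree), "<ast>", "eval")
    print(eval(code, {"x": 3}))   # (x + 1) * 2 == 8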
-----
In fact I've done what you could consider a software engineering experiment along these lines in the last few years.
I wrote a bash-compatible shell, running thousands of lines of unmodified bash scripts, in Python. It's 5-7x fewer lines of code than bash, due to metaprogramming with Python, ASDL, and re2c. I claim it's significantly faster to develop and has fewer bugs because of this. It's very very different than how bash is written.
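To give a flavor of the approach (a toy sketch, not Oil's actual ASDL, which generates classes ahead of time; names invented): one declarative schema line per AST node type, expanded by ordinary Python metaprogramming:

    # One schema entry per node type; type() builds each class at runtime.
    SCHEMA = {"Word": ["parts"], "Redirect": ["op", "fd", "arg"]}

    nodes = {}
    for name, fields in SCHEMA.items():
        init = (lambda fs: lambda self, *args: self.__dict__.update(zip(fs, args)))(fields)
        nodes[name] = type(name, (), {"__init__": init, "FIELDS": fields})

    r = nodes["Redirect"](">", 1, "out.txt")
    print(r.op, r.fd, r.arg)   # > 1 out.txt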
Well it took over 3 years, but bash has been around for 30, and I also enhanced the language in many ways. I won't argue if someone says the jury's still out, but here is a significant achievement on performance:
Despite being written in Python, and parsing more thoroughly by design, the parser (which is a ~10K line program in Python, and more in bash) is faster than bash's once it's automatically translated to C++. So basically the high level "DSL" of Python or "OPy" retains enough information to make it better than the state of the art.
Here's a comment on this non-Lisp heavily DSL-based architecture from a few months ago:
To summarize it, if you look at how bash is implemented, it's 140K lines of "groveling through backslashes and braces one at a time in C", as I like to call it. And it has about as many bugs / mistakes as you'd expect from that style, including some serious security problems.
So you can use DSLs to "compress" that code -- and I claim make it more correct and faster (jury is still out on memory usage). And you don't need Lisp to do it. Python was a fine choice for metaprogramming and DSLs; Lisp doesn't have any advantage.
In fact in January 2016 I hacked on femtolisp specifically because I was thinking about borrowing Julia's implementation approach for Oil. But yeah it's not enough of a win to justify the downsides. If someone wants to prove me wrong, write a bash-compatible shell in Lisp in under 3 years ;)
It's not going to be any easier, just like writing an x86 backend isn't much easier in Lisp. The difficulty is from the problem domain and not the language. And I noticed that all compilers use metaprogramming and DSLs anyway -- LLVM has hundreds of thousands of lines of TableGen. The Go compiler uses a Lisp-like DSL to express SSA transformations.
So basically Lisp is great, but it's been thoroughly absorbed, and PG basically advanced a falsifiable argument that was proven false. All of YC's biggest successes use dynamic languages, not Lisp. Stripe uses a ton of Rails, Airbnb is Rails, Dropbox is Python, etc.
Instagram and YouTube are Python, etc. Wikipedia and Facebook are PHP, etc. I think that constitutes enough evidence.
-----
edit: I should also add/clarify that I've met programmers who I'm sure can bang out the equivalent of bash's 140K lines of C in (much better) C in a relatively short period. I'm not saying the way I did it is the only way to do it. Rather, I'm saying Lisp doesn't have any fundamental advantage in metaprogramming. The only difference is parsing, which is a small bit of friction.
The other point which people have brought up many times is that most people are bad DSL designers. It may be better to rely on "known" DSLs than to invent your own for every program. That can be a drag on productivity negating any of the potential benefits of smaller code.
> dynamic languages are way more productive than C.
I meant to say that high-level languages were more productive than C; I think there are a lot of people who will be happy to tell you that they are more productive in Haskell than in C, as much so as in Python, even though Haskell is a lot more static than C. (I don't know Haskell well enough to say; the only Hindley–Milner language I know is OCaml, and I feel less productive in OCaml than in C.)
> I'm arguing against the OP who is essentially claiming that the homoiconicity and metaprogramming/DSLs of Lisp are a huge advantage over, say, Python. ¶ I would argue the opposite: all the essential innovations of Lisp made their way into other languages like Python and Ruby
I pretty much agree. Lisp was a unique language in, say, 1990, when it had garbage collection, dynamic typing, dynamically expanding data structures, higher-order functions, eval, easy serialization and deserialization of data structures, an object-graph memory model that made it easy to represent things like parse trees, and easy metaprogramming through homoiconicity. Meanwhile, the other popular programming languages were things like C, Pascal, BASIC, FORTRAN, sh, Lotus 1-2-3 formulas, dBASE, and COBOL, which had none of those; a lot of them didn't even have recursion. Today, though, Python has all of them except for homoiconicity, and we can do quite a bit of metaprogramming in Python, using its pervasive polymorphism and higher-order programming rather than macros. Most other popular modern languages like Perl 5, Ruby, JS, Lua, and PHP 5+ are basically dialects of Lisp. Even languages that differ dramatically from Lisp, like OCaml, Kotlin, C#, Swift, and recent versions of Java are pretty close.
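A quick sketch of most of that 1990 checklist in stock Python:

    import json

    twice = lambda f: (lambda x: f(f(x)))        # higher-order functions
    print(twice(lambda x: x + 1)(3))             # 5
    print(eval("[x * x for x in range(4)]"))     # eval: [0, 1, 4, 9]
    tree = {"op": "+", "args": [1, {"op": "*", "args": [2, 3]}]}   # parse-tree-like object graph
    print(json.loads(json.dumps(tree)) == tree)  # easy (de)serialization: True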
> I wrote a bash-compatible shell, running thousands of lines of unmodified bash scripts, in Python. It's 5-7x fewer lines of code than bash, due to metaprogramming with Python, ASDL, and re2c. I claim it's significantly faster to develop and has fewer bugs because of this. It's very very different than how bash is written.
That's a very impressive achievement! Does it use more RAM than bash does? I'd think that one of the potential benefits of C over Python is that, because C requires you to specify your memory layout, C programs tend to use significantly less memory than otherwise equivalent Python, usually by about an order of magnitude.
That is, some of the order-of-magnitude complexity reduction you got in your experiment is surely due to Python, but perhaps some of it is due to RAM costing about US$100 per megabyte when Brian Fox started writing bash in 1988.
> Python was a fine choice for metaprogramming and DSLs; Lisp doesn't have any advantage.
I think that may be overstating the case; no doubt there are some things that are easier in Python and some things that are easier in Lisp. SBCL or Racket can usually beat CPython or Jython by about an order of magnitude on speed, for example, and having first-class lambdas really is an improvement over Python's botched mixture of context managers, function decorators, and generator abuse. In the other direction, Python's syntax is significantly more readable — especially for arithmetic — and SBCL and Racket don't have anything like Numpy, and their own compilers are not an adequate replacement. But the difference between Lisp and Python, whichever one it favors, certainly isn't close to the enormous advantage Lisp or Python has over Pascal.
(And I think Clojure is really the most interesting Lisp, actually.)
> I should also add/clarify that I've met programmers who I'm sure can bang out the equivalent of bash's 140K lines of C in (much better) C in a relatively short period
I would be very surprised about that; but Andrew Appel did come close to winning the ICFP programming contest one year using that well-known functional programming language, C. And a few years later someone won using C++.
On the other hand, you did end up writing OSH in OPy rather than, strictly speaking, in Python. The initial Python versions sort of ended up being a prototype, in a way. And you could presumably have written OPy in some other language. Maybe it would even have been easier to write it in OCaml or F# or Haskell. (But if the language it compiled were less compatible with Python, I guess you would have had to rewrite a lot of the Python standard library.)
So maybe that's what "much better C" would look like in the context of bash: a smallish core compiler or interpreter for the DSL in which the rest of the shell is written?