The Oforth Programming Language

luckydude · on March 28, 2018

I'm a forth programmer, I've implemented an editor in forth, a more(1) clone in forth, a grep(1) in forth and I did a bunch of stuff I've forgotten in forth for the geophysics department at UW Madison.

I HATE forth. It's a miserable language, I just hate it.

And then I went to Sun and the boot prom language was forth, still hated it.

Then I got to PCs. The BIOS had no language and then Intel did whatever garbage they did and I was like please, just give me Forth. It's not what I'd do for debugging a panic but I could make it work. The Intel stuff was way worse.

I suspect that Mitch priced himself out, otherwise we'd all be using Forth as the boot language, I dunno what happened. But if you had asked me 20 years ago would I be saying anything positive about forth I would have kicked you in the XXXs. Yet here I am wishing that forth was how we dealt with a panic. We'd all be happier.

luckydude · on March 28, 2018

[I'm sticking this here because there was a dude who said this was a good reply and I tried to reply to him and HN says his post was deleted. I don't know why, his post seemed pretty reasonable to me, he was asking if forth was a good boot loader. Shrug.]

I don't for a minute think that forth is the best answer, it's an awful answer. Even lisp, which I hate with a passion, would be a better answer because more people have experience in lisp than forth.

Forth is just less shitty than what Intel came up with, I think it was part of EFI, it's amazingly bad. I've designed a few programming languages and I'm not good at it, but holy crap, I could be drunk and come up with a better answer than Intel did. Forth is just a crappy language but it's less crappy than what Intel did.

The idea that Forth is somehow special for boot loaders is nuts. There is nothing, that I know of, that makes forth somehow magical for that.

What you want for the bootloader, and for the debugger that you drop into when there is a panic, is something like C but interpreted. You want to be able to walk a linked list of page table structures. And to its credit, Sun's forth could do that. It's sort of twisted to think it could but it did, there were forth words that did all sorts of kernel magic.

I think that the magic of forth was that it was tiny, back 30 years ago you didn't want to have a lot of storage for your debugger. That's not the case today, if someone made the case that they could make things better, here you go, here's a gig of storage. That's a little crazy but still. Forth was cool when a meg of storage was a crazy amount.

We can do better. Intel pushed us backwards, Forth would be a step forwards, but man, I'd take python or Tcl (because then I'd get my pet language L, http://little-lang.org) or even perl as a better boot language.

Forth has no special boot sauce in my opinion. It was just small.

lukego · on March 28, 2018

Speaking of Lisp and Forth, I was really stunned when I noticed that adding syntactic support for "structs" in the language is about 10 lines in Forth (https://github.com/openbios/openfirmware/blob/master/forth/l...) and about 2000 lines in Lisp (https://github.com/sbcl/sbcl/blob/master/src/code/defstruct....).

Mitch Bradley wrote up an explanation of the Forth library, and lots of other cool bits of Openfirmware, at http://wiki.laptop.org/go/Forth_Lesson_18.

kazinator · on March 30, 2018

You could also make some ten line hack to mock up structs in Lisp over a vector. That SBCL implementation has to conform to an ANSI standard.

The TXR Lisp implementation of structs (the syntactic part) is 300 lines: http://www.kylheku.com/cgit/txr/tree/share/txr/stdlib/struct... of which the defstruct macro is about half.

The underlying implementation is in C: http://www.kylheku.com/cgit/txr/tree/struct.c, which is some 1600 lines.

TXR Lisp structs are an OOP system; they support inheritance, static and instance slots, inheritance of both, late injection of methods into a hierarchy and such. There is a fairly sophisticated initialization model and even support for C++-like RAII style.

Structs are real objects that know their own type. They have a print and read representation, they know what slots they have and will reject invalid accesses and so on.

You don't get something for nothing; the LOC's are spent for a reason.

Looks like there is a tutorial on what appear to be these same Forth structs here: http://wiki.laptop.org/go/Forth_Lesson_18

I don't see any example of how to create a struct, how to access it, how to show that two different struct types are incompatible, how accessing a field in a struct that it doesn't have results in an error, etc. It just looks like definitions of constants denoting offsets.
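To make the "constants denoting offsets" reading concrete, here is a rough Python sketch of the general idea. It is not the Open Firmware code: `StructBuilder`, `field`, and the example layout are made-up names, and the assumption is simply that each field definition records the current offset and then advances it by the field's size.

```python
import struct

class StructBuilder:
    """Accumulates field offsets, the way a field-defining word bumps a running total."""

    def __init__(self):
        self.size = 0        # running offset, which ends up as the struct size
        self.offsets = {}    # field name -> byte offset

    def field(self, name, nbytes):
        self.offsets[name] = self.size
        self.size += nbytes
        return self

# Hypothetical layout: two 4-byte cells followed by a 2-byte flag.
point = StructBuilder().field("x", 4).field("y", 4).field("flags", 2)

# "Creating" a struct is just reserving `size` bytes; nothing records its type.
buf = bytearray(point.size)

# Access is base + offset, with no check that buf really holds a `point`.
struct.pack_into("<i", buf, point.offsets["y"], 42)
y_value = struct.unpack_from("<i", buf, point.offsets["y"])[0]
```

In this model the missing pieces are exactly the ones listed above: there is no type tag on `buf`, so nothing stops code from applying another struct's offsets to it.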

zeveb · on March 28, 2018

The real special sauce for Forth is that it's small and it's knowable. Anyone who wants to learn about how elegant a computer system can be should take a look at jonesforth[0], a Forth implemented in x86 assembly language. It really is possible for a programmer to easily understand everything a Forth does, which is wonderful when writing a bootloader or low-level OS.

In a lot of ways, it's a more-structured assembly language.

TCL could be cool for a low-level system too, as could be a Lisp (also: how can anyone hate Lisp with a passion‽‽‽), as of course could be Forth. Perl & Python are too big and not minimal enough.

0: https://github.com/nornagon/jonesforth/blob/master/jonesfort...

luckydude · on March 28, 2018

I tried to like lisp, it just doesn't fit with my brain, dunno why. I tried to like emacs and it doesn't fit either. I'm a C/sh/vi sort of guy. I like troff, don't care for LaTeX. Some stuff just works better for me, working in C comes very naturally to me, working in lisp was always a struggle.

I think if I had done more compiler work I'd like lisp better, I've been told that making an AST in lisp is trivial and it is work in C.

I get your point about small and knowable, I love stuff like that.

kazinator · on March 30, 2018

Much larger things are perfectly knowable; you just need a more capacious noodle.

jacobush · on March 28, 2018

The special sauce was that it was tiny. That was not only important, it was imperative back when EPROMs ruled the motherboards and the backplanes. It wasn't until serial flash memory arrived that bloaty alternatives became possible at all.

dang · on March 30, 2018

The comment was deleted because its author deleted it. That's what 'deleted' means on HN.

I agree that it was a good reply!

luckydude · on March 28, 2018

I do wonder why someone at HN decided to stomp on the guy who was asking about forth as bootloader. Didn't seem weird at all to me. HN? Care to explain? Did I miss something?

dang · on March 30, 2018

As I explained at https://news.ycombinator.com/item?id=16719808, no one at HN touched the comment. It's deleted because the person who wrote it chose to delete it. Why? No idea. You'd have to ask them.

exikyut · on March 31, 2018

That raises a tricky (perhaps philosophical-meta) question: whether deleted questions should keep the username visible.

I can't ask the person who deleted the question, because I don't know who they were.

alxlaz · on March 28, 2018

This is, I guess, the uncomfortable secret about Forth :-). There's the community of Forth enthusiasts, who admire its simple, word-oriented concept, and the ease of implementation, and are super happy that you can put a working interpreter in 8K of flash and use it interactively (ok, I'm happy about that, too).

And then there's everyone who has to write Forth code but is not a Forth enthusiast, and truth is we kinda hate it.

I pained myself through writing something non-trivial in Forth and it was pretty awful. I kept hoping the enlightenment would finally hit me and whatnot but it didn't, and I haven't touched it ever since.

defined · on March 28, 2018

I got interested in Forth because of an article I read in either Byte magazine or Dr Dobb’s Journal. It was not because of the Forth language that I was curious; it was because Forth was the first TIL (threaded interpreted language) that I had encountered.

At the time, there were no Forth interpreters available locally for the only computer I had (a TRS-80 Model II running CP/M. Actually, it was my Dad’s.)

I found someone from the Forth Interest Group (FIG) who gave me a 90-page Z-80 assembly language listing for figForth, which I proceeded to type in overnight and assemble with masm, IIRC.

It worked, and sadly, I never really did much with Forth after that. But it was an interesting experience and I still think TILs are a cool idea.

lolc · on March 27, 2018

The last time I looked at Forth it was to get a more powerful replacement for the bc calculator. But the minimalistic approach of forth enthusiasts means there is no powerful REPL.

Unfortunately Oforth doesn't seem very mature to me. To compile it, I had to install libc6-dev-i386 and g++-multilib so it could do its 32-bit compile. Now it fails on me with a segmentation fault. If I try the precompiled version, it just exits with a nonzero exit-code.

So, back to bc, I guess :-)

Edit: Ah I see, it needs --i for interactive mode.

mikekchar · on March 28, 2018

Forth is a concatenative language. Its REPL is pretty similar to a Lisp REPL. Everything in Forth is accessible and it's actually common practice to override the interpreter to implement DSLs.

Haven't spent any time looking at this, but 30 years ago I actually did quite a lot of programming in an object oriented Forth (3D star field animation system for the university's planetarium dome, on an Amiga with a bit of extra hardware). I really enjoyed it.

mikekchar · on March 28, 2018

Edit: Thinking about it, I realise it's actually a lot closer to Smalltalk than to Lisp -- you have a built in editor, you build memory images that you constantly work with, all the code is decompilable so you can refactor easily, etc, etc

Actual edit: you'd think I could find the edit button :-P

TeMPOraL · on March 28, 2018

I guess Lisp was like that in the good days of Lisp machines.

Nowadays, all that remained is "you build memory images that you constantly work with", and Emacs, which is merely a shadow of the old stuff (but still beyond awesome).

kbp · on March 28, 2018

> I realise it's actually a lot closer to Smalltalk than to Lisp -- you have a built in editor, you build memory images that you constantly work with, all the code is decompilable so you can refactor easily, etc, etc

Lisps often provide built in editors; Common Lisp even has the standard function ED for invoking one[0], but like most of the "environment" functions its behaviour is left almost entirely unspecified. Lisp is also usually developed in an image-based fashion. Images are one of the features that separate Lisp and Smalltalk from almost everything else, not one that separates them from each other. Some Lisps provide access to source code from function objects, with varying degrees of integration, and again, Common Lisp provides the standard function FUNCTION-LAMBDA-EXPRESSION for this purpose[1]. They don't generally work via decompilation, but neither does Smalltalk.

0: http://www.lispworks.com/documentation/HyperSpec/Body/f_ed.h...

1: http://www.lispworks.com/documentation/HyperSpec/Body/f_fn_l...

_19qg · on March 28, 2018

Smalltalk does not decompile code (though one can with some limits), but works with an integrated code management.

Interlisp-D also did that.

jecel · on March 28, 2018

Actually, a bytecode to text decompiler is a very important part of the typical Smalltalk IDE. In Squeak or Pharo, for example, you give the virtual machine a binary file like "work.image" to start the system and it looks for text files "work.changes" and "version4.sources" so the code browser can show you the sources. If these extra files are missing it will complain, but will continue to work normally. If you pay attention you will notice that the sources have no comments and all local variables are called t2 or t5 - that is the decompiler subtly doing its job.

mikekchar · on March 30, 2018

Yeah, what I was thinking of was that the ability to create an AST from the running image is how the refactoring browser works. I'm not really aware of any Forth analogy, but it's fairly common (in my limited experience) for Forth programmers to hack away in the repl, decompile what they've done and then bung it in the block editor -- so it's kind of a manual version of the same thing. I really wasn't trying to make any profound statement. Just noting something that looked kind of similar :-)

_19qg · on March 28, 2018

How is that 'important'? Usually one writes sources. And that's what is maintained by the Smalltalk development environment. Since Smalltalk usually comes with the system sources there is also no need to decompile the bytecode.

OTOH some Lisps keep the s-expressions in the image and can run them directly via a Lisp interpreter.

lolc · on March 27, 2018

Nice it has some math functions predefined and is not finicky about types.

Edit: And with .show it even shows the stack after each evaluation. This could replace bc for me if it was packaged in Debian.

yiyus · on March 27, 2018

Don't you mean dc instead of bc?

lolc · on March 28, 2018

Indeed, it's dc. Muscle memory knows which one I use.

tom_mellior · on March 28, 2018

So, looking at some examples on rosettacode.org... This is basically a Smalltalk dialect, which is good. But it is also, completely needlessly, a Forth dialect. So instead of Smalltalk's statement separator (the period), you "end" statements by "dropping" some value.

That is, instead of:

    foo doSomeThing.
    bar doSomeOtherThing

you write:

    foo doSomeThing drop bar doSomeOtherThing

Why?

It also has named variables (which some Forth purists don't like), but instead of the reasonable Smalltalk syntax of

    x := baz compute: #something

it uses

    baz compute: #something ->x

for assignment.

Why?

This might eventually turn into a nice programming language, all it needs to do is drop (haha) some of the Forth baggage and admit to itself that its Smalltalk subsystem has everything one needs.

jabot · on March 28, 2018

The "drop" is not a statement terminator or separator.

There is a stack. You push values on the stack, then you call a function. The function takes its arguments off the stack, and pushes the return value(s) on the stack.

If you do not need the return value (which seems to be the case in your example) you ignore it by just removing it from the stack.

That's what the drop is for.
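A toy Python model of that description, for anyone who has not used a stack language. This is only a sketch of the general mechanism, not Oforth itself; `run`, `add`, and `drop` are illustrative names, and the only literals it understands are integers.

```python
def run(program, words):
    """Run a list of tokens against a data stack and return the final stack."""
    stack = []
    for token in program:
        if token in words:
            words[token](stack)       # a word pops its arguments and pushes results
        else:
            stack.append(int(token))  # anything else is treated as an integer literal
    return stack

def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def drop(stack):
    stack.pop()                       # discard a result nobody needs

words = {"+": add, "drop": drop}

without_drop = run("1 2 + 10 20 +".split(), words)
with_drop = run("1 2 + drop 10 20 +".split(), words)
```

Without the `drop`, the result of the first computation stays on the stack underneath the second one; with it, only the second result remains.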

tom_mellior · on March 28, 2018

I know. De facto it's for separating parts of the computation ("statements").

eequah9L · on March 28, 2018

The -> syntax plays nicely with the stack nature of the language. It's basically just a word that pops TOS, and as a special effect it introduces a name with that value. (IIUIC, I didn't study the language.)
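In the same spirit, a minimal Python sketch of "a word that pops TOS and introduces a name". It assumes `->x` arrives as a single token and that a bound name pushes its value when mentioned; both are guesses about the behaviour rather than anything taken from the Oforth sources.

```python
def run(tokens):
    """Evaluate tokens; returns the final stack and the names that were bound."""
    stack, names = [], {}
    for token in tokens:
        if token.startswith("->"):
            names[token[2:]] = stack.pop()   # pop top of stack and bind it to the name
        elif token in names:
            stack.append(names[token])       # a bound name pushes its value
        elif token == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            stack.append(int(token))         # integer literal
    return stack, names

stack, names = run("6 7 * ->x x x *".split())
```

After the run, `x` is bound to the product and the stack holds only the square of it, so the assignment itself leaves nothing behind on the stack.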

tom_mellior · on March 28, 2018

The greater point I'm making is that the language does not have a "stack nature" in any real or useful sense.

It has implemented syntax superior to Forth's by using variables and by using Smalltalk-like keyword (i.e., infix) syntax. Yes, there is a remainder of Forth-like stack syntax for expressions. It's ugly, jarring, hard to understand, and comes without any advantages since the Smalltalk-like parts require the language to use a real parser anyway!

IFranckyI · on March 28, 2018

The number of "Smalltalk-like" keywords is limited and this is not a syntax : they are executed like any other words.

imo you're overestimating the "smalltalk-like" part, as it is mainly word naming conventions (a ':' at the end of some words to show that something is required after) and not syntax. The "something" required after is read from the input stream as a part of the word's execution (like many Forth words).

There is no "real parser" : the parser is like any Forth parser ie one word at a time separated by blanks.

In fact, it is a Forth-like interpreter with immediate words and parsing words that use the stack structure a lot.

tom_mellior · on March 28, 2018

I guess you know a lot more about this than I do. But maybe you are overestimating the Forth-like part?

> There is no "real parser" : the parser is like any Forth parser ie one word at a time separated by blanks.

> In fact, it is an Forth-like interpreter with immediate words and parsing words that use the stack structure a lot.

Looking at an example like

    : catalan(n) n ifZero: [ 1 ] else: [ catalan(n 1-) 2 n * 1- * 2 * n 1+ / ] ;

(from http://rosettacode.org/wiki/Catalan_numbers#Oforth) it seems like, if what you say is true, then : must be a parsing word that not only reads "catalan(n)" but also somehow decomposes it in a way and sets things up such that the later call "catalan(n 1-)" is understood correctly even though "catalan", "(", and "n" are not separated by whitespace. It very much looks like there is some tokenization going on.

Also, is "ifZero:" really a separate message from "else:", or is "ifZero:" also a parsing word that checks whether what follows is a well-formed expression followed by "else:"?

If you have parsing words everywhere, you have a parser. It may not be a classical parser where everything is written in one place, but it's a parser nonetheless.

I don't think there's anything wrong with implementing a Smalltalk dialect in Forth, with lots and lots of parsing words, and compiling the Smalltalk surface syntax to stack-based Forth code on the fly. That's lovely. It's just a shame to expose the underlying Forth in the surface syntax...

IFranckyI · on March 28, 2018

>Looking at an example like

>    : catalan(n) n ifZero: [ 1 ] else: [ catalan(n 1-) 2 n * 1- * 2 * n 1+ / ] ;

>(from http://rosettacode.org/wiki/Catalan_numbers#Oforth) it seems like, if what you say is true, then : must be a parsing word that not only reads "catalan(n)" but also somehow decomposes it in a way and sets things up such that the later call "catalan(n 1-)" is understood correctly even though "catalan", "(", and "n" are not separated by whitespace. It very much looks like there is some tokenization going on.

You are right, I forgot to say that the following characters stop the parsing of a name: { } [ ] ( ) |. This is a difference from a classical Forth interpreter.

The call to catalan( n 1- ) is just sugar; the interpreter replaces it with: n 1- catalan. It is not related to how : works.

The following definition is exactly the same:

    : catalan( n ) n ifZero: [ 1 ] else: [ n 1- catalan 2 n * 1- * 2 * n 1+ / ] ;
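For readers trying to picture this, a small Python sketch of the two steps: names that stop at those delimiter characters, and a call written with parentheses being rewritten to postfix. This is a guess at the behaviour from the description above, not the actual Oforth interpreter; `tokenize` and `desugar_calls` are invented names, and definition headers introduced by `:` are not handled here.

```python
import re

def tokenize(source):
    # A name runs until whitespace or one of { } [ ] ( ) |;
    # each of those delimiters becomes a token of its own.
    return re.findall(r"[{}\[\]()|]|[^\s{}\[\]()|]+", source)

def desugar_calls(tokens):
    """Rewrite `name ( args )` as `args name`."""
    out, pending = [], []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] == "(":
            pending.append(tok)        # remember the callee, emit it after its arguments
            i += 2                     # skip the name and the "("
            continue
        if tok == ")":
            out.append(pending.pop())  # arguments are done, now emit the callee
        else:
            out.append(tok)
        i += 1
    return out

tokens = tokenize("catalan(n 1-)")
postfix = desugar_calls(tokens)
```

Here `tokens` splits the call into five pieces even though there is no whitespace around the parentheses, and `postfix` comes out in the `n 1- catalan` order.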

>Also, is "ifZero:" really a separate message from "else:", i.e., or is "ifZero:" also a parsing word that looks if what follows is a well-formed expression followed by "else:"?

Yes, ifZero: is really a separate word from else: . It is the same structure as the Forth structures :

if then

if else then

The if does not know whether there will be a then or an else, and it is not a parsing word. It is the same with ifZero:. It is not a parsing word, and it is the else: word that does the job.

By the way, you can see the definition of all these words in the prelude.of file (they are all written in Oforth).
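For comparison, this is roughly how the classical Forth "if else then" is usually explained: `if` compiles a conditional branch with an unresolved target, and whichever of `else` or `then` comes next fills it in. The Python below is a sketch of that textbook scheme under my own naming (`compile_words`, `execute`, the `0branch`/`branch` opcodes); it is not taken from prelude.of.

```python
def compile_words(tokens):
    """Compile integer literals and if/else/then into a flat instruction list."""
    code, fixups = [], []                   # fixups: indices of unresolved branch targets
    for tok in tokens:
        if tok == "if":
            code += ["0branch", None]       # target unknown: could be an else or a then
            fixups.append(len(code) - 1)
        elif tok == "else":
            code += ["branch", None]        # true-part must jump over the else-part
            code[fixups.pop()] = len(code)  # the pending `if` lands just after that jump
            fixups.append(len(code) - 1)
        elif tok == "then":
            code[fixups.pop()] = len(code)  # resolve whichever branch is still pending
        else:
            code.append(("lit", int(tok)))  # everything else is an integer literal
    return code

def execute(code, flag):
    """Run compiled code with `flag` as the only item initially on the stack."""
    stack, pc = [flag], 0
    while pc < len(code):
        op = code[pc]
        if op == "0branch":                 # jump if the popped flag is zero
            pc = code[pc + 1] if stack.pop() == 0 else pc + 2
        elif op == "branch":                # unconditional jump
            pc = code[pc + 1]
        else:
            stack.append(op[1])             # ("lit", n) pushes n
            pc += 1
    return stack

two_way = compile_words("if 10 else 20 then".split())
one_way = compile_words("if 10 then".split())
```

Note that the `if` branch of `compile_words` never looks ahead in the token stream, which is the point being made: only `else` and `then` touch the pending branch.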

>If you have parsing words everywhere, you have a parser. It may not be a classical parser where everything is written in one place, but it's a parser nonetheless.

Not much more than in a classical Forth, where you find lots of parsing words too. I have no problem calling this "a parser". I'm just saying that there is nothing different from a classic Forth.

> I don't think there's anything wrong with implementing a Smalltalk dialect in Forth, with lots and lots of parsing words, and compiling the Smalltalk surface syntax to stack-based Forth code on the fly. That's lovely. It's just a shame to expose the underlying Forth in the surface syntax...

All the ifTrue:, else:, ... are not parsing words. There are not "lots and lots" of parsing words, at least not many more than in a classical Forth. This has nothing to do with the Smalltalk message mechanism. There is nothing more here than an "if else then" Forth structure...

tom_mellior · on March 29, 2018

Thanks for this explanation! I understand much more clearly now. And I must admit I'm a bit disappointed :-(

blunte · on March 28, 2018

Forth was my second language behind Basic on the C64, and I didn't even know what a "language" was... it was just the way I got my HP 28C to do cool things...

HelloNurse · on March 28, 2018

Many details suggest that it's a very mainstream object oriented design dressed with somewhat Forth-like syntax, not an "extended" Forth.

For example, both garbage collected objects and heap allocated objects, uncontrolled multithreading, everything is an object, arbitrary precision integers, caring so little about the stack that ROLL is missing, gratuitously different control structure syntax.

unclesaamm · on March 27, 2018

Sounds a lot like Factor. I wonder how it compares?

metaobject · on March 28, 2018

Here’s a HN post from 2010 drawing the comparison between Forth and Factor:

https://news.ycombinator.com/item?id=1623697

DonHopkins · on March 28, 2018

PostScript is essentially a cross between Forth and Lisp.

protomyth · on March 27, 2018

Source code looks to be GPL3.

sifoo · on March 27, 2018

Glad to see Forth getting some attention, took me around 25 years of coding to find it.

For those looking for a different take on the same ideas, I'm working on a Lisp-inspired Forth here:

https://github.com/basic-gongfu/cixl

cy_hauser · on March 28, 2018

Whoa. I'm afraid to look ;-) My immediate thought is, "let's combine all the mental fun of stack juggling from forth with the tree juggling from lisp." My brain hurts!

groovy2shoes · on March 28, 2018

Combining aspects of Lisp with aspects of Forth is not a new idea. The RPN calculators offered by HP used a language called RPL ("Reverse Polish Lisp")[1], including a user-level version (a little more Lisp-y) and a system-level version (a little more Forth-y). While RPL is what came immediately to my mind, I doubt it's the only historical example of such a combination; people have been drawing similarities between the Lisp family and the Forth family for ages.

[1]: https://en.wikipedia.org/wiki/RPL_(programming_language)

sifoo · on March 28, 2018

Thank you! PostScript is another example.

groovy2shoes · on March 31, 2018

While I do get that feel with PostScript, as far as I can tell it was accidental rather than a conscious design decision.

The stack-oriented nature of PostScript appears to have been influenced by the engineers' experiences with working with some stack-oriented architectures from Burroughs [1], rather than from Forth. Some aspects of PostScript were influenced heavily by other languages in use at Xerox PARC during the Interpress era [ibid], most notably Cedar, but considering that Interlisp was also in use at the time, it's not unreasonable to conjecture that it had some influence; however, any such Lisp influence was likely subconscious, and none of the people working on Interpress or PostScript or their forebears seems to have explicated such an influence (as far as I can find).

Charles Geschke and John Warnock, the co-founders of Adobe, both deny any influence from Forth, suspecting that some overlap of design requirements between the two languages led to the shared properties [2]. I know that can be hard to believe, given the striking similarity of the two languages, but it's not the only time that concatenative programming would have been independently invented: Manfred von Thun's Joy language is likewise similar to Forth, but, again, not inspired by it. Whereas the similarity of PostScript to Forth can be explained by a simultaneous need for portability and dynamism in tightly-constrained environments, Joy's similarity to Forth is a result of a desire for algebraic manipulation of program text (Forth's (somewhat) having this property may have been a conscious design decision, but I'd bet cold, hard cash that it was a happy accident).

Indeed, any of these stack-oriented concatenative languages are computationally equivalent to a transformation monoid (algebra of functions and function composition) where evaluation proceeds according to a (possibly effectful) term-rewriting relation. Viewed in this light, the stack effectively becomes a mere implementation detail as far as the semantics are concerned, but I think that any language designer starting with such a monoid+rewriting semantics, when facing reasonable implementation on real hardware, is likely to come up with something quite similar to Forth &c. That is to say, starting with a low-level focus on efficient and portable implementation, or starting with a high-level focus on code representation as algebraic terms, there's a good chance you'll wind up with something similar to an established concatenative language, whether you mean to or not.
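A small Python illustration of the "algebra of functions and function composition" view, offered only as a sketch: each word denotes a function from stacks to stacks, a program denotes the composition of its words, and concatenating programs corresponds to composing their meanings. The names `lit`, `add`, `dup`, and `compose` are mine, and stacks are modelled as plain tuples.

```python
from functools import reduce

def lit(n):
    return lambda s: s + (n,)            # pushing a literal is itself a stack function

def add(s):
    return s[:-2] + (s[-2] + s[-1],)

def dup(s):
    return s + (s[-1],)

def compose(*words):
    """Denote a sequence of words as one stack-to-stack function."""
    return lambda s: reduce(lambda acc, w: w(acc), words, s)

identity = compose()                     # the empty program is the unit of the monoid

p = [lit(2), dup]
q = [add, lit(3)]
r = [add]

# Concatenating programs corresponds to composing their meanings...
whole = compose(*(p + q + r))(())
# ...and how the pieces are grouped does not matter (associativity).
left = compose(compose(*p, *q), *r)(())
right = compose(*p, compose(*q, *r))(())
```

The stack only shows up as the argument type of these functions, which is one way to read the remark that it is an implementation detail as far as the semantics go.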

---

[1]: "PostScript and Interpress: a comparison", Brian Reid: http://tech-insider.org/unix/research/1985/0301.html

[2]: Masterminds of Programming, Federico Biancuzzi & Shane Warden (eds.)

sifoo · on March 28, 2018

Besides basic semantics, there's not that much stack juggling going on really. Cixl offers four stack operators, drop, dup, swap and clear; the idea is to use more convenient constructs like let bindings, closures and multi methods to keep the stack tidy.

lallysingh · on March 27, 2018

They're trolling o'caml, aren't they? This is wonderful.

FractalLP · on March 28, 2018

No, I think OForth has been out for a long time. Probably not before OCaml, or even close, but still not a trolling thing. Both just use "O" to symbolize some Object stuff.