Tamgu, a functional, imperative, logical programming language (github.com/naver)
63 points by clauderoux on July 24, 2019 | 48 comments



Automatic type conversion is, I think, a mistake. It’s a huge cause of bugs. But even if not, automatic string-to-number parsing is definitely not something I want in my production code. Not only does it cause hard-to-debug errors; in a tight loop it can also have a big performance impact. There’s a reason modern languages have become stricter about type conversion, not less strict. This language looks cool, but I will never use it for this reason.


I might start a flame war here, but I feel like academics and people who give conference talks have adopted more strictly typed languages. Actual developers instead embraced rather dynamic languages (Python, JavaScript, etc.) from the get-go and continue to use such languages without the type hints because it's easy and they have things to build.


IME dynamic typing is super easy and fast until it suddenly isn't. Then you're going through ten layers of functions to check if None is a valid value for some parameter because some intern three years ago figured that would be the easiest way to do it.

I'd say that while there's certainly a ton of purely dynamically-typed code out there, there's also a current reversal of the trend, with static and gradual typing coming back in vogue. Python has fully-supported type annotations, and Typescript is the trendy new way of writing JS. Facebook has ReasonML and Hack/PHP 7. Go and Rust, the current "cool kid" languages, are fully statically typed.

I think that the bad taste left in people's mouths by "enterprise" Java and its 30-character type names is fading, and that we'll see a few years of static types. Then of course the cycle will repeat itself, as is tradition.


My favourite language is Clojure, but this exact thing leaves a bad taste sometimes. Clojure has the attitude that you should validate on the edges, using spec or similar, and then have dynamic types in between. This is really super convenient! However, as the codebase grows, subtle type errors do slip through and they get rather hard to catch, since it's now a runtime error that may not happen until a long time later (and then Clojure's notoriously bad error messages certainly don't help). So you hope your test suite catches it. Luckily, with generative property-based testing, which Clojure has reasonable support for, this can often get caught early, but during tests is still not as early as with a type checker, and generative tests could still take a very long time to catch subtle issues (but they catch a lot more than just type errors!). Of course, Clojure also encourages developing in the REPL, so most of the time, you can be pretty confident that it will work before you even run the tests. Still, I feel like some kind of statically typed Clojure (not a separate thing like Typed Clojure, which is slow and clunky, but something with first-class static type checking as part of the language) would be awesome. Maybe in reality it would conflict with many of Clojure's other niceties though.


dynamic languages still have strong typing. the given example of type conversion is one of weak typing. it has nothing to do with strict typing that enforces type declarations.

in python this is an error:

  >>> i = 10
  >>> s = "20"
  >>> s+=i;
  TypeError: cannot concatenate 'str' and 'int' objects
  >>> i+=s;
  TypeError: unsupported operand type(s) for +=: 'int' and 'str'


It's a nice error message, but it's a runtime error, it might not happen until weeks into production. It's technically just a crash with a useful error message. "strict typing that enforces type declarations" is generally called static typing, and does not necessarily always require annotations or declarations. It spots such errors at compile time using static analysis.
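To put the compile-time point in Python terms, here is a minimal sketch. It assumes an external checker such as mypy is run over the file before deployment; the function names `add_ten` and `call_with_string` are made up for illustration. With the annotation in place a checker can reject the bad call ahead of time, whereas the interpreter on its own only fails when the line actually executes.

```python
def add_ten(i: int) -> int:
    """Add ten to an integer."""
    return i + 10


def call_with_string() -> str:
    """Show what happens at runtime when the annotation is violated."""
    try:
        # A static checker would flag this call because a str is passed
        # where an int is declared. The interpreter ignores annotations,
        # so the mistake only surfaces when this line runs.
        add_ten("20")  # type: ignore[arg-type]
    except TypeError as exc:
        return type(exc).__name__
    return "no error"
```

Calling `call_with_string()` only reports the problem once that code path is exercised, which is exactly the "weeks into production" risk described above.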


correct, and i am a fan of that too, but the poster above was conflating weak typing with dynamic typing, which are entirely different things.

my preferred language is pike, which is a strong and declarative typed language (declarative typing is the term i prefer since not everyone defines static typing the same way, and some statically typed languages are weak, like this example here)

pike does catch type errors at compile time, but behaves like a strong dynamic typed language at runtime.

i understand that python and ruby's optional types now allow for something similar.


> it might not happen until weeks into production.

It should never happen in production - there should be a test that proves this.


To quote Dijkstra, testing shows the presence, not the absence of bugs. If you want to actually prove the absence of these classes of bugs then you need a static type system.


with declarative or static typing i don't need a test for this case.


That's where a dynamically typed language with optional typing gets handy, when a compiler supports it...

In Common Lisp, the SBCL compiler would be such an example:

  * (defun foo () (let ((i 10) (s "20")) (concatenate 'string s i)))
  ; in: DEFUN FOO
  ;     (CONCATENATE 'STRING S I)
  ; 
  ; caught WARNING:
  ;   Derived type of I is
  ;     (VALUES (INTEGER 10 10) &OPTIONAL),
  ;   conflicting with its asserted type
  ;     SEQUENCE.
  ;   See also:
  ;     The SBCL Manual, Node "Handling of Types"
  ; 
  ; compilation unit finished
  ;   caught 1 WARNING condition
  FOO
  *


As a dev and not an academic, I totally disagree. "dynamic languages" sounds more Kewl than static (well, it's dynamic, innit, it's probably got fundamental synergisms too) but they are a bitch to work with when things get large unless you get really disciplined, and in less careful hands you get bugs. Lisp lovers say it works for them and I won't argue as I've no experience, but in such langs I've used (python, SQL, javascript, VB), well, the road to hell is warm and inviting, especially for the less experienced.

I'm saying this as an experienced dev, please consider from this and others that you may be wrong, and avoid the trouble we've had.

Or not. Everyone has to find their own way.


I’ve just started a project in Go coming from Python and I don’t feel less productive. In fact, I’m digging Go’s decoupling mechanisms when you use interfaces “as intended”. I might get more productive in Go rather than Python soon.


welcome to the fold. Python -> Go dev here and what a refreshing change of pace strong types and a super opinionated language are.


It's common for especially bigger companies that use JavaScript to use some framework that mandates static typing. This is true here at Google, for example.


> and continue to use such languages without the type hints because it's easy and they have things to build

I continue to use python without type-hints because I know of no other similar languages with type-hints that are as popular.


Sure, less strict languages have more production use, but they’re typically older too. That doesn’t mean that conversion bugs aren’t a problem or aren’t hard to find, just that the burden is pushed onto the test suite.


I have been working for years in NLP, and one of my main difficulties during all these years (I will not give any figures, some may think that I belong to a species that has long since disappeared :-) ) was the preparation of the corpora. The corpora are big dirty vicious beasts full of mistakes and traps. The first problem I encountered was the mixing of encodings, where a piece of UTF8 mixes with some LATIN leftovers. Tamgu provides quite sophisticated means to solve this problem. The second problem was the parsing of the files to extract the relevant information. That's when I made the decision to offer automatic conversion elements. On the other hand, it should be noted that Tamgu is a typed language and that the interpretation of an object is done in context. When I write:

string s = 10; s += 20;

What I define is an operation within a string. The context is defined by the recipient variable. In the same way:

int i = 20; i += "30";

The variable "i" defines an integer context and therefore string 30 is considered in this case as a number.
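For readers more familiar with Python, the "recipient defines the context" rule can be emulated roughly as follows. This is only a sketch of the idea, not how Tamgu implements it: the helper `eval_in_context` is a hypothetical name, and it simply converts every operand to the recipient's type before combining them.

```python
from functools import reduce
from operator import add


def eval_in_context(target_type, *operands):
    """Convert every operand to target_type, then combine them with +."""
    converted = [target_type(x) for x in operands]
    return reduce(add, converted)


# Same shape as: string s = 10; s += 20;   (string context)
s = eval_in_context(str, 10, 20)

# Same shape as: int i = 20; i += "30";    (integer context)
i = eval_in_context(int, 20, "30")
```

In this emulation `s` ends up as the string `"1020"` and `i` as the integer `50`. Note that Python's `int("abc")` raises `ValueError`; what Tamgu does with a non-numeric string is not stated in this thread.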


I get that you made it to scratch an itch you had and solve a problem you had with other languages, and that's cool, but as an external person looking in, who doesn't have that exact itch, it's not for me, despite the rest of the language really looking quite awesome, because for the kind of code I'm writing, this seems like it will cause a lot more frustrating bugs than convenience.

I don't deal with a lot of ad-hoc data, but even when I do, that data usually needs strict validation, as it tends to not be very clean. Having things silently convert hides this and while it may make the code simpler, that code should have validation anyway, so it really won't be any more work in the long run (for me). Slight errors in data cause billion dollar bugs, after all. Maybe not in NLP though, so, again, we have different use cases.
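As a small illustration of what "strict validation" could look like at the point where data enters the program, here is a sketch in Python. The function name `parse_quantity` and the rule "non-negative ASCII integer" are assumptions chosen for the example, not something from the thread.

```python
def parse_quantity(raw: str) -> int:
    """Parse a non-negative integer field, rejecting anything unclean."""
    text = raw.strip()
    # Accept ASCII digits only, so stray letters, signs, inner spaces
    # or exotic Unicode digits are reported instead of silently converted.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a clean integer field: {raw!r}")
    return int(text)
```

A value such as `" 42 "` is accepted, while `"12a"` or an empty field raises `ValueError` right where the dirty data shows up.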

But that's just me and obviously my priorities differ from yours. In any case, congratulations on building this, that's quite an achievement.


Thank you. Actually, we used this programming language and its ancestors (this is the fifth iteration since 2010) in many evaluation campaigns and it proved very useful. For instance, XML is often the way datasets are exposed, and I have encapsulated libxml2 in order to read these files properly. I also provide a kind of BNF grammar compiler to read complex formats and split them into meaningful components. The conversion is really done at the very end of the process, which prevents most errors. Furthermore, Tamgu requires each variable to be explicitly typed in function declarations and for global and local variables, which prevents many of the problems that I used to have with Python. The fact that types are explicit transforms many Python runtime errors into compile errors in Tamgu.


> The fact that types are explicit transforms many Python runtime errors into compile errors in Tamgu.

Nice!


Automatic conversion hides type mismatches throughout the program, even in the parts that are not concerned with input validation and cleanup. So that makes static typing a bit less useful.

As a source code reader I would much prefer to see at least a keyword or function call to make the casting explicit.

string s = cast 10; s += cast 20;

(I had the same reaction as the parent and think the language looks really cool otherwise)


Thank you for your comments. Actually, it is possible in Tamgu to have an explicit cast:

i = int(s)+10;

Each type can be used as a "function" to create the object you like, which is basically what a cast is...
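For comparison, Python follows the same "type name as constructor" convention, so the explicit version reads almost identically there. A minimal sketch (variable names are arbitrary):

```python
s = "20"

# Explicit conversion first, then ordinary integer addition.
i = int(s) + 10

# Going the other way also has to be spelled out.
label = str(i) + " items"
```

Here `i` is `30` and `label` is `"30 items"`; leaving out either call raises `TypeError` instead of converting silently.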


I also deal a lot with data transformations and also think auto-conversion is BAAAAAD...... BUT, it is a signal that most languages (all?) are pretty bad at handling this concern.

I'm building one too in Rust and there exists a very nice concept:

https://doc.rust-lang.org/std/convert/trait.From.html

It's the first time I've seen a blessed way in a language to define conversions.

For example, to parse strings into int you need to say (in pseudo-code):

    convert::From(String) -> Int //Or better convert::TryFrom

and suddenly you get a way to convert strings into ints EVERYWHERE.

Is amazing.

The key here is that it is for much more than just basic stuff. You can do a lot of very convoluted things here, like auto-converting a vector of ints into a Zip compressed stream in one go.
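The "define the conversion once, use it everywhere" idea can be sketched outside Rust as a registry keyed by source and target type. The Python below is only a loose analogue of `From`/`TryFrom`, not a port of the trait; `register`, `convert` and `_CONVERSIONS` are hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, Tuple

# (source type, target type) -> conversion function
_CONVERSIONS: Dict[Tuple[type, type], Callable[[Any], Any]] = {}


def register(source: type, target: type):
    """Decorator recording how to turn a `source` value into a `target`."""
    def decorator(func):
        _CONVERSIONS[(source, target)] = func
        return func
    return decorator


def convert(value, target: type):
    """Apply the registered conversion, failing loudly if none exists."""
    try:
        func = _CONVERSIONS[(type(value), target)]
    except KeyError:
        raise TypeError(
            f"no conversion from {type(value).__name__} to {target.__name__}"
        ) from None
    return func(value)


@register(str, int)
def _str_to_int(text: str) -> int:
    # int() raises ValueError on bad input, so the conversion is fallible,
    # which is closer in spirit to TryFrom than to From.
    return int(text)
```

With this in place `convert("42", int)` works anywhere in the program, while a pair that was never registered (say `float` to `int`) is refused with `TypeError` rather than guessed at.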

----

The second thing required is a way to avoid the boilerplate when dealing with data. If I read a file and need a lot of parsing and validation and need to be relaxed about this... how about the same kind of choice as between eager and lazy?

- Have "strict" by default, where you need to manually write calls such as

    1->Str + "hello" + today->Str + ...

- Have an "auto-convert" block when you need to chain a lot of steps:

    auto {
       1 + "hello" + today + ...
    }

This is nice because you can see where the "big dirty vicious beasts full of mistakes and traps" are in your code!


Tamgu offers something very similar to that. I borrowed a lot of elements from Haskell, including data management but also declarations. In Tamgu functional elements are enclosed in <..>

In this example, I implement a join that takes a vector and transforms it into a string.

//type declaration

<joining:: vector -> string>

//The function itself

<joining(v) = x | x <- v>

println(joining(['a'..'z']));
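A rough Python counterpart, for readers who do not know the Haskell-style syntax. It assumes the Tamgu function amounts to plain concatenation of the elements, which is my reading of "takes a vector and transforms it into a string" rather than something verified against Tamgu.

```python
import string


def joining(v) -> str:
    """Concatenate every element of v into one string."""
    return "".join(x for x in v)


# Python has no 'a'..'z' range literal; string.ascii_lowercase stands in.
letters = list(string.ascii_lowercase)
print(joining(letters))
```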


Any reason why you don't give addition and concatenation distinct operators?

Perl does this and it's a lot cleaner IMO. Whenever you see a "+" you can consider it an explicit numeric conversion of both arguments: https://codespeaks.blogspot.com/2007/09/ruby-and-python-over...

In fact, there is no dedicated numeric conversion operator in perl. The canonical "operator" for this is called the "0+" operator.

Python-style dynamic-typing and strictness is sort of the worst-of-both-worlds approach. Many times I've lost rarely-seen log messages to "cannot concatenate 'str' and 'int' objects" errors.
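The lost-log-message problem is usually avoided in Python by not building the message with `+` at all. A minimal sketch using the standard `logging` module (the logger name and both function names are arbitrary):

```python
import logging

logger = logging.getLogger("example")


def log_retry_count(count: int) -> None:
    # Fragile alternative: "retries: " + count raises TypeError when
    # count is an int, and the message never reaches the log.
    # Passing the value as an argument lets logging do the formatting.
    logger.warning("retries: %s", count)


def format_retry_count(count: int) -> str:
    # f-strings call str() on the value, so no TypeError here either.
    return f"retries: {count}"
```

Both forms convert explicitly through formatting, which is close to having a dedicated concatenation operator in practice.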


TAMGU (탐구) is a programming language that combines functional, imperative and logical paradigms into a single formalism. The language has also been specially designed to simplify automatic annotation and data augmentation for Data Programming. Pre-compiled versions for Windows, Mac OS and Linux are available at: https://github.com/naver/tamgu/releases


"20" + 10 = "2010"

10 + "20" = 30


`+=` is a different operator than `+`. `+` is symmetric while `+=` isn't.


Doubtless. But it's one more special case to remember, is it worth it?


Hello,

I guess I should have been more explicit. I'm sorry for the confusion.

First, Tamgu is a language in which the recipient variable defines the context in which the instruction is evaluated.

If your recipient variable is an integer then everything on the right side of your assignment will be treated as an integer. If it is a string, well everything will be treated as a string.

We use this approach in many other instances:

int pos;

vector v;

string s="This is a testing case";

pos = "i" in s; //we are looking for "i" in the string

In this case, pos==2.

v = "i" in s; //we are looking for all positions of "i" in the string

v == [2,5,14]
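In a language without context-dependent evaluation the two meanings need two separate spellings. A Python sketch of both, with hypothetical helper names `first_position` and `all_positions`:

```python
def first_position(needle: str, haystack: str) -> int:
    """Index of the first occurrence, or -1 if absent."""
    return haystack.find(needle)


def all_positions(needle: str, haystack: str) -> list:
    """Indices of every occurrence, scanning left to right."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


s = "This is a testing case"
pos = first_position("i", s)
v = all_positions("i", s)
```

On this string `pos` is `2` and `v` is `[2, 5, 14]`, matching the values given above.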


I personally value clarity and this seems to obscure meaning. I'm ok with operator overloading when it's super clear (eg + is addition, so having it overloaded to also do vector addition is clear to me, but overloading it to do string concatenation is less clear). I guess my worry is that this adds unnecessary cognitive load for minor convenience.

Having said that, if it works for you, awesome. Don't let me tell you otherwise, not everything has to be to my taste, after all.


oof. I see what you say about the LHS defining how the RHS is evaluated but that attaches expression evaluation to assignment. You can't do one without the other. Also the target type may be obscure to the programmer. What's the behaviour with

  string s="This is a testing case";
  println("i" in s);

But separately your example now looks even worse. It's quite reasonable to expect 'in' to behave the same way for both cases; that they both return [2,5,14] - now you've added another complication.

No offence, but without some overriding principle, and a justification for that principle (that "things are demonstrably easier if you do it this way"), you've just dumped an extra load on the programmer, which I do not need!


None taken... :-)

Well, I did my fair share of programming over the years in so many different languages (Pascal, Cobol, C, C++, Java, Python, APL, Lisp, Prolog, Basic, Smalltalk, various assemblers) that I honestly cannot remember all of them.

Tamgu is the result of this experience and my choice was always towards compactness and readability, with Perl being my personal nemesis.

The advantage of this approach is that you don't need to remember a long list of operators, they are simply re-interpreted in context and the re-interpretation is pretty consistent over most of the code.

But, well as the adage goes: Of tastes and colors...


By the way, you can easily enforce whatever interpretation you want with a cast:

println(vector("i" in s));


Quick point of feedback: the first two imperative (+=) statements presumably are being treated as independent, not part of the same sequence. I.e.:

  "20" + 10 = "2010"

  10 + "20" = 30

This was confusing, given that they are sequential statements listed without any clear separation.


I like the idea of having something like PROLOG/minikanren builtin to the language.

Prolog is neat, but I wouldn't use it for text processing and database access, so the logic programming piece to me makes better sense as embedded functionality in a more general purpose language, which is what I'm guessing you did here?


Absolutely, you're right. In fact, I first slightly modified the Prolog syntax by replacing the traditional variables that start with a capital letter with variables of the form "?X". We can therefore differentiate between Tamgu variables and Prolog variables. In addition, Tamgu's basic objects, such as vectors, strings or numbers, are automatically reinterpreted as Prolog vectors, strings or numbers.

We can therefore apply operations like: [?X|?Z].

Finally, Tamgu considers that the reception variables before the "=" sign define the execution context. Therefore, if we take the following program:

parent ("John", "Peter").

parent ("John", "Mary").

parent ("Peter", "Roland").

parent ("Roland", "Pierre").

bool b = parent("John",?X);

vector v = parent(?X,?Y);

The first case will return "true" if we find a matching occurrence. Here "b==true".

The second case will return all the possibilities that match:

[parent("John", "Peter"),parent("John", "Mary"),parent("Peter", "Roland"),parent("Roland", "Pierre")]

Access to each parameter is possible:

v[0] == parent ("John", "Peter")

v[0][1] == "Peter"

v[0].name() == "parent"
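To make the bool-versus-vector distinction concrete for non-Prolog readers, here is a Python emulation of the same fact base. It is a sketch only: `None` stands in for an unbound `?X`, and the facts are stored as plain tuples rather than Tamgu predicate objects, so there is no counterpart to `.name()`.

```python
FACTS = [
    ("John", "Peter"),
    ("John", "Mary"),
    ("Peter", "Roland"),
    ("Roland", "Pierre"),
]


def parent(x=None, y=None):
    """Yield every (parent, child) fact matching the bound arguments.

    None plays the role of an unbound variable.
    """
    for px, py in FACTS:
        if (x is None or x == px) and (y is None or y == py):
            yield (px, py)


# Boolean context: is there at least one solution?
b = any(parent("John"))

# Collection context: gather all solutions.
v = list(parent())
```

Here `b` is `True`, `v` holds all four pairs, and `v[0][1]` is `"Peter"`, in line with the values listed above.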


Congrats @clauderoux! -- I've been working on similar fusions on and off during the years.

So "?X" is not true logical variable, but more of a part of a pattern expression? -- in other words, a logical var is a first-class object that can be stored in a data-structure (possibly in an unbound state) and be bound at a later time, and perhaps unbound and re-bound again if there is backtracking. So in Tamgu, "parent(?X,?Y)" acts as a one-shot "find" if the LHS is atomic, and acts as a generator (similar to a possibly nested list comprehension) if the LHS is a collection (but no inter-statement backtracking points are established), correct? So is the generation of the sequence eager/realized or lazy/by-need?


Exactly. If you provide a collection, the system will deduce that you want all solutions to be extracted, while if you provide a non collection variable then it will stop once one solution has been found. Actually, I have a very specific type, which is "predicatevar", if you want to get your unification once.

?X is actually a true Prolog variable that can be unified over and over again. Furthermore, you can call Tamgu functions from your predicate description, in which these variables can be used if they are unified.

grandparent(?X,?Y) :- parent(?X,?Z), println(?Z), parent(?Z,?Y).

In the case above, the "println", which is not a predicate, will try to display the content of ?Z, if ?Z has been unified.

println could be replaced with a call to an actual function, as long as this function returns true. If the function returns false, then it will be considered as a fail.
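The rule with an embedded function call can be mimicked in Python with a generator, which also shows the one-solution versus all-solutions distinction discussed above. This is an emulation under stated assumptions, not Tamgu's evaluation strategy: `on_middle` is a hypothetical hook standing in for the `println(?Z)` call, and returning `False` from it is treated as failure of that branch.

```python
PARENT = [
    ("John", "Peter"),
    ("John", "Mary"),
    ("Peter", "Roland"),
    ("Roland", "Pierre"),
]


def grandparent(on_middle=lambda z: True):
    """Lazily yield (x, y) such that parent(x, z) and parent(z, y) hold.

    on_middle is called with each candidate z; a False result makes that
    branch fail, like a function returning false inside the rule body.
    """
    for x, z in PARENT:
        if not on_middle(z):
            continue
        for z2, y in PARENT:
            if z2 == z:
                yield (x, y)


first = next(grandparent())          # stop after one solution
everything = list(grandparent())     # exhaust all solutions
filtered = list(grandparent(lambda z: z != "Peter"))
```

`first` is `("John", "Roland")`, `everything` is `[("John", "Roland"), ("Peter", "Pierre")]`, and `filtered` drops the branch that goes through Peter.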


I have the exact same opinion and like minikanren libraries in Scheme and Clojure for that reason. I always liked Prolog, it always felt like an awesome way to write certain aspects of a program, but always felt super awkward for the rest of the program. With this approach, you can write the bits that make sense in the style that most suits. That's a pretty nice thing to be able to do!


Actually, there is a simple program, hanoi.tmg, in https://github.com/naver/tamgu/tree/master/examples/Predicat..., which shows how to mix Prolog and imperative programming to the benefit of both.


Exactly!


Nice set of features and paradigm mix. For which platforms does it provide binaries? Does it produce binaries when it compiles, and where does the compiled program run?

Can you show some realistic usage or application, and how all these features hold together? An MVP todo list would do, for the web or for the command line.


Hello,

I provide binaries for Windows, Mac OS, Fedora, Centos and Ubuntu: https://github.com/naver/tamgu/releases

You also have some examples in: https://github.com/naver/tamgu/tree/master/examples

I provide makefiles for all platforms:

Windows: Visual 2013

Mac OS: both command line Makefile and Xcode Makefiles (including the native Mac OS GUI)

Linux (also for Mac OS): I provide a Python script, install.py, that determines what is available on your machine and creates a Makefile.in.


Looks like another rewriting of S-expression based logical programming system into non S-expression one.


tamgu(탐구). interesting. clauderoux, are you from korea?


Not exactly, but I work for Naver, which is a Korean company. The lab I work for is in France, near Grenoble in the French Alps.



