- A dynamic part specifies what is actually allowed, and totally supports doubly linked lists and other sorts of cyclic mutable data structures.
- A static part conservatively approximates the dynamic part, but is still flexible enough to express the interfaces and usage of these data structures even if it can't check their implementations.
This is the important difference from traditional static analysis of C. It enables `unsafe` library code to bridge the dynamic and static rules in a modular way, so that extensions to the static rules can be composed safely, downstream of their implementation.
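The standard library's doubly linked list is a small illustration of that split: its implementation needs `unsafe` internally, but its interface lives entirely inside the static rules. A trivial usage sketch, using only real `std::collections::LinkedList` methods:

```rust
use std::collections::LinkedList;

fn main() {
    // The cyclic, mutable structure inside LinkedList can't be expressed
    // under the static borrow rules, but this use of its safe interface is
    // checked like any other safe code.
    let mut list: LinkedList<i32> = LinkedList::new();
    list.push_back(1);
    list.push_front(0);
    list.push_back(2);
    assert_eq!(list.iter().copied().collect::<Vec<_>>(), vec![0, 1, 2]);
}
```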
Rust's strategy was never for the built-in types like `&mut T`/`&T` to be a complete final answer to safety. It actually started with a lot more built-in tools to support these patterns, and slowly moved them out of the compiler and runtime and into library code, as it turned out their APIs could still be expressed safely with a smaller core.
Something like Fil-C would be totally complementary to this approach- not only could the dynamic checking give you stronger guarantees about the extensions to the static rules, but the static checks could give the compiler more leverage to elide the dynamic checks.
C++ is not a dependently typed language, for the same reason that templates do not emit errors until after they are instantiated. All non-type template parameters get fully evaluated at instantiation time so they can be checked concretely.
A truly dependently typed language performs these checks before instantiation time, by evaluating those expressions abstractly. Code that is polymorphic over values is checked for all possible instantiations, and thus its types can actually depend on values that will not be known until runtime.
The classic example is a dynamic array whose type includes its size- you can write something like `concat(vector<int, N>, vector<int, M>) -> vector<int, N + M>` and call this on e.g. arrays you have read from a file or over the network. The compiler doesn't care what N and M are, exactly- it only cares that `concat` always produces a result with the length `N + M`.
I'm not sure what "dependently typed" means but in C++20 and beyond, concepts allow templates to constrain their parameters and issue errors for the templates when they're specialized, before the actual instantiation happens. E.g., a function template with constraints can issue errors if the template arguments (either explicit or deduced from the call-site) don't satisfy the constraints, before the template body is compiled. This was not the case before C++20, where some errors could be issued only upon instantiation. With C++20, in theory, no template needs to be instantiated to validate the template arguments if constraints are provided to check them at specialization-time.
This is the wrong side of the API to make C++20 dependently typed. Concepts let the compiler report errors at the instantiation site of a template, but they don't do anything to let the compiler report errors with the template definition itself (again before instantiation time).
To be clear this distinction is not unique to dependent types, either. Most languages with some form of generics or polymorphism check the definition of the generic function/type/etc against the constraints, so the compiler can report errors before it ever sees any instantiations. This just also happens to be a prerequisite to consider something "dependently typed."
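For a rough sketch of what definition-site checking looks like in practice (plain stable Rust, nothing specific to dependent types):

```rust
use std::fmt::Display;

// The body is checked against the `T: Display` bound once, at the definition,
// so the compiler can reject it before it ever sees a caller.
fn shout<T: Display>(x: T) -> String {
    // Calling e.g. `x.len()` here would be an error even with zero callers,
    // because the bound does not promise such a method.
    format!("{}!", x)
}

fn main() {
    println!("{}", shout(42));
    println!("{}", shout("hello"));
}
```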
In C++ it does care what N and M are at compile time; at least the optimizer does, for autovectorization and unrolling. Would that not be the case with const generic expressions?
The question of whether a language is dependently typed only has to do with how type checking is done. The optimizer doesn't come into play until later, so whether it uses the information is unrelated to whether the language is dependently typed.
Ok, I think I understand now, but is it really dependently typed just because it symbolically verified it can work with any N and M? Because it will only generate code for the instantiations that get used at compile time.
Rust already does have some level of const generic expressions, but they are indeed only possible to instantiate with values known at compile time, like C++.
The difficulty of type checking them symbolically still applies regardless of how they're instantiated, but OTOH it doesn't look like Rust is really trying to go that direction.
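For concreteness, here is roughly what the `concat` signature from upthread looks like with Rust's const generic expressions. This is a sketch that needs the nightly-only, explicitly incomplete `generic_const_exprs` feature, and N and M still have to be compile-time constants at every call site, which is exactly the limitation being discussed:

```rust
// Nightly-only sketch; `generic_const_exprs` is incomplete and unstable.
#![allow(incomplete_features)]
#![feature(generic_const_exprs)]

fn concat<const N: usize, const M: usize>(a: [i32; N], b: [i32; M]) -> [i32; N + M] {
    let mut out = [0i32; N + M];
    out[..N].copy_from_slice(&a);
    out[N..].copy_from_slice(&b);
    out
}

fn main() {
    // Monomorphized with the concrete values N = 2, M = 3; there is no way
    // to call this with lengths that are only known at runtime.
    let xs = concat([1, 2], [3, 4, 5]);
    assert_eq!(xs, [1, 2, 3, 4, 5]);
}
```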
The only thing needed here is to be able to lift N & M from run-time to the type system (which in C++ as it stands exists only at compile-time). For "small" values of N & M that's doable with switches and instantiations, for instance.
The point of dependent types is to check these uses of N and M at compile time symbolically, for all possible values, without having to "lift" their actual concrete values to compile time.
Typical implementations of dependent types do not generate a separate copy of a function for every instantiation, the way C++ does, so they simply do not need the concrete values in the same way.
Right? So I'm apparently an idiot for thinking Rust's borrow checker, or Go's blissful deployment experience, or Zig's clever comptime system are good... Ok I guess I'm not going to like Odin then.
It's not the same unwritten social contract: in Rust even the unsafe code has the same strict type signatures as the safe code, so there is a formal way to judge which part of the program is at fault when the contract is broken. You might say the contract is now written. :-)
In C, the type system does not express things like pointer validity, so you have to consider the system as a whole every time something goes wrong. In Rust, because the type system is sound, you can consider each part of the program in isolation, and know that the type system will prevent their composition from introducing any memory safety problems.
This has major implications in the other direction as well: soundness means that unsafe code can be given a type signature that prevents its clients from using it incorrectly. This means the set of things the compiler can verify can be extended by libraries.
The actual practice of writing memory-safe C vs memory-safe Rust is qualitatively different.
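As a concrete sketch of that last point, here is the textbook split-a-slice-in-two pattern (essentially what `slice::split_at_mut` in the standard library provides): the body needs `unsafe`, but the signature alone prevents callers from ever holding two overlapping `&mut` into the same data.

```rust
use std::slice;

// Safe API over an unsafe implementation: the returned halves borrow from
// `v`, so the borrow checker treats them like any other pair of disjoint
// mutable borrows, with no way for a caller to misuse them.
fn split_at_mut<T>(v: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = v.len();
    let ptr = v.as_mut_ptr();
    assert!(mid <= len);
    // SAFETY: the two halves are disjoint and both lie within `v`, so no
    // aliasing mutable access is created.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (left, right) = split_at_mut(&mut data, 2);
    left[0] = 10;
    right[0] = 30;
    assert_eq!(data, [10, 2, 30, 4, 5]);
}
```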
I take this to mean e.g. the Oxide project has proven the Rust type system sound?
There was a Git repository demonstrating unsoundness issues in the compiler, but I can't seem to find it anymore :/. It seemed like there was more than one underlying issue, but I don't really remember the details.
They could not "just" have used SPIR-V bytecode. WebGL already had to do a bunch of work to restrict GLSL's semantics to work in the web sandbox; whatever WebGPU chose would have had the same problem.
I do not believe this is true anymore; WGSL is a "real language" now that actually requires a "real language frontend" to compile. The original spec didn't even have while loops, they expected you to effectively write the SSA graph by hand (because SPIR-V is an SSA IR). Then the design drifted as they kept adding pieces until they accidentally made an actual language again.
But it won't, because people implement conservative GCs when they are not in control of all the pieces, like the compilers used for some parts of their run-time.
In fact, those union-find trickeries come from the same paper that presented Algorithm W, where they were named Algorithm J. W was known from the start to be more useful for proofs than implementation:
> As it stands, W is hardly an efficient algorithm; substitutions are applied too often. It was formulated to aid the proof of soundness. We now present a simpler algorithm J which simulates W in a precise sense.
Isn't the union-find trickery "just" standard almost-linear unification as discovered by Paterson-Wegman and Martelli-Montanari around 1976? Or is there more to it in the setting of Algorithm J?
I'm actually a bit surprised that it took so long to discover these formulations of unification. I wonder what Prolog systems were doing at the time, given the importance of efficient unification.
Paterson-Wegman and Martelli-Montanari are worst-case linear, but Algorithm J just uses the earlier almost-linear union-find approach: unification variables are represented as mutable pointers to either a) nothing, or b) another type (which may itself be the "parent" unification variable in the union-find structure). Unification is then just a recursive function over a pair of types that updates these pointers as appropriate- and often doesn't even bother with e.g. path compression.
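For concreteness, a rough sketch of that representation (my own illustration, not code from either paper): unification variables are indices into a table of optional bindings, and unification recursively walks a pair of types, updating the table as it goes.

```rust
#[derive(Clone, Debug)]
enum Ty {
    Var(usize),              // unification variable, by index into the table
    Int,
    Arrow(Box<Ty>, Box<Ty>), // function type
}

struct Infer {
    // Slot i is None while variable i is unbound; once bound it holds a type,
    // which may itself be another Var (the union-find "parent" chain).
    slots: Vec<Option<Ty>>,
}

impl Infer {
    fn fresh(&mut self) -> Ty {
        self.slots.push(None);
        Ty::Var(self.slots.len() - 1)
    }

    // Chase bound variables until reaching an unbound var or a constructor.
    // No path compression, as noted above.
    fn resolve(&self, t: &Ty) -> Ty {
        match t {
            Ty::Var(i) => match &self.slots[*i] {
                Some(bound) => self.resolve(bound),
                None => t.clone(),
            },
            _ => t.clone(),
        }
    }

    fn unify(&mut self, a: &Ty, b: &Ty) -> Result<(), String> {
        match (self.resolve(a), self.resolve(b)) {
            (Ty::Var(i), Ty::Var(j)) if i == j => Ok(()),
            (Ty::Var(i), t) | (t, Ty::Var(i)) => {
                // A real implementation would do an occurs check here.
                self.slots[i] = Some(t);
                Ok(())
            }
            (Ty::Int, Ty::Int) => Ok(()),
            (Ty::Arrow(a1, r1), Ty::Arrow(a2, r2)) => {
                self.unify(&a1, &a2)?;
                self.unify(&r1, &r2)
            }
            (x, y) => Err(format!("cannot unify {:?} with {:?}", x, y)),
        }
    }
}

fn main() {
    let mut inf = Infer { slots: Vec::new() };
    let (a, b) = (inf.fresh(), inf.fresh());
    // Unify (a -> Int) with (Int -> b): binds a := Int and b := Int.
    let lhs = Ty::Arrow(Box::new(a.clone()), Box::new(Ty::Int));
    let rhs = Ty::Arrow(Box::new(Ty::Int), Box::new(b.clone()));
    inf.unify(&lhs, &rhs).unwrap();
    assert!(matches!(inf.resolve(&a), Ty::Int));
    assert!(matches!(inf.resolve(&b), Ty::Int));
}
```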