Oct 08, 2016

"Excellent, it's easy! Why haven't you completed this yet? :-)"

I did. Anyone who ever wrote their own Scheme, Oberon, or C compiler in a primitive language did. It's a common classroom assignment. :P Anyway, I took a brain injury in an accident that cost me most of my memory back in 2011 or... idk. It's also hard to retain new things. People were amazed at what I'd put together on hardware & some other stuff, but I can't remember how to program hands-on, do set theory, a bunch of things. Only what I repeated a lot in the year(s) leading up to the injury. Unfortunately, I was in research rather than build mode then. :(

Meanwhile, I still have enough fragments and learning ability to work at a high level and pass stuff onto specialists. I've been mostly focused on solving secure hardware since most of the rest is already done to the point people can just build it from published techniques with some effort put in. Got digital, mask, and fab stuff pretty far, but analog, RF, and EMSEC might be uncheatable. Might actually need years to decades of specialist knowledge for a solution. (sighs) I also do lots of evangelism of high-assurance methods in various areas. However, the lack of effort on this very important problem in compiler assurance has led me to consider rolling up my sleeves and bumbling my brain-damaged ass through a compiler project, with maybe this start:

http://schemeworkshop.org/2006/11-ghuloum.pdf

That, something similar in PreScheme, or redoing Oberon or Component Pascal. Then use those to build the rest. Who knows. Just a lot of stuff to relearn. QBE is nice on the C side as it's simple plus has a small number of high-bang-for-buck optimizations. So, if I did it, I'd recode that in CakeML, with me checking the initial tool (eg a simple Scheme/ML) by hand in assembly. I also found a way to do C code (or C-compatible code) in ML. So there are options if I decide to take up your challenge. :)
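To give a sense of how small that starting point is, here's a minimal sketch of step one of Ghuloum's approach, transliterated into OCaml rather than the paper's Scheme (names and file layout are just illustrative): the "compiler" handles only integer literals and emits an x86-64 routine that returns one.

    (* Step one of the incremental approach: compile a single integer literal. *)
    let compile_program (n : int) : string =
      String.concat "\n"
        [ "    .text";
          "    .globl scheme_entry";
          "scheme_entry:";
          Printf.sprintf "    movq $%d, %%rax" n;  (* return value in rax *)
          "    ret"; "" ]

    let () =
      (* Emit the assembly; a tiny C driver that calls scheme_entry and
         prints the result closes the loop, as in the paper. *)
      let oc = open_out "program.s" in
      output_string oc (compile_program 42);
      close_out oc

Each later step (fixnum tagging, unary primitives, and so on) is a similarly small, testable increment.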

"So let's talk about VLISP. "

I lost too many bookmarks in a recent crash. I had a link to the page which had what looked like a downloadable Scheme48. The code generation was done for x86, PPC, ARM, and C. You bet you can still buy those architectures today. Anyway, I'd do a current project in Myreen's LISP or CakeML if I wanted verification, as they're current, maintained, and have more assurance. Actually, my scheme is to run a fast version for development pace side-by-side with a safe, verified version for reference/production if it's not too slow. So, MLton and CakeML, or (fast LISP here) and VLISP/LISP 1.5. I theorize I'd get the best of both worlds that way.
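To show what I mean by side-by-side, here's a hypothetical sketch of the cross-check: run the fast implementation and the trusted reference on the same inputs and flag any divergence. eval_fast and eval_ref are stand-ins for, say, an MLton-built pass and its CakeML/VLISP counterpart.

    type expr = Num of int | Add of expr * expr

    let rec eval_ref = function          (* slow, obviously-correct reference *)
      | Num n -> n
      | Add (a, b) -> eval_ref a + eval_ref b

    let eval_fast e = eval_ref e         (* placeholder for the optimized path *)

    let rec random_expr depth =
      if depth = 0 || Random.bool () then Num (Random.int 100)
      else Add (random_expr (depth - 1), random_expr (depth - 1))

    let () =
      Random.self_init ();
      for _ = 1 to 1_000 do
        let e = random_expr 5 in
        assert (eval_fast e = eval_ref e)  (* any mismatch is a bug somewhere *)
      done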

"We must improve the tooling and models so that the proofs and layers can be done in a much faster and cost-effective way."

Totally agree. I can't say which will win. That's good. Lean is interesting. Coq & Isabelle/HOL are getting good libraries. Dependent types are faster but controversial. C has LiquidTypes, Frama-C, and Microsoft's VCC, plus lots of static analysis. Don't know where it will go, but doing anything critical in a functional language will definitely help. Then certify equivalence to an imperative version for distribution. Just look at the productivity vs expertise COGENT achieves in the filesystem paper, compared to the seL4 verification in Haskell/HOL/C:

https://ts.data61.csiro.au/projects/TS/cogent.pml

Re: LISP syntax.

I agree, and thanks for Readable. Looks nice! Prior work I saw on this was the Dylan language from Apple. Unfortunately, it didn't get adoption. Maybe it's time for another shot at a beautiful, systems LISP given Clojure's success.

" But we also need to admit that it's going to take a long time, because our tooling is only just becoming good enough, and even then only in certain cases. There are serious limitations you're glossing over."

I'm actually ignoring them on purpose to get a short-term win, as you put it. Specifically, I'm aiming for an incremental, well-tested, inspectable compiler without expecting formal verification. I list it as necessary for full trust but not necessary to get the baseline up. The Ghuloum paper I linked and the Scheme examples show how little work the first steps are if one uses published techniques & tooling that's good for the job. Case in point, by one person:

https://speakerdeck.com/nineties/creating-a-language-using-o...

The next component is keeping the implementation simple and putting DbC-style interface checks in there. Initial quality comes from inspection, testing, modularity, and mostly-pure interfacing. Over time, these two extra things allow the high-assurance crowd to rewrite each pass using whatever tech they have available.
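Roughly like this sketch of a checked pass boundary (constant folding picked purely for illustration, all names made up): the pass stays ordinary code, and the wrapper asserts the properties callers rely on.

    type expr = Num of int | Add of expr * expr

    let rec size = function Num _ -> 1 | Add (a, b) -> 1 + size a + size b

    let rec fully_folded = function
      | Num _ -> true
      | Add (Num _, Num _) -> false              (* a foldable node survived *)
      | Add (a, b) -> fully_folded a && fully_folded b

    let rec fold_consts = function
      | Num n -> Num n
      | Add (a, b) ->
          (match fold_consts a, fold_consts b with
           | Num x, Num y -> Num (x + y)
           | a', b' -> Add (a', b'))

    (* The contract lives at the interface: output never grows and has no
       foldable nodes left. Cheap enough to keep on during development. *)
    let fold_consts_checked e =
      let e' = fold_consts e in
      assert (size e' <= size e);
      assert (fully_folded e');
      e'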

In case I'm not being clear, let me illustrate with an example. The CompCert compiler verified compilation by using cleanly-defined passes through intermediate languages, with specific steps for each. The medium-assurance approach just does that in OCaml and/or CakeML with testing, QuickCheck, assertions, etc. Like regular development with slightly more effort and little expertise. The passes are also kept simple enough that their assurance can go up plug-and-play over time, with others' help, when they formally verify them. This can work with any software that doesn't change a lot at once and stays useful over time. That includes compilers, protocols (esp SSL), key OS components, provers, standard libraries, maybe even parts of databases. I call it, borrowing from the hardware field, Design for Verification for software. Like Orange Book B3: just spec, structure, and code it well enough that it might get improved later.
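Here's what that shape looks like in plain OCaml, with every name invented for illustration: each intermediate language gets its own type, each pass is a small pure function, and a semantics-preservation check (a single assertion here; QuickCheck-style generators would drive it much harder) guards the pipeline.

    type src = Lit of int | Plus of src * src          (* surface language *)
    type ir  = Const of int | BinAdd of ir * ir        (* mid-level IR *)
    type asm = Push of int | AddTop                    (* stack-machine code *)

    let rec lower : src -> ir = function
      | Lit n -> Const n
      | Plus (a, b) -> BinAdd (lower a, lower b)

    let rec select : ir -> asm list = function
      | Const n -> [Push n]
      | BinAdd (a, b) -> select a @ select b @ [AddTop]

    let compile s = select (lower s)                   (* passes swap in/out *)

    (* Reference semantics on both ends give the correctness statement. *)
    let rec eval_src = function Lit n -> n | Plus (a, b) -> eval_src a + eval_src b

    let run_asm prog =
      let step st i = match i, st with
        | Push n, _ -> n :: st
        | AddTop, y :: x :: rest -> (x + y) :: rest
        | AddTop, _ -> failwith "stack underflow"
      in List.hd (List.fold_left step [] prog)

    let () =
      let p = Plus (Lit 1, Plus (Lit 2, Lit 3)) in
      assert (run_asm (compile p) = eval_src p)        (* meaning preserved *)

Each pass is small enough that someone could later replace lower or select with a formally verified version without touching the rest.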

Meanwhile, you get immediate benefits over a less-structured, monolithic compiler in C or C++ that lacks things like QuickCheck or assertion-based test generation. You also get more over time without extra work if large uptake causes CompSci to focus on it. I cheat by recommending the ML stuff they already focus on. So, yeah, I'm all about interim stuff that doesn't need full, formal verification. If only because I lack both the time and memory to use formal verification myself. Gotta help the new kids get finished faster and easier so I can trust/use the compiler, too! :)