Relationship of P6 and/or 007 macros to FEXPRs #302
I don't have the expertise to comment on FEXPRs, unfortunately. Especially since there seems to be some genuine debate about their true nature. What I can say though is that from my vantage point there is one big difference between anything Lisp did in the early days and what we're doing in 007 today (and ultimately Perl 6): whether you're in a quasiquote or not, a variable that's used has to be declared. (And since this is a static relationship, the program won't compile if it doesn't hold.) This is so fundamental to the language that it ends up also permeating how macros work, leading eventually to macro hygiene. If I understand correctly, Scheme's view on macros is quite close to this. Scheme has a flavor of lexical hygiene. I haven't studied it in depth, though I'm planning to. Yesternight I wrote this gist, which is something I've been planning to write for a while. I meant to write it long ago, but found it hard going until inspiration struck yesterday. It's really related to #212 and figuring that one out in order to make #207 work. But I think it's also relevant to this issue, or at least to this comment. 😄 |
Note: I got a lot about FEXPRs from this dissertation
I've seen the term "runtime macros" used, and I think I like that term for FEXPRs. They are more similar to runtime macros, and I'll come back to that. A macro is a "source->source" transform. That's where they're similar to source filters.
The only meta-circular interpreter I have looked at deeply enough would be Rubinius, which... isn't homoiconic, so obviously has huge differences between "source code read" and "syntactic forms". If I were to draw a comparison, FEXPRs are like if you kept the "unevaluated arguments" part of macros, but instead of being replaced by a quasi at compile time, they instead sit there until runtime, where they get evaluated (often containing explicit eval calls):

```
# FEXPR style
fexpr $if(cond, t, f) {
    my @fn = f, t;    # reverse arg, 0 = false, 1 = true
    eval(@fn[!!eval cond]);
}
$if(1, print("true"), print("false"));
```

Here, no transformation is done at compile time. It's the FEXPR's job to eval its arguments. Compare to a macro:

```
macro if(cond, t, f) {
    quasi {
        my @fn = {; {{{f}}} }, {; {{{t}}} };
        @fn[!!{{{cond}}}]();
    };
}
if(1, print("true"), print("false"));
```

which is transformed, at compile time, to:

```
[{; print("false") }, {; print("true") }][!!1]();
```

(as a side-note: 007 used to have ...)

About @masak's gist:
What is the "outer pointer" here? The macro context itself? |
Here's the Calculatrices digitales du déchiffrage. It's from 1954, and describes the first metacircular compiler, a few years before Lisp. |
Yes. In the quasi in the format macro, for example, the But the quasi is not what eventually runs. The injected code in the mainline (the "injectile") is what runs. It looks like the quasi, except unquotes have been filled in with Qtree fragments. The injectile's outer pointer is still the macro, though, just like with the quasi. Since the injectile is somewhere else, its outer pointer is "displaced". |
...huh. As I look at the format macro so soon after mentioning |
I know it wasn’t, but it really is quite similar in how it works. |
@raiph I created a Close this one? Or are we still waiting for some resolution wrt FEXPRs? |
This thread seems to have stalled, and there is no Definition of Done. Closing. |
Sneaking this link in here: http://axisofeval.blogspot.com/2012/03/should-i-really-take-time-to-learn.html |
@raiph Commenting in a closed thread, but (as I'm learning more about fexprs) I just found this quote:
(And so it's kinda implied that we can do better nowadays, with modern tools and lexical scope.) I found Kernel because allegedly its specification is very good. I haven't read it yet. |
@vendethiel This is my current take on fexprs (still not having read the dissertation): There are three "phases of matter", three forms of computation that interest us: (1) the program as source/AST, (2) the compiled program, and (3) the running program and its values.
A compiler basically takes us 1->2, and a runtime does 2->3. What Common Lisp, Scheme, and 007 do is keep us in 1 for as long as we need. What fexprs do is allow us to take fragments of 1 all the way into 3 and only then "evaluate" them (1->2->3), rather than at compile time. (Oh! The dissertation is by the Kernel author. D'oh.) |
Should it still be called "compile-time" then? :) I now understand what it means to have someone post interesting links faster than I can read them; serves me right... I'll try to read what you've sent me. |
Hey, no need to feel "behind" on reading. 😉 Did you see the new papers.md? That's an (incomplete) list of things I haven't read but probably should. Remember, the end goal of all of this is to bring decent macros to Perl 6. |
Hi @masak, My summary of my latest (but very tentative) understanding of fexprs and their relationship to Raku is that the https://www.reddit.com/r/rakulang/comments/g7xluv/question_about_macros/fow63tp/ I've drawn this (again, very tentative) conclusion based on reading Shutt's comments on fexpr's, mostly at LtU. I note his commenting on how macros are more about filling in a template (which reminds me of stuff you've said about 007/alma macros), plus some sweeping simplifications that are possible related to hygiene and quasi quoting (which may at least trigger insights for you). Perhaps you have also already come to the same conclusions without realizing it's because you're implementing fexprs, not macros (in lisp parlance), presuming I'm at least somewhat right about that. |
Hi @raiph, I'm sorry, no, I don't believe Perl 6 macros (or Alma macros) are fexprs at all. At the risk of revealing my still-very-shaky understanding of fexprs — the big distinction is that in Lisp macros (and Perl 6 macros, and Alma macros), after the macro has expanded at compile time, it's "spent". Only its expansion remains in the target code. A fexpr, on the other hand, inserts itself as the thing to be called, and so itself remains in the target code as the "expansion". That's very different, and according to that difference, Perl 6 does not have fexprs. I'm sorry to knee-jerk reply like this without properly reading your reddit post. But I wanted to communicate my understanding and point out a possible confusion, even though my understanding is partial and the confusion is possibly also mine. I promise to go back and read the reddit post (and the source material) more carefully. |
I meant the same when I wrote "compile-time fexprs". Aiui, an alma macro returns an AST. It's simply a value, one that happens to be an AST, that's then just directly spliced into the overall AST. Perhaps I'm wrong about that; perhaps what's returned from an alma macro is an expression that's then immediately evaluated, but I would be fairly shocked to hear that.
|
@raiph I still believe what you are saying does not match what we are seeing in Alma and Perl 6. I say this as the implementor of macro systems in both languages, although I also don't doubt the credentials of John Shutt.
You are correct, as far as that goes. But — since the format of comments allows me to interrupt you to ask a question — are you aware that this is exactly what Common Lisp and Scheme also do? And many many other Lisp dialects. And other languages, too. Including Perl 6. To spell it out, just to make super-sure that we are on the same page:

```
# Perl 6 or Alma
macro sayHi() {
    quasi {
        say("hi");
    };
}
```

```
;; Bel
(mac prhi ()
  `(pr "hi"))
```

These two examples are as identical as the different syntax allows them to be. The little backquote in the Bel code corresponds to the quasi in the Alma code. Both of these are macro mechanisms. Neither of these is a fexpr mechanism. Talking about "compile-time fexprs" further confuses the issue when "macro" already correctly describes the thing. It's a bit like saying: "no no, I don't mean food, I'm talking about more like a kind of solid drink".

Instead of trying to pinpoint and explain exactly what Mr Shutt means when he's writing this, let me show you what a Perl 6 or Alma macro would look like if it were a fexpr:

```
# imagined Perl 6 or Alma with fexprs
macro sayHi() {
    say("hi");
}
```

As you can see, the big difference is the lack of a quasi. Perl 6 and Alma do not work this way. Kernel and picolisp do, as far as I understand. |
Oh! As I started reading the reddit post, it pretty quickly became clear to me where this confusion comes from. Quoting the relevant part:
You are right, the macros in Perl 6 and Alma are not immediately evaluated. But... ...this is an artifact of a deeper language difference, more like a philosophical difference between "compile-then-run languages" and Lisp. (I've been returning to this point in various ways over the years when talking with @vendethiel. Only recently, like last year or so, did I grok this difference.)
Perl 5, Perl 6, Alma, C, C++, Java, C# and many others belong to the first group. Various Lisps belong to the second group. Python has some aspects of the second group — in that both class and function definitions "happen at runtime", but in the end it also belongs to the first group — in that compilation completes for a whole unit before the first statement ever runs. In Alma and Perl 6, things are not immediately evaluated out of macros, because Alma and Perl 6 are not languages of this second type. In languages of the second type, the macro tends to be "expanded at runtime", which is OK because compile time and runtime are so close together. That's why these languages talk about the macro returning something which is then immediately evaluated. Alma and Perl 6 not being of this second type, their macros don't — can't — work like this. More generally, in any language of the first type, macros don't work like this.
I respectfully disagree with this conclusion. They are macros; nothing to do with fexprs. |
Something, I don't quite remember what, prompted me to start reading John Shutt's dissertation about fexprs and vau calculus. And... wow. Don't get me wrong, it's not that I'm now suddenly a raving fexpr convert. I'm not. (For example, I don't want to uproot everything we've done with Alma and start over, just to do it "right" with fexprs. In fact, I think the end result would maybe look more like Kernel and less like macros for Raku, which wouldn't be in line with 007's or Alma's mission statement.) No, the reason I say "wow" is that it's almost ridiculous how much falls, for lack of a better metaphor, into place when I read this. There's so much to process and appreciate from the dissertation, and I'm not even through the entire text yet, but I still want to try and summarize it. This will be a multi-comment kind of thing, by necessity. At some point when reading through the dissertation I needed to take a detour and read the Kernel Report. I heartily recommend doing so; more below. Ping @raiph @vendethiel @jnthn. |
In medias res

Instead of building up to an exact description of my current understanding of Kernel's central abstraction — the operative — let's just define one, and talk about what falls out:
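(A stand-in sketch, not the definition from the original comment, which is missing from this copy: a small Kernel operative of the general kind being discussed. The name $and? and its body are my assumption.)

```
;; A short-circuiting "and" as a Kernel operative. The operands arrive
;; unevaluated; env is the caller's dynamic environment; any evaluation
;; is explicit and up to us.
($define! $and?
  ($vau (x y) env
    ($if (eval x env)
         (eval y env)
         #f)))
```

Called as ($and? (pair? p) (car p)), the second operand is only ever evaluated if the first one evaluates to true.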
In textual order:
|
Is it cheating to say
Is it special? Shouldn't it also have a dollar sign otherwise?
So we could've used
I'm interested in how. |
Yes and no, I think. 😉 It kind of depends on what you think qualifies as a boolean. By Kernel's/Shutt's standards,
Ah, trust @vendethiel to sniff out the one part of what I wrote where I felt I didn't have enough to stand on to be sure. But this is important, so let's try. I think the answer to your question is in the metacircular evaluator (here using the one from SINK, Shutt's old prototype):
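(The SINK excerpt quoted here is missing from this copy. Below is my own sketch, written as Kernel rather than SINK's actual code, of the evaluator step that matters; the helper names eval-list and combine are mine.)

```
;; Combining an evaluated operator with its operands: only applicatives
;; get their operands evaluated; operatives receive them as data,
;; together with the dynamic environment.
($define! eval-list
  ($lambda (operands env)
    ($if (null? operands)
         ()
         (cons (eval (car operands) env)
               (eval-list (cdr operands) env)))))

($define! combine
  ($lambda (combiner operands env)
    ($if (applicative? combiner)
         ;; strip one layer of wrap, evaluate the operands once
         (combine (unwrap combiner) (eval-list operands env) env)
         ;; operative: hand over the unevaluated operands and env as-is
         (eval (cons combiner operands) env))))
```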
Note crucially that But when we call
Maybe a different way to state this is that
These two questions are answered above, I believe, but I'm not sure how clearly. In summary: when we call
I realize this is bewildering; it's still a bit messy to describe. It gets even weirder when one realizes that I hope to get back to this, and be able to state it in a more straightforward way. In the meantime, hope this helps. |
Self-replying because I'm thinking about what I wrote and I'm kind of understanding it a bit deeper. Combine this:
With this:
(And with the rather abstract things I attempted to clarify to @vendethiel.) What you get is this: an operative receives its operands unevaluated, together with the caller's dynamic environment, and any evaluation it wants has to be done explicitly, with eval. Contrast this with a normal applicative (or a macro), where the evaluation of the arguments happens implicitly, around your code. The fexpr paper calls the former style the explicit-evaluation paradigm, and the latter the implicit-evaluation paradigm. Macros have largely been heavily associated with the latter, in Lisp/Scheme and elsewhere. Apparently some experimental languages called reflective Lisps (of which I know next to nothing) from the 1980s share with Kernel their focus on explicit evaluation. Quoting the fexpr dissertation:
|
Kernel and the lack of a static phase

I was going to establish the terms "1-phase languages" and "2-phase languages" (analogously to Lisp-1 and Lisp-2), but now I fear if I say "two-face" too much, I'll only think of Harvey Dent. Anyway. There are languages that just go ahead and run your program — but then there are also languages that first spend some time, well... type-checking, optimizing, and code-generating. You know, compiler-y stuff. Technically, this could be a difference between implementations, but usually it goes deep enough to be a feature of the language itself. We could refer to these as "dynamic languages" and "static languages". We could also talk about "interpreted languages" and "compiled languages". I think what we have here are three axes, but they're almost aligned. Kernel has no compile phase. It's pretty into lexical lookup, which is usually compilation-friendly, but there's also something about operatives/fexprs that ruins all that static knowledge. (Shutt is aware of this. But it might not be an "all hope is lost" situation. The Kernel specification talks hopefully about a sufficiently smart "optimizing interpreter".) With all this in mind, it's kind of interesting that a ... Alma and Raku are two-phase languages. But the distinction between "static environment" and "dynamic environment" here is pretty much the same; the former belongs to the macro body, and the latter to the mainline where the macro is expanded. Biggest difference is that the "static environment" also happens in the "static phase" (...). A bit earlier I wrote:
This distinction is mostly what I meant. Neither Raku nor Alma would be inclined at all to lose its compilation stage, not even for the pleasing theoretical properties of operatives. |
I want to link to an LtU comment by John Shutt that explains the origins of Kernel. Worth reading in its entirety, but here's the central paragraph:
The factorization is the one where |
With the knowledge of Kernel as an intrinsically 1-phase language, I can now amend my reply to @raiph a little:
A fexpr/operative in Kernel doesn't really insert itself anywhere. Very similar to a function, it just gets invoked. There's no expansion going on. But the interpreter knows the difference between operatives and applicatives, and only evaluates the operands for the latter type. There's no way for Kernel operatives to be "spent" at compile time, because there's no compile time. |
A small note on partial evaluation. Given a call to an unknown function, the call can compile to a runtime dispatch: one branch where the arguments have been (partially-) evaluated, and another where the arguments are passed as data (see the sketch below). I'm pretty sure that's sufficient to eliminate the overhead of fexprs relative to macros. It'll also constant fold correctly in the cases where f later turns out to be known. |
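(A sketch, not the lost inline example from the comment above: the two compiled forms being described. The names compiled-call, expr-f, expr-a and expr-b are assumptions for illustration.)

```
;; What eval of a combination (f a b) can compile to when f is unknown
;; until run time: if f turns out applicative, the arguments are
;; evaluated (and can be partially evaluated ahead of time); if it
;; turns out operative, the operand expressions are passed as data.
($define! compiled-call
  ($lambda (expr-f expr-a expr-b env)
    ($let ((combiner (eval expr-f env)))
      ($if (applicative? combiner)
           (apply combiner
                  (list (eval expr-a env) (eval expr-b env))
                  env)
           (eval (cons combiner (list expr-a expr-b)) env)))))
```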
@JonChesterfield Interesting. Kind of a late-bound conditional wrap. I agree, this looks feasible. I was thinking about how to do exactly this (compile Kernel) yesterday, and your technique will definitely help. What I would also like to understand better is the following: when do we late-bind operatives? (Not to mention the applicative/operative distinction.) It's possible to read the entire Kernel spec and Shutt's entire PhD thesis and be none the wiser about this. Shutt seems to take pride in not knowing, with phrases like "in fact, not knowing this is an expected feature of a smooth language". [citation needed] Ok, thinking about this a bit more while getting coffee. I'm not sure we're in the clear yet with this technique. Yes, it will cover the things that are in the code at program start, but — and this is crucial and impossible to ignore with Kernel — it will not cover things that are beyond an
(emphasis mine); this is a good start but not enough. Think of the degenerate case of building a list Which is not to say that this technique won't work — it seems to me it might catch a lot of the "low-hanging fruit" in the actual program — only that it needs to be complemented by the regular (late-bound) discrimination between applicatives and operatives in I think what it comes down to is this: in the general case, we can't eliminate |
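(A sketch of my own of the "degenerate case" mentioned above: a combination assembled at run time, where nothing static can tell whether its head will turn out applicative or operative, so the dispatch stays late-bound.)

```
;; Build a combination from a name and an operand list, then eval it.
;; Whether the operands get evaluated depends on what the name resolves
;; to at the moment of the call.
($define! call-by-name
  ($lambda (name operands env)
    (eval (cons (eval name env) operands) env)))
```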
I agree with the conclusion. The runtime dispatch is not always eliminated, particularly in cases that look less like macros, so tagging remains. I'm not sure late bound conditional wrap matches the premise though. Perhaps I don't see what late bound means in a single phase language. My working premise is that operatives and applicatives are the same thing modulo a Boolean flag that eval/apply queries to decide whether to evaluate arguments before binding to the parameter tree. Wrap returns the same underlying function with the Boolean true, applicative? tests it. So both applicative and operative functions exist at runtime, it's only a difference in calling convention. Replacing evaluated lists with that branch is always meaning-preserving, where we know we have a list being passed to eval. Well strictly it needs to be a cond/case so that lists that don't start with either fail as they should. And care is needed if reflection is involved, depending on how you wish to expose that. For lists built on the fly that thwart constant folding / partial evaluation, or equivalently when the compiler optimisation is disabled, the function still has the runtime tag for eval/apply to dispatch on. On second thoughts, there's a decent chance of being able to tell eval is being passed a list even when some elements are unknown if the construction is visible, in which case those elements which are known still qualify. In general compilation is probably necessary to achieve competitive performance. I don't know the design tradeoffs in Bel. For Kernel, I think a caching JIT (transparent to the application) is the way to go. |
Yes, that was likely a careless phrase on my part. What I meant was more like "we surface in the code itself what used to be a hidden part in combiners".
When you phrased it that explicitly, I realized it's not quite true:
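(The explanation that followed is not in this copy; judging from the reply below, the point is that wrap is not a single boolean flag, it stacks. A sketch of my own:)

```
;; Each layer of wrap adds one more round of operand evaluation before
;; the underlying operative is reached.
($define! $operands ($vau x #ignore x))   ; returns its operands unevaluated
($define! eval-once (wrap $operands))     ; operands evaluated once
($define! eval-twice (wrap eval-once))    ; ...and the results evaluated again
```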
Aye; "interpreter with a JIT compiler" is my working hypothesis/hope as well. Basically a lot of assumptions are made, asserted by guards (but usually not broken). This made me associate to both macro-writer's bill of rights and Surgical Precision JIT Compilers. |
Ah, so that's a subtlety I missed from the language. I didn't realize an operative could require N applications of wrap to get an applicative. That probably means an applicative can be wrapped to get one that evaluates its arguments twice, up to N. That's not obviously a good property to me, would prefer wrap on an applicative to return an error or the original applicative unchanged. I hadn't seen the second reference, thanks. That's definitely not what I'm thinking as it involves the program explicitly invoking the JIT on parts of itself. By transparent I mean the program runs identically under the interpreter and under the JIT compiler, which means no compiled? predicate and no compile function, as well as no behaviour change other than wall clock time. Thinking of accepting transforms that send infinite loops in the interpreter to terminating ones under compilation, e.g. discarding an unused argument that loops forever, which is a slightly generous definition of only changing wall clock time. |
I think we have the same goal, but maybe we're thinking of it slightly differently. I agree about "no
There's no type checker here that will stop you from expressing this obviously bad thought. If you run it, though...
(In passing: error messages/error handling not yet good enough. Working on that.) Now, was
As seen above, there's a middle ground: typechecking, in whatever form we imagine it, can be a "nice-to-have", but its absence need not rule out compilation — not if we're willing to emit target code that we know will "go wrong". (Edit: or even, in the less extreme case, code that we don't know won't go wrong. That annoying gray area that arises due to Gödel, Turing, and Rice.) Girard makes this point well in Locus Solum (fair warning: this text is heavy stuff, and "not peer-reviewed"). In Example 1 on page 309, he introduces ✠, the daimon 𝕯𝖆𝖎, whose semantics in a program would be any of the following, whichever you feel makes the most sense:
(This suggests a certain kind of REPL- or feedback-based style of programming, well-known in Lisp and Smalltalk circles.) The point is, we can get a richer notion of typechecking (and other kinds of checking) by not committing the very common error of thinking that type error = go home. Tying this back to the original topic of compiling code: I think compilation can be an implementation detail. I think we can uphold the "transparency" of compilation (i.e. "any optimization that doesn't get you caught red-handed is allowed") if we're careful enough, and I believe such care consists mostly of what I mention above (never assume that types have to check out or that the typechecker has the right to call the shots), plus code-generating "guards" (in the sense JIT compilation uses them) for things that are impossible to guarantee statically (in Bel, things like untyped function parameters and dynamic variables get guards). I'd say things don't even get particularly interesting or difficult until you start messing with the Bel interpreter. Now, there I confess I don't yet have the full story. But I'm really looking forward to getting to the point where I can easily ask such questions in the form of code. |
Now I also thought of Programming as collaborative reference by Oleg.
I think "programming with holes" or the REPL/feedback based style of Lisp or Smalltalk refer to this kind of "collaborative reference", at least after the fact, as it were. Allowing ✠ 𝕯𝖆𝖎 means making it socially acceptable to manipulate, inspect, and execute incomplete programs. I think the compileristas have adopted a too-narrow view that compilation requires completeness/validity — don't tell me it couldn't be useful (a) to run a program up to the point where it fails (getting, I dunno, side effects and debug/trace output along the way there), or even (b) to run a program to the point where it fails, confirming that it does fail. Getting the failure at runtime is strictly more informative than getting it at compile-time — for one thing, now you know that the failure occurs on a non-dead code path! As someone who likes these tight feedback loops, it counts as feedback saying "this ✠ is the one that needs your attention now". (Counterexample-Driven Development, CDD.) Similar to the beneficial effects of TDD, it's like the execution is throwing you back into the code, pointing to something that doesn't add up, the next counterexample-driven focus for your attention. |
Robert Harper, in this supplement to PFPL goes (if possible) even further, claiming that there is no essential difference between compilers and interpreters:
For what it's worth, I think this is also part of Futamura's message. |
The other day this paper was uploaded to arXiv. It's named "Practical compilation of fexprs using partial evaluation: Fexprs can performantly replace macros in purely-functional Lisp", and (from a quick skim) seems to address the above question of, can we get structure (and performance) back after choosing fexprs? Hoping to read it more in detail and write more. |
I'd like to add two things, almost two years later, as it were.
Right. I agree with past-me. Though I will point out that the first recursive call could have been replaced by a |
In all fairness I should add this, which I feel matters: the first edition of Dartmouth BASIC already had It also had |
There's also this paper, which correctly identifies the problem of "everything gets pessimized because operatives":
And then hints at a solution, which I'll summarize as "phase separation between modules". I don't have a sense of whether that's a good solution, or too limiting in practice, or something else. |
Also, I'd like to point out the Github and website as they don't seem to be in the paper: @masak Did you ever dig further on this? In particular this is timely given that Zig seems to be somewhat flirting with partial static evaluation with "comptime". They're not going the whole way to changing the evaluation of arguments as far as I can see, though. In the paper, the particularly interesting thing is just how many eval calls get elided--almost all wrap0 and wrap1. It's huge. To me, I'm still looking for a small embedded language for microcontrollers that gives me full power but fits on <32KB systems. Kernel was conceptually simple but generated MASSIVE amounts of garbage to hold unwrapped evaluation results as well as environments (to the point I was wondering if we could go back to dynamic environments). I wonder if just a couple of optimizations gets most of the performance without bloating code too far. |
Thanks! Those are some intriguing references from that site, including the one to FUN-GLL, which I didn't know about before. Need to check that out more.
Not the paper, no. I have it open now, but it's unlikely I will find the tuits soon to give it the attention it deserves. For now, three very minor points:
Here, let me excerpt from the Revised-1 Kernel Report:
(For reference, G1a is "All manipulable entities should be first-class objects" (which
|
Is this actually true? Aren't most builtins accessing lexical scoping which effectively is static (chained to parent)? Yes, you can access "$if" via the dynamic environment which could get redefined, you're just not likely to unless you are specifically attempting to emulate dynamic scoping (which couldn't be compiled anyway). My read of the paper is that it is leaving breadcrumbs on the way down so that if you don't rebind "$if" along the way but bottom out in the partial evaluation, the system will unwind all the breadcrumbs and replace them with "$if symbol from definition environment". One particular quote from the paper stuck out: "Any use of fexprs that is not partially evaluated away is something that could not be expressed via macros, and thus paying a performance penalty in these cases is not onerous." |
Yes, it's true. Here's my chain of reasoning about Kernel — feel free to fact-check it. I've annotated with section numbers from the Report to show where I get things from.
It's less bad than I remembered (which is a relief), but it's still definitely bad. To what I write above, add the fact that such mutation can happen at any point and in arbitrarily dynamic ways, and that no static analysis will ever be able to catch it reliably due to Rice's theorem. Which makes it impossible to make any reliable assumptions for partial evaluation at compile time. |
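(To make the concern concrete, a small sketch of my own, not from the thread: an operative that shadows $if in whichever environment it is called from.)

```
;; After a call to ($sabotage!), the caller's own ($if ...) forms no
;; longer mean what a static analysis assumed they meant.
($define! $sabotage!
  ($vau () env                  ; env = the caller's dynamic environment
    ($set! env $if $lambda)))   ; shadow $if there with the $lambda operative
```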
While this is allowed, I don't think Shutt expected it to be common. (5.3.2) Rationale: Even $define! is considered "optional" (although I question whether that is really true ...). In addition, if you look at the uses of $define! that aren't just top-level variable setting, there's very few that are relying on redefinition to mutate during calls down the stack. And even those can often be replaced by constructs that don't rely on $define!, per se. I don't see any obvious usage of Shutt redefining something primitive once he makes it. If Shutt didn't use this stuff when defining his own language, it's probably not a tool he expected others to reach for very often. I think the only mutational applicative is "set-last!" from "append"/"append!". "$letrec" is a little odd so I'd have to think about it. Everything else is just defining a helper applicative (no operatives). In my opinion, a lot of the "we can redefine everything" stems from Shutt attempting to make Kernel axiomatic--based around a small, well-understood set of primitives to construct everything and avoid having to provide various pieces of ad hoc "implementation assistance" that other academics would challenge on theoretical grounds. He was, after all, wading into an area that was more than a little contentious by academic standards. If we are attempting to precompile/partial evaluate for speed, we need "implementation assistance" and that assumption is already out the window. I, personally, don't find the idea of "shadowing a builtin severely hampers your ability to compile/pre-evaluate" a very problematic restriction. Redefining builtins like "$if" would be shocking to most programmers, anyway. |
I totally understand where you're coming from. And I agree with the specifics of what you write after that. You are doing me the service right now of arguing in the exact way I was hoping you would, giving voice to one side of this debate (which I've kind of had with myself for the past two years or so). So, when I disagree on the wider point, please don't take it the wrong way. In fact, I don't know you at all, and I don't have time for "someone is wrong on the Internet"-style discussions. I think there is a really interesting central point that needs to be discussed here, though. Our discussion is an instance of the following common discussion: P1: "Behavior X in this programming system is extremely disruptive and destroys all our static analysis!" One instance of X is "pointer aliasing"; it wreaks havoc with the analysis of programs, up to and including data races, and there are reams of research published on how to combat this. Rust and Facebook Infer can be considered direct attacks on the problem. This story is not fully written. Another instance of X is "late-bound evaluation of code". Forget about the injection attack problem, that's a red herring. The real issue is that anything might happen in that code, including arbitrary heap mutations. Any assumption about the heap might be invalidated. (Which conservatively means it is invalidated.) One of my favorite issues in the TypeScript repository is #9998 Trade-offs in Control Flow Analysis. Look how quickly we lose control! The issue thread keeps being open and keeps being linked from all over the place, because a lot of other issues/requests are actually this issue in disguise. Optimizations need to be based on some kind of static analysis, and static analysis needs to be "conservative" in the sense that it's not allowed to make assumptions about bad things X not happening — in fact, if X can happen, then for all we know it does, Murphy's Law-style. In other words, P1 is right: X is bad, and it sucks. P2 is also right (X is less than 100% common), but P2's argument is not strong enough to refute P1's; it does not mean that We Can Have Nice Things again. This is all very frustrating. We are prevented from doing nice static analyses and nice optimizations by the worst imaginable behavior from the components we can't see into. It has led some people to start talking about "soundiness" as a reasonable compromise and an alternative to strictly conservative analyses. I have a lot of sympathy for that, and I think of it as a pragmatic stance. I include Incorrectness Logic in that stance as well. But I think the trade-off it makes (away from the pessimistic/conservative point) means that it can't be used for optimizations. What I do think is possible is to optimistically compile the efficient code, and then call it under a "guard" that checks that So there is hope. It's just, it's not the wishful-thinking kind of hope, where we say "X is not common, let's ignore X". It's of the complicated kind, where we carefully insert guards that allow us to monitor our wishful assumption, and we stay ready to divert to a slower "Plan B" path when it's invalidated, which is hopefully very rare. (Does the paper do it this way? I don't know, because I still haven't read it carefully enough. If it doesn't, it's either unsound, or not really capturing Kernel semantics.)
Kernel is based on a small, well-understood set of primitives, yes. According to a previous comment in this issue thread, three. And of those three For simplicity, let's keep Meaning absolutely no disrespect to those who have passed before us, here's how I think that conversation would go. I'd ask whether name lookup in Kernel is "late-bound". Shutt would say yes, sure, because environments are first-class, and asking "what is I never met Shutt in real life, but I've read a lot of his stuff at this point. My confidence that the conversation would go something like that is high.
You're either implementing Kernel the way it's specified, or you're not. If you make assumptions for your own convenience, but it means you deviate from the letter of the spec, then wouldn't it be easier to instead pick a language where those assumptions hold in the language specification? "Really late-bound everything" seems to be a vital and non-negotiable part of Kernel's feature set. |
If you haven't seen it, there is a nice post by Graydon Hoare talking about Rust ("The Rust I Wanted Had No Future"): I especially like the comment about lambda and environment capture:
It's funny, the high-end CS people keep dancing around the whole idea that the environment is a thing and it really needs to be reified properly. |
Interesting! I hadn't seen it. There's a HN discussion as well (for what it's worth). I haven't completely digested the entire post, and I'll have to read it again more carefully. But it's easy to see that a lot of it is speculation and what-could-have-been thinking. Guess the truth is that language design and language development is contingent, path-dependent, and has a lot more choices than easy answers.
It's part of the static/dynamic dichotomy, the eternal divide between Those Who Compile and Those Who Interpret. If you're building a compiler, then the environment is something like a series of linked stack frames, and it's typically not visible. Some people from this camp even say that making it visible would be a huge security flaw, although others disagree. If you're building an interpreter, then the environment is not strictly necessary — you could do everything "by substitution", and the environment just represents an obvious laziness-based optimization on that, to avoid traversing subtrees. So typically the environment is present; since the wall between interpreter and interpreted code is so thin and porous, it's much easier in this scenario to want to reify the environment and make it available to the program itself. So, I dunno. I'm not sure about the phrasing "it [the environment] really needs to be reified properly". 😄 In a much earlier comment in this thread (over two years ago), I talk about the double-edged practice of first-classing things, including environments:
I'm reminded of the phrase "what you lose on the swings, you gain on the roundabouts" (clearly coined by some amusement park mogul). In this case, the trade-off involved in reifying and first-classing environments is pretty clear: first-class environments are powerful and flexible in some objective, undeniable sense — but letting them loose like that also means that there's more "fog of war" statically, and less static analysis can be done. Since both flexibility and static guarantees are good things, it's not an obvious choice. I have this idea of creating an object system in Kernel, a little bit like CLOS maybe. The objects themselves would be environments (for data and methods) wrapped in encapsulations (for access control). I still haven't done that; it would be nice to try. It's hard to say whether the exercise would make me more warm towards first-class environments, or less. |
Except that I'm not necessarily sure it does. As I poke more at the ideas behind Kernel, one of the things you can explicitly say is "Please evaluate this in the immutable ground environment." Once you do that, things get a lot cleaner. Yes, you can still do things that rip the ground out from under yourself, but you can also not do them. If you start from the immutable ground environment, the only things not the same as ground are the things that you, yourself, changed. If you only use and define applicatives, then things seem to be able to be compiled. Of course, once you open up the can of worms with an operative, that all goes out the window. But you don't have to use operatives, that is your choice. I really wish Shutt had done an implementation of Kernel in something like C or Java. There is some wonkiness around environments and their implications that really needs a solid, non-lisp implementation to really lock down the semantics. With everything being Lisp-to-Lisp there's a bit too much hand waviness that you can get away with. This also applies to a LOT of the "machinery". The whole wrap/unwrap machinery seems overly generic (I don't think I have seen any example that wraps more than once). Having to cons up arguments for applicatives is known to be a performance disaster ("CONS should not CONS its arguments"). Lots of stuff is merged into a single list in order to only cons/uncons when keeping things separate as (operator, args) would avoid generating a whole bunch of garbage associated with prepending/appending all the time. etc . Shutt also puts a lot of emphasis on "$define!" to bootstrap everything which completely defeats any attempts at compilation/static analysis. It's not clear that this is actually required--being able to bootstrap from far less powerful primitives would benefit the performance of the interpreter significantly. |
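(A concrete version of the "start from a clean environment" idea; a sketch of my own, using only the standard make-kernel-standard-environment constructor.)

```
;; Evaluate an expression in a fresh standard environment whose parent
;; is the immutable ground environment; local redefinitions made
;; elsewhere in the program cannot leak into it.
($define! eval-clean
  ($lambda (expr)
    (eval expr (make-kernel-standard-environment))))
```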
Yes, I've been thinking something similar since my Apr 25 comment. It is like you say, that there are entire Kernel programs that one could validate as being non-weird at a glance, for example by saying "we're only calling known-good applicatives here". It's possible to go even further than that. I bet there's a technique of finding a "trusted kernel" of code at one or more points in the program, and then using that as the base case of an inductive argument showing that (in many cases) the whole program is trusted. In this case, I use "trusted" to specifically mean "doesn't mess with environments in a dynamic way". I still need to write that out formally, but I have a feeling it's true. The point is that a lot of the desired optimizations can be made, in the comfortable shadow of such an inductive argument.
Yes. I think it's the same dynamic here. Once we have the inductive trust argument, we can also start reasoning about which values escape and which ones don't. For values that don't escape, an optimizer is free to change representations and improve performance under the hood.
I'm not sure this is a big deal-breaker either. In Scheme, there's a restriction saying that |
Coming at this from a slightly different angle. I've been trying to think up a simple module system for Bel. Modules don't have to have a static component, but it's not far-fetched for them to do so, since this is a sensible level at which to ask "what names does it export?" This is a static question, in the sense of "shouldn't have to run the code to know that". (I can't back up this sentiment with any fact — it's more of a handwaving argument, along the lines of "what the heck would you be doing in your module code such that it makes the question 'what are its exports?' a hard one to answer!?".) At the same time, I don't want to dumb it down or put up limits just for the sake of some module-analyzing code. Consider this first part of a
(Did you spot the error? I deliberately changed I'm writing it as those repetitive top-level definitions right now mainly due to "speed constraints". In a faster Bel, the first thing I would consider would be to start abstracting and writing it something like this:
For some appropriately-defined We would then be in a situation where a bunch of I hereby coin the term "scrutable" for module code that meets this definition. We want to be able to ask "what are the exports?", and not just get a clear answer back (which might involve opportunistically expanding macros a bit), but also have a guarantee that we didn't miss any definitions hiding anywhere. I posit that this is possible, because macros tend to be very well-behaved in the sense that they depend on static stuff, not dynamic stuff. Module code needs to be scrutable, which is what guarantees that its exports are statically known. Keep in mind that in Bel, macros expand Late, at the same time as functions are called. So the fact that this should hold true and be possible is not obvious, and more than a little surprising. Bel macros have this ability to do dynamic stuff, but none of them use this ability. |
(continuing) It's a unification of these two points of view:
What the adjective "scrutable" does is establish a kind of peace treaty, an effective bridge between these two views, by saying that you can reliably get from the document to the side effects — essentially because the syntax-abstracting macros are upper-bounded in how weird they are. Moreover, this can also be checked statically. Just for completeness, here's a thing you cannot do in a scrutable module file: (def random-symbol ()
;; some code that puts together a random 5-letter symbol
..)
(mac def-random ()
`(def ,(random-symbol) ()
nil))
(def-random) ;; define and export something random |
I found the relevant Shutt quote I must have half-remembered above, from the comments of this blog post:
That dissertation link has bitrotted; here is a working link. Just for completeness, here is the Smoothness Conjecture, verbatim from section 1.1.2:
Implicit in all of this is (of course) that lambdas/functions factor into, on the one hand, vaus/operatives/fexprs, and on the other hand, evaluation of the operands into values before passing them — therefore vaus are more "smooth" than lambdas. (Also, pointed out in that same comment, lambda calculus is really about operatives, not about functions.) I'm all for smoothness, and I still agree with the reasons to seek it out... but it's worth thinking about how the paradigmatic countercurrent of macros and compilation came about. It has something to do with preferring not just dynamism and infinite flexibility, but also static guarantees, hardcoding/partial evaluation, and (not putting too fine a point on it) performance. |
In stark contrast to the "what should you use fexprs for?" undercurrent developing through this issue thread, I just found this example of the opposite sentiment — "what shouldn't you use fexprs for?" — at the end of a 2020 blog post by Shutt (about a conlang):
If fexprs ought to replace macros in pretty much all situations where macros are used, then... not only does a quite-big overlap need to exist between macros and fexprs, in order for fexprs to replace macros; it also suggests that fexprs are somehow more suited for the role macros have taken on. In other words, there's a point (in feature-providing state space) where macros cannot follow, and fexprs just keep going. I'm curious about that point. I'm curious about real-world concrete examples of such fexprs. These fexprs must, by construction, contain some aspect of runtime-only information, or decision-making delayed until runtime, which macros are too static to emulate. Annoyingly, as Shutt himself has pointed out, actually trying to guess at this group of fexprs (and their benefits) is Hard. Perhaps Alan Kay's dictum, that "the best way to predict the future is to invent it" applies here. I simply need to build enough awesomely dynamic fexprs until I stumble on a useful one. 😄 |
I've been meaning to write about an idea I've been throwing around inside my head for a while: in brief, what's with the dangerous and accident-prone separation of Lisp code/syntax, and its appropriate environment? But I just found that in this LtU thread, Ray Dillinger talks about exactly that. John Shutt replies (with what looks like skepticism). I haven't read the comment and the replies carefully enough to say anything more. I just wanted to dump it here for reference. Maybe I'll come back to this idea when I have read it carefully and/or thought about it some more. |
Here's something I'd like to put into writing; been thinking about it for a while (some of which has leaked into above discussions).
Anyway, all this falls out from the interaction between two factors: supplanting interpretation with compilation (for performance) and keeping the original promise of radical mutability of everything, including built-ins. |
Besides being closed and not really actionable, this issue thread is also low-traffic and mostly "saturated" these days. But I'd like to submit, for the pleasure of what dedicated readership it might still have, Manuel Simoni's LispX, which I just discovered via this blog post of his about a topic which would be better suited for issue #569. If you dig a bit into LispX, you will find it's built on top of Shutt's vau calculus. Which makes it fairly unique, alongside klisp, in actually implementing the ideas from Kernel — but unlike klisp, LispX presents a much more Common-Lispy surface language to the user. Which is kinda cool. |
I'm curious to read an explanation of fundamental technical differences or similarities between Perl 6 macros (and, separately if appropriate, 007 macros) with Lisp FEXPRs. I thought ven especially might be able to shed light on the comparison.
(If this post, and ones like it, ought not be an issue in this repo, but rather, say, an SO or reddit post, and would still get your attention in this other location, please just close it with a comment explaining your preferences about where to post it and I'll cut/paste to that other location. I'm especially thinking it might be appropriate to start an SO tag for 007, perhaps initially justified as a tag to be used in conjunction with [perl6] but which would then immediately be available for use on its own anyway.)
The "Early Lisp macros" section of the wikipedia Macros page starts (apparently incorrectly) with:
I say "(apparently incorrectly)" because the footnote at the end of the paragraph that begins as above links to an email that says:
This suggests FEXPRs were, perhaps, very loosely akin to source filters in P5.
But, putting that aside, even if the wikipedia text is fundamentally wrong, and/or even if P6 and/or 007 macros are nothing whatsoever like FEXPRs, the first sentence in full sounds exactly like P6 macros to me, if one substitutes "abstract syntax tree forms of the arguments" for "syntactic forms of the arguments", and "compilation" for "computation":
Would you agree? In other words, is this accurate:
P6 / 007 macros are function-like operators whose inputs are not the values computed by the arguments but rather the abstract syntax tree forms of the arguments, and whose outputs are values to be spliced into the compilation output.
The paragraph continues:
Does this map well to how P6 macros (and/or 007 macros) work?
The paragraph ends with:
This difficulty, which clearly did indeed historically occur, leading to lisp macros, was perhaps due to the actual nature of FEXPRs (which I'll speculate made FEXPR macros akin to source filters) rather than the above apparently incorrect description of FEXPRs, which, as I have said, was rejected in the email that the wikipedia page footnotes and which sounds to me reminiscent of Perl 6 macros.
To summarize, I'm interested to see what comes out of y'all reflecting on the nature of P6 macros, and/or 007 macros, as they might relate to FEXPRs, especially as described in the quoted email. If they're fundamentally different, what is the nature of that difference? If there are similarities, why do you think P6/007 macros will avoid the reasoning difficulty?