As of Fennel 1.4.2, macros cannot be shadowed. The code
    (macro square-mac [x]
      `(* ,x ,x))
    (fn mwe [mac]
      (let [square-mac (* mac mac)]
        (+ square-mac 1)))
fails to compile with the error:
    Compile error: unknown:4:8: Compile error: local square-mac was overshadowed by a special form or macro
This breaks referential transparency. It's surprising that a name collision between a macro and a function-local variable causes a compiler error. And it makes code more brittle (if two people are working on the same file, they have to coordinate what names they reserved for macros).
> This breaks referential transparency.
I'm having a hard time understanding how this is related to referential transparency. In no place does Fennel make any guarantees about pure behavior of functions, and even in languages that do, it is a runtime concern rather than a compile-time one.
> It's surprising that a name collision between a macro and a function-local variable causes a compiler error.
We have experimented with removing the restriction on a branch a while back, but the hygiene problems it caused were considered more or less insurmountable.
> if two people are working on the same file, they have to coordinate what names they reserved for macros
If two people are working on the same file and one of them introduces a top-level macro, it would be pretty important for them to coordinate regardless of shadowing rules. However, macros do not have to be top-level.
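As a minimal sketch of the point that macros do not have to be top-level (the names square and mwe are only illustrative, and this assumes macros are scoped to the enclosing form, as the remark above implies): defining the macro inside the function that uses it keeps the name free everywhere else.

```fennel
(fn square [n]
  ;; square-mac is defined in the scope of this function body only
  (macro square-mac [x] `(* ,x ,x))
  (square-mac n))

(fn mwe [mac]
  ;; no macro named square-mac is in scope here, so this local is allowed
  (let [square-mac (* mac mac)]
    (+ square-mac 1)))
```

With the macro scoped this way, the original mwe function should compile unchanged, because the local no longer collides with a macro visible at that point.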
Hey thanks for the response (and thanks for Fennel)!
> This breaks referential transparency.
More specifically, this breaks equivalence under renaming bound variables. Ideally renaming bound variables would be a safe transformation that doesn't change the meaning of the code. For example, these are two ways of spelling the identity function:
    (fn id1 [x] x)
vs
    (fn id2 [y] y)
Whether we said x or y is a local implementation detail the rest of the code shouldn't see. And in Fennel minus macros, it is! But if we had a macro
    (macro y [x] ...)
then renaming x -> y would break our code:
    (fn [x] x) ; ok
    ==>
    (fn [y] y) ; error
The way this connects to referential transparency, if I'm understanding correctly, is that both implementations refer to the identity function, but our choice of spelling for that reference is not transparent to the rest of the code.
Fortunately, this errors loudly instead of silently breaking, so I don't think there's any way the current behavior can introduce bugs.
Do you happen to remember which branch or what the hygiene issues were? I'd be interested in taking a look!
Interesting; I had not heard of referential transparency being defined for locals rather than functions.
I haven't been able to find the previous discussion around the hygiene issues, but I did find this thread where a branch that loosened the restrictions here ended up with a significant performance regression: https://lists.sr.ht/~technomancy/fennel/patches/39850#feedback-277143:508-1315
However, even if the performance problems were solved, I think loosening this restriction would also significantly exacerbate existing hygiene issues around built-in macros. For instance, today if you write a macro that expands to a call to case, we rely on this restriction to make it so that the expansion calls the built-in case pattern match, and not some local function named case:
    (macro handle [x]
      `(case (. ,x :results)
         [:incomplete lines#] (print :incomplete (length lines#))
         [:error msg#] (let [handlers# (require :handlers)]
                         (handlers#.error msg#))
         [kind#] (print "unknown results!" kind#)))
    (fn case [problem]
      (if problem.runtime :runtime-case :unknown-case))
    (handle (calculate-request))
With the current scoping semantics around macros, it's very important that this should fail to compile! If we did have a mechanism whereby the macro could specifically lock down to the case from where it was defined rather than the case from where it's expanded, we could look at lifting the macro shadowing restriction. Clojure uses namespaces for this, but we don't have namespaces.
Currently we have very good "macro output hygiene" but not very good "macro input hygiene". Even though the current situation does a good job at protecting against confusion around built-in macros, macros which refer to locals are still subject to confusion issues stemming from hygiene. Luckily these are much more rare and easy to avoid than macros whose expansions refer to built-in macros.
I don't think the underlying problem can be fixed in a fully backwards-compatible way, but we could consider solutions either for Fennel 2.0, or if we can come up with an opt-in mechanism for it.
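A minimal sketch of the "macro input hygiene" gap described above, using hypothetical names (double, twice-doubled, outer, inner). It assumes that a symbol in a macro expansion which refers to a local is resolved at the expansion site, as the discussion above describes: the expansion mentions double by name only, so a call-site local with the same name is picked up instead of the function the macro author had in mind.

```fennel
(fn double [x] (* x 2))

;; The expansion refers to double purely by name.
(macro twice-doubled [x] `(double (double ,x)))

(fn outer [n]
  ;; here double is the top-level function defined above
  (twice-doubled n))

(fn inner [n]
  ;; shadowing one local with another is allowed, so this compiles,
  ;; and the expansion now calls this local instead
  (let [double (fn [x] (+ x 1))]
    (twice-doubled n)))
```

Unlike the case example, nothing here fails to compile, which is why this kind of confusion has to be avoided by convention rather than being caught by the compiler.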
Ah, I am familiar with this hygiene problem!
For a quick fix for me, I found commenting out the assert-compile checks for this in check-binding-valid and symbol-to-expression gives me a version of Fennel that passes all tests but three, appears to work fine, and doesn't stop me from committing these scope crimes.
I want to take a look at how other lisps have dealt with this problem, especially schemes since they don't have namespaces, and read through Fennel's implementation of macros to see what the options are. I'll try to put them together in a table of design trade-offs.
Since we don't have namespaces, but we do have tables, maybe each macro can save the environment where it's defined in a table. (I think this is the idea of syntactic closures)
I think something like the following would keep hygiene, be reasonably simple to implement, and have a Lua-esque flavor:
    (macro handle [def-env call-env] [x]
      `(def-env.case (. ,x :results)
         [:incomplete lines#] (def-env.print :incomplete (def-env.length lines#))
         [:error msg#] (def-env.let [handlers# (def-env.require :handlers)]
                         (handlers#.error msg#))
         [kind#] (def-env.print "unknown results!" kind#)))
then
    (handle (calculate-request))
would expand to something like
    (let [def-env (get-macro-def-env :handle)]
      (def-env.case (. (calculate-request) :results)
        [:incomplete lines_0] (def-env.print :incomplete (def-env.length lines_0))
        [:error msg_1] (def-env.let [handlers_2 (def-env.require :handlers)]
                         (handlers_2.error msg_1))
        [kind_3] (def-env.print "unknown results!" kind_3)))
Of course there's no benefit to writing def-env everywhere, so there should be some automation there. I think a sensible default would be every symbol x in the definition becomes def-env.x and every symbol x in an expression passed to the macro at its call site becomes call-env.x.
> Since we don't have namespaces, but we do have tables, maybe each macro can save the environment where it's defined in a table
The main problem here is if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point. It can certainly identify the conflict, but it can't actually do anything about it, other than emitting a compiler error.
But emitting a compiler error is probably worth considering provided it's opt-in; some kind of :strict-macro flag could ensure that any reference in the expanded macro compiles to the same reference the macro references.
> Of course there's no benefit to writing def-env everywhere, so there should be some automation there. I think a sensible default would be every symbol x in the definition becomes def-env.x and every symbol x in an expression passed to the macro at its call site becomes call-env.x.
Provided we solved the above problem this would be reasonable as long as we restrict it to only happening in these new proposed "double arglist" macros. The problem is I doubt that many people would actually use the new style of macro, since it's more verbose and it solves a problem that is almost never encountered.
Also this would only work with inline macro macros and not with macros that come from import-macros because the scope they're defined under is completely different and it's impossible to convey that into a non-macro module.
> The main problem here is if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point.
There is a potential workaround to this, maybe, by taking advantage of the fact that the Lua identifiers don't have to match the Fennel identifiers; we do some mangling to ensure they are valid Lua. We could in theory use this to "reach over" to mangle a symbol based on the scope under which the macro was defined, not the scope in which it was called, which should resolve it to the original assuming it's not coming from import-macros.
However, the compatibility problem remains; there doesn't seem to be a good way to do this without either introducing a breaking change or introducing a new form that will see limited use.
I made my use-cases more concrete. I think I have three:
1. downgrading a macro to a function while live-coding
2. defining functions named +, -, *, /
   - for compatibility with other lisps
   - for multi-dispatch when working with a hierarchy of mathematical types
3. so I can name locals independently from whatever might be happening in an outer scope.
I'm interested in doing design work around macro hygiene in Fennel. But, for these use-cases, that path might not be a foregone conclusion.
> if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point.
Is this still true if we take snapshots of the definition environment when encountering macro definitions? If the compiler can save a snapshot of the definition environment, associating it with the macro, wouldn't this give us a way to reach over the call-environment?
> I'm interested in doing design work around macro hygiene in Fennel.
I think your use cases are reasonable when considered on their own. My concern is that the difficulty in implementing them will be significantly greater than the utility they provide. Of course, if you're willing to do the work that's great. However, the necessary changes are of a nature such that mistakes in implementation are likely to be subtle and difficult to detect when it comes to hygiene and backwards-compatibility, so it may take a good deal of time for me to review any potential changes.
> If the compiler can save a snapshot of the definition environment, associating it with the macro, wouldn't this give us a way to reach over the call-environment?
This is easy to do in a compiler that emits bytecode. Because we emit Lua code, it's more difficult. It may be possible by manipulating the mangling maps in the symbol table, but that's not a technique we've ever used before, so I can't say for sure.