~technomancy/fennel#222: 
Allow shadowing macros?

As of Fennel 1.4.2, macros cannot be shadowed. The code

(macro square-mac [x]
  `(* ,x ,x))

(fn mwe [mac]
  (let [square-mac (* mac mac)]
    (+ square-mac 1)))

fails to compile with the error

Compile error: unknown:4:8: Compile error: local square-mac was overshadowed by a special form or macro

This breaks referential transparency. It's surprising that a name collision between a macro and a function-local variable causes a compiler error. And it makes code more brittle (if two people are working on the same file, they have to coordinate what names they reserved for macros).

Status: REPORTED
Submitter: ~rileylevy
Assigned to: No-one
Submitted: 9 months ago
Updated: 9 months ago
Labels: enhancement, needs-design

~technomancy 9 months ago

> This breaks referential transparency.

I'm having a hard time understanding how this is related to referential transparency. In no place does Fennel make any guarantees about pure behavior of functions, and even in languages that do, it is a runtime concern rather than a compile-time one.

> It's surprising that a name collision between a macro and a function-local variable causes a compiler error.

We have experimented with removing the restriction on a branch a while back, but the hygiene problems it caused were considered more or less insurmountable.

> if two people are working on the same file, they have to coordinate what names they reserved for macros

If two people are working on the same file and one of them introduces a top-level macro, it would be pretty important for them to coordinate regardless of shadowing rules. However, macros do not have to be top-level.
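To illustrate the point that macros need not be top-level, here is a minimal sketch. It assumes that a macro defined inside a function body is only visible within that body; the function name area is made up for the example.

```fennel
;; Sketch only: `area` is a hypothetical function for illustration.
(fn area [side]
  ;; square-mac is defined in this function's scope, not at the top level
  (macro square-mac [x]
    `(* ,x ,x))
  (square-mac side))

(fn mwe [mac]
  ;; outside `area` the macro should not be in scope, so this local
  ;; binding is not expected to trip the shadowing check
  (let [square-mac (* mac mac)]
    (+ square-mac 1)))
```

Scoping the macro this way keeps its name out of the rest of the file, which narrows the coordination problem without changing the shadowing rule.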

~rileylevy 9 months ago*

Hey thanks for the response (and thanks for Fennel)!

> This breaks referential transparency.

More specifically, this breaks equivalence under renaming bound variables. Ideally renaming bound variables would be a safe transformation that doesn't change the meaning of the code. For example, these are two ways of spelling the identity function:

(fn id1 [x] x)

vs

(fn id2 [y] y)

Whether we said x or y is a local implementation detail the rest of the code shouldn't see. And in Fennel minus macros, it is! But if we had a macro

(macro y [x] ...)

then renaming x -> y would break our code:

(fn [x] x) ; ok 
==> (fn [y] y) ; error 

The way this connects to referential transparency, if I'm understanding correctly, is that both implementations refer to the identity function, but our choice of this reference is not transparent to the rest of the code.

Fortunately, this errors loudly instead of silently breaking, so I don't think there's any way the current behavior can introduce bugs.

Do you happen to remember which branch or what the hygiene issues were? I'd be interested in taking a look!

~technomancy 9 months ago

Interesting; I had not heard of referential transparency being defined for locals rather than functions.

I haven't been able to find the previous discussion around the hygiene issues, but I did find this thread where a branch that loosened the restrictions here ended up with a significant performance regression: https://lists.sr.ht/~technomancy/fennel/patches/39850#feedback-277143:508-1315

However, even if the performance problems were solved, I think loosening this restriction would also significantly exacerbate existing hygiene issues around built-in macros. For instance, today if you write a macro that expands to a call to case, we rely on this restriction to make it so that the expansion calls the built-in case pattern match, and not some local function named case:

(macro handle [x]
  `(case (. ,x :results)
     [:incomplete lines#] (print :incomplete (length lines#))
     [:error msg#] (let [handlers# (require :handlers)]
                     (handlers#.error msg#))
     [kind#] (print "unknown results!" kind#)))

(fn case [problem]
  (if problem.runtime
      :runtime-case
      :unknown-case))

(handle (calculate-request))

With the current scoping semantics around macros, it's very important that this should fail to compile! If we did have a mechanism whereby the macro could specifically lock down to the case from where it was defined rather than the case from where it's expanded, we could look at lifting the macro shadowing restriction. Clojure uses namespaces for this, but we don't have namespaces.

Currently we have very good "macro output hygiene" but not very good "macro input hygiene". Even though the current situation does a good job of protecting against confusion around built-in macros, macros which refer to locals are still subject to confusion issues stemming from hygiene. Luckily these are much rarer and easier to avoid than macros whose expansions refer to built-in macros.
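A small sketch of the remaining problem with macros that refer to locals. The names log and logged are invented for this example, and the behavior described in the comments is what I would expect from the scoping rules discussed in this thread, not something verified against a particular Fennel release.

```fennel
;; Sketch only: `log` and `logged` are hypothetical names.
(fn log [msg]
  (print "[log]" msg))

(macro logged [expr]
  ;; the expansion refers to whatever `log` means at the call site
  `(do (log "evaluating") ,expr))

;; Here `log` is still the function defined above.
(local fine (logged (+ 1 2)))

;; Here a local number shadows `log`, so the expansion would try to
;; call a number; pcall keeps the expected runtime error contained.
(local (ok? err)
  (let [log (math.log 10)]
    (pcall (fn [] (logged (+ log 1))))))
```

Nothing here involves a built-in macro, so the shadowing restriction does not help: an ordinary local at the call site is enough to change what the expansion means.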

I don't think the underlying problem can be fixed in a fully backwards-compatible way, but we could consider solutions either for Fennel 2.0, or if we can come up with an opt-in mechanism for it.

~rileylevy 9 months ago

Ah, I am familiar with this hygiene problem!

As a quick fix for myself, I found that commenting out the assert-compile checks for this in check-binding-valid and symbol-to-expression gives me a version of Fennel that passes all tests but three, appears to work fine, and doesn't stop me from committing these scope crimes.

I want to take a look at how other Lisps have dealt with this problem, especially Schemes since they don't have namespaces, and read through Fennel's implementation of macros to see what the options are. I'll try to put them together in a table of design trade-offs.

Since we don't have namespaces, but we do have tables, maybe each macro can save the environment where it's defined in a table. (I think this is the idea behind syntactic closures.)

I think something like the following would keep hygiene, be reasonably simple to implement, and have a Lua-esque flavor:

(macro handle [def-env call-env] [x]
  `(def-env.case (. ,x :results)
     [:incomplete lines#] (def-env.print :incomplete (def-env.length lines#))
     [:error msg#] (def-env.let [handlers# (def-env.require :handlers)]
                     (handlers#.error msg#))
     [kind#] (def-env.print "unknown results!" kind#)))

Then (handle (calculate-request)) would expand to something like

(let [def-env (get-macro-def-env :handle)]
  (def-env.case (. (calculate-request) :results)
     [:incomplete lines_0] (def-env.print :incomplete (def-env.length lines_0))
     [:error msg_1] (def-env.let [handlers_2 (def-env.require :handlers)]
                     (handlers_2.error msg_1))
     [kind_3] (def-env.print "unknown results!" kind_3)))

Of course there's no benefit to writing def-env everywhere, so there should be some automation there. I think a sensible default would be every symbol x in the definition becomes def-env.x and every symbol x in an expression passed to the macro at its call site becomes call-env.x.

~technomancy 9 months ago

> Since we don't have namespaces, but we do have tables, maybe each macro can save the environment where it's defined in a table

The main problem here is if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point. It can certainly identify the conflict, but it can't actually do anything about it, other than emitting a compiler error.

But emitting a compiler error is probably worth considering provided it's opt-in; some kind of :strict-macro flag could ensure that any reference in the expanded macro compiles to the same reference the macro references.

> Of course there's no benefit to writing def-env everywhere, so there should be some automation there. I think a sensible default would be every symbol x in the definition becomes def-env.x and every symbol x in an expression passed to the macro at its call site becomes call-env.x.

Provided we solved the above problem, this would be reasonable as long as we restrict it to only happening in these new proposed "double arglist" macros. The problem is that I doubt many people would actually use the new style of macro, since it's more verbose and it solves a problem that is almost never encountered.

Also, this would only work with inline macros defined with macro, and not with macros that come from import-macros, because the scope they're defined under is completely different and it's impossible to convey that into a non-macro module.

~technomancy 9 months ago

> The main problem here is if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point.

There is a potential workaround to this, maybe, by taking advantage of the fact that the Lua identifiers don't have to match the Fennel identifiers; we do some mangling to ensure they are valid Lua. We could in theory use this to "reach over" to mangle a symbol based on the scope under which the macro was defined, not the scope in which it was called, which should resolve it to the original assuming it's not coming from import-macros.
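The "reach over" idea can be pictured with a toy model. Everything below is hypothetical: make-scope, declare!, resolve and the mangled name helper0 are invented for illustration and are not Fennel's actual compiler data structures. It also assumes the shadowing local ends up with a different Lua identifier than the original.

```fennel
;; Toy model only: NOT Fennel's real scope or mangling implementation.
(fn make-scope [parent]
  {:parent parent :manglings {}})

(fn declare! [scope name lua-name]
  ;; record which Lua identifier a Fennel name compiles to in this scope
  (tset scope.manglings name lua-name))

(fn resolve [scope name]
  ;; walk outward until some scope knows the name
  (when scope
    (or (. scope.manglings name)
        (resolve scope.parent name))))

;; scope where the macro was defined
(local def-scope (make-scope nil))
(declare! def-scope :helper "helper")

;; scope where the macro is called; `helper` has been shadowed here
(local call-scope (make-scope def-scope))
(declare! call-scope :helper "helper0")

(local seen-from-call (resolve call-scope :helper)) ; "helper0"
(local seen-from-def (resolve def-scope :helper))   ; "helper"
```

In this model, resolving the expansion's symbols against def-scope instead of call-scope yields the original identifier even though the call site shadows it. Whether the real symbol table can be driven this way is exactly the open question raised above.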

However, the compatibility problem remains; there doesn't seem to be a good way to do this without either introducing a breaking change or introducing a new form that will see limited use.

~rileylevy 9 months ago

I made my use-cases more concrete. I think I have three:

  1. downgrading a macro to a function while live-coding

  2. defining functions named +, -, *, /

    • for compatibility with other lisps

    • for multi-dispatch when working with a hierarchy of mathematical types

  3. so I can name locals independently from whatever might be happening in an outer scope.

I'm interested in doing design work around macro hygiene in Fennel. But, for these use-cases, that path might not be a foregone conclusion.

> if the macroexpander encounters a conflict with the definition environment and the call environment, the expanded macro can't "reach over" the call-environment definition to find the original definition. It's already been shadowed by that point.

Is this still true if we take snapshots of the definition environment when encountering macro definitions? If the compiler can save a snapshot of the definition environment, associating it with the macro, wouldn't this give us a way to reach over the call-environment?

~technomancy 9 months ago

> I'm interested in doing design work around macro hygiene in Fennel.

I think your use cases are reasonable when considered on their own. My concern is that the difficulty in implementing them will be significantly greater than the utility they provide. Of course, if you're willing to do the work that's great. However, the necessary changes are of a nature such that mistakes in implementation are likely to be subtle and difficult to detect when it comes to hygiene and backwards-compatibility, so it may take a good deal of time for me to review any potential changes.

> If the compiler can save a snapshot of the definition environment, associating it with the macro, wouldn't this give us a way to reach over the call-environment?

This is easy to do in a compiler that emits bytecode. Because we emit Lua code, it's more difficult. It may be possible by manipulating the mangling maps in the symbol table, but that's not a technique we've ever used before, so I can't say for sure.
