~icefox/garnet#36: Thoughts on macros/derives

We will need some mechanism for code generation, because Rust's derive macros demonstrate how absolutely amazing it is at avoiding boilerplate. Serde and structopt are life-changers; we need similar capabilities.

Prior art:

  • Rust's macros. Derive macros are a pain in the ass because they present a new API that is essentially divorced from your program, and it operates at the level of a token stream, which is kinda painful to deal with. Its declarative (macro_rules!) macros are nicer but simpler. When writing derive macros you usually end up using the quote lib or some other templating method to generate your code (see the sketch after this list). But they also allow basically arbitrary codegen in a parameterized way, which is real hot.
  • Python decorators. Some of the power of this can be demonstrated here: https://hackernoon.com/the-real-c-killers-not-you-rust , under "Numba".
  • Zig comptime. Reportedly very nice because it slots into your existing code with essentially the same syntax.
  • OCaml p4 macros? Maybe as an example of what not to do; I have never felt comfortable using these.
  • Lisp/scheme macros, natch. Hot take, but IMO Rust has kinda taken the best parts of these and extended them. I've yet to see a Lisp macro that does what serde does.
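
To pin down what "using the quote lib" looks like in the Rust bullet above, here's a minimal, hedged sketch of a derive macro skeleton with syn + quote. HasName and derive_has_name are made-up names, not from any real library; this is just the general shape, not an endorsement of the API.

// In a separate crate with `proc-macro = true` set in Cargo.toml.
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, DeriveInput};

// The compiler hands us the type definition as a token stream, syn re-parses
// it into an AST, and quote! templates the generated impl.
#[proc_macro_derive(HasName)]
pub fn derive_has_name(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = input.ident;
    let name_str = name.to_string();
    let expanded = quote! {
        impl HasName for #name {
            fn name() -> &'static str { #name_str }
        }
    };
    expanded.into()
}
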
Status: REPORTED
Submitter: ~icefox
Assigned to: No-one
Submitted: 2 years ago
Updated: 2 months ago
Labels: T-THOUGHTS

~icefox 2 months ago*

Okay! I'm sick of type checking, so let's throw some design thought-juice at macros.

First off, what do I want macros for? This is mostly inspired by Rust, and how incredibly useful they are there, in a low-key but quite pervasive way. I'm also very charmed by Elixir's macros, but really don't want them to be that pervasive. So let's make a list of the main use cases for macros in Garnet, with some thoughts on each:

  • Varargs -- especially C#/Python/Rust-like format varargs. This is actually such an absurdly difficult/weird problem for a type system to deal with entirely statically that it's almost always done either at runtime via dynamic dispatch or via some special-case DSL in the compiler. Rust apparently does a bit of both, but I'd really prefer to avoid dynamic dispatch if we can 'cause it's slow and should be unnecessary. #74 has some interesting design thoughts on other ways to handle this, but in my mind I'm still defaulting to macros as the most basic and I-know-it-works approach (a small sketch of that follows this list).
  • Code generation/abstraction breaking -- This is probably the simplest usage of macros. Refactoring out snippets of boilerplate-y code that don't want to be functions for one reason or another. I actually expect this to be pretty important for Garnet, since I am taking at least the initial strategy of "more typing is preferable to more complicated designs". This is a niche and convenience-based usage, but IMO a very nice one when it exists. https://hg.sr.ht/~icefox/garnet/browse/src/parser.rs?rev=tip#L387 is an example of this. Elixir-like string sigils are also somewhat compelling.
  • Derives -- Introspection and code generation based on type definitions. If code generation is simplest then this use case is definitely the most common, at least in Rust. I don't have a particularly great idea of how I'd like to do this, but after writing so much Rust it's something I never want to live without. That said, Rust's procedural macros are kinda garbo and we could do better. Maybe dig up some of thephd's work on compile-time reflection and use that for inspiration? Not sure if they have any work on that topic more recent than that post.
  • DSL's -- This is the one I like the least, despite my Lisp heritage. It's very rare that I've seen a macro-based DSL that is actually easier to understand than writing the dumb-and-simple code, and most of the stuff it generates tends to be the dumb-and-simple equivalent anyway. But I don't want to disregard it entirely, since at its best it can be downright wonderful. You can also get a hell of a lot of goodness out of much more limited DSL's that just add more information to existing language constructs and feed that into a derive, such as logos and argh.
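
To make the varargs bullet concrete, here is a minimal sketch of the "I-know-it-works" macro approach in today's Rust. my_print! is a made-up example, not anything from garnetc; the point is that each call site expands to straight-line code, so there is no dynamic dispatch at all.

macro_rules! my_print {
    // Zero or more expressions, with an optional trailing comma.
    ($($arg:expr),* $(,)?) => {{
        $( print!("{} ", $arg); )*
        println!();
    }};
}

fn main() {
    // Expands to three monomorphic print! calls plus a println!.
    my_print!(1, "two", 3.0);
}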

Whew! So enough theorycrafting, let's write some speculative code to translate garnetc's basic parse_delimited!() macro. In Rust:

macro_rules! parse_delimited {
    ($parser: expr, $tokenkind: path, $body: block) => {{
        loop {
            // Run the body at least once...
            $body
            // ...then keep going only if the delimiter token comes next.
            if $parser.peek_expect($tokenkind.discr()) {
            } else {
                break;
            }
        }
    }};
}
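
For reference, a hypothetical call site (self, TokenKind::Comma, args, and parse_expr are made-up names here, not garnetc's actual API); the do-while-ish loop shape disappears behind the macro:

parse_delimited!(self, TokenKind::Comma, {
    args.push(self.parse_expr()?);
});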

Transliterated into something that looks like Garnet:

macro parse_delimited($parser: Expr, $tokenkind: Path, $body: Block) =
  loop
    $body
    if peek_expect($parser, $tokenkind) then
      {}
    else
      break
    end
  end
end

What if we have explicit quote/unquote operators? That's one of the things that IMO Rust is weirdest for deferring to a library, even if it makes sense in historical context. Requiring an actual parser that produces typed ASTs, like Rust's macro_rules, rather than just consuming and producing token streams like derive macros, kinda sounds like a good idea in theory but also seems noisier and less powerful. IIUC that raw-token-stream flexibility is one of the main benefits of derive macros (or attribute macros, as Rust calls the more general form).

macro parse_delimited(parser, tokenkind, body) =
  quote
    loop
      $body
      if peek_expect($parser, $tokenkind) then
        {}
      else
        break
      end
    end
  end -- quote
end

The $ is our unquote sigil, for no particular reason. That case doesn't look much different tbh; the main distinction is that the args are not parsed into ASTs, but rather spliced literally into the result.
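
For comparison, the Rust analogue with the quote crate, where quote! is the quote operator and # is the unquote sigil. expand is a made-up helper; in a real proc macro you'd first have to split parser, tokenkind, and body out of the input token stream yourself (e.g. with syn).

use proc_macro2::TokenStream;
use quote::quote;

// Assumes the three arguments have already been parsed out of the macro input.
fn expand(parser: TokenStream, tokenkind: TokenStream, body: TokenStream) -> TokenStream {
    quote! {
        loop {
            #body
            if #parser.peek_expect(#tokenkind.discr()) {
            } else {
                break;
            }
        }
    }
}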

Cases to consider:

  • An actual derive on a struct
  • More complicated codegen with repeated statements and stuff
  • idk, what else?

Definitely need to do more reading about design issues with Rust macros and the pros and cons thereof. I'm not actually much of an expert on them; my experience writing them is almost entirely the code-generation use case.

~icefox 2 months ago

Oh, another use case for macros: general-purpose special-purpose syntax. No, that's not a typo; I mean a way to have an expression that fits neatly into the existing syntax from the outside but can do magical things inside that you can't write in the language. In Rust these are things like offset_of!() or include_bytes!(), which act as hooks into the compiler that do Special Things using knowledge the compiler has that isn't otherwise very accessible. Zig uses "builtin functions" for this same purpose, most notably in my mind @import().
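
For concreteness, a tiny sketch of the kind of thing I mean, using std's offset_of! (the Header struct is made up): the expression looks ordinary from the outside, but its value requires layout knowledge only the compiler has.

use std::mem::offset_of;

#[repr(C)]
struct Header {
    magic: u32,
    len: u16,
}

fn main() {
    // Needs field-layout information that only the compiler knows.
    let off = offset_of!(Header, len);
    println!("offset of Header.len = {off}");
}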

I've read the Oasis's Compile-time Reflection In Rust report, and it's an interesting mix. Some of it echoes my own thoughts about just having functions and types that offer a way to introspect on types. I think #17 has some work on that, but the gist of the idea is mostly just "why doesn't the compiler just generate an iterator for each struct type that gives me a (&str, Type) pair for each field in the struct?" The answer is usually that you can't do much with Type without compile-time evaluation or runtime reflection or a JIT, but well, my type checker is already a compile-time evaluator, so that's not much of a blocker.

Key question: why can't you iterate over the fields of a tuple at compile time? That would let you implement varargs, is kinda related to const generics for array sizes, and touches the trait-impl hell that exists for tuples, so it's a single feature that covers a lot of ground. "Accessing a tuple cannot be done programmatically with constant expressions, because the my_tuple.0 syntax is a hardcoded, explicit allowance of the syntax that literally requires a number after the .". So allowing any const expr after the dot would let us index tuples, at the cost of needing something like union types to give the expression a type. That's a problem Rust has that I already want to fix anyway. I'm not gonna have something like union types, but it's a hell of a feature to think about.
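
To make the hardcoded-syntax point concrete, a minimal illustration in today's Rust (nth is a made-up name for the function you can't write):

fn main() {
    let t = (1u8, "two", 3.0f64);
    // Fine: the literal index is part of the grammar.
    let _a = t.0;

    // Not writable: the return type would be "u8 or &str or f64" depending
    // on I, which needs something like a union/one-of type to express.
    // fn nth<const I: usize>(t: &(u8, &str, f64)) -> ??? { t.I }
}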

The Oasis proposal does it by allowing iteration with compiler-generated (or at least compiler-aware) visitors, but it also has to propose a design that doesn't break Rust as it is. Then it uses traits with generics to implement the Type arg, so instead of fn next(&self) -> (&str, Type) you have fn visit_struct<Type>(&self, &str) or something. Then you implement that function where Type: SomethingSlightlyMagical and that is how you express your "pattern match on type1 | type2".
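
A hedged sketch of that visitor shape, with names I made up rather than the ones in the report: instead of handing you a first-class Type value, the compiler calls you back once per field with the field's type as a generic parameter.

// Made-up names; the real proposal's traits and methods differ.
trait FieldVisitor {
    // Called once per field, with the field's type as `T`.
    fn visit_field<T: Introspect>(&mut self, name: &str);
}

trait Introspect {
    // Imagined as compiler-generated for each struct: calls
    // `visitor.visit_field::<FieldType>(field_name)` in declaration order.
    fn walk_fields<V: FieldVisitor>(visitor: &mut V);
}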

The pieces: a keyword that introspects on a type (at compile time, I think) and produces some unknowable type that contains your introspection information, which has an undefined structure but fulfills a particular trait. Better handling of discriminants and enum variants, which I already have in theory. And a way to express "zero or more" of a type or argument, which the proposal hacks around with the visitors and some keywords.

Zig also kinda does a lot of this afaict, maybe better than Rust can. Learn more.

~icefox referenced this from #74 a month ago
