~technomancy/fennel#232: 
precompile macros when loading compiler

When profiling my program, I found out that the majority of runtime is spent in (require :fennel). Behold:

$ cat test.fnl
(require :fennel)
(print "hii")
$ fennel --require-as-include --compile test.fnl > test.lua
$ tail test.lua
    ]===], {allowedGlobals = false, env = env, filename = "src/fennel/match.fnl", moduleName = module_name, scope = compiler.scopes.compiler, useMetadata = true})
    for k, v in pairs(match_macros) do
      compiler.scopes.global.macros[k] = v
    end
    package.preload[module_name] = nil
  end
  return mod
end
require("fennel")
return print("hii")
$ time lua test.lua
hii

real	0m0.484s
user	0m0.456s
sys	0m0.026s

That's half a second to print "hii". For comparison, I required enough of penlight for test.lua to have a similar number of lines (~8700) as with fennel. This took 30-50ms.

I remember someone mentioned on IRC that fennel compiles macros every time it's required. Is there a way to precompile everything? Could we make it the default? Otherwise any program that calls fennel programmatically (fennel.dofile, fennel.view, ...) pays a pretty big startup penalty.
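The cost of the require itself can be separated from interpreter startup by timing it from inside the program. The following is a minimal sketch, assuming the fennel module is on the module path; note that os.clock reports CPU time, so it approximates the "user" figure above rather than wall-clock time.

```fennel
;; Measure how long (require :fennel) takes on its own.
;; os.clock returns CPU seconds used by the process so far.
(local start (os.clock))
(local fennel (require :fennel))
(local elapsed-ms (* 1000 (- (os.clock) start)))

(print (string.format "require :fennel took %.1f ms" elapsed-ms))
```

Running this under both PUC Lua and LuaJIT gives a per-runtime number that can be compared before and after any precompilation change.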

Status
RESOLVED CLOSED
Submitter
~sarna
Assigned to
No-one
Submitted
7 months ago
Updated
2 months ago
Labels
enhancement

~technomancy 7 months ago

Yes, I think it's a good idea to turn macros into Lua as part of the bootstrap process for the compiler.

> Otherwise any program that calls fennel programmatically (fennel.dofile, fennel.view, ...) pays a pretty big startup penalty.

Oddly these two examples are very different; if you want to call dofile then you definitely need the whole compiler loaded, but if you just want to call fennel.view that can already be done faster:

;; instead of this:
(local {: view} (require :fennel))

;; do this:
(local view (require :fennel.view))

However, I'm not sure if this is well-documented. And this only works with fennel.view; none of the other nested modules in the compiler support this.

~sarna 7 months ago

Thanks for the tip about fennel.view, I didn't know you could do that!

Sadly, in my case I need fennel.dofile as I'm evaluating fennel files provided by the user.
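For context, the use case described here might look roughly like the sketch below. The helper name run-user-file is hypothetical and only for illustration; fennel.dofile is the real entry point, and it needs the whole compiler loaded, which is why the startup cost matters in this case.

```fennel
;; Loading the full compiler is what incurs the startup cost discussed here.
(local fennel (require :fennel))

;; Hypothetical helper: evaluate a user-provided Fennel file.
;; Returns the file's value on success, or nil plus the error on failure.
(fn run-user-file [path]
  (let [(ok? result) (pcall fennel.dofile path)]
    (if ok?
        result
        (values nil result))))
```

Wrapping the call in pcall keeps a broken user file from taking down the host program.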

~technomancy 7 months ago

OK, I've taken a look at this. I have an implementation that offers some speed improvements, with some tradeoffs. However, I can't reproduce the slow boot speeds you're reporting. For the same program you've shown above, I get about 100ms runtime with LuaJIT, or 125ms with Lua 5.4; both are imperceptibly fast. Hard to say what's causing it to be so slow in your case.

Precompiling the macros gets it down to 20-30 milliseconds; however my current approach requires disabling metadata, which breaks docstrings on built-in macros.

~sarna 7 months ago

These timings were from my Raspberry Pi 4. On my MacBook Air (M1) it takes around 100ms.

A couple runs of profiling on the Raspberry Pi:

$ luajit -jp test.lua
hii
12%  special
 8%  multi-sym?
 8%  parse_string_loop
 8%  symbol_to_expression
 8%  parse_sym
 8%  hook-opts
 8%  compile1
 4%  quoted?
 4%  getb
 4%  sym
 4%  add_stable_keys
 4%  close_table
 4%  flatten_chunk
 4%  load-code
 4%  test.lua:0
 4%  view
 4%  maxn
$ luajit -jp test.lua
hii
 9%  test.lua:0
 6%  list?
 6%  whitespace_3f
 6%  compile1
 6%  special
 6%  getb
 6%  parse_sym
 3%  stablepairs
 3%  _157_
 3%  set_source_fields
 3%  peephole
 3%  parse_string
 3%  (for generator)
 3%  get_arg_name
 3%  getopt
 3%  check_malformed_sym
 3%  close_table
 3%  parse_comment
 3%  exprs1
 3%  test.lua:2654
 3%  load-code
 3%  view
 3%  parse_sym_loop
 3%  multi-sym?
$ luajit -jp test.lua
hii
 9%  test.lua:0
 6%  view
 6%  symbol_to_expression
 6%  check_malformed_sym
 3%  normalize_opts
 3%  make_options
 3%  get_arg_name
 3%  (for generator)
 3%  parse_number
 3%  sym
 3%  calculate_if_target
 3%  open_table
 3%  compile1
 3%  macroexpand_2a
 3%  getbyte
 3%  hook-opts
 3%  tostring
 3%  ast-source
 3%  _229_
 3%  whitespace_3f
 3%  f
 3%  exprs1
 3%  make-scope
 3%  getb
 3%  _12_
 3%  skip_whitespace
 3%  load-code

On the Mac I couldn't manage to run the LuaJIT profiler, and the ones written in Lua had issues with recursion and showed incorrect data.

> Precompiling the macros gets it down to 20-30 milliseconds; however my current approach requires disabling metadata, which breaks docstrings on built-in macros.

From my perspective that's fine; if you want to fire up a REPL for the user, you can require the version that's not precompiled.

~technomancy 7 months ago

OK, well, you can take a look at my branch here where I've started this work: https://git.sr.ht/~technomancy/fennel/log/precompile-macros

I have an idea for how to fix the metadata problem, but there are also problems with nesting the compiler (loading fennel from a fennel program) in the test-nest test, and I have no idea what could be causing them.

~sarna 7 months ago

Finally managed to test it :) I can confirm it helps a lot: now the timings on my Raspberry Pi are ~35ms for PUC Lua 5.1, and ~25ms for LuaJIT. That's much better!

~technomancy 7 months ago

I think I have this working on the "precompile-macros" branch now! But it's too big of a change to merge right now. We'll have to get 1.5.1 released, and then we can bring this in for 1.6.0.

~sarna 7 months ago

Great news! I see that even metadata is included now :) Thank you!

~technomancy REPORTED CLOSED 2 months ago

This is on main now!
