~mcf/cproc#1: 
Variable-length arrays

This probably needs a stack save/restore feature in QBE. I looked at what clang/LLVM does for this, and they have stacksave and stackrestore intrinsic functions. I'm imagining something similar for QBE

    %stack =l stacksave
    %vla =l alloc8 %size
    ...
    stackrestore %stack

I'm not sure if this would be easy to implement, or if it would be tricky due to interference with QBE's own use of the stack.

Status
RESOLVED IMPLEMENTED
Submitter
~mcf
Assigned to
No-one
Submitted
5 years ago
Updated
4 months ago
Labels
No labels applied.

~mpech 5 years ago · edit

I will wait this feature to test PicoLisp (C implementation).

~sircmpwn 4 years ago

~mcf, can you write up a summary of your thoughts on this feature here? What blocks it, and possible workarounds? I remember it was something with qbe.

~sircmpwn 4 years ago

I wrote up an example of an implementation that might work for VLAs in a loop:

export function w $main() {
@start
	%iters =l copy 8
	%a.sz =l copy 8
	%a.0 =l alloc8 %a.sz
	%i =l copy %iters
	%sz =l copy %a.sz
@loop.start
@loop
	# Ensure that %a is at least as big as %sz
	%a.resize =w cugel %a.sz, %sz
	jnz %a.resize, @a.sufficient, @a.resize
@a.resize
	%a.oldsz =l copy %a.sz
	%a.sz =l copy %sz
	%a.1 =l alloc8 %a.sz
	%a.i =l copy 0
@a.copy
	%a.0.n =l add %a.0, %a.i
	%a.1.n =l add %a.1, %a.i
	%a.n =l loadl %a.0.n
	storel %a.n, %a.1.n
	%a.i =l add %a.i, 1
	%a.d =l sub %a.sz, %a.i
	jnz %a.d, @a.copy, @a.resized
@a.resized
	%a.0 =l copy %a.1
@a.sufficient
	call $printf(l $fmt, l %a.0)

	%i =w sub %i, 1
	%sz =l add %sz, 1
	jnz %i, @loop, @end
@end
	ret 0
}

data $fmt = { b "a.0: %p\n", b 0 }

Obviously the resulting code is going to suck but that's what you get for putting a VLA in a loop.

~mcf 4 years ago

I updated the description with what I concluded last time I investigated this.

Regarding your proposal, I think that has worst-case quadratic memory usage, which might not be acceptable. For example, something that should be safe, like

for (size_t i = 0; i < 8192; ++i) {
	char a[i];
	f(a, i);
}

would use 32 MiB on the stack. A possible variation could be to allocate in powers of 2, which should use use at most twice the required size.

I don't think the copy is necessary since VLAs can't be resized in C.

In any case, I don't it's worth trying to emulate full VLAs until QBE has the proper support. For now, supporting VLAs that don't appear under any labels or loop bodies is probably sufficient for most code.

~sircmpwn 4 years ago

I'm not convinced that qbe would do anything much different from this. Do you think otherwise?

~sircmpwn 4 years ago

Though +1 to allocating by powers of 2.

~mcf 4 years ago

Yes, I think QBE should be able to save and restore the stack pointer.

Consider the following program:

#include <stddef.h>
size_t x;
void g(char *);
void f(void) {
	for (;;) {
		char a[x];
		g(a);
	}
}

clang turns this into this LLVM IR:

define dso_local void @f() local_unnamed_addr #0 {
  br label %1

1:                                                ; preds = %1, %0
  %2 = load i64, i64* @x, align 8, !tbaa !2
  %3 = call i8* @llvm.stacksave()
  %4 = alloca i8, i64 %2, align 16
  call void @g(i8* nonnull %4) <a href="/~mcf/cproc/1" title="~mcf/cproc#1: Variable-length arrays">#1</a>
  call void @llvm.stackrestore(i8* %3)
  br label %1
}

which gets translated to this x86_64 asm:

f:                                      # @f
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %rbx
        pushq   %rax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        movq    %rsp, %rbx
        movq    x(%rip), %rax
        movq    %rsp, %rdi
        addq    $15, %rax
        andq    $-16, %rax
        subq    %rax, %rdi
        movq    %rdi, %rsp
        callq   g
        movq    %rbx, %rsp
        jmp     .LBB0_1

Rather than allocating more and more stack space, the stack pointer is saved at the beginning of the loop body, and restored at the end.

D. Olsson 2 years ago · edit

When possible, I would love to help with testing, documentation or anything else that a not-compiler programmer can help with regarding this todo.

I'm trying to compile skalibs, s6 and redo-c with cproc. And all three of these projects require VLAs.

/D. Olsson

Nathanael Möcking 2 years ago · edit

If I may interject, but what would be the problems of implementing VLAs as syntactic sugar for the apparently implemented __builtin_alloca, which has a very similar functionality? This seems to be a very straight-forward way to go.

~mcf 2 years ago

See the earlier discussion. The problem is that alloca and VLAs behave differently in a loop. Memory allocated by alloca gets deallocated when the function returns, but VLAs get deallocated at the end of the block. Naively compiling VLAs as alloca could result in stack overflows in some of the examples above.

~noocsharp 5 months ago

I had a silly idea. For-loop bodies can be thought of as closures, where the environment is the set of local variables in a function. The body of a for-loop could be extracted into a separate function which takes an parameter to the local variables in the function containing the loop. This would take care of saving and restoring the stack.

For nested for-loops, additional parameters could be added to point to the environment of each nested block.

Of course this has a lot of downsides, but I think qbe is technically already capable of implementing VLAs without the memory overhead.

~mcf 5 months ago

Maybe this could work, but I think it should actually be pretty easy to add support for VLAs to QBE, just a matter of syntax.

In terms of implementing in cproc, I think the best order would be:

  • Add support for variably modified types, such as int (*array)[f()]. To do this, we'd add a new field struct expr *expr to the array struct in struct type, and perhaps a struct value *size field as well. The size would be computed by evaluating the expression at the appropriate time (I'd need to do some more reading of the spec). Several operators, such as sizeof, and pointer arithmetic would have to be changed to use the dynamic size instead of the constant size in the type.
  • Most uses of VLAs occur at the top-level of the function, so we should be able to support those easily with just one alloc.
  • The general case is what needs QBE support. I think stacksave/stackrestore instructions would be the best interface for frontends.

~mcf 5 months ago

Michael Forney referenced this ticket in commit 1e66f1c.

~mcf 4 months ago

Added #88 to track the VLA loop issue. For now, we will just ignore it to unblock VLA support.

~mcf REPORTED IMPLEMENTED 4 months ago

Nihal Jere referenced this ticket in commit 4f206ac.

Register here or Log in to comment, or comment via email.