~sircmpwn/hare#869: 
Consider adding some syntax to denote that two opaque types must be the same

For example, currently sort::search's prototype looks like this:

fn search(
    in: []const opaque,
    sz: size,
    key: const *opaque,
    cmp: *cmpfunc,
) (size | void);

type cmpfunc = fn(a: const *opaque, b: const *opaque) int;

No type safety is enforced here, so it's very easy to write buggy code (if an object with type []*t is passed to in and an object with type *t is passed to key, where really key should be **t) and have it go unchecked.

One way to remedy this without adding full parametric polymorphism would be to add some syntax which tells the compiler to check that multiple opaque types must be assigned from the same type:

fn search(
    in: []const opaque{elem},
    sz: size,
    key: const *opaque{elem},
    cmp: *cmpfunc,
) (size | void);

type cmpfunc = fn(a: const *opaque{item}, b: const *opaque{item}) int;

(The syntax used here is just proof-of-concept; I'm not sure what the actual syntax here would be.)

With this, the bug described above would no longer be possible, since the compiler would check that the in's secondary type is the same as key's secondary type. This wouldn't have any affect on runtime behavior: the types would stay opaque, and no information about the types would be passed into the function.

Note that this is different from aliases of opaque:

type foo = opaque;
fn bar(a: *foo, b: *foo) void;

Here, both a and b must be pointers to foo, but no other pointer types are allowed unless an explicit cast is used, since all pointers are only assignable to non-aliased *opaque (that is, bar isn't a "generic" function). The above proposal, on the other hand, doesn't create new type aliases. Other than the added compile-time check, the types are semantically identical to opaque in every way.

Status
RESOLVED WONT_FIX
Submitter
~sebsite
Assigned to
No-one
Submitted
8 months ago
Updated
2 months ago
Labels
idea

~ecs 8 months ago

tentative -1, if we're gonna do type-safe polymorphism i'd like to actually do it properly, and i'm not sure we want to do that

~turminal 6 months ago

This was already discussed I think, I remember talking about something like this to someone in #hare more than a year ago. I like the idea and would like to see it tried out in hare. It is however important to note that doing this in a self-consistent way and implementing the type-checking actually is one half of what every language with parametric polymorphism does. Fortunately it is the easier and better understood half. The other half is specialization, and it is much, much harder. It is avoided in this approach by requiring that the concrete parameters that are passed in are always hiding behind an opaque pointer or slice. So we end up on some kind of middle ground that is imo worth exploring.

~whynothugo 5 months ago

While exploring hare-ev recently, I've been looking into the same issue. Notably, for:

fn do_with_cb(
	cb: *fn(user: nullable *opaque) void,
	user: nullable *opaque,
) void = {
	// ... do something here ...
	cb(user);
};

If the type of either cb or user change at any time, this will compile. It may soft-fail at runtime, or it may even continue working with unexpected behaviour.

My approach is quite like the one on this issue. I've been calling this "opaque generics" while researching and exploring the idea because the generic type is completely opaque to the generic function. My solution to the above problem would be something like:

fn do_with_cb<T>(
	cb: *fn(user: nullable *T) void,
	user: nullable *T,
) void = {
	// ... do something here ...
	cb(user);
};

Instead of defining a generic type and declaring "this type must implement interface X", any functions that would we would consider part of "interface X" are passed as parameters. E.g.:

fn save_to_db<T>(
	db: database,
	items: &[]T,
	item_to_query: *fn(item: T) string,
) void = {
	for (let i = 0z; i < len(items); i += 1) {
		db.execute_query(item_to_query(item));
	};
};

I believe this is versatile, powerful, provides solid type checking for this situations and is extremely simple in design. The main downside is that functions are passed as a pointer. This ends up being a form of dynamic dispatch. For a function such as the following one, each comparison is a function call (rather than being inlined).:

fn sort<K>(items: *[]K, compare_fun: fn (a: K, b: K)) = void {
	// ...
};

#Inlining

I've considered that maybe when a generic function is forced inline, then a monomorphized version of it would be inlined by inlining the functions that are passed as parameter. This doesn't introduce new restrictions to the compilation/translation process. However, the parameter may not be possible to determine at compile-time, so this behaviour could be restricted to immutable global functions.

~sircmpwn 5 months ago

-1, this is just generics lite

~whynothugo 5 months ago

Yeah, it's the simpler parts of generics without the parts that would introduce huge complexity.

Do you know of something that might work instead of generics? I'm open to reading material that could be useful here.

~sebsite 2 months ago

With #[annotations], I think we might be better off just having some "standard" (by convention) lint annotation for this, and not make any changes to the language itself. Thoughts?

~turminal 2 months ago

I'm (mostly) fine with leaving this out from the language, but I don't think implementing half of generics as a lint annotation is a good idea. So if it doesn't make it into the language as defined by the specification, we should refrain from implementing it at all IMO.

~ecs 2 months ago

yeah -1 to implementing generics in the linter

~sebsite 2 months ago*

I wouldn't really consider it "implementing generics" in the linter; it's just checking that the types of function arguments are consistent.

We should definitely implement this for sort::* in the linter (if we aren't doing so in the language itself). Bugs stemming from incorrect argument types are relatively common, and it's pretty easy to detect in a linter, so not detecting this and warning kinda goes against the entire point of having a linter in the first place. But if we're supporting it for sort::*, it makes sense to support this for non-stdlib functions as well, rather than special casing sort::*. It's similar to how GCC and Clang warn about an invalid format string in printf, but also provide an attribute to enable this warning on user-defined functions as well. The logic is already there, so it makes sense IMO.

If others prefer, the linter could still special case sort::* by warning about mismatched argument types without needing the lint annotation (i.e. the annotation is implicitly added for recognized stdlib functions).

If people really don't want to add the annotation, then fine, but I very strongly believe that the linter should have checks for stdlib functions no matter what.

~turminal 2 months ago

Yeah I agree this would be nice to have at least for stdlib functions. The problem here is that it's very easy to implement that for sort::* (or any other function) in particular. It's a bit harder to make that work for general functions, so the situation is not as easily generalizable as with format strings in GCC/Clang. I really would like to see this tried out, (and we clearly all see the benefits of doing so), but if we do implement it, I just think it'd be a waste to only make a linter step out of it. It also sends a bit of a wrong message about what the annotations are for, I'd consider that linter step pretty close to a language extension.

~sebsite 2 months ago

It also sends a bit of a wrong message about what the annotations are for, I'd consider that linter step pretty close to a language extension.

I guess I disagree with that. What makes applying it to a stdlib function different from applying it to a non-stdlib function? The purpose of annotations is to interface with other tools; setting settings for the linter on a function is perfectly in-line with that.

Also, I think we should allow using annotations similarly for format strings. Functions which wrap around fmt:: functions are common enough that it doesn't really make sense to not add an annotation for this. I see the same-opaque-type annotation similarly (though I do acknowledge that this isn't as easy to generalize).

I guess I'm fine with not having an annotation for same-opaque-types (though I strongly believe there should be one for format strings). The relevant thing here is using the linter to prevent bugs in sort::* (and possibly other places) rather than extending the language (though I'm still open to extending the language if the implementation is simple and it ends up working well in practice).

~roselandgoose 2 months ago*

While I personally would appreciate the annotations approach Seb is suggesting, I agree with Ember and Bor that it is not right for hare. I think it is at best against Drew's intention in the annotations RFC and at worst basically a language extension...

However I would like to propose (yet) another alternative though. To be honest, I'm not totally sold on it myself as it requires a little bit of magic, though hopefully only as much as the new debug features. I was also inspired by Lorenz's macro discussion, though I think the following idea is a little more in keeping with hare's values than Lorenz's full on meta-programing idea:

What if we exposed a hook for modules to insert @test functions into code that imports (uses) them? Once we have a working hare::check::, modules like sort:: which expose *opaque -based interfaces could insert @test functions that would check for erroneous usage, written entirely in regular hare code with no reliance on extra syntax. Thoughts?

P.S. Apologies if this is an idea that has already been discussed in another ticket or list thread. If it is new and worth considering, I'm happy to do some research into what it would look like and write up an RFC.

~sebsite 2 months ago

I'm not 100% sure I understand the proposal, but from what I can gather, it sorta just sounds like implementing the logic for generics without actually letting the programmer use generics. It sounds complicated and overengineered to me I guess. I'm open to seeing examples though, either if I misunderstood something or if you're just confident about the idea.

~sircmpwn REPORTED WONT_FIX 2 months ago

I think this entire proposal is implementing generics without saying you're implementing generics. NACK all around.

~sebsite 2 months ago

What are your thoughts on the discussion of linter annotations (or just linter warnings in general)?

~sircmpwn 2 months ago

I do think that a linter could definitely identify incorrect use of sort::sort. I don't know if it has to be an annotation or could just be done somewhere else.

~roselandgoose 2 months ago

Would you want such a linter to be a separate program from the build driver?

~sebsite 2 months ago

So, my idea was that the linter is separate from the build driver, but it wraps around harec. That is, the linter analyzes Hare code and prints warnings, and then passes the files into harec to compile them. So then you can set $HAREC to the linter and hare build will display linter warnings.

Not sure whether or not the linter should be officially supported upstream. I think the best approach is to begin it as a downstream project, and it can be upstreamed later if that turns out to be a good idea.

Even if we don't end up doing this, and the linter really is completely separate, I do think that we should still normalize using it as part of the build process in one way or another, somehow.

(Further discussion on the linter which is unrelated to the issues with *opaque parameters should move to #916, or maybe something on the ML idk)

~sebsite 2 months ago

I do still think that we should come up with some path forward here, wrt the *opaque parameter stuff, whether that be in the language or in the linter. Because this is a real issue and I don't want to see it completely ignored. And I don't think shutting down the discussion with "generics bad" is very productive, because even if we aren't adding anything resembling generics to the language (which I mostly agree with), there's still value to be had in discussing the potential solutions to the problem.

~sircmpwn 2 months ago

Would you want such a linter to be a separate program from the build driver?

yes

Not sure whether or not the linter should be officially supported upstream. I think the best approach is to begin it as a downstream project, and it can be upstreamed later if that turns out to be a good idea.

+1

~sircmpwn 2 months ago

I don't find this problem particularly compelling. It affects one API in the whole of the stdlib. It's a known limitation of Hare's type system as-designed.

~whynothugo 2 months ago

The whole sort::cmp module is also affected by this. A lot of the API in hare-ev is too.

If I use ev::dial::resolve, I have to manually check that the type of cb and user have matching types. This is fine when initially writing code, but if a type ever changes somewhere, one has to manually follow through and make sure that all usages expect the new type. Granted, unit tests can catch issues, but it's really type mismatches to be caught by linters rather tests.

~sircmpwn 2 months ago

We're just endlessly discussing the consequences of the fact that we chose not to implement generics. We made this choice a long time ago and the trade-offs are well understood. It's not happening.

~turminal 2 months ago

I interpret the amount of activity on this issue as a sign that the choice and its consequences are not that well understood in the community, and perhaps rethinking opinions and decisions from years ago is not necessarily a bad thing. I for one am much less opposed to this sort of "generics" than I was when I started contributing to Hare.

Register here or Log in to comment, or comment via email.