~icefox/garnet#67: Operators and borrow types

The time has come to make some actual decisions about various operators and types, and how they interact. Basically I need bitwise operators, & | ^ ~, but I also worry that if we have & as a borrow operator like Rust does, and have it postfix, then foo & bar can be ambiguous with foo& bar. I'm not suuuuure there's any valid place that can happen; foo&.bar is unambiguous with a couple tokens of lookahead so I'm not too worried about that.

Having & | ^ ~ for bitwise operators is also a bit at odds with our keyword-y style in general, but I think here the legacy is worth it instead of using something like bitand. Bitwise operators are one of those things that are used rarely, but when they are used it tends to be a big pile of them at once. Still, I'm open to arguments.

So first, is foo& bar something that can actually happen, or is it always unambiguous?

Second, what do we want to actually use for borrows as well as raw pointers? 'Cause that will interact with this, and it's also just time to do it. If we use * for "constant pointer" then we end up with the same problem for foo* bar. Hmm, how does Zig handle it?

Status: REPORTED
Submitter: ~icefox
Assigned to: No-one
Submitted: 7 months ago
Updated: 7 months ago
Labels: T-DESIGN

~icefox 7 months ago*

> So first, is foo& bar something that can actually happen, or is it always unambiguous?

It's potentially ambiguous, but between our pratt parser's precedence and our syntactic newlines, I'm 90% sure it's something we can always disambiguate. Whether the result is actually nice at all, we have to find out. It does mean that foo& & bar could potentially be valid.
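To make the disambiguation concrete, here is a minimal Pratt-parser sketch in Python (not Garnet's actual parser). It assumes one particular rule that the thread does not commit to: an `&` that is not followed by something that can start an operand is read as a postfix borrow, otherwise as infix bitwise-and. The token set, binding powers and tuple-shaped tree are all made up for illustration, and newline handling is left out.

```python
import re

# Hypothetical token set: identifiers, integers, and the sigils & | ^ ( )
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[&|^()])")

# Assumed binding powers (C-like ordering); higher binds tighter
INFIX_POWER = {"|": 1, "^": 2, "&": 3}


def tokenize(src):
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def starts_operand(tok):
    # True if tok can begin an operand: an identifier, a number, or '('
    return tok is not None and (tok == "(" or tok[0].isalnum() or tok[0] == "_")


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i] if i < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def primary(self):
        tok = self.advance()
        if tok == "(":
            inner = self.expr(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if not starts_operand(tok):
            raise SyntaxError(f"expected operand, got {tok!r}")
        return tok

    def expr(self, min_power):
        left = self.primary()
        while True:
            tok = self.peek()
            # An '&' with no operand after it cannot be infix: treat it as postfix borrow
            if tok == "&" and not starts_operand(self.peek(1)):
                self.advance()
                left = ("borrow", left)
                continue
            power = INFIX_POWER.get(tok)
            if power is None or power <= min_power:
                return left
            self.advance()
            left = (tok, left, self.expr(power))


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.expr(0)
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Under this rule `foo& bar` and `foo & bar` parse identically as a bitwise-and, while `foo& & bar` comes out as a borrow of `foo` and-ed with `bar`. The cost is that a borrow can never be directly followed by an operand, which may or may not be acceptable depending on the rest of the grammar.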

What to actually use for borrows? Well now that's a GOOD question. We really have three axes of borrow/pointer types: owned, shared, and unique. In #65 we say "Syntax is a pain so let's trivialize it: We have two types, Share('a, T) and Uniq('a, T)", which really is the simplest base case. Or, per the https://without.boats/blog/pinned-places/ blog post, there's a deeper hierarchy of increasing power: borrow < mutate < move. I also was intending to use ^ for pointers or borrows, mainly out of contrariness. I don't hate Austral's use of ! to signify mutable borrows either, so we could have foo^, foo^! and foo^pin or something. That... well, we'll see how it feels. ^pin is definitely oogly but also not that common, so, that's easy to change. It also combines poorly with prefix types compared to the others: &T vs &!T vs &pin T. I regret the loss of & to mean "borrow", so hell, maybe we keep it after all.

Raw pointers? * never bloody made sense to me, but the analogy with C is useful. Zig also uses *. And foo.* for dereferencing, which is slightly hilarious. &foo for taking a reference though, mysteriously.

Where do we stick lifetimes? &(a)T maybe, that should be unambiguous in the type language. We also can use &[a, T] like Austral does, though that clashes with arrays. Well types compose prefix no matter what, so &'a T is unambiguous. That's heckin' fine.

Hokay, so here's what I'm feeling:

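| thing                | type       | expression to wrap/create | expression unwrap/dereference |
|----------------------|------------|---------------------------|-------------------------------|
| shared borrow        | &'a T      | val&                      | val^                          |
| unique borrow        | &!'a T     | val&!                     | val^                          |
| move-out-able borrow | &move 'a T | T&move or something       | val^                          |
| constant raw pointer | Ptr[T]     | Ptr(val)                  | val^                          |
| mutable raw pointer  | PtrMut[T]  | PtrMut(val)               | val^                          |
| address              | Addr       | Addr(val)                 | N/A                           |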

~akavel 7 months ago*

As to bitwise, for the record I'd kinda rephrase my thoughts from ~icefox/garnetc#19, with some further pondering:

  • FWIW, Lua seems to work with bitand etc.; Lua is arguably a higher-level language, so presumably bitwise operators aren't needed there as often (it lived without any official ones for a really long time, in fact) as in a language targeting a lower level, which is what Garnet aims for IIUC. Whoops, looks like I was wrong on that: Lua 5.2 indeed had band/bor/... but they were functions in the bit32 standard library, whereas Lua 5.3+ seems to have the popular | & etc. operators...
  • As a disclaimer, it's worth noting that bitwise operators are not the only case of "special" ones apart from basic arithmetic: notably, people also often argue for vector/matrix operations to be special-cased into binary symbolic infix notation. Bitwise operators are used a lot in some domains such as hardware communication, network protocols, and cryptography, which are common when writing some kinds of code. Conversely, vector/matrix operators are similarly used a lot in other domains, such as computer graphics, other kinds of geometry, or physics modeling & simulations, possibly also signal processing and others. (Though, arguably, bitwise operators still make inroads into those domains as well from time to time, so maybe they're indeed somewhat more common in the end?)
  • With that said, I believe one really cool option could be working UFCS support in Garnet. In that case every library could de facto create their own kinda poor man's named infix operators. I think this could mean adding those infix ops could be postponed quite a while - maybe even indefinitely. For the operation discussed in garnetc#19, I imagine this could then look more or less like below:
    oldstate :rshift(18) :bxor(oldstate) :rshift(27) :to_i32()
  • Alternatively, I think "word-based" infix operators could be a thing as well - at least postponing the decision whether to allocate sigils for them to a later time; so, e.g., I think 0xff bitand foo bitor bar could maybe be a thing - in a similar vein as something and foo or bar can work for logic expressions. Again, the discussed operation could then presumably look more or less like below:
    i64_to_i32(((oldstate bitrshift 18) bitxor oldstate) bitrshift 27)

~akavel 7 months ago

Curiously, the operation from garnetc#19 now actually looks more readable to me in the "poor man's infix" variant of UFCS above, compared to the forest of parentheses in the case of full-blown infix operators... just making a surprised note...

~akavel 7 months ago

Aaaactually, IIUC Nim seems to just go with and etc. for the bitwise ops (also e.g. shr for shift-right) - now that I think of it, in a statically typed language, you should always know if you're operating on booleans or on integers, right?

~icefox 7 months ago

Yeah you should, but overloading operations on booleans and integers Feels A Little Wrong. If I were awake I could construct some case of logic and comparison along the lines of ((a == b) and c) == d that would do Non-Obvious Things if logical and bitwise operations were the same. It might work, I'm just cautious.

You're right, oldstate :rshift(18) :bxor(oldstate) :rshift(27) :to_i32() honestly isn't the worst. These days I'm leaning away from UFCS and towards Elixir/ML/Elm style pipeline operators, so it would look something like:

oldstate |> rshift(18) |> bxor(oldstate) |> rshift(27) |> to_i32()

The |> sigil here feels a little weird, but is fine for now. Lemme dig some bitwise-heavy code out of my tests and see how it looks. And update it to the proposed borrowing syntax...

-- part of leb128.gt

fn read_unsigned(r Read) Result[U64, ReadError] =
    let mut result: U64 = 0
    let mut shift: U32 = 0
    loop
        let mut buf Arr[U8] = [0]
        Read.read_exact(r, buf&!)

        if shift == 63 and buf[0] != 0x00 and buf[0] != 0x01 then
            return Err(ReadError.Overflow)
        end

        let low_bits = low_bits_of_byte(buf[0]) as U64
        result = result |> bitor(low_bits |> shl(shift))

        if buf[0] |> bitand(CONTINUATION_BIT) == 0 then
            return Ok(result)
        end

        shift = shift + 7
    end
end

Hmmm. Gets kinda weird when you have to start nesting, like result |> bitor(low_bits |> shl(shift)). With your syntax that would be something like result:bitor(low_bits:shl(shift)) which does feel less noisy?
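For comparison, here is roughly the same unsigned LEB128 loop written with plain infix sigils, as a Python sketch rather than Garnet. It takes a byte sequence instead of a reader, and the exception types and `LOW_BITS_MASK` name are assumptions of this sketch. The accumulation step uses bitwise OR, which is what LEB128 decoding needs.

```python
CONTINUATION_BIT = 0x80  # high bit set means "more bytes follow"
LOW_BITS_MASK = 0x7F     # the seven payload bits of each byte


def read_unsigned(data):
    """Decode one unsigned LEB128 value (at most 64 bits) from an iterable of bytes."""
    result = 0
    shift = 0
    for byte in data:
        # At shift 63 only one payload bit still fits into a 64-bit result
        if shift == 63 and byte not in (0x00, 0x01):
            raise OverflowError("LEB128 value does not fit in 64 bits")

        # Nested form of: result |> bitor(low_bits |> shl(shift))
        result |= (byte & LOW_BITS_MASK) << shift

        if (byte & CONTINUATION_BIT) == 0:
            return result

        shift += 7
    raise EOFError("input ended in the middle of a LEB128 value")
```

The nested step is a single short expression here, which is the "big pile of them at once" case where sigils tend to pay off; the explicit parentheses around the mask test are optional in Python but would be required in C.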

Let me try it out on my PRNG test code:

fn rand32_i32(rand Rand32) {Rand32, I32} =
    let oldstate = rand$.state
    let mut newrng = rand
    newrng$.state = oldstate * RAND32_MULTIPLIER + rand$.inc
    let xorshifted = oldstate |> shr(18) |> bitxor(oldstate) |> shr(27) |> to_u32()
    let rot = oldstate |> shr(59) |> to_i32()
    let num = xorshifted |> bitror(rot) |> to_i32()
    {newrng, num}
end

Eyyyy that's kinda weird but... honestly not terrible, I think?
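To sanity-check that the piped spelling really is the same computation as the infix one, here is a small Python sketch. The `pipe` helper is hypothetical (Python has no `|>`), and `MASK32` stands in for the `to_u32()` truncation, which is an assumption about what that conversion does.

```python
import operator

MASK32 = 0xFFFFFFFF  # assumed stand-in for to_u32(): keep the low 32 bits


def pipe(value, *steps):
    """Thread value through (function, argument) pairs, like value |> f(arg) |> g(arg2)."""
    for fn, arg in steps:
        value = fn(value, arg)
    return value


def xorshifted_infix(oldstate):
    # Infix spelling: needs nested parentheses to get the grouping right
    return (((oldstate >> 18) ^ oldstate) >> 27) & MASK32


def xorshifted_piped(oldstate):
    # Pipeline spelling: reads left to right, no nesting
    return pipe(
        oldstate,
        (operator.rshift, 18),
        (operator.xor, oldstate),
        (operator.rshift, 27),
        (operator.and_, MASK32),
    )
```

Both functions compute the same value for any non-negative state; the difference is purely in how the grouping is expressed.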

~akavel 7 months ago

To clarify:

  • the foo:bar(arg):baz(arg2) syntax is not "mine", it's just Lua; but I assume you know this, and maybe are just saying it's mine as a shortcut, then that's ok for me ;) Also, it kinda made me realize it's not really necessarily UFCS I think, it's just a "method invocation syntax"; whether it's used with UFCS or not is a separate topic IIUC.
  • with foo |> bar(arg) |> baz(arg2), if you add any "piped" syntax, I'd say you need to be ready to accept it in a "nested" form (in args) anyway, it will just happen (whether that's to be deemed "cool" and "good practice" or not is another thing I guess, I don't really know) - e.g. I wrote some in my recent scripts in Nickel as shown below:

specimen 1:

  by-path = fun record-with-paths => record-with-paths
    |> std.record.to_array  # [{field=..., value=...}...]
    |> std.array.map (fun {field, value} => field |> std.string.split "/" |> put-deep value)
    |> std.record.merge_all,

specimen 2:

  oneliners_to_bat = fun record => record
    |> std.record.to_array
    |> std.array.map (fun x => {
        field = "%{x.field}.bat",
        value = "@%{x.value} %*\n",
      })
    |> std.record.from_array,

As to what is less vs. more noisy: firstly, personally, in your example above, I do actually find the |> example to have "more air" than the : one and as such be easier to read (to me); but that's again just One Internet Person's Opinion™. Secondly, this makes me think of Lua's boolean operations vs. C-style ones - in C you have foo && bar (or even foo&&bar), in Lua/Python you have foo and bar. In each case I think whether one or the other is more readable depends mainly on what the user is more familiar with. I remember there was a time when I found Lua's boolean syntax weird and "hard to read", but with time both came to seem equally fine. And eventually I started liking Lua's consistency, and a kind of a "flow of words" in its syntax - maybe just associating it with my overall liking of Lua and its well balanced minimalism and choice of features.

~akavel 7 months ago

And, as to "overloading operations on booleans and integers Feels A Little Wrong", I'd kinda challenge you on that :P Firstly, it doesn't feel wrong to me honestly ;P secondly, Nim does this and it doesn't seem to be a problem; thirdly, e.g. 1 / 2 vs. 1.0 / 2.0 are also overloaded and not exactly the same.

On the other hand, I remembered reading something about why C has both && and &, and that BCPL had just &; I tried to find it now, and found this: https://www.bell-labs.com/usr/dmr/www/chist.html, see section "Neonatal C" - apparently one area where this might actually be an important distinction is in precedence rules (ideally foo && bar == baz should be foo && (bar == baz), while foo & bar == baz should be (foo & bar) == baz; C does not parse the latter that way, as explained in the link, and it's a common footgun).
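A tiny worked example of that footgun, as a Python sketch with made-up values. Python itself gives & higher precedence than ==, so the C grouping is written out with explicit parentheses to show what it would compute.

```python
flags = 0b0110
MASK = 0b0100

# The grouping people usually mean: mask first, then compare
intended = (flags & MASK) == MASK

# The grouping C applies to `flags & MASK == MASK`: compare first, then mask.
# MASK == MASK is True (1), so this computes 0b0110 & 1.
c_grouping = flags & (MASK == MASK)

# Python's own precedence matches the intended grouping
python_default = flags & MASK == MASK
```

Here `intended` is True while `c_grouping` is 0, so the same source text reads as "bit is set" under one precedence table and as "false" under the other.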
