#146 Rendering and caret positioning for complex scripts 10 hours ago

Ticket created by ~tainted-bit on ~eliasnaur/gio

With the fix for gio#104, we can now render glyphs for all languages. However, certain languages like Arabic are still rendered so poorly that they are essentially unsupported. This applies to all complex scripts: writing systems that require advanced processing operations such as context-dependent substitutions, context-dependent mark positioning, glyph-to-glyph joining, glyph reordering, or glyph stacking. The current font shaper also does not support bidirectional text. Because of this issue, users who only read or write languages with complex scripts are excluded from using Gio apps. This is a feature request to add full support for all languages supported by Unicode. Resolving these limitations will require significant changes to the font shaping system; supporting fonts with broad Unicode coverage (gio#104) was a necessary but insufficient step.

Other GUI toolkits, such as Qt and GTK+, as well as browser rendering engines, all solve this problem using the same software stack, discussed below.

The first piece that we need is a system to convert Go strings (containing runes, i.e., Unicode codepoints) into a sequence of glyphs from the font file, together with their positions. This is by far the most difficult step of proper text rendering, because this is where all of the complexity of human writing systems arises. There is exactly one open-source project that accomplishes this goal: HarfBuzz. At its core, HarfBuzz provides a function called hb_shape that is responsible for translating Unicode strings into glyph sequences. This is the library that all of the GUI toolkits mentioned above use.

It is always preferable to avoid cgo if possible. However, I believe there is no way that Gio will be able to provide reasonable support for complex scripts without using a cgo wrapper around HarfBuzz. This is because HarfBuzz is necessarily massive, is constantly receiving fixes and updates, and requires a large and active community to keep it running. These factors are extreme enough that there are essentially no viable alternatives to HarfBuzz in any programming language. Porting it, or replicating only the necessary components in Go, is a monumental task that is certainly beyond the development resources of Gio, and possibly even beyond the resources of the Go subcommunity that needs this support, and yet it is necessary for supporting complex scripts. Moreover, I believe it is in the best interest of the wider open-source community to concentrate such efforts in the HarfBuzz project. This is why the first step to resolving this issue is writing a cgo wrapper around HarfBuzz (which will be useful not just to Gio, but to other Go projects as well).

The output of a call to HarfBuzz's hb_shape is a sequence of glyphs to render from the font. For each glyph, we are given the glyph ID to look up in the font file (stored, confusingly, in a field named codepoint), the X and Y advances (i.e., the values to add to the current rendering cursor position after drawing the glyph), and the X and Y offsets (i.e., values to add to the cursor position when placing the glyph, without affecting the cursor itself). HarfBuzz also outputs a cluster value for each glyph: clusters are the proper way to handle caret positioning and text selection in widget.Editor. The caret position should be based on clusters rather than bytes or runes, because fonts may merge multiple codepoints into a single glyph.

HarfBuzz requires several pieces of information as input: the string to shape, the script (e.g., Cyrillic), the language (e.g., Russian), the direction (e.g., left-to-right), the font file, and the shaper model (e.g., OpenType shaper). All of this information must be constant for the whole string. This means that rendering something like an Arabic passage with a left-to-right English proper noun in the middle will require multiple calls to HarfBuzz. Additionally, toolkits like Gio want to expose a simple function like LayoutString from text.Shaper to the developer without requiring them to specify things like the language that the string is written in. This means that we need another component that takes an arbitrary string as input, and produces a sequence of HarfBuzz hb_shape calls as output. In other toolkits, this is the job of the Pango library from the GNOME project.

In contrast to HarfBuzz, the relevant algorithms from Pango are small and simple enough to rewrite in pure Go, either as part of Gio or as part of a separate project. Pango supports a lot of extra functionality that we do not need, and the pieces that we do need are buried within a lot of C code that is unnecessary in Go. To replicate the relevant Pango functionality (breaking a string into runs such that each run will be fed into a single HarfBuzz hb_shape call), we'll need to implement an algorithm like this:

  1. Break up the string into runs with the same text direction (either left-to-right or right-to-left) using the Unicode Bidirectional Algorithm. This algorithm is fully specified in Unicode Standard Annex #9. Strangely, although it is implemented in the Go extended library (golang.org/x/text/unicode/bidi), the code lives in unexported functions, and the exported functions all panic with "unimplemented". Implementing this part will likely involve copying code from the unexported functions, combined with new code that follows the Unicode Annex.
  2. Within each run, break the string into further runs based on when the script changes. Pango does this by simply checking the unicode ranges for each codepoint to figure out which script it belongs to, and starting a new run whenever that changes. The information in the Go standard library (the RangeTable values in unicode) is sufficient for this.
  3. For each run, we now have the direction and script. We now need the language. In general, there is no algorithm for this. Pango does its best by using this algorithm (which is clearly not great, but it is good enough to be adopted by all of the GUI toolkits):
    1. Scan the languages installed on the system (e.g., specified in environment variables PANGO_LANGUAGE or LANGUAGE) in order of priority. For each language, check to see if the script can be used to write that language. If it can, then return that language.
    2. Otherwise, look up a representative language for the script from a hard-coded table (using the pango_script_get_sample_language function). If a representative language is defined, then return it.
    3. Otherwise, return the default language for the locale of the process (whether or not that makes sense).
  4. Set the shaper model to OpenType (we probably don't need to support other formats) and set the font file based on the current Gio theme. Note that for OpenType Collections like the one used to fix gio#104, HarfBuzz requires the specific font index within the collection to use (it cannot scan the collection on its own). This means that we'll need to scan the collection to find the font index that supports the script for the run; we should probably scan the font data and cache this information when the collection is first loaded.

To summarize, my suggested path to supporting complex scripts in Gio is:

  1. Write a cgo wrapper around HarfBuzz.
  2. Implement the above algorithm to convert strings into runs.
  3. Update Gio's font rendering to: convert the string to runs, feed the runs into HarfBuzz, and then render the glyphs with the existing rendering system.
  4. Update the text package as necessary to remove implicit shaping assumptions that are only valid for simple scripts.
  5. Update widget.Editor to base the caret position on clusters output by HarfBuzz, instead of runes.

#104 Support fonts with broad unicode coverage 15 hours ago

Comment by ~tainted-bit on ~eliasnaur/gio

Thanks for your guidance with this patch series. As promised, I've uploaded the fonts example so that others can see the new feature in action: github.com/Nik-U/gioexamples/fonts.

Through this process I've learned that supporting all languages is more difficult than just loading a broad-coverage font collection and that fixing this issue was only the first step. I'll open a new issue shortly with what remains to be done and a potential path forward.

#104 Support fonts with broad unicode coverage 2 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

Patch 11495 omits the sfnt.Buffer lock and includes a test with the aforementioned shapings. To determine whether or not two font shapes are equal, it compares the Ops as opaque byte slices. This testing method is very fast and does not require any rendering, but it breaks if the shaper produces non-deterministic Ops or if any of the operations use Refs. I think that these are reasonable assumptions, but I don't know much about the operations subsystem. The test uses two TTF files that I have prepared. As hoped, they are both very small: 1071 bytes combined. The tests also include a rudimentary OTC merging function, as discussed.

#104 Support fonts with broad unicode coverage 2 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

> Rendering text is great for a demo but overkill for a test. It seems to me a test that inspects the result of Layout (and/or Shape) for a string containing fallback character(s) would suffice.

This will mostly work, but the tricky part is testing that the replacement glyph comes from the first font in the collection. This is not visible in the []text.Line returned by Layout because the glyph rune will be the same. Distinguishing the replacement characters will require at least a call to Shape and potentially examining the returned op.CallOp to identify the differences.

Alternatively, we could just not test that part of the code and leave it up to Collection to decide where replacement characters come from. After all, if all goes well we'll almost never see them anyway. :)

> I'd like to keep any checked in TTF/OTC files tiny, < 1kb or even < 512 bytes like the other reference files in refs/. I'm ok with test relying on the Go fonts not changing, and I'm also ok with non-general OTC merging code (since it's only for the test).

I'm pretty sure that sizes like those are attainable with pyftsubset and potentially some other tools / manual massaging, but I won't know the final size for sure until I try. The plan is to have two TTFs that each contain two glyphs (replacement glyph and one identifiable symbol) and nothing else.

#104 Support fonts with broad unicode coverage 2 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

> If it's alright with you, I'd like to avoid the mutex for now, deferring optimization to when it has a bigger impact.

Alright. Perhaps a short comment should be left around the sfnt.Buffer declaration to discourage optimization before the time is right?

> I see your point. Let's keep the cool demo in a separate repository (yours?) and I'll link to that from the newsletter and such.

Sure. I'll make a repository for it somewhere and post the link here once the patch is merged.

> Without an example, is there a way to add a test that doesn't need large dependencies? I'd like for the new feature not to bit-rot in the future.

That makes sense. My thought is to add internal/rendertest/shaper_test.go with TestCollectionAsFace to do some black-box testing. This would generate an OTC of two fonts, then render the following:

  1. Using font 1, a valid glyph in font 1.
  2. Using font 1, an invalid glyph in font 1 that is also invalid in font 2.
  3. Using font 2, a valid glyph in font 2 that is invalid in font 1.
  4. Using font 2, the same invalid glyph from render 2.
  5. Using the OTC, the glyph from render 1.
  6. Using the OTC, the glyph from render 3.
  7. Using the OTC, the glyph from render 2 / 4.

The tests are then as follows:

  • Renders 1, 2, 3, and 4 are distinct.
  • Render 5 == render 1.
  • Render 6 == render 3.
  • Render 7 == render 2.

The main question is which fonts to use for the test, and how to make the OTC. Right now I'm leaning towards storing sample TTFs and the merged OTC as raw files in the repository (probably under internal/rendertest/refs/ ?). This would avoid the test becoming brittle if the Go fonts change (although admittedly that is unlikely, given that the last update was 3 years ago) and also avoid dependency on Roboto (which is not currently used elsewhere). It also avoids the need to use OTCMerge as a dependency or to include rudimentary OTC merging code (which is admittedly only a few lines if generality is not required).

Does that approach seem sensible to you?

#104 Support fonts with broad unicode coverage 4 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

> I can certainly add an example for v3 of this patch. However, if it uses the actual Noto OTCs, that will add a 120 MB download as a dependency. For that reason I'm thinking of adding it as a nested module instead of directly in the gioui.org/example module, if that's okay with you.

I've submitted an example program in patch 11458. It doesn't compile without the opentype.Collection patch applied first, of course.

#104 Support fonts with broad unicode coverage 4 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

> Until benchmarks show a definite performance improvement, I prefer the simple approach of just not sharing buffers.

BenchmarkDrawUI in internal/rendertest is a good place to benchmark the approaches, since it calls Layout and Shape on an opentype.Font, but not concurrently (so it is similar to how a single window performs). The lock-vs-no-lock approach we're talking about here is going to be a tiny cost for most applications for the same reason that lock contention would be very low: text.faceCache will cache the layout and paths for most text drawing calls. Consequently the approach we take mainly affects GUIs that make poor use of the cache, such as a program that changes the text it draws very rapidly (e.g., a stopwatch GUI), or a GUI that is being resized by the user. To simulate this, I modified the drawText function in internal/rendertest/bench_test.go to bypass the cache so that it could test the performance of Layout and Shape:

txt.Text = textRows[x] + strconv.Itoa(rand.Int())

Here are the results of 3 runs with -test.benchtime=5s on my i7-6700K:

This patch (using sync.Mutex):

BenchmarkDrawUI-8            399      14885780 ns/op     5600746 B/op       1571 allocs/op
BenchmarkDrawUI-8            398      15011455 ns/op     5601157 B/op       1571 allocs/op
BenchmarkDrawUI-8            399      15005305 ns/op     5600769 B/op       1571 allocs/op

7bbe0da (not sharing sfnt.Buffer):

BenchmarkDrawUI-8            400      15011438 ns/op     5794994 B/op       1891 allocs/op
BenchmarkDrawUI-8            398      14971487 ns/op     5795127 B/op       1891 allocs/op
BenchmarkDrawUI-8            399      15030301 ns/op     5794782 B/op       1891 allocs/op

So in this degenerate case of constantly changing text, both approaches have roughly the same CPU time, but reusing sfnt.Buffer saves many allocs, as expected. The amount that it saves will be proportional to the number of uncached calls to text.Font. The difference between the approaches will be negligible for most GUIs (where the code is rarely called), but measurable for others. So the sync.Mutex approach has superior or equal performance, but slightly more code complexity. Personally I think that using a sync.Mutex and reusing the buffers is simple enough to justify this performance improvement, but I can see why you might disagree.

> I would love an example/fonts example showing off the new feature. Can you add it? I believe you can reuse example/hello.

I can certainly add an example for v3 of this patch. However, if it uses the actual Noto OTCs, that will add a 120 MB download as a dependency. For that reason I'm thinking of adding it as a nested module instead of directly in the gioui.org/example module, if that's okay with you.

#104 Support fonts with broad unicode coverage 5 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

Patch 11445 contains the revised code. I was initially confused about why it was okay for opentype.Font to use an sfnt.Buffer in multi-window settings when it was not okay for opentype.Collection to do the same. It turns out that it's not okay for either: there is an existing race condition in opentype.Font. It is triggered very rarely, because it only occurs when multiple windows render strings using the same font simultaneously. Moreover, clobbering the sfnt.Buffer may not be noticeable, depending on what the sfnt package does with it. In any case, this patch also fixes the existing race condition by adding a lock around the buffers for both Font and Collection. The locks will have very low contention, so the fast-path mutex overhead is almost certainly worth paying in order to keep the buffers around and avoid GC pressure.

I've also packaged the Noto font family in public repositories under github.com/gonoto. This makes it easy to test this patch with a simple change to example/hello:

otc, _ := opentype.ParseCollection(notosans.OTC())
th := material.NewTheme([]text.FontFace{{Face: otc}})

#104 Support fonts with broad unicode coverage 7 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

I have the patches for this done and I've written the code to produce embeddable full-coverage Noto fonts for Go. The problem is that these font files end up being 60--100 MB, of which I've confirmed most of the data is the actual glyph shapes. Turning that into embeddable Go in an efficient way results in a 2x--4x file size increase, because we do not yet have an official resource embedding system. Now I have 181 Go repositories that are 120--220 MB each. I need to think more about how to handle this data in a better way than uploading 30 GB to Github.

#104 Support fonts with broad unicode coverage 12 days ago

Comment by ~tainted-bit on ~eliasnaur/gio

Removing the Metrics from text.Face will definitely simplify this approach drastically. However, we'll still need an sfnt.Buffer for the new Layout and Shape calls on opentype.Collection. Is it a significant problem for these functions to be unsafe for concurrent use? The other methods (accessing the underlying sfnt.Font slice) will remain safe for concurrent use.

We'll also need to retain the combined metrics calculation for ascent, height, and the glyph bounds, because these are returned in the text.Line values produced by Layout (and subsequently used by widgets). These are fast enough to compute without a cache and have pretty clear algorithms, though.

My weird combo font is not something that should be used in practice; it exists only to demonstrate the fallback behavior. I can prepare an actual font collection for Noto to submit to the font repository, along with a build script to reproduce it with otf2otc in the future as Noto evolves.