~eliasnaur/gio#146: 
Rendering and caret positioning for complex scripts

With the fix for gio#104, we can now render glyphs for all languages. However, certain languages like Arabic are not rendered properly, to the degree that they are essentially still unsupported. This applies to all "complex scripts": writing systems that require advanced processing operations like context-dependent substitutions, context-dependent mark positioning, glyph-to-glyph joining, glyph reordering, or glyph stacking. The current font shaper also does not support bidirectional text. Because of this issue, users who only read/write languages with complex scripts are excluded from using Gio apps. This is a feature request to add full support for all languages supported by Unicode. Resolving these limitations will require significant changes to the font shaping system; supporting fonts with broad Unicode coverage (gio#104) was a necessary but insufficient step.

Other GUI toolkits, such as Qt and GTK+, as well as browser rendering engines, all solve this problem using the same software stack, discussed below.

The first piece that we need is a system to convert Go strings (containing runes / unicode codepoints) into a set of glyphs from the font file and their positions. This is by far the most difficult step of proper text rendering, because this is where all of the complexities of human writing systems arise. There is exactly one open-source project that accomplishes this goal: HarfBuzz. At its core, HarfBuzz provides a function called hb_shape that is responsible for translating unicode strings into glyph sequences. This is the library that all of the GUI toolkits use.

It is always preferable to avoid cgo if possible. However, I believe that there is no way that Gio will be able to provide reasonable support for complex scripts without using a cgo wrapper around HarfBuzz. This is because the project is necessarily massive, is constantly receiving fixes and updates, and requires a large and active community to keep it running. These factors are extreme enough that there are essentially no viable alternatives to HarfBuzz in any programming language. Porting it or replicating only the necessary components in Go is a monumental task that is certainly beyond the development resources of Gio, and possibly even beyond the resources of the Go subcommunity that needs this support, and yet it is necessary for supporting complex scripts. Moreover, I believe it is in the best interest of the wider open-source community to concentrate such efforts in the HarfBuzz project. This is why the first step to resolving this issue is writing a cgo wrapper around HarfBuzz (which will be useful not just to Gio, but to other Go projects as well).

The output of a call to HarfBuzz hb_shape is a sequence of glyphs to render from the font. For each glyph, we are given the glyph ID to look up in the font file (HarfBuzz stores this in a field named codepoint), the X and Y advances (i.e., the values to add to the current rendering cursor position after drawing the glyph), and the X and Y offsets (i.e., values to add to the cursor position when placing the glyph, but which should not affect the cursor position itself). HarfBuzz also outputs a set of "clusters" for the string: these are the proper basis for caret positioning and text selection in widget.Editor. The caret position should be based on clusters rather than bytes or runes, because fonts may merge multiple codepoints into individual glyphs.
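To make the advance/offset distinction concrete, here is a minimal pure-Go sketch of how a renderer might consume hb_shape-style output. The Glyph and Placement types and their field names are hypothetical stand-ins for whatever a Go HarfBuzz wrapper would expose, not actual HarfBuzz API:

```go
package main

import "fmt"

// Glyph mirrors the per-glyph data hb_shape reports (names are illustrative,
// not the actual HarfBuzz C field names).
type Glyph struct {
	ID                 uint32 // glyph index to look up in the font
	XAdvance, YAdvance int32  // added to the cursor after drawing
	XOffset, YOffset   int32  // applied only when placing this glyph
	Cluster            uint32 // original string offset this glyph maps back to
}

// Placement is where a glyph should actually be drawn.
type Placement struct {
	ID   uint32
	X, Y int32
}

// place walks the shaped glyphs, applying offsets to the draw position
// but adding only the advances to the running cursor.
func place(glyphs []Glyph) []Placement {
	var x, y int32
	out := make([]Placement, 0, len(glyphs))
	for _, g := range glyphs {
		out = append(out, Placement{ID: g.ID, X: x + g.XOffset, Y: y + g.YOffset})
		x += g.XAdvance
		y += g.YAdvance
	}
	return out
}

func main() {
	glyphs := []Glyph{
		{ID: 10, XAdvance: 500},
		{ID: 11, XAdvance: 450, XOffset: -30}, // e.g. a mark nudged left
	}
	fmt.Println(place(glyphs)) // [{10 0 0} {11 470 0}]
}
```

Note that the second glyph is drawn at x=470 (cursor 500 plus offset −30), but the cursor itself moves only by the advances; that asymmetry is the whole point of the offset fields.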

HarfBuzz requires several pieces of information as input: the string to shape, the script (e.g., Cyrillic), the language (e.g., Russian), the direction (e.g., left-to-right), the font file, and the shaper model (e.g., OpenType shaper). All of this information must be constant for the whole string. This means that rendering something like an Arabic passage with a left-to-right English proper noun in the middle will require multiple calls to HarfBuzz. Additionally, toolkits like Gio want to expose a simple function like LayoutString from text.Shaper to the developer without requiring them to specify things like the language that the string is written in. This means that we need another component that takes an arbitrary string as input, and produces a sequence of HarfBuzz hb_shape calls as output. In other toolkits, this is the job of the Pango library from the GNOME project.

In contrast to HarfBuzz, the relevant algorithms from Pango are small and simple enough to rewrite in pure Go, either as part of Gio or as part of a separate project. Pango supports a lot of extra functionality that we do not need, and the pieces that we do need are buried within a lot of C code that is unnecessary in Go. To replicate the relevant Pango functionality (breaking a string into "runs" such that each "run" will be fed into a single HarfBuzz hb_shape call), we'll need to implement an algorithm like this:

  1. Break up the string into runs with the same text direction (either left-to-right or right-to-left) using the Unicode Bidirectional Algorithm. This algorithm is fully specified in Unicode Standard Annex #9. Strangely, although it is implemented in the Go extended library (golang.org/x/text/unicode/bidi), the code is in unexported functions, and the exported functions all panic with "unimplemented". Implementing this part will likely involve copying code from the unexported functions combined with new code implemented to follow the Unicode Annex.
  2. Within each run, break the string into further runs based on when the script changes. Pango does this by simply checking the unicode ranges for each codepoint to figure out which script it belongs to, and starting a new run whenever that changes. The information in the Go standard library (the RangeTable values in unicode) is sufficient for this.
  3. For each run, we now have the direction and script; we still need the language. In general, there is no algorithm for this. Pango does its best with the following heuristic (steps 4-6 below), which is clearly not great, but it is good enough to have been adopted by all of the GUI toolkits:
  4. Scan the languages installed on the system (e.g., specified in environment variables PANGO_LANGUAGE or LANGUAGE) in order of priority. For each language, check to see if the script can be used to write that language. If it can, then return that language.
  5. Otherwise, look up a "representative language" for the script from a hard-coded table (using the pango_script_get_sample_language function). If a representative language is defined, then return it.
  6. Otherwise, return the default language for the locale of the process (whether or not that makes sense).
  7. Set the shaper model to OpenType (we probably don't need to support other formats) and set the font file based on the current Gio theme. Note that for OpenType Collections like the one we used to fix gio#104, HarfBuzz requires a specific font index from the collection to use (it cannot scan the collection on its own). This means that we'll need to scan the collection to find the font index that supports the script for the run; we should probably scan the font data and cache this information when the collection is first loaded.
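As a rough illustration of step 2 (and of why the RangeTable data in the standard library suffices), here is a hedged pure-Go sketch that splits a string into script runs. This toy version only knows a handful of scripts and does not resolve Common/Inherited characters to the surrounding script the way Pango does; a real implementation would consult all of unicode.Scripts:

```go
package main

import (
	"fmt"
	"unicode"
)

// scriptOf classifies a rune against a few RangeTables from the unicode
// package. A real implementation would cover every script and handle
// Common/Inherited runes (spaces, punctuation, combining marks) sensibly.
func scriptOf(r rune) string {
	switch {
	case unicode.Is(unicode.Latin, r):
		return "Latin"
	case unicode.Is(unicode.Cyrillic, r):
		return "Cyrillic"
	case unicode.Is(unicode.Arabic, r):
		return "Arabic"
	case unicode.Is(unicode.Han, r):
		return "Han"
	default:
		return "Common"
	}
}

// Run is a half-open byte range [Start, End) of the string sharing one script.
type Run struct {
	Start, End int
	Script     string
}

// splitScriptRuns starts a new run whenever the script changes.
func splitScriptRuns(s string) []Run {
	var runs []Run
	for i, r := range s {
		sc := scriptOf(r)
		if len(runs) > 0 && runs[len(runs)-1].Script == sc {
			continue
		}
		if len(runs) > 0 {
			runs[len(runs)-1].End = i
		}
		runs = append(runs, Run{Start: i, Script: sc})
	}
	if len(runs) > 0 {
		runs[len(runs)-1].End = len(s)
	}
	return runs
}

func main() {
	// Mixed Latin and Cyrillic text produces two runs.
	fmt.Println(splitScriptRuns("abcабв"))
}
```

Each resulting run would then be combined with a direction (from step 1) and a language guess (steps 3-6) to form one hb_shape call.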

To summarize, my suggested path to supporting complex scripts in Gio is:

  1. Write a cgo wrapper around HarfBuzz.
  2. Implement the above algorithm to convert strings into runs.
  3. Update Gio's font rendering to: convert the string to runs, feed the runs into HarfBuzz, and then render the glyphs with the existing rendering system.
  4. Update the text package as necessary to remove implicit shaping assumptions that are only valid for non-complex scripts.
  5. Update widget.Editor to base the caret position on clusters output by HarfBuzz, instead of runes.
Status
REPORTED
Submitter
~tainted-bit
Assigned to
No-one
Submitted
3 months ago
Updated
2 months ago
Labels
No labels applied.

~whereswaldon 3 months ago

I just want to say thank you for taking the time to assemble this information! I think this is an excellent summary of the work that needs doing, and it's sufficiently clear that I even think people previously ignorant of these details (like me) can reasonably help implement/review parts of it. I think this looks super reasonable, though my opinion in this area isn't worth all that much.

~sbinet 3 months ago

interesting.

I knew of harfbuzz (having seen it in compilation logs of chromium and all) but didn't really know what it was.

it's only tangentially related to this issue, I think, but let me mention it. in the context of scientific text editing, it would be great to also properly handle math equations and symbols. for that, I believe the gold standard is LaTeX. (interestingly, there are projects that leverage HarfBuzz inside TeX engines)

I've started a little reimplementation of the math TeX parser, TeX boxing model and a renderer here:

(because of the current state of golang.org/x/image/font/sfnt, I couldn't make use of nicer (mathematical) fonts than gofont, even if I did package the "latex" fonts over there:

anyways, thanks for putting all of this together. I'll keep an eye on it.

-s


~eliasnaur 3 months ago

Wonderful writeup. I've known that Gio's text layout algorithm was too naïve since the inception, but held off implementing something better because of the effort required. Thanks for working on this difficult yet important problem.

Like the others responding to this issue, I have very little knowledge in this area. I have a few superficial comments and a main concern below.

On Wed Jul 8, 2020 at 21:45, ~tainted-bit wrote:

It is always preferable to avoid cgo if possible. However, I believe that there is no way that Gio will be able to provide reasonable support for complex scripts without using a cgo wrapper around HarfBuzz. This is because the project is necessarily massive, is constantly receiving fixes and updates, and requires a large and active community to keep it running. These factors are extreme enough that there are essentially no viable alternatives to HarfBuzz in any programming language. Porting it or replicating only the necessary components in Go is a monumental task that is certainly beyond the development resources of Gio, and possibly even beyond the resources of the Go subcommunity that needs this support, and yet it is necessary for supporting complex scripts. Moreover, I believe it is in the best interest of the wider open-source community to concentrate such efforts in the HarfBuzz project. This is why the first step to resolving this issue is writing a cgo wrapper around H arfBuzz (which will be useful not just to Gio, but to other Go projects as well).

You're probably right about the Gio community, but I wouldn't put it beyond the larger Go community to create a HarfBuzz alternative. The Rustaceans did it:

https://yeslogic.com/blog/allsorts-rust-font-shaping-engine.html
https://github.com/yeslogic/allsorts

  1. Break up the string into runs with the same text direction (either left-to-right or right-to-left) using the Unicode Bidirectional Algorithm. This algorithm is fully specified in Unicode Standard Annex #9. Strangely, although it is implemented in the Go extended library (golang.org/x/text/unicode/bidi), the code is in unexported functions, and the exported functions all panic with "unimplemented". Implementing this part will likely involve copying code from the unexported functions combined with new code implemented to follow the Unicode Annex.

The Go team's code isn't compatible with the UNLICENSE, so straight copying is not going to work.

Have you considered the other way round: working through the issues with the Go team and exporting their implementations? It may take longer, but the result would be reviewed by experts and more widely useful. I'd love for other Go toolkits such as Fyne to have great text layout as well.

  2. Within each run, break the string into further runs based on when the script changes. Pango does this by simply checking the unicode ranges for each codepoint to figure out which script it belongs to, and starting a new run whenever that changes. The information in the Go standard library (the RangeTable values in unicode) is sufficient for this.

Great.

  3. For each run, we now have the direction and script. We now need the language. In general, there is no algorithm for this. Pango does its best by using this algorithm (which is clearly not great, but it is good enough to be adopted by all of the GUI toolkits):
  4. Scan the languages installed on the system (e.g., specified in environment variables PANGO_LANGUAGE or LANGUAGE) in order of priority. For each language, check to see if the script can be used to write that language. If it can, then return that language.

Sounds very brittle, and unlikely to work on locked down systems such as mobiles. Not to mention slow.

  5. Otherwise, look up a "representative language" for the script from a hard-coded table (using the pango_script_get_sample_language function). If a representative language is defined, then return it.

This is the only somewhat reasonable approach.

  6. Otherwise, return the default language for the locale of the process (whether or not that makes sense).

It's really not great that the local machine's language(s) influence how text is laid out.

I suspect Pango goes to a great deal of trouble because it wants to be useful in every situation, no matter how little information it has.

It would be OK for Gio to require language hints from the programmer to properly support ambiguous scripts.

  7. Set the shaper model to OpenType (we probably don't need to support other formats) and set the font file based on the current Gio theme. Note that for OpenType Collections like the one we used to fix gio#104, HarfBuzz requires a specific font index from the collection to use (it cannot scan the collection on its own). This means that we'll need to scan the collection to find the font index that supports the script for the run; we should probably scan the font data and cache this information when the collection is first loaded.

To summarize, my suggested path to supporting complex scripts in Gio is:

  1. Write a cgo wrapper around HarfBuzz.

Will that package require HarfBuzz installed on the user's system beforehand?

  2. Implement the above algorithm to convert strings into runs.
  3. Update Gio's font rendering to: convert the string to runs, feed the runs into HarfBuzz, and then render the glyphs with the existing rendering system.
  4. Update the text package as necessary to remove implicit shaping assumptions that are only valid for non-complex scripts.
  5. Update widget.Editor to base the caret position on clusters output by HarfBuzz, instead of runes.

Good plan.

My main worry is the hard C++ and Cgo dependency on HarfBuzz. I believe a killer feature of Gio is the ability to generate GUI binaries with minimal dependencies at runtime and build time. In fact, my longer term goal is for users to not need any dependencies outside a Go toolchain.

(One approach is .syso files:

https://github.com/golang/go/wiki/GcToolchainTricks#c-code-without-cgo

Another potential approach is pre-generated Cgo:

https://github.com/golang/go/issues/38917

)

FWIW, that goal is why Gio has a custom renderer instead of something like Skia, and why package app is an optional (yet mostly unavoidable) dependency.

I would like the text shaper dependency isolated or eliminated as well. Some ideas:

  • Keep the naïve Gio text shaper, update its output to match the advanced interfaces from the go-harfbuzz package. Then, let main programs decide which shaper to use (by importing .../harfbuzz and instantiating a concrete shaper).

This option is why text.Shaper is an interface and not a concrete type.

Downside is that in practice, most general purpose programs will need the harfbuzz package, complicating their build.

  • Build HarfBuzz as a static .a library in a go-harfbuzz repository to minimize dependencies on additional toolchains (C++). Unfortunately, I suspect linking C++ is too complicated for the #38917 proposal to be realistic. I also worry about the size of the C++ runtime.

  • Like the above, but build HarfBuzz as a .syso file. Quite possibly unfeasible.

  • Translate the HarfBuzz rules to Go. Only feasible if most of HarfBuzz's algorithms are table-driven or regular enough for automatic translation.

  • Pre-generate the complex layout rules and add them to font data.

While researching this issue, I stumbled upon Graphite (SIL):

https://en.wikipedia.org/wiki/Graphite_(SIL)

Apple's AAT seems similar:

https://en.wikipedia.org/wiki/Apple_Advanced_Typography

The idea is that instead of putting all the complexity into the font shaper, the shaping rules are added to the font files themselves. This is viable because Gio doesn't need to be able to handle arbitrary TTF font files at runtime, unlike, say, a browser. We can require fonts in whatever format we like.

In other words, is it feasible to have a tool that takes a font and pre-generates the difficult rules and then have a simpler runtime layout algorithm in Go (the tool is run once per font and may have arbitrary C++ dependencies)? Alternatively, can we get away with TTF fonts for latin scripts and require SIL or AAT for the complex ones (assuming a Go Graphite/AAT shaper is feasible)?

Elias

~tainted-bit 3 months ago

Thanks all for the kind words and Elias for the thoughtful analysis. In the interest of full disclosure, I am also not an expert in typography (my day job is cryptography), and most of these topics are things I've only learned in the past week; I'm certain that I remain ignorant of much. Apologies if I accidentally misrepresented my knowledge level. Additionally, beyond gio#104 this feature is in "nice-to-have" territory for my use-case, and thus I don't really have the institutional backing to heavily contribute to a resolution. I created this issue because I think it's an important feature to track, and because I felt that I could provide a general roadmap for others who may be in a better position to contribute.

The Rustaceans did it

I had not heard of the Allsorts project. It looks quite promising! I hope that the Go community could eventually create something similar, because it would be very useful to many projects.

Have you considered the other way round: working through the issues with the Go team and export their implementations? It may take longer, but the result will be reviewed by experts and more widely useful. I'd love for other Go toolkits such as Fyne to have great text layout as well.

That is definitely a possible approach. I'm quite curious about why the package is in its current state and what might be holding it back. If it is because there's a problem with the API design, that would be useful information to know before starting a similar effort. I personally cannot contribute code to the Go project itself, but maybe somebody without these constraints would be interested in solving the mystery and completing the package. Luckily we have the algorithm fully specified in the Unicode Annex to fall back on if needed.

It's really not great that the local machine's language(s) influence how text is laid out. I suspect Pango goes to a great deal of trouble because it wants to be useful in every situation, no matter how little information it has. It would be OK for Gio to require language hints from the programmer to properly support ambiguous scripts.

I agree. I went looking for this algorithm in Pango because it was not clear to me how it would be possible to provide the language to HarfBuzz at all, and I was rather surprised to discover how Pango does it. Altering the text shaper based on system settings seems unexpected and undesirable. There are certainly cases where the developer knows the language that a string is written in, so it would be nice to have a version of LayoutString and friends that allows the language to be specified. Getting this right will be tricky for strings that contain multiple languages, though. I can imagine that being done either by passing a slice of string offsets and languages, or by having the app developer call the layout functions for each substring (which would be close to bubbling responsibility for forming HarfBuzz runs up to the app).
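One hypothetical shape for the offsets-plus-languages option mentioned above might look like the following sketch. The type and function names here are invented purely for illustration and are not a proposal for Gio's actual API:

```go
package main

import "fmt"

// LangSpan tags a half-open byte range [Start, End) of a string with a
// BCP 47 language tag. Hypothetical type, not part of Gio's text package.
type LangSpan struct {
	Start, End int
	Lang       string // e.g. "ar", "en"
}

// Piece pairs a substring with the language the caller declared for it.
type Piece struct {
	Text, Lang string
}

// splitByLang cuts s into (substring, language) pieces that could each be
// handed to the run-forming stage with the language already known, skipping
// the language-guessing heuristic entirely.
func splitByLang(s string, spans []LangSpan) []Piece {
	out := make([]Piece, 0, len(spans))
	for _, sp := range spans {
		out = append(out, Piece{Text: s[sp.Start:sp.End], Lang: sp.Lang})
	}
	return out
}

func main() {
	// The caller declares that the first 6 bytes are English and the
	// remaining 6 bytes (Cyrillic, 2 bytes per rune) are Russian.
	pieces := splitByLang("hello мир", []LangSpan{{0, 6, "en"}, {6, 12, "ru"}})
	fmt.Println(pieces)
}
```

The best-effort guessing path would still exist for strings without spans (e.g. user input); the span-based path would simply bypass it when the app knows better.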

In any case we'll still probably need a function that does best-effort guessing of languages, because there are a lot of cases in which the app also won't know what language a string is written in (particularly in the case of user input). In such cases, the most sensible approach might be to guess using the "representative language" approach in a more generous way than Pango. This would mean that we'd have failure cases like always guessing that Han characters are Chinese, thereby possibly rendering Japanese text incorrectly. These cases are more forgivable when the solution is for the app developer to explicitly specify the language with the other function.

I'm not sure of a better approach, because it seems that in the general case, using the HarfBuzz API essentially requires solving the Natural Language Processing problem, which will not be happening anytime soon (and certainly not at real-time speeds in any case).

My main worry is the hard C++ and Cgo dependency on HarfBuzz.

Yes, I share this worry. It seems to me that the best approach in the short term would be your first suggestion of a cgo package depending on HarfBuzz (which would need to be installed in the development environment at least) that could optionally be swapped out for the default shaper. This is something that would work today, but brings along all of the associated disadvantages. I don't know enough about the #38917 and .syso approaches to know if those would work, but they do sound preferable if they are possible. Depending on how long this issue remains open, it might make sense to start with the optional cgo package and later release another package based on the more advanced methods if they become viable.
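The "optionally swapped out" idea could look something like the following Go sketch: a shaper interface with a pure-Go default that a cgo-backed HarfBuzz implementation could replace behind a build tag. All names here are hypothetical and the glyph model is deliberately oversimplified.

```go
package main

import "fmt"

// Glyph is a minimal stand-in for a positioned glyph. A real shaper
// output would carry cluster indices, offsets, and more.
type Glyph struct {
	ID       uint32
	XAdvance int
}

// Shaper is the hypothetical swap point: the default build uses a simple
// pure-Go shaper, while a file guarded by a build tag (e.g. harfbuzz)
// could assign a cgo-backed implementation instead.
type Shaper interface {
	Shape(text string) []Glyph
}

// defaultShaper stands in for a simple non-complex-script shaper:
// one glyph per rune, fixed advance. Real shaping is far more involved.
type defaultShaper struct{}

func (defaultShaper) Shape(text string) []Glyph {
	glyphs := make([]Glyph, 0, len(text))
	for _, r := range text {
		glyphs = append(glyphs, Glyph{ID: uint32(r), XAdvance: 10})
	}
	return glyphs
}

// A build-tagged file could overwrite this with a HarfBuzz-backed shaper.
var activeShaper Shaper = defaultShaper{}

func main() {
	fmt.Println(len(activeShaper.Shape("abc")))
}
```

The key property is that code calling activeShaper never needs to know whether cgo is in play, so the HarfBuzz dependency stays opt-in.
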

It seems to me that translating the HarfBuzz rules into Go would require the same infrastructure as embedding the rules in a Graphite font. I don't see how the rules could be reasonably extracted from HarfBuzz without a code analysis in either case, and so I'd lean toward the Go translation as a better option. This would also be preferable to using cgo in my opinion, if it is possible. After browsing the HarfBuzz sources for a while, I see that there are many table-based rules and code to generate these tables from Unicode-provided text files, but there is also a significant amount of logic in the code that gives me doubts about an automated translation. Still, it is worth investigating more deeply.

~joe-getcouragenow 3 months ago

The tables are perhaps extractable.

https://github.com/grisha/hbshape/blob/master/hbshape.go

Then call the hbshape command.

I agree that doing this together with the Go team would benefit everyone.

https://github.com/golang/go/issues/29528

  • Latin only :(

~tainted-bit 3 months ago

To summarize, the "Write a cgo wrapper around HarfBuzz" step in the original issue has several possible alternatives:

  • Write a cgo wrapper around HarfBuzz.
  • Link to a pre-compiled HarfBuzz using #38917 or the .syso method.
  • Convert the HarfBuzz shaping rules into native Go code, or a Graphite font (or some other encoding).

The github.com/grisha/hbshape package is a simplified form of the original suggestion: a cgo binding around HarfBuzz. If we take that approach, it would probably be best to expose more of the hb_shape functionality and write more code to handle buffer lifetimes. This is the easiest path forward, requiring only a few hundred lines of Go code at most and almost no maintenance. The main downside is that it requires a system dependency for linking with HarfBuzz, which we'd ideally like to avoid since it compromises Gio's "killer feature".

The two harder paths forward are: get the #38917 or .syso approaches to work (would require a lot of C++ linking knowledge and Go toolchain hacking, if it's possible at all) so that we can distribute pre-compiled versions of HarfBuzz, or programmatically translate the HarfBuzz algorithms into Go or Graphite rules (would require experience with static analysis, lots of tweaking, and manually porting some parts).

To give some more context around this last option, here's a rough breakdown of the HarfBuzz control flow based on my cursory observations:

  1. The main entry point is hb_shape_full in hb-shape.cc. The first thing that this does is initialize a shape plan based on the buffer properties. More on this later. [Code link]
  2. When the buffer is configured to use OpenType fonts (which is probably what we'll do), hb_shape_full eventually calls hb_ot_shape_internal in hb-ot-shape.cc with the shape plan. hb_ot_shape_internal contains the high-level logic for shaping. [Code link]
  3. Multiple shaping operations make use of the shape plan: masking, replacement characters, complex positioning, and more. In order to do complex script positioning, the plan is initialized (and cached) with table data based on the buffer's script back in the first step. The relevant part of the call stack for that is hb_ot_shape_complex_categorize in hb-ot-shape-complex.hh. [Code link]. There you can find a script-based switch that chooses a complex script shaper.
  4. Complex script shapers are stored in files called hb-ot-shape-complex-*. There are quite a few of them. As an example of how they work, look at hb-ot-shape-complex-arabic.cc. [Code link]. The main thing to note here is that complex shapers have a lot of power. They can specify font "features" to load (these are four-character OpenType feature tags, such as 'init' or 'liga', that select substitution and positioning rules in the font's GSUB and GPOS tables) in order to implement their responsibilities, which can be seen in struct hb_ot_complex_shaper_t in hb-ot-shape-complex.hh. [Code link]. They can use arbitrary C++ code to implement these features; looking at the Arabic shaper reveals that this logic can be quite complex in practice.
  5. The complex shaper implementations often import data tables from unicode. For example, hb-ot-shape-complex-arabic-table.hh contains information about glyph joining and positioning. [Code link].
  6. The table data header files are generated using small Python programs that read unicode standard text files. These programs are stored in files named gen-*.py. For Arabic, the table generator is gen-arabic-table.py. [Code link].
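As a rough illustration of step 3, the script-based switch in hb_ot_shape_complex_categorize could be mirrored in Go along these lines. The script tags and shaper set shown are a small illustrative subset, not a copy of HarfBuzz's real categorization logic.

```go
package main

import "fmt"

// categorizeShaper is a toy mirror of hb_ot_shape_complex_categorize:
// a switch on the buffer's (ISO 15924) script selects a complex shaper.
// The real function also consults OpenType script tags and has many more
// cases; this subset is illustrative only.
func categorizeShaper(script string) string {
	switch script {
	case "Arab", "Syrc": // Arabic, Syriac: context-dependent joining
		return "arabic"
	case "Hang": // Hangul: jamo composition
		return "hangul"
	case "Deva", "Beng", "Taml": // Indic scripts: glyph reordering
		return "indic"
	case "Thai", "Laoo":
		return "thai"
	default:
		return "default" // simple scripts need no complex shaper
	}
}

func main() {
	fmt.Println(categorizeShaper("Arab"))
	fmt.Println(categorizeShaper("Latn"))
}
```
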

The main takeaway from this for me is that HarfBuzz is complicated! Moreover, most of the source files referenced above were updated between a few hours and a few days ago at the time of this writing; this is what I mean by the project being high maintenance. This is why I think that the only reasonable approach to transcoding the rules (e.g., into Go, Graphite, or something else) is an automated translation; otherwise the resources required to essentially duplicate the effort would be substantial. Unfortunately, as the summary above points out, this will require static analysis of C++ code, which is certainly non-trivial. Extracting the rules through dynamic analysis alone (such as via a cgo wrapper) is not possible, since the shaper logic is only exercised for the particular inputs you feed it. The major advantage of the automated translation approach would be that we wouldn't need to link with C++ code at all, while still having access to the latest text shaping features.

Hopefully that gives a good starting point for anyone who wants to explore some of these methods.

~whereswaldon 3 months ago

This might be a bad idea (I lack the Rust knowledge to evaluate the feasibility), but could we build the Rust implementation as a C archive and link to it with cgo? The benefit is primarily avoiding the complexity of HarfBuzz's build system (though memory safety is also nice). Or, perhaps, could we automate a Rust => Go transformation? I basically wonder whether we can leverage the work that the Rustaceans have already put in to accomplish this in a newer generation of systems language.

~joe-getcouragenow 3 months ago

Do you mean linking to the Rust port instead of the HarfBuzz C++ code?

What about trying out https://github.com/elliotchance/c2go ?

The TinyGo compiler takes the same approach of using Clang to produce the AST.

There may even be some crossover to leverage between TinyGo and c2go. For example, TinyGo's tooling might be reusable to make the c2go tool easier to work with.

HarfBuzz is quite complicated, but it's probably worth a try, if only to rule it out before going down the cgo or static-linking path.

~joe-getcouragenow 2 months ago

The SQLite tool that translates C to Go has now reached beta and actually works. This transpiler is a likely candidate for doing the same with HarfBuzz.

https://www.reddit.com/r/golang/comments/hyenjh/cgofree_sqlite_databasesql_driver_for_linuxamd64/

Here is the generator: https://gitlab.com/cznic/sqlite/-/blob/master/generator.go
