A fair amount of space is wasted due to duplicate strings in the d:source data. This will need to be addressed before the next release.
Any idea how you would do this? I've thought about how this might work as well. Maybe a dictionary entry that contains the string, with a content hash in the header that points into a string lookup table? I don't know enough about Retro internals to say for sure, but I'm curious.
For words in the base image, I'm manually filling in the source data, so those don't have duplications.
I have a couple of easy options for the others:
- set up a table (or linked list?) of source filenames & hashes, and point the d:source field to the existing entry (or add a new entry if one is not present)
- use the dictionary. In this case I'd have a word class that identifies these strings, and point the d:source field for words to the d:name field of the dictionary entries whose name matches the source filename
I like the second approach in terms of not needing to add another data structure, but it would add some visual noise to the output of d:words. I'll probably do a prototype of each and see what feels better in practice.
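
For anyone following along, here is a rough sketch of the first option (a table of source filenames that d:source fields point into). It is written in Python rather than Retro purely to illustrate the idea; the class, the word names, and the filenames are all hypothetical, and it makes no claims about how Retro lays out its headers.

```python
class SourceTable:
    """Interning table: each distinct source filename is stored once."""

    def __init__(self):
        self._names = []   # stored filenames, in insertion order
        self._index = {}   # filename -> position in self._names (hash lookup)

    def intern(self, filename):
        """Return the index for filename, adding it only if not present."""
        idx = self._index.get(filename)
        if idx is None:
            idx = len(self._names)
            self._names.append(filename)
            self._index[filename] = idx
        return idx

    def lookup(self, idx):
        """Return the filename stored at idx."""
        return self._names[idx]

    def __len__(self):
        return len(self._names)


table = SourceTable()

# Hypothetical (word name, source filename) pairs, for illustration only.
words = [
    ("word-a", "example-1.forth"),
    ("word-b", "example-1.forth"),
    ("word-c", "example-2.forth"),
]

# Each word's "d:source" value is an index into the shared table,
# so a filename used by many words is stored only once.
d_source = {name: table.intern(src) for name, src in words}
```

With this shape, words from the same file share one table entry, and the cost per word drops to a single index (or pointer) instead of a full copy of the string.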