git-archive produces stable .tar output, but then hands things off to an external compressor: namely, the gzip binary. Let's compare some different downloads across source hosting platforms:
https://github.com/archlinux/devtools/archive/20190821.tar.gz
https://git.archlinux.org/devtools.git/snapshot/devtools-20190821.tar.gz
https://gitlab.com/eschwartz/devtools/-/archive/20190821/devtools-20190821.tar.gz
https://git.sr.ht/~eschwartz/devtools/archive/20190821.tar.gz
$ sha256sum *devtools-20190821.tar.gz
4557e5db0225db0aab0d26b853907b3308037a05231519a69f7882ee2168b3b3  cgit-devtools-20190821.tar.gz
4557e5db0225db0aab0d26b853907b3308037a05231519a69f7882ee2168b3b3  github-devtools-20190821.tar.gz
4557e5db0225db0aab0d26b853907b3308037a05231519a69f7882ee2168b3b3  gitlab-devtools-20190821.tar.gz
fe222eb819bf0dd410ab6a3201fc196961746e3b2f1866dae5ca5d27142da208  srht-devtools-20190821.tar.gz
One of these tarballs is not like the others! However, the underlying tar is the same.
$ gzip -dk github-devtools-20190821.tar.gz srht-devtools-20190821.tar.gz
$ sha256sum *devtools-20190821.tar
528100dae1d0c2a4747b43b818e6a8776dc66723afcba33e615baac9874eac77  github-devtools-20190821.tar
528100dae1d0c2a4747b43b818e6a8776dc66723afcba33e615baac9874eac77  srht-devtools-20190821.tar
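The same comparison can be wrapped in a small reusable check. This is only a sketch, assuming a POSIX sh with gzip and sha256sum on PATH; the helper names payload_sha256 and same_payload are made up for illustration. It hashes the decompressed stream instead of the .tar.gz, so whatever the compressor did differently drops out:

```shell
#!/bin/sh
# payload_sha256 FILE.gz: print the sha256 of the decompressed contents.
# (hypothetical helper, not part of git, gzip, or any forge)
payload_sha256() {
    gzip -dc "$1" | sha256sum | cut -d' ' -f1
}

# same_payload A.gz B.gz: succeed if both archives decompress to the same bytes,
# regardless of how each one was compressed.
same_payload() {
    [ "$(payload_sha256 "$1")" = "$(payload_sha256 "$2")" ]
}
```

For the files above, something like `same_payload github-devtools-20190821.tar.gz srht-devtools-20190821.tar.gz && echo "same tar"` would be the equivalent of the manual gzip -dk plus sha256sum steps.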
Seems like sr.ht is hosted on alpine with the gzip binary provided by busybox. ssh'ing into a builds.sr.ht alpine image and using gzip -n on the .tar reveals this busybox build reproduces the same tarball. So this is where git.sr.ht is getting the unusual output.
Does busybox guarantee a stable output? It is certainly not generating the exact same bytes as GNU gzip is.
More worryingly, I cannot generate this output on my Arch Linux laptop. My busybox gzip -n produces the following sha256sum: 4449fda607906c232ba753c9a5b3299ce4b14750aab1ad1da65a3f774df43a8b
It seems that, to at least some extent, the output you get from busybox gzip depends on which version and/or build of busybox you have. Maybe it would be better to require sr.ht to be hosted on a system with a non-busybox build.
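When two gzip builds disagree, one cheap first step is to look at the fixed 10-byte header that RFC 1952 defines (magic bytes, compression method, flags, 4-byte mtime, XFL, OS) before suspecting the compressed body. A rough sketch, assuming head, od, tr, and sed are available; gz_header is a hypothetical helper name:

```shell
#!/bin/sh
# gz_header FILE.gz: print the 10 fixed gzip header bytes as hex, space-separated.
# Layout per RFC 1952: 1f 8b | CM | FLG | MTIME (4 bytes) | XFL | OS
# (hypothetical helper name, for illustration only)
gz_header() {
    head -c 10 "$1" | od -An -tx1 -v | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}
```

Running `gz_header` on two differing .tar.gz files shows whether they already diverge in the header (for example in the XFL or OS byte). If the headers are identical, the difference is in the deflate stream itself.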
This has interesting implications for https://todo.sr.ht/~sircmpwn/git.sr.ht/231, because if the gzip compressor is unreliable it may be better to advise users to sign their sources via git notes --ref=refs/notes/signatures/tar, not tar.gz (of course, an argument could be made that that is more advisable even without this).
Will need to double-check to make sure reality plays out as expected, but the next major.minor busybox release should invalidate the checksums of all existing busybox-generated gzip files, and its output should instead match what GNU gzip creates (where things will hopefully remain permanently).
So it turns out that this is a general issue, and it is also the cause of https://github.com/swaywm/sway/issues/4603
Those are from four different services. The git.sr.ht tarball is self-consistent; it doesn't need to match the others.
The tars above show the status has not changed; srht-devtools-20190821.tar.gz continues to have a checksum of fe222eb819bf0dd410ab6a3201fc196961746e3b2f1866dae5ca5d27142da208
However, if you boot into the alpine/edge image where BusyBox v1.32.0 is installed, busybox gzip -n < devtools-20190821.tar produces the same file as GNU gzip.
See this test case: https://builds.sr.ht/~eschwartz/job/331650
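For anyone wanting to repeat this kind of check locally, here is a minimal sketch. It assumes sha256sum is available and only tries busybox if a busybox binary is on PATH (and that it ships the gzip applet); compare_gzip is a hypothetical name, and this is not the script used in the linked job:

```shell
#!/bin/sh
# compare_gzip FILE: print the sha256 of `gzip -n` output for each compressor found.
# Note: the line labelled "gzip" is whatever gzip comes first on PATH; on Alpine
# that may itself be busybox.
compare_gzip() {
    tarball=$1
    printf '%s  %s\n' "$(gzip -n < "$tarball" | sha256sum | cut -d' ' -f1)" "gzip"
    if command -v busybox >/dev/null 2>&1; then
        printf '%s  %s\n' "$(busybox gzip -n < "$tarball" | sha256sum | cut -d' ' -f1)" "busybox gzip"
    fi
}
```

Calling `compare_gzip devtools-20190821.tar` prints one line per compressor; if the hashes on the two lines agree, that busybox build matches the system gzip for this input.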
So the fix would be to upgrade busybox on the sourcehut server, e.g. by migrating to a newer version of Alpine.
If there is a possibility of matching the other sha256 checksums, and it is (maybe) as trivial as updating the Alpine version (I am not an expert in the setup), then it would be nice if that could be done.
The reason is pretty much that you can fall back to any service if the others are unavailable, without needing a different hash for each service. E.g. srht -> gitlab -> github -> else.
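The fallback idea could look roughly like the sketch below: try each mirror in order and accept the first download whose checksum matches a single expected value. It assumes curl and sha256sum are installed; fetch_verified is a hypothetical helper, not something any of these forges or distro build tools provide.

```shell
#!/bin/sh
# fetch_verified EXPECTED_SHA256 OUTFILE URL...
# Try each URL in order; keep the first download whose sha256 matches.
# (hypothetical helper name, for illustration only)
fetch_verified() {
    expected=$1
    out=$2
    shift 2
    for url in "$@"; do
        if curl -fsSL -o "$out.part" "$url"; then
            actual=$(sha256sum "$out.part" | cut -d' ' -f1)
            if [ "$actual" = "$expected" ]; then
                mv "$out.part" "$out"
                return 0
            fi
        fi
        # Download failed or checksum mismatch: discard and try the next mirror.
        rm -f "$out.part"
    done
    return 1
}

# Example call, using the URLs and the shared checksum quoted earlier in this thread:
# fetch_verified 4557e5db0225db0aab0d26b853907b3308037a05231519a69f7882ee2168b3b3 \
#     devtools-20190821.tar.gz \
#     https://git.sr.ht/~eschwartz/devtools/archive/20190821.tar.gz \
#     https://gitlab.com/eschwartz/devtools/-/archive/20190821/devtools-20190821.tar.gz \
#     https://github.com/archlinux/devtools/archive/20190821.tar.gz
```

This only works with one hash if every forge compresses the same tar to the same bytes, which is exactly what the busybox difference breaks.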
This is now fixed in the production sr.ht instance (which migrated from alpine 3.12 -> 3.13 today and therefore upgraded busybox), so all sourcehut generated archives undergo a one-time change and then should act like comparable archives from any GNU gzip-using software forge going forward.