~sircmpwn/hg.sr.ht#33: 
Generated archives have unstable checksum due to changing timestamps

For reproducibility, downstream needs archives that don't change on every download.

$ curl -sO https://hg.sr.ht/%7Esircmpwn/hg.sr.ht/archive/0.17.0.tar.gz
$ sha256 0.17.0.tar.gz
SHA256 (0.17.0.tar.gz) = 2e08230afef59cda9472516306c7ba2376ba10a0d90672513fb091fc6c0c61f8
$ tar xzf 0.17.0.tar.gz -C before

$ curl -sO https://hg.sr.ht/%7Esircmpwn/hg.sr.ht/archive/0.17.0.tar.gz
$ sha256 0.17.0.tar.gz
SHA256 (0.17.0.tar.gz) = 04e6a3d4313fe333021dc03b88b74b2c220951c23f5c25c5bda11e361879393c
$ tar xzf 0.17.0.tar.gz -C after

$ diff -U8 <(mtree -K sha256 -cip before) <(mtree -K sha256 -cip after)
--- before
+++ after
@@ -1,19 +1,19 @@
 #	   user: holo
 #	machine: raphael.local
-#	   tree: /tmp/before
+#	   tree: /tmp/after
 #	   date: Wed Nov 20 22:53:06 2019
 
 # .
 /set type=file uid=1234 gid=0 mode=0755 nlink=1 flags=none
-.               type=dir nlink=3 time=1574289857.738323359
+.               type=dir nlink=3 time=1574290037.232710825
 
 # ./hg.sr.ht-0.17.0
-hg.sr.ht-0.17.0 type=dir nlink=5 time=1574289857.739530469
+hg.sr.ht-0.17.0 type=dir nlink=5 time=1574290037.233607731
     .hg_archival.txt \
                 mode=0644 size=122 time=1573509795.000000000 \
                 sha256digest=7323b7a7d3ba0752a3c4f74aad5c525640fc4910af99e30c62d1438899739035
     .hgignore   mode=0644 size=151 time=1573509795.000000000 \
                 sha256digest=4e05b8e612cfc5b63606c462f1f1195a192415c45362e5d84adaa33d0ca1efa8
     .hgtags     mode=0644 size=2186 time=1573509795.000000000 \
                 sha256digest=b05a4536b207ec700d18363f482bee811c85620172135151d36a5a649a2fc84d
     LICENSE     mode=0644 size=34520 time=1573509795.000000000 \
@@ -51,30 +51,30 @@ hg.sr.ht-0.17.0 type=dir nlink=5 time=1574289857.73953
     run.py      size=1904 time=1573509795.000000000 \
                 sha256digest=d27b3e49cc4e247a0320edcba85e70900a1d74a5265319ac92d8f59b72bed458
     setup.py    size=2150 time=1573509795.000000000 \
                 sha256digest=b5107e6dec1687f02aa77c14a36ab1fe9f066fef0bf8271680a8039d4d841734
     static      type=link time=1573509795.000000000 link=hgsrht/static/
 
 # ./hg.sr.ht-0.17.0/.builds
 /set type=file uid=1234 gid=0 mode=0644 nlink=1 flags=none
-.builds         type=dir mode=0755 nlink=2 time=1574289857.738408742
+.builds         type=dir mode=0755 nlink=2 time=1574290037.232766081
     alpine.yml  size=1536 time=1573509795.000000000 \
                 sha256digest=3ad1901a4de957607c1bc4415d9d1a03068104a24b059c596aad982aecfc50b9
     archlinux.yml \
                 size=1142 time=1573509795.000000000 \
                 sha256digest=e140d31900cf7bff2fa750fa428a0c37fe575f3343402fed43324b4a07ca005e
     debian.yml  size=1338 time=1573509795.000000000 \
                 sha256digest=05a69aadba06948ef7df6ed19ee46ff8eaaac8a4101cbccdd7ed07577be56583
 # ./hg.sr.ht-0.17.0/.builds
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht
-hgsrht          type=dir mode=0755 nlink=8 time=1574289857.739467868
+hgsrht          type=dir mode=0755 nlink=8 time=1574290037.233568207
     app.py      size=1703 time=1573509795.000000000 \
                 sha256digest=e6f5fd9df4feb76c5cd3eca2ff0fb8e4971284d1a15695d2bac8b421bdeb8440
     hg.py       size=3812 time=1573509795.000000000 \
                 sha256digest=8d265bb2d223ae23c9836fbe8b814168b2ae4d0d98c8470796780b179703c419
     hgwebshim.py \
                 size=1837 time=1573509795.000000000 \
                 sha256digest=fae7788623375336cc8003ebbe78cfaa439fba60e63df61949209474ab9dce4b
     repos.py    size=2980 time=1573509795.000000000 \
@@ -82,25 +82,25 @@ hgsrht          type=dir mode=0755 nlink=8 time=157428
     service.py  size=370 time=1573509795.000000000 \
                 sha256digest=ffd092eb51e6fbe362e1f73363c0a709a5c6bb62ce470e89de482eae18885333
     submit.py   size=3337 time=1573509795.000000000 \
                 sha256digest=36cfc976ce3adf5ea16db46cf93f96e2c96f5320f9f1de242ebfc359e185aca0
     webhooks.py size=414 time=1573509795.000000000 \
                 sha256digest=f5af9a9be16ed8c2741d02e0f41a4258fc136776a9aa9fa0868af96165f15fb8
 
 # ./hg.sr.ht-0.17.0/hgsrht/alembic
-alembic         type=dir mode=0755 nlink=3 time=1574289857.738772507
+alembic         type=dir mode=0755 nlink=3 time=1574290037.233071376
     env.py      size=72 time=1573509795.000000000 \
                 sha256digest=47ccfa69be3a0e4b609a69b8e60e5effaa29cee27f7ef182b5e10b83d1b46e52
     script.py.mako \
                 size=412 time=1573509795.000000000 \
                 sha256digest=0fc905238e3ff6f04966b0184a46710d35f6f92e58fa811eb4477d04a968f52f
 
 # ./hg.sr.ht-0.17.0/hgsrht/alembic/versions
-versions        type=dir mode=0755 nlink=2 time=1574289857.738848287
+versions        type=dir mode=0755 nlink=2 time=1574290037.233133204
     07d78f270a70_add_user_webhook_table.py \
                 size=1815 time=1573509795.000000000 \
                 sha256digest=86e3e7b131a0077759c7830c7f2ecc38cd60e69ff7b40b8698c176cbdb27566e
     43fff2508875_add_source_repo_id_to_repository.py \
                 size=597 time=1573509795.000000000 \
                 sha256digest=aee6864bf3c7f15ec857a9e83096974bccf47ebf81816577a5f5df30885d9fcc
     70bea64c0008_add_ssh_key_table.py \
                 size=697 time=1573509795.000000000 \
@@ -114,53 +114,53 @@ versions        type=dir mode=0755 nlink=2 time=157428
 # ./hg.sr.ht-0.17.0/hgsrht/alembic/versions
 ..
 
 # ./hg.sr.ht-0.17.0/hgsrht/alembic
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht/blueprints
-blueprints      type=dir mode=0755 nlink=2 time=1574289857.738942912
+blueprints      type=dir mode=0755 nlink=2 time=1574290037.233212256
     __init__.py size=0 time=1573509795.000000000 \
                 sha256digest=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
     extramanage.py \
                 size=1687 time=1573509795.000000000 \
                 sha256digest=a0c2f842dc55b1cd19ae98d7f0e71e74b442d6394481ee47dfa36da5326161bb
     internal.py size=3722 time=1573509795.000000000 \
                 sha256digest=eeb78d31902c837b77edc632b8595d42ce5b8cf9bf4237cb4dc0fef73aae9162
     repo.py     size=26143 time=1573509795.000000000 \
                 sha256digest=23a62e479a8deb7d05d8e4469e8465694a06d93fc3f7e755ed945cf943038438
     stats.py    size=1997 time=1573509795.000000000 \
                 sha256digest=6c539115a3e6ae146803019703e9d11ad8b5f33020adfb4e8c3e172c86bd1cfa
 # ./hg.sr.ht-0.17.0/hgsrht/blueprints
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht/hgext
-hgext           type=dir mode=0755 nlink=2 time=1574289857.739109314
+hgext           type=dir mode=0755 nlink=2 time=1574290037.233342107
     __init__.py size=2598 time=1573509795.000000000 \
                 sha256digest=a539459733c5caed8764d7fea0846f91d65ede78d35b5ef4b6daf1996e1c2cf1
 # ./hg.sr.ht-0.17.0/hgsrht/hgext
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht/hgrcs
-hgrcs           type=dir mode=0755 nlink=2 time=1574289857.739148434
+hgrcs           type=dir mode=0755 nlink=2 time=1574290037.233368844
     global.cfg  size=64 time=1573509795.000000000 \
                 sha256digest=9b946f0fe59cf789167a227ac3a5a556e6ba75398134c00cc962dacbc486006c
     nonpublishing.cfg \
                 size=55 time=1573509795.000000000 \
                 sha256digest=6785539811e9c661ecc436be9b4c10feec05e75ba9785af4b402557b2af7b7f8
 # ./hg.sr.ht-0.17.0/hgsrht/hgrcs
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht/templates
-templates       type=dir mode=0755 nlink=2 time=1574289857.739414544
+templates       type=dir mode=0755 nlink=2 time=1574290037.233534144
     bookmarks.html \
                 size=103 time=1573509795.000000000 \
                 sha256digest=f2f4a381b07ed343f495c54319a309be9889615f3c1fcee444500afcce74e0a1
     branches.html \
                 size=103 time=1573509795.000000000 \
                 sha256digest=f2f4a381b07ed343f495c54319a309be9889615f3c1fcee444500afcce74e0a1
     dashboard.html \
                 size=280 time=1573509795.000000000 \
@@ -186,30 +186,30 @@ templates       type=dir mode=0755 nlink=2 time=157428
                 sha256digest=63cedca68a50f2b348ed31ad9e4988cd1745509ce6bcb780c864e8fdeba8ffc8
     tags.html   size=105 time=1573509795.000000000 \
                 sha256digest=f5177a9dd9d0f4ef014ccacb4b4f280f9d54fdb9875b0a4c992ced51850b051f
 # ./hg.sr.ht-0.17.0/hgsrht/templates
 ..
 
 
 # ./hg.sr.ht-0.17.0/hgsrht/types
-types           type=dir mode=0755 nlink=2 time=1574289857.739453373
+types           type=dir mode=0755 nlink=2 time=1574290037.233558622
     __init__.py size=732 time=1573509795.000000000 \
                 sha256digest=410b923ebd6e43ef3cacc1eaba62d225385065d41f09f0b08e67790924c31385
     sshkey.py   size=631 time=1573509795.000000000 \
                 sha256digest=eb212a64002703a8cb2d56bfc17e707aa44a8d1caba5f73b2b90c496490d14cb
 # ./hg.sr.ht-0.17.0/hgsrht/types
 ..
 
 # ./hg.sr.ht-0.17.0/hgsrht
 ..
 
 
 # ./hg.sr.ht-0.17.0/scss
-scss            type=dir mode=0755 nlink=2 time=1574289857.739497039
+scss            type=dir mode=0755 nlink=2 time=1574290037.233589167
     main.scss   size=2987 time=1573509795.000000000 \
                 sha256digest=758d8e58007980a0cd6f5fa66798ea5e55732ff5de1071988162ac6c8cdf9d02
 # ./hg.sr.ht-0.17.0/scss
 ..
 
 # ./hg.sr.ht-0.17.0
 ..
Status
RESOLVED FIXED
Submitter
Jan Beich
Assigned to
No-one
Submitted
5 years ago
Updated
4 years ago
Labels
No labels applied.

~sorrow 4 years ago

From discussion on lists:

Looks like "-n" needs to be added to gzip(1):

 -n, --no-name     This option stops the filename and timestamp from being
                   stored in the output file.
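
As a rough illustration of why the gzip header matters, the sketch below uses Python's standard gzip module rather than gzip(1) or the hg.sr.ht code. The helper name gzip_bytes and the sample file names are hypothetical. It compresses the same payload with different stored names and timestamps, then with an empty name and a zero timestamp, which is approximately what -n gives you.

```python
import gzip
import hashlib
import io


def gzip_bytes(payload: bytes, filename: str, mtime: float) -> bytes:
    """Compress payload, storing the given name and timestamp in the gzip header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


payload = b"identical tar payload\n" * 100

# Same payload, different header name and timestamp: the checksums differ.
a = gzip_bytes(payload, "0.17.0-aaaa.tar", 1574289857)
b = gzip_bytes(payload, "0.17.0-bbbb.tar", 1574290037)

# Empty name and zero timestamp, roughly what `gzip -n` produces.
c = gzip_bytes(payload, "", 0)
d = gzip_bytes(payload, "", 0)

print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())  # False
print(hashlib.sha256(c).hexdigest() == hashlib.sha256(d).hexdigest())  # True
```

The decompressed bytes are the same in all four cases; only the header fields change the checksum.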

~gentoophysicist 4 years ago

Is anything happening with regard to this bug? It makes it impossible to create stable Gentoo packages, since we need reproducible tarballs.

~maribu 4 years ago

I would like to gently remind everyone that this is still an issue for distributions packaging software hosted here. Given that the fix should be relatively simple (adding the -n parameter to the gzip call), it would be nice if this could be addressed.

~nabijaczleweli 4 years ago

Unless I'm more blind than usual, the fix has required and still requires patching the upstream mercurial server, since the actual packaging call is

hg_repo.client.archive(path.encode(),
    rev=rev,
    prefix=basename,
    type="tgz")

and mercurial uses a homebrew archiver (https://www.mercurial-scm.org/repo/hg/file/d42809b6b10f/mercurial/archival.py#l134), which shouldn't be too difficult to fix if you have a repro, which it looks like you do.

~maribu 4 years ago

Thanks for the pointer. I'm pretty unfamiliar with hg as I usually work with git, but to me it looks like setting the mtime parameter in the call to archive() to the timestamp of the last change would fix the issue. If that is true, no change in upstream mercurial would be required.

Maybe someone familiar with the source can quickly add the parameter and check if this fixes the issue?
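
To make the idea concrete without touching Mercurial, here is a minimal sketch that uses only Python's tarfile and gzip modules. It is not the hg archiver and not the hg.sr.ht code; build_archive, the file contents, and the choice of fields to pin are assumptions for illustration. The timestamp 1573509795 is the file time shown in the mtree listing above.

```python
import gzip
import io
import tarfile


def build_archive(files: dict, prefix: str, mtime: int) -> bytes:
    """Build a .tar.gz whose bytes depend only on the inputs (hypothetical helper)."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # Sort names so member order does not depend on dict insertion order.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(f"{prefix}/{name}")
            info.size = len(data)
            info.mtime = mtime  # pinned, e.g. to the changeset time
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    gz_buf = io.BytesIO()
    # Empty name and pinned timestamp keep the gzip header stable too.
    with gzip.GzipFile(filename="", mode="wb", fileobj=gz_buf, mtime=mtime) as gz:
        gz.write(tar_buf.getvalue())
    return gz_buf.getvalue()


files = {"run.py": b"print('hi')\n", "LICENSE": b"example\n"}
first = build_archive(files, "hg.sr.ht-0.17.0", 1573509795)
second = build_archive(files, "hg.sr.ht-0.17.0", 1573509795)
print(first == second)  # True
```

Whether Mercurial's own archive call exposes an equivalent knob at this layer is exactly the open question in this thread; the sketch only shows which fields have to be fixed for the output to be stable.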

~nprescott 4 years ago

I don't believe the fix requires any upstream changes.

It looks to me like the reported timestamp differences reflect the time at which tar was invoked. I can reproduce the behavior with a single tar file extracted to two directories.

$ mkdir /tmp/first
$ mkdir /tmp/second

$ tar xzf 0.17.0.tar.gz -C /tmp/first
$ tar xzf 0.17.0.tar.gz -C /tmp/second

$ mtree -K sha256digest -cip /tmp/first > /tmp/first.txt
$ mtree -K sha256digest -cip /tmp/second > /tmp/second.txt

$ diff /tmp/first.txt /tmp/second.txt
--- /tmp/first.txt	Wed Dec  9 22:33:58 2020
+++ /tmp/second.txt	Wed Dec  9 22:34:13 2020
@@ -1,14 +1,14 @@
-#	   tree: /tmp/first
-#	   date: Wed Dec  9 22:33:58 2020
+#	   tree: /tmp/second
+#	   date: Wed Dec  9 22:34:13 2020
 
 # .
 /set type=file uid=1000 gid=0 mode=0755 nlink=1
-.               type=dir nlink=3 time=1607571153.604606590
+.               type=dir nlink=3 time=1607571182.794670190
 
 # ./hg.sr.ht-0.17.0
-    hg.sr.ht-0.17.0 type=dir nlink=5 time=1607571153.604606590
+    hg.sr.ht-0.17.0 type=dir nlink=5 time=1607571182.794670190
         .hg_archival.txt \
                         mode=0644 size=122 time=0.0 \
                         sha256digest=7323b7a7d3ba0752a3c4f74aad5c525640fc4910af99e30c62d1438899739035
@@ -54,7 +54,7 @@
 # ./hg.sr.ht-0.17.0/.builds
 /set type=file uid=1000 gid=0 mode=0644 nlink=1
         .builds         type=dir mode=0755 nlink=2 \
-                        time=1607571153.574606440
+                        time=1607571182.764701552
             alpine.yml      size=1536 time=0.0 \
                             sha256digest=3ad1901a4de957607c1bc4415d9d1a03068104a24b059c596aad982aecfc50b9
             archlinux.yml   size=1142 time=0.0 \

I think the issue is in how the archive is generated with a randomized name before being sent. The "created with" name is visible on each download (contained in the .gz file) and triggers the checksum mismatch:

$ file 0.17.0.tar.gz 
0.17.0.tar.gz: gzip compressed data, was "0.17.0b'0c7421f703e0a468'.tar", last modified: Mon Nov 11 22:03:15 2019, max compression
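
For anyone without file(1) at hand, the same two header fields can be read directly. This is a small sketch following the gzip header layout in RFC 1952; gzip_header_info is a hypothetical helper, and the sample archive is generated locally with a made-up name rather than downloaded.

```python
import gzip
import io
import struct

FEXTRA = 0x04  # header has an extra field
FNAME = 0x08   # header has a zero-terminated original file name


def gzip_header_info(data: bytes):
    """Return (original_name_or_None, mtime) from a gzip header."""
    magic, _method, flags, mtime, _xfl, _os = struct.unpack("<2sBBIBB", data[:10])
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    pos = 10
    if flags & FEXTRA:
        (xlen,) = struct.unpack("<H", data[pos:pos + 2])
        pos += 2 + xlen
    name = None
    if flags & FNAME:
        end = data.index(b"\x00", pos)
        name = data[pos:end].decode("latin-1")
    return name, mtime


# Build a sample stream with a randomized-looking name (made up for this example).
buf = io.BytesIO()
with gzip.GzipFile(filename="0.17.0-0c7421f703e0a468.tar", mode="wb",
                   fileobj=buf, mtime=1573509795) as gz:
    gz.write(b"payload")

print(gzip_header_info(buf.getvalue()))
# ('0.17.0-0c7421f703e0a468.tar', 1573509795)
```

Running it against two downloads of the same archive should show whether the stored name, the timestamp, or both differ between them.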

I've submitted a patch that I think should fix this:

https://lists.sr.ht/~sircmpwn/sr.ht-dev/patches/15870

~ludovicchabant REPORTED FIXED 4 years ago

Fixed by Nolan in 701795ec3596.
