~nickbp/kapiti#27: rkyv serialization is unable to read from redis

From a gdb stack trace against 2085362. rkyv serialization is (only) being used for storing to Redis, and reading back from Redis produces the following segfault.

<a href="/~nickbp/kapiti/5" title="~nickbp/kapiti#5: Implement DNS-over-TLS client">#5</a>  0x0000555555843404 in kapiti::specs::message::_::{{impl}}::deserialize<rkyv::de::deserializers::AllocDeserializer,u8,kapiti::specs::enums_generated::OpCode> (self=0x7fffe8004bff, deserializer=0x7ffff7962558)
    at /home/nick/proj/kapiti/src/specs/message.rs:51
<a href="/~nickbp/kapiti/6" title="~nickbp/kapiti#6: Implement DNS-over-HTTPS client">#6</a>  0x00005555558455c4 in kapiti::specs::message::_::{{impl}}::deserialize<rkyv::de::deserializers::AllocDeserializer> (self=0x7fffe8004bfc, deserializer=0x7ffff7962558) at /home/nick/proj/kapiti/src/specs/message.rs:75
<a href="/~nickbp/kapiti/7" title="~nickbp/kapiti#7: Implement recursive DNS queries">#7</a>  0x0000555555841764 in kapiti::specs::message::_::{{impl}}::deserialize<rkyv::de::deserializers::AllocDeserializer> (self=0x7fffe8004bd0, deserializer=0x7ffff7962558) at /home/nick/proj/kapiti/src/specs/message.rs:13
<a href="/~nickbp/kapiti/8" title="~nickbp/kapiti#8: Implement rate limiter for incoming queries">#8</a>  0x0000555555761d24 in kapiti::redis::RedisCache::fetch (self=0x5555566129b0, request_info=0x555556626e10)
    at /home/nick/proj/kapiti/src/redis.rs:80
<a href="/~nickbp/kapiti/9" title="~nickbp/kapiti#9: Run filter update/parse as a background task">#9</a>  0x000055555594d3f6 in kapiti::resolver::{{impl}}::resolve::{{closure}} ()
    at /home/nick/proj/kapiti/src/resolver.rs:119

Might be straightforward; I haven't looked into it yet. Since the trace mentions OpCode specifically, maybe it's caused by the recent switch from u8 to an enum there? The Redis instance might just be holding records from before the change.

This may be a good time to introduce src/specs/* schema hashes into the Redis access; see the TODOs in redis.rs.

Status: RESOLVED FIXED
Submitter: ~nickbp
Assigned to: No-one
Submitted: 6 months ago
Updated: 6 months ago
Labels: No labels applied.

~nickbp 6 months ago

Fixed the segfault by enabling rkyv validation: https://gitlab.com/kapiti/kapiti/-/commit/2012055df29eb8631d416c6c33fdb595d9bcabea

It still fails to deserialize cached Redis data, but it no longer segfaults; instead the bad data is ignored for caching purposes.
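
For reference, the checked read path looks roughly like this. This is a minimal sketch against a newer (0.7-style) rkyv API rather than the release in the stack trace, and Cached is a placeholder type, not kapiti's real message struct:

    use bytecheck::CheckBytes;
    use rkyv::{Archive, Deserialize, Serialize};

    // Placeholder standing in for kapiti's real message type.
    #[derive(Archive, Serialize, Deserialize, Debug, PartialEq)]
    // Deriving CheckBytes on the archived type is what enables validation.
    #[archive_attr(derive(CheckBytes))]
    struct Cached {
        is_response: bool,
        op_code: u8,
    }

    fn read_cached(buf: &[u8], pos: usize) -> Option<Cached> {
        // check_archived_value() validates the bytes before any access, so a
        // corrupt cache entry becomes a recoverable Err instead of a segfault.
        let archived = rkyv::check_archived_value::<Cached>(buf, pos).ok()?;
        archived.deserialize(&mut rkyv::Infallible).ok()
    }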

With that fix, the errors now look like:

Apr 03 10:20:12.655  WARN kapiti::redis: Ignoring corrupt redis data at key='kapiti_fbmsg__A__example.com' (ttl=85973): check bytes error: check failed for struct member header: check failed for struct member op_code: check failed for enum tuple variant Enum: check failed for tuple struct member 0: invalid tag for enum: 120

This one may be expected if the Redis data is very old: I switched the op_code field from an int to an IntEnum last night, so maybe there's old long-lived data still in the cache? (Note the large TTL.)

Apr 03 10:20:12.655  WARN kapiti::redis: Ignoring corrupt redis data at key='kapiti_fbmsg__A__example.org' (ttl=437): check bytes error: check failed for struct member header: check failed for struct member is_response: check failed for bool: expected 0 or 1, found 122

This one, meanwhile, looks suspicious and points at a genuine serialization issue: the TTL isn't nearly as long, so it probably isn't a leftover value from yesterday.

~nickbp 6 months ago

Changed the Redis keying to include a hash of src/specs/*, so schema changes should no longer cause cache deserialization errors on upgrades: https://gitlab.com/kapiti/kapiti/-/commit/3fdc33820a65f9c76ee916d949e1404d9b8eee02
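
In sketch form, the keying idea looks something like this (the env var and helper are invented for illustration; this is not the commit's actual code):

    // SPECS_HASH would be derived from the src/specs/* sources, e.g. by
    // build.rs exporting an env var (assumed here). Any schema change then
    // changes every cache key, so stale entries are simply never read again.
    const SPECS_HASH: &str = env!("KAPITI_SPECS_HASH");

    // Hypothetical key builder; kapiti's real key layout may differ.
    fn cache_key(qtype: &str, name: &str) -> String {
        format!("kapiti_{}__{}__{}", SPECS_HASH, qtype, name)
    }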

Even with that change, I'm still seeing the same validation errors listed above, so the problem looks to be in the serialization/deserialization itself rather than in changes to the schema.

Can test the following on redis.set() (both checks are sketched after the list):

  • Try to deserialize immediately after serialize and check that the result can be parsed
  • Try to redis.get() after the set() and check that the output matches the input
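
A sketch of both checks, reusing the Cached placeholder from the earlier sketch and the redis crate for the set/get plumbing (again illustrative, not kapiti's actual code):

    use rkyv::ser::{serializers::AllocSerializer, Serializer};
    use rkyv::Deserialize;

    fn set_with_roundtrip_checks(conn: &mut redis::Connection, key: &str, value: &Cached) {
        let mut serializer = AllocSerializer::<256>::default();
        let pos = serializer.serialize_value(value).expect("serialize failed");
        let bytes = serializer.into_serializer().into_inner();

        // Check 1: deserialize immediately after serializing, compare to input.
        let archived = rkyv::check_archived_value::<Cached>(&bytes, pos).expect("validation failed");
        let roundtrip: Cached = archived.deserialize(&mut rkyv::Infallible).expect("deserialize failed");
        assert_eq!(&roundtrip, value, "serialize/deserialize round trip mismatch");

        // Check 2: redis.get() right after the set(), compare the raw bytes.
        let _: () = redis::cmd("SET").arg(key).arg(bytes.as_slice()).query(conn).expect("SET failed");
        let stored: Vec<u8> = redis::cmd("GET").arg(key).query(conn).expect("GET failed");
        assert_eq!(stored, bytes.as_slice(), "redis get/set byte mismatch");
    }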

Possible causes:

  • Using the serializers wrong?
  • Incorrect annotations on serializable types?
  • Bug in rkyv?

~nickbp REPORTED FIXED 6 months ago

Turns out it was "using the serializers wrong": the offset returned by rkyv when serializing must also be passed back to rkyv when deserializing. I had assumed this only mattered for cases like serializing multiple values into a single buffer and was just setting it to 0, but apparently not.

As a fix, the offset is now stored in Redis as a u32 alongside the data itself and included in deserialize calls. Once the correct non-zero offset was provided, the deserialization errors went away.
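
One possible encoding, in sketch form (the layout and names are invented, again using the 0.7-style API and the Cached placeholder from above; the commit's actual encoding may differ):

    use rkyv::ser::{serializers::AllocSerializer, Serializer};
    use rkyv::Deserialize;

    // Prepend the root position returned by serialize_value() as a
    // little-endian u32 so the reader can hand it back to rkyv.
    fn encode(value: &Cached) -> Vec<u8> {
        let mut serializer = AllocSerializer::<256>::default();
        let pos = serializer.serialize_value(value).expect("serialize failed");
        let bytes = serializer.into_serializer().into_inner();

        let mut out = Vec::with_capacity(4 + bytes.len());
        out.extend_from_slice(&(pos as u32).to_le_bytes());
        out.extend_from_slice(bytes.as_slice());
        out
    }

    fn decode(stored: &[u8]) -> Option<Cached> {
        if stored.len() < 4 {
            return None;
        }
        let (pos_bytes, payload) = stored.split_at(4);
        let pos = u32::from_le_bytes(pos_bytes.try_into().ok()?) as usize;
        // Passing the real offset instead of 0 is what made validation pass.
        // (Real code must also keep `payload` suitably aligned for the
        // archived type; the placeholder struct is align(1), so a slice is ok.)
        let archived = rkyv::check_archived_value::<Cached>(payload, pos).ok()?;
        archived.deserialize(&mut rkyv::Infallible).ok()
    }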
