I've been running Soju for a few months, storing messages in a Postgres database. My install has ~4.6 million messages:
soju=# select count(*) from "Message";
count
---------
4629046
I went through some of the code for retrieving messages from Postgres and found a potential optimization for retrieving the most recent message of a given target.
To be honest, I'm not sure how often the most recent message of a target is retrieved - I haven't gotten around to building a custom soju yet to see how much of an impact this makes - but it gets one query from 5 seconds down to under a second, so it may add up. At some point I'll try patching my copy with this and see how noticeable the performance improvement is.
The current query for finding the most recent message for a given target is:
SELECT m.id FROM "Message" AS m, "MessageTarget" as t
WHERE t.network = $1 AND t.target = $2 AND m.target = t.id
ORDER BY m.time DESC LIMIT 1
This results in a big hash join when "Message" gets large; right now on my install this query takes about 5 seconds. Here's the output of "explain analyze" (I've replaced the target channel with "(my-target)"):
Limit (cost=195466.63..195466.75 rows=1 width=12) (actual
time=4798.091..4849.794 rows=1 loops=1)
-> Gather Merge (cost=195466.63..195842.09 rows=3218 width=12)
(actual time=4753.783..4805.483 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=194466.61..194470.63 rows=1609 width=12)
(actual time=4566.813..4566.829 rows=1 loops=3)
Sort Key: m."time" DESC
Sort Method: top-N heapsort Memory: 25kB
Worker 0: Sort Method: top-N heapsort Memory: 25kB
Worker 1: Sort Method: top-N heapsort Memory: 25kB
-> Hash Join (cost=8.18..194458.56 rows=1609 width=12)
(actual time=31.911..4490.601 rows=181417 loops=3)
Hash Cond: (m.target = t.id)
-> Parallel Seq Scan on "Message" m
(cost=0.00..189365.14 rows=1930414 width=16) (actual
time=0.710..4029.037 rows=1543032 loops=3)
-> Hash (cost=8.17..8.17 rows=1 width=4) (actual
time=28.123..28.129 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using
"MessageTarget_network_target_key" on "MessageTarget" t
(cost=0.15..8.17 rows=1 width=4) (actual time=28.068..28.078 rows=1
loops=3)
Index Cond: ((network = 1) AND (target
= '(my-target)'::text))
Planning Time: 1.594 ms
JIT:
Functions: 37
Options: Inlining false, Optimization false, Expressions true,
Deforming true
Timing: Generation 5.493 ms, Inlining 0.000 ms, Optimization 10.794
ms, Emission 117.567 ms, Total 133.854 ms
Execution Time: 4852.882 ms
I noticed if I manually get the target id the result is incredibly fast:
explain analyze select m.id from "Message" as m where m.target=396097
order by m.time limit 1;
Limit (cost=0.43..1.63 rows=1 width=12) (actual time=0.380..0.381
rows=1 loops=1)
-> Index Scan using "MessageIndex" on "Message" m
(cost=0.43..682062.37 rows=569093 width=12) (actual time=0.371..0.371
rows=1 loops=1)
Index Cond: (target = 396097)
Planning Time: 0.753 ms
Execution Time: 0.448 ms
So, I crafted a query that forces that to happen - it looks up the MessageTarget id as a subquery, turns it into an array, and accesses the first element:
explain analyze select m.id from "Message" as m where
m.target=(array(select t.id from "MessageTarget" as t where t.network=1
and t.target='(my-target)'))[1] order by m.time desc limit 1;
Limit (cost=8.60..11.09 rows=1 width=12) (actual time=0.396..0.398
rows=1 loops=1)
InitPlan 1 (returns $0)
-> Index Scan using "MessageTarget_network_target_key" on
"MessageTarget" t (cost=0.15..8.17 rows=1 width=4) (actual
time=0.188..0.191 rows=1 loops=1)
Index Cond: ((network = 1) AND (target =
'(my-target)'::text))
-> Index Scan Backward using "MessageIndex" on "Message" m
(cost=0.43..499757.62 rows=201438 width=12) (actual time=0.395..0.395
rows=1 loops=1)
Index Cond: (target = ($0)[1])
Planning Time: 0.698 ms
Execution Time: 0.527 ms
This should give the same result as the original but be a lot faster with larger message stores:
SELECT m.id FROM "Message" AS m
WHERE m.target = (ARRAY(
SELECT t.id
FROM "MessageTarget" as t
WHERE t.network = $1 AND t.target = $2
))[1]
ORDER BY m.time DESC LIMIT 1
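As a quick sanity check that the rewrite returns the same row, here's a minimal sketch using SQLite instead of Postgres. SQLite has no ARRAY type, so a plain scalar subquery stands in for the ARRAY(...)[1] trick (which only exists to keep the Postgres planner from flattening the subquery back into a join); the schema and data here are made up for illustration.

```python
# Verify the rewritten query matches the original join form.
# SQLite stand-in: the ARRAY(...)[1] planner trick becomes a plain
# scalar subquery, which is semantically equivalent.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE MessageTarget (id INTEGER PRIMARY KEY, network INTEGER, target TEXT);
CREATE TABLE Message (id INTEGER PRIMARY KEY, target INTEGER, time INTEGER);
INSERT INTO MessageTarget VALUES (1, 1, '#chan'), (2, 1, '#other');
INSERT INTO Message VALUES (10, 1, 100), (11, 1, 300), (12, 2, 200), (13, 1, 250);
""")

# Original form: join Message against MessageTarget, sort, take the newest.
original = db.execute("""
    SELECT m.id FROM Message AS m, MessageTarget AS t
    WHERE t.network = ? AND t.target = ? AND m.target = t.id
    ORDER BY m.time DESC LIMIT 1
""", (1, "#chan")).fetchone()

# Rewritten form: resolve the target id first, then do an indexed
# top-1 lookup on Message.
rewritten = db.execute("""
    SELECT m.id FROM Message AS m
    WHERE m.target = (SELECT t.id FROM MessageTarget AS t
                      WHERE t.network = ? AND t.target = ?)
    ORDER BY m.time DESC LIMIT 1
""", (1, "#chan")).fetchone()

print(original, rewritten)  # both are (11,): the message with the latest time
```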
Hi there, another potential speedup.
The core of the Postgres version of ListMessageLastPerTarget is essentially:
SELECT t.target, MAX(m.time) AS latest FROM "Message" m, "MessageTarget" t WHERE m.target = t.id and t.network = $1 GROUP BY t.target;
Additional constraints are placed on it based on parameters, but that seems to be the main part of the query, and again it takes some time on my system:
GroupAggregate (cost=0.58..163824.50 rows=6 width=40) (actual
time=872.914..6008.324 rows=24 loops=1)
  Group Key: t.target
  -> Nested Loop (cost=0.58..163708.67 rows=23154 width=40) (actual
time=13.871..5188.214 rows=4462643 loops=1)
        -> Index Scan using "MessageTarget_network_target_key" on
"MessageTarget" t (cost=0.15..24.26 rows=6 width=36) (actual
time=0.070..0.198 rows=24 loops=1)
              Index Cond: (network = 1)
        -> Index Only Scan using "MessageIndex" on "Message" m
(cost=0.43..25267.39 rows=201335 width=12) (actual time=0.076..185.870
rows=185943 loops=24)
              Index Cond: (target = t.id)
              Heap Fetches: 772903
Planning Time: 0.933 ms
JIT:
  Functions: 9
  Options: Inlining false, Optimization false, Expressions true,
Deforming true
  Timing: Generation 2.175 ms, Inlining 0.000 ms, Optimization 0.977
ms, Emission 12.658 ms, Total 15.811 ms
Execution Time: 6012.178 ms
(14 rows)
If instead, a left lateral join is used:
SELECT t.target, l.latest FROM "MessageTarget" t
LEFT JOIN LATERAL (
    SELECT m.target, m.time AS latest FROM "Message" m
    WHERE m.target = t.id
    ORDER BY m.time DESC LIMIT 1
) AS l ON t.id = l.target
WHERE t.network = $1;
You get something like this:
Sort (cost=19.17..19.19 rows=6 width=40) (actual time=1.483..1.487
rows=24 loops=1)
  Sort Key: l.latest DESC
  Sort Method: quicksort Memory: 26kB
  -> Nested Loop Left Join (cost=4.63..19.10 rows=6 width=40) (actual
time=0.235..1.387 rows=24 loops=1)
        -> Bitmap Heap Scan on "MessageTarget" t (cost=4.20..13.67
rows=6 width=36) (actual time=0.132..0.147 rows=24 loops=1)
              Recheck Cond: (network = 1)
              Heap Blocks: exact=1
              -> Bitmap Index Scan on "MessageTarget_network_target_key"
(cost=0.00..4.20 rows=6 width=0) (actual time=0.102..0.103 rows=24
loops=1)
                    Index Cond: (network = 1)
        -> Subquery Scan on l (cost=0.43..0.89 rows=1 width=12) (actual
time=0.049..0.049 rows=1 loops=24)
              Filter: (t.id = l.target)
              -> Limit (cost=0.43..0.88 rows=1 width=12) (actual
time=0.047..0.048 rows=1 loops=24)
                    -> Index Only Scan Backward using "MessageIndex" on
"Message" m (cost=0.43..90555.82 rows=201337 width=12) (actual
time=0.046..0.046 rows=1 loops=24)
                          Index Cond: (target = t.id)
                          Heap Fetches: 19
Planning Time: 0.686 ms
Execution Time: 1.648 ms
That cuts the time down from around 5-6 seconds to under 2 ms.
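To check that the per-target top-1 lookup really is equivalent to the GROUP BY MAX(...) aggregate, here's a small sketch, again on SQLite with made-up data. SQLite has no LATERAL, so a correlated scalar subquery stands in for the lateral join; the shape of the work is the same - one indexed top-1 lookup per target instead of scanning every message.

```python
# Compare the aggregate form against a per-target top-1 lookup.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE MessageTarget (id INTEGER PRIMARY KEY, network INTEGER, target TEXT);
CREATE TABLE Message (id INTEGER PRIMARY KEY, target INTEGER, time INTEGER);
CREATE INDEX MessageIndex ON Message (target, time);
INSERT INTO MessageTarget VALUES (1, 1, '#chan'), (2, 1, '#other');
INSERT INTO Message VALUES (10, 1, 100), (11, 1, 300), (12, 2, 200);
""")

# Aggregate form: join everything, then GROUP BY target.
aggregate = db.execute("""
    SELECT t.target, MAX(m.time) AS latest
    FROM Message m, MessageTarget t
    WHERE m.target = t.id AND t.network = ?
    GROUP BY t.target
""", (1,)).fetchall()

# Per-target form: one indexed ORDER BY time DESC LIMIT 1 per target,
# playing the role of the LEFT JOIN LATERAL.
per_target = db.execute("""
    SELECT t.target,
           (SELECT m.time FROM Message m
            WHERE m.target = t.id
            ORDER BY m.time DESC LIMIT 1) AS latest
    FROM MessageTarget t
    WHERE t.network = ?
""", (1,)).fetchall()

print(sorted(aggregate), sorted(per_target))
```

One behavioral difference worth noting: the LEFT JOIN form keeps targets with no messages at all (latest is NULL), while the plain join plus GROUP BY drops them. Whether that matters depends on how ListMessageLastPerTarget's callers treat empty targets.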
Like before, I still need to create a soju build with these changes and verify it actually improves performance.
I implemented these queries and opened a pull request, figured I should link to it from here: https://codeberg.org/emersion/soju/pulls/7