~aw/mygit#34: email threading

parse threads

I think I almost certainly need a SQLite db here. While I'm still just messing around with things, I can do everything in memory.

Status: REPORTED
Submitter: ~aw
Assigned to: No-one
Submitted: 6 months ago
Updated: 6 months ago
Labels: No labels applied.

Johann Galle 6 months ago

On 2021-03-30, Alex Wennerberg wrote:

> parse threads

How do you want to do threading, i.e. how detailed should it be? I think we could reach sourcehut-level threading (which is quite simple) by checking some combination of the Message-ID, In-Reply-To, and References headers. https://cr.yp.to/immhf/thread.html might be helpful.

> I think I almost certainly need a SQLite db here. While I'm still just messing around with things, I can do everything in memory.

With that approach, I'm not sure a database is necessary for what are basically just lists. If old messages are deleted once there are too many, we might be able to do without one. Alternatively, we could generate static HTML and append to it as new messages arrive.
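
For illustration, the header-based approach could be sketched roughly as below. This is only a minimal sketch in Python using the standard library `email` module, not anything from mygit itself; the names `parent_id` and `build_threads` are made up for the example. It assumes In-Reply-To names the direct parent, and falls back to the last entry of References when In-Reply-To is absent.

```python
import re
from collections import defaultdict
from email import message_from_string

# Message-IDs are written as <local@domain>; References may hold several.
MSG_ID = re.compile(r"<[^<>\s]+>")


def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    in_reply_to = MSG_ID.findall(str(msg.get("In-Reply-To", "")))
    if in_reply_to:
        return in_reply_to[0]
    # Fallback (an assumption of this sketch): the last References entry
    # is treated as the direct parent.
    refs = MSG_ID.findall(str(msg.get("References", "")))
    return refs[-1] if refs else None


def build_threads(raw_messages):
    """Group raw RFC 5322 message strings into {root_id: [message_ids]}."""
    parents = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        ids = MSG_ID.findall(str(msg.get("Message-ID", "")))
        if ids:  # skip messages without a usable Message-ID
            parents[ids[0]] = parent_id(msg)

    def root(mid):
        # Walk up while the parent is a message we actually have;
        # `seen` guards against reference cycles.
        seen = set()
        while parents.get(mid) in parents and mid not in seen:
            seen.add(mid)
            mid = parents[mid]
        return mid

    threads = defaultdict(list)
    for mid in parents:
        threads[root(mid)].append(mid)
    return dict(threads)


example_messages = [
    "Message-ID: <a@x>\nSubject: parse threads\n\nfirst\n",
    "Message-ID: <b@x>\nIn-Reply-To: <a@x>\n\nreply\n",
    "Message-ID: <c@x>\nReferences: <a@x> <b@x>\n\nreply to reply\n",
    "Message-ID: <d@x>\nSubject: unrelated\n\nother\n",
]
example_threads = build_threads(example_messages)
```

A message whose parent is missing from the archive simply becomes the root of its own thread here, which keeps the grouping flat, similar in spirit to the simple threading described above.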

~aw 6 months ago

On Thu Apr 1, 2021 at 5:29 AM PDT, Johann Galle wrote:

> On 2021-03-30, Alex Wennerberg wrote:
>
>> parse threads
>
> How do you want to do threading, i.e. how detailed should it be? I think we could reach sourcehut-level threading (which is quite simple) by checking some combination of the Message-ID, In-Reply-To, and References headers. https://cr.yp.to/immhf/thread.html might be helpful.
>
>> I think I almost certainly need a SQLite db here. While I'm still just messing around with things, I can do everything in memory.
>
> With that approach, I'm not sure a database is necessary for what are basically just lists. If old messages are deleted once there are too many, we might be able to do without one. Alternatively, we could generate static HTML and append to it as new messages arrive.

The issue is essentially about indexing. Building the thread without a db would require either reading and parsing every email on each HTTP request, or keeping an in-memory index that maps email IDs to email files. That may actually be good enough, but a SQLite db has potential advantages (plaintext search, etc.).
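
To make the tradeoff concrete, here is a rough sketch of what a SQLite index could look like, written in Python with the standard `sqlite3` module. The table layout and the function names (`open_index`, `add_message`, `thread_paths`) are hypothetical, chosen only for this example. The idea is that each email is parsed once when it arrives, and a thread is later fetched with a single recursive query instead of re-reading every file per request.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index; ':memory:' keeps it in RAM for experiments."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               message_id TEXT PRIMARY KEY,
               parent_id  TEXT,            -- NULL for thread roots
               path       TEXT NOT NULL    -- where the raw email is stored
           )"""
    )
    db.execute(
        "CREATE INDEX IF NOT EXISTS messages_parent ON messages(parent_id)"
    )
    return db


def add_message(db, message_id, parent_id, path):
    """Record one email; called once per message when it is received."""
    db.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?)",
        (message_id, parent_id, path),
    )


def thread_paths(db, root_id):
    """Return file paths of the root message and all of its descendants."""
    rows = db.execute(
        """WITH RECURSIVE thread(message_id, path) AS (
               SELECT message_id, path FROM messages WHERE message_id = ?
               UNION  -- UNION (not UNION ALL) drops repeats, so cycles end
               SELECT m.message_id, m.path
               FROM messages m JOIN thread t ON m.parent_id = t.message_id
           )
           SELECT path FROM thread""",
        (root_id,),
    )
    return [row[0] for row in rows]


db = open_index()
add_message(db, "<a@x>", None, "a.eml")
add_message(db, "<b@x>", "<a@x>", "b.eml")
add_message(db, "<c@x>", "<b@x>", "c.eml")
add_message(db, "<d@x>", None, "d.eml")
```

The same `message_id -> path` mapping could of course live in a plain in-memory dict; the db mostly buys persistence across restarts and a place to add search later.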

BTW I am traveling soon and will be spending much less time on this project for the foreseeable future. If you'd like to continue experimenting, I can give you write access to the repo, or you can fork it.

All the best, Alex
