~ols

sheffield, england

https://ols.wtf

Trackers

~ols/veri

Last active 3 years ago

~ols/ols.wtf

Last active 5 years ago

~ols/docs

Last active 6 years ago

#204 GraphQL list archive contains duplicate messages 2 years ago

Comment by ~ols on ~sircmpwn/lists.sr.ht

Further thoughts based on a little digging into how different lists behave:

  • small/personal lists duplicate each message only once
  • larger lists like ~sircmpwn/free-writers-club and ~sircmpwn/sr.ht-ops duplicate each message hundreds of times
  • massive lists like ~sircmpwn/sr.ht-announce duplicate each message thousands of times

Could the number of duplicates be linked to the number of subscribers?

#204 GraphQL list archive contains duplicate messages 2 years ago

Ticket created by ~ols on ~sircmpwn/lists.sr.ht

mbox archives downloaded from the URL provided by the GraphQL API contain multiple copies of the same message.

For example, for ~sircmpwn/free-writers-club:

  • At the time of filing this ticket, there has been one message in the last 30 days
  • The Last 30 Days archive from the lists UI contains one email
  • The Last 30 Days archive from the GraphQL API (https://lists.sr.ht/query/list/4.mbox?since=30) contains 175 copies of the same email

#123 Allow user CSS 3 years ago

Comment by ~ols on ~emersion/gamja

Understood. I will look at making the mobile CSS more accessible as an alternative.

#123 Allow user CSS 3 years ago

Ticket created by ~ols on ~emersion/gamja

Allow users to provide a CSS snippet to style the client. This is aimed at mobile users, who cannot simply install a browser extension to do this.

#8 Security Headers 3 years ago

Ticket created by ~ols on ~ols/veri

#7 Proper cache (make use of HTMLContent) 3 years ago

Ticket created by ~ols on ~ols/veri

#6 Add date scraped 3 years ago

Ticket created by ~ols on ~ols/veri

#5 Custom user agent 4 years ago

Comment by ~ols on ~ols/veri

Currently the User-Agent is blank. It should identify the project and include an abuse and/or contact email address. It should also be easy to configure, so that self-hosted instances do not all reference the parent project, e.g.

veri-[web|gemini|gopher]-scraper operated by <domain> (<email>)
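
A minimal sketch of how such a string could be built from configuration, assuming a hypothetical `Config` struct (the type, its field names, and the example.org values are illustrative and not taken from veri):

```go
package main

import (
	"fmt"
	"net/http"
)

// Config holds the operator details. The type and field names are hypothetical.
type Config struct {
	Protocol string // "web", "gemini" or "gopher"
	Domain   string // domain of the instance operator
	Email    string // abuse and/or contact address
}

// UserAgent renders the format proposed in this ticket.
func (c Config) UserAgent() string {
	return fmt.Sprintf("veri-%s-scraper operated by %s (%s)", c.Protocol, c.Domain, c.Email)
}

func main() {
	cfg := Config{Protocol: "web", Domain: "example.org", Email: "abuse@example.org"}

	// Attach the header to a plain net/http request.
	req, err := http.NewRequest(http.MethodGet, "https://example.org/", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("User-Agent", cfg.UserAgent())
	fmt.Println(req.Header.Get("User-Agent"))
}
```

For the colly side, the same string could presumably be passed through the `colly.UserAgent` collector option, so each self-hosted instance only needs to set its own domain and email.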

#4 Respect robots.txt 4 years ago

Comment by ~ols on ~ols/veri

The colly library we use to grab URLs from pages respects robots.txt, but the resulting links are passed to go-readability, which has no concept of robots.txt. We either need to check the site's robots.txt ourselves before running go-readability, or give the list of links another pass through colly to filter out the ones that the site's robots.txt disallows.
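
The first option (our own check before go-readability) could look roughly like the sketch below. It is a deliberately simplified, standard-library-only illustration: the function names are hypothetical, agent names are matched exactly, and only `Disallow` prefixes are handled (no `Allow` rules, wildcards or `Crawl-delay`). A real implementation would more likely rely on an existing robots.txt parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// disallowedPrefixes returns the Disallow prefixes that apply to agent,
// falling back to the "*" group when no group names the agent.
func disallowedPrefixes(robots, agent string) []string {
	groups := map[string][]string{}
	var current []string // agents named by the group being read
	inRules := false
	sc := bufio.NewScanner(strings.NewReader(robots))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.Index(line, "#"); i >= 0 {
			line = line[:i] // strip comments
		}
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		key = strings.ToLower(strings.TrimSpace(key))
		value = strings.TrimSpace(value)
		switch key {
		case "user-agent":
			if inRules { // a rule line was seen, so a new group starts here
				current, inRules = nil, false
			}
			name := strings.ToLower(value)
			current = append(current, name)
			if _, seen := groups[name]; !seen {
				groups[name] = nil // remember the group even if it has no rules
			}
		case "disallow":
			inRules = true
			if value == "" {
				continue // an empty Disallow permits everything
			}
			for _, a := range current {
				groups[a] = append(groups[a], value)
			}
		}
	}
	if rules, ok := groups[strings.ToLower(agent)]; ok {
		return rules
	}
	return groups["*"]
}

// filterLinks keeps only the paths that robots permits for agent.
func filterLinks(robots, agent string, paths []string) []string {
	prefixes := disallowedPrefixes(robots, agent)
	var keep []string
	for _, p := range paths {
		blocked := false
		for _, prefix := range prefixes {
			if strings.HasPrefix(p, prefix) {
				blocked = true
				break
			}
		}
		if !blocked {
			keep = append(keep, p)
		}
	}
	return keep
}

func main() {
	robots := "User-agent: *\nDisallow: /private/\n"
	links := []string{"/posts/1", "/private/draft", "/about"}
	fmt.Println(filterLinks(robots, "veri", links))
}
```

Only the links that survive `filterLinks` would then be handed to go-readability.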

#2 Write output to database 4 years ago

Comment by ~ols on ~ols/veri

The database technology is as yet undecided.