~mil/mobroute-tickets#83: 
Mobsql: Integrate gtfstidy

The gtfstidy project could be used as a drop-in library in front of the load process for mobsql. On certain archives gtfstidy can reduce load size and improve load speed by quite a large factor. This would likely make using a large number of archives much more feasible, and would solve many issues with respect to aggregate feeds (see #19) once implemented.

The only problem currently is that, as far as I can tell, gtfstidy loads each necessary CSV file fully into memory (https://github.com/patrickbr/gtfstidy/issues/17). As a workaround, gtfstidy could be made optional in the load process (on Transito there could be a checkbox to enable/disable it on load). An alternative would be to fork gtfstidy and implement mmap or something similar.
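A rough sketch of what the optional step described above could look like, assuming a loader that takes an options struct. Everything here is hypothetical: `LoadOptions`, `TidyFunc`, `ImportFunc` and `loadArchive` are illustrative names and not part of mobsql or gtfstidy. The tidy step is passed in as a function, so the sketch makes no claim about how gtfstidy is actually invoked.

```go
package main

import "fmt"

// LoadOptions is a hypothetical options struct; mobsql's real API may differ.
type LoadOptions struct {
	Tidy bool // when true, run the tidy step before importing
}

// TidyFunc takes the path of a GTFS archive and returns the path of a
// cleaned copy. In practice this would wrap gtfstidy (assumption).
type TidyFunc func(archivePath string) (string, error)

// ImportFunc loads the GTFS archive at the given path into the database.
type ImportFunc func(archivePath string) error

// loadArchive runs the tidy step only when requested, then imports
// whichever archive path results.
func loadArchive(path string, opts LoadOptions, tidy TidyFunc, imp ImportFunc) error {
	if opts.Tidy {
		if tidy == nil {
			return fmt.Errorf("tidy requested but no tidy step configured")
		}
		cleaned, err := tidy(path)
		if err != nil {
			return fmt.Errorf("tidying %s: %w", path, err)
		}
		path = cleaned
	}
	return imp(path)
}
```

With `Tidy` left false the archive goes straight to the import, so memory use on large feeds would be unchanged from today; a UI checkbox would only need to set that one field.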

Status
REPORTED
Submitter
~mil
Assigned to
No-one
Submitted
8 months ago
Updated
8 months ago
Labels
mobsql performance

~mil 8 months ago

Would also solve issues such as #68 (duplicate rows in stop_times), where the Mobsql import fails because we validate against the GTFS spec in the SQL schema.

~mil referenced this from #68 8 months ago
