hi - i'm getting frequent segfaults/errors that crash my haskell executable, and i can't even catch them with Control.Exception.try. i'm reading ~30 similarly formatted pdfs; the haskell pdftotext function reads each with a ~70% success rate, but ~30% of the time it segfaults or reports another error, neither of which can be caught afaict. poppler's command-line pdftotext works fine on them, as does https://hackage.haskell.org/package/pdf-toolbox-[core|content|document]
code (needs ScopedTypeVariables; ||| is from Control.Arrow): txt <- (\(e::SomeException) -> print e >> ... {- never occurs -}) ||| pure =<< try (pdftotext Physical <$> fromJust <$> openFile f)
results: succeeds ~70% of the time
once crashed with: Segmentation fault: 11
once i got:
  poppler/error: Embedded font file may be invalid
  Segmentation fault: 11
often crashes with: libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument
if i don't use try, it crashes with this:
  poppler/error: font resource is not a dictionary
  poppler/error: font resource is not a dictionary
  poppler/error: font resource is not a dictionary
  covid-exe(36374,0x10dc82dc0) malloc: *** error for object 0x7f8c29d60027: pointer being freed was not allocated
  covid-exe(36374,0x10dc82dc0) malloc: *** set a breakpoint in malloc_error_break to debug
  Abort trap: 6
here are the pdfs i'm running it on: https://drive.google.com/file/d/15iOoGT3NCdWq9Gw1OavD7rCcuIqwc0eG/view?usp=sharing
osx 10.15.7, stack 2.3.3 (resolver lts-16.13, ghc 8.8.4), poppler 20.11.0
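side note on the try in the snippet above, as a rough sketch only: try catches haskell exceptions raised while the wrapped action runs, so wrapping a lazy result without forcing it catches nothing. the example below uses a hypothetical stand-in, extractText, which is not the real pdftotext api; lazyAttempt and forcedAttempt are likewise names made up for illustration. forcing with evaluate inside try handles haskell exceptions, but it would not help with a segfault or an uncaught c++ exception, since those terminate the process instead of raising a haskell exception.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Maybe (fromJust)

-- hypothetical stand-in for a pure, lazily evaluated extraction step;
-- NOT the pdftotext api, it just throws when forced on Nothing
extractText :: Maybe String -> String
extractText doc = "text of " ++ fromJust doc

-- wraps only the construction of the thunk: nothing is forced inside try,
-- so this always returns Right, even for Nothing
lazyAttempt :: Maybe String -> IO (Either SomeException String)
lazyAttempt doc = try (pure (extractText doc))

-- forces the whole String inside try (evaluate alone only goes to WHNF),
-- so a haskell exception turns into a Left
forcedAttempt :: Maybe String -> IO (Either SomeException String)
forcedAttempt doc = try (evaluate (forceString (extractText doc)))
  where
    forceString s = length s `seq` s

main :: IO ()
main = do
  r1 <- lazyAttempt Nothing
  -- only the constructor is inspected here, so the thunk stays unforced
  putStrLn (either (const "lazy: Left") (const "lazy: Right (unevaluated thunk)") r1)
  r2 <- forcedAttempt Nothing
  putStrLn (either (const "forced: Left") (const "forced: Right") r2)
```

with Nothing as input, the lazy version reports Right and the forced version reports Left, so the handler in the original snippet may never fire even for ordinary haskell exceptions unless the result is forced inside try.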