~ghost08/ratt#7: 
ebay-kleinanzeigen internal extractor fails with no html node found for selector: 'span.breadcrump-leaf'

With the latest commit, ratt fails:

$ ratt auto https://www.ebay-kleinanzeigen.de/s-berlin/auto/k0l3331
...
2022/04/20 00:41:05 feed.go:304: feed description: no html node found for selector: 'span.breadcrump-leaf'

Unfortunately, it is currently not possible to override the internal configuration, because the "internal" config takes precedence over one in the working directory.

Thanks for your time!

Status
RESOLVED CLOSED
Submitter
~makefu
Assigned to
No-one
Submitted
2 years ago
Updated
10 months ago
Labels
No labels applied.

~ghost08 2 years ago

Vladimír Magyar referenced this ticket in commit a8ddff6.

~ghost08 2 years ago

I've changed the order in which ratt will search the config files. Now
it is:

                                                                                                                                              
  1. current dir
  2. .config/ratt
  3. embedded confs
                                                                                                                                              

I didn't fix the ebay config, because I don't know where to find the
description. If you've fixed it, can you please send it here, either as
a patch or just as a reply to this thread?

~makefu 2 years ago

Will do, once I have something working! And thanks for your quick response and fix :)

~makefu 2 years ago

Just a very quick update: as a hotfix, I removed the 'description' field.

Additionally, I added the location to the RSS feed text. I also saw that the current config will happily put 'alternative Anzeigen' (listings that do not match the request) into the feed, so the item container is now restricted to the main results list. This is the patch:

diff --git a/confs/ebay-kleinanzeigen.yml b/confs/ebay-kleinanzeigen.yml
index 4617045..cb10c7e 100644
--- a/confs/ebay-kleinanzeigen.yml
+++ b/confs/ebay-kleinanzeigen.yml
@@ -11,11 +11,10 @@ selectors:
     insecure: false
   feed:
     title: title
-    description: span.breadcrump-leaf
     authorname: ""
     authoremail: ""
   item:
-    container: li[class='ad-listitem lazyload-item   ']
+    container: ul[id='srchrslt-adtable'] li[class='ad-listitem lazyload-item   ']
     title: |
       title = sel:find("h2.text-module-begin"):first():text():gsub("^%s*(.-)%s*$", "%1")
       print(title)
@@ -44,7 +43,8 @@ selectors:
     createdformat: 02.01.2006 15:04 MST
     description: |-
       description = sel:find(".aditem-main--middle"):html()
-      print(description)
+      place = sel:find(".aditem-main--top--left"):html()
+      print(description .. place)
     content: ""
     image: |
       img = sel:find("div.imagebox"):first():attr("data-imgsrc")

~ghost08 2 years ago

Thanks, I've applied it. Next time, please send a patch to the ratt mailing list: mailto:~ghost08/ratt@lists.sr.ht

~makefu 2 years ago

will do!

~ghost08 REPORTED CLOSED 10 months ago
