Author Topic: [Meta] Geekhack Classifieds Scraping - looking for opinions  (Read 1329 times)

Offline Dihedral

  • Thread Starter
  • Posts: 827
  • Location: United Kingdom
  • Mostly Harmless
[Meta] Geekhack Classifieds Scraping - looking for opinions
« on: Tue, 23 August 2016, 05:29:36 »
Hey guys,

I've been working on a little system to improve the geekhack classifieds and would like your feedback.

This system scrapes the classifieds and compiles the information it finds into a formatted catalogue of WTS, WTB and WTT items.

It looks for little tags like these which can be added to classifieds OPs and uses them to form its catalogue:

Code:
[color=transparent]<<<[["WTS", "Dell QuietKey", "$20"], ["WTB", "NMB RT101+", "$80"], ["WTT", "Matias Quiet Click x200", "Cream Damped Alps x50"]]>>>[/color]
This line is a JSON list. Each item in the list is itself a list representing one listing, with three values: the category (WTS, WTB or WTT), the item, and the price. Price is a slightly loose concept: for WTB it means the buyer's budget, and for WTT it means the items wanted in exchange.

Only one of these lines can be present in any single OP. The <<< and >>> tags tell the program where the JSON is. The color tags are not required; they are only there to stop the JSON from cluttering the OP.
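
For anyone curious how a line like this can be read, here is a minimal sketch in Python. It is an illustration under assumptions, not the scraper's actual code: the names extract_listings, TAG_PATTERN and VALID_CATEGORIES are made up for this example, and it assumes the post body is already available as plain BBCode text (no HTML entities).

```python
import json
import re

# Hypothetical names for illustration only.
# Non-greedy match for whatever sits between the <<< and >>> markers.
TAG_PATTERN = re.compile(r"<<<(.*?)>>>", re.DOTALL)
VALID_CATEGORIES = {"WTS", "WTB", "WTT"}


def extract_listings(post_body):
    """Return a list of (category, item, price) tuples found in post_body."""
    match = TAG_PATTERN.search(post_body)
    if match is None:
        return []  # no tagged line in this post
    try:
        raw = json.loads(match.group(1))
    except json.JSONDecodeError:
        return []  # malformed JSON is skipped rather than crashing the scrape
    if not isinstance(raw, list):
        return []
    listings = []
    for entry in raw:
        # Keep only well-formed [category, item, price] entries.
        if (
            isinstance(entry, list)
            and len(entry) == 3
            and all(isinstance(value, str) for value in entry)
            and entry[0] in VALID_CATEGORIES
        ):
            listings.append(tuple(entry))
    return listings


example = (
    '[color=transparent]<<<[["WTS", "Dell QuietKey", "$20"], '
    '["WTB", "NMB RT101+", "$80"]]>>>[/color]'
)
print(extract_listings(example))
```

Only the first <<< ... >>> pair is read, which matches the one-line-per-OP rule. Entries that are malformed or use an unknown category are dropped quietly, so one bad tag does not break the whole catalogue.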

The program automatically reads all the posts in the classifieds and turns these JSON lines into a formatted set of tables like the ones below:


WTT
ITEM                     PRICE                  USER      TOPIC
Matias Quiet Click x200  Cream Damped Alps x50  Dihedral  Topic Link



WTS
ITEM           PRICE  USER      TOPIC
Dell QuietKey  $20    Dihedral  Topic Link



WTB
ITEM        PRICE  USER      TOPIC
NMB RT101+  $80    Dihedral  Topic Link
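
The grouping step behind tables like these can be sketched roughly as follows. Again, this is an illustration under assumptions rather than the real implementation: build_catalogue, render_table and HEADERS are hypothetical names, the output is plain text instead of forum markup, and each row is assumed to already carry the poster's name and a topic link.

```python
from collections import defaultdict

# Hypothetical names for illustration only.
HEADERS = ("ITEM", "PRICE", "USER", "TOPIC")


def build_catalogue(rows):
    """Group (category, item, price, user, topic) rows by category."""
    catalogue = defaultdict(list)
    for category, item, price, user, topic in rows:
        catalogue[category].append((item, price, user, topic))
    return catalogue


def render_table(category, entries):
    """Render one category as a plain-text table with padded columns."""
    table = [HEADERS] + list(entries)
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in table) for i in range(len(HEADERS))]
    lines = [category]
    for row in table:
        padded = "  ".join(cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(padded.rstrip())
    return "\n".join(lines)


rows = [
    ("WTT", "Matias Quiet Click x200", "Cream Damped Alps x50", "Dihedral", "Topic Link"),
    ("WTS", "Dell QuietKey", "$20", "Dihedral", "Topic Link"),
    ("WTB", "NMB RT101+", "$80", "Dihedral", "Topic Link"),
]
catalogue = build_catalogue(rows)
for category in ("WTT", "WTS", "WTB"):
    print(render_table(category, catalogue[category]))
    print()
```

Keeping the grouping separate from the rendering means the same catalogue could later be written out as BBCode tables or anything else without touching the scraping side.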




What are your thoughts? Is this a worthwhile system that should be deployed? What improvements could be made to it? If you want to see the code, I am happy to put it on GitHub, just ask.
« Last Edit: Wed, 24 August 2016, 07:06:58 by Dihedral »

Offline Dihedral

  • Thread Starter
  • Posts: 827
  • Location: United Kingdom
  • Mostly Harmless
Re: Geekhack Classifieds Scraping
« Reply #1 on: Wed, 24 August 2016, 05:54:16 »
 :blank:

Offline Dihedral

  • Thread Starter
  • Posts: 827
  • Location: United Kingdom
  • Mostly Harmless
Re: [Meta] Geekhack Classifieds Scraping - looking for opinions
« Reply #2 on: Wed, 24 August 2016, 09:14:30 »
 :blank: Off Topic posts get buried quickly

Offline Bromono

  • Wanabe Cicerone
  • * Destiny Supporter
  • Posts: 1115
  • Location: The Alamo's Basement
  • HHKB > Your Opinion
Re: [Meta] Geekhack Classifieds Scraping - looking for opinions
« Reply #3 on: Wed, 24 August 2016, 09:18:00 »
What language are you using?

Offline Dihedral

  • Thread Starter
  • Posts: 827
  • Location: United Kingdom
  • Mostly Harmless
Re: [Meta] Geekhack Classifieds Scraping - looking for opinions
« Reply #4 on: Wed, 24 August 2016, 09:23:34 »
What language are you using?

It's all implemented in Python. A large chunk of the code is a library I wrote to make it easy to write Python scripts that interface with Geekhack, similar to PRAW for Reddit. The rest of it handles the actual cataloguing of the data. Hopefully I will be able to reuse the library for future projects.