Introduction
pyolx supplies two methods for scraping data from the www.olx.pl website.
Scraping category data
This method collects the URLs of available offers from OLX search results that match the given parameters.
olx.category.get_category(main_category=None, sub_category=None, detail_category=None, region=None, search_query=None, url=None, **filters)
Parses available offer URLs from every page of the given category.
Parameters:
- url – User-defined URL of an OLX page with offers. It overrides category parameters and applies search filters.
- main_category – Main category
- sub_category – Sub category
- detail_category – Detail category
- region – Region of search
- search_query – Additional search query
- filters – Dictionary with additional filters. The following example dictionary contains every possible filter with examples of its values.
Example:
input_dict = {
    "[filter_float_price:from]": 2000,  # minimal price
    "[filter_float_price:to]": 3000,  # maximal price
    "[filter_enum_floor_select][0]": 3,  # desired floor, enum: from -1 to 11 (10 and more) and 17 (attic)
    "[filter_enum_furniture][0]": True,  # furnished or unfurnished offer
    "[filter_enum_builttype][0]": "blok",  # valid build types: blok, kamienica, szeregowiec, apartamentowiec, wolnostojacy, loft
    "[filter_float_m:from]": 25,  # minimal surface
    "[filter_float_m:to]": 50,  # maximal surface
    "[filter_enum_rooms][0]": 2,  # desired number of rooms, enum: from 1 to 4 (4 and more)
}
Returns: List of all offers for the given parameters
Return type: list
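The bracketed filter keys resemble URL query parameter names. As a rough illustration only, the sketch below shows one way such a dictionary could be turned into a query string. The helper build_filter_query and the "search" prefix are assumptions made for this example; they are not taken from the pyolx source and may not match what the library does internally.

```python
from urllib.parse import urlencode


def build_filter_query(filters, prefix="search"):
    """Hypothetical helper: encode a filter dict as a URL query string.

    Assumption: every filter key is appended to a common prefix, giving
    parameter names such as search[filter_float_price:from].
    """
    params = {f"{prefix}{key}": value for key, value in filters.items()}
    # Keep brackets and colons readable instead of percent-encoding them.
    return urlencode(params, safe="[]:")


input_dict = {
    "[filter_float_price:from]": 2000,
    "[filter_float_price:to]": 3000,
    "[filter_enum_rooms][0]": 2,
}
query = build_filter_query(input_dict)
print(query)
```

The keys are kept exactly as documented, so the same dictionary can still be passed to get_category as **filters.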
It can be used like this:
import olx.category

input_dict = {'[filter_float_price:from]': 2000}
parsed_urls = olx.category.get_category("nieruchomosci", "mieszkania", "wynajem", "Gdańsk", **input_dict)
The above code stores in the parsed_urls variable a list of URLs for all apartments found in the given category.
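Because the filter values are restricted to documented ranges, it may help to check a filter dictionary before starting a scrape. The following is a minimal sketch of such a check. The allowed values are copied from the filter documentation above, while validate_filters itself is a hypothetical helper and not part of pyolx.

```python
# Allowed values copied from the filter documentation; the helper is hypothetical.
ALLOWED_FLOORS = set(range(-1, 12)) | {17}  # -1 to 11, plus 17 (attic)
ALLOWED_ROOMS = set(range(1, 5))            # 1 to 4 (4 means "4 and more")
ALLOWED_BUILD_TYPES = {
    "blok", "kamienica", "szeregowiec",
    "apartamentowiec", "wolnostojacy", "loft",
}


def validate_filters(filters):
    """Return a list of human-readable problems found in a filter dict."""
    checks = {
        "[filter_enum_floor_select][0]": ALLOWED_FLOORS,
        "[filter_enum_rooms][0]": ALLOWED_ROOMS,
        "[filter_enum_builttype][0]": ALLOWED_BUILD_TYPES,
    }
    problems = []
    for key, allowed in checks.items():
        if key in filters and filters[key] not in allowed:
            problems.append(f"{key}: unsupported value {filters[key]!r}")
    # Range filters: the lower bound should not exceed the upper bound.
    for name in ("price", "m"):
        low = filters.get(f"[filter_float_{name}:from]")
        high = filters.get(f"[filter_float_{name}:to]")
        if low is not None and high is not None and low > high:
            problems.append(f"filter_float_{name}: 'from' ({low}) exceeds 'to' ({high})")
    return problems


good = validate_filters({"[filter_float_price:from]": 2000, "[filter_enum_rooms][0]": 2})
bad = validate_filters({
    "[filter_enum_rooms][0]": 7,
    "[filter_float_m:from]": 60,
    "[filter_float_m:to]": 50,
})
print(good)
print(bad)
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid filter at once.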
Scraping offer data
This method scrapes all offer details from the given list of offer URLs.
It can be used like this:
import olx.offer

descriptions = olx.offer.get_descriptions(parsed_urls)
The above code stores in the descriptions variable a list of offer details, one entry for each offer URL in parsed_urls.
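The two methods are meant to be used one after the other: first collect offer URLs, then fetch the details for them. The sketch below wraps that sequence in a single function. The wrapper scrape_offers and the two stand-in functions are hypothetical and exist only so the example runs offline; the dictionary shape returned by the stand-in is a placeholder, not the real structure of an offer description.

```python
def scrape_offers(get_category, get_descriptions, *category_args, **filters):
    """Collect offer URLs for a category, then fetch details for each URL."""
    urls = get_category(*category_args, **filters)
    if not urls:
        return []  # nothing to describe, so skip the second step
    return get_descriptions(urls)


# Stand-in functions so the sketch runs without network access. Real code
# would pass olx.category.get_category and olx.offer.get_descriptions.
def fake_get_category(*args, **filters):
    return ["https://www.olx.pl/oferta/example-1.html"]


def fake_get_descriptions(urls):
    # Placeholder shape; the real offer details are not documented here.
    return [{"url": url} for url in urls]


result = scrape_offers(
    fake_get_category,
    fake_get_descriptions,
    "nieruchomosci", "mieszkania", "wynajem", "Gdańsk",
    **{"[filter_float_price:from]": 2000},
)
print(result)
```

Passing the two scraping functions as arguments keeps the wrapper easy to test with stand-ins like these.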