Scraping AJAX Pages with Python

In this post I’ll show an example of how to scrape AJAX pages with Python. Continue reading

Getting Amazon Reviews for Library Books with Python

Often when I want to read about a new technical subject, I’ll see what’s available for free at the local library before buying a new book. To decide which book to check out, I browse through the library’s catalogue and look up the reviews for each book on Amazon to... Continue reading

Scraping with Django Backend

I like to use Django as a backend in my scraping scripts. I use it to model the data being scraped with the Django ORM. Then, once the data has been collected, I view the results using the automatic admin interface, which is easy to set up in just a few... Continue reading

Scraping with Python Selenium and PhantomJS

In previous posts, I covered scraping using mechanize as the browser. Sometimes, though, a site uses so much JavaScript to dynamically render its pages that using a tool like mechanize (which can’t handle JavaScript) isn’t really feasible. For these cases, we have to use a browser that can run the... Continue reading

Scraping by Example - Handling JSON data

Today’s post will cover scraping sites where the pages are dynamically generated from JSON data. Compared to static pages, scraping pages rendered from JSON is often easier: simply load the JSON string and iterate through each object, extracting the relevant key/value pairs as you go. Continue reading
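
As a rough sketch of that load-and-iterate idea, here is a minimal example using only the standard library. The payload, the "results" key, and the field names are all made up for illustration; a real site's JSON will have its own structure, so treat this as a starting point rather than the method from the full post.

```python
import json

# Hypothetical JSON payload, standing in for what a site's endpoint might return.
# The "results" key and the field names below are assumptions for illustration.
RAW_JSON = """
{
  "results": [
    {"id": 1, "title": "Software Engineer", "location": "Austin", "internal_code": "A7"},
    {"id": 2, "title": "Data Analyst", "location": "Denver", "internal_code": "B3"}
  ]
}
"""


def extract_records(json_text, keys=("title", "location")):
    """Load a JSON string and keep only the wanted key/value pairs per object."""
    data = json.loads(json_text)
    records = []
    for item in data.get("results", []):
        # .get() returns None for a missing key instead of raising KeyError
        records.append({key: item.get(key) for key in keys})
    return records


records = extract_records(RAW_JSON)
for record in records:
    print(record)
```

In practice the JSON string would come from an HTTP response body rather than a literal, but the parsing and extraction steps stay the same.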