Scraping with CasperJS

In a previous post, I showed how to scrape a JavaScript-heavy site by using the Selenium bindings for Python to drive a headless browser (PhantomJS). In this post, I’ll show how to scrape the same site using CasperJS.

Scraping AJAX Pages with Python

In this post I’ll show an example of how to scrape AJAX pages with Python.

Getting Amazon Reviews for Library Books with Python

Often when I want to read about a new technical subject, I’ll see what’s available for free at the local library before buying a new book. To decide which book to check out, I browse through the library’s catalogue and look up the reviews for each book on Amazon to...

Scraping with Django Backend

I like to use Django as a backend in my scraping scripts. I use it to model the data being scraped with the Django ORM. Then, once the data has been collected, I view the results using the automatic admin interface, which is easy to set up in just a few...

Scraping with Python Selenium and PhantomJS

In previous posts, I covered scraping using mechanize as the browser. Sometimes, though, a site uses so much JavaScript to dynamically render its pages that using a tool like mechanize (which can’t handle JavaScript) isn’t really feasible. For these cases, we have to use a browser that can run the...