Book Review: Web Scraping with Python

I just finished reading Web Scraping with Python by Richard Lawson (Packt Publishing).

Iterating through Dynamic Select Options with Selenium

In this post I’ll use Selenium to show how to iterate through dropdown menus in a form that uses SELECT elements whose option values are dynamically generated. I’ll provide a technique that can be used to determine when the option values have loaded. I’ll wrap up the post by refactoring...

Using Selenium to Scrape ASP.NET Pages with AJAX Pagination

In my last post I went over the nitty-gritty details of how to scrape an ASP.NET AJAX page using Python mechanize. Since mechanize can’t process JavaScript, we had to understand the underlying data formats used for sending form submissions and parsing the server’s response, as well as how pagination is handled. In this...

Scraping ASP.NET Pages with AJAX Pagination

In a previous post I showed how to scrape a page that uses AJAX to return results dynamically. In that example, the results were easy to parse (XML) and the pagination scheme was straightforward (page number in the AJAX query JSON). In this post, I’ll show a more complicated example...

Scraping with CasperJS

In a previous post, I showed how to scrape a JavaScript-heavy site by using the Selenium bindings for Python to drive a headless browser (PhantomJS). In this post, I’ll show how to scrape the same site using CasperJS.