Edit (2015-02-22): I've added the scripts on GitHub. They work for party contributors and partially work for riding association contributors. I'm starting the scrape of the contribution reports (postal code, etc.), but it's extremely slow (6 pages per minute), probably because the machine I'm using is in Asia. If you have a bunch of machines on a good connection, let me know!

Since 2009, I've been on the case of the Elections Canada contributions database, where all donations to political parties are "accessible" to the public. In the summer of 2013, the site was revamped and made considerably harder to scrape. You could still get the basic info, like the name of the contributor, how much they gave, and even a unique identifier ("client id") for each person. But the rest (postal code, municipality and province), which is what really allows more serious cross-referencing with other databases (or mapping), could no longer be retrieved as simply as before, when that personal information was found on popup pages with a relatively simple URL and no particular session check.

So I had actually given up at that point. But with a federal election coming up, and e-mails flying here and there about the election, I decided to put in some time to poke around the site.

When you scrape a site, the rule is always to understand how a human navigating the site gets to the information, and what that triggers in terms of cookies, IDs, etc. My tool of choice is simply Chrome Developer Tools. I use the handy "Copy as cURL", by right-clicking requests in the "Network" section while navigating the site.

In this case, I went to the donations to the parties, the largest of all the ways money enters the coffers of political parties (money may also be sent through riding associations and candidates, among other ways).
I noticed that the URL at the top changes, but if you just paste it in an incognito window (a good way to test whether a page also depends on session state), you are sent back to the homepage. E.g.:

http://www.elections.ca/WPAPPS/WPF/EN/PP/ContributionReport?act=C2&returntype=1&option=4&queryid=b2d4d650d0304b2ebf46a3dff36d70a6&period=0&fromperiod=0&toperiod=0&exactMatch=False&contribrange=-1&contribclass=1%2C%206%2C%2012&selectallcontribclasses=True&addrclientid=24447&addrname=Jean%2BA.%2BCorsi&displayaddress=True&reportPage=1&totalReportPages=6127&ln_no=5318339&setfocusid=addresslink53183391

sends you back to this page:

http://www.elections.ca/WPAPPS/WPF/EN/PP/

So I quickly identified the cookie as being crucial to getting the right pages. After poking around a little, I found out that what was even more important was the queryid. The queryid is generated when you select the parties you are searching. Of course, I want to select all parties at once. When you click the button, it is as if you registered with the webapp which entities you want to search (an entity represents one annual/quarterly report sent by parties or candidates to Elections Canada), and the webapp returns a valid queryid that you can now use for your scrape.

I usually take the cURL command from Chrome Dev Tools and try to reduce it as much as possible, taking parts out and keeping the minimal version that still returns valid data. In the case of parties, to get a valid queryid, I found that you don't actually need a previous queryid (even if on the Web you will have one), and that you can blank out most of the parameters of the POST data. You will, however, need the cookie, and probably the basic search parameters in the query string (as well as returntype in the POST data, which I think indicates whether the contributions are verified or not).
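The two checks described above (did the site bounce me back to the homepage, and which queryid did it hand me) can be sketched as small helpers that work on URLs only. This is a minimal sketch, not the script from the GitHub repository: the function names `was_bounced_home` and `extract_queryid` are hypothetical, and it assumes that a rejected request always ends up on the `/WPAPPS/WPF/EN/PP/` path shown above.

```python
from urllib.parse import parse_qs, urlsplit

# Path of the homepage the site falls back to when the cookie/queryid
# is missing or stale (taken from the example above).
HOME_PATH = "/WPAPPS/WPF/EN/PP"


def was_bounced_home(final_url):
    """Return True if the URL we ended up on is the bare homepage."""
    return urlsplit(final_url).path.rstrip("/") == HOME_PATH


def extract_queryid(url):
    """Return the queryid parameter of a URL, or None if it has none."""
    values = parse_qs(urlsplit(url).query).get("queryid", [])
    return values[0] if values else None


if __name__ == "__main__":
    report = (
        "http://www.elections.ca/WPAPPS/WPF/EN/PP/ContributionReport"
        "?act=C2&returntype=1&option=4"
        "&queryid=b2d4d650d0304b2ebf46a3dff36d70a6&reportPage=1"
    )
    print(extract_queryid(report))
    print(was_bounced_home("http://www.elections.ca/WPAPPS/WPF/EN/PP/"))
```

In a scraper, you would feed these the final URL after following redirects (with curl, for example, `-L` follows redirects and `-w '%{url_effective}'` prints where you landed), so a stale session is noticed right away instead of silently saving thousands of copies of the homepage.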
When you issue the cURL command, the response redirects you to the next search page, with a URL of the form:

http://www.elections.ca/WPAPPS/WPF/EN/PP/SelectSearchOptions?act=C2&period=0&returntype=1&fromperiod=2007&toperiod=2014&queryid=884ebe6c192d489bb17f842c080f86f8

(This won't work in your browser, because you may not have a valid cookie, or the queryid will have been invalidated by the time you read this.)

After that, you can go to the report page and try to download all the listing pages:

http://www.elections.ca/WPAPPS/WPF/EN/PP/ContributionReport?act=C2&returntype=1&option=4&queryid=7f48da109479493aac937abbbded46f8&period=0&fromperiod=0&toperiod=0&exactMatch=False&contribrange=-1&contribclass=1%2C%206%2C%2012&selectallcontribclasses=True&addrclientid=25922&addrname=Jocelyn%2BT%2BAantjes&displayaddress=True&reportPage=1&totalReportPages=1207&ln_no=5816619&setfocusid=addresslink581661930

As of today, there are 6127 pages of 200 entries each for contributions to all parties between 2007 and 2013 inclusive (details for 2014 aren't in yet). What will be interesting to see are the quarterly reports for 2015 later in the year, as that is the most lucrative time for parties. The same strategy should be mirrored for the other entry points for the money.

With the data, I would check for irregularities, like unusual amounts, or simply try to draw insights from the patterns of money collection. It's also a massive, very massive database, with in excess of a million entries and surely hundreds of thousands of unique contributors. Finding something in there is daunting. But if a simple data source can be provided, then you're halfway there already.
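To make the listing-page step concrete, here is a minimal sketch of how the page URLs could be generated from one ContributionReport URL copied from the browser, along with the back-of-the-envelope numbers for the scrape. The helper names are hypothetical, and two things are assumptions I have not verified against the site: that only `queryid` and `reportPage` need to change from page to page, and that the `addr*`, `ln_no` and `setfocusid` parameters can be dropped from the template.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

# Trimmed version of the report URL above (assumption: the dropped
# address-popup parameters are not needed for the plain listing).
TEMPLATE = (
    "http://www.elections.ca/WPAPPS/WPF/EN/PP/ContributionReport"
    "?act=C2&returntype=1&option=4"
    "&queryid=7f48da109479493aac937abbbded46f8"
    "&period=0&fromperiod=0&toperiod=0&exactMatch=False&contribrange=-1"
    "&contribclass=1%2C%206%2C%2012&selectallcontribclasses=True"
    "&reportPage=1&totalReportPages=1207"
)


def set_params(url, **overrides):
    """Return url with the given query-string parameters overwritten."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update({key: str(value) for key, value in overrides.items()})
    # quote_via=quote keeps spaces as %20, like the URLs the site produces.
    return urlunsplit(parts._replace(query=urlencode(params, quote_via=quote)))


def report_page_urls(template_url, queryid, total_pages):
    """Yield one listing URL per report page, numbered from 1."""
    for page in range(1, total_pages + 1):
        yield set_params(
            template_url,
            queryid=queryid,
            reportPage=page,
            totalReportPages=total_pages,
        )


def estimated_hours(total_pages, pages_per_minute):
    """Rough duration of a sequential scrape at a steady rate."""
    return total_pages / pages_per_minute / 60


def max_entries(total_pages, per_page=200):
    """Upper bound on rows, assuming every page is full."""
    return total_pages * per_page


if __name__ == "__main__":
    fresh_queryid = "884ebe6c192d489bb17f842c080f86f8"  # from the redirect
    for url in report_page_urls(TEMPLATE, fresh_queryid, 3):
        print(url)
    print(round(estimated_hours(6127, 6), 1), "hours for 6127 pages")
    print(max_entries(6127), "entries at most")
```

At the 6 pages per minute mentioned in the edit note, the 6127 listing pages alone take about 17 hours from a single machine, which is why splitting the page range across several machines (each with its own cookie and queryid) is tempting.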