Image Scraping My Wedding Photos

It’s a rather long, complicated story, but the gist of it is that I married the girl of my dreams on 7/9/11 and we never got our wedding pictures from our photographer. Fast-forward to now: by the powers that be, my wife was able to get hold of him by phone, and he agreed to put our pictures up on PhotoBucket as well as send us a DVD of them. Well, it doesn’t look like the DVD is coming anytime soon… so we thought it best to get our images off PhotoBucket before they vanished.

After realizing that PhotoBucket had no way to bulk download the album he had uploaded, I took it upon myself to find a less insane way to download the 410 images than saving them one at a time. I cracked open my trusty Firebug in hopes that the UI was using Ajax to fetch the images, and I struck gold! After inspecting the JSON payload the page was using, I fashioned my own script. If anyone else finds themselves needing to fetch an album before it disappears, this script may be of some help.

grabber.rb
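require 'net/http'
require 'json'

# The endpoint part of the request URL is missing from this listing.
# Copy it from the JSON request Firebug shows and pass it in via ALBUM_ENDPOINT.
ALBUM_ENDPOINT = ENV.fetch('ALBUM_ENDPOINT')

# Build the URI for one page of the album listing (24 images per page).
def for_page(page_num)
  URI(ALBUM_ENDPOINT +
      '?filters[album]=/albums/p168/pixelperfectevent/Cass_Ben&filters[album_content]=2' +
      "&sort=3&limit=24&page=#{page_num}&&linkerMode=false&json=1" +
      '&hash=2593b2b2d9f53eee3a2c3bebf874bfb8&_=1384041813223')
end

# 410 images at 24 per page means 18 pages.
(1..18).each do |page_number|
  puts "On page : #{page_number}"
  JSON.parse(Net::HTTP.get(for_page(page_number)))['body']['objects'].each do |obj|
    url = obj['fullsizeUrl']
    puts "\t Fetching: #{url}"
    # Shell out to wget, which saves the file into the current directory.
    `wget #{url}`
  end
end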
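
If you don’t have wget handy, the download step can probably be done in plain Ruby as well. What follows is only a rough sketch, not what I actually ran: the helper names (image_urls, local_name, save_image) and the sample payload are made up for illustration, and the only thing taken from the real response is the body → objects → fullsizeUrl layout that the script above relies on.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Collect the full-size image URLs from one page of JSON.
# Assumes the body -> objects -> fullsizeUrl layout used by the script above.
def image_urls(json_text)
  objects = JSON.parse(json_text).dig('body', 'objects') || []
  objects.map { |obj| obj['fullsizeUrl'] }.compact
end

# Local file name for an image: the last segment of the URL path.
def local_name(url)
  File.basename(URI(url).path)
end

# Download one image into dir with Net::HTTP instead of wget.
# Note: Net::HTTP.get does not follow redirects.
def save_image(url, dir = '.')
  File.binwrite(File.join(dir, local_name(url)), Net::HTTP.get(URI(url)))
end

# Made-up payload for illustration only; the second object has no
# fullsizeUrl, so it is skipped.
sample = '{"body":{"objects":[' \
         '{"fullsizeUrl":"http://example.com/albums/IMG_0001.jpg"},' \
         '{"title":"no full-size URL here"}]}}'

image_urls(sample).each { |url| puts "#{url} -> #{local_name(url)}" }
```

In the loop above you would call save_image(url) where the wget line is. Skipping objects without a fullsizeUrl is a guess at what a sensible fallback looks like; I never saw the real payload omit it.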
