Python Movie Data Crawler

Couple of days ago,I was talking with newly joined Engineer in our team.Suddenly I found he is a very resourceful person in terms of collecting movi.He has personal collection of movie archive which is 3TB !

So,I was really happy to find such a person as a teammate.He has three different lists of movies in text file.And shared one of the text file which contained more than couple of  hundreds movie name,I decided to write a script that scrapes movie rating from IMDB.com.I was googling if there is any public REST API available and found this website http://www.imdbapi.com/ which returns json as a search result.

The main task is couple of lines code like below:

import requests
import urllib
import json
BASE_URL = 'http://www.imdbapi.com/?'
query = {'i': '', 't': movi_name ,'tomatoes':'true'}
response = requests.get(BASE_URL+urllib.urlencode(query))
output = json.dumps(response.content, separators=(',',':'))

The output is the movie information,to grab specific result I have done some more formatting.You can check the whole script from here
Right now the script would print output like this:

Getting Movi The Pianist...
{'Plot': 'A Polish Jewish musician struggles to survive the destruction of the Warsaw ghetto of World War II.', 'Rating': '8.5', 'Title': 'The Pianist', 'Director': 'Roman Polanski', 'tomatoRating': '8.2', 'IMDB Rating': '8.5'}

In this script I have used amazing python module called Requests.

You can add more functionality like getting movie poster or writhing the movie data into
And if you want to run this script please add text file names movies.txt,like below:

The Pianist
The Avengers

Happy coding!

Advertisements

2 comments

  1. Mashpy Says · March 18, 2012

    নতুন ডোমেইন মনে হচ্ছে…

    মুভির রেটিং দেখে মুভি দেখতে বসার কনসেপ্টটা ভাল লাগল। ধন্যবাদ।

  2. website · June 4, 2012

    The design for the blog is a bit off in Epiphany. Even So I like your website. I might have to install a normal browser just to enjoy it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s