Iterating multiple lists at a time

I was writing a Python script and needed to iterate over multiple iterables at the same time. At first I planned to loop through each item by index, but that felt really non-Pythonic, so I went through the Python docs to see if I could make the code cleaner. There are mainly two ways to iterate iterables in parallel: using zip() and using itertools.izip().
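
A minimal sketch of the zip() approach (shown in Python 3, where itertools.izip is gone because the built-in zip() is already lazy; the names and ratings below are made-up sample data):

```python
names = ["The Pianist", "The Avengers"]
ratings = [8.5, 8.0]

# zip() pairs up items from each iterable and stops at the shortest one
for name, rating in zip(names, ratings):
    print("%s -> %s" % (name, rating))
```

If the iterables have different lengths and you need every item, itertools.zip_longest (izip_longest in Python 2) fills the gaps with a chosen value.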


Python Movie Data Crawler

A couple of days ago, I was talking with a newly joined engineer on our team. I suddenly found out that he is a very resourceful person when it comes to collecting movies: he has a personal movie archive of 3 TB!

So I was really happy to have such a person as a teammate. He has three different lists of movies in text files, and he shared one of them, which contained more than a couple of hundred movie names. I decided to write a script that scrapes each movie's rating. I googled to see whether any public REST API was available and found this website, which returns JSON as a search result.

The main task is a couple of lines of code, like below:

import requests
import urllib
import json

# BASE_URL and movie_name are defined earlier in the script
query = {'i': '', 't': movie_name, 'tomatoes': 'true'}
response = requests.get(BASE_URL + urllib.urlencode(query))
# json.loads (not dumps) parses the JSON response body into a dict
output = json.loads(response.content)

The output is the movie information; to grab specific results I have done some more formatting. You can check the whole script here.
Right now the script would print output like this:

Getting Movie The Pianist...
{'Plot': 'A Polish Jewish musician struggles to survive the destruction of the Warsaw ghetto of World War II.', 'Rating': '8.5', 'Title': 'The Pianist', 'Director': 'Roman Polanski', 'tomatoRating': '8.2', 'IMDB Rating': '8.5'}
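
To pull only specific fields out of the parsed response, a plain dict lookup is enough. Here is a sketch with a hard-coded sample body instead of a live request (the field names mirror the output above):

```python
import json

# Sample response body shaped like the output above -- not a live API call
raw = '{"Title": "The Pianist", "Director": "Roman Polanski", "imdbRating": "8.5"}'
movie = json.loads(raw)

# Keep only the fields we care about
wanted = {key: movie[key] for key in ("Title", "Director", "imdbRating")}
print(wanted["Title"])  # The Pianist
```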

In this script I have used the amazing Python module called Requests.

You can add more functionality, like fetching the movie poster or writing the movie data to a file.
And if you want to run this script, please add a text file named movies.txt, like below:

The Pianist
The Avengers

Happy coding!

Web Scraping using Python

A couple of weeks ago, I was working on a project where my responsibility was to scrape several specific pieces of information from a website.

It was a dispensary listing website with around 2,600+ dispensaries on it. Information like dispensary name, state, address, email, website, etc. was needed. I decided to use Python for the scraping because of its huge library collection and available third-party packages.

I started from a CSV file where the directory names and the RSS feed of each corresponding directory were listed.

The script reads the CSV file and takes the name of each directory and its link. Then, using feedparser, it counts the number of dispensaries in the RSS feed.

feedparser is very handy for collecting specific info from an RSS feed:

import feedparser

feed_data = feedparser.parse(feed_path)
count = len(feed_data['entries'])           # number of entries in the feed
link_url = feed_data['entries'][i]['link']  # link of the i-th entry
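
If feedparser is not available, the same extraction can also be sketched with the standard library's xml.etree on a tiny inline feed (the feed content below is made up):

```python
import xml.etree.ElementTree as ET

# A tiny inline RSS feed standing in for the real dispensary feed
rss = """<rss><channel>
<item><title>Shop A</title><link>http://example.com/a</link></item>
<item><title>Shop B</title><link>http://example.com/b</link></item>
</channel></rss>"""

root = ET.fromstring(rss)
entries = root.findall(".//item")                      # all <item> elements
count = len(entries)                                   # number of entries
links = [item.find("link").text for item in entries]   # each entry's link
print(count, links)
```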

Then I collected the dispensary URLs and appended them to a list.

Now the real part starts: scraping the data using those URLs. For that I used BeautifulSoup; parsing data is very easy with it.

Before trying it, you have to install it. On Ubuntu, the following terminal command is enough:

 sudo apt-get install python-beautifulsoup

or using easy_install:

sudo easy_install BeautifulSoup

Parsing is as simple as below:

import urllib2
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)

url = ""  # the page you want to scrape
html = urllib2.urlopen(url).read()
data = BeautifulSoup(html)
print data

It will print the whole parsed tree of the URL. You can then navigate and collect the HTML tag values you need from the soup :).
Then I looped through my URL list and collected the different scraped data from the soup.
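
As a dependency-free sketch of the same idea, the standard library's html.parser can walk tags and collect attribute values (here, every `<a>` tag's href from a made-up snippet):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, similar to soup.findAll('a')."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

html = '<p><a href="http://example.com/one">One</a> <a href="http://example.com/two">Two</a></p>'
parser = LinkCollector()
parser.feed(html)
print(parser.links)
```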
The next part was to write that data into a CSV file. It's pretty simple in Python:

import csv

data_list = [name, title, address, email]
file_path = "where_you_want_to_save_the_csv_file"
file_p = open(file_path, "ab")  # append mode, so earlier rows are kept
write_pattern = csv.writer(file_p, delimiter=",")
write_pattern.writerow(data_list)
file_p.close()

It generates a CSV file with one row per dispensary.
You can use the script or contribute to it. As it was my first scraping project, the script is not bulletproof, so feel free to suggest a better coding approach as well.
The script is here.
Finally, I ran the script on my local machine; it took 5 hours to scrape the 2,660+ dispensaries' data.

Python CSV file reading and writing

Reading and writing a CSV file is fairly simple in Python: the csv module provides both reader() and writer(), and both are easy to use. In the code below, the script reads a CSV file and appends the row values to different lists. The CSV file had two columns: a name and a link.

The code below reads the CSV file, putting each name into one list and each link into another:

import csv

file_path = "path_of_your_csv_file"
shop = csv.reader(open(file_path, "rb"))
file_name_list = []
rss_link_list = []
for row in shop:
    file_name_list.append(row[0])  # first column: name
    rss_link_list.append(row[1])   # second column: link

Writing to a CSV file was interesting. I tried to write to a CSV file, and each write did put data in, but whenever a new write operation was executed, the previous data was erased. Actually, I was missing a silly point: I hadn't opened the file in append mode, so each write overwrote the data instead of appending it. The code was as below:

def csv_writer(data_list, file_path):
    import csv
    file_p = open(file_path, "ab")  # append mode, so existing rows are kept
    write_pattern = csv.writer(file_p, delimiter=",")
    write_pattern.writerow(data_list)
    file_p.close()

In this method, the caller has to pass the data and the path where the CSV file should exist.

You can find more about Python CSV reading and writing here.
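
In modern Python 3, the same read/write round trip looks like this (note the text mode and newline='' argument, which replace the old "rb"/"ab" binary modes; the rows are sample data):

```python
import csv
import tempfile

rows = [["Shop A", "http://example.com/a.rss"],
        ["Shop B", "http://example.com/b.rss"]]

# Write the rows out (a temp file here, so the sketch is self-contained)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
    path = f.name

# Read them back, names into one list and links into another
names, links = [], []
with open(path, newline="") as f:
    for row in csv.reader(f):
        names.append(row[0])
        links.append(row[1])

print(names, links)
```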

Date and time in Python

Python has a very strong date and time library, which can handle many different patterns of date/time strings.

I was working on a project where the user would submit the date in an unusual format and I had to convert it to a standard pattern.

I just went through the Python docs, but unfortunately missed the library's power, and started to write a custom function for the task.

My function was like below:

>>> def date_convert(raw_format):
...     date_list = raw_format.split()
...     temp_m = date_list[0].strip(",")
...     spacer = "-"
...     month_list = ["January", "February", "March", "April", "May", "June",
...                   "July", "August", "September", "October", "November", "December"]
...     raw_month_number = str(month_list.index(temp_m) + 1)
...     date_fix = (date_list[2], raw_month_number, date_list[1].strip(","))
...     final_date = spacer.join(date_fix)
...     return final_date
...
>>> print date_convert("March 12, 2010")
2010-3-12

After coding this method, I asked one of my seniors whether any better-performing date-time formatting approaches were available, and he gave me two links.
Using those two links, an easy and efficient function can be written.
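
Following those pointers, the same conversion can be done with the standard library's datetime.strptime/strftime (a Python 3 sketch; note it zero-pads the month and day, unlike the custom function):

```python
from datetime import datetime

def date_convert(raw_format):
    # %B = full month name, %d = day of month, %Y = four-digit year
    parsed = datetime.strptime(raw_format, "%B %d, %Y")
    return parsed.strftime("%Y-%m-%d")

print(date_convert("March 12, 2010"))  # 2010-03-12
```

Besides being shorter, strptime also validates the input: a misspelled month raises ValueError instead of silently failing.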

Sorting in python

One of Python's strong features is its data structures; lists and dictionaries give you the power to play with your data in versatile ways.
They also have some methods that a Python programmer needs in day-to-day work.

One of the basic and strong methods is sorting.
An example:

>>> a = [1, 2, 3, 6, 3, 4]
>>> a.sort()
>>> print a
[1, 2, 3, 3, 4, 6]

Another efficient feature is that we can pass reverse while doing the sorting. It's like:

>>> a = [1,2,3,4]
>>> sorted(a, reverse=True)
[4, 3, 2, 1]

Another parameter we can use with sorted() is "key":

>>> a = ['e', 'f', 'a', 'g']
>>> sorted(a, key=str.lower)
['a', 'e', 'f', 'g']
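
The key parameter really shines with a lambda, for example sorting a list of (title, rating) tuples by rating (the tuples below are sample data):

```python
movies = [("The Pianist", 8.5), ("The Avengers", 8.0), ("Up", 8.3)]

# Sort by the rating field (index 1), highest first
by_rating = sorted(movies, key=lambda pair: pair[1], reverse=True)
print(by_rating)
```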

Run Windows command-line commands using Python

I was just playing with my Python interpreter on my Windows 7 system today, thinking of something silly and fun to do.

Then I just searched for a Python library that can be used to list Windows directories/files.

The os module does the job, and it was simple. I put the Windows command that I wanted to execute into a variable and passed it to popen():

import os

windows = "dir"                    # the Windows command to run
output = os.popen(windows).read()  # run it and capture its stdout
print output

It's simple 🙂
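
os.popen is fine for quick experiments, but the subprocess module is the recommended way to run a command and capture its output. A minimal sketch (using sys.executable instead of a Windows-only command like dir, so it runs anywhere):

```python
import subprocess
import sys

# Run a command as a list of arguments and capture its stdout;
# sys.executable keeps the example portable across platforms
output = subprocess.check_output([sys.executable, "-c", "print('hello')"])
print(output.decode().strip())  # hello
```

Passing the command as a list (rather than a shell string) also avoids shell-injection issues.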

If anyone has a better solution than this, please drop a line so that I can learn about it.

To know details about Python's os module, click here.