Counting lines in files recursively in Unix

I have been working on a project for the last couple of months, and as the days pass the codebase keeps getting larger. Suddenly I thought it would be great if I could know how many lines of code I have written so far for each module, and also in total. I knew Unix has a really awesome utility named wc.

After googling and trying different params and commands, I managed to do it by combining two Unix tools (wc and find). The full command for recursive line counting looks like below:

wc -l `find . -type f`


Here find . -type f lists all the files recursively, and wc -l counts the lines 🙂

To learn these two Unix commands in detail, check the wc and find manuals.

Python super and init explained with example

Python super :

The Python super keyword is sometimes confusing to newbies, and even to intermediate Python programmers.

But the idea behind super is really simple. In the OOP paradigm we often need to implement inheritance like below:


class A(object):

    def fancy_func(self):
        print 'Fancy Function Called from Class A'

class B(A):

    def fancy_func(self):
        return super(B, self).fancy_func()

If b is an object of class B and fancy_func is a method of B, then super returns the base class's method. If we didn't use super, we would have to declare an object of class A and then call fancy_func on it. Instead, super returns a proxy object, and it resolves the method using __mro__ (method resolution order).

Super can be used:

  • In single inheritance, to refer to the parent class without naming it explicitly
  • In multiple inheritance, where it is very useful during dynamic execution
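Since super resolves methods through the MRO, a quick way to see the lookup order is to print __mro__ itself. A minimal sketch (the class names here are made up for illustration):

```python
class A(object):
    pass

class B(A):
    pass

class C(A):
    pass

class D(B, C):
    pass

# super() follows this order when resolving a method on a D instance
print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
```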

In real-life coding, when we need to enhance a method from some module, we can easily use super to get things done.

And we don't even have to know the details of the base class we are extending.

super is only applicable to Python new-style classes (classes derived from object, e.g. class A(object)).

For Python 3 the syntax is simpler; the arguments can be omitted:

super().method(args)

In Python 2 the syntax of calling super is like below:

super(subClass, instance).method(args)
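To make the calling syntax concrete, here is a small runnable sketch (class and return values are made up for illustration) that works on Python 3; the zero-argument super() call inside B is equivalent to the explicit super(B, self) form:

```python
class A(object):
    def fancy_func(self):
        return 'called from A'


class B(A):
    def fancy_func(self):
        # Python 3 zero-argument form; super(B, self).fancy_func() works identically
        return 'B -> ' + super().fancy_func()


print(B().fancy_func())  # 'B -> called from A'
```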

Python __init__ :

If you declare an __init__ in your Python class, it will run when you initialize an object of that class.

__init__ acts like a constructor in other languages, but strictly speaking it is not one. The major difference between __init__ and other methods is that you can't return anything from it. You can add properties to the current object, like self.myProperty = 'TEST', and use them in any other method by accessing self.myProperty.

Simply put, __init__ is used when we want to control the initialization of objects of the class.
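A minimal sketch illustrating both claims above: __init__ runs automatically at object creation, and returning anything other than None from it raises a TypeError (the class and attribute names are made up for illustration):

```python
class Point(object):
    def __init__(self, x, y):
        # runs automatically when Point(...) is evaluated
        self.x = x
        self.y = y


class Broken(object):
    def __init__(self):
        return 42  # returning a value other than None is not allowed


p = Point(2, 3)
print(p.x + p.y)  # 5

try:
    Broken()
except TypeError:
    print('TypeError raised')
```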

Let's build something real with these features:

import requests
from BeautifulSoup import BeautifulSoup


class crawlPyCentral(object):

    def __init__(self, url=''):
        self.url = url

    def getSoup(self):
        response = requests.get(self.url)
        soup = BeautifulSoup(response.content)
        return soup

    def getTitles(self):
        soup = self.getSoup()
        uls = soup.findAll('ul', {'class': 'category-posts'})
        for ul in uls:
            lis = ul.findAll('a')
            for li in lis:
                yield li


class filteredCrawler(crawlPyCentral):

    def getTitles(self, keyword):
        for t in super(filteredCrawler, self).getTitles():
            if t.text.find(keyword) > 1:
                yield t.text


if __name__ == '__main__':
    f = filteredCrawler()
    for title in f.getTitles('1'):
        print title


In the above example we implement both concepts, __init__ and super. Here __init__ is used for setting the value of url while initializing the object, and super is being used to call crawlPyCentral's getTitles.

To dig deeper into super, check this blog post.

MongoDB backup script

Last year, while I was working on a project, I needed to automate the whole backup process, from taking a snapshot of the current db to saving it to AWS S3 buckets. At that time I took most of the stuff from this blog post.


A couple of days ago, I started coding a small backup script that backs up to another cloud machine rather than to AWS S3. Instead of coding it from scratch, I reused my previously written script. All I needed to implement was a bash function (save_in_cloud) which runs a simple scp command 🙂

The whole script looks like below:

#SET the db name which you want to backup
#SET the server path where you want to save the file
#SET your user name
#SET your host name or IP of the server

date_now=`date +%Y_%m_%d_%H_%M_%S`
current_year=`date +%Y`

log() {
    echo $1
}

do_cleanup() {
    rm -rf 'db_backup_'${current_year}*
    log 'cleaning up...'
}

do_backup() {
    log 'snapshotting the db and creating archive' && \
    ${MONGODB_SHELL} admin fsync_lock.js && \
    log 'db locked and creating backup'
    ${DUMP_UTILITY} -d ${DB_NAME} -o ${dir_name} && tar -jcf ${file_name} ${dir_name} && \
    ${MONGODB_SHELL} admin fsync_unlock.js && \
    log 'backed up data and created snapshot'
}

save_in_cloud() {
    log 'saving backup to another server...'
    scp ${file_name} ${SERVER_USER}@${HOST_NAME}:${CLOUD_PATH}
    log 'saved successfully'
}

do_backup && save_in_cloud && do_cleanup


In reusing this script, all I did was add a new function which copies the current backup data to a remote server. I also updated do_cleanup; now it works in any year.

The backup script depends on two other js files (fsync_lock.js and fsync_unlock.js), which are responsible for locking mongo during the db snapshot and releasing the lock after the snapshot.

Happy Coding 🙂

Factory pattern in Python

We use design patterns to build reusable solutions. Building reusable solutions is hard, and design patterns help us by giving common design solutions to recurring problems.

One of the important design patterns is the Factory Method pattern. In Python, an implementation of the factory pattern looks like below:

class Ladder(object):

    def __init__(self):
        self.height = 20


class Table(object):

    def __init__(self):
        self.legs = 4


my_factory = {
    "target1": Ladder,
    "target2": Table,
}

if __name__ == '__main__':
    print my_factory["target1"]().height

When to use factory pattern?

There are a couple of cases where we can use the factory pattern; one of them is when we need to create objects that depend on other objects.

That means when we are going to create a complex object that is based on other objects, the caller doesn't need to know the details of the other objects involved in the creation process. An example is like below:

class Train(object):

    def __init__(self):
        self.speed = 120


class Bus(object):

    def __init__(self):
        self.speed = 60


class Tram(object):

    def __init__(self):
        self.speed = 40


class System(object):

    def create(self, *args):
        return args


class TransportationSystem(object):

    def __init__(self):
        self.train = Train()
        self.bus = Bus()
        self.tram = Tram()

    def createTSSystem(self):
        s = System()
        t_system = s.create(self.train, self.tram, self.bus)
        for t in t_system:
            print t.speed


if __name__ == '__main__':
    T = TransportationSystem()
    T.createTSSystem()

The ideal situation is when we notice we are writing code just to gather information to create objects. Factories help by gathering object creation in a single place, and they also help to create a decoupled system.

If you have a better understanding or experience of using the factory pattern in your Python code, please share it in a comment.

Merging two Django QuerySets using itertools

I was working with a Django application where I needed to merge two querysets. After going through the Django ORM docs, I could not find anything helpful.

I was planning to do it in an unpythonic way, iterating over the two querysets and appending each item to a new list, but just before doing it I thought it would be better to google for it. After a couple of minutes I found it: we can use Python's itertools to merge two or more querysets, like below:

from itertools import chain
cars = Cars.objects.all()
trucks = Truck.objects.all()
all_vehicles = list(chain(cars, trucks))
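As a runnable sketch (plain lists stand in for the querysets, since the model classes above aren't available here), chain simply yields from each iterable in turn, and the merged result can still be sorted afterwards:

```python
from itertools import chain

cars = ['audi', 'bmw']        # stand-in for Cars.objects.all()
trucks = ['scania', 'volvo']  # stand-in for Truck.objects.all()

all_vehicles = list(chain(cars, trucks))
print(all_vehicles)  # ['audi', 'bmw', 'scania', 'volvo']
```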

Python's itertools is an amazing module that contains really handy functions for working with iterators and performing different kinds of operations on them. If you have never used itertools before, you are missing one of the charms of Python.

Check the itertools.chain docs for details.

Happy Coding!

Comment notification plugin in wordpress

Maybe it was at the end of last year that I was assigned to finish a WordPress project which was taking way too long to deliver. I had been away from WordPress development for more than two years, so after jumping into the project I found that most of the user requirements were a bit different from the usual ones.

One of the requirements was that the WordPress admin should be able to use the admin comment section to send comment replies as email, whether or not the user is subscribed to the post/comment. If someone comments on the content and the admin approves it with a reply, the user should receive an email with the reply.

WordPress provides a nice action called "comment_post". I wrote a simple method which is executed after every comment, checks whether it is approved, and sends an email. Check the full plugin on github.

By the way, sending email without a subscription to the post or comment is not quite "ethical", so if you want to use it, use it at your own risk.

Also feel free to fork it if you want to add to or customize it 🙂

Painless deployment with Fabric

Deployment of code to test/staging/production servers is one of the important parts of the modern web application development cycle.

Deploying code used to be painful because of the repetitive tasks we have to do every time we want to push code, and if something goes wrong during deployment, the application goes down too. But the scenario has changed: now we have many tools that make deployment easier and fun. I have used Capistrano and Fabric for deployment, and found Fabric really painless; as it is a Python battery, it was easier for me to adopt and get things done.

I am going to cover the fundamental operations, and finally a simple fabric script (like a boilerplate) for writing your own.

env = a Python dictionary-like subclass where we define specific settings like password, user, etc.

local = runs a command on the local host (where the fabric script is being run)

run = runs a command on a remote host

You can use these core tasks in many different ways; to do that, check the official Fabric documentation here.

from fabric.api import local, run, env, put

env.graceful = False


def test_server():
    env.user = 'your_user_name'
    env.serverpath = '/'
    env.site_root = 'your_app_root'
    env.password = 'your_pass'  # ssh password for user
    # env.key_filename = ''  # specify server public key
    # list of hosts in env.hosts
    env.hosts = []


# sample method for git pull
def pull(branch_name):
    env.site_root = 'your_project_path'
    run('cd %s && git pull origin %s' % (env.site_root, branch_name))


# deploy all code in the current directory
def deploy():
    env.files = '*'
    env.site_name = 'your_app_name'
    env.site_path = 'your_application_path'
    run('rm -rf %s/%s' % (env.site_path, env.site_name))
    local('zip -r %s.zip %s' % (env.site_name, env.files))
    put('%s.zip' % env.site_name, env.site_path)
    run('cd %s && unzip %s.zip && rm %s.zip' % (env.site_path, \
        env.site_name, env.site_name))
    local('rm -rf %s.zip' % env.site_name)


# restart apache on the remote host
def restart_apache():
    cmd = "/usr/local/apache2/bin/apachectl -k graceful" if (env.graceful is True) \
        else "service httpd restart"
    run(cmd)


def latest_access_log():
    run("tail -n 10 /var/log/apache2/access.log")


def latest_error_log():
    run("tail -n 10 /var/log/apache2/error.log")


sudo apt-get install python-pip
sudo pip install fabric


The first snippet is a sample fabric script; the second is a bash snippet to install fabric on your ubuntu machine.

After setting the username, password, and host information in the script, you can check your server's access log by running fab test_server latest_access_log.

I have been using fabric for around two years, in different small, medium, and large projects.

There are many interesting open source projects built on top of Fabric. I found these two projects really promising.



Search through github, and you will find many advanced uses of Fabric.

Happy Coding!

Pythonic way to calculate Standard Deviation

If you are familiar with basic statistics, I think you know what Standard Deviation is; if you don't, you can check the wiki for details.

And if it still seems hard to wrap your brain around the idea, check this thread. Hope you understand it now. Standard deviation is useful when you want to understand a set of data, and it is widely used in different industries. I was working on an algorithm a couple of months ago where I had to calculate the standard deviation of a series of data, and the sets of data were large.

After coding a couple of versions, I wrote a small Python class which calculates the standard deviation of the data. Check it out:

from __future__ import division
from math import sqrt, pow


class StandardDeviation(object):

    def do_round(self, data):
        data = "%.3f" % round(data, 3)
        return float(data)

    def do_diff(self, n, mean):
        return pow((n - mean), 2)

    def standDev(self, data_list):
        mean = sum(data_list) / len(data_list)
        result = sqrt(sum([self.do_diff(s, mean) for s in
                           data_list]) / len(data_list))
        return float(self.do_round(result))


if __name__ == '__main__':
    data_list = [2, 4, 4, 4, 5, 5, 7, 9]
    std = StandardDeviation()
    print std.standDev(data_list)
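As a sanity check, on Python 3.4+ the standard library's statistics.pstdev computes the same population standard deviation, so the class above can be verified against it:

```python
from statistics import pstdev

data_list = [2, 4, 4, 4, 5, 5, 7, 9]
# population standard deviation: mean is 5, variance is 4, so the result is 2.0
print(pstdev(data_list))  # 2.0
```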

Happy Coding!

Summary of 2012

As this is a weblog for myself, like every other blogger in the world I share my yearly summary, as I have for the last couple of years. For 2012 it was…

  • I switched jobs and started working at Tasawr Interactive last September.
  • Worked with the NewsCred API; it was an amazing experience, because the NewsCred API is a scaled solution with a large code base. I got a real taste of working on a complex Python project while I worked with the API team.
  • I had to brush up my Drupal knowledge and learned Drupal 7.
  • Started to work on my pet project, which is completely based on Python/Django.
  • Developed a small financial API for one of my clients (I used tastypie for the first time and loved it).
  • Worked for Indexica as a remote developer, where my role was DevOps. I developed a portion of their API and also managed the cloud infrastructure.
  • Worked with SOLR and Nutch for the first time and loved both of them.

Iterating over multiple lists at a time

I was writing a Python script where I needed to iterate over multiple iterables. At first I was planning to loop through each of the items separately, but I found that really non-pythonic, so I went through the Python docs to figure out whether I could make the code more pythonic. We can iterate over multiple iterables in two main ways: using zip() and using itertools.izip.

from itertools import izip

A = [1, 2]
B = [3, 4]
C = [5, 6]


def iterate_using_zip(item1, item2, item3):
    for a, b, c in zip(item1, item2, item3):
        print a, b, c


def iterate_using_izip(item1, item2, item3):
    for k, l, m in izip(item1, item2, item3):
        print k, l, m


if __name__ == '__main__':
    iterate_using_izip(A, B, C)
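Note that in Python 3, itertools.izip is gone; the built-in zip() is itself lazy there, so a single version covers both behaviours. A minimal sketch:

```python
A = [1, 2]
B = [3, 4]
C = [5, 6]

# In Python 3, zip() returns a lazy iterator, like Python 2's izip
for triple in zip(A, B, C):
    print(triple)  # (1, 3, 5) then (2, 4, 6)
```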