As a continuation of the previous post on the voting data analysis, we will create some graphs and charts based on the results we got.
Note: The library used is matplotlib, which helps in creating graphical charts based on our data.
The packages to install are matplotlib and numpy; the modules to import are numpy and matplotlib.pyplot.
A brief note on these two modules:
· pyplot is one way to plot graph data with Matplotlib. It is modelled on the way charting works in the popular commercial program MATLAB.
· import matplotlib.pyplot as plt
· NumPy is a module providing lots of numeric functions for Python.
· import numpy as np
Bar Chart :
We will be using the library matplotlib to create some graphical charts
based on our data.
· Inline Charts:
%matplotlib inline
The command above tells IPython that we want charts to be created in "inline style" inside our notebook and not in a separate window.
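%matplotlib inline only works inside IPython or a notebook. When the same plotting code runs as a plain Python script, one option is to write the chart to an image file instead. The following is a minimal sketch, assuming a non-interactive backend is acceptable; the bar values and the file name chart_sketch.png are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: draw to files, not to a window
import matplotlib.pyplot as plt

# Hypothetical values, only to have something to draw
fig, ax = plt.subplots()
ax.bar([0, 1, 2], [3, 5, 2])

# Write the chart to a PNG file in the system's temporary directory
out_path = os.path.join(tempfile.gettempdir(), "chart_sketch.png")
fig.savefig(out_path)
plt.close(fig)
```

Inside a notebook this step is not needed, since the inline setting displays the chart directly below the cell.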
The code used to create the chart is explained below; the complete listing appears under "Final Code for Bar Chart".
Explanation :
We captured each radish variety and its vote count in two lists:
names = []
votes = []
Then we created a range of indexes for the X values in the graph, one entry for each item in the "counts" dictionary (i.e. len(counts)), numbered 0, 1, 2, 3, etc. This spreads the graph bars evenly across the X axis of the plot.
np.arange is a NumPy function like the range() function in Python, except that the result it produces is a "NumPy array".
plt.bar() creates a bar graph, using the "x" values as the X axis positions and the values in the votes list (i.e. the vote counts) as the height of each bar.
The Output :
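The steps explained above can be tried on their own with a small made-up dictionary. This is a minimal sketch, not the survey data: the variety names and counts below are hypothetical placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical vote counts, standing in for the real "counts" dictionary
counts = {"Champion": 12, "Plum purple": 7, "Daikon": 3}

# Split the dictionary into two parallel lists
names = []
votes = []
for radish in counts:
    names.append(radish)
    votes.append(counts[radish])

# One X position per variety: 0, 1, 2, ...
x = np.arange(len(counts))
print(type(x).__name__, x.tolist())

# Bar heights come from the votes list
bars = plt.bar(x, votes)
plt.xticks(x, names)
plt.ylabel('Votes')
```

In a notebook with inline charts enabled, the figure appears below the cell; in a plain script, plt.show() or plt.savefig() would be needed to see it.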
Final Code for Bar Chart :
import matplotlib.pyplot as plt
import numpy as np

# Create an empty dictionary for associating radish names
# with vote counts
counts = {}
# Names of people who tried to vote more than once
fraud = []
# Create an empty list with the names of everyone who voted
voted = []

# Clean up (munge) a string so it's easy to match against other strings
def clean_string(s):
    return s.strip().capitalize().replace("  ", " ")

# Check if someone has voted already and return True or False
def has_already_voted(name):
    if name in voted:
        fraud.append(name)
        return True
    return False

# Count a vote for the radish variety named 'radish'
def count_vote(radish):
    if radish not in counts:
        # First vote for this variety
        counts[radish] = 1
    else:
        # Increment the radish count
        counts[radish] = counts[radish] + 1

# Return the variety (or varieties, if tied) with the most votes
def max_voted_radish(stats):
    return [key for key, val in stats.items() if val == max(stats.values())]

# Return the variety (or varieties, if tied) with the fewest votes
def min_voted_radish(stats):
    return [key for key, val in stats.items() if val == min(stats.values())]

with open("radishsurvey.txt") as survey:
    for line in survey:
        line = line.strip()
        name, vote = line.split(" - ")
        name = clean_string(name)
        vote = clean_string(vote)
        if not has_already_voted(name):
            count_vote(vote)
            voted.append(name)

names = []
votes = []
# Split the dictionary of name:votes into two lists, one for names and one for vote count
for radish in counts:
    names.append(radish)
    votes.append(counts[radish])

# X positions of the bars with the most and the fewest votes
mxpos = votes.index(max(votes))
mnpos = votes.index(min(votes))

# The X axis can just be numbered 0,1,2,3...
x = np.arange(len(counts))
plt.bar(x, votes)
# Bars are centred on x in current Matplotlib, so the labels go at x
plt.xticks(x, names, rotation=90)
plt.yticks(np.arange(0, max(votes) + 20, 10))
plt.ylabel('Votes')
plt.xlabel('Voters')
plt.title('Leader Board')
# Point an arrow at the top of the tallest bar
plt.annotate('max vote ' + str(max(votes)),
             xy=(mxpos, max(votes)),
             xytext=(mxpos + 1.5, max(votes) + 5),
             arrowprops=dict(facecolor='red', shrink=0.05),
             )
plt.show()
Output:
Pie Chart :
Code :
import matplotlib.pyplot as plt
import numpy as np

# Create an empty dictionary for associating radish names
# with vote counts
counts = {}
# Names of people who tried to vote more than once
fraud = []
# Create an empty list with the names of everyone who voted
voted = []

# Clean up (munge) a string so it's easy to match against other strings
def clean_string(s):
    return s.strip().capitalize().replace("  ", " ")

# Check if someone has voted already and return True or False
def has_already_voted(name):
    if name in voted:
        fraud.append(name)
        return True
    return False

# Count a vote for the radish variety named 'radish'
def count_vote(radish):
    if radish not in counts:
        # First vote for this variety
        counts[radish] = 1
    else:
        # Increment the radish count
        counts[radish] = counts[radish] + 1

# Return the variety (or varieties, if tied) with the most votes
def max_voted_radish(stats):
    return [key for key, val in stats.items() if val == max(stats.values())]

# Return the variety (or varieties, if tied) with the fewest votes
def min_voted_radish(stats):
    return [key for key, val in stats.items() if val == min(stats.values())]

with open("radishsurvey.txt") as survey:
    for line in survey:
        line = line.strip()
        name, vote = line.split(" - ")
        name = clean_string(name)
        vote = clean_string(vote)
        if not has_already_voted(name):
            count_vote(vote)
            voted.append(name)

names = []
votes = []
# Split the dictionary of name:votes into two lists, one for names and one for vote count
for radish in counts:
    names.append(radish)
    votes.append(counts[radish])

# Convert the vote counts to percentages of the total
vts = [(float(x) / float(sum(votes))) * 100.0 for x in votes]
sizes = vts
# Pick a set of colours from the Set1 colormap
cs = plt.cm.Set1(np.arange(40) / 40.)
# Pull the slice with the most votes slightly out of the pie
expl = [0.1 if v == max(vts) else 0 for v in vts]

plt.pie(sizes, explode=expl, labels=names, colors=cs,
        autopct='%1.1f%%', shadow=True, startangle=90)
# Set aspect ratio to be equal so that pie is drawn as a circle.
plt.axis('equal')
plt.show()
Output :
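The two steps in the pie chart code that differ from the bar chart are turning the vote counts into percentages and deciding which slice to pull out. They can be checked in isolation with a few made-up numbers. This is only a sketch: the vote counts below are hypothetical, and the variable name explode is chosen here to match the plt.pie() keyword.

```python
# Hypothetical vote counts for three varieties
votes = [12, 6, 6]

# Each slice size is that variety's share of all votes, in percent
total = float(sum(votes))
sizes = [v / total * 100.0 for v in votes]

# Offset only the largest slice(s) by 0.1 of the radius; leave the rest at 0
explode = [0.1 if s == max(sizes) else 0 for s in sizes]

print(sizes)
print(explode)
```

The two lists line up position by position with the names list, which is how plt.pie() matches each label to its slice and its offset.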