Archive for the ‘Programming-Python’ Category

Pandas is a Python library used for data manipulation and analysis. The name comes from the term “panel data”. It is open source, released under the three-clause BSD License. Wes McKinney started developing it in 2008 while he worked at AQR Capital Management, where he needed it for quantitative analysis of financial data. It is written in Python, Cython and C.

Pandas is widely used to prepare and explore data for machine learning.

There are a lot of features available that you can use for:
-reading and writing various data formats: CSV, MS Excel, JSON, HTML, SAS, SPSS, SQL, Google BigQuery, Stata, Msgpack (removed in pandas 1.0), etc.
-grouping, joining, merging, filtering, pivoting and reshaping data sets.
-time series functions, and many more.
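
As a rough illustration of the filter, group and merge features listed above, here is a minimal sketch. The sales and managers tables, their column names and their values are made up for this example, and it assumes pandas is already installed.

```python
import pandas as pd

# Hypothetical data, invented only for this illustration
sales = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "amount": [100, 150, 200, 50],
})
managers = pd.DataFrame({
    "region": ["east", "west"],
    "manager": ["Ann", "Bob"],
})

large = sales[sales["amount"] >= 100]               # filter rows
totals = sales.groupby("region")["amount"].sum()    # group and aggregate
merged = sales.merge(managers, on="region")         # merge two tables

print(large)
print(totals)
print(merged)
```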

Installation
In this tutorial I use Python 3.6.9 (default, Nov 7 2019, 10:44:02), so the installation command is: pip3 install pandas.
From the Linux terminal, type:

$ pip3 install pandas
Collecting pandas
  Downloading https://files.pythonhosted.org/packages/bb/71/8f53bdbcbc67c912b888b40def255767e475402e9df64050019149b1a943/pandas-1.0.3-cp36-cp36m-manylinux1_x86_64.whl (10.0MB)
    100% |████████████████████████████████| 10.0MB 48kB/s 
Collecting python-dateutil>=2.6.1 (from pandas)
  Using cached https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl
Collecting numpy>=1.13.3 (from pandas)
  Downloading https://files.pythonhosted.org/packages/07/08/a549ba8b061005bb629b76adc000f3caaaf881028b963c2e18f811c6edc1/numpy-1.18.2-cp36-cp36m-manylinux1_x86_64.whl (20.2MB)
    100% |████████████████████████████████| 20.2MB 45kB/s 
Collecting pytz>=2017.2 (from pandas)
  Using cached https://files.pythonhosted.org/packages/e7/f9/f0b53f88060247251bf481fa6ea62cd0d25bf1b11a87888e53ce5b7c8ad2/pytz-2019.3-py2.py3-none-any.whl
Collecting six>=1.5 (from python-dateutil>=2.6.1->pandas)
  Using cached https://files.pythonhosted.org/packages/65/eb/1f97cb97bfc2390a276969c6fae16075da282f5058082d4cb10c6c5c1dba/six-1.14.0-py2.py3-none-any.whl
Installing collected packages: six, python-dateutil, numpy, pytz, pandas
Successfully installed numpy-1.18.2 pandas-1.0.3 python-dateutil-2.8.1 pytz-2019.3 six-1.14.0
$
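
One simple way to confirm that the installation worked is to import pandas from Python and print its version. The number printed depends on what pip installed on your machine; with the log above it should match the pandas version shown in the last line.

```python
import pandas as pd

# Prints the installed pandas version string
print(pd.__version__)
```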


It’s fun that math formulas can spell out the word ‘LOVE’.
Type the code below in your Python IDE and run it.

import matplotlib.pyplot as plt
import numpy as np
L=np.arange(0.1,6,0.1)             #start at 0.1 to avoid dividing by zero
O=np.arange(-3,3,0.1)
V=np.arange(-2,3,1)
E=np.arange(-3,3,0.1)
fig,(ax1,ax2,ax3,ax4)=plt.subplots(ncols=4)
ax1.set_title(r'$ y=\frac{1}{x}$') #display Math formula
ax1.plot(L,1/L)                    #draw L
ax2.set_title(r'$ x^2+y^2=9$')     #display Math formula
ax2.plot(O,(9-O**2)**0.5)          #draw O: upper half
ax2.plot((9-O**2)**0.5,O)          #right half
ax2.plot(O,-(9-O**2)**0.5)         #lower half
ax3.set_title(r'$ y=|-2x| $')      #display Math formula
ax3.plot(V,(abs(-2*V)))            #draw V
ax4.set_title(r'$ x=-3|\sin y| $') #display Math formula
ax4.plot(-3*abs(np.sin(E)),E)      #draw E
plt.show()


A pie chart is very useful for visualizing the relative proportions of a data set and is easy to understand. The size of each slice is calculated from its share of the total quantity.
For example:
There are 3 colors in a circle: Grey 25, Blue 25 and Green 50.
So the total circle size is: 25+25+50=100
The size of each color will be:

Grey  → 25/100 = 0.25 x 100% = 25%
Blue  → 25/100 = 0.25 x 100% = 25%
Green → 50/100 = 0.50 x 100% = 50%

Let’s say the total size is not 100: Grey 30, Blue 20 and Green 40.
The total size is: 30+20+40=90.
What will the percentages be?

Grey  → 30/90 = 0.3333 x 100% = 33.33%
Blue  → 20/90 = 0.2222 x 100% = 22.22%
Green → 40/90 = 0.4444 x 100% = 44.44%
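
The same arithmetic can be sketched in a few lines of plain Python. The helper name percentages is made up for this example; it simply divides each value by the total and rounds to two decimals.

```python
def percentages(sizes):
    """Return each value's share of the total, in percent (2 decimals)."""
    total = sum(sizes.values())
    return {name: round(value / total * 100, 2) for name, value in sizes.items()}

first = percentages({"Grey": 25, "Blue": 25, "Green": 50})
second = percentages({"Grey": 30, "Blue": 20, "Green": 40})

print(first)    # {'Grey': 25.0, 'Blue': 25.0, 'Green': 50.0}
print(second)   # {'Grey': 33.33, 'Blue': 22.22, 'Green': 44.44}
```

When you draw the chart with Matplotlib, plt.pie does this normalization for you, and its autopct argument (for example autopct='%1.2f%%') prints the percentage on each slice.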


A histogram is also a bar-type chart. The main differences between a bar chart and a histogram are:
A bar chart compares numeric data among categories.
A histogram shows how numeric data within one category is distributed across ‘bins’ or ‘buckets’. A ‘bin’ here is a range of values that groups the data.

For example:
suppose a town has a population of 24 people with the ages below:

5,6,7,5,6,4,10,15,14,13,30,35,23,36,45,49,40,51,55,53,60,65,66,70

and you want to visualize them in a graph.

Let’s try.
Open your Python IDE and type the code below.

import matplotlib.pyplot as plt
age=[5,6,7,5,6,4,10,15,14,13,30,35,23,36,45,49,40,51,55,53,60,65,66,70]
plt.hist(age)
plt.show()
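
To see what a ‘bin’ means in numbers, the sketch below counts the same ages with np.histogram. The 10-year bin edges are my own choice for illustration; plt.hist(age) without a bins argument picks 10 equal-width bins by itself, so its bars will look different.

```python
import numpy as np

age = [5,6,7,5,6,4,10,15,14,13,30,35,23,36,45,49,40,51,55,53,60,65,66,70]

# Assumed bin edges: 0-10, 10-20, ..., 60-70 (the last bin includes 70)
counts, edges = np.histogram(age, bins=[0, 10, 20, 30, 40, 50, 60, 70])

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {count} people")
```

The same list of edges can be passed to Matplotlib as plt.hist(age, bins=[0,10,20,30,40,50,60,70]) to draw these bins.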


In this tutorial, I will show you how to create a bar graph. Many people use bar graphs because they make it easy to present a comparison between 1, 2 or 3 values. More than that is not recommended because the comparison becomes difficult to read.

A bar chart can be vertical or horizontal, depending on what you need.

Single Bar Vertical
Let’s try a single vertical bar chart. Type the code below in your Python text editor and run it.

import matplotlib.pyplot as plt
month=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
revenue=[100,110,120,100,90,115,70,90,140,100,110,120]
plt.bar(month,revenue,label='Revenue')   #the label is what plt.legend() displays
plt.title('Simple Bar Graph') 
plt.xlabel('Month') 
plt.ylabel('Revenue (K)USD')
plt.grid(linestyle='dotted',axis='y')
plt.legend() 
plt.show()
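
For the horizontal variant mentioned earlier, a minimal sketch with the same data could use barh instead of bar. It is only an illustration: the title text is my own, and the axis labels are swapped because the revenue now runs along the x axis.

```python
import matplotlib.pyplot as plt

month = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
revenue = [100,110,120,100,90,115,70,90,140,100,110,120]

fig, ax = plt.subplots()
bars = ax.barh(month, revenue)            # horizontal bars, one per month
ax.set_title('Simple Horizontal Bar Graph')
ax.set_xlabel('Revenue (K)USD')           # values are now on the x axis
ax.set_ylabel('Month')
ax.grid(linestyle='dotted', axis='x')
plt.show()
```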


Matplotlib is a plotting library for Python. It was originally written by John D. Hunter during his post-doctoral research in neurobiology to visualize electrocorticography (ECoG) data of epilepsy patients. Michael Droettboom led the development project after John Hunter passed away in August 2012.
Matplotlib version 2.0.x supports Python versions 2.7 – 3.6.
Since 2020, Matplotlib no longer supports Python 2.

Using Matplotlib with Python is easy.

I will show you how to visualize a simple graph.
In this tutorial, I use IDLE as the Python editor.
If you don’t have it, you can install it with the command below from your Linux terminal.
$ sudo apt-get install idle

Run IDLE.

From the menu, select File>New File to open the editor.
Type the code below and press F5 to run it.

import matplotlib.pyplot as plt
x=[1,2,3,4]
y=[5,6,7,8]
plt.plot(x,y)
plt.show()

You have your first graph with Matplotlib.
It’s easy, right?

Slicing means accessing part of an array’s contents.

The syntax is:
start:stop:step

Indexes start at 0, and the item at the stop index is not included. For an array x that holds 1,2,3,4,5,6,7,8,9:

x[1:5]     → index 1 up to (not including) index 5 → 2,3,4,5
x[5:]      → from index 5 to the end → 6,7,8,9
x[:6]      → from the beginning up to (not including) index 6 → 1,2,3,4,5,6
x[:]       → all items → 1,2,3,4,5,6,7,8,9
x[1:9:2]   → index 1 up to index 9, step 2 → 2,4,6,8
x[-1]      → the last item → 9
x[-2]      → the 2nd item from the end → 8
x[:-3]     → all except the last 3 items → 1,2,3,4,5,6
x[::-1]    → all items, reversed → 9,8,7,6,5,4,3,2,1
x[2::-1]   → the first 3 items, reversed → 3,2,1
x[:-4:-1]  → the last 3 items, reversed → 9,8,7
x[-2::-1]  → all except the last item, reversed → 8,7,6,5,4,3,2,1

Open your Linux Terminal and practice it.

$ python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> x=np.array([1,2,3,4,5,6,7,8,9])
>>> x
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
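
If you prefer to check several slices at once, the short script below runs a few of them on the same array and prints the results. It is only a convenience sketch; typing each expression at the >>> prompt works just as well.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

print(x[1:5])     # [2 3 4 5]  stop index 5 is not included
print(x[:6])      # [1 2 3 4 5 6]
print(x[1:9:2])   # [2 4 6 8]
print(x[:-3])     # [1 2 3 4 5 6]
print(x[:-4:-1])  # [9 8 7]
print(x[-2::-1])  # [8 7 6 5 4 3 2 1]
```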

In this article, I will show you how to do basic operations on an array.

>>> import numpy as np
>>> x=np.array([[1,2,3,4],[5,6,7,8]])
>>> x
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
>>> x.ravel()
array([1, 2, 3, 4, 5, 6, 7, 8])

The ravel() function flattens the whole array into a 1-dimensional array.

>>> x
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

But the operation does not change the original array.
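
One caveat worth knowing: calling ravel() leaves x alone, but for an array like this one the result is normally a view on the same memory, so writing into the result does change x. The sketch below compares it with flatten(), which always returns a copy. The variable names are my own.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

flat_copy = x.flatten()   # always an independent copy
flat_view = x.ravel()     # a view here, because x is contiguous in memory

flat_copy[0] = 99         # x is not affected
print(x[0, 0])            # 1

flat_view[0] = 100        # writes through to x
print(x[0, 0])            # 100
```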

>>> x.min()
1

Minimum value in the array

>>> x.max()
8

Maximum value in the array

>>> x.mean()
4.5

Average of the values in the array

>>> x.sum()
36

Total of all values in the array

>>> x.sum(axis=0)
array([ 6,  8, 10, 12])
>>> x.sum(axis=1)
array([10, 26])
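
With axis=0 the sum runs down each column, and with axis=1 it runs across each row. The same argument works for the other functions above, as this small sketch shows (the variable names are my own):

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

column_sums = x.sum(axis=0)    # 1+5, 2+6, 3+7, 4+8
row_sums = x.sum(axis=1)       # 1+2+3+4, 5+6+7+8
column_means = x.mean(axis=0)  # average of each column
row_max = x.max(axis=1)        # largest value in each row

print(column_sums)   # [ 6  8 10 12]
print(row_sums)      # [10 26]
print(column_means)  # [3. 4. 5. 6.]
print(row_max)       # [4 8]
```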


>>> np.sqrt(x)
array([[1.        , 1.41421356, 1.73205081, 2.        ],
       [2.23606798, 2.44948974, 2.64575131, 2.82842712]])

Square root of each value in the array.

>>> y=np.array([[1,1,1,1],[1,1,1,1]])
>>> y
array([[1, 1, 1, 1],
       [1, 1, 1, 1]])
>>> x+y
array([[2, 3, 4, 5],
       [6, 7, 8, 9]])
>>> x-y
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])
>>> x*y
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
>>>
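
Note that +, - and * work element by element, so x*y is not a matrix product. As a small sketch of two related points: adding the scalar 1 gives the same result as adding an array of ones, and the @ operator (Python 3.5 or newer) does the matrix product. The variable names are my own.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
y = np.ones((2, 4), dtype=int)   # the same array of ones as above

plus_scalar = x + 1              # the scalar is applied to every element
matrix_product = x @ y.T         # (2x4) @ (4x2) gives a 2x2 matrix

print(np.array_equal(plus_scalar, x + y))   # True
print(matrix_product)
```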
