Archive for April 14th, 2020

I try to make comparison about Memory Layout between x86-32 bit and x86-64 bit in Linux.

I run 3 different programs, code written in Assembly, C language and python to see how they are loaded into the memory. Then each program I make another 2 copy. So, each language will represent 3 programs.

I use tmux in Linux Terminal, so we can see all programs run together.

Before we start, you have to know the concept behind memory layout in Linux so you will understand what I’m going to explain in this article.

Memory Layout in Linux
For Architecture 32 bit, at the time a program is loaded into memory, all sections of the programs are loaded into each part of the memory. All codes and data which are declared all brought together, even if the source code is separated. The instruction in .text section is loaded into address 0x0804800. Followed by .data section and .bss section. The last address of linux is 0xbFFFFFF.

See the picture below:

In this tutorial, I use AMD processor (Intel compatible). Since the byte order use Little Endian which is start with LSB (lowest significant byte), we have to read the memory from bottom to up.

Virtual Memory Organized
If all programs are loaded in the same location in the memory, why it never over lap each other?
It’s because the program only access the Virtual Memory.
Physical Memory is RAM chip of your computer.
Virtual Memory is the way program think about the memory.
Before a program is loaded in a memory, Linux will search for the empty physical memory that big enough to hold the program. Then it will tell processor to pretend that this memory address is real address of 0x08048000 for the program to stay. After that each program will have it’s own sandbox to play.

Every program will believe that they are stand alone and enjoy all they memory that they have.
So, the address that program believe to use is named Virtual Address meanwhile the real address in the memory chip is named Physical Address.
Process that pointed virtual address to physical address is named Mapping.

Multi tasking in Linux
The Core of Linux is a Block Code that is name Kernel. Memory System is marked as Kernel Space and User Space. Communication in between is handled by System Call. Access to Hardware is limited in software that is run in Kernel Space and only can be done via Kernel Mode Device Drivers.

VDSO (Virtual Dynamically Linked Share Object).
VDSO is memory area allocated in user space for kernel functionalities purpose. It’s kernel mechanism that is used for program to call Kernel Space routines. VDSO use standard mechanism for linking and loading ELF format (Executable and Linkable Format).


Read Full Post »

Pandas in Python Library that is used for Data Manipulation and Analysis. It came from terms “Panel Data”. It’s open source under three-clause BSD License. Original developer was Wes McKinney in 2008 while he worked at AQR Capital Management to process Quantitative Analysis on financial data. It was written in Python, Cython and C.

Pandas in mainly used for Machine Learning.

There a lot of features available that you can used for:
-reading and writing various data format, csv, MS excel, json, html, SAS, SPSS, SQL, Google Big Query, Stata, Msgpack etc.
-Group, Join, Merge, Filter, Pivot, Reshaping data set.
-Time series function and so many more.

In this tutorial I use Python 3.6.9 (default, Nov 7 2019, 10:44:02), so the installation command will be: pip3 install pandas.
From Linux terminal type:

$ pip3 install pandas
Collecting pandas
  Downloading https://files.pythonhosted.org/packages/bb/71/8f53bdbcbc67c912b888b40def255767e475402e9df64050019149b1a943/pandas-1.0.3-cp36-cp36m-manylinux1_x86_64.whl (10.0MB)
    100% |████████████████████████████████| 10.0MB 48kB/s 
Collecting python-dateutil>=2.6.1 (from pandas)
  Using cached https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl
Collecting numpy>=1.13.3 (from pandas)
  Downloading https://files.pythonhosted.org/packages/07/08/a549ba8b061005bb629b76adc000f3caaaf881028b963c2e18f811c6edc1/numpy-1.18.2-cp36-cp36m-manylinux1_x86_64.whl (20.2MB)
    100% |████████████████████████████████| 20.2MB 45kB/s 
Collecting pytz>=2017.2 (from pandas)
  Using cached https://files.pythonhosted.org/packages/e7/f9/f0b53f88060247251bf481fa6ea62cd0d25bf1b11a87888e53ce5b7c8ad2/pytz-2019.3-py2.py3-none-any.whl
Collecting six>=1.5 (from python-dateutil>=2.6.1->pandas)
  Using cached https://files.pythonhosted.org/packages/65/eb/1f97cb97bfc2390a276969c6fae16075da282f5058082d4cb10c6c5c1dba/six-1.14.0-py2.py3-none-any.whl
Installing collected packages: six, python-dateutil, numpy, pytz, pandas
Successfully installed numpy-1.18.2 pandas-1.0.3 python-dateutil-2.8.1 pytz-2019.3 six-1.14.0


Read Full Post »