Saturday, 15 February 2014

Byte ordering within your memory...

Let's Explore.

Interestingly, just as humans write in both left-to-right and right-to-left scripts, computers can also write data to and read it from memory in either direction.

If you want to sum up the whole title of this blog post in a single word of computer terminology, that word is endian, or endianness.

In a byte-addressable system, each byte has an index called its address. A word (four bytes on a 32-bit system) is stored in consecutive memory addresses. The question is how: which part of the data (the most significant byte or the least significant byte) is stored at the first memory address?

We have mainly two options: big-endian and little-endian. (I say 'mainly' because, besides these two methods, there are also bi-endian and middle-endian.)

Big-endian

  • In this system, the most significant byte of the word is stored at the smallest address and the least significant byte at the largest. The big-endian format is also known as the Motorola convention.
  • In other words, numeric significance decreases as memory addresses (or time) increase.
Little-endian
  • In this system, the least significant byte is stored at the smallest address. Intel processors mainly use the little-endian format, which is also known as the Intel convention.
  • In other words, numeric significance increases as memory addresses (or time) increase.

Let's see how your system lays out data in memory.

I am using Debian Linux on an Intel Centrino, which is an x86 machine and therefore little-endian.

The following macro casts the address of the integer to an unsigned character pointer, reads the first byte, and checks whether it is 0 or 1. The rest of the code is self-explanatory... I guess!!


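#include <stdio.h>

int i = 1;

/* Look at the byte stored at the lowest address of i:
   0 means big-endian, 1 means little-endian. */
#define endian() \
     (((*(unsigned char *)&i) == 0) \
     ? printf("Big-endian\n") \
     : printf("Little-endian\n"))

int main(void) {
    endian();
    return 0;
}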

In the following code, we take the address of an integer (int), cast it to an unsigned character pointer (unsigned char *), and then print the value of each byte in hex.


#include <stdio.h>
 
typedef unsigned char * char_ptr;
 
void show_bytes(char_ptr start, int length);
 
int main(void) {
    int a = 12345;
    int *ptr = &a;
    show_bytes((char_ptr) ptr, sizeof(int));
    return 0;
}
 
void show_bytes(char_ptr start, int length) {
    int i;
    for (i = 0; i < length; i++) {
        printf("%.2x\n", start[i]);
    }
    printf("\n");
}

The hexadecimal value of 12345 is 0x3039, or 0x00003039 as a four-byte int. So on a big-endian system the program prints the bytes 00 00 30 39 (one per line), while on a little-endian system it prints 39 30 00 00.

Monday, 2 December 2013

Linux Memory Paging Model

The key task of the paging unit is to prevent a process from accessing linear addresses that do not belong to it, and to convert linear addresses into physical addresses.

A group of contiguous, fixed-size linear addresses is called a page, and each page is mapped to contiguous physical addresses. Why a page and not a single linear address? There are two reasons. First, operations can be performed on it efficiently; second, access rights can be applied to the whole page instead of to every linear address in it.

The paging unit thinks of all RAM as partitioned into fixed-length page frames. The size of a page and of a page frame is identical. The most widely used page size is 4 KB, but it can also be 2 MB or 4 MB. The data structures that map linear addresses to physical addresses are called page tables.

This command shows the page size in bytes on a Linux box:

~]$ getconf PAGESIZE

x86 processors support several modes: real mode, 32-bit protected mode and 64-bit protected mode (more precisely, long mode). When the computer boots, the processor runs in real mode, for backward compatibility within the x86 processor family. After preparing all the data structures it will need in protected mode, the kernel enables protected mode by setting the PE bit in the CR0 processor register. At this point paging is still disabled; paging is an optional feature of the processor. The kernel enables the paging unit by setting the PG bit, which is also in CR0.
Let us see how physical memory is addressed.
Most computer architectures are byte-addressable, which means data can be accessed 8 bits at a time.

Real-mode:
    data bus: 16-bit
    address bus: 20-bit
    physical memory limit: 1 MB

    2^20 = 1048576 addresses = 1048576 * 1 byte = 1 MB

32-bit protected mode:
    data bus: 32-bit
    address bus: 32-bit
    physical memory limit: 4 GB

    2^32 = 4294967296 addresses = 4294967296 * 1 byte = 4096 MB = 4 GB

64-bit protected mode:
    data bus: 64-bit
    address bus: 40-bit
    physical memory limit: 1 TB

    1 TB!!, How??.. This is Home Work ;)

So, your 32-bit operating system can access 4 GB of RAM, right!??
NO... Because the processor can address up to 4 GB of physical memory, not RAM.
Confused? Let us make it simple. We tend to think that all physical memory is RAM. Actually, both RAM and the memory/registers of I/O devices are mapped to physical addresses. So when the CPU accesses an address, it may refer to a portion of physical RAM, but it may also refer to the memory of an I/O device. This method is called memory-mapped I/O.
"640K ought to be enough for anyone."
- attributed to Bill Gates
Let's take the example of real mode: it can address 1024 KB of physical memory but only 640 KB of RAM, so where are the remaining physical addresses mapped? This mapping of memory addresses away from RAM creates the hole in physical memory between 640 KB and 1 MB, where addresses are reserved for the BIOS ROM, the legacy video card and PCI devices. So, if you have 4 GB or more RAM on a 32-bit operating system, check how much of it you are actually using.
This command outputs the memory map of your Linux system:

~]$ cat /proc/iomem

Intel met the demand for access to more than 4 GB of RAM by increasing the number of address pins on its processors from 32 to 36.

This command shows the physical and virtual address sizes on a Linux box:

~]$ cat /proc/cpuinfo | grep address

With the Pentium Pro processor, Intel introduced a mechanism called Physical Address Extension (PAE). PAE is activated by setting the PAE bit in the cr4 register. When the PAE bit is set, the processor uses 36-bit physical addresses instead of 32-bit ones. So, if the address bus is 36 bits wide, how much physical memory can it access? I know you can figure it out.

Address translation when the PAE bit is disabled.
  • Linear address translation is done in two steps. The first translation table is called the Page Directory, and the second is called the Page Table. The 32-bit linear address is divided into three fields. The most significant 10 bits select one of 1024 Page Directory entries, the middle 10 bits select one of 1024 Page Table entries, and the least significant 12 bits give the offset (0 to 4095) within the page frame.


Address translation when the PAE bit is enabled.
  • With PAE enabled, the paging mechanism translates 32-bit linear addresses into 36-bit physical ones in three steps. The first translation table is called the Page Directory Pointer Table (PDPT), the second is called the Page Directory and the third is called the Page Table. The 32-bit linear address is divided into four fields. The most significant 2 bits select one of 4 entries in the Page Directory Pointer Table, the next 9 bits select one of 512 Page Directory entries, the next 9 bits select one of 512 Page Table entries, and the least significant 12 bits give the offset (0 to 4095) within the page frame. Once cr3 is set, it is possible to address up to 4 GB of RAM. If we want to address more RAM, we'll have to put a new value in cr3 or change the content of the PDPT.

    Thursday, 28 November 2013

    Get address and size of Global Descriptor Table

    I just read about the GDT, which is used to translate logical addresses into linear addresses in Linux. Out of curiosity, let's find out the address and size of the GDT.

    The Global Descriptor Table (GDT) is used for memory segmentation in protected mode. The GDT is an array of segment descriptors, local descriptor table (LDT) descriptors and task state segment (TSS) descriptors. Each descriptor is 8 bytes long and holds information about a segment's base address, limit and access privileges.
    In uniprocessor systems there is only one GDT, while in multiprocessor systems there is one GDT for every CPU in the system.
    The gdtr control register holds the 16-bit size of the GDT (stored as the table size in bytes minus one) and its 32-bit address, so gdtr is 48 bits wide.

    |-----SIZE-----|------------ADDRESS------------|

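    #include <stdio.h>
    
    /* In-memory image of gdtr on 32-bit x86: 16-bit size, then 32-bit address.
       Build as a 32-bit program (e.g. gcc -m32). */
    struct gdtr {
            unsigned short size;
            unsigned int addr __attribute__((packed));
    } gdtr;
    
    int main(void)
    {
            /* sgdt only writes its operand, so it is an output ("=m"). */
            asm volatile("sgdt %0" : "=m" (gdtr));
            printf("limit: %u\nbase: %x\n", gdtr.size, gdtr.addr);
            return 0;
    }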
    
    Note:
    (1) The sgdt instruction, used here in inline assembly, copies the content of the gdtr control register to the memory location of the gdtr variable.
    (2) __attribute__((packed)) ensures that the structure fields are aligned on one-byte boundaries, so no padding is inserted between size and addr.