I’m currently reading Computer Architecture: A Quantitative Approach, and after seeing so many diagrams that are related but never appear together, I decided to summarize everything in a single design. It’s not intended to be complete, as there are many ways to design such a system, but this hypothetical example gives a good picture of how memory mapping works in current computer architectures.

Introduction

This post examines the memory mapping flow within a hypothetical 64-bit computer architecture. Each memory access passes through up to three lookup structures: a Translation Lookaside Buffer (TLB) for virtual-to-physical address translation, then a 32KB L1 cache (256 sets, 2-way set associative, 512-bit blocks), and finally a 4MB L2 cache (16,384 sets, 4-way set associative, 512-bit blocks). To keep the mapping process easy to follow, we’ll use a simplified model that combines the instruction and data caches. We will not discuss block replacement algorithms or write-back mechanisms in this post.
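As a quick sanity check on those numbers, here is a minimal sketch that derives each cache's capacity and the width of its block-offset and index fields from the set count, associativity, and block size given above. The function name `cache_geometry` is my own, used only for illustration.

```python
def cache_geometry(sets, ways, block_bits):
    """Return (capacity_bytes, offset_bits, index_bits) for a set-associative cache."""
    block_bytes = block_bits // 8                # 512 bits -> 64 bytes
    capacity_bytes = sets * ways * block_bytes   # total data capacity
    offset_bits = block_bytes.bit_length() - 1   # log2(block_bytes), power of two assumed
    index_bits = sets.bit_length() - 1           # log2(sets), power of two assumed
    return capacity_bytes, offset_bits, index_bits


l1 = cache_geometry(sets=256, ways=2, block_bits=512)
l2 = cache_geometry(sets=16_384, ways=4, block_bits=512)

print(l1)  # (32768, 6, 8)     -> 32KB, 6 offset bits, 8 index bits
print(l2)  # (4194304, 6, 14)  -> 4MB, 6 offset bits, 14 index bits
```

Whatever address bits remain above the index and block offset form the tag that is compared on each lookup.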

How does memory mapping work?

Figure: Visual explanation of the memory mapping process

The red numbers in the diagram show the order in which it should be read. Each number corresponds to one of the following steps:

1. The virtual address is divided into two parts: the virtual page number and the page offset.
2. The virtual page number is divided into two parts: the TLB tag (the tag compare address) and the TLB index.
3. The TLB uses the TLB index to select the corresponding entry.
4. The TLB tag is compared with the tag stored in that TLB entry.
5. TLB hit: the matching entry supplies the physical page number, which together with the page offset forms the physical address.
6. The L1 cache index is used to select the set in the cache.
7. The L1 cache tag is compared with the tags stored in that L1 set.
8. L1 cache hit.
9. The block offset selects the requested data within the L1 block, and this data is sent to the CPU.
10. L1 cache miss.
11. The physical address is now divided into three parts: the L2 tag (the tag compare address), the L2 cache index, and the block offset.
12. The L2 cache index is used to select the set in the cache.
13. The L2 cache tag is compared with the tags stored in that L2 set.
14. L2 cache hit.
15. The block offset selects the requested data within the L2 block, and this data is sent to the CPU.
16. Now let’s go back to the TLB miss.
17. The virtual address is divided into: the canonical-form (sign-extension) bits, the page-map L4 index, the page-directory-pointer index, the page-directory index, the page-table index, and the page offset.
18. The base address in the CR3 register is added to the page-map L4 index to locate the page-map L4 entry, which holds the location of the page-directory-pointer table.
19. The base of the page-directory-pointer table is added to the page-directory-pointer index to locate the entry that holds the location of the page directory.
20. The base of the page directory is added to the page-directory index to locate the entry that holds the location of the page table.
21. The base of the page table is added to the page-table index to locate the page table entry.
22. The page table entry holds the physical page frame number.
23. The physical page frame number and the page offset together indicate the exact location in main memory.
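The page walk and the cache address split described in the steps above can be sketched in a few lines of Python. This is a toy model, not a description of real hardware. It assumes 4KB pages (a 12-bit page offset) and 9-bit indices at each of the four levels, which are the usual x86-64 values but are not stated in this post. Page tables are modeled as a hypothetical dictionary keyed by (table base, index), and each entry stores only the next base or frame number, whereas real entries also carry flag bits.

```python
PAGE_OFFSET_BITS = 12   # assumption: 4KB pages
LEVEL_INDEX_BITS = 9    # assumption: 512 entries per table
BLOCK_OFFSET_BITS = 6   # 512-bit (64-byte) blocks, as in this post


def split_virtual_address(va):
    """Return (pml4, pdpt, pd, pt, page_offset) taken from the low 48 bits of va."""
    mask = (1 << LEVEL_INDEX_BITS) - 1
    page_offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    pt = (va >> 12) & mask
    pd = (va >> 21) & mask
    pdpt = (va >> 30) & mask
    pml4 = (va >> 39) & mask
    return pml4, pdpt, pd, pt, page_offset


def walk(memory, cr3, va):
    """Toy four-level walk; `memory` maps (table_base, index) to an entry value."""
    pml4, pdpt, pd, pt, page_offset = split_virtual_address(va)
    table = cr3
    for index in (pml4, pdpt, pd):
        table = memory[(table, index)]    # each entry holds the next table's base
    frame = memory[(table, pt)]           # the last entry holds the page frame number
    return (frame << PAGE_OFFSET_BITS) | page_offset


def split_cache_address(pa, index_bits):
    """Split a physical address into (tag, index, block_offset)."""
    block_offset = pa & ((1 << BLOCK_OFFSET_BITS) - 1)
    index = (pa >> BLOCK_OFFSET_BITS) & ((1 << index_bits) - 1)
    tag = pa >> (BLOCK_OFFSET_BITS + index_bits)
    return tag, index, block_offset


# Hypothetical tables: every base address and the frame number are made up.
memory = {
    (0x1000, 1): 0x2000,   # PML4 entry -> page-directory-pointer table
    (0x2000, 2): 0x3000,   # PDPT entry -> page directory
    (0x3000, 3): 0x4000,   # page-directory entry -> page table
    (0x4000, 4): 0x55,     # page-table entry -> physical page frame number
}
va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x123
pa = walk(memory, cr3=0x1000, va=va)

print(hex(pa))                      # 0x55123
print(split_cache_address(pa, 8))   # L1: (21, 68, 35)
print(split_cache_address(pa, 14))  # L2: (0, 5444, 35)
```

The same physical address gives a different tag and index for L1 and L2 because the two caches have different numbers of sets, while the block offset is identical since both use 64-byte blocks.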

Conclusion

I enjoyed understanding and explaining the memory mapping process in this hypothetical architecture. I hope you found it helpful and that it demystified some of the inner workings of computer memory. Thanks for reading!
