What is memory? (part 1): Virtual memory and address spaces
I remember when I was an undergraduate taking my first computer science course. We were learning about the null
type in Java, and someone asked the (rather clumsy) TA, “I understand that you can compare an object to null
to see if it was allocated, but what actually is null
?” The TA turned on the projector, which rendered a picture of a rectangle onto a cloth screen in the front of the class. He responded, “Well…null
is the area down here”, and proceeded to circle the bottom of the rectangle in marker, thus immediately ruining the screen and destroying university property. He didn’t go on to explain what he meant by “the area down here”, and I was left wondering what null
really was for several years after that.
What the confused student didn’t realize at the time is that they were asking a very insightful and fundamental question about how computers work. To ask the question “What is null
” is to ask, “what is memory”? To ask “what is memory” is to ask, “what is a cache, and how do CPUs coordinate to read from and write to memory?” The recursion goes on and on, with an engineer becoming more and more capable (and frankly, more and more lucrative to companies) as they learn the answers to these questions and gain a deeper understanding of how a computer actually works.
That’s not to say that it’s a hard requirement to learn all this stuff in order to be a capable and productive developer. To the contrary, most developers in the tech industry will have great careers viewing all of this as a black box, and there’s nothing wrong with that. At the end of the day though, you as an engineer will always be limited in what you can accomplish and how far you can progress if you don’t understand how memory works. Think about what capabilities truly understanding memory would afford you. Is your program slow? Perhaps you should check if you’re accidentally thrashing your system by using more memory than is available on your machine. Does your product need to scale as your customer base grows? Leveraging a caching layer that can more quickly service requests to your customers is now a solution that you’re able to suggest. These skills will allow you not only to command a higher salary, but also to take on leadership roles as an engineer that would otherwise be out of reach for someone who didn’t have the necessary knowledge to make highly impactful engineering decisions.
Now that the sales pitch is out of the way, let’s get started. We’ll begin by developing a basic definition of and intuition for virtual memory in this post, and eventually dive into deeper (but still very important and interesting) concepts such as paging and caching in future posts. By the end of the series, you’ll know what virtual memory is, what a process (i.e. an instance of a running program) is, how CPUs actually read and write to and from memory, and much more. Note that this post makes an assumption that you’re familiar with hexadecimal notation, and you know what a byte is. If you don’t, leave a comment below and I’ll write up another post that introduces them in an understandable way.
What is "virtual memory"?
It turns out that this is a deep question that could (and has) filled up many volumes of textbooks, research papers, and probably blog posts. Let’s start with a simple (but conceptually accurate) model.
We can think of virtual memory as a very large array of data that is fast to read and write. The 0th index of the array can be called “address 0” (0x0), and the highest index of the array can be called “address 2^{64} - 1” (0xffffffffffffffff). Every instance of a running process in a system has a single instance of this virtual memory array, which we call its virtual memory address space, or more succinctly, its address space.
This is conceptually no different than an array in a programming language. Like any other array, a process can store values at entries in its address space, and then read those values at a later time. Unlike a regular array, however, a process’s address space is sparsely allocated, meaning that it does not automatically have access to all entries of its address space, and will have (very large) “holes” where the addresses are unavailable for use. In order to allocate ranges of its address space and fill those holes, the process must ask the operating system to allocate memory on its behalf. For any address / entry in the address space that’s available for use, we’ll say that it’s allocated. If an address has not been allocated and thus cannot be written to or read from, we’ll say that it’s unallocated.
Hold on, 2^{64}?? A computer doesn’t have that much memory…
We said above that every process in a system has a virtual address space that’s (usually) of size 2^{64}. You may be wondering, “How on earth is that possible? My computer only has 16 GB of RAM?” That’s a very reasonable and astute question, and the answer is that we’re talking about virtual memory, not physical memory. Virtual memory is a process’s view of memory. Every process in a running system sees memory as an array of size 2^{64}, that it has exclusive access to. In reality, there is a much smaller array of physical memory that is managed by the operating system, and given to processes to fill the holes in a process’s address space when requested.
But why do we need this extra layer / abstraction? Why not just have processes use physical memory directly? Well, because programming is already hard enough, and we like things to be as simple as possible. Virtual memory gives every process the exact same model for memory, everywhere. If I write a program, I don’t want to also have to write code to scrape through physical memory to see what’s still available after other processes have already taken the memory they need.
Virtual memory is indeed a powerful concept. It provides all processes with the illusion that they have full and exclusive control over all the memory on the system. It also allows us to use a framework for running processes that partitions its address space into different regions, each with a specific purpose. For example (and we’ll go into more detail on this in the next post), part of a process’s address space consists of all of its code, i.e. instructions. Another part of the address space is the process’s heap, i.e. space where it can dynamically allocate and use extra memory at runtime. We will go into more details on how a process is laid out in memory in the next post. For now, let’s gain a bit more fundamental intuition about memory.
Everything is memory and context
We now know that a virtual memory address space is a giant, sparse array that makes up a process’s view of memory. We also have some intuition for why virtual memory is critical to having a reasonable programming model. It’s time for us to take our new intuition, and use it to deepen our knowledge of computers with a very important point: Everything in software is just data and context.
This should hopefully make some intuitive sense. When you write a program and store something in a variable, that variable has a specific purpose which ascribes meaning to what’s stored there. For example, if you store the number 0 in a variable called, num_cookies
, then you are ascribing the meaning of “number of cookies” to some value 0 that is stored somewhere in the process’s address space. The data is “0”, and the context is, “number of cookies”. If you were storing an address in a variable (which would make it a “pointer”), the address 0 would mean, “the bottom of the address space”. The data is the same, but the context gives it a completely different meaning.
Let’s take a look at a real world example to apply this intuition in practice.
Compiling Code
At a high level, the job of a compiler is to translate a “human-readable” source code file into a “machine readable” binary file. The term “human readable” is a bit misleading though. Like everything else in a computer, the contents of the file are just numbers, to which we happen to ascribe the context of being a “human readable text file”. This context ascribes a meaning of “text” to the numbers. Let me show you what I mean with an example.
Say that we had a simple program in a source file called byte_lab.c
:
$ vim byte_lab.c
#include <stdio.h>
int main() {
printf("Byte Lab is fun!\n");
return 0;
}
It looks like text, but that’s only because vim
(my strongly preferred text editor) interprets the numbers (bytes) in the file as text, and presents those bytes to you as alphanumeric characters. Let’s take a look at byte_lab.c
using the hexdump
utility, which displays the numerical contents of a file:
$ hexdump byte_lab.c
user@user-VirtualBox:~/experiments$ hexdump -C byte_lab.c
00000000 23 69 6e 63 6c 75 64 65 20 3c 73 74 64 69 6f 2e |#include <stdio.|
00000010 68 3e 0a 0a 69 6e 74 20 6d 61 69 6e 28 29 20 7b |h>..int main() {|
00000020 0a 20 20 20 20 70 72 69 6e 74 66 28 22 42 79 74 |. printf("Byt|
00000030 65 20 4c 61 62 20 69 73 20 66 75 6e 21 5c 6e 22 |e Lab is fun!\n"|
00000040 29 3b 0a 20 20 20 20 72 65 74 75 72 6e 20 30 3b |);. return 0;|
00000050 0a 7d 0a |.}.|
00000053
The right-most column should look familiar – it’s the contents of byte_lab.c
from above. The middle column shows us each the hexadecimal representation for each character in the file. The left-most column is the offset of each byte / character. So one-by-one, we have the following character / byte representation pair:
#: 0x23
i: 0x69
n: 0x6e
c: 0x63
l: 0x6c
…
0: 0x30
;: 0x3b
\n: 0x0a
}: 0x7d
\n: 0x0a
If you’re curious as to why these numbers mean these specific characters, see the section below where we discuss the ASCII standard. But anyways, that’s why vim
is able to show you the textual representation of the file when you open it up. vim
expects to see bytes / numbers that correspond to alphanumeric text characters, and when it does, it knows how to turn them into characters on our screen.
An interesting question that comes to mind is, “What happens if we open vim on a binary file?” Let’s find out:
$ vim byte_lab # opening the byte_lab binary executable file
^?ELF^B^A^A^@^@^@^@^@^@^@^@^@^C^@>^@^A^@^@^@`^P^@^@^@^@^@^@@^@^@^@^@^@^@^@x9^@^@^@^@^@^@^@^@^@^@@^@8^@^M^@@^@^_^@^^^@^F^
…
@^@^@^@^@^@H^B^@^@^@^@^@^@^A^@^@^@^@^@^@^@/lib64/ld-linux-x86-64.so.2^@^@^@^@^@
…
There are a lot of ^@
characters which is what vim renders when it doesn’t know what character the number is supposed to represent. There are also some ^A
’s and ^B
’s which essentially also just mean that the bytes in the file don’t correspond to something meant to be read by humans. Interestingly though, we also see some human readable text (“ELF”, and “lib64/ld-linux-x86-64.so”). Why is that? Because executable, binary ELF files actually contain some human readable text as well! The files aren’t meant to be read by humans, but that doesn’t mean that they don’t contain some values in memory that, in context, are meant to be interpreted as textual characters.
An aside: ASCII
I want to briefly touch on standards and ASCII so you’re aware of how vim knows which characters correspond to which bytes. The answer is that the characters being rendered by vim are actually encoded as part of a standard. In this case, the ASCII standard. A standard is a widely accepted agreement that specifies behavior / the expectation of something in computer systems. Sometimes standards dictate the behavior and guarantees of a programming language (e.g. the “C standard” or “C++ standard”), and sometimes they specify e.g. how characters should be encoded (or, how they should be represented by numbers).
ASCII is a relatively simple standard that specifies the encoding for 128 characters (which only require 7 bits). We can actually see the encoding for all of the characters with man ascii
:
$ man ascii
ASCII(7) Linux Programmer's Manual ASCII(7)
NAME
ascii - ASCII character set encoded in octal, decimal, and hexadecimal
DESCRIPTION
ASCII is the American Standard Code for Information Interchange. It is a 7-bit code. Many 8-bit codes (e.g., ISO
8859-1) contain ASCII as their lower half. The international counterpart of ASCII is known as ISO 646-IRV.
The following table contains the 128 ASCII characters.
C program '\X' escapes are noted.
Oct Dec Hex Char Oct Dec Hex Char
────────────────────────────────────────────────────────────────────────
000 0 00 NUL '\0' (null character) 100 64 40 @
001 1 01 SOH (start of heading) 101 65 41 A
002 2 02 STX (start of text) 102 66 42 B
003 3 03 ETX (end of text) 103 67 43 C
004 4 04 EOT (end of transmission) 104 68 44 D
005 5 05 ENQ (enquiry) 105 69 45 E
006 6 06 ACK (acknowledge) 106 70 46 F
007 7 07 BEL '\a' (bell) 107 71 47 G
010 8 08 BS '\b' (backspace) 110 72 48 H
011 9 09 HT '\t' (horizontal tab) 111 73 49 I
012 10 0A LF '\n' (new line) 112 74 4A J
013 11 0B VT '\v' (vertical tab) 113 75 4B K
014 12 0C FF '\f' (form feed) 114 76 4C L
015 13 0D CR '\r' (carriage ret) 115 77 4D M
016 14 0E SO (shift out) 116 78 4E N
017 15 0F SI (shift in) 117 79 4F O
020 16 10 DLE (data link escape) 120 80 50 P
021 17 11 DC1 (device control 1) 121 81 51 Q
022 18 12 DC2 (device control 2) 122 82 52 R
023 19 13 DC3 (device control 3) 123 83 53 S
024 20 14 DC4 (device control 4) 124 84 54 T
025 21 15 NAK (negative ack.) 125 85 55 U
026 22 16 SYN (synchronous idle) 126 86 56 V
027 23 17 ETB (end of trans. blk) 127 87 57 W
030 24 18 CAN (cancel) 130 88 58 X
031 25 19 EM (end of medium) 131 89 59 Y
032 26 1A SUB (substitute) 132 90 5A Z
033 27 1B ESC (escape) 133 91 5B [
034 28 1C FS (file separator) 134 92 5C \ '\\'
035 29 1D GS (group separator) 135 93 5D ]
036 30 1E RS (record separator) 136 94 5E ^
037 31 1F US (unit separator) 137 95 5F _
040 32 20 SPACE 140 96 60 `
041 33 21 ! 141 97 61 a
042 34 22 " 142 98 62 b
043 35 23 # 143 99 63 c
044 36 24 $ 144 100 64 d
045 37 25 % 145 101 65 e
046 38 26 & 146 102 66 f
047 39 27 ' 147 103 67 g
050 40 28 ( 150 104 68 h
051 41 29 ) 151 105 69 i
052 42 2A * 152 106 6A j
053 43 2B + 153 107 6B k
054 44 2C , 154 108 6C l
055 45 2D - 155 109 6D m
056 46 2E . 156 110 6E n
057 47 2F / 157 111 6F o
060 48 30 0 160 112 70 p
061 49 31 1 161 113 71 q
062 50 32 2 162 114 72 r
063 51 33 3 163 115 73 s
064 52 34 4 164 116 74 t
065 53 35 5 165 117 75 u
066 54 36 6 166 118 76 v
067 55 37 7 167 119 77 w
070 56 38 8 170 120 78 x
071 57 39 9 171 121 79 y
072 58 3A : 172 122 7A z
073 59 3B ; 173 123 7B {
074 60 3C < 174 124 7C |
075 61 3D = 175 125 7D }
076 62 3E > 176 126 7E ~
077 63 3F ? 177 127 7F DEL
And we see that every character has a specific numerical encoding. If we go back to our example above with hexdump
, we’ll see that all of the character / byte combinations printed by hexdump
match the encodings of the ASCII standard here.
Wrapping up
In this post we learned a few very important concepts:
- Every process has their own view of virtual memory as a sparse array of storage going from 0, to 2^{64} - 1.
- A process fills holes in its virtual memory address space by asking the operating system for memory, which itself is responsible for managing the physical memory on the system, and giving it to processes upon request.
- At the end of the day, everything in computers is about data and context.
- We sometimes use standards to ensure that everyone views bytes in a specific context in the same way.
There is a lot more to unpack and learn about memory. In the next post, we’ll take a closer look at how a process’s address space is laid out in memory. We’ll learn about what loaders are (i.e. how your program actually gets run), and that processes are composed of something called segments. We’ll also learn about the heap, and the stack. If that sounds interesting, tune in!
One more thing though before we sign off – we still haven’t answered the question of “What actually is null
?” We’ll answer it now: null
is simply a term that represents address 0 in a process’s address space. That’s right, it’s just a plain old memory address representing the very bottom / beginning of a process’s address space. This address is actually no different than any other address, but the convention in every operating system that I’ve ever come across is to ensure that memory at the null
/ NULL
/ nullptr
address is never allocated, so accessing it will result in the process crashing.
Disclaimer: null in Java actually does have an abstract / semantic meaning which is specific to the language. The spirit of null
though is as described above, and was what the TA was trying to convey!
As always, if you have any questions or suggestions, please leave them in the comments below. See you all next time!
Comments
Sign in or become a Byte Lab member to join the conversation.