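This post is distilled from the series of articles listed below, translated and combined with my own understanding. I have run all of the code myself. If you have time, you can also read the original English articles directly:
Hack The Virtual Memory: C strings & /proc
Hack the Virtual Memory: drawing the VM diagram
Hack the Virtual Memory: malloc, the heap & the program break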

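Exploring virtual memory through the /proc filesystem
This section uses the /proc filesystem to find the virtual memory address of a string in a running process, and then changes the string by overwriting the contents at that address.
Virtual memory
Virtual memory is a memory-management technique implemented jointly by hardware and software. It maps the memory addresses a program uses (virtual addresses) to physical addresses in the machine's memory. It frees applications from the tedious job of managing a shared memory space and improves security through memory isolation. A process usually sees its virtual memory as a contiguous address space controlled by the operating system's memory manager; physical memory is assigned to virtual pages through paging, typically when a page fault occurs. On 64-bit machines the virtual address space is far larger than physical memory, so a process can use more memory than is physically installed.
Before digging into virtual memory, a few key points:
Each process has its own virtual memory.
The size of the virtual memory depends on the system architecture.
Different operating systems manage virtual memory differently, but on most of them the layout looks like the diagram below:
virtual_memory.png
The diagram above is not a complete picture of memory management. For example, kernel space sits at the high addresses, but that is outside the scope of this post. The diagram shows that the command-line arguments and environment variables are stored at the high addresses, followed by the stack, the heap and the executable. The stack grows downward and the heap grows upward. Heap space is obtained with malloc and is part of dynamically allocated memory.
Let's start exploring virtual memory with a simple C program.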
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * main - use strdup to create a copy of a string. strdup calls malloc
 * internally and returns the address of the new space, which the caller
 * must release with free
 *
 * Return: EXIT_FAILURE if malloc failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    char *s;

    s = strdup("test_memory");
    if (s == NULL) {
        fprintf(stderr, "Can't allocate mem with malloc\n");
        return (EXIT_FAILURE);
    }
    printf("%p\n", (void *)s);
    return (EXIT_SUCCESS);
}

Compile and run: gcc -Wall -Wextra -pedantic -Werror main.c -o test; ./test
Output: 0x88f010
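My machine is 64-bit, so a process's virtual address space runs from 0x0 at the low end up to 0xffffffffffffffff at the high end. Since 0x88f010 is far smaller than 0xffffffffffffffff, we can guess that the copied string (a heap address) sits near the low end of the address space. The /proc filesystem lets us verify this. Running ls /proc shows many files; here we focus on /proc/[pid]/mem and /proc/[pid]/maps.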
mem & maps
man proc

/proc/[pid]/mem
    This file can be used to access the pages of a process's memory through open(2), read(2), and lseek(2).

/proc/[pid]/maps
    A file containing the currently mapped memory regions and their access permissions. See mmap(2) for some further information about memory mappings.

    The format of the file is:

    address           perms offset  dev   inode       pathname
    00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon
    00651000-00652000 r--p 00051000 08:02 173521      /usr/bin/dbus-daemon
    00652000-00655000 rw-p 00052000 08:02 173521      /usr/bin/dbus-daemon
    00e03000-00e24000 rw-p 00000000 00:00 0           [heap]
    00e24000-011f7000 rw-p 00000000 00:00 0           [heap]
    ...
    35b1800000-35b1820000 r-xp 00000000 08:02 135522  /usr/lib64/ld-2.15.so
    35b1a1f000-35b1a20000 r--p 0001f000 08:02 135522  /usr/lib64/ld-2.15.so
    35b1a20000-35b1a21000 rw-p 00020000 08:02 135522  /usr/lib64/ld-2.15.so
    35b1a21000-35b1a22000 rw-p 00000000 00:00 0
    35b1c00000-35b1dac000 r-xp 00000000 08:02 135870  /usr/lib64/libc-2.15.so
    35b1dac000-35b1fac000 ---p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
    35b1fac000-35b1fb0000 r--p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
    35b1fb0000-35b1fb2000 rw-p 001b0000 08:02 135870  /usr/lib64/libc-2.15.so
    ...
    f2c6ff8c000-7f2c7078c000 rw-p 00000000 00:00 0    [stack:986]
    ...
    7fffb2c0d000-7fffb2c2e000 rw-p 00000000 00:00 0   [stack]
    7fffb2d48000-7fffb2d49000 r-xp 00000000 00:00 0   [vdso]

    The address field is the address space in the process that the mapping occupies. The perms field is a set of permissions:

        r = read
        w = write
        x = execute
        s = shared
        p = private (copy on write)

    The offset field is the offset into the file/whatever; dev is the device (major:minor); inode is the inode on that device. 0 indicates that no inode is associated with the memory region, as would be the case with BSS (uninitialized data).

    The pathname field will usually be the file that is backing the mapping. For ELF files, you can easily coordinate with the offset field by looking at the Offset field in the ELF program headers (readelf -l).

    There are additional helpful pseudo-paths:

    [stack]
        The initial process's (also known as the main thread's) stack.

    [stack:<tid>] (since Linux 3.4)
        A thread's stack (where the <tid> is a thread ID). It corresponds to the /proc/[pid]/task/[tid]/ path.

    [vdso]
        The virtual dynamically linked shared object.

    [heap]
        The process's heap.

    If the pathname field is blank, this is an anonymous mapping as obtained via the mmap(2) function. There is no easy way to coordinate this back to a process's source, short of running it through gdb(1), strace(1), or similar.

    Under Linux 2.0 there is no field giving pathname.
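The maps format quoted from the man page can be split into its six fields with a few lines of Python. This is only a minimal sketch: the MapEntry type and the parse_maps_line helper are names made up for illustration, and the sample line is the first [heap] line from the man page excerpt.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    """One line of /proc/[pid]/maps (field names are this sketch's own)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    pathname: str


def parse_maps_line(line: str) -> MapEntry:
    # The first five fields are whitespace-separated. The pathname is
    # optional and may contain spaces, so split at most five times.
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return MapEntry(start, end, perms, int(offset, 16), dev, int(inode), pathname)


heap = parse_maps_line("00e03000-00e24000 rw-p 00000000 00:00 0           [heap]")
print(hex(heap.start), hex(heap.end), heap.perms, heap.pathname)
```

An anonymous mapping (blank pathname) comes back with an empty pathname string, which matches the man page's description of such lines.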
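Through the mem file you can read and modify the pages of a process's memory, and maps shows the regions the process currently has mapped, with their address ranges, permissions, offsets and so on. maps also shows that the heap sits at low addresses and the stack at high addresses. The [heap] entry has rw permissions, i.e. it is writable, so we can locate the string from the previous example inside the heap and change it by writing to the corresponding address in the mem file. The program: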
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/**
 * main - uses strdup to create a new string, loops forever-ever
 *
 * Return: EXIT_FAILURE if malloc failed. Otherwise never returns
 */
int main(void)
{
    char *s;
    unsigned long int i;

    s = strdup("test_memory");
    if (s == NULL) {
        fprintf(stderr, "Can't allocate mem with malloc\n");
        return (EXIT_FAILURE);
    }
    i = 0;
    while (s) {
        printf("[%lu] %s (%p)\n", i, s, (void *)s);
        sleep(1);
        i++;
    }
    return (EXIT_SUCCESS);
}

Compile and run: gcc -Wall -Wextra -pedantic -Werror main.c -o loop; ./loop
Output:
[0] test_memory (0x21dc010)
[1] test_memory (0x21dc010)
[2] test_memory (0x21dc010)
[3] test_memory (0x21dc010)
[4] test_memory (0x21dc010)
[5] test_memory (0x21dc010)
[6] test_memory (0x21dc010)
...
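We can write a script that uses the /proc filesystem to locate the string and overwrite it; the program's output will change accordingly. First, find the process ID:
ps aux | grep ./loop | grep -v grep
zjucad 2542 0.0 0.0 4352 636 pts/3 S+ 12:28 0:00 ./loop
2542 is the PID of the loop program. Running cat /proc/2542/maps gives: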
00400000-00401000 r-xp 00000000 08:01 811716 /home/zjucad/wangzhiqiang/loop
00600000-00601000 r--p 00000000 08:01 811716 /home/zjucad/wangzhiqiang/loop
00601000-00602000 rw-p 00001000 08:01 811716 /home/zjucad/wangzhiqiang/loop
021dc000-021fd000 rw-p 00000000 00:00 0 [heap]
7f2adae2a000-7f2adafea000 r-xp 00000000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f2adafea000-7f2adb1ea000 ---p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f2adb1ea000-7f2adb1ee000 r--p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f2adb1ee000-7f2adb1f0000 rw-p 001c4000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f2adb1f0000-7f2adb1f4000 rw-p 00000000 00:00 0
7f2adb1f4000-7f2adb21a000 r-xp 00000000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f2adb3fa000-7f2adb3fd000 rw-p 00000000 00:00 0
7f2adb419000-7f2adb41a000 r--p 00025000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f2adb41a000-7f2adb41b000 rw-p 00026000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f2adb41b000-7f2adb41c000 rw-p 00000000 00:00 0
7ffd51bb3000-7ffd51bd4000 rw-p 00000000 00:00 0 [stack]
7ffd51bdd000-7ffd51be0000 r--p 00000000 00:00 0 [vvar]
7ffd51be0000-7ffd51be2000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
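The heap range is 021dc000-021fd000, and it is readable and writable. Since 0x021dc000 < 0x21dc010 < 0x021fd000, the string is indeed in the heap, at offset 0x10 from its start (why 0x10 is covered later). We can now seek to address 0x21dc010 in the mem file and change the contents there, and the string printed by the program will change with it. The Python script below does this.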
#!/usr/bin/env python3
'''
Locates and replaces the first occurrence of a string in the heap
of a process

Usage: ./read_write_heap.py PID search_string replace_by_string
Where:
- PID is the pid of the target process
- search_string is the ASCII string you are looking to overwrite
- replace_by_string is the ASCII string you want to replace
  search_string with
'''

import sys


def print_usage_and_exit():
    print('Usage: {} pid search write'.format(sys.argv[0]))
    sys.exit(1)


# check usage
if len(sys.argv) != 4:
    print_usage_and_exit()

# get the pid from args
pid = int(sys.argv[1])
if pid <= 0:
    print_usage_and_exit()
search_string = str(sys.argv[2])
if search_string == "":
    print_usage_and_exit()
write_string = str(sys.argv[3])
if write_string == "":
    print_usage_and_exit()

# open the maps and mem files of the process
maps_filename = "/proc/{}/maps".format(pid)
print("[*] maps: {}".format(maps_filename))
mem_filename = "/proc/{}/mem".format(pid)
print("[*] mem: {}".format(mem_filename))

# try opening the maps file
try:
    maps_file = open(maps_filename, 'r')
except IOError as e:
    print("[ERROR] Can not open file {}:".format(maps_filename))
    print("        I/O error({}): {}".format(e.errno, e.strerror))
    sys.exit(1)

for line in maps_file:
    sline = line.split(' ')
    # check if we found the heap
    if sline[-1][:-1] != "[heap]":
        continue
    print("[*] Found [heap]:")

    # parse line
    addr = sline[0]
    perm = sline[1]
    offset = sline[2]
    device = sline[3]
    inode = sline[4]
    pathname = sline[-1][:-1]
    print("\tpathname = {}".format(pathname))
    print("\taddresses = {}".format(addr))
    print("\tpermisions = {}".format(perm))
    print("\toffset = {}".format(offset))
    print("\tinode = {}".format(inode))

    # check if there is read and write permission
    if perm[0] != 'r' or perm[1] != 'w':
        print("[*] {} does not have read/write permission".format(pathname))
        maps_file.close()
        sys.exit(0)

    # get start and end of the heap in the virtual memory
    addr = addr.split("-")
    if len(addr) != 2:  # never trust anyone, not even your OS :)
        print("[*] Wrong addr format")
        maps_file.close()
        sys.exit(1)
    addr_start = int(addr[0], 16)
    addr_end = int(addr[1], 16)
    print("\tAddr start [{:x}] | end [{:x}]".format(addr_start, addr_end))

    # open and read mem
    try:
        mem_file = open(mem_filename, 'rb+')
    except IOError as e:
        print("[ERROR] Can not open file {}:".format(mem_filename))
        print("        I/O error({}): {}".format(e.errno, e.strerror))
        maps_file.close()
        sys.exit(1)

    # read heap
    mem_file.seek(addr_start)
    heap = mem_file.read(addr_end - addr_start)

    # find string
    try:
        i = heap.index(bytes(search_string, "ASCII"))
    except Exception:
        print("Can't find '{}'".format(search_string))
        maps_file.close()
        mem_file.close()
        sys.exit(0)
    print("[*] Found '{}' at {:x}".format(search_string, i))

    # write the new string
    print("[*] Writing '{}' at {:x}".format(write_string, addr_start + i))
    mem_file.seek(addr_start + i)
    mem_file.write(bytes(write_string, "ASCII"))

    # close files
    maps_file.close()
    mem_file.close()

    # there is only one heap in our example
    break
Run the Python script:
zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ sudo ./loop.py 2542 test_memory test_hello
[*] maps: /proc/2542/maps
[*] mem: /proc/2542/mem
[*] Found [heap]:
    pathname = [heap]
    addresses = 021dc000-021fd000
    permisions = rw-p
    offset = 00000000
    inode = 0
    Addr start [21dc000] | end [21fd000]
[*] Found 'test_memory' at 10
[*] Writing 'test_hello' at 21dc010
At the same time, the string printed by loop changes:
[633] test_memory (0x21dc010)
[634] test_memory (0x21dc010)
[635] test_memory (0x21dc010)
[636] test_memory (0x21dc010)
[637] test_memory (0x21dc010)
[638] test_memory (0x21dc010)
[639] test_memory (0x21dc010)
[640] test_helloy (0x21dc010)
[641] test_helloy (0x21dc010)
[642] test_helloy (0x21dc010)
[643] test_helloy (0x21dc010)
[644] test_helloy (0x21dc010)
[645] test_helloy (0x21dc010)
The experiment worked.
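The output reads test_helloy rather than test_hello because the script writes only the 10 bytes of the replacement over an 11-byte string, so the old final y and the original terminating NUL stay in place. The sketch below reproduces this on a bytearray that stands in for the heap; overwrite and c_string_at are illustrative helpers, not part of the script above, and the 0x10-byte prefix merely mimics the offset seen in the example.

```python
def overwrite(buf: bytearray, offset: int, new: bytes) -> None:
    """Overwrite len(new) bytes at offset, like the seek + write in the script."""
    buf[offset:offset + len(new)] = new


def c_string_at(buf: bytearray, offset: int) -> bytes:
    """Read bytes from offset up to the first NUL, as printf("%s") would."""
    end = buf.index(b"\x00", offset)
    return bytes(buf[offset:end])


# A stand-in for the heap: 0x10 bytes in front, then the string and its NUL.
heap = bytearray(0x10) + bytearray(b"test_memory\x00") + bytearray(16)

overwrite(heap, 0x10, b"test_hello")
print(c_string_at(heap, 0x10))   # the old final 'y' is still there

overwrite(heap, 0x10, b"test_hello\x00")
print(c_string_at(heap, 0x10))   # writing a NUL as well ends the string earlier
```

Appending a NUL byte to the replacement is one way to get a clean result when the new string is shorter than the old one. A longer replacement would instead run past the original allocation.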
Drawing the virtual memory layout through experiments
Here is the memory layout diagram again:
virtual_memory.png
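Almost everyone has some idea of how virtual memory is laid out, but how can we verify it? The experiments below do that.
Heap and stack
First we verify the position of the stack. In C, local variables are stored on the stack and memory allocated by malloc is stored in the heap, so printing the address of a local variable and the address returned by malloc shows where the stack and the heap sit in the virtual address space.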
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * main - print locations of various elements
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    int a;
    void *p;

    printf("Address of a: %p\n", (void *)&a);
    p = malloc(98);
    if (p == NULL) {
        fprintf(stderr, "Can't malloc\n");
        return (EXIT_FAILURE);
    }
    printf("Allocated space in the heap: %p\n", p);
    return (EXIT_SUCCESS);
}

Compile and run: gcc -Wall -Wextra -pedantic -Werror main.c -o test; ./test
Output:
Address of a: 0x7ffedde9c7fc
Allocated space in the heap: 0x55ca5b360670
The result shows that the heap lies below the stack in the address space, as in this diagram:
virtual_memory_stack_heap.png
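The executable
The executable is also mapped into virtual memory. Printing the address of the main function and comparing it with the heap and stack addresses shows where the executable sits relative to them.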
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * main - print locations of various elements
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    int a;
    void *p;

    printf("Address of a: %p\n", (void *)&a);
    p = malloc(98);
    if (p == NULL) {
        fprintf(stderr, "Can't malloc\n");
        return (EXIT_FAILURE);
    }
    printf("Allocated space in the heap: %p\n", p);
    printf("Address of function main: %p\n", (void *)main);
    return (EXIT_SUCCESS);
}

Compile and run: gcc main.c -o test; ./test
Output:
Address of a: 0x7ffed846de2c
Allocated space in the heap: 0x561b9ee8c670
Address of function main: 0x561b9deb378a
Since main(0x561b9deb378a) < heap(0x561b9ee8c670) < stack(0x7ffed846de2c), the layout can be drawn as follows:
virtual_memory_stack_heap_executable.png
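Command-line arguments and environment variables
The program's entry point, main, can take parameters:
The first parameter (argc): the number of command-line arguments
The second parameter (argv): a pointer to the array of command-line arguments
The third parameter (env): a pointer to the array of environment variables
The following program shows where these elements live in virtual memory: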
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * main - print locations of various elements
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(int ac, char **av, char **env)
{
    int a;
    void *p;
    int i;

    printf("Address of a: %p\n", (void *)&a);
    p = malloc(98);
    if (p == NULL) {
        fprintf(stderr, "Can't malloc\n");
        return (EXIT_FAILURE);
    }
    printf("Allocated space in the heap: %p\n", p);
    printf("Address of function main: %p\n", (void *)main);
    printf("First bytes of the main function:\n\t");
    for (i = 0; i < 15; i++) {
        printf("%02x ", ((unsigned char *)main)[i]);
    }
    printf("\n");
    printf("Address of the array of arguments: %p\n", (void *)av);
    printf("Addresses of the arguments:\n\t");
    for (i = 0; i < ac; i++) {
        printf("[%s]:%p ", av[i], (void *)av[i]);
    }
    printf("\n");
    printf("Address of the array of environment variables: %p\n", (void *)env);
    printf("Address of the first environment variable: %p\n", (void *)(env[0]));
    return (EXIT_SUCCESS);
}

Compile and run: gcc main.c -o test; ./test nihao hello
Output:
Address of a: 0x7ffcc154a748
Allocated space in the heap: 0x559bd1bee670
Address of function main: 0x559bd09807ca
First bytes of the main function:
    55 48 89 e5 48 83 ec 40 89 7d dc 48 89 75 d0
Address of the array of arguments: 0x7ffcc154a848
Addresses of the arguments:
    [./test]:0x7ffcc154b94f [nihao]:0x7ffcc154b956 [hello]:0x7ffcc154b95c
Address of the array of environment variables: 0x7ffcc154a868
Address of the first environment variable: 0x7ffcc154b962
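The result: main(0x559bd09807ca) < heap(0x559bd1bee670) < stack(0x7ffcc154a748) < argv(0x7ffcc154a848) < env(0x7ffcc154a868) < arguments(0x7ffcc154b94f -> 0x7ffcc154b95c + 6, where 6 is the length of "hello" plus 1 for the terminating '\0'). The argument strings therefore end at 0x7ffcc154b962, which is exactly where the first environment variable starts. All the command-line argument strings are adjacent in memory, and the environment variable strings follow immediately after them.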
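The adjacency claim can be checked mechanically from the printed addresses: each string should start right after the previous string's terminating NUL. The following is a small sketch using the addresses from the run above; contiguous is a hypothetical helper name.

```python
def contiguous(strings_with_addrs):
    """Return the address just past the last string if every string starts
    right after the previous one's terminating NUL, otherwise None."""
    expected = strings_with_addrs[0][1]
    for text, addr in strings_with_addrs:
        if addr != expected:
            return None
        expected = addr + len(text.encode()) + 1   # +1 for the '\0'
    return expected


args = [
    ("./test", 0x7ffcc154b94f),
    ("nihao", 0x7ffcc154b956),
    ("hello", 0x7ffcc154b95c),
]
end = contiguous(args)
print(hex(end))                  # address just past "hello" and its NUL
print(end == 0x7ffcc154b962)     # compare with the first environment variable
```

The same helper could be pointed at the environment strings to see whether they are packed the same way.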
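Are the argv and env arrays adjacent?
In the example above the argv array has 4 elements: the three command-line arguments plus a NULL pointer that marks the end of the array. Each pointer is 8 bytes, so the array takes 8 * 4 = 32 bytes, and argv(0x7ffcc154a848) + 32(0x20) = env(0x7ffcc154a868). The argv and env arrays are therefore adjacent.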
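The same arithmetic as a short sketch, using the addresses printed in the run above. The 8-byte pointer size is an assumption that holds for the x86-64 machine used here, and array_end is an illustrative helper name.

```python
POINTER_SIZE = 8  # bytes per pointer on x86-64 (assumed platform)


def array_end(base: int, n_entries: int) -> int:
    """Address just past a NULL-terminated pointer array holding n_entries
    real entries (the +1 accounts for the NULL terminator)."""
    return base + (n_entries + 1) * POINTER_SIZE


argv_addr = 0x7ffcc154a848
env_addr = 0x7ffcc154a868
argc = 3  # ./test nihao hello

print(hex(array_end(argv_addr, argc)))
print(array_end(argv_addr, argc) == env_addr)
```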
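Are the command-line argument strings stored right after the env array?
First we need the size of the env array. The array is terminated by NULL, so we can walk it until we reach NULL and count the entries. The code: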
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * main - print locations of various elements
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(int ac, char **av, char **env)
{
    int a;
    void *p;
    int i;
    int size;

    printf("Address of a: %p\n", (void *)&a);
    p = malloc(98);
    if (p == NULL) {
        fprintf(stderr, "Can't malloc\n");
        return (EXIT_FAILURE);
    }
    printf("Allocated space in the heap: %p\n", p);
    printf("Address of function main: %p\n", (void *)main);
    printf("First bytes of the main function:\n\t");
    for (i = 0; i < 15; i++) {
        printf("%02x ", ((unsigned char *)main)[i]);
    }
    printf("\n");
    printf("Address of the array of arguments: %p\n", (void *)av);
    printf("Addresses of the arguments:\n\t");
    for (i = 0; i < ac; i++) {
        printf("[%s]:%p ", av[i], (void *)av[i]);
    }
    printf("\n");
    printf("Address of the array of environment variables: %p\n", (void *)env);
    printf("Address of the first environment variables:\n");
    for (i = 0; i < 3; i++) {
        printf("\t[%p]:\"%s\"\n", (void *)env[i], env[i]);
    }
    /* size of the env array */
    i = 0;
    while (env[i] != NULL) {
        i++;
    }
    i++; /* the NULL pointer */
    size = i * sizeof(char *);
    printf("Size of the array env: %d elements -> %d bytes (0x%x)\n", i, size, size);
    return (EXIT_SUCCESS);
}

Compile and run: gcc main.c -o test; ./test nihao hello
Output:
Address of a: 0x7ffd5ebadff4
Allocated space in the heap: 0x562ba4e13670
Address of function main: 0x562ba2f1881a
First bytes of the main function:
    55 48 89 e5 48 83 ec 40 89 7d dc 48 89 75 d0
Address of the array of arguments: 0x7ffd5ebae0f8
Addresses of the arguments:
    [./test]:0x7ffd5ebae94f [nihao]:0x7ffd5ebae956 [hello]:0x7ffd5ebae95c
Address of the array of environment variables: 0x7ffd5ebae118
Address of the first environment variables:
    [0x7ffd5ebae962]:"LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:"
    [0x7ffd5ebaef4e]:"HOSTNAME=3e8650948c0c"
    [0x7ffd5ebaef64]:"OLDPWD=/"
Size of the array env: 11 elements -> 88 bytes (0x58)

The calculation (bc only accepts upper-case hex digits, hence the syntax error on the first attempt):
root@3e8650948c0c:/ubuntu# bc
bc 1.07.1
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
obase=16
ibase=16
58+7ffd5ebae118
(standard_in) 3: syntax error
58+7FFD5EBAE118
7FFD5EBAE170
quit
The result shows that 7FFD5EBAE170 != 0x7ffd5ebae94f, so the command-line argument strings are not stored immediately after the env array. The diagram so far:
virtual_memory_args_env.png
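The bc calculation can be double-checked with Python's integer arithmetic, which accepts lower-case hex literals directly. The sketch uses the values printed in the run above and also reports how large the gap between the end of the env array and the first argument string is.

```python
env_addr = 0x7ffd5ebae118    # address of the env array
env_size = 0x58              # 11 pointers * 8 bytes, as printed by the program
first_arg = 0x7ffd5ebae94f   # address of the "./test" string

env_end = env_addr + env_size
gap = first_arg - env_end

print(hex(env_end))          # where the env array ends
print(env_end == first_arg)  # False means something else sits in between
print(gap)                   # size of the gap in bytes
```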
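Does the stack really grow downward?
We can check this by calling a function: if the stack really grows downward, the local variables of the calling function should be at higher addresses than those of the called function. The code: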
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void f(void)
{
    int a;
    int b;
    int c;

    a = 98;
    b = 1024;
    c = a * b;
    printf("[f] a = %d, b = %d, c = a * b = %d\n", a, b, c);
    printf("[f] Adresses of a: %p, b = %p, c = %p\n", (void *)&a, (void *)&b, (void *)&c);
}

int main(int ac, char **av, char **env)
{
    int a;
    void *p;
    int i;
    int size;

    printf("Address of a: %p\n", (void *)&a);
    p = malloc(98);
    if (p == NULL) {
        fprintf(stderr, "Can't malloc\n");
        return (EXIT_FAILURE);
    }
    printf("Allocated space in the heap: %p\n", p);
    printf("Address of function main: %p\n", (void *)main);
    f();
    return (EXIT_SUCCESS);
}

Compile and run: gcc main.c -o test; ./test
Output:
Address of a: 0x7ffefc75083c
Allocated space in the heap: 0x564d46318670
Address of function main: 0x564d45b9880e
[f] a = 98, b = 1024, c = a * b = 100352
[f] Adresses of a: 0x7ffefc7507ec, b = 0x7ffefc7507f0, c = 0x7ffefc7507f4
The result: f{a} 0x7ffefc7507ec < main{a} 0x7ffefc75083c, so the variables of the called function sit below those of the caller. The diagram:
virtual_memory_stack.png
The memory layout can also be inspected with a small program that reads the process's maps file in the /proc filesystem.
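A minimal sketch of such a program is shown below. It lists the bracketed pseudo-path regions ([heap], [stack], ...) in address order. The named_regions helper and the SAMPLE text (three lines taken from the maps listing of the loop example) are illustrative; on a system without /proc the sketch falls back to the sample.

```python
import os


def named_regions(maps_text: str):
    """Yield (name, start, end) for bracketed pseudo-paths such as [heap]."""
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue                      # anonymous mapping, no pathname
        name = parts[5].strip()
        if not (name.startswith("[") and name.endswith("]")):
            continue                      # file-backed mapping
        start, end = (int(x, 16) for x in parts[0].split("-"))
        yield name, start, end


SAMPLE = """\
021dc000-021fd000 rw-p 00000000 00:00 0 [heap]
7f2adb41b000-7f2adb41c000 rw-p 00000000 00:00 0
7ffd51bb3000-7ffd51bd4000 rw-p 00000000 00:00 0 [stack]
"""

# Use the live map of this process on Linux, else the sample from the article.
path = "/proc/self/maps"
if os.path.exists(path):
    with open(path) as f:
        text = f.read()
else:
    text = SAMPLE

for name, start, end in sorted(named_regions(text), key=lambda r: r[1]):
    size_kib = (end - start) // 1024
    print("{:<12} {:#018x}-{:#018x} {:>8} KiB".format(name, start, end, size_kib))
```

When run on Linux this prints the layout of the Python interpreter itself, so the exact regions and addresses will differ from the C examples in this post.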
Heap memory (malloc)
malloc
malloc is the usual function for allocating memory dynamically, and the memory it returns lives in the heap. Note that malloc is a glibc function, not a system call. From man malloc:
[...] allocate dynamic memory [...]
void *malloc(size_t size);
[...]
The malloc() function allocates size bytes and returns a pointer to the allocated memory.
No malloc, no [heap]
Consider a program that does not call malloc:
#include <stdlib.h>
#include <stdio.h>

/**
 * main - do nothing
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    getchar();
    return (EXIT_SUCCESS);
}

Compile and run: gcc test.c -o 2; ./2
Step 1: ps aux | grep ./2$
Output:
zjucad 3023 0.0 0.0 4352 788 pts/3 S+ 13:58 0:00 ./2
Step 2: cat /proc/3023/maps
Output:
00400000-00401000 r-xp 00000000 08:01 811723 /home/zjucad/wangzhiqiang/2
00600000-00601000 r--p 00000000 08:01 811723 /home/zjucad/wangzhiqiang/2
00601000-00602000 rw-p 00001000 08:01 811723 /home/zjucad/wangzhiqiang/2
007a4000-007c5000 rw-p 00000000 00:00 0 [heap]
7f954ca02000-7f954cbc2000 r-xp 00000000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f954cbc2000-7f954cdc2000 ---p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f954cdc2000-7f954cdc6000 r--p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f954cdc6000-7f954cdc8000 rw-p 001c4000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7f954cdc8000-7f954cdcc000 rw-p 00000000 00:00 0
7f954cdcc000-7f954cdf2000 r-xp 00000000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f954cfd2000-7f954cfd5000 rw-p 00000000 00:00 0
7f954cff1000-7f954cff2000 r--p 00025000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f954cff2000-7f954cff3000 rw-p 00026000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7f954cff3000-7f954cff4000 rw-p 00000000 00:00 0
7ffed68a1000-7ffed68c2000 rw-p 00000000 00:00 0 [stack]
7ffed690e000-7ffed6911000 r--p 00000000 00:00 0 [vvar]
7ffed6911000-7ffed6913000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
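The expectation is that a program that never calls malloc has no [heap] entry in maps. Note, however, that the listing above does contain one (007a4000-007c5000). The program itself never calls malloc, so a likely explanation (an assumption, not verified here) is that getchar() makes libc allocate a buffer for stdin internally, which creates the heap.
Next, run a program that does call malloc: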
#include <stdio.h>
#include <stdlib.h>

/**
 * main - prints the malloc returned address
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    void *p;

    p = malloc(1);
    printf("%p\n", p);
    getchar();
    return (EXIT_SUCCESS);
}

Compile and run: gcc test.c -o 3; ./3
Output: 0xcc7010
Verification steps and output:
zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ ps aux | grep ./3$
zjucad 3113 0.0 0.0 4352 644 pts/3 S+ 14:06 0:00 ./3
zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ cat /proc/3113/maps
00400000-00401000 r-xp 00000000 08:01 811726 /home/zjucad/wangzhiqiang/3
00600000-00601000 r--p 00000000 08:01 811726 /home/zjucad/wangzhiqiang/3
00601000-00602000 rw-p 00001000 08:01 811726 /home/zjucad/wangzhiqiang/3
00cc7000-00ce8000 rw-p 00000000 00:00 0 [heap]
7fc7e9128000-7fc7e92e8000 r-xp 00000000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7fc7e92e8000-7fc7e94e8000 ---p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7fc7e94e8000-7fc7e94ec000 r--p 001c0000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7fc7e94ec000-7fc7e94ee000 rw-p 001c4000 08:01 8661324 /lib/x86_64-linux-gnu/libc-2.23.so
7fc7e94ee000-7fc7e94f2000 rw-p 00000000 00:00 0
7fc7e94f2000-7fc7e9518000 r-xp 00000000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7fc7e96f8000-7fc7e96fb000 rw-p 00000000 00:00 0
7fc7e9717000-7fc7e9718000 r--p 00025000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7fc7e9718000-7fc7e9719000 rw-p 00026000 08:01 8661310 /lib/x86_64-linux-gnu/ld-2.23.so
7fc7e9719000-7fc7e971a000 rw-p 00000000 00:00 0
7ffc91c18000-7ffc91c39000 rw-p 00000000 00:00 0 [stack]
7ffc91d5f000-7ffc91d62000 r--p 00000000 00:00 0 [vvar]
7ffc91d62000-7ffc91d64000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
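This program calls malloc, and maps has a [heap] region. The address returned by malloc lies inside that region, but not at its very start: it is 0x10 bytes past the beginning. Why? The next section looks into it.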
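The containment check and the 0x10 offset can be computed directly from the printed pointer and the [heap] range in maps. A small sketch, where offset_in_region is a hypothetical helper name and the values come from the two runs shown in this post:

```python
def offset_in_region(addr: int, region: str) -> int:
    """Return addr's offset from the start of a 'start-end' maps range,
    or raise ValueError if addr lies outside it."""
    start, end = (int(x, 16) for x in region.split("-"))
    if not start <= addr < end:
        raise ValueError("address is outside the region")
    return addr - start


# malloc(1) in this section, and the strdup'd string in the loop example
print(hex(offset_in_region(0xcc7010, "00cc7000-00ce8000")))
print(hex(offset_in_region(0x21dc010, "021dc000-021fd000")))
```

Both runs give the same offset, which suggests the 0x10 bytes in front of the first allocation are not a coincidence of one particular run.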
strace, brk, sbrk
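malloc is not a system call. It is an ordinary function, so it has to invoke some system call in order to manipulate the heap. The strace tool traces the system calls and signals of a process. To identify which system calls come from malloc, the program issues a write system call before and after the malloc call as markers.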
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/**
 * main - let's find out which syscall malloc is using
 *
 * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
 */
int main(void)
{
    void *p;

    write(1, "BEFORE MALLOC\n", 14);
    p = malloc(1);
    write(1, "AFTER MALLOC\n", 13);
    printf("%p\n", p);
    getchar();
    return (EXIT_SUCCESS);
}

Compile: gcc test.c -o 4
zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ strace ./4
execve("./4", ["./4"], [/* 34 vars */]) = 0
brk(NULL) = 0x781000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=111450, ...}) = 0
mmap(NULL, 111450, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f37720fa000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "177ELF2113 3 >

