Dissecting Bloated Executables
Did you ever wonder what’s used to stuff the sausage of a Windows executable file? In yesterday’s post I examined a simple text-only C program, and discovered that 18 lines of C code created a 6144 byte executable program. Using OllyDbg, I learned that the functions I wrote compiled into only 120 bytes of code, but the executable was 50 times larger than that. This was true even when the C runtime library was in a DLL instead of statically linked, code was compiled in release mode and optimization was set for “minimum size”, and all the advanced compiler and linker options were turned off in an effort to eliminate surprises. No hot-patching support, C++ exception handling, function inlining, buffer overrun checks, security development lifecycle checks, whole program optimization, etc. The complete set of command line switches for the compiler and linker were as follows (using Microsoft Visual Studio Express 2012):
/Yu"stdafx.h" /GS- /analyze- /W3 /Gy- /Zc:wchar_t /Zi /Gm- /O1 /Ob0 /sdl- /Fd"Release\vc110.pdb" /fp:precise /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_CRT_SECURE_NO_WARNINGS" /D "_MBCS" /errorReport:none /WX- /Zc:forScope /Gd /Oy- /MD /Fa"Release\" /nologo /Fo"Release\" /Fp"Release\Backwards.pch" /OUT:"C:\Users\chamberlin\Documents\Reversing\Release\Backwards.exe" /MANIFEST /NXCOMPAT /PDB:"C:\Users\chamberlin\Documents\Reversing\Release\Backwards.pdb" /DYNAMICBASE:NO "kernel32.lib" "user32.lib" "gdi32.lib" "winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib" "ole32.lib" "oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib" /MACHINE:X86 /OPT:REF /SAFESEH:NO /INCREMENTAL:NO /PGD:"C:\Users\chamberlin\Documents\Reversing\Release\Backwards.pgd" /SUBSYSTEM:CONSOLE /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /ManifestFile:"Release\Backwards.exe.intermediate.manifest" /OPT:ICF /ERRORREPORT:NONE /NOLOGO /TLBID:1
The Windows PE Header
What else is eating up all that space in the executable file? For starters, every Windows executable begins with a header that describes its contents. In fact, there are two headers. The first 256 bytes of any modern Windows application is actually a legacy DOS executable header and a small 16-bit DOS program. The DOS header begins with the two letters MZ (ASCII 4D 5A), which you can see by opening the executable file in any binary editor. The DOS program is a hold-over from the early days of Windows, when a confused person might try to run a Windows program from inside DOS. Copy a modern Windows executable file to an ancient DOS box and run it, and the embedded DOS program will print a message like “This program cannot be run in DOS mode.” Score 1 for backwards compatibility.
Following the DOS header and stub program is a Windows PE (portable executable) header, where all the interesting stuff is found. The PE header has a variable size, but is typically a few hundred bytes, and is 408 bytes for the example program described here. The PE header is used by the Windows loader to place the program’s code and data into memory, and to perform run-time dynamic linking with DLLs. It describes what sections the executable has, what functions it imports, and lots of other goodies. The PE header can be explored using the Microsoft tool dumpbin, which is included with a standard install of Visual Studio. Running dumpbin /headers on the example program produces this output:
Microsoft (R) COFF/PE Dumper Version 11.00.50727.1 Copyright (C) Microsoft Corporation. All rights reserved. Dump of file Backwards.exe PE signature found File Type: EXECUTABLE IMAGE FILE HEADER VALUES 14C machine (x86) 4 number of sections 560B1034 time date stamp Tue Sep 29 15:27:00 2015 0 file pointer to symbol table 0 number of symbols E0 size of optional header 103 characteristics Relocations stripped Executable 32 bit word machine OPTIONAL HEADER VALUES 10B magic # (PE32) 11.00 linker version A00 size of code C00 size of initialized data 0 size of uninitialized data 12E9 entry point (004012E9) 1000 base of code 2000 base of data 400000 image base (00400000 to 00404FFF) 1000 section alignment 200 file alignment 6.00 operating system version 0.00 image version 6.00 subsystem version 0 Win32 version 5000 size of image 400 size of headers 0 checksum 3 subsystem (Windows CUI) 8100 DLL characteristics NX compatible Terminal Server Aware 100000 size of stack reserve 1000 size of stack commit 100000 size of heap reserve 1000 size of heap commit 0 loader flags 10 number of directories 0 [ 0] RVA [size] of Export Directory 21B4 [ 3C] RVA [size] of Import Directory 4000 [ 1E0] RVA [size] of Resource Directory 0 [ 0] RVA [size] of Exception Directory 0 [ 0] RVA [size] of Certificates Directory 0 [ 0] RVA [size] of Base Relocation Directory 0 [ 0] RVA [size] of Debug Directory 0 [ 0] RVA [size] of Architecture Directory 0 [ 0] RVA [size] of Global Pointer Directory 0 [ 0] RVA [size] of Thread Storage Directory 2100 [ 40] RVA [size] of Load Configuration Directory 0 [ 0] RVA [size] of Bound Import Directory 2000 [ A0] RVA [size] of Import Address Table Directory 0 [ 0] RVA [size] of Delay Import Directory 0 [ 0] RVA [size] of COM Descriptor Directory 0 [ 0] RVA [size] of Reserved Directory SECTION HEADER #1 .text name 8BA virtual size 1000 virtual address (00401000 to 004018B9) A00 size of raw data 400 file pointer to raw data (00000400 to 00000DFF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers 60000020 flags Code Execute Read SECTION HEADER #2 .rdata name 526 virtual size 2000 virtual address (00402000 to 00402525) 600 size of raw data E00 file pointer to raw data (00000E00 to 000013FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers 40000040 flags Initialized Data Read Only SECTION HEADER #3 .data name 38C virtual size 3000 virtual address (00403000 to 0040338B) 200 size of raw data 1400 file pointer to raw data (00001400 to 000015FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers C0000040 flags Initialized Data Read Write SECTION HEADER #4 .rsrc name 1E0 virtual size 4000 virtual address (00404000 to 004041DF) 200 size of raw data 1600 file pointer to raw data (00001600 to 000017FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers 40000040 flags Initialized Data Read Only Summary 1000 .data 1000 .rdata 1000 .rsrc 1000 .text
This executable has four sections:
- .text (8BA hex or 2234 decimal bytes)
- .rdata (526 hex or 1318 decimal bytes)
- .data (38C hex or 908 decimal bytes)
- .rsrc (1E0 hex or 480 decimal bytes)
That’s 4940 total bytes of raw section data, but each section must be 512 byte aligned. Including the alignment padding, the four sections combined use 5488 bytes on disk. So that’s where the bulk of the file’s data lies.
Imports
The PE header also contains the executable’s imports: the list of DLLs that it requires and the functions that are used in each DLL. For this text-only console-based example program, I would expect to see a couple of functions like printf and scanf imported from the C runtime library, and maybe a few other functions imported from kernel32.dll for creating and managing the console window. I can use dumpbin /imports to to parse the PE header and display the imports list:
Dump of file Backwards.exe File Type: EXECUTABLE IMAGE Section contains the following imports: MSVCR110.dll 402024 Import Address Table 402214 Import Name Table 0 time date stamp 0 Index of first forwarder reference 21C _cexit 22C _configthreadlocale 1E2 __setusermatherr 2EF _initterm_e 2EE _initterm 1A5 __initenv 284 _fmode 22B _commode 13B ?terminate@@YAXXZ 269 _exit 36C _lock 4D6 _unlock 21B _calloc_crt 19C __dllonexit 412 _onexit 2F6 _invoke_watson 22F _controlfp_s 260 _except_handler4_common 23B _crt_debugger_hook 19A __crtUnhandledException 199 __crtTerminateProcess 5BC exit 1E0 __set_app_type 1A4 __getmainargs 205 _amsg_exit 16F _XcptFilter 649 strlen 630 scanf 198 __crtSetUnhandledExceptionFilter 620 printf KERNEL32.dll 402000 Import Address Table 4021F0 Import Name Table 0 time date stamp 0 Index of first forwarder reference 383 IsDebuggerPresent 117 DecodePointer 311 GetTickCount64 2F4 GetSystemTimeAsFileTime 228 GetCurrentThreadId 43C QueryPerformanceCounter 13C EncodePointer 388 IsProcessorFeaturePresent
Wow! There are a lot more functions imported from the C runtime library than you might have expected, including some odd-looking ones like _invoke_watson and _crt_debugger_hook, and several functions related to exception handling. Remember, C++ exception handling was disabled in the compiler options, so seeing these functions imported here is something of a surprise. But the really strange discovery is the list of functions imported from kernel32.dll. Why does it need to check if a debugger is present, or use functions like GetTickCount64 or QueryPerformanceCounter? There’s nothing timing-related at all in the example program, so the presence of these imports is a complete mystery. Hopefully I can find an explanation later when I examine the other parts of the executable.
Exploring the Sections
.reloc
The executable originally had a 960 byte .reloc section too, but I suppressed that. Code in the .text segment is assembled using absolute addressing, assuming it will be loaded at a fixed image base (typically 00400000). If the Windows loader can’t place the program at that address, it will choose a different base address, and use the information in the .reloc segment to find absolute address references in the code that need to be patched.
But I think this feature is no longer needed today, thanks to virtual memory. Each program gets its own private virtual address space, so what could possibly conflict with it such that it couldn’t be loaded at 00400000? Indeed, none of the other example programs I looked at had a .reloc section, but mine did. It turned out that Visual Studio was adding a .reloc section by default, as a result of the Randomize Base Address feature controlled by the /DYNAMICBASE command line switch. This feature chooses a different base address at which to load the program every time it’s run, which I guess is some kind of security feature. After specifying /DYNAMICBASE:NO for the linker, the .reloc section disappeared and the program continued to run fine.
.rsrc
What about that resource section, .rsrc? It’s normally used to hold Windows resources like cursors and images, but this is a text-only console program. It doesn’t need any resources, so why is the .rsrc section there at all? I can use dumpbin again to look at the raw data in the .rsrc section, with dumpbin /section:.rsrc /rawdata:
SECTION HEADER #4 .rsrc name 1E0 virtual size 4000 virtual address (00404000 to 004041DF) 200 size of raw data 1600 file pointer to raw data (00001600 to 000017FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers 40000040 flags Initialized Data Read Only RAW DATA #4 00404000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ................ 00404010: 18 00 00 00 18 00 00 80 00 00 00 00 00 00 00 00 ................ 00404020: 00 00 00 00 00 00 01 00 01 00 00 00 30 00 00 80 ............0... 00404030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ................ 00404040: 09 04 00 00 48 00 00 00 60 40 00 00 7D 01 00 00 ....H...`@..}... 00404050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00404060: 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 27 31 <?xml version='1 00404070: 2E 30 27 20 65 6E 63 6F 64 69 6E 67 3D 27 55 54 .0' encoding='UT 00404080: 46 2D 38 27 20 73 74 61 6E 64 61 6C 6F 6E 65 3D F-8' standalone= 00404090: 27 79 65 73 27 3F 3E 0D 0A 3C 61 73 73 65 6D 62 'yes'?>..<assemb 004040A0: 6C 79 20 78 6D 6C 6E 73 3D 27 75 72 6E 3A 73 63 ly xmlns='urn:sc 004040B0: 68 65 6D 61 73 2D 6D 69 63 72 6F 73 6F 66 74 2D hemas-microsoft- 004040C0: 63 6F 6D 3A 61 73 6D 2E 76 31 27 20 6D 61 6E 69 com:asm.v1' mani 004040D0: 66 65 73 74 56 65 72 73 69 6F 6E 3D 27 31 2E 30 festVersion='1.0 004040E0: 27 3E 0D 0A 20 20 3C 74 72 75 73 74 49 6E 66 6F '>.. <trustInfo 004040F0: 20 78 6D 6C 6E 73 3D 22 75 72 6E 3A 73 63 68 65 xmlns="urn:sche 00404100: 6D 61 73 2D 6D 69 63 72 6F 73 6F 66 74 2D 63 6F mas-microsoft-co 00404110: 6D 3A 61 73 6D 2E 76 33 22 3E 0D 0A 20 20 20 20 m:asm.v3">.. 00404120: 3C 73 65 63 75 72 69 74 79 3E 0D 0A 20 20 20 20 <security>.. 00404130: 20 20 3C 72 65 71 75 65 73 74 65 64 50 72 69 76 <requestedPriv 00404140: 69 6C 65 67 65 73 3E 0D 0A 20 20 20 20 20 20 20 ileges>.. 00404150: 20 3C 72 65 71 75 65 73 74 65 64 45 78 65 63 75 <requestedExecu 00404160: 74 69 6F 6E 4C 65 76 65 6C 20 6C 65 76 65 6C 3D tionLevel level= 00404170: 27 61 73 49 6E 76 6F 6B 65 72 27 20 75 69 41 63 'asInvoker' uiAc 00404180: 63 65 73 73 3D 27 66 61 6C 73 65 27 20 2F 3E 0D cess='false' />. 00404190: 0A 20 20 20 20 20 20 3C 2F 72 65 71 75 65 73 74 . </request 004041A0: 65 64 50 72 69 76 69 6C 65 67 65 73 3E 0D 0A 20 edPrivileges>.. 004041B0: 20 20 20 3C 2F 73 65 63 75 72 69 74 79 3E 0D 0A </security>.. 004041C0: 20 20 3C 2F 74 72 75 73 74 49 6E 66 6F 3E 0D 0A </trustInfo>.. 004041D0: 3C 2F 61 73 73 65 6D 62 6C 79 3E 0D 0A 00 00 00 </assembly>.....
Interesting… there’s a plain-text XML file in the resource section. This is the Windows application manifest, and is used to indicate whether the program needs administrator privileges in order to run, kind of like the setuid flag under Linux. I believe the manifest can also be used to select a specific DLL to use with the program, if multiple versions of the same DLL exist. Does this program actually need a manifest? I’m not sure, but with padding it’s taking up 512 bytes.
.rdata
Next let’s look at the read-only data section, .rdata. The example program contains three string constants that would seem to be the only candidates for read-only data, and they’re maybe 50 total bytes. What else is in the .rdata section to make it 1318 bytes? I can use dumpbin again to peek inside:
SECTION HEADER #2 .rdata name 526 virtual size 2000 virtual address (00402000 to 00402525) 600 size of raw data E00 file pointer to raw data (00000E00 to 000013FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers 40000040 flags Initialized Data Read Only RAW DATA #2 00402000: E8 24 00 00 D8 24 00 00 C6 24 00 00 AC 24 00 00 è$..O$..Æ$..¬$.. 00402010: 96 24 00 00 7C 24 00 00 6C 24 00 00 FC 24 00 00 .$..|$..l$..ü$.. 00402020: 00 00 00 00 08 23 00 00 12 23 00 00 28 23 00 00 .....#...#..(#.. 00402030: 3C 23 00 00 4A 23 00 00 56 23 00 00 62 23 00 00 <#..J#..V#..b#.. 00402040: 6C 23 00 00 78 23 00 00 00 23 00 00 B0 23 00 00 l#..x#...#..°#.. 00402050: B8 23 00 00 C2 23 00 00 D0 23 00 00 DE 23 00 00 ,#..A#..D#.._#.. 00402060: E8 23 00 00 FA 23 00 00 0A 24 00 00 24 24 00 00 è#..ú#...$..$$.. 00402070: 3A 24 00 00 54 24 00 00 F8 22 00 00 E6 22 00 00 :$..T$..o"..æ".. 00402080: D6 22 00 00 C8 22 00 00 BA 22 00 00 A2 22 00 00 Ö"..E"..º"..¢".. 00402090: 9A 22 00 00 8C 23 00 00 90 22 00 00 00 00 00 00 ."...#..."...... 004020A0: 00 00 00 00 39 11 40 00 00 00 00 00 00 00 00 00 ....9.@......... 004020B0: 80 10 40 00 3B 15 40 00 34 13 40 00 00 00 00 00 ..@.;.@.4.@..... 004020C0: 57 68 61 74 20 69 73 20 79 6F 75 72 20 6E 61 6D What is your nam 004020D0: 65 3F 20 00 25 33 31 73 00 00 00 00 59 6F 75 72 e? .%31s....Your 004020E0: 20 73 65 63 72 65 74 20 63 6F 64 65 20 69 73 3A secret code is: 004020F0: 20 00 00 00 58 30 40 00 A8 30 40 00 00 00 00 00 ...X0@."0@..... 00402100: 48 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 H............... 00402110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00402120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00402130: 00 00 00 00 00 00 00 00 00 00 00 00 18 30 40 00 .............0@. 00402140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00402150: 00 00 00 00 00 00 00 00 FE FF FF FF 00 00 00 00 ........_ÿÿÿ.... 00402160: D4 FF FF FF 00 00 00 00 FE FF FF FF 99 12 40 00 Oÿÿÿ...._ÿÿÿ..@. 00402170: AD 12 40 00 00 00 00 00 FE FF FF FF 00 00 00 00 -.@....._ÿÿÿ.... 00402180: D8 FF FF FF 00 00 00 00 FE FF FF FF 39 14 40 00 Oÿÿÿ...._ÿÿÿ9.@. 00402190: 4C 14 40 00 00 00 00 00 FE FF FF FF 00 00 00 00 L.@....._ÿÿÿ.... 004021A0: CC FF FF FF 00 00 00 00 FE FF FF FF 00 00 00 00 Iÿÿÿ...._ÿÿÿ.... 004021B0: 10 16 40 00 14 22 00 00 00 00 00 00 00 00 00 00 ..@..".......... 004021C0: AC 22 00 00 24 20 00 00 F0 21 00 00 00 00 00 00 ¬"..$ ..d!...... 004021D0: 00 00 00 00 18 25 00 00 00 20 00 00 00 00 00 00 .....%... ...... 004021E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004021F0: E8 24 00 00 D8 24 00 00 C6 24 00 00 AC 24 00 00 è$..O$..Æ$..¬$.. 00402200: 96 24 00 00 7C 24 00 00 6C 24 00 00 FC 24 00 00 .$..|$..l$..ü$.. 00402210: 00 00 00 00 08 23 00 00 12 23 00 00 28 23 00 00 .....#...#..(#.. 00402220: 3C 23 00 00 4A 23 00 00 56 23 00 00 62 23 00 00 <#..J#..V#..b#.. 00402230: 6C 23 00 00 78 23 00 00 00 23 00 00 B0 23 00 00 l#..x#...#..°#.. 00402240: B8 23 00 00 C2 23 00 00 D0 23 00 00 DE 23 00 00 ,#..A#..D#.._#.. 00402250: E8 23 00 00 FA 23 00 00 0A 24 00 00 24 24 00 00 è#..ú#...$..$$.. 00402260: 3A 24 00 00 54 24 00 00 F8 22 00 00 E6 22 00 00 :$..T$..o"..æ".. 00402270: D6 22 00 00 C8 22 00 00 BA 22 00 00 A2 22 00 00 Ö"..E"..º"..¢".. 00402280: 9A 22 00 00 8C 23 00 00 90 22 00 00 00 00 00 00 ."...#..."...... 00402290: 20 06 70 72 69 6E 74 66 00 00 30 06 73 63 61 6E .printf..0.scan 004022A0: 66 00 49 06 73 74 72 6C 65 6E 00 00 4D 53 56 43 f.I.strlen..MSVC 004022B0: 52 31 31 30 2E 64 6C 6C 00 00 6F 01 5F 58 63 70 R110.dll..o._Xcp 004022C0: 74 46 69 6C 74 65 72 00 05 02 5F 61 6D 73 67 5F tFilter..._amsg_ 004022D0: 65 78 69 74 00 00 A4 01 5F 5F 67 65 74 6D 61 69 exit..☼.__getmai 004022E0: 6E 61 72 67 73 00 E0 01 5F 5F 73 65 74 5F 61 70 nargs.à.__set_ap 004022F0: 70 5F 74 79 70 65 00 00 BC 05 65 78 69 74 00 00 p_type..¼.exit.. 00402300: 69 02 5F 65 78 69 74 00 1C 02 5F 63 65 78 69 74 i._exit..._cexit 00402310: 00 00 2C 02 5F 63 6F 6E 66 69 67 74 68 72 65 61 ..,._configthrea 00402320: 64 6C 6F 63 61 6C 65 00 E2 01 5F 5F 73 65 74 75 dlocale.â.__setu 00402330: 73 65 72 6D 61 74 68 65 72 72 00 00 EF 02 5F 69 sermatherr..ï._i 00402340: 6E 69 74 74 65 72 6D 5F 65 00 EE 02 5F 69 6E 69 nitterm_e.î._ini 00402350: 74 74 65 72 6D 00 A5 01 5F 5F 69 6E 69 74 65 6E tterm.¥.__initen 00402360: 76 00 84 02 5F 66 6D 6F 64 65 00 00 2B 02 5F 63 v..._fmode..+._c 00402370: 6F 6D 6D 6F 64 65 00 00 3B 01 3F 74 65 72 6D 69 ommode..;.?termi 00402380: 6E 61 74 65 40 40 59 41 58 58 5A 00 98 01 5F 5F nate@@YAXXZ...__ 00402390: 63 72 74 53 65 74 55 6E 68 61 6E 64 6C 65 64 45 crtSetUnhandledE 004023A0: 78 63 65 70 74 69 6F 6E 46 69 6C 74 65 72 00 00 xceptionFilter.. 004023B0: 6C 03 5F 6C 6F 63 6B 00 D6 04 5F 75 6E 6C 6F 63 l._lock.Ö._unloc 004023C0: 6B 00 1B 02 5F 63 61 6C 6C 6F 63 5F 63 72 74 00 k..._calloc_crt. 004023D0: 9C 01 5F 5F 64 6C 6C 6F 6E 65 78 69 74 00 12 04 ..__dllonexit... 004023E0: 5F 6F 6E 65 78 69 74 00 F6 02 5F 69 6E 76 6F 6B _onexit.ö._invok 004023F0: 65 5F 77 61 74 73 6F 6E 00 00 2F 02 5F 63 6F 6E e_watson../._con 00402400: 74 72 6F 6C 66 70 5F 73 00 00 60 02 5F 65 78 63 trolfp_s..`._exc 00402410: 65 70 74 5F 68 61 6E 64 6C 65 72 34 5F 63 6F 6D ept_handler4_com 00402420: 6D 6F 6E 00 3B 02 5F 63 72 74 5F 64 65 62 75 67 mon.;._crt_debug 00402430: 67 65 72 5F 68 6F 6F 6B 00 00 9A 01 5F 5F 63 72 ger_hook....__cr 00402440: 74 55 6E 68 61 6E 64 6C 65 64 45 78 63 65 70 74 tUnhandledExcept 00402450: 69 6F 6E 00 99 01 5F 5F 63 72 74 54 65 72 6D 69 ion...__crtTermi 00402460: 6E 61 74 65 50 72 6F 63 65 73 73 00 3C 01 45 6E nateProcess.<.En 00402470: 63 6F 64 65 50 6F 69 6E 74 65 72 00 3C 04 51 75 codePointer.<.Qu 00402480: 65 72 79 50 65 72 66 6F 72 6D 61 6E 63 65 43 6F eryPerformanceCo 00402490: 75 6E 74 65 72 00 28 02 47 65 74 43 75 72 72 65 unter.(.GetCurre 004024A0: 6E 74 54 68 72 65 61 64 49 64 00 00 F4 02 47 65 ntThreadId..ô.Ge 004024B0: 74 53 79 73 74 65 6D 54 69 6D 65 41 73 46 69 6C tSystemTimeAsFil 004024C0: 65 54 69 6D 65 00 11 03 47 65 74 54 69 63 6B 43 eTime...GetTickC 004024D0: 6F 75 6E 74 36 34 00 00 17 01 44 65 63 6F 64 65 ount64....Decode 004024E0: 50 6F 69 6E 74 65 72 00 83 03 49 73 44 65 62 75 Pointer...IsDebu 004024F0: 67 67 65 72 50 72 65 73 65 6E 74 00 88 03 49 73 ggerPresent...Is 00402500: 50 72 6F 63 65 73 73 6F 72 46 65 61 74 75 72 65 ProcessorFeature 00402510: 50 72 65 73 65 6E 74 00 4B 45 52 4E 45 4C 33 32 Present.KERNEL32 00402520: 2E 64 6C 6C 00 00 .dll..
The expected string constants appear at 004020C0 and consume 49 bytes. It seems that most or all of the data before and after those strings is part of the imports list. The PE header contains a bunch of offsets to the import data, but the import data itself can be located anywhere in the executable file. Apparently the linker has chosen to place it here in the .rdata section.
As best as I can tell from examining code disassemblies, most of the bytes before the string constants (00402000 to 0040209C) are a table of function pointers that will be filled in by the loader, belying the “read only” nature of this section. I guess “read only” only applies to the program itself once it starts running, and not actions performed by the loader. For example, after the loader has loaded the C runtime library DLL and determined the address of the printf function, it will place that address into one of these table entries. The main code can then use that table entry to call printf indirectly when needed.
After the strings but before the imported function names, there are 416 bytes from 004020F0 to 0040228F that appear unrelated to the imports list or any easily-identifiable code. From examining the code disassembly, it appears these are used by some mystery library code that’s inserted into the executable, but I’ve been unable to determine what it’s for.
The bytes from 00402290 onward are the actual names of the imported DLLs and the functions needed from each one. It’s a little curious that these are stored as plain text function names, instead of by index in the DLL or by a hash of the function name. I guess a few bytes wasted here isn’t very important.
.data
Next I’ll look at the .data section, for initialized data that’s both readable and writable. The example program doesn’t have any global variables or other structures that would obviously go in the .data section, so it’s not clear what’s consuming 908 bytes here. Let’s look:
SECTION HEADER #3 .data name 38C virtual size 3000 virtual address (00403000 to 0040338B) 200 size of raw data 1400 file pointer to raw data (00001400 to 000015FF) 0 file pointer to relocation table 0 file pointer to line numbers 0 number of relocations 0 number of line numbers C0000040 flags Initialized Data Read Write RAW DATA #3 00403000: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403010: FE FF FF FF FF FF FF FF 4E E6 40 BB B1 19 BF 44 _ÿÿÿÿÿÿÿNæ@»±.¿D 00403020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030C0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004030F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00403190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031C0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 004031F0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Hmm, there’s not a lot of initialization going on in this initialized data section, unless it’s a whole lot of things being initialized to zero. Nothing recognizable jumps out from the data. From examining the code disassembly for references to this area of memory, it looks like this space is used by about 40 global variables, most of which are 2 or 4 bytes, but with a few larger ones. It seems a little wasteful to store so many zeroes in the executable file when those variables could have been stored in an uninitialized section instead (.bss). But due to the 512 byte alignment requirement for the section, any zero data after the real initialized data will be free, because those zeroes have to be there anyway for alignment padding.
So what is all this stuff? The disassembly shows that some of it is related to the state of the terminal window, and some of it may be used in conjunction with handling of the command line args. Some of it looks like a place to save the processor state – maybe as part of a debugger integration or core dump capability (!). Most of it is referenced by blocks of code whose purpose is a complete mystery to me. Why is all this junk here, even when I’ve turned off every compiler and linker option I could find that suggested it would add extra “features” to my program? I’m beginning to feel like I’ve lost control of my own creation. Why can’t I get rid of all this extra junk, and only have the code and data that I put there myself?
.text
The text segment is where the code is stored, and there’s a lot of it – 2234 bytes. From prior examination, I know that the C functions I wrote produced only 120 bytes of code, so most of what’s in the .text segment must be something else. The disassembly is too long to include here, but you can view the disassembly of the entire .text section here.
Instead of using OllyDbg again, this time I’m using the free version of IDA from Hex-Rays. IDA is like OllyDbg in many ways, but it also has some powerful unique features, such as displaying a function’s disassembly as a directed graph instead of a straight text listing. The feature I’m interested in for this analysis is IDA’s ability to color-code different parts of the disassembly depending on what type of code it is.
By default, the address for each line is displayed in black if it’s part of a normal program function, or cyan if it’s part of a compiled-in library function that IDA recognizes, or red/brown if it’s not part of any function (as far as IDA can determine). In theory, the code resulting from my C functions should be black, other library code should be cyan, and strange mystery stuff should be red/brown. In practice this didn’t work all that consistently, but it was still a help. Unfortunately the line coloring is lost in the .text disassembly that I linked above. I couldn’t find any way to copy the disassembly as formatted text to retain the color, and a screenshot of thousands of lines of code would be impractical.
The .text segment includes code from 0040100 to 004018BA. Paging through it with the help of IDA, here’s what I found:
00401000 – 00401078 (120 bytes): This is the actual C program code that implements string reverse, and calls printf and scanf. All the code I wrote myself is here.
00401080 – 00401183 (259 bytes): Mystery code. The rest of the code never calls or jumps here, or references it in any way I’ve found. This code begins by checking for the DOS header, and then for the PE header. It then parses some fields from the PE and does other things I don’t understand, along the way calling __set_app_type, EncodePointer, _fmode, _commode, __setusermatherr, _configthreadlocale, and __getmainargs. What the heck is this?? Is it really unused code, or if not, how is it used?
00401184 – 004012E8 (356 bytes): IDA identifies this as compiled-in library code, and it appears to be the main loop for a console-based program. It does some work including calling _initterm and _initenv, before calling the C main() function. When main() returns, this code calls exit(). During the initialization process before calling main(), the code also calls a couple of longish subroutines that reference addresses in the x86 fs: segment. Google tells me this is the Win32 Thread Information Block (TIB), and contains state information about structured exception handling and other details. I’d like to know what this TIB-related code is doing, and why this whole block of code was compiled directly into my program instead of being part of the C runtime DLL or kernel32.dll.
004012E9 – 004012F2 (9 bytes): This is the entry point of the executable. It calls some subroutine whose purpose is TBD, than jumps to the library code described in the previous section.
004012F3 – 00401341 (78 bytes): Appears to be an exception filter function. What’s strange is that the only code that references it is within the function itself. Near the end of the function, it passes its own address to _crtSetUnhandledExceptionFilter. But how does the filter ever get installed in the first place? And why is there an exception filter at all, when I disabled exception handling in the build options?
00401360 – 004014A0 (320 bytes): These are the subroutines called by the setup code at 00401184, which appear to do something related to manipulating the TIB. Again, why is this needed, and why isn’t it in a DLL?
004014A1 – 0040153A (153 bytes): This is the TBD subroutine called directly from the executable’s entry point. The code looks very similar to the __security_init_cookie library function, although IDA doesn’t recognize it. The security cookie is part of a system of runtime checking for buffer overflows, that I explicitly turned off in the compiler options with /GS-. Nevertheless, the code is still here. The cookie value is computed at runtime, and it’s supposed to be impossible for an attacker to predict it ahead of time. To accomplish that, the value is determined by XOR-ing a lot of numbers obtained from calling GetSystemTimeAsFileTime, GetCurrentThreadId, GetTickCount64, and QueryPerformanceCounter. Now I know why all those functions appeared in the imports list. But why is this code included when buffer checks are disabled with the /GS- compiler option?
0040153B – 00401576 (59 bytes): Looks like unreachable code. It calls _calloc_crt and EncodePointer. Why is this here at all, if it’s unreachable?
00401577 – 00401670 (249 bytes): Here are a series of related functions, with the only entry point appearing to be in the mystery PE header scanning code at 00401080 that was previously described. And since that code never seems to be called, these functions won’t be called either. It sure seems like I’m missing something. The functions do a lot of internal data manipulation, with the only external function calls being to EncodePointer and DecodePointer.
00401671 – 00401697 (38 bytes): This function calls _controlfp_s, which gets the floating point state, and can be used to mask and unmask floating point exceptions. It may also call _invoke_watson, which brings up Microsoft’s Doctor Watson tool. That tool is a basic program error debugger, that can generate a crash report in a text file. OK, but why is this here? I don’t want this junk. As before, the only place this function is called is from the mystery PE header scanning code at 00401080. It’s definitely beginning to look like I was wrong about the code at 00401080 never being called, but then where/how is it called?
004016B0 – 0040172B (123 bytes): More crappy code I don’t understand, called from places that appear never to be called themselves. Now I’m starting to get pissed. Looks like this is more exception handling stuff of some type.
0040176C – 004018A1 (309 bytes): Something here that looks like an exception handler. It calls IsProcessorFeaturePresent, and copies the contents of all the CPU registers into global variables, then calls another function that checks IsDebuggerPresent. If yes, it calls _crt_debugger_hook, and if no it calls __crtUnhandledException. I can’t find any other code that installs this exception handler though – as far as I can tell, it’s never used.
Conclusion
That’s it. Clearly the biggest mystery (and largest amount of code) is all this stuff that looks related to exception handling, that appears never to be called. Is it actually called, through some clever mechanism that IDA and I overlooked? Or is it code that would have been called if I hadn’t disabled exception handling, but that for some reason wasn’t stripped out of the final executable file?
For this learning exercise, I’d love to create an executable file that’s free of as much of this crud as possible. I don’t need exception handlers or Doctor Watson dumpers or buffer overflow detection. If my program has a bug, just let it crash! I want to make a C program that results in as few bytes of code as is reasonably possible.
Total everything up, and my code and data combined are only about 3% of the total size of the executable file. See the table and pie graph at the top of the post for a comparison of which content consumes the most space. I’m not trying to win a “smallest .exe” contest, but that degree of bloat is frustrating when I’m trying to make a minimal executable to learn more about how it works. Maybe I’m doing something wrong? Feel free to try it yourself to double-check my results. You can grab the source code here and the exe here, and the compiler and linker settings that I used are listed above.
Read 8 comments and join the conversation8 Comments so far
Leave a reply. For customer support issues, please use the Customer Support link instead of writing comments.
Consider busting out nasm and doing it in assembler? That should get you a nice minimal PE file. (Having written a compiler myself for fun that targets Windows, you definitely don’t need all that gunk. The only thing you absolutely have to have is an import of something from kernel32.dll or your process will crash before it even hits your entry point).
Good thought, and I did make a simple assembly example with MASM. I’d sure like to make a minimal C example, though. Maybe the C library runtime functions require the exception stuff and the security cookie, even if I disable it for my own code?
There’s some information here with various tips on how to reduce the executable size in windows.
http://www.catch22.net/tuts/reducing-executable-size
For a minimal C executable, there’s always MinGW (Minimalist GNU for Windows) which is bundled with Code::Blocks IDE.
Worth noting that compiling the same code with GCC (4.8.4) using no optimisation yield an 7500 bytes binary…and with size optimisation ‘only’ 7470 bytes…and 5580 once striped.
Would you be so kind to do the same kind of examination on the GCC/Linux binary ?
I love your work, please continue to be awesome.
Interesting… so GCC’s optimized and stripped exe is slightly smaller than the one I got from the Microsoft compiler, but the difference isn’t large.
I’m not exactly sure what my goal is with this stuff… I just find it interesting. Making the smallest possible exe isn’t really the goal – for that I could try a different C runtime or compiler, or code entirely in assembly. It’s more a question of understanding what all that “extra” stuff is there for, and whether it is truly required for an exe using the Microsoft C runtime lib, or if I just haven’t found the right compiler settings to turn it off. Steve Moody’s recommended article from catch22.net looks like a good place to start – I’ll try their suggestions and see where it leads.
Another idea is to recompile with the Microsoft compiler once more, using the same optimizations and compiler settings, except retaining debug symbols. That may make it easier to decipher what the mystery code blocks are doing.
I just realized that I was confusing two different types of exception handling mechanisms: C++ exceptions and Structured Exception Handling. I disabled C++ exceptions in the compiler settings, but not SEH. In fact, I’m not sure there is a way to disable SEH, since it’s likely used by the C runtime itself. So that’s probably why all that exception-related code is showing up in my executable file.
I understand C++ exceptions: they are typed objects, and used with the C++ keywords try, throw, and catch. They’re only relevant to programs written in C++, and as far as I know they’re not processor or OS dependent.
I’m not familiar with SEH, but it appears to be a Windows-specific exception mechanism, wrapped around the x86 CPU exception processing for errors like division by zero, illegal instruction, or invalid memory reference. That sounds useful, but I don’t see why a program should be *required* to use that mechanism. When the Windows loader starts up a new program, it should initialize the exception handler vectors to some kind of default handler that’s implemented within the OS. When invoked, that handler should pop up a window saying “Your program crashed” and then terminate the misbehaving program.
I spent more time experimenting with compiler settings, and couldn’t find any settings to prevent the extra unwanted code from being included in my executable file. But what did work was to define my own entry point:
Then set the program’s entry point to MyEntryPoint in the advanced linker settings. This causes all of the unwanted code to be omitted, and shrinks the executable from 6144 to 2560 bytes. It also eliminates everything from the imports list except printf, scanf, and strlen. New section sizes are:
.text: 121 bytes – hooray!
.rdata: 344 bytes
.data: 4 bytes
.rsrc: 480 bytes – same as before
This program still uses the C runtime library, it just doesn’t include the C startup code that was being auto-generated. The new program still appears to run fine.