Build-time String Encryption for Position-Independent Code

2026-01-30

TLDR:

We turn this:

puts(pic_str("Hello, World!"));

Into this:

puts(((const char *)(({
 struct _str_struct_1 {
   char buf[14];
 } _str_inst_1;
 unsigned char *_data_ptr;
 volatile unsigned long long _key_val;
 volatile unsigned long _len = 13;
 __asm__ volatile(
     "jmp skip_str_%=\n"
     "str_data_%=:\n"
     ".byte 0x9c, 0xc5, 0xc6, 0xef, 0xf5, 0x27, 0xe5, 0xbd, 0xbb, 0xd2, 0xc6, 0xe7, 0xbb\n"
     "str_key_%=:\n"
     ".quad 0xeac50b9a83aaa0d4\n"
     "skip_str_%=:\n"
     "lea str_data_%=(%%rip), %0\n"
     "movq str_key_%=(%%rip), %1\n"
     : "=r"(_data_ptr), "=r"(_key_val)
     :
     :);
 for (volatile unsigned long _i = 0; _i < _len; _i++) {
   volatile unsigned char _key_byte = (_key_val >> ((_i % 8) * 8)) & 0xFF;
   _str_inst_1.buf[_i] = _data_ptr[_i] ^ _key_byte;
 }
 _str_inst_1.buf[13] = 0;
 _str_inst_1;
}).buf)));

and no longer have any string worries in our PIC payloads!

Why?

OK let’s start with PIC, which is position-independent code - sometimes also called shellcode. The main feature of this type of program is that it is just a snippet of machine instructions that can be directly executed and will JUST WORK™ (assuming the OS and architecture are correct).

To get a REAL PIC we can use a really underrated program: msfvenom.

Shellcodes generated by msfvenom are truly PIC.

Let’s make one now:

» msfvenom -p windows/x64/messagebox TEXT="world" TITLE=hello -f raw -v SHELLCODE > msgbox.bin                  
[-] No platform was selected, choosing Msf::Module::Platform::Windows from the payload
[-] No arch selected, selecting arch: x64 from the payload
No encoder specified, outputting raw payload
Payload size: 297 bytes

» xxd msgbox.bin                                                                                                
00000000: fc48 81e4 f0ff ffff e8cc 0000 0041 5141  .H...........AQA
00000010: 5052 5156 4831 d265 488b 5260 488b 5218  PRQVH1.eH.R`H.R.
00000020: 488b 5220 488b 7250 4d31 c948 0fb7 4a4a  H.R H.rPM1.H..JJ
00000030: 4831 c0ac 3c61 7c02 2c20 41c1 c90d 4101  H1..<a|., A...A.
00000040: c1e2 ed52 488b 5220 4151 8b42 3c48 01d0  ...RH.R AQ.B<H..
00000050: 6681 7818 0b02 0f85 7200 0000 8b80 8800  f.x.....r.......
00000060: 0000 4885 c074 6748 01d0 448b 4020 5049  ..H..tgH..D.@ PI
00000070: 01d0 8b48 18e3 564d 31c9 48ff c941 8b34  ...H..VM1.H..A.4
00000080: 8848 01d6 4831 c041 c1c9 0dac 4101 c138  .H..H1.A....A..8
00000090: e075 f14c 034c 2408 4539 d175 d858 448b  .u.L.L$.E9.u.XD.
000000a0: 4024 4901 d066 418b 0c48 448b 401c 4901  @$I..fA..HD.@.I.
000000b0: d041 8b04 8848 01d0 4158 4158 5e59 5a41  .A...H..AXAX^YZA
000000c0: 5841 5941 5a48 83ec 2041 52ff e058 4159  XAYAZH.. AR..XAY
000000d0: 5a48 8b12 e94b ffff ff5d e80b 0000 0075  ZH...K...].....u
000000e0: 7365 7233 322e 646c 6c00 5941 ba4c 7726  ser32.dll.YA.Lw&
000000f0: 07ff d549 c7c1 0000 0000 e806 0000 0077  ...I...........w
00000100: 6f72 6c64 005a e806 0000 0068 656c 6c6f  orld.Z.....hello
00000110: 0041 5848 31c9 41ba 4583 5607 ffd5 4831  .AXH1.A.E.V...H1
00000120: c941 baf0 b5a2 56ff d5                   .A....V..

The bytes generated by msfvenom are a completely independent program that pops a message box. No DLLs are needed, no installers, nothing written to disk, no online downloads. Makes you wonder why we’re shipping 1GB Electron apps when this is all it takes. But that is beside the point.

Here we see the topic of our discussion: STRINGS. If you look closely at the output you can see the strings hello and world (and user32.dll) embedded directly in the shellcode.

The main goal of this shellcode is just to call this WinAPI function:

int MessageBoxA(
    HWND hWnd,        // Handle to owner window (we can just pass 0 here)
    LPCSTR lpText,    // Pointer to message text
    LPCSTR lpCaption, // Pointer to title/caption text
    UINT uType        // Dialog box style
);

But shellcode should be just ASM - how are strings encoded/stored into assembly?

Fortunately the Metasploit Framework is open source:

Per the Windows x64 calling convention we shove values into registers and call the function.

Register  Parameter        Description
RCX       1st - hWnd       Window handle
RDX       2nd - lpText     Message text
R8        3rd - lpCaption  Title text
R9        4th - uType      Style flags

And that’s exactly what the shellcode does:

payload_asm = %(
  cld
  and rsp,0xfffffffffffffff0
  call start_main
  #{asm_block_api}
start_main:
  pop rbp
  call get_user32
  db "user32.dll", 0x00
get_user32:
  pop rcx
  mov r10d, #{Rex::Text.block_api_hash('kernel32.dll', 'LoadLibraryA')}
  call rbp
  mov r9, #{style}
  call get_text
  db "#{datastore['TEXT']}", 0x00
get_text:
  pop rdx
  call get_title
  db "#{datastore['TITLE']}", 0x00
get_title:
  pop r8
  xor rcx,rcx
  mov r10d, #{Rex::Text.block_api_hash('user32.dll', 'MessageBoxA')}
  call rbp
exitfunk:
  #{exitfunc_asm}
)

In the assembly above we observe a few things:

  1. Strings are just stored as db blocks

  2. Their addresses are pushed on the stack by the call instruction (as its return address) and popped into registers with pop rcx, pop rdx and pop r8; r9 is loaded directly with mov r9, #{style}

  3. rcx is set to zero using xor rcx, rcx.

Using the call instruction this way may seem odd (like it did to me) but it’s pretty clever and efficient.

So how is this different from normal programs?

Let’s compile a basic hello world:

» cat > main.c << 'EOF'
#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}
EOF

» x86_64-w64-mingw32-gcc main.c -ggdb -o main.exe

» x86_64-w64-mingw32-objdump -DrS -M intel --disassembler-color=on --disassemble=main main.exe
main.exe:     file format pei-x86-64
Disassembly of section .text:
0000000140001540 <main>:
#include <stdio.h>

int main() {
   140001540:	55                   	push   rbp
   140001541:	48 89 e5             	mov    rbp,rsp
   140001544:	48 83 ec 20          	sub    rsp,0x20
   140001548:	e8 f3 00 00 00       	call   140001640 <__main>
    printf("Hello, World!\n");
   14000154d:	48 8d 05 fc 2a 00 00 	lea    rax,[rip+0x2afc]        # 140004050 <.rdata>
   140001554:	48 89 c1             	mov    rcx,rax
   140001557:	e8 94 12 00 00       	call   1400027f0 <puts>
    return 0;
   14000155c:	b8 00 00 00 00       	mov    eax,0x0
}
   140001561:	48 83 c4 20          	add    rsp,0x20
   140001565:	5d                   	pop    rbp
   140001566:	c3                   	ret

» x86_64-w64-mingw32-objdump -s main.exe | grep .rdata -A 7
Contents of section .rdata:
 140004000 6c696267 63635f73 5f647732 2d312e64  libgcc_s_dw2-1.d
 140004010 6c6c005f 5f726567 69737465 725f6672  ll.__register_fr
 140004020 616d655f 696e666f 005f5f64 65726567  ame_info.__dereg
 140004030 69737465 725f6672 616d655f 696e666f  ister_frame_info
 140004040 00000000 00000000 00000000 00000000  ................
 140004050 48656c6c 6f2c2057 6f726c64 21000000  Hello, World!...
 140004060 90160040 01000000 00000000 00000000  ...@............

The string is nowhere to be seen in the .text section. The compiler stashed it in .rdata and referenced it by offset. Now our program’s split in two - steal just .text and it breaks. To make this shellcode-ready, you’d need to drag .rdata along and fix up offsets yourself. I won’t bore you with relocation mechanics; Raphael Mudge covered that beautifully in his Crystal Palace series.

So… yeah normal programs are quite different from msfvenom shellcodes.

This has to be a solved problem right? Surely others have already tackled and solved this.

I had whole sections breaking down each of these methods - history, implementation details, my personal opinions on all of them. Then I realized: nobody’s going to read that. So here’s the tl;dr in table form instead.

Method          Notes
Stack Strings   Manual character-by-character construction on stack. Bad DX, gets promoted to .rdata by compiler. Good enough for APTs lol
Donuts          Wraps payloads with custom loader. Solves PIC (Mythic, Sliver) but heavily signatured
Stardust        Forces .rdata into .text via linker script. Fundamental post, elegant, not encrypted, C++
Crystal Palace  Custom linker for building PICs. Groundbreaking, supports whole pic/blob encryption but no auto string enc, probably future of redteam tradecraft
Geekembly       Compile-time encryption via C++ templates/constexpr. Very original, still C++
SILVERPICK      Shellcode framework with template-based strings. Great repo, interesting C++ method, still C++
Zig BOF         Zig metaprogramming for stack strings. I don’t know Zig
JAI             Metaprogramming-first language. Probably best fit, not released, Jonathan Blow will get angry when he finds out his language is used for maldev
Rust            Will probably work with macros, but I don’t know Rust, will probably take half an hour to compile

As you might have gathered from reading the notes column, these aren’t serious/professional complaints about these projects - all of them are solid choices for certain situations. I just decided to build my own solution.

enc_pic_str:

Let’s start with the design and aesthetic goals I had when designing this method of string obfuscation:

  1. C. Not C++.
  2. Easy to understand and reason about: final C code should have locality of behaviour and not affect other code sections or introduce dependency on compiler features which may silently change other parts too.
  3. Transparent DX: it should be debuggable and single step-able. Obfuscation itself should be as dumb/simple as possible.
  4. Easy to extend or modify: encryption scheme should be easy to swap out, maybe with one which is less signatured or bypasses FLOSS.

At a really high level we just need to parse the source code, grab all the strings, replace them with encoded data and their decryption routine:

// we want to turn this
puts("Hello World!");

// into this
unsigned char encrypted[] = {0x0a, 0x27, 0x2e, 0x2e, 0x2d,
    0x62, 0x15, 0x2d, 0x30, 0x2e, 0x26, 0x63};

char decrypted[13];
for (int i = 0; i < 12; i++) {
  decrypted[i] = encrypted[i] ^ 0x42;
}
decrypted[12] = '\0';

puts(decrypted);

If you actually write code like this, the compiler might just delete all the decryption and put hello world back into your puts again lol.

If we go back to the code snippet above, you should be able to see some similarities. I’ll annotate it with comments right now.

puts(((const char *)(({
 // Define a struct to hold our decrypted string buffer
 // This will be passed by VALUE (not pointer) to avoid UB
 struct _str_struct_1 {
   char buf[14];
 } _str_inst_1;

 unsigned char *_data_ptr;

 // Store the 8-byte XOR key as a single 64-bit integer
 // volatile prevents compiler from optimizing away the key
 volatile unsigned long long _key_val;

 // volatile prevents compiler from optimizing away the length
 volatile unsigned long _len = 13;

 // Inline assembly to control exact ASM
 __asm__ volatile("jmp skip_str_%=\n" // Jump over the data bytes
                  "str_data_%=:\n"    // Label for our encrypted data
                  ".byte 0x9c, 0xc5, 0xc6, 0xef, 0xf5, 0x27, 0xe5, 0xbd, 0xbb, 0xd2, 0xc6, "
                  "0xe7, 0xbb\n"                    // Encrypted bytes embedded in code
                  "str_key_%=:\n"                   // Label for our 8-byte key
                  ".quad 0xeac50b9a83aaa0d4\n"      // 64-bit XOR key embedded in code
                  "skip_str_%=:\n"                  // Label after data
                  "lea str_data_%=(%%rip), %0\n"    // Load address of encrypted data into
                                                    // _data_ptr (RIP-relative addressing)
                  "movq str_key_%=(%%rip), %1\n"    // Load 64-bit key value into _key_val
                                                    // (RIP-relative addressing)
                  : "=r"(_data_ptr), "=r"(_key_val) // OUTPUTS: both values loaded from code
                  :                                 // INPUTS: none
                  :);                               // CLOBBERS: none

 // XOR decryption scheme: each byte XORed with corresponding byte from 64-bit key
 // volatile prevents loop from being optimized out
 for (volatile unsigned long _i = 0; _i < _len; _i++) {
   // Extract the appropriate key byte from the 64-bit key value
   // Uses bit shifting and masking to get byte at position (_i % 8)
   // Example: for _i=0, gets lowest 8 bits; for _i=1, gets bits 8-15, etc.
   volatile unsigned char _key_byte = (_key_val >> ((_i % 8) * 8)) & 0xFF;

   // Decrypt using extracted key byte
   _str_inst_1.buf[_i] = _data_ptr[_i] ^ _key_byte;
 }
 _str_inst_1.buf[13] = 0; // Null terminator

 _str_inst_1; // GNU statement expression: return struct BY VALUE
}).buf)));     // Access .buf member from the returned struct VALUE

The keen-eyed among you might have noticed that random bytes mid-instruction stream could produce invalid assembly. Yep - that’s why we wrap db blocks in jumps. (msfvenom used call instead.)

Detectable? Maybe. But plenty of “valid” binaries probably have garbage in their ‘.text’ sections.

But at which point in compilation are these strings replaced with this monstrosity? What will we use to create such code?

             --------------       -------------------------
 -------     |enc_pic_str |       |.c file with encrypted |    ----------
|.c file|--> |preprocessor| ----> | string blocks         |--->|compiler|
 -------     --------------       -------------------------    ----------
                                                                    |
                                                                    |
                                                                    v
                                                         ---------------------
                                                         | PIC ready program |
                                                         | (assuming only    |
                                                         | strings were the  |
                                                         | problem)          |
                                                         ---------------------

I know, I know - I had the same reaction. This is a total hack. We’re completely sidestepping language limitations and just generating the code ourselves… But it works.

For the preprocessor I initially tried Clang libtooling - find const strings by signature, replace selectively. Lots of misfires. Bad idea. libtooling runs C/C++ through LLVM’s frontend; once parsed, you can mutate the AST with full semantic info. 👀👀👀👀 Source-to-source obfuscators become possible. Someone’s gonna get nerdsniped and build this; not me, not today.

Anyway, libtooling is overkill.

Instead, I decided to create a macro that passes strings through transparently if not replaced:

#include <stddef.h> /* wchar_t */

static inline const char *pic_str_impl(const char *s) { return s; }

static inline const wchar_t *pic_str_impl_w(const wchar_t *s) { return s; }

#define pic_str(x)                       \
  _Generic((x),                          \
      char *: pic_str_impl,              \
      const char *: pic_str_impl,        \
      wchar_t *: pic_str_impl_w,         \
      const wchar_t *: pic_str_impl_w)(x)

And the replacing and xor value generation happens using every programmer’s real best friends: regex and snprintf.

Match pic_str, replace with encrypted blobs. Dumb, simple, no templates, no metaprogramming, no recursion. Ugly preprocessor step? Sure. Could be worse.

Bonus: fresh keys every build. #POLYMORPHISM, wow (feigned excitement).

enc_pic_str works on single files. While it does have a batch mode and can process multiple .c files simultaneously, it still requires some scaffolding using a build system. But if you’ve come this far that shouldn’t be too hard.

My recommendation is to build new PIC projects unity-build style, as a single compilation unit, and to include independent pieces of logic as header-only stb-style libraries whose implementations are guarded by a #define that only the main file sets. Trust me, the C compiler will thank you for placing everything in a single compilation unit. Once you have that single C file, a single enc_pic_str pass can process it.

You can find the repo here.