Making arbitrarily-large binaries from fixed-size C++ code
C++ programmers know that careless use of templates can bloat your binaries. This got me to thinking, could you write a C++ program that can make an arbitrarily large binary, just by tweaking some constant in the code?
Now of course you could just declare a really large array initialized to some non-zero constant. But that’s too easy. I wanted to see if I can generate an arbitrary amount of code (not data).
My first attempt was simple and straightforward:
// Copyright 2015 Google Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <stdio.h>
template<int I>
void f() {
printf("%d", I);
f<I-1>();
}
int main() {
f<10000>();
}
The idea here is to keep generating functions f<10000>, f<9999>,
f<9998>
, etc. Each one is a completely separate function that is
emitted into the binary, so each one makes the binary grow a little.
My code here couldn’t technically work since it never terminates the
recursion, but GCC and Clang don’t even let us get that far – they
wisely prevent us from inflicting this bad idea on ourselves:
test.cc:3:45: fatal error: recursive template instantiation exceeded maximum depth of 256
template<int I> void f() { printf("%d", I); f<I-1>(); }
^
test.cc:3:45: note: in instantiation of function template specialization 'f<9744>' requested here
template<int I> void f() { printf("%d", I); f<I-1>(); }
So our new challenge is: find a way to generate an arbitrary number of functions without exceeding 256 levels of template instantiation depth. Exponential growth to the rescue!
#include <stdio.h>
template<int I> void f() { printf("%d", I); }
template<int N, int I>
struct C {
void operator()() {
f<I>();
C<N/2, I+N>()();
C<N/2, I-N>()();
}
};
template<int I>
struct C<0, I> {
void operator()() {}
};
int main() {
C<65536, 0>()();
}
This program will generate roughly 65536 functions, each with its own
unique integer, ie. f<65536>
. It doesn’t exceed an instantiation
depth of 256 because it uses divide and conquer: each level of
instantiation doubles the number of functions, so only lg(N)
levels
of instantiation are required.
On my system, the program above compiles to a 38MB binary. You should be able to make it arbitrarily large, but go too large and you might start running into linker errors!