MASS'16: Protecting Dynamic Code by Modular Control-Flow Integrity
My notes on the talk by Gang Tan from Pennsylvania State University at Modularity across System Stacks (MASS'16).
How can Modularity help software security?
A number of bugs lead to cyber insecurity. E.g.:
- Heartbleed
- Shellshock
- POODLE
Buggy software can be as harmful as malicious software, since attackers exploit security bugs. Tiny programming mistakes can cause huge havoc.
Can we automate the mitigation of programming errors?
Approach: Use compilers for bug toleration.
We perform program transformations to embed security checks into the executable code, so that attacks are detected at runtime (e.g. with a stack guard). Such embedded checks are also known as inline reference monitors (IRMs).
Ideally, this enforces a security policy and catches a large number of attacks at only a tolerable slowdown.
Background: Control-Flow Hijacking and Control-Flow Integrity
Memory Corruption Errors: Software written in unsafe languages may suffer from memory-corruption errors:
- Buffer overflows
- Use-after-free bugs
- Format string errors
Threat model
The attacker controls data memory and can corrupt it between any two instructions (like a concurrent thread). We assume a separation between code and data memory: attackers can change neither code memory nor register contents.
Control-flow hijacking
Corrupt a code pointer and hijack it to change the control flow.
Control-Flow Integrity (CFI) [Abadi et al., CCS 2005]
1) Pre-determine a control-flow graph (CFG) of the program
2) Enforce the CFG by instrumenting indirect branches in the program
For each indirect branch, we insert a no-op containing an ID at every possible target, and before the indirect branch itself we insert an ID check that verifies the jump target carries the expected ID. For example, we place such a no-op after every call, so that returns can be checked as well.
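The ID check can be illustrated with a toy model in Python rather than real machine code. This is only a sketch under assumptions: code memory is modelled as a dictionary, and all addresses, IDs and names (`CODE`, `jump_allowed`, `checked_indirect_jump`) are invented for illustration and are not taken from the talk.

```python
# Toy model of the classic CFI ID check. "Code memory" is a dict from
# address to (kind, payload); every address and ID here is made up.
CODE = {
    0x100: ("label", 0xA1),  # no-op carrying ID 0xA1 at a legitimate target
    0x104: ("insn", "body of the legitimate callee"),
    0x200: ("insn", "code the attacker would like to reach"),
    0x300: ("label", 0xB2),  # no-op with a different ID (other target class)
}


class CFIViolation(Exception):
    """Raised when an indirect branch targets an unexpected address."""


def jump_allowed(target, expected_id):
    """The inserted ID check: the target must be a no-op with the expected ID."""
    return CODE.get(target) == ("label", expected_id)


def checked_indirect_jump(target, expected_id):
    """An instrumented indirect branch: check first, then transfer control."""
    if not jump_allowed(target, expected_id):
        raise CFIViolation(f"illegal indirect branch to {target:#x}")
    return target  # in real code, execution would continue here


# Data memory holds the code pointer, so the attacker can overwrite it.
data = {"fptr": 0x100}
checked_indirect_jump(data["fptr"], expected_id=0xA1)  # passes the check

data["fptr"] = 0x200  # simulated memory corruption
try:
    checked_indirect_jump(data["fptr"], expected_id=0xA1)
except CFIViolation as err:
    print("blocked:", err)
```

The corrupted pointer is rejected because the expected ID lives in code memory, which the attacker cannot change under the threat model.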
Using a safe language instead not only lowers performance and prevents us from using our unsafe legacy code, it also requires the runtime environment of that language to be safe (e.g. JIT-spraying attacks are a vulnerability too).
Classic CFI lacks modularity
The construction of the CFG typically requires a global analysis. The inserted IDs must not overlap with the rest of the code, which again requires a global analysis.
Additionally, the CFG sometimes changes after linking, so that the CFI IDs would need to be updated.
Modular CFI [Niu & Tan PLDI 2014]
CFG is encoded as centralized tables. Those tables are consulted when enforcing CFI. They can be updated during dynamic linking, can be type-based and provide memory cache advantages.
The CFI datasource basically becomes a database.
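A rough sketch of the table idea, assuming a very simplified design: branch sites and targets are both mapped to an equivalence-class ID, and a branch may reach a target only if the IDs match. The class `CFITables`, its methods and all addresses are hypothetical. The sketch also ignores how a real implementation keeps table updates safe against concurrently running checks.

```python
class CFITables:
    """Hypothetical central store for the CFG, consulted on every check."""

    def __init__(self):
        self.branch_ids = {}  # address of an indirect branch -> class ID
        self.target_ids = {}  # address of a possible target -> class ID

    def load_module(self, branches, targets):
        """Called at (dynamic) link time: merge the new module's entries."""
        self.branch_ids.update(branches)
        self.target_ids.update(targets)

    def check(self, branch_addr, target_addr):
        """Allow the transfer only if branch and target share a class ID."""
        branch_id = self.branch_ids.get(branch_addr)
        return branch_id is not None and branch_id == self.target_ids.get(target_addr)


tables = CFITables()
# Main program: one indirect branch (class 1) and two possible targets.
tables.load_module(branches={0x10: 1}, targets={0x100: 1, 0x200: 2})
print(tables.check(0x10, 0x100), tables.check(0x10, 0x900))

# A library is linked in later and contributes another class-1 target.
# Only the tables change; the already-loaded code is left untouched.
tables.load_module(branches={}, targets={0x900: 1})
print(tables.check(0x10, 0x900))
```

Because the IDs live in tables rather than in the instruction stream, linking a new module becomes a table update instead of a code rewrite.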
Generating the CFG for C/C++ including indirect branches is hard, since concepts like function pointers, signal handlers, exceptions and so on need to be resolved.
As a performance trade-off, we loosen the precision and make the analysis type-based. This leads to more name collisions, i.e. more attacks that go undetected (false negatives). We can add precision by basing the detection on the program input in addition to types (per-input CFI, PICFI [Niu & Tan CCS 2015]). The challenge there is to only add a hook and evaluate it lazily, to avoid having to enumerate all inputs.
Example: We allow an indirect call through a function pointer of type t* to target a function if
- the function's type is structurally equivalent to t and
- the function's address is taken in the code
We allow returns to go back to any caller in the call graph.
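The two rules above can be sketched as set computations. This is a minimal illustration, assuming that types are represented as nested tuples so that structural equivalence becomes plain equality; the function names, the example functions and the call graph are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    signature: tuple     # (return type, parameter types), compared structurally
    address_taken: bool  # does the code take this function's address anywhere?


def allowed_call_targets(pointer_signature, functions):
    """Targets of an indirect call: matching type AND address taken."""
    return {
        f.name
        for f in functions
        if f.address_taken and f.signature == pointer_signature
    }


def allowed_return_targets(callee, call_graph):
    """A return may go back to any caller; call_graph maps caller -> callees."""
    return {caller for caller, callees in call_graph.items() if callee in callees}


CMP = ("int", ("void*", "void*"))
FUNCTIONS = [
    Function("cmp_int", CMP, address_taken=True),
    Function("cmp_str", CMP, address_taken=True),
    Function("cmp_unused", CMP, address_taken=False),  # right type, address never taken
    Function("log_msg", ("void", ("char*",)), address_taken=True),  # wrong type
]
CALL_GRAPH = {"main": {"sort", "log_msg"}, "sort": {"cmp_int", "cmp_str"}}

print(allowed_call_targets(CMP, FUNCTIONS))
print(allowed_return_targets("cmp_int", CALL_GRAPH))
```

Note how `cmp_int` and `cmp_str` end up in the same target set although a given call site may only ever use one of them; this is the precision that the type-based approach gives up.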
Security issues of JIT-Compilers
- JITted code is writable and executable at the same time, which opens the door to attacks
- JIT Spraying
- The attacker controls the JavaScript code and may
RockJIT [Niu & Tan CCS 2014]
RockJIT extends JIT compilers with Modular CFI. Each piece of JITted code becomes a new module.
To adapt a JIT compiler to RockJIT, the code-emission logic needs to be changed to emit MCFI-compatible code. All modifications or deletions of code need to go through RockJIT's primitives.
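To illustrate the idea of funnelling all code changes through a small set of primitives, here is a toy sketch. It does not reflect RockJIT's actual interface: the class `JitCodeHeap`, its methods, the instruction names and the verification rule are all assumptions made up for this example.

```python
def verify(code):
    """Stand-in verifier (invented rule): every indirect jump in the
    emitted code must be directly preceded by a CFI check."""
    return all(
        i > 0 and code[i - 1] == "cfi_check"
        for i, insn in enumerate(code)
        if insn == "indirect_jmp"
    )


class JitCodeHeap:
    """Hypothetical code heap that the JIT may only change via these methods."""

    def __init__(self):
        self.modules = {}  # module name -> (code, {target address: class ID})

    def install(self, name, code, targets):
        """Install or replace a module; refuse code that fails verification."""
        if not verify(code):
            return False
        self.modules[name] = (tuple(code), dict(targets))
        return True

    def delete(self, name):
        """Remove a module together with the targets it contributed."""
        self.modules.pop(name, None)

    def target_id(self, address):
        """Look up the class ID of a target across all installed modules."""
        for _code, targets in self.modules.values():
            if address in targets:
                return targets[address]
        return None


heap = JitCodeHeap()
print(heap.install("f1", ["mov", "cfi_check", "indirect_jmp"], {0x1000: 7}))
print(heap.install("f2", ["mov", "indirect_jmp"], {0x2000: 7}))  # unchecked jump
heap.delete("f1")
print(heap.target_id(0x1000))
```

Since every installation passes through `install`, unchecked code never becomes reachable, and deleting a module also withdraws its targets from the lookup.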
Future Work
Explore the security gains when the CFG is more precise.