MASS'16: Protecting Dynamic Code by Modular Control-Flow Integrity

March 14, 2016

My notes on the talk by Gang Tan from Pennsylvania State University on Modularity across System Stacks.

How can Modularity help software security?

A number of bugs lead to cyber insecurity, e.g.:

Heartbleed
Shellshock
POODLE

Buggy software can be as harmful as malicious software, since attackers exploit security bugs. Tiny programming mistakes can cause huge havoc.

Can we automate the mitigation of programming errors?

Approach: Use compilers for bug toleration.

We transform the program to embed security checks into the executable code, so that attacks are detected at runtime. Such embedded checks (a stack guard is one example) are known as inline reference monitors (IRMs).

Ideally: enforce a security policy, catch a large number of attacks, and incur only a tolerable slowdown.

Background: Control-Flow Hijacking and Control-Flow Integrity

Memory Corruption Errors: Software written in unsafe languages may suffer from memory-corruption errors:
- Buffer overflows
- Use-after-free bugs
- Format string errors

Threat model

The attacker controls data memory and can corrupt it between any two instructions (like a concurrent thread). We assume a separation between code and data memory. Attackers can change neither code memory nor register contents.

Control-flow hijacking

An attacker corrupts a code pointer (e.g. a return address or a function pointer) and thereby hijacks the control flow.

Control-Flow Integrity (CFI) [Abadi et al. CCS 2005]

1) Pre-determine a control-flow graph (CFG) of the program
2) Enforce the CFG by instrumenting indirect branches in the program

At every possible target of an indirect branch, we insert a no-op containing an ID, and before every indirect branch we insert an ID check that verifies that the jump target carries the expected ID. For example, we place such a no-op after every call, since that is where a return lands.
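The ID check can be modeled in a few lines. This is only a toy sketch, not the instrumentation from the paper: code memory is a Python dict, and all addresses, IDs and names (`CODE`, `checked_indirect_branch`, `run_demo`) are made up for illustration.

```python
# Toy model of ID-based CFI. Real CFI embeds the IDs in machine code;
# here code memory is a dict from (hypothetical) addresses to instructions.

class CFIViolation(Exception):
    """Raised when an indirect branch targets a location without the expected ID."""

# ("label", id) models the no-op that carries an ID at a legal branch target.
CODE = {
    0x100: ("label", 0xA1),  # entry of a function, legal indirect-call target
    0x104: ("body", "f"),
    0x200: ("body", "g"),    # middle of some function: no ID here
    0x300: ("label", 0xB2),  # return site after a call, carries another ID
}

def checked_indirect_branch(target, expected_id):
    """The inserted check: branch only if the target carries the expected ID."""
    instr = CODE.get(target)
    if instr is None or instr[0] != "label" or instr[1] != expected_id:
        raise CFIViolation(f"illegal branch target {target:#x}")
    return target + 4  # execution continues right after the ID no-op

def run_demo():
    fptr = 0x100  # function pointer stored in attacker-writable data memory
    next_pc = checked_indirect_branch(fptr, 0xA1)  # legitimate call passes

    fptr = 0x200  # the attacker overwrites the pointer in data memory
    try:
        checked_indirect_branch(fptr, 0xA1)
        blocked = False
    except CFIViolation:
        blocked = True
    return next_pc, blocked
```

Note that the check relies on the threat model above: the IDs live in code memory, which the attacker cannot change.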

Using a safe language instead not only lowers performance and rules out reusing our unsafe legacy code, but also requires the language's runtime environment to be safe (e.g. JIT-spraying attacks are a vulnerability too).

Classic CFI lacks modularity

The construction of the CFG typically requires a global analysis. The inserted IDs must not overlap with the rest of the code, which again requires global analysis.

Additionally, the CFG sometimes changes after linking, so the CFI IDs would then need to be updated.

Modular CFI [Niu & Tan PLDI 2014]

The CFG is encoded in centralized tables, which are consulted when enforcing CFI. The tables can be updated during dynamic linking, can be type-based, and offer memory-cache advantages.

The CFI datasource basically becomes a database.
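A minimal sketch of the table idea, assuming two lookup tables keyed by branch site and by target address. The class and method names are hypothetical, and the synchronization a real implementation needs for concurrent table updates is not modeled.

```python
class CFITables:
    """Toy centralized CFG tables; equivalence classes are plain integers."""

    def __init__(self):
        self.branch_class = {}  # indirect-branch site -> equivalence-class ID
        self.target_class = {}  # target address -> equivalence-class ID

    def load_module(self, branches, targets):
        """Merge the CFG information of a newly linked module into the tables."""
        self.branch_class.update(branches)
        self.target_class.update(targets)

    def check(self, branch_site, target):
        """Return True if the branch at branch_site may jump to target."""
        expected = self.branch_class.get(branch_site)
        return expected is not None and self.target_class.get(target) == expected


tables = CFITables()
# Main program: one indirect call site and two address-taken functions.
tables.load_module({"main:call1": 1}, {0x100: 1, 0x200: 2})
before = tables.check("main:call1", 0x900)  # library not loaded yet

# Dynamic linking adds a library function to class 1.
# Only the tables change; the already loaded code is left untouched.
tables.load_module({}, {0x900: 1})
after = tables.check("main:call1", 0x900)
```

Here `before` is False and `after` is True: the new edge becomes legal through a table update alone, which is what makes the scheme modular.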

Generating the CFG for C/C++ including indirect branches is hard, since concepts like function pointers, signal handlers, exceptions and so on need to be resolved.

As a performance trade-off, we loosen the precision and make the analysis type-based. This merges more targets into the same class, so more attacks go undetected (false negatives). Precision can be regained by basing the CFG on the program input in addition to types (per-input CFI, PICFI [Niu & Tan CCS 2015]). The challenge there is to only add hooks and evaluate them lazily, so that not all inputs have to be enumerated.

Example: We allow an indirect call through a function pointer of type t* to a function if

the function's type is structurally equivalent to t, and
the function's address is taken somewhere in the code

We allow returns to go back to any caller in the call graph.
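The two conditions for indirect calls can be sketched as a filter over the program's functions. This is a simplification under stated assumptions: structural type equivalence is modeled as tuple equality, and the function names and signatures are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Function:
    name: str
    signature: tuple      # (return type, tuple of parameter types)
    address_taken: bool   # does the code ever take this function's address?

def allowed_targets(pointer_signature, functions):
    """Names of functions an indirect call through this pointer type may reach."""
    return {
        f.name
        for f in functions
        if f.address_taken and f.signature == pointer_signature
    }

CMP = ("int", ("const void*", "const void*"))

FUNCTIONS = [
    Function("cmp_int", CMP, address_taken=True),
    Function("cmp_str", CMP, address_taken=True),
    Function("helper", CMP, address_taken=False),  # right type, address never taken
    Function("log_msg", ("void", ("const char*",)), address_taken=True),
]

targets = allowed_targets(CMP, FUNCTIONS)
```

`targets` contains both `cmp_int` and `cmp_str`, even if a given call site only ever receives one of them. That is the loss of precision the type-based approach accepts.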

Security issues of JIT-Compilers

JITted code is writable and executable at the same time, which opens the door to attacks
JIT Spraying
- the attacker controls the JavaScript code and may

RockJIT [Niu & Tan CCS 2014]

RockJIT extends JIT compilers with Modular CFI. Each piece of JITted code becomes a new module.

To adapt a JIT compiler to RockJIT, the code-emission logic needs to be changed to emit MCFI-compatible code. All modification or deletion of code needs to run through RockJIT's primitives.
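Why funnel every change through a few primitives? A toy model, which is not RockJIT's actual API, shows the idea: if code can only be installed, modified or deleted through these calls, the CFI tables can never get out of sync with the code. The class `JITCodeHeap` and its methods are hypothetical.

```python
class JITCodeHeap:
    """Toy code heap: JITted code changes only through these primitives."""

    def __init__(self):
        self.modules = {}       # module name -> set of its target addresses
        self.target_class = {}  # CFI target table: address -> class ID

    def install(self, name, targets):
        """Install a piece of JITted code as a new module, register its targets."""
        if name in self.modules:
            raise ValueError(f"module {name!r} already installed")
        self.modules[name] = set(targets)
        self.target_class.update(targets)

    def delete(self, name):
        """Remove a module together with all of its CFI table entries."""
        for address in self.modules.pop(name):
            del self.target_class[address]

    def modify(self, name, targets):
        """Replace a module's code, modeled here as delete plus install."""
        self.delete(name)
        self.install(name, targets)

    def may_target(self, address, class_id):
        """CFI check: is address currently a legal target of this class?"""
        return self.target_class.get(address) == class_id
```

After `heap.delete("jit1")`, for example, no stale entry remains that would let a corrupted pointer jump into freed code.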

Future Work

Explore the security gains when the CFG is more precise.
