# SecPod: A Framework for Virtualization-based Security Systems

**Xiaoguang Wang**<sup>†</sup>, Yue Chen<sup>†</sup>, Zhi Wang<sup>†</sup>, Yong Qi<sup>‡</sup>, Yajin Zhou<sup>‡</sup>

Florida State University<sup>†</sup> Xi'an Jiaotong University<sup>‡</sup> Qihoo 360<sup>‡</sup>



- 1. Motivation
- 2. SecPod Design
- 3. Implementation
- 4. Evaluation
- 5. Related Work

Kernel protection requires page table integrity

- Page tables decide address translation (from VA to PA)
- Page tables control memory protection
  - ▶ e.g., Data Execution Prevention (Write ⊕ eXecute)

#### However, page tables are always writable in the kernel

- Kernel needs to frequently change memory mapping
- Kernel protection can be subverted by manipulating page tables
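
To make the risk concrete, here is a minimal sketch that models an x86-64 leaf page-table entry as an integer and shows how flipping one bit defeats W ⊕ X. The bit positions (present = bit 0, writable = bit 1, no-execute = bit 63) follow the x86-64 paging format with NX enabled; the helper names and the frame address are made up for illustration.

```python
# Toy model of an x86-64 leaf page-table entry (PTE), assuming NX is enabled.
PTE_PRESENT = 1 << 0   # page is mapped
PTE_WRITABLE = 1 << 1  # R/W bit: writes allowed when set
PTE_NX = 1 << 63       # no-execute bit: instruction fetch forbidden when set


def is_writable(pte: int) -> bool:
    return bool(pte & PTE_PRESENT) and bool(pte & PTE_WRITABLE)


def is_executable(pte: int) -> bool:
    return bool(pte & PTE_PRESENT) and not (pte & PTE_NX)


def violates_wx(pte: int) -> bool:
    """A page violates W xor X if it is both writable and executable."""
    return is_writable(pte) and is_executable(pte)


# Hypothetical kernel code page: present, read-only, executable.
code_pte = 0x1000 | PTE_PRESENT
# An attacker with a kernel write primitive sets the R/W bit in the entry.
tampered_pte = code_pte | PTE_WRITABLE
```

Because the entry itself lives in kernel-writable memory, nothing in this model stops the second assignment; that is the gap the rest of the talk addresses.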


- Security tools are isolated "out-of-the-box", but need to intercept key guest events
  - ▷ e.g., guest page table updates, control-register updates
- Shadow paging enables reliable kernel memory protection
  - ▷ Hypervisor uses shadow paging to virtualize memory
  - ▷ SPTs are synchronized with GPTs by the hypervisor
    → security tools can intercept guest page table updates
  - ▷ SPTs supersede GPTs for address translation
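
The interception point that shadow paging provides can be sketched as a toy model. This is only an illustration under simplifying assumptions: page tables are flat dictionaries, and the class, method, and hook names are hypothetical rather than taken from any real hypervisor.

```python
from typing import Callable, Dict, List, Tuple

# A mapping is (physical frame number, writable, executable).
Mapping = Tuple[int, bool, bool]
Hook = Callable[[int, Mapping], bool]


class ShadowPagingHypervisor:
    """Toy hypervisor that keeps a shadow page table (SPT) in sync with the GPT."""

    def __init__(self) -> None:
        self.gpt: Dict[int, Mapping] = {}   # guest page table (guest's view)
        self.spt: Dict[int, Mapping] = {}   # shadow page table (used by the CPU)
        self.hooks: List[Hook] = []         # security tools observing updates

    def register_tool(self, hook: Hook) -> None:
        self.hooks.append(hook)

    def on_guest_pte_write(self, vpn: int, mapping: Mapping) -> bool:
        """Called when a guest write to its (write-protected) GPT traps."""
        self.gpt[vpn] = mapping
        # Tools are consulted before the update becomes effective.
        if all(hook(vpn, mapping) for hook in self.hooks):
            self.spt[vpn] = mapping
            return True
        return False  # update rejected: the SPT keeps its old state

    def translate(self, vpn: int) -> Mapping:
        # Only the SPT decides the translation, never the GPT.
        return self.spt[vpn]


def deny_writable_and_executable(vpn: int, mapping: Mapping) -> bool:
    _, writable, executable = mapping
    return not (writable and executable)
```

Since translation consults only the SPT, a rejected update changes the guest's own table but never the mapping the CPU actually uses.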



### Virtualization Hardware Obsoletes Shadow Paging

- Nested paging introduces two-level address translation for VMs
  - ▷ Both GPT and NPT are used by the CPU for address translation
- Nested paging has a big performance advantage over SPT
  - ▷ An acceleration of up to 48% for MMU-intensive tasks <sup>1</sup>
- Security tools cannot intercept guest memory updates with NPT
  - ▷ Guest is free to change its GPTs without notifying the hypervisor
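
A small sketch of why the interception point disappears. It assumes flat, single-level tables and hypothetical names; real hardware walks multi-level GPTs and NPTs.

```python
from typing import Dict, List


class NestedPagingVM:
    """Toy model: the CPU walks the GPT, then the NPT, on every translation."""

    def __init__(self, npt: Dict[int, int]) -> None:
        self.gpt: Dict[int, int] = {}   # guest virtual page -> guest physical page
        self.npt = dict(npt)            # guest physical page -> host physical page
        self.hypervisor_log: List[str] = []  # events the hypervisor observes

    def guest_set_pte(self, gvpn: int, gppn: int) -> None:
        # A plain memory write inside the guest: no trap, nothing is logged.
        self.gpt[gvpn] = gppn

    def translate(self, gvpn: int) -> int:
        gppn = self.gpt[gvpn]           # first level, controlled by the guest
        if gppn not in self.npt:
            # Only a missing NPT entry reaches the hypervisor.
            self.hypervisor_log.append(f"npt-violation gppn={gppn}")
            raise LookupError("nested page fault")
        return self.npt[gppn]           # second level, controlled by the hypervisor
```

In this model the guest can remap a virtual page to any guest-physical page that the NPT already covers, and the hypervisor's log stays empty.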



<sup>1</sup>VMware: Performance Evaluation of Intel EPT Hardware Assist.

SecPod: A Framework for Virtualization-based Security Systems. 2015 USENIX ATC, July 8-10, 2015, Santa Clara, CA.

### Why SecPod

Our Goal:

A framework for virtualization-based security tools on modern virtualization hardware with **nested paging**

Our Solution:

SecPod: A Framework for Virtualization-based Security Systems


### Threat Model and Assumption

- Trustworthy hardware and trusted booting
  - ▷ Load-time integrity is protected by trusted booting
  - ▷ IOMMU is properly configured to prevent DMA attacks
- Hypervisor is trusted
  - ▷ Formal verification [seL4, SOSP'09], integrity protection and monitoring [HyperSafe, S&P'10]
- Kernel is benign but contains vulnerabilities
  - ▷ Powerful attackers can change arbitrary memory of the kernel

### SecPod Architecture




### Key Technique I: Paging Delegation

- SecPod creates an isolated address space from the kernel
- Secure space maintains SPTs for the guest
  - ▷ SPTs are the only effective page tables for the guest
  - ▷ SPTs mirror GPTs (if no memory protection violation)
- SecPod forwards guest page table updates to secure space
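
The forwarding path might be pictured as below. This is a hedged sketch, not SecPod's code: `SecureSpace`, `GuestKernel`, and the pluggable `validate` callback are hypothetical names, and the real crossing between the two spaces (the entry/exit gates) is reduced to a plain method call.

```python
from typing import Callable, Dict


class SecureSpace:
    """Owns the SPTs; the normal-space kernel cannot write them directly."""

    def __init__(self, validate: Callable[[int, int], bool]) -> None:
        self._spt: Dict[int, int] = {}   # virtual page -> PTE value
        self._validate = validate

    def handle_update(self, vpn: int, pte: int) -> bool:
        if not self._validate(vpn, pte):
            return False                 # SPT diverges from the GPT on violations
        self._spt[vpn] = pte             # otherwise the SPT mirrors the GPT
        return True

    def effective_pte(self, vpn: int) -> int:
        return self._spt.get(vpn, 0)


class GuestKernel:
    """Normal-space kernel whose page-table writes are forwarded."""

    def __init__(self, secure_space: SecureSpace) -> None:
        self.gpt: Dict[int, int] = {}
        self._secure_space = secure_space

    def set_pte(self, vpn: int, pte: int) -> bool:
        self.gpt[vpn] = pte              # the guest's own bookkeeping
        # Forwarding step (a direct call here stands in for the gate crossing).
        return self._secure_space.handle_update(vpn, pte)
```

The point of the split is that a compromised kernel can still scribble on `gpt`, but only entries accepted by the secure space ever become effective.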




### SecPod Address Space Layout

- Normal/secure spaces use page-table-based isolation
  - ▷ Entry/exit gates are the only passage
- Guest kernel is mapped in secure space
  - ▷ Security tools can access guest memory, but not execute it




### Protecting Secure Space

#### Attacker might try to:

- Enter secure space without security checks
- Request malicious page table updates
  - ▷ e.g., to map secure memory in guest
- Misuse privileged instructions
  - ▷ e.g., to load a malicious page table, to disable paging...

#### Our countermeasures:

- Secure and efficient context switch
- Page table update validation
- Execution trapping of privileged instructions

### Secure and Efficient Context Switch

- Entry/exit gates are the only passage between secure/normal spaces
  - ▷ Each gate switches the page table, the stack...
  - ▷ Entry gate runs atomically by disabling interrupts (SIM [CCS '09])
- Loading CR3 is a privileged instruction trapped by SecPod
  - ▷ Performance overhead is high if each context switch is trapped
- Intel CR3 target list to the rescue:
  - ▷ Four page tables can be loaded without being trapped by the CPU
  - ▷ There are many SPTs for the guest
    → use a fixed top-level page table for all SPTs
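
The effect of the CR3 target list can be modeled in a few lines. This sketch assumes CR3-load trapping is enabled and abstracts the hardware check into a list lookup; the class name and the two root addresses are hypothetical.

```python
from typing import List

MAX_CR3_TARGETS = 4  # the slide states that four page tables can be whitelisted


class VCpu:
    """Toy model of CR3-load trapping with a CR3 target list."""

    def __init__(self, cr3_targets: List[int]) -> None:
        if len(cr3_targets) > MAX_CR3_TARGETS:
            raise ValueError("CR3 target list holds at most four values")
        self.cr3_targets = list(cr3_targets)
        self.cr3 = 0
        self.vm_exits = 0

    def mov_to_cr3(self, value: int) -> bool:
        """Return True if the load completed without a trap."""
        if value in self.cr3_targets:
            self.cr3 = value          # fast path: no hypervisor involvement
            return True
        self.vm_exits += 1            # trapped: the hypervisor decides what to do
        return False


# Hypothetical addresses of the two fixed top-level tables.
NORMAL_SPACE_ROOT = 0x1000   # shared root for all guest SPTs
SECURE_SPACE_ROOT = 0x2000   # root of the secure space
```

With only four slots and many SPTs, keeping one fixed top-level table for the normal space means the gates always load the same two CR3 values, so both fit in the list. How the entries of that fixed table are switched between guest processes is not modeled here.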

### Page Table Update Validation

- SecPod enforces basic kernel memory integrity for guest
  - ▷ No mapping is allowed to the secure space code/data
  - ▷ Enforce kernel W ⊕ X
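
The two rules could be checked roughly as follows. This is a sketch under stated assumptions: it handles only 4 KiB leaf entries in the x86-64 format (frame address in bits 12 to 51, NX in bit 63), and the function name, the `secure_frames` set, and the `is_kernel_page` flag are hypothetical.

```python
from typing import FrozenSet

PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_NX = 1 << 63
FRAME_MASK = 0x000F_FFFF_FFFF_F000  # bits 12..51 hold the physical frame address
PAGE_SHIFT = 12


def validate_pte(pte: int, secure_frames: FrozenSet[int], is_kernel_page: bool) -> bool:
    """Return True if a requested leaf PTE may be installed in the SPT."""
    if not pte & PTE_PRESENT:
        return True                      # non-present entries map nothing
    frame = (pte & FRAME_MASK) >> PAGE_SHIFT
    if frame in secure_frames:
        return False                     # rule 1: secure space must stay unmapped
    writable = bool(pte & PTE_WRITABLE)
    executable = not (pte & PTE_NX)
    if is_kernel_page and writable and executable:
        return False                     # rule 2: kernel W xor X
    return True
```

A real validator would also have to cover large pages and intermediate-level entries, which this sketch leaves out.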



### Key Technique II: Execute Trapping

- SecPod traps malicious privileged instructions executed by guest
  - ▷ It can trap intended and unintended<sup>2</sup> privileged instructions
- Hypervisor notifies the secure space of trapped instructions via upcalls
  - ▷ Similar to signal delivery in a traditional OS

<sup>2</sup>x86 has variable-length instructions; unintended instructions can be "created" by jumping to the middle of an instruction.
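
The footnote's point about unintended instructions can be illustrated with a byte-level example. The sketch below only shows how such an instruction arises; it assumes the standard encodings `0F 22 D8` for `mov cr3, rax` and `B8 imm32` for `mov eax, imm32`, and the function name is hypothetical. Trapping at execution time, as described above, does not depend on finding such bytes in advance.

```python
MOV_CR3_RAX = bytes([0x0F, 0x22, 0xD8])  # encoding of `mov cr3, rax`


def find_hidden_offsets(code: bytes, pattern: bytes) -> list:
    """Return every byte offset at which `pattern` occurs in `code`."""
    offsets = []
    start = code.find(pattern)
    while start != -1:
        offsets.append(start)
        start = code.find(pattern, start + 1)
    return offsets


# Intended instruction: `mov eax, 0x00D8220F` (opcode B8 + little-endian imm32).
intended = bytes([0xB8]) + (0x00D8220F).to_bytes(4, "little")
```

Calling `find_hidden_offsets(intended, MOV_CR3_RAX)` reports offset 1: decoded from its first byte the sequence is a harmless immediate load, but a jump to the second byte executes a CR3 load.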

### Trapped Sensitive Instructions

| Instruction | Semantics                        |
|-------------|----------------------------------|
| LGDT        | Load global descriptor table     |
| LLDT        | Load local descriptor table      |
| LIDT        | Load interrupt descriptor table  |
| LMSW        | Load machine status word         |
| MOV to CR0  | Write to CR0                     |
| MOV to CR4  | Write to CR4                     |
| MOV to CR8  | Write to CR8                     |
| MOV to CR3  | Load a new page table            |
| WRMSR       | Write machine-specific registers |

### Implementation

- Paging delegation
  - ▷ Leverage Linux paravirtualization interface: <code>pv_mmu_ops</code>
- Execution trapping implemented in the hypervisor (KVM)

#### Security tools:

- ▷ Compiled as ELF libraries and loaded into secure space
- ▷ Implemented an example tool to prevent unauthorized kernel code from executing (Patagonix [USENIX Sec '08], NICKLE [RAID '08])

- Maliciously modify secure space memory
  - ▷ Secure space memory is not mapped in the normal space
    → try to map the secure space memory in the guest
  - ▷ Directly change the page mapping ← SPT is isolated
  - ▷ Ask SecPod to map secure memory ← SPT update validation
- Misuse privileged instructions
  - ▷ Privileged instructions by guest are trapped and verified

### Performance Evaluation: LMBench



### Performance Evaluation: SysBench OLTP



### Related Work

- Virtualization-based security
  - Malware analysis: Ether [CCS'08]
  - Rootkit detection and prevention: PoKeR [EuroSys'09]
  - Virtual machine introspection: Virtuoso [S&P'11], SIM [CCS'09]
- Kernel/user application security
  - Exploit mitigation techniques: ASLR, DEP, CFI [CCS'07]
  - Kernel/hypervisor memory integrity: TZ-RKP [CCS'14], HyperSafe [S&P'10], Nested Kernel [ASPLOS'15]


# Summary



SecPod: A Framework for Virtualization-based Security Systems


# Thank you & Questions?
