HARDWARE FEATURES AND BEHAVIOR RELATED TO SPECULATIVE EXECUTION

ID 823144
Updated 6/21/2024
Version 1.0
Public

 * Speculative Execution
   * Incidental Channels
   * Restricting Speculative Execution
 * Control-Flow Speculation
   * Indirect Branches
     * Overview of Indirect Branch Predictors
     * Indirect Branch Speculation Control Mechanisms
     * Software Techniques for Indirect Speculation Control
   * Conditional Branches
     * Overview of Bounds Check Bypass
     * Identifying Bounds Check Bypass Vulnerabilities
     * Conditional Branch Speculation Analysis
     * Software Techniques for Conditional Speculation Control
     * Operating System Mitigations
 * Data Speculation
   * Overview of Data Speculation
   * Speculative Store Bypass
   * Speculative Store Bypass Control Mechanisms
     * Software-Based Mitigations
     * Speculative Store Bypass Disable (SSBD)
 * Data-Dependent Prefetchers
 * Additional Software Guidance
   * Operating Systems
   * System Management Mode (SMM)
 * Related Intel Security Features and Technologies
   * Intel® OS Guard
   * Execute Disable Bit
   * Intel® Control-Flow Enforcement Technology (Intel® CET)
     * Intel CET Shadow Stack Speculation Limitations
      * Intel CET Indirect Branch Tracking (CET IBT) Speculation Limitations
   * Protection Keys
   * Supervisor-Mode Access Prevention (SMAP)
 * CPUID Enumeration and Architectural MSRs
 * References
 * Footnotes


KEY TAKEAWAYS

 * Modern processors make predictions about the program’s future execution to
   improve performance. Processors implement various forms of predictions and
   speculation which may result in instructions being speculatively executed. If
   a prediction was wrong, the instructions which were speculatively executed
   based on the misprediction must be squashed and do not affect architectural
   states. However, a malicious actor may be able to use mispredictions to
   perform transient execution attacks.

 * This article consolidates prior guidance on speculative execution and brings
   together relevant information to help readers navigate this topic. This
   consolidated document explains how to effectively address speculation in
   Intel processors for secure code execution, limit the performance impact of
   mitigations, and avoid mitigation redundancies.

 * Intel plans to update this document periodically to incorporate new guidance
   documents as they are released; for example, reflecting speculation control
   mechanisms that may be added on future Intel products.


Modern processors use speculative execution to provide higher performance, more
efficient resource utilization, and better user experiences. The speculation
mechanisms may use various forms of predictors to anticipate future program
execution and improve performance by having instructions execute earlier than
their program order. While these predictors are designed to have high accuracy,
wrong predictions can occur and result in mis-speculation, where a processor
first executes instructions based on a prediction, and later squashes them to
return to correct program execution. An attacker can potentially exploit such
mis-speculation to reveal sensitive data in a transient execution attack. 

While previous documentation described specific speculative execution
vulnerabilities and their mitigations, this article consolidates the prior
guidance on speculative execution with better organization. It continues to
refer to the per-vulnerability guidance documents for more details. The first
version of this article does not change existing guidance for transient
execution attacks beyond what has previously been published, but rather brings
together relevant information to help readers navigate this topic. Later
versions of this article may include additional guidance.

This consolidated document explains how to effectively address speculation in
Intel processors for secure code execution, limit the performance impact of
mitigations, and avoid mitigation redundancies for the features and behaviors
included. It also provides an overview of the different types of speculation on
current Intel processors and describes the hardware controls and software-based
techniques that developers can use to restrict speculation and reduce the
ability of potential adversaries to infer secret data due to speculation. Intel
plans to update this article periodically to incorporate new guidance documents
as they are released; for example, reflecting speculation mechanisms that may be
added on future Intel processors.

This document is organized as follows: the Speculative Execution section starts
with an introduction to speculative execution, describes control-flow and data
speculation, and outlines options to restrict speculation. The Control-Flow
Speculation section details control-flow speculation due to indirect and
conditional branches and techniques to restrict control-flow speculation on
Intel processors. The Data Speculation section describes variants of data
speculation such as memory disambiguation and options to manage data
speculation. The Data-Dependent Prefetchers section outlines data-dependent
prefetchers. The Additional Software Guidance section summarizes the
recommendations for restricting speculation in common use cases, such as after a
processor enters a higher privilege level. The Related Intel Security Features
and Technologies section describes security features and technologies that
reduce the effectiveness of the malicious attacks described in the previous
sections. Finally, the CPUID Enumeration and Architectural MSRs section
describes the CPUID enumerations and model-specific registers associated with
the hardware features and mechanisms described in this article.


SPECULATIVE EXECUTION

In order to improve performance, modern processors make predictions about the
program’s future execution. Processors use these predictions to speculatively
execute younger instructions ahead of the current instruction pointer. As the
processor advances in program execution, it resolves all conditions required to
determine the correctness of the prediction. If the original predictions were
correct, the speculatively executed instructions can retire, and their state
becomes architecturally visible. If a prediction was wrong, the instructions
which were speculatively executed based on the misprediction must be squashed
and do not affect architectural states. These squashed instructions, which were
only executed speculatively, are called transient instructions. Based on the
resolved conditions, the processor then resumes with the correct program
execution. A more detailed description of speculative execution is available in
the Refined Speculative Execution Terminology article.

Processors implement various forms of predictions and speculation which may
result in instructions being speculatively executed, including: 

 * Control-flow speculation involves speculatively executing instructions based
   on a prediction of the program’s control flow.
   * Indirect branch predictors predict the target address of indirect branch
     instructions1 to allow instructions at the predicted target address to be
     speculatively executed before the target address has been resolved. 
   * Conditional branch predictors predict the direction of conditional branches
     to allow instructions on the predicted path to be speculatively executed
     before the condition has been resolved. 
 * Data speculation involves speculatively executing instructions which depend
   on the values from previous instructions before the previous instructions
   have been executed. For example, the processor may speculatively forward data
   from a previous load to younger dependent instructions before the addresses
   of all intervening stores are known (see the sketch after this list).
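
A minimal sketch of the memory disambiguation case described above, with
hypothetical function and variable names:

/* The younger load of *q may be executed speculatively before the
 * processor knows whether p and q alias; if they do alias, the load is
 * squashed and re-executed with the stored value. */
int disambiguation_example(int *p, int *q)
{
    *p = 1;          /* older store; its address may resolve late */
    return *q;       /* younger load; may speculatively bypass the store */
}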

While speculative execution predictors strive to have high accuracy, predictions
can be wrong. A malicious actor may be able to use mispredictions to perform
transient execution attacks, in which they attempt to retrieve secret
information from transiently executed instructions through an incidental
channel.

Multiple sources of speculation may affect the same instruction. For example, an
indirect branch may be affected by both control-flow speculation and data
speculation. Control-flow speculation may cause the indirect branch to be
predicted with a target based on past behavior. If a malicious actor controlled
the predicted branch target, this would be called attacker-controlled
prediction. Data speculation could later affect the indirect branch’s source
data and cause it to transiently go to an incorrectly predicted location before
later redirecting to the correct location. If a malicious actor controlled the
branch target through data speculation, this would be called attacker-controlled
jump redirection. 

The following sections discuss the various types of speculation as well as the
configurations Intel processors provide to control speculation.


INCIDENTAL CHANNELS

There are several sources of incidental channels that may be used to retrieve
information from transiently executed instructions. An overview of possible
incidental channels is provided in the incidental channel taxonomy. 

Using such incidental channels, a malicious actor may be able to gain
information through observing certain states of the system, such as by measuring
the microarchitectural properties of the system. Unlike buffer overflows and
other vulnerability classes, incidental channels do not directly influence the
execution of the program, nor allow data to be modified or deleted.  

For instance, a cache timing side channel involves an agent detecting whether a
piece of data is present in any or a specific level of the processor’s caches,
which may be used to infer some other related information. One common method to
detect whether the data of interest is present in a cache is to use timers to
measure the latency of an access to the corresponding address and compare it
against baseline timings for accesses that hit the cache versus accesses that go
to memory.
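
For illustration only, the sketch below shows one way such a timing measurement
might look in C. The function name, the threshold parameter, and the use of the
RDTSCP and LFENCE intrinsics are assumptions made for this example, not guidance
from this article.

#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp(), _mm_lfence() */

/* Hypothetical cache-timing probe: time a single load and compare it
 * against a calibrated threshold that separates cache-hit latency from
 * memory latency. */
static int probe_is_cached(const void *addr, uint64_t hit_threshold)
{
    unsigned int aux;
    uint64_t start, end;

    _mm_lfence();                              /* order the timed load */
    start = __rdtscp(&aux);
    (void)*(volatile const uint8_t *)addr;     /* the access being timed */
    end = __rdtscp(&aux);
    _mm_lfence();

    return (end - start) < hit_threshold;      /* short latency suggests a cache hit */
}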


RESTRICTING SPECULATIVE EXECUTION

System operators have a range of options available to restrict speculation in
Intel processors and reduce the risk of transient execution attacks. Intel
processors provide several controls, such as enhanced Indirect Branch Restricted
Speculation (IBRS) and Speculative Store Bypass Disable (SSBD), to restrict
control speculation of indirect branches and to control data speculation,
respectively. The Indirect Branch Speculation Control Mechanisms section details
the indirect branch speculation controls available and their usage and the
Speculative Store Bypass Control Mechanisms section describes controls to
restrict data speculation. 

Speculation can also be restricted through software-based techniques. For
example, software can use a technique called retpoline (see the Software
Techniques for Indirect Speculation Control section) to restrict indirect branch
speculation, and use bounds clipping to prevent speculative out-of-bounds array
accesses following conditional branches (refer to the Overview of Bounds Check
Bypass section), as sketched below.
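
As a sketch of bounds clipping under assumed conditions (a hypothetical table
whose size is a power of two, with hypothetical variable names), the fragment
below clamps an untrusted index with a mask so that even a mispredicted bounds
check cannot produce an out-of-bounds address:

#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 256              /* assumed to be a power of two */
static uint8_t table[TABLE_SIZE];

uint8_t read_clipped(size_t untrusted_index)
{
    uint8_t value = 0;
    if (untrusted_index < TABLE_SIZE) {
        /* The mask creates a data dependency that bounds the address
         * even if the branch above is mispredicted. */
        untrusted_index &= (TABLE_SIZE - 1);
        value = table[untrusted_index];
    }
    return value;
}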

More generally, software can insert speculation-stopping barriers at the proper
locations as needed to prevent a speculative side channel. The LFENCE
instruction, or any serializing instruction, can serve as such a barrier. The
LFENCE instruction and serializing instructions ensure that no later instruction
will execute, even speculatively, until all prior instructions have completed
locally. The LFENCE instruction has lower latency than the serializing
instructions and thus is recommended when a speculation-stopping barrier is
needed. 
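
A minimal sketch of such barrier placement, using the same kind of hypothetical
table and names as the bounds clipping example above, inserts LFENCE between the
bounds check and the dependent load:

#include <emmintrin.h>   /* _mm_lfence() */
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 256
static uint8_t table[TABLE_SIZE];

uint8_t read_with_barrier(size_t untrusted_index)
{
    uint8_t value = 0;
    if (untrusted_index < TABLE_SIZE) {
        /* Later instructions do not execute, even speculatively,
         * until the bounds check above has resolved. */
        _mm_lfence();
        value = table[untrusted_index];
    }
    return value;
}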

Certain security features with architectural effect can also be effective with
respect to speculative execution. For example, when Supervisor Mode Access
Prevention (SMAP) is enabled, supervisor loads executed with a cleared AC flag
will not transiently access memory in user mode pages from CPL0. This may
prevent an attacker from using user memory for an incidental channel.


CONTROL-FLOW SPECULATION

As highlighted in the Speculative Execution section, control-flow speculation
occurs when the processor speculatively executes instructions based on control
flow prediction. The two main sources of transient execution related to
control-flow speculation on Intel processors are indirect branches and
conditional branches. 

Besides control-flow speculation from branch predictions, there is also implicit
sequential control-flow speculation due to out-of-order execution: instructions
can be speculatively executed on a sequential control-flow path ahead of the
architecturally committed instruction pointer. In case of architectural or
microarchitectural events (for example, exceptions or assists), instructions on
the sequential path following the event may be transiently executed and squashed
later by the processor as part of the event handling mechanism. This is a common
behavior in modern processors with out-of-order execution and not considered a
security issue by itself. However, when combined with specific vulnerabilities
such as Rogue Data Cache Load, Rogue System Register Read, L1 Terminal Fault and
Lazy FP, malicious actors may be able to leverage speculative execution to
bypass existing security restrictions and infer secret data on some processors.
This paper does not discuss the behavior of those specific vulnerabilities;
refer to the respective technical papers for more details.

The following section of this article describes control-flow speculation due to
indirect branches and conditional branches, as well as the hardware and software
mechanisms that can be used to restrict such control-flow speculation.


INDIRECT BRANCHES

OVERVIEW OF INDIRECT BRANCH PREDICTORS

Intel processors use indirect branch predictors to determine the target address
of instructions that are to be speculatively executed after a near indirect
branch instruction, as enumerated in the table below.

Table 1: Instructions that use Indirect Branch Predictors

Branch Type          Instruction                            Opcode
Near Call Indirect   CALL r/m16, CALL r/m32, CALL r/m64     FF /2
Near Jump Indirect   JMP r/m16, JMP r/m32, JMP r/m64        FF /4
Near Return          RET, RET Imm16                         C3, C2 Iw

References in this document to indirect branches are only to near call indirect,
near jump indirect and near return instructions. 

To make accurate predictions, indirect branch predictors are trained through
program execution. Specifically, indirect branch predictors learn the target
addresses of indirect branch instructions when they execute and use them for
target prediction of subsequent executions of indirect branch instructions.
While these predictors are accurate in most cases, mispredictions can occur: the
indirect branch predictor may predict the wrong target address, which can result
in instructions at an incorrect code location being speculatively executed and
later squashed.

Intel processors implement different forms of indirect branch predictors, such
as: 

 * The Branch Target Buffer (BTB) predicts an indirect branch’s target address
   based on the branch instruction’s address.
 * Other branch predictors predict an indirect branch’s target address based on
   the history of previously executed branch instructions. This allows the
   processor to predict different targets for the same indirect branch depending
   on the previous code leading up to the indirect branch. The Branch History
   Buffer (BHB) holds the history that is used to select branch targets in these
   predictors.
 * The Return Stack Buffer (RSB) is a microarchitectural structure that predicts
   the targets of near RET instructions based on previous corresponding CALL
   instructions. Each execution of a near CALL instruction with a non-zero
   displacement adds an entry to the RSB that contains the address of the
   instruction sequentially following that CALL instruction. The RSB is not used
   or updated by far CALL, far RET, or IRET instructions.

Note that besides control-flow speculation, such as in indirect branch
predictions, data speculation can also be the origin of speculative execution in
the context of indirect branch instructions. For instance, due to memory
disambiguation, an indirect jump instruction may load the target address from a
memory location and speculatively jump to this target address before an older
store instruction has stored a different target address to that memory
location2.

Branch Target Injection (BTI), Branch History Injection (BHI), and Intra-mode
BTI are all microarchitectural transient execution attack techniques which
involve an adversary influencing the target of an indirect branch by training
the indirect branch predictors. Intel processors support indirect branch
speculation control mechanisms which can be used to mitigate such attacks.

INDIRECT BRANCH PREDICTION AND INTEL® HYPER-THREADING TECHNOLOGY (INTEL® HT
TECHNOLOGY) 

In a processor supporting Intel® Hyper-Threading Technology, a core (or physical
processor) may include multiple logical processors. On such processors, the
logical processors sharing a core may share indirect branch predictors. As a
result of this sharing, software on one of a core’s logical processors may be
able to control the predicted target of an indirect branch executed on another
logical processor on the same core. 

This sharing occurs only within a core. Software executing on a logical
processor of one core cannot control the predicted target of an indirect branch
by a logical processor of a different core. 

This sharing also occurs only when STIBP is not enabled and only on processors
without support for enhanced IBRS.

INDIRECT BRANCH SPECULATION CONTROL MECHANISMS

Intel has developed indirect branch predictor controls, which are interfaces
between the processor and system software to manage the state of indirect branch
predictors. 

All supported Intel processors provide three indirect branch control mechanisms:

 * Indirect Branch Restricted Speculation (IBRS): Restricts indirect branch
   predictions, which can be used by virtual machine manager (VMM) or operating
   system code to prevent the use of predictions from another security domain.
   Recent processors support enhanced IBRS, which can be enabled once and never
   disabled (always on mode).
 * Single Thread Indirect Branch Predictors (STIBP): Prevents indirect branch
   predictions from being controlled by a sibling hyperthread. Processors which
   support enhanced IBRS always have this behavior, regardless of the setting of
   STIBP.
 * Indirect Branch Predictor Barrier (IBPB): Prevents indirect branch
   predictions after the barrier from being controlled by software executed
   before the barrier. IBPB also acts as a barrier for the Fast Store Forwarding
   Predictor and Data Dependent Prefetchers (refer to the Overview of Data
   Speculation section), where relevant. This allows VMM and operating system
   code to provide isolation when switching between guests or userspace
   applications which execute in different security domains.

Some recent Intel processors also support additional indirect branch control
mechanisms which focus on specific indirect branch predictors or behaviors. Some
examples include the IPRED_DIS_U, IPRED_DIS_S, RRSBA_DIS_U, RRSBA_DIS_S, and
BHI_DIS_S bits in the IA32_SPEC_CTRL MSR.

System software can use these indirect branch control mechanisms to defend
against branch target injection attacks.

PREDICTOR MODE

Intel processors support different modes of operation corresponding to different
levels of privilege.  VMX root operation (for a virtual-machine monitor, or
host) is more privileged than VMX non-root operation (for a virtual machine, or
guest). Within either VMX root operation or VMX non-root operation, supervisor
mode (CPL < 3) is more privileged than user mode (CPL = 3).

To prevent inter-mode attacks based on branch target injection, it is important
to ensure that less privileged software cannot control the branch target
prediction in more privileged software. For this reason, it is useful to
introduce the concept of predictor mode associated with different modes of
operation as mentioned above. There are four predictor modes: host-supervisor,
host-user, guest-supervisor, and guest-user.

The guest predictor modes are considered less privileged than the host predictor
modes. Similarly, the user predictor modes are considered less privileged than
the supervisor predictor modes.

There are operations that may be used to transition between unrelated software
components but do not change CPL or cause a VMX transition. These operations do
not change predictor mode.  Examples include MOV to CR3, VMPTRLD, EPTP switching
(using VM function 0), and GETSEC[SENTER].

INDIRECT BRANCH RESTRICTED SPECULATION (IBRS)

Indirect branch restricted speculation (IBRS) is an indirect branch control
mechanism that restricts speculation of indirect branches. A processor supports
IBRS if it enumerates CPUID.(EAX=7H,ECX=0):EDX[26] as 1.
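
As an illustrative sketch (the program structure is an assumption; the bit
positions are the ones named in this article), this enumeration can be checked
from user space with the GCC/Clang CPUID helper:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 7, sub-leaf 0 */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 1;

    printf("IBRS/IBPB (EDX[26]): %s\n", (edx & (1u << 26)) ? "yes" : "no");
    printf("STIBP     (EDX[27]): %s\n", (edx & (1u << 27)) ? "yes" : "no");
    return 0;
}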

IBRS: BASIC SUPPORT

Processors that support IBRS provide the following guarantees without any
enabling by software:

 * The predicted targets of near indirect branches executed in an enclave (a
   protected container defined by Intel® SGX) cannot be controlled by software
   executing outside the enclave.
 * If the default treatment of system-management interrupts (SMIs) and system
   management mode (SMM) is active, software executed before an SMI cannot
   control the predicted targets of indirect branches executed in SMM after the
   SMI.
 * The predicted targets of near indirect branches executed inside a Trust
   Domain (TD), a virtual machine managed by Intel® Trust Domain Extensions
   (Intel® TDX) module, cannot be controlled by software executing outside the
   TD.

IBRS: SUPPORT BASED ON SOFTWARE ENABLING

IBRS provides a method for critical software to protect its indirect branch
predictions.

If software sets IA32_SPEC_CTRL.IBRS to 1 after a transition to a more
privileged predictor mode, predicted targets of indirect branches executed in
that predictor mode with IA32_SPEC_CTRL.IBRS = 1 cannot be controlled by
software that was executed in a less privileged predictor mode3. Additionally,
when IA32_SPEC_CTRL.IBRS is set to 1 on any logical processors of that core, the
predicted targets of indirect branches cannot be controlled by software that
executes (or has executed previously) on another logical processor of the same
core. Therefore, it is not necessary to set bit 1 (STIBP) of the IA32_SPEC_CTRL
MSR when IBRS is set to 1.

If IA32_SPEC_CTRL.IBRS is already 1 before a transition to a more privileged
predictor mode, some processors may allow the predicted targets of indirect
branches executed in that predictor mode to be controlled by software that
executed before the transition. Software can avoid this by using WRMSR on the
IA32_SPEC_CTRL MSR to set the IBRS bit to 1 after any such transition. It is not
necessary to clear the bit first; writing it with a value of 1 after the
transition suffices, regardless of the bit’s previous value.
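
The ring-0-style sketch below illustrates such a write; the MSR index and bit
position follow the architectural definition of IA32_SPEC_CTRL, while the helper
and function names are hypothetical, and a real kernel would also OR in any
other IA32_SPEC_CTRL bits it keeps set (for example, SSBD):

#include <stdint.h>

#define MSR_IA32_SPEC_CTRL  0x48
#define SPEC_CTRL_IBRS      (1ULL << 0)

/* WRMSR executes only at CPL0; this is an illustrative helper, not
 * code from any particular operating system. */
static inline void wrmsr64(uint32_t msr, uint64_t val)
{
    __asm__ volatile("wrmsr"
                     :
                     : "c"(msr), "a"((uint32_t)val), "d"((uint32_t)(val >> 32))
                     : "memory");
}

/* Set IBRS after a transition to a more privileged predictor mode,
 * regardless of the bit's previous value. */
static inline void ibrs_on_privileged_entry(void)
{
    wrmsr64(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);
}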

Setting IA32_SPEC_CTRL.IBRS to 1 does not suffice to prevent the predicted
target of a near return from using an RSB entry created in a less privileged
predictor mode. Software can avoid this by using an RSB overwrite
sequence4 following a transition to a more privileged predictor mode. It is not
necessary to use such a sequence following a transition from user mode to
supervisor mode if supervisor-mode execution prevention (SMEP) is enabled. SMEP
prevents execution of code on user mode pages, even speculatively, when in
supervisor mode. User mode code can only insert its own return addresses into
the RSB, not the return addresses of targets on supervisor mode code pages. On
processors without SMEP where separate page tables are used for the OS and
applications, the OS page tables can map user code as no-execute. The processor
will not speculatively execute instructions from a translation marked
no-execute.

Enabling IBRS does not prevent software from controlling the predicted targets
of indirect branches of unrelated software executed later at the same predictor
mode (for example, between two different user applications, or two different
virtual machines). Such isolation can be ensured through use of IBPB, described
in the Indirect Branch Predictor Barrier (IBPB) section.

Enabling IBRS on one logical processor of a core with Intel HT Technology may
affect branch prediction on other logical processors of the same core. For this
reason, software should disable IBRS (by clearing IA32_SPEC_CTRL.IBRS) prior to
entering a sleep state (for example, by executing HLT or MWAIT) and re-enable
IBRS upon wakeup and prior to executing any indirect branch.

ENHANCED IBRS

Some processors enhance IBRS to simplify software enabling and improve
performance. A processor supports enhanced IBRS if RDMSR returns a value of 1
for bit 1 of the IA32_ARCH_CAPABILITIES MSR.
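
Assuming CPUID has already indicated that the IA32_ARCH_CAPABILITIES MSR is
supported, this enumeration could be read as in the ring-0 sketch below (the
helper names are hypothetical; the MSR index and bit position follow the
architectural definitions):

#include <stdint.h>

#define MSR_IA32_ARCH_CAPABILITIES  0x10A
#define ARCH_CAP_IBRS_ALL           (1ULL << 1)   /* enhanced IBRS */

static inline uint64_t rdmsr64(uint32_t msr)
{
    uint32_t lo, hi;
    __asm__ volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
    return ((uint64_t)hi << 32) | lo;
}

static inline int has_enhanced_ibrs(void)
{
    return (rdmsr64(MSR_IA32_ARCH_CAPABILITIES) & ARCH_CAP_IBRS_ALL) != 0;
}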

Enhanced IBRS supports an always on model in which IBRS is enabled once (by
setting IA32_SPEC_CTRL.IBRS) and never disabled. If IA32_SPEC_CTRL.IBRS = 1 on a
processor with enhanced IBRS, the predicted targets of indirect branches
executed cannot be controlled by software executed in a less privileged
predictor mode or on another logical processor.

As a result, software operating on a processor with enhanced IBRS need not use
WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged
predictor mode. Software can isolate predictor modes effectively simply by
setting the bit once. Software need not disable enhanced IBRS prior to entering
a sleep state such as MWAIT or HLT.

On processors with enhanced IBRS, an RSB overwrite sequence may not suffice to
prevent the predicted target of a near return from using an RSB entry created in
a less privileged predictor mode.  Software can prevent this by enabling SMEP
(for transitions from user mode to supervisor mode) and by having
IA32_SPEC_CTRL.IBRS set during VM exits. Processors with enhanced IBRS still
support the usage model where IBRS is set only in the OS/VMM for OSes that
enable SMEP. To do this, such processors will manage guest behavior such that it
cannot control the RSB after a VM exit once IBRS is set, even if IBRS was not
set at the time of the VM exit. If the guest has cleared IBRS, the hypervisor
should set IBRS after the VM exit, just as it would do on processors supporting
IBRS but not enhanced IBRS. As with IBRS, enhanced IBRS does not prevent
software from affecting the predicted target of an indirect branch executed at
the same predictor mode. For such cases, software should use the IBPB command,
described in the Indirect Branch Predictor Barrier (IBPB) section.

On processors with enhanced IBRS support, Intel recommends that IBRS be set to 1
and left set. The traditional IBRS model of setting IBRS only during ring 0
execution is just as secure on processors with enhanced IBRS support as it is on
processors with basic IBRS, but the WRMSRs on ring transitions and/or VM
exit/entry will cost performance compared to just leaving IBRS set. Again, there
is no need to use STIBP when IBRS is set. However, IBPB should still be used
when switching to a different application/guest that does not trust the last
application/guest that ran on a particular hardware thread. 

Guests in a VM migration pool that includes hardware without enhanced IBRS may
not have IA32_ARCH_CAPABILITIES.IBRS_ALL (enhanced IBRS) enumerated to them, and
thus may use the traditional IBRS usage model of setting IBRS only in ring 0.
For performance reasons, once a guest has been shown to frequently write
IA32_SPEC_CTRL, we do not recommend that the VMM cause a VM exit on such WRMSRs.
The VMM running on processors that support enhanced IBRS should allow the
IA32_SPEC_CTRL-writing guest to control guest IA32_SPEC_CTRL. The VMM should
thus set IBRS after VM exits from such guests to protect itself (or use
alternative techniques like retpoline, secret removal, or indirect branch
removal).

On processors without enhanced IBRS, Intel recommends using retpoline or setting
IBRS only during ring 0 and VMM modes. IBPB should be used when switching to a
different process/guest that does not trust the last process/guest that ran on a
particular hardware thread. For performance reasons, IBRS should not be left set
during application execution.

SINGLE THREAD INDIRECT BRANCH PREDICTORS (STIBP)

As noted in the Indirect Branch Prediction and Intel® Hyper-Threading Technology
(Intel® HT Technology) section, the logical processors sharing a core may share
indirect branch predictors, allowing one logical processor to control the
predicted targets of indirect branches by another logical processor of the same
core. 

Single thread indirect branch predictors (STIBP) is an indirect branch control
mechanism that restricts the sharing of indirect branch prediction between
logical processors on a core. A processor supports STIBP if it enumerates
CPUID.(EAX=7H,ECX=0):EDX[27] as 1. Setting bit 1 (STIBP) of the IA32_SPEC_CTRL
MSR on a logical processor prevents the predicted targets of indirect branches
on any logical processor of that core from being controlled by software that
executes (or executed previously) on another logical processor of the same core.

Unlike IBRS and IBPB, STIBP does not affect all branch predictors that contain
indirect branch predictions. STIBP only affects those branch predictors where
software on one hardware thread can create a prediction that can then be used by
the other hardware thread for indirect branches. This is part of what makes
STIBP have lower performance overhead than IBRS on some implementations.

It is not necessary to use IBPB after setting STIBP in order to make STIBP
effective. STIBP provides isolation of indirect branch prediction between
logical processors on the same core only while it is set; it is not a branch
prediction barrier between the execution before and after it is set, whether on
the same logical processor or on logical processors of the same core.

Processes that are particularly security-sensitive may wish to have STIBP be set
when they execute to prevent their indirect branch predictions from being
controlled by another hardware thread on the same physical core. On some older
Intel Core-family processors, this comes at significant performance cost to both
hardware threads due to disabling some indirect branch predictors (as described
earlier). Because of this, we do not recommend that STIBP be set during all
application execution on processors that only support basic IBRS.

Indirect branch predictors are never shared across cores. Thus, the predicted
target of an indirect branch executed on one core can never be affected by
software operating on a different core. It is not necessary to set
IA32_SPEC_CTRL.STIBP to isolate indirect branch predictions from software
operating on other cores.

Many processors do not allow the predicted targets of indirect branches to be
controlled by software operating on another logical processor, regardless of
STIBP. These include processors on which Intel Hyper-Threading Technology is not
enabled and those that do not share indirect branch predictor entries between
logical processors. To simplify software enabling and enhance workload
migration, STIBP may be enumerated (and setting IA32_SPEC_CTRL.STIBP allowed) on
such processors. 

A processor may enumerate support for the IA32_SPEC_CTRL MSR (e.g., by
enumerating CPUID.(EAX=7H,ECX=0):EDX[26] as 1) but not for STIBP
(CPUID.(EAX=7H,ECX=0):EDX[27] is enumerated as 0). On such processors, execution
of WRMSR to IA32_SPEC_CTRL ignores the value of bit 1 (STIBP) and does not cause
a general-protection exception (#GP) if bit 1 of the source operand is set. It
is expected that this fact will simplify virtualization in some cases.

As noted in the Indirect Branch Restricted Speculation (IBRS) section, enabling
IBRS prevents software operating on one logical processor from controlling the
predicted targets of indirect branches executed on another logical processor.
For that reason, it is not necessary to enable STIBP when IBRS is enabled. 

Recent Intel processors, including all processors which support enhanced IBRS,
provide this isolation for indirect branch predictions between logical
processors without the need to set STIBP.

Enabling STIBP on one logical processor of a core with Intel Hyper-Threading
Technology may affect branch prediction on other logical processors of the same
core. For this reason, software should disable STIBP (by clearing
IA32_SPEC_CTRL.STIBP) prior to entering a sleep state (for example, by
executing HLT or MWAIT) and re-enable STIBP upon wakeup and prior to executing
any indirect branch.

INDIRECT BRANCH PREDICTOR BARRIER (IBPB)

The indirect branch predictor barrier (IBPB) is an indirect branch control
mechanism that establishes a barrier, preventing software that executed before
the barrier from controlling the predicted targets of indirect branches5 
executed after the barrier on the same logical processor. A processor supports
IBPB if it enumerates CPUID.(EAX=7H,ECX=0):EDX[26] as 1. IBPB can be used to
help mitigate Branch Target Injection.

The IBPB also provides other domain isolation properties regarding speculative
execution, such as for the Fast Store Forwarding Predictor and Data Dependent
Prefetchers where relevant.

Unlike IBRS and STIBP, IBPB does not define a new mode of processor operation
that controls the branch predictors. As a result, it is not enabled by setting a
bit in the IA32_SPEC_CTRL MSR. Instead, IBPB is an operation that software
executes when necessary.

Software executes an IBPB command by writing the IA32_PRED_CMD MSR to set bit 0
(IBPB). This can be done either using the WRMSR instruction or as part of a VMX
transition that loads the MSR from an MSR-load area. Software that executed
before the IBPB command cannot control the predicted targets of indirect
branches executed after the command on the same logical processor. The
IA32_PRED_CMD MSR is write-only, and it is not necessary to clear the IBPB bit
before writing it with a value of 1.
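
A minimal ring-0 sketch of issuing the command follows; the macro and helper
names are hypothetical, while the MSR index and bit position follow the
architectural definition of IA32_PRED_CMD:

#include <stdint.h>

#define MSR_IA32_PRED_CMD  0x49
#define PRED_CMD_IBPB      (1ULL << 0)

static inline void wrmsr64(uint32_t msr, uint64_t val)
{
    __asm__ volatile("wrmsr"
                     :
                     : "c"(msr), "a"((uint32_t)val), "d"((uint32_t)(val >> 32))
                     : "memory");
}

/* Issue an IBPB command, for example when switching between mutually
 * distrusting applications or guests. */
static inline void issue_ibpb(void)
{
    wrmsr64(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
}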

IBPB can be used in conjunction with IBRS to account for cases that IBRS does
not cover:

 * As noted in the Indirect Branch Restricted Speculation (IBRS) section, IBRS
   does not prevent software from controlling the predicted target of an
   indirect branch of unrelated software (for example, a different user
   application or a different virtual machine) executed at the same predictor
   mode. Software can aim to prevent such control by executing an IBPB command
   when changing the identity of software operating at a particular predictor
   mode (for example, when changing user applications or virtual machines).
 * Software may choose to clear IA32_SPEC_CTRL.IBRS in certain situations (for
   example, for execution with CPL = 3 in VMX root operation). In such cases,
   software can use an IBPB command on certain transitions (for example, after
   running an untrusted virtual machine) to prevent software that executed
   earlier from controlling the predicted targets of indirect branches executed
   subsequently with IBRS disabled.

Note that, on some processors that do not enumerate PBRSB_NO, there is an
exception to the IBPB-established barrier for RSB-based predictions. On these
processors, a RET instruction that follows VM exit or IBPB without a
corresponding CALL instruction may use the linear address following the most
recent CALL instruction executed prior to the VM exit or IBPB as the RSB
prediction (refer to the Post-barrier Return Stack Buffer Predictions guidance).
In these cases, software can use special code sequences (refer to the Return
Stack Buffer Control section) to steer RSB predictions to benign code regions
that restrict speculation. 

OTHER INDIRECT BRANCH PREDICTOR CONTROLS

The BHI_DIS_S indirect predictor control prevents predicted targets of indirect
branches executed in CPL0, CPL1, or CPL2 from being selected based on branch
history from branches executed in CPL3. While set in VMX root operation (host),
it also prevents predicted targets of indirect branches executed in CPL0 (ring
0/root) from being selected based on branch history from branches executed in
VMX non-root operation (guest). It may not prevent predicted targets executed in
CPL3 of VMX root operation from being based on branch history for branches
executed in VMX non-root operation (guest).
Future processors may have the behavior described above for BHI_DIS_S by
default; software can determine whether this is the case by checking whether
BHI_NO is enumerated by the processor.

The IPRED_DIS_U (affecting CPL3) and IPRED_DIS_S (affecting CPL < 3) controls,
when active, prevent transient execution at predicted targets of an indirect
near JMP/CALL before the target is resolved. This includes transient execution
at past targets of that same branch. Transient execution at predicted targets of
a near RET prediction will only occur for RSB-based return predictions, or for
linear address 0. Note that, as previously documented, fall-through speculation
to instruction bytes following an indirect JMP/CALL or speculation to linear
address 0 may still occur. 

When the RRSBA_DIS_S (affecting CPL < 3) and RRSBA_DIS_U (affecting CPL3)
indirect predictor controls are set, transient execution at predicted targets of
a near RET prediction will only occur for RSB-based return predictions, or for
linear address 0.

SOFTWARE TECHNIQUES FOR INDIRECT SPECULATION CONTROL

Besides the hardware-based mechanisms described above, software mechanisms can
also be used to limit indirect branch speculation.

For example, indirect branch prediction can be suppressed in some cases by using
a software-based approach called retpoline, which was developed by Google*.
Details of retpoline are described in Retpoline: A Branch Target Injection
Mitigation. 
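
For orientation, the sketch below reproduces the general shape of a retpoline
thunk for an indirect branch whose target is in a register; compilers emit
equivalent thunks when built with their retpoline options, and the symbol name
here is hypothetical. It is shown for illustration, not as a replacement for
compiler support.

/* Retpoline thunk sketch for an indirect branch through %r11
 * (x86-64, AT&T syntax). The RET architecturally jumps to the real
 * target written onto the stack, while RSB-based speculation is
 * captured in the PAUSE/LFENCE loop. */
__asm__(
    ".text\n"
    "retpoline_thunk_r11:\n"
    "    call 1f\n"              /* push the address of the capture loop */
    "2:  pause\n"                /* speculation lands here ...           */
    "    lfence\n"               /* ... and is contained                 */
    "    jmp 2b\n"
    "1:  mov %r11, (%rsp)\n"     /* overwrite return address with real target */
    "    ret\n"                  /* architecturally jumps to the real target  */
);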

RETURN STACK BUFFER CONTROL

Some software techniques to control speculation, such as retpoline, require
return address speculation to have predictable behavior in order to work
properly. Under some circumstances (for example, a deep call stack or
imbalanced CALL and RET instructions), the RSB may underflow, and alternative
predictors are then used to predict the return address of RET instructions.

RSB stuffing is a software technique to fill the RSB with
trusted-software-controlled return targets to avoid RSB underflow. On older
processors, RSB stuffing with 32 return targets is sufficient. On these
processors, RSB stuffing may be used in conjunction with retpoline to restrict
return address mis-speculation to controlled targets. On processors without
enhanced IBRS, RSB stuffing may also be used by a VMM after VM exit to protect
against RSB underflow.

BRANCH HISTORY BUFFER CONTROL

To address Branch History Injection, software can use a code sequence to control
speculation that arises from collisions in the Branch History Buffer (BHB). This
code sequence overwrites the branch history after domain transitions to prevent
the previous domain from influencing BHB-based indirect branch prediction in the
current domain. As microarchitectural details of the BHB may change in future
processors, Intel recommends using hardware-based controls, such as BHI_DIS_S,
where available. 


CONDITIONAL BRANCHES

Intel processors use conditional branch predictors to predict the direction of
conditional branch instructions before their actual execution. This allows the
processor to fetch and speculatively execute instructions on the predicted
execution path after the conditional branch. Speculative execution side channels
(aka Transient Execution Attacks) that are based around conditional branch
prediction are classified as Spectre Variant 1.

OVERVIEW OF BOUNDS CHECK BYPASS

Bounds check bypass is a side channel method that takes advantage of the
speculative execution that may occur following a conditional branch instruction.
Specifically, the method is used in situations in which the processor is
checking whether an input is in bounds (for example, while checking whether the
index of an array element being read is within acceptable values). The processor
may issue operations speculatively before the bounds check resolves. If the
attacker contrives for these operations to access out-of-bounds memory,
information may be inferred by the attacker in certain circumstances.

BOUNDS CHECK BYPASS STORE

One subvariant of this technique, known as bounds check bypass store, is to use
speculative stores to overwrite younger speculative loads in a way that creates
a side channel controlled by a malicious actor. 

Refer to the example bounds check bypass store sequence below:

int function(unsigned bound, unsigned long user_key) {
     unsigned long data[8];

     /* bound is trusted and is never more than 8 */
     for (int i = 0; i < bound; i++) {
          data[i] = user_key;
     }

     return 0;
}

The example above does not by itself allow a bounds check bypass attack.
However, it does allow the attack to speculatively modify memory, and therefore
could potentially be used to chain attacks. For example, it is possible that the
above sequence might speculatively overwrite the return address on the stack
with user_key. This may allow a malicious actor to specify a user_key that is
actually the instruction pointer of a disclosure gadget that they wish to be
speculatively executed.

The steps below describe how an example attack using this method might occur:

 1. The CPU conditional branch predictor predicts that the loop will execute 10
    iterations when, in reality, the loop should have executed only 8 times.
    After the 10th iteration, the branch resolves, and the processor falls
    through to execute the instructions following the loop. However, the 9th
    iteration of the loop may speculatively overwrite the return address on the
    stack.
 2. The CPU decodes the RET and speculatively fetches instructions based on the
    prediction in the return stack buffer (RSB). The CPU may speculatively
    execute those instructions.
 3. RET loads the value that it believes is at the top of the stack (but which
    came from the speculative store of user_key in step 1) and redirects the
    instruction pointer to that value. The results of any operations
    speculatively executed in step 2 are discarded.
 4. The disclosure gadget at the instruction pointer of user_key (which was
    specified by the malicious actor) speculatively executes and creates a side
    channel that can be used to reveal data specified by the malicious actor.
 5. The conditional jump that should have ended the loop then executes and
    redirects the instruction pointer to the next instruction after the loop.
    This discards the speculative store of user_key that overwrote the return
    address on the stack, as well as all other operations between step 1 and
    step 4.
 6. The CPU executes the RET again, and the program continues.

Where the compiler has spilled variables to the stack, the store can also be
used to target those spilled values and speculatively modify them to enable
another attack to follow. An example of this would be targeting the base
address of an array dereference or the limit value.

SMEP will prevent the attack described above from causing a supervisor RET to
speculatively execute code in a user mode page. Intel® Control-flow Enforcement
Technology (Intel® CET) can also help prevent speculative execution of
instructions at incorrect indirect branch targets.

This example can be mitigated by applying LFENCE before the RET (after the loop
ends), by using bounds clipping to ensure that store operations do not occur
outside of the array’s bounds, even speculatively, or by ensuring that an
incorrect return pointer is detected and that the return does not speculatively
use the incorrect value.
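
As a minimal sketch of the first option applied to the earlier example,
assuming the _mm_lfence() compiler intrinsic described later in this document:

#include <emmintrin.h>   /* _mm_lfence() */

int function(unsigned bound, unsigned long user_key) {
     unsigned long data[8];

     /* bound is trusted and is never more than 8 */
     for (int i = 0; i < bound; i++) {
          data[i] = user_key;
     }

     _mm_lfence();   /* the loop-exit branch must resolve before the RET can execute */
     return 0;
}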

A second variant of this method can occur where a user value is being copied
into an array, either on the stack or adjacent to function pointers. As
discussed previously, the processor may speculatively execute a loop more times
than is actually needed. If this loop moves through memory writing malicious
actor-controlled values, then the malicious actor may be able to speculatively
perform a buffer overrun attack.

int filltable(uint16_t *from)
{
	uint16_t buffer[64];
	int i;

	for (i = 0; i < 64; i++)
		buffer[i] = *from++;

	return 0;
}

In some cases, the example above might speculatively copy more than 64 elements
into the array, speculatively changing the return address used by the processor
so that it instead returns to a user-controlled gadget.

As the execution is speculative, some processors will allow speculative writes
to read-only memory, and will reuse that data speculatively. Therefore, while
placing function pointers into write-protected space is a good general security
mitigation, doing so is not sufficient mitigation in this case.

IDENTIFYING BOUNDS CHECK BYPASS VULNERABILITIES

The following section examines common instances of bounds check bypass,
including the bounds check bypass store variant, but should not be considered a
comprehensive list. It describes how to analyze potential bounds check bypass
and bounds check bypass store vulnerabilities found by static analysis tools or
manual code inspection and presents mitigation techniques that may be used. This
document does not include any actual code from any real product or open source
release, nor does it discuss or recommend any specific analysis tools.

COMMON ATTRIBUTES FOR BOUNDS CHECK BYPASS VULNERABILITIES

Bounds check bypass code sequences have some common features: they generally
operate on data that is controlled or influenced by a malicious actor, and they
all have some kind of side-effect that can be observed by the malicious actor.
In addition, the speculatively executed sequence performs work that would be
thrown away and never retired under normal execution. In bounds check bypass
store variants, data is speculatively written at locations that would be out of
bounds under normal execution. That data is later speculatively used to execute
code and cause observable side-effects, creating a side channel.

LOADS AND STORES

A vulnerable code fragment forming a disclosure gadget is made up of two
elements. The first is an array or pointer dereference that depends upon an
untrusted value, for example, a value from a potentially malicious application.
The second element is usually a load or store to an address that is dependent
upon the value loaded by the first element. Refer to Microsoft*’s blog for
further details.

As bounds check bypass is based upon speculation, code can be vulnerable even if
that untrusted value is correctly tested for bounds before use.

The classic general example of such a sequence in C is:

if (user_value >= 0 && user_value < LIMIT) {
       x = table[user_value];
       node = entry[x];
} else
       return ERROR;

For such a code sequence to be vulnerable, both elements must be present.
Furthermore, the untrusted value must be under the malicious actor’s control.

When the code executes, the processor has to decide whether the user_value <
LIMIT condition is true or false. It remembers the register state at this
point, speculates (makes a guess) that user_value is below LIMIT, and begins
executing instructions as if this were true. Once the processor realizes it
guessed incorrectly, it throws away the computation and returns an error. The
attack relies upon the fact that, before the processor realizes the guess was
incorrect, it has read table[user_value], which points into memory beyond the
intended limit, and has also read entry[x]. When the processor reads entry[x],
it may bring the corresponding cache line from memory into the L1 cache. Later,
the malicious actor can time accesses to this address to determine whether the
corresponding cache line is in the L1 data cache, and can use this timing to
discover the value x, which was loaded from a malicious actor-specified
location.

The two components that make up this vulnerable code sequence can be stretched
out over a considerable distance and through multiple layers of function calls.
The processor can speculatively execute many instructions—a number sufficient to
pass between functions, compilation units, or even software exception handlers
such as longjmp or throw. The processor may speculate through locked operations,
and use of volatile will not change the vulnerability of the code being
exploited.

There are several other sequences that may be used to infer information.
Anything that tests some property of a value and loads or stores according to
the result may leak information. Depending upon the location of foo and bar, the
example below might be able to leak bit 0 of arbitrary data.

if (user_value >= LIMIT)
		return ERROR;
	x = table[user_value];
	if (x & 1) 
		foo++;
	else
		bar++;

When evaluating code sequences for vulnerability to bounds check bypass, the
critical question is whether different behavior could be observed as a property
of x.

This question can be very challenging to answer from code inspection, especially
when looking for any specific code pattern. For instance, if a value is passed
to a function call, then that function call must be inspected to ensure it does
not create any observable interactions. Consider the following example:

if (user_value >= LIMIT)
	return ERROR;
x = lengths[user_value];
if (x)
	memset(buffer, 0, 64 * x);

Here, x influences how much memory is cleared by memset() and might allow the
malicious actor to discern something about the value of x from which cache lines
the speculatively executed memset touches.

Remember that conditional execution is not just if, but may also include for and
while as well as the C ternary (?:) operator and situations where one of the
values is used to index an array of function pointers.

TYPECASTING AND INDIRECT CALLS

Typecasting can be a problematic area to analyze and often conceals real
examples that can be exploited. This is especially challenging in C++ because
you are more likely to have function pointers embedded in objects and overloaded
operators that might behave in type-dependent fashion.

Two classes of typecasting problems are relevant to bounds check bypass attacks:

 1. Code/data mismatches. Speculation causes “class Foo” code to be
    speculatively executed on “class Bar” data using gadgets supplied with Foo
    to leak information about Bar.
 2. The type confusion is combined with some observable effect, like the
    load/store effects discussed above. For example, if Foo and Bar are
    different sizes, a malicious actor might be able to learn something about
    memory past the end of objects[] using something like the example below.

type = objects[index];
if (index >= len)
	return -EINVAL;
if (type == TYPE_FOO)
	memset(ptr, 0, sizeof(Foo));
else
	memset(ptr, 0, sizeof(Bar));
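
One possible way to harden this fragment, as a sketch that moves the bounds
check ahead of the load and adds an lfence() (see the LFENCE section below), is:

if (index >= len)
	return -EINVAL;
lfence();                      /* index is now known to be in bounds */
type = objects[index];
if (type == TYPE_FOO)
	memset(ptr, 0, sizeof(Foo));
else
	memset(ptr, 0, sizeof(Bar));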

 

Take care when considering any code where a typecast occurs based upon a
speculated value. The processor might guess the type incorrectly and
speculatively execute instructions based on that incorrect type. Newer
processors that enable Intel® OS Guard, also known as Supervisor-Mode Execution
Prevention (SMEP), will prevent ring 0 code from speculatively executing ring 3
code. All major operating systems (OSes) enable SMEP support by default if the
hardware supports it. Older processors, however, might speculate the type
incorrectly, load data that the processor thinks are function pointers, or
speculate into lower addresses that might be directly controlled by a malicious
actor.

For example:

if (flag & 4)
	((Foo *)ptr)->process(x);
else
	((Bar *)ptr)->process(x);

If the Foo and Bar objects are different and have different memory layouts, then
the processor may speculatively fetch a function pointer from the wrong offset
within ptr and branch to it.

Consider the following example:

int call; /* from user */
if (call >= 0 && call < MAX_FUNCTION)
		function_table[call](a,b,c);

On first analysis this code might seem safe: we reference function_table[call],
but call is the user’s own, known value. However, during speculative execution,
the processor might incorrectly speculate through the if statement and
speculatively call through an out-of-bounds entry of function_table. Some of
the resulting addresses might be mapped to user pages in memory, or might
contain values that match suitable gadgets for ROP attacks.

A less obvious variant of this case is switch statements. Many compilers will
convert some classes of switch statement into jump tables. Refer to the
following example code:

switch(x) {
case 0: return y;
case 1: return z;
...
default: return -1;
}

Code similar to this will often be implemented by the compiler as shown:

if (x < 0 || x > 2) return -1;
goto  case[x];

Therefore, when using switch() with an untrusted input, it might be appropriate
to place an lfence before the switch so that x has been fully resolved before
the implicit bounds check, as in the sketch below.
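
A minimal sketch of that placement, assuming an lfence() helper like the one
described in the LFENCE section below:

lfence();   /* x is fully resolved before the jump-table dispatch */
switch(x) {
case 0: return y;
case 1: return z;
...
default: return -1;
}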

SPECULATIVE LOOPS

A final case to consider is loops that speculatively overrun. Consider the
following example:

while (++x < limit) {
	y = u[x];
	thing(y);
}

The processor will speculate on the loop condition and will often speculatively
execute the next iteration of the loop. This is usually fine, but if the loop
contains code that reveals the contents of data, then you might need to apply
mitigations to avoid exposing data beyond the intended bounds of the loop. In
other words, even if the loop limit is properly protected before the processor
enters the loop, unless the loop body itself is protected, the loop might leak
a small amount of data beyond the intended buffer on the speculative path.
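
One possible way to protect the loop body itself is to clip the index on every
iteration, as in the following sketch; this assumes limit is a power of two
(refer to the Bounds Clipping section below) and that the compiler does not
optimize the masking away:

while (++x < limit) {
	y = u[x & (limit - 1)];   /* a speculative extra iteration cannot index past u[] */
	thing(y);
}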

DISCLOSURE GADGETS

In addition to the load and store disclosure gadget referenced above, there may
be additional gadgets based on the microarchitectural state. For example, using
certain functional blocks, such as Intel® Advanced Vector Extensions (Intel®
AVX), during speculative execution may affect the time it takes to subsequently
use the block due to factors like the time required to power-up the block.
Malicious actors can use a disclosure primitive to measure the time it takes to
use the block. An example of such a gadget is shown below:

if (x > sizeof(table))
	return ERROR;
if (a[x].op == OP_VECTOR)
	avx_operation(a[x]);
else
	integer_operation(a[x]);

CONDITIONAL BRANCH SPECULATION ANALYSIS

Controlling conditional branch speculation, such as bounds check bypass, is not
generally relevant if your code doesn’t have secrets that the user shouldn’t be
able to access. For example, a simple image viewer probably contains no
meaningful secrets that should be inaccessible to software it interacts with.
The user of the software could potentially use bounds check bypass attacks to
access the image, but they could also just hit the save button.

On the other hand, an image viewer with support for secure, encrypted content
with access authorized from a central system might need to care about bounds
check bypass because a user may not be allowed to save the document in normal
ways. While the user can’t save such an image, they can trivially photograph the
image and send the photo to someone, so protecting the image may be less
important. However, any keys are likely to be far more sensitive.

There are also clear cases like operating system kernels, firmware (refer to the
Host Firmware Speculative Execution Side Channel Mitigations technical paper)
and managed runtimes (for example, Javascript* in web browsers) where there is
both a significant interaction surface between differently trusted code, and
there are secrets to protect. 

Whether to apply mitigations, and which areas to target, has to be part of your
general security analysis and risk modeling, along with conventional security
techniques and, where appropriate, resistance to timing and other
non-speculative side channel attacks. Bounds check bypass mitigations have
performance impacts, so they should only be used where appropriate.

SOFTWARE TECHNIQUES FOR CONDITIONAL SPECULATION CONTROL

LFENCE

The main mitigation for bounds check bypass is through use of the LFENCE
instruction. The LFENCE instruction does not execute until all prior
instructions have completed locally, and no later instruction begins execution
until LFENCE completes. Most vulnerabilities identified in the Identifying
Bounds Check Bypass Vulnerabilities section can be protected by inserting an
LFENCE instruction; for example:

if (user_value >= LIMIT)
	return ERROR;
lfence();
x = table[user_value];
node = entry[x];

Here, lfence() is a compiler intrinsic or assembler inline that issues an
LFENCE instruction and also tells the compiler that memory references may not
be moved across that boundary. The LFENCE ensures that the loads do not occur
until the condition has actually been checked, and the memory barrier prevents
the compiler from reordering references around the LFENCE, which would break
the protection.
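
For example, with GCC or Clang on x86, such an lfence() wrapper might be
written as in the sketch below; the "memory" clobber provides the compiler
barrier described above:

static inline void lfence(void)
{
	/* LFENCE: later instructions do not begin execution until prior
	   instructions complete locally; the clobber stops the compiler from
	   moving memory references across this point. */
	__asm__ __volatile__("lfence" ::: "memory");
}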

PLACEMENT OF LFENCE

To protect against speculative timing attacks, place the LFENCE instruction
after the range check and branch, before any code that consumes the checked
value, and before the data can be used in a gadget that might allow
measurement. 

For example:

if (x > sizeof(table))
	return ERROR;
lfence();
if (a[x].op == OP_VECTOR)
	avx_operation(a[x]);
else
	integer_operation(a[x]);

Unless there are specific reasons otherwise and the code has been carefully
analyzed, Intel recommends that the LFENCE always be placed after the range
check and before the range-checked value is consumed by other code,
particularly if that code involves conditional branches.

BOUNDS CLIPPING

Software can use instructions such as CMOVcc, AND, ADC, SBB, and SETcc to
constrain speculative execution and prevent bounds check bypass on current
family 6 processors (Intel® Core™, Intel Atom®, Intel® Xeon®, and Intel® Xeon
Phi™ processors). However, these instructions are not guaranteed to do so on
future Intel processors. Intel intends to release further guidance on the use
of instructions to constrain speculation before processors with different
behavior are released. Unlike LFENCE, this approach can avoid stalling the
pipeline.

At the simplest:

unsigned int user_value;

if (user_value > 255)
	return ERROR;
x = table[user_value];

Can be made safe by instead using the following logic:

volatile unsigned int user_value;

if (user_value > 255)
	return ERROR;
x = table[user_value & 255];

This works only for power-of-two array lengths or bounds. In the example above
the table array length is 256 (2^8), and the valid index must be <= 255. Take
care that the compiler used does not optimize away the & 255 operation. For
other ranges, it is possible to use CMOVcc, ADC, SBB, SETcc, and similar
instructions to perform the clipping, as sketched below.
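
As an illustrative sketch for a bound that is not a power of two, the SBB
instruction mentioned above can be used to build an all-ones or all-zeros mask
(GCC/Clang inline assembly for x86; the same forward-compatibility caveat
below applies):

/* Returns all ones when index < size and zero otherwise, so the clipped
   index cannot reach past the array, even speculatively. */
static inline unsigned long index_mask(unsigned long index, unsigned long size)
{
	unsigned long mask;
	__asm__ volatile ("cmp %1, %2; sbb %0, %0"
			  : "=r" (mask)
			  : "g" (size), "r" (index)
			  : "cc");
	return mask;
}

/* usage: x = table[user_value & index_mask(user_value, LIMIT)]; */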

Although this mitigation approach can be faster than other approaches, it is
not guaranteed to work on future processors. Developers who cannot control
which CPUs their software will run on (such as general application, library,
and SDK developers) should not use this mitigation technique. Intel intends to
release further guidance on how to use serializing instructions to constrain
speculation before future processors with different behavior are released.

Both the LFENCE approach and bounds clipping can be applied to function call
tables, while the LFENCE approach is generally the only technique that can be
used when typecasting.

INTERACTION WITH MEMORY DISAMBIGUATION

Memory disambiguation (as described in the Overview of Data Speculation section)
can theoretically impact bounds clipping techniques when they involve a load
from memory. In the following example, a CMOVG instruction is inserted to
prevent a side channel from being created with data from any locations beyond
the array bounds.

CMP RDX, [array_bounds]
JG out_of_bounds_input
MOV RCX, 0
MOV RAX, [RDX + 0x400000]
CMOVG RAX, RCX
<Further code that causes cache movement based on RAX value>

As an example, assume the value at array_bounds is 0x20, but that value was only
just stored to array_bounds and that the prior value at array_bounds was
significantly higher, such as 0xFFFF. The processor can speculatively execute
the CMP instruction using a value of 0xFFFF for the loaded value due to the
memory disambiguation mechanism. The instruction will eventually be re-executed
with the intended array_bounds value of 0x20. This can theoretically cause the
above sequence to support the creation of a side channel that reveals
information about the memory at addresses up to 0xFFFF instead of constraining
it to addresses below 0x20.

MULTIPLE BRANCHES

When using mitigations, particularly the bounds clipping mitigations, it is
important to remember that the processor will speculate through multiple
branches. Thus, the following code is not safe:

int *key;
int valid = 0;

if (input < NUM_ENTRIES) {
	lfence();
	key = &table[input];
	valid = 1;
}
...
if (valid)
	*key = data;

In this example, although the mitigation is applied correctly when the processor
speculates that the first condition is valid, no protection is applied if the
processor takes the out-of-range value and then speculates that valid is true on
the other path. In this case it will probably expose the contents of a random
register, although not in an easy-to-measure fashion.

Preinitializing key to NULL or another safe address will also not reliably
work, because the compiler can eliminate the NULL assignment, since it can
never be used non-speculatively. In such cases it may be more appropriate to
merge the two conditional code sections and put the code between them into a
separate function that is called on both paths. Alternatively, you could
declare key as volatile and assign it to NULL (forcing the assignment to
occur), or add an lfence before the final assignment.
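
For instance, the last option might look like the following sketch, keeping
the original lfence() and adding one before the dependent store:

int *key;
int valid = 0;

if (input < NUM_ENTRIES) {
	lfence();
	key = &table[input];
	valid = 1;
}
...
if (valid) {
	lfence();   /* both the bounds check and valid must resolve before the store */
	*key = data;
}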

COMPILER-BASED APPROACHES

Note that there are also compiler-based approaches that automatically augment
software with instructions to constrain speculation and can help prevent Bounds
Check Bypass, such as Speculative Load Hardening (clang) and the /Qspectre
option (MSVC). 

Compiler protections against buffer overwrites of return addresses, such as
stack canaries, also provide some resistance to speculative buffer overruns. In
situations where a loop speculatively overwrites the return address, it will
also speculatively trigger the stack protection, diverting the speculative
flow. However, stack canaries alone are not sufficient to protect against
bounds check bypass attacks.

MICROSOFT VISUAL STUDIO* 2017 MITIGATIONS

The Microsoft Visual Studio* 2017 Visual C++ compiler toolchain includes
support for the /Qspectre flag, which may automatically add mitigations for
some bounds check bypass vulnerabilities. For more information and usage
guidelines, refer to Microsoft’s public blog and the Visual C++ /Qspectre
option page.

LFENCE IN INTEL FORTRAN COMPILER

You can insert an LFENCE instruction in Fortran applications as shown in the
example below. Implement the following subroutine interface, which binds to the
_mm_lfence() intrinsic:

interface
        subroutine for_lfence() bind (C, name = "_mm_lfence")
            !DIR$ attributes known_intrinsic, default :: for_lfence
        end subroutine for_lfence
    end interface
  
    if (untrusted_index_from_user .le. iarr1%length) then
        call for_lfence()
        ival = iarr1%data(untrusted_index_from_user)
        index2 = (IAND(ival,1)*z'100') + z'200'   
        if (index2 .le. iarr2%length) then
            ival2 = iarr2%data(index2)
        endif
    endif


The LFENCE intrinsic is supported in the following Intel compilers:

 * Intel C++ Compiler 8.0 and later for Windows*, Linux*, and macOS*.
 * Intel Fortran Compiler 14.0 and later for Windows, Linux, and macOS.

COMPILER-DRIVEN AUTOMATIC MITIGATIONS

Across the industry, there is interest in mitigations for bounds check bypass
vulnerabilities that are provided automatically by compilers. Developers are
continuing to evaluate the efficacy, reliability, and robustness of these
mitigations and to determine whether they are best used in combination with, or
in lieu of, the more explicit mitigations discussed above.

OPERATING SYSTEM MITIGATIONS

Where possible, dedicated operating system programming APIs should be used to
mitigate bounds check bypass instead of using open-coded mitigations. Using the
OS-provided APIs will help ensure that code can take advantage of new mitigation
techniques or optimizations as they become available. 

LINUX* KERNEL

The current Linux* kernel mitigation approach to bounds check bypass is
described in the speculation.txt file in the Linux kernel documentation. This
file is subject to change as developers and multiple processor vendors determine
their preferred approaches.

ifence(): on the x86 architecture, this issues an LFENCE and provides the
compiler with the needed memory barriers to perform the mitigation. It can be
used like the lfence() in the examples above. On non-Intel processors, ifence()
either generates the correct barrier code for that processor, or does nothing
if the processor does not speculate.

array_ptr(array, index, max): this is an inline that, irrespective of the
processor, provides a method to safely dereference an array element.
Additionally, it returns NULL if the lookup is invalid. This allows you to take
the many cases where you range check and then check that an entry is present,
and fold those cases into a single conditional test.

Thus, we can turn:

if (handle < 32) {
	x = handle_table[handle];
	if (x) {
		function(x);
		return 0;
	}
}
return -EINVAL;

Into:

x = array_ptr(handle_table, handle, 32);
if (x == NULL)
	return -EINVAL;
function(*x);
return 0;

MICROSOFT WINDOWS*

Windows C/C++ developers have a variety of options to assist in mitigating
bounds check bypass (Spectre variant 1). The best option will depend on the
compiler and code generation toolchains you are using. Mitigation options
include both manual and compiler-assisted approaches.

In mixed-mode compiler environments, where object files for the same project
are built with different toolchains, there are varying degrees of mitigation
options available. Developers need to be aware of, and apply, the appropriate
mitigations depending on their code composition and the toolchain support
available.

As described in the Operating System Mitigations section, we recommend inserting
LFENCE instructions (either manually or with compiler assistance) for mitigating
bounds check bypass on Windows. The following sections provide details on how to
insert the LFENCE instruction using currently available compiler tool chain
mechanisms. These mechanisms are (from lowest level to highest level):

 * Inline/external assembly
 * _mm_lfence() compiler intrinsic
 * Compiler automatic LFENCE insertion

INLINE/EXTERNAL ASSEMBLY

The Intel® C Compiler and Intel® C++ Compiler provide inline assembly support
for 32- and 64-bit targets, whereas Microsoft Visual C++* only provides inline
assembly support for 32-bit targets. Microsoft Macro Assembler* (MASM) or other
external, third-party assemblers may also be used to insert LFENCE in assembly
code.

_MM_LFENCE() COMPILER INTRINSIC

The Intel C Compiler, the Intel C++ Compiler, and the Microsoft Visual C++
compiler all support generating LFENCE instructions for 32- and 64-bit targets
using the _mm_lfence() intrinsic.

The easiest way for Windows developers to gain access to the intrinsic is by
including the intrin.h header file that is provided by the compilers. Some
Windows SDK/WDK headers (for example, winnt.h and wdm.h) define the _mm_lfence()
intrinsic to avoid inclusion of the compiler intrin.h. It is possible that you
already have code that locally defines _mm_lfence() as well, or uses an already
existing definition for the intrinsic. 

LFENCE IN C/C++

You can insert LFENCE instructions in a C/C++ program as shown in the example
below:

#include <intrin.h>
#pragma intrinsic(_mm_lfence)
 
    if (user_value >= LIMIT)
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    else
    {   
        _mm_lfence();   /* manually inserted by developer */
        x = table[user_value];
        node = entry[x];
    }



DATA SPECULATION


OVERVIEW OF DATA SPECULATION

Intel processors implement performance features that allow instructions that
depend on the behavior of older instructions to speculatively execute before
these older instructions have executed:

 * Memory disambiguation predicts whether the address of a memory load overlaps
   with the yet-unknown address of a preceding memory store to allow speculative
   execution of the memory load. Misprediction of memory disambiguation can
   allow for Speculative Store Bypass attacks that transiently access and infer
   stale data in memory (as described in the Speculative Store Bypass section).
 * The fast store forwarding predictor allows a memory load to speculatively
   use the data of a preceding memory store before all store-to-load forwarding
   conditions are resolved, for example, before it is confirmed that the load
   and store addresses match.
 * The floating-point unit statically predicts that floating-point results will
   be normal so that it can speculatively execute floating-point operations. A
   microcode assist is triggered to handle denormal/subnormal floating-point
   results. Floating Point Value Injection is a technique to infer information
   using the transiently computed floating-point result before a subnormal
   floating-point microcode assist is triggered and the transient result is
   cleaned up.


SPECULATIVE STORE BYPASS

Many Intel processors use memory disambiguation predictors that allow loads to
be executed speculatively before it is known whether the load’s address overlaps
with a preceding store’s address. This may happen if a store’s address is
unknown when the load is ready to execute. If the processor predicts that the
load address will not overlap with the unknown store address, the load may
execute speculatively. However, if there is indeed an overlap, then the load may
consume stale data. When this occurs, the processor will re-execute the load to
ensure a correct result.

Through the memory disambiguation predictors, an attacker can cause certain
instructions to be executed speculatively and then use the effects for side
channel analysis. For example, consider the following scenario:

K is a secret asset (for example, a cryptographic key) inside the victim code.
The attacker is allowed to know the value of M, but not the value of K. X is a
variable in memory. Assuming an attacker can find the following code in a victim
application:

 1. X = &K;                 // Attacker manages to get a variable holding the
                            // address of K stored into pointer X
    <at some later point>

 2. X = &M;                 // Stores the address of M to pointer X

 3. Y = Array[*X & 0xFFFF]; // Dereferences pointer X (which holds the address
                            // of M) to load from the array at the index
                            // specified by M[15:0]

When the above code runs, the load from address X that occurs as part of step 3
may execute speculatively and, due to memory disambiguation, initially receive
the address of K instead of the address of M. When this address of K is
dereferenced, the array is speculatively accessed with an index of K[15:0]
instead of M[15:0]. The CPU will later re-execute the load from address X and
use M[15:0] as the index into the array. However, the cache movement caused by
the earlier speculative access to the array may be analyzed by the attacker to
infer information about K[15:0].

As in the previous example, an attacker may be able to discover confused deputy
code which may allow them to use speculative execution to reveal the value of
memory that is not normally accessible to them. In a language-based security
environment (for example, a managed runtime), where an attacker is able to
influence the generation of code, an attacker may be able to create such a
confused deputy. Intel has not currently observed this method in situations
where the attacker has to discover such an exploitable confused deputy scenario.


SPECULATIVE STORE BYPASS CONTROL MECHANISMS

Intel has developed mitigation techniques for speculative store bypass. It can
be mitigated by software modifications or, if those are not feasible, by
Speculative Store Bypass Disable (SSBD), which prevents a load from executing
speculatively until the addresses of all older stores are known. Intel
recommends using the mitigations below only for managed runtimes or other
situations that use language-based security to guard against attacks within an
address space.

SOFTWARE-BASED MITIGATIONS

Speculative store bypass can be mitigated through numerous software-based
approaches. This section describes two such software-based mitigations: process
isolation and the selective use of LFENCE.

PROCESS ISOLATION

One approach is to move all secrets into a separate address space from untrusted
code. For example, creating separate processes for different websites so that
secrets of one website are not mapped into the same address space as code from a
different, possibly malicious, website. Similar techniques can be used for other
runtime environments that rely on language-based security to run trusted and
untrusted code within the same process. This may also be useful as part of a
defense-in-depth strategy to prevent trusted code from being manipulated to
create a side channel. Protection Keys can also be valuable in providing such
isolation. Refer to the Protection Keys section for more information.

USING LFENCE TO CONTROL SPECULATIVE LOAD EXECUTION

Software can insert an LFENCE between a store (for example, the store of address
of M in step 2 of the Speculative Store Bypass section) and the subsequent load
(for example, the load that dereferences X in step 3 of the Speculative Store
Bypass section) to prevent the load from executing before the previous store’s
address is known. The LFENCE can also be inserted between the load and any
subsequent usage of the data returned which might create a side channel (for
example, the access to Array in step 3 of the Speculative Store Bypass section).
Software should not apply this mitigation broadly, but instead should only apply
it where there is a realistic risk of an exploit; for example, if an attacker
can control the old value in the memory location, there is a realistic chance of
the load executing before the store address is known, and there is a disclosure
gadget that reveals the contents of sensitive memory.
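
As a minimal sketch using the scenario above and the _mm_lfence() intrinsic:

X = &M;                    /* step 2: store the address of M into pointer X */
_mm_lfence();              /* the store and its address are resolved before the load of *X */
Y = Array[*X & 0xFFFF];    /* step 3: the load can no longer bypass the store */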

Other mitigations like inserting register dependencies between a vulnerable load
address and the corresponding store address may reduce the likelihood of
Speculative Store Bypass Attacks being successful.

SPECULATIVE STORE BYPASS DISABLE (SSBD)

If the earlier software-based mitigations are not feasible, then employing
Speculative Store Bypass Disable (SSBD) will mitigate speculative store bypass.

When SSBD is set, loads will not execute speculatively until the addresses of
all older stores are known. This ensures that a load does not speculatively
consume stale data values due to bypassing an older store on the same logical
processor.

BASIC SUPPORT

Software can disable speculative store bypass on a logical processor by setting
IA32_SPEC_CTRL.SSBD to 1.
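
As a minimal ring 0 sketch, assuming SSBD support has already been enumerated
via CPUID; a real implementation would read-modify-write the MSR to preserve
other IA32_SPEC_CTRL bits:

#define MSR_IA32_SPEC_CTRL  0x48
#define SPEC_CTRL_SSBD      (1ULL << 2)   /* bit 2: Speculative Store Bypass Disable */

/* Hypothetical helper: write a 64-bit value to an MSR (ring 0 only). */
static inline void wrmsr(unsigned int msr, unsigned long long val)
{
    __asm__ volatile ("wrmsr" :: "c"(msr), "a"((unsigned int)val),
                                 "d"((unsigned int)(val >> 32)));
}

void enable_ssbd(void)
{
    /* Assumption: no other IA32_SPEC_CTRL bits (IBRS, STIBP) are in use. */
    wrmsr(MSR_IA32_SPEC_CTRL, SPEC_CTRL_SSBD);
}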

Both enclave and SMM code will behave as if SSBD is set regardless of the actual
value of the MSR bit. The processor will ensure that a load within enclave or
SMM code does not speculatively consume stale data values due to bypassing an
older store on the same logical processor.

SOFTWARE USAGE GUIDELINES

Enabling SSBD can prevent exploits based on speculative store bypass. However,
this may reduce performance. Intel provides the following recommendations for
the use of such a mitigation.

 * Intel recommends software set SSBD for applications and/or execution runtimes
   relying on language-based security mechanisms. Examples include managed
   runtimes and just-in-time translators. If software is not relying on
   language-based security mechanisms, for example because it is using process
   isolation, then setting SSBD may not be needed.
 * Intel is currently not aware of any practical exploit for OSes or other
   applications that do not rely on language-based security. Intel encourages
   these users to consider their particular security needs in determining
   whether to set SSBD outside the context of language-based security
   mechanisms.

These recommendations may be updated in the future.

On Intel® Core™ and Intel® Xeon® processors that enable Intel® Hyper-Threading
Technology and do not support enhanced IBRS, setting SSBD on a logical processor
may impact the performance of a sibling logical processor on the same core.
Intel recommends that the SSBD MSR bit be cleared when in an idle state on such
processors.

Operating systems should provide an API through which a process can request it
be protected by SSBD mitigation.

VMMs should allow a guest to determine whether to enable SSBD mitigation by
providing direct guest access to IA32_SPEC_CTRL.


DATA-DEPENDENT PREFETCHERS

Besides control and data speculation, Intel processors implement prefetchers
that prefetch cache lines from memory based on data values previously loaded or
prefetched from memory, for example, data-dependent prefetchers (DDP). While
such prefetchers do not create speculative execution paths, they may yet allow
an attacker to infer information about loaded data values via cache-based side
channels. 

Intel processors automatically enforce properties for these prefetchers to
mitigate potential security concerns, as well as exposing a disable control, as
described in the provided DDP documentation. 


ADDITIONAL SOFTWARE GUIDANCE

The following section describes additional guidance for how software can
effectively restrict speculation and protect against speculation-based attacks
in a selection of use cases that have an increased risk of exploitation. 


OPERATING SYSTEMS

Due to the Speculative Behavior of SWAPGS and Segment Registers, operating
systems that use SWAPGS to change the GS segment register on kernel entry need
additional mitigations. When SWAPGS is speculatively skipped (for example, when
speculative execution takes a path that does not contain the SWAPGS
instruction), the recommended mitigation is to add an LFENCE or serializing
instruction before the first memory reference using GS on all paths that can
speculatively skip the SWAPGS instruction. When an extra SWAPGS instruction is
speculatively executed when it should not be, the mitigation is to add an
LFENCE or serializing instruction after the SWAPGS instruction.


SYSTEM MANAGEMENT MODE (SMM)

On certain processors from the Skylake generation, System Management Interrupt
(SMI) handlers can leave the RSB in a state that OS code does not expect. To
avoid RSB underflow on return from SMI and ensure retpoline implementations in
the OS and VMM work properly, on these processors, an SMI handler may implement
RSB stuffing before returning from System Management Mode (SMM).


RELATED INTEL SECURITY FEATURES AND TECHNOLOGIES

There are security features and technologies, either present in existing Intel
products or planned for future products, which reduce the effectiveness of the
attacks mentioned in the previous sections.


INTEL® OS GUARD

When Intel® OS Guard, also known as Supervisor-Mode Execution Prevention (SMEP),
is enabled, the operating system will not be allowed to directly execute
application code, even speculatively. This makes branch target injection attacks
on the OS substantially more difficult by forcing the attacker to find gadgets
within the OS code. It is also more difficult for an application to train OS
code to jump to an OS gadget. All major operating systems enable SMEP support by
default.


EXECUTE DISABLE BIT

The Execute Disable Bit is a hardware-based security feature that can help
reduce system exposure to viruses and malicious code. Execute Disable Bit allows
the processor to classify areas in memory where application code can or cannot
execute, even speculatively. This reduces the gadget space, increasing the
difficulty of branch target injection attacks. All major operating systems
enable Execute Disable Bit support by default. Applications are encouraged to
only mark code pages as executable.


INTEL® CONTROL-FLOW ENFORCEMENT TECHNOLOGY (INTEL® CET)

Intel Control-Flow Enforcement Technology (Intel® CET) is a feature on recent
Intel products to protect control-flow integrity against Return-Oriented
Programming (ROP) / Call-Oriented Programming (COP) / Jump-Oriented Programming
(JOP) style attacks. It provides two main capabilities:

 * Shadow stack: A shadow stack is a second independent stack which is used
   exclusively for control transfer operations. When shadow stacks are enabled,
   RET instructions require that return addresses on the data stack match the
   address on the shadow stack, which can be used to mitigate ROP attacks.
 * Indirect branch tracking (IBT): When IBT is enabled, the processor requires
   that the instruction at the target of indirect JMP or CALL instructions is an
   ENDBRANCH. Software must be compiled to place the ENDBRANCH instruction at
   valid targets.

Intel CET also applies restrictions to transient execution to constrain
speculative control flow. These restrictions may be relevant for both
control-flow speculation and attacker-controlled jump redirection. More details
can be found in the “Control-flow Enforcement Technology (CET)” chapter of the
IA-32 Intel® Architecture Software Developer’s Manual.

INTEL CET SHADOW STACK SPECULATION LIMITATIONS

When CET Shadow Stack is enabled, the processor will not execute instructions,
even speculatively, at the loaded target of the return address of a RET
instruction if that target differs from the predicted target (for example, that
predicted by the Return Stack Buffer), and:

 * The RET address values on the data stack and shadow stack do not match; or
 * Those address values may be transient (for example, the values may have been
   modified by an older speculative store).

INTEL CET INDIRECT BRANCH TRACKING (CET IBT) SPECULATION LIMITATIONS

When CET IBT is enabled, instruction execution will be limited or blocked, even
speculatively, if the next instruction is not an ENDBRANCH after an indirect JMP
or CALL which sets the IBT tracker state to WAIT_FOR_ENDBRANCH. The Tiger Lake
implementation of Intel CET limits speculative execution to a small number of
instructions (less than 8, with no more than 5 loads) after a missing ENDBRANCH.
On Alder Lake, Sapphire Rapids, Raptor Lake, and some future processors, the
potential speculation window at a target that does not start with ENDBRANCH is
limited to two instructions (and typically fewer) with no more than 1 load.

The intended long-term direction, and behavior on some current implementations
(including E-core only products like Alder Lake-N and Arizona Beach), is to
completely block the speculative execution of instructions after a missing
ENDBRANCH.


PROTECTION KEYS

On Intel processors that have both hardware support for mitigating Rogue Data
Cache Load (IA32_ARCH_CAPABILITIES[RDCL_NO]) and protection keys support
(CPUID.7.0.ECX[3]), protection keys can limit the data accessible to a piece of
software. This can be used to limit the memory addresses that could be revealed
by a branch target injection or bounds check bypass attack.


SUPERVISOR-MODE ACCESS PREVENTION (SMAP)

SMAP can be used to limit which memory addresses can be used for a cache-based
side channel by blocking allocation of application cache lines. This may make
it more difficult for an application to perform the attack on the kernel, as it
is more challenging for an application to determine whether a kernel line is
cached than whether an application line is cached. On Intel processors that
have both hardware support
for mitigating Rogue Data Cache Load (IA32_ARCH_CAPABILITIES[RDCL_NO]) and SMAP
support, loads that cause a page fault due to SMAP will not speculatively return
the loaded data even on a L1D cache hit or fill/evict any caches for that
address. On processors that have SMAP support but do not enumerate RDCL_NO,
loads that cause a page fault due to SMAP may speculatively return the loaded
data on L1D cache hits but will not fill/evict any caches for that address.


CPUID ENUMERATION AND ARCHITECTURAL MSRS

The CPUID Enumeration and Architectural MSRs document (listed in the References
below) describes processor support for mitigation mechanisms as enumerated
using the CPUID instruction and several architectural MSRs.

 


REFERENCES

 * Speculative Execution Side Channel Mitigations
 * Intel Analysis of Speculative Execution Side Channels
 * CPUID Enumeration and Architectural MSRs
 * Refined Speculation Execution Terminology
 * Retpoline: A Branch Target Injection Mitigation
 * Analyzing Potential Bounds Check Bypass Vulnerabilities
 * Host Firmware Speculative Execution Side Channel Mitigations
 * Fast Store Forwarding Predictor
 * Software Security Guidance
 * Intel Security Center 

 


FOOTNOTES

 1. The specific instructions are described in the Overview of Indirect Branch
    Predictors section. Note that the target address of direct branch
    instructions is also predicted, but Intel processors do not allow
    speculative execution at incorrect target addresses that are due to direct
    branches.
 2. This is an example of attacker-controlled jump redirection.
 3. A transition to a more privileged predictor mode through an INIT# is an
    exception to this and may not be sufficient to prevent the predicted targets
    of indirect branches executed in the new predictor mode from being
    controlled by software operating in a less privileged predictor mode.
 4. An RSB overwrite sequence is a sequence of instructions that includes 32
    more near CALL instructions with non-zero displacements than it has near
    RETs. 
 5. Note that indirect branches include near call indirect, near jump indirect
    and near return instructions; as documented by the speculative execution
    side channel mitigations guidance. Because it includes near returns, it
    follows that RSB entries created before an IBPB command cannot control the
    predicted targets of returns executed after the command on the same logical
    processor.
