Hazard (computer architecture)

In central processing unit (CPU) design, hazards are problems in the instruction pipeline of a CPU microarchitecture that arise when the next instruction cannot execute in the following clock cycle,^[1] and they can lead to incorrect computation results. Three common types of hazards are data hazards, structural hazards, and control hazards (branching hazards).^[2]

There are several methods used to deal with hazards, including pipeline stalls/pipeline bubbling, operand forwarding, and in the case of out-of-order execution, the scoreboarding method and the Tomasulo algorithm.

Background

Instructions in a pipelined processor are performed in several stages, so that at any given time several instructions are being processed in the various stages of the pipeline, such as fetch and execute. There are many different instruction pipeline microarchitectures, and instructions may be executed out-of-order. A hazard occurs when two or more of these simultaneous (possibly out of order) instructions conflict.

Types

Data hazards

Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Ignoring potential data hazards can result in race conditions (sometimes known as race hazards). There are three situations in which a data hazard can occur:

read after write (RAW), a true dependency
write after read (WAR), an anti-dependency
write after write (WAW), an output dependency

Consider two instructions i1 and i2, with i1 occurring before i2 in program order.

Read After Write (RAW)

(i2 tries to read a source before i1 writes to it) A read after write (RAW) data hazard refers to a situation where an instruction refers to a result that has not yet been calculated or retrieved. This can occur because even though an instruction is executed after a previous instruction, the previous instruction has not been completely processed through the pipeline.

Example

For example:

i1. R2 <- R1 + R3
i2. R4 <- R2 + R3

The first instruction calculates a value to be saved in register R2, and the second uses this value to compute a result for register R4. However, in a pipeline, when the operands for the second operation are fetched, the result of the first has not yet been saved, so a data hazard arises from the dependency.

We say that there is a data dependency with instruction i2, as it is dependent on the completion of instruction i1.
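
The timing behind this can be sketched as follows. This is only an illustration under stated assumptions: a classic five-stage pipeline (IF, ID, EX, MEM, WB), one instruction entering per cycle, registers read in ID and written in WB, no forwarding, and no same-cycle write-then-read in the register file. The stage list and the helper names `stage_cycle` and `raw_stall_cycles` are hypothetical and do not come from the text above.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed classic five-stage pipeline

def stage_cycle(issue_cycle, stage):
    """Cycle in which an instruction that entered IF at issue_cycle occupies stage."""
    return issue_cycle + STAGES.index(stage)

def raw_stall_cycles(producer_issue, consumer_issue):
    """Bubbles needed so the consumer reads registers only after the producer's write-back.

    Assumes registers are read in ID and written in WB, with no forwarding.
    """
    write_cycle = stage_cycle(producer_issue, "WB")
    read_cycle = stage_cycle(consumer_issue, "ID")
    return max(0, write_cycle - read_cycle + 1)

# i1 enters the pipeline in cycle 1 and i2 in cycle 2.
print(raw_stall_cycles(1, 2))
```

Under these assumptions i1 writes R2 in cycle 5 while i2 would read it in cycle 3, so i2 has to be delayed until its read falls after the write.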

Write After Read (WAR)

(i2 tries to write a destination before it is read by i1) A write after read (WAR) data hazard represents a problem with concurrent execution.

Example

For example:

i1. R4 <- R1 + R5
i2. R5 <- R1 + R2

If there is a chance that i2 may complete before i1 (i.e., with concurrent execution), the result must not be stored in register R5 before i1 has had a chance to fetch its operands.

Write After Write (WAW)

(i2 tries to write an operand before it is written by i1) A write after write (WAW) data hazard may occur in a concurrent execution environment.

Example

For example:

i1. R2 <- R4 + R7
i2. R2 <- R1 + R3

We must delay the WB (Write Back) of i2 until the execution of i1 finishes.
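
The three situations can be told apart by comparing the registers each instruction reads and writes. The following is a minimal sketch, assuming each instruction has one destination register and a tuple of source registers; the `Instr` type and the `data_hazards` function are illustrative names, not an established API.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str     # register written by the instruction
    srcs: tuple   # registers read by the instruction

def data_hazards(i1, i2):
    """Return the kinds of data hazard between i1 and a later instruction i2."""
    kinds = set()
    if i1.dest in i2.srcs:
        kinds.add("RAW")   # i2 reads what i1 writes
    if i2.dest in i1.srcs:
        kinds.add("WAR")   # i2 overwrites what i1 still has to read
    if i1.dest == i2.dest:
        kinds.add("WAW")   # both write the same register
    return kinds

# The three example pairs from the sections above, in order.
print(data_hazards(Instr("R2", ("R1", "R3")), Instr("R4", ("R2", "R3"))))
print(data_hazards(Instr("R4", ("R1", "R5")), Instr("R5", ("R1", "R2"))))
print(data_hazards(Instr("R2", ("R4", "R7")), Instr("R2", ("R1", "R3"))))
```

A pair of instructions can fall into more than one category at once, which is why the function returns a set rather than a single label.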

Structural hazards

A structural hazard occurs when a part of the processor's hardware is needed by two or more instructions at the same time. A canonical example is a single memory unit that is accessed both in the fetch stage, where an instruction is retrieved from memory, and in the memory stage, where data is read from or written to memory.^[3] Structural hazards can often be resolved by separating the component into orthogonal units (such as separate caches) or by bubbling the pipeline.
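
The single-memory example can be sketched as a search for cycles in which a fetch and a data access coincide. The sketch assumes one instruction is fetched per cycle starting at cycle 0, that the memory stage comes three cycles after fetch (as in a five-stage pipeline), and that memory has a single port; the function name `memory_port_conflicts` is hypothetical.

```python
def memory_port_conflicts(program):
    """Return the cycles in which instruction fetch and a data access both need memory.

    program is a list of booleans: True if the instruction at that position
    is a load or store. Assumes one fetch per cycle from cycle 0, the memory
    stage three cycles after fetch, and a single memory port.
    """
    fetch_cycles = set(range(len(program)))
    data_cycles = {i + 3 for i, uses_mem in enumerate(program) if uses_mem}
    return sorted(fetch_cycles & data_cycles)

# A load followed by four ALU instructions: the load's memory access in
# cycle 3 collides with the fetch of the fourth instruction.
print(memory_port_conflicts([True, False, False, False, False]))
```

With separate instruction and data memories the two sets of cycles would use different ports, so the intersection would no longer represent a conflict.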

Control hazards (branch hazards)

Branching hazards (also known as control hazards) occur with branches. On many instruction pipeline microarchitectures, the processor will not know the outcome of the branch when it needs to insert a new instruction into the pipeline (normally the fetch stage).

Eliminating hazards

Generic

Pipeline bubbling

Bubbling the pipeline, also known as a pipeline break or a pipeline stall, is a method for preventing data, structural, and branch hazards from occurring. As instructions are fetched, control logic determines whether a hazard could or will occur. If so, the control logic inserts no-operation instructions (NOPs) into the pipeline. Thus, before the next instruction (which would cause the hazard) executes, the previous one will have had sufficient time to complete, which prevents the hazard. If the number of NOPs is equal to the number of stages in the pipeline, the processor has been cleared of all instructions and can proceed free from hazards. All forms of stalling introduce a delay before the processor can resume execution.

Flushing the pipeline occurs when a branch instruction jumps to a new memory location, invalidating all previous stages in the pipeline. These previous stages are cleared allowing the pipeline to continue at the new instruction indicated by the branch.^[4]^[5]

Data hazards

There are several main solutions and algorithms used to resolve data hazards:

insert a pipeline bubble whenever a read after write (RAW) dependency is encountered, which is guaranteed to increase latency
use out-of-order execution to potentially avoid the need for pipeline bubbles
use operand forwarding to take data from later stages in the pipeline

In the case of out-of-order execution, the algorithm used can be:

scoreboarding, in which case a pipeline bubble will only be needed when there is no functional unit available
the Tomasulo algorithm, which utilizes register renaming allowing the continual issuing of instructions
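
The effect of register renaming on WAR and WAW hazards can be illustrated with a simplified renaming pass. This is not the Tomasulo algorithm itself, which tracks operands through reservation stations; it is a sketch that assumes an unlimited supply of physical registers named `P0`, `P1`, and so on, and the function name `rename` is hypothetical.

```python
from itertools import count

def rename(program):
    """Rename destination registers so that only true (RAW) dependencies remain.

    program is a list of (dest, src1, src2) architectural register names.
    Each write gets a fresh physical register; reads use the latest mapping.
    """
    fresh = (f"P{n}" for n in count())
    mapping = {}                       # architectural -> current physical name
    renamed = []
    for dest, src1, src2 in program:
        s1 = mapping.get(src1, src1)   # read the most recent version of each source
        s2 = mapping.get(src2, src2)
        mapping[dest] = next(fresh)    # allocate a new name for this write
        renamed.append((mapping[dest], s1, s2))
    return renamed

# WAR example: i2 writes R5 while i1 still has to read it.
print(rename([("R4", "R1", "R5"), ("R5", "R1", "R2")]))
```

After renaming, i2 writes a fresh register instead of R5, so it no longer matters whether it completes before i1 reads R5. A RAW dependency survives the pass, because the reader is pointed at the writer's new name.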

We can delegate the task of removing data dependencies to the compiler, which can fill in an appropriate number of NOP instructions between dependent instructions to ensure correct operation, or re-order instructions where possible.
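
A compiler-style NOP insertion pass might look like the following sketch. It assumes three-register instructions written as `(dest, src1, src2)` tuples and a parameter `gap` giving the number of instruction slots that must separate a write from a dependent read; the right value of `gap` depends on the pipeline, and both the parameter and the function name `insert_nops` are illustrative assumptions.

```python
def insert_nops(program, gap):
    """Pad the program with NOPs so no register is read within gap slots of its write.

    program is a list of (dest, src1, src2) tuples; "NOP" marks an inserted bubble.
    gap is the assumed number of slots required between a write and a dependent read.
    """
    out = []
    last_write = {}   # register -> position in out of its most recent writer
    for dest, src1, src2 in program:
        # Earliest position at which every previously written source is safe to read.
        ready = max((last_write[s] + gap + 1 for s in (src1, src2) if s in last_write),
                    default=0)
        out.extend(["NOP"] * max(0, ready - len(out)))
        last_write[dest] = len(out)
        out.append((dest, src1, src2))
    return out

# RAW example with an assumed gap of three slots.
print(insert_nops([("R2", "R1", "R3"), ("R4", "R2", "R3")], gap=3))
```

A re-ordering compiler would try to fill those slots with independent instructions rather than NOPs; the pass above only pads.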

Operand forwarding

Examples


For instance, suppose we want to write the value 3 to register 1 (which already contains a 6), and then add 7 to register 1 and store the result in register 2, i.e.:

Instruction 0: Register 1 = 6

Instruction 1: Register 1 = 3

Instruction 2: Register 2 = Register 1 + 7 = 10

Following execution, register 2 should contain the value 10. However, if Instruction 1 (write 3 to register 1) does not completely exit the pipeline before Instruction 2 starts execution, it means that Register 1 does not contain the value 3 when Instruction 2 performs its addition. In such an event, Instruction 2 adds 7 to the old value of register 1 (6), and so register 2 would contain 13 instead, i.e.:

Instruction 0: Register 1 = 6

Instruction 2: Register 2 = Register 1 + 7 = 13

Instruction 1: Register 1 = 3

This error occurs because Instruction 2 reads Register 1 before Instruction 1 has committed/stored the result of its write operation to Register 1. So when Instruction 2 is reading the contents of Register 1, register 1 still contains 6, not 3.

Forwarding helps correct such errors by relying on the fact that the output of Instruction 1 (which is 3) can be used by subsequent instructions before the value 3 is committed to/stored in Register 1.

Forwarding applied to our example means that we do not wait to commit/store the output of Instruction 1 in Register 1 (in this example, the output is 3) before making that output available to the subsequent instruction (in this case, Instruction 2). The effect is that Instruction 2 uses the correct (the more recent) value of Register 1, as if the commit/store had been made immediately rather than pipelined.

With forwarding enabled, the Instruction Decode/Execution (ID/EX) stage of the pipeline now has two inputs: the value read from the register specified (in this example, the value 6 from Register 1), and the new value of Register 1 (in this example, 3), which is sent from the next stage, Instruction Execute/Memory Access (EX/MEM). Additional control logic is used to determine which input to use.
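
The selection between the two inputs can be sketched as a small function. The parameter names and the function name `select_operand` are hypothetical, and the sketch assumes that only the EX/MEM stage can supply a forwarded value.

```python
def select_operand(src_reg, regfile_value, ex_mem_dest, ex_mem_value):
    """Choose the operand for src_reg: the forwarded value if EX/MEM holds a newer one.

    ex_mem_dest is the register that the instruction one stage ahead will write
    (None if it writes nothing); ex_mem_value is its not-yet-committed result.
    """
    if ex_mem_dest is not None and ex_mem_dest == src_reg:
        return ex_mem_value      # bypass the register file
    return regfile_value         # no newer value in flight

# Instruction 1 (Register 1 = 3) is in EX/MEM; Register 1 still holds 6.
operand = select_operand(1, 6, 1, 3)
print(operand + 7)   # Instruction 2: Register 2 = Register 1 + 7
```

When nothing is being forwarded, for example `select_operand(1, 6, None, None)`, the function falls back to the register file value, which reproduces the stale result from the example.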

Control hazards (branch hazards)

To avoid control hazards, microarchitectures can:

insert a pipeline bubble (discussed above), guaranteed to increase latency, or
use branch prediction and essentially make educated guesses about which instructions to insert, in which case a pipeline bubble will only be needed in the case of an incorrect prediction

If a branch causes a pipeline bubble after incorrect instructions have entered the pipeline, care must be taken to prevent any of the wrongly loaded instructions from having any effect on the processor state, other than the energy wasted processing them before they were discovered to be loaded incorrectly.
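
As an illustration of branch prediction, the sketch below uses a two-bit saturating counter. This is one common scheme chosen here as an assumption, not one the text prescribes. The class and function names are hypothetical, and each misprediction stands for a case in which a bubble or flush would be needed.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=1):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

def count_mispredictions(outcomes):
    """Number of wrong guesses over a sequence of actual branch outcomes."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses

# A loop branch taken nine times, then not taken on exit.
print(count_mispredictions([True] * 9 + [False]))
```

For a loop-like pattern the predictor is wrong only while warming up and on loop exit, whereas a strictly alternating pattern defeats it on every branch.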

Other techniques

Memory latency is another factor that designers need to consider, because the delay can reduce performance. Different types of memory have different access times, so choosing a suitable type of memory can improve the performance of the pipelined data path.^[6]

References

↑ Patterson & Hennessy 2009, p. 335.
↑ Patterson & Hennessy 2009, pp. 335-343.
↑ Patterson & Hennessy 2009, p. 336.

John P. Shen and Mikko H. Lipasti, Modern Processor Design: Fundamentals of Superscalar Processors, 2004, ISBN 0070570647

External links

Pipeline hazards, January 18, 2005, by Dean Tulsen


