vs1r.v & fence.i Interaction Bug in XiangShan

Dec 1, 2025 by Alex Johnson

Understanding the Unexpected Interaction Between `vs1r.v` and `fence.i` Instructions in XiangShan

When diving into the world of computer architecture, especially with cutting-edge designs like XiangShan, encountering unexpected behaviors is part of the journey. In this article, we'll break down an issue involving the interaction between the vs1r.v (whole vector register store) and fence.i (instruction-fetch fence) instructions within the XiangShan processor. We'll explore the observed bug, its implications, and how it deviates from expected ISA (Instruction Set Architecture) semantics.

The Bug: A Deep Dive

At the heart of the issue is an unexpected behavior triggered by the following instruction sequence:

vs1r.v v14, (a3)
fence.i

The problem arises when register a3 holds the address (the Program Counter, or PC) of a later instruction, so the store issued by vs1r.v overwrites code that has not executed yet. In this scenario, the behavior of XiangShan and its reference models (REFs) diverges from what the RISC-V ISA specification anticipates. To fully grasp this, we need to dissect each component of this interaction.

Understanding `vs1r.v`

The vs1r.v instruction is a whole-register vector store: it writes the entire contents of one vector register (here v14) to memory, starting at the address held in the base register (here a3). Unlike ordinary vector stores, it does not depend on the current vl or vtype settings, which makes it useful for saving and restoring vector registers. In the sequence above, it is the instruction that modifies memory, and when a3 points into the code region, it modifies the instruction stream itself.
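As a rough illustration of the whole-register store semantics, the sketch below models vs1r.v as a plain byte copy from a register into memory. The 128-bit VLEN, the helper name vs1r_v, and the memory layout are assumptions made for this example; they are not taken from XiangShan or from the bug report.

```python
VLEN_BITS = 128          # assumed vector register width; real hardware may differ
VLENB = VLEN_BITS // 8   # bytes per vector register


def vs1r_v(memory: bytearray, vreg: bytes, base_addr: int) -> None:
    """Toy model of vs1r.v: store one whole vector register to memory."""
    if len(vreg) != VLENB:
        raise ValueError("vector register must be exactly VLENB bytes")
    # The whole register is written, independent of vl and vtype.
    memory[base_addr:base_addr + VLENB] = vreg


memory = bytearray(64)             # hypothetical flat memory
v14 = bytes(range(1, VLENB + 1))   # hypothetical register contents
a3 = 0x20                          # hypothetical base address
vs1r_v(memory, v14, a3)
```

If a3 pointed into the code region instead of a data buffer, the same byte copy would overwrite VLENB bytes of instructions, which is the situation the bug report describes.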

The Role of `fence.i`

On the other hand, fence.i synchronizes the instruction and data streams of a hart. RISC-V does not guarantee that a store to instruction memory becomes visible to that hart's instruction fetches until a fence.i has executed; after the fence.i, subsequent fetches must observe all earlier stores that are already visible to the hart. This is particularly important in scenarios where instructions are modified in memory, such as dynamic code generation or self-modifying code.
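To make the role of fence.i concrete, here is a minimal toy model of a hart whose instruction cache is not kept coherent with data stores. The class ToyHart and its methods are hypothetical names invented for this sketch, and flushing the whole cache is only one simple way an implementation might satisfy fence.i.

```python
class ToyHart:
    """Toy hart with memory and a non-coherent instruction cache."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.icache = {}  # pc -> cached 32-bit instruction word

    def store_word(self, addr, value):
        # Data stores update memory but do not touch the instruction cache.
        self.memory[addr:addr + 4] = value.to_bytes(4, "little")

    def fetch(self, pc):
        # Instruction fetch prefers a cached copy, which may be stale.
        if pc not in self.icache:
            self.icache[pc] = int.from_bytes(self.memory[pc:pc + 4], "little")
        return self.icache[pc]

    def fence_i(self):
        # Simplest possible fence.i: drop every cached instruction.
        self.icache.clear()


NOP = 0x00000013  # addi x0, x0, 0

hart = ToyHart(64)
hart.store_word(0x10, NOP)
before = hart.fetch(0x10)           # caches the NOP
hart.store_word(0x10, 0x00000000)   # overwrite the instruction with a data store
stale = hart.fetch(0x10)            # still returns the cached NOP
hart.fence_i()
fresh = hart.fetch(0x10)            # now observes the stored bytes
```

In this model the fetch before fence.i is allowed to return the old instruction, while the fetch after it must return the newly stored bytes.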

The Unexpected Divergence

The core of the bug lies in how XiangShan handles the fence.i instruction following a vs1r.v when a3 points to a later instruction's PC. According to the ISA semantics, fence.i should ensure that instruction fetch is synchronized with memory, meaning that any modification to the instruction stream is visible before further instructions are executed. However, the observed behavior in XiangShan and its REFs differs, leading to incorrect program execution. This divergence points to a discrepancy between the behavior the ISA intends and the actual implementation in XiangShan.

Reproduction and Observation

To effectively debug and resolve this issue, reproducing the bug is essential. The original reporter of this issue has generously provided a test case, including the assembly source file and the compiled .img file. By modifying the code as indicated in the assembly file's comments, multiple mismatch behaviors can be reproduced. This hands-on approach is invaluable for pinpointing the root cause of the bug.

Reference Models: NEMU and Spike

The issue was observed using two reference models: NEMU and Spike. NEMU (NJU Emulator) is an emulator often used as a reference in RISC-V development and testing. Spike, the open-source RISC-V ISA simulator, is another widely used instruction set simulator that often serves as a golden reference for RISC-V implementations. The detailed logs provided from both NEMU and Spike runs illustrate the point of divergence between the expected and actual behavior. These logs are crucial for understanding the sequence of instructions and the state of registers at the point of failure.

Error Analysis

The provided NEMU and Spike logs show the run diverging around an illegal instruction exception after the vs1r.v and fence.i sequence. The logs highlight differences in register values, privilege modes, and CSR (Control and Status Register) values between the DUT (Device Under Test, i.e., XiangShan) and the reference models. Specifically, differences in a0, mstatus, mepc, mtval, and mcause indicate that the two sides disagree about whether, or where, a trap was taken. The root cause appears to be related to how instruction fetch is synchronized with the preceding vs1r.v store when fence.i executes.
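When reading such logs, it can help to reduce the two architectural states to just the fields that differ. The sketch below shows one way to do that; the function name diff_state and every register value in it are hypothetical and are not copied from the reporter's logs.

```python
def diff_state(dut, ref):
    """Return {name: (dut_value, ref_value)} for every field that differs."""
    return {
        name: (dut.get(name), ref.get(name))
        for name in sorted(set(dut) | set(ref))
        if dut.get(name) != ref.get(name)
    }


# Hypothetical values for illustration only; they are NOT taken from the logs.
dut_state = {"a0": 0x0, "mepc": 0x80000010, "mcause": 0x0, "mtval": 0x0}
ref_state = {"a0": 0x0, "mepc": 0x80000024, "mcause": 0x2, "mtval": 0x0}

mismatches = diff_state(dut_state, ref_state)
for name, (dut_value, ref_value) in mismatches.items():
    print(f"{name}: dut={dut_value:#x} ref={ref_value:#x}")
```

A mismatch in mcause and mepc together, as in this made-up example, is the kind of pattern that suggests one side took a trap that the other did not.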

Expected Behavior: Adhering to ISA Semantics

To fully understand the severity of this bug, it's crucial to define the expected behavior. According to the RISC-V ISA specification, once fence.i has executed, subsequent instruction fetches on the same hart must observe every earlier store that is visible to that hart. In the context of the problematic instruction sequence, this means that if a3 holds the PC of a later instruction, the bytes written there by vs1r.v must be what the processor fetches and executes when it reaches that instruction.

The Illegal Instruction Exception

The expected behavior, in this case, is that executing the instruction at the modified address should trigger an illegal instruction exception. This exception is a standard mechanism in RISC-V for handling situations where the processor attempts to execute an invalid or unsupported instruction. By raising this exception, the processor signals that something has gone wrong, allowing the operating system or runtime environment to take appropriate action.
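The machine-mode bookkeeping for such a trap can be sketched as follows. This is a simplified model under stated assumptions: the CSR file is a plain dictionary, mstatus updates are omitted, the addresses are hypothetical, and mtval is set to the instruction bits even though an implementation may write zero instead. The exception code 2 for an illegal instruction and the rule that an encoding with its low 16 bits all zero is illegal come from the RISC-V specifications.

```python
ILLEGAL_INSTRUCTION = 2  # mcause exception code for an illegal instruction


def is_all_zero_encoding(insn):
    # RISC-V defines any encoding whose low 16 bits are all zero as illegal.
    return (insn & 0xFFFF) == 0


def take_illegal_instruction_trap(csrs, pc, insn):
    """Record the trap in a toy CSR file and return the handler address."""
    csrs["mepc"] = pc                     # PC of the faulting instruction
    csrs["mcause"] = ILLEGAL_INSTRUCTION
    csrs["mtval"] = insn                  # assumption: may be zero on real hardware
    return csrs["mtvec"] & ~0x3           # synchronous traps enter at the mtvec base


# Hypothetical CSR contents and addresses, chosen only for this example.
csrs = {"mtvec": 0x80000100, "mepc": 0, "mcause": 0, "mtval": 0}
pc, insn = 0x80000024, 0x00000000         # instruction overwritten with zeros

if is_all_zero_encoding(insn):
    next_pc = take_illegal_instruction_trap(csrs, pc, insn)
else:
    next_pc = pc + 4
```

The fields written here (mepc, mcause, mtval) are among the CSRs that the article reports as mismatching between the DUT and the reference models.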

Deviation from the Norm

The observed behavior in XiangShan deviates from this expected norm. Instead of raising an illegal instruction exception, the processor continues execution, leading to incorrect results and potential system instability. This discrepancy underscores the importance of rigorous testing and verification in processor design.

Implications and Importance of Resolution

The unexpected interaction between vs1r.v and fence.i has significant implications for the correctness and reliability of XiangShan. If left unaddressed, this bug could lead to a variety of issues, including:

Incorrect Program Execution: The most direct consequence is that programs relying on the correct behavior of fence.i might produce incorrect results. This is particularly problematic for applications that use dynamic code generation or self-modifying code.
Security Vulnerabilities: In some cases, incorrect instruction execution can lead to security vulnerabilities. For example, if an attacker can manipulate the instruction stream and bypass security checks, they might be able to gain unauthorized access to the system.
System Instability: In severe cases, the bug could cause the processor to enter an unstable state, leading to crashes or other unpredictable behavior.

Ensuring ISA Compliance

To ensure the integrity of XiangShan, it is crucial to resolve this bug. This involves identifying the root cause of the issue, implementing a fix, and thoroughly testing the fix to ensure that it resolves the problem without introducing new issues. Adhering to ISA semantics is paramount for any processor implementation, and this bug highlights the challenges in achieving that goal.

Environment and Configuration

The environment in which the bug was reproduced is also crucial for understanding the context of the issue. The reporter used the following compile command:

make emu CONFIG=MinimalConfig EMU_THREADS=2 -j40

This command builds the emu target with the MinimalConfig configuration and EMU_THREADS=2, which requests a two-thread emulator. The -j40 flag lets make run up to 40 jobs in parallel, which can significantly speed up the build process.

XiangShan Commit and Ready-to-Run Commit

The specific commits used for XiangShan and the ready-to-run environment are also important:

XiangShan Commit: 74565eecc122e407e40fecf7f68fc9990f19e28f
Ready-to-Run Commit: c4e0350c0f686cfa206d5b47d80cfd730f39675a

These commit hashes uniquely identify the exact versions of the software used, making it easier for others to reproduce the issue and verify any fixes.

Additional Context and Assistance

The reporter has shown a commendable commitment to resolving this issue by providing detailed information, including test cases and reproduction steps. Their willingness to assist with further details or additional reproducing inputs is invaluable for the debugging process. Such collaborative efforts are essential for advancing the state of computer architecture and ensuring the reliability of new processor designs.

Conclusion: A Step Towards Robust Processor Design

The unexpected interaction between vs1r.v and fence.i in XiangShan serves as a compelling reminder of the complexities involved in processor design. Identifying and resolving such bugs is a critical step towards building robust and reliable systems. By meticulously examining the behavior, comparing it against ISA specifications, and engaging in collaborative debugging, the XiangShan team can ensure that this cutting-edge processor meets the highest standards of correctness and performance. Understanding these interactions not only helps in fixing the immediate issue but also provides valuable insights for future designs and verification methodologies.

For further reading on RISC-V instruction set architecture and memory consistency models, you might find the resources at the RISC-V International website helpful.