fmsub.s incorrectly raises NV for 0.0 * 0.0 - qNaN #212

Description

@whensun

The issue

RVVM sets the invalid operation (NV) flag when fmsub.s computes:

(0.0 * 0.0) - qNaN

In my test, fflags becomes:

0x10

which means NV is set. I expected:

0x00
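For readers less familiar with the fflags layout, here is a minimal sketch that decodes the five accrued-exception bits defined by the RISC-V F extension (NX, UF, OF, DZ, NV from bit 0 to bit 4). The helper name fflags_describe and the example function are hypothetical and not part of RVVM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* fflags bit masks from the RISC-V F extension */
enum {
    FFLAG_NX = 1u << 0, /* inexact */
    FFLAG_UF = 1u << 1, /* underflow */
    FFLAG_OF = 1u << 2, /* overflow */
    FFLAG_DZ = 1u << 3, /* divide by zero */
    FFLAG_NV = 1u << 4  /* invalid operation */
};

/* Hypothetical helper: write the names of the set flags into buf,
 * highest bit first, separated by spaces. */
static const char *fflags_describe(uint32_t fflags, char *buf, size_t len)
{
    static const char *const names[5] = { "NX", "UF", "OF", "DZ", "NV" };
    size_t used = 0;

    if (len == 0) {
        return buf;
    }
    buf[0] = '\0';
    for (int bit = 4; bit >= 0; bit--) {
        if (fflags & (1u << bit)) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", names[bit]);
            if (n < 0 || (size_t)n >= len - used) {
                break; /* output would not fit, stop early */
            }
            used += (size_t)n;
        }
    }
    return buf;
}

/* Usage: the value observed in this report decodes to NV only. */
void fflags_example(void)
{
    char buf[32];
    assert(strcmp(fflags_describe(0x10, buf, sizeof buf), "NV") == 0);
    assert(strcmp(fflags_describe(0x00, buf, sizeof buf), "") == 0);
}
```

So 0x10 is exactly the NV bit with nothing else set, and 0x00 means no flag was raised.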

The RISC-V fused multiply-add rule specifically calls out the infinity × zero case as requiring NV, even when the addend is a quiet NaN. This test does not use infinity. It uses 0.0, 0.0, and a canonical qNaN.

It looks like RVVM raises NV too broadly for qNaN addends in its FMA/FMSUB handling.

Steps to reproduce

Build this bare-metal test case:

li t0, 0x7FC00000
fmv.w.x ft2, t0
fmsub.s ft3, ft0, ft1, ft2, rne
csrr t1, fflags

The snippet above assumes ft0 and ft1 already hold +0.0f. For a controlled setup, clear fflags and load all three operands explicitly:

csrw fflags, x0
li t0, 0
fmv.w.x ft0, t0
fmv.w.x ft1, t0
li t0, 0x7FC00000
fmv.w.x ft2, t0
fmv.x.w t2, ft0
fmv.x.w t3, ft1
fmv.x.w t4, ft2
fmsub.s ft3, ft0, ft1, ft2, rne
csrr t1, fflags

Build command used:

riscv64-unknown-elf-gcc \
  -march=rv64imafdch_zicfiss_zicbom_zicboz_v_zicsr_zca_zimop_zcmop_zbb_zbs_zkne_zbkb_zabha_zacas_zawrs_zkr_smepmp_zcb_zicond_zba_zknd_zbc_zbkc_zfh_zfbfmin_zfhmin_zfa_zifencei_zvfbfmin_zbkx_zvksed_zvksh_zvknha_zvknhb_zvkg_zvfbfwma_zvbc_zvbb_zvkned_zksed_zksh_zknh_zvkb_zicbop_zicfilp_svinval_zve32f \
  -mabi=lp64 \
  -mcmodel=medany \
  -nostdlib \
  -nostartfiles \
  -T linker.ld \
  code.S machine_to_supervisor.S machine_to_user.S \
  -o code.elf

Run RVVM:

rvvm code.elf -m 256M -nogui -serial null -gdbstub

Connect with GDB:

riscv64-unknown-elf-gdb code.elf
set pagination off
target remote :1234
c

Then interrupt execution with Ctrl-C and inspect the registers:

info registers t1 t2 t3 t4

With the controlled setup, the operand registers read back as:

t2 = 0x00000000
t3 = 0x00000000
t4 = 0x7FC00000

This corresponds to:

ft0 = +0.0f
ft1 = +0.0f
ft2 = qNaN (canonical)

Observed result on RVVM:

t1 = 0x10

Expected result:

t1 = 0x00

Investigation

I stumbled upon this while checking floating-point exception behavior for fused multiply-add instructions with qNaN operands.

The operands are loaded with fmv.w.x, which is specified to NaN-box the value it writes, and I read them back with fmv.x.w to confirm the bit patterns, so this should not be a NaN-boxing artifact or an operand setup issue.
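To make the NaN-boxing concern concrete, here is a small sketch of how a 64-bit f register holds a single-precision value when the D extension is present. The function names are hypothetical and model the RISC-V rule, not RVVM's internals: a single is valid only if the upper 32 bits are all ones, and any other pattern is read as the canonical qNaN by single-precision arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define F32_CANONICAL_QNAN 0x7FC00000u

/* Box a binary32 bit pattern into a 64-bit f register image,
 * as fmv.w.x is specified to do when FLEN is 64. */
static uint64_t nanbox_f32(uint32_t bits)
{
    return 0xFFFFFFFF00000000ull | bits;
}

/* Value seen by single-precision arithmetic: the low 32 bits if the
 * register is properly boxed, otherwise the canonical qNaN. */
static uint32_t unbox_f32(uint64_t reg)
{
    if ((reg >> 32) == 0xFFFFFFFFu) {
        return (uint32_t)reg;
    }
    return F32_CANONICAL_QNAN;
}

/* Usage: a boxed +0.0f stays +0.0f, an all-zero register does not. */
void nanbox_example(void)
{
    assert(unbox_f32(nanbox_f32(0x00000000u)) == 0x00000000u);
    assert(unbox_f32(0x0000000000000000ull) == F32_CANONICAL_QNAN);
}
```

Note that fmv.x.w copies the low 32 bits without checking the upper half, so the readback alone confirms the bit patterns while the boxing comes from the fmv.w.x writes.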

The tested expression is:

(0.0 * 0.0) - qNaN

The spec text I used says that fused multiply-add instructions must set NV when the multiplicands are infinity and zero, even when the addend is a quiet NaN.

Because this case is 0.0 * 0.0 - qNaN, not infinity * 0 + qNaN, I do not think that special invalid-flag rule applies here.
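The reasoning above can be written down as a small reference predicate on raw binary32 bit patterns. This is a sketch of my reading of the IEEE 754 and RISC-V rules, not code taken from RVVM, and all function names are hypothetical. It reports whether a fused multiply-add should raise NV: any signaling NaN operand, infinity times zero (even with a qNaN addend), or an infinite product added to an infinity of the opposite sign.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Classification of raw IEEE 754 binary32 bit patterns */
static bool is_zero(uint32_t u) { return (u & 0x7FFFFFFFu) == 0; }
static bool is_inf(uint32_t u)  { return (u & 0x7FFFFFFFu) == 0x7F800000u; }
static bool is_nan(uint32_t u)  { return (u & 0x7FFFFFFFu) >  0x7F800000u; }
static bool is_snan(uint32_t u) { return is_nan(u) && (u & 0x00400000u) == 0; }

/* Should a * b + c raise the invalid operation flag? */
static bool fma32_expects_nv(uint32_t a, uint32_t b, uint32_t c)
{
    /* Any signaling NaN operand raises NV */
    if (is_snan(a) || is_snan(b) || is_snan(c)) {
        return true;
    }
    /* Infinity times zero raises NV, even when the addend is a qNaN */
    if ((is_inf(a) && is_zero(b)) || (is_zero(a) && is_inf(b))) {
        return true;
    }
    /* Remaining quiet NaN operands propagate without raising NV */
    if (is_nan(a) || is_nan(b) || is_nan(c)) {
        return false;
    }
    /* Infinite product plus an infinity of the opposite sign */
    if ((is_inf(a) || is_inf(b)) && is_inf(c)) {
        return ((a ^ b) >> 31) != (c >> 31);
    }
    return false;
}

/* fmsub computes a * b - c, which is a * b + (-c) */
static bool fmsub32_expects_nv(uint32_t a, uint32_t b, uint32_t c)
{
    return fma32_expects_nv(a, b, c ^ 0x80000000u);
}

/* Usage: the case from this report versus the special-cased one. */
void nv_rule_example(void)
{
    assert(!fmsub32_expects_nv(0x00000000u, 0x00000000u, 0x7FC00000u));
    assert(fmsub32_expects_nv(0x7F800000u, 0x00000000u, 0x7FC00000u));
}
```

Under these rules (0.0 * 0.0) - qNaN falls into the "quiet NaN propagates" branch, which is why 0x00 is the expected fflags value.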

The current RVVM implementation appears to pass the operands straight to the underlying FMA routine and keep whatever NV flag it raises, including for qNaN addends that fall outside the spec's special infinity-times-zero requirement.

Old code:

// RVVM/src/cpu/riscv_fpu.h

static forceinline void riscv_emulate_f_fmsub(rvvm_hart_t* vm, const uint32_t insn)
{
    const size_t   rds = bit_ext_u32(insn, 7, 5);
    const uint32_t rm  = bit_ext_u32(insn, 12, 3);
    const size_t   rs1 = bit_ext_u32(insn, 15, 5);
    const size_t   rs2 = bit_ext_u32(insn, 20, 5);
    const size_t   rs3 = insn >> 27;

    if (likely(riscv_fpu_is_enabled(vm) && riscv_fpu_rm_is_valid(rm))) {
        switch (bit_ext_u32(insn, 25, 2)) {
            case 0x0: // fmsub.s
                riscv_emit_s(vm, rds,
                             fpu_fma32(riscv_view_s(vm, rs1),
                                       riscv_view_s(vm, rs2),
                                       fpu_neg32(riscv_view_s(vm, rs3))));
                return;
            case 0x1: // fmsub.d
                riscv_emit_d(vm, rds,
                             fpu_fma64(riscv_view_d(vm, rs1),
                                       riscv_view_d(vm, rs2),
                                       fpu_neg64(riscv_view_d(vm, rs3))));
                return;
        }
    }

    riscv_illegal_insn(vm, insn);
}
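For reference, the field extraction in the handler above follows the R4-type instruction layout. The sketch below decodes the same fields from a raw instruction word. The bits helper is a hypothetical stand-in that assumes bit_ext_u32(val, pos, len) extracts len bits starting at bit pos, and the example word was assembled by hand for fmsub.s ft3, ft0, ft1, ft2, rne, so it is worth checking against an assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for bit_ext_u32: len bits starting at bit pos */
static uint32_t bits(uint32_t val, unsigned pos, unsigned len)
{
    return (val >> pos) & ((1u << len) - 1u);
}

/* Fields of an R4-type (fused multiply-add) instruction */
typedef struct {
    uint32_t opcode; /* bits 6:0,   0x47 for FMSUB */
    uint32_t rd;     /* bits 11:7  */
    uint32_t rm;     /* bits 14:12, rounding mode, 0 = rne */
    uint32_t rs1;    /* bits 19:15 */
    uint32_t rs2;    /* bits 24:20 */
    uint32_t fmt;    /* bits 26:25, 0 = single, 1 = double */
    uint32_t rs3;    /* bits 31:27 */
} r4_fields_t;

static r4_fields_t decode_r4(uint32_t insn)
{
    r4_fields_t f;
    f.opcode = bits(insn, 0, 7);
    f.rd     = bits(insn, 7, 5);
    f.rm     = bits(insn, 12, 3);
    f.rs1    = bits(insn, 15, 5);
    f.rs2    = bits(insn, 20, 5);
    f.fmt    = bits(insn, 25, 2);
    f.rs3    = insn >> 27;
    return f;
}

/* Usage: hand-assembled fmsub.s ft3, ft0, ft1, ft2, rne (f3, f0, f1, f2) */
void decode_example(void)
{
    r4_fields_t f = decode_r4(0x101001C7u);
    assert(f.opcode == 0x47 && f.fmt == 0 && f.rm == 0);
    assert(f.rd == 3 && f.rs1 == 0 && f.rs2 == 1 && f.rs3 == 2);
}
```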

Workarounds

I do not know of a practical guest-side workaround other than avoiding code that depends on the exact NV behavior for this qNaN FMSUB case.

The issue appears limited to software that checks fflags after fused multiply-add operations involving qNaN operands.

A local emulator-side workaround is to suppress NV when the addend is a qNaN, neither multiplicand is a signaling NaN, and the multiplicands are not the special infinity-times-zero case.

Suggested fix / Expected behavior

I think RVVM should not raise NV for:

(0.0 * 0.0) - qNaN

A possible fix is to preserve the special invalid case for infinity × zero while clearing a newly raised NV when the addend is a qNaN, neither multiplicand is a signaling NaN, and the multiplicands are not infinity and zero.

Suggested change:

// RVVM/src/cpu/riscv_fpu.h

static forceinline void riscv_emulate_f_fmsub(rvvm_hart_t* vm, const uint32_t insn)
{
    const size_t   rds = bit_ext_u32(insn, 7, 5);
    const uint32_t rm  = bit_ext_u32(insn, 12, 3);
    const size_t   rs1 = bit_ext_u32(insn, 15, 5);
    const size_t   rs2 = bit_ext_u32(insn, 20, 5);
    const size_t   rs3 = insn >> 27;

    if (likely(riscv_fpu_is_enabled(vm) && riscv_fpu_rm_is_valid(rm))) {
        switch (bit_ext_u32(insn, 25, 2)) {
            case 0x0: { // fmsub.s
                fpu_f32_t a = riscv_view_s(vm, rs1);
                fpu_f32_t b = riscv_view_s(vm, rs2);
                fpu_f32_t c = riscv_view_s(vm, rs3);

                uint32_t e_old = fpu_get_exceptions();
                fpu_f32_t out  = fpu_fma32(a, b, fpu_neg32(c));
                uint32_t e_new = fpu_get_exceptions();

                bool c_is_qnan = fpu_is_nan32_soft(c) && !fpu_is_snan32_soft(c);
                // A signaling NaN multiplicand must still raise NV
                bool ab_snan   = fpu_is_snan32_soft(a) || fpu_is_snan32_soft(b);

                uint32_t ua = fpu_bit_f32_to_u32(a);
                uint32_t ub = fpu_bit_f32_to_u32(b);
                bool a_zero = (ua & 0x7FFFFFFFU) == 0;
                bool b_zero = (ub & 0x7FFFFFFFU) == 0;
                bool a_inf  = !fpu_is_finite32(a) && !fpu_is_nan32_soft(a);
                bool b_inf  = !fpu_is_finite32(b) && !fpu_is_nan32_soft(b);
                bool inf0_invalid = (a_inf && b_zero) || (b_inf && a_zero);

                // Clear NV only if this instruction raised it and no rule requires it
                if (c_is_qnan && !ab_snan && !inf0_invalid) {
                    uint32_t raised = e_new & ~e_old;
                    if (raised & FPU_LIB_FLAG_NV) {
                        fpu_set_exceptions(e_new & ~FPU_LIB_FLAG_NV);
                    }
                }

                riscv_emit_s(vm, rds, out);
                return;
            }
            case 0x1: // fmsub.d
                riscv_emit_d(vm, rds,
                             fpu_fma64(riscv_view_d(vm, rs1),
                                       riscv_view_d(vm, rs2),
                                       fpu_neg64(riscv_view_d(vm, rs3))));
                return;
        }
    }

    riscv_illegal_insn(vm, insn);
}
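The comparison of the exception state before and after the operation matters because fflags is sticky: an NV left over from an earlier instruction must survive. A minimal standalone sketch of that filtering step follows. FLAG_NV is a hypothetical constant using the RISC-V fflags bit value, and the actual value of RVVM's FPU_LIB_FLAG_NV may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLAG_NV 0x10u /* hypothetical, mirrors the RISC-V fflags NV bit */

/* Drop NV only when this operation raised it and it is known to be
 * spurious. An NV that was already pending before the operation, and
 * every other flag, is left untouched. */
static uint32_t filter_spurious_nv(uint32_t before, uint32_t after,
                                   bool nv_is_spurious)
{
    uint32_t raised = after & ~before;

    if (nv_is_spurious && (raised & FLAG_NV)) {
        return after & ~FLAG_NV;
    }
    return after;
}

/* Usage: a fresh spurious NV is removed, a pending one is kept. */
void filter_example(void)
{
    assert(filter_spurious_nv(0x00, 0x10, true) == 0x00);
    assert(filter_spurious_nv(0x10, 0x10, true) == 0x10);
}
```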

Expected behavior:

  • fmsub.s ft3, ft0, ft1, ft2, rne with ft0 = +0.0f, ft1 = +0.0f, and ft2 = qNaN should leave fflags unchanged.
  • The special invalid-flag requirement should remain for the separate infinity × zero case.
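This report only exercises fmsub.s, and the suggested change leaves the fmsub.d case as it was. If the double-precision path turns out to behave the same way (I have not tested it), the same operand checks would be needed on binary64 values. A sketch of those checks on raw bit patterns, with hypothetical helper names that are not RVVM functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F64_ABS_MASK 0x7FFFFFFFFFFFFFFFull
#define F64_INF_BITS 0x7FF0000000000000ull
#define F64_QUIET    0x0008000000000000ull /* top mantissa bit */

static bool f64_is_zero(uint64_t u) { return (u & F64_ABS_MASK) == 0; }
static bool f64_is_inf(uint64_t u)  { return (u & F64_ABS_MASK) == F64_INF_BITS; }
static bool f64_is_nan(uint64_t u)  { return (u & F64_ABS_MASK) >  F64_INF_BITS; }
static bool f64_is_qnan(uint64_t u) { return f64_is_nan(u) && (u & F64_QUIET) != 0; }
static bool f64_is_snan(uint64_t u) { return f64_is_nan(u) && (u & F64_QUIET) == 0; }

/* True when the multiplicands are the infinity-times-zero pair */
static bool f64_inf_times_zero(uint64_t a, uint64_t b)
{
    return (f64_is_inf(a) && f64_is_zero(b)) ||
           (f64_is_zero(a) && f64_is_inf(b));
}

/* NV may be suppressed only for a qNaN addend, with no signaling NaN
 * multiplicand and no infinity-times-zero product. */
static bool f64_nv_is_spurious(uint64_t a, uint64_t b, uint64_t c)
{
    return f64_is_qnan(c) && !f64_is_snan(a) && !f64_is_snan(b) &&
           !f64_inf_times_zero(a, b);
}

/* Usage: (0.0 * 0.0) - qNaN versus (inf * 0.0) - qNaN in binary64. */
void f64_example(void)
{
    assert(f64_nv_is_spurious(0, 0, 0x7FF8000000000000ull));
    assert(!f64_nv_is_spurious(F64_INF_BITS, 0, 0x7FF8000000000000ull));
}
```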

Additional information

Tested expression:

(0.0 * 0.0) - qNaN

Verified operand setup:

t2 = 0x00000000 -> ft0 = +0.0f
t3 = 0x00000000 -> ft1 = +0.0f
t4 = 0x7FC00000 -> ft2 = qNaN (canonical)

Observed result:

t1 = 0x10 -> fflags = NV

Expected result:

t1 = 0x00

Real impact:

Low to medium. This mainly affects software or tests that inspect fflags after fused multiply-add operations with qNaN operands and non-infinity multiplicands.

Environment:

Operating system: Linux
OS/kernel version: Linux DESKTOP-PL0JDQL 6.6.87.2-microsoft-standard-WSL2 #1 SMP PREEMPT_DYNAMIC Thu Jun 5 18:30:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Architecture: x86_64
RVVM version/commit: latest version at the time of testing, exact commit unknown

Verbose logs:

If needed, rerun with verbose logging enabled:

rvvm code.elf -m 256M -nogui -serial null -gdbstub -verbose

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
