SiFive - August 14, 2017
All Aboard, Part 1: The -march, -mabi, and -mtune arguments to RISC-V Compilers
Before we can board the RISC-V train, we'll have to take a stop
at the metaphorical ticket office: our machine-specific GCC command-line
arguments. These arguments all begin with -m
, and are all specific to
the RISC-V architecture port. In general, we've tried to match existing
conventions for these arguments, but like pretty much everything else
there are enough quirks to warrant a blog post. This blog discusses the
arguments most fundamental to the RISC-V ISA: the -march
, -mabi
, and
-mtune
arguments.
One advantage of having a functional GCC port for a long time
before we stabilized SiFive's interfaces is that we can have a well-thought-out
command-line interface to the RISC-V C and C++ compilers. This allows us
to expose the same command-line interface from both the GNU tools (GCC
and binutils) as well as the LLVM tools, as well as avoid the need for
users to directly pass flags to the assembler or linker via the
compiler's -Wa
and -Wl
arguments.
To ensure that the RISC-V compiler command-line interface is easy to extend in the future, we decided on a scheme where users describe the RISC-V target they are trying to compile for using three arguments:
-march=ISA
selects the architecture to target. This controls which instructions and registers are available for the compiler to use.-mabi=ABI
selects the ABI to target. This controls the calling convention (which arguments are passed in which registers) and the layout of data in memory.-mtune=CODENAME
selects the microarchitecture to target. This informs GCC about the performance of each instruction, allowing it to perform target-specific optimizations.
The -march
Argument
The -march
argument is essentially defined by the RISC-V user-level
ISA manual. -march
controls instruction set from which the compiler is
allowed to generate instructions. This argument determines the set of
implementations that a program will run on: any RISC-V compliant system
that subsumes the -march
value used to compile a program should be
able to run that program.
To get a bit more specific: Version 2.2 of the RISC-V User-Level ISA defines three base ISAs that are currently supported by the toolchain:
- RV32I: A load-store ISA with 32, 32-bit general-purpose integer registers.
- RV32E: An embedded flavor of RV32I with only 16 integer registers.
- RV64I: A 64-bit flavor of RV32I where the general-purpose integer registers are 64-bits wide.
In addition to these base ISAs, a handful of extensions have been specified. The extensions that have been specified and are supported by the toolchain are:
- M: Integer Multiplication and Division
- A: Atomic Instructions
- F: Single-Precision Floating-Point
- D: Double-Precision Floating-Point
- C: Compressed Instructions
RISC-V ISA strings are defined by appending the supported extensions to
the base ISA in the order listed above. For example, the RISC-V ISA with
32, 32-bit integer registers and the instructions for multiplication
would be denoted as RV32IM
. Users can control the set of instructions
that GCC uses when generating assembly code by passing the lower-case
ISA string to the -march
GCC argument: for example -march=rv32im
.
On RISC-V systems that don't support particular operations, emulation routines may be used to provide the missing functionality. For example the following C code
double dmul(double a, double b) {
return a * b;
}
will compile directly to a FP multiplication instruction when compiled with the D extension
$ riscv64-unknown-elf-gcc test.c -march=rv64imafdc -mabi=lp64d -o- -S -O3
dmul:
fmul.d fa0,fa0,fa1
ret
but will compile to an emulation routine without the D extension
$ riscv64-unknown-elf-gcc test.c -march=rv64i -mabi=lp64 -o- -S -O3
dmul:
add sp,sp,-16
sd ra,8(sp)
call __muldf3
ld ra,8(sp)
add sp,sp,16
jr ra
Similar emulation routines exist for the C intrinsics that are trivially implemented by the M and F extensions. As of this writing, there are no A routine emulations because they were rejected as part of the Linux upstreaming process -- this might change in the future, but - for now - we plan to mandate that Linux-capable machines subsume the A extension as part of the RISC-V platform specification.
The -mabi
Argument
The -mabi
argument to GCC specifies both the integer and
floating-point ABIs to which the generated code complies. Much like how
the -march
argument specifies which hardware generated code can run
on, the -mabi
argument specifies which software generated code can
link against. We use the standard naming scheme for integer ABIs
(ilp32
or lp64
), with an argumental single letter appended to select
the floating-point registers used by the ABI (ilp32
vs ilp32f
vs
ilp32d
). In order for objects to be linked together, they must follow
the same ABI.
RISC-V defines two integer ABIs and three floating-point ABIs, which together are treated as a single ABI string. The integer ABIs follow the standard ABI naming scheme:
ilp32
:int
,long
, and pointers are all 32-bits long.long
is a 64-bit type,char
is 8-bit, andshort
is 16-bit.lp64
:long
and pointers are 64-bits long, whileint
is a 32-bit type. The other types remain the same as ilp32.
while the floating-point ABIs are a RISC-V specific addition:
- "" (the empty string): No floating-point arguments are passed in registers.
f
: 32-bit and smaller floating-point arguments are passed in registers. This ABI requires the F extension, as without F there are no floating-point registers.d
: 64-bit and smaller floating-point arguments are passed in registers. This ABI requires the D extension.
Just like ISA strings, ABI strings are concatenated together and passed
via the -mabi
argument to GCC. In order to explain why the ISA and
ABI should be treated as two separate arguments, let's examine a handful
of -march
/-mabi
combinations:
-march=rv32imafdc -mabi=ilp32d
: Hardware floating-point instructions can be generated and floating-point arguments are passed in registers. This is like the-mfloat-abi=hard
argument to ARM's GCC.-march=rv32imac -mabi=ilp32
: No floating-point instructions can be generated and no floating-point arguments are passed in registers. This is like the-mfloat-abi=soft
argument to ARM's GCC.-march=rv32imafdc -mabi=ilp32
: Hardware floating-point instructions can be generated, but no floating-point arguments will be passed in registers. This is like the-mfloat-abi=softfp
argument to ARM's GCC, and is usually used when interfacing with soft-float binaries on a hard-float system.-march=rv32imac -mabi=ilp32d
: Illegal, as the ABI requires floating-point arguments are passed in registers but the ISA defines no floating-point registers to pass them in.
As a more concrete example, let's examine a simple C function that takes two double-precision arguments and returns their product. In order to make argument location explicit in all cases, we'll reverse the order of the arguments between the function call and the multiplication:
double dmul(double a, double b) {
return b * a;
}
The first argument is the simplest one: if neither the ABI, nor the
ISA, contains the concept of floating-point hardware, then the C
compiler cannot emit any floating-point-specific instructions. In
this case,emulation routines are used to perform the computation and
the arguments are passed in integer registers. As you can see, the
double-precision arguments are passed in 32-bit integer register pairs,
the order of arguments is swapped, ra
is saved (as it's callee saved),
the emulation routine is called, the stack is unwound, and the result is
returned (which is already in a0,a1
from __muldf3
).
$ riscv64-unknown-elf-gcc test.c -march=rv32imac -mabi=ilp32 -o- -S -O3
dmul:
mv a4,a2
mv a5,a3
add sp,sp,-16
mv a2,a0
mv a3,a1
mv a0,a4
mv a1,a5
sw ra,12(sp)
call __muldf3
lw ra,12(sp)
add sp,sp,16
jr ra
The second case is the exact opposite of this one: everything is
supported in hardware. In this case we can emit a single fmul.d
instruction to perform the computation, which when register allocated
correctly handles reversing the input arguments and producing the return
value.
$ riscv64-unknown-elf-gcc test.c -march=rv32imafdc -mabi=ilp32d -o- -S -O3
dmul:
fmul.d fa0,fa1,fa0
ret
The last case exposes why there is a split between the -march
and
-mabi
arguments to RISC-V compilers: users may want to generate code
that can be linked with code designed for systems that don't subsume a
particular extension while still taking advantage of the extra
instructions present in a particular extension. This is a common problem
when dealing with legacy libraries that need to be integrated into newer
systems so we've designed our compiler arguments and multilib paths to
cleanly integrate with this workflow.
The generated code is essentially a mix between the two above outputs:
the arguments are passed in the registers specified by the ilp32
ABI
(as opposed to the ilp32d
ABI, which could pass these arguments in
registers) but then once inside the function the compiler is free to use
the full power of the rv32imafdc
ISA to actually compute the result.
As a result, the compiler generates the double-precision arguments in
memory (the only way to construct a double on rv32
), loads them into
F
registers, performs the computation, stores the F
-register result
back out to the stack, and loads the result into the ABI-compliant
return value registers (a0
and a1
). While this is less efficient
than the code the compiler could generate if it was allowed to take full
advantage of the D-extension registers, it's a lot more efficient than
computing the floating-point multiplication without the D-extension
instructions
$ riscv64-unknown-elf-gcc test.c -march=rv32imafdc -mabi=ilp32 -o- -S -O3
dmul:
add sp,sp,-16
sw a0,8(sp)
sw a1,12(sp)
fld fa5,8(sp)
sw a2,8(sp)
sw a3,12(sp)
fld fa4,8(sp)
fmul.d fa5,fa5,fa4
fsd fa5,8(sp)
lw a0,8(sp)
lw a1,12(sp)
add sp,sp,16
jr ra
The final possibly ABI/ISA combination is easy: it's illegal. There's no
way the compiler could generate code for an ISA that requires passing
arguments in F
registers if it doesn't have access to the instructions
required to access those registers. As this must be user error, we bail
out right away.
$ riscv64-unknown-elf-gcc test.c -march=rv32imac -mabi=ilp32d -o- -S -O3
cc1: error: requested ABI requires -march to subsume the 'D' extension
The -mtune
Argument
The last compiler argument that's involved in specifying a target is the
simplest of the bunch. While the -march
argument can cause systems to
be unable to execute code and the -mabi
argument can cause objects to
be incompatible with each other, the -mtune
argument should only
change the performance of the generated code. As it currently stands we
really don't have any tuning models for RISC-V systems. Unless you've
just added a new tuning parameter to our GCC port, you probably
shouldn't bother doing anything with this argument.