Constraints for vector tuple types #43

nick-knight · 2023-06-29T00:23:10Z

How do we pass vector tuple types to/from extended asm templates? It seems that using the insertion/extraction intrinsics (with the first tuple element) might be unsafe.

@kito-cheng @leekillough

kito-cheng · 2023-06-29T02:29:03Z

It's safe to use the vr constraint for tuple types; the compiler recognizes the type and uses the right register information. This works on GCC trunk, but Clang trunk currently seems to crash.

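#include <riscv_vector.h>

void foo(void) {
    vint32m1x2_t v1, v2;
    /* The template is only an assembler comment; it shows which registers are bound to the tuples. */
    asm volatile ("# %0 %1" : "=vr"(v1) : "vr"(v2));
}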

leekillough · 2023-06-29T15:43:55Z

But how to pass certain fields of tuples as non-tuple vector registers?

If (v0,v1) is a tuple called vx, how do I pass vx.v0 or vx.v1 to inline assembly or non-segment intrinsics?

Depositing/extracting vectors from tuple aggregate types seems to defeat the purpose of segment loads/stores, unless it's just massaging for the compiler and introduces no new instructions (moves).

kito-cheng · 2023-07-04T02:32:33Z

> But how to pass certain fields of tuples as non-tuple vector registers?
>
> If (v0,v1) is a tuple called vx, how do I pass vx.v0 or vx.v1 to inline assembly or non-segment intrinsics?
>
> Depositing/extracting vectors from tuple aggregate types seems to defeat the purpose of segment loads/stores, unless it's just massaging for the compiler and introduces no new instructions (moves).

Yes. Use vget/vset to extract vectors from, and deposit vectors into, tuple types. The compiler will try to allocate the same registers so that no extra move instructions are needed. If you see a move instruction that you think is unnecessary, please report it as a bug to the LLVM or GCC community, since it may be a performance issue.
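
For illustration, here is a minimal sketch of that vget/vset pattern applied to interleaved complex data. The function name conj_interleaved_i32 is hypothetical, and the RVV path assumes a toolchain whose <riscv_vector.h> already provides the tuple types and the __riscv_-prefixed intrinsics; on any other target the scalar fallback is compiled instead. The extracted field is an ordinary vint32m1_t, so it could presumably also be bound to inline asm with a "vr" constraint instead of being passed to an intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Conjugate n interleaved complex values {re0, im0, re1, im1, ...} in place.
 * Hypothetical helper, written only to illustrate vget/vset on a tuple. */
void conj_interleaved_i32(int32_t *data, size_t n)
{
    assert(n == 0 || data != NULL);
#if defined(__riscv_vector)
    /* Assumption: the compiler supports RVV tuple types. */
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e32m1(n);
        /* Segment load: field 0 = real parts, field 1 = imaginary parts. */
        vint32m1x2_t z = __riscv_vlseg2e32_v_i32m1x2(data, vl);
        /* Extract field 1 as a plain (non-tuple) vector. */
        vint32m1_t im = __riscv_vget_v_i32m1x2_i32m1(z, 1);
        /* Any non-segment intrinsic can operate on the extracted field. */
        im = __riscv_vneg_v_i32m1(im, vl);
        /* Deposit the result back into field 1 of the tuple. */
        z = __riscv_vset_v_i32m1_i32m1x2(z, 1, im);
        /* Segment store writes both fields back interleaved. */
        __riscv_vsseg2e32_v_i32m1x2(data, z, vl);
        data += 2 * vl;
        n -= vl;
    }
#else
    /* Scalar fallback; negation wraps modulo 2^32 like the vector form. */
    for (size_t i = 0; i < n; ++i)
        data[2 * i + 1] = (int32_t)(0u - (uint32_t)data[2 * i + 1]);
#endif
}
```

Calling conj_interleaved_i32(z, 3) on a six-element array negates the elements at odd indices and leaves the rest untouched. Whether the vget/vset pair compiles to zero moves depends on the register allocator, so it is worth checking the generated assembly.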

leekillough · 2023-07-06T03:53:49Z

The tuple intrinsic type, since it's already a type outside of C/C++ proper, could have array indexing tuple[0 .. NFIELDS-1], and this would be a lot more straightforward. It would return an lvalue of a numbered field, and it would be a constraint violation to be outside of the range 0 .. NFIELDS-1 (or to use a value which isn't a compile-time constant).

Even if array subscripting is not practical, some intrinsic like __rvv_tuple_field() to return a numbered tuple field as an lvalue, which can be assigned to or converted to an rvalue, would be more intuitive than inserting or extracting, which sometimes requires creating extra variables that hopefully the compiler will merge with the tuples'.

Porting code which used the old syntax would also be a lot easier, since you would only need to replace things like xvec_real with xvec[0] or __rvv_tuple_field(xvec, 0), and xvec_imag with xvec[1] or __rvv_tuple_field(xvec, 1). It would work whether xvec[0] and xvec[1] ended up on the LHS or RHS of an assignment, and would not need to create new temporary local variables of vector type, or require the compiler to assign them to the tuple fields' same vector registers -- it would just access them directly.

kito-cheng · 2023-07-06T06:58:26Z

@leekillough honestly, we've considered adding subscripting syntax for tuple types. I can imagine it would be useful and much simpler for users, but unfortunately we lack the engineering resources to implement it :(

…code. However, it does not get correct results for complex BLIS routines which use segment loads (or call those that do). The intrinsic types check out and make sense, but it returns wrong answers. It's probably something really simple.

For historical reference, see:
riscv-non-isa/riscv-c-api-doc#43
flame#737 (comment)
https://reviews.llvm.org/D152134
riscv-non-isa/rvv-intrinsic-doc#139
riscv-non-isa/rvv-intrinsic-doc#198
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/02_vector_unit-stride_segment_load_store_instructions_zvlsseg.md
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/03_vector_stride_segment_load_store_instructions_zvlsseg.md
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/auto-generated/intrinsic_funcs/04_vector_indexed_segment_load_store_instructions_zvlsseg.md

leekillough · 2023-07-06T10:26:45Z

> @leekillough honestly, we've considered adding subscripting syntax for tuple types. I can imagine it would be useful and much simpler for users, but unfortunately we lack the engineering resources to implement it :(

Here's a preview of what not having such a feature would require doing, unless I'm missing something:

Use the new tuple intrinsics to get rid of build errors in X280 BLIS

nick-knight mentioned this issue Jun 29, 2023

Add sifive_x280 configuration flame/blis#737

Merged
