-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Rename SVE2 AddSaturate signed/unsigned addend API #122283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Rename Sve2.AddSaturateWithUnsignedAddend/AddSaturateWithSignedAddend to Sve2.AddSaturate so that it aligns with the approved SVE2 API.
|
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
| // always predicated. Therefore, separate HWIntrinsic IDs are used here instead of emitting the | ||
| // corresponding instructions in the codegen with the same HWIntrinsic. | ||
|
|
||
| if (op1BaseJitType != op2BaseJitType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear why this is needed instead of using the auxiliaryJitType field, which exists to allow such special differentiations.
We tend to prefer fewer IDs where we can as it simplifies the handling and matching of patterns elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My consideration was that AddSaturate can be either predicated (for SVE2 only) or unpredicated (for SVE/SVE2), while AddSaturateWithSigned/UnsignedAddend is only in predicated form in SVE2. Their hwintrinsic flags are slightly different (OptionalEmbeddedMasked vs EmbeddedMasked). I'm not sure how I should implement the different embedded mask behaviors using the same intrinsic ID. Is there a better way to handle this in the codegen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't there already be NI_Sve_AddSaturate (unpredicated) vs NI_Sve2_AddSaturate (either), in which case the latter fits into the same general buckets as any other intrinsic that can potentially be predicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when using HW_Flag_OptionalEmbeddedMaskedOperation, it does not wrap the node with a ConditionalSelect (embedded mask) during lowering, but the SUQADD/USQADD instructions always require a mask (even if not given explicitly). Would it be better to mark NI_Sve2_AddSaturate as HW_Flag_EmbeddedMaskedOperation then check if it meets the criteria to use the unpredicated SQADD/UQADD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would likely be the route I'd go, yes.
This just seems like one other case where we have an intrinsic with special semantics and so it requires minor additional handling when doing lowering and instruction selection; otherwise all the other handling in the front end should be the same.
The flags are mostly supposed to be hints and we often have helpers that query the flag and specially handle any remaining edges, so that other code isn't querying the flags directly. Likewise we tend to try to keep all the HIR (pre-lowering) IR simpler, as it makes pattern matching, value numbering, constant folding, and other phases simpler. While LIR is when we start really getting into machine and microarchitecture specific concepts (like register allocation, codegen, and emit) and so that's where we largely expect these complexities to start surfacing.
Fixes #121961.
Rename
Sve2.AddSaturateWithUnsignedAddend/AddSaturateWithSignedAddendtoSve2.AddSaturateso that it aligns with the approved SVE2 API.Since the Sve2.AddSaturate intrinsic is used for multiple instructions, special intrinsics are used internally for the signed/unsigned addend variants. The AddSaturate intrinsic is marked with the SpecialImport flag so that the actual instruction can be decided based on the operand types during importing.
@dotnet/arm64-contrib @a74nh