New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SR-10166] Optimizer fails to collapse loops to constant return without using unchecked arithmetic #52568
Comments
cc shajrawi (JIRA User) |
Comment by Joe Shajrawi (JIRA) I have looked at the resulting SIL - in the unchecked case LLVM is turning it into a constant: // testSwiftRangeLoopUnchecked()
sil hidden @$s8LoopTest014testSwiftRangeA9Uncheckeds5Int32VyF : $@convention(thin) () -> Int32 {
bb0:
%0 = integer_literal $Builtin.Int32, 0 // user: %7
%1 = integer_literal $Builtin.Int64, 0 // user: %7
%2 = integer_literal $Builtin.Int64, 999 // user: %16
%3 = integer_literal $Builtin.Int1, -1 // user: %12
%4 = integer_literal $Builtin.Int64, 1 // user: %12
%5 = integer_literal $Builtin.Int32, 2 // user: %14
%6 = integer_literal $Builtin.Int1, 0 // user: %14
br bb2(%0 : $Builtin.Int32, %1 : $Builtin.Int64) // id: %7
bb1: // Preds: bb2
%8 = struct $Int32 (%15 : $Builtin.Int32) // user: %9
return %8 : $Int32 // id: %9
// %10 // user: %14
// %11 // user: %12
bb2(%10 : $Builtin.Int32, %11 : $Builtin.Int64): // Preds: bb3 bb0
%12 = builtin "sadd_with_overflow_Int64"(%11 : $Builtin.Int64, %4 : $Builtin.Int64, %3 : $Builtin.Int1) : $(Builtin.Int64, Builtin.Int1) // user: %13
%13 = tuple_extract %12 : $(Builtin.Int64, Builtin.Int1), 0 // users: %18, %16
%14 = builtin "sadd_with_overflow_Int32"(%10 : $Builtin.Int32, %5 : $Builtin.Int32, %6 : $Builtin.Int1) : $(Builtin.Int32, Builtin.Int1) // user: %15
%15 = tuple_extract %14 : $(Builtin.Int32, Builtin.Int1), 0 // users: %18, %8
%16 = builtin "cmp_eq_Int64"(%13 : $Builtin.Int64, %2 : $Builtin.Int64) : $Builtin.Int1 // user: %17
cond_br %16, bb1, bb3 // id: %17
bb3: // Preds: bb2
br bb2(%15 : $Builtin.Int32, %13 : $Builtin.Int64) // id: %18
} // end sil function '$s8LoopTest014testSwiftRangeA9Uncheckeds5Int32VyF' In the unchecked case we are not so lucky: // testSwiftRangeLoop()
sil hidden @$s8LoopTest014testSwiftRangeA0s5Int32VyF : $@convention(thin) () -> Int32 {
bb0:
%0 = integer_literal $Builtin.Int32, 0 // user: %6
%1 = integer_literal $Builtin.Int64, 0 // user: %6
%2 = integer_literal $Builtin.Int64, 999 // user: %17
%3 = integer_literal $Builtin.Int1, -1 // users: %13, %11
%4 = integer_literal $Builtin.Int64, 1 // user: %11
%5 = integer_literal $Builtin.Int32, 2 // user: %13
br bb2(%0 : $Builtin.Int32, %1 : $Builtin.Int64) // id: %6
bb1: // Preds: bb2
%7 = struct $Int32 (%14 : $Builtin.Int32) // user: %8
return %7 : $Int32 // id: %8
// %9 // user: %13
// %10 // user: %11
bb2(%9 : $Builtin.Int32, %10 : $Builtin.Int64): // Preds: bb3 bb0
%11 = builtin "sadd_with_overflow_Int64"(%10 : $Builtin.Int64, %4 : $Builtin.Int64, %3 : $Builtin.Int1) : $(Builtin.Int64, Builtin.Int1) // user: %12
%12 = tuple_extract %11 : $(Builtin.Int64, Builtin.Int1), 0 // users: %19, %17
%13 = builtin "sadd_with_overflow_Int32"(%9 : $Builtin.Int32, %5 : $Builtin.Int32, %3 : $Builtin.Int1) : $(Builtin.Int32, Builtin.Int1) // users: %15, %14
%14 = tuple_extract %13 : $(Builtin.Int32, Builtin.Int1), 0 // users: %19, %7
%15 = tuple_extract %13 : $(Builtin.Int32, Builtin.Int1), 1 // user: %16
cond_fail %15 : $Builtin.Int1 // id: %16
%17 = builtin "cmp_eq_Int64"(%12 : $Builtin.Int64, %2 : $Builtin.Int64) : $Builtin.Int1 // user: %18
cond_br %17, bb1, bb3 // id: %18
bb3: // Preds: bb2
br bb2(%14 : $Builtin.Int32, %12 : $Builtin.Int64) // id: %19
} // end sil function '$s8LoopTest014testSwiftRangeA0s5Int32VyF' We are missing a bunch of optimizations that would allow us to "properly" handle this on SIL Level, such as Induction Variable Simplification... I have tested a quick "cheat" for this problem - changing loop unrolling with a new heuristic that ignores the trip count if we are reasonably sure we can constant propagate the loop: sil hidden @$s8LoopTest014testSwiftRangeA0s5Int32VyF : $@convention(thin) () -> Int32 {
bb0:
%0 = integer_literal $Builtin.Int32, 1998 // user: %1
%1 = struct $Int32 (%0 : $Builtin.Int32) // user: %2
return %1 : $Int32 // id: %2
} // end sil function '$s8LoopTest014testSwiftRangeA0s5Int32VyF' However, I would not be comfortable committing this heuristic: it is the wrong way to do this - would unnecessarily increase compile time, and, if we are wrong about being able to fold said loop we may blow up code size. The right way to do this would be adding the missing Loop Transformations to the SIL Optimizer. |
@swift-ci create |
Environment
Reproduced on macOS and Linux with Swift 4.2
Additional Detail from JIRA
md5: 3c4a456d5658c054098ca7d3eb5707ec
Issue Description:
I would expect the optimizer to collapse the following equivalent loops into a constant return of an immediate value (1998). However it appears that the checked variant C-style while loop generates less code than the Swift-style for over a range, both checked variants generate actual loops, and both unchecked variants replace the loop with an immediate value (which is the optimization I would expect).
The text was updated successfully, but these errors were encountered: