[SR-10909] Missed optimization for modulo #53299

palimondo · 2019-06-10T15:35:20Z


Previous ID	SR-10909
Radar	None
Original Reporter	@palimondo
Type	Bug

Additional Detail from JIRA


Votes	0
Component/s	Compiler
Labels	Bug, Performance
Assignee	None
Priority	Medium

md5: d8c5b88b3f3e6a6164d9118e82dd5949

Issue Description:

In PR #25286 When applying O'Neill's modulo optimization to Lemire's Nearly Divisionless Random Number Generation, Swift does not produce the tight instructions that make it work in C. The original modulo:

let t = (0 &- upperBound) % upperBound

O'Neill's optimization:

var t = (0 &- upperBound)
 if (t >= upperBound) {
  t -= upperBound
  if (t >= upperBound) {
    t %= upperBound
  }
 }

See the assembly in Godbolt's Compiler Explorer:

Probable cause preventing better optimization is the guard against division by zero, but given that the method is already guarded by

_precondition(upperBound != 0, "upperBound cannot be zero." )

it should be possible to eliminate it...

Edit: I suspect I have misinterpreted the situation. The above assembly comparison wasn't fair as I took a too narrow code snippet I suspected to be problematic and took out code that included important hints for the Swift compiler to perform optimization. See comments below.

palimondo · 2019-06-10T15:36:16Z

cc @stephentyrone

stephentyrone · 2019-06-10T16:23:06Z

My recollection is that _precondition does not imply an assume, but we should also be able to deduce this from the fact that the type is unsigned and earlier we have upperBound > something, IIRC.

palimondo · 2019-06-10T16:42:29Z

Yes, the whole modulo operation, where this optimization would come in is inside

if m.low < upperBound {
  … 
}

palimondo · 2019-06-10T16:44:53Z

Would such optimization be in the realm of Swift, or is this already some LLVM stuff?

stephentyrone · 2019-06-10T16:46:43Z

It's up to Swift to make sure that the necessary information reaches the LLVM layer. It's up to LLVM to take advantage of it to do the optimization. So, both.

palimondo · 2019-06-10T16:57:03Z

Could you please ping me on a PR that fixes this? I'd love to learn how that's done.

belkadan · 2019-06-10T18:53:32Z

Is there a self-contained example that can go in this bug? Neither the precondition nor the `>` check are present in the linked assembly generation.

palimondo · 2019-06-10T22:20:02Z

The relevant method is in extension of RandomNumberGenerator in this commit here. I'm not sure how to replicate _precondition outside of stdlib internals. The subtraction on line 102 should probably be overflowing for better performance. For completeness, here it is again with that fix:

@inlinable
  public mutating func next<T: FixedWidthInteger & UnsignedInteger>(
    upperBound: T
  ) -> T {
    _precondition(upperBound != 0, "upperBound cannot be zero.")
    var random: T = next()
    var m = random.multipliedFullWidth(by: upperBound)
    if m.low < upperBound {
      var t = 0 &- upperBound
      if t >= upperBound {
        t &-= upperBound
        if t >= upperBound {
          t %= upperBound
        }
      }
      while m.low < t {
        random = next()
        m = random.multipliedFullWidth(by: upperBound)
      }
    }
    return m.high
  }
}

belkadan · 2019-06-11T01:57:32Z

Regular precondition is probably close enough. Or does that change the results significantly?

palimondo · 2019-06-11T20:58:47Z

I suspect I have misinterpreted the situation. The Swift vs. C assembly comparison wasn't fair as I took a too narrow code snippet I suspected to be problematic and took out code that included important hints for the Swift compiler to perform optimization. Further, the original code was C++!

I re-did the Godbolt Compiler Explorer comparison again:

Swift
C++

I believe the generated instructions are largely equivalent.

Looking again at the original article by O'Neill, the modulo optimization provides a slight improvement on Large-Shuffle Benchmark, but on Small-Shuffle Benchmark the version with modulo optimization performs the same as Lemire's original version "Debiased Int Mult (t-opt)" in chart. Our benchmarks in SBS most closely correspond to small shuffle, which would explain why I saw no improvement from modulo optimization on our benchmarks.

@stephentyrone I guess I should come up with a synthetic benchmark to mimic O'Neill's Large-Shuffle and we could re-introduce the modulo optimization together with your rejection sampling. Could you please review my analysis and close this bug as invalid, if you agree?

zoecarver · 2020-05-14T20:26:02Z

I think the reason it's not optimized is because of the overflow checks. If you dump the llvm-ir, you can see that clang doesn't do this optimization itself, it just produces IR that LLVM can optimize.

If you change the example a bit and remove the `>=` checks: https://swift.godbolt.org/z/Psgou5 you get the optimization.

swift-ci transferred this issue from apple/swift-issues Apr 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SR-10909] Missed optimization for modulo #53299

[SR-10909] Missed optimization for modulo #53299

palimondo mannequin commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

stephentyrone commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

stephentyrone commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

belkadan commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

belkadan commented Jun 11, 2019

palimondo mannequin commented Jun 11, 2019

zoecarver commented May 14, 2020

[SR-10909] Missed optimization for modulo #53299

[SR-10909] Missed optimization for modulo #53299

Comments

palimondo mannequin commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

stephentyrone commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

stephentyrone commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

belkadan commented Jun 10, 2019

palimondo mannequin commented Jun 10, 2019

belkadan commented Jun 11, 2019

palimondo mannequin commented Jun 11, 2019

zoecarver commented May 14, 2020