Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-7842] Inconsistent codegen for closed ranges vs bit-manipulation #50378

Open
milseman mannequin opened this issue Jun 1, 2018 · 15 comments
Open

[SR-7842] Inconsistent codegen for closed ranges vs bit-manipulation #50378

milseman mannequin opened this issue Jun 1, 2018 · 15 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself good first issue Good for newcomers performance SILOptimizer Area → compiler: SIL optimization passes standard library Area: Standard library umbrella

Comments

@milseman
Copy link
Mannequin

milseman mannequin commented Jun 1, 2018

Previous ID SR-7842
Radar rdar://problem/40725067
Original Reporter @milseman
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Bug, Optimizer, Performance, StarterBug
Assignee None
Priority Medium

md5: 223093f3dc3457d4c7f42d8a0e46938f

Issue Description:

ClosedRange<UInt8> demonstrates poor codegen compared to explicit bit manipulation.

@usableFromInline
func _isContinuation_bits(_ x: UInt8) -> Bool {
  return x & 0b1100_0000 == 0b1000_0000
}

@usableFromInline
func _isContinuation_range(_ x: UInt8) -> Bool {
  let continuation: ClosedRange<UInt8> = 0x80...0xBF
  return continuation.contains(x)
}

for i in 0...UInt8.max {
  guard _isContinuation_bits(i) == _isContinuation_range(i) else { fatalError("mismatch") }
}
print("done")

The code generated for _isContinuation_bits is superior:

                     _$S3foo20_isContinuation_bitsySbs5UInt8VF:
000000010000701c         and        w8, w0, #&#8203;0xc0
0000000100007020         cmp        w8, #&#8203;0x80
0000000100007024         cset       w0, eq
0000000100007028         ret
                        ; endp

                     _$S3foo21_isContinuation_rangeySbs5UInt8VF:
000000010000702c         and        w8, w0, #&#8203;0xff
0000000100007030         sxtb       w9, w0
0000000100007034         cmp        w9, #&#8203;0x0
0000000100007038         cset       w9, lt
000000010000703c         cmp        w8, #&#8203;0xc0
0000000100007040         cset       w8, lo
0000000100007044         and        w0, w9, w8
0000000100007048         ret
                        ; endp
@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

@swift-ci create

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

@eeckstein this is not super urgent, but will affect performance of small strings with UTF-8 support and UTF-8 validation in general (especially applicable to server-side). Do you think this could be tackled this summer?

I'll update this with some other codegen oddities I see.

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

A case where the range is superior to bits and compare:

@usableFromInline
func _isNotOverlong_E0_range(_ x: UInt8) -> Bool {
  return (0xA0...0xBF).contains(x)
}
@usableFromInline
func _isNotOverlong_E0_bits(_ x: UInt8) -> Bool {
  return _isContinuation_bits(x) && x >= 0xA0
}

                     _$S15SmallUTF8String23_isNotOverlong_E0_rangeySbs5UInt8VF:
0000000000000104         and        w8, w0, #&#8203;0xe0
0000000000000108         cmp        w8, #&#8203;0xa0
000000000000010c         cset       w0, eq
0000000000000110         ret
                        ; endp

                     _$S15SmallUTF8String22_isNotOverlong_E0_bitsySbs5UInt8VF:
0000000000000114         and        w8, w0, #&#8203;0xff
0000000000000118         and        w9, w0, #&#8203;0xc0
000000000000011c         cmp        w9, #&#8203;0x80
0000000000000120         cset       w9, eq
0000000000000124         cmp        w8, #&#8203;0x9f
0000000000000128         cset       w8, hi
000000000000012c         and        w0, w8, w9
0000000000000130         ret
            ; endp

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

A case where bits is very similar to range, but slightly superior (`and` being better than `sxtb` on some uArches):

@usableFromInline
func _isNotOverlong_ED_range(_ x: UInt8) -> Bool {
  return (0x80...0x9F).contains(x)
}
@usableFromInline
func _isNotOverlong_ED_bits(_ x: UInt8) -> Bool {
  return _isContinuation_bits(x) && x <= 0x9F
}



                     _$S15SmallUTF8String23_isNotOverlong_ED_rangeySbs5UInt8VF:
0000000000000144         and        w8, w0, #&#8203;0xff
0000000000000148         sxtb       w9, w0
000000000000014c         cmp        w9, #&#8203;0x0
0000000000000150         cset       w9, lt
0000000000000154         cmp        w8, #&#8203;0xa0
0000000000000158         cset       w8, lo
000000000000015c         and        w0, w9, w8
0000000000000160         ret
                        ; endp

                     _$S15SmallUTF8String22_isNotOverlong_ED_bitsySbs5UInt8VF:
0000000000000164         and        w8, w0, #&#8203;0xff
0000000000000168         and        w9, w0, #&#8203;0xc0
000000000000016c         cmp        w9, #&#8203;0x80
0000000000000170         cset       w9, eq
0000000000000174         cmp        w8, #&#8203;0xa0
0000000000000178         cset       w8, lo
000000000000017c         and        w0, w8, w9
0000000000000180         ret
                        ; endp

@eeckstein
Copy link
Member

Yes, the summer timeframe looks reasonable

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

Attached foo.swift and foo.txt to demonstrate all the needed functionality for efficient UTF-8 validation byte-by-byte. It will likely be adjusted some day to use wider loads, but this gives good and relevant optimizer fodder!

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

Investigating this could be a good performance starter bug too, as it's fairly self-contained. It may span conventions across both SIL and LLVM, though.

@milseman
Copy link
Mannequin Author

milseman mannequin commented Jun 1, 2018

@swift-ci update

@belkadan
Copy link
Contributor

belkadan commented Jun 4, 2018

Please don't tag something as a Starter Bug unless you already know the basic idea of how to fix it.

@swift-ci
Copy link
Collaborator

Comment by Erik Verbruggen (JIRA)

So there is @usableFromInline, so I assume it's swift 4.2/5.0/master? If so, is it possible to convince swift to build on macos 10.13?

@swift-ci
Copy link
Collaborator

Comment by Erik Verbruggen (JIRA)

Nevermind the previous question, 4.1 generates the same code.

@swift-ci
Copy link
Collaborator

Comment by Erik Verbruggen (JIRA)

So, how do I tell swiftc to apply optimisation to SIL, but to emit unoptimised LLVM IR? Just supplying -O will run optimisation for both.

@belkadan
Copy link
Contributor

We should really have an -emit-irgen option, but meanwhile you should be able to use -O -Xfrontend -disable-llvm-optzns, which is almost the same thing.

@eeckstein
Copy link
Member

IMO, this is something what should be handled by an LLVM peephole optimization.

@swift-ci
Copy link
Collaborator

Comment by Erik Verbruggen (JIRA)

For all functions, the LLVM IR is a nearly direct translation of the Swift code/SIL when disabling LLVM optimisations. So an easy observation is that not optimising _isContinuation_range is a missing pass. The "inverse" also holds: because LLVM doesn't recognise the bit fiddling done in _isContinuation_bits, it can't reconstruct the range, and that results in _isNotOverlong_E0_bits having the bit check, while _isNotOverlong_E0_range has the ranges folded. So, I'd say that _isContinuation_range is the best implementation for allowing LLVM to optimise it when it's inlined (ok, and the most readable implementation), but yeah, a bit fiddling pass would be useful to optimise _isContinuation_bits would be great.

I do agree with @eeckstein that the best place probably is a peephole pass, but I'll leave that to someone else as I don't have any experience in the MI layer. (If someone has an example of such a pass though, I'd be interested, just to learn about it.)

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@AnthonyLatsis AnthonyLatsis added SILOptimizer Area → compiler: SIL optimization passes standard library Area: Standard library umbrella and removed Optimizer labels Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself good first issue Good for newcomers performance SILOptimizer Area → compiler: SIL optimization passes standard library Area: Standard library umbrella
Projects
None yet
Development

No branches or pull requests

4 participants