[SR-7598] Strange performance of String.utf8.map #50140

Open
swift-ci opened this issue May 3, 2018 · 3 comments
Labels: bug, standard library

Comments


swift-ci commented May 3, 2018

Previous ID: SR-7598
Radar: None
Original Reporter: iosdevzone (JIRA User)
Type: Bug

Environment:
swift --version
Apple Swift version 4.1.1 (swiftlang-902.0.50 clang-902.0.37.1)
Target: x86_64-apple-darwin17.5.0

Additional Detail from JIRA:
Votes: 0
Component/s: Standard Library
Labels: Bug
Assignee: @milseman
Priority: Medium

md5: 4ce262e34afdb51cfec680c00edb3971

Issue Description:

Further to SR-7511, while trying to generate benchmarks for @milseman I ran into some rather strange performance issues with String.utf8.map.

A large string can take minutes for String.utf8.map to process (albeit on my antediluvian MacBook Pro), but if you first process the string in certain ways, for example by copying it with let chunk = String(testString.prefix(testString.count)), the same map call runs on the order of seconds. It is worth noting that String.utf16 does not suffer from the same problem.

The following little program illustrates the problem. Run it with no arguments to see the poor performance, or with -chunk to see good performance.

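import Foundation

// Pass -chunk to copy the string before mapping over its UTF-8 view.
var chunk = false
for argument in CommandLine.arguments {
    switch argument {
    case "-chunk": chunk = true
    default: break
    }
}

// The string is loaded through Foundation.
let testString = try! String(contentsOfFile: "/usr/share/dict/words")
if chunk {
    // Reported as fast on 4.1.1: map over a copy of the whole string.
    let chunk = String(testString.prefix(testString.count))
    let a = chunk.utf8.map { $0 }
    print("\(a.count)")
} else {
    // Reported as slow on 4.1.1: map over the string exactly as loaded.
    let a = testString.utf8.map { $0 }
    print("\(a.count)")
}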
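
For anyone who would rather time the variants in-process than with the shell's time, here is a rough sketch. The timed helper and the synthetic sample string are my own assumptions, not part of the program above. Note that a string built this way is native Swift storage, so it probably will not reproduce the slow case, which seems to need a string loaded through Foundation. Swap in testString to measure the real thing.

```swift
import Foundation

// Hypothetical helper (not part of the original report): runs `body` once
// and returns its result together with the elapsed wall-clock seconds.
func timed<T>(_ body: () -> T) -> (result: T, seconds: Double) {
    let start = Date()
    let result = body()
    return (result, Date().timeIntervalSince(start))
}

// Assumption: a synthetic native string stands in for /usr/share/dict/words.
let sample = String(repeating: "antediluvian\n", count: 10_000)

// Map over the UTF-8 view directly, over a copy, and over the UTF-16 view.
let direct = timed { sample.utf8.map { $0 } }
let copied = timed { String(sample.prefix(sample.count)).utf8.map { $0 } }
let wide = timed { sample.utf16.map { $0 } }

print("utf8 direct: \(direct.result.count) code units in \(direct.seconds)s")
print("utf8 copied: \(copied.result.count) code units in \(copied.seconds)s")
print("utf16:       \(wide.result.count) code units in \(wide.seconds)s")
```

A single run like this is noisy, so treat the numbers as a rough indication only.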

milseman mannequin commented May 4, 2018

iosdevzone (JIRA User), I actually see the reverse with a recent 4.2 branch build:

swiftc -O foo.swift 
time ./foo # 0.085 total
time ./foo -chunk # 0.560 total

I can further improve the non-chunk time to ~0.061 by appending "x" to the string in order to trigger eager bridging, but that regresses the chunk time, probably because it forces another eager copy.
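
A minimal sketch of one way to read "appending x", in case it helps anyone reproduce the numbers. The variable names are hypothetical and the exact modification made to foo.swift is not shown in this thread. The literal stands in for the string loaded from disk so that the snippet runs anywhere; on its own it is already native, so it only illustrates the shape of the change, not the speedup.

```swift
// Assumption: in the real benchmark `loaded` would come from
// String(contentsOfFile:) rather than from a literal.
let loaded = "aardvark\nabacus\nzebra\n"

// Appending mutates the string. The assumption here is that the mutation
// is what forces the contents into native Swift storage.
var native = loaded
native.append("x")

let bytes = native.utf8.map { $0 }

// Drop the trailing "x" byte so the result matches the original contents.
let original = Array(bytes.dropLast())
print("\(original.count)")
```

Dropping the last byte afterwards keeps the output comparable with the unmodified run.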


milseman mannequin commented May 4, 2018

Using a 4.1 beta I had lying around:

swiftc -O foo.swift  
time ./foo # 38.117 total
time ./foo -chunk # 1.044 total
# modify foo.swift to append "x"
time ./foo # 0.092 total
time ./foo -chunk # 0.359 total

So hooray for 4.2 (except that last chunk measurement)!

I think there's still a fair bit of perf to be gained against the UTF-8 view, even for lazily bridged Strings. Thanks for the benchmark code; I'll see about adapting it.


swift-ci commented May 5, 2018

Comment by iOSDevZone (JIRA)

I'm trying to pull together a menagerie of benchmarks and I will try to get them to you in the form of a PR as we discussed. I just thought this one was sufficiently puzzling and interesting to warrant a report here. I'll also pull a recent snapshot and try to test against that too. Good to see that some of these issues are fixed already!

In the meantime if you want a sneak peek, you'll find a repo of Xcode tests here: https://github.com/iosdevzone/IDZSwiftStringBenchmarks

There's more on the way, so I wouldn't bother adapting any of it yet; I'll probably automate that!

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022