[SR-7869] String.init(decoding:sourceEncoding:) Suffers Lack of Speed #50404

Closed
swift-ci opened this issue Jun 4, 2018 · 8 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@swift-ci
Collaborator

swift-ci commented Jun 4, 2018

Previous ID SR-7869
Radar None
Original Reporter calebkleveter (JIRA User)
Type Bug
Status Resolved
Resolution Done
Environment

macOS 10.13.4 (17E202)

Xcode 9.4 (9F1027a)

Apple Swift version 4.1.1 (swiftlang-902.0.53 clang-902.0.39.2)

Target: x86_64-apple-darwin17.5.0

Additional Detail from JIRA
Votes 1
Component/s Standard Library
Labels Bug
Assignee @milseman
Priority Medium

md5: 779e5c587cc70109119e81ca1b157c0d

Issue Description:

When converting an array of UInt8 to a String, there are a couple of ways to do it. You can convert the bytes to Data and use the Data initializer:

String.init(data: Data(bytes), encoding: .utf8)

Or decoding it as UTF-8:

String.init(decoding: bytes, as: UTF8.self)

Although the second initializer has much cleaner syntax, it is far slower. I ran some benchmarks, performing the operation 1,000,000 (one million) times per measurement, and got the following results:

String(data:encoding:) 0.902 sec:

Test Suite 'Selected tests' started at 2018-06-04 14:26:24.631

Test Suite 'CSVTests.xctest' started at 2018-06-04 14:26:24.631

Test Suite 'CSVTests' started at 2018-06-04 14:26:24.632

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' started.

/Users/calebkleveter/Development/CSV/Tests/CSVTests/CSVTests.swift:117: Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' measured [Time, seconds] average: 0.902, relative standard deviation: 1.434%, values: [0.897303, 0.938089, 0.900397, 0.905873, 0.890864, 0.907514, 0.897863, 0.899629, 0.890566, 0.896836], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' passed (9.545 seconds).

Test Suite 'CSVTests' passed at 2018-06-04 14:26:34.178.

Executed 1 test, with 0 failures (0 unexpected) in 9.545 (9.546) seconds

Test Suite 'CSVTests.xctest' passed at 2018-06-04 14:26:34.179.

Executed 1 test, with 0 failures (0 unexpected) in 9.545 (9.547) seconds

Test Suite 'Selected tests' passed at 2018-06-04 14:26:34.179.

Executed 1 test, with 0 failures (0 unexpected) in 9.545 (9.548) seconds

Program ended with exit code: 0

String(decoding:as:) 18.517 sec:

Test Suite 'Selected tests' started at 2018-06-04 13:58:52.495

Test Suite 'CSVTests.xctest' started at 2018-06-04 13:58:52.497

Test Suite 'CSVTests' started at 2018-06-04 13:58:52.498

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' started.

/Users/calebkleveter/Development/CSV/Tests/CSVTests/CSVTests.swift:117: Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' measured [Time, seconds] average: 18.517, relative standard deviation: 1.796%, values: [18.325596, 17.819145, 18.555953, 18.594838, 18.493036, 18.182990, 19.057471, 18.755528, 18.531200, 18.854238], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' passed (185.769 seconds).

Test Suite 'CSVTests' passed at 2018-06-04 14:01:58.267.

Executed 1 test, with 0 failures (0 unexpected) in 185.769 (185.770) seconds

Test Suite 'CSVTests.xctest' passed at 2018-06-04 14:01:58.268.

Executed 1 test, with 0 failures (0 unexpected) in 185.769 (185.771) seconds

Test Suite 'Selected tests' passed at 2018-06-04 14:01:58.268.

Executed 1 test, with 0 failures (0 unexpected) in 185.769 (185.773) seconds

Program ended with exit code: 0

This is the test method I used:

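func testBytesToStringSpeed() {
    // 36 ASCII bytes spelling "1234567890qwertyuiopasdfghjklzxcvbnm"
    let bytes: [UInt8] = [49, 50, 51, 52, 53, 54, 55, 56, 57, 48, 113, 119, 101, 114, 116, 121, 117, 105, 111, 112, 97, 115, 100, 102, 103, 104, 106, 107, 108, 122, 120, 99, 118, 98, 110, 109]
    measure {
        // Note: the closed range 0...1_000_000 runs 1,000,001 iterations.
        for _ in 0...1_000_000 {
            _ = String.init(data: Data(bytes), encoding: .utf8)
            // Swap in the line below to time the decoding initializer instead:
            // _ = String.init(decoding: bytes, as: UTF8.self)
        }
    }
}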
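
For anyone who wants to reproduce the comparison outside of XCTest, here is a minimal sketch of a side-by-side timing script. The helper `elapsedSeconds` and the `sink` accumulator are hypothetical names introduced for illustration, not part of the original test, and the iteration count is reduced to keep the script quick. Absolute numbers will vary with the Swift version, optimization level, and hardware.

```swift
import Foundation

// Hypothetical helper (not from the original report): runs `body`
// `iterations` times and returns the elapsed time in seconds,
// measured with the monotonic system uptime clock.
func elapsedSeconds(iterations: Int, _ body: () -> Void) -> Double {
    let start = ProcessInfo.processInfo.systemUptime
    for _ in 0..<iterations { body() }
    return ProcessInfo.processInfo.systemUptime - start
}

// Same 36 ASCII bytes as in the test method.
let sample: [UInt8] = Array("1234567890qwertyuiopasdfghjklzxcvbnm".utf8)

// Accumulate the result lengths so the conversions cannot be discarded as unused.
var sink = 0

let viaData = elapsedSeconds(iterations: 10_000) {
    sink &+= String(data: Data(sample), encoding: .utf8)?.utf8.count ?? 0
}
let viaDecoding = elapsedSeconds(iterations: 10_000) {
    sink &+= String(decoding: sample, as: UTF8.self).utf8.count
}

print("String(data:encoding:) took \(viaData) s")
print("String(decoding:as:) took \(viaDecoding) s")
```

Build with optimizations enabled (for example `swiftc -O`) before comparing the two numbers, since debug builds can distort the ratio.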
@weissi
Member

weissi commented Jun 4, 2018

Cc @milseman

@swift-ci
Collaborator Author

swift-ci commented Jun 4, 2018

Comment by Caleb Kleveter (JIRA)

I thought this might be relevant. The String(bytes:encoding:) initializer is faster than converting to Data and then to a String. Using the same measurement as before, I got 0.333 sec.

func testBytesToStringSpeed() {
    let bytes: [UInt8] = [49, 50, 51, 52, 53, 54, 55, 56, 57, 48, 113, 119, 101, 114, 116, 121, 117, 105, 111, 112, 97, 115, 100, 102, 103, 104, 106, 107, 108, 122, 120, 99, 118, 98, 110, 109]
    measure {
        for _ in 0...1_000_000 {
            guard let _ = String(bytes: bytes, encoding: .utf8) else {
                XCTFail()
                return
            }
        }
    }
}
Test Suite 'Selected tests' started at 2018-06-04 16:22:52.128

Test Suite 'CSVTests.xctest' started at 2018-06-04 16:22:52.130

Test Suite 'CSVTests' started at 2018-06-04 16:22:52.131

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' started.

/Users/calebkleveter/Development/CSV/Tests/CSVTests/CSVTests.swift:117: Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' measured [Time, seconds] average: 0.333, relative standard deviation: 1.154%, values: [0.337403, 0.328976, 0.340650, 0.337263, 0.329202, 0.329619, 0.333868, 0.332411, 0.334991, 0.330322], performanceMetricID:com.apple.XCTPerformanceMetric_WallClockTime, baselineName: "", baselineAverage: , maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.100, maxStandardDeviation: 0.100

Test Case '-[CSVTests.CSVTests testBytesToStringSpeed]' passed (3.901 seconds).

Test Suite 'CSVTests' passed at 2018-06-04 16:22:56.033.

Executed 1 test, with 0 failures (0 unexpected) in 3.901 (3.902) seconds

Test Suite 'CSVTests.xctest' passed at 2018-06-04 16:22:56.033.

Executed 1 test, with 0 failures (0 unexpected) in 3.901 (3.903) seconds

Test Suite 'Selected tests' passed at 2018-06-04 16:22:56.034.

Executed 1 test, with 0 failures (0 unexpected) in 3.901 (3.905) seconds

Program ended with exit code: 0
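
Speed aside, the three initializers compared in this thread do not behave identically on malformed input, which matters when picking one. The sketch below illustrates the difference; the `valid` and `invalid` byte arrays are made up for the example and are not from the benchmark.

```swift
import Foundation

let valid: [UInt8] = [49, 50, 51]        // "123"
let invalid: [UInt8] = [49, 0xFF, 51]    // 0xFF never appears in well-formed UTF-8

// The Foundation initializers are failable and return nil for invalid UTF-8.
let fromData: String? = String(data: Data(invalid), encoding: .utf8)
let fromBytes: String? = String(bytes: invalid, encoding: .utf8)

// The standard library initializer always succeeds: it repairs invalid
// sequences by substituting U+FFFD (the replacement character).
let decoded: String = String(decoding: invalid, as: UTF8.self)

print(fromData as Any, fromBytes as Any, decoded)

// On well-formed input all three agree.
let roundTrip = String(decoding: valid, as: UTF8.self)
print(roundTrip == String(bytes: valid, encoding: .utf8))
```

So String(decoding:as:) is a drop-in replacement only when silently repairing bad bytes is acceptable; code that needs to reject malformed input still has to use a failable path.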

@milseman
Mannequin

milseman mannequin commented Jun 7, 2018

Thanks for providing all of these! I'll collect all of these into a benchmark for the project so that we can track changes and fixes. Please keep us updated with anything else you find in the meantime.

@milseman
Mannequin

milseman mannequin commented Jun 14, 2018

Opened #17213 for the benchmarks.

@milseman
Mannequin

milseman mannequin commented Jun 15, 2018

This PR speeds up the example code here: #17244

Unfortunately, we still have some poor performance when it comes to transcoding/parsing the UTF-8 into UTF-16. These won't be fixed at this moment, but we're planning on tackling it from another direction soon. I'll close this bug when that PR is merged; feel free to open any followup reports.
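
For readers unfamiliar with the transcoding step mentioned above, here is a minimal sketch of converting UTF-8 code units to UTF-16 with the standard library's public `transcode` function. It only illustrates the kind of work involved; it is not the code path String used internally, and the sample bytes are an assumption chosen for the example.

```swift
// "café" encoded as UTF-8: the final character takes two bytes (0xC3 0xA9).
let utf8Bytes: [UInt8] = [0x63, 0x61, 0x66, 0xC3, 0xA9]

var utf16Units: [UInt16] = []

// transcode returns true if it encountered an encoding error.
// With stoppingOnError: false, invalid input is repaired rather than aborting.
let hadError = transcode(
    utf8Bytes.makeIterator(),
    from: UTF8.self,
    to: UTF16.self,
    stoppingOnError: false
) { utf16Units.append($0) }

print(utf16Units, hadError)
```

Each multi-byte UTF-8 sequence has to be parsed and validated before its UTF-16 form can be emitted, which is why this step shows up in the cost of building a String from raw bytes.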

@palimondo
Mannequin

palimondo mannequin commented Dec 20, 2018

@milseman You wanted to close this already, right?

@milseman
Mannequin

milseman mannequin commented Jan 2, 2019

Yes, this is tracked in the UTF8Decode.swift benchmarks, and currently String.init(decoding:as:) is around 8x faster than the init from Foundation.

@milseman
Mannequin

milseman mannequin commented Jan 22, 2019

In progress at #21959

edit: To clarify, this bug was originally that the stdlib's initializer was slow. Then we made the stdlib's way, way faster than Foundation's. That PR is now making Foundation's faster and more in line with the stdlib's by producing fast native Strings. All related general perf goodness.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.