[SR-14424] Small strings don't play well with 'init(unsafeUninitializedCapacity:initializingWith:)' #56780

Closed
xwu opened this issue Mar 30, 2021 · 3 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@xwu
Collaborator

xwu commented Mar 30, 2021

Previous ID SR-14424
Radar rdar://problem/76057106
Original Reporter @xwu
Type Bug
Status Resolved
Resolution Done
Additional Detail from JIRA
Votes 0
Component/s Standard Library
Labels Bug
Assignee None
Priority Medium

md5: 39168a2f7392a285b9cb9a4f674f7768

Issue Description:

```swift
let str = String(unsafeUninitializedCapacity: 2) {
    let last = $0.count &- 1
    $0[last] = 49 // "1"
    $0.baseAddress!.moveInitialize(from: $0.baseAddress! + last, count: 1)
    return 1
}
print(str) // "1"
print(str == "1") // false (!)
```

The unsafeUninitializedCapacity initializer requires users to return the initialized count and to leave the remaining bytes uninitialized. This is what's happening here. To wit:

  • Bytes are written from the end of the buffer towards the start as the string representation is being built up.

  • Once the final byte sequence for the string has been obtained, it is move-initialized to the start of the buffer (as per discussions on the forums, it is permitted to use moveInitialize with overlapping source and destination in this way).

  • This leaves the unused bytes uninitialized, as required by the documentation.

All of this works perfectly when the string isn't small. Digging through the sources, I understand that a _SmallString's buffer is backed by a tuple of (UInt64, UInt64). Although unused bytes are deinitialized here, they aren't zero immediately prior to deinitialization.
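To illustrate why leftover bytes matter, here is a minimal sketch that models the 16 bytes of inline storage as a plain (UInt64, UInt64) tuple. The `InlineWords` alias and the `inlineWords` helper are hypothetical names made up for this example; this is not the real _SmallString, and it assumes (as the behavior above suggests) that comparison effectively looks at the whole 16 bytes rather than only the initialized prefix.

```swift
// Hypothetical stand-in for the inline storage: two 64-bit words (16 bytes).
typealias InlineWords = (UInt64, UInt64)

// Writes `bytes` at the start of zeroed storage and returns the raw words.
func inlineWords(_ bytes: [UInt8]) -> InlineWords {
    precondition(bytes.count <= 16, "inline storage holds at most 16 bytes")
    var words: InlineWords = (0, 0)
    withUnsafeMutableBytes(of: &words) { raw in
        for (i, byte) in bytes.enumerated() { raw[i] = byte }
    }
    return words
}

// What a small string holding "1" would look like with a zeroed tail.
let expected = inlineWords([49])
// What the closure above leaves behind: "1" moved to byte 0,
// with the original 49 still sitting in byte 1.
let leftover = inlineWords([49, 49])

let wordsMatch = expected == leftover
print(wordsMatch) // false
```

Both values start with the same code unit, yet the word-wise comparison fails because of the stale byte at index 1.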

It should be the job of _SmallString to initialize unused bytes to zero before rebinding the buffer to its original type. (In fact, isn't it undefined behavior to rebind memory and read from it after it's been deinitialized?)

The user can't leave the unused capacity zero-initialized themselves after using it because (a) that may not be what a non-small string expects; and (b) the documentation tells the user that they must leave the unused capacity uninitialized.

AFAICT, to fix this bug, it's just a matter of writing (rawPtr + initializedCount).initialize(repeating: 0, count: unusedCapacity) before rebinding the memory to the type of self._storage.
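A rough sketch of that step, run against a separately allocated 16-byte buffer rather than the real `_storage` (the allocation, the fixed capacity of 16, and the readback into an array are assumptions made only for illustration; the names `rawPtr`, `initializedCount`, and `unusedCapacity` follow the expression above):

```swift
// Stand-in for the small string's 16 bytes of inline capacity.
let capacity = 16
let rawPtr = UnsafeMutablePointer<UInt8>.allocate(capacity: capacity)

// Caller's side, mirroring the closure in the report:
// write "1" at the end of a 2-byte window, then move it to the front.
let lastByte = rawPtr + 1
lastByte.initialize(to: 49)
rawPtr.moveInitialize(from: lastByte, count: 1)
let initializedCount = 1

// Proposed step: zero-initialize everything past the initialized prefix
// before the storage is reinterpreted as its original type.
let unusedCapacity = capacity - initializedCount
let tail = rawPtr + initializedCount
tail.initialize(repeating: 0, count: unusedCapacity)

// All 16 bytes are now initialized, so reading them back is well-defined.
let bytes = Array(UnsafeBufferPointer(start: rawPtr, count: capacity))
print(bytes)

rawPtr.deinitialize(count: capacity)
rawPtr.deallocate()
```

After the zero fill, the buffer holds 49 followed by fifteen zero bytes, regardless of what the caller wrote into the unused region.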

@xwu
Collaborator Author

xwu commented Mar 30, 2021

cc @milseman @atrick

@typesanitizer

@swift-ci create

@xwu
Collaborator Author

xwu commented Apr 3, 2021

Resolved in #36667.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.