Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-12222] No safe way to performantly zero-extend an Array #54648

Open
Lukasa opened this issue Feb 17, 2020 · 4 comments
Open

[SR-12222] No safe way to performantly zero-extend an Array #54648

Lukasa opened this issue Feb 17, 2020 · 4 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@Lukasa
Copy link
Contributor

Lukasa commented Feb 17, 2020

Previous ID SR-12222
Radar rdar://problem/59516884
Original Reporter @Lukasa
Type Bug
Environment

Apple Swift version 5.2 (swiftlang-1103.0.24.3 clang-1103.0.30.1)
Target: x86_64-apple-darwin19.4.0

Additional Detail from JIRA
Votes 1
Component/s Standard Library
Labels Bug
Assignee None
Priority Medium

md5: dd5e30591fa50bc140e23288fa911137

Issue Description:

Related to SR-12221.

Sometimes I have situations where I want to zero-extend an Array. This is usually done to bring a whole bunch of indices into existence and make them valid at once. One case where I might do that is when I want to express a safe memmove (hence the relationship to SR-12221), which requires that all indices used be valid.

I am happy enough to pay the price of initialising this memory in this case, as it's usually not too much memory I have to initialise. Right now, there are three obvious ways to do this on Array (using Array<UInt8> for clarity), shown below and in a compiler explorer page:

func extendWithArray(_ arr: inout [UInt8], count: Int) {
    arr.append(contentsOf: Array(repeating: 0, count: count))
}

func extendWithRepeatElement(_ arr: inout [UInt8], count: Int) {
    arr.append(contentsOf: repeatElement(0, count: count))
}

func extendWithLoop(_ arr: inout [UInt8], count: Int) {
    for _ in 0..<count {
        arr.append(0)
    }
}

All three of these generate suboptimal code.

The first two generate monstrously large chunks of code. repeatElement is the baseline: it generates an enormous amount of code, which would be acceptable if it was vectorised, but it degrades to a byte-by-byte copy: the compiler is not able to constant-propagate the zero byte through and observe that there is no need to do this byte wise. The Array based one avoids the byte wise operation, but has to initialise each byte twice: once with bzero to create the new array, and then again when those bytes are {{memcpy}}d into the target.

Option 3 is in most ways the best. It generates much smaller code, and the compiler would almost be able vectorise it, except that the quadratic growth of Array stymies it. Each byte must be checked to confirm that the Array has space, otherwise a resizing operation occurs. This also slows this down, and in cases where the array started out empty we'll get a fairly large number of resizing operations.

It's very unfortunate that there is no good way to rapidly extend a collection with a single element. We can construct new ones this way: just not extend old ones. It would be very helpful to have some way for a type to express that it can safely be rapidly extended like this.

@weissi
Copy link
Member

weissi commented Feb 17, 2020

@swift-ci create

@Catfish-Man
Copy link
Member

How does this fare?

func extendWithLoop(_ arr: inout [UInt8], count: Int) {
    arr.reserveCapacity(count)
    for _ in 0..<count {
        arr.append(0)
    }
}

@Lukasa
Copy link
Contributor Author

Lukasa commented Feb 20, 2020

No better. It guarantees the branch won't be taken, but LLVM can't tell and so the branch is still emitted, which prevents the vectorisation opportunity.

Additionally, reserveCapacity's documentation pretty strongly warns against doing this:

If you implement a custom data structure backed by an array that grows dynamically, naively calling the reserveCapacity() method can lead to worse than expected performance. Arrays need to follow a geometric allocation pattern for appending elements to achieve amortized constant-time performance. The Array type’s append() and append(contentsOf) methods take care of this detail for you, but reserveCapacity() allocates only as much space as you tell it to (padded to a round value), and no more. This avoids over-allocation, but can result in insertion not having amortized constant-time performance.

@Catfish-Man
Copy link
Member

I'm in the middle of redesigning Array's growth strategy so I can pretty safely say that the "naive" use it's warning about is calling it inside the loop. Calling it once up front is how it's supposed to be used.

That said, point taken about vectorization.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella
Projects
None yet
Development

No branches or pull requests

3 participants