[SR-15173] FlattenSequence.count has awful, awful performance #57496
Labels
bug
A deviation from expected or documented behavior. Also: expected but undesirable behavior.
standard library
Area: Standard library umbrella
Attachment: Download
Environment
Apple Swift version 5.5-dev (LLVM bcf6482631963a8, Swift 37aa203)
This is just a random nightly I happen to have installed. As of reporting, Godbolt is running:
Swift version 5.6-dev (LLVM ca88d53176e346a, Swift dc60996)
Additional Detail from JIRA
md5: 09ab6e8a9dcadc52c76efb301ff669b5
Issue Description:
FlattenSequence.count has really, really bad performance.
Check out the following code:
Trying this in Godbolt (nightly, -O), it generates an enormous amount of code. 349 lines, full of generic specializations of this and that, retains, releases, __swift_instantiateConcreteTypeFromMangledName, and, worst of all, this thing:
Yes, you read that correctly. This is an Array 'read' accessor - the thing that will access every element, and it has a big, fat malloc in the middle.
It's really bad. Here you can see it (and the resulting frees) absolutely dominate my actual collection code.
My benchmarks got 500% slower when using it. I couldn't believe it.
Anyway, what's really interesting is what happens when you try the following instead:
Boom. 59 lines, all in a single function. None of these generic specializations floating around, no __swift_instantiateConcreteTypeFromMangledName, and most importantly: no mallocs (okay, there's a couple of retain/releases. I wonder if they're really necessary, but I can live with them). This code is hundreds of times faster, not to mention much, much smaller. Again, compiling at -O.
Anyway, I looked at the implementation of FlattenSequence, and nothing really jumps out at me. It seems to do a full index traversal in its `distance(from:to🙂`, which seems a bit unnecessary, and it doesn't appear to make use of the wrapped collection's version of `distance(from:to🙂` AFAICT - so, room for improvement, but nothing that would cause the behaviour I'm seeing.
I get a feeling this is going to be a tricky one. Bring a shovel, because there's some serious digging to be done.
Here's the Godbolt link with both versions of the function. Un-comment the one you'd like to see.
The text was updated successfully, but these errors were encountered: