New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SR-4164] Consider treating LazyFilteredCollection like FlattenCollection #46747
Comments
Comment by Hector David (JIRA) |
Hector, it doesn't look like @davidgreens is a recognized userID, so that comment probably will not have any effect. |
A big +1 for this change—the multiple iteration of lazily filtered collections is consistently surprising for users of the lazy system. |
Wow, this is very different from what I saw the last time I tried this. My preliminary benchmarks show that you have to have > 10⁵ elements in the resulting array before pre-counting is a win, at least when the elements are Ints. However, this still isn't as cut-and-dried as it looks, since this shows import Darwin
import CoreFoundation
var total = 0
let length = 10_000_000_00
let sc = (0..<length).lazy.filter { $0 != 0 }
let t0 = CFAbsoluteTimeGetCurrent()
total += sc.reduce(0) { $0 + $1 } >> 8 // shift to avoid overflows
let t1 = CFAbsoluteTimeGetCurrent()
total += sc.count >> 8
let t2 = CFAbsoluteTimeGetCurrent()
print("reduce", t1 - t0)
print("count", t2 - t1)
exit(total == 0 ? 1 : 0) // A way to make sure it isn't optimized out |
Here's a more complete test, with some results at the bottom. I don't have time to complete the testing right now but I've left a |
@dabrahams—I ran with your test and added the remainder. My forked gist is here with some results in a comment. Interesting stuff! |
@natecook1000 Cool! What do you conclude from these results? |
Skipping preallocation looks like a win in enough cases that I'd skip it as the default, since it's an unexpected quirk of the implementation that filtered lazy collections get iterated multiple times. By the time it seems to start to matter, the collection is big enough (billions of items?) that I'd assume people are measuring performance. If someone does want the benefit of that preallocation, they should be able to measure the count of the filtered lazy collection directly, reserve that capacity in a new array, and then append the elements. This was my preference before testing, as well, but I don't seea strong case in this data for keeping the current behavior. |
@natecook1000 Hmm, I'd go further; to me it looks like there's a really strong case for dropping the current behavior. Time for a pull request? |
Comment by Hector David (JIRA) #demo-1 |
Additional Detail from JIRA
md5: e621a9d1963ce2fe4205f377c6b2fb16
Issue Description:
The refusal to pre-count elements of a `FlattenCollection` is based on reasoning about the performance of `flatMap`. But `flatMap`, when applied to collections of optionals, doesn't produce a
LazyFlattenedCollection<…>
, but aLazyMapCollection<LazyFilterCollection<…>>
. We are not doing a similar thing forLazyFilterCollection
.The following demonstrates:
which currently generates
Foo(i)
3 times for eachi
in0..<10
. If this FIXME were addressed it would go down to 2 times (once for counting), but if we take the option suggested in this bug, it would go down to 1 (at the cost of Array reallocations due to failing to pre-count, of course).The text was updated successfully, but these errors were encountered: