[SR-68] SequenceType._preprocessingPass can cause duplicate evaluation of lazy collections #42690
Labels
bug
standard library
Environment
Swift version 2.2-dev (LLVM 46be9ff861, Clang 4deb154edc, Swift 3dbfefa)
Additional Detail from JIRA
md5: d0004a1684c9d205b2caf0b3ae2fa8d3
Issue Description:
Code that uses `SequenceType._preprocessingPass` will have the closure executed when the collection is a lazy collection. This is a problem because the closure is almost certainly going to access some property that forces lazy evaluation (for maps, that would be iterating over the elements; for filters, even just accessing the `count` will evaluate the filter). And this is a bad idea because the overhead of the extra evaluations of the lazy computations is probably worse than the performance gain from doing the preprocessing pass.

I've reproduced this with `<SequenceType where Generator.Element == String>.joinWithSeparator(String)` on a lazy map. This should apply to other situations as well. The test case for `joinWithSeparator` looks like

I'm unsure what the correct behavior here is. If we make `_preprocessingPass` skip the pass for lazy collections, that throws away some potentially valid optimizations, such as relying on an exact `count` for lazy maps, or optimizing on lazy reverses. But the alternative is to define several classifications of "safe" preprocessing and have separate preprocessing methods for each classification, so e.g. a lazy filter will never preprocess, a lazy map can preprocess for computations involving just the collection indexes, and lazy reverses can do everything. If we do go down this route, we'll have to make sure the lazy collections actually query their underlying base collection too if they think they can handle the pass, because they might be wrapping some other lazy collection that can't.
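For anyone who wants to see the effect in isolation, here is a minimal sketch of the double evaluation and of the "skip the pass for lazy collections" option. It is not the test case from this report and not the standard library's implementation: it uses current Swift syntax rather than Swift 2.2, and `joinStrings`, its `preprocess` flag, and `countEvaluations` are hypothetical names invented for illustration. The preprocessing pass is assumed to be a simple capacity calculation that walks the elements.

```swift
// Hypothetical stand-in for a join that optionally runs a preprocessing pass.
// This is an illustration only, not the standard library's code.
func joinStrings<S: Sequence>(_ parts: S, separator: String, preprocess: Bool) -> String
    where S.Element == String
{
    var result = ""
    if preprocess {
        // Preprocessing pass: walk every element once to size the buffer.
        // On a lazy map this already runs the map closure for each element.
        var capacity = 0
        for part in parts {
            capacity += part.utf8.count + separator.utf8.count
        }
        result.reserveCapacity(capacity)
    }

    // Main pass: walk the elements (again, if we preprocessed) to build the result.
    var isFirst = true
    for part in parts {
        if !isFirst { result += separator }
        result += part
        isFirst = false
    }
    return result
}

// Counts how many times the lazy map's closure runs during one join.
func countEvaluations(preprocess: Bool) -> (joined: String, evaluations: Int) {
    var evaluations = 0
    let lazyStrings = [1, 2, 3].lazy.map { (n: Int) -> String in
        evaluations += 1
        return String(n)
    }
    let joined = joinStrings(lazyStrings, separator: ", ", preprocess: preprocess)
    return (joined, evaluations)
}

print(countEvaluations(preprocess: true))   // evaluations is 6: each element mapped twice
print(countEvaluations(preprocess: false))  // evaluations is 3: each element mapped once
```

Skipping the pass halves the closure calls here, but it also gives up the capacity reservation, which is the trade-off discussed above.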