Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-68] SequenceType._preprocessingPass can cause duplicate evaluation of lazy collections #42690

Open
lilyball mannequin opened this issue Dec 5, 2015 · 2 comments
Open
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@lilyball
Copy link
Mannequin

lilyball mannequin commented Dec 5, 2015

Previous ID SR-68
Radar None
Original Reporter @lilyball
Type Bug
Environment

Swift version 2.2-dev (LLVM 46be9ff861, Clang 4deb154edc, Swift 3dbfefa)

Additional Detail from JIRA
Votes 1
Component/s Standard Library
Labels Bug
Assignee None
Priority Medium

md5: d0004a1684c9d205b2caf0b3ae2fa8d3

Issue Description:

Code that uses SequenceType._preprocessingPass will have the closure executed when the collection is a lazy collection. This is a problem because the closure almost certainly is going to access some property that forces lazy evaluation (for maps, that would be iterating over the elements; for filters, even just accessing the count will evaluate the filter). And this is a bad idea because the overhead of extra evaluations of the lazy computations is probably worse than the performance gain from doing the preprocessing pass.

I've reproduced this with <SequenceType where Generator.Element == String>.joinWithSeparator(String) on a lazy map. This should apply to other situations as well. The test case for joinWithSeparator looks like

JoinTestSuite.test("StringCollection.joinWithSeparator()") {
  // make sure we only evaluate each element once
  func testLazyJoin(separator: String, _ elements: [String]) {
    var expectSeen: [Int: Int] = [:]
    for i in 0..<elements.count {
      expectSeen[i] = 1
    }
    var seen: [Int: Int] = [:]
    // elements needs to be a collection, not a sequence
    let elements = Array(elements.enumerate()).lazy.map({ (i, x) -> String in
      seen[i] = (seen[i] ?? 0) + 1
      return x
    })
    let _: String = elements.joinWithSeparator(separator)
    expectEqual(expectSeen, seen)
  }

  testLazyJoin("", ["a"])
  testLazyJoin("", ["a", "b", "c"])
  testLazyJoin("xy", ["ab", "cd", "ef"])
}

I'm unsure what the correct behavior here is. If we make _preprocessingPass skip the pass for lazy collections, that throws away some potential valid optimizations, such as relying on an exact count for lazy maps, or optimizing on lazy reverses. But the alternative is to define several classifications of "safe" preprocessing and have separate preprocessing methods for each classification, so e.g. a lazy filter will never preprocess, a lazy map can preprocess for computations involving just the collection indexes, and lazy reverses can do everything. If we do go down this route, we'll have to make sure the lazy collections actually query their underlying base collection too if they think they can handle the pass, because they might be wrapping some other lazy collection that can't.

@Dante-Broggi
Copy link
Contributor

Has this been resolved? If so, it should be marked as such.

@lilyball
Copy link
Mannequin Author

lilyball mannequin commented Aug 11, 2018

This is still an issue in Swift 4.2. someLazyCollection.joined(separator:) does 2 passes over the collection.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella
Projects
None yet
Development

No branches or pull requests

1 participant