[SR-1927] Subsequences of String Views don’t behave correctly #44536

Closed
swift-ci opened this issue Jun 28, 2016 · 4 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@swift-ci
Collaborator

Previous ID SR-1927
Radar rdar://problem/22785355
Original Reporter loic (JIRA User)
Type Bug
Status Resolved
Resolution Done
Environment

Swift 3.0 Preview 1
(included with Xcode 8S128d)

Additional Detail from JIRA
Votes 1
Component/s Standard Library
Labels Bug
Assignee @natecook1000
Priority Medium

md5: 4cb067c1ef10d57d7423021666825754

duplicates:

  • SR-1487 Bug in String Slice Indices or Generic Code Generator

Issue Description:

(from https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160627/022460.html, where I was told this is a bug)

String views (UTF8View, UTF16View, UnicodeScalarView, CharacterView) conform to Collection. However, not all of them behave as expected when working with subsequences.

There are two issues:

  1. Slices should use the same indices for the same elements as the original collection
  2. Out-of-bounds elements should not be accessible

Using the same indices

The documentation of Collection has this requirement:

    /// Accesses a contiguous subrange of the collection's elements.
    ///
    /// The accessed slice uses the same indices for the same elements as the
    /// original collection uses. 
    /// ...
    public subscript(bounds: Range<Self.Index>) -> Self.SubSequence { get }

which is not respected by UTF16View and CharacterView. For example:

let str = "Hello World!".utf16
let (start, end) = (str.index(str.startIndex, offsetBy: 2), str.index(str.startIndex, offsetBy: 9))

let sub1 = str[start ..< end]
print(sub1) // llo Wor

let sub2 = str[sub1.startIndex ..< sub1.endIndex]
print(sub2) // Hello W
            // should be "llo Wor"
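
The requirement quoted above can be turned into a small check. The sketch below assumes a recent Swift toolchain (Swift 5 syntax, not the Swift 3.0 Preview 1 from this report), and `roundTripsThroughSlice` is a hypothetical helper name, not a standard library API. It slices a collection, re-slices the original with the slice's own bounds, and compares the two results.

```swift
/// Hypothetical helper: slices `base` at the given offsets, then re-slices
/// `base` using the first slice's own startIndex and endIndex. Under the
/// Collection contract both slices must contain the same elements.
func roundTripsThroughSlice<C: Collection>(_ base: C, from lower: Int, to upper: Int) -> Bool
    where C.Element: Equatable {
    let start = base.index(base.startIndex, offsetBy: lower)
    let end = base.index(base.startIndex, offsetBy: upper)
    let first = base[start..<end]
    // A slice shares indices with its base, so its bounds are valid for `base`.
    let second = base[first.startIndex..<first.endIndex]
    return first.elementsEqual(second)
}

// An Array of the same code units is used as a known-good reference collection.
let utf16Units = Array("Hello World!".utf16)
print(roundTripsThroughSlice(utf16Units, from: 2, to: 9)) // true
```

The "Hello W" output shown above (instead of "llo Wor") is the kind of result that would make this check return false.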

Bounds checking

UTF8View and UTF16View allow subscripting with indices past the end of the subsequence.

let str = "Hello World!".utf8
let (start, end) = (str.index(str.startIndex, offsetBy: 2), str.index(str.startIndex, offsetBy: 9))

let sub1 = str[start ..< end]
print(sub1) // llo Wor

let pastEnd = sub1.index(sub1.endIndex, offsetBy: 2)

let sub2 = sub1[sub1.startIndex ..< pastEnd]
print(sub2) // llo World
// should have crashed
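
The expected bounds behavior can be made explicit with a guard. This is a minimal sketch, assuming a recent Swift toolchain; `checkedElement(of:at:)` is a hypothetical helper, not part of the standard library. It returns nil for any index outside the slice's own bounds, even when that index is still valid in the base collection.

```swift
/// Hypothetical helper: returns the element at `index` only if the index lies
/// within the slice's own bounds, and nil otherwise.
func checkedElement<C: Collection>(of slice: C, at index: C.Index) -> C.Element? {
    guard index >= slice.startIndex, index < slice.endIndex else { return nil }
    return slice[index]
}

// An Array of UTF-8 code units stands in for the view, so indices are plain Ints.
let utf8Bytes = Array("Hello World!".utf8)
let middle = utf8Bytes[2..<9]                          // the bytes of "llo Wor"
let inside = checkedElement(of: middle, at: 2)         // first "l" of the slice
let outside = checkedElement(of: middle, at: 10)       // valid in utf8Bytes, not in middle
print(inside as Any, outside as Any)
```

A real subscript would trap rather than return nil; the optional result is only used here to make the boundary visible without crashing.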

UnicodeScalarView returns an incorrect value when accessing an element past the end of the subsequence.

let str = "Hello World!".unicodeScalars
let (start, end) = (str.index(str.startIndex, offsetBy: 2), str.index(str.startIndex, offsetBy: 9))

let sub1 = str[start ..< end]
print(sub1) // llo Wor

let pastEnd = sub1.index(sub1.endIndex, offsetBy: 2)

let sub2 = sub1[pastEnd]
print(sub2.value) // 65533
// should have crashed
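
For reference, the value 65533 printed above is U+FFFD, the Unicode replacement character, so the out-of-bounds access produced a placeholder scalar rather than a character from the string. A short sketch, assuming a recent Swift toolchain, that confirms the mapping:

```swift
// 65533 is the numeric value reported by the out-of-bounds subscript.
let reportedValue: UInt32 = 65533
let reportedHex = String(reportedValue, radix: 16, uppercase: true)
let reportedScalar = Unicode.Scalar(reportedValue) // failable: nil for invalid scalar values

print(reportedHex) // FFFD
print(reportedScalar?.value == 0xFFFD) // true
```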
@belkadan
Contributor

I don't think it's possible to have the same indexes for a string's UTF-8, UTF-16, and Character views. Or, well, it's possible, but it would require jamming them all into an enum.

@swift-ci
Collaborator Author

Comment by Loïc Lecrenier (JIRA)

I didn’t mean that all views should share the same indexes. The bug report wasn’t very clear, but I was referring to the documentation of Collection, which says:

    /// Accesses a contiguous subrange of the collection's elements.
    ///
    /// The accessed slice uses the same indices for the same elements as the
    /// original collection uses. 
    /// ...
    public subscript(bounds: Range<Self.Index>) -> Self.SubSequence { get }

And this requirement (“accessed slice uses the same indices for…”) is not respected for UTF-8 and UTF-16 views.

I have reworded the bug report to make that clearer.

@belkadan
Contributor

Oh, my bad! Thanks for clarifying.

cc @gribozavr

@natecook1000
Member

Fixed in #4896

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.