Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-5300] More Fun with Data Slices #3841

Closed
CharlesJS opened this issue Jun 24, 2017 · 17 comments
Closed

[SR-5300] More Fun with Data Slices #3841

CharlesJS opened this issue Jun 24, 2017 · 17 comments
Assignees

Comments

@CharlesJS
Copy link

Previous ID SR-5300
Radar None
Original Reporter @CharlesJS
Type Bug
Status Resolved
Resolution Invalid
Environment

Xcode 9 beta 2 (9M137d)

Additional Detail from JIRA
Votes 0
Component/s Foundation
Labels Bug
Assignee @phausler
Priority Medium

md5: 1baea7b1d98ed200a1f2256fc17334a2

Issue Description:

The indexing for Data slices is inconsistent, as you can see from the code below. Accessing slice[0] gives the first byte in the slice, but slice[0..<2] gives values from the beginning of the parent slice. It is very confusing that slice[0] is not the same as slice[0..<2][0].

This occurs with Xcode 9 beta 2. Unfortunately, trying to access slice[0] causes an immediate crash when building from the trunk, so I can't evaluate whether it is still present there.

import Foundation

let data = Data(bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
let slice = data[5..<7]

print(slice[0])                                             // outputs '5'
print(slice[0..<2].map { "\($0)" }.joined(separator: ", ")) // outputs '0, 1'
print(slice[0..<2] as NSData)                               // outputs <00010203 04050607 0809>
@belkadan
Copy link

It's been fixed. Like all other Collections, the indexes you use with a slice are now the same indexes you use with the base collection, so slice[0] is illegal because it refers to index 0 in the original collection, which is not in the slice.

cc @phausler

@phausler
Copy link
Member

   let data = Data(bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
let slice = data[5..<7]

let slice = data[5..<7]

slice[slice.startIndex] // prints 5
slice[slice.startIndex..<(slice.startIndex + 2)].map { "\($0)" }.joined(separator: ", ")) // prints 5, 6
slice[slice.startIndex..<(slice.startIndex + 2] as NSData // prints <0506>

However there is a failure in there still..

slice[..<(slice.startIndex + 2] 

incorrectly converts the range to 0..<7 because the relative API on RangeExpression is a bit unclear

@CharlesJS
Copy link
Author

Just out of curiosity, was there ever a swift-evolution post explaining the rationale behind making Data into its own slice? I've been poking around, and have so far been unable to find it. In addition to being more intuitive in its behavior, it seems that the implementation of a separate DataSlice type would be much simpler than what we have here, probably greatly reducing the issues that crop up (of which there currently appear to be many; I seem to find a new one every time I play with these things for a few minutes).

@phausler
Copy link
Member

The rationale was that the slice type was not usable with any other framework APIs if it was something different. E.g. writing to a file or passing off to an objc api etc. Without overhauling all of bridging to do something wildly different like bridge protocol existentials we wouldn't be able to emit any sort of optimized carrier type of NSData (or a representation of bytes) without either some huge overhauls or making Data its own slice.

In the end we figured that self typed slices were the better choice (however it is definitely tricky business)

@phausler
Copy link
Member

If you find more edge cases please file bugs and or radars.

@CharlesJS
Copy link
Author

Ah, that makes sense. Still, though, when rewriting old Objective-C code, to Swift, it's easy to do something like this if one isn't thinking about it:

func readSome(data: Data) {
    let magic = data[0..<4]

    if magic == whatever {
        ...
    }
}

What makes it worse is that in a lot of cases, what you get will be an unsliced data object about 98% of the time, so it's easy for such errors to linger in the code undetected until hard-to-debug runtime errors start popping up (possibly non-reproducibly).

@phausler
Copy link
Member

part of this is because the approach is new, so we need to rely on external contributors filing issues and implementing good coverage for unit tests etc.

@phausler
Copy link
Member

put up apple/swift#10584 to address the additional failure found here

@CharlesJS
Copy link
Author

You know, I wonder if Data.Index should be an opaque type like String's, since I can't think of any cases where the integer subscripts are actually safe to use. Any Data you have could be a slice of some bigger Data you don't know about, so you're never really going to have enough information to know 100% what data[0] means. (Even if you read it from a FileHandle, you can't prove that FileHandle didn't do some janky thing under the hood before it returned the data to you.) Taking away the ability to subscript with integer literals would at least force people to do the right thing (I also wish "startIndex" and "endIndex" were renamed just to "start" and "end" to make them less painful to use, but the ship's probably already sailed on that).

@phausler
Copy link
Member

startIndex and endIndex are from Collection (we have some naming issues with IndexPath and IndexSet with those too...)

Removing the Int nature would definitely cause some consternation of source compatibility and ergonomics for the 90% use case (as you mentioned before).

The other more controversial approach would be to make slices to always be zero indexed... that was my first attempt at the slice type; but that was horrifically inconsistent with the other collections.

@CharlesJS
Copy link
Author

Yeah, I was referring to the properties on Collection. The verbosity of them bothers me when working with other collections, too (particularly String). Oh well.

It may cause consternation, but I think it'd ultimately be correct. Can you think of any cases that the integer subscripts are actually safe to use? As it is, it just seems like a land mine for runtime errors, particularly in the presence of old code updated from older versions of Swift in which the subscripts worked differently.

@phausler
Copy link
Member

Older versions used a default wrapper struct to indirect but that type was nearly impossible to actually do anything sensible with due to the fact it had none of the api of Data nor did it have any interoperation to items that took Data. Effectively before Data was its own slice type it was not sliced but instead people used subdata(with: Range<Index>) which caused copies.

@CharlesJS
Copy link
Author

You know, I'm a bit embarrassed, but I don't think when I'm running Swift in the REPL, it's linking against the latest libraries. When I check ps, I see this:

/Users/<redacted>/Development/swift/build/Ninja-RelWithDebInfoAssert/swift-macosx-x86_64/bin/swift -frontend -repl -enable-objc-interop -sdk /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk -module-name REPL

suggesting that it may be still getting the Xcode 9 beta 2 version of the overlay, making these tests invalid. Running swiftc instead of swift, of course, complains about missing an -sdk argument, and although there appear to be a few SDKs included with the source, none of them work:

$ ./swiftc -sdk ../test-macosx-x86_64/Driver/Output/sdk-apple.swift.tmp/MacOSX10.13.sdk test.swift
<unknown>:0: error: cannot load underlying module for 'CoreFoundation'
$ ./swiftc -sdk ../test-macosx-x86_64/Driver/Output/sdk-apple.swift.tmp/MacOSX10.13.Internal.sdk test.swift
<unknown>:0: error: cannot load underlying module for 'CoreFoundation'
$ ./swiftc -lswiftFoundation -lswiftCoreFoundation -L../lib/swift/macosx test.swift 
<unknown>:0: error: cannot load underlying module for 'CoreFoundation'
<unknown>:0: note: did you forget to set an SDK using -sdk or SDKROOT?
<unknown>:0: note: use "xcrun swiftc" to select the default macOS SDK installed with Xcode
$ ./swiftc -lswiftFoundation -lswiftCoreFoundation -L../lib/swift/macosx/x86_64 test.swift 
<unknown>:0: error: cannot load underlying module for 'CoreFoundation'
<unknown>:0: note: did you forget to set an SDK using -sdk or SDKROOT?
<unknown>:0: note: use "xcrun swiftc" to select the default macOS SDK installed with Xcode

Would it be all right if I asked you what's the proper command line to get this building with all the latest things, to avoid wasting everyone's time?

Thanks!

@belkadan
Copy link

You should use the real SDK. The overlays are packaged with the compiler (and found relative to the compiler) under lib/swift/.

@phausler
Copy link
Member

   ./swift/utils/build-toolchain -r

that should build similarly to the build bots; however I would really claim that building sclf is way easier; open the workspace and select the TestFoundation scheme and click run. (provided you have XCTest checked out side-by side to it)

@CharlesJS
Copy link
Author

edit: this was posted in the wrong thread, as was my last comment here. Sorry about that!

@CharlesJS
Copy link
Author

However, since I'm here, might as well play with this some more:

import Foundation

let data = Data(bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
let slice = data[4..<8]
let subslice = slice[(slice.endIndex - 2)...]

print(subslice.map { $0 })

This time we are using endIndex like we should, so we should get the correct results: [6, 7]. Do we?

Xcode 9 beta 2: output is [0, 0] on the first run, [253, 7] on the second, [249, 7] on the third. Apparently we're reading random memory.

Latest toolchain: crash!

$ ./swiftc -sdk /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk test2.swift 
$ ./test2
Illegal instruction: 4

However, if we change it to use a regular Range instead of a partial range:

import Foundation

let data = Data(bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
let slice = data[4..<8]
let subslice = slice[(slice.endIndex - 2)..<slice.endIndex]

print(subslice.map { $0 })

Xcode 9 beta 2 continues to give random results, but the latest toolchain now works:

$ ./swiftc -sdk /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk test2.swift 
$ ./test2
[6, 7]

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants