Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-11336] Array + key-path crasher due to -enable-ownership-stripping-after-serialization for swiftCore #53737

Closed
swift-ci opened this issue Aug 20, 2019 · 12 comments
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself crash Bug: A crash, i.e., an abnormal termination of software run-time crash Bug → crash: Swift code crashed during execution

Comments

@swift-ci
Copy link
Collaborator

Previous ID SR-11336
Radar None
Original Reporter pschuh (JIRA User)
Type Bug
Status Resolved
Resolution Done
Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Bug, Crash
Assignee None
Priority Medium

md5: 6c4c6e52724c440ca710405a852fb670

blocks:

  • TF-799 Re-enable tests involving -enable-ownership-stripping-after-serialization

Issue Description:

protocol SomeProtocol {}
func SR_11336(_ array: Array<Int>) {
  let keyPaths: [PartialKeyPath<Array>] = array.indices.map { \Array<Int>[$0] }
  for kp in keyPaths {
    _ = array[keyPath: kp] is SomeProtocol
  }
}
let array: [Int] = [1000, 1]
// Crashes only if `array` contains 2 or more elements.
SR_11336(array)

The cause is enabling -Xfrontend -enable-ownership-stripping-after-serialization for swiftCore.

This was done in 4ecab47 (for swiftCore), then later in 447f008 (for swiftCore and overlays).

Reproduce:

$ swift sr-11336.swift
[1]    96610 illegal hardware instruction  $SWIFT_BIN/swift sr-11336-2.swift

Weirdly, swiftc sr-11336.swift && ./sr-11336 doesn't crash.

$ swiftc sr-11336.swift && ./sr-11336
# Succeeds, no crash.
@gottesmm
Copy link
Member

Thanks.

@gottesmm
Copy link
Member

Just tried with near TOT and failed to reproduce. I am going to try a snapshot and then do a ToT.

@swift-ci
Copy link
Collaborator Author

Comment by Parker Schuh (JIRA)

Bummer. I am on linux if that makes a difference.

I diffed the -emit-sil with and without your change and found differences only in

struct_extract -> struct_element_addr
Int.hash(into:) -> Hasher._combine(_:)

in what seem to be unrelated functions.

I'm building things with: utils/build-script -R

@swift-ci
Copy link
Collaborator Author

Comment by Parker Schuh (JIRA)

The original problem (that I minimized to not be tensorflow specific) is here in the tensorflow branch: https://ci-external.swift.org/job/swift-PR-TensorFlow-macOS/1908/console also on macos. Unfortunately, it depends on keypathiterable, which is not available upstream so it is not easy to separate out.

@swift-ci
Copy link
Collaborator Author

Comment by Parker Schuh (JIRA)

I'm now able to reproduce on macos with:

~/swift-source/build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc --version
Swift version 5.1-dev (LLVM af1f73e9e9, Swift cfaecfc015)
Target: x86_64-apple-darwin18.6.0

Here are the steps to reproduce.

~/swift-source/build/Ninja-ReleaseAssert/swift-macosx-x86_64/bin/swiftc test.swift && DYLD_INSERT_LIBRARIES=~/swift-source/build/Ninja-ReleaseAssert/swift-macosx-x86_64/lib/swift/macosx/x86_64/libswiftCore.dylib ./test

This is probably another bug, but swiftc now always links against /usr/lib/swift/libswiftCore.dylib instead of the dylib in the build directory in macos. This implies that the bug is in libswiftCore.dylib and is just exercised by the example above.

@gottesmm
Copy link
Member

pschuh (JIRA User) can you provide the file? I am assuming what you posted works with only tensor flow, right?

@swift-ci
Copy link
Collaborator Author

Comment by Parker Schuh (JIRA)

Hi Michael, everything in this thread except https://ci-external.swift.org/job/swift-PR-TensorFlow-macOS/1908/console (which is just initial proof that the bug can happen on macos) is with respect to master. The original crasher still stands and is reduced to fully not depend on tensorflow. You have to be careful to make sure that the system swift dylib is not used in place of the one containing the bug.

@dan-zheng
Copy link
Collaborator

I can reproduce the issue from master TOT reliably using swift sr-11336.swift instead of swiftc sr-11336.swift && ./sr-11336.

$ cat sr-11336.swift
protocol SomeProtocol {}
func test_fn(_ v: Array<Float>) -> [PartialKeyPath<Array<Float>>] {
  var result: [PartialKeyPath<Array<Float>>] = []
  let tmp: [PartialKeyPath<Array>] = v.indices.map { \Array<Float>[$0] }
  for kp in tmp {
    result.append(kp)
    if let nested = v[keyPath: kp] as? SomeProtocol {
      print("never reaches here")
    }
  }
  return result
}
let container = Array<Float>([1, 2, 3])
print(test_fn(container))

$ swift sr-11336.swift
sr-11336.swift:7:12: warning: value 'nested' was defined but never used; consider replacing with boolean test
    if let nested = v[keyPath: kp] as? SomeProtocol {
       ~~~~^~~~~~~~~               ~~~
                                   is
[1]    3762 illegal hardware instruction  $SWIFT_BIN/swift sr-11336.swift

$ swiftc sr-11336.swift && ./sr-11336
sr-11336.swift:7:12: warning: value 'nested' was defined but never used; consider replacing with boolean test
    if let nested = v[keyPath: kp] as? SomeProtocol {
       ~~~~^~~~~~~~~               ~~~
                                   is
[Swift.WritableKeyPath<Swift.Array<Swift.Float>, Swift.Float>, Swift.WritableKeyPath<Swift.Array<Swift.Float>, Swift.Float>, Swift.WritableKeyPath<Swift.Array<Swift.Float>, Swift.Float>]
# No crash.

This is blocking downstreaming swift-DEVELOPMENT-SNAPSHOT-2019-09-02-a to the tensorflow branch.

@dan-zheng
Copy link
Collaborator

I suspect the issue may be related to #26608 by @gottesmm, or related patches.
Previously, pschuh (JIRA User) fixed the issue by reverting that patch in c907dab.
@marcrasi later reverted the reversion in e48c89b (a related patch is a786c24c52b6aaf4d582ef1a9045d0c3c766b63c), which one might expect to resurface the issue. I tried reverting both of Marc's commits, but the issue persists.


@gottesmm: I wonder if you have time to look into this issue, since it seems related to your patches?

Here's a lit test reproducer:

// RUN: %target-run-simple-swift

protocol SomeProtocol {}
func test_fn(_ v: Array<Float>) -> [PartialKeyPath<Array<Float>>] {
  var result: [PartialKeyPath<Array<Float>>] = []
  let tmp: [PartialKeyPath<Array>] = v.indices.map { \Array<Float>[$0] }
  for kp in tmp {
    result.append(kp)
    if let nested = v[keyPath: kp] as? SomeProtocol {
      print("never reaches here")
    }
  }
  return result
}
let container = Array<Float>([1, 2, 3])
print(test_fn(container))

@dan-zheng
Copy link
Collaborator

Attempting to narrow down the cause with git bisect.

git bisect start
git bisect good c907dab46d0c47ce712253776c1cf5bad744cc3f # ~pschuh's patch reverting https://github.com/apple/swift/pull/26608
git bisect bad 9a7c32a5a86d0cad43d045dce83d1cb6a57bebd4 # apple/swift:tensorflow-merge HEAD
git bisect run ./script.sh
# script.sh:
#!/usr/bin/env bash

set -e

cd ..
ninja -C build/Ninja-ReleaseAssert+stdlib-Release/swift-linux-x86_64 swift swift-stdlib
# Use `status` variable to work around "git bisect run failed: exit code 139 is < 0 or >= 128".
# https://stackoverflow.com/a/22592593
status=0
$SWIFT_BIN/swift foo.swift || status=$?
if [ "$status" -eq 125 ] || [ "$status" -gt 127 ]; then
  status=1
fi
exit "$status"

EDIT: git bisect was not fruitful. Actual cause found in comment below.

# good: [c907dab46d0c47ce712253776c1cf5bad744cc3f] Revert "Merge pull request #&#8203;26608 from gottesmm/ownership-onone-stdlibcore"
git bisect good c907dab46d0c47ce712253776c1cf5bad744cc3f
# bad: [9a7c32a5a86d0cad43d045dce83d1cb6a57bebd4] [Sema] Fix `Differentiable.TangentVector` derived conformances.
git bisect bad 9a7c32a5a86d0cad43d045dce83d1cb6a57bebd4
# bad: [c773364c7ccfc612ea4a296ca19b6206a20b5487] Merge pull request #&#8203;26842 from gottesmm/pr-30ea01bc2b3b5696b7ebd88b6c149bd1e7abcfe7
git bisect bad c773364c7ccfc612ea4a296ca19b6206a20b5487
# bad: [673728ed1f5484d5704e68047b4561bd10ea620c] Merge pull request #&#8203;21388 from xiaobai/remote_buildtree_exports
git bisect bad 673728ed1f5484d5704e68047b4561bd10ea620c
# bad: [746cad564ffe6886e3f7b50c13abf9bda646b799] Merge pull request #&#8203;26459 from xedin/diag-conversion-to-specified-type
git bisect bad 746cad564ffe6886e3f7b50c13abf9bda646b799
# bad: [529bec17f583db334a2410bba5fc70b3ffa155c8] Merge pull request #&#8203;26633 from compnerd/windows-test-regression
git bisect bad 529bec17f583db334a2410bba5fc70b3ffa155c8
# bad: [075e92b1828037985fc30501e359a8d2c33240a1] Merge pull request #&#8203;26576 from LucianoPAlmeida/SR-10597-name-lookup-failure-diagnostics
git bisect bad 075e92b1828037985fc30501e359a8d2c33240a1
# bad: [eb1a752ea7cd7f138a42248818ffde207cbe9809] Merge pull request #&#8203;26623 from owenv/add_newlines
git bisect bad eb1a752ea7cd7f138a42248818ffde207cbe9809
# bad: [817753a937d771d2ec80cd88350a282cfd70ad30] Merge pull request #&#8203;26554 from DougGregor/connected-components-cleanup
git bisect bad 817753a937d771d2ec80cd88350a282cfd70ad30
# good: [ab38be128d73c64d512b3634834bfab400f8d0e6] [Constraint graph] Handle orphaned constraints within connected components
git bisect good ab38be128d73c64d512b3634834bfab400f8d0e6
# bad: [8fc26349c61646833898b335c1fd619847122b5c] Merge pull request #&#8203;26615 from rintaro/ide-completion-rdar54215016
git bisect bad 8fc26349c61646833898b335c1fd619847122b5c
# good: [4353a55587dbb8dfe950e27c3963fd7b5bfca1f9] [CodeCompletion] Fix a crash in collectPossibleCalleesByQualifiedLookup
git bisect good 4353a55587dbb8dfe950e27c3963fd7b5bfca1f9
# first bad commit: [8fc26349c61646833898b335c1fd619847122b5c] Merge pull request #&#8203;26615 from rintaro/ide-completion-rdar54215016

# The "first bad commit" is a red herring.
# https://github.com/apple/swift/commit/4ecab471909a8bb101afe6f7c40f5ae7f492f17c is the true first bad commit, as known.

@dan-zheng
Copy link
Collaborator

I found the true cause for SR-11336: enabling -Xfrontend -enable-ownership-stripping-after-serialization for swiftCore.

This was done in 4ecab47 (for swiftCore), then later in 447f008 (for swiftCore and overlays).

We will need to revert enabling -Xfrontend -enable-ownership-stripping-after-serialization for swiftCore to unblock tensorflow branch development (so all of our tests pass).
Resolving this issue should allow us to re-enable the flag. cc @gottesmm

@dan-zheng
Copy link
Collaborator

This issue appears to have been fixed recently! Exact commit unknown.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@AnthonyLatsis AnthonyLatsis added the run-time crash Bug → crash: Swift code crashed during execution label Sep 17, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself crash Bug: A crash, i.e., an abnormal termination of software run-time crash Bug → crash: Swift code crashed during execution
Projects
None yet
Development

No branches or pull requests

4 participants