Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-12515] OperationQueue deadlock/UaF on Swift 5.2 Linux #3261

Closed
Frizlab opened this issue Apr 5, 2020 · 16 comments
Closed

[SR-12515] OperationQueue deadlock/UaF on Swift 5.2 Linux #3261

Frizlab opened this issue Apr 5, 2020 · 16 comments

Comments

@Frizlab
Copy link
Contributor

Frizlab commented Apr 5, 2020

Previous ID SR-12515
Radar rdar://problem/61911109
Original Reporter @Frizlab
Type Bug
Environment

Official Swift Docker image (000366daa6eb)

Additional Detail from JIRA
Votes 2
Component/s Foundation
Labels Bug, 5.2Regression
Assignee None
Priority Medium

md5: d9b761cd41e216d23fd95d4d60093f8b

Issue Description:

Adding operations with `addOperations` in an OperationQueue twice will lead to a deadlock (when run via `swift run`) and a crash (when running the file a script) on Linux.
Works fine on macOS.

Test repository: https://gitlab.com/frizlab-demo-projects/test_op_swift
Commands (either of them fails; one crashes, the other just deadlocks):

  • docker run --rm -it -v "$(pwd):/tmp" swift:5.2 bash -c 'cd /tmp; swift run'

  • docker run --rm -it -v "$(pwd):/tmp" swift:5.2 /tmp/standalone_test.swift

[EDIT by @weissi]: The standalone repro seems to just be

import Foundation

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 2

for _ in 0..<2 {
    var results = [Int]()

    let op1 = BlockOperation{           results.append(1) }
    let op2 = BlockOperation{ sleep(1); results.append(2) }
    op1.addDependency(op2)

    queue.addOperations([op1, op2], waitUntilFinished: true)

    let expected = [2, 1]
    if results != expected {
        print("Got:      \(results)")
        print("Expected: \(expected)")
    } else {
        print("ok")
    }
}
@weissi
Copy link
Member

weissi commented Apr 5, 2020

@swift-ci create

@weissi
Copy link
Member

weissi commented Apr 5, 2020

CC @millenomi/@spevans/@drexin any of you aware of this?

@weissi
Copy link
Member

weissi commented Apr 5, 2020

on 5.2.1 the standalone repro either hangs for me or crashes! Something's definitelyt very wrong

root@3adc3cc4163c:/tmp# swiftc -g test.swift 
root@3adc3cc4163c:/tmp# lldb ./test
(lldb) target create "./test"
Current executable set to '/tmp/test' (x86_64).
(lldb) r
Process 52 launched: '/tmp/test' (x86_64)
ok
Process 52 stopped
* thread #&#8203;1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x448)
    frame #&#8203;0: 0x0000555555555b74 test`main at test.swift:13:8
   10       let op2 = BlockOperation{ sleep(1); results.append(2) }
   11       op1.addDependency(op2)
   12   
-> 13       queue.addOperations([op1, op2], waitUntilFinished: true)
   14   
   15       let expected = [2, 1]
   16       if results != expected {
Target 0: (test) stopped.
(lldb) bt
* thread #&#8203;1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x448)
  * frame #&#8203;0: 0x0000555555555b74 test`main at test.swift:13:8
    frame #&#8203;1: 0x00007ffff5e76b97 libc.so.6`__libc_start_main + 231
    frame #&#8203;2: 0x00005555555555ba test`_start + 42
(lldb) register read
General Purpose Registers:
       rax = 0x000055555556e940
       rbx = 0x0000000000000000
       rcx = 0x000055555556e230
       rdx = 0x000055555556ea90
       rdi = 0x000055555556ea70
       rsi = 0x0000000000000001
       rbp = 0x00007fffffffe590
       rsp = 0x00007fffffffe2e0
        r8 = 0x0000000000000001
        r9 = 0x0000000000000000
       r10 = 0x000055555555c010
       r11 = 0x0000000000000000
       r12 = 0x0000555555555590  test`_start
       r13 = 0x000055555556e230
       r14 = 0x0000000000000000
       r15 = 0x0000000000000000
       rip = 0x0000555555555b74  test`main + 756 at test.swift:13:8
    rflags = 0x0000000000010246
        cs = 0x0000000000000033
        fs = 0x0000000000000000
        gs = 0x0000000000000000
        ss = 0x000000000000002b
        ds = 0x0000000000000000
        es = 0x0000000000000000

(lldb) bt all
* thread #&#8203;1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x448)
  * frame #&#8203;0: 0x0000555555555b74 test`main at test.swift:13:8
    frame #&#8203;1: 0x00007ffff5e76b97 libc.so.6`__libc_start_main + 231
    frame #&#8203;2: 0x00005555555555ba test`_start + 42
  thread #&#8203;2, name = 'test'
    frame #&#8203;0: 0x00007ffff6e5e8c2 libpthread.so.0`do_futex_wait + 66
    frame #&#8203;1: 0x00007ffff6e5e9d3 libpthread.so.0`__new_sem_wait_slow + 163
    frame #&#8203;2: 0x00007ffff64884c5 libdispatch.so`_dispatch_sema4_timedwait + 85
    frame #&#8203;3: 0x00007ffff648089e libdispatch.so`_dispatch_semaphore_wait_slow + 30
    frame #&#8203;4: 0x00007ffff647fc1e libdispatch.so`_dispatch_worker_thread + 1166
    frame #&#8203;5: 0x00007ffff6e556db libpthread.so.0`start_thread + 219
    frame #&#8203;6: 0x00007ffff5f7688f libc.so.6`clone + 63

@weissi
Copy link
Member

weissi commented Apr 5, 2020

yes, heap UaF

root@3adc3cc4163c:/tmp# swiftc -g -sanitize=address test.swift  && ./test
/usr/bin/ld.gold: warning: Cannot export local symbol '__asan_extra_spill_area'
ok
=================================================================
==73==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000000040 at pc 0x55560a797c66 bp 0x7ffcf8ccfed0 sp 0x7ffcf8ccfec8
READ of size 8 at 0x611000000040 thread T0
    #&#8203;0 0x55560a797c65  (/tmp/test+0xc6c65)
    #&#8203;1 0x7f854e9c1b96  (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #&#8203;2 0x55560a6ef7a9  (/tmp/test+0x1e7a9)

0x611000000040 is located 0 bytes inside of 221-byte region [0x611000000040,0x61100000011d)
freed by thread T1 here:
    #&#8203;0 0x55560a76765d  (/tmp/test+0x9665d)
    #&#8203;1 0x7f85503a4dea  (/usr/lib/swift/linux/libswiftCore.so+0x3ccdea)
    #&#8203;2 0x7f85503a4dea  (/usr/lib/swift/linux/libswiftCore.so+0x3ccdea)

previously allocated by thread T0 here:
    #&#8203;0 0x55560a7678dd  (/tmp/test+0x968dd)
    #&#8203;1 0x7f85503a36b1  (/usr/lib/swift/linux/libswiftCore.so+0x3cb6b1)
    #&#8203;2 0x55560a7973b1  (/tmp/test+0xc63b1)
    #&#8203;3 0x7f854e9c1b96  (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

Thread T1 created by T0 here:
    #&#8203;0 0x55560a75207a  (/tmp/test+0x8107a)
    #&#8203;1 0x7f854f3e70a8  (/usr/lib/swift/linux/libdispatch.so+0x330a8)

SUMMARY: AddressSanitizer: heap-use-after-free (/tmp/test+0xc6c65) 
Shadow bytes around the buggy address:
  0x0c227fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c227fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c227fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c227fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c227fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c227fff8000: fa fa fa fa fa fa fa fa[fd]fd fd fd fd fd fd fd
  0x0c227fff8010: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8020: fd fd fd fd fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==73==ABORTING

@weissi
Copy link
Member

weissi commented Apr 5, 2020

TSan not adding more but detecting it too

root@3adc3cc4163c:/tmp# swiftc -g -sanitize=thread test.swift  && ./test
ok
ThreadSanitizer:DEADLYSIGNAL
==83==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000210 (pc 0x7ff67518882d bp 0x7b0c000005a0 sp 0x7ffe71cf8888 T83)
==83==The signal is caused by a WRITE memory access.
==83==Hint: address points to the zero page.
    #&#8203;0 _dispatch_lane_push :? (libdispatch.so+0x3282c)
    #&#8203;1 __interceptor_dispatch_async /home/buildnode/jenkins/workspace/oss-swift-5.2-package-linux-ubuntu-18_04/llvm-project/compiler-rt/lib/tsan/rtl/tsan_libdispatch.cc:219 (test+0xb7f3a)
    #&#8203;2 $s8Dispatch0A5QueueC4sync7executeyAA0A8WorkItemC_tFTm Queue.swift.o:? (libswiftDispatch.so+0x1d786)
    #&#8203;3 $s10Foundation14OperationQueueC9_scheduleyyF ??:? (libFoundation.so+0x49c605)
    #&#8203;4 $s10Foundation14OperationQueueC14_addOperations_7barrierySayAA0B0CG_SbtF ??:? (libFoundation.so+0x49dd9e)
    #&#8203;5 $s10Foundation14OperationQueueC13addOperations_17waitUntilFinishedySayAA0B0CG_SbtF ??:? (libFoundation.so+0x49de4f)
    #&#8203;6 main /tmp/test.swift:13 (test+0xbcc8b)
    #&#8203;7 __libc_start_main ??:? (libc.so.6+0x21b96)
    #&#8203;8 _start ??:? (test+0x21f49)

ThreadSanitizer can not provide additional info.
SUMMARY: ThreadSanitizer: SEGV :? in _dispatch_lane_push
==83==ABORTING

@weissi
Copy link
Member

weissi commented Apr 5, 2020

This is definitely a 5.2 regression. 5.1.5 seems to always run fine; ASan shows 5.1.5 leaks memory but nothing else; TSan is happy with 5.1.5. Although ASan/TSan is are not super meaningful for Foundation because Foundation isn't instrumented on Linux, nor is it usually inlined (which makes the compiler annotate it).

@Frizlab
Copy link
Contributor Author

Frizlab commented Apr 5, 2020

To be clear, it does not crash nor hang on 5.1.5, but the dependencies are not respected (it was bug SR-12138). On 5.2 the dependencies are respected, but we have the crash/hang.

@weissi
Copy link
Member

weissi commented Apr 5, 2020

I see, thanks for the clarification @Frizlab. We need to fix this ASAP. 5.2.2's merge window is also open right now so there's opportunity (https://forums.swift.org/t/development-open-for-swift-5-2-2-for-linux/35012).

@weissi
Copy link
Member

weissi commented Apr 5, 2020

master is just as broken as 5.2.1 🙁

@drexin
Copy link
Member

drexin commented Apr 6, 2020

I did some investigation and it seems like there are multiple issues. I think I tracked down the original use-after-free to this line:

https://github.com/apple/swift-corelibs-foundation/blob/swift-5.2-branch/Foundation/Operation.swift#L305

The problem is that we are releasing the object without retaining it before.

When fixing that, I run into another SIGSEGV here: https://github.com/apple/swift-corelibs-foundation/blob/swift-5.2-branch/Foundation/Operation.swift#L1243

Removing the code that releases the operation object at https://github.com/apple/swift-corelibs-foundation/blob/swift-5.2-branch/Foundation/Operation.swift#L919 fixes that segfault, but leaves us with leaked memory (not surprising, because we are not releasing the operation object at all anymore) and the deadlock. I don't have more time right now, but I hope to get back to it later.

@drexin
Copy link
Member

drexin commented Apr 6, 2020

So I think the problems were introduced in this commit: 2d86bb7

It adds a lot of usages of Unmanaged and this usage for example would cause an unbalanced release: https://github.com/apple/swift-corelibs-foundation/blob/swift-5.2-branch/Foundation/Operation.swift#L125-L129

The naming in the Unmanaged API makes it easy to make this mistake, as it sounds like it would retain the value before returning it, but in reality it consumes an unbalanced retain.

So even thought this commit was part of the 5.1 release, I think that the problems have not surfaced before, because the dependencies were broken. So the commit that fixed dependencies was likely the trigger for the problems we are seeing now.

@weissi
Copy link
Member

weissi commented Apr 6, 2020

Thanks so much for this investigation @drexin

@Frizlab
Copy link
Contributor Author

Frizlab commented Nov 26, 2020

Any updates on this? It seems to still fail on Swift 5.3.1 (hanged for me w/ the official swift:5.3.1 image).

@weissi
Copy link
Member

weissi commented Mar 10, 2021

Still not fixed on 5.3.3

$ jw-docker-swift-5.3 bash -c 'while swift test2.swift; do true; done'
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
ok
/usr/bin/swift[0x51fa1c4]
/usr/bin/swift[0x51f7dbe]
/usr/bin/swift[0x51fa49c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7fbf73fcb980]
/usr/lib/swift/linux/libswiftCore.so(+0x3fe588)[0x7fbf704d6588]
/usr/lib/swift/linux/libFoundation.so($s10Foundation14BlockOperationCfD+0x21)[0x7fbf696b7f31]
/usr/lib/swift/linux/libswiftCore.so(+0x3fe58b)[0x7fbf704d658b]
/usr/lib/swift/linux/libFoundation.so(+0x474806)[0x7fbf696bf806]
ok
ok
ok
ok
ok
ok

occasionally crashes.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
@Frizlab
Copy link
Contributor Author

Frizlab commented Aug 15, 2023

I cannot reproduce this anymore using the infinite loop in swift:5.8.1, but as I have also failed to reproduced it in swift:5.3.3, I’ll leave the issue open.

@Frizlab
Copy link
Contributor Author

Frizlab commented Aug 15, 2023

I have reproduced the issue in swift:5.3.3 after all, and not in swift:5.8.1 (it’s been running for ~ more than an hour now).
I’ll go ahead and close the issue.

@Frizlab Frizlab closed this as completed Aug 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants