[SR-12079] libdispatch SEGFAULT when using Foundation's FileHandle readabilityHandler on lots of pipes #3276

Open
weissi opened this issue Jan 24, 2020 · 7 comments

Comments

@weissi
Member

weissi commented Jan 24, 2020

Previous ID SR-12079
Radar rdar://problem/58997211
Original Reporter @weissi
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 1
Component/s Foundation
Labels Bug, Crash
Assignee None
Priority Medium

md5: 507eb4e7b2c1f88b3f62677d14f5efd3

Issue Description:

The following program repeatedly opens 1000 Pipes (sequentially), sets up a readabilityHandler for each pipe, closes the write end of each pipe, and then waits for every readabilityHandler to deliver an empty Data, which signals EOF.

This works for a while, until we get a SEGFAULT in dispatch here:

(lldb) run
......................Process 24 stopped
* thread #2, name = 'File', stop reason = signal SIGSEGV: invalid address (fault address: 0xffff800013ff70b7)
    frame #0: 0x00007ffff60b516a libdispatch.so`_dispatch_event_loop_drain + 1242
libdispatch.so`_dispatch_event_loop_drain:
->  0x7ffff60b516a <+1242>: movl   0x8(%rcx), %edx
    0x7ffff60b516d <+1245>: cmpl   $0x7fffffff, %edx         ; imm = 0x7FFFFFFF
    0x7ffff60b5173 <+1251>: je     0x7ffff60b5187            ; <+1271>
    0x7ffff60b5175 <+1253>: movl   $0x2, %edx
Target 0: (File) stopped.

[...]

* thread #2, name = 'File', stop reason = signal SIGSEGV: invalid address (fault address: 0xffff800013ff70b7)
  * frame #0: 0x00007ffff60b516a libdispatch.so`_dispatch_event_loop_drain + 1242
    frame #1: 0x00007ffff60a8ea2 libdispatch.so`_dispatch_mgr_invoke + 146
    frame #2: 0x00007ffff60a8dee libdispatch.so`_dispatch_mgr_thread + 126
    frame #3: 0x00007ffff60ac993 libdispatch.so`_dispatch_worker_thread + 515
    frame #4: 0x00007ffff6a826db libpthread.so.0`start_thread + 219
    frame #5: 0x00007ffff5ba388f libc.so.6`clone + 63

Please note that the fault address is 0xffff800013ff70b7. Given that the pointer value ends in 7 and we're looking at a 32-bit load, this looks like memory corruption: an unaligned 32-bit load is unlikely to be how the code is meant to work.
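
To make the alignment argument concrete, here is a minimal sketch that checks the reported fault address against the 4-byte width of the faulting `movl`. The variable names are illustrative only and nothing here comes from libdispatch itself.

```swift
// Fault address copied from the lldb output above.
let faultAddress: UInt64 = 0xffff_8000_13ff_70b7

// The faulting instruction is `movl 0x8(%rcx), %edx`, i.e. a 4-byte load.
let loadSize: UInt64 = 4

// Offset of the address within a 4-byte slot; 0 would mean naturally aligned.
let misalignment = faultAddress % loadSize
let isAligned = misalignment == 0

print("misalignment:", misalignment, "aligned:", isAligned)
// prints "misalignment: 3 aligned: false"
```

A naturally aligned 32-bit field would give a remainder of 0, so a remainder of 3 suggests that %rcx did not hold a valid object pointer at the time of the load.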

Credit to rignatus (JIRA User) for finding the initial issue that led to us debugging this together.

Repro:

On Linux

swiftc -O File.swift && ./File

With lldb in Docker (runnable from macOS)

# assuming File is in .
docker run --privileged --rm -v "$PWD:$PWD" -w "$PWD" -it norionomura/swift:nightly  bash -c 'swiftc -O File.swift && lldb --batch -o run -k "image list" -k "register read" -k "bt all" -k "exit 134" ./File'

The program, reproduced here:

import Foundation

for i in 1..<1_000_000 {
    fputs(".", stdout)
    fflush(stdout)
    if i % 100 == 0 {
        fputs("\n", stdout)
    }
    let g = DispatchGroup()
    let ps = (0..<1000).map { _ in Pipe() }
    ps.forEach { p in
        var numberOfCalls = 0
        g.enter()
        p.fileHandleForReading.readabilityHandler = { handle in
            if handle.availableData.isEmpty {
                numberOfCalls += 1
                precondition(numberOfCalls == 1)
                g.leave()
            }
        }
    }
    ps.forEach { p in
        try! p.fileHandleForWriting.close()
    }
    g.wait()
}
@weissi
Member Author

weissi commented Jan 24, 2020

CC phabouzit (JIRA User)/@millenomi/@spevans

@weissi
Member Author

weissi commented Jan 24, 2020

@swift-ci create

@weissi
Member Author

weissi commented Jan 24, 2020

   docker run --privileged --rm -v "$PWD:$PWD" -w "$PWD" -it norionomura/swift:nightly  bash -c 'apt-get update && apt-get install strace && swiftc -O File.swift && strace -e trace=epoll_wait,epoll_ctl -ff ./File'

leads to the following output, which is interesting:

[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2994, NULL) = -1 ENOENT (No such file or directory)
[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 2995, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2995, NULL) = -1 ENOENT (No such file or directory)
[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 2996, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2996, NULL) = -1 ENOENT (No such file or directory)
[pid   236] epoll_wait(2004, [{EPOLLHUP, {u32=1677787328, u64=140567367450816}}, {EPOLLHUP, {u32=1677787392, u64=140567367450880}}, {EPOLLHUP, {u32=1677787456, u64=140567367450944}}, {EPOLLHUP, {u32=1677787520, u64=140567367451008}}, {EPOLLHUP, {u32=1677787584, u64=140567367451072}}, {EPOLLHUP, {u32=1677787648, u64=140567367451136}}, {EPOLLHUP, {u32=1677787712, u64=140567367451200}}, {EPOLLHUP, {u32=1677787776, u64=140567367451264}}], 16, -1) = 8
[pid   237] epoll_ctl(2004, EPOLL_CTL_DEL, 2997, NULL <unfinished ...>
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2997, NULL <unfinished ...>
[pid   237] <... epoll_ctl resumed> )   = 0
[pid   236] <... epoll_ctl resumed> )   = -1 ENOENT (No such file or directory)
[pid   238] epoll_ctl(2004, EPOLL_CTL_DEL, 2998, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2998, NULL) = -1 ENOENT (No such file or directory)
[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 2999, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2999, NULL) = -1 ENOENT (No such file or directory)
[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 3000, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 3000, NULL) = -1 ENOENT (No such file or directory)
[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 3001, NULL) = 0
[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 3001, NULL) = -1 ENOENT (No such file or directory)
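
One pattern visible in the trace is that the same file descriptor gets an EPOLL_CTL_DEL from two different threads, with the second call failing with ENOENT. As a rough aid for spotting this in a longer trace, the sketch below groups completed `epoll_ctl` lines by file descriptor. The `Deletion` type and `parse` helper are hypothetical, the parsing only handles the single-line format shown above (not the `<unfinished ...>`/`resumed` pairs), and whether the duplicate deletion is related to the crash is not established here.

```swift
import Foundation

// A few lines copied verbatim from the strace output above.
let lines = [
    "[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 2995, NULL) = 0",
    "[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2995, NULL) = -1 ENOENT (No such file or directory)",
    "[pid   239] epoll_ctl(2004, EPOLL_CTL_DEL, 2996, NULL) = 0",
    "[pid   236] epoll_ctl(2004, EPOLL_CTL_DEL, 2996, NULL) = -1 ENOENT (No such file or directory)",
]

// Hypothetical record for one completed EPOLL_CTL_DEL call.
struct Deletion {
    let pid: Int
    let fd: Int
    let succeeded: Bool
}

// Parses a completed, single-line EPOLL_CTL_DEL entry; returns nil otherwise.
func parse(_ line: String) -> Deletion? {
    guard line.contains("EPOLL_CTL_DEL"), line.contains(") = ") else { return nil }
    // Integers appear in the order: pid, epoll fd, target fd, return value.
    let numbers = line.split(whereSeparator: { !$0.isNumber }).compactMap { Int($0) }
    guard numbers.count >= 4 else { return nil }
    return Deletion(pid: numbers[0], fd: numbers[2], succeeded: line.hasSuffix(") = 0"))
}

// Group the deletions by target file descriptor.
var deletionsByFD: [Int: [Deletion]] = [:]
for deletion in lines.compactMap(parse) {
    deletionsByFD[deletion.fd, default: []].append(deletion)
}

// File descriptors that more than one thread tried to remove from the epoll set.
let doubleDeleted = deletionsByFD
    .filter { Set($0.value.map { $0.pid }).count > 1 }
    .keys
    .sorted()

print(doubleDeleted)
// prints "[2995, 2996]"
```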

@beccadax
Contributor

@swift-ci create

@weissi
Member Author

weissi commented Jan 28, 2020

@swift-ci create

@weissi
Member Author

weissi commented Apr 13, 2021

Still happening on main:

$ jw-docker-swift-main bash -c 'swiftc -O test.swift && lldb --batch -o run -k "image list" -k "register read" -k "bt all" -k "exit 134" ./test'
docker.io/swiftlang/swift:nightly-main-bionic
(lldb) target create "./test"
Current executable set to '/tmp/test' (x86_64).
(lldb) run
.............Process 25 stopped
* thread #2, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0xffff800013fe1a17)
    frame #0: 0x00007ffff649e53a libdispatch.so`_dispatch_event_loop_drain + 1130
libdispatch.so`_dispatch_event_loop_drain:
->  0x7ffff649e53a <+1130>: movl   0x8(%rcx), %edx
    0x7ffff649e53d <+1133>: cmpl   $0x7fffffff, %edx         ; imm = 0x7FFFFFFF 
    0x7ffff649e543 <+1139>: je     0x7ffff649e557            ; <+1159>
    0x7ffff649e545 <+1141>: movl   $0x2, %edx
Target 0: (test) stopped.

Process 25 launched: '/tmp/test' (x86_64)

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@weissi
Member Author

weissi commented Apr 25, 2022

Still not fixed in either 5.6 or main.

$ jw-docker-swift-5.6 bash -c 'swiftc -O test.swift && lldb --batch -o run -k "image list" -k "register read" -k "bt all" -k "exit 134" ./test'
Unable to find image 'swift:5.6-focal' locally
5.6-focal: Pulling from library/swift
8e5c1b329fe3: Pull complete 
f82f1f6ea714: Pull complete 
66034e6bd3e5: Pull complete 
68bf25c86869: Pull complete 
Digest: sha256:c3e9e80db696df7aec4daa6a942b756b86772294996ef3ecc932d05253fb7b04
Status: Downloaded newer image for swift:5.6-focal
(lldb) target create "./test"
Current executable set to '/tmp/test' (x86_64).
(lldb) run
....................................................................................................
....................................................................................................
..........................................................Process 45 stopped
* thread #2, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0xffff80000fff5bb7)
    frame #0: 0x00007ffff6ebcd6a libdispatch.so`_dispatch_event_loop_drain + 1130
libdispatch.so`_dispatch_event_loop_drain:
->  0x7ffff6ebcd6a <+1130>: movl   0x8(%rcx), %edx
    0x7ffff6ebcd6d <+1133>: cmpl   $0x7fffffff, %edx         ; imm = 0x7FFFFFFF 
    0x7ffff6ebcd73 <+1139>: je     0x7ffff6ebcd87            ; <+1159>
    0x7ffff6ebcd75 <+1141>: movl   $0x2, %edx
Target 0: (test) stopped.
Process 45 launched: '/tmp/test' (x86_64)

$ jw-docker-swift-main bash -c 'swiftc -O test.swift && lldb --batch -o run -k "image list" -k "register read" -k "bt all" -k "exit 134" ./test'
docker.io/swiftlang/swift:nightly-main
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'lldb'
(lldb) target create "./test"
Current executable set to '/tmp/test' (x86_64).
(lldb) run
....................................................................................................
....................................................................................................
....................................................................................................
....................................................................................................
.........................................................................................Process 44 stopped
* thread #2, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0xffff800013fe74f7)
    frame #0: 0x00007ffff62e7fca libdispatch.so`_dispatch_event_loop_drain + 1130
libdispatch.so`_dispatch_event_loop_drain:
->  0x7ffff62e7fca <+1130>: movl   0x8(%rcx), %edx
    0x7ffff62e7fcd <+1133>: cmpl   $0x7fffffff, %edx         ; imm = 0x7FFFFFFF 
    0x7ffff62e7fd3 <+1139>: je     0x7ffff62e7fe7            ; <+1159>
    0x7ffff62e7fd5 <+1141>: movl   $0x2, %edx
Target 0: (test) stopped.
Process 44 launched: '/tmp/test' (x86_64)

@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022