Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-12080] readabilityHandler on pipe that is Process.standardError sometimes doesn't get called on EOF #3275

Open
weissi opened this issue Jan 24, 2020 · 9 comments

Comments

@weissi
Copy link
Member

weissi commented Jan 24, 2020

Previous ID SR-12080
Radar rdar://problem/58997201
Original Reporter @weissi
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 1
Component/s Foundation
Labels Bug, Linux
Assignee None
Priority Medium

md5: cd028eabf8a3fd19f398f6e16e95a8e4

Issue Description:

description

The following program just repeatedly spawns /usr/bin/env and waits for both the stdout & stderr pipes to call the readabilityHandler with a handle.availableData.isEmpty (which means EOF).

On macOS it works just fine, on Linux it usually gets stuck fairly soon and it's always missing the call with handle.availableData.isEmpty of stderr. We've never seen an invocation where the readabilityHandler wasn't called for stdout.

This affects at least Swift 5.1, 5.2, and the latest master snapshot.

Credit to rignatus (JIRA User) for finding the initial issue that led to us debugging this together.

program

import Foundation

for i in 1..<1_000_000 {
    fputs(".", stdout)
    fflush(stdout)
    if i % 10 == 0 {
        fputs("\n", stdout)
    }

    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    let g = DispatchGroup()
    
    let stdoutPipe = Pipe()
    let stderrPipe = Pipe()
    process.standardOutput = stdoutPipe
    process.standardError = stderrPipe
    g.enter()
    g.enter()
    var numberOfCallsOut = 0
    var numberOfCallsErr = 0
    stderrPipe.fileHandleForReading.readabilityHandler = { handle in
        if handle.availableData.isEmpty {
            fputs("o", stdout)
            fflush(stdout)
            numberOfCallsOut += 1
            //precondition(numberOfCalls == 1, "numberOfCalls: \(numberOfCalls)")
            if numberOfCallsOut == 1 { g.leave() }
        }
    }
    stdoutPipe.fileHandleForReading.readabilityHandler = { handle in
        if handle.availableData.isEmpty {
            fputs("e", stdout)
            fflush(stdout)
            numberOfCallsErr += 1
            //precondition(numberOfCalls == 1, "numberOfCalls: \(numberOfCalls)")
            if numberOfCallsErr == 1 { g.leave() }
        }
    }

    try! process.run()
    g.wait()
}

analysis

We did check the following things:

  • the process actually exited

  • the write end of the pipe is not open anymore (we checked by attaching lldb and calling read on it which promptly returned 0. We also checked ls -la /proc//fd/ and lsof -p PID)

repro

on Linux

noformat
swift File.swift && ./File
noformat

on macOS (via docker)

# assuming File.swift is in the working directory
docker run --privileged --rm -v "$PWD:$PWD" -w "$PWD" -it norionomura/swift:nightly bash -c 'swiftc File.swift && ./File'
@weissi
Copy link
Member Author

weissi commented Jan 24, 2020

CC @spevans/@millenomi

@weissi
Copy link
Member Author

weissi commented Jan 24, 2020

@swift-ci create

1 similar comment
@weissi
Copy link
Member Author

weissi commented Jan 28, 2020

@swift-ci create

@spevans
Copy link
Collaborator

spevans commented Feb 5, 2020

I dont think this is a bug.

From the documentation of FileHandle.readabilityHandler @ https://developer.apple.com/documentation/foundation/filehandle/1412413-readabilityhandler
it states thats the block is executed when there is data to read. However it doesnt say that the block will be executed on EOF condition.

You may sometimes see an extra call of the block when there is no data left to read but it is not guaranteed to happen.

Infact, if the test is modified slightly with some extra debugging:

import Foundation

let process = Process()
process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
let g = DispatchGroup()

let stdoutPipe = Pipe()
let stderrPipe = Pipe()
process.standardOutput = stdoutPipe
process.standardError = stderrPipe
g.enter()
g.enter()
var numberOfCallsOut = 0
var numberOfCallsErr = 0
stderrPipe.fileHandleForReading.readabilityHandler = { handle in
    let data = handle.availableData
    if data.isEmpty {
        print("stderr EOF")
        numberOfCallsOut += 1
        if numberOfCallsOut == 1 { g.leave() }
    } else {
        print("stderr: \(data)")
    }
}
stdoutPipe.fileHandleForReading.readabilityHandler = { handle in
    let data = handle.availableData
    if data.isEmpty {
        print("stdout EOF")
        numberOfCallsErr += 1
        if numberOfCallsErr == 1 { g.leave() }
    } else {
        print("stdout: \(data)")
    }
}

try! process.run()
fputs("!", stdout)
fflush(stdout)

g.wait()
print("numberOfCallsOut:", numberOfCallsOut)
print("numberOfCallsErr:", numberOfCallsErr)

On macOS it gives the following output:

 swift ./sr-12080.swift
!stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stdout: 1104 bytes
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF
numberOfCallsOut: 17
numberOfCallsErr: 7
stderr EOF
stdout EOF
stderr EOF
stderr EOF
stdout EOF
stderr EOF
stdout EOF

[...repeats until contexts returns to the main thread and it exits...]

Where it sees an EOF straight away. If anything is a bug here its probably that the handler gets called too many times on macOS when there is nothing to read. I think it only 'works' on macOS becuase there are so many events one of them is going to have no data available.

I think the correct solution is to use process.waitUntilExit() to know that the sub-process has finished and there is no more data.

@spevans
Copy link
Collaborator

spevans commented Feb 5, 2020

On Linux the output is:

swift-DEVELOPMENT-SNAPSHOT-2020-01-30-a-ubuntu18.04/usr/bin/swift swift-corelibs-foundation/sr-12080.swift
!stderr EOF
stdout: 2551 bytes
[...hangs...]
^C
 

So it hangs as it doesnt see an EOF on stdout but I think this is correct and trying to detect EOF this was isnt correct.

@swift-ci
Copy link
Contributor

swift-ci commented Feb 5, 2020

Comment by Raul Ignatus (JIRA)

Hi Simon,

You are right that this is the expected behaviour. We have assumed that because the docs for `availableData` say that it will be empty on EOF (https://developer.apple.com/documentation/foundation/filehandle/1411463-availabledata), the readabilityHandler will be called on EOF too. I think this can now be closed.

Thanks

@weissi
Copy link
Member Author

weissi commented Feb 5, 2020

Agreed! Pinging @millenomi to confirm that the readabilityHandler is not necessarily expected to fire on EOF.

@abertelrud
Copy link

Just ran into this myself, and I think I've discovered a bit of a pattern, at least for my case.

In my case the problem with the lack of a zero-length callback happened on `stdout`, and in particular it was because the subprocess closed its `stdout` before exiting. When I changed that to leave it open, then the zero-length callback consistently occurred. Not sure if that sheds some light but might be another data point.

Could it be that in the original example, `stderr` was closed before the process exit but `stdout` wasn't and that's why it would always hang?

> I think the correct solution is to use process.waitUntilExit() to know that the sub-process has finished and there is no more data.

Unfortunately this doesn't work as far as I can tell, since the last data on the pipe can arrive after this call returns. There is no guarantee in the documentation that this method waits for all data to be received.

So I thought the standard pattern here was to have a DispatchGroup with an `enter()` call for each asynchronous file handle handler and also for the termination of the process, and then a matching `leave()` call for each EOF condition seen on a file handle as well as when the termination handler is called.

If there is no guarantee about seeing a zero-length data as an indicator of EOF, then what's the best pattern for ensuring that all data has been received in addition to the process exiting?

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
@beefon
Copy link

beefon commented Dec 16, 2022

Indeed, there is a race between Process termination and the data to be delivered through its pipes. The behaviour clearly differs on macOS and Linux: on macOS there is a final call with empty data, on Linux there isn't.

I double the last question for advice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants