Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-14920] CFRunLoop reads 8 bytes from each file descriptor it listens on Linux #3953

Open
swift-ci opened this issue Jul 14, 2021 · 9 comments

Comments

@swift-ci
Copy link
Contributor

Previous ID SR-14920
Radar rdar://problem/80820393
Original Reporter PALKOVNIK (JIRA User)
Type Bug

Attachment: Download

Environment

Any swift version. I've noticed the issue in swift 5.1 or something like that, but based on the code it was present since the original support of Linux's epoll was introduced in CFRunLoop.

Additional Detail from JIRA
Votes 1
Component/s Foundation
Labels Bug
Assignee None
Priority Medium

md5: 5d0bc6cdfdf95adc92e299ec86b5ecdc

Issue Description:

CFRunLoop on Linux is using epoll on order to wait for events on "ports", which are just file descriptors on Linux. This includes the port used to wake up run loop. In order to tell run loop that it's awoken the code writes value of `1` as 8 byte integer into the wake up file descriptor and to "reset" the file descriptor state the code reads same 8 byte integer from it. The issue is in the fact that client code can register custom CFRunLoopSource with run loop and the code in run loop itself would read the same 8 bytes from the underlying file descriptor as if it would the wake up file descriptor.
I've prepared a very small sample project to reproduce the issue (it's a tiny part of my pet project) that can be found in attached archive. Jut do `swift run` on any Linux machine with `libx11-dev` installed and click few buttons on the keyboard in the window created by the app. xlib will abort execution saying
```
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
```
xlib is capable to understand that there's incorrect amount of data read from it's connection file descriptor and abort execution

@typesanitizer
Copy link

cc @millenomi ? (Not sure who the right person to look at this is)

@millenomi
Copy link
Contributor

@swift-ci create

@millenomi
Copy link
Contributor

Really tempted to just outright say that if you're not XCTest, you should really not use CFRunLoop (and you already can't outside of Linux).

@swift-ci
Copy link
Contributor Author

Comment by Serhii Mumriak (JIRA)

hello @millenomi,
Thanks for fast response!
I've started a [forum tread](https://forums.swift.org/t/cfrunloop-reading-8-bytes-from-any-signalled-file-descriptor-on-linux/50431) to discuss this.
Tl;DR: i'm not looking into using CFRunLoop per say
I am building a UI framework on Linux as a pet project for last ~18 months and I am using Foundation's RunLoop to in main thread as a way to process events. Originally I uxed Xlib to handle events from X11 windowing system. Xlib exposes a file descriptor for a given display connection (so as xcb, which I am looking to adopt instead of Xlib now) to notify interested client about available changes on the display. Originally I tried to hook it up into RunLoop via adding a CFRunLoopSource created from CFRunLoopSourceContext1 with this file descriptor. This is when I've found this issue. To workaround it I've made a background thread with hot loop which would only epoll_wait on this fd and it would signal another eventfd. And latter eventfd is the one I added to the RunLoop's CFRunLoop on main thread essentially replicating internal behavior of CRunLoop.
At some point I've discovered that Foundation team is very nice and ported the FileHandle to linux which allows achieve exactly same thing via `waitForDataInBackgroundAndNotify` which allowed me to almost completely remove CFRunLoop reference from my code (I've found your post where you ask how people use CFRunLoop on Linux and commented on it too).
But (there's always a but) I am now looking into adopting Wayland display protocol and one of the things that is required there is to use libinput. libinput also exposes the file descriptor with the same purpose: notify client about the change. Tho it comes with limitation (quote from it's [documentation](https://wayland.freedesktop.org/libinput/doc/latest/api/)):
> libinput exposes a single file descriptor to the caller. This file descriptor should be monitored by the caller, whenever data is available the caller must immediately call libinput_dispatch(). Failure to do so will result in erroneous behavior.
>
> libinput_dispatch() may result in one or more events being available to the caller. After libinput_dispatch() a caller should call libinput_get_event() to retrieve and process this event. Whenever libinput_get_event() returns NULL, no further events are available.

The even more interesting parts come further: majority (if not all, can't say for sure) libraries that do any I/O expose the file descriptor that is not always eventfd (some are pipes, some are timers, some are sockets, etc). Particularly, UI toolkits like GTK/GDK, Qt, weston (as reference wayland compositor) and others.
Given the fact that "everything is a file" is still a core concept philosophy on Linux it would be nice to have similar abstraction like NSMachPort but for file descriptors that could be added to RunLoop without messing with CFRunLoop.

I also understand (and really very welcome) the async coroutines direction of swift and do realize that anyone can literally observe file descriptors via DispatchSource on background queue. Tho for some tasks, especially UI handling, this might not be the most efficient way to do things. Particularly, in Xlib (and xcb for that matter) it's very discouraged to have multithread processing of events and sending requests to X11 server. And doing some thing asynchronously (like reacting to map event or expose event) might (and will) introduce visible jitter and lag.

That said, Chriss Lattner in his swift concurrency manifesto suggest having ability to opt-out from the async nature of code for system libraries or frameworks and this is exactly this case for me.

(That's a huge wall of text, thanks for reading to this point!)

With all that said, there is some work that I've started doing:

  • I've changed the CFRunLoop code a bit [here](smumriak@26410a68bcda04541133ddb758d0922b35929345) so it would perform the `read` only on file descriptors that semantically are owned by CRRunLoop. Those are the awake fd, timer fd and fd that GCD creates to integrate main queue processing.

  • I've started implementation of something called CFFileDescriptor which semantically is similar to CFMachPort, but it allows for the client of the API to perform actual reading of data from file descriptor. Unfortunately unlike Darwin platforms, Linux does not have consolidated notion of "message", which limits ability of CFRunLoop to read "data" from client fd's because it literally can not know how much of data it should read.

  • I'd like to make the FileDescriptor abstract entity in Foundation that would provide public interface to this CFFileDescriptor, with specialized types like EventFileDescriptor, TimerFileDescriptor (not sure that specialization is good idea tho, might be better to have abstract interface only) since the polling on fd is such a common work to be done on Linux.

Regarding other platforms. In my specific case I am looking into expanding the UI framework to windows, where concept of "handles" is very similar to file descriptor. CFRunLoop already correctly works with those (as far as i can tell from my limited knowledge for win32 API), especially with regard of calling `ResetEvent` only for wakeUpPort handle.

What do you think about that? I am really looking for feedback 🙂

PS. Does it feel that I just like RunLoop too much and it's an outdated technology now? I am so used to RunLoop as a thing on Darwing platforms

@swift-ci
Copy link
Contributor Author

Comment by Serhii Mumriak (JIRA)

Omg, I'm so dumb. I don't want CFMachPort equivalent, I want this: https://developer.apple.com/documentation/corefoundation/cffiledescriptor?language=objc
I did not even know about this class existence. It does exactly what I want and it's CF type ID already lives in swift's CF:
```
You use CFFileDescriptor to monitor file descriptors for read and write activity via CFRunLoop using callbacks. Each call back is one-shot, and must be re-enabled if you want to get another one.

You can re-enable the callback in the callback function itself, but you must completely service the file descriptor before doing so.
```

@swift-ci
Copy link
Contributor Author

Comment by Serhii Mumriak (JIRA)

Hey @millenomi I've ported CFFileDescriptor on Linux and implemented a swift wrapper for that in Foundation (actually spent like ~2 weeks on that without knowing about CFFileDescriptor on Darwin). Right now I have only three tests for that thing, but it requires much more unit testing to be added. Especially considering that there's a bug in kernel that prevents nested epoll fd's to be monitored in edge-triggered mode. Because of that the "one shot" behavior can not be implemented, so i had to do a workaround. I'm having a lot of fun with that 🙂
Worth to note that I've fixed this issue itself, I'm going to send a pull request if that's alright.

@mickeyl
Copy link

mickeyl commented Oct 13, 2021

PALKOVNIK (JIRA User) Could you perhaps share your CFFileDescriptor implementation? Due to severe bugs in NSFileHandle.read, I‘m afraid I need to implement reading myself and CFFileDescriptor would be very important since I also have a NSRunloop to integrate with.

@swift-ci
Copy link
Contributor Author

Comment by Serhii Mumriak (JIRA)

Hello @mickeyl,
Yes, will do. I've also added few test cases and whatnot. The only thing is turns out there's a bug in linux kernel with nested epoll file descriptors, so i had to make a workaround.
I'll publish a pull request tomorrow and will ping you

@swift-ci
Copy link
Contributor Author

swift-ci commented Nov 4, 2021

Comment by Serhii Mumriak (JIRA)

hey @mickeyl, it took me some time to get legal approval to participate in opensource project from current employer. Here's the PR with what I did: #3106

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants