Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-6420] Dispatch concurrent queues leak CoreFoundation TSDs #668

Closed
Lukasa opened this issue Nov 17, 2017 · 9 comments
Closed

[SR-6420] Dispatch concurrent queues leak CoreFoundation TSDs #668

Lukasa opened this issue Nov 17, 2017 · 9 comments

Comments

@Lukasa
Copy link
Contributor

Lukasa commented Nov 17, 2017

Previous ID SR-6420
Radar None
Original Reporter @Lukasa
Type Bug
Status Closed
Resolution Done

Attachment: Download

Environment

Swift version 4.0.2 (swift-4.0.2-RELEASE)
Distributor ID: Ubuntu
Description: Ubuntu 14.04.1 LTS
Release: 14.04
Codename: trusty

Additional Detail from JIRA
Votes 1
Component/s Foundation, libdispatch
Labels Bug
Assignee @alblue
Priority Medium

md5: 5a92d9c07669127110266896a5ed6515

Issue Description:

When using dispatch concurrent queues on Linux, if any of the code in the blocks dispatched to that queue cause CoreFoundation to allocate a TSD, that TSD will be leaked.

To see this, compile the attached C program. This C program creates 100 blocks that are dispatched to a concurrent queue, each of which creates a CFString. The program runs, then waits 40 seconds after all of the queue blocks have executed to allow libdispatch to tear down all its threads (if you observe the execution of this program in htop's tree mode, you'll see a large number of threads be spawned and then exit).

If you run this program using valgrind (e.g. using valgrind --leak-check=full ./a.out), you'll find that all of the Foundation TSDs that were allocated for the background threads have been leaked:

==2956== 80,088 bytes in 71 blocks are definitely lost in loss record 18 of 18
==2956==    at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2956==    by 0x5050C21: _CFGetTSDCreateIfNeeded (in /usr/lib/swift/linux/libFoundation.so)
==2956==    by 0x50D656F: __CFStringCreateImmutableFunnel3 (in /usr/lib/swift/linux/libFoundation.so)
==2956==    by 0x50D70FF: CFStringCreateWithBytesNoCopy (in /usr/lib/swift/linux/libFoundation.so)
==2956==    by 0x400C10: __main_block_invoke (in /code/a.out)
==2956==    by 0x405E426: _dispatch_call_block_and_release (in /usr/lib/swift/linux/libdispatch.so)
==2956==    by 0x4068DED: _dispatch_continuation_pop (in /usr/lib/swift/linux/libdispatch.so)
==2956==    by 0x40683B4: _dispatch_async_redirect_invoke (in /usr/lib/swift/linux/libdispatch.so)
==2956==    by 0x406D267: _dispatch_worker_thread (in /usr/lib/swift/linux/libdispatch.so)
==2956==    by 0x5AAA183: start_thread (pthread_create.c:312)
==2956==    by 0x5DBDFFC: clone (clone.S:111)
@Lukasa
Copy link
Contributor Author

Lukasa commented Nov 17, 2017

Updated this with a slightly smaller repro that prefers CFStringCreateWithBytes instead of CFStringCreateWithBytesNoCopy, as the use of the more complex function would raise some questions that aren't needed.

@Lukasa
Copy link
Contributor Author

Lukasa commented Nov 17, 2017

This leak also reproduces in Swift 4.0.0, so it's not new to 4.0.2.

@Lukasa
Copy link
Contributor Author

Lukasa commented Nov 17, 2017

This leak does seem to depend on the number of concurrent blocks. If we dropped below 10 blocks the leak vanished. For 10 blocks we leaked 1 TSD. For 11 blocks we leaked 2. I can't suggest this linear pattern goes the whole way though, because for 100 blocks we leaked 71 blocks, but there is certainly some sub-linear growth here.

I have no particularly compelling theory for why 10 is the magic number, though I have a few wild hypotheses.

@alblue
Copy link

alblue commented Nov 17, 2017

I think the issue is we're calling `pthread_create_key` too many times.

Each thread we're calling invokes this bit of code:

https://github.com/apple/swift-corelibs-foundation/blob/abbbbc39699ffca725ae3b10171081918f25bd9a/CoreFoundation/Base.subproj/CFPlatform.c#L661

which is calling this bit of code:

https://github.com/apple/swift-corelibs-foundation/blob/abbbbc39699ffca725ae3b10171081918f25bd9a/CoreFoundation/Base.subproj/CFPlatform.c#L584-L586

which in turn calls pthread_create_key. I think this should be done once per process, not once per thread. From the man page:

https://linux.die.net/man/3/pthread_key_create

static pthread_key_t key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void
make_key()
{
(void) pthread_key_create(&key, NULL);
}

...
(void) pthread_once(&key_once, make_key);
if ((ptr = pthread_getspecific(key)) == NULL) {
...

The key (ha) here is that the make_key must only be called once. I think when we end up calling it from a subsequent thread, we're blowing away some kind of in-memory data structure that's used to hold the thread local variables, effectively meaning the table that we're creating is disappearing into the ether.

@alblue
Copy link

alblue commented Nov 17, 2017

Note also in the linked man page the section "It is the responsibility of the programmer to ensure that it is called exactly once per key before use of the key":

"There were requests to make pthread_key_create() idempotent with respect to a given key address parameter. This would allow applications to call pthread_key_create() multiple times for a given key address and be guaranteed that only one key would be created. Doing so would require the key value to be previously initialized (possibly at compile time) to a known null value and would require that implicit mutual-exclusion be performed based on the address and contents of the key parameter in order to guarantee that exactly one key would be created.

Unfortunately, the implicit mutual-exclusion would not be limited to only pthread_key_create(). On many implementations, implicit mutual-exclusion would also have to be performed by pthread_getspecific() and pthread_setspecific() in order to guard against using incompletely stored or not-yet-visible key values. This could significantly increase the cost of important operations, particularly pthread_getspecific().

Thus, this proposal was rejected. The pthread_key_create() function performs no implicit synchronization. It is the responsibility of the programmer to ensure that it is called exactly once per key before use of the key. Several straightforward mechanisms can already be used to accomplish this, including calling explicit module initialization functions, using mutexes, and using pthread_once(). This places no significant burden on the programmer, introduces no possibly confusing ad hoc implicit synchronization mechanism, and potentially allows commonly used thread-specific data operations to be more efficient."

@alblue
Copy link

alblue commented Nov 17, 2017

Filed a PR apple/swift-corelibs-foundation#1325 that will avoid the duplicate key assignment.

@Lukasa
Copy link
Contributor Author

Lukasa commented Nov 18, 2017

So I haven't had a chance to test with a new CoreFoundation yet, but I'll try to get to it soon, thanks Alex. As an extra data point, though, the following pthreads-based program (i.e. does not use or link dispatch) does not reproduce the issue we saw with Dispatch. I cannot guarantee that this repro is good, I may have messed up when I ported it to pthreads, but if this was purely related to the spawning of threads on Linux I'd have expected pthreads to trigger the same behaviour.

#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>
#include <pthread.h>

const unsigned char name[] = "hello world";
const int THREAD_COUNT = 100;

void *background(void *ign) {
    CFRelease(CFStringCreateWithBytes(kCFAllocatorDefault,
                      name,
                      11,
                      kCFStringEncodingASCII,
                      false));
    sleep(5);
    return NULL;
}


int main(int argc, const char * argv[]) {
    pthread_t threads[THREAD_COUNT] = {0};
    for (int i = 0; i < THREAD_COUNT; i++) {
        int rc = pthread_create(threads + i, NULL, background, NULL);
        assert(rc == 0);
    }
    for (int j = 0; j < THREAD_COUNT; j++) {
        int rc = pthread_join(threads[j], NULL);
        assert(rc == 0);
    }
    printf("ok, done basically\n");
    getc(stdin);
    return 0;
}

@Lukasa
Copy link
Contributor Author

Lukasa commented Nov 20, 2017

Ok, so I tested a clean build of Foundation with Alex's patch and it does appear to resolve the issue. I have no particular theory for why the issue only ever manifested under Dispatch, so we may want to consider whether there is still some investigation worth doing in Dispatch, but at least the interaction with Foundation seems to be resolved by Alex's patch.

@alblue
Copy link

alblue commented Nov 29, 2017

This has been closed and fixed in swift-4.1; there is SR-6491 which is for any further investigations required as to why the call was created in the first place.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants