Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-4036] Memory leak with String.cString(using: Encoding) #3885

Open
pushkarnk opened this issue Feb 22, 2017 · 15 comments
Open

[SR-4036] Memory leak with String.cString(using: Encoding) #3885

pushkarnk opened this issue Feb 22, 2017 · 15 comments

Comments

@pushkarnk
Copy link
Collaborator

Previous ID SR-4036
Radar rdar://problem/73153090
Original Reporter @pushkarnk
Type Bug

Attachment: Download

Additional Detail from JIRA
Votes 1
Component/s Foundation
Labels Bug
Assignee None
Priority Medium

md5: 342a1be011db14a8a68d02d0d35c641e

Issue Description:

This code leaks memory

for i in 1...100000 {
    let s: String = "test\(i)"
    let c = s.cString(using: .utf8)
    print(c)
}
@pushkarnk
Copy link
Collaborator Author

On investigation, I could see that String.cString() on Linux calls NSString.cString() under the covers:
https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/NSStringAPI.swift#L371

Further, NSString.cString() malloc's memory and returns the address of that memory to the caller:
https://github.com/apple/swift-corelibs-foundation/blob/master/Foundation/NSString.swift#L834

String.cString() calls _persistCString that copies the contents of this malloc'd memory into a [CChar] and returns it.
https://github.com/apple/swift/blob/master/stdlib/public/core/CString.swift#L167

We do not free the memory that was malloc'd by NSString.cString() and it's clear that malloc'd memory does not come under the purview of the ARC. Looks like we need to call `free` explicitly.

@pushkarnk
Copy link
Collaborator Author

With String.cString() we can attempt to free up the malloc'd memory after copying the contents to [CChar]. But with NSString.cString() this becomes the responsibility of the application.

@pushkarnk
Copy link
Collaborator Author

#891

@belkadan
Copy link

On Apple platforms the returned string is autoreleased. I think the Foundation folks just didn't have a great answer on non-Apple platforms. cc @phausler

@phausler
Copy link
Member

it is leaked; I have a prototype that would make a pseudo autorelease but that was considered to be "distasteful" 🙁

@pushkarnk
Copy link
Collaborator Author

Comment from @parkera on the PR (which I've closed):

```
Hm, I'm not really a fan of this approach for a couple of reasons:
It makes this file different across darwin and Linux, because on darwin that C string is backed by an autoreleased NSData and it would be a serious error to free it.
It leaves the leak in place at the source of the problem, which is NSString's cString (and utf8string, which we do nothing about here).
It seems like we should elide cString and utf8String from NSString on Linux, because there is no choice but for them to leak, and implement this method in a way which is more canonical (perhaps data:usingEncoding: followed by a copy).
```

@phausler
Copy link
Member

http://swift.sandbox.bluemix.net/#/repl/58b24367f008f226a7910246

That would be a way we could have autorelease work on linux. And the NSData backing could be done as

internal func _bytesInEncoding(_ str: NSString, _ encoding: String.Encoding, _ fatalOnError: Bool, _ externalRep: Bool, _ lossy: Bool) -> UnsafePointer<Int8>? {
    let theRange = NSMakeRange(0, str.length)
    var cLength = 0
    var used = 0
    var options: NSString.EncodingConversionOptions = []
    if externalRep {
        options.formUnion(.externalRepresentation)
    }
    if lossy {
        options.formUnion(.allowLossy)
    }
    if !str.getBytes(nil, maxLength: Int.max - 1, usedLength: &cLength, encoding: encoding.rawValue, options: options, range: theRange, remaining: nil) {
        if fatalOnError {
            fatalError("Conversion on encoding failed")
        }
        return nil
    }
    
    let buffer = malloc(cLength + 1)!.bindMemory(to: Int8.self, capacity: cLength + 1)
    if !str.getBytes(buffer, maxLength: cLength, usedLength: &used, encoding: encoding.rawValue, options: options, range: theRange, remaining: nil) {
        fatalError("Internal inconsistency; previously claimed getBytes returned success but failed with similar invocation")
    }
    
    buffer.advanced(by: cLength).initialize(to: 0)
    let data = NSData(bytesNoCopy: buffer, length: cLength, deallocator: .custom( { (bytes, length) in
        free(bytes)
    }))
    Unmanaged<NSData>.passUnretained(data).autorelease()

    return UnsafePointer(buffer)
}

@phausler
Copy link
Member

Of course the requirement for that code is that it would need to be housed in a autoreleasepool call else it would hard abort and it is worth noting that swift on linux does not have a root pool installed by default. We could at CF constructor time push a root pool and then with an atexit pop the pool. That would be a long stretch to accommodate for the handful of APIs that require autorelease behavior and it might be more reasonable to offer a new SPI:

extension NSString {
    internal func withUnsafeCString<Result>(encoding: UInt, apply: (UnsafePointer<Int8>?) throws -> Result) rethrows -> Result {
        let buffer = _bytesInEncoding(self, String.Encoding(rawValue: encoding), false, false, false)
        defer { free(buffer) }
        return apply(buffer)
    }
}

extension String {
    public func cString(using encoding: Encoding) -> [CChar]? {
        return withExtendedLifetime(_ns) { (s: NSString) -> [CChar]? in
            return s.withUnsafeCString(encoding: encoding.rawValue) { 
                return _persistCString($0)
            }
        }
    }
}

@phausler
Copy link
Member

That all being said: the signature `public func cString(using encoding: Encoding) -> [CChar]?` is not exactly proper since it should be a fast-path accessor if the string is eight bit represented (it should return inner pointer) and the Array<CChar> will cause an allocation as well as a full copy of the contents.

@pushkarnk
Copy link
Collaborator Author

@phausler Tested your second solution. As expected, it works well but it doesn't address the leak in NSString.cString().

@weissi
Copy link
Member

weissi commented Jan 8, 2019

@phausler/@parkera could we maybe deprecate String.cString(using🙂 ? Returning an interior pointer without lifetime management is just not that great, especially in Swift so I'd say let's deprecate it in favour of `withCString`

@belkadan
Copy link

belkadan commented Jan 8, 2019

It's still needed occasionally for things like super.init, which can't be nested in a closure just yet.

@kmahar
Copy link

kmahar commented Jan 13, 2021

I have nothing to add in terms of what the right way to fix this is, but I just wanted to call out that if this is not going to be fixed I think it should be made far more obvious that this is an issue. Deprecation, a note on the docs site, something.... would be great. I accidentally wrote some code using this method ~3 years ago when I was new to Swift and just figured out today that it's been causing memory leaks for Linux users. Because I have for the most part only developed and checked for memory leaks on macOS I never noticed it until now.

@weissi
Copy link
Member

weissi commented Jan 13, 2021

+1, we should definitively document how you're supposed to hold this API (or deprecate it but as Jordan points out there are some (IMHO rare) usecases)

@typesanitizer
Copy link

@swift-ci create

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants