Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-5397] String.addingPercentEncoding sometimes encodes allowed characters #4280

Closed
swift-ci opened this issue Jul 7, 2017 · 9 comments
Closed

Comments

@swift-ci
Copy link
Contributor

swift-ci commented Jul 7, 2017

Previous ID SR-5397
Radar None
Original Reporter marc (JIRA User)
Type Bug
Status Resolved
Resolution Invalid
Environment

Xcode 9 beta 2, Swift 4

Additional Detail from JIRA
Votes 0
Component/s Foundation
Labels Bug
Assignee None
Priority Medium

md5: df9bb1b68dee2b7087d086565050c270

Issue Description:

import Foundation

CharacterSet.urlHostAllowed.contains(":") // true

":".addingPercentEncoding(withAllowedCharacters: CharacterSet.urlHostAllowed) // "%3A"
// actual result: "%3A"
// expected result: ":"

":".addingPercentEncoding(withAllowedCharacters: CharacterSet(charactersIn: ":")) // ":"
// actual result: ":"
// expected result: ":"
@belkadan
Copy link

cc @phausler

@phausler
Copy link
Member

This is not a swift failure:

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        assert([NSCharacterSet.URLHostAllowedCharacterSet characterIsMember: ':']);
//        CharacterSet.urlHostAllowed.contains(":") // true
        NSString *res = [@":" stringByAddingPercentEncodingWithAllowedCharacters:NSCharacterSet.URLHostAllowedCharacterSet];
        assert([res isEqualToString:@"%3A"]);
//        ":".addingPercentEncoding(withAllowedCharacters: CharacterSet.urlHostAllowed) // "%3A"
//        // actual result: "%3A"
//        // expected result: ":"
        
        NSString *res2 = [@":" stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet characterSetWithCharactersInString:@":"]];
        assert([res2 isEqualToString:@":"]);
//        ":".addingPercentEncoding(withAllowedCharacters: CharacterSet(charactersIn: ":")) // ":"
//        // actual result: ":"
//        // expected result: ":"
    }
    return 0;
}

@phausler
Copy link
Member

My guess is that URL hosts are defined as the part after the slashes and before a port

http://host:port

So that would mean that the host having a ':' character would mean that it would early encode to a breaking character for a port. So perhaps ':' should not be allowed in a host? Not certain there...

@swift-ci
Copy link
Contributor Author

Comment by Marc Knaup (JIRA)

IPv6 addresses contain ":" so it's legit and shouldn't be encoded.

It used to work as expected in iOS 10. Good catch that ObjC is also affected. Should I file a radar for this instead?

@phausler
Copy link
Member

yep radar is the way to get this fixed

@swift-ci
Copy link
Contributor Author

Comment by Marc Knaup (JIRA)

Filed as rdar://33216827

@phausler
Copy link
Member

Ok, I followed up on the radar and this is the response according to the current maintainer of NSURL/CFURLRef (long short is that it behaves correctly):

String.addingPercentEncoding has to special case the single characters sets, URLHostAllowedCharacterSet and URLPathAllowedCharacterSet. (If you make copies of the character sets URLHostAllowedCharacterSet or URLPathAllowedCharacterSet, you'll get the behavior you expected in the bug description.)

URLHostAllowedCharacterSet has to be special-cased because the host component of a URL can be an IP-literal (e.g. an IPv6 address), an IPv4address, or a reg-name (registered name). The form IPv4address requires everything to be decimal digits or periods, so nothing will be percent-encoded. reg-names can be any character and invalid characters will be percent-encoded. IP-literals always start with a '[' and end with a ']'. So, when URLHostAllowedCharacterSet is passed, String.addingPercentEncoding will percent-encode ':' characters UNLESS the string is identified as a IP-literal.

URLPathAllowedCharacterSet has to be special-cased for two reasons:
(1) String.addingPercentEncoding when passed URLHostAllowedCharacterSet percent-encodes ';' characters (even though they are legal in the path) for backwards compatibility with CFURL and NSURL where there was a deprecated parameters URL component.
(2) String.addingPercentEncoding when passed URLHostAllowedCharacterSet percent-encodes any ':' characters found before the first '/' character in the string to ensure the path will work when the URL has no scheme.

@phausler
Copy link
Member

Here's some code to illustrate:

https://gist.github.com/phausler/e945f1b9c6486ff9a990487261fdeab4

@swift-ci
Copy link
Contributor Author

Comment by Marc Knaup (JIRA)

wow, that's quite unexpected and should be documented. Esp. since the behavior has changed since the last iOS version.

Thanks for the insight and examples. That's really helpful to understand what's going on and it's implications when using/modifying CharacterSet.urlHostAllowed.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
valeriyvan pushed a commit to valeriyvan/swift-corelibs-foundation that referenced this issue Nov 7, 2022
…#4280)

This API is documented in its headers to only allow being called once
for a particular domain, so we have to make sure our check for an
existing provider is synchronized with the setting.

rdar://problem/27541751
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants