[SR-15505] URL.standardized removes too many slashes #3354

Open
karwa opened this issue Nov 21, 2021 · 1 comment
karwa commented Nov 21, 2021

Previous ID: SR-15505
Radar: None
Original Reporter: @karwa
Type: Bug

Environment

macOS 11.6, Xcode 13.1 (13A1030d)

Additional Detail from JIRA
Votes: 0
Component/s: Foundation
Labels: Bug
Assignee: None
Priority: Medium

md5: 0ac46f29cce8a652173171b7f80cac07

Issue Description:

This example is adapted from Foundation's test suite (NSURLWithString-parse-absolute-with-relative-029):

func dumpURL(_ url: URL) {
  print("----------------------------------")
  print(url)
  print(url.path)
  print(url.standardized)
  print(url.standardized.path)
}

dumpURL(URL(string: "http://example.com/..///../blow")!)

Outputs the following:

----------------------------------
http://example.com/..///../blow
/..///../blow
http://example.com/blow
/blow

This result is incorrect; the standardized path should actually be "//blow".

  • The first ".." component is discarded since it is at the start of the path and can't pop anything

  • The path then starts with "///..", which is 2 empty components and a ".." component. The ".." pops one of the empties, leaving one remaining empty component

  • Finally, the "/blow" component should be added (including its leading slash)
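The three steps above can be modelled with a small stack of path components. This is only a sketch of the reasoning, not Foundation's implementation: `resolveDotDot(inAbsolutePath:)` is a hypothetical helper, it handles only ".." (not "."), and it assumes the path starts with "/".

```swift
// Hypothetical helper, not a Foundation API. It walks the path one
// component at a time, as in the bullets above.
func resolveDotDot(inAbsolutePath path: String) -> String {
    precondition(path.hasPrefix("/"), "this sketch expects an absolute path")

    // Split on "/" while keeping empty components, then drop the empty
    // piece that sits before the leading slash.
    let components = path
        .split(separator: "/", omittingEmptySubsequences: false)
        .dropFirst()

    var stack: [Substring] = []
    for component in components {
        if component == ".." {
            // Pop the previous component (which may be empty). A ".." with
            // nothing left to pop is simply discarded.
            if !stack.isEmpty { stack.removeLast() }
        } else {
            stack.append(component)
        }
    }
    return "/" + stack.joined(separator: "/")
}

print(resolveDotDot(inAbsolutePath: "/..///../blow"))   // "//blow"
```

Under this model an empty component is treated like any other component, which is why one of the two empties survives the second "..".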

The older RFCs which URL conforms to don't typically specify how ".." components should be resolved in an absolute URL, so it's unclear which behaviour Foundation is attempting to implement, but they do define how those should work when resolving a relative URL:

Additionally, the more recent RFC-3986 does define a more general algorithm:

> The pseudocode also refers to a "remove_dot_segments" routine for interpreting and removing the special "." and ".." complete path segments from a referenced path. This is done after the path is extracted from a reference, whether or not the path was relative, in order to remove any invalid or extraneous dot-segments prior to forming the target URI.
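
For reference, the "remove_dot_segments" routine is specified step by step in RFC 3986, section 5.2.4. Below is a sketch of those steps in Swift, written from the RFC text. The function name `removeDotSegments` and its internal helper are my own; nothing here is taken from Foundation's source. The step labels in the comments (2A to 2E) follow the RFC.

```swift
// A sketch of RFC 3986 section 5.2.4, not Foundation's implementation.
func removeDotSegments(_ path: String) -> String {
    var input = Substring(path)
    var output = ""

    // Removes the last segment of `output` and its preceding "/", if any.
    func removeLastSegment() {
        if let slash = output.lastIndex(of: "/") {
            output.removeSubrange(slash...)
        } else {
            output.removeAll()
        }
    }

    while !input.isEmpty {
        if input.hasPrefix("../") {                 // 2A
            input.removeFirst(3)
        } else if input.hasPrefix("./") {           // 2A
            input.removeFirst(2)
        } else if input.hasPrefix("/./") {          // 2B: "/./" becomes "/"
            input.removeFirst(2)
        } else if input == "/." {                   // 2B
            input = "/"
        } else if input.hasPrefix("/../") {         // 2C: "/../" becomes "/"
            input.removeFirst(3)
            removeLastSegment()
        } else if input == "/.." {                  // 2C
            input = "/"
            removeLastSegment()
        } else if input == "." || input == ".." {   // 2D
            input = ""
        } else {                                    // 2E: move one segment
            let afterFirst = input.index(after: input.startIndex)
            let end = input[afterFirst...].firstIndex(of: "/") ?? input.endIndex
            output.append(contentsOf: input[..<end])
            input = input[end...]
        }
    }
    return output
}

print(removeDotSegments("/..///../blow"))     // "//blow"
print(removeDotSegments("/../x///../blow"))   // "/x//blow"
```

Run on the two paths from this report, the sketch keeps one empty segment in both cases, so the RFC 3986 routine would not collapse the slashes in the first URL.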

Perhaps this is what Foundation is attempting to implement?

All of these references appear to confirm my belief that Foundation is in error, and that the correctly-resolved path should be "//blow". This is also the result returned by the WHATWG URL Standard and modern browsers, including Safari 15 (JSDOM Live URL Viewer).

Additionally, if we add a component, a rather curious thing happens:

dumpURL(URL(string: "http://example.com/..///../blow")!)
dumpURL(URL(string: "http://example.com/../x///../blow")!) // Add a component before triple slashes, now they are not collapsed?!?!?

Outputs the following:

----------------------------------
http://example.com/..///../blow
/..///../blow
http://example.com/blow
/blow
----------------------------------
http://example.com/../x///../blow
/../x///../blow
http://example.com/x//blow
/x//blow

So if we add a path component before the triple slashes, for some reason they no longer get collapsed. This is just bizarre; I have no idea what's going on here.

karwa commented Nov 21, 2021

It's hard to find another library which implements the same standard as Foundation (most of them use newer standards), but I did find one: "URL Toolkit" in JS (https://github.com/tjenkinson/url-toolkit).

Testing it on jsfiddle, it agrees that there should be 2 slashes before "blow". It keeps the first ".." component because RFC-1808 isn't specific about what implementations should do when there are too many ".." components at the start of the path, and says they may be kept; later standards are clearer and say they should be discarded. Either way is fine, but what isn't fine is dropping the slash after the ".." component.

https://jsfiddle.net/ovk4gcpd/9/
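
To make the difference between the two policies concrete, here is a small sketch that resolves ".." components either way. It is not url-toolkit's code or Foundation's: `resolveParents(in:keepingUnmatched:)` is a hypothetical helper, it ignores "." components, and it assumes an absolute path.

```swift
// Hypothetical helper for illustration only.
// keepingUnmatched: true  -> excess ".." components stay in the path
//                            (the leniency RFC-1808 allows)
// keepingUnmatched: false -> excess ".." components are discarded
//                            (the behaviour of later standards)
func resolveParents(in path: String, keepingUnmatched: Bool) -> String {
    precondition(path.hasPrefix("/"), "this sketch expects an absolute path")

    var stack: [Substring] = []
    let components = path
        .split(separator: "/", omittingEmptySubsequences: false)
        .dropFirst()   // drop the empty piece before the leading slash

    for component in components {
        if component != ".." {
            stack.append(component)
        } else if let last = stack.last, last != ".." {
            // An ordinary (possibly empty) component is popped by "..".
            stack.removeLast()
        } else if keepingUnmatched {
            // Nothing to pop: keep the ".." in place.
            stack.append(component)
        }
        // Otherwise the unmatched ".." is dropped.
    }
    return "/" + stack.joined(separator: "/")
}

print(resolveParents(in: "/..///../blow", keepingUnmatched: true))    // "/..//blow"
print(resolveParents(in: "/..///../blow", keepingUnmatched: false))   // "//blow"
```

Both policies leave two slashes before "blow"; they differ only in whether the leading ".." survives.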

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022