[SR-2413] String.characters not handling some unicode characters correctly #4339

Closed
swift-ci opened this issue Aug 19, 2016 · 8 comments

@swift-ci
Contributor

Previous ID SR-2413
Radar None
Original Reporter jayant (JIRA User)
Type Bug
Status Resolved
Resolution Done
Environment

Xcode Version 8.0 beta 6 (8S201h) with Swift 3.0

Additional Detail from JIRA
Votes 1
Component/s Foundation, Standard Library
Labels Bug
Assignee None
Priority Medium


Issue Description:

let _str = "Good Coffee <emoji with skin tone here>"
for char in _str.characters {
    print(char)
}

/*
This should print the emoji as is, but instead it displays the emoji and the skin tone separately.

G
o
o
d

C
o
f
f
e
e

<emoji - thumbs up>
<skin tone>
*/

On a side note, I could not post the issue with the emojis on JIRA 🙁

@belkadan

cc @gribozavr

@xwu
Collaborator

xwu commented Aug 19, 2016

IIUC, this is correct behavior as-is. Per UTR #51, skin tone modifiers are separate characters that modify the preceding character if possible. Besides emoji skin tone modifiers, there are other Unicode selector characters that work like this; there is no ambiguity in the spec in this respect.
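To illustrate the point about separate characters: a minimal sketch, assuming the reporter's emoji was THUMBS UP SIGN (U+1F44D) followed by a skin tone modifier. The specific modifier used here (U+1F3FD) is an assumption, since the original emoji could not be posted. The scalar view shows the base emoji and the modifier as two distinct code points regardless of how the string is segmented into characters.

```swift
// Assumed input: U+1F44D THUMBS UP SIGN + U+1F3FD EMOJI MODIFIER FITZPATRICK TYPE-4.
let thumbsUp = "\u{1F44D}\u{1F3FD}"

// The Unicode scalar view always exposes the base emoji and the modifier separately.
let scalars = thumbsUp.unicodeScalars.map {
    String($0.value, radix: 16, uppercase: true)
}
print(scalars) // ["1F44D", "1F3FD"]
```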

@belkadan

I think this goes into the general "Unicode 9 character segmentation" bucket.

@xwu
Collaborator

xwu commented Aug 19, 2016

Ah you're right, Unicode 9 updated the extended grapheme cluster rules...

@swift-ci
Contributor Author

Comment by Jayant Varma (JIRA)

Thanks all, but as I understand it, Swift aims to make string handling more correct than in other languages. While treating the emoji and the skin tone as separate characters is valid per UTR #51 as referenced above, from a developer's perspective the pair should still read as a single emoji, which can then be queried further to get the skin tone and so on as required. If someone is getting the length of the string or doing similar work, it should behave accordingly. Have I misunderstood Apple's direction regarding strings?
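The "query the emoji for its skin tone" idea could be sketched roughly as follows. `skinToneModifier(in:)` is a hypothetical helper written for this example, not a standard library API. It takes a `String` rather than a `Character` so that it does not depend on how the toolchain segments the emoji.

```swift
// Hypothetical helper, not part of the standard library.
// Returns the first emoji skin tone modifier scalar found in `emoji`, if any.
func skinToneModifier(in emoji: String) -> Unicode.Scalar? {
    // Emoji modifiers occupy the range U+1F3FB through U+1F3FF.
    return emoji.unicodeScalars.first { (0x1F3FB...0x1F3FF).contains($0.value) }
}

// Assumed sample input: thumbs up with the U+1F3FD modifier.
let modifier = skinToneModifier(in: "\u{1F44D}\u{1F3FD}")
print(modifier.map { String($0.value, radix: 16, uppercase: true) } ?? "none") // 1F3FD
```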

@xwu
Collaborator

xwu commented Aug 22, 2016

You're right about what the optimal segmentation for emoji with skin tone modifiers would be. This issue is addressed in Unicode 9, which Swift will support in the future (as far as I know).

@johnno1962
Contributor

PR raised for this and other issues: apple/swift#6204

@Dante-Broggi

I think this was fixed when Swift moved to Unicode 9. If so, this should be closed.
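One way to check this on a current toolchain is to rerun the original report. This is a sketch under two assumptions: it uses Swift 4 or later, where `String` is itself a collection of `Character` and `.characters` is no longer needed, and the U+1F3FD modifier stands in for the unknown original. With Unicode 9 or later grapheme cluster rules, the emoji and its modifier should form a single `Character`.

```swift
// Assumed sample input: "Good Coffee " followed by thumbs up + U+1F3FD modifier.
let text = "Good Coffee \u{1F44D}\u{1F3FD}"

// Iterating a String yields Characters (extended grapheme clusters).
for character in text {
    print(character)
}

// 12 characters of text plus one emoji cluster.
print(text.count)                // 13 with Unicode 9+ segmentation
// The same string still contains two scalars for the emoji.
print(text.unicodeScalars.count) // 14
```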

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
@shahmishal shahmishal transferred this issue from apple/swift May 5, 2022
This issue was closed.