[SR-6076] [String] var count: String.CharacterView.IndexDistance { get } returns a wrong value on Linux when "Regional Indicator Symbols" are contained. #48631

Closed
YOCKOW opened this issue Oct 6, 2017 · 9 comments
Assignees
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@YOCKOW
Collaborator

YOCKOW commented Oct 6, 2017

Previous ID SR-6076
Radar None
Original Reporter @YOCKOW
Type Bug
Status Resolved
Resolution Done
Environment
  • Swift 4.0

  • OS

    • macOS: High Sierra

    • Linux: Ubuntu 16.04

Additional Detail from JIRA
Votes 0
Component/s Standard Library
Labels Bug
Assignee @milseman
Priority Medium

md5: 3e4be3a5ea4f496c18a2f04121ef3483

relates to:

Issue Description:

[Sample Code]

let jp: Character = "\u{1F1EF}\u{1F1F5}" // Flag of Japan
let de: Character = "\u{1F1E9}\u{1F1EA}" // Flag of Germany
print("\(jp)".count) // Prints "1", of course
print("\(de)".count) // Prints "1", of course
print("\(jp)\(de)".count) // Prints "2" on macOS, but prints "1" on Linux
print("\(jp)\(de)\(jp)\(de)".count) // Prints "4" on macOS, but prints "1" on Linux

[Note]

  • Results on macOS must be correct.

  • There's no political intention to choose the flags.

@belkadan
Contributor

belkadan commented Oct 6, 2017

This is a difference between Unicode 9 and Unicode 10 (or possibly Unicode 10 and Unicode 11, I'm not sure) and is dependent on the version of ICU used to build Swift. cc @airspeedswift

@airspeedswift
Member

I think the flags issue was resolved in Unicode 9. Unfortunately, AFAICT Ubuntu 16 is still on ICU 55 which is Unicode 7.

We're considering switching to bundling ICU with the toolchain in future releases which would allow Linux to have a modern ICU in a similar fashion to Darwin. Not sure if we have a JIRA for this already, if not we can probably repurpose this for that.
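
For readers following along: the flag behavior referred to above comes from the grapheme cluster rules for regional indicators (rules GB12 and GB13 in UAX #29 since Unicode 9), which pair indicators two at a time instead of gluing a whole run together. Below is a minimal sketch of that pairing rule, working only on Unicode scalars so that it does not depend on the ICU version. The helper names `isRegionalIndicator` and `regionalIndicatorClusterCount` are hypothetical and not part of the standard library, and the sketch only counts clusters that begin with a regional indicator.

```swift
// Hypothetical helper: true for the 26 regional indicator symbols
// U+1F1E6 ... U+1F1FF.
func isRegionalIndicator(_ scalar: Unicode.Scalar) -> Bool {
    return (0x1F1E6...0x1F1FF).contains(scalar.value)
}

// Hypothetical helper: counts the clusters formed by regional indicators,
// pairing them two at a time within each consecutive run.
func regionalIndicatorClusterCount(in string: String) -> Int {
    var clusters = 0
    var pendingIndicator = false  // true while an unpaired indicator precedes

    for scalar in string.unicodeScalars {
        if isRegionalIndicator(scalar) {
            if pendingIndicator {
                // Second half of a pair: joins the previous indicator.
                pendingIndicator = false
            } else {
                // First half of a pair (or a lone indicator): new cluster.
                clusters += 1
                pendingIndicator = true
            }
        } else {
            // Any other scalar ends the current run of indicators.
            pendingIndicator = false
        }
    }
    return clusters
}

let japan = "\u{1F1EF}\u{1F1F5}"    // Flag of Japan
let germany = "\u{1F1E9}\u{1F1EA}"  // Flag of Germany

print(regionalIndicatorClusterCount(in: japan + germany))                    // 2
print(regionalIndicatorClusterCount(in: japan + germany + japan + germany))  // 4
```

Under the older rule, every regional indicator in a run stuck to the previous one, which matches the count of 1 reported on Linux.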

@milseman
Mannequin

milseman mannequin commented Oct 6, 2017

This is due to the user having an old version of ICU (such as the one shipped with Ubuntu LTS). As Ben mentioned, we're hoping to ship a modern ICU for Linux, as we do on Darwin. That would also bring performance improvements, greater behavior parity, and a simpler build system.

Should I put this as a dup on that task? Is there a JIRA for that task?

@YOCKOW
Collaborator Author

YOCKOW commented Oct 7, 2017

I'm very sorry to have bothered you all.
Now I understand that this is not a bug in Swift, but a "feature" of Unicode (the behavior depends on the Unicode version in use).
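
Since the result depends on the Unicode version behind the toolchain, code that relies on flag segmentation could probe for it at runtime. This is only a sketch; the function name `segmentsFlagPairsCorrectly` is hypothetical, and it simply repeats the check from the sample code above.

```swift
// Hypothetical probe: returns true when two adjacent flags are counted
// as two Characters, i.e. when regional indicators are paired.
func segmentsFlagPairsCorrectly() -> Bool {
    let twoFlags = "\u{1F1EF}\u{1F1F5}\u{1F1E9}\u{1F1EA}"  // Japan + Germany
    return twoFlags.count == 2
}

if segmentsFlagPairsCorrectly() {
    print("Flags are segmented in pairs")
} else {
    print("Adjacent flags are merged into one Character (older Unicode rules)")
}
```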

@milseman
Mannequin

milseman mannequin commented Oct 7, 2017

It’s no bother at all! It’s useful to know people hit this and that unifying ICU versions would help our users.

@spevans
Collaborator

spevans commented Mar 3, 2018

FYI, I built a version of Swift with Darwin's ICU from https://opensource.apple.com/tarballs/ICU/ICU-59152.0.1.tar.gz on Ubuntu 16.04, and it fixes this issue and also fixes SR-5591. I'm happy to sort out a patch to build Swift with this version, but someone would obviously need to add Apple's ICU to a repository on GitHub if you think this is worth pursuing.

@milseman
Mannequin

milseman mannequin commented Mar 4, 2018

I think that's totally worth pursuing.

@airspeedswift how can we proceed?

@allevato
Collaborator

The pull request implementing SE-0211 recently hit a similar issue, because Ubuntu 16.04 comes with a version of ICU too old to support the emoji properties that we exposed. I linked to this bug in the FIXME comment (we're just conditionally hiding the declarations on non-Darwin for the time being). The treatment of grapheme clusters above is reason enough, but if we're also going to be shipping APIs like Unicode.Scalar.Properties that wrap modern ICU calls, we really need to be shipping a consistent known version.
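
As an illustration of the kind of API in question, here is a short sketch that queries `Unicode.Scalar.Properties` for one of the regional indicators used in this report. It assumes a toolchain where these declarations are available (Swift 5 or later); the variable names are illustrative only.

```swift
// Regional indicator symbol letter J (first half of the Japan flag).
let indicatorJ: Unicode.Scalar = "\u{1F1EF}"
let properties = indicatorJ.properties

// One of the emoji properties mentioned above.
print(properties.isEmoji)

// The scalar's Unicode name, if it has one.
print(properties.name ?? "(no name)")
```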

@spevans
Collaborator

spevans commented Nov 14, 2018

Swift on Linux now ships with ICU 61.1 due to SR-8876, so this is fixed in swift-DEVELOPMENT-SNAPSHOT-2018-11-13.

$ cat sr_6076.swift 
let jp: Character = "\u{1F1EF}\u{1F1F5}" // Flag of Japan
let de: Character = "\u{1F1E9}\u{1F1EA}" // Flag of Germany
print("\(jp)".count) // Prints "1", of course
print("\(de)".count) // Prints "1", of course
print("\(jp)\(de)".count) // Prints "2" on macOS, but prints "1" on Linux
print("\(jp)\(de)\(jp)\(de)".count) // Prints "4" on macOS, but prints "1" on Linux

$ ~/swift-DEVELOPMENT-SNAPSHOT-2018-11-13-a-ubuntu14.04/usr/bin/swift sr_6076.swift 
1
1
2
4

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.