[SR-9432] Stop using ICU for normalization #51896

Closed
milseman mannequin opened this issue Dec 7, 2018 · 1 comment
Assignees
Labels
standard library Area: Standard library umbrella

Comments

@milseman
Mannequin

milseman mannequin commented Dec 7, 2018

Previous ID SR-9432
Radar rdar://problem/51635207
Original Reporter @milseman
Type Sub-task
Status Resolved
Resolution Done
Additional Detail from JIRA
Votes 4
Component/s Standard Library
Labels Sub-task
Assignee @Azoy
Priority Medium

md5: 2e7cb865a995734ac55ca28ea9e21299

Parent-Task:

Issue Description:

We use ICU heavily for normalization, and doing so efficiently is a source of considerable stdlib complexity (more complexity than just implementing the algorithm). If we have efficient access to the data tables, we should just implement this ourselves.

We heavily check NFC_QC=yes and hasCompBoundaryBefore in our fast-paths. Bouncing over to ICU gives us a hefty perf cost compared to checking locally. A local Unicode.Scalar trie-like structure that can answer these queries efficiently would alleviate this.
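As a rough illustration of the kind of local lookup described above, here is a minimal sketch of a two-stage bit table keyed by scalar value. The `ScalarBitSet` type and its layout are hypothetical and are not the structure the standard library uses. The marked scalars are toy data: U+0301 (NFC_QC=Maybe) and U+0340 and U+0341 (NFC_QC=No). A real table would be generated from the Unicode Character Database.

```swift
/// Hypothetical two-stage bit table over Unicode scalar values.
/// Stage 1 maps each 256-scalar block to a shared 256-bit block in stage 2.
struct ScalarBitSet {
    // One entry per 256-scalar block (0x110000 / 256 entries).
    private var blockIndex = [UInt16](repeating: 0, count: 0x110000 >> 8)
    // Deduplicated 256-bit blocks, four 64-bit words each. Block 0 is all zero.
    private var blocks: [[UInt64]] = [[0, 0, 0, 0]]

    init(marking ranges: [ClosedRange<UInt32>]) {
        // Collect the bits for every block that contains a marked scalar.
        var touched: [Int: [UInt64]] = [:]
        for range in ranges {
            for value in range {
                let block = Int(value >> 8)
                let bit = Int(value & 0xFF)
                touched[block, default: [0, 0, 0, 0]][bit >> 6] |= UInt64(1) << UInt64(bit & 63)
            }
        }
        // Share identical blocks so sparse properties stay small.
        var seen: [[UInt64]: UInt16] = [[0, 0, 0, 0]: 0]
        for (block, words) in touched {
            if let existing = seen[words] {
                blockIndex[block] = existing
            } else {
                let id = UInt16(blocks.count)
                blocks.append(words)
                seen[words] = id
                blockIndex[block] = id
            }
        }
    }

    /// Two array loads and a bit test, with no call out of the module.
    func contains(_ scalar: Unicode.Scalar) -> Bool {
        let value = scalar.value
        let words = blocks[Int(blockIndex[Int(value >> 8)])]
        let bit = Int(value & 0xFF)
        return ((words[bit >> 6] >> UInt64(bit & 63)) & 1) == 1
    }
}

// Toy data only: scalars that are *not* NFC_QC=Yes.
let notNFCQCYes = ScalarBitSet(marking: [0x0301...0x0301, 0x0340...0x0341])
```

With a table like this, a fast path can test `!notNFCQCYes.contains(scalar)` inline instead of calling into ICU for each scalar.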

Using ICU for normalization involves transcoding UTF-8 to UTF-16 and back. This is costly and another source of complexity. E.g., we need many growable buffers of different widths, and even more conservative growth reservation factors.
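To make the round trip concrete, the sketch below shows the two transcoding hops using only public `String` APIs. `roundTripThroughUTF16` is a hypothetical helper, and its `transform` parameter stands in for a UTF-16-only normalizer. It is not how the standard library calls ICU.

```swift
// Hypothetical helper: `transform` stands in for a UTF-16-only normalizer.
func roundTripThroughUTF16(
    _ utf8: [UInt8],
    transform: ([UInt16]) -> [UInt16]
) -> [UInt8] {
    // Hop 1: decode UTF-8 and re-encode as UTF-16 (a second buffer, different width).
    let utf16 = Array(String(decoding: utf8, as: UTF8.self).utf16)
    // The UTF-16-only API runs here and may grow or shrink the buffer.
    let normalized = transform(utf16)
    // Hop 2: transcode the result back to UTF-8 (a third buffer).
    return Array(String(decoding: normalized, as: UTF16.self).utf8)
}

let input = Array("日本語".utf8)                     // 9 UTF-8 code units
let output = roundTripThroughUTF16(input) { $0 }   // identity in place of real normalization
```

Even with an identity transform, the input is copied into two extra buffers of different element widths, which is the overhead a native UTF-8 implementation would avoid.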

We'd like fast-paths for languages with combining characters. Scalar-based queries only fast-path single-scalar segments, and ICU's implementation of the multi-scalar QC algorithm operates on UTF-16.
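For reference, a multi-scalar quick check can be written directly over scalars, with no UTF-16 involved. The sketch below follows the general shape of the quick-check loop described in UAX #15. The `QuickCheck` enum, `nfcQuickCheck`, and the `toyProperty` closure are hypothetical names, and the closure is a toy stand-in for real NFC_QC data, which Swift does not expose publicly. It covers only U+0301 and U+0323, both of which are NFC_QC=Maybe.

```swift
enum QuickCheck { case yes, no, maybe }

/// Scalar-based NFC quick check (a sketch, not the stdlib's implementation).
/// `property` supplies the NFC_QC value for each scalar.
func nfcQuickCheck<S: Sequence>(
    _ scalars: S,
    property: (Unicode.Scalar) -> QuickCheck
) -> QuickCheck where S.Element == Unicode.Scalar {
    var lastCCC: UInt8 = 0
    var result = QuickCheck.yes
    for scalar in scalars {
        let ccc = scalar.properties.canonicalCombiningClass.rawValue
        // Combining marks out of canonical order: definitely not normalized.
        if ccc != 0 && lastCCC > ccc { return .no }
        switch property(scalar) {
        case .no: return .no
        case .maybe: result = .maybe   // needs the slow path to decide
        case .yes: break
        }
        lastCCC = ccc
    }
    return result
}

// Toy NFC_QC data: only U+0301 and U+0323 are covered, everything else is `.yes`.
let toyProperty: (Unicode.Scalar) -> QuickCheck = { scalar in
    (scalar.value == 0x0301 || scalar.value == 0x0323) ? .maybe : .yes
}

let plain = nfcQuickCheck("abc".unicodeScalars, property: toyProperty)
let accented = nfcQuickCheck("e\u{0301}".unicodeScalars, property: toyProperty)
let misordered = nfcQuickCheck("a\u{0301}\u{0323}".unicodeScalars, property: toyProperty)
```

Here `misordered` is rejected because U+0301 (combining class 230) precedes U+0323 (combining class 220), so the segment is not in canonical order.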

@Azoy
Member

Azoy commented Nov 18, 2021

This has been resolved here: #38922

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.