  1. Swift
  2. SR-7602

UTF8 should be (one of) the fastest String encoding(s)

    Details

      Description

      I believe that there is really only one (and a half) encoding that matters today: UTF8 (and its subset ASCII).
      Therefore it's important that Swift's fastest String encoding is UTF8.

      From what I can tell, the fastest String encodings today are UTF16 and ASCII. Everything else has worse performance.

      This also seems to be ABI-relevant, so AFAIK it needs to be fixed very soon.

      Requirements:

      1. Copying the UTF-8 encoded bytes of a String into a pre-allocated raw buffer must be allocation-free and as fast as memcpy can copy them.
      2. Creating a String from UTF-8 encoded bytes should just validate the encoding and store the bytes as they are.
      3. A slightly softer, but still very strong, requirement: currently (even with ASCII) only the stdlib seems to be able to get a pointer to the contiguous ASCII representation (if the String is stored in that form at all). That works fine if you just want to copy the bytes (UnsafeMutableBufferPointer(start: destinationStart, count: destinationLength).initialize(from: string.utf8), which will use memcpy if the String is in ASCII representation), but it doesn't allow you to implement your own algorithms that are only performant on a contiguously stored [UInt8].
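
      To make the three requirements concrete, here is a rough sketch of what user code would want to write. It assumes a Swift 5.1 or later toolchain, which is newer than this report, because it relies on String.UTF8View.withContiguousStorageIfAvailable. The helper names copyUTF8 and countSpaces are hypothetical and only serve as illustration. Note that String(decoding:as:) repairs invalid UTF-8 by substituting U+FFFD rather than rejecting the input, so it only approximates requirement 2. Nothing here guarantees the memcpy-like speed the report asks for; it only shows the shape of the API.

```swift
// Requirement 1 (sketch): copy a String's UTF-8 code units into a
// caller-allocated raw buffer. `copyUTF8` is a hypothetical helper.
// Returns the number of bytes written, or nil if the buffer is too small.
func copyUTF8(of string: String, into destination: UnsafeMutableRawBufferPointer) -> Int? {
    let utf8 = string.utf8
    let count = utf8.count
    guard count <= destination.count else { return nil }
    // copyBytes(from:) accepts any Collection of UInt8.
    destination.copyBytes(from: utf8)
    return count
}

// Requirement 3 (sketch): run a custom algorithm directly on the
// contiguous UTF-8 storage when it is available, else fall back to
// iterating the UTF8View. `countSpaces` is a hypothetical example.
func countSpaces(in string: String) -> Int {
    let fast = string.utf8.withContiguousStorageIfAvailable { bytes -> Int in
        var n = 0
        for byte in bytes where byte == 0x20 { n += 1 }
        return n
    }
    if let fast = fast { return fast }
    return string.utf8.reduce(0) { $0 + ($1 == 0x20 ? 1 : 0) }
}

// Usage: round-trip a non-ASCII string through a pre-allocated buffer.
let text = "h\u{E9}llo w\u{F6}rld"
let raw = UnsafeMutableRawBufferPointer.allocate(byteCount: 64, alignment: 1)
let written = copyUTF8(of: text, into: raw) ?? 0

// Requirement 2 (approximation): build a String from UTF-8 bytes.
// String(decoding:as:) repairs invalid sequences instead of failing.
let copiedBytes = Array(raw[0..<written])
let roundTripped = String(decoding: copiedBytes, as: UTF8.self)
let spaces = countSpaces(in: text)
raw.deallocate()
```

      The fallback branch in countSpaces matters because contiguous UTF-8 storage is not guaranteed for every String (a lazily bridged NSString, for example, may not provide it), which is exactly the gap requirement 3 is about.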

            People

            • Assignee:
              milseman Michael Ilseman
              Reporter:
              jw Johannes Weiss
            • Votes:
              26
              Watchers:
              20
