Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-9048] Intentionally crafted String can crash Swift application by fatalError #51551

Closed
niw opened this issue Oct 20, 2018 · 7 comments
Closed
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella

Comments

@niw
Copy link

niw commented Oct 20, 2018

Previous ID SR-9048
Radar rdar://problem/45425918
Original Reporter @niw
Type Bug
Status Resolved
Resolution Duplicate
Environment

Confirmed next swift on macOS X 10.14 (18A391)

Apple Swift version 4.2 (swiftlang-1000.11.37.1 clang-1000.11.45.1)
Target: x86_64-apple-darwin18.0.0
Apple Swift version 4.2-dev (LLVM c4a0883115, Clang 0a99881462, Swift a640fe7dd6)
Target: x86_64-apple-darwin18.0.0
Additional Detail from JIRA
Votes 1
Component/s Standard Library
Labels Bug
Assignee Lance (JIRA)
Priority Medium

md5: 03f7687e9659d4da4cf021d01c16e1b2

blocks:

  • SR-8939 "this should not happen" panic when putting unicode in a Map

relates to:

  • SR-8939 "this should not happen" panic when putting unicode in a Map

Issue Description:

Next swift code will crash by fatalError()

let str = "\u{0336}\u{0344}\u{0357}\u{0343}\u{0314}\u{0351}\u{0340}\u{0300}\u{0340}\u{0360}\u{0314}\u{0357}\u{0315}\u{0301}\u{0344}a"
let set = Set([str])

Stacktrace:

Fatal error: Output buffer was not big enough, this should not happen: file /Users/buildnode/jenkins/workspace/oss-swift-package-osx/swift/stdlib/public/core/NormalizedCodeUnitIterator.swift, line 177
Stack dump:
0.  Program arguments: /Library/Developer/Toolchains/swift-DEVELOPMENT-SNAPSHOT-2018-10-19-a.xctoolchain/usr/bin/swift -frontend -interpret test.swift -enable-objc-interop -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -color-diagnostics -module-name test 
0  swift                    0x000000010a2f0918 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  swift                    0x000000010a2efb95 llvm::sys::RunSignalHandlers() + 85
2  swift                    0x000000010a2f0f22 SignalHandler(int) + 258
3  libsystem_platform.dylib 0x00007fff5c320b3d _sigtramp + 29
4  libsystem_malloc.dylib   0x00007fff5c2e5d71 tiny_malloc_should_clear + 273
5  libswiftCore.dylib       0x000000010ddb08a2 $sSKs11SubSequenceQzRszs6UInt16V7ElementRtzrlE9hashUTF164intoys16_BufferingHasherVys14_SipHash13CoreVGz_tFs16_UnmanagedStringVyADG_Tg5Tf4nx_n + 4338
6  libswiftCore.dylib       0x000000010ddb08e6 $ss16_UnmanagedStringVss5UInt8VRszrlE4hash4intoys6HasherVz_tFTf4nx_nTm + 22
7  libswiftCore.dylib       0x000000010ddb0b75 $ss11_StringGutsV4hash4intoys6HasherVz_tFTf4nx_n + 261
8  libswiftCore.dylib       0x000000010dbada94 $sSJ4hash4intoys6HasherVz_tF + 548
9  libswiftCore.dylib       0x000000010dbadc25 $sSJSHsSH13_rawHashValue4seedS2i_tFTW + 117
10 libswiftCore.dylib       0x000000010dcf1c6c $sSh8_VariantV6insertySb8inserted_x17memberAfterInserttxnF + 316
11 libswiftCore.dylib       0x000000010dcf319f $sShyShyxGqd__nc7ElementQyd__RszSTRd__lufC + 719
12 libswiftCore.dylib       0x000000010e27108c $sShyShyxGqd__nc7ElementQyd__RszSTRd__lufC + 5759420
13 swift                    0x0000000106efb0dd llvm::MCJIT::runFunction(llvm::Function*, llvm::ArrayRef<llvm::GenericValue>) + 461
14 swift                    0x0000000106efea91 llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, char const* const*) + 1313
15 swift                    0x00000001067d6d3b swift::RunImmediately(swift::CompilerInstance&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, swift::IRGenOptions&, swift::SILOptions const&) + 3547
16 swift                    0x0000000106787ea0 performCompile(swift::CompilerInstance&, swift::CompilerInvocation&, llvm::ArrayRef<char const*>, int&, swift::FrontendObserver*, swift::UnifiedStatsReporter*) + 13296
17 swift                    0x0000000106783a1d swift::performFrontend(llvm::ArrayRef<char const*>, char const*, void*, swift::FrontendObserver*) + 3021
18 swift                    0x0000000106735e2e main + 686
19 libdyld.dylib            0x00007fff5c137085 start + 1
20 libdyld.dylib            0x000000000000000a start + 2750189446
[1]    629 illegal hardware instruction   test.swift

This is because when we use hash of String (In this example, put it in Set,) it will end up with running Unicode normalization Form C over the String, by using ICU’s unorm2_normalize. However, this crafted String can fill up given buffer and return None from _tryNormalize, then hit fatalError().

@niw
Copy link
Author

niw commented Oct 22, 2018

Current workaround is using NSString instead of String for any code that may take arbitrary payload in String (probably most of all non constant String in your code) that requires hash(into:) call, like put it in Set or using it as key of Dictionary.

let set = Set([str].map { $0 as NSString })

If you can’t use Set<NSString>, such as using setter API that enforces takes Set<String> and if it’s Objective-C class, use performSelector:withObject: to workaround type conversion.

obj.perform(#selector(setter: ObjClass.property), with: NSSet(array: [str].map { $0 as NSString }))

if it’s not Objective-C class, probably no easy solution so don’t use that setter.

@niw
Copy link
Author

niw commented Oct 22, 2018

Looks like there is _NormalizedCodeUnitIterator implementation issue that only provides limited size of buffer to _tryNormalize.

The input Unicode code points in this example are in length 16.

U+0336, U+0344, U+0357, U+0343, U+0314, U+0351, U+0340, U+0300, U+0340, U+0360, U+0314, U+0357, U+0315, U+0301, U+0344, U+0061

By allocating enough size of buffer, the output Unicode code points of NFC normalization by unorm2_normalize are in length 18.

U+0336, U+0308, U+0301, U+0357, U+0313, U+0314, U+0351, U+0300, U+0300, U+0300, U+0314, U+0357, U+0301, U+0308, U+0301, U+0315, U+0360, U+0061

This is bigger than input, however, _NormalizedCodeUnitIterator is passing a buffer that has same size as source:

// In swift/stdlib/public/core/NormalizedCodeUnitIterator.swift
      var intermediateBuffer = _FixedArray16<CodeUnit>(allZeros:())
      if overflowBuffer == nil, 
         let filled = source.tryFill(buffer: &intermediateBuffer) 
      {
        guard let count = _tryNormalize(
          _castOutputBuffer(&intermediateBuffer, 
          endingAt: filled),
          into: &segmentBuffer
        ) 

This may be icu4c logic that can’t execute NFC normalization properly that ends up with longer length than input (I didin’t verify it yet,) or if not, then this _NormalizedCodeUnitIterator implementation seems problematic.

If this is prior case (icu4c logic issue,) there are no action for Swift but libicuCore.A.dylib need to be updated, or, if this normalization is valid, then it’s Swift implementation issue.

Update: I tried other normalization implementation like the one in Ruby's String (lib/unicode_normalize/normalize.rb) and it also generates the same Unicode code points, which is longer than input. I didn’t follow entire NFC spec yet probably icu4c logic is correct.

See https://gist.github.com/niw/cd27dbb0ffc37eb61c3bd7f94dc347e4

@belkadan
Copy link
Contributor

cc @milseman

@niw
Copy link
Author

niw commented Oct 22, 2018

Looks like SR-8939 is similar or same issue.

@milseman
Copy link
Mannequin

milseman mannequin commented Oct 22, 2018

Unicode defines a max NFC expansion factor of 3, so we should (though there seems to be a bug where we aren't) be allocating 3x the number of input code units in our fixed buffers.

Lance (JIRA User) is working on a fix to https://bugs.swift.org/browse/SR-8939, which might be the same issue. Lance, could you merge this example into the list of test cases you're building up?

@niw
Copy link
Author

niw commented Oct 22, 2018

@milseman Thank you for your explanation, that’s really good to know...![]( It was unclear to me by quickly reading UAX #​15 or implementations. Then sounds like we need more buffer to complete NFC. I’m happy to hear that Lance (JIRA User) is already working on it. Thank you so much) 🙂

@milseman
Copy link
Mannequin

milseman mannequin commented Oct 22, 2018

You can see more at http://unicode.org/faq/normalization.html#12

Expanders are nasty, and there are even expanders whose expansion have normalization boundaries inside them, thwarting attempts at algebraic properties of segments.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. standard library Area: Standard library umbrella
Projects
None yet
Development

No branches or pull requests

2 participants