Swift / SR-13871

"Fixed size array" codegen via unsafeBufferPointer is dramatically worse in 5.3

    Details

    • Type: Bug
    • Status: Open
    • Priority: Medium
    • Resolution: Unresolved
    • Component/s: Compiler
    • Labels:

      Description

      https://swift.godbolt.org/z/KW81eE

      The code generation for "testOptimization" in this example got dramatically worse sometime between 5.2 and 5.3. In 5.2 we see:

              push    rbp
              mov     rbp, rsp
              movdqu  xmm0, xmmword ptr [rdi]
              movdqu  xmm1, xmmword ptr [rdi + 16]
              movdqu  xmm2, xmmword ptr [rdi + 32]
              paddq   xmm2, xmm0
              movdqu  xmm0, xmmword ptr [rdi + 48]
              paddq   xmm0, xmm1
              paddq   xmm0, xmm2
              pshufd  xmm1, xmm0, 78
              paddq   xmm1, xmm0
              movq    rax, xmm1
              pop     rbp
              ret
      

      which is basically optimal, but in 5.3 we see the following instead:

              push    rbp
              mov     rbp, rsp
              movups  xmm0, xmmword ptr [rdi]
              movaps  xmmword ptr [rbp - 64], xmm0
              mov     rax, qword ptr [rbp - 64]
              movups  xmm0, xmmword ptr [rdi]
              movaps  xmmword ptr [rbp - 64], xmm0
              add     rax, qword ptr [rbp - 56]
              movups  xmm0, xmmword ptr [rdi + 16]
              movaps  xmmword ptr [rbp - 48], xmm0
              add     rax, qword ptr [rbp - 48]
              movups  xmm0, xmmword ptr [rdi + 16]
              movaps  xmmword ptr [rbp - 48], xmm0
              add     rax, qword ptr [rbp - 40]
              movups  xmm0, xmmword ptr [rdi + 32]
              movaps  xmmword ptr [rbp - 32], xmm0
              add     rax, qword ptr [rbp - 32]
              movups  xmm0, xmmword ptr [rdi + 32]
              movaps  xmmword ptr [rbp - 32], xmm0
              add     rax, qword ptr [rbp - 24]
              movups  xmm0, xmmword ptr [rdi + 48]
              movaps  xmmword ptr [rbp - 16], xmm0
              add     rax, qword ptr [rbp - 16]
              movups  xmm0, xmmword ptr [rdi + 48]
              movaps  xmmword ptr [rbp - 16], xmm0
              add     rax, qword ptr [rbp - 8]
              pop     rbp
              ret
      

      This is probably not a Swift bug but rather a regression in the LLVM cost model. It is still worth tracking here because the performance and code size impact is severe.
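
      For readers without access to the Compiler Explorer link, the following is a rough sketch of the kind of code the title describes. It is not the source behind the link: the type name `FixedArray8`, the element type `Int`, the count of 8 (inferred from the 64 bytes the assembly reads), and the body of `testOptimization` are all assumptions. The sketch also relies on the common idiom of treating a homogeneous tuple as contiguous storage.

```swift
// Hypothetical reconstruction, NOT the code from the Compiler Explorer link.
// A "fixed size array" of eight Ints stored inline in a homogeneous tuple.
struct FixedArray8 {
    var storage: (Int, Int, Int, Int, Int, Int, Int, Int) = (0, 0, 0, 0, 0, 0, 0, 0)

    subscript(index: Int) -> Int {
        get {
            precondition(index >= 0 && index < 8, "index out of range")
            // View the tuple as a buffer of Int and read one element.
            return withUnsafePointer(to: storage) { tuplePointer in
                return tuplePointer.withMemoryRebound(to: Int.self, capacity: 8) { elements in
                    return UnsafeBufferPointer(start: elements, count: 8)[index]
                }
            }
        }
        set {
            precondition(index >= 0 && index < 8, "index out of range")
            // Same view, but mutable, to write one element in place.
            withUnsafeMutablePointer(to: &storage) { tuplePointer in
                tuplePointer.withMemoryRebound(to: Int.self, capacity: 8) { elements in
                    elements[index] = newValue
                }
            }
        }
    }
}

// Assumed shape of the function under test: a wrapping sum of all elements,
// consistent with the paddq/add sequences in the listings above.
func testOptimization(_ array: FixedArray8) -> Int {
    var sum = 0
    for i in 0..<8 {
        sum &+= array[i]
    }
    return sum
}
```

      Compiling something of this shape with optimizations enabled on both toolchains and comparing the output for `testOptimization` is one way to check whether the difference between the vectorized and scalarized forms reproduces locally.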

            People

            Assignee:
            Unassigned
            Reporter:
            Stephen Canon (scanon)
            Votes:
            1
            Watchers:
            2
