[SR-13871] "Fixed size array" codegen via unsafeBufferPointer is dramatically worse in 5.3 #56269

Closed
stephentyrone opened this issue Nov 18, 2020 · 3 comments
Labels: bug, compiler, performance

Comments

@stephentyrone
Member

Previous ID: SR-13871
Radar: rdar://problem/71537963
Original Reporter: @stephentyrone
Type: Bug
Status: Closed
Resolution: Cannot Reproduce

Additional Detail from JIRA:
Votes: 1
Component/s: Compiler
Labels: Bug, Performance
Assignee: None
Priority: Medium

md5: 9b9937bfb81eafdddc433dcd972c423b

Issue Description:

https://swift.godbolt.org/z/KW81eE

The code generation for "testOptimization" in this example got dramatically worse sometime between 5.2 and 5.3. In 5.2 we see:

        push    rbp
        mov     rbp, rsp
        movdqu  xmm0, xmmword ptr [rdi]
        movdqu  xmm1, xmmword ptr [rdi + 16]
        movdqu  xmm2, xmmword ptr [rdi + 32]
        paddq   xmm2, xmm0
        movdqu  xmm0, xmmword ptr [rdi + 48]
        paddq   xmm0, xmm1
        paddq   xmm0, xmm2
        pshufd  xmm1, xmm0, 78
        paddq   xmm1, xmm0
        movq    rax, xmm1
        pop     rbp
        ret

which is basically optimal, but in 5.3 we see the following instead:

        push    rbp
        mov     rbp, rsp
        movups  xmm0, xmmword ptr [rdi]
        movaps  xmmword ptr [rbp - 64], xmm0
        mov     rax, qword ptr [rbp - 64]
        movups  xmm0, xmmword ptr [rdi]
        movaps  xmmword ptr [rbp - 64], xmm0
        add     rax, qword ptr [rbp - 56]
        movups  xmm0, xmmword ptr [rdi + 16]
        movaps  xmmword ptr [rbp - 48], xmm0
        add     rax, qword ptr [rbp - 48]
        movups  xmm0, xmmword ptr [rdi + 16]
        movaps  xmmword ptr [rbp - 48], xmm0
        add     rax, qword ptr [rbp - 40]
        movups  xmm0, xmmword ptr [rdi + 32]
        movaps  xmmword ptr [rbp - 32], xmm0
        add     rax, qword ptr [rbp - 32]
        movups  xmm0, xmmword ptr [rdi + 32]
        movaps  xmmword ptr [rbp - 32], xmm0
        add     rax, qword ptr [rbp - 24]
        movups  xmm0, xmmword ptr [rdi + 48]
        movaps  xmmword ptr [rbp - 16], xmm0
        add     rax, qword ptr [rbp - 16]
        movups  xmm0, xmmword ptr [rdi + 48]
        movaps  xmmword ptr [rbp - 16], xmm0
        add     rax, qword ptr [rbp - 8]
        pop     rbp
        ret

This is probably not a Swift bug but rather a break in the LLVM cost model. It's still worth tracking here because the performance and code size impact is severe.
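For readers who can't open the godbolt link, here is a rough sketch of the kind of code this pattern refers to. It is not the original source: the type name `FixedArray8`, the `storage` property, and the body of `testOptimization` are all assumptions reconstructed from the issue title and from the assembly above, which sums eight 64-bit values read from a 64-byte block.

```swift
// Hypothetical reconstruction; the real test case is only available via the godbolt link.
struct FixedArray8 {
    // A homogeneous tuple serves as inline storage for eight Ints (64 bytes on 64-bit targets).
    var storage: (Int, Int, Int, Int, Int, Int, Int, Int) = (0, 0, 0, 0, 0, 0, 0, 0)

    // Expose the tuple's bytes as a buffer of Int for the duration of `body`.
    func withUnsafeBufferPointer<R>(
        _ body: (UnsafeBufferPointer<Int>) throws -> R
    ) rethrows -> R {
        try withUnsafeBytes(of: storage) { raw in
            try body(raw.bindMemory(to: Int.self))
        }
    }

    // Element access goes through the buffer pointer on every call.
    subscript(i: Int) -> Int {
        withUnsafeBufferPointer { $0[i] }
    }
}

// Sum all eight elements; ideally this compiles to a handful of vector loads and adds.
func testOptimization(_ a: FixedArray8) -> Int {
    var sum = 0
    for i in 0..<8 { sum &+= a[i] }
    return sum
}
```

With a shape like this, good codegen depends on the optimizer inlining the subscript and eliding the temporary copy of `storage` made for each access. The 5.3 listing above shows the copies to the stack surviving, with the sum computed by scalar `add` instructions instead of `paddq`.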

@stephentyrone
Member Author

@swift-ci create

@stephentyrone
Member Author

@eeckstein observed that we still get the "good" codegen with 5.3 (and ToT) on macOS. So this regression may exist only on Linux.

@stephentyrone
Member Author

This is resolved in 5.4 and onwards. Closing.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.