Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-9905] passing array as array of existential super type is slow (allocates) #52311

Open
weissi opened this issue Feb 11, 2019 · 6 comments
Open
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself performance

Comments

@weissi
Copy link
Member

weissi commented Feb 11, 2019

Previous ID SR-9905
Radar rdar://problem/52529580
Original Reporter @weissi
Type Bug
Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Bug, Performance
Assignee None
Priority Medium

md5: 5a14898a75863f85c86fa2ef5c938d2a

Issue Description:

I would expect this Swift program

protocol Super {}

protocol Sub: Super {}

struct Foo: Sub {}

@inline(never)
func foo(_ supers: [Super]) {
}

let things: [Sub] = [Foo(), Foo()]

for _ in 0..<10_000 {
    foo(things)
}

to not allocate in the loop because Sub is a sub-type of Super so I would think that [Sub] should be directly compatible with [Super].

I didn't think about this too much but I'd expect the PWTs of Super and Sub to be compatible (when used as Super) and therefore I'd think that passing a [Sub] as [Super] should be a no-op. I totally might be wrong here though.

These are the allocations we see

              libsystem_malloc.dylib`malloc
              libswiftCore.dylib`swift_slowAlloc+0x19
              libswiftCore.dylib`_swift_allocObject_(swift::TargetHeapMetadata<swift::InProcess> const*, unsigned long, unsigned long)+0x14
              TestApp`specialized ContiguousArray.reserveCapacity(_:)+0xc3
              TestApp`main+0xfc
              libdyld.dylib`start+0x1
              TestApp`0x1
            10000
@belkadan
Copy link
Contributor

The protocol witness tables aren't compatible; they chain by reference rather than putting one inside the other.

@weissi
Copy link
Member Author

weissi commented Feb 12, 2019

@belkadan ah thanks! In other words, we need the allocation+copy in this case as what will need to happen is really (in pseudo-code)

foo(arrayOfSubPWTs.map { $0.superPWT })

Or do you see any way of not doing that? In theory, it'd be possible to see that this is a unique reference to the array and replace the PWTs in-place rather than alloc+copy but I don't know enough to judge if that's possible or not.

@belkadan
Copy link
Contributor

We don't really consider related protocols to have related representations today, but I suppose they could. They're all "existential box + witness table reference".

That said, you're calling this thing in a loop, which means it has to keep the original array around anyway to do the conversion again. You'd only get the optimization in the "last use" case.

@weissi
Copy link
Member Author

weissi commented Feb 12, 2019

yes, @belkadan, you're right. I put it in a loop so I can dtrace it easier 🙂. But I guess this program might be a better test case?

import Darwin

protocol Super {}

protocol Sub: Super {}

struct Foo: Sub {}

@inline(never)
func foo(_ supers: [Super]) {
}

@inline(never)
func CHECK_ALLOCATIONS_HERE(_ things: [Sub]) {
    foo(things)
}

CHECK_ALLOCATIONS_HERE([Foo(), Foo(), Foo(), Foo()]) // prime it

close(0xcafe) // just a dummy for the dtrace script
CHECK_ALLOCATIONS_HERE([Foo(), Foo(), Foo(), Foo(), Foo()])
close(0xdead) // just a dummy for the dtrace script

which results in

$ swiftc -O test.swift && sudo dtrace -n 'syscall::close:entry / pid == $target && arg0 == 0xcafe / { printf("go"); running = 1; } pid$target::malloc:entry / running == 1 / { @malloc_calls[ustack()] = count(); } syscall::close:entry / pid == $target && arg0 == 0xdead / { running = 0; } ::END { printa(@malloc_calls); }' -c ./test
dtrace: description 'syscall::close:entry ' matched 7 probes
dtrace: pid 91538 has exited
CPU     ID                    FUNCTION:NAME
  2    164                      close:entry go
  0      2                             :END 

              libsystem_malloc.dylib`malloc
              libswiftCore.dylib`swift_slowAlloc+0x19
              libswiftCore.dylib`_swift_allocObject_(swift::TargetHeapMetadata<swift::InProcess> const*, unsigned long, unsigned long)+0x14
              test`specialized ContiguousArray.reserveCapacity(_:)+0xc3
              test`specialized _arrayForceCast<A, B>(_:)+0x39
              test`CHECK_ALLOCATIONS_HERE(_:)+0x17
              test`main+0x18b
              libdyld.dylib`start+0x1
              test`0x1
                1

so as expected, it doesn't apply that optimisation and allocates.

@weissi
Copy link
Member Author

weissi commented Jul 2, 2019

@swift-ci create

@airspeedswift
Copy link
Member

Drive by comment that the way to fix this is to remove this implicit conversion completely.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. compiler The Swift compiler in itself performance
Projects
None yet
Development

No branches or pull requests

3 participants