New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SR-11147] Add withUnsafeElements<T: HomogeneousAggregate, Result>(of:) #53543
Comments
@swift-ci create |
I was a little too aggressive assuming layout rules. We don't actually want to guarantee that Swift structs are layout compatible with tuples even if they are homogenous. This will only apply to structs imported from C (because they must follow ABI rules anyway), or Swift structs that are explicitly annotated with a new attribute, like @_layoutCompatible. |
Eventually, we may want tuples to automatically conform to Collection. In that case, rather than defining a new Builtin.HomogenousAggregate protocol, we could use HasContiguousStorage protocol shared by other Collections. |
Is the specific problem we're trying to solve here the usability of imported C array fields in structs? If so, I think we might want to start with how these fields are imported from C to begin with (aside from the "wait until we have real fixed size arrays" option). The homogeneous tuple thing was a hack, and though we'll have to keep the hack around to some degree for source compatibility, maybe we can also provide a better representation for new code to opt in to, since the only ABI we have to worry about is the C ABI. Using accessor coroutines, and the unowned-reference-that-becomes-strong-on-copy trick we'd discussed with @rjmccall previously wrt in-place slice mutations, could we efficiently import C array fields as `ArraySlice<Element>` fields into Swift (with an assertion that mutation doesn't change their size, of course)? |
Imported statically sized C arrays are by far the most important and compelling use case. So if we want to really improve usability that's a good one to focus on. However, we still need a fix for the other use cases, like regular Swift tuples and layout-constrained structs. The only alternative today is to abuse memory binding APIs; a practice that we need to put an end to. HasContiguousStorage conformance is the simplest, most direct way to express this. As @airspeedswift points out, this is independent of whether we have a Collection. We also want the conformance for SIMD types. |
It isn't clear to me when you'd want to use a tuple or struct this way (aside from the SIMD structs, maybe, though @stephentyrone made a point of not making those collections too) in code that didn't already have to deal with the weird clang importer tuples. |
This is pretty similar to the API that I've been converging on for native static array types as implementation details:
It's pretty useful for working with fixed-size objects (like the words storage for a fixed-size integer type, or a small fixed-size matrix, etc). In cases like these, we don't want the type itself to be a collection, but being able to get a buffer pointer is one of the key use cases. |
This proposed API applies to any value that you want to pass to a C function taking a pointer to that value's element type. So, this definitely comes up with tuples, regardless of whether they are defined in C or Swift (that fact that we can't "round trip" a C API is just particularly egregious). It also comes up with structs where the C API assumes layout (based on ABI constraints). For example, we see this in real code using Apple's task_policy.h API. The reduced example is shown above in Again, this isn't just about handling common use cases so much as handling all reasonable corner cases while guarding against undefined behavior. |
To be clear, I'm strongly in favor of making |
It'd also be straightforward to have default implementations of |
@jckarter yeah, that's what I was envisioning to avoid the runtime overhead. I proposed adding a Builtin protocol because AFAICT it's the only way to use it in type position. i.e. this doesn't work The only thing that makes me sad is that we wouldn't be able to treat all our UnsafeBufferPointer-like things uniformly in generic context. i.e. this doesn't work |
The fact that `_Trivial` doesn't work in type position is also, in a sense, an artificial limitation. The type checker does know about layout constraints, and we model `AnyObject` as a layout constraint without a real protocol backing it. The way that works is that `AnyObject` is a typealias for a builtin that exposes the existential type with the layout constraint: public typealias AnyObject = Builtin.AnyObject Although the existential of `HomogeneousAggregate` is itself of questionable usefulness, it falls out from the other language rules that using an existential in an inheritance clause or generic constraint imposes the generic constraints on the type, in other words, `<T: AnyObject>` means `T` has the AnyObject constraint. Following the example of AnyObject, we could also expose `HomogeneousAggregate` as a builtin which imposes the layout constraint, and then use extensions to expose the API we want in terms of builtin functionality: public typealias HomogeneousAggregate = Builtin.HomogeneousAggregate
extension HomogeneousAggregate {
typealias Element = Builtin.HomogeneousAggregateElement<Self>
var count: Int { return MemoryLayout<Self>.stride / MemoryLayout<Element>.stride }
} I think the rules for layout constraints feel like a better fit for this than protocol conformance. A type can't retroactively conform to `AnyObject` and doesn't have to explicitly declare conformance to it, it implicitly conforms by being a class. Similarly, it feels natural to me that homogeneous aggregate types should just work as `HomogeneousAggregate`s in code that can see that they are in fact homogeneous (though in this case, we may also want resilient structs to be able to explicitly declare the constraint, if that's something they want to make into an API promise). |
A good dry run for this approach might be to try to expose `Trivial` as a user-facing layout constraint too, since it already exists. One thing we'd need to do with either Trivial or HomogeneousAggregate is add type checker support for telling whether a type satisfies the constraint. That might be tricky since we normally use SIL-level information to do those checks, but maybe requestification makes that less of an issue? |
We actually need to expose the Trivial layout constraint in order to properly enforce several of our UnsafeRawPointer APIs, which are only valid for trivial types. |
Note: If this is going to be a stand-alone function (because layout constraints don't have witnesses) then we may want to call it var value: Int = 0
withUnsafePointer(to: value) { ptr in
ptr.withMemoryRebound(to: UInt8.self, capacity: 1) {
readFourBytes($0)
}
} That should be: var value: Int = 0
withUnsafeBufferPointer(to: value) { ptr in
ptr.withMemoryRebound(to: UInt8.self) {
readFourBytes($0)
}
} |
Why do you want to have two essentially equivalent functions with different names? |
(Answering my own question): to disambiguate between getting a buffer of 1 x self and getting a buffer of n x elements. Still, not totally satisfactory. |
@stephentyrone I originally wanted to propose the same function name for both features. Then I noticed how bad the type checker is at inferring closure arguments and providing coherent diagnostics. So I though it best to avoid the overload and ambiguity. I don't feel strongly either way. |
I agree that overloading is best avoided, both for human understanding and the type checker's sake. Having the operations be top-level functions might also make exposing these type layout constraints easier to implement; thinking about it more, doing name lookup based on type layout constraints might be problematic for type checking, since unlike explicit protocol conformances, there's no explicit indication in AST we should look for members in extensions on type constraints, and there's no context to know we should be checking for type layout constraints given something like |
Similarly, spelling the constraint with a generic argument like |
@tbkka @stephentyrone @jckarter I suggest supporting a This naturally handles all imported C types, and only allows Swift types to participate when it is safe to view the elements as a contiguous region of memory with tail padding (there won't actually be any tail padding, so the illusion is maintained). For example, this contrived use case is still safe:
Viewing types with a size vs. stride mismatch as a pointer is extremely dangerous, particularly when passing the pointer to a C API that expects C layout. It is beneficial to require memory binding when dabbling in that realm. Note that we may still want to allow subscripting and collection conformance on all homogenuos tuples, regardless of |
Additional Detail from JIRA
md5: 7c8952a07b1ac467c740324f65861850
is duplicated by:
Issue Description:
Provide the API and compiler features necessary to convert any struct or tuple with elements of common type into an UnsafeBufferPointer.
To do this well, it depends on language features; in particular, protocol conformance for tuples.
It's common to pass a Swift tuple into a C API taking a pointer to the elements. For example, it should be simple and straightforward to import a statically sized C array (as a tuple) and pass it back off to a C api that takes a pointer to the element type. This situation also arises for homogenous structs. Since Swift already guaratees the layout of such structs, we should formalize that guarantee in the new API along with tuple types.
Without this convenience, programmers are forced into misusing the memory binding APIs and potentially introducing undefined behavior. Here real-world code reduced to a trivial example:
This code is bad because:
1. It's an abuse of bindMemory, which should only be called to change the bound memory type in order to reuse a chunk of memory as a different type (e.g. to write a custom allocator in Swift).
2. Taken at face value, it appears to have undefined behavior because it binds the memory of a local variable to a different type. The only reason that is safe in practice is that the call to bindMemory has no actual effect on the memory type because all fields of the struct happen to be Ints. If the bindMemory call actually had any semantic effect, then the local variable would be in an undefined state after the call. There's no enforcement on correct usage and it's misleading to someone else reading the code.
3. The code makes use of an UsafeRawPointer even though it wants all pointer types to be strongly typed. It shouldn't need a raw pointer because it's not working at the level of bytes or reinterpreting values of unrelated types.
Instead, the code should be written as:
Going back to the common case of importing C arrays, it could now be written as:
Another real-world example:
Could be written as:
This is a small but important change because
it makes a simple, common task, much more discoverable and obvious
it avoids the extremely dangerous low-level `assumingMemoryBound` API, which should only be used by experts in the Swift pointer aliasing rules.
Proposed Fix
1. Add a new layout constraint and (builtin) protocol, HomogenousAggregate, to the type system that effectively has an associated type as shown below. The compiler will require that all conforming types are homogenous tuples or structs:
Synthesize conformances for any type that needs to meet the layout constraint.
2. Add a compiler Builtin that generates a buffer pointer for any type conforming to the HomogenousAggregate layout constraint:
3. Add two free-standing functions to the standard library:
PREREQUISITES
Language support for Tuple types conforming to a builtin protocol. Tuple conformance in general is part of the generics manifesto, so it's something we'd like to support.
Synthesizing conformance to layout constraint protocols. How can it be done lazily without excessive overhead? Many types will potentially conform, but I don't expect the constraint to be used often.
The text was updated successfully, but these errors were encountered: