  1. Swift
  2. SR-14273

Working toward a byte code representation for value witnesses


    Details

    • Type: Bug
    • Status: In Progress
    • Priority: Medium
    • Resolution: Unresolved
    • Component/s: None
    • Labels:
      None

      Description

      To reduce code size of value witness functions we propose to model their
      functionality via type layout strings that are to be interpreted by runtime
      functions.

      Today, say you have a ``Pair<T>``: the value witness function to copy such a
      value will contain two function calls to ``T->assignWithCopy()``.

      (To be clear, this paragraph stems from a time before type-layout-based value witness generation existed, i.e., it describes what used to happen if IGM.getOptions().UseTypeLayoutValueHandling == false.)

      Pair* Pair::assignWithCopy(Pair* dest, Pair* src, M* self) {
        auto *T = self->T_metadata;
        T->assignWithCopy(dest, src, T);
        T->assignWithCopy(dest + self->offsetOfSecondProperty,
                          src + self->offsetOfSecondProperty,
                          T);
        return dest;
      }
      

      This becomes worse as we nest types.

      struct Container<T> {
        var first: Pair<T>
        var second: Pair<T>
      }
      
      Container* Container::assignWithCopy(Container* dest, Container* src, M* self) {
        auto *T = self->T_metadata;
        T->assignWithCopy(dest, src, T);
        auto *PairOfT = getGenericMetadata(Pair, T);
        T->assignWithCopy(dest + PairOfT->offsetOfSecondProperty, ..., T);
        T->assignWithCopy(dest + self->offsetOfSecondProperty, ..., T);
        T->assignWithCopy(dest + self->offsetOfSecondProperty +
                                 PairOfT->offsetOfSecondProperty,
                          ..., T);
        return dest;
      }
      

      This approach has two drawbacks:

      • Code size: the value witness functions require quite a bit of space.
          Furthermore, because the above value witness is slow (it creates the
          metadata for stored property members), we create outlined value functions for
          concrete instantiations of a type.
          let y = Container(first: Pair(1, 1), second: Pair(2, 2))
          var x = y
        

          In this code instead of calling the value witness we create a specialized
          version that has knowledge of the types involved and therefore does not need
          to call getGenericMetadata.

      • Performance: in order to compute field offsets we instantiate metadata for the
          field's type.

      Layout strings:

      Instead of generating code for the value witness function we propose to compute
      a type layout and store a byte string representation of it that is to be
      interpreted by the runtime to achieve the copy.

      The layout string for the Pair<T> would just be an encoding of:

      pair_layout = "T,T"
      

      The runtime would then interpret this string.

      swift_assignWithCopy(pair_layout, T, Pair* dest, Pair* src) =
         auto currFieldDest = dest;
         auto currFieldSrc = src;
         T->assignWithCopy(currFieldDest, currFieldSrc, T);
         currFieldDest = alignTo(currFieldDest + T->size, T->alignment);
         currFieldSrc = ...;
         T->assignWithCopy(currFieldDest, currFieldSrc, T);
      

      Similarly for ``Container<T>``.

      container_layout = "T,T,T,T"
      
      swift_assignWithCopy(container_layout, T, Container* dest, Container* src) =
         auto currFieldDest = dest;
         auto currFieldSrc = src;
         T->assignWithCopy(currFieldDest, currFieldSrc, T);
         currFieldDest = alignTo(currFieldDest + T->size, T->alignment);
         currFieldSrc = ...;
         T->assignWithCopy(currFieldDest, currFieldSrc, T);
         currFieldDest = alignTo(currFieldDest + T->size, T->alignment);
         currFieldSrc = ...;
         T->assignWithCopy(currFieldDest, currFieldSrc, T);
         ...;
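
The two interpreter listings above differ only in how many fields they walk, so a single loop driven by the layout string can serve both. The following is a minimal C++ sketch of that idea, not the proposed runtime implementation. The names TypeMetadata, assignWithCopyFromLayout and Int32Metadata are hypothetical, and the layout string is the toy "T,T" form used in this description rather than a real encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the metadata of the generic parameter T.
struct TypeMetadata {
  std::size_t size;
  std::size_t alignment;  // assumed to be a power of two
  // Assigns *src into *dest and returns dest.
  void *(*assignWithCopy)(void *dest, const void *src, const TypeMetadata *self);
};

// Rounds offset up to the next multiple of alignment (a power of two).
inline std::size_t alignTo(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Interprets a toy layout string such as "T,T": every 'T' is one stored
// field of type T, laid out in order; all other characters are separators.
inline void *assignWithCopyFromLayout(const std::string &layout,
                                      const TypeMetadata *T,
                                      void *dest, const void *src) {
  auto *d = static_cast<unsigned char *>(dest);
  auto *s = static_cast<const unsigned char *>(src);
  std::size_t offset = 0;
  for (char c : layout) {
    if (c != 'T') continue;
    offset = alignTo(offset, T->alignment);
    T->assignWithCopy(d + offset, s + offset, T);  // same offset in dest and src
    offset += T->size;
  }
  return dest;
}

// Example metadata for T = int32_t, which is trivially copyable.
static void *assignInt32(void *dest, const void *src, const TypeMetadata *) {
  *static_cast<std::int32_t *>(dest) = *static_cast<const std::int32_t *>(src);
  return dest;
}
static const TypeMetadata Int32Metadata = {sizeof(std::int32_t),
                                           alignof(std::int32_t), assignInt32};
```

Calling assignWithCopyFromLayout("T,T", &Int32Metadata, dst, src) copies the first two fields, and "T,T,T,T" copies four, without any per-type generated code.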
      

      Value witness functions:

      • Copy and destroy functions can be implemented in terms of layouts by interpreting
        the string.
      initializeBufferWithCopyOfBuffer, destroy, initializeWithCopy, assignWithCopy,
      initializeWithTake, assignWithTake
      
      • Single payload enum value witnesses can be implemented similarly: the runtime
        will have knowledge of size, alignment, and extra inhabitants of builtin types. The remainder can be
        calculated using this information, as we do today in IRGen and the runtime.
      getEnumTagSinglePayload, storeEnumTagSinglePayload
      
      • Constants like size, flags, and extra inhabitants can be computed or are stored as part of type layouts.

      Enum-only witnesses can be implemented in terms of layouts.
      getEnumTag, destructiveProjectEnumData, destructiveInjectEnumTag

      Type layouts:

      Basic types.

      Integers, boolean, floating point

      i256, ..., i16, i8, i1
      f64, ...
      copy, destroy functions are trivial
      extra inhabitants are known.
      

      Reference types:

      NativeObject (NO), Objective-C refcounted, UnknownObject, BridgedObject, Unowned, ...
      copy calls refcount implementation.
      extra inhabitants are known.
      

      Fixed layout types aggregate the layout.

      Examples of aggregates.

      Two<Int64, Int8>
      two_ints_layout = "POD72|{I64,I8}"
      
      two_pointers_layout = "BitwiseTakable128|NO,NO"
      
      Optional<Two<Int64, Bool>>
      single_payload_layout = "ES2{I64,I1}"
      
      MultiPayloadEnum<Int, Bool, AClass>
      
      multi_payload_layout = "EM{3}{2}{{Int}, {Bool}, {AClass}}"
      
      swift_getEnumTagSinglePayload =
        xi_tag_addr = max_xi_addr({I64,I1})
        tag = xi_tag(xi_tag_addr, I1)
        if tag <= 1 return 0
        else return tag - 1
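
The swift_getEnumTagSinglePayload pseudocode above can be sketched in C++ for the simplest case. This sketch assumes the extra inhabitants come from a single byte holding an I1, whose valid bit patterns are 0 and 1, so every other byte value marks an empty case. The function names and this encoding are assumptions made for illustration; the actual runtime encoding may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a Bool-like I1 is stored in one byte and uses only the bit
// patterns 0 and 1; the remaining byte values are extra inhabitants.
constexpr unsigned kNumValidBoolPatterns = 2;

// Returns 0 if the payload case is stored, otherwise the 1-based index of
// the empty case encoded in the extra inhabitant.
inline unsigned getEnumTagSinglePayloadFromBool(const std::uint8_t *xiByteAddr) {
  unsigned raw = *xiByteAddr;
  if (raw < kNumValidBoolPatterns) return 0;  // a valid Bool: payload case
  return raw - kNumValidBoolPatterns + 1;     // same as "tag - 1" above
}

// Stores the empty case with the given 1-based index; tag 0 means the
// payload was already written and nothing needs to change.
inline void storeEnumTagSinglePayloadToBool(std::uint8_t *xiByteAddr,
                                            unsigned tag) {
  if (tag == 0) return;
  *xiByteAddr = static_cast<std::uint8_t>(kNumValidBoolPatterns + tag - 1);
}
```

With this encoding, Optional's none case (empty case 1) is stored as the byte value 2, which matches the "if tag <= 1 return 0 else return tag - 1" logic.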
      

      Resilient types:

      Two<ResilientStruct, ResilientEnum>
      two_resilient_layout = "ResilientStruct_ref, ResilientEnum_ref"
      

      Generic resilient types:

      ResilientTwo<Int64,T>
      resilient_two_layout = "ResilientTwo_ref{Int64_ref, T}"
      

      Computing the layout:

      A type layout is a tree of type layout entries. The contents describe the minimal information for generation of IR. We probably will want to make them independent of TypeInfo later.

      TypeLayoutEntry {
        kind;
      }
      
      ListTypeLayoutEntry {
        TypeLayoutEntry *this;
        TypeLayoutEntry *next;
      }
      
      LeafTypeLayoutEntry : TypeLayoutEntry {
        TypeInfo &scalar;
      }
      
      ArchetypeLayoutEntry : LeafTypeLayoutEntry {
        SILType archetype;
      }
      
      AlignmentTypeLayoutEntry : TypeLayoutEntry {
         alignment;
         fixed;
         ListTypeLayoutEntry children;
      }
      
      EnumTypeLayoutEntry : TypeLayoutEntry {
        TypeInfo &enum;
        TypeLayoutEntry children[];
      }
      
      ResilientTypeLayoutEntry : TypeLayoutEntry {
        TypeInfo &type;
        TypeLayoutEntry children[];
      }
      
      

      We'll add APIs to IRGen's TypeInfo to compute type layout (e.g., for structs as
      part of computing StructLayout). TypeLayouts should be uniqued as part of
      building in TypeConverter.
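
To illustrate how sizes and alignments fall out of such a tree, here is a small C++ sketch. It is deliberately simpler than the entries listed above: LayoutEntry, leaf, aggregate and computeLayout are hypothetical names, leaves carry their size and alignment directly instead of referring to a TypeInfo, and enums and resilient types are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified tree node: a leaf knows its own size and alignment, an
// aggregate lays out its children in order.
struct LayoutEntry {
  enum class Kind { Leaf, Aggregate };
  Kind kind = Kind::Leaf;
  std::size_t size = 0;               // Leaf only, in bytes
  std::size_t alignment = 1;          // Leaf only, power of two
  std::vector<LayoutEntry> children;  // Aggregate only
};

inline LayoutEntry leaf(std::size_t size, std::size_t alignment) {
  LayoutEntry e;
  e.size = size;
  e.alignment = alignment;
  return e;
}

inline LayoutEntry aggregate(std::vector<LayoutEntry> children) {
  LayoutEntry e;
  e.kind = LayoutEntry::Kind::Aggregate;
  e.children = std::move(children);
  return e;
}

struct SizeAndAlignment {
  std::size_t size;
  std::size_t alignment;
};

inline std::size_t alignTo(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Walks the tree: each child is placed at the next suitably aligned offset.
// The result is the unpadded size (not the stride) and the maximum alignment.
inline SizeAndAlignment computeLayout(const LayoutEntry &entry) {
  if (entry.kind == LayoutEntry::Kind::Leaf)
    return {entry.size, entry.alignment};
  std::size_t offset = 0;
  std::size_t maxAlign = 1;
  for (const LayoutEntry &child : entry.children) {
    SizeAndAlignment c = computeLayout(child);
    offset = alignTo(offset, c.alignment) + c.size;
    maxAlign = std::max(maxAlign, c.alignment);
  }
  return {offset, maxAlign};
}
```

For aggregate({leaf(8, 8), leaf(1, 1)}), modeling Two<Int64, Int8>, the computed size is 9 bytes, i.e. the 72 bits in the "POD72" example above.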

      Step 1:

      Using this type lowering we will be able to generate value witness functions
      that don't instantiate generic metadata except for resilient types.

      • compute offsets/sizes based on type layouts
      • copy/destroy non-resilient enums based on cases

      My hope is that this will allow us to replace the outlined value functions with value witness calls to save code size and that this will improve performance for some witness calls because we can avoid calls to getGenericMetadata.

      (This is what happens if IGM.getOptions().UseTypeLayoutValueHandling == true which is the default as of today)

      Step 2:

      Generate a byte code representation of the type layout. This can be a
      serialization of the type layout representation:

      • Archetypes as indices
      • Encode paths for things like T.associatedType
      • Encoding for enums: num empty/payload cases, (no/single/multi)payload/c
      • Fast paths: {POD128|} {BitwiseTakable128|}

        Interpreter in the runtime:

        - Compute offsets, alignment, extra inhabitants at runtime
        - Type metadata access, associated type witness
        - Optimizations like {POD128|}
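
As a rough illustration of Step 2, the sketch below serializes a layout into a byte string and interprets it to compute a size at runtime. The opcode set (End, PodBytes, Archetype), the operand format and the names ArgMetadata and interpretSize are invented for this example and are not a proposed encoding. Archetypes are referenced by index into the generic arguments, as suggested in the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes:
//   PodBytes <byte count> <alignment>   a run of trivially copyable bytes
//   Archetype <generic argument index>  a field whose type is a generic argument
//   End                                 terminates the layout
enum Opcode : std::uint8_t { End = 0, PodBytes = 1, Archetype = 2 };

// Stand-in for what the runtime would read from a generic argument's metadata.
struct ArgMetadata {
  std::size_t size;
  std::size_t alignment;  // power of two
};

inline std::size_t alignTo(std::size_t offset, std::size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Interprets the byte code and returns the unpadded size of the value.
inline std::size_t interpretSize(const std::vector<std::uint8_t> &code,
                                 const std::vector<ArgMetadata> &genericArgs) {
  std::size_t offset = 0;
  std::size_t pc = 0;
  while (pc < code.size()) {
    switch (code[pc++]) {
    case PodBytes: {
      std::size_t count = code[pc++];
      std::size_t alignment = code[pc++];
      offset = alignTo(offset, alignment) + count;
      break;
    }
    case Archetype: {
      const ArgMetadata &arg = genericArgs[code[pc++]];
      offset = alignTo(offset, arg.alignment) + arg.size;
      break;
    }
    default:  // End, or an unknown opcode: stop interpreting
      return offset;
    }
  }
  return offset;
}
```

A two-field type holding an Int64 followed by a T would be encoded here as {PodBytes, 8, 8, Archetype, 0, End}; the same byte string then yields different sizes depending on the metadata passed for T.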


      Replace value witness entries with functions that reference type layouts.

      Backwards deployment: where the runtime library to interpret type layouts is not available, we have to fall back to witnesses. Can we emit witness tables to sections that are not loaded unless needed?

            People

            Assignee:
            gwenm Gwen Mittertreiner
            Reporter:
            aschwaighofer Arnold Schwaighofer
