[SR-5506] Change name mangling of subscripts to no longer contain the string "subscript" #48078
Comments
Before I start off, here's my plan for the new mangling scheme. @eeckstein, could you have a look through it and see whether you have any objections so far?
In the grammar, extend entity-spec with:
entity-spec ::= type 'fSg'
entity-spec ::= type 'fSs'
Example
In the following file:
struct Input {}
struct Output {}
class Foo {
    subscript(myarg myarg: Input) -> Output {
        fatalError()
    }
}
The subscript getter would be mangled as |
Subscripts themselves also need to be mangled for SourceKit purposes: they get a unique USR, and act as a DeclContext for any generic parameters. |
We need to include an optional file discriminator in case the same subscript is defined in two files (with fileprivate):
entity-spec ::= type file-discriminator? 'fSg'
entity-spec ::= type file-discriminator? 'fSs' |
A comment on the demangler implementation: I suggest not adding a new demangler node (in DemangleNodes.def). Instead, create the same node tree as now, e.g.:
kind=Global
  kind=Getter
    kind=Structure
      kind=Module, text="m"
      kind=Identifier, text="Mystruct"
    kind=Identifier, text="subscript"
    kind=Type
      kind=FunctionType
        kind=ArgumentTuple
          kind=Type
            kind=Structure
              kind=Module, text="Swift"
              kind=Identifier, text="Int"
        kind=ReturnType
          kind=Type
            kind=Structure
              kind=Module, text="Swift"
              kind=Identifier, text="Int"
The re-mangler can then just special-case a "subscript" identifier. |
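To make the suggestion above concrete, here is a minimal Swift sketch of a re-mangler special case. The `Node` struct and the `accessorOperator(for:)` helper are hypothetical stand-ins invented for illustration; they are not the actual demangler API, and the 'fSg'/'fSs' operators are the ones proposed earlier in this thread, not a shipped mangling.

```swift
// Hypothetical, simplified stand-in for a demangler node (not the real API).
struct Node {
    var kind: String
    var text: String? = nil
    var children: [Node] = []
}

// Helper that builds the "Type > Structure > Swift.Int" subtree used twice below.
func intType() -> Node {
    return Node(kind: "Type", children: [
        Node(kind: "Structure", children: [
            Node(kind: "Module", text: "Swift"),
            Node(kind: "Identifier", text: "Int"),
        ])
    ])
}

// The same tree shape as in the comment above: the subscript is still an
// ordinary Identifier node whose text happens to be "subscript".
let getter = Node(kind: "Getter", children: [
    Node(kind: "Structure", children: [
        Node(kind: "Module", text: "m"),
        Node(kind: "Identifier", text: "Mystruct"),
    ]),
    Node(kind: "Identifier", text: "subscript"),
    Node(kind: "Type", children: [
        Node(kind: "FunctionType", children: [
            Node(kind: "ArgumentTuple", children: [intType()]),
            Node(kind: "ReturnType", children: [intType()]),
        ])
    ]),
])
let tree = Node(kind: "Global", children: [getter])

// Assumed re-mangler special case: an accessor whose direct Identifier child
// is "subscript" gets the proposed 'fSg'/'fSs' operator, anything else keeps
// 'fg'/'fs'. Only "Getter" and "Setter" kinds are modelled here.
func accessorOperator(for accessor: Node) -> String {
    let suffix = accessor.kind == "Setter" ? "s" : "g"
    let name = accessor.children.first(where: { $0.kind == "Identifier" })
    if name?.text == "subscript" {
        return "fS" + suffix
    }
    return "f" + suffix
}

print(accessorOperator(for: tree.children[0]))  // fSg
```

Note that this special case keys purely on the identifier text, so it cannot tell a real subscript from a computed property that is literally named "subscript".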
@swift-ci create |
Thanks for the feedback. I've just tried to implement it and had a new idea that I like a lot better: introduce a new level of indirection in the definition of decl-name:
decl-name ::= decl-base-name
decl-name ::= decl-base-name 'L' INDEX // locally-discriminated declaration
decl-name ::= decl-base-name identifier 'LL' // file-discriminated declaration
decl-base-name ::= identifier
decl-base-name ::= 'nS' // Special *n*ame *s*ubscript
Implementation would be trivial: just write the 'nS' operator when mangling a subscript.
I feel like I need to create a new demangler node, though, so that we can differentiate between computed properties named 'subscript' and proper subscripts with the same signature (see SR-5568), but creating a new node "SubscriptName" (or a similar name) that replaces
Any thoughts and/or objections, especially for the 'nS' operator? |
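As a rough illustration of the decl-base-name indirection, the following Swift sketch models the grammar with a toy mangler. All names here (`DeclBaseName`, `mangleIdentifier`, `mangleDeclName`, `mangleGetter`) are invented for this example and are not compiler code. The sketch assumes plain length-prefixed identifiers, ignores word substitutions and the 'L' INDEX form, and takes the type string "yyc" as given from the example in the issue description.

```swift
// Toy model of the proposed production; not the compiler's DeclBaseName.
enum DeclBaseName {
    case identifier(String)  // decl-base-name ::= identifier
    case subscriptName       // decl-base-name ::= 'nS'
}

// Length-prefixed identifier, e.g. "Foo" becomes "3Foo".
// Assumption: no word substitutions, ASCII-only names.
func mangleIdentifier(_ text: String) -> String {
    return "\(text.utf8.count)\(text)"
}

func mangleBaseName(_ name: DeclBaseName) -> String {
    switch name {
    case .identifier(let text):
        return mangleIdentifier(text)
    case .subscriptName:
        return "nS"
    }
}

// decl-name ::= decl-base-name
// decl-name ::= decl-base-name identifier 'LL'   (file-discriminated)
func mangleDeclName(_ name: DeclBaseName, fileDiscriminator: String? = nil) -> String {
    var result = mangleBaseName(name)
    if let discriminator = fileDiscriminator {
        result += mangleIdentifier(discriminator) + "LL"
    }
    return result
}

// Context, then decl-name, then type, then the 'fg' getter suffix.
func mangleGetter(context: String, name: DeclBaseName, type: String) -> String {
    return "_T0" + context + mangleDeclName(name) + type + "fg"
}

let context = mangleIdentifier("test") + mangleIdentifier("Foo") + "C"
let oldStyle = mangleGetter(context: context, name: .identifier("subscript"), type: "yyc")
let newStyle = mangleGetter(context: context, name: .subscriptName, type: "yyc")

print(oldStyle)  // _T04test3FooC9subscriptyycfg
print(newStyle)  // _T04test3FooCnSyycfg
```

In this model an entity whose name is the identifier "subscript" and a real subscript produce different base names, which is the distinction the SR-5568 remark is after.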
> The only disadvantage I see is that we burn another character ('n') for mangling operators and it looks like the supply is already running low.
That's true. Therefore I suggest using 'fSg'/'fSs'. I think if we go that way, it only makes sense if we also change 'fC'/'fc' to get rid of the file-discriminator altogether (and this would not be a trivial change). |
I don't care where the file discriminator is, but you definitely need it. Otherwise two files can end up defining the same subscript when using |
My idea was to still keep the 'fg', 'fs', or 'i' suffix to distinguish between subscript getters, setters and the subscript itself, mangling e.g. the getter in the example above as "_T04test3FooC nS AA6OutputVAA5InputV5myarg_tcfg". I thought this had the advantage that it exactly mirrors the internal representation: the subscript's getter is a getter of an entity that just happens to have a special name (which is now no longer an identifier). Things like addressors (e.g. 'flo') would also work out of the box, though I don't completely understand whether they are necessary or even make sense. |
There are other accessors besides getters and setters, but if you go through the generalized accessor code that should be fine. Functions, closures, or types nested within subscript accessors will contain the subscript mangled name as their "context". |
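For readers unfamiliar with the nesting case mentioned above, here is a small Swift example of a function and a closure declared inside a subscript getter. The `Table` type is made up for illustration. Both nested entities live inside the getter, so their mangled names would need the subscript accessor as context.

```swift
struct Table {
    let values: [Int]

    subscript(index: Int) -> Int {
        // Nested function: declared inside the subscript getter.
        func clamp(_ i: Int) -> Int {
            return min(max(i, 0), values.count - 1)
        }
        // Closure: also nested inside the subscript getter.
        let lookup = { (i: Int) -> Int in self.values[clamp(i)] }
        return lookup(index)
    }
}

let table = Table(values: [10, 20, 30])
print(table[1])   // 20
print(table[99])  // 30 (index clamped to the last element)
```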
Having tried multiple approaches now, I decided to keep this as a public notepad of my ideas and thoughts. Feel free to comment if you have any thoughts about it, or just ignore it. For the mangling of the subscript itself, we can just drop the "subscript" string from the mangled name, since 'i' uniquely identifies the mangled entity as a subscript. No problem here. The function entity mangling (i.e. everything after an 'f') is currently used for two different things:
Thus only the last element in the mangled name string describes an unnamed entity (for 1. it's the constructor and for 2. it's the getter/setter part). Subscript accessors now have two unnamed elements, namely the subscript and the accessor kind. As far as I can see, this can be represented in the following ways:
I am currently pursuing the third approach. |
Additional Detail from JIRA
Issue Description:
Subscripts are no longer represented internally by the identifier "subscript" but by a special DeclBaseName. In name mangling, however, the string "subscript" still surfaces (e.g. _T04test3FooC9subscriptyycfg). We should use a special flag like "fS" here instead, similar to how "fC" is used for constructors or "fD" for destructors.