[SR-14115] SIL: derivative type calculation for coroutines #54404

Open
Tracked by #54401 ...
dan-zheng opened this issue Dec 21, 2019 · 2 comments
Labels
AutoDiff, compiler (The Swift compiler in itself), feature (A feature request or implementation), SIL

Comments

@dan-zheng
Collaborator

Previous ID SR-14115
Radar None
Original Reporter @dan-zheng
Type Sub-task
Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Sub-task, AutoDiff
Assignee None
Priority Medium

md5: fe00df47fa9d2ad8d02ae88725898756

Parent-Task:

  • SR-14113 Support _read and _modify accessor differentiation

Issue Description:

SIL has dedicated coroutine function types: https://github.com/apple/swift/blob/master/docs/SIL.rst#coroutine-types

Figure out derivative type calculation for coroutines.

// Array.subscript.modify example
Original function type: $@yield_once @convention(method) <τ_0_0> (Int, @inout Array<τ_0_0>) -> @yields @inout τ_0_0 
Derivative function type: ???
@asl
Collaborator

asl commented Aug 17, 2023

@rxwei @dan-zheng @BradLarson Likely you already discussed this issue and maybe even came to some conclusion. Below are my thoughts about the issue.

For the sake of clarity I will consider the following trivial code in the examples:

struct S: Differentiable {
    private var _x : Float

    var x: Float {
        get{_x}
        set(newValue) { _x = newValue }
        _modify { yield &_x }
    }

    init(_ x : Float) {
        self._x = x
    }
}

Here, the _modify accessor for x has the following function type: $@yield_once @convention(method) (@inout S) -> @yields @inout Float
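As a rough source-level illustration of the two halves of such an accessor, here is a minimal sketch in plain Swift (no differentiation involved). The `log` property is a hypothetical addition used only to make the control flow visible; the comments relating the halves to begin_apply and end_apply describe the usual lowering and are not taken from compiler output.

```swift
struct S {
    private var _x: Float
    // Hypothetical instrumentation, not part of the example in this thread.
    private(set) var log: [String] = []

    var x: Float {
        get { _x }
        set { _x = newValue }
        _modify {
            log.append("begin")  // yield part: runs when the access starts (begin_apply)
            yield &_x            // the caller mutates _x in place
            log.append("end")    // resume part: runs when the access ends (end_apply)
        }
    }

    init(_ x: Float) {
        self._x = x
    }
}

var s = S(2)
s.x *= 3  // an in-place mutation goes through the _modify accessor
```

After the compound assignment, `s.x` is 6 and `s.log` holds the two entries in order, which shows that code on both sides of the yield runs around the caller's mutation.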

We need to specify:

  • VJP type
  • Pullback type

VJP type

In the simple case, the VJP type is the Differentiable-constrained original function type with one additional result: the pullback (for simplicity I am ignoring additional cases such as reabstraction thunks). Assume for a moment that we have already defined a pullback type for S.x._modify above and denote it by PbFnTy.

If S.x._modify were an "ordinary" function, its VJP type would be @convention(method) (@inout S) -> @owned PbFnTy. However, S.x._modify is a co-routine, and co-routines cannot have ordinary results; only @yields are allowed. Some thoughts on how to solve this problem:

  • We could extend co-routines and allow them to have ordinary results. However, this seems to have very big consequences for the semantics. For example, we'd need to enforce that both the resume and unwind epilogues of a co-routine return a value. In general, this might not be possible for the unwind part (though we could partly mitigate this with even more complex schemes, e.g. returning an Optional here). Semantically, the final result would be obtained at end_apply, etc. So things look very complicated here, and I'm afraid this would require changing lots of code here and there.
  • Yield the pullback. This seems to be possible, and the VJP type would be $@yield_once @convention(method) (@inout S) -> @yields (@inout Float, @owned $PbFnTy). However, yielded values represent the execution state in the middle of the co-routine. Therefore, semantically this is only possible if there are no active values in the resume part of the co-routine, so that the pullback for the sub-function before the yield instruction and the pullback for the complete function including the resume sub-function are the same. Practically this seems unlikely, as there are defer calls here and there, including in Array.subscript._modify.
  • Return the pullback indirectly. This way the pullback value would be available only after end_apply; however, it could capture the pullbacks up until the end, and the store of the pullback value would occur only in the resume part of the co-routine. Adding support for indirect return of a pullback would require some changes in the VJP cloner, but otherwise seems to be OK. This way, the VJP type would be $@yield_once @convention(method) ($*PbFnTy, @inout S) -> @yields (@inout Float)
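The third option can be mimicked at the source level with a real `_modify` coroutine. The following is only a hand-written sketch under several assumptions: `PullbackSlot` is a hypothetical class standing in for the indirect `$*PbFnTy` result, the `vjpInto` subscript and the `bump` helper are made up for illustration, and the resume part doubles `_x` so that it contains some active work. None of this is what the compiler generates.

```swift
// Hypothetical stand-in for the indirectly returned pullback ($*PbFnTy).
final class PullbackSlot {
    var pullback: ((inout Float) -> Void)?
}

struct S {
    private var _x: Float
    init(_ x: Float) { _x = x }
    var x: Float { _x }

    // Hand-written "VJP" of a modify accessor whose resume part doubles _x.
    subscript(vjpInto slot: PullbackSlot) -> Float {
        get { _x }
        set { _x = newValue }
        _modify {
            yield &_x   // the caller mutates _x in place
            _x *= 2     // active work in the resume part
            // The pullback is stored only here, in the resume part.
            slot.pullback = { (dx: inout Float) in dx *= 2 }
        }
    }
}

// Mutates the yielded value and reports whether the slot is still empty
// while the access is in progress.
func bump(_ value: inout Float, observing slot: PullbackSlot) -> Bool {
    value += 1
    return slot.pullback == nil
}

let slot = PullbackSlot()
var s = S(3)
let pendingDuringAccess = bump(&s[vjpInto: slot], observing: slot)
```

In this sketch the slot is still empty while the caller holds the yielded value and is filled only once the access has ended, which is the timing the indirect-return option relies on. The stored closure here covers only the resume part.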

In reality I think we'd need 2 pullbacks or some clever combination of them. See below.

Pullback

Pullback propagates derivative information from active results back to parameters. In the normal SIL world, functions have a single entry and a single return. Therefore, in general, for a function (T) -> R the pullback goes from the tangent of the result to the tangent of the input parameter(s) and has the signature (R.Tan) -> T.Tan. Unfortunately, @yield_once co-routines effectively have two entries and two exits: the first is implemented through the begin_apply instruction and the second via the end_apply instruction. Fortunately, co-routines do not have normal results; the only results are yields and semantic result parameters (inout arguments). Therefore we might think that the appropriate pullback type for the case above would be (@inout Float, @inout S) -> Void. This is almost correct, but there is a subtle and important detail: the co-routine exits twice. Therefore the pullback itself should allow two entries, and a plain function is not appropriate here.
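For reference, the ordinary (non-coroutine) shapes mentioned above can be written out by hand in plain Swift. This is a minimal sketch that does not use the `_Differentiation` module; `vjpSquare` and `vjpTriple` are hypothetical names, and Float is used as its own tangent type.

```swift
// f(x) = x * x. The VJP returns the value together with a pullback
// of shape (R.Tan) -> T.Tan.
func vjpSquare(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
    return (value: x * x, pullback: { v in 2 * x * v })
}

// g(&x) sets x = 3 * x. An inout parameter is a semantic result, so the
// pullback takes the adjoint inout and updates it in place.
func vjpTriple(_ x: inout Float) -> (inout Float) -> Void {
    x *= 3
    return { (dx: inout Float) in dx *= 3 }
}

let (value, pullback) = vjpSquare(3)
let gradient = pullback(1)      // d(x * x)/dx at x = 3

var y: Float = 2
let triplePullback = vjpTriple(&y)
var dy: Float = 1
triplePullback(&dy)             // adjoint of y is scaled by 3
```

Both pullbacks here are plain functions with a single entry, which is exactly what stops working once the original function has a yield in the middle.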

Essentially, we can think of co-routines as a pair of functions: one for the first part up to the yield and another for the resume part (as I already mentioned, we assume that it is not possible to abort the co-routines in question, so the unwind part can always be ignored). We would not need a co-routine at all if it were possible to return an address from a function (usually we are yielding the address of some internal struct field). This would certainly complicate lots of things here and there, as we do not expect to have many pointer aliases in SIL.

One could think about using a co-routine as a pullback. If we tried to execute the co-routine in reverse, the output of the reversed resume part would be the adjoints of all active values that are inputs to the resume part. These would be the yields of the co-routine, and we would need to replace end_apply with begin_apply of the pullback co-routine. The fun stuff in the reverse pass starts at the begin_apply of the original function, as we'd need to replace it with end_apply of the pullback co-routine (and here the yield part of the original co-routine would become the resume part of the pullback co-routine). The problem is that we'd need to inject the adjoints of the original yields into the co-routine, and there is no way to pass "additional" parameters to end_apply.

It looks like the only solution is to pass these adjoints indirectly, storing the values to adjoint buffers after the co-routine returns. So, in our example the co-routine pullback type would be $@yield_once @convention(method) (@inout_aliasable Float, @inout S) -> @yields $().

An alternative to this would be some kind of pair of pullbacks: one for the part of the function before the yield, which would be a yield result, and a second one for the resume part of the co-routine, which would be returned indirectly. We would need to introduce special cases here and there (there might be multiple pullbacks for a single function) and replace begin_apply and end_apply with the corresponding pullback calls in the reverse pass. The advantage of this approach is a more natural representation (e.g. inout yields would be turned into inout pullback parameters, etc.). The obvious disadvantage is that if there are multiple active inputs to the resume block, then we'd need to expose their adjoints as pullback results / inputs, and this cannot be determined from the co-routine type alone. Though usually the only active input is the struct itself, so we might be fine here (and deny differentiation of other cases).

Both approaches have the following assumptions:

  • It is not possible to abort the co-routine
  • There is a single yield instruction

Therefore it could be possible to split the function into two parts: the yield part and the resume part.
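Under these two assumptions, the split into a yield part and a resume part can be mimicked in plain Swift by replacing the coroutine with a function that takes the caller's mutation as a closure: everything before the closure call plays the role of the yield part and everything after it the role of the resume part. The sketch below is an analogy only, not the proposed SIL design. `STangent`, `Pullback`, and `vjpModifyX` are hypothetical names, and the resume part doubles `_x` so that it contains an active computation.

```swift
struct S { var _x: Float }
struct STangent { var _x: Float }   // hypothetical stand-in for S.TangentVector

typealias Pullback<T> = (inout T) -> Void

// Closure-based stand-in for the VJP of a yield-once modify accessor.
// `body` is the VJP of the caller's mutation of the yielded value.
func vjpModifyX(
    _ s: inout S,
    body: (inout Float) -> Pullback<Float>
) -> Pullback<STangent> {
    // Yield part: nothing active in this toy, so its pullback is the identity.
    let pbBody = body(&s._x)        // plays the role of "yield &_x"
    // Resume part: an active computation after the yield.
    s._x *= 2
    let pbResume: Pullback<STangent> = { (ds: inout STangent) in ds._x *= 2 }

    // Reverse pass: the resume pullback runs first, then the caller's
    // pullback on the adjoint of the yielded value, then the yield part's.
    return { (ds: inout STangent) in
        pbResume(&ds)
        pbBody(&ds._x)
    }
}

var s = S(_x: 3)
let pullback = vjpModifyX(&s) { (x: inout Float) -> Pullback<Float> in
    let x0 = x
    x = x0 * x0                     // the caller squares the yielded value
    return { (dx: inout Float) in dx *= 2 * x0 }
}
var ds = STangent(_x: 1)
pullback(&ds)
```

The overall computation is 2 * x * x, so at x = 3 the value is 18 and the adjoint comes out as 12. Note that the pieces run in the opposite order in the reverse pass, mirroring the begin_apply / end_apply swap described above.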

Are there any other ideas? I'm slightly leaning towards a co-routine pullback that is returned indirectly. Is there anything I missed?

@asl
Collaborator

asl commented Sep 1, 2023

Status update: I have a proof-of-concept implementation that seems to generate correct SIL code for things like:

struct S: Differentiable {
    private var _x : Float

    func _endMutation() {
        // do something
    }
    
    var x: Float {
        get{_x}
        set(newValue) { _x = newValue }
        _modify {
            defer { _endMutation() }
            if x > 0 { // invoke getter to have some nested function calls
                yield &_x
            } else {
                yield &_x
            }
        }
    }

    init(_ x : Float) {
        self._x = x
    }
}

Implementation notes:

  • Both pullback and VJP must be co-routines (because e.g. we need to yield the value in VJP and the adjoint buffer in the pullback!)
  • It is possible to support multiple yield instructions and have some non-trivial code in resume part
  • We ended up with some weird things that cannot be codegen'ed now, as we'd need to partially apply a pullback co-routine

Still, it seems there are some quite fundamental obstacles that might require some language extensions / changes.

I summarized the caveats in the following forum post: https://forums.swift.org/t/accessor-coroutines-poor-children/67061

@AnthonyLatsis added the feature and SIL labels on Sep 1, 2023