
[SR-8589] [blocker] model.updateParameters in the iris tutorial crashes the compiler #51107

Closed
marcrasi mannequin opened this issue Aug 21, 2018 · 16 comments
Labels: bug (A deviation from expected or documented behavior; also expected but undesirable behavior), s4tf, iris tutorial, swift for tensorflow

Comments

@marcrasi (mannequin) commented Aug 21, 2018

Previous ID: SR-8589
Radar: None
Original Reporter: @marcrasi
Type: Bug
Status: Resolved
Resolution: Done

Additional Detail from JIRA:
Votes: 0
Component/s: Swift for TensorFlow
Labels: Bug, s4tf-iris-tutorial
Assignee: bgogul (JIRA)
Priority: Medium

md5: cffdda1ec2c8949bd9538c7f7ce2d691

Issue Description:

Execute the following two cells in Jupyter (or paste them into an LLDB REPL):

import TensorFlow
struct Model : Parameterized {
    @TFParameter var x: Tensor<Float> = Tensor(zeros: [10, 10])
}
var model = Model()
model.updateParameters(withGradients: Model.Parameters(x: Tensor(1))) { $0 -= 0.1 * $1 }

You should get this error:

!!! Compiler bug -- Tensor op builtin __tfop_tfc.scalarToTensor,$in cannot be lowered to LLVM IR !!!

If you run everything in a single cell, or everything in a single Swift script, you won't get the error!
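For readers outside the S4TF toolchain, the update pattern itself is plain Swift and easy to see in miniature. The sketch below is a hypothetical, TensorFlow-free analogue (ToyModel and its updateParameters are made up for illustration; Float stands in for Tensor<Float>) of what the trailing closure `{ $0 -= 0.1 * $1 }` expresses: an in-place gradient-descent step.

```swift
// Hypothetical, TensorFlow-free analogue of the crashing call.
// Float stands in for Tensor<Float>; the closure is the same
// gradient-descent step used in the tutorial.
struct ToyModel {
    var x: Float = 1.0

    // Analogue of Parameterized.updateParameters(withGradients:_:):
    // apply the update closure to each parameter/gradient pair.
    mutating func updateParameters(withGradients gradients: Float,
                                   _ update: (inout Float, Float) -> Void) {
        update(&x, gradients)
    }
}

var toy = ToyModel()
toy.updateParameters(withGradients: 1.0) { $0 -= 0.1 * $1 }
// toy.x is now 1.0 - 0.1 * 1.0 = 0.9
```

The crash is not about these semantics, which are ordinary Swift; it is about the REPL failing to partition the trailing closure, as the comments below work out.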

@rxwei commented Aug 21, 2018

Do you have a minimal reproducer?

@marcrasi (mannequin) commented Aug 21, 2018

I created one and updated the description. It seems to be some interaction with LLDB, because I can't get it to cause an error in a plain Swift script.

@rxwei commented Aug 21, 2018

The problem does not occur in compiler mode or script-interpreter mode; it only occurs in the REPL.

@rxwei commented Aug 21, 2018

Same with

model.allParameters.update(withGradients: Model.Parameters(x: Tensor(1))) { $0 -= 0.1 * $1 }

@rxwei commented Aug 21, 2018

This bug clearly has the highest priority, because parameter aggregates are a strict requirement. bgogul (JIRA User), would you like to take a look? I can provide further support on this.

The first step is to figure out why the trailing closure

{ $0 -= 0.1 * $1 }

... is not getting partitioned.

One quick, simple hack is to make the compiler partition even inlinable functions, as long as they are not generic. This change can be made at TFUtilities.cpp:662.

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

It looks like we have a workaround for the time being. I will take a look once I wrap up the other bugs that are blocking the tutorial.

@rxwei commented Aug 21, 2018

There's no workaround yet.

If you do everything in one cell, or if you do everything in one swift script, then you won't get the error!

Putting everything in one cell would make the tutorial notebook no longer a notebook.

@lattner (mannequin) commented Aug 22, 2018

Does this work if you hard code "-tf-promote-global-variables" to enabled?

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

@lattner, enabling tf-promote-global-variables did not fix the issue. I will dig around a bit more.

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

Just an update based on my investigation: in REPL mode, the compiler fails to inline the call to updateParameters:

model.updateParameters(withGradients: Model.Parameters(x: Tensor(1))) { $0 -= 0.1 * $1 }

The reason is that the attempt to link updateParameters fails. The relevant condition in shouldInlinePredicate is the following fragment:

if (callee.empty() && !site.getModule().linkFunction(&callee, SILModule::LinkingMode::LinkAll)

site.getModule().linkFunction(...) returns false for updateParameters in REPL mode.

@rxwei commented Aug 23, 2018

How about the following?

model.allParameters.update(withGradients: Model.Parameters(x: Tensor(1))) { $0 -= 0.1 * $1 }

Does this cause the same error?

This case should be tested first: `updateParameters` is just a Swift-level wrapper that might obscure what's going on.
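To make the wrapper relationship concrete, here is a minimal sketch. The exact S4TF protocol shapes are assumptions based only on the names used in this thread, and Float stands in for Tensor<Float>; the point is that updateParameters simply forwards to allParameters.update, so both spellings should exercise the same code path.

```swift
// Sketch of the layering under discussion. The exact S4TF protocol
// shapes are assumptions; Float stands in for Tensor<Float>.
struct ToyParameters {
    var x: Float

    // Analogue of ParameterAggregate.update(withGradients:_:).
    mutating func update(withGradients gradients: ToyParameters,
                         _ update: (inout Float, Float) -> Void) {
        update(&x, gradients.x)
    }
}

struct WrappedModel {
    var allParameters = ToyParameters(x: 0.5)

    // updateParameters is a thin wrapper that forwards to
    // allParameters.update, so both entry points exercise the
    // same underlying update and should fail identically.
    mutating func updateParameters(withGradients gradients: ToyParameters,
                                   _ update: (inout Float, Float) -> Void) {
        allParameters.update(withGradients: gradients, update)
    }
}

var wrapped = WrappedModel()
wrapped.updateParameters(withGradients: ToyParameters(x: 1.0)) { $0 -= 0.1 * $1 }
// wrapped.allParameters.x is now 0.5 - 0.1 * 1.0 = 0.4
```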

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

Yes, changing it to allParameters.update(..) causes the same error.

I think I understand the issue now. Basically, the SIL IR is not retained after evaluating a line in REPL mode; only the lowered LLVM IR persists. That is why the deabstraction phase is not able to link the SILFunction in.

@rxwei commented Aug 24, 2018

I don't think that's the case; here's an example:

  2> func foo<T>(x: Tensor<T>) -> Tensor<T> {
  3.     return x
  4. } // I hit enter here
  5> foo(x: Tensor(1))
2018-08-23 16:33:19.642759: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
$R0: TensorFlow.Tensor<Double> = 1.0

This means the SIL is being preserved. The compiler cannot partition generic functions.

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

I did a bit more digging. I don't completely understand how the REPL is implemented, but I can see that the SILModule is thrown away after generating the LLVM IR. For instance, SwiftExpressionParser.cpp:1887 calls swift::PerformIRGeneration(), which ultimately releases the SILModule at IRGen.cpp:854.

If that is the case, we need a way of preserving the relevant SIL code throughout the REPL session.

Comment by Gogul Balakrishnan (JIRA) via @swift-ci

I sent out a hacky patch-19502 to deal with this issue for the time being.

@rxwei, could it be that the generic case works because foo has no #tfops, and hence no deabstraction/partitioning is needed? Here is the SIL that was generated for foo in LLDB:

// Foo<A>(x:)
sil @$S13__lldb_expr_33Foo1x10TensorFlow0D0VyxGAG_tAD013AccelerableBydE0RzlF : $@convention(thin) <T where T : AccelerableByTensorFlow> (@guaranteed Tensor<T>) -> @owned Tensor<T> {
// %0                                             // users: %2, %1
bb0(%0 : @guaranteed $Tensor<T>):
  debug_value %0 : $Tensor<T>, let, name "x", argno 1 // id: %1
  %2 = copy_value %0 : $Tensor<T>                 // user: %3
  return %2 : $Tensor<T>                          // id: %3
} // end sil function '$S13__lldb_expr_33Foo1x10TensorFlow0D0VyxGAG_tAD013AccelerableBydE0RzlF' 

@rxwei commented Aug 30, 2018

The function not having any tensor ops is a good point. I intended to add some tensor code there but forgot...

The following case is already being diagnosed, which is great.

  4> func bar<T : Numeric>(x: Tensor<T>) -> Tensor<T> { return x + x }
error: repl.swift:4:6: error: internal error: TensorFlow graph program extraction does not work on generic functions yet
func bar<T : Numeric>(x: Tensor<T>) -> Tensor<T> { return x + x }
     ^

And the following case fails as expected:

  4> @inlinable func bar<T : Numeric>(x: Tensor<T>) -> Tensor<T> { return x + x }
  5> bar(x: Tensor(1))

Your patch makes sense and is a nice hack to unblock progress!

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.