Understanding the comment in Tutorial 02 about differentiating through NLLS vs TheseusLayer #556
-
Thanks for the helpful tutorials! I am going through Tutorial 02. In this tutorial, we have [...]

My confusion: when the comment says "the TheseusLayer objective", doesn't that also include [...]?

Second minor clarification: the part about "in this example" refers to the entire Tutorial 02, not just to section 3.2 at the end, right?

Thanks for any help / advice you may wish to provide.
Replies: 1 comment 1 reply
-
Hi @DanielTakeshi. This note refers to a somewhat subtle point that happens under the hood. It turns out that, for this particular example, the derivative of the outer loss with respect to the learned parameter doesn't actually need derivative information from the NLLS optimization (e.g., you can wrap the optimizer call with `torch.no_grad()` and still reduce the outer loss). You can see the associated discussion in #27. It's been a while since I looked at this example, but that's the main idea, IIRC. I think the wording of this note might be a bit confusing, since it seems to contradict the written code.
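
To make that concrete, here is a minimal PyTorch-only sketch (a toy problem made up for illustration; it is not the Theseus API and not the exact setup from Tutorial 02). The inner least-squares solve is wrapped in `torch.no_grad()`, yet the outer loss still decreases, because the learned parameter also appears directly in the outer loss. In this toy the outer loss equals the inner objective evaluated at the solution, so the gradient that would flow through the inner solution is zero at the optimum anyway.

```python
import torch

torch.manual_seed(0)
t = torch.linspace(0, 1, 20)
y_data = 3.0 * t + 1.0 + 0.05 * torch.randn(20)  # data from slope=3, intercept=1

slope = torch.nn.Parameter(torch.tensor(0.0))  # outer "learned" parameter
outer_opt = torch.optim.Adam([slope], lr=0.1)

for step in range(200):
    # Inner solve: best intercept for the current slope, computed with NO autograd.
    # (In this linear toy it has a closed form; think of it as the NLLS optimizer call.)
    with torch.no_grad():
        intercept = (y_data - slope * t).mean()

    # Outer loss: residual of the full model on the data. `slope` enters it directly,
    # so it still receives a gradient even though `intercept` carries no derivative
    # information from the inner solve.
    outer_loss = ((slope * t + intercept - y_data) ** 2).mean()

    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()

print(slope.item(), outer_loss.item())  # slope ends up near 3.0 and the loss is small
```

Whether the direct dependence alone is enough depends on the problem; in general you do need the backward pass through the TheseusLayer, which is what the written code in the tutorial demonstrates.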