Rename rematerialization of saved for backward symbols #1367
base: main
Conversation
Other reviewers, please don't merge this PR without my review.
thunder/core/transforms.py (outdated diff)
```diff
@@ -3148,6 +3148,9 @@ def recompute_saved_for_backward(fwd_trace: Trace, bwd_trace: Trace) -> tuple[Tr

     producers = find_producer_symbols(fwd_trace, tuple(unvariableify(i) for i in rematerializable), fwd_trace.args)

+    trace_tok = set_tracectx(bwd_trace)
```
Please set and reset traces only with `try: finally:` blocks. If an error is raised between the two calls, the trace will never be reset.
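A minimal sketch of the suggested pattern (assuming `set_tracectx` and `reset_tracectx` from `thunder.core.trace`, with `bwd_trace` standing in for the trace being set):

```python
from thunder.core.trace import set_tracectx, reset_tracectx

trace_tok = set_tracectx(bwd_trace)
try:
    ...  # any work that needs the active trace context goes here
finally:
    # Runs even if the block above raises, so the context is always reset.
    reset_tracectx(trace_tok)
```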
Why do you set the input `bwd_trace` as the active trace? There are no Thunder operation calls between the set and the reset, and the input trace shouldn't be modified.
Would we not use `with tracectx(bwd_trace):` instead?
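For reference, a sketch of that alternative (assuming `tracectx` is the context-manager wrapper in `thunder.core.trace` around the set/reset pair):

```python
from thunder.core.trace import tracectx

# The context manager pairs the set with the reset automatically,
# including on error, so no explicit try/finally is needed.
with tracectx(bwd_trace):
    ...  # work that needs the active trace context
```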
For this I've taken inspiration from the code in the torch_autograd executor; in particular, these lines explain why the trace context needs to be set:
lightning-thunder/thunder/executors/torch_autograd.py
Lines 33 to 40 in 3390c92
```python
# [note: why setting trace ctx?]
# [`TensorProxy.replace_name`](https://github.com/Lightning-AI/lightning-thunder/blob/561b699/thunder/core/proxies.py#L1221-L1223) calls
# [`tensorproxy`](https://github.com/Lightning-AI/lightning-thunder/blob/561b699/thunder/core/proxies.py#L1506-L1520)
# which then calls `TensorProxy.__init__`. `TensorProxy.__init__` of course calls
# [`Proxy.__init__`](https://github.com/Lightning-AI/lightning-thunder/blob/561b699/thunder/core/proxies.py#L81-L86).
# `Proxy`'s dunder init calls [`make_proxy_name`](https://github.com/Lightning-AI/lightning-thunder/blob/561b699/thunder/core/proxies.py#L81-L86)
# which depends on a tracectx.
trace_tok = set_tracectx(bwd_trace)
```
@IvanYashchuk Would an acceptable workaround be to create a new empty trace and use it as ctx?
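If that's acceptable, a hedged sketch of the workaround (assuming `TraceCtx` in `thunder.core.trace` can be constructed empty):

```python
from thunder.core.trace import TraceCtx, tracectx

# Use a fresh, throwaway trace as the active context so the input
# bwd_trace is left unmodified while proxy names are registered.
with tracectx(TraceCtx()):
    ...  # e.g. renaming proxies via TensorProxy.replace_name
```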
Yes, or allow creating Proxies with any name without an active tracectx. Maybe all that is needed is to return `True` if `trc is None` in this function:
lightning-thunder/thunder/core/proxies.py
Lines 75 to 82 in 3390c92
```python
def register_proxy_name(name: None | str = None):
    trc = get_tracectx()
    if name is not None and not trc.has_name(name):
        trc.add_name(name)
        return True
    return False
```
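A minimal sketch of that change (modifying the `register_proxy_name` quoted above; the early return for a missing trace context is the only addition):

```python
def register_proxy_name(name: None | str = None):
    trc = get_tracectx()
    # Assumption per the suggestion above: with no active trace context,
    # accept any name instead of failing on trc being None.
    if trc is None:
        return True
    if name is not None and not trc.has_name(name):
        trc.add_name(name)
        return True
    return False
```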
Note that this does not fix #1232, but it helps with debugging it by producing an overall clearer backward trace when rematerialization of saved-for-backward tensors is enabled.
This is part of #1232. The PR renames the outputs of recomputed symbols so that they do not overlap with names used in the forward trace; fusion rematerialization requires the names used in producer and consumer fusions to be unique.
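For illustration only, a hedged sketch of the renaming idea (the `_remat` suffix and the loose `proxy` variable are hypothetical, not the PR's actual scheme):

```python
from thunder.core.trace import tracectx

# replace_name builds a new proxy and therefore needs an active trace
# context (see the torch_autograd note above).
with tracectx(bwd_trace):
    renamed = proxy.replace_name(proxy.name + "_remat")
```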