
Strange behavior with diagonal gates. #73

Open
sss441803 opened this issue Jul 25, 2023 · 15 comments

@sss441803

Hi,

I constructed a simple 6-qubit circuit with a brickwork pattern and generated the corresponding expression for calculating an amplitude. The alternating two-qubit gates are generally not diagonal, but some can be decomposed into local single-qubit gates followed by a diagonal gate. This diagonal decomposition, which introduces hyperedges, should reduce the computational cost. Although the cost reduction in the example below should not be very noticeable, we are seeing some very unexpected behavior: the contraction path produces intermediate tensors with a very large number of open indices, even though the graph should be almost a ring graph, which has treewidth 2.

Code:

import numpy as np
from cuquantum import contract_path

# Filler for operands, values don't matter
value = np.zeros(2, dtype=complex)
cz = np.zeros([2, 2], dtype=complex)
single_qubit_gate = np.zeros([2, 2], dtype=complex)
two_qubit_gate = np.zeros([2, 2, 2, 2], dtype=complex)

# Regular no diagonal decomposition
exp_str = 'a,b,c,d,e,f,ag,bh,ci,dj,ek,fl,ghmn,ijop,klqr,mrsx,notu,pqvw,sy,tz,uA,vB,wC,xD,y,z,A,B,C,D->'
operands = [value] * 6 + [single_qubit_gate] * 6 + [two_qubit_gate] * 6 + [single_qubit_gate] * 6 + [value] * 6
path, info = contract_path(exp_str, *operands)
cost = info.opt_cost
largest_intermediate = info.largest_intermediate
intermediate_modes = info.intermediate_modes
print(f'No diagonal gate: cost {cost}, largest_intermediate {largest_intermediate}.')
print('Intermediate modes: ', intermediate_modes)

# Diagonal decomposition
exp_str = 'a,b,c,d,e,f,ag,bh,ci,dj,ek,fl,gh,ij,kl,gm,hn,io,jp,kq,lr,mr,no,pq,ms,nt,ou,pv,qw,rx,s,t,u,v,w,x->'
operands = [value] * 6 + [single_qubit_gate] * 6 + [cz] * 3 + [single_qubit_gate] * 6 + [cz] * 3 + [single_qubit_gate] * 6 + [value] * 6
path, info = contract_path(exp_str, *operands)
cost = info.opt_cost
largest_intermediate = info.largest_intermediate
intermediate_modes = info.intermediate_modes
print(f'Diagonal gate: cost {cost}, largest_intermediate {largest_intermediate}.')
print('Intermediate modes: ', intermediate_modes)

Output:

No diagonal gate: cost 724.0, largest_intermediate 16.0.
Intermediate modes: ('jopc', 'jop', 'g', 'hmn', 'h', 'j', 'elqr', 'l', 'ymrx', 'znou', 'u', 'Bpqw', 'w', 'x', 'lqr', 'qr', 'pqw', 'pq', 'op', 'mn', 'nou', 'no', 'pn', 'mp', 'qm', 'rm', 'yx', 'x', '')
Diagonal gate: cost 5072.0, largest_intermediate 256.0.
Intermediate modes: ('i', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 'ji', 'hg', 'nh', 'pj', 'lk', 'rl', 'gm', 'on', 'io', 'qp', 'kq', 'mr', 'jimr', 'hglk', 'gmon', 'rlqp', 'nhjimr', 'pjhglk', 'kqgmon', 'iorlqp', 'nimrpglk', 'kgmnirlp', '')
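For context, the diagonal decomposition above relies on the fact that a diagonal two-qubit gate such as CZ, normally a rank-4 tensor, factors into delta tensors (the hyperedges) times a rank-2 diagonal tensor. A minimal NumPy check of that identity (a sketch, not part of the original report; the index convention (out1, out2, in1, in2) is an assumption):

```python
import numpy as np

# Full CZ as a rank-4 tensor with index order (out1, out2, in1, in2).
cz_full = np.diag([1.0, 1.0, 1.0, -1.0]).reshape(2, 2, 2, 2)

# Diagonal form: CZ only multiplies each basis state |ab> by a phase d[a, b],
# so the rank-4 tensor factors as delta(a, c) * delta(b, d) * d[a, b].
d = np.array([[1.0, 1.0], [1.0, -1.0]])
delta = np.eye(2)
reconstructed = np.einsum('ab,ac,bd->abcd', d, delta, delta)

assert np.allclose(cz_full, reconstructed)
```

In the einsum expression, this is what lets a 2-mode term like 'gh' stand in for a 4-mode two-qubit gate, with the repeated mode labels acting as hyperedges.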

@mtjrider mtjrider self-assigned this Jul 25, 2023
@haidarazzam
Collaborator

Thank you for submitting your issue.
We have identified the problem. The current workaround is to disable cuTensorNet's internal preprocessing simplification step via 'CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SIMPLIFICATION_DISABLE_DR';
with that set, you get the expected 312 flops for the diagonal decomposition.
More details will come later.

@sss441803
Author

sss441803 commented Aug 10, 2023

Hi, I am also trying to understand the quality of the contraction orders found by cuQuantum and cotengra, so I am using a cotengra contraction order in a cuQuantum contraction for a fair comparison.

from cuquantum import Network, OptimizerOptions
import cuquantum.cutensornet as cutn
import cotengra as ctg

opt = ctg.HyperOptimizer(
        slicing_reconf_opts={'target_size': 2**30}, # target size 2**20 still causes the same kind of error
        parallel=32,
        progbar=False,
        minimize='flops'
    )
# inputs, output and size_dict are built from the expression as shown further below
tree = opt.search(inputs, output, size_dict)
path = tree.get_path()
network = Network(expression, *operands)
network.optimizer_config_ptr = cutn.create_contraction_optimizer_config(network.handle)
network._set_opt_config_option('SIMPLIFICATION_DISABLE_DR', cutn.ContractionOptimizerConfigAttribute.SIMPLIFICATION_DISABLE_DR, 1)
optimizer_options = OptimizerOptions(samples=1, path=list(path))
path, info = network.contract_path(optimize=optimizer_options)

This produces the correct result when the network is not too large. However, for large networks, I get the following error:

Traceback (most recent call last):
  File "optimize.py", line 41, in <module>
    path, info = network.contract_path(optimize=optimizer_options)
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/_internal/utils.py", line 474, in inner
    result = wrapped_function(*args, **kwargs)
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/_internal/utils.py", line 443, in inner
    raise e
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/_internal/utils.py", line 435, in inner
    result = wrapped_function(*args, **kwargs)
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/tensor_network.py", line 568, in contract_path
    self._calculate_workspace_size()
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/_internal/utils.py", line 474, in inner
    result = wrapped_function(*args, **kwargs)
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/_internal/utils.py", line 474, in inner
    result = wrapped_function(*args, **kwargs)
  File "/home/minzhaoliu/.conda/envs/cuquantum/lib/python3.8/site-packages/cuquantum/cutensornet/tensor_network.py", line 384, in _calculate_workspace_size
    cutn.workspace_compute_contraction_sizes(self.handle, self.network, self.optimizer_info_ptr, self.workspace_desc)
  File "cuquantum/cutensornet/cutensornet.pyx", line 679, in cuquantum.cutensornet.cutensornet.workspace_compute_contraction_sizes
  File "cuquantum/cutensornet/cutensornet.pyx", line 696, in cuquantum.cutensornet.cutensornet.workspace_compute_contraction_sizes
  File "cuquantum/cutensornet/cutensornet.pyx", line 240, in cuquantum.cutensornet.cutensornet.check_status
cuquantum.cutensornet.cutensornet.cuTensorNetError: CUTENSORNET_STATUS_NOT_SUPPORTED

The inputs to cotengra can be converted from the expression and operands in the cuQuantum format using the following code:

inputs = []
tensors = expression.split(',')
tensors[-1] = tensors[-1][:-2]  # strip the trailing '->'
for tensor in tensors:
    inputs.append(tuple(tensor))
output = ()
# every mode has extent 2
size_dict = {char: 2 for tensor in tensors for char in tensor}
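For reuse, the conversion can be wrapped into a small self-contained helper (a sketch; the name parse_expression is mine and not part of any API), which builds size_dict directly from the mode labels that actually occur:

```python
def parse_expression(expression, extent=2):
    """Convert a cuquantum-style einsum expression with an empty output
    (ending in '->') into cotengra-style (inputs, output, size_dict)."""
    lhs = expression.split('->')[0]
    inputs = [tuple(term) for term in lhs.split(',')]
    output = ()
    # All modes in these circuits are qubit modes of extent 2.
    size_dict = {mode: extent for term in inputs for mode in term}
    return inputs, output, size_dict
```

For example, `parse_expression('a,ab,b->')` gives `([('a',), ('a', 'b'), ('b',)], (), {'a': 2, 'b': 2})`.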

The expression and operands are as follows (from Sycamore, m=20):

expression = 'ĩĪīĬ,ĭĮįİ,ıIJijĴ,ĵĶķĸ,ĹĺĻļ,ĽľĿŀ,ŁłĉĊ,ċČčĎ,ďĐđĒ,ēĔĕĖ,ėĘęĚ,ěĜĝĞ,ğĠĩġ,ĢƈƉƊ,ƋĬƌƍ,ƎƏĭƐ,Ƒ݃Ɠ,Ɣƒƕ,ıƖƗ,Ƙĵƙƚ,ƛĹƜƝ,ƞĽƟƠ,ơċƢƣ,ƤďƥƦ,ƧƨƩƪ,ƫěƬƭ,ƮĴŁƯ,ưĸēƱ,ƲļėƳ,ƴŀğƵ,ƶĊĢƷ,ƸĎƹƺ,ƻĒƋƼ,ƽĖƎƾ,ƿƨǀǁ,ǂġƑǃ,DŽƊDždž,LJƹLjlj,NJƐNjnj,ǍƗƮǎ,Ǐƚưǐ,ǑƝƲǒ,ǓƠƴǔ,ǕƯƶǖ,ǗƣƸǘ,ǙƦƻǚ,ǛƱƽǜ,ǝƳǞǟ,ǠƭǡǢ,ǣƵǂǤ,ǥƷDŽǦ,ǧƼǨǩ,ǪƾNJǫ,ǬǞĚǭ,ǮǡĞǯ,ǰǃDZDz,dzǨƍƔ,ǴDZƓǵ,ǶǷǸǹ,ǢǮǺǻ,ǩdzǼǽ,DzǴǾǿ,ȀƖǍȁ,ȂƙǏȃ,ȄƟǓȅ,ȆǎǕȇ,ȈƥǙȉ,ȊǐǛȋ,Ȍǒǝȍ,ȎƬǠȏ,Ȑǔǣȑ,Ȓǖǥȓ,ȔǘȕȖ,ȗǚǧȘ,șǜǪȚ,țǟǬȜ,ȝǤǰȞ,ȟǦȠȡ,ȢǫȣȤ,ȃȀȥȦ,ȅȂȧȨ,ȉȄȩȪ,ȏȈȫȬ,ǸȎȭȮ,ȯȰƜǑ,ȱȇȰȲ,ȳȴƢǗ,ȵȋȆȶ,ȷȍȴȸ,ȹȺƩȻ,ȼȑȊȽ,ȾȓȌȿ,ɀȖȺɁ,ɂȘȐɃ,ɄȚȒɅ,ɆȜȔɇ,ɈȻƪƿ,ɉǺȗɊ,ɋȞșɌ,ɍȡțɎ,ɏȕƺLJ,ɐǼȝɑ,ɒǾȢɓ,ɔȠdžɕ,ɖȣnjɗ,ǹɉɘə,ǻɐɚɛ,ǽɒɜɝ,ɞȨȱɟ,ɠȪȵɡ,ɢȶȷɣ,ɤȬȼɥ,ɦȽȾɧ,ɨȿɀɩ,ɪȮɂɫ,ɬɃɄɭ,ɮɅɆɯ,ɰɊɋɱ,ɲɌɍɳ,ɴɑɵɶ,ɷɵȤȟ,ȥɸɹɺ,ȧɞɻɼ,ȩɠɽɾ,ȫɤɿʀ,ȭɪʁʂ,ʃɸȦȯ,ʄɟʅʆ,ʇʅȲȳ,ʈɡɢʉ,ʊɣʋʌ,ʍʋȸȹ,ʎɥɦʏ,ʐɧɨʑ,ʒɩʓʔ,ʕʓɁʖ,ʗɫɬʘ,ʙɭɮʚ,ʛɯʜʝ,ʞʜɇɈ,ʟəɰʠ,ʡɱɲʢ,ʣɳʤʥ,ʦʤɎɏ,ʧɛɴʨ,ʩɶɷʪ,ʫɝʬʭ,ʮʬɓɔ,ɺʃʯʰ,ʆʇʱʲ,ʌʍʳʴ,ʵɼʄʶ,ʷɾʈʸ,ʹʉʊʺ,ʻʀʎʼ,ʽʏʐʾ,ʿʑʒˀ,ˁʂʗ˂,˃ʘʙ˄,˅ʚʛˆ,ˇʝʞˈ,ˉʠʡˊ,ˋʢʣˌ,ˍʨʩˎ,ˏʭʮː,ˑʯʵ˒,˓ʶʷ˔,˕ʱʹ˖,˗ʸʻ˘,˙ʺʽ˚,˛ʳʿ˜,˝ʼˁ˞,˟ʾ˃ˠ,ˡˀ˅ˢ,ˣ˂ˤ˥,˦˄ˉ˧,˨ˆˋ˩,˪ˤɘʟ,˫ˊˬ˭,ˮˌˍ˯,˰˱ʥʦ,˲ˬɚʧ,˳ˎ˴˵,˶˴ɜʫ,˷˸ǿɖ,ȁˑ˹˺,ɹ˓˻˼,ʰ˕˽˾,ɻ˗˿̀,ʲ˛́̂,ɽ˝̃̄,ʴ̅̆̇,ɿˣ̈̉,̊˒˙̋,̌˔˟̍,̎˖ˡ̏,̐˘˦̑,̒˚˨̓,̔˜̖̕,̗˞˫̘,̙ˠˮ̚,̛̜̅ˇ,̝˧˳̞,̟˩̡̠,̢̕ˈ˱,̣˯̤̥,̦̠ʪˏ,̧̤ː˸,̨˺̩̊,̪˼̫̌,̬˾̭̎,̮̯̀̐,̰̱̋̒,̲̳̂̔,̴̵̗̄,̶̷̙̍,̸̹̺̏,̻̼̉̽,̝̾̑̿,̟̀̓́,̘͂̓̈́,̣͆̚ͅ,͇̹ˢ˰,͈̼˥˲,͉̞͊͋,͌̓˭˶,͍͊˵˷,ʁ˪͎͏,͈̽͐͑,͓̈́͌͒,͍͔͕͋,͖˻̪͗,͘˿̮͙,͚̩̰͛,̴̃͜͝,̶̫͟͞,̸̭͠͡,̻̈ͣ͢,̯ͤ̾ͥ,̱ͦ̀ͧ,̳ͨͩͪ,̵ͫ͂ͬ,̷ͭͮͅ,̺͇ͯͰ,ͱ͉̿Ͳ,ͳ́ʹ͵,Ͷ͆ͷ\u0378\u0379ͺͻ,͙͖ͼͽ,͘͝;Ϳ,ͣ͜\u0380\u0381,͎͢\u0382\u0383,΄΅˽̬,Ά͛΅·,ΈΉ̲́,Ί͚͟\u038b,Ό͡Ή\u038d,ΎΏ̆ΐ,Αͥ͞Β,Γͧ͠Δ,ΕͪΏΖ,Ηʔʕ̜,ΘͬͤΙ,ΚͮͦΛ,ΜͰͨΝ,Ξΐ̛̇,Ο͐ͫΠ,ΡͲͭ\u03a2,Σ͵ͯΤ,Υ̢̖ͩ,Φ͒ͱΧ,Ψ͔ͶΩ,Ϊʹ̡̦,Ϋͷ̧̥,͏Οάέ,͑Φήί,͓Ψΰα,βͽΆγ,δͿΊε,ζ\u038bΌη,θ\u0381Αι,κΒΓλ,μΔΕν,ξ\u0383Θο,πΙΚρ,ςΛΜσ,τΠΡυ,φ\u03a2Σχ,ψΧωϊ,ϋω\u0378ͳ,όύͻ΄,ώγϏϐ,ϑϏ·Έ,ϒεζϓ,ϔηϕϖ,ϗϕ\u038dΎ,Ϙικϙ,Ϛλμϛ,ϜνϝϞ,ϟϝΖΗ,Ϡοπϡ,Ϣρςϣ,ϤσϥϦ,ϧϥΝΞ,Ϩέτϩ,Ϫυφϫ,ϬχϭϮ,ϯϭΤΥ,ϰίψϱ,ϲϊϋϳ,ϴαϵ϶,ϷϵΩΪ,Ϟϟϸ,ϹϺώϻ,ϼϽϒϾ,ϿϓϔЀ,ЁЂϘЃ,ЄϙϚЅ,ІϛϜЇ,ЈЉϠЊ,ЋϡϢЌ,ЍϣϤЎ,ЏϦϧА,БϩϪВ,ГϫϬД,ЕϱϲЖ,З϶ϷИ,\u0379˹̨Й,ͺύКЛ,КόМН,ͼβϺО,ЙМϹП,ϐϑРС,;δϽТ,ЛϻϼУ,НРϿФ,ϖϗХЦ,ЦϸЏ,\u0380θЂЧ,ОϾЁШ,ПЀЄЩ,СХІЪ,ЪАЫ,\u0382ξЉЬ,ЬЭάϨ,ТЃЈЮ,УЅЋЯ,ФЇЍа,аЫϮϯ,ЧЊЭб,бвήϰ,ШЌБг,ЩЎГд,дϳЗ,ЮВве,ежΰϴ,ЯДЕз,зИи,гЖжй,йи͕Ϋ,ƕкл,Įǯƌ,мǭlj,ĪǷĝ,нǁLj,оʖǀ,пČƧ,рĺơ,ƘIJƛ,ǵɗNjл,įɕDžк,īƉмƏ,сęнƈ,тĕсĠ,уčоĘ,фĉуĔ,хĻпł,Ƕ
đтĜ,ƫĿфĐ,Ƥķхľ,ƞijрĶ->'
operands = [np.zeros([2]*len(tensor), dtype=complex) for tensor in expression.split(',')]
operands[-1] = np.zeros([2]*4, dtype=complex)

@DmitryLyakh
Collaborator

Can you rerun the failing case with the environment variable CUTENSORNET_LOG_LEVEL=5 set and attach the log file?

@sss441803
Author

Hi Dmitry,

Thank you for the response and the file is attached.
std.log

@DmitryLyakh
Collaborator

Is it a multi-GPU node? Do you mind rerunning it with the following environment variables set and attaching the log again?
CUDA_VISIBLE_DEVICES=0
CUTENSORNET_LOG_LEVEL=5
CUTENSOR_LOG_LEVEL=5

@sss441803
Author

Hi, I have changed these variables, but note that I ran it on a single-GPU login node of a cluster whose compute nodes each have multiple GPUs.
std.log

@DmitryLyakh
Collaborator

Is this the Sycamore-53 depth-20 circuit? Did you apply any transformations to it that would introduce hyperedges? Did you enable slicing? I see very large intermediate tensors there and very high workspace size demands. What happens if you do not import the contraction path from cotengra but let cuTensorNet find its own contraction path? Does it work?

@sss441803
Author

sss441803 commented Aug 10, 2023

Thank you for the feedback!

Yes, it's Sycamore-53 depth-20.
I don't think there are hyperedges.
I enabled slicing.

When the path is not imported, it works perfectly fine, so I don't think there are hyperedges. cuQuantum also works fine with hyperedges when it finds its own path, as long as simplification is disabled.

Perhaps the path somehow no longer corresponds to the same path after being moved from cotengra to cuQuantum.

@acharara-nv

Hi Minzhao, from the log we can see that the intermediate tensors are too large to fit in memory, so the network returns CUTENSORNET_STATUS_NOT_SUPPORTED.
The path is imported properly from cotengra, but the slicing info from cotengra is not, which is why the intermediate tensors are too large.
You would want to use the OptimizerOptions slicing attribute to supply the sliced indices to cuTensorNet's optimizer. According to cotengra's docs, you would work with cotengra's ContractionTree object to extract the sliced indices.
Would you please try that and post feedback?
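A sketch of what that might look like (hedged: to_optimizer_kwargs is a hypothetical helper name, and I am assuming cotengra's ContractionTree exposes the sliced mode labels via tree.sliced_inds and the path via tree.get_path(); check the cotengra docs for the exact accessors):

```python
def to_optimizer_kwargs(tree):
    """Turn a cotengra ContractionTree into keyword arguments for
    cuquantum.OptimizerOptions, carrying over both the path and the slices."""
    return {
        'samples': 1,
        'path': [tuple(pair) for pair in tree.get_path()],
        # Iterating tree.sliced_inds yields the sliced mode labels.
        'slicing': list(tree.sliced_inds),
    }
```

The result would then be passed as `OptimizerOptions(**to_optimizer_kwargs(tree))` before calling `network.contract_path(...)`; whether `slicing` accepts a plain sequence of mode labels is an assumption based on the cuQuantum Python documentation.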

@sss441803
Author

Hi Ali! Thank you for the feedback. It works after the slicing is supplied. I was expecting cuQuantum to take the path and find the slices on its own.

@haidarazzam
Collaborator

Also note that paths from cotengra or other packages can run with cuTensorNet, but not always: either the slicing information is missing, or the path generated by the other package cannot run on the GPU for some other reason, such as the number of modes exceeding 64.

@mtjrider
Collaborator

mtjrider commented Sep 3, 2023

@sss441803 has your issue been addressed?

@sss441803
Author

sss441803 commented Sep 6, 2023

@sss441803 has your issue been addressed?

Hi, the answers are sufficient for solving my own issues with the 'CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SIMPLIFICATION_DISABLE_DR' workaround.

Azzam did say that a proper fix regarding diagonal gates should come.

@haidarazzam
Collaborator

@sss441803
Hyperedges have been allowed in cuTensorNet for a while.
The flops issue you observed occurred because cuTensorNet's simplification phase was trying to simplify the contraction before the path optimizer kicked in, which ended up producing a non-optimal path due to these simplifications. For small circuits we usually prefer to disable simplification and let the optimizer find the best path. Simplification cannot guarantee an optimal path; it is implemented as a preprocessing phase to decrease the network size and thus speed up the optimizer for large circuits.
Since 23.06, the performance of the path optimizer has improved a lot, in particular for large circuits (10x). The simplification phase is therefore not as advantageous as before and can be safely disabled, except when a user still wants a faster path optimizer for a large circuit where simplification doesn't mess up the path.

Bottom line: the fix was to disable simplification, which was enabled by default. (I am not sure whether you are referring to another fix with "Azzam did say that a proper fix should come regarding diagonal gates.")

@sss441803
Author

I am happy for the issue to be closed, since the current fix is sufficient for me. I was just expecting some more updates, since you mentioned in the first comment that more details would come later, but I am fine with what we have now.

Just to clarify the original issue: it's not that small diagonal-gate circuits don't benefit from simplification, it's that simplification actively breaks things. For circuits with 300~400 tensors, the default behavior completely fails to find a path for diagonal-gate circuits while working fine for non-diagonal gates.
