Strange behavior with diagonal gates. #73
Thank you for submitting your issue.
Hi, I am also trying to understand the quality of contraction orders from cuQuantum and Cotengra, so I am importing the Cotengra contraction order into the cuQuantum contraction for a fair comparison.

```python
from cuquantum import Network
import cuquantum.cutensornet as cutn
from cuquantum.cutensornet import configuration  # import not shown in the original snippet
import cotengra as ctg

opt = ctg.HyperOptimizer(
    slicing_reconf_opts={'target_size': 2**30},  # target size 2**20 still causes the same kind of error
    parallel=32,
    progbar=False,
    minimize='flops',
)
# inputs, output, size_dict are derived from the expression; see the conversion snippet below.
tree = opt.search(inputs, output, size_dict)
path = tree.get_path()

network = Network(expression, *operands)
network.optimizer_config_ptr = cutn.create_contraction_optimizer_config(network.handle)
network._set_opt_config_option(
    'SIMPLIFICATION_DISABLE_DR',
    cutn.ContractionOptimizerConfigAttribute.SIMPLIFICATION_DISABLE_DR,
    1,
)
optimizer_options = configuration.OptimizerOptions(samples=1, path=list(path))
path, info = network.contract_path(optimize=optimizer_options)
```

This produces the correct result when the network is not too large. However, for large networks I get the following output:
The variables input to Cotengra can be converted from the expression as follows:

```python
inputs = []
tensors = expression.split(',')
tensors[-1] = tensors[-1][:-2]  # strip the trailing '->'
for tensor in tensors:
    inputs.append(tuple([char for char in tensor]))
output = ()
size_dict = {}
# offset: code point of the smallest index character (defined elsewhere)
for i in range(offset, ord(max(expression)) + 1):
    size_dict[chr(i)] = 2
```

The expression and operands are as follows (from Sycamore m20):

```python
import numpy as np

expression = 'ĩĪīĬ,ĭĮįİ,ıIJijĴ,ĵĶķĸ,ĹĺĻļ,ĽľĿŀ,ŁłĉĊ,ċČčĎ,ďĐđĒ,ēĔĕĖ,ėĘęĚ,ěĜĝĞ,ğĠĩġ,ĢƈƉƊ,ƋĬƌƍ,ƎƏĭƐ,Ƒ݃Ɠ,Ɣƒƕ,ıƖƗ,Ƙĵƙƚ,ƛĹƜƝ,ƞĽƟƠ,ơċƢƣ,ƤďƥƦ,ƧƨƩƪ,ƫěƬƭ,ƮĴŁƯ,ưĸēƱ,ƲļėƳ,ƴŀğƵ,ƶĊĢƷ,ƸĎƹƺ,ƻĒƋƼ,ƽĖƎƾ,ƿƨǀǁ,ǂġƑǃ,DŽƊDždž,LJƹLjlj,NJƐNjnj,ǍƗƮǎ,Ǐƚưǐ,ǑƝƲǒ,ǓƠƴǔ,ǕƯƶǖ,ǗƣƸǘ,ǙƦƻǚ,ǛƱƽǜ,ǝƳǞǟ,ǠƭǡǢ,ǣƵǂǤ,ǥƷDŽǦ,ǧƼǨǩ,ǪƾNJǫ,ǬǞĚǭ,ǮǡĞǯ,ǰǃDZDz,dzǨƍƔ,ǴDZƓǵ,ǶǷǸǹ,ǢǮǺǻ,ǩdzǼǽ,DzǴǾǿ,ȀƖǍȁ,ȂƙǏȃ,ȄƟǓȅ,ȆǎǕȇ,ȈƥǙȉ,ȊǐǛȋ,Ȍǒǝȍ,ȎƬǠȏ,Ȑǔǣȑ,Ȓǖǥȓ,ȔǘȕȖ,ȗǚǧȘ,șǜǪȚ,țǟǬȜ,ȝǤǰȞ,ȟǦȠȡ,ȢǫȣȤ,ȃȀȥȦ,ȅȂȧȨ,ȉȄȩȪ,ȏȈȫȬ,ǸȎȭȮ,ȯȰƜǑ,ȱȇȰȲ,ȳȴƢǗ,ȵȋȆȶ,ȷȍȴȸ,ȹȺƩȻ,ȼȑȊȽ,ȾȓȌȿ,ɀȖȺɁ,ɂȘȐɃ,ɄȚȒɅ,ɆȜȔɇ,ɈȻƪƿ,ɉǺȗɊ,ɋȞșɌ,ɍȡțɎ,ɏȕƺLJ,ɐǼȝɑ,ɒǾȢɓ,ɔȠdžɕ,ɖȣnjɗ,ǹɉɘə,ǻɐɚɛ,ǽɒɜɝ,ɞȨȱɟ,ɠȪȵɡ,ɢȶȷɣ,ɤȬȼɥ,ɦȽȾɧ,ɨȿɀɩ,ɪȮɂɫ,ɬɃɄɭ,ɮɅɆɯ,ɰɊɋɱ,ɲɌɍɳ,ɴɑɵɶ,ɷɵȤȟ,ȥɸɹɺ,ȧɞɻɼ,ȩɠɽɾ,ȫɤɿʀ,ȭɪʁʂ,ʃɸȦȯ,ʄɟʅʆ,ʇʅȲȳ,ʈɡɢʉ,ʊɣʋʌ,ʍʋȸȹ,ʎɥɦʏ,ʐɧɨʑ,ʒɩʓʔ,ʕʓɁʖ,ʗɫɬʘ,ʙɭɮʚ,ʛɯʜʝ,ʞʜɇɈ,ʟəɰʠ,ʡɱɲʢ,ʣɳʤʥ,ʦʤɎɏ,ʧɛɴʨ,ʩɶɷʪ,ʫɝʬʭ,ʮʬɓɔ,ɺʃʯʰ,ʆʇʱʲ,ʌʍʳʴ,ʵɼʄʶ,ʷɾʈʸ,ʹʉʊʺ,ʻʀʎʼ,ʽʏʐʾ,ʿʑʒˀ,ˁʂʗ˂,˃ʘʙ˄,˅ʚʛˆ,ˇʝʞˈ,ˉʠʡˊ,ˋʢʣˌ,ˍʨʩˎ,ˏʭʮː,ˑʯʵ˒,˓ʶʷ˔,˕ʱʹ˖,˗ʸʻ˘,˙ʺʽ˚,˛ʳʿ˜,˝ʼˁ˞,˟ʾ˃ˠ,ˡˀ˅ˢ,ˣ˂ˤ˥,˦˄ˉ˧,˨ˆˋ˩,˪ˤɘʟ,˫ˊˬ˭,ˮˌˍ˯,˰˱ʥʦ,˲ˬɚʧ,˳ˎ˴˵,˶˴ɜʫ,˷˸ǿɖ,ȁˑ˹˺,ɹ˓˻˼,ʰ˕˽˾,ɻ˗˿̀,ʲ˛́̂,ɽ˝̃̄,ʴ̅̆̇,ɿˣ̈̉,̊˒˙̋,̌˔˟̍,̎˖ˡ̏,̐˘˦̑,̒˚˨̓,̔˜̖̕,̗˞˫̘,̙ˠˮ̚,̛̜̅ˇ,̝˧˳̞,̟˩̡̠,̢̕ˈ˱,̣˯̤̥,̦̠ʪˏ,̧̤ː˸,̨˺̩̊,̪˼̫̌,̬˾̭̎,̮̯̀̐,̰̱̋̒,̲̳̂̔,̴̵̗̄,̶̷̙̍,̸̹̺̏,̻̼̉̽,̝̾̑̿,̟̀̓́,̘͂̓̈́,̣͆̚ͅ,͇̹ˢ˰,͈̼˥˲,͉̞͊͋,͌̓˭˶,͍͊˵˷,ʁ˪͎͏,͈̽͐͑,͓̈́͌͒,͍͔͕͋,͖˻̪͗,͘˿̮͙,͚̩̰͛,̴̃͜͝,̶̫͟͞,̸̭͠͡,̻̈ͣ͢,̯ͤ̾ͥ,̱ͦ̀ͧ,̳ͨͩͪ,̵ͫ͂ͬ,̷ͭͮͅ,̺͇ͯͰ,ͱ͉̿Ͳ,ͳ́ʹ͵,Ͷ͆ͷ\u0378,͗\u0379ͺͻ,͙͖ͼͽ,͘͝;Ϳ,ͣ͜\u0380\u0381,͎͢\u0382\u0383,΄΅˽̬,Ά͛΅·,ΈΉ̲́,Ί͚͟\u038b,Ό͡Ή\u038d,ΎΏ̆ΐ,Αͥ͞Β,Γͧ͠Δ,ΕͪΏΖ,Ηʔʕ̜,ΘͬͤΙ,ΚͮͦΛ,ΜͰͨΝ,Ξΐ̛̇,Ο͐ͫΠ,ΡͲͭ\u03a2,Σ͵ͯΤ,Υ̢̖ͩ,Φ͒ͱΧ,Ψ͔ͶΩ,Ϊʹ̡̦,Ϋͷ̧̥,͏Οάέ,͑Φήί,͓Ψΰα,βͽΆγ,δͿΊε,ζ\u038bΌη,θ\u0381Αι,κΒΓλ,μΔΕν,ξ\u0383Θο,πΙΚρ,ςΛΜσ,τΠΡυ,φ\u03a2Σχ,ψΧωϊ,ϋω\u0378ͳ,όύͻ΄,ώγϏϐ,ϑϏ·Έ,ϒεζϓ,ϔηϕϖ,ϗϕ\u038dΎ,Ϙικϙ,Ϛλμϛ,ϜνϝϞ,ϟϝΖΗ,Ϡοπϡ,Ϣρςϣ,ϤσϥϦ,ϧϥΝΞ,Ϩέτϩ,Ϫυφϫ,ϬχϭϮ,ϯϭΤΥ,ϰίψϱ,ϲϊϋϳ,ϴαϵ϶,ϷϵΩΪ,Ϟϟϸ,ϹϺώϻ,ϼϽϒϾ,ϿϓϔЀ,ЁЂϘЃ,ЄϙϚЅ,ІϛϜЇ,ЈЉϠЊ,ЋϡϢЌ,ЍϣϤЎ,ЏϦϧА,БϩϪВ,ГϫϬД,ЕϱϲЖ,З϶ϷИ,\u0379˹̨Й,ͺύКЛ,КόМН,ͼβϺО,ЙМϹП,ϐϑРС,;δϽТ,ЛϻϼУ,НРϿФ,ϖϗХЦ,ЦϸЏ,\u0380θЂЧ,ОϾЁШ,ПЀЄЩ,СХІЪ,ЪАЫ,\u0382ξЉЬ,ЬЭάϨ,ТЃЈЮ,УЅЋЯ,ФЇЍа,аЫϮϯ,ЧЊЭб,бвήϰ,ШЌБг,ЩЎГд,дϳЗ,ЮВве,ежΰϴ,ЯДЕз,зИи,гЖжй,йи͕Ϋ,ƕкл,Įǯƌ,мǭlj,ĪǷĝ,нǁLj,оʖǀ,пČƧ,рĺơ,ƘIJƛ,ǵɗNjл,įɕDžк,īƉмƏ,сęнƈ,тĕсĠ,уčоĘ,фĉуĔ,хĻпł,ǶđтĜ,ƫĿфĐ,Ƥķхľ,ƞijрĶ->'

operands = [np.zeros([2]*len(tensor), dtype=complex) for tensor in expression.split(',')]
operands[-1] = np.zeros([2]*4, dtype=complex)
```
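As a small sanity check of the conversion logic above, here is a self-contained toy version using a short hypothetical einsum expression (plain ASCII indices) instead of the Sycamore one; the `offset` computation is my own assumption about how it was defined:

```python
# Toy check of the expression -> (inputs, output, size_dict) conversion,
# using a small hypothetical expression instead of the Sycamore one.
expression = 'ab,bc,cd->'

tensors = expression.split(',')
tensors[-1] = tensors[-1][:-2]          # strip the trailing '->' from the last term
inputs = [tuple(tensor) for tensor in tensors]
output = ()

# Assumed definition of `offset`: code point of the smallest index character.
offset = ord(min(c for c in expression if c not in ',->'))
size_dict = {chr(i): 2 for i in range(offset, ord(max(expression)) + 1)}

assert inputs == [('a', 'b'), ('b', 'c'), ('c', 'd')]
assert size_dict == {'a': 2, 'b': 2, 'c': 2, 'd': 2}
```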
Can you rerun the failing case with the environment variable `CUTENSORNET_LOG_LEVEL=5` set and attach the log file?
Hi Dmitry, thank you for the response; the file is attached.
Is it a multi-GPU node? Do you mind rerunning it with the following environment variables set and attaching the log again?
std.log |
Is this the Sycamore-53 depth-20 circuit? Have you applied any transformations that would introduce hyperedges? Do you enable slicing? I see very large intermediate tensors there and very high workspace size demands. What happens if you do not import the contraction path from Cotengra but let cuTensorNet figure out its own contraction path? Does it work?
Thank you for the feedback! Yes, it's Sycamore-53 depth 20. When the path is not imported it works perfectly fine, so I don't think there are hyperedges. cuQuantum also works fine with hyperedges when it finds its own path, as long as simplification is disabled. Perhaps the path no longer corresponds to the same network once moved from Cotengra to cuQuantum.
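For context on what a hyperedge from a diagonal gate looks like, here is a small NumPy sketch (the gate choice and variable names are mine, not from the thread): a diagonal two-qubit gate can be stored as a rank-2 tensor whose indices are shared by the input, the gate, and the output, instead of as a rank-4 tensor with separate input and output indices.

```python
import numpy as np

# Diagonal two-qubit gate (CZ) stored as its 2x2 diagonal d[a, b].
d = np.array([[1, 1], [1, -1]], dtype=complex)

# Equivalent full rank-4 form: G[a, b, c, e] = d[a, b] * delta(a, c) * delta(b, e).
G = np.zeros((2, 2, 2, 2), dtype=complex)
for a in range(2):
    for b in range(2):
        G[a, b, a, b] = d[a, b]

rng = np.random.default_rng(0)
psi = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

out_full = np.einsum('abce,ce->ab', G, psi)   # ordinary edges, rank-4 gate
out_diag = np.einsum('ab,ab->ab', d, psi)     # 'a' and 'b' act as hyperedges

assert np.allclose(out_full, out_diag)
```

The hyperedge form touches 4x fewer gate entries, which is why diagonal decomposition can reduce contraction cost.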
Hi Minzhao, from the log we can see that the intermediate tensors are too large to fit in memory, and thus the network contraction fails.
Hi Ali! Thank you for the feedback. It works after slicing is supplied. I was expecting cuQuantum to take the path and find the slices on its own.
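For readers unfamiliar with the slicing being discussed, the idea is to fix one or more indices and sum the resulting smaller contractions, trading extra flops for a smaller memory footprint. A minimal NumPy sketch with my own toy tensors (unrelated to the Sycamore network):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3, 4))
B = rng.normal(size=(4, 3, 5))

# Unsliced contraction over indices j and k.
full = np.einsum('ijk,kjl->il', A, B)

# Slice index j: each term contracts smaller tensors, and the results are summed.
sliced = sum(np.einsum('ik,kl->il', A[:, j, :], B[:, j, :]) for j in range(3))

assert np.allclose(full, sliced)
```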
Also note that paths from Cotengra or other packages can run with cuTensorNet, but not always: either slicing may be missing, or the path generated by such a package cannot run on the GPU for other reasons, such as the number of modes exceeding 64.
@sss441803 has your issue been addressed? |
Hi, the answers are sufficient for solving my own issues with the workaround `CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SIMPLIFICATION_DISABLE_DR`. Azzam did say that a proper fix should come regarding diagonal gates.
@sss441803 Bottom line: the fix was to disable simplification, which is enabled by default. I am not sure if you are referring to another fix when you say "Azzam did say that a proper fix should come regarding diagonal gates."
I am happy for the issue to be closed, since the current fix is sufficient for me. I was just expecting some more updates, since you mentioned in the first comment that more details would come later, but I am fine with what we have now. Just to clarify the original issue: it's not that small diagonal-gate circuits don't benefit from simplification, it's that simplification actively breaks things. For circuits with 300-400 tensors, the default behavior fails to find a path at all for diagonal-gate circuits, while working fine for non-diagonal gates.
Hi,
I constructed a simple 6-qubit circuit with a brickwork pattern and generated the corresponding expression for calculating an amplitude. The alternating two-qubit gates are generally not diagonal, but some can be decomposed into local single-qubit gates followed by a diagonal gate. Diagonal decomposition, with the introduction of hyperedges, should reduce the computational cost. Although the example provided below should not yield a very noticeable cost reduction, we are still seeing some very unexpected behavior: the contraction path results in a very large number of open indices in intermediate tensors, even though the graph should be almost a ring graph, which has a treewidth of 2.
Code:
Output:
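As a sanity check of the ring-graph intuition above: contracting a ring of rank-2 tensors sequentially never produces an intermediate with more than two open indices. A toy NumPy sketch (my own example, not the actual 6-qubit expression):

```python
import numpy as np

rng = np.random.default_rng(2)
n, chi = 6, 3

# Ring of rank-2 tensors: the network value is trace(M0 @ M1 @ ... @ M5).
mats = [rng.normal(size=(chi, chi)) for _ in range(n)]

acc = mats[0]
for M in mats[1:]:
    acc = acc @ M          # every intermediate has exactly 2 open indices
result = np.trace(acc)

# Compare against a direct einsum over the whole ring.
expected = np.einsum('ab,bc,cd,de,ef,fa->', *mats)
assert np.allclose(result, expected)
```

A good contraction path on a (near-)ring network should therefore keep intermediate ranks small, which makes the observed blow-up of open indices surprising.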