
Discussion on power analysis attacks #195

Open
sipa opened this issue Feb 2, 2020 · 47 comments

Comments

@sipa
Owner

sipa commented Feb 2, 2020

See https://eprint.iacr.org/2017/985.pdf, which describes an attack against Ed25519, by doing power analysis on the nonce generation function.

It relies on having a single SHA512 message block that contains both attacker-controlled data, and secret data that the attacker wishes to learn. We use SHA256 instead of SHA512, but it is sufficiently similar that we should perhaps at least consider if the same kind of attack could apply.

With #194, our nonce generation function would effectively become H(priv || pub || msg [|| randomness]). All the secret data goes into the first compression, and all attacker-controlled data goes into the second compression, so I think the attack does not apply directly, but maybe related attacks do. I'll email the authors of the paper about this.

The paper however suggests having at least 128 bits of randomness (synthetic nonce!) in the first hash block. I wonder if that means we should move the randomness to after the priv, so H(priv || randomness || pub || msg). That means that if the randomness has at least 128 bits of entropy, we've effectively implemented their countermeasure, but at the cost of losing the ability to precompute the midstate at H(priv || pubkey || ...).

Anyone have any thoughts?

@gmaxwell

gmaxwell commented Feb 2, 2020

H(priv || pub || pad to block (if pub isn't 32 bytes) || random || pad to block || message) is superior, as it increases the precomputable data, and I think it still implements their countermeasure (which works by having a compression with attacker-unknown random data before any compression with attacker-controlled inputs).

@sipa
Owner Author

sipa commented Feb 2, 2020

if pub isn't 32 bytes

It is, making the first padding unnecessary (we don't need to commit to the negation flag if we use the private key post-negation).

random || pad to block

That would increase the number of compression function invocations. I'm not sure it adds anything?

@gmaxwell

gmaxwell commented Feb 2, 2020

My reasoning was that there should be no attacker controlled data in a compression function run where the midstate is a constant secret.

@ajtowns

ajtowns commented Feb 3, 2020

My reasoning was that there should be no attacker controlled data in a compression function run where the midstate is a constant secret.

Rather than having a cached midstate and two compression functions, you could do H(rand23 || sk || pk || m), which would satisfy that requirement with two compression functions but without needing a cached midstate?

@real-or-random

Rather than having a cached midstate and two compression functions, you could do H(rand23 || sk || pk || m), which would satisfy that requirement with two compression functions but without needing a cached midstate?

I like that idea.

And in general, implementing this protection seems to be a good idea. In the end it's never clear how effective these countermeasures are, but this one seems to make the attack at least harder and it does not introduce any drawbacks.

@gmaxwell

gmaxwell commented Feb 3, 2020

Well the drawback is two compression function runs at runtime instead of one... but high speed implementations can implement alternative schemes.

Also a drawback is that you can't use comparison to check which nonce implementation a black box is using if synthetic nonces are in use... but it seems in practice no one actually does this anyway (or even cares when that test returns a failure, e.g. when something claims outright to be RFC 6979 but its results are not reproducible).

@jonasnick

jonasnick commented Feb 3, 2020

H(priv || pub || msg [|| randomness]). All the secret data goes into the first compression, and all attacker-controlled data goes into the second compression, so I think the attack does not apply directly, but maybe related attacks do.

Seems more likely than not that such attacks exist. At least it's easy to imagine an otherwise secure hash function that is vulnerable.

In "Differential power analysis of HMAC based on SHA-2, and countermeasures" (google), McEvoy et al. stipulate an DPA attack that recovers the midstate of SHA256(x || m) where x is the secret key padded to the blocksize.

Their countermeasure is using a hash function similar to SHA256 but where individual operations are masked. Perhaps they don't simply pad the secret key with randomness because they don't want to make assumptions about the length of the secret key and how much space is available for padding (EDIT: the more likely reason is that they want to stay compatible with HMAC-SHA-2, so they mask the input and unmask the output of HMAC). I have not read the paper in detail (nor do I have much experience with DPA) but it seems that H(rand23 || sk || pk || m) would equally thwart their attack:

the hash function is operating on (k ⊕ ipad), [k is the secret key and padded to the blocksize] which clearly does not change from one execution of the HMAC algorithm to the next. Hence, the intermediate hash H_1 is also fixed and unknown. Recall that in order to perform a differential side-channel attack, we require fixed unknown data to be combined with variable known data. This criterion is fulfilled during the calculation of H_2, when the variable m is introduced and combined with the previous intermediate hash H_1

@jonasnick

This synthetic nonce idea was added to the WIP reference code update PR: f84cd77

@sipa
Owner Author

sipa commented Feb 3, 2020

@jonasnick That's a great find.

So I think the general principles are that you should never have (unmasked) secret data together with attacker-controlled data within one input block (because expansion is independent of midstate) or when updating a midstate in a compression step.

That means the following things are insecure (assuming 16 byte rand).

  • H(priv||pub||rand||msg) (combining a secret midstate with an attacker-controlled msg).
  • H(rand||priv||pub||msg) (the second block contains both private and attacker-controlled data).

These however seem fine (each is listed with: total compressions; compression with precomputed key; compressions with precomputed key+rand):

  • H(priv||rand||pub||msg): 2, 2, 1
  • H(rand||priv||pub||msg): 2, 2, 1 (but unsafe with 33 <= len(rand) <= 63).
  • H(priv||pub||rand||pad_to_block||msg): 3, 2, 1
  • H(rand||pad_to_block||priv||pub||msg): 3, 3, 1

So moving the randomness to the second block to enable precomputing H(priv||pub) means we need to add padding, which effectively undoes the gains of precomputation.

My preference is thus H(priv||rand||pub||msg). This never needs any padding, and seems safe for any length of rand >= 16. If we want to treat msg as variable-length too (not supported by BIP340, but who knows maybe someone builds a variant), rand should probably be prefixed with a 1-byte length.
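For concreteness, here is a small illustrative helper (hypothetical, not part of any reference code) that shows which fields end up sharing a 64-byte SHA256 block for a given layout, which is the main property the lists above are reasoning about:

```python
# Hypothetical sketch: given ordered (name, length) fields, report which
# fields touch each 64-byte SHA-256 block (the 0x80 byte and the 8-byte
# length field are accounted for via the padded size).
def blocks(fields):
    total = sum(length for _, length in fields)
    padded = (total + 9 + 63) // 64 * 64
    out = [[] for _ in range(padded // 64)]
    pos = 0
    for name, length in fields:
        for b in range(pos // 64, (pos + length - 1) // 64 + 1):
            out[b].append(name)
        pos += length
    return out

# H(priv||rand16||pub||msg): the secret key never shares a block with msg.
print(blocks([("priv", 32), ("rand", 16), ("pub", 32), ("msg", 32)]))
# [['priv', 'rand', 'pub'], ['pub', 'msg']]
```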

@jonasnick

jonasnick commented Feb 3, 2020

H(rand||privkey||pubkey||msg) (the second block contains both private and attacker-controlled data).

Doesn't the second block consist of <last 16 bytes of pubkey><msg><padding> ?

@sipa
Owner Author

sipa commented Feb 3, 2020

Oh, you're right. Adding that one:

  • H(rand||priv||pub||msg): 2, 2, 1. Just as good as H(priv||rand||pub||msg).

@sipa
Owner Author

sipa commented Feb 3, 2020

H(rand||priv||pub||msg) is however a problem when 33 <= len(rand) <= 63 (which is pointlessly large, so not recommended in any case).

@sipa
Owner Author

sipa commented Feb 4, 2020

Perhaps this isn't worth spending too much time on. Adding randomness to protect against power analysis can only protect the signature, but systems where attacks like this are a risk will likely also be performing BIP32 derivation and Taproot tweaking, where randomness cannot help.

Adding randomness remains useful as a countermeasure against fault attacks, but the element ordering does not matter for that.

@ajtowns

ajtowns commented Feb 4, 2020

Perhaps this isn't worth spending too much time on. Adding randomness to protect against power analysis can only protect the signature, but systems where attacks like this are a risk will likely also be performing BIP32 derivation and Taproot tweaking, where randomness cannot help.

We can do taproot tweaking without the private key though, so it should be easy to make safe? Though it means the "pubkey" you're signing for is somewhat attacker controlled, so you'd want H(priv||rng32||pub||m) or H(priv||rng23||pub||tapbranch||m) and either way would need 3 rounds, not 2.

If I'm reading this attack right, it allows you to recover x where x = H(priv || attacker_controlled), so BIP32 private-public derivation shouldn't be attackable (you'd just recover the public info, though maybe it would reduce hardened derivations to effectively unhardened), while private-private derivation is vulnerable if you first discover x = H(k + ipad || i) and then use that to discover y = H(k + opad || x). But needing at least two levels seems a bit harder, so this attack might not be feasible anyway?

I think you need more concrete implementation details to try to prevent side-channel attacks though, so I think I agree with it not being worth spending too much time on now.

@gmaxwell

gmaxwell commented Feb 4, 2020

Seems more likely than not that such attacks exist. At least it's easy to imagine an otherwise hash function that is vulnerable.

To be fair, any implementation where SHA256 has side-channels is going to have an extremely hard time avoiding side-channels in the rest of its logic. (not that I disagree with picking the more resistant structure as the recommended one)

Edit: I see sipa said something similar, but I was actually thinking of the fixed base exp, the ex scalar mult and the addition with k rather than tweaking.

That means the following things are insecure (assuming 16 byte rand).

I had somehow (!) assumed in earlier discussions that rand would always be padded to 32 bytes (even if it was empty), and had specifically brought up the compression function run having no attacker-controlled data when suggesting H(rand||priv||pub||msg); I'm glad you made it clear that you were thinking of a different size!

@real-or-random

My preference is thus H(priv||rand||pub||msg). This never needs any padding, and seems safe for any length of rand >= 16. If we want to treat msg as variable-length too (not supported by BIP340, but who knows maybe someone builds a variant), rand should probably be prefixed with a 1-byte length.

I agree with the preference but I can't follow why it's safe for any length of rand >= 16.

@sipa
Owner Author

sipa commented Feb 5, 2020

@ajtowns:

You're right, leaks outside of nonce generation may not be as severe as I thought while writing the previous message here. Still, there are a number of them, though it's hard to say how serious they are:

  • I suspect that anything handling EC multiplication in software is at risk (and probably the only reason why there's interest at all in attacking the nonce generation instead of the EC code is that for common curves, devices where this is an issue likely have hardware for the EC part).
  • BIP32 hardened derivation includes a SHA512 compression that includes the parent key, though only 31 bits are controlled by the attacker, which probably is not enough to leak more than a small number of bits of information from the key.
  • Taproot derivation is indeed done entirely on public data, but using the tweaked key inside the nonce derivation function may actually be a risk. In H(privkey||pubkey||rand16||pad||msg) for example, the first block now contains both (a linear function of) secret data and attacker-controlled data (the pubkey). I'm not sure, but I wouldn't be surprised if hashing privkey isn't already attackable even without a pubkey in there (guess certain bits, know what tweak was added, compute the likelihood of high bits in those positions... as different parts of the input block get combined with itself). This is an argument in favor of having the synthetic randomness in the same block as the private key (rather than just anywhere before the message).

@gmaxwell:

I had somehow (!) assumed in earlier discussions that rand would always be padded to 32 bytes

Let's redo the math for 32-byte synthetic randomness (or randomness padded to 32 bytes):

  • (a) H(priv||rand||pub||msg): 3, 3, 1
  • (b) H(rand||priv||pub||msg): 3, 3, 1
  • (c) H(priv||pub||rand||pad_to_block||msg): 3, 2, 1
  • (d) H(rand||pad_to_block||priv||pub||msg): 3, 3, 1

With caveats:

  • All of them are at risk when len(rand) <= 16, because then the private key data is not sufficiently masked before being combined with attacker-controlled data.
  • (b) is at risk when 33 <= len(rand) % 64 <= 63, because it brings part of priv and msg into the same block.
  • (c) and (d) always mix privkey and pubkey (which, under Taproot, may mean combining linear-function-of-secret data with attacker-controlled/grindable data). (a) and (b) only do this with len(rand) < 32, but even then, also mix in randomness in that same block (which may or may not make the attack harder).

I think my preference remains (a).

@real-or-random

I agree with the preference but I can't follow why it's safe for any length of rand >= 16.

Simply because with less, the secret data is insufficiently blinded before being mixed with attacker-controlled data.

16 may be absolute overkill though. If the attacker cannot observe the private key itself, but only a midstate that is a (deterministic) function of it, even a little bit of randomness added (say, 4 bytes) to the overall nonce means he's not going to be able to predict future nonces.

@gmaxwell

gmaxwell commented Feb 5, 2020

(b) H(rand||priv||pub||msg): 3, 3, 1

How are you getting 1 compression with precomputed priv/rand? Are you forgetting the 8 byte length?

This should be 3, 3, 2.

I don't care that much, we should probably favour lowest credible leakage. Implementations which favour performance above other considerations will likely do something different.

@sipa
Owner Author

sipa commented Feb 5, 2020

You're right; (a) and (b) are of course 3, 3, 2.
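For reference, the extra compression comes purely from SHA256's padding overhead; a quick sanity check:

```python
# 4 x 32 = 128 bytes exactly fill two 64-byte blocks, so the 0x80 padding
# byte plus the 8-byte length field force a third compression.
assert (4 * 32 + 9 + 63) // 64 == 3
```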

@jonasnick

I'm not sure, but I wouldn't be surprised if hashing privkey isn't already attackable even without a pubkey in there (guess certain bits, know what tweak was added, compute likelyhood of high bits in those positions... as different parts of the input block get combined with itself). This is an argument in favor of having the synthetic randomness in the same block as the private key [...].

Interesting thought. It looks like this countermeasure is not guaranteed to help. What may work is hashing the untweaked priv and pub and separately hashing the tweak. But that sounds impractical, if only because the BIP has no concept of a tweak.

I think my preference remains (a).

If the added randomness is going to become the default signing algorithm, it would be simpler to also use a fixed length. It makes sense to view the pubkey as similarly dangerous as the message (whereas the attack on the privkey tweak appears to be significantly harder). Perhaps then use 32-byte (a), with a rationale that hints at saving a compression if power analysis can be ruled out.

@real-or-random

If the added randomness is going to become the default signing algorithm, it would be simpler to also use a fixed length. It makes sense to view the pubkey as similarly dangerous as the message (whereas the attack on the privkey tweak appears to be significantly harder). Perhaps then use 32-byte (a), with a rationale that hints at saving a compression if power analysis can be ruled out.

What I had in mind was always a fixed size; I assumed talking about different sizes is only about considering what would happen if people don't stick to the BIP.

Shouldn't 23 bytes be enough then? Except that more randomness probably does not hurt, I have not seen an argument yet for why 32 bytes is significantly better than 23.

@gmaxwell

gmaxwell commented Feb 5, 2020

23 only arose because in one of the layouts it was the maximum length that would fit without causing another compression function invocation. Anything over 16 bytes should be ample from a 'randomness' perspective, and anything over 32 starts becoming inconveniently much... the actual amount should be picked based on whatever is efficient with the hashing. It would be silly to take on an extra compression run because it used 32 instead of (say) 23.
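A quick check of where 23 comes from, assuming the H(rand || priv32 || pub32 || msg32) layout discussed earlier:

```python
# Two 64-byte blocks hold 128 bytes; subtracting priv+pub+msg (96 bytes)
# and the 9 bytes of SHA-256 padding/length leaves at most 23 bytes of rand.
assert 2 * 64 - (32 + 32 + 32) - 9 == 23
```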

@jonasnick

I have not seen an argument yet for why 32 bytes is significantly better than 23.

The argument in this thread is that with 23 bytes you'll have 9 bytes of pubkey in the same compression as priv. When you take tweaking into account the pubkey is just as attacker-controllable as the message and the original attack in this thread applies. On the other hand, probing with differently tweaked pubkeys should also mean the priv is different each time, but that's only by an attacker-known offset and it's unclear to me if that makes the attack any harder. In any case, to reduce contamination with the pubkey, 23 would be preferred over 16.

@real-or-random

The argument in this thread is that with 23 bytes you'll have 9 bytes of pubkey in the same compression as priv. When you take tweaking into account the pubkey is just as attacker-controllable as the message and the original attack in this thread applies.

Ah, I haven't seen this argument spelled out here explicitly. Is it really true that mixing attacker-controlled data even with secret data and randomness is an issue? The original attack mentioned here is against a scheme that doesn't use randomness at all. If it is indeed an issue, wouldn't then H(priv||rand32||pub||msg) also be a problem because then the second compression has a secret input (the midstate) and an attacker-controlled input (pub and msg)?

If the thing we need to avoid is secret data and attacker-controllable data in the same block, then H(msg||pub||sk||rand) would only have one byte of attacker-controllable data in the second block (or we could even bring it down to one bit by putting the 02/03 byte at the end), but as I wrote above I'm not convinced (yet) that this is the goal.

@jonasnick

jonasnick commented Feb 5, 2020

wouldn't then H(priv||rand32||pub||msg) also be a problem because then the second compression has a secret input (the midstate) and an attacker-controlled input (pub and msg)?

The attacks discussed in this thread wouldn't apply because the midstate changes in each signing attempt due to rand32.

@sipa
Owner Author

sipa commented Feb 6, 2020

Let's assume for a minute that secret key and randomness go into the first block and nothing else (by padding the rest). I wanted to know what helps and what doesn't. To do that, I wrote some code to analyze the behavior of the expansion step in SHA256, by propagating const/secret/random flags for all the integers in it. A complication is that the vulnerable step is an addition between 4 integers, and there's no good reason to assume any particular order in which they get added, so everything here is a worst case over all possible 24 orderings in which these additions occur (I am assuming sequential additions like (((a + c) + b) + d), and not hierarchical ones like (a + d) + (b + c), though).

I'm treating any addition between (unblinded) secret data and other (unblinded) secret data as a problem, because in a taproot/tweak setting, the secret key is both a function of the secret and partially under control of the attacker. It may well be the case that this is not a concern at all, as much of the control the attacker has is limited due to the tweak being a hash, and there are linear modular operations getting in the way of thinking of bits individually.

So, one observation is that mixing in randomness in there in general does not help. Of course, it's needed because otherwise the midstate becomes a target of attack for the next block, but the randomness does not affect the potential for leakage in the expansion of the first block itself - except in weird pathological configurations.

Another finding is that any "sane" orderings are pretty bad - both key32+pad32 or pad32+key32 result in exposing information of up to 224 bits of key material (this is very likely a serious overestimation; it's just 7 chances for leakage in 32-bit operations). There are some crazy orderings that never expose more than 32 bits, like KPPKPPKPKPKKPKPK (where each letter represents 4 bytes). I don't think we're going to seriously propose something like that.

So overall I think this must mean that we should treat operations between different secret key bits as safe. Not because we're sure that they are, but because if they aren't, there isn't anything non-crazy we can do about it.

EDIT: seems I had a bug in my code, not all the above findings are correct. Will redo.
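For illustration only, here is a stripped-down sketch of the kind of flag propagation described above (hypothetical code, not the actual analysis script; it uses a crude "two or more unblinded secret inputs" check per expansion step instead of enumerating the addition orderings):

```python
CONST, SECRET, RANDOM = "const", "secret", "random"

def analyze_first_block(labels):
    """labels: 16 labels, one per 32-bit word of the first input block."""
    w = list(labels)
    flagged = 0
    for t in range(16, 64):
        # Message schedule: W[t] = s1(W[t-2]) + W[t-7] + s0(W[t-15]) + W[t-16]
        inputs = [w[t - 2], w[t - 7], w[t - 15], w[t - 16]]
        # Crude proxy for the worst case over addition orderings: flag any
        # step that combines two or more secret-dependent words.
        if inputs.count(SECRET) >= 2:
            flagged += 1
        # Pessimistic output label: secret-dependent dominates, then random.
        if SECRET in inputs:
            w.append(SECRET)
        elif RANDOM in inputs:
            w.append(RANDOM)
        else:
            w.append(CONST)
    return flagged

# Example layout: 32-byte secret key followed by 32 bytes of constant padding.
print(analyze_first_block([SECRET] * 8 + [CONST] * 8))
```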

@gmaxwell

gmaxwell commented Feb 6, 2020

heh, would be good to know if a crazy permutation actually looked safer, we could still use it in libsecp256k1 even if it wouldn't be what we would specify as a recommended procedure.

@sipa
Owner Author

sipa commented Feb 6, 2020

What do people think of this: H(priv XOR rand || pub XOR rand || msg), with rand either all zeroes or 32 bytes of randomness (the same randomness in both XORs). It's 2,2,1, has the private data completely masked before exposing to the message, and even if the pubkey is under attacker control (through grinding the tweak), it's never exposed to the private key directly. Further, given a = priv XOR rand and b = pub XOR rand, there is on average only a single value x such that b XOR x is the public key corresponding to a XOR x, so there is no entropy loss here even if rand is chosen maliciously.

@ajtowns

ajtowns commented Feb 7, 2020

That's equivalent to doing H( (priv XOR pub XOR rand) || rand || msg ) (with a default of rand=pub).

Converting the "no entropy loss" argument, if you've got x such that pub2 = pub XOR rand XOR x is the public key for priv2 = priv XOR rand XOR x (expanding a and b), then pub2 XOR priv2 = pub XOR priv, so to me that looks like you do have some (extremely small) entropy loss (when you happen to have two keys that give the same result after xor'ing with their corresponding public key), not none.

(Other than that, I like it)

@jonasnick

jonasnick commented Feb 7, 2020

if the pubkey is under attacker control (through grinding the tweak), it's never exposed to the private key directly.

Isn't it still a problem that they're masked by the same randomness?

Worth noting that with the previous suggestions in this thread, if the attacker controlled both rand and priv the nonce is fine. The XOR suggestion means that if both are controlled the nonce is at risk (because the attacker can just always set pubkey XOR rand to 0). (EDIT: This doesn't matter because in that case the nonce will not be repeated for different pubkeys because priv XOR pub is hashed)

entropy loss when you happen to have two keys that give the same result after xor'ing with their corresponding public key

Interesting observation. Seems unlikely indeed assuming uniform distribution of the secret key even if you take attacker-controlled tweaks into account.

@real-or-random

What do people think of this: H(priv XOR rand || pub XOR rand || msg), with rand either all zeroes or 32 bytes of randomness (the same randomness in both XORs).

Hm, this seems dangerous because collisions are possible in the input of the hash function.

For example, if the least significant bit of the secret key is 0, then adding a tweak of 1 is equivalent to XORing rand = 1, and an example of a collision is H((priv + 1) XOR 0 || (pub XOR 1) XOR 0 || msg) = H(priv XOR 1 || pub XOR 1 || msg).

What I say here assumes that the attacker can control the tweak, the public key and the randomness. I know that this is a lot of assumptions but simply given the existence of this attack, I don't think this suggestion is the safest choice that we can make.

Another reason why I don't like XORing with the secret key is that some genius could get the idea that rand = priv is a good choice. If you don't have an RNG, priv sure has more entropy than constant zero, no?

Moreover, if the attacker sees/controls certain bits of the randomness, XOR may not be a great choice when it comes to power analysis, because the attacker could try to find randomness values for which priv XOR rand has high or low Hamming weight (but this is just a spontaneous idea, not sure if this is a real concern).

Maybe accepting an additional compression is the way to go. If you want high-speed signing, you probably don't want SHA256 anyway but are willing to change to BLAKE2 etc.
Another thing you could try is to precompute seed = H(priv||pub) and then use some 256-bit PRP keyed with seed XOR rand.

@sipa
Owner Author

sipa commented Feb 7, 2020

@real-or-random

Another reason why I don't like XORing with the secret key is that some genius could get the idea that rand = priv is a good choice. If you don't have an RNG, priv sure has more entropy than constant zero, no?

I think you're missing that H(priv xor rand || pub xor rand), even when rand=priv, is still H(0 || pub xor priv). pub xor priv has very nearly the same amount of entropy as priv to begin with. This is exploiting the fact that adding pub is in fact redundant, and then using that channel to make sure even malicious/broken rand values don't destroy entropy.

@real-or-random

I think you're missing that H(priv xor rand || pub xor rand), even when rand=priv, is still H(0 || pub xor priv).

Oh yes, when I wrote that paragraph I missed entirely that rand will be there in the second part, too.

Did you consider
H( (priv XOR (rand16||rand16)) || (pub XOR (rand16||rand16)) || rand16 || msg )?

This has the disadvantage that it takes only 16 bytes of randomness, but it has rand16 explicitly in the input, so the input is really only an encoding of the tuple (priv, pub, msg, rand16) and thus it's easy to analyze. (We could actually go up to 23 bytes while still preserving 2, 2, 1 with (priv || pub) XOR (rand23 || rand23 || rand23[:-5]) || rand23 || msg but the truncation is somewhat annoying.)

@jonasnick

Did you consider H( (priv XOR (rand16||rand16)) || (pub XOR (rand16||rand16)) || rand16 || msg )?

pub shouldn't be XORed with randomness, because as far as I can see there's no acute reason and it may undo the effect of masking priv under power analysis.

Since people seem to care about one compression, how about using either @sipa's or @real-or-random's (modified as above?) version, and mentioning the alternative H(priv XOR rand32 || pub || rand32 || msg) with 3 compressions and better power analysis resistance.

Not sure which version is better. @real-or-random's version can be easily extended to the rand32 version (by replacing rand16||rand16 and rand16 with rand32). OTOH, after a cursory look into the internals of SHA256 it seems at least plausible that 32 bytes of randomness in @sipa's version improves randomization of the 64 fixed bytes over 16 bytes of randomness.

the input is really only an encoding of the tuple (priv, pub, msg, rand16) and thus it's easy to analyze

@real-or-random do you have any specific analysis in mind? Or do you mean it's just easier to see that it works?

@sipa
Owner Author

sipa commented Feb 13, 2020

@jonasnick I don't think anything is clearly better than anything else.

Looking at https://eprint.iacr.org/2017/985.pdf, their measurements are mostly correlated with the Hamming weight of words being read from the chip's internal memory. That's of course only one piece of hardware, but perhaps it's a good model to target. We must of course assume some limits on the attacker's powers - if he can read individual bits being loaded from memory for example, then no amount of randomization is going to matter. However, there is an important thing to notice here: it's not necessarily operations (bitwise or arithmetic) that can be targeted, but really any variable involved.

Taking that into account, I think H(priv32 XOR rand32 || pub32 || msg32) is as good as it gets, as no bit-level information about priv32 is ever exposed, and the private key is completely blinded before being used elsewhere.

H(priv32 XOR rand32 || pub32 XOR rand32 || msg32) was aiming to avoid the loss of information in case rand32 was correlated with priv32, but unfortunately does not have the same blinding property. Assuming the attacker can observe the Hamming weight of pub32 XOR rand32, he obtains bit-level information about rand32, which he can use to derive some bit-level information about priv32.

H(priv32 XOR rand32 || pub32 || rand32 || msg32) does not look like it's guaranteed to be better, as bits of rand32 could be revealed by combining with msg32 in the same expansion, and then used to learn information about priv32, which rand32 is also XORed with.

I think H(priv32 XOR rand32 || pub32 || rand32 || pad32 || msg32) works, as the second rand32 is only being combined with completely-blinded information here. Of course, this has an additional compression invocation.

I can't find something with similar blinding properties that doesn't have an additional compression. H(priv32 XOR (rand16 || rand16) || pub32 || rand16 || msg32) doesn't have the additional compression, but it's using rand16 in the same expansion as the attacker-controlled msg32.

@sipa
Owner Author

sipa commented Feb 13, 2020

So I think my preference is either H(priv32 XOR rand32 || pub32 || msg32), or H(priv32 XOR rand32 || pub32 || rand32 || pad32 || msg32).

@gmaxwell

H(priv32 XOR rand32 || pub32 || msg32) feels particularly dangerous: don't you expect people will use the same random value for both the key and rand32?

@sipa
Owner Author

sipa commented Feb 13, 2020

Sure (that's why I suggest the other options) - but I don't know how to weigh that risk against increased DPA risk and desire for fast compressions.

@real-or-random

the input is really only an encoding of the tuple (priv, pub, msg, rand16) and thus it's easy to analyze

@real-or-random do you have any specific analysis in mind? Or do you mean it's just easier to see that it works?

What I had in mind was simply that anything that is a proper encoding of the inputs is nice because its security (ignoring side channels) is a no-brainer.

Sure (that's why I suggest the other options) - but I don't know how to weigh that risk against increased DPA risk and desire for fast compressions.

I think we should choose the conservative option in the BIP. Signing performance is not our first priority, and implementations can choose other options if they want.

I can't find something with similar blinding properties that doesn't have an additional compression. H(priv32 XOR (rand16 || rand16) || pub32 || rand16 || msg32) doesn't have the additional compression, but it's using rand16 in the same expansion as the attacker-controlled msg32.

Can you explain why you think it's bad to have randomness together with an attacker-controlled value? My understanding now is rather that this is fine, because the randomness is not fixed. Moreover, I'm not sure if it's a big difference if the attacker controls the value or just knows the value.

For example:

I think H(priv32 XOR rand32 || pub32 || rand32 || pad32 || msg32) works, as the second rand32 is only being combined with completely-blinded information here.

Is pad32 "completely-blinded"? It is, at the very least, known to the attacker here.

I'm not only asking because of the rand16 variant that I proposed but this is also relevant here:

So I think my preference is either H(priv32 XOR rand32 || pub32 || msg32), or H(priv32 XOR rand32 || pub32 || rand32 || pad32 || msg32).

If it's okay to mix random/masked data with attacker-controlled data in a compression, then one could move pub32 to the front and make another compression precomputable.
For example
H(pub32 || pad32 || priv32 XOR rand32 || msg32 || rand32)
or
H(pub32 || pad32 || msg32 || rand32 || priv32 XOR rand32).

I think the latter variant is particularly interesting because the private key comes in only in the last compression, and the midstates are thus not secrets. I think the reason why we put the private key at the beginning is to randomize the entire function early, similar to putting R in front for the challenge hash to avoid the attacker controlling the prefix. But while this is relevant for collisions, I don't think that's (too) relevant for the pseudorandomness of SHA256.

There are also the variants
H(pub32 || rand32 || msg32 || pad32 || priv32 XOR rand32),
or
H(msg32 || pad32 || pub32 || rand32 || priv32 XOR rand32),
which do not allow for a precomputation of the first compression but do not mix rand32 with the attacker-controlled msg32.

@jonasnick

jonasnick commented Feb 13, 2020

H(priv32 XOR rand32 || pub32 || rand32 || msg32) does not look like it's guaranteed to be better, as bits of rand32 could be revealed by combining with msg32 in the same expansion

Hm, I hadn't considered that. Since rand32 is supposed to be used only once, this likely would require the attacker to have higher resolution equipment than what seems to be used in these papers. As Tim also notes, using the constant pad32 instead of msg32 would probably not be any better.

I think my preference is either H(priv32 XOR rand32 || pub32 || msg32)

The power analysis resistance is nice, but I'd be hesitant to suggest something that's so easily misused.

There are also the variants H(pub32 || rand32 || msg32 || pad32 || priv32 XOR rand32), or H(msg32 || pad32 || pub32 || rand32 || priv32 XOR rand32), which do not allow for a precomputation of the first compression but do not mix rand32 with the attacker-controlled msg32.

I don't think that mixing it with msg32 is different to mixing it with pub32 (which is similarly attacker controlled due to the tweak).

Fwiw, with rand64 we can get similar properties as H(priv32 XOR rand32 || pub32 || msg32) but without potential for misuse:
H(priv XOR rand64[:32] || pub32 || rand64[:32] XOR rand64[32:64] || msg)

@sipa
Owner Author

sipa commented Feb 14, 2020

Can you explain why do you think it's bad to have randomness together with an attacker-controlled value? My understanding now is rather that this is fine, because the randomness is not fixed. Moreover, I'm not sure if it's a big difference if the attacker controls the value or just knows the value.

Consider H(priv32 XOR rand32 || msg32) (just looking at DPA failures here, ignoring any other problems it may have). Assuming an attacker who can only observe word-level hamming weight aggregates of every variable, this is as good as it gets, because even with the attacker seeing weight(priv32), weight(priv32 XOR rand32) and weight(rand32), multiple observations with different unknown rand32 will never give the attacker bit-level information about priv32, only about its aggregates.

Now instead consider H(priv32 XOR rand32 || msg32 XOR rand32), where the attacker controls msg32. He uses this algorithm:

  • For i in range(256):
    • msg32 = (1 << i)
    • diffs = []
    • For j in range(1000):
      • w_priv = weight(priv32) [constant over all iterations]
      • w_randpriv = weight(priv32 XOR rand32)
      • w_rand = weight(rand32)
      • w_randmsg = weight(msg32 XOR rand32)
      • if (w_rand > w_randmsg): [predicting whether bit i of rand32 is true]
        • diffs.append(w_priv - w_randpriv)
    • output: predict bit i of priv32 to be equal to (avg(diffs) > 0)

Does this make sense? It's classifying instances based on a selected bit in rand32, and then using this slight bias to classify variations seen by XORing rand32 into priv32, using it to predict its bits.
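As a sanity check, here is a toy, noise-free simulation of this classifier (assuming the attacker really does get exact Hamming weights of the full values; a 64-bit toy secret keeps the runtime small, and the prediction compares the two conditional averages instead of comparing against zero, which removes the dependence on weight(priv32)):

```python
import random

def weight(x):
    return bin(x).count("1")

def recover_secret(priv, nbits=64, queries=4000):
    recovered = 0
    for i in range(nbits):
        msg = 1 << i
        diffs1, diffs0 = [], []
        for _ in range(queries):
            rand = random.getrandbits(nbits)
            w_priv = weight(priv)               # constant over all queries
            w_randpriv = weight(priv ^ rand)
            w_rand = weight(rand)
            w_randmsg = weight(msg ^ rand)
            # weight(rand) > weight(msg ^ rand) exactly when bit i of rand is 1.
            (diffs1 if w_rand > w_randmsg else diffs0).append(w_priv - w_randpriv)
        # If bit i of priv is set, the "rand bit = 1" average exceeds the
        # "rand bit = 0" average by 1; if it is clear, it is lower by 1.
        if sum(diffs1) / len(diffs1) > sum(diffs0) / len(diffs0):
            recovered |= 1 << i
    return recovered

random.seed(1)
priv = random.getrandbits(64)
print(recover_secret(priv) == priv)   # True (with overwhelming probability)
```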

Now, instead consider H(priv32 XOR rand32 || pub32 XOR rand32). The attacker does not directly control pub32 anymore, but he can cause many known random values to appear there. The same technique can be used, but now it needs 2 levels of classification: one using bits of pub32 to group rand32 values based on a bit in pub32, and then using those to infer information about bits in priv32. This may be completely infeasible in practice because of noise levels or the number of observations required, but it's clearly revealing something about the individual bits.

@jonasnick

jonasnick commented Feb 14, 2020

It seems like in general @sipa's algorithm does not require combining rand32 with variables such as msg32 or pub32. It would apply similarly to constructions where randomness is combined with constants as for example in H(priv32 XOR rand32 || pub32 || rand32 || pad32 || msg32) .

Assume that the attacker is able to obtain weight(rand32) and weight(f(rand32)) for some constant function f. Assume that it just so happens that f(x) = x XOR 0100...00. Then the attacker can determine the first bit of rand32 by comparing weights as rand32[0] = weight(rand32) > weight(f(rand32)).

Now if the attacker obtains a measurement weight(priv32), weight(priv32 XOR rand32), weight(rand32), weight(f(rand32)), and it holds that weight(rand32) > weight(f(rand32)) then the attacker learns the first bit of the private key priv32[0] = weight(priv32) > weight(priv32 XOR rand32).

@sipa
Owner Author

sipa commented Feb 18, 2020

Assume that the attacker is able to obtain weight(rand32) and weight(f(rand32)) for some constant function f. Assume that it just so happens that f(x) = x XOR 0100...00.

You're right insofar as some bit-level information inevitably leaks whenever rand32 is used in more than one place. I'm not sure it's computationally feasible to extract much, however. In practice, the functions f in the "expansion of rand32 || pad32" phase are a fixed set of functions of rand32, most of which are far more complex than selecting one bit. I do count 496 operations that affect Hamming weight (assuming a naive implementation, and counting every XOR, SHIFT, and ADD, but not rotations) per expansion, so information-theoretically it does seem to be enough to even determine rand32 exactly. This is hard to reason about...

On the other hand, with H(priv32 XOR rand32 || pub32 XOR rand32) I believe the attacker does not need any computational power, only lots of queries and the ability to classify them. The difference is that the relations f(rand32) as described above are now variable simple XORs with rand32.

@sipa
Owner Author

sipa commented Feb 18, 2020

@real-or-random elsewhere suggested H(priv32 XOR H(rand32) || pub32 || msg32). I think that's an elegant and safe choice. It's obviously not vulnerable to any bad choice of rand32, and even an attacker who can learn some bits of information about rand32 cannot predict H(rand32) - and observing word-level weights of H(rand32) shouldn't let him gain bit-level information on priv32.

It also supports arbitrary-length rand, and should be fine with 16 bytes. In case no randomness is available, it simplifies to a 2-compression construction (by precomputing H("") to XOR in, or just setting it to 0).
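A minimal sketch of this construction as written here (plain untagged SHA256; the function name and the empty-rand default are illustrative only, not final BIP text):

```python
import hashlib

def nonce_hash(priv32: bytes, pub32: bytes, msg32: bytes, rand: bytes = b"") -> bytes:
    """H(priv32 XOR H(rand) || pub32 || msg32); rand may have any length."""
    assert len(priv32) == len(pub32) == len(msg32) == 32
    mask = hashlib.sha256(rand).digest()
    masked_key = bytes(a ^ b for a, b in zip(priv32, mask))
    # 32 + 32 + 32 = 96 bytes of input, i.e. two compressions after padding.
    return hashlib.sha256(masked_key + pub32 + msg32).digest()
```

With no randomness available this degenerates to XORing in the constant H("") (or an implementation can XOR in zero instead), which is the 2-compression fallback mentioned above.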

@jonasnick

I like @real-or-random's idea.

In addition to being misuse resistant, it doesn't have the priv leak from using rand32 twice pointed out above. Because the attacker only learns weight(priv32 XOR H(rand)) and not weight(priv32 XOR rand), knowing weight(f(rand)) for some simple f doesn't help. And knowing weight(H(rand)) doesn't help either because H(rand) should be computationally independent from the input (unlike the simple example f).

This also seems to be safe against the attacker controlling both rand32 and pub32. The attacker can "remove a (known) tweak" from priv and force nonce reuse, but that'll only reveal the tweak again (when subtracting the two signatures).

Therefore, it's strictly better than the rand64 idea from above (because of the smaller rand). The other criteria in this thread also apply (randomness in the first block, no "(unmasked) secret data [including rand] together with attacker-controlled data within one input block" and no "addition between (unblinded) secret data and other (unblinded) secret data").

@real-or-random

The ToDo left here is to add a rationale for our specific choice to the BIP.

@jonasnick

See also https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-March/017711.html for a summary of the above discussion.
