Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The precise mathematical definition of obfuscation and what is considered obfuscation for an average software engineer are two very different things.

In fact, the article is only about indistinguishability obfuscation. What is mostly discussed in this thread is the notion of virtual black box obfuscation (VBB). VBB has been proven to be impossible in the general case (see https://www.wisdom.weizmann.ac.il/~oded/PS/obf4.pdf). There are a few special programs where VBB is feasible, such as point functions, but in general in cannot be achieved.

Indistinguishability obfuscation (iO) means that if you obfuscate two programs that compute the same function, then you cannot distinguish them. Or put in different words, if you get two obfuscated programs, then there is no better way than random guessing (except for a factor that is negligible in some security parameter) to find out if they stem from the same original program.



To try to make this concrete, if you have a program A that does bubble sort, and a program B that does selection sort:

1. VBB would be making it so you can't glean information about A or B by running VBB(A) or VBB(B) or examining them, for various definitions of "information".

You can't tell A is a bubble sort at all, and you can't tell B is a selection sort at all.

VBB is, as mentioned, impossible in the general case.

2. IO would be making it so if you are holding IO(A) and IO(B), you can't tell them apart, and can't tell if the original was A or B.

So you can have functionally identical programs, and when you run them through IO, you can't tell whether the original was A or B.


I'm kind of confused. I'm guessing this is a dumb question, but I'm obviously missing something. If a program is running bubble sort, and another is running selection sort... can't you just run them instruction-by-instruction to see which elements they swap? And deduce what they were based on that? Like if you have [4, 3, 2, 1], and the first swap the program does results in [1, 3, 2, 4], then it clearly wasn't bubble sort, right? What am I missing?


Boolean circuits themselves do not have memory. (though you can create memory within them).

See the formal model here: https://en.wikipedia.org/wiki/Circuit_(computer_science)

There are extensions of iO to RAM programs, but this is not it.


> You can't tell A is a bubble sort at all, and you can't tell B is a selection sort at all.

That doesn't make any sense. You can trace them while they execute, then compare and analyze the two traces?

By definition obfuscation is always reversible or the software wouldn't even work any more.

You can make it infeasible (timewise) to figure it out, but it's still possible, even if it takes infinite time.


I'd suggest reading the paper - which covers this ;)


Thanks, this is a great explanation.


Ed: redundant - asked and answered in thread.


Yeah, I probably should have collected an the pieces into a single reply. Right now the number of comments is small enough that it’s not hard to find though.


The latter is useful to hide who wrote the program? Like some features of code that might reveal your personal patterns?

Or what is it for?

And how does it deal with timing? İf two programs I gave it have different complexity, would it pad the running time to some fixed value? İf so, how is that value chosen?


Thanks, I had no idea what to make of the OP, so your comment is very valuable!

> Or put in different words,

I don't see how the second phrasing follows from the first one. In fact, I don't understand what the second statement is supposed to mean to begin with. Could you elaborate?


If you have a bubble sort and selection sort, and run them through IO, for either result, you can't tell if the original was bubble sort or selection sort.

Roughly: VBB is the ability to take any program and make it indistinguishable from random.

IO is the ability to take two programs that compute the same function and make them indistinguishable from each other.


How far does this analogy go?

For example, if I plot how long both programs take at various scales, at some point I should be able to determine which one is O(n log(n)) right?


I gave the formal definition of distinguishability in the other comment, but it does not include running time.


And that involves preventing the leakage of any data? Such as the time each take?


The original goal was to be able to make public key systems out of secret key systems, so hiding existing running time is not part of the definition of indistinguishable.

It is about what information you can glean about either of the originals. Formally (from the paper):

Indistinguishability: For every two ensembles {C0,λ} and {C1,λ} of polynomial-sized circuits that have the same size, input length, and output length, and are functionally equivalent, that is, ∀λ, C0,λ(x) = C1,λ(x) for every input x, the following distributions are computationally indistinguishable:

  {iO(1λ, C0,λ)} {iO(1λ, C1,λ)}


IE you can take a secret key system, hardwire the key into the program, and turn it into a public key system without fear that the secret key will be recovered - instead, it will not be distinguishable from other programs with different hardwired keys.


> IE you can take a secret key system, hardwire the key into the program, and turn it into a public key system without fear that the secret key will be recovered - instead, it will not be distinguishable from other programs with different hardwired keys.

I don't understand. Shouldn't a different secret key yield a different function / circuit?


You can actually do it both ways - where the ciphertext is an obfuscated program, and where keys a programs and ciphertexts are short.

https://eprint.iacr.org/2013/454.pdf and friends have a direct description of the latter and references to the former.

This is just one mechanism.


Ah I think I understand better. Thanks!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: