# MIT/LCS/TR-503: The Round Complexity of Secure Protocols


LABORATORY FOR COMPUTER SCIENCE
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

MIT/LCS/TR-503

THE ROUND COMPLEXITY OF SECURE PROTOCOLS

Phillip Rogaway

April 1991

545 TECHNOLOGY SQUARE, CAMBRIDGE, MASSACHUSETTS 02139

The Round Complexity of Secure Protocols

by Phillip Rogaway

A.B., Computer Science, University of California, Berkeley (1985)
S.M., Electrical Engineering and Computer Science, Massachusetts Institute of Technology (1988)

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

June 1991

© Massachusetts Institute of Technology 1991

Signature of Author: Department of Electrical Engineering and Computer Science
Certified by: Silvio Micali, Thesis Supervisor
Accepted by: Arthur Smith, Chair, Departmental Committee on Graduate Students

The Round Complexity of Secure Protocols

by Phillip Rogaway

Submitted on April 22, 1991 to the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Abstract

Assume we have a network of three or more players, each player in possession of some private input. The players want to compute some function of these private inputs, but in a way which protects the privacy of each participant's contribution. Not all of the players can be trusted to do as they are instructed. The resources the players are given to accomplish their goal are communication — the ability to privately send messages to one another, or to broadcast messages to the community as a whole — and local computation.

Many insightful protocols have been proposed for solving this problem of multiparty secure function evaluation. Building on Yao's protocol for the case of two players [Ya86], Goldreich, Micali and Wigderson [GMW87] offered the first general protocol for this problem, and they provided the paradigm on which a large body of successive work was based.

Despite enormous progress, research on secure function evaluation has suffered from some serious shortcomings. First, though many protocols have been devised for solving the problem, what, exactly, these protocols accomplish has not been fully understood. In fact, no rigorously specified and generally accepted definitions have been proposed in this field. Second, protocols for multiparty secure function evaluation could be extremely inefficient, the main cause being that they required an unbounded (and usually large) number of communication rounds.

We address both of these points, carefully crafting definitions which satisfactorily deal with the myriad of issues lurking here, and offering a new protocol for multiparty secure function evaluation — one which categorically improves the complexity requirements for this task. The new protocol completely divorces the computational complexity of the function being collaboratively computed from the round complexity of the protocol that evaluates it. Using this approach, we show that a rigorously-specified and extremely strong notion of secure function evaluation can be achieved by a protocol which requires only a fixed constant number of rounds of interaction. This result assumes only the existence of a one-way function and that the majority of the participants to the protocol behave correctly.

Thesis Supervisor: Silvio Micali
Title: Professor of Electrical Engineering and Computer Science

Keywords: computer security, cryptography, distributed computing, secure protocols.
Acknowledgements

First and foremost, thanks go to Silvio Micali. As an advisor, a researcher, and a generally amusing character, Silvio is unsurpassed. Bringing together amazing intuition and an uncanny eye for what is deep and what is interesting, Silvio is a tireless worker, an excellent communicator, and a scientist in the classical tradition. Silvio has my deepest gratitude and respect. He has never pressured me during difficult times, and has treated me to more lunches and dinners than I can ever repay (but I shall try!).

The notion of security described here was devised over countless discussions with Silvio. Notions out of which this one sprang were developed in collaboration with Joe Kilian and Silvio. Mihir Bellare listened to my mumbling on definitional matters until he turned green (not a pretty sight). Joe pointed out, at an early stage, the importance of structuring the protocol so as to make an oracle-substitution argument possible. Mihir, Joe, and Su-Ming Wu were kind enough to help me proofread this document, so that it could reach its current state of manifest perfection.

The MIT faculty with whom I have interacted have been uniformly outstanding. In particular, I want to single out Shafi Goldwasser, Albert Meyer, Ron Rivest, and Mike Sipser. Albert has advised me on difficult matters, and has always greeted me with a warm smile. He headed my oral qualifying exam committee, while Shafi, Ron, and Mike constituted my area exam committee and served as thesis readers. They helped make these experiences good ones. I have TA'd under Shafi, Albert, Silvio and Ron, in various combinations, always learning much and enjoying the work. Sometimes I would discuss ideas with Mike; he is a terrific explainer and listener both.

As fine a teacher as he is a scientist, Manuel Blum interested me in computer science years ago, back at Berkeley. Other teachers important to me include Paul Chernoff and John Phipps.

The Theory Group's students and post-docs with whom I spent the most time have amply demonstrated their selflessness by enduring my company and my cooking. I add insult to this injury by describing them. There is Mihir Bellare, cooking companion, philosopher, egregious insultor (I have the memory of a shellfish!), and at least partial convert to the Theory; Margrit Betke, a small group of rocks off the coast of Cape Ann, house plants grin as Margrit adjusts their leaves or mumbles a friendly word; Bronwyn Eisenberg, complex actress cum computer scientist, with a voice like wind and a name like the skin of a snake; Joe Kilian, with an intricate mind of light and shadows, fashions wonderful sculptures of strangely colored language and thought; Dina Kravets, guileless Russian immigrant with busy hands, she made me awfully happy by wedding Joe; Mojdeh Mohtashemi, perennially dissatisfied woman willing to eat my tah chin or khoresh — even though she knows what they are supposed to be; Su-Ming Wu, disgruntled mathematician with an excellent eye for things dark, believes he can anticipate my thoughts by executing an algorithm he has developed; and Julia Yang, purveyor of missed Star Trek episodes and other aspects of haute culture, skilled enemy of unwanted time. To all of these friends, my love and my thanks.

Other friends who have passed through these halls include Bard Bloom, Avrim Blum, Ravi Boppana, Dimitris Bertsimas, Aditi Dhagat, Lance Fortnow, Michelangelo Grigni, Alex Ishii, Shlomo Kipnis, Miller Maley, Seth Malitz, and Yishay Mansour.
Many of these people have taught me much, or have been helpful over the years. It has been a privilege to know so many fine individuals.

The secretarial staff has been more than secretarial. Be Hubbard, provider of chocolate and other office essentials, mother to all whose mothers are inconveniently far, has waited so patiently, so suspiciously, for someone to confess, that I must finally admit that it was I, in collusion with my real mother, who anonymously sent the series of posters from the various stretches of the earth.

My parents and grandparents have given me much love and support through the years, and I thank them for it. A supportive family makes a strong ally. Most of all, Fariba Fakhroo has helped me never lose sight of what is important.

Phillip Rogaway
Cambridge, Massachusetts
April 1991

Table of Contents

1 Introduction 11
1.1 Examples of Protocol Problems 11
1.2 Secure Protocols 12
1.3 Secure Function Evaluation 13
1.4 The Accomplishments of a Decade 13
1.5 Limitations of Earlier Work 15
1.6 Contributions of this Thesis 16
1.7 Related Work 17
1.8 Organization and History of Thesis Results 17

2 The Notion of Secure Computation 19
2.1 Overview 19
2.2 Preliminaries 21
2.2.1 Notation 21
2.2.2 Basic Notions 22
2.2.3 Indistinguishability of Ensembles 24
2.3 Protocols and Their Adversaries 26
2.3.1 Protocols 28
2.3.2 Executing Protocols in the Absence of an Adversary 30
2.3.3 Adversaries (Informal Treatment) 33
2.3.4 Adversaries (Formal Treatment) 35
2.3.5 Executing Protocols in the Presence of an Adversary 37
2.3.6 Dealing with Arbitrary Numbers of Players 41
2.3.7 Complexity Measures 42
2.4 Secure Function Evaluation 43
2.4.1 The Ideal Evaluation 43
2.4.2 Ideal Evaluation Oracles 44
2.4.3 Simulators 45
2.4.4 Ensembles for Secure Function Evaluation 52
2.4.5 The Definition of Secure Function Evaluation 54
2.5 Discussion 55

3 The Constant-Round Protocol 57
3.1 Overview 57
3.2 Pseudorandom Generators 60
3.3 High-Level Description 62
3.4 The Protocol 70

4 Proving the Protocol Secure 74
4.1 Introduction 74
4.2 Preliminaries 75
4.3 Proof of the Main Theorem 78
4.4 Postmortem 94

References 95

Chapter 1

Introduction

This thesis is concerned with doing correct computation in ways that preserve people's privacy. We begin with some motivating examples.

1.1 Examples of Protocol Problems

Millionaires problem (Yao, [Ya82a]). Two millionaires wish to find out who is richer, though neither is willing to reveal the extent of his fortune. Can they carry out a conversation which identifies the richer millionaire, but doesn't divulge additional information about either's wealth?

Coin flipping problem (Blum, [Bl82]). How can Alice and Bob, speaking to one another over the telephone, agree on a random, unbiased coin flip — even if one of them cheats to try to produce a coin flip of a certain outcome?

Oblivious transfer problem (Rabin, [Ra81]). Is it possible for Alice to send to Bob a composite number n in such a way that, half the time, Bob gets just n, while, the other half of the time, Bob gets n together with a factorization of n? Alice should have no idea which of these two possibilities has occurred.

Mental poker problem (Shamir, Rivest and Adleman, [SRA81]). A group of cryptographers want to play a game of poker — over the telephone. The stakes are high, so Alice becomes uneasy when Bob begins, "Alright, Alice, I've just shuffled a deck of cards — really I have — and now I'm dealing you your hand.
You've got a 3, a 10, a 7, a J, and a 5 . . ."

Digital voting problem (Benaloh and Fischer, [BF85]). Is it possible for a group of computer users to hold a fair election on a computer network? The election should enforce the "one man, one vote" constraint, and should respect the privacy of each participant's vote, revealing only the correctly computed tally.

Office assignment problem (Disgruntled staff, [MIT]). The n graduate students are thrown into disarray when suddenly told that they should pair themselves off into n/2 two-person offices. (Assume n is even.) Each student i has a list of like_i^j-values specifying his willingness to share an office with each other student j. The students want their preferences taken into account, despite the fact that it would be more than a bit tactless for students to publish these lists! The students agree that the quality of a room assignment M, considered as a partition of the students into n/2 two-element subsets, is reasonably well measured by

    w_M = Σ_{{i,j}∈M} min { like_i^j, like_j^i }.

Now all the students want to do is to collaboratively compute an M that maximizes the value w_M while simultaneously respecting the privacy of their individual like-values . . . (A small code sketch of the quantity w_M appears at the end of Section 1.2.)

1.2 Secure Protocols

The problems above illustrate several variations in the goals one may have for carrying out a collaborative computation in a privacy-preserving manner. The following possibilities should be singled out: There may be two parties or there may be many. The result of the computation might be a single value, known to all players, or it might be a private value for each player. What is being collaboratively computed may be a computationally simple task or a computationally intensive one. What is being collaboratively computed may depend deterministically on each player's initial state, or it may depend only probabilistically on this. And the problem being solved might naturally be considered as a function, or — following the notion of Goldreich, Micali and Wigderson [GMW87] — it may be something which is not naturally considered as computing a function, a more "game-like" problem.^1

^1 [GMW87] uses the phrase "playing a mental game" to describe the problem of implementing on a communication network an n-party game of partial information — without recourse to a trusted party. This is the problem of allowing a progression in time of states of a system, certain aspects of each state known to various players, other aspects of the current state known to no one. In this viewpoint, computing a function by executing a Turing machine computation on inputs held by the various players is just one simple type of game that a group of players may wish to play.

The general notion of a secure protocol encompasses all of these possibilities. Each participant is willing to divulge some information pertaining to his own initial state in exchange for learning some information influenced by other participants' initial states. A player's willingness to participate in the protocol is based on the promise that the protocol will not only correctly compute what it is supposed to compute, but will also protect the privacy of each player's own contribution. We shall take seriously the notion that the job of the cryptographer is both to make good on this promise, and to precisely elucidate its meaning. We will do this in the general setting which we now describe.
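The office assignment problem pins down exactly what the students wish to compute, and it is worth making that target concrete. Below is a brute-force Python sketch of the quantity w_M (our own illustration; the data layout is invented, and of course the entire point of this thesis is to compute such a thing without anyone publishing his like list):

```python
from itertools import permutations

def quality(matching, like):
    """w_M: the sum, over office pairs {i, j} in the matching M, of
    min(like[i][j], like[j][i])."""
    return sum(min(like[i][j], like[j][i]) for i, j in matching)

def best_assignment(like):
    """Exhaustively search all ways of pairing up n students (n even),
    returning a matching maximizing w_M.  Exponential time: this only
    pins down *what* the students want computed, not how to compute it
    while keeping each like list private."""
    n = len(like)
    best, best_w = None, None
    for order in permutations(range(n)):
        matching = [(order[t], order[t + 1]) for t in range(0, n, 2)]
        w = quality(matching, like)
        if best_w is None or w > best_w:
            best, best_w = matching, w
    return best, best_w
```

Maximum-weight matching can in fact be found in polynomial time; the exhaustive search above is only meant to make the function being evaluated unambiguous.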
1.3 Secure Function Evaluation

Of the various "types" of secure protocols exemplified by our sample problems, we will limit our discussion in this thesis to multiparty secure function evaluations. Informally, we have n ≥ 3 parties, 1, . . . , n. Each party i has a private input, x_i, known only to him. The parties want to correctly evaluate some function f on their private inputs — that is, to compute y = f(x_1, . . . , x_n) — while maintaining the privacy of their own individual inputs. That is, they want to compute y without revealing more about their inputs than this value implicitly reveals. The players' task of securely computing the function is made particularly difficult by the presence of bad players, who may do their utmost to compromise a good player's privacy or disrupt the correctness of the collaborative computation.

An enormous variety of computational tasks that one might wish to perform in a privacy-preserving manner can naturally be expressed as multiparty function evaluations. Still, we note that a multiparty secure function evaluation specializes the general notion of a secure protocol in two ways: by insisting that there be three or more players; and by demanding that what the players are trying to do is to compute some (deterministic) function. From our initial list of examples, the digital voting problem and the office assignment problem are multiparty secure function evaluations, while the remaining problems are not.

There is good reason to limit our discussions to multiparty secure function evaluations. As to our insistence on there being three or more players, it turns out that what is achievable in the two-party case and what is achievable in the multiparty case are qualitatively very different. In fact, for achieving the strongest notion of security, one needs to have three or more participants.^2 As to excluding "game-like" computations, this simplifies our exposition, distilling the "core" of the problem of secure protocols without the extraneous complications.

^2 Intuitively, when two parties communicate with one another over a clear communication channel, everything each player sends out to the other player is known by every player in the system. This "symmetry" means that there is no "partial knowledge" that can be exploited for the design of clever protocols, and, consequently, severely limits what can be securely computed by two parties communicating over a clear channel.

1.4 The Accomplishments of a Decade

It is one of the major triumphs of modern cryptography that the idea of performing secure function evaluation has not only been conceived, but, also, protocols have been offered for this ambitious goal. Let us review some of the most important advances.

Two-party computation. The consideration of the millionaires problem, the coin flipping problem, the oblivious transfer problem, the mental poker problem, and other specific computational problems led to Yao's recognizing that there is a general problem to be solved here — the problem of secure function evaluation for two or many players [Ya82a].

The task of devising a protocol to securely evaluate an arbitrary function seemed enormous. But in 1986, Yao proposed a protocol for exactly that, for the special case of n = 2 parties, under a specific cryptographic assumption (that factoring is hard). Yao's protocol made important use of the oblivious transfer primitive [Ra81], and it inaugurated the "garbled circuit" approach, which will be crucial for us.
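Since the garbled-circuit technique will recur throughout this thesis, a minimal sketch of the idea for a single gate may help. The sketch is ours, not Yao's actual construction: we stand in a SHA-256-derived pad for the encryption primitive, and we tag each row so the evaluator can recognize the one row that decrypts correctly.

```python
import hashlib, os, random

LABEL_LEN = 16      # bytes per wire label
TAG = b"\x00" * 8   # redundancy letting the evaluator spot the right row

def pad(key_a, key_b):
    # Pad derived from the two input-wire labels.  Illustrative only: a
    # real construction would use a proper pseudorandom primitive here.
    return hashlib.sha256(key_a + key_b).digest()[:LABEL_LEN + len(TAG)]

def xor(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

def garble_gate(gate):
    """Garble one two-input Boolean gate: each wire a, b, c gets a random
    pair of labels (one meaning 0, one meaning 1); the gate becomes four
    shuffled ciphertexts, one per input-bit combination."""
    labels = {w: (os.urandom(LABEL_LEN), os.urandom(LABEL_LEN)) for w in "abc"}
    table = []
    for i in (0, 1):
        for j in (0, 1):
            plaintext = labels["c"][gate(i, j)] + TAG
            table.append(xor(pad(labels["a"][i], labels["b"][j]), plaintext))
    random.shuffle(table)   # hide which row corresponds to which input bits
    return labels, table

def eval_gate(table, label_a, label_b):
    """Holding one label per input wire, recover the one output-wire label
    -- while learning nothing about the bits any of the labels stand for."""
    for row in table:
        plaintext = xor(pad(label_a, label_b), row)
        if plaintext.endswith(TAG):
            return plaintext[:LABEL_LEN]
    raise ValueError("no row decrypted cleanly")

# Garble an AND gate, then evaluate it on the labels for a=1, b=0.
labels, table = garble_gate(lambda i, j: i & j)
assert eval_gate(table, labels["a"][1], labels["b"][0]) == labels["c"][0]
```

Holding exactly one label per input wire, the evaluator learns exactly one output-wire label and nothing more; chaining such gates lets an entire circuit be evaluated "in the dark."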
Secure two-party computation has been investigated extensively by Kilian [Ki89]. He offers a protocol for secure two-party computation and proves precise claims about its properties. Instead of making a number-theoretic assumption, Kilian achieves his results under a model of computation which supports a simple abstract primitive — oblivious transfer.

In a two-party computation, if one of the parties stops speaking before the protocol is specified to end, we would like him not to earn, by his silence, an unfair information advantage over the other party. Yao first brought this consideration to light [Ya86]. Strengthening what a two-party computation should accomplish to be called "secure," Goldwasser and Levin investigate just how fair a two-party computation can be, and they show how to achieve such a high standard of fairness [GL90]. Their ideas are applicable not only to two-party computation, but also to multiparty computation when half or more of the players are bad.

Multiparty computation. In 1987, Goldreich, Micali and Wigderson proposed a protocol for solving the problem of multiparty secure function evaluation, and problems even more general than that [GMW87]. Their protocol was designed to overcome the influence of bad players, as long as they were in the minority. Significantly, it provided the paradigm on which successive solutions were based, a paradigm which is described in Section 3.1.

The protocol of Goldreich, Micali and Wigderson, as with Yao's protocol, requires a complexity-theoretic assumption (a trapdoor permutation, say). The assumption is used in several places: to provide for private communication between pairs of players; to permit oblivious transfer between pairs of players; and to implement the "garbled circuit" computation of Yao's two-party protocol. In 1988, several workers managed to banish the complexity assumption of the [GMW87] protocol by positing a richer communication model, in which private communication was provided for by the model itself. Under this richer model of computation, Ben-Or, Goldwasser and Wigderson [BGW88], and Chaum, Crépeau and Damgård [CCD88] proposed multiparty protocols which made no cryptographic assumptions and were designed to overcome the influence of bad players, as long as they constituted fewer than a third of all of the players.

Besides achieving error-free secure distributed computation, the protocol of [BGW88] demonstrated the power of the arithmetization of Boolean computation — the power of working over a finite field, instead of the Boolean domain, and exploiting its algebraic structure. It inaugurated the use of error correcting codes in this context. The arithmetization of Boolean computation has subsequently proven important in a variety of contexts.

Extending the protocols above. The fault tolerance of the [BGW88] and [CCD88] protocols was subsequently improved.^3 Through the development of a new verifiable secret sharing scheme, Rabin and Ben-Or [RB89] managed to match the fault-tolerance of the [GMW87] protocol under the communication model providing both broadcast and pairwise private communication. The paper of Galil, Haber and Yung [GHY87], among other contributions, made several improvements to the [GMW87] protocol, one idea of which we will make use; see Section 3.1.

Work which made the protocols above possible. A large number of ideas had to be in place before the protocols above could be conceived.
These ideas include the notion of secret sharing and verifiable secret sharing [Sh79, CGMA85], oblivious transfer [Ra81], probabilistic encryption [GM84], zero-knowledge proofs [GMR85, GMW86], and the slow revelation of a secret [Bl82, LMR83, BG89].

1.5 Limitations of Earlier Work

Limitations on definitions. The papers of the last subsection describe protocols. It is clear that these protocols have some remarkable properties. Of course the various authors have worked to describe what these properties are, offering definitions, to varying degrees of explicitness. But, in our view, no one succeeded in devising satisfactory definitions for the general problem at hand.

Not having satisfactory definitions for secure protocols is a major problem. Cryptographers have seen, in contexts like digital signatures, how a lack of well-planned definitions can lead them astray.^4 Without good definitions, there are no proofs, there can be no full understanding of the problem, and, eventually, no one understands what anything really means.

For secure function evaluation, definitions must be crafted with extreme care. Otherwise, they are likely to admit as "secure protocols" some protocols which we would like not to have this status; or they may exclude as being "secure" protocols which ought to be called secure; or protocols which achieve the definitions may fail to provably have properties one would expect a secure protocol to enjoy; or the definitions may not capture the appropriate intuition; or they may be too complicated to understand; and so forth.

^3 For error-free secure computation, the fault tolerance achieved by the [BGW88] protocol is, in fact, already optimal.

^4 Central contributions in getting this notion straight were due to Diffie and Hellman [DH76], Goldwasser, Micali and Yao [GMY83], and, finally, to Goldwasser, Micali, and Rivest [GMR88]. The notion of digital signatures is now well understood.

Thus — as I see it — a large body of beautiful notions and protocols has sat atop rather murky foundations. As a consequence of the lack of agreement on definitions, the protocols which were offered were to a large extent offered without proof that they accomplished any rigorously-specified set of goals.

Limitations on the protocols. There was another problem as well: all of the multiparty protocols which had been devised were computationally infeasible. To a large extent, this was because each minute step of the computation the players were interested in carrying out would manifest itself as additional communication rounds between the players. This is a serious problem, because in a network of communicating parties, such back-and-forth communication is usually the most costly resource.

1.6 Contributions of this Thesis

This thesis makes three contributions in the design and understanding of secure protocols.

► First, we define the notion of secure function evaluation, in detail and with care never before attempted. The definitions given in this thesis are for secure function evaluation under the communication model that players may speak privately to one another in pairs, or they may broadcast messages to everyone. More general notions in the same spirit as ours will be described in [MR91]. It should be emphasized again that achieving good definitions in this domain is a very tricky matter; it is here that the most delicate issues arise, and, indeed, the definitions given here are part of definitional work nearly two years in the making.
► Second, we offer a new protocol for multiparty secure function evaluation. This protocol has a major advantage over all previous protocols: it runs fast, in the sense of requiring little back-and-forth communication among the players. In fact, the protocol uses just a (fixed) constant number of rounds. This independence of the round complexity of the protocol from the computational complexity of the underlying function being evaluated is in sharp contrast with previous protocols, which used a number of rounds which grew directly with the circuit complexity of the function being evaluated. Not only does reducing the rounds to a constant amount to overcoming the main barrier to making secure protocols practical, but it also provides a key insight about the problem itself. Additionally, the new protocol is easier to analyze and make provable assertions about than previous complexity-theoretic proposals.

► Third, we prove that the formal notion of secure computation described is in fact achieved by the protocol presented, under the sole assumptions that a one-way function exists and that the majority of the players behave honestly. The latter assumption is not really a limitation, but a necessary consequence of the "strong" notion of security which our definitions are designed to capture.^5 To prove our protocol secure, we assume the correctness of some previous work on secure multiparty function evaluation. What exactly is assumed and what is achieved is stated precisely as Theorem 4.1.1, and Theorems 4.3.1 and 4.4.1, respectively.

Taken together, this research puts secure function evaluation on much firmer footing, and provides direction for the continued development of the area.

1.7 Related Work

Concurrent with the definitional work of Kilian, Micali and Rogaway [KMR90], Goldwasser and Levin independently proposed interesting definitions for secure function evaluation [GL90]. It is too early to assess how their definitions compare with ours. Early in our research we shared definitional ideas with Beaver, who later pursued ideas of his own in [Be91].

1.8 Organization and History of Thesis Results

Organization of thesis results. Paralleling the three contributions of Section 1.6, the notion of security is described in Chapter 2; the constant-round protocol is given in Chapter 3; and the proof that it is indeed a secure protocol is given in Chapter 4.

Publication history of thesis results. The definitions of Chapter 2 are a special case of notions developed by Micali and Rogaway. An extensive paper investigating these notions is currently undergoing revision [MR91]. At an earlier stage of this research, Micali and Rogaway collaborated with Kilian in developing definitions for secure function evaluation. The fruits of this collaboration are described in [KMR90]. The definitions offered here are more stringent than those of [KMR90], attempting to capture elements of the "ideal evaluation" of a function which this earlier work did not attempt to capture. (See Section 2.4.1 for an explanation of the ideal evaluation of a function f.)

^5 Intuitively, when half or more of the players may cheat, certain functions can be computed "more and more correctly" only as you spend more and more time trying to compute them. There is never a point in time at which the function is computed "totally correctly."
Since we want a strong notion of security, in which our protocols should stop, at some fixed point in time, with everyone knowing exactly what they should know, we are forced to accept that there be fewer than half faulty players. The first protocol to show how time spent interacting could be traded for increased correctness was due to Luby, Micali and Rackoff [LMR83], in the context of the coin flipping problem. Indeed it is necessary to pay for correctness with time for the coin flipping problem, as demonstrated by Cleve [Cl85]. Goldwasser and Levin, following Beaver and Goldwasser, have investigated just how strong a notion of security is achievable when there is a dishonest majority [BG89, GL90].

The constant-round protocol of Chapter 3 was developed jointly with Micali. Having heard from Beaver that he too had developed these same results, we thought it fit to produce a jointly authored proceedings version. However, written documentation subsequently provided to us by Beaver was only mildly related to our constant-round protocol [Be88b]. What we describe in Chapter 3 is a revised and simplified version of the protocol in [BMR90]. The contents of Chapter 4 — the proof of security of the constant-round protocol — have not appeared elsewhere.

Chapter 2

The Notion of Secure Computation

To a cryptographer, security means defeating an adversary. The stronger the adversary that can be defeated, the more secure the cryptosystem. Thus cryptographers try to dream up nastier and nastier adversaries, and then prove (sometimes under various assumptions) that these very strong adversaries are harmless nonetheless. In each context, one must carefully define what the adversary can do, and in what sense the adversary is rendered powerless. In this chapter, we carefully do this, in the context of secure function evaluation. We begin with a brief overview of our goal.

2.1 Overview

For secure function evaluation, achieving security means overcoming the influence of those who would try to compromise a player's privacy or disrupt the integrity of the collaborative computation. The stronger this adversary, the more difficult it is to overcome the effect of her activities, and the more meaningful the resulting notion of security. Specifying what a protocol we call secure must accomplish consists of specifying the abilities of the protocol's participants, specifying the abilities of the adversary, and saying in what sense the adversary is rendered harmless.

Powerful adversaries. The adversary we will consider will be extremely strong — the strongest "reasonable" adversary we can postulate. Roughly, the adversary is able to corrupt players at any time she wishes. When a player is corrupted, the adversary learns the state of the player at the time at which he was corrupted, and all future computation the corrupted player was responsible for is now controlled by the adversary. Effectively, the player has been turned into the adversary's loyal agent. To give our adversaries even more power, we assert that even though our protocols are synchronous (communication occurs in fixed increments of rounds), the adversary is granted a certain amount of asynchrony in her ability to corrupt players.^1 The only restrictions placed on the adversary are that there may be a bound on the number of players whom she is able to corrupt, and (for complexity-theoretic security) her local computation time must be "reasonable."

^1 In particular, we permit rushing — the ability of the adversary to corrupt a player at the end of round r and use this information to decide on additional players to corrupt during round r. Information gleaned from these players may in turn motivate additional corruptions, and so forth, until the adversary is done corrupting players for now. Then the outgoing messages are constructed and sent out on behalf of the corrupted players. All of these activities are completed before the beginning of round r + 1.
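Before the formal treatment (Sections 2.3.3 and 2.3.4), the corruption interface can be pictured schematically. The Python below is our own illustrative pseudocode — the class and method names, and the data shapes, are invented — showing only the adversary's side of the model:

```python
class Network:
    """Schematic synchronous network of players under adversarial attack.
    Only the corruption interface is sketched; local computation and
    message delivery are elided."""

    def __init__(self, player_states, corruption_bound):
        self.states = player_states     # player i's current private state
        self.corrupted = set()          # players the adversary now controls
        self.bound = corruption_bound   # cap on corruptions

    def corrupt(self, adversary, i):
        """Corrupt player i.  This may happen at any moment, even between
        the steps of a round ("rushing").  The adversary receives i's state
        as of right now -- coins already flipped but not recorded in that
        state are gone for good -- and henceforth computes and speaks on
        i's behalf."""
        if len(self.corrupted) >= self.bound:
            raise PermissionError("corruption bound exhausted")
        self.corrupted.add(i)
        adversary.learn(i, self.states[i])
```

Within a single round the adversary may interleave such corruptions with examining what she has learned (this is the rushing of the footnote above), and only afterwards compose the outgoing messages of the players she now controls.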
Regarding an adversary as a single agent, rather than many, makes the resulting notion of an adversary stronger, effectively permitting a maximal degree of surreptitious cooperation among the maliciously faulty participants to a protocol.

Defeating an adversary. To develop the notion of what it should mean to defeat such an adversary, we imagine an ideal protocol, in which computation is carried out by some external, trusted party. Privacy and correctness are non-issues in this scenario because the model provides for correct and private computation. Defeating an adversary ought to mean that the computation carried out by the protocol under attack by the adversary mimics the computation by the ideal protocol under attack by the adversary as closely as possible. In other words, a secure protocol succeeds in simulating the existence of an external trusted party, while actually trusting no one.

Making this precise entails specifying in what sense the computation of a secure protocol is "just like" the computation of the function by a trusted party. A secure protocol should be "just as private" and "just as correct" as with a trusted party. To define privacy, we choose a simulation-based viewpoint, motivated by Goldwasser, Micali and Rackoff's ideas developed in the context of zero-knowledge proof systems [GMR85]. Basically, a protocol is deemed private if, for any adversary, what she learns when executing with a protocol is nothing but a sample point of a distribution which she is entitled to sample. Correctness is then "interwoven" into the notion of privacy: the simulator existentially guaranteed for privacy is the principal object through which correctness is defined. We do this in a manner which preserves the idea, present in the ideal protocol, of sending a value off to the external trusted party, and then each player — even the bad players — getting (the right) value back from this agent.

Getting the notions right. In the context of secure protocols, "getting the notions right" is extremely delicate. Definitional issues are tricky both because of the inherent complexity of the setting (a distributed protocol under attack by a powerful adversary is a complicated object!), and because of the severity of the constraints one wants to put on the behavior of the protocol in the presence of the adversary (that is, one wants to ensure the highest possible standard for correctness as well as for privacy). Too weak a notion of security and "secure protocols" begin to pop into existence that one would not like to call secure; too strong a notion and "secure protocols" virtually drop out of existence.

Besides achieving the right balance between definitions which are too strong and too weak, there are a host of other issues in the crafting of good definitions. For example, composability and reducibility properties are important. Uniformly being able to treat complexity-theoretic security and information-theoretic security is desirable. Simplicity is very important. Model independence.
And there are many other concerns. Successful definitional work refines one's intuition about security, protocols, and adversaries, leading to substantially improved understanding.

2.2 Preliminaries

Before specifying what it means that our adversary "learns nothing," and that our protocol "computes something," we must introduce some basic language to talk about these things. Some of this is standard, but much has been tailored to our specific goal of defining secure computation.

2.2.1 Notation

Sets, strings, languages, and functions. An alphabet is a finite set. Fix the alphabet Σ = {0, 1}. For b ∈ Σ, we call b a bit and we let b̄ denote its complement. A string is a finite sequence of characters from some alphabet, and an infinite string is an infinite sequence of characters over some alphabet. A language is a set of strings. If A is a set, then A^n denotes the set which is the n-wise Cartesian product of A with itself. Thus Σ^n is the language of strings of length n over alphabet Σ, and we let Σ* = ∪_n Σ^n denote the set of all strings over Σ. We let Σ^∞ denote the set of infinite strings over alphabet Σ. The empty string (i.e., the length-0 sequence of characters) is denoted by λ. For A a set, fix the convention that A^0 is the singleton language {λ}. The notation 1^n represents n written in unary. If we write x ∈ Σ* where x is apparently not composed of characters of the alphabet Σ (e.g., x = 11#0a), then it is understood that the string x is encoded over Σ in some natural manner. If A is a set, 2^A is the set of all subsets of A.

If x and y are strings, xy denotes their concatenation. If x = a_1 ··· a_n is a string, a_i ∈ Σ, then x[i..j] (where i ≤ j) is the substring a_i ··· a_j, and lsb(x) = a_n. For x, y ∈ Σ* with |x| = |y|, x ⊕ y is the bitwise exclusive-or (XOR) of these strings. The NAND operator on bits gives the negation of their conjunction. When a symbol denotes a string or an infinite string, use of the same symbol with a subscript denotes the indicated character. This convention holds even if the symbol denoting the string already bears a subscript. For example, if r_i is a string or an infinite string, then r_{i1} is its first character.

The set of nonnegative integers is denoted N = {0, 1, 2, . . .}, and R is the set of real numbers. If a and b are integers, a ≤ b, we let [a..b] denote the set of integers between a and b, inclusive. By [a..∞) we denote the set of all integers greater than or equal to a, and by [a..∞] we denote the set of all integers greater than or equal to a, together with a point "∞". This set is ordered in the natural way, with all numbers n < ∞. For A and B ordered sets, the set A × B is ordered according to (a, b) < (a′, b′) if either a < a′, or a = a′ and b < b′. If A and B are sets, A − B denotes the elements of A which are not in B. For f : A → B a function, f is automatically extended to a function on subsets of A according to f(X) = {f(x) : x ∈ X}, and f^{−1}(y) = {x : f(x) = y}.

Vectors. As usual, we denote a vector by a letter topped with an arrow symbol. The same letter without the arrow but with a subscript denotes the indicated component. For example, if x⃗ is a vector, x_i is its i-th component. An n-vector x⃗ has n components, x_1, . . . , x_n. If x⃗ and y⃗ are n-vectors, x⃗y⃗ denotes the n-vector whose i-th component is x_i y_i. Fix n, and let T ⊆ [1..n]. Then we write T̄ for [1..n] − T. If x⃗ is an n-vector and T ⊆ [1..n], we define the tagged vector x⃗_T = {(i, x_i) : i ∈ T}.
(That is, x⃗_T keeps track of the indices as well as the values.) If x⃗ and x⃗′ are n-vectors and T ⊆ [1..n], then x⃗_T ∪ x⃗′_T̄ can be regarded as an n-vector y⃗ where y_i = x_i if i ∈ T and y_i = x′_i otherwise. For f : (Σ^k)^n → (Σ^l)^n and T ⊆ [1..n], f_T is the function defined by f_T(x⃗) = (f(x⃗))_T. (That is, f_T(x⃗) is the tagged T-coordinates of the image of x⃗ under f.)

Probability. All probability spaces we will be concerned with have underlying set Ω = Σ* and σ-field 2^{Σ*}. Thus we won't distinguish between a probability space and the probability measure of a space. Our probability measures will all have finite support (i.e., the measure is nonzero only on finitely many points). The probability of an event E ⊆ Σ* with respect to some measure μ is written Prob_μ[E], or simply Prob[E] when the measure μ is understood. If E is an event of some probability space, Ē is the event which is the set-theoretic complement of E in the underlying set. A random variable is any function (not necessarily real-valued) on the underlying set of a probability space. The expectation of a real-valued random variable X is denoted EX. By U_k we denote the uniform distribution on Σ^k; that is, U_k is the probability measure which assigns mass 2^{−k} to each string of length k, and assigns mass 0 to all other strings.

2.2.2 Basic Notions

This subsection defines some notions fundamental for stating our results: these are the notions of function families, circuits, negligibility, and ensembles.

Function families. A finite function is a map f_{nkl} : (Σ^k)^n → (Σ^l)^n, or a map f_{nkl} : (Σ^k)^n → Σ^l. The former is a vector-valued finite function, while the latter is a string-valued finite function. Since we are devising definitions tailored for distributed computation by three or more parties, we consider families of functions, each a function of three or more strings. Suppose L ⊆ Σ*, with each c ∈ L specifying values n_c ≥ 3 and k_c, l_c ≥ 0, in some natural manner. Let f = {f_c} be a collection of functions, one for each c ∈ L. If each f_c is a vector-valued finite function f_c : (Σ^{k_c})^{n_c} → (Σ^{l_c})^{n_c}, we say that f is a vector-valued function family; if each f_c is a string-valued finite function f_c : (Σ^{k_c})^{n_c} → Σ^{l_c}, we say that f is a string-valued function family; in either of the above two cases we say that f is a function family.

Circuits. A circuit C is a computing device specialized for computing a function from a fixed number of bits to a fixed number of bits. It is a (finite) labeled directed acyclic graph. Each node is labeled by a symmetric Boolean operator drawn from some fixed set of Boolean operators, such as AND, OR, XOR, and their negations. Input nodes (those with in-degree zero) are labeled x_1, . . . , x_i, and output nodes (those with out-degree zero) are labeled y_1, . . . , y_o. Circuits provide a convenient encoding for finite functions. A circuit C on i inputs and o outputs computes a function C : Σ^i → Σ^o in the natural way. For i = nk, o = nl, C can be regarded as computing a vector-valued finite function C : (Σ^k)^n → (Σ^l)^n. For i = nk, o = l, C can be regarded as computing a string-valued finite function C : (Σ^k)^n → Σ^l. If C is a circuit, then |C| is its size — the number of gates in C plus the number of wires in C, say — and depth(C) is its depth — the length of a longest path from an input node to an output node. If C is a circuit, we also write C to denote a string describing it in some standard encoding.

What is fast?
We adopt the notion that the polynomiality of an algorithm captures its running in a "reasonable" amount of time. In this thesis, "polynomial" will always mean a nonnegative-valued polynomial in a single variable. It will be convenient for us to speak of the time complexity of algorithms which have infinite strings as inputs. To allow such discourse, we always measure the time complexity of a function in terms of the length of its first argument, and we assume that each argument to an algorithm can be efficiently scanned from left to right. More precisely, let x_1, x_2, . . . , x_m be finite or infinite strings, the first of which is a finite string. We say that a string-valued function M on such tuples of strings (x_1, . . . , x_m) is polynomial-time computable if there exists a polynomial Q and a Turing machine M, having m input tapes, such that M computes M(x_1, x_2, . . . , x_m) within Q(|x_1|) time steps when M is begun in its initial configuration with its i-th input tape initialized to the string x_i.

What is small?

We will say that a function ε : N → R is negligible if it is nonnegative and vanishes faster than the inverse of any polynomial: for any c > 0 there exists a K ∈ N such that ε(k) < k^{−c} for all k > K. A function ε(k) which is not negligible is called nonnegligible.

There is, of course, some arbitrariness in this notion of negligibility. An alternative advocated by Levin (e.g., [Le85]) says that a function is negligible if it vanishes faster than the inverse of any function in some fixed resource class R, where R is required to satisfy certain properties. Fortunately, the notions we develop here are essentially independent of the particular definition selected for negligibility.

2.2.3 Indistinguishability of Ensembles

A central notion in defining secure protocols is indistinguishability, as introduced by [GM84] in the context of encryption. (The notion has also proven crucial for the complexity-theoretic treatment of pseudorandom generation [Ya82b] and for zero-knowledge proofs [GMR85].) Essentially, it captures the fact that two families of probability spaces can be (asymptotically) so close as to be considered insignificantly different. To say this exactly requires some specialized language — the notion of a distinguisher and of a probability ensemble. The reader who wishes a bit more discussion about this notion may consult [GMR85] (pages 191-193).

Distinguishers. A distinguisher is our formalization of a "judge" who votes to decide among two competing alternatives. As will be discussed shortly, our notion of a distinguisher is a "nonuniform" one. A distinguisher D^a is an (always halting) probabilistic algorithm, D, together with an infinite "advice" string, a. The algorithm D takes one or more (possibly infinite) strings, x_1, x_2, . . . , and uses its infinite sequence of random bits, r_D, and its infinite string, a, to compute a value D(x_1, x_2, . . . , a, r_D) ∈ {0, 1}. A distinguisher D^a is polynomial-time if D is polynomial-time. (Recall that this means polynomial-time in the length of the first argument.)

Ensembles. If 𝓛 = {L_k : k ∈ N} is a family of languages, then an ensemble E (over 𝓛) is a collection of probability measures on Σ*, one for each (k, ω) ∈ N × L_k; that is, E = {E_k(ω) : k ∈ N, ω ∈ L_k}. The argument k is called the index of the ensemble E, the argument ω is called the parameter of E, and 𝓛 is called the parameter set of E. As the index k is always drawn from the index set N, we never specify it, writing E = {E_k(ω) : ω ∈ L_k} to indicate an ensemble over 𝓛.
When the parameter set of an ensemble is understood and there is no danger of confusion, we refer to an ensemble by writing the symbol "𝓔" in front of its "generic element" — that is, we simply write 𝓔E_k(ω) instead of {E_k(ω) : ω ∈ L_k}.

The above notion of an ensemble applies only to distributions on strings. However, the notion is trivially extended to any other domain whose elements can be canonically encoded as strings. We will thus speak of ensembles on other domains, where it is understood that there is, implicitly, a fixed encoding of domain points into strings.

If E and E′ are probability ensembles over a common parameter set 𝓛 = {L_k}, then E × E′ is an ensemble over 𝓛, where points in Σ* encode pairs of points in Σ* by some fixed, natural encoding. This ensemble is defined by asserting that

    Prob_{(E×E′)_k(ω)}[(x, x′)] = Prob_{E_k(ω)}[x] · Prob_{E′_k(ω)}[x′].

The notion generalizes in the obvious way to arbitrary finite products. The notation E^n denotes an ensemble which is the n-wise product of E with itself. If f : Σ* → Σ* is a function on strings and E is an ensemble, then f(E) is an ensemble in the natural way, with Prob_{(f(E))_k(ω)}[x] = Prob_{E_k(ω)}[f^{−1}(x)]. If 𝓔E_k(ω) is an ensemble and A = {A_k ⊆ Σ*} is a family of events, we say that A occurs almost certainly if ε(k) = sup_{ω∈L_k} Prob_{E_k(ω)}[Ā_k] is negligible.

Sometimes a simpler notion of an ensemble will do, ensembles which have no parameter ω. In this case, 𝓔E_k is simply an N-indexed family of probability measures on Σ*, and all subsequent definitions are made meaningful by interpreting the parameter set of the ensemble to be a collection of singleton languages.

As an example of an unparameterized ensemble, 𝓔U_k is the ensemble of uniform distributions on k-bit strings. As an example of a parameterized ensemble, fix c and define E_k(x_1 ··· x_{k^c}) (for |x_i| = 1) as the distribution on tuples (n, X_1, . . . , X_{k^c}) given by first selecting n to be the product of two random k-bit primes, then selecting X_i to be a random residue modulo n if x_i = 0, and selecting X_i to be a random nonresidue modulo n of Jacobi symbol +1 if x_i = 1. Then 𝓔E_k(x) is an interesting ensemble over {L_k = Σ^{k^c}}.

Computational indistinguishability. We now specify what it means for two ensembles to be indistinguishable to an observer with bounded computational resources.

Definition 2.2.1 Let E and E′ be ensembles. We say that E and E′ are computationally indistinguishable, written E ≈ E′, if the ensembles are over the same parameter set 𝓛 = {L_k}, and for every polynomial-time distinguisher D^a,

    ε(k) = sup_{ω∈L_k} | E D(1^k, E_k(ω), ω, a) − E D(1^k, E′_k(ω), ω, a) |

is negligible.

When we wish to emphasize the sequence of languages {L_k} which parameterizes the ensembles E and E′ we write E ≈_𝓛 E′. Two ensembles over the same parameter set 𝓛 which are not computationally indistinguishable are called computationally distinguishable. This is written E ≉ E′.

The notion we have defined for computational indistinguishability is a nonuniform notion — possibly, the ensembles appear different to the resource-bounded judge only by virtue of the advice string a. Nonuniform notions of indistinguishability have more commonly been defined by polynomial-size circuit families. We find the phrasing above more convenient, because it is more natural for an infinite string to be an input to an algorithm than to a circuit.

Uniform notions of computational indistinguishability — where the distinguisher does not have benefit of the advice a — are also possible. In fact, all results of this thesis hold equally
In fact, all results of this thesis hold equally 26 in the uniform model. However, for economy of notions, we choose to describe only the nonuniform notion of security. Some reasons for favoring the nonuniform notion of security over its uniform counterpart are given in Section 2.3.3. ► We remark that the following two variations of the concept of indistinguishability are possible without affecting which pairs of ensembles are indistinguishable and which are not. First, the distinguisher need not be probabiUstic: the advice a can always be used to specify a "good" set of coin flips to use. Second, instead of saying that a single infinite advice string a is associated to the algorithm D, we could instead associate an advice string for each value k. To distinguish Ek{uj) from El{u), the distinguishing algorithm D would be given 1*, the sample point, the parameter u, and a^. To see that this "^-dependent advice" does not improve one's ability to distinguish ensembles, note that, by the standard diagonalization method, an infinite sequence of infinite advice strings {a,,} can be encoded into a single infinite advice string a in such a way that the overhead to read a bit ak[i] from the advice string a which encodes it is only quadratic in k and i. Statistical indistinguishability. We define a stronger notion of indistinguishability, one that does not depend on the resource bounds of the observer. Definition 2.2.2 Let E and E' be ensembles. We say that E and E' are statistically indistinguishable, written E ~ £', if the ensembies are over tiie same parameter set £ = {ijt}, and for every distinguisher D", €(fc)= sup \ED{l',Ekiuj),u!,a)--ED{l'',E',{u),u;,a)\ is negligible. It is an easy theorem that ensembles €Ek{ij) and SE[{u) over {Lk} are statistically indis- tinguishable if an only if e(A;) = sup^gj,^(E.eK- |Prob£,(^)[x] - Prob£^(^)[a;]) is negligible. 2.3 Protocols and Their Adversaries In this section we describe not only the communication mechanism provided to the agents of a collaborative computation, but also the adversary which may attack these agents — for one has not described the behavior of a protocol until it is specified how it runs in the presence of the adversary. The adversary we consider is extremely strong, which makes our definition of security much more meaningful. On the other hand, given that secure protocols are non-trivial to find, we at least make them easier to write by providing a generous communication model. In essence, we allow each player both to privately talk to any other player and to talk "aloud," so that all other players will agree on what was said and on who said it. 27 Summary of the model. The players (or processors) are the agents who wish to carry out the joint computation. The collection of players is the network, and the program that the players run is called a protocol. To be certain that a protocol can be easily described, each player runs the same program; it is made specific to the player who is running it by letting the player be aware of his own identity. The players have two resources at their disposal: the ability to compute locally, and the ability to communicate with one another. To communicate, the processors may talk privately to one another in pairs, or they may announce messages to the community as a whole. When they do broadcast a message, everyone knows who sent it. As a player computes, his computational state progresses in time. 
One might imagine that this computational state should progress as the communication rounds progress, but instead we formalize matters with a finer level of granularity, thinking of a processor as carrying out many computational steps within a single round. These steps consist of flipping a coin or applying some deterministic function to his computational state. When the round is over, the messages a player sends out to the other players are functions of his computational state, and the messages a player receives from other players — functions of their computational states — augment his computational state in a timely manner.

Initially, each player knows only his own initial computational state. The information this contains is his identity, i, his private input, x_i, and a string called the common input, c, which is shared by all of the players. The common input contains such information as the number of players in the network, n, and a description of the function they wish to compute, f_c. For simplicity, we insist that players' private inputs are all of the same length. While it is not important that all inputs be of the same length — this could be accomplished by padding the inputs which are too short — it is important that the players know a bound on the length of the longest possible private input. That this is a natural and reasonable restriction can be motivated as follows. One thing a player i learns about another player j when interacting with him is the number of bits which were sent out by j and received by i. For some functions, player j must send out at least as many bits as his private input is long — even if there were no privacy constraint at all. A convenient way to make sure that this degree of exposure of one's input is not considered to be damaging is to say that bounds on input lengths were already known in advance by each player.^2

Our goal in this chapter is to properly define those protocols which are robust against the incorrect behavior of some of their participants. We adopt an extremely pessimistic view of how players may diverge from faithfully executing a protocol. Namely, we imagine that the players are not the only agents involved in the joint computation: also present is an adversary, who runs a program the players know nothing about. The unpleasant trait of an adversary is her ability to corrupt players. When a player is corrupted, he is made the adversary's loyal servant, with the adversary subsuming all communication and computation associated with the corrupted player. Additionally, the adversary is handed the private state of the corrupted player at the time at which the player was corrupted.

The computation proceeds in rounds. The adversary runs; then the players run; then the adversary runs; then the players; and so forth, until all the players have terminated. At this point, the adversary is given one last run, and then the protocol has terminated. Within each of these adversary and player rounds, a good deal of activity may go on — as players do computation, and the adversary computes and corrupts processors. We will be interested in how much the adversary can learn as a result of her activities, and what the good players compute, despite the interference of the adversary. To concretize these ideas, we now proceed more formally.

^2 When this restriction is not made, the problem of secure computation changes radically in character. See [CGK90] for some work in this framework.

2.3.1 Protocols

Protocols for n-parties.
In this section we describe what a protocol for a fixed number of players is. Later we discuss protocols for arbitrary numbers of players.

Recall that Σ is the binary alphabet, and words over this alphabet are indicated by writing that they belong to Σ*. However, we will sometimes indicate that words which are apparently not over the binary alphabet are to be considered as belonging to Σ* — for example, we might write 0#0+1 ∈ Σ*. When we do this, it is understood that the symbols which constitute the "expanded" alphabet over which our strings are drawn are encoded as strings over Σ by some fixed, natural encoding scheme.

In the definition that follows, the protocol P is the main object of interest. It specifies how the computational states of the players are to progress in time. Specifying a protocol means defining the map P. The interaction functions are the "glue" that allows applying the function P repeatedly to capture the network's state progressing in time.

Definition 2.3.1 An n-party protocol is a Turing-computable function

    P : Σ* × Σ* → Σ*

(taking a common input and a current state to a new state). A network interaction function is any of the following polynomial-time computable functions:

1. a next-action function η : Σ* → {compute, flip-coin, round-done, protocol-done},
2. a broadcast messages function M : Σ* → Σ*,
3. a private messages function m : Σ* × [1..n] → Σ*, and
4. an output function o : Σ* → Σ*.

A player configuration is an element of

    Σ* × Σ^∞ × Σ^∞

(a player's state, his remaining coins, and his history).

Notation. In place of m(s_i, j) we will write m_j(s_i).

Discussion. The next subsection describes, formally, how a protocol runs. Here we give a sketch of this. To run a protocol, each player starts off in some initial computational state.
At this point (protocol-done), the output function o defines the player's private output; previous to this, the output function is not meaningful. The protocol terminates when every player is protocol-done.

As the preceding discussion suggests, a protocol P can only be run with respect to a fixed set of interaction functions. When combined with P, these interaction functions specify how a player's computational state is to progress in time, affecting the computational states of other players in the network. Thus we might have considered a protocol to be a five-tuple (P, η, M, m, o) consisting of the protocol algorithm proper and the collection of interaction functions. However, it is more convenient to require that the interaction functions be fixed functions, good for any protocol. This way, for example, protocols can more easily be composed with one another: there is no danger that two protocols employ different conventions on how processors communicate, say.

We have not specified these interaction functions, since the particular conventions chosen for defining them are not important. Each function should specify its range value in a natural and simple manner from its domain value. For example, with ⟨, ⟩, and the comma all being formal symbols, we might say that if s_i contains one and only one occurrence of a substring ⟨j⟩, then η(s_i) = compute if j = 1, η(s_i) = flip-coin if j = 2, η(s_i) = round-done if j = 3, and η(s_i) = protocol-done otherwise; if s_i contains one and only one occurrence of a substring ⟨μ⟩, then M(s_i) = μ, and otherwise M(s_i) = Λ; if s_i contains one and only one occurrence of a substring ⟨j, μ_j⟩, for j ∈ [1..n], then m_j(s_i) = μ_j, and otherwise m_j(s_i) = Λ; and if s_i contains one and only one occurrence of a substring ⟨ζ⟩, then o(s_i) = ζ, and otherwise o(s_i) = Λ. This was just an example of how the interaction functions could be defined; the specific choice of conventions is irrelevant.

We have not yet described a player's configuration. It captures his current computational state, his future coin tosses, and his history. The history is information associated to a player which is not relevant to the task at hand. The presence of the history is useful for properly dealing with protocol composition, and for proving certain properties of secure protocols. For example, when a protocol is called as a subroutine, the "saved state" which is irrelevant to the subroutine call is tucked away in the history. Since we will not be concerned with subroutine calls in this thesis, the saved state is of no real importance. However, we keep this information around because it plays a significant role in the more general theory of secure protocols.

We now define more formally how a protocol runs in the absence of an adversary.

2.3.2 Executing Protocols in the Absence of an Adversary

Notice how, in the formalism for a protocol, we have a fine level of granularity in how a protocol runs — all the way down to individual coins being tossed. We could have tried to be more succinct, letting a player's computational state progress from round to round, with a player doing all the necessary computation "in one shot." However, the approach selected turns out to be more symmetric with the natural way of phrasing the adversary's behavior — she gets information repeatedly from the players within a single round.
This approach also serves to emphasize that a player may choose to remember a coin toss or he may choose not to, but — as we shall see when we discuss the adversary — a coin, once flipped, is not recoverable by the adversary except to the extent that it is stored in the player's computational state.

Thus we index the player configurations by two superscripts. The first index, r, is a counter for the round number. The second index, ρ, is the "micro-round" number, formalizing this finer granularity as the protocol flips coins and does computation. We allow the micro-round to take on the formal value ∞. If a player's computation for round r is completed at some micro-round ρ, then all subsequent computational states for micro-rounds during round r, including that at time (r, ∞), are fixed. This is notationally convenient.

Configuration sequences. An n-party protocol P generates, from n initial player configurations (C_1^{00}, ..., C_n^{00}), a sequence of configurations {C_i^{rρ}: i ∈ [1..n], r ∈ ℕ, ρ ∈ [0..∞]}, which we now describe. We remark that some configurations may fail to be defined by the recurrences below; this will be dealt with later. We note that the character r is somewhat overworked: with a subscript i, it indicates player i's random tape; as a superscript, it indicates a round number. This should cause no confusion. Recall that, if r_i (for example) is an infinite string, then r_{ij} is its jth bit. Finally, the symbols {#, •, ∗, ◾}, which appear here and elsewhere, are all just formal punctuation symbols.

Fix the notation C_i^{rρ} = (s_i^{rρ}, r_i^{rρ}, π_i^{rρ}) for player configurations, and let

$$
M_i^r = \begin{cases} M(s_i^{r\infty}) & \text{if } r > 0 \text{ and } \eta(s_i^{r\infty}) = \text{round-done}\\ \Lambda & \text{otherwise} \end{cases}
\qquad
m_{ij}^r = \begin{cases} m_j(s_i^{r\infty}) & \text{if } r > 0 \text{ and } \eta(s_i^{r\infty}) = \text{round-done}\\ \Lambda & \text{otherwise} \end{cases}
$$

for broadcast messages and messages sent out along private channels. (Intuitively, the former is what processor i "tries" to broadcast at the end of round r, and the latter is what processor i "tries" to send to j at the end of round r.) Let the common input, c, be the delimited prefix of s_i^{00}. Then the players' configurations progress as follows:

$$
C_i^{r(\rho+1)} = \begin{cases}
(P(c, s_i^{r\rho}),\; r_i^{r\rho},\; \pi_i^{r\rho}) & \text{if } \eta(s_i^{r\rho}) = \text{compute and } r > 0\\
(s_i^{r\rho} * r_{i1}^{r\rho},\; r_{i2}^{r\rho} r_{i3}^{r\rho} \cdots,\; \pi_i^{r\rho}) & \text{if } \eta(s_i^{r\rho}) = \text{flip-coin and } r > 0\\
C_i^{r\rho} & \text{otherwise}
\end{cases}
$$

$$
C_i^{r\infty} = C_i^{r\rho} \quad \text{if } \exists\, \rho \in \mathbb{N} \text{ s.t. } r = 0 \text{ or } \eta(s_i^{r\rho}) \in \{\text{round-done}, \text{protocol-done}\}
$$

$$
C_i^{(r+1)0} = \begin{cases}
(s_i^{r\infty} * M_1^r * \cdots * M_n^r * m_{1i}^r * \cdots * m_{ni}^r,\; r_i^{r\infty},\; \pi_i^{r\infty}) & \text{if } \eta(s_i^{r\infty}) = \text{round-done or } r = 0\\
C_i^{r\infty} & \text{otherwise}
\end{cases}
$$

If any configuration fails to be defined by the recurrences above, then the protocol is said to have diverged (on this execution).

Nomenclature. As mentioned, the first superscript indexes the round while the second superscript indexes the micro-round. The configuration C_i^{r0} is called the configuration of party i at the beginning of round r, while C_i^{r∞} is the configuration of party i at the end of round r. The configuration C_i^{00} = C_i^{0∞} is the initial configuration of party i. If the round-r, micro-round-ρ configuration of party i is C_i^{rρ} = (s_i^{rρ}, r_i^{rρ}, π_i^{rρ}), then we refer to s_i^{rρ} as the computational state of player i at this point in time. The string M_i^r is the message broadcast by i in round r, and the string m_{ij}^r is the message sent from i to j in round r.

Executing protocols. In the absence of an adversary, an execution of an n-party protocol P with common input c = 1^k#1^n#1^ℓ#1^l#1^m#C, private inputs x_1, ..., x_n ∈ Σ^ℓ, histories π_1, ..., π_n ∈ Σ^m, and coin sequences r_1, ..., r_n ∈ Σ^∞ is the sequence of configurations {C_i^{rρ}} generated by P when the initial configuration of party i is taken to be C_i^{00} = (c‖x_i‖i, r_i, π_i).
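The recurrences above amount to a simple round loop: run each player's micro-rounds, then deliver broadcasts and private messages by appending them to the receivers' states. The following toy rendering reuses run_player_round from the earlier sketch, under the same illustrative conventions; the max_rounds cap stands in for the termination requirement imposed later.

```python
# Schematic round loop for an n-party protocol in the absence of an adversary.
# Players compute within a round; at the round boundary each player's broadcast
# and private messages are appended to the receivers' computational states.

def run_network(states, coin_iters, P, eta, M, m, common_input, max_rounds=100):
    """states: list of n computational states; m(state, j) is that player's message to j."""
    n = len(states)
    for _ in range(max_rounds):                  # defensive cap; real protocols must terminate
        if all(eta(s) == "protocol-done" for s in states):
            break
        # each still-active player runs its micro-rounds for this round
        for i in range(n):
            if eta(states[i]) != "protocol-done":
                states[i] = run_player_round(states[i], coin_iters[i], P, eta, common_input)
        # read off outgoing messages (Lambda, i.e. "", for terminated players by convention)
        broadcasts = [M(s) for s in states]
        privates = [[m(states[i], j) for j in range(n)] for i in range(n)]
        # deliver: append all broadcasts, then the private messages addressed to j
        for j in range(n):
            if eta(states[j]) != "protocol-done":
                states[j] = (states[j] + "".join(broadcasts)
                             + "".join(privates[i][j] for i in range(n)))
    return states
```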
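The common-input format fixed in the preceding paragraph is easy to read off mechanically. A toy parser, with field names of the sketch's own choosing:

```python
# Sketch of reading the fields off a common input of the form
# 1^k # 1^n # 1^len # 1^out # 1^m # C, as described above.

def parse_common_input(c: str):
    parts = c.split("#", 5)
    if len(parts) != 6 or any(set(p) - {"1"} for p in parts[:5]):
        raise ValueError("not a well-formed common input")
    k, n, inp_len, out_len, hist_len = (len(p) for p in parts[:5])
    return {"security": k, "players": n, "input_len": inp_len,
            "output_len": out_len, "history_len": hist_len, "function": parts[5]}

# e.g. parse_common_input("1111#111#11#11#1#XOR")
#  -> {'security': 4, 'players': 3, 'input_len': 2,
#      'output_len': 2, 'history_len': 1, 'function': 'XOR'}
```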
The set of executions of P with common input c, private inputs x⃗, histories π⃗, and all possible coin sequences r_1, ..., r_n ∈ Σ^∞ enjoys a probability measure, induced by taking each bit of each r_i to be selected uniformly and independently. In the absence of an adversary, executing a protocol P with common input c and private inputs x⃗ means sampling according to this distribution.

The values specified on the common input c in an execution of an n-party protocol are called the security parameter, k, the number of players, n, the input length, ℓ, the output length, l, the history length, m, and the function description, C. Any of these values may take a subscript c to emphasize its being specified on the common input c. Since we will be interested in secure computation of functions, C_c will specify — somehow — a vector-valued or string-valued finite function, C_c: (Σ^{ℓ_c})^{n_c} → (Σ^{l_c})^{n_c} or C_c: (Σ^{ℓ_c})^{n_c} → Σ^{l_c}.

Message delivery. The string M_i^r is the message that processor i broadcasts at the end of round r, or Λ if processor i does not broadcast a message in round r. The string m_{ij}^r is the message that processor i sends to processor j at the end of round r, or Λ if processor i does not send a message to processor j in round r. The empty message is delivered to each processor in round 1, since no activity occurred in round 0 (see below).

Discussion. As mentioned, the history of a processor is thought of as information which a processor may possess which is not relevant to the task at hand, but which is nonetheless part of the processor's configuration; for example, it might be the saved state before a subroutine call, when the currently executing protocol is really just a subroutine. To ensure that a processor does not "use" information it should not use, we do not include the history in a processor's computational state. But, as we will see, it is there in the sense that when a processor is corrupted, its history is made available to the adversary. We note that we could, alternatively, have said that some properly-marked portion of each processor's computational state is not allowed to be "used" by the protocol P. Saying this formally is more awkward than the approach taken.

We have established the convention that a protocol begins with the players executing a "dummy" round, round 0; during this round, "nothing happens." The presence of this void round facilitates bringing the adversary into the model of computation in a manner which allows for clean protocol composition. This will not, however, be a concern for us here.

2.3.3 Adversaries (Informal Treatment)

Modeling errors. So far we have described how a fault-free network operates. We now consider the possibility of some players becoming "bad" in an execution of a protocol — that is, of deviating from their prescribed instructions. In fact, the goal of this chapter is to properly define those protocols that can "withstand" the action of some bad players.

How powerful should we let these bad players be? In some scenarios the only natural way for a processor to deviate from a protocol is by ceasing all communication, as in the case of a computer "crash." Alternatively, processors may start sending messages "at random," corresponding — for example — to having some short-circuited register. If people are behind their processors, it is safer to consider more "malicious" deviations. This possibility, clearly subsuming the previous ones, is the one we focus on.
Our goal is in fact to reach the strongest natural notion of security, so that a protocol satisfying our definitions may be safely and easily used in any natural context. We thus allow bad players to deviate from their prescribed instructions in any way — the only constraints we consider are that even bad processors may (perhaps) be computationally bounded, and that there may be a limit on the number of bad players possible. We also allow bad players to secretly cooperate with each other. Actually, to guarantee their "perfect cooperation," we envisage a single agent, the adversary, who during an execution of a protocol may corrupt and control players.

Let us now address an equally important question: when can the adversary corrupt players? One possibility is to consider a static adversary, one who can choose and corrupt a subset of the players only at the start of a protocol. Since any real adversary may be expected to make an effort to corrupt those players whose corruption would be most beneficial to her for the current execution, a better possibility is to consider a dynamic adversary, one capable of corrupting players at arbitrary points during the execution of a protocol, based on the information acquired from previous corruptions. This appears to capture the worst natural model of malicious behavior which one might hope to defeat in our scenario.² Such an adversary is provably more powerful than a static one (see [MR91]), and security with respect to a dynamic adversary is both harder to achieve and harder to properly define.

If an adversary were allowed to corrupt all players, then nothing could be said about the behavior of the network. Thus the number of corruptible players will appear in our definition of security. We now refine these ideas until we are ready to specify a "mathematical" description of an adversary, and of how a protocol runs in the presence of such an agent.

²For the Byzantine agreement problem [PSL80] — the problem perhaps most responsible for clarifying notions of adversarial behavior — even stronger adversaries can be postulated and defeated, including an adversary capable of "seeing" the internal states of all players, but capable of gaining control of the output of only a certain fraction of them.

Figure 2.1: An execution of a protocol with the adversary. The execution alternates: start; players' round 0; adversary's round 0; players' round 1; adversary's round 1; ...; players' round w; adversary's round w; end. The players' 0th round is a round in which there is no activity — so, effectively, the adversary begins and ends the execution of a protocol.

Adversaries interacting with networks. Think of an adversary A as an abstract machine interacting with the participants of a network in a prescribed way. This way entails the players and the adversary alternating periods of activity, as suggested by Figure 2.1. In the beginning, the adversary and all the players in the network are quiescent; thereafter, the adversary and players take turns being active. At the outset, all of the players are good, and so they remain unless the adversary corrupts them. Initially, all the information the adversary has on which to base these corruptions is the same common input which the players share. (If the players are entitled to this information without doing any work, so is the adversary!) At the start of every round, the adversary is quiescent.
Once all of the still-good players have finished their activities for this round, having well-defined outgoing private messages and broadcast messages, the players go to sleep and adversary A is awakened; she receives all broadcast messages just computed, together with all of the messages the players composed for already-corrupted processors. The adversary may then choose to corrupt some new processor i. When she does so, within the same round, she learns all of i's internal state (his computational state, history, and initial private input), all of the messages which were just sent out by processor i, and all messages which were just sent to processor i. After this, still within the same round, A can corrupt, one at a time and exactly as before, additional players, until she does not want to corrupt any more of them. At this point, the adversary composes all outgoing messages from the bad players to the good, and it is these messages (together with the messages sent by good players) which will be delivered to the players in the next round.

This process continues until all of the processors have either been corrupted or have halted. Then the adversary is given one last period of activity. After this, the protocol is said to have terminated.

To formalize what we have just described, we will say that the execution of a protocol in the presence of an adversary follows the sequence of "macro-rounds" shown in Figure 2.1. Within each macro-round, there may be many "micro-rounds," in which the players perform their local computations, or the adversary computes and corrupts various players, in sequence. We choose to think of the players as having a 0th round in which no activity occurs; after that void round, it is the adversary's turn to be active.

The adversary's advice. To obtain a robust notion of security, we demand that our protocols remain secure even if there is some information a_A known to the adversary in advance of the protocol's execution. Oren has pointed out the importance of providing such advice in the context of zero-knowledge proof systems [Or87]. The adversary's advice might, for example, consist of information which the adversary has somehow come to acquire about the input vector x⃗ — perhaps from previous executions of other protocols. Or it might be information that depends on the security parameter k, which, perhaps, the adversary worked for years to come up with in a "preprocessing stage" of her attack on the protocol. The advice is doled out in the same manner as the coin flips: when the adversary asks for the next advice bit, it is given to her and vanishes from the infinite string of advice remaining.

Why the nonuniformity? Adversaries like the one we have described — who may possess some fixed information before the protocol begins, information, possibly hard to compute, tailored to the choice of security parameter or, in our case, to the private inputs and processor histories as well — are called nonuniform adversaries (because, traditionally, such adversaries have been modeled by (possibly nonuniform) polynomial-size circuit families). Nonuniform adversaries have several advantages over their uniform counterparts, advantages which we now enumerate.
Most importantly, since a cryptosystem is generally run for a particular choice of security parameter, one would be unhappy with a protocol which was only secure against uniform adversaries: a sufficiently committed attacker would mount an attack that would break the cryptosystem itself, a much worse break than just breaking a particular usage of the cryptosystem. Secondly, proofs are frequently simpler or more natural in the nonuniform model. Third, the main theorem of this thesis talks about how an arbitrary circuit can be used to specify a protocol for securely evaluating it; thus there is already nonuniformity present in the families of circuits which might be evaluated.

2.3.4 Adversaries (Formal Treatment)

Adversaries for n-party protocols. Formally, an adversary will not be very different from the other participants of the collaborative computation; but she has additional abilities which allow her to make life difficult for the players. Defining how a protocol runs in the presence of an adversary will be a matter of properly modifying the players' and adversary's computational states as she "interacts" with the protocol.

Definition 2.3.2 An adversary for an n-party protocol is a function

$$A:\; \underbrace{\Sigma^*}_{\text{common input}} \times \underbrace{\Sigma^*}_{\text{current state}} \;\to\; \underbrace{\Sigma^*}_{\text{new state}}.$$

An adversary interaction function is any of the following polynomial-time functions:

1. a next-action function η̂: Σ* → {compute, flip-coin, get-advice, corrupt_1, ..., corrupt_n, round-done},
2. a broadcast messages function M̂: Σ* × [1..n] → Σ*, and
3. a private messages function m̂: Σ* × [1..n] × [1..n] → Σ*.

An adversary configuration is an element of

$$\underbrace{\Sigma^*}_{\text{adversary's state}} \times \underbrace{\Sigma^\infty}_{\text{coins remaining}} \times \underbrace{\Sigma^\infty}_{\text{advice remaining}} \times \underbrace{2^{[1..n]}}_{\text{corrupted players}} \times \underbrace{\Sigma^*}_{\text{traffic}}.$$

Notation. In place of M̂(s_A, i) and m̂(s_A, i, j) we will write M̂_i(s_A) and m̂_{ij}(s_A), respectively.

Discussion. We do not explicitly specify the interaction functions, since the particular conventions selected for them are irrelevant. All that is important is that each function specifies its range value in a natural and simple manner from its domain value, as with the interaction functions associated to the players of a network.

As with a protocol, the first component in A's domain is the common input. Though this could be considered an unnecessary component — it could be encoded in the second component, the adversary's current state — we make a separate argument of it to facilitate specifying the computational complexity of an adversary. The function η̂ describes the action an adversary wishes to perform: does she do some computation, flip a coin, corrupt some processor i, or complete her activities for this round? Note that while a player may terminate, we choose to say that an adversary does not; we select a formalization in which the adversary effectively terminates after all processors have done so. The string M̂_i(s_A) denotes the message that the adversary will broadcast for player i, if i has been corrupted. The string m̂_{ij}(s_A) denotes the message that the adversary sends to processor j on behalf of processor i, if i has been corrupted.

Note that, while a protocol must at least be computable, no similar assumption is made of an adversary; an adversary which is, say, an arbitrary function, with no finite description, is a perfectly good adversary. However, possible restrictions on the power of an adversary will be defined in Section 2.3.7.
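The effect of a corrupt_i action on the adversary's configuration can be pictured with a little bookkeeping. The sketch below is schematic (the record layout and names are conventions of the sketch): it records the monotone growth of the corrupted set and what the adversary absorbs upon each corruption.

```python
# Illustrative bookkeeping for a corruption: on corrupt(i) the adversary
# absorbs i's computational state, history, and private input, plus the
# round's messages to and from i.

from dataclasses import dataclass, field

@dataclass
class AdversaryState:
    corrupted: set = field(default_factory=set)   # grows with each corruption, never shrinks
    knowledge: list = field(default_factory=list) # transcript of everything she has learned

def corrupt(adv: AdversaryState, i: int, comp_state: str, history: str,
            private_input: str, msgs_to_i: list, msgs_from_i: list) -> None:
    assert i not in adv.corrupted, "a processor may be corrupted at most once"
    adv.corrupted.add(i)
    adv.knowledge.append((i, comp_state, history, private_input,
                          tuple(msgs_to_i), tuple(msgs_from_i)))
```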
37 The adversary configuration captures that information about the adversary's computa- tion which needs to be remembered and updated across the application of A. The adver- sary's computational state, the coins which she has not yet used, and the advice which she has not yet read are all components of the adversary's configuration. Also included are the set of processors which are currently corrupted; this set grows with each corruption, and never shrinks. The final component of the adversary configuration is an encoding of the traffic of exchanges between the adversary and the players with whom she speaks. This quantity will prove sufficiently important to warrant explicitly specifying how it evolves as the adversary interacts with the players. 2.3.5 Executing Protocols in tlie Presence of an Adversary We now describe how an n-party protocol P executes in the presence of an adversary A. After specifying the behavior formally, we will describe what it means in English. Configuration sequences. Let A be an adversary for attacking an n-party protocol, and let P be such a protocol. We describe how, from any n initial player configurations, {C°°,... ,C°°), and any initial adversary configuration, C^ = (-s^, ^/t,aA, k^,Ta), proto- col P and adversary A generate a sequences of player configurations, {Ci'': i € [l..n],r € N,/) e [O..00]}, and a sequence of adversary configurations, {C^: r € N,/? € [O..00]}. Fix the notation Cr = i^7, r[^ ttD, and and let the common input c and the private input x be given by c'^Xi'^i = s°°. Once again, the symbols {#,8, *,•,■} are just formal punctuation symbols. The players' configurations progress as before, ' (P(c,5i''), r[^ <") if rjisr") = compute and r > (^r*r[f , r^^rli . ■ ■ , ttJ") if 7j{s7) = flip-coin and r > a Kp+1) _ c: otherwise Q~ = Ci" if 3 p e N s.t. r = or ??«'') 6 {round-done, protocol-done} CI ('■+1)0 _ = < (Si°°*M[* • • • *M^*ml-* • ■ • *mj^^, if 7/(s^°°) = round-done and cr otherwise 38 while the adversary's sequence of configurations progresses as follows: C r(p+l) = < {A{c,il\ r\\ a'l, k7, r^) if f^^s^:) = compute {s7*r'A, rZrZ ' " " , <, ^'I, ^7) if 1(^7) = fip-coin {s'-Z^aZ, r^I, a^aZ • • • , «7, r^) if fii^''/) = get-advice {s''/ *s^i^ *Tr^°° *Xi*ffili* ■■• *m';,i, ii vis"/) = corrupt,- r7, a:,^ «:,'• u {o, ry«j«sj'~*7r[°°*a;i*mi,.* • • • *mJi,) 57+<°°*7rr°°+Xi, r7, TT^, K^UiO C7 and r > if 77(5^) = corrupt,- and r — otherwise Croo A = < fo*"^ r'"'' n'''' «■'■'' ■■•*w;;i*---*m;„B) /„'■'' T-'"'' o'"'' K-'"'' r'"''*'! if 3 /9 € N s.t. r > and iHs^a) = round-done otherwise ^(r + l)0 = < (s:4°°*M[+^* • • • *M^+Hfh\r* • • • *fh[Z * ■ ■ ■ *m;t^* *m: r + l TOO ^'*P t.-''oo A 5 ^4 ? '^A ? ^: ri~*M[+'* • • • +M;+i*m!;+i* • • • *fh[t^* ■ ■ ■ *m';,t'* ■ ■ ■ *m;;+i) (o^cOj, T-roo „rco ^roo t-^oo \ \^A *i ^A ^ "'A ^ '^A ^ ~A ) if r > if r = where M{s\'=°) if 7- > and i ^ k"^ and ■r}{s^[ ^^~) # protocol-done Ml = \ Mi{sT) if r- > and i € k^^ A otherwise mj{s';°°) if r > and i ^ k''^ and 7?(s^''"^^°°) ^ protocol-done m;-,- = -! rhij{s''^) if r > and i e kT' A otherwise 39 m; fn- M{s1'=°) if r > and i ^ k^^~^^°° and 7?(*.^'""^^°°) ^ protocol-done A otherwise ' mj«~) if r > and i ^ k^'-'^°° and j € k^'-'^°° and = < f]i.^t~^^'^) 7^ protocol-done A otherwise m: = m-j JVf,(5^°°) if r > and j G k^"' A A TOO A otherwise ) if r > and i G k^°° and j ^ k; otherwise roo A If any configuration fails to be defined by the recurrences above, then the execution with the specified initial configuration is said to have diverged. Nomenclature. 
Nomenclature. The tuple C_i^{r0} is called the configuration of party i at the beginning of round r, while C_i^{r∞} is the configuration of party i at the end of round r. The configuration C_i^{00} = C_i^{0∞} is the initial configuration of party i. The tuple C_A^{r0} is called the configuration of the adversary at the beginning of round r, while C_A^{r∞} is the configuration of the adversary at the end of round r. The configuration C_A^{00} = C_A^{0∞} is the initial configuration of the adversary. If the round-r, micro-round-ρ configuration of party i is C_i^{rρ} = (s_i^{rρ}, r_i^{rρ}, π_i^{rρ}), then we refer to s_i^{rρ} as the computational state of player i at this point in time. The string M_i^r is the message broadcast by i in round r, and the string m_{ij}^r is the message sent from i to j in round r. If η(s_i^{r∞}) = protocol-done, then C_i^{r∞} is called the final configuration of party i. The value C_A^{r∞} is called the adversary configuration at the end of round r. Player i is corrupted in round r if η̂(s_A^{rρ}) = corrupt_i for some ρ. Processor i terminates at round r if r is the first round for which η(s_i^{r∞}) = protocol-done.

Executing protocols. In the presence of an adversary A, an execution of an n-party protocol P with common input c = 1^k#1^n#1^ℓ#1^l#1^m#C, private inputs x_1, ..., x_n ∈ Σ^ℓ, histories π_1, ..., π_n ∈ Σ^m, player coin sequences r_1, ..., r_n ∈ Σ^∞, adversary coin sequence r_A ∈ Σ^∞, adversary advice a_A ∈ Σ^∞, and initial corruptions κ_A, is the collection of configurations {C_i^{rρ}, C_A^{rρ}} generated by P and A when the initial configuration of party i is taken to be C_i^{00} = (c‖x_i‖i, r_i, π_i), and the initial configuration of A is taken to be (c ∗ κ_A, r_A, a_A, κ_A, c ∗ κ_A). The set of executions of P with common input c, private inputs x⃗, histories π⃗, adversary advice a_A, initial corruptions κ_A, and all possible coin sequences r⃗ and r_A becomes a probability space by endowing each execution with the measure induced by taking each bit of r_A and each bit of each r_i to be selected uniformly and independently. In the presence of an adversary, executing a protocol P with common input c, private inputs x⃗, histories π⃗, and adversary advice a_A means sampling from this probability space.

Message delivery. We explain the meaning of the various message families. The string M_i^r is the message an uncorrupted processor i "tries" to broadcast to the network in round r. Due to the adversary's activities, some other message M′_i^r may actually be broadcast to the network on behalf of player i. The string m̄_{ij}^r is the message the adversary receives from an uncorrupted processor i, intended for (corrupted) processor j. The string m′_{ij}^r is the message that a (good) player j receives on behalf of player i, the source of this message depending on whether i is corrupted or not. For corrupted i, the string M′_i^r is the message the adversary broadcasts in round r on behalf of the corrupted processor i, and the string m′_{ij}^r is the message the adversary sends in round r to the uncorrupted processor j on behalf of corrupted processor i. A sketch of this routing appears below.

The adversary's interaction with the network. When the adversary corrupts a processor, she learns the current computational state of that processor, and the history associated to that processor. Additionally, she learns that processor's private input. As long as there are messages which were delivered to the corrupted processor in this round (i.e., as long as it is not round 0), the adversary is given those messages which did not originate from the adversary herself.
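Concretely, the routing rules just described can be phrased as two small functions. All interfaces here are illustrative, not part of the formalism.

```python
# Sketch of round-r message routing in the presence of the adversary:
# good players' messages pass through, while messages "from" corrupted
# players are composed by the adversary.

def route_broadcast(i, corrupted, player_broadcast, adversary_broadcast):
    """What is actually broadcast on behalf of player i this round (M'_i)."""
    return adversary_broadcast(i) if i in corrupted else player_broadcast(i)

def route_private(i, j, corrupted, player_msg, adversary_msg):
    """What good player j actually receives on behalf of player i (m'_ij)."""
    assert j not in corrupted, "delivery to good players only"
    return adversary_msg(i, j) if i in corrupted else player_msg(i, j)
```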
When the adversary terminates her activities for some round, the messages she composes on behalf of the corrupted processors are then delivered. When the processors terminate their activities, the messages which they compose and which the adversary can see (broadcast messages, or messages sent to corrupted processors along private channels) are delivered to the adversary. So that the formalism matches the intuition, we demand that an adversary corrupt a processor at most once: if η̂(s_A^{rρ}) = corrupt_i, then η̂(s_A^{r′ρ′}) ≠ corrupt_i for all (r′, ρ′) < (r, ρ).

Traffic. The "traffic" of exchanges between the adversary and the uncorrupted processors consists of everything the adversary "gets" from the currently good players — the messages they broadcast, the messages they send to corrupted players, and the information the adversary learns when one of these good players is corrupted — together with the information that the adversary "gives" to the currently good players — the messages the adversary broadcasts on behalf of corrupted players, and the messages the adversary sends out along private channels to uncorrupted players on behalf of corrupted players. As we have formalized it, the traffic also includes (tacked onto the beginning) the common input and a description of the initially corrupted processors.

In words: each time a processor i is corrupted, the information learned from that processor which augments the adversary's state is tacked onto the transcript of traffic, preceded by an indication of which processor was corrupted; each time the adversary completes her round-r activities, the messages she has composed on behalf of corrupted players are appended to the transcript of traffic, terminated by a special marker; and, finally, each time the adversary is awakened as a result of the players' completing a round, those messages sent by good players which are received by the adversary are properly delineated and appended to the transcript of traffic. If τ_A specifies traffic, then corrupted(τ_A) denotes the set of processors who are corrupted in this transcript — that is, the set of all i such that •i• appears in τ_A — and ℓ(τ_A) denotes the input length in this traffic — that is, the value ℓ such that c = 1^k#1^n#1^ℓ# ⋯ is a prefix of τ_A.³

³If corruption is interrogation followed by murder, surrounding i with bullets is quite mnemonic!

Output and termination. A protocol is said to terminate in round r if r is the first round at the end of which every uncorrupted processor has terminated. We demand the following of any protocol P: for any adversary A with which P may interact, when P runs with A, P terminates with probability 1. Player i's output in an r-round execution of a protocol P is the image y_i = o(s_i^{r∞}) under o of that player's final computational state, if the player is uncorrupted at the end of round r, and the distinguished value corrupted otherwise. The players' output is the tagged vector consisting of each player's output different from corrupted.

Adversary's view. In an execution e of an adversary A with a network, the view of A in this execution is "everything the adversary sees" before termination: that is, the triple (τ_A, r′_A, a′_A) consisting of the traffic, the adversary's coins actually consumed (a prefix r′_A of the infinite string r_A), and the adversary's advice which is consumed (a prefix a′_A of the infinite string a_A). A random bit is consumed when η̂ assumes the value flip-coin; an advice bit is consumed when η̂ assumes the value get-advice.
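The two operators on traffic just defined, corrupted(τ_A) and ℓ(τ_A), admit a toy rendering. The bullet delimiter follows the text; the parsing details are conventions of the sketch.

```python
# Toy versions of corrupted(traffic) and l(traffic): the transcript marks
# each corruption as, e.g., "•3•", and begins with the common input.

import re

def corrupted(traffic: str) -> set:
    """All i such that the marker •i• appears in the transcript."""
    return {int(i) for i in re.findall(r"•(\d+)•", traffic)}

def input_length(traffic: str) -> int:
    """The l such that 1^k#1^n#1^l# ... is a prefix of the transcript."""
    m = re.match(r"1+#1+#(1+)#", traffic)
    if m is None:
        raise ValueError("traffic does not begin with a well-formed common input")
    return len(m.group(1))
```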
2.3.6 Dealing with Arbitrary Numbers of Players

General protocols. To properly talk about general multiparty protocols we must relax the requirement that a protocol be tailored for one particular number of players. (It is discussed in [MR91] why imagining n fixed is too restrictive.) Thus we treat a protocol P as a family of n-party protocols P = {P_n}. However, recall that we demanded that a protocol be "describable" by a Turing machine (that is, we demanded that a protocol be Turing-computable). In passing to more general protocols, we would like not to lose this property. In fact, any "reasonable" protocol should have a description that is efficiently computable knowing the number of players involved.

Definition 2.3.3 A protocol P is a polynomial-time computable function that maps a number 1^n to a standard encoding of an n-party protocol P_n.

We will usually suppress the subscript n when considering an n-party protocol P_n, using P to denote either a general protocol or a particular n-party protocol that it specifies.

General adversaries. Likewise, to talk about a protocol with arbitrary numbers of players under attack by an adversary, we must suitably relax our notion of an adversary.

Definition 2.3.4 An adversary A is a function that maps a number 1^n to an n-party adversary A_n.

Again, we usually suppress the subscript n when considering an adversary for an n-party protocol, using A to denote either a general adversary or an adversary for a particular value of n.

2.3.7 Complexity Measures

Adversaries with bounded charisma. If an adversary corrupts all the players, then nothing can be said about their behavior in her presence. We thus prefer less captivating adversaries.

Definition 2.3.5 A t(n)-adversary for a protocol P is an adversary A for which |κ_A^{r∞}| ≤ t(n) for any r-round execution of A_n with the n-party protocol P_n.

For our purposes, we may imagine that the constraint above is strengthened to demand that the adversary always corrupts exactly t(n) players, rather than at most t(n) players. As long as t(n) is efficiently computable and the adversary "knows" when the protocol will terminate, the notions are equivalent. This will be the case for us. But in contexts like the Byzantine agreement protocol of Feldman and Micali [FM88a], these conditions are not met. Often, t(n) is a step function associated to a linear function of n, such as t(n) = ⌊(n − 1)/2⌋.

Local computation complexity. Let e = {C_i^{rρ}, C_A^{rρ}} be an execution of an n-party protocol. The number of player micro-rounds for execution e is

$$\bigl|\{C_i^{r\rho} : i \in [1..n],\ r, \rho \in \mathbb{N},\ \eta(s_i^{r\rho}) \notin \{\text{round-done}, \text{protocol-done}\},\ \text{and } i \notin \kappa_A^{r\infty}\}\bigr|,$$

while the number of adversary micro-rounds is

$$\bigl|\{C_A^{r\rho} : r, \rho \in \mathbb{N},\ \hat\eta(s_A^{r\rho}) \neq \text{round-done},\ \text{and the protocol has not terminated before round } r\}\bigr|.$$

A protocol P = {P_n} is polynomial-time if there is a polynomial Q such that for any n, P_n is Q(|c|)-time computable (where c is the common input), and, for any execution e with any adversary A, the number of player micro-rounds is bounded above by Q(|c|). An adversary A who interacts with a polynomial-time protocol P is polynomial-time if there is a polynomial Q such that the encoding of A_n is computed by A in time bounded above by Q(n); A_n is computable in time at most Q(|c|); and, last of all, the number of adversary micro-rounds is bounded above by Q(|c|).
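Returning to adversaries with bounded charisma: the threshold that matters later in the thesis is an honest majority, t(n) = ⌊(n − 1)/2⌋. A one-line sanity check:

```python
# t(n) = floor((n-1)/2): with this threshold, good players always outnumber bad.

def t(n: int) -> int:
    return (n - 1) // 2

assert all(n - t(n) > t(n) for n in range(1, 100))
```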
Round and communication complexity. The round complexity of a protocol P is the least value r such that, when protocol P is run in the presence of any adversary A, it necessarily terminates within r rounds. The round complexity is ∞ if no such number exists. The round complexity is regarded as a function of |c|. The communication complexity of a protocol P is the least value K such that, when protocol P is run in the presence of any adversary A, the total number of bits sent out by uncorrupted processors is at most K. The total number of bits sent out by uncorrupted processors is

$$\sum_{r,i} \chi\bigl(i \notin \kappa_A^{(r-1)\infty}\bigr)\Bigl(|M_i^r| + \sum_j |m_{ij}^r|\Bigr),$$

where χ is 1 for a true predicate, 0 for a false predicate. The communication complexity is ∞ if no such number exists. The communication complexity is regarded as a function of |c|.

2.4 Secure Function Evaluation

With the language of Sections 2.2 and 2.3 in hand, we begin to phrase our definition of security. We begin by providing some intuition as to what correctly and privately computing a function should mean in the presence of an adversary as strong as the one we have defined.

2.4.1 The Ideal Evaluation

A secure protocol should mimic — as closely as possible — the computation of a function f by an ideal protocol for computing it. An ideal protocol for f can be considered as achieving security by adopting a model of computation which provides a trusted party for computing the function. The rest of the parties, though, are subject to corruption. We describe the ideal protocol somewhat more colorfully below.

Imagine distributively computing f: (Σ^ℓ)^n → (Σ^l)^n by playing the following game. Each player sits at a table, his private input string written underneath the lead plate in front of him. One by one, the adversary "corrupts" some of the players, removing a player's lead plate and learning the information written beneath it. Then the adversary substitutes (fake) input strings x′_T in front of each of these corrupted players, and she replaces the lead plates. Once the plates have all been replaced, the value f_i(x′_T ∪ x_T̄) magically appears underneath the plate of each player i. In this way, each player i has computed f evaluated at a vector of inputs which — though partially determined by the adversary — still has been correctly evaluated at a point x′_T ∪ x_T̄, where x′_T was chosen independent of the good players' private inputs. After the adversary learns the f_i(x′_T ∪ x_T̄)-values for the already-corrupted players i, one by one, she may corrupt additional players j. Removing their lead plates, the adversary learns their x_j- and f_j(x′_T ∪ x_T̄)-values. All together, she may corrupt a total of t players.

Turning the ideal protocol into a definition. The formalization of security attempts to ensure that the computation of f by the protocol is "just like" the computation of f by the ideal protocol which computes f. Unfortunately, it is not obvious what "just like" really means. We attempt to imitate the ideal computation in all essential ways. For example, in the ideal protocol, the adversary "knows" what values x′_T she substitutes in under the plates of corrupted players, and she "knows" what her share of the output is as a result of these substitutions. If a protocol does not enjoy this property, might it still be secure in some significant sense? Absolutely.
But our viewpoint is that all important aspects of the ideal protocol should be mimicked in a protocol we call secure — and the adversary's substitution on a given run of a value x′_T (a value she is aware of) on behalf of the corrupted players would seem to be an important aspect of the ideal protocol. See Section 2.5 for some additional discussion.

2.4.2 Ideal Evaluation Oracles

To formalize our notion of security, we develop an abstraction for how an adversary can corrupt players and steal information in the ideal protocol for computing f. Though the following notion does not appear explicitly in our formalization of security, the language it offers is useful, and the informal description is helpful in understanding the formalism of Section 2.4.3. In Chapter 4, we will speak in terms of oracles to make the proof of that chapter more understandable.

We imagine a special type of oracle, O_t(x⃗, π⃗; f), whose behavior is dictated by the players' inputs x⃗ ∈ (Σ^ℓ)^n, their histories π⃗ ∈ (Σ*)^n, the function f to be computed, and the bound t ∈ [0..n] on the number of processors the adversary is allowed to corrupt. We now describe how this oracle behaves. The oracle accepts two types of queries:

• A component query is an integer i, i ∈ [1..n]. It is answered by (x_i, π_i) if t or fewer component queries have been made so far and no output query has been made so far; it is answered by ((x_i, π_i), y_i) if t or fewer component queries have been made so far and the proper output query x′_T previously made resulted in a response y = f(x′_T ∪ x_T̄). Additional component queries are answered by Λ.

• An output query is a tagged vector x′_T. The query is valid if T consists precisely of the component queries made so far, and if this is the first output query. Asking a valid output query x′_T results in the oracle computing y = f(x′_T ∪ x_T̄) and answering y_T. An output query which is not valid is answered by Λ.

Let us emphasize that the oracle is willing to answer at most t component queries. If the oracle is asked an improper output query (that is, T is not the set of previous component queries), or if the oracle is asked more than one output query, it does not give the requested information. Note that we do allow component queries to follow the output query, so long as the total number of component queries is bounded by t. Also, component queries behave differently depending on whether or not the output query has been made: if the output query has been made, then a component query returns (in addition to the requested component) the queried player's "share" of the function value.

Clearly what you (as a t-adversary) can learn when speaking to an ideal evaluation oracle exactly coincides with what you can learn in the ideal computation of f. Privacy is then easy to formalize using ideal evaluation oracles: roughly said, the ensemble of views you get as an adversary interacting with the protocol is indistinguishable from an ensemble of views you can generate without speaking to the network, but speaking instead to some algorithm equipped with an ideal evaluation oracle. Actually, we will define not only privacy but correctness, too, through the interaction of an algorithm with its ideal evaluation oracle.

If O_t(x⃗, π⃗; f) is an oracle, we will sometimes omit the subscript t and the parenthesized arguments x⃗, π⃗ and f, and write simply O instead. The response to a component query of i is then written simply as O(i), and the response to an output query of x′_T is written simply as O(x′_T).
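The following sketch renders the oracle's query discipline in code. The class layout is a convention of the sketch; the rules (at most t component queries, exactly one valid output query, richer component answers after the output query) follow the description above.

```python
# A sketch of the ideal-evaluation oracle O_t(x, pi; f).

class IdealOracle:
    def __init__(self, x, pi, f, t):
        self.x, self.pi, self.f, self.t = x, pi, f, t
        self.T = set()          # component queries made so far
        self.y = None           # the output vector, once the output query is made

    def component(self, i):
        if len(self.T | {i}) > self.t:
            return None                              # Lambda: query budget exhausted
        self.T.add(i)
        if self.y is None:
            return (self.x[i], self.pi[i])           # before the output query
        return ((self.x[i], self.pi[i]), self.y[i])  # after it: also i's output share

    def output(self, x_sub):
        """x_sub: dict i -> substituted input; valid only once, and only for T."""
        if self.y is not None or set(x_sub) != self.T:
            return None                              # Lambda: invalid output query
        full = [x_sub.get(i, self.x[i]) for i in range(len(self.x))]
        self.y = self.f(full)                        # y = f(x'_T joined with x_Tbar)
        return {i: self.y[i] for i in self.T}        # the adversary's shares y_T
```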
The description of an ideal evaluation oracle just given handles vector-valued functions. In fact, we will be interested both in secure computations in which every player gets his own private output, and in secure computations in which all players get a common — and thus public — output. For the purpose of computing string-valued functions, the definition of component queries can be simplified: a component query i returns x_i if there have been t or fewer component queries so far; there is no need to return the value y_i for queries made after the output query.

2.4.3 Simulators

We now introduce a key tool for achieving our definition of security: simulators. This notion was also central to the definition of zero-knowledge proofs, of which protocols for secure function evaluation are an extension. There are, however, several adjustments to be made, reflecting the several differences of context.

2.4.3.1 Informal description

Simulators talk to adversaries. A simulator is an algorithm which interacts with an adversary with "mechanics" similar to the interaction between an adversary and a protocol, as suggested by Figure 2.2. We describe these mechanics below. Later we will concentrate on those simulators for which interacting with the simulator is indistinguishable — to the adversary — from interacting with a network.

Figure 2.2: A simulator S creates a "virtual world" and allows the adversary A to act in this world, as though S itself were a network running protocol P.

We point out that we are demanding more of our simulators than it is customary to demand of simulators for zero-knowledge proofs [GMR85]. Our simulators are required to lay down a conversation with an adversary in a message-by-message manner, something which has never been demanded in simulation for zero-knowledge proofs. Crepeau [Cr90] also explicitly demands this type of simulation, and it is discussed by Kilian as well [Ki89]. This restriction is central to achieving our notion of correctness, since it is through these mechanics of the adversary speaking to the simulator that a meaningful "committal" can be associated with the execution of an adversary with a network, and not just with a simulator. We caution that this restriction is different from the "black-box" notion sometimes considered for zero-knowledge proofs.

Simulators own ideal evaluation oracles. It will be the simulator's responsibility to send to the adversary messages which would "appear" to be from the network itself. To this end, a simulator is provided an ideal evaluation oracle. When and only when the adversary A with which simulator S speaks corrupts a processor i, simulator S makes a component query of i. Once, and only once, the simulator makes an output query x′_T, where T is the set of processors currently corrupted by A. We will say that the simulator does this at some point just following the completion of an adversary round, after the adversary's outgoing messages have been composed and sent to the simulator. Presumably, the simulator's output query was required in order to provide the next round of messages going to the adversary from the simulator on behalf of the uncorrupted players. A simulator is sometimes productively thought of as the "subconscious" of an adversary who is endowed with an ideal evaluation oracle, but who does not have the benefit of a network with which to compute. Everything that a simulator gives to an adversary, an adversary could give to herself — if only she had the oracle.
Dependency of S on A. There are issues involved in deciding to what extent the simulator's behavior may depend on the adversary A with which it interacts. At one extreme, the simulator might "see" the adversary's current state, the advice string which A has been given, and the coins which A flips; and the simulator algorithm may depend in an arbitrary manner on the adversary's algorithm itself. This is the least restrictive requirement for a simulator, giving rise to the weakest notion of security. At the other extreme (and there are many notions in the middle), the simulator knows nothing of the adversary's program, advice, coins, or internal state, and it must provide its messages without such benefit, based only on the traffic exchanged between itself and the adversary. This is the most restrictive requirement for a simulator, giving rise to the strongest notion of security. We adopt this notion here. Note that it is completely meaningful to have such a simulator interacting with an adversary who computes an arbitrary function.

Computing the output query. How does the simulator compute its output query? Necessarily, it is a function of the simulator's coins, its oracle responses, and the traffic of exchanges between itself and the adversary. But we wish to be less generous. Why? Key to achieving strong notions of reducibility and independence (see Section 2.5) is that the adversary should know the output query. But if we allow the simulator to compute its output query without any restrictions, then the adversary — not knowing the simulator's coins — cannot possibly know this. One (particularly restrictive) way to demand that the adversary knows the output query (meaning that she could compute it, if she wanted to) is to demand that the simulator's output query be an efficiently computable function AI (for "adversary input") of just the traffic of message exchanges. Though less restrictive notions are possible (and are in fact necessary to properly deal with secure computation in the broadcast model), the issue is always the same: how to ensure that the adversary knows what the output query is, in the sense that it can be calculated by the adversary in roughly the time allotted for the execution.

Adversary output. In the ideal computation of a protocol, the adversary necessarily learns not only what she has sent off to the trusted server, but also what the trusted server returns to her (that is, her share of the output). Through the AI function we have mimicked the former; through the AO function (for "adversary output") we imitate the latter. Like the adversary input, the adversary output AO is an efficiently computable function of the traffic. In an r-round execution of a protocol, the adversary's share of the common output is y_T = AO(c, τ_A^{r∞}), where T = corrupted(τ_A^{r∞}) is the set of players who are corrupted when the protocol terminates.

► Strong simulatability. We emphasize that the restriction that AI and AO be efficiently computable functions of the traffic is a lot to demand. More general notions of simulatability are less restrictive on this point. The spirit of definitions which capture the sense in which the adversary is aware of her input and output is the same, but it is not dealt with here. Refer to [MR91].

2.4.3.2 Formal description

Definition of a simulator. We define a simulator as a triple consisting of the simulator algorithm proper and two associated (total) functions on the traffic.
These specify what the adversary has effectively sent to the trusted server and received from the trusted server.

Definition 2.4.1 A simulator S = (S, AI, AO) is a polynomial-time computable function

$$S:\; \underbrace{\Sigma^*}_{\text{common input}} \times \underbrace{\Sigma^*}_{\text{current state}} \times \underbrace{\Sigma^\infty}_{\text{simulator coins}} \;\to\; \underbrace{\Sigma^*}_{\text{new state}},$$

together with a polynomial-time computable adversary input function

$$AI:\; \underbrace{\Sigma^*}_{\text{common input}} \times \underbrace{\Sigma^*}_{\text{traffic}} \;\to\; \underbrace{\{x'_T\}}_{\text{substituted input values}} \cup \{\text{not-now}\},$$

and a polynomial-time computable adversary output function

$$AO:\; \underbrace{\Sigma^*}_{\text{common input}} \times \underbrace{\Sigma^*}_{\text{final traffic}} \;\to\; \underbrace{\{y_T\}}_{\text{deserved output}}.$$

The function AI has the property that for any traffic τ^{r∞}, either AI(c, τ^{r∞}) = not-now or else AI(c, τ^{r∞}) is a tagged vector x′_T with T = corrupted(τ^{r∞}) and entries in Σ^{ℓ(τ)}; in the latter case, AI(c, τ^{r′∞}) = not-now for all r′ < r. A simulator interaction function is either of the following polynomial-time computable functions:

1. a next-action function Υ: Σ* → {compute, protocol-done}, and
2. a give-to-adversary function Φ: Σ* → Σ*.

A simulator configuration is an element of

$$\underbrace{\Sigma^*}_{\text{simulator's state}} \times \underbrace{\Sigma^\infty}_{\text{simulator's coins}}.$$

Discussion. As before, the simulator interaction functions are fixed maps which determine their range points in a natural, easily encoded manner from their domain points. Recall that we chose to say that the adversary does not terminate; the protocol is said to terminate in the adversary round following the termination of all uncorrupted processors. Since there is no protocol running when an adversary interacts with a simulator, it is a simulator's responsibility to indicate when the protocol should be regarded as being over. It does this with its next-action function, Υ. The simulator's other interaction function, Φ, is used to read off, from the simulator's computational state, what it provides to the adversary with whom it is interacting.

Running an adversary with a simulator. When an adversary interacts with a protocol, she goes through the sequence of configurations given in Section 2.3.5. The points at which the adversary obtains information from the players in the network — and at which this information appears in the updated traffic — are exactly two: the corruption case,

$$(s_A^{r\rho} * \boxed{\sigma_i^r},\; r_A^{r\rho},\; a_A^{r\rho},\; \kappa_A^{r\rho} \cup \{i\},\; \tau_A^{r\rho}\,{\bullet}i{\bullet}\,\boxed{\sigma_i^r}),$$

and the start-of-round delivery,

$$(s_A^{r\infty} * \boxed{\bar M_1^{r+1} * \cdots * \bar m_{nn}^{r+1}},\; r_A^{r\infty},\; a_A^{r\infty},\; \kappa_A^{r\infty},\; \tau_A^{r\infty} * \boxed{\bar M_1^{r+1} * \cdots * \bar m_{nn}^{r+1}}).$$

The idea for defining the behavior of a simulator interacting with an adversary is to have the simulator — rather than the protocol — provide these boxed quantities.
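Schematically, then, the simulator answers each corruption with a component query, and makes its single output query when AI first returns something other than not-now. The event-driven rendering below is a toy under the sketch's own interfaces, reusing the IdealOracle sketched in the previous subsection.

```python
# Schematic of the simulator/adversary mechanics: corruptions trigger
# component queries; the one output query is made when AI first yields
# a substituted-input vector rather than "not-now".

def simulate(adversary_events, oracle, AI, c):
    """adversary_events yields ('corrupt', i) or ('round-done', round_traffic)."""
    traffic, asked_output = "", False
    for kind, payload in adversary_events:
        if kind == "corrupt":
            answer = oracle.component(payload)        # (x_i, pi_i), plus y_i after committal
            traffic += f"•{payload}•{answer!r}"
        elif kind == "round-done":
            traffic += payload
            x_sub = AI(c, traffic)
            if x_sub != "not-now" and not asked_output:
                oracle.output(x_sub)                  # the single output query
                asked_output = True
    return traffic
```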
To run an adversary A, having associated adversary input function AI, with a simulator S, common input c = 1^k#1^n#1^ℓ#1^l#1^m#C_c (where C_c specifies a function f_c: (Σ^ℓ)^n → (Σ^l)^n), adversary coins r_A, adversary advice a_A, simulator coins r_S, and oracle O_t(x⃗, π⃗; f_c), where x⃗ ∈ (Σ^ℓ)^n and π⃗ ∈ (Σ^m)^n, the boxed quantities above are replaced by the information which the simulator returns — as specified by Φ — and the simulator's state is allowed to progress in time in a manner analogous to a protocol's progress in time.

The adversary's configurations progress exactly as in Section 2.3.5, except that upon η̂(s_A^{rρ}) = corrupt_i the string appended to the adversary's state and traffic is Φ of the simulator's (updated) state, and the messages appended at the start of a round are likewise read off the simulator's state via Φ. The simulator's configurations progress according to

$$
C_S^{r(\rho+1)} = \begin{cases}
(S(c,\; s_S^{r\rho} * O(i),\; r_S^{r\rho}),\; r_S^{r\rho}) & \text{if } \hat\eta(s_A^{r\rho}) = \text{corrupt}_i\\
C_S^{r\rho} & \text{otherwise}
\end{cases}
$$

$$C_S^{r\infty} = C_S^{r\rho} \quad \text{if } \exists\, \rho \in \mathbb{N} \text{ s.t. } \hat\eta(s_A^{r\rho}) \in \{\text{round-done}, \text{protocol-done}\}$$

$$C_S^{(r+1)0} = (S(c,\; s_S^{r\infty} * O(AI(c, \tau_A^{r\infty})),\; r_S^{r\infty}),\; r_S^{r\infty}),$$

where

$$
O(i) = \begin{cases}
(x_i, \pi_i) & \text{if } AI(c, \tau_A^{r'\infty}) = \text{not-now for all } r' < r\\
((x_i, \pi_i),\, y_i) & \text{if } \exists\, r' < r \text{ s.t. } x'_T = AI(c, \tau_A^{r'\infty}) \neq \text{not-now and } y = f_c(x'_T \cup x_{\bar T})
\end{cases}
$$

$$O(x'_T) = y_T \text{ where } y = f_c(x'_T \cup x_{\bar T}), \qquad O(\text{not-now}) = \Lambda.$$

In defining simulators for string-valued computation, the definition of component queries is simplified to O(i) = (x_i, π_i), but the rest of the definition of a simulator interacting with an adversary is unchanged.

Termination. A simulator is said to terminate in round r if r is the first round for which Υ(s_S^{r0}) = protocol-done. The adversary with which the simulator interacts is then said to terminate at the end of its round r.

Adversary's view. In an execution e of an adversary A with a simulator, the view of A in this execution is "everything the adversary sees" before termination: that is, the triple (τ_A, r′_A, a′_A) consisting of the traffic, the adversary's coins actually consumed, and the adversary's advice which is consumed.

Requirements on AI and AO. For S = (S, AI, AO) to be a simulator establishing the security of some protocol P, we demand that in any r-round execution of an adversary A with P, there is exactly one value r′ ≤ r such that AI(c, τ_A^{r′∞}) = x′_T ≠ not-now. Additionally, AO(c, τ_A^{r∞}) is a tagged vector indexed by corrupted(τ_A^{r∞}) with entries in Σ^l. That is, the adversary input assumes a well-defined vector of "substituted inputs" x′_T at some point in the execution, and the adversary output, evaluated at the final traffic, specifies a meaningful output on behalf of the corrupted processors.

2.4.4 Ensembles for Secure Function Evaluation

The parameter set L(f). A vector-valued function family f is a way to interpret each common input c as a map f_c: (Σ^{ℓ_c})^{n_c} → (Σ^{l_c})^{n_c}; a string-valued function family f is a way to interpret each common input c as a map f_c: (Σ^{ℓ_c})^{n_c} → Σ^{l_c}. The description of f_c appears on the common input following the usual quantities.
Thus to each function family f = {f_c} we associate the k-indexed family of languages L(f) = {L_k(f)} given by

$$
L_k(f) = \bigcup_{\substack{n \ge 3\\ t,\, \ell,\, l,\, m \ge 0}} \bigl\{(\vec x, \vec\pi, a_A, c)\colon\ \vec x \in (\Sigma^{\ell})^n,\ \vec\pi \in (\Sigma^{m})^n,\ a_A \in \Sigma^{\infty},\ \text{and } c = 1^k\#1^n\#1^{\ell}\#1^{l}\#1^{m}\#C_c,\ \text{where } C_c \text{ specifies a function } f_c \text{ from the function family } f\bigr\}
$$

of all possible inputs, histories, adversary advice strings, and common inputs in any "proper" initial configuration of the network. Restricting our attention to proper initial configurations (with the void set of initially corrupted processors), the initial configurations of the players and the adversary are uniquely identified by a vector (x⃗, π⃗, a_A, c, r⃗, r_A): a point in L_k(f) together with coins r⃗ for the players and r_A for the adversary. Just as (x⃗, π⃗, a_A, c, r⃗, r_A) specifies the initial configurations of the players and the adversary, a point (x⃗, π⃗, a_A, c, r_S, r_A) uniquely specifies the initial configurations of the adversary and a simulator S: a point in L_k(f) together with simulator coins, r_S, and adversary coins, r_A.

The adversary's view. Consider an adversary, A, attacking a protocol, P, and let f = {f_c} be a function family. With a proper initial configuration specified by (x⃗, π⃗, a_A, c, r⃗, r_A), there is an associated view which will be held by the adversary at termination (if the protocol terminates). Omitting mention of r⃗ and r_A, there is an associated distribution on adversary views, induced by taking each bit of r⃗ and r_A to be selected uniformly at random. We let A-VIEW_P^f(x⃗, π⃗, a_A, c) denote this distribution of adversary views. Read this as the view which A gets when speaking to the network running protocol P. When x⃗, π⃗, a_A and c are not regarded as fixed, but are allowed to vary in L_k(f), then A-VIEW_P^f(x⃗, π⃗, a_A, c) becomes an ensemble 𝓔A-VIEW_P^f(x⃗, π⃗, a_A, c) indexed by k and parameterized by L(f). Sometimes we simplify the notation by writing 𝓔P_k(x⃗, π⃗, a_A, c), or 𝓔P_k, in place of 𝓔A-VIEW_P^f(x⃗, π⃗, a_A, c).

The simulated view. Consider an adversary, A, interacting with a simulator, S, and let f = {f_c} be a function family. With a proper initial configuration specified by (x⃗, π⃗, a_A, c, r_S, r_A), there is an associated view which will be held by the adversary at termination (if the protocol terminates) when A talks to simulator S, where S has an ideal evaluation oracle O(x⃗, π⃗; f_c). Omitting mention of r_S and r_A, there is an associated distribution on adversary views, induced by taking each bit of r_S and r_A to be selected uniformly at random. We let A-VIEW_S^f(x⃗, π⃗, a_A, c) denote this distribution of adversary views. Read this as the view which A gets when speaking to the simulator S. When x⃗, π⃗, a_A and c are not regarded as fixed, but are allowed to vary in L_k(f), then A-VIEW_S^f(x⃗, π⃗, a_A, c) becomes an ensemble 𝓔A-VIEW_S^f(x⃗, π⃗, a_A, c) indexed by k and parameterized by L(f). Sometimes we simplify the notation for this ensemble by writing 𝓔S_k(x⃗, π⃗, a_A, c), or 𝓔S_k, in place of 𝓔A-VIEW_S^f(x⃗, π⃗, a_A, c).

Network input. Consider an adversary, A, attacking a protocol, P. Let f = {f_c} be a function family, and let S be a simulator with adversary input function AI. With a proper initial configuration specified by (x⃗, π⃗, a_A, c, r⃗, r_A), there is (as long as the protocol terminates) a well-defined tagged vector x′_T, which is the x′_T-value obtained by evaluating the adversary input function AI on traffic values τ_A^{r∞} for r = 0, 1, ..., until a value different from not-now is obtained.
Define NI(x⃗, π⃗, a_A, c, r⃗, r_A) = x'_T ∪ x_T̄. This is the n-vector of good players' private inputs "shuffled in" with the values entered into the computation by the adversary, as indicated by AI. It is called the network input function, and, intuitively, it specifies, on a particular run, what the network has committed to on this run.

Network output. Consider an adversary, A, attacking a protocol, P. Let f = {f_c} be a function family, and let S be a simulator with adversary output function AO. With a proper initial configuration specified by (x⃗, π⃗, a_A, c, r⃗, r_A), there is (as long as the protocol terminates) a well-defined tagged vector y_T which is the result of applying the AO-function to the traffic τ_A^{r∞} which has occurred when the adversary terminates. Define

NO(x⃗, π⃗, a_A, c, r⃗, r_A) = y_T ∪ y_T̄,

where y_T̄ is the vector of outputs of good players. The vector above consists of good players' outputs "shuffled in" with the output values for the adversary specified by the function AO. The function NO is called the network output, and specifies, intuitively, what the good players did compute as output values, and what the adversary could compute as output values on behalf of the corrupted players.

2.4.5 The Definition of Secure Function Evaluation

We are now ready to define what it means for a protocol P to securely evaluate a function family f = {f_c}.

Computational security. We first define what it means for a protocol for function evaluation to be secure with respect to a computationally bounded adversary.

Definition 2.4.2 Protocol P t-securely computes f = {f_c} if there is a simulator S = (S, AI, AO) such that for any polynomial-time t-adversary A,

• (Privacy) ℰA-VIEW^S_k(x⃗, π⃗, a_A, c) ≈ ℰA-VIEW^P_k(x⃗, π⃗, a_A, c).

• (Correctness) For some negligible function ε(k), for all (x⃗, π⃗, a_A, c) ∈ L_k(f),
Prob_{r⃗, r_A}[ NO(x⃗, π⃗, a_A, c, r⃗, r_A) ≠ f_c(NI(x⃗, π⃗, a_A, c, r⃗, r_A)) ] ≤ ε(k).

Definition 2.4.2 can be explained as follows. When the adversary talks to the network, what she learns is that which she could computationally closely approximate given an ideal evaluation oracle. How the adversary would compute the output query to this oracle defines what she commits to on a run of the protocol with the simulator. According to this function, when the adversary talks to the network, what she does is effectively commit to a value x'_T. By the time the adversary terminates, she is in possession of a value y_{T'}, T' ⊇ T. During this run, almost certainly, the good players computed y_{T̄'} = f_{T̄'}(x'_T ∪ x_T̄) and the adversary has learned y_{T'} = f_{T'}(x'_T ∪ x_T̄).

Statistical and perfect security. We now define what it means for a protocol for function evaluation to be secure with respect to an adversary with unlimited computational power. The only change that is made is to require statistical closeness for privacy. The following notion is also called information-theoretic security.¹

Definition 2.4.3 Protocol P statistically t-securely computes f = {f_c} if there is a simulator S = (S, AI, AO) such that for any t-adversary A,

• (Privacy) ℰA-VIEW^S_k(x⃗, π⃗, a_A, c) ≈ ℰA-VIEW^P_k(x⃗, π⃗, a_A, c), with statistical closeness.

• (Correctness) For some negligible function ε(k), for all (x⃗, π⃗, a_A, c) ∈ L_k(f),
Prob_{r⃗, r_A}[ NO(x⃗, π⃗, a_A, c, r⃗, r_A) ≠ f_c(NI(x⃗, π⃗, a_A, c, r⃗, r_A)) ] ≤ ε(k).

There is a corresponding notion of perfect security, in which there is equality on the privacy constraint, and no chance of error on the correctness constraint.
¹For information-theoretic security, one might select a nonasymptotic notion of security: the distance between ℰP_k and ℰS_k is at most 2^{−k}, and the chance that NO differs from f_c(NI) is at most 2^{−k}.

2.5 Discussion

A great many issues have been considered in constructing our definition of secure function evaluation. In this section, we draw the reader's attention to just a few of them. Greater discussion and justification for these notions will appear in the paper of Micali and Rogaway [MR91].

The marriage of privacy and correctness. Some earlier definitional work treated privacy and correctness as separate concerns that could be met independently. One must be cautious of approaches like this: in many possible formalizations, protocols exist which would be private with respect to one simulator, correct with respect to another simulator, but not simultaneously private and correct with respect to any single simulator. The idea that correctness should be defined using the same simulator existentially guaranteed for privacy was one of the early ideas underlying this work. In fact, at this point it seems crucial that privacy and correctness be interwoven: though privacy is a meaningful notion in its own right, it is doubtful that a strong notion of correctness is possible without its definition being entwined with that of privacy.

Statistical closeness for correctness. A protocol should compute what it is supposed to compute — and not just something that "looks like" what the protocol is supposed to compute. For example, a protocol for collaboratively flipping coins is not an acceptable protocol for collaboratively computing range points of a pseudorandom generator. (Pseudorandom generators are described in the next chapter.)

Reducibility. Cryptography is a slippery business. One demonstration of this lies in the fact that many plausible definitions for secure function evaluation fail to make the "obvious" theorems true (or, if they are true, what their proofs are is unclear). Illustrating this, many possible definitions of a secure protocol appear not to provably achieve the following reducibility property, informally stated as follows: a secure protocol for f in the "special" model of computation which is like ours but which provides for the computation of g as a primitive should remain a secure protocol for f when a secure computation of g is inserted in place of using the primitive provided by the enriched model. That is, suppose you have designed a secure protocol for some complicated task — computing some function f, say. In an effort to make your job as protocol designer more manageable, you assumed in designing the protocol that you had some primitive g in hand (an oblivious transfer "box," say, for implementing a voting protocol). You proved that your protocol P^g for computing f was secure in a model of computation in which an "ideal" computation of the primitive g was provided "for free." Now you have continued your work and designed a protocol P_g which securely computes g. One would hope that you obtain a secure protocol for f by inserting the code of P_g wherever it is necessary in P^g that g be computed.

Our definition of security admits such reducibility. However, stating this theorem precisely and proving that it holds would take us well afield of our goal.

Adversarial awareness. In the ideal computation of a function, the adversary necessarily "knows" what she has sent off to the trusted party on a particular execution of the protocol.
She knows that the function will be computed on this value, shuffled in with the good players' inputs. She knows what values, subsequently, are returned to her by the trusted party. Through the adversary input function, the adversary output function, and our notion of correctness, we have required that all of this is directly paralleled in a protocol we call secure. In particular, the adversary "knows" the input she has effectively sent off to the network to compute on (in the sense that she could easily calculate this x'_T value, using AI); she knows that, almost certainly, the function will be computed on this value, shuffled in with the good players' inputs; and, finally, the adversary is aware of what values, subsequently, have been effectively returned to her (in the sense that she could easily calculate them, using AO). A failure to mimic any of these properties of the ideal protocol would be a significant breach of the abstraction.

Independence. We wish to ensure the highest possible standard for independence — the adversary's inability to significantly correlate her behavior with the private input values held by uncorrupted processors. Our definitions do this, though we shall not in this thesis formalize statements which capture the extent to which independence has been obtained.

Chapter 3

The Constant-Round Protocol

The protocol described in this chapter was invented with the goal of minimizing what would seem to be the key resource for secure distributed computation — interaction. Of course, as interaction is minimized, other complexity measures must simultaneously be kept in check. This chapter does not prove — or even rigorously state — anything about the protocol we exhibit. This is left to Chapter 4. We begin with an overview.

3.1 Overview

To develop some intuition, we first look at why interaction rounds were formerly used in such excess.

The GMW-paradigm. In the multiparty protocols of [GMW87, CDG87, GHY87, GV87, BGW88, CCD88, BG89, RB89, GL90] and others, the underlying notions of security are often quite different, and so are the assumed communication models. Nonetheless, all of them follow the same basic paradigm of Goldreich, Micali and Wigderson [GMW87], which we now describe. There are three stages to collaboratively compute the value of some function y = f(x_1, …, x_n). (For simplicity, take f to be a Boolean function of n binary strings.)

In the first stage, each player "breaks up" his private input into pieces, or shares, and distributes these pieces. When sharing a value b, for some parameter t, t < n/2, we require that no t players get information about b from the shares received; and yet, the value b is recoverable given the cooperation of the good players — even if the bad players try to obstruct this recovery to the best of their abilities. (A sketch of one standard way to share a value appears below.)

After the sharing stage, a computation stage follows, in which each player, given his own shares of the input (x_1, …, x_n), computes his own share of f(x_1, …, x_n). To accomplish this, the function f to be evaluated is represented by a Boolean circuit, C. Thus, in Stage 1, each player got shares of the values along the input wires of C. In Stage 2, for each gate of the circuit, from shares for the input wires of this gate the parties compute, in a privacy-preserving manner, the shares for the output wire of this gate. In general, this privacy-preserving computation employs interaction. In this way, the parties work their way up the circuit, from leaves to root, and eventually hold shares for the value corresponding to the output wire of circuit C. In the third and final stage, the result of the function evaluation is recovered by having the players properly combine the shares held for the output wire of the circuit C.
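As promised, here is a sketch of one standard instantiation of the sharing stage: Shamir's polynomial scheme over a prime field. This is our illustration, with an arbitrary toy prime; the robust, fault-tolerant recovery that the paradigm actually requires (recovering b even when bad players hand in corrupted shares) needs additional machinery not shown here.

```python
import random

P = 2**61 - 1  # a toy prime; any prime larger than n works for this illustration

def share(b, n, t):
    """Shamir-share secret b among n players so that no t shares reveal b."""
    coeffs = [b] + [random.randrange(P) for _ in range(t)]      # degree-t polynomial
    poly = lambda z: sum(c * pow(z, j, P) for j, c in enumerate(coeffs)) % P
    return [(i, poly(i)) for i in range(1, n + 1)]              # share of player i

def reconstruct(shares):
    """Lagrange-interpolate at 0 from any t+1 shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P   # Fermat inverse
    return secret

shares = share(1, n=7, t=3)
assert reconstruct(shares[:4]) == 1   # any t+1 = 4 shares recover the secret
```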
The problem. In view of even this brief description, it can be seen that all of these protocols for secure function evaluation run in unbounded "distributed time" — that is, using an unbounded number of rounds of communication. Even though the interaction for each gate can be implemented in a way that requires only a constant number of rounds, the total number of rounds will still be linear in the depth of the underlying circuit.

Bar-Ilan and Beaver [BB89] were the first to investigate reducing the round complexity of secure function evaluation. They exhibited a method that (for information-theoretic security) always saves a logarithmic factor of rounds (logarithmic in the total length of the players' inputs), while the total amount of communication grows only by a polynomial factor. While their result shows that the depth of a circuit is not a lower bound on the number of rounds necessary for securely evaluating it, the savings is far from substantial in the general setting. Thus, the key question for making secure function evaluation practical, and for understanding its complexity, is: How many rounds are necessary to securely evaluate a circuit, while keeping the communication and local computation polynomial in the size of the circuit?

Constant-round secure computation. Many of us believed that more complicated functions (i.e., those with greater circuit complexity) required more rounds for secure evaluation. We now know this to be false, for the case of complexity-theoretic secure computation:

Main Theorem — Informal version. There exists a protocol which, using a constant number of rounds and a polynomial amount of communication, allows its participants to evaluate any circuit securely. The protocol works in the model of computation described in Chapter 2, in which parties can communicate privately in pairs and can broadcast messages to everyone. It assumes the existence of a one-way function. The protocol tolerates any polynomial-time adversary who can corrupt fewer than half of the total number of players.

Informally, the theorem says that interaction is like an atom. Without interaction, secure function evaluation is impossible; but with a tiny bit of interaction, it is fully possible. The formal statement of the theorem above is given by Theorem 4.3.1.

A new approach. Achieving low round complexity necessitates a break from the gate-by-gate approach described above. The protocol described here is the first multiparty secure function evaluation protocol which does this. Subsequently, Beaver, Feigenbaum, Kilian and Rogaway [BFKR90] also developed a secure function evaluation protocol which does not follow the gate-by-gate approach.

A bird's-eye view of the constant-round protocol. The method for achieving fast secure function evaluation can be described as finding the right way to generalize the older two-party protocol of Yao [Ya86]. His approach — what I call computing a "garbled circuit" — had been used within the GMW-paradigm for computing the desired "out-going shares" of each gate from its "incoming shares" by engaging in many, suitably chosen, two-party computations. This use, however, leads to an unbounded number of rounds.
We, instead, modify the construction "from the inside," generalizing it to many parties but preserving the constant round complexity. The idea is to have the community use the limited interaction available to it to construct a publicly-known garbled circuit, along with a set of garbled inputs to feed to this circuit. Taken together, the garbled circuit and the garbled input are called the garbled program. The garbled program is sufficiently "scrambled" that its revelation does not divulge more information than what is permissible to divulge. The garbled program is computed using the older gate-by-gate approach to secure function evaluation. Once it is issued, each individual player evaluates the garbled program on his own, without interacting with the other players.

The garbled program is defined in a very special way, so as to allow the players to compute this object in a manner that permits them to perform the brunt of the scrambling locally, rather than use intensive interaction to collaboratively scramble the program step-by-step. To efficiently come up with the garbled program, the players join various pieces together, each piece contributed by an individual participant. Of course, no player is trusted to contribute a correct piece, so each player uses interaction to prove to the community that he has done his work correctly. As usual, verification is simpler than computation, and correctness of very deep circuits (evaluated locally by individual players) can be verified by small, shallow circuits. These can be evaluated securely in a constant number of rounds using the gate-by-gate approach of previous protocols. In the end, the community can be certain that it is issuing a correct garbled program, which has been found with very little interaction and is evaluated without any interaction at all.

Why bootstrapping helps. In the remainder of this chapter, even some of what is described above is abstracted out, as we show how to implement a secure function evaluation protocol on top of any other secure function evaluation protocol. In the next chapter, we show that if the underlying secure function evaluation is correct and private, then the protocol implemented on top of it will also be correct and private.

How, intuitively, can we possibly gain something by implementing a secure function evaluation protocol on top of another secure function evaluation protocol? It seems a strange strategy. In general, if one wants to securely evaluate some function f in a particularly efficient manner, one would expect to have to exploit the specific structure of that particular function. But in devising a general method of producing secure protocols, the function being evaluated is arbitrary, so there would seem to be no structure there to exploit. The bootstrapping strategy works because the evaluation of f is implemented on top of the evaluation of a function f̄ which is concocted to have a remarkably simple structure — so simple that f̄ can be evaluated securely in constant rounds. Thus the arbitrary, "structureless" function f necessarily does have enough structure to allow it to be distributively evaluated efficiently.

Ideas from which the protocol springs. As mentioned, the idea of performing two-party computation by using a garbled circuit is due to Yao [Ya86]. We require the use of a secure function evaluation protocol, as developed by [GMW87, BGW88, CCD88, RB89].
To achieve constant round complexity, we demand that constant-depth circuits can be evaluated in constant rounds; the exact statement of what is required is given by Theorem 4.1.1. Any of the information-theoretically secure protocols mentioned above, [BGW88, CCD88, RB89], can be implemented to have this property. We note, though, that the security of the constant-round protocol does not depend on situating it on top of an information-theoretically secure protocol; a complexity-theoretic secure protocol would do as well.

The protocol of [Ya86] required the intractability of factoring. This assumption was reduced to a trapdoor permutation in [GMW87]. The protocol we develop assumes the mildest of common cryptographic assumptions — the existence of a one-way function. However, this relaxation in assumptions is not new. Galil, Haber and Yung [GHY87] had already shown that a pseudorandom generator suffices to produce the underlying "encryptions" for the garbled circuit that were formerly achieved using public-key cryptography in the [GMW87] protocol. In [BG89], Beaver and Goldwasser explicitly recognize that a one-way function is adequate for the job.

A detour. We have managed to state the general strategy for achieving fast secure function evaluation without even hinting at what a garbled program looks like or how it is evaluated! Before we rectify this, we take a short detour and describe a central tool needed to answer these questions: the tool is a pseudorandom generator.

3.2 Pseudorandom Generators

A pseudorandom generator deterministically stretches a short, truly random "seed" into a longer "pseudorandom" string. The distribution induced on the pseudorandom strings when the seeds are sampled uniformly at random is such that the pseudorandom output "looks random" with respect to any polynomial-time computation. This notion of a pseudorandom generator was first defined and investigated by Blum and Micali [BM82], and by Yao [Ya82b]. These authors showed that pseudorandom generators exist under appropriate complexity assumptions.

Definition 3.2.1 A pseudorandom generator is a polynomial-time computable function G : Σ* → Σ* taking k-bit strings to Q(k)-bit strings, Q(k) > k, such that the k-indexed ensembles ℰG(U_k) and ℰU_{Q(k)} are computationally indistinguishable.

Call the function Q(k) in Definition 3.2.1 the stretch of the generator G. The following theorem appears in Boppana and Hirschfeld [BH88]. It says that the ability to stretch the truly random seed just a little implies the ability to stretch it a lot.

Theorem 3.2.2 If there exists a pseudorandom generator with stretch k + 1, then, for any polynomial Q(k) > k + 1, there exists a pseudorandom generator with stretch Q(k). ■

A major effort has gone into weakening the conditions known sufficient for the existence of a pseudorandom generator, including the work of Levin [Le85], and of Goldreich, Krawczyk, and Luby [GKL88]. This effort has culminated in the work of Impagliazzo, Levin and Luby [ILL89], and Håstad [Ha90]. They characterize nonuniform and uniform pseudorandom generation by the existence of nonuniform and uniform one-way functions, respectively. (But recall that we are limiting the scope of our definitions to nonuniform notions.)
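The idea behind Theorem 3.2.2 is simply to iterate a minimally-stretching generator, banking one output bit per iteration. A minimal sketch, assuming some hypothetical one-bit-stretch generator g:

```python
def stretch(g, seed, Q):
    """Amplify a one-bit-stretch generator into a Q-bit generator.

    g maps a k-bit string to a (k+1)-bit string; following the idea
    behind Theorem 3.2.2, we emit one bit of g's output per round and
    recycle the remaining k bits as the next seed.
    """
    out, s = [], seed            # seed: a k-character string of '0'/'1'
    for _ in range(Q):
        expanded = g(s)          # k+1 pseudorandom bits
        out.append(expanded[0])  # publish one bit
        s = expanded[1:]         # keep k bits as the new seed
    return ''.join(out)
```

The security of the iterated generator is established by exactly the kind of "probability walk" (hybrid) argument proved as Proposition 4.2.2 below.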
We now define a (nonuniform) one-way function. Informally, this is an efficiently computable function whose inverse is computable only a negligible fraction of the time.

Definition 3.2.3 A one-way function is a polynomial-time computable function f : Σ* → Σ* such that for any polynomial-time "inverting" algorithm I and any infinite string a_I,

ε(k) = Prob_{x ∈ U_k}[ I(1^k, f(x), a_I) ∈ f^{−1}(f(x)) ]

is negligible.

The existence of a function f satisfying the definition above turns out to be equivalent to the existence of a function f' satisfying the apparently weaker condition that any inverting algorithm should fail to invert f' a significant fraction of the time — more precisely, that there exists a constant c such that for any inverting algorithm I and any infinite string a_I,

Δ(k) = Prob_{x ∈ U_k}[ I(1^k, f'(x), a_I) ∉ f'^{−1}(f'(x)) ]

is at least k^{−c} for all sufficiently large k [Ya82b]. Similarly, it is also equivalent to allow the inverting algorithm to be a probabilistic polynomial-time algorithm; the probability that I successfully inverts is now taken over the inverting algorithm's coins as well as over x ∈ U_k. A theorem mentioned previously is the following:

Theorem 3.2.4 ([ILL89, Ha90]) A pseudorandom generator exists if and only if there exists a one-way function. ■

As a simple example of the results cited, consider the multiplication function f(x_1 x_2) = x_1 · x_2, where |x_1| = |x_2| + δ, δ ∈ {0,1}, and x_1 · x_2 is the product of x_1 and x_2, treated as binary numbers. Suppose that every polynomial-time algorithm fails a fraction n^{−10} of the time (for big enough n) to split a number x into numbers x_1 and x_2 with x_1 x_2 = x and |x_1| = |x_2| + δ, δ ∈ {0,1}, when x is the product of random numbers of lengths varying by at most one. Then there exists a pseudorandom generator stretching length-k strings to length 2nk + 2 strings, where n is any fixed polynomial in k.

3.3 High-Level Description

We have said that constant-round secure function evaluation is achieved by efficiently issuing to each player a garbled circuit and a set of garbled inputs at which to evaluate this circuit. This section informally describes what a garbled program looks like, how it is evaluated, why it is plausible that it can be computed quickly, and why it might be issued to the players without compromising their private inputs. The next section more formally describes how to collaboratively compute a function f given a protocol for computing some related function f̄, instead.

The set-up. The players want to collaboratively evaluate some function. This function must be represented somehow; we represent it by a Boolean circuit C. We assume that C is made up of only two-input gates. Though the gates have bounded fan-in, they may have unbounded fan-out. Each player knows the input bits along some of the input wires to C — namely, player i, who possesses private input x_i, knows the |x_i| bits along the input wires which take x_i. The community wants to learn the bits induced along the output wires.

Figure 3.1 is an example of a circuit three players may want to collaboratively evaluate. The circuit has two gates and five wires. Players 1, 2, and 3 provide the bits x_1, x_2, and x_3 along wires 1, 2, and 3, respectively. Thus the players are trying to evaluate the function f(x_1, x_2, x_3) = x_1 x_2 ∨ x_3. We will suppose that x_1 = 0, x_2 = 1, and x_3 = 1, so the players should compute the bit 1. (In this example, each player provides just a single bit, and there is only a single bit of output. But neither of these facts will be relevant to the method we describe.)

Figure 3.1: A circuit for three players to distributively compute x_1 x_2 ∨ x_3.

Evaluating an (ungarbled) circuit. How would a circuit C normally be evaluated? Here is one way to describe it. See Figure 3.2. In the circuit C, each wire carries one of two possible signals — the formal string 0 or the formal string 1. The two possible signals are the same for each wire, and everyone knows what the two signals associated to a wire are. Each signal has a corresponding, publicly-known semantics: the 0-signal means that the wire has semantics 0 (or "false"), while the 1-signal means that the wire has semantics 1 (or "true").

If you know the signals along the incoming wires of a gate, you can mechanically propagate a signal to the out-going wire of the gate. For example, an AND gate with incoming signals of 0 and 1 gets an out-going signal of 0. In this way, signals propagate up the circuit, from input wires to output wires, eventually defining signals for all output wires. Since the signals have a fixed, known semantics, the circuit has then been evaluated.

Figure 3.2: The circuit and its input. Each wire carries one of two signals, 0 or 1, with associated semantics 0 ↔ 0 and 1 ↔ 1. To compute, signals are propagated mechanically up the circuit.
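Concretely, this mechanical propagation is nothing more than the following. The sketch is ours, with the wire and gate numbering of Figure 3.1:

```python
# Ungarbled evaluation of f(x1,x2,x3) = (x1 AND x2) OR x3 by signal propagation.
# Wires 1..3 are inputs; gate 1 takes wires (1,2) to wire 4; gate 2 takes (4,3) to 5.

def evaluate(x1, x2, x3):
    signal = {1: x1, 2: x2, 3: x3}        # publicly-known 0/1 signals
    signal[4] = signal[1] & signal[2]     # propagate across gate 1
    signal[5] = signal[4] | signal[3]     # propagate across gate 2
    return signal[5]                      # the output wire's signal

assert evaluate(0, 1, 1) == 1             # the running example
```

The garbled evaluation described next keeps exactly this propagation structure, replacing the public 0/1 signals with long secret strings.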
Evaluating a garbled circuit. Evaluating a garbled circuit is not so very different. Once again, there will be wires, gates, and signals, and these will mirror the structure of the corresponding ungarbled circuit. See Figure 3.3, which depicts a garbled circuit and information related to it.

Wires will still carry one of two signals — but this time, the signals will not be the strings 0 and 1. Instead, each wire ω will have longer strings associated to it, signals s^ω_0 and s^ω_1. These will be random strings of length nk + 1 — random except that signal s^ω_0 will always end in a 0, and signal s^ω_1 will always end in a 1. Before, the signals associated with a wire were publicly known, and the same two signals were associated with each wire. Now, the signals associated with a wire will be secret, and they will vary from wire to wire.

Before, the two signals had a fixed semantics, 0 meaning 0 (false) and 1 meaning 1 (true). Now, the signals will have a secret, "hidden" semantics: s^ω_0 having associated semantics λ^ω, and s^ω_1 having associated semantics λ̄^ω. For wires other than the output wires, this semantics is random, and is not known by anybody. In Figure 3.3, the signals s^1_0 and s^1_1 have been given the semantics 0 and 1, respectively, while s^2_0 and s^2_1 have semantics 1 and 0, respectively.

Just as evaluating an ungarbled circuit consists of learning the correct signal for each wire by mechanically propagating signals up the circuit, evaluating a garbled circuit consists of learning one of the two signals associated with each wire, and propagating these signals up the circuit. Initially, you will hold (that is, you will "know") one of the two possible signals for each input wire — you will hold the signal with the correct semantics for this input wire. Holding a signal s^ω_b for a wire ω will correspond to that wire having semantics λ^ω ⊕ b. Given the two incoming signals for a gate, a method will be provided allowing you to learn the correct out-going signal for that gate. For example, in Figure 3.3, knowing s^1_0 and s^2_0 "means" that the left and right incoming wires to Gate 1 carry 0 and 1, respectively. Consequently, in evaluating this garbled gate, the players should learn the signal s^4_1, since this is the signal for the out-going wire which carries the semantics of 0 ∧ 1 = 0.
As mentioned, the s^ω_0 and s^ω_1 signals are concocted to be differentiable: we asserted that the last bit of s^ω_0 is a 0 and the last bit of s^ω_1 is a 1. Thus these signals are referred to as the even and odd signals, respectively. If you know a signal for a wire, you know whether you possess the even or the odd signal for the wire, but this tells you nothing about the underlying semantics of the wire, because this was chosen at random, independent of the signal being even or odd. An exception to this is made for output wires, for which we want the semantics to be public. We simply assert that the even signals for these wires have semantics 0, and the odd signals have semantics 1. Now when a player has computed the signals for the output wires of a circuit, he has also computed the "semantics" of the circuit's output.

Figure 3.3: A sample garbled circuit (the eight gate labels) and garbled input (the three signals that are fed in along the input wires). Also shown are the two signals (secretly) associated with each wire, and their corresponding (secret) semantics.

In evaluating the circuit of Figure 3.3, each player initially knows s^1_0, s^2_0, and s^3_1. The first two of these are combined and — somehow — each player learns s^4_1. Then s^4_1 and s^3_1 are combined and — somehow — each player learns s^5_1. Since this signal belongs to an output wire and is odd, the circuit evaluates to 1.

How signals propagate across a gate. What, exactly, is this mechanism that allows a player, given knowledge of two incoming signals to a gate, to compute the correct out-going signal for the gate? Each gate g of the garbled circuit provides "help" for accomplishing this task. The help is in the form of a table of four strings of length nk + 1. Each of these strings is called a gate label. The garbled circuit is the collection of all of the gate labels for all of the gates. The four gate labels associated with gate g are written A^g_{00}, A^g_{01}, A^g_{10}, and A^g_{11}. If you possess two even incoming signals for gate g, then A^g_{00} allows you to compute the correct out-going signal; if you possess an even signal for the left incoming wire of gate g and an odd signal for its right incoming wire, then A^g_{01} lets you recover the out-going signal; and so forth.

We now describe how the out-going signal is computed from the incoming signals and the gate labels. The signals for a wire are not treated atomically. Rather, each signal is considered as consisting of n strings of length k, together with one more bit that specifies whether this signal is even or odd. Figure 3.4 shows the makeup of the signals for Gate 2 of Figure 3.3, together with their semantics.

Figure 3.4: A closer look at the signals associated with the wires of Gate 2, and their corresponding semantics. Each signal consists of n length-k strings, and one additional bit.

Each of the n "pieces" of each incoming signal serves as the seed of a pseudorandom generator G stretching k bits to 2(n̄k + 1) bits, where n̄ is a fixed polynomial in k which bounds the number of players, n, in terms of the security parameter used by the algorithm, k. (For concreteness, we will later select n̄ = k^10, quite arbitrarily.)
Define G'_0 and G'_1 by G(s) = G'_0(s) G'_1(s), where |G'_0(s)| = |G'_1(s)| = n̄k + 1, and set G_0(s) = G'_0(s)[1 : nk + 1] and G_1(s) = G'_1(s)[1 : nk + 1]. See Figure 3.5.

Figure 3.5: The pseudorandom generator G, stretching k bits to 2n̄k + 2 bits, for n̄ = n̄(k) a fixed polynomial in k. The map G defines G'_0(s) = G(s)[1 : n̄k + 1] and G'_1(s) = G(s)[n̄k + 2 : 2n̄k + 2].

In evaluating the garbled circuit, for each wire incoming to some gate g, either G_0 or G_1 is computed at each of the n pieces of the incoming signal carried along that wire. To illustrate, suppose that you possess two even signals coming into a gate g. That is, both of the incoming signals are strings of length nk + 1 which end in 0. Then each of the two signals, minus the last bit, is dissected into n k-bit-long strings. Apply G_0 to each of these 2n strings to get 2n strings of length nk + 1. The bitwise exclusive-or of all of these images, also exclusive-or'ed with A^g_{00}, is the desired out-going signal for gate g.

More generally (had the signals not both been even), the out-going signal is computed as follows. If the left incoming signal to gate g has parity a and prefix σ_1 ⋯ σ_n (each σ_i of length k); and the right incoming signal to gate g has parity b and prefix τ_1 ⋯ τ_n (each τ_i of length k); and the gate labels for this gate are A^g_{00}, A^g_{01}, A^g_{10}, A^g_{11}; then the out-going signal is defined to be

G_b(σ_1) ⊕ ⋯ ⊕ G_b(σ_n) ⊕ G_a(τ_1) ⊕ ⋯ ⊕ G_a(τ_n) ⊕ A^g_{ab}.

This is illustrated in Figure 3.6.

Figure 3.6: How the out-going signal for a gate is determined from the two incoming signals.

The string A^g_{00} can be thought of as the encryption of the out-going signal with the correct semantics given that both incoming signals were even. To decrypt this encryption, you must hold both of the two even incoming signals. More generally, the convention on the signals being even and odd lets you know which of the four gate labels is meaningfully decrypted given the "secret key" (σ_1 ⋯ σ_n a, τ_1 ⋯ τ_n b).

We have specified what the garbled circuit looks like: it is the collection of A^g_{ab}-values. The garbled input is the collection of those signals which carry the correct semantics for each input wire of the circuit. In Figure 3.3, the garbled input consists of the strings s^1_0 s^2_0 s^3_1, while the garbled circuit is A^1_{00} A^1_{01} A^1_{10} A^1_{11} A^2_{00} A^2_{01} A^2_{10} A^2_{11}. The garbled circuit together with its garbled input is called the garbled program. We have now described how to evaluate a garbled program, and how to understand the result.
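In code, the propagation rule of Figure 3.6 is a handful of exclusive-ors. The sketch below is our illustration: byte strings stand in for bit strings, G0 and G1 are assumed to return (nk+1)-byte strings, and signals are laid out as n seed pieces of k bytes followed by one parity byte.

```python
def xor(*strings):
    """Bitwise XOR of equal-length byte strings."""
    out = bytearray(len(strings[0]))
    for s in strings:
        for j, byte in enumerate(s):
            out[j] ^= byte
    return bytes(out)

def propagate(G0, G1, left, right, labels, n, k):
    """Compute the out-going signal from two incoming signals.

    left, right: incoming signals, each n seed pieces of k bytes plus a
    trailing parity byte.  labels: dict {(a, b): A_ab} of the four gate
    labels.  Mirrors the displayed rule: G_b on the left pieces, G_a on
    the right pieces, XORed with the gate label A_ab.
    """
    a, b = left[-1], right[-1]                        # parities of the signals
    pieces = [G1(left[j*k:(j+1)*k]) if b else G0(left[j*k:(j+1)*k])
              for j in range(n)]                      # G_b on each left piece
    pieces += [G1(right[j*k:(j+1)*k]) if a else G0(right[j*k:(j+1)*k])
               for j in range(n)]                     # G_a on each right piece
    return xor(*pieces, labels[(a, b)])               # XOR in the gate label
```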
Why can a garbled program be found quickly? For any choice of the even and odd signals of length nk + 1 for the wires of the circuit C, and for any semantic assignment ω ↦ λ^ω for these signals, the gate labels for each gate are well defined so that the gate computes correctly with respect to the assigned semantics. For example, in Figure 3.4, we must arrange that

A^2_{00} = G_0(s^4_{10}) ⊕ ⋯ ⊕ G_0(s^4_{n0}) ⊕ G_0(s^3_{10}) ⊕ ⋯ ⊕ G_0(s^3_{n0}) ⊕ s^5_{(λ^4 ∨ λ^3) ⊕ λ^5}

for the garbled Gate 2 to compute correctly on two even incoming signals. Though in this section we will not write out the general expression for what A^g_{ab} should be, it is clearly a simple function of the signals and semantics for the three wires that touch gate g. In particular, the gate labels depend only on local information. This means that in calculating the garbled circuit, all the gate labels can be calculated in parallel.

Since signals and their semantics are selected randomly, the calculation of the garbled circuit completely parallelizes. This will mean that while the communication complexity for computing the garbled circuit depends on the size of the circuit C, the number of rounds will not. Thus, to achieve constant rounds, the key is to ensure that the calculation of a single set of gate labels can be accomplished in constant rounds.

It is in order to accomplish this that we have treated signals as consisting of n separate seeds to the generator G. In constructing the garbled circuit, each player will be responsible for computing G on "his own" portion of each signal. That is, each player i will be required to locally compute the image under G of the i-th piece of each signal; and he will enter this information into the collaborative computation. For example, in Figure 3.3, player i — if honest — selects ten random length-k strings s^ω_{iβ}, where ω ∈ [1..5], β ∈ {0,1}, applies G to each of these, and enters all of this data into the collaborative computation. Applying the generator G to the strings may be time-consuming — but this computation is done locally, not distributively. It is in this way that the brunt of the scrambling operation is performed locally.

Given the images of the seeds under the generator and the λ^ω-values specifying their semantics, the calculation necessary to compute the gate labels is fixed and straightforward; it is not surprising that this can be performed efficiently.

Why is the protocol private? The technical answer — that "the simulation goes through" — is not particularly satisfying! We try to give some intuition why the strategy outlined might be private. The intent is that knowledge of two incoming signals for a gate gives rise to knowledge of the correct out-going signal, and nothing else that is meaningful. So if you are magically handed a garbled circuit and a garbled input to feed to it, you should learn only a randomly selected signal for each wire that is not an output wire. For each output wire, you should learn a random signal of the "correct" semantics. Apart from the output wires, knowledge of a single signal for a wire should have no discernible meaning.

In some sense, the secure collaborative computation of the garbled program does amount to magically being handed the garbled circuit and the garbled input — except that the signals are not entirely secret, since each player knows one piece of each signal (the piece that player "was responsible for"). The main concern, then, is that the garbled program might divulge extractable information, even if taken together with those pieces of signals a bad player might already know about. That is, as a dishonest player, you know one complete set of signals for the input wires (from knowing the garbled input), and you know those pieces of various other signals that your dishonest friends have told you about. But, since there is some honest player who has not told you the seeds he was responsible for, you are denied knowledge of some piece of each incoming signal that is not a garbled input. Intuitively, in order for one of the gate labels to be meaningful, you must not be denied knowledge of any of the 2n seeds whose images under the generator are exclusive-or'ed to decrypt the out-going signal. In fact, replacing a gate label for which you lack a seed required to "decrypt" that gate label with a truly random gate label does not give rise to a noticeable change in distributions. (Of course, we will be proving this.)
Three of the four gate labels for each gate will be like this, and so all of these can be exchanged with truly random strings without a noticeable change in distributions. But this new distribution — a distribution on "fake" garbled circuits — is easily generable. So, releasing a garbled circuit reveals no significant information, insofar as an indistinguishable "fake" garbled circuit can be released if one knows only what the garbled circuit ought to compute.

In the next section, we repeat the protocol more formally, without attempting to provide additional intuition.

3.4 The Protocol

Notation. We fix notation to be used throughout the remainder of the thesis. Let G denote a pseudorandom generator that stretches a k-bit string s to a 2(n̄k + 1)-bit string G(s), where n̄ = k^10 bounds the actual number of players, n, in terms of the security parameter, k, which will be used by the protocol. (The constant 10 is an arbitrary constant.) Let G_0 and G_1 be given by G'_0(s) = G(s)[1 : n̄k + 1] and G'_1(s) = G(s)[n̄k + 2 : 2n̄k + 2], with G_0 = G'_0[1 : nk + 1] and G_1 = G'_1[1 : nk + 1].

We consider the collaborative computation of a string-valued function f_c : (Σ^ℓ)^n → Σ^m. This function is represented by a circuit C_c : (Σ^ℓ)^n → Σ^m, with f_c(x_1 ⋯ x_n) = C_c(x_1 ⋯ x_n). A description of this circuit appears on the common input c = 1^{k'}#1^n#1^t#1^ℓ#1^m#C_c. The circuit C_c has T gates, which are labeled 1, 2, …, T, and it has W wires, which are labeled 1, 2, …, W. The gates are taken to be two-input gates of arbitrary functionality and arbitrary fan-out. Each gate has a distinguished left and right incoming wire. The numbering of wires is chosen so that the input wires are wires 1, …, nℓ, and the output wires are wires W − m + 1, …, W. The j-th bit of private input x_i appears along input wire ℓ(i − 1) + j, and the j-th bit of the output y appears along output wire W − m + j.

In the description of the protocol that follows, the index i ranges over the players, i ∈ [1..n]. The index ω ranges over the wires, ω ∈ [1..W], or — when indicated — some subset of these wires (like the input wires). The index g ranges over the gates, g ∈ [1..T].

Comments on the protocol. The following remarks and terminology will be used in the proof presented in Chapter 4, or are otherwise useful in understanding the protocol.

• When run on common input c = 1^{k'}#1^n#1^t#1^ℓ#1^m#C, with k' as the security parameter specified by c, the "true" security parameter k used by the players in the protocol is not necessarily k' — because we insist that k be at least a polynomial fraction of |c|. This is necessary because the adversary is given time polynomial in |c|, rather than time polynomial in k. We enforce this requirement by saying that k = max{k', |c|^{1/10}}, where 10 is an arbitrary absolute constant. Because of this convention, n is bounded above by a fixed polynomial in k: n ≤ n̄(k) = k^10.

• The garbled circuit is the collection of gate labels issued to the players, A^g_{ab} ∈ Σ^{nk+1}. The garbled input is the collection of signals σ^ω ∈ Σ^{nk+1} issued to the players, one for each input wire ω. The garbled program is the garbled circuit together with the garbled input. Thus a garbled program looks like

ȳ = A^1_{00} A^1_{01} A^1_{10} A^1_{11} ⋯ A^T_{00} A^T_{01} A^T_{10} A^T_{11} σ^1 ⋯ σ^{nℓ},

where l̄ = |ȳ| = (4T + nℓ)(nk + 1).

• The mapping specified by Step 1, from player inputs x⃗ and player random coins r⃗ to the garbled program ȳ, is referred to as the function f̄. The function f̄ can be considered as a map from (Σ^{ℓ̄})^n to Σ^{l̄}, for ℓ̄ = ℓ + ρ, with ρ = 2kW + W − m.
Of course f̄ is actually a function family; it depends on the common input.

• The mapping specified by Step 2 of the protocol, from the garbled program ȳ ∈ Σ^{l̄} to the string y ∈ Σ^m to which it evaluates, is called the evaluation function, Eval. We write y = Eval(c, ȳ).

• In the evaluation of a garbled program ȳ, the collection of W signals which come to be held are called the on-path signals. The other W signals are the off-path signals. The collection of T gate labels which are used for computing the on-path signals — one A^g_{ab} for each gate g — are called the on-path gate labels. The other 3T gate labels are called the off-path gate labels. Evaluating a garbled program entails learning the on-path signals and outputting the parity of each on-path signal for each output wire, in the proper order.

Step 1: Collaboratively compute the garbled program ȳ.

Common Input: The shared string c = 1^{k'}#1^n#1^t#1^ℓ#1^m#C.
Private Input: Each player i has private input x_i ∈ Σ^ℓ.
Coin Tosses: Each player i uses random coins r_i ∈ Σ^ρ.
Compute: Players compute gate labels A^g_{00}, A^g_{01}, A^g_{10}, A^g_{11} ∈ Σ^{nk+1} and input signals σ^ω ∈ Σ^{nk+1}, for g ∈ [1..T] and ω ∈ [1..nℓ].

Let k = max{k', |c|^{1/10}}. The parties information-theoretically ⌊(n − 1)/2⌋-securely compute (with security parameter k) gate labels and input signals, defined as follows:

(a) (i) Each string r_i defines length-k strings s^1_{i0}, s^1_{i1}, …, s^W_{i0}, s^W_{i1} and bits λ^1_i, …, λ^{W−m}_i by asserting that r_i = s^1_{i0} s^1_{i1} ⋯ s^W_{i0} s^W_{i1} λ^1_i ⋯ λ^{W−m}_i.

(ii) The private inputs x_1, …, x_n define the bits b^1, …, b^{nℓ} associated with the input wires, according to b^1 ⋯ b^{nℓ} = x_1 ⋯ x_n.

(iii) For b ∈ {0,1}, let s^ω_b = s^ω_{1b} ⋯ s^ω_{nb} b.
Comment: s^ω_0 and s^ω_1 are the even and odd signals associated to wire ω.

(iv) Define λ^ω = λ^ω_1 ⊕ ⋯ ⊕ λ^ω_n for 1 ≤ ω ≤ W − m, and λ^ω = 0 for W − m + 1 ≤ ω ≤ W.
Comment: λ^ω is the semantics of signal s^ω_0, and λ̄^ω is the semantics of signal s^ω_1.

(b) For each input wire ω of the circuit, ω ∈ [1..nℓ], the string σ^ω is given by

σ^ω = s^ω_{b^ω ⊕ λ^ω}.    (3.2)

Comment: That is, σ^ω = s^ω_0 if b^ω = λ^ω, and σ^ω = s^ω_1 if b^ω ≠ λ^ω. In general, s^ω_{b ⊕ λ^ω} is the signal for wire ω which carries semantics b.

(c) For each gate g of the circuit, g ∈ [1..T], the strings A^g_{00}, A^g_{01}, A^g_{10} and A^g_{11} are defined as follows: let ⊗ denote the function computed by gate g, and suppose the left wire is wire α, the right wire is wire β, and the output wire is wire γ, for α, β, γ ∈ [1..W]. Then, for a, b ∈ {0,1}, define the gate label A^g_{ab} by

A^g_{ab} = G_b(s^α_{1a}) ⊕ ⋯ ⊕ G_b(s^α_{na}) ⊕ G_a(s^β_{1b}) ⊕ ⋯ ⊕ G_a(s^β_{nb}) ⊕ s^γ_{[(λ^α ⊕ a) ⊗ (λ^β ⊕ b)] ⊕ λ^γ}.    (3.3)

Comment: λ^α ⊕ a is the semantics of the left wire if you possess a parity-a signal for the left wire, and λ^β ⊕ b is the semantics of the right wire if you possess a parity-b signal for the right wire. In this case, the gate should output the signal of semantics (λ^α ⊕ a) ⊗ (λ^β ⊕ b), which is signal s^γ_{[(λ^α ⊕ a) ⊗ (λ^β ⊕ b)] ⊕ λ^γ}.

Figure 3.7: Step 1: the parties collaboratively compute the garbled program ȳ.
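Equation (3.3) is equally mechanical once the seeds and the λ-values are in hand. Here is a sketch of one gate's four labels; it is our illustration, reusing the xor helper from the earlier sketch, with byte strings standing in for bit strings:

```python
def gate_labels(G0, G1, op, s_alpha, s_beta, s_gamma, lam, n, k):
    """Compute the four labels of one gate, mirroring equation (3.3).

    s_alpha[b], s_beta[b], s_gamma[b]: lists of the n seed pieces (k bytes
    each) of the parity-b signal of the left, right, and out-going wire.
    lam: dict of the secret semantics lambda of the three wires.
    op: the gate's two-bit function, e.g. lambda u, v: u & v.
    """
    labels = {}
    for a in (0, 1):
        for b in (0, 1):
            sem = op(lam['alpha'] ^ a, lam['beta'] ^ b)   # semantics to propagate
            c = sem ^ lam['gamma']                        # parity of out-going signal
            signal = b''.join(s_gamma[c]) + bytes([c])    # s^gamma carrying sem
            parts = [G1(p) if b else G0(p) for p in s_alpha[a]]   # G_b, left seeds
            parts += [G1(p) if a else G0(p) for p in s_beta[b]]   # G_a, right seeds
            labels[(a, b)] = xor(*parts, signal)
    return labels
```

Nothing here depends on any wire other than the three touching the gate, which is exactly the locality that lets all gate labels be computed in parallel.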
Step 2: Locally evaluate the garbled program ȳ, computing y = Eval(c, ȳ).

Common Input: The shared string c = 1^{k'}#1^n#1^t#1^ℓ#1^m#C.
Input: A garbled program ȳ, i.e., gate labels A^g_{00}, A^g_{01}, A^g_{10}, A^g_{11} ∈ Σ^{nk+1} and input signals σ^ω ∈ Σ^{nk+1}, for g ∈ [1..T] and ω ∈ [1..nℓ].
Output: The string y ∈ Σ^m that this garbled program "evaluates to."

Let k = max{k', |c|^{1/10}}. Each player i behaves as follows:

Initially, player i is said to hold the signal σ^ω = σ^ω_1 ⋯ σ^ω_n lsb(σ^ω) for each input wire ω, where |σ^ω_1| = ⋯ = |σ^ω_n| = k.

Consider a gate g, having left input wire α, right input wire β, and output wire γ, with α, β, γ ∈ [1..W]. Suppose inductively that player i holds the signal σ^α = σ^α_1 ⋯ σ^α_n a for wire α, and the signal σ^β = σ^β_1 ⋯ σ^β_n b for wire β, where |σ^α_1| = ⋯ = |σ^α_n| = |σ^β_1| = ⋯ = |σ^β_n| = k and a, b ∈ {0,1}. Then player i computes and is said to hold the signal

σ^γ = G_b(σ^α_1) ⊕ ⋯ ⊕ G_b(σ^α_n) ⊕ G_a(σ^β_1) ⊕ ⋯ ⊕ G_a(σ^β_n) ⊕ A^g_{ab}    (3.4)

for wire γ. In this way, σ^ω-values are computed for each wire ω of the circuit. When a value σ^ω is held for each wire ω of the circuit, each player i outputs

y = lsb(σ^{W−m+1}) ⋯ lsb(σ^W)    (3.5)

as his private output.

Figure 3.8: Step 2, in which the players, on their own, evaluate the garbled program ȳ.

Chapter 4

Proving the Protocol Secure

4.1 Introduction

The protocol described in Chapter 3 is not specified fully, because we have not said how to implement the secure function evaluation on which it rests. All we have specified is what we want to collaboratively compute, and how a computation of this is used to define each player's private output. We will not remedy this; in fact, we will exploit it. Our strategy has been to abstract out the specifics of the already very complicated information-theoretically secure collaborative computation, and argue that — whatever its implementation — if it securely computes the function f̄ that it is designed to compute, then the protocol as a whole securely computes the function f that it was designed to compute.

Here, precisely, is what we will assume: that existing protocols — those of [BGW88, CCD88, RB89] — can be used as the basis for proving Theorem 4.1.1, below. We do not say what particular protocol is used — though, for concreteness, we state the theorem with the bound on fault-tolerance of the [RB89] protocol, t = ⌊(n − 1)/2⌋. If a protocol with a weaker bound on fault-tolerance is used, this bound is simply mirrored as the fault-tolerance in Theorem 4.3.1. If an error-free protocol is employed as the basis of Theorem 4.1.1, there will, correspondingly, be no error in the correctness constraint for the complexity-theoretic constant-round protocol.

Theorem 4.1.1 Let f = {f_c} be a string-valued function family, in which each f_c : (Σ^{ℓ_c})^{n_c} → Σ^{m_c} is described by a circuit C_c, with f_c(x_1 ⋯ x_{n_c}) = C_c(g_1(x_1) ⋯ g_{n_c}(x_{n_c})), where g : Σ* × N → Σ* is polynomial-time computable and {C_c} is a family of constant-depth circuits, each containing only two-input NAND gates and unbounded fan-in XOR gates. Then, for some absolute constant R, there is a polynomial-time, R-round protocol P which information-theoretically t-securely computes f, where t = ⌊(n − 1)/2⌋. ■

More general statements than the one above are possible; this one is tailored to our specific needs. Theorem 4.1.1 says something more general than that a function can be information-theoretically securely computed in constant rounds if it has constant-depth circuits. Rather, it says that this is possible if each player need only apply a polynomial-time function to his private input, and then the result of this computation is the input to a collaborative computation of a function having constant-depth circuits. The intuition why Theorem 4.1.1 holds is the following.
Secure function evaluation using the gate-by-gate approach was sketched in Chapter 1, and, for constant-depth circuits, protocols employing this approach require a constant number of rounds. Furthermore, because of the specifics of the mechanisms for secure function evaluation developed in [BGW88, CCD88, RB89], not only can the shared out-going bit for a bounded fan-in gate be computed from the shared incoming bits in a constant number of rounds, but this is possible for unbounded fan-in XOR gates as well.¹ For the function g that each player is "asked" to contribute an image under, each player shares the preimage as well as the image. He then "proves" to the community that the shared image was properly computed from the shared preimage. Though this proof will use an amount of communication which grows with the depth of a circuit for computing g, it requires only a fixed number of rounds.

¹Alternatively, the result of Bar-Ilan and Beaver [BB89] allows any function with log-depth circuits to be securely evaluated in a constant number of rounds. The unbounded fan-in XOR gates could thus be replaced by a complete binary tree of bounded fan-in XOR gates, and then this result applied.

How does Theorem 4.1.1 apply to us? Basically, the function we collaboratively evaluate in Step 1 of the protocol of Chapter 3 does indeed factor into a "hard" part, g⃗ — which players evaluate locally — and an "easy" part, C̄ — which the players collaboratively evaluate. Details specifying this decomposition are given in the proof of Theorem 4.3.1.

Since the round complexity for computing the output y in our protocol is precisely the round complexity for computing the garbled program ȳ, and since the local computational complexity of our protocol as a whole is within a polynomial factor of the local computational complexity for just computing ȳ, Theorem 4.1.1 assures us (after setting up the appropriate language) of having specified in Chapter 3 a constant-round, polynomial-time protocol, P. However, showing that P t-securely computes f is not a small task. In Section 4.3 we do this. First, in Section 4.2, we establish some preliminaries needed for the argument.

4.2 Preliminaries

In this section we collect some facts which will be useful in proving our main result. Perhaps the most basic fact we require is that indistinguishability of ensembles defines an equivalence relation. In particular, indistinguishability is transitive (reflexivity and symmetry are obvious).

Proposition 4.2.1 Suppose R and S are computationally indistinguishable ensembles, and suppose S and T are computationally indistinguishable ensembles. Then R and T are computationally indistinguishable ensembles.

Proof: The proof is a gentle introduction to a standard cryptographic argument. Suppose for contradiction that R = ℰR_k(ω) is computationally distinguishable from T = ℰT_k(ω), as ensembles over a common parameter set L = {L_k}. Then there is a polynomial-time distinguisher D^a, a polynomial Q, and a collection of strings {ω_k ∈ L_k}, such that

|E D(1^k, R_k(ω_k), ω_k, a) − E D(1^k, T_k(ω_k), ω_k, a)| > 1 / Q(k)

for infinitely many k ∈ N. By the triangle inequality, for each such k, either

|E D(1^k, R_k(ω_k), ω_k, a) − E D(1^k, S_k(ω_k), ω_k, a)| > 1 / 2Q(k)

or

|E D(1^k, S_k(ω_k), ω_k, a) − E D(1^k, T_k(ω_k), ω_k, a)| > 1 / 2Q(k)

(or both). Thus either there is an infinite collection of k ∈ N for which the former holds, or there is an infinite collection of k ∈ N for which the latter holds.
In the former case, R is distinguishable from S, and in the latter case, S is distinguishable from T. This contradicts the proposition's assumption. ■

Notation. The notation in the proof above, with all four arguments to D, gets a bit tiresome. So we will sometimes simplify expressions such as D(1^k, R_k(ω_k), ω_k, a) to D(R_k(ω)).

If you cannot distinguish ensembles R and S based on a single sample, then you cannot distinguish R and S based on many sample points, either. (Recall that we are only dealing with nonuniform indistinguishability.) The proof is implicit in Yao [Ya82b], and follows the now-standard "probability walk" argument, which we shall use again in the proof of Theorem 4.3.1.

Proposition 4.2.2 If ensembles R and S are computationally indistinguishable over L = {L_k}, then, for any polynomial Q(k), R^{Q(k)} and S^{Q(k)} are computationally indistinguishable.

Proof: Assume for contradiction that R^{Q(k)} and S^{Q(k)} are computationally distinguishable. Then there is a distinguisher D^a, a polynomial q, an infinite set K, and a collection of strings {ω_k} such that

|E D(R_k^{Q(k)}(ω_k)) − E D(S_k^{Q(k)}(ω_k))| > 1 / q(k)

for all k ∈ K. We use D and a to construct a distinguisher D̂^â for distinguishing R from S. For i ∈ [0..Q(k)] and ω ∈ L_k, define T^i_k(ω) = R_k(ω)^i S_k(ω)^{Q(k)−i}. Note that T^0_k(ω) = S_k(ω)^{Q(k)}, and T^{Q(k)}_k(ω) = R_k(ω)^{Q(k)}. By the triangle inequality, for each k ∈ K there must exist a g_k ∈ [1..Q(k)] such that

ν(k) = |E D(T^{g_k}_k(ω_k)) − E D(T^{g_k−1}_k(ω_k))| > 1 / Q(k)q(k).

(Informally, we have taken a "probability walk" between R_k(ω_k)^{Q(k)} and S_k(ω_k)^{Q(k)}, using the T^i_k(ω_k) to specify a set of steps that take you from the one space to the other. Since the two endpoints have significantly different expectations under D, some step in between must have associated to it a nonnegligible "jump" in the induced expectations.)

Consider a "distinguisher" D_{r⃗,s⃗}, where r⃗ = (r^1, …, r^{g_k−1}) and s⃗ = (s^{g_k+1}, …, s^{Q(k)}) are vectors of strings. This distinguisher behaves as follows: it takes a sample x (which can be thought of as being either R_k(ω_k)-distributed or S_k(ω_k)-distributed), and it computes D(1^k, (r^1 ⋯ r^{g_k−1} x s^{g_k+1} ⋯ s^{Q(k)}), ω_k, a). Observe that if each r^i is drawn according to R_k(ω_k), and each s^j is drawn according to S_k(ω_k), then

ν(k) = |E_{r⃗,s⃗} E D_{r⃗,s⃗}(R_k(ω_k)) − E_{r⃗,s⃗} E D_{r⃗,s⃗}(S_k(ω_k))| ≤ E_{r⃗,s⃗} |E D_{r⃗,s⃗}(R_k(ω_k)) − E D_{r⃗,s⃗}(S_k(ω_k))|.

Since the average of a collection of quantities cannot exceed their maximum, the inequality above implies that there exist, for each k, particular values r̂^1, …, r̂^{g_k−1} and ŝ^{g_k+1}, …, ŝ^{Q(k)} such that

ν(k) ≤ |E D_{r̂⃗,ŝ⃗}(R_k(ω_k)) − E D_{r̂⃗,ŝ⃗}(S_k(ω_k))|.

Letting â reasonably encode the {g_k}-values, these {r̂⃗, ŝ⃗}-values, and the infinite string a, we have constructed a distinguisher D̂^â which, along the infinite set K, distinguishes R and S by at least the inverse polynomial 1 / q(k)Q(k). This is a contradiction. ■
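The "probability walk" is just a telescoping use of the triangle inequality; rendered as a display (a standard restatement, not verbatim from the thesis):

```latex
\frac{1}{q(k)} \;<\;
\Bigl|\,\mathrm{E}\,D\bigl(T_k^{Q(k)}(\omega_k)\bigr)-\mathrm{E}\,D\bigl(T_k^{0}(\omega_k)\bigr)\Bigr|
\;\le\;
\sum_{i=1}^{Q(k)}
\Bigl|\,\mathrm{E}\,D\bigl(T_k^{i}(\omega_k)\bigr)-\mathrm{E}\,D\bigl(T_k^{i-1}(\omega_k)\bigr)\Bigr|,
```

so at least one of the Q(k) terms exceeds 1 / Q(k)q(k), and that term is what determines g_k.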
We draw the following immediate conclusion to Proposition 4.2.2, where n̄ = k^10 (or any other polynomial in k):

Lemma 4.2.3 Let G : Σ^k → Σ^{2n̄k+2} be a pseudorandom generator, and let Q(k) be a polynomial. Then the k-indexed ensembles ℰU_{(2n̄k+2)Q(k)} and ℰ[G(U_k)]^{Q(k)} are computationally indistinguishable. ■

4.3 Proof of the Main Theorem

Our goal is to show the following:

Theorem 4.3.1 (Main theorem — string-valued computation) Assume a one-way function exists, and let f = {f_c} be a string-valued function family in which each f_c : (Σ^{ℓ_c})^{n_c} → Σ^{m_c} is described by a circuit C_c for computing it. Then, for some absolute constant R, there is a polynomial-time, R-round protocol P that t-securely computes f, where t = ⌊(n − 1)/2⌋.

We emphasize that the protocol P whose existence Theorem 4.3.1 asserts is a fixed protocol: in our formulation, the common input contains a description of the function the protocol distributively evaluates. Though the polynomial that bounds P's running time in terms of the length of the common input has a degree which depends on the underlying one-way function, the constant R does not depend on this.

Proof: We have described the protocol P in Chapter 3, given a protocol for information-theoretic secure function evaluation. We must establish that this protocol securely computes f. This entails exhibiting a simulator S = (S, AI, AO), and showing that the simulator "works" to establish P's privacy and correctness.

Part 1: Strategy, and exhibition of the simulator S.

A menagerie of protocols, simulators, and oracles. To describe the simulator S and prove that it works, we will introduce some extra terminology. There are sufficiently many terms that the table of Figure 4.1 may help the reader keep them straight. We begin by defining the first row of terms in this table, taking a closer look at the secure computation of Step 1.

Figure 4.1: A myriad of closely related protocols, simulators, and oracles.
P̄ — The protocol which t-securely computes f̄(x⃗r⃗).   S̄ — Simulator for this protocol.   Ō — Ō = Ō(x⃗r⃗, π⃗; f̄), the oracle for the simulator S̄.
P̂ — Player i selects r_i ← $, computes c̄ from c, then runs P̄_i with private input x_i r_i and common input c̄.   Ŝ — Same as S̄, apart from using c̄ instead of c and a syntactic modification of the strings handed over to the adversary.   Ô — Selects r⃗ ← $, then answers according to Ō(x⃗r⃗, π⃗; f̄).
P — Same as P̂, except each player outputs the evaluation of the garbled program.   S — Same as Ŝ, except it simulates the behavior of using Ô.   O — O = O(x⃗, π⃗; f), the oracle for the simulator S.
Also in play are variants of Ô which "turn off" to the component query of j, for a random j ∈ [1..n], and a variant which returns a "fake" garbled program, having random strings for 3/4 of the gate labels. Omitted from the table are the oracles Ô_i and the oracle O*. Only Ō and O are "true" oracles; the rest are probabilistic algorithms of c, x⃗, π⃗, r⃗, and the oracle's queries.

The protocol P̄. In Step 1 of the protocol, a player i gets private input x_i ∈ Σ^ℓ and common input c = 1^{k'}#1^n#1^t#1^ℓ#1^m#C, where C specifies T, W, and the topology of a circuit on W wires and T gates, obeying the conventions indicated on page 70. He computes k = max{k', |c|^{1/10}}, ρ = 2kW + W − m, ℓ̄ = ℓ + ρ, and l̄ = (4T + nℓ)(nk + 1), and flips coins r_i ∈ Σ^ρ. He then runs a protocol P̄ for secure function evaluation on private input x_i r_i ∈ Σ^{ℓ̄} and common input c̄ = 1^k#1^n#1^t#1^{ℓ̄}#1^{l̄}#C̄, where C̄ is a certain constant-depth circuit of two-input NAND gates and n-input XOR gates, a description of C̄ being efficiently computable from c. Circuit C̄ specifies the function f̄ which protocol P̄ computes by asserting that

f̄(x_1 r_1, …, x_n r_n) = C̄(g_1(c x_1 r_1), …, g_n(c x_n r_n)),

where

g_i(c x_i r_i) = x_i r_i G(s^1_{i0}) G(s^1_{i1}) ⋯ G(s^W_{i0}) G(s^W_{i1}),

with the s^ω_{ib} defined from r_i according to the equation of Step 1(a)(i) of the protocol P. The definition of C̄ can then be read off the description of Step 1 of the protocol. We assume that c is encoded within the circuit C̄, so that not only is the map c ↦ c̄ easily computed, but so too is the inverse map c̄ ↦ c.
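Player i's local contribution g_i is concrete enough to sketch. This is our illustration: byte strings stand in for bit strings, and G is any generator with the stretch fixed above.

```python
def g_i(c, x_i, r_i, G, k, W):
    """Player i's local, polynomial-time contribution to Step 1.

    Parses r_i into the 2W seeds s^omega_{i b} of Step 1(a)(i), applies
    the generator G to every seed, and concatenates everything; the
    constant-depth circuit C-bar then only combines such contributions.
    (The common input c rides along and is not used in this local
    computation; the W - m semantics bits remain inside r_i.)
    """
    seeds = [r_i[(2*w + b)*k : (2*w + b + 1)*k]     # seed s^{w+1}_{i b}
             for w in range(W) for b in (0, 1)]
    return x_i + r_i + b''.join(G(s) for s in seeds)
```

The point of the decomposition is that everything expensive (the 2W generator evaluations) happens inside g_i, locally, while C̄ stays constant-depth.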
Applying Theorem 4.1.1, Step 1 of $P$ can be performed in polynomial time and constant rounds. Since Step 2 requires polynomial time and no communication at all, protocol $P$ is a polynomial-time, constant-round protocol.

Definition of $\bar S$. Protocol $\bar P$ is an information-theoretically secure protocol for $\bar f$. Let $\bar{\mathcal{S}} = (\bar S, \overline{AI}, \overline{AO})$ be the simulator which establishes the security of $\bar P$. Simulator $\bar S$ is given access to an oracle $\bar O = \bar O(\bar x\bar r, \bar\pi; \bar f)$. It is through $\bar S$, $\overline{AI}$, and $\overline{AO}$ that $S$, $AI$, and $AO$ will be defined. Defining the functions $AI$ and $AO$ is the easy part, so we dispense with it right off.

Definition of $AI$. The adversary input function for $\mathcal{S}$ is defined as follows: $AI(c, \tau) = x'_T$ when $\overline{AI}(\bar c, \tau) = x'_Tr'_T$. That is, after adjusting the common input in the same way that $\hat P$ adjusts it for $\bar P$, the adversary input function associated to $\mathcal{S}$ just "reads off" the adversary input as the appropriate substring of $\overline{AI}$, applied to the same conversation.

Definition of $AO$. As the adversary output function for $\bar{\mathcal{S}}$ specifies a garbled program, the natural thing is to define the adversary output function for $\mathcal{S}$ to simply evaluate that garbled program: thus $AO(c, \tau)$ is defined as $\mathrm{Eval}(c, \overline{AO}(\bar c, \tau))$.

Construction of $S$, at a glance. In what way does $\bar S$ fail to be a good simulator for $P$? At a glance, $\bar S$ would seem to be a completely different beast from the simulator $S$ we wish to construct, because $\bar S$ has access to an $\bar O(\bar x\bar r, \bar\pi; \bar f)$-oracle, while we only have access to an $O(\bar x, \bar\pi; f)$-oracle. All the same, the simulator $S$ we define will behave very much like $\bar S$, except that whenever $\bar S$ would interact with its oracle $\bar O(\bar x\bar r, \bar\pi; \bar f)$, simulator $S$ will interact with its oracle $O(\bar x, \bar\pi; f)$, instead. From this, it will come up with a response of its own creation. For example, to a component query of $i$, the oracle won't simply return the pair $(x_i, \pi_i)$, but some $(x_ir_i, \pi_i)$, instead. In effect, we construct a "fake" oracle, $\tilde O$, which behaves like the "real" oracle, $\hat O$. Oracle $\tilde O$ is not really a "true" oracle at all, but rather a smart little probabilistic algorithm. This oracle substitution method is at the center of the proof described here.

Because of the complexity of the arguments involved, oracle substitutions are described as being made in several stages. The $\hat O$-oracle, though not exactly (or even statistically) simulatable with only the aid of an $O(\bar x, \bar\pi; f)$-oracle, will be shown to be very closely approximable nonetheless. In fact, our approximation $\tilde O$ to the oracle $\hat O$ will be such a good approximation that no polynomial-time test could tell with which of the two oracles it was conversing (as long as the polynomial-time test made fewer than $n$ component queries to the oracle). In particular, the view provided to some polynomial-time $t$-adversary $A$ when talking to the simulator running under $\hat O$ will be computationally indistinguishable from the view provided to $A$ when it speaks to the simulator running under the oracle $\tilde O$ we devise.

The meaning of $\bar P$'s privacy. The protocol $\bar P$ $t$-securely computes $\bar f$ in the information-theoretic sense. The privacy part of this assertion means that for any polynomial-time $t$-adversary $A$ interacting with the network running $\bar P$, the view $A$ receives from the network is very closely approximated by interacting with the simulator $\bar S$, instead. In symbols, and using the "shorthand" notation mentioned following the definitions of the $A$-VIEW ensembles, we have, by assumption, that
$$\mathcal{E}\bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c) \sim \mathcal{E}\bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c). \tag{4.1}$$
Equation 4.1 asserts statistical indistinguishability of $\mathcal{L}(\bar f)$-parameterized ensembles. We sometimes abbreviate statements like the one above to the less cumbersome $\mathcal{E}\bar P_k \sim \mathcal{E}\bar S_k^{\bar O}$.
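The ideal evaluation oracle can be pictured as a tiny stateful object. The sketch below is an invented interface, for intuition only; it is not the thesis's formalization. It shows the two kinds of queries a simulator may make: component queries, revealing a corrupted player's input and history, and a single output query, returning $f$ on the adversary's committal merged with the honest inputs.

```python
# Illustrative (invented) interface for an ideal evaluation oracle O(x̄, π̄; f).
class IdealOracle:
    def __init__(self, x, pi, f):
        self.x, self.pi, self.f = x, pi, f
        self.corrupted = set()

    def component_query(self, i):
        self.corrupted.add(i)
        return self.x[i], self.pi[i]          # (x_i, π_i)

    def output_query(self, x_T):              # x_T: {i: value} for corrupted i
        merged = dict(enumerate(self.x))
        merged.update(x_T)                    # adversary's committal overrides
        return self.f([merged[i] for i in sorted(merged)])

oracle = IdealOracle([3, 1, 4], ["h1", "h2", "h3"], lambda v: sum(v) % 7)
print(oracle.component_query(1), oracle.output_query({1: 9}))
```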
Definition of $\hat P$. We define a protocol $\hat P$ "in between" protocols $P$ and $\bar P$. This helps us reason about $P$. Protocol $\hat P$ is identical to $P$ apart from one matter: in protocol $P$, the garbled program $\bar y$ which $\bar P$ computes is evaluated to a string $y = \mathrm{Eval}(c, \bar y)$, and player $i$ outputs $y$ instead of $\bar y$. Protocol $\hat P$ does not do this; it simply outputs the garbled program $\bar y$. In other words, $\hat P$ instructs each player $i$ to take his private input $x_i$ and his common input $c$, and replace his private input by $x_ir_i$ and replace his common input by $\bar c$, where $r_i$ is $i$'s length-$\rho$ prefix of coins. After this initial replacement, $\hat P$ instructs player $i$ to behave as $\bar P$ dictates.

Definitions of $\hat S$ and $\hat O$. Paralleling the modification of $P$ to $\hat P$, we modify simulator $\bar S$ to create a different simulator $\hat S$, and modify oracle $\bar O$ to make a different oracle $\hat O$. We consider a probabilistic analog to the oracle $\bar O$, which we call $\hat O$. Oracle $\hat O$ behaves as follows: it selects a random $\bar r \leftarrow (\Sigma^{\rho})^n$, and then behaves like $\bar O(\bar x\bar r, \bar\pi; \bar f)$, returning $(x_ir_i, \pi_i)$ to a component query of $i$, and returning $\bar y = \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$ to an output query of $x'_Tr'_T$.

The simulator $\hat S$ is very similar to the simulator $\bar S$. It begins by mapping its common input $c$ to the string $\bar c$, and pretending, subsequently, that $\bar c$ was the common input it was issued. Additionally, when a processor $i$ is corrupted, simulator $\bar S$ would normally provide information to the adversary specifying the private input $x_ir_i$ of the corrupted processor, its history $\pi_i$, and its computational state. Simulator $\hat S$ answers identically, except that the private input of $i$ is replaced by $x_i$, and the computational state of $i$ is syntactically modified so that $r_i$ specifies the coins that were initially flipped when player $i$ executes $\hat P$. (Footnote: Since $\hat P$ is a protocol whose description we specify, there is no problem in determining how these coins are to be inserted in a description of the computational state so as to "look" like a computational state obtained from a processor running under $\hat P$.)

Protocol $\hat P$ is well-approximated by $\hat S^{\hat O}$. We argue later that Equation 4.1 and our definitions of $\hat P$, $\hat S$, and $\hat O$ imply that
$$\mathcal{E}\hat P_k(\bar x, \bar\pi, a_A, c) \sim \mathcal{E}\hat S_k^{\hat O}(\bar x, \bar\pi, a_A, c). \tag{4.2}$$
This is the content of Claim 4. Equation 4.2 asserts statistical indistinguishability of $\mathcal{L}(f)$-parameterized ensembles.

The simulator $S$. When an adversary attacking the network running $\hat P$ corrupts a processor $i$ in her final round, she discovers, encoded in $i$'s computational state, $i$'s private output: the garbled program $\bar y$. Simulator $S$ is obtained from $\hat S$ by replacing the string returned by $\hat S$ in response to any final-round corruption by the corresponding string in which the specified output value, rather than being the garbled program $\bar y$, is the value $y$ that this garbled program evaluates to. (For technical reasons (see page 89) we also demand that $S$ provide a distinguished forbidden-value to the adversary with which it speaks if $S$'s oracle returns a distinguished forbidden-value. Oracles $\hat O$ and $\tilde O$ do not return forbidden.)

Figure 4.2: What the simulator $\hat S$ asks of its oracle $\hat O$. [Figure omitted: a sequence of component queries $i_1, i_2, \ldots$, each answered $(x_{i_j}r_{i_j}, \pi_{i_j})$; an output query $x'_Tr'_T$, answered $\bar y = \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$; then further component queries.]

We will later show (and it is pretty obvious, given Equation 4.2) that
$$\mathcal{E}P_k(\bar x, \bar\pi, a_A, c) \sim \mathcal{E}S_k^{\hat O}(\bar x, \bar\pi, a_A, c). \tag{4.3}$$
This is the content of Claim 5. Equation 4.3 asserts the statistical indistinguishability of $\mathcal{L}(f)$-parameterized ensembles.
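Since both $AO$ and the simulator $S$ apply the evaluation map to a garbled program, it may help to see Eval in miniature. The sketch below garbles and evaluates a single NAND gate; the data layout, sizes, and SHA-256-based generator are all invented stand-ins rather than the Chapter 3 conventions. Each signal is $n$ seeds followed by one index bit, and the two index bits held on a gate's input wires select which of the four gate labels to unmask.

```python
# Toy, self-contained rendering of Eval on one garbled NAND gate (invented layout).
import hashlib, secrets

n, k = 3, 16
LBL = n * k + 1                                   # |signal| = |label| = nk+1

def G(seed, half):                                # stand-in PRG half G_half(s)
    out, ctr = b"", 0
    while len(out) < LBL:
        out += hashlib.sha256(bytes([half]) + seed + bytes([ctr])).digest()
        ctr += 1
    return out[:LBL]

xor = lambda a, b: bytes(p ^ q for p, q in zip(a, b))

def signal(seeds, bit):                           # s̄^w_b = s^w_{1b}...s^w_{nb} b
    return b"".join(seeds) + bytes([bit])

def garble_nand(sa, sb, sc, lam):                 # seeds per wire, λ per wire
    labels = {}
    for a in (0, 1):
        for b in (0, 1):
            out_bit = (1 - ((a ^ lam[0]) & (b ^ lam[1]))) ^ lam[2]   # NAND
            mask = signal(sc[out_bit], out_bit)
            for i in range(n):                    # Equation-3.3-style masking
                mask = xor(mask, G(sa[a][i], b))
                mask = xor(mask, G(sb[b][i], a))
            labels[a, b] = mask
    return labels

def eval_gate(labels, sig_a, sig_b):              # Eval's step for one gate
    a, b = sig_a[-1], sig_b[-1]                   # index bits select the label
    out = labels[a, b]
    for i in range(n):
        out = xor(out, G(sig_a[i*k:(i+1)*k], b))
        out = xor(out, G(sig_b[i*k:(i+1)*k], a))
    return out

sa, sb, sc = [[[secrets.token_bytes(k) for _ in range(n)] for _ in range(2)]
              for _ in range(3)]
lam = [0, 1, 0]                                   # λ^α, λ^β, λ^γ
labels = garble_nand(sa, sb, sc, lam)
u, v = 1, 0                                       # true wire values
sig = eval_gate(labels, signal(sa[u ^ lam[0]], u ^ lam[0]),
                        signal(sb[v ^ lam[1]], v ^ lam[1]))
# Claim 1's invariant: the held signal's index is (true value) ⊕ λ.
assert sig == signal(sc[(1 - (u & v)) ^ lam[2]], (1 - (u & v)) ^ lam[2])
```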
The oracle $\tilde O$. So far, the modifications to $\bar P$, $\bar S$, and $\bar O$ have been simple, essentially "syntactic" modifications. In essence, we have simply set ourselves up to do something more interesting: to replace the oracle which knows about $\bar f$ with an oracle that only knows about $f$. That is, we will now replace the oracle $\hat O$ by a "fake" oracle $\tilde O$, realizable given an oracle $O = O(\bar x, \bar\pi; f)$.

We now describe the behavior of $\tilde O$. Let us consider the sequence of queries that $\hat S$ makes to its oracle $\hat O$, and describe how we will answer them with $\tilde O$. See Figures 4.2 and 4.3. As with any simulator interacting with its ideal evaluation oracle, there are three phases of oracle queries: (1) the component queries; (2) the output query; and (3) additional component queries.

Figure 4.3: Alternatively, $\hat S$'s queries can be answered by the "fake" oracle, $\tilde O$, which uses an $O(\bar x, \bar\pi; f)$ oracle to compose its answers. [Figure omitted: the same query sequence as Figure 4.2, with $\tilde O$ forwarding each component query to $O$ and answering $(x_ir_i, \pi_i)$, and answering the output query with a fake garbled program $\tilde y$ built from $y = f(x'_T \cup x_{\bar T})$.]

The oracle $\tilde O$ begins by selecting a random $\bar r \in (\Sigma^{\rho})^n$. To a component query of $i$ (whether before or after the output query), $\tilde O$ will make an oracle call to its $O(\bar x, \bar\pi; f)$-oracle to learn $x_i$ and $\pi_i$; it will then return $(x_ir_i, \pi_i)$. Note that the distribution on what $\tilde O$ returns in Phase 1 and Phase 3 of the oracle queries is identical to the distribution that $\hat O$ returns for these queries.

What remains is to describe how $\tilde O$ answers an output query of $x'_Tr'_T$. The oracle $\tilde O$, on receiving the output query $x'_Tr'_T$, makes a query of $x'_T$ to $O(\bar x, \bar\pi; f)$, receiving a string $y \in \Sigma^m$. The strings $y$, $r'_T$, $r_{\bar T}$, and additional random bits are used in constructing $\tilde O$'s response
$$\tilde y = A^1_{00}A^1_{01}A^1_{10}A^1_{11} \cdots A^T_{00}A^T_{01}A^T_{10}A^T_{11}\; \sigma^1 \cdots \sigma^{n\bar l} \in \Sigma^{\bar m}$$
to the output query. The string $\tilde y$ is called the fake garbled program, as opposed to the real garbled program, $\bar y$, that oracle $\hat O$ returns.

To construct the fake garbled program, define $r'' = r_{\bar T} \cup r'_T$ and set $s^1_{i0}s^1_{i1} \cdots s^W_{i0}s^W_{i1} = r''_i[1:2kW]$, where $|s^{\omega}_{ib}| = k$, for $i \in [1..n]$. Define $\bar s^{\omega}_b = s^{\omega}_{1b} \cdots s^{\omega}_{nb}\, b$ for $\omega \in [1..W]$, $b \in \{0,1\}$. What we have done is produce signals which are a proper "intermixing" of random strings with those strings which the output query $x'_Tr'_T$ specifies.

A "random path" through the garbled circuit is selected by flipping a coin $\mu^{\omega}$ for each wire $\omega$ that is not an output wire, $\omega \in [1..W-I]$. The fake garbled program, $\tilde y$, will be devised so that $\bar s^{\omega}_{\mu^{\omega}}$ will be the signal held for wire $\omega$ when $\tilde y$ is evaluated, for $\omega \in [1..W-I]$. (Claim 3 establishes that this is really so.) For an output wire $\omega$, $\tilde y$ will be constructed so that the semantically correct signal will be held, $\bar s^{\omega}_{y[\omega-(W-I)]}$. For uniformity in notation, then, set $\mu^{\omega} = y[\omega - W + I]$ for each output wire $\omega$, $\omega \in [W-I+1..W]$.

The conditions specified by the last paragraph are simple to satisfy by the appropriate choice of fake input signals, $\sigma^1 \cdots \sigma^{n\bar l}$, and a correct definition of just one of the four gate labels $A^g_{ab}$ for each gate $g$. Namely, select fake garbled inputs of
$$\sigma^{\omega} = \bar s^{\omega}_{\mu^{\omega}} \tag{4.4}$$
for each input wire $\omega$. Then, for a gate $g$ with left input wire $\alpha$, right input wire $\beta$, and output wire $\gamma$, define one of the four gate labels according to
$$A^g_{\mu^{\alpha}\mu^{\beta}} = G_{\mu^{\beta}}(s^{\alpha}_{1\mu^{\alpha}}) \oplus \cdots \oplus G_{\mu^{\beta}}(s^{\alpha}_{n\mu^{\alpha}}) \oplus G_{\mu^{\alpha}}(s^{\beta}_{1\mu^{\beta}}) \oplus \cdots \oplus G_{\mu^{\alpha}}(s^{\beta}_{n\mu^{\beta}}) \oplus \bar s^{\gamma}_{\mu^{\gamma}}. \tag{4.5}$$
(Here $G_0(s)$ and $G_1(s)$ denote the two $(nk+1)$-bit halves of $G(s)$'s output, allocated between the two labels in which a seed participates, per the conventions of Chapter 3.) We have thus specified fake garbled input $\sigma^1 \cdots \sigma^{n\bar l}$ of the fake garbled program $\tilde y$, and we have specified, for each gate $g$, one of the four gate labels for $g$. All that remains unspecified in $\tilde y$ are the remaining three gate labels of each gate $g$. Set all $3T$ of these $A^g_{ab}$-values to be random strings of length $nk+1$.
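The crux of $\tilde O$ is that only the single on-path label per gate is meaningful. The toy sketch below (same invented layout as the earlier Eval sketch; not the thesis's construction) builds one fake garbled gate with three uniformly random off-path labels and checks that evaluation along the chosen path is unaffected, which is exactly what Claim 3 will assert.

```python
# Illustrative sketch of the fake garbled gate built by the oracle Õ: only the
# on-path label (μ^α, μ^β) is computed per Equation (4.5); the rest are random.
import hashlib, secrets

n, k = 3, 16
LBL = n * k + 1

def G(seed, half):
    out, ctr = b"", 0
    while len(out) < LBL:
        out += hashlib.sha256(bytes([half]) + seed + bytes([ctr])).digest()
        ctr += 1
    return out[:LBL]

xor = lambda a, b: bytes(p ^ q for p, q in zip(a, b))
signal = lambda seeds, bit: b"".join(seeds) + bytes([bit])

def fake_garble(sa, sb, sc, mu):                  # Equations (4.4)/(4.5)-style
    mua, mub, muc = mu
    on_path = signal(sc[muc], muc)
    for i in range(n):
        on_path = xor(on_path, G(sa[mua][i], mub))
        on_path = xor(on_path, G(sb[mub][i], mua))
    return {(a, b): on_path if (a, b) == (mua, mub)
            else secrets.token_bytes(LBL)         # 3 random off-path labels
            for a in (0, 1) for b in (0, 1)}

def eval_gate(labels, sig_a, sig_b):
    a, b = sig_a[-1], sig_b[-1]
    out = labels[a, b]
    for i in range(n):
        out = xor(out, G(sig_a[i*k:(i+1)*k], b))
        out = xor(out, G(sig_b[i*k:(i+1)*k], a))
    return out

sa, sb, sc = [[[secrets.token_bytes(k) for _ in range(n)] for _ in range(2)]
              for _ in range(3)]
mu = (1, 0, 1)                                    # the random path μ^α μ^β μ^γ
labels = fake_garble(sa, sb, sc, mu)
sig = eval_gate(labels, signal(sa[mu[0]], mu[0]), signal(sb[mu[1]], mu[1]))
assert sig == signal(sc[mu[2]], mu[2])            # Claim 3's invariant holds
```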
♦ ♦

Having defined oracle $\tilde O$ and the simulator $\hat S$, we have defined the simulator $S$: $S$ is $\hat S^{\tilde O(\bar x, \bar\pi; f)}$ (together with the final-round modification already described), except that the oracle responses we have imagined being provided from $\tilde O$ are computed by $S$ itself, with the aid of its $O(\bar x, \bar\pi; f)$-oracle. Notice that the responses of $\tilde O$ are easily computable from the responses of $O$ and some coin tosses.

Oracle $\tilde O$ well-approximates oracle $\hat O$. The main goal is to establish that
$$\mathcal{E}S_k^{\hat O}(\bar x, \bar\pi, a_A, c) \approx \mathcal{E}S_k^{\tilde O}(\bar x, \bar\pi, a_A, c). \tag{4.6}$$
This equation asserts computational indistinguishability of $\mathcal{L}(f)$-parameterized ensembles. The right-hand side is, by definition, $\mathcal{E}S_k$. Equation 4.6 is still rather awkward to argue directly, so we argue instead (Claim 6) that Equation 4.6 follows from
$$\mathcal{E}S_k^{\hat O'}(\bar x, \bar\pi, a_A, c) \approx \mathcal{E}S_k^{\tilde O'}(\bar x, \bar\pi, a_A, c), \tag{4.7}$$
for oracles $\hat O'$ and $\tilde O'$, which are very much like $\hat O$ and $\tilde O$, respectively. The difference is that each chooses a random $j \in [1..n]$ and, if there is ever a component query of $j$, it "shuts up," refusing to answer any more questions. Equation 4.7 will then be argued by contradiction (Claim 7), where assuming its falsity allows the breaking of the pseudorandom generator $G$.

Summary of strategy. Before proceeding, we observe that, taken together, Equations 4.3 and 4.6 imply that $P$ is $t$-private. The global strategy for arguing privacy can be described as the chain
$$\mathcal{E}\bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c) \underset{\text{(given)}}{\sim} \mathcal{E}\bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c), \qquad \mathcal{E}P_k(\bar x, \bar\pi, a_A, c) \underset{\text{Claims 4, 5}}{\sim} \mathcal{E}S_k^{\hat O}(\bar x, \bar\pi, a_A, c), \qquad \mathcal{E}S_k^{\hat O}(\bar x, \bar\pi, a_A, c) \underset{\text{Claims 6, 7}}{\approx} \mathcal{E}S_k^{\tilde O}(\bar x, \bar\pi, a_A, c) = \mathcal{E}S_k.$$
Under each assertion is written the claim used to establish it. The "heart" of the proof, and the only argument which really depends on the clever definition of the protocol $P$, is Claim 7, which begins on page 90. (On a first reading, it may be desirable to begin there.)
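The oracle-substitution step itself is mechanical; the following invented illustration (names, layouts, and the stub protocol are placeholders, not the thesis's definitions) shows a simulator that owns the real oracle $O(\bar x, \bar\pi; f)$ and exposes to $\hat S$ an object with $\tilde O$'s interface, manufacturing the coins $r_i$ and the fake garbled program itself.

```python
# Invented sketch of S = Ŝ with Õ synthesized from O; everything is a stub.
import secrets

def fake_garbled_program(y, coins):         # stub standing in for Õ's response
    return ("fake-program-evaluating-to", y)

class RealOracle:                           # stub for O(x̄, π̄; f)
    def __init__(self, x, pi, f): self.x, self.pi, self.f = x, pi, f
    def component_query(self, i): return self.x[i], self.pi[i]
    def output_query(self, x_T):
        full = {**dict(enumerate(self.x)), **x_T}
        return self.f([full[i] for i in sorted(full)])

class FakeOracle:                           # plays the role of Õ toward Ŝ
    def __init__(self, O, n, rho):
        self.O = O
        self.r = [secrets.token_bytes(rho) for _ in range(n)]   # r̄ ← $

    def component_query(self, i):
        x_i, pi_i = self.O.component_query(i)   # learn (x_i, π_i) from O ...
        return (x_i, self.r[i]), pi_i           # ... hand Ŝ (x_i r_i, π_i)

    def output_query(self, x_T):
        y = self.O.output_query(x_T)            # the only use of f itself
        return fake_garbled_program(y, self.r)  # random off-path labels inside

O_tilde = FakeOracle(RealOracle([2, 5], ["h1", "h2"], sum), n=2, rho=8)
print(O_tilde.component_query(0), O_tilde.output_query({1: 7}))
```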
Part 2: The simulator $S$ establishes security.

Claim 1: Evaluating $\bar y$ gives $y$. We show that evaluating a (real) garbled program $\bar y$ gives the value $y$ which it "should" evaluate to: for any $x'_T$, $r'_T$, $x_{\bar T}$ and $r_{\bar T}$, $y = \mathrm{Eval}(c, \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T}))$ is equal to $f(x'_T \cup x_{\bar T})$.

The argument is by induction on the depth of the circuit $C$ defining $f$. We maintain the following invariant: if, in the evaluation of $\bar y = \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$, signal $\sigma^{\omega} = \bar s^{\omega}_{\mu}$ is held for some wire $\omega$, then wire $\omega$ carries the bit $\mu \oplus \lambda^{\omega}$ when $C$ is evaluated at $x'_T \cup x_{\bar T}$. Here, $\lambda^{\omega}$ is defined from $r = r'_T \cup r_{\bar T}$ according to
$$\lambda^{\omega} = \begin{cases} \bigoplus_i r_i[2kW + \omega] & \text{for } \omega \in [1..W-I], \\ 0 & \text{for } \omega \in [W-I+1..W], \end{cases}$$
as with the equations of Steps 1a(i) and 1a(iv).

The invariant is seen to hold for input wires, since the signal issued for an input wire $\omega$ is $\sigma^{\omega} = \bar s^{\omega}_{b^{\omega} \oplus \lambda^{\omega}}$ by Equation 3.2, and this wire carries a truth value of $b^{\omega} = (b^{\omega} \oplus \lambda^{\omega}) \oplus \lambda^{\omega}$ in $C$. Inductively, consider a gate $g$ of functionality $\otimes$ having left incoming wire $\alpha$ (carrying a signal $\sigma^{\alpha} = \bar s^{\alpha}_{\mu^{\alpha}}$), right incoming wire $\beta$ (carrying a signal $\sigma^{\beta} = \bar s^{\beta}_{\mu^{\beta}}$), and out-going wire $\gamma$. By Equation 3.4, the signal computed for wire $\gamma$ will be
$$\sigma^{\gamma} = G_{\mu^{\beta}}(s^{\alpha}_{1\mu^{\alpha}}) \oplus \cdots \oplus G_{\mu^{\beta}}(s^{\alpha}_{n\mu^{\alpha}}) \oplus G_{\mu^{\alpha}}(s^{\beta}_{1\mu^{\beta}}) \oplus \cdots \oplus G_{\mu^{\alpha}}(s^{\beta}_{n\mu^{\beta}}) \oplus A^g_{\mu^{\alpha}\mu^{\beta}},$$
which, from the definition of $A^g_{\mu^{\alpha}\mu^{\beta}}$, Equation 3.3, is $\bar s^{\gamma}_{[(\mu^{\alpha} \oplus \lambda^{\alpha}) \otimes (\mu^{\beta} \oplus \lambda^{\beta})] \oplus \lambda^{\gamma}}$. The inductive hypothesis says that the left incoming wire carries the bit $\mu^{\alpha} \oplus \lambda^{\alpha}$ in $C$, and the right incoming wire carries the bit $\mu^{\beta} \oplus \lambda^{\beta}$. So we know that the out-going wire for gate $g$ of $C$ carries the bit $(\mu^{\alpha} \oplus \lambda^{\alpha}) \otimes (\mu^{\beta} \oplus \lambda^{\beta})$. Consequently, the signal $\sigma^{\gamma}$ we have computed for wire $\gamma$ is $\bar s^{\gamma}_{b \oplus \lambda^{\gamma}}$, where wire $\gamma$ carries the bit $b$ in circuit $C$. This establishes the induction. Since $\lambda^{\omega} = 0$ for each output wire $\omega$, we conclude by Equations 3.1 and 3.2 that $\mathrm{Eval}(\bar y) = b^{W-I+1} \cdots b^W = y$. □

Claim 2: Protocol $P$ is correct. We next argue that protocol $P$ is correct. This follows from Claim 1 and the definitions of $AI$ and $AO$. The correctness of protocol $\bar P$ implies that, almost certainly, each player outputs $\bar y = \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$ as a result of the execution of Step 1, where $x'_Tr'_T$ is the adversary's committal in $\bar P$, as specified by $\overline{AI}$. (Footnote: That is, the good players do output the string $\bar y$ in Step 1, while the adversary could compute $\bar y$ by evaluating $\overline{AO}$.) In the protocol $P$, each player then computes his output $y$ by evaluating this garbled program $\bar y$. We need that the $y$-value so computed almost certainly equals $f(x'_T \cup x_{\bar T})$, where $x'_T$ is the adversary's committal in protocol $P$. In fact, Claim 1 establishes that whenever each good player computes $\bar y = \bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$, for any $r'_T$ and $r_{\bar T}$, each player computes $y = f(x'_T \cup x_{\bar T})$. Thus, almost certainly, each player, when running $P$, outputs $y = f(x'_T \cup x_{\bar T})$, where $x'_T$ is given by $AI$ and the output attributable to the adversary is given by $AO$. This establishes correctness. □

Claim 3: Evaluating $\tilde y$ gives $y$. Very similarly to Claim 1, we show that, in evaluating the fake garbled program $\tilde y$, one comes to hold the signal $\bar s^{\omega}_{\mu^{\omega}}$ for each wire $\omega$. (By the same reasoning, "hybrid" garbled programs, where some of the off-path gate labels are meaningful and others are not, will share this property.) This holds by induction on the depth of the circuit $C$. For an input wire $\omega$, one holds $\bar s^{\omega}_{\mu^{\omega}}$ by the definition of $\sigma^{\omega}$, Equation (4.4). Inductively, from holding $\bar s^{\alpha}_{\mu^{\alpha}}$ for the left incoming wire of some gate $g$, and from holding $\bar s^{\beta}_{\mu^{\beta}}$ for the right incoming wire of gate $g$, one computes and holds, according to Equation 3.3,
$$G_{\mu^{\beta}}(s^{\alpha}_{1\mu^{\alpha}}) \oplus \cdots \oplus G_{\mu^{\beta}}(s^{\alpha}_{n\mu^{\alpha}}) \oplus G_{\mu^{\alpha}}(s^{\beta}_{1\mu^{\beta}}) \oplus \cdots \oplus G_{\mu^{\alpha}}(s^{\beta}_{n\mu^{\beta}}) \oplus A^g_{\mu^{\alpha}\mu^{\beta}}$$
for the out-going wire $\gamma$ of gate $g$. From the definition of $A^g_{\mu^{\alpha}\mu^{\beta}}$, Equation (4.5), this is precisely $\bar s^{\gamma}_{\mu^{\gamma}}$. We conclude that $\mathrm{Eval}(\tilde y) = \mu^{W-I+1} \cdots \mu^W$. By our choice of $\mu^{\omega}$-values on output wires, this is precisely the string $y$ returned from the oracle query $O(x'_T)$, $y = f(x'_T \cup x_{\bar T})$. □

Claim 4: Equation 4.2 is correct. Let $A$ be a $t$-adversary. By the privacy of $\bar P$, $\mathcal{E}\bar P_k \sim \mathcal{E}\bar S_k^{\bar O}$, which is Equation 4.1. We wish to show that Equation 4.2 holds, that $\mathcal{E}\hat P_k \sim \mathcal{E}\hat S_k^{\hat O}$. First note that, for any distinguisher $D^{\alpha}$, for a negligible function $\epsilon(k')$,
$$\epsilon(k') \ge \max_{(\bar x\bar r, \bar\pi, a_A, \bar c) \in L_{k'}(\bar f)} \left|\mathbf{E}D(\bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c)) - \mathbf{E}D(\bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c))\right|$$
$$\ge \max_{(\bar x, \bar\pi, a_A, c) \in L_{k'}(f)} 2^{-\rho_c n_c} \sum_{\bar r \in (\Sigma^{\rho_c})^{n_c}} \left|\mathbf{E}D(\bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c)) - \mathbf{E}D(\bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c))\right|$$
$$\ge \max_{(\bar x, \bar\pi, a_A, c) \in L_{k'}(f)} \left|2^{-\rho_c n_c} \sum_{\bar r} \mathbf{E}D(\bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c)) - 2^{-\rho_c n_c} \sum_{\bar r} \mathbf{E}D(\bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c))\right|. \tag{4.8}$$
(Recall that $\mathbf{E}D(E_k(\omega))$ is short for $\mathbf{E}D(1^{k'}, E_k(\omega), \omega, \alpha)$, and $k = \max\{k', |c|^{1/10}\}$.) That the first quantity is negligible follows directly from $\bar P$'s privacy; the next inequality just says that the average of a bunch of elements cannot exceed the maximum of those elements; and the last line is the triangle inequality: $\sum_i |x_i - y_i| \ge |\sum_i (x_i - y_i)| = |\sum_i x_i - \sum_i y_i|$.

Now, given a distinguisher $D^{\alpha}$ (think of this as having been designed for distinguishing $\mathcal{E}\hat P_k$ from $\mathcal{E}\hat S_k^{\hat O}$), there is a distinguisher $\bar D^{\bar\alpha}$ such that
$$\mathbf{E}D\bigl(1^{k'}, \hat P_k(\bar x, \bar\pi, a_A, c), (\bar x, \bar\pi, a_A, c), \alpha\bigr) = 2^{-\rho_c n_c} \sum_{\bar r \in (\Sigma^{\rho_c})^{n_c}} \mathbf{E}\bar D\bigl(1^{k}, \bar P_k(\bar x\bar r, \bar\pi, a_A, \bar c), (\bar x\bar r, \bar\pi, a_A, \bar c), \bar\alpha\bigr) \tag{4.9}$$
and
$$\mathbf{E}D\bigl(1^{k'}, \hat S_k^{\hat O}(\bar x, \bar\pi, a_A, c), (\bar x, \bar\pi, a_A, c), \alpha\bigr) = 2^{-\rho_c n_c} \sum_{\bar r \in (\Sigma^{\rho_c})^{n_c}} \mathbf{E}\bar D\bigl(1^{k}, \bar S_k^{\bar O}(\bar x\bar r, \bar\pi, a_A, \bar c), (\bar x\bar r, \bar\pi, a_A, \bar c), \bar\alpha\bigr). \tag{4.10}$$
Namely, the distinguisher $\bar D$ is defined as follows.
It maps its first argument, $k$, to $k'$; it maps its third argument, $(\bar x\bar r, \bar\pi, a_A, \bar c)$, to $(\bar x, \bar\pi, a_A, c)$; it maps its fourth argument to itself; and it maps its second argument, which is the view $\bar\nu$ of the adversary $A$, to an alternative view $\hat\nu$, as follows: when $\bar\nu$ indicates that a processor $i$ is corrupted, and processor $i$ had private input $x_ir_i$, the view $\hat\nu$ is constructed so that $x_i$ is the private input and $r_i$ is the prefix of $i$'s random string. That is, $\hat\nu$ is a syntactic modification of $\bar\nu$ so that it is a view that would appear to come from $\hat P$, not $\bar P$. After modifying the arguments in this way, the distinguisher $D$ is applied to them. Equations 4.9 and 4.10 follow directly from the definitions of $P$, $\hat P$, $\bar S$, $\hat S$, $D$, and $\bar D$.

Combining Equations 4.8, 4.9, and 4.10, we have that
$$\epsilon(k') \ge \left|\mathbf{E}D(\hat P_k(\bar x, \bar\pi, a_A, c)) - \mathbf{E}D(\hat S_k^{\hat O}(\bar x, \bar\pi, a_A, c))\right|,$$
which is precisely Equation 4.2. □

Claim 5: Equation 4.3 is correct. The last claim easily gives Equation 4.3. To see this, note that if $\mathcal{E}E_k(\omega) \sim \mathcal{E}E'_k(\omega)$ and $h$ is any function, then $\mathcal{E}h(E_k(\omega)) \sim \mathcal{E}h(E'_k(\omega))$. (Likewise, if $\mathcal{E}E_k(\omega) \approx \mathcal{E}E'_k(\omega)$ and $h$ is efficiently computable, then $\mathcal{E}h(E_k(\omega)) \approx \mathcal{E}h(E'_k(\omega))$.)

More explicitly, the only difference between $P$ and $\hat P$ is that $P$ evaluates the garbled circuit which has been collaboratively computed, outputting this evaluation, while $\hat P$ outputs the garbled program itself. Let $A$ be a $t$-adversary. By our definition of $P$ and $\hat P$,
$$P_k(\bar x, \bar\pi, a_A, c) = \mathrm{eval}(\hat P_k(\bar x, \bar\pi, a_A, c)), \tag{4.11}$$
where eval is the function on views that simply replaces the string encoding $A$'s view with the identical string, except that the garbled program $\bar y$ which the view $\hat\nu$ specifies to be output is, instead, evaluated, with the function Eval; it is this string which $\nu = \mathrm{eval}(\hat\nu)$ specifies should be output. Likewise,
$$S_k^{\hat O}(\bar x, \bar\pi, a_A, c) = \mathrm{eval}(\hat S_k^{\hat O}(\bar x, \bar\pi, a_A, c)), \tag{4.12}$$
as follows directly from the definitions of $S$ and $\hat S$. Applying eval to both sides of Equation 4.2 gives
$$\mathcal{E}\,\mathrm{eval}(\hat P_k(\bar x, \bar\pi, a_A, c)) \sim \mathcal{E}\,\mathrm{eval}(\hat S_k^{\hat O}(\bar x, \bar\pi, a_A, c)).$$
Combining this with Equations 4.11 and 4.12 gives $\mathcal{E}P_k(\bar x, \bar\pi, a_A, c) \sim \mathcal{E}S_k^{\hat O}(\bar x, \bar\pi, a_A, c)$, as desired. □
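The fact invoked at the start of Claim 5 is the standard data-processing property of statistical distance. Under the usual definition $\Delta(X, Y) = \max_T |\Pr[X \in T] - \Pr[Y \in T]|$ (an assumption of this aside; the thesis's own definition is not repeated here), one line suffices:
$$\Delta\bigl(h(X), h(Y)\bigr) = \max_{T} \bigl|\Pr[h(X) \in T] - \Pr[h(Y) \in T]\bigr| = \max_{T} \bigl|\Pr[X \in h^{-1}(T)] - \Pr[Y \in h^{-1}(T)]\bigr| \le \max_{U} \bigl|\Pr[X \in U] - \Pr[Y \in U]\bigr| = \Delta(X, Y).$$
For the computational variant: if $D$ distinguishes $h(E_k)$ from $h(E'_k)$ and $h$ is polynomial-time, then $D \circ h$ is itself a polynomial-time distinguisher for $E_k$ and $E'_k$.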
♦ ♦

Definitions of $\hat O'$ and $\tilde O'$. We are now ready to address the main concern: that the "fake" oracle $\tilde O$ provides a very close approximation to $\hat O$, not in the information-theoretic sense, as with all modifications made so far, but with respect to polynomial-time computation. Actually, it will be convenient to argue this indirectly, showing that $\hat O$ remains closely approximated by $\tilde O$ even if each of these oracles "shuts up" when asked some particular randomly-chosen component query. Specifically, define $\hat O'$ and $\tilde O'$ as identical to $\hat O$ and $\tilde O$, respectively, except that each oracle initially chooses a random $j \in [1..n]$ and then, to a component query of $j$, $\hat O'$ and $\tilde O'$ answer forbidden, rather than $x_jr_j$, as they otherwise would. Once $\hat O'$ or $\tilde O'$ answers forbidden, it answers $\Lambda$ on any subsequent queries. We now show that it suffices to show that $\hat O'$ is well approximated by $\tilde O'$.

Claim 6: If $\mathcal{E}S^{\hat O'} \approx \mathcal{E}S^{\tilde O'}$, then $\mathcal{E}S^{\hat O} \approx \mathcal{E}S^{\tilde O}$. We may assume that $\bar S$ always makes exactly $t$ component queries, since a simulator which makes fewer queries than that can always be modified to make exactly $t$ component queries, ignoring the unneeded responses. Notice that $S$ inherits this property from $\bar S$. To establish Claim 6, suppose for contradiction that
$$\mathcal{E}S_k^{\hat O'} \approx \mathcal{E}S_k^{\tilde O'}, \tag{4.13}$$
yet
$$\mathcal{E}S_k^{\hat O} \not\approx \mathcal{E}S_k^{\tilde O}. \tag{4.14}$$
Equation 4.14 means that there exists a polynomial-time $t$-adversary $A$, a polynomial-time distinguisher $D^{\alpha}$, and a sequence $\{(\bar x_k, \bar\pi_k, a_k, c_k) \in L_k(f)\}$ such that
$$\nu(k) = \left|\mathbf{E}D(S_k^{\hat O}(\bar x_k, \bar\pi_k, a_k, c_k)) - \mathbf{E}D(S_k^{\tilde O}(\bar x_k, \bar\pi_k, a_k, c_k))\right|$$
is nonnegligible. Henceforth we omit the arguments to the simulators, writing an equation like the one above as simply $\nu(k) = |\mathbf{E}D(S_k^{\hat O}) - \mathbf{E}D(S_k^{\tilde O})|$.

We will contradict Equation 4.13. To do this, define a distinguisher $D'$, which is identical to $D$ except that $D'$ outputs the bit $0$ on views containing a distinguished forbidden-value. As mentioned on page 81, $S$ provides $A$ with a distinguished forbidden-value when $S$'s oracle responds forbidden. Then
$$\mathbf{E}D'(S_k^{\hat O'}) = \Pr\left[S_k^{\hat O'} \text{ does not get a forbidden}\right] \cdot \mathbf{E}\left[D(S_k^{\hat O'}) \,\middle|\, S_k^{\hat O'} \text{ does not get a forbidden}\right] \tag{4.15}$$
$$= \Pr\left[j \leftarrow [1..n] : S_k^{\hat O} \text{ does not ask component query } j\right] \cdot \mathbf{E}D(S_k^{\hat O}) \tag{4.16}$$
$$= (1 - t/n)\,\mathbf{E}D(S_k^{\hat O}), \tag{4.17}$$
where Equation 4.15 holds by our extension of $D$ to $D'$ (outputting $0$ on forbidden), and $\mathbf{E}[D(S_k^{\hat O'}) \mid S_k^{\hat O'} \text{ does not get a forbidden}] = \mathbf{E}D(S_k^{\hat O})$ follows because $S$ makes exactly $t$ component queries. Similarly, we have
$$\mathbf{E}D'(S_k^{\tilde O'}) = \Pr\left[S_k^{\tilde O'} \text{ does not get a forbidden}\right] \cdot \mathbf{E}\left[D(S_k^{\tilde O'}) \,\middle|\, S_k^{\tilde O'} \text{ does not get a forbidden}\right] = (1 - t/n)\,\mathbf{E}D(S_k^{\tilde O}).$$
Thus $|\mathbf{E}D'(S_k^{\hat O'}) - \mathbf{E}D'(S_k^{\tilde O'})| = (1 - t/n)\,\nu(k)$ is nonnegligible, contradicting Equation 4.13 and establishing Claim 6. □
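The $(1 - t/n)$ scaling in Equations 4.15 through 4.17 is easy to check empirically. Below is a small Monte Carlo sketch (an invented toy, not from the thesis): the shut-up oracle's secret index $j$ is uniform and independent of the $t$ component queries, so forcing the output to $0$ on forbidden just rescales the expectation.

```python
# Monte Carlo check (illustrative only) that E[D'(primed view)] equals
# (1 - t/n) * E[D(unprimed view)] when the oracle shuts up on a random j.
import random

n, t, TRIALS = 10, 4, 200_000

def D(view):                                      # an arbitrary {0,1} test
    return 1 if sum(view) % 3 == 0 else 0

def run(primed):
    j = random.randrange(n)                       # oracle's secret index
    queries = random.sample(range(n), t)          # exactly t component queries
    if primed and j in queries:
        return 0                                  # D' outputs 0 on forbidden
    view = [hash((q, 42)) % 5 for q in queries]   # stand-in for the view
    return D(view)

unprimed = sum(run(False) for _ in range(TRIALS)) / TRIALS
primed   = sum(run(True)  for _ in range(TRIALS)) / TRIALS
print(primed, (1 - t/n) * unprimed)               # agree up to sampling noise
```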
Claim 7: $\mathcal{E}S^{\hat O'} \approx \mathcal{E}S^{\tilde O'}$. Finally, we address the heart of the matter. Proceeding by contradiction, assume that
$$\mathcal{E}S_k^{\hat O'} \not\approx \mathcal{E}S_k^{\tilde O'}. \tag{4.18}$$
This means that there is a polynomial-time $t$-adversary $A$, a polynomial-time distinguisher $D^{\alpha}$, a sequence $\{(\bar x_k, \bar\pi_k, (a_A)_k, c_k) \in L_k(f)\}$, a polynomial $Q(k)$, and an infinite set $K \subseteq \mathbb{N}$ such that
$$\left|\mathbf{E}D(S_k^{\hat O'}) - \mathbf{E}D(S_k^{\tilde O'})\right| > \frac{1}{Q(k)} \tag{4.19}$$
for $k \in K$. We will use this to construct a probabilistic polynomial-time distinguisher $(D')^{\alpha'}$ for distinguishing the ensembles $\mathcal{ETR}_k = \mathcal{E}(U_{2\mathrm{n}k+2})^{\mathrm n}$ and $\mathcal{EPR}_k = \mathcal{E}(G(U_k))^{\mathrm n}$, where $G : \Sigma^k \to \Sigma^{2\mathrm{n}k+2}$ is the pseudorandom generator used by protocol $P$. Recall that $\mathrm{n} = k^{10}$ is a bound on the size of $n$ in terms of the associated security parameter $k$. The existence of such a distinguisher contradicts Lemma 4.2.3.

To describe $D'$ and $\alpha'$, fix $k \in K$ and let $\bar x = \bar x_k$, $\bar\pi = \bar\pi_k$, $a_A = (a_A)_k$, and $c = c_k$. Let $n, l, \bar l, \bar m, C, T$, and $\rho$ all be as indicated by $c$. The distinguisher $D'$ takes as input two strings $A', B' \in \Sigma^{2nk+2}$ (say, the first two components of its length-$\mathrm{n}$ input vector; the remaining components are ignored). It uses a substring of each of these strings to compute a "good guess" as to whether $A'$ and $B'$ are random or pseudorandom. Let $A_0 = A'[1:nk+1]$, $A_1 = A'[nk+2:2nk+2]$, $B_0 = B'[1:nk+1]$, and $B_1 = B'[nk+2:2nk+2]$.

Definition of the oracle $O_g$. We define a sequence of oracles, $O_0, O_1, \ldots, O_T$; the first oracle will be identical to $\tilde O'$, and each oracle will be successively more and more like $\hat O'$, until $O_T = \hat O'$. This might be depicted as
$$\tilde O' = O_0 \Rightarrow O_1 \Rightarrow \cdots \Rightarrow O_{T-1} \Rightarrow O_T = \hat O'.$$
Passing from oracle $O_{g-1}$ to oracle $O_g$ can be thought of as "fixing up" the gate labels on gate $g$. That is, recall that $\tilde O'$ provides "meaningless" (random) values for three of the four gate labels per gate, while $\hat O'$ provides "correct" gate labels throughout. The intermediate oracles have meaningful gate labels on more and more gates.

The behavior of $O_g$ is as follows. The oracle begins by selecting a random $\bar r \in (\Sigma^{\rho})^n$ and a random $j \in [1..n]$. To a component query of $j$, $O_g$ responds with forbidden, and it responds with $\Lambda$ to all queries following a forbidden response. Otherwise, to a component query of $i$, $O_g$ responds with $(x_ir_i, \pi_i)$. We must describe how $O_g$ answers an output query of $x'_Tr'_T$, returning some string which we now define. Exactly paralleling the definitions in the protocol and in the description of $\tilde O$, define $r'' = r_{\bar T} \cup r'_T$, and set
$$s^1_{i0}s^1_{i1} \cdots s^W_{i0}s^W_{i1} = r''_i[1:2kW]$$
for $i \in [1..n]$, $|s^{\omega}_{ib}| = k$. For $\omega \in [1..W]$, declare
$$\lambda^{\omega} = \begin{cases} \bigoplus_i r''_i[2kW + \omega] & \text{if } \omega \in [1..W-I], \\ 0 & \text{if } \omega \in [W-I+1..W], \end{cases}$$
and set $\bar s^{\omega}_b = s^{\omega}_{1b} \cdots s^{\omega}_{nb}\, b$. Let $b^{\omega}$ be the value carried by wire $\omega$ of $C$ when $C$ is evaluated at $x'_T \cup x_{\bar T}$, and define $\mu^{\omega} = \lambda^{\omega} \oplus b^{\omega}$. Now, to answer the output query, set $\sigma^{\omega} = \bar s^{\omega}_{\mu^{\omega}}$ for $\omega \in [1..n\bar l]$. This defines the garbled input. To define the gate labels, for a gate $h$ of functionality $\otimes$, left incoming wire $\alpha$, right incoming wire $\beta$, and out-going wire $\gamma$, define $A^h_{ab}$ according to
$$A^h_{ab} = \begin{cases} G_b(s^{\alpha}_{1a}) \oplus \cdots \oplus G_b(s^{\alpha}_{na}) \oplus G_a(s^{\beta}_{1b}) \oplus \cdots \oplus G_a(s^{\beta}_{nb}) \oplus \bar s^{\gamma}_{[(a \oplus \lambda^{\alpha}) \otimes (b \oplus \lambda^{\beta})] \oplus \lambda^{\gamma}} & \text{if } h \in [1..g] \text{ or } ab = \mu^{\alpha}\mu^{\beta}, \\ R^h_{ab} & \text{otherwise,} \end{cases} \tag{4.20}$$
where $R^h_{ab}$ is a randomly selected string from $\Sigma^{nk+1}$. Informally, we have put in correct gate labels for all gates numbered $1, \ldots, g$, and we have also put in correct on-path gate labels for the remaining gates. The other $3(T - g)$ gate labels are garbage, just random strings.

Claim 8: $\hat O' = O_T$. This claim means that $\hat O'$ and $O_T$ exhibit exactly the same (probabilistic) behavior when used as ideal evaluation oracles which are asked at most $t$ component queries. This statement is immediate from the definitions of $\hat O'$ and $O_T$. Both oracles choose a random $j \in [1..n]$, and answer forbidden to a component query of $j$, answering $\Lambda$ subsequently. Apart from that, both oracles return $(x_ir_i, \pi_i)$ (for a random $r_i$) to a component query of $i$. To an output query of $x'_Tr'_T$, both oracles return $\bar f(x'_Tr'_T \cup x_{\bar T}r_{\bar T})$. □

Claim 9: $\tilde O' = O_0$. This claim means that $\tilde O'$ and $O_0$ exhibit exactly the same (probabilistic) behavior when used as ideal evaluation oracles which are asked at most $t$ component queries. As before, both oracles choose a random $j \in [1..n]$, and answer forbidden to a component query of $j$, answering $\Lambda$ subsequently. Both oracles then choose a random $\bar r$ and, to a component query of $i$, each answers $(x_ir_i, \pi_i)$. To an output query of $x'_Tr'_T$, the oracle $\tilde O'$ chooses a random path $\mu^1, \ldots, \mu^{W-I}$ and returns a random fake garbled program $\tilde y$ that computes $f(x'_T \cup x_{\bar T})$ along this path, having random off-path gate labels. On the other hand, $O_0$ explicitly chooses no such random path; the path is implicitly specified by the $\lambda$-values. Still, the path chosen, being a function of the randomly selected $\bigoplus_i r''_i$-values, is independent of the values returned by the $t < n$ component queries. Furthermore, the path does not depend on the output query itself. Thus both oracle $\tilde O'$ and $O_0$ answer an output query $x'_Tr'_T$ by a random path computing $f(x'_T \cup x_{\bar T})$, with $U_{nk+1}$-distributed off-path gate labels. This establishes Claim 9. □

Claim 10: There is a $g_k \in [1..T]$ such that $|\mathbf{E}D(S_k^{O_{g_k-1}}) - \mathbf{E}D(S_k^{O_{g_k}})| > 1/(T\,Q(k))$. The claim follows immediately from Claims 8 and 9, Equation 4.19, and the triangle inequality. In common parlance, we have taken a "probability walk" through the oracles $O_0 \Rightarrow O_1 \Rightarrow \cdots \Rightarrow O_{T-1} \Rightarrow O_T$. The distinguishing probabilities differ at the two endpoints; somewhere, there must be a nonnegligible "jump" in the distinguishing probabilities. We have let $g_k$ denote a place at which there is a jump between the behavior induced by oracles $O_{g_k-1}$ and $O_{g_k}$. The same proof technique was used in the proof of Proposition 4.2.2. □

♦ ♦

Before proceeding, we make the following definition: $\alpha_k$, $\beta_k$, and $\gamma_k$ denote the left incoming wire, right incoming wire, and out-going wire of gate $g_k$, respectively.

The oracle $O_*$. (Footnote: Introduced for the benefit of readers who believe there have not been enough oracles described in this proof.) The distinguisher $D'$, recall, takes a string $A_0A_1B_0B_1 \in \Sigma^{4(nk+1)}$ (extracted from $A'B'$). Consider the oracle $O_*$, whose behavior depends on such a string $A_0A_1B_0B_1$, defined to behave as follows: it flips coins to determine the values $j$, $\bar r$, and the $R^h_{ab}$, exactly as $O_{g_k}$ did before. Component queries are answered as $O_{g_k}$ would answer them.
Then, to an output query of $x'_Tr'_T$, the values $r''$, $s^{\omega}_{ib}$, $\lambda^{\omega}$, $\bar s^{\omega}_b$, $b^{\omega}$, $\mu^{\omega}$, and $\sigma^{\omega}$ are all computed as $O_{g_k}$ would compute these values. However, in forming the garbled program $y_*$, the $A^h_{ab}$-values are computed differently than they were for $O_{g_k}$, but just for the case of gate $g_k$. Namely, set
$$A^h_{ab} = \begin{cases} \hat G_b(s^{\alpha}_{1a}) \oplus \cdots \oplus \hat G_b(s^{\alpha}_{na}) \oplus \hat G_a(s^{\beta}_{1b}) \oplus \cdots \oplus \hat G_a(s^{\beta}_{nb}) \oplus \bar s^{\gamma}_{[(a \oplus \lambda^{\alpha}) \otimes (b \oplus \lambda^{\beta})] \oplus \lambda^{\gamma}} & \text{if } h \in [1..g_k] \text{ or } ab = \mu^{\alpha}\mu^{\beta}, \\ R^h_{ab} & \text{otherwise,} \end{cases} \tag{4.21}$$
where
$$\hat G_{\beta}(s^{\omega}_{i\delta}) = \begin{cases} A_{\beta} & \text{if } \omega = \alpha_k \text{ and } i = j \text{ and } \delta = \bar\mu^{\alpha_k}, \\ B_{\beta} & \text{if } \omega = \beta_k \text{ and } i = j \text{ and } \delta = \bar\mu^{\beta_k}, \\ G_{\beta}(s^{\omega}_{i\delta}) & \text{otherwise,} \end{cases}$$
for $\beta, \delta \in \{0,1\}$.

We can describe the preceding as follows. Player $j$'s contribution to the left wire feeding gate $g_k$ is $A_0A_1$ for the (supposed) generator output corresponding to the off-path signal. For the right wire, his contribution is $B_0B_1$ for the (supposed) generator output corresponding to the off-path signal. Now if $A'$ and $B'$ really are pseudorandom, this is what player $j$ should contribute for these wires, and we will compute a meaningful set of off-path gate labels for this gate. But if $A'$ and $B'$ are truly random, then this contribution will cause all of the off-path gate labels to be random. We formalize this as follows:

Claim 11: If $A'$ and $B'$ are $G(U_k)$-distributed, then $O_*^{A_0A_1B_0B_1} \equiv O_{g_k}$.

Claim 12: If $A'$ and $B'$ are $U_{2nk+2}$-distributed, then $O_*^{A_0A_1B_0B_1} \equiv O_{g_k-1}$.

The first of these two claims is immediate. The only concern is that a component query would necessitate revealing a preimage under the generator of an $A$- or $B$-value. We don't have any such values to reveal. Fortunately, we have arranged that a component query of $j$ need only return the response forbidden.

The second of these claims is just a little more complicated. The behavior of $O_*$ and $O_{g_k-1}$ potentially differ only in the gate labels associated to gate $g_k$. In $O_{g_k-1}$, random strings are used for the three off-path gate labels. In $O_*$, on the other hand, the three off-path gate labels are produced according to Equation 4.21. The calculation of these gate labels can be viewed as XOR-ing three $\Sigma^{nk+1}$-valued random variables, $X_1$, $X_2$, and $X_3$, with the random variables $A_0$, $B_0$, and $A_1$, respectively, where $X_1$, $X_2$, and $X_3$ are independent of $A_0$, $B_0$, and $A_1$, respectively. Thus, when $A'$ and $B'$ are $U_{2nk+2}$-distributed, the three off-path gate labels are $U_{nk+1}$-distributed. □

Wrapping up. We now complete the proof of Claim 7. Observe that, given $\bar x$, $\bar\pi$, $a_A$, $c$, and $g_k$, and given $A'$ and $B'$, it is easy to compute a sample from $S_k^{O_*}$ in time polynomial in $|c|$. The distinguisher $D'$, using advice $\alpha'$ which encodes both the (infinite collection of) strings $\{(\bar x_k, \bar\pi_k, (a_A)_k, c_k, g_k)\}$ and the infinite string $\alpha$ in a reasonable manner, and using $A'$ and $B'$, works as follows: it computes a sample $\nu$ from $S_k^{O_*}$ and then computes the bit $D(1^{k}, \nu, (\bar x_k, \bar\pi_k, (a_A)_k, c_k), \alpha)$. It outputs this bit. By Claims 10, 11, and 12, we know that
$$\left|\mathbf{E}D'(\mathcal{EPR}_k) - \mathbf{E}D'(\mathcal{ETR}_k)\right| = \left|\mathbf{E}D(S_k^{O_{g_k}}) - \mathbf{E}D(S_k^{O_{g_k-1}})\right| > \frac{1}{T\,Q(k)}$$
for all $k \in K$ (note $T \le |c| \le k^{10}$, so this is an inverse polynomial in $k$). This contradicts Lemma 4.2.3, refuting the assumption given by Equation 4.18; we have so established Claim 7, as well as Theorem 4.3.1. □

4.4 Postmortem

We begin with a few comments about the proof of Theorem 4.3.1.

Requirement of an honest majority. The assumption that $t < n/2$ was only required because Theorem 4.1.1 requires it; a garbled program reveals no useful information as long as $t < n$. Indeed, the proof we have given remains unmodified to establish this claim, apart from relaxing the bound in the conclusion of Claim 6 to "$\ge \nu(k)/n \ge \nu(k)/k^{10}$."

Generality of techniques. Some of the techniques used in the proof of Theorem 4.3.1 are applicable to other simulator arguments. One of these methods is considering the probabilistic analog of deterministic protocols and simulators, as we did in passing from $\bar P$ to $\hat P$ and from $\bar O$ to $\hat O$. Another is considering oracles which "shut up" when asked oracle queries they find inconvenient to answer. The high-level method of replacing one oracle by another, the "oracle substitution argument" we have employed, is likely to be generally applicable.

Vector-valued secure computation. For simplicity, we have described the protocol for string-valued computation. There is no difficulty extending the protocol to vector-valued computation, and there is no difficulty carrying the proof over to this case, as well. To collaboratively compute a vector-valued function $f : (\Sigma^l)^n \to (\Sigma^l)^n$, the players compute, instead, a string-valued function $f_* : (\Sigma^{2l})^n \to \Sigma^{nl}$, defined by
$$y_* = f_*(\vec x\,\vec r\,) = (r_1 \oplus f_1(\vec x)) \cdots (r_n \oplus f_n(\vec x)).$$
To compute $f$, each player $i$ is instructed to choose a random $r_i \in \Sigma^l$ and then run the protocol for securely computing $f_*$. Then, each player $i$ outputs $y_i = r_i \oplus y_*[(i-1)l + 1 : il]$, as sketched after the theorem below. Using this method, the following extension of Theorem 4.3.1 can be obtained:

Theorem 4.4.1 (Main theorem, vector-valued computation) Assume a one-way function exists, and let $f = \{f_c\}$ be a vector-valued function family in which each $f_c : (\Sigma^{l_c})^{n_c} \to (\Sigma^{l_c})^{n_c}$ is described by a circuit $C_c$ for computing it. Then, for some absolute constant $R$, there is a polynomial-time, $R$-round protocol $P$ that $t$-securely computes $f$, where $t = \lfloor (n-1)/2 \rfloor$. ■
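The reduction behind Theorem 4.4.1 is just one-time padding of each player's output slice. Here is a toy sketch (the function $f$, sizes, and the in-the-clear computation of $y_*$ are stand-ins; in the real protocol $y_*$ would be produced by the constant-round string-valued protocol):

```python
# Toy sketch of the vector-valued-to-string-valued reduction: publish one
# padded string y*; player i can unmask only its own slice.
import secrets

l, n = 8, 3                                   # bytes per value, players

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def f(xs):                                    # any vector-valued function
    return [xor(xs[i], xs[(i + 1) % n]) for i in range(n)]

def f_star(xrs):                              # f*: pad component i with r_i
    xs = [x for x, _ in xrs]
    return b"".join(xor(r, y) for (_, r), y in zip(xrs, f(xs)))

xs = [secrets.token_bytes(l) for _ in range(n)]
rs = [secrets.token_bytes(l) for _ in range(n)]   # each player's private pad
y_star = f_star(list(zip(xs, rs)))                # publicly computed string

# Player i unmasks its own slice; other slices look uniformly random to it.
ys = [xor(rs[i], y_star[i*l:(i+1)*l]) for i in range(n)]
assert ys == f(xs)
```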
Bibliography

[BB89] J. Bar-Ilan and D. Beaver, "Non-Cryptographic Fault-Tolerant Computing in a Constant Number of Rounds of Interaction," Proc. of the 8th PODC (1989), 201-209.

[Ba86] D. Barrington, "Bounded-Width Polynomial-Size Branching Programs Recognize Exactly Those Languages in NC¹," Proc. of the 18th STOC (1986), 1-5.

[Be88a] D. Beaver, "Secure Multiparty Protocols Tolerating Half Faulty Processors," CRYPTO-89 Proceedings, Springer-Verlag. To appear in J. of Cryptology.

[Be88b] D. Beaver, "Distributed Non-Cryptographic Oblivious Transfer with Constant Rounds Secret Function Evaluation," Harvard University Technical Report TR-13-88.

[Be91] D. Beaver, "Formal Definitions for Secure Distributed Protocols," in Distributed Computing and Cryptography: Proceedings of a DIMACS Workshop, October 1989. (Paper not presented at workshop but invited to appear in proceedings.)

[BG89] D. Beaver and S. Goldwasser, "Multiparty Computations with Faulty Majority," Proc. of the 30th FOCS (1989), 468-473.

[BFKR90] D. Beaver, J. Feigenbaum, J. Kilian and P. Rogaway, "Security with Low Communication Overhead," CRYPTO-90 Proceedings, Springer-Verlag.

[BMR90] D. Beaver, S. Micali and P. Rogaway, "The Round Complexity of Secure Protocols," Proc. of the 22nd STOC (1990), 503-513.

[Be87a] J. Benaloh, "Secret Sharing Homomorphisms: Keeping Shares of a Secret Secret," CRYPTO-86 Proceedings, Springer-Verlag.

[Be87b] J. Benaloh, "Verifiable Secret-Ballot Elections," Yale University Ph.D. Thesis (1987), TR-561.
Wigderson, "Completeness Theorems for Non-Cryptographic Fault- Tolerant Distributed Computation," Proc. of the 20th STOC (1988), 1-10. [BS81] G. Blakley and R. Swanson, "Security Proofs for Information Protection Sys- tems," Proc. of IEEE Symposium on Security and Privacy (1981), 75-81. [B182] M. Blum, "Coin Flipping by Telephone," IEEE COMPCON (1982), 133-137. [BM82] M. Blum and S. Micali, "How to Generate Cryptographically Strong Sequences of Pseudo- Random Bits," SIAM J. of Computing, 13(4):850-864, 1984. Earlier version in Proc. of the 23rd FOCS (1982). [BD84] A. Broder and D. Dolev, "Flipping Coins in Many Pockets," Proc of the 25th FOCS (1984). [BH88] R. Boppana and R. Hirschfeld, "Pseudorandom Generators and Complexity Classes," Advances in Computing Research, Vol. 5, Randomness in Computa- tion, Silvio Micali, editor. JAI Press, 1989, 1-26. [CCD88] D. Chaum, C. Crepeau and I. Damgard, "Multiparty Unconditionally Secure Protocols," Proc. of the 20th STOC (1988), 11-19. [CDG87] D. Chaum, I. Damgard and J. van de Graff, "Multiparty Computations Ensur- ing the Privacy of Each Party's Input and Correctness of the Result," CRYPTO- 87 Proceedings, Springer- Verlag, 87-119. [CK91] B. Chor and E. Kushilevitz, "A Zero-One Law for Boolean Privacy," SIAM Journal on Discrete Math, February 1991. Earlier version in Proc. of the 21st STOC (1989), 62-72. [CGK90] B. Chor, M. Gereb-Graus and E. Kushilevitz, "Private Computations over the Integers," Proc. of the 31st FOCS (1990), 325-344. Earlier version by Chor and Kushilevitz, "Sharing over Infinite Domains," CRYPTO-89 Proceedings, Springer- Verlag, 299-306. [CGMA85] B. Chor, S. Goldwasser, S. Micali and B. Awerbuch, "Verifiable Secret Sharing and Achieving Simultaneity in the Presence of Faults," Proc. of the 26th FOCS (1985), 383-395. [CR87] B. Chor and M. Rabin, "Achieving Independence in Logarithmic Number of Rounds," Proc. of the 6th PODC (1987). 97 [C185] R. Cleve, "Limits on the Security of Coin Flips When Half the Processors are Faulty," Proc. of the 18th STOC (1986) 364-369. [Co85] S. Cook, "A Taxonomy of Problems with Fast Parallel Algorithms," Information and Control, 64 (1985) 2-22. [Cr90] C. Crepeau, "Correct and Private Reductions Among Oblivious Transfers," MIT Ph.D. Thesis, 1990. [DLM] R. DeMillo, N. Lynch, and M. Meritt, "Cryptographic Protocols," Proc. of the 14th STOC (1982), 383-400. [DH76] W. Diflie and M. HeUman, "New Directions in Cryptography," IEEE Transac- tions on Information Theory, 22(6):644-654, (November 1976. [Ed65] J. Edmonds, "Paths, Trees, and Flowers," Canadian J. of Mathematics, 17:449- 467, 1965. [FM88a] P. Feldman and S. Micali, "Optimal Algorithms for Byzantine Agreement," Proc. of the 20th STOC (1988), 148-161. [FM88b] P. Feldman and S. Micali, "One Can Always Assume Private Channels," un- published manuscript (1988). [GHY87] Z. Galil, S. Haber and M. Yung, "Cryptographic Computation: Secure Fault- Tolerant Protocols and the Public-Key Model," CRYPTO-87 Proceedings, Springer- Verlag, 135-155. [Go89] O. Goldreich, "Foundations of Cryptography — Class Notes," Spring 1989, Technion University, Haifa, Israel. [GGM86] 0. Goldreich, S. Goldwasser and S. Micali, "How to Construct Random Func- tions," J. of the ACM, 33(4):792-807, 1986. [GKL88] 0. Goldreich, H. Krawczyk, and M. Luby, "On the Existence of Pseudorandom Generators," Proc. of the 29th FOCS (1988), 12-24. [GL89] 0. Goldreich and L. Levin, "A Hard-Core Predicate for all One- Way Functions," Proc. of the 21st STOC (1989), 25-32. [GMW86] O. Goldreich, S. 
[GMW87] O. Goldreich, S. Micali and A. Wigderson, "How to Play Any Mental Game," Proc. of the 19th STOC (1987), 218-229.

[GV87] O. Goldreich and R. Vainish, "How to Solve any Protocol Problem: An Efficiency Improvement," CRYPTO-87 Proceedings, Springer-Verlag, 76-86.

[GL90] S. Goldwasser and L. Levin, "Fair Computation of General Functions in Presence of Immoral Majority," CRYPTO-90 Proceedings, Springer-Verlag, 75-84.

[GM84] S. Goldwasser and S. Micali, "Probabilistic Encryption," Journal of Computer and System Sciences, 28(2):270-299, 1984.

[GMR85] S. Goldwasser, S. Micali, and C. Rackoff, "The Knowledge Complexity of Interactive Proof Systems," SIAM Journal on Computing, 18(1):186-208, February 1989. Earlier version in Proc. of the 17th STOC (1985), 291-305.

[GMR88] S. Goldwasser, S. Micali, and R. Rivest, "A Digital Signature Scheme Secure Against Adaptive Chosen-Message Attacks," SIAM Journal on Computing, 17(2):281-308, April 1988.

[GMT82] S. Goldwasser, S. Micali, and P. Tong, "Why and How to Establish a Private Code on a Public Network," Proc. of the 23rd FOCS (1982), 134-144.

[GMY83] S. Goldwasser, S. Micali, and A. Yao, "Strong Signature Schemes," Proc. of the 15th STOC (1983), 431-439.

[Ha88] S. Haber, "Multi-Party Cryptographic Computation: Techniques and Applications," Columbia University Ph.D. Thesis (1988).

[Ha90] J. Hastad, "Pseudo-Random Generators under Uniform Assumptions," Proc. of the 22nd STOC (1990), 395-404.

[ILL89] R. Impagliazzo, L. Levin and M. Luby, "Pseudo-Random Generation from One-Way Functions," Proc. of the 21st STOC (1989), 12-23.

[Ki88] J. Kilian, "Founding Cryptography on Oblivious Transfer," Proc. of the 20th STOC (1988), 20-29.

[Ki89] J. Kilian, "Uses of Randomness in Algorithms and Protocols," MIT Ph.D. Thesis (1989).

[Ki91] J. Kilian, "General Completeness Theorems for 2-Party Games," Proc. of the 23rd STOC (1991).

[KMR90] J. Kilian, S. Micali and P. Rogaway, "The Notion of Secure Computation," unpublished manuscript, 1990.

[Le85] L. Levin, "One-Way Functions and Pseudorandom Generators," Combinatorica, 7:357-363, 1987. Earlier version in Proc. of the 17th STOC (1985).

[LMR83] M. Luby, S. Micali and C. Rackoff, "How to Simultaneously Exchange a Secret Bit by Flipping a Symmetrically Biased Coin," Proc. of the 24th FOCS (1983).

[Me83] M. Merritt, "Cryptographic Protocols," Georgia Institute of Technology Ph.D. Thesis, February 1983.

[MR91] S. Micali and P. Rogaway, "Secure Computation," in preparation, 1991.

[MRS88] S. Micali, C. Rackoff and B. Sloan, "The Notion of Security for Probabilistic Cryptosystems," SIAM J. of Computing, 17(2):412-426, 1988.

[NIV] Unattributed, Ecclesiastes (New International Version, date of original unknown), International Bible Society (1977).

[Or87] Y. Oren, "On the Cunning Power of Cheating Verifiers: Some Observations about Zero Knowledge Proofs," Proc. of the 28th FOCS (1987), 462-471.

[PSL80] M. Pease, R. Shostak and L. Lamport, "Reaching Agreement in the Presence of Faults," J. of the ACM, 27(2), 1980.

[Ra81] M. Rabin, "How to Exchange Secrets by Oblivious Transfer," Technical Memo TR-81, Aiken Computation Laboratory, Harvard University, 1981.

[Ra88] T. Rabin, "Robust Sharing of Secrets when the Dealer is Honest or Cheating," Hebrew University Master's Thesis. Also described in [RB89].
Ben-Or, "Verifiable Secret Sharing and Multiparty Protocols with Honest Majority," Proc. of the 21st STOC (1989), 73-85. [RSA78] R. Rivest, A. Shamir and L. Adleman, "A Method for Obtaining Digital Signa- tures and Public Key Cryptosystems," Communication of the ACM, 21(2):120- 126, February 1978. [Sh79] A. Shamir, "How to Share a Secret," Communication of the ACM, 22(11):612- 613, November 1979. [SRA81] A. Shamir, R. Rivest, and L. Adleman, "Mental Poker," in Mathematical Gar- dener, D. D. Klarner, editor, Wadsworth International (1981) 37-43, [TW87] M. Tompa and H. Woll, "Random Self-Reducibility and Zero Knowledge In- teractive Proofs of Possession of Information," Proc. of the 28th FOCS (1987), 472-482. [Ya79] A. Yao, "Some Complexity Questions Related to Distributed Computing," Proc. of the 11th STOC (1979), 209-213. [Ya82a] A. Yao, "Protocols for Secure Computation," Proc. of the 23 FOCS (1982), 160-164. [Ya82b] A. Yao, "Theory and Applications of Trapdoor Functions," Proc. of the 23 FOCS (1982) 80-91. [Ya86] A. Yao, "How to Generate and Exchange Secrets," Proc. of the 27 FOCS (1986).