# Full text of "mit :: lcs :: tr :: MIT-LCS-TR-097"

## See other formats

This blank page was inserted to preserve pagination. '-h-~akj ~d.tVv THE COMPLEXITY OF FINITE FUNCTIONS id i^daKwr stf.3 ^rd O&J.'Issml -ra^ali s Cambridge sis* . ^i .1 - : J • J dUs^a ?&:{:) nWO-Kfe 8X Jl iCs^vl^qg-S 3Ateb'iO aj*t3s«&sg a*!3 3ss«siigs3 3onrrso b 1p ioo2<j sw 3e43~ »i &M5 lo assise •fell 5«!3 ii £ -$&btb UBt-lupmi nol3onuJ "ie pas srf^ isi cs*olj Harass*] *o*>*iq a it's d i^uk 31 s «Ism»Jt3s-£ siM iav*o gffo-r HASSJKH UW g lH -to £flS.I?:»aul! sis sstyg m-'ii&siTc Zfl3 , 33V9WCJM -■?:rise*^ 3ifi»Jt^fc*vni &' :36! .?rsa:J3s$ Thai 'mlus's l^aq £ ssatij iin -^dnux: tts 02139 -5 > JJJ! iiasi so i3"00<.i tipoo snj THE COMPLEXITY OF FINITE FUNCTIONS ABSTRACT Lower bounds on the length of formulas for finite functions are obtained from a generalization of a theorem of Specker. Let f: {0,1,... ,d-l} n -» {0,1,... ,d-l) be a function which can be represented by a formula of length ^ en. For any m, if n is sufficiently large, there is a restriction f: { 0,1, . . . ,d-l} m -» { 0,1,. . . ,d-l} of f which is representable by a special class of formulas called homogeneous e-complexes. By showing that certain functions do not have restric- tions representable by homogeneous e-complexes, we are able to conclude that the length of formulas representing the mod p sum, p > d, or the connectedness of a pattern on a discrete retina cannot be bounded by a linear function of the number of variables in the formula. Also considered are perceptrons over finite fields (cyclic per- ceptrons). It is shown that cyclic perceptrons of bounded order cannot represent the geometric predicate connectivity. An interesting aspect of this is that one proof of the corresponding result for bounded order perceptrons over the rationals rests on the inability of the latter to represent the parity function. However, the parity function requires order 1 if the field has chracteristic 2; thus, this proof breaks down in the case of cyclic perceptrons. Another geometric predicate that cannot be represented by bounded order cyclic perceptrons is Euler number equals k (for an arbitrary k). However, this predicate can be represented by bounded order percep- trons over the rationals. It must be noted, however, that our proofs are different and much simpler than the corresponding proofs derived by Minsky and Paper t for perceptrons over the rationals. Finally, we investigate k-pattern spectra of a discrete retina. This is the 2 k -tuple, each component of which corresponds to the number of times a particular kxk pattern occurs on the retina. It is shown that the only topological predicates that can be determined from k-pattern spectra of discrete figures are functions of the Euler number of the figure. This report reproduces a thesis of the same title submitted to the Department of Electrical Engineering, Massachusetts Institute of Technology, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, February 19 72. mimmx^ wm ^ittm -'fi'^^ m i ^ ^ v. m mm ^ r f * ** ,<. / ***** * i Uj^ULi p ^.Wpi l ipjIl i ll l LI I Jm^- ' iipPi ' l ' . I WIU ll .1 J.HW^L«KjHWWP*wwiMp mu! AgKMQWIJSPGBMENTS H*5 : .1 In the first place, I cp^^ Aefe% ^ jg^fl^^u^k jto Professor Albert R. Meye^ t»,tfea qftFCft ^ 3 M^F^^ft^f it ^.*r hom * learned the heuristics of Research. ^<i£$$§£%u\fg&^g iijtifoduced me to the problems described he? ,a, f ^ r ^a^J (3 t l %ijQ <#$«&% l<?n,g hours with me suggesting improvements and modifications. I am also indebted to my readers, Professors Michael J. Fischer 89 xs !' qatot)** s* J * i and C. L. Liu for valuable suggestions. I would like to thank Professor Frederick C. Hennie and -*5TO-vf r ;.- f swi:»<;8 rC IS Project MAC for financial support during my study. Last but not least, thanks are due to Miss Marsha Baker for consenting to type this thesis. •■'■■--'■ ■■: 'iXlS'-""^~: ?f> i'1gy».J :>ffT' ? ,£ -. ^i.-./u MU<: £ ocr SAT %ij H 1 I»!£ : : ms -i y;s~:w^i^ CONTENTS CHAPTER ONE: CHAPTER TWO: INTRODUCTION AND SURVEY 5 1.1 Finite Functions 5 1.2 Formulas 5 1.3 Measures of Complexity 9 1.4 Problems Related to the Length Measure 13 1.5 Specker's Theorem 20 1.6 Cyclic Perceptrons 23 A GENERALIZATION OF A THEOREM OF SPECKER 26 2.1 e-Complexes 26 2.2 The Generalized Specker's Theorem 38 2.3 On Specker's Theorem 41 CHAPTER THREE: APPLICATIONS OF THE GENERALIZED SPECKER THEOREM 50 3.1 Counting mod p 51 3.2 Connectivity 55 3.3 The Length of Symmetric Functions 60 CHAPTER FOUR: CYCLIC PERCEPTRONS 74 CHAPTER FIVE: PATTERN COUNTING MACHINES 89 APPENDIX A: CERTAIN PROPERTIES OF SHORT FORMULAS 97 APPENDIX B: THE LENGTH OF THE MOD 2 SUM OVER II 112 LITERATURE 116 BIOGRAPHICAL NOTE 118 CHAPTER ONE INTRODUCTION AND SURVEY 1.1 Finite Functions Let n be nonzero and finite; then a partial function IN ■"♦JKT , defined on only finitely many n-tuples, is called a finite function . We will restrict our attention to a subclass of finite functions. D = {0,1, . . . ,d-l] is an initial interval of^KT. Then we will consider ?, the set of all (total) functions D -♦ D for all possible D and (finite and nonzero) n. Let f : D n -♦ D. Then f is identified with a (functional) table with d rows (corresponding to all possible n-tuples over D) and n+1 columns (corres- ponding to the n arguments and the value of f). Obviously the number of func- tions D -» D is d . Consider any function f i D -» D for arbitrary D, n. We will say that f depends on the i argument if and only if there exist two n-tuples a = (a n a. , . . . ,a ) and b = (b. , . . . ,b , . . . ,b ) such that a . = b for j t i , lin lin J j a. / b., and f(a) ^ f(b). Suppose that f does not depend on its j* h argument; then we will say that the j argument is a fictitious argument. 1.2 Formulas Let there be given the countable sets H ■ {x^x^...} of variable symbols and fi of operator symbols. Each element of fJ is a name for a function in ^, and conversely each function in ^ has a name in 0. Let 9 £ represent the function f : D n -♦ D. Then we will write argOP) = n and dom(9) = D. 1.2.1 Definition A D- formula is a finite expression F = cp(G..,...,G ) such that tp G H , arg(cp) = n, dom(cp) = D, and either G. 6 H or G. is a D-formula for 1 ^ i ^ n. A formula is simply a D-formula for some D. Let F be an arbitrary D-formula and let x be the highest numbered variable symbol appearing in F. Then F represents a function f : D -» D. This correspondence is well-known and we will not describe it in detail. Without danger of imprecision, F will also be considered as a representation for all functions obtained from f by adding fictitious arguments. Let there be given two formulas F and G. Suppose that F represents a certain function f, and also a representation for f can be obtained from G by possibly choosing different variable symbols. Then we will say that F is equivalent to G (F = G). Remarks. Usually, if we are dealing with D- formulas for a single domain D, we represent the identity function by a variable symbols (i.e., we omit the operator symbol for the identity). In the formal model we use, we cannot do this since it would be ambiguous. Also, for purely technical reasons, we insist that every operator has at least one argument (otherwise, the wording of several definitions and results would be more cumbersome). Thus, we do not allow constants. Rather, instead of constants, we use operators with one fictitious argument. Suppose we are given the formula F. Occasionally, we will say "Replace the variable x (in F) by the constant a". This is to be interpreted as "Replace the variable x with a (y) " where y is a variable symbol not appearing in F. Let f : D -+ D and let g be an arbitrary finite function of n arguments with domain E £ D and such that g = f E. Let F be a D-formula for (i.e., representing) f. Then we can also say that (F,E) represents g. From now on we will not be pedantic, and we will simply say that F represents g. Some of the main results in this thesis are concerned with the question, given a specific function D -+ D , how much can we simplify its representation if we choose an E-formula for it with D f E. If F is an arbitrary formula, then the set of variables appearing in it will be called its support (denoted by S(F)). The set of operators appearing in F will be called its basis (denoted by B(F)). Let § s ft. Then the set of formulas F such that B(F) C $ will be called the set of formulas over f. Hopefully without too much danger of ambiguity, we will also say that $ is a basis of operators (for formulas over $). All the significant results we will describe deal with formulas over $ when $ is finite (and representing a set of operators with domain D for a single value of D). From now on, whenever a basis of operators $ is introduced, it is always assumed finite. Usually, we are interested only in bases that allow all function D -♦ D for a certain D and arbitrary n to be represented. Such bases will be called complete bases (for D). Notation. Elements of % will always be denoted by lower case Latin letters. The various bases of operators we will use will be denoted by capital Greek letters; operators (i.e., basis elements) will be denoted by lower case Greek letters (except for well known operators for which established notation exists); formulas will be denoted by capital Latin letters; and D will always refer to the domain of formulas. d will denote D . 1.2.2 Example . If D = {0,1}, then the functions D -» D for arbitrary n are known as Boolean functions. A complete basis for (0,1} conists of the binary operators A (conjunction) and V (disjunction), and the unary operator — - (complementation), This basis shall be denoted by II. The formula F = V(A(— (x ),x 2 ),A(x , — ( x 2 ^^ over II represents x 1 x„ (the mod 2 sum of x 1 and x„). Usually, this is written as x.. A x V x- A x_. We have S(F) = {x..,x 2 }, and B(F) = H „ A convenient representation of formulas is by trees. This is a standard device that will not be described; suffice it to say that to each formula F there corresponds a tree T(F) whose terminal nodes are labelled with variable symbols and the nonterminal nodes with basis symbols. As an example, let F be as defined in Example 1.2.2. Then T(F) is shown in Fig. 1.1. Given a formula F, we need a notation for subformulas of F. The definition of sub formula is the standard one: (1) F is a subformula of F, (2) if F = cd(F ,...,F k ), then if F . for 1 ^ i ^ k is not a variable symbol, any subformula of F. is a subformula of F, and (3) subformulas of F are only objects satisfying (1) and (2). Subformulas distinct from F are proper subformulas. Let G be a subformula of F such that G = cp(H., , . . . ,Hg). Then we will say H. =G . for 1 £ i £ jfc. This notation can be iterated. In Example 1.2.2, i .1 F = -(x 9 ). However, note that F „ . is a variable symbol which according to our definition is not a formula. This can be remedied by replacing this particular occurrence of the variable symbol x. by id(x^). For this reason we will require that all the bases we consider contain the identity function whether this is specifically mentioned or not. If G = F .i(l).i(2). i(r)' then ^ = J ^)J ( 2 )° ■ • J( r ^ is called th e index of G (for completeness, let X denote the index of F). If G is a proper sub- formula of F, then F = H(X,G) where X U S(G) = S(F) and H(X,z) is a formula (determined by j ) where z appears only once. We write H = F/G. In this case, with F and G as given, we will also write S*(G) = X (i.e., the variables of F F that appear outside of G) . We define S*(F) = <t>. The subscript F will F generally be suppressed when it will be clear to what formula F we refer to. In what follows, whenever we will deal with a subformula G of F, it will be assumed that the index of G is also given; for if not, then, e.g., F/G and S*(G) are not uniquely defined. Frequently, formulas will occur where certain variables have been re- placed with constants. Suppose F is a formula over I, X G S(F), and a € D; then, F with all variables except those in X replaced by a will be denoted X X by F . Obviously, F is a formula over § U {a}. If f is an arbitrary function, a a X a subset of its arguments, then f has the analogous meaning, viz., the cl function obtained from f by restricting the elements outside of X to a. The functional table of f X is obtained from that of f by deleting all columns a except those that correspond to X and retaining only the rows with a entries in the deleted columns. 1.3 Measures of Complexity Let us introduce the three most widely studied measures on formulas: (1) Length. The length of a formula F, denoted by L(F), is the number of occurrences of variable symbols in F. In other words, it is the number of terminal nodes of T(F). 10 (2) Cost. The cost of a formula F, denoted by C(F), is the number of opera- tor symbols in F. In other words, it is the number of nonterminal nodes of T(F). (3) Depth. The depth of a formula F, denoted by D(F), is the depth of nest- ing of operators in F. In other words, it is the number of arcs on the long- est branch of T(F). Now, given an arbitrary function f : D -» D and a (finite) basis $, we define the length of f over $ as L(f,§) = min({4: There exists a formula F over $ for f such that L(F) = I}) If f cannot be represented by a formula over §, we define L(f,f) = °°o Simil- arly, for cost and depth. It is noteworthy that all the measures above are closely related. In fact, c «L(f,$) ^ c (f,$) ^ C 1 -L(f,$) (1.3.1) c 2 -log 2 (L(f,$)) iD(f,l) * c 3 -log 2 (L(f,3)) (1.3.2) for an arbitrary function f such that L(f,$), C(f,$), and D(f,$) are finite, and certain constants c Q , c., c 2 , and c„ that depends on $. The basis I is also arbitrary, except in the case of the right inequality of (1.3.2) where it must be such that all the constants and the function g (see Lamma 1.3.1) may be represented. We first establish the relation between cost and length (1.3.1). 11 Any formula F over $ can be built up from one which uses only one opera- tor symbol (an elementary formula) by successively replacing variable symbols with new elementary formulas. If F does not contain one-argument operators, then whenever we increase the cost during the build-up (by adding an elementary formula with cost 1), we also increase the length. Specifically, the length increases by between n . -1 and n -1 where n . and n are respectively J min max mm max the smallest number larger than 1 and the greatest number of arguments of an operator of §. This results in the estimate 1 - -L(F) £ C(F) <: — — r -L(F) (1.3.3) n -I n . -I max mm where c = 1„ Suppose F contains one-argument operators. In other words, T(F) contains nodes with branching factor one. Let the maximal number of such nodes that occur one after another on any branch of T(F) be c*; then (1.3.3) still applies with c = c* + 1. (1.3.1) is obtained from (1.3.3) by noting that the minimal length or cost representation of any function (over the chosen basis $) can be achieved with a formula where c* ^ d (the number of functions D -♦ D). The left inequality in (1.3.2) is established by a trivial counting argu- ment (the maximal number of terminal nodes in a tree with branching factor ^ n and depth d is n ). The right side requires more effort (the following max max argument is due to R. W. Floyd). We first state the following obvious 1.3.1 Lemma . Given a formula F such that F = F-CX^F^xp) where F 2 is a proper sub- formula of F and F.. = F/F„, the following holds: 12 F = F 3 (F 1 (X 1 ,C ), F 1 (X 1 ,C 1 ),...,F 1 (X 1 ,C d-1 ),F 2 (X 2 )) where G. for D ^ i ^ d-1 is any formula representing the constants 0,...,d-l (or, as we have remarked previously, the one-argument function with constant value), and F„ is any formula representing the function g(z Q , . . . ,z^_^,z^_ = "z n if z, = 0: z, if z, = 1,..., z, , if z, = d-1. d 1 d ' d-1 d Let F be an arbitrary formula over §, and let G be a proper subformula of F. We already know that F = H(X,G). The claim is made that if L(F) > 1, G can be chosen in such a way that L(H)-1,L(G) < ^MJj- -L<F) d.3.4) max where n is as defined previously. (Remark: L(H)-1 is the number of max r occurrences of the variables of S*(G) in H. ) To find G use the following procedure: Start with F and proceed to sub- formulas of F. Assume you are considering the subformula K. Then two cases can arise. Either among K for 1 £ j £ k where k is the number of arguments • J of the outermost operator of K there is one, j', such that L(K , ) ^ ct'L(F) (0 < a < 1 will be determined later with the purpose of obtaining the lowest possible estimate of L(H)-1 and L(G)), or not. In the first case, proceed to K ., and containue. Otherwise, set G = K „ where j" is such that L(K „) = . j • J • J max (L(K .)) and terminate. Before the procedure terminates, L(K) ^ ct'L(F). l^j^k ' J Thus — — £ L(G) < CO-L(F). This also means (l-a)'L(F) £ L(H)-1 < (1-- ) ri max max .L(F) (because L(G) + L(H)-1 = L(F)). The lowest bound for L(G) and L(H) is obtained by setting a = 1 ; hence (1.3.4) max 13 Now apply Lemma 1.3.1 with G replacing F ? and H replacing F . F is of depth c, depending on §. If the outlined procedure is applied recursively to H(X,C ) for <; i <C d-1 and to G, we obtain in (1.3.2) c„ = , C , where i 3 log b n +1 + 2 , max T b = . n max Note that unlike the cost-length relationship, the minimal value of depth may not be achieved by the same formula as the minimal value for length. Apart from the relationship between the various measures, depth and cos' will not be treated further. Even though in what follows (in this chapter) many things hold mutatis mutandis for depth and cost, most of the specific discussion and the examples shall be confined to length. 1A Problems Related t o the Length Measure In this section we will mention several questions that have been asked about the complexity of finite functions, their status as of this writing, and how they relate to the work to be described here. t A more precise expression is obtained if the right side of (1.3.2) is re- placed by • log. (L(f ))+i for some constant I. Namely, if we start out J-Og.D L with a formula F and decompose it according to (1.3.4) and Lemma 1.3.1, then the length of H(X, C.) and G is bounded by ~L(F)+k where k is the length of C.. After n applications of Lemma 1.3.1, the lengths of the relevant formulas are bounded by ^ L(F) + k^FT +. ..+ k^+k « ^ (L(F)) + -^— j- ; hence the figure above. 14 The Problem of Aggregate Length Let $ be a complete basis (for a certain domain D). The statement of the problem is: What is the largest number L(n,$) such that there exists a function f: D n -+ D and L(f,$) = L(n,§)? It has been studied by several authors, and is now effectively disposed of. Riordan and Shannon [Ri42] first derived a lower bound for L(n,IT). Actually they studied series-parallel contact networks, but the two models are equivalent. The first upper bound (for the same model) was obtained by Shannon [Sh49]. Krichevskii [Kr59] derived a lower bound for L(n,$) for arbitrary domains and bases, while Lupanov [Lu59] obtained the best upper bound for the general case. The result is 0(L(n,$)) = rr (1.4.1) log a n where 0(f(n)) = g(n) means that lim — pf is finite and nonzero. There are two remarks that are in order here. The first is that Formulas represent finite functions efficiently; i.e., the total number of formulas (over a given basis $) of length up to L(n,$) closely matches the number of func- tions of n variables. (1.4.2) The second is The fraction of functions D n -♦ D that can be represented by formulas of length up to L(n,f ) • (l-e)for an arbitrary < e ^ 1 approaches zero as n -» °°. The interested reader may obtain more information in the literature cited above. 15 Obviously, we could define functions C(n,§) and D(n,$) analogous to L(n,§) in terms of the cost and depth measures. In general, such functions (aggregate complexity functions) can be defined in connection with any model for the representation of functions D -♦ D and any measure on this model (an obvious variation of L(n,$) is to remove the condition of completeness on $). It should be noted that the asymtotic behavior of aggregate complexity functions remains an active area of research. For references on the subject, see Lupanov [Lu70]. The Minimization Problem Investigation of the complexity of finite functions started on representa- tions of Boolean functions by logical circuits. In fact, formulas can be thought of as circuits with fan-out one. Thus, the first problems studied were those a logic designer is likely to ask: Given a finite function, what is the minimal circuit (formula) that represents it (i.e., find the complexity, and do so "effectively"). Unfortunately, no statisfactory solution to the minimization problem exists (for any measure). This does not mean that it is impossible to obtain a minimal formula for a given function f : D n -» D; rather that existing algorithms are impractical. Thus, it is always possible to order formulas according to length, and then search all formulas up to length L(n,f) for the d n first formula that represents f; but since there are d functions of n argu- ments this approach is absurb. At the present, all existing algorithms for the minimization of functional representations employ some sort of an exhaustive search (e.g. , the Quine algorithm for the minimization of disjunctive form representations of Boolean 16 functions). In fact, there is reason to believe that a more efficient method does not exist, i.e., 1.4.1 Conjecture Any generally applicable exact minimization procedure is comparable (in terms of computational complexity) to an exhaustive search among formulas. It is useful to consider a specific machine model. Let us consider implementations of such a procedure as a deterministic one-tape Turing machine M $ (see, e.g., Arbib [Ar69]) that receives as its input the d -tuple defining an arbitrary function f : D -♦ D, and whose output is the minimal formula F (over I) for f. Conjecture 1.4.1 gives us that the computation time of M- may attain an exponential (in the length of the input). Let us venture a more restrictive and precise version of Conjecture 1.4.1: 1.4.2 Conjecture Let M. be as described, m is the length of its input, and let Tl(m) be a $ function such that ^' m ' -» as m -» °° for an arbitrary constant c > 1. Then m c the proportion of inputs of length m at which the running time of M exceeds T|(m) approaches 1 as m -♦ °°. Actually, the specific machine model on which the procedure of Conjecture 1.4.1 above is implemented is not particularly important. It can be easily shown (see, e.g., Arbib [Ar69] , Chapter 4) that different deterministic machine models (this applies to the most widely used models, e.g., one-tape and multi- tape Turing machines) can simulate each other in such a way that the running 17 time of one is related to the running time of another at most by a polynomial. In this way, whenever the running time is exponential (in the length of the input) in one case, it must be so also in others. It seems that Conjecture 1.4.1 was first expressed by Yablonskii [Ya59]. A very interesting result connected with this subject was recently obtained by Cook [Co71]. He obtained strong evidence that a simpler problem requires nonpolynomial time. The problem is that of recognizing whether a certain disjunctive normal form (for a Boolean function) represents the constant 1. Cook showed that if this problem could be solved in polynomial time (by a deterministic one-tape Turing machine), then a number of other problems that are regarded as very difficult (e.g., given the graphs G., and G^, determine whether G.. is isomorphic to a subgraph of G 2 ; the recognition of primes; etc.), would also be rapidly computable. Note that a fast minimization machine would give us also a fast constant recognizer; hence, Cook's results supports Conjecture 1.4.1. The Classification Problem In view of the difficulty of finding an exact nontrivial solution to the minimization problem (i.e., one that does not employ exhaustive search), present research is directed at establishing bounds for the length of functions. We consider sequences of functions f,,... of 1,... arguments and study the growth rate of the length of f . Thus, we can talk of classes of linear (length) sequences, quadratic (length) sequences, etc. Also of nonpolynomial (length) sequences. Unfortunately, if a sequence belongs to a nonlinear class, it is very difficult to estimate its length. We cannot even assign represen- tatives to the polynomial classes of degree > 2, let alone the nonpolynomial 18 classes. In fact, at the present we have only a very limited store of examples of nonlinear sequences. 2 n Consider the Boolean function f = .©., x.. Subbotovskaya [Su61] gave a n i=l l 2 3/2 striking proof that 0(L(f , II)) ^ n . It was known already to Shannon (see 2 2 [Sh49], or [Ya54]) that 0(L(f ,11)) £ n (the length of this sequence, of course, grows linearly if © is used). Unfortunately, it seems that the technique of [Su61] cannot be generalized to d > 2. Subbotovskaya' s result has recently 2 been improved by Khrapchenko [Kh71]. He succeeded in showing that 0(L(f , II)) 2 ^ n . Since this result employs a very interesting technique, and since it has not yet been translated into English, it is reproduced in Appendix B. Neciporuk [Ne66] discovered a sequence of Boolean functions f such that n 2 0(L(f ,§)) = ~ for an arbitrary basis ?. It is true that the functions n' log 2 n J involved in the Neciporuk sequence are rather "artificial" in that, while defined in a straightforward way, they have no special significance; however, lately Harper and Savage [Ha71] have succeeded in applying the Neciporuk tech- nique to a practical combinatorial problem (The Marriage Problem). Neciporuk 's construction is based on the following lemma: Let f be a Boolean function of n arguments. Consider a subset X of the arguments of f and the set of restrictions of f to X obtained by setting the arguments outside of X to constants. Let the number of such restrictions be r. If F is any formula over a finite basis $ for f, then the number of occurrences of variables representing the arguments in X is 2: clog.r where c depends on the basis § (for the proof of this see [Ne66] or [Ha71]). The Neciporuk function f of n arguments is then obtained as follows: The n arguments are arranged in a rectangular array with dimensions as shown in Fig. 1.2. Each argument x. . is associated with a 0-1 valued m-tuple a.. ° ij — ij 19 such that (1) not all components are 0, and (2) if (i,j) 4 (k,-0 then a. . ^ a, ,. Then we define f = © x.. © K(a..,k) all i,j i£j where K(a. . ,k) denotes the conjunction of those arguments x, whose second subscript (t) corresponds to nonzero components of a... It can be verified that the number of restrictions of f to the variables of an arbitrary row (except, perhaps, the last which may be imcomplete) ob- , . „n-m tained by replacing the variables of the other rows with constants is l This follows from the fact that any Boolean function can be uniquely represen- ted by a Boolean polynomial (see Lamma 4.5). Then, by the lemma above, the number of occurrences of variables of any row (except, perhaps the last) in any formula for f is ^ c(n-m); hence, the length of f over $ w c- — (n-m). n 2 * In other words, 0(L(cp ,$)) = ■: for an arbitrary basis $. T n *-°&2 n Neciporuk's construction may be viewed as a solution to a special case of the following problem (the problem of exhibiting a function of arbitrary length): Given a basis $ and a number k ^ L(n,$), exhibit a function n n 2 f: D -> D of length ^ k over §. In Neciporuk's case 0(k) = , n . Since so few examples of functions that are known to be of large length exist (in spite of (1.4.3)), the reader has no doubt already gained the impression that this problem too is very difficult. However, we again have the trivial solution that consists in examining formulas in n variables in the order of their length, recording the functions they represent, and choosing the first previously unencountered function represented by a formula of length ^ k. In fact, it is reasonable to state an analog of Conjecture 1.4.1: 20 1.4.3 Conjecture The problem of exhibiting a function of arbitrary length is comparable (in terms of computational complexity) to an exhaustive search among formulas. We again make this conjecture more precise on the example of determin- istic one-tape Turing machines. 1.3.4 Conjecture $ is an arbitrary basis. N. is a deterministic one-tape Turing machine with input (n,k) where n is arbitrary and k ^ L(n,§), and whose output is the d -tuple describing a function f of n arguments such that L(f,§) ^ k. Then there exists a constant c > 1 such that if k ^ e«L(n,$) for any < £ £ 1, n the running time of N, on input (n,k) exceeds c when n ^ n( e ). We can sum up the discussion of the classification problem as follows. The problem is far from understood. At the present no sequence of functions 2 is known whose length grows faster than n . Isolated examples of sequences 2 with growth rate ^ n are known, and present research is directed at inven- ting more general techniques that can be used for estimating the complexity of whole classes of sequences. Also techniques have to be devised for d > 2. The importance of this will be discussed below in Section 1.5. 1.5 Specker's Theorem The first general technique for proving the nonlinearity of a large class of sequences (of Boolean functions) was discovered by Specker [Ho68]. Let the basis IT U {x © y} be denoted by S . Then 21 Theorem (Specker) . If f is a Boolean function of n arguments, if L(f,£) ^ c«n for some constant c, then for any integer m, if n ^ T| (m,c), a subset X = [x 1 ,...,x } of the arguments of f can be found such that (1) „ m m f J = c © c • n (i© x .)ec 2 © x. (1.5.1) i=l i=l where c n , C- , c„ are Boolean constants and Tl (m,c) is a certain number- theoretic function. Furthermore, (2) if the basis is II (the other assump- tions remaining unchanged), then c~ =0. This theorem has been used by Hodes and Specker to show that the predi- cate n S x = mod k (1.5.2) i=l 1 for k > 2 and x. f {0,1} is of nonlinear length over S. Using the second statement of the theorem, they are also able to give n an alternative proof of the nonlinearily of the length of © x. over n. i-1 Another result obtained with Specker 's Theorem is the fact that some geometrical predicates (in particular, connectivity) discussed by Minsky and Papert [Mi69] are of nonlinear length over S (see Hodes [Ho70]). In Chapter Two we will formulate and prove a generalization of Specker' s Theorem (Theorem 2.2.2) to include the case d > 2 and multi-argument oper- ators in $. Our proof reveals the nature of both results more clearly. t _ T] Will be discussed in Chapter Two. 22 They belong to a class of combinatorial results reminiscent of Ramsey's Theorem (see Ryser [Ry63]). In fact, an earlier version of our proof of Theorem 2.2.2 used Ramsey's Theorem. Besides this, Theorem 2.2.2 enables us to derive the nonlinearity of new functions (sequences of functions) such as counting mod p where p is a prime, d possibly equals p, but there are restrictions on the basis, etc. An example of an im- provement over existing results is the connectivity predicate. Hodes [Ho70] proves that it is nonlinear if d = 2. However, in Automata Theory, for example, the result that a certain language can be computed in non- linear time if k states are used in the finite control would be considered weak. Rather we search for proofs that work for arbitrary finite controls. The Generalized Specker Theorem (Theorem 2.2.2) gives us a tool for proving the nonlinearity of the length of the connectivity predicate regardless of the domain D and basis §. We can apply it to connectivity by "reducing" connectivity (for the meaning of "reduction" see [Mi69] or 3.2) to certain symmetric functions. We should note that the generalization of Specker 's Theorem that we prove is the obvious one to attempt; but, as the reader will see, the proof turns out to be less straightforward. As an indication, consider (1.5.2). It does not generalize directly to d > 2 since, e.g., the function {0,1} -» n _ {0,1} defined by E x. = mod 6 can be represented in linear length with i=l x d = 3. This is because 2, 5, n [Ex. =0 mod 6] = [ E x = mod 3] A [ E x =0 mod 2] 1=1 X i=l l i=l X 23 Hodes and Specker do not derive any bounds for the lengths of the functions investigated by them. This question is asked (and to an extent answered) in 3.3. 1.6 Cyclic Perceptrons Cyclic Perceptrons will be treated in Chapter Four. They are an application of ideas of Minsky and Papert to the representation of functions by combinations of finite operators. In particular, one of the concerns in [Mi69] is to formalize the intuitive idea that the connectivity predicate, being "global" in nature, cannot be computed (or represented) by a "simple" combination of "local" predicates. The perceptron is the predicate E a.'tp. ^ i€I L X where I is an indexing set, a. € Q, the rationals, cp. € $, a set of Boolean functions (whose value is interpreted as being either the rational or 1. The cyclic perceptron is defined as where a. € F, a finite field, Y ^ F, and other symbols have the same inter- pretation as before. Thus, both represent a certain Boolean function. Minsky and Papert introduce the concept of the order of a perceptron (the maximal number of arguments on which cp depends where i ranges over I). They define then the order of a predicate as the minimal order of a per- ceptron that represents the predicate. They formalize "local" by defining 24 an infinite predicate sequence to be local if and only if every member is representable by a perceptron of order ^ r, for some finite r. They are then able to show that connectivity is nonlocal. The concept of order can also be applied to cyclic perceptrons. Chapter Four will contain results on the order of the various predicates introduced in [Mi69]. In particular, connectivity is shown to be nonlocal. This will be an extension (to finite fields of arbitrary characteristic) of the results described in [Vi70]. Chapter Five describes a model of computation (Pattern Counting Machines) that again performs- a "local" computation followed by a "global" computation. In this case the "local" computation is even more constrained than in the case of perceptrons. The result is that no matter how cleverly we utilize the "local" information in the subsequent "global" phase, the connectivity predicated cannot be computed. 25 T(F) for the formula F in Example 1.2.2 Fig. 1.1 .th l row .th , j column X. , i; ■ number of columns: m = llog 9 nl + 1 number of rows: ni/ml The array of arguments used in the definition of the Neciporuk function f n Fig. 1.2 26 CHAPTER TWO A GENERALIZATION OF A THEOREM OF SPECKER 2.1 e-Complexes Throughout this section, all formulas are D-formulas for some fixed (but r arbitrary) domain D, and all operators are functions D -»D. Given the formulas F.,...,F , we shall call the formula F = y(F^,. . . ,F^) where cp is an arbitrary operator a parallel combination (PC) of F^, . . . jF^. cp is called the decoding operator of F. Let F(X,z) be a formula where the distinguished variable z appears only once, and let G be an arbitrary formula. Then F(X,G) shall be called a series combination (SC) of F and G through z. 2.1.1 Definition We give an inductive definition of an elongated n- component (e^component) for n ^ 0. (1) Let cp. be an arbitrary unary operator and z an arbitrary variable symbol Then cp. (z) is an e -component. z is the input variable while cp in is the input operator . (2) Let cp be an arbitrary binary operator, G an arbitrary e n _ ^component , and x $ S(G). Then F = cp(x,G) (or cp(G,x)) is an e^component. The input vari- able and input operator of G are also the input variable and input operator of F. x is a lateral variable of F. Any lateral variable of G is also a lateral variable of F. cp will be called an internal operator of F. 27 An example of an e -component is given in Fig. 2.1. Let F be an arbitrary e -component, and let x be the sequence of lateral variables arranged in the order they are connected to the branch of T(F) extending to the input variable. Then x is the lateral sequence of F. If F is an e -component, then the lateral sequence of F is \ (the empty sequence). For example, the lateral sequence of the e-component in Fig. 2.1 is x..,...,x . An e -component with all internal operators equal is a homogeneous e - component . 2.1.2 Definition A formula F is an e r - complex if (1) F is a PC of the e n -components F^..., F , and (2) the lateral sequence of F. for 2 <: i <: r is either equal to the lateral sequence of F. , or the reverse of it. F ,..., F are the components of F. If the variables of F- are numbered as in Fig. 2.1, the second condition of Definition 2.1.2 means that any compo- nent F ..,..., F either appears as in Fig. 2.1, or as in Fig. 2.2. The compo- nents of the former kind will be known as standard components, while those of the latter kind will be called the reverse components. The lateral sequence of F 1 will also be called the lateral sequence of F. r Both in the case of e -components and e -complexes, one or both indices will occasionally be omitted if the particular property they refer to is irrelevant to the argument at hand. An e-complex composed of homogeneous e-components is a homogeneous e- complex . 28 One might wonder what the purpose of introducing e-complexes is since for appropriate r and m every function of n variables can be represented by r an e -complex. Thus, it would seem that this class of formulas is trivial. r However, we will be concerned with e -complexes where r remains fixed as n ' n r grows without bounds, and this will allow us to obtain interesting results. We introduce some notation. Let F be an e -component with lateral sequence x. ,., N ,x, , ON , . . . ,x. , N . a € D is an arbitrary constant, cp. denotes i(l)' i(2)' i(n) J j the internal operator corresponding to x...,. . Also set cp = cp. . Then T n+1 T m <p.(a,<P. +1 (a,...,<P k _ 1 (a,cp k )...)) if 1 ^ j < k ^ n+1 cp* , N =1 cp. if 1 ^ j - k £ n+1 undefined otherwise Note that cp^, . * is a unary operator if j = n+1, otherwise it is a binary operator (if it is defined at all). Usually we will suppress the superscript a because it will be clear what constant is referred to. We now state the simple 2.1.3 Proposition Let F be an e -component with lateral sequence x and input variable z. y. is an arbitrary subsequence of x of length m ^ and a € D is an arbitrary constant. If we denote the set consisting of z and the elements of y_ by Y, Y then F is equivalent to an e -component G. 29 Proof Let x = (x 1 ,x 2> ... ,x n ) and y = (Xj/jj, x i (2)'••• ' x i(m)^ ~ -' Set ± ^ = ° and i(m+l) = n+1. Then G has the operators ilf. = CD,. , . lN , n . ,. N v for i ^ j ^ j (i(j-l)+l, l(j)) m+1 (ijr is the input operator of G). □ 2.1.4 Remark Obviously, Proposition 2.1.3 holds for e-complexes as well; one merely has to perform the above construction for each component. Proposition 2.1.3 will be frequently invoked. Namely, we will take an e-complex F, select a subsequence i e x, the lateral sequence of F, and obtain G as above. In this case, G is called the result of an a - merger with basis y on F. We introduce another restricted class of formulas. 2.1.5 Definition A series parallel combination of e- components (SPCeC) is obtained accor- ding to the following rules: (1) An e-component is an SPCeC. (2) Let F and G be an e-component and an arbitrary SPCeC respectively. Then the SC of F and G through the input variable of F is an SPCeC. (3) If F.,...,F are SPCeC s, then a PC of F. F is an SPCeC. 1' ' r ' 1' ' r (4) An SPCeC is only an object satisfying (1), (2), or (3). 30 Given an arbitrary SPCeC F, we describe its set of components . If F consists of the single e-component G, then G is the only component of F. If F is the SC of an e-component G and another SPCeC H, then the set of components of F consists of G and the set of components of H. If F is a PC of F ..,..., F , then the set of components of Fconsists of the sets of components of F. for 1 ^ i ^ r. Among the components of F, those whose input variable corresponds to a terminal node of T(F) will be called terminal components while the others will be called internal components . An example of an SPCeC is given in Fig. 2.3. This particular SPCeC has four terminal components and two internal components. 2.1.6 Proposition An SPCeC is equivalent to a PC of r e-components where r = d-I+J and I and J respectively are the number of internal and the number of terminal component of F. Proof F can be converted into a PC of e-components by using Lemma 1.3.1. The estimate of the number of e-components in the PC is also obtained from there. ^ Remark It is a simple matter to verify that if F of Proposition 2.1.6 has k components, then I ^ k-1; and thus r £ d- (k-l)+l. 2.1.7 Proposition F is a SPCeC with k components F^...^. F. for 1 ^ i ^ k is an e^- component for n ^ 0, and, furthermore, the sets of lateral variables of F^ and F. are equal for 1 <■ i, j ^ k. Let X be the set of lateral variables of F. and let Z be the set of input variables of F. Then for any m ^ and l 31 a f D if n ^ TL (m,k) where T] (m,k) is a certain function (to be defined), there exists a subset Y £ X with |y| = m such that F is equivalent to an e -complex G with Y as the set of lateral variables. Furthermore, r ^ d* (k-l)+l. m Proof If m = 0, we can immediately apply Proposition 2.1.6 and obtain an e n ~complex where r is as described in the statement of the proposition; thus TL (0,k) = 0. We assume, therefore, that m > 0. We recall the following familiar result: 2 Let i(l), i(2) ,. . . ,i((p-l) +1) be a sequence of distinct integers. Then we can extract a subsequence of length p that is either increasing or decreasing (for the proof see Berge [Be71] p. 16). Without loss of generality, we can assume that the lateral sequence of F. is x-j.-.jX . Then the lateral sequence of F 2 is x. ,^, Xj/ 2 ) ' * " " ' x i(n)" The sequence i(l), i(2),..., i (n) consists of distinct integers; therefore, if n ^ (n -1) +1, we can apply the above result and find a subset X^ £ X of n variables such that after performing an a-merger with basis X- on all components of F, the lateral sequences of the descendants of F^ and F^ are either the same or opposite. We can continue in this way, processing one after another all components. We end up with an SPCeC with components G 1 ,...,G k such that the lateral sequence of G ± for 2 <: i <■ k is either equal to that of G, or the reverse of it. To obtain G, we apply Proposition 2.1.6. In order that Iy| = m, we must have 2 k-l n ^ T^On.k) = (m-1) for m > 1. The estimate for r is obtained from the Remark following Proposition 2.1.6. D 32 Another equivalence that will be used later is given by 2.1.8 Lemma E is an e r -complex, X is the set of its lateral variables, and Z is the set of its input variables. Then for any m ^ and a € D, if n ^ Tt 2 (m,r), YlJZ there exists a subset Y C X with 'Y I = m such that E is equivalent to a ' a r it, homogeneous e -complex F. Proof If m = 0, we simply use Proposition 2.1.3, and the result is a homo- geneous e n -complex (obviously, any e Q -complex is homogeneous). Then T! 2 (0,r) =0. Thus, from now on we assume that m ^ 1. The proof will be given for the special case when E has two components: a standard component E- and a reverse component E . It will then only be indicated how to generalize the proof. A procedure (The Homogenizing Procedure- -HP) will be described that will 2 transform an e -complex G consisting of a standard component G. and a reverse P i component G„ with p a TL(q) (for a function Tl that will be defined later) and with the properties: (1) There exist (possibly empty) subsets 1^ and R 2 = D such that (^(a.y^I^ = id R (identity on 1^) and ilr^a.y)/^ = id R for 1 < i ^ p where CD. and i|r, is an operator of G 1 and G 2 respectively and the first argument corresponds to the lateral variable, and (2) V(i si,j«p) [ffl^x.y) € Rj_ => »j(x,y) = ^(x.y)] (2.1.1) (i.e., the operators of G, are identical on the inverse image of R^). Simil- arly for the operators of G 2 on R 2 « 33 Remark Note that if R- and R include the range of every operator, Property 2 translates into the identity of the operators. In particular, this holds if R. = R 2 = D. 2 Remark Note that an arbitrary e -complex satisfies Properties 1 and 2 with R 1 = R- = 0. 2 The result of applying HP will be an e -complex H that will either be q homogeneous, or will have Properties 1 and 2 with S. and S„ replacing R.. and R 2 respectively and R / S. or R ? S_. Due to the Remarks above and to the fact that D is finite, repeated application of HE on E finally yields F. The condition on n is n * \(m,2) = T) 3 Q) 3 (...T1 3 (in) ...)) (2.1.2) 2d times This bound for n corresponds to the worst case when R. or R_ increase by only 1 on each application of HP. Before describing HP, note the useful fact that because of Property 1, Properties 1 and 2 are preserved under a-mergers. Description of HP The lateral sequence of G is of length (v+l)«u-l for certain values of u and v that will be defined later. Consider the sequence EL 3 (CD ((k-l)-u+l,k-u)' + ((v-D-u+1, (v-k+l).u) ) (2.1.3) for k = l,...,v. Sequence (2.1.3) is illustrated in Fig. 2.4. The two vertical lines represent G- and G„; the numbered horizontal outlets represent the lateral variables (with the corresponding number); the boxes indicate the 34 variables and operators that take part in the formation of any particular cd,. .v and ilr,. • an 'x' beside a variable indicates that it is not set to the constant while 'a' indicates that it is set to a; the two checked boxes represent the first member of (2.1.3). In the sequence (2.1.3), either (Case I) the ranges of c P//k_-i\. .i if. u "\ and ilr,, ., r1 , , ,.. N for 1 ^ k < v are included in R and R respectively, ((v-k)*u+l, (v-k+l)-u) 1 2. or (Case II) not. Case I . If v is large enough, we can find q identical elements in the sequence (2.1.3). Let the indices k corresponding to these elements be k 1s ...,k . Performing an a-merger with this set as basis, the desired e-complex H is obtained. Note that in this case we use Property 1 of G. Namely if cp* is the first component of a pair in (2.1.3) whose range C R^ and if 9 is an arbitrary operator of G^ then 9(a,ep*) = cp* (similarly for the second components of the pairs in (2.1.3) and operators of G,,). Thus, the components of the identical pairs in (2.1.3) become the operators of H. d 2 d 2 A bound for v is q« (d ) (d is the number of operators D ■+ D). Case II . Assume t P(/£_ 1 ). u+1 i. u )^ h ' C ^ ^ R l f ° r S ° me b » c ^ D and 1 <= I* v (the case when ♦ ((v _ i) . u+1> (v .^i). u ) < b > c > * R 2 Can be treated similarly). But then cp,„ , nO>,c) t R n (2.1.4) for all ^ j ^ u-1 (as a consequence of Property 1). Provided that u is large enough, we can find an element e £ D that appears w times in the se- quence (2.1.4). Let the indices j corresponding to the appearances of e be j (1),... ,j(w) (see Fig. 2.5). Obviously then (all the variables considered except x« have been set to a), 35 '(i-u-Mt), *-u-j(t-l)-l) (a > e) =e for 2 £ t ^ w (the first argument of cp . . corresponds to the lateral variable). Thus, ^•u-jCt), i.u-j(t-l)-l) (a ' y) f R l U{6] =ld At this point we consider separately two cases: Case Ha ^ = 1 and j (w) = u-1. We perform an a-merger with the basis consisting of the variables with indices u-j (t)-l for 1 ^ t ^ w-1. As a re- 2 suit of this we obtain an e -complex G' such that Property 1 holds for G' w-1 1 on R 1 U [e] (G' satisfies Property 1 on Rl e R„; in any event R^-Ri = $). Note that at this point Property 2 is still satisfied only on R 1 and R by Gl and Gi respectively. Case lib . A 4- 1 or j (w) < u-1. We perform an a-merger with the basis consisting of the variables with indices 4«u-j(t)-l for 1 £ t £ w. As a result 2 of this we obtain an e -complex G" such that Property (almost) holds for G " (the same remarks regarding G" and Property 1 as well as G", G^ and Property 2 apply as in Case Ha). The only exception may be cd'.' (the operator of G" that is closest to the decoding function). By definition, cp" = ©,.. . ■ / -\_^) : and there is no assurance that cp''(a,e) = e. We may rectify this situation by absorbing co" into the decoding function in the following way: Let G" = 9(G", G' r ). Now set x.. = a (after the a-merger the variables have been renumbered). Let S(G") - {Xj} = U. Then we have (G ,r ) ^ = G' = e(cp^(a,GJ), C-.1 ; wh ere G' equals G' minus co * and Gi equals G' with the input operator modified as follows: ilf! = -dr "(a, ilr ! r ) (remember that G' r is a reverse components, and, in w in 2 2 hence, x, is attached to i]f''). Clearly, G' is an e -complex satisfying 1 w w -l 36 Property 1 with R- U {e] replacing R 1 . We can now resume considering Cases Ila and b together. To obtain H (with components HL and H„) we must find among cp.' for 1 ^ i ^ w-1 q operators that are identical on the inverse image of R.. U (e) (i.e., (2.1.1) with R- U [e] replacing R ) and again perform an a-merger. We again emphasize that the operators of H are identical only on the inverse image of R , and this property has not been violated by any of the transformations of the original e-complex G. To obtain q operators that are identical on the inverse image of R. U { e} , 2 2 it is sufficient that w-1 ^ q-d ; therefore, u ^ d. (q.d +1) and T] 3 (q) = q 2 .d-6 3 + q .d.(6 2 +6)5d-l (2.1.5) d 2 where 6 = d . This is obtained from the values of u and v derived above. Recall that T] (q) = (v+l)«u-l, the length of G, 7L for r = 2 can then be obtained from (2.1.5) and (2.1.2). The proof for the general case is obtained by defining the Generalized H Procedure (GHP) with the corresponding function Tl, . We consider instead of (2.1.3) the sequence 1 r' 1 r" . , (s,t)"-" a5 (s,t)' Vs-.f) V,t) ; (CD' \a,\.) \*,^-) (>s ,z ) - ' vs ,u; 1 r' where s = (k-l)-u+l, t = k»u, s' = (v-k)-u+l, and t' = (v-k+l)«u. co. ,- . . ,cp 1 r" denote the operators of the standard components of G while i|r. ,. . . ,i|f. denote the operators of the reverse components (r ' + r" = r). Without detailed argument we state that in the general case v = q« 6 while u remains the same (u is determined by the requirements of Case II at which time only one 37 component is considered). From this we obtain analogously to (2.1.5) \(q) = q 2 -d.6 r+1 +q-d-(8 r +6)+d-l TL(m,r) can then be obtained from T) 2 0n,r) = n 4 CTl 4 (...T1 (m)...)) r«d times _ which is an analog of (2.1.2). 2.1.9 Remark As we have seen, TL in Lemma 2.1.8 depends on r, the number of components of F. However, we shall mostly be using e-complexes that contain many components that are identical except for the input operator (such e-complexes are obtained e.g., by the use of Proposition 2.1.7). It may be checked that in the application of GHP only one representative from each such group of components need be considered. This significantly reduces 7], . Similarly, in computing TL from (2.1.5), only &•& compositions are required where I is the number of groups of similar components (corresponding to d compositions for each group of similar components). 2.1.10 Remark The operators of a homogeneous e -complex F obtained as a result of applying Lemma 2.1.8 possess an added property that will be used later: Let R be the range of cp(x,y) an internal operator of R (x corresponds to a lateral variable); then CD (a,y) R = id (the identity on r). This fact follows from R the definition of HP (GHP). This particular property of the operators of the components of F will be called the I -property relative to y. In what R 38 follows, we will always suppress "relative to y" since there is no danger of ambiguity. We will simiarly suppress the subscript R unless we will be interested in a specific range. We will abbreviate "F is an e-component (complex) whose operators possess the i a -property" to "F is an e-component (complex) with the I -property". A familiar and convenient way of representing a binary operator cp(x,y) is by a labeled directed graph. The graph of cp, denoted by T (?) , is defined as follows: The nodes of F(cp) are labeled with elements of D. A directed arc labeled with a £ D exists from b to c if and only if cp( a ,b) = c. If D = {1, 2, 3, 4} and R = (2, 3, 4] an example of a graph T(co) for 2 an operator cp with the I -property is shown in Fig. 2.6. Given an arbitrary e -component F, the output of the operator Cp is ^VW^W-'-VV^in^"-^ in the case of a homogeneous e -component with internal operator to this will be abbreviated to cp(x. x, ,,. ..x , cp. (y)). k k+1 n' in J 2.2 The Generalized Specker's Theorem We first give the following 2.2.1 Definition Let $ be an arbitrary basis and a € D any constant. Let F be a formula over § with S(F) = X U Y U Z such that Ixl ^ n -1 (the maximal number of 1 ' max arguments of an operator of $), Y is disjoint from X, but otherwise arbitrary, 39 and Z = {z} is a singleton (disjoint from X and Y) such that z occurs only once in F. The set of functions f(X,z) represented by all possible such formulas F with the elements of Y replaced by the constant operator a will be denoted by $ . In particular, every operator of $ with all but k ^ 1 arguments (k is •a q arbitrary) replaced by a is in $ . Note that if cp(X,z) €. $ , then tp may qualify for $ by virtue of a number of different representations. If z (or any variable of X) in any one of them corresponds to a variable that occurs only once, it is called a distinguished argument ). The other arguments are called free arguments . Thus we may easily find a basis $ and an operator cp such that all arguments are at the same time distinguished and free. We now define a restricted class of e-components and e-complexes: § is an arbitrary basis and a f D is any constant. Let F be an arbitrary e Q -component; then F is an e n ~component over $ . Let cp(x,z) f $ be a binary operator, z a distinguished argument (hence x is free), and G an e .-component over $ ; then cp(x,G) is an e -component over $ . An e-complex over 9 is an e-complex such that all its components are e-components over $ . The main result of this chapter is 2.2.2 Theorem Let there be given the function f : D -► D such that L(f ,$) ^ c.n for some constant c and basis $. Then for any m ^ 1 and a f D if n ^ T] (c ,m), there exists a subset Y of the arguments of f such that |y| = m and f is either a a constant or is represented by F, a homogeneous e^-comp lex over i a with the I -property, Y as the set of lateral variables, constant input operators, and r <■ d- (2c-l)+l. 40 Proof If f has m fictitious arguments, then let Y be the set of these arguments, Y and f is a constant. From now on we assume that f has ^ m-1 fictitious a arguments;. The statement of the theorem gives us that there exists a formula E over $ such that L(E) ^ c«n. Therefore, there are ^ 1/2. n variable symbols representing the arguments of f which either do not appear in E or appear ^ 2*c times. In other words, there are s l/2n-m+l variable symbols t^at actually appear in E and such that the number of occurrences of each is ^ 2-c. Denote the set of these variables by X- . If n ^ 2- T| (n„,2c >m-l, we can apply Lemma A. 9 and obtain a subset X 2 X £ X with |x | = n and such that E is equivalent to E_, and SPCeC over $ a with at most 2c components and such that the set of lateral variables of every component is X„. If n„ ^ IWn-^c) we can apply Proposition 2.1.7 and obtain a subset X r X, <: X with JX,| = n and such that E 3 is equivalent to E3, an e n -complex over f a with r £ d«(2c-l)+l). The estimate for r is obtained at this point. If n„ > Tl 2 (m,2c). we can apply Lemma 2.1.8 and Remark 2.1.9 and obtain F, the desired homogeneous e r -complex over ? with the I -property. The I property is a consequence of Lemma 2.1.8. 41 Discussion of TL. The present proof yields T1 5 (m,c) = 2. 1| 8 CT1 1 (Tl 2 <:m,2c), 2c ),2c)+m-l The exact representation of TL is extremely complex, and in what follows we shall use only a very rough approximation. In Appendix A it is seen that 71 (t,k) (as a function of k) grows faster than iexp(b,2k) for any constant b. The functions TL and TL contribute only insignificantly to this, and thus we state: TL(m,c) ^ iexp(b,4c) for c ^ c(b) 5 (2.2.1) and an arbitrary constant b (Later we shall se that the size of TL prevents us from obtaining any interesting bounds for the functions investigated with Theorem 2.2.2. The size of TL which contributes most to TL results from the technique used in Lemma o 5 A. 3 to obtain a nesting sequence for a given formula F. It is not known whether this technique can be improved. Our guess is that it cannot be.) D 2.3 On Specker's Theorem In this section it will be shown how Specker's Theorem follows from Theorem 2.2.2 (the statement of Specker's Theorem is given in section 1,5). In Theorem 2.2.2 set D = (0, lj , $ = £, a = and let f be as described. Then by Theorem 2.2.2, we obtain that for an appropriate choice of n, we can find a subset X of the arguments of f with |x| = m and such that f Q is either a constant or represented by ♦(F 1 ,...,F r ) 42 where ty:[0, 1} -» (0, 1] and F for 1 ^ i ^ r is either a standard or reverse homogeneous e -component over Z with the I -property and constant input m operators. The value of r is bounded as described in Theorem 2.2.2. We now analyze the various functions that can be represented by e-compo- nents with these restrictions. First note that S consists of all Boolean binary operators; furthermore, if f(x,z) € E » then botn x and z are free (because every Boolean binary operator can be represented over S in such a way that each variable appears only once), and thus there are no restrictions on the use of operators in the e-components we encounter. All possible graphs T(cp) for cp f 2 are shown in Fig. 2.7. The ones that satisfy the I -property are starred. The functions obtained by choosing a value for the constant input operator for the starred graphs are shown in m Table 2.1. In general, this function is either b n ffb.- tt where tt = IT (1 *x.) 1-1 m or c„ © c'CT where a = .$, x. for some values of b_, b. or c n , c-. Now, taking 1 i=l i i u i into consideration the fact that rr« a = 0, and that every |: (0, 1} -» -0, 1} can be uniquely expressed as a Boolean polynomial 2 h -l ffi c.«M. i=0 1 1 where c. f {0, 1] and M is the monomial (of degree one in each variable) in those among x, , . . . ,x, corresponding to nonzero bits of the binary representation of i(see Lemma 4.5) we obtain the first part of Specker's Theorem. The second part of Specker's Theorem could be obtained directly at this point; however, we will derive a generalization of it in Example 3.1.3, and thus omit it here. 43 It must be pointed out that our derivation of Specker's Theorem results in a slightly larger bound for n; however, since no known application requires a specific value for the bound, this is immaterial. Specker's bound (see [H068]) is obtained from the function y.(m,0) = m H(m,k) = 4 (k+1) .n(u(m,k-l) } k-l) by setting T) (m,c) = 2|j.(m,2c). Our bound is slightly larger due to the addi- tional processing implicit in the application of Proposition 2.1.7 and Lemma 2.1.8. However, u resembles T| and this function by far contributes the most to T] ; thus, we can state that the bounds are approximately equal. Finally, let us note the fact that Theorem 2.2.2 allows us immediately to amplify Specker's Theorem. Namely, the statement of the theorem involves the basis consisting of all binary Boolean operators. However, the proof of Theorem 2.2.2 works for bases consisting of operators of an arbitrary number of arguments. 44 ^r) — (£)-•• — (£) — ©— O T(F) where F is an e -component Fig. 2.1 ^J) @ — @ — @ — Q © © T(F) where F is a reverse component of an e-complex Fig. 2.2 45 F is an SPCeC Fig. 2.3 46 C u-1- u u+1 2u-l 2u (k-l).u+l X { k.u-1' k«u (v-l)«u+l- v.u-1 ■ VtU (v+l).u-l X Y (i a x a a x a x %~ a x (v+l)*u V'U+1- v»u ■ k.u+1- k«U' 3u-l- 2u+l 2u 2u-l u+1- u - qp ((k-l)«u+l,k.u) } ((v-k).u+l,(v-k+l).u) Illustration of the HP procedure (I) Fig. 2.4 (\r 1 1 ^ . ii 1 }Y — !> <» 47 (4-1). u-1 X.u-j (w) <p ( (J e-i).u-i^.u) (b ' c)? R i Illustration of the HP procedure (II) Fig. 2.5 2, 3 F((p) for an operator cp with the I f2 3 , -property Fig. 2.6 48 0, 1 (i) 1 ) (J) (k) 0, 1 1 ) (!>) (m) (n) CV^>0 ( ) ( 1 ) (&)■■ 0, 1 The graphs of all Boolean binary operators Fig. 2.7 49 r<$P) <P (y) in Function a a 1 c c 1 m rid® x > i=l d d 1 1 g m © X i=l i g 1 m 1 ffi © x. i-1 h m i © n (i ®x.) i=l x h 1 1 P 1 P 1 1 1 Table of functions that can be represented by e-components with the I -property (see Fig. 2.7) and constant input oper- ators if D =« {0, 1} Table 2.1 50 CHAPTER THREE APPLICATIONS OF THE GENERALIZED SPECKER THEOREM The principal results obtained previously [H068, Ho70] by the use of Specker's Theorem are 3.0.1 n A new proof that the Boolean function .§- x. is of nonlinear length over t II . This is accomplished as follows. First note that the restriction of the mod 2 sum of n variables obtained by setting certain variables to is again a mod 2 sum (but of a smaller number of variables). Now apply Specker's n Theorem (see 1.5). Suppose .6. x. is of linear length over II. Choose n large enough to obtain m = 3. The theorem states that for this particular bases c = in (1.5.1). However, it can be checked that in this case no choice of c n and c. will yield the mod 2 sum of three variables. A contra- diction. 3.0.2 n The function f: {0, l} n -» {0, 1] defined by f = 1 if and only if E x = i=l mod 3 is of nonlinear length over E. We proceed similarly as before. Assume it is of linear length. Apply Specker's Theorem with n sufficiently large to + Of course the results of Subbotovskaya and Khrapchenko are stronger for this particular example. 51 obtain m = 3. If we replace x.. , x„, x„ in (1.5.1) once by 1, 0, 0, and another time by 1, 1, 1, then the value of (1.5.1) remains unchanged. However, the value of f (with all variables except x.. , x„, x replaced by the constant 0) is different on these two assignments. Again a contradiction. Both of these results were derived by Hodes and Specker in [Ho68]. We might note that the technique of 3.0.2 can easily be generalized to counting mod k where k is an arbitrary integer (see (1,5.2)). 3.0.3 Certain geometric predicates (see [Mi69]), in particular the connectivity predicate, are of nonlinear length if expressed with binary Boolean operators (this result was obtained by Hodes in [H070]). We will not discuss this in greater detail now since this technique will be treated later in 3.2. In this chapter we will use Theorem 2.2.2 to generalize all these results. 3.1 Counting mod p Consider the function (0, 1} -» (0, 1} n 1 if Z x. = mod p i=l X f n (x 15 ...,x n ) - otherwise then 3.1.1 Theorem If p is a prime, if |d| < p, then f™ is not of linear length over an arbitrary basis. 52 Proof Suppose the statement of the theorem is not true. That is, there exists a prime p, a finite set D such that |d| < p, a basis § of operators on D, and L(f , $) £ en for some constant c. First note that if X is a subset of the arguments of f P and jx| = m, ther (f n ) Q = f P . We can now apply Theorem 2.2.2. For an arbitrary m, if n is sufficiently large, there exists a subset X of the arguments of f P with |x| = m and such that (f )_ is represented by a homogeneous e -complex F (over $ and with the I -property) with X as the set of lateral variables. In addition, since f is a Boolean function, the lateral variables of F are restricted to n ' (0, 1]. Consider now any component F. of F and T(cp ) where cp is the internal operator of F.. Since cp has the I -property, T(cp ) has the general appearance of Fig. 3.1. F. is determined by cp. and the constant input operator a.. Now let m ^ d and consider the sequence (s . ) for ^ j < m where s~ = a. and s - cp. (11. . . 11, a. ) for j > 0. Let s, ,.,. be the first element in the sequence j times that is repeated at some later point; in fact, let k(i) be the position of the first occurrence of this element. Let k(i)+j£(i) be the position of the second occurrence of this same element. Then we shall call k(i) the prefix of F. while 4(i) will be called the period of F . . Clearly, if F. is a standard (reverse) component, then cp. (x.x 2 ...x . 11... 1, a ) (cp (x x _ . ..x 11... 1, a )) where k ^ k(i) is a function of the numb er of l's among x,, x„,...,x , (x, -,..., x ) mod 4(i). 53 Thus, If we set Y = {x, ..,..., x , } and choose k-ClO to exceed or equal the prefixes of all the reverse Y (standard) components of F, then F. represents a function of the number of l's among the variables of Y mod jgcm(A(l),...,4(r)). (3.1.1) y On the other hand, by the initial assumption, F. is a function of the number of l's among the variables of Y mod p; this results in a contradiction since d < p. On the basis of (3.1.1) we can obtain the following 3.1.2 Theorem Let D be an arbitrary domain, 5 is a certain basis, and p is an arbitrary integer > 1. If 5 is such that any e-component over $ with the I -property and constant input operator has period one, then f p is of nonlinear length over $. 3.1.3 Example This is an example of a basis satisfying the hypothesis of Theorem 3.1.2. Consider an arbitrary domain D = (0,1 d-1] . Then a complete basis for D is ♦ - (min(x,y), max(x,y) ,0,1, . . . ,d-l, e Q (x),. . . , e (J _ 1 (x)} where min and max are defined in the usual way, 0,...,d-l are the constants, and f d-1 if x « i e,(x) = jtherwise S I W = 1 I ot 54 (Note that i|r, _ -. = { A,V,0,l,-,id] ; thus, a result on the nonlinearity of the length of a certain function over i|r, _ . , is also a result on the nonlinearity of the same function over II; in particular, applying Theorem 3.1.2 to ijr, - .. , , we obtain (2) of Specker's Theorem.) i|r is interesting because it gives rise to an analog of the disjunctive normal form for arbitrary D: Consider the table for an arbitrary function f : D -> D. Then d n -l f = max (M. ) i=Q ^ where M equals if the current assignment is not the i assignment and the value of the function at the i assignment otherwise. M. is represented as follows M i -"^.(I.nfr]? e a(i,n) (x n ) ' f i ) where a(i,j) is the j component of the i * assignment. We claim that i|f satisfies the hypothesis of Theorem 3.1.2. Note that a b i|; = i|f for all a, b £ D because iL contains all the constants. Therefore, we will write simply i|r * Given cp € jr with the I -property, the statement that there exists b £ D such that the homogeneous e-component with internal operator f and b for its input operator has period & is equivalent to saying that there exists a subset LED with |l] = & and cp(0,z) L is the identity (id ) while cp(l,z) L is the permutation with cycle length & (p^)« * We contend that for any L £ D, cp(x,z) € ± and c, e € D, if cp(c,z) L 55 and cp(e,z) L are 1-1, then cp(c,z) L = cp(e,z) L. Since id 4 pj. if I > 1, this will establish the original claim. This can be proved by induction on the depth 6^ of the distinguished variable z in the formula F that represents cp (since there may be many such formulas, let F be one of the formulas where the depth of z is minimal. If 6 =1 then either F = max(F',z), or F = min(F",z). Assume the first * case (the second can be argued similarly). By definition of jf , F contains only the variable x. If we replace x by c , F' represents a constant c' f D. Now if c' ^ L, then cp(c,z) L is the identity, otherwise it is not 1-1. •k If 6 > 1, then either cp(x,z) = e. (cp 1 (x,z)) where cp' £ jf_ and 6 , < 6 or cp(x,z) = cp"(x,cp'"(x,z)) where cp", cp"' f^ and 6 6 < 6 (to see this, think of F). In any case cp', cp", and cp'" satisfy the inductive hypothesis, and we are done. Note that Theorem 3.1.1 and 3.1.2 hold with f P replaced by the function n g p : D n -♦ {0,1} given by £ x = mod p since g£ f (0,l] n = fj\ i=l 3.2 Connectivity The connectivity predicate was already discussed in 1.6. It attacted considerable attention after Minsky and Papert [Mi69] succeeded in obtaining interesting results on the complexity of perceptrons that represent the connectivity predicate. Works that follows [Mi69] and that treat specifically the representation of the connectivity predicate by finite operators are, e.g., [H070], [Mi71], and [V170], 56 Minsky and Papert describe a circuit for computing the connectivity 2 predicate of depth (of the order of) (log„n) which on intuitive grounds seems minimal. This circuit translates into a formula of nonpolynomial length. Thus, the connectivity predicate seems to be a good benchmark for testing estimation methods for the complexity of functions (i.e., any appropriately general method which is presumed able to give estimates for length up to log 2n f(n) < n should declare the connectivity predicate complex). 2 Consider a set of n variables {x. .) for 1 Si, j ^n; then the connec- ij 2 tivity predicate is the function c : {0, 1] -* {0, 1} defined as follows: (we will not give a formal definition since the formalization is obvious) Given a specific assignment to the variables, consider it as a square array of 0's and l's. Then c = 1 on the empty pattern (i.e., consisting of all 0's), or if the l's form a connected pattern. By "connected" we mean that any two l's can be linked by a sequence of adjacent l's (two l's, corresponding to the variables x . and x, . are adjacent if !i-kt + !j-^| = 1). For example, the pattern in Fig. 3.2 is connected. The general approach used here to obtain an estimate for the length of c is to consider reductions of c . n n Given an arbitrary function f , a function g will be called a k- reduction of f if g is obtained from f by replacing each argument of f by a function with at most k arguments. Suppose we want to prove that f of i = 1,2,... arguments is of nonlinear length (over the basis $). Assume there exists a k-reduction g. of f. such that the number of arguments m in g is m > Ct-n for some constant < CC ^ 1. Assume L(f ,$) ^ en for some constant c. That is, there exists a formula 57 F for f and L(F ) ^ c«n. Since the length of any function of k arguments is bounded by L(k,$), we obtain L(g n ,$) ^ c- L (k,5)-n by making substitutions for variables in F . If [g.] is rearranged (and renumbered) in the order of increasing number of arguments, and all but one functions with the same number of arguments are deleted, then we obtain L(g,$) ^ c' -m for some constant c'. Finally, if we can prove (e.g., by applying Theorem 2.2.2) that g, is nonlinear, we obtain a contradiction, and we are done. Hodes [Ho70] shows the nonlinearity of the length of c over S by reducing c to the function n m V (A y ) Ay 1=1 jtt J (i.e., exactly one variable is 1) which can then be proved to be nonlinear using Specker's Theorem. Unfortunately, this reduction does not work for domains with more than two elements because this function is linear over an appropriate basis in such domains. However, another approach works, and we can state 3.2.1 Theorem Regardless of the size of D and the nature of f, c is of nonlinear 2 length (i.e,, L(c ,$) ^ en is not true for any D, §, and constant c). 58 Proof Minsky and Papert [Mi69] succeed in reducing c to counting mod 2 by exhibiting a contact network such that its connectivity depends on the number mod 2 of contact variables equal to 1, and then by simulating this network on the square array of variables (they call it the "retina"). We shall proceed similarly. c is reduced to the function s P : {0, 1} -» {0, 1} (for an appropriate t n t) defined as follows: 1 if > arguments are equal to 1 s n = otherwise s , can be represented by the connectivity of a contact network S . S is n n n shown in Fig. 3.3a. It has p contact arms for each variable y^ The value of y. corresponds to the upward position of the corresponding arms while the 1 value of y. corresponds to the downward position. The contact arm of S^ are arranged in p rows (n arms in each row). Whenever an arm for y. is in the upward position, it is connected to an arm for y..-, i n the same row; if the arm for y. is in the downward position, it is connected to an arm for y. in the next row. Thus, it may be easily checked, point A^ is connected to B , if and only if at least p among y, .•••,y are !• It: ma y also be p+1 1 n verified that in this case all the contact arms in the network are connected either to A„ or to B ,, and since these two are connected together, the whole U P+1 network is connected. S p in turn can be simulated by a rectangular array R P of 0*s and l's n n where certain positions are constant and others depend on the y^s (see Fig. 3.3b). The size of / is (3 (p+l)+l) . (3n+p-l). 59 We now show that s P is a 1-reduction of c. for some t. This is done n t by cutting R P into smaller rectangular pieces along vertical lines. The first piece is of length (£-l)-q where q = 3(p+l)+l and I will be defined later, the second through SL-lst piece is of length (£-2).q, while the I , last, piece is of length between 1 and (4-1) «q. These pieces are then arranged into an I- q * &' q square pattern T P as shown in Fig. 3.4 (the arrangement depends on the parity of I). Corresponding rows of adjacent pieces are connected by =>- or c- shaped patterns of O's (in the case when one of the positions along the cut at the row in question is 0) or l's. The unused positions of T P (corresponding to the case when the last piece r n is not of the maximal length) are replaced by O's. t is set to l-q and the variable x. . in c t is replaced by the correspon- ding position in T P (one among 0, 1, y^ or y). Obviously, the function obtained by this replacement is s P . If 3n+ P^ :> 1, I is obtained as r x"l where x is the positive solution of 2 , ,^ 3n+p-l x -2(x-l) = ~ We also have that n «(l/3q)t, and, thus (by the reasoning outline previously), if c is linear so is s. t n With the assumption that s P is linear, we apply Theorem 2.2.2 with a = 0. n In this way we obtain that (s P )q where Z is a certain subset of the arguments of s P of size m is represented by an e -complex F (with the requisite restric- n m , . p N Z p tions) where m is arbitrary. Note that (s^; Q - s^. 60 As noted in 3.1, if a sufficiently large number of variables at the beginning and end of the lateral sequence of F is replaced by 1, then F with this substitution represents a function of the number of l's among the remaining variables mod the 1cm of a set of integers £ d. The number of variables that have to be set of 1 is u i 2(d-l) (at most d-1 at each end of the lateral sequence). Thus we obtain a representation of the function S m-u if P ^ U * If P ^ 2(d-l)+2, we obtain a function s. for i ^ 2. However, it is clear that s. is not a function of the number of l's mod k for any integer k. Thus, we have arrived at a contradiction, and, hence, c is of nonlinear ' ' n length over any basis ?. D t 3.3 The Length of Symmetric Functions As we have seen in the previous examples, Theorem 2.2.2 has been applied only to functions that are either symmetric or that can be reduced to symmetric functions. While we know of no formal statement that can be proved and that asserts that this indeed exhausts the applicability of Theorem 2.2.2, it intuitively seems probable. In this section we will discuss several bounds on the length of symmetric functions (both specific functions and all symmetric functions). Recall that in 1.4 we have already mentioned several such bounds (Subbotovskaya, Khrapchenko). All of the results in this section were suggested by A. R. Meyer. 61 Does Theorem 2.2.2 (or Specker 's Theorem) give us any information on the length of the functions investigated? Hodes and Specker do not treat this subject, and, in fact, the bound that can be obtained is very weak; however, we do mention it for the sake of completeness. In an application of Theorem 2.2.2 (or Specker 's Theorem) to a certain function f, we proceed with the assumption that L(f,$) ^ c*n. To apply Theorem 2.2.2 we must have n ^ Tl (m,c) where m is a sufficiently large number to obtain a contradiction. However, m does not depend on c. Thus, n depends only on c and m is assumed constant. Consider now c as a function of n. We ask what is the maximal value c for c(n) for which Theorem 2.2.2 can be applied (and a statement contradic- ting L(f,$) ^ en obtained). c(n)«n is then a lower bound for L(f,$). Due to (2.2.1) c grows slower than (l/4)«ht (p,n) where ht (p,n) = maximal x such that n ^ iexp (p,x). Then we have c(n)-n £ (l/4)«ht (p,n)-n (3.3.1) for an arbitrary constant b > 1 and for sufficiently large n. This bound seems unrealistically low, and it is useful to compare it 3 with known bounds for length for the particular function f over some bases consisting of Boolean operators (we will suppress the subscript n). 3 It has already been established that f is of nonlinear length if D = 3 3 3 1 (0, 1] (see 3.1). We introduce the following notation: f = f ' , f ' , 3 2 n and f ' stand for S x = 0, 1, and 2 mod 3 respectively. We will repre- i=l sent f 3 '°, f 3,1 , f 3 ' 2 by the formulas F°, F 1 , and F respectively. F is obtained by the following recursive relation 62 F°(x) = F°(Y) A F°(Z) V F^y) A F 2 (Z) V F 2 (Y) A F X (Z) (3.3.2) (If X is the singleton {x}, then F (X) = x) L(F°(X)) = L(F°(Y))+L(F°(Z))+L(F 1 (Y))+L(F 2 (Z))+L(F 2 (Y))+L(F 1 (Z)) 1 2 Similar identities can be obtained for F (X) and F (X). When these identities are used recursively, we obtain i lo §? 6 9 fi 0(L(r,$)) £ n «ri (3.3.3) an exact description how we obtain a bound of the form (3.3.3) from a recur- sive relation similar to (3.3.2), see [Ya54]. This upper bound can be further reduced by using multiargument operators. 1 R 3 Let y: (0,1,2] -+{0,1,2} be the operator £ y. mod 3. Then f can be i=l X represented by a formula G which uses y recursively (i.e., the arguments of f 3 are repeatedly divided by I together with an outermost decoding operator {0,1,2} -+ {0,1}. G is of linear length. If we use D = {0,1}, y can be en- coded by two operators v ' and y"> and G translates into a formula such that log n log,k 0(L(G)) = (2k) k = n (3.3.4) 3 Thus, as & increases, the upper bound for L(f ,$) (where y $) approaches c*n. However, the gap between this bound and (3.3.1) is still huge. But, the important thing to note is that any theorem that retains the same broad assumption (bases with an arbitrary number of operators) as Theorem 2.2.2 3 cannot yield a better bound for f than (3.3.4). 63 Another example of a function that is nonlinear in length (over all 4 Boolean binary operators) by Specker's Theorem is f . However, it too has a relatively short representation (the previous and this example show that Theorem 2.2.2 is a sensitive tool for deriving the nonlinearity of functions; i.e., it can be used on functions that are only "slightly" nonlinear). A representation for f with Boolean operators is obtained by dividing 4 the arguments of f into disjoint (nonempty) pieces Y and Z, and adding the 4 4 bineary representations of f (Y) and f (Z). Let the binary representations of f (X) be given by the formulas F ' (X) and F"(X), obtained by the following recursive relations F"(X) = F"(Y) a F"(Z) F'(X) = F'(Y) © F'(Z) © F"(Y) A F"(Z) (If X is a singleton F"(x) = x and F'(x) = 0) Consequently, L(F"(X)) = L(F H (Y))+L(F"(Z)) L(F'(X)) = L(F'(Y))+L(F'(Z)+L(F"(Y))+L(F'(Z)) By choosing Y and Z always as equal as possible, we obtain L(F"(X)) = n 0(L(F"(X))) = n-log 2 n Since f (X) is represented by F ' (X) A F"(X) n'log 2 n is also a bound for 0(L(f ?) where ©, A 3. n We now turn our attention to an upper bound for the length of all symmetric functions. 64 Note that a symmetric function g: D n -» D where D = -0,1, . . . ,d-l] depends exclusively on N.,,...,N d , where N. is the number of variables equal to i. It can be represented, e.g., as g = max(M ) n (l)"'» n (d-l) where M ,^ nCd-l") ec l uals t if N = n(i) and is otherwise. The number of combinations of n(l),. . . ,n(d-l) is a polynomial inn - ( ~ ) — and the max function can also be represented in polynomial length, regardless of the basis i. The latter fact is established by representing max using the two- argument max recursively. Thus, if M . . (a-\\ were polynomial, g would also be. M. J. Fischer and A. R. Meyer discovered that M ,. v ,, n «. can, n(i; ,. . . ,n(d-l; indeed, be represented in polynomial length by using a special code for integers described by Avizienis [Av69], We will illustrate the construction on Boolean symmetric functions. It will be seen that if the basis of operators is appropriately chosen, the length of an arbitrary symmetric function is bounded above by a polynomial of a surprisingly low degree. The Avizienis code is a redundant positional representation of integers to an arbitrary base b > 2. We describe it for b = 3. An integer n is represented by all possible Tlog_nl - tuples a riog 3 nT'"' a l where a ± € {-2,-1,0,1,2} for 1 ^ i <: flog^n! and 65 Tlog nl h a. J i = n i=l The property that is exploited is that there are no long carry's in addition. Thus, if we want to add two Avizienis coded integers a = a.a, ..... a. and b = b,b , ...b. , we can do it in two steps using the following 3.3.1 Algorithm (Avizienis) (1) Find the carry c and intermediate sum r such that a.+b. = 3c, +r. i i i i where a ± , b € {-2,-1,0,1,2} and c^ r ± € {-1,0,1}. (2) Compute the sum s according to S i = r i +C i-l Let us estimate the length of the formula representing any ternary place in the Avizienis representation of N.. for X = (x^,...,x }. Again let X = Y U Z and Y H Z = 0. r ± (X) and c^X) can be represented as R t (X) = p(R i (Y),R i (Z),C i _ 1 (Y),C i _ 1 (Z)) and C^X) = X(R i (Y),R.(Z),C i _ 1 (Y),C i _ 1 (Z)) If X is a singleton, r. and c. are if i > 1, or 1 and respectively if 4 i = I. P and X are certain operators £-1,0,1} -» {-1,0,1} which can be obtained from the definition of Algorithm 3.3.1. Strictly speaking, the 66 domain used here is not permitted in our definition of finite functions; however, the difference is merely one of coding. Thus, L(R i (X)) = L(R i (Y)+L(R i (Z)+L(G i _ 1 (Y))+L(C i _ 1 (Z)) and MC^CX)) = L(R i (Y))+L(R i (Z))+L(C i _ 1 (Y))+L(C i _ 1 (Z)) If we use these relations recursively and always make Y and Z as equal as possible, we obtain 0(L(R i )), 0(L(C.,)) ^ n 2 for 1 £ i ^ Tiog nl. If D = {0,1}, we need two bits to encode r and c^. Therefore, using certain operators p', p", X' , and X" to encode p and X, we can encode R. (X) and C. (X) and combine them into a { 0,1] -formula A.^ representing the i ternary place of the Avizienis representation of N^. We have 0(L(A i )) ^ n 3 (3.3.5) Let there be given a positive Avizienis coded number a = a a _ ...a . We desire to convert it into its binary equivalent b = b b _...b 1 . Let q q-1 1 U £ (a ,...,a.}. Then we define b (U) = 1 if and only if the i bit of pi i E a.3 1 is 1. Note that even if a is positive, b.(U) may be negative a.€u x 1 for some i and U. This is further discussed below. For the moment we assume that b. (U) is always positive. We can then again compute b.(a . ..a^) by a recursive method. Let U = V U W, V w = 0. Then b^U) is the i bit of the sum of b (V)...b.(V) and b (W)...b 1 (W). q 1 q 1 67 b.(U) is represented by the formula B. B i (U) = P(B.(V),B i (W),G i _ 1 (U)) , _ , . th , where G. represents the carry from the 1 place; G.(Y) = Y(B 1 (V),B 1 (W),G 1 _ 1 (U)) G n represents the constant and P and y are certain Boolean operators. Then we obtain L(B.(U))« E L(B.(V))+L(B (W)) 1 j=l J J If a is the Avizienis representation of a number ^ n then p = llog^n I and q = Tlog^nl . Thus, we obtain the following bound for L(B (a)) log log.n 0(L(B 1 (a») < (2q) » log log n (21og 2 n) (l+log 2 log 2 n) log 2 log 3 n log 2 n « n (3.3.6) Note that (3.3.6) means that 0(L(B.(a))) ^ n £(n) for 1^ i ^ Tlog^l where e -+ as n -» °°. It has already been remarked that b(U) need not be positive. Thus b(U) must be treated as a signed number. If we use the l's complement representation and the en-around carry technique (see, e.g., [Gr59]), addition can be performed as follows. Let b(U,g Q )=G(V)+b (W)+g Q where g Q is either 68 or 1 and let g 1 denote the carry from the highest position of b(U,0). Then b(U) = b(U,0) if g x = and b(U,l) if ^ = 1. This means that in (3.3.4) we obtain (i.q) P for some constant SL instead of (2q) exp where exp has the value given above. If a^ for 1 <: j £ riog 3 nl . in B i for 1 ^ i ^ Tlog^l is replaced with A , we obtain formulas F^X) representing N in binary form. Combining (3.3.5) and (3.3.6) we obtain 0(L(F i (X), ))) £n 3+6(N) where e(n) -» as n -* °°. To obtain the desired representation of an arbitrary Boolean symmetric function, we proceed as follows: Consider the formula S. defined inductively S l = X 10 V x ll S i+1 = X i0 A S i-1 V x il A S i-1 (3.3.7) Take Sp. -| and replace x. Q and x. - by F. and F. respectively. It is easily seen that S p. -| with this replacement is identically 1 (this can be proved, e.g., by induction; for S.. it is trivially true, and the general statement follows from (3.3.7)). Let there be given an arbitrary symmetric function g. It is defined by a subset M C {0,l,...,n] of possible values of N- . Each branch of length Tlog^nl in T(S|- -i ) corresponds to one value of N- (given by the binary number j (Tlog^l) ,. . . , j (1) where K^nl , j ( Tlog^l) '* * ' > X 1, j (1) define the branch in question). If we remove branches of T(Sr. -i) corresponding to M, thus obtaining the formula S', and perform the substitution defined pre- viously to obtain the formula p, we obtain a representation for g. 69 We have 0(L(S')) £ n and thus 0(L(P)) <. n +e(n) where e -» as n -» °° . Thus, if a basis $ is given that contains all operators used to obtain P, T / »\ ^ 4+e(n). then L(g,$) £ n In Lu70 Lupanov announced a result of Khrapchenko to the effect that an 4 93 arbitrary symmetric function is of length < n * # Since the assumption were not made explicit, and the result itself is unavailable a3 of this writing, no exact comparison can be made with the estimate above. 70 > (x x .. i m m-1 T(tp i ) where cp has the I -property (only arrows labeled with and 1 are drawn). T' denotes the set of nodes x„0,a.) where m is as described in the text and x , may assume arbitrary values in {0, 1}. Fig. 3.1 x„ 00000000 00001111 00001101 00001001 01101011 01111011 00000000 00000000 A connected pattern of l's Fig. 3.2 71 Contact network S for s P n n Fig. 3.3a 72 O rH pa pa w CO pa CO i a, pa CM (X pa pq pa + pa r-<i-400i-IOOt-lo\ /r-'OOr-IOOr-fOO'-'OO'-'O , p f 5 I \ 1-1 I S*> r-l >i«-< OOr-to, i-IOO'-IOO'-<00'- ( OOr-IO ', ' P P • \ r-i i-l i-l o IfM-H >,i-) o |i-'OOi-<00'-<00'- | OOr-40 rH _ r-l C I r-l , P r-* P O O r-t ,-1 |>, >« r-i i-l I r-l lr-1 O, fi C>P C IH r-l O I <N °* fl 7 i-<OOr-(OOr-<OOi-'0 ~-L-0 O <-* O O i-l o P Oj/i-li-li-l|>-,r-l>,i-IO O-i r-l pH f5 O O r-l i-l [>*, >> CM CM IrH I i-l r-l r-l r-l O P P O H H r-IOr-lrHr-IOir^i- 1 >~ir-IOOr-IO I rH lr-1 P P .P P OOi-lr-IO|rM-l>^r-IO p-, >, i-H i-H CO CO ! / -d" <f HHHO|>iH >,i-IO]{ HHQ^H >,r-IOOi-lr-IOi-"'0 , , , co / i •* <t HrlOOHHO^rll) i-<OOr-<r-IO|>»i-< >•> r-1 O O r"l O ')\ cm cm ' j co co vt st ■ _, I5>-.*-' >-.i-IOOr-lr-IM (>>r-l f>»i-IOOi-lr-lO|>^t-l >•, i-l O , cm cm / co co r-l l-l r-l O |>M-I >,HO MHO^^ >,r-IOOr-lr-IOr-IO I r-l O O r-l r-l O CO co , <N / .»■!*' I |r*,rH/!i-JOOr-li-lo|(»i-l >*rH O O r-l O r-l r-l ! i CM CM CO CO H I^H >m-4 O O r-l r-l / ' / >, r-l >, r-l O O r-lr-t O |>% r-l >■, i-l O r-l r-l \ I CM CM i-l r-l r-l O |>v-< >,W O / j r-l r-l O fc>, r-1 >^ r-l O O r-l i-l O r-1 O —Iff CM CM r-IOOr-lr-IO|>~.i-l >> r-1 O O r-1 O i-l r-l CM CM l>M-l >M-IOOr-li-IO|r^rH $►, I-l O r-l r-l r-1 r-l O |>* r-l J^r-IOOr-lr-IOr-IO r-l i-H r-IOO!-lr-IO|>-,r-l rS' _l OOr-IO r-l I-l r-IOOi-<OOr-li-IO|>ir-< >>i-< O i-^OOr-IOOr-IOOi-lr-IOi-IO 1-lOOr-lOOi-lOOi-iOOi-lO O O r-l O O r-l O r-lr-'OOr-'OOi-IO O r-l < < CM CO CO I (X CM I a, i a. + a, Planar connectivity simulation of S P n Fig. 3.3b L p+1- 73 i.< T H for i = 5 n Fig. 3.4a J "hB, ^L T P for H = 4 n Fig. 3.4b 74 CHAPTER FOUR CYCLIC PERCEPTRONS The perceptron has already been discussed in 1.6. In the beginning of this chapter, we will first expand on that discussion in order to further motivate the study of cyclic perceptrons<> The classical perceptron (for references on the subject see [Mi69] became the subject of extensive research centered around concepts such as pattern recognition, learning, adaptive behavior, etc. A whole myth had been created around it -- about its capabilities and its potential for use. The thing that attracted people most were its ability to learn from experience and its simplicity -- it combines many small decisions, the values of the functions cp. , into a final decision by considering their weighted sum. Minsky and Paper deflated this myth by showing that such a scheme has its inherent drawbacks. In particular, it cannot compute predicates such as connectivity. The most general intuitive basis for the result that the connectivity preciate cannot be represented by a perceptron is the following: First of all the reasoning makes sense only if the complexity of the functions cp. is limited in some way; if not, we can choose to. to be the function that we desire to represent and then it can be represented by a perceptron trivially. Minsky and Papert use the order and diameter restrictions (see [Mi69]). The former is also used by us. Suppose we want to represent connectivity. Then, if the cp . ' s are bounded in complexity (so the reasoning goes), the weighted sum is too simple a function 75 to be able to integrate all the information that is required in computing connectivity. We set out to apply the same basic reasoning to models where the inte- grating function is constructed out of finite operators. In particular, we choose addition in a finite field because of the unique representation property for functions in such a field (see 4.5) which makes proofs rather simple, and because of the purely formal resenblance to perceptrons. One particularly interesting aspect of using addition in a finite field as the integrating function is that one proof of the inability of perceptrons to compute connectivity is based on the reduction of connectivity to addition mod 2. However, this function is precisely the simplest one possible in GF(2), This underscores the need to make different reductions for different models of computation that are presumed to be incapable of computing connectivity. In this chapter we shall limit ourselves to Boolean functions. We introduce cyclic perceptrons formally: 4.1 Definition GF(p ) is the finite field consisting of p elements. $ (the basis) is an infinite set of Boolean functions (0,1] -» [0,1] such that each cp € $ (w is the first infinite ordinal) depends on a finite number of arguments. Elements of $ are assumed to be ordered (in an arbitrary way). Then a (p,k)- perceptron (over ?) is a pair P = (a,Y), where a is an w-vector such that the i th component a. € GF(p ) and a.^0 for only finitely many values of i; Y c GF(p k ). 76 Given a function f: {0,1] -» {0,1], we will denote the set of arguments on which it depends by S (f ) . Let P = (a,Y) be a (p,k)-perceptron. Then P will represent the predicate (Boolean function) 00 f = [ £ a cp. € Y] (4.1) i=0 *■ X where the value of cp. € {0,1} £ GF(p k ). Obviously, S(f) £ U S^) 1 i€{j: j? «0] We will indicate the function represented by a (p,k)-perceptron P as in (4.1), or simply [P] . Let us recall a concept from [Mi69]. Given a (p,k)-perceptron P = (a,Y) over a certain basis $, its order (ord(P)) is max (S^) . i€[j: a.#)] We can also introduce the order of a function. 4.2 Definition The (p,k)-order of a Boolean function f over a given basis $ ((p,k)-ord $ (f,> is the smallest I such that there exists a (p,k)-perceptron of order I repre- senting f. If no such perceptron exists, the (p,k)-order of f is defined to be °°. Let 0. be the set of all Boolean functions with finite support. Note then that for an arbitrary Boolean function f, (p,k)-ord n (f ) is finite and <■ S(f), for all primes p and arbitrary k. Also note that for an arbitrary basis § ( P ,k)-ord n (f) ^ (p,k)-ord i (f) (4.2) 77 We show now, as is done in [Mi69], that we can choose for the basis a more restricted set. Let the set of arguments of the basis functions be H = {x 1 ,x„ ,...]; then we define the set of masks M=( A x.:Sisa finite sub- i6S x set of JW }. A convenient way of ordering M is to assign to cp€M the binary number b b ...b. where b = 1 if and only if x, appears in the conjunction defining cp. 4.3 Proposition Any Boolean function f can be represented by a (p,k)-perceptron over M for any prime p, and arbitrary k. The proof is the same as that of Theorem 1.5.1 in [Mi69], i.e., we util- ize the following correspondence between Boolean operations and operations in GF(p ) if the variables assume only the values and 1: x. A x_ ~ x • x„, x, V x„ ~ x. + x 2 - x • x 2 , x ~ 1 - x If f is a function of n arguments, then from its disjunctive normal form, by using this correspondence and by multiplying out afterwards, we obtain the following representations for f: E « 1 x 1 x 2 ... x m = x i-0 th k where a. . is the j bit of the binary representation of i and a. € GF(p ). Note that the mask cp. (see the ordering above) is represented by the monomial with exponents corresponding to the binary representation of i. 78 Theorem 1.5.3 of [Mi69] also holds in our case. We state it as 4.4 Proposition The following holds for an arbitrary Boolean function f, an arbitrary basis §, and an arbitrary integer k and prime p: (p,k)-ord M (f) £ (p,k)-ordj(f) Proof The same as in [Mi69]. Note that if we take fi for the basis in Proposition 4.4, and combine it with (4.2), we obtain that (p,k)-perceptrons over M achieve minimal order. We state without proof the following well-known 4.5 Lemma Every function GF(p ) n -» GF(p ) can be uniquely represented as a polynomial k k in n variables over GF(p ) that is at most of degree p -1 in each variable. (see, e.g. , [La67].) It has already been noted that we will be interested in whether a function can be represented by a (p,k)-perceptron with a limitation on its order. For this we need the following 4.6 Definition A sequence of Boolean functions f-,f 2> ... of 1,2,... arguments is of + finite (p,k)-order (over a given basis ?) if there exists a finite r such Bounded would be a better word, but we conform to the terminology of [Mi69] 79 that for all i (p,k)-ord- (f . ) £ r. Let there be given a (p,k)-perceptron (a,Y). If Y = (y,,...,y ], then, recalling that GF(p ) is a vector space of dimension k over GF(p), and desig- nating the j component of a. by a., (similarly for y 6 Y) , we have x. x j n eo m k CO [ E a m € Y] » V A [ E a., cp =y ] (4.3) i=0 L x h=l j-1 i=0 1J x tlJ We can restrict the diversity of perceptrons we are dealing with by noting 4.7 Proposition Let $ be a basis closed under conjunction (i.e., cp, ijr € § => cp A \|r € $). If a Boolean function f is of finite (p,k)-order over f, then it is of finite (p,l)-order (but the order may change). Proof We have f = [(a,Y)] where (a,Y) is a (p,k)-perceptron. Suppose !y( = m and the (p,k)-order of f is I. Prom (4.3) we have m k °° f = V A [ E a. ..«p. = y ] (4.4) h-1 J-1 1-0 ij x hj where a. ., v.. € GF(p). By Lemma 4.5, we know that for all a € GF(p) there always exists a poly- nomial P (x) over GF(p) of degree p-1 which takes on the value of 1 if x = a and is otherwise (the degree follows from the number of zeros of the poly- nomial). Thus substituting the Boolean operations with the field operations introduced in the proof of Proposition 4.3, we obtain from (4,4) 80 k °° k f = Q( n P ( E a -cp ),..., n P ( E a -cp )) (4.5) j-1 y ij 1-0 1J j=l y mj 1=0 1J where Q(x-,...,x ) is the polynomial (of degree m) that represents the Boolean m f function V x, . Each P is of degree p-1. Hence f can be expressed as h=l y ij a polynomial in the cp. 's of degree ^ nr (p-1). Obviously, cp^ for j > 1 can be replaced by cp since it assumes only the values and 1. Also, cp* i|r represents the function cpAi|r and |s(cp A i|r) | ^ |s(cp)| + !s(i(r)|); thus, if the basis is closed under conjunction (as, e.g., Q or M) , (4.5) describes a (p,l)-perceptron for f of order ^m«(p-l)'^. □ Remark Incidentally, this proof also shows that we can assume the cardinality of Y to be 1. Since we shall subsequently be concerned only in whether the order of cer- tain functions is finite or not, we will be able to limit ourselves to (p,l)~ perceptrons. For convenience, we will write simply "p-perceptrons". Also, we will be only concerned in whether there exists a basis over which a function is of finite order. This is equivalent to whether a function is of finite order over M. t Q(x 1 ,...,x m ) is obtained by using y ± V y 2 ~ y x + y 2 - y L • y 2 recursively; i.e., Q(x 1 ,...,x m ) = Q(x 1 ,...,x m _ 1 )+x m +x m .Q(x 1 ,...,X m _ 1 ). If Q is a poly- nomial over GF(2), then Q = © II y where the sum ranges over all nonempty y€s subsets S s (x.,,...,x }. 81 We first turn our attention to the case when p = 2. Instead of '^-percep- tron 1 ', we will say "Boolean perceptron". From Lemma 4.5 we conclude that every Boolean function can be uniquely represented as a polynomial over GF(2) that is at most of degree one in each variable (a Boolean polynomial). Noting that the terms of a Boolean polynomial represent marks, we conclude that every Boolean function f has a unique representation as a Boolean perceptron over M. Furthermore, by Proposition 4.4, this representation is a minimal order representation for f. Note then that 2-ord^(f) corresponds to the degree of the Boolean polynomial for f. This unique representation property allows us to establish the minimal order of certain interesting predicates very easily. As in 3.2, we are again 2 interested only in functions {0,1} -♦ {0,1} that are interpreted as functions of nxn patterns of 0's and l's. In particular, we are interested in the Boolean 2 function of n variables c (introduced in 3.2) and e , (the Euler number of a pattern of l's on a square array of 0's and l's is equal to k). It is well known (see, for example [Mi69]) that the Euler number of a planar figure is the difference between the number of its components and the number of its holes. If we use the notion of connectivity introduced in 3.2, then the Euler number of the pattern in Fig. 4.1 is 1. 4.8 Theorem The connectivity predicate is not of finite 2-order over M (hence, over any basis). 82 Proof We use the One-in-a-box construction introduced in [Mi69]= Before pro- ceeding, however, we must define certain auxiliary predicates, n, the size of the pattern is assume odd (henceforth, we will suppress the subscript n in the notation for functions). The variables representing positions in the square array will, as usual, be denoted by x . for 1 £ i, j £ n. Then we define r = (x n A x l2 A . . . A x ln ) A (x„ 7 A x„ 2 A... A x„ ) A... A (x . A x „ A . . . Ax ) nl n2 nn and s - (x 21 Vx 22 V ... Vx 2n ) A (x 41 Vx 42 V... Vx 4n ) A... A <Vl,l V Vl,2 V '" V Vl,n>5 i.e., r is 1 only on patterns with odd rows consisting exclusively of l's, and s is 1 only on patterns where each even row has at least one 1 (the One- in-a-box predicate). Then, r A c = r A s (4*6) (c is the connectivity predicate). Now, for arbitrary functions f, g, h,if h = f A g, then 2-ord M (h) ^ 2 "° rd M (f)+2-ord M (g); i.e., 2-ord M (g) £ 2-ord M (h)-2-ord M (f) (4.7) 83 Replacing h by r A s, f by r, and g by c we obtain 2-ord^c) ^ 2-ord^Cr A s)-2-ord M (r) (4.8) We have 2-ord (r) = ~- *n; 2-ord^(h) = ~- *n (recall the Boolean m polynomial for V x. described in the footnote on p. 80 ) 2-ord^(r A s) = n(n-l) (because ;the Boolean polynomial representations of r and s have no variables in common). Using this we obtain from (4.8) 2-ordfc) ^ „ ; i.e., the 2-order over M of the connectivity predicate is not finite. D We next establish 4.9 Theorem The predicate "the Euler number of a pattern equals k" is not of finite 2-order. Proof We again consider the case when M is the basis. The general case follows from Proposition 4.4. n is the size of the pattern. We need to consider a subset T C S = (x..: i+j even] (note that all points of S are disconnected from each other, in the sense we use this word). |t| = t will be determined subsequently. We define the following predicates P r = 1 if and only if all points of T are 0; i.e., pr = n (i e x) x£t q = 1 if and only if k points of T are 1; i.e., 84 q = © II x • II (lffiy) x€U y€T-U where the sum ranges over all possible subsets U C x with |u| = k. When the expression for q is multiplied out each term produces exactly one term of the form II x and thus the above Boolean polynomial is of degree t if and only x€T if the number of terms in the sum is odd. The number of terms is (5). But 2 1 -! ' t i k ( ) is odd for all ^ k ^ 2 -1 and all I. Thus if t = 2-1, £ k £ t k _ then 2-ord(q) = t. Also, 2-ord(pT) = |t| = n -t. Recalling once again the e, is the difference between the number of compo- nents of a figure and the number of holes, we have the relationship pr A e k = pr A q Again using (4.7) with g = e,, h = pr A q, f = pr we obtain 2-ord^e^ * n 2 - |t| = 2^-1 No matter how large we choose I, we can find an n such that we can obtain a set T with |T| = 2 -1. Thus, the Euler predicate is not of finite order over M. Theorems 4.8 and 4.9 can be extended to p-perceptrons for arbitrary p. The generalization will only be indicated for Theorem 4.8. Proof: First show that ( ) is even for all I and all k?*0, 2 . This is done by induction. Now observe that due to (, ) = ( ) + (, .), and the fact that I 9 k 2-1 2-1 2 ( , ) is odd, ( „ ) is also odd (for otherwise („) would not be even). We can continue this way and establish the claim. 85 The obvious difficulty is that Boolean functions do not have a unique representation as polynomials over GF(p) for p > 2. Specifically, in the case of Boolean perceptrons over M, we were able to reduce the problem of the order of connectivity to the order of r and s (see above). The orders of these predicates (equal to the degrees of the corresponding Boolean poly- nomials) were easily computed due to Lemma 4.5. Suppose c is of finite p-order over M (we have already remarked that this brings no loss of generality) for some p > 2. Due to Proposition 4.7 we can assume that we have an expression of the form of (4.5) for c . When multiplied out, we obtain CO c = E a.m. mod p (4.9) n i=0 l i where m. is the monomial representing the i mask. m. is of degree one in each variable, and the values of the variables are restricted to (0,1). Since the perceptron from which we obtained (4.9) is finite order, we can assume that the degree of (4.9) is ^ i. We can now extend c to the domain GF)p) (i.e., to square patterns A = n fb.J, 1 s i, j s n, and b . , € GF(p)) by defining c'(A) = 1 if and only if c (f(A)) = 1 where f(b.J) = {c.J such that c - 1 if b - 1, otherwise n ij lj ij l J c . =0. We have from (4.9) 00 c 1 = T. a^.m! mod p (4.10) n i-0 i l where m* is obtained from m. by replacing each variable x. by P- (x ) such that i 1 J ■"■ J p ( X J = i s otherwise P.,(x.) = (see the proof of Proposition 4.7 how to ob- tain P 1 (x.)). Now we have a total function c n over GF(p), and thus its 86 polynomial representation obtained by multiplying out (4.10) must be unique. Since the degree of (4.9) is £ A, the degree of the polynomial (4.10) £ (p-l)i (4.11) Another estimate of the degree of the polynomial for c 1 is obtained using the predicates r' and s' obtained from r and s of Theorem 4.8 similarly as c' was obtained from c . The polynomial P , representing r 1 , is obtained from the polynomial representing r by substituting each variable x with P.(x). Similarly, for the polynomial representation P , of s'. The degrees of P , and P , are then found to be ^ ^^ n« (p-1) and ^ —^ n(p-l) respectively (2-ord M (r) and 2-orcL/s) multiplied by deg(P.,)). Since the analogs of (4.6) aid (4.8) again hold, we obtain that deg(P ,) is not bounded, contradicting (4.11), and thus also the finite order of c. Note that the obvious generalization, i.e., if a function is of finite 2-order, then it is also of finite p-order (over M) , is not true: Consider n the Boolean function © x . . We will investigate the degree of the polynomial i-1 representation of ffi 1 (obtained in the same way from © as c 1 was from c above). If a p-perceptron over M of finite order exists for 0, then we obtain a poly- nomial representation of bounded degree for <£' similarly as (4.10) was obtained from (4.9). On the other hand we have the following representation for ffi E n x n (1-x ) mod p (4.12) i€s i?S 3 where S ranges over all subsets of {l,...,n] of even size. When multiplied n out, each term produces exactly one monomial of the form II x. . The number i=l ii^ui^^mua^uwj ii Mu^y ' uiM i pi^ 89 of such moaamUU appearing If, ^ djseafrinadj fern of (4.12) is 0IIOOQO0 .4. oiiooooo L £ J n. MO I I I 1-0 3 X I I 000CL0000 (use ths identity (J) - ( n ^)0 4 ^f % fetes this nanber is not «od p, (4.12) T±^4£Z^*^&K£M **•* » ** * If *i *■ replaced by P.(x.) n obteia s (yi ^saeA^prtlynnerf ■ 1 representation for «» of degree a* (p-i), contradicting die orl stones of s finite order p-perceptron over M for #i wr OMIfi 19 01 m * » g # & JfljE $ ♦ fc^, » son el 3edBu« «Mt soajt G Jt 4|^M ♦•f T) - (p t*-**«»&* «& *•*) ?abio artlnii s 3o 90*193 a Jaw «*fo ^a^s^bsrfaod f (J-q) *a »»«g»i> 1© *# «2 .x II •** io5 b M »o5 if wiro noWqsosa^-q 89 CHAPTER FIVE PATTERN COUNTING MACHINES In this chapter we shall permit ourselves a certain degree of informality. We are again concerned with the power of machines that combine a large number of "local" computations through an integrating function. Only this time we shall not be limited to functions that can be represented as a combin- ation of finite operators. This class of machines again operates on square patterns of O's and l's. The operation is divided into two phases : In Phase I the pattern is scanned with a square "window" of a certain size. Each time a nonzero pattern appears in the window, we take note of it (there is a finite number of nonzero patterns since the window is of finite size). At the end of the scan we have a count of the various patterns, and we are then allowed to utilize this data in Phase II which consists of computing the value of a partial recursive function for this data. The formalization of this model is obvious and we omit it. What can such a machine do? Clearly the computation of this machine is divided into a local phase and a global phase, so that it fits into the broad class of problems considered in [Mi69] and Chapter Four. Note that the boundedness of the window size is essential. If we insis- ted only that the window contain a given number of points, but otherwise allowed it to be of any shape with arbitrary distances between its points, then Phase II could reconstruct the whole figure as was observed already in [ML69], We again inquire whether these machines can recognize the familiar 90 topological predicates connectivity and Euler number. 5. 1 Theorem Pattern Counting Machines (PCM) cannot recognize the connectivity pred- icate. Proof We need only exhibit two patterns, one connected and the other discon- nected, with the same pattern spectrum. In this case, no algorithm of Phase II could establish the difference between them. Two such patterns are given in Fig. 5.1. Specifically, these patterns are equivalent under windows of size 2x2. However, it is easily seen that increasing the dimensions of the patterns in Fig. 5.1 linearly by a factor of k makes them equivalent under windows of size up to k + 1. We can arrive at this conclusion by setting up a 1-1 map between occurrences of the same pattern in the window in the two pat- terns 5 . 2 Theorem PCM's can compute the Euler number. Proof It is shown in TMi69] how to compute the Euler number from the spectrum of patterns of the shape : m n □ 91 Before proceeding, we need a notion of continuous deformation . Pattern B can be obtained from pattern A by continuous deformation if B arises from A by a sequence of additions or deletions of l's of the following kind: Let us fix attention on a 3x3 square with the central position in the place of the 1 being added (deleted). For simplicity assume that the boundary positions are always 0. Each position of the periphery of the square which has a 1 in it is either connected or disconnected to another 1 on the periphery (not necessarily by a path in the square). This set of connections may be described by a symmetric 8x8 0-1 valued connection matrix, i.e., a . = 1 if and only if the i and j positions on the periphery have l's and are connected. The proposed addition (deletion) is permitted only if (1) the connection matrix remains unchanged as a result of it, and (2) there is a 1 adjacent to the proposed addition (deletion). Any predicate whose value remains unchanged if the pattern A is replaced by B, obtained from A by continuous deformation, is called a topological predicate . We assert without proof that connectivity and Euler number are topological predicates. The reader is warned, however, that there is a pitfall in proving this fact for the Euler number predicate. The number of holes in Fig. 5.2 should be one, not two (i.e., 0's are connected diagonally in addition to their usual connections). This is discussed more fully in [My71], However, if the holes are sufficiently large (so that all the 0's in them are connected in the usual way) this difficulty is not encountered. 5.3 Theorem Any topological predicate recognized by a PCM must be a function of the Euler number. 92 Proof We will have established the theorem if we succeed in showing that, given any PCM p computing a topological predicate, then for two figures X 1 and X 2 with EULER(X 1 ) = EULER(X 2 ), we also have P(X ) = P(X 2 ). In [ML69] it is shown that for every figure X there exists an "Euler canonical" figure C(X) such that EULER(X) = EULER (C(X)); and if for two figures X^ X 2 , EULERCX^ = EULER (X 2 ) , then C(X ) = CCXj. If the Euler number of X is n > 0, then C(X) consists of n components without holes. If the Euler number of X is n £ 0, then C(X) consists of 1 component with-n+1 holes. We will show that we can deform any figure X into C(X) without changing the value of P„ The deformations available to us are: (1) Continuous deformation. If we subject X to this kind of deformation, then P(X) remains unchanged because it computes a topological predicate. (2) Deformations that leave the pattern spectrum unaltered. By defini- tion of PCM's. As a consequence, we have (3) Removal of components inside holes. To accomplish this without changing the value of P(X), we first apply (1) until the window cannot scan simultaneously an interior component and the wall of the hole in which it resides. Then it is obvious that the pattern spectrum will remain unchanged if we remove the component from inside the hole. After this we can apply (1) in the reverse direction. 93 If we are given two figures X- and X with same pattern spectra, then we can add equally shaped holes to the figures in such a way that the pattern spectra remain the same. The holes have only to be placed in such a way that the window cannot scan any other boundary while scanning the newly introduced hole. We can then repeat this to add any number of holes. Specifically, given two figures of the shape of A and B in Fig. 5.1, we can add holes in this way and still have the same pattern spectra. For example, C and D in Fig. 5.3 have the same pattern spectra for a sufficiently small window size. Note that given any two components, one of which has a hole, we may apply deformation (1) to obtain a figure proportional in dimensions to C and they apply (2) to obtain D. We call this sequence of deformations "cancelling a hole and a component". We deform X into C(X) by cancelling as many holes and components as possible. We first apply (3) until no component remains within a hole. Then we may either have a hole and a component not containing this hole, or not. In the latter case, we are done. In the former, we select a hole and a component not containing it and cancel them. After this we are left with one less component and hole. We repeat this until we arrive at C(X). We can summarize the results on the recognition of topological predicates contained in [M.69], Chapter Four, and in the present section in the following table 94 Recognition of Predicates Cyclic Perceptrons PCM Classical Perceptrons Connectivity NO NO NO Euler NO YES YES Other Topological Predicates ? Functions of Euler number only Functions of Euler number only Thus, all these results support the conjecture expressed in [Mi69] that no "local-global" computer can recognize connectivity. It appears, however, that all models are extremely sensitive to altera- tions. We have already mentioned how PCM's can be converted into universal machines with the removal of the restriction on the size of the window. A. R. Meyer noticed that ordinary perceptrons may be modified to recognize 00 any Boolean function with order one. Instead of E a, tp , ^ consider i-0 00 £ a. cp. £ Y for some subset of integers Y. Now we can choose the coeffi- i=0 3. 1 cients a. in such a way that the sums of the coefficients in no two subsets of the set K of all coefficients is the same, ie. , V(X,Y C K) [ S a = E a =» X = Y] a. € X i a.€Y We can define the coefficients inductively, i.e., choose a to be greater n-1 than S a... This is in the spirit of stratification (see [Mi69]). Now 1*1 no tice that if $ is the set of rn^ks of order 1, then a Boolean function cp ppw HPH Lji..yjij,ii.iii .i.i.jiil^jip ^p 0X000 0X000 la ■iaply § fo|l§c$lfA0of •ubsoto of I. 0X000 of integer*, fi§irfipgt|f to* •mm of belonging to «p. a ooooooo I I X $>**• J Mi Iw*^ ** ** **• ** t pt» jipif <Jiw1 oaboot* 9 siaosqa cr»33s<i SxS aese arfa jf*Jhr aoragjrf out X.2 .gil aero el urtmismq a±i& ax aalorf lo S.S .g*f •lit a Sxrsmxpoa * btr« aXorf * gsilloajroO 96 1 1 1 1 1 1 1 1 1 A 1 1 1 1 1 1 1 1 1 B Two figures with the same 2x2 pattern spectra Fig. 5.1 00000000 00000000 1 1 1 1 1 1 1 1 1 1 1 1 1 1 The number of holes in this pattern is one Fig. 5.2 C D Cancelling a hole and a component Fig. 5.3 97 APPENDIX A CERTAIN PROPERTIES OF SHORT FORMULAS The purpose of this appendix is to modify certain results of [H068] in the light of our different requirements. Our goal is Lemma A. 9 which is used directly in the proof of Theorem 2.2.2. We prove it by way of a series of intermediate results, none of which are used elsewhere. In what follows we would frequently use the phrase n F is a formula in n variables over $, and such that no variable occurs more than k times". This will be abbreviated to "F is a (§,n, k) -formula". If any of the para- meters is not present, we will replace it by *. For example, "F is a ($,*,k)- formula" and M F is a (<S,n,*)- formula" mean "F is a formula over $, and such that no variable appears more than k times'* and "F is a formula in n vari- ables over $ n respectively. A. 1 Definition Let there be given the sequence of formulas G = (G.(X-,z) , . . . ,G ^ (X .,,z).G (X )). If 1 £ i £ p-1, then G. contains the distinguished p-1 p p i variable z, occurring only once. X. for 1 S. i ^ p is nonempty and is either a singleton or = U X.. Let F be an arbitrary formula, and j<i J G = G 1 (X 1 ,G 2 (X 2 ,... ,G _ 1 (X _ X G (X ))...)). If F = G, then G is a nesting sequence of length d. for F. If, in addition, the total number of occur- rences of any variable (except z) in G is < the corresponding number in F, then G is a proper nesting sequence for F. 98 A. 2 Remark Let G and G be as described in Definition A. 1. Furthermore, let all X. for 1 ^ i ^ p be singletons and distinct. Then G is equivalent to an e _. -component. Also, suppose X. is arbitrary and G. is a formula over §. Now replace all variables except possibly one in G. for 1 ^ i ^ p-1 by the constant a. Let the set of variables that have not been touched be Y. Y a Then G is equivalent to an e-component over $ . 3 Let F be an arbitrary formula over $, X J S(F), and a f D; then we would like to obtain a formula over $ with the following properties: (1) G = F (2) S(G) = X, and (3) the number of occurrences of any vari- 3 able of X in G is £ the corresponding number in F. G can be obtained by a straightforward replacement of operators in F such that the variable symbols that are replaced with a in forming F (and subformulas of F 3 where S(F) consists entirely of such variable symbols) are removed, and the remaining operators are changed to preserve equivalence with F . SL Mor e precisely, if cp(F- , . . . ,F. ) is a subformula of F, then if S(F.) <£ X 1. iC i- for all i, cp remains the same; if S(F.) <= x and S(F.) <£ X for j^i, then cp is replaced with cp(x. , . . . ,x , ,F ,X . , . . . ,x, ) where all variables of F. have been replaced with a (if there are more such indices i, we proceed in the obvious way); and if S(F.)cx for all i, then cp is eliminated. This transformation will be called normalization and G will be denoted by norm(F ), "■ ■ ■■ " '""■ ' " '■■ ' 3 99 A. 3 Lemma If F is a ($,n,*)-formula, then for any p, q ^ 1 and a £ D, if n ^ \ (P>q)j there exists a subset X £ s(F) such that either (1) |xj = q and F is equivalent to a PC of the formulas F ,...,F^_ 3 where r ^ n and F for 1 < i £ r is a formula over $ such that each element max l of X occurs in at least two among F..,...,F , and the total number of occurrences of any x f X in F , . . . ,F is ^ the number of occurrences of x in F; or (2) |xj is arbitrary and F has a proper nesting sequence G = (G..,..., 3 x G , , G ) where G. for 1 ^ i < p is a formula over $ . p-1 P i Proof Assume there is no X c S(F) such that F is as described in (1) of 3 the statement of the lemma. We will describe a (proper) nesting sequence extraction procedure (NSE) whose inputs will be a formula H over § and a set of variables Y. The output of NSE will be two formulas H 1 (Z,z) and H" over $ such that Z is either a singleton or £ Y; furthermore, H = H'(Z,H n ) for some 3 U ^ S(H). G will be obtained by the repeated use of NSE. Initially, the input of NSE will be F = F- and <t>. In the first application of NSE, the output will be G, and F (F is an intermediate formula whose significance will f"Vi be describe immediately). In general, the i application of NSE will receive the input F. and M X. and yield as output G.^ and F^. We will j< i J 100 show that if n ^ TL(p 5 cj)j we can apply NSE p-1 times and end up with F _„ from which G is obtained as will be described below. P Description of NSE . The input to NSE is as describe above. Then we can distinquish two cases: Case I . L(H) = 1. In this case we cannot apply NSE, and the output is undefined. Case II . L(H) > 1. In this case we can assume that H has no unary operators; for suppose there exists a subformula J of H such that J = co(ty( J-,..., J )) where J. for 1 £ i < r is either a variable symbol or another 1 r i subformula of H. In this case cp« \|/ = p 6 $ and we can replace J by the equivalent formula P(J..,...,J ). Similarly, if J = cd(J,, . . . , ^(J. ) > • • • ,J^) > we can eliminate i|f because cp(x.. , . . . ,i|f (x. ) , . . . ,x, ) = p(x- , . . . ,x^, . . . ,x^) f $ a (thus, if a unary operator of H corresponds to an internal node of T(H), we can eliminate it by either of these two means; on the other hand, if a unary operator of H corresponds either to the root node or to a node next to a terminal node of T(H), then we can use only one of the two methods described). Now choose i' such that S(H .,) is maximal among S(H ) for 1 ^ i £ r. Since support is defined only for formulas, S(H ,) may be undefined if all arguments of the outermost operator of H are variable symbols. In this case replace one of them by the identity operator which is possible since id p $ a . Consider H/H ^, = K(Z,z). Again two cases can arise: 101 Case Ila . Y R Z = <t>. Choose any variable of Z, e.g., x, and let fx z) H' = norm(K ). z is a distinquished argument (hence x is free). In this case set V = S(H ,)-Z U[x}. The significance of V will be • X seen immediately. Case lib . Y R Z ^ 0. H' = norm(K )] z is again a distinguished argument V = S(H . , )-Z U Y. In both cases H" = norm((H .,) ). . i a Analysis of NSE . Let I S (H) ! = m, and let us estimate ls(H")'. Obviously, |s(H . ,)| ^ . In the case that H results from a chain max of applications of NSE to a formula F, and F does not satisfy (1) of the statement of the lemma, then we claim that in Cases Ila and b less than q variables are set to a in H . , . Suppose this is not true. Let the • J- set of variables that is set to a on this occasion be W. Then W £ Z-{x} W (Case Ila), or W £ Z-Y (Case lib). In any case consider F & . This is equivalent to ^(H , n >) , . . . , (H . .) where i(l) , . . . ,i(s) are the « X \ X / H • X \ S / cl indices corresponding to the subformulas H . where all variables have • J not been replaced by a (if in H all variables have been replaced by a, then it is absorbed into 9). But then cp(norm((H ,,J ) ,. . . ,norm((H ^ g J a )) satisfies (1) of the statement of the lemma. A contradiction. Thus, IS(H")| s ^ J!L - -q+1 max Hence, if we define 102 VI, q) = 1 Tl 6 ( P +i,q) - <Vp.q>+q-D- r W x,e " n fi (p,q) = n P " 1 -q+ maX - • (q-1) 6 ^ max ^ n -1 M max for p,q ^ 1 (and if F does not satisfy (1)), we will be able to apply NSE p-1 times and obtain F ,. G can then be obtained as follows: If S(F ,) p-1 p p-1 fl S(G.) = 6 for 1 ^ i ^ p-1, then choose any variable y f S (F 1 ) and i p-i H obtain G from (F _-) yi by normalization; otherwise, denoting U S(G.) i=l U by U, obtain F . from (F n ) by normalization. It can be checked that p-l p-1 a G. for 1 ^ i £ p satisfy the conditions of (2) of the statement of the lemma. n Consider a sequence of (nonempty) sets X for 1 £ i ^ p such that X. is either a singleton, or is included in tj X . We will call such L J<i j a sequence of sets a normal sequence (of length p). Note that the sequence X..,...,X in Definition A. 1 is a normal seqdence. Then A. 4 Lemma Let X..,...,X be a normal sequence of sets with the additional pro- of I) A. sequence. Then if p S(k+l) m , there exists a subset Y C fj X. and an 1=1 L increasing sequence of indices i(l) ,i (2) , . . . ,i (q) such that (1) q ^ m, perty that each element of U X. appears in at most k elements of the i-1 L :here exists a subset Y C I ! J 1=1 L 103 (2) 1(1) = 1, (3) X ... Y is a singleton for 1 £ j £ q, (4) Xj Y = if 4 £ i(q) and X j* i(j) for 1 £ j £ q, and (5) if x £ Y, ^ < J 2 < j g and x £ X. , . N , x f X. , . N , then also x € X. , . N . UJj) i(J 3 ) Kj 2 ) froof (this is a direct translation of the proof of Lemma 2 of [Ho68] into P our terminology). Let IJ X. =* (x.. ,x„ , . . . } . Without loss of generality i=l 1 l assume that x- € X.. . If m = 1, set Y = (x..}, i(l) = 1, and conditions 1-5 are satisfied. For the inductive step two cases are distinguished. Case I . x 1 occurs in none of the sets X., 2 £ j < (k+1) + 1. Setting r = (k+1) +1, the sequence X„,...,X is normal and each element r occurs in at most k of the X., 2 <. j £ r. If Z C \\ X and the sequence J j =2 J j (1),... ,j (q-1) are obtained by the inductive hypothesis, then [xj U Z = Y and 1(1) = 1, 1(2) = j(l) i( q ) = j (q-1) satisfy conditions 1-5. TTl— 1 Case II . Assume that x.. occurs in some X., 2 ^ j £ (k+1) +1 and let h be the smallest such number j. Furthermore, let V be the set of elements different from x, , and occurring in X 2 ,...,X ,. Delete the elements of V from X-,X. ,...,X , and delete those among X.X_,...,X that remain empty. Let the resulting sequence be Y-,...,Y . The length of the sequence (X. ,X, ,X, , . . . ,X ) is at least p- (k+1) +1. There are less than (k+1) distinct variables in X„,...,X _p each one occurring in at most k-1 of the formulas X ■, S X, ,...,X . Therefore, 104 r * p-(k+l) m " 1 +l-(k-l)(k+l) in " 1 ,i.e. ) r 2= (k+l) 1 "'^! m— 1 The sequence Y_,...,Y is normal and its length is at least (k+1) r x 1 occurs in Y„. Let Z £ I J Y. and the sequence j (1) = 2, j(2),..., j-2 J (q-1) be obtained according to the inductive hypothesis for Y^j'.'jY . Then Z and i (1) = 1, i(2) = j (1) , . . . ,i (q) = j (q-1) (where q * m) , satisfy conditions 1-5. LJ Let there be given a (*,*,k)- formula F with the proper nesting sequence G = (G.,,...,G ) such that G. is a formula over f. As has M — ^ 1 ' p l already been remarked above, X , ...,X (see Definition A. 1) is a normal i p sequence of sets. If p 2 (k+l) m , then by Lemma A. 4 there exists a set Y £ N X and 1*1 q indices i(j) for 1 £ j £ q such that conditions 1-5 hold. Note that if m = k-t, then |y| * t since no variable appears more than k times in G (G is proper). In particular, consider only Z = {x^...^] £ Y where x-,...,x are numbered in the order of their appearance in G. Note that due to condition 5 of Lemma A. 4, if x, y €Y and y follows x in G, then x cannot appear again after y in G. Let G be as defined in 2 Definition A. 1. Then we will let the reader convince himself that G 7 (hence also F ) is equivalent to K(Z,G') where K(Z,z) is an e -component a c over $ with input variable z, and G' is a certain formula over $ such that each variable of G' occurs at most k-1 times. 105 Note that in this case we do not know the size of S(G'). This can be remedied in the following way: There are two cases; either |S(G')| s 1/2- t, or not. In the first case perform an a-merger on K with basis S(G'), after which we obtain an SC of an e-component K' of length ^ 1/2* t and a formula G" (through the input variable) such that S(G") equals the set of lateral variables of K' ; in the second case perform an a-merger on K(Z,G') with basis Z-S(G') in which case we obtain an e-component K' of length ^ 1/2' t with a constant input operator. We summarize the preceding in the following A. 5 Lemma Let there be given a (*,*,k)-formula F with a proper nesting sequence 2k« t of length p ^ 1 composed of formulas over $. Then if p ^ (k+1) , there exists a set Z = S(F), |f| ^ t, and F is either equivalent to an SC of SL 3. an e-component K over § and a formula G over $ such that S(G) is the set of lateral variables of K, and no variable of G occurs more than k-1 times in G; or to an e-component K over i with constant input operator. Let there be given a PC F of the formulas F, ,...,F where r £ n & 1' ' r max such that |S(F)| = q and each variable appears in at least two among F-,...,F (i.e., a situation as described in (1) of the statement of Lemma A. 3). We are interested in obtaining a (nonempty) subset X £ S(F) such that when the variables outside of X have been replaced 106 by the constant a, |s(norm(F.) )) I for those F. where not all variables ■L a ^- h?ve been replaced with a is equal or larger than a predetermined number t (as large as possible). We could solve the problem as follows: Each variable of S(F) appears in a certain subset of the formulas F ..,..., F . The number of such n t* max subsets is 2 (in general, S 2 ); thus, we are sure to find a subset X with |x| 2: — 3 — such that all elements of X appear in the same subset 2 max of F- ,. .. ,F . 1 r However, we can improve this number. Let us construct the occurrence table of F. The table consists of rows corresponding to elements of S(F), and of columns corresponding to F. for 1 <■ i ^ r. The entry a . is 1 if x occurs in F. and otherwise. We will try to extract a subset i J X £ S(F) such that either all variables of F^, are replaced by a, or XT S(norm(F.) )) contains 2 elements (t will be determined later). i a If all columns in the occurrence table contain 2» t l's, we are done and X = S(F). Suppose not. Let the column j contain < t l's. Delete all rows corresponding to the l's in column j and column j itself. Let the set of variables corresponding to the remaining rows be X 1 . Consider the remainder of the occurrence table (i.e., minus the deleted rows and column); and again look for the column with < l's. If it does not exist, we are done and X = X... If such a column exists continue. Now two things can happen. Either at some point we end up with a certain subset of columns, all of which contain ^ t l's, or we end up with two columns that both contain < t l's. We shall see that by 107 an appropriate choice of t, the latter case cannot happen. The number of l's in the whole table ^ 2q (each variable Occurs in at least two formulas). The smallest number of l's remaining after all but two columns have been deleted 2: 2q-m where m is largest possible number of l's that can be deleted in the course of this procedure, m = (t-1)- (r+r-l+r-2+. . . +3) = (t-1)* -^ j ^ (this corresponds to the case when each deleted row contains only l's and at each stage t-1 rows are deleted). If, after the table is reduced to two columns, both columns are to contain ^ t l's (both have to contain the same number of l's since each variable occurs in at least two formulas), then 2ais * t 2 or since r ^ n max t ^^rr wh ere c = (n +3)(n -2) c+4 max max For large n this is better than the previous bound. This result ° max can be summarized in A. 6 Lemma Let there be given a PC F of the formulas F ..,..., F over $ where r ^ n such that |S(F) = q and each variable appears in at least max ' ' two among F. , . . . ,F . Then if 108 q 2 f where t 2: 1 y we can find a subset X £ S(F) such that F is equivalent to a PC of a the formulas &,,..., G over § and S(G.) ^ t for 1 ^ i £ s <. r. Is v i Lemmas A. 3, A. 5, and A. 6 can be combined into A. 7 Lemma Let F be an ($ ,n,k)-formula. Then for any t ^ 1 and a £ D if n^^CCk+l^^/^^C) (see Lemma A. 6 for the value of c) , there exists a subset X £ S(F) such that either (1) F is equivalent to a PC of the formulas F ..,..., F over $ a ^ 1' ' r where r < n , each variable of F. occurs at most k-1 times in it, max l ' and F. contains at least t variables of X or X a (2) F is equivalent to an SC of an e -component K over $ a t with a formula G over $ (through the input variable) such that S(G) is the set of the lateral variables of K and no variable occurs more than k-1 times in G; or to an e -component K over $ with a constant input operator. A. 8 Lemma Let F be a ($,n,k)-formula. Then for any t s 1 and a € D if n > T! 7 (t,k) 109 then there exists a subset X C S(F) such that F is equivalent to an a a k SPCeC over § G such that (1) G has ^ n components, (2) each max component is of length ^ t, and (3) the terminal components of G have constant input operators. Proof TL(t,l) = nt . In this case T(F) has at least one branch 7 max connected to t+1 variable symbols (k=l and thus all variable symbols are distinct) at different nodes. This branch can be converted into an e -component with constant input operator. The idea is illustrated in Fig. A. 1. n 7 (t,k + l) - T| 6 <(k + 2) 2 °' fl),7) 7 (t . k > i , «*4). V,k)-c ) We can apply Lemma A. 7. The result is either (1) an e-component K of the correct length and constant input operator, (2) an SC of an e-component over $ of the correct length and a formula to which we can apply the inductive hypothesis, and (3) a PC of formulas to which we can apply the inductive hypothesis. In each case we obtain an SPCeC with the desired properties. D A. 9 Lemma Let F be a ($,nk)-formula. Then for any t ^ 1 and a f D if n * TLCt.k) 110 there exists a subset X C S(F) such that F is equivalent to an SPCeC a. a over $ G such that (1) G has ^ k components, (2) each component has X as the set of its lateral variables, (3) the terminal components have constant input operators and (4) |x| = t. Proof /n k \ (n k ) (n k \ Set T1g(t,k) = TL(s«t,k) where s =f max) +1 max/ +. . .+ ( max J Apply Lemma A. 8 to obtain a SPCeC G' with all components having length s s*t. Since each variable appears at most k times, it can occur in at most k components. s is the number of nonempty subsets of £ k elements. Thus, if the number of variables is as indicated we are sure to kind in G' a subset of t variables that all occur in the same set of components of G'. After performing an a-merger with this set as basis, we obtain the desired SPCeC G. D Remarks on the bounds in Lemmas A.3-A.9 . If Tl is approximated by 6 P m n r • q, then TL is inductively defined as follows: max "»> 7 j Vt,i) - n<: 7 max 2k-Tl 7 (t,k-l) V'^ - v*™ 1 ' -yt.k-i) b / L c • — , ^ / , s -^ . „ o, \ , b f2k times for a certain constant y. Thus we see that TL(t,k) 2: iexp(b,2k) = b J for k ^ k(b) for any constant b (t has not been included in the estimate because in applications it is constant). Ill V Conversion of a formula F where each variable occurs only once into an equivalent e-component by setting certain variables to a. Fig. A.l 112 APPENDIX B t THE LENGTH OF THE MOD 2 SUM OVER II There is an isomorphism between the set of formulas over II and series- parellel contact networks. We assume the reader is familiar with this model as well as with the isomorphism in question. In this case if F is a formula over II, then L(F) corresponds to the number of contacts in the network corres- ponding to F. For convenience, we will derive the result in contact network terminology. Given a (series-parallel) contact network C, a chain is set of contacts such that when they are all closed, C conducts (we will say "C is 1"); a cut set is a set of contacts such that when they are all open, C does not conduct (we will say "C is 0"). In the obvious way, we define minimal chain , minimal cut set (i.e., when one contact is deleted the corresponding property does not hold). B. 1 Lemma Given a contact network C and any minimal chain and minimal cut set, their intersection is a singleton. This result is due to Khrapchenko [Kh71]. 113 Proof By induction on the number m of contacts in C. For m = 1 the assertion is obviously true. If m > 1, C must be either a series combination of smaller networks C. and C™, or a parallel combination of smaller networks C. and C ? . In each case it is simple to establish the lemma. D n Suppose now we have a contact network S that represents © x.. Let m. i=l J denote the number of contacts labeled with x. or x. . Then we are interested in H m.. Consider n-tuples (a.) for 1 ^ i ^ n and a. € {0,1}. An n-tuple of this kind will be called even if it has an even number of l's, otherwise it is odd . Obviously S must be 1 on odd n-tuples and on even ones. Consider an arbitrary odd n-tuple a_ = (a- ,. . . ,a. ,. •• ,a ) and an even n-tuple b = (b 1 ,. . . ,b. ,. . . ,b ) at Hamming distance 1 from a. If b. = a., then all other components of a and b are equal, e. will denote the n-tuple with a single 1 in the i place. Then we will write b = a ffi e.. To each odd n-tuple a. we can assign a minimal chain c (a) (consisting of a subset of contacts of S that are closed at a and that do form a minimal chain); similarly, to each even n-tuple b_ we can assign a minimal cut set s(b) (consisting of a set of contacts of S that are open at b and that do form a minimal cut set). Let a be odd, b_ = a ff" e. even. Then by Lemma B.l, c (a) H s (b) is a singleton; in fact, it is easy to verify that it must be a contact labeled either with x. or x.. 114 We build now Tables I and II. The rows of Table I correspond to odd n-tuples while those of Table II correspond to even n-tuples. Thus both n-1 have 2 rows. The columns of both tables correspond to the variable x. for 1 ^ i ^ n. The entry a(a,j) in Table I is c (a) H s (a © e.). This entry will be represented by a number between 1 and m.. Let t. . denote the number of times contact number i (among those labeled with x. or x.) appears in column j of Table I. Then S t = 2 n_1 (B.l) 1=1 J The entry p(b,j) of Table II is s (b ) D c (b © e ). (B.l) again holds. Construct now Table III. The rows of Table III correspond to all possible pairs (a,b) where a and b are odd and even n-tuples respectively. The columns of Table III again correspond to variables. An entry of Table III is yCa.bjJ) = (a(a,j), p(b,j>. Consider now the diagonal entries of Table III (i.e., (C0,p) such that a = p). Let (CC(a,j), p(b,j)) be such an entry. Then CC(a,j) = P(b,j) = c(a) D s(b). Thus, by Lemma B.l, there can be only one such entry in a row. HI: n J 2 The number of diagonal entries is S S t^ .. Thus, j=l 1=1 ij m. j-1 i=l " lj Combining a version of Cauchy's inequality m j „ "j t t?. ^ 2 2(n " 1) (B.2) -L ( t t ..) 2 <. s t 2 . m j i-i 1J i=i 1J its with (B.l) and (B.2), we obtains J ri: n 1 Thi» time *e apply the, idaqnality , ■<*& lab mItM ,«WI .W t <sjH sijaH|iJK>,^ ( E »,).( E J- ) *n (for both inequalities see, e.g., Dfc*4j#. 9), «aa obtttartmeeeeired result ot:>.i3i;iac;i?«.; : tu rjivii «»l-dQi*-i s,ii;.i--. *:■;-: E *. a^tt-i' vsll* orfot .£ „IoV s jQi3nc.D i>B£ ■lot! -on 3fe3i3^,a 5<" cbs; "SSI.lBi -iBi , 5tf«R:3"X«/-2 iot3;K t. HZi^M S H$#Jft J ^».5-.**0A y i|C 116 LITERATURE Ar69 M. A. Arbib, Theories of Abstract Automata, Prentice- Hall, 1969. Av69 A. Avizienis, On the Problem of Computational Time and Complexity of Arithmetic Functions, Proc. ACM Symposium Theory of Computing, May 5-7, 1969, Marina del Rey, California, pp. 255-258. Be71 C. Berge, Principles of Combinatorics, Academic Press, 1971. Co71 S. A. Cook, The Complexity of Theorem Proving Procedures, Proc. 3rd Ann. Symposium Theory Computing, Shaker Heights, Ohio, May 3-5, 1971 pp. 151-158. Gr59 Editors, Eugene M. Grabbe et al. , Handbook of Automation Computation and Control, Vol. 2, John Wiley, 1959. Ha71 L. H. Harper and J. E. Savage, On the Complexity of the Marriage Problem (unpublished) . Ho68 L. Hodes and E. Specker, Lengths of Formulas and Elimination of Quanti- fiers I, Contributions to Mathematical Logic, K. Schutte, editor, North Holland Publ. Co., 1968, pp. 175-188. Ho70 L. Hodes, The Logical Complexity of Geometric Properties in the Plane, Journal ACM, 17, No. 2, pp. 339-347. Kh71 V. M. Khrapchenko, On the Complexity of the Realization of the Linear Function in the Class of rr-Circuits , Mat. Zametki, 9, No. 1, 1971, pp. 35-40 (Russian). Kr59 R. E. Krichevskii, Realization of Functions by Superpositions, Prob. Cypernetics II, 1961, pp. 458-477, Pergamon Press (translated from the Russian). La67 S. Lang, Algebra, Addi son-Wesley, 1967. Lu59 0. B. Lupanov, Complexity of Formula Realizations of Functions of Logical Algebra, Prob. Cybernetics III, A. A. Lyapunov, editor, Pergamon Press, 1962, pp. 782-811 (translated from the Russian). Lu70 0. B. Lupanov, On Some Results in the Mathematical Theory of Synthesis of Control Systems, Information Materials 5(42), Ac. Sci. USSR, Moscow 1970, pp. 16-22 (Russian). 117 Mi69 Mo Minsky and S. Papert, Perceptrons, MIT Press, 1969. Mk71 R. McKenzie, et al. On Boolean Functions and Connected Sets, Math. Systems Theory, 5, No. 3, pp. 259-270. Mt64 D. S. Mitrinovic, Elementary Inequalities, P. Noordhoff Ltd. Groningen, 1964. My71 J. P. Mylopoulos and T. Pavlidis, On the Topological Properties of Quantized Spaces I, II, Journal ACM, JL8, No. 2, pp. 239-254. Ne 66 E. I. Neciporuk, A Boolean Function, Soviet Math. Dokl. , 2, No. 4, 1966, pp. 999-1000. Ri42 J. Riordan, C. E. Shannon, The Number of Two Terminal Series-Parallel Networks, J. Math, and Phys. 21, 1942, pp. 83-93. Ry63 H. J. Ryser, Combinatorial Mathematics, MAA Math. Monographs, No. 14, John Wiley, 1963. Sh49 C. E. Shannon, The Synthesis of Two-Terminal Switching Circuits, Bell System Tech. J., 28, No. 1, 1949, pp. 59-98. Su61 B. A. Subbotovskaya, Realizations of Linear Functions by Formulas Using V, &, -, Soviet Math. Dokl., 2, No. 2, 1961, pp. 110-112. Vi70 B. Vilfan, Cyclic Perceptrons and Pattern Counting Machines, Proc. 4th Ann. Princeton Conf . Info. Sci. and Syst. , Princeton U. , March 1970 Ya54 S. V. Yablonskii, The Realization of the Linear Function in the Class of TT-Circuits, Dokl, Ac. Sci. USSR, 94, No. 5, pp. 805-806 (Russian). Ya59 S. V. Yablonskii, On the Impossibility of Eliminating Exhaustive Search of Boolean Functions in the Solution of Some Problems in the Theory of Circuits, Dokl. Ac. Sci. USSR, 124, No. 1, pp. 44-47, (Russian). This empty page was substituted for a blank page in the original document. CS-TR Scanning Project Document Control Form Date JJ±±JJ±. Report # LCS-TR-^"? Each of the following should be identified by a checkmark: Originating Department: □ Artificial Intellegence Laboratory (Al) ^g" Laboratory for Computer Science (LCS) Document Type: V^ Technical Report fTR) □ Technical Memo (TM) □ Other: Document Information Number of pages: //8(/^-.>^fJ - Not to include DOD forms, printer instructions, etc... originarpages only. Originals are: Intended to be printed as : □ Single-sided or □ Single-sided or ^ Double-sided £( Double-sided Print type: □ Typewriter □ Offset Press □ Laser Print □ InkJet Printer J^f Unknown □ Other:_ . Check each if included with document: 2^ DOD Form □ Funding Agent Form ^ Cover Page □ Spine □ Printers Notes □ Photo negatives □ Other: Page Data: Blank Pages^,-,,™^: foU-owj Q»*t PAG^ Photographs/Tonal Material (by,»9anumb«): . Other (note dMcnplion^MB* numbar). Description . Page Number Scanning Agent Signoff: . . r . Date Received: I lJ3lHi__ Date Scanned: / / Ml K Date Returned: cLjJjl*. %4ArhJy>'<fA Scanning Agent Signature: rWAtjfA^ykjl f v • mtt^R,, Rev &» ds/lcs Document control Fo<mc*Morm.v«d UNCLASSIFIED Security Classification DOCUMENT CONTROL DATA - R&D (Sacumy claaallication ol till,, body of abstract and indexing annotation muat ba antarad whan the overall report ia claa tilted) I. ORIGINATING ACTIVITY (Corporate author) Massachusetts Institute of Technology Project MAC 3. REPORT TITLE The Complexity of Finite Functions 2a. REPORT SECURITY CLASSIFICATION UNCLASSIFIED 26. GROUP NONE 4. OESCRIPTIVE NOTES (Type ol report and tnclu Ph.D., Department of Electrical Engineering, February 1972 5. AUTHORIS) (Laat name. Itrat name, Initial) Vilfan, Bostjan 6. REPORT DATE March 19 72 8«. CONTRACT OR GRANT NO. N00014-70-A-0 362-0001 6. PROJECT NO. 7*. TOTAL NO. OF PAGES 118 76. NO. OF REFS 25 9a. ORIGINATOR'S REPORT NUMBER(S) MAC TR-97 (Thesis) •6. OTHER REPORT NOIS) (Any other number t that may be aeaigned thla report) NONE 10. AVAILABILITY/LIMITATION NOTICES Distribution of this document is unlimited M. SUPPLf MENTARY NOTES None 12. SPONSORING MILITARY ACTIVITY Advanced Research Projects Agency 3D-200 Pentagon Washington, D. C. 20301 s. abstract T h e topics covered are the length of formulas for finite func- tions , the order of cyclic perceptrons , and pattern counting machines. Using a generalization of a theorem of Specker, it is shown" that the Boolean function is 1 if the number mod p of arguments equal to 1 is cannot be represented by a formula of length proportional to the number of arguments if k-ary logic is used with p>k. The same thing can be shown for arbitrary k if the only binary operators used are max(x,y) and min(x,y). It is also shown that the connectivity predicate cannot be represented by a formula of this kind, regardless of k and of the operators used. Next shown is that the connectivity predicate and the Euler number predicate cannot be represented by finite order cyclic per- ceptrons. Finally, it is shown that the only topological predicates that can be reconstructed from the k-subpattern spectrum of a given square pattern of 0's and l's are functions of the Euler number. The k-subpattern spectrum of a pattern is a tuple given the number of omnrrsnces of anv kxk square subpattern in the original pattern. 14. KEY WORDS L computational complexity combinatorics finite functions DD/ N r.. 1473 (M.I.T.) UNCLASSIFIED Security Classification Scanning Agent Identification Target Scanning of this document was supported in part by the Corporation for National Research Initiatives, using funds from the Advanced Research Projects Agency of the United states Government under Grant: MDA972-92-J1029. The scanning agent for this project was the Document Services department of the M.I.T Libraries. Technical support for this project was also provided by the M.I.T. Laboratory for Computer Sciences. Scanned ^te: {/&g//J<lg M.I.T. Libraries Document Services darptrgt.wpw Rev. 9/94