Skip to main content

Full text of "mit :: lcs :: tr :: MIT-LCS-TR-097"

See other formats

This blank page was inserted to preserve pagination. 

'-h-~akj ~d.tVv 


id i^daKwr stf.3 ^rd O&J.'Issml -ra^ali s 



. ^i .1 - : J 

• J dUs^a ?&:{:) nWO-Kfe 8X Jl iCs^vl^qg-S 

3Ateb'iO aj*t3s«&sg a*!3 3ss«siigs3 3onrrso 
b 1p ioo2<j sw 3e43~ »i &M5 lo assise 

•fell 5«!3 ii £ -$&btb UBt-lupmi nol3onuJ 
"ie pas srf^ isi cs*olj Harass*] *o*>*iq a it's 

d i^uk 31 s «Ism»Jt3s-£ siM iav*o gffo-r 


-to £flS.I?:»aul! sis sstyg 


Zfl3 , 33V9WCJM 

-■?:rise*^ 3ifi»Jt^fc*vni &' 


.?rsa:J3s$ Thai 'mlus's l^aq £ ssatij iin -^dnux: 

tts 02139 

-5 > JJJ! 

iiasi so 


tipoo snj 



Lower bounds on the length of formulas for finite functions are 
obtained from a generalization of a theorem of Specker. Let f: 
{0,1,... ,d-l} n -» {0,1,... ,d-l) be a function which can be represented 
by a formula of length ^ en. For any m, if n is sufficiently large, 
there is a restriction f: { 0,1, . . . ,d-l} m -» { 0,1,. . . ,d-l} of f which 
is representable by a special class of formulas called homogeneous 
e-complexes. By showing that certain functions do not have restric- 
tions representable by homogeneous e-complexes, we are able to conclude 
that the length of formulas representing the mod p sum, p > d, or the 
connectedness of a pattern on a discrete retina cannot be bounded by 
a linear function of the number of variables in the formula. 

Also considered are perceptrons over finite fields (cyclic per- 
ceptrons). It is shown that cyclic perceptrons of bounded order 
cannot represent the geometric predicate connectivity. An interesting 
aspect of this is that one proof of the corresponding result for 
bounded order perceptrons over the rationals rests on the inability 
of the latter to represent the parity function. However, the parity 
function requires order 1 if the field has chracteristic 2; thus, 
this proof breaks down in the case of cyclic perceptrons. Another 
geometric predicate that cannot be represented by bounded order 
cyclic perceptrons is Euler number equals k (for an arbitrary k). 
However, this predicate can be represented by bounded order percep- 
trons over the rationals. It must be noted, however, that our proofs 
are different and much simpler than the corresponding proofs derived 
by Minsky and Paper t for perceptrons over the rationals. 

Finally, we investigate k-pattern spectra of a discrete retina. 
This is the 2 k -tuple, each component of which corresponds to the 
number of times a particular kxk pattern occurs on the retina. It 
is shown that the only topological predicates that can be determined 
from k-pattern spectra of discrete figures are functions of the Euler 
number of the figure. 

This report reproduces a thesis of the same title submitted to 
the Department of Electrical Engineering, Massachusetts Institute 
of Technology, in partial fulfillment of the requirements for 
the degree of Doctor of Philosophy, February 19 72. 

mimmx^ wm ^ittm -'fi'^^ m i ^ ^ v. m mm ^ r f * ** ,<. / ***** * i Uj^ULi p ^.Wpi l ipjIl i ll l LI I Jm^- ' iipPi ' l ' . I WIU ll .1 J.HW^L«KjHWWP*wwiMp mu! 


H*5 : .1 

In the first place, I cp^^ Aefe% ^ jg^fl^^u^k jto Professor 
Albert R. Meye^ t»,tfea qftFCft ^ 3 M^F^^ft^f it ^.*r hom * learned 
the heuristics of Research. ^<i£$$§£%u\fg&^g iijtifoduced me 
to the problems described he? ,a, f ^ r ^a^J (3 t l %ijQ <#$«&% l<?n,g hours with 
me suggesting improvements and modifications. 

I am also indebted to my readers, Professors Michael J. Fischer 

89 xs !' qatot)** s* J * i 

and C. L. Liu for valuable suggestions. 

I would like to thank Professor Frederick C. Hennie and 

-*5TO-vf r ;.- f swi:»<;8 rC IS 

Project MAC for financial support during my study. 

Last but not least, thanks are due to Miss Marsha Baker for 
consenting to type this thesis. 

•■'■■--'■ ■■: 'iXlS'-""^~: ?f> i'1gy».J :>ffT' ? ,£ 

-. ^i.-./u MU<: £ ocr SAT %ij H 1 I»!£ : : ms -i y;s~:w^i^ 





1.1 Finite Functions 5 

1.2 Formulas 5 

1.3 Measures of Complexity 9 

1.4 Problems Related to the Length Measure 13 

1.5 Specker's Theorem 20 

1.6 Cyclic Perceptrons 23 


2.1 e-Complexes 26 

2.2 The Generalized Specker's Theorem 38 

2.3 On Specker's Theorem 41 


3.1 Counting mod p 51 

3.2 Connectivity 55 

3.3 The Length of Symmetric Functions 60 









1.1 Finite Functions 

Let n be nonzero and finite; then a partial function IN ■"♦JKT , defined 
on only finitely many n-tuples, is called a finite function . We will restrict 
our attention to a subclass of finite functions. D = {0,1, . . . ,d-l] is an 
initial interval of^KT. Then we will consider ?, the set of all (total) 
functions D -♦ D for all possible D and (finite and nonzero) n. 

Let f : D n -♦ D. Then f is identified with a (functional) table with d 
rows (corresponding to all possible n-tuples over D) and n+1 columns (corres- 
ponding to the n arguments and the value of f). Obviously the number of func- 
tions D -» D is d . 

Consider any function f i D -» D for arbitrary D, n. We will say that f 
depends on the i argument if and only if there exist two n-tuples 

a = (a n a. , . . . ,a ) and b = (b. , . . . ,b , . . . ,b ) such that a . = b for j t i , 

lin lin J j 

a. / b., and f(a) ^ f(b). Suppose that f does not depend on its j* h argument; 
then we will say that the j argument is a fictitious argument. 

1.2 Formulas 

Let there be given the countable sets H ■ {x^x^...} of variable symbols 
and fi of operator symbols. Each element of fJ is a name for a function in ^, 
and conversely each function in ^ has a name in 0. Let 9 £ represent the 
function f : D n -♦ D. Then we will write argOP) = n and dom(9) = D. 

1.2.1 Definition 

A D- formula is a finite expression F = cp(G..,...,G ) such that tp G H , 
arg(cp) = n, dom(cp) = D, and either G. 6 H or G. is a D-formula for 1 ^ i ^ n. 

A formula is simply a D-formula for some D. 

Let F be an arbitrary D-formula and let x be the highest numbered 
variable symbol appearing in F. Then F represents a function f : D -» D. 
This correspondence is well-known and we will not describe it in detail. 
Without danger of imprecision, F will also be considered as a representation 
for all functions obtained from f by adding fictitious arguments. 

Let there be given two formulas F and G. Suppose that F represents a 
certain function f, and also a representation for f can be obtained from G 
by possibly choosing different variable symbols. Then we will say that F 
is equivalent to G (F = G). 

Remarks. Usually, if we are dealing with D- formulas for a single domain 
D, we represent the identity function by a variable symbols (i.e., we omit 
the operator symbol for the identity). In the formal model we use, we cannot 
do this since it would be ambiguous. Also, for purely technical reasons, we 
insist that every operator has at least one argument (otherwise, the wording 
of several definitions and results would be more cumbersome). Thus, we do 
not allow constants. Rather, instead of constants, we use operators with one 
fictitious argument. Suppose we are given the formula F. Occasionally, we 
will say "Replace the variable x (in F) by the constant a". This is to be 
interpreted as "Replace the variable x with a (y) " where y is a variable 
symbol not appearing in F. 

Let f : D -+ D and let g be an arbitrary finite function of n arguments 
with domain E £ D and such that g = f E. Let F be a D-formula for (i.e., 
representing) f. Then we can also say that (F,E) represents g. From now on 
we will not be pedantic, and we will simply say that F represents g. Some of 
the main results in this thesis are concerned with the question, given a 
specific function D -+ D , how much can we simplify its representation if we 
choose an E-formula for it with D f E. 

If F is an arbitrary formula, then the set of variables appearing in it 
will be called its support (denoted by S(F)). The set of operators appearing 
in F will be called its basis (denoted by B(F)). 

Let § s ft. Then the set of formulas F such that B(F) C $ will be called 
the set of formulas over f. Hopefully without too much danger of ambiguity, 
we will also say that $ is a basis of operators (for formulas over $). All 
the significant results we will describe deal with formulas over $ when $ 
is finite (and representing a set of operators with domain D for a single 
value of D). From now on, whenever a basis of operators $ is introduced, 
it is always assumed finite. Usually, we are interested only in bases that 
allow all function D -♦ D for a certain D and arbitrary n to be represented. 
Such bases will be called complete bases (for D). 

Notation. Elements of % will always be denoted by lower case Latin 
letters. The various bases of operators we will use will be denoted by 
capital Greek letters; operators (i.e., basis elements) will be denoted by 
lower case Greek letters (except for well known operators for which established 
notation exists); formulas will be denoted by capital Latin letters; and D 
will always refer to the domain of formulas. d will denote D . 

1.2.2 Example . 

If D = {0,1}, then the functions D -» D for arbitrary n are known as 
Boolean functions. A complete basis for (0,1} conists of the binary operators 
A (conjunction) and V (disjunction), and the unary operator — - (complementation), 
This basis shall be denoted by II. The formula F = V(A(— (x ),x 2 ),A(x , — ( x 2 ^^ 
over II represents x 1 x„ (the mod 2 sum of x 1 and x„). Usually, this is 
written as x.. A x V x- A x_. We have S(F) = {x..,x 2 }, and B(F) = H „ 

A convenient representation of formulas is by trees. This is a standard 
device that will not be described; suffice it to say that to each formula F 
there corresponds a tree T(F) whose terminal nodes are labelled with variable 
symbols and the nonterminal nodes with basis symbols. As an example, let F 
be as defined in Example 1.2.2. Then T(F) is shown in Fig. 1.1. 

Given a formula F, we need a notation for subformulas of F. 

The definition of sub formula is the standard one: (1) F is a subformula 
of F, (2) if F = cd(F ,...,F k ), then if F . for 1 ^ i ^ k is not a variable 
symbol, any subformula of F. is a subformula of F, and (3) subformulas of F 
are only objects satisfying (1) and (2). Subformulas distinct from F are 
proper subformulas. 

Let G be a subformula of F such that G = cp(H., , . . . ,Hg). Then we will say 

H. =G . for 1 £ i £ jfc. This notation can be iterated. In Example 1.2.2, 
i .1 

F = -(x 9 ). However, note that F „ . is a variable symbol which according 
to our definition is not a formula. This can be remedied by replacing this 
particular occurrence of the variable symbol x. by id(x^). For this reason 
we will require that all the bases we consider contain the identity function 
whether this is specifically mentioned or not. 

If G = F .i(l).i(2). i(r)' then ^ = J ^)J ( 2 )° ■ • J( r ^ is called th e index 
of G (for completeness, let X denote the index of F). If G is a proper sub- 
formula of F, then F = H(X,G) where X U S(G) = S(F) and H(X,z) is a formula 
(determined by j ) where z appears only once. We write H = F/G. In this case, 

with F and G as given, we will also write S*(G) = X (i.e., the variables of 


F that appear outside of G) . We define S*(F) = <t>. The subscript F will 


generally be suppressed when it will be clear to what formula F we refer to. 
In what follows, whenever we will deal with a subformula G of F, it will be 
assumed that the index of G is also given; for if not, then, e.g., F/G and 
S*(G) are not uniquely defined. 

Frequently, formulas will occur where certain variables have been re- 
placed with constants. Suppose F is a formula over I, X G S(F), and a € D; 

then, F with all variables except those in X replaced by a will be denoted 

X X 

by F . Obviously, F is a formula over § U {a}. If f is an arbitrary function, 
a a 

X a subset of its arguments, then f has the analogous meaning, viz., the 


function obtained from f by restricting the elements outside of X to a. 

The functional table of f X is obtained from that of f by deleting all columns 


except those that correspond to X and retaining only the rows with a 
entries in the deleted columns. 

1.3 Measures of Complexity 

Let us introduce the three most widely studied measures on formulas: 

(1) Length. The length of a formula F, denoted by L(F), is the number of 
occurrences of variable symbols in F. In other words, it is the number of 
terminal nodes of T(F). 


(2) Cost. The cost of a formula F, denoted by C(F), is the number of opera- 
tor symbols in F. In other words, it is the number of nonterminal nodes of 

(3) Depth. The depth of a formula F, denoted by D(F), is the depth of nest- 
ing of operators in F. In other words, it is the number of arcs on the long- 
est branch of T(F). 

Now, given an arbitrary function f : D -» D and a (finite) basis $, we 
define the length of f over $ as 

L(f,§) = min({4: There exists a formula F over $ for f such that 

L(F) = I}) 

If f cannot be represented by a formula over §, we define L(f,f) = °°o Simil- 
arly, for cost and depth. 

It is noteworthy that all the measures above are closely related. In fact, 

c «L(f,$) ^ c (f,$) ^ C 1 -L(f,$) (1.3.1) 

c 2 -log 2 (L(f,$)) iD(f,l) * c 3 -log 2 (L(f,3)) (1.3.2) 

for an arbitrary function f such that L(f,$), C(f,$), and D(f,$) are finite, 
and certain constants c Q , c., c 2 , and c„ that depends on $. The basis I is 
also arbitrary, except in the case of the right inequality of (1.3.2) where 
it must be such that all the constants and the function g (see Lamma 1.3.1) 
may be represented. 

We first establish the relation between cost and length (1.3.1). 


Any formula F over $ can be built up from one which uses only one opera- 
tor symbol (an elementary formula) by successively replacing variable symbols 
with new elementary formulas. If F does not contain one-argument operators, 
then whenever we increase the cost during the build-up (by adding an elementary 
formula with cost 1), we also increase the length. Specifically, the length 

increases by between n . -1 and n -1 where n . and n are respectively 
J min max mm max 

the smallest number larger than 1 and the greatest number of arguments of an 
operator of §. This results in the estimate 

1 - -L(F) £ C(F) <: — — r -L(F) (1.3.3) 

n -I n . -I 

max mm 

where c = 1„ Suppose F contains one-argument operators. In other words, T(F) 
contains nodes with branching factor one. Let the maximal number of such nodes 
that occur one after another on any branch of T(F) be c*; then (1.3.3) still 
applies with c = c* + 1. (1.3.1) is obtained from (1.3.3) by noting that the 
minimal length or cost representation of any function (over the chosen basis 
$) can be achieved with a formula where c* ^ d (the number of functions D -♦ D). 

The left inequality in (1.3.2) is established by a trivial counting argu- 
ment (the maximal number of terminal nodes in a tree with branching factor ^ 

n and depth d is n ). The right side requires more effort (the following 
max max 

argument is due to R. W. Floyd). We first state the following obvious 

1.3.1 Lemma . 

Given a formula F such that F = F-CX^F^xp) where F 2 is a proper sub- 
formula of F and F.. = F/F„, the following holds: 


F = F 3 (F 1 (X 1 ,C ), F 1 (X 1 ,C 1 ),...,F 1 (X 1 ,C d-1 ),F 2 (X 2 )) 

where G. for D ^ i ^ d-1 is any formula representing the constants 0,...,d-l 

(or, as we have remarked previously, the one-argument function with constant 

value), and F„ is any formula representing the function g(z Q , . . . ,z^_^,z^_ = 

"z n if z, = 0: z, if z, = 1,..., z, , if z, = d-1. 
d 1 d ' d-1 d 

Let F be an arbitrary formula over §, and let G be a proper subformula 
of F. We already know that F = H(X,G). The claim is made that if L(F) > 1, 
G can be chosen in such a way that 

L(H)-1,L(G) < ^MJj- -L<F) d.3.4) 


where n is as defined previously. (Remark: L(H)-1 is the number of 
max r 

occurrences of the variables of S*(G) in H. ) 

To find G use the following procedure: Start with F and proceed to sub- 
formulas of F. Assume you are considering the subformula K. Then two cases 

can arise. Either among K for 1 £ j £ k where k is the number of arguments 

• J 

of the outermost operator of K there is one, j', such that L(K , ) ^ ct'L(F) 

(0 < a < 1 will be determined later with the purpose of obtaining the lowest 

possible estimate of L(H)-1 and L(G)), or not. In the first case, proceed to 

K ., and containue. Otherwise, set G = K „ where j" is such that L(K „) = 
. j • J • J 

max (L(K .)) and terminate. Before the procedure terminates, L(K) ^ ct'L(F). 
l^j^k ' J 

Thus — — £ L(G) < CO-L(F). This also means (l-a)'L(F) £ L(H)-1 < (1-- ) 

max max 

.L(F) (because L(G) + L(H)-1 = L(F)). The lowest bound for L(G) and L(H) 

is obtained by setting a = 1 ; hence (1.3.4) 



Now apply Lemma 1.3.1 with G replacing F ? and H replacing F . F is of 

depth c, depending on §. If the outlined procedure is applied recursively to 

H(X,C ) for <; i <C d-1 and to G, we obtain in (1.3.2) c„ = , C , where 
i 3 log b 

n +1 + 2 

, max T 

b = . 


Note that unlike the cost-length relationship, the minimal value of 
depth may not be achieved by the same formula as the minimal value for length. 

Apart from the relationship between the various measures, depth and cos' 
will not be treated further. Even though in what follows (in this chapter) 
many things hold mutatis mutandis for depth and cost, most of the specific 
discussion and the examples shall be confined to length. 

1A Problems Related t o the Length Measure 

In this section we will mention several questions that have been asked 
about the complexity of finite functions, their status as of this writing, 
and how they relate to the work to be described here. 


A more precise expression is obtained if the right side of (1.3.2) is re- 

placed by • log. (L(f ))+i for some constant I. Namely, if we start out 

J-Og.D L 

with a formula F and decompose it according to (1.3.4) and Lemma 1.3.1, then 
the length of H(X, C.) and G is bounded by ~L(F)+k where k is the length of C.. 
After n applications of Lemma 1.3.1, the lengths of the relevant formulas are 
bounded by ^ L(F) + k^FT +. ..+ k^+k « ^ (L(F)) + -^— j- ; hence the figure 


The Problem of Aggregate Length 

Let $ be a complete basis (for a certain domain D). The statement of the 
problem is: What is the largest number L(n,$) such that there exists a function 
f: D n -+ D and L(f,$) = L(n,§)? 

It has been studied by several authors, and is now effectively disposed 
of. Riordan and Shannon [Ri42] first derived a lower bound for L(n,IT). 
Actually they studied series-parallel contact networks, but the two models 
are equivalent. The first upper bound (for the same model) was obtained by 
Shannon [Sh49]. Krichevskii [Kr59] derived a lower bound for L(n,$) for 
arbitrary domains and bases, while Lupanov [Lu59] obtained the best upper 
bound for the general case. The result is 

0(L(n,$)) = rr (1.4.1) 

log a n 

where 0(f(n)) = g(n) means that lim — pf is finite and nonzero. There 

are two remarks that are in order here. The first is that 

Formulas represent finite functions efficiently; i.e., 
the total number of formulas (over a given basis $) of 
length up to L(n,$) closely matches the number of func- 
tions of n variables. (1.4.2) 

The second is 

The fraction of functions D n -♦ D that can be represented 
by formulas of length up to L(n,f ) • (l-e)for an arbitrary 
< e ^ 1 approaches zero as n -» °°. 

The interested reader may obtain more information in the literature cited 


Obviously, we could define functions C(n,§) and D(n,$) analogous to 
L(n,§) in terms of the cost and depth measures. In general, such functions 
(aggregate complexity functions) can be defined in connection with any model 
for the representation of functions D -♦ D and any measure on this model (an 
obvious variation of L(n,$) is to remove the condition of completeness on $). 
It should be noted that the asymtotic behavior of aggregate complexity functions 
remains an active area of research. For references on the subject, see Lupanov 

The Minimization Problem 

Investigation of the complexity of finite functions started on representa- 
tions of Boolean functions by logical circuits. In fact, formulas can be 
thought of as circuits with fan-out one. Thus, the first problems studied 
were those a logic designer is likely to ask: Given a finite function, what 
is the minimal circuit (formula) that represents it (i.e., find the complexity, 
and do so "effectively"). 

Unfortunately, no statisfactory solution to the minimization problem 
exists (for any measure). This does not mean that it is impossible to obtain 
a minimal formula for a given function f : D n -» D; rather that existing 
algorithms are impractical. Thus, it is always possible to order formulas 

according to length, and then search all formulas up to length L(n,f) for the 

d n 
first formula that represents f; but since there are d functions of n argu- 
ments this approach is absurb. 

At the present, all existing algorithms for the minimization of functional 
representations employ some sort of an exhaustive search (e.g. , the Quine 
algorithm for the minimization of disjunctive form representations of Boolean 


functions). In fact, there is reason to believe that a more efficient method 
does not exist, i.e., 

1.4.1 Conjecture 

Any generally applicable exact minimization procedure is comparable (in 
terms of computational complexity) to an exhaustive search among formulas. 

It is useful to consider a specific machine model. Let us consider 
implementations of such a procedure as a deterministic one-tape Turing machine 
M $ (see, e.g., Arbib [Ar69]) that receives as its input the d -tuple defining 
an arbitrary function f : D -♦ D, and whose output is the minimal formula F 
(over I) for f. Conjecture 1.4.1 gives us that the computation time of 
M- may attain an exponential (in the length of the input). Let us venture 
a more restrictive and precise version of Conjecture 1.4.1: 

1.4.2 Conjecture 

Let M. be as described, m is the length of its input, and let Tl(m) be a 


function such that ^' m ' -» as m -» °° for an arbitrary constant c > 1. Then 


the proportion of inputs of length m at which the running time of M exceeds 
T|(m) approaches 1 as m -♦ °°. 

Actually, the specific machine model on which the procedure of Conjecture 
1.4.1 above is implemented is not particularly important. It can be easily 
shown (see, e.g., Arbib [Ar69] , Chapter 4) that different deterministic machine 
models (this applies to the most widely used models, e.g., one-tape and multi- 
tape Turing machines) can simulate each other in such a way that the running 


time of one is related to the running time of another at most by a polynomial. 
In this way, whenever the running time is exponential (in the length of the 
input) in one case, it must be so also in others. 

It seems that Conjecture 1.4.1 was first expressed by Yablonskii [Ya59]. 
A very interesting result connected with this subject was recently obtained 
by Cook [Co71]. He obtained strong evidence that a simpler problem requires 
nonpolynomial time. The problem is that of recognizing whether a certain 
disjunctive normal form (for a Boolean function) represents the constant 1. 
Cook showed that if this problem could be solved in polynomial time (by a 
deterministic one-tape Turing machine), then a number of other problems that 
are regarded as very difficult (e.g., given the graphs G., and G^, determine 
whether G.. is isomorphic to a subgraph of G 2 ; the recognition of primes; etc.), 
would also be rapidly computable. Note that a fast minimization machine 
would give us also a fast constant recognizer; hence, Cook's results supports 
Conjecture 1.4.1. 

The Classification Problem 

In view of the difficulty of finding an exact nontrivial solution to the 
minimization problem (i.e., one that does not employ exhaustive search), present 
research is directed at establishing bounds for the length of functions. We 
consider sequences of functions f,,... of 1,... arguments and study the 
growth rate of the length of f . Thus, we can talk of classes of linear 
(length) sequences, quadratic (length) sequences, etc. Also of nonpolynomial 
(length) sequences. Unfortunately, if a sequence belongs to a nonlinear class, 
it is very difficult to estimate its length. We cannot even assign represen- 
tatives to the polynomial classes of degree > 2, let alone the nonpolynomial 


classes. In fact, at the present we have only a very limited store of examples 

of nonlinear sequences. 

2 n 
Consider the Boolean function f = .©., x.. Subbotovskaya [Su61] gave a 

n i=l l 

2 3/2 
striking proof that 0(L(f , II)) ^ n . It was known already to Shannon (see 

2 2 
[Sh49], or [Ya54]) that 0(L(f ,11)) £ n (the length of this sequence, of course, 

grows linearly if © is used). Unfortunately, it seems that the technique of 

[Su61] cannot be generalized to d > 2. Subbotovskaya' s result has recently 

been improved by Khrapchenko [Kh71]. He succeeded in showing that 0(L(f , II)) 

^ n . Since this result employs a very interesting technique, and since it 

has not yet been translated into English, it is reproduced in Appendix B. 

Neciporuk [Ne66] discovered a sequence of Boolean functions f such that 
n 2 

0(L(f ,§)) = ~ for an arbitrary basis ?. It is true that the functions 

n' log 2 n J 

involved in the Neciporuk sequence are rather "artificial" in that, while 
defined in a straightforward way, they have no special significance; however, 
lately Harper and Savage [Ha71] have succeeded in applying the Neciporuk tech- 
nique to a practical combinatorial problem (The Marriage Problem). 

Neciporuk 's construction is based on the following lemma: Let f be a 
Boolean function of n arguments. Consider a subset X of the arguments of f 
and the set of restrictions of f to X obtained by setting the arguments outside 
of X to constants. Let the number of such restrictions be r. If F is any 
formula over a finite basis $ for f, then the number of occurrences of variables 
representing the arguments in X is 2: clog.r where c depends on the basis § 
(for the proof of this see [Ne66] or [Ha71]). 

The Neciporuk function f of n arguments is then obtained as follows: 
The n arguments are arranged in a rectangular array with dimensions as shown 

in Fig. 1.2. Each argument x. . is associated with a 0-1 valued m-tuple a.. 

° ij — ij 


such that (1) not all components are 0, and (2) if (i,j) 4 (k,-0 then 
a. . ^ a, ,. Then we define 

f = © x.. © K(a..,k) 
all i,j i£j 

where K(a. . ,k) denotes the conjunction of those arguments x, whose second 

subscript (t) corresponds to nonzero components of a... 

It can be verified that the number of restrictions of f to the variables 

of an arbitrary row (except, perhaps, the last which may be imcomplete) ob- 

, . „n-m 

tained by replacing the variables of the other rows with constants is l 

This follows from the fact that any Boolean function can be uniquely represen- 
ted by a Boolean polynomial (see Lamma 4.5). Then, by the lemma above, the 
number of occurrences of variables of any row (except, perhaps the last) in 

any formula for f is ^ c(n-m); hence, the length of f over $ w c- — (n-m). 

n 2 * 

In other words, 0(L(cp ,$)) = ■: for an arbitrary basis $. 

T n *-°&2 n 

Neciporuk's construction may be viewed as a solution to a special case 

of the following problem (the problem of exhibiting a function of arbitrary 

length): Given a basis $ and a number k ^ L(n,$), exhibit a function 

n n 2 

f: D -> D of length ^ k over §. In Neciporuk's case 0(k) = , n . 

Since so few examples of functions that are known to be of large length 
exist (in spite of (1.4.3)), the reader has no doubt already gained the 
impression that this problem too is very difficult. However, we again have 
the trivial solution that consists in examining formulas in n variables in 
the order of their length, recording the functions they represent, and 
choosing the first previously unencountered function represented by a formula 
of length ^ k. In fact, it is reasonable to state an analog of Conjecture 


1.4.3 Conjecture 

The problem of exhibiting a function of arbitrary length is comparable 
(in terms of computational complexity) to an exhaustive search among formulas. 

We again make this conjecture more precise on the example of determin- 
istic one-tape Turing machines. 

1.3.4 Conjecture 

$ is an arbitrary basis. N. is a deterministic one-tape Turing machine 

with input (n,k) where n is arbitrary and k ^ L(n,§), and whose output is the 

d -tuple describing a function f of n arguments such that L(f,§) ^ k. Then 

there exists a constant c > 1 such that if k ^ e«L(n,$) for any < £ £ 1, 

the running time of N, on input (n,k) exceeds c when n ^ n( e ). 

We can sum up the discussion of the classification problem as follows. 

The problem is far from understood. At the present no sequence of functions 

is known whose length grows faster than n . Isolated examples of sequences 

with growth rate ^ n are known, and present research is directed at inven- 
ting more general techniques that can be used for estimating the complexity 
of whole classes of sequences. Also techniques have to be devised for d > 2. 
The importance of this will be discussed below in Section 1.5. 

1.5 Specker's Theorem 

The first general technique for proving the nonlinearity of a large 
class of sequences (of Boolean functions) was discovered by Specker [Ho68]. 
Let the basis IT U {x © y} be denoted by S . Then 


Theorem (Specker) . 

If f is a Boolean function of n arguments, if L(f,£) ^ c«n for some 
constant c, then for any integer m, if n ^ T| (m,c), a subset X = [x 1 ,...,x } 
of the arguments of f can be found such that (1) 

„ m m 

f J = c © c • n (i© x .)ec 2 © x. (1.5.1) 

i=l i=l 

where c n , C- , c„ are Boolean constants and Tl (m,c) is a certain number- 
theoretic function. Furthermore, (2) if the basis is II (the other assump- 
tions remaining unchanged), then c~ =0. 

This theorem has been used by Hodes and Specker to show that the predi- 


S x = mod k (1.5.2) 

i=l 1 

for k > 2 and x. f {0,1} is of nonlinear length over S. 

Using the second statement of the theorem, they are also able to give 

an alternative proof of the nonlinearily of the length of © x. over n. 

Another result obtained with Specker 's Theorem is the fact that some 

geometrical predicates (in particular, connectivity) discussed by Minsky 
and Papert [Mi69] are of nonlinear length over S (see Hodes [Ho70]). 

In Chapter Two we will formulate and prove a generalization of Specker' s 
Theorem (Theorem 2.2.2) to include the case d > 2 and multi-argument oper- 
ators in $. Our proof reveals the nature of both results more clearly. 

t _ 

T] Will be discussed in Chapter Two. 


They belong to a class of combinatorial results reminiscent of Ramsey's 
Theorem (see Ryser [Ry63]). In fact, an earlier version of our proof 
of Theorem 2.2.2 used Ramsey's Theorem. Besides this, Theorem 2.2.2 
enables us to derive the nonlinearity of new functions (sequences of 
functions) such as counting mod p where p is a prime, d possibly equals 
p, but there are restrictions on the basis, etc. An example of an im- 
provement over existing results is the connectivity predicate. Hodes 
[Ho70] proves that it is nonlinear if d = 2. However, in Automata Theory, 
for example, the result that a certain language can be computed in non- 
linear time if k states are used in the finite control would be considered 
weak. Rather we search for proofs that work for arbitrary finite controls. 
The Generalized Specker Theorem (Theorem 2.2.2) gives us a tool for proving 
the nonlinearity of the length of the connectivity predicate regardless of 
the domain D and basis §. We can apply it to connectivity by "reducing" 
connectivity (for the meaning of "reduction" see [Mi69] or 3.2) to certain 
symmetric functions. 

We should note that the generalization of Specker 's Theorem that we 
prove is the obvious one to attempt; but, as the reader will see, the proof 
turns out to be less straightforward. As an indication, consider (1.5.2). 

It does not generalize directly to d > 2 since, e.g., the function {0,1} -» 

n _ 
{0,1} defined by E x. = mod 6 can be represented in linear length with 

i=l x 

d = 3. This is because 

2, 5, n 

[Ex. =0 mod 6] = [ E x = mod 3] A [ E x =0 mod 2] 

1=1 X i=l l i=l X 


Hodes and Specker do not derive any bounds for the lengths of the 
functions investigated by them. This question is asked (and to an extent 
answered) in 3.3. 

1.6 Cyclic Perceptrons 

Cyclic Perceptrons will be treated in Chapter Four. They are an 
application of ideas of Minsky and Papert to the representation of functions 
by combinations of finite operators. In particular, one of the concerns 
in [Mi69] is to formalize the intuitive idea that the connectivity predicate, 
being "global" in nature, cannot be computed (or represented) by a "simple" 
combination of "local" predicates. 

The perceptron is the predicate 

E a.'tp. ^ 
i€I L X 

where I is an indexing set, a. € Q, the rationals, cp. € $, a set of Boolean 

functions (whose value is interpreted as being either the rational or 1. 

The cyclic perceptron is defined as 

where a. € F, a finite field, Y ^ F, and other symbols have the same inter- 
pretation as before. Thus, both represent a certain Boolean function. 

Minsky and Papert introduce the concept of the order of a perceptron 
(the maximal number of arguments on which cp depends where i ranges over I). 
They define then the order of a predicate as the minimal order of a per- 
ceptron that represents the predicate. They formalize "local" by defining 


an infinite predicate sequence to be local if and only if every member is 
representable by a perceptron of order ^ r, for some finite r. They are 
then able to show that connectivity is nonlocal. 

The concept of order can also be applied to cyclic perceptrons. Chapter 
Four will contain results on the order of the various predicates introduced 
in [Mi69]. In particular, connectivity is shown to be nonlocal. This will 
be an extension (to finite fields of arbitrary characteristic) of the results 

described in [Vi70]. 

Chapter Five describes a model of computation (Pattern Counting Machines) 
that again performs- a "local" computation followed by a "global" computation. 
In this case the "local" computation is even more constrained than in the 
case of perceptrons. The result is that no matter how cleverly we utilize 
the "local" information in the subsequent "global" phase, the connectivity 
predicated cannot be computed. 


T(F) for the formula F in Example 1.2.2 
Fig. 1.1 


l row 

.th , 

j column 

X. , 



number of columns: m = llog 9 nl + 1 

number of rows: ni/ml 

The array of arguments used in the definition of the 
Neciporuk function f 


Fig. 1.2 




2.1 e-Complexes 

Throughout this section, all formulas are D-formulas for some fixed (but 

arbitrary) domain D, and all operators are functions D -»D. 

Given the formulas F.,...,F , we shall call the formula F = y(F^,. . . ,F^) 
where cp is an arbitrary operator a parallel combination (PC) of F^, . . . jF^. cp is 
called the decoding operator of F. 

Let F(X,z) be a formula where the distinguished variable z appears only 
once, and let G be an arbitrary formula. Then F(X,G) shall be called a series 
combination (SC) of F and G through z. 

2.1.1 Definition 

We give an inductive definition of an elongated n- component (e^component) 
for n ^ 0. 

(1) Let cp. be an arbitrary unary operator and z an arbitrary variable symbol 
Then cp. (z) is an e -component. z is the input variable while cp in is the 

input operator . 

(2) Let cp be an arbitrary binary operator, G an arbitrary e n _ ^component , 
and x $ S(G). Then F = cp(x,G) (or cp(G,x)) is an e^component. The input vari- 
able and input operator of G are also the input variable and input operator 

of F. x is a lateral variable of F. Any lateral variable of G is also a 
lateral variable of F. cp will be called an internal operator of F. 


An example of an e -component is given in Fig. 2.1. Let F be an arbitrary 
e -component, and let x be the sequence of lateral variables arranged in the 
order they are connected to the branch of T(F) extending to the input variable. 
Then x is the lateral sequence of F. If F is an e -component, then the lateral 
sequence of F is \ (the empty sequence). For example, the lateral sequence of 
the e-component in Fig. 2.1 is x..,...,x . 

An e -component with all internal operators equal is a homogeneous 
e - component . 

2.1.2 Definition 

A formula F is an e r - complex if (1) F is a PC of the e n -components F^..., 
F , and (2) the lateral sequence of F. for 2 <: i <: r is either equal to the 
lateral sequence of F. , or the reverse of it. 

F ,..., F are the components of F. If the variables of F- are numbered 
as in Fig. 2.1, the second condition of Definition 2.1.2 means that any compo- 
nent F ..,..., F either appears as in Fig. 2.1, or as in Fig. 2.2. The compo- 
nents of the former kind will be known as standard components, while those of 
the latter kind will be called the reverse components. The lateral sequence 

of F 1 will also be called the lateral sequence of F. 

Both in the case of e -components and e -complexes, one or both indices 

will occasionally be omitted if the particular property they refer to is 

irrelevant to the argument at hand. 

An e-complex composed of homogeneous e-components is a homogeneous 

e- complex . 


One might wonder what the purpose of introducing e-complexes is since 

for appropriate r and m every function of n variables can be represented by 

an e -complex. Thus, it would seem that this class of formulas is trivial. 

However, we will be concerned with e -complexes where r remains fixed as n 
' n r 

grows without bounds, and this will allow us to obtain interesting results. 

We introduce some notation. Let F be an e -component with lateral 

sequence x. ,., N ,x, , ON , . . . ,x. , N . a € D is an arbitrary constant, cp. denotes 
i(l)' i(2)' i(n) J j 

the internal operator corresponding to x...,. 

. Also set cp = cp. . Then 
T n+1 T m 

<p.(a,<P. +1 (a,...,<P k _ 1 (a,cp k )...)) if 1 ^ j < k ^ n+1 

cp* , N =1 cp. if 1 ^ j - k £ n+1 
undefined otherwise 

Note that cp^, . * is a unary operator if j = n+1, otherwise it is a binary 
operator (if it is defined at all). Usually we will suppress the superscript 
a because it will be clear what constant is referred to. 

We now state the simple 

2.1.3 Proposition 

Let F be an e -component with lateral sequence x and input variable z. 
y. is an arbitrary subsequence of x of length m ^ and a € D is an arbitrary 

constant. If we denote the set consisting of z and the elements of y_ by Y, 

then F is equivalent to an e -component G. 



Let x = (x 1 ,x 2> ... ,x n ) and y = (Xj/jj, x i (2)'••• ' x i(m)^ ~ -' Set ± ^ = ° 

and i(m+l) = n+1. Then G has the operators ilf. = CD,. , . lN , n . ,. N v for i ^ j ^ 

j (i(j-l)+l, l(j)) 

m+1 (ijr is the input operator of G). □ 

2.1.4 Remark 

Obviously, Proposition 2.1.3 holds for e-complexes as well; one merely 
has to perform the above construction for each component. 

Proposition 2.1.3 will be frequently invoked. Namely, we will take an 
e-complex F, select a subsequence i e x, the lateral sequence of F, and obtain 
G as above. In this case, G is called the result of an a - merger with basis 
y on F. 

We introduce another restricted class of formulas. 

2.1.5 Definition 

A series parallel combination of e- components (SPCeC) is obtained accor- 
ding to the following rules: 

(1) An e-component is an SPCeC. 

(2) Let F and G be an e-component and an arbitrary SPCeC respectively. 
Then the SC of F and G through the input variable of F is an SPCeC. 

(3) If F.,...,F are SPCeC s, then a PC of F. F is an SPCeC. 

1' ' r ' 1' ' r 

(4) An SPCeC is only an object satisfying (1), (2), or (3). 


Given an arbitrary SPCeC F, we describe its set of components . If F 
consists of the single e-component G, then G is the only component of F. 
If F is the SC of an e-component G and another SPCeC H, then the set of 
components of F consists of G and the set of components of H. If F is a 
PC of F ..,..., F , then the set of components of Fconsists of the sets of 
components of F. for 1 ^ i ^ r. Among the components of F, those whose input 
variable corresponds to a terminal node of T(F) will be called terminal 
components while the others will be called internal components . An example 
of an SPCeC is given in Fig. 2.3. This particular SPCeC has four terminal 
components and two internal components. 

2.1.6 Proposition 

An SPCeC is equivalent to a PC of r e-components where r = d-I+J and I 
and J respectively are the number of internal and the number of terminal component of F. 

Proof F can be converted into a PC of e-components by using Lemma 1.3.1. 
The estimate of the number of e-components in the PC is also obtained from 
there. ^ 

Remark It is a simple matter to verify that if F of Proposition 2.1.6 
has k components, then I ^ k-1; and thus r £ d- (k-l)+l. 

2.1.7 Proposition 

F is a SPCeC with k components F^...^. F. for 1 ^ i ^ k is an e^- 

component for n ^ 0, and, furthermore, the sets of lateral variables of F^ 

and F. are equal for 1 <■ i, j ^ k. Let X be the set of lateral variables 

of F. and let Z be the set of input variables of F. Then for any m ^ and 


a f D if n ^ TL (m,k) where T] (m,k) is a certain function (to be defined), 

there exists a subset Y £ X with |y| = m such that F is equivalent to an 

e -complex G with Y as the set of lateral variables. Furthermore, r ^ d* (k-l)+l. 

Proof If m = 0, we can immediately apply Proposition 2.1.6 and obtain an 
e n ~complex where r is as described in the statement of the proposition; thus 
TL (0,k) = 0. We assume, therefore, that m > 0. 

We recall the following familiar result: 

Let i(l), i(2) ,. . . ,i((p-l) +1) be a sequence of distinct integers. Then 

we can extract a subsequence of length p that is either increasing or decreasing 
(for the proof see Berge [Be71] p. 16). 

Without loss of generality, we can assume that the lateral sequence of 
F. is x-j.-.jX . Then the lateral sequence of F 2 is x. ,^, Xj/ 2 ) ' * " " ' x i(n)" 
The sequence i(l), i(2),..., i (n) consists of distinct integers; therefore, 
if n ^ (n -1) +1, we can apply the above result and find a subset X^ £ X of 
n variables such that after performing an a-merger with basis X- on all 
components of F, the lateral sequences of the descendants of F^ and F^ are 
either the same or opposite. We can continue in this way, processing one 
after another all components. We end up with an SPCeC with components 
G 1 ,...,G k such that the lateral sequence of G ± for 2 <: i <■ k is either equal 
to that of G, or the reverse of it. To obtain G, we apply Proposition 2.1.6. 
In order that Iy| = m, we must have 

2 k-l 
n ^ T^On.k) = (m-1) 

for m > 1. The estimate for r is obtained from the Remark following 
Proposition 2.1.6. D 


Another equivalence that will be used later is given by 

2.1.8 Lemma 

E is an e r -complex, X is the set of its lateral variables, and Z is the 

set of its input variables. Then for any m ^ and a € D, if n ^ Tt 2 (m,r), 

there exists a subset Y C X with 'Y I = m such that E is equivalent to a 

' a 

r it, 
homogeneous e -complex F. 

Proof If m = 0, we simply use Proposition 2.1.3, and the result is a homo- 
geneous e n -complex (obviously, any e Q -complex is homogeneous). Then T! 2 (0,r) =0. 
Thus, from now on we assume that m ^ 1. 

The proof will be given for the special case when E has two components: 
a standard component E- and a reverse component E . It will then only be 
indicated how to generalize the proof. 

A procedure (The Homogenizing Procedure- -HP) will be described that will 

transform an e -complex G consisting of a standard component G. and a reverse 
P i 

component G„ with p a TL(q) (for a function Tl that will be defined later) 
and with the properties: (1) There exist (possibly empty) subsets 1^ and 
R 2 = D such that (^(a.y^I^ = id R (identity on 1^) and ilr^a.y)/^ = id R 
for 1 < i ^ p where CD. and i|r, is an operator of G 1 and G 2 respectively 
and the first argument corresponds to the lateral variable, and (2) 

V(i si,j«p) [ffl^x.y) € Rj_ => »j(x,y) = ^(x.y)] (2.1.1) 

(i.e., the operators of G, are identical on the inverse image of R^). Simil- 
arly for the operators of G 2 on R 2 « 


Remark Note that if R- and R include the range of every operator, 
Property 2 translates into the identity of the operators. In particular, 

this holds if R. = R 2 = D. 


Remark Note that an arbitrary e -complex satisfies Properties 1 and 2 

with R 1 = R- = 0. 

The result of applying HP will be an e -complex H that will either be 


homogeneous, or will have Properties 1 and 2 with S. and S„ replacing R.. and 
R 2 respectively and R / S. or R ? S_. Due to the Remarks above and to 
the fact that D is finite, repeated application of HE on E finally yields F. 
The condition on n is 

n * \(m,2) = T) 3 Q) 3 (...T1 3 (in) ...)) (2.1.2) 

2d times 
This bound for n corresponds to the worst case when R. or R_ increase by only 
1 on each application of HP. 

Before describing HP, note the useful fact that because of Property 1, 
Properties 1 and 2 are preserved under a-mergers. 

Description of HP The lateral sequence of G is of length (v+l)«u-l for 
certain values of u and v that will be defined later. 

Consider the sequence 

EL 3 

(CD ((k-l)-u+l,k-u)' + ((v-D-u+1, (v-k+l).u) ) (2.1.3) 

for k = l,...,v. Sequence (2.1.3) is illustrated in Fig. 2.4. The two 
vertical lines represent G- and G„; the numbered horizontal outlets represent 
the lateral variables (with the corresponding number); the boxes indicate the 


variables and operators that take part in the formation of any particular 

cd,. .v and ilr,. • an 'x' beside a variable indicates that it is not set to 

the constant while 'a' indicates that it is set to a; the two checked boxes 

represent the first member of (2.1.3). 

In the sequence (2.1.3), either (Case I) the ranges of c P//k_-i\. .i if. u "\ 

and ilr,, ., r1 , , ,.. N for 1 ^ k < v are included in R and R respectively, 
((v-k)*u+l, (v-k+l)-u) 1 2. 

or (Case II) not. 

Case I . If v is large enough, we can find q identical elements in the 
sequence (2.1.3). Let the indices k corresponding to these elements be 
k 1s ...,k . Performing an a-merger with this set as basis, the desired 
e-complex H is obtained. Note that in this case we use Property 1 of G. 
Namely if cp* is the first component of a pair in (2.1.3) whose range C R^ 
and if 9 is an arbitrary operator of G^ then 9(a,ep*) = cp* (similarly for 
the second components of the pairs in (2.1.3) and operators of G,,). Thus, 

the components of the identical pairs in (2.1.3) become the operators of H. 

d 2 d 2 

A bound for v is q« (d ) (d is the number of operators D ■+ D). 

Case II . Assume t P(/£_ 1 ). u+1 i. u )^ h ' C ^ ^ R l f ° r S ° me b » c ^ D and 
1 <= I* v (the case when ♦ ((v _ i) . u+1> (v .^i). u ) < b > c > * R 2 Can be treated 
similarly). But then 

cp,„ , nO>,c) t R n (2.1.4) 

for all ^ j ^ u-1 (as a consequence of Property 1). Provided that u is 
large enough, we can find an element e £ D that appears w times in the se- 
quence (2.1.4). Let the indices j corresponding to the appearances of e be 
j (1),... ,j(w) (see Fig. 2.5). Obviously then (all the variables considered 
except x« have been set to a), 


'(i-u-Mt), *-u-j(t-l)-l) (a > e) =e 

for 2 £ t ^ w (the first argument of cp . . corresponds to the lateral 
variable). Thus, 

^•u-jCt), i.u-j(t-l)-l) (a ' y) f R l U{6] =ld 
At this point we consider separately two cases: 

Case Ha ^ = 1 and j (w) = u-1. We perform an a-merger with the basis 

consisting of the variables with indices u-j (t)-l for 1 ^ t ^ w-1. As a re- 

suit of this we obtain an e -complex G' such that Property 1 holds for G' 

w-1 1 

on R 1 U [e] (G' satisfies Property 1 on Rl e R„; in any event R^-Ri = $). 
Note that at this point Property 2 is still satisfied only on R 1 and R by 
Gl and Gi respectively. 

Case lib . A 4- 1 or j (w) < u-1. We perform an a-merger with the basis 

consisting of the variables with indices 4«u-j(t)-l for 1 £ t £ w. As a result 

of this we obtain an e -complex G" such that Property (almost) holds for G " 

(the same remarks regarding G" and Property 1 as well as G", G^ and Property 

2 apply as in Case Ha). The only exception may be cd'.' (the operator of G" 

that is closest to the decoding function). By definition, cp" = ©,.. . ■ / -\_^) : 

and there is no assurance that cp''(a,e) = e. We may rectify this situation 

by absorbing co" into the decoding function in the following way: Let G" = 

9(G", G' r ). Now set x.. = a (after the a-merger the variables have been 

renumbered). Let S(G") - {Xj} = U. Then we have (G ,r ) ^ = G' = e(cp^(a,GJ), C-.1 ; 


ere G' equals G' minus co * and Gi equals G' with the input operator modified 

as follows: ilf! = -dr "(a, ilr ! r ) (remember that G' r is a reverse components, and, 
in w in 2 

hence, x, is attached to i]f''). Clearly, G' is an e -complex satisfying 
1 w w -l 


Property 1 with R- U {e] replacing R 1 . 

We can now resume considering Cases Ila and b together. To obtain H 

(with components HL and H„) we must find among cp.' for 1 ^ i ^ w-1 q operators 

that are identical on the inverse image of R.. U (e) (i.e., (2.1.1) with 

R- U [e] replacing R ) and again perform an a-merger. We again emphasize 

that the operators of H are identical only on the inverse image of R , and 

this property has not been violated by any of the transformations of the 

original e-complex G. 

To obtain q operators that are identical on the inverse image of R. U { e} , 

2 2 

it is sufficient that w-1 ^ q-d ; therefore, u ^ d. (q.d +1) and 

T] 3 (q) = q 2 .d-6 3 + q .d.(6 2 +6)5d-l (2.1.5) 

d 2 
where 6 = d . This is obtained from the values of u and v derived above. 

Recall that T] (q) = (v+l)«u-l, the length of G, 7L for r = 2 can then be 

obtained from (2.1.5) and (2.1.2). 

The proof for the general case is obtained by defining the Generalized 

H Procedure (GHP) with the corresponding function Tl, . We consider instead 

of (2.1.3) the sequence 

1 r' 1 r" . 

, (s,t)"-" a5 (s,t)' Vs-.f) V,t) ; 


\a,\.) \*,^-) (>s ,z ) - ' vs ,u; 

1 r' 
where s = (k-l)-u+l, t = k»u, s' = (v-k)-u+l, and t' = (v-k+l)«u. co. ,- . . ,cp 

1 r" 
denote the operators of the standard components of G while i|r. ,. . . ,i|f. denote 

the operators of the reverse components (r ' + r" = r). Without detailed 

argument we state that in the general case v = q« 6 while u remains the same 

(u is determined by the requirements of Case II at which time only one 


component is considered). From this we obtain analogously to (2.1.5) 

\(q) = q 2 -d.6 r+1 +q-d-(8 r +6)+d-l 
TL(m,r) can then be obtained from 

T) 2 0n,r) = n 4 CTl 4 (...T1 (m)...)) 

r«d times _ 

which is an analog of (2.1.2). 

2.1.9 Remark 

As we have seen, TL in Lemma 2.1.8 depends on r, the number of components 
of F. However, we shall mostly be using e-complexes that contain many components 
that are identical except for the input operator (such e-complexes are obtained 
e.g., by the use of Proposition 2.1.7). It may be checked that in the application 
of GHP only one representative from each such group of components need be 
considered. This significantly reduces 7], . Similarly, in computing TL from 
(2.1.5), only &•& compositions are required where I is the number of groups of 
similar components (corresponding to d compositions for each group of similar 

2.1.10 Remark 

The operators of a homogeneous e -complex F obtained as a result of 

applying Lemma 2.1.8 possess an added property that will be used later: Let 

R be the range of cp(x,y) an internal operator of R (x corresponds to a lateral 

variable); then CD (a,y) R = id (the identity on r). This fact follows from 


the definition of HP (GHP). This particular property of the operators of 

the components of F will be called the I -property relative to y. In what 



follows, we will always suppress "relative to y" since there is no danger 
of ambiguity. We will simiarly suppress the subscript R unless we will be 
interested in a specific range. We will abbreviate "F is an e-component 
(complex) whose operators possess the i a -property" to "F is an e-component 
(complex) with the I -property". 

A familiar and convenient way of representing a binary operator cp(x,y) 
is by a labeled directed graph. The graph of cp, denoted by T (?) , is defined 
as follows: The nodes of F(cp) are labeled with elements of D. A directed 
arc labeled with a £ D exists from b to c if and only if cp( a ,b) = c. 

If D = {1, 2, 3, 4} and R = (2, 3, 4] an example of a graph T(co) for 


an operator cp with the I -property is shown in Fig. 2.6. 

Given an arbitrary e -component F, the output of the operator Cp is 


in the case of a homogeneous e -component with internal operator to this will 
be abbreviated to 

cp(x. x, ,,. ..x , cp. (y)). 
k k+1 n' in J 

2.2 The Generalized Specker's Theorem 
We first give the following 

2.2.1 Definition 

Let $ be an arbitrary basis and a € D any constant. Let F be a formula 

over § with S(F) = X U Y U Z such that Ixl ^ n -1 (the maximal number of 

1 ' max 

arguments of an operator of $), Y is disjoint from X, but otherwise arbitrary, 


and Z = {z} is a singleton (disjoint from X and Y) such that z occurs only 
once in F. The set of functions f(X,z) represented by all possible such 
formulas F with the elements of Y replaced by the constant operator a will 
be denoted by $ . 

In particular, every operator of $ with all but k ^ 1 arguments (k is 

•a q 

arbitrary) replaced by a is in $ . Note that if cp(X,z) €. $ , then tp may 
qualify for $ by virtue of a number of different representations. If z 
(or any variable of X) in any one of them corresponds to a variable that 
occurs only once, it is called a distinguished argument ). The other arguments 
are called free arguments . Thus we may easily find a basis $ and an operator 
cp such that all arguments are at the same time distinguished and free. 

We now define a restricted class of e-components and e-complexes: § is an 
arbitrary basis and a f D is any constant. Let F be an arbitrary e Q -component; 
then F is an e n ~component over $ . Let cp(x,z) f $ be a binary operator, z 
a distinguished argument (hence x is free), and G an e .-component over $ ; 
then cp(x,G) is an e -component over $ . An e-complex over 9 is an e-complex 
such that all its components are e-components over $ . 

The main result of this chapter is 

2.2.2 Theorem 

Let there be given the function f : D -► D such that L(f ,$) ^ c.n for some 
constant c and basis $. Then for any m ^ 1 and a f D if n ^ T] (c ,m), there 
exists a subset Y of the arguments of f such that |y| = m and f is either a 


constant or is represented by F, a homogeneous e^-comp lex over i a with the I -property, 
Y as the set of lateral variables, constant input operators, and r <■ d- (2c-l)+l. 



If f has m fictitious arguments, then let Y be the set of these arguments, 

and f is a constant. From now on we assume that f has ^ m-1 fictitious 


The statement of the theorem gives us that there exists a formula E 
over $ such that L(E) ^ c«n. Therefore, there are ^ 1/2. n variable symbols 
representing the arguments of f which either do not appear in E or appear 
^ 2*c times. In other words, there are s l/2n-m+l variable symbols t^at 
actually appear in E and such that the number of occurrences of each is ^ 2-c. 
Denote the set of these variables by X- . 

If n ^ 2- T| (n„,2c >m-l, we can apply Lemma A. 9 and obtain a subset 

X 2 
X £ X with |x | = n and such that E is equivalent to E_, and SPCeC over 

$ a with at most 2c components and such that the set of lateral variables of 

every component is X„. 

If n„ ^ IWn-^c) we can apply Proposition 2.1.7 and obtain a subset 

X r 

X, <: X with JX,| = n and such that E 3 is equivalent to E3, an e n -complex 

over f a with r £ d«(2c-l)+l). The estimate for r is obtained at this point. 

If n„ > Tl 2 (m,2c). we can apply Lemma 2.1.8 and Remark 2.1.9 and obtain 

F, the desired homogeneous e r -complex over ? with the I -property. The 

I property is a consequence of Lemma 2.1.8. 


Discussion of TL. The present proof yields 

T1 5 (m,c) = 2. 1| 8 CT1 1 (Tl 2 <:m,2c), 2c ),2c)+m-l 

The exact representation of TL is extremely complex, and in what follows we 
shall use only a very rough approximation. In Appendix A it is seen that 
71 (t,k) (as a function of k) grows faster than iexp(b,2k) for any constant b. 
The functions TL and TL contribute only insignificantly to this, and thus we 

TL(m,c) ^ iexp(b,4c) for c ^ c(b) 
5 (2.2.1) 

and an arbitrary constant b 

(Later we shall se that the size of TL prevents us from obtaining any 

interesting bounds for the functions investigated with Theorem 2.2.2. The size 

of TL which contributes most to TL results from the technique used in Lemma 
o 5 

A. 3 to obtain a nesting sequence for a given formula F. It is not known 
whether this technique can be improved. Our guess is that it cannot be.) D 

2.3 On Specker's Theorem 

In this section it will be shown how Specker's Theorem follows from 
Theorem 2.2.2 (the statement of Specker's Theorem is given in section 1,5). 

In Theorem 2.2.2 set D = (0, lj , $ = £, a = and let f be as described. 
Then by Theorem 2.2.2, we obtain that for an appropriate choice of n, we can 
find a subset X of the arguments of f with |x| = m and such that f Q is either 
a constant or represented by 

♦(F 1 ,...,F r ) 


where ty:[0, 1} -» (0, 1] and F for 1 ^ i ^ r is either a standard or reverse 

homogeneous e -component over Z with the I -property and constant input 


operators. The value of r is bounded as described in Theorem 2.2.2. 

We now analyze the various functions that can be represented by e-compo- 
nents with these restrictions. First note that S consists of all Boolean 
binary operators; furthermore, if f(x,z) € E » then botn x and z are free 
(because every Boolean binary operator can be represented over S in such a 
way that each variable appears only once), and thus there are no restrictions 
on the use of operators in the e-components we encounter. 

All possible graphs T(cp) for cp f 2 are shown in Fig. 2.7. The ones that 

satisfy the I -property are starred. The functions obtained by choosing a 

value for the constant input operator for the starred graphs are shown in 

Table 2.1. In general, this function is either b n ffb.- tt where tt = IT (1 *x.) 



or c„ © c'CT where a = .$, x. for some values of b_, b. or c n , c-. Now, taking 
1 i=l i i u i 

into consideration the fact that rr« a = 0, and that every |: (0, 1} -» -0, 1} 
can be uniquely expressed as a Boolean polynomial 

2 h -l 

ffi c.«M. 
i=0 1 1 

where c. f {0, 1] and M is the monomial (of degree one in each variable) in 

those among x, , . . . ,x, corresponding to nonzero bits of the binary representation 

of i(see Lemma 4.5) we obtain the first part of Specker's Theorem. 

The second part of Specker's Theorem could be obtained directly at this 

point; however, we will derive a generalization of it in Example 3.1.3, and thus 

omit it here. 


It must be pointed out that our derivation of Specker's Theorem results 
in a slightly larger bound for n; however, since no known application requires 
a specific value for the bound, this is immaterial. Specker's bound (see 
[H068]) is obtained from the function 

y.(m,0) = m 

H(m,k) = 4 (k+1) .n(u(m,k-l) } k-l) 

by setting T) (m,c) = 2|j.(m,2c). Our bound is slightly larger due to the addi- 
tional processing implicit in the application of Proposition 2.1.7 and Lemma 
2.1.8. However, u resembles T| and this function by far contributes the most 
to T] ; thus, we can state that the bounds are approximately equal. 

Finally, let us note the fact that Theorem 2.2.2 allows us immediately 
to amplify Specker's Theorem. Namely, the statement of the theorem involves 
the basis consisting of all binary Boolean operators. However, the proof 
of Theorem 2.2.2 works for bases consisting of operators of an arbitrary 
number of arguments. 


^r) — (£)-•• — (£) — ©— O 

T(F) where F is an e -component 
Fig. 2.1 

^J) @ — @ — @ — Q 

© © 

T(F) where F is a reverse component of an e-complex 
Fig. 2.2 


F is an SPCeC 
Fig. 2.3 







X { 


v.u-1 ■ 













v»u ■ 





u - 




Illustration of the HP procedure (I) 
Fig. 2.4 

(\r 1 1 ^ . ii 1 


— !> 


(4-1). u-1 

X.u-j (w) 


( (J e-i).u-i^.u) (b ' c)? R i 

Illustration of the HP procedure (II) 
Fig. 2.5 

2, 3 

F((p) for an operator cp with the I f2 3 , -property 

Fig. 2.6 


0, 1 


1 ) (J) 


0, 1 

1 ) (!>) 




( ) ( 1 ) 


0, 1 

The graphs of all Boolean binary operators 
Fig. 2.7 



<P (y) 










rid® x > 








© X 

i=l i 




1 ffi © x. 




i © n (i ®x.) 

i=l x 







1 1 


Table of functions that can be represented by e-components 
with the I -property (see Fig. 2.7) and constant input oper- 
ators if D =« {0, 1} 

Table 2.1 




The principal results obtained previously [H068, Ho70] by the use of 
Specker's Theorem are 


A new proof that the Boolean function .§- x. is of nonlinear length over 

II . This is accomplished as follows. First note that the restriction of the 

mod 2 sum of n variables obtained by setting certain variables to is again 

a mod 2 sum (but of a smaller number of variables). Now apply Specker's 

Theorem (see 1.5). Suppose .6. x. is of linear length over II. Choose n 

large enough to obtain m = 3. The theorem states that for this particular 

bases c = in (1.5.1). However, it can be checked that in this case no 

choice of c n and c. will yield the mod 2 sum of three variables. A contra- 



The function f: {0, l} n -» {0, 1] defined by f = 1 if and only if E x = 

mod 3 is of nonlinear length over E. We proceed similarly as before. Assume 

it is of linear length. Apply Specker's Theorem with n sufficiently large to 

Of course the results of Subbotovskaya and Khrapchenko are stronger for this 

particular example. 


obtain m = 3. If we replace x.. , x„, x„ in (1.5.1) once by 1, 0, 0, and 
another time by 1, 1, 1, then the value of (1.5.1) remains unchanged. However, 
the value of f (with all variables except x.. , x„, x replaced by the 
constant 0) is different on these two assignments. Again a contradiction. 

Both of these results were derived by Hodes and Specker in [Ho68]. 

We might note that the technique of 3.0.2 can easily be generalized to 
counting mod k where k is an arbitrary integer (see (1,5.2)). 


Certain geometric predicates (see [Mi69]), in particular the connectivity 
predicate, are of nonlinear length if expressed with binary Boolean operators 
(this result was obtained by Hodes in [H070]). We will not discuss this in 
greater detail now since this technique will be treated later in 3.2. 

In this chapter we will use Theorem 2.2.2 to generalize all these results. 

3.1 Counting mod p 

Consider the function (0, 1} -» (0, 1} 

1 if Z x. = mod p 

i=l X 

f n (x 15 ...,x n ) - 


3.1.1 Theorem 

If p is a prime, if |d| < p, then f™ is not of linear length over an 
arbitrary basis. 



Suppose the statement of the theorem is not true. That is, there exists 
a prime p, a finite set D such that |d| < p, a basis § of operators on D, and 
L(f , $) £ en for some constant c. 

First note that if X is a subset of the arguments of f P and jx| = m, ther 

(f n ) Q = f P . We can now apply Theorem 2.2.2. For an arbitrary m, if n is 
sufficiently large, there exists a subset X of the arguments of f P with |x| = m 
and such that (f )_ is represented by a homogeneous e -complex F (over $ and 
with the I -property) with X as the set of lateral variables. In addition, 

since f is a Boolean function, the lateral variables of F are restricted to 

n ' 

(0, 1]. 

Consider now any component F. of F and T(cp ) where cp is the internal 
operator of F.. Since cp has the I -property, T(cp ) has the general appearance 
of Fig. 3.1. F. is determined by cp. and the constant input operator a.. Now 
let m ^ d and consider the sequence (s . ) for ^ j < m where s~ = a. and 

s - cp. (11. . . 11, a. ) for j > 0. Let s, ,.,. be the first element in the sequence 

j times 
that is repeated at some later point; in fact, let k(i) be the position of the 

first occurrence of this element. Let k(i)+j£(i) be the position of the second 

occurrence of this same element. Then we shall call k(i) the prefix of F. 

while 4(i) will be called the period of F . . 

Clearly, if F. is a standard (reverse) component, then cp. (x.x 2 ...x . 

11... 1, a ) (cp (x x _ . ..x 11... 1, a )) where k ^ k(i) is a function of the 


er of l's among x,, x„,...,x , (x, -,..., x ) mod 4(i). 


If we set Y = {x, ..,..., x , } and choose k-ClO to 

exceed or equal the prefixes of all the reverse 

(standard) components of F, then F. represents a 

function of the number of l's among the variables of 

Y mod jgcm(A(l),...,4(r)). (3.1.1) 

On the other hand, by the initial assumption, F. is a function of the number 

of l's among the variables of Y mod p; this results in a contradiction since 

d < p. 

On the basis of (3.1.1) we can obtain the following 

3.1.2 Theorem 

Let D be an arbitrary domain, 5 is a certain basis, and p is an arbitrary 
integer > 1. If 5 is such that any e-component over $ with the I -property 
and constant input operator has period one, then f p is of nonlinear length 
over $. 

3.1.3 Example 

This is an example of a basis satisfying the hypothesis of Theorem 3.1.2. 

Consider an arbitrary domain D = (0,1 d-1] . Then a complete basis 

for D is ♦ - (min(x,y), max(x,y) ,0,1, . . . ,d-l, e Q (x),. . . , e (J _ 1 (x)} where min 
and max are defined in the usual way, 0,...,d-l are the constants, and 

f d-1 if x « i 

e,(x) = 


S I W = 1 

I ot 


(Note that i|r, _ -. = { A,V,0,l,-,id] ; thus, a result on the nonlinearity of the 

length of a certain function over i|r, _ . , is also a result on the nonlinearity 

of the same function over II; in particular, applying Theorem 3.1.2 to ijr, - .. , , 

we obtain (2) of Specker's Theorem.) 

i|r is interesting because it gives rise to an analog of the disjunctive 

normal form for arbitrary D: Consider the table for an arbitrary function 

f : D -> D. Then 

d n -l 
f = max (M. ) 
i=Q ^ 

where M equals if the current assignment is not the i assignment and the 
value of the function at the i assignment otherwise. M. is represented as 

M i -"^.(I.nfr]? e a(i,n) (x n ) ' f i ) 

where a(i,j) is the j component of the i * assignment. 

We claim that i|f satisfies the hypothesis of Theorem 3.1.2. Note that 

a b 
i|; = i|f for all a, b £ D because iL contains all the constants. Therefore, 

we will write simply i|r 


Given cp € jr with the I -property, the statement that there exists b £ D 

such that the homogeneous e-component with internal operator f and b for its 

input operator has period & is equivalent to saying that there exists a subset 

LED with |l] = & and cp(0,z) L is the identity (id ) while cp(l,z) L is the 

permutation with cycle length & (p^)« 

We contend that for any L £ D, cp(x,z) € ± and c, e € D, if cp(c,z) L 


and cp(e,z) L are 1-1, then cp(c,z) L = cp(e,z) L. Since id 4 pj. if I > 1, 
this will establish the original claim. 

This can be proved by induction on the depth 6^ of the distinguished 
variable z in the formula F that represents cp (since there may be many such 
formulas, let F be one of the formulas where the depth of z is minimal. 

If 6 =1 then either F = max(F',z), or F = min(F",z). Assume the first 

case (the second can be argued similarly). By definition of jf , F contains 

only the variable x. If we replace x by c , F' represents a constant c' f D. 

Now if c' ^ L, then cp(c,z) L is the identity, otherwise it is not 1-1. 


If 6 > 1, then either cp(x,z) = e. (cp 1 (x,z)) where cp' £ jf_ and 6 , < 6 
or cp(x,z) = cp"(x,cp'"(x,z)) where cp", cp"' f^ and 6 6 < 6 (to see this, 
think of F). In any case cp', cp", and cp'" satisfy the inductive hypothesis, 
and we are done. 

Note that Theorem 3.1.1 and 3.1.2 hold with f P replaced by the function 


g p : D n -♦ {0,1} given by £ x = mod p since g£ f (0,l] n = fj\ 


3.2 Connectivity 

The connectivity predicate was already discussed in 1.6. It attacted 
considerable attention after Minsky and Papert [Mi69] succeeded in obtaining 
interesting results on the complexity of perceptrons that represent the 
connectivity predicate. Works that follows [Mi69] and that treat specifically 
the representation of the connectivity predicate by finite operators are, e.g., 
[H070], [Mi71], and [V170], 


Minsky and Papert describe a circuit for computing the connectivity 

predicate of depth (of the order of) (log„n) which on intuitive grounds 

seems minimal. This circuit translates into a formula of nonpolynomial length. 

Thus, the connectivity predicate seems to be a good benchmark for testing 

estimation methods for the complexity of functions (i.e., any appropriately 

general method which is presumed able to give estimates for length up to 

log 2n 
f(n) < n should declare the connectivity predicate complex). 

Consider a set of n variables {x. .) for 1 Si, j ^n; then the connec- 

ij 2 

tivity predicate is the function c : {0, 1] -* {0, 1} defined as follows: 

(we will not give a formal definition since the formalization is obvious) 

Given a specific assignment to the variables, consider it as a square array 

of 0's and l's. Then c = 1 on the empty pattern (i.e., consisting of all 0's), 

or if the l's form a connected pattern. By "connected" we mean that any two 

l's can be linked by a sequence of adjacent l's (two l's, corresponding to the 

variables x . and x, . are adjacent if !i-kt + !j-^| = 1). For example, the 

pattern in Fig. 3.2 is connected. 

The general approach used here to obtain an estimate for the length of 

c is to consider reductions of c . 
n n 

Given an arbitrary function f , a function g will be called a k- reduction 
of f if g is obtained from f by replacing each argument of f by a function with 
at most k arguments. 

Suppose we want to prove that f of i = 1,2,... arguments is of nonlinear 
length (over the basis $). Assume there exists a k-reduction g. of f. such 
that the number of arguments m in g is m > Ct-n for some constant < CC ^ 1. 
Assume L(f ,$) ^ en for some constant c. That is, there exists a formula 


F for f and L(F ) ^ c«n. Since the length of any function of k arguments 
is bounded by L(k,$), we obtain 

L(g n ,$) ^ c- L (k,5)-n 

by making substitutions for variables in F . 

If [g.] is rearranged (and renumbered) in the order of increasing number 
of arguments, and all but one functions with the same number of arguments 
are deleted, then we obtain 

L(g,$) ^ c' -m 

for some constant c'. Finally, if we can prove (e.g., by applying Theorem 

2.2.2) that g, is nonlinear, we obtain a contradiction, and we are done. 

Hodes [Ho70] shows the nonlinearity of the length of c over S by reducing 

c to the function 


V (A y ) Ay 
1=1 jtt J 

(i.e., exactly one variable is 1) which can then be proved to be nonlinear 
using Specker's Theorem. Unfortunately, this reduction does not work for 
domains with more than two elements because this function is linear over an 
appropriate basis in such domains. However, another approach works, and we 
can state 

3.2.1 Theorem 

Regardless of the size of D and the nature of f, c is of nonlinear 

length (i.e,, L(c ,$) ^ en is not true for any D, §, and constant c). 



Minsky and Papert [Mi69] succeed in reducing c to counting mod 2 by 

exhibiting a contact network such that its connectivity depends on the number 

mod 2 of contact variables equal to 1, and then by simulating this network 

on the square array of variables (they call it the "retina"). We shall proceed 


c is reduced to the function s P : {0, 1} -» {0, 1} (for an appropriate 
t n 

t) defined as follows: 

1 if > arguments are equal to 1 

s n = 


s , can be represented by the connectivity of a contact network S . S is 
n n n 

shown in Fig. 3.3a. It has p contact arms for each variable y^ The value 

of y. corresponds to the upward position of the corresponding arms while the 

1 value of y. corresponds to the downward position. The contact arm of S^ 

are arranged in p rows (n arms in each row). Whenever an arm for y. is in 

the upward position, it is connected to an arm for y..-, i n the same row; if 

the arm for y. is in the downward position, it is connected to an arm for 

y. in the next row. Thus, it may be easily checked, point A^ is connected 

to B , if and only if at least p among y, .•••,y are !• It: ma y also be 
p+1 1 n 

verified that in this case all the contact arms in the network are connected 

either to A„ or to B ,, and since these two are connected together, the whole 
U P+1 

network is connected. 

S p in turn can be simulated by a rectangular array R P of 0*s and l's 
n n 

where certain positions are constant and others depend on the y^s (see 
Fig. 3.3b). The size of / is (3 (p+l)+l) . (3n+p-l). 


We now show that s P is a 1-reduction of c. for some t. This is done 
n t 

by cutting R P into smaller rectangular pieces along vertical lines. The 
first piece is of length (£-l)-q where q = 3(p+l)+l and I will be defined 
later, the second through SL-lst piece is of length (£-2).q, while the I , 
last, piece is of length between 1 and (4-1) «q. These pieces are then 
arranged into an I- q * &' q square pattern T P as shown in Fig. 3.4 (the 
arrangement depends on the parity of I). Corresponding rows of adjacent 
pieces are connected by =>- or c- shaped patterns of O's (in the case when 
one of the positions along the cut at the row in question is 0) or l's. 

The unused positions of T P (corresponding to the case when the last piece 

r n 

is not of the maximal length) are replaced by O's. 

t is set to l-q and the variable x. . in c t is replaced by the correspon- 
ding position in T P (one among 0, 1, y^ or y). Obviously, the function 
obtained by this replacement is s P . If 3n+ P^ :> 1, I is obtained as r x"l where 
x is the positive solution of 

2 , ,^ 3n+p-l 
x -2(x-l) = ~ 

We also have that n «(l/3q)t, and, thus (by the reasoning outline previously), 

if c is linear so is s. 
t n 

With the assumption that s P is linear, we apply Theorem 2.2.2 with a = 0. 


In this way we obtain that (s P )q where Z is a certain subset of the arguments 

of s P of size m is represented by an e -complex F (with the requisite restric- 
n m 

, . p N Z p 
tions) where m is arbitrary. Note that (s^; Q - s^. 


As noted in 3.1, if a sufficiently large number of variables at the 
beginning and end of the lateral sequence of F is replaced by 1, then F with 
this substitution represents a function of the number of l's among the 
remaining variables mod the 1cm of a set of integers £ d. The number of 
variables that have to be set of 1 is u i 2(d-l) (at most d-1 at each end 
of the lateral sequence). Thus we obtain a representation of the function 
S m-u if P ^ U * If P ^ 2(d-l)+2, we obtain a function s. for i ^ 2. However, 
it is clear that s. is not a function of the number of l's mod k for any integer 

k. Thus, we have arrived at a contradiction, and, hence, c is of nonlinear 

' ' n 

length over any basis ?. D 

3.3 The Length of Symmetric Functions 

As we have seen in the previous examples, Theorem 2.2.2 has been applied 
only to functions that are either symmetric or that can be reduced to symmetric 
functions. While we know of no formal statement that can be proved and that 
asserts that this indeed exhausts the applicability of Theorem 2.2.2, it 
intuitively seems probable. 

In this section we will discuss several bounds on the length of symmetric 
functions (both specific functions and all symmetric functions). Recall that 
in 1.4 we have already mentioned several such bounds (Subbotovskaya, Khrapchenko). 

All of the results in this section were suggested by A. R. Meyer. 


Does Theorem 2.2.2 (or Specker 's Theorem) give us any information on the 
length of the functions investigated? Hodes and Specker do not treat this 
subject, and, in fact, the bound that can be obtained is very weak; however, 
we do mention it for the sake of completeness. 

In an application of Theorem 2.2.2 (or Specker 's Theorem) to a certain 
function f, we proceed with the assumption that L(f,$) ^ c*n. To apply 
Theorem 2.2.2 we must have n ^ Tl (m,c) where m is a sufficiently large number 
to obtain a contradiction. However, m does not depend on c. Thus, n depends 
only on c and m is assumed constant. 

Consider now c as a function of n. We ask what is the maximal value 
c for c(n) for which Theorem 2.2.2 can be applied (and a statement contradic- 
ting L(f,$) ^ en obtained). c(n)«n is then a lower bound for L(f,$). Due 
to (2.2.1) c grows slower than (l/4)«ht (p,n) where ht (p,n) = maximal x 
such that n ^ iexp (p,x). Then we have 

c(n)-n £ (l/4)«ht (p,n)-n (3.3.1) 

for an arbitrary constant b > 1 and for sufficiently large n. 

This bound seems unrealistically low, and it is useful to compare it 

with known bounds for length for the particular function f over some bases 

consisting of Boolean operators (we will suppress the subscript n). 

It has already been established that f is of nonlinear length if D = 

3 3 3 1 
(0, 1] (see 3.1). We introduce the following notation: f = f ' , f ' , 

3 2 n 

and f ' stand for S x = 0, 1, and 2 mod 3 respectively. We will repre- 

sent f 3 '°, f 3,1 , f 3 ' 2 by the formulas F°, F 1 , and F respectively. F is 

obtained by the following recursive relation 


F°(x) = F°(Y) A F°(Z) V F^y) A F 2 (Z) V F 2 (Y) A F X (Z) (3.3.2) 

(If X is the singleton {x}, then F (X) = x) 

L(F°(X)) = L(F°(Y))+L(F°(Z))+L(F 1 (Y))+L(F 2 (Z))+L(F 2 (Y))+L(F 1 (Z)) 

1 2 

Similar identities can be obtained for F (X) and F (X). When these identities 

are used recursively, we obtain 

i lo §? 6 9 fi 
0(L(r,$)) £ n «ri (3.3.3) 

an exact description how we obtain a bound of the form (3.3.3) from a recur- 
sive relation similar to (3.3.2), see [Ya54]. 

This upper bound can be further reduced by using multiargument operators. 

1 R 3 

Let y: (0,1,2] -+{0,1,2} be the operator £ y. mod 3. Then f can be 

i=l X 
represented by a formula G which uses y recursively (i.e., the arguments of 

f 3 are repeatedly divided by I together with an outermost decoding operator 
{0,1,2} -+ {0,1}. G is of linear length. If we use D = {0,1}, y can be en- 
coded by two operators v ' and y"> and G translates into a formula such that 

log n log,k 
0(L(G)) = (2k) k = n (3.3.4) 


Thus, as & increases, the upper bound for L(f ,$) (where y $) approaches 

c*n. However, the gap between this bound and (3.3.1) is still huge. But, 
the important thing to note is that any theorem that retains the same broad 

assumption (bases with an arbitrary number of operators) as Theorem 2.2.2 

cannot yield a better bound for f than (3.3.4). 


Another example of a function that is nonlinear in length (over all 

Boolean binary operators) by Specker's Theorem is f . However, it too has 

a relatively short representation (the previous and this example show that 

Theorem 2.2.2 is a sensitive tool for deriving the nonlinearity of functions; 

i.e., it can be used on functions that are only "slightly" nonlinear). 

A representation for f with Boolean operators is obtained by dividing 

the arguments of f into disjoint (nonempty) pieces Y and Z, and adding the 

4 4 
bineary representations of f (Y) and f (Z). Let the binary representations of 

f (X) be given by the formulas F ' (X) and F"(X), obtained by the following 

recursive relations 

F"(X) = F"(Y) a F"(Z) 

F'(X) = F'(Y) © F'(Z) © F"(Y) A F"(Z) 
(If X is a singleton F"(x) = x and F'(x) = 0) 

L(F"(X)) = L(F H (Y))+L(F"(Z)) 

L(F'(X)) = L(F'(Y))+L(F'(Z)+L(F"(Y))+L(F'(Z)) 

By choosing Y and Z always as equal as possible, we obtain 

L(F"(X)) = n 
0(L(F"(X))) = n-log 2 n 

Since f (X) is represented by F ' (X) A F"(X) n'log 2 n is also a bound for 

0(L(f ?) where ©, A 3. 

We now turn our attention to an upper bound for the length of all 
symmetric functions. 


Note that a symmetric function g: D n -» D where D = -0,1, . . . ,d-l] depends 
exclusively on N.,,...,N d , where N. is the number of variables equal to i. 
It can be represented, e.g., as 

g = max(M ) 

n (l)"'» n (d-l) 

where M ,^ nCd-l") ec l uals t if N = n(i) and is otherwise. The number 
of combinations of n(l),. . . ,n(d-l) is a polynomial inn - ( ~ ) — and the 
max function can also be represented in polynomial length, regardless of the 
basis i. The latter fact is established by representing max using the two- 
argument max recursively. Thus, if M . . (a-\\ were polynomial, g would 
also be. 

M. J. Fischer and A. R. Meyer discovered that M ,. v ,, n «. can, 

n(i; ,. . . ,n(d-l; 

indeed, be represented in polynomial length by using a special code for 
integers described by Avizienis [Av69], 

We will illustrate the construction on Boolean symmetric functions. It 
will be seen that if the basis of operators is appropriately chosen, the 
length of an arbitrary symmetric function is bounded above by a polynomial 
of a surprisingly low degree. 

The Avizienis code is a redundant positional representation of integers 
to an arbitrary base b > 2. We describe it for b = 3. 

An integer n is represented by all possible Tlog_nl - tuples 

a riog 3 nT'"' a l 
where a ± € {-2,-1,0,1,2} for 1 ^ i <: flog^n! and 


Tlog nl 

h a. J 

i = n 


The property that is exploited is that there are no long carry's in 
addition. Thus, if we want to add two Avizienis coded integers a = a.a, ..... 
a. and b = b,b , ...b. , we can do it in two steps using the following 

3.3.1 Algorithm (Avizienis) 

(1) Find the carry c and intermediate sum r such that 

a.+b. = 3c, +r. 
i i i i 

where a ± , b € {-2,-1,0,1,2} and c^ r ± € {-1,0,1}. 

(2) Compute the sum s according to 

S i = r i +C i-l 

Let us estimate the length of the formula representing any ternary place 
in the Avizienis representation of N.. for X = (x^,...,x }. 

Again let X = Y U Z and Y H Z = 0. r ± (X) and c^X) can be represented 

R t (X) = p(R i (Y),R i (Z),C i _ 1 (Y),C i _ 1 (Z)) 


C^X) = X(R i (Y),R.(Z),C i _ 1 (Y),C i _ 1 (Z)) 

If X is a singleton, r. and c. are if i > 1, or 1 and respectively if 

i = I. P and X are certain operators £-1,0,1} -» {-1,0,1} which can be 

obtained from the definition of Algorithm 3.3.1. Strictly speaking, the 


domain used here is not permitted in our definition of finite functions; 
however, the difference is merely one of coding. Thus, 

L(R i (X)) = L(R i (Y)+L(R i (Z)+L(G i _ 1 (Y))+L(C i _ 1 (Z)) 


MC^CX)) = L(R i (Y))+L(R i (Z))+L(C i _ 1 (Y))+L(C i _ 1 (Z)) 

If we use these relations recursively and always make Y and Z as equal as 
possible, we obtain 

0(L(R i )), 0(L(C.,)) ^ n 2 

for 1 £ i ^ Tiog nl. If D = {0,1}, we need two bits to encode r and c^. 
Therefore, using certain operators p', p", X' , and X" to encode p and X, 
we can encode R. (X) and C. (X) and combine them into a { 0,1] -formula A.^ 
representing the i ternary place of the Avizienis representation of N^. 
We have 

0(L(A i )) ^ n 3 (3.3.5) 

Let there be given a positive Avizienis coded number a = a a _ ...a . 

We desire to convert it into its binary equivalent b = b b _...b 1 . Let 

q q-1 1 

U £ (a ,...,a.}. Then we define b (U) = 1 if and only if the i bit of 
pi i 

E a.3 1 is 1. Note that even if a is positive, b.(U) may be negative 

a.€u x 

for some i and U. This is further discussed below. For the moment we assume 

that b. (U) is always positive. We can then again compute b.(a . ..a^) by a 

recursive method. Let U = V U W, V w = 0. Then b^U) is the i bit of 

the sum of b (V)...b.(V) and b (W)...b 1 (W). 
q 1 q 1 


b.(U) is represented by the formula B. 

B i (U) = P(B.(V),B i (W),G i _ 1 (U)) 

, _ , . th , 
where G. represents the carry from the 1 place; 

G.(Y) = Y(B 1 (V),B 1 (W),G 1 _ 1 (U)) 

G n represents the constant and P and y are certain Boolean operators. Then 
we obtain 

L(B.(U))« E L(B.(V))+L(B (W)) 
1 j=l J J 

If a is the Avizienis representation of a number ^ n then p = llog^n I and 

q = Tlog^nl . Thus, we obtain the following bound for L(B (a)) 

log log.n 
0(L(B 1 (a») < (2q) » 

log log n 
(21og 2 n) 

(l+log 2 log 2 n) log 2 log 3 n 
log 2 n 

« n 


Note that (3.3.6) means that 0(L(B.(a))) ^ n £(n) for 1^ i ^ Tlog^l where 
e -+ as n -» °°. 

It has already been remarked that b(U) need not be positive. Thus 
b(U) must be treated as a signed number. If we use the l's complement 
representation and the en-around carry technique (see, e.g., [Gr59]), addition 
can be performed as follows. Let b(U,g Q )=G(V)+b (W)+g Q where g Q is either 


or 1 and let g 1 denote the carry from the highest position of b(U,0). Then 
b(U) = b(U,0) if g x = and b(U,l) if ^ = 1. This means that in (3.3.4) 
we obtain (i.q) P for some constant SL instead of (2q) exp where exp has the 
value given above. 

If a^ for 1 <: j £ riog 3 nl . in B i for 1 ^ i ^ Tlog^l is replaced with 
A , we obtain formulas F^X) representing N in binary form. Combining 
(3.3.5) and (3.3.6) we obtain 

0(L(F i (X), ))) £n 3+6(N) 

where e(n) -» as n -* °°. 

To obtain the desired representation of an arbitrary Boolean symmetric 
function, we proceed as follows: Consider the formula S. defined inductively 

S l = X 10 V x ll 

S i+1 = X i0 A S i-1 V x il A S i-1 (3.3.7) 

Take Sp. -| and replace x. Q and x. - by F. and F. respectively. It is easily 
seen that S p. -| with this replacement is identically 1 (this can be proved, 
e.g., by induction; for S.. it is trivially true, and the general statement 
follows from (3.3.7)). 

Let there be given an arbitrary symmetric function g. It is defined by 
a subset M C {0,l,...,n] of possible values of N- . Each branch of length 
Tlog^nl in T(S|- -i ) corresponds to one value of N- (given by the binary 
number j (Tlog^l) ,. . . , j (1) where K^nl , j ( Tlog^l) '* * ' > X 1, j (1) define the 
branch in question). If we remove branches of T(Sr. -i) corresponding to 
M, thus obtaining the formula S', and perform the substitution defined pre- 
viously to obtain the formula p, we obtain a representation for g. 


We have 0(L(S')) £ n and thus 0(L(P)) <. n +e(n) where e -» as n -» °° . 

Thus, if a basis $ is given that contains all operators used to obtain P, 

T / »\ ^ 4+e(n). 
then L(g,$) £ n 

In Lu70 Lupanov announced a result of Khrapchenko to the effect that an 

4 93 
arbitrary symmetric function is of length < n * # Since the assumption were 

not made explicit, and the result itself is unavailable a3 of this writing, 

no exact comparison can be made with the estimate above. 


> (x x .. 
i m m-1 

T(tp i ) where cp has the I -property (only arrows labeled with 
and 1 are drawn). T' denotes the set of nodes 
x„0,a.) where m is as described in the text and x , 
may assume arbitrary values in {0, 1}. 

Fig. 3.1 



A connected pattern of l's 
Fig. 3.2 


Contact network S for s P 

n n 

Fig. 3.3a 


O rH 

pa pa 













r-<i-400i-IOOt-lo\ /r-'OOr-IOOr-fOO'-'OO'-'O 

, p f 5 I \ 

1-1 I S*> r-l >i«-< OOr-to, i-IOO'-IOO'-<00'- ( OOr-IO 

', ' 

P P • \ 

r-i i-l i-l o IfM-H >,i-) o |i-'OOi-<00'-<00'- | OOr-40 
rH _ r-l C 

I r-l 
, P r-* P O O r-t ,-1 
|>, >« r-i i-l 

I r-l lr-1 

O, fi 

C>P C IH r-l O 



°* fl 7 


~-L-0 O <-* O O i-l o 

P Oj/i-li-li-l|>-,r-l>,i-IO O-i 

r-l pH f5 O O r-l i-l 
[>*, >> CM CM 

IrH I i-l 
r-l r-l r-l O P P O 

H H 

r-IOr-lrHr-IOir^i- 1 >~ir-IOOr-IO 

I rH lr-1 P P 

.P P OOi-lr-IO|rM-l>^r-IO 

p-, >, i-H i-H 

CO CO ! / -d" <f 

HHHO|>iH >,i-IO]{ HHQ^H >,r-IOOi-lr-IOi-"'0 

, , , co / i •* <t 

HrlOOHHO^rll) i-<OOr-<r-IO|>»i-< >•> r-1 O O r"l O 


cm cm ' j co co vt st 

■ _, I5>-.*-' >-.i-IOOr-lr-IM (>>r-l f>»i-IOOi-lr-lO|>^t-l >•, i-l O 

, cm cm / co co 

r-l l-l r-l O |>M-I >,HO MHO^^ >,r-IOOr-lr-IOr-IO 

I r-l O O r-l r-l O 



, <N / .»■!*' I 

|r*,rH/!i-JOOr-li-lo|(»i-l >*rH O O r-l O 

r-l r-l ! i CM CM CO CO 

H I^H >m-4 O O r-l r-l / ' / >, r-l >, r-l O O r-lr-t O |>% r-l >■, i-l O 

r-l r-l \ I CM CM 

i-l r-l r-l O |>v-< >,W O / j r-l r-l O fc>, r-1 >^ r-l O O r-l i-l O r-1 O 

—Iff CM CM 

r-IOOr-lr-IO|>~.i-l >> r-1 O O r-1 O 

i-l r-l CM CM 

l>M-l >M-IOOr-li-IO|r^rH $►, I-l O 

r-l r-l 

r-1 r-l O |>* r-l J^r-IOOr-lr-IOr-IO 

r-l i-H 

r-IOO!-lr-IO|>-,r-l rS' _l OOr-IO 

r-l I-l 

r-IOOi-<OOr-li-IO|>ir-< >>i-< O 



O O r-l O O r-l O 

O r-l 

< < 










Planar connectivity simulation of S P 

Fig. 3.3b 

L p+1- 



T H for i = 5 

Fig. 3.4a 




T P for H = 4 

Fig. 3.4b 




The perceptron has already been discussed in 1.6. In the beginning 
of this chapter, we will first expand on that discussion in order to further 
motivate the study of cyclic perceptrons<> 

The classical perceptron (for references on the subject see [Mi69] 

became the subject of extensive research centered around concepts such as 

pattern recognition, learning, adaptive behavior, etc. A whole myth had 

been created around it -- about its capabilities and its potential for use. 

The thing that attracted people most were its ability to learn from experience 

and its simplicity -- it combines many small decisions, the values of the 

functions cp. , into a final decision by considering their weighted sum. 
Minsky and Paper deflated this myth by showing that such a scheme has its 

inherent drawbacks. In particular, it cannot compute predicates such as 


The most general intuitive basis for the result that the connectivity 
preciate cannot be represented by a perceptron is the following: First of all 
the reasoning makes sense only if the complexity of the functions cp. is limited 
in some way; if not, we can choose to. to be the function that we desire to 
represent and then it can be represented by a perceptron trivially. Minsky and 
Papert use the order and diameter restrictions (see [Mi69]). The former is also 
used by us. 

Suppose we want to represent connectivity. Then, if the cp . ' s are bounded 
in complexity (so the reasoning goes), the weighted sum is too simple a function 


to be able to integrate all the information that is required in computing 

We set out to apply the same basic reasoning to models where the inte- 
grating function is constructed out of finite operators. In particular, 
we choose addition in a finite field because of the unique representation 
property for functions in such a field (see 4.5) which makes proofs rather 
simple, and because of the purely formal resenblance to perceptrons. 
One particularly interesting aspect of using addition in a finite field as 
the integrating function is that one proof of the inability of perceptrons to 
compute connectivity is based on the reduction of connectivity to addition 
mod 2. However, this function is precisely the simplest one possible in GF(2), 
This underscores the need to make different reductions for different models 
of computation that are presumed to be incapable of computing connectivity. 

In this chapter we shall limit ourselves to Boolean functions. 

We introduce cyclic perceptrons formally: 

4.1 Definition 

GF(p ) is the finite field consisting of p elements. $ (the basis) 
is an infinite set of Boolean functions (0,1] -» [0,1] such that each cp € $ 
(w is the first infinite ordinal) depends on a finite number of arguments. 
Elements of $ are assumed to be ordered (in an arbitrary way). Then a (p,k)- 
perceptron (over ?) is a pair P = (a,Y), where a is an w-vector such that the 
i th component a. € GF(p ) and a.^0 for only finitely many values of i; 
Y c GF(p k ). 


Given a function f: {0,1] -» {0,1], we will denote the set of arguments 
on which it depends by S (f ) . 

Let P = (a,Y) be a (p,k)-perceptron. Then P will represent the predicate 
(Boolean function) 


f = [ £ a cp. € Y] (4.1) 

i=0 *■ X 

where the value of cp. € {0,1} £ GF(p k ). Obviously, S(f) £ U S^) 

1 i€{j: j? «0] 

We will indicate the function represented by a (p,k)-perceptron P as in 

(4.1), or simply [P] . 

Let us recall a concept from [Mi69]. Given a (p,k)-perceptron P = (a,Y) 

over a certain basis $, its order (ord(P)) is max (S^) . 

i€[j: a.#)] 
We can also introduce the order of a function. 

4.2 Definition 

The (p,k)-order of a Boolean function f over a given basis $ ((p,k)-ord $ (f,> 
is the smallest I such that there exists a (p,k)-perceptron of order I repre- 
senting f. If no such perceptron exists, the (p,k)-order of f is defined to 
be °°. 

Let 0. be the set of all Boolean functions with finite support. Note then 
that for an arbitrary Boolean function f, (p,k)-ord n (f ) is finite and <■ S(f), 
for all primes p and arbitrary k. Also note that for an arbitrary basis § 

( P ,k)-ord n (f) ^ (p,k)-ord i (f) (4.2) 


We show now, as is done in [Mi69], that we can choose for the basis a 
more restricted set. Let the set of arguments of the basis functions be H = 

{x 1 ,x„ ,...]; then we define the set of masks M=( A x.:Sisa finite sub- 

i6S x 

set of JW }. A convenient way of ordering M is to assign to cp€M the binary 
number b b ...b. where b = 1 if and only if x, appears in the conjunction 
defining cp. 

4.3 Proposition 

Any Boolean function f can be represented by a (p,k)-perceptron over M 
for any prime p, and arbitrary k. 

The proof is the same as that of Theorem 1.5.1 in [Mi69], i.e., we util- 
ize the following correspondence between Boolean operations and operations in 
GF(p ) if the variables assume only the values and 1: 

x. A x_ ~ x • x„, x, V x„ ~ x. + x 2 - x • x 2 , x ~ 1 - x 

If f is a function of n arguments, then from its disjunctive normal form, by 
using this correspondence and by multiplying out afterwards, we obtain the 
following representations for f: 

E « 1 x 1 x 2 ... x m = x 


th k 

where a. . is the j bit of the binary representation of i and a. € GF(p ). 

Note that the mask cp. (see the ordering above) is represented by the monomial 

with exponents corresponding to the binary representation of i. 


Theorem 1.5.3 of [Mi69] also holds in our case. We state it as 

4.4 Proposition 

The following holds for an arbitrary Boolean function f, an arbitrary 
basis §, and an arbitrary integer k and prime p: 

(p,k)-ord M (f) £ (p,k)-ordj(f) 


The same as in [Mi69]. 

Note that if we take fi for the basis in Proposition 4.4, and combine it 
with (4.2), we obtain that (p,k)-perceptrons over M achieve minimal order. 
We state without proof the following well-known 

4.5 Lemma 

Every function GF(p ) n -» GF(p ) can be uniquely represented as a polynomial 

k k 

in n variables over GF(p ) that is at most of degree p -1 in each variable. 

(see, e.g. , [La67].) 

It has already been noted that we will be interested in whether a function 
can be represented by a (p,k)-perceptron with a limitation on its order. For 
this we need the following 

4.6 Definition 

A sequence of Boolean functions f-,f 2> ... of 1,2,... arguments is of 

finite (p,k)-order (over a given basis ?) if there exists a finite r such 

Bounded would be a better word, but we conform to the terminology of [Mi69] 


that for all i (p,k)-ord- (f . ) £ r. 

Let there be given a (p,k)-perceptron (a,Y). If Y = (y,,...,y ], then, 
recalling that GF(p ) is a vector space of dimension k over GF(p), and desig- 
nating the j component of a. by a., (similarly for y 6 Y) , we have 

x. x j n 

eo m k CO 

[ E a m € Y] » V A [ E a., cp =y ] (4.3) 
i=0 L x h=l j-1 i=0 1J x tlJ 

We can restrict the diversity of perceptrons we are dealing with by noting 

4.7 Proposition 

Let $ be a basis closed under conjunction (i.e., cp, ijr € § => cp A \|r € $). 
If a Boolean function f is of finite (p,k)-order over f, then it is of finite 
(p,l)-order (but the order may change). 


We have f = [(a,Y)] where (a,Y) is a (p,k)-perceptron. Suppose !y( = m 
and the (p,k)-order of f is I. Prom (4.3) we have 

m k °° 
f = V A [ E a. ..«p. = y ] (4.4) 

h-1 J-1 1-0 ij x hj 

where a. ., v.. € GF(p). 

By Lemma 4.5, we know that for all a € GF(p) there always exists a poly- 
nomial P (x) over GF(p) of degree p-1 which takes on the value of 1 if x = a 
and is otherwise (the degree follows from the number of zeros of the poly- 
nomial). Thus substituting the Boolean operations with the field operations 
introduced in the proof of Proposition 4.3, we obtain from (4,4) 


k °° k 

f = Q( n P ( E a -cp ),..., n P ( E a -cp )) (4.5) 
j-1 y ij 1-0 1J j=l y mj 1=0 1J 

where Q(x-,...,x ) is the polynomial (of degree m) that represents the Boolean 

m f 
function V x, . Each P is of degree p-1. Hence f can be expressed as 

h=l y ij 

a polynomial in the cp. 's of degree ^ nr (p-1). Obviously, cp^ for j > 1 can be 

replaced by cp since it assumes only the values and 1. Also, cp* i|r represents 

the function cpAi|r and |s(cp A i|r) | ^ |s(cp)| + !s(i(r)|); thus, if the basis is closed 

under conjunction (as, e.g., Q or M) , (4.5) describes a (p,l)-perceptron for f 

of order ^m«(p-l)'^. □ 


Incidentally, this proof also shows that we can assume the cardinality 
of Y to be 1. 

Since we shall subsequently be concerned only in whether the order of cer- 
tain functions is finite or not, we will be able to limit ourselves to (p,l)~ 
perceptrons. For convenience, we will write simply "p-perceptrons". Also, we 
will be only concerned in whether there exists a basis over which a function is 
of finite order. This is equivalent to whether a function is of finite order 
over M. 


Q(x 1 ,...,x m ) is obtained by using y ± V y 2 ~ y x + y 2 - y L • y 2 recursively; 
i.e., Q(x 1 ,...,x m ) = Q(x 1 ,...,x m _ 1 )+x m +x m .Q(x 1 ,...,X m _ 1 ). If Q is a poly- 
nomial over GF(2), then Q = © II y where the sum ranges over all nonempty 

subsets S s (x.,,...,x }. 


We first turn our attention to the case when p = 2. Instead of '^-percep- 
tron 1 ', we will say "Boolean perceptron". 

From Lemma 4.5 we conclude that every Boolean function can be uniquely 
represented as a polynomial over GF(2) that is at most of degree one in each 
variable (a Boolean polynomial). 

Noting that the terms of a Boolean polynomial represent marks, we conclude 
that every Boolean function f has a unique representation as a Boolean perceptron 
over M. Furthermore, by Proposition 4.4, this representation is a minimal order 
representation for f. Note then that 2-ord^(f) corresponds to the degree of 
the Boolean polynomial for f. 

This unique representation property allows us to establish the minimal 

order of certain interesting predicates very easily. As in 3.2, we are again 

interested only in functions {0,1} -♦ {0,1} that are interpreted as functions 

of nxn patterns of 0's and l's. In particular, we are interested in the Boolean 

function of n variables c (introduced in 3.2) and e , (the Euler number of 

a pattern of l's on a square array of 0's and l's is equal to k). It is well 

known (see, for example [Mi69]) that the Euler number of a planar figure is 

the difference between the number of its components and the number of its holes. 

If we use the notion of connectivity introduced in 3.2, then the Euler number 

of the pattern in Fig. 4.1 is 1. 

4.8 Theorem 

The connectivity predicate is not of finite 2-order over M (hence, over 
any basis). 



We use the One-in-a-box construction introduced in [Mi69]= Before pro- 
ceeding, however, we must define certain auxiliary predicates, n, the size 
of the pattern is assume odd (henceforth, we will suppress the subscript n 
in the notation for functions). The variables representing positions in the 
square array will, as usual, be denoted by x . for 1 £ i, j £ n. Then we define 

r = (x n A x l2 A . . . A x ln ) A 

(x„ 7 A x„ 2 A... A x„ ) A... A 

(x . A x „ A . . . Ax ) 
nl n2 nn 


s - (x 21 Vx 22 V ... Vx 2n ) A 

(x 41 Vx 42 V... Vx 4n ) A... A 

<Vl,l V Vl,2 V '" V Vl,n>5 

i.e., r is 1 only on patterns with odd rows consisting exclusively of l's, 
and s is 1 only on patterns where each even row has at least one 1 (the One- 
in-a-box predicate). Then, 

r A c = r A s (4*6) 

(c is the connectivity predicate). 

Now, for arbitrary functions f, g, h,if h = f A g, then 2-ord M (h) ^ 2 "° rd M 
(f)+2-ord M (g); i.e., 

2-ord M (g) £ 2-ord M (h)-2-ord M (f) (4.7) 


Replacing h by r A s, f by r, and g by c we obtain 

2-ord^c) ^ 2-ord^Cr A s)-2-ord M (r) (4.8) 

We have 2-ord (r) = ~- *n; 2-ord^(h) = ~- *n (recall the Boolean 

polynomial for V x. described in the footnote on p. 80 ) 2-ord^(r A s) = 

n(n-l) (because ;the Boolean polynomial representations of r and s have no 

variables in common). Using this we obtain from (4.8) 2-ordfc) ^ „ ; 

i.e., the 2-order over M of the connectivity predicate is not finite. D 

We next establish 

4.9 Theorem 

The predicate "the Euler number of a pattern equals k" is not of finite 


We again consider the case when M is the basis. The general case follows 
from Proposition 4.4. n is the size of the pattern. We need to consider a 
subset T C S = (x..: i+j even] (note that all points of S are disconnected 
from each other, in the sense we use this word). |t| = t will be determined 

We define the following predicates 

P r = 1 if and only if all points of T are 0; i.e., 

pr = n (i e x) 

q = 1 if and only if k points of T are 1; i.e., 


q = © II x • II (lffiy) 
x€U y€T-U 

where the sum ranges over all possible subsets U C x with |u| = k. When the 

expression for q is multiplied out each term produces exactly one term of the 

form II x and thus the above Boolean polynomial is of degree t if and only 

if the number of terms in the sum is odd. The number of terms is (5). But 

2 1 -! ' t i k 

( ) is odd for all ^ k ^ 2 -1 and all I. Thus if t = 2-1, £ k £ t 

k _ 

then 2-ord(q) = t. Also, 2-ord(pT) = |t| = n -t. 

Recalling once again the e, is the difference between the number of compo- 
nents of a figure and the number of holes, we have the relationship 

pr A e k = pr A q 

Again using (4.7) with g = e,, h = pr A q, f = pr we obtain 

2-ord^e^ * n 2 - |t| = 2^-1 

No matter how large we choose I, we can find an n such that we can obtain 
a set T with |T| = 2 -1. Thus, the Euler predicate is not of finite order 
over M. 

Theorems 4.8 and 4.9 can be extended to p-perceptrons for arbitrary 
p. The generalization will only be indicated for Theorem 4.8. 

Proof: First show that ( ) is even for all I and all k?*0, 2 . This is done 

by induction. Now observe that due to (, ) = ( ) + (, .), and the fact that 

I 9 k 

2-1 2-1 2 

( , ) is odd, ( „ ) is also odd (for otherwise („) would not be even). We 

can continue this way and establish the claim. 


The obvious difficulty is that Boolean functions do not have a unique 
representation as polynomials over GF(p) for p > 2. Specifically, in the 
case of Boolean perceptrons over M, we were able to reduce the problem of 
the order of connectivity to the order of r and s (see above). The orders 
of these predicates (equal to the degrees of the corresponding Boolean poly- 
nomials) were easily computed due to Lemma 4.5. 

Suppose c is of finite p-order over M (we have already remarked that this 
brings no loss of generality) for some p > 2. Due to Proposition 4.7 we can 
assume that we have an expression of the form of (4.5) for c . When multiplied 
out, we obtain 


c = E a.m. mod p (4.9) 

n i=0 l i 

where m. is the monomial representing the i mask. m. is of degree one in 

each variable, and the values of the variables are restricted to (0,1). Since 

the perceptron from which we obtained (4.9) is finite order, we can assume that 

the degree of (4.9) is ^ i. 

We can now extend c to the domain GF)p) (i.e., to square patterns A = 


fb.J, 1 s i, j s n, and b . , € GF(p)) by defining c'(A) = 1 if and only if 

c (f(A)) = 1 where f(b.J) = {c.J such that c - 1 if b - 1, otherwise 
n ij lj ij l J 

c . =0. We have from (4.9) 


c 1 = T. a^.m! mod p (4.10) 

n i-0 i l 

where m* is obtained from m. by replacing each variable x. by P- (x ) such that 
i 1 J ■"■ J 

p ( X J = i s otherwise P.,(x.) = (see the proof of Proposition 4.7 how to ob- 
tain P 1 (x.)). Now we have a total function c n over GF(p), and thus its 


polynomial representation obtained by multiplying out (4.10) must be unique. 
Since the degree of (4.9) is £ A, 

the degree of the polynomial (4.10) £ (p-l)i (4.11) 

Another estimate of the degree of the polynomial for c 1 is obtained using 
the predicates r' and s' obtained from r and s of Theorem 4.8 similarly as 
c' was obtained from c . The polynomial P , representing r 1 , is obtained from 
the polynomial representing r by substituting each variable x with P.(x). 
Similarly, for the polynomial representation P , of s'. The degrees of P , 
and P , are then found to be ^ ^^ n« (p-1) and ^ —^ n(p-l) respectively 
(2-ord M (r) and 2-orcL/s) multiplied by deg(P.,)). Since the analogs of (4.6) 
aid (4.8) again hold, we obtain that deg(P ,) is not bounded, contradicting 
(4.11), and thus also the finite order of c. 

Note that the obvious generalization, i.e., if a function is of finite 

2-order, then it is also of finite p-order (over M) , is not true: Consider 

the Boolean function © x . . We will investigate the degree of the polynomial 

representation of ffi 1 (obtained in the same way from © as c 1 was from c above). 

If a p-perceptron over M of finite order exists for 0, then we obtain a poly- 
nomial representation of bounded degree for <£' similarly as (4.10) was 
obtained from (4.9). On the other hand we have the following representation 

for ffi 

E n x n (1-x ) mod p (4.12) 

i€s i?S 3 

where S ranges over all subsets of {l,...,n] of even size. When multiplied 

out, each term produces exactly one monomial of the form II x. . The number 


ii^ui^^mua^uwj ii Mu^y ' uiM i pi^ 


of such moaamUU appearing If, ^ djseafrinadj fern of (4.12) is 


.4. oiiooooo 

L £ J n. MO I I I 

1-0 3 X I I 

(use ths identity (J) - ( n ^)0 4 ^f % fetes this nanber is not 

«od p, (4.12) T±^4£Z^*^&K£M **•* » ** * If *i *■ 
replaced by P.(x.) n obteia s (yi ^saeA^prtlynnerf ■ 1 representation for 

«» of degree a* (p-i), contradicting die orl stones of s finite order 

p-perceptron over M for #i 


OMIfi 19 

01 m * 

» g # & JfljE $ ♦ fc^, » 

son el 3edBu« «Mt soajt G Jt 4|^M ♦•f T) - (p t*-**«»&* «& *•*) 
?abio artlnii s 3o 90*193 a Jaw «*fo ^a^s^bsrfaod f (J-q) *a »»«g»i> 1© *# 

«2 .x II •** io5 b 

M »o5 if wiro noWqsosa^-q 




In this chapter we shall permit ourselves a certain degree of informality. 

We are again concerned with the power of machines that combine a large 
number of "local" computations through an integrating function. Only this 
time we shall not be limited to functions that can be represented as a combin- 
ation of finite operators. 

This class of machines again operates on square patterns of O's and l's. 
The operation is divided into two phases : In Phase I the pattern is scanned 
with a square "window" of a certain size. Each time a nonzero pattern appears 
in the window, we take note of it (there is a finite number of nonzero patterns 
since the window is of finite size). At the end of the scan we have a count 
of the various patterns, and we are then allowed to utilize this data in Phase 
II which consists of computing the value of a partial recursive function for 
this data. The formalization of this model is obvious and we omit it. 

What can such a machine do? Clearly the computation of this machine is 
divided into a local phase and a global phase, so that it fits into the broad 
class of problems considered in [Mi69] and Chapter Four. 

Note that the boundedness of the window size is essential. If we insis- 
ted only that the window contain a given number of points, but otherwise 
allowed it to be of any shape with arbitrary distances between its points, 
then Phase II could reconstruct the whole figure as was observed already in 

We again inquire whether these machines can recognize the familiar 


topological predicates connectivity and Euler number. 

5. 1 Theorem 

Pattern Counting Machines (PCM) cannot recognize the connectivity pred- 

We need only exhibit two patterns, one connected and the other discon- 
nected, with the same pattern spectrum. In this case, no algorithm of Phase 
II could establish the difference between them. 

Two such patterns are given in Fig. 5.1. 

Specifically, these patterns are equivalent under windows of size 2x2. 
However, it is easily seen that increasing the dimensions of the patterns 
in Fig. 5.1 linearly by a factor of k makes them equivalent under windows 
of size up to k + 1. We can arrive at this conclusion by setting up a 1-1 
map between occurrences of the same pattern in the window in the two pat- 

5 . 2 Theorem 

PCM's can compute the Euler number. 

It is shown in TMi69] how to compute the Euler number from the spectrum 
of patterns of the shape 

: m n □ 


Before proceeding, we need a notion of continuous deformation . Pattern 
B can be obtained from pattern A by continuous deformation if B arises from 
A by a sequence of additions or deletions of l's of the following kind: Let 
us fix attention on a 3x3 square with the central position in the place of 
the 1 being added (deleted). For simplicity assume that the boundary positions 
are always 0. Each position of the periphery of the square which has a 1 in 
it is either connected or disconnected to another 1 on the periphery (not 
necessarily by a path in the square). This set of connections may be described 
by a symmetric 8x8 0-1 valued connection matrix, i.e., a . = 1 if and only if 
the i and j positions on the periphery have l's and are connected. The 
proposed addition (deletion) is permitted only if (1) the connection matrix 
remains unchanged as a result of it, and (2) there is a 1 adjacent to the 
proposed addition (deletion). 

Any predicate whose value remains unchanged if the pattern A is replaced 
by B, obtained from A by continuous deformation, is called a topological 
predicate . We assert without proof that connectivity and Euler number are 
topological predicates. The reader is warned, however, that there is a pitfall 
in proving this fact for the Euler number predicate. The number of holes in 
Fig. 5.2 should be one, not two (i.e., 0's are connected diagonally in addition 
to their usual connections). This is discussed more fully in [My71], However, 
if the holes are sufficiently large (so that all the 0's in them are connected 
in the usual way) this difficulty is not encountered. 

5.3 Theorem 

Any topological predicate recognized by a PCM must be a function of the 
Euler number. 



We will have established the theorem if we succeed in showing that, 
given any PCM p computing a topological predicate, then for two figures 
X 1 and X 2 with EULER(X 1 ) = EULER(X 2 ), we also have P(X ) = P(X 2 ). 

In [ML69] it is shown that for every figure X there exists an "Euler 
canonical" figure C(X) such that EULER(X) = EULER (C(X)); and if for two 
figures X^ X 2 , EULERCX^ = EULER (X 2 ) , then C(X ) = CCXj. If the Euler 
number of X is n > 0, then C(X) consists of n components without holes. If 
the Euler number of X is n £ 0, then C(X) consists of 1 component with-n+1 

We will show that we can deform any figure X into C(X) without changing 
the value of P„ 

The deformations available to us are: 

(1) Continuous deformation. If we subject X to this kind of deformation, 
then P(X) remains unchanged because it computes a topological predicate. 

(2) Deformations that leave the pattern spectrum unaltered. By defini- 
tion of PCM's. 

As a consequence, we have 

(3) Removal of components inside holes. To accomplish this without 
changing the value of P(X), we first apply (1) until the window cannot scan 
simultaneously an interior component and the wall of the hole in which it 
resides. Then it is obvious that the pattern spectrum will remain unchanged 
if we remove the component from inside the hole. After this we can apply 
(1) in the reverse direction. 


If we are given two figures X- and X with same pattern spectra, then 
we can add equally shaped holes to the figures in such a way that the pattern 
spectra remain the same. The holes have only to be placed in such a way that 
the window cannot scan any other boundary while scanning the newly introduced 
hole. We can then repeat this to add any number of holes. 

Specifically, given two figures of the shape of A and B in Fig. 5.1, we 
can add holes in this way and still have the same pattern spectra. For example, 
C and D in Fig. 5.3 have the same pattern spectra for a sufficiently small 
window size. Note that given any two components, one of which has a hole, 
we may apply deformation (1) to obtain a figure proportional in dimensions 
to C and they apply (2) to obtain D. We call this sequence of deformations 
"cancelling a hole and a component". 

We deform X into C(X) by cancelling as many holes and components as 
possible. We first apply (3) until no component remains within a hole. 
Then we may either have a hole and a component not containing this hole, or 
not. In the latter case, we are done. In the former, we select a hole and 
a component not containing it and cancel them. After this we are left with 
one less component and hole. We repeat this until we arrive at C(X). 

We can summarize the results on the recognition of topological predicates 
contained in [M.69], Chapter Four, and in the present section in the following 

















Functions of 

Euler number 


Functions of 

Euler number 


Thus, all these results support the conjecture expressed in [Mi69] that 
no "local-global" computer can recognize connectivity. 

It appears, however, that all models are extremely sensitive to altera- 
tions. We have already mentioned how PCM's can be converted into universal 
machines with the removal of the restriction on the size of the window. 
A. R. Meyer noticed that ordinary perceptrons may be modified to recognize 


any Boolean function with order one. Instead of E a, tp , ^ consider 



£ a. cp. £ Y for some subset of integers Y. Now we can choose the coeffi- 


3. 1 

cients a. in such a way that the sums of the coefficients in no two subsets 
of the set K of all coefficients is the same, ie. , 

V(X,Y C K) [ S a = E a =» X = Y] 

a. € X 



We can define the coefficients inductively, i.e., choose a to be greater 

than S a... This is in the spirit of stratification (see [Mi69]). Now 



tice that if $ is the set of rn^ks of order 1, then a Boolean function cp 



Lji..yjij,ii.iii .i.i.jiil^jip 



la ■iaply § fo|l§c$lfA0of •ubsoto of I. 

of integer*, fi§irfipgt|f to* •mm of 

belonging to «p. 



I I X 

$>**• J Mi Iw*^ ** ** **• ** t 

pt» jipif <Jiw1 oaboot* 

siaosqa cr»33s<i SxS aese arfa jf*Jhr aoragjrf out 

X.2 .gil 

aero el urtmismq a±i& ax aalorf lo 

S.S .g*f 



Sxrsmxpoa * btr« aXorf * gsilloajroO 






















Two figures with the same 2x2 pattern spectra 

Fig. 5.1 
















The number of holes in this pattern is one 
Fig. 5.2 

C D 

Cancelling a hole and a component 
Fig. 5.3 




The purpose of this appendix is to modify certain results of [H068] in 
the light of our different requirements. Our goal is Lemma A. 9 which is used 
directly in the proof of Theorem 2.2.2. We prove it by way of a series of 
intermediate results, none of which are used elsewhere. 

In what follows we would frequently use the phrase n F is a formula in 
n variables over $, and such that no variable occurs more than k times". 
This will be abbreviated to "F is a (§,n, k) -formula". If any of the para- 
meters is not present, we will replace it by *. For example, "F is a ($,*,k)- 
formula" and M F is a (<S,n,*)- formula" mean "F is a formula over $, and such 
that no variable appears more than k times'* and "F is a formula in n vari- 
ables over $ n respectively. 

A. 1 Definition 

Let there be given the sequence of formulas G = (G.(X-,z) , . . . ,G ^ 

(X .,,z).G (X )). If 1 £ i £ p-1, then G. contains the distinguished 
p-1 p p i 

variable z, occurring only once. X. for 1 S. i ^ p is nonempty and is 

either a singleton or = U X.. Let F be an arbitrary formula, and 

j<i J 
G = G 1 (X 1 ,G 2 (X 2 ,... ,G _ 1 (X _ X G (X ))...)). If F = G, then G is a nesting 

sequence of length d. for F. If, in addition, the total number of occur- 
rences of any variable (except z) in G is < the corresponding number in F, 
then G is a proper nesting sequence for F. 


A. 2 Remark 

Let G and G be as described in Definition A. 1. Furthermore, let all 
X. for 1 ^ i ^ p be singletons and distinct. Then G is equivalent to an 
e _. -component. Also, suppose X. is arbitrary and G. is a formula over §. 
Now replace all variables except possibly one in G. for 1 ^ i ^ p-1 by the 

constant a. Let the set of variables that have not been touched be Y. 

Y a 

Then G is equivalent to an e-component over $ . 


Let F be an arbitrary formula over $, X J S(F), and a f D; then we 
would like to obtain a formula over $ with the following properties: 
(1) G = F (2) S(G) = X, and (3) the number of occurrences of any vari- 


able of X in G is £ the corresponding number in F. G can be obtained 

by a straightforward replacement of operators in F such that the variable 

symbols that are replaced with a in forming F (and subformulas of F 


where S(F) consists entirely of such variable symbols) are removed, and 
the remaining operators are changed to preserve equivalence with F . 



e precisely, if cp(F- , . . . ,F. ) is a subformula of F, then if S(F.) <£ X 

1. iC i- 

for all i, cp remains the same; if S(F.) <= x and S(F.) <£ X for j^i, then 
cp is replaced with cp(x. , . . . ,x , ,F ,X . , . . . ,x, ) where all variables of 
F. have been replaced with a (if there are more such indices i, we proceed 
in the obvious way); and if S(F.)cx for all i, then cp is eliminated. This 
transformation will be called normalization and G will be denoted by norm(F ), 

"■ ■ ■■ " '""■ ' " '■■ ' 3 


A. 3 Lemma 

If F is a ($,n,*)-formula, then for any p, q ^ 1 and a £ D, if 
n ^ \ (P>q)j there exists a subset X £ s(F) such that either 

(1) |xj = q and F is equivalent to a PC of the formulas F ,...,F^_ 


where r ^ n and F for 1 < i £ r is a formula over $ such that each element 
max l 

of X occurs in at least two among F..,...,F , and the total number of 

occurrences of any x f X in F , . . . ,F is ^ the number of occurrences of x 
in F; or 

(2) |xj is arbitrary and F has a proper nesting sequence G = (G..,..., 

3 x 

G , , G ) where G. for 1 ^ i < p is a formula over $ . 
p-1 P i 


Assume there is no X c S(F) such that F is as described in (1) of 


the statement of the lemma. 

We will describe a (proper) nesting sequence extraction procedure 
(NSE) whose inputs will be a formula H over § and a set of variables 
Y. The output of NSE will be two formulas H 1 (Z,z) and H" over $ such 
that Z is either a singleton or £ Y; furthermore, H = H'(Z,H n ) for some 


U ^ S(H). 

G will be obtained by the repeated use of NSE. Initially, the input 
of NSE will be F = F- and <t>. In the first application of NSE, the output 

will be G, and F (F is an intermediate formula whose significance will 


be describe immediately). In general, the i application of NSE will 

receive the input F. and M X. and yield as output G.^ and F^. We will 

j< i J 



that if n ^ TL(p 5 cj)j we can apply NSE p-1 times and end up with F _„ 

from which G is obtained as will be described below. 

Description of NSE . The input to NSE is as describe above. Then 
we can distinquish two cases: 

Case I . L(H) = 1. In this case we cannot apply NSE, and the output 
is undefined. 

Case II . L(H) > 1. In this case we can assume that H has no unary 

operators; for suppose there exists a subformula J of H such that J = co(ty( 

J-,..., J )) where J. for 1 £ i < r is either a variable symbol or another 
1 r i 

subformula of H. In this case cp« \|/ = p 6 $ and we can replace J by the 
equivalent formula P(J..,...,J ). Similarly, if J = cd(J,, . . . , ^(J. ) > • • • ,J^) > 
we can eliminate i|f because cp(x.. , . . . ,i|f (x. ) , . . . ,x, ) = p(x- , . . . ,x^, . . . ,x^) 
f $ a (thus, if a unary operator of H corresponds to an internal node of 
T(H), we can eliminate it by either of these two means; on the other hand, 
if a unary operator of H corresponds either to the root node or to a node 
next to a terminal node of T(H), then we can use only one of the two methods 
described). Now choose i' such that S(H .,) is maximal among S(H ) for 
1 ^ i £ r. Since support is defined only for formulas, S(H ,) may be 
undefined if all arguments of the outermost operator of H are variable 
symbols. In this case replace one of them by the identity operator 
which is possible since id p $ a . Consider H/H ^, = K(Z,z). Again two 
cases can arise: 


Case Ila . Y R Z = <t>. Choose any variable of Z, e.g., x, and let 

fx z) 
H' = norm(K ). z is a distinquished argument (hence x is free). 

In this case set V = S(H ,)-Z U[x}. The significance of V will be 

• X 

seen immediately. 

Case lib . Y R Z ^ 0. H' = norm(K )] z is again a distinguished 

argument V = S(H . , )-Z U Y. 

In both cases H" = norm((H .,) ). 

. i a 

Analysis of NSE . Let I S (H) ! = m, and let us estimate ls(H")'. 

Obviously, |s(H . ,)| ^ . In the case that H results from a chain 

of applications of NSE to a formula F, and F does not satisfy (1) of the 

statement of the lemma, then we claim that in Cases Ila and b less than 

q variables are set to a in H . , . Suppose this is not true. Let the 

• J- 

set of variables that is set to a on this occasion be W. Then W £ Z-{x} 

(Case Ila), or W £ Z-Y (Case lib). In any case consider F & . This is 

equivalent to ^(H , n >) , . . . , (H . .) where i(l) , . . . ,i(s) are the 

« X \ X / H • X \ S / cl 

indices corresponding to the subformulas H . where all variables have 

• J 

not been replaced by a (if in H all variables have been replaced by a, 

then it is absorbed into 9). But then cp(norm((H ,,J ) ,. . . ,norm((H ^ g J a )) 

satisfies (1) of the statement of the lemma. A contradiction. 

Thus, IS(H")| s ^ J!L - -q+1 

Hence, if we define 


VI, q) = 1 

Tl 6 ( P +i,q) - <Vp.q>+q-D- r W 

x,e " 

n fi (p,q) = n P " 1 -q+ maX - • (q-1) 
6 ^ max ^ n -1 M 


for p,q ^ 1 (and if F does not satisfy (1)), we will be able to apply NSE 

p-1 times and obtain F ,. G can then be obtained as follows: If S(F ,) 

p-1 p p-1 

fl S(G.) = 6 for 1 ^ i ^ p-1, then choose any variable y f S (F 1 ) and 
i p-i H 

obtain G from (F _-) yi by normalization; otherwise, denoting U S(G.) 

by U, obtain F . from (F n ) by normalization. It can be checked that 
p-l p-1 a 

G. for 1 ^ i £ p satisfy the conditions of (2) of the statement of the lemma. 


Consider a sequence of (nonempty) sets X for 1 £ i ^ p such that 

X. is either a singleton, or is included in tj X . We will call such 

L J<i j 

a sequence of sets a normal sequence (of length p). Note that the sequence 

X..,...,X in Definition A. 1 is a normal seqdence. Then 

A. 4 Lemma 

Let X..,...,X be a normal sequence of sets with the additional pro- 

of I) 


sequence. Then if p S(k+l) m , there exists a subset Y C fj X. and an 

1=1 L 
increasing sequence of indices i(l) ,i (2) , . . . ,i (q) such that (1) q ^ m, 

perty that each element of U X. appears in at most k elements of the 

i-1 L 

:here exists a subset Y C I ! J 

1=1 L 


(2) 1(1) = 1, (3) X ... Y is a singleton for 1 £ j £ q, (4) Xj Y = 

if 4 £ i(q) and X j* i(j) for 1 £ j £ q, and (5) if x £ Y, ^ < J 2 < j g 

and x £ X. , . N , x f X. , . N , then also x € X. , . N . 
UJj) i(J 3 ) Kj 2 ) 


(this is a direct translation of the proof of Lemma 2 of [Ho68] into 

our terminology). Let IJ X. =* (x.. ,x„ , . . . } . Without loss of generality 

i=l 1 l 
assume that x- € X.. . If m = 1, set Y = (x..}, i(l) = 1, and conditions 

1-5 are satisfied. For the inductive step two cases are distinguished. 

Case I . x 1 occurs in none of the sets X., 2 £ j < (k+1) + 1. 

Setting r = (k+1) +1, the sequence X„,...,X is normal and each element 

occurs in at most k of the X., 2 <. j £ r. If Z C \\ X and the sequence 

J j =2 J 

j (1),... ,j (q-1) are obtained by the inductive hypothesis, then [xj U Z = Y 

and 1(1) = 1, 1(2) = j(l) i( q ) = j (q-1) satisfy conditions 1-5. 

TTl— 1 

Case II . Assume that x.. occurs in some X., 2 ^ j £ (k+1) +1 and 
let h be the smallest such number j. Furthermore, let V be the set of 
elements different from x, , and occurring in X 2 ,...,X ,. Delete the 
elements of V from X-,X. ,...,X , and delete those among X.X_,...,X 
that remain empty. Let the resulting sequence be Y-,...,Y . The length 
of the sequence (X. ,X, ,X, , . . . ,X ) is at least p- (k+1) +1. 

There are less than (k+1) distinct variables in X„,...,X _p 
each one occurring in at most k-1 of the formulas X ■, S X, ,...,X . 


r * p-(k+l) m " 1 +l-(k-l)(k+l) in " 1 ,i.e. ) 
r 2= (k+l) 1 "'^! 

m— 1 

The sequence Y_,...,Y is normal and its length is at least (k+1) 

x 1 occurs in Y„. Let Z £ I J Y. and the sequence j (1) = 2, j(2),..., 

j-2 J 
(q-1) be obtained according to the inductive hypothesis for Y^j'.'jY . 

Then Z and i (1) = 1, i(2) = j (1) , . . . ,i (q) = j (q-1) (where q * m) , satisfy 

conditions 1-5. LJ 

Let there be given a (*,*,k)- formula F with the proper nesting 

sequence G = (G.,,...,G ) such that G. is a formula over f. As has 
M — ^ 1 ' p l 

already been remarked above, X , ...,X (see Definition A. 1) is a normal 

i p 

sequence of sets. 

If p 2 (k+l) m , then by Lemma A. 4 there exists a set Y £ N X and 

q indices i(j) for 1 £ j £ q such that conditions 1-5 hold. Note that 

if m = k-t, then |y| * t since no variable appears more than k times 

in G (G is proper). In particular, consider only Z = {x^...^] £ Y 

where x-,...,x are numbered in the order of their appearance in G. 

Note that due to condition 5 of Lemma A. 4, if x, y €Y and y follows x 

in G, then x cannot appear again after y in G. Let G be as defined in 


Definition A. 1. Then we will let the reader convince himself that G 


(hence also F ) is equivalent to K(Z,G') where K(Z,z) is an e -component 
a c 

over $ with input variable z, and G' is a certain formula over $ such 
that each variable of G' occurs at most k-1 times. 


Note that in this case we do not know the size of S(G'). This 
can be remedied in the following way: There are two cases; either 
|S(G')| s 1/2- t, or not. In the first case perform an a-merger on K 
with basis S(G'), after which we obtain an SC of an e-component K' 
of length ^ 1/2* t and a formula G" (through the input variable) such 
that S(G") equals the set of lateral variables of K' ; in the second 
case perform an a-merger on K(Z,G') with basis Z-S(G') in which case we 
obtain an e-component K' of length ^ 1/2' t with a constant input operator. 

We summarize the preceding in the following 

A. 5 Lemma 

Let there be given a (*,*,k)-formula F with a proper nesting sequence 

2k« t 
of length p ^ 1 composed of formulas over $. Then if p ^ (k+1) , there 

exists a set Z = S(F), |f| ^ t, and F is either equivalent to an SC of 

SL 3. 

an e-component K over § and a formula G over $ such that S(G) is the 
set of lateral variables of K, and no variable of G occurs more than 
k-1 times in G; or to an e-component K over i with constant input 

Let there be given a PC F of the formulas F, ,...,F where r £ n 

& 1' ' r max 

such that |S(F)| = q and each variable appears in at least two among 
F-,...,F (i.e., a situation as described in (1) of the statement 
of Lemma A. 3). We are interested in obtaining a (nonempty) subset 
X £ S(F) such that when the variables outside of X have been replaced 


by the constant a, |s(norm(F.) )) I for those F. where not all variables 

■L a ^- 

h?ve been replaced with a is equal or larger than a predetermined number 
t (as large as possible). 

We could solve the problem as follows: Each variable of S(F) appears 

in a certain subset of the formulas F ..,..., F . The number of such 

t* max 

subsets is 2 (in general, S 2 ); thus, we are sure to find a subset X 

with |x| 2: — 3 — such that all elements of X appear in the same subset 

2 max 

of F- ,. .. ,F . 
1 r 

However, we can improve this number. Let us construct the occurrence 

table of F. The table consists of rows corresponding to elements of S(F), 

and of columns corresponding to F. for 1 <■ i ^ r. The entry a . is 1 

if x occurs in F. and otherwise. We will try to extract a subset 
i J 

X £ S(F) such that either all variables of F^, are replaced by a, or 


S(norm(F.) )) contains 2 elements (t will be determined later). 
i a 

If all columns in the occurrence table contain 2» t l's, we are 
done and X = S(F). Suppose not. Let the column j contain < t l's. 
Delete all rows corresponding to the l's in column j and column j 
itself. Let the set of variables corresponding to the remaining rows 
be X 1 . Consider the remainder of the occurrence table (i.e., minus 
the deleted rows and column); and again look for the column with < l's. 
If it does not exist, we are done and X = X... If such a column exists 
continue. Now two things can happen. Either at some point we end up 
with a certain subset of columns, all of which contain ^ t l's, or we 
end up with two columns that both contain < t l's. We shall see that by 


an appropriate choice of t, the latter case cannot happen. The number 
of l's in the whole table ^ 2q (each variable Occurs in at least two 
formulas). The smallest number of l's remaining after all but two columns 
have been deleted 2: 2q-m where m is largest possible number of l's that 
can be deleted in the course of this procedure, m = (t-1)- (r+r-l+r-2+. . . 

+3) = (t-1)* -^ j ^ (this corresponds to the case when each deleted 

row contains only l's and at each stage t-1 rows are deleted). If, after 
the table is reduced to two columns, both columns are to contain ^ t l's 
(both have to contain the same number of l's since each variable occurs 
in at least two formulas), then 

2ais * t 


or since r ^ n 


t ^^rr wh ere c = (n +3)(n -2) 
c+4 max max 

For large n this is better than the previous bound. This result 
° max 

can be summarized in 

A. 6 Lemma 

Let there be given a PC F of the formulas F ..,..., F over $ where 

r ^ n such that |S(F) = q and each variable appears in at least 
max ' ' 

two among F. , . . . ,F . Then if 


q 2 f where t 2: 1 


we can find a subset X £ S(F) such that F is equivalent to a PC of 


the formulas &,,..., G over § and S(G.) ^ t for 1 ^ i £ s <. r. 
Is v i 

Lemmas A. 3, A. 5, and A. 6 can be combined into 

A. 7 Lemma 

Let F be an ($ ,n,k)-formula. Then for any t ^ 1 and a £ D if 


(see Lemma A. 6 for the value of c) , there exists a subset X £ S(F) 
such that either 

(1) F is equivalent to a PC of the formulas F ..,..., F over $ 

a ^ 1' ' r 

where r < n , each variable of F. occurs at most k-1 times in it, 
max l ' 

and F. contains at least t variables of X or 

X a 

(2) F is equivalent to an SC of an e -component K over $ 

a t 

with a formula G over $ (through the input variable) such that S(G) 
is the set of the lateral variables of K and no variable occurs more 
than k-1 times in G; or to an e -component K over $ with a constant 
input operator. 

A. 8 Lemma 

Let F be a ($,n,k)-formula. Then for any t s 1 and a € D if 

n > T! 7 (t,k) 


then there exists a subset X C S(F) such that F is equivalent to an 


a k 

SPCeC over § G such that (1) G has ^ n components, (2) each 


component is of length ^ t, and (3) the terminal components of G 
have constant input operators. 


TL(t,l) = nt . In this case T(F) has at least one branch 
7 max 

connected to t+1 variable symbols (k=l and thus all variable symbols 
are distinct) at different nodes. This branch can be converted into 
an e -component with constant input operator. The idea is illustrated 
in Fig. A. 1. 

n 7 (t,k + l) - T| 6 <(k + 2) 2 °' fl),7) 7 (t . k > i , «*4). V,k)-c ) 

We can apply Lemma A. 7. The result is either (1) an e-component K 
of the correct length and constant input operator, (2) an SC of an 
e-component over $ of the correct length and a formula to which we 
can apply the inductive hypothesis, and (3) a PC of formulas to which 
we can apply the inductive hypothesis. In each case we obtain an SPCeC 
with the desired properties. D 

A. 9 Lemma 

Let F be a ($,nk)-formula. Then for any t ^ 1 and a f D if 

n * TLCt.k) 


there exists a subset X C S(F) such that F is equivalent to an SPCeC 


over $ G such that (1) G has ^ k components, (2) each component has X 
as the set of its lateral variables, (3) the terminal components have 
constant input operators and (4) |x| = t. 


/n k \ (n k ) (n k \ 

Set T1g(t,k) = TL(s«t,k) where s =f max) +1 max/ +. . .+ ( max J 

Apply Lemma A. 8 to obtain a SPCeC G' with all components having length 

s s*t. Since each variable appears at most k times, it can occur in 

at most k components. s is the number of nonempty subsets of £ k elements. 

Thus, if the number of variables is as indicated we are sure to kind 

in G' a subset of t variables that all occur in the same set of components 

of G'. After performing an a-merger with this set as basis, we obtain 

the desired SPCeC G. D 

Remarks on the bounds in Lemmas A.3-A.9 . If Tl is approximated by 


P m 

n r • q, then TL is inductively defined as follows: 
max "»> 7 j 

Vt,i) - n<: 

7 max 

2k-Tl 7 (t,k-l) 

V'^ - v*™ 1 ' -yt.k-i) b 

/ L 

c • — , ^ / , s -^ . „ o, \ , b f2k times 

for a certain constant y. Thus we see that TL(t,k) 2: iexp(b,2k) = b J 

for k ^ k(b) for any constant b (t has not been included in the estimate 

because in applications it is constant). 



Conversion of a formula F where each variable occurs only once 
into an equivalent e-component by setting certain variables to a. 

Fig. A.l 




There is an isomorphism between the set of formulas over II and series- 
parellel contact networks. We assume the reader is familiar with this model 
as well as with the isomorphism in question. In this case if F is a formula 
over II, then L(F) corresponds to the number of contacts in the network corres- 
ponding to F. 

For convenience, we will derive the result in contact network terminology. 
Given a (series-parallel) contact network C, a chain is set of contacts 
such that when they are all closed, C conducts (we will say "C is 1"); a cut 
set is a set of contacts such that when they are all open, C does not conduct 
(we will say "C is 0"). In the obvious way, we define minimal chain , minimal 
cut set (i.e., when one contact is deleted the corresponding property does not 

B. 1 Lemma 

Given a contact network C and any minimal chain and minimal cut set, their 
intersection is a singleton. 

This result is due to Khrapchenko [Kh71]. 



By induction on the number m of contacts in C. For m = 1 the assertion 
is obviously true. If m > 1, C must be either a series combination of smaller 
networks C. and C™, or a parallel combination of smaller networks C. and C ? . 
In each case it is simple to establish the lemma. D 

Suppose now we have a contact network S that represents © x.. Let m. 

i=l J 

denote the number of contacts labeled with x. or x. . Then we are interested 

in H m.. 

Consider n-tuples (a.) for 1 ^ i ^ n and a. € {0,1}. An n-tuple of this 
kind will be called even if it has an even number of l's, otherwise it is odd . 
Obviously S must be 1 on odd n-tuples and on even ones. 

Consider an arbitrary odd n-tuple a_ = (a- ,. . . ,a. ,. •• ,a ) and an even 
n-tuple b = (b 1 ,. . . ,b. ,. . . ,b ) at Hamming distance 1 from a. If b. = a., then 
all other components of a and b are equal, e. will denote the n-tuple with 
a single 1 in the i place. Then we will write b = a ffi e.. 

To each odd n-tuple a. we can assign a minimal chain c (a) (consisting of 
a subset of contacts of S that are closed at a and that do form a minimal 
chain); similarly, to each even n-tuple b_ we can assign a minimal cut set 
s(b) (consisting of a set of contacts of S that are open at b and that do form 
a minimal cut set). 

Let a be odd, b_ = a ff" e. even. Then by Lemma B.l, c (a) H s (b) is a 
singleton; in fact, it is easy to verify that it must be a contact labeled 
either with x. or x.. 


We build now Tables I and II. The rows of Table I correspond to odd 

n-tuples while those of Table II correspond to even n-tuples. Thus both 

have 2 rows. The columns of both tables correspond to the variable x. 

for 1 ^ i ^ n. The entry a(a,j) in Table I is c (a) H s (a © e.). This entry 

will be represented by a number between 1 and m.. 

Let t. . denote the number of times contact number i (among those labeled 

with x. or x.) appears in column j of Table I. Then 

S t = 2 n_1 (B.l) 

1=1 J 

The entry p(b,j) of Table II is s (b ) D c (b © e ). (B.l) again holds. 

Construct now Table III. The rows of Table III correspond to all possible 
pairs (a,b) where a and b are odd and even n-tuples respectively. The columns 
of Table III again correspond to variables. An entry of Table III is yCa.bjJ) = 
(a(a,j), p(b,j>. 

Consider now the diagonal entries of Table III (i.e., (C0,p) such that 
a = p). Let (CC(a,j), p(b,j)) be such an entry. Then CC(a,j) = P(b,j) = 
c(a) D s(b). Thus, by Lemma B.l, there can be only one such entry in a row. 


n J 2 
The number of diagonal entries is S S t^ .. Thus, 

j=l 1=1 



j-1 i=l " lj 

Combining a version of Cauchy's inequality 

m j „ "j 

t t?. ^ 2 2(n " 1) (B.2) 

-L ( t t ..) 2 <. s t 2 . 
m j i-i 1J i=i 1J 


with (B.l) and (B.2), we obtains J ri: 

n 1 

Thi» time *e apply the, idaqnality , ■<*& lab mItM ,«WI .W t <sjH 


( E »,).( E J- ) *n 
(for both inequalities see, e.g., Dfc*4j#. 9), «aa obtttartmeeeeired result 

ot:>.i3i;iac;i?«.; : tu rjivii 

«»l-dQi*-i s,ii;.i--. *:■;-: 

E *. a^tt-i' vsll* orfot .£ „IoV s jQi3nc.D i>B£ 



3fe3i3^,a 5<" 




, 5tf«R:3"X«/-2 iot3;K 

t. HZi^M S H$#Jft J ^».5-.**0A y i|C 



Ar69 M. A. Arbib, Theories of Abstract Automata, Prentice- Hall, 1969. 

Av69 A. Avizienis, On the Problem of Computational Time and Complexity of 
Arithmetic Functions, Proc. ACM Symposium Theory of Computing, 
May 5-7, 1969, Marina del Rey, California, pp. 255-258. 

Be71 C. Berge, Principles of Combinatorics, Academic Press, 1971. 

Co71 S. A. Cook, The Complexity of Theorem Proving Procedures, Proc. 3rd 
Ann. Symposium Theory Computing, Shaker Heights, Ohio, May 3-5, 1971 
pp. 151-158. 

Gr59 Editors, Eugene M. Grabbe et al. , Handbook of Automation Computation 
and Control, Vol. 2, John Wiley, 1959. 

Ha71 L. H. Harper and J. E. Savage, On the Complexity of the Marriage Problem 
(unpublished) . 

Ho68 L. Hodes and E. Specker, Lengths of Formulas and Elimination of Quanti- 
fiers I, Contributions to Mathematical Logic, K. Schutte, editor, North 
Holland Publ. Co., 1968, pp. 175-188. 

Ho70 L. Hodes, The Logical Complexity of Geometric Properties in the Plane, 
Journal ACM, 17, No. 2, pp. 339-347. 

Kh71 V. M. Khrapchenko, On the Complexity of the Realization of the Linear 
Function in the Class of rr-Circuits , Mat. Zametki, 9, No. 1, 1971, 
pp. 35-40 (Russian). 

Kr59 R. E. Krichevskii, Realization of Functions by Superpositions, Prob. 

Cypernetics II, 1961, pp. 458-477, Pergamon Press (translated from the 

La67 S. Lang, Algebra, Addi son-Wesley, 1967. 

Lu59 0. B. Lupanov, Complexity of Formula Realizations of Functions of 
Logical Algebra, Prob. Cybernetics III, A. A. Lyapunov, editor, 
Pergamon Press, 1962, pp. 782-811 (translated from the Russian). 

Lu70 0. B. Lupanov, On Some Results in the Mathematical Theory of Synthesis 
of Control Systems, Information Materials 5(42), Ac. Sci. USSR, Moscow 
1970, pp. 16-22 (Russian). 


Mi69 Mo Minsky and S. Papert, Perceptrons, MIT Press, 1969. 

Mk71 R. McKenzie, et al. On Boolean Functions and Connected Sets, Math. 
Systems Theory, 5, No. 3, pp. 259-270. 

Mt64 D. S. Mitrinovic, Elementary Inequalities, P. Noordhoff Ltd. Groningen, 

My71 J. P. Mylopoulos and T. Pavlidis, On the Topological Properties of 
Quantized Spaces I, II, Journal ACM, JL8, No. 2, pp. 239-254. 

Ne 66 E. I. Neciporuk, A Boolean Function, Soviet Math. Dokl. , 2, No. 4, 
1966, pp. 999-1000. 

Ri42 J. Riordan, C. E. Shannon, The Number of Two Terminal Series-Parallel 
Networks, J. Math, and Phys. 21, 1942, pp. 83-93. 

Ry63 H. J. Ryser, Combinatorial Mathematics, MAA Math. Monographs, No. 14, 
John Wiley, 1963. 

Sh49 C. E. Shannon, The Synthesis of Two-Terminal Switching Circuits, Bell 
System Tech. J., 28, No. 1, 1949, pp. 59-98. 

Su61 B. A. Subbotovskaya, Realizations of Linear Functions by Formulas Using 
V, &, -, Soviet Math. Dokl., 2, No. 2, 1961, pp. 110-112. 

Vi70 B. Vilfan, Cyclic Perceptrons and Pattern Counting Machines, Proc. 

4th Ann. Princeton Conf . Info. Sci. and Syst. , Princeton U. , March 1970 

Ya54 S. V. Yablonskii, The Realization of the Linear Function in the Class 
of TT-Circuits, Dokl, Ac. Sci. USSR, 94, No. 5, pp. 805-806 (Russian). 

Ya59 S. V. Yablonskii, On the Impossibility of Eliminating Exhaustive Search 
of Boolean Functions in the Solution of Some Problems in the Theory of 
Circuits, Dokl. Ac. Sci. USSR, 124, No. 1, pp. 44-47, (Russian). 

This empty page was substituted for a 
blank page in the original document. 

CS-TR Scanning Project 

Document Control Form Date JJ±±JJ±. 

Report # LCS-TR-^"? 

Each of the following should be identified by a checkmark: 
Originating Department: 

□ Artificial Intellegence Laboratory (Al) 
^g" Laboratory for Computer Science (LCS) 

Document Type: 

V^ Technical Report fTR) □ Technical Memo (TM) 

□ Other: 

Document Information Number of pages: //8(/^-.>^fJ 

- Not to include DOD forms, printer instructions, etc... originarpages only. 

Originals are: Intended to be printed as : 

□ Single-sided or □ Single-sided or 

^ Double-sided £( Double-sided 

Print type: 

□ Typewriter □ Offset Press □ Laser Print 

□ InkJet Printer J^f Unknown □ Other:_ . 

Check each if included with document: 

2^ DOD Form □ Funding Agent Form ^ Cover Page 

□ Spine □ Printers Notes □ Photo negatives 

□ Other: 

Page Data: 

Blank Pages^,-,,™^: foU-owj Q»*t PAG^ 

Photographs/Tonal Material (by,»9anumb«): . 

Other (note dMcnplion^MB* numbar). 

Description . Page Number 

Scanning Agent Signoff: . . r . 

Date Received: I lJ3lHi__ Date Scanned: / / Ml K Date Returned: cLjJjl*. 


Scanning Agent Signature: rWAtjfA^ykjl f v • mtt^R,, Rev &» ds/lcs Document control Fo<mc*Morm.v«d 


Security Classification 


(Sacumy claaallication ol till,, body of abstract and indexing annotation muat ba antarad whan the overall report ia claa tilted) 

I. ORIGINATING ACTIVITY (Corporate author) 

Massachusetts Institute of Technology 
Project MAC 


The Complexity of Finite Functions 



26. GROUP 


4. OESCRIPTIVE NOTES (Type ol report and tnclu 

Ph.D., Department of Electrical Engineering, February 1972 

5. AUTHORIS) (Laat name. Itrat name, Initial) 

Vilfan, Bostjan 


March 19 72 


N00014-70-A-0 362-0001 




76. NO. OF REFS 


MAC TR-97 (Thesis) 

•6. OTHER REPORT NOIS) (Any other number t that may be 
aeaigned thla report) 



Distribution of this document is unlimited 




Advanced Research Projects Agency 
3D-200 Pentagon 
Washington, D. C. 20301 

s. abstract T h e topics covered are the length of formulas for finite func- 
tions , the order of cyclic perceptrons , and pattern counting machines. 
Using a generalization of a theorem of Specker, it is shown" that the 
Boolean function is 1 if the number mod p of arguments equal to 1 is 
cannot be represented by a formula of length proportional to the number 
of arguments if k-ary logic is used with p>k. The same thing can be 
shown for arbitrary k if the only binary operators used are max(x,y) 
and min(x,y). It is also shown that the connectivity predicate cannot 
be represented by a formula of this kind, regardless of k and of the 
operators used. Next shown is that the connectivity predicate and the 
Euler number predicate cannot be represented by finite order cyclic per- 
ceptrons. Finally, it is shown that the only topological predicates 
that can be reconstructed from the k-subpattern spectrum of a given 
square pattern of 0's and l's are functions of the Euler number. The 
k-subpattern spectrum of a pattern is a tuple given the number of 
omnrrsnces of anv kxk square subpattern in the original pattern. 



computational complexity 


finite functions 

DD/ N r.. 1473 (M.I.T.) 


Security Classification 

Scanning Agent Identification Target 

Scanning of this document was supported in part by 
the Corporation for National Research Initiatives, 
using funds from the Advanced Research Projects 
Agency of the United states Government under 
Grant: MDA972-92-J1029. 

The scanning agent for this project was the 
Document Services department of the M.I.T 
Libraries. Technical support for this project was 
also provided by the M.I.T. Laboratory for 
Computer Sciences. 


^te: {/&g//J<lg 

M.I.T. Libraries 
Document Services 

darptrgt.wpw Rev. 9/94