# Full text of "mit :: lcs :: tr :: MIT-LCS-TR-097"

## See other formats

```This blank page was inserted to preserve pagination.

'-h-~akj ~d.tVv

THE COMPLEXITY OF FINITE FUNCTIONS

id i^daKwr stf.3 ^rd O&J.'Issml -ra^ali s

Cambridge

sis*

. ^i .1 - : J

• J dUs^a ?&:{:) nWO-Kfe 8X Jl iCs^vl^qg-S

3Ateb'iO aj*t3s«&sg a*!3 3ss«siigs3 3onrrso
b 1p ioo2<j sw 3e43~ »i &M5 lo assise

•fell 5«!3 ii £ -\$&btb UBt-lupmi nol3onuJ
"ie pas srf^ isi cs*olj Harass*] *o*>*iq a it's

d i^uk 31 s «Ism»Jt3s-£ siM iav*o gffo-r

HASSJKH UW g lH

-to £flS.I?:»aul! sis sstyg

m-'ii&siTc

Zfl3 , 33V9WCJM

-■?:rise*^ 3ifi»Jt^fc*vni &'

:36!

.?rsa:J3s\$ Thai 'mlus's l^aq £ ssatij iin -^dnux:

tts 02139

-5 > JJJ!

iiasi so

i3"00<.i

tipoo snj

THE COMPLEXITY OF FINITE FUNCTIONS

ABSTRACT

Lower bounds on the length of formulas for finite functions are
obtained from a generalization of a theorem of Specker. Let f:
{0,1,... ,d-l} n -» {0,1,... ,d-l) be a function which can be represented
by a formula of length ^ en. For any m, if n is sufficiently large,
there is a restriction f: { 0,1, . . . ,d-l} m -» { 0,1,. . . ,d-l} of f which
is representable by a special class of formulas called homogeneous
e-complexes. By showing that certain functions do not have restric-
tions representable by homogeneous e-complexes, we are able to conclude
that the length of formulas representing the mod p sum, p > d, or the
connectedness of a pattern on a discrete retina cannot be bounded by
a linear function of the number of variables in the formula.

Also considered are perceptrons over finite fields (cyclic per-
ceptrons). It is shown that cyclic perceptrons of bounded order
cannot represent the geometric predicate connectivity. An interesting
aspect of this is that one proof of the corresponding result for
bounded order perceptrons over the rationals rests on the inability
of the latter to represent the parity function. However, the parity
function requires order 1 if the field has chracteristic 2; thus,
this proof breaks down in the case of cyclic perceptrons. Another
geometric predicate that cannot be represented by bounded order
cyclic perceptrons is Euler number equals k (for an arbitrary k).
However, this predicate can be represented by bounded order percep-
trons over the rationals. It must be noted, however, that our proofs
are different and much simpler than the corresponding proofs derived
by Minsky and Paper t for perceptrons over the rationals.

Finally, we investigate k-pattern spectra of a discrete retina.
This is the 2 k -tuple, each component of which corresponds to the
number of times a particular kxk pattern occurs on the retina. It
is shown that the only topological predicates that can be determined
from k-pattern spectra of discrete figures are functions of the Euler
number of the figure.

This report reproduces a thesis of the same title submitted to
the Department of Electrical Engineering, Massachusetts Institute
of Technology, in partial fulfillment of the requirements for
the degree of Doctor of Philosophy, February 19 72.

mimmx^ wm ^ittm -'fi'^^ m i ^ ^ v. m mm ^ r f * ** ,<. / ***** * i Uj^ULi p ^.Wpi l ipjIl i ll l LI I Jm^- ' iipPi ' l ' . I WIU ll .1 J.HW^L«KjHWWP*wwiMp mu!

AgKMQWIJSPGBMENTS

H*5 : .1

In the first place, I cp^^ Aefe% ^ jg^fl^^u^k jto Professor
Albert R. Meye^ t»,tfea qftFCft ^ 3 M^F^^ft^f it ^.*r hom * learned
the heuristics of Research. ^<i£\$\$§£%u\fg&^g iijtifoduced me
to the problems described he? ,a, f ^ r ^a^J (3 t l %ijQ <#\$«&% l<?n,g hours with
me suggesting improvements and modifications.

I am also indebted to my readers, Professors Michael J. Fischer

89 xs !' qatot)** s* J * i

and C. L. Liu for valuable suggestions.

I would like to thank Professor Frederick C. Hennie and

-*5TO-vf r ;.- f swi:»<;8 rC IS

Project MAC for financial support during my study.

Last but not least, thanks are due to Miss Marsha Baker for
consenting to type this thesis.

•■'■■--'■ ■■: 'iXlS'-""^~: ?f> i'1gy».J :>ffT' ? ,£

-. ^i.-./u MU<: £ ocr SAT %ij H 1 I»!£ : : ms -i y;s~:w^i^

CONTENTS

CHAPTER ONE:

CHAPTER TWO:

INTRODUCTION AND SURVEY 5

1.1 Finite Functions 5

1.2 Formulas 5

1.3 Measures of Complexity 9

1.4 Problems Related to the Length Measure 13

1.5 Specker's Theorem 20

1.6 Cyclic Perceptrons 23

A GENERALIZATION OF A THEOREM OF SPECKER 26

2.1 e-Complexes 26

2.2 The Generalized Specker's Theorem 38

2.3 On Specker's Theorem 41

CHAPTER THREE: APPLICATIONS OF THE GENERALIZED SPECKER THEOREM 50

3.1 Counting mod p 51

3.2 Connectivity 55

3.3 The Length of Symmetric Functions 60

CHAPTER FOUR: CYCLIC PERCEPTRONS 74

CHAPTER FIVE: PATTERN COUNTING MACHINES 89

APPENDIX A: CERTAIN PROPERTIES OF SHORT FORMULAS 97

APPENDIX B: THE LENGTH OF THE MOD 2 SUM OVER II 112

LITERATURE 116

BIOGRAPHICAL NOTE 118

CHAPTER ONE

INTRODUCTION AND SURVEY

1.1 Finite Functions

Let n be nonzero and finite; then a partial function IN ■"♦JKT , defined
on only finitely many n-tuples, is called a finite function . We will restrict
our attention to a subclass of finite functions. D = {0,1, . . . ,d-l] is an
initial interval of^KT. Then we will consider ?, the set of all (total)
functions D -♦ D for all possible D and (finite and nonzero) n.

Let f : D n -♦ D. Then f is identified with a (functional) table with d
rows (corresponding to all possible n-tuples over D) and n+1 columns (corres-
ponding to the n arguments and the value of f). Obviously the number of func-
tions D -» D is d .

Consider any function f i D -» D for arbitrary D, n. We will say that f
depends on the i argument if and only if there exist two n-tuples

a = (a n a. , . . . ,a ) and b = (b. , . . . ,b , . . . ,b ) such that a . = b for j t i ,

lin lin J j

a. / b., and f(a) ^ f(b). Suppose that f does not depend on its j* h argument;
then we will say that the j argument is a fictitious argument.

1.2 Formulas

Let there be given the countable sets H ■ {x^x^...} of variable symbols
and fi of operator symbols. Each element of fJ is a name for a function in ^,
and conversely each function in ^ has a name in 0. Let 9 £ represent the
function f : D n -♦ D. Then we will write argOP) = n and dom(9) = D.

1.2.1 Definition

A D- formula is a finite expression F = cp(G..,...,G ) such that tp G H ,
arg(cp) = n, dom(cp) = D, and either G. 6 H or G. is a D-formula for 1 ^ i ^ n.

A formula is simply a D-formula for some D.

Let F be an arbitrary D-formula and let x be the highest numbered
variable symbol appearing in F. Then F represents a function f : D -» D.
This correspondence is well-known and we will not describe it in detail.
Without danger of imprecision, F will also be considered as a representation
for all functions obtained from f by adding fictitious arguments.

Let there be given two formulas F and G. Suppose that F represents a
certain function f, and also a representation for f can be obtained from G
by possibly choosing different variable symbols. Then we will say that F
is equivalent to G (F = G).

Remarks. Usually, if we are dealing with D- formulas for a single domain
D, we represent the identity function by a variable symbols (i.e., we omit
the operator symbol for the identity). In the formal model we use, we cannot
do this since it would be ambiguous. Also, for purely technical reasons, we
insist that every operator has at least one argument (otherwise, the wording
of several definitions and results would be more cumbersome). Thus, we do
not allow constants. Rather, instead of constants, we use operators with one
fictitious argument. Suppose we are given the formula F. Occasionally, we
will say "Replace the variable x (in F) by the constant a". This is to be
interpreted as "Replace the variable x with a (y) " where y is a variable
symbol not appearing in F.

Let f : D -+ D and let g be an arbitrary finite function of n arguments
with domain E £ D and such that g = f E. Let F be a D-formula for (i.e.,
representing) f. Then we can also say that (F,E) represents g. From now on
we will not be pedantic, and we will simply say that F represents g. Some of
the main results in this thesis are concerned with the question, given a
specific function D -+ D , how much can we simplify its representation if we
choose an E-formula for it with D f E.

If F is an arbitrary formula, then the set of variables appearing in it
will be called its support (denoted by S(F)). The set of operators appearing
in F will be called its basis (denoted by B(F)).

Let § s ft. Then the set of formulas F such that B(F) C \$ will be called
the set of formulas over f. Hopefully without too much danger of ambiguity,
we will also say that \$ is a basis of operators (for formulas over \$). All
the significant results we will describe deal with formulas over \$ when \$
is finite (and representing a set of operators with domain D for a single
value of D). From now on, whenever a basis of operators \$ is introduced,
it is always assumed finite. Usually, we are interested only in bases that
allow all function D -♦ D for a certain D and arbitrary n to be represented.
Such bases will be called complete bases (for D).

Notation. Elements of % will always be denoted by lower case Latin
letters. The various bases of operators we will use will be denoted by
capital Greek letters; operators (i.e., basis elements) will be denoted by
lower case Greek letters (except for well known operators for which established
notation exists); formulas will be denoted by capital Latin letters; and D
will always refer to the domain of formulas. d will denote D .

1.2.2 Example .

If D = {0,1}, then the functions D -» D for arbitrary n are known as
Boolean functions. A complete basis for (0,1} conists of the binary operators
A (conjunction) and V (disjunction), and the unary operator — - (complementation),
This basis shall be denoted by II. The formula F = V(A(— (x ),x 2 ),A(x , — ( x 2 ^^
over II represents x 1 x„ (the mod 2 sum of x 1 and x„). Usually, this is
written as x.. A x V x- A x_. We have S(F) = {x..,x 2 }, and B(F) = H „

A convenient representation of formulas is by trees. This is a standard
device that will not be described; suffice it to say that to each formula F
there corresponds a tree T(F) whose terminal nodes are labelled with variable
symbols and the nonterminal nodes with basis symbols. As an example, let F
be as defined in Example 1.2.2. Then T(F) is shown in Fig. 1.1.

Given a formula F, we need a notation for subformulas of F.

The definition of sub formula is the standard one: (1) F is a subformula
of F, (2) if F = cd(F ,...,F k ), then if F . for 1 ^ i ^ k is not a variable
symbol, any subformula of F. is a subformula of F, and (3) subformulas of F
are only objects satisfying (1) and (2). Subformulas distinct from F are
proper subformulas.

Let G be a subformula of F such that G = cp(H., , . . . ,Hg). Then we will say

H. =G . for 1 £ i £ jfc. This notation can be iterated. In Example 1.2.2,
i .1

F = -(x 9 ). However, note that F „ . is a variable symbol which according
to our definition is not a formula. This can be remedied by replacing this
particular occurrence of the variable symbol x. by id(x^). For this reason
we will require that all the bases we consider contain the identity function
whether this is specifically mentioned or not.

If G = F .i(l).i(2). i(r)' then ^ = J ^)J ( 2 )° ■ • J( r ^ is called th e index
of G (for completeness, let X denote the index of F). If G is a proper sub-
formula of F, then F = H(X,G) where X U S(G) = S(F) and H(X,z) is a formula
(determined by j ) where z appears only once. We write H = F/G. In this case,

with F and G as given, we will also write S*(G) = X (i.e., the variables of

F

F that appear outside of G) . We define S*(F) = <t>. The subscript F will

F

generally be suppressed when it will be clear to what formula F we refer to.
In what follows, whenever we will deal with a subformula G of F, it will be
assumed that the index of G is also given; for if not, then, e.g., F/G and
S*(G) are not uniquely defined.

Frequently, formulas will occur where certain variables have been re-
placed with constants. Suppose F is a formula over I, X G S(F), and a € D;

then, F with all variables except those in X replaced by a will be denoted

X X

by F . Obviously, F is a formula over § U {a}. If f is an arbitrary function,
a a

X a subset of its arguments, then f has the analogous meaning, viz., the

cl

function obtained from f by restricting the elements outside of X to a.

The functional table of f X is obtained from that of f by deleting all columns

a

except those that correspond to X and retaining only the rows with a
entries in the deleted columns.

1.3 Measures of Complexity

Let us introduce the three most widely studied measures on formulas:

(1) Length. The length of a formula F, denoted by L(F), is the number of
occurrences of variable symbols in F. In other words, it is the number of
terminal nodes of T(F).

10

(2) Cost. The cost of a formula F, denoted by C(F), is the number of opera-
tor symbols in F. In other words, it is the number of nonterminal nodes of
T(F).

(3) Depth. The depth of a formula F, denoted by D(F), is the depth of nest-
ing of operators in F. In other words, it is the number of arcs on the long-
est branch of T(F).

Now, given an arbitrary function f : D -» D and a (finite) basis \$, we
define the length of f over \$ as

L(f,§) = min({4: There exists a formula F over \$ for f such that

L(F) = I})

If f cannot be represented by a formula over §, we define L(f,f) = °°o Simil-
arly, for cost and depth.

It is noteworthy that all the measures above are closely related. In fact,

c «L(f,\$) ^ c (f,\$) ^ C 1 -L(f,\$) (1.3.1)

c 2 -log 2 (L(f,\$)) iD(f,l) * c 3 -log 2 (L(f,3)) (1.3.2)

for an arbitrary function f such that L(f,\$), C(f,\$), and D(f,\$) are finite,
and certain constants c Q , c., c 2 , and c„ that depends on \$. The basis I is
also arbitrary, except in the case of the right inequality of (1.3.2) where
it must be such that all the constants and the function g (see Lamma 1.3.1)
may be represented.

We first establish the relation between cost and length (1.3.1).

11

Any formula F over \$ can be built up from one which uses only one opera-
tor symbol (an elementary formula) by successively replacing variable symbols
with new elementary formulas. If F does not contain one-argument operators,
then whenever we increase the cost during the build-up (by adding an elementary
formula with cost 1), we also increase the length. Specifically, the length

increases by between n . -1 and n -1 where n . and n are respectively
J min max mm max

the smallest number larger than 1 and the greatest number of arguments of an
operator of §. This results in the estimate

1 - -L(F) £ C(F) <: — — r -L(F) (1.3.3)

n -I n . -I

max mm

where c = 1„ Suppose F contains one-argument operators. In other words, T(F)
contains nodes with branching factor one. Let the maximal number of such nodes
that occur one after another on any branch of T(F) be c*; then (1.3.3) still
applies with c = c* + 1. (1.3.1) is obtained from (1.3.3) by noting that the
minimal length or cost representation of any function (over the chosen basis
\$) can be achieved with a formula where c* ^ d (the number of functions D -♦ D).

The left inequality in (1.3.2) is established by a trivial counting argu-
ment (the maximal number of terminal nodes in a tree with branching factor ^

n and depth d is n ). The right side requires more effort (the following
max max

argument is due to R. W. Floyd). We first state the following obvious

1.3.1 Lemma .

Given a formula F such that F = F-CX^F^xp) where F 2 is a proper sub-
formula of F and F.. = F/F„, the following holds:

12

F = F 3 (F 1 (X 1 ,C ), F 1 (X 1 ,C 1 ),...,F 1 (X 1 ,C d-1 ),F 2 (X 2 ))

where G. for D ^ i ^ d-1 is any formula representing the constants 0,...,d-l

(or, as we have remarked previously, the one-argument function with constant

value), and F„ is any formula representing the function g(z Q , . . . ,z^_^,z^_ =

"z n if z, = 0: z, if z, = 1,..., z, , if z, = d-1.
d 1 d ' d-1 d

Let F be an arbitrary formula over §, and let G be a proper subformula
of F. We already know that F = H(X,G). The claim is made that if L(F) > 1,
G can be chosen in such a way that

L(H)-1,L(G) < ^MJj- -L<F) d.3.4)

max

where n is as defined previously. (Remark: L(H)-1 is the number of
max r

occurrences of the variables of S*(G) in H. )

To find G use the following procedure: Start with F and proceed to sub-
formulas of F. Assume you are considering the subformula K. Then two cases

can arise. Either among K for 1 £ j £ k where k is the number of arguments

• J

of the outermost operator of K there is one, j', such that L(K , ) ^ ct'L(F)

(0 < a < 1 will be determined later with the purpose of obtaining the lowest

possible estimate of L(H)-1 and L(G)), or not. In the first case, proceed to

K ., and containue. Otherwise, set G = K „ where j" is such that L(K „) =
. j • J • J

max (L(K .)) and terminate. Before the procedure terminates, L(K) ^ ct'L(F).
l^j^k ' J

Thus — — £ L(G) < CO-L(F). This also means (l-a)'L(F) £ L(H)-1 < (1-- )

ri
max max

.L(F) (because L(G) + L(H)-1 = L(F)). The lowest bound for L(G) and L(H)

is obtained by setting a = 1 ; hence (1.3.4)

max

13

Now apply Lemma 1.3.1 with G replacing F ? and H replacing F . F is of

depth c, depending on §. If the outlined procedure is applied recursively to

H(X,C ) for <; i <C d-1 and to G, we obtain in (1.3.2) c„ = , C , where
i 3 log b

n +1 + 2

, max T

b = .

n
max

Note that unlike the cost-length relationship, the minimal value of
depth may not be achieved by the same formula as the minimal value for length.

Apart from the relationship between the various measures, depth and cos'
will not be treated further. Even though in what follows (in this chapter)
many things hold mutatis mutandis for depth and cost, most of the specific
discussion and the examples shall be confined to length.

1A Problems Related t o the Length Measure

In this section we will mention several questions that have been asked
about the complexity of finite functions, their status as of this writing,
and how they relate to the work to be described here.

t

A more precise expression is obtained if the right side of (1.3.2) is re-

placed by • log. (L(f ))+i for some constant I. Namely, if we start out

J-Og.D L

with a formula F and decompose it according to (1.3.4) and Lemma 1.3.1, then
the length of H(X, C.) and G is bounded by ~L(F)+k where k is the length of C..
After n applications of Lemma 1.3.1, the lengths of the relevant formulas are
bounded by ^ L(F) + k^FT +. ..+ k^+k « ^ (L(F)) + -^— j- ; hence the figure
above.

14

The Problem of Aggregate Length

Let \$ be a complete basis (for a certain domain D). The statement of the
problem is: What is the largest number L(n,\$) such that there exists a function
f: D n -+ D and L(f,\$) = L(n,§)?

It has been studied by several authors, and is now effectively disposed
of. Riordan and Shannon [Ri42] first derived a lower bound for L(n,IT).
Actually they studied series-parallel contact networks, but the two models
are equivalent. The first upper bound (for the same model) was obtained by
Shannon [Sh49]. Krichevskii [Kr59] derived a lower bound for L(n,\$) for
arbitrary domains and bases, while Lupanov [Lu59] obtained the best upper
bound for the general case. The result is

0(L(n,\$)) = rr (1.4.1)

log a n

where 0(f(n)) = g(n) means that lim — pf is finite and nonzero. There

are two remarks that are in order here. The first is that

Formulas represent finite functions efficiently; i.e.,
the total number of formulas (over a given basis \$) of
length up to L(n,\$) closely matches the number of func-
tions of n variables. (1.4.2)

The second is

The fraction of functions D n -♦ D that can be represented
by formulas of length up to L(n,f ) • (l-e)for an arbitrary
< e ^ 1 approaches zero as n -» °°.

above.

15

Obviously, we could define functions C(n,§) and D(n,\$) analogous to
L(n,§) in terms of the cost and depth measures. In general, such functions
(aggregate complexity functions) can be defined in connection with any model
for the representation of functions D -♦ D and any measure on this model (an
obvious variation of L(n,\$) is to remove the condition of completeness on \$).
It should be noted that the asymtotic behavior of aggregate complexity functions
remains an active area of research. For references on the subject, see Lupanov
[Lu70].

The Minimization Problem

Investigation of the complexity of finite functions started on representa-
tions of Boolean functions by logical circuits. In fact, formulas can be
thought of as circuits with fan-out one. Thus, the first problems studied
were those a logic designer is likely to ask: Given a finite function, what
is the minimal circuit (formula) that represents it (i.e., find the complexity,
and do so "effectively").

Unfortunately, no statisfactory solution to the minimization problem
exists (for any measure). This does not mean that it is impossible to obtain
a minimal formula for a given function f : D n -» D; rather that existing
algorithms are impractical. Thus, it is always possible to order formulas

according to length, and then search all formulas up to length L(n,f) for the

d n
first formula that represents f; but since there are d functions of n argu-
ments this approach is absurb.

At the present, all existing algorithms for the minimization of functional
representations employ some sort of an exhaustive search (e.g. , the Quine
algorithm for the minimization of disjunctive form representations of Boolean

16

functions). In fact, there is reason to believe that a more efficient method
does not exist, i.e.,

1.4.1 Conjecture

Any generally applicable exact minimization procedure is comparable (in
terms of computational complexity) to an exhaustive search among formulas.

It is useful to consider a specific machine model. Let us consider
implementations of such a procedure as a deterministic one-tape Turing machine
M \$ (see, e.g., Arbib [Ar69]) that receives as its input the d -tuple defining
an arbitrary function f : D -♦ D, and whose output is the minimal formula F
(over I) for f. Conjecture 1.4.1 gives us that the computation time of
M- may attain an exponential (in the length of the input). Let us venture
a more restrictive and precise version of Conjecture 1.4.1:

1.4.2 Conjecture

Let M. be as described, m is the length of its input, and let Tl(m) be a

\$

function such that ^' m ' -» as m -» °° for an arbitrary constant c > 1. Then

m
c

the proportion of inputs of length m at which the running time of M exceeds
T|(m) approaches 1 as m -♦ °°.

Actually, the specific machine model on which the procedure of Conjecture
1.4.1 above is implemented is not particularly important. It can be easily
shown (see, e.g., Arbib [Ar69] , Chapter 4) that different deterministic machine
models (this applies to the most widely used models, e.g., one-tape and multi-
tape Turing machines) can simulate each other in such a way that the running

17

time of one is related to the running time of another at most by a polynomial.
In this way, whenever the running time is exponential (in the length of the
input) in one case, it must be so also in others.

It seems that Conjecture 1.4.1 was first expressed by Yablonskii [Ya59].
A very interesting result connected with this subject was recently obtained
by Cook [Co71]. He obtained strong evidence that a simpler problem requires
nonpolynomial time. The problem is that of recognizing whether a certain
disjunctive normal form (for a Boolean function) represents the constant 1.
Cook showed that if this problem could be solved in polynomial time (by a
deterministic one-tape Turing machine), then a number of other problems that
are regarded as very difficult (e.g., given the graphs G., and G^, determine
whether G.. is isomorphic to a subgraph of G 2 ; the recognition of primes; etc.),
would also be rapidly computable. Note that a fast minimization machine
would give us also a fast constant recognizer; hence, Cook's results supports
Conjecture 1.4.1.

The Classification Problem

In view of the difficulty of finding an exact nontrivial solution to the
minimization problem (i.e., one that does not employ exhaustive search), present
research is directed at establishing bounds for the length of functions. We
consider sequences of functions f,,... of 1,... arguments and study the
growth rate of the length of f . Thus, we can talk of classes of linear
(length) sequences, quadratic (length) sequences, etc. Also of nonpolynomial
(length) sequences. Unfortunately, if a sequence belongs to a nonlinear class,
it is very difficult to estimate its length. We cannot even assign represen-
tatives to the polynomial classes of degree > 2, let alone the nonpolynomial

18

classes. In fact, at the present we have only a very limited store of examples

of nonlinear sequences.

2 n
Consider the Boolean function f = .©., x.. Subbotovskaya [Su61] gave a

n i=l l

2 3/2
striking proof that 0(L(f , II)) ^ n . It was known already to Shannon (see

2 2
[Sh49], or [Ya54]) that 0(L(f ,11)) £ n (the length of this sequence, of course,

grows linearly if © is used). Unfortunately, it seems that the technique of

[Su61] cannot be generalized to d > 2. Subbotovskaya' s result has recently

2
been improved by Khrapchenko [Kh71]. He succeeded in showing that 0(L(f , II))

2
^ n . Since this result employs a very interesting technique, and since it

has not yet been translated into English, it is reproduced in Appendix B.

Neciporuk [Ne66] discovered a sequence of Boolean functions f such that
n 2

0(L(f ,§)) = ~ for an arbitrary basis ?. It is true that the functions

n' log 2 n J

involved in the Neciporuk sequence are rather "artificial" in that, while
defined in a straightforward way, they have no special significance; however,
lately Harper and Savage [Ha71] have succeeded in applying the Neciporuk tech-
nique to a practical combinatorial problem (The Marriage Problem).

Neciporuk 's construction is based on the following lemma: Let f be a
Boolean function of n arguments. Consider a subset X of the arguments of f
and the set of restrictions of f to X obtained by setting the arguments outside
of X to constants. Let the number of such restrictions be r. If F is any
formula over a finite basis \$ for f, then the number of occurrences of variables
representing the arguments in X is 2: clog.r where c depends on the basis §
(for the proof of this see [Ne66] or [Ha71]).

The Neciporuk function f of n arguments is then obtained as follows:
The n arguments are arranged in a rectangular array with dimensions as shown

in Fig. 1.2. Each argument x. . is associated with a 0-1 valued m-tuple a..

° ij — ij

19

such that (1) not all components are 0, and (2) if (i,j) 4 (k,-0 then
a. . ^ a, ,. Then we define

all i,j i£j

where K(a. . ,k) denotes the conjunction of those arguments x, whose second

subscript (t) corresponds to nonzero components of a...

It can be verified that the number of restrictions of f to the variables

of an arbitrary row (except, perhaps, the last which may be imcomplete) ob-

, . „n-m

tained by replacing the variables of the other rows with constants is l

This follows from the fact that any Boolean function can be uniquely represen-
ted by a Boolean polynomial (see Lamma 4.5). Then, by the lemma above, the
number of occurrences of variables of any row (except, perhaps the last) in

any formula for f is ^ c(n-m); hence, the length of f over \$ w c- — (n-m).

n 2 *

In other words, 0(L(cp ,\$)) = ■: for an arbitrary basis \$.

T n *-°&2 n

Neciporuk's construction may be viewed as a solution to a special case

of the following problem (the problem of exhibiting a function of arbitrary

length): Given a basis \$ and a number k ^ L(n,\$), exhibit a function

n n 2

f: D -> D of length ^ k over §. In Neciporuk's case 0(k) = , n .

Since so few examples of functions that are known to be of large length
exist (in spite of (1.4.3)), the reader has no doubt already gained the
impression that this problem too is very difficult. However, we again have
the trivial solution that consists in examining formulas in n variables in
the order of their length, recording the functions they represent, and
choosing the first previously unencountered function represented by a formula
of length ^ k. In fact, it is reasonable to state an analog of Conjecture
1.4.1:

20

1.4.3 Conjecture

The problem of exhibiting a function of arbitrary length is comparable
(in terms of computational complexity) to an exhaustive search among formulas.

We again make this conjecture more precise on the example of determin-
istic one-tape Turing machines.

1.3.4 Conjecture

\$ is an arbitrary basis. N. is a deterministic one-tape Turing machine

with input (n,k) where n is arbitrary and k ^ L(n,§), and whose output is the

d -tuple describing a function f of n arguments such that L(f,§) ^ k. Then

there exists a constant c > 1 such that if k ^ e«L(n,\$) for any < £ £ 1,

n
the running time of N, on input (n,k) exceeds c when n ^ n( e ).

We can sum up the discussion of the classification problem as follows.

The problem is far from understood. At the present no sequence of functions

2
is known whose length grows faster than n . Isolated examples of sequences

2
with growth rate ^ n are known, and present research is directed at inven-
ting more general techniques that can be used for estimating the complexity
of whole classes of sequences. Also techniques have to be devised for d > 2.
The importance of this will be discussed below in Section 1.5.

1.5 Specker's Theorem

The first general technique for proving the nonlinearity of a large
class of sequences (of Boolean functions) was discovered by Specker [Ho68].
Let the basis IT U {x © y} be denoted by S . Then

21

Theorem (Specker) .

If f is a Boolean function of n arguments, if L(f,£) ^ c«n for some
constant c, then for any integer m, if n ^ T| (m,c), a subset X = [x 1 ,...,x }
of the arguments of f can be found such that (1)

„ m m

i=l i=l

where c n , C- , c„ are Boolean constants and Tl (m,c) is a certain number-
theoretic function. Furthermore, (2) if the basis is II (the other assump-
tions remaining unchanged), then c~ =0.

This theorem has been used by Hodes and Specker to show that the predi-
cate

n

S x = mod k (1.5.2)

i=l 1

for k > 2 and x. f {0,1} is of nonlinear length over S.

Using the second statement of the theorem, they are also able to give

n
an alternative proof of the nonlinearily of the length of © x. over n.

i-1
Another result obtained with Specker 's Theorem is the fact that some

geometrical predicates (in particular, connectivity) discussed by Minsky
and Papert [Mi69] are of nonlinear length over S (see Hodes [Ho70]).

In Chapter Two we will formulate and prove a generalization of Specker' s
Theorem (Theorem 2.2.2) to include the case d > 2 and multi-argument oper-
ators in \$. Our proof reveals the nature of both results more clearly.

t _

T] Will be discussed in Chapter Two.

22

They belong to a class of combinatorial results reminiscent of Ramsey's
Theorem (see Ryser [Ry63]). In fact, an earlier version of our proof
of Theorem 2.2.2 used Ramsey's Theorem. Besides this, Theorem 2.2.2
enables us to derive the nonlinearity of new functions (sequences of
functions) such as counting mod p where p is a prime, d possibly equals
p, but there are restrictions on the basis, etc. An example of an im-
provement over existing results is the connectivity predicate. Hodes
[Ho70] proves that it is nonlinear if d = 2. However, in Automata Theory,
for example, the result that a certain language can be computed in non-
linear time if k states are used in the finite control would be considered
weak. Rather we search for proofs that work for arbitrary finite controls.
The Generalized Specker Theorem (Theorem 2.2.2) gives us a tool for proving
the nonlinearity of the length of the connectivity predicate regardless of
the domain D and basis §. We can apply it to connectivity by "reducing"
connectivity (for the meaning of "reduction" see [Mi69] or 3.2) to certain
symmetric functions.

We should note that the generalization of Specker 's Theorem that we
prove is the obvious one to attempt; but, as the reader will see, the proof
turns out to be less straightforward. As an indication, consider (1.5.2).

It does not generalize directly to d > 2 since, e.g., the function {0,1} -»

n _
{0,1} defined by E x. = mod 6 can be represented in linear length with

i=l x

d = 3. This is because

2, 5, n

[Ex. =0 mod 6] = [ E x = mod 3] A [ E x =0 mod 2]

1=1 X i=l l i=l X

23

Hodes and Specker do not derive any bounds for the lengths of the
functions investigated by them. This question is asked (and to an extent

1.6 Cyclic Perceptrons

Cyclic Perceptrons will be treated in Chapter Four. They are an
application of ideas of Minsky and Papert to the representation of functions
by combinations of finite operators. In particular, one of the concerns
in [Mi69] is to formalize the intuitive idea that the connectivity predicate,
being "global" in nature, cannot be computed (or represented) by a "simple"
combination of "local" predicates.

The perceptron is the predicate

E a.'tp. ^
i€I L X

where I is an indexing set, a. € Q, the rationals, cp. € \$, a set of Boolean

functions (whose value is interpreted as being either the rational or 1.

The cyclic perceptron is defined as

where a. € F, a finite field, Y ^ F, and other symbols have the same inter-
pretation as before. Thus, both represent a certain Boolean function.

Minsky and Papert introduce the concept of the order of a perceptron
(the maximal number of arguments on which cp depends where i ranges over I).
They define then the order of a predicate as the minimal order of a per-
ceptron that represents the predicate. They formalize "local" by defining

24

an infinite predicate sequence to be local if and only if every member is
representable by a perceptron of order ^ r, for some finite r. They are
then able to show that connectivity is nonlocal.

The concept of order can also be applied to cyclic perceptrons. Chapter
Four will contain results on the order of the various predicates introduced
in [Mi69]. In particular, connectivity is shown to be nonlocal. This will
be an extension (to finite fields of arbitrary characteristic) of the results

described in [Vi70].

Chapter Five describes a model of computation (Pattern Counting Machines)
that again performs- a "local" computation followed by a "global" computation.
In this case the "local" computation is even more constrained than in the
case of perceptrons. The result is that no matter how cleverly we utilize
the "local" information in the subsequent "global" phase, the connectivity
predicated cannot be computed.

25

T(F) for the formula F in Example 1.2.2
Fig. 1.1

.th

l row

.th ,

j column

X. ,

i;

■

number of columns: m = llog 9 nl + 1

number of rows: ni/ml

The array of arguments used in the definition of the
Neciporuk function f

n

Fig. 1.2

26

CHAPTER TWO

A GENERALIZATION OF A THEOREM OF SPECKER

2.1 e-Complexes

Throughout this section, all formulas are D-formulas for some fixed (but

r
arbitrary) domain D, and all operators are functions D -»D.

Given the formulas F.,...,F , we shall call the formula F = y(F^,. . . ,F^)
where cp is an arbitrary operator a parallel combination (PC) of F^, . . . jF^. cp is
called the decoding operator of F.

Let F(X,z) be a formula where the distinguished variable z appears only
once, and let G be an arbitrary formula. Then F(X,G) shall be called a series
combination (SC) of F and G through z.

2.1.1 Definition

We give an inductive definition of an elongated n- component (e^component)
for n ^ 0.

(1) Let cp. be an arbitrary unary operator and z an arbitrary variable symbol
Then cp. (z) is an e -component. z is the input variable while cp in is the

input operator .

(2) Let cp be an arbitrary binary operator, G an arbitrary e n _ ^component ,
and x \$ S(G). Then F = cp(x,G) (or cp(G,x)) is an e^component. The input vari-
able and input operator of G are also the input variable and input operator

of F. x is a lateral variable of F. Any lateral variable of G is also a
lateral variable of F. cp will be called an internal operator of F.

27

An example of an e -component is given in Fig. 2.1. Let F be an arbitrary
e -component, and let x be the sequence of lateral variables arranged in the
order they are connected to the branch of T(F) extending to the input variable.
Then x is the lateral sequence of F. If F is an e -component, then the lateral
sequence of F is \ (the empty sequence). For example, the lateral sequence of
the e-component in Fig. 2.1 is x..,...,x .

An e -component with all internal operators equal is a homogeneous
e - component .

2.1.2 Definition

A formula F is an e r - complex if (1) F is a PC of the e n -components F^...,
F , and (2) the lateral sequence of F. for 2 <: i <: r is either equal to the
lateral sequence of F. , or the reverse of it.

F ,..., F are the components of F. If the variables of F- are numbered
as in Fig. 2.1, the second condition of Definition 2.1.2 means that any compo-
nent F ..,..., F either appears as in Fig. 2.1, or as in Fig. 2.2. The compo-
nents of the former kind will be known as standard components, while those of
the latter kind will be called the reverse components. The lateral sequence

of F 1 will also be called the lateral sequence of F.

r
Both in the case of e -components and e -complexes, one or both indices

will occasionally be omitted if the particular property they refer to is

irrelevant to the argument at hand.

An e-complex composed of homogeneous e-components is a homogeneous

e- complex .

28

One might wonder what the purpose of introducing e-complexes is since

for appropriate r and m every function of n variables can be represented by

r
an e -complex. Thus, it would seem that this class of formulas is trivial.

r
However, we will be concerned with e -complexes where r remains fixed as n
' n r

grows without bounds, and this will allow us to obtain interesting results.

We introduce some notation. Let F be an e -component with lateral

sequence x. ,., N ,x, , ON , . . . ,x. , N . a € D is an arbitrary constant, cp. denotes
i(l)' i(2)' i(n) J j

the internal operator corresponding to x...,.

. Also set cp = cp. . Then
T n+1 T m

<p.(a,<P. +1 (a,...,<P k _ 1 (a,cp k )...)) if 1 ^ j < k ^ n+1

cp* , N =1 cp. if 1 ^ j - k £ n+1
undefined otherwise

Note that cp^, . * is a unary operator if j = n+1, otherwise it is a binary
operator (if it is defined at all). Usually we will suppress the superscript
a because it will be clear what constant is referred to.

We now state the simple

2.1.3 Proposition

Let F be an e -component with lateral sequence x and input variable z.
y. is an arbitrary subsequence of x of length m ^ and a € D is an arbitrary

constant. If we denote the set consisting of z and the elements of y_ by Y,

Y
then F is equivalent to an e -component G.

29

Proof

Let x = (x 1 ,x 2> ... ,x n ) and y = (Xj/jj, x i (2)'••• ' x i(m)^ ~ -' Set ± ^ = °

and i(m+l) = n+1. Then G has the operators ilf. = CD,. , . lN , n . ,. N v for i ^ j ^

j (i(j-l)+l, l(j))

m+1 (ijr is the input operator of G). □

2.1.4 Remark

Obviously, Proposition 2.1.3 holds for e-complexes as well; one merely
has to perform the above construction for each component.

Proposition 2.1.3 will be frequently invoked. Namely, we will take an
e-complex F, select a subsequence i e x, the lateral sequence of F, and obtain
G as above. In this case, G is called the result of an a - merger with basis
y on F.

We introduce another restricted class of formulas.

2.1.5 Definition

A series parallel combination of e- components (SPCeC) is obtained accor-
ding to the following rules:

(1) An e-component is an SPCeC.

(2) Let F and G be an e-component and an arbitrary SPCeC respectively.
Then the SC of F and G through the input variable of F is an SPCeC.

(3) If F.,...,F are SPCeC s, then a PC of F. F is an SPCeC.

1' ' r ' 1' ' r

(4) An SPCeC is only an object satisfying (1), (2), or (3).

30

Given an arbitrary SPCeC F, we describe its set of components . If F
consists of the single e-component G, then G is the only component of F.
If F is the SC of an e-component G and another SPCeC H, then the set of
components of F consists of G and the set of components of H. If F is a
PC of F ..,..., F , then the set of components of Fconsists of the sets of
components of F. for 1 ^ i ^ r. Among the components of F, those whose input
variable corresponds to a terminal node of T(F) will be called terminal
components while the others will be called internal components . An example
of an SPCeC is given in Fig. 2.3. This particular SPCeC has four terminal
components and two internal components.

2.1.6 Proposition

An SPCeC is equivalent to a PC of r e-components where r = d-I+J and I
and J respectively are the number of internal and the number of terminal component of F.

Proof F can be converted into a PC of e-components by using Lemma 1.3.1.
The estimate of the number of e-components in the PC is also obtained from
there. ^

Remark It is a simple matter to verify that if F of Proposition 2.1.6
has k components, then I ^ k-1; and thus r £ d- (k-l)+l.

2.1.7 Proposition

F is a SPCeC with k components F^...^. F. for 1 ^ i ^ k is an e^-

component for n ^ 0, and, furthermore, the sets of lateral variables of F^

and F. are equal for 1 <■ i, j ^ k. Let X be the set of lateral variables

of F. and let Z be the set of input variables of F. Then for any m ^ and
l

31

a f D if n ^ TL (m,k) where T] (m,k) is a certain function (to be defined),

there exists a subset Y £ X with |y| = m such that F is equivalent to an

e -complex G with Y as the set of lateral variables. Furthermore, r ^ d* (k-l)+l.
m

Proof If m = 0, we can immediately apply Proposition 2.1.6 and obtain an
e n ~complex where r is as described in the statement of the proposition; thus
TL (0,k) = 0. We assume, therefore, that m > 0.

We recall the following familiar result:

2
Let i(l), i(2) ,. . . ,i((p-l) +1) be a sequence of distinct integers. Then

we can extract a subsequence of length p that is either increasing or decreasing
(for the proof see Berge [Be71] p. 16).

Without loss of generality, we can assume that the lateral sequence of
F. is x-j.-.jX . Then the lateral sequence of F 2 is x. ,^, Xj/ 2 ) ' * " " ' x i(n)"
The sequence i(l), i(2),..., i (n) consists of distinct integers; therefore,
if n ^ (n -1) +1, we can apply the above result and find a subset X^ £ X of
n variables such that after performing an a-merger with basis X- on all
components of F, the lateral sequences of the descendants of F^ and F^ are
either the same or opposite. We can continue in this way, processing one
after another all components. We end up with an SPCeC with components
G 1 ,...,G k such that the lateral sequence of G ± for 2 <: i <■ k is either equal
to that of G, or the reverse of it. To obtain G, we apply Proposition 2.1.6.
In order that Iy| = m, we must have

2 k-l
n ^ T^On.k) = (m-1)

for m > 1. The estimate for r is obtained from the Remark following
Proposition 2.1.6. D

32

Another equivalence that will be used later is given by

2.1.8 Lemma

E is an e r -complex, X is the set of its lateral variables, and Z is the

set of its input variables. Then for any m ^ and a € D, if n ^ Tt 2 (m,r),

YlJZ
there exists a subset Y C X with 'Y I = m such that E is equivalent to a

' a

r it,
homogeneous e -complex F.

Proof If m = 0, we simply use Proposition 2.1.3, and the result is a homo-
geneous e n -complex (obviously, any e Q -complex is homogeneous). Then T! 2 (0,r) =0.
Thus, from now on we assume that m ^ 1.

The proof will be given for the special case when E has two components:
a standard component E- and a reverse component E . It will then only be
indicated how to generalize the proof.

A procedure (The Homogenizing Procedure- -HP) will be described that will

2
transform an e -complex G consisting of a standard component G. and a reverse
P i

component G„ with p a TL(q) (for a function Tl that will be defined later)
and with the properties: (1) There exist (possibly empty) subsets 1^ and
R 2 = D such that (^(a.y^I^ = id R (identity on 1^) and ilr^a.y)/^ = id R
for 1 < i ^ p where CD. and i|r, is an operator of G 1 and G 2 respectively
and the first argument corresponds to the lateral variable, and (2)

V(i si,j«p) [ffl^x.y) € Rj_ => »j(x,y) = ^(x.y)] (2.1.1)

(i.e., the operators of G, are identical on the inverse image of R^). Simil-
arly for the operators of G 2 on R 2 «

33

Remark Note that if R- and R include the range of every operator,
Property 2 translates into the identity of the operators. In particular,

this holds if R. = R 2 = D.

2

Remark Note that an arbitrary e -complex satisfies Properties 1 and 2

with R 1 = R- = 0.

2
The result of applying HP will be an e -complex H that will either be

q

homogeneous, or will have Properties 1 and 2 with S. and S„ replacing R.. and
R 2 respectively and R / S. or R ? S_. Due to the Remarks above and to
the fact that D is finite, repeated application of HE on E finally yields F.
The condition on n is

n * \(m,2) = T) 3 Q) 3 (...T1 3 (in) ...)) (2.1.2)

2d times
This bound for n corresponds to the worst case when R. or R_ increase by only
1 on each application of HP.

Before describing HP, note the useful fact that because of Property 1,
Properties 1 and 2 are preserved under a-mergers.

Description of HP The lateral sequence of G is of length (v+l)«u-l for
certain values of u and v that will be defined later.

Consider the sequence

EL 3

(CD ((k-l)-u+l,k-u)' + ((v-D-u+1, (v-k+l).u) ) (2.1.3)

for k = l,...,v. Sequence (2.1.3) is illustrated in Fig. 2.4. The two
vertical lines represent G- and G„; the numbered horizontal outlets represent
the lateral variables (with the corresponding number); the boxes indicate the

34

variables and operators that take part in the formation of any particular

cd,. .v and ilr,. • an 'x' beside a variable indicates that it is not set to

the constant while 'a' indicates that it is set to a; the two checked boxes

represent the first member of (2.1.3).

In the sequence (2.1.3), either (Case I) the ranges of c P//k_-i\. .i if. u "\

and ilr,, ., r1 , , ,.. N for 1 ^ k < v are included in R and R respectively,
((v-k)*u+l, (v-k+l)-u) 1 2.

or (Case II) not.

Case I . If v is large enough, we can find q identical elements in the
sequence (2.1.3). Let the indices k corresponding to these elements be
k 1s ...,k . Performing an a-merger with this set as basis, the desired
e-complex H is obtained. Note that in this case we use Property 1 of G.
Namely if cp* is the first component of a pair in (2.1.3) whose range C R^
and if 9 is an arbitrary operator of G^ then 9(a,ep*) = cp* (similarly for
the second components of the pairs in (2.1.3) and operators of G,,). Thus,

the components of the identical pairs in (2.1.3) become the operators of H.

d 2 d 2

A bound for v is q« (d ) (d is the number of operators D ■+ D).

Case II . Assume t P(/£_ 1 ). u+1 i. u )^ h ' C ^ ^ R l f ° r S ° me b » c ^ D and
1 <= I* v (the case when ♦ ((v _ i) . u+1> (v .^i). u ) < b > c > * R 2 Can be treated
similarly). But then

cp,„ , nO>,c) t R n (2.1.4)

for all ^ j ^ u-1 (as a consequence of Property 1). Provided that u is
large enough, we can find an element e £ D that appears w times in the se-
quence (2.1.4). Let the indices j corresponding to the appearances of e be
j (1),... ,j(w) (see Fig. 2.5). Obviously then (all the variables considered
except x« have been set to a),

35

'(i-u-Mt), *-u-j(t-l)-l) (a > e) =e

for 2 £ t ^ w (the first argument of cp . . corresponds to the lateral
variable). Thus,

^•u-jCt), i.u-j(t-l)-l) (a ' y) f R l U{6] =ld
At this point we consider separately two cases:

Case Ha ^ = 1 and j (w) = u-1. We perform an a-merger with the basis

consisting of the variables with indices u-j (t)-l for 1 ^ t ^ w-1. As a re-

2
suit of this we obtain an e -complex G' such that Property 1 holds for G'

w-1 1

on R 1 U [e] (G' satisfies Property 1 on Rl e R„; in any event R^-Ri = \$).
Note that at this point Property 2 is still satisfied only on R 1 and R by
Gl and Gi respectively.

Case lib . A 4- 1 or j (w) < u-1. We perform an a-merger with the basis

consisting of the variables with indices 4«u-j(t)-l for 1 £ t £ w. As a result

2
of this we obtain an e -complex G" such that Property (almost) holds for G "

(the same remarks regarding G" and Property 1 as well as G", G^ and Property

2 apply as in Case Ha). The only exception may be cd'.' (the operator of G"

that is closest to the decoding function). By definition, cp" = ©,.. . ■ / -\_^) :

and there is no assurance that cp''(a,e) = e. We may rectify this situation

by absorbing co" into the decoding function in the following way: Let G" =

9(G", G' r ). Now set x.. = a (after the a-merger the variables have been

renumbered). Let S(G") - {Xj} = U. Then we have (G ,r ) ^ = G' = e(cp^(a,GJ), C-.1 ;

wh

ere G' equals G' minus co * and Gi equals G' with the input operator modified

as follows: ilf! = -dr "(a, ilr ! r ) (remember that G' r is a reverse components, and,
in w in 2

2
hence, x, is attached to i]f''). Clearly, G' is an e -complex satisfying
1 w w -l

36

Property 1 with R- U {e] replacing R 1 .

We can now resume considering Cases Ila and b together. To obtain H

(with components HL and H„) we must find among cp.' for 1 ^ i ^ w-1 q operators

that are identical on the inverse image of R.. U (e) (i.e., (2.1.1) with

R- U [e] replacing R ) and again perform an a-merger. We again emphasize

that the operators of H are identical only on the inverse image of R , and

this property has not been violated by any of the transformations of the

original e-complex G.

To obtain q operators that are identical on the inverse image of R. U { e} ,

2 2

it is sufficient that w-1 ^ q-d ; therefore, u ^ d. (q.d +1) and

T] 3 (q) = q 2 .d-6 3 + q .d.(6 2 +6)5d-l (2.1.5)

d 2
where 6 = d . This is obtained from the values of u and v derived above.

Recall that T] (q) = (v+l)«u-l, the length of G, 7L for r = 2 can then be

obtained from (2.1.5) and (2.1.2).

The proof for the general case is obtained by defining the Generalized

H Procedure (GHP) with the corresponding function Tl, . We consider instead

of (2.1.3) the sequence

1 r' 1 r" .

, (s,t)"-" a5 (s,t)' Vs-.f) V,t) ;

(CD'

\a,\.) \*,^-) (>s ,z ) - ' vs ,u;

1 r'
where s = (k-l)-u+l, t = k»u, s' = (v-k)-u+l, and t' = (v-k+l)«u. co. ,- . . ,cp

1 r"
denote the operators of the standard components of G while i|r. ,. . . ,i|f. denote

the operators of the reverse components (r ' + r" = r). Without detailed

argument we state that in the general case v = q« 6 while u remains the same

(u is determined by the requirements of Case II at which time only one

37

component is considered). From this we obtain analogously to (2.1.5)

\(q) = q 2 -d.6 r+1 +q-d-(8 r +6)+d-l
TL(m,r) can then be obtained from

T) 2 0n,r) = n 4 CTl 4 (...T1 (m)...))

r«d times _

which is an analog of (2.1.2).

2.1.9 Remark

As we have seen, TL in Lemma 2.1.8 depends on r, the number of components
of F. However, we shall mostly be using e-complexes that contain many components
that are identical except for the input operator (such e-complexes are obtained
e.g., by the use of Proposition 2.1.7). It may be checked that in the application
of GHP only one representative from each such group of components need be
considered. This significantly reduces 7], . Similarly, in computing TL from
(2.1.5), only &•& compositions are required where I is the number of groups of
similar components (corresponding to d compositions for each group of similar
components).

2.1.10 Remark

The operators of a homogeneous e -complex F obtained as a result of

applying Lemma 2.1.8 possess an added property that will be used later: Let

R be the range of cp(x,y) an internal operator of R (x corresponds to a lateral

variable); then CD (a,y) R = id (the identity on r). This fact follows from

R

the definition of HP (GHP). This particular property of the operators of

the components of F will be called the I -property relative to y. In what

R

38

follows, we will always suppress "relative to y" since there is no danger
of ambiguity. We will simiarly suppress the subscript R unless we will be
interested in a specific range. We will abbreviate "F is an e-component
(complex) whose operators possess the i a -property" to "F is an e-component
(complex) with the I -property".

A familiar and convenient way of representing a binary operator cp(x,y)
is by a labeled directed graph. The graph of cp, denoted by T (?) , is defined
as follows: The nodes of F(cp) are labeled with elements of D. A directed
arc labeled with a £ D exists from b to c if and only if cp( a ,b) = c.

If D = {1, 2, 3, 4} and R = (2, 3, 4] an example of a graph T(co) for

2

an operator cp with the I -property is shown in Fig. 2.6.

Given an arbitrary e -component F, the output of the operator Cp is

^VW^W-'-VV^in^"-^

in the case of a homogeneous e -component with internal operator to this will
be abbreviated to

cp(x. x, ,,. ..x , cp. (y)).
k k+1 n' in J

2.2 The Generalized Specker's Theorem
We first give the following

2.2.1 Definition

Let \$ be an arbitrary basis and a € D any constant. Let F be a formula

over § with S(F) = X U Y U Z such that Ixl ^ n -1 (the maximal number of

1 ' max

arguments of an operator of \$), Y is disjoint from X, but otherwise arbitrary,

39

and Z = {z} is a singleton (disjoint from X and Y) such that z occurs only
once in F. The set of functions f(X,z) represented by all possible such
formulas F with the elements of Y replaced by the constant operator a will
be denoted by \$ .

In particular, every operator of \$ with all but k ^ 1 arguments (k is

•a q

arbitrary) replaced by a is in \$ . Note that if cp(X,z) €. \$ , then tp may
qualify for \$ by virtue of a number of different representations. If z
(or any variable of X) in any one of them corresponds to a variable that
occurs only once, it is called a distinguished argument ). The other arguments
are called free arguments . Thus we may easily find a basis \$ and an operator
cp such that all arguments are at the same time distinguished and free.

We now define a restricted class of e-components and e-complexes: § is an
arbitrary basis and a f D is any constant. Let F be an arbitrary e Q -component;
then F is an e n ~component over \$ . Let cp(x,z) f \$ be a binary operator, z
a distinguished argument (hence x is free), and G an e .-component over \$ ;
then cp(x,G) is an e -component over \$ . An e-complex over 9 is an e-complex
such that all its components are e-components over \$ .

The main result of this chapter is

2.2.2 Theorem

Let there be given the function f : D -► D such that L(f ,\$) ^ c.n for some
constant c and basis \$. Then for any m ^ 1 and a f D if n ^ T] (c ,m), there
exists a subset Y of the arguments of f such that |y| = m and f is either a

a

constant or is represented by F, a homogeneous e^-comp lex over i a with the I -property,
Y as the set of lateral variables, constant input operators, and r <■ d- (2c-l)+l.

40

Proof

If f has m fictitious arguments, then let Y be the set of these arguments,

Y
and f is a constant. From now on we assume that f has ^ m-1 fictitious
a

arguments;.

The statement of the theorem gives us that there exists a formula E
over \$ such that L(E) ^ c«n. Therefore, there are ^ 1/2. n variable symbols
representing the arguments of f which either do not appear in E or appear
^ 2*c times. In other words, there are s l/2n-m+l variable symbols t^at
actually appear in E and such that the number of occurrences of each is ^ 2-c.
Denote the set of these variables by X- .

If n ^ 2- T| (n„,2c >m-l, we can apply Lemma A. 9 and obtain a subset

X 2
X £ X with |x | = n and such that E is equivalent to E_, and SPCeC over

\$ a with at most 2c components and such that the set of lateral variables of

every component is X„.

If n„ ^ IWn-^c) we can apply Proposition 2.1.7 and obtain a subset

X r

X, <: X with JX,| = n and such that E 3 is equivalent to E3, an e n -complex

over f a with r £ d«(2c-l)+l). The estimate for r is obtained at this point.

If n„ > Tl 2 (m,2c). we can apply Lemma 2.1.8 and Remark 2.1.9 and obtain

F, the desired homogeneous e r -complex over ? with the I -property. The

I property is a consequence of Lemma 2.1.8.

41

Discussion of TL. The present proof yields

T1 5 (m,c) = 2. 1| 8 CT1 1 (Tl 2 <:m,2c), 2c ),2c)+m-l

The exact representation of TL is extremely complex, and in what follows we
shall use only a very rough approximation. In Appendix A it is seen that
71 (t,k) (as a function of k) grows faster than iexp(b,2k) for any constant b.
The functions TL and TL contribute only insignificantly to this, and thus we
state:

TL(m,c) ^ iexp(b,4c) for c ^ c(b)
5 (2.2.1)

and an arbitrary constant b

(Later we shall se that the size of TL prevents us from obtaining any

interesting bounds for the functions investigated with Theorem 2.2.2. The size

of TL which contributes most to TL results from the technique used in Lemma
o 5

A. 3 to obtain a nesting sequence for a given formula F. It is not known
whether this technique can be improved. Our guess is that it cannot be.) D

2.3 On Specker's Theorem

In this section it will be shown how Specker's Theorem follows from
Theorem 2.2.2 (the statement of Specker's Theorem is given in section 1,5).

In Theorem 2.2.2 set D = (0, lj , \$ = £, a = and let f be as described.
Then by Theorem 2.2.2, we obtain that for an appropriate choice of n, we can
find a subset X of the arguments of f with |x| = m and such that f Q is either
a constant or represented by

♦(F 1 ,...,F r )

42

where ty:[0, 1} -» (0, 1] and F for 1 ^ i ^ r is either a standard or reverse

homogeneous e -component over Z with the I -property and constant input

m

operators. The value of r is bounded as described in Theorem 2.2.2.

We now analyze the various functions that can be represented by e-compo-
nents with these restrictions. First note that S consists of all Boolean
binary operators; furthermore, if f(x,z) € E » then botn x and z are free
(because every Boolean binary operator can be represented over S in such a
way that each variable appears only once), and thus there are no restrictions
on the use of operators in the e-components we encounter.

All possible graphs T(cp) for cp f 2 are shown in Fig. 2.7. The ones that

satisfy the I -property are starred. The functions obtained by choosing a

value for the constant input operator for the starred graphs are shown in

m
Table 2.1. In general, this function is either b n ffb.- tt where tt = IT (1 *x.)

1-1

m

or c„ © c'CT where a = .\$, x. for some values of b_, b. or c n , c-. Now, taking
1 i=l i i u i

into consideration the fact that rr« a = 0, and that every |: (0, 1} -» -0, 1}
can be uniquely expressed as a Boolean polynomial

2 h -l

ffi c.«M.
i=0 1 1

where c. f {0, 1] and M is the monomial (of degree one in each variable) in

those among x, , . . . ,x, corresponding to nonzero bits of the binary representation

of i(see Lemma 4.5) we obtain the first part of Specker's Theorem.

The second part of Specker's Theorem could be obtained directly at this

point; however, we will derive a generalization of it in Example 3.1.3, and thus

omit it here.

43

It must be pointed out that our derivation of Specker's Theorem results
in a slightly larger bound for n; however, since no known application requires
a specific value for the bound, this is immaterial. Specker's bound (see
[H068]) is obtained from the function

y.(m,0) = m

H(m,k) = 4 (k+1) .n(u(m,k-l) } k-l)

by setting T) (m,c) = 2|j.(m,2c). Our bound is slightly larger due to the addi-
tional processing implicit in the application of Proposition 2.1.7 and Lemma
2.1.8. However, u resembles T| and this function by far contributes the most
to T] ; thus, we can state that the bounds are approximately equal.

Finally, let us note the fact that Theorem 2.2.2 allows us immediately
to amplify Specker's Theorem. Namely, the statement of the theorem involves
the basis consisting of all binary Boolean operators. However, the proof
of Theorem 2.2.2 works for bases consisting of operators of an arbitrary
number of arguments.

44

^r) — (£)-•• — (£) — ©— O

T(F) where F is an e -component
Fig. 2.1

^J) @ — @ — @ — Q

T(F) where F is a reverse component of an e-complex
Fig. 2.2

45

F is an SPCeC
Fig. 2.3

46

C

u-1-
u

u+1

2u-l
2u

(k-l).u+l

X {

k.u-1'
k«u

(v-l)«u+l-
v.u-1 ■

VtU

(v+l).u-l

X
Y

(i

a
x
a

a
x

a
x

%~

a

x

(v+l)*u

V'U+1-

v»u ■

k.u+1-
k«U'

3u-l-

2u+l
2u

2u-l

u+1-
u -

qp

((k-l)«u+l,k.u)

}

((v-k).u+l,(v-k+l).u)
Illustration of the HP procedure (I)
Fig. 2.4

(\r 1 1 ^ . ii 1

}Y

— !>
<»

47

(4-1). u-1

X.u-j (w)

<p

( (J e-i).u-i^.u) (b ' c)? R i

Illustration of the HP procedure (II)
Fig. 2.5

2, 3

F((p) for an operator cp with the I f2 3 , -property

Fig. 2.6

48

0, 1

(i)

1 ) (J)

(k)

0, 1

1 ) (!>)

(m)

(n)

CV^>0

( ) ( 1 )

(&)■■

0, 1

The graphs of all Boolean binary operators
Fig. 2.7

49

r<\$P)

<P (y)

in

Function

a

a

1

c

c

1

m

rid® x >

i=l

d

d

1

1

g

m

i=l i

g

1

m

i-1

h

m

i=l x

h

1

1

P

1

P

1 1

1

Table of functions that can be represented by e-components
with the I -property (see Fig. 2.7) and constant input oper-
ators if D =« {0, 1}

Table 2.1

50

CHAPTER THREE

APPLICATIONS OF THE GENERALIZED SPECKER THEOREM

The principal results obtained previously [H068, Ho70] by the use of
Specker's Theorem are

3.0.1

n
A new proof that the Boolean function .§- x. is of nonlinear length over

t
II . This is accomplished as follows. First note that the restriction of the

mod 2 sum of n variables obtained by setting certain variables to is again

a mod 2 sum (but of a smaller number of variables). Now apply Specker's

n
Theorem (see 1.5). Suppose .6. x. is of linear length over II. Choose n

large enough to obtain m = 3. The theorem states that for this particular

bases c = in (1.5.1). However, it can be checked that in this case no

choice of c n and c. will yield the mod 2 sum of three variables. A contra-
diction.

3.0.2

n

The function f: {0, l} n -» {0, 1] defined by f = 1 if and only if E x =

i=l
mod 3 is of nonlinear length over E. We proceed similarly as before. Assume

it is of linear length. Apply Specker's Theorem with n sufficiently large to

+
Of course the results of Subbotovskaya and Khrapchenko are stronger for this

particular example.

51

obtain m = 3. If we replace x.. , x„, x„ in (1.5.1) once by 1, 0, 0, and
another time by 1, 1, 1, then the value of (1.5.1) remains unchanged. However,
the value of f (with all variables except x.. , x„, x replaced by the
constant 0) is different on these two assignments. Again a contradiction.

Both of these results were derived by Hodes and Specker in [Ho68].

We might note that the technique of 3.0.2 can easily be generalized to
counting mod k where k is an arbitrary integer (see (1,5.2)).

3.0.3

Certain geometric predicates (see [Mi69]), in particular the connectivity
predicate, are of nonlinear length if expressed with binary Boolean operators
(this result was obtained by Hodes in [H070]). We will not discuss this in
greater detail now since this technique will be treated later in 3.2.

In this chapter we will use Theorem 2.2.2 to generalize all these results.

3.1 Counting mod p

Consider the function (0, 1} -» (0, 1}

n
1 if Z x. = mod p

i=l X

f n (x 15 ...,x n ) -

otherwise
then

3.1.1 Theorem

If p is a prime, if |d| < p, then f™ is not of linear length over an
arbitrary basis.

52

Proof

Suppose the statement of the theorem is not true. That is, there exists
a prime p, a finite set D such that |d| < p, a basis § of operators on D, and
L(f , \$) £ en for some constant c.

First note that if X is a subset of the arguments of f P and jx| = m, ther

(f n ) Q = f P . We can now apply Theorem 2.2.2. For an arbitrary m, if n is
sufficiently large, there exists a subset X of the arguments of f P with |x| = m
and such that (f )_ is represented by a homogeneous e -complex F (over \$ and
with the I -property) with X as the set of lateral variables. In addition,

since f is a Boolean function, the lateral variables of F are restricted to

n '

(0, 1].

Consider now any component F. of F and T(cp ) where cp is the internal
operator of F.. Since cp has the I -property, T(cp ) has the general appearance
of Fig. 3.1. F. is determined by cp. and the constant input operator a.. Now
let m ^ d and consider the sequence (s . ) for ^ j < m where s~ = a. and

s - cp. (11. . . 11, a. ) for j > 0. Let s, ,.,. be the first element in the sequence

j times
that is repeated at some later point; in fact, let k(i) be the position of the

first occurrence of this element. Let k(i)+j£(i) be the position of the second

occurrence of this same element. Then we shall call k(i) the prefix of F.

while 4(i) will be called the period of F . .

Clearly, if F. is a standard (reverse) component, then cp. (x.x 2 ...x .

11... 1, a ) (cp (x x _ . ..x 11... 1, a )) where k ^ k(i) is a function of the

numb

er of l's among x,, x„,...,x , (x, -,..., x ) mod 4(i).

53
Thus,

If we set Y = {x, ..,..., x , } and choose k-ClO to

exceed or equal the prefixes of all the reverse

Y
(standard) components of F, then F. represents a

function of the number of l's among the variables of

Y mod jgcm(A(l),...,4(r)). (3.1.1)

y
On the other hand, by the initial assumption, F. is a function of the number

of l's among the variables of Y mod p; this results in a contradiction since

d < p.

On the basis of (3.1.1) we can obtain the following

3.1.2 Theorem

Let D be an arbitrary domain, 5 is a certain basis, and p is an arbitrary
integer > 1. If 5 is such that any e-component over \$ with the I -property
and constant input operator has period one, then f p is of nonlinear length
over \$.

3.1.3 Example

This is an example of a basis satisfying the hypothesis of Theorem 3.1.2.

Consider an arbitrary domain D = (0,1 d-1] . Then a complete basis

for D is ♦ - (min(x,y), max(x,y) ,0,1, . . . ,d-l, e Q (x),. . . , e (J _ 1 (x)} where min
and max are defined in the usual way, 0,...,d-l are the constants, and

f d-1 if x « i

e,(x) =

jtherwise

S I W = 1

I ot

54

(Note that i|r, _ -. = { A,V,0,l,-,id] ; thus, a result on the nonlinearity of the

length of a certain function over i|r, _ . , is also a result on the nonlinearity

of the same function over II; in particular, applying Theorem 3.1.2 to ijr, - .. , ,

we obtain (2) of Specker's Theorem.)

i|r is interesting because it gives rise to an analog of the disjunctive

normal form for arbitrary D: Consider the table for an arbitrary function

f : D -> D. Then

d n -l
f = max (M. )
i=Q ^

where M equals if the current assignment is not the i assignment and the
value of the function at the i assignment otherwise. M. is represented as
follows

M i -"^.(I.nfr]? e a(i,n) (x n ) ' f i )

where a(i,j) is the j component of the i * assignment.

We claim that i|f satisfies the hypothesis of Theorem 3.1.2. Note that

a b
i|; = i|f for all a, b £ D because iL contains all the constants. Therefore,

we will write simply i|r

*

Given cp € jr with the I -property, the statement that there exists b £ D

such that the homogeneous e-component with internal operator f and b for its

input operator has period & is equivalent to saying that there exists a subset

LED with |l] = & and cp(0,z) L is the identity (id ) while cp(l,z) L is the

permutation with cycle length & (p^)«

*
We contend that for any L £ D, cp(x,z) € ± and c, e € D, if cp(c,z) L

55

and cp(e,z) L are 1-1, then cp(c,z) L = cp(e,z) L. Since id 4 pj. if I > 1,
this will establish the original claim.

This can be proved by induction on the depth 6^ of the distinguished
variable z in the formula F that represents cp (since there may be many such
formulas, let F be one of the formulas where the depth of z is minimal.

If 6 =1 then either F = max(F',z), or F = min(F",z). Assume the first

*
case (the second can be argued similarly). By definition of jf , F contains

only the variable x. If we replace x by c , F' represents a constant c' f D.

Now if c' ^ L, then cp(c,z) L is the identity, otherwise it is not 1-1.

•k

If 6 > 1, then either cp(x,z) = e. (cp 1 (x,z)) where cp' £ jf_ and 6 , < 6
or cp(x,z) = cp"(x,cp'"(x,z)) where cp", cp"' f^ and 6 6 < 6 (to see this,
think of F). In any case cp', cp", and cp'" satisfy the inductive hypothesis,
and we are done.

Note that Theorem 3.1.1 and 3.1.2 hold with f P replaced by the function

n

g p : D n -♦ {0,1} given by £ x = mod p since g£ f (0,l] n = fj\

i=l

3.2 Connectivity

The connectivity predicate was already discussed in 1.6. It attacted
considerable attention after Minsky and Papert [Mi69] succeeded in obtaining
interesting results on the complexity of perceptrons that represent the
connectivity predicate. Works that follows [Mi69] and that treat specifically
the representation of the connectivity predicate by finite operators are, e.g.,
[H070], [Mi71], and [V170],

56

Minsky and Papert describe a circuit for computing the connectivity

2
predicate of depth (of the order of) (log„n) which on intuitive grounds

seems minimal. This circuit translates into a formula of nonpolynomial length.

Thus, the connectivity predicate seems to be a good benchmark for testing

estimation methods for the complexity of functions (i.e., any appropriately

general method which is presumed able to give estimates for length up to

log 2n
f(n) < n should declare the connectivity predicate complex).

2
Consider a set of n variables {x. .) for 1 Si, j ^n; then the connec-

ij 2

tivity predicate is the function c : {0, 1] -* {0, 1} defined as follows:

(we will not give a formal definition since the formalization is obvious)

Given a specific assignment to the variables, consider it as a square array

of 0's and l's. Then c = 1 on the empty pattern (i.e., consisting of all 0's),

or if the l's form a connected pattern. By "connected" we mean that any two

l's can be linked by a sequence of adjacent l's (two l's, corresponding to the

variables x . and x, . are adjacent if !i-kt + !j-^| = 1). For example, the

pattern in Fig. 3.2 is connected.

The general approach used here to obtain an estimate for the length of

c is to consider reductions of c .
n n

Given an arbitrary function f , a function g will be called a k- reduction
of f if g is obtained from f by replacing each argument of f by a function with
at most k arguments.

Suppose we want to prove that f of i = 1,2,... arguments is of nonlinear
length (over the basis \$). Assume there exists a k-reduction g. of f. such
that the number of arguments m in g is m > Ct-n for some constant < CC ^ 1.
Assume L(f ,\$) ^ en for some constant c. That is, there exists a formula

57

F for f and L(F ) ^ c«n. Since the length of any function of k arguments
is bounded by L(k,\$), we obtain

L(g n ,\$) ^ c- L (k,5)-n

by making substitutions for variables in F .

If [g.] is rearranged (and renumbered) in the order of increasing number
of arguments, and all but one functions with the same number of arguments
are deleted, then we obtain

L(g,\$) ^ c' -m

for some constant c'. Finally, if we can prove (e.g., by applying Theorem

2.2.2) that g, is nonlinear, we obtain a contradiction, and we are done.

Hodes [Ho70] shows the nonlinearity of the length of c over S by reducing

c to the function
n

m

V (A y ) Ay
1=1 jtt J

(i.e., exactly one variable is 1) which can then be proved to be nonlinear
using Specker's Theorem. Unfortunately, this reduction does not work for
domains with more than two elements because this function is linear over an
appropriate basis in such domains. However, another approach works, and we
can state

3.2.1 Theorem

Regardless of the size of D and the nature of f, c is of nonlinear

2
length (i.e,, L(c ,\$) ^ en is not true for any D, §, and constant c).

58

Proof

Minsky and Papert [Mi69] succeed in reducing c to counting mod 2 by

exhibiting a contact network such that its connectivity depends on the number

mod 2 of contact variables equal to 1, and then by simulating this network

on the square array of variables (they call it the "retina"). We shall proceed

similarly.

c is reduced to the function s P : {0, 1} -» {0, 1} (for an appropriate
t n

t) defined as follows:

1 if > arguments are equal to 1

s n =

otherwise

s , can be represented by the connectivity of a contact network S . S is
n n n

shown in Fig. 3.3a. It has p contact arms for each variable y^ The value

of y. corresponds to the upward position of the corresponding arms while the

1 value of y. corresponds to the downward position. The contact arm of S^

are arranged in p rows (n arms in each row). Whenever an arm for y. is in

the upward position, it is connected to an arm for y..-, i n the same row; if

the arm for y. is in the downward position, it is connected to an arm for

y. in the next row. Thus, it may be easily checked, point A^ is connected

to B , if and only if at least p among y, .•••,y are !• It: ma y also be
p+1 1 n

verified that in this case all the contact arms in the network are connected

either to A„ or to B ,, and since these two are connected together, the whole
U P+1

network is connected.

S p in turn can be simulated by a rectangular array R P of 0*s and l's
n n

where certain positions are constant and others depend on the y^s (see
Fig. 3.3b). The size of / is (3 (p+l)+l) . (3n+p-l).

59

We now show that s P is a 1-reduction of c. for some t. This is done
n t

by cutting R P into smaller rectangular pieces along vertical lines. The
first piece is of length (£-l)-q where q = 3(p+l)+l and I will be defined
later, the second through SL-lst piece is of length (£-2).q, while the I ,
last, piece is of length between 1 and (4-1) «q. These pieces are then
arranged into an I- q * &' q square pattern T P as shown in Fig. 3.4 (the
arrangement depends on the parity of I). Corresponding rows of adjacent
pieces are connected by =>- or c- shaped patterns of O's (in the case when
one of the positions along the cut at the row in question is 0) or l's.

The unused positions of T P (corresponding to the case when the last piece

r n

is not of the maximal length) are replaced by O's.

t is set to l-q and the variable x. . in c t is replaced by the correspon-
ding position in T P (one among 0, 1, y^ or y). Obviously, the function
obtained by this replacement is s P . If 3n+ P^ :> 1, I is obtained as r x"l where
x is the positive solution of

2 , ,^ 3n+p-l
x -2(x-l) = ~

We also have that n «(l/3q)t, and, thus (by the reasoning outline previously),

if c is linear so is s.
t n

With the assumption that s P is linear, we apply Theorem 2.2.2 with a = 0.

n

In this way we obtain that (s P )q where Z is a certain subset of the arguments

of s P of size m is represented by an e -complex F (with the requisite restric-
n m

, . p N Z p
tions) where m is arbitrary. Note that (s^; Q - s^.

60

As noted in 3.1, if a sufficiently large number of variables at the
beginning and end of the lateral sequence of F is replaced by 1, then F with
this substitution represents a function of the number of l's among the
remaining variables mod the 1cm of a set of integers £ d. The number of
variables that have to be set of 1 is u i 2(d-l) (at most d-1 at each end
of the lateral sequence). Thus we obtain a representation of the function
S m-u if P ^ U * If P ^ 2(d-l)+2, we obtain a function s. for i ^ 2. However,
it is clear that s. is not a function of the number of l's mod k for any integer

k. Thus, we have arrived at a contradiction, and, hence, c is of nonlinear

' ' n

length over any basis ?. D

t
3.3 The Length of Symmetric Functions

As we have seen in the previous examples, Theorem 2.2.2 has been applied
only to functions that are either symmetric or that can be reduced to symmetric
functions. While we know of no formal statement that can be proved and that
asserts that this indeed exhausts the applicability of Theorem 2.2.2, it
intuitively seems probable.

In this section we will discuss several bounds on the length of symmetric
functions (both specific functions and all symmetric functions). Recall that
in 1.4 we have already mentioned several such bounds (Subbotovskaya, Khrapchenko).

All of the results in this section were suggested by A. R. Meyer.

61

Does Theorem 2.2.2 (or Specker 's Theorem) give us any information on the
length of the functions investigated? Hodes and Specker do not treat this
subject, and, in fact, the bound that can be obtained is very weak; however,
we do mention it for the sake of completeness.

In an application of Theorem 2.2.2 (or Specker 's Theorem) to a certain
function f, we proceed with the assumption that L(f,\$) ^ c*n. To apply
Theorem 2.2.2 we must have n ^ Tl (m,c) where m is a sufficiently large number
to obtain a contradiction. However, m does not depend on c. Thus, n depends
only on c and m is assumed constant.

Consider now c as a function of n. We ask what is the maximal value
c for c(n) for which Theorem 2.2.2 can be applied (and a statement contradic-
ting L(f,\$) ^ en obtained). c(n)«n is then a lower bound for L(f,\$). Due
to (2.2.1) c grows slower than (l/4)«ht (p,n) where ht (p,n) = maximal x
such that n ^ iexp (p,x). Then we have

c(n)-n £ (l/4)«ht (p,n)-n (3.3.1)

for an arbitrary constant b > 1 and for sufficiently large n.

This bound seems unrealistically low, and it is useful to compare it

3
with known bounds for length for the particular function f over some bases

consisting of Boolean operators (we will suppress the subscript n).

3
It has already been established that f is of nonlinear length if D =

3 3 3 1
(0, 1] (see 3.1). We introduce the following notation: f = f ' , f ' ,

3 2 n

and f ' stand for S x = 0, 1, and 2 mod 3 respectively. We will repre-

i=l
sent f 3 '°, f 3,1 , f 3 ' 2 by the formulas F°, F 1 , and F respectively. F is

obtained by the following recursive relation

62

F°(x) = F°(Y) A F°(Z) V F^y) A F 2 (Z) V F 2 (Y) A F X (Z) (3.3.2)

(If X is the singleton {x}, then F (X) = x)

L(F°(X)) = L(F°(Y))+L(F°(Z))+L(F 1 (Y))+L(F 2 (Z))+L(F 2 (Y))+L(F 1 (Z))

1 2

Similar identities can be obtained for F (X) and F (X). When these identities

are used recursively, we obtain

i lo §? 6 9 fi
0(L(r,\$)) £ n «ri (3.3.3)

an exact description how we obtain a bound of the form (3.3.3) from a recur-
sive relation similar to (3.3.2), see [Ya54].

This upper bound can be further reduced by using multiargument operators.

1 R 3

Let y: (0,1,2] -+{0,1,2} be the operator £ y. mod 3. Then f can be

i=l X
represented by a formula G which uses y recursively (i.e., the arguments of

f 3 are repeatedly divided by I together with an outermost decoding operator
{0,1,2} -+ {0,1}. G is of linear length. If we use D = {0,1}, y can be en-
coded by two operators v ' and y"> and G translates into a formula such that

log n log,k
0(L(G)) = (2k) k = n (3.3.4)

3

Thus, as & increases, the upper bound for L(f ,\$) (where y \$) approaches

c*n. However, the gap between this bound and (3.3.1) is still huge. But,
the important thing to note is that any theorem that retains the same broad

assumption (bases with an arbitrary number of operators) as Theorem 2.2.2

3
cannot yield a better bound for f than (3.3.4).

63

Another example of a function that is nonlinear in length (over all

4
Boolean binary operators) by Specker's Theorem is f . However, it too has

a relatively short representation (the previous and this example show that

Theorem 2.2.2 is a sensitive tool for deriving the nonlinearity of functions;

i.e., it can be used on functions that are only "slightly" nonlinear).

A representation for f with Boolean operators is obtained by dividing

4
the arguments of f into disjoint (nonempty) pieces Y and Z, and adding the

4 4
bineary representations of f (Y) and f (Z). Let the binary representations of

f (X) be given by the formulas F ' (X) and F"(X), obtained by the following

recursive relations

F"(X) = F"(Y) a F"(Z)

(If X is a singleton F"(x) = x and F'(x) = 0)
Consequently,

L(F"(X)) = L(F H (Y))+L(F"(Z))

L(F'(X)) = L(F'(Y))+L(F'(Z)+L(F"(Y))+L(F'(Z))

By choosing Y and Z always as equal as possible, we obtain

L(F"(X)) = n
0(L(F"(X))) = n-log 2 n

Since f (X) is represented by F ' (X) A F"(X) n'log 2 n is also a bound for

0(L(f ?) where ©, A 3.
n

We now turn our attention to an upper bound for the length of all
symmetric functions.

64

Note that a symmetric function g: D n -» D where D = -0,1, . . . ,d-l] depends
exclusively on N.,,...,N d , where N. is the number of variables equal to i.
It can be represented, e.g., as

g = max(M )

n (l)"'» n (d-l)

where M ,^ nCd-l") ec l uals t if N = n(i) and is otherwise. The number
of combinations of n(l),. . . ,n(d-l) is a polynomial inn - ( ~ ) — and the
max function can also be represented in polynomial length, regardless of the
basis i. The latter fact is established by representing max using the two-
argument max recursively. Thus, if M . . (a-\\ were polynomial, g would
also be.

M. J. Fischer and A. R. Meyer discovered that M ,. v ,, n «. can,

n(i; ,. . . ,n(d-l;

indeed, be represented in polynomial length by using a special code for
integers described by Avizienis [Av69],

We will illustrate the construction on Boolean symmetric functions. It
will be seen that if the basis of operators is appropriately chosen, the
length of an arbitrary symmetric function is bounded above by a polynomial
of a surprisingly low degree.

The Avizienis code is a redundant positional representation of integers
to an arbitrary base b > 2. We describe it for b = 3.

An integer n is represented by all possible Tlog_nl - tuples

a riog 3 nT'"' a l
where a ± € {-2,-1,0,1,2} for 1 ^ i <: flog^n! and

65

Tlog nl

h a. J

i = n

i=l

The property that is exploited is that there are no long carry's in
addition. Thus, if we want to add two Avizienis coded integers a = a.a, .....
a. and b = b,b , ...b. , we can do it in two steps using the following

3.3.1 Algorithm (Avizienis)

(1) Find the carry c and intermediate sum r such that

a.+b. = 3c, +r.
i i i i

where a ± , b € {-2,-1,0,1,2} and c^ r ± € {-1,0,1}.

(2) Compute the sum s according to

S i = r i +C i-l

Let us estimate the length of the formula representing any ternary place
in the Avizienis representation of N.. for X = (x^,...,x }.

Again let X = Y U Z and Y H Z = 0. r ± (X) and c^X) can be represented
as

R t (X) = p(R i (Y),R i (Z),C i _ 1 (Y),C i _ 1 (Z))

and

C^X) = X(R i (Y),R.(Z),C i _ 1 (Y),C i _ 1 (Z))

If X is a singleton, r. and c. are if i > 1, or 1 and respectively if

4
i = I. P and X are certain operators £-1,0,1} -» {-1,0,1} which can be

obtained from the definition of Algorithm 3.3.1. Strictly speaking, the

66

domain used here is not permitted in our definition of finite functions;
however, the difference is merely one of coding. Thus,

L(R i (X)) = L(R i (Y)+L(R i (Z)+L(G i _ 1 (Y))+L(C i _ 1 (Z))

and

MC^CX)) = L(R i (Y))+L(R i (Z))+L(C i _ 1 (Y))+L(C i _ 1 (Z))

If we use these relations recursively and always make Y and Z as equal as
possible, we obtain

0(L(R i )), 0(L(C.,)) ^ n 2

for 1 £ i ^ Tiog nl. If D = {0,1}, we need two bits to encode r and c^.
Therefore, using certain operators p', p", X' , and X" to encode p and X,
we can encode R. (X) and C. (X) and combine them into a { 0,1] -formula A.^
representing the i ternary place of the Avizienis representation of N^.
We have

0(L(A i )) ^ n 3 (3.3.5)

Let there be given a positive Avizienis coded number a = a a _ ...a .

We desire to convert it into its binary equivalent b = b b _...b 1 . Let

q q-1 1

U £ (a ,...,a.}. Then we define b (U) = 1 if and only if the i bit of
pi i

E a.3 1 is 1. Note that even if a is positive, b.(U) may be negative

a.€u x
1

for some i and U. This is further discussed below. For the moment we assume

that b. (U) is always positive. We can then again compute b.(a . ..a^) by a

recursive method. Let U = V U W, V w = 0. Then b^U) is the i bit of

the sum of b (V)...b.(V) and b (W)...b 1 (W).
q 1 q 1

67

b.(U) is represented by the formula B.

B i (U) = P(B.(V),B i (W),G i _ 1 (U))

, _ , . th ,
where G. represents the carry from the 1 place;

G.(Y) = Y(B 1 (V),B 1 (W),G 1 _ 1 (U))

G n represents the constant and P and y are certain Boolean operators. Then
we obtain

L(B.(U))« E L(B.(V))+L(B (W))
1 j=l J J

If a is the Avizienis representation of a number ^ n then p = llog^n I and

q = Tlog^nl . Thus, we obtain the following bound for L(B (a))

log log.n
0(L(B 1 (a») < (2q) »

log log n
(21og 2 n)

(l+log 2 log 2 n) log 2 log 3 n
log 2 n

« n

(3.3.6)

Note that (3.3.6) means that 0(L(B.(a))) ^ n £(n) for 1^ i ^ Tlog^l where
e -+ as n -» °°.

It has already been remarked that b(U) need not be positive. Thus
b(U) must be treated as a signed number. If we use the l's complement
representation and the en-around carry technique (see, e.g., [Gr59]), addition
can be performed as follows. Let b(U,g Q )=G(V)+b (W)+g Q where g Q is either

68

or 1 and let g 1 denote the carry from the highest position of b(U,0). Then
b(U) = b(U,0) if g x = and b(U,l) if ^ = 1. This means that in (3.3.4)
we obtain (i.q) P for some constant SL instead of (2q) exp where exp has the
value given above.

If a^ for 1 <: j £ riog 3 nl . in B i for 1 ^ i ^ Tlog^l is replaced with
A , we obtain formulas F^X) representing N in binary form. Combining
(3.3.5) and (3.3.6) we obtain

0(L(F i (X), ))) £n 3+6(N)

where e(n) -» as n -* °°.

To obtain the desired representation of an arbitrary Boolean symmetric
function, we proceed as follows: Consider the formula S. defined inductively

S l = X 10 V x ll

S i+1 = X i0 A S i-1 V x il A S i-1 (3.3.7)

Take Sp. -| and replace x. Q and x. - by F. and F. respectively. It is easily
seen that S p. -| with this replacement is identically 1 (this can be proved,
e.g., by induction; for S.. it is trivially true, and the general statement
follows from (3.3.7)).

Let there be given an arbitrary symmetric function g. It is defined by
a subset M C {0,l,...,n] of possible values of N- . Each branch of length
Tlog^nl in T(S|- -i ) corresponds to one value of N- (given by the binary
number j (Tlog^l) ,. . . , j (1) where K^nl , j ( Tlog^l) '* * ' > X 1, j (1) define the
branch in question). If we remove branches of T(Sr. -i) corresponding to
M, thus obtaining the formula S', and perform the substitution defined pre-
viously to obtain the formula p, we obtain a representation for g.

69

We have 0(L(S')) £ n and thus 0(L(P)) <. n +e(n) where e -» as n -» °° .

Thus, if a basis \$ is given that contains all operators used to obtain P,

T / »\ ^ 4+e(n).
then L(g,\$) £ n

In Lu70 Lupanov announced a result of Khrapchenko to the effect that an

4 93
arbitrary symmetric function is of length < n * # Since the assumption were

not made explicit, and the result itself is unavailable a3 of this writing,

no exact comparison can be made with the estimate above.

70

> (x x ..
i m m-1

T(tp i ) where cp has the I -property (only arrows labeled with
and 1 are drawn). T' denotes the set of nodes
x„0,a.) where m is as described in the text and x ,
may assume arbitrary values in {0, 1}.

Fig. 3.1

x„

00000000
00001111
00001101
00001001
01101011
01111011
00000000
00000000

A connected pattern of l's
Fig. 3.2

71

Contact network S for s P

n n

Fig. 3.3a

72

O rH

pa pa

w

CO

pa

CO

i

a,
pa

CM
(X

pa

pq

pa

+

pa

r-<i-400i-IOOt-lo\ /r-'OOr-IOOr-fOO'-'OO'-'O

, p f 5 I \

1-1 I S*> r-l >i«-< OOr-to, i-IOO'-IOO'-<00'- ( OOr-IO

', '

P P • \

r-i i-l i-l o IfM-H >,i-) o |i-'OOi-<00'-<00'- | OOr-40
rH _ r-l C

I r-l
, P r-* P O O r-t ,-1
|>, >« r-i i-l

I r-l lr-1

O, fi

C>P C IH r-l O

I

<N

°* fl 7

i-<OOr-(OOr-<OOi-'0

~-L-0 O <-* O O i-l o

P Oj/i-li-li-l|>-,r-l>,i-IO O-i

r-l pH f5 O O r-l i-l
[>*, >> CM CM

IrH I i-l
r-l r-l r-l O P P O

H H

r-IOr-lrHr-IOir^i- 1 >~ir-IOOr-IO

I rH lr-1 P P

.P P OOi-lr-IO|rM-l>^r-IO

p-, >, i-H i-H

CO CO ! / -d" <f

HHHO|>iH >,i-IO]{ HHQ^H >,r-IOOi-lr-IOi-"'0

, , , co / i •* <t

HrlOOHHO^rll) i-<OOr-<r-IO|>»i-< >•> r-1 O O r"l O

')\

cm cm ' j co co vt st

■ _, I5>-.*-' >-.i-IOOr-lr-IM (>>r-l f>»i-IOOi-lr-lO|>^t-l >•, i-l O

, cm cm / co co

r-l l-l r-l O |>M-I >,HO MHO^^ >,r-IOOr-lr-IOr-IO

I r-l O O r-l r-l O

CO

co

, <N / .»■!*' I

|r*,rH/!i-JOOr-li-lo|(»i-l >*rH O O r-l O

r-l r-l ! i CM CM CO CO

H I^H >m-4 O O r-l r-l / ' / >, r-l >, r-l O O r-lr-t O |>% r-l >■, i-l O

r-l r-l \ I CM CM

i-l r-l r-l O |>v-< >,W O / j r-l r-l O fc>, r-1 >^ r-l O O r-l i-l O r-1 O

—Iff CM CM

r-IOOr-lr-IO|>~.i-l >> r-1 O O r-1 O

i-l r-l CM CM

l>M-l >M-IOOr-li-IO|r^rH \$►, I-l O

r-l r-l

r-1 r-l O |>* r-l J^r-IOOr-lr-IOr-IO

r-l i-H

r-IOO!-lr-IO|>-,r-l rS' _l OOr-IO

r-l I-l

r-IOOi-<OOr-li-IO|>ir-< >>i-< O

i-^OOr-IOOr-IOOi-lr-IOi-IO

1-lOOr-lOOi-lOOi-iOOi-lO

O O r-l O O r-l O
r-lr-'OOr-'OOi-IO

O r-l

< <

CM

CO

CO
I

(X

CM
I

a,

i

a.

+
a,

Planar connectivity simulation of S P

n
Fig. 3.3b

L p+1-

73

i.<

T H for i = 5
n

Fig. 3.4a

J

"hB,

^L

T P for H = 4
n

Fig. 3.4b

74

CHAPTER FOUR

CYCLIC PERCEPTRONS

The perceptron has already been discussed in 1.6. In the beginning
of this chapter, we will first expand on that discussion in order to further
motivate the study of cyclic perceptrons<>

The classical perceptron (for references on the subject see [Mi69]

became the subject of extensive research centered around concepts such as

been created around it -- about its capabilities and its potential for use.

The thing that attracted people most were its ability to learn from experience

and its simplicity -- it combines many small decisions, the values of the

functions cp. , into a final decision by considering their weighted sum.
Minsky and Paper deflated this myth by showing that such a scheme has its

inherent drawbacks. In particular, it cannot compute predicates such as

connectivity.

The most general intuitive basis for the result that the connectivity
preciate cannot be represented by a perceptron is the following: First of all
the reasoning makes sense only if the complexity of the functions cp. is limited
in some way; if not, we can choose to. to be the function that we desire to
represent and then it can be represented by a perceptron trivially. Minsky and
Papert use the order and diameter restrictions (see [Mi69]). The former is also
used by us.

Suppose we want to represent connectivity. Then, if the cp . ' s are bounded
in complexity (so the reasoning goes), the weighted sum is too simple a function

75

to be able to integrate all the information that is required in computing
connectivity.

We set out to apply the same basic reasoning to models where the inte-
grating function is constructed out of finite operators. In particular,
we choose addition in a finite field because of the unique representation
property for functions in such a field (see 4.5) which makes proofs rather
simple, and because of the purely formal resenblance to perceptrons.
One particularly interesting aspect of using addition in a finite field as
the integrating function is that one proof of the inability of perceptrons to
compute connectivity is based on the reduction of connectivity to addition
mod 2. However, this function is precisely the simplest one possible in GF(2),
This underscores the need to make different reductions for different models
of computation that are presumed to be incapable of computing connectivity.

In this chapter we shall limit ourselves to Boolean functions.

We introduce cyclic perceptrons formally:

4.1 Definition

GF(p ) is the finite field consisting of p elements. \$ (the basis)
is an infinite set of Boolean functions (0,1] -» [0,1] such that each cp € \$
(w is the first infinite ordinal) depends on a finite number of arguments.
Elements of \$ are assumed to be ordered (in an arbitrary way). Then a (p,k)-
perceptron (over ?) is a pair P = (a,Y), where a is an w-vector such that the
i th component a. € GF(p ) and a.^0 for only finitely many values of i;
Y c GF(p k ).

76

Given a function f: {0,1] -» {0,1], we will denote the set of arguments
on which it depends by S (f ) .

Let P = (a,Y) be a (p,k)-perceptron. Then P will represent the predicate
(Boolean function)

00

f = [ £ a cp. € Y] (4.1)

i=0 *■ X

where the value of cp. € {0,1} £ GF(p k ). Obviously, S(f) £ U S^)

1 i€{j: j? «0]

We will indicate the function represented by a (p,k)-perceptron P as in

(4.1), or simply [P] .

Let us recall a concept from [Mi69]. Given a (p,k)-perceptron P = (a,Y)

over a certain basis \$, its order (ord(P)) is max (S^) .

i€[j: a.#)]
We can also introduce the order of a function.

4.2 Definition

The (p,k)-order of a Boolean function f over a given basis \$ ((p,k)-ord \$ (f,>
is the smallest I such that there exists a (p,k)-perceptron of order I repre-
senting f. If no such perceptron exists, the (p,k)-order of f is defined to
be °°.

Let 0. be the set of all Boolean functions with finite support. Note then
that for an arbitrary Boolean function f, (p,k)-ord n (f ) is finite and <■ S(f),
for all primes p and arbitrary k. Also note that for an arbitrary basis §

( P ,k)-ord n (f) ^ (p,k)-ord i (f) (4.2)

77

We show now, as is done in [Mi69], that we can choose for the basis a
more restricted set. Let the set of arguments of the basis functions be H =

{x 1 ,x„ ,...]; then we define the set of masks M=( A x.:Sisa finite sub-

i6S x

set of JW }. A convenient way of ordering M is to assign to cp€M the binary
number b b ...b. where b = 1 if and only if x, appears in the conjunction
defining cp.

4.3 Proposition

Any Boolean function f can be represented by a (p,k)-perceptron over M
for any prime p, and arbitrary k.

The proof is the same as that of Theorem 1.5.1 in [Mi69], i.e., we util-
ize the following correspondence between Boolean operations and operations in
GF(p ) if the variables assume only the values and 1:

x. A x_ ~ x • x„, x, V x„ ~ x. + x 2 - x • x 2 , x ~ 1 - x

If f is a function of n arguments, then from its disjunctive normal form, by
using this correspondence and by multiplying out afterwards, we obtain the
following representations for f:

E « 1 x 1 x 2 ... x m = x

i-0

th k

where a. . is the j bit of the binary representation of i and a. € GF(p ).

Note that the mask cp. (see the ordering above) is represented by the monomial

with exponents corresponding to the binary representation of i.

78

Theorem 1.5.3 of [Mi69] also holds in our case. We state it as

4.4 Proposition

The following holds for an arbitrary Boolean function f, an arbitrary
basis §, and an arbitrary integer k and prime p:

(p,k)-ord M (f) £ (p,k)-ordj(f)

Proof

The same as in [Mi69].

Note that if we take fi for the basis in Proposition 4.4, and combine it
with (4.2), we obtain that (p,k)-perceptrons over M achieve minimal order.
We state without proof the following well-known

4.5 Lemma

Every function GF(p ) n -» GF(p ) can be uniquely represented as a polynomial

k k

in n variables over GF(p ) that is at most of degree p -1 in each variable.

(see, e.g. , [La67].)

It has already been noted that we will be interested in whether a function
can be represented by a (p,k)-perceptron with a limitation on its order. For
this we need the following

4.6 Definition

A sequence of Boolean functions f-,f 2> ... of 1,2,... arguments is of

+
finite (p,k)-order (over a given basis ?) if there exists a finite r such

Bounded would be a better word, but we conform to the terminology of [Mi69]

79

that for all i (p,k)-ord- (f . ) £ r.

Let there be given a (p,k)-perceptron (a,Y). If Y = (y,,...,y ], then,
recalling that GF(p ) is a vector space of dimension k over GF(p), and desig-
nating the j component of a. by a., (similarly for y 6 Y) , we have

x. x j n

eo m k CO

[ E a m € Y] » V A [ E a., cp =y ] (4.3)
i=0 L x h=l j-1 i=0 1J x tlJ

We can restrict the diversity of perceptrons we are dealing with by noting

4.7 Proposition

Let \$ be a basis closed under conjunction (i.e., cp, ijr € § => cp A \|r € \$).
If a Boolean function f is of finite (p,k)-order over f, then it is of finite
(p,l)-order (but the order may change).

Proof

We have f = [(a,Y)] where (a,Y) is a (p,k)-perceptron. Suppose !y( = m
and the (p,k)-order of f is I. Prom (4.3) we have

m k °°
f = V A [ E a. ..«p. = y ] (4.4)

h-1 J-1 1-0 ij x hj

where a. ., v.. € GF(p).

By Lemma 4.5, we know that for all a € GF(p) there always exists a poly-
nomial P (x) over GF(p) of degree p-1 which takes on the value of 1 if x = a
and is otherwise (the degree follows from the number of zeros of the poly-
nomial). Thus substituting the Boolean operations with the field operations
introduced in the proof of Proposition 4.3, we obtain from (4,4)

80

k °° k

f = Q( n P ( E a -cp ),..., n P ( E a -cp )) (4.5)
j-1 y ij 1-0 1J j=l y mj 1=0 1J

where Q(x-,...,x ) is the polynomial (of degree m) that represents the Boolean

m f
function V x, . Each P is of degree p-1. Hence f can be expressed as

h=l y ij

a polynomial in the cp. 's of degree ^ nr (p-1). Obviously, cp^ for j > 1 can be

replaced by cp since it assumes only the values and 1. Also, cp* i|r represents

the function cpAi|r and |s(cp A i|r) | ^ |s(cp)| + !s(i(r)|); thus, if the basis is closed

under conjunction (as, e.g., Q or M) , (4.5) describes a (p,l)-perceptron for f

of order ^m«(p-l)'^. □

Remark

Incidentally, this proof also shows that we can assume the cardinality
of Y to be 1.

Since we shall subsequently be concerned only in whether the order of cer-
tain functions is finite or not, we will be able to limit ourselves to (p,l)~
perceptrons. For convenience, we will write simply "p-perceptrons". Also, we
will be only concerned in whether there exists a basis over which a function is
of finite order. This is equivalent to whether a function is of finite order
over M.

t

Q(x 1 ,...,x m ) is obtained by using y ± V y 2 ~ y x + y 2 - y L • y 2 recursively;
i.e., Q(x 1 ,...,x m ) = Q(x 1 ,...,x m _ 1 )+x m +x m .Q(x 1 ,...,X m _ 1 ). If Q is a poly-
nomial over GF(2), then Q = © II y where the sum ranges over all nonempty

y€s
subsets S s (x.,,...,x }.

81

We first turn our attention to the case when p = 2. Instead of '^-percep-
tron 1 ', we will say "Boolean perceptron".

From Lemma 4.5 we conclude that every Boolean function can be uniquely
represented as a polynomial over GF(2) that is at most of degree one in each
variable (a Boolean polynomial).

Noting that the terms of a Boolean polynomial represent marks, we conclude
that every Boolean function f has a unique representation as a Boolean perceptron
over M. Furthermore, by Proposition 4.4, this representation is a minimal order
representation for f. Note then that 2-ord^(f) corresponds to the degree of
the Boolean polynomial for f.

This unique representation property allows us to establish the minimal

order of certain interesting predicates very easily. As in 3.2, we are again

2
interested only in functions {0,1} -♦ {0,1} that are interpreted as functions

of nxn patterns of 0's and l's. In particular, we are interested in the Boolean

2
function of n variables c (introduced in 3.2) and e , (the Euler number of

a pattern of l's on a square array of 0's and l's is equal to k). It is well

known (see, for example [Mi69]) that the Euler number of a planar figure is

the difference between the number of its components and the number of its holes.

If we use the notion of connectivity introduced in 3.2, then the Euler number

of the pattern in Fig. 4.1 is 1.

4.8 Theorem

The connectivity predicate is not of finite 2-order over M (hence, over
any basis).

82

Proof

We use the One-in-a-box construction introduced in [Mi69]= Before pro-
ceeding, however, we must define certain auxiliary predicates, n, the size
of the pattern is assume odd (henceforth, we will suppress the subscript n
in the notation for functions). The variables representing positions in the
square array will, as usual, be denoted by x . for 1 £ i, j £ n. Then we define

r = (x n A x l2 A . . . A x ln ) A

(x„ 7 A x„ 2 A... A x„ ) A... A

(x . A x „ A . . . Ax )
nl n2 nn

and

s - (x 21 Vx 22 V ... Vx 2n ) A

(x 41 Vx 42 V... Vx 4n ) A... A

<Vl,l V Vl,2 V '" V Vl,n>5

i.e., r is 1 only on patterns with odd rows consisting exclusively of l's,
and s is 1 only on patterns where each even row has at least one 1 (the One-
in-a-box predicate). Then,

r A c = r A s (4*6)

(c is the connectivity predicate).

Now, for arbitrary functions f, g, h,if h = f A g, then 2-ord M (h) ^ 2 "° rd M
(f)+2-ord M (g); i.e.,

2-ord M (g) £ 2-ord M (h)-2-ord M (f) (4.7)

83

Replacing h by r A s, f by r, and g by c we obtain

2-ord^c) ^ 2-ord^Cr A s)-2-ord M (r) (4.8)

We have 2-ord (r) = ~- *n; 2-ord^(h) = ~- *n (recall the Boolean

m
polynomial for V x. described in the footnote on p. 80 ) 2-ord^(r A s) =

n(n-l) (because ;the Boolean polynomial representations of r and s have no

variables in common). Using this we obtain from (4.8) 2-ordfc) ^ „ ;

i.e., the 2-order over M of the connectivity predicate is not finite. D

We next establish

4.9 Theorem

The predicate "the Euler number of a pattern equals k" is not of finite
2-order.

Proof

We again consider the case when M is the basis. The general case follows
from Proposition 4.4. n is the size of the pattern. We need to consider a
subset T C S = (x..: i+j even] (note that all points of S are disconnected
from each other, in the sense we use this word). |t| = t will be determined
subsequently.

We define the following predicates

P r = 1 if and only if all points of T are 0; i.e.,

pr = n (i e x)
x£t

q = 1 if and only if k points of T are 1; i.e.,

84

q = © II x • II (lffiy)
x€U y€T-U

where the sum ranges over all possible subsets U C x with |u| = k. When the

expression for q is multiplied out each term produces exactly one term of the

form II x and thus the above Boolean polynomial is of degree t if and only

x€T
if the number of terms in the sum is odd. The number of terms is (5). But

2 1 -! ' t i k

( ) is odd for all ^ k ^ 2 -1 and all I. Thus if t = 2-1, £ k £ t

k _

then 2-ord(q) = t. Also, 2-ord(pT) = |t| = n -t.

Recalling once again the e, is the difference between the number of compo-
nents of a figure and the number of holes, we have the relationship

pr A e k = pr A q

Again using (4.7) with g = e,, h = pr A q, f = pr we obtain

2-ord^e^ * n 2 - |t| = 2^-1

No matter how large we choose I, we can find an n such that we can obtain
a set T with |T| = 2 -1. Thus, the Euler predicate is not of finite order
over M.

Theorems 4.8 and 4.9 can be extended to p-perceptrons for arbitrary
p. The generalization will only be indicated for Theorem 4.8.

Proof: First show that ( ) is even for all I and all k?*0, 2 . This is done

by induction. Now observe that due to (, ) = ( ) + (, .), and the fact that

I 9 k

2-1 2-1 2

( , ) is odd, ( „ ) is also odd (for otherwise („) would not be even). We

can continue this way and establish the claim.

85

The obvious difficulty is that Boolean functions do not have a unique
representation as polynomials over GF(p) for p > 2. Specifically, in the
case of Boolean perceptrons over M, we were able to reduce the problem of
the order of connectivity to the order of r and s (see above). The orders
of these predicates (equal to the degrees of the corresponding Boolean poly-
nomials) were easily computed due to Lemma 4.5.

Suppose c is of finite p-order over M (we have already remarked that this
brings no loss of generality) for some p > 2. Due to Proposition 4.7 we can
assume that we have an expression of the form of (4.5) for c . When multiplied
out, we obtain

CO

c = E a.m. mod p (4.9)

n i=0 l i

where m. is the monomial representing the i mask. m. is of degree one in

each variable, and the values of the variables are restricted to (0,1). Since

the perceptron from which we obtained (4.9) is finite order, we can assume that

the degree of (4.9) is ^ i.

We can now extend c to the domain GF)p) (i.e., to square patterns A =

n

fb.J, 1 s i, j s n, and b . , € GF(p)) by defining c'(A) = 1 if and only if

c (f(A)) = 1 where f(b.J) = {c.J such that c - 1 if b - 1, otherwise
n ij lj ij l J

c . =0. We have from (4.9)

00

c 1 = T. a^.m! mod p (4.10)

n i-0 i l

where m* is obtained from m. by replacing each variable x. by P- (x ) such that
i 1 J ■"■ J

p ( X J = i s otherwise P.,(x.) = (see the proof of Proposition 4.7 how to ob-
tain P 1 (x.)). Now we have a total function c n over GF(p), and thus its

86

polynomial representation obtained by multiplying out (4.10) must be unique.
Since the degree of (4.9) is £ A,

the degree of the polynomial (4.10) £ (p-l)i (4.11)

Another estimate of the degree of the polynomial for c 1 is obtained using
the predicates r' and s' obtained from r and s of Theorem 4.8 similarly as
c' was obtained from c . The polynomial P , representing r 1 , is obtained from
the polynomial representing r by substituting each variable x with P.(x).
Similarly, for the polynomial representation P , of s'. The degrees of P ,
and P , are then found to be ^ ^^ n« (p-1) and ^ —^ n(p-l) respectively
(2-ord M (r) and 2-orcL/s) multiplied by deg(P.,)). Since the analogs of (4.6)
aid (4.8) again hold, we obtain that deg(P ,) is not bounded, contradicting
(4.11), and thus also the finite order of c.

Note that the obvious generalization, i.e., if a function is of finite

2-order, then it is also of finite p-order (over M) , is not true: Consider

n
the Boolean function © x . . We will investigate the degree of the polynomial

i-1
representation of ffi 1 (obtained in the same way from © as c 1 was from c above).

If a p-perceptron over M of finite order exists for 0, then we obtain a poly-
nomial representation of bounded degree for <£' similarly as (4.10) was
obtained from (4.9). On the other hand we have the following representation

for ffi

E n x n (1-x ) mod p (4.12)

i€s i?S 3

where S ranges over all subsets of {l,...,n] of even size. When multiplied

n
out, each term produces exactly one monomial of the form II x. . The number

i=l

ii^ui^^mua^uwj ii Mu^y ' uiM i pi^

89

of such moaamUU appearing If, ^ djseafrinadj fern of (4.12) is

0IIOOQO0

.4. oiiooooo

L £ J n. MO I I I

1-0 3 X I I

000CL0000
(use ths identity (J) - ( n ^)0 4 ^f % fetes this nanber is not

«od p, (4.12) T±^4£Z^*^&K£M **•* » ** * If *i *■
replaced by P.(x.) n obteia s (yi ^saeA^prtlynnerf ■ 1 representation for

«» of degree a* (p-i), contradicting die orl stones of s finite order

p-perceptron over M for #i

wr

OMIfi 19

01 m *

» g # & JfljE \$ ♦ fc^, »

son el 3edBu« «Mt soajt G Jt 4|^M ♦•f T) - (p t*-**«»&* «& *•*)
?abio artlnii s 3o 90*193 a Jaw «*fo ^a^s^bsrfaod f (J-q) *a »»«g»i> 1© *#

«2 .x II •** io5 b

M »o5 if wiro noWqsosa^-q

89

CHAPTER FIVE

PATTERN COUNTING MACHINES

In this chapter we shall permit ourselves a certain degree of informality.

We are again concerned with the power of machines that combine a large
number of "local" computations through an integrating function. Only this
time we shall not be limited to functions that can be represented as a combin-
ation of finite operators.

This class of machines again operates on square patterns of O's and l's.
The operation is divided into two phases : In Phase I the pattern is scanned
with a square "window" of a certain size. Each time a nonzero pattern appears
in the window, we take note of it (there is a finite number of nonzero patterns
since the window is of finite size). At the end of the scan we have a count
of the various patterns, and we are then allowed to utilize this data in Phase
II which consists of computing the value of a partial recursive function for
this data. The formalization of this model is obvious and we omit it.

What can such a machine do? Clearly the computation of this machine is
divided into a local phase and a global phase, so that it fits into the broad
class of problems considered in [Mi69] and Chapter Four.

Note that the boundedness of the window size is essential. If we insis-
ted only that the window contain a given number of points, but otherwise
allowed it to be of any shape with arbitrary distances between its points,
then Phase II could reconstruct the whole figure as was observed already in
[ML69],

We again inquire whether these machines can recognize the familiar

90

topological predicates connectivity and Euler number.

5. 1 Theorem

Pattern Counting Machines (PCM) cannot recognize the connectivity pred-
icate.
Proof

We need only exhibit two patterns, one connected and the other discon-
nected, with the same pattern spectrum. In this case, no algorithm of Phase
II could establish the difference between them.

Two such patterns are given in Fig. 5.1.

Specifically, these patterns are equivalent under windows of size 2x2.
However, it is easily seen that increasing the dimensions of the patterns
in Fig. 5.1 linearly by a factor of k makes them equivalent under windows
of size up to k + 1. We can arrive at this conclusion by setting up a 1-1
map between occurrences of the same pattern in the window in the two pat-
terns

5 . 2 Theorem

PCM's can compute the Euler number.
Proof

It is shown in TMi69] how to compute the Euler number from the spectrum
of patterns of the shape

: m n □

91

Before proceeding, we need a notion of continuous deformation . Pattern
B can be obtained from pattern A by continuous deformation if B arises from
A by a sequence of additions or deletions of l's of the following kind: Let
us fix attention on a 3x3 square with the central position in the place of
the 1 being added (deleted). For simplicity assume that the boundary positions
are always 0. Each position of the periphery of the square which has a 1 in
it is either connected or disconnected to another 1 on the periphery (not
necessarily by a path in the square). This set of connections may be described
by a symmetric 8x8 0-1 valued connection matrix, i.e., a . = 1 if and only if
the i and j positions on the periphery have l's and are connected. The
proposed addition (deletion) is permitted only if (1) the connection matrix
remains unchanged as a result of it, and (2) there is a 1 adjacent to the

Any predicate whose value remains unchanged if the pattern A is replaced
by B, obtained from A by continuous deformation, is called a topological
predicate . We assert without proof that connectivity and Euler number are
topological predicates. The reader is warned, however, that there is a pitfall
in proving this fact for the Euler number predicate. The number of holes in
Fig. 5.2 should be one, not two (i.e., 0's are connected diagonally in addition
to their usual connections). This is discussed more fully in [My71], However,
if the holes are sufficiently large (so that all the 0's in them are connected
in the usual way) this difficulty is not encountered.

5.3 Theorem

Any topological predicate recognized by a PCM must be a function of the
Euler number.

92

Proof

We will have established the theorem if we succeed in showing that,
given any PCM p computing a topological predicate, then for two figures
X 1 and X 2 with EULER(X 1 ) = EULER(X 2 ), we also have P(X ) = P(X 2 ).

In [ML69] it is shown that for every figure X there exists an "Euler
canonical" figure C(X) such that EULER(X) = EULER (C(X)); and if for two
figures X^ X 2 , EULERCX^ = EULER (X 2 ) , then C(X ) = CCXj. If the Euler
number of X is n > 0, then C(X) consists of n components without holes. If
the Euler number of X is n £ 0, then C(X) consists of 1 component with-n+1
holes.

We will show that we can deform any figure X into C(X) without changing
the value of P„

The deformations available to us are:

(1) Continuous deformation. If we subject X to this kind of deformation,
then P(X) remains unchanged because it computes a topological predicate.

(2) Deformations that leave the pattern spectrum unaltered. By defini-
tion of PCM's.

As a consequence, we have

(3) Removal of components inside holes. To accomplish this without
changing the value of P(X), we first apply (1) until the window cannot scan
simultaneously an interior component and the wall of the hole in which it
resides. Then it is obvious that the pattern spectrum will remain unchanged
if we remove the component from inside the hole. After this we can apply
(1) in the reverse direction.

93

If we are given two figures X- and X with same pattern spectra, then
we can add equally shaped holes to the figures in such a way that the pattern
spectra remain the same. The holes have only to be placed in such a way that
the window cannot scan any other boundary while scanning the newly introduced
hole. We can then repeat this to add any number of holes.

Specifically, given two figures of the shape of A and B in Fig. 5.1, we
can add holes in this way and still have the same pattern spectra. For example,
C and D in Fig. 5.3 have the same pattern spectra for a sufficiently small
window size. Note that given any two components, one of which has a hole,
we may apply deformation (1) to obtain a figure proportional in dimensions
to C and they apply (2) to obtain D. We call this sequence of deformations
"cancelling a hole and a component".

We deform X into C(X) by cancelling as many holes and components as
possible. We first apply (3) until no component remains within a hole.
Then we may either have a hole and a component not containing this hole, or
not. In the latter case, we are done. In the former, we select a hole and
a component not containing it and cancel them. After this we are left with
one less component and hole. We repeat this until we arrive at C(X).

We can summarize the results on the recognition of topological predicates
contained in [M.69], Chapter Four, and in the present section in the following
table

94

Recognition

of
Predicates

Cyclic
Perceptrons

PCM

Classical
Perceptrons

Connectivity

NO

NO

NO

Euler

NO

YES

YES

Other
Topological
Predicates

?

Functions of

Euler number

only

Functions of

Euler number

only

Thus, all these results support the conjecture expressed in [Mi69] that
no "local-global" computer can recognize connectivity.

It appears, however, that all models are extremely sensitive to altera-
tions. We have already mentioned how PCM's can be converted into universal
machines with the removal of the restriction on the size of the window.
A. R. Meyer noticed that ordinary perceptrons may be modified to recognize

00

any Boolean function with order one. Instead of E a, tp , ^ consider

i-0

00

£ a. cp. £ Y for some subset of integers Y. Now we can choose the coeffi-

i=0

3. 1

cients a. in such a way that the sums of the coefficients in no two subsets
of the set K of all coefficients is the same, ie. ,

V(X,Y C K) [ S a = E a =» X = Y]

a. € X

i

a.€Y

We can define the coefficients inductively, i.e., choose a to be greater

n-1
than S a... This is in the spirit of stratification (see [Mi69]). Now

1*1

no

tice that if \$ is the set of rn^ks of order 1, then a Boolean function cp

ppw

HPH

Lji..yjij,ii.iii .i.i.jiil^jip

^p

0X000

0X000
la ■iaply § fo|l§c\$lfA0of •ubsoto of I.

0X000
of integer*, fi§irfipgt|f to* •mm of

belonging to «p.

a

ooooooo

I I X

\$>**• J Mi Iw*^ ** ** **• ** t

pt» jipif <Jiw1 oaboot*
9

siaosqa cr»33s<i SxS aese arfa jf*Jhr aoragjrf out

X.2 .gil

aero el urtmismq a±i& ax aalorf lo

S.S .g*f

•lit

a

Sxrsmxpoa * btr« aXorf * gsilloajroO

96

1

1

1

1

1

1

1

1

1

A

1

1

1

1

1

1

1

1

1

B

Two figures with the same 2x2 pattern spectra

Fig. 5.1

00000000
00000000

1

1

1

1

1

1

1

1

1

1

1

1

1

1

The number of holes in this pattern is one
Fig. 5.2

C D

Cancelling a hole and a component
Fig. 5.3

97

APPENDIX A

CERTAIN PROPERTIES OF SHORT FORMULAS

The purpose of this appendix is to modify certain results of [H068] in
the light of our different requirements. Our goal is Lemma A. 9 which is used
directly in the proof of Theorem 2.2.2. We prove it by way of a series of
intermediate results, none of which are used elsewhere.

In what follows we would frequently use the phrase n F is a formula in
n variables over \$, and such that no variable occurs more than k times".
This will be abbreviated to "F is a (§,n, k) -formula". If any of the para-
meters is not present, we will replace it by *. For example, "F is a (\$,*,k)-
formula" and M F is a (<S,n,*)- formula" mean "F is a formula over \$, and such
that no variable appears more than k times'* and "F is a formula in n vari-
ables over \$ n respectively.

A. 1 Definition

Let there be given the sequence of formulas G = (G.(X-,z) , . . . ,G ^

(X .,,z).G (X )). If 1 £ i £ p-1, then G. contains the distinguished
p-1 p p i

variable z, occurring only once. X. for 1 S. i ^ p is nonempty and is

either a singleton or = U X.. Let F be an arbitrary formula, and

j<i J
G = G 1 (X 1 ,G 2 (X 2 ,... ,G _ 1 (X _ X G (X ))...)). If F = G, then G is a nesting

sequence of length d. for F. If, in addition, the total number of occur-
rences of any variable (except z) in G is < the corresponding number in F,
then G is a proper nesting sequence for F.

98

A. 2 Remark

Let G and G be as described in Definition A. 1. Furthermore, let all
X. for 1 ^ i ^ p be singletons and distinct. Then G is equivalent to an
e _. -component. Also, suppose X. is arbitrary and G. is a formula over §.
Now replace all variables except possibly one in G. for 1 ^ i ^ p-1 by the

constant a. Let the set of variables that have not been touched be Y.

Y a

Then G is equivalent to an e-component over \$ .

3

Let F be an arbitrary formula over \$, X J S(F), and a f D; then we
would like to obtain a formula over \$ with the following properties:
(1) G = F (2) S(G) = X, and (3) the number of occurrences of any vari-

3

able of X in G is £ the corresponding number in F. G can be obtained

by a straightforward replacement of operators in F such that the variable

symbols that are replaced with a in forming F (and subformulas of F

3

where S(F) consists entirely of such variable symbols) are removed, and
the remaining operators are changed to preserve equivalence with F .

SL

Mor

e precisely, if cp(F- , . . . ,F. ) is a subformula of F, then if S(F.) <£ X

1. iC i-

for all i, cp remains the same; if S(F.) <= x and S(F.) <£ X for j^i, then
cp is replaced with cp(x. , . . . ,x , ,F ,X . , . . . ,x, ) where all variables of
F. have been replaced with a (if there are more such indices i, we proceed
in the obvious way); and if S(F.)cx for all i, then cp is eliminated. This
transformation will be called normalization and G will be denoted by norm(F ),

"■ ■ ■■ " '""■ ' " '■■ ' 3

99

A. 3 Lemma

If F is a (\$,n,*)-formula, then for any p, q ^ 1 and a £ D, if
n ^ \ (P>q)j there exists a subset X £ s(F) such that either

(1) |xj = q and F is equivalent to a PC of the formulas F ,...,F^_

3

where r ^ n and F for 1 < i £ r is a formula over \$ such that each element
max l

of X occurs in at least two among F..,...,F , and the total number of

occurrences of any x f X in F , . . . ,F is ^ the number of occurrences of x
in F; or

(2) |xj is arbitrary and F has a proper nesting sequence G = (G..,...,

3 x

G , , G ) where G. for 1 ^ i < p is a formula over \$ .
p-1 P i

Proof

Assume there is no X c S(F) such that F is as described in (1) of

3

the statement of the lemma.

We will describe a (proper) nesting sequence extraction procedure
(NSE) whose inputs will be a formula H over § and a set of variables
Y. The output of NSE will be two formulas H 1 (Z,z) and H" over \$ such
that Z is either a singleton or £ Y; furthermore, H = H'(Z,H n ) for some

3

U ^ S(H).

G will be obtained by the repeated use of NSE. Initially, the input
of NSE will be F = F- and <t>. In the first application of NSE, the output

will be G, and F (F is an intermediate formula whose significance will

f"Vi

be describe immediately). In general, the i application of NSE will

receive the input F. and M X. and yield as output G.^ and F^. We will

j< i J

100

show

that if n ^ TL(p 5 cj)j we can apply NSE p-1 times and end up with F _„

from which G is obtained as will be described below.
P

Description of NSE . The input to NSE is as describe above. Then
we can distinquish two cases:

Case I . L(H) = 1. In this case we cannot apply NSE, and the output
is undefined.

Case II . L(H) > 1. In this case we can assume that H has no unary

operators; for suppose there exists a subformula J of H such that J = co(ty(

J-,..., J )) where J. for 1 £ i < r is either a variable symbol or another
1 r i

subformula of H. In this case cp« \|/ = p 6 \$ and we can replace J by the
equivalent formula P(J..,...,J ). Similarly, if J = cd(J,, . . . , ^(J. ) > • • • ,J^) >
we can eliminate i|f because cp(x.. , . . . ,i|f (x. ) , . . . ,x, ) = p(x- , . . . ,x^, . . . ,x^)
f \$ a (thus, if a unary operator of H corresponds to an internal node of
T(H), we can eliminate it by either of these two means; on the other hand,
if a unary operator of H corresponds either to the root node or to a node
next to a terminal node of T(H), then we can use only one of the two methods
described). Now choose i' such that S(H .,) is maximal among S(H ) for
1 ^ i £ r. Since support is defined only for formulas, S(H ,) may be
undefined if all arguments of the outermost operator of H are variable
symbols. In this case replace one of them by the identity operator
which is possible since id p \$ a . Consider H/H ^, = K(Z,z). Again two
cases can arise:

101

Case Ila . Y R Z = <t>. Choose any variable of Z, e.g., x, and let

fx z)
H' = norm(K ). z is a distinquished argument (hence x is free).

In this case set V = S(H ,)-Z U[x}. The significance of V will be

• X

seen immediately.

Case lib . Y R Z ^ 0. H' = norm(K )] z is again a distinguished

argument V = S(H . , )-Z U Y.

In both cases H" = norm((H .,) ).

. i a

Analysis of NSE . Let I S (H) ! = m, and let us estimate ls(H")'.

Obviously, |s(H . ,)| ^ . In the case that H results from a chain

max
of applications of NSE to a formula F, and F does not satisfy (1) of the

statement of the lemma, then we claim that in Cases Ila and b less than

q variables are set to a in H . , . Suppose this is not true. Let the

• J-

set of variables that is set to a on this occasion be W. Then W £ Z-{x}

W
(Case Ila), or W £ Z-Y (Case lib). In any case consider F & . This is

equivalent to ^(H , n >) , . . . , (H . .) where i(l) , . . . ,i(s) are the

« X \ X / H • X \ S / cl

indices corresponding to the subformulas H . where all variables have

• J

not been replaced by a (if in H all variables have been replaced by a,

then it is absorbed into 9). But then cp(norm((H ,,J ) ,. . . ,norm((H ^ g J a ))

satisfies (1) of the statement of the lemma. A contradiction.

Thus, IS(H")| s ^ J!L - -q+1
max

Hence, if we define

102

VI, q) = 1

Tl 6 ( P +i,q) - <Vp.q>+q-D- r W

x,e "

n fi (p,q) = n P " 1 -q+ maX - • (q-1)
6 ^ max ^ n -1 M

max

for p,q ^ 1 (and if F does not satisfy (1)), we will be able to apply NSE

p-1 times and obtain F ,. G can then be obtained as follows: If S(F ,)

p-1 p p-1

fl S(G.) = 6 for 1 ^ i ^ p-1, then choose any variable y f S (F 1 ) and
i p-i H

obtain G from (F _-) yi by normalization; otherwise, denoting U S(G.)

i=l
U
by U, obtain F . from (F n ) by normalization. It can be checked that
p-l p-1 a

G. for 1 ^ i £ p satisfy the conditions of (2) of the statement of the lemma.

n

Consider a sequence of (nonempty) sets X for 1 £ i ^ p such that

X. is either a singleton, or is included in tj X . We will call such

L J<i j

a sequence of sets a normal sequence (of length p). Note that the sequence

X..,...,X in Definition A. 1 is a normal seqdence. Then

A. 4 Lemma

Let X..,...,X be a normal sequence of sets with the additional pro-

of I)

A.

sequence. Then if p S(k+l) m , there exists a subset Y C fj X. and an

1=1 L
increasing sequence of indices i(l) ,i (2) , . . . ,i (q) such that (1) q ^ m,

perty that each element of U X. appears in at most k elements of the

i-1 L

:here exists a subset Y C I ! J

1=1 L

103

(2) 1(1) = 1, (3) X ... Y is a singleton for 1 £ j £ q, (4) Xj Y =

if 4 £ i(q) and X j* i(j) for 1 £ j £ q, and (5) if x £ Y, ^ < J 2 < j g

and x £ X. , . N , x f X. , . N , then also x € X. , . N .
UJj) i(J 3 ) Kj 2 )

froof

(this is a direct translation of the proof of Lemma 2 of [Ho68] into

P
our terminology). Let IJ X. =* (x.. ,x„ , . . . } . Without loss of generality

i=l 1 l
assume that x- € X.. . If m = 1, set Y = (x..}, i(l) = 1, and conditions

1-5 are satisfied. For the inductive step two cases are distinguished.

Case I . x 1 occurs in none of the sets X., 2 £ j < (k+1) + 1.

Setting r = (k+1) +1, the sequence X„,...,X is normal and each element

r
occurs in at most k of the X., 2 <. j £ r. If Z C \\ X and the sequence

J j =2 J

j (1),... ,j (q-1) are obtained by the inductive hypothesis, then [xj U Z = Y

and 1(1) = 1, 1(2) = j(l) i( q ) = j (q-1) satisfy conditions 1-5.

TTl— 1

Case II . Assume that x.. occurs in some X., 2 ^ j £ (k+1) +1 and
let h be the smallest such number j. Furthermore, let V be the set of
elements different from x, , and occurring in X 2 ,...,X ,. Delete the
elements of V from X-,X. ,...,X , and delete those among X.X_,...,X
that remain empty. Let the resulting sequence be Y-,...,Y . The length
of the sequence (X. ,X, ,X, , . . . ,X ) is at least p- (k+1) +1.

There are less than (k+1) distinct variables in X„,...,X _p
each one occurring in at most k-1 of the formulas X ■, S X, ,...,X .
Therefore,

104

r * p-(k+l) m " 1 +l-(k-l)(k+l) in " 1 ,i.e. )
r 2= (k+l) 1 "'^!

m— 1

The sequence Y_,...,Y is normal and its length is at least (k+1)

r
x 1 occurs in Y„. Let Z £ I J Y. and the sequence j (1) = 2, j(2),...,

j-2 J
(q-1) be obtained according to the inductive hypothesis for Y^j'.'jY .

Then Z and i (1) = 1, i(2) = j (1) , . . . ,i (q) = j (q-1) (where q * m) , satisfy

conditions 1-5. LJ

Let there be given a (*,*,k)- formula F with the proper nesting

sequence G = (G.,,...,G ) such that G. is a formula over f. As has
M — ^ 1 ' p l

already been remarked above, X , ...,X (see Definition A. 1) is a normal

i p

sequence of sets.

If p 2 (k+l) m , then by Lemma A. 4 there exists a set Y £ N X and

1*1
q indices i(j) for 1 £ j £ q such that conditions 1-5 hold. Note that

if m = k-t, then |y| * t since no variable appears more than k times

in G (G is proper). In particular, consider only Z = {x^...^] £ Y

where x-,...,x are numbered in the order of their appearance in G.

Note that due to condition 5 of Lemma A. 4, if x, y €Y and y follows x

in G, then x cannot appear again after y in G. Let G be as defined in

2

Definition A. 1. Then we will let the reader convince himself that G

7

(hence also F ) is equivalent to K(Z,G') where K(Z,z) is an e -component
a c

over \$ with input variable z, and G' is a certain formula over \$ such
that each variable of G' occurs at most k-1 times.

105

Note that in this case we do not know the size of S(G'). This
can be remedied in the following way: There are two cases; either
|S(G')| s 1/2- t, or not. In the first case perform an a-merger on K
with basis S(G'), after which we obtain an SC of an e-component K'
of length ^ 1/2* t and a formula G" (through the input variable) such
that S(G") equals the set of lateral variables of K' ; in the second
case perform an a-merger on K(Z,G') with basis Z-S(G') in which case we
obtain an e-component K' of length ^ 1/2' t with a constant input operator.

We summarize the preceding in the following

A. 5 Lemma

Let there be given a (*,*,k)-formula F with a proper nesting sequence

2k« t
of length p ^ 1 composed of formulas over \$. Then if p ^ (k+1) , there

exists a set Z = S(F), |f| ^ t, and F is either equivalent to an SC of

SL 3.

an e-component K over § and a formula G over \$ such that S(G) is the
set of lateral variables of K, and no variable of G occurs more than
k-1 times in G; or to an e-component K over i with constant input
operator.

Let there be given a PC F of the formulas F, ,...,F where r £ n

& 1' ' r max

such that |S(F)| = q and each variable appears in at least two among
F-,...,F (i.e., a situation as described in (1) of the statement
of Lemma A. 3). We are interested in obtaining a (nonempty) subset
X £ S(F) such that when the variables outside of X have been replaced

106

by the constant a, |s(norm(F.) )) I for those F. where not all variables

■L a ^-

h?ve been replaced with a is equal or larger than a predetermined number
t (as large as possible).

We could solve the problem as follows: Each variable of S(F) appears

in a certain subset of the formulas F ..,..., F . The number of such

n
t* max

subsets is 2 (in general, S 2 ); thus, we are sure to find a subset X

with |x| 2: — 3 — such that all elements of X appear in the same subset

2 max

of F- ,. .. ,F .
1 r

However, we can improve this number. Let us construct the occurrence

table of F. The table consists of rows corresponding to elements of S(F),

and of columns corresponding to F. for 1 <■ i ^ r. The entry a . is 1

if x occurs in F. and otherwise. We will try to extract a subset
i J

X £ S(F) such that either all variables of F^, are replaced by a, or

XT

S(norm(F.) )) contains 2 elements (t will be determined later).
i a

If all columns in the occurrence table contain 2» t l's, we are
done and X = S(F). Suppose not. Let the column j contain < t l's.
Delete all rows corresponding to the l's in column j and column j
itself. Let the set of variables corresponding to the remaining rows
be X 1 . Consider the remainder of the occurrence table (i.e., minus
the deleted rows and column); and again look for the column with < l's.
If it does not exist, we are done and X = X... If such a column exists
continue. Now two things can happen. Either at some point we end up
with a certain subset of columns, all of which contain ^ t l's, or we
end up with two columns that both contain < t l's. We shall see that by

107

an appropriate choice of t, the latter case cannot happen. The number
of l's in the whole table ^ 2q (each variable Occurs in at least two
formulas). The smallest number of l's remaining after all but two columns
have been deleted 2: 2q-m where m is largest possible number of l's that
can be deleted in the course of this procedure, m = (t-1)- (r+r-l+r-2+. . .

+3) = (t-1)* -^ j ^ (this corresponds to the case when each deleted

row contains only l's and at each stage t-1 rows are deleted). If, after
the table is reduced to two columns, both columns are to contain ^ t l's
(both have to contain the same number of l's since each variable occurs
in at least two formulas), then

2ais * t

2

or since r ^ n

max

t ^^rr wh ere c = (n +3)(n -2)
c+4 max max

For large n this is better than the previous bound. This result
° max

can be summarized in

A. 6 Lemma

Let there be given a PC F of the formulas F ..,..., F over \$ where

r ^ n such that |S(F) = q and each variable appears in at least
max ' '

two among F. , . . . ,F . Then if

108

q 2 f where t 2: 1

y

we can find a subset X £ S(F) such that F is equivalent to a PC of

a

the formulas &,,..., G over § and S(G.) ^ t for 1 ^ i £ s <. r.
Is v i

Lemmas A. 3, A. 5, and A. 6 can be combined into

A. 7 Lemma

Let F be an (\$ ,n,k)-formula. Then for any t ^ 1 and a £ D if

n^^CCk+l^^/^^C)

(see Lemma A. 6 for the value of c) , there exists a subset X £ S(F)
such that either

(1) F is equivalent to a PC of the formulas F ..,..., F over \$

a ^ 1' ' r

where r < n , each variable of F. occurs at most k-1 times in it,
max l '

and F. contains at least t variables of X or

X a

(2) F is equivalent to an SC of an e -component K over \$

a t

with a formula G over \$ (through the input variable) such that S(G)
is the set of the lateral variables of K and no variable occurs more
than k-1 times in G; or to an e -component K over \$ with a constant
input operator.

A. 8 Lemma

Let F be a (\$,n,k)-formula. Then for any t s 1 and a € D if

n > T! 7 (t,k)

109

then there exists a subset X C S(F) such that F is equivalent to an

a

a k

SPCeC over § G such that (1) G has ^ n components, (2) each

max

component is of length ^ t, and (3) the terminal components of G
have constant input operators.

Proof

TL(t,l) = nt . In this case T(F) has at least one branch
7 max

connected to t+1 variable symbols (k=l and thus all variable symbols
are distinct) at different nodes. This branch can be converted into
an e -component with constant input operator. The idea is illustrated
in Fig. A. 1.

n 7 (t,k + l) - T| 6 <(k + 2) 2 °' fl),7) 7 (t . k > i , «*4). V,k)-c )

We can apply Lemma A. 7. The result is either (1) an e-component K
of the correct length and constant input operator, (2) an SC of an
e-component over \$ of the correct length and a formula to which we
can apply the inductive hypothesis, and (3) a PC of formulas to which
we can apply the inductive hypothesis. In each case we obtain an SPCeC
with the desired properties. D

A. 9 Lemma

Let F be a (\$,nk)-formula. Then for any t ^ 1 and a f D if

n * TLCt.k)

110

there exists a subset X C S(F) such that F is equivalent to an SPCeC

a.
a

over \$ G such that (1) G has ^ k components, (2) each component has X
as the set of its lateral variables, (3) the terminal components have
constant input operators and (4) |x| = t.

Proof

/n k \ (n k ) (n k \

Set T1g(t,k) = TL(s«t,k) where s =f max) +1 max/ +. . .+ ( max J

Apply Lemma A. 8 to obtain a SPCeC G' with all components having length

s s*t. Since each variable appears at most k times, it can occur in

at most k components. s is the number of nonempty subsets of £ k elements.

Thus, if the number of variables is as indicated we are sure to kind

in G' a subset of t variables that all occur in the same set of components

of G'. After performing an a-merger with this set as basis, we obtain

the desired SPCeC G. D

Remarks on the bounds in Lemmas A.3-A.9 . If Tl is approximated by

6

P m

n r • q, then TL is inductively defined as follows:
max "»> 7 j

Vt,i) - n<:

7 max

2k-Tl 7 (t,k-l)

V'^ - v*™ 1 ' -yt.k-i) b

/ L

c • — , ^ / , s -^ . „ o, \ , b f2k times

for a certain constant y. Thus we see that TL(t,k) 2: iexp(b,2k) = b J

for k ^ k(b) for any constant b (t has not been included in the estimate

because in applications it is constant).

Ill

V

Conversion of a formula F where each variable occurs only once
into an equivalent e-component by setting certain variables to a.

Fig. A.l

112

APPENDIX B

t
THE LENGTH OF THE MOD 2 SUM OVER II

There is an isomorphism between the set of formulas over II and series-
parellel contact networks. We assume the reader is familiar with this model
as well as with the isomorphism in question. In this case if F is a formula
over II, then L(F) corresponds to the number of contacts in the network corres-
ponding to F.

For convenience, we will derive the result in contact network terminology.
Given a (series-parallel) contact network C, a chain is set of contacts
such that when they are all closed, C conducts (we will say "C is 1"); a cut
set is a set of contacts such that when they are all open, C does not conduct
(we will say "C is 0"). In the obvious way, we define minimal chain , minimal
cut set (i.e., when one contact is deleted the corresponding property does not
hold).

B. 1 Lemma

Given a contact network C and any minimal chain and minimal cut set, their
intersection is a singleton.

This result is due to Khrapchenko [Kh71].

113

Proof

By induction on the number m of contacts in C. For m = 1 the assertion
is obviously true. If m > 1, C must be either a series combination of smaller
networks C. and C™, or a parallel combination of smaller networks C. and C ? .
In each case it is simple to establish the lemma. D

n
Suppose now we have a contact network S that represents © x.. Let m.

i=l J

denote the number of contacts labeled with x. or x. . Then we are interested

in H m..

Consider n-tuples (a.) for 1 ^ i ^ n and a. € {0,1}. An n-tuple of this
kind will be called even if it has an even number of l's, otherwise it is odd .
Obviously S must be 1 on odd n-tuples and on even ones.

Consider an arbitrary odd n-tuple a_ = (a- ,. . . ,a. ,. •• ,a ) and an even
n-tuple b = (b 1 ,. . . ,b. ,. . . ,b ) at Hamming distance 1 from a. If b. = a., then
all other components of a and b are equal, e. will denote the n-tuple with
a single 1 in the i place. Then we will write b = a ffi e..

To each odd n-tuple a. we can assign a minimal chain c (a) (consisting of
a subset of contacts of S that are closed at a and that do form a minimal
chain); similarly, to each even n-tuple b_ we can assign a minimal cut set
s(b) (consisting of a set of contacts of S that are open at b and that do form
a minimal cut set).

Let a be odd, b_ = a ff" e. even. Then by Lemma B.l, c (a) H s (b) is a
singleton; in fact, it is easy to verify that it must be a contact labeled
either with x. or x..

114

We build now Tables I and II. The rows of Table I correspond to odd

n-tuples while those of Table II correspond to even n-tuples. Thus both

n-1
have 2 rows. The columns of both tables correspond to the variable x.

for 1 ^ i ^ n. The entry a(a,j) in Table I is c (a) H s (a © e.). This entry

will be represented by a number between 1 and m..

Let t. . denote the number of times contact number i (among those labeled

with x. or x.) appears in column j of Table I. Then

S t = 2 n_1 (B.l)

1=1 J

The entry p(b,j) of Table II is s (b ) D c (b © e ). (B.l) again holds.

Construct now Table III. The rows of Table III correspond to all possible
pairs (a,b) where a and b are odd and even n-tuples respectively. The columns
of Table III again correspond to variables. An entry of Table III is yCa.bjJ) =
(a(a,j), p(b,j>.

Consider now the diagonal entries of Table III (i.e., (C0,p) such that
a = p). Let (CC(a,j), p(b,j)) be such an entry. Then CC(a,j) = P(b,j) =
c(a) D s(b). Thus, by Lemma B.l, there can be only one such entry in a row.

HI:

n J 2
The number of diagonal entries is S S t^ .. Thus,

j=l 1=1

ij

m.

j-1 i=l " lj

Combining a version of Cauchy's inequality

m j „ "j

t t?. ^ 2 2(n " 1) (B.2)

-L ( t t ..) 2 <. s t 2 .
m j i-i 1J i=i 1J

its

with (B.l) and (B.2), we obtains J ri:

n 1

Thi» time *e apply the, idaqnality , ■<*& lab mItM ,«WI .W t <sjH

sijaH|iJK>,^

( E »,).( E J- ) *n
(for both inequalities see, e.g., Dfc*4j#. 9), «aa obtttartmeeeeired result

ot:>.i3i;iac;i?«.; : tu rjivii

«»l-dQi*-i s,ii;.i--. *:■;-:

E *. a^tt-i' vsll* orfot .£ „IoV s jQi3nc.D i>B£

■lot!

-on

3fe3i3^,a 5<"

cbs;

"SSI.lBi

-iBi

, 5tf«R:3"X«/-2 iot3;K

t. HZi^M S H\$#Jft J ^».5-.**0A y i|C

116

LITERATURE

Ar69 M. A. Arbib, Theories of Abstract Automata, Prentice- Hall, 1969.

Av69 A. Avizienis, On the Problem of Computational Time and Complexity of
Arithmetic Functions, Proc. ACM Symposium Theory of Computing,
May 5-7, 1969, Marina del Rey, California, pp. 255-258.

Be71 C. Berge, Principles of Combinatorics, Academic Press, 1971.

Co71 S. A. Cook, The Complexity of Theorem Proving Procedures, Proc. 3rd
Ann. Symposium Theory Computing, Shaker Heights, Ohio, May 3-5, 1971
pp. 151-158.

Gr59 Editors, Eugene M. Grabbe et al. , Handbook of Automation Computation
and Control, Vol. 2, John Wiley, 1959.

Ha71 L. H. Harper and J. E. Savage, On the Complexity of the Marriage Problem
(unpublished) .

Ho68 L. Hodes and E. Specker, Lengths of Formulas and Elimination of Quanti-
fiers I, Contributions to Mathematical Logic, K. Schutte, editor, North
Holland Publ. Co., 1968, pp. 175-188.

Ho70 L. Hodes, The Logical Complexity of Geometric Properties in the Plane,
Journal ACM, 17, No. 2, pp. 339-347.

Kh71 V. M. Khrapchenko, On the Complexity of the Realization of the Linear
Function in the Class of rr-Circuits , Mat. Zametki, 9, No. 1, 1971,
pp. 35-40 (Russian).

Kr59 R. E. Krichevskii, Realization of Functions by Superpositions, Prob.

Cypernetics II, 1961, pp. 458-477, Pergamon Press (translated from the
Russian).

La67 S. Lang, Algebra, Addi son-Wesley, 1967.

Lu59 0. B. Lupanov, Complexity of Formula Realizations of Functions of
Logical Algebra, Prob. Cybernetics III, A. A. Lyapunov, editor,
Pergamon Press, 1962, pp. 782-811 (translated from the Russian).

Lu70 0. B. Lupanov, On Some Results in the Mathematical Theory of Synthesis
of Control Systems, Information Materials 5(42), Ac. Sci. USSR, Moscow
1970, pp. 16-22 (Russian).

117

Mi69 Mo Minsky and S. Papert, Perceptrons, MIT Press, 1969.

Mk71 R. McKenzie, et al. On Boolean Functions and Connected Sets, Math.
Systems Theory, 5, No. 3, pp. 259-270.

Mt64 D. S. Mitrinovic, Elementary Inequalities, P. Noordhoff Ltd. Groningen,
1964.

My71 J. P. Mylopoulos and T. Pavlidis, On the Topological Properties of
Quantized Spaces I, II, Journal ACM, JL8, No. 2, pp. 239-254.

Ne 66 E. I. Neciporuk, A Boolean Function, Soviet Math. Dokl. , 2, No. 4,
1966, pp. 999-1000.

Ri42 J. Riordan, C. E. Shannon, The Number of Two Terminal Series-Parallel
Networks, J. Math, and Phys. 21, 1942, pp. 83-93.

Ry63 H. J. Ryser, Combinatorial Mathematics, MAA Math. Monographs, No. 14,
John Wiley, 1963.

Sh49 C. E. Shannon, The Synthesis of Two-Terminal Switching Circuits, Bell
System Tech. J., 28, No. 1, 1949, pp. 59-98.

Su61 B. A. Subbotovskaya, Realizations of Linear Functions by Formulas Using
V, &, -, Soviet Math. Dokl., 2, No. 2, 1961, pp. 110-112.

Vi70 B. Vilfan, Cyclic Perceptrons and Pattern Counting Machines, Proc.

4th Ann. Princeton Conf . Info. Sci. and Syst. , Princeton U. , March 1970

Ya54 S. V. Yablonskii, The Realization of the Linear Function in the Class
of TT-Circuits, Dokl, Ac. Sci. USSR, 94, No. 5, pp. 805-806 (Russian).

Ya59 S. V. Yablonskii, On the Impossibility of Eliminating Exhaustive Search
of Boolean Functions in the Solution of Some Problems in the Theory of
Circuits, Dokl. Ac. Sci. USSR, 124, No. 1, pp. 44-47, (Russian).

This empty page was substituted for a
blank page in the original document.

CS-TR Scanning Project

Document Control Form Date JJ±±JJ±.

Report # LCS-TR-^"?

Each of the following should be identified by a checkmark:
Originating Department:

□ Artificial Intellegence Laboratory (Al)
^g" Laboratory for Computer Science (LCS)

Document Type:

V^ Technical Report fTR) □ Technical Memo (TM)

□ Other:

Document Information Number of pages: //8(/^-.>^fJ

- Not to include DOD forms, printer instructions, etc... originarpages only.

Originals are: Intended to be printed as :

□ Single-sided or □ Single-sided or

^ Double-sided £( Double-sided

Print type:

□ Typewriter □ Offset Press □ Laser Print

□ InkJet Printer J^f Unknown □ Other:_ .

Check each if included with document:

2^ DOD Form □ Funding Agent Form ^ Cover Page

□ Spine □ Printers Notes □ Photo negatives

□ Other:

Page Data:

Blank Pages^,-,,™^: foU-owj Q»*t PAG^

Photographs/Tonal Material (by,»9anumb«): .

Other (note dMcnplion^MB* numbar).

Description . Page Number

Scanning Agent Signoff: . . r .

Date Received: I lJ3lHi__ Date Scanned: / / Ml K Date Returned: cLjJjl*.

%4ArhJy>'<fA

Scanning Agent Signature: rWAtjfA^ykjl f v • mtt^R,, Rev &» ds/lcs Document control Fo<mc*Morm.v«d

UNCLASSIFIED

Security Classification

DOCUMENT CONTROL DATA - R&D

(Sacumy claaallication ol till,, body of abstract and indexing annotation muat ba antarad whan the overall report ia claa tilted)

I. ORIGINATING ACTIVITY (Corporate author)

Massachusetts Institute of Technology
Project MAC

3. REPORT TITLE

The Complexity of Finite Functions

2a. REPORT SECURITY CLASSIFICATION

UNCLASSIFIED

26. GROUP

NONE

4. OESCRIPTIVE NOTES (Type ol report and tnclu

Ph.D., Department of Electrical Engineering, February 1972

5. AUTHORIS) (Laat name. Itrat name, Initial)

Vilfan, Bostjan

6. REPORT DATE

March 19 72

8«. CONTRACT OR GRANT NO.

N00014-70-A-0 362-0001

6. PROJECT NO.

7*. TOTAL NO. OF PAGES

118

76. NO. OF REFS
25

9a. ORIGINATOR'S REPORT NUMBER(S)

MAC TR-97 (Thesis)

•6. OTHER REPORT NOIS) (Any other number t that may be
aeaigned thla report)

NONE

10. AVAILABILITY/LIMITATION NOTICES

Distribution of this document is unlimited

M. SUPPLf MENTARY NOTES

None

3D-200 Pentagon
Washington, D. C. 20301

s. abstract T h e topics covered are the length of formulas for finite func-
tions , the order of cyclic perceptrons , and pattern counting machines.
Using a generalization of a theorem of Specker, it is shown" that the
Boolean function is 1 if the number mod p of arguments equal to 1 is
cannot be represented by a formula of length proportional to the number
of arguments if k-ary logic is used with p>k. The same thing can be
shown for arbitrary k if the only binary operators used are max(x,y)
and min(x,y). It is also shown that the connectivity predicate cannot
be represented by a formula of this kind, regardless of k and of the
operators used. Next shown is that the connectivity predicate and the
Euler number predicate cannot be represented by finite order cyclic per-
ceptrons. Finally, it is shown that the only topological predicates
that can be reconstructed from the k-subpattern spectrum of a given
square pattern of 0's and l's are functions of the Euler number. The
k-subpattern spectrum of a pattern is a tuple given the number of
omnrrsnces of anv kxk square subpattern in the original pattern.

14. KEY WORDS

L

computational complexity

combinatorics

finite functions

DD/ N r.. 1473 (M.I.T.)

UNCLASSIFIED

Security Classification

Scanning Agent Identification Target

Scanning of this document was supported in part by
the Corporation for National Research Initiatives,
using funds from the Advanced Research Projects
Agency of the United states Government under
Grant: MDA972-92-J1029.

The scanning agent for this project was the
Document Services department of the M.I.T
Libraries. Technical support for this project was
also provided by the M.I.T. Laboratory for
Computer Sciences.

Scanned

^te: {/&g//J<lg

M.I.T. Libraries
Document Services

darptrgt.wpw Rev. 9/94

```