The general problem we will be concerned with in this paper is that of estimating the regression parameters alpha, beta(sub 1), ..., beta(sub q) in the general regression model.

This paper discusses some of the mathematical properties of Jeffrey's rule for revising a probability P to a new probability P*, connecting it with sufficient partitions, and maximum entropy updating of contingency tables. The main results concern simultaneous revision on two partitions.
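
As a minimal sketch of the rule for a single partition (the simultaneous two-partition case studied in the paper is not attempted here), the revision rescales P inside each cell E_i so that the cell receives its new mass q_i, leaving conditional probabilities within cells unchanged. The four-point space, the partition, and the new weights below are hypothetical illustrations.

```python
from fractions import Fraction

def jeffrey_update(p, partition, new_weights):
    """Revise p (dict: outcome -> probability) by Jeffrey's rule.

    For an outcome w in cell E_i:  P*(w) = q_i * P(w) / P(E_i).
    """
    revised = {}
    for cell, q in zip(partition, new_weights):
        cell_mass = sum(p[w] for w in cell)
        if cell_mass == 0:
            raise ValueError("cannot rescale a cell of prior probability zero")
        for w in cell:
            revised[w] = q * p[w] / cell_mass
    return revised

# Hypothetical example: four outcomes, partition {a, b} | {c, d}.
prior = {"a": Fraction(1, 8), "b": Fraction(3, 8),
         "c": Fraction(1, 4), "d": Fraction(1, 4)}
partition = [("a", "b"), ("c", "d")]
posterior = jeffrey_update(prior, partition, [Fraction(4, 5), Fraction(1, 5)])
```

Exact fractions are used so that the preserved within-cell ratios can be checked without rounding error.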

The classical Fatou inequality for non-negative measurable functions is intimately connected with problems of convergence of random variables (r.v.). The present paper focuses on the study of a modified form of the Fatou inequality which has important applications in the theory of convergence of r.v. We formulate our results in the language of probability.

Percentage points for Greenwood's statistic obtained by fitting Pearson curves to the first four moments are given. Comparisons are given with the exact points for n=10, recently given by Burrows (1979) and these suggest that the approximate points will be accurate for practical purposes. (Author)
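
For orientation, the statistic itself is simple to compute. The sketch below assumes the common definition, the sum of squared spacings generated by n points on [0, 1]; conventions for what n counts vary between authors, and the Pearson-curve fitting of the report is not reproduced.

```python
def greenwood_statistic(points):
    """Sum of squared spacings for points in [0, 1] (assumed definition)."""
    xs = sorted(points)
    if any(x < 0.0 or x > 1.0 for x in xs):
        raise ValueError("points must lie in [0, 1]")
    # Add the end points so that n points give n + 1 spacings.
    edges = [0.0] + xs + [1.0]
    return sum((b - a) ** 2 for a, b in zip(edges, edges[1:]))
```

Equally spaced points give the smallest possible value, 1/(n + 1); clustering pushes the statistic towards 1.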

Conditional inference plays a central role in statistics, but determination of relevant conditional distributions is often difficult. We develop analytical procedures that are accurate and easy to apply for approximating conditional distribution functions. For a continuous random vector we estimate conditional tail probabilities are smooth functions of X. Previous approaches have dealt with the cases where the variable whose conditional distribution is sought is a linear function of means, and...

How do Bayesians justify using conjugate priors on grounds other than mathematical convenience? In the 1920's the Cambridge philosopher William Ernest Johnson in effect characterized symmetric Dirichlet priors for multinomial sampling in terms of a natural and easily assessed subjective condition. Johnson's proof can be generalized to include asymmetric Dirichlet priors and those finitely exchangeable sequences with linear posterior expectation of success. Some interesting open problems that...
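
The linearity that such characterizations turn on can be seen directly in the symmetric case. A minimal sketch, assuming a symmetric Dirichlet prior with common parameter alpha on k categories: the predictive probability of category i after counts n_1, ..., n_k is (n_i + alpha) / (n + k alpha), which is linear in n_i. The function name and default are illustrative only.

```python
def predictive_probabilities(counts, alpha=1.0):
    """Posterior predictive probabilities under a symmetric Dirichlet(alpha) prior."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    k = len(counts)
    n = sum(counts)
    # Each probability depends on the data only through n_i and n.
    return [(c + alpha) / (n + k * alpha) for c in counts]
```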

Fortus' generalization (1979) of asymptotic shapes of optimal testing regions for composite hypotheses does away with the restriction to exponential families originally imposed by us (1962). Here we survey his work critically, and suggest some improvements that may be crucial for its practical applicability to parametric problems, and point out its shortcomings for nonparametric ones. (Author)

The asymptotic order of magnitude of the joint moments of the maxima in a critical Galton-Watson process is given. Keywords: Integers; Martingales; Inequalities.

We discuss fixed-width interval estimation for the slope parameter in a simple linear regression when the X sub i are also normally distributed. A two-stage procedure that combines prediction with estimation is described. In addition, we discuss two sequential procedures. The confidence intervals obtained are used to construct tests with level and power at least independent of the values of the other parameters. We also consider a sequential procedure based on the distribution of the sample...

This note is concerned with countably infinite product sigma-fields and their invariant, tail, and exchangeable sub-sigma-fields. Under an exchangeable probability the three sub-sigma-fields coincide as measure algebras (Theorems (1) and (7)). An immediate consequence is the Hewitt-Savage 0-1 law. A later section includes examples which by and large preclude extensions of (1) and (7) to probabilities merely invariant under the shift. However, at least one interesting conjecture of David...

The operation of sorting the items of a Q-set according to their similarity to a given object is idealized by a system of axioms. As a consequence of this axiom system, a stochastic model of Q-sorting behavior is derived. This model, with its associated axioms, resembles in some respects the preference model of Luce; the two are compared at length.

A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by the recursive partitioning approach to regression and shares its attractive properties. Unlike recursive partitioning, however, this method...

Estimation of the parameters of a first-order Gaussian moving average model is treated in detail. Iterative methods in both the time and frequency domains are based on the maximization of the exact likelihood. Several methods for evaluating the necessary quadratic forms and traces are presented. The procedures are compared with each other and with alternative procedures.

It is argued that statistical inference, like other branches of mathematics, should have a structure of axioms and theorems. Such an axiomatic system is described and shown to lead to the likelihood principle. Almost all standard statistical techniques violate that principle. The paper concludes with an example using the Galton-Watson process which demonstrates the power of the principle.

Consider an experiment which consists of n independent Bernoulli sequences, each of which is randomly terminated at either the kth success or the kth failure, whichever comes first. The goals are to make inferences about the per trial probability of success, as well as the probability that a sequence is terminated with a success. The latter is equivalent to the probability that if 2k-1 trials are accumulated, the majority will be successes. Various estimators are investigated, and the results...
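
The equivalence stated above can be checked numerically. The sketch below computes the probability that a sequence stops at the kth success before the kth failure, and the probability of a majority of successes in 2k-1 trials; the function names are illustrative, and none of the estimators investigated in the report are reproduced.

```python
from math import comb

def prob_ends_with_success(p, k):
    """P(kth success occurs before kth failure), summing over the j < k failures seen."""
    q = 1.0 - p
    return sum(comb(k - 1 + j, j) * p ** k * q ** j for j in range(k))

def prob_majority_successes(p, k):
    """P(at least k successes in 2k - 1 independent trials)."""
    n = 2 * k - 1
    q = 1.0 - p
    return sum(comb(n, s) * p ** s * q ** (n - s) for s in range(k, n + 1))
```

For any p and k the two functions agree up to floating-point error, for example at p = 0.3 and k = 4.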

A one-dimensional Wiener plus independent Poisson control problem with state governed by a partial differential equation has integrated discounted quadratic cost function and asymmetric bounds on the control, which is a function of the current state. A Bellman equation and maximum principle for partial differential equations are used to obtain the optimal closed loop control in bang-bang form. The finite and infinite integral quadratic cost functions are treated separately.

We derive an exact p-value for testing a global null hypothesis in a general adaptive regression setting. Our approach uses the Kac-Rice formula (as described in Adler & Taylor 2007) applied to the problem of maximizing a Gaussian process. The resulting test statistic has a known distribution in finite samples, assuming Gaussian errors. We examine this test statistic in the case of the lasso, group lasso, principal components and matrix completion problems. For the lasso problem, our test...

Using facts about divided differences, we prove an identity which can sometimes express the joint distribution of several linear combinations of uniform spacings as a sum of simpler distributions. This identity is useful in the exact computation of probabilities and expectations which arise in testing for uniformity. Examples are given. Keywords include: Spacings; Uniform distribution; Divided differences; Clustering probabilities.

The positive probability that an estimated moving average process is noninvertible is studied for maximum likelihood estimation of a univariate process. Upper and lower bounds for the probability in the first-order case are obtained as well as limits when the sample size tends to infinity. Higher order moving average models and autoregressive moving average models are also treated. Keywords: Moving average models; Maximum likelihood estimation; Noninvertible moving average; and Autoregressive...

There is an extensive literature concerned with random arcs and coverage problems. Much of this literature deals exclusively with the case of equal arc lengths. In this document the authors mention some work dealing with coverage of the circle by arcs of differing lengths. Keywords: Schur convex functions; Coverage probabilities.

It is shown that the goodness of fit statistic based on moments of Gurland and Dahiya (1970) can be expressed as a sum of components having asymptotically independent chi-square sub 1 distributions. Expressions are found for the components under the null hypothesis of normality or exponentiality. The first and second components in the normal case are equivalent to the skewness and kurtosis measures b1 and b2 respectively. The first component in the exponential case is equivalent to Greenwood's...

Designs for quadratic regression on a cube, on a cube with truncated vertices, and on a ball are studied in terms of a family of criteria that includes those of A-, D- and E-optimality. Both theoretical and numerical results on structure and performance are presented. In particular, D- and E-optimum designs are described, and a procedure is suggested for constructing integer designs that are nearly robust under variation of the criterion. Some examples are given for dimensions 4, 5 and 6.

The concept of Standardized Generalized Variances (SGV's) is introduced. Several new problems of multivariate statistical inference are formulated on the basis of these SGV's. It is shown that in addition to providing several new statistical tests, many existing problems of multivariate tests of significance can be regarded as special cases of these formulations and can also be extended to their full generalities. Considering multivariate normal populations with general covariance matrices,...

A problem studied by Flanders is to minimize the function f(R) = tr(SR + TR^{-1}) over the set of positive definite matrices R, where S and T are positive semi-definite matrices of rank m. Alternative proofs that may have some intrinsic interest are provided. The proofs explicitly yield the infimum of f(R). One proof is based on a convexity argument and the other on a sequence of reductions to a univariate problem.

The estimation of a variance ratio is studied under certain restrictions. A Bayesian viewpoint is taken. We then assume additional information on tau sub i is available in the form of an independent observation from a noncentral chi-square distribution. A natural application arises when the u sub i are sums of squares in a variance components model.

The tables provide probabilities for linear combinations of independent chi-square random variables with one degree of freedom, for n=2,3. These extensive tables for n=2,3 were given in an unpublished Technical Report. A number of requests over the years plus the additional uses for it for n 3 discussed in this paper and by Bock (1984) suggest that its republication is merited. Keywords include: Quadratic forms; Linear combinations of chi-square variables; Percentiles.

The optimal statistical control of a simple production process which has only two underlying states, good and bad, is studied. The produced items are classified as good or defective, a cost being associated with each defective item produced. A cost is also charged for repairing the process, which has the effect of returning the process to the good state. Other than immediately after repair, the process state is assumed unknown. One seeks a statistical control rule which, based on the quality...

A model is given for a class of contests in which the participants try to guess (or estimate) unknown quantities, and the objective of each player is to come closer to the unknown quantities than his adversary. A general optimality result is proved which gives the best guessing rules for the second guesser. These rules are first calculated exactly in a certain hierarchical linear model, and then simpler approximate rules are given. (Author)

Many authors have warned of the hazards involved in fitting Pearson curves on the basis of empirically determined moments. For the problem of estimating a percentage point of a Pearson curve on the basis of empirical moment estimates, the delta method easily yields an approximation to the variability of the estimated quantile. This report describes a crude computer program for assessing quantitatively the size of these hazards in the case of Pearson Type IV curves (an important family including...

This report shall focus attention on the specification of the edge process, and show how various geometrical insights suggest how the prior Gibbs distribution should be constructed. The discussion will suggest relative costs for possible configurations somewhat different from those proposed by Geman and Geman (1984). In addition the scheme will provide methods for dealing with rectangular and irregular pixel patterns. The general idea of evolving penalties for continuation configurations based...

Orderings (denoted ) for probability distributions on the circle are introduced for which mu1 mu2 means roughly that mu1 is more uniform than mu2, or that mu2 is more clumped than mu1. Somewhat more precisely, mu1 mu2 if the random variable mu1(A sub s) is 'less variable' than the random variable mu2(A sub s) for all s where A sub s denotes a random arc of length s distributed uniformly on the circle. Some properties of the orderings are explored and applications are presented to random...

The distribution of error is considered when a function y of x is rounded, and when x is uniformly distributed. The example discussed is y = sin x, and it is thought that the round-off error might be nearly uniformly distributed. The non-uniformity is very small, and the sample size needed to detect this by the A2 statistic is examined. The study is of interest in the examination of ancient and mediaeval tables. Keywords: Distribution of error; Goodness-of-fit; Tables of functions.

Suppose a box contains m balls, numbered from 1 to m. A random number of balls are drawn from the box, their numbers are noted, and the balls are then returned to the box. This is done repeatedly, with the sample sizes being iid. Let X be the number of samples needed to see all the balls. This paper derives a simple but typically very accurate approximation for EX in terms of the sample size distribution. The justification of the approximation formula uses Wald's identity and Markov-chain...
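
The quantity EX can be estimated by direct simulation, which gives a baseline against which any approximation could be compared. The sketch below is not the paper's approximation formula. It assumes that balls within one sample are drawn without replacement and that the sample size never exceeds m; the function names and the `draw_sample_size` callback are hypothetical.

```python
import random

def samples_until_all_seen(m, draw_sample_size, rng):
    """Number of samples needed until every ball 0..m-1 has appeared at least once."""
    seen = set()
    count = 0
    while len(seen) < m:
        size = draw_sample_size(rng)  # iid sample size, assumed to satisfy 0 <= size <= m
        seen.update(rng.sample(range(m), size))  # without replacement within a sample
        count += 1
    return count

def estimate_expected_samples(m, draw_sample_size, reps=2000, seed=0):
    """Monte Carlo estimate of EX; sizes must be positive with positive probability."""
    rng = random.Random(seed)
    total = sum(samples_until_all_seen(m, draw_sample_size, rng) for _ in range(reps))
    return total / reps
```

For example, `estimate_expected_samples(50, lambda rng: rng.randint(1, 5))` estimates EX when each sample contains between one and five balls, chosen uniformly.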

Previous attempts at implementing fully Bayesian nonparametric bioassay have enjoyed limited success due to computational difficulties. We show here how this problem may be generally handled using a sampling based approach to develop desired marginal posterior distributions and their features. A useful extension is presented which treats the case of ordered polytomous response. Illustrative examples are provided.

A review is undertaken of two maximum likelihood approaches to cluster analysis, the so-called classification and mixture maximum likelihood methods. The basic assumptions of the two approaches and their associated properties are contrasted, in particular for multivariate normal component distributions. The problem of deciding how many clusters there are is discussed for each approach. Also, an account is given of the relative efficiency of the mixture approach to clustering. (Author)

Generalizations of Cochran's theorem by Takemura are discussed in connection with results by Anderson and Styan and other works by Rao and Yanai and Marsaglia and Styan. A number of further extensions are presented.

Blackwell's renewal theorem for non-lattice renewal processes with mean recurrence time m states that the expected number of renewals in a time-interval of length h tends to h/m as the interval goes to infinity: E(N(t,t + h)) approaches h/m as t approaches infinity. This note presents a self-contained coupling proof of this result, mending the drawbacks of earlier such proofs. Firstly, the proof is complete in the sense that it covers not only the case m < infinity but also m = infinity. Secondly,...
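
The statement of the theorem can be illustrated by simulation; this is only a numerical check under an assumed recurrence-time distribution, not the coupling argument of the note. The function names and the `draw_gap` callback are hypothetical.

```python
import random

def renewals_in_window(t, h, draw_gap, rng):
    """Count renewal epochs falling in (t, t + h] along one simulated path."""
    epoch = 0.0
    count = 0
    while True:
        epoch += draw_gap(rng)  # next recurrence time
        if epoch > t + h:
            return count
        if epoch > t:
            count += 1

def mean_renewals(t, h, draw_gap, reps=20000, seed=0):
    """Monte Carlo estimate of E(N(t, t + h))."""
    rng = random.Random(seed)
    return sum(renewals_in_window(t, h, draw_gap, rng) for _ in range(reps)) / reps
```

With recurrence times uniform on (0, 2), so that m = 1, `mean_renewals(50.0, 0.5, lambda rng: rng.uniform(0.0, 2.0))` should be close to h/m = 0.5.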

This paper contains an account of several techniques in multivariate data analysis. Included among these techniques are classification and clustering procedures, multidimensional contingency table analysis, and some graphical representation techniques. Some data bases are employed to illustrate the techniques.

An approximation is given to calculate V, the covariance matrix for normal order statistics. The approximation gives considerable improvement over previous approximations, and the computing algorithm is available from the authors.

Binary switching nets have been presented as useful models for a variety of complex phenomena. Determinate binary functions describe the response of an element in such a net to its inputs. Such functions are called Boolean transformations. Three structural properties of these transformations are studied: forcibility, internal homogeneity, and threshold.
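
Two of these properties can be sketched on small truth tables. The definitions used here are assumptions, since the abstract does not state them: an input is taken to be forcing if one of its values fixes the output whatever the other inputs are, and internal homogeneity is taken to be the fraction of truth-table rows sharing the more common output value. The threshold property is not covered.

```python
from itertools import product

def truth_table(func, n_inputs):
    """Map every 0/1 input tuple to the output of func."""
    return {bits: func(*bits) for bits in product((0, 1), repeat=n_inputs)}

def forcing_inputs(table, n_inputs):
    """List (input index, input value, forced output) triples (assumed definition)."""
    found = []
    for i in range(n_inputs):
        for v in (0, 1):
            outputs = {out for bits, out in table.items() if bits[i] == v}
            if len(outputs) == 1:  # this input value alone determines the output
                found.append((i, v, outputs.pop()))
    return found

def internal_homogeneity(table):
    """Fraction of rows taking the more common output value (assumed definition)."""
    ones = sum(table.values())
    return max(ones, len(table) - ones) / len(table)
```

Under these definitions AND is forcible on both inputs (a 0 on either input forces the output to 0), while exclusive-or is forcible on neither.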

Eight models of randomness for chords of a unit sphere are considered, and the distribution, mean, and variance of the chord length are obtained for each model.
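
One possible model of randomness can be simulated as a sketch: the chord joins two points chosen independently and uniformly on the surface. Whether this is one of the eight models considered is an assumption. Under this model the squared chord length is uniform on [0, 4], so the mean length is 4/3 and the variance is 2/9, which the simulation should reproduce approximately.

```python
import math
import random

def random_point_on_sphere(rng):
    """Uniform point on the unit sphere, obtained by normalising a Gaussian vector."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0.0:
            return (x / r, y / r, z / r)

def chord_length_moments(reps=100000, seed=0):
    """Monte Carlo mean and variance of the chord length between two uniform surface points."""
    rng = random.Random(seed)
    lengths = [math.dist(random_point_on_sphere(rng), random_point_on_sphere(rng))
               for _ in range(reps)]
    mean = sum(lengths) / reps
    var = sum((length - mean) ** 2 for length in lengths) / (reps - 1)
    return mean, var
```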

In this paper we examine the distribution of a sum S of binomial random variables, each with different success probabilities. The distribution arises in reliability analysis and in survival analysis. An algorithm is given to calculate the exact distribution of S, and several approximations are examined. An approximation based on a method of Kolmogorov, and another based on fitting a distribution from the Pearson family, can be recommended.
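
One straightforward way to obtain the exact distribution of S is repeated convolution of the component binomial distributions. The sketch below uses that approach; it is not claimed to be the algorithm given in the paper, and the function names are illustrative.

```python
from math import comb

def binomial_pmf(n, p):
    """Probabilities P(X = 0), ..., P(X = n) for X ~ Binomial(n, p)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve(a, b):
    """Distribution of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def sum_of_binomials_pmf(components):
    """Exact pmf of S for components given as (n_i, p_i) pairs."""
    pmf = [1.0]  # distribution of the empty sum: S = 0 with probability one
    for n, p in components:
        pmf = convolve(pmf, binomial_pmf(n, p))
    return pmf
```

For example, `sum_of_binomials_pmf([(10, 0.1), (5, 0.4), (20, 0.02)])` returns the 36 probabilities P(S = 0), ..., P(S = 35).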

We present a natural preliminary test for the presence of structure (nontrivial dependence) in a data set, and give some examples of its use. The procedure consists of sphering the data to remove correlations, then binning or discretizing the data, and finally, studying the cell counts in the resulting contingency table. If this procedure detects structure, we can then use more computationally intensive methods to determine the nature of this structure.
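
The three steps (sphere, bin, study cell counts) can be sketched for two-dimensional data as follows. Several choices here are assumptions rather than the paper's procedure: sphering uses the inverse Cholesky factor of the sample covariance, bins are equal-count bins on each sphered coordinate, and the cell counts are summarised by a Pearson chi-square statistic against independence. No reference distribution is supplied.

```python
import math

def sphere_2d(points):
    """Centre 2-D data and remove correlation with the inverse Cholesky factor."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 ** 2)  # zero for perfectly collinear data
    sphered = []
    for x, y in points:
        z1 = (x - mx) / l11
        z2 = ((y - my) - l21 * z1) / l22
        sphered.append((z1, z2))
    return sphered

def equal_count_bins(values, bins):
    """Label each value with a bin index 0..bins-1 according to its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels

def cell_count_statistic(points, bins=3):
    """Return the bins x bins table of cell counts and its chi-square statistic."""
    z = sphere_2d(points)
    rows = equal_count_bins([a for a, _ in z], bins)
    cols = equal_count_bins([b for _, b in z], bins)
    table = [[0] * bins for _ in range(bins)]
    for r, c in zip(rows, cols):
        table[r][c] += 1
    n = len(points)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[r][c] for r in range(bins)) for c in range(bins)]
    stat = 0.0
    for r in range(bins):
        for c in range(bins):
            expected = row_tot[r] * col_tot[c] / n
            stat += (table[r][c] - expected) ** 2 / expected
    return table, stat
```

Points spread evenly around a circle are uncorrelated but strongly dependent, so they give a large statistic, whereas independent Gaussian coordinates give a small one.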

