ED 362 384                                                  SE 053 654

AUTHOR       Sobolewski, Stanley J.
TITLE        A Latent Variable Path Analysis Model of Secondary
             Physics Enrollment in New York State.
PUB DATE     Apr 93
NOTE         13p.; Paper presented at the Annual Meeting of the
             National Association for Research in Science Teaching
             (Atlanta, GA, April 17, 1993).
PUB TYPE     Reports - Research/Technical (143);
             Speeches/Conference Papers (150)
EDRS PRICE   MF01/PC01 Plus Postage.
DESCRIPTORS  *Academic Achievement; Classroom Research;
             *Enrollment Influences; High Schools; *Physics;
             *Science Education; Secondary School Science
IDENTIFIERS  *New York


Physics is a fundamental science course and can be 
valuable to all students; however, enrollment at the high school 
level is low. This study tries to identify factors that influence 
high school student enrollment and achievement. Raw data was obtained 
from the New York State Education Department on magnetic tapes with 
over 300,000 records containing information about individual classes. 
From the data sets, variables determined to be appropriate for the 
model were selected, the latent variables identified, and their 
relationships explored. There are six major points identified in the 
study: (1) there has been little change in enrollment in the past 20 
years; (2) school level variables, including size of school, 
instructional facilities, and teacher experience have the strongest 
relationship to physics enrollment; (3) variables external to the 
school have the biggest impact on physics achievement; (4) larger 
schools have disproportionately fewer students enrolled in physics; 
(5) schools with a higher percentage going to college have a greater 
physics enrollment; and (6) in schools with a higher percentage of 
females enrolled in physics, the relationship between achievement in 
previous science courses and physics achievement is much stronger. 


Reproductions supplied by EDRS are the best that can be made 
from the original document. 



A Latent Variable Path Analysis Model of 
Secondary Physics Enrollment in New York State 

A presentation and discussion 
at the Annual Meeting of the 
National Association for Research in Science Teaching 
Atlanta, Georgia 
April 17, 1993 


Stanley J. Sobolewski 
Department of Physics 
Indiana University of Pennsylvania 
Indiana, Pennsylvania 15705 





The number of high school students choosing to enroll in physics has been traditionally low. In 1968, 20% of high 
school students in an average grade took a physics course (Welch 1968). A study conducted by the American Institute of 
Physics found that in 1987, 623,000 students were enrolled in a physics course. This represents about 20% of the 1987 high 
school graduates (Neuschatz and Covalt 1988). Since high school physics is the last science experience for most people, it is 
possible that up to 80% of the population has an incomplete science experience. 

Physics is a fundamental science course and can be valuable to all students. Few educators would advise a student not 
to take some type of course in physics. Yet physics enrollments are low. The placement of the course at the end of the 
science curriculum sequence discourages some students from enrolling. In some cases, a lack of course availability may be a 
problem. According to the 1985-86 National Survey of Science and Mathematics Education (Weiss 1987), when science 
teachers were asked to name a science course, only 12% of the teachers sampled mentioned physics. Other reasons for low 
enrollment in physics include students' perception that the course is difficult and the inaccessibility of the course due to 
prerequisites (Chandavarkar 1988, Gay 1978). Other explanations for the low physics enrollment include a lack of enthusiastic 
and qualified physics teachers (Franz, Aldridge & Clark, 1983). 

Statement of Problem 

This study will try to identify factors that influence high school physics enrollment and achievement. 

Research Problem 

The following broad research issues will be addressed: 

1. What are some of the elements, both in the school environment and external to the formal educational process, that have an 
influence on physics enrollment in high school? Included will be variables that have been discussed by other authors as 
well as new variables to be introduced in this study. 

2. Will the model identify factors that differentiate schools with a high percent of an average year enrolled in physics (PEP) from schools with low PEP? Will the model 
identify factors that differentiate schools with high average scores on the Regents physics exams from schools with low 
average scores on the Regents physics exams? 


Data Acquisition and Analysis 

The raw data was obtained from the New York State Education Department on five magnetic tapes. There 
were three sets of data on the tapes. The Basic Educational Data System (BEDS) tape held over 300,000 records that 
contained information about individual classes. Each record contained information on class size, the number of times 
the class meets per week, the quality of the students in the class (remedial to honors) as well as information on the 
teacher's experience and education. The school data tape is a compilation of surveys completed by the principal of 
each building. Data pertaining to the school's resources and students as a whole was found in this file. The data 
included the number and type of classrooms in the school, and the number of books in the library as well as the 
distribution of various minorities in the student body. Lastly, the Comprehensive Achievement Report (CAR) 
consisted of data on three magnetic tapes, which gave school level reports on the number of students taking and 
passing various state exams. Since school districts are required by the state to submit this information, there was 
little problem with missing data. The exceptions were teachers' salaries, and class size in small districts where 
physics and chemistry were taught in alternating years. The teacher's salary was not used as a predictor variable, since 
the salary is primarily a reflection of the number of years of experience in a particular district and therefore is 
redundant data. The missing physics enrollment for the 1990-1991 school year was replaced with one-half of the 
1989-1990 physics enrollment, which was also included among the information on the magnetic tapes. 
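
The replacement rule for missing enrollment can be sketched in a few lines. This is only an illustration of the rule stated above, not the original SPSS-X procedure; the function name, the school names and the enrollment figures are hypothetical.

```python
from typing import Optional


def fill_missing_enrollment(current: Optional[float], previous_year: float) -> float:
    """Return the 1990-1991 physics enrollment for a school.

    If the value is missing (None), substitute one-half of the
    1989-1990 enrollment, as described in the text.
    """
    if current is None:
        return 0.5 * previous_year
    return current


# Hypothetical schools: (1990-1991 enrollment, 1989-1990 enrollment)
schools = {
    "School A": (None, 24),  # physics not offered this year, value missing
    "School B": (31, 28),    # value reported, left unchanged
}

filled = {
    name: fill_missing_enrollment(current, previous)
    for name, (current, previous) in schools.items()
}
print(filled)
```

Halving the previous year's figure spreads one alternating-year class over two school years, which is presumably why this value was chosen.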

From the data sets, variables which were determined to be appropriate for the model were selected. These 
variables are listed in the appendix. From these data, the latent variables were subsequently identified. The previous 
work of Bryant (1979) as well as the models of Noonan and Wold (1983) were used as a guideline for the exploratory 
factor analysis. Initially, a factor analysis on all of the variables was conducted to determine the number of factors 
operating on physics enrollment. Secondly, groups of variables which were suspected of having a high correlation 
with each factor were examined by the investigator to determine if these manifest variables reflected only one latent 
variable. 

Model Construction 

Essentially, the construction of the exploratory model was a three step process: 1) processing the data into a 
meaningful format, 2) identification and construction of the latent variables, and 3) determining the relationship 
between each of the variables. For each of these steps, a computer based procedure was used. For the first step, the 
data entry process in SPSS-X was used to read the raw data from the tapes. The second step used the procedure 
FACTOR, also an SPSS-X procedure. Thirdly, LISREL was used to verify the proposed model. Also, a series of 
multiple regressions was used to determine relationships between the manifest variables. The critical decision in the 
study is the development of latent variables. Once these variables have been identified and constructed, the 
relationship between the variables can be studied. 

Factor analysis is a process of grouping variables which are highly correlated with each other. 
It is the working assumption of any factor analysis that there is a common element to the variables which are 
correlated with each other. The task of the researcher is to determine if the factors isolated by the analysis are 
representative of reality or spurious associations. The factor loadings and coefficients were found using the SPSS-X 
statistics package. After several analyses had been performed, approximately eight factors emerged. These factors or 
latent variables were identified and named by the manifest variables associated with them. The factors identified and 
the rationale for postulating the analogous latent variable are as follows. 

Initially the manifest variables listed in the appendix were used in an exploratory factor analysis. In this 
process it was found that some of the variables shared little variance with most of the other variables. The amount of 
shared variance is indicated by the communality ($h^2$). For example, the percent of students in a physical science class 
(NPHYSCI, $h^2$ = 0.07536) and the number of black and white televisions in the school (BWTV, $h^2$ = 0.07413) 
were not used since they had little common variance with most of the other variables. It is not assumed that these 
omitted variables are unimportant or that they have no predictive power; omitting them is an attempt to focus the model on the 
problem of physics enrollments. When these weakly linked variables were removed, additional factor analyses were 
performed to isolate the latent variables. 
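
The screening step described above can be sketched as follows. In an orthogonal factor solution, a variable's communality is the sum of its squared loadings across the factors. The loadings, variable names and the 0.10 cutoff below are hypothetical; the paper reports only the two communalities quoted above and does not state a cutoff.

```python
def communalities(loadings):
    """Map each variable to its communality h^2.

    `loadings` maps a variable name to its list of factor loadings;
    h^2 is the sum of the squared loadings (orthogonal factors assumed).
    """
    return {name: sum(l * l for l in row) for name, row in loadings.items()}


def drop_weak(loadings, threshold):
    """Keep only variables whose communality is at least `threshold`."""
    h2 = communalities(loadings)
    return {name: row for name, row in loadings.items() if h2[name] >= threshold}


# Hypothetical loadings of three manifest variables on two factors
example = {
    "VAR_A": [0.80, 0.10],
    "VAR_B": [0.20, 0.15],
    "VAR_C": [0.30, 0.70],
}

h2 = communalities(example)
kept = drop_weak(example, threshold=0.10)  # assumed cutoff, for illustration only
print(h2)
print(sorted(kept))
```

Here VAR_B shares little variance with the two factors and is dropped, in the same way that NPHYSCI and BWTV were set aside before the later factor analyses.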

Latent and Composite Variables 

A latent variable is one which cannot be directly measured, while manifest variables can be measured. The 
nature of the latent variables must be determined by the theoretical framework and the relationship between the 
manifest variables. A latent variable is not constructed by the researcher; it is discovered in a way analogous to the 
discovery of an astronomical body. Unlike latent variables, composite variables are groups of measured variables 
rearranged by the experimenter. Composite variables make the process of identifying a simpler structure more 
efficient and reduce the amount of data necessary to construct the model. Composites also simplify the 
interpretation of the effect since fewer variables are easier to conceptualize (Darmondy 1984). Among the 
considerations for identifying the best combination of variables to create the composite, Darmondy suggests the following: 

a) the nature of the variables available 

b) their relationship to the criterion variable 

c) the researcher's preference for defining a latent pre-existing variable whose aspects are reflected in 
the components 

d) the plausibility of the resulting construct 

When appropriately chosen, the variables will maximize the variance explained in the criterion variable and 
produce a composite which conforms best to the method of analysis used. The addition of variables will increase the 
explained variance. Once the explained variance does not increase, there is no reason to increase the number of variables. 

A third relevant point made by Darmondy has to do with the selection of the specific variables. On the basis 
of a factor analysis, groups of variables should be selected that have a) high communalities, b) high factor loadings, c) 
positive correlation with the criterion, and d) positive inter-correlation. 
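
The four selection criteria can be read as a simple filter over candidate variables. The sketch below is one possible reading, not Darmondy's procedure: the class, the function name, the thresholds of 0.5 and all of the numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    communality: float          # criterion a)
    loading: float              # criterion b)
    corr_with_criterion: float  # criterion c)


def select_for_composite(candidates, inter_corr, min_h2=0.5, min_loading=0.5):
    """Return the names of variables that satisfy criteria a) to d).

    `inter_corr` maps frozenset({name1, name2}) to the correlation
    between the two variables. The cutoffs are hypothetical.
    """
    # a) high communality, b) high loading, c) positive correlation with criterion
    passed = [
        c.name
        for c in candidates
        if c.communality >= min_h2
        and c.loading >= min_loading
        and c.corr_with_criterion > 0
    ]
    # d) keep a variable only if it correlates positively with every other survivor
    return [
        n
        for n in passed
        if all(inter_corr[frozenset((n, m))] > 0 for m in passed if m != n)
    ]


candidates = [
    Candidate("VAR_A", 0.72, 0.81, 0.45),
    Candidate("VAR_B", 0.64, 0.70, 0.30),
    Candidate("VAR_C", 0.08, 0.20, 0.10),   # fails a) and b)
    Candidate("VAR_D", 0.60, 0.66, -0.25),  # fails c)
]
inter_corr = {frozenset(("VAR_A", "VAR_B")): 0.55}

selected = select_for_composite(candidates, inter_corr)
print(selected)
```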




In most circumstances, if the composite variables are constructed appropriately, the latent variables in a 
model will agree with the composite variables in the same model. 

The Path Analysis Diagram 

The path analysis diagram is essentially the graphical representation of a set of simultaneous equations. 
There are many conventions on the construction of path diagrams; the method of representing associations is usually 
dependent upon the technique used to solve the system of equations. Here the conventions of Jöreskog and Sörbom 
(1984) will be used since LISREL was used as the primary analytical tool. 

LISREL - Linear Structural Relationship 

Developed by Jöreskog and Sörbom, LISREL 7 is a computer program that solves sets of linear equations. 
While the basic function of the program is not unique to LISREL, authors refer to sets of structural equations as a 
LISREL model. LISREL was used in this study primarily due to the nature of the data as well as convenience. The 
"full LISREL model" was used in this study. This full model consists of two types of equations. The first is referred 
to as the structural model and it refers to the relationship between the latent variables: 

$$\eta = \beta(\eta) + \gamma(\xi) + \zeta \qquad \text{(i)}$$

The $\eta$ vector contains elements that are referred to as the endogenous variables, or latent dependent 
variables. Vector $\xi$ contains the exogenous variables, or latent independent variables. The $\zeta$ term is the error not 
explained by the model, $\beta$ is a coefficient matrix that gives the influence of the $\eta$'s on each other and $\gamma$ is the matrix 
that contains terms that relate the endogenous and exogenous terms. 

The second set of LISREL equations is referred to as the measurement model. These two equations are: 

$$y = (\Lambda_y)\eta + \varepsilon \qquad \text{(ii)}$$

$$x = (\Lambda_x)\xi + \delta \qquad \text{(iii)}$$

These relate the measured values (the x's and y's) to the latent variables ($\eta$ and $\xi$). If each latent variable has a single 
indicator, then $\eta = y$ and $\xi = x$, and equation (i) becomes: 

$$y = (B)y + (\Gamma)x + \zeta \qquad \text{(i*)}$$

Here, $B$ is the collection of all the $\beta$'s and $\Gamma$ is the collection of all the $\gamma$'s. 

The process of identifying the latent variables and discovering the relationships between the latent variables 
and manifest variables was one of the primary tasks of this study. As mentioned above, a series of factor analyses 
were conducted to determine the number of latent variables or factors that operate in the schools of New York State. 

In the full LISREL model there are eight matrices that represent the relationships between: 1) the dependent 
manifest variables and the dependent latent variables, 2) the independent manifest variables and the independent latent 
variables, 3) the relationship between the dependent and independent latent variables, 4) the relationship between the 
independent latent variables, 5) the relationship between the dependent latent variables, 6) the error, or variance in the 
manifest variables not explained by the model, 7) the error in the latent variables, and 8) the covariance matrix of the 
error terms of the latent variables. 
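
To make the structural equation concrete, the sketch below evaluates the endogenous latent variables from the structural model for a small recursive case. It assumes the coefficient matrix among the endogenous variables is strictly lower triangular, so each endogenous variable depends only on earlier ones; this is an assumption of the sketch, not a statement about the models in this study. All coefficient values are hypothetical, and LISREL itself estimates the coefficients rather than evaluating them this way.

```python
def solve_recursive(beta, gamma, xi, zeta):
    """Evaluate eta from eta = beta*eta + gamma*xi + zeta.

    Assumes a recursive model: beta[i][j] is zero unless j < i, so the
    endogenous variables can be computed one after another.
    """
    n = len(beta)
    eta = [0.0] * n
    for i in range(n):
        # Direct effects of the exogenous latent variables
        direct = sum(gamma[i][k] * xi[k] for k in range(len(xi)))
        # Effects passed along from earlier endogenous latent variables
        indirect = sum(beta[i][j] * eta[j] for j in range(i))
        eta[i] = indirect + direct + zeta[i]
    return eta


# Hypothetical model: one exogenous and two endogenous latent variables
beta = [[0.0, 0.0],
        [0.5, 0.0]]   # eta_2 depends on eta_1
gamma = [[0.4],
         [0.2]]       # both depend on xi_1
xi = [1.0]            # a one-unit value of xi_1
zeta = [0.0, 0.0]     # error terms set to zero for clarity

eta = solve_recursive(beta, gamma, xi, zeta)
print(eta)
```

In this example the total effect of the exogenous variable on the second endogenous variable is the direct path (0.2) plus the indirect path through the first endogenous variable (0.5 times 0.4), which is how a path diagram is read.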

Model 1 - Identification of Variables 

The process of identifying latent variables requires the use of judgment as well as mathematical logic. The 
relationship that the variables have with each other is a function of their definition. Therefore, more than one model 
was developed using linear structural modeling. It is surmised that comparing and contrasting the two 
models will increase practical understanding of science instruction in New York State schools. In both cases, a 
general model was developed and confirmed with the LISREL computer program. Path coefficients that were found 
to be insignificant were eliminated. Path coefficients that had been omitted and were found to have a high and 
significant modification index were added to the model. The modification index is a non-scaled value that is assigned 
by LISREL to all of the path coefficients which are forced to be zero by the experimenter. Variables which are 
highly related and have path coefficients fixed to zero are assigned a high modification index. This is an indication to 
the experimenter that the variables might be related. It is the decision of the experimenter to change the model and 
have the program calculate a value for a coefficient for a path which had been initially set to zero. 

The initial structure developed for model 1 was influenced by the separation of variables into two categories, 
one of variables external to the control of the school and one of variables within school control. From the factor analysis 
illustrated below, eight factors were postulated, four exogenous and four endogenous. 

[Figure: Factor Analysis Identification of Latent Variables]

Eight latent variables were identified from the previous factor analysis. These latent variables were found to 
be the following: 

Community Size ($\xi_1$): The variable which is external to the school district, and is descriptive of the community. In 
this specific case, it is the size of the community. 

Facilities ($\xi_2$): Variables which are characteristic of the educational resources that are available for use at the 
school. This includes the number of science classrooms per student, the number of video-cassette players as well as 
the number of books per student in the school library. 

Student Attendance ($\xi_3$): Variables which describe the student body of the school, such as dropout rate and the 
average daily attendance. 

Teacher Experience ($\xi_4$): Variables which are characteristic of teachers' experience. These include the highest degree 
earned by a teacher as well as the number of years a teacher has taught in the district. 

Regents Chemistry and Mathematics ($\eta_1$): Variables which are descriptive of the "upper regents" classes, that is 
Regents chemistry and course III mathematics. This includes the proportion of students enrolled in the class as well as 
Regents test scores. 

AP Enrollment & College Bound Graduates ($\eta_2$): Variables which are characteristic of the college preparatory 
program at a school. This includes enrollment in advanced placement courses as well as the percent of graduates that 
attend a four year college. 

Regents Biology and Earth Science ($\eta_3$): Variables which are characteristic of the "lower regents" classes, such as 
earth science and biology. 

Local Chemistry and Biology ($\eta_4$): Variables which are characteristic of classes which do not have a state mandated 
curriculum. This includes enrollments in local biology and local chemistry. 

In addition to these variables, there are two endogenous dependent latent variables, Physics Enrollment ($\eta_5$) 
and Physics Achievement ($\eta_6$). Each of these has only one manifest variable associated with it. These are the 
percent of an average year enrolled in physics (PEP) and the percent of the students enrolled in Regents physics 
passing the Regents exam, respectively. 

Since latent variables are not directly observed, they can in principle have arbitrary units of measure. 
However, since the purpose of constructing a path diagram is to determine the relationships between variables, units 
of measure must be assigned to the latent variables so that meaningful relationships can be found. Here the scales of 
measure for the latent variables will be the same as that of the manifest variable which accounts for most of the 
variance in the measure of the latent variable. The exception to this was in assigning one of the two primary 
independent variables, the percent of students passing the Regents chemistry exam, as defining the scale for the 
Regents latent variable, even though the percent of students enrolled in Math course III accounted for more 
variance. It was felt that since physics is the primary subject under investigation, it would be a more appropriate 
The following table lists the latent variables, the manifest variable used to set the scale and the actual unit 
of measure used. 

Latent Variable                 Manifest Variable                     Unit of Measure

Community Size                  Size of community                     6 point scale, including Big Five,
                                                                      Large City, Small City, Small town

Facilities                      Number of books in library

Student Attendance              Daily attendance                      percent of

Teacher Experience              Teacher degree                        7 point scale, including Masters plus,
                                                                      Bachelors plus, Normal School,
                                                                      Secondary School

                                Percent enrolled in physics           percent of

AP Enrollment                   Percent of last year's graduates      percent of
                                attending 4 year colleges

Regents Bio and Earth Science   Percent enrolled in Regents Biology   percent of
                                or Earth Science

Local Courses                   Percent in Local Biology and          percent of
                                Chemistry

The latent variable model 

After the latent variables have been defined by the collection of manifest variables, the relationship between 
the latent variables is calculated. In the terminology of LISREL, the values of $\beta$ and $\gamma$ are calculated. The $\beta$ terms 
relate the endogenous latent variables to each other and the $\gamma$ terms relate the exogenous latent variables to the endogenous 
latent variables. Paths that were postulated and were not statistically significant were omitted from the diagram. In 
the following diagram, the four exogenous latent variables are at the top of the page. The "Community size $\xi_1$" 
latent variable is at the top of the diagram for clarity. In the logical structure of the diagram, it is at 
the same level as "Facilities," "Teacher Experience" and "Student Attendance." In general, however, there is a trend of 
general or "up-stream" variables at the top of the page and more specific "down-stream" variables at the bottom of 
the page. At the bottom of the diagram are the two dependent latent variables, "physics enrollment" and "physics 
achievement." Each of these latent variables controls only one manifest variable: the percentage of students enrolled 
in all physics classes for "physics enrollment" and the percent of Regents physics students who pass the New York 
State Regents physics exam for "physics achievement." 

When the values of the path coefficients are first examined, it can be seen that there is a range of values, 
from 0.033 to 0.879. These path coefficients represent the change in standard scores of the dependent variable when 
the independent variable is changed one unit. 
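
The interpretation of a standardized coefficient can be illustrated with a single predictor. The sketch below converts both variables to standard scores and computes the least-squares slope between them; with one predictor this slope equals the Pearson correlation. The data are hypothetical, and the sketch assumes a fully standardized solution in which the "one unit" of the independent variable is one standard deviation. It does not reproduce the LISREL estimates reported here.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to standard scores (mean 0, standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def standardized_slope(x, y):
    """Least-squares slope of standardized y on standardized x.

    Because both sets of standard scores have mean zero, the slope is
    sum(zx * zy) / sum(zx * zx).
    """
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


# Hypothetical school-level data
x = [1, 2, 3, 4, 5]   # e.g. a predictor such as teacher experience
y = [2, 4, 5, 4, 5]   # e.g. an outcome such as percent enrolled

coefficient = standardized_slope(x, y)
print(round(coefficient, 3))
```

A coefficient of this kind says how many standard deviations the dependent variable is expected to change for a one standard deviation change in the predictor, which is what makes coefficients on different scales comparable within one diagram.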

When the paths that lead to the two dependent variables of physics achievement and physics enrollment 
are examined, a general trend in the difference between the paths leading to the enrollment and the achievement 
variable can be noted. The paths that lead to the physics enrollment latent variable are all small: teacher experience 
at .412, student attendance at .313, school facilities at -.220. All of the remaining path coefficients to 
physics enrollment were .10 or less. The paths that lead to physics achievement are slightly higher; the highest 
is the path from "community size" to physics achievement, with a value of -0.72. Local chemistry and biology are 
also strongly associated with physics achievement (-.697), as is student attendance (.517). 

From this it can be concluded that one of the strongest forces in increasing enrollment in physics is the 
experience of the teacher. While the teacher can bring the students into the classroom, the variables that are somewhat 
outside the control of the school, that is the size of the community and the students' attendance patterns, are 
more closely related to physics achievement. This strong relationship to the community latent variables can also be 
seen in the local chemistry and biology variable (.441), as well as the Regents biology and earth science variable 
(-.879). The negative path coefficient here indicates that the larger communities have a lower percentage of students 
enrolled in Regents biology and earth science. This trend of a strong relationship between community and enrollment 
is broken in the AP enrollment latent variable, where the experience of the teacher has the strongest relationship to 
AP enrollment. While it is possible that there is an association between the AP enrollment and teacher experience, it 
also is possible that the variables external to the school, such as the size of the community, are acting through the 
teacher experience variable. It is not possible to determine that potential relationship from this diagram; the 
community size and teacher experience (.553) are both exogenous latent variables and by design are independent of 
each other. 

Essentially, the model illustrates the combined effects of school and community variables on physics 
enrollment and achievement. In general, schools that have high physics enrollment also have more experienced 
teachers. Schools that have a high passing rate on the Regents physics exam tend to be in smaller communities. 



Model 2 

In an attempt to better describe physics enrollment patterns, a second model of physics enrollment is presented here. 
The primary difference between the first and second model is in the number of manifest variables used to define the latent 
variables and the number of categories in which the latent variables were grouped. The second model has more of both. It is 
not necessarily true that one model is better than the other, since they both have statistically significant path coefficients and 
both have a high goodness of fit index (0.77 for model 2 and 0.83 for model 1). The motivation for the second model was the 
need for a level of intermediate school wide variables. The second model also has more manifest indicators for the latent 
variables. In the case of the first model, most of the latent variables had one or two indicators. In the case of the second 
model, most of the latent variables have from three to six manifest variables associated with each latent variable. This second 
model is illustrated here. 




The latent variables in this model were organized in levels from the most global or general to those that are 
most specific. This allows the variance to flow either from the exogenous variables directly to the dependent latent 
variable or through the school, via school level variables, science department level variables, and physics 
instruction variables. The most striking result is that most of the variance in both physics enrollment and physics 
achievement can be associated with the variables which do not flow from the school. However, this finding is 
consistent with some other research that indicates a similar amount of common variance between school variables and 
achievement: 30% Spelhaug (1990), 5-15% Talton (1983). This second model used 30 manifest variables and 14 
latent variables to explain physics enrollment and physics achievement in schools in New York State. According to 
the recommendations of Bryant (1974) and others, the model has met the goal as described above. Specifically, 1) 
each of the variables in the model was theoretically justified, 2) the model had several zero path coefficients, 3) all of 
the correlations were statistically significant, 4) all of the regression equations used to calculate the path coefficients 
were significant at the 0.01 level, 5) none of the beta weights had a magnitude less than one standard error of the beta 
weight, and 6) at least 50% of the variance in the PEP is predicted by the model. 

The model meeting the above criteria, in combination with an adjusted goodness of fit index of 0.85, 
indicates that the model constructed here is a satisfactory model of physics enrollment and achievement in New York 
State high schools. 


There are six major points that can summarize the results of this study. 1) According to the variables that 
were used here, there has been little change in the enrollment of students in high school physics in New York State 
in the past 20 years. The average PEP in New York State is around 20%, which is a similar figure to what has been 
found in previous studies. The factors that influence the enrollment also were identified as being the same in 1991 as 
in 1971. 2) School level variables, which include the size of the school, the instructional facilities and teacher 
experience, have the strongest relationship to physics enrollment. 3) Variables external to the school, such as 
community size and student ability, have the greatest effect on physics achievement. 4) Larger schools have 
disproportionately fewer students enrolled in physics. 5) Schools with a higher percentage of graduates attending 
college also have a higher PEP. They also have a greater percentage of teachers with advanced degrees. 6) In schools 
with a higher percentage of females enrolled in physics, the relationship between achievement in physics and 
achievement in previous science courses is much stronger. 

The combination of results 2) and 3) indicates the independent nature of achievement and enrollment. The 
school is capable of increasing the enrollment of physics, but it is essentially the student and the student level 
variables that are more associated with success in physics than the school level variables. 

Part of the purpose of this study was to investigate the development of educational models. Therefore, two 
latent variable structural models were constructed. In the comparison of the two models, some of the relationships 
between variables were similar. The strength of a relationship between these variables is heightened due to this 





Bryant, L. T. A Path Analysis Model for Secondary Physics Enrollments. Doctoral Dissertation, State University 
of New York at Buffalo, February (1974) 

Bryant, L. T. A Path Analysis Model for Secondary Physics Enrollments. Journal of Research in Science Teaching. 
14(3): 177-189 (1979) 

Chandavarkar, NL. The Teaching and Learning of Physics in the United States. Doctoral Dissertation, Teachers College, 
Columbia University, New York (1988) 

Darmondy, J. Development of Composite Background Variables. Second IEA Science Study, IEA/SISS bulletin 121 

Franz, J. The Crisis in High School Physics Teaching: Paths to a Solution. Physics Today v36 n9 September 

Jöreskog, K. and Sörbom, D. LISREL VI: Analysis of Linear Structural Relationships by the Method of Maximum 
Likelihood. Mooresville, Indiana, Scientific Software (1984) 

Neuschatz, M., Covalt, M. 1986-87 Nationwide Survey of Secondary School Teachers of Physics. American 
Institute of Physics, New York (1988) 

Noonan, R., Wold, H. Evaluation in Education: An International Review Series. Evaluating School Systems Using 
Partial Least Squares. Volume 7, Number 3; Pergamon Press, New York (1983) 

Weiss, I. R. Report of the 1985-1986 National Survey of Science and Mathematics Education. Research Triangle 
Institute, North Carolina (1987) 

Welch, Wayne. Some Characteristics of High School Physics Students: Circa 1968. Journal of Research in 
Science Teaching. 6: 242-247 (1969) 