This report presents findings on crime and violence in U.S. public schools, using data from the 2015-16 School Survey on Crime and Safety (SSOCS:2016). First administered in school year 1999-2000 and repeated in school years 2003-04, 2005-06, 2007-08, 2009-10, and 2015-16, SSOCS provides information on school crime-related topics from the perspective of schools. Developed and managed by the National Center for Education Statistics (NCES) within the Institute of Education Sciences of the U.S....

Topics: ERIC Archive, Crime, Violence, Discipline, School Safety, Public Schools, School Surveys, National...

Multilevel models (MLMs) have proven themselves to be very useful in social science research, as data from a variety of sources is sampled such that individuals at level-1 are nested within clusters such as schools, hospitals, counseling centers, and business entities at level-2. MLMs using restricted maximum likelihood estimation (REML) provide researchers with accurate estimates of parameters and standard errors at all levels of the data when the assumption of normality is met, and outliers...

Topics: ERIC Archive, Hierarchical Linear Modeling, Comparative Analysis, Computation, Robustness...

Poor quality early childhood education and care (ECEC) can be detrimental to the development of children as it could lead to poor social, emotional, educational, health, economic, and behavioural outcomes. The lack of consensus as to the strength of the relationship between teacher qualification and the quality of the early childhood learning environment has made it difficult for policymakers and educational practitioners alike to settle on strategies that would enhance the learning outcomes...

Topics: ERIC Archive, Early Childhood Education, Teacher Qualifications, Educational Environment,...

A simulation study is presented to evaluate and compare three methods to estimate the variance of the estimates of the parameters d and "C" of the signal detection theory (SDT). Several methods have been proposed to calculate the variance of their estimators, "d'" and "c." Those methods have been mostly assessed by comparing the empirical means and variances in simulation studies with the calculations done with the parametric values of the probabilities of giving a...

Topics: ERIC Archive, Evaluation Methods, Theories, Simulation, Statistical Analysis, Stimuli, Maximum...
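
For readers unfamiliar with the quantities named in the entry above, the following is a minimal sketch of the usual point estimates of sensitivity and criterion under the equal-variance Gaussian SDT model. It is not taken from the study: the function name `sdt_estimates` and the input rates are illustrative assumptions, and the three variance methods the study compares are not reproduced here.

```python
from statistics import NormalDist


def sdt_estimates(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, c) for the equal-variance Gaussian SDT model.

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError otherwise.
    """
    z = NormalDist().inv_cdf           # inverse standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity estimate
    criterion = -0.5 * (z_hit + z_fa)  # response-bias estimate
    return d_prime, criterion


# Hypothetical rates, chosen only for illustration.
d_prime, criterion = sdt_estimates(0.8, 0.2)
```

With symmetric rates such as 0.8 and 0.2, the criterion estimate is zero (no response bias) and all of the separation shows up in the sensitivity estimate.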

Observational studies are common in educational research, where subjects self-select or are otherwise non-randomly assigned to different interventions (e.g., educational programs, grade retention, special education). Unbiased estimation of a causal effect with observational data depends crucially on the assumption of ignorability, which specifies that potential outcomes under different treatment conditions are independent of treatment assignment, given the observed covariates. The primary goals...

Topics: ERIC Archive, Computation, Influences, Observation, Data, Selection, Simulation, Methods,...

The research reported here uses a pre/post-test model and stimulated recall interviews to assess teachers' statistical reasoning about comparing distributions, when enrolled in a graduate-level statistics education course. We discuss key aspects of the course design aimed at improving teachers' learning and teaching of statistics, and the resulting different ways of reasoning about comparing distributions that teachers exhibited before and after the course.

Topics: ERIC Archive, Faculty Development, Thinking Skills, Graduate Students, Statistics, Instructional...

Regression, weighting and related approaches to estimating a population mean from a sample with nonrandom missing data often rely on the assumption that conditional on covariates, observed samples can be treated as random. Standard methods using this assumption generally will fail to yield consistent estimators when covariates are measured with error. We review approaches to consistent estimation of a population mean of an incompletely observed variable using error-prone covariates, noting...

Topics: ERIC Archive, Simulation, Computation, Statistical Analysis, Statistical Bias, Regression...

The purpose of this inquiry was to investigate the effectiveness of item response theory (IRT) proficiency estimators in terms of estimation bias and error under multistage testing (MST). We chose a 2-stage MST design in which 1 adaptation to the examinees' ability levels takes place. It includes 4 modules (1 at Stage 1, 3 at Stage 2) and 3 paths (low, middle, and high). When creating 2-stage MST panels (i.e., forms), we manipulated 2 assembly conditions in each module, such as difficulty level...

Topics: ERIC Archive, Item Response Theory, Computation, Statistical Bias, Error of Measurement, Difficulty...

Randomized experiments are commonly used to evaluate the effectiveness of educational interventions. The goal of the present investigation is to develop small-sample corrections for multiple contrast hypothesis tests (i.e., F-tests) such as the omnibus test of meta-regression fit or a test for equality of three or more levels of a categorical moderator. Drawing on work that addresses related, simpler problems and special cases of cluster-robust variance estimation, the authors develop three...

Topics: ERIC Archive, Randomized Controlled Trials, Sample Size, Effect Size, Hypothesis Testing,...

Regression discontinuity design (RD) has been widely used to produce reliable causal estimates. Researchers have validated the accuracy of RD design using within-study comparisons (Cook, Shadish & Wong, 2008; Cook & Steiner, 2010; Shadish et al., 2011). Within-study comparisons examine the validity of a quasi-experiment by comparing its estimates to trustworthy benchmarks (usually an experiment) with the same treatment group. First developed by Lalonde (1986), it is a rigorous method to...

Topics: ERIC Archive, Pretests Posttests, Statistical Bias, Accuracy, Regression (Statistics), Research...

The goal of this study is to better understand how methods for estimating treatment effects of latent groups operate. In particular, the authors identify where violations of assumptions can lead to biased estimates, and explore how covariates can be critical in the estimation process. For each set of approaches, the authors first review the assumptions necessary for identification and discuss practical issues that arise in estimation; second, they then examine how covariates allow for improved...

Topics: ERIC Archive, Computation, Statistical Analysis, Statistical Bias, Outcomes of Treatment,...

A valuable extension of the single-rating regression discontinuity design (RDD) is a multiple-rating RDD (MRRDD). To date, four main methods have been used to estimate average treatment effects at the multiple treatment frontiers of an MRRDD: the "surface" method, the "frontier" method, the "binding-score" method, and the "fuzzy instrumental variables" method. This paper uses a series of simulations to evaluate the relative performance of each of these...

Topics: ERIC Archive, Regression (Statistics), Research Design, Quasiexperimental Design, Research...

Randomized controlled trials (RCTs) and regression discontinuity (RD) studies both provide estimates of causal effects. A major difference between the two is that RD only estimates local average treatment effects (LATE) near the cutoff point of the forcing variable. This has been cited as a drawback to RD designs (Cook & Wong, 2008). Comparisons of RCT estimates of average treatment effect (ATE) and RD estimates of LATE are rare because few studies have both randomized assignment and a...

Topics: ERIC Archive, Randomized Controlled Trials, Regression (Statistics), Research Problems, Comparative...

When randomized controlled trials (RCTs) are not feasible, researchers seek other methods to make causal inferences, e.g., propensity score methods. One of the underlying assumptions for propensity score methods to obtain unbiased treatment effect estimates is the ignorability assumption, that is, conditional on the propensity score, treatment assignment is independent of the outcome. The purpose of this study is to use within-study comparisons to assess how well propensity score methods can...

Topics: ERIC Archive, Educational Research, Benchmarking, Statistical Analysis, Computation, Comparative...

Meeting the What Works Clearinghouse (WWC) attrition standard (or one of the attrition standards based on the WWC standard) is now an important consideration for researchers conducting studies that could potentially be reviewed by the WWC (or other evidence reviews). Understanding the basis of this standard is valuable for anyone seeking to meet existing standards and for anyone interested in adopting this approach to developing a standard (that is, combining a theoretical model with empirical...

Topics: ERIC Archive, Attrition (Research Studies), Student Attrition, Randomized Controlled Trials,...

A central goal of the education literature is to demonstrate that specific educational interventions--instructional interventions at the student or classroom level, structural interventions at the school level, or funding interventions at the school district level, for example--have a "treatment effect" on student achievement. This paper has three objectives. First, Theobald and Richardson explain both how Single World Intervention Templates (SWITs) unify two existing approaches to...

Topics: ERIC Archive, Intervention, Educational Research, Pretests Posttests, Outcome Measures,...

This article presents a method for addressing the self-selection bias of students who participate in learning communities (LCs). More specifically, this research utilizes equivalent comparison groups based on selected incoming characteristics of students, known as bootstraps, to account for self-selection bias. To address the differences in academic preparedness in the fall 2012 cohort, three stratified random samples of students were drawn from the non-LC population to match the LC cohort in...

Topics: ERIC Archive, College Freshmen, First Year Seminars, Student Participation, Communities of...

This paper examines sources of potential bias in systematic reviews and meta-analyses which can distort their findings, leading to problems with interpretation and application by practitioners and policymakers. It follows from an article that was published in the "Canadian Journal of Communication" in 1990, "Integrating Research into Instructional Practice: The Use and Abuse of Meta-analysis," which introduced meta-analysis as a means for estimating population parameters and...

Topics: ERIC Archive, Meta Analysis, Statistical Bias, Data Interpretation, Accuracy, Research Problems,...

We explore the use of instrumental variables (IV) analysis with a multi-site randomized trial to estimate the effect of a mediating variable on an outcome in cases where it can be assumed that the observed mediator is the only mechanism linking treatment assignment to outcomes, an assumption known in the instrumental variables literature as the exclusion restriction. We use a random-coefficient IV model that allows both the impact of program assignment on the mediator (compliance with...

Topics: ERIC Archive, Statistical Bias, Statistical Analysis, Least Squares Statistics, Sampling,...

In this paper, we examine the validity and precision of two nonexperimental study designs (NXDs) that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. In a CITS design, program impacts are evaluated by looking at whether the treatment group deviates from its "baseline trend" by a greater amount than the comparison group. The DD design is a simplification of the CITS design--it evaluates the...

Topics: ERIC Archive, Research Design, Educational Assessment, Time, Intervals, Reading Programs,...
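
As a rough illustration of the simpler of the two designs named in the entry above, the sketch below computes the textbook two-group, two-period difference-in-difference estimate: the treatment group's change in mean outcome minus the comparison group's change. This is an assumption about the general form of a DD estimate, not the paper's own specification, and the function name and scores are hypothetical.

```python
from statistics import mean


def diff_in_diff(treat_pre, treat_post, comp_pre, comp_post):
    """Textbook two-group, two-period difference-in-difference estimate.

    Each argument is a sequence of outcome scores for one group in one period.
    """
    treat_change = mean(treat_post) - mean(treat_pre)  # change in treated group
    comp_change = mean(comp_post) - mean(comp_pre)     # change in comparison group
    return treat_change - comp_change


# Hypothetical test scores, made up for illustration only.
estimate = diff_in_diff(
    treat_pre=[50, 52], treat_post=[58, 60],
    comp_pre=[49, 51], comp_post=[52, 54],
)
```

Here the treated group gains 8 points and the comparison group gains 3, so the estimate is 5. A CITS design, as described in the entry, would instead compare each group's deviation from its own baseline trend, which requires several pre-period observations.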

This brief considers the problem of using value-added scores to compare teachers who work in different schools. The author focuses on whether such comparisons can be regarded as fair, or, in statistical language, "unbiased." An unbiased measure does not systematically favor teachers because of the backgrounds of the students they are assigned to teach, nor does it favor teachers working in resource-rich classrooms or schools. A key caveat: A measure that is unbiased does not mean the...

Topics: ERIC Archive, Educational Research, Achievement Gains, Teacher Effectiveness, Comparative Analysis,...

The increasing availability of data from multi-site randomized trials provides a potential opportunity to use instrumental variables methods to study the effects of multiple hypothesized mediators of the effect of a treatment. We derive nine assumptions needed to identify the effects of multiple mediators when using site-by-treatment interactions to generate multiple instruments. Three of these assumptions are unique to the multiple-site, multiple-mediator case: 1) the assumption that the...

Topics: ERIC Archive, Causal Models, Measures (Individuals), Research Design, Context Effect, Compliance...

A key issue in quasi-experimental studies, and in many evaluations that require a treatment effects design (i.e., a control or experimental group), is selection bias (Shadish et al., 2002). Selection bias refers to the selection of individuals, groups or data for analysis such that proper randomization is not achieved, thereby ensuring that the sample obtained is not representative of the population intended to be analyzed (Shadish et al., 2002). There are many ways in which selection bias...

Topics: ERIC Archive, Quasiexperimental Design, Probability, Scores, Least Squares Statistics, Regression...

A central issue in nonexperimental studies is identifying comparable individuals to remove selection bias. One common way to address this selection bias is through propensity score (PS) matching. PS methods use a model of the treatment assignment to reduce the dimensionality of the covariate space and identify comparable individuals. Parallel to the PS, recent literature has developed the prognosis score (PG) to construct models of the potential outcomes (Hansen, 2008). Whereas PSs summarize...

Topics: ERIC Archive, Probability, Scores, Statistical Bias, Prediction, Monte Carlo Methods, Kelcey, Ben

The goal of this paper is to provide guidance for applied education researchers in using multi-level data to study the effects of interventions implemented at the school level. Two primary approaches are currently employed in observational studies of the effect of school-level interventions. One approach employs intact school matching: matching schools that are implementing the treatment to schools not implementing the treatment that are similar in observable characteristics. An alternative...

Topics: ERIC Archive, Matched Groups, Intervention, Randomized Controlled Trials, Elementary Schools,...

Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has traditionally been the most frequently used method for modeling selection in PSA. There are, however, circumstances under which logistic regression may not...

Topics: ERIC Archive, Probability, Scores, Statistical Analysis, Statistical Bias, Quasiexperimental...

The surgical theatre educational environment measures STEEM, OREEM and mini-STEEM for students (student-STEEM) contain a hitherto disregarded systematic overestimation (OE) due to inaccurate percentage calculation. The aim of the present study was to investigate the magnitude of and suggest a correction for this systematic bias. After an initial theoretical exploration of the problem, published scores were retrieved from the literature and corrected using statistical theorems....

Topics: ERIC Archive, Educational Environment, Scores, Grade Prediction, Academic Standards, Scoring,...

There has been an active debate in the literature over the validity of value-added models. In this study, the author tests the central assumption of value-added models that school assignment is random relative to expected test scores conditional on prior test scores, demographic variables, and other controls. He uses a Chicago charter school's lottery to identify school effects, and then compares this "experimental" estimate to that of a school value-added model, which is estimated...

Topics: ERIC Archive, Value Added Models, Charter Schools, School Effectiveness, Statistical Bias,...

The current main world university rankings broadly group the leading research universities of nations. Australia's Go8 universities are generally within the top 250 ranked universities, with several institutions in the top 50-100 on some measures. This recognition is commendable, however imperfect the individual rankings may be. Use is made of rankings by prospective students, governments and universities themselves. There is not always a good alignment between the purposes for which rankings...

Topics: ERIC Archive, Evaluation Methods, Foreign Countries, Public Policy, Research Universities,...

A teacher's value-added score is intended to convey how much that teacher has contributed to student learning in a particular subject in a particular year. Different school districts define and compute value-added scores in different ways. A variety of people may see value-added estimates, and each group may use them for different purposes. Teachers themselves may want to compare their scores with those of others and use them to improve their work. Administrators may use them to make decisions...

Topics: ERIC Archive, Teacher Effectiveness, Achievement Tests, Statistical Bias, Teacher Evaluation,...

The main goal of this study was to illustrate and provide some direction for dealing with the complexities of propensity score matching within different multilevel contexts. Special attention is given to how procedures typically applied in a non-hierarchical setting may be modified to properly reduce the expected bias in the estimated treatment effect of a high school-level intervention on college-level outcomes. In particular, students self-selected into a high school level intervention and...

Topics: ERIC Archive, Probability, Scores, Statistical Bias, High School Students, College Students,...

Selection bias is problematic when evaluating the effects of postsecondary interventions on college students, and can lead to biased estimates of program effects. While instrumental variables can be used to account for endogeneity due to self-selection, current practice requires that all five assumptions of instrumental variables be met in order to credibly estimate the causal effect of a program. Using the Pike et al. (2011) study of selection bias and learning communities as an example, the...

Topics: ERIC Archive, Statistical Bias, College Students, Educational Research, Statistical Analysis,...

This fifth and final paper in the Fordham Institute's series examining digital learning policy is "Overcoming the Governance Challenge in K-12 Online Learning". The purpose of this report is to outline the steps required to move the governance of K-12 online learning from the local district level to the less restrictive state level and to create a free market for corporate innovation in K-12 online learning. Unfortunately, the report is based on an unsupported premise that K-12 online...

Topics: ERIC Archive, Evidence, Electronic Learning, Free Enterprise System, Elementary Secondary...

Schools and school systems throughout the nation are increasingly experimenting with using various instructional technologies to improve productivity and decrease costs, but evidence on both the effectiveness and the costs of education technology is limited. A recent report published by the Thomas B. Fordham Institute sets out to describe "the size and range of the critical cost drivers for online schools in comparison to traditional brick-and-mortar schools" (p. 2). The study divides...

Topics: ERIC Archive, Evidence, Electronic Learning, Distance Education, Online Courses, Educational...

When most people think of the perks of teaching, an image that comes to mind is a shiny apple presented by a gap-toothed pupil. A recent paper by Jason Richwine of the Heritage Foundation and Andrew Biggs of the American Enterprise Institute claims that public school teachers enjoy lavish benefits that are more valuable than their base pay and twice as generous as those of private-sector workers (Richwine and Biggs 2011). According to Richwine and Biggs, this makes teachers' total compensation...

Topics: ERIC Archive, Educational Attainment, Public School Teachers, Salary Wage Differentials, Teacher...

Teacher pension systems target retirements within a narrow range of the career cycle by penalizing individuals who separate too soon or remain employed too long. The penalties result in the retention of some teachers who would otherwise choose to leave, and the premature exit of some teachers who would otherwise choose to stay. We examine how the effects of teachers' pension incentives on workforce composition influence teacher quality. Teachers who are held in by the "pull"...

Topics: ERIC Archive, Teacher Retirement, Retirement Benefits, Incentives, Teacher Effectiveness,...

In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value-added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM estimates of current teacher contributions to student learning. Rothstein's finding is significant because there is considerable interest in using VAM...

Topics: ERIC Archive, Value Added Models, Academic Achievement, Teacher Effectiveness, Correlation,...

A new report titled "The Long-Term Impacts of Teachers" concludes that teachers whose students tend to show high gains on their test scores (called "high value-added teachers") also contribute to later student success in young adulthood, as indicated by outcomes such as college attendance and future earnings. To support this claim, it is not sufficient for researchers to show an observed association between teacher value-added and later outcomes in young adulthood. It is...

Topics: ERIC Archive, Evidence, Achievement Gains, High Achievement, Teacher Effectiveness, Outcomes of...

This study uses simulation examples representing three types of treatment assignment mechanisms in data generation (the random intercept and slopes setting, the random intercept setting, and a third setting with a cluster-level treatment and an individual-level outcome) in order to determine optimal procedures for reducing bias and improving precision in each of these three settings. Evaluation criteria include bias, variance, MSE, confidence interval coverage rate, and remaining sample size....

Topics: ERIC Archive, Probability, Statistical Analysis, Statistical Bias, Data Analysis, Yu, Bing, Hong,...

In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value-added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM estimates of current teacher contributions to student learning. More precisely, the falsification test is designed to identify whether or not students...

Topics: ERIC Archive, Teacher Effectiveness, Academic Achievement, Models, Statistical Bias, Computation,...

This paper studies the determinants of college major choice using a unique "information" experiment embedded in a survey. We first ask respondents their "self" beliefs--beliefs about their own expected earnings and other major-specific outcomes conditional on various majors, their "population" beliefs--beliefs about the population distribution of these characteristics, as well as their subjective beliefs that they will graduate with each major. After eliciting...

Topics: ERIC Archive, Majors (Students), Course Selection (Students), Influences, Student Surveys, Student...

Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have proposed various weighting methods (Korn & Graubard, 2003; Pfeffermann, Skinner, Holmes, Goldstein, & Rasbash, 1998) for estimating the parameters of hierarchical models, including random effects models as a special case. In this...

Topics: ERIC Archive, Computation, Statistical Bias, Sampling, Statistical Analysis, Probability,...

The following is a discussion on student level of academic achievement, specifically that of African American learners. The misdiagnosis of Black students having learning disabilities and other disabilities will be examined, and the factors as to why this misdiagnosis occurs so often. Research will be provided as evidence to support this claim, as well as alternate methods of assessing and assisting African American students, especially those who are from poverty-stricken families, as research...

Topics: ERIC Archive, African American Students, Learning Disabilities, Educational Diagnosis,...

The study evaluated the effectiveness of log-linear presmoothing (Holland & Thayer, 1987) on the accuracy of small sample chained equipercentile equatings under two conditions (i.e., using small samples that differed randomly in ability from the target population "versus" using small samples that were distinctly different from the target population). Results showed that equating with small samples (e.g., N less than 50) using either raw or smoothed score distributions can result...

Topics: ERIC Archive, Equated Scores, Data Analysis, Accuracy, Sample Size, Tests, Statistical Bias, Error...

Of particular import to this study is collider bias originating from stratification on pretreatment variables forming an embedded M or bowtie structural design. That is, rather than assume an M structural design, which suggests that "X" is a collider but not a confounder, the authors adopt what they consider to be a more reasonable position: that "X" is both a collider and a confounder. Accordingly, in this study they examined the extent to which confounder induced bias...

Topics: ERIC Archive, Statistical Bias, Statistical Analysis, Psychometrics, Elementary School Teachers,...

The purpose of this study is to compare, through Monte Carlo simulation, several propensity score methods for approximating a factorial experimental design, and to identify the best approaches for reducing bias and mean square error in parameter estimates of the main and interaction effects of two factors. Previous studies focused more on unbiased estimates of the effects of one factor, or the effects of one factor by the subgroups of another factor. The approaches for the unbiased estimates of the main and...

Topics: ERIC Archive, Research Design, Probability, Monte Carlo Methods, Simulation, Scores, Computation,...

Given the different possibilities of matching in the context of multilevel data and the lack of research on corresponding matching strategies, the author investigates two main research questions. The first research question investigates the advantages and disadvantages of different matching strategies that can be pursued with multilevel data structures. The goal is first to outline possible matching strategies and then to identify an optimal matching strategy for different treatment selection...

Topics: ERIC Archive, Educational Research, Research Methodology, Observation, Causal Models, Inferences,...

A comparison between six rater agreement measures obtained using three different approaches was achieved by means of a simulation study. Rater coefficients suggested by Bennet's [sigma] (1954), Scott's [pi] (1955), Cohen's [kappa] (1960) and Gwet's [gamma] (2008) were selected to represent the classical, descriptive approach, [alpha] agreement parameter from Aickin (1990) to represent loglinear and mixture model approaches and [Delta] measure from Martin and Femia (2004) to represent...

Topics: ERIC Archive, Interrater Reliability, Measurement, Comparative Analysis, Statistical Analysis,...
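
To make one of the classical coefficients named in the entry above concrete, here is a minimal sketch of Cohen's kappa for two raters: observed agreement corrected for the agreement expected by chance from each rater's marginal category frequencies. The function name and ratings are hypothetical, and the other five measures compared in the study are not shown.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of category labels.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both raters use a single identical category for every item.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(count_a[cat] * count_b[cat] for cat in count_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four items by two raters.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

In this toy case the raters agree on three of four items (0.75), chance agreement is 0.5, and kappa is therefore 0.5.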

Since there is no standard national Pre and Post Test for Principles of Finance, akin to the one for Economics, the authors created one by selecting questions from previously administered examinations. The Cronbach's Alpha of 0.851, exceeding the minimum of 0.70 for a reliable pen-and-paper test, indicates that our Test can detect differences in learning outcomes. Improvements between Pre and Post Test scores, statistically significant at the 1% level, in the entire sample and within different...

Topics: ERIC Archive, Finance Occupations, Business Administration Education, Educational Principles,...
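
The reliability statistic reported in the entry above can be sketched as follows, assuming the standard formula for Cronbach's alpha: k/(k-1) times one minus the ratio of summed item variances to the variance of total scores. The function name and the toy score matrix are illustrative assumptions; the study's own data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Requires at least two respondents, at least two items, and non-zero
    variance in the total scores.
    """
    k = len(scores[0])
    items = list(zip(*scores))                        # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data: three respondents, two perfectly consistent items.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

With perfectly consistent items, as in the toy data, alpha reaches its maximum of 1; less consistent items pull it down toward (and potentially below) the 0.70 threshold mentioned in the entry.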

The gold standard in making causal inference on program effects is a randomized trial. Most randomization designs in education randomize classrooms or schools rather than individual students. Such "clustered randomization" designs have one principal drawback: They tend to have limited statistical power or precision. This study aims to provide empirical information needed to design adequately powered studies that randomize schools using data from Florida and North Carolina. The authors...

Topics: ERIC Archive, Test Format, Reading Tests, Norm Referenced Tests, Research Design, Experimental...