Fixed vs Random Factors

Inappropriately Designating a Factor as Fixed or Random

In Analysis of Variance and some other methodologies, there are two types of factors: fixed effect and random effect. Which type is appropriate depends on the context of the problem, the questions of interest, and how the data is gathered. Here are the differences:

Fixed effect factor: Data has been gathered from all the levels of the factor that are of interest.

Example: The purpose of an experiment is to compare the effects of three specific dosages of a drug on the response. "Dosage" is the factor; the three specific dosages in the experiment are the levels; there is no intent to say anything about other dosages.
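As a minimal sketch of what "fixed" means in practice, the snippet below estimates one effect per dosage level, since those specific levels are the ones of interest. The function name `level_effects`, the dosage labels, and the response values are hypothetical, made up only for illustration.

```python
from statistics import mean


def level_effects(data):
    """Estimate one effect per level of a fixed factor.

    data maps each level label to its list of observed responses.
    Each effect is the level mean minus the grand mean of all observations.
    """
    grand = mean(x for values in data.values() for x in values)
    return {level: mean(values) - grand for level, values in data.items()}


# Hypothetical responses for three specific dosages (illustrative only).
responses = {"10 mg": [4, 6], "20 mg": [7, 9], "40 mg": [9, 13]}
print(level_effects(responses))
```

With these made-up numbers the estimated effects are -3, 0, and +3 for the three dosages; the conclusions apply to these three dosages only.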

Random effect factor: The factor has many possible levels, interest is in all possible levels, but only a random sample of levels is included in the data.¹

Example: A large manufacturer of widgets is interested in studying the effect of machine operator on the quality of the final product. The researcher selects a random sample of operators from the large number of operators at the various facilities that manufacture the widgets. The factor is "operator." The analysis will not estimate the effect of each of the operators in the sample, but will instead estimate the variability attributable to the factor "operator."
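One common way to estimate the variability attributable to a random factor is the ANOVA (method-of-moments) estimator for a balanced one-way design. The sketch below assumes that estimator and a balanced layout; the function name `variance_components` and the quality scores are hypothetical, and mixed-model software would more typically use maximum likelihood or REML instead.

```python
from statistics import mean


def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way random effects model.

    groups: list of equal-length lists, one per sampled level (e.g. operator).
    Returns (between_level_variance, within_level_variance).
    """
    a = len(groups)       # number of sampled levels
    n = len(groups[0])    # observations per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    grand = mean(x for g in groups for x in g)
    level_means = [mean(g) for g in groups]

    ms_between = n * sum((m - grand) ** 2 for m in level_means) / (a - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, level_means) for x in g
    ) / (a * (n - 1))

    # E[MS_between] = sigma2_within + n * sigma2_between, so solve for the
    # between-level variance and truncate negative estimates at zero.
    sigma2_between = max(0.0, (ms_between - ms_within) / n)
    return sigma2_between, ms_within


# Hypothetical quality scores for three sampled operators (illustrative only).
scores = [[10, 12], [14, 16], [18, 20]]
print(variance_components(scores))
```

For these made-up scores the estimated between-operator variance is 15 and the within-operator variance is 2. No individual operator effect is reported, only the variance components.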

The analysis of the data is different, depending on whether the factor is treated as fixed or as random. Consequently, inferences may be incorrect if the factor is classified inappropriately. Mistakes in classification are most likely to occur when there is more than one factor in the study.

Example: Two surgical procedures are being compared. Patients are randomized to treatment. Five different surgical teams are used. To prevent possible confounding of treatment and surgical team, each team is trained in both procedures, and each team performs an equal number of surgeries of each of the two types. Since the purpose of the experiment is to compare the procedures, the intent is to generalize to other surgical teams. Thus surgical team should be considered a random factor, not a fixed factor.

Comments:

This example helps explain why inferences might be different for the two classifications of the factor: Asserting that there is a difference in the results of the two procedures regardless of the surgical team is a stronger statement than saying that there is a difference in the results of the two procedures just for the teams in the experiment.
Technically, the levels of the random factor (in this case, the five surgical teams) used in the experiment should be a random sample of all possible levels. This is in practice usually impossible, so the random factor analysis is usually used if there is reason to believe that the teams used in the experiment could reasonably be a random sample of all surgical teams who might perform the procedures. However, this assumption needs careful thought to avoid possible bias. For example, the conclusion would be more sound if it were limited to surgical teams which were trained in both procedures in the same manner and to the same extent, and who had the same surgical experiences, as the five teams actually studied.
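To make the difference in analysis concrete, here is a minimal sketch for a balanced two-factor design with replication. In the standard balanced analysis, the procedure effect is tested against the error mean square when team is fixed, but against the procedure-by-team interaction mean square when team is random. The function name `treatment_f_ratios` and the data are hypothetical, and only three teams with two patients per cell are used to keep the arithmetic easy to follow.

```python
from statistics import mean


def treatment_f_ratios(cells):
    """F ratios for the treatment effect in a balanced two-way layout.

    cells[i][j] is the list of responses for treatment i and team j.
    Returns {"team_fixed": (F, df1, df2), "team_random": (F, df1, df2)}.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])

    grand = mean(x for row in cells for cell in row for x in cell)
    trt_means = [mean(x for cell in row for x in cell) for row in cells]
    team_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_trt = b * n * sum((m - grand) ** 2 for m in trt_means)
    ss_inter = n * sum(
        (cell_means[i][j] - trt_means[i] - team_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_err = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    ms_trt = ss_trt / (a - 1)
    ms_inter = ss_inter / ((a - 1) * (b - 1))
    ms_err = ss_err / (a * b * (n - 1))

    return {
        # Team fixed: treatment is tested against within-cell error.
        "team_fixed": (ms_trt / ms_err, a - 1, a * b * (n - 1)),
        # Team random: treatment is tested against the interaction.
        "team_random": (ms_trt / ms_inter, a - 1, (a - 1) * (b - 1)),
    }


# Hypothetical outcomes: 2 procedures x 3 teams x 2 patients (illustrative only).
outcomes = [
    [[1, 3], [2, 4], [3, 5]],    # procedure A, teams 1-3
    [[5, 7], [4, 6], [9, 11]],   # procedure B, teams 1-3
]
print(treatment_f_ratios(outcomes))
```

With these made-up numbers the procedure effect has F = 24 on (1, 6) degrees of freedom when team is treated as fixed, but F = 12 on (1, 2) degrees of freedom when team is treated as random. The same data give weaker evidence for the broader claim, which is why misclassifying the factor can lead to incorrect inferences.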

Additional Comments about Fixed and Random Factors

The standard methods for analyzing random effects models assume that the random factor has infinitely many levels, but usually still work well if the total number of levels of the random factor is at least 100 times the number of levels observed in the data. Situations where the total number of levels of the random factor is less than 100 times the number of levels observed in the data require special "finite population" methods.
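The rule of thumb above amounts to a simple comparison, sketched here. The function name `needs_finite_population_methods` and the example counts are hypothetical; the factor of 100 is the one stated in the text.

```python
def needs_finite_population_methods(total_levels, observed_levels, ratio=100):
    """Return True if the total number of levels of the random factor is
    less than `ratio` times the number of levels observed in the data."""
    return total_levels < ratio * observed_levels


# Hypothetical counts: 5 sampled teams out of 400, then out of 10,000.
print(needs_finite_population_methods(400, 5))      # 400 < 500, so True
print(needs_finite_population_methods(10_000, 5))   # 10,000 >= 500, so False
```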

An interaction term involving both a fixed and a random factor should be considered a random factor.

A factor that is nested in a random factor should be considered random.

1. Usage of "random" in this and similar contexts is not uniform. For example, some authors, in discussing hierarchical (multilevel) analysis, may refer to an intercept as "random" when interest is restricted to a finite population with all members present in the data (e.g., the various states of the U.S.A.), but the intercept is allowed to be different for different members of the population. Using the term "variable intercept" can help emphasize that, although the intercept is allowed to vary, interest is only in the finite population, with no implication of inference beyond that population.

Last updated Jan 20, 2013

Common Mistakes in Using Statistics: Spotting and Avoiding Them
