9. Experimental Designs

This section only discusses the principles of experimental design. The statistical analysis of these designs is discussed in a later section.

Definitions

A factor is a discrete variable used to classify experimental units. For example, “Gender” might be a factor with two levels, “male” and “female”, and “Diet” might be a factor with three levels, “low”, “medium” and “high” protein. The levels within each factor can be qualitative, such as “Drug A” and “Drug B”, or quantitative, such as 0, 10, 20 and 30 mg/kg.

Fixed effects factors are variables which can be controlled by the investigator. These include gender, dose, diet, genotype (in the case of genetically defined strains) and any treatment which can be administered to the animals. Most experiments are designed to study the fixed effects.

Random effects factors are variables which cannot be controlled by the investigator. They include inter-individual differences, litter effects, time effects and environmental effects like barometric pressure and batch differences in diet and bedding. These effects are responsible for noise (variation) which is of little scientific interest to the investigator. The aim of some experimental designs, such as randomised blocks, is therefore to partition these effects out so that they do not obscure the fixed effects.

The main experimental designs are:

1. The completely randomised design. It has one or more fixed effect factor(s), often called the treatment. Subjects are assigned to treatments at random, regardless of any characteristics or natural structure of the experimental material. This is the commonest design. It is simple and tolerates unequal numbers in each group.

2. The randomised block design. This is also known as a “within-subject”, “crossover” or “matched subjects” design. Note that the term “repeated measures” design is sometimes used for a design where an individual receives different treatments over time (i.e. just like a crossover design). However, the term “repeated measures” is used here for a design where an experimental unit is measured several times without receiving different treatments.

All these designs have one random effect variable which is of no interest and one or more fixed effect factors (treatments) which are of interest. The design is used to:

• Increase power by better control of variation (eliminating some random effect variation such as height from the floor in the animal house).
• Provide a convenient way of breaking the experiment up into smaller, more convenient, parts.
• Take account of some natural structure of the experimental material, such as litter differences when studying pre-weaning animals.
• Increase the generality by sampling slightly different environments.

3. Latin square designs. These have two random effect variables (often designated rows and columns) and one or more fixed effects. They are used to further increase power in special situations taking account of the two sources of random effects.

4. Factorial designs. These have two or more fixed effect factors and, in view of their importance, they are discussed separately. Strictly, they are arrangements of the treatments rather than designs, so it is possible to have a factorial treatment structure in a completely randomised, randomised block or Latin square design.

5. Split plot designs. These are randomised block designs with a factorial treatment structure in which a main effect is confounded with blocks. It may sometimes be possible to design such an experiment by accident because in some circumstances they make good use of experimental subjects. For example, a within-animal experiment is a type of randomised block design. But suppose half the available animals are male and half female. The gender differences would be assessed using whole animals while the treatment differences would be assessed within the animals. This would be a split plot design. Such designs are discussed with factorial designs.

6. Repeated measures designs. In these designs each experimental unit is measured several times without different treatments being applied, and time effects are of interest. Note that some authors use the term “repeated measures designs” for crossover experiments in which a subject receives different treatments over a period of time. Two cases need to be considered:

A. If there are just a few measurements on each individual, then one approach is to reduce the observations to a single number for each experimental unit.

This could be the area under the curve or the time to peak if the response is like plot 1 on the right, the slope of the line or the difference between the first and last few measurements if the response is like plot 2 on the right, or simply the mean of the measurements if there is no apparent trend (like plot 3 on the right).

The design can then be analysed as a completely randomised design using the single number for each subject.

B. If there are many measurements on each individual and the shape of the curve is of interest, as with a growth curve, then specialised methods may be needed. These are beyond the scope of this website.
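The summary measures mentioned under case A can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function names and the example data are hypothetical, and the area under the curve is approximated with the trapezoidal rule, which is one common choice among several.

```python
from statistics import mean

def area_under_curve(times, values):
    """Trapezoidal approximation to the area under the response curve."""
    return sum((t1 - t0) * (v0 + v1) / 2
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))

def time_to_peak(times, values):
    """Time at which the largest response was recorded."""
    return times[values.index(max(values))]

def slope(times, values):
    """Least-squares slope of response against time."""
    t_bar, v_bar = mean(times), mean(values)
    numerator = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    denominator = sum((t - t_bar) ** 2 for t in times)
    return numerator / denominator

# Hypothetical measurements on one experimental unit at five time points.
times = [0, 1, 2, 3, 4]
rise_and_fall = [0, 4, 6, 3, 1]   # response rises and falls: use AUC or time to peak
steady_trend = [1, 3, 5, 7, 9]    # response changes steadily: use the slope
no_trend = [5, 6, 4, 5, 5]        # no apparent trend: use the mean

auc = area_under_curve(times, rise_and_fall)
peak_time = time_to_peak(times, rise_and_fall)
trend = slope(times, steady_trend)
average = mean(no_trend)
print(auc, peak_time, trend, average)
```

Each experimental unit then contributes one number (for example its area under the curve), and those numbers are compared between treatment groups.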

7. Hierarchical designs. In these designs more than one sample is taken from each experimental unit, and in some cases the samples are sub-sampled, as illustrated below, where the liver of each individual is split into three parts, homogenised and then determinations done on two aliquots from each part. The usual aim is to increase power by reducing measurement error. Sometimes the terms “technical replication” and “biological replication” are used. The former refers to replication of measurements on the same experimental unit; the latter refers to replication across different experimental units.

These designs help to answer questions such as whether it is better to do more measurements on each experimental unit (which could be relatively inexpensive) or use more experimental units, if the aim is to increase power. In general if the measurements on each experimental unit are variable, then that is where there should be more replication. If they are similar, then more experimental units should be used (ethical considerations being taken into account). These designs are not discussed in any more detail here.
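The trade-off described above can be illustrated numerically. The sketch below assumes a simple balanced two-level model (an assumption made here for illustration, not something specified in this section), in which the variance of a group mean is the between-unit variance divided by the number of units, plus the within-unit (measurement) variance divided by the total number of measurements. The function name and the variance values are hypothetical.

```python
def variance_of_group_mean(var_between, var_within, n_units, n_measurements):
    """Variance of a treatment-group mean under a balanced two-level model.

    var_between    : variance among experimental units (biological)
    var_within     : variance among measurements on one unit (technical)
    n_units        : experimental units per group
    n_measurements : measurements taken on each unit
    """
    return var_between / n_units + var_within / (n_units * n_measurements)

# Hypothetical case 1: measurements on a unit are very variable.
noisy_baseline = variance_of_group_mean(1.0, 4.0, n_units=6, n_measurements=1)
noisy_more_measurements = variance_of_group_mean(1.0, 4.0, n_units=6, n_measurements=4)

# Hypothetical case 2: measurements on a unit are similar, units differ a lot.
stable_baseline = variance_of_group_mean(4.0, 1.0, n_units=6, n_measurements=1)
stable_more_measurements = variance_of_group_mean(4.0, 1.0, n_units=6, n_measurements=4)
stable_more_units = variance_of_group_mean(4.0, 1.0, n_units=12, n_measurements=1)

print(noisy_baseline, noisy_more_measurements)
print(stable_baseline, stable_more_measurements, stable_more_units)
```

With these made-up numbers, extra measurements per unit reduce the variance substantially in the first case but only slightly in the second, where adding experimental units is far more effective.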

8. Designs measuring association: correlation and regression. In this type of experiment the aim is to see whether there is any association between two variables and, if so, what its nature is. If the variables are associated but one does not cause the other, then the association can be studied and quantified using correlation. However, if altering one variable (an independent variable), such as dose rate, may cause some other variable (a dependent variable), such as red blood cell count, to change, then this is studied using regression analysis. These designs are considered in a separate section.

9. Other less commonly used designs. These include incomplete block designs, where there is a natural structure to the experimental material but the number of treatments exceeds the natural block size, and sequential designs, where the experiment continues until certain criteria of success are achieved. These designs are rare (although important in some special situations) and are not described here.

1. The completely randomised design

This is the simplest design. Each experimental unit is assigned to a treatment strictly at random without taking account of any individual characteristics. It is best used when relatively homogeneous experimental units are available. It can tolerate unequal numbers in each group and is perfectly adequate in many experimental situations. Following treatment, investigators should (where possible) be blinded by using only the animal numbers when making measurements.

In the figure above, relatively homogeneous experimental units (animals, cages of animals etc.) were assigned at random (using Excel as previously described) to treatments gray, green, red and purple. The subjects can be housed, treated and measured in any order.
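The same kind of random assignment can be sketched in Python instead of a spreadsheet. This is a minimal illustration with hypothetical names; it assumes 20 units and equal group sizes, and the seed is fixed only so that the allocation can be reproduced.

```python
import random

def completely_randomised(units, treatments, seed=None):
    """Assign each unit to a treatment at random, with near-equal group sizes."""
    rng = random.Random(seed)
    # Repeat the treatment labels until there is one label per unit.
    labels = [treatments[i % len(treatments)] for i in range(len(units))]
    # Shuffle the labels so that the allocation ignores every unit characteristic.
    rng.shuffle(labels)
    return dict(zip(units, labels))

units = list(range(1, 21))                        # 20 hypothetical animal numbers
treatments = ["gray", "green", "red", "purple"]
assignment = completely_randomised(units, treatments, seed=1)

for unit, treatment in assignment.items():
    print(unit, treatment)
```

Recording the seed alongside the allocation makes it possible to regenerate the list later, while the measurements themselves can still be made against animal numbers only.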

The fact that four of the five gray subjects are in the first ten and four of the five purple subjects are in the last ten would not matter in most cases. However, if, for example, surgery is involved, the surgeon's skill may increase as the experiment proceeds, leading to a bias against gray.

If the experiment needs to be split up (e.g. if applying the treatments or making the measurements takes several hours or days), then this can be done in any way as the subjects have already been randomised. However, if splitting the experiment up in this way is likely to introduce an unknown source of variation, then the design loses power. In such circumstances a randomised block design might be preferable.

This design will normally be analysed using a one-way analysis of variance, or a t-test if there are only two groups.

2. The randomised block design

A randomised block design is used to control a source of random variation which might otherwise obscure the effect of a treatment.

In this design the experimental material is split up into a number of “mini-experiments”, typically with one subject on each treatment. It is assumed that differences between treatments are of interest while differences between blocks, which are random effects, are of no interest.

Subjects are matched using any criteria available at the time the experiment is started. This might be on size (as above), space (e.g. location within the animal house such as shelf level) or time (as in within-litter experiments, where litters are infrequent). Blocks can differ in several ways at the same time. For example, block 1 might be large animals held on the top shelf and processed on day 1.

Although it is usual to have only a single experimental unit of each treatment in a block, it is possible to have two or more. In that case there will be two error terms. One will be calculated from the differences among individuals within a block and the other from the block-by-treatment interaction. If these do not differ significantly, they can be combined (see the statistical analysis section).

Randomisation in a randomised block design

It could be tedious to randomise each block separately, so here is an alternative, assuming six treatments A, B, C, D, E and F are to be assigned in three blocks (but it can be adjusted to any number of blocks and treatments).

The first column has the animal number. The second column is a random number expressed to three decimal places. The third column is the treatment assignment. Within each block, treatment A is assigned to the lowest number, treatment B to the next one, etc.
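The procedure just described (a random number per animal, then treatments assigned in rank order within each block) can be sketched in Python. The function name and the fixed seed are illustrative assumptions; animals are assumed to be numbered consecutively, with the first six forming block 1, the next six block 2, and so on.

```python
import random

def randomise_blocks(n_blocks, treatments, seed=None):
    """Return rows of (block, animal, random number, treatment)."""
    rng = random.Random(seed)
    rows = []
    animal = 1
    for block in range(1, n_blocks + 1):
        # Give each animal in the block a random number to three decimal places.
        members = [(animal + i, round(rng.random(), 3))
                   for i in range(len(treatments))]
        animal += len(treatments)
        # Rank animals within the block: lowest number gets the first treatment.
        # Ties (rare at three decimals) keep the original animal order.
        ranked = sorted(members, key=lambda member: member[1])
        assigned = {a: t for (a, _), t in zip(ranked, treatments)}
        for a, number in members:
            rows.append((block, a, number, assigned[a]))
    return rows

rows = randomise_blocks(3, ["A", "B", "C", "D", "E", "F"], seed=1)
for block, animal_number, number, treatment in rows:
    print(block, animal_number, f"{number:.3f}", treatment)
```

Every block ends up with exactly one animal on each treatment, which is the defining property of the design.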

Variants of the randomised block design

A matched-pairs design. This will normally be analysed using a paired t-test or a two-way ANOVA without interaction. Matched pairs of subjects have been assigned at random to the gray or green treatments.

This represents a “before and after” experiment. While this can be regarded as a randomised block design, true randomisation is not possible. You cannot have an “after” before a “before”. The assumption must be made that measuring the subject before applying the treatment does not alter the subject.

A “crossover” design in which the experimental unit is the animal (or other entity) for a period of time.

Each subject receives different treatments sequentially and it is assumed that the treatment does not permanently alter the subject. The blocking factor is time, with all animals being measured at each time.

Individual animals can be “blocks”. In this case different treatments are applied to the shaved back of an animal. The experimental unit is an area of skin and it is assumed that the treatments do not interact with each other.

Blocks can be set up at different times (even weeks apart) and/or housed in different locations.

The main advantages of the RB design are that:

• It can deal with heterogeneous material by matching subjects in each block (increasing power).
• It can take account of any natural structure in the experimental material (e.g. litters).
• It is often more convenient to break the experiment down into smaller bits which can then be handled and measured more carefully in the available time.

The main disadvantages of the RB design are that:

• It is not very tolerant of missing observations.
• It should not be used for very small experiments (say fewer than about 16 experimental units in total) because there may be a loss of power.

3. The Latin square design

The number of subjects is the number of treatments squared.

This is a 5x5 Latin square. It has five rows, five columns and five treatments (Gray, Red, Purple, Green, Brick). Note that there is one of each treatment in each row and in each column. It can be written by writing out the first line, then starting the second line with the second treatment from the first line (red in this case), with the first from the first line (gray) going on the end. And so on for the rest of the lines.

It has not yet been randomised. To maintain the layout we randomise whole rows and then whole columns.
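The construction and randomisation steps can be sketched in Python as follows. The function names are hypothetical and the seed is fixed only for reproducibility. The first function writes out the cyclic square described above, and the second permutes whole rows and then whole columns, which preserves the Latin square property.

```python
import random

def cyclic_latin_square(treatments):
    """Each row is the previous row shifted one place to the left."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

def randomise_latin_square(square, seed=None):
    """Shuffle whole rows, then whole columns."""
    rng = random.Random(seed)
    n = len(square)
    row_order = list(range(n))
    col_order = list(range(n))
    rng.shuffle(row_order)
    rng.shuffle(col_order)
    return [[square[r][c] for c in col_order] for r in row_order]

def is_latin_square(square):
    """Check that every treatment occurs once in each row and each column."""
    symbols = set(square[0])
    rows_ok = all(len(row) == len(symbols) and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == len(square) and rows_ok and cols_ok

treatments = ["Gray", "Red", "Purple", "Green", "Brick"]
square = cyclic_latin_square(treatments)
randomised = randomise_latin_square(square, seed=1)

for row in randomised:
    print(" ".join(f"{name:<6}" for name in row))
```

Shuffling individual cells instead of whole rows and columns would generally destroy the balance, which is why the permutation is applied to complete rows and columns.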

It has one fixed factor (Treatment) and two random factors (Rows and Columns). We would use it if there are two factors such as day of the week (represented as columns) and time of the day (rows) which may influence the outcome, and we want these balanced out.

Latin squares with more than seven treatments can become too large to be managed easily, and those with fewer than four are too small. However, small ones (as small as 2x2) can be replicated.

4 & 5. Factorial and split-plot designs. These involve two or more fixed effect factors. Factorial designs are of great importance, so they are discussed in a separate section.