Normalisation of microarray gene expression data

This page gives a brief summary of methods for normalising microarray gene expression data and my experience with them, the R scripts I have written for the purpose, and some suggestions for useful quality controls.

My work has primarily been on Agilent one-colour oligonucleotide arrays for gene expression. For other platforms or uses, the methods and results I present here may not be fully valid or applicable.

Introduction
Here I give a quick introduction to microarrays, with a focus on Agilent gene expression arrays, and explain the main sources of variability and the ideas underlying my choice of methods.
Methods
An explanation of the normalisation method: the individual steps, their purpose, and the motivation for these choices.
QC (quality control)
A brief overview of the quality control checks I have been using.
Analysis
Suggestions and advice related to analysis of microarray gene expression data.
R scripts
I have written some R scripts that streamline microarray normalisation. They consist of wrappers for methods from the R packages limma and pcaMethods, a simple data structure for organising the data from the different normalisation steps, and supporting functions for reading and writing the gene expression data as a matrix table.
File formats and identifiers
A brief summary of the file format, naming conventions, and relevant identifiers.
Download page
From here, you can download the scripts and get directions to annotation files and sample data to test them on.
Last modified February 18, 2015.