Software for measuring the distance between distribution functions
This is a
short user manual for the q-distance software. Feel free to send
questions or comments to this e-mail address: k.p.d.lehre@medisin.uio.no.
The software calculates the distance between
two cumulative distribution functions. To be precise, let X and Y be two
variables with cumulative distribution functions (CDFs) F and G,
respectively. As a measure of the difference we study the distance function
given by

[formula image missing]

where $F^{-1}$ and $G^{-1}$ are the inverses of the CDFs F and G,
respectively. The estimate of the distance function is given by

[formula image missing]

which uses the empirical distribution functions based on m and n
observations, respectively. The confidence
intervals are based on asymptotic results, and their derivations can be found
in Laake et al. (1985), Biometrics, 41:515-523. The algorithm used in this
program is based on the methods described there.
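As a rough illustration of comparing two samples through their inverse empirical distribution functions, the following Python sketch evaluates a quantile difference at a given level q. It assumes that the distance at q is the inverse empirical CDF of the first sample minus that of the second. The sign convention, the left-continuous definition of the inverse, and the function names `empirical_quantile` and `quantile_distance` are assumptions made for this example, not taken from the program or from Laake et al. The sketch does not compute confidence intervals.

```python
import math


def empirical_quantile(sample, q):
    """Left-continuous inverse of the empirical CDF.

    Returns the smallest observed value x such that the fraction of
    observations <= x is at least q.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    ordered = sorted(sample)
    # 1-based rank of the order statistic to return, clamped to at least 1.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def quantile_distance(x_sample, y_sample, q):
    """Assumed convention: quantile of x_sample minus quantile of y_sample."""
    return empirical_quantile(x_sample, q) - empirical_quantile(y_sample, q)


# Small made-up samples of different sizes (m = 4, n = 5).
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
for q in (0.25, 0.5, 0.75):
    print(q, quantile_distance(x, y, q))
```

Because the two samples may have different sizes, the sketch evaluates both inverse functions at the same level q rather than pairing observations.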
Download
and installation instructions can be found here.
Start the
program by running the "q-distance.exe" file.
When the
program is started, a brief manual text is displayed in the data panel. This
text is also available from the help menu.
1) Use the
"Open file" menu to open a text file with one data point on each
line. The two data sets to be compared should be separated by a line containing
an asterisk (*). The data is read into the “Data” panel.
2) Use the
"Analyze" menu to perform analysis. The program will suggest a window
size for the calculations. After the analysis, two inverse cumulative
distribution graphs will be displayed in the “Inverse cumulative graphs” panel.
The first data set (before the * in the data file) is shown in green, and the
second data set is shown in red. The “Distance graph” panel shows the distance
between the two inverse cumulative distributions in blue, and the 95 %
confidence bands in black. This curve typically needs some smoothing.
3) Use the
"Smooth distance graph" menu after analysis to perform LOWESS
smoothing of the distance graph. Note: The distance graph and confidence bands
are smoothed individually. Use "Fit distance confidence graphs" to
realign the confidence bands with the distance graph if the graphs become
misaligned during smoothing.
4) The
"Save distance graph point data to file" menu saves the points of the
distance graph ±1.96 SD to a file.
The "Save currently displayed graph to
bitmap file" menu saves an image of the graph in the selected panel to a
file.
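The input format described in step 1 (one data point per line, with the two data sets separated by a line containing an asterisk) can be read outside the program with a few lines of Python. This is a minimal sketch, not the program's own reader. The function name `parse_two_samples` is hypothetical, and the handling of blank lines and the use of a decimal point are assumptions.

```python
def parse_two_samples(text):
    """Split text into two lists of floats at the line containing '*'."""
    first, second = [], []
    current = first
    seen_separator = False
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if line == "*":
            if seen_separator:
                raise ValueError(f"second separator on line {line_number}")
            seen_separator = True
            current = second  # later values belong to the second data set
            continue
        current.append(float(line))  # assumption: decimal point, not comma
    if not seen_separator:
        raise ValueError("no separator line containing '*' was found")
    return first, second


# Made-up example input in the described format.
example = "2.5\n3.1\n4.0\n*\n1.9\n2.2\n"
first_set, second_set = parse_two_samples(example)
print(len(first_set), len(second_set))
```

To read an actual file, the same function can be applied to the file's contents, for example `parse_two_samples(open(path).read())`, where `path` is the location of the data file.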
In
addition to the functions described above, which display the inverse cumulative
distributions, there are functions to show the cumulative distributions.
The menu
“Simple analysis” shows a separate window with some simple statistics for the
two data sets. The first data set is denoted “Y”, and the second “X”.
Information from this window can be selected with the cursor and copied by
right-clicking.
This
software is currently only available for the Microsoft Windows operating
system. A Linux version might be available in the future.
Download
and unzip q-distance1.6.0.zip (MD5SUM: 9eadfcef624f827afded8f2d4b067b87)
into a directory. The zip file contains three files. The program is started by
running the file “q-distance.exe”. “Birth.txt” is an example data file. The file
“winlowess.exe” is used by q-distance to perform LOWESS curve smoothing.
Winlowess.exe is a stand-alone LOWESS curve smoothing program made by
K.P. Lehre, based on GNU GPL-licensed code from the BASE - BioArray Software Environment project. In accordance with the GPL
license, the source code for winlowess.exe is found in winlowess.zip.
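The MD5 sum listed for the zip file can be checked after download. The following Python sketch uses only the standard `hashlib` module. The helper names `md5_of_file` and `matches_expected` are chosen for this example, and the file is assumed to be in the current directory under the name given above. An MD5 match indicates that the file was not corrupted in transfer; it is not a strong security guarantee.

```python
import hashlib

# Checksum as listed in the download instructions.
EXPECTED_MD5 = "9eadfcef624f827afded8f2d4b067b87"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("q-distance1.6.0.zip")` should return `True` for an intact download.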
Read in
the data set birth.txt by using the "Open file" menu. These data are
taken from the Low
Birth Weight Study, Hosmer, D.W. and Lemeshow, S., Applied Logistic Regression,
Second Edition, 2000, John Wiley & Sons, page 24, and are used with permission
from John Wiley &
Sons. Here we study birth weight in relation to smoking. The two groups are
non-smokers and smokers, respectively. The data file consists of
one column, listing first the birth weights for non-smoking mothers and then those for
smoking mothers, separated by an asterisk (*). Use the "Analyze" menu to
perform the analysis and then the "Smooth" menu to
perform smoothing. The distance function is displayed by clicking on “Distance
graph”. This should display the following graph:

Editors:
k.p.d.lehre@medisin.uio.no
Document created: 16.11.2005, certified: xx.xx.2005, last update: 06.02.2010