Abbreviations
There are different kinds of abbreviations:
True abbreviations
Short forms made from the first letters of a word: eg. temp. for temperature. These end with a period.
Suspensions
Abbreviations made by removing parts of the interior of a word: eg. Mr and dept. These do not end with a period.
Acronyms
These are words made up from the initial letters or syllables of each part of a compound term, and are pronounced as words: eg. NASA, ASAP.
Initialisms
Like acronyms, these are formed from the initial letters of each part of a compound term, but are not pronounced as words: eg. USA, UK, ATP, WWW.
Category theory and functoriality
Ref: Wikipedia
Dunning-Kruger effect
The Dunning–Kruger effect is a cognitive bias in which unskilled people make poor decisions and reach erroneous conclusions, but their incompetence denies them the metacognitive ability to realize their mistakes.

Ref: Wikipedia

Entropy and information
For a channel where signal i occurs with probability p_i, the entropy is defined as H = -Σ p_i log p_i. The unit depends on the base of the logarithm: bits if log2 is used; nats if the natural logarithm, ln, is used; hartleys if log10 is used.
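As a quick numeric illustration (a minimal Python sketch of my own; the function name `entropy` is not from any particular library), the definition can be computed directly:

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy H = -sum(p * log p); base 2 gives bits,
    base e gives nats, base 10 gives hartleys."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of entropy.
print(entropy([0.5, 0.5]))          # 1.0 bit
print(entropy([0.5, 0.5], math.e))  # ~0.693 nats (= ln 2)
```

Terms with p = 0 are skipped, following the convention 0·log 0 = 0.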
ESP, Extrasensory Perception
Basically, mind-reading, as demonstrated online by Clifford Pickover.
Persistent, persistence
Of objects/data: that values/data persist (eg. are stored) after the program ends.
Sociobiology and evolutionary perspectives on human behaviour
There are five major directions in the study of evolutionary perspectives on human and animal behaviour. These should not be seen as clear-cut categories, and most would accept that all types of explanations characteristic of each of the five directions are relevant; the explanations and examples are meant to be somewhat stereotypical.
Sociobiology
This is the oldest approach, and perhaps the most resented. One new idea is the gene's-eye view: ie. to view evolution and selection from the gene's point of view, rather than from the carrier's (human's) point of view. From this follows kin selection, as close kin share many of the same genes.

Example: Haldane's joke that he'd "lay down his life for two brothers or eight cousins" as they are each expected to carry resp. one half and one eighth of his own genes.

Human behavioural ecology
The main idea is to explain human behaviour as adaptive, and to demonstrate that an optimal behaviour is chosen. Hence, human behaviour, and differences in behaviour, are explained through human behavioural adaptability.

Example: Human trade-off between having many offspring or fewer, where fewer offspring allows for greater investment per offspring.

Evolutionary psychology
Argues that human behaviours are adaptations to our prehistoric environment of evolutionary adaptedness (EEA): primarily that of hunter-gatherers.

Example: The human desire for fats and sugars was probably advantageous when these were in short supply, but is now a health hazard.

Memetics
Memes are cultural entities, ideas and information, that can spread from person to person. Memetics takes a meme's-eye view, arguing that memes may spread even though they need not be advantageous to their hosts.

Example: Spread of religions and science.

Gene-culture co-evolution
There is co-evolution between biology and culture: behaviour may adapt or be learnt, which in turn favours genes that are advantageous under that behaviour, and vice versa.

Example: Co-evolution between dairy farming and genes for digesting milk (lactase persistence).

With all of the above, the risk of producing just-so explanations is great, and heavily criticized: ie. producing explanations that fit the facts, but without testable predictions. In terms of scientific position, evolutionary psychology and human behavioural ecology are the most prominent; few refer to themselves as sociobiologists, though that is probably partly due to the hostility towards that term from other sciences; memetics is hardly considered a science, though it has gained a lot of popular publicity; gene-culture co-evolution has so far been more theoretical than empirical.

Ref: "Sense & Nonsense" by Laland and Brown

Statistics
Cumulants, cumulant generating function
The cumulants m_k are defined as the coefficients of m(s) = m_1 s + m_2 s^2/2! + ... = ln M(s), where M(s) is the moment generating function.
Exponential family of distributions
Distributions in the exponential family have density functions f(x) of the form ln f(x) = [θx - g(θ)]/φ + c(x,φ). This is linked to the cumulant generating function by m(s) = [g(θ+sφ) - g(θ)]/φ; in particular, μ = g'(θ) and σ^2 = φg''(θ).
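These identities can be sanity-checked numerically (a hedged Python sketch, using the Poisson distribution as the worked case: ln f(x) = x ln λ - λ - ln x!, so θ = ln λ, g(θ) = e^θ, φ = 1, and both μ = g'(θ) and σ^2 = φg''(θ) should equal λ):

```python
import math

lam = 3.0  # Poisson rate: theta = ln(lam), g(theta) = e^theta, phi = 1
pmf = lambda x: math.exp(-lam) * lam**x / math.factorial(x)

# Truncated sums over x = 0..60 approximate E[X] and Var[X] to high accuracy.
mean = sum(x * pmf(x) for x in range(61))
var = sum((x - mean) ** 2 * pmf(x) for x in range(61))

print(mean, var)  # both ~ 3.0, matching mu = g'(theta) and sigma^2 = phi * g''(theta)
```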
Moments, moment generating function
The k-th moment of a stochastic variable X is M_k = E[X^k]. The moment generating function is M(s) = 1 + M_1 s + M_2 s^2/2! + ... = E[e^(sX)]. See also cumulants.
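To illustrate the link between moments and cumulants (a small Python sketch of my own; the conversions k1 = M1, k2 = M2 - M1^2, k3 = M3 - 3 M1 M2 + 2 M1^3 are the standard moment-to-cumulant formulas for the first three orders):

```python
def raw_moments(xs, kmax):
    """Raw sample moments M_k = mean(x^k) for k = 1..kmax."""
    n = len(xs)
    return [sum(x ** k for x in xs) / n for k in range(1, kmax + 1)]

def cumulants3(xs):
    """First three cumulants from raw moments:
    k1 = M1, k2 = M2 - M1^2, k3 = M3 - 3*M1*M2 + 2*M1^3."""
    m1, m2, m3 = raw_moments(xs, 3)
    return m1, m2 - m1**2, m3 - 3*m1*m2 + 2*m1**3

data = [1.0, 2.0, 3.0, 4.0]
k1, k2, k3 = cumulants3(data)
print(k1, k2, k3)  # 2.5, 1.25 (population variance), 0.0 (symmetric data)
```

Note that k2 here is the population (not sample-corrected) variance, and k3 vanishes for symmetric data.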
Multiple testing
Multiple testing occurs when N hypotheses, H1, ..., HN, are tested simultaneously. With ordinary P-values, the expected number of tests with P<α is the E-value, E = N*α. Corrections for multiple testing may satisfy different criteria:
Family-wise error: FWE = P[FP>0]
This is the probability of having at least one false positive. No assumption is to be made as to how many or which of the null-hypotheses are true.
Weak family-wise error: wFWE = P[FP>0 | H1, H2...]
This is also called the family-wise error under the complete null-hypothesis: i.e. the probability of getting at least one false positive if all the null-hypotheses are true. It makes no statement about the probability of false positives if one or more of the null-hypotheses are false.
False discovery rate: FDR = E[FP/(FP+TP)] where FP/(TP+FP)=0 if TP+FP=0
This is the expected portion of false positives amongst the positives.
Robustness against false/true positives (RP = RFP + RTP)
This is not truly a formal criterion, yet an important one. FWE only gives the probability of having at least one false positive, but says nothing about the reliability of the other tests when false positives have occurred. Of course, if two hypotheses are strongly correlated and one turns up false positive, the other is likely to do the same; however, independent hypotheses should not be affected.

The same applies to true positives, perhaps to an even stronger extent: a true positive should not affect the test of an independent hypothesis. If this is not the case, including many false null-hypotheses, i.e. ones that would yield true positives, would make the test liberal on the true null-hypotheses, causing more false positives.

If both RFP and RTP apply, this is robustness against positives: RP.

Sensitivity for correlated hypotheses (SCH)
If the hypotheses are strongly correlated, the number of hypotheses exaggerates the effective number of hypotheses or degrees of freedom. E.g. if each hypothesis is listed m times, N = n*m, the effective number of hypotheses should be n rather than N. A multiple test procedure has the SCH property if it retains its sensitivity as m increases.
Though I write equality in the above, in reality what can be determined are upper limits: e.g. as FWE should hold for all combinations of true and false null-hypotheses, all we can say concerns the worst-case scenario.

Several different approaches exist. Assume that the uncorrected P-values are ordered in increasing order: P1≤P2≤.... The aim is either to find corrected P-values, Ψi, or a procedure for determining which are considered statistically significant at the α level.

  • Bonferroni (FWE,RP): Let Ψi = N*Pi: i.e. the test of Hi is positive if Pi < α/N. Holds under all conditions, but is conservative if the hypotheses are positively correlated.
  • Bonferroni-Holm (FWE,RP): If Pi*(N+1-i) < α for i=1,2,...,n, then H1,...,Hn are positive. Holds under all conditions, but is conservative if the hypotheses are positively correlated.
  • Bonferroni-Hochberg (FWE,RP): If Pi*(N+1-i) < α for some i, then H1,...,Hi are positive. Independence of hypotheses is assumed.
  • Sidak (FWE,RP): Ψi = 1-(1-Pi)^N. Holds true if the hypotheses are independent, but is often as conservative as Bonferroni; little is gained over Bonferroni.
  • Sidak-Holm (FWE,RP): If 1-(1-Pi)^(N+1-i) < α for i=1,2,...,n, then H1,...,Hn are positive. Like the Sidak test, it assumes that the hypotheses are independent.
  • Simes (FWE,SCH): If Pi*N/i < α, the tests of H1,...,Hi are positive. This holds if the hypotheses are independent, but tends to be a little liberal if hypotheses are positively correlated.
  • Benjamini-Hochberg (FDR,SCH): This is the same criterion as Simes' test, but with a different interpretation: as a false discovery rate.
  • For comparison of groups in ANOVA
  • Tukey (FWE): All pairwise comparisons. Assumes homogeneity (same variance across groups).
  • Dunnett (FWE): All comparisons against a control. Assumes homogeneity.
  • There are numerous others.
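Two of the procedures above can be sketched in a few lines of Python (function names are my own; this is an illustration, not a validated statistical library):

```python
def holm(pvals, alpha=0.05):
    """Bonferroni-Holm step-down: visit p-values in increasing order and
    reject while P_(i) * (N + 1 - i) < alpha (1-based i); stop at the
    first failure."""
    N = len(pvals)
    order = sorted(range(N), key=lambda j: pvals[j])
    reject = [False] * N
    for rank, j in enumerate(order, start=1):
        if pvals[j] * (N + 1 - rank) >= alpha:
            break
        reject[j] = True
    return reject

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up (FDR): find the largest i with
    P_(i) * N / i < alpha and reject H_(1), ..., H_(i)."""
    N = len(pvals)
    order = sorted(range(N), key=lambda j: pvals[j])
    reject = [False] * N
    cutoff = 0
    for rank, j in enumerate(order, start=1):
        if pvals[j] * N / rank < alpha:
            cutoff = rank
    for j in order[:cutoff]:
        reject[j] = True
    return reject

ps = [0.001, 0.02, 0.04, 0.30]
print(holm(ps))                 # only the smallest p-value survives Holm
print(benjamini_hochberg(ps))   # BH additionally rejects the second
```

As the example shows, BH (controlling FDR) is more permissive than Holm (controlling FWE) on the same p-values.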
Prediction accuracy
Logical (yes/no) predictions classify cases as positives and negatives. Those predicted positive may be either correctly predicted (TP = true positives) or falsely predicted as positives (FP = false positives); conversely for negatives. The following measures are used:
  • sensitivity = TP/(TP+FN) = TP/actual positives
  • specificity = TN/(TN+FP) = TN/actual negatives
  • False negative probability = FN/(TP+FN) = 1-sensitivity
  • False positive probability = FP/(TN+FP) = 1-specificity
  • Positive predictive value (a) = TP/(TP+FP)
  • Negative predictive value (a) = TN/(TN+FN)
  • False positive index (a) = FP/FN
  • ROC (Receiver Operating Characteristic) is the plot of the sensitivity (true positive rate) against the false positive rate (1-specificity): i.e. (x,y) = (1-specificity, sensitivity). This curve is used to describe how sensitivity and specificity correspond to each other as prediction threshold values or parameters vary.
  • Accuracy index is the area under the ROC curve. This is 1 (100%) for perfect prediction and 0.5 (50%) for pure chance; 0 (0%) is perfect misprediction. One interpretation of this index is as the probability of being able to tell a positive from a negative when it is known that there is one of each.
  • (a) Depends on the percentage of positives in the sample.
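The basic measures follow directly from the four confusion counts, and the rank-based `auc` below illustrates the interpretation of the accuracy index as the probability of telling a positive from a negative (a Python sketch; function names are my own):

```python
def prediction_stats(tp, fp, tn, fn):
    """Accuracy measures from the four confusion counts (see the list above)."""
    return {
        "sensitivity": tp / (tp + fn),  # TP / actual positives
        "specificity": tn / (tn + fp),  # TN / actual negatives
        "ppv": tp / (tp + fp),          # depends on prevalence in the sample
        "npv": tn / (tn + fn),          # likewise
    }

def auc(pos_scores, neg_scores):
    """Area under the ROC curve via its rank interpretation: the
    probability that a random positive scores higher than a random
    negative (ties count half)."""
    pairs = [(p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores]
    return sum(pairs) / len(pairs)

print(prediction_stats(tp=80, fp=10, tn=90, fn=20))
print(auc([2, 3], [1, 2]))  # 0.875
```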
Rules of Thumb
  • For a normal distribution, percentiles are related to the standard deviation roughly by P75-P25=1.35SD, P84-P16=2SD, P90-P10=2.6SD, P95-P5=3.3SD, P97.5-P2.5=3.9SD, P99-P1=4.7SD, P99.5-P0.5=5.2SD, P99.9-P0.1=6.2SD, P99.95-P0.05=6.6SD.
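These spreads can be checked against exact normal quantiles, e.g. with Python's standard-library NormalDist (a quick sketch; the rule-of-thumb values in the comparison list are taken from the entry above):

```python
from statistics import NormalDist

# Compare percentile spreads against exact standard-normal quantiles (SD = 1).
z = NormalDist().inv_cdf
for lo, hi, rule in [(0.25, 0.75, 1.35), (0.16, 0.84, 2.0),
                     (0.05, 0.95, 3.3), (0.005, 0.995, 5.2)]:
    spread = z(hi) - z(lo)
    print(f"P{hi*100:g}-P{lo*100:g} = {spread:.2f} SD (rule of thumb: {rule})")
```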
Terminology, medical
Example of medical terminology:
Following termination of avian exposure, there was a substantial incrementation in lung volume and, at this moment in time, it would appear that there has been a marginal degree of improvement in diffusing capacity.
For the uninitiated, this means:
After the man stopped keeping birds, his lung volume increased and his diffusing capacity apparently improved slightly.

Ref: "Successful scientific writing" by Matthews, Bowen and Matthews.

Thread
'Thread of control' in a program.

Ref: An Introduction to Java Thread Programming

Units of measurement
Bits and bytes
Use b for bits and B for bytes. Preferably, the prefixes k, M, etc. denote 10^3, 10^6, etc.; for 'binary' prefixes, powers of 2^10, use Ki, Mi, etc.
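The difference between SI and binary prefixes is easy to see with a small formatter (a hypothetical Python helper of my own, not a standard function):

```python
def format_bytes(n, binary=False):
    """Format a byte count with SI (kB, MB, ...) or binary (KiB, MiB, ...)
    prefixes; SI steps by 1000, binary by 1024 (= 2^10)."""
    step = 1024 if binary else 1000
    prefixes = ["", "Ki", "Mi", "Gi", "Ti"] if binary else ["", "k", "M", "G", "T"]
    for p in prefixes:
        if n < step or p == prefixes[-1]:
            return f"{n:.1f} {p}B"
        n /= step

print(format_bytes(1_500_000))               # 1.5 MB
print(format_bytes(1_500_000, binary=True))  # 1.4 MiB
```

The same count reads as 1.5 MB but only 1.4 MiB, since a mebibyte (2^20 bytes) is about 4.9% larger than a megabyte (10^6 bytes).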

Ref: NIST on SI, Romulus, UKmetrication.

Entropy and Information
See Entropy and information above.
Visualization of data
Ref: Gallery of Data Visualization

Last modified: Fri Jun 30 14:27:54 CEST 2006