-
Why do we use 0.05 for statistical significance?
http://www.jerrydallal.com/LHSP/p05.htm Maybe I can summarize it like this. First, Fisher started using 0.05 as the threshold for deciding statistical significance. p=0.05 corresponds to roughly two standard deviations from the mean, and to odds of 1 in 20, so it looked good enough. Also, once it was accepted, using p=0.06 or p=0.07 instead was not welcomed. In addition, using p=0.05 is easy for scientific…
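The "two standard deviations" and "1 in 20" claims can be checked numerically. A small Python sketch using only the standard library (the 1.96 cutoff is the exact two-sided 5% point; 2.0 is Fisher's round approximation):

```python
from math import erf, sqrt

def two_sided_p(z):
    """Two-sided tail probability of a standard normal beyond |z|:
    P(|Z| > z) = 2 * (1 - Phi(z)), with Phi(z) = (1 + erf(z/sqrt(2))) / 2."""
    phi = (1 + erf(z / sqrt(2))) / 2
    return 2 * (1 - phi)

print(round(two_sided_p(1.96), 4))  # 0.05 -> the "1 in 20" odds
print(round(two_sided_p(2.0), 4))   # 0.0455 -> "2 standard deviations" is close
```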
Tags:
-
unclass and as.character
http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/r/iris_plots/ Look for unclass() in the middle of the page, used for drawing three-dimensional data in a 2D plot with pch=unclass(…). Alternatively, one can use as.character().
Tags:
-
HMC Calculus Tutorial
http://www.math.hmc.edu/calculus/tutorials/ Contains calculus and linear algebra tutorials.
Tags:
-
Expectation Maximization
http://see.stanford.edu/see/materials/aimlcs229/handouts.aspx See ‘Mixtures of Gaussians and the EM algorithm’ and ‘The EM algorithm’. This is the easiest-to-understand explanation of this topic I’ve seen on the internet. Among books, Pattern Classification by Duda has a good chapter on it.
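The E-step/M-step loop from those handouts can be sketched for the simplest case, a 1-D mixture of two Gaussians. A minimal numpy illustration (crude initialization from the data extremes; not a production implementation):

```python
import numpy as np

def em_two_gaussians(x, n_iter=200):
    """EM for a 1-D mixture of two Gaussians (illustrative sketch)."""
    # Crude initialization: means at the data extremes, shared variance
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i)
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights, means, and variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
pi, mu, var = em_two_gaussians(x)
print(mu)  # should recover means near the true -2 and 3
```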
Tags:
-
Advanced R data manipulation functions
http://www.ats.ucla.edu/stat/r/library/advanced_function_r.htm *apply, sweep, and column/row functions.
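For readers coming from Python, rough numpy analogues of apply() and sweep() (axis reductions and broadcasting, respectively) look like this:

```python
import numpy as np

m = np.arange(6.0).reshape(2, 3)

# R's apply(m, 1, sum) / apply(m, 2, mean): reduce over rows or columns
row_sums = m.sum(axis=1)    # like apply(m, 1, sum)
col_means = m.mean(axis=0)  # like apply(m, 2, mean)

# R's sweep(m, 2, colMeans(m), "-"): subtract a statistic from each column;
# numpy broadcasting plays the role of sweep()
centered = m - col_means

print(centered.mean(axis=0))  # each column now has mean 0
```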
Tags:
-
Tikhonov regularization
http://en.wikipedia.org/wiki/Tikhonov_regularization Called ridge regression in statistics. ‘Regularization: Ridge Regression and the LASSO’ is another good reference. In R, use lm.ridge() from the MASS package. There is sample code covering not just lm.ridge(), but also pls (partial least squares), the lasso, and pcr (principal component regression).
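A minimal numpy sketch of the Tikhonov closed form behind ridge regression, beta = (XᵀX + λI)⁻¹Xᵀy (this is the math lm.ridge() implements, not its actual code, and lm.ridge() additionally scales the data):

```python
import numpy as np

def ridge(X, y, lam):
    """Tikhonov-regularized least squares: (X'X + lam*I)^{-1} X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=50)

print(ridge(X, y, lam=0.0))    # lam=0 is plain least squares, near beta_true
print(ridge(X, y, lam=100.0))  # heavy regularization shrinks coefficients toward 0
```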
Tags:
-
Automatic model selection
The leaps package has the regsubsets() function, which automatically finds the best model for each model size. There’s an example in ?regsubsets: find the best model by adjusted R², then build a model using the selected variables. When doing model selection, instead of applying automatic methods recklessly, one should consider whether the model really makes sense based on the prior…
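The search that regsubsets() performs can be sketched in Python: enumerate predictor subsets, fit OLS on each, and score by adjusted R² (an illustrative brute-force version, not the branch-and-bound algorithm leaps actually uses):

```python
import itertools
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R^2 of an OLS fit with intercept."""
    n = len(y)
    Xi = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    resid = y - Xi @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    p = X.shape[1]
    return 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

def best_subset(X, y):
    """Exhaustive search over predictor subsets, scored by adjusted R^2."""
    k = X.shape[1]
    best = None
    for size in range(1, k + 1):
        for cols in itertools.combinations(range(k), size):
            score = adjusted_r2(X[:, cols], y)
            if best is None or score > best[0]:
                best = (score, cols)
    return best

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = 2 * X[:, 0] - X[:, 2] + 0.1 * rng.normal(size=100)  # only columns 0 and 2 matter
score, cols = best_subset(X, y)
print(cols)  # the true predictors 0 and 2 should be selected
```

This is also why the caution above matters: adjusted R²'s penalty is weak, so an irrelevant noise column can occasionally sneak into the winning subset.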
Tags:
-
I/O Virtualization
http://queue.acm.org/detail.cfm?id=2071256 Nice intro to I/O in virtualization: benefits, challenges, and solutions.
Tags:
-
Bufferbloat
http://en.wikipedia.org/wiki/Bufferbloat Oversized buffers in the middle of the network can interfere with TCP congestion control and slow the network down.
Tags:
-
Cholesky decomposition
http://en.m.wikipedia.org/wiki/Cholesky_decomposition In linear algebra, the Cholesky decomposition or Cholesky triangle is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. It was discovered by André-Louis Cholesky for real matrices. When it is applicable, the Cholesky decomposition is roughly twice as efficient as the LU decomposition for solving systems of linear equations.[1] In a loose, metaphorical sense, this can be thought of as the matrix analogue…
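Both points — the A = LLᵀ factorization and its use for solving linear systems by forward- then back-substitution — can be demonstrated in a few lines of numpy:

```python
import numpy as np

# Build a symmetric positive-definite matrix: A = B B^T + diagonal shift
rng = np.random.default_rng(3)
B = rng.normal(size=(4, 4))
A = B @ B.T + 4 * np.eye(4)

L = np.linalg.cholesky(A)       # lower-triangular factor
print(np.allclose(L @ L.T, A))  # True: A = L L^T

# Solving A x = b via the factor: two triangular solves
b = rng.normal(size=4)
y = np.linalg.solve(L, b)       # forward-substitution: L y = b
x = np.linalg.solve(L.T, y)     # back-substitution:   L^T x = y
print(np.allclose(A @ x, b))    # True
```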
Tags: