Passion is like genius; a miracle. – Page 30 – Blog on Software, Statistics, and Quant

Transformation Matrix

http://en.wikipedia.org/wiki/Transformation_matrix In linear algebra, linear transformations can be represented by matrices. Most common geometric transformations that keep the origin fixed are linear, including rotation, scaling, shearing, reflection, and orthogonal projection.
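As a rough illustration of the idea, here is a minimal Python sketch that builds a few 2x2 transformation matrices and applies them to a point. The helper names (`mat_vec`, `mat_mul`, `rotation`, `scaling`, `shear_x`) are made up for this example and are not from the linked article.

```python
import math

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def mat_mul(a, b):
    """Compose two 2x2 matrices; applying the result equals applying b, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    """Counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def scaling(sx, sy):
    """Axis-aligned scaling."""
    return [[sx, 0.0], [0.0, sy]]

def shear_x(k):
    """Shear parallel to the x axis."""
    return [[1.0, k], [0.0, 1.0]]

point = [1.0, 0.0]
rotated = mat_vec(rotation(math.pi / 2), point)       # roughly [0, 1]
sheared = mat_vec(shear_x(2.0), [1.0, 1.0])           # [3, 1]

# Rotate first, then scale: one matrix represents the whole composition.
combined = mat_mul(scaling(2.0, 3.0), rotation(math.pi / 2))
scaled_after_rotation = mat_vec(combined, point)      # roughly [0, 3]

# Every linear map sends the origin to itself.
origin_image = mat_vec(combined, [0.0, 0.0])
```

Because composition is just matrix multiplication, a chain of such transformations collapses into a single matrix, and the origin stays fixed no matter which of them are combined.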

January 5, 2012

Tags: statistics

Robust Regression

The classical linear model minimizes $\sum_i r_i^2$, where $r_i$ is the $i$-th residual; this criterion is called LS (Least Squares) or the sum of squares. But it is not robust against outliers, so we need robust regression methods. Least Absolute Deviation: seeks to minimize $\sum_i |r_i|$. This method has breakdown point zero, meaning that even a small number of outliers can damage the goodness of…
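A small Python sketch may make the two criteria concrete. It assumes the simplest possible model, a line through the origin y = b*x, which is my simplification and not necessarily the setup of the full post. Under that assumption the LAD slope is a weighted median of the ratios y/x, so no optimizer is needed. The function names and the data are invented for illustration.

```python
def ls_slope(xs, ys):
    """Least squares slope for y = b*x: minimizes sum of r_i^2."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def lad_slope(xs, ys):
    """LAD slope for y = b*x: minimizes sum of |r_i|.

    Since |y - b*x| = |x| * |y/x - b|, a minimizer is a weighted median
    of the ratios y/x with weights |x|.
    """
    pairs = sorted((y / x, abs(x)) for x, y in zip(xs, ys) if x != 0)
    half = sum(w for _, w in pairs) / 2.0
    running = 0.0
    for ratio, w in pairs:
        running += w
        if running >= half:
            return ratio

xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]          # exactly y = 2x
ys_outlier = [2, 4, 6, 8, 100]       # one bad response value

ls_clean = ls_slope(xs, ys_clean)            # 2.0
ls_bad = ls_slope(xs, ys_outlier)            # pulled far away from 2
lad_bad = lad_slope(xs, ys_outlier)          # still 2.0

# A single leverage point (extreme x) is enough to break LAD as well.
xs_lev = xs + [100]
ys_lev = ys_clean + [0]
lad_lev = lad_slope(xs_lev, ys_lev)          # 0.0
```

In this toy data LAD shrugs off the outlier in y, but one point with an extreme x value drags the LAD slope all the way to 0, which is the kind of failure a breakdown point of zero describes.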

January 1, 2012

Tags: statistics

Kendall’s Tau VS Spearman’s Rho

A nice video on Kendall’s tau and Spearman’s rho: Part 1, Part 2. Here’s another explanation: http://www.unesco.org/webworld/idams/advguide/Chapt4_2.htm. In most cases, these values are very similar, and when discrepancies occur, it is probably safer to interpret the lower value. More importantly, Kendall’s Tau and Spearman’s Rho imply different interpretations. Spearman’s Rho is considered as the regular Pearson’s…
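To see how the two coefficients are computed, here is a minimal Python sketch. It assumes there are no ties in the data (so the shortcut formula for rho applies and tau is the simple tau-a form); the function names and the sample data are invented for this example.

```python
def ranks(values):
    """Ranks 1..n of the values; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (valid without ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

rho = spearman_rho(x, y)   # 0.8
tau = kendall_tau(x, y)    # 0.6
```

Rho works on rank differences, while tau counts pairs that agree or disagree in order, which is one way to see why the two numbers carry different interpretations.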

December 30, 2011

Tags: statistics

Kendall’s Tau

This is a nice video from how2stats.com. Part 1. Part 2.

December 30, 2011

Tags: statistics

Multiple implementations denial-of-service via hash algorithm collision from ocert

http://www.ocert.org/advisories/ocert-2011-003.html A hash-collision-based attack on key-value stores. If a web app uses the given key as is, i.e., without mixing in a timestamp or some salt, it is vulnerable to this type of attack. I like hacking, as most of it springs from this kind of creativity.
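The mechanics can be sketched in a few lines of Python. This is a toy model, not any framework's real implementation: `ChainedTable` is a made-up separate-chaining table, and the hash function is the 31-multiplier polynomial string hash (the formula used by Java's `String.hashCode`), chosen here because "Aa" and "BB" are a well-known colliding pair for it.

```python
from itertools import product

def poly31_hash(s):
    """31-multiplier polynomial string hash, truncated to 32 bits."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h

def colliding_keys(n_blocks):
    """Build 2**n_blocks distinct keys that all share one hash value.

    "Aa" and "BB" hash equally and have equal length, so any
    concatenation of these blocks collides as well.
    """
    return ["".join(blocks) for blocks in product(("Aa", "BB"), repeat=n_blocks)]

class ChainedTable:
    """Toy separate-chaining hash table that counts key comparisons."""

    def __init__(self, n_buckets=1024):
        self.buckets = [[] for _ in range(n_buckets)]
        self.comparisons = 0

    def insert(self, key, value):
        bucket = self.buckets[poly31_hash(key) % len(self.buckets)]
        for entry in bucket:
            self.comparisons += 1
            if entry[0] == key:
                entry[1] = value
                return
        bucket.append([key, value])

keys = colliding_keys(8)                 # 256 attacker-chosen keys

attacked = ChainedTable()
for k in keys:
    attacked.insert(k, True)

benign = ChainedTable()
for i in range(len(keys)):
    benign.insert("key%d" % i, True)

# All attacker keys land in one bucket, so insertion cost grows quadratically.
distinct_hashes = len(set(poly31_hash(k) for k in keys))
```

With every key in the same bucket, inserting n keys costs on the order of n^2 comparisons instead of n, which is how a modest request body can tie up a server. Mixing a per-process secret into the hash makes such key sets hard to precompute.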

December 29, 2011

Tags: software

PCA tutorial

http://www.snl.salk.edu/~shlens/pca.pdf The easiest explanation of PCA among the several documents I’ve read. It’s written by Jonathon Shlens.
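For a feel of what PCA computes, here is a minimal Python sketch restricted to 2-D data, where the eigenvalues of the covariance matrix have a closed form. It is only an illustration under that assumption, not the tutorial's own code; the function name `pca_2d` and the data are invented.

```python
import math

def pca_2d(points):
    """Return ((lam1, lam2), first_component) for a list of (x, y) points.

    lam1 >= lam2 are the eigenvalues of the sample covariance matrix and
    first_component is a unit eigenvector for lam1.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (sxx + syy) / 2.0
    rad = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = mid + rad, mid - rad

    # Eigenvector for lam1: (sxy, lam1 - sxx) when the axes are correlated.
    if sxy != 0:
        vx, vy = sxy, lam1 - sxx
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (lam1, lam2), (vx / norm, vy / norm)

# Points lying exactly on the line y = 2x.
data = [(-2, -4), (-1, -2), (0, 0), (1, 2), (2, 4)]
(lam1, lam2), (vx, vy) = pca_2d(data)
```

For this data all the variance lies along the direction (1, 2), so the second eigenvalue is zero and one coordinate along the first component describes every point.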

December 27, 2011

Tags: statistics

Folded normal distribution – Wikipedia, the free encyclopedia

http://en.m.wikipedia.org/wiki/Folded_normal_distribution If X is a random variable from a normal distribution, then |X| follows a folded normal distribution. The fold can fall anywhere relative to the mean. But if it falls exactly at the mean (i.e., the mean is 0), where the CDF is 0.5, it’s called the half-normal distribution.
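A quick simulation in Python can check this. The sketch below compares the sample mean of |X| with the closed-form mean of the folded normal, sigma*sqrt(2/pi)*exp(-mu^2/(2 sigma^2)) + mu*erf(mu/(sigma*sqrt(2))); the parameter values, seed, and function names are arbitrary choices for illustration.

```python
import math
import random

def folded_normal_mean(mu, sigma):
    """Theoretical mean of |X| for X ~ N(mu, sigma^2)."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu ** 2 / (2.0 * sigma ** 2))
            + mu * math.erf(mu / (sigma * math.sqrt(2.0))))

def simulated_mean(mu, sigma, n=100000, seed=0):
    """Sample mean of |X| from n seeded normal draws."""
    rng = random.Random(seed)
    return sum(abs(rng.gauss(mu, sigma)) for _ in range(n)) / n

# General folded normal: the fold at 0 is away from the mean.
folded_theory = folded_normal_mean(1.0, 2.0)
folded_sim = simulated_mean(1.0, 2.0)

# Half-normal special case: mean 0, so the fold sits at the mean.
half_theory = folded_normal_mean(0.0, 2.0)   # equals 2 * sqrt(2 / pi)
half_sim = simulated_mean(0.0, 2.0)
```

With mu = 0 the second term vanishes and the mean reduces to sigma*sqrt(2/pi), the half-normal mean.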

December 27, 2011

Tags: statistics

Neuralnet for XOR

Let’s use caret to find a better number of hidden nodes. Below, I needed many data rows so that the default sampling method, i.e., k-fold CV, has enough data to work with. (i.e., if k=5 or 10, how can we run k-fold using just 4 data rows?) We may choose to instantiate trainControl, but…

December 22, 2011

Tags: statistics

Caret package in R

http://caret.r-forge.r-project.org/Classification_and_Regression_Training.html The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. The package contains tools for: • data splitting • pre-processing • model tuning using resampling • variable importance estimation as well as other functionality. There are many different modeling functions…

December 21, 2011

Tags: statistics

Permutation Test

A permutation test is a way of getting a p-value using randomization, without assuming a particular distribution for the data. The basic idea is simple. Suppose that we want to see if y = ax + b + error holds where x is 0 or 1. In other words, we’re interested in whether the mean of y differs depending…
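The idea can be sketched in Python as follows. This is a minimal Monte Carlo version for the two-group case (x = 0 versus x = 1) using the absolute difference in means as the test statistic; the function name, the data, the number of permutations, and the seed are all assumptions made for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(y0, y1, n_perm=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose mean difference is at least as extreme as the observed
    one (with +1 in numerator and denominator so it is never exactly 0).
    """
    rng = random.Random(seed)
    observed = abs(mean(y1) - mean(y0))
    pooled = list(y0) + list(y1)
    n0 = len(y0)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n0:]) - mean(pooled[:n0]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# y values observed with x = 0 and with x = 1 (made-up numbers).
y_x0 = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
y_x1 = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8]

p_different = permutation_test(y_x0, y_x1)            # small: groups are separated
p_same = permutation_test([1, 2, 3, 4], [1, 2, 3, 4]) # 1.0: observed difference is 0
```

If the labels carried no information, shuffling them should produce differences as large as the observed one fairly often; a small p-value says it rarely does.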

December 21, 2011

Tags: statistics