-
Folded normal distribution – Wikipedia, the free encyclopedia
http://en.m.wikipedia.org/wiki/Folded_normal_distribution If X is a random variable from a normal distribution, then |X| follows a folded normal distribution. Folding can happen anywhere. But if the folding is done at the mean, i.e., where the CDF is 0.5, it’s called the half-normal distribution.
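A quick simulation can make the folding idea concrete. The sketch below is in Python using only the standard library; the function names, the seed, and the values mu = 1.5 and sigma = 2 are illustrative assumptions, not taken from the linked article. It compares simulated means of |X| against the closed-form folded normal mean, which reduces to sigma * sqrt(2 / pi) in the half-normal case (mu = 0).

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def folded_normal_mean(mu, sigma):
    """Closed-form mean of |X| for X ~ N(mu, sigma^2)."""
    return (sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu ** 2 / (2.0 * sigma ** 2))
            + mu * (1.0 - 2.0 * normal_cdf(-mu / sigma)))


def simulate_folded_mean(mu, sigma, n=100_000, seed=0):
    """Average of |X| over n simulated draws (seed chosen arbitrarily)."""
    rng = random.Random(seed)
    return sum(abs(rng.gauss(mu, sigma)) for _ in range(n)) / n


sigma = 2.0  # illustrative value
half_sim = simulate_folded_mean(0.0, sigma)    # folded at the mean: half-normal
folded_sim = simulate_folded_mean(1.5, sigma)  # folded away from the mean

print("half-normal:", half_sim, "vs", folded_normal_mean(0.0, sigma))
print("folded:     ", folded_sim, "vs", folded_normal_mean(1.5, sigma))
```

The simulated and closed-form values should agree to roughly two decimal places at this sample size.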
Tags:
-
Neuralnet for XOR
Let’s use caret to find a good number of hidden nodes. In the example below, I needed many data rows so that the resampling method, e.g., k-fold CV, has enough data to work with. (i.e., if k=5 or 10, how can we run k-fold using just 4 data rows?) We may choose to instantiate trainControl, but…
Tags:
-
Caret package in R
http://caret.r-forge.r-project.org/Classification_and_Regression_Training.html The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. The package contains tools for: • data splitting • pre-processing • model tuning using resampling • variable importance estimation as well as other functionality. There are many different modeling functions…
Tags:
-
Permutation Test
A permutation test is a way of getting a p-value by randomization, without assuming a particular distribution for the data. The basic idea is simple. Suppose that we want to see if y = ax + b + error holds with a nonzero a, where x is 0 or 1. In other words, we’re interested in whether the mean of y differs depending…
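The idea can be sketched in a few lines of Python (standard library only). This is one common two-sided variant based on the absolute difference in group means, not necessarily the exact procedure the full post uses; the function names, the data, and the number of permutations are illustrative assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(y0, y1, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    y0 holds the responses with x = 0, y1 those with x = 1. Under the null
    hypothesis the labels are exchangeable, so we shuffle the pooled data and
    count how often the shuffled difference is at least as extreme as observed.
    """
    rng = random.Random(seed)
    observed = abs(mean(y1) - mean(y0))
    pooled = list(y0) + list(y1)
    n0 = len(y0)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n0:]) - mean(pooled[:n0]))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: the x = 1 group is clearly shifted upward.
group0 = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
group1 = [3.0, 3.2, 2.9, 3.1, 3.0, 2.8]
p_value = permutation_test(group0, group1)
print("p-value:", p_value)
```

A small p-value says that few random relabellings produce a mean difference as large as the observed one.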
Tags:
-
R^2 without intercept is not what you want
In R, the help page for lm gives you an example that fits two models, lm.D9 and lm.D90, but the doc does not explain the difference between them. Their difference is that lm.D9 has an intercept (like weight = intercept + beta * group) while lm.D90 does not (weight = beta * group). But this is only a small part of the difference. If you look at…
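One way to see why the two R^2 values can differ so much: as far as I know, R's summary for a model without an intercept measures total variation around zero rather than around the mean of the response. The Python sketch below (standard library only) applies both definitions to the same fitted values. The data are made-up toy numbers, not the ones from the lm help page, and the sketch assumes the no-intercept fit has one coefficient per group, so both models predict the group means.

```python
def mean(values):
    return sum(values) / len(values)


def r_squared(y, fitted, centered):
    """R^2 = 1 - RSS / TSS, with TSS taken around the mean or around zero."""
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    if centered:
        y_bar = mean(y)
        tss = sum((yi - y_bar) ** 2 for yi in y)  # usual definition (with intercept)
    else:
        tss = sum(yi ** 2 for yi in y)            # uncentered (no intercept)
    return 1.0 - rss / tss


# Hypothetical toy data for a control and a treatment group.
ctl = [5.0, 5.2, 4.8, 5.1]
trt = [4.7, 4.9, 4.6, 5.0]
y = ctl + trt

# Both parameterizations predict the group means, so the residuals are identical.
fitted = [mean(ctl)] * len(ctl) + [mean(trt)] * len(trt)

r2_centered = r_squared(y, fitted, centered=True)
r2_uncentered = r_squared(y, fitted, centered=False)
print("R^2 around the mean:", r2_centered)
print("R^2 around zero:    ", r2_uncentered)
```

The fit is the same in both cases; only the baseline it is compared against changes, which is why the uncentered value looks far more impressive.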
Tags:
-
error != residual
Errors and residuals in statistics – Wikipedia, the free encyclopedia [quote]The error of a sample is the deviation of the sample from the (unobservable) true function value, while the residual of a sample is the difference between the sample and the estimated function value.[/quote]
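A small simulation can show the distinction, since in a simulation (unlike real data) the true function is known. In this Python sketch (standard library only) the true line y = 2x + 1, the noise level, and the seed are all hypothetical choices; the line is then estimated by ordinary least squares.

```python
import random

rng = random.Random(1)  # arbitrary seed


def true_function(x):
    """Hypothetical true relationship; unobservable in practice."""
    return 2.0 * x + 1.0


xs = [float(i) for i in range(20)]
errors = [rng.gauss(0.0, 1.0) for _ in xs]  # deviations from the true line
ys = [true_function(x) + e for x, e in zip(xs, errors)]

# Ordinary least squares estimate of slope and intercept.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residuals are deviations from the *estimated* line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print("sum of errors:   ", sum(errors))
print("sum of residuals:", sum(residuals))
```

With an intercept in the model, the residuals sum to zero up to rounding, while the errors generally do not.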
Tags:
-
Multiple comparison in R
R: Adjust P-values for Multiple Comparisons. Adjust p-values using, e.g., the Bonferroni correction in multiple comparisons.
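The Bonferroni correction itself is short enough to sketch by hand: multiply each p-value by the number of comparisons and cap the result at 1. The Python function below is an illustrative stand-in, not R's own implementation, and the p-values are made up.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * n, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# Hypothetical raw p-values from four comparisons.
raw = [0.005, 0.02, 0.125, 0.5]
adjusted = bonferroni(raw)
print(adjusted)
```

A comparison is then called significant only if its adjusted p-value is still below the chosen threshold.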
Tags:
-
Guessing user profile in social network
http://www.ccs.neu.edu/home/amislove/publications/Inferring-WSDM.pdf Friends share social attributes, e.g., school. Thus, even if you hide your profile, it could be predicted from your friends’ profiles. This is so true… When I was looking for people to follow on Twitter, I started from some engineers I knew of. After some time, I was able to follow many people working…
Tags:
-
Scorecard is logistic regression
A scorecard is a table used to compute, for example, the credit score of a person. For example, add 10 if age < 30, add 20 if 30 <= age < 40, add 30 if age >= 40, add 10 if he/she does not own a house, and add 20 if he/she owns a house. The grand sum of this process…
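The link to logistic regression is easier to see once the table is written as weights on 0/1 indicator variables. The Python sketch below (standard library only) assumes the three age bands are meant to be under 30, 30 to 39, and 40 or over. The `offset` and `scale` used to turn a score into a probability are purely hypothetical and are not part of the scorecard described here.

```python
import math

# One weight (points) per indicator, in the same order as indicators() below.
POINTS = [10, 20, 30, 10, 20]


def indicators(age, owns_house):
    """0/1 dummy variables, as a logistic regression would see them."""
    return [
        1 if age < 30 else 0,
        1 if 30 <= age < 40 else 0,
        1 if age >= 40 else 0,
        0 if owns_house else 1,  # does not own a house
        1 if owns_house else 0,  # owns a house
    ]


def score(age, owns_house):
    """Scorecard total: a weighted sum, i.e. a linear predictor."""
    return sum(w * x for w, x in zip(POINTS, indicators(age, owns_house)))


def probability(total, offset=35.0, scale=10.0):
    """Logistic transform of the score; offset and scale are hypothetical."""
    return 1.0 / (1.0 + math.exp(-(total - offset) / scale))


for age, owns in [(25, False), (35, True), (50, True)]:
    total = score(age, owns)
    print(age, owns, total, round(probability(total), 3))
```

Because the score is a sum of weights times indicators, it has the same form as the linear part of a logistic regression fitted on those dummy variables.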
Tags:
-
svm tool
LIBSVM -- A Library for Support Vector Machines includes an introduction for beginners and a Python tool.
Tags: