Book on Mining Massive Datasets

http://i.stanford.edu/~ullman/mmds.html

It's a free ebook in PDF. At first, I just shared the link on Twitter without reading it. Later, after reading some chapters, I realized that the book covers algorithms for massive datasets really well. Some concepts, for example combiners (ch. 2), shingling and minhashing (ch. 3), Bloom filters (ch. 4), and association rules (ch. 6), are must-know concepts if you work with large data.
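To give a taste of one of these ideas, here is a minimal sketch of shingling and minhashing in Python. It is not code from the book: the function names, the shingle length `k=5`, the 200 hash functions, and the use of salted `blake2b` as the hash family are all assumptions made for illustration. The idea is that the fraction of positions where two minhash signatures agree approximates the Jaccard similarity of the underlying shingle sets.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of all k-character substrings (shingles) of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _hash(shingle, seed):
    """Hypothetical hash family: salt the shingle with a seed, take 64 bits of blake2b."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=200):
    """For each hash function, keep the minimum hash value over the set.

    Assumes shingle_set is non-empty (min() raises ValueError otherwise).
    """
    return [min(_hash(s, seed) for s in shingle_set) for seed in range(num_hashes)]


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def estimated_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


doc_a = shingles("the quick brown fox jumps over the lazy dog")
doc_b = shingles("the quick brown fox jumped over a lazy dog")

print("exact Jaccard:   ", jaccard(doc_a, doc_b))
print("minhash estimate:", estimated_similarity(minhash_signature(doc_a),
                                                minhash_signature(doc_b)))
```

The signatures have a fixed length no matter how long the documents are, which is what makes the trick useful on large collections: you compare short signatures instead of full shingle sets.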

Plus, the book is surprisingly easy to read. The authors explain each algorithm with many examples in plain language, unlike other data mining books that deliver concepts with tons of mathematical notation.

I highly recommend the book.
