My reviews of Machine Learning, Data Science and Statistics books
I receive questions on a daily basis about content that explains machine learning, statistics or data science. I usually learn from books, so I wanted to write a post about the resources I’ve used, some of them e-books. I’ve finished some of the books, and some are still in my queue. I’ve categorized the books by the topics they cover. Note that how well a book suits you depends on your background: you might be a stats graduate trying to learn machine learning, in which case I might suggest books on CS, or you might be a self-taught person who needs to learn the theory. Without further ado, let’s review!
Books on Machine Learning/Deep Learning
Introducing MLOps: Although this is not a technical book (there’s no code inside), it has very valuable information on how to manage machine learning processes in production. I think entry-level and junior machine learning engineers can benefit from it. You can get the free PDF of the book here: https://pages.dataiku.com/oreilly-introducing-mlops
AI and Machine Learning for Coders (Laurence Moroney): This book is more beginner-level than Géron’s book. It explains machine learning to software engineers who want to build an MVP with machine learning models. It’s particularly good for people with a software development background, but if you are advanced it may not work for you.
Introduction to Machine Learning with Python — Andreas Müller: I finished this book back when I was trying to learn ML best practices. I call it “the sklearn book”. It doesn’t cover data science much; it focuses on machine learning (mostly statistical learning) algorithms, and it’s great because it covers best practices for supervised and unsupervised learning. It doesn’t go far into the theory, but it’s still a good book for people who care less about the theory behind the algorithms and just want to train a model. There’s a small chapter on deep learning as well.
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow — Aurélien Géron: I finished the edition written for TF 1.0 and just bought the one for 2.0. It’s one of the best books I’ve read on implementation. It includes both the theory and practice of machine learning and deep learning, though it doesn’t cover the theory as much as the Deep Learning book below does. The good thing is that it has a deployment part: it covers ML on the edge and in the cloud, and has a section on reinforcement learning with TF-Agents, which most TensorFlow books don’t cover. It covers the whole TensorFlow ecosystem and I refer to this book a lot.
Deep Learning — Goodfellow, Bengio, Courville: It may be the only book (at least that I’ve seen) that delves into the theory of deep learning this deeply. There’s everything from linear algebra to GANs. It gives you the fundamentals and teaches you the advanced material, but it doesn’t cover code. I’ve read it, but it’s not an easy read, unlike Deep Learning with Python below. There is no practical part; for that, you need the book below.
Deep Learning with Python — François Chollet: I finished this book and ran all the code in it. It covers the whole Keras ecosystem. It may be one of my favorite books, as it helped me learn deep learning (I have an operations research background, so all I knew back then was statistical learning algorithms). The theory of each algorithm is explained in a simple way and paired with the code.
Artificial Intelligence: A Modern Approach — Stuart Russell & Peter Norvig: I would call this the “AI book”. I have the third edition; the fourth edition might cover machine learning more. It does have a section on machine learning and deep learning, but those chapters are really small. This book covers artificial intelligence algorithms, search algorithms, AI-based decision making and reinforcement learning. It was weird to see that it has no dedicated chapter on unsupervised learning. Still, it’s the best book on AI. I also read Tom Mitchell’s book, and compared to that one, this book makes everything easier to comprehend. It’s also relatively easy to read compared to the Deep Learning book above.
Natural Language Processing with Python — Bird, Klein & Loper: It’s a book that I refer to from time to time when I’m trying to solve NLP problems, as it covers the whole NLTK library. It tells you what your workflow should be depending on the problem, and the problems build on each other, e.g. segmentation is a problem on its own but it’s a prerequisite for POS tagging, and POS tagging is a prerequisite for relationship extraction. The book guides you through the whole process. It’s a good starting point if you’re new to natural language processing. It also helps you model a problem, which is great know-how in my opinion.
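To make that chain of prerequisites concrete, here is a minimal sketch in plain Python. It is not NLTK code and it is not taken from the book: the tiny lexicon, the function names and the single relation pattern are all made up for illustration, and a real workflow would use a trained tagger from a library such as NLTK instead.

```python
import re

# Hypothetical toy lexicon standing in for a trained POS tagger (illustration only).
TOY_LEXICON = {
    "alice": "NNP", "acme": "NNP", "bob": "NNP", "paris": "NNP",
    "works": "VBZ", "lives": "VBZ",
    "at": "IN", "in": "IN",
}


def segment_sentences(text):
    """Step 1: split raw text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Step 2: split a sentence into alphabetic word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", sentence)


def pos_tag(tokens):
    """Step 3: tag each token; words missing from the toy lexicon default to NN."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "NN")) for tok in tokens]


def extract_relations(tagged):
    """Step 4: match the pattern proper noun, verb, preposition, proper noun."""
    relations = []
    for i in range(len(tagged) - 3):
        (w1, t1), (w2, t2), (w3, t3), (w4, t4) = tagged[i:i + 4]
        if (t1, t2, t3, t4) == ("NNP", "VBZ", "IN", "NNP"):
            relations.append((w1, f"{w2} {w3}", w4))
    return relations


def run_pipeline(text):
    """Run the steps in order; each one consumes the output of the previous one."""
    relations = []
    for sentence in segment_sentences(text):
        relations.extend(extract_relations(pos_tag(tokenize(sentence))))
    return relations


if __name__ == "__main__":
    print(run_pipeline("Alice works at Acme. Bob lives in Paris."))
```

The point of the sketch is only the ordering: relation extraction cannot run without tags, and tagging cannot run without segmented, tokenized text.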
Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play — David Foster: This is one of my favorite books for making fun side projects. It explains how to generate data in various domains (image, sound, text), and covers architectures like CycleGAN, VAE and DCGAN, and even language models such as GPT-2 for text generation.
Statistics, Data Science
Python Data Science Handbook: Essential Tools for Working with Data — Jake VanderPlas: I enjoyed reading this book a lot, and it was one of the books that I read from beginning to end. It has all the steps for taking raw data with pandas and sklearn through exploratory data analysis, data preprocessing, model building and hyperparameter optimization. It’s quite comprehensive for an end-to-end data science workflow, and I’d highly recommend this book.
Naked Statistics — Charles Wheelan: This is one of my favorite books. I wish I had come across it during my bachelor’s. It’s a very entertaining and intuitive book that explains the core concepts of statistics, and it also covers a bit of machine learning. I remember finishing it in two or three days. It’s not a book from which you can learn statistics completely, though; it just gives you intuition.
The Art of Statistics (How to Learn from Data) — David Spiegelhalter: This book is very similar to Naked Statistics, but it’s more advanced. The author explains statistics in an entertaining way. Unfortunately, this is not a book you can learn statistics from (in my opinion, it’s not enough to apply statistics), but it does answer most of the statistics-related questions, like why we need it and how we can use it.
If you’d like to study statistics, I have the book below.
Probability and Statistics for Engineers — Jay L. Devore: It’s a textbook from which you can learn probability and statistics from scratch. I’m usually very skeptical of intuitions (though I like them), so I prefer to study and solve problems. I finished this book in one semester, though I passed the class with a BA (that is completely my fault :’)). It’s one of the most comprehensive books I’ve read.
The Book of Why — Judea Pearl: This book covers causality. Judea Pearl explains why causality should not be confused with correlation and how you can distinguish between the two. I read some of it but had to take a break because of my exams.
Thinking, Fast and Slow — Daniel Kahneman: Daniel Kahneman, along with Amos Tversky, is one of the two best in the cognitive science field, and he received the Nobel Prize in economics. This book explains the cognitive biases we have, and how people behave when they think fast. I heard about this book from my data science lecturer during my master’s, who said it’s good to read in order to understand people’s behavior when you want to make inferences from data.
Data Structures and Algorithms in Python — Goodrich, Tamassia, Goldwasser: I’m not a computer science graduate, so I sat down and took a data structures and algorithms class, and that’s where I met this book. The main reason I took the course is that good companies often interview you on data structures and algorithms, but later on I realized how necessary this is for thinking like an engineer. I have to admit that every time I read it, it changes my perspective on these concepts. It is both a theoretical and a practical book: it contains the theory and application of each algorithm in Python. I recommend learning DS&A with Python, as it’s an easy language, and algorithms and data structures don’t really change from language to language.
Flask Web Development: Developing Web Applications with Python — Miguel Grinberg: I loved this book because I had no idea about web development. The book covers development as a concept (e.g. databases, requests, Bootstrap, security), then walks you through how you can implement it in Flask. I still refer to it from time to time, and I recommend it if you want to build something and have no idea about development.
I loved the manga guides. They keep your attention and explain the concepts visually. I mainly read the ones for linear algebra (oh, how I wish I had read it during my bachelor’s), statistics and microprocessors. I will read the ones for databases and cryptography when I make time. I recommend this series to anyone who gets bored while studying.
I’m currently reading Deep Learning for Coders with fastai & PyTorch by Sylvain Gugger & Jeremy Howard. I will add the review soon.
Some books in queue:
- Learning TensorFlow.js by Gant Laborde
- Approaching (Almost) Any Machine Learning Problem by Abhishek Thakur
- The Most Human Human by Brian Christian