Ensemble Methods: Boosting, Bagging, and Stacking machine learning

Questions about Ensemble Methods frequently appear in data science interviews. In this video, I’ll go over various examples of ensemble learning, the advantages of boosting and bagging, how to explain stacking, and more!
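
The simplest way to see ensemble learning in action is majority voting, the aggregation rule behind bagging. A minimal sketch in plain Python (the models' predictions here are hypothetical numbers for illustration):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class predictions from several models by majority vote —
    the simplest ensemble aggregation rule, as used in bagging."""
    combined = []
    for votes in zip(*predictions):          # one tuple of votes per sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Three hypothetical models' predictions on four samples:
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
print(majority_vote([model_a, model_b, model_c]))  # → [1, 0, 1, 1]
```

Boosting and stacking combine models differently (sequentially on residuals, and via a meta-learner, respectively), but voting is the easiest entry point.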

Continue Reading...
How to Handle Categorical Data machine learning

Handling categorical data in machine learning projects is a very common topic in data science interviews. In this video, I’ll cover the difference between treating a variable as a dummy variable vs. a non-dummy variable, how you can deal with categorical features when the number of levels...
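
The dummy-variable idea can be sketched in plain Python. `dummy_encode` and its `drop_first` flag are illustrative names, not from the video; dropping the first level avoids the redundant, collinear column:

```python
def dummy_encode(values, drop_first=True):
    """One-hot encode a categorical column. With drop_first=True we emit
    k-1 dummy columns, since the k-th level is implied by the others."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    return [[1 if v == level else 0 for level in kept] for v in values]

colors = ["red", "green", "blue", "green"]
# levels are sorted as [blue, green, red]; "blue" becomes the dropped baseline
print(dummy_encode(colors))  # → [[0, 1], [1, 0], [0, 0], [1, 0]]
```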

Continue Reading...
K-means machine learning

K-Means is one of the most popular machine learning algorithms you’ll encounter in data science interviews. In this video, I’ll explain what k-means clustering is, how to select the “k” in k-means, show you how to implement k-means from scratch, and go over the main pros...
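
A from-scratch k-means fits in a few lines: assign each point to its nearest centroid, then move each centroid to the mean of its cluster, and repeat. A minimal 2-D sketch (the data and seed are arbitrary illustrations):

```python
import math, random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means on 2-D points: alternate nearest-centroid
    assignment and centroid-update steps."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # move centroid to the mean of its cluster
                centroids[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centroids

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(sorted(kmeans(pts, 2)))  # two centroids near (0, 0.5) and (10, 10.5)
```

Production implementations add smarter initialization (k-means++) and a convergence check, but the loop above is the whole algorithm.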

Continue Reading...
How to Handle Imbalanced Datasets machine learning

Imbalanced data is one of the most common machine learning problems you’ll come across in data science interviews. In this video, I cover what an imbalanced dataset is, the disadvantages it presents, and how to deal with data in which the minority class makes up only 1% of the observations.
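
One common remedy is random oversampling: duplicate minority-class rows until the classes are balanced. A minimal sketch (function name and toy data are illustrative, not from the video):

```python
import random

def oversample_minority(X, y, minority_label, seed=0):
    """Random oversampling: duplicate minority-class rows until the two
    classes are balanced — one simple fix for a 99/1 class split."""
    random.seed(seed)
    minority = [(x, t) for x, t in zip(X, y) if t == minority_label]
    majority = [(x, t) for x, t in zip(X, y) if t != minority_label]
    extra = [random.choice(minority) for _ in range(len(majority) - len(minority))]
    combined = majority + minority + extra
    Xb = [x for x, _ in combined]
    yb = [t for _, t in combined]
    return Xb, yb

X = [[0.1], [0.2], [0.3], [0.9]]
y = [0, 0, 0, 1]
Xb, yb = oversample_minority(X, y, minority_label=1)
print(yb.count(0), yb.count(1))  # → 3 3
```

Undersampling the majority class and reweighting the loss are the usual alternatives when duplicating rows risks overfitting.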

Continue Reading...
L1 and L2 Regularization machine learning

Regularization is a machine learning technique that introduces a regularization term to a model’s loss function in order to improve its generalization. In this video, I explain both L1 and L2 regularization, the main differences between the two methods, and leave you with helpful...
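
The two penalty terms themselves are one-liners; what differs is their effect (L1 drives some weights exactly to zero, L2 shrinks all weights smoothly). A sketch with an arbitrary weight vector:

```python
def l1_penalty(weights, lam):
    """L1 (lasso) term: lam * sum(|w|) — encourages sparse weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (ridge) term: lam * sum(w^2) — shrinks weights toward zero
    without zeroing them out."""
    return lam * sum(w * w for w in weights)

w = [0.5, -2.0, 0.0]
print(l1_penalty(w, 0.1))  # lam * (0.5 + 2.0 + 0) = 0.25
print(l2_penalty(w, 0.1))  # lam * (0.25 + 4.0 + 0) = 0.425
```

Either penalty is simply added to the training loss; `lam` controls how strongly the model trades fit for smaller weights.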

Continue Reading...
Random Forest machine learning

Random Forest is one of the most useful pragmatic algorithms for fast, simple, flexible predictive modeling. In this video, I dive into how Random Forest works, how you can use it to reduce variance, what makes it “random,” and the most common pros and cons associated with using this...
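
The "random" in Random Forest starts with bootstrap sampling: each tree trains on a different resample of the rows, and averaging the trees reduces variance. A sketch of that variance-reduction effect with toy numbers in place of real trees:

```python
import random, statistics

def bootstrap_sample(data, seed):
    """Draw a bootstrap sample: n rows sampled with replacement. Each tree
    in a random forest is trained on a different such sample."""
    rng = random.Random(seed)
    return [rng.choice(data) for _ in data]

# Averaging many noisy "predictors" (here, bootstrap means) has lower
# variance than any single one — the idea behind bagging in Random Forest.
data = [3, 1, 4, 1, 5, 9, 2, 6]
single = [bootstrap_sample(data, s)[0] for s in range(200)]            # one draw each
averaged = [statistics.mean(bootstrap_sample(data, s)) for s in range(200)]
print(statistics.pvariance(single) > statistics.pvariance(averaged))   # → True
```

Random Forest adds a second source of randomness on top of this: each split considers only a random subset of features, which decorrelates the trees further.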

Continue Reading...
Z-test for Proportions statistics

The z-test is a great asset to use when exploring proportions. In this video, I go over conducting both one-proportion and two-proportion tests, using loads of step-by-step examples. I’ll also share some of my top tips on when to use “pooled” vs. “unpooled” variance....
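
The two-proportion version with pooled variance is short enough to write out. A sketch with made-up conversion counts (the numbers are illustrative, not from the video):

```python
import math

def two_prop_z(success1, n1, success2, n2):
    """Two-proportion z-test using the pooled variance, which is the
    appropriate choice under the null that both groups share one true
    proportion."""
    p1, p2 = success1 / n1, success2 / n2
    pooled = (success1 + success2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# e.g. 45/200 conversions in group A vs. 30/200 in group B:
z = two_prop_z(45, 200, 30, 200)
print(round(z, 2))  # z ≈ 1.92
```

The unpooled version replaces the shared `pooled` estimate with each group's own `p*(1-p)/n` term; it is the standard choice for confidence intervals rather than hypothesis tests.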

Continue Reading...
Gradient Boosting machine learning

Questions about Gradient Boosting frequently appear in data science interviews. In this video, I cover what the Gradient Boosting method and XGBoost are, teach you how I would describe the architecture of gradient boosting, and go over some common pros and cons associated with gradient-boosted...
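
The core mechanic of gradient boosting for squared error is fitting each new weak learner to the current residuals and adding a shrunken copy to the running prediction. A bare-bones sketch where the "weak learner" is just a constant (real implementations like XGBoost use regression trees):

```python
def boost_means(y, n_rounds=3, lr=0.5):
    """Toy gradient boosting for squared error: each round fits the
    simplest possible model (the mean) to the residuals and adds a
    learning-rate-shrunken copy to the prediction."""
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        fit = sum(residuals) / len(residuals)   # stand-in weak learner
        pred = [p + lr * fit for p in pred]
    return pred

print(boost_means([2.0, 4.0, 6.0]))  # → [3.5, 3.5, 3.5], approaching the mean 4.0
```

For squared error, residuals are exactly the negative gradient of the loss, which is why this residual-fitting loop is "gradient" boosting.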

Continue Reading...
Z-test for Means statistics

The z-test is one of the most basic and commonly used hypothesis tests. There are many data science interview questions on z-tests, so in this video we’ll dive into when to use it, and how to conduct both one-sample and two-sample z-tests for means.
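
Both test statistics are one-line formulas once the standard error is written down. A sketch with illustrative numbers (a z-test assumes the population standard deviations are known; otherwise a t-test is the usual choice):

```python
import math

def one_sample_z(sample_mean, mu0, sigma, n):
    """One-sample z statistic: (x̄ - μ0) / (σ / √n)."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """Two-sample z statistic for a difference in means with known σs."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / se

# e.g. sample mean 103 vs. hypothesized 100, σ = 15, n = 25:
print(one_sample_z(103, 100, 15, 25))  # → 1.0
```

The resulting z is compared against a standard normal critical value (e.g. ±1.96 for a two-sided test at the 5% level).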

Continue Reading...
Principal Component Analysis (PCA) machine learning

Questions about Principal Component Analysis commonly appear in data science interviews. In this video, I’ll explain what principal component analysis is, how it works, the problems you would use PCA for, and the pros and cons associated with PCA.
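
In two dimensions, the first principal component has a closed form: it is the top eigenvector of the 2×2 covariance matrix. A pure-Python sketch with toy data (function name and data are illustrative):

```python
import math

def first_pc_2d(points):
    """First principal component of 2-D data: the unit eigenvector of the
    2x2 covariance matrix belonging to its largest eigenvalue."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # largest eigenvalue of [[sxx, sxy], [sxy, syy]] (closed form in 2-D)
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # eigenvector for lam; if sxy == 0 the axes themselves are the PCs
    v = (sxy, lam - sxx) if sxy != 0 else ((1.0, 0.0) if sxx >= syy else (0.0, 1.0))
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)

pts = [(0, 0), (1, 1), (2, 2), (3, 3.1)]
print(first_pc_2d(pts))  # roughly (0.70, 0.72): the data's diagonal spread
```

In higher dimensions the same idea runs through an eigendecomposition (or SVD) of the full covariance matrix, and the top components give the directions of maximum variance.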

Continue Reading...