# Python Machine Learning Immersive

Canonical URL: <https://programwithus.com/classes/python-machine-learning-nyc>

## Overview

Machine learning skills are in high demand: machine learning algorithms now drive much of the trading on Wall Street and the product recommendations at big companies like Amazon, Spotify, and Netflix.

This course begins with linear and logistic regression, the most time-tested and reliable tools for approaching a machine learning problem. It then progresses to algorithms with a very different theoretical basis, such as k-nearest neighbors, decision trees, and random forests. These bring important statistical concepts to the forefront, such as bias, variance, and overfitting. You’ll also learn how to measure the accuracy of your models and how to choose effective features and algorithms.

The course focuses on the practical skills needed to solve real-world problems with machine learning. The mathematical foundations of each algorithm are explained visually, but there is no formal math component. Entering students are expected to be comfortable writing Python programs and using the NumPy and Pandas libraries.

## What you'll learn

- Cleaning and balancing your data using the Pandas library
- Applying machine learning algorithms such as logistic regression and random forest using the scikit-learn library
- Choosing good features to use as input for your algorithms
- Properly splitting data into training, test, and cross-validation sets
- Understanding important theoretical concepts like overfitting, variance, and bias
- Evaluating the performance of your machine learning models

## Prerequisites

This course requires students to be comfortable with Python and its data science libraries (NumPy and Pandas). Students who have not worked in Python before must enroll in our [Python for Data Science Bootcamp](/classes/python-for-data-science-immersive-nyc) before taking this course.

## Curriculum

### 1. Course Kick‑off & Python Refresher

- Data science tool recap: Pandas and indexing
- Exploratory data analysis (EDA): standard deviations and uniform vs. normal distributions using NumPy/Pandas
- Hands‑on: loading CSVs, basic plotting with Matplotlib
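
As a small taste of the EDA in this module, the sketch below compares the spread of uniform and normal samples with NumPy. The seed, sample size, and variable names are illustrative assumptions, not course materials.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)

n = 100_000
uniform_samples = rng.uniform(0.0, 1.0, size=n)  # flat between 0 and 1
normal_samples = rng.normal(0.0, 1.0, size=n)    # bell curve: mean 0, std 1

uniform_std = uniform_samples.std()
normal_std = normal_samples.std()

# Share of normal samples that fall within one standard deviation of the mean.
within_one_std = np.mean(
    np.abs(normal_samples - normal_samples.mean()) <= normal_std
)

print(f"uniform std: {uniform_std:.3f}")
print(f"normal std:  {normal_std:.3f}")
print(f"normal within 1 std: {within_one_std:.1%}")
```

In theory, a uniform distribution on [0, 1] has a standard deviation of 1/√12 (about 0.289), and roughly 68% of a normal distribution lies within one standard deviation of its mean, so the printed values should land close to those figures.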

### 2. Data Visualization & Simple Linear Regression

- Crafting clear scatterplots: labels, grids, styling
- Single‑variable linear regression (attendance → concessions)
- Train‑test splitting and dealing with outliers
- Evaluating models with R²; interpreting residuals
- Extended example: car‑sales dataset, predicting price from one feature
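
The steps in this module can be sketched in a few lines of NumPy. The attendance and concessions numbers below are synthetic stand-ins generated for illustration (not the class dataset), and the 80/20 split and the `r_squared` helper are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in data: concessions revenue grows with attendance, plus noise.
attendance = rng.uniform(5_000, 40_000, size=200)
concessions = 4.0 * attendance + 10_000 + rng.normal(0, 8_000, size=200)

# Shuffle the row indices, then hold out the last 20% as a test set.
order = rng.permutation(len(attendance))
split = int(0.8 * len(order))
train_idx, test_idx = order[:split], order[split:]

# Fit y = slope * x + intercept on the training rows only.
slope, intercept = np.polyfit(attendance[train_idx], concessions[train_idx], deg=1)


def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Score the model on rows it never saw during fitting.
test_pred = slope * attendance[test_idx] + intercept
test_r2 = r_squared(concessions[test_idx], test_pred)
residuals = concessions[test_idx] - test_pred

print(f"slope={slope:.2f}, intercept={intercept:.0f}, test R^2={test_r2:.3f}")
```

An R² close to 1 on the held-out rows means the fitted line explains most of the variation in the test data; residuals that show a pattern rather than random scatter suggest a straight line is the wrong shape for the data.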

### 3. Binary Classification & Logistic Regression

- From regression to classification: why logistic vs. linear
- Implementing logistic regression on an employee “stay/leave” dataset
- Classification metrics deep dive: accuracy, precision, recall, F1 score, ROC curve
- Understanding variability: train‑test ratios, data shuffling, sample size effects
- Confusion matrix analysis
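
To show how the metrics above fall out of a confusion matrix, here is a library-free sketch. The labels are made up for illustration (1 = employee left, 0 = employee stayed), and the function names are hypothetical; in class, scikit-learn provides equivalent tools.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the four counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = employee left, 0 = employee stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

With these labels there are 3 true positives, 1 false positive, 5 true negatives, and 1 false negative, giving an accuracy of 0.8 and a precision, recall, and F1 score of 0.75 each.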

### 4. k‑Nearest Neighbors & the Iris Dataset

- Introduction to k‑NN: distance metrics, choosing k
- Dataset exploration: sepal/petal measurements, plotting clusters
- Preprocessing: label encoding categorical data, feature scaling
- Model training, hyperparameter tuning, evaluating with confusion matrix and classification report
- Brief intro to decision‑tree logic (setting up for ensembles)
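
The core of k-NN, together with the feature scaling mentioned above, can be sketched as follows. The measurements are invented for illustration and are not the real Iris data; the class labels, `standardize`, and `knn_predict` are hypothetical names for this sketch, which uses Euclidean distance and a simple majority vote.

```python
from collections import Counter

import numpy as np


def standardize(train, other):
    """Scale both arrays using the training set's per-feature mean and std."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (train - mean) / std, (other - mean) / std


def knn_predict(X_train, y_train, X_new, k=3):
    """Label each new point by majority vote among its k nearest neighbors."""
    predictions = []
    for point in X_new:
        distances = np.linalg.norm(X_train - point, axis=1)  # Euclidean distance
        nearest = np.argsort(distances)[:k]
        votes = Counter(y_train[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Invented petal length/width values in cm (NOT the real Iris measurements).
X_train = np.array([
    [1.4, 0.2], [1.5, 0.3], [1.3, 0.2],
    [4.5, 1.5], [4.7, 1.4], [4.4, 1.3],
    [5.8, 2.2], [6.0, 2.5], [5.6, 2.1],
])
y_train = ["species_a"] * 3 + ["species_b"] * 3 + ["species_c"] * 3
X_new = np.array([[1.6, 0.25], [4.6, 1.45], [5.9, 2.3]])

X_train_scaled, X_new_scaled = standardize(X_train, X_new)
predictions = knn_predict(X_train_scaled, y_train, X_new_scaled, k=3)
print(predictions)
```

Scaling matters because k-NN relies on raw distances: without it, the feature with the largest numeric range dominates the vote. The scaling statistics come from the training set only, so no information from the new points leaks into preprocessing.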

### 5. Ensemble Methods & Neural Networks

- Random forest classifiers on the Titanic dataset: feature engineering, importance scores
- Kaggle workflow: generating predictions, submitting to competition
- Neural network primer: perceptron to multilayer architectures
- Hands‑on MNIST digit classification with Keras/TensorFlow in Colab
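
Before moving to Keras/TensorFlow, it can help to see a single perceptron in isolation. The sketch below is a library-free illustration that learns the logical AND function with the classic perceptron update rule; the function names, epoch count, and learning rate of 1 (chosen so the arithmetic stays in whole numbers) are assumptions of this sketch, not course code.

```python
def predict(weights, bias, features):
    """Step activation: output 1 when the weighted sum is above zero."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1):
    """Perceptron rule: nudge the weights toward each misclassified target."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for features, target in zip(samples, labels):
            error = target - predict(weights, bias, features)  # -1, 0, or 1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Truth table for logical AND, which is linearly separable.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias = train_perceptron(and_inputs, and_labels)
learned = [predict(weights, bias, x) for x in and_inputs]
print(weights, bias, learned)
```

A single perceptron can only draw one straight decision boundary, so it handles AND but not XOR. Stacking perceptron-like units into layers is what lets multilayer networks learn such non-linear patterns.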

## Schedule

- Jun 15, 2026 – Jun 19, 2026 — NYC
- Aug 3, 2026 – Aug 7, 2026 — NYC
- Aug 30, 2026 – Oct 11, 2026 — NYC
- Sep 8, 2026 – Oct 8, 2026 — NYC
- Sep 22, 2026 – Sep 28, 2026 — NYC
- Nov 9, 2026 – Nov 13, 2026 — NYC
- Dec 7, 2026 – Dec 11, 2026 — NYC
- Dec 29, 2026 – Feb 2, 2027 — NYC
- Jan 17, 2027 – Feb 21, 2027 — NYC

## FAQ

### How is this class structured? 

This 18-hour class starts with forms of regression analysis and moves on to other widely used algorithms such as k-nearest neighbors, decision trees, and random forests. Students also learn how to determine the accuracy of a predictive model.

### How many students are in a given class?

Noble's typical class ranges from 8-12 students, but we allow up to 20 students to register for our course.

### How does this class prepare me for the job market? 

The class teaches advanced data science topics used by cutting-edge companies such as Google and Facebook. With these skills, students can build, evaluate, and reassess forecasting models on many kinds of data.

### Is there mandatory work outside of the classroom? 

Students are not required to complete any work outside of class. However, we provide students with bonus materials if they would like extra practice.

### What tangible skills do students leave with after the class? 

Students will leave able to build a model from start to finish: cleaning and balancing data, applying a learning algorithm to the data, performing a bias test, and finally evaluating the accuracy of the model.

## Pricing

**Tuition:** $1895