Web Scraping and Content Mining - Voted most interesting course in NYC
We go through the whole process of gathering, storing and analyzing data from the Web
What this course is about:
Web scraping is a method for extracting text from websites so that it can be analyzed. It is a form of content mining: you collect useful information from websites, including quotes, prices, news, company info, etc. The data is gathered directly, either by reading a website's HTML code or by visual abstraction techniques, using the Python programming language.
We start the workshop by exploring different methods of gathering data from the Web, then go through the whole process of gathering, storing and analyzing it. For our examples we use real-life financial quotes and 10-K annual reports. During the course we learn how to use numerous Python libraries: Urllib, Requests, Wget, BeautifulSoup 4.0, SSL, PDFminer3k, Twitter and others. We also learn to construct regular expression patterns to find targeted information on Web pages. As part of content mining, we build a Twitter application to search for and analyze trends, and we use Google APIs to build a Custom Search Engine.
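To give a flavor of the regular expression part, here is a minimal sketch that pulls ticker symbols and prices out of a piece of HTML. The HTML fragment, the class names and the `extract_quotes` helper are made up for illustration. In the workshop the page would be downloaded first, and the pattern would have to match the markup of the actual site.

```python
import re

# Hypothetical HTML fragment standing in for a downloaded quotes page.
SAMPLE_HTML = """
<table>
  <tr><td class="ticker">AAPL</td><td class="price">$189.25</td></tr>
  <tr><td class="ticker">MSFT</td><td class="price">$402.10</td></tr>
</table>
"""

# Capture a ticker symbol and the dollar price in the cell that follows it.
QUOTE_PATTERN = re.compile(
    r'<td class="ticker">([A-Z]+)</td>\s*<td class="price">\$([0-9]+\.[0-9]{2})</td>'
)


def extract_quotes(html):
    """Return a list of (ticker, price) tuples found in the HTML text."""
    return [(ticker, float(price)) for ticker, price in QUOTE_PATTERN.findall(html)]


quotes = extract_quotes(SAMPLE_HTML)
print(quotes)
```

A pattern like this works for small, regular snippets. For messier pages an HTML parser such as BeautifulSoup is usually the more robust choice.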
Who this program is for:
Anyone who would like to learn how to scrape and mine content from scratch, entrepreneurs who want to build web applications and write scripts for web crawlers, and everyone who wants to know what practical programming is about.
You will learn:
- How to use Urllib and Requests
- BeautifulSoup Python Library
- Regular Expression patterns
- Scrape and Store Data with CSV files
- Use APIs
- Mining Twitter Content
- Build Custom Google Search Engine
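
As a rough illustration of the "scrape and store" step, the sketch below collects headlines from an HTML fragment and writes them out as CSV. It uses only Python's standard library (`html.parser` as a stand-in for BeautifulSoup), and the sample HTML, the `HeadlineParser` class and the `headlines_to_csv` helper are assumptions made for this example, not course material.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a downloaded news page.
SAMPLE_HTML = """
<ul>
  <li class="headline">Markets close higher</li>
  <li class="headline">Annual report filed</li>
  <li class="ad">Sponsored link</li>
</ul>
"""


class HeadlineParser(HTMLParser):
    """Collect the text of every <li class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "li" and ("class", "headline") in attrs:
            self._in_headline = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_headline = False

    def handle_data(self, data):
        if self._in_headline and data.strip():
            self.headlines.append(data.strip())


def headlines_to_csv(html):
    """Parse the HTML and return the headlines as CSV text."""
    parser = HeadlineParser()
    parser.feed(html)
    buffer = io.StringIO()  # in-memory file; swap for open("out.csv", "w", newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["position", "headline"])
    for position, headline in enumerate(parser.headlines, start=1):
        writer.writerow([position, headline])
    return buffer.getvalue()


print(headlines_to_csv(SAMPLE_HTML))
```

Writing to an in-memory buffer keeps the example self-contained. Pointing the writer at a real file is a one-line change.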
Prerequisites & Preparation: