Everything you need for the Web Scraping workshop
Use pip to install all packages.
pip is a package management system used to install and manage software packages written in Python. Many packages can be found in the Python Package Index (PyPI). Python 2.7.9 and later (in the Python 2 series) and Python 3.4 and later include pip by default (as pip3 for Python 3).
For more info and installation:
urllib module for Python
urllib is a Python module for fetching URLs. You do not have to install it; it ships with Python as part of the standard library. For Python 3.6 use:
For Python 2.7 use:
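As a rough sketch of how the two versions differ, assuming the standard module names `urllib.request` (Python 3) and `urllib2` (Python 2), an import that works on either could look like the following. The `fetch` helper is a hypothetical name used only for illustration, and the `data:` URL is there so the example runs without a network connection (urllib handles `data:` URLs on Python 3.4 and later).

```python
# Sketch: import urlopen on Python 3, falling back to Python 2's urllib2.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def fetch(url):
    """Hypothetical helper: return the raw bytes served at `url`."""
    response = urlopen(url)
    try:
        return response.read()
    finally:
        response.close()


# A data: URL keeps the example offline; swap in a real http(s) URL to scrape.
body = fetch("data:text/plain,hello")
print(body)
```

On Python 3 the result is a `bytes` object, so call `.decode()` on it before treating it as text.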
Requests is an HTTP library for Python. Official documentation is here:
pip install requests
wget is a download utility for Python. Official documentation is here:
pip install wget
Beautiful Soup is a Python library for pulling data out of HTML and XML files. Official documentation is here:
pip install beautifulsoup4
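Once Beautiful Soup is installed, a minimal sketch of pulling data out of HTML might look like this. The HTML snippet is made up for illustration, and the example assumes Python's built-in `html.parser` backend so no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page.
html = """
<html><body>
  <h1>Workshop</h1>
  <a href="/page1">First</a>
  <a href="/page2">Second</a>
</body></html>
"""

# Parse with the parser bundled with Python.
soup = BeautifulSoup(html, "html.parser")

# Text of the first <h1> element.
title = soup.h1.get_text()

# The href attribute of every <a> element, in document order.
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```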
PDFminer3k is a PDF parser and analyzer. Official documentation is here:
pip install pdfminer3k
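To confirm that the installs above worked, one option is a small check script like the sketch below (Python 3.4+). The mapping from pip package names to import names is an assumption based on how these packages are usually imported (`bs4` for beautifulsoup4, `pdfminer` for pdfminer3k), and `installed` and `check_packages` are hypothetical helper names.

```python
import importlib.util

# Assumed mapping: pip package name -> top-level module name to import.
PACKAGES = {
    "requests": "requests",
    "wget": "wget",
    "beautifulsoup4": "bs4",
    "pdfminer3k": "pdfminer",
}


def installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_packages(packages=PACKAGES):
    """Map each pip package name to True/False depending on availability."""
    return {pip_name: installed(module) for pip_name, module in packages.items()}


for pip_name, ok in check_packages().items():
    print(pip_name, "OK" if ok else "missing (try: pip install " + pip_name + ")")
```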