How To Become A Python Data Analyst (Beginner’s Guide)

With the growing amount of data and growing interest in the Internet of Things, we desperately need great analysts and data scientists to make sense of the trillions of gigabytes of data online and around us.

To become a professional data scientist, one of the major requirements is the ability to analyze data using Python. Most importantly, you need skills in mathematics, probability, statistics, and programming.

Like you, many Python analysts are self-taught. That is why in this article we put together a Python data analysis syllabus that takes you from beginner analyst to intermediate analyst and on to advanced data analyst.

How To Use Our Data Analytics Skills Syllabus?

Use this recommendation list as a guideline, ticking off each dot point once you have finished the matching section in a book. Don’t just read the books: write code as you go, or combine the books with online video tutorials.

For each skill level, we recommend that you pick up a book or two. Anything more than that will be too extensive and demotivating.

Completing online courses is a more interactive and motivating way to learn Python programming, which is why we also recommend signing up for one of our FREE online Python for data scientist courses below.

Free Bonus: Click Here To Get A FREE Introduction To Python Course and learn the basics of Python 3, such as Lists, NumPy, Functions and Packages.

Affiliate Disclaimer: We sometimes use affiliate links in our content. This won’t cost you anything but it helps keep our lights on and pays our writing and developer teams. We appreciate your support!

Beginner Python Data Science Syllabus

This beginner syllabus will give you a solid foundation in the world of data science. You’ll learn how to use basic Python functions and understand foundational statistics, which helps with decision making, correct analysis of results, and effective data presentations.

Beginner Python Programming Skills

  • Python, Jupyter, Variables, Printing, Documentation
  • Integers, Floats, Booleans, Strings
  • Conditionals, for Loops
  • Functions, I/O
  • Lists, List Operations, Tuples
  • Dictionaries, Sets, List Comprehensions
  • Recursion
  • Generators, Exception Handling
  • Classes and Objects
  • pandas, matplotlib/seaborn/bokeh
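To give a feel for how several of the topics above fit together, here is a minimal sketch using only the Python standard library. The readings and the names parse_temperature and running_mean are made up for illustration; they do not come from any of the books below.

```python
def parse_temperature(raw):
    """Convert a raw string to a float, or return None if it is not a number."""
    try:
        return float(raw)
    except ValueError:
        # Exception handling: bad input is skipped instead of crashing the script.
        return None


def running_mean(values):
    """Generator that yields the mean of all values seen so far."""
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        yield total / count


# Hypothetical raw data, as it might arrive from a text file.
raw_readings = ["21.5", "22.0", "n/a", "19.5"]

# List comprehension: keep only the readings that parsed successfully.
parsed = [parse_temperature(r) for r in raw_readings]
readings = [t for t in parsed if t is not None]

# Dictionary: a small named summary of the cleaned data.
summary = {"count": len(readings), "max": max(readings), "min": min(readings)}

# Consume the generator into a list so it can be printed.
means = list(running_mean(readings))

print(readings)
print(summary)
print(means)
```

Once this feels comfortable, the same cleaning and summarising steps are what pandas automates for you on much larger datasets.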

Beginner Python Programming Books

Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming

Python Crash Course is one of the most highly recommended beginner Python books out there because it cuts through the noise, goes straight to the point, and teaches you exactly what you need to know to start programming with Python. Our advice is to skip the parts that are irrelevant to data analysis, such as game development, and focus only on the beginner Python topics.

Automate The Boring Stuff With Python: Practical Programming for Total Beginners

Automate the Boring Stuff with Python is another highly recommended beginner book and one of the most popular ones as well. It is a fun, interactive way to learn Python: it goes through the basic features of the language, then teaches you how to work with Excel files, Google Sheets, and more. This book is a must-read and is more tailored towards analytics than Python Crash Course.

Python for Everybody: Exploring Data in Python 3

Python for Everybody is my favorite of the three books listed because it goes straight to the technical aspects of Python and gives you a foundation in how programs work. Some may not like this book because its writing style may not be “user friendly” for non-technical beginners, and it is not as mainstream. I recommend this book if you would like to dive straight into Python for data analysis.

Beginner Statistics Skills

  • Introduction to Data
  • Intro to Regression
  • Categorical & Numerical Data
  • Probability Tables
  • Relative Risk
  • Correlation Analysis
  • Simple Linear Regression
  • Basics of Sampling
  • Sampling Distribution
  • Tests for Means, Proportions & Contingency Tables
  • Inferences for Correlation and Simple Linear Regression
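Two of the topics above, correlation analysis and simple linear regression, can be worked through by hand in a few lines of Python. The sketch below uses the textbook least-squares formulas and only the standard library; the study-hours data is invented for illustration.

```python
from math import sqrt
from statistics import mean


def simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def pearson_correlation(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Made-up example data: hours studied and the resulting exam score.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 55, 61, 64, 68]

slope, intercept = simple_linear_regression(hours_studied, exam_scores)
r = pearson_correlation(hours_studied, exam_scores)

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"correlation r = {r:.3f}")
```

For this invented data the fitted slope is about 4.1 points per extra hour, and the correlation is close to 1, which is what a nearly straight-line relationship looks like.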

Beginner Statistics Books

Naked Statistics: Stripping the Dread from the Data

Naked Statistics gives you foundational knowledge of statistics in a humorous and entertaining way. This book is recommended for those who are getting a foot in the door of statistics and may need a confidence or motivation boost.

Head First Statistics: A Brain-Friendly Guide

Another entertaining book is Head First Statistics. It includes plenty of simplified examples, diagrams, and other visual aids to help you understand statistics, and it covers a wide range of topics taught in first-year statistics.

Statistics for People Who (Think They) Hate Statistics

Neil J. Salkind’s book Statistics for People Who (Think They) Hate Statistics is one of the best-selling introductions to statistics, and there is a reason for that: his ability to present intimidating mathematical and statistical concepts in a humorous, personable, and informative way that reduces statistics anxiety.

Intermediate Python Data Science Syllabus

Now that you have finished the beginner Python programming books and gained a foundational understanding of statistics, you will dive deeper into retrieving, manipulating, and visualizing data using Python. You will also learn how to plan your data analysis projects.

Intermediate Python Programming Skills

  • Stats & EDA
  • Pandas & Scraping
  • EDA Viz
  • Multiple Linear Regression
  • Model Selection
  • Regularization
  • PCA
  • Logistic Regression
  • scikit-learn for machine learning
  • kNN Classification
  • Discriminant Analysis
  • Decision Trees
  • Random Forests
  • Boosting
  • Stacking
  • Support Vector Machines
  • A/B Testing 

Intermediate Python Programming Books

Machine Learning: An Algorithmic Perspective, Second Edition

One of the main issues students face when programming with Python is a lack of mathematical and statistical background. Machine Learning: An Algorithmic Perspective addresses this by providing solid background mathematics and statistics alongside the necessary programming and experimentation.

Data Mining: Practical Machine Learning Tools and Techniques

Another practical book is Data Mining: Practical Machine Learning Tools and Techniques. It starts with a basic overview of machine learning, followed by a more detailed look at how the key algorithms are implemented.

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

This book helps you gain an overall understanding of the concepts and tools for building intelligent systems. You’ll learn a range of techniques, starting with simple linear regression and progressing to deep neural networks. One of the good things about this book is that it provides exercises in each chapter to help you apply what you’ve learned. However, some may find this book difficult without solid Python programming experience.

Intermediate Data Visualization Skills

  • Design
  • Perception
  • Cognition
  • Interaction
  • Process
  • Projects
  • Exploration

Intermediate Data Visualization Books

Data Visualization: A Practical Introduction

Another programming language you may need is R. R is mainly used for statistical analysis, while Python provides a more general approach to data science. Data Visualization by Kieran Healy teaches you how to analyze and visualize data using R.

Although this is not a comprehensive book about R programming, it does teach you everything you need to know about graphing data with R.

Storytelling with Data: A Data Visualization Guide for Business Professionals

Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You will learn how to understand your audience and design visualizations that direct your target audience’s attention.

The Visual Display of Quantitative Information

The Visual Display of Quantitative Information is one of the classic books on data visualization. Although some may find this book outdated, it still contains relevant information. It covers the theory and practice of data visualization design and teaches you the most effective ways to describe, explore, and summarize data.

Advanced Python Data Science Syllabus

Congrats, you have finished nearly the entire syllabus. The skills listed here may be far too advanced for most people, but going through the advanced syllabus will give you an advantage in data analytics over most people.

Advanced Python Programming Skills

  • Smoothers & GAMs
  • Cluster Analysis
  • Anomaly Detection
  • Bayesian Statistics
  • Deep Neural Network
  • Neural Network Basics
  • Deep Feed Forward
  • Regularization
  • Optimization
  • CNNs
  • RNNs
  • Autoencoders
  • Generative Models & GANs
  • Basic Statistics and R/Python
  • Relationships and Representations, Graph Databases
  • Introduction to Spark 2.0
  • Spark 2.2 DataFrame API
  • Hadoop Distributed File System (HDFS)
  • Analysis of Streaming Data with Spark
  • Applications of Spark ML Library
  • Text processing with Python NLTK
  • Basic Neural Network and TensorFlow
  • Analysis of Images, OCR Applications
  • Analysis of Speech Signal
  • Analysis of Streaming Data
  • Time Series with TensorFlow

Advanced Python Programming Books

Foundations of Statistical Natural Language Processing

Foundations of Statistical Natural Language Processing is an extensive book that covers the theory and algorithms needed for building NLP tools. It provides broad coverage of mathematical and linguistic foundations, combined with a detailed discussion of statistical methods, allowing you to construct your own NLP project.

Think Bayes

Think Bayes combines probability with Python programming. It teaches you Bayesian statistics using Python code instead of mathematical notation.
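To show what learning Bayesian statistics through code can look like, here is a minimal sketch of a single Bayesian update over two hypotheses. It is our own illustration and is not taken from the book; the coin scenario and the function name bayes_update are assumptions made for the example.

```python
from fractions import Fraction


def bayes_update(priors, likelihoods):
    """Return the posterior probability of each hypothesis.

    priors: dict mapping hypothesis name -> prior probability
    likelihoods: dict mapping hypothesis name -> P(observed data | hypothesis)
    """
    # Multiply each prior by the likelihood of the data under that hypothesis.
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # Divide by the total so the posteriors add up to 1.
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical setup: a coin is either fair or biased towards heads (75%).
priors = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_of_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# We flip once and see heads, then flip again and see heads again.
after_one_head = bayes_update(priors, likelihood_of_heads)
after_two_heads = bayes_update(after_one_head, likelihood_of_heads)

print(after_one_head)
print(after_two_heads)
```

Using Fraction keeps the arithmetic exact, so you can check the result by hand: after one head the biased coin has probability 3/5, and after two heads it rises to 9/13.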

Programming Computer Vision with Python: Tools and Algorithms for Analyzing Images

Programming Computer Vision with Python helps you understand computer vision. You will learn techniques for object recognition, 3D reconstruction, stereo imaging, augmented reality, and more. The good thing about this book is that its code samples come with explanations and exercises for you to complete.

Deep Learning with Python

Deep Learning with Python introduces the field of deep learning using the Python language and the Keras library. It introduces you to deep learning through Python code rather than mathematical notation.

Probability Syllabus (Optional)

This probability section is optional for most of you. We rarely see probability theory used directly in analytics; however, it is helpful if you plan to do anything related to machine learning or artificial intelligence. Anything beyond that may be outside the scope for most people, as it is purely mathematical.

  • Probability and Counting
  • Story Proofs, Axioms of Probability
  • Birthday Problem, Properties of Probability
  • Conditional Probability
  • Law of Total Probability
  • Monty Hall, Simpson’s Paradox
  • Gambler’s Ruin and Random Variables
  • Random Variables and Their Distributions
  • Expectation, Indicator Random Variables
  • The Poisson Distribution
  • Discrete vs. Continuous, the Uniform Distribution
  • Normal Distribution
  • Location, Scale and LOTUS
  • Exponential Distribution
  • Moment Generating Functions
  • Joint, Conditional, and Marginal Distributions
  • Multinomial and Cauchy
  • Covariance and Correlation
  • Transformations and Convolutions
  • The Probabilistic Method
  • Beta Distribution
  • Gamma Distribution and Poisson Processes
  • Order Statistics and Conditional Expectation
  • Conditional Expectation Given
  • Inequalities
  • Law of Large Numbers and Central Limit Theorem
  • Chi-Square, Student-t, Multivariate Normal
  • Markov Chains
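Many of the topics above can be explored with a few lines of code before you tackle the maths. As one example, here is a minimal sketch of the birthday problem, assuming 365 equally likely birthdays and ignoring leap years; the function name is our own.

```python
def birthday_collision_probability(people, days=365):
    """Probability that at least two of `people` share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    probability_all_different = 1.0
    for already_taken in range(people):
        # Each new person must avoid the birthdays already taken.
        probability_all_different *= (days - already_taken) / days
    return 1.0 - probability_all_different


# Print the probability for a few group sizes.
for group_size in (10, 22, 23, 50):
    p = birthday_collision_probability(group_size)
    print(f"{group_size} people: {p:.4f}")
```

Under these assumptions, 23 people is the smallest group in which a shared birthday is more likely than not.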