A year ago, I was a numbers geek with no coding background. After trying an online programming course, I was so inspired that I enrolled in one of the best computer science programs in Canada.
Two weeks later, I realized that I could learn everything I needed through edX, Coursera, and Udacity instead. So I dropped out.
The decision was not difficult. I could learn the content I wanted faster, more efficiently, and for a fraction of the cost.
I already had a university degree and, perhaps more importantly, I already had the university experience. Paying $30K+ to go back to school seemed irresponsible.
Shortly afterwards, after realizing that data science was a better fit for me than computer science, I started creating my own data science master’s degree using online courses. I scoured the introduction-to-programming landscape. I’ve already taken several courses and audited portions of many others. I know the options, and I know what skills are needed if you’re targeting a data analyst or data scientist role.
For this guide, I spent 20+ hours trying to find every single online introduction to programming course offered as of August 2016, extracting key bits of information from their syllabi and reviews, and compiling their ratings. For this task, I turned to none other than the open source Class Central community and its database of thousands of course ratings and reviews.
Class Central’s homepage.
Since 2011, Class Central founder Dhawal Shah has kept a closer eye on online courses than arguably anyone else in the world. Dhawal personally helped me assemble this list of resources.
How we picked courses to consider
Each course had to fit four criteria:
It introduces programming and, optionally, computer science. See “A note on Programming vs. Computer Science” below.
The language of instruction is Python or R. These are by far the two most popular programming languages used in data science.
It must be an interactive online course, so no books or text-only tutorials. Codecademy’s video-less, text-editor-based courses would qualify, but strictly text tutorials like the ones from R tutorial would not. Though books are viable ways to learn programming, Python, and R, this guide focuses on courses.
It must be a decent length: at least ten hours in total for estimated completion.
How we evaluated courses
We believe we covered every notable course that fits the above criteria. Since there are seemingly hundreds of Python and R courses on Udemy, we considered only the most reviewed and highest rated ones. There is a chance we missed something, however. Please let us know if you think that is the case.
We compiled the average rating and number of reviews for each course from Class Central and other review sites, then calculated a weighted average rating per course. If a series had multiple courses (like Rice University’s Part 1 and Part 2), we calculated the weighted average rating across all courses. We also read text reviews and used this feedback to supplement the numerical ratings.
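As a rough sketch of what a weighted average rating can look like, the snippet below weights each average rating by its number of reviews. Weighting by review count is an assumption here, and the function name `weighted_average_rating` and the figures are hypothetical, chosen only to illustrate the arithmetic; they are not the real ratings of any course in this guide.

```python
def weighted_average_rating(sources):
    """Combine (average_rating, review_count) pairs into a single rating.

    Each average is weighted by its review count (an assumed scheme),
    so a source with more reviews pulls the result further toward itself.
    """
    total_reviews = sum(count for _, count in sources)
    if total_reviews == 0:
        raise ValueError("need at least one review to compute a rating")
    return sum(rating * count for rating, count in sources) / total_reviews


# Hypothetical two-part series: (average rating, number of reviews).
example_series = [(4.8, 200), (4.5, 100)]

combined = weighted_average_rating(example_series)
print(round(combined, 2))  # prints 4.7
```

Note that the result (4.7) sits closer to the 4.8 rating than a plain mean of the two averages (4.65) would, because that rating is backed by twice as many reviews.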
We made subjective syllabus judgment calls based on three factors:
Coverage of the fundamentals of programming.
Coverage of more advanced, but useful, topics in programming. (E.g. several courses choose not to cover object-oriented programming. We believe this is a key topic, though not a deal-breaker, so these courses were docked marks rather than excluded from consideration.)
Relevance of the syllabus to data science.
A note on Programming vs. Computer Science
Programming is not computer science and vice versa. There is a difference of which beginners may not be acutely aware. Borrowing this answer from Programmers Stack Exchange:
Computer science is the study of what computers [can] do; programming is the practice of making computers do things.
The course we are looking for introduces programming and optionally touches on relevant aspects of computer science that would benefit a new programmer in terms of awareness. Many of the courses considered, you’ll notice, do indeed have a computer science portion.
None of the courses, however, are strictly computer science courses, which is why something like Harvard’s CS50x on edX is excluded.
Our pick for the best programming course for data scientists is…
University of Toronto’s “Learn to Program” series on Coursera. LTP1: The Fundamentals and LTP2: Crafting Quality Code have a near-perfect weighted average rating of 4.71 out of 5 stars over 284 reviews. They also have a great mix of content difficulty and scope for the beginner data scientist.
This free, Python-based introduction to programming sets itself apart from the other 20+ courses we considered.
Part 2 of the University of Toronto’s “Learn to Program” series.
Jennifer Campbell and Paul Gries, two associate professors in the University of Toronto’s department of computer science (which is regarded as one of the best in the world), teach the series. The self-paced, self-contained Coursera courses match the material in their book, “Practical Programming: An Introduction to Computer Science Using Python 3.” LTP1 covers 40–50% of the book and LTP2 covers another 40%. The 10–20% not covered is not particularly useful for data science, which helped their case for being our pick.
Your “Learn to Program” instructors: Jennifer Campbell and Paul Gries.
The professors kindly and promptly sent me detailed course syllabi upon request, which were difficult to find online prior to the course’s official restart in September 2016.
Learn to Program: The Fundamentals (LTP1)
Timeline: 7 weeks
Estimated time commitment: 6–8 hours per week
This course provides an introduction to computer programming intended for people with no programming experience. It covers the basics of programming in Python including elementary data types (numeric types, strings, lists, dictionaries, and files), control flow, functions, objects, methods, fields, and mutability.
Installing Python, IDLE, mathematical expressions, variables, assignment statement, calling and defining functions, syntax, and semantic errors.
Strings, input/output, function reuse, function design recipe, and docstrings.
Booleans, import, namespaces, and if statements.
For loops and fancy string manipulation.
While loops, lists, and mutability.
For loops over indices, parallel lists and strings, and files.
Tuples and dictionaries.
Learn to Program: Crafting Quality Code (LTP2)
Timeline: 5 weeks
Estimated time commitment: 6–8 hours per week
You know the basics of programming in Python: elementary data types (numeric types, strings, lists, dictionaries, and files), control flow, functions, objects, methods, fields, and mutability. You need to be good at these in order to succeed in this course.
LTP: Crafting Quality Code covers the next steps: designing larger programs, testing your code so that you know it works, reading code in order to understand how efficient it is, and creating your own types.
Designing algorithms: how do you decide what to do in a function body? How do you figure out what functions to write in the first place?
Automated testing: doctest and unittest.
Analyzing code for speed — details of searching and sorting.
Creating new types: classes in Python.
Functions as arguments, default parameter values, and exceptions.
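For a feel of what a few of these LTP2 topics look like in practice, here is a minimal sketch combining a doctest, a default parameter value, and an exception. It is not taken from the course materials; the function `count_vowels` and its examples are made up for illustration.

```python
def count_vowels(text, vowels="aeiou"):
    """Return how many characters of text appear in vowels.

    The examples below double as tests that doctest can run.

    >>> count_vowels("Toronto")
    3
    >>> count_vowels("rhythm")
    0
    >>> count_vowels("Python", vowels="y")
    1
    """
    if not isinstance(text, str):
        # Fail loudly on bad input instead of returning a misleading count.
        raise TypeError("text must be a string")
    return sum(1 for ch in text.lower() if ch in vowels)
```

Saved to a file, the docstring examples can be checked with `python -m doctest -v` followed by the file name. The `vowels` parameter shows a default value: callers can omit it or override it, as the third example does.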
Associate professor Gries also provided the following commentary on the course structure: “Each module has between about 45 minutes to a bit more than an hour of video. There are in-video quiz questions, which will bring the total time spent studying the videos to perhaps 2 hours.”
These videos are generally shorter than ten minutes each.
He continued: “In addition, we have one exercise (a dozen or two or so multiple choice and short-answer questions) per module, which should take an hour or two. There are three programming assignments in LTP1, each of which might take four to eight hours of work. There are two programming assignments in LTP2 of similar size.”
He emphasized that the estimate of 6–8 hours per week is a rough guess: “Estimating time spent is incredibly student-dependent, so please take my estimates in that context. For example, someone who knows a bit of programming, perhaps in another programming language, might take half the time of someone completely new to programming. Sometimes someone will get stuck on a concept for a couple of hours, while they might breeze through on other concepts … That’s one of the reasons the self-paced format is so appealing to us.”
In total, the University of Toronto’s Learn to Program series runs an estimated 12 weeks at 6–8 hours per week, which is about standard for most online courses created by universities. If you prefer to binge-study your MOOCs, that’s 72–96 hours, which could feasibly be completed in two to three weeks, especially if you have a bit of programming experience.
Another great Python option
If you already have some familiarity with programming, and don’t mind a syllabus that has a notable skew towards games and interactive applications, I would also recommend Rice University’s An Introduction to Interactive Programming in Python (Part 1 and Part 2) on Coursera.
With 6,000+ reviews and the highest weighted average rating of 4.93/5 stars, this popular course is noted for its engaging videos, challenging quizzes, and enjoyable mini projects. Compared with our #1 pick, it’s slightly more difficult, focuses less on the fundamentals, and spends more time on topics that aren’t applicable in data science.
These courses are also part of the seven-course Principles in Computing Specialization on Coursera.
CodeSkulptor: Browser-based Python programming environment used for Rice University’s MOOCs.
The materials are self-paced and free. Access to graded materials requires purchasing the course for $79 (USD), and a paid certificate is available.
Rice University’s Coursera page.
The condensed course description and full syllabus are as follows:
“This two-part course is designed to help students with very little or no computing background learn the basics of building simple interactive applications … To make learning Python easy, we have developed a new browser-based programming environment that makes developing interactive applications in Python simple. These applications will involve windows whose contents are graphical and respond to buttons, the keyboard, and the mouse.
Recommended background: A knowledge of high school mathematics is required. While the class is designed for students with no prior programming experience, some beginning programmers have viewed the class as being fast-paced. For students interested in some light preparation prior to the start of class, we recommend a self-paced Python learning site such as codecademy.com.”
Timeline: 5 weeks
Estimated time commitment: 7–10 hours per week
Week 0 — statements, expressions, variables
Understand the structure of this class, and explore Python as a calculator.
Week 1 — functions, logic, conditionals