The One Coursera Course You Don’t Want to Miss as a To-Be Data Scientist

Coursera is one of the must-haves if you want to become a data scientist on your own using internet resources. But here is the problem: finding the right course can be daunting when you are staring down a list of 326 courses and all you did was search “data” on their website. There are a variety of programs, or “specializations”, each consisting of multiple courses. The one people have recommended to me most often is the “Data Science” Specialization from Johns Hopkins University. This is a 10-course program. You can buy individual courses instead of the whole specialization, and the price of any course you have already purchased is deducted from the price of the rest of the specialization.

Not all courses are created equal. This post is the first of a two-part series. It will give you an overview of the courses offered and help you save some time and money if you aren’t looking to do the whole program.


Overview:

Each course runs 4 weeks and has weekly turn-ins. These will usually be a quiz or a project, sometimes both. I’ll discuss each course, some of its pros and cons, what I learned, and how difficult it was. If a course is particularly noteworthy, I’ll explain why. At the end, I’ll discuss whether the specialization was worthwhile.

       

The Data Scientist’s Toolbox

– My take: Skip if you don’t want to invest more time or money

This is the introductory course to the program. The course ‘introduces’ you to a variety of programs and tools that you will use throughout the rest of the specialization, and to how the courses are taught. It’s a very high-level overview, but you’ll take a quiz on the material at the end of each week. You’ll spend about 1 – 4 hours each week on the videos and the quiz. You’ll install a number of products and set up the accounts you’ll need for the rest of the specialization’s courses. All in all, this course is pretty straightforward. On a scale of 10, I would rate the difficulty of this course at a 1 or 2.

NOTE: This course is priced at $29, versus $49 for each of the other courses.

R Programming

– My take: The One Course You Don’t Want to Miss

This course is your full introduction to R. It covers the history of R, how it came about as a language, and how to use it, from installation through to analysis. The course uses video lectures, quizzes, and multiple assignments to teach you R and give you practice with it. If this is your first time using R, this course will take you considerably longer than the first course. I would estimate 6 – 12 hours a week once you get to the assignments. Make sure you start the assignments well ahead of the due dates. I found it best to at least read through the assignments at the start of the course so I could think a bit about how I wanted to handle each one. If you haven’t done any programming before or had any exposure to R, this course can be fairly difficult (6 – 8). If you have a programming background, you’ll have to get used to R’s syntax, but it should be a bit less difficult for you (3 – 5).


Getting and Cleaning Data

–  My take: Take it if You Have Time and/or Money

This course is about how to get data into R and how to make it “clean” data. Oftentimes, you’ll encounter data that has missing pieces or is in the wrong format for your analysis. You will be taught the basics of cleaning it up and the proper ways to do it. For example, the right way to handle null values can vary based on the type of data or clientele you are working with. This course uses videos, 4 quizzes, and a project at the end of the course. It is easier than the previous “R Programming” course. You’ll need roughly 3 – 6 hours a week. The overall difficulty of this course is around 2 or 3. The course gives you a good outline of how to clean your data, but in the real world this can be far trickier!
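To make the point about null values concrete, here is a minimal sketch of two common strategies: dropping missing entries versus filling them with the mean of the observed values. Note the assumptions: the course itself works in R, while this sketch uses plain Python purely for illustration, and the `hours` data and both helper function names are made up for this example. Which strategy is appropriate depends on your data.

```python
from statistics import mean

# Hypothetical sample: weekly study hours, with None marking a missing entry.
hours = [3.0, None, 5.0, 4.0, None]


def drop_missing(values):
    """Strategy 1: discard missing entries entirely."""
    return [v for v in values if v is not None]


def fill_with_mean(values):
    """Strategy 2: replace each missing entry with the mean of the observed ones."""
    observed = drop_missing(values)
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


dropped = drop_missing(hours)   # shorter list, only real observations
filled = fill_with_mean(hours)  # same length, gaps replaced by the mean
print(dropped)
print(filled)
```

Dropping keeps only real observations but shrinks the data set. Filling keeps every row but invents values, which can hide how much was actually missing.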

Exploratory Data Analysis

–  My take: It Will Be Worth It

Exploratory data analysis goes hand-in-hand with cleaning the data. You’ll most likely take an initial look at your data before deciding how you would like to clean it. Exploratory analysis is crucial for planning how to model your data after you have cleaned it up. A good exploratory analysis can even lead to some initial insights without digging deeper into the data, and can give you ideas about the analysis you would like to do. The course itself starts with a quiz and a project due in your first week. After that there’s another quiz and a project at the end of the course. For this course you’ll need roughly 2 – 4 hours a week to do the lectures, quizzes, and projects. It’s not a difficult course at all, so I would say the difficulty is roughly a 1 or 2.
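As a rough illustration of what an “initial look” can be, the sketch below computes a handful of summary statistics for one column of numbers. This is an assumption-laden example rather than course material: it is written in Python instead of the R used in the course, and the `summarize` helper and the `scores` data are hypothetical.

```python
from statistics import mean, median


def summarize(values):
    """Return a few basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Hypothetical data: four typical values and one unusually large one.
scores = [2, 4, 4, 5, 10]
summary = summarize(scores)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean sits above the median, which is a quick hint that a large value is pulling the average up. That is the kind of early observation that helps you decide how to clean and model the data.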

Reproducible Research

–  My take: Take it if You Have Time and/or Money

Reproducible research is one of the great things about dealing with electronic data. If your data can be shared, and the process you used to create your analysis is shared too, then anyone can recreate your research and analysis. If you’re a researcher or scientist, you know how important it is to have your work validated by another set of eyes. This class is front-loaded like the “Exploratory Data Analysis” course: you’ll have a quiz and a project due in the first week, then another quiz in week 2 and a final project. I believe you’ll need about 2 – 4 hours a week for the lectures, quizzes, and projects. One of the neat things about this course is that you can validate your classmates’ work by re-running it. You’ll learn about the “knitr” package, which is a great way to publish your R code to a web page or PDF. This course is roughly the same difficulty as the exploratory analysis course, but the project is a bit harder (1 – 3).

Wrap Up of Part 1

At this point, you are halfway through the courses for the “Data Science” Specialization. These courses form the base from which you can apply R to data sets of your own and begin building your own models and analyses. The “R Programming” course is the most important of them. If you’ve had no exposure to R before, I would buy this course on its own.


In the next post, we’ll go over the remaining five courses and my takeaway from the program as a whole.

Until next week, see you later #statheads.

 

 
