Is DataCamp worth the money?

I recently left my job as a software developer to focus on transitioning into a data science role (see this post). As the first step of my transition, I am working through the courses offered by DataCamp.

DataCamp offers a range of courses in Python and R on topics including data importing, cleaning, manipulation, and visualization, as well as probability and statistics, machine learning, and finance. Within the last month, DataCamp has also created a number of course tracks based on specific skills or career paths. These tracks are very helpful. I already had a good idea of which courses I wanted to take, but the tracks laid them out in an appropriate order and automatically started the next course in the sequence.

After about 3 weeks, I’ve completed 27 courses including the Data Scientist with R career track.

The Good:

  • At $30/month you can’t beat the price, though I don’t imagine they expect many people to complete 10 courses/week. I keep seeing reminders that paying for the full year up front works out cheaper than a monthly subscription, but after my first month, I’ll have taken all the courses I’m interested in.
  • Each course has a number of video segments, with exercises interspersed. There is no need to install R or Python to get started; everything runs through your web browser. This makes it easy to focus on understanding the underlying concepts rather than worrying about whether all the required packages are loaded.
  • The courses are consistently good, and I feel like I learned a lot in most of them.
  • The skill and career tracks simplify the task of choosing what to do next. When I first started, I read each course description, made a list, and tried to decide the best order in which to take the courses. When the career tracks came along, I was able to enroll and get through my courses in an appropriate order.

The (not so) Bad:

  • Everything is done in the browser. At the end of the course, I don’t have any working examples to refer back to. Also, to simplify the exercises, each one builds on the last, so I have seldom seen all the code for one task collected together on the screen.
  • The DataCamp platform provides excellent feedback, including guidance matching the errors found in your code. The drawback is that you must code the exercise in exactly the same way the course creator did. After taking some of the more advanced programming courses, I was frustrated by the way some exercises were presented: I knew there was a better way to do them, but I was unable to practice what I had already learned.
  • Many of the exercises are reduced to “fill in the blank.” I would like to do more of the typing myself, as I find this helps me better remember what I’ve learnt.

At this point, I’m very happy with my experience. While I don’t think I could consider myself a Data Scientist, I’ve gotten an introduction to many topics and have at least a vague sense of how to start a project of my own. Other people might disagree and feel completely qualified to call themselves data scientists after completing the career track, but I’m a mathematician by training, and I don’t think I understand something until I know all the details about the algorithm and can implement it myself (perhaps that’s a topic for another post). None of the DataCamp courses go into this level of detail, nor do they promise to.

As with everything in life, what you get out depends on what you put in. It would be very easy to get through the DataCamp courses without learning anything. I focused on understanding the concepts since I can always look up the syntax as needed.

Have you taken any online courses? What was your experience?

Advice, insight, etc.

Some of the things I’ve read lately:

Did you do an integral today?

When I was at UBC, I lived in a graduate student residence. There was an engineering student who would ask me regularly if I had done an integral that day. While I suspect it was a bit of a joke, I was never quite sure how to respond other than, “well, no, that’s not really what I do.”

This interaction has stuck with me, and while you can find many write-ups about the misconceptions people have about math (see here for example), I know this phenomenon exists in every field. My sister is a librarian, and I was shocked to learn that there’s more to her job than scowling at people from behind her desk and saying “shh.”

What are the misconceptions people have about what you do?

Advice, inspiration, etc.

I’ve recently come across a number of blog posts, articles, etc. that I’ve found useful. As the first in an ongoing series of posts, I’ll share the links that have captured my attention.

  1. Chad Bryant has an excellent series: So You Want to be a Data Scientist Part 1, Part 2, Part 3. The suggestion to find an area of interest and become a subject matter expert really resonated with me.
  2. The Road to Data Science, by Joel Grus: a great set of slides about becoming a data scientist.
  3. How to get your first job in Data Science, by Tomi Mester: some good advice regarding the most important skills, how to develop them, and encouragement to work on “pet projects.”
  4. How to be an “idea machine”, by James Altucher: I’m trying to do this, but still get stuck trying to have “good” ideas…

Becoming a Data Scientist

After working as a software developer for a few years, I’m ready for something different. I’ve explored a few different options and have decided to pursue a career in data science.

How will I make this switch? Following the advice of Marc Miller, I’ve talked to a number of people to find out what a data scientist does, the skills needed, and how to get a data science job.

The term data scientist is not well defined and means different things to different people (Chad Bryant has an excellent summary here), but common advice is to create a portfolio to demonstrate what you can do. Ideally this will contain a variety of projects showcasing your expertise. I’m currently working towards developing this portfolio.

Step 1 is to gain some basic skills, and I’ve decided to begin by working through the many courses at DataCamp. DataCamp offers courses in Python and R (the two main languages used for data science), and since I am already proficient in Python, I’m tackling the R courses. I’m hoping that through these courses I’ll gain an overview of the basic concepts and have the tools to begin building my portfolio.

Step 2 in my plan is to work through more in-depth courses. I haven’t decided which provider to use yet, but there seem to be many options (Udacity, edX, Udemy, etc.). These courses dig deeper into the material and also provide projects that can be included in a portfolio.

Step 3 is to go it alone: find a data set and see what I can do. This step could also include Kaggle competitions, as these tend to have nicer data sets and clearly defined questions that could make a first project more manageable.

Through all of this, I’m continuing to meet people and learn about what they do. Do you work as a data scientist? I’d love the opportunity to talk to you.