I am an electronics undergraduate by training. Back then I had little, or to put it modestly, no idea what I really wanted to do. I romanticized the notion that I would be good at anything I chose to do. Studying electronics was considered glamorous at the time, so I did it. I saw loads of people being recruited by the IT giants, and obviously I did not want to be part of the herd! Ironically, I ended up joining one of them as well. My job was good, and I was actually training with some of the best minds in my company (CTS), but I was simply too snobbish and blind to see this (I got the proper perspective much later!). I blindly believed I was destined for greater (more sophisticated) jobs. To cut a long story short, after many pitfalls I realized that what I loved doing the most was writing computer programs!
At this point, I knew that I did not have the necessary math skills to excel in the field of computer science. As soon as I realized what I really wanted to do, I started finding out how and where to begin. I cannot help but express my gratitude to institutes like IIT, MIT, Berkeley, Stanford, etc., for opening their courses online. I remember I was one of the first to register for the courses offered by Coursera. I was so excited when I watched the materials that, for the first time in my life, I regretted not putting in the effort to get into premier institutes like the IITs and MIT.
Even though I thought it was too late, I realized there is no better time than now. I am here to learn. I am going to pursue that which piques my interest. One particular course, Machine Learning, taught by Prof. Andrew Ng, was so fascinating to me that I decided then to pursue my career in this field. I had finally found my passion - Machine Learning and Data Analysis. This is a vast field in which no single person can be an expert in everything; much is left to the creativity of the student. The following picture, composed by Drew Conway, illustrates this point.
Essentials of Data Science
I made a list of things I realized I needed to learn to become a Data Science expert. The list below is based on my findings. Any suggestions that contradict or add to it are quite welcome in the comments section.
- Fundamental math
  - Probability and Statistics (more Statistics, since computers can perform only numerical calculations)
  - Calculus (for understanding and designing algorithms used in computer vision, data mining, and numerous other applications)
  - Linear Algebra (this is quite essential for anyone who wants to make serious inroads into the field of machine learning)
- Programming language
  - Mastery of a scripting language such as R or Matlab and an object-oriented language such as Java or C++ is a must. There are many libraries in these languages for performing a lot of machine learning tasks (I'll be discussing more in my upcoming blogs).
- Computer Science Engineering
  - Analysis and design of algorithms (how to program to perfection)
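As a small illustration of how the items above combine, here is a minimal sketch in Python that fits a straight line to a few points in two ways: by gradient descent (the calculus part) and by the ordinary least squares formula (the statistics part). The function names `fit_line` and `ols_line`, the learning rate, the step count, and the toy data are all made up for this example; treat it as a sketch under those assumptions, not as a recommended implementation.

```python
# Minimal sketch: fit y = w*x + b to toy data in two ways.
# All names, parameters, and data are illustrative assumptions.


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit w and b by gradient descent on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Error of the current line at every data point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def ols_line(xs, ys):
    """Fit w and b with the closed-form least squares solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4]
    ys = [1, 3, 5, 7, 9]  # toy data lying on a straight line
    print("gradient descent:", fit_line(xs, ys))
    print("least squares:   ", ols_line(xs, ys))
```

On data this small both routes should agree closely. The learning rate is the fragile part of the gradient descent version: a value that is too large for the data makes the updates diverge instead of converge.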
Of course, C++ and Java need not be 'the' way. A scripting language like R or Python is best suited for quick analysis. In fact, for almost any new concept published today, an R library is released almost in parallel.
My point is that we should know enough programming to be able to implement our own custom versions of popular algorithms, to suit our needs.
About Kaggle: yes, it's a great platform for testing one's skills. Hopefully, in the near future, I will start contributing solutions. :-)