Data Science in the Real World - A Primer
My Journey with Python
print("Hello World")
Well, that's the very first thing you learn in Python 3.
Of course, people who have studied Java will, no doubt, recognize these lines too, which produce the same output:
public class HelloWorld {
    public static void main(String[] args) {
        // Prints the same text as the Python one-liner above
        System.out.println("Hello World");
    }
}
It is probably the easiest example one can think of when talking about Python. Utter simplicity that is endearing to the average Joe learning the language. For some of us, it is this simplicity that has prodded us to continue this journey. Almost 4 years now. There's more than meets the eye, quite honestly, but that's a discussion for another day.
As for my desire to learn Python, and pursue another degree in Computer Science, the motivation is two-fold, really. Technology is the only field that pays a writer well in my country, while Python, in particular, is the only programming language that I fancy these days - since you can write a program in something close to plain and simple English.
(About the choice of profession, it's a question that will come up every time the bills are due or food has to be put on the table.)
And given that I'm not getting any younger, gambling on careers that don't pay in this country is completely out of the question. Any idiot will understand that... especially if you're Indian!
So Why Data Science?
It happened by accident, really. As most good things do. I do believe that learning, much like one's choice of work, is an ongoing thing. Fueled by a curiosity quotient, if you will. Data Science, as far as I'm concerned, is much like that.
In my curiosity to learn more about Python, I bought a number of courses on Udemy, and as I completed them, I chanced upon "data scraping" through the use of Python and tools like Data Miner & Import.io.
The more I experimented with these tools while running scripts that scraped documents and websites, the more curious I became about data science.
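For a sense of what such a script can look like, here is a minimal sketch that uses only Python's standard library to pull the links out of a page. The sample page, the LinkCollector class and the extract_links helper are all made up for illustration; none of them come from the courses or tools mentioned above, and a real scraper would also need to fetch the page and respect the site's terms.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Made-up sample page, standing in for a downloaded document
SAMPLE_PAGE = """
<html><body>
  <h1>Meetups</h1>
  <a href="/events/python-101">Python 101</a>
  <a href="/events/data-cleaning">Data cleaning</a>
  <a>No link here</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE):
        print(link)
```

Tools like the ones above do this sort of extraction through a point-and-click interface; the script just makes the underlying idea visible.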
Of course, most aspiring data scientists would focus on the tools at first, which is really the first stage of this journey. Yet when I met a BangPypers mentor, he shared with me his experience of what the field is all about - from a macro perspective.
Speaking of which, the idea of working with Excel (a skill that I've hardly used until now) seemed far more relaxing compared to just being a writer. Being an instructional designer in the past required one to master Microsoft Office at least at the intermediate level.
And we did - completely on our own. No certifications. Nothing. That gets you nowhere in life, which is why I place just as much value on certifications: they verify one's learning, so nobody has to just take your word for it!
So, one thing led to another, and given that it seemed rather exciting, I decided to pay the BangPypers community a visit just recently.
And so, this is what I learned, in a lengthy discussion:
A Master Class in Data Science, If You Will
Every aspiring Data Scientist has their own preconceived notions about this multi-disciplinary field. We’re human, after all!
We are at our worst and our best thanks to our imperfections, and no other field brings these aspects out as clearly as data science does: the use of our intuition, the comprehension of context and culture, history and mathematics, business and the scientific method.
To be honest, and until now, my approach has been to focus only on the technical aspect of data science. Very one-dimensional and academic, to say the least. Anything but the bigger picture. Yet an important first step before moving on to anything else.
Yes, there’s great value to academics, but experience in the real world counts for just as much if not more - especially in this field. There’s great truth in that, and it’s one that you can’t shirk.
In fact, there are certain truths about this all-encompassing field that can hardly be learned in the classroom. It would seem like rocket science otherwise, if you will.
Data Science In The Real World: First Learnings
So, as I discovered today, it was a “masterclass” where the focus had nothing to do with Python, Excel, Statistics or even methods related to capturing data itself. In fact, it was all just “thinking about thinking”!
After all, what we observe as we sift through data - the patterns and their underlying interpretations, if you will - carry so much weight that it is impossible to ignore the fact that asking the right questions is where it all begins.
No, it has never been about looking for the right answers. Instead, it has been about understanding what these man-made problems really are, in the first place.
Of course, there’s no doubt that it’s all about data - sometimes, the patterns tend to reveal what people assume to be so far-fetched. It’s no wonder - truth is often stranger than fiction!
The phrase “being all things to all people” comes to mind, since the job involves performing the technical tasks that generate the data you require, but also being able to find ‘meaning’ or a ‘narrative’ as you sift through that data. Yet most of all, it’s being able to convey a message that enables a business to fix, overhaul or even tinker with processes so as to maximize performance.
What you also have to keep in mind is that you’d be a “messenger of a new world” to powerful individuals and that can come with unsavory repercussions as well.
Yes, data science is all about teamwork. About communication. About formulating the perfect hypothesis and then setting about to prove it with hard facts. About everything that has nothing to do with technology. Being a versatile lifelong learner is truly a must, coupled with the ability to understand people both in the collective sense and as individuals.
Last but not least, being a data scientist, much like anything else these days, is all about integrity. Down to the last data point. If you don’t meet this condition, the rest won’t matter in the real world.
Class dismissed!
In Closing
Most people in the business would know that these thoughts aren’t anything new. However, to the individual in the classroom, this won't mean much, thanks to the lack of context and a mentor.
As for those who do have a curiosity of the non-judgmental kind, this career path, despite being strewn with numerous highs and lows, will be a natural one. Da Vinci, with his unquenchable thirst for learning, would have been a natural for this profession.
Keeping this in mind, since technology has made all of the legwork simple enough, one can leave the collection and analysis of data sets to technology (Data Miner, Import.io and dgit, among several others), and that should enable the true data scientist to stay focused on the real challenges of data science.
Simple enough?