It is me again 😀 Recently I kind of “discovered” SQL. It came as quite a shock to me that I had never ever read about how important SQL is for an aspiring data scientist. So in this blog post, I am going to raise some serious awareness for my new bestie SQL and also show you how to get started with this query language.
What is SQL?
As a data scientist or analyst (as you have read in one of my earlier posts, the terms are not precisely defined) you are “obviously” going to work with data, duh. When you scroll through the web looking for examples and cases, this data is mostly provided for you in the form of a finished Excel sheet or text file, which you might have to enrich. Or at least this is what I have seen whenever I researched something.
Mr shy guy
In daily life, data is not mysteriously born in an Excel file. Somebody has to pull it out of a database and its tables. This is where SQL comes in handy. SQL stands for Structured Query Language. I myself decided to call it the “secret” or “shy” query language because nobody ever mentions this little guy. I used to scroll through job openings for data scientists and analysts to find out what they require. This is how I found out about Python being really good for data science, and about Tableau and d3.js. But not a single one of them mentioned SQL. So I guess it really just is a well-kept secret.
Which is funny, because how would any data person work without data? Okay, I admit it, I have worked with data before and I actually did even use SQL, although I was not really aware of it. The reason why? I did not “really” write much SQL. Instead, I had an interface with nice drag and drop options, which generated the SQL code for me in the background. And on the occasions when I did actually write something in SQL, I just did what Google told me to do without really knowing what I was using. Yeah, you can call that dumb or stupid… I think if you have a proper IT background this would not have happened to you… but I don’t… soooo here we are…
The above said, SQL is really, really powerful and tricky. I do have to learn a ton about this language! Which is why the following part is to come 😉
How to get to know SQL
But enough of me rambling on… now that you are absolutely convinced that you want to learn SQL, I will tell you where I read up…
On a page called tutorialspoint, I found a nice (and short) description of relational databases and data types, which will be helpful to better understand SQL. You can read it yourself here. They also give you some information on databases and the language SQL itself. You can click from one article to the next and I feel like it is well structured. Some topics might not be explained in much depth, but at least you then know what to type into Google 😉
All time favourite
Learning by watching ++
Usually, I am not a big fan of tutorials in the form of videos. But Khan Academy does have a nice series of videos. You see the screen and can listen and follow the code at the same time. Then you get to do an exercise after each lesson. They thought of some really good exercises. As soon as you start typing your SQL code you see a preview of the result. This way you can really see what each step does. Also, you get hints whenever you start to type something wrong or forget something. Go check it out on this link. They structured their videos well and divided them into basic and more advanced SQL queries.
“Übung macht den Meister”
In German, we have the saying, “Übung macht den Meister” (“practice makes perfect”), which means only by using SQL will you become a true Query Yoda. But this goes for nearly everything in life. So if you ever find yourself in a situation where a tool can magically generate a SQL query for you in the background or you can type the SQL yourself, ALWAYS (yes, capital letters) go for the type-your-query-yourself option! You can still use the magical way if you are not sure and want to check the query you wrote yourself.
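If you want something to practise on right away, here is a minimal sketch. It assumes Python with its built-in sqlite3 module, and the tables `customers` and `orders` (and everything in them) are made up by me purely for illustration. The idea is to have a tiny throwaway database where you can type a query with a join and a filter yourself and look at what comes back.

```python
import sqlite3


def run_practice_query():
    # In-memory database: nothing is written to disk.
    # Table names and rows are hypothetical practice data.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Anna', 'Berlin'), (2, 'Ben', 'Hamburg');
        INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.5), (3, 2, 12.0);
    """)

    # The hand-typed query: join both tables, keep only Berlin customers,
    # and add up the order amounts per customer.
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE c.city = 'Berlin'
        GROUP BY c.name
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_practice_query())
```

Try changing the city in the WHERE clause or removing the GROUP BY line and see how the result changes. That is exactly the kind of fiddling that made things click for me.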
The magic way cheats – kind of
I used to work a little with SAS (which is a statistical analysis tool), where you could use a wizard to get your data and do all the select, when, join etc. stuff with drag and drop. Then SAS would generate its own kind of SQL in the background. Weirdly, the generated code and the code I wrote looked (slightly) different and I never really felt like I learnt a lot from the “code” SAS generated. Well, this might have been because I did not really know what to look for, but I also often thought, “hell, how did this work?”.
What I am saying is, if you are using one of these magical tools, do check whether you can really learn from the generated code. I know that it works quite well with the Macro Recorder in Microsoft Excel (my little cheat trick ;)) but for SAS I cannot recommend it. I do not know any other statistical tool (yet) which also queries data for you. Do you know a tool where the generated code is suited for beginners to learn SQL? Let me/us know down below in the comments. (Thanks in advance!)
SQL might not always be exactly the same for every database system. Ask your colleagues (or the internet) whether certain functions or methods do work for your system or environment.
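To show what such a difference can look like, here is a small sketch, again assuming Python's built-in sqlite3 module and a made-up `orders` table. Asking for “only the first two rows” is written with LIMIT in SQLite (MySQL and PostgreSQL use it too), while SQL Server uses TOP. I only ran this against SQLite; the SQL Server variant is included as a plain string to show that SQLite does not accept it.

```python
import sqlite3

# The same question ("the two highest amounts") in two dialects.
# The dictionary keys are just labels chosen for this example.
TOP_TWO = {
    "sqlite": "SELECT amount FROM orders ORDER BY amount DESC LIMIT 2",
    "sqlserver": "SELECT TOP 2 amount FROM orders ORDER BY amount DESC",
}


def try_query(conn, sql):
    """Return the rows, or None if this database rejects the syntax."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError:
        return None


def demo():
    # Hypothetical practice table in a throwaway in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?)", [(20.0,), (35.5,), (12.0,)])
    results = {name: try_query(conn, sql) for name, sql in TOP_TWO.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in demo().items():
        print(name, "->", rows)
```

So if a query you found online throws a syntax error, it is worth checking which database system it was written for before you doubt yourself.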
Thank you so much for reading!
P.S. If you made it this far, congratulations! I feel like this is the longest post ever haha