So you’ve read news articles about AI being used more and more in medicine, industry and financial markets. AI is even making its way into your home by filtering your mail for spam, and its use only increases with time, from automated cars to smart systems in and around the house.
But what if you want to get into this new field of computer science? Where do you start? How do you start? What if you studied computer science over 20 years ago, like I did, and want a solid understanding of the programming languages, libraries and datasets you need to use? There is a plethora of study courses out there on platforms such as Udemy and YouTube, and of course Google. But if, like me, you’re strapped for cash and want to learn the basics, this article may help you get started.
There are four “pillars” of AI which one needs a sound understanding of:
- Python programming language
- TensorFlow library and/or PyTorch library
- Mathematics behind the AI functions used
- Datasets to apply the AI algorithms to
Let me start with the first: the Python programming language. If you’re already from a programming background such as Java, C++, C# or, like me, VBA, then learning Python is quite straightforward. Python is a dynamically typed, platform-independent, object-oriented language. Dynamically typed means that variables are not declared with a type upfront (contrary to Java, C++ and C#, where a variable MUST be declared before being used); a variable simply takes the type of whatever value is assigned to it. One of my favourite learning sites to this day is w3schools.com. Under the Programming section you can find ‘Learn Python’, which will walk you through everything you need to know about the Python programming language. The ‘Python Exercises’ at the end of each section will help solidify your understanding, and the final ‘Python Quiz’ will test your knowledge. NOTE: This site has been updated to include ‘Machine Learning’ too. It also serves as a good reference for those like myself who constantly forget the syntax and need a recap.
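To see what not declaring variables upfront looks like in practice, here is a minimal sketch. The helper names `type_name` and `add_or_none` are made up for this illustration and are not part of Python itself.

```python
def type_name(obj):
    """Return the name of the type Python has attached to a value."""
    return type(obj).__name__


# No declaration needed: the name 'value' takes the type of what it holds.
value = 42
first = type_name(value)

# The same name can later be rebound to a value of a different type.
value = "forty-two"
second = type_name(value)

print(first, second)


def add_or_none(a, b):
    """Try a + b; return None if Python refuses to mix the two types."""
    try:
        return a + b
    except TypeError:
        return None


# Python still checks types at run time: text plus number is rejected.
mixed = add_or_none("1", 1)
print(mixed)
```

Coming from Java or C#, the main adjustment is that type errors like the last one show up when the line runs rather than when the code is compiled.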
Next up are Google’s machine learning library, TensorFlow, and Facebook’s PyTorch. The majority of the machine learning examples you will come across make use of these Python libraries, with most using the former. DeepMind, which was established in 2010, was bought by Google (now part of Alphabet) in 2014 for a reported $600 million or so, and Google released TensorFlow, developed by its Google Brain team, as open source in late 2015. Now most people who, like me, are new to AI won’t have a clue what it does or how to use it. The TensorFlow API looks overwhelming and, at first, intimidating. This is when I started trawling YouTube for good tutorials on AI and came across sentdex’s channel: https://www.youtube.com/user/sentdex. His name is Harrison Kinsley and he also has a comprehensive website dedicated to learning AI: https://pythonprogramming.net/machine-learning-tutorial-python-introduction/. Starting with regression and classification algorithms (linear regression, K Nearest Neighbours and SVM), it progresses to neural networks and deep learning with TensorFlow. Admittedly, this takes a lot of time to go through systematically, but each chapter is in depth and Harrison provides a walk-along YouTube video too.
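To give a feel for one of the algorithms named above, K Nearest Neighbours can be sketched in plain Python with no ML library at all. This is a simplified illustration, not the tutorial's own code: the function name `knn_predict`, the toy points and the labels are all made up, and it assumes Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbours.

    train is a list of (features, label) pairs, where features is a
    tuple of numbers with the same length as new_point.
    """
    if k > len(train):
        raise ValueError("k cannot exceed the number of training points")
    # Euclidean distance from the new point to every training point.
    distances = sorted(
        (math.dist(features, new_point), label) for features, label in train
    )
    # Count the labels of the k closest points and return the most common.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up 2D points in two groups, for illustration only.
train = [
    ((1, 2), "group_a"), ((2, 3), "group_a"), ((3, 1), "group_a"),
    ((6, 5), "group_b"), ((7, 7), "group_b"), ((8, 6), "group_b"),
]

print(knn_predict(train, (5, 7)))
print(knn_predict(train, (2, 2)))
```

An odd `k` is a common choice with two classes because it avoids tied votes. Libraries do the same job faster and with more options, but the idea is no more than "measure distances, then vote".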
For all the newbies out there, I’d create a Google account if you don’t already have one and add Colaboratory (Colab) to your Google Drive. This will allow you to use a Jupyter notebook with your work saved in your Google Drive, all for free! Since graphics cards are notoriously expensive, it helps that the Colab environment lets you use them for free too, which is perfect for training your deep learning model. The only downside I can think of is the 15GB free storage allowance Google gives, so your datasets would have to fit within this on your Google Drive.
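Once you are inside a Colab notebook, your Google Drive has to be mounted before your saved files and datasets can be read. A minimal sketch follows; `drive.mount` is the call provided by the `google.colab` package, while the wrapper `mount_drive` is a made-up helper that simply does nothing when the code is run outside Colab.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; otherwise return None."""
    try:
        # This package is only available inside a Colab notebook.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return None
    # Colab asks for permission to access your Drive at this point.
    drive.mount(mount_point)
    return mount_point


if __name__ == "__main__":
    mount_drive()
```

After mounting, the contents of your Drive typically appear under `/content/drive/MyDrive`, so a dataset stored in a folder there can be opened like any local file.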
As for the mathematics behind the AI functions, your initial approach should be to treat them as a “black box”, that is, to just use the functions given and see the outcome of training your model. Once the code is written and working, or semi-working, you can go back and read the relevant API documentation to get a better understanding and “lift the lid” on the workings of each function. Note that this could take considerable time, especially if you’re not from a mathematical background. For me, the best approach has been to take the functions at face value just to get the model trained, then later ‘put more meat on the bone’ by going through the code, commenting the functions and referencing the API documentation.
Finally, the datasets you meet when learning AI are given in a “clean and tidy” form. That means they don’t require much cleansing and sifting before they can be used. The datasets bundled with Keras and the repository site Kaggle are good places to start, so that you can familiarise yourself with the look and feel of the data. Sentdex’s Machine Learning tutorials make use of the Wisconsin breast-cancer dataset and the Titanic dataset for classification algorithms. The IMDB review dataset is also used for deep learning.
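Even a tidy dataset usually needs a little preparation before it goes into an algorithm. The sketch below shows the kind of light cleansing involved, using only the standard library. Everything in it is an assumption made for illustration: the rows are invented and not taken from any real dataset, the column names are hypothetical, and the `?` marker for a missing value is just one convention you may come across.

```python
import csv
import io

# Invented rows, for illustration only; NOT from any real dataset.
RAW_CSV = """id,thickness,cell_size,label
1001,5,1,benign
1002,?,4,malignant
1003,3,1,benign
"""

MISSING = "?"  # assumed marker for a missing value


def load_clean(text):
    """Split CSV text into numeric feature rows and their labels."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        row.pop("id")  # an identifier tells the algorithm nothing useful
        label = row.pop("label")
        if MISSING in row.values():
            continue  # simplest option: skip rows with missing values
        features.append([int(v) for v in row.values()])
        labels.append(label)
    return features, labels


features, labels = load_clean(RAW_CSV)
print(features)
print(labels)
```

Skipping incomplete rows is the bluntest option; with real data you would more often use a library such as pandas and decide case by case whether to drop or fill the gaps.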
In conclusion, the approach that has worked best for me is to first familiarise yourself with Python on w3schools.com. Create folders and subfolders in your Google Drive and write your Python scripts in Google Colab so they can be accessed from any location on any computer. Follow sentdex’s excellent machine learning tutorials so you become adept with the ML libraries and various Python functions. This should give you a good starting point into the world of machine learning and deep learning.