Data Structures for Deep Learning

For the rest of the afternoon, you will study data structures that are useful for deep learning programs.

This diagram contains a small mistake. By the end of this page, you should be able to spot it.

1) Learning Objectives:

  1. Full understanding of scalars, vectors and matrices.
  2. Understanding vectorization and broadcasting in NumPy.
  3. Understanding the tensor data structure in TensorFlow.
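To make the terms in these objectives concrete, here is a minimal NumPy sketch of a scalar, a vector and a matrix, followed by one broadcasting and one vectorized operation. The variable names and values are made up for illustration.

```python
import numpy as np

scalar = np.array(5.0)                  # 0 dimensions: a single number
vector = np.array([1.0, 2.0, 3.0])      # 1 dimension, shape (3,)
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])    # 2 dimensions, shape (2, 3)

# Number of dimensions of each object
print(scalar.ndim, vector.ndim, matrix.ndim)

# Broadcasting: the vector of shape (3,) is reused for every row of the (2, 3) matrix
summed = matrix + vector
print(summed)

# Vectorization: one expression multiplies every element, with no explicit Python loop
scaled = matrix * scalar
print(scaled)
```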

2) Recap:

2a Revisit the foundation course of NumPy

2b Revisit the foundation course on Vectorization and Broadcasting

2c Warm-up Assignment

Create two NumPy arrays, each with 1 million elements, and populate them with random values between 0 and 1. Multiply the two arrays element by element, first using a for-loop and then using vectorized code. Measure the computation time of each approach in milliseconds and compare the two.

Try to do the assignment on your own by researching how to write such a piece of code in Python. As a hint, you can follow these steps:

  • Import the NumPy library.
  • Import a library that gives you the time in milliseconds.
  • Declare two NumPy arrays with the given size and random values (see the NumPy random reference).
  • Store the current time in milliseconds.
  • Multiply the two arrays using vectorized code.
  • Store the current time after the multiplication in milliseconds.
  • Subtract the "before" time from the "after" time to get the computation time of the multiplication.
  • Repeat the previous 4 steps, but this time multiply using a for-loop (the non-vectorized implementation).
  • Compare the computation times of both approaches.
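Once you have made your own attempt, you can compare it with the sketch below, which follows the steps above. It is one possible solution, not the only one. It assumes element-wise multiplication, uses `time.perf_counter` for timing and `np.random.default_rng` for the random values, and the function name `time_multiplication` is made up for illustration.

```python
import time
import numpy as np


def time_multiplication(size=1_000_000, seed=0):
    """Return (vectorized_ms, loop_ms) for an element-wise product of two random arrays."""
    rng = np.random.default_rng(seed)
    a = rng.random(size)  # uniform random values in [0, 1)
    b = rng.random(size)

    # Vectorized multiplication
    start = time.perf_counter()
    vectorized = a * b
    vectorized_ms = (time.perf_counter() - start) * 1000

    # Non-vectorized multiplication with an explicit for-loop
    looped = np.empty(size)
    start = time.perf_counter()
    for i in range(size):
        looped[i] = a[i] * b[i]
    loop_ms = (time.perf_counter() - start) * 1000

    # Both approaches should give the same result
    assert np.allclose(vectorized, looped)
    return vectorized_ms, loop_ms


vectorized_ms, loop_ms = time_multiplication()
print(f"vectorized: {vectorized_ms:.2f} ms")
print(f"for-loop:   {loop_ms:.2f} ms")
```

The exact timings depend on your machine, so run the code a few times before drawing conclusions.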

2d Codecademy - Here I Come

2) Intro to TensorFlow:

We will be using TensorFlow for our deep learning project. Here are some key points to know before we start with tensors.

  • TensorFlow is an open-source machine learning framework developed by Google. It is used in conjunction with Python to implement algorithms, deep learning applications, and much more.

  • TensorFlow has optimization techniques that help perform complicated mathematical operations quickly. Like NumPy, it works with multi-dimensional arrays and applies operations to whole arrays at once. These multi-dimensional arrays are known as "tensors".

  • TensorFlow can run computations on GPUs and automates the management of resources. As an example, the diagram below shows that using tensors and the tensor cores of recent graphics cards gives results 32 times faster.

  • TensorFlow comes with a multitude of machine learning libraries and is well supported and documented. The framework can train and run deep neural network models, and can be used to create applications that predict relevant characteristics of the respective datasets.

  • The tensor is the data structure used in TensorFlow. A tensor is simply a multidimensional array.

  • Tensor in Codecademy
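As a small illustration of the last points, the sketch below creates tensors of rank 0, 1 and 2 with `tf.constant` and converts one of them to a NumPy array. It assumes TensorFlow 2.x is installed (it is preinstalled in Google Colab). The variable names and values are made up for illustration.

```python
import numpy as np
import tensorflow as tf

rank_0 = tf.constant(4)                          # scalar, no axes
rank_1 = tf.constant([2.0, 3.0, 4.0])            # vector, one axis
rank_2 = tf.constant([[1, 2], [3, 4], [5, 6]])   # matrix, two axes

# Every tensor has a shape and a data type
for tensor in (rank_0, rank_1, rank_2):
    print("axes:", len(tensor.shape), "shape:", tensor.shape, "dtype:", tensor.dtype)

# Tensors convert to NumPy arrays and back
as_numpy = rank_2.numpy()
back_to_tensor = tf.constant(as_numpy)
print(type(as_numpy), back_to_tensor.shape)
```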

3) Intro to Google Colab:

Google Colab lets you run Python code in the browser. It requires zero configuration and offers free access to GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units). Colab is built on top of Jupyter Notebook.
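If you want to check whether your notebook currently has a GPU attached, a minimal sketch such as the following can help. It assumes TensorFlow 2.x is available, as it is in Colab, and it simply reports what TensorFlow can see. An empty list means the code will run on the CPU.

```python
import tensorflow as tf

# List the GPU devices that TensorFlow can see in this runtime
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    print(f"{len(gpus)} GPU(s) available:", gpus)
else:
    print("No GPU found, TensorFlow will run on the CPU.")
```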

You can watch an introductory video on Google Colab here, but try not to spend too much time on this topic now, as we will dig deeper into it in the Datalabs in the forthcoming weeks.

4) Tensor in TensorFlow:

It's important that you take your time to read and understand the concepts on tensors in TensorFlow here: https://www.tensorflow.org/guide/tensor

For this Deep Learning module, you will only be using rectangular tensors (that is, tensors in which every element along a given axis has the same size). So, you can skip the RaggedTensor and SparseTensor sections of the article for now.
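To see what "rectangular" means in practice, the sketch below builds one tensor whose rows all have the same length and then tries to build one whose rows differ in length. It assumes TensorFlow 2.x. The values and the `rejected` flag are made up for illustration.

```python
import tensorflow as tf

# Rectangular: both rows contain 3 elements
rectangular = tf.constant([[1, 2, 3],
                           [4, 5, 6]])
print(rectangular.shape)  # 2 rows, 3 columns

# Not rectangular: the rows have different lengths, so tf.constant refuses to build it
rejected = False
try:
    tf.constant([[1, 2, 3], [4, 5]])
except (ValueError, TypeError) as err:
    rejected = True
    print("Not rectangular:", err)
```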

4a Make sure you take notes on your understanding of the concepts in the DL-notes.docx file in your Block C Microsoft Teams assignment.

4b Make sure to run the code snippets from the article, then change them and play around with them to gain a better understanding of the concepts. You can do so using the Google Colab notebook linked at the top of the article.

Additional reference material

https://learning.oreilly.com/library/view/learning-tensorflow-js/9781492090786/ch03.html#idm45049251928088

Next up!

In the coming Datalab, we will reflect on today's independent study material and give you an opportunity to ask any questions you might have.