├── Bioinformatics
    └── README.md
├── Computer Vision
    └── README.md
├── LICENSE
├── Machine Learning
    ├── Problem1
    │   ├── 1_Logistic_Regression.ipynb
    │   ├── 2_Poisson_Regression.ipynb
    │   ├── 3_Gaussian_Discriminant_Analysis.ipynb
    │   ├── 4_Linear_Invariance.ipynb
    │   ├── 5_Quasar_Regression.ipynb
    │   ├── data
    │   │   ├── logistic_x.txt
    │   │   ├── logistic_y.txt
    │   │   ├── quasar_test.csv
    │   │   └── quasar_train.csv
    │   └── ps1.pdf
    ├── Problem2
    │   ├── 1_Training_Stability.ipynb
    │   ├── 2_Model_Calibration.ipynb
    │   ├── 3_Bayesian_Logistic_Regression.ipynb
    │   ├── 4_Constructing_Kernels.ipynb
    │   ├── 5_Kernelizing_the_Perceptron.ipynb
    │   ├── 6_Spam_Classification.ipynb
    │   ├── data
    │   │   ├── MATRIX.TEST
    │   │   ├── MATRIX.TRAIN
    │   │   ├── MATRIX.TRAIN.100
    │   │   ├── MATRIX.TRAIN.1400
    │   │   ├── MATRIX.TRAIN.200
    │   │   ├── MATRIX.TRAIN.400
    │   │   ├── MATRIX.TRAIN.50
    │   │   ├── MATRIX.TRAIN.800
    │   │   ├── data_a.txt
    │   │   └── data_b.txt
    │   ├── nb.py
    │   ├── ps2.pdf
    │   └── svm.py
    ├── Problem3
    │   ├── 1_Simple_Neural_Network.ipynb
    │   ├── 2_EM_for_MAP.ipynb
    │   ├── 3_EM_Application.ipynb
    │   ├── 4_KL_Divergence.ipynb
    │   ├── 5_K-means_for_Compression.ipynb
    │   ├── data
    │   │   ├── mandrill-large.tiff
    │   │   ├── mandrill-small.tiff
    │   │   └── triangle_pb3_1.jpg
    │   └── ps3.pdf
    ├── Problem4
    │   ├── 2_EM-Convergence.ipynb
    │   ├── 4_Independent-Component-Analysis.ipynb
    │   ├── data
    │   │   ├── bellsej.py
    │   │   ├── cart_pole.py
    │   │   ├── control.py
    │   │   ├── mix.dat
    │   │   ├── mnist.zip
    │   │   └── nn_starter.py
    │   └── ps4.pdf
    └── Readme.md
├── NLP
    ├── README.md
    ├── assignment1
    │   ├── Makefile
    │   ├── assignment1-solution.pdf
    │   ├── assignment1.pdf
    │   ├── collect_submission.sh
    │   ├── get_datasets.sh
    │   ├── q1_softmax.py
    │   ├── q2_gradcheck.py
    │   ├── q2_neural.py
    │   ├── q2_sigmoid.py
    │   ├── q3_run.py
    │   ├── q3_sgd.py
    │   ├── q3_word2vec.py
    │   ├── q3_word_vectors.png
    │   ├── q4_dev_conf.png
    │   ├── q4_reg_v_acc.png
    │   ├── q4_sentiment.py
    │   ├── requirements.txt
    │   └── utils
    │   │   ├── __pycache__
    │   │       ├── __init__.cpython-36.pyc
    │   │       ├── glove.cpython-36.pyc
    │   │       └── treebank.cpython-36.pyc
    │   │   ├── glove.py
    │   │   └── treebank.py
    ├── assignment2
    │   ├── assignment2-soln.pdf
    │   ├── assignment2.pdf
    │   ├── model.py
    │   ├── q1_classifier.py
    │   ├── q1_softmax.py
    │   ├── q2_initialization.py
    │   ├── q2_parser_model.py
    │   ├── q2_parser_transitions.py
    │   └── utils
    │   │   ├── general_utils.py
    │   │   └── parser_utils.py
    └── assignment3
    │   ├── assignment3-soln.pdf
    │   ├── assignment3.pdf
    │   ├── data_util.py
    │   ├── defs.py
    │   ├── model.py
    │   ├── ner_model.py
    │   ├── q1_window.py
    │   ├── q2_rnn.py
    │   ├── q2_rnn_cell.py
    │   ├── q3-clip-gru.png
    │   ├── q3-clip-rnn.png
    │   ├── q3-noclip-gru.png
    │   ├── q3_gru.py
    │   ├── q3_gru_cell.py
    │   ├── requirements.txt
    │   └── util.py
├── Python
    ├── CME193
    │   ├── lec1.pdf
    │   ├── lec2.pdf
    │   ├── lec3.pdf
    │   ├── lec4.pdf
    │   ├── lec5.pdf
    │   ├── lec6.pdf
    │   ├── lec7.pdf
    │   ├── lec8.pdf
    │   └── problemsets
    │   │   ├── Markov-chain-startercode-master.zip
    │   │   ├── Rock-paper-scissors-startercode-master.zip
    │   │   ├── exercises.pdf
    │   │   ├── hangman-master.zip
    │   │   └── shakespeare.txt
    └── README.md
├── Readme.md
└── Speech
    └── Readme.md


/Bioinformatics/README.md:
--------------------------------------------------------------------------------
 1 | <!-- For bioinformatics we will be following stanford courses.
 2 | 
 3 | https://web.stanford.edu/class/cs262/cgi-bin/index.php
 4 | 
 5 | https://web.stanford.edu/class/cs273a/cgi-bin/ -->
 6 | 
 7 | # Bio Informatics   
 8 | 
 9 | ## Course List
10 | **S.No** | **Course Title** | **Link to course**
11 | ------------ | ------------- | ---------
12 | [1](#1-computational-systems-biology--deep-learning-in-life-sciences) | Computational Systems Biology : Deep Learning in Life Sciences | https://mit6874.github.io/ 
13 | [2](#2-computational-genomics) | Computatinal Genomics | https://web.stanford.edu/class/cs262/
14 | [3](#3-the-human-genome-source-code) | The Human Genome Source Code | https://web.stanford.edu/class/cs273a/
15 | 
16 | ## Course Details
17 | ### 1. Computational Systems Biology : Deep Learning in Life Sciences
18 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://mit6874.github.io/ 
19 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; MIT 
20 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp;  Calculus, Linear Algebra, Python programming,Probability,   
21 |                                      &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Introductory molecular biology
22 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
23 |    * **Course description**    
24 |         This course introduces foundations and state-of-the-art machine learning challenges in genomics and the life sciences more broadly. The course introduces both deep learning and classical machine learning approaches to key problems, comparing and contrasting their power and limitations.
25 |  
26 | 
27 | ### 2. Computational Genomics
28 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://web.stanford.edu/class/cs262/
29 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
30 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Design and Analysis of Algorithms  
31 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Intermediate
32 |    * **Course description**    
33 |         Genomics is a new and very active application area of computer science. Computer science is playing a central role in genomics: from sequencing and assembling of DNA sequences to analyzing genomes in order to locate genes, similarities between sequences of different organisms, and several other applications. This course aims to present some of the most basic and useful algorithms for sequence analysis, together with the minimal biological background. Sequence alignments, hidden Markov models, multiple alignment algorithms and heuristics such as Gibbs sampling, and the probabilistic interpretation of alignments will be covered.        
34 | 
35 | 
36 | ### 3. The Human Genome Source Code
37 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://web.stanford.edu/class/cs273a/
38 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
39 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Programming Experience in any language  
40 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Beginner
41 |    * **Course description**    
42 |         The course introduces you to various aspects of genomic data . The course contents cover Population genomics & paternity testing, Medical AI (disease) genomics  and Comparative (evolutionary) genomics and maybe a dash of cryptogenomics and genomic privacy.
43 |        
44 | ####  Happy Learning  &nbsp; :thumbsup: :memo: 
45 | 
46 | 
47 | 
48 | 
49 | 
50 | 


--------------------------------------------------------------------------------
/Computer Vision/README.md:
--------------------------------------------------------------------------------
 1 | <!-- For computer vision we will be following the below courses form stanford.
 2 | 
 3 | http://vision.stanford.edu/teaching/cs131_fall1819/syllabus.html
 4 | 
 5 | http://cs231n.github.io/
 6 | 
 7 | We added some previous assignments and soultions for these courses. -->
 8 | 
 9 | # Computer Vision
10 | 
11 | ## Course List
12 | **S.No** | **Course Title** | **Link to course** | **Link to Assignment Solutions**
13 | ------------ | ------------- | --------- | -----------
14 | [1](#1-computer-vision--foundations-and-applications) | Computer Vision: Foundations and Applications | http://vision.stanford.edu/teaching/cs131_fall2122/ | [CS131 Solutions](https://github.com/StanfordVL/CS131_release)
15 | [2](#2-deep-learning-for-computer-vision) | Deep Learning for Computer Vision | http://cs231n.stanford.edu/ | [CS231 Solutions](https://github.com/mantasu/cs231n)
16 | <!-- [3](#3-computer-vision) | Computer Vision | https://faculty.cc.gatech.edu/~hays/compvision/ |  -->
17 | 
18 | ## Course Details
19 | ### 1. Computer Vision: Foundations and Applications
20 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://vision.stanford.edu/teaching/cs131_fall2122/
21 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
22 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp;  Calculus, Linear Algebra, Python programming,Probability, Statistics  
23 |                                      <!-- &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Introductory molecular biology -->
24 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Beginner
25 |    * **Course description**    
26 |     Computer Vision is one of the fastest growing and most exciting AI disciplines in today’s academia and industry. This 10-week course is designed to open the doors for students who are interested in learning about the fundamental principles and important applications of computer vision.It covers topics ranging from basic operations on images to Image Segmentation. Students will be exposed to a number of real-world applications that are important to our daily lives.
27 |  
28 | 
29 | ### 2. Deep Learning for Computer Vision
30 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://cs231n.stanford.edu/
31 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
32 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Proficiency in Python, Calculus, Linear Algebra,Probability, Statistics   
33 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
34 |    * **Course description**  
35 |     This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.  
36 |         
37 | 
38 | 
39 | <!-- ### 3. Computer Vision
40 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://faculty.cc.gatech.edu/~hays/compvision2021/
41 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Georgia Tech 
42 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Python programming, Calculus, Linear Algebra,Probability   
43 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
44 |    * **Course description**    
45 |         This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. We'll explore methods for depth recovery from stereo images, camera calibration, automated alignment, tracking, boundary detection, and recognition. We'll use both classical machine learning and deep learning to approach these problems. The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the projects.  -->
46 |        
47 | ####  Happy Learning  &nbsp; :thumbsup: :memo: 
48 | 
49 | 
50 | 
51 | 
52 | 
53 | 
54 | 
55 | 
56 | 
57 | 
58 | 
59 | 
60 | 
61 | 
62 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2019 bayeslabs
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem1/2_Poisson_Regression.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 1\n",
  8 |     "## Problem 2: Poisson Regression and the Exponential Family\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 1, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps1.pdf](ps1.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {},
 30 |    "source": [
 31 |     "### Question 2.a)\n",
 32 |     "The exponential family is a class of distributions with the following form:\n",
 33 |     "\n",
 34 |     "$$\n",
 35 |     "p(y;\\eta) = b(y)\\exp{(\\eta^T T(y) - a(\\eta))}\n",
 36 |     "$$\n",
 37 |     "\n",
 38 |     "By identifying the parameters of the Poisson distribution:\n",
 39 |     "\n",
 40 |     "$$\n",
 41 |     "p(y;\\lambda) = \\frac{e^{-\\lambda}\\lambda^y}{y!}\n",
 42 |     "$$\n",
 43 |     "\n",
 44 |     "- $b(y) = \\frac{1}{y!}$\n",
 45 |     "- $T(y) = y$\n",
 46 |     "- $\\eta = \\log \\lambda$\n",
 47 |     "- $a(\\eta) = e^{\\eta}$"
 48 |    ]
 49 |   },
 50 |   {
 51 |    "cell_type": "markdown",
 52 |    "metadata": {},
 53 |    "source": [
 54 |     "### Question 2.b)\n",
 55 |     "\n",
 56 |     "The canonical response function for the Poisson distribution is given by:\n",
 57 |     "\n",
 58 |     "$$\n",
 59 |     "g(\\eta) = E(y;\\eta) = \\lambda = e^{\\eta}\n",
 60 |     "$$"
 61 |    ]
 62 |   },
 63 |   {
 64 |    "cell_type": "markdown",
 65 |    "metadata": {},
 66 |    "source": [
 67 |     "### Question 2.c)\n",
 68 |     "\n",
 69 |     "The log-likelihood of a training example $(x^i, y^i)$ is given by $\\log p(y^i|x^i;\\theta)$.\n",
 70 |     "\n",
 71 |     "To derive the stochastic gradient descent update rule, we start by computing the partial derivative of the log-likelihood with respect to parameter $\\theta_j$, with $\\eta = \\theta^T x^i$:\n",
 72 |     "\n",
 73 |     "$$\n",
 74 |     "\\begin{align*}\n",
 75 |     "\\frac{\\partial}{\\partial \\theta_j} \\log p(y^i|x^i;\\theta) &= \\frac{\\partial}{\\partial \\theta_j} ((\\theta^T x^i)^T y^i - e^{\\theta^T x^i}) \\\\\n",
 76 |     "&= (y^i-e^{\\theta^T x^i})x_j^i\n",
 77 |     "\\end{align*}\n",
 78 |     "$$\n",
 79 |     "\n",
 80 |     "The yields the update rule for stochastic gradient descent with learning rate $\\alpha$:\n",
 81 |     "\n",
 82 |     "$$\n",
 83 |     "\\theta_j := \\theta_j + \\alpha.(y^i-e^{\\theta^T x^i})x_j^i\n",
 84 |     "$$"
 85 |    ]
 86 |   }
 87 |  ],
 88 |  "metadata": {
 89 |   "kernelspec": {
 90 |    "display_name": "Python 2",
 91 |    "language": "python",
 92 |    "name": "python2"
 93 |   },
 94 |   "language_info": {
 95 |    "codemirror_mode": {
 96 |     "name": "ipython",
 97 |     "version": 2
 98 |    },
 99 |    "file_extension": ".py",
100 |    "mimetype": "text/x-python",
101 |    "name": "python",
102 |    "nbconvert_exporter": "python",
103 |    "pygments_lexer": "ipython2",
104 |    "version": "2.7.15"
105 |   }
106 |  },
107 |  "nbformat": 4,
108 |  "nbformat_minor": 2
109 | }
110 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem1/4_Linear_Invariance.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 1\n",
  8 |     "## Problem 4: Linear Invariance of Optimization Algorithms\n",
  9 |     "\n",
 10 |     "**C. Combier**\n",
 11 |     "\n",
 12 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 1, taught by Andrew Ng.\n",
 13 |     "\n",
 14 |     "The problem set can be found here: [./ps1.pdf](ps1.pdf)\n",
 15 |     "\n",
 16 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 17 |     "\n",
 18 |     "## Notation\n",
 19 |     "\n",
 20 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 21 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 22 |     "- $m$ is the number of training examples\n",
 23 |     "- $n$ is the number of features"
 24 |    ]
 25 |   },
 26 |   {
 27 |    "cell_type": "markdown",
 28 |    "metadata": {
 29 |     "colab_type": "text",
 30 |     "id": "9J7p406abzgl"
 31 |    },
 32 |    "source": [
 33 |     "### Question 4.a)\n",
 34 |     "Let:\n",
 35 |     " - $z = A^{-1} x$\n",
 36 |     " - $g(z) = f(Az)$\n",
 37 |     " \n",
 38 |     "We also define the notation $H_f |_x$ and $\\nabla_f |_x$ , which are respectively the Hessian and Gradient of function $f$ evaluted at $x$.\n",
 39 |     " \n",
 40 |     "Write the update rule for Newton-Rhapson's method:\n",
 41 |     " \n",
 42 |     " $$\n",
 43 |     " \\begin{align*}\n",
 44 |     " z : &= z -H_g^{-1} |_z. \\nabla_g |_z \\\\\n",
 45 |     "  : &= z - H_f^{-1} |_{Az}. \\nabla_f |_{Az}\n",
 46 |     " \\end{align*}\n",
 47 |     " $$\n",
 48 |     " \n",
 49 |     "We calculate the Hessian using the chain rule:\n",
 50 |     " $$\n",
 51 |     " H_f|_{Az} = A^T . H_f |_{Az} . A \\implies H_f|_{Az} ^{-1}= A^{-1} . H_f|_{Az}^{-1} . A^{T^{-1}}\n",
 52 |     " $$\n",
 53 |     " \n",
 54 |     "Similarly, the chain rule applied to the gradient operator is given by:\n",
 55 |     " \n",
 56 |     " $$\n",
 57 |     " \\nabla_f |_{Az} = A^T . \\nabla_f|_{Az}\n",
 58 |     " $$\n",
 59 |     " \n",
 60 |     "Combining the two in the update rule:\n",
 61 |     " \n",
 62 |     "  $$\n",
 63 |     " \\begin{align*}\n",
 64 |     " z : &= z - A^{-1} . H_f|_{Az}^{-1}. A^{T^{-1}}. A^T  .\\nabla_f|_{Az} \\\\\n",
 65 |     " : &= z - A^{-1} H_f|_{Az}^{-1} .\\nabla_f|_{Az} \\\\\n",
 66 |     " : &= z - A^{-1} H_f|_x^{-1}  .\\nabla_f|_x \\\\\n",
 67 |     " : &= A^{-1} . (x -  H_f|_x^{-1} . \\nabla_f|_x )\n",
 68 |     " \\end{align*}\n",
 69 |     " $$\n",
 70 |     " \n",
 71 |     "Which proves the linearity of the Newton-Rhapson method."
 72 |    ]
 73 |   },
 74 |   {
 75 |    "cell_type": "markdown",
 76 |    "metadata": {
 77 |     "colab_type": "text",
 78 |     "id": "OLXoC8jH6mlv"
 79 |    },
 80 |    "source": [
 81 |     "### Question 4.b)\n",
 82 |     "\n",
 83 |     "In this question, we show that gradient descent is not invariant to linear reparametrization.\n",
 84 |     "\n",
 85 |     "Consider the function $f:x \\mapsto x^2$, $x \\in R$\n",
 86 |     "\n",
 87 |     "We now consider the gradient descent update rule for this function and parameter $z = \\lambda x$:\n",
 88 |     "\n",
 89 |     "$$\n",
 90 |     "\\begin{align*}\n",
 91 |     "z:&= z - \\alpha \\frac{df}{dz} \\\\\n",
 92 |     ": &= z - \\alpha \\frac{df}{dx} \\frac{dx}{dz} \\\\\n",
 93 |     ": &= \\lambda x -\\frac{\\alpha}{\\lambda} (2 \\lambda x) = \\lambda x - \\alpha (2x) \\\\\n",
 94 |     "\\neq \\lambda.(x - \\alpha.\\frac{df}{dx}) = \\lambda (x - \\alpha.(2x))\n",
 95 |     "\\end{align*}\n",
 96 |     "$$\n",
 97 |     "\n",
 98 |     "This counter example shows that gradient descent is not invariant to linear reparametrization."
 99 |    ]
100 |   }
101 |  ],
102 |  "metadata": {
103 |   "kernelspec": {
104 |    "display_name": "Python 2",
105 |    "language": "python",
106 |    "name": "python2"
107 |   },
108 |   "language_info": {
109 |    "codemirror_mode": {
110 |     "name": "ipython",
111 |     "version": 2
112 |    },
113 |    "file_extension": ".py",
114 |    "mimetype": "text/x-python",
115 |    "name": "python",
116 |    "nbconvert_exporter": "python",
117 |    "pygments_lexer": "ipython2",
118 |    "version": "2.7.15"
119 |   }
120 |  },
121 |  "nbformat": 4,
122 |  "nbformat_minor": 2
123 | }
124 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem1/data/logistic_x.txt:
--------------------------------------------------------------------------------
  1 |    1.3432504e+00  -1.3311479e+00
  2 |    1.8205529e+00  -6.3466810e-01
  3 |    9.8632067e-01  -1.8885762e+00
  4 |    1.9443734e+00  -1.6354520e+00
  5 |    9.7673352e-01  -1.3533151e+00
  6 |    1.9458584e+00  -2.0443278e+00
  7 |    2.1075153e+00  -2.1256684e+00
  8 |    2.0703730e+00  -2.4634101e+00
  9 |    8.6864964e-01  -2.4119348e+00
 10 |    1.8006594e+00  -2.7739689e+00
 11 |    3.1283787e+00  -3.4452432e+00
 12 |    3.0947429e+00  -3.6446145e+00
 13 |    2.9086652e+00  -4.0065037e+00
 14 |    2.6770338e+00  -3.0198592e+00
 15 |    2.7458671e+00  -2.7100561e+00
 16 |    4.1714647e+00  -3.4622482e+00
 17 |    3.9313220e+00  -2.1099044e+00
 18 |    4.3786870e+00  -2.3804743e+00
 19 |    4.8016565e+00  -3.3803344e+00
 20 |    4.1661050e+00  -2.8138844e+00
 21 |    2.4670141e+00  -1.6108444e+00
 22 |    3.4826743e+00  -1.5533872e+00
 23 |    3.3652482e+00  -1.8164936e+00
 24 |    2.8772788e+00  -1.8511689e+00
 25 |    3.1090444e+00  -1.6384946e+00
 26 |    2.2183701e+00   7.4279558e-02
 27 |    1.9949873e+00   1.6268659e-01
 28 |    2.9500308e+00   1.6873016e-02
 29 |    2.0216009e+00   1.7227387e-01
 30 |    2.0486921e+00  -6.3581041e-01
 31 |    8.7548563e-01  -5.4586168e-01
 32 |    5.7079941e-01  -3.3278660e-02
 33 |    1.4266468e+00  -7.5288337e-01
 34 |    7.2265633e-01  -8.6691930e-01
 35 |    9.5346198e-01  -1.4896956e+00
 36 |    4.8333333e+00   7.0175439e-02
 37 |    4.3070175e+00   1.4152047e+00
 38 |    6.0321637e+00   4.5029240e-01
 39 |    5.4181287e+00  -2.7076023e+00
 40 |    3.4590643e+00  -2.8245614e+00
 41 |    2.7280702e+00  -9.2397661e-01
 42 |    1.0029240e+00   7.7192982e-01
 43 |    3.6637427e+00  -7.7777778e-01
 44 |    4.3070175e+00  -1.0409357e+00
 45 |    3.6929825e+00  -1.0526316e-01
 46 |    5.7397661e+00  -1.6257310e+00
 47 |    4.9795322e+00  -1.5087719e+00
 48 |    6.5000000e+00  -2.9122807e+00
 49 |    5.2426901e+00   9.1812865e-01
 50 |    1.6754386e+00   5.6725146e-01
 51 |    5.1708997e+00   1.2103667e+00
 52 |    4.8795188e+00   1.6081848e+00
 53 |    4.6649870e+00   1.0695532e+00
 54 |    4.4934321e+00   1.2351592e+00
 55 |    4.1512967e+00   8.6721260e-01
 56 |    3.7177080e+00   1.1517200e+00
 57 |    3.6224477e+00   1.3106769e+00
 58 |    3.0606943e+00   1.4857163e+00
 59 |    7.0718465e+00  -3.4961651e-01
 60 |    6.0391832e+00  -2.4756832e-01
 61 |    6.6747480e+00  -1.2484766e-01
 62 |    6.8461291e+00   2.5977167e-01
 63 |    6.4270724e+00  -1.4713863e-01
 64 |    6.8456065e+00   1.4754967e+00
 65 |    7.7054006e+00   1.6045555e+00
 66 |    6.2870658e+00   2.4156427e+00
 67 |    6.9810956e+00   1.2599865e+00
 68 |    7.0990172e+00   2.2155151e+00
 69 |    5.5275479e+00   2.9968421e-01
 70 |    5.8303489e+00  -2.1974408e-01
 71 |    6.3594527e+00   2.3944217e-01
 72 |    6.1004524e+00  -4.0957414e-02
 73 |    5.6237412e+00   3.7135914e-01
 74 |    5.8836969e+00   2.7768186e+00
 75 |    5.5781611e+00   3.0682889e+00
 76 |    7.0050662e+00  -2.5781727e-01
 77 |    4.4538114e+00   8.3941831e-01
 78 |    5.6495924e+00   1.3053929e+00
 79 |    4.6337489e+00   1.9467546e+00
 80 |    3.6986847e+00   2.2594084e+00
 81 |    4.1193005e+00   2.5474510e+00
 82 |    4.7665558e+00   2.7531209e+00
 83 |    3.0812098e+00   2.7985255e+00
 84 |    4.0730994e+00  -3.0292398e+00
 85 |    3.4883041e+00  -1.8888889e+00
 86 |    7.6900585e-01   1.2105263e+00
 87 |    1.5000000e+00   3.8128655e+00
 88 |    5.7982456e+00  -2.0935673e+00
 89 |    6.8114529e+00  -8.3456730e-01
 90 |    7.1106096e+00  -1.0201158e+00
 91 |    7.4941520e+00  -1.7426901e+00
 92 |    3.1374269e+00   4.2105263e-01
 93 |    1.6754386e+00   5.0877193e-01
 94 |    2.4941520e+00  -8.6549708e-01
 95 |    4.7748538e+00   9.9415205e-02
 96 |    5.8274854e+00  -6.9005848e-01
 97 |    2.2894737e+00   1.9707602e+00
 98 |    2.4941520e+00   1.4152047e+00
 99 |    2.0847953e+00   1.3567251e+00
100 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem1/data/logistic_y.txt:
--------------------------------------------------------------------------------
  1 |    -1.0000000e+00
  2 |    -1.0000000e+00
  3 |    -1.0000000e+00
  4 |    -1.0000000e+00
  5 |    -1.0000000e+00
  6 |    -1.0000000e+00
  7 |    -1.0000000e+00
  8 |    -1.0000000e+00
  9 |    -1.0000000e+00
 10 |    -1.0000000e+00
 11 |    -1.0000000e+00
 12 |    -1.0000000e+00
 13 |    -1.0000000e+00
 14 |    -1.0000000e+00
 15 |    -1.0000000e+00
 16 |    -1.0000000e+00
 17 |    -1.0000000e+00
 18 |    -1.0000000e+00
 19 |    -1.0000000e+00
 20 |    -1.0000000e+00
 21 |    -1.0000000e+00
 22 |    -1.0000000e+00
 23 |    -1.0000000e+00
 24 |    -1.0000000e+00
 25 |    -1.0000000e+00
 26 |    -1.0000000e+00
 27 |    -1.0000000e+00
 28 |    -1.0000000e+00
 29 |    -1.0000000e+00
 30 |    -1.0000000e+00
 31 |    -1.0000000e+00
 32 |    -1.0000000e+00
 33 |    -1.0000000e+00
 34 |    -1.0000000e+00
 35 |    -1.0000000e+00
 36 |    -1.0000000e+00
 37 |    -1.0000000e+00
 38 |    -1.0000000e+00
 39 |    -1.0000000e+00
 40 |    -1.0000000e+00
 41 |    -1.0000000e+00
 42 |    -1.0000000e+00
 43 |    -1.0000000e+00
 44 |    -1.0000000e+00
 45 |    -1.0000000e+00
 46 |    -1.0000000e+00
 47 |    -1.0000000e+00
 48 |    -1.0000000e+00
 49 |    -1.0000000e+00
 50 |    -1.0000000e+00
 51 |    1.0000000e+00
 52 |    1.0000000e+00
 53 |    1.0000000e+00
 54 |    1.0000000e+00
 55 |    1.0000000e+00
 56 |    1.0000000e+00
 57 |    1.0000000e+00
 58 |    1.0000000e+00
 59 |    1.0000000e+00
 60 |    1.0000000e+00
 61 |    1.0000000e+00
 62 |    1.0000000e+00
 63 |    1.0000000e+00
 64 |    1.0000000e+00
 65 |    1.0000000e+00
 66 |    1.0000000e+00
 67 |    1.0000000e+00
 68 |    1.0000000e+00
 69 |    1.0000000e+00
 70 |    1.0000000e+00
 71 |    1.0000000e+00
 72 |    1.0000000e+00
 73 |    1.0000000e+00
 74 |    1.0000000e+00
 75 |    1.0000000e+00
 76 |    1.0000000e+00
 77 |    1.0000000e+00
 78 |    1.0000000e+00
 79 |    1.0000000e+00
 80 |    1.0000000e+00
 81 |    1.0000000e+00
 82 |    1.0000000e+00
 83 |    1.0000000e+00
 84 |    1.0000000e+00
 85 |    1.0000000e+00
 86 |    1.0000000e+00
 87 |    1.0000000e+00
 88 |    1.0000000e+00
 89 |    1.0000000e+00
 90 |    1.0000000e+00
 91 |    1.0000000e+00
 92 |    1.0000000e+00
 93 |    1.0000000e+00
 94 |    1.0000000e+00
 95 |    1.0000000e+00
 96 |    1.0000000e+00
 97 |    1.0000000e+00
 98 |    1.0000000e+00
 99 |    1.0000000e+00
100 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem1/ps1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem1/ps1.pdf


--------------------------------------------------------------------------------
/Machine Learning/Problem2/2_Model_Calibration.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 2\n",
  8 |     "## Problem 2: Model Calibration\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 2, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps2.pdf](ps2.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {
 30 |     "colab_type": "text",
 31 |     "id": "rsUtreJMonLw"
 32 |    },
 33 |    "source": [
 34 |     "### Question 2.a)\n",
 35 |     "\n",
 36 |     "The maximum likelihood parameters $\\theta^*$ are obtained by writing the gradient of the log-likelihood with respect to $\\theta$ and setting it to $0$. In matrix form, this is equivalent to solving the following equation:\n",
 37 |     "\n",
 38 |     "$$\n",
 39 |     "X^T(Y-h_{\\theta}(X)) = 0\n",
 40 |     "$$\n",
 41 |     "\n",
 42 |     "Where:\n",
 43 |     "- $X$ is an $m \\times (n+1)$ matrix, given the addition of the intercept term $x_0 = 1 \\hspace{1em}, \\forall i$\n",
 44 |     "- $Y$ is an $m \\times 1$ matrix\n",
 45 |     "\n",
 46 |     "Expanding the matrix equation for $\\theta = \\theta^*$ gives:\n",
 47 |     "\n",
 48 |     "$$\n",
 49 |     "  \\left[ {\\begin{array}{cccc}\n",
 50 |     "   1 & ... & 1 \\\\\n",
 51 |     "   x^1_1 & ... & x_n^1 \\\\\n",
 52 |     "    & ... & \\\\\n",
 53 |     "    x^m_1 & ... & x^m_n\n",
 54 |     "  \\end{array} } \\right]\n",
 55 |     "  (Y-h_{\\theta^*}(X)) = 0\n",
 56 |     "  $$\n",
 57 |     "  \n",
 58 |     "  If we extract the first line from the above matrix equation, we get:\n",
 59 |     "  \n",
 60 |     "  $$\n",
 61 |     "  \\sum_{i=1}^m y^i = \\sum_{i=1}^m h_{\\theta^*}(x^i)\n",
 62 |     "  $$\n",
 63 |     "  \n",
 64 |     "  Using the definition of $h_{\\theta^*}$:\n",
 65 |     "  \n",
 66 |     "  $$\n",
 67 |     "  \\sum_{i=1}^m 1(y^i = 1) = \\sum_{i=1}^m P(y = 1|x;\\theta^*)\n",
 68 |     "  $$\n",
 69 |     "  \n",
 70 |     "  We conclule by saying that $|\\{ i \\in I_{0,1} \\}| = m$ which shows the property holds true for $(a,b) = (0,1)$\n",
 71 |     "  \n",
 72 |     "  ### Question 2.b)\n",
 73 |     "  \n",
 74 |     "  - If a model is perfectly callibrated, then all we can say is that the probabilities output from the model match empirical observations. This only describes the probabilities of the outcomes, and not the outcomes themselves, therefore the model does not necessarily achieve perfect accuracy.\n",
 75 |     "  - Conversely, if a model has perfect accuracy, then the probabilities output by the model necessarily match empirical observations\n",
 76 |     "  \n",
 77 |     "  \n",
 78 |     "  This implies that callibration is a weaker assumption than accuracy."
 79 |    ]
 80 |   }
 81 |  ],
 82 |  "metadata": {
 83 |   "kernelspec": {
 84 |    "display_name": "Python 2",
 85 |    "language": "python",
 86 |    "name": "python2"
 87 |   },
 88 |   "language_info": {
 89 |    "codemirror_mode": {
 90 |     "name": "ipython",
 91 |     "version": 2
 92 |    },
 93 |    "file_extension": ".py",
 94 |    "mimetype": "text/x-python",
 95 |    "name": "python",
 96 |    "nbconvert_exporter": "python",
 97 |    "pygments_lexer": "ipython2",
 98 |    "version": "2.7.15"
 99 |   }
100 |  },
101 |  "nbformat": 4,
102 |  "nbformat_minor": 2
103 | }
104 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/3_Bayesian_Logistic_Regression.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# CS229: Problem Set 2\n",
 8 |     "## Problem 3: Bayesian Logistic Regression and Weight Decay\n",
 9 |     "\n",
10 |     "\n",
11 |     "**C. Combier**\n",
12 |     "\n",
13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
14 |     "\n",
15 |     "The problem set can be found here: [./ps2.pdf](ps2.pdf)\n",
16 |     "\n",
17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
18 |     "\n",
19 |     "## Notation\n",
20 |     "\n",
21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
23 |     "- $m$ is the number of training examples\n",
24 |     "- $n$ is the number of features"
25 |    ]
26 |   },
27 |   {
28 |    "cell_type": "markdown",
29 |    "metadata": {
30 |     "colab_type": "text",
31 |     "id": "rsUtreJMonLw"
32 |    },
33 |    "source": [
34 |     "### Question 3)\n",
35 |     "\n",
36 |     "Suppose that $|| \\theta_{MAP} ||^2 > || \\theta_{MLE} ||^2$.\n",
37 |     "\n",
38 |     "Then, given the prior that $\\theta$ is a gaussian random variable:\n",
39 |     "\n",
40 |     "$$\n",
41 |     "\\begin{align*}\n",
42 |     "p(\\theta_{MAP}) &< p(\\theta_{MLE}) \\\\\n",
43 |     "p(\\theta_{MAP}) \\prod_{i=1}^m p(y^i |x^i; \\theta_{MAP}) &< p(\\theta_{MLE}) \\prod_{i=1}^m p(y^i |x^i; \\theta_{MAP}) \\\\\n",
44 |     "&< p(\\theta_{MLE}) \\prod_{i=1}^m p(y^i |x^i; \\theta_{MLE})\n",
45 |     "\\end{align*}\n",
46 |     "$$\n",
47 |     "This is true because by the definition of $\\theta_{MLE}$:\n",
48 |     "$$\n",
49 |     "\\forall \\theta, \\prod_{i=1}^m p(y^i |x^i; \\theta) < \\prod_{i=1}^m p(y^i |x^i; \\theta_{MLE})\n",
50 |     "$$\n",
51 |     "\n",
52 |     "However, this statement contradicts the definition of $\\theta_{MAP}$. Therefore, our initial assumption is incorrect, which proves that:\n",
53 |     "\n",
54 |     "$$\n",
55 |     "|| \\theta_{MAP} ||^2 \\leq || \\theta_{MLE} ||^2\n",
56 |     "$$"
57 |    ]
58 |   }
59 |  ],
60 |  "metadata": {
61 |   "kernelspec": {
62 |    "display_name": "Python 2",
63 |    "language": "python",
64 |    "name": "python2"
65 |   },
66 |   "language_info": {
67 |    "codemirror_mode": {
68 |     "name": "ipython",
69 |     "version": 2
70 |    },
71 |    "file_extension": ".py",
72 |    "mimetype": "text/x-python",
73 |    "name": "python",
74 |    "nbconvert_exporter": "python",
75 |    "pygments_lexer": "ipython2",
76 |    "version": "2.7.15"
77 |   }
78 |  },
79 |  "nbformat": 4,
80 |  "nbformat_minor": 2
81 | }
82 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/4_Constructing_Kernels.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 2\n",
  8 |     "## Problem 4: Constructing Kernels\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 2, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps2.pdf](ps2.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {
 30 |     "colab_type": "text",
 31 |     "id": "Cc5deiNrag6C"
 32 |    },
 33 |    "source": [
 34 |     "### Question 4.a)\n",
 35 |     "\n",
 36 |     "$$\n",
 37 |     "\\begin{align*}\n",
 38 |     "u^T K u &= u^T K_1 u + u^T K_2 u\\\\\n",
 39 |     "\\end{align*}\n",
 40 |     "$$\n",
 41 |     "\n",
 42 |     "Since $u^T K_1 u \\geq 0$ and $u^T K_2 u \\geq 0$, we have that $u^T K u \\geq 0$ and thus $K$ is a Mercer kernel.\n",
 43 |     "\n",
 44 |     "### Question 4.b)\n",
 45 |     "\n",
 46 |     "$$\n",
 47 |     "\\begin{align*}\n",
 48 |     "u^T K u &= u^T K_1 u - u^T K_2 u\\\\\n",
 49 |     "\\end{align*}\n",
 50 |     "$$\n",
 51 |     "\n",
 52 |     "Therefore $u^T K u$ is not necessarily positive, i.e. $K$ is not a Mercer kernel.\n",
 53 |     "\n",
 54 |     "### Question 4.c)\n",
 55 |     "\n",
 56 |     "$$\n",
 57 |     "\\begin{align*}\n",
 58 |     "u^T K u &= u^T a K_1 u\\\\\n",
 59 |     "&= a. u^T K_1 u \\geq 0\n",
 60 |     "\\end{align*}\n",
 61 |     "$$\n",
 62 |     "\n",
 63 |     "Therefore $K$ is a Mercer kernel.\n",
 64 |     "\n",
 65 |     "### Question 4.d)\n",
 66 |     "\n",
 67 |     "$$\n",
 68 |     "\\begin{align*}\n",
 69 |     "u^T K u &= - u^T a K_1 u\\\\\n",
 70 |     "&= -a. u^T K_1 u \\leq 0\n",
 71 |     "\\end{align*}\n",
 72 |     "$$\n",
 73 |     "\n",
 74 |     "Therefore $K$ is **not** a Mercer kernel.\n",
 75 |     "\n",
 76 |     "### Question 4.e)\n",
 77 |     "\n",
 78 |     "$$\n",
 79 |     "\\begin{align*}\n",
 80 |     "u^T K u &= u^T K_1 K_2 u\\\\\n",
 81 |     "&= u^T K_1 [uu^T] [uu^T]^{-1} K_2 u \\\\\n",
 82 |     "&= [u^T K_1 u] u^T [uu^T]^{-1} K_2 u\n",
 83 |     "\\end{align*}\n",
 84 |     "$$\n",
 85 |     "\n",
 86 |     "Now, we use the linear algebra property:\n",
 87 |     "$$\n",
 88 |     "A^{-1} = [A^T A]^{-1} A^T\n",
 89 |     "$$\n",
 90 |     "The proof is straightforward (multiplying left and right by $A$ yields $[A^T A]^{-1} [A^T A] = I$).\n",
 91 |     "\n",
 92 |     "Choosing $A = uu^T$ and replacing in the previous formulation, we get:\n",
 93 |     "\n",
 94 |     "$$\n",
 95 |     "[uu^T]^{-1} = [[uu^T]^T[uu^T]]^{-1} uu^T\n",
 96 |     "$$\n",
 97 |     "\n",
 98 |     "Since $uu^T$ is symmetric, $[uu^T]^T = uu^T$, therefore:\n",
 99 |     "\n",
100 |     "$$\n",
101 |     "[uu^T]^{-1} = [[uu^T]^2]^{-1} uu^T\n",
102 |     "$$\n",
103 |     "\n",
104 |     "We inject this formulation into the previous step:\n",
105 |     "\n",
106 |     "$$\n",
107 |     "\\begin{align*}\n",
108 |     "u^T K u &= [u^T K_1 u] u^T [[uu^T]^2]^{-1} u[u^T K_2 u]\\\\\n",
109 |     "\\end{align*}\n",
110 |     "$$\n",
111 |     "\n",
112 |     "Let $C = [[uu^T]^2]^{-1}$. To complete the proof, we need to show that $C \\geq 0$.\n",
113 |     "\n",
114 |     "We know that by construction, $uu^T$ is symetric. Therefore $uu^T$ is diagonalizable in an orthogonal basis:\n",
115 |     "\n",
116 |     "$$\n",
117 |     "A = uu^T = Q \\Lambda Q^{-1}.\n",
118 |     "$$\n",
119 |     "\n",
120 |     "Squaring this result yields:\n",
121 |     "\n",
122 |     "$$\n",
123 |     "A^2 = [uu^T]^2 = Q \\Lambda ^2 Q^{-1}.\n",
124 |     "$$\n",
125 |     "\n",
126 |     "The diagonal elements of $\\Lambda^2$ are the eigenvalues of $[uu^T]^2$. The eigenvalues are all positive, hence $[uu^T]^2$ is semi defininte positive.\n",
127 |     "\n",
128 |     "Finally, because $[uu^T]^2$ is semi definite positive, $C=[[uu^T]^2]^{-1}$ is also semi definite positive.\n",
129 |     "\n",
130 |     "We therefore have:\n",
131 |     "\n",
132 |     "$$\n",
133 |     "\\begin{align*}\n",
134 |     "u^T K u &= [u^T K_1 u] [u^T C u][u^T K_2 u] \\geq 0\\\\\n",
135 |     "\\end{align*}\n",
136 |     "$$\n",
137 |     "\n",
138 |     "This concludes the proof that $K$ is indeed a Mercer kernel, since all the elements of this product are positive.\n",
139 |     "\n",
140 |     "### Question 4.f)\n",
141 |     "\n",
142 |     "$K$ is not a Mercer kernel. A counter example would be $f: y \\mapsto sign(y)$, and choosing $(x,z) = (-1,1)$.\n",
143 |     "\n",
144 |     "### Question 4.g)\n",
145 |     "\n",
146 |     "It is straightforward to prove $K$ is a Mercer kernel, since $K_3$ is a Mercer kernel. This is independant of the chosen map $\\phi$.\n",
147 |     "\n",
148 |     "### Question 4.f)\n",
149 |     "\n",
150 |     "We need to prove that $\\forall a_q \\geq 0$, $\\forall N$:\n",
151 |     "\n",
152 |     "$$\\sum_{q=0}^N a_q K_1 ^q$$\n",
153 |     "\n",
154 |     "is also a Mercer kernel.\n",
155 |     "\n",
156 |     "Let's start by showing that $\\forall q, K_1^q$ is a Mercer kernel. We can do this by induction:\n",
157 |     "\n",
158 |     "**$k=0$:** the result is immediate, since $u^T K_1^0 u = u^T u = ||u||^2 \\geq 0$\n",
159 |     "\n",
160 |     "**$k \\implies k+1$:** this result is proved in question 4.e)\n",
161 |     "\n",
162 |     "Furthermore, $\\forall q, a_q \\geq 0$ so according to 4.c), $a_q K_1^q$ is also a Mercer kernel.\n",
163 |     "\n",
164 |     "Finally, according to 4.a), the sum of two Mercer kernels is also a Mercer kernel. This concludes the proof that $K$ is a Mercer kernel.\n",
165 |     "\n"
166 |    ]
167 |   },
168 |   {
169 |    "cell_type": "code",
170 |    "execution_count": null,
171 |    "metadata": {},
172 |    "outputs": [],
173 |    "source": []
174 |   }
175 |  ],
176 |  "metadata": {
177 |   "kernelspec": {
178 |    "display_name": "Python 2",
179 |    "language": "python",
180 |    "name": "python2"
181 |   },
182 |   "language_info": {
183 |    "codemirror_mode": {
184 |     "name": "ipython",
185 |     "version": 2
186 |    },
187 |    "file_extension": ".py",
188 |    "mimetype": "text/x-python",
189 |    "name": "python",
190 |    "nbconvert_exporter": "python",
191 |    "pygments_lexer": "ipython2",
192 |    "version": "2.7.15"
193 |   }
194 |  },
195 |  "nbformat": 4,
196 |  "nbformat_minor": 2
197 | }
198 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/5_Kernelizing_the_Perceptron.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# CS229: Problem Set 2\n",
 8 |     "## Problem 5: Kernelizing the Perceptron\n",
 9 |     "\n",
10 |     "\n",
11 |     "**C. Combier**\n",
12 |     "\n",
13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 5, taught by Andrew Ng.\n",
14 |     "\n",
15 |     "The problem set can be found here: [./ps2.pdf](ps2.pdf)\n",
16 |     "\n",
17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
18 |     "\n",
19 |     "## Notation\n",
20 |     "\n",
21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
23 |     "- $m$ is the number of training examples\n",
24 |     "- $n$ is the number of features"
25 |    ]
26 |   },
27 |   {
28 |    "cell_type": "markdown",
29 |    "metadata": {
30 |     "colab_type": "text",
31 |     "id": "rsUtreJMonLw"
32 |    },
33 |    "source": [
34 |     "### Question 5.a)\n",
35 |     "\n",
36 |     "Let $K$ be a Mercer kernel with mapping $\\phi: R^n \\to E $:\n",
37 |     "\n",
38 |     "$$\n",
39 |     "\\begin{align*}\n",
40 |     "K(x, y) &\\mapsto \\langle \\phi(x), \\phi(y) \\rangle \\\\\n",
41 |     "\\end{align*}\n",
42 |     "$$\n",
43 |     "\n",
44 |     "$\\theta^i$ can be represented as $K(\\theta^i,x^i )$\n",
45 |     "\n",
46 |     "### Question 5.b)\n",
47 |     "\n",
48 |     "\n",
49 |     "$h_{\\theta^i} (x^{i+1}) = g( {\\theta^i}^T \\phi(x^{i+1}) ) = g( K ({\\theta^i} ,x^{i+1}) ) $\n",
50 |     "\n",
51 |     "### Question 5.c)\n",
52 |     "\n",
53 |     "Simply remap the update rule by using $K$:\n",
54 |     "\n",
55 |     "$$\n",
56 |     "K (\\theta^{i+1} ,x^{i+1}) := K ({\\theta^i} ,x^{i+1})+ \\alpha y^{i+1}.1 \\left \\{ y^{i+1}g \\left( K ({\\theta^i} ,x^{i+1}) \\right) \\right \\}. K(x^{i+1},x^{i+1})\n",
57 |     "$$"
58 |    ]
59 |   }
60 |  ],
61 |  "metadata": {
62 |   "kernelspec": {
63 |    "display_name": "Python 2",
64 |    "language": "python",
65 |    "name": "python2"
66 |   },
67 |   "language_info": {
68 |    "codemirror_mode": {
69 |     "name": "ipython",
70 |     "version": 2
71 |    },
72 |    "file_extension": ".py",
73 |    "mimetype": "text/x-python",
74 |    "name": "python",
75 |    "nbconvert_exporter": "python",
76 |    "pygments_lexer": "ipython2",
77 |    "version": "2.7.15"
78 |   }
79 |  },
80 |  "nbformat": 4,
81 |  "nbformat_minor": 2
82 | }
83 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/data/data_a.txt:
--------------------------------------------------------------------------------
  1 | -1.000000000000000000e+00 6.012660321346644521e-01 1.650910586864833274e-01
  2 | 1.000000000000000000e+00 8.717253403947561319e-01 5.273606284195629934e-01
  3 | -1.000000000000000000e+00 3.725479744663405812e-01 4.466090687037850282e-01
  4 | -1.000000000000000000e+00 1.357664310444239852e-02 5.135778964393811208e-02
  5 | 1.000000000000000000e+00 5.830316375912952820e-01 7.106191307030319537e-01
  6 | 1.000000000000000000e+00 9.084797126970022285e-01 1.752718002509726647e-01
  7 | -1.000000000000000000e+00 3.999644820298295933e-01 4.739952831980015491e-01
  8 | 1.000000000000000000e+00 8.325367962545292544e-01 5.980482975033334370e-01
  9 | -1.000000000000000000e+00 4.816405545569502067e-03 9.844565337131838678e-01
 10 | -1.000000000000000000e+00 7.499084384207852505e-01 4.542354333273501688e-02
 11 | 1.000000000000000000e+00 6.787728052832739944e-01 9.244621614802555065e-01
 12 | -1.000000000000000000e+00 2.955893587048743498e-01 3.491424901113958645e-01
 13 | -1.000000000000000000e+00 2.036023752010092114e-01 5.920026826758448824e-01
 14 | -1.000000000000000000e+00 2.198870010022596633e-01 4.525792858555204301e-01
 15 | -1.000000000000000000e+00 1.454150156442755026e-01 8.284065671346926285e-01
 16 | -1.000000000000000000e+00 2.507999560886482460e-01 5.412252402844155430e-01
 17 | -1.000000000000000000e+00 4.454069172249203179e-01 9.642287028245233316e-02
 18 | 1.000000000000000000e+00 2.533909449181281914e-02 9.541419533675528086e-01
 19 | 1.000000000000000000e+00 8.795209274605995109e-01 9.807605212599779243e-01
 20 | 1.000000000000000000e+00 7.941665553454693161e-01 4.595618235830615239e-01
 21 | -1.000000000000000000e+00 2.692782581937129827e-01 3.045600108717329002e-01
 22 | -1.000000000000000000e+00 5.269267778019376403e-01 1.446119313623852598e-01
 23 | -1.000000000000000000e+00 6.425484830157719429e-01 1.444312811956353082e-01
 24 | -1.000000000000000000e+00 2.762516897555838957e-01 2.210722138600362818e-02
 25 | -1.000000000000000000e+00 5.583924564641318256e-01 3.164687286818336220e-01
 26 | -1.000000000000000000e+00 2.075942837293226484e-01 5.810765156019325195e-01
 27 | -1.000000000000000000e+00 5.865040077341432401e-01 1.730316178976930575e-01
 28 | -1.000000000000000000e+00 3.805484638713033663e-01 6.717623204463272213e-01
 29 | -1.000000000000000000e+00 3.813986396562527581e-01 3.077646651653764831e-02
 30 | -1.000000000000000000e+00 4.899248660762223206e-01 4.167626968687931921e-02
 31 | 1.000000000000000000e+00 6.682081706177449565e-01 6.628198755504333128e-01
 32 | -1.000000000000000000e+00 3.421034653103514067e-01 7.363192575332345724e-01
 33 | 1.000000000000000000e+00 8.337479918488441832e-01 1.573775395900884888e-01
 34 | -1.000000000000000000e+00 4.923674631211191199e-01 3.888211557841462218e-01
 35 | -1.000000000000000000e+00 2.746871724354470468e-01 2.194037119875775765e-01
 36 | 1.000000000000000000e+00 9.514244202703872055e-01 7.517850107244149482e-01
 37 | 1.000000000000000000e+00 7.222970233828414077e-01 6.293849395650549239e-01
 38 | 1.000000000000000000e+00 7.221358438649915223e-01 9.296141057040921973e-01
 39 | -1.000000000000000000e+00 1.351819603026834793e-01 1.854270482295840017e-01
 40 | 1.000000000000000000e+00 6.847589110633647280e-01 3.005782511358631170e-01
 41 | 1.000000000000000000e+00 9.167839007195689449e-01 7.608979395120699651e-01
 42 | -1.000000000000000000e+00 7.296113795807901425e-02 4.672119711166422551e-01
 43 | 1.000000000000000000e+00 8.453793119640681253e-01 7.107858693353201751e-01
 44 | 1.000000000000000000e+00 8.758550459041712921e-01 5.390947722932232233e-01
 45 | -1.000000000000000000e+00 6.680240193628613765e-01 4.079056712401981644e-01
 46 | -1.000000000000000000e+00 1.942797580183325268e-01 6.786361588339889783e-01
 47 | -1.000000000000000000e+00 6.992478452176446035e-01 2.772214675572326481e-02
 48 | 1.000000000000000000e+00 4.856696344043648361e-01 6.878385389105863279e-01
 49 | -1.000000000000000000e+00 1.532187070142976282e-01 7.760493991951687986e-01
 50 | 1.000000000000000000e+00 4.260091802091322544e-01 8.316101224643255296e-01
 51 | 1.000000000000000000e+00 7.730169346598478874e-01 8.167050106762755446e-01
 52 | 1.000000000000000000e+00 2.811591409918925422e-01 7.915812228496058589e-01
 53 | -1.000000000000000000e+00 3.969712841861736674e-01 5.436956492094292548e-01
 54 | -1.000000000000000000e+00 1.561845853065626510e-01 1.149413637944906030e-01
 55 | -1.000000000000000000e+00 5.462254340670188446e-01 2.775432517674987221e-01
 56 | 1.000000000000000000e+00 9.147298002684245422e-01 8.771582557983079731e-01
 57 | 1.000000000000000000e+00 5.480483804952250848e-01 5.571056036752553009e-01
 58 | 1.000000000000000000e+00 8.750401697283192171e-01 9.688990899668680212e-01
 59 | -1.000000000000000000e+00 8.693212797165661421e-02 4.463068285614146813e-01
 60 | 1.000000000000000000e+00 4.598375862902908118e-01 8.810994801265042975e-01
 61 | -1.000000000000000000e+00 3.757937993395965570e-02 3.157852668502673099e-01
 62 | -1.000000000000000000e+00 1.644957474326235181e-01 7.126782272435483456e-01
 63 | -1.000000000000000000e+00 4.696837654443807297e-01 7.651211500270849175e-02
 64 | -1.000000000000000000e+00 3.457483908016467655e-01 6.395094766845679235e-01
 65 | 1.000000000000000000e+00 8.106231383282292979e-01 4.007879765862863986e-01
 66 | -1.000000000000000000e+00 2.198238787111499448e-01 3.812991804733401047e-01
 67 | -1.000000000000000000e+00 7.804157051854651028e-01 1.695647137097033852e-01
 68 | -1.000000000000000000e+00 1.332143858129445357e-01 6.990228277977402760e-01
 69 | 1.000000000000000000e+00 2.147236390512209381e-01 8.529417155381071591e-01
 70 | 1.000000000000000000e+00 8.302333072606401521e-01 4.886383657396295987e-01
 71 | 1.000000000000000000e+00 6.565655768761827771e-01 6.457323732473402300e-01
 72 | 1.000000000000000000e+00 7.214153912553155079e-01 2.401191402571628553e-01
 73 | -1.000000000000000000e+00 1.323621993588250945e-01 3.770522982995052619e-01
 74 | -1.000000000000000000e+00 4.977100693580231994e-01 6.548100620049546183e-02
 75 | -1.000000000000000000e+00 2.540307903435575776e-01 6.519108537876683318e-01
 76 | 1.000000000000000000e+00 9.925329991819680231e-01 8.009199234853079385e-01
 77 | 1.000000000000000000e+00 9.871058868993810576e-01 8.958814489034055972e-01
 78 | 1.000000000000000000e+00 7.329421343906131758e-01 7.162614142817553819e-01
 79 | 1.000000000000000000e+00 9.944219413039141475e-01 7.052213028969684938e-01
 80 | -1.000000000000000000e+00 2.072586215931685460e-01 7.856577766861695400e-01
 81 | -1.000000000000000000e+00 6.419590805832910974e-01 3.274665648965369158e-01
 82 | -1.000000000000000000e+00 3.809536714273964453e-02 7.312617957403662050e-01
 83 | -1.000000000000000000e+00 7.506710432993660698e-01 1.265072249195027254e-01
 84 | 1.000000000000000000e+00 8.343988342644053091e-01 1.841024147367045227e-01
 85 | -1.000000000000000000e+00 1.758563408675866135e-01 4.620185467796678047e-01
 86 | 1.000000000000000000e+00 9.017343818930201316e-01 6.451069718007419462e-01
 87 | 1.000000000000000000e+00 4.645860976665043829e-01 8.036931879984066107e-01
 88 | -1.000000000000000000e+00 1.095113168514771917e-01 5.896622604055185013e-01
 89 | 1.000000000000000000e+00 2.595460748012538010e-01 6.656903665016161709e-01
 90 | 1.000000000000000000e+00 8.248707100076894116e-01 5.982805782011830775e-01
 91 | 1.000000000000000000e+00 6.066008767832432591e-01 7.002726085427883884e-01
 92 | -1.000000000000000000e+00 4.960382430993948155e-03 9.641836628417100874e-01
 93 | 1.000000000000000000e+00 9.187704651055217386e-02 9.457729354805407551e-01
 94 | 1.000000000000000000e+00 2.146449197939926945e-01 9.362964528123883801e-01
 95 | 1.000000000000000000e+00 7.650517822114389910e-01 7.758744085669813106e-01
 96 | 1.000000000000000000e+00 7.910586028200941033e-01 7.463191757483961242e-01
 97 | -1.000000000000000000e+00 3.116241907975170200e-01 3.477116968694634602e-01
 98 | 1.000000000000000000e+00 3.473903707850247713e-01 9.182780921232864824e-01
 99 | 1.000000000000000000e+00 9.469950080796307734e-01 8.927011126301536148e-01
100 | -1.000000000000000000e+00 1.636710794703860605e-01 8.530396603820045165e-02
101 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/data/data_b.txt:
--------------------------------------------------------------------------------
  1 | -1.000000000000000000e+00 5.956630502064887978e-01 1.930721369700331147e-01
  2 | -1.000000000000000000e+00 4.369971909808768595e-01 5.448065208512253843e-01
  3 | 1.000000000000000000e+00 8.999454640117418025e-01 8.459224350533809389e-01
  4 | -1.000000000000000000e+00 5.550637832421146944e-01 9.263357825110341004e-03
  5 | -1.000000000000000000e+00 7.468707153253317799e-02 2.828451350645997397e-01
  6 | -1.000000000000000000e+00 5.560221769762927480e-01 4.096332857505222691e-01
  7 | -1.000000000000000000e+00 6.795002114321661013e-01 2.984057796113770422e-04
  8 | -1.000000000000000000e+00 4.710146542284510129e-02 9.463613488310902433e-01
  9 | 1.000000000000000000e+00 7.238166214179516667e-01 4.940647054551196016e-01
 10 | -1.000000000000000000e+00 2.443048766770348212e-01 1.766116557088118766e-01
 11 | 1.000000000000000000e+00 5.974033174044348637e-01 6.139306418759564732e-01
 12 | -1.000000000000000000e+00 2.069637273226839769e-01 3.987357388610405229e-01
 13 | -1.000000000000000000e+00 3.221219680536931973e-01 2.844430513676717842e-01
 14 | 1.000000000000000000e+00 7.445778579629004357e-01 4.351377072146410674e-01
 15 | -1.000000000000000000e+00 5.451932897422276936e-01 2.062341013493946829e-01
 16 | -1.000000000000000000e+00 1.696239808246304825e-01 4.358392332310834227e-03
 17 | 1.000000000000000000e+00 2.339114962383119778e-01 9.684295920296079885e-01
 18 | 1.000000000000000000e+00 5.622484507716049018e-01 6.023455730528913810e-01
 19 | -1.000000000000000000e+00 2.802845826689382980e-01 1.867436119464408462e-01
 20 | -1.000000000000000000e+00 3.079895604262861131e-02 3.020010729636660729e-01
 21 | -1.000000000000000000e+00 2.278860921159409081e-01 6.609776742839968966e-01
 22 | -1.000000000000000000e+00 2.775014686241032980e-01 4.238468460906956725e-01
 23 | 1.000000000000000000e+00 3.378432848578666325e-01 7.944254769116867454e-01
 24 | 1.000000000000000000e+00 9.939354074741211242e-01 8.490465597278253895e-01
 25 | -1.000000000000000000e+00 2.863167202290489710e-01 5.959512902157737546e-02
 26 | -1.000000000000000000e+00 1.209130049738784685e-01 3.141006657109364220e-01
 27 | 1.000000000000000000e+00 4.003937420986558582e-02 9.676845691209272626e-01
 28 | 1.000000000000000000e+00 8.086399636856905770e-01 8.618918132536165233e-01
 29 | -1.000000000000000000e+00 5.539808004962957222e-01 1.907996257607158519e-02
 30 | 1.000000000000000000e+00 1.163521320106113421e-01 9.398709177182549279e-01
 31 | 1.000000000000000000e+00 7.301063472763824613e-01 9.499676490265697160e-01
 32 | 1.000000000000000000e+00 8.468338829463664119e-01 1.867208747061926966e-01
 33 | 1.000000000000000000e+00 2.608233534980617385e-01 9.834132630673575459e-01
 34 | 1.000000000000000000e+00 4.199570420269110871e-01 9.327541919772166512e-01
 35 | 1.000000000000000000e+00 7.719418059150739975e-01 5.532023133427225181e-01
 36 | 1.000000000000000000e+00 9.206744943572922057e-01 6.352192232989287701e-01
 37 | 1.000000000000000000e+00 5.293421631045709397e-01 7.222684582313321222e-01
 38 | -1.000000000000000000e+00 1.379460622648243096e-02 4.214618296938960063e-01
 39 | -1.000000000000000000e+00 7.751426490739787845e-02 6.299004172832024517e-01
 40 | 1.000000000000000000e+00 9.276983243348403407e-01 1.040934615547040032e-01
 41 | 1.000000000000000000e+00 7.957432241562331088e-01 9.215388971855021927e-01
 42 | -1.000000000000000000e+00 2.239672931215105356e-01 7.332974484931875647e-02
 43 | 1.000000000000000000e+00 9.422253608049449003e-01 5.218366190287160311e-01
 44 | 1.000000000000000000e+00 9.651959620807119000e-01 2.014979368917352298e-01
 45 | 1.000000000000000000e+00 9.940321308542976464e-01 6.081093015679264191e-01
 46 | 1.000000000000000000e+00 6.658087328963868678e-01 5.027583853754320486e-01
 47 | -1.000000000000000000e+00 7.176560564085979754e-01 3.989391362458483137e-02
 48 | -1.000000000000000000e+00 3.487063110075080408e-01 2.238231883533835509e-01
 49 | -1.000000000000000000e+00 2.709494622365625771e-01 2.082144211471060880e-01
 50 | -1.000000000000000000e+00 3.182573103269309422e-01 3.915896341357829602e-01
 51 | 1.000000000000000000e+00 8.274854880780476707e-01 7.264992541831815087e-01
 52 | -1.000000000000000000e+00 5.362365071659461746e-01 4.372891008985122507e-01
 53 | -1.000000000000000000e+00 8.427624155699986463e-02 4.113506863837846916e-01
 54 | 1.000000000000000000e+00 6.701074178232642176e-01 4.536317270900971366e-01
 55 | 1.000000000000000000e+00 8.544602135318842828e-01 2.735166442807617226e-01
 56 | 1.000000000000000000e+00 9.949426365900033709e-01 7.080489150662402364e-01
 57 | 1.000000000000000000e+00 9.344457506471001151e-01 4.628294452441769069e-01
 58 | 1.000000000000000000e+00 2.748701859697193495e-01 8.689564728606002930e-01
 59 | -1.000000000000000000e+00 3.562057238439320095e-01 3.450267206455902569e-01
 60 | 1.000000000000000000e+00 9.877831746152774262e-01 4.914572650953237254e-01
 61 | 1.000000000000000000e+00 7.987092991607904757e-01 6.098110781977787997e-01
 62 | 1.000000000000000000e+00 8.038461863207471136e-01 2.830525912541793643e-01
 63 | 1.000000000000000000e+00 8.130156172775482304e-01 9.302480416896362625e-01
 64 | 1.000000000000000000e+00 9.059172919354674391e-01 3.568542056989827405e-01
 65 | -1.000000000000000000e+00 4.337382469790391770e-01 4.783305272611882986e-01
 66 | -1.000000000000000000e+00 1.143317228148252873e-01 7.397845361075310322e-01
 67 | -1.000000000000000000e+00 3.449200082398627965e-01 6.130545045172697272e-01
 68 | -1.000000000000000000e+00 2.781990965302262309e-01 6.411607338569670356e-01
 69 | 1.000000000000000000e+00 6.964452033739039205e-01 8.817775541629888636e-01
 70 | 1.000000000000000000e+00 7.989675199719957766e-01 3.531907985451089305e-01
 71 | 1.000000000000000000e+00 8.768836927519900737e-01 6.774500857515753927e-01
 72 | 1.000000000000000000e+00 6.348480021319043987e-01 5.015602127922110798e-01
 73 | -1.000000000000000000e+00 2.119084404426209156e-01 2.856859505361388774e-01
 74 | 1.000000000000000000e+00 5.865762180265414738e-01 5.713716895712067645e-01
 75 | -1.000000000000000000e+00 6.985710058820981949e-02 7.915028704009061666e-01
 76 | -1.000000000000000000e+00 2.355671221113677660e-01 6.438144833145231782e-02
 77 | 1.000000000000000000e+00 8.877615762048381987e-01 5.200746512035342439e-01
 78 | -1.000000000000000000e+00 2.449941209134361975e-01 3.213293478699230654e-02
 79 | -1.000000000000000000e+00 9.069944587950939940e-02 8.690374126034291491e-01
 80 | 1.000000000000000000e+00 5.088278277278240891e-01 6.612414281766401114e-01
 81 | 1.000000000000000000e+00 6.520418524389888226e-01 7.755778069467987867e-01
 82 | 1.000000000000000000e+00 7.722246267138052067e-01 8.067057648890757493e-01
 83 | -1.000000000000000000e+00 1.326901020660873343e-01 1.189518561857797474e-01
 84 | -1.000000000000000000e+00 3.261379646538820065e-02 7.520091046634519438e-01
 85 | -1.000000000000000000e+00 4.051593114583660338e-01 2.829115340428780545e-01
 86 | -1.000000000000000000e+00 3.261819856628866976e-01 2.026040109627341712e-01
 87 | 1.000000000000000000e+00 8.473728790474870376e-01 7.041767658931633589e-01
 88 | 1.000000000000000000e+00 3.762819089695988994e-01 6.355178264510906727e-01
 89 | 1.000000000000000000e+00 6.647854472459970854e-01 9.596004675793697869e-01
 90 | 1.000000000000000000e+00 8.591562056089958599e-01 7.006845513875639142e-01
 91 | 1.000000000000000000e+00 5.181283220873837969e-01 5.480024371738405620e-01
 92 | 1.000000000000000000e+00 7.278833585497114234e-01 4.235243120149102536e-01
 93 | 1.000000000000000000e+00 3.096603842065530632e-01 9.127329643704719109e-01
 94 | -1.000000000000000000e+00 2.894308966996906873e-01 1.389663259706833687e-01
 95 | 1.000000000000000000e+00 9.077251355288461498e-01 2.432028201602077777e-01
 96 | 1.000000000000000000e+00 8.173287941331177642e-01 6.937093875591073822e-01
 97 | -1.000000000000000000e+00 3.721150829343222721e-02 1.226343137474231737e-01
 98 | 1.000000000000000000e+00 9.715801573056137563e-02 9.315221884331510438e-01
 99 | 1.000000000000000000e+00 8.075115122905083265e-01 5.837523984780504938e-01
100 | -1.000000000000000000e+00 8.298607464063743056e-01 8.628668164813368957e-02
101 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/nb.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | 
 3 | def readMatrix(file):
 4 |     fd = open(file, 'r')
 5 |     hdr = fd.readline()
 6 |     rows, cols = [int(s) for s in fd.readline().strip().split()]
 7 |     tokens = fd.readline().strip().split()
 8 |     matrix = np.zeros((rows, cols))
 9 |     Y = []
10 |     for i, line in enumerate(fd):
11 |         nums = [int(x) for x in line.strip().split()]
12 |         Y.append(nums[0])
13 |         kv = np.array(nums[1:])
14 |         k = np.cumsum(kv[:-1:2])
15 |         v = kv[1::2]
16 |         matrix[i, k] = v
17 |     return matrix, tokens, np.array(Y)
18 | 
19 | def nb_train(matrix, category):
20 |     state = {}
21 |     N = matrix.shape[1]
22 |     ###################
23 | 
24 |     ###################
25 |     return state
26 | 
27 | def nb_test(matrix, state):
28 |     output = np.zeros(matrix.shape[0])
29 |     ###################
30 | 
31 |     ###################
32 |     return output
33 | 
34 | def evaluate(output, label):
35 |     error = (output != label).sum() * 1. / len(output)
36 |     print 'Error: %1.4f' % error
37 | 
38 | def main():
39 |     trainMatrix, tokenlist, trainCategory = readMatrix('./data/MATRIX.TRAIN')
40 |     testMatrix, tokenlist, testCategory = readMatrix('./data/MATRIX.TEST')
41 | 
42 |     state = nb_train(trainMatrix, trainCategory)
43 |     output = nb_test(testMatrix, state)
44 | 
45 |     evaluate(output, testCategory)
46 |     return
47 | 
48 | if __name__ == '__main__':
49 |     main()
50 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem2/ps2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem2/ps2.pdf


--------------------------------------------------------------------------------
/Machine Learning/Problem2/svm.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | 
 3 | tau = 8.
 4 | 
 5 | def svm_readMatrix(file):
 6 |     fd = open(file, 'r')
 7 |     hdr = fd.readline()
 8 |     rows, cols = [int(s) for s in fd.readline().strip().split()]
 9 |     tokens = fd.readline().strip().split()
10 |     matrix = np.zeros((rows, cols))
11 |     Y = []
12 |     for i, line in enumerate(fd):
13 |         nums = [int(x) for x in line.strip().split()]
14 |         Y.append(nums[0])
15 |         kv = np.array(nums[1:])
16 |         k = np.cumsum(kv[:-1:2])
17 |         v = kv[1::2]
18 |         matrix[i, k] = v
19 |     category = (np.array(Y) * 2) - 1
20 |     return matrix, tokens, category
21 | 
22 | def svm_train(matrix, category):
23 |     state = {}
24 |     M, N = matrix.shape
25 |     #####################
26 |     Y = category
27 |     matrix = 1. * (matrix > 0)
28 |     squared = np.sum(matrix * matrix, axis=1)
29 |     gram = matrix.dot(matrix.T)
30 |     K = np.exp(-(squared.reshape((1, -1)) + squared.reshape((-1, 1)) - 2 * gram) / (2 * (tau ** 2)) )
31 | 
32 |     alpha = np.zeros(M)
33 |     alpha_avg = np.zeros(M)
34 |     L = 1. / (64 * M)
35 |     outer_loops = 40
36 | 
37 |     alpha_avg
38 |     for ii in xrange(outer_loops * M):
39 |         i = int(np.random.rand() * M)
40 |         margin = Y[i] * np.dot(K[i, :], alpha)
41 |         grad = M * L * K[:, i] * alpha[i]
42 |         if (margin < 1):
43 |             grad -=  Y[i] * K[:, i]
44 |         alpha -=  grad / np.sqrt(ii + 1)
45 |         alpha_avg += alpha
46 | 
47 |     alpha_avg /= (ii + 1) * M
48 | 
49 |     state['alpha'] = alpha
50 |     state['alpha_avg'] = alpha_avg
51 |     state['Xtrain'] = matrix
52 |     state['Sqtrain'] = squared
53 |     ####################
54 |     return state
55 | 
56 | def svm_test(matrix, state):
57 |     M, N = matrix.shape
58 |     output = np.zeros(M)
59 |     ###################
60 |     Xtrain = state['Xtrain']
61 |     Sqtrain = state['Sqtrain']
62 |     matrix = 1. * (matrix > 0)
63 |     squared = np.sum(matrix * matrix, axis=1)
64 |     gram = matrix.dot(Xtrain.T)
65 |     K = np.exp(-(squared.reshape((-1, 1)) + Sqtrain.reshape((1, -1)) - 2 * gram) / (2 * (tau ** 2)))
66 |     alpha_avg = state['alpha_avg']
67 |     preds = K.dot(alpha_avg)
68 |     output = np.sign(preds)
69 |     ###################
70 |     return output
71 | 
72 | def svm_evaluate(output, label):
73 |     error = (output != label).sum() * 1. / len(output)
74 |     print 'Error: %1.4f' % error
75 |     return error
76 | 
77 | def main():
78 |     trainMatrix, tokenlist, trainCategory = svm_readMatrix('./data/MATRIX.TRAIN.400')
79 |     testMatrix, tokenlist, testCategory = svm_readMatrix('./data/MATRIX.TEST')
80 | 
81 |     state = svm_train(trainMatrix, trainCategory)
82 |     output = svm_test(testMatrix, state)
83 | 
84 |     evaluate(output, testCategory)
85 |     return
86 | 
87 | if __name__ == '__main__':
88 |     main()
89 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem3/1_Simple_Neural_Network.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 3\n",
  8 |     "## Problem 1: A Simple Neural Network\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps3.pdf](ps3.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {},
 30 |    "source": [
 31 |     "### Question 1.b)\n",
 32 |     "\n",
 33 |     "![triangle separation](data/triangle_pb3_1.jpg)\n",
 34 |     "\n",
 35 |     "It seems that a triangle can separate the data.\n",
 36 |     "\n",
 37 |     "We can construct a weight matrix by using a combination of linear classifiers, where each side of the triangle represents a decision boundary.\n",
 38 |     "\n",
 39 |     "Each side of the triangle can be represented by an equation of the form $w_0 +w_1 x_1 + w_2 x_2 = 0$. If we transform this equality into an inequality, then the output represents on which side of the decision boundary a given data point $(x_1,x_2)$ belongs. The intersection of the outputs for each of these decision boundaries tells us whether $(x_1,x_2)$ lies within the triangle, in which case we will classify it $0$, and if not as $1$.\n",
 40 |     "\n",
 41 |     "The first weight matrix can be written as:\n",
 42 |     "\n"
 43 |    ]
 44 |   },
 45 |   {
 46 |    "cell_type": "markdown",
 47 |    "metadata": {},
 48 |    "source": [
 49 |     "$$\n",
 50 |     "W^{[1]} = \\left ( \\begin{array}{ccc}\n",
 51 |     "-1 & 4 & 0 \\\\\n",
 52 |     "-1 & 0 & 4 \\\\\n",
 53 |     "4.5 & -1 & -1\n",
 54 |     "\\end{array} \\right )\n",
 55 |     "$$\n",
 56 |     "\n",
 57 |     "The input vector is:\n",
 58 |     "$$\n",
 59 |     "X = (\\begin{array}{ccc}\n",
 60 |     "1 & x_1 & x_2\n",
 61 |     "\\end{array})^T\n",
 62 |     "$$\n",
 63 |     "\n",
 64 |     "- The first line of $W^{[1]}$ is the equation for the vertical side of the triangle, $x_1 = 0.25$\n",
 65 |     "- The second line of $W^{[1]}$ is the equation for the horizontal side of the triangle, $x_2 = 0.25$\n",
 66 |     "- The third line of $W^{[1]}$ is the equation for the oblique side of the triangle, $x_2 = -x_1 + 4.5$\n",
 67 |     "\n",
 68 |     "Consequently, with the given activation function, if the training example given by ($x_1$, $x_2$) lies within the triangle, then:\n",
 69 |     "\n",
 70 |     "$$\n",
 71 |     "f(W^{[1]}X) = (\\begin{array}{ccc}\n",
 72 |     "1 & 1 & 1\n",
 73 |     "\\end{array})^T\n",
 74 |     "$$\n",
 75 |     "\n",
 76 |     "In all other cases, at least one element of the output vector $f(W^{[1]}X)$ is not equal to $1$.\n",
 77 |     "\n",
 78 |     "We can use this observation to find weights for the ouput layer. We take the sum of the components of $f(W^{[1]}X)$, and compare the value to 2.5 to check if all elements are equal to $1$ or not. This gives the weight matrix:\n",
 79 |     "\n",
 80 |     "$$\n",
 81 |     "W^{[2]} =(\\begin{array}{cccc}\n",
 82 |     "2.5 & -1 & -1 & -1\n",
 83 |     "\\end{array})\n",
 84 |     "$$\n",
 85 |     "\n",
 86 |     "The additional term 2.5 is the zero intercept. With this weight matrix, the ouput of the final layer will be $0$ if the training example is within the triangle, and $1$ if it is outside of the triangle.\n",
 87 |     "\n",
 88 |     "The "
 89 |    ]
 90 |   },
 91 |   {
 92 |    "cell_type": "markdown",
 93 |    "metadata": {},
 94 |    "source": [
 95 |     "### Question 1.c)\n",
 96 |     "\n",
 97 |     "A linear activation function does not work, because the problem is not linearly separable, i.e. there is no hyperplane that perfectly separates the data."
 98 |    ]
 99 |   }
100 |  ],
101 |  "metadata": {
102 |   "kernelspec": {
103 |    "display_name": "Python 2",
104 |    "language": "python",
105 |    "name": "python2"
106 |   },
107 |   "language_info": {
108 |    "codemirror_mode": {
109 |     "name": "ipython",
110 |     "version": 2
111 |    },
112 |    "file_extension": ".py",
113 |    "mimetype": "text/x-python",
114 |    "name": "python",
115 |    "nbconvert_exporter": "python",
116 |    "pygments_lexer": "ipython2",
117 |    "version": "2.7.15"
118 |   }
119 |  },
120 |  "nbformat": 4,
121 |  "nbformat_minor": 2
122 | }
123 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem3/2_EM_for_MAP.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 3\n",
  8 |     "## Problem 2: Expectation-Maximization for Maximum a Posteriori\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps3.pdf](ps3.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {},
 30 |    "source": [
 31 |     "This problem is very similar to the derivation of the EM algorithm for MLE given in the lectures notes. The difference is that we are now in a Bayesian setting, and impose a prior on $\\theta$:\n",
 32 |     "\n",
 33 |     "$$\n",
 34 |     "MAP = \\prod_i^m \\sum_{z^i} p(x^i, z^i | \\theta)p(\\theta)\n",
 35 |     "$$\n",
 36 |     "\n",
 37 |     "Here, $z^i$ denotes the latent (hidden) random variables.\n",
 38 |     "\n",
 39 |     "### Step 1: E-step\n",
 40 |     "\n",
 41 |     "1. We start by taking the log-MAP:\n",
 42 |     "\n",
 43 |     "$$\n",
 44 |     "\\log MAP = \\sum_i^m \\log \\sum_{z^i} Q_i(z^i) \\frac{p(x^i, z^i | \\theta)}{Q_i(z^i)} + \\log p(\\theta)\n",
 45 |     "$$\n",
 46 |     "\n",
 47 |     "2. We apply Jensen's inequality to the above formula:\n",
 48 |     "\n",
 49 |     "$$\n",
 50 |     "\\log MAP \\geq \\sum_i^m  \\sum_{z^i} Q_i(z^i) \\log \\frac{p(x^i, z^i | \\theta)}{Q_i(z^i)} + \\log p(\\theta)\n",
 51 |     "$$\n",
 52 |     "\n",
 53 |     "3. Next, we need to choose a distribution $Q_i$ for $z^i$. The above inequality become an equality if $\\frac{p(x^i, z^i | \\theta)}{Q_i(z^i)} = cste$, which will lead to the inequality becoming tight for the current value of $\\theta$:\n",
 54 |     "\n",
 55 |     "$$\n",
 56 |     "\\begin{align*}\n",
 57 |     "\\frac{p(x^i, z^i | \\theta)}{Q_i(z^i)} = \\lambda & \\iff Q_i(z^i) = \\frac{1}{\\lambda} p(x^i, z^i | \\theta) \\\\\n",
 58 |     "& \\iff Q_i(z^i) = \\frac{p(x^i, z^i | \\theta) }{\\sum_{z^i}p(x^i, z^i | \\theta)} \\\\\n",
 59 |     "& \\iff Q_i(z^i) = \\frac{p(x^i, z^i | \\theta) }{p(x^i | \\theta)} \\\\\n",
 60 |     "& \\iff Q_i(z^i) = p(z^i | x^i, \\theta)\n",
 61 |     "\\end{align*}\n",
 62 |     "$$\n",
 63 |     "\n",
 64 |     "This obtained by using the fact that since $Q_i$ is a distribution, $\\sum_{z^i} Q_i(z^i) = 1 \\implies \\lambda = \\sum_{z^i} p(x^i,z^i | \\theta)$.\n",
 65 |     "\n",
 66 |     "**This completes the E-step of the EM algorithm.**"
 67 |    ]
 68 |   },
 69 |   {
 70 |    "cell_type": "markdown",
 71 |    "metadata": {},
 72 |    "source": [
 73 |     "### Step 2: M-step\n",
 74 |     "\n",
 75 |     "For the M-step, we simply maximize the expression obtained in step 2) with respect to $\\theta$:\n",
 76 |     "\n",
 77 |     "$$\n",
 78 |     "\\theta := \\text{arg}\\max_{\\theta} \\sum_i^m  \\sum_{z^i} Q_i(z^i) \\log \\frac{p(x^i, z^i | \\theta)}{Q_i(z^i)} + \\log p(\\theta)\n",
 79 |     "$$\n",
 80 |     "\n",
 81 |     "As usual, we do this by taking the gradient with respect to $\\theta$ and setting it to $0$."
 82 |    ]
 83 |   },
 84 |   {
 85 |    "cell_type": "markdown",
 86 |    "metadata": {},
 87 |    "source": [
 88 |     "### Proof of Convergence\n",
 89 |     "\n",
 90 |     "We consider two successive iterations $k+1$ and $k$  of EM, and we will prove that $\\ell(\\theta^{k+1}) \\geq \\ell(\\theta^k)$, i.e. that $\\ell$ is monotonically increasing.\n",
 91 |     "\n",
 92 |     "We refer the reader to the lecture notes, as the proof is the same."
 93 |    ]
 94 |   }
 95 |  ],
 96 |  "metadata": {
 97 |   "kernelspec": {
 98 |    "display_name": "Python 2",
 99 |    "language": "python",
100 |    "name": "python2"
101 |   },
102 |   "language_info": {
103 |    "codemirror_mode": {
104 |     "name": "ipython",
105 |     "version": 2
106 |    },
107 |    "file_extension": ".py",
108 |    "mimetype": "text/x-python",
109 |    "name": "python",
110 |    "nbconvert_exporter": "python",
111 |    "pygments_lexer": "ipython2",
112 |    "version": "2.7.15"
113 |   }
114 |  },
115 |  "nbformat": 4,
116 |  "nbformat_minor": 2
117 | }
118 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem3/3_EM_Application.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 3\n",
  8 |     "## Problem 3: EM Application - Paper Reviews\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps3.pdf](ps3.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {},
 30 |    "source": [
 31 |     "### Question 3.a.i)\n",
 32 |     "\n",
 33 |     "$x^{pr} = y^{pr}+z^{pr}+\\epsilon^{pr}$.\n",
 34 |     "\n",
 35 |     "Given that, $y^{pr}$, $z^{pr}$ and $\\epsilon^{pr}$ are all Gaussian, then $x^{pr}$ is also gaussian, with\n",
 36 |     "\n",
 37 |     "- Mean $\\mu_p + \\nu_r$\n",
 38 |     "- Variance $\\sigma_p^2 + \\tau_r^2 + \\sigma^2$\n",
 39 |     "\n",
 40 |     "The joint probability distribution for $x^{pr}, y^{pr}, z^{pr})$ is gaussian, with:\n",
 41 |     "\n",
 42 |     "- Mean $[\\mu_p + \\nu_r, \\mu_p, \\nu_r]^T$\n",
 43 |     "- Covariance:\n",
 44 |     "\n",
 45 |     "$$\n",
 46 |     "\\left( \\begin{array}{ccc}\n",
 47 |     "\\sigma_p^2 + \\tau_r^2 + \\sigma^2 & \\sigma_p^2 & \\tau_r^2 \\\\\n",
 48 |     "\\sigma_p^2 & \\sigma_p^2 & 0 \\\\\n",
 49 |     "\\tau_r^2 & 0 & \\tau_r^2\n",
 50 |     "\\end{array}\\right)\n",
 51 |     "$$"
 52 |    ]
 53 |   },
 54 |   {
 55 |    "cell_type": "markdown",
 56 |    "metadata": {},
 57 |    "source": [
 58 |     "### Question 3.a.ii)\n",
 59 |     "\n",
 60 |     "For the E-step, we are looking for a certain distribution $Q_{pr}(y^{pr},z^{pr})$ such that:\n",
 61 |     "\n",
 62 |     "$$\n",
 63 |     "\\frac{p(x^{pr},y^{pr},{z^{pr}})}{Q_{pr}(y^{pr},z^{pr})} = cste\n",
 64 |     "$$\n",
 65 |     "\n",
 66 |     "Since $Q_{pr}$ is a distribution, it must sum (discrete case) or integrate (continuous case) to one, i.e. : \n",
 67 |     "\n",
 68 |     "$$\n",
 69 |     "\\sum_{r=1}^R \\sum_{p=1}^P Q_{pr}(y^{pr},z^{pr}) = 1\n",
 70 |     "$$\n",
 71 |     "\n",
 72 |     "This yields the value for the constant, and hence the value of $Q_{pr}$:\n",
 73 |     "\n",
 74 |     "$$\n",
 75 |     "\\begin{align*}\n",
 76 |     "Q_{pr} &= \\frac{p(x^{pr},y^{pr},{z^{pr}})}{\\sum_{r=1}^R \\sum_{p=1}^P p(x^{pr},y^{pr},{z^{pr}})} \\\\\n",
 77 |     "& = \\frac{p(x^{pr},y^{pr},{z^{pr}})}{p(x^{pr})}\n",
 78 |     "\\end{align*}\n",
 79 |     "$$\n",
 80 |     "\n",
 81 |     "We recognize the conditional probability given below:\n",
 82 |     "\n",
 83 |     "$$\n",
 84 |     "Q_{pr} = p(y^{pr},z^{pr} | x^{pr})\n",
 85 |     "$$\n",
 86 |     "\n",
 87 |     "This is also a gaussian distribution. Calculations are heavy, but the mean and variance of this joint distribution are given by:\n",
 88 |     "\n",
 89 |     "\\begin{align*}\n",
 90 |     "\\mu_x &=\n",
 91 |     "\\begin{bmatrix}\n",
 92 |     "\\mu_p \\\\\n",
 93 |     "\\nu_r \\\\\n",
 94 |     "\\end{bmatrix} + \\frac{x^{pr} - \\mu_p - \\nu_r}{\\sigma_p^2 + \\tau_r^2 + \\sigma^2} \n",
 95 |     "\\begin{bmatrix}\n",
 96 |     "\\sigma_p^2 \\\\\n",
 97 |     "\\tau^2 \\\\\n",
 98 |     "\\end{bmatrix}\n",
 99 |     "\\end{align*}\n",
100 |     "\n",
101 |     "\\begin{align*}\n",
102 |     "\\Sigma_x \n",
103 |     "&= \\begin{bmatrix}\n",
104 |     "\\sigma_p^2 & 0        \\\\ \n",
105 |     "0          & \\tau_r^2 \\\\ \n",
106 |     "\\end{bmatrix} - \\frac{1}{\\sigma_p^2 + \\tau_r^2 + \\sigma^2} \\begin{bmatrix}\n",
107 |     "\\sigma_p^4        & \\sigma_p^2 \\tau_r^2 \\\\\n",
108 |     "\\tau_r^2 \\sigma_p^2 & \\tau_r^4            \\\\\n",
109 |     "\\end{bmatrix} \n",
110 |     "\\end{align*}"
111 |    ]
112 |   },
113 |   {
114 |    "cell_type": "markdown",
115 |    "metadata": {},
116 |    "source": [
117 |     "### Question 3.b)"
118 |    ]
119 |   },
120 |   {
121 |    "cell_type": "markdown",
122 |    "metadata": {},
123 |    "source": [
124 |     "In the E-step, we calculate a lower bound for the log-likelihood, and make it tight for the current value of the parameters of the latent variables $y^{pr}$ and $z^{pr}$. In the M-step, we update those parameters by maximizing the lower bound calculated in the E-step. This is done by calculating the gradient of the lower bound with respect to the parameters ($\\mu_p, \\sigma_p, \\nu_r, \\tau_r $), and setting the gradient to $0$."
125 |    ]
126 |   }
127 |  ],
128 |  "metadata": {
129 |   "kernelspec": {
130 |    "display_name": "Python 2",
131 |    "language": "python",
132 |    "name": "python2"
133 |   },
134 |   "language_info": {
135 |    "codemirror_mode": {
136 |     "name": "ipython",
137 |     "version": 2
138 |    },
139 |    "file_extension": ".py",
140 |    "mimetype": "text/x-python",
141 |    "name": "python",
142 |    "nbconvert_exporter": "python",
143 |    "pygments_lexer": "ipython2",
144 |    "version": "2.7.15"
145 |   }
146 |  },
147 |  "nbformat": 4,
148 |  "nbformat_minor": 2
149 | }
150 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem3/4_KL_Divergence.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 3\n",
  8 |     "## Problem 4: KL Divergence and Maximum Likelihood\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps3.pdf](ps3.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x^i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y^i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $m$ is the number of training examples\n",
 24 |     "- $n$ is the number of features"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {},
 30 |    "source": [
 31 |     "### Question 4.a)\n",
 32 |     "\n",
 33 |     "The goal is to prove that $KL(P||Q) \\geq 0$.\n",
 34 |     "\n",
 35 |     "$$\n",
 36 |     "\\begin{align*}\n",
 37 |     "KL(P||Q) &= \\sum_x P(x) \\log \\frac{P(x)}{Q(x)} \\\\\n",
 38 |     "&= -\\sum_x P(x) \\log \\frac{Q(x)}{P(x)} \\\\\n",
 39 |     "& \\geq -\\log \\sum_x P(x) \\frac{Q(x)}{P(x)} \\\\\n",
 40 |     "& \\geq -\\log \\sum_x Q(x) \\\\\n",
 41 |     "& \\geq - \\log 1 \\\\\n",
 42 |     "& \\geq 0\n",
 43 |     "\\end{align*}\n",
 44 |     "$$\n",
 45 |     "\n",
 46 |     "Now we prove $KL(P||Q) = 0 \\iff P=Q$.\n",
 47 |     "\n",
 48 |     "1. If $P = Q$, then it is immediate that $\\log \\frac{P(x)}{Q(x)} = 0$ and hence $KL(P||Q) = 0$\n",
 49 |     "\n",
 50 |     "2. If $KL(P||Q) = 0$, then $\\forall x$, $\\frac{P(x)}{Q(x)} = 1$, therefore $P = Q$\n",
 51 |     "\n"
 52 |    ]
 53 |   },
 54 |   {
 55 |    "cell_type": "markdown",
 56 |    "metadata": {},
 57 |    "source": [
 58 |     "### Question 4.b)\n",
 59 |     "\n",
 60 |     "\\begin{align*}\n",
 61 |     "KL(P(X) \\parallel Q(X)) + KL(P(Y|X) \\parallel Q(Y|X)) \n",
 62 |     "&= \\sum_x P(x) (\\log \\frac{P(x)}{Q(x)} + \\sum_y P(y|x) \\log \\frac{P(y|x)}{Q(y|x)}) \\\\\n",
 63 |     "&= \\sum_x P(x) \\sum_y P(y|x) ( \\log \\frac{P(x)}{Q(x)} + \\log \\frac{P(y|x)}{Q(y|x)} ) \\\\\n",
 64 |     "\\end{align*}\n",
 65 |     "\n",
 66 |     "We can include the term $\\log \\frac{P(x)}{Q(x)}$ in the sum over $y$, because $\\sum_y P(y|x) = 1$ since $P$ is a probability distribution. We continue the calculation:\n",
 67 |     "\n",
 68 |     "\\begin{align*}\n",
 69 |     "KL(P(X) \\parallel Q(X)) + KL(P(Y|X) \\parallel Q(Y|X)) \n",
 70 |     "&= \\sum_x P(x) \\sum_y P(y|x) \\log \\frac{P(x) P(y|x)}{Q(x) Q(y|x)} \\\\\n",
 71 |     "&= \\sum_x P(x) \\sum_y P(y|x)  \\log \\frac{P(x, y)}{Q(x, y)} \\\\\n",
 72 |     "&= \\sum_x P(x, y) \\log \\frac{P(x, y)}{Q(x, y)} \\\\\n",
 73 |     "&= KL(P(X, Y) || Q(X, Y)) \\\\\n",
 74 |     "\\end{align*}"
 75 |    ]
 76 |   },
 77 |   {
 78 |    "cell_type": "markdown",
 79 |    "metadata": {},
 80 |    "source": [
 81 |     "### Question 4.c)\n",
 82 |     "\n",
 83 |     "\\begin{align*}\n",
 84 |     "KL(\\hat P || P_{\\theta}) \n",
 85 |     "&= \\sum_x \\hat P(x) \\log \\frac{\\hat P(x)}{P_{\\theta}(x)} \\\\\n",
 86 |     "&= - \\sum_x \\hat P(x) \\log \\frac{P_{\\theta}(x)}{\\hat P(x)} \\\\\n",
 87 |     "&= - \\sum_x (\\frac{1}{m} \\sum_{i=1}^{m} 1 \\{x^{(i)} = x\\}). \\log \\frac{P_{\\theta}(x)}{\\frac{1}{m} \\sum_{i=1}^{m} 1 \\{x^{(i)} = x\\}} \\\\\n",
 88 |     "&= - \\frac{1}{m} \\sum_{i=1}^{m} \\log P_{\\theta}(x^{(i)}) \\\\\n",
 89 |     "\\end{align*}\n",
 90 |     "\n",
 91 |     "Thus, minimizing $KL(\\hat P || P_{\\theta})$ is equivalent to maximizing $\\sum_{i=1}^{m} \\log P_{\\theta}(x^{(i)}) = \\ell(\\theta)$"
 92 |    ]
 93 |   }
 94 |  ],
 95 |  "metadata": {
 96 |   "kernelspec": {
 97 |    "display_name": "Python 2",
 98 |    "language": "python",
 99 |    "name": "python2"
100 |   },
101 |   "language_info": {
102 |    "codemirror_mode": {
103 |     "name": "ipython",
104 |     "version": 2
105 |    },
106 |    "file_extension": ".py",
107 |    "mimetype": "text/x-python",
108 |    "name": "python",
109 |    "nbconvert_exporter": "python",
110 |    "pygments_lexer": "ipython2",
111 |    "version": "2.7.15"
112 |   }
113 |  },
114 |  "nbformat": 4,
115 |  "nbformat_minor": 2
116 | }
117 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem3/data/mandrill-large.tiff:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem3/data/mandrill-large.tiff


--------------------------------------------------------------------------------
/Machine Learning/Problem3/data/mandrill-small.tiff:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem3/data/mandrill-small.tiff


--------------------------------------------------------------------------------
/Machine Learning/Problem3/data/triangle_pb3_1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem3/data/triangle_pb3_1.jpg


--------------------------------------------------------------------------------
/Machine Learning/Problem3/ps3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem3/ps3.pdf


--------------------------------------------------------------------------------
/Machine Learning/Problem4/2_EM-Convergence.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# CS229: Problem Set 4\n",
 8 |     "## Problem 2: Expectation-Maximization Convergence \n",
 9 |     "\n",
10 |     "\n",
11 |     "**C. Combier**\n",
12 |     "\n",
13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
14 |     "\n",
15 |     "The problem set can be found here: [./ps4.pdf](ps4.pdf)\n",
16 |     "\n",
17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
18 |     "\n",
19 |     "## Notation\n",
20 |     "\n",
21 |     "- $x_i$ is the $i^{th}$ feature vector\n",
22 |     "- $y_i$ is the expected outcome for the $i^{th}$ training example\n",
23 |     "- $z_i$'s are the latent (hidden) variables\n",
24 |     "- $m$ is the number of training examples\n",
25 |     "- $n$ is the number of features"
26 |    ]
27 |   },
28 |   {
29 |    "cell_type": "markdown",
30 |    "metadata": {
31 |     "colab_type": "text",
32 |     "id": "2URlArNNdz_q"
33 |    },
34 |    "source": [
35 |     "After the E-step, we obtain a lower bound on the log-likelihood denoted by:\n",
36 |     "\n",
37 |     "$$\n",
38 |     "\\beta = \\sum_i^m \\sum_{z_i} Q_i(z_i) \\log \\frac{ p(x_i, z_i;\\theta)}{Q_i (z_i)}\n",
39 |     "$$\n",
40 |     "\n",
41 |     "This lower bound $\\beta$ has been made tight by setting:\n",
42 |     "\n",
43 |     "$$\n",
44 |     "Q_i(z_i) = p(z_i |x_i; \\theta) = \\frac{p(x_i, z_i; \\theta)}{p(x_i; \\theta)}\n",
45 |     "$$\n",
46 |     "\n",
47 |     "For the M-step, we maximize $\\beta$ by taking the gradient with respect to $\\theta$ and setting it to zero.\n",
48 |     "\n",
49 |     "Suppose that EM hase converged, and that $\\theta = \\theta^*$.\n",
50 |     "\n",
51 |     "In this case:\n",
52 |     "\n",
53 |     "$$\n",
54 |     "\\begin{align*}\n",
55 |     "\\nabla_{\\theta} \\beta & = \\sum_i^m \\sum_{z_i} Q_i(z_i) \\nabla_{\\theta}\\log \\frac{ p(x_i, z_i;\\theta)}{Q_i (z_i)}_{| \\theta = \\theta^*} \\\\\n",
56 |     "&= \\sum_i^m \\sum_{z_i} Q_i(z_i)  \\frac{Q_i(z_i) }{p(x_i, z_i;\\theta^*) Q_i (z_i)} \\nabla_{\\theta}  p(x_i, z_i;\\theta)_{| \\theta = \\theta^*}  \\\\\n",
57 |     "&= \\sum_i^m \\sum_{z_i}  \\frac{p(x_i, z_i; \\theta^*)}{p(x_i; \\theta^*) p(x_i, z_i; \\theta^*)} \\nabla_{\\theta}  p(x_i, z_i;\\theta^*)_{| \\theta = \\theta^*} \\\\\n",
58 |     "&= \\sum_i^m \\sum_{z_i}  \\frac{\\nabla_{\\theta}  p(x_i, z_i;\\theta)_{| \\theta = \\theta^*}  }{p(x_i; \\theta^*) } \\\\\n",
59 |     "&= \\sum_i^m \\frac{\\nabla_{\\theta}  p(x_i;\\theta)_{| \\theta = \\theta^*}  }{p(x_i; \\theta^*) } \\\\\n",
60 |     "&= \\sum_i^m \\nabla_{\\theta}  \\log p(x_i;\\theta)_{| \\theta = \\theta^*}   \\\\\n",
61 |     "&= \\nabla_{\\theta} ( \\sum_i^m  \\log p(x_i;\\theta) )_{| \\theta = \\theta^*}\\\\\n",
62 |     "&= \\nabla_{\\theta} \\ell (\\theta)_{| \\theta = \\theta^*}\n",
63 |     "\\end{align*}\n",
64 |     "$$"
65 |    ]
66 |   }
67 |  ],
68 |  "metadata": {
69 |   "colab": {
70 |    "collapsed_sections": [],
71 |    "name": "Bonjour, Colaboratory",
72 |    "provenance": [],
73 |    "version": "0.3.2"
74 |   },
75 |   "kernelspec": {
76 |    "display_name": "Python 2",
77 |    "language": "python",
78 |    "name": "python2"
79 |   },
80 |   "language_info": {
81 |    "codemirror_mode": {
82 |     "name": "ipython",
83 |     "version": 2
84 |    },
85 |    "file_extension": ".py",
86 |    "mimetype": "text/x-python",
87 |    "name": "python",
88 |    "nbconvert_exporter": "python",
89 |    "pygments_lexer": "ipython2",
90 |    "version": "2.7.15"
91 |   }
92 |  },
93 |  "nbformat": 4,
94 |  "nbformat_minor": 1
95 | }
96 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem4/4_Independent-Component-Analysis.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# CS229: Problem Set 4\n",
  8 |     "## Problem 4: Independent Component Analysis\n",
  9 |     "\n",
 10 |     "\n",
 11 |     "**C. Combier**\n",
 12 |     "\n",
 13 |     "This iPython Notebook provides solutions to Stanford's CS229 (Machine Learning, Fall 2017) graduate course problem set 3, taught by Andrew Ng.\n",
 14 |     "\n",
 15 |     "The problem set can be found here: [./ps4.pdf](ps4.pdf)\n",
 16 |     "\n",
 17 |     "I chose to write the solutions to the coding questions in Python, whereas the Stanford class is taught with Matlab/Octave.\n",
 18 |     "\n",
 19 |     "## Notation\n",
 20 |     "\n",
 21 |     "- $x_i$ is the $i^{th}$ feature vector\n",
 22 |     "- $y_i$ is the expected outcome for the $i^{th}$ training example\n",
 23 |     "- $z_i$'s are the latent (hidden) variables\n",
 24 |     "- $m$ is the number of training examples\n",
 25 |     "- $n$ is the number of features\n",
 26 |     "\n",
 27 |     "For clarity, I've inlined the code of the provided helper function ```belsej.py```.\n",
 28 |     "\n",
 29 |     "## Dependencies\n",
 30 |     "\n",
 31 |     "I installed ```sounddevice``` to Anaconda with the following command:\n",
 32 |     "\n",
 33 |     "```conda install -c conda-forge python-sounddevice ```\n",
 34 |     "\n",
 35 |     "First, let's set up the environment and write helper functions:\n",
 36 |     "\n",
 37 |     "- ```normalize``` ensures all mixes have the same volume\n",
 38 |     "- ```load_data``` loads the mix\n",
 39 |     "- ```play``` plays the audio using ```sounddevice```"
 40 |    ]
 41 |   },
 42 |   {
 43 |    "cell_type": "code",
 44 |    "execution_count": 2,
 45 |    "metadata": {},
 46 |    "outputs": [],
 47 |    "source": [
 48 |     "### Independent Components Analysis\n",
 49 |     "###\n",
 50 |     "### This program requires a working installation of:\n",
 51 |     "###\n",
 52 |     "### On Mac:\n",
 53 |     "### conda install -c conda-forge python-sounddevice\n",
 54 |     "###\n",
 55 |     "\n",
 56 |     "import sounddevice as sd\n",
 57 |     "import numpy as np\n",
 58 |     "\n",
 59 |     "Fs = 11025\n",
 60 |     "\n",
 61 |     "def normalize(dat):\n",
 62 |     "    return 0.99 * dat / np.max(np.abs(dat))\n",
 63 |     "\n",
 64 |     "def load_data():\n",
 65 |     "    mix = np.loadtxt('data/mix.dat')\n",
 66 |     "    return mix\n",
 67 |     "\n",
 68 |     "def play(vec):\n",
 69 |     "    sd.play(vec, Fs, blocking=True)"
 70 |    ]
 71 |   },
 72 |   {
 73 |    "cell_type": "markdown",
 74 |    "metadata": {},
 75 |    "source": [
 76 |     "Next we write a numerically stable sigmoid function, to avoid overflows:"
 77 |    ]
 78 |   },
 79 |   {
 80 |    "cell_type": "code",
 81 |    "execution_count": 1,
 82 |    "metadata": {},
 83 |    "outputs": [],
 84 |    "source": [
 85 |     "# Numerically stable sigmoid\n",
 86 |     "def sigmoid(x):\n",
 87 |     "    return np.where(x >= 0, 1 / (1 + np.exp(-x)), np.exp(x) / (1 + np.exp(x)))"
 88 |    ]
 89 |   },
 90 |   {
 91 |    "cell_type": "markdown",
 92 |    "metadata": {},
 93 |    "source": [
 94 |     "The following functions calculates the weights to separate the independent components of the five mixes, using stochastic gradient descent and annealing to speed up convergence."
 95 |    ]
 96 |   },
 97 |   {
 98 |    "cell_type": "code",
 99 |    "execution_count": 3,
100 |    "metadata": {},
101 |    "outputs": [],
102 |    "source": [
103 |     "def unmixer(X):\n",
104 |     "    M, N = X.shape\n",
105 |     "    W = np.eye(N)\n",
106 |     "\n",
107 |     "    anneal = [0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.02, 0.02, 0.01, 0.01,\n",
108 |     "              0.005, 0.005, 0.002, 0.002, 0.001, 0.001]\n",
109 |     "    print('Separating tracks ...')\n",
110 |     "    for alpha in anneal:\n",
111 |     "        for xi in X:\n",
112 |     "            W += alpha * (np.outer(1 - 2 * sigmoid(np.dot(W, xi.T)), xi) + np.linalg.inv(W.T))\n",
113 |     "    return W"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "markdown",
118 |    "metadata": {},
119 |    "source": [
120 |     "Finally, this last function unmixes the 5 mixes to extract the independent components."
121 |    ]
122 |   },
123 |   {
124 |    "cell_type": "code",
125 |    "execution_count": 4,
126 |    "metadata": {},
127 |    "outputs": [],
128 |    "source": [
129 |     "def unmix(X, W):\n",
130 |     "    S = np.zeros(X.shape)\n",
131 |     "    S = X.dot(W.T)\n",
132 |     "    return S"
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "markdown",
137 |    "metadata": {},
138 |    "source": [
139 |     "Now, we load the mix data:"
140 |    ]
141 |   },
142 |   {
143 |    "cell_type": "code",
144 |    "execution_count": 8,
145 |    "metadata": {},
146 |    "outputs": [
147 |     {
148 |      "name": "stdout",
149 |      "output_type": "stream",
150 |      "text": [
151 |       "Playing mixed track 0\n",
152 |       "Playing mixed track 1\n",
153 |       "Playing mixed track 2\n",
154 |       "Playing mixed track 3\n",
155 |       "Playing mixed track 4\n"
156 |      ]
157 |     }
158 |    ],
159 |    "source": [
160 |     "X = normalize(load_data())\n",
161 |     "for i in range(X.shape[1]):\n",
162 |     "    print('Playing mixed track %d' % i)\n",
163 |     "    play(X[:, i])"
164 |    ]
165 |   },
166 |   {
167 |    "cell_type": "markdown",
168 |    "metadata": {},
169 |    "source": [
170 |     "Next, we run Independent Component Analysis and separate the components in the mix:"
171 |    ]
172 |   },
173 |   {
174 |    "cell_type": "code",
175 |    "execution_count": 7,
176 |    "metadata": {},
177 |    "outputs": [
178 |     {
179 |      "name": "stdout",
180 |      "output_type": "stream",
181 |      "text": [
182 |       "Separating tracks ...\n"
183 |      ]
184 |     }
185 |    ],
186 |    "source": [
187 |     "W = unmixer(X)\n",
188 |     "S = normalize(unmix(X, W))"
189 |    ]
190 |   },
191 |   {
192 |    "cell_type": "markdown",
193 |    "metadata": {},
194 |    "source": [
195 |     "Finally, we play the separated components:"
196 |    ]
197 |   },
198 |   {
199 |    "cell_type": "code",
200 |    "execution_count": 9,
201 |    "metadata": {},
202 |    "outputs": [
203 |     {
204 |      "name": "stdout",
205 |      "output_type": "stream",
206 |      "text": [
207 |       "Playing separated track 0\n",
208 |       "Playing separated track 1\n",
209 |       "Playing separated track 2\n",
210 |       "Playing separated track 3\n",
211 |       "Playing separated track 4\n"
212 |      ]
213 |     }
214 |    ],
215 |    "source": [
216 |     "for i in range(S.shape[1]):\n",
217 |     "    print('Playing separated track %d' % i)\n",
218 |     "    play(S[:, i])"
219 |    ]
220 |   }
221 |  ],
222 |  "metadata": {
223 |   "colab": {
224 |    "collapsed_sections": [],
225 |    "name": "Bonjour, Colaboratory",
226 |    "provenance": [],
227 |    "version": "0.3.2"
228 |   },
229 |   "kernelspec": {
230 |    "display_name": "Python 2",
231 |    "language": "python",
232 |    "name": "python2"
233 |   },
234 |   "language_info": {
235 |    "codemirror_mode": {
236 |     "name": "ipython",
237 |     "version": 2
238 |    },
239 |    "file_extension": ".py",
240 |    "mimetype": "text/x-python",
241 |    "name": "python",
242 |    "nbconvert_exporter": "python",
243 |    "pygments_lexer": "ipython2",
244 |    "version": "2.7.15"
245 |   }
246 |  },
247 |  "nbformat": 4,
248 |  "nbformat_minor": 1
249 | }
250 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem4/data/bellsej.py:
--------------------------------------------------------------------------------
 1 | ### Independent Components Analysis
 2 | ###
 3 | ### This program requires a working installation of:
 4 | ###
 5 | ### On Mac:
 6 | ###     1. portaudio: On Mac: brew install portaudio
 7 | ###     2. sounddevice: pip install sounddevice
 8 | ###
 9 | ### On windows:
10 | ###      pip install pyaudio sounddevice
11 | ###
12 | 
13 | import sounddevice as sd
14 | import numpy as np
15 | 
16 | Fs = 11025
17 | 
18 | def normalize(dat):
19 |     return 0.99 * dat / np.max(np.abs(dat))
20 | 
21 | def load_data():
22 |     mix = np.loadtxt('mix.dat')
23 |     return mix
24 | 
25 | def play(vec):
26 |     sd.play(vec, Fs, blocking=True)
27 | 
28 | def unmixer(X):
29 |     M, N = X.shape
30 |     W = np.eye(N)
31 | 
32 |     anneal = [0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.02, 0.02, 0.01, 0.01,
33 |               0.005, 0.005, 0.002, 0.002, 0.001, 0.001]
34 |     print('Separating tracks ...')
35 |     ######## Your code here ##########
36 | 
37 |     ###################################
38 |     return W
39 | 
40 | def unmix(X, W):
41 |     S = np.zeros(X.shape)
42 | 
43 |     ######### Your code here ##########
44 | 
45 |     ##################################
46 |     return S
47 | 
48 | def main():
49 |     X = normalize(load_data())
50 | 
51 |     for i in range(X.shape[1]):
52 |         print('Playing mixed track %d' % i)
53 |         play(X[:, i])
54 | 
55 |     W = unmixer(X)
56 |     S = normalize(unmix(X, W))
57 | 
58 |     for i in range(S.shape[1]):
59 |         print('Playing separated track %d' % i)
60 |         play(S[:, i])
61 | 
62 | if __name__ == '__main__':
63 |     main()
64 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem4/data/cart_pole.py:
--------------------------------------------------------------------------------
  1 | """
  2 | CS 229 Machine Learning, Fall 2017
  3 | Problem Set 4
  4 | Question: Reinforcement Learning: The inverted pendulum
  5 | Author: Sanyam Mehra, sanyam@stanford.edu
  6 | """
  7 | from __future__ import division, print_function
  8 | from math import sin, cos, pi
  9 | import matplotlib.pyplot as plt
 10 | import matplotlib.patches as patches
 11 | 
 12 | class CartPole:
 13 |     def __init__(self, physics):
 14 |         self.physics = physics
 15 |         self.mass_cart = 1.0
 16 |         self.mass_pole = 0.3
 17 |         self.mass = self.mass_cart + self.mass_pole
 18 |         self.length = 0.7 # actually half the pole length
 19 |         self.pole_mass_length = self.mass_pole * self.length
 20 | 
 21 |     def simulate(self, action, state_tuple):
 22 |         """
 23 |         Simulation dynamics of the cart-pole system
 24 | 
 25 |         Parameters
 26 |         ----------
 27 |         action : int
 28 |             Action represented as 0 or 1
 29 |         state_tuple : tuple
 30 |             Continuous vector of x, x_dot, theta, theta_dot
 31 | 
 32 |         Returns
 33 |         -------
 34 |         new_state : tuple
 35 |             Updated state vector of new_x, new_x_dot, nwe_theta, new_theta_dot
 36 |         """
 37 |         x, x_dot, theta, theta_dot = state_tuple
 38 |         costheta, sintheta = cos(theta), sin(theta)
 39 |         # costheta, sintheta = cos(theta * 180 / pi), sin(theta * 180 / pi)
 40 | 
 41 |         # calculate force based on action
 42 |         force = self.physics.force_mag if action > 0 else (-1 * self.physics.force_mag)
 43 | 
 44 |         # intermediate calculation
 45 |         temp = (force + self.pole_mass_length * theta_dot * theta_dot * sintheta) / self.mass
 46 |         theta_acc = (self.physics.gravity * sintheta - temp * costheta) / (self.length * (4/3 - self.mass_pole * costheta * costheta / self.mass))
 47 | 
 48 |         x_acc = temp - self.pole_mass_length * theta_acc * costheta / self.mass
 49 | 
 50 |         # return new state variable using Euler's method
 51 |         new_x = x + self.physics.tau * x_dot
 52 |         new_x_dot = x_dot + self.physics.tau * x_acc
 53 |         new_theta = theta + self.physics.tau * theta_dot
 54 |         new_theta_dot = theta_dot + self.physics.tau * theta_acc
 55 |         new_state = (new_x, new_x_dot, new_theta, new_theta_dot)
 56 | 
 57 |         return new_state
 58 | 
 59 |     def get_state(self, state_tuple):
 60 |         """
 61 |         Discretizes the continuous state vector. The current discretization
 62 |         divides x into 3, x_dot into 3, theta into 6 and theta_dot into 3
 63 |         categories. A finer discretization produces a larger state space
 64 |         but allows for a better policy
 65 | 
 66 |         Parameters
 67 |         ----------
 68 |         state_tuple : tuple
 69 |             Continuous vector of x, x_dot, theta, theta_dot
 70 | 
 71 |         Returns
 72 |         -------
 73 |         state : int
 74 |             Discretized state value
 75 |         """
 76 |         x, x_dot, theta, theta_dot = state_tuple
 77 |         # parameters for state discretization in get_state
 78 |         # convert degrees to radians
 79 |         one_deg = pi / 180
 80 |         six_deg = 6 * pi / 180
 81 |         twelve_deg = 12 * pi / 180
 82 |         fifty_deg = 50 * pi / 180
 83 | 
 84 |         total_states = 163
 85 |         state = 0
 86 | 
 87 |         if x < -2.4 or x > 2.4 or theta < -twelve_deg or theta > twelve_deg:
 88 |             state = total_states - 1 # to signal failure
 89 |         else:
 90 |             # x: 3 categories
 91 |             if x < -1.5:
 92 |                 state = 0
 93 |             elif x < 1.5:
 94 |                 state = 1
 95 |             else:
 96 |                 state = 2
 97 |             # x_dot: 3 categories
 98 |             if x_dot < -0.5:
 99 |                 pass
100 |             elif x_dot < 0.5:
101 |                 state += 3
102 |             else:
103 |                 state += 6
104 |             # theta: 6 categories
105 |             if theta < -six_deg:
106 |                 pass
107 |             elif theta < -one_deg:
108 |                 state += 9
109 |             elif theta < 0:
110 |                 state += 18
111 |             elif theta < one_deg:
112 |                 state += 27
113 |             elif theta < six_deg:
114 |                 state += 36
115 |             else:
116 |                 state += 45
117 |             # theta_dot: 3 categories
118 |             if theta_dot < -fifty_deg:
119 |                 pass
120 |             elif theta_dot < fifty_deg:
121 |                 state += 54
122 |             else:
123 |                 state += 108
124 |         # state += 1 # converting from MATLAB 1-indexing to 0-indexing
125 |         return state
126 | 
127 |     def show_cart(self, state_tuple, pause_time):
128 |         """
129 |         Given the `state_tuple`, displays the cart-pole system.
130 | 
131 |         Parameters
132 |         ----------
133 |         state_tuple : tuple
134 |             Continuous vector of x, x_dot, theta, theta_dot
135 |         pause_time : float
136 |             Time delay in seconds
137 | 
138 |         Returns
139 |         -------
140 |         """
141 |         x, x_dot, theta, theta_dot = state_tuple
142 |         X = [x, x + 4*self.length * sin(theta)]
143 |         Y = [0, 4*self.length * cos(theta)]
144 |         plt.close('all')
145 |         fig, ax = plt.subplots(1)
146 |         plt.ion()
147 |         ax.set_xlim(-3, 3)
148 |         ax.set_ylim(-0.5, 3.5)
149 |         ax.plot(X, Y)
150 |         cart = patches.Rectangle((x - 0.4, -0.25), 0.8, 0.25,
151 |                         linewidth=1, edgecolor='k', facecolor='cyan')
152 |         base = patches.Rectangle((x - 0.01, -0.5), 0.02, 0.25,
153 |                         linewidth=1, edgecolor='k', facecolor='r')
154 |         ax.add_patch(cart)
155 |         ax.add_patch(base)
156 |         x_dot_str, theta_str, theta_dot_str = '\\dot{x}', '\\theta', '\\dot{\\theta}'
157 |         ax.set_title('x: %.3f, $%s$: %.3f, $%s$: %.3f, $%s$: %.3f'\
158 |                                 %(x, x_dot_str, x_dot, theta_str, theta, theta_dot_str, x))
159 |         plt.show()
160 |         plt.pause(pause_time)
161 | 
162 | class Physics:
163 |     gravity = 9.8
164 |     force_mag = 10.0
165 |     tau = 0.02 # seconds between state updates
166 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem4/data/mnist.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem4/data/mnist.zip


--------------------------------------------------------------------------------
/Machine Learning/Problem4/data/nn_starter.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import matplotlib.pyplot as plt
  3 | 
  4 | def readData(images_file, labels_file):
  5 |     x = np.loadtxt(images_file, delimiter=',')
  6 |     y = np.loadtxt(labels_file, delimiter=',')
  7 |     return x, y
  8 | 
  9 | def softmax(x):
 10 |     """
 11 |     Compute softmax function for input. 
 12 |     Use tricks from previous assignment to avoid overflow
 13 |     """
 14 | 	### YOUR CODE HERE
 15 | 
 16 | 	### END YOUR CODE
 17 |     return s
 18 | 
 19 | def sigmoid(x):
 20 |     """
 21 |     Compute the sigmoid function for the input here.
 22 |     """
 23 |     ### YOUR CODE HERE
 24 | 
 25 |     ### END YOUR CODE
 26 |     return s
 27 | 
 28 | def forward_prop(data, labels, params):
 29 |     """
 30 |     return hidder layer, output(softmax) layer and loss
 31 |     """
 32 |     W1 = params['W1']
 33 |     b1 = params['b1']
 34 |     W2 = params['W2']
 35 |     b2 = params['b2']
 36 | 
 37 |     ### YOUR CODE HERE
 38 | 
 39 |     ### END YOUR CODE
 40 |     return h, y, cost
 41 | 
 42 | def backward_prop(data, labels, params):
 43 |     """
 44 |     return gradient of parameters
 45 |     """
 46 |     W1 = params['W1']
 47 |     b1 = params['b1']
 48 |     W2 = params['W2']
 49 |     b2 = params['b2']
 50 | 
 51 |     ### YOUR CODE HERE
 52 | 
 53 |     ### END YOUR CODE
 54 | 
 55 |     grad = {}
 56 |     grad['W1'] = gradW1
 57 |     grad['W2'] = gradW2
 58 |     grad['b1'] = gradb1
 59 |     grad['b2'] = gradb2
 60 | 
 61 |     return grad
 62 | 
 63 | def nn_train(trainData, trainLabels, devData, devLabels):
 64 |     (m, n) = trainData.shape
 65 |     num_hidden = 300
 66 |     learning_rate = 5
 67 |     params = {}
 68 | 
 69 |     ### YOUR CODE HERE
 70 | 
 71 |     ### END YOUR CODE
 72 | 
 73 |     return params
 74 | 
 75 | def nn_test(data, labels, params):
 76 |     h, output, cost = forward_prop(data, labels, params)
 77 |     accuracy = compute_accuracy(output, labels)
 78 |     return accuracy
 79 | 
 80 | def compute_accuracy(output, labels):
 81 |     accuracy = (np.argmax(output,axis=1) == np.argmax(labels,axis=1)).sum() * 1. / labels.shape[0]
 82 |     return accuracy
 83 | 
 84 | def one_hot_labels(labels):
 85 |     one_hot_labels = np.zeros((labels.size, 10))
 86 |     one_hot_labels[np.arange(labels.size),labels.astype(int)] = 1
 87 |     return one_hot_labels
 88 | 
 89 | def main():
 90 |     np.random.seed(100)
 91 |     trainData, trainLabels = readData('images_train.csv', 'labels_train.csv')
 92 |     trainLabels = one_hot_labels(trainLabels)
 93 |     p = np.random.permutation(60000)
 94 |     trainData = trainData[p,:]
 95 |     trainLabels = trainLabels[p,:]
 96 | 
 97 |     devData = trainData[0:10000,:]
 98 |     devLabels = trainLabels[0:10000,:]
 99 |     trainData = trainData[10000:,:]
100 |     trainLabels = trainLabels[10000:,:]
101 | 
102 |     mean = np.mean(trainData)
103 |     std = np.std(trainData)
104 |     trainData = (trainData - mean) / std
105 |     devData = (devData - mean) / std
106 | 
107 |     testData, testLabels = readData('images_test.csv', 'labels_test.csv')
108 |     testLabels = one_hot_labels(testLabels)
109 |     testData = (testData - mean) / std
110 | 	
111 |     params = nn_train(trainData, trainLabels, devData, devLabels)
112 | 
113 | 
114 |     readyForTesting = False
115 |     if readyForTesting:
116 |         accuracy = nn_test(testData, testLabels, params)
117 | 	print 'Test accuracy: %f' % accuracy
118 | 
119 | if __name__ == '__main__':
120 |     main()
121 | 


--------------------------------------------------------------------------------
/Machine Learning/Problem4/ps4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Machine Learning/Problem4/ps4.pdf


--------------------------------------------------------------------------------
/Machine Learning/Readme.md:
--------------------------------------------------------------------------------
 1 | <!-- In our community we follow the stanford cs229 course notes by andrew NG as base for machine learning.
 2 | You can find latest offering of the course below.
 3 | 
 4 | http://cs229.stanford.edu/
 5 | 
 6 | Some curated assignments and solutions shared above. -->
 7 | 
 8 | # Machine Learning
 9 | 
10 | ## Course List
11 | **S.No** | **Course Title** | **Link to course** | **Link to Assignment Solutions**
12 | ------------ | ------------- | --------- | -----------
13 | [1](#1-machine-learning) | Machine Learning | https://cs229.stanford.edu/ | [CS229 Assignment Solutions](https://github.com/huyfam/cs229-solutions-2020)
14 | <!-- [2](#2-deep-learning-for-computer-vision) | Deep Learning for Computer Vision | http://cs231n.stanford.edu/ -->
15 | 
16 | 
17 | ## Course Details
18 | ### 1. Machine Learning
19 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://cs229.stanford.edu/
20 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
21 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp;  Calculus, Linear Algebra, Basic Python programming,Probability Theory  
22 |                                      <!-- &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Introductory molecular biology -->
23 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Beginner
24 |    * **Course description**    
25 |         This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs, practical advice); reinforcement learning and adaptive control.
26 |  
27 | 
28 | <!-- ### 2. Deep Learning for Computer Vision
29 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://cs231n.stanford.edu/
30 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
31 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Proficiency in Python, Calculus, Linear Algebra,Probability, Statistics   
32 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
33 |    * **Course description**    
34 |         Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final assignment will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice. Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.  -->
35 |         
36 |        
37 | ####  Happy Learning  &nbsp; :thumbsup: :memo: 
38 | 
39 | 
40 | 
41 | 
42 | 
43 | 


--------------------------------------------------------------------------------
/NLP/README.md:
--------------------------------------------------------------------------------
 1 | <!-- For NLP we we will be following the following course from stanford. It mostly deals with NLP with deep learning. Because of deep learning being widely used in NLP. This course helps in solving problems in best possible way.
 2 | 
 3 | https://web.stanford.edu/class/cs224n/index.html
 4 | 
 5 | You can go through the readings and get work on the assignments above to get coplete understanding. -->
 6 | 
 7 | 
 8 | # Natural Language Processing   
 9 | 
10 | ## Course List
11 | **S.No** | **Course Title** | **Link to course** | **Link to Assignment Solutions**
12 | ------------ | ------------- | --------- | ------------
13 | [1](#1-natural-language-processing-with-deep-learning) | Natural Language Processing with Deep Learning | https://web.stanford.edu/class/cs224n/index.html | [CS224n Solutions](https://github.com/Brant-Skywalker/CS224n-Winter-2022)
14 | <!-- [2](#2-deep-learning-for-computer-vision) | Deep Learning for Computer Vision | http://cs231n.stanford.edu/ -->
15 | 
16 | 
17 | ## Course Details
18 | ### 1. Natural Language Processing with Deep Learning
19 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://web.stanford.edu/class/cs224n/index.html
20 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
21 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp;  Calculus, Linear Algebra, Proficiency in Python,Probability , Statistics, Foundations of Machine Learning
22 |                                      <!-- &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Introductory molecular biology -->
23 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
24 |    * **Course description**    
25 |          In this course, students will gain a thorough introduction to cutting-edge research in Deep Learning for NLP. Through lectures, assignments and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models, using the Pytorch framework. 
26 |  
27 | 
28 | <!-- ### 2. Deep Learning for Computer Vision
29 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://cs231n.stanford.edu/
30 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
31 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Proficiency in Python, Calculus, Linear Algebra,Probability, Statistics   
32 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Advanced
33 |    * **Course description**    
34 |         Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into the details of deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. Additionally, the final assignment will give them the opportunity to train and apply multi-million parameter networks on real-world vision problems of their choice. Through multiple hands-on assignments and the final course project, students will acquire the toolset for setting up deep learning tasks and practical engineering tricks for training and fine-tuning deep neural networks.  -->
35 |         
36 |        
37 | ####  Happy Learning  &nbsp; :thumbsup: :memo: 
38 | 
39 | 
40 | 
41 | 
42 | 
43 | 
44 | 


--------------------------------------------------------------------------------
/NLP/assignment1/Makefile:
--------------------------------------------------------------------------------
 1 | DATASETS_DIR=utils/datasets
 2 | 
 3 | init:
 4 | 	sh get_datasets.sh
 5 | 
 6 | submit:
 7 | 	sh collect_submission.sh
 8 | 
 9 | clean:
10 | 	rm -f assignment1.zip
11 | 	rm -rf ${DATASETS_DIR}
12 | 	rm -f *.pyc *.png *.npy utils/*.pyc
13 | 
14 | 


--------------------------------------------------------------------------------
/NLP/assignment1/assignment1-solution.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/assignment1-solution.pdf


--------------------------------------------------------------------------------
/NLP/assignment1/assignment1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/assignment1.pdf


--------------------------------------------------------------------------------
/NLP/assignment1/collect_submission.sh:
--------------------------------------------------------------------------------
1 | rm -f assignment1.zip
2 | zip -r assignment1.zip *.py *.png saved_params_40000.npy
3 | 


--------------------------------------------------------------------------------
/NLP/assignment1/get_datasets.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash
 2 | 
 3 | DATASETS_DIR="utils/datasets"
 4 | mkdir -p $DATASETS_DIR
 5 | 
 6 | cd $DATASETS_DIR
 7 | 
 8 | # Get Stanford Sentiment Treebank
 9 | if hash wget 2>/dev/null; then
10 |   wget http://nlp.stanford.edu/~socherr/stanfordSentimentTreebank.zip
11 | else
12 |   curl -L http://nlp.stanford.edu/~socherr/stanfordSentimentTreebank.zip -o stanfordSentimentTreebank.zip
13 | fi
14 | unzip stanfordSentimentTreebank.zip
15 | rm stanfordSentimentTreebank.zip
16 | 
17 | # Get 50D GloVe vectors
18 | if hash wget 2>/dev/null; then
19 |   wget http://nlp.stanford.edu/data/glove.6B.zip
20 | else
21 |   curl -L http://nlp.stanford.edu/data/glove.6B.zip -o glove.6B.zip
22 | fi
23 | unzip glove.6B.zip
24 | rm glove.6B.100d.txt glove.6B.200d.txt glove.6B.300d.txt glove.6B.zip
25 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q1_softmax.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | 
 3 | 
 4 | def softmax(x):
 5 |     """Compute the softmax function for each row of the input x.
 6 | 
 7 |     It is crucial that this function is optimized for speed because
 8 |     it will be used frequently in later code. You might find numpy
 9 |     functions np.exp, np.sum, np.reshape, np.max, and numpy
10 |     broadcasting useful for this task.
11 | 
12 |     Numpy broadcasting documentation:
13 |     http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
14 | 
15 |     You should also make sure that your code works for a single
16 |     D-dimensional vector (treat the vector as a single row) and
17 |     for N x D matrices. This may be useful for testing later. Also,
18 |     make sure that the dimensions of the output match the input.
19 | 
20 |     You must implement the optimization in problem 1(a) of the
21 |     written assignment!
22 | 
23 |     Arguments:
24 |     x -- A D dimensional vector or N x D dimensional numpy matrix.
25 | 
26 |     Return:
27 |     x -- You are allowed to modify x in-place
28 |     """
29 |     orig_shape = x.shape
30 |     x-=np.max(x,axis=-1,keepdims=True)
31 |     x_exp=np.exp(x)
32 |     x=x_exp/np.sum(x_exp, axis=-1, keepdims=True)
33 | 
34 |     assert x.shape == orig_shape
35 |     return x
36 | 
37 | 
38 | def test_softmax_basic():
39 |     """
40 |     Some simple tests to get you started.
41 |     Warning: these are not exhaustive.
42 |     """
43 |     print("Running basic tests...")
44 |     test1 = softmax(np.array([1,2]))
45 |     print(test1)
46 |     ans1 = np.array([0.26894142,  0.73105858])
47 |     assert np.allclose(test1, ans1, rtol=1e-05, atol=1e-06)
48 | 
49 |     test2 = softmax(np.array([[1001,1002],[3,4]]))
50 |     print(test2)
51 |     ans2 = np.array([
52 |         [0.26894142, 0.73105858],
53 |         [0.26894142, 0.73105858]])
54 |     assert np.allclose(test2, ans2, rtol=1e-05, atol=1e-06)
55 | 
56 |     test3 = softmax(np.array([[-1001,-1002]]))
57 |     print(test3)
58 |     ans3 = np.array([0.73105858, 0.26894142])
59 |     assert np.allclose(test3, ans3, rtol=1e-05, atol=1e-06)
60 | 
61 |     print("You should be able to verify these results by hand!\n")
62 | 
63 | 
64 | def test_softmax():
65 |     """
66 |     Use this space to test your softmax implementation by running:
67 |         python q1_softmax.py
68 |     This function will not be called by the autograder, nor will
69 |     your tests be graded.
70 |     """
71 |     print("Running your tests...")
72 |     ### YOUR CODE HERE
73 |     raise NotImplementedError
74 |     ### END YOUR CODE
75 | 
76 | 
77 | if __name__ == "__main__":
78 |     test_softmax_basic()
79 |     # test_softmax()
80 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q2_gradcheck.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | 
 3 | import numpy as np
 4 | import random
 5 | 
 6 | 
 7 | # First implement a gradient checker by filling in the following functions
 8 | def gradcheck_naive(f, x):
 9 |     """ Gradient check for a function f.
10 | 
11 |     Arguments:
12 |     f -- a function that takes a single argument and outputs the
13 |          cost and its gradients
14 |     x -- the point (numpy array) to check the gradient at
15 |     """
16 | 
17 |     rndstate = random.getstate()
18 |     random.setstate(rndstate)
19 |     fx, grad = f(x) # Evaluate function value at original point
20 |     h = 1e-4        # Do not change this!
21 | 
22 |     # Iterate over all indexes ix in x to check the gradient.
23 |     it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
24 |     while not it.finished:
25 |         ix = it.multi_index
26 | 
27 |         # Try modifying x[ix] with h defined above to compute numerical
28 |         # gradients (numgrad).
29 | 
30 |         # Use the centered difference of the gradient.
31 |         # It has smaller asymptotic error than forward / backward difference
32 |         # methods. If you are curious, check out here:
33 |         # https://math.stackexchange.com/questions/2326181/when-to-use-forward-or-central-difference-approximations
34 | 
35 |         # Make sure you call random.setstate(rndstate)
36 |         # before calling f(x) each time. This will make it possible
37 |         # to test cost functions with built in randomness later.
38 | 
39 |         ### YOUR CODE HERE:
40 |         x[ix]+=h
41 |         random.setstate(rndstate)
42 |         fhn=f(x)[0]
43 |         x[ix]-=2*h
44 |         random.setstate(rndstate)
45 |         fhp=f(x)[0]
46 |         x[ix]+=h
47 |         numgrad = (fhn-fhp)/(2*h)
48 |         ### END YOUR CODE
49 | 
50 |         # Compare gradients
51 |         reldiff = abs(numgrad - grad[ix]) / max(1, abs(numgrad), abs(grad[ix]))
52 |         if reldiff > 1e-5:
53 |             print("Gradient check failed.")
54 |             print("First gradient error found at index %s" % str(ix))
55 |             print("Your gradient: %f \t Numerical gradient: %f" % (
56 |                 grad[ix], numgrad))
57 |             return
58 | 
59 |         it.iternext() # Step to next dimension
60 | 
61 |     print("Gradient check passed!")
62 | 
63 | 
64 | def sanity_check():
65 |     """
66 |     Some basic sanity checks.
67 |     """
68 |     quad = lambda x: (np.sum(x ** 2), x * 2)
69 | 
70 |     print("Running sanity checks...")
71 |     gradcheck_naive(quad, np.array(123.456))      # scalar test
72 |     gradcheck_naive(quad, np.random.randn(3,))    # 1-D test
73 |     gradcheck_naive(quad, np.random.randn(4,5))   # 2-D test
74 |     print("")
75 | 
76 | 
77 | def your_sanity_checks():
78 |     """
79 |     Use this space add any additional sanity checks by running:
80 |         python q2_gradcheck.py
81 |     This function will not be called by the autograder, nor will
82 |     your additional tests be graded.
83 |     """
84 |     print("Running your sanity checks...")
85 |     ### YOUR CODE HERE
86 |     raise NotImplementedError
87 |     ### END YOUR CODE
88 | 
89 | 
90 | if __name__ == "__main__":
91 |     sanity_check()
92 |     # your_sanity_checks()
93 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q2_neural.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | 
  3 | import numpy as np
  4 | import random
  5 | 
  6 | from q1_softmax import softmax
  7 | from q2_sigmoid import sigmoid, sigmoid_grad
  8 | from q2_gradcheck import gradcheck_naive
  9 | 
 10 | 
 11 | def forward_backward_prop(X, labels, params, dimensions):
 12 |     """
 13 |     Forward and backward propagation for a two-layer sigmoidal network
 14 | 
 15 |     Compute the forward propagation and for the cross entropy cost,
 16 |     the backward propagation for the gradients for all parameters.
 17 | 
 18 |     Notice the gradients computed here are different from the gradients in
 19 |     the assignment sheet: they are w.r.t. weights, not inputs.
 20 | 
 21 |     Arguments:
 22 |     X -- M x Dx matrix, where each row is a training example x.
 23 |     labels -- M x Dy matrix, where each row is a one-hot vector.
 24 |     params -- Model parameters, these are unpacked for you.
 25 |     dimensions -- A tuple of input dimension, number of hidden units
 26 |                   and output dimension
 27 |     """
 28 | 
 29 |     ### Unpack network parameters (do not modify)
 30 |     ofs = 0
 31 |     Dx, H, Dy = (dimensions[0], dimensions[1], dimensions[2])
 32 | 
 33 |     W1 = np.reshape(params[ofs:ofs+ Dx * H], (Dx, H))
 34 |     ofs += Dx * H
 35 |     b1 = np.reshape(params[ofs:ofs + H], (1, H))
 36 |     ofs += H
 37 |     W2 = np.reshape(params[ofs:ofs + H * Dy], (H, Dy))
 38 |     ofs += H * Dy
 39 |     b2 = np.reshape(params[ofs:ofs + Dy], (1, Dy))
 40 | 
 41 |     # Note: compute cost based on `sum` not `mean`.
 42 |     ### YOUR CODE HERE: forward propagation
 43 |     fc1=X.dot(W1)+b1 # [M,H]
 44 |     sig1=sigmoid(fc1) # [M,H]
 45 |     scores=sig1.dot(W2)+b2 # [M,Dy]
 46 |     shifted_scores = scores - np.max(scores,axis=-1,keepdims=True) # [M,Dy]
 47 |     z = np.exp(shifted_scores).sum(axis=-1, keepdims=True) # [M,1]
 48 |     log_porbs = shifted_scores - np.log(z)
 49 |     cost = -1*(log_porbs*labels).sum()
 50 |     ### END YOUR CODE
 51 | 
 52 |     ### YOUR CODE HERE: backward propagation
 53 |     dout=np.exp(log_porbs)
 54 |     dout[labels==1]-=1
 55 |     gradW2=sig1.T.dot(dout)
 56 |     gradb2=dout.sum(axis=0)
 57 |     dsig1=dout.dot(W2.T)
 58 |     dfc1=sigmoid_grad(sig1)*dsig1
 59 |     gradW1=X.T.dot(dfc1)
 60 |     gradb1=dfc1.sum(axis=0)
 61 |     ### END YOUR CODE
 62 | 
 63 |     ### Stack gradients (do not modify)
 64 |     grad = np.concatenate((gradW1.flatten(), gradb1.flatten(),
 65 |         gradW2.flatten(), gradb2.flatten()))
 66 | 
 67 |     return cost, grad
 68 | 
 69 | 
 70 | def sanity_check():
 71 |     """
 72 |     Set up fake data and parameters for the neural network, and test using
 73 |     gradcheck.
 74 |     """
 75 |     print("Running sanity check...")
 76 | 
 77 |     N = 20
 78 |     dimensions = [10, 5, 10]
 79 |     data = np.random.randn(N, dimensions[0])   # each row will be a datum
 80 |     labels = np.zeros((N, dimensions[2]))
 81 |     for i in range(N):
 82 |         labels[i, random.randint(0,dimensions[2]-1)] = 1
 83 | 
 84 |     params = np.random.randn((dimensions[0] + 1) * dimensions[1] + (
 85 |         dimensions[1] + 1) * dimensions[2], )
 86 | 
 87 |     gradcheck_naive(lambda params:
 88 |         forward_backward_prop(data, labels, params, dimensions), params)
 89 | 
 90 | 
 91 | def your_sanity_checks():
 92 |     """
 93 |     Use this space add any additional sanity checks by running:
 94 |         python q2_neural.py
 95 |     This function will not be called by the autograder, nor will
 96 |     your additional tests be graded.
 97 |     """
 98 |     print("Running your sanity checks...")
 99 |     ### YOUR CODE HERE
100 |     raise NotImplementedError
101 |     ### END YOUR CODE
102 | 
103 | 
104 | if __name__ == "__main__":
105 |     sanity_check()
106 |     # your_sanity_checks()
107 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q2_sigmoid.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | 
 3 | import numpy as np
 4 | 
 5 | 
 6 | def sigmoid(x):
 7 |     """
 8 |     Compute the sigmoid function for the input here.
 9 | 
10 |     Arguments:
11 |     x -- A scalar or numpy array.
12 | 
13 |     Return:
14 |     s -- sigmoid(x)
15 |     """
16 | 
17 |     ### YOUR CODE HERE
18 |     if isinstance(x,np.ndarray):
19 |         mask_pos = (x>0)
20 |         mask_neg = (x<=0)
21 |         pos = np.zeros_like(x, dtype=float)
22 |         neg = np.zeros_like(x, dtype=float)
23 |         pos[mask_pos] = np.exp(-x[mask_pos])
24 |         neg[mask_neg] = np.exp(x[mask_neg])
25 |         numerator=np.ones_like(pos)
26 |         numerator[mask_neg]=neg[mask_neg]
27 |         denumerator=pos+neg
28 |         s=numerator/(1+denumerator)
29 |     else:
30 |         if x < 0:
31 |             s = 1./(1+np.exp(x))
32 |         else:
33 |             s = np.exp(-x)/(1+np.exp(-x))
34 |     ### END YOUR CODE
35 | 
36 |     return s
37 | 
38 | 
39 | def sigmoid_grad(s):
40 |     """
41 |     Compute the gradient for the sigmoid function here. Note that
42 |     for this implementation, the input s should be the sigmoid
43 |     function value of your original input x.
44 | 
45 |     Arguments:
46 |     s -- A scalar or numpy array.
47 | 
48 |     Return:
49 |     ds -- Your computed gradient.
50 |     """
51 | 
52 |     ### YOUR CODE HERE
53 |     ds = s*(1-s)
54 |     ### END YOUR CODE
55 | 
56 |     return ds
57 | 
58 | 
59 | def test_sigmoid_basic():
60 |     """
61 |     Some simple tests to get you started.
62 |     Warning: these are not exhaustive.
63 |     """
64 |     print("Running basic tests...")
65 |     x = np.array([[1, 2], [-1, -2]])
66 |     f = sigmoid(x)
67 |     g = sigmoid_grad(f)
68 |     print(f)
69 |     f_ans = np.array([
70 |         [0.73105858, 0.88079708],
71 |         [0.26894142, 0.11920292]])
72 |     assert np.allclose(f, f_ans, rtol=1e-05, atol=1e-06)
73 |     print(g)
74 |     g_ans = np.array([
75 |         [0.19661193, 0.10499359],
76 |         [0.19661193, 0.10499359]])
77 |     assert np.allclose(g, g_ans, rtol=1e-05, atol=1e-06)
78 |     print("You should verify these results by hand!\n")
79 | 
80 | 
81 | def test_sigmoid():
82 |     """
83 |     Use this space to test your sigmoid implementation by running:
84 |         python q2_sigmoid.py
85 |     This function will not be called by the autograder, nor will
86 |     your tests be graded.
87 |     """
88 |     print("Running your tests...")
89 |     ### YOUR CODE HERE
90 |     raise NotImplementedError
91 |     ### END YOUR CODE
92 | 
93 | 
94 | if __name__ == "__main__":
95 |     test_sigmoid_basic()
96 |     # test_sigmoid()
97 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q3_run.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | 
 3 | import random
 4 | import numpy as np
 5 | from utils.treebank import StanfordSentiment
 6 | import matplotlib
 7 | matplotlib.use('agg')
 8 | import matplotlib.pyplot as plt
 9 | import time
10 | 
11 | from q3_word2vec import *
12 | from q3_sgd import *
13 | 
14 | # Reset the random seed to make sure that everyone gets the same results
15 | random.seed(314)
16 | dataset = StanfordSentiment()
17 | tokens = dataset.tokens()
18 | nWords = len(tokens)
19 | 
20 | # We are going to train 10-dimensional vectors for this assignment
21 | dimVectors = 10
22 | 
23 | # Context size
24 | C = 5
25 | 
26 | # Reset the random seed to make sure that everyone gets the same results
27 | random.seed(31415)
28 | np.random.seed(9265)
29 | 
30 | startTime=time.time()
31 | wordVectors = np.concatenate(
32 |     ((np.random.rand(nWords, dimVectors) - 0.5) /
33 |        dimVectors, np.zeros((nWords, dimVectors))),
34 |     axis=0)
35 | wordVectors = sgd(
36 |     lambda vec: word2vec_sgd_wrapper(skipgram, tokens, vec, dataset, C,
37 |         negSamplingCostAndGradient),
38 |     wordVectors, 0.3, 40000, None, True, PRINT_EVERY=10)
39 | # Note that normalization is not called here. This is not a bug,
40 | # normalizing during training loses the notion of length.
41 | 
42 | print("sanity check: cost at convergence should be around or below 10")
43 | print("training took %d seconds" % (time.time() - startTime))
44 | 
45 | # concatenate the input and output word vectors
46 | wordVectors = np.concatenate(
47 |     (wordVectors[:nWords,:], wordVectors[nWords:,:]),
48 |     axis=0)
49 | # wordVectors = wordVectors[:nWords,:] + wordVectors[nWords:,:]
50 | 
51 | visualizeWords = [
52 |     "the", "a", "an", ",", ".", "?", "!", "``", "''", "--",
53 |     "good", "great", "cool", "brilliant", "wonderful", "well", "amazing",
54 |     "worth", "sweet", "enjoyable", "boring", "bad", "waste", "dumb",
55 |     "annoying"]
56 | 
57 | visualizeIdx = [tokens[word] for word in visualizeWords]
58 | visualizeVecs = wordVectors[visualizeIdx, :]
59 | temp = (visualizeVecs - np.mean(visualizeVecs, axis=0))
60 | covariance = 1.0 / len(visualizeIdx) * temp.T.dot(temp)
61 | U,S,V = np.linalg.svd(covariance)
62 | coord = temp.dot(U[:,0:2])
63 | 
64 | for i in range(len(visualizeWords)):
65 |     plt.text(coord[i,0], coord[i,1], visualizeWords[i],
66 |         bbox=dict(facecolor='green', alpha=0.1))
67 | 
68 | plt.xlim((np.min(coord[:,0]), np.max(coord[:,0])))
69 | plt.ylim((np.min(coord[:,1]), np.max(coord[:,1])))
70 | 
71 | plt.savefig('q3_word_vectors.png')
72 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q3_sgd.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | 
  3 | # Save parameters every a few SGD iterations as fail-safe
  4 | SAVE_PARAMS_EVERY = 5000
  5 | 
  6 | import glob
  7 | import random
  8 | import numpy as np
  9 | import os.path as op
 10 | import  pickle
 11 | 
 12 | 
 13 | def load_saved_params():
 14 |     """
 15 |     A helper function that loads previously saved parameters and resets
 16 |     iteration start.
 17 |     """
 18 |     st = 0
 19 |     for f in glob.glob("saved_params_*.npy"):
 20 |         iter = int(op.splitext(op.basename(f))[0].split("_")[2])
 21 |         if (iter > st):
 22 |             st = iter
 23 | 
 24 |     if st > 0:
 25 |         with open("saved_params_%d.npy" % st, "rb") as f:
 26 |             params = pickle.load(f)
 27 |             state = pickle.load(f)
 28 |         return st, params, state
 29 |     else:
 30 |         return st, None, None
 31 | 
 32 | 
 33 | def save_params(iter, params):
 34 |     with open("saved_params_%d.npy" % iter, "wb") as f:
 35 |         pickle.dump(params, f)
 36 |         pickle.dump(random.getstate(), f)
 37 | 
 38 | 
 39 | def sgd(f, x0, step, iterations, postprocessing=None, useSaved=False,
 40 |         PRINT_EVERY=10):
 41 |     """ Stochastic Gradient Descent
 42 | 
 43 |     Implement the stochastic gradient descent method in this function.
 44 | 
 45 |     Arguments:
 46 |     f -- the function to optimize, it should take a single
 47 |          argument and yield two outputs, a cost and the gradient
 48 |          with respect to the arguments
 49 |     x0 -- the initial point to start SGD from
 50 |     step -- the step size for SGD
 51 |     iterations -- total iterations to run SGD for
 52 |     postprocessing -- postprocessing function for the parameters
 53 |                       if necessary. In the case of word2vec we will need to
 54 |                       normalize the word vectors to have unit length.
 55 |     PRINT_EVERY -- specifies how many iterations to output loss
 56 | 
 57 |     Return:
 58 |     x -- the parameter value after SGD finishes
 59 |     """
 60 | 
 61 |     # Anneal learning rate every several iterations
 62 |     ANNEAL_EVERY = 20000
 63 | 
 64 |     if useSaved:
 65 |         start_iter, oldx, state = load_saved_params()
 66 |         if start_iter > 0:
 67 |             x0 = oldx
 68 |             step *= 0.5 ** (start_iter / ANNEAL_EVERY)
 69 | 
 70 |         if state:
 71 |             random.setstate(state)
 72 |     else:
 73 |         start_iter = 0
 74 | 
 75 |     x = x0
 76 | 
 77 |     if not postprocessing:
 78 |         postprocessing = lambda x: x
 79 | 
 80 |     expcost = None
 81 | 
 82 |     for iter in range(start_iter + 1, iterations + 1):
 83 |         # Don't forget to apply the postprocessing after every iteration!
 84 |         # You might want to print the progress every few iterations.
 85 | 
 86 |         cost = None
 87 |         ### YOUR CODE HERE
 88 |         cost,grad = f(x)
 89 |         x-=step*grad
 90 |         x=postprocessing(x)
 91 |         ### END YOUR CODE
 92 | 
 93 |         if iter % PRINT_EVERY == 0:
 94 |             if not expcost:
 95 |                 expcost = cost
 96 |             else:
 97 |                 expcost = .95 * expcost + .05 * cost
 98 |             print("iter %d: %f" % (iter, expcost))
 99 | 
100 |         if iter % SAVE_PARAMS_EVERY == 0 and useSaved:
101 |             save_params(iter, x)
102 | 
103 |         if iter % ANNEAL_EVERY == 0:
104 |             step *= 0.5
105 | 
106 |     return x
107 | 
108 | 
109 | def sanity_check():
110 |     quad = lambda x: (np.sum(x ** 2), x * 2)
111 | 
112 |     print("Running sanity checks...")
113 |     t1 = sgd(quad, 0.5, 0.01, 1000, PRINT_EVERY=100)
114 |     print("test 1 result:", t1)
115 |     assert abs(t1) <= 1e-6
116 | 
117 |     t2 = sgd(quad, 0.0, 0.01, 1000, PRINT_EVERY=100)
118 |     print("test 2 result:", t2)
119 |     assert abs(t2) <= 1e-6
120 | 
121 |     t3 = sgd(quad, -1.5, 0.01, 1000, PRINT_EVERY=100)
122 |     print("test 3 result:", t3)
123 |     assert abs(t3) <= 1e-6
124 | 
125 |     print("")
126 | 
127 | 
128 | def your_sanity_checks():
129 |     """
130 |     Use this space add any additional sanity checks by running:
131 |         python q3_sgd.py
132 |     This function will not be called by the autograder, nor will
133 |     your additional tests be graded.
134 |     """
135 |     print("Running your sanity checks...")
136 |     ### YOUR CODE HERE
137 |     # raise NotImplementedError
138 |     ### END YOUR CODE
139 | 
140 | 
141 | if __name__ == "__main__":
142 |     sanity_check()
143 |     # your_sanity_checks()
144 | 


--------------------------------------------------------------------------------
/NLP/assignment1/q3_word_vectors.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/q3_word_vectors.png


--------------------------------------------------------------------------------
/NLP/assignment1/q4_dev_conf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/q4_dev_conf.png


--------------------------------------------------------------------------------
/NLP/assignment1/q4_reg_v_acc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/q4_reg_v_acc.png


--------------------------------------------------------------------------------
/NLP/assignment1/q4_sentiment.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | 
  3 | import argparse
  4 | import numpy as np
  5 | import matplotlib
  6 | matplotlib.use('agg')
  7 | import matplotlib.pyplot as plt
  8 | import itertools
  9 | 
 10 | from utils.treebank import StanfordSentiment
 11 | import utils.glove as glove
 12 | 
 13 | from q3_sgd import load_saved_params, sgd
 14 | 
 15 | # We will use sklearn here because it will run faster than implementing
 16 | # ourselves. However, for other parts of this assignment you must implement
 17 | # the functions yourself!
 18 | from sklearn.linear_model import LogisticRegression
 19 | from sklearn.metrics import confusion_matrix
 20 | 
 21 | 
 22 | def getArguments():
 23 |     parser = argparse.ArgumentParser()
 24 |     group = parser.add_mutually_exclusive_group(required=True)
 25 |     group.add_argument("--pretrained", dest="pretrained", action="store_true",
 26 |                        help="Use pretrained GloVe vectors.")
 27 |     group.add_argument("--yourvectors", dest="yourvectors", action="store_true",
 28 |                        help="Use your vectors from q3.")
 29 |     return parser.parse_args()
 30 | 
 31 | 
 32 | def getSentenceFeatures(tokens, wordVectors, sentence):
 33 |     """
 34 |     Obtain the sentence feature for sentiment analysis by averaging its
 35 |     word vectors
 36 |     """
 37 | 
 38 |     # Implement computation for the sentence features given a sentence.
 39 | 
 40 |     # Inputs:
 41 |     # tokens -- a dictionary that maps words to their indices in
 42 |     #           the word vector list
 43 |     # wordVectors -- word vectors (each row) for all tokens
 44 |     # sentence -- a list of words in the sentence of interest
 45 | 
 46 |     # Output:
 47 |     # - sentVector: feature vector for the sentence
 48 | 
 49 |     sentVector = np.zeros((wordVectors.shape[1],))
 50 | 
 51 |     ### YOUR CODE HERE
 52 |     sentVector = np.sum(np.vstack([wordVectors[tokens[w]] for w in sentence]),axis=0)/len(sentence)
 53 |     ### END YOUR CODE
 54 | 
 55 |     assert sentVector.shape == (wordVectors.shape[1],)
 56 |     return sentVector
 57 | 
 58 | 
 59 | def getRegularizationValues():
 60 |     """Try different regularizations
 61 | 
 62 |     Return a sorted list of values to try.
 63 |     """
 64 |     values = None   # Assign a list of floats in the block below
 65 |     ### YOUR CODE HERE
 66 |     values = np.logspace(-6, 0, num=10)
 67 |     ### END YOUR CODE
 68 |     return sorted(values)
 69 | 
 70 | 
 71 | def chooseBestModel(results):
 72 |     """Choose the best model based on dev set performance.
 73 | 
 74 |     Arguments:
 75 |     results -- A list of python dictionaries of the following format:
 76 |         {
 77 |             "reg": regularization,
 78 |             "clf": classifier,
 79 |             "train": trainAccuracy,
 80 |             "dev": devAccuracy,
 81 |             "test": testAccuracy
 82 |         }
 83 | 
 84 |     Each dictionary represents the performance of one model.
 85 | 
 86 |     Returns:
 87 |     Your chosen result dictionary.
 88 |     """
 89 |     bestResult = None
 90 | 
 91 |     ### YOUR CODE HERE
 92 |     bestResult = max(results, key=lambda x: x['dev'])
 93 |     ### END YOUR CODE
 94 | 
 95 |     return bestResult
 96 | 
 97 | 
 98 | def accuracy(y, yhat):
 99 |     """ Precision for classifier """
100 |     assert(y.shape == yhat.shape)
101 |     return np.sum(y == yhat) * 100.0 / y.size
102 | 
103 | 
104 | def plotRegVsAccuracy(regValues, results, filename):
105 |     """ Make a plot of regularization vs accuracy """
106 |     plt.plot(regValues, [x["train"] for x in results])
107 |     plt.plot(regValues, [x["dev"] for x in results])
108 |     plt.xscale('log')
109 |     plt.xlabel("regularization")
110 |     plt.ylabel("accuracy")
111 |     plt.legend(['train', 'dev'], loc='upper left')
112 |     plt.savefig(filename)
113 | 
114 | 
115 | def outputConfusionMatrix(features, labels, clf, filename):
116 |     """ Generate a confusion matrix """
117 |     pred = clf.predict(features)
118 |     cm = confusion_matrix(labels, pred, labels=range(5))
119 |     plt.figure()
120 |     plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Reds)
121 |     plt.colorbar()
122 |     classes = ["- -", "-", "neut", "+", "+ +"]
123 |     tick_marks = np.arange(len(classes))
124 |     plt.xticks(tick_marks, classes)
125 |     plt.yticks(tick_marks, classes)
126 |     thresh = cm.max() / 2.
127 |     for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
128 |         plt.text(j, i, cm[i, j],
129 |                  horizontalalignment="center",
130 |                  color="white" if cm[i, j] > thresh else "black")
131 |     plt.tight_layout()
132 |     plt.ylabel('True label')
133 |     plt.xlabel('Predicted label')
134 |     plt.savefig(filename)
135 | 
136 | 
137 | def outputPredictions(dataset, features, labels, clf, filename):
138 |     """ Write the predictions to file """
139 |     pred = clf.predict(features)
140 |     with open(filename, "w") as f:
141 |         print(f, "True\tPredicted\tText")
142 |         for i in range(len(dataset)):
143 |             print(f, "%d\t%d\t%s" % (
144 |                 labels[i], pred[i], " ".join(dataset[i][0])))
145 | 
146 | 
147 | def main(args):
148 |     """ Train a model to do sentiment analyis"""
149 | 
150 |     # Load the dataset
151 |     dataset = StanfordSentiment()
152 |     tokens = dataset.tokens()
153 |     nWords = len(tokens)
154 | 
155 |     if args.yourvectors:
156 |         _, wordVectors, _ = load_saved_params()
157 |         wordVectors = np.concatenate(
158 |             (wordVectors[:nWords,:], wordVectors[nWords:,:]),
159 |             axis=1)
160 |     elif args.pretrained:
161 |         wordVectors = glove.loadWordVectors(tokens)
162 |     dimVectors = wordVectors.shape[1]
163 | 
164 |     # Load the train set
165 |     trainset = dataset.getTrainSentences()
166 |     nTrain = len(trainset)
167 |     trainFeatures = np.zeros((nTrain, dimVectors))
168 |     trainLabels = np.zeros((nTrain,), dtype=np.int32)
169 |     for i in range(nTrain):
170 |         words, trainLabels[i] = trainset[i]
171 |         trainFeatures[i, :] = getSentenceFeatures(tokens, wordVectors, words)
172 | 
173 |     # Prepare dev set features
174 |     devset = dataset.getDevSentences()
175 |     nDev = len(devset)
176 |     devFeatures = np.zeros((nDev, dimVectors))
177 |     devLabels = np.zeros((nDev,), dtype=np.int32)
178 |     for i in range(nDev):
179 |         words, devLabels[i] = devset[i]
180 |         devFeatures[i, :] = getSentenceFeatures(tokens, wordVectors, words)
181 | 
182 |     # Prepare test set features
183 |     testset = dataset.getTestSentences()
184 |     nTest = len(testset)
185 |     testFeatures = np.zeros((nTest, dimVectors))
186 |     testLabels = np.zeros((nTest,), dtype=np.int32)
187 |     for i in range(nTest):
188 |         words, testLabels[i] = testset[i]
189 |         testFeatures[i, :] = getSentenceFeatures(tokens, wordVectors, words)
190 | 
191 |     # We will save our results from each run
192 |     results = []
193 |     regValues = getRegularizationValues()
194 |     for reg in regValues:
195 |         print("Training for reg=%f" % reg)
196 |         # Note: add a very small number to regularization to please the library
197 |         clf = LogisticRegression(C=1.0/(reg + 1e-12))
198 |         clf.fit(trainFeatures, trainLabels)
199 | 
200 |         # Test on train set
201 |         pred = clf.predict(trainFeatures)
202 |         trainAccuracy = accuracy(trainLabels, pred)
203 |         print("Train accuracy (%%): %f" % trainAccuracy)
204 | 
205 |         # Test on dev set
206 |         pred = clf.predict(devFeatures)
207 |         devAccuracy = accuracy(devLabels, pred)
208 |         print("Dev accuracy (%%): %f" % devAccuracy)
209 | 
210 |         # Test on test set
211 |         # Note: always running on test is poor style. Typically, you should
212 |         # do this only after validation.
213 |         pred = clf.predict(testFeatures)
214 |         testAccuracy = accuracy(testLabels, pred)
215 |         print("Test accuracy (%%): %f" % testAccuracy)
216 | 
217 |         results.append({
218 |             "reg": reg,
219 |             "clf": clf,
220 |             "train": trainAccuracy,
221 |             "dev": devAccuracy,
222 |             "test": testAccuracy})
223 | 
224 |     # Print the accuracies
225 |     print("")
226 |     print("=== Recap ===")
227 |     print("Reg\t\tTrain\tDev\tTest")
228 |     for result in results:
229 |         print("%.2E\t%.3f\t%.3f\t%.3f" % (
230 |             result["reg"],
231 |             result["train"],
232 |             result["dev"],
233 |             result["test"]))
234 |     print("")
235 | 
236 |     bestResult = chooseBestModel(results)
237 |     print("Best regularization value: %0.2E" % bestResult["reg"])
238 |     print("Test accuracy (%%): %f" % bestResult["test"])
239 | 
240 |     # do some error analysis
241 |     if args.pretrained:
242 |         plotRegVsAccuracy(regValues, results, "q4_reg_v_acc.png")
243 |         outputConfusionMatrix(devFeatures, devLabels, bestResult["clf"],
244 |                               "q4_dev_conf.png")
245 |         outputPredictions(devset, devFeatures, devLabels, bestResult["clf"],
246 |                           "q4_dev_pred.txt")
247 | 
248 | 
249 | if __name__ == "__main__":
250 |     main(getArguments())
251 | 


--------------------------------------------------------------------------------
/NLP/assignment1/requirements.txt:
--------------------------------------------------------------------------------
1 | matplotlib
2 | scipy
3 | numpy
4 | sklearn
5 | 


--------------------------------------------------------------------------------
/NLP/assignment1/utils/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/utils/__pycache__/__init__.cpython-36.pyc


--------------------------------------------------------------------------------
/NLP/assignment1/utils/__pycache__/glove.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/utils/__pycache__/glove.cpython-36.pyc


--------------------------------------------------------------------------------
/NLP/assignment1/utils/__pycache__/treebank.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment1/utils/__pycache__/treebank.cpython-36.pyc


--------------------------------------------------------------------------------
/NLP/assignment1/utils/glove.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import numpy as np
 3 | 
 4 | DEFAULT_FILE_PATH = os.path.join(os.path.dirname(__file__),"datasets/glove.6B.50d.txt")
 5 | 
 6 | def loadWordVectors(tokens, filepath=DEFAULT_FILE_PATH, dimensions=50):
 7 |     """Read pretrained GloVe vectors"""
 8 |     wordVectors = np.zeros((len(tokens), dimensions))
 9 |     with open(filepath) as ifs:
10 |         for line in ifs:
11 |             line = line.strip()
12 |             if not line:
13 |                 continue
14 |             row = line.split()
15 |             token = row[0]
16 |             if token not in tokens:
17 |                 continue
18 |             data = [float(x) for x in row[1:]]
19 |             if len(data) != dimensions:
20 |                 raise RuntimeError("wrong number of dimensions")
21 |             wordVectors[tokens[token]] = np.asarray(data)
22 |     return wordVectors
23 | 


--------------------------------------------------------------------------------
/NLP/assignment1/utils/treebank.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # -*- coding: utf-8 -*-
  3 | 
  4 | import pickle
  5 | import numpy as np
  6 | import os
  7 | import random
  8 | 
  9 | class StanfordSentiment:
 10 |     def __init__(self, path=None, tablesize = 1000000):
 11 |         if not path:
 12 |             path = os.path.join(os.path.dirname(__file__),"datasets/stanfordSentimentTreebank")
 13 | 
 14 |         self.path = path
 15 |         self.tablesize = tablesize
 16 | 
 17 |     def tokens(self):
 18 |         if hasattr(self, "_tokens") and self._tokens:
 19 |             return self._tokens
 20 | 
 21 |         tokens = dict()
 22 |         tokenfreq = dict()
 23 |         wordcount = 0
 24 |         revtokens = []
 25 |         idx = 0
 26 | 
 27 |         for sentence in self.sentences():
 28 |             for w in sentence:
 29 |                 wordcount += 1
 30 |                 if not w in tokens:
 31 |                     tokens[w] = idx
 32 |                     revtokens += [w]
 33 |                     tokenfreq[w] = 1
 34 |                     idx += 1
 35 |                 else:
 36 |                     tokenfreq[w] += 1
 37 | 
 38 |         tokens["UNK"] = idx
 39 |         revtokens += ["UNK"]
 40 |         tokenfreq["UNK"] = 1
 41 |         wordcount += 1
 42 | 
 43 |         self._tokens = tokens
 44 |         self._tokenfreq = tokenfreq
 45 |         self._wordcount = wordcount
 46 |         self._revtokens = revtokens
 47 |         return self._tokens
 48 | 
 49 |     def sentences(self):
 50 |         if hasattr(self, "_sentences") and self._sentences:
 51 |             return self._sentences
 52 | 
 53 |         sentences = []
 54 |         with open(self.path + "/datasetSentences.txt", "r") as f:
 55 |             first = True
 56 |             for line in f:
 57 |                 if first:
 58 |                     first = False
 59 |                     continue
 60 | 
 61 |                 splitted = line.strip().split()[1:]
 62 |                 # Deal with some peculiar encoding issues with this file
 63 |                 sentences += [[w.lower() for w in splitted]]
 64 | 
 65 |         self._sentences = sentences
 66 |         self._sentlengths = np.array([len(s) for s in sentences])
 67 |         self._cumsentlen = np.cumsum(self._sentlengths)
 68 | 
 69 |         return self._sentences
 70 | 
 71 |     def numSentences(self):
 72 |         if hasattr(self, "_numSentences") and self._numSentences:
 73 |             return self._numSentences
 74 |         else:
 75 |             self._numSentences = len(self.sentences())
 76 |             return self._numSentences
 77 | 
 78 |     def allSentences(self):
 79 |         if hasattr(self, "_allsentences") and self._allsentences:
 80 |             return self._allsentences
 81 | 
 82 |         sentences = self.sentences()
 83 |         rejectProb = self.rejectProb()
 84 |         tokens = self.tokens()
 85 |         allsentences = [[w for w in s
 86 |             if 0 >= rejectProb[tokens[w]] or random.random() >= rejectProb[tokens[w]]]
 87 |             for s in sentences * 30]
 88 | 
 89 |         allsentences = [s for s in allsentences if len(s) > 1]
 90 | 
 91 |         self._allsentences = allsentences
 92 | 
 93 |         return self._allsentences
 94 | 
 95 |     def getRandomContext(self, C=5):
 96 |         allsent = self.allSentences()
 97 |         sentID = random.randint(0, len(allsent) - 1)
 98 |         sent = allsent[sentID]
 99 |         wordID = random.randint(0, len(sent) - 1)
100 | 
101 |         context = sent[max(0, wordID - C):wordID]
102 |         if wordID+1 < len(sent):
103 |             context += sent[wordID+1:min(len(sent), wordID + C + 1)]
104 | 
105 |         centerword = sent[wordID]
106 |         context = [w for w in context if w != centerword]
107 | 
108 |         if len(context) > 0:
109 |             return centerword, context
110 |         else:
111 |             return self.getRandomContext(C)
112 | 
113 |     def sent_labels(self):
114 |         if hasattr(self, "_sent_labels") and self._sent_labels:
115 |             return self._sent_labels
116 | 
117 |         dictionary = dict()
118 |         phrases = 0
119 |         with open(self.path + "/dictionary.txt", "r") as f:
120 |             for line in f:
121 |                 line = line.strip()
122 |                 if not line: continue
123 |                 splitted = line.split("|")
124 |                 dictionary[splitted[0].lower()] = int(splitted[1])
125 |                 phrases += 1
126 | 
127 |         labels = [0.0] * phrases
128 |         with open(self.path + "/sentiment_labels.txt", "r") as f:
129 |             first = True
130 |             for line in f:
131 |                 if first:
132 |                     first = False
133 |                     continue
134 | 
135 |                 line = line.strip()
136 |                 if not line: continue
137 |                 splitted = line.split("|")
138 |                 labels[int(splitted[0])] = float(splitted[1])
139 | 
140 |         sent_labels = [0.0] * self.numSentences()
141 |         sentences = self.sentences()
142 |         for i in range(self.numSentences()):
143 |             sentence = sentences[i]
144 |             full_sent = " ".join(sentence).replace('-lrb-', '(').replace('-rrb-', ')')
145 |             try:
146 |                 sent_labels[i] = labels[dictionary[full_sent]]
147 |             except:
148 |                 continue
149 | 
150 |         self._sent_labels = sent_labels
151 |         return self._sent_labels
152 | 
153 |     def dataset_split(self):
154 |         if hasattr(self, "_split") and self._split:
155 |             return self._split
156 | 
157 |         split = [[] for i in range(3)]
158 |         with open(self.path + "/datasetSplit.txt", "r") as f:
159 |             first = True
160 |             for line in f:
161 |                 if first:
162 |                     first = False
163 |                     continue
164 | 
165 |                 splitted = line.strip().split(",")
166 |                 split[int(splitted[1]) - 1] += [int(splitted[0]) - 1]
167 | 
168 |         self._split = split
169 |         return self._split
170 | 
171 |     def getRandomTrainSentence(self):
172 |         split = self.dataset_split()
173 |         sentId = split[0][random.randint(0, len(split[0]) - 1)]
174 |         return self.sentences()[sentId], self.categorify(self.sent_labels()[sentId])
175 | 
176 |     def categorify(self, label):
177 |         if label <= 0.2:
178 |             return 0
179 |         elif label <= 0.4:
180 |             return 1
181 |         elif label <= 0.6:
182 |             return 2
183 |         elif label <= 0.8:
184 |             return 3
185 |         else:
186 |             return 4
187 | 
188 |     def getDevSentences(self):
189 |         return self.getSplitSentences(2)
190 | 
191 |     def getTestSentences(self):
192 |         return self.getSplitSentences(1)
193 | 
194 |     def getTrainSentences(self):
195 |         return self.getSplitSentences(0)
196 | 
197 |     def getSplitSentences(self, split=0):
198 |         ds_split = self.dataset_split()
199 |         return [(self.sentences()[i], self.categorify(self.sent_labels()[i])) for i in ds_split[split]]
200 | 
201 |     def sampleTable(self):
202 |         if hasattr(self, '_sampleTable') and self._sampleTable is not None:
203 |             return self._sampleTable
204 | 
205 |         nTokens = len(self.tokens())
206 |         samplingFreq = np.zeros((nTokens,))
207 |         self.allSentences()
208 |         i = 0
209 |         for w in range(nTokens):
210 |             w = self._revtokens[i]
211 |             if w in self._tokenfreq:
212 |                 freq = 1.0 * self._tokenfreq[w]
213 |                 # Reweigh
214 |                 freq = freq ** 0.75
215 |             else:
216 |                 freq = 0.0
217 |             samplingFreq[i] = freq
218 |             i += 1
219 | 
220 |         samplingFreq /= np.sum(samplingFreq)
221 |         samplingFreq = np.cumsum(samplingFreq) * self.tablesize
222 | 
223 |         self._sampleTable = [0] * self.tablesize
224 | 
225 |         j = 0
226 |         for i in range(self.tablesize):
227 |             while i > samplingFreq[j]:
228 |                 j += 1
229 |             self._sampleTable[i] = j
230 | 
231 |         return self._sampleTable
232 | 
233 |     def rejectProb(self):
234 |         if hasattr(self, '_rejectProb') and self._rejectProb is not None:
235 |             return self._rejectProb
236 | 
237 |         threshold = 1e-5 * self._wordcount
238 | 
239 |         nTokens = len(self.tokens())
240 |         rejectProb = np.zeros((nTokens,))
241 |         for i in range(nTokens):
242 |             w = self._revtokens[i]
243 |             freq = 1.0 * self._tokenfreq[w]
244 |             # Reweigh
245 |             rejectProb[i] = max(0, 1 - np.sqrt(threshold / freq))
246 | 
247 |         self._rejectProb = rejectProb
248 |         return self._rejectProb
249 | 
250 |     def sampleTokenIdx(self):
251 |         return self.sampleTable()[random.randint(0, self.tablesize - 1)]


--------------------------------------------------------------------------------
/NLP/assignment2/assignment2-soln.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment2/assignment2-soln.pdf


--------------------------------------------------------------------------------
/NLP/assignment2/assignment2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment2/assignment2.pdf


--------------------------------------------------------------------------------
/NLP/assignment2/model.py:
--------------------------------------------------------------------------------
  1 | class Model(object):
  2 |     """Abstracts a Tensorflow graph for a learning task.
  3 | 
  4 |     We use various Model classes as usual abstractions to encapsulate tensorflow
  5 |     computational graphs. Each algorithm you will construct in this homework will
  6 |     inherit from a Model object.
  7 |     """
  8 |     def add_placeholders(self):
  9 |         """Adds placeholder variables to tensorflow computational graph.
 10 | 
 11 |         Tensorflow uses placeholder variables to represent locations in a
 12 |         computational graph where data is inserted.  These placeholders are used as
 13 |         inputs by the rest of the model building and will be fed data during
 14 |         training.
 15 | 
 16 |         See for more information:
 17 |         https://www.tensorflow.org/versions/r0.7/api_docs/python/io_ops.html#placeholders
 18 |         """
 19 |         raise NotImplementedError("Each Model must re-implement this method.")
 20 | 
 21 |     def create_feed_dict(self, inputs_batch, labels_batch=None):
 22 |         """Creates the feed_dict for one step of training.
 23 | 
 24 |         A feed_dict takes the form of:
 25 |         feed_dict = {
 26 |                 <placeholder>: <tensor of values to be passed for placeholder>,
 27 |                 ....
 28 |         }
 29 | 
 30 |         If labels_batch is None, then no labels are added to feed_dict.
 31 | 
 32 |         Hint: The keys for the feed_dict should be a subset of the placeholder
 33 |                     tensors created in add_placeholders.
 34 |         Args:
 35 |             inputs_batch: A batch of input data.
 36 |             labels_batch: A batch of label data.
 37 |         Returns:
 38 |             feed_dict: The feed dictionary mapping from placeholders to values.
 39 |         """
 40 |         raise NotImplementedError("Each Model must re-implement this method.")
 41 | 
 42 |     def add_prediction_op(self):
 43 |         """Implements the core of the model that transforms a batch of input data into predictions.
 44 | 
 45 |         Returns:
 46 |             pred: A tensor of shape (batch_size, n_classes)
 47 |         """
 48 |         raise NotImplementedError("Each Model must re-implement this method.")
 49 | 
 50 |     def add_loss_op(self, pred):
 51 |         """Adds Ops for the loss function to the computational graph.
 52 | 
 53 |         Args:
 54 |             pred: A tensor of shape (batch_size, n_classes)
 55 |         Returns:
 56 |             loss: A 0-d tensor (scalar) output
 57 |         """
 58 |         raise NotImplementedError("Each Model must re-implement this method.")
 59 | 
 60 |     def add_training_op(self, loss):
 61 |         """Sets up the training Ops.
 62 | 
 63 |         Creates an optimizer and applies the gradients to all trainable variables.
 64 |         The Op returned by this function is what must be passed to the
 65 |         sess.run() to train the model. See
 66 | 
 67 |         https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer
 68 | 
 69 |         for more information.
 70 | 
 71 |         Args:
 72 |             loss: Loss tensor (a scalar).
 73 |         Returns:
 74 |             train_op: The Op for training.
 75 |         """
 76 | 
 77 |         raise NotImplementedError("Each Model must re-implement this method.")
 78 | 
 79 |     def train_on_batch(self, sess, inputs_batch, labels_batch):
 80 |         """Perform one step of gradient descent on the provided batch of data.
 81 | 
 82 |         Args:
 83 |             sess: tf.Session()
 84 |             input_batch: np.ndarray of shape (n_samples, n_features)
 85 |             labels_batch: np.ndarray of shape (n_samples, n_classes)
 86 |         Returns:
 87 |             loss: loss over the batch (a scalar)
 88 |         """
 89 |         feed = self.create_feed_dict(inputs_batch, labels_batch=labels_batch)
 90 |         _, loss = sess.run([self.train_op, self.loss], feed_dict=feed)
 91 |         return loss
 92 | 
 93 |     def predict_on_batch(self, sess, inputs_batch):
 94 |         """Make predictions for the provided batch of data
 95 | 
 96 |         Args:
 97 |             sess: tf.Session()
 98 |             input_batch: np.ndarray of shape (n_samples, n_features)
 99 |         Returns:
100 |             predictions: np.ndarray of shape (n_samples, n_classes)
101 |         """
102 |         feed = self.create_feed_dict(inputs_batch)
103 |         predictions = sess.run(self.pred, feed_dict=feed)
104 |         if predictions.sum() >10000:
105 |             print("predictions.sum()", predictions.sum())
106 |         return predictions
107 | 
108 |     def build(self):
109 |         self.add_placeholders()
110 |         self.pred = self.add_prediction_op()
111 |         self.loss = self.add_loss_op(self.pred)
112 |         self.train_op = self.add_training_op(self.loss)
113 | 


--------------------------------------------------------------------------------
/NLP/assignment2/q1_classifier.py:
--------------------------------------------------------------------------------
  1 | import time
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | 
  6 | from q1_softmax import softmax
  7 | from q1_softmax import cross_entropy_loss
  8 | from model import Model
  9 | from utils.general_utils import get_minibatches
 10 | 
 11 | 
 12 | class Config(object):
 13 |     """Holds model hyperparams and data information.
 14 | 
 15 |     The config class is used to store various hyperparameters and dataset
 16 |     information parameters. Model objects are passed a Config() object at
 17 |     instantiation. They can then call self.config.<hyperparameter_name> to
 18 |     get the hyperparameter settings.
 19 |     """
 20 |     n_samples = 1024
 21 |     n_features = 100
 22 |     n_classes = 5
 23 |     batch_size = 64
 24 |     n_epochs = 50
 25 |     lr = 1e-4
 26 | 
 27 | 
 28 | class SoftmaxModel(Model):
 29 |     """Implements a Softmax classifier with cross-entropy loss."""
 30 | 
 31 |     def add_placeholders(self):
 32 |         """Generates placeholder variables to represent the input tensors.
 33 | 
 34 |         These placeholders are used as inputs by the rest of the model building
 35 |         and will be fed data during training.
 36 | 
 37 |         Adds following nodes to the computational graph
 38 | 
 39 |         input_placeholder: Input placeholder tensor of shape
 40 |                                               (batch_size, n_features), type tf.float32
 41 |         labels_placeholder: Labels placeholder tensor of shape
 42 |                                               (batch_size, n_classes), type tf.int32
 43 | 
 44 |         Add these placeholders to self as the instance variables
 45 |             self.input_placeholder
 46 |             self.labels_placeholder
 47 |         """
 48 |         ### YOUR CODE HERE
 49 |         self.input_placeholder=tf.placeholder(dtype=tf.float32,shape=(None, Config.n_features),name='input_placeholder')
 50 |         self.labels_placeholder=tf.placeholder(dtype=tf.int32,shape=(None, Config.n_classes), name='labels_placeholder')
 51 |         ### END YOUR CODE
 52 | 
 53 |     def create_feed_dict(self, inputs_batch, labels_batch=None):
 54 |         """Creates the feed_dict for training the given step.
 55 | 
 56 |         A feed_dict takes the form of:
 57 |         feed_dict = {
 58 |                 <placeholder>: <tensor of values to be passed for placeholder>,
 59 |                 ....
 60 |         }
 61 | 
 62 |         If label_batch is None, then no labels are added to feed_dict.
 63 | 
 64 |         Hint: The keys for the feed_dict should be the placeholder
 65 |                 tensors created in add_placeholders.
 66 | 
 67 |         Args:
 68 |             inputs_batch: A batch of input data.
 69 |             labels_batch: A batch of label data.
 70 |         Returns:
 71 |             feed_dict: The feed dictionary mapping from placeholders to values.
 72 |         """
 73 |         ### YOUR CODE HERE
 74 |         feed_dict=dict()
 75 |         feed_dict[self.input_placeholder]=inputs_batch
 76 |         if labels_batch is not None:
 77 |             feed_dict[self.labels_placeholder]=labels_batch
 78 |         ### END YOUR CODE
 79 |         return feed_dict
 80 | 
 81 |     def add_prediction_op(self):
 82 |         """Adds the core transformation for this model which transforms a batch of input
 83 |         data into a batch of predictions. In this case, the transformation is a linear layer plus a
 84 |         softmax transformation:
 85 | 
 86 |         yhat = softmax(xW + b)
 87 | 
 88 |         Hint: The input x will be passed in through self.input_placeholder. Each ROW of
 89 |               self.input_placeholder is a single example. This is usually best-practice for
 90 |               tensorflow code.
 91 |         Hint: Make sure to create tf.Variables as needed.
 92 |         Hint: For this simple use-case, it's sufficient to initialize both weights W
 93 |                     and biases b with zeros.
 94 | 
 95 |         Returns:
 96 |             pred: A tensor of shape (batch_size, n_classes)
 97 |         """
 98 |         ### YOUR CODE HERE
 99 |         W = tf.Variable(tf.zeros((Config.n_features,Config.n_classes)), dtype=tf.float32)
100 |         b = tf.Variable(tf.zeros(Config.n_classes), dtype=tf.float32)
101 |         pred = softmax(tf.matmul(self.input_placeholder,W)+b)
102 |         ### END YOUR CODE
103 |         return pred
104 | 
105 |     def add_loss_op(self, pred):
106 |         """Adds cross_entropy_loss ops to the computational graph.
107 | 
108 |         Hint: Use the cross_entropy_loss function we defined. This should be a very
109 |                     short function.
110 |         Args:
111 |             pred: A tensor of shape (batch_size, n_classes)
112 |         Returns:
113 |             loss: A 0-d tensor (scalar)
114 |         """
115 |         ### YOUR CODE HERE
116 |         loss = cross_entropy_loss(self.labels_placeholder,pred)
117 |         ### END YOUR CODE
118 |         return loss
119 | 
120 |     def add_training_op(self, loss):
121 |         """Sets up the training Ops.
122 | 
123 |         Creates an optimizer and applies the gradients to all trainable variables.
124 |         The Op returned by this function is what must be passed to the
125 |         `sess.run()` call to cause the model to train. See
126 | 
127 |         https://www.tensorflow.org/api_docs/python/tf/train/Optimizer
128 | 
129 |         for more information. Use the learning rate from self.config.
130 | 
131 |         Hint: Use tf.train.GradientDescentOptimizer to get an optimizer object.
132 |                     Calling optimizer.minimize() will return a train_op object.
133 | 
134 |         Args:
135 |             loss: Loss tensor, from cross_entropy_loss.
136 |         Returns:
137 |             train_op: The Op for training.
138 |         """
139 |         ### YOUR CODE HERE
140 |         train_op=tf.train.GradientDescentOptimizer(learning_rate=Config.lr).minimize(loss)
141 |         ### END YOUR CODE
142 |         return train_op
143 | 
144 |     def run_epoch(self, sess, inputs, labels):
145 |         """Runs an epoch of training.
146 | 
147 |         Args:
148 |             sess: tf.Session() object
149 |             inputs: np.ndarray of shape (n_samples, n_features)
150 |             labels: np.ndarray of shape (n_samples, n_classes)
151 |         Returns:
152 |             average_loss: scalar. Average minibatch loss of model on epoch.
153 |         """
154 |         n_minibatches, total_loss = 0, 0
155 |         for input_batch, labels_batch in get_minibatches([inputs, labels], self.config.batch_size):
156 |             n_minibatches += 1
157 |             total_loss += self.train_on_batch(sess, input_batch, labels_batch)
158 |         return total_loss / n_minibatches
159 | 
160 |     def fit(self, sess, inputs, labels):
161 |         """Fit model on provided data.
162 | 
163 |         Args:
164 |             sess: tf.Session()
165 |             inputs: np.ndarray of shape (n_samples, n_features)
166 |             labels: np.ndarray of shape (n_samples, n_classes)
167 |         Returns:
168 |             losses: list of loss per epoch
169 |         """
170 |         losses = []
171 |         for epoch in range(self.config.n_epochs):
172 |             start_time = time.time()
173 |             average_loss = self.run_epoch(sess, inputs, labels)
174 |             duration = time.time() - start_time
175 |             print(('Epoch {:}: loss = {:.2f} ({:.3f} sec)'.format(epoch, average_loss, duration)))
176 |             losses.append(average_loss)
177 |         return losses
178 | 
179 |     def __init__(self, config):
180 |         """Initializes the model.
181 | 
182 |         Args:
183 |             config: A model configuration object of type Config
184 |         """
185 |         self.config = config
186 |         self.build()
187 | 
188 | 
189 | def test_softmax_model():
190 |     """Train softmax model for a number of steps."""
191 |     config = Config()
192 | 
193 |     # Generate random data to train the model on
194 |     np.random.seed(1234)
195 |     inputs = np.random.rand(config.n_samples, config.n_features)
196 |     labels = np.zeros((config.n_samples, config.n_classes), dtype=np.int32)
197 |     labels[:, 0] = 1
198 | 
199 |     # Tell TensorFlow that the model will be built into the default Graph.
200 |     # (not required but good practice)
201 |     with tf.Graph().as_default() as graph:
202 |         # Build the model and add the variable initializer op
203 |         model = SoftmaxModel(config)
204 |         init_op = tf.global_variables_initializer()
205 |     # Finalizing the graph causes tensorflow to raise an exception if you try to modify the graph
206 |     # further. This is good practice because it makes explicit the distinction between building and
207 |     # running the graph.
208 |     graph.finalize()
209 | 
210 |     # Create a session for running ops in the graph
211 |     with tf.Session(graph=graph) as sess:
212 |         # Run the op to initialize the variables.
213 |         sess.run(init_op)
214 |         # Fit the model
215 |         losses = model.fit(sess, inputs, labels)
216 | 
217 |     # If ops are implemented correctly, the average loss should fall close to zero
218 |     # rapidly.
219 |     assert losses[-1] < .5
220 |     print("Basic (non-exhaustive) classifier tests pass")
221 | 
222 | if __name__ == "__main__":
223 |     test_softmax_model()
224 | 


--------------------------------------------------------------------------------
/NLP/assignment2/q1_softmax.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import tensorflow as tf
  3 | from utils.general_utils import test_all_close
  4 | 
  5 | 
  6 | def softmax(x):
  7 |     """
  8 |     Compute the softmax function in tensorflow.
  9 | 
 10 |     You might find the tensorflow functions tf.exp, tf.reduce_max,
 11 |     tf.reduce_sum, tf.expand_dims useful. (Many solutions are possible, so you may
 12 |     not need to use all of these functions). Recall also that many common
 13 |     tensorflow operations are sugared (e.g. x + y does elementwise addition
 14 |     if x and y are both tensors). Make sure to implement the numerical stability
 15 |     fixes as in the previous homework!
 16 | 
 17 |     Args:
 18 |         x:   tf.Tensor with shape (n_samples, n_features). Note feature vectors are
 19 |                   represented by row-vectors. (For simplicity, no need to handle 1-d
 20 |                   input as in the previous homework)
 21 |     Returns:
 22 |         out: tf.Tensor with shape (n_sample, n_features). You need to construct this
 23 |                   tensor in this problem.
 24 |     """
 25 | 
 26 |     ### YOUR CODE HERE
 27 |     m = tf.reduce_max(x, axis=-1, keepdims=True)
 28 |     e = tf.exp(x - m)
 29 |     out = e/tf.reduce_sum(e, axis=-1, keepdims=True)
 30 |     ### END YOUR CODE
 31 | 
 32 |     return out
 33 | 
 34 | 
 35 | def cross_entropy_loss(y, yhat):
 36 |     """
 37 |     Compute the cross entropy loss in tensorflow.
 38 |     The loss should be summed over the current minibatch.
 39 | 
 40 |     y is a one-hot tensor of shape (n_samples, n_classes) and yhat is a tensor
 41 |     of shape (n_samples, n_classes). y should be of dtype tf.int32, and yhat should
 42 |     be of dtype tf.float32.
 43 | 
 44 |     The functions tf.to_float, tf.reduce_sum, and tf.log might prove useful. (Many
 45 |     solutions are possible, so you may not need to use all of these functions).
 46 | 
 47 |     Note: You are NOT allowed to use the tensorflow built-in cross-entropy
 48 |                 functions.
 49 | 
 50 |     Args:
 51 |         y:    tf.Tensor with shape (n_samples, n_classes). One-hot encoded.
 52 |         yhat: tf.Tensorwith shape (n_sample, n_classes). Each row encodes a
 53 |                     probability distribution and should sum to 1.
 54 |     Returns:
 55 |         out:  tf.Tensor with shape (1,) (Scalar output). You need to construct this
 56 |                     tensor in the problem.
 57 |     """
 58 | 
 59 |     out=0
 60 |     ### YOUR CODE HERE
 61 |     out=-tf.reduce_sum(tf.to_float(y)*tf.log(yhat+1e-8))
 62 |     ### END YOUR CODE
 63 | 
 64 |     return out
 65 | 
 66 | 
 67 | def test_softmax_basic():
 68 |     """
 69 |     Some simple tests of softmax to get you started.
 70 |     Warning: these are not exhaustive.
 71 |     """
 72 | 
 73 |     test1 = softmax(tf.constant(np.array([[1001, 1002], [3, 4]]), dtype=tf.float32))
 74 |     with tf.Session() as sess:
 75 |             test1 = sess.run(test1)
 76 |     test_all_close("Softmax test 1", test1, np.array([[0.26894142, 0.73105858],
 77 |                                                       [0.26894142, 0.73105858]]))
 78 | 
 79 |     test2 = softmax(tf.constant(np.array([[-1001, -1002]]), dtype=tf.float32))
 80 |     with tf.Session() as sess:
 81 |             test2 = sess.run(test2)
 82 |     test_all_close("Softmax test 2", test2, np.array([[0.73105858, 0.26894142]]))
 83 | 
 84 |     print("Basic (non-exhaustive) softmax tests pass\n")
 85 | 
 86 | 
 87 | def test_cross_entropy_loss_basic():
 88 |     """
 89 |     Some simple tests of cross_entropy_loss to get you started.
 90 |     Warning: these are not exhaustive.
 91 |     """
 92 |     y = np.array([[0, 1], [1, 0], [1, 0]])
 93 |     yhat = np.array([[.5, .5], [.5, .5], [.5, .5]])
 94 | 
 95 |     test1 = cross_entropy_loss(tf.constant(y, dtype=tf.int32), tf.constant(yhat, dtype=tf.float32))
 96 |     with tf.Session() as sess:
 97 |         test1 = sess.run(test1)
 98 |     expected = -3 * np.log(.5)
 99 |     test_all_close("Cross-entropy test 1", test1, expected)
100 | 
101 |     print("Basic (non-exhaustive) cross-entropy tests pass")
102 | 
103 | if __name__ == "__main__":
104 |     test_softmax_basic()
105 |     test_cross_entropy_loss_basic()
106 | 


--------------------------------------------------------------------------------
/NLP/assignment2/q2_initialization.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import tensorflow as tf
 3 | 
 4 | 
 5 | def xavier_weight_init():
 6 |     """Returns function that creates random tensor.
 7 | 
 8 |     The specified function will take in a shape (tuple or 1-d array) and
 9 |     returns a random tensor of the specified shape drawn from the
10 |     Xavier initialization distribution.
11 | 
12 |     Hint: You might find tf.random_uniform useful.
13 |     """
14 |     def _xavier_initializer(shape, **kwargs):
15 |         """Defines an initializer for the Xavier distribution.
16 |         Specifically, the output should be sampled uniformly from [-epsilon, epsilon] where
17 |             epsilon = sqrt(6) / <sum of the sizes of shape's dimensions>
18 |         e.g., if shape = (2, 3), epsilon = sqrt(6 / (2 + 3))
19 | 
20 |         This function will be used as a variable initializer.
21 | 
22 |         Args:
23 |             shape: Tuple or 1-d array that species the dimensions of the requested tensor.
24 |         Returns:
25 |             out: tf.Tensor of specified shape sampled from the Xavier distribution.
26 |         """
27 |         ### YOUR CODE HERE
28 |         eps=np.sqrt(6/np.sum(shape))
29 |         out=tf.random_uniform(shape,-eps,eps)
30 |         ### END YOUR CODE
31 |         return out
32 |     # Returns defined initializer function.
33 |     return _xavier_initializer
34 | 
35 | 
36 | def test_initialization_basic():
37 |     """Some simple tests for the initialization.
38 |     """
39 |     print("Running basic tests...")
40 |     xavier_initializer = xavier_weight_init()
41 |     shape = (1,)
42 |     xavier_mat = xavier_initializer(shape)
43 |     assert xavier_mat.get_shape() == shape
44 | 
45 |     shape = (1, 2, 3)
46 |     xavier_mat = xavier_initializer(shape)
47 |     assert xavier_mat.get_shape() == shape
48 |     print("Basic (non-exhaustive) Xavier initialization tests pass")
49 | 
50 | if __name__ == "__main__":
51 |     test_initialization_basic()
52 | 


--------------------------------------------------------------------------------
/NLP/assignment2/q2_parser_transitions.py:
--------------------------------------------------------------------------------
  1 | class PartialParse(object):
  2 |     def __init__(self, sentence):
  3 |         """Initializes this partial parse.
  4 | 
  5 |         Your code should initialize the following fields:
  6 |             self.stack: The current stack represented as a list with the top of the stack as the
  7 |                         last element of the list.
  8 |             self.buffer: The current buffer represented as a list with the first item on the
  9 |                          buffer as the first item of the list
 10 |             self.dependencies: The list of dependencies produced so far. Represented as a list of
 11 |                     tuples where each tuple is of the form (head, dependent).
 12 |                     Order for this list doesn't matter.
 13 | 
 14 |         The root token should be represented with the string "ROOT"
 15 | 
 16 |         Args:
 17 |             sentence: The sentence to be parsed as a list of words.
 18 |                       Your code should not modify the sentence.
 19 |         """
 20 |         # The sentence being parsed is kept for bookkeeping purposes. Do not use it in your code.
 21 |         self.sentence = sentence
 22 | 
 23 |         ### YOUR CODE HERE
 24 |         self.stack=['ROOT']
 25 |         self.buffer=sentence.copy()
 26 |         self.dependencies=[]
 27 |         ### END YOUR CODE
 28 | 
 29 |     def parse_step(self, transition):
 30 |         """Performs a single parse step by applying the given transition to this partial parse
 31 | 
 32 |         Args:
 33 |             transition: A string that equals "S", "LA", or "RA" representing the shift, left-arc,
 34 |                         and right-arc transitions. You can assume the provided transition is a legal
 35 |                         transition.
 36 |         """
 37 |         ### YOUR CODE HERE
 38 |         # print(self.stack)
 39 |         if transition == 'S':
 40 |             # if len(self.buffer)>0:
 41 |             w=self.buffer[0]
 42 |             self.stack.append(w)
 43 |             del self.buffer[0]
 44 |         elif transition == 'LA':
 45 |             # if len(self.stack)>1:
 46 |             self.dependencies.append((self.stack[-1],self.stack[-2]))
 47 |             del self.stack[-2]
 48 |         elif transition == 'RA':
 49 |             # if len(self.stack)>1:
 50 |             w=self.stack.pop()
 51 |             self.dependencies.append((self.stack[-1],w))
 52 |         else:
 53 |             raise(KeyError("Transition str key {} is not valide".format(transition)))
 54 |         ### END YOUR CODE
 55 | 
 56 |     def parse(self, transitions):
 57 |         """Applies the provided transitions to this PartialParse
 58 | 
 59 |         Args:
 60 |             transitions: The list of transitions in the order they should be applied
 61 |         Returns:
 62 |             dependencies: The list of dependencies produced when parsing the sentence. Represented
 63 |                           as a list of tuples where each tuple is of the form (head, dependent)
 64 |         """
 65 |         for transition in transitions:
 66 |             self.parse_step(transition)
 67 |         return self.dependencies
 68 | 
 69 | 
 70 | def minibatch_parse(sentences, model, batch_size):
 71 |     """Parses a list of sentences in minibatches using a model.
 72 | 
 73 |     Args:
 74 |         sentences: A list of sentences to be parsed (each sentence is a list of words)
 75 |         model: The model that makes parsing decisions. It is assumed to have a function
 76 |                model.predict(partial_parses) that takes in a list of PartialParses as input and
 77 |                returns a list of transitions predicted for each parse. That is, after calling
 78 |                    transitions = model.predict(partial_parses)
 79 |                transitions[i] will be the next transition to apply to partial_parses[i].
 80 |         batch_size: The number of PartialParses to include in each minibatch
 81 |     Returns:
 82 |         dependencies: A list where each element is the dependencies list for a parsed sentence.
 83 |                       Ordering should be the same as in sentences (i.e., dependencies[i] should
 84 |                       contain the parse for sentences[i]).
 85 |     """
 86 |     dependencies=[]
 87 |     ### YOUR CODE HERE
 88 |     partial_parses=[PartialParse(s) for s in sentences]
 89 |     unfinished_parses=[p for p in partial_parses]
 90 |     while len(unfinished_parses) != 0:
 91 |         pars=unfinished_parses[:batch_size]
 92 |         transitions=model.predict(pars)
 93 |         tobe_deleted=[]
 94 |         for i,p in enumerate(pars):
 95 |             p.parse_step(transitions[i])
 96 |             if len(p.buffer)==0 and len(p.stack)==1:
 97 |                 tobe_deleted.append(i)
 98 |         for i in reversed(tobe_deleted):
 99 |             del unfinished_parses[i]
100 |         
101 |     dependencies=[p.dependencies for p in partial_parses]
102 | 
103 |     ### END YOUR CODE
104 | 
105 |     return dependencies
106 | 
107 | 
108 | def test_step(name, transition, stack, buf, deps,
109 |               ex_stack, ex_buf, ex_deps):
110 |     """Tests that a single parse step returns the expected output"""
111 |     pp = PartialParse([])
112 |     pp.stack, pp.buffer, pp.dependencies = stack, buf, deps
113 | 
114 |     pp.parse_step(transition)
115 |     stack, buf, deps = (tuple(pp.stack), tuple(pp.buffer), tuple(sorted(pp.dependencies)))
116 |     assert stack == ex_stack, \
117 |         "{:} test resulted in stack {:}, expected {:}".format(name, stack, ex_stack)
118 |     assert buf == ex_buf, \
119 |         "{:} test resulted in buffer {:}, expected {:}".format(name, buf, ex_buf)
120 |     assert deps == ex_deps, \
121 |         "{:} test resulted in dependency list {:}, expected {:}".format(name, deps, ex_deps)
122 |     print("{:} test passed!".format(name))
123 | 
124 | 
125 | def test_parse_step():
126 |     """Simple tests for the PartialParse.parse_step function
127 |     Warning: these are not exhaustive
128 |     """
129 |     test_step("SHIFT", "S", ["ROOT", "the"], ["cat", "sat"], [],
130 |               ("ROOT", "the", "cat"), ("sat",), ())
131 |     test_step("LEFT-ARC", "LA", ["ROOT", "the", "cat"], ["sat"], [],
132 |               ("ROOT", "cat",), ("sat",), (("cat", "the"),))
133 |     test_step("RIGHT-ARC", "RA", ["ROOT", "run", "fast"], [], [],
134 |               ("ROOT", "run",), (), (("run", "fast"),))
135 | 
136 | 
137 | def test_parse():
138 |     """Simple tests for the PartialParse.parse function
139 |     Warning: these are not exhaustive
140 |     """
141 |     sentence = ["parse", "this", "sentence"]
142 |     dependencies = PartialParse(sentence).parse(["S", "S", "S", "LA", "RA", "RA"])
143 |     dependencies = tuple(sorted(dependencies))
144 |     expected = (('ROOT', 'parse'), ('parse', 'sentence'), ('sentence', 'this'))
145 |     assert dependencies == expected,  \
146 |         "parse test resulted in dependencies {:}, expected {:}".format(dependencies, expected)
147 |     assert tuple(sentence) == ("parse", "this", "sentence"), \
148 |         "parse test failed: the input sentence should not be modified"
149 |     print("parse test passed!")
150 | 
151 | 
152 | class DummyModel(object):
153 |     """Dummy model for testing the minibatch_parse function
154 |     First shifts everything onto the stack and then does exclusively right arcs if the first word of
155 |     the sentence is "right", "left" if otherwise.
156 |     """
157 |     def predict(self, partial_parses):
158 |         return [("RA" if pp.stack[1] is "right" else "LA") if len(pp.buffer) == 0 else "S"
159 |                 for pp in partial_parses]
160 | 
161 | 
162 | def test_dependencies(name, deps, ex_deps):
163 |     """Tests the provided dependencies match the expected dependencies"""
164 |     deps = tuple(sorted(deps))
165 |     assert deps == ex_deps, \
166 |         "{:} test resulted in dependency list {:}, expected {:}".format(name, deps, ex_deps)
167 | 
168 | 
169 | def test_minibatch_parse():
170 |     """Simple tests for the minibatch_parse function
171 |     Warning: these are not exhaustive
172 |     """
173 |     sentences = [["right", "arcs", "only"],
174 |                  ["right", "arcs", "only", "again"],
175 |                  ["left", "arcs", "only"],
176 |                  ["left", "arcs", "only", "again"]]
177 |     deps = minibatch_parse(sentences, DummyModel(), 2)
178 |     test_dependencies("minibatch_parse", deps[0],
179 |                       (('ROOT', 'right'), ('arcs', 'only'), ('right', 'arcs')))
180 |     test_dependencies("minibatch_parse", deps[1],
181 |                       (('ROOT', 'right'), ('arcs', 'only'), ('only', 'again'), ('right', 'arcs')))
182 |     test_dependencies("minibatch_parse", deps[2],
183 |                       (('only', 'ROOT'), ('only', 'arcs'), ('only', 'left')))
184 |     test_dependencies("minibatch_parse", deps[3],
185 |                       (('again', 'ROOT'), ('again', 'arcs'), ('again', 'left'), ('again', 'only')))
186 |     print("minibatch_parse test passed!")
187 | 
188 | if __name__ == '__main__':
189 |     test_parse_step()
190 |     test_parse()
191 |     test_minibatch_parse()
192 | 


--------------------------------------------------------------------------------
/NLP/assignment2/utils/general_utils.py:
--------------------------------------------------------------------------------
 1 | import sys
 2 | import time
 3 | import numpy as np
 4 | 
 5 | 
 6 | def get_minibatches(data, minibatch_size, shuffle=True):
 7 |     """
 8 |     Iterates through the provided data one minibatch at at time. You can use this function to
 9 |     iterate through data in minibatches as follows:
10 | 
11 |         for inputs_minibatch in get_minibatches(inputs, minibatch_size):
12 |             ...
13 | 
14 |     Or with multiple data sources:
15 | 
16 |         for inputs_minibatch, labels_minibatch in get_minibatches([inputs, labels], minibatch_size):
17 |             ...
18 | 
19 |     Args:
20 |         data: there are two possible values:
21 |             - a list or numpy array
22 |             - a list where each element is either a list or numpy array
23 |         minibatch_size: the maximum number of items in a minibatch
24 |         shuffle: whether to randomize the order of returned data
25 |     Returns:
26 |         minibatches: the return value depends on data:
27 |             - If data is a list/array it yields the next minibatch of data.
28 |             - If data a list of lists/arrays it returns the next minibatch of each element in the
29 |               list. This can be used to iterate through multiple data sources
30 |               (e.g., features and labels) at the same time.
31 | 
32 |     """
33 |     list_data = type(data) is list and (type(data[0]) is list or type(data[0]) is np.ndarray)
34 |     data_size = len(data[0]) if list_data else len(data)
35 |     indices = np.arange(data_size)
36 |     if shuffle:
37 |         np.random.shuffle(indices)
38 |     for minibatch_start in np.arange(0, data_size, minibatch_size):
39 |         minibatch_indices = indices[minibatch_start:minibatch_start + minibatch_size]
40 |         yield [_minibatch(d, minibatch_indices) for d in data] if list_data \
41 |             else _minibatch(data, minibatch_indices)
42 | 
43 | 
44 | def _minibatch(data, minibatch_idx):
45 |     return data[minibatch_idx] if type(data) is np.ndarray else [data[i] for i in minibatch_idx]
46 | 
47 | 
48 | def test_all_close(name, actual, expected):
49 |     if actual.shape != expected.shape:
50 |         raise ValueError("{:} failed, expected output to have shape {:} but has shape {:}"
51 |                          .format(name, expected.shape, actual.shape))
52 |     if np.amax(np.fabs(actual - expected)) > 1e-6:
53 |         raise ValueError("{:} failed, expected {:} but value is {:}".format(name, expected, actual))
54 |     else:
55 |         print(name, "passed!")
56 | 


--------------------------------------------------------------------------------
/NLP/assignment3/assignment3-soln.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment3/assignment3-soln.pdf


--------------------------------------------------------------------------------
/NLP/assignment3/assignment3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment3/assignment3.pdf


--------------------------------------------------------------------------------
/NLP/assignment3/data_util.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | Utility functions to process data.
  5 | """
  6 | import os
  7 | import pickle
  8 | import logging
  9 | from collections import Counter
 10 | 
 11 | import numpy as np
 12 | from util import read_conll, one_hot, window_iterator, ConfusionMatrix, load_word_vector_mapping
 13 | from defs import LBLS, NONE, LMAP, NUM, UNK, EMBED_SIZE
 14 | 
 15 | logger = logging.getLogger(__name__)
 16 | logger.setLevel(logging.DEBUG)
 17 | logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)
 18 | 
 19 | 
 20 | FDIM = 4
 21 | P_CASE = "CASE:"
 22 | CASES = ["aa", "AA", "Aa", "aA"]
 23 | START_TOKEN = "<s>"
 24 | END_TOKEN = "</s>"
 25 | 
 26 | def casing(word):
 27 |     if len(word) == 0: return word
 28 | 
 29 |     # all lowercase
 30 |     if word.islower(): return "aa"
 31 |     # all uppercase
 32 |     elif word.isupper(): return "AA"
 33 |     # starts with capital
 34 |     elif word[0].isupper(): return "Aa"
 35 |     # has non-initial capital
 36 |     else: return "aA"
 37 | 
 38 | def normalize(word):
 39 |     """
 40 |     Normalize words that are numbers or have casing.
 41 |     """
 42 |     if word.isdigit(): return NUM
 43 |     else: return word.lower()
 44 | 
 45 | def featurize(embeddings, word):
 46 |     """
 47 |     Featurize a word given embeddings.
 48 |     """
 49 |     case = casing(word)
 50 |     word = normalize(word)
 51 |     case_mapping = {c: one_hot(FDIM, i) for i, c in enumerate(CASES)}
 52 |     wv = embeddings.get(word, embeddings[UNK])
 53 |     fv = case_mapping[case]
 54 |     return np.hstack((wv, fv))
 55 | 
 56 | def evaluate(model, X, Y):
 57 |     cm = ConfusionMatrix(labels=LBLS)
 58 |     Y_ = model.predict(X)
 59 |     for i in range(Y.shape[0]):
 60 |         y, y_ = np.argmax(Y[i]), np.argmax(Y_[i])
 61 |         cm.update(y,y_)
 62 |     cm.print_table()
 63 |     return cm.summary()
 64 | 
 65 | class ModelHelper(object):
 66 |     """
 67 |     This helper takes care of preprocessing data, constructing embeddings, etc.
 68 |     """
 69 |     def __init__(self, tok2id, max_length):
 70 |         self.tok2id = tok2id
 71 |         self.START = [tok2id[START_TOKEN], tok2id[P_CASE + "aa"]]
 72 |         self.END = [tok2id[END_TOKEN], tok2id[P_CASE + "aa"]]
 73 |         self.max_length = max_length
 74 | 
 75 |     def vectorize_example(self, sentence, labels=None):
 76 |         sentence_ = [[self.tok2id.get(normalize(word), self.tok2id[UNK]), self.tok2id[P_CASE + casing(word)]] for word in sentence]
 77 |         if labels:
 78 |             labels_ = [LBLS.index(l) for l in labels]
 79 |             return sentence_, labels_
 80 |         else:
 81 |             return sentence_, [LBLS[-1] for _ in sentence]
 82 | 
 83 |     def vectorize(self, data):
 84 |         return [self.vectorize_example(sentence, labels) for sentence, labels in data]
 85 | 
 86 |     @classmethod
 87 |     def build(cls, data):
 88 |         # Preprocess data to construct an embedding
 89 |         # Reserve 0 for the special NIL token.
 90 |         tok2id = build_dict((normalize(word) for sentence, _ in data for word in sentence), offset=1, max_words=10000)
 91 |         tok2id.update(build_dict([P_CASE + c for c in CASES], offset=len(tok2id)))
 92 |         tok2id.update(build_dict([START_TOKEN, END_TOKEN, UNK], offset=len(tok2id)))
 93 |         assert sorted(tok2id.items(), key=lambda t: t[1])[0][1] == 1
 94 |         logger.info("Built dictionary for %d features.", len(tok2id))
 95 | 
 96 |         max_length = max(len(sentence) for sentence, _ in data)
 97 | 
 98 |         return cls(tok2id, max_length)
 99 | 
100 |     def save(self, path):
101 |         # Make sure the directory exists.
102 |         if not os.path.exists(path):
103 |             os.makedirs(path)
104 |         # Save the tok2id map.
105 |         with open(os.path.join(path, "features.pkl"), "wb") as f:
106 |             pickle.dump([self.tok2id, self.max_length], f)
107 | 
108 |     @classmethod
109 |     def load(cls, path):
110 |         # Make sure the directory exists.
111 |         assert os.path.exists(path) and os.path.exists(os.path.join(path, "features.pkl"))
112 |         # Save the tok2id map.
113 |         with open(os.path.join(path, "features.pkl"), "rb") as f:
114 |             tok2id, max_length = pickle.load(f)
115 |         return cls(tok2id, max_length)
116 | 
117 | def load_and_preprocess_data(args):
118 |     logger.info("Loading training data...")
119 |     train = read_conll(args.data_train)
120 |     logger.info("Done. Read %d sentences", len(train))
121 |     logger.info("Loading dev data...")
122 |     dev = read_conll(args.data_dev)
123 |     logger.info("Done. Read %d sentences", len(dev))
124 | 
125 |     helper = ModelHelper.build(train)
126 | 
127 |     # now process all the input data.
128 |     train_data = helper.vectorize(train)
129 |     dev_data = helper.vectorize(dev)
130 | 
131 |     return helper, train_data, dev_data, train, dev
132 | 
133 | def load_embeddings(args, helper):
134 |     embeddings = np.array(np.random.randn(len(helper.tok2id) + 1, EMBED_SIZE), dtype=np.float32)
135 |     embeddings[0] = 0.
136 |     for word, vec in load_word_vector_mapping(args.vocab, args.vectors).items():
137 |         word = normalize(word)
138 |         if word in helper.tok2id:
139 |             embeddings[helper.tok2id[word]] = vec
140 |     logger.info("Initialized embeddings.")
141 | 
142 |     return embeddings
143 | 
144 | def build_dict(words, max_words=None, offset=0):
145 |     cnt = Counter(words)
146 |     if max_words:
147 |         words = cnt.most_common(max_words)
148 |     else:
149 |         words = cnt.most_common()
150 |     return {word: offset+i for i, (word, _) in enumerate(words)}
151 | 
152 | 
153 | def get_chunks(seq, default=LBLS.index(NONE)):
154 |     """Breaks input of 4 4 4 0 0 4 0 ->   (0, 4, 5), (0, 6, 7)"""
155 |     chunks = []
156 |     chunk_type, chunk_start = None, None
157 |     for i, tok in enumerate(seq):
158 |         # End of a chunk 1
159 |         if tok == default and chunk_type is not None:
160 |             # Add a chunk.
161 |             chunk = (chunk_type, chunk_start, i)
162 |             chunks.append(chunk)
163 |             chunk_type, chunk_start = None, None
164 |         # End of a chunk + start of a chunk!
165 |         elif tok != default:
166 |             if chunk_type is None:
167 |                 chunk_type, chunk_start = tok, i
168 |             elif tok != chunk_type:
169 |                 chunk = (chunk_type, chunk_start, i)
170 |                 chunks.append(chunk)
171 |                 chunk_type, chunk_start = tok, i
172 |         else:
173 |             pass
174 |     # end condition
175 |     if chunk_type is not None:
176 |         chunk = (chunk_type, chunk_start, len(seq))
177 |         chunks.append(chunk)
178 |     return chunks
179 | 
180 | def test_get_chunks():
181 |     assert get_chunks([4, 4, 4, 0, 0, 4, 1, 2, 4, 3], 4) == [(0,3,5), (1, 6, 7), (2, 7, 8), (3,9,10)]
182 | 


--------------------------------------------------------------------------------
/NLP/assignment3/defs.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python3
 2 | # -*- coding: utf-8 -*-
 3 | """
 4 | Common definitions for NER
 5 | """
 6 | 
 7 | from util import one_hot
 8 | 
 9 | LBLS = [
10 |     "PER",
11 |     "ORG",
12 |     "LOC",
13 |     "MISC",
14 |     "O",
15 |     ]
16 | NONE = "O"
17 | LMAP = {k: one_hot(5,i) for i, k in enumerate(LBLS)}
18 | NUM = "NNNUMMM"
19 | UNK = "UUUNKKK"
20 | 
21 | EMBED_SIZE = 50
22 | 


--------------------------------------------------------------------------------
/NLP/assignment3/model.py:
--------------------------------------------------------------------------------
  1 | class Model(object):
  2 |     """Abstracts a Tensorflow graph for a learning task.
  3 | 
  4 |     We use various Model classes as usual abstractions to encapsulate tensorflow
  5 |     computational graphs. Each algorithm you will construct in this homework will
  6 |     inherit from a Model object.
  7 |     """
  8 |     def add_placeholders(self):
  9 |         """Adds placeholder variables to tensorflow computational graph.
 10 | 
 11 |         Tensorflow uses placeholder variables to represent locations in a
 12 |         computational graph where data is inserted.  These placeholders are used as
 13 |         inputs by the rest of the model building and will be fed data during
 14 |         training.
 15 | 
 16 |         See for more information:
 17 |         https://www.tensorflow.org/versions/r0.7/api_docs/python/io_ops.html#placeholders
 18 |         """
 19 |         raise NotImplementedError("Each Model must re-implement this method.")
 20 | 
 21 |     def create_feed_dict(self, inputs_batch, labels_batch=None):
 22 |         """Creates the feed_dict for one step of training.
 23 | 
 24 |         A feed_dict takes the form of:
 25 |         feed_dict = {
 26 |                 <placeholder>: <tensor of values to be passed for placeholder>,
 27 |                 ....
 28 |         }
 29 | 
 30 |         If labels_batch is None, then no labels are added to feed_dict.
 31 | 
 32 |         Hint: The keys for the feed_dict should be a subset of the placeholder
 33 |                     tensors created in add_placeholders.
 34 |         Args:
 35 |             inputs_batch: A batch of input data.
 36 |             labels_batch: A batch of label data.
 37 |         Returns:
 38 |             feed_dict: The feed dictionary mapping from placeholders to values.
 39 |         """
 40 |         raise NotImplementedError("Each Model must re-implement this method.")
 41 | 
 42 |     def add_prediction_op(self):
 43 |         """Implements the core of the model that transforms a batch of input data into predictions.
 44 | 
 45 |         Returns:
 46 |             pred: A tensor of shape (batch_size, n_classes)
 47 |         """
 48 |         raise NotImplementedError("Each Model must re-implement this method.")
 49 | 
 50 |     def add_loss_op(self, pred):
 51 |         """Adds Ops for the loss function to the computational graph.
 52 | 
 53 |         Args:
 54 |             pred: A tensor of shape (batch_size, n_classes)
 55 |         Returns:
 56 |             loss: A 0-d tensor (scalar) output
 57 |         """
 58 |         raise NotImplementedError("Each Model must re-implement this method.")
 59 | 
 60 |     def add_training_op(self, loss):
 61 |         """Sets up the training Ops.
 62 | 
 63 |         Creates an optimizer and applies the gradients to all trainable variables.
 64 |         The Op returned by this function is what must be passed to the
 65 |         sess.run() to train the model. See
 66 | 
 67 |         https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer
 68 | 
 69 |         for more information.
 70 | 
 71 |         Args:
 72 |             loss: Loss tensor (a scalar).
 73 |         Returns:
 74 |             train_op: The Op for training.
 75 |         """
 76 | 
 77 |         raise NotImplementedError("Each Model must re-implement this method.")
 78 | 
 79 |     def train_on_batch(self, sess, inputs_batch, labels_batch):
 80 |         """Perform one step of gradient descent on the provided batch of data.
 81 | 
 82 |         Args:
 83 |             sess: tf.Session()
 84 |             input_batch: np.ndarray of shape (n_samples, n_features)
 85 |             labels_batch: np.ndarray of shape (n_samples, n_classes)
 86 |         Returns:
 87 |             loss: loss over the batch (a scalar)
 88 |         """
 89 |         feed = self.create_feed_dict(inputs_batch, labels_batch=labels_batch)
 90 |         _, loss = sess.run([self.train_op, self.loss], feed_dict=feed)
 91 |         return loss
 92 | 
 93 |     def predict_on_batch(self, sess, inputs_batch):
 94 |         """Make predictions for the provided batch of data
 95 | 
 96 |         Args:
 97 |             sess: tf.Session()
 98 |             input_batch: np.ndarray of shape (n_samples, n_features)
 99 |         Returns:
100 |             predictions: np.ndarray of shape (n_samples, n_classes)
101 |         """
102 |         feed = self.create_feed_dict(inputs_batch)
103 |         predictions = sess.run(self.pred, feed_dict=feed)
104 |         return predictions
105 | 
106 |     def build(self):
107 |         self.add_placeholders()
108 |         self.pred = self.add_prediction_op()
109 |         self.loss = self.add_loss_op(self.pred)
110 |         self.train_op = self.add_training_op(self.loss)
111 | 


--------------------------------------------------------------------------------
/NLP/assignment3/ner_model.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python2.7
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | A model for named entity recognition.
  5 | """
  6 | import pdb
  7 | import logging
  8 | 
  9 | import tensorflow as tf
 10 | from util import ConfusionMatrix, Progbar, minibatches
 11 | from data_util import get_chunks
 12 | from model import Model
 13 | from defs import LBLS
 14 | 
 15 | logger = logging.getLogger("hw3")
 16 | logger.setLevel(logging.DEBUG)
 17 | logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)
 18 | 
 19 | class NERModel(Model):
 20 |     """
 21 |     Implements special functionality for NER models.
 22 |     """
 23 | 
 24 |     def __init__(self, helper, config, report=None):
 25 |         self.helper = helper
 26 |         self.config = config
 27 |         self.report = report
 28 | 
 29 |     def preprocess_sequence_data(self, examples):
 30 |         """Preprocess sequence data for the model.
 31 | 
 32 |         Args:
 33 |             examples: A list of vectorized input/output sequences.
 34 |         Returns:
 35 |             A new list of vectorized input/output pairs appropriate for the model.
 36 |         """
 37 |         raise NotImplementedError("Each Model must re-implement this method.")
 38 | 
 39 |     def consolidate_predictions(self, data_raw, data, preds):
 40 |         """
 41 |         Convert a sequence of predictions according to the batching
 42 |         process back into the original sequence.
 43 |         """
 44 |         raise NotImplementedError("Each Model must re-implement this method.")
 45 | 
 46 | 
 47 |     def evaluate(self, sess, examples, examples_raw):
 48 |         """Evaluates model performance on @examples.
 49 | 
 50 |         This function uses the model to predict labels for @examples and constructs a confusion matrix.
 51 | 
 52 |         Args:
 53 |             sess: the current TensorFlow session.
 54 |             examples: A list of vectorized input/output pairs.
 55 |             examples: A list of the original input/output sequence pairs.
 56 |         Returns:
 57 |             The F1 score for predicting tokens as named entities.
 58 |         """
 59 |         token_cm = ConfusionMatrix(labels=LBLS)
 60 | 
 61 |         correct_preds, total_correct, total_preds = 0., 0., 0.
 62 |         for _, labels, labels_  in self.output(sess, examples_raw, examples):
 63 |             for l, l_ in zip(labels, labels_):
 64 |                 token_cm.update(l, l_)
 65 |             gold = set(get_chunks(labels))
 66 |             pred = set(get_chunks(labels_))
 67 |             correct_preds += len(gold.intersection(pred))
 68 |             total_preds += len(pred)
 69 |             total_correct += len(gold)
 70 | 
 71 |         p = correct_preds / total_preds if correct_preds > 0 else 0
 72 |         r = correct_preds / total_correct if correct_preds > 0 else 0
 73 |         f1 = 2 * p * r / (p + r) if correct_preds > 0 else 0
 74 |         return token_cm, (p, r, f1)
 75 | 
 76 | 
 77 |     def output(self, sess, inputs_raw, inputs=None):
 78 |         """
 79 |         Reports the output of the model on examples (uses helper to featurize each example).
 80 |         """
 81 |         if inputs is None:
 82 |             inputs = self.preprocess_sequence_data(self.helper.vectorize(inputs_raw))
 83 | 
 84 |         preds = []
 85 |         prog = Progbar(target=1 + int(len(inputs) / self.config.batch_size))
 86 |         for i, batch in enumerate(minibatches(inputs, self.config.batch_size, shuffle=False)):
 87 |             # Ignore predict
 88 |             batch = batch[:1] + batch[2:]
 89 |             preds_ = self.predict_on_batch(sess, *batch)
 90 |             preds += list(preds_)
 91 |             prog.update(i + 1, [])
 92 |         return self.consolidate_predictions(inputs_raw, inputs, preds)
 93 | 
 94 |     def fit(self, sess, saver, train_examples_raw, dev_set_raw):
 95 |         best_score = 0.
 96 | 
 97 |         train_examples = self.preprocess_sequence_data(train_examples_raw)
 98 |         dev_set = self.preprocess_sequence_data(dev_set_raw)
 99 | 
100 |         for epoch in range(self.config.n_epochs):
101 |             logger.info("Epoch %d out of %d", epoch + 1, self.config.n_epochs)
102 |             # You may use the progress bar to monitor the training progress
103 |             # Addition of progress bar will not be graded, but may help when debugging
104 |             prog = Progbar(target=1 + int(len(train_examples) / self.config.batch_size))
105 | 			
106 | 			# The general idea is to loop over minibatches from train_examples, and run train_on_batch inside the loop
107 | 			# Hint: train_examples could be a list containing the feature data and label data
108 | 			# Read the doc for utils.get_minibatches to find out how to use it.
109 |                         # Note that get_minibatches could either return a list, or a list of list
110 |                         # [features, labels]. This makes expanding tuples into arguments (* operator) handy
111 | 
112 |             ### YOUR CODE HERE (2-3 lines)
113 |             for batch in minibatches(train_examples, self.config.batch_size):
114 |                 if len(batch)==2:
115 |                     features,labels=batch
116 |                     self.train_on_batch(sess, features, labels)
117 |                 if len(batch)==3:
118 |                     features,labels,masks=batch
119 |                     self.train_on_batch(sess, inputs_batch=features, labels_batch=labels, mask_batch=masks)
120 |                 else:
121 |                     raise ValueError("train examples has %d element for each sentence! But the supporteds are 2 and 3"%len(batch))
122 |             ### END YOUR CODE
123 | 
124 |             logger.info("Evaluating on development data")
125 |             token_cm, entity_scores = self.evaluate(sess, dev_set, dev_set_raw)
126 |             logger.debug("Token-level confusion matrix:\n" + token_cm.as_table())
127 |             logger.debug("Token-level scores:\n" + token_cm.summary())
128 |             logger.info("Entity level P/R/F1: %.2f/%.2f/%.2f", *entity_scores)
129 | 
130 |             score = entity_scores[-1]
131 |             
132 |             if score > best_score:
133 |                 best_score = score
134 |                 if saver:
135 |                     logger.info("New best score! Saving model in %s", self.config.model_output)
136 |                     saver.save(sess, self.config.model_output)
137 |             print("")
138 |             if self.report:
139 |                 self.report.log_epoch()
140 |                 self.report.save()
141 |         return best_score
142 | 


--------------------------------------------------------------------------------
/NLP/assignment3/q2_rnn_cell.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | Q2(c): Recurrent neural nets for NER
  5 | """
  6 | 
  7 | from __future__ import absolute_import
  8 | from __future__ import division
  9 | 
 10 | import argparse
 11 | import logging
 12 | import sys
 13 | 
 14 | import tensorflow as tf
 15 | import numpy as np
 16 | 
 17 | logger = logging.getLogger("hw3.q2.1")
 18 | logger.setLevel(logging.DEBUG)
 19 | logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)
 20 | 
 21 | class RNNCell(tf.nn.rnn_cell.RNNCell):
 22 |     """Wrapper around our RNN cell implementation that allows us to play
 23 |     nicely with TensorFlow.
 24 |     """
 25 |     def __init__(self, input_size, state_size):
 26 |         self.input_size = input_size
 27 |         self._state_size = state_size
 28 | 
 29 |     @property
 30 |     def state_size(self):
 31 |         return self._state_size
 32 | 
 33 |     @property
 34 |     def output_size(self):
 35 |         return self._state_size
 36 | 
 37 |     def __call__(self, inputs, state, scope=None):
 38 |         """Updates the state using the previous @state and @inputs.
 39 |         Remember the RNN equations are:
 40 | 
 41 |         h_t = sigmoid(x_t W_x + h_{t-1} W_h + b)
 42 | 
 43 |         TODO: In the code below, implement an RNN cell using @inputs
 44 |         (x_t above) and the state (h_{t-1} above).
 45 |             - Define W_x, W_h, b to be variables of the apporiate shape
 46 |               using the `tf.get_variable' functions. Make sure you use
 47 |               the names "W_x", "W_h" and "b"!
 48 |             - Compute @new_state (h_t) defined above
 49 |         Tips:
 50 |             - Remember to initialize your matrices using the xavier
 51 |               initialization as before.
 52 |         Args:
 53 |             inputs: is the input vector of size [None, self.input_size]
 54 |             state: is the previous state vector of size [None, self.state_size]
 55 |             scope: is the name of the scope to be used when defining the variables inside.
 56 |         Returns:
 57 |             a pair of the output vector and the new state vector.
 58 |         """
 59 |         scope = scope or type(self).__name__
 60 | 
 61 |         # It's always a good idea to scope variables in functions lest they
 62 |         # be defined elsewhere!
 63 |         with tf.variable_scope(scope):
 64 |             ### YOUR CODE HERE (~6-10 lines)
 65 |             W_x = tf.get_variable(name="W_x", shape=(self.input_size,self.state_size),initializer=tf.contrib.layers.xavier_initializer())
 66 |             b = tf.get_variable(name='b', shape=(self.state_size), initializer=tf.constant_initializer(0))
 67 |             W_h = tf.get_variable(name='W_h', shape=(self.state_size,self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 68 |             z_t=tf.add(tf.matmul(inputs,W_x)+tf.matmul(state,W_h),b,name='z_t')
 69 |             new_state=tf.nn.sigmoid(z_t, name='new_state')
 70 |             ### END YOUR CODE ###
 71 |         # For an RNN , the output and state are the same (N.B. this
 72 |         # isn't true for an LSTM, though we aren't using one of those in
 73 |         # our assignment)
 74 |         output = new_state
 75 |         return output, new_state
 76 | 
 77 | def test_rnn_cell():
 78 |     with tf.Graph().as_default():
 79 |         with tf.variable_scope("test_rnn_cell"):
 80 |             x_placeholder = tf.placeholder(tf.float32, shape=(None,3))
 81 |             h_placeholder = tf.placeholder(tf.float32, shape=(None,2))
 82 | 
 83 |             with tf.variable_scope("rnn"):
 84 |                 tf.get_variable("W_x", initializer=np.array(np.eye(3,2), dtype=np.float32))
 85 |                 tf.get_variable("W_h", initializer=np.array(np.eye(2,2), dtype=np.float32))
 86 |                 tf.get_variable("b",  initializer=np.array(np.ones(2), dtype=np.float32))
 87 | 
 88 |             tf.get_variable_scope().reuse_variables()
 89 |             cell = RNNCell(3, 2)
 90 |             y_var, ht_var = cell(x_placeholder, h_placeholder, scope="rnn")
 91 | 
 92 |             init = tf.global_variables_initializer()
 93 |             with tf.Session() as session:
 94 |                 session.run(init)
 95 |                 x = np.array([
 96 |                     [0.4, 0.5, 0.6],
 97 |                     [0.3, -0.2, -0.1]], dtype=np.float32)
 98 |                 h = np.array([
 99 |                     [0.2, 0.5],
100 |                     [-0.3, -0.3]], dtype=np.float32)
101 |                 y = np.array([
102 |                     [0.832, 0.881],
103 |                     [0.731, 0.622]], dtype=np.float32)
104 |                 ht = y
105 | 
106 |                 y_, ht_ = session.run([y_var, ht_var], feed_dict={x_placeholder: x, h_placeholder: h})
107 |                 print("y_ = " + str(y_))
108 |                 print("ht_ = " + str(ht_))
109 | 
110 |                 assert np.allclose(y_, ht_), "output and state should be equal."
111 |                 assert np.allclose(ht, ht_, atol=1e-2), "new state vector does not seem to be correct."
112 | 
113 | def do_test(_):
114 |     logger.info("Testing rnn_cell")
115 |     test_rnn_cell()
116 |     logger.info("Passed!")
117 | 
118 | if __name__ == "__main__":
119 |     parser = argparse.ArgumentParser(description='Tests the RNN cell implemented as part of Q2 of Homework 3')
120 |     subparsers = parser.add_subparsers()
121 | 
122 |     command_parser = subparsers.add_parser('test', help='')
123 |     command_parser.set_defaults(func=do_test)
124 | 
125 |     ARGS = parser.parse_args()
126 |     if ARGS.func is None:
127 |         parser.print_help()
128 |         sys.exit(1)
129 |     else:
130 |         ARGS.func(ARGS)
131 | 


--------------------------------------------------------------------------------
/NLP/assignment3/q3-clip-gru.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment3/q3-clip-gru.png


--------------------------------------------------------------------------------
/NLP/assignment3/q3-clip-rnn.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment3/q3-clip-rnn.png


--------------------------------------------------------------------------------
/NLP/assignment3/q3-noclip-gru.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/NLP/assignment3/q3-noclip-gru.png


--------------------------------------------------------------------------------
/NLP/assignment3/q3_gru_cell.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | # -*- coding: utf-8 -*-
  3 | """
  4 | Q3(d): Grooving with GRUs
  5 | """
  6 | 
  7 | from __future__ import absolute_import
  8 | from __future__ import division
  9 | 
 10 | import argparse
 11 | import logging
 12 | import sys
 13 | 
 14 | import tensorflow as tf
 15 | import numpy as np
 16 | 
 17 | logger = logging.getLogger("hw3.q3.1")
 18 | logger.setLevel(logging.DEBUG)
 19 | logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG)
 20 | 
 21 | class GRUCell(tf.nn.rnn_cell.RNNCell):
 22 |     """Wrapper around our GRU cell implementation that allows us to play
 23 |     nicely with TensorFlow.
 24 |     """
 25 |     def __init__(self, input_size, state_size):
 26 |         self.input_size = input_size
 27 |         self._state_size = state_size
 28 | 
 29 |     @property
 30 |     def state_size(self):
 31 |         return self._state_size
 32 | 
 33 |     @property
 34 |     def output_size(self):
 35 |         return self._state_size
 36 | 
 37 |     def __call__(self, inputs, state, scope=None):
 38 |         """Updates the state using the previous @state and @inputs.
 39 |         Remember the GRU equations are:
 40 | 
 41 |         z_t = sigmoid(x_t W_z + h_{t-1} U_z + b_z)
 42 |         r_t = sigmoid(x_t W_r + h_{t-1} U_r + b_r)
 43 |         o_t = tanh(x_t W_o + r_t * h_{t-1} U_o + b_o)
 44 |         h_t = z_t * h_{t-1} + (1 - z_t) * o_t
 45 | 
 46 |         TODO: In the code below, implement an GRU cell using @inputs
 47 |         (x_t above) and the state (h_{t-1} above).
 48 |             - Define U_r, W_r, b_r, U_z, W_z, b_z and U_o, W_o, b_o to
 49 |               be variables of the apporiate shape using the
 50 |               `tf.get_variable' functions.
 51 |             - Compute z, r, o and @new_state (h_t) defined above
 52 |         Tips:
 53 |             - Remember to initialize your matrices using the xavier
 54 |               initialization as before.
 55 |         Args:
 56 |             inputs: is the input vector of size [None, self.input_size]
 57 |             state: is the previous state vector of size [None, self.state_size]
 58 |             scope: is the name of the scope to be used when defining the variables inside.
 59 |         Returns:
 60 |             a pair of the output vector and the new state vector.
 61 |         """
 62 |         scope = scope or type(self).__name__
 63 | 
 64 |         # It's always a good idea to scope variables in functions lest they
 65 |         # be defined elsewhere!
 66 |         with tf.variable_scope(scope):
 67 |             ### YOUR CODE HERE (~20-30 lines)
 68 |             W_z=tf.get_variable(name="W_z", shape=(self.input_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 69 |             U_z=tf.get_variable(name="U_z", shape=(self.output_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 70 |             b_z=tf.get_variable(name='b_z', shape=(self.output_size))
 71 |             W_r=tf.get_variable(name="W_r", shape=(self.input_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 72 |             U_r=tf.get_variable(name="U_r", shape=(self.output_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 73 |             b_r=tf.get_variable(name='b_r', shape=(self.output_size))
 74 |             W_o=tf.get_variable(name="W_o", shape=(self.input_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 75 |             U_o=tf.get_variable(name="U_o", shape=(self.output_size, self.output_size), initializer=tf.contrib.layers.xavier_initializer())
 76 |             b_o=tf.get_variable(name='b_o', shape=(self.output_size))
 77 |             z_t=tf.nn.sigmoid(tf.add(tf.matmul(inputs,W_z)+tf.matmul(state,U_z),b_z),name='z_t')
 78 |             r_t=tf.nn.sigmoid(tf.add(tf.matmul(inputs,W_r)+tf.matmul(state,U_r),b_r),name='r_t')
 79 |             o_t=tf.nn.tanh(tf.add(tf.matmul(inputs,W_o)+r_t*tf.matmul(state,U_o),b_o),name='o_t')
 80 |             new_state=z_t*state+(1-z_t)*o_t
 81 |             ### END YOUR CODE ###
 82 |         # For a GRU, the output and state are the same (N.B. this isn't true
 83 |         # for an LSTM, though we aren't using one of those in our
 84 |         # assignment)
 85 |         output = new_state
 86 |         return output, new_state
 87 | 
 88 | def test_gru_cell():
 89 |     with tf.Graph().as_default():
 90 |         with tf.variable_scope("test_gru_cell"):
 91 |             x_placeholder = tf.placeholder(tf.float32, shape=(None,3))
 92 |             h_placeholder = tf.placeholder(tf.float32, shape=(None,2))
 93 | 
 94 |             with tf.variable_scope("gru"):
 95 |                 tf.get_variable("W_r", initializer=np.array(np.eye(3,2), dtype=np.float32))
 96 |                 tf.get_variable("U_r", initializer=np.array(np.eye(2,2), dtype=np.float32))
 97 |                 tf.get_variable("b_r",  initializer=np.array(np.ones(2), dtype=np.float32))
 98 |                 tf.get_variable("W_z", initializer=np.array(np.eye(3,2), dtype=np.float32))
 99 |                 tf.get_variable("U_z", initializer=np.array(np.eye(2,2), dtype=np.float32))
100 |                 tf.get_variable("b_z",  initializer=np.array(np.ones(2), dtype=np.float32))
101 |                 tf.get_variable("W_o", initializer=np.array(np.eye(3,2), dtype=np.float32))
102 |                 tf.get_variable("U_o", initializer=np.array(np.eye(2,2), dtype=np.float32))
103 |                 tf.get_variable("b_o",  initializer=np.array(np.ones(2), dtype=np.float32))
104 | 
105 |             tf.get_variable_scope().reuse_variables()
106 |             cell = GRUCell(3, 2)
107 |             y_var, ht_var = cell(x_placeholder, h_placeholder, scope="gru")
108 | 
109 |             init = tf.global_variables_initializer()
110 |             with tf.Session() as session:
111 |                 session.run(init)
112 |                 x = np.array([
113 |                     [0.4, 0.5, 0.6],
114 |                     [0.3, -0.2, -0.1]], dtype=np.float32)
115 |                 h = np.array([
116 |                     [0.2, 0.5],
117 |                     [-0.3, -0.3]], dtype=np.float32)
118 |                 y = np.array([
119 |                     [ 0.320, 0.555],
120 |                     [-0.006, 0.020]], dtype=np.float32)
121 |                 ht = y
122 | 
123 |                 y_, ht_ = session.run([y_var, ht_var], feed_dict={x_placeholder: x, h_placeholder: h})
124 |                 print("y_ = " + str(y_))
125 |                 print("ht_ = " + str(ht_))
126 | 
127 |                 assert np.allclose(y_, ht_), "output and state should be equal."
128 |                 assert np.allclose(ht, ht_, atol=1e-2), "new state vector does not seem to be correct."
129 | 
130 | def do_test(_):
131 |     logger.info("Testing gru_cell")
132 |     test_gru_cell()
133 |     logger.info("Passed!")
134 | 
135 | if __name__ == "__main__":
136 |     parser = argparse.ArgumentParser(description='Tests the GRU cell implemented as part of Q3 of Homework 3')
137 |     subparsers = parser.add_subparsers()
138 | 
139 |     command_parser = subparsers.add_parser('test', help='')
140 |     command_parser.set_defaults(func=do_test)
141 | 
142 |     ARGS = parser.parse_args()
143 |     if ARGS.func is None:
144 |         parser.print_help()
145 |         sys.exit(1)
146 |     else:
147 |         ARGS.func(ARGS)
148 | 


--------------------------------------------------------------------------------
/NLP/assignment3/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow>=0.12
2 | matplotlib
3 | 


--------------------------------------------------------------------------------
/Python/CME193/lec1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec1.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec2.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec3.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec4.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec5.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec5.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec6.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec6.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec7.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec7.pdf


--------------------------------------------------------------------------------
/Python/CME193/lec8.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/lec8.pdf


--------------------------------------------------------------------------------
/Python/CME193/problemsets/Markov-chain-startercode-master.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/problemsets/Markov-chain-startercode-master.zip


--------------------------------------------------------------------------------
/Python/CME193/problemsets/Rock-paper-scissors-startercode-master.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/problemsets/Rock-paper-scissors-startercode-master.zip


--------------------------------------------------------------------------------
/Python/CME193/problemsets/exercises.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/problemsets/exercises.pdf


--------------------------------------------------------------------------------
/Python/CME193/problemsets/hangman-master.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bayeslabs/AiGym/205350c311f0e0981fe90eae84586c9bd4a9cfef/Python/CME193/problemsets/hangman-master.zip


--------------------------------------------------------------------------------
/Python/README.md:
--------------------------------------------------------------------------------
 1 | <!-- For learning python for datascience we suggest our folks to follow the CME 193 course by stanford.
 2 | Some lectures presentations and problem sets are uploaded above. Follow the below link for latest offerings of this course.
 3 | 
 4 | http://web.stanford.edu/class/cme193/
 5 | 
 6 | 
 7 | Github page for all CME 193 class notebooks: https://github.com/icme/cme193/tree/gh-pages/nb -->
 8 | 
 9 | 
10 | 
11 | # Python
12 | 
13 | ## Course List
14 | **S.No** | **Course Title** | **Link to course** | **Link to Assignment Solutions**
15 | ------------ | ------------- | --------- | -------------
16 | [1](#1-scientific-python) | Scientific Python | http://web.stanford.edu/class/cme193/ | [CME193 Solutions](https://github.com/icme/cme193)
17 | [2](#2-introduction-to-computer-science-and-programming-in-python) | Introduction to Computer Science and Programming in Python | https://ocw.mit.edu/courses/6-0001-introduction-to-computer-science-and-programming-in-python-fall-2016/ | [CS6.0001 Solutions](https://github.com/tuthang102/MIT-6.0001-Intro-to-CS)
18 | 
19 | 
20 | ## Course Details
21 | ### 1. Scientific Python
22 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; http://web.stanford.edu/class/cme193/index.html
23 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; Stanford 
24 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp;  Basic Programming Knowledge
25 |                                      <!-- &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;Introductory molecular biology -->
26 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Beginner
27 |    * **Course description**    
28 |         This course is recommended for students who are familiar with programming at least at the level of CS106A and want to translate their programming knowledge to Python with the goal of becoming proficient in the scientific computing and data science stack. Lectures will be interactive with a focus on real world applications of scientific computing. Technologies covered include Numpy, SciPy, Pandas, Scikit-learn, and others. Topics will be chosen from Linear Algebra, Optimization, Machine Learning, and Data Science. Prior knowledge of programming will be assumed, and some familiarity with Python is helpful, but not mandatory.
29 |  
30 | 
31 | ### 2. Introduction to Computer Science and Programming in Python
32 |    * **Link to course** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; https://ocw.mit.edu/courses/6-0001-introduction-to-computer-science-and-programming-in-python-fall-2016/
33 |    * **Offered By** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;: &nbsp; &nbsp; MIT 
34 |    * **Pre-Requisites** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; None
35 |    * **Level** &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; : &nbsp; &nbsp; Intermediate
36 |    * **Course description**    
37 |         Introduction to Computer Science and Programming in Python is intended for students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems and to help students, regardless of their major, feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class uses the Python 3.5 programming language.
38 |         
39 |        
40 | ####  Happy Learning  &nbsp; :thumbsup: :memo: 
41 | 
42 | 
43 | 
44 | 
45 | 
46 | 
47 | 
48 | 


--------------------------------------------------------------------------------
/Readme.md:
--------------------------------------------------------------------------------
1 | In this repository we tried to maintain best AI materials to learn and apply in different fields. Most of these courses are based with python. Hope you will enjoy our curation.
2 | 


--------------------------------------------------------------------------------
/Speech/Readme.md:
--------------------------------------------------------------------------------
1 | For speech processing we will be following stanford CS224s course.
2 | 
3 | http://web.stanford.edu/class/cs224s/syllabus.html
4 | 


--------------------------------------------------------------------------------