Newer
Older
abgabensammlungSS15 / is / UB3 / ExerciseSheet3.ipynb
@Jan-Peter Hohloch Jan-Peter Hohloch on 3 May 2015 11 KB IS: 3A1
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "preamble": true
   },
   "source": [
    "(Defining latex commands: not to be shown...)\n",
    "$$\n",
    "\\newcommand{\\norm}[1]{\\left \\| #1 \\right \\|}\n",
    "\\DeclareMathOperator{\\minimize}{minimize}\n",
    "\\newcommand{\\real}{\\mathbb{R}}\n",
    "\\newcommand{\\normal}{\\mathcal{N}}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#Intelligent Systems Sheet 3\n",
    "Maximus Mutschler & Jan-Peter Hohloch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#  Gaussian Algebra (25 Points)\n",
    "\n",
    "Prove that the product of two univariate (scalar) Gaussian distributions is a Gaussian again, i.e. show, by explicitly performing the required arithmetic transformations, that\n",
    "\n",
    "\\begin{equation}\n",
    "   \\normal(x;\\mu,\\sigma^2)\\normal(x;m,s^2) = \\normal\\left[x; \\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\left(\\frac \\mu{\\sigma^2}+\\frac m{s^2}\\right),\\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\right]\\normal\\left[\\mu;m,\\sigma^2+s^2\\right].\n",
    "\\end{equation}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Proof\n",
    "\\begin{align*}\n",
    "   f(x) &= \\frac{1}{\\sqrt{2\\pi\\sigma^2}}\\cdot e^{\\frac{-(x-\\mu)^2}{2\\sigma^2}}\\cdot \\frac{1}{\\sqrt{2\\pi s^2}}\\cdot e^{\\frac{-(x-m)^2}{2s^2}}\\\\\n",
    "   &=\\frac{1}{2\\pi s\\sigma}\\cdot e^{-\\alpha}\\\\\n",
    "   \\text{where }\\alpha &= \\frac{s^2(x-\\mu)^2+\\sigma^2(x-m)^2}{2\\sigma^2 s^2}\\\\\n",
    "   &=\\frac{(s^2+\\sigma^2)x^2-2(s^2\\mu+\\sigma^2 m)x+s^2\\mu^2+\\sigma^2m^2}{2\\sigma^2s^2}\\\\\n",
    "   &=\\frac{x^2-2\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}x+\\frac{s^2\\mu^2+\\sigma^2m^2}{s^2+\\sigma^2}}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}\\\\\n",
    "   &=\\frac{\\left(x-\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}\\right)^2-\\left(\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}\\right)^2+\\frac{s^2\\mu^2+\\sigma^2m^2}{s^2+\\sigma^2}}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}\\\\\n",
    "   &=\\frac{\\left(x-\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}\\right)^2}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}+\\underbrace{\\frac{\\frac{s^2\\mu^2+\\sigma^2m^2}{s^2+\\sigma^2}-\\left(\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}\\right)^2}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}}_\\beta\\\\\n",
    "   \\beta&= \\frac{\\frac{s^4\\mu^2+s^2\\sigma^2\\mu^2+s^2\\sigma^2m^2+\\sigma^4m^2-\\left(s^4\\mu^2+2s^2\\sigma^2\\mu m+\\sigma^4m^2\\right)}{\\left(s^2+\\sigma^2\\right)^2}}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}\\\\\n",
    "   &=\\frac{s^2\\sigma^2\\left(\\mu^2-2\\mu m+m^2\\right)}{2\\sigma^2s^2\\left(s^2+\\sigma^2\\right)}\\\\\n",
    "   &=\\frac{(\\mu-m)^2}{2(s^2+\\sigma^2)}\\\\\n",
    "   \\Rightarrow f(x)&=\\frac{1}{2\\pi s\\sigma}\\cdot e^{-(\\alpha-\\beta)-\\beta}\\\\\n",
    "   &=\\frac{1}{2\\pi s\\sigma}\\cdot e^{-\\frac{\\left(x-\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2}\\right)^2}{\\frac{2\\sigma^2s^2}{s^2+\\sigma^2}}}\\cdot e^{-\\frac{(\\mu-m)^2}{2(s^2+\\sigma^2)}}\\\\\n",
    "   &=\\frac{\\sqrt{2\\pi(s^2+\\sigma^2)}\\sqrt{2\\pi\\frac{\\sigma^2s^2}{s^2+\\sigma^2}}}{2\\pi s\\sigma}\\normal\\left[x;\\frac{s^2\\mu+\\sigma^2 m}{s^2+\\sigma^2},\\frac{\\sigma^2s^2}{s^2+\\sigma^2}\\right]\\normal\\left[\\mu;m,\\sigma^2+s^2\\right]\\\\\n",
    "   &=\\frac{\\sqrt{2\\pi\\cdot 2\\pi\\sigma^2s^2}}{2\\pi s\\sigma}\\normal\\left[x; \\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\left(\\frac \\mu{\\sigma^2}+\\frac m{s^2}\\right),\\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\right]\\normal\\left[\\mu;m,\\sigma^2+s^2\\right]\\\\\n",
    "   &=\\normal\\left[x; \\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\left(\\frac \\mu{\\sigma^2}+\\frac m{s^2}\\right),\\left(\\frac 1{\\sigma^2}+\\frac 1{s^2}\\right)^{-1}\\right]\\normal\\left[\\mu;m,\\sigma^2+s^2\\right]\\hspace{2cm} q.e.d.\n",
    "\\end{align*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#  Maximum Likelihood Estimator of Simple Linear Regression (25 Points)\n",
    "\n",
    "Derive the formula $\\mathbf{w}_{MLE} = (X^TX)^{-1}X^T\\mathbf{y}$ from the lecture, by calculating the derivative of $p(\\mathbf{y}\\,|X,\\mathbf{w}) = \\normal(\\mathbf{y}\\,|X\\mathbf{w}, \\sigma^2I)$ with respect to $\\mathbf{w}$, setting it to zero and solving it for $\\mathbf{w}$.\n",
    "\n",
    "\n",
    "Note: _To refresh your linear algebra you might find it useful to have a look in [here](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/pdfs/Murray_cribsheet.pdf)._"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#  Linear regression (50 Points)\n",
    "\n",
    "In this exercise you will perform a regression analysis on a toy dataset. You will implement ridge regression and learn how to find a good model through a comparative performance analysis.\n",
    "\n",
    "1) Download the [training set](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/data/ex1_train.csv)! <br>\n",
    "2) Implement $\\mathbf{w}_{RIDGE}$ as a function of a given $X, \\mathbf{y}$ array and a regularization parameter $\\lambda$!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Loading the required packages\n",
    "%matplotlib inline\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from IPython.html.widgets import interact"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "def wRidge(X,y,lamb):\n",
    "  # Change the following line and implement the ridge regression estimator wRidge\n",
    "  w = np.zeros(X.shape[-1])\n",
    "  return w"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3) Load \"ex1_train.csv\" into a numpy array! The first column in the csv file is $X$ and the second column is $\\mathbf{y}$, assign them to each variable!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Read ex1_train.csv and assign the first column and \n",
    "# second column to variables x and y respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4) Plot the training data with appropriate labels on each axes!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Plot the input data here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5) Implement a function which constructs features upto a input polynomial degree $d$!<br>\n",
    "Note: _Constructing higher polynomial features is similar to what you implemented in Exercise 3 (SVM) of the previous exercise sheet._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def construct_poly(x,d):\n",
    "  ## Implement a method which given an array of size N, \n",
    "  ## returns an array of dimension (N,d)\n",
    "  return np.zeros((x.shape[0],d+x.shape[1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "6) Implement the Mean Squared Error Loss (MSE) as a function of the predicted and true values of the target variable! <br>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def MSE(y_predict,y_true):\n",
    "  ## Implement mean squared error for a given input y and its predictions.\n",
    "  return 0.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "7) By comparing the MSE find the degree $d$ for the polynomial that fits the training data best! You might find it useful to use the code below to interactively change the variable $d$, set $\\lambda = 1$ and keep it fixed. Plot the error as a function of different values of $d$!<br>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "##This function provides an interactive mode to change polynomial degree. \n",
    "@interact(n=[1,16])\n",
    "def plot(n):\n",
    "  X = construct_poly(x,n)\n",
    "  w = wRidge(X,y,1.0)\n",
    "  plt.plot(x,X.dot(w))\n",
    "  plt.title(\"MSE %f\" % MSE(X.dot(w),y))\n",
    "  plt.plot(x,y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "8) Apply models with different values of $d$ after being trained on the training dataset, to the test data available [here](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/data/ex1_test.csv). Compare the errors on the test data to the ones from the training by plotting the error curves as functions of the polynomial degree in a single plot! What do you conclude? <br>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1.00000000e+02,   1.00000000e+01,   1.00000000e+00],\n",
       "       [  1.00000000e+01,   1.00000000e+00,   1.00000000e-01],\n",
       "       [  1.00000000e+00,   1.00000000e-01,   1.00000000e-02]])"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A.T.dot(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import scipy\n",
    "from scipy import linalg\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "[[  9.80101960e-03   9.80101960e-04   9.80101960e-05]\n",
      " [  9.80101960e-04   9.80101960e-05   9.80101960e-06]\n",
      " [  9.80101960e-05   9.80101960e-06   9.80101960e-07]]\n",
      "True\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[10,1,0.1],[10,1,0.2],[10,1.1,0.7]])\n",
    "a = np.array([[10,1,0.1]])\n",
    "A = a.T.dot(a)\n",
    "\n",
    "B,rank = linalg.pinv(A,return_rank=True)\n",
    "print rank\n",
    "print B\n",
    "print np.allclose(A,A.dot(B.dot(A)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Read test data here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "9) With a fixed optimal $d$, change the value of $\\lambda$ to one of the following values $[0.1, 1.0, 10.0]$ and find the minimum MSE!<br>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hand in printed copy of completed notebook."
   ]
  }
 ],
 "metadata": {
  "annotations": {
   "author": "",
   "categories": [
    "intelligent-systems-1-2015"
   ],
   "date": "2015-04-30",
   "location": "Beginning of next lecture",
   "parent": "IS_SS2015",
   "submission_date": "2015-05-07",
   "subtitle": "Exercise Sheet 3, Linear Regression",
   "tags": [
    "IntelligenSystems",
    "Course"
   ],
   "title": "Intelligent Systems 1 - Summer Semester 2015"
  },
  "celltoolbar": "Edit Metadata",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.4.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}