diff --git a/is/UB3/OriginalSheet3.ipynb b/is/UB3/OriginalSheet3.ipynb new file mode 100644 index 0000000..dd7e77f --- /dev/null +++ b/is/UB3/OriginalSheet3.ipynb @@ -0,0 +1,329 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "preamble": true + }, + "source": [ + "(Defining latex commands: not to be shown...)\n", + "$$\n", + "\\newcommand{\\norm}[1]{\\left \\| #1 \\right \\|}\n", + "\\DeclareMathOperator{\\minimize}{minimize}\n", + "\\newcommand{\\real}{\\mathbb{R}}\n", + "\\newcommand{\\normal}{\\mathcal{N}}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Gaussian Algebra (25 Points)\n", + "\n", + "Prove that the product of two univariate (scalar) Gaussian distributions is a Gaussian again, i.e. show, by explicitly performing the required arithmetic transformations, that\n", + "\n", + "\\begin{equation}\n", + " \\normal(x;\\mu,\\sigma^2)\\normal(x;m,s^2) = \\normal[x; (\\frac 1{\\sigma^2}+\\frac 1{s^2})^{-1}(\\frac \\mu{\\sigma^2}+\\frac m{s^2}),(\\frac 1{\\sigma^2}+\\frac 1{s^2})^{-1}]\\normal[m;\\mu,\\sigma^2+s^2].\n", + "\\end{equation}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Maximum Likelihood Estimator of Simple Linear Regression (25 Points)\n", + "\n", + "Derive the formula $\\mathbf{w}_{MLE} = (X^TX)^{-1}X^T\\mathbf{y}$ from the lecture, by calculating the derivative of $p(\\mathbf{y}\\,|X,\\mathbf{w}) = \\normal(\\mathbf{y}\\,|X\\mathbf{w}, \\sigma^2I)$ with respect to $\\mathbf{w}$, setting it to zero and solving it for $\\mathbf{w}$.\n", + "\n", + "\n", + "Note: _To refresh your linear algebra you might find it useful to have a look [here](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/pdfs/Murray_cribsheet.pdf)._" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Linear regression (50 Points)\n", + "\n", + "In this exercise you will perform a regression analysis on a toy dataset. 
You will implement ridge regression and learn how to find a good model through a comparative performance analysis.\n", + "\n", + "1) Download the [training set](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/data/ex1_train.csv)!
\n", + "2) Implement $\\mathbf{w}_{RIDGE}$ as a function of a given $X, \\mathbf{y}$ array and a regularization parameter $\\lambda$!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "# Loading the required packages\n", + "%matplotlib inline\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from IPython.html.widgets import interact" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "scrolled": false + }, + "outputs": [], + "source": [ + "def wRidge(X,y,lamb):\n", + " # Change the following line and implement the ridge regression estimator wRidge\n", + " w = np.zeros(X.shape[-1])\n", + " return w" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "3) Load \"ex1_train.csv\" into a numpy array! The first column in the csv file is $X$ and the second column is $\\mathbf{y}$, assign them to each variable!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "# Read ex1_train.csv and assign the first column and \n", + "# second column to variables x and y respectively." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "4) Plot the training data with appropriate labels on each axes!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "scrolled": false + }, + "outputs": [], + "source": [ + "# Plot the input data here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "5) Implement a function which constructs features upto a input polynomial degree $d$!
\n", + "Note: _Constructing higher polynomial features is similar to what you implemented in Exercise 3 (SVM) of the previous exercise sheet._" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "def construct_poly(x,d):\n", + " ## Implement a method which given an array of size N, \n", + " ## returns an array of dimension (N,d)\n", + " return np.zeros((x.shape[0],d+x.shape[1]))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "6) Implement the Mean Squared Error Loss (MSE) as a function of the predicted and true values of the target variable!
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "def MSE(y_predict,y_true):\n", + " ## Implement mean squared error for a given input y and its predictions.\n", + " return 0.0" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "7) By comparing the MSE find the degree $d$ for the polynomial that fits the training data best! You might find it useful to use the code below to interactively change the variable $d$, set $\\lambda = 1$ and keep it fixed. Plot the error as a function of different values of $d$!
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": false, + "scrolled": true + }, + "outputs": [], + "source": [ + "##This function provides an interactive mode to change polynomial degree. \n", + "@interact(n=[1,16])\n", + "def plot(n):\n", + " X = construct_poly(x,n)\n", + " w = wRidge(X,y,1.0)\n", + " plt.plot(x,X.dot(w))\n", + " plt.title(\"MSE %f\" % MSE(X.dot(w),y))\n", + " plt.plot(x,y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "8) Apply models with different values of $d$ after being trained on the training dataset, to the test data available [here](http://webdav.tuebingen.mpg.de/lectures/ei-SS2015/data/ex1_test.csv). Compare the errors on the test data to the ones from the training by plotting the error curves as functions of the polynomial degree in a single plot! What do you conclude?
" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "collapsed": false + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([[ 1.00000000e+02, 1.00000000e+01, 1.00000000e+00],\n", + " [ 1.00000000e+01, 1.00000000e+00, 1.00000000e-01],\n", + " [ 1.00000000e+00, 1.00000000e-01, 1.00000000e-02]])" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "A.T.dot(A)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "collapsed": false + }, + "outputs": [], + "source": [ + "import scipy\n", + "from scipy import linalg\n", + "import numpy as np" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "collapsed": false + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1\n", + "[[ 9.80101960e-03 9.80101960e-04 9.80101960e-05]\n", + " [ 9.80101960e-04 9.80101960e-05 9.80101960e-06]\n", + " [ 9.80101960e-05 9.80101960e-06 9.80101960e-07]]\n", + "True\n" + ] + } + ], + "source": [ + "A = np.array([[10,1,0.1],[10,1,0.2],[10,1.1,0.7]])\n", + "a = np.array([[10,1,0.1]])\n", + "A = a.T.dot(a)\n", + "\n", + "B,rank = linalg.pinv(A,return_rank=True)\n", + "print rank\n", + "print B\n", + "print np.allclose(A,A.dot(B.dot(A)))" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "## Read test data here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "9) With a fixed optimal $d$, change the value of $\\lambda$ to one of the following values $[0.1, 1.0, 10.0]$ and find the minimum MSE!
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Hand in printed copy of completed notebook." + ] + } + ], + "metadata": { + "annotations": { + "author": "", + "categories": [ + "intelligent-systems-1-2015" + ], + "date": "2015-04-30", + "location": "Beginning of next lecture", + "parent": "IS_SS2015", + "submission_date": "2015-05-07", + "subtitle": "Exercise Sheet 3, Linear Regression", + "tags": [ + "IntelligenSystems", + "Course" + ], + "title": "Intelligent Systems 1 - Summer Semester 2015" + }, + "celltoolbar": "Edit Metadata", + "kernelspec": { + "display_name": "Python 2", + "language": "python", + "name": "python2" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.8" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}