diff --git a/text/TODO.txt b/text/TODO.txt index 9b0a74e..a596771 100644 --- a/text/TODO.txt +++ b/text/TODO.txt @@ -115,6 +115,9 @@ * compare with pause svmCV (only 0.1,1,10) +2016-11-15 +---------- +* really blinking? (p.27) diff --git a/text/thesis/02MaterialsAndMethods.tex b/text/thesis/02MaterialsAndMethods.tex index cf98443..1d9eb5a 100644 --- a/text/thesis/02MaterialsAndMethods.tex +++ b/text/thesis/02MaterialsAndMethods.tex @@ -156,9 +156,29 @@ Common kernels are polynomial, Gaussian and hyperbolic kernels. \subsection{Confusion Matrix} \label{mm:cm} - %TODO: 2 classes: specifity ... The confusion matrix is a visualization of classifications. In it for every class the number of samples classified as each class is shown. This is interesting since it can show bias and give a feeling for similar cases where similar is meant according to the features.\\ - %TODO: figure Confusion matrix + In the 2-class case the well-known table of true and false positives and negatives (table~\ref{tab:tptnftfn}) is a confusion matrix. From it we can compute sensitivity and specificity as follows: + $$\text{sensitivity}=TP/(TP+FN)$$ + $$\text{specificity}=TN/(TN+FP)$$ + \begin{table} + \centering + \begin{math} + \begin{array} + {c||c|c} + &\text{predicted }\true&\text{predicted }\false\\\hline\hline + \text{is }\true& TP & FN\\\hline + \text{is }\false& FP & TN + \end{array} + \end{math} + \caption{2D confusion matrix} + \label{tab:tptnftfn} + \end{table} + In the higher dimensional case \matlab{} uses color-coded maps such as figure~\ref{fig:exampleCM}. In our application we use scaled confusion matrices where each row adds up to 1. + \begin{figure} + \includegraphics[width=\textwidth]{pictures/results/cmEEGfull.png} + \caption{Example for a confusion matrix} + \label{fig:exampleCM} + \end{figure} \subsection{Regression} Regression is the idea of finding $\beta$ so that $$y= X\beta+\epsilon$$ where X is the $n\times p$ input matrix and y the $n\times 1$ output vector of a system. 
Having this $\beta$ from given input the output can be predicted.\\ There are different ways to find this $\beta$. One common approach is the \emph{ordinary least squares}-Algorithm. $$\hat{\beta}=\arg\min\limits_{b\in\mathds{R}^p} \left(y-Xb\right)^T\left(y-Xb\right),$$ meaning the chosen $\hat\beta$ is that $b$ which produces the lowest error since $Xb$ should be - besides from noise $\epsilon$ - the same as $y$.\\ @@ -201,7 +221,13 @@ \label{alg:cv} \end{algorithm} \subsection{ANOVA} - %TODO + Analysis of Variance (ANOVA) is a way of checking whether a variable has a main effect.\\ + The hypotheses tested are that all group means are equal ($H_0$) or that they are not ($H_1$). To decide between them, ANOVA compares the deviation of the group means from the overall mean with the deviation within the groups. If a lot of the variance in the data can be explained by the groups (meaning in-group variance is lower than variance between groups) it is quite likely that the proposed groups have different means.\\ + Whether this is significant is decided based on the $p$-value, the probability, assuming $H_0$ holds, of observing a ratio of between-group to in-group variance at least as high as the one found. $H_0$ is rejected if $p$ is lower than a defined threshold (often $0.05$, $0.01$ or $0.001$). + \subsection{Boxplot} + To plot data and show their distribution we use boxplots. + A boxplot contains information about the median (red line), 0.25 and 0.75 quantiles (ends of the box) and about the highest and lowest values that are not classified as outliers.\\ + A data point $y$ is classified as an outlier if $y > q_3+1.5\cdot(q_3-q_1)$ or $y < q_1-1.5\cdot(q_3-q_1)$, where $q_1,q_3$ are the first and third quartiles (which also define the box). \section{Experimental design} \label{mm:design} The data used for this work were mainly recorded by Farid Shiman, Nerea Irastorza-Landa, and Andrea Sarasola-Sanz for their work (\cite{Shiman15},\cite{Sarasola15}). 
We were allowed to use them for further analysis.\\ @@ -376,51 +402,52 @@ Since it takes some time for commands to go from brain to the muscles, we introduced an variable offset between EEG and other data. The offset has to be given in a number of shifts, so in default is a multiple of 200ms.\\ Results are given in Sections~\ref{res:offsetEEG} and~\ref{res:offsetLF}. \subsection{Pause} - \label{mat:pause}%TODO + \label{mat:pause} + We introduce a pause before movement onset. With the pause, the full second before movement onset is not taken into account when analyzing the data. Without the pause, we only exclude the interval from 1 second to 0.5 seconds before movement onset and classify the last 0.5 seconds before movement as belonging to the following task.\\ + This was necessary since the data about the presentation of stimuli did not match the recordings and we had to reclassify (cf. section~\ref{mm:newClass}). \subsection{Prediction with interim step} All these analyses only show the accuracy of one step. To get a measure for the over-all performance we predict synergies from EEG and use them to predict EMG or kinematics respectively.\\ The resulting correlation is the mean of the correlations of a 10-fold cross validation where the same unknown synergies are predicted from EEG and used to predict EMG or kinematics. So there is no correction step between the steps and EMG or kinematics are predicted from EEG via the Synergies. Here also different methods to determine Synergies are compared (see Section~\ref{res:differentSynergiesVia}). \subsection{Multiple Sessions} We analyze each session (cf. Section~\ref{mm:design}) independently meaning there are 51 independent results for each analysis. These are used for the statistical evaluation in Chapter~\ref{chp:results}.\\ - Some analyses are only done on one session - if so it will be clearly stated. 
%TODO: check, out if not necessary - \subsection{Evaluation} - \subsubsection{Default values} - \label{mat:default} - The values of our variables used in \texttt{'Default'} are given in table~\ref{tab:default}. - \begin{table} - \centering - \begin{tabular}{r|c|l} - Variable & default & Meaning\\\hline - allSubjects & \true & is the computation done for all 51 sessions\\ - &&or only for one randomly chosen?\\ - eegOffset & 0 & amount of offset applied for EEG data (cf. \ref{mat:offset})\\ - $k$ & 10 & iterations of cross validation (do not change)\\ - maxEEGFreq & 49 & Frequency for a Butterworth low-pass filter\\ - minEEGFreq & 2 & Frequency for a Butterworth high-pass filter\\ - maxExpC & 0 & SVM tries values $10^{-x}$ to $10^{x}$ - with steps in x: 1\\ - maxFile & 5 & number of files per session\\ - maxPerClass & 250 & maximum number of data points in one svm class\\ - && (only for training)\\ - noLFsamples & 5 & number of samples out of one time window\\ && used for LF predictions\\ - noSynergies & 3 & number of Synergies used\\ - pause & 0 & apply pause or not? (cf. \ref{mat:pause})\\ - pBurgOrder & 250 & order of model for Burg's model (cf. \ref{mat:burg})\\ - ridgeParams & 100 & Array of parameters tried in cross \\&&validation for ridge\\ - shiftEEG & 0.2 & shift of the EEG window in each step\\ - shiftEMG & 0.05 & shift of the EMG window in each step\\ - threshold & 10000 & threshold for classifiaction as movement (cf. \ref{mm:newClass})\\ - windowEEG & 1 & size of the EEG window \\ - windowEMG & 0.2 & size of the EMG window \\ - \end{tabular} - \caption{Values used for default} - \label{tab:default} - \end{table} - \subsubsection{Boxplot} - To plot data and show their distribution we use boxplots. 
- A boxplot contains information about the median (red line), 0.25 and 0.75 quantiles (ends of the box) and about the highest and lowest values that are not classified as outliers.\\ - A data point $y$ is classified as outlier if $y > q_3+1.5\cdot(q_3-q_1)$ or $y < q_1-1.5\cdot(q_3-q_1)$, where $q_1,q_3$ are the first and third quartile (which are also defining the box). - \subsubsection{ANOVA} - Analysis of Variance (ANOVA) is a way of checking if there is a main effect of a variable.\\ - The Hypotheses tested are that all group means are equal ($H_0$) or they are not ($H_1$). To check on those ANOVA compares the deviation from the over-all mean and compares it to the deviation within the groups. If a lot of variance in the data can be explained by the groups (meaning in-group variance is lower than variance between groups) it is quite likely that the proposed groups have different means.\\ - Whether this is significant is decided based on the $p$-Value representing the probability that the difference between in-group and between-group variance is even higher. $H_0$ is rejected if $p$ is lower than a defined threshold (often $0.05$, $0.01$ or $0.001$). + Some analyses are only done on one session - if so it will be clearly stated. + \subsection{Default values} + \label{mat:default} + The values of our variables used in \texttt{'Default'} are given in table~\ref{tab:default}. + \begin{table} + \centering + \begin{tabular}{r|c|l} + Variable & default & Meaning\\\hline + allSubjects & \true & is the computation done for all 51 sessions\\ + &&or only for one randomly chosen?\\ + eegOffset & 0 & amount of offset applied for EEG data (cf. 
\ref{mat:offset})\\ + $k$ & 10 & iterations of cross validation (do not change)\\ + maxEEGFreq & 49 & Frequency for a Butterworth low-pass filter\\ + minEEGFreq & 2 & Frequency for a Butterworth high-pass filter\\ + maxExpC & 0 & SVM tries values $10^{-x}$ to $10^{x}$ + with steps in x: 1\\ + maxFile & 5 & number of files per session\\ + maxPerClass & 250 & maximum number of data points in one svm class\\ + && (only for training)\\ + noLFsamples & 5 & number of samples out of one time window\\ && used for LF predictions\\ + noSynergies & 3 & number of Synergies used\\ + pause & 0 & apply pause or not? (cf. \ref{mat:pause})\\ + pBurgOrder & 250 & order of model for Burg's model (cf. \ref{mat:burg})\\ + ridgeParams & 100 & Array of parameters tried in cross \\&&validation for ridge\\ + shiftEEG & 0.2 & shift of the EEG window in each step\\ + shiftEMG & 0.05 & shift of the EMG window in each step\\ + threshold & 10000 & threshold for classification as movement (cf. \ref{mm:newClass})\\ + windowEEG & 1 & size of the EEG window \\ + windowEMG & 0.2 & size of the EMG window \\ + \end{tabular} + \caption{Values used for default} + \label{tab:default} + \end{table} + \subsection{Topographical Plots} + Sometimes the interpretation of EEG data is easier if plotted topographically, meaning visualized according to the corresponding positions on a modeled head.\\ + An example is shown in figure~\ref{fig:blink}. 
+ \begin{figure} + \includegraphics[height=\textheight]{pictures/topoplotMB1blink.png} + \caption{Topographical plot of MB1 blinking} + \label{fig:blink} + \end{figure} diff --git a/text/thesis/03Results.tex b/text/thesis/03Results.tex index d0b70e6..a1f4ef0 100644 --- a/text/thesis/03Results.tex +++ b/text/thesis/03Results.tex @@ -4,7 +4,6 @@ %TODO: plot, decision for 3 %TODO: compare different number of synergies \section{Classification} -%TODO: Confusion Matrices \subsection{Comparison of methods of recording} The different methods of recording (EEG, EMG and Low frequencies) also differ in the results. An ANOVA gives $p<0.001$ for all classifications done on 4 different movements and rest. \begin{figure} @@ -33,7 +32,7 @@ In figure~\ref{fig:overviewEMG} the different settings for classification based on EMG-data are shown. Default has values as in \ref{mat:default}. The runs with pause leave out the data 1 second before the movement begins (cf. \ref{mat:pause}). \begin{figure} \centering - \includegraphics[width=\textwidth]{pictures/results/overviewEMG.png} + \includegraphics[width=\textwidth]{pictures/results/overviewEMGclass.png} \caption{Classification with EMG-data} \label{fig:overviewEMG} \end{figure} @@ -42,7 +41,7 @@ In figure~\ref{fig:overviewEEG} the different settings for classification based on EEG-data are shown. Default has values as in \ref{mat:default}. The runs with pause leave out the data 1 second before the movement begins (cf. \ref{mat:pause}). Runs with offset have an offset of 1 or 2 (cf. \ref{mat:offset}). \begin{figure} \centering - \includegraphics[width=\textwidth]{pictures/results/overviewEEG.png} + \includegraphics[width=\textwidth]{pictures/results/overviewEEGclass.png} \caption{Classification with EEG-data} \label{fig:overviewEEG} \end{figure} @@ -50,7 +49,7 @@ In figure~\ref{fig:overviewLF} the different settings for classification based on LowFrequency(LF)-data are shown. Default has values as in \ref{mat:default}. 
The runs with pause leave out the data 1 second before the movement begins (cf. \ref{mat:pause}). Runs with offset have an offset of 1 or 2 (cf. \ref{mat:offset}). \begin{figure} \centering - \includegraphics[width=\textwidth]{pictures/results/overviewLF.png} + \includegraphics[width=\textwidth]{pictures/results/overviewLFclass.png} \caption{Classification with LF-data} \label{fig:overviewLF} \end{figure} @@ -72,7 +71,6 @@ \caption{Confusion Matrices in default configuration} \label{fig:cmEMG} \end{figure} - \section{Regression} \subsection{Comparison of methods of recording} \subsubsection{Velocities} @@ -232,3 +230,4 @@ %TODO \section{Topographical plots} %Maybe in discussion + %TODO diff --git a/text/thesis/pictures/results/classEEGemgLF.png b/text/thesis/pictures/results/classEEGemgLF.png index 3f26da8..f671cc4 100644 --- a/text/thesis/pictures/results/classEEGemgLF.png +++ b/text/thesis/pictures/results/classEEGemgLF.png Binary files differ diff --git a/text/thesis/pictures/results/noSyn.png b/text/thesis/pictures/results/noSyn.png new file mode 100644 index 0000000..e43143f --- /dev/null +++ b/text/thesis/pictures/results/noSyn.png Binary files differ diff --git a/text/thesis/pictures/results/overviewEEG.png b/text/thesis/pictures/results/overviewEEG.png deleted file mode 100644 index d580dfb..0000000 --- a/text/thesis/pictures/results/overviewEEG.png +++ /dev/null Binary files differ diff --git a/text/thesis/pictures/results/overviewEEGclass.png b/text/thesis/pictures/results/overviewEEGclass.png new file mode 100644 index 0000000..9ac934f --- /dev/null +++ b/text/thesis/pictures/results/overviewEEGclass.png Binary files differ diff --git a/text/thesis/pictures/results/overviewEMG.png b/text/thesis/pictures/results/overviewEMG.png deleted file mode 100644 index 88db318..0000000 --- a/text/thesis/pictures/results/overviewEMG.png +++ /dev/null Binary files differ diff --git a/text/thesis/pictures/results/overviewEMGclass.png 
b/text/thesis/pictures/results/overviewEMGclass.png new file mode 100644 index 0000000..5e74249 --- /dev/null +++ b/text/thesis/pictures/results/overviewEMGclass.png Binary files differ diff --git a/text/thesis/pictures/results/overviewLF.png b/text/thesis/pictures/results/overviewLF.png deleted file mode 100644 index 9475979..0000000 --- a/text/thesis/pictures/results/overviewLF.png +++ /dev/null Binary files differ diff --git a/text/thesis/pictures/results/overviewLFclass.png b/text/thesis/pictures/results/overviewLFclass.png new file mode 100644 index 0000000..3f7f0f8 --- /dev/null +++ b/text/thesis/pictures/results/overviewLFclass.png Binary files differ diff --git a/text/thesis/thesis.tex b/text/thesis/thesis.tex index 54c5656..7793682 100644 --- a/text/thesis/thesis.tex +++ b/text/thesis/thesis.tex @@ -55,6 +55,7 @@ \newcommand{\true}{\texttt{true}} \newcommand{\false}{\texttt{false}} + \begin{document} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -108,13 +109,15 @@ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\addcontentsline{toc}{section}{Abstract} \section*{Abstract} Write here your abstract.%\cite{Morasso92}\cite{Tresch06} \newpage -\section*{Acknowledgements} +\addcontentsline{toc}{section}{Acknowledgments} +\section*{Acknowledgments} -Write here your acknowledgements. +Write here your acknowledgments. % Martin, Rosenstiel, Birbaumer(?) - Data, Nieselt for template, WSI, Fachschaft, Familie, ...? 
\cleardoublepage @@ -126,6 +129,7 @@ \renewcommand{\baselinestretch}{1.3} \small\normalsize +\addcontentsline{toc}{chapter}{Table of Contents} \tableofcontents \renewcommand{\baselinestretch}{1} diff --git a/topoplot/plotOneSubjectOneDay.m b/topoplot/plotOneSubjectOneDay.m index d5f8ab3..4345542 100644 --- a/topoplot/plotOneSubjectOneDay.m +++ b/topoplot/plotOneSubjectOneDay.m @@ -18,8 +18,9 @@ j=fix(rand()*size(numbersMat,2)+1); number=numbersMat(j); subject=subjectsForNumbers{j}; - + fprintf('%s%i',subject,number) + [EEG,~]=readEEGSig(pathToFile,subject,number,maxFile); - topoplot_Wrapper(EEG,eloc_file,stepSize,250); + topoplot_Wrapper(EEG,eloc_file,stepSize,250) end \ No newline at end of file diff --git a/usedMcode/evaluationAccuracys.m b/usedMcode/evaluationAccuracys.m index 812956c..e665bdd 100644 --- a/usedMcode/evaluationAccuracys.m +++ b/usedMcode/evaluationAccuracys.m @@ -1,5 +1,7 @@ load('/home/jph/Uni/masterarbeit/evaluation.mat') +figureSavePath='/home/jph/Uni/masterarbeit/text/thesis/pictures/results/'; +% mySaveFigure(gcf,strcat(figureSavePath,'plot')) %% compare forms of recording eegAcc=struct2array(accuracys.EEG); emgAcc=struct2array(accuracys.EMG); @@ -30,12 +32,15 @@ limits_y=[limits_y(1)-0.1*diff(limits_y),limits_y(2)+0.1*diff(limits_y)] figure(); - +sizeX=ceil(noOfPlots/sizeY); for i=1:noOfPlots - subplot(sizeY,ceil(noOfPlots/sizeY),i) + subplot(sizeY,sizeX,i) boxplot(input.(sprintf('%s',names{i}))) title(names{i}) ylim(limits_y) + if ceil(i/sizeX) > ceil((i-1)/sizeX) + ylabel('% classified correctly') + end end anova1(cat(2,input.default3Syn,input.offset1Syn3,input.offset2Syn3,input.pause1Syn3,input.pause1Off1Syn3),[0,0,0,1,1]) diff --git a/usedMcode/evaluationCorrelations.m b/usedMcode/evaluationCorrelations.m index d252a06..b870737 100644 --- a/usedMcode/evaluationCorrelations.m +++ b/usedMcode/evaluationCorrelations.m @@ -1,7 +1,7 @@ load('/home/jph/Uni/masterarbeit/evaluation.mat') 
figureSavePath='/home/jph/Uni/masterarbeit/text/thesis/pictures/results/'; -% mySaveFigure(gcf,'plot') +% mySaveFigure(gcf,strcat(figureSavePath,'plot')) %% compare methods of recording % velocities eegCorrKin=pickFromStruct(correlations.EEG.kin,1:3);