diff --git a/text/thesis/01Introduction.tex b/text/thesis/01Introduction.tex index ef4bf57..9ecac2e 100644 --- a/text/thesis/01Introduction.tex +++ b/text/thesis/01Introduction.tex @@ -3,20 +3,21 @@ \section{Motivation} \label{intro:motivation} \qq{Reading the mind} is something humanity is and always was excited about. Whatever one may think about the possibility of doing so as a human, computers have a chance to catch a glimpse of the (neuronal) activity in the human brain and interpret it.\\ - Here we use electroencephalography (EEG) to record brain activity and try to predict arm movements from the data.\\ - Using this as a brain-computer-interface (BCI) holds the possibility of restoring e.g. a lost arm. This arm could be used as before by commands constructed in the brain. In a perfect application there would be no need of relearning the usage. The lost arm could just be replaced.\\ - Another opportunity this technique provides is support of retraining the usage of the natural arm after stroke. If it is possible to interpret the brainwaves the arm can be moved passively according to the commands formed in brain. This congruency can restore the bodies own ability to move the arm as \cite{Gomez11} shows.\\ + Here we use Electroencephalography (EEG) to record brain activity and try to predict arm movements from the data.\\ + Using this as a Brain-Computer-Interface (BCI) holds the possibility of restoring e.g. a lost arm. Such a prosthetic arm could be used as before, via commands formed in the brain. In a perfect application there would be no need to relearn its usage: the lost arm could simply be replaced.\\ + Another opportunity this technique provides is support in retraining the use of the natural arm after a stroke. If it is possible to interpret the brainwaves, the arm can be moved passively according to the commands formed in the brain.
This congruency can restore the body's own ability to move the arm, as \cite{Gomez11} show.\\ In a slightly different context it might become possible to handle a machine (e.g. an industrial robot or mobile robots like quadrocopters) with \qq{thoughts} (i.e. brain activity) like an additional limb. One could learn to use the capabilities of the robot just like those of one's own arm and hand to manipulate something.\\ Similar to that application it could be possible to drive a car by brain activity. This would lower the reaction time needed to activate the brakes, for example, by direct interaction instead of using the nerves down to the leg to press the brake. - Using non-invasive methods like EEG makes it harder to get a good signal and determine its origin. However it lowers the risk of injuries and infections which makes it the method of choice for wide spread application (cf. \cite{Collinger13}). Modern versions of EEG-caps even use dry electrodes which allow for more comfort without loosing predictive strength (cf. \cite{Yeung15}). So everybody may put on and off an EEG-cap without high costs for production or placement. + Using non-invasive methods like EEG makes it harder to get a good signal and determine its origin. However, it lowers the risk of injuries and infections, which makes it the method of choice for widespread application (cf. \cite{Collinger13}). Modern versions of EEG-caps even use dry electrodes, which allow for more comfort with similar predictive strength even during whole-body movement, thanks to mathematical post-processing (cf. \cite{Yeung15}). So anybody may put on and take off an EEG-cap without high costs for production or placement.\\ + With EEG, brainwaves can be captured that let us predict intended movements. This movement prediction, however, still bears some problems. - Predicting synergies instead of positions or movement is closer to the concept the nervous system uses.
This should make it easier to predict them while we can also use them to move an robotic arm or an quadrocopter. - Because there are different possibilities to calculate synergies from EMG we compare them and try to reconstruct movement from them. + Predicting synergies instead of predicting positions or movement directly may solve some of these problems, since it is closer to the concept the nervous system uses. Most likely there is not a neuron in the brain for every single muscle involved in a movement. Instead, synergies are activated, meaning there is coordinated co-activation of different muscles. When using synergies, only some basic movements have to be represented in the brain, which can then be combined into more complex movements.\\ + Assuming this, it should be easier to predict synergies, while we can still use them to move a robotic arm or a quadrocopter. - To be able to compare the results similar calculations were done with other data and paradigms like direct prediction from EEG. -\section{Overview} - After this Introduction in Materials and Methods (Chapter \ref{chp:mat}) we show the scientific background of the methods used in the work. These reach from PCA and Autoencoders over SVMs and regression to boxplots and topographical plots.\\ + These improvements shall be shown in this thesis. To do so, different methods of acquiring synergies from EMG are compared with other data and paradigms, like direct prediction from EEG, EMG and low frequencies. +\section{Overview}%TODO + After this Introduction, we show in Materials and Methods (Chapter \ref{chp:mat}) the scientific background of the methods used in this work.
These range from Principal Component Analysis (PCA) and Autoencoders through Support Vector Machines (SVMs) and regression to boxplots and topographical plots.\\ In chapter \ref{chp:results} Results we show the numerical findings of our work separated into parts on synergies, classification, regression and a topographical analysis of the brain activity.\\ These results and their meaning will be discussed in chapter \ref{chp:dis} Discussion.\\ Finally we take a look at the possible future and discuss which further research could be done based on or related to our work (chapter \ref{chp:fut}). diff --git a/text/thesis/02MaterialsAndMethods.tex b/text/thesis/02MaterialsAndMethods.tex index 6dde9a6..97f15db 100644 --- a/text/thesis/02MaterialsAndMethods.tex +++ b/text/thesis/02MaterialsAndMethods.tex @@ -1,14 +1,18 @@ -\chapter{Materials and Methods} -\label{chp:mat} -\section{Scientific background} +%\chapter{Materials and Methods} +%\label{chp:mat} +\chapter{Scientific background} \label{mat:background} - \subsection{BCIs} +\section{Communication between Brain and Computer} + \subsection{Brain-Computer-Interfaces} The idea of BCIs began to spread in the 1970s when Vidal published his paper (\cite{Vidal73}).\\ - First approaches used invasive BCIs earlier in Animals (rodents and monkeys) later also in humans. Invasive BCIs in humans were mostly implanted when the human was under brain surgery for another reason like therapy of epilepsy. Problems of invasive BCIs are the need to cut through skull and dura mater. This can lead to infections and severe brain damage.\\ - An improvement were less invasive BCIs with e.g. ECoG which is placed inside the skull but outside the dura which decreased the risk for infections massively.\\ - Measuring outside the skull entails even less risk, the dura and skull however lower the quality of the signal massively. With some improvements EEG has a spatial resolution of 2-3 cm (cf. \cite{Babiloni01}).
This is quite bad compared to the single neuron one can observe with invasive methods. However we are more interested in the activity of areas then single cells for our task, so EEG meets our requirements here.\\ - In addition EEG is much cheaper and easier to use than other techniques. There is no need for surgery (like for invasive methods) and the hardware can be bought for less than 100\euro{} while FMRI hardware costs far above 100,000\euro{}. This is one of the reasons EEG is far more available than other techniques. There are some inventions of younger date but not as much work has been done with them why they are not as well known and as far distributed as EEG.\\ - Another pro of EEG is that the device is head mounted. That means the user may move while measuring without high impact on the tracking of activity. This is highly necessary for any BCI used in daily life. + The connection between brain and computer can help humans in different ways. A wide field of possibilities already exists, from implants that restore hearing and sight in one direction, to commanding machines by brainwaves or communicating despite locked-in syndrome in the other. However, most applications require lots of training and are sometimes quite far from natural behavior: binary decisions, for example, are usually made through an excited or relaxed mood, which can easily be detected in brain activity.\\ + + \subsubsection{Methods of recording} + First approaches used invasive BCIs, first in animals (rodents and monkeys) and later also in humans. Invasive BCIs in humans were mostly implanted when the patient was undergoing brain surgery for another reason, like therapy of epilepsy. Problems of invasive BCIs are the need to cut through skull and dura mater. This can lead to infections and severe brain damage.\\ + An improvement came with less invasive BCIs using e.g.
Electrocorticography (ECoG), which is placed inside the skull but outside the dura and thus decreases the risk of infections massively.\\ + Measuring outside the skull entails even less risk; the dura and skull, however, lower the quality of the signal massively. With some improvements EEG has a spatial resolution of 2-3 cm (cf. \cite{Babiloni01}). This is quite coarse compared to the single neurons one can observe with invasive methods. However, we are more interested in the activity of areas than single cells for our task, so EEG meets our requirements here.\\ + In addition EEG is much cheaper and easier to use than other techniques. There is no need for surgery (like for invasive methods) and the hardware can be bought for less than 100\euro{} while functional Magnetic Resonance Imaging (fMRI) hardware costs far above 100,000\euro{}. This is one of the reasons EEG is far more available than other techniques. There are some inventions of more recent date, but less work has been done with them, which is why they are not as well known and as widely distributed as EEG.\\ + Another advantage of EEG is that the device is head-mounted. That means the user may move during measurement without much impact on the tracking of activity. This is essential for any BCI used in daily life. \subsection{EEG} When using Electroencephalography (EEG) one measures the electrical fields on the scalp that are generated by activity of neurons in the brain. These measurements allow some interpretation about what is happening inside the skull. In our application we use the recorded currents directly to train an SVM or as a predictor for regression. @@ -18,8 +22,8 @@ EEG is often used for non-invasive BCIs because it is cheap and easier to use than e.g. fMRI. The electrodes have to be spread over the scalp. To allow for comparability there are standardized methods for this. These methods also bring a naming convention with them.
\subsubsection{10-20 system} In this standard adjacent electrodes are placed either 10\% or 20\% of the total front-back or left-right distance apart. This standardization also makes it possible to name each electrode, or rather each position. This is done with capital letters for lobes (Frontal, \qq{Central}, Parietal, Occipital and Temporal) and numbers for the specific place on the lobe. Even numbers are on the right side of the head, odd on the left; larger numbers are closer to the ears, lower numbers closer to the other hemisphere. The exact number now refers to the exact distance from center: $$\left\lceil\frac{x}{2}\right\rceil\cdot \frac{d}{10}$$ where $x$ is the number and $d$ the diameter of the scalp. Electrodes in the center are named with a lower case $z$ e.g. $Cz$.\\ - Electrodes between two lobes (10\% instead of 20\% distance) are named with the both adjacent lobes (anterior first) e.g. $FCz$ (between frontal and central lobe). - Also see figure~\ref{fig:10-20}. + Electrodes between two lobes (10\% instead of 20\% distance) are named with both adjacent lobes (anterior first) e.g. $FCz$ (between frontal and central lobe).\\ + The naming convention according to the 10-20 system is shown in figure~\ref{fig:10-20}. \begin{figure}[!p] \centering \includegraphics[width=\textwidth]{eeg_electrodes_10-20.png} @@ -35,20 +39,31 @@ \item Beta: 13-20Hz \end{itemize} There are different definitions of the limits of the bands; as we only use them for rough estimation, we stick to these. For more exact results an analysis of wave patterns would be necessary. + + In limits similar to those of the alpha band, Mu-waves are also measured. They are associated with mirror neurons in the motor cortex, and their activity is suppressed while the subject is moving. %TODO + \subsection{Low Frequencies} + In the 2000s researchers began using new techniques to record ultrafast and infraslow brainwaves (above 50Hz and below 1Hz). These were found to have some importance (cf.
\cite{Vanhatalo04}).\\ + Low frequencies were also found to carry some significance for predicting movements, as shown by \cite{Liu11} and \cite{Antelis13} for example. \citeauthor{Antelis13} found correlations between hand movement and the low frequency signal of $(0.29,0.15,0.37)$ in the respective dimensions.\\ + \cite{Lew14} state that low frequencies are mainly involved in spontaneous, self-induced movement and can be found before the movement starts. Thus they may offer a great opportunity to lower the reaction time of neuroprostheses, for example. + \subsection{EMG} + Muscles are contracted after a signal arriving via an efferent nerve activates them. Contraction of muscles also releases measurable energy, which is used for Electromyography (EMG). There are intramuscular applications of EMG but we only used surface EMG.\\ + From surface EMG, the activity of muscles can be estimated, however not very precisely without repetition. Since the muscles used for arm movements are quite large in our setting, EMG allows relatively precise estimations of underlying muscle activity. + + EMG was mainly developed for diagnostic tasks. However, it is also applicable in science to track muscle activity, as we do here. +\section{Signal Processing} \subsection{Power estimation} \subsubsection{EEG} One way to use data from EEG is to analyze the occurring frequencies and their respective power.\\ - To gain these from the continuous signal there are different methods. The intuitive approach would be to use Fourier transformation however the Fourier transform does not need to exists for a continuous signal. So we used power spectral density (PSD) estimation. + To gain these from the continuous signal, we use windows in which the signal is finite and a Fourier transform can be estimated. For this we use power spectral density (PSD) estimation.
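A windowed PSD estimate of this kind can be sketched in a few lines; the following is an illustrative Python example using \texttt{scipy.signal.welch} (the counterpart of \matlab{}'s \texttt{pwelch}). The sampling rate and the synthetic 10Hz test signal are made up for the example, not taken from the thesis' EEG data.

```python
import numpy as np
from scipy.signal import welch

fs = 500.0                      # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
# synthetic "alpha-like" 10 Hz oscillation buried in white noise
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

# Welch's method: split the signal into overlapping windows and
# average the windowed periodograms to get a stable PSD estimate
f, psd = welch(x, fs=fs, nperseg=1024)
peak_hz = f[np.argmax(psd)]     # frequency bin with the highest power
```

An AR-based estimate in the spirit of \texttt{pburg} would instead fit $p$ autoregressive coefficients to the windowed signal rather than averaging periodograms.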
\subsubsection{Power spectral density estimation} The PSD is the power per frequency, where power refers to the square of the amplitude.\\ - If the Fourier transform is existing, PSD can be calculated from it e.g. as periodogram. If not it has to be estimated. One way to do so is parametrized with an Autoregressive model (AR). Here one assumes that there is a correlation of the spectral density between $p$ consecutive samples and the following one. This leads to an equation with only $p$ parameters which can be estimated in different ways. We used Burg's method (\texttt{pburg} from \matlab{} library).\\ + If the Fourier transform exists, the PSD can be calculated from it, e.g. as a periodogram. If not, it has to be estimated. One way to do so is via the fast Fourier transform (FFT); another - used here - is parametrized with an Autoregressive (AR) model. For this one assumes that each sample depends linearly on the $p$ preceding samples. This leads to an equation with only $p$ parameters, which can be estimated in different ways. We used Burg's method (\texttt{pburg} from the \matlab{} library).\\ In Figure~\ref{fig:psd} we see the difference between autoregressive \texttt{pburg} and periodogram \texttt{pwelch} PSD estimation. \begin{figure} \includegraphics[width=\textwidth]{psd.png} - \caption{PSD with FFT or an Autoregressive model respectively\protect\footnotemark} + \caption{PSD with FFT (top) or an Autoregressive model (bottom) respectively. The signal was unfiltered EEG data from channel \textit{Cz}, second run of the second session with subject AO} \label{fig:psd} \end{figure} - \footnotetext{The signal was unfiltered EEG data from channel \textit{Cz} second run of second session with subject AO} \subsubsection{Burg's method - Autoregressive Model} \label{mat:burg} Burg's method (\cite{Burg75}) is a special case of parametric PSD estimation.
It interprets the Yule-Walker-Equations as a least squares problem and iteratively estimates solutions.\\ @@ -59,25 +74,16 @@ The minimum has zero slope and can be found by setting the derivative to zero:$$\frac{\partial E}{\partial a_k}=0,\text{ for } 1\le k\le p$$ This yields a set of equations called \emph{Yule-Walker-Equations} (cf. \cite{Yule27},\cite{Walker31}).\\ Using forward and backward prediction, the parameters ($a_k$) are then estimated based on the Yule-Walker-Equations. - \subsection{Low Frequencies} - In the 2000s there began a movement using new techniques to record ultrafast and infraslow brainwaves (above 50Hz and below 1Hz). These were found to have some importance (cf. \cite{Vanhatalo04}).\\ - Also in predicting movements there was found some significance in low frequency as was done by \cite{Liu11} and \cite{Antelis13} for example. \citeauthor{Antelis13} found correlations between hand movement and low frequency signal of $(0.29,0.15,0.37)$ in the dimensions respectively.\\ - \cite{Lew14} state low frequencies are mainly involved in spontaneous self-induced movement and can be found before the movement starts. By this they may be a great possibility to lower reaction time of neuroprostheses for example. \subsection{Filtering} Filtering of the recorded EEG signal is necessary for different reasons. First there are artifacts from the 50Hz mains current. These can be filtered out with bandstop filters.\\ Secondly we need to concentrate on the interesting frequencies (for classical EEG 1-50Hz). This is done by applying lowpass or highpass filters respectively. This is necessary because the PSD of lower frequencies is a lot higher than that of higher frequencies. The relation $$PSD(f)=\frac{c}{f^\gamma}$$ holds for constants $c$ and $\gamma$ (\cite{Demanuele07}).\\ The Butterworth filter (\cite{Butterworth30}) was invented by Stephen Butterworth in 1930. Its advantage was uniform sensitivity to all wanted frequencies.
In comparison to other filters, Butterworth's is smoother because it is flat in the pass band and monotonic over all frequencies. This, however, leads to a decreased steepness of the rolloff, meaning a higher portion of frequencies beyond the cutoff passes through. - \subsection{EMG} - When using muscles they are contracted after an signal via an efferent nerve activates them. Contraction of muscles also releases measurable energy which is used for Electromyography (EMG). There are intramuscular applications of EMG but we only used surface EMG.\\ - From surface EMG activity of muscles can be estimated however not very precisely without repetition. Since the muscles used for arm movements are quite large in our setting EMG allows relatively precise estimations of underlying muscle activity. - - EMG is mainly developed for diagnostic tasks. However it is also applicable in science to track muscle activity as we do here. - \subsection{Synergies} - \label{back:synergies} - Movement of the arm (and other parts of the body) are under-determined meaning with given trajectory there are different muscle contractions possible. One idea how this problem could be solved by our nervous system are synergies. Proposed by Bernstein in 1967 (\cite{Bernstein67}) they describe the goal of the movement (e.g. the trajectory) instead of controlling single muscles. This would mean however that predicting the activity of single muscles from EEG is harder than predicting a synergy which in turn determines the contraction of muscles.\\ - Evidence for the use of synergies in the nervous system was found e.g. by Bizzi et al. (\cite{Bizzi08}) and Byadarhaly et al. (\cite{Byadarhaly12}). They also showed that synergies meet the necessary requirement to be able to build predictable trajectories.\\ - Synergies are usually gotten from EMG signal through a principal component analysis (PCA, cf. \ref{mat:pca}), non-negative matrix factorization (NMF, cf. \ref{mat:nmf}) or autoencoders (a form of neuronal network, cf. \ref{mat:autoenc}).
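The two filtering steps described here - a bandstop against the 50Hz mains artifact and restriction to the classical 1-50Hz range - can be sketched in Python with \texttt{scipy.signal}. This is an illustrative example with an assumed sampling rate and an invented test signal; the thesis itself works with \matlab{} filters.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 500.0                                  # assumed sampling rate in Hz
nyq = fs / 2
# 4th-order Butterworth bandpass for the classical EEG range 1-50 Hz
b_bp, a_bp = butter(4, [1 / nyq, 50 / nyq], btype="bandpass")
# 4th-order Butterworth bandstop (49-51 Hz) against mains interference
b_bs, a_bs = butter(4, [49 / nyq, 51 / nyq], btype="bandstop")

t = np.arange(0, 2, 1 / fs)
# toy signal: a 10 Hz "brain" component plus 50 Hz line noise
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
# filtfilt runs the filter forward and backward, giving zero phase shift
clean = filtfilt(b_bs, a_bs, x)
```

After the bandstop, the 50Hz component is almost entirely removed while the 10Hz component passes through the flat pass band essentially unchanged.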
- \subsection{PCA} +\section{Synergies} +\label{back:synergies} + Movements of the arm (and other parts of the body) are under-determined, meaning that for a given trajectory different muscle contractions are possible. One idea of how our nervous system could solve this problem are synergies. Proposed by Bernstein in 1967 (\cite{Bernstein67}), they describe the goal of the movement (e.g. the trajectory) instead of controlling single muscles. This would mean, however, that predicting the activity of single muscles from EEG is harder than predicting a synergy, which in turn determines the contraction of the muscles.\\ + Evidence for the use of synergies in the nervous system was found e.g. by Bizzi et al. (\cite{Bizzi08}) and Byadarhaly et al. (\cite{Byadarhaly12}). They also showed that synergies meet the necessary requirement of being able to build predictable trajectories.\\ + Synergies are usually obtained from the EMG signal through principal component analysis (PCA, cf. \ref{mat:pca}), non-negative matrix factorization (NMF, cf. \ref{mat:nmf}) or autoencoders (a form of neural network, cf. \ref{mat:autoenc}). + \subsection{Principal Component Analysis} \label{mat:pca} Principal Component Analysis (PCA) is probably the most common technique for dimensionality reduction. The idea is to use those dimensions with the highest variance to keep as much information as possible in the lower-dimensional space.\\ PCA was invented in 1901 by Karl Pearson (\cite{Pearson01}). The intention was to find the line closest to a set of data. This line also is the one that explains most variance.\\ @@ -89,7 +95,7 @@ \label{fig:pca} \end{figure} In Figure~\ref{fig:pca} we see the eigenvectors of the data. The longer vector is the principal component; the shorter one is orthogonal to it and explains the remaining variance. The second component here is also the component which explains the least variance, since most variance is orthogonal to it.
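The eigenvector view of PCA can be made concrete in a few lines of Python (an illustrative sketch on made-up correlated 2-D data, not the thesis' \matlab{} code):

```python
import numpy as np

rng = np.random.default_rng(1)
# made-up correlated 2-D data: the second feature follows the first
x = rng.standard_normal(500)
data = np.column_stack([x, 0.5 * x + 0.3 * rng.standard_normal(500)])
data -= data.mean(axis=0)        # PCA assumes centered data

# principal components are the eigenvectors of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(data.T))   # ascending eigenvalues
pc1 = eigvecs[:, -1]                                # largest-variance direction
explained = eigvals[-1] / eigvals.sum()             # fraction of variance
projected = data @ pc1                              # 1-D representation
```

Here the first component explains well over 90\% of the variance, so reducing to one dimension loses little information - the same argument that motivates describing many muscles with only a few synergies.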
- \subsection{NMF} + \subsection{Non-Negative Matrix Factorization} \label{mat:nmf} In some applications non-Negative Matrix Factorization (NMF) is preferred over PCA (cf. \cite{Lee99}). This is because it does not learn eigenvectors but decomposes the input into parts which are all possibly present in the input. When seen as a matrix factorization, PCA yields matrices of arbitrary sign, where one represents the eigenvectors and the other the specific mixture of them. Because an entry may be negative, cancellation is possible. This leads to unintuitive representations in the first matrix.\\ NMF in contrast only allows non-negative entries. This leads to \qq{what is in, is in}, meaning no cancellation, which in turn yields more intuitive matrices. The first contains possible parts of the data, the second how strongly they are represented in the current input.\\ @@ -126,7 +132,8 @@ \caption{Autoencoder (6-3-6)} \label{fig:autoenc} \end{figure} - \subsection{Support-vector Machines} +\section{Machine Learning} + \subsection{Support-Vector Machines} Support-vector machines (SVMs) are used for classification of data. This is done by separating data in feature space by a hyperplane. Additional data is classified with respect to the side of the hyperplane it is located on in feature space.\\ This hyperplane is considered optimal if the margins on both sides (distance to the nearest data point) are maximal to allow for the maximal possible noise. This means the separating hyperplane can be constructed out of the nearest points (3 in 2-D) from both classes. These points however may be different for different attempts, as a different angle in some dimension may make different points the nearest (cf.
Figure~\ref{fig:hyperplanes}).\\ \begin{figure} @@ -153,38 +160,13 @@ $$\text{Minimize }\frac{1}{N}\sum\limits_{i=1}^N\max\{0,1-y_i(\vec{w}\cdot\vec{x_i}-b)\}+\lambda ||\vec{w}||^2,$$ where $\lambda$ is the parameter that adjusts the trade-off between large margins and wrong classifications (if $\lambda$ has a higher value, there is more weight on large margins). \subsubsection{Kernel trick} - Data like in Figure~\ref{fig:kernel} are not \emph{linear} separable. The idea here is to apply the \emph{kernel trick} meaning to transform the data in a different space where they are linear separable. In the example this is accomplished by using the distance from origin as feature and separating in that space. + Data like in figure~\ref{fig:kernel} are not \emph{linearly} separable. The idea here is to apply the \emph{kernel trick}, meaning to separate the data in a higher-dimensional space where they are linearly separable. In the example this is accomplished by using the distance from the origin as a feature and separating in that space. \begin{figure} \input{pictures/kernel.tikz} - \caption{Data separable with the kernel trick} + \caption{Data separable with the kernel trick; left in the original space with features $x$ and $y$, right in the dimension where the distance from the origin is shown and the data is linearly separable} \label{fig:kernel} \end{figure} Common kernels are polynomial, Gaussian and hyperbolic kernels. - \subsection{Confusion Matrix} - \label{mm:cm} - The confusion matrix is a visualization of classifications. In it for every class the number of samples classified as each class is shown. This is interesting since it can show bias and give a feeling for similar cases where similar is meant according to the features.\\ - In the 2-class case the well known table of true and false positives and negatives (table~\ref{tab:tptnftfn}) is a confusion matrix.
From it we can learn specificity and sensitivity as follows: - $$\text{sensitivity}=TP/(TP+FP)$$ - $$\text{specificity}=TN/(TN+FN)$$ - \begin{table} - \centering - \begin{math} - \begin{array} - {c||c|c} - &\text{predicted }\true&\text{predicted }\false\\\hline\hline - \text{is }\true& TP & FN\\\hline - \text{is }\false& FP & TN - \end{array} - \end{math} - \caption{2D confusion matrix} - \label{tab:tptnftfn} - \end{table} - In the higher dimensional case \matlab{} uses color coded maps as figure~\ref{fig:exampleCM}. In our application we use scaled confusion matrices where each row adds up to 1. - \begin{figure} - \includegraphics[width=\textwidth]{pictures/results/cmEEGfull.png} - \caption{Example for a confusion matrix} - \label{fig:exampleCM} - \end{figure} \subsection{Regression} Regression is the idea of finding $\beta$ so that $$y= X\beta+\epsilon$$ where $X$ is the $n\times p$ input matrix and $y$ the $n\times 1$ output vector of a system. Having this $\beta$, the output can be predicted from given input.\\ There are different ways to find this $\beta$. One common approach is the \emph{ordinary least squares}-Algorithm. $$\hat{\beta}=\arg\min\limits_{b\in\mathds{R}^p} \left(y-Xb\right)^T\left(y-Xb\right),$$ meaning the chosen $\hat\beta$ is that $b$ which produces the lowest error, since $Xb$ should be - apart from the noise $\epsilon$ - the same as $y$.\\ @@ -226,6 +208,32 @@ \caption{Nested 10-fold Cross Validation with parameter optimization} \label{alg:cv} \end{algorithm} +\section{Evaluation Methods} + \subsection{Confusion Matrix} + \label{mm:cm} + The confusion matrix is a visualization of classifications. In it, for every class, the number of samples classified as each class is shown. This is interesting since it can show bias and give a feeling for which cases are similar, where similar is meant with respect to the features.\\ + In the 2-class case the well-known table of true and false positives and negatives (table~\ref{tab:tptnftfn}) is a confusion matrix.
From it we can learn specificity and sensitivity as follows: + $$\text{sensitivity}=TP/(TP+FN)$$ + $$\text{specificity}=TN/(TN+FP)$$ + \begin{table} + \centering + \begin{math} + \begin{array} + {c||c|c} + &\text{predicted }\true&\text{predicted }\false\\\hline\hline + \text{is }\true& TP & FN\\\hline + \text{is }\false& FP & TN + \end{array} + \end{math} + \caption{2D confusion matrix} + \label{tab:tptnftfn} + \end{table} + In the higher-dimensional case \matlab{} uses color-coded maps as in figure~\ref{fig:exampleCM}. In our application we use scaled confusion matrices where each row adds up to 1. + \begin{figure} + \includegraphics[width=\textwidth]{pictures/results/cmEEGfull.png} + \caption{Example for a confusion matrix} + \label{fig:exampleCM} + \end{figure} \subsection{ANOVA} Analysis of Variance (ANOVA) is a way of checking if there is a main effect of a variable.\\ The hypotheses tested are that all group means are equal ($H_0$) or that they are not ($H_1$). To check these, ANOVA takes the deviation from the overall mean and compares it to the deviation within the groups. If a lot of variance in the data can be explained by the groups (meaning in-group variance is lower than variance between groups) it is quite likely that the proposed groups have different means.\\ @@ -234,7 +242,7 @@ To plot data and show their distribution we use boxplots. A boxplot contains information about the median (red line), 0.25 and 0.75 quantiles (ends of the box) and about the highest and lowest values that are not classified as outliers.\\ A data point $y$ is classified as an outlier if $y > q_3+1.5\cdot(q_3-q_1)$ or $y < q_1-1.5\cdot(q_3-q_1)$, where $q_1,q_3$ are the first and third quartile (which also define the box). -\section{Experimental design} +\chapter{Experimental design} \label{mm:design} The data used for this work was mainly recorded by Farid Shiman, Nerea Irastorza-Landa, and Andrea Sarasola-Sanz for their work (\cite{Shiman15},\cite{Sarasola15}).
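The sensitivity, specificity and row scaling of the 2-class confusion matrix can be computed as follows (a minimal Python sketch with hypothetical counts, not data from this work; sensitivity is the detected share of actual positives, specificity that of actual negatives):

```python
import numpy as np

# hypothetical counts; rows = actual class, columns = predicted class,
# laid out like the 2D confusion matrix table: [[TP, FN], [FP, TN]]
cm = np.array([[40, 10],
               [5, 45]])

tp, fn = cm[0]
fp, tn = cm[1]
sensitivity = tp / (tp + fn)   # fraction of actual positives detected
specificity = tn / (tn + fp)   # fraction of actual negatives detected

# row-scaled confusion matrix: each row adds up to 1
scaled = cm / cm.sum(axis=1, keepdims=True)
```

The row-scaled form makes classes with different sample counts directly comparable, which is why it is used for the higher-dimensional matrices in this work.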
We were allowed to use it for further analysis.\\ There were 9 right-handed subjects with an average age of 25 (variance 6.67, minimum 20, maximum 28). Three female and six male subjects were tested. All the tasks were performed with the dominant right hand.\\ @@ -250,11 +258,11 @@ Of the kinematic information tracked we only used position ($x,y$) and angle ($\theta$, rotation around $z$-axis) of the hand.\\ Only complete sessions were used in our analysis to ensure better comparability.\\ One session consists of 5 runs with 40 trials each. The trials were separated by resting phases of varying length (2-3s, randomly assigned). Each trial began with an auditory cue specifying the random but equally distributed target for this trial. This leads to 50 reaches to the same target each session. - After the auditory cue the participants should \qq{perform the movement and return to the starting position at a comfortable pace but within 4 seconds}\footnote{\cite{Shiman15}}\\ + After the auditory cue the participants should \qq{perform the movement and return to the starting position at a comfortable pace but within 4 seconds} (\cite{Shiman15}).\\ For each subject there were 4 to 6 sessions, each recorded on a different day. All in all there were 255 runs in 51 sessions. Each session was analyzed independently as one continuous trial. - \subsection{Environment for evaluation} + \section{Environment for evaluation} The calculations were done on Ubuntu \texttt{14.04 / 3.19.0-39} with \matlab{} \texttt{R2016a (9.0.0.341360) 64-bit (glnxa64) February 11, 2016}. -\section{Data Acquisition} +\chapter{Data Acquisition} \subsection{Loading of data} The data recorded with BCI2000 (\cite{Schalk04}) can be loaded into \matlab{} with a specific \texttt{.mex} file. The corresponding \texttt{.mex} files for some platforms (Windows, Mac, Linux) are available precompiled from BCI2000.\\ We load the signal plus the corresponding status data and the parameters (see Algorithm~\ref{alg:load_bcidat}).
@@ -381,7 +389,7 @@ We used kinematic data either as movement or as position. The position was recorded directly; the movement is the first time derivative of the position.\\ The recording of kinematics was started after that of the EEG. In the synchronization channel\footnote{cf. Table~\ref{tab:channelNames}} there is a peak when the kinematic recording starts. This was used to align the movement with the EEG and EMG data. In addition we adjusted the kinematic data to the EMG window and shift so that corresponding data could be used for the same time step. This was done by summing all differences (for movement) or by calculating the mean position in the time window.\\ This data has the same length as the EMG and synergy data but has only three features per time step, since we used only the 3D positioning ($x,y$ and $\theta$) of the hand and no information about the fingers. -\section{Data Analysis} +\chapter{Data Analysis} Figure~\ref{fig:overview} shows the steps of our work. EEG, EMG and positions were recorded; synergies and velocities were calculated from them. To check the performance of our methods the relations between them were predicted. \begin{figure} \centering diff --git a/text/thesis/05Future.tex b/text/thesis/05Future.tex index d2fed67..0f71171 100644 --- a/text/thesis/05Future.tex +++ b/text/thesis/05Future.tex @@ -1,27 +1,27 @@ -\chapter{Future Work} +\section{Future Work} \label{chp:fut} -\section{Classification} +\subsection{Classification} Our classification results are not very reliable since we did the classification based on EMG (cf. Section~\ref{mm:newClass}). It would be interesting to analyze data where the stimulus is matched to the EEG signal and to check for early detectability (e.g. with low frequencies, as in \cite{Lew14}).\\ Additionally, classification, which is sufficient for some tasks, could be compared to regression. If there is only a limited set of movements a robotic prosthesis has to perform, it could use classification.
This should give a lower error rate since the different movements can be distinguished better. -\section{Measurement of error} +\subsection{Measurement of error} For the comparison of regression and classification it could be interesting to introduce a performance measure other than just classified correctly or not. It could be interesting to see how much the predicted movement differs from the real one, even in the classification task. In that way one would get a measure to decide whether using classification instead of regression pays off.\\ For this analysis a variable number of classes would also be interesting, since having 4 movements (as in our setting) is not enough to operate an artificial arm. -\section{Offset} +\subsection{Offset} There is no significant effect of an offset in our configuration. When using smaller EEG windows, however, there might be one. This could be tried in further analyses with small EEG windows.\\ These small windows, however, will probably bring other problems, e.g. an unstable transformation into Fourier space. So if it is necessary to use large windows, an offset is unnecessary. -\section{Use of EEG channels} +\subsection{Use of EEG channels} To achieve higher performance it would be interesting to identify those EEG channels that contribute most to a good estimation of arm movements or position. There should be channels that do not carry much information for these differentiations; this, however, has to be explored further.\\ In this context research could also be done to find out which frequencies allow for the best predictions. Our findings suggest better performance for the alpha band and for occipital and parietal regions. More detailed work on this specific topic is, however, necessary to decide based on more data. -\section{Self-chosen movement} +\subsection{Self-chosen movement} For a better use of low-frequency features our work could be redone with data recorded while subjects move voluntarily.
This might also influence the way synergies are predicted and could lead to a better prediction.\\ Additionally this task matches the requirements for a BCI better, as movement in daily life is more often voluntary than decided by a single auditory cue. -\section{Synergies} - \subsection{Generation of Synergies} +\subsection{Synergies} + \subsubsection{Generation of Synergies} We showed the plausibility of synergies here, so the next step could be to improve their acquisition. Generating them from EMG may include unnecessary information. The generation of synergies as an intermediate step between EEG (or, generally, brain activity) and EMG (or, generally, muscle activity) may achieve even better results.\\ A dimensionality reduction on EEG only will probably not work since there is too much unrelated activity; EMG only bears the problem of a lower fit to the movement, as we showed.\\ An idea could be to try a dimensionality reduction on EEG from parts of the brain known to be involved in arm movement. This, however, is a far less general approach than the methods we used.\\ A more general approach would be a neural network trained to predict EMG from EEG. The hidden layer of this network could again be used as synergies. - \subsection{Autoencoders} + \subsubsection{Autoencoders} We did not find significantly better performance for autoencoders, even with only 2 synergies. Since this was not the focus of our work, that might however still be possible. Additional research is needed to determine which method is best to generate synergies. diff --git a/text/thesis/thesis.tex b/text/thesis/thesis.tex index f414b5d..91be396 100644 --- a/text/thesis/thesis.tex +++ b/text/thesis/thesis.tex @@ -114,10 +114,10 @@ \section*{Abstract} \addcontentsline{toc}{section}{Abstract} -Synergies are %TODO -This thesis shows the plausibility of synergies as an intermediate step between brain and muscles.
Our results show only small decrease in predicting performance for position and velocity compared to the EMG signal. This was achieved with synergies acquired through dimensionality reduction from EMG signal.\\ -The results of prediction of, via and from synergies are compared with other techniques currently used to predict movement from EEG in a classification and regression context. Over all synergies perform not much worse than EMG and are predicted better from EEG.\\ -We also compare different methods for the acquisition of synergies. Our findings show that autoencoders are a great possibility to generate synergies from EMG. Synergies from non-Negative Matrix Factorization also perform well, those acquired by Principal Component Analysis are performing worse when being predicted from EEG. +Synergies are patterns of muscle activation in which muscles are used in a coordinated way instead of each muscle being activated separately. The theory is that these patterns can also be found in the brain and its activation.\\ +This thesis shows the plausibility of synergies as an intermediate step between brain and muscles. The results show only a small decrease in prediction performance for position and velocity compared to the Electromyography (EMG) signal. This was achieved with synergies acquired through dimensionality reduction from the EMG signal.\\ +The results of prediction of, via, and from synergies are compared with other techniques currently used to predict movement from Electroencephalography (EEG) in a classification and regression context. Overall, synergies perform not much worse than EMG and are predicted better from EEG.\\ +Different methods for the acquisition of synergies are also compared. The findings show that autoencoders are a promising way to generate synergies from EMG. Synergies from Non-negative Matrix Factorization also perform well, while those acquired by Principal Component Analysis perform worse when predicted from EEG.
\newpage \section*{Acknowledgments} @@ -207,6 +207,7 @@ \begin{tabbing} \textbf{EEG}\hspace{2cm}\=Electroencephalography\\ \textbf{EMG}\>Electromyography\\ +\textbf{fMRI}\>functional Magnetic Resonance Imaging\\ \textbf{LF}\>Low Frequency\\ \textbf{BCI}\> Brain-Computer-Interface \\ \textbf{SVM}\> Support-Vector-Machine \\