diff --git a/07_final_assignment/paper/main.tex b/07_final_assignment/paper/main.tex index 6754b20..c2117d1 100644 --- a/07_final_assignment/paper/main.tex +++ b/07_final_assignment/paper/main.tex @@ -17,7 +17,7 @@ \author{R. Geirhos (3827808), K. Grethen (3899962),\\ D.-E. Künstle (3822829), A.-K. Mahlke (3897867), F. Saar (3818590)} \affiliation{Linguistics for Cognitive Science Course, University of Tübingen} -\abstract{In \citeyear{Grainger245}, \citeauthor{Grainger245} conducted a word learning experiments with baboons. Interestingly, monkeys are able to discriminate words from non-words with high accuracies. We simulate the learning experience with the Rescorla-Wagner learning model %TODO cite properly . +\abstract{In \citeyear{Grainger245}, \citeauthor{Grainger245} conducted a word learning experiment with baboons. Interestingly, monkeys are able to discriminate words from non-words with high accuracy. We simulate the learning experience with the Rescorla-Wagner learning model \parencite{rescorla1972theory}. Running 225 parallelized experiments on a cluster, we show that it is possible to obtain even better results simply by applying the Rescorla-Wagner model; the learning parameters by themselves cannot make learning slow enough to be comparable to the monkeys' learning. We therefore introduced a random parameter that makes the model take random guesses in 65\% of the trials. That way, we successfully model the monkeys' performance.} \lstset{ % @@ -64,7 +64,7 @@ \section{Simulations} \subsection{Stimuli} -For stimuli we used the words given in the supplemetary material of the original paper. The list contained 307 four-letter words and 7832 non-words, each consisting of four letters. In every trial, the word or non-word was presented split into overlapping trigrams (for example for the word atom: \#at, ato, tom, om\#), one trigram after the other, as proposed by Baayen et al. (2016).
%TODO cite properly +For stimuli we used the words given in the supplementary material of the original paper. The list contained 307 four-letter words and 7832 non-words, each consisting of four letters. In every trial, the word or non-word was presented split into overlapping trigrams (for example for the word atom: \#at, ato, tom, om\#), one trigram after the other, as proposed by Baayen et al. (2016). %TODO cite properly - is it baayen2016comprehension ? \subsection{Experimental Code} @@ -93,7 +93,7 @@ \subsubsection{Random Parameter} The random parameter $ r $ was set to 0.65, which proved to be a reasonable value in preliminary experiments. That means that in 65\% of the trials the monkey would guess either word or nonword with equal probability. Therefore, the maximum possible performance $ p_{max} $ is: $$ p_{max} = 1 - \frac{r}{2} = 0.675$$ In other words, the maximum possible performance is no longer 1.0 (for a very intelligent monkey) but rather restricted by $ r $. If a monkey's performance is slightly better than $ p_{max} $, this must be due to chance. -\subsubsection{Alpha and Beta} %TODO alpha and beta are important - we have to explain their meaning in a sentence - see Lambda, that's really good. +\subsubsection{Alpha and Beta} %TODO alpha and beta are important - we have to explain their meaning in a sentence - see Lambda, that looks excellent. Both $ \alpha $ and $ \beta $ were our independent variables, which we manipulated over the course of the experiments. We gathered data for every possible combination of $ \alpha $ and $ \beta $ values within an equally spaced range from 0.0 to 0.3. A total of 15 values for each $ \alpha $ and $ \beta $ were combined to $ 15 \times 15 = 225 $ possible combinations.
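The parameter grid described above can be sketched as follows (a minimal illustration; the variable names and the use of NumPy are our assumptions, not the paper's actual code):

```python
import itertools
import numpy as np

# 15 equally spaced values for each of alpha and beta in [0.0, 0.3]
values = np.linspace(0.0, 0.3, 15)

# full Cartesian product: 15 * 15 = 225 (alpha, beta) combinations
full_grid = list(itertools.product(values, repeat=2))
print(len(full_grid))  # 225

# since alpha * beta is commutative, unordered pairs suffice,
# leaving sum_{i=1}^{15} i = 120 combinations to compute
reduced_grid = list(itertools.combinations_with_replacement(values, 2))
print(len(reduced_grid))  # 120
```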
Since $ \alpha $ and $ \beta $ were internally multiplied to a single value, we expected the outcome to be more or less symmetrical due to the commutativity of multiplication and therefore calculated each unordered combination of $ \alpha $ and $ \beta $ only once, which reduced the overall runtime. Therefore, $\sum_{i=1}^{15}i = 120$ combinations remained to be explored. \subsubsection{Lambda} The independent variable $\lambda$ represents the maximum activation in the Rescorla-Wagner model and therefore limits the learning. @@ -106,7 +106,7 @@ \section{Results} The number of words learned by the actual monkeys ranged between 87 and 308. With the chosen range for $\alpha$ and $\beta$, we obtained between 275 and 307 learned words; however, it is important to note that we only presented 307 words, so the model reached its maximum learning potential. The general accuracy for the real monkeys lay between 71.14\% and 79.81\%, while our accuracies ranged between 60\% and 68\%. Accuracies for word and non-word decisions are similar in both cases. The complete result data is attached in the appendix of this paper. -%TODO we need a section explaining the results of the plots. What does that mean? -> small influence of parameters as a major finding, however there ARE effects -> perhaps explain that along with the GAM +%TODO we need a section explaining the results of the plots. What does that mean? -> small influence of parameters as a major finding, however there ARE effects -> perhaps explain that along with the GAM. I think it is crucial that we explain our findings (= our contribution): We explored the whole parameter space (which others probably couldn't), and we found this and that influence. \begin{figure*}[ht] \centering @@ -143,7 +143,7 @@ Concerning the code, we decided to write it as clearly as possible rather than as fast as possible. We therefore assume that the overall runtime can be improved quite a bit.
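For concreteness, one trial of the simulation described in the paper could be sketched like this (a hypothetical reconstruction under our reading of the text; the decision threshold, function names, and cue extraction details are our assumptions, not the authors' implementation):

```python
import random
from collections import defaultdict

def trigram_cues(word):
    """Split a four-letter string into overlapping trigrams with
    boundary markers, e.g. 'atom' -> ['#at', 'ato', 'tom', 'om#']."""
    padded = '#' + word + '#'
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def run_trial(weights, item, is_word, alpha, beta, lam, r=0.65):
    """One Rescorla-Wagner trial with random-guess parameter r.

    Returns True if the simulated monkey answered correctly.
    """
    cues = trigram_cues(item)
    # total activation of the 'word' outcome given the present cues
    activation = sum(weights[c] for c in cues)
    if random.random() < r:
        # with probability r the simulated monkey guesses at random
        response = random.random() < 0.5
    else:
        # otherwise respond 'word' if activation exceeds half of lambda
        # (an assumed threshold, not specified in the paper)
        response = activation > lam / 2
    # Rescorla-Wagner update: delta_V = alpha * beta * (target - sum(V))
    target = lam if is_word else 0.0
    for c in cues:
        weights[c] += alpha * beta * (target - activation)
    return response == is_word

weights = defaultdict(float)
correct = run_trial(weights, 'atom', True, alpha=0.1, beta=0.1, lam=1.0)
```

With empty weights, the first update raises each of the four trigram weights by $\alpha \beta (\lambda - 0) = 0.01$; repeated word trials drive the summed activation toward $\lambda$, which is why $\lambda$ limits the learning.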
That would enable future researchers to re-run the experiments with more words to see if there are changes in the later learning process which we could not explore here. The mode of presentation could be reassessed, as well as whether the number of letters changes the behavior of the model. -Furthermore different models could be used in the experiment, to see if other models fit the results of the actual monkeys even better. It would also be interesting to explore the influence of different values for $ \lambda $: Although this wouldn't be a simulation of the original experiment anymore, one would be able to explore the parameter space not only in 2D ($ \alpha $ and $ \beta $) but in 3D. A lot remains to be explored! +Furthermore, different models could be used in the experiment to see if other models fit the results of the actual monkeys even better. It would also be interesting to explore the influence of different values for $ \lambda $: Although this would go beyond a simulation of the original experiment and therefore fulfill a different purpose, one would be able to explore the parameter space not only in 2D ($ \alpha $ and $ \beta $) but in 3D. A lot remains to be discovered! \newpage @@ -154,7 +154,7 @@ \onecolumn -\section{Complete Results} +\section{Complete Results} %TODO perhaps format this in a nicer way... doesn't look amazing. Here are the complete results of our experiments. The abbreviations used are: \begin{APAitemize} \item \#Trials: Number of trials