diff --git a/07_final_assignment/paper/main.tex b/07_final_assignment/paper/main.tex
index c1a1704..6754b20 100644
--- a/07_final_assignment/paper/main.tex
+++ b/07_final_assignment/paper/main.tex
@@ -17,7 +17,8 @@
 \author{R. Geirhos (3827808), K. Grethen (3899962),\\ D.-E. Künstle (3822829), A.-K. Mahlke (3897867), F. Saar (3818590)}
 \affiliation{Linguistics for Cognitive Science Course, University of Tübingen}
-\abstract{We try to simulate the results of a word learning experiments with baboons by \textcite{Grainger245}. To that end we use ndl, which is based on the Rescorla-Wagner learning model. The learning parameters by themselves re not able to make learning slow enough to be coparable to the monkeys, which is why we introduced a random parameter that makes the models take random guesses in 65\% of the trials. That way, we can successfully model the monkeys' performance.}
+\abstract{In \citeyear{Grainger245}, \citeauthor{Grainger245} conducted a word learning experiment with baboons. Interestingly, the monkeys are able to discriminate words from non-words with high accuracy. We simulate this learning experience with the Rescorla-Wagner learning model \parencite{rescorla1972theory}.
+Running 225 parallelized experiments on a cluster, we show that simply applying the Rescorla-Wagner model yields results that even exceed the monkeys' performance: the learning parameters by themselves are not able to slow learning down enough to be comparable to the monkeys. We therefore introduced a random parameter that makes the models take random guesses in 65\% of the trials. That way, we successfully model the monkeys' performance.}
 \lstset{ %
 basicstyle=\footnotesize, % the size of the fonts that are used for the code
@@ -33,7 +34,7 @@
 showtabs=false, % show tabs within strings adding particular underscores
 stepnumber=2, % the step between two line-numbers. If it's 1, each line will be numbered
 tabsize=2, % sets default tabsize to 2 spaces
- title=\lstname % show the filename of files included with \lstinputlisting; also try caption instead of title
+ title=\lstname % show the filename of files included with \lstinputlisting; also try caption instead of title
 }
 \begin{document}
@@ -105,11 +106,13 @@
 \section{Results}
 The number of words learned by the actual monkeys ranged between 87 and 308. With the chosen range for $\alpha$ and $\beta$, we obtained between 275 and 307 learned words, however, it is important to note that we only presented 307 words, so the model reached maximum learning potential. The general accuracy for the real monkeys lay between 71.14\% and 79.81\%, while our accuracies moved between 0.60 and 0.68. Accuracies for word and non-word decisions are similar in both cases. The complete result data is attached in the appendix of this paper.
+As a major finding, the modulated learning parameters have only a small influence on the resulting accuracies, although effects are clearly visible in the GAM fits.
+
 \begin{figure*}[ht]
 \centering
 \includegraphics[width=0.9\textwidth]{../plots/plot_accuracy}
 \caption{
- Top row shows model output accuracies in dependence of modulated alpha and beta.
+ Top row shows model output accuracies as a function of the modulated $\alpha$ and $\beta$.
 Second row visualizes corresponding nonlinear regressions (GAM).
 Accuracy seem to approximate a maximal accuracy with growing alpha, beta parameter.
 Visible in the GAM plot is the small influence of one of the parameters.
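The abstract names two ingredients of the simulation: the Rescorla-Wagner update and a random parameter that forces a guess on 65\% of the trials. A minimal sketch of how these two pieces interact might look as follows; this is purely illustrative and not the authors' implementation (the cue coding, decision threshold, and parameter values below are assumptions).

\begin{lstlisting}[language=Python]
import random

ALPHA = 0.1            # cue salience (one of the swept learning parameters)
BETA = 0.1             # learning rate (the other swept parameter)
LAMBDA = 1.0           # maximum associative strength
P_RANDOM_GUESS = 0.65  # share of trials answered by guessing (from the abstract)

def rw_update(weights, cues, outcome_present):
    """One Rescorla-Wagner step: all active cues are adjusted toward
    (or away from) the 'word' outcome, based on the summed prediction."""
    prediction = sum(weights.get(c, 0.0) for c in cues)
    target = LAMBDA if outcome_present else 0.0
    delta = ALPHA * BETA * (target - prediction)
    for c in cues:
        weights[c] = weights.get(c, 0.0) + delta

def decide_word(weights, cues, threshold=0.5):
    """Lexical decision with a ceiling: guess at random on a fixed share
    of trials, otherwise respond 'word' if the summed association is high."""
    if random.random() < P_RANDOM_GUESS:
        return random.choice([True, False])
    return sum(weights.get(c, 0.0) for c in cues) > threshold

# Toy usage with hypothetical letter-bigram cues for a single item.
weights = {}
cues = ["#k", "ki", "it", "te", "e#"]
for _ in range(20):
    rw_update(weights, cues, outcome_present=True)
print(decide_word(weights, cues))
\end{lstlisting}

The guessing step caps achievable accuracy well below 100\%, which is what keeps the model's performance in the range of the baboons despite the fast Rescorla-Wagner learning.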
@@ -134,17 +137,14 @@
 \section{Discussion}
 We meticulously simulated the learning experience that monkeys were exposed to in an experiment by \textcite{Grainger245} by systematically exploring the parameter space of Rescorla-Wagner equations \parencite{rescorla1972theory}.\\
-Since preliminary results indicated that without restricting the model performance to a ceiling, the model performs way too accurate compared to the original monkeys, we introduced a random choice in some cases.
+Since preliminary results indicated that, without restricting the model performance to a ceiling, the model performs far too accurately compared to the original monkeys, we introduced a random choice in some cases. We were thereby able to model the monkeys' performance and to show that the influence of the $\alpha$ and $\beta$ values was surprisingly small.
-%TODO: hwa Robert
+As a limitation, we have to note that our definition of a word being learned is not perfect: a word is defined as learned the moment it reaches 80\% recognition accuracy. We would expect this definition to become problematic when a word was `almost' learned but did not quite reach the 80\%. In the next block containing that word, learning would be much quicker than for an actually new word. It might therefore be a good idea to monitor and save the knowledge level of each specific word and to measure the actual number of repetitions a word needed to become known.
-We were also slightly unhappy with the definition of a word being learned, which was when the word had 80\% accuracy of recognition. We would expect this definition to become problematic when a word was 'almost' learned, but not quite reaching the 80\%. In the next block with that word, the learning would be a lot quicker than for an actually new word. It might be a good idea to monitor and save the knowledge level concerning one specific word an measuring the actual number of repetitions a word needed to become known.
+Concerning the code, we decided to write it as clearly as possible rather than as fast as possible. We therefore assume that the overall runtime can be reduced considerably. That would enable future researchers to re-run the experiments with more words, to see whether there are changes in the later learning process which we could not explore here. The mode of presentation could be reassessed, as well as whether the number of letters changes the behavior of the model.
-Concerning our code, there are a few measurements that could be taken to improve it, too. As mentioned above, we parallelized the process because it would have otherwise taken far too long to calculate. It would be very interesting to look into ways to make the program run even faster, therefore enabling more trials to be run and therefore resulting in more data and exacter results.
+Furthermore, different models could be used in the experiment, to see whether other models fit the results of the actual monkeys even better. It would also be interesting to explore the influence of different values for $\lambda$: although this would no longer be a simulation of the original experiment, one would be able to explore the parameter space not only in 2D ($\alpha$ and $\beta$) but in 3D. A lot remains to be explored!
-Shortening running times would also make it possible to re-run the program with more words to see if there are changes in the later learning process which we now could not explore due to lack of words.
-The mode of presentation could be reassessed, as well as whether the number of letters changes the behaviour of the model.
-
-Lastly, of course, different models could be used in the experiment, to see if other models fit the results of the actual monkeys better.
 \newpage
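The Discussion's suggestion to monitor the per-word knowledge level and count the repetitions a word actually needs to reach the 80\% criterion could be prototyped with a small bookkeeping class along the following lines. This is a sketch under assumed names (it is not part of the existing code base), and the minimum-presentation guard is an added assumption.

\begin{lstlisting}[language=Python]
from collections import defaultdict

LEARNED_CRITERION = 0.80  # recognition accuracy at which a word counts as learned
MIN_PRESENTATIONS = 5     # assumption: require a few repetitions before the criterion applies

class WordTracker:
    """Keeps a running per-word accuracy so that a word that was 'almost'
    learned in one block is not treated like a completely new word later."""

    def __init__(self):
        self.correct = defaultdict(int)        # correct responses per word
        self.presentations = defaultdict(int)  # total presentations per word
        self.learned_after = {}                # repetitions needed to reach criterion

    def record(self, word, was_correct):
        self.presentations[word] += 1
        self.correct[word] += int(was_correct)
        if (word not in self.learned_after
                and self.presentations[word] >= MIN_PRESENTATIONS
                and self.knowledge_level(word) >= LEARNED_CRITERION):
            self.learned_after[word] = self.presentations[word]

    def knowledge_level(self, word):
        """Running accuracy for one word -- the 'knowledge level' to monitor."""
        n = self.presentations[word]
        return self.correct[word] / n if n else 0.0
\end{lstlisting}

With such a record, the number of repetitions each word needed could be read off directly from the per-word history instead of being approximated through the block-wise 80\% criterion.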