Riccardo Poli, and
School of Computer Science and Electronic Engineering
University of Essex
CO4 3SQ, Colchester, UK
Brain-Computer Interface (BCI) systems measure specific (intentionally and unintentionally induced) signals of brain activity and translate them into device control signals (see  for a comprehensive review).
Many factors limit the performance of a BCI system. These include: the natural variability and noise in the brain signals measured; the limitations of the recording and signal processing methods that extract signal features and of the algorithms that translate these features into device commands; the quality of the feedback provided to the user; the lack of motivation, tiredness, limited degree of understanding, age variations, handedness, etc. in users; the natural limitations of the human perceptual system .
In some cases, however, amplitude and shape variations in brain waves may carry information which can be exploited to the benefit of BCI (e.g., see the discussion in ) even if a physiological explanation for such variations is unavailable (see for example the analogue approach used in  where the task of extracting information from amplitude variations was left to an evolutionary algorithm).
In this paper we will document and exploit a source of waveform variations in P300 waves (an endogenous component of EEG event related potentials which is elicited by rare and significant stimuli): namely, their modulation caused by variations in the interval between target-stimulus presentations. We illustrate our ideas using Donchin's speller as a model, although we expect that the benefits of our approach could also be obtained in other P300-based systems.
The paper is organised as follows. In Section II we provide some background on Donchin's speller, on the P300 waves and the factors that may affect their characteristics, and on what is known about how P300 characteristics are affected by the timing of stimulus presentation. In Section III we document the effects that target-to-target interval variations have on the shape of P300s. In Section IV we modify the best Donchin's speller available to date so as to take timing effects into account and show that the new system is superior to the original. We draw some conclusions in Section V.
Among the different approaches used in BCI studies, those based on the P300 event related potential (ERP) offer a relatively high bit rate and require no user training.
ERPs are relatively well defined shape-wise variations to the ongoing EEG elicited by a stimulus and temporally linked to it. ERPs include an exogenous response, due to the primary processing of the stimulus, as well as an endogenous response, which is a reflection of higher cognitive processing induced by the stimulus .
The P300 wave is an endogenous component of ERPs with a latency of about 300 ms which is elicited by rare and/or significant stimuli (visual, auditory, or somatosensory). Effectively, P300 potentials are ERP components whose presence depends on whether or not a user attends a rare, deviant or target stimulus. This is what makes it possible to use them in BCI systems to determine user intentions. For an overview of the cognitive theory and neurophysiological origin of the P300 see .
The characteristics of the P300 component (mainly its amplitude and its latency) vary depending on several factors . Some factors are related to the psychophysical state of the subject , such as food intake, fatigue and drug use. Other factors depend on the physical layout of the stimuli [9,10,11], such as the number of symbols, their size and their relative spacing. Other important factors are related to the sequence of stimuli. For example, several studies have reported that the P300 amplitude increases as target probability decreases (for a review, see ). The P300 amplitude also seems to be positively correlated with the interstimulus interval (ISI) or the stimulus onset asynchrony (SOA) (as reported, among others, in [12,13,14]). Other studies [15,16] object that, although the P300 is clearly affected by target stimulus probability and ISI, each of these factors also changes the average target-to-target interval (TTI), which they hypothesise to be the true factor underlying the P300 amplitude effects attributed to target probability, sequence length, and ISI.
In fact, in addition to being modulated by target probability, the P300 is also sensitive to the order of nontarget and target stimuli since this temporarily modifies target stimulus probabilities. There is a positive correlation between P300 amplitude and the number of nontarget stimuli preceding a target (e.g. [17,18]).
The influence of the sequence preceding a target on the P300 could be partially explained as the result of ``recovery cycle'' limitations in the mechanisms responsible for component generation . Smaller potentials could be produced after a short TTI, because the system has not yet reacquired the necessary resources to produce large ERPs.
Others ascribe the lower amplitudes associated with shorter TTIs (i.e., few nontargets preceding a target stimulus) to an inability to consistently generate a P300, rather than to the production of a P300 with small amplitude . In other words, they attribute the lower amplitude of the averaged ERP to an increase in the percentage of responses to target stimuli that do not show a P300 component, whereas the amplitude of the P300 component (for the responses that do show a P300) would be unaffected.
Back in 1988, Donchin and his student Farwell designed a speller based on the P300 component. The user was presented with a 6 by 6 matrix of characters (see Fig. 1) whose rows and columns were randomly highlighted. The user's task was to focus attention on the chosen character. In every block of 12 flashes (one for each row and column), the 2 flashes containing the desired character represented a rare target stimulus and were therefore able to elicit a P300-like response. By averaging the ERPs related to each row and column and looking for the largest P300 response, it was possible to infer the target character with sufficient accuracy.
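The averaging-and-selection step of the speller can be illustrated with a minimal sketch. It is not the original implementation: the flash coding (codes 0 to 5 for columns, 6 to 11 for rows), the character matrix and the `p300_score` measure (mean amplitude in a sample window) are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def mean_epoch(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Sample-wise average of equally long epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def p300_score(avg: Sequence[float], window: range) -> float:
    """Hypothetical P300 measure: mean amplitude inside a sample window."""
    return sum(avg[i] for i in window) / len(window)


def infer_character(epochs_by_flash: Dict[int, List[List[float]]],
                    matrix: Sequence[str], window: range) -> str:
    """Pick the row and column whose averaged ERP has the largest score.

    Assumption: flash codes 0-5 denote columns and 6-11 denote rows.
    """
    scores = {code: p300_score(mean_epoch(eps), window)
              for code, eps in epochs_by_flash.items()}
    best_col = max(range(0, 6), key=lambda c: scores[c])
    best_row = max(range(6, 12), key=lambda r: scores[r]) - 6
    return matrix[best_row][best_col]


# Toy usage: only column 2 and row 1 (code 7) carry a positive deflection.
matrix = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "56789_"]
flat = [0.0, 0.0, 0.0, 0.0]
bump = [0.0, 1.0, 1.0, 0.0]
toy = {code: [flat, flat] for code in range(12)}
toy[2] = [bump, bump]
toy[7] = [bump, flat]
print(infer_character(toy, matrix, range(1, 3)))
```

In practice the window would be chosen around the expected P300 latency, and more repetitions per row and column would be averaged to raise the signal-to-noise ratio.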
To test whether TTI-modulated P300 variability also occurs with Donchin's speller, we studied the training set of the two subjects of dataset II from the BCI competition III .
In the competition, data were collected with the Donchin speller protocol described above, using a SOA of 175 ms. For each subject, the training set consisted of 85 characters, each one containing 15 sequences of 12 intensifications. Further details on the data can be found in .
The signals were further bandpass-filtered in the band 0.15 - 5 Hz (HPF: 1600-tap FIR; LPF: 960-tap FIR) to reduce exogenous components at 5.7 Hz (the flash rate implied by the 175 ms SOA) and its multiples. The 1-second epochs following each flash were then extracted. Therefore, for each subject, a total of 15300 ($85 \times 15 \times 12$) epochs were available, of which 2550 ($85 \times 15 \times 2$) were targets.
The set of epochs was partitioned according to the number of nontargets presented between the previous target and the current epoch. For example, for the sequence ...TNNNT..., the second target (T) is assigned to partition 3 because it is preceded by three nontargets (N). Then, for each partition, average target and nontarget responses were determined as follows.
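The partitioning rule can be sketched as follows. The function names and the ("t" or "n", count) partition keys are illustrative assumptions, as is the choice to skip epochs that occur before any target has been seen.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def nontargets_since_last_target(is_target: Sequence[bool]) -> List[Optional[int]]:
    """For each flash, count the nontargets since the previous target.

    Returns None for flashes that occur before any target has been seen.
    """
    counts: List[Optional[int]] = []
    run: Optional[int] = None
    for flag in is_target:
        counts.append(run)
        if flag:
            run = 0
        elif run is not None:
            run += 1
    return counts


def partition_epochs(is_target: Sequence[bool]) -> Dict[Tuple[str, int], List[int]]:
    """Group epoch indices by class ('t' or 'n') and preceding-nontarget count."""
    parts: Dict[Tuple[str, int], List[int]] = defaultdict(list)
    counts = nontargets_since_last_target(is_target)
    for idx, (flag, n) in enumerate(zip(is_target, counts)):
        if n is not None:  # assumption: skip epochs with no earlier target
            parts[("t" if flag else "n", n)].append(idx)
    return dict(parts)


# The sequence T N N N T: the second target falls in partition 3.
print(partition_epochs([True, False, False, False, True]))
```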
First, outlier epochs were removed. For each set of epochs, the first, $q_1(t)$, and third, $q_3(t)$, quartiles at each sample, $t$, were found. Then an acceptance ``strip'' was defined as the time-varying interval $[q_1(t) - k\,\Delta q(t),\; q_3(t) + k\,\Delta q(t)]$ for a fixed multiplier $k$, where $\Delta q(t) = q_3(t) - q_1(t)$ is the interquartile range. Responses falling outside the acceptance strip for more than one tenth of the epoch were rejected. Finally, the mean and the standard error of the responses for each class were evaluated using the remaining epochs.
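A minimal sketch of this quartile-based rejection step is given below. The strip half-width is an assumption: the multiplier `k = 1.5` (the conventional choice for interquartile-range fences) and the "inclusive" quantile method are illustrative defaults, not values confirmed by the text.

```python
import statistics
from typing import List, Sequence, Tuple


def reject_outliers(epochs: Sequence[Sequence[float]],
                    k: float = 1.5,            # assumed multiplier
                    max_fraction: float = 0.1) -> List[List[float]]:
    """Drop epochs lying outside the acceptance strip for too many samples."""
    lower, upper = [], []
    for column in zip(*epochs):                # one column per time sample
        q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1
        lower.append(q1 - k * iqr)
        upper.append(q3 + k * iqr)
    kept = []
    for epoch in epochs:
        outside = sum(1 for x, lo, hi in zip(epoch, lower, upper)
                      if x < lo or x > hi)
        if outside <= max_fraction * len(epoch):
            kept.append(list(epoch))
    return kept


def mean_and_sem(epochs: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Sample-wise mean and standard error of the mean."""
    n = len(epochs)
    means = [statistics.fmean(col) for col in zip(*epochs)]
    sems = [statistics.stdev(col) / n ** 0.5 for col in zip(*epochs)]
    return means, sems


# Toy usage: nine well-behaved epochs plus one large-amplitude artifact.
epochs = [[float(j)] * 10 for j in range(9)] + [[100.0] * 10]
clean = reject_outliers(epochs)
means, sems = mean_and_sem(clean)
print(len(clean), means[0])
```

Because the quartiles are computed per sample, the strip follows the shape of the typical response, so an epoch is penalised only where it departs from the bulk of the other epochs.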
Fig. 2 shows the average responses obtained from the epoch-partitioning and artifact-rejection procedure described above. The results confirm that despite Donchin's speller being characterised by fast SOA, significant modulations of the P300 amplitude due to TTI variations are present.
The most significant effect is visible around the peak of ``t'', the average P300, which occurs at about 450 ms for subject A and 350 ms for subject B. It is not surprising that in the average response of the partition ``t00'' (two targets in a row), the P300 is almost completely absent, as the previous P300 (approximately 175 ms before) has not yet faded away. Similar considerations apply to ``t01''.
The other averages show an increase in the P300 amplitude for increasing number of nontargets separating the flash from the previous target (proportional to the TTI).
Our next step was to test whether knowledge of the effects of the target-to-target-interval on the P300 can be exploited to build better classifiers. Instead of building a new full-blown ad-hoc classifier, we decided to test first whether a thin layer built on top of a high-performing existing approach could improve the performance.
We borrowed from the work of Rakotomamonjy and Guigue , which won BCI Competition III for the Donchin speller. Namely, we used the approach they called ``Ensemble SVM without channel selection'' because it is easier to implement and outperforms other alternatives when using only 5 sequences to classify a character.
They used an ensemble-of-classifiers approach, in which the data were split into several subsets and a linear support vector machine (SVM) classifier was trained on each of them. The outputs of all classifiers were summed to build the final decision. When using $J$ sequences, the character identified by row $r$ and column $c$ was scored
\[ S_{rc} = \sum_{k} \sum_{j=1}^{J} \left[ f_k\big(x_{j,i_j(r)}\big) + f_k\big(x_{j,i_j(c)}\big) \right] \]
where $f_k$ is the output of the $k$-th classifier, $x_{j,i}$ is a vector with features from the $i$-th epoch of the $j$-th sequence, $i_j(r)$ is a mapping returning the ordinal position of the flash where the row $r$ was target during the $j$-th sequence, and similarly for $i_j(c)$. The character with the highest score was returned.
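The unweighted ensemble score can be sketched as below. The data layout is an assumption made for illustration: `outputs[k][j][i]` holds the output of classifier `k` for the `i`-th flash of sequence `j`, and `position[j][code]` gives the ordinal position at which row or column `code` flashed in sequence `j`.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def score_character(outputs: Sequence[Sequence[Sequence[float]]],
                    position: Sequence[Dict[int, int]],
                    row: int, col: int) -> float:
    """Sum classifier outputs over the flashes of the candidate row and column."""
    total = 0.0
    for f_k in outputs:                       # over classifiers
        for j, f_kj in enumerate(f_k):        # over sequences
            total += f_kj[position[j][row]] + f_kj[position[j][col]]
    return total


def best_character(outputs, position,
                   rows: Iterable[int], cols: Iterable[int]) -> Tuple[int, int]:
    """Return the (row, col) pair with the highest summed score."""
    candidates: List[Tuple[int, int]] = [(r, c) for r in rows for c in cols]
    return max(candidates,
               key=lambda rc: score_character(outputs, position, rc[0], rc[1]))


# Toy usage: 2 classifiers, 2 sequences, columns coded 0-1 and rows 2-3.
position = [{2: 0, 0: 1, 3: 2, 1: 3}, {1: 0, 3: 1, 0: 2, 2: 3}]
one_classifier = [[-1.0, 1.0, 1.0, -1.0], [-1.0, 1.0, 1.0, -1.0]]
outputs = [one_classifier, one_classifier]
print(best_character(outputs, position, rows=[2, 3], cols=[0, 1]))
```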
We decided to start from their approach and change only the scoring function, in order to account for the effect of the number of nontargets preceding each epoch. The following hypotheses were made: the ERP response was considered to be a Gaussian random process whose mean is shifted up around 300 ms poststimulus when the flash is a target; the amount by which the mean increases depends, among many factors, on the number of nontarget stimuli preceding the current one (Fig. 2 can be interpreted as the time course of the mean in the different conditions). The discriminability of each epoch as target or nontarget, or equivalently the reliability of its classification, depends on the distance between the corresponding target and nontarget classes. As a result, the output of the classifier will be less reliable when few nontargets separate the flash from the previous target.
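Using these considerations, the scoring function was changed to
\[ S_{rc} = \sum_{k} \sum_{j=1}^{J} \left[ w_k\big(n_j(i_j(r); r, c)\big)\, f_k\big(x_{j,i_j(r)}\big) + w_k\big(n_j(i_j(c); r, c)\big)\, f_k\big(x_{j,i_j(c)}\big) \right] \]
where $f_k$ is the output of the $k$-th classifier, $x_{j,i}$ is the feature vector of the $i$-th epoch of the $j$-th sequence and $i_j(\cdot)$ returns the ordinal position of the flash of a given row or column in the $j$-th sequence; $w_k(\cdot)$ is a mapping which associates a weight to the number of nontargets preceding a flash, while $n_j(i; r, c)$ is a mapping returning the number of nontargets preceding the $i$-th flash of the $j$-th sequence in the hypothesis that the row $r$ and the column $c$ identified the target character.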
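The weighted score can be illustrated with the following sketch. Several details are assumptions, not taken from the text: the flashes of all sequences are treated as one stream `codes`, counts are capped at the last entry of each weight table, and a flash with no earlier target is given the capped count.

```python
from typing import List, Sequence


def preceding_nontargets(codes: Sequence[int], row: int, col: int,
                         cap: int) -> List[int]:
    """Nontargets since the previous target, under the hypothesis that
    (row, col) is the target character. Counts are capped at `cap`."""
    counts: List[int] = []
    run = cap  # assumption: no earlier target behaves like a long interval
    for code in codes:
        counts.append(min(run, cap))
        run = 0 if code in (row, col) else run + 1
    return counts


def weighted_score(outputs: Sequence[Sequence[float]], codes: Sequence[int],
                   row: int, col: int,
                   weights: Sequence[Sequence[float]]) -> float:
    """outputs[k][i]: output of classifier k for the i-th flash.
    weights[k][n]: weight of classifier k for n preceding nontargets."""
    total = 0.0
    for f_k, w_k in zip(outputs, weights):
        n = preceding_nontargets(codes, row, col, cap=len(w_k) - 1)
        total += sum(w_k[n[i]] * f_k[i]
                     for i, code in enumerate(codes) if code in (row, col))
    return total


# Toy usage: two sequences of four flashes, columns coded 0-1, rows 2-3.
codes = [2, 0, 3, 1, 1, 3, 0, 2]
outputs = [[-1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0]]
print(weighted_score(outputs, codes, 3, 0, weights=[[0.0, 0.5, 1.0, 2.0]]))
```

Note that the counts depend on the hypothesised character, so they must be recomputed for every candidate row and column pair; with all weights equal to one the score reduces to the plain sum of classifier outputs.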
The weighting function was found separately for each classifier, using the part of the training set which was not used to build that particular SVM classifier. A stochastic hill climber was used to maximise the number of correctly identified characters when using 5 sequences. The starting points were in the range and each was normalised to have unit sum.
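A stochastic hill climber over unit-sum weight vectors might look like the sketch below. The starting range (0 to 1), the Gaussian mutation, the step count and the acceptance rule are assumptions; `objective` is a hypothetical stand-in for the number of correctly spelled characters.

```python
import random
from typing import Callable, List, Sequence, Tuple


def normalise(w: Sequence[float]) -> List[float]:
    """Rescale a positive weight vector to unit sum."""
    s = sum(w)
    return [x / s for x in w]


def hill_climb(objective: Callable[[List[float]], float], n_weights: int,
               steps: int = 1000, sigma: float = 0.1,
               seed: int = 0) -> Tuple[List[float], float]:
    """Keep a random perturbation whenever it does not lower the objective."""
    rng = random.Random(seed)
    # Assumed starting range: uniform in [0, 1], then normalised.
    best = normalise([rng.uniform(0.0, 1.0) for _ in range(n_weights)])
    best_value = objective(best)
    for _ in range(steps):
        candidate = normalise([max(1e-9, x + rng.gauss(0.0, sigma))
                               for x in best])
        value = objective(candidate)
        if value >= best_value:  # ties accepted to drift across plateaus
            best, best_value = candidate, value
    return best, best_value


# Toy usage with a smooth surrogate objective peaking at (0.1, 0.2, 0.7).
target = [0.1, 0.2, 0.7]
surrogate = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
weights, value = hill_climb(surrogate, n_weights=3)
print([round(x, 2) for x in weights])
```

Accepting ties is a deliberate choice in this sketch: a count of correct characters is piecewise constant in the weights, so a strictly-greater rule could stall on a plateau.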
Finally, the algorithm was tested on the test set and the results compared with those of the original method in .
Significant improvements arise when 3 to 7 sequences are used, which is a range characterised by a reasonable speed vs accuracy compromise.
In this paper we first document and then exploit, within the context of Donchin's speller, a modulation in the amplitude of the P300 caused by target-to-target interval differences. In particular, we show that by specialising detectors to work with P300s elicited with each TTI, we can consistently improve the performance of the best known classification algorithm for Donchin's speller with minimal changes. In the future we intend to explore the possibility of obtaining similar improvements within other BCI paradigms based on P300s.