In a recent article published in this journal [1], Bradberry, Gentili and Contreras-Vidal proposed a particularly simple method for the control of a 2-D mouse cursor, in which the velocity of the cursor was a linear combination of the temporal derivatives of the voltages recorded at 34 EEG channels. Voltages were low-pass filtered with a cut-off frequency of 1 Hz before differentiation.
The display used in that work is shown in Figure 1. Results with 5 subjects were very positive: subjects could hit a target within 15 seconds in 73% of trials, with a median hit time (over the successful trials) of 5.4 seconds. However, we believe that these encouraging results are partly attributable to the evaluation criteria adopted, rather than to the subjects attaining a reasonable level of control of the interface. As we explain in the following section, there are at least two reasons for this.
As shown in Figure 1, the targets used in [1] occupy approximately 40% of the length of the edges of the display. In each trial only one of the targets appeared. The cursor started at the centre of the screen, and the subject was tasked with directing it to the designated target. As soon as the target was hit, the trial ended. The cursor trajectories recorded in successful trials were then rescaled to the same length and averaged to produce plots of mean cursor paths. The success criterion and the averaging technique led us to ask two questions, which we examine in the following two subsections.
As is standard practice in psychology, when evaluating the performance of subjects in a situation such as this, where success could be obtained simply as a result of the cursor randomly drifting within the display, one needs to show that subjects' performance is above chance. This was not done in [1], probably because performance appeared good enough to dispel any such concerns. However, we believe this may have been a mistake.
The question to be addressed is how likely it is that a cursor driven by some sort of random process would hit the target. The training procedure adopted in [1] was sufficiently complex that we cannot reasonably attempt to replicate their experiments exactly. However, we can idealise the system and obtain ballpark figures for the success rate and expected hit times. As we will see, these are comparable to those reported in [1], suggesting that the subjects' real performance may have been inferior to what was reported in the article.
Our idealisation of the system is as follows. We created a simulation in which the display is identical to that in Figure 1, except that targets are simple line segments (with no thickness) occupying 40% of an edge of the display. We also idealised the cursor, making it a point rather than the circle with non-zero diameter used in [1]. To keep the simulation as similar as possible to that in [1], in each trial we gave the cursor a maximum of 1,500 time steps (15 seconds at 100 Hz) to hit the target. Also, if the cursor hit a boundary of the display we zeroed the component of the velocity orthogonal to the boundary, thereby simulating an inelastic collision, as was done in [1].
The cursor movement in our simulation was controlled by acting on the two components of the cursor velocity, which were updated at each time step by a pseudo-random process.
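For illustration, a minimal version of such a simulation can be sketched as follows. The specific velocity process used here (first-order low-pass-filtered Gaussian noise) and all parameter values are illustrative assumptions, not the exact settings used to produce the figures reported below:

```python
import numpy as np

def simulate_trial(rng, target_edge, width=1.0, height=1.0,
                   target_frac=0.4, max_steps=1500,
                   noise_scale=0.02, smoothing=0.9):
    """One trial: a point cursor driven by smoothed Gaussian noise.

    NOTE: the velocity process (first-order low-pass-filtered white
    noise) and its parameters are assumptions for illustration only.
    Returns (hit, steps, trajectory).
    """
    size = np.array([width, height])
    pos = size / 2.0                      # cursor starts at the centre
    vel = np.zeros(2)
    traj = [pos.copy()]
    # Target: the central 40% of one edge; 0/1 = left/right, 2/3 = bottom/top.
    axis = 0 if target_edge in (0, 1) else 1
    edge_val = 0.0 if target_edge in (0, 2) else size[axis]
    other = 1 - axis
    lo = (1 - target_frac) / 2 * size[other]
    hi = (1 + target_frac) / 2 * size[other]
    for step in range(1, max_steps + 1):
        vel = smoothing * vel + (1 - smoothing) * rng.normal(0.0, noise_scale, 2)
        pos = pos + vel
        for k in (0, 1):                  # inelastic collision with the frame:
            if pos[k] < 0.0 or pos[k] > size[k]:
                pos[k] = np.clip(pos[k], 0.0, size[k])
                vel[k] = 0.0              # zero the velocity normal to the boundary
        traj.append(pos.copy())
        if pos[axis] == edge_val and lo <= pos[other] <= hi:
            return True, step, np.array(traj)
    return False, max_steps, np.array(traj)

# Monte Carlo estimate of chance-level performance over many trials.
rng = np.random.default_rng(0)
results = [simulate_trial(rng, int(rng.integers(4))) for _ in range(1000)]
hit_times = [steps for ok, steps, _ in results if ok]
success_rate = len(hit_times) / len(results)
median_time_s = np.median(hit_times) / 100.0   # 100 Hz sampling rate
```

Because the process parameters are free, the success rate obtained from such a sketch can be tuned over a wide range; the point is only that a purely random cursor achieves a non-trivial fraction of hits under the 15-second criterion.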
In the runs with our simulator we obtained a success rate of 71.81%, with a median number of iterations for the successful trials of 551 samples, which corresponds to 5.51 seconds at the 100 Hz sampling rate adopted in [1]. These results match the results reported in [1] and summarised in Section 1 worryingly closely, suggesting that, while the subjects in [1] may have had some degree of control, that control was likely overestimated.
The second issue we need to examine is the averaging of trajectories for the purposes of visualisation. The procedure used in [1] involves taking the trajectories recorded in successful trials, normalising them so that they are all the same length, averaging them and, finally, plotting the resulting averages. We believe that this procedure leads to biased results.
Firstly, even if all trajectories had the same length (thereby making them easier to average), it is clear that by selecting only the successful trajectories one already biases the mean. The mean trajectory thus obtained does not represent the true mean, but the conditional mean (i.e., the mean conditioned on the event that a trajectory succeeded in hitting the target). Because of this, it is not surprising that in [1] all such mean trajectories hit the target, and that they do so approximately in the middle. Indeed, the mean trajectories obtained by averaging 30 trajectories from our pseudo-random simulation show exactly the same behaviour, as illustrated in Figure 3 (bottom). It does not matter which process generates the original trajectories: by definition they all end on the target, and so must their conditional means. Also, because individual trajectories are initially uncorrelated but later become correlated by the fact that they all hit the target, the averages appear convoluted at first but become more directed and smooth as they approach the target, which is exactly what happens in the plots in [1, see their Figure 4].
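The conditional-mean effect is easy to verify numerically. In the toy example below (an unconstrained random walk with illustrative parameters, not our simulator), averaging only the endpoints of successful trajectories necessarily yields a point on the target, close to its middle:

```python
import numpy as np

rng = np.random.default_rng(1)
finals = []
for _ in range(5000):
    pos = np.array([0.5, 0.5])            # start at the centre of a unit display
    for _ in range(1500):
        # Random step, clipped so the point stays inside the display.
        pos = np.clip(pos + rng.normal(0.0, 0.02, 2), 0.0, 1.0)
        # "Success": the point reaches the top edge inside its central 40%.
        if pos[1] == 1.0 and 0.3 <= pos[0] <= 0.7:
            finals.append(pos.copy())
            break

# Conditional mean endpoint over successful trials only.
mean_final = np.mean(finals, axis=0)
```

By construction every retained endpoint lies on the target segment, so the conditional mean endpoint lies on the target as well, roughly at its centre, regardless of how aimless the underlying process is.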
Secondly, there is the question of whether length-normalising trajectories before averaging affects the veracity of the results. We believe it does. The normalisation process causes the longer trajectories, which are the more convoluted ones, to be sub-sampled more heavily than the shorter ones, making them appear smoother than they originally were. The shorter trajectories are smoother to begin with, because they approached the target more directly. So, averaging length-normalised trajectories gives the impression that the trajectories were on average much straighter than they actually were. This is clearly illustrated in Figure 3 (top), which reports the individual successful trajectories that produced the means in Figure 3 (bottom). As one can see, the mean trajectories are not really representative of the successful trials, which effectively occupy the whole display. It would be very hard for anyone to infer which was the target in these trials had we not drawn it in the plots.
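This smoothing effect can also be illustrated with a toy example. Below, random walks of very different lengths are resampled to a common number of points by linear interpolation over arc length (one common normalisation scheme, assumed here for illustration) and then averaged. The path length of the mean trajectory comes out far smaller than the mean of the individual path lengths, i.e., the average looks much straighter than any of its constituents:

```python
import numpy as np

def resample(traj, n=100):
    """Length-normalise a 2-D trajectory to n points by linear
    interpolation over cumulative arc length."""
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(traj, axis=0), axis=1))]
    s = np.linspace(0.0, d[-1], n)
    return np.column_stack([np.interp(s, d, traj[:, k]) for k in range(2)])

def path_len(traj):
    """Total arc length of a trajectory."""
    return np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()

rng = np.random.default_rng(2)
trajs = []
for _ in range(30):
    steps = int(rng.integers(200, 1500))        # trials of very different lengths
    walk = np.cumsum(rng.normal(0.0, 0.01, (steps, 2)), axis=0) + 0.5
    walk[-1] = [0.5, 1.0]                       # force a "successful" endpoint
    trajs.append(resample(walk))

mean_traj = np.mean(trajs, axis=0)              # average the normalised paths
mean_of_lengths = np.mean([path_len(t) for t in trajs])
length_of_mean = path_len(mean_traj)
```

Note that, by the triangle inequality, the mean path can never be longer than the mean of the path lengths, and for uncorrelated trajectories it is dramatically shorter, which is precisely the misleading straightness discussed above.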
What have we demonstrated with the simulations reported in this paper? Certainly, we have not proven that the results reported in [1] are bogus, nor did we conduct these experiments with that purpose in mind. We acknowledge that we adjusted the parameters of our simulation so as to achieve a close match between performance figures and mean trajectories. However, the simulations shown in this paper, and our discussion of the biases implicit in the success criterion and averaging method adopted in [1], indicate that care is needed when evaluating BCI mice and that, perhaps, additional investigation is required to fully evaluate the impact of the results presented in [1].
We would like to thank the UK Engineering and Physical Sciences Research Council (grant EP/F033818/1) for financial support. We would also like to thank Luca Citi for useful comments.