Last week, I ran some EEG data through an analysis that should find clusters of electrodes that recorded high activity at the same time. In other words, it looks for spatial and temporal adjacency. The cluster that it found looks like this:
That’s a map of all of the electrodes on our EEG nets, and the big blue circles overlay the electrodes that were part of my cluster. Hopefully it is apparent to you that this cluster has a topographically odd feature – the hole in the middle! Clusters don’t (generally) have holes like that – the correlation between adjacent electrodes is pretty high – and I knew from looking at the maps of overall activity that there wasn’t an activity dip in the center of that cluster. The particular electrode making up my hole was even more disconcerting because it is electrode Cz, the one that sits at the center of the top of the head, and when we record our EEG data that electrode is the reference. As data comes off the system, the trace at Cz is always zero, because voltage is always measured relative to a reference point, and Cz measured against itself is zero. The data should have been re-referenced before the clustering algorithm ran, so the worst-case answer to “What went wrong here?” was that the re-referencing had been lost somewhere along the way, and that would be a major bug in an analysis that has already been used in several studies in the lab. Scary.
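A quick way to rule out that worst case is to check whether the reference channel is still flat after re-referencing. The sketch below is only an illustration: it assumes an average reference (the post doesn't say which scheme the lab's pipeline uses), and the channel names, the toy data, and the helper names `is_flat` and `average_reference` are all made up for the example.

```python
def is_flat(trace, tol=1e-12):
    """Return True if every sample in the trace is (numerically) zero."""
    return all(abs(v) <= tol for v in trace)


def average_reference(data):
    """Subtract the across-channel mean from every channel at each sample.

    `data` maps channel name -> list of samples (all the same length).
    This is an assumed re-referencing scheme, not necessarily the lab's.
    """
    names = list(data)
    n_samples = len(data[names[0]])
    means = [sum(data[n][t] for n in names) / len(names) for t in range(n_samples)]
    return {n: [data[n][t] - means[t] for t in range(n_samples)] for n in names}


# Toy recording: Cz is the recording reference, so its raw trace is all zeros.
raw = {"Cz": [0.0, 0.0, 0.0], "Fz": [1.0, 2.0, 3.0], "Pz": [2.0, 1.0, 0.0]}
reref = average_reference(raw)

if is_flat(reref["Cz"]):
    print("Cz is still flat: re-referencing may have been lost")
else:
    print("Cz carries signal after re-referencing")
```

If Cz were still all zeros going into the clustering step, that would point at the lost re-referencing rather than at anything downstream.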
It turned out not to have been that bug, but another issue entirely. The first step in the clustering analysis is to take the spatial layout of electrodes, and define each electrode’s “neighborhood” – which other electrodes should be counted as “adjacent” to the electrode in question. The piece of code that does this takes in a list of names of electrodes you want to define neighborhoods for (in my case, all of them except the two eye-movement channels) and matches that list to a master list of electrode names and their positions. And in that master list, the electrode Cz was listed with its name as Cz’. Turns out, Cz and Cz’ don’t match, if you’re a computer, so Cz was silently left out of every neighborhood, and so never turned up in the final cluster. Fixing the master layout list got me clusters including Cz, and all is well.
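The failure here was silent: a name that didn't match was simply dropped. A minimal sketch of the matching step with a loud check added might look like the following. It is not the lab's actual code; the function name `match_electrodes`, the layout dictionary, and the positions are hypothetical, chosen only to reproduce the Cz versus Cz’ mismatch.

```python
def match_electrodes(requested, master_layout):
    """Split requested electrode names into matched positions and missing names.

    `master_layout` maps electrode name -> (x, y) position. Names are compared
    exactly, so "Cz" and "Cz'" count as different electrodes.
    """
    matched = {name: master_layout[name] for name in requested if name in master_layout}
    missing = [name for name in requested if name not in master_layout]
    return matched, missing


# Hypothetical master layout; note the stray apostrophe in the "Cz'" entry.
master_layout = {"Fz": (0.0, 0.5), "Cz'": (0.0, 0.0), "Pz": (0.0, -0.5)}
requested = ["Fz", "Cz", "Pz"]

matched, missing = match_electrodes(requested, master_layout)
if missing:
    # Report the problem instead of quietly building neighborhoods without it.
    print(f"Warning: no layout entry for {missing}")
```

With a check like this, the missing Cz would have shown up as a warning at the first step instead of as a hole in the final cluster.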
This kind of thing really illustrates the importance of having a human being look at your output, and think about whether the answer the computer is giving is plausible. I’m still working on how to best teach students to bring their critical eyes to this kind of output, and not just assume that the computer is right. Part of it is just developing enough experience to know what the results “should” look like, but in a lot of cases students already have a great deal of knowledge to bring to bear. It’s the same reason that I plot really raw data as often as I can – I’m not necessarily going to publish these graphs, but understanding what’s happening in my data set is essential, both for sanity-checking and to develop a theoretical understanding of what participants, or their brains, are doing.