Back to the main page.

Bug 2917 - ft_selectdata sorts, channelconnectivity uses this regardless of actual channel order in data

Reported 2015-06-26 14:33:00 +0200
Modified 2019-08-10 12:32:40 +0200
Product: FieldTrip
Component: core
Version: unspecified
Hardware: All
Operating System: All
Importance: P5 major
Assigned to: Tzvetan Popov
Depends on:
See also:

Balázs Knakker - 2015-06-26 14:33:54 +0200

Created attachment 714 Test script based on an ERF timelockstats tutorial demonstrating how channel ordering affects spatial clustering In short, during ft_statistics_montecarlo, accross-channel clustering is not always done according to tha actual channel order in the data. In particular, I looked at ft_timelockanalysis. I copy here the part where data is selected, with some comments. For testing, I used the timelockstats tutorial script and data, where{'MEG'} is used. [varargin{:}] = ft_selectdata(tmpcfg, varargin{:}); % At this point, varargin{:} becomes sorted % alphabetically, even if varargin{:} is given explicitly. % varargin{:}.trial stays consistent with varargin{:}.label, thus % inconsistent with varargin{:} (and, see below.) % restore the provenance information [cfg, varargin{:}] = rollback_provenance(cfg, varargin{:}); % here, varargin{:}.label stays the same, varargin{:}.trial has the % same channel order as it was originally given, consistent with % varargin{:}.label. % however, is sorted % alphabetically! This occurs because ft_selectdata fills % with ABC-sorted list of channels (which becomes % inconsistent with data if input labels were non-alphabetic), and % rollback_provenance copies it into cfg from varargin, and clears it in % varargin. Later on, in ft_statistics_montecarlo, cfg.connectivity being empty and being populated, channelconnectivity is called with one parameter, which causes it to generate the neighbourhood matrix based on Thus, the connectivity matrix will be generated assuming alphabetic channel order, regardless of the actual channel ordering in the data. For testing this, I used the tutorial data and script from The test script I attach runs the original analysis from the tutorial (which has the correct output), and then demonstrates that a) accross-channel clustering fails if labels (and corresponding data) are not in alphabetical order. b) cfg.connectivity stays the same, disregarding shuffled channel order in the data ifself and its labels. Note that for this dataset, randomly shuffled channel order results in clusters that are spatially isolated, but depending on number of channels (especially if fewer) and actual (less random) channel order, less obviously erroneous results can be obtained. I tried to be as thorough as possible and ran several test scenarios (see comments in attached script), but it is of course still possible that I spoiled something. The workaround I applied for the time being was to insert the following line into ft_timelockanalysis, after the selectdata and rollback_provenance part quoted above:{1}.label; PS: I found several bugs that might be related (2069, 2848, 2639, 1707, 2597), but decided to file this separately anyway.

Balázs Knakker - 2015-06-30 10:45:47 +0200

(The version I used for testing was from GitHub, last commit at 19 Jun 14:52 by Robert Oostendveld. 75f84b74bbe3dc9fe66a295c4326add7ba5a714a)

Tzvetan Popov - 2015-06-30 10:55:13 +0200

(In reply to Balázs Knakker from comment #1) Hi Balázs, thanks for reporting this. I can reproduce the problem and the "fix" with a current FT version. We'll look into this. best tzvetan

Tzvetan Popov - 2015-06-30 12:15:24 +0200

The problem seem to originate from the subfunction getselection_chan within ft_selectdata. There on line 631 the labels are defined taking the output of ft_channelselection into account. At this point labels are then alphabetically ordered and used around line 677: = label; Suggested fix: = varargin{1}.label(indx(:,1)); which preserves the channel ordering of the input arguments. @ Eelke, Jörn and Robert do you foresee any problems with this? best tzvetan

Robert Oostenveld - 2015-06-30 17:32:36 +0200

(In reply to Tzvetan Popov from comment #3) There is no guarantee of a single unique channel ordering that is consistent with the data. E.g imagine that data1.label = {'1', '2', '3'}; data1.time = 1:10; data1.avg = randn(3,10); data2.label = {'3', '2', '1'}; data2.time = 1:10; data2.avg = randn(3,10); cfg = []; = {'2', '3'}; [data1s, data2s] = ft_selectdata(cfg, data1, data2); then >> data1s.label ans = '2' '3' >> data2s.label ans = '2' '3' so one is consistent, the other is not. But in general I would favour the order of channels at least for the first input data structure (or "the" data structure in case of a single input) to remain consistent. if you now look at line 637, there you see that the output channel order is consistent with the order in "label". However, on line 631 the channel order is not maintained consistently, which is obvious with >> union({'3', '2', '1'}, {'3', '2', '1'}) ans = '1' '2' '3' >> union({'3', '2', '1'}, {}) ans = '1' '2' '3' I suggest to add the lines [ix, iy] = match_str(varargin{1}.label, label); label = varargin{1}.label(ix) after line 633. This match_str call always returns it in the original order of its first input argument.

Robert Oostenveld - 2015-06-30 17:33:51 +0200

(In reply to Robert Oostenveld from comment #4) oh, and after updating the order.

Tzvetan Popov - 2015-06-30 18:58:24 +0200

(In reply to Robert Oostenveld from comment #5) Ok will do make sense.

Tzvetan Popov - 2015-07-02 09:04:08 +0200

Sending ft_selectdata Transmitting file data. Committed revision 10509

Tzvetan Popov - 2016-02-24 09:22:09 +0100

Sending ft_selectdata Transmitting file data. Committed revision 10509

Robert Oostenveld - 2019-08-10 12:32:40 +0200

This closes a whole series of bugs that have been resolved (either FIXED/WONTFIX/INVALID) for quite some time. If you disagree, please file a new issue on