
Bug 2597 - ft_selectdata alphabetizes labels, this leads to wrong leadfield order with ft_prepare_leadfield

Status CLOSED FIXED
Reported 2014-06-02 15:37:00 +0200
Modified 2015-01-12 09:22:19 +0100
Product: FieldTrip
Component: core
Version: unspecified
Hardware: PC
Operating System: Mac OS
Importance: P5 major
Assigned to: Jan-Mathijs Schoffelen

Philipp Ruhnau - 2014-06-02 15:37:16 +0200

Created attachment 630 example dataset

Dear ft developers,


Philipp Ruhnau - 2014-06-02 15:37:54 +0200

Created attachment 631 example script


Philipp Ruhnau - 2014-06-02 15:52:00 +0200

Sorry, I hit the return button and submitted the whole page instead of just uploading the attachment. So here is the whole story:

I recently wanted to create some inverse solutions of auditory ERFs (N1), which were elicited under different conditions. When I looked at one subject's ERFs from single conditions in separate snippets, the ERFs looked perfect (i.e., as they should for a simple sinusoidal sound). However, when I created scripts for the analysis and ran everything on a cluster, the result for the group was horrible (both in terms of the temporal peak and the source distribution, which was now far from auditory). I checked and found that the solutions also differed at the single-subject level between the 'one-subject-one-condition' snippets and the overall analysis.

I think I found the problem, which seems to be a combination of ft_selectdata and ft_prepare_leadfield.

ft_selectdata (which I used in the overall analysis to separate trials from different conditions that I had saved in one file) orders the sensor labels alphabetically, without giving any indication that it does so.

ft_prepare_leadfield has a cfg.channel input whose order it seems to simply ignore. Instead, the output seems to be ordered according to the grad structure, which is also needed as an input.

Clearly, a leadfield applied to data that are ordered differently will result in nonsense. When I reordered the data after ft_selectdata to the default order, my clear ERFs suddenly reappeared at the source level.

Three NBs:
1) The selected data from above can be analyzed/plotted in sensor space and produce nice ERF peaks.
2) If the leadfield contained a label field (or two label fields), it would be possible to check whether they match.
3) As ft_selectdata only reorders the labels but not the grad.label field, this could lead to problems elsewhere.

See my attached test script and dataset to reproduce the alphabetization and to create the leadfield for the two differently ordered data sets (the leadfield seems to be identical, even though it shouldn't be).

I hope I did not just overlook something obvious.

cheers, philipp


Robert Oostenveld - 2014-06-02 16:49:08 +0200

Hi Philipp, thanks for reporting this, don't worry about the previous emails. We will have a look at whether it is a consequence of the "user script" or a bug in the code. At least it sounds like an undesirable feature.


Philipp Ruhnau - 2014-06-03 22:47:36 +0200

Created attachment 632 example script extended


Philipp Ruhnau - 2014-06-03 22:55:18 +0200

Hey all,

I did some more testing, and it turns out that within the FieldTrip pipeline everything works as it should. The problem I described actually emerged in our own analysis, where we applied precomputed filters directly to the data. Because we did not know that the data were ordered differently after ft_selectdata (this is what I think still needs fixing), the filters and the data did not match, which resulted in wrong inverse solutions.

As my new test script shows, FieldTrip seems to check the order of the leadfield/filters (I found a label field in the cfg field of the leadfield; my bet is that this is what gets checked?), and the solutions for both orders are identical.

So I would say this is not a 'bug', but there still needs to be either a warning in ft_selectdata, or the data should be kept in the order of the input. In previous versions of FieldTrip this did not happen, which unfortunately makes our error likely to occur when one updates only irregularly (i.e., losing track of everything that has changed). So I think others could have this problem as well.

Sorry for potentially making this too big.

cheers, philipp
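
For anyone applying precomputed filters outside the FieldTrip pipeline, the following is a minimal MATLAB sketch of the explicit label matching described above. It uses only base MATLAB functions and no FieldTrip calls; the variable names (filterLabel, dataLabel, avg) and the three channel names are hypothetical and chosen only for illustration.

```matlab
% Hypothetical channel order that the precomputed filter was built with
filterLabel = {'MEG0113'; 'MEG0112'; 'MEG0111'};

% Hypothetical data whose channels came back in alphabetical order
dataLabel = sort(filterLabel);
avg       = [1 1; 2 2; 3 3];   % one row per channel, rows follow dataLabel

% Locate each filter channel in the data
[found, idx] = ismember(filterLabel, dataLabel);
if ~all(found)
  error('some filter channels are missing from the data');
end

% Reorder the data rows (and labels) to the channel order of the filter
avgReordered   = avg(idx, :);
labelReordered = dataLabel(idx);
```

After this step the rows of avgReordered follow filterLabel, so a filter with one column per channel in filterLabel order can be multiplied with it safely.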


Jan-Mathijs Schoffelen - 2014-06-04 10:30:36 +0200

I think the reordering of the channel order is an undesirable feature. I came across this exact issue when running statistics on channel level data. Since the input data are passed through ft_selectdata and ft_appendXXX for a bunch of bookkeeping/selection operations on the data, the output now has the channels re-ordered, even when just a single input argument of 'rpt_chan_freq' data is provided. As a user I would expect the order of the channels to be unchanged after running the statistics. Of course some amount of re-ordering may be necessary when multiple input arguments contain either different channels, or have them in a different order. Would it however be an idea to return the channels in the order according to the first input argument, provided each of the inputs has a full set of channels?


Philipp Ruhnau - 2014-06-17 09:27:23 +0200

Dear Robert and JM, I would like to emphasize that the behavior described above created a lot of trouble in our own analyses, because we did not expect it (and, as I said, applied the filters to the data), and because the issue wasn't obvious from the results! Thus, I think this reordering should at least be accompanied by a warning! It would be better, of course, if the data were only reordered when the to-be-appended data contain different channels, i.e., when it is expected. The issue JM described actually scares me a bit, as I do not know how much it can affect. Thus my plea: please at least add a warning to ft_selectdata! best, philipp


Jan-Mathijs Schoffelen - 2014-06-20 13:17:50 +0200

Grrrr, I concur with Philipp. I guess as soon as you do something slightly out of the ordinary, things may go terribly wrong. I came across this when using an estimated noise covariance for which ft_selectdata had shuffled the labels, while I assumed the order of the channels to be the same as the order of the channels in the structure into which I plugged this covariance estimate (in the end of course my bad, because I did not explicitly check). Yet, I find it more and more reasonable for the user to rely on the expectation that the order of channels does not change by e.g. a call to ft_timelockanalysis for a covariance computation...


- 2014-06-25 14:33:02 +0200

Discussed 25/06.

The order of the channels in the first data argument is more important than the order in the cfg. The order of the channels in the N'th data argument is more important than the order in the (N+1)'th.

Note that channel order is represented at more than one location in the data-objects:
- data
- sensor representation
- localsphere head models
- leadfields
- cfg
- neighbours
- layout

Global approach: think of a way of checking the whole code base for whether we ourselves adhere to these laws.

Specifically: fix it in ft_selectdata.

Also, add the laws to the code guidelines.
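
The precedence rule above can be illustrated with a minimal MATLAB sketch. This is not the ft_selectdata implementation; it only shows one way of selecting the channels shared by two inputs while keeping the order of the first input. The label lists are hypothetical, and restricting the output to the channels present in both inputs is an assumption made for this example.

```matlab
% Hypothetical label lists of two data arguments
label1 = {'C3'; 'A1'; 'B2'};         % first data argument
label2 = {'A1'; 'B2'; 'C3'; 'D4'};   % second data argument

% Keep the channels of the first argument that also occur in the second,
% in the order in which they appear in the first argument
keep     = ismember(label1, label2);
outLabel = label1(keep);

% Row indices of the selected channels in each input
[~, idx1] = ismember(outLabel, label1);
[~, idx2] = ismember(outLabel, label2);
```

Note that MATLAB's intersect returns its output sorted unless it is called with the 'stable' option, so a mask built with ismember is one simple way to avoid unintended alphabetization.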


Jan-Mathijs Schoffelen - 2014-06-25 15:00:56 +0200

added documentation on the guidelines page: http://fieldtrip.fcdonders.nl/development/guidelines/code#avoid_changing_the_order_of_the_channels_in_the_data_if_possible


Jan-Mathijs Schoffelen - 2014-06-25 15:17:30 +0200

(In reply to Philipp Ruhnau from comment #2)

Hi Philipp,

We discussed the issue in today's FT meeting and decided to adjust the behavior of ft_selectdata such that the order of the channels is not changed. I think that this change would to a large extent address your concerns.

One thing that you mentioned in NB 2 of your comment 2 is that we might want to add a 'label' field to leadfields, in order to be able to explicitly match the order of the channels in (pre-computed) leadfields with the order of the channels in the numeric data.

One thing that can still go wrong when using precomputed leadfields in the way you do in the example script provided (and which I am actually also using all the time) is that, when providing cfg.channel = data.label before calling ft_prepare_leadfield, the order of the channels in the leadfield will be according to the grad structure. I haven't checked this yet, but I would suspect that the order will only follow data.label when the function is called with a data argument in the input. The way to really pre-empt any user scenario would indeed be to add a label to the leadfield ;-). Let's fix ft_selectdata first and take it from there.


Jan-Mathijs Schoffelen - 2014-06-25 15:27:34 +0200

Created attachment 642 twice the same leadfields but with channel order flipped


Jan-Mathijs Schoffelen - 2014-06-25 15:28:16 +0200

I just ran a quick check (using some CTF data in my Matlab workspace) and created a leadfield twice: once with ft_prepare_leadfield(cfg) (having cfg.channel = data.label) and once with ft_prepare_leadfield(cfg, data) (having cfg.channel = 'MEG'). data.label was obtained with data.label = flipud(ft_channelselection('MEG',grad.label)); (i.e., flipping the order of the channels). In the attached screenshot one can easily see that the channel order in the leadfields is flipped between the two instances, i.e. the order adheres to grad.label when no data argument is provided.


Philipp Ruhnau - 2014-06-25 15:43:13 +0200

(In reply to Jan-Mathijs Schoffelen from comment #11) Dear Jan-Mathijs, thanks for working on this, I really appreciate it. Just one comment on my previous comment NB2: it seems that there is in fact a label field in the leadfield structure (which I found out way later), yet it is called leadfield.cfg.channel ... and indeed that one is ordered after the grad (as is the leadfield), as you showed in your example... I did not check whether leadfield.cfg.channel changes with data input, but I would suspect... best philipp


Jan-Mathijs Schoffelen - 2014-06-25 15:54:59 +0200

(In reply to Philipp Ruhnau from comment #14) Indeed, leadfield.cfg.channel changes with the order of the channels. It sounds as if this information should be ideally represented at a higher level, i.e. leadfield.label, rather than leadfield.cfg.channel
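
Until such a field exists, a user-side check along the following lines might catch a mismatch before a precomputed leadfield is combined with data. This is only a sketch in base MATLAB: the two structures are hypothetical stand-ins built by hand, and the only field names taken from this thread are leadfield.cfg.channel and data.label.

```matlab
% Hypothetical stand-ins for a precomputed leadfield and a data structure
leadfield.cfg.channel = {'MEG0111'; 'MEG0112'; 'MEG0113'};
data.label            = {'MEG0113'; 'MEG0112'; 'MEG0111'};

% Same set of channels?  Same order?
sameSet   = isempty(setxor(leadfield.cfg.channel, data.label));
sameOrder = isequal(leadfield.cfg.channel(:), data.label(:));

if sameSet && ~sameOrder
  warning('leadfield and data contain the same channels in a different order');
end
```

With the example labels above, the channel sets match but the order does not, so the warning is issued.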


Jan-Mathijs Schoffelen - 2014-07-01 13:38:20 +0200

bash-4.1$ svn commit -m "enhancement - output the channels in the same order as the first data input argument, don't reorder" utilities/ft_selectdata.m test/test_bug2597.m
Sending        test/test_bug2597.m
Sending        utilities/ft_selectdata.m
Transmitting file data ..
Committed revision 9682.


Philipp Ruhnau - 2014-07-04 10:07:21 +0200

awesome, thanks JM!


Robert Oostenveld - 2014-07-07 09:35:28 +0200

*** Bug 2639 has been marked as a duplicate of this bug. ***