
Bug 1370 - Buffer crashes or does not keep up with the load from time to time

Status CLOSED FIXED
Reported 2012-03-13 14:39:00 +0100
Modified 2012-08-23 10:42:09 +0200
Product: FieldTrip
Component: core
Version: unspecified
Hardware: PC
Operating System: Windows
Importance: P3 major
Assigned to: Boris Reuderink
URL:
Tags:
Depends on:
Blocks:
See also:

Jörn M. Horschig - 2012-03-13 14:39:05 +0100

Created attachment 238
Soheila's test script

Although Robert and I already talked about this, there was a lady on the mailing list reporting the same problem in their lab. This suggests that the problem is not specifically caused by our (computer) hardware. In short, the problem is that every once in a while the "buffer cannot keep up with the load". It might be a problem with the shared memory, because (at least in our lab) the problem occurs more frequently when the acquisition computer has already been running for some time. On the mailing list, Soheila reported the same problem and also provided an example script that crashes fairly early in their lab.

Her original mail:

Dear Jörn,

Thanks for your answer. I use a CTF/VSM MedTech system with 364 channels and a sampling rate of 1200 Hz. The decimation rate is 1. As I saw in my recordings, there are two causes of crashes in the MATLAB session. The first is related to transferring data, and it rarely occurs when the system is freshly restarted. The second is related to reading data from the buffer: when the number of samples to be read is high, we may see this problem.

I wanted to check the timing of the MEG system data stream, so I used an m-file similar to the one published on the FieldTrip website (http://fieldtrip.fcdonders.nl/example/measuring_the_timing_delay_and_jitter_for_a_real-time_application). I attach the m-file to this email. When I run it step by step there is no problem, but when I run the whole file, MATLAB crashes exactly when "i=2", on the line that reads the data (dat = buffer('get_dat', [s(i-1),s(i)],host,port); ). The problem occurs even if I restart the computer just before running this file.

On the other hand, although we do not see any MATLAB crash during data transfer when the computer is freshly restarted, we sometimes see the message "Internal Converter thread does not keep up with load" in the acq2ft terminal on the acquisition system. This problem depends on the rate at which data is sent, which is determined by the number of channels and the sampling rate: with a higher sampling rate or more channels, we see this message more often. When this message appears on the screen, no data is transferred to the FieldTrip buffer and we cannot read data on the other computer.

I would appreciate it if you could help me solve both of these problems, the MATLAB crash and the error in data transfer.

All the best,
Soheila
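
For a rough sense of the load described in the mail, the raw stream rate follows from the channel count and sampling rate. The sketch below assumes 4 bytes per sample (an assumption, the mail does not state the sample format) and ignores packet headers and network overhead; the function name is only illustrative.

```python
def stream_rate_bytes_per_s(n_channels, sampling_rate_hz, bytes_per_sample=4):
    """Raw payload rate of the stream, ignoring headers and network overhead.

    bytes_per_sample=4 is an assumption (32-bit samples), not taken from the report.
    """
    return n_channels * sampling_rate_hz * bytes_per_sample


# Numbers from the mail: 364 channels sampled at 1200 Hz.
rate = stream_rate_bytes_per_s(364, 1200)
print(f"{rate} bytes/s = {rate / 1e6:.2f} MB/s = {rate * 8 / 1e6:.1f} Mbit/s")
```

Under that assumption the payload is below 2 MB/s, so the sustained rate by itself is modest; short stalls in forwarding the data are a more plausible trigger than raw bandwidth.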


Boris Reuderink - 2012-03-28 14:55:44 +0200

I just met with Jörn. Here is a quick summary for future reference: The problem seems to appear with CTF acquisition, specifically at acq2ftx.c:330, when the data is put in a FieldTrip buffer on a remote computer; when the ft-buffer is run locally, or remotely over a cross-link cable, the error is much less likely. My hunch is this: the CTF software writes to a shared memory region, whose contents are quickly transported to the ft-buffer. When the transport happens too slowly, acq2ftx detects that data in the shared memory region has been lost. It is difficult to test this problem, since the scanner is fully booked. Luckily, Soheila has a work-around. Wouter is also starting pilots soon and might use a setup similar to the one causing problems; this might enable debugging.
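
The hunch above can be illustrated with a small simulation. This is only a sketch of the general mechanism (a writer that never waits for the reader, and a reader that notices overwritten packets), not the actual acq2ftx code; the class names, slot count and rates are all made up for illustration.

```python
class SharedRing:
    """Fixed number of slots; the writer never waits for the reader."""

    def __init__(self, n_slots):
        self.slots = [None] * n_slots
        self.written = 0  # total number of packets written so far

    def write(self, payload):
        # Overwrites the oldest slot once the ring is full.
        self.slots[self.written % len(self.slots)] = (self.written, payload)
        self.written += 1


class Reader:
    """Forwards packets and counts those overwritten before they were read."""

    def __init__(self, ring):
        self.ring = ring
        self.next_seq = 0  # sequence number of the next packet to forward
        self.lost = 0

    def read_available(self, max_packets):
        n_slots = len(self.ring.slots)
        oldest = max(0, self.ring.written - n_slots)
        if self.next_seq < oldest:
            # The writer lapped us: these packets are gone.
            self.lost += oldest - self.next_seq
            self.next_seq = oldest
        out = []
        while self.next_seq < self.ring.written and len(out) < max_packets:
            seq, payload = self.ring.slots[self.next_seq % n_slots]
            assert seq == self.next_seq
            out.append(payload)
            self.next_seq += 1
        return out


def simulate(n_slots, writes_per_tick, reads_per_tick, n_ticks):
    """Return (forwarded, lost) packet counts after n_ticks."""
    ring = SharedRing(n_slots)
    reader = Reader(ring)
    forwarded = 0
    for _ in range(n_ticks):
        for _ in range(writes_per_tick):
            ring.write(b"packet")
        forwarded += len(reader.read_available(reads_per_tick))
    return forwarded, reader.lost


print(simulate(n_slots=8, writes_per_tick=2, reads_per_tick=2, n_ticks=100))
print(simulate(n_slots=8, writes_per_tick=3, reads_per_tick=2, n_ticks=100))
```

When the reader keeps pace, nothing is lost. As soon as forwarding is slower than writing, the backlog fills the ring and every further tick loses data, which would match a slow remote connection triggering the error more often than a local buffer.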


Boris Reuderink - 2012-04-16 14:53:47 +0200

Today we tested a few setups in the MEG lab. It proved quite difficult to crash acq2ftx. One thing that led to crashes was reading a short trial repeatedly from the ft-buffer, at maximum speed. The fact that the other processes (on different computers) freeze while communicating with the ft-buffer hints at a locking problem --- otherwise the buffer would keep running.
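
To make the locking argument concrete: if a request handler ever returns without releasing the lock that guards the buffer, every later client blocks on that lock while the server process itself stays alive. The sketch below is purely hypothetical and does not reproduce the ft-buffer implementation; `handle_request` and `client_request` are invented names, and the client uses a timeout only so that the demonstration terminates.

```python
import threading

buffer_lock = threading.Lock()  # stands in for the lock guarding the buffer


def handle_request(fail=False):
    """Hypothetical handler with a deliberate bug on the error path."""
    buffer_lock.acquire()
    if fail:
        return "error"  # BUG: early return skips release()
    buffer_lock.release()
    return "ok"


def client_request(timeout_s=0.2):
    """A client that gives up after timeout_s instead of blocking forever."""
    if not buffer_lock.acquire(timeout=timeout_s):
        return "frozen"
    buffer_lock.release()
    return "ok"


results = []
results.append(client_request())   # lock is free
handle_request(fail=True)          # error path leaks the lock
worker = threading.Thread(target=lambda: results.append(client_request()))
worker.start()
worker.join()
print(results)
```

Without the timeout the second client would hang indefinitely, which is the symptom seen from the other computers. Releasing the lock in a `finally` block (or using `with buffer_lock:`) avoids this class of bug.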


Boris Reuderink - 2012-04-16 15:07:33 +0200

Apparently, we used an older version of acq2ftx that still contains bug 933.


Jörn M. Horschig - 2012-04-16 16:39:45 +0200

After replacing the old version, we tried to crash the buffer in the same way we had succeeded a few hours earlier, but this time it did not work. Unfortunately, the error used to appear so irregularly that it is hard to tell whether it is resolved now - Wouter's pilots this week should give a second impression, which hopefully confirms our current beliefs ;)


Jörn M. Horschig - 2012-06-20 15:15:50 +0200

status update: in all pilots from Wouter, there were no crashes anymore. Maybe you can consider this bug as fixed - it can be reopened in case the buffer crashes again some day. (thanks for fixing btw!)


Arjen Stolk - 2012-07-25 15:14:45 +0200

Considering fixed, thanks.