Neuroarchiver Tool

© 2008-2017, Kevan Hashemi, Open Source Instruments Inc.

Contents

Description
Set-Up
Plots
Files
Channel Selection
Recording
Metadata
Playback
Glitch Filter
Overview
Frequency Spectrum
Interval Processing
Batch Processing
Interval Analysis
Calibration
Event Lists
Event Classifier
Batch Classifier
Event Handler
Location Tracking
Message Inspection
Import-Export
Version Changes

Description

Note: This manual applies to Neuroarchiver 119 with LWDAQ 8.4.9+. See Changes for new features if you are using older versions.

The Neuroarchiver is a program written in TclTk. It is available in the Tool Menu of the LWDAQ Program. The Neuroarchiver.tcl source code is in the Tools directory of the LWDAQ distribution. The Neuroarchiver works with the Recorder Instrument and LWDAQ data acquisition hardware such as the Data Receiver (A3018). The Data Receiver records signals from one or more Subcutaneous Transmitters (A3013 or A3019). The Neuroarchiver downloads the signals from the Data Receiver over the Internet with a LWDAQ Driver acting as an intermediary. The Neuroarchiver acquires raw data continuously and saves it to disk. At the same time, the Neuroarchiver can read raw data from disk, display the recorded signals on the screen, and execute user-defined processing and classification. The Neuroarchiver's recording and playback are independent. We can play back data as we record it, or we can play back previously-recorded data while at the same time recording new data. For example recordings that you can play back for yourself in the Neuroarchiver, see our Example Recordings page.


Figure: Neuroarchiver Tool on MacOS.

The original motivation behind the design of the Subcutaneous Transmitter was to detect epileptic seizures in rats. The Neuroarchiver provides automatic event detection for EEG recordings. Event detection consists of seven steps.

  1. Recording: The Recording portion of the Neuroarchiver acquires new transmitter data and records it to disk in NDF files without any alteration. The Recorder acquires data in discrete chunks that span a length of time called the recording interval. A typical recording interval is 0.5 s.
  2. Playback: The Player portion of the Neuroarchiver reads transmitter data from disk, extracts the transmitter signals, calculates their spectra, and displays both on the screen. The Player reads data in discrete chunks that span a length of time called the playback interval. A typical playback interval is 1 s, but the Neuroarchiver supports playback intervals up to 32 s for 512 SPS recordings or 16 s for 1024 SPS recordings. When the Recorder and Player run simultaneously, the Player displays the signal immediately after the Recorder writes the data to disk. When recording and playing simultaneously, the recording interval must be less than the playback interval, or else the Recorder will be unable to keep up with the data generated by the transmitters.
  3. Processing: The Interval Processor is an extension of the Player. It performs user-defined processing of the signals extracted from an NDF archive by the Player. For each playback interval, processing produces a characteristics line, which is a list of numbers and words that summarize the properties of the playback interval for subsequent analysis. The Interval Processor stores these characteristics to disk, creating a characteristics file. Interval Processing is computationally intensive, so we sometimes have to resort to using a cluster of computers to get it done more quickly, in which case we use Batch Processing.
  4. Calibration: The Calibration System helps us account for variations in electrode sensitivity and amplifier gain from one recording to the next. In the ideal experiment, these variations will be small enough that we can assume the same sensitivity for all recordings. But when the recorded amplitude varies by more than a factor of two when stimulated by the same signal, we will benefit from measuring the sensitivity of each channel and integrating this sensitivity into our analysis. The Calibration System assumes that we can identify well-understood intervals in the recording, for which all recordings should show the same amplitude. We call these baseline intervals. The power of these intervals is the baseline power. Assuming such identification is practical, we use the baseline power to normalize the recorded signal power before analysis. The Calibration System provides baseline power variables for all recording channels. It manages the storage and retrieval of baseline power values from the NDF metadata, and it allows us to alter the baseline powers in its control panel.
  5. Tracking: When our recording is provided by an Animal Location Tracker, such as an A3032, each recorded sample is accompanied by detector coil power measurements that allow us to deduce the approximate location of the animal within its cage. The Tracker button opens the Location Tracker window, which plots the locations of the selected transmitters on a grid defined by the detector coils. The location measurements are available to interval processing, so that we could, for example, include in our characteristics files the average location of the animal in each interval, or the distance it moved.
  6. Analysis: The characteristics files produced by processing are the starting point of Interval Analysis. The Event Classifier compares intervals to a library and so detects interesting events by similarity. You open the Event Classifier with the Classifier button. Within the Event Classifier is a further analysis tool for going through existing characteristics files, called the Batch Classifier. We use the Event Classifier and Batch Classifier to perform automatic event detection, such as seizure counting. But other types of analysis, such as obtaining the hourly average power in a particular frequency band, can be performed by programs operating directly upon characteristics files. We tend to use Tcl scripts that run in the LWDAQ Toolmaker.
  7. Examination: Our analysis of a recording will usually produce a list of events. The Event List navigator of the Neuroarchiver allows us to jump to the events in our recordings. Each event is defined by a line of text in the event list file. So long as the event is contained in one of the NDF archives within the Player's directory tree, it will be found and displayed.
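
The processing step above can be sketched in a few lines. The following is illustrative Python, not Neuroarchiver code (the Neuroarchiver runs TclTk processor scripts), and the characteristics format shown, one mean-subtracted power value per channel, is a hypothetical example; the real format is whatever the user's processor script chooses to write:

```python
def interval_characteristics(playtime, signals):
    # signals: dict mapping channel number -> list of sample values in ADC counts.
    # For each channel we compute the power of the mean-subtracted signal, a
    # common starting point for event detection. The output line format here
    # is our own invention: playtime followed by "id:power" pairs.
    fields = ["%.1f" % playtime]
    for id in sorted(signals):
        samples = signals[id]
        mean = sum(samples) / len(samples)
        power = sum((s - mean) ** 2 for s in samples) / len(samples)
        fields.append("%d:%.1f" % (id, power))
    return " ".join(fields)
```

One such line per playback interval, appended to a text file, is the kind of characteristics file that later analysis reads back from disk.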

If we choose a playback interval of 1.0 s with fourteen active channels, reconstruction and display of the signal voltage and spectrum takes about one second on our laptop. If we turn off the signal and transform plots, reconstruction takes about 0.8 s. Processing of the playback interval to produce its characteristics takes tens of milliseconds. When we later apply an event-detection program to the characteristics, each playback interval is represented by ten or twenty numbers, and execution is a fraction of a millisecond. The creation of characteristics files is intensive, because it requires playback of the data. Event detection applied to the characteristics file is fast. By choosing the characteristics well, and keeping the characteristics file on disk, we can avoid re-playing the data.

In the screen shot above, the Recorder is configured to download data in half-second intervals and store them to an archive whose name appears to the right of the Recording Archive label. When we start recording, the Recorder state will be Record. A white background means the Recorder is waiting for the Player to finish its processing. A yellow background means the Recorder is waiting for data from the Data Receiver. With the Create parameter set to 3600, the Recorder will create a new recording file every 3600 seconds. The recording files are in the NDF format, and we call them archives.

As shown in the picture, the Player is reading a different archive from the one that is being recorded. But it could equally well be playing the same archive that is being written in real time by the Recorder. The Activity list shows the transmitter channels present in the playback interval. Each channel number is followed by a colon and the number of messages from this channel in the raw data. The clock channel (channel 0) should always receive 128 messages per second, or 512 messages in a 4-s interval. A transmitter sampling at 512 SPS will provide up to 512 messages in a 1-s interval.

The Processor is enabled. The processing script will be applied to each playback interval and its result printed in the text window. These lines are not currently being saved to disk, however, because the Save button is not checked.

The Value vs. Time plot shows the signal voltages during the playback interval. The plot can be simple, centered, or normalized. The Amplitude vs. Frequency plot shows the spectrum of the signals during the playback interval. We choose which channels will be plotted, transformed, and processed with the Select entry box. The string "1 2 3 78" would select channels 1, 2, 3, and 78 only. An asterisk (*) selects all available channels. We discuss channel selection in more detail below. Even the results of processing, shown in the text window, are restricted to the selected channels.

The Player state is Play. A white background means the Player is waiting for the Recorder to finish acquiring data. A green background means that the Player is analyzing messages. A yellow background means the Player is waiting for data to be added to its archive. The Player will be waiting regularly if it is playing the file that the Recorder is writing, because it must wait until the required data is added. When the orange background appears behind the Player state, the Player is jumping to a new archive or to a new point within an archive.

The Pick buttons allow us to select files named in the adjacent text label. The Recorder Pick button selects an existing archive to which we want to add data. The Recorder PickDir button selects a directory in which to create a new archive. The Player PickDir button selects the top directory of a directory tree in which the Player will search for archives to play back or jump to. The Player Pick button selects an archive for playback.

The Help button offers some basic help, including example processor scripts. The Configure button in the Neuroarchiver opens the Configuration Panel. This panel is an array of all the Neuroarchiver's user-adjustable configuration parameters. In this panel, there is a Save button. When you press Save, the Neuroarchiver saves all its configuration parameters to a file Neuroarchiver_Settings.tcl in the LWDAQ/Tools/Data folder. When you next open the Neuroarchiver, all these settings will once again be loaded into the tool.

Set-Up

For playback of existing archives, run the LWDAQ Software and select the Neuroarchiver from the Tools menu. You can use the Player to go through existing archives, without doing anything with the Recorder. For recording, the Neuroarchiver uses the Recorder Instrument to acquire new data from a Data Receiver, such as the A3018 or A3027.

  1. Start LWDAQ
  2. Open the Recorder Instrument from the Instrument Menu.
  3. Set the Recorder Instrument's daq_ip_addr and daq_driver_socket to point to your Data Receiver. If you are working with an Animal Location Tracker instead of a Data Receiver, set payload_length to match the location tracker messages. If your transmitters belong to a set other than the default set zero (0), enter their set number in the set_num parameter.
  4. Press Acquire to see if you can get some data.
  5. Press Reset and Loop. You are acquiring live data from the data recorder. Look at the signals displayed in the Recorder Panel. Make sure that you have the correct set of transmitters turned on, and that they are all working. When you are satisfied, press Stop.
  6. Close the Recorder Panel.
  7. Open the Neuroarchiver from the Tool Menu.
  8. Press PickDir in the Recorder section and pick a directory for recording archives.
  9. Press Reset. The Recorder state indicator will turn red. The Neuroarchiver is resetting the Data Receiver and creating a new archive file with a name of the form Mx.ndf, where x is a ten-digit UNIX timestamp.
  10. Press Record. You should see the Recorder state indicator flashing yellow.
  11. Press Pick in the Player section of the Neuroarchiver Panel, and select your new archive.
  12. Press Play. The Player state will flash green when it extracts a new interval, and yellow when it is waiting for new data.

Look at your data recorder. The EMPTY light should be flashing regularly. If it is not flashing, your Neuroarchiver is not acquiring data as fast as the data is being recorded. This failure to keep up with the pace of recording can arise in several ways. If your playback interval is less than or equal to the recording interval, the Recorder will never catch up with the Player. The recording interval should be shorter than the playback interval. If your playback interval is several seconds, it could be that your computer is not fast enough to process and plot the signal as fast as it is being recorded.

Once you get the recording and playback working, you can try out various values of recording interval and playback interval. For the most stable operation with up-to-date signal display, the recording interval should be half the playback interval. In stable operation, the Player is waiting for the Recorder to save data to disk. When the data is available, the Player displays it. While it waits, the Player state indicator is yellow.

You can look through previously-recorded archives even while you are recording a new archive. Stop the simultaneous playback and select a new archive. If you want to see an overview of an entire archive, select it in the Player and press the Overview button. If you double-click on the Overview, the Player will find the time you double-clicked and show it to you in detail.

The Player will continue past the end of an archive if you have play_stop_at_end set to zero, which is the default. You will find this parameter in the Configuration Panel. When the Player reaches the end of an archive, it will make a list of all the NDF archives in its directory tree, and find the next file after the current file to continue playback. If you set play_stop_at_end to one, the Player will stop at the end of its file. You specify the Player's directory tree with the Player's PickDir button. Select the top directory in the tree.

Plots

The Neuroarchiver draws two plots during playback. On the left is value versus time, or VT. On the right is amplitude versus frequency, or AF. Each plot has its own Enable check-box. Disable the plots if you want processing to proceed as fast as possible. The VT plot shows the signal during the most recent playback interval. The AF plot shows its frequency spectrum as obtained by a discrete Fourier transform (DFT). The traces in both plots are color-coded by recording channel number.


Figure: Default Color Scheme for Channel Numbers.

By default, the Neuroarchiver uses the color coding shown above, which is the default LWDAQ color scheme for numbered plots. But we can assign colors of our own choosing to particular channel numbers using the Neuroarchiver's color_table string. Each entry in the color_table is a channel number and a color number. The color numbers are defined for all numbers 0 to 255.

Example: The color_table string is by default "{0 0}", just to show us the format of its elements. But if we change it to "{5 7} {9 2} {222 1}", the trace for channel five will have color seven (salmon), the trace for channel nine color two (blue), and the trace for channel two hundred and twenty-two color one (green).
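
To make the mapping concrete, here is an illustrative Python sketch that parses a color_table string of the form shown above into a channel-to-color dictionary. The function name is ours, not the Neuroarchiver's:

```python
import re

def parse_color_table(color_table):
    # color_table is a Tcl-style list of {channel color} pairs,
    # e.g. "{5 7} {9 2} {222 1}". Returns a dict channel -> color number.
    table = {}
    for ch, col in re.findall(r"\{\s*(\d+)\s+(\d+)\s*\}", color_table):
        table[int(ch)] = int(col)
    return table
```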

For a magnified view of either plot, double-click on the plot display and a new window will open up that contains nothing but the plot. The size of this magnified plot is controlled by the vt_view_zoom and af_view_zoom parameters, which you will find in the configuration panel.

Before the Neuroarchiver generates the VT and AF plots, it applies signal reconstruction and glitch filtering to the signal. The glitch filter threshold appears below the plot. We disable the glitch filter by entering 0 for the threshold. We turn off signal reconstruction by setting enable_reconstruct to zero. With reconstruction disabled, missing messages will remain missing.

The VT vertical axis is voltage in units of ADC counts. Each transmitter converts its analog input into a sixteen-bit value. Sixteen-bit values run from 0 to 65535. To convert between ADC counts and voltage at the transmitter input, consult the transmitter manual.

Example: The A3028A amplifier has a gain of ×100 and a battery voltage of around 2.7 V. Its dynamic range is around 23 mV and each ADC count represents 0.41 μV at either input. If we set v_range to 2440 and apply alternating coupling, the height of the display represents 1 mV. With a one-second playback interval, each of the ten horizontal divisions is 100 ms.
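
The count-to-voltage arithmetic in this example can be checked with a short sketch. This is illustrative Python, not Neuroarchiver code; the gain and battery-voltage defaults are the A3028A figures quoted above:

```python
def counts_to_uV(counts, gain=100.0, vbat=2.7):
    # One ADC count spans vbat / 65536 volts at the ADC input.
    # Dividing by the amplifier gain refers the voltage to the
    # transmitter input. Result is in microvolts.
    return counts * (vbat / 65536.0) / gain * 1e6

# With the A3028A numbers, one count is about 0.41 uV at the input,
# so a v_range of 2440 counts spans roughly 1 mV.
```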

There are three ways to scale the VT voltage values. The first is the simple plot, which we select with the SP button. In the simple plot, the v_range value sets the range of the plot from bottom to top in ADC counts, and v_offset sets the voltage at the bottom of the display.

Example: The A3013A transmitter manual has a section called Analog Inputs. Here we see that the gain of the transmitter from analog input to the ADC is 300. The voltage range 0 to 65535 corresponds to voltages 0 V to VBAT at the ADC input. For most of a transmitter's operating life, VBAT is 2.7 V, so each ADC count is 140 nV at the X input. The amplifier AC-couples the X input, placing its average value at 1.8 V. The dynamic range for signals at X is −6 mV to +3 mV. We can deduce the battery voltage, VBAT, by looking at the average value of the signal in the plot. We use the simple plot to display the signal. If A is the average value of X as a fraction of 65535, we have VBAT = 1.8 V / A. From the signal present in archive M1259065886.ndf, we estimate A is 0.64 for channels No1 and No2, 0.70 for channel No14, and 0.8 for channel No6. This implies VBAT of 2.8 V for No1 and No2, 2.6 V for No14, and only 2.2 V for No6. We can compare these voltages to this graph to estimate how much more operating life each transmitter has left.
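
The VBAT estimate can be reproduced with a one-line helper. This is illustrative Python using the 1.8-V coupling point from the example above:

```python
def battery_voltage(avg_fraction, coupling_point=1.8):
    # The AC-coupled input sits at 1.8 V at the ADC input, and the ADC
    # full scale is VBAT, so the average reading A, as a fraction of
    # 65535, satisfies A = 1.8 / VBAT, hence VBAT = 1.8 / A.
    return coupling_point / avg_fraction
```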

The centered plot uses v_range in the same way, but ignores v_offset. The plot of each signal is centered upon the window, so that the average value of the signal is exactly half-way up. The normalized plot ignores both v_range and v_offset and fits the signal exactly into the height of the display.

The horizontal axis in the VT plot is time. The t_min value is the time at the left edge of the interval. The full range from left to right covers the most recent playback interval. This interval is shown in the Interval selector beneath the plots. Note that the playback time, in the Time (s) entry box, is the time at which the next playback interval should begin. During continuous playback, this will be the time at the right edge of the plot.

The AF plot shows the amplitude of the signal's discrete Fourier transform components. The transform amplitude range is zero to a_range, where a_range is in ADC counts. The Player calculates all terms in the discrete Fourier transform and plots those between f_min and f_max. The discrete Fourier transform dictates a particular frequency step from one discrete component to the next. We have f_step = 1/p, where p is the playback interval. The highest frequency component in the transform is at half the transmitter's message frequency. For a 512 SPS transmitter, the highest frequency component in the transform will be 256 Hz. If we set the range of the frequency plot outside the range zero to one half the sampling frequency, the spectrum will be blank. Note that the transform applies to the reconstructed and glitch-filtered signal.

Detail: As we describe in the Recorder Instrument Manual, the reconstructed signal will always contain messages at exactly the transmitter's nominal frequency, regardless of how many messages we lost in reception. We calculate the transform using a fast Fourier transform algorithm. This algorithm requires a perfect power of two number of samples as its input, in order to allow its divide and conquer method to operate with perfect symmetry upon the problem. All our transmitters operate at a frequency that is a perfect power of two, so choosing playback intervals that are power-of-two fractions or multiples of one second will always give us a number of samples that satisfies our algorithm. It is possible to turn off reconstruction in the Neuroarchiver by setting enable_reconstruct to 0. If we turn off reconstruction, the Neuroarchiver will perform an abbreviated reconstruction for the algorithm by adding dummy messages to or subtracting excess messages from the raw message sequence.

Example: With amplitude range 1000 counts, each vertical division is 100 counts. Suppose our sample rate is 512 SPS. We set f_min to 0 Hz and f_max to 256 Hz so that we can see the entire discrete Fourier transform of the 512 samples taken in the 1-s play interval. The frequency step is 1 Hz because the play interval is 1 s. If we switch the play interval to 4 s, the Neuroarchiver will set the frequency step to 0.25 Hz.
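
The relations f_step = 1/p and highest component = half the sample rate are easy to tabulate. An illustrative Python sketch (the function name is ours):

```python
def dft_layout(sample_rate, play_interval):
    # Frequency step between DFT components is 1/p Hz for a play
    # interval of p seconds; the highest component sits at half the
    # sample rate; the transform operates on rate * p samples.
    f_step = 1.0 / play_interval
    f_highest = sample_rate / 2.0
    num_samples = int(sample_rate * play_interval)
    return f_step, f_highest, num_samples
```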

If we click the Log checkbox, the frequency axis will become logarithmic, with lines marking the decades in the traditional fashion.

Detail: We do not provide logarithmic display for the amplitude, although we could easily do so if it were to prove useful. Furthermore, we feel that the logarithmic frequency display is not particularly useful because the Fourier transform components are distributed in uniform frequency steps instead of logarithmic steps.

We describe the generation and adjustment of the frequency spectrum in more detail below.

Files

The Neuroarchiver window displays the names of four files: the recording archive, the playback archive, the processing script, and the event list. We can select these files using the Pick buttons beside each file name. The Recorder creates new archives in a directory you specify with its PickDir button. The Player looks for archives to play or jump to in the directory tree you specify with its PickDir button. You pick a directory and the Player will make a list of all archives in this directory and its sub-directories.

Note: Do not use white spaces in your directory names or file names. You may use underscores or dashes instead.

The Neuroarchiver stores transmitter messages in NDF (Neuroscience Data Format) files. It performs no processing upon the messages as it stores them to disk. What appears in the NDF file is exactly the same sequence of messages that the Data Receiver stored in its memory. Thus we have the raw data on disk, and no information is lost in the storage process. We describe the format of the data in the NDF file in detail in Import-Export.

An NDF file contains a header, a metadata string, and a data block to which we can append data at any time without writing to the header or the metadata string. We define the NDF format in the Images section of the LWDAQ Manual. The Neuroarchiver manipulates NDF files with NDF-handling routines provided by LWDAQ. These routines are declared in LWDAQ's Utils.tcl script. You will find them described in the LWDAQ Command Reference. Their names begin with LWDAQ_ndf_.

All archives created by the Neuroarchiver receive a name of the form px.ndf, where p is the prefix string specified in ndf_prefix and x is a ten-digit number giving the time of the start of the recording. By default, the prefix is the letter "M". The ten-digit number is the standard Unix time: the number of seconds since time 00:00 hours on 1st January 1970, GMT. We get the Unix time in a Tcl script with the command clock seconds. From the name of each file, we can determine the time, to within a second, at which its first clock message occurred. From there we can count clock messages and determine the time at which any other part of the data occurred. The Neuroarchiver's Player Date and Time function, accessible through the Clock button, uses the timestamps buried in archive file names to find intervals corresponding to specified absolute times.
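
We can mimic this naming scheme in a few lines. The sketch below is illustrative Python (the Neuroarchiver itself uses Tcl's clock seconds):

```python
import time

def archive_name(prefix="M", when=None):
    # Archives are named px.ndf, where x is a ten-digit Unix timestamp.
    if when is None:
        when = int(time.time())  # same value Tcl's "clock seconds" returns
    return "%s%010d.ndf" % (prefix, when)

def archive_start_time(name, prefix="M"):
    # Recover the recording start time, to within a second, from the name.
    return int(name[len(prefix):-len(".ndf")])
```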

The Recorder stores data in NDF archives and the Player reads the NDF archives to extract voltages and calculate spectra. When the Player reaches the end of an archive, it looks for a newer archive in the same directory and starts playing that one immediately afterwards. Thus if we are playing the archive that is being recorded, the Player will play the fresh data from the expanding archive until the Recorder starts a new archive, at which time the Player will switch to the new archive automatically. If we are playing old archives, the Player will still move from the end of one to the start of the next, even if the next is unrelated to the first. Thus we can go through a collection of archives that are from different experiments and different times, and apply processing to extract characteristics from all the archives.

The Recorder provides a Header button that opens a text window and allows you to enter a comment or document describing the recordings. This header string will be added to the metadata of every archive created by the Recorder. The Player provides a Metadata button. This button opens up a text window that displays the metadata of the playback archive and allows us to add comments and save the metadata to disk. The comments in an archive's metadata can remind us of what the file contains. The generic names of our archives don't help much when it comes to identifying particular experiments. So the Player provides a List button that allows us to choose files in a single directory whose metadata comments we wish to inspect.


Figure: Playback Archive List. The words with the blue background are buttons we can press with the mouse to step into an archive, view its metadata, or get an overview.

When we press the List button, the Neuroarchiver will ask us to specify one or more files in a single directory. It will open a new window and display these archives with their metadata comments. The list window provides three buttons for each archive: Step, Metadata, and Overview. These allow us to step directly into the start of an archive, edit the metadata, or jump to an overview of its contents.

In addition to NDF files, the Neuroarchiver works with two classes of text files. The first are Processing Scripts. These are TclTk programs that the Player will apply to the signals in each playback interval. The second are Event Lists. These are lists of events detected in recorded signals that the Player uses to navigate between events. We select these files with Pick buttons, but the Neuroarchiver provides no way to edit such files. We assume you have a text editor on your computer.

The Neuroarchiver prints messages to its text window. When a recording, playback, or processing generates a warning or an error, these appear in the text window in blue and red respectively. If we set log_warnings to 1, the Neuroarchiver will write all warnings and errors to a log file. The name of the log file is stored in log_file. By default, the log file is in the Tools/Data directory and is named Neuroarchiver_log.txt. We can change the name of the log file and so place it somewhere else. The warning and error messages all include the current time as a suffix, which is the time at which the Neuroarchiver discovered the problem. The warnings that mention the name of an NDF file contain the playback time at which the problem was encountered.

Channel Selection

The data acquired by the Neuroarchiver takes the form of a list of data recorder messages, as we describe elsewhere. In general, the data will contain values from one or more channel numbers. The Neuroarchiver selects which channels to display, transform, and store to disk using its channel_select parameter.

In its simplest form, channel_select is a single "*" character. With channel_select set to "*", the Neuroarchiver looks through the playback interval data and counts how many messages it contains from each of the possible subcutaneous transmitter channel numbers. If we have more than a certain threshold number of messages from a channel, the Neuroarchiver considers it active. We obtain the activity threshold by multiplying the playback interval by the activity_rate parameter. The Neuroarchiver plots all active channels and lists them in the Player's Activity string. The activity list has format id:qty, where id is the channel number and qty is the number of messages.
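
An illustrative Python sketch of the activity test. The default rate of 20 messages per second here is our own assumption for the sake of the example; the real threshold comes from the Neuroarchiver's activity_rate parameter:

```python
def active_channels(counts, play_interval, activity_rate=20.0):
    # counts: dict channel -> number of messages in the interval.
    # A channel is active when its message count exceeds the threshold
    # play_interval * activity_rate. We skip channel 0, the clock channel,
    # which is not a transmitter.
    threshold = play_interval * activity_rate
    return sorted(id for id, qty in counts.items()
                  if id != 0 and qty > threshold)
```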

We can select particular channels with a specific channel_select string. We can enter "1 2 6 14 217 222" and the Neuroarchiver will attempt to display these channels, even if they have very few messages. We can specify the nominal sampling frequency and scatter extent for each channel. For a description of sampling frequency and scatter see here. If we want to specify the frequency, we do so with two numbers in the form c:f. Thus "5:1024" means channel 5 with sampling frequency 1024 SPS. When the sample rate from one device is 2048 SPS, such as a two-channel transmitter with 1024 SPS on each channel, our reconstruction will be more effective if we specify scatter equal to 4. We specify the scatter with three numbers in the form c:f:s. Thus "5:1024:4 6:1024:4" means channels 5 and 6 have sampling frequency 1024 SPS and scatter extent 4. When the total sample rate from a single device is 4096 SPS, the scatter should be 2, and when it is 8192 SPS, the scatter should be 1. We cannot specify the scatter without specifying the frequency.

If we list the channel numbers on their own, or if we use "*" to specify all channels, the Neuroarchiver uses the default_frequency and default_scatter parameters to determine the sample frequency and scatter. The default scatter contains only one value, which will be applied to all transmitters for which no value has been specified in the channel_select string. This default scatter is 8 unless we change it. The default frequency can be one value, such as "512", or a list of values, such as "512 1024". If it is a list, the Neuroarchiver will try to pick the best match between the data and the frequencies in the list. Thus we can, provided reception is better than 80%, automatically detect and account for transmitters of frequencies 64, 128, 256, 512, 1024, 2048, and 4096 SPS.
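A channel_select string of the form described above can be parsed as follows. This is an illustrative Python sketch; the defaults of 512 SPS and scatter 8 are taken from the document, and the function handles a single default frequency rather than a list of candidates:

```python
def parse_channel_select(spec, default_frequency=512, default_scatter=8):
    # Each element of spec is c, c:f, or c:f:s. Scatter cannot be given
    # without frequency, which the element syntax enforces. Returns a
    # dict channel -> (frequency, scatter).
    channels = {}
    for element in spec.split():
        parts = element.split(":")
        c = int(parts[0])
        f = int(parts[1]) if len(parts) > 1 else default_frequency
        s = int(parts[2]) if len(parts) > 2 else default_scatter
        channels[c] = (f, s)
    return channels
```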

The sampling frequency and scatter extent are used by the Neuroarchiver when it reconstructs an incoming message stream. The Neuroarchiver uses its clocks_per_second and ticks_per_clock parameters to convert samples per second into a sample period in units of data recorder clock ticks. The Neuroarchiver can then go through a channel's messages and identify places where messages are missing, and eliminate bad messages that occur in the message stream at random times.

By default, the Neuroarchiver applies reconstruction to all data during playback. But we can disable the reconstruction by setting enable_reconstruct to zero in the configuration array. We sometimes disable reconstruction so we can get a better look at bad messages and other reception problems.

Recording

The Neuroarchiver uses the Recorder Instrument to obtain live data. First we set up the Recorder Instrument to read out streams of messages from a data-recording device, then we open the Neuroarchiver to store the live data. The Neuroarchiver does not display new data until after it has been stored to disk, so display and processing are not part of data acquisition.

To capture live data, open the Recorder Instrument and configure it to read data out of our data recorder. The Neuroarchiver will simply call the Recorder Instrument's data acquisition procedure when it captures new live data. We don't have to leave the Recorder Panel open after we set it up for data recording, but we can leave it open if we like.

The only thing that passes from the Recorder Instrument to the Neuroarchiver is the raw data acquired from the data acquisition hardware. The Recorder Instrument has a data acquisition parameter called daq_num_clocks. When we instruct the Recorder Instrument to acquire new data, it acquires a block of messages with exactly this number of clocks. The Recorder Instrument makes sure that the first message in the block is always a clock message. The Neuroarchiver calculates daq_num_clocks from record_interval, which has units of seconds. In the example shown above, the recording time interval is 1.0 s, and is shown in the menu-button to the right of the Recorder controls.
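The calculation of daq_num_clocks is a multiplication of record_interval by the clock message rate. A sketch, again assuming 128 clock messages per second:

```tcl
set record_interval 1.0    ;# recording interval in seconds
set clocks_per_second 128  ;# assumed clock message rate

# A 1.0-s recording interval corresponds to a block of 128 clocks.
set daq_num_clocks [expr round($record_interval * $clocks_per_second)]
```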

The Header button opens a text window into which you can enter a comment or paste a text document describing the contents of the archive you are recording. This comment will be written to the metadata of every archive created by the recorder.

Metadata

The NDF format contains a header, a metadata string, and a data block. Transmitter messages and clock messages are stored by the Recorder in the data block. New data is appended to the data block without any alteration of existing data. The metadata string has a fixed space allocated to it in the file, but is itself of variable length, being a null-terminated string of characters. We can edit the metadata of the playback archive with the Metadata button. We can save baseline powers to the playback archive metadata with the Save to Metadata button in the Calibration Panel. At the top of the metadata there is a metadata header, which the Recorder writes into the metadata when it creates the recording archive. The metadata header contains one or two comment fields, where each comment is a string delimited by xml "c" tags, like this:

<c>
Date Created: Thu Sep  2 15:45:59 2010. 
Creator: Neuroarchiver 45, LWDAQ_7.4.4. 
Host: dyn-129-64-201-94.wireless.brandeis.edu
</c>

The Recorder always generates a header comment like the one shown above. In addition, it will add another header comment defined by the user. We press the Header button in the Recorder and we get the Header Panel, in which we can create, edit, and save a header string. This string might describe the apparatus from which we are recording, so that every archive we create contains a record of where the archive came from, and what it contains. We don't have to include xml "c" tags around the text in the Header Window. When the Recorder writes the header to the metadata, it adds the "c" tags itself.

When we edit and save the metadata of a playback archive, the Neuroarchiver does not add "c" tags to our edits. This allows us to add any other type of field we like. We can ensure that the Neuroarchiver will recognize our edits as comments by including our text in fields delimited by "c" tags. The List button opens a List Window, which provides us with a print-out of the comments from a selection of files. Thus we can use metadata comments to describe the contents and origin of our archives, and then view these comments later.
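If we want to extract the comment fields from a metadata string in a Tcl script of our own, a sketch like the following will do, where the metadata variable is hypothetical:

```tcl
# Collect the contents of every <c>...</c> field in a metadata string.
# Tcl's "." matches newlines by default, so multi-line comments work.
set comments [list]
foreach {match body} [regexp -all -inline {<c>(.*?)</c>} $metadata] {
  lappend comments [string trim $body]
}
```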

Playback

The Neuroarchiver performs all its signal reconstruction and plotting when its Player reads data from disk. Each plot has its own enable checkbox. If we want the Player to calculate interval characteristics with a processing script, we can accelerate the processing by turning off the plots. If we check the Verbose box on the Neuroarchiver, the Player will report on its reading and processing of data. We will see the loss in each channel, and the results of reconstruction and extraction of messages from the playback interval's data.

When we open a new archive, the Neuroarchiver calculates the length of time spanned by the recording the archive contains. If the recording is one hour long, "3600.0" should appear for "Length (s)". We can navigate through the playback archive by entering a new Time value and pressing Step.

Note: When the Neuroarchiver calculates the length of the archive, and navigates to particular points in the archive, it uses one of two algorithms. The default algorithm is non-sequential. The non-sequential algorithm is quick to calculate archive time, even with the largest archives. If your archive contains corruption of the data, however, such as results from interruptions of data acquisition, the non-sequential algorithm obtains time values that differ from those obtained by continuous play-back of the archive. For corrupted archives, we use the sequential algorithm, which is over ten times slower, but gives unambiguous time values for even the most corrupted archives. Select sequential navigation with the Sequential check-box. When we play through an archive, going from one interval to the next, the Neuroarchiver always uses the sequential navigation algorithm, regardless of the Sequential check-box. When we already know the time and location of the start of an interval, the sequential algorithm is far more efficient, and it is always robust.

The Play button starts moving through an archive, one interval at a time. When the Player reaches the end of the file, it will continue to the next archive in the Player's directory tree, unless you set player_stop_at_end to 1. By "next archive" we mean the file after the current file in the alphabetical list of all NDF files in the Player's directory tree. If we are playing data as it is recorded, we will want the Player to wait until new data is recorded in the playback archive, unless the Recorder has created a new archive, in which case the Player should move on to the new one. This progression will occur automatically provided that the Player's directory tree contains no archives with names that follow alphabetically after the current playback archive. This is sure to be the case if all files are named Mx.ndf, where x is a UNIX timestamp giving the start time of the recording. But if the files have other names, perhaps because they have been imported from other formats, or renamed to make their names more descriptive, the Player may start playing an entirely unrelated and older file after it finishes the current play file.

The Stop button stops Play. The Repeat button causes the Player to repeat the processing and display of the current playback interval. We use the Repeat button when we change the plot ranges or processing script so as to re-display and re-calculate characteristics of the same interval. The Back button steps back one playback interval.

The Player recognizes several keyboard commands. We activate these with the Command key on MacOS, the Alt key on Windows, and the Control key on Linux. Command-right-arrow performs the Step function. Command-left-arrow is Back. Command-up-arrow jumps to the start of the next archive in the playback directory tree. Command-down-arrow jumps to the start of the previous archive. Command-greater-than (shift-period on a US keyboard) is Play. Command-less-than (shift-comma on a US keyboard) jumps back to the start of the archive.


Figure: Player Date and Time Window.

The Clock button opens the Player Date and Time window. This window displays the current play time as an absolute date and time, and the start time of the current play file as an absolute date and time also. A Jump to Time button allows us to jump to the recording of a particular date and time. The Player deduces the absolute date and time from the names of the archives in its directory tree. To determine the absolute date and time of a recording interval, the Player assumes all archives are named Mx.ndf, where x is the UNIX timestamp of the start of the recording. We can set the Jump to Time value to the current local time with the Now button. We set it to the play file's start time or the current play time with the corresponding Insert buttons.

If we want to move from one event to another within or between archives, we can use an event list. The Player provides Next, Go, and Previous buttons, as well as an event index, to allow us to navigate through an event list.

Glitch Filter

Subcutaneous transmitter recordings contain occasional glitches caused by bad messages. The glitch filter attempts to remove such glitches while leaving genuine signal spikes intact. The Neuroarchiver handles glitches and missing messages by inserting the previous valid sample value in place of the glitch or missing message. Immediately after reconstructing or extracting the signal, the Neuroarchiver applies the glitch filter, assuming it is enabled. The glitch_threshold parameter displayed below the VT plot is by default zero, which disables the glitch filter. To enable the glitch filter, enter a value greater than zero for glitch_threshold. This number has the same units as the sixteen-bit sample values: it is a measure of how large a deviation we will tolerate before we assume the deviation is a glitch. We recommend a glitch threshold equal to the amplitude of your signal.


Figure: Example Glitches. We have ten implanted transmitters recording EEG, with a glitch in No10 and another in No13.

If the signal jumps by an absolute distance greater than the glitch threshold, the glitch filter checks to see if the local coastline reduces by a factor of five (5) or more when the jumping sample is removed. The local coastline is the sum of the absolute changes in sample value in the nearest five samples. If a dramatic reduction does occur, the glitch filter removes the sample that jumps away from its neighbors and replaces it with the previous sample value. Regardless of any reduction in coastline, if the absolute distance is greater than ten (10) times the glitch threshold, the glitch filter removes the jumping sample.
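The following is a minimal sketch of the test described above, applied to a single suspect sample. The actual filter is built into the Neuroarchiver; the exact handling of the five-sample neighborhood here is our assumption.

```tcl
# Sum of absolute sample-to-sample changes in a list of values.
proc local_coastline {values} {
  set c 0
  for {set j 1} {$j < [llength $values]} {incr j} {
    set c [expr $c + abs([lindex $values $j] - [lindex $values [expr $j-1]])]
  }
  return $c
}

# Return 1 if the sample at index i of "values" looks like a glitch.
proc is_glitch {values i threshold} {
  set prev [lindex $values [expr $i-1]]
  set jump [expr abs([lindex $values $i] - $prev)]
  if {$jump <= $threshold} {return 0}
  # A jump of more than ten times the threshold is always a glitch.
  if {$jump > [expr 10 * $threshold]} {return 1}
  # Otherwise, require a five-fold reduction in the local coastline
  # when the suspect sample is replaced by the previous value.
  set lo [expr $i - 2]
  if {$lo < 0} {set lo 0}
  set with [local_coastline [lrange $values $lo [expr $i + 2]]]
  set repaired [lreplace $values $i $i $prev]
  set without [local_coastline [lrange $repaired $lo [expr $i + 2]]]
  return [expr $with >= 5 * $without]
}
```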

With glitch_threshold = 200, we observe several glitches per hour from implanted A3028B transmitters in faraday enclosures. Outside a faraday enclosure, when reception is poor, the rate is several glitches per minute. Glitches such as those shown above are always removed by the glitch filter. When there are two independent glitches within a small neighborhood, our test for reduction in coastline does not detect the two glitches, but the test for ten (10) times the glitch threshold may remove them both, if the glitch threshold is small enough. Such double-glitches will occur once every few channel-hours of recording outside a faraday enclosure, and far less often within a faraday enclosure. The Neuroarchiver keeps count of the number of glitches it has removed in the glitch_count parameter.

Overview

The Overview button opens the Player archive and creates a condensed view of its entire contents. The overview opens in a separate window and takes a few seconds to appear. The Neuroarchiver selects num_samples random points from the archive and divides them up into separate channels. It uses those in channel No0 to calculate the approximate time of the transmitter channels. The result is a plot that gives a good representation of the archive contents, but not an exact representation.


Figure: File Overview.

The Plot button allows you to re-plot the overview with new time and value ranges. You can select channels with the Select string. If you double-click on a point in the Overview, the Player will jump to that time and show you the recordings in more detail. You can switch to the next archive in the Player Directory Tree with the NextNDF button, and the previous archive with PrevNDF.

The Export button saves the displayed data to disk. Each selected channel will receive its own file named En.txt, where n is the channel number. The export file will appear in the same directory as the archive. If you want more resolution in your overview, increase the number of samples the overview makes of the archive for its display. Note that the number of samples is not the same as the number of points in each signal plot. The activity label tells us the distribution of the samples among the channels.

Frequency Spectrum

We calculate the discrete Fourier transform of each channel using our lwdaq_fft routine, which is available in the LWDAQ command line. The lwdaq_fft routine takes the sequence of sample values produced by reconstruction and returns the complete discrete Fourier transform. If we pass N terms to the transform, we get N/2 terms back.

The lwdaq_fft routine uses the fast Fourier transform calculation, which is a divide and conquer algorithm that insists upon a number of samples that is an integer power of two. We can pass it 16, 32, 64, 128, 256, 512, or 1024 samples. Signal reconstruction ensures that we have a suitable number of samples. If we turn off reconstruction by setting enable_reconstruct to zero, the Neuroarchiver adds or subtracts samples to or from the signal so as to satisfy the Fourier transform's requirements.

When a signal's end value differs greatly from its start value, the Fourier transform sees a sharp step at the end of what it assumes is a periodic function represented by the signal interval. Such a step generates power at all frequencies of the spectrum, rendering the spectrum less useful for detecting events such as epileptic seizures. The Neuroarchiver applies a window function to the signal before it applies the Fourier transform. The window_fraction element in the Neuroarchiver's configuration array gives the fraction of the signal that should be subject to the window-function at the start and at the end of the sequence of available samples. We like to use window_fraction 0.1 for EEG (electroencephalograph) signals. The window function is provided as an option in the lwdaq_fft routine.

Data from wireless transmitters can contain bad messages arising from interference and noise. Signal reconstruction attempts to eliminate these messages, but we can still get several bad messages per hour on each signal channel, and these appear as one-sample spikes. The Neuroarchiver uses its glitch filter to remove such spikes immediately after signal reconstruction.

Interval Processing

In each playback step, the Neuroarchiver goes through each channel selected by channel_select and performs reconstruction, glitch filtering, spectrum-calculation, plotting, and processing. We enable processing with the enable processing checkbox. When processing is enabled, the Neuroarchiver reads the processor script from disk. The processor must be a proper TclTk script. The Neuroarchiver executes the script once for each selected channel.

If you want to learn how to program in TclTk, so that you can write your own processors, we recommend Practical Programming in TclTk. Otherwise, you can consult About TclTk and the TclTk Manual. The language is interpreted rather than compiled, and the interpreter is available on all operating systems. Thus our LWDAQ software, and any scripts you write in TclTk, will work in MacOS, Windows, and Linux equally well.

The processing script has access to the Neuroarchiver's configuration and information arrays through config(element_name) and info(element_name) respectively. The configuration parameters are ones the user is free to modify. The info parameters are a mixture of parameters that are too numerous to list in the configuration array and others that the user should not change. The processing script also has access to several temporary variables. We list some of the most useful variables in the following table.

num_clocks: The number of clock messages in the current playback interval
result: The processing results string; integers are always channel numbers
config(play_file): The NDF archive being played back
config(play_time): Seconds from archive start to interval start
config(enable_vt): Voltage-time display is enabled
config(processor_file): The processing script file
config(channel_select): The channel-selection string; if "*" then all channels are chosen
config(play_interval): The playback interval in seconds
info(channel_num): The number of the channel just reconstructed and transformed
info(num_received): The number of messages received in this channel during this interval
info(num_messages): The number of messages in the reconstructed signal
info(loss): The signal loss as a percentage; subtract from 100% to obtain reception efficiency
info(signal): The reconstructed signal as a sequence of timestamps and values
info(spectrum): The transform as a sequence of amplitudes and phases
info(f_step): The separation in Hertz of the transform components, equal to 1/play_interval
info(bp_n): The baseline power of channel n
info(f_n): The sampling frequency assumed for channel n
info(num_errors): The number of data corruptions present in this interval
info(tracker_history): The history of locations for the current channel
info(tracker_x): The tracker x-coordinate of the current channel
info(tracker_y): The tracker y-coordinate of the current channel
info(tracker_powers): The median tracker coil powers for the current channel
Table: Variables Useful to Processing Scripts. The names are given as they must be quoted in a processing script.

The characteristics are stored in the result string. Any word or number can be added to the characteristics of each channel, except that only channel numbers may be written to the string as integers. Subsequent analysis is able to separate the characteristics of the various channels by looking for the integers that delimit the channel data. If we want to store a value 4 as a characteristic, we can write it as 4.0.
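A sketch of how an analysis script might split a characteristics line into per-channel lists using this rule; the example line and variable names are hypothetical:

```tcl
# File name and play time come first, then for each channel an integer
# channel number followed by its real-valued characteristics.
set line "M1234567890.ndf 120.0 5 98.00 1234.5 7 99.50 876.2"
set file_name [lindex $line 0]
set play_time [lindex $line 1]
foreach word [lrange $line 2 end] {
  if {[string is integer -strict $word]} {
    set id $word
    set characteristics($id) [list]
  } else {
    lappend characteristics($id) $word
  }
}
```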

Other elements of the configuration array we can find by pressing the Configure button in the Neuroarchiver window. Each is available with config(element_name) in the processing script. The information array elements we will have to seek out at the top of the Neuroarchiver script itself, where each is described in the comments.

If we select four channels for playback, the processing script will be called four times. Each time the Neuroarchiver calls the script, all variables that are specific to individual channels, such as loss, num_received, signal and spectrum, will be set for the current channel. We obtain the current channel number through the channel_num parameter.

The first time the processing is called, the result string is empty. Each call to the processing should append some more values to the result string. After the final call to the processing script, if the Neuroarchiver sees that result is not an empty string, it prints it to the text window. If the Save box is checked, it appends the string to a characteristics file. The Processor constructs the name of this file from the name of the archive and the processing script. Script P.tcl, when applied to archive M1234567890.ndf, produces a characteristics file called M1234567890_P.txt.

The following script records the reception efficiency for each active channel. This allows us to plot message reception versus time by importing the characteristics file into a spreadsheet.

append result "$info(channel_num) [format %.2f [expr 100.0 - $info(loss)]] "

Once the analysis has been applied to all active channels, the Player checks the result string. If the string is not empty, the Player adds the name of the play file and the play time to the beginning of the string. These two pieces of information apply equally to all channels, and are essential characteristics for event detection.

Because the script is TclTk, it can do just about anything that TclTk can do. In theory, it can e-mail the finished result string to us, or upload it over the network to a server. Most processor scripts produce characteristics files through use of the result string. But we can also use processing to export signals or spectra to disk.

The reconstructed signal is available in info(signal). The signal takes the form of a sequence of numbers separated by spaces. Each pair of numbers is the time and value of the signal. The time is in clock ticks from the start of the playback interval. The value is in sixteen-bit ADC counts. The timestamps are twenty-four bit numbers that give the number of data receiver ticks since the start of the playback interval. A twenty-four bit number is up to 16.8 million, and the tick frequency in the A3018 data receiver is 32.768 kHz. The maximum interval we can cover with these timestamps is 512 seconds. We usually specify intervals between 0.1 and 10 s. The sample values are sixteen-bit un-signed numbers.
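For example, a processor fragment can convert each timestamp to seconds from the start of the interval by dividing by the 32768 Hz tick frequency:

```tcl
# Build a text string of samples, each line a time in seconds and a
# value in ADC counts.
set export_string ""
foreach {timestamp value} $info(signal) {
  append export_string "[format %.6f [expr $timestamp / 32768.0]] $value\n"
}
```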

The discrete Fourier transform of the signal is available in info(spectrum). The spectrum is a sequence of numbers separated by spaces. Each pair of numbers is an amplitude and a phase. The pairs are numbered 0 to n−1, where n is the number of samples in the signal, available in num_messages. The k'th pair of numbers describes the frequency component with frequency k×f_step. The amplitude is in sixteen-bit ADC counts and the phase is in radians.
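As a small example of using the spectrum, the following processor fragment finds the frequency of the largest non-DC component and appends it to the result string:

```tcl
# Scan the amplitude-phase pairs for the largest amplitude above 0 Hz.
set f 0
set peak_f 0
set peak_a 0
foreach {a p} $info(spectrum) {
  if {($f > 0) && ($a > $peak_a)} {
    set peak_a $a
    set peak_f $f
  }
  set f [expr $f + $info(f_step)]
}
append result "$info(channel_num) [format %.1f $peak_f] "
```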

The following processor illustrates how to manipulate the individual components of the signal spectrum. We can manipulate individual sample values in the same way. The script calculates the sum of the squares of the amplitudes of all frequency components in the range 2-40 Hz. The script uses the variable band_power to accumulate the sum of squares. The sum of the squares of the amplitudes of all components in the discrete Fourier transform is twice the mean square value of the signal itself. We use the term power because the power dissipated by a voltage applied to a resistor is proportional to the square of the voltage. When we select the components in 2-40 Hz and add up the squares of their amplitudes, we get a sum that is twice the mean square of the signal whose Fourier transform contains only those selected components. We can see what this signal looks like by taking the inverse transform of the 2-40 Hz components alone, and plotting the filtered signal in the value versus time window.

set band_lo 2
set band_hi 40
set band_power 0.0
set f 0
foreach {a p} $info(spectrum) {
  if {($f >= $band_lo) && ($f <= $band_hi)} {
    set band_power [expr $band_power + ($a * $a)]
  }
  set f [expr $f + $info(f_step)]
}

append result "$info(channel_num) [format %.1f $band_power] "

if {$config(enable_vt)} {
  set new_spectrum ""
  set f 0
  foreach {a p} $info(spectrum) {
    if {($f >= $band_lo) && ($f <= $band_hi)} {
      append new_spectrum "$a $p "
    } {
      append new_spectrum "0 0 "
    }
    set f [expr $f + $info(f_step)]
  }
  set new_values [lwdaq_fft $new_spectrum -inverse 1]
  set new_signal ""
  set timestamp 0
  foreach {v} $new_values {
    append new_signal "$timestamp $v "
    incr timestamp
  }
  Neuroarchiver_plot_signal [expr $info(channel_num) + 32] $new_signal
}

The band power has units of square counts. When we remove the 0-Hz component from the spectrum, all we have left is components with zero mean. The band_power is twice the mean square value of the filtered signal, so the root mean square value of the signal is the square root of half the band power.

We provide the Neuroarchiver_band_power command to do all of the work in the above code for us. The routine makes sure that the DC component of the filtered signal is included before plotting, so the filtered signal is always overlaid upon the original signal in the display. You will find Neuroarchiver_band_power defined in the Neuroarchiver.tcl program. The procedure takes four parameters. The first two are the low and high frequencies of the band we want to select. The third is a scaling factor, show, for plotting the filtered signal on the screen. When this factor is zero, the routine does not plot the signal. When the routine plots the filtered signal, it picks a color automatically. The result looks like this (4-s transients filtered to 2-160 Hz) and this (1-s seizure filtered to 2-160 Hz). The fourth parameter is a boolean flag, replace, instructing the routine to replace the info(values) string with the values of the inverse transform. If neither show nor replace is set, the routine refrains from calculating the inverse transform signal, and so is faster.

set tp [Neuroarchiver_band_power 0.1 1]
set sp [Neuroarchiver_band_power 2 20 2 0]
set bp [Neuroarchiver_band_power 40 160 0 1]
append result "$info(channel_num) [format %.1f $tp] [format %.1f $sp] [format %.1f $bp] "

The script above calculates power in three bands: transient (tp), seizure (sp), and burst (bp). The power has units of square counts, and is twice the mean square value of the signal in each band. The result string contains the channel number followed by the three power values with one digit after the decimal point. To convert to μV rms, we divide the band power by two, take the square root, and multiply by a conversion factor we obtain from the specifications of our transmitter. For most versions of the A3028, this scaling factor is 0.4 μV/count. Power in the first band arises from step-like artifacts generated by loose or poorly-insulated electrodes. Power in the second band rises during epileptic seizures. Power in the third band rises during bursts of high-frequency EEG power, or contamination of the EEG by EMG. The script plots the second band with gain two and leaves the third band values in the info(values) string. Subsequent lines of code in the same processor can use the contents of info(values) to operate upon the burst power signal.
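The conversion to microvolts rms described above looks like this, assuming the 0.4 μV/count factor of the A3028 and the seizure-band power sp calculated in the previous script:

```tcl
set counts_to_uV 0.4  ;# conversion factor, transmitter-dependent
# Band power is twice the mean square of the band-limited signal, so
# rms in counts is sqrt(power / 2), which we scale to microvolts.
set rms_uV [expr sqrt($sp / 2.0) * $counts_to_uV]
append result "[format %.1f $rms_uV] "
```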

The Neuroarchiver_multi_band_filter routine accepts a list of frequency bands, each specified with a low and high frequency. The routine returns the sum of the squares of the components that lie in at least one of the specified bands. Following the list of frequencies, the routine accepts two further parameters, show and replace, just as for Neuroarchiver_band_power. When the routine calculates the inverse transform for show or replace, all components in the discrete Fourier transform that lie within one or more of these bands will be retained, and those that lie in none of the bands will be removed. In the following example, we remove components below 1 Hz, between 48-52 Hz, and above 200 Hz. We show the filtered signal on the screen, and replace the signal values in the info(values) array so that we can manipulate the filtered signal.

Neuroarchiver_multi_band_filter "1 48 52 200" 1 1

The Neuroarchiver_filter routine applies a single band-pass filter function to the original signal, but the edges of the band-pass filter are gradual rather than immediate. The band-power and multi-band-filter routines remove components outside a band and leave those inside the band intact. Neuroarchiver_filter provides a transition region between full rejection and full acceptance at the lower and upper side of the band. We specify the lower and upper cut-off regions each with two frequencies. The filter routine takes six parameters: four frequencies in ascending order to define the transition regions and the same optional show and replace flags used by the band-power routine. To see exactly what the filter routine does, look at its definition in the Neuroarchiver.tcl program.
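Here is a hedged usage sketch, following the parameter order described above; the frequency values are illustrative only:

```tcl
# Reject below 2 Hz and above 200 Hz, pass 4-160 Hz fully, with
# gradual transitions in 2-4 Hz and 160-200 Hz. Plot the filtered
# signal with unit gain; do not replace info(values).
Neuroarchiver_filter 2 4 160 200 1 0
```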

Here are some further examples of processor scripts.

# Export signal values to text file. Each active channel receives a file
# En.txt, where n is the channel number. All values from the reconstructed 
# signal are appended as sixteen-bit integers to separate lines in the file. 
# Because this script does not use the processing result string, the Player 
# will not create or append to a characteristics file.
set fn [file join [file dirname $config(processor_file)] "E$info(channel_num)\.txt"]
set export_string ""
foreach {timestamp value} $info(signal) {
  append export_string "$value\n"
}
set f [open $fn a]
puts -nonewline $f $export_string
close $f

# Export signal spectrum, otherwise similar to above value-exporter. The
# script does not use the result string, and so produces no
# characteristics file. Instead of appending the spectrum to its output
# file, each run through this script re-writes the spectrum file.
set fn [file join [file dirname $config(processor_file)] "S$info(channel_num)\.txt"]
set export_string ""
set frequency 0
foreach {amplitude phase} $info(spectrum) {
  append export_string "$frequency $amplitude\n"
  set frequency [expr $frequency + $info(f_step)]
}
set f [open $fn w]
puts -nonewline $f $export_string
close $f

# Calculate and record the power in each of a sequence of contiguous
# bands, with the first band beginning just above 0 Hz. We specify the
# remaining bands with the frequency of the boundaries between the
# bands. The final frequency is the top end of the final band.
append result "$info(channel_num) "
set f_lo 0
foreach f_hi {1 20 40 160} {
  set power [Neuroarchiver_band_power [expr $f_lo + 0.01] $f_hi 0]
  append result "[format %.2f [expr 0.001 * $power]] "
  set f_lo $f_hi
}

# Here's another way to obtain power in various bands. We specify the
# lower and upper frequency of each band.
append result "$info(channel_num) [format %.2f [expr 100.0 - $info(loss)]] "
foreach {lo hi} {1 3.99 4 7.99 8 11.99 12 29.99 30 49.99 50 69.99 70 119.99 120 160} {
  set bp [expr 0.001 * [Neuroarchiver_band_power $lo $hi 0]]
  append result "[format %.2f $bp] "
}

Processors that assist with event detection, such as classification processors, are far longer than our examples. The ECP3.tcl processor, for example, is almost two hundred lines long.

Batch Processing

Suppose we want to process thousands of hours of data from a dozen transmitters stored on disk. We can open the Neuroarchiver and start processing, but we will have to wait hundreds of hours, and our computer screen will be occupied by the Neuroarchiver display. On Linux, Unix and MacOS, however, we can run the Neuroarchiver without graphics as a console application or as a background process with no console at all. We can take our list of archives and divide their processing among a cluster of computers, with a separate instance of the Neuroarchiver running on each computer. With no graphics, processing is ten times faster, so with ten computers running without graphics, we can get the processing done one hundred times faster.

To set up batch processing, start by consulting the Run In Terminal section of the LWDAQ Manual. The idea is to invoke LWDAQ from the command line using the lwdaq shell script that comes with every LWDAQ distribution. The following command invokes LWDAQ as a background process, executes a configuration script, and passes the name of an archive and a processor into LWDAQ.

lwdaq --no-console config.tcl processor.tcl M1288538199.ndf

The archive is the file ending in NDF. It contains binary data recorded from the subcutaneous transmitters. The processor.tcl file is a text file containing a processor script to create the lines of a characteristics file. The config.tcl file is a configuration script. Here is an example configuration script.

LWDAQ_run_tool Neuroarchiver.tcl
set Neuroarchiver_config(processor_file) [lindex $LWDAQ_Info(argv) 0]
set Neuroarchiver_config(play_file) [lindex $LWDAQ_Info(argv) 1]
set Neuroarchiver_info(play_control) Play
set Neuroarchiver_config(play_interval) 1
set Neuroarchiver_config(enable_processing) 1
set Neuroarchiver_config(save_processing) 1
set Neuroarchiver_config(play_stop_at_end) 1
set Neuroarchiver_config(glitch_threshold) 500
set Neuroarchiver_config(bp_set) 500
Neuroarchiver_baselines_set
LWDAQ_watch Neuroarchiver_info(play_control) Idle exit
Neuroarchiver_play

The script sets up the Neuroarchiver to read through the archive in 1-s intervals, creating a characteristics file in the manner described above. It sets the glitch threshold and the baseline power values for all channels. When it's done with the archive, it stops and terminates. (The LWDAQ_watch command does the termination.) We assume that the batch job manager will keep track of which analysis processes are still running, and add new ones as the previous ones terminate.

We can use the Unix xargs command to schedule the batch processing of all archives in a directory using the following command.

find . -name "*.ndf" -print | xargs -n1 -P4 ~/Active/LWDAQ/lwdaq --pipe config.tcl processor.tcl

This command starts by calling find to get a list of all .ndf files in a directory and all its subdirectories. We pass this file list to xargs, which takes one file name at a time from the list (-n1) and passes the name to LWDAQ running in pipe mode, after first passing the configuration file name and the processor file name. Thus the file name is the last parameter passed in to LWDAQ. Four separate processes (-P4) will run simultaneously. When one completes, xargs starts another, until every file in the list has been processed. The pipe mode is identical to no-console except that the lwdaq shell script that starts LWDAQ does not terminate until LWDAQ itself terminates. Each archive generates two processes: the lwdaq shell script, and the tclsh interpreter of LWDAQ. The xargs utility is watching the lwdaq shell processes as it schedules subsequent processes. It is not watching the process started by the lwdaq shell.

Batch processing works on Linux and MacOS. On Windows, you may be able to do the same if you install a Unix-like environment, such as Cygwin or MSYS. We tried MSYS, and the processes scheduled by xargs generated errors. If you want to do batch processing on a Windows machine, we suggest you install VirtualBox and load it with some variety of Linux. Within the virtual Linux machine, you can run all our batch processing scripts. Both VirtualBox and Linux are open-source and free.

Interval Analysis

Once we have applied processing to our data archives to produce characteristics files, we can look for events, calculate average characteristics, or determine summary information by analyzing the characteristics files. We call this step analysis. When the analysis detects events, we call the analysis program an event-detector.

The Seizure-Detector, Mark I (SD1) script is an example of an event-detector written in TclTk that we can run in the LWDAQ Toolmaker. The script looks through the characteristics produced by the TPSPBP processor and detects epileptic seizures by examining the development of seizure-band power in the absence of transient-band power.

The Power Band Average (PBA) script calculates the average power in a sequence of frequency bands during consecutive intervals in time. You specify the length of these intervals in the script, in units of seconds, so they can be anywhere from minutes to hours or days. You run the script in the Toolmaker and specify any number of characteristics files with the file browser. Cut and paste the output from the Toolmaker window into Excel to plot the results.

The Average Reception (RA) script calculates average reception during consecutive intervals of time. It is similar to the Power Band Average script in the way it reads in characteristics files one after another and prints its results to the screen.

The Reception Failure (RF) script looks for periods of reception failure and writes an event list to the Toolmaker execution window. Cut and paste the list into a file to make an event list the Neuroarchiver can step through.

The Bad Characteristics Sifter (BCS) script goes through characteristics files and extracts those corresponding to one particular channel, provided that the characteristics meet certain user-defined criteria, such as minimum or maximum power in various frequency bands.

We present the development of seizure detection using interval analysis in Seizure Detection. The Neuroarchiver's built-in Event Classifier provides analysis that compares intervals with reference cases to detect and identify events in recorded signals.

Calibration

The Calibration Panel allows us to manage the calibration of signal power from one archive to the next. It also displays channel sample rates and channel alerts, and shows us the colors in which channels will be plotted. We open the Calibration Panel with the Calibration button.


Figure: Calibration Panel. We calculated these baseline power values with the ECP3 processor.

The Calibration Panel shows the baseline power for each channel, as well as the Neuroarchiver's best guess at the nominal sample rate of each channel in the current recording. We can edit the baseline powers, but we cannot edit the sample rate. We select which channels we want to display in the Calibration Panel using the Channel Include String. We can enter "1-14" to specify that channel numbers one through fourteen should be included. We press Refresh to put the new include string into effect. The keyword "All" includes all channel numbers, and "Active" includes those that are active. The keywords "None", "Loss", and "Extra" select channels with these alerts.

Example: The string "1 5 78 Active" includes channels one, five, seventy-eight, and all active channels. The string "1-14 Okay" includes all channels one through fourteen regardless of their state, and all channels that are running correctly.

The Calibration Panel displays an alert for each channel. If there are an excessive number of samples on one channel, the alert is "Extra". If there are too few, the alert is "Loss". If the channel is no longer active, the alert is "None". An active, well-behaved channel is "Okay". We reset the sample rates determined by the Neuroarchiver, and all the alerts, with the Reset Frequencies button. When the Neuroarchiver first detects extra samples on a channel, it issues a warning in its text window, suggesting that we check for duplicate transmitters using the same channel number.

The Calibration Panel displays the sample rate assumed for each channel to the left of its baseline reset button. Unless we specify the sample rate for each channel in the Player's Select string, the Neuroarchiver will have to guess the sample rate. We can specify the possible sample rates in the default_frequency parameter in the Configuration Panel. If we specify only one value, that's the value that shows up in the Calibration Panel. If we specify two or more values, the Neuroarchiver, on playback, will pick the best match to the data, and this value will show up in the Calibration Panel.
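
To illustrate how picking the best match among several candidate sample rates might work, here is a sketch in Python. The Neuroarchiver itself is written in TclTk, and the function name, candidate list, and matching rule below are our own illustration, not the program's actual code.

```python
def guess_sample_rate(num_received, interval_s, candidates=(128, 256, 512, 1024)):
    # Pick the candidate rate whose expected sample count over one
    # playback interval is closest to the number of samples received.
    # Illustrative assumption: the best match minimizes this difference.
    return min(candidates, key=lambda f: abs(f * interval_s - num_received))
```

With a 1-s playback interval and 510 samples received, the closest candidate above is 512 SPS, and that is the value that would appear in the Calibration Panel.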

The power of a signal is always useful for event detection. But the sensitivity of our electrodes and the gain of our amplifiers varies from one recording to the next. The result is differences in the amplitude of the recordings, even when the power of the recorded biometric signals is the same. We would rather that these variations were insignificant, and we will of course exert effort to make sure that they are insignificant. But if, despite our efforts, these variations are great enough to undermine our use of the signal power for event detection, we must obtain some measure of the sensitivity of each recording, and use this measure to normalise the recording amplitude. The Neuroarchiver's Calibration System allows us to define a baseline power for each recording. The info(bp_n) parameters store the baseline power for channels n = 1..14. If we don't want to use the Calibration System, we don't have to disable it; we simply ignore it. Our interval processor, and whatever analysis we apply afterwards, will not refer to the baseline power values at all.

The baseline powers might represent the absolute baseline power of a signal. When we calibrate a recording, we reset all the baseline powers to a high value, and our interval processor adjusts them downwards to the correct value as it proceeds through the recording. We reset all baseline powers with Reset All. When we use baseline power values in this way, the Calibration System provides various ways to read and write the values to the recording metadata, which we describe below.

The "Playback Strategy" section allows us to instruct the Neuroarchiver as to how it is to read and write baseline power calibrations during playback. The "Reset Baselines on Playback Start" option causes the Neuroarchiver to reset the baseline power values when it plays back the first interval of an archive. The "Read Baselines from Metadata on Playback Start" option causes the Neuroarchiver to read the baselines stored in the metadata under the current read and write name. This read takes place after the reset, if any. The "Write Baselines to Metadata on Playback Finish" option causes the Neuroarchiver to save the baseline power calibration developed during playback of the archive to the metadata under the current read and write name. With these options it is possible to go through all archives in a directory tree and determine and store the baseline power calibration for each archive independently in its metadata. Later, we can re-process the data and use the already-developed calibration.

The "Jump Strategy" applies to jumping from one point in one archive to another point in the same archive or another archive. In this case, we might re-process the interval we jump to. When we re-process the events in an Event Library in the Event Classifier, we jump to each event in turn. When we re-process, we may need the baseline power calibration. We can use the calibration stored in the event description, which pre-supposes there is such a calibration stored in the event description. We can use the current baseline calibration for the same channel number. Or we can read a set of baseline powers from the metadata. With these options, it is possible to re-process event libraries from many different, independent archives.

One way to calibrate an EEG recording is to use some measure of the minimum power the signal can achieve. We go through a recording with the same interval length we want to use for event classification, and look at the power of the signal in each interval. We use the minimum interval power as our calibration. If we have our recording divided into one-hour NDF archives, we can perform this calibration on each one-hour period, so we use the minimum power in each hour as our baseline calibration. We set up the Neuroarchiver to reset baseline powers whenever it starts playing back a new archive, and to save the baseline powers it has in its calibration array every time it finishes playing an archive. We use a Baseline Calibration Processor, such as BCP2 to calculate interval power and watch for the minimum value. In the case of BCP2, the measure of interval power is simply the standard deviation of the signal, with no filtering applied other than the glitch filter. In BCP3, the interval power is the amplitude of a band-pass filtered version of the signal. We perform the band-pass filtering with a discrete Fourier transform.
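
The minimum-power rule can be sketched in a few lines of Python. This is an illustration only: BCP2 itself is a TclTk processor script, and we show no glitch filtering here.

```python
import statistics

def baseline_from_minimum_power(signal, samples_per_interval):
    # Divide the recording into consecutive intervals and measure the
    # power of each as the standard deviation of its samples, as BCP2
    # does. The smallest interval power becomes the baseline calibration.
    powers = []
    for i in range(0, len(signal) - samples_per_interval + 1, samples_per_interval):
        interval = signal[i:i + samples_per_interval]
        powers.append(statistics.pstdev(interval))
    return min(powers)
```

Applied to one hour of recording at a time, the returned value would serve as that hour's baseline power.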

Another use of the baseline power values is to hold a scaling factor we want to apply to the signal before calculating a power metric for event classification. We use Set Baselines To to write a single value to all the baseline powers. From here, we can adjust the individual baseline powers by hand so as to account for differences in the baseline amplitude of the recorded signals.

We can save the baseline powers to the archive's metadata string by pressing Write to Metadata, and retrieve previously-saved values by pressing Read from Metadata. When writing a set of baseline powers, the Neuroarchiver ignores values that have not been set. We specify a name for the set of baseline powers in the metadata in the "Name for All Metadata Reads and Writes" entry box. If we use three processors, ECP1.tcl, ECP2.tcl, and ECP3.tcl to calculate baseline powers, we can store each set of baseline powers under the names ECP1, ECP2, and ECP3. We can view all baseline power sets in the metadata with the Metadata button in the Player.

Processor scripts like ECP1 look for a minimum in signal power in a particular frequency band, and use this as the baseline power, but they also increase the baseline power by a small fraction for every interval so the calibration can adapt to a decrease in sensitivity with time. Such an algorithm is intended to follow a recording from the first hour to the last, with no resetting of baseline power between archives. Before we begin analysis with such a processor, we run it on ten or twenty minutes of data to obtain an initial value for the baseline power, and then start our processing in earnest, going from one archive to the next, carrying the baseline power calibration over from the previous archive. Although appealing, this method of calculating baseline power has two practical problems. If there is an interval in one archive that produces a minimum power that is far too low to be representative of EEG, this minimum stays with the baseline calibration through the subsequent archives. And the requirement that we run the processor for ten or twenty minutes and then go back and start again produces an awkward work flow.
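
One update step of such an adaptive minimum tracker can be sketched as follows. We use Python for illustration; the creep fraction is our own arbitrary value, and ECP1's actual update may differ in detail.

```python
def update_baseline(baseline, interval_power, creep=0.001):
    # Let the baseline creep upward by a small fraction each interval,
    # then drop it to the interval power if that is lower. Over many
    # intervals the baseline tracks the minimum signal power, while the
    # creep lets it rise again if sensitivity decreases with time.
    return min(baseline * (1.0 + creep), interval_power)
```

A single spuriously quiet interval pulls the baseline down immediately, and the slow creep is the only mechanism for recovering from it, which is the first of the two practical problems noted above.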

The ECP3 processor contains configuration variables that set it up to calibrate baseline power, calculate metrics, or both at the same time. When ECP3 calibrates baseline power, we assume the Neuroarchiver resets the baseline power to some high value when it starts playing an archive, and writes the baseline power to the archive metadata when it finishes the archive. The ECP3 finds the minimum signal power in the archive and uses this as the baseline power. It does not increase the baseline power by increments as it plays the archive. When ECP3 calculates metrics without calculating baseline power, it uses the current baseline power, which we assume has been read from metadata by the Neuroarchiver when playback of the archive began. Thus ECP3 is a two-stage processor, operating upon each archive independently. First we run ECP3 on all archives to obtain baseline powers, then we run it on all archives to obtain metrics. The second stage uses the results of the first stage.

When it comes to batch classification, we use existing characteristics files, which were produced by a classification processor, to match intervals with an event library. This comparison does not use the current baseline power values. The baseline power values that applied during each interval described by the characteristics files are always stored along with the metrics. We do not need the baseline power to compare the metrics of a recorded interval with the metrics of an interval in the event library. But we do need the recorded baseline power if we want to translate the metrics back into the absolute signal power measurements from which they were obtained.

If we know we need to calibrate the sensitivity of all our recordings, one way to do so automatically is to play through the recordings with a baseline calibration processor. This processor will calculate the baseline power by, for example, looking for the least powerful interval in each hour of recording. We configure the Neuroarchiver to reset baseline power at the start of each archive and save baseline power to metadata at the end of each archive. We start playing the first archive and we let the Neuroarchiver play on through to the end of the final archive. At the end of each archive, the processor has obtained the calibration of all existing channels and stores their baseline powers in the archive's metadata. The values are stored under the name we specify in the Calibration panel.

Detail: Calculating baseline powers may take ten minutes per hour of recording if we are calculating all the event classification metrics at the same time. We don't need the metrics to calibrate baseline power. To accelerate the calibration, edit the processor and disable metric calculation.

To implement baseline calibration for the Event Classifier, we open the Calibration Panel and disable the resetting of baseline power at playback start, and disable the writing of baseline power at playback end. We enable the reading of baseline power on playback start, and we make sure the name for all metadata reads and writes matches the name under which our baseline calibrations are stored in the recording metadata. For our jumping strategy, we choose to read baselines from metadata.

To disable baseline calibration for the Event Classifier, we open the Calibration Panel and make sure all writing to metadata is disabled. For our jumping strategy, we use the current baseline power.

Event Lists

An Event List is a list of exceptional moments in the recorded data. It could be a list of detected seizure intervals, or a library of event examples for the Event Classifier. The list takes the form of a text file. Each line of the text file defines a separate event. The Neuroarchiver's Event Handler allows us to navigate through event lists. We pick the event list with the event list Pick button. We move through an event list with the Back, Go, Step, Hop, and Play buttons. Each of these provokes a Jump to a new interval. The Back, Go, and Step buttons add −1, 0, and +1 to the event index, read the event from the event list file, find the archive that contains the event, and display the event in the Neuroarchiver window. The Hop button picks an event at random from all the events in the list, and jumps to it. The Play button in the Event Handler steps repeatedly through the event list until it either reaches the end or we press the Event Handler's Stop button.

We use the Hop function with large event lists, where our purpose is to determine the false positive rate within the list. Thus we might have a list of ten thousand one-second spike events, and we hop to one hundred of them and find that 98 are true spike events and 2 are not, so our false positive rate is 2% within the list. If the list was taken from one million recorded seconds, the false positive rate is 0.02% within the recording.

Here is an example event list for archive M1300924251.

M1300924251.ndf 13.0 3 Transient 3.4 0.995 0.994 0.009 0.136 0.408 0.533
M1300924251.ndf 303.0 3 Hiss 3.4 0.710 0.810 0.644 0.383 0.553 0.699
M1300924251.ndf 402.0 3 Other 3.4 0.513 0.595 0.618 0.473 0.559 0.578
M1300924251.ndf 105.0 4 Rhythm 2.8 0.656 0.226 0.441 0.790 0.324 0.688
1300924642 0.0 4 Quiet 2.8 0.351 0.202 0.723 0.221 0.470 0.216 
1300924662 0.0 "3 4 8" "Nothing remarkable here"

Each line contains a separate event. Each event is itself a list of elements. The first element is either the name of an archive or a UNIX timestamp. The second element is a time offset from the start of the archive or from the UNIX timestamp. This offset can be a fraction of a second, but the UNIX timestamp is a whole number of seconds. The third element is a list of channel numbers to which the event applies. The remaining elements are usually a description followed by characteristics, but could contain only a description, or could be omitted. An element containing spaces can be grouped with quotation marks. In the first few lines above, we have an event type followed by the baseline power at the time of the event, and six metrics used by the Event Classifier.
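
The way such a line splits into its elements can be sketched in Python, using shlex to honor the quotation marks. The field labels are our own; the Neuroarchiver itself parses these lines as Tcl lists.

```python
import shlex

def parse_event(line):
    # Split one event-list line into its elements. Quoted elements,
    # such as a channel list "3 4 8", come back as single fields.
    fields = shlex.split(line)
    return {
        "archive": fields[0],        # archive name or UNIX timestamp
        "offset": float(fields[1]),  # seconds from archive start or timestamp
        "channels": fields[2],       # one channel number or a quoted list
        "rest": fields[3:],          # description and optional characteristics
    }
```

Applied to the last line of the example list above, this returns the timestamp 1300924662, offset 0.0, the channel list "3 4 8", and the description as a single remaining field.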

When the Event Handler moves between events, it searches its directory tree for the archive named by the event, or for an archive that contains the time specified in the event. If it finds the interval it is looking for, it displays the interval using the current Player settings. Otherwise it issues an error message in the Neuroarchiver's text window.

The isolate_events parameter in the configuration panel directs the Neuroarchiver to set channel_select to the event channel whenever it displays an event. This isolates the event channel for display. Set this parameter to 0 to see all channels.

The jump_offset parameter in the configuration panel is a time in seconds we add to the event time when we jump to the event. If, for example, we set the jump_offset to −4 and select an 8-s playback interval, we will see the four seconds recorded before and after the event time. By default, jump_offset is zero.

Whenever we jump to a new event, we use the current "Jump Strategy" in the Calibration Panel to determine what will happen to the current power calibration. We can use the baseline power stored with the event description, or we can read baseline powers from the metadata of the archive we are jumping to, or we can use the current baseline power calibration.

Event Classifier

When we press the Classifier button, we open the Event Classifier. The Event Classifier works with an Event Classification Processor to perform automated event detection for recorded signals. The Event Classification Processor calculates and assigns names to the metrics used by the Event Classifier, and it lists the names and display colors of the classification types. We introduce the Event Classifier in Similarity of Events. We describe the theoretical basis of the Event Classifier in Adequate Representation.


Figure: The Event Classifier. We have loaded an event library from disk. The library events are printed as text lines in the event list on the right, and plotted with respect to two metrics in the event map on the left. Click on a point in the map, or the J button in the list, and the Player will jump to the event. Click on the C button in the list to change the type of an event, which will change its color in the map.

The Event Classifier operates upon the characteristics of recorded data. These characteristics must conform to a particular format. They begin with a type string. The second characteristic is a real-valued baseline power with at least one digit after the decimal point. The third and subsequent characteristics are real-valued numbers between 0.0 and 1.0, all with at least one digit after the decimal point. These are the interval metrics. The events in the event library have characteristics in exactly the same format.

The first metric we assume to be some measure of the size of the signal, and we call it the power metric. The remaining metrics we assume to be independent of the power of the signal. Any two intervals that look exactly the same in a normalized Voltage vs. Time plot will have all metrics identical except for the power metric.

The Event Classifier allows us to select a subset of n metrics from those that are calculated by the Classification Processor. We enable each metric individually by checking its enable box at the bottom of the Event Classifier panel. With n metrics enabled, each playback interval and each library event appears as a point in an n-dimensional cube. If we disable the power metric, the Event Classifier ignores the power metric, and classification is normalized with respect to power. Classification without the power metric operates only upon the shape of the signal, not upon its size. The power metric could be obtained from the standard deviation of the signal, the mean absolute deviation, or by summing the squares of the frequency components in a particular frequency band.

Any interval with power metric lower than the classification threshold is classified as "Normal". The threshold appears in the Threshold entry box. Set it to 0.0 and all intervals will be compared to the event library. Set it to 0.5 and only events with power metric ≥0.5 will be compared to the event library. The Event Classifier applies this threshold even when it is performing normalized classification. An event with power metric less than the threshold will be classified as "Normal" regardless of how similar its shape is to an event in the library.

When the Event Classifier compares an interval to its library, it calculates the separation of the interval from each library event in the n-dimensional cube. The event library is a Neuroarchiver event list containing the reference events. Each reference event is an event to which we have ourselves assigned an event type. The Event Classifier finds the reference event closest to the new event. If the two are farther apart than the match_limit, the Event Classifier assigns the new event the reserved type Unknown. Otherwise, the Event Classifier assigns the new event the same type as the closest reference event. We set the match limit in the Limit entry box.
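
The nearest-neighbor rule, together with the classification threshold and match limit, can be sketched in Python. This illustrates the technique described above, not the Neuroarchiver's actual TclTk code; the parameter names and the use of Euclidean distance are our assumptions.

```python
import math

def classify(metrics, library, threshold=0.5, match_limit=0.2, power_index=0):
    # 'library' is a list of (type, metric-vector) pairs. An interval
    # whose power metric is below the threshold is "Normal". Otherwise
    # we find the nearest reference event; if it lies farther away than
    # the match limit, the interval is "Unknown".
    if metrics[power_index] < threshold:
        return "Normal"
    best_type, best_dist = "Unknown", None
    for event_type, ref in library:
        dist = math.dist(metrics, ref)
        if best_dist is None or dist < best_dist:
            best_type, best_dist = event_type, dist
    if best_dist is None or best_dist > match_limit:
        return "Unknown"
    return best_type
```

With two enabled metrics and a library containing a Hiss event at (0.7, 0.8), an interval at (0.71, 0.79) lies well within the default match limit and would be classified as Hiss.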

When we first open the Event Classifier panel with the Classifier button, we see two blank squares with some buttons and parameters. The blank square on the left is the event map, and the one on the right is the event list. Before we see something like the colorful display shown above, we have to initialize the Event Classifier with a Classification Processor and load an event library with the Load button.

The Event Classification Processor gives names for each of the metrics. These names appear in the menu buttons above the event map. The processor provides a list of event types and colors for their display in the event map. This list should not include the reserved type "Unknown", which will always be assigned by the Event Classifier with color "black" to represent powerful events of unknown type. The most important job of the processor is to calculate the interval metrics. For an introductory discussion of interval metrics, see Similarity of Events. For the exact calculation of its metrics, consult the processor script itself.

Detail: We describe the operation of several event classification processors in detail in our Seizure Detection page, such as ECP11, ECP15, ECP16.

Each line in the event list is an event with its classification, baseline calibration, and metrics. When we first start work, we don't have an event library, and we probably don't have an event classification processor that is suitable for classifying our particular events of interest in our particular recordings. In practice, we build a prototype library, we adjust the calculation of metrics to improve event separation, expand and edit the library, adjust the metric calculation again, and so on, until we arrive at a working library and set of metrics. We may be using the baseline calibration or we may not. But we will set the baseline powers to some default value consistent with our metric calculations.

Tutorial Package: We can load a working library, look through some real data, and apply the Batch Classifier with the files in our Classifier Example Package. The package contains four hours of single-channel EEG recorded from two animals by Iris Oren at Edinburgh University with two A3028A-CC transmitters. Channel No1 is a wild-type mouse. Channel No3 is an Alzheimer's disease transgenic mouse. Along with the recordings are the ECP3 classification processor script, an ECP3 event library, four characteristics files obtained from the recordings with the ECP3 processor, and a list of all events discovered in the recordings with the Batch Classifier. The baseline power calibration of each channel is stored in the metadata of the archives under the name "ECP3". When following the instructions below, use channel No3. The first six candidate events are at 309 s, 579.0 s, 587.0 s, 588.0 s, 666.0 s, of which the first is a spike burst, the last is the beginning of a delta wave that cannot be classified reliably with one-second intervals, and the middle four are uninteresting.

To begin building an event library, we download an event classification processor, such as ECP16V2. We pick and enable the processor in the playback section of the Neuroarchiver. We set the playback interval to a time that is short enough so that our shortest events are still prominent, but not so short that these events are often getting lost at the edges of the interval. For spikes, seizures, spike bursts, hiss, and head shakes, we like to use one second intervals, so one second is a good value to start with.

Before we proceed, we must adopt and implement a policy for calibrating the power of our various recordings. As a rule of thumb, we want a power metric of 0.5 to correspond to an unusually powerful interval, so that 10% of intervals have power metric greater than 0.5. We discuss calibration and the Neuroarchiver's Calibration Panel in an earlier section. The ECP16V2 power metric divides the root mean square amplitude of an interval by the baseline power specified in the Calibration Panel. View your recordings in the Neuroarchiver and estimate the average amplitude by eye. For EEG recordings made with the Subcutaneous Transmitter (A3028) we find that a baseline calibration of 200 counts works well with rats and skull screws, 500 counts for rats and deeper wire electrodes, and 100 counts for mice with skull screws (the units are sixteen-bit ADC counts). Enable your event classification processor and adjust the baseline powers until you are satisfied that your power metric will be distributed around 0.5 for the intervals you are interested in. It's much easier to use the same baseline power for all channels, so don't give individual channels separate calibrations unless they vary dramatically in their baseline amplitude.

Now that we have baseline calibration established, we can start to build our event library. We go to the start of the first recording file, which is one of our NDF archives. We pick one channel in this archive to start our work. We enter this channel number in the channel select box. We open the Event Classifier panel. We configure the map to display two different metrics, such as power and coastline. We press Continue. The Event Classifier starts playing the recorded signal. It plots each interval as a white square in the event map.

When the Event Classifier encounters an interval with the first metric greater than the classifier threshold, it stops. By default, the threshold is 0.5. You could set the threshold to zero to perform classification of all intervals. If the threshold is zero, the classifier stops when it encounters an event of type Unknown, which is any event farther than the match limit from any existing library event. Any interval above the threshold we call an event. When the Event Classifier stops at the first event, the interval appears as a black square in the map and the Event Classifier gives it type Unknown. If the event is uninteresting, we press Continue. We want to build a library of fine examples of the events we are interested in. We should not include poor examples nor events of no interest. If, however, this event is a good example of something we are interested in, we press Add. The Event Classifier adds the event to our library. We see the event as a new line of text in the event library window. We go to this new event and we press C. The event type changes. Pressing C repeatedly cycles through the event types defined in our classification processor. In the case of ECP16V2, these are Ictal, IctalSpikes, Hiss, Spindle, Artifact, Depression, and Baseline. If none of these types fit the event, we edit the processor script and add another event type to the list of types it defines. We go back a few intervals and press Continue. We come to the same event again, but this time we can assign it our new type with the C button. We can also delete types from the processor script. Every time we assign a new type, we must give it a unique color code.

When the Event Classifier encounters an event, it calculates the n-dimensional distance between the new event and our library event, where n is the number of metrics. We call this distance the match between the new event and the library event. The Event Classifier displays the match. If the match is less than or equal to the match limit, the Event Classifier classifies the new event as the same type as the library event. We set the match limit ourselves in the match limit entry box. By default, it is 0.2, which is good for building an event library. We may agree with the Event Classifier about this new event, in which case we press Continue. We may disagree because the new event is of a different type, in which case we add the new event to our library and assign it the correct type. We may disagree because the event is nothing like any event we are looking for, in which case we have a problem with our metrics, because they do not separate the events we are interested in from the ones we are not interested in. This failure to separate will occur, for example, if we try to use one-second intervals to find delta waves. But let us suppose that our metrics provide good separation.

We proceed through our example recording with Continue, using Add when necessary. When we have more than one event in our library, the Event Classifier finds the closest one to each new candidate event, and the match becomes the distance between the candidate and the closest event in the library. If we stop at a fine example of an event type, but the Event Classifier already classifies this example correctly, we can refrain from adding the event to our library. We do not want to clutter our library with unnecessary events. We may remove and add events by hand in the text window of the Event Classifier. We press Refresh to sort out the map and the list after such manual edits.

After a while, we arrive at a library with several fine examples of each of our event types. As we continue through our recording, the Event Classifier stops and provides the correct classification of each event. We go back to the beginning of our recording and pick a different channel. We repeat the same process, adding the slightly different examples of our event types that we might find in this channel, and in others subsequently. When we are satisfied that our event library is working, we press Save to write it to disk.

Do not add events of type "Unknown" to the library. Do not attempt to assign some default type, such as "Other", to events with no specific type. The Event Classifier allows us to extract events of type "Unknown" as well as our specific types. There is no need to give these unknown or uninteresting events a special type. We can always go back through these unknown events and pick some to be reference events of a specific type at a later time. The event library should be a list of events of known and definite type. Allow the Event Classifier to resolve ambiguity.

Our reference events appear as points on the map and as lines in the text window. The map plots the events in a space defined by two of their characteristics. The Classifier obtains the metric names from the classifier_metrics string, which is initialized by the classification processor. We can change the metrics for the map and re-plot with the Refresh button. To jump to the event corresponding to one of the points, click on the point. We will see the Player jump to the event, and the event itself will be highlighted in the Classifier's text window.

The map shows how well two metrics can distinguish between events of different types. Each event type has its own color code, as set by classifier_types, which is initialized by the classification processor. We hope to see points of the same color clustering together in the map, and separately from points of different colors. In practice, what we see is overlapping clusters of points, each cluster with its own color.

The Event Classifier lets us enable and disable the available metrics with the check-boxes along the bottom of the Event Classifier window. We look at the various two-dimensional views of our library events. After enough study, we will notice that some metrics do not provide useful grouping of our events, while others do. Some types of seizure, for example, are symmetric, so an asymmetry metric will not help find them. The curse of dimensionality suggests that the number of events we need for classification increases exponentially with the number of metrics. So we should disable metrics we don't need.

The Compare button measures the distance between every pair of library events that have a different type, and makes a list of such pairs whose separation is less than the match limit. The Classifier prints the list of conflicting events in the Neuroarchiver text window.

Each event written by the Classifier to the event list window has a J button next to the C button. When we click on the J button, the Neuroarchiver jumps to the library event. The archive containing the event must be in the Player's directory tree. We can jump to an event in the event map by clicking on its square. When jumping to the event, the Neuroarchiver uses our selected jumping strategy to obtain baseline calibration. If we have a fixed calibration for all transmitters in all recordings, this problem of calibration is simple. We use the baseline power in the Calibration Panel. But if each transmitter has its own calibration, and we have multiple transmitters with the same channel number in our body of data, the best strategy with multiple archives is to read the baseline calibration from archive metadata.

As we mentioned earlier, we will end up modifying our event classification processor to suit our particular experiment. When we do this, the metrics change. We may eliminate or add metrics. In such cases, we can re-calculate the metrics of our event list with the Reprocess button. During reprocessing, the Neuroarchiver steps through all events in the library. All the recording archives must reside in the Player's directory tree. As the Neuroarchiver jumps to each event in the library, it applies the current processor to the interval it jumps to. Once the event list has been reprocessed, we can look at the library in various map views to see if the new metrics provide better separation of event types.

Batch Classifier

The Batch Classifier is an extension of the Event Classifier. We can go through an archive with the Event Classifier looking for particular events using the playback and the display in the Classifier window, or we can do so more quickly using Batch Classification. The Batch Classification button opens a new window with its own buttons and check boxes. It applies the reference library to previously-recorded characteristics files produced by the same classification processor.


Figure: The Batch Classifier. Along the top we have controls for specifying the input and output files. The Channel Numbers string allows us to list individual channel numbers we want to classify. Buttons select event types we want to find and collect. Other buttons allow us to select which metrics to enable for classification. The Event Classifier's match limit and power threshold are included so we can change them without going back to the Event Classifier panel.

Batch classification uses the classifier threshold, match limit, and metric enable values from the Event Classifier. Each of these appears in the Batch Classifier window. The Batch Classifier will classify as Normal any interval with power metric less than the classification threshold, regardless of any other settings.

If Exclusive is not checked, the Batch Classifier performs classification just as would the Event Classifier. When the power metric is above threshold, and the closest event in the library is closer than the match limit, the interval is classified as the same type as this closest event. In the calculation of proximity, the Batch Classifier uses only the metrics that are enabled. If the closest event is farther than the match limit, the interval is classified as Unknown. If Exclusive is checked, the Batch Classifier ignores all events in the library that are not of a type selected by check boxes in the Batch Classifier window. The Batch Classifier finds all intervals that lie within the match limit of the selected types and classifies them as one of those types.
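The classification rules above can be summarized in a short sketch. This is an illustration in Python, not the Neuroarchiver's own code; the power metric is passed separately for clarity, and the library format is hypothetical.

```python
import math

def batch_classify(metrics, power_metric, library, *, threshold,
                   match_limit, selected_types, exclusive):
    """Classify one interval from a characteristics file. The library is a
    list of (type, metric vector) pairs; names here are illustrative."""
    # Intervals below the power threshold are always Normal.
    if power_metric < threshold:
        return "Normal"
    # In exclusive mode, ignore library events of unselected types.
    candidates = [(t, m) for t, m in library
                  if not exclusive or t in selected_types]
    best_type, best_d = "Unknown", float("inf")
    for event_type, ref in candidates:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(metrics, ref)))
        if d < best_d:
            best_type, best_d = event_type, d
    return best_type if best_d <= match_limit else "Unknown"
```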

Detail: Suppose one of our types is Baseline and we want to find intervals that are within 0.1 of our library baseline events, regardless of power metric, and not using the power metric in the comparison. We disable the power metric and check Exclusive. We set the threshold to 0.0 and the limit to 0.1. We will get a list of events that are, according to the metrics, of similar shape. We look at them in a normalized VT plot to find out if they are indeed of similar shape. This is a test of our metrics, one among many that we must perform before we can be confident in our event detection.

Before we start batch classification, we must select input files and specify the output. The input files are characteristics files. We can select them in one of two ways. We can select individual files in the same directory using the Pick Files button. We can select all files in a directory tree that match a pattern with the Apply Pattern to Directory button. The pattern uses "*" as a wildcard string and "?" as a wildcard character.

The output file is an event list. By default, the Batch Classifier produces a list of events in one file. Each line of this file is an event, in the Event List format. We can select this list in the Neuroarchiver and navigate through its events as we like. We specify the file with the Specify File button.

If we have characteristics files with names of the form Mx_s.txt, where x is a ten-digit timestamp and s is a string naming a classification processor, the Batch Classifier can generate a separate event list for each characteristics file. When we specify the output file name, we enter a name for the event list of the first characteristics file, in the same form as above, perhaps M1234567890_Events.txt. Every time the Batch Classifier moves to a new characteristics file, it will open a new event list, replacing the timestamp in the previous event list name with the timestamp taken from the new characteristics file name.
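The renaming behavior can be pictured with a short sketch in Python. The file names here are hypothetical; this illustrates the substitution described above, and is not the Neuroarchiver's own code.

```python
import re

def next_event_list_name(template, characteristics_name):
    """Derive the event list name for a new characteristics file by swapping
    its ten-digit timestamp into the previous event list name."""
    stamp = re.match(r"M(\d{10})_", characteristics_name).group(1)
    return re.sub(r"\d{10}", stamp, template, count=1)
```

For example, with previous event list M1234567890_Events.txt and new characteristics file M1234568912_Seizure.txt, the new event list name would be M1234568912_Events.txt.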

In addition to the list of events, the Batch Classifier produces one summary line for each file. This line contains the file name, excluding directory path, and a string of integers. In purple are the selected channel numbers, and each of these is followed by the count of each selected type of event. We can cut and paste these counts into a spreadsheet, and sometimes this is all the data we need from the Batch Classifier. There is a checkbox for each channel number, so we can select which channels we want to search for events. There is a checkbox for each event type, so we can select which events we want to find.

If Loss is checked, the Batch Classifier will add two numbers to the text window output for each enabled channel. The first number is the number of loss intervals found in the characteristics file, and the second is the total number of intervals found. We note that total signal loss due to failure of a transmitter, or omission of a transmitter from the recording system, has two possible manifestations in the characteristics file, depending upon how the processing was set up. If the channel was specified explicitly in the Neuroarchiver's channel select string during processing, there will be an interval recorded in the characteristics file regardless of whether or not any samples are present. But if the channel select string was just a wildcard (*), there will be an interval only if some minimal number of samples is present.

The Unknown event type is an event that differs by more than match_limit from all existing events in the library. If we are searching for one particular type of event in our data, such as a Spike, we could fill our event library with spike events, and assume that anything with a match distance of 0.2 or less must be a spike, and anything else is not a spike. We set the match_limit to 0.2 in the Batch Classifier Window or the Classifier window (the two windows refer to the same parameter). The Batch Classifier will classify each event as either Unknown or Spike.

The Batch Classifier makes no use of the baseline power values recorded in the characteristics files it takes as input. The comparison between each interval in the characteristics files and each event in the reference library is done on the basis of the metrics alone. We need the baseline power to calculate the metrics in the first place, but we do not need the baseline power to compare the metrics.

Event Handler

The Event Handler is a program executed by the Event Classifier. When the Classifier and the Player are operating together, we have seen how the Classifier plots the current interval for all selected channels in its map, and classifies all events with its library. At the same time, the Classifier can execute a Tcl script that takes action based upon the nature of the events it encounters during play-back. We call this script the event handler and we enable its execution with the Handler check box. The script itself we store in the handler_script string. If this string is non-empty, the Classifier will attempt to execute it.

Example: We wish to flash a lamp whenever we encounter a Seizure event while playing live data recorded from an animal. We set the handler_script to a program that checks the current event type and executes a sequence of LWDAQ commands if the type is a Seizure. The LWDAQ commands will open a socket to a LWDAQ driver and turn on and off a lamp connected to some LWDAQ device such as the Lamp Controller A2060L.

The Event Handler has access to a selection of Event Classifier local variables, such as type, which contains the type of the current event. The following table lists the variables the event handler can use, and their values.

Name     Value
id       the channel number in which the event occurred
event    the event itself
closest  the closest event to this one in the event library
type     the name of the event type
fn       the archive file in which the event occurs
pt       the play time within the archive at which the event occurs
info     the Neuroarchiver_info array
config   the Neuroarchiver_config array

Table: Variables Available to Handler Scripts.

The info and config variables are the Neuroarchiver information and configuration arrays. Thus we would obtain the value of the playback interval with $config(play_interval) and the current recording time with $config(record_end_time). The event variable contains a string describing the current event in the same way it would appear in an event list. The closest variable contains the closest event in the library.

The following example responds to events of type Seizure by writing a message in red to the Neuroarchiver text window, giving the play time and channel number.

if {$type == "Seizure"} {
  Neuroarchiver_print "Seizure on channel $id at time $pt." red
}

One way to define the handler script is with a Classification Processor. A Classification Processor already defines the types and colors of events, and the names of the Classifier metrics. It can also define the value of event_handler. The following lines would establish the above handler script for the Classifier.

set info(handler_script) {
  if {$type == "Seizure"} {
    Neuroarchiver_print "Seizure on channel $id at time $pt." red
  }
}

Note that we simply declare the entire script as a string with curly braces marking its beginning and end. Here is another example. We define an event handler that opens a socket to a LWDAQ driver at address 10.0.0.37 and sends hexadecimal command words "0080 DE83 0585 3287 6489 0181" to the device in socket 1.

set info(handler_script) {
  if {$type == "Seizure"} {
    set sock [LWDAQ_socket_open 10.0.0.37]
    LWDAQ_set_driver_mux $sock 1 1
    foreach c "0080 DE83 0585 3287 6489 0181" {
      LWDAQ_transmit_command_hex $sock $c
    }
    LWDAQ_socket_close $sock
  }
}

If we have an A2060L connected to socket one, these commands will cause it to generate 100 pulses of 5 ms with fixed interval 50 ms and pulse height 10 V.

Location Tracking

To record from an Animal Location Tracker (ALT), such as the A3032, we must configure the Recorder Instrument to download the data from the tracker just as we download data from a traditional receiver. The tracker attaches a payload to the SCT messages. The A3032A, for example, attaches a payload of fifteen power measurement bytes and a firmware version byte. We set the payload_length parameter in the Recorder Instrument to sixteen when we want to record from an A3032A. When the Neuroarchiver creates a new NDF file for recording, it writes the payload_length into the file metadata. During playback, the Neuroarchiver reads the payload length from the metadata. The payload of the ALT clock messages specifies the arrangement of the ALT's detector coils, allowing the map of the coils to be generated automatically in the Tracker window.


Figure: The Animal Location Tracker Window. The map is for an A3032A which provides an array of fifteen detector coils on an 8-cm grid.

The Neuroarchiver calculates the location of transmitters using a weighted centroid of the tracker's power measurements. It rejects any coils that are farther than the extent parameter from the measured location of a transmitter, a result it obtains by an iterative calculation. It ignores any coils with power measurement less than a threshold, which it calculates using the threshold fraction. The threshold is the average coil power measurement plus the fraction times the difference between the maximum and average coil powers. For example, if the average coil power is 100 counts and the maximum is 170 counts, a fraction of 0.3 will set the threshold at 121 counts and a fraction of 0.0 will set it at 100 counts. When calculating the weighted centroid, the Neurotracker takes the power above threshold of each included coil and applies the tracker exponent to this net power. If the threshold is 121 and a power measurement is 156, the net power is 35. If the exponent is 1.0, the coil receives weight 35 in the centroid. If the exponent is 2.0, the coil receives weight 1225.
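The threshold and centroid calculation can be sketched in Python. This is an illustration of the rules described above, not the Neuroarchiver's code; the extent-based rejection of distant coils is omitted for brevity.

```python
def tracker_position(coils, powers, fraction=0.3, exponent=1.0):
    """Weighted-centroid location from coil power measurements. The coils
    list holds (x, y) positions; powers holds the matching measurements."""
    average = sum(powers) / len(powers)
    threshold = average + fraction * (max(powers) - average)
    x = y = total = 0.0
    for (cx, cy), p in zip(coils, powers):
        if p < threshold:
            continue  # ignore coils below the power threshold
        weight = (p - threshold) ** exponent  # net power raised to exponent
        x += weight * cx
        y += weight * cy
        total += weight
    return (x / total, y / total) if total > 0 else (None, None)
```

With three coils at powers 30, 150, and 120 counts, the average is 100 and the maximum 150, so a fraction of 0.3 gives a threshold of 115, and only the second and third coils contribute to the centroid.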

Each playback interval produces one location measurement for each active transmitter. We can see the path of the transmitter through previous playback intervals by setting the persistence parameter to a value greater than zero. With persistence = 20, we will see the twenty previous positions joined by a line.

The tracker measurements are available to processing in the Neuroarchiver info array. The following one-line processor writes the location of each transmitter to a characteristics line.

append result "$info(channel_num) $info(tracker_x) $info(tracker_y) "

The tracker_coordinates string contains the coordinates of the tracker coils, while tracker_powers contains the average power of the coils for samples from the current channel number. The tracker_x and tracker_y parameters contain the position of the transmitter during the interval. The tracker_history string contains not only the current location, but all previous locations saved by the persistence parameter.

Message Inspection

The Neuroarchiver allows us to inspect the content of recorded data in detail, message by message if necessary. At times the Neuroarchiver might report errors in its text window, something like this:

WARNING: Clock jumps from 43904 to 44060 in M1295029550.ndf at 584 s.

These messages will be in blue. They mean that something has gone wrong in the acquisition of data by the Recorder Instrument. In the example above, the Neuroarchiver detected a jump in the value of the clock message from 43904 to 44060. The next clock message should always be one greater than the last, with the exception of clock message zero, which of course follows clock message 65535. The clock messages are inserted in the message stream by the Data Recorder (such as the A3018) regardless of the incoming transmitter data. They are the messages with channel number zero.

The Data Recorder inserts 128 messages per second, so they are spaced by 7.8125 ms. In the above example, the clock has jumped by 156 instead of 1. We are missing just over 1 s of data. There are several possible explanations for the missing data. One is that the Data Recorder buffer is overflowing because data acquisition is not keeping up with data recording. Another is that data on its way from the Data Recorder to the LWDAQ Driver is being severely corrupted, and the error correction used by the Recorder Instrument is chopping out large chunks of the data in order to make sure it does not pass on corrupted messages.

Most corruption of data from the Data Recorder to the LWDAQ Driver occurs because of extraordinary electrical events like static discharge. In these cases, a few extra bytes are inserted into the data stream by spurious pulses on the logic lines. Starting with LWDAQ 7.5, the Recorder Instrument provides error-correction so thorough that it will almost always be able to remove the spurious bytes and restore all but one or two of the original messages. Thus a warning like the one above will be unusual. Instead, we expect to see clock jumps of at most one or two steps.

The Neuroarchiver lets us look more closely at the incoming messages, which is useful when diagnosing problems. Try clicking the verbose check box. Now we will see more detailed reports of reconstruction in the text window. Press Configure and set show_messages to 1. Press Step in the Player. We will see detail of the number of errors in our playback interval, and a list of the actual message contents, as provided by the print instruction of the Recorder Instrument's message analysis. If there is an error in the playback interval, the list of messages will center itself upon that error. Otherwise the list will begin at the start of the interval. We set the number of messages the Neuroarchiver will print out for us with the show_num parameter.

Import-Export

Using a processor script, we can export NDF data to a text file, EDF file, or any other file format. We describe the available exporters in Exporting Data. To import data, we must translate into NDF. We have several import scripts available, all of which we can run inside LWDAQ with the Run Tool command, or in the LWDAQ Toolmaker. We present these importers in Importing Data.

We may also want to read the data directly from the NDF file into Matlab, LabView, or some other such program. To do that, we must understand the NDF file structure in detail. When we open an NDF file with a Hex editor we see a load of zeros after the header block, and then the transmitter data itself. The format of the NDF header is described here. The address of a byte in the file is the number of bytes we must skip over to get to it from the first byte. Thus byte zero is the first byte. Bytes 8-11 contain the four-byte address of the first data byte in the file, with the highest byte first.

The data itself starts at the data address, and is divided into messages. Each message has a core made up of four bytes. The first byte is the channel number. The next two bytes are the sixteen-bit sample value, high byte first. The fourth byte is a timestamp or, in the case of clock messages, a firmware version number. Most NDF files contain messages consisting only of the message core. But NDF files recorded from devices such as an Animal Location Tracker (A3032) have a payload in addition to the message core. The length of the payload is written in the NDF metadata. If we are planning to navigate through archives that contain messages with payloads, we must read the metadata string and look for the record that states the payload length in bytes, which for the A3032A is sixteen. The NDF metadata begins at byte 16 and has length given by bytes 12-15. Note that the string length does not equal the size of the space in the file allocated to the string, but instead is the length of the string that has been deliberately written to the metadata since the file's creation.
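Given the raw data bytes and the payload length, splitting the stream into messages is straightforward. Here is a sketch in Python of the message format described above; it is an illustration, not the Neuroarchiver's own reader.

```python
def read_messages(data, payload_length=0):
    """Split raw NDF data bytes into (channel, value, timestamp, payload)
    tuples. Each message is a four-byte core plus an optional payload."""
    size = 4 + payload_length
    messages = []
    for i in range(0, len(data) - size + 1, size):
        m = data[i:i + size]
        channel = m[0]
        value = (m[1] << 8) | m[2]  # sixteen-bit sample, high byte first
        timestamp = m[3]            # firmware version for clock messages
        messages.append((channel, value, timestamp, m[4:]))
    return messages
```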

Every byte in the NDF file from the first data byte to the final byte in the file is a message byte. When the Neuroarchiver adds data to the file, it simply appends the data to the file. It does not have to change anything in the header or make any other adjustment to the file. There is no value in the header that gives the length of the file. The length of the file is available from the operating system.

Having established the location of the first byte, and the length of the messages, we can read messages into our own program. Now we have to interpret them. When the channel number is zero, the message is a clock message. Clock messages are stored by all SCT data receivers at 128 Hz, which is every 256 periods of a 32.768 kHz clock oscillator. Subcutaneous transmitters use micro-power 32.768 kHz oscillators to control their transmission rate, and data receivers use them to generate eight-bit (0-255) timestamp values for each SCT message. But the timestamp value for a clock message is always zero, because the clock message is stored whenever the data receiver's eight-bit timestamp value returns to zero. Instead of recording a redundant zero in the timestamp byte of the clock messages, we store the firmware version of the data receiver. But in all other messages, the timestamp byte contains the timestamp of the moment that the SCT message was received. Thus we know this moment with a precision of ±4 ms.

The content of a clock message is a sixteen-bit counter that increments from one clock message to the next. Every 512 s, this value cycles back to zero. The clock messages are always present in the data, unless the data has been corrupted. A corrupted archive can contain sequences of zeros that we call null messages. Any message for which the first and fourth bytes are zero is a null message, and is a sign of corruption. Do not count these as clock messages.

An SCT data message will contain its channel number, which is 1-14, followed by two bytes of data and a timestamp. An SCT auxiliary message will contain channel number 15, followed by sixteen bits in a particular auxiliary format, and a timestamp. Here is an example of four-byte messages in a data stream, expressed in hexadecimal.

00 46 00 04 
04 A5 97 06 
08 A0 EB 18 
0B A5 F6 20 
05 A5 E5 37 
03 A7 8F 3C 
04 A5 9F 46 
08 A0 F8 58 
0B A6 12 60 
05 A5 DD 77 
03 A7 8F 7C 
04 A5 B3 86 
08 A0 B7 98 
0B A5 F7 A0 
05 A5 EF B7 
03 A7 BF BC 
04 A5 DD C6 
08 A0 B9 D8 
0B A5 FF E0 
05 A5 E9 F7 
03 A7 A6 FC 
00 46 01 04 
04 A5 B9 06 
08 A0 CB 18 
0B A6 0D 20 
05 A5 DB 37 
03 A7 C7 3C

Each block of four bytes is a message. Those that start with 00 are clock messages. For channel zero, successive messages have a data value that increments. The firmware version of this data recorder is 04, which is an early version of the A3018 firmware. The rest are SCT messages. If we select the two middle bytes in a hex editor, we can read the data value. The first three for the example above are from channels 4, 8, and 11, and their sample values are 42391, 41195, and 42486.

The timestamp values for the SCT channels are relative to channel 0. If a transmitter runs at 512 SPS there will, on average, be 4 messages from each of channels 1-14 in between successive messages from channel 0. Not all channels need be present. If only one transmitter was active then there would only be messages from one channel. The timestamps for successive messages in between channel 0 messages increase monotonically unless the archive has been corrupted. The timestamps of the first three messages are 6, 24, and 32. The timestamps of the messages from channel 4 are 6, 70, 134, 198, and 6. The messages arrive from the different channels roughly but not exactly in the same order between successive clock messages, with each channel sending a message roughly 4 times for every clock message, because they are operating at 512 SPS, while the clock is at 128 Hz.

The reason the messages are not exactly in sequence is three-fold. First, the transmitters deliberately scatter their transmissions in time to minimize systematic collisions. Second, some signals may drop out or be corrupted. Third, we may occasionally receive bad messages on a transmitter channel that will appear as glitches in our data unless we reject them. Despite the transmission scatter, the collisions, and occasional bad messages, reconstruction of the signal is possible even with loss of up to 80% of samples. The Neuroarchiver applies reconstruction to the data so that we get the highest quality signal. If we want to read the NDF data directly into some other program, we must either do so without reconstruction, or we must implement reconstruction ourselves.

The code that performs reconstruction for the Neuroarchiver is lwdaq_sct_recorder in electronics.pas. The comments at the top of the routine and within the routine describe the details of signal reconstruction. In summary: we extract all messages from a particular channel in a playback interval, use our knowledge of the nominal sample rate to find the nominal sample times for the signal, and so compose a sequence of time windows in which legitimate samples could have been generated. Samples outside these windows we reject. Within a window, if we have more than one sample, we choose the one most similar to the previous reliable sample. If we have no sample in a window, we insert the value of the previous sample into the reconstructed data. We end up with a complete set of samples for the interval.
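A simplified version of this window-based reconstruction can be sketched in Python. This illustration assumes one window per nominal sample time, with timestamps measured in clock ticks; the actual Pascal routine handles many more details.

```python
def reconstruct(samples, num_windows, window_ticks, previous_value=0):
    """Simplified window-based reconstruction for one channel. The samples
    are (timestamp, value) pairs; each nominal sample time owns a window
    of window_ticks clock ticks."""
    output = []
    last = previous_value
    for w in range(num_windows):
        lo, hi = w * window_ticks, (w + 1) * window_ticks
        candidates = [v for t, v in samples if lo <= t < hi]
        if candidates:
            # Choose the candidate closest to the previous reliable sample.
            last = min(candidates, key=lambda v: abs(v - last))
        # An empty window repeats the previous sample.
        output.append(last)
    return output
```

In the test below, a spurious sample of 500 in the first window is rejected in favor of the sample closest to the previous value, and the empty third window repeats the second window's sample.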

Version Changes

Here we list changes in recent versions that will be most noticeable to the user. You will find the source code here.