Seizure Detection

© 2010-2014, Kevan Hashemi Open Source Instruments Inc.


Baseline Signal
Power Ratio
Wave Bursts
Exporting Data
Importing Data
Heartbeat Recording
Reception Robustness
Seizure-Like Events
Power Averages
Short Seizures
Baseline Calibration
Eating and Hiss
Similarity of Events
Harmonics and Fundamentals
Adequate Representation
High Frequency Oscillations


Here we record our efforts to apply our Subcutaneous Transmitter system and Neuroarchiver Tool to the detection of artifacts in electroencephalograph (EEG) recordings. In some cases, we import recordings from human subjects into the Neuroarchiver for analysis. In other cases we export recordings from animal subjects into text format for analysis in third-party software. But most of the time we work with data recorded by our subcutaneous transmitters from live laboratory animals. Our initial work upon the detection of EEG artifacts was in collaboration with Matthew Walker and his colleagues (Pishan Chang, Rob Wykes, Stephanie Schorge, Joost Heeroma, Athena Lemesiou) at the Institute of Neurology (ION), University College London, Alex Rotenberg and Sameer Dhamne at Children's Hospital Boston (CHB), and Louise Upton at Oxford University (OU). We discuss our choice of electrodes for EEG recording in Electrodes. We present our own understanding of the sources of EEG in The Source of EEG. For a list of example recordings you can download and examine, see our Example Recordings page.

Baseline Signal

[04-APR-11] The following shows a four-second interval of EEG from eight separate control animals by Joost at UCL. We have fifteen hours of control recordings in archives M1297634713 through M1297685113 made on 14-FEB-11. The animals have recovered from surgery and have received no drugs. The transmitters are all A3019Ds, for which each ADC count is 0.4 μV at the electrodes. The electrodes are screws held in place and insulated with dental cement.

Figure: Control Animal EEG Recording. There are eight traces drawn on top of one another.

The amplitude of these control signals varies from 30 μV to 80 μV rms in two-second intervals. During the course of ten minutes, the average amplitude of all eight signals in four-second intervals is between 40 μV and 60 μV. We calculate the signal power in three separate frequency bands for these control recordings using the following processor script. The bands are transient (0.1-1.9 Hz), seizure (2.0-20.0 Hz), and burst (60.0-160.0 Hz).

set config(glitch_threshold) 1000
append result "$info(channel_num) [format %.2f [expr 100.0 - $info(loss)]] "
set tp [expr 0.001 * [Neuroarchiver_band_power 0.1 1.9 0]]
set sp [expr 0.001 * [Neuroarchiver_band_power 2.0 20.0 0]]
set bp [expr 0.001 * [Neuroarchiver_band_power 60.0 160.0 0]]
append result "[format %.2f $tp] [format %.2f $sp] [format %.2f $bp] " 

The Neuroarchiver_band_power routine returns the sum of squares of the amplitudes of all the Fourier transform components within a frequency band. We call this the band power. The filtered signal is the signal we would obtain by taking the inverse transform of the frequency spectrum after setting all components outside the frequency band to zero. We define the amplitude of the filtered signal to be its root mean square value. The amplitude is A√(p/2), where p is the sum of squares of the Fourier transform component amplitudes, and A is a conversion factor in μV/count. Our control recordings were made by A3019D transmitters, for which A = 0.4 μV/count. Thus if the band power is 10000 sq-counts, or 10 ksq-counts, the amplitude of the filtered signal is 28 μV.

Transmitters No1 and No3 show no transient steps or spikes. We use our Power Band Average (PBA) analysis to measure the average power in the transient band for these two transmitters every minute.

Figure: Transient Power in Control Animals (0.1-1.9 Hz). Peak power 8 k-sq-counts is 25 μV rms and baseline power 700 sq-counts is 7 μV rms.

The transient amplitude of No1 and No3 is always less than 25 μV rms during any one-minute period and usually 7 μV rms. The same is not true for the remaining six transmitters. In the remaining transmitters, we use our Transient Spike (TS) analysis to search for transient amplitude greater than 63 μV. There are an average of one such period every forty seconds. Throughout the recording, transmitter No4 shows negative transients every minute or two, of up to 2 mV, like this. Half-way through the recording, No12 and No8 start to show positive transients, similar in shape and height. All transmitters other than No1 and No3 show transients in the final hours of the recording. Transmitter No13 shows one huge transient towards the end, which you can view here.

We plot the average power in the seizure band per minute for transmitters No1 and No3 below, knowing that these results are not contaminated by transient spikes. The seizure-band power in these control animals, which we assume are not suffering seizures, varies from 4 k-sq-counts (18 μV rms) to 25 k-sq-counts (45 μV rms). Seizure power in the remaining channels is corrupted by transients, as you can see here for No4, where we plot the transient and seizure power together.

Figure: Seizure Power in Control Animals (2.0-20.0 Hz). Peak power 25 k-sq-counts is amplitude 45 μV rms and baseline power 4 k-sq-counts is 18 μV.

Burst-band power is far less affected by transients, as you can below. Peak power from No4 in the burst band is 9 k-sq-counts, (27 μV rms), for No3 it is 4 k-sq-counts (18 μV), and for No1 it is 2 k-sq-counts (13 μV).

Figure: Burst Power in Control Animals (60.0-160.0 Hz). Peak power 9 k-sq-counts is amplitude 27 μV rms and baseline power 1 k-sq-counts is 9 μV rms.

The baseline signal amplitude obtained from screws in a rat's skull appears to be around 35 μV rms in the 1-160 Hz band, 18 μV in 2-20 Hz, and 9 μV in 60-160 Hz. When the amplitude in the 0.1-1.9 Hz band rises above 70 μV, it is likely that the electrodes are generating a transient spike. In the absence of any transient spikes, the amplitude of the 2.0-20.0 Hz band appears to have a minimum amplitude of around 20 μV rms, and rises to 40 μV for an hour or two at a time.

Power Ratio

[12-APR-10] We have five hours of data recorded at the Institude of Neurology at University College London by Dr. Matthew Walker and his student Pishan Chang using the A3013A. The inverse-gain of this transmitter is 0.14 μV/count. Here we consider using the ratio of band powers to detect seizures. Using a power ratio has the potential advantage of making seizure detection independent of the sensitivity of individual implant electrodes.

When we tried to detect seizures by measuring power in the range 2-160 Hz, we found that step-like transient events also produced high power. We're not sure of the source of these transients, but we suspect they are the result of intermittent contact between two types of metal in the electrodes. We are sure they are not seizures.

We define a seizure band, in which we expect epileptic seizures to exhibit most of their their power, and a transient band, in which we expect transient steps to exhibit power. In our initial experiments we choose 3-30 Hz for the seizure band and 0.2-2 Hz for the transient band. Seizures show up as power in the seizure band, but not in the transient band. Transients show up as power in both bands. When we see exceptional power in the seizure band, and the seizure power is five times greater than the transient power, we tag this interval as a candidate seizure. You can repeat our analysis with the following script.

set seizure_power [Neuroarchiver_band_power 3 30 $config(enable_vt)]
set transient_power [Neuroarchiver_band_power 0.2 2 0]
append result "$info(channel_num)\
  [format %.2f [expr 0.001 * $seizure_power]]\
  [format %.2f [expr 1.0 * $seizure_power / $transient_power]]"
if {($seizure_power > 1000000) \
  && ($seizure_power > [expr 5.0 * $transient_power])} {
  Neuroarchiver_print "Seizure on $info(channel_num) at $config(play_time) s." purple

The following graphs show seizure-band power and seizure-transient power ratio for the first hour of our recording (M1262395837). We used window fraction 0.1 and glitch filter threshold 10,000 (1.4 mV). The playback interval was 4 s.

Figure (Below): Power in the 3-30 Hz Range for M1262395837.

In these recordings, we convert band power to filtered signal amplitude with a = √(p/2) × 0.14 μV/count. Our seizure power threshold of 1 M-sq-counts is the same as a seizure-band amplitude threshold of 100 μV. Our requirement that the seizure power be five times the transient power is the same as requiring that the seizure-band amplitude ba 2.2 times the transient-band amplitude.

Figure (Above): Seizure-Transient Power Ratio for M1262395837.

Only the seizure activity confirmed by Matthew on No1 has both high seizure power and high power ratio. There are sixteen 4-s intervals between 1244 s to 1336 s that qualify as seizures according to our criteria, and Matthew agrees that they are indeed seizures. Channels No2 and No3 have many intervals with glitches and transients, but no seizures. Channel No7 has high power ratio in places, and at these times the signal looks like a seizure, but the power is below our threshold.

We apply the same processing to the remaining four hours of recordings (M1262403039, M1262417443, M1262798664, and M1264098379). We see several more confirmed seizures on No1 up to a hundred seconds long. We see a few seizure-like intervals on No3 and No7. Matthew cannot confirm whether these are short seizures or other activity. The following figure gives an example of such an interval.

Figure: Candidate Four-Second Seizure. The original signal is orange and the seizure-band signal is light blue. Seizure-transient power ratio is 8:1.

Power ratios are useful in isolating seizures from the baseline signal, but we have been unable to isolate seizures using power ratios alone. In practice, we may have to rely upon consistency between implant electrodes so that we can use the amplitude of the signal as an additional isolation criterion.

Wave Bursts

[12-APR-10] The EEG signals recorded at UCL from epileptic rats contain bursts of power in the range 60-100 Hz, with several bursts per second. The figure below gives an example.

Figure: Wave Bursts in Half-Second Interval. Transmitter No3 in M1264098379 at time 596 s. Recorded with A3013A.

The inverse-gain of the A3013A is 0.14 μV/count, so the burst has peak-to-peak amplitude of 500 μV. The following figure shows the same signal for a longer interval, showing how the bursts are repetetive.

Figure: Wave Bursts in Four-Second Interval. Transmitter No3 in M1264098379 at time 596 s, recorded with A3013A. In M1262413842 we see a single set of wave bursts on No7 at time 2330 s. In M1262417443 we see a sequence of wave bursts on No2 starting at 2976 s.

We used the following processor script to detect wave bursts as well as seizures.

set seizure_power [expr 0.001 * [Neuroarchiver_band_power 3 30 0]]
set burst_power [expr 0.001 * [Neuroarchiver_band_power 40 160 $config(enable_vt)]]
set transient_power [expr 0.001 * [Neuroarchiver_band_power 0.2 2 0]]
append result "$info(channel_num)\
  [format %.2f $transient_power]\
  [format %.2f $seizure_power]\
  [format %.2f $burst_power] "
if {($seizure_power > 1000) \
  && ($seizure_power > [expr 5.0 * $transient_power])} {
  Neuroarchiver_print "Seizure on $info(channel_num) at $config(play_time) s." purple
if {($burst_power > 100) \
  && ($burst_power > [expr 0.5 * $transient_power])\
  && ($burst_power > $seizure_power)} {
  Neuroarchiver_print "Wave burst on $info(channel_num) at $config(play_time) s." purple

The wave bursts on No3 in M1264098379 come and go for periods of a minute or longer. They are spread throughout the archive. When they occur, they often do so at roughly 5 Hz, but not always. In places, such as time 40 s, a single burst endures for a full second.

[09-JUN-10] The above script is effective at detecting all wave bursts, but it also responds to glitches too small for our glitch filter to eliminate. These glitches, caused by bad messages, were common before we introduced faraday enclosures and active antenna combiners to the recording system. Wave bursts are easy to detect using the 160-Hz bandwidth of the A3013A and A3019D.

[08-SEP-11] These wave bursts are most likely eating artifacts, which we have since seen in many other recordings.

Exporting Data

[02-AUG-10] We can use a processor script to extract signals from an NDF archive and store them in some other format for analysis by other programs. The following processor exports data to a text file suitable for Excel. It appends the signal from channel n to a file En.txt in the same directory as the playback archive.

The processor goes through all active channels, performs reconstruction on each one, and writes the sixteen-bit sample values to a text file. Each value will be on a separate line. By "active channels" we mean those that contain more than the Neuroarchiver's activity_threshold number of messages in the playback interval.

set fn [file join [file dirname $config(play_file)] \
set export_string ""
foreach {timestamp value} $info(signal) {
  append export_string "$value\n"
set f [open $fn a]
puts -nonewline $f $export_string
close $f
append result "$info(channel_num) [llength $export_string] "

Before we export a signal from an archive, the Neuroarchiver will perform reconstruction upon the raw data. Reconstruction attempts to eliminate bad messages from the message stream and replace missing messages with substitute messages. We describe the reconstruction process in detail here. Reconstruction always produces a messages stream fully-populated with samples, regardless of the number of missing or bad message in the raw data.

[10-OCT-12] The export processor creates a result string for display in the Neuroarchiver text window. The result for each selected channel contains the channel number and the number of samples written to its export file. This result string serves to tell the Neuroarchiver that processing took place.

The following version of the export script stores a filtered versiom of the signal. As always, the Neuroarchiver reconstructs the voltage signal, runs the signal through a glitch filter, and takes the discrete Fourier transform. This export script uses the band-power routine to extract a range of components from the spectrum and apply to these the inverse Fourier transform so as to obtain a band-pass filtered signal. In our example, we select the components from 60 Hz to 160 Hz.

set pwr [Neuroarchiver_band_power 60 160 1 1]
set fn [file join [file dirname $config(play_file)] \
set export_string ""
foreach value $info(values) {
   append export_string "$value\n"
set f [open $fn a]
puts -nonewline $f $export_string
close $f
append result "$info(channel_num) [llength $export_string] "

We specify the band we want to save in terms of its lower and upper bounds in Hertz, which in our example are 60 Hz and 160 Hz. The "1 1" tell the band-power routine to plot the filtered signal and to over-write the existing signal values with the filtered signal values.

[02-JUL-14] The LabChartExporter script is a processor that exports NDF recordings in a form that the Lab Chart program can read in and display.

Importing Data

[22-JUN-11] Our TDT1 (Text Data Translator) transforms text data into the NDF format. It runs in the LWDAQ Toolmaker. The script creates a new NDF file and translates from text to binary NDF sample format. It inserts the necessary timestamp messages. Each line in the input text file must have the same number of voltage values. A check-button in the TDT1 lets you delete the first value if it happens to be the sample time. The values can be space or tab delimited. The translator assigns a different channel number to each voltage. It allows us to specify the sampling frequency and the bottom and top of the dynamic range of the samples. The range will be mapped onto the subcutaneous transmitter system's sixteen-bit integer range.

[10-OCT-12] We update our original TDT1 translator. The translator does not read the entire text file at the start, but goes through it line by line, thus allowing it to deal with arbitrarily large files. It writes the channel names and numbers to the metadata of the translated archive.

[20-MAY-10] We receive from Sameer of CHB a text file containing some EEG data they recorded from an epileptic rat. The header in the file says the sample rate is 400 Hz and the units of voltage are μV. There follows a list of 500k voltages, which covers a period of 1250 s. He asked us to import the data into the Neuroarchiver and try our seizure-finding algorithm on it.

We remove the header from the file we received from Sameer and apply our translator. Our Fourier transform won't work on a number of samples that is not a perfect power of two, and our reconstruction won't work on a sample period that is not an integer multiple of the 32.768 kHz clock period. We don't need the reconstruction, since this data has no missing samples. So we turn it off. We use the Neuroarchiver's Configuration Panel set the playback interval to a time that contains a number of samples that is an exact power of two. We now see the data and a spectrum.

[21-MAY-10] Our initial spectrum had a hole in it at 80 Hz, when we expect the hole to be at 60 Hz. We obtain the transform of the text file data using the following script, and plot the spectrum in Excel. We see a hole at 60 Hz.

set fn [LWDAQ_get_file_name]
set df [open $fn r]
set data [read $df]
close $df
set interval [lrange $data 0 4095]
set spectrum [lwdaq_fft $interval]
set f_step [expr 1.0*400/4096]
set frequency 0
foreach {a p} $spectrum {
  LWDAQ_print $t "[format %.3f $frequency] $a "
  set frequency [expr $frequency + $f_step]

We correct a bug in the translator, whereby we skip a sample whenever we insert a timestamp into the translated data, and our imported spectrum is now correct, as you can see below.

Figure: Voltage and Spectrum of Text File Data.

According to Sameer, the text data was recorded from a tethered rat. Here we see 5.12 s of the signal displayed, and its spectrum. There is a 60 Hz mains stop filter operating upon the signal. We see what appears to be an epileptic seizure taking place. This interval covers 1034 s to 1039 s from the start of the recording.

[19-JUN-11] We receive two text files from Sameer of CHB. They represent recordings made with tethered electrodes. Each line consists of a time, a comma, and a sample value. We assume the time is seconds from the beginning of the file and the sample is in microvolts. The samples occur every 2.5 ms for a sample frequency of 400 Hz. We give our text files names of the form Mxxxxxxxxxx.txt so that the TDT1 script will create archives of the form Mxxxxxxxxxx.ndf, which is what the Neuroarchiver expects.

We run the TDT1 script. We set the voltage range to ±13000 in the text file's voltage units. Because we assume the unit is μV, our range is therefore ±13 mV, which is the dynamic range of an A3019. We produce two archives, one 300 s and the other 600 s long. The script sets up the Neuroarchiver to have default sample frequency 400 Hz and playback interval 1.28 s. These combine to create an interval with 512 samples, which is a perfect power of two, as required by our fast Fourier tansform routine. We play back the archive and look for short seizures. Before we can return to play-back of A3019 archives, we must restore the default configuration of the Neuroarchiver, such as sampling frequency of 512 SPS. The easiest way to do this is to close the Neuroarchiver window and open it again.

[11-FEB-12] Our new TDT3 script allows us to extract up to fourteen signals from a text file database. The first line of the text file should give the names of the signals, and subsequent lines should give the signal values. There should be no time information in the first column. Instead, we tell the translator script the sample rate. We also tell it the full range of the signals for translation into our sixteen-bit format. We used this script to extract human EEG signals from text files we received from Athena Lemesiou of ION.

[27-APR-12] Our new TDT4 script allows us to extract all channels from a text file database and record them in multiple NDF files, with up to fourteen channels in each NDF file. The first line of the text file should give the names of the signals, and subsequent lines should give the signal values. The script records in the metadata of each NDF file the channel names and their corresponding NDF channel numbers within the NDF file. As in TDT3, we can set the full range of the signals for translation into our sixteen-bit format. Copy and paste the script into the LWDAQ Toolmaker and press Execute. The program is not fast: it takes two minutes on our laptop to translate ten minutes of data from sixty-four channels.

[10-APR-13] We improve TDT1 so that it records more details of the translation in the ndf metadata. We fix problems in the Neuroarchiver that stopped it working with translated archives of sample rate 400 SPS.

Heartbeat Recording

[21-JUL-10] We deliver to CHB an SCT system. Reception during initial experiments was 100% within the faraday enclosure for a transmitter on a stick. Here is heartbeat in a live rat, recorded with the transmitter external to the body, and leads into the chest.

Figure: Heartbeat Voltage and Spectrum. Electrodes are inserted in the chest cavity of an unconcious animal. Archive M1279645346.

The heartbeat is the 500-μV spikes at 7 Hz. The lower-frequency rhythm is respiration.

Reception Robustness

The SCT signal transmission is not reliable; some messages will be lost. One source of message loss is collisions between transmitters, which we discuss here. Another is multi-path interference, which we discuss here. We present our efforts to achieve the best possible reception in our Reception page.

Seizure-Like Events

[06-AUG-10] At CHB, we have transmitter No.6 implanted, with electrodes cemented to the skull by silver epoxy. We record 1000 seconds in archive M1281121460. At time 936 s, we observe seizure-like activity, even though the rat is not epileptic. We observe the rat chewing the sanitary sheets shortly before and after we notice the signal.

Figure: Seizure-Like EEG Signal from Non-Epileptic Rat, archive M1281121460, time 936 s, 4-second interval. Electrodes cemented to skull with silver epoxy.

We put the rat in a cage within the enclosure. It roams around. We record 140 seconds in archive M1281122799. We again notice seizure-like activity at various times, including 104 s. We apply our seizure-finding algorithm. It does not identify any seizures. We apply the script to an older archive M1262395837 from ION and detect this seizure during the 4-s interval after time 1248 s. The amplitude of the ION signal is greater and the frequency of the oscillations is higher than in the CHB signal. Peak power is at around 10 Hz in this seizure, while we see no definite peak power in the seizure-like signal.

[08-AUG-10] Alex says the "recorded signal from our rat may be chewing artifact, but if I didn't have the history, I would say these are high voltage rhythmic spikes of Long Evans rats."

[12-AUG-10] In archive M1281455686 recorded by Sameer at CHB, we see EEG from implanted transmitters No8 and No10. At time 1140 s, Sameer's iPhone interferes with radio-frequency reception. Reception from No8 drops to 0%. Reception from No10 drops to 70%. In a 4-s interval we receive 25 bad messages on channel No5. There are a similar number of bad messages in channel No10. The bad messages produce a seizure-like spectrum, as shown below.

Figure: False Seizure Created by Bad Messages. Archive M1281455686, time 1140 s, 4-s interval.

We modify our seizure-detection so that it looks for seizures only when reception is robust (more than 80% of messages received).

[20-AUG-10] The TPSPBP processor calculates reception, transient power, seizure power, and burst power simultaneously for all selected channels. We apply the script to six hours of data starting with archive M1281985381 and ending with archive M1282081048. The following graph shows transient and seizure power during the 21,000 second period.

Figure: Transient and Seizure Power from Implanted No6 Transmitter. First archive M1281985381. Electrodes are two screws in holes in the skull of a live animal.

We see transient and seizure power jumping up for periods of a few minutes, growing more intense as time goes by. According to our earlier work, a seizure is characterized by high seizure power and low transient power, so we doubt that these events are seizures. The peak of seizure power corresponds to the following plot of EEG.

Figure: Seizure-Like Event. Archive M1282081048, time 132 s, 4-s interval. This interval has high seizure and transient power, as measured between 3-30 Hz and 0.2-2 Hz respectively.

The plot shows oscillations at around 3 Hz, but also a swinging baseline, which gives rise to the transient power. We are not sure if this is a seizure or not. Other examples of the signal with high seizure power show larger swings in the baseline, 0.5-Hz pulses, and rapid, but not instant, changes in baseline potential.

[25-AUG-10] Receive another six hours of data from Sameer at CHB, starting with archive M1282668620 and ending with M1282686620. The rat was less active and more asleep during these six hours. We plot signal power here. There are no seizure-like events.

Power Averages

[04-APR-11] Joost of ION asks us to provide a scripts that will calculate the average power in various frequency bands over a period of an hour. We start with a script that calculates the power in several bands of interest. The Power Band Average (PBA.tcl) script for use with the Toolmaker. The script opens a file browser window and we select the characteristics files from which we want to calculate power averages. With the parameters given at the top of the script, we instruct the analysis on how many power bands are in each channel entry, which channels we want to analyze, and the length of the averaging perioed. The averages appear in the Toolmaker execution window. We cut and paste them into a spreadsheet, where we obtain plots of averag power in all available bands.

We used the power average script to obtain this graph of mains hum amplitude versus time as recorded by Louise in her office window in Oxford over ninety hours.

Figure: Mains Hum at Oxon. Recorded in a window with an A3019A from Fri Apr 8 17:19:44 2011 onwards, with a one-day gap.

We see the power averaged over one-minute periods. A twenty-four hour cycle is evident, and some dramatic event in which the transmitter was moved upon its window sill.

[25-AUG-14] We have a new version of the power band average script, PBAv2.tcl. This script also runs in the Toolmaker, but writes output to a file named in the script, and to the Toolmaker text window it writes interval numbers and times. The script assumes the characteristics lines contain channel ID and power bands only, with no reception value included. Zero reception is instead indicated by the string value "0.00" in place of a band power. The script ignores values of "0.00", as it assumes there is no data. We use this script to obtain hourly power averages for Sukhvir Wright at Oxford University.

Short Seizures

[09-JUN-11] We have 67 hours of continuous recordings from two animals at CHB. Both animals have received an impact to their brains, and have developed epilepsy manifested in seizures of between one and four seconds duration. The transmitters are No4, and No5. The recordings cover 09-MAY-11 to 12-MAY-11 (archives M1304982132 to M1305220312). Our objective is to detect seizures with interval processing followed by interval analysis. The result will be a list of EEG events that we hope are seizures, giving the time of each seizure, its duration, some measure of its power, and an estimate of its spike frequency. The list should include at least 90% of the seizures that take place, and at the same time, 90% of the events in the list should be genuine seizures.

We sent a list of 105 candidate seizures on No4 to Sameer. He and Alex looked through the list declared 83 of them to be seizures, while the remaining 22 were not. On the basis of these examples, we devised our ESP1 epileptic seizure processor and our SD3 seizure detection.

Figure: Confirmed Seizure Events (Top Row) and Non-Seizure Events (Bottom Row). Each graph shows 4 s of recording with 200 μV/div. Click for higher resolution.

The confirmed seizures all contain a series of spikes. The other seizure-like oscillations can be large, like confirmed seizures, but they do not contain spikes. In the Fourier transform of the signal, spikes manifest themselves as harmonics of the fundamental frequency, as we can see in the following plot of the signal and its transform.

Figure: One-Second Segment of Confirmed Seizure.

The fundamental frequency of these short seizures varies from 5 Hz to 11 Hz. Because we want to measure the seizure duration to ±0.5 s, we must reduce our playback interval to 1 s. Thus our Fourier transform has components only every 1 Hz, and our frequency resolution will be ±0.5 Hz. Our first check for a seizure is to look at power in the range 20-40 Hz, which corresponds to the third and fourth harmonics of most seizures. Thus our seizure band is 20-40 Hz, not the 2-20 Hz we worked with initially. A 1-s interval is a canditate seizure if its seizure band amplitude exceeds 35 μV (which for the A3019D is 15 k-sq-counts in power).

We eliminate transient events by calculating the ratio of seizure power to transient power, with transient band 0.1-2.0 Hz. For the weakest seizures, we insist that the seizure power is four times the transient power. For stronger seizures, detection is less difficult, and we will accept transient power half the seizure power.

The final check for a seizure event is to confirm that it is indeed periodic with frequency between 5 Hz and 11 Hz. We find the peak of the spectrum in this range, and determine its amplitude. We require that the fundamental frequency amplitude be greater than 40 μV (which for the A3019D is 100 counts).

With these criteria applied to both the No4 and No5 recordings, we arrive at two lists of seizures, almost all of which contain spikey signals that we give every appearnce of being seizures. There are 529 seizures in 67 hours on No4, and 637 seizures on No5.

[19-APR-12] The CEL1 (Consolidate Event List) script reads in an event list and consolidates consecutive events. The output is a list of consolidated events and their durations.

Baseline Calibration

[17-JUN-14] Here we give a history of our efforts to perform baseline calibration, which is to measure the power of normal, or baseline, activity in our recording so as to account for variations in the sensitivity of our recording device. It turns out that our best definition of baseline power in EEG is "minimum power", and this definition sometimes fails us. The Neuroarchiver provides support for baseline calibration, and this calibration we have used with good effect on tens of thousands of hours of EEG recordings. But the ideal experiment would be one in which the sensitivity of our recording devices varied by no more than a factor of two from least sensitive to most sensitive, and in that case, we should be able to avoid the extra step and complication of baseline calibration, and instead rely upon an absolute power threshold for event detection or identification.

[09-JUN-11] The seizure-detector we presented in the section above used absolute power values as thresholds for the detection of seizure signals. The analysis will be effective with the recordings from multiple transmitters only if all transmitter electrodes are equally sensitive to the animal's EEG. In practice, we observe that the sensitivity of electrodes can vary by a factor of two from one animal to the next, as each animal grows, and in the final days of a transmitter's operating life.

Our study of baseline power above shows that during the course of sleeping and waking, a rat's EEG remains equal to or above a baseline power. The minimum power we observe is therefore a useful measure of baseline power. When we have a measure of baseline power, we can identify EEG events using the ratio of signal power to baseline power instead of the absolute signal power.

Our algorithm for baseline calibration is to as follows. First, we choose a frequency band that contains the power associated with the events we want to find. For seizures, we might choose 2-20 Hz as our event band. Looking at this graph we expect the minimum power in the event band to be around 4 k-square-counts for EEG recorded in a rat with an A3019D. When we start processing a recording, we set our baseline power to twice this expected value. As we process each playback interval, we compare the event power to our baseline power. If the event power is less, we set the baseline power to the new minimum. If the event power is greater, we increase the baseline power by a small fraction, say 0.01%. Increaseing at 0.01%, the baseline power will double in two hours if it is not depressed again by a lower event power. The processor records the baseline power in the interval characteristics, so that subsequent analysis can have avaliable to it the correct baseline power as obtained from the processing of hours of data.

By this algorithm, we hope to make our event detection independent of electrode and transmitter sensitivity. As a pair of electrodes becomes less sentitive with the thickening of a rat's skull, we reduce our baseline power through the observation of lower and lower event power. As a transmitter becomes more sensitive with the exhaustion of its battery, the fractional growth of our baseline power allows us to find the new minimum event power.

[26-NOV-12] The Neuroarchiver provides baseline power variables for processor scripts to use. The baseline power for channel n should be stored in info(bp_n). We can edit and view the baseline powers using the Neuroarchiver's Baselines button. We implement our baseline calibration algorithm in the BPC1 processor, where we use the baseline power variables in the processor script.

[15-APR-14] We have recordings of visual cortex seizures from ION, both single and dual channel. The following plot shows the minimum and average hourly power measured in eight-second intervals during 450 hours of recording from transmitter No8, M1361557835.ndf through M1367821007.ndf recorded in the ION bench rig.

Figure: Baseline Power Evolution for an Epileptic Rat. For each hour, we plot the average power and the minimum power observed for eight-second playback intervals.

At time 191 hr, the minimum power drops to 1.3 k-sq-counts while the average power remains around 30 k-sq-counts. During this hour, the reception is excellent, and the animal is having seizures. At 410 hours, the average power has risen to 200 k-sq-counts while minimum power drops to 1.13 k-sq-counts. But in this period, there are two No8 transmitters operating, and one of them is showing large transient pulses. The distribution of eight-second playback interval power during the entire 450-hour recording is shown here.

[02-JUN-14] In the figure above, there are several hours in which the minimum EEG power drops far below what is usual. At time 191 hr the minimum power is 1.3 k-sq-counts. There is one interval in which EEG appears to be depressed after an extended seizure.

Figure: Depressed EEG After Seizure. This is interval 688-696 s in archive M1362246369, Channel 8. Recording from visual cortex following extended seizure. Power in 1-160 Hz is 1.3 k-sq-counts. Electrodes are a wire inserted through a hole and a 1.6-mm diameter screw.

A depressed interval drops our baseline power to the point where almost all subsequent intervals exceed our threshold for event detection. We found several such intervals in the same hour. The majority of the hour was occupied by vigorous epileptic activity like this.

Meanwhile, the power of the electrical noise on the transmitter input increases with the impedance of the electrodes. In the extreme case, when the electrode leads break near the electrode, the noise looks like this. The figure below shows an eight seconds of normal mouse EEG recorded with M0.5 screws.

Figure: Normal Mouse EEG. Recorded with two M0.5 screws.

The spectrum of this interval is 1/f (pink) from 1-20 Hz and constant (white) from 20-160 Hz. The pink part reaches an amplitude four times higher than the white part. In the depressed rat EEG interval, the pink region is also 1-20 Hz, but the pink part reaches an amplitude twenty times higher than the white part. The spectrum of the baseline signal affects our choice of event power threshold, because white noise power varies far less than pink noise power over a finite interval of time. In mice we use a threshold of three times the baseline power, while in rats we use five times.

The existence of depressed intervals upsets our use of minimum power for our baseline calibration. The fact that the lowest-power intervals can be interesting upsets our use of high power as our trigger for event classification. The fact that the spectrum of noise varies with animal species and electrode size makes our power threshold dependent upon these factors also. The baseline calibration we have attempted to implement may be impractical. We are not even sure we can agree upon the definition of baseline power. We are considering abandoning baseline power calibration and replacing it with a straightfward, absolute, logarithmic power metric.

[09-JUN-14] Here is an absolute power metric that covers the full dynamic range of our subcutaneous transmitters. Its slope is greatest at 100 k-sq counts, or 90 μV rms in the A3028A transmitter, which is the root mean square amplitude of typical seizures.

Figure: Absolute Power Metric.

The metric, mpwr, is a sigmoidal function of signal power, P. With P in units of k-sq-counts we have, mpwr = 1 / (1 + (P/100)−0.5).

Eating and Hiss

[23-FEB-12] Readers may dispute the classification of EEG events as "eating" or "hiss" or "seizure" in the following passage. We do not stand by the veracity of these classification. Our objective is to demonstrate our ability to distinguish between classes of events. We could equally well use the names "A", "B", and "C".

[08-AUG-08] The following figure presents three examples each of hiss and eating events in the EEG signal recorded with an A3019D implanted in a rat. We are working with the signal from channel No4 in archive M1300924251, which we received from Rob Wykes at ION. This animal has been injected with tetanus toxin, and we assume it to be epileptic.

Figure: Confirmed Hiss Events (Top Row) and Eating Events (Bottom Roww). Each graph shows 1 s of recording with 200 μV/div. Click for higher resolution.

Hiss events are periods of between half a second and ten seconds during which the power of the signal in the high-frequency band (60-160 Hz) jumps up by a factor of ten. We observe these events to be far more frequent in epileptic animals, and we would like to identify, count, and quantify them. The eating events occur when the animal is chewing. Eating events also exhibit increased power in the high-frequency band, but they are dominated by a fundamental harmonic between 4 and 12 Hz.

Our job is to identify and distinguish between hiss and eating events automatically with processing and analysis. We find our task is complicated by the presence of three other events in the recording: transients, rhythms, and spikes.

Figure: From Left to Right: Transient, Rhythm, and Spike Events. Each graph shows 1 s of recording with 200 μV/div. Click for higher resolution.

A transient event is a large swing in the signal. A spike event is a periodic sequence of spikes, which we suspect are noise, but might be some form of seizure. A rhythm is a low-frequency rumble with no high-frequency content. For most of the time, the EEG signal exhibits a rhythm that is obvious as a peak in the frequency spectrum between 4 and 12 Hz, but sometimes this rhythm increases in amplitude so that its power is sufficient to trigger our event detection.

The Event Detector and Identifier Version 2, EDIP2, processor detects and attempts to distiguish between these five events. Read the comments in the code for a detailed description of its operation. The processor implements the baseline calibration algorithm we describe above. It is designed to work with 1-s playback intervals. In addition to calculating characteristics, the processor performs event detection and identification analysis immediately, so that you can see the result of analysis as you peruse an archive with the processor enabled.

The processor defines four frequency bands: transient (1-3 Hz), event (4-160 Hz), low-frequency (4-16 Hz), and high-frequency (60-160 Hz). When the event power is five times the baseline power, the processor detects an event. The processor records to the characteristics line the power in each of these bands. These proved insufficient, however, to distinguish eating from hiss events, so we added the power of the fundamental frequency, just as we did for our detection of Short Seizures.

These characteristics were still insufficient to distinguish between spike and eating events, so we added a new characteristics called spikiness. Spikiness is the ratio of the amplitude of the signal in the event band before and after we have removed all samples more than two standard deviations from the mean. Thus a signal with spikes will have a high spikiness because the extremes of the spikes, which contribute disproportionately to the amplitude, are more than two standard deviations from the mean. We find that baseline EEG has spikiness 1.1 to 1.2. White noise would have spikiness 1.07. Eating events have spikiness 1.2 to 1.5. Spike events are between 1.5 and 2.0. Hiss events are between 1.1 and 1.3. Thus spikiness gives us a reliable way to identify spike events.

The processor uses the ratio of powers in the various bands, and ranges of spikiness, to identify events. After analyzing the hour-long recording and examinging the detected events, we believe it to be over 95% accurate in identifying and distinguishing hiss and eating events. Nevertheless, the processor remains confused in some intervals, such as those we show below.

Figure: Identification Errors. From left to right: a weak eating event mistaken for a rhythm, a combination of hissing and spikes mistaken for an eating event, an unusual eating event mistaken for a hiss event. Each graph shows 1 s of recording with 200 μV/div. Click for higher resolution.

The processor does a good job of counting hiss and eating events in an animal treated with tetanus toxin. We are confident it will perform well enough in control animals and in other animals treated with tetanus toxin.

[10-AUG-11] Despite the success of the Epileptic Seizure Processor Version 1, ESP1, and the Event Detector and Identifier processor Version 2, EDIP2, we are dissatisfied with the principles of event identification we applied in both processors. It took many hours to adjust the identification criteria to the point where their performance was adequate. Furthermore, we find that we are unable to add the detection of short seizures to the processor, nor the detection of transient-like seizures we observe in recent recordings from epileptic mice.

Similarity of Events

[08-SEP-11] The Event Classifier provides us with a new way to identify events. The ECP1 (Event Classification Processor, Version 1) produces eight characteristics for each channel. The first is a letter code that we can use to mark likely intervals at the time of processing. The second is the baseline power, which we obtain from the processor's built-in baseline calibration. The remaining six are values between 0.000 and 1.000 that we call metrics. The metrics are independent characteristics of the interval.

Of the six metrics, the first is the event power metric, and it is this one that we use to detect the occurance of an event. We call it the event trigger metric. When the event power is five times greater than the baseline power, the event power metric reaches a value of 0.5, and the Classifier detects an event. It uses all six metrics to classify the type of the event, but the initial detection is performed on the basis of event power alone.

The metrics produced by ECP1 are event power, transient power, high-frequency power, spikiness, asymmetry, and intermittency. Event power is a sigmoidal function of the ratio of the power in the 4.0-160.0 Hz band to the baseline power, with a ratio of 5.0 giving us a metric of 0.5. Transient power is a sigmoidal function of the ratio of power in the 1-3 Hz band to the baseline power, with a ratio of 5.0 giving us a metric of 0.5. High-frequency power is a sigmoidal function of the ratio of power in the 60-160 Hz band to power in the 4.0-160 Hz band, with a ratio of 0.1 giving us a metric of 0.5. Spikiness is a sigmoidal function of the ratio of the range of the 4.0-160.0 Hz signal to its standard deviation, with a ratio of 8.0 giving us a metric of 0.5. Asymmetry is a sigmoidal function of the number of points more than two standard deviations above and below the mean in the 4.0-160 Hz signal. When these numbers are equal, the asymmetry is 0.5. When the number above is greater, the asymmetry is greater than 0.5. Thus downward spikes give us asymmetry of less than 0.5. The intermittency metric is a measure of how much the high-frequency content of the signal is appearing and disappearing during the course of the interval. We rectify the 60-160 Hz signal, so that negative values become positive, then take the Fourier transform of this rectified signal. The power in the 4.0-16.0 Hz band of this new transform is our measure of intermittency power. Our intermittency metric is a sigmoidal function of the ratio of intermittency power to high frequency power. The metric has value 0.5 when the intermittency power is one tenth the power in the original 60-160 Hz signal.

All six of these metrics are invertible, in that we can recover the original event, transient, high-frequency, and intermittency powers, as well as the spikiness and asymmetry by inverting the sigmoidal functions with the help of the baseline power. An example of a Toolmaker script that performs such an inversion is CMT1.

The Event Classifier allows us to build up a library of events that we have classified visually. Once we have a large enough library, we use the metrics to compare test events with the reference events. The reference event that is most simliar to the test event gives us the classification of the test event. If the nearest event is a hiss event, we assume the test event is a hiss event also. Our measure of the difference between two events is the square root of the sum of the squares of the differences between their metrics.

With six metrics, such as produced by the ECP1 classification processor, we can imagine each event as a point in a six-dimensional space. The difference between two events is the length of the six-dimensional vector between them. Because metrics are bounded between zero and one, the Classifier works entirely within the unit cube of our six-dimensional space. We call this unit cube the classification space. A test event appears in the classification space space along with the reference events from our library. We assume the test event is of the same type as its nearest neighbor.

The map provided by the Event Classifier shows us a projection of the classification space into a two-dimensional unit square defined by the x and y metrics selected by the user for the map map. We see the reference events as points, and the test event also. By selecting different projections, we can get some idea of how well the metrics are able to separate events of different types. But the picture we obtain from a two-dimensional projection of a higher-dimensional classification space will never be complete. Even with three metrics, we could have events of type A forming a spherical shell, with events of type B clustered near the center of the shell. In this case, our projections would always show us the B events right in the middle and among the A events, but the Classifier would have no trouble distinguishing the two by means of its distance comparison.

We apply ECP1 and the Event Classifier to archives M1300920651 and M1300924251, which we analyzed previously in Eating and Hiss. You will find the ECP1 characteristics files here and here. These archives contain signals from implanted transmitters 3, 4, 5, 8, and 11. We build up a library of a hundred reference events using No4 only. The library contains hiss, eating, spike, transient, and rhythm events as we showed above. Also in the library are rumble events, which are like rhythms but less periodic. Some events are a mixture of two overlapping types, and when neither is dominant we resort to calling them other. We find the same dramatic combination of downward transient, hiss, and upward spikes a dozen times in the No4 recording, and we label these seizure.

Figure: Three Events. From left to right: rumble, other, and something we decided to call a seizure. Signals appear on No4 in Archive M1300924251.

We apply Batch Classification to the entire two hours of recording from no4, classifying each of two thousand one-second events using our library of one hundred reference events. We go through several hundred of the classified events and find that the classification is correct 95% of the time. The remaining 5% of the time are events whose appearance is ambiguous, so that we find ourselves going back and forth in our visual classification. The Classification never fails on events that are obviously of one type or another.

We apply our reference library to the recording of No3. We add a few new reference events to describe transients suffered by No3 but not by No4. After this, we obtain over 95% accuracy on No3, as with No4. For No5 we add some surge events not seen on No4, and for No11 we add more eating events because this transmitter sees eating artifact upside down compared to the other channels, suggesting its electrodes were reversed. We also observe some seizure-like rhythm events, and we add a couple of examples of these to the library. We do not need to add new events to classify No8. You will find our final library here.

Figure: Three Events. From left to right: transient on No3, surge on No5, and upside-down eating on No11. Found in archives M1300920651 and M1300924251.

Transmitter No11's baseline power is only 0.5 k-sq-c compared to No3's 3.4 k-sq-c, meaning the baseline EEG amplitude is less than half that of No3. But the baseline calibration provided by ECP1 overcomes this dramatic difference, and we find our reference library is effective at classification in No11.

We repeat our classification procedure on our short seizure recording, which contains data from two transmitters, No4 and No5. Both animals show frequency powerful rhythm events, but No4 contains far more seizures than no5. These rhythms and seizures have a low proportion of high-frequency power. But the rhythm events tend to be symmetric, while the seizure events are downward-pointing. It is the assymetry metric that allows us to distinguish between the two. With a library of only thirty-five events, we obtain over 95% accuracy at distinguishing between seizures and rhythms. Indeed, out of hundreds of rhythm events, only one or two are mistaken for seizures, and event then, we're not certain ourselves that these unusual events are not seizures.

Upon two entirely independent data sets, our method of similar events proves itself to by over 95% accurate. In once case it succeeded in classifying thousands of events from five transmitters into nine different types. In another case it distinguished between rhythms and short seizures with no difficulty.

Harmonics and Fundamentals

When we look at the Fourier transform of a signal, we are almost always looking at the amplitude of the components plotted against frequency. We call this the spectrum of the signal. It is almost impossible for the human eye to make sense of the graph of phase plotted against frequency. Nevertheless, this second graph is essential to the transform. When we ignore it, we are throwing away precisely half of the information contained in the transform. When we look at pendulum waves, for example, we see how the same frequency oscillations combine to give a wide variety of patterns depending upon their relative phases.

Thus the spectrum of a signal is not unique to the signal. There will be many signals with almost exactly the same spectrum, even though the original signals are dramatically different in their appearance. When we use the spectrum alone to detect events, we may find that we cannot distinguish between the events we are looking for and a far larger number of irrelevant events that have the same spectrum. In this section we present examples of how the Fourier transform can be misused when applied to EEG.

[04-JAN-12] We are working with Athena on the detection of high-frequency oscillations in human EEG. In the introduction to Unsupervised Classification of High-Frequency Oscillations in Human Neocortical Epilepsy and Control Patients (Blanco et al., Journal of Neurophysiology 104: 2900Ð2912, 2010), the authors write, "In the epilepsy research community, there is mounting interest in the idea that high-frequency oscillations (HFOs) − narrowband transients recorded on the intracranial electroen-cephalogram (iEEG), having predominant frequency between roughly 100 and 500 Hz and lasting on the order of tens of milliseconds − can be used to identify epileptogenic brain tissue." To find these HFOs in their own recordings, they use a procedure similar to our Event Classifier, with metrics that rely partly upon the Fourier transform of the signal and partly upon non-linear features of the signal. At ION, our collaborators are working on detecting HFOs in human EEG, and they are wondering if looking at the power in various narrow bands between 100 Hz and 500 Hz will suffice as a detection mechanism.

The following plot is a 2-s segment of four channels of a human EEG recording we received from Athena Lemesiou, along with its discrete Fourier transform.

Figure: Two Seconds of Four Channels of Human EEG with Discrete Fourier Transform. The two seconds of data contain 1024 samples from each of four channels. Each channel is the difference in potential between two neighboring electrodes on the subject's brain.

We now take the central 1 s of the above recording, which contains some rapid fluctuations. We have 512 samples from each of the four channels. We take the discrete Fourier transform of the signal for each channel, remove all components below 80 Hz and above 200 Hz, then take the inverse transform.

Figure: One Second of Four Channels of EEG Band-Pass Filtered to 80-200 Hz.

The impression one might get from the above plot is that there are 100 Hz oscillations in the signal. But here is the same signal band-pass filtered to 40-200 Hz.

Figure: The Same Second of Four Channels of EEG Band-Pass Filtered to 40-200 Hz.

There are no 100 Hz oscillations. The signal contains oscillations of roughly 50 Hz. These vary in frequency during the one-second interval. These oscillations are asymmetric. In the Fourier transform, frequency variation and waveform asymmetry manifest themselves as frequency components at double the oscillation frequency. We call these double-frequency components the "second harmonics". The main, 50 Hz, components are the "fundamental frequencies" or "first harmonics". If you remove the first harmonics you are, of course, left with the second and higher harmonics. Here is the same EEG interval filtered to 120-200 Hz.

Figure: The Same Second Showing One Channel of EEG Band-Pass Filtered to 120-200 Hz.

Here is a 64-Hz square wave we imported into the Neuroarchiver. You can see the spectrum of the oscillation on the right, and on the left the oscillation itself in green. We plot the third harmonics on its own in purple. We extracted them with a 160-180 Hz band-pass filter.There is no second harmonic because the waveform is symmetric. But there is a third harmonic because the edges of the oscillation are sharp.

Figure: A Square Wave In Its Original Form (green) and Band-Passed Filtered to 160-180 Hz (pink). The signals are on the left, and the Fourier transform on the right.

We cannot use bandpass filters to deduce the presence of oscillations at a particular frequency. By definition, a harmonic is not an oscillation. The oscillation is the sum of all the harmonics. These harmonics emerge when we transform the original signal into an abstract place called the "frequency domain". We are permitted to say that there are "oscillations at 100 Hz" only if there is no fundamental harmonic at 50 Hz.

[22-MAR-12] In Pitfalls of high-pass filtering for detection of epileptic oscillations, Clinical Neurophysiology, 2010, Benar et al. make the same argument we present above. In Continuous high-frequency activity in mesial temporal lobe structures, Epilepsia, 2012, Mari et al. apply an 80-Hz high-pass filter to epileptic human EEG and observe continuous high-frequency oscillations in the output of the filter. The examples below show how this will always be the case. We take one second of EEG from a non-epileptic, active mouse (recorded at Oxford University, 25-JUL-11, archive M1311606907, we picked 40-41 s at random) and apply three sharp high-pass filters: 40 Hz, 60 Hz, and 80 Hz. In each case, we see the apparant frequency of oscillation is about 20 Hz above the filter cut-off frequency.

Figure: One Second of Baseline Mouse EEG, Sharp High-Pass Filtered. From left to right: 40 Hz, 60 Hz HPF, 80 Hz. We perform the filtering by removing all components of the Fourier transform above the cut-off frequency. The dark blue trace in the background is the EEG signal, unfiltered. The light blue trace in the foreground is the filtered trace, amplified by a factor of ten. Click on individual images for larger view.

When we apply a step function to a filter, its output will ring afterwards, at a frequency close to the cut-off frequency. The ringing continues for longer as the filter gets sharper.

Figure: One Second of Baseline Mouse EEG, Gentle High-Pass Filtered. The same as previous figure, except the high-pass filter has a 20-Hz cut-off region instead of a 0-Hz cut-off region.

Filters with a more gentle cut-off produce ringing of smaller amplitude that decays more quickly. The images above show the same data we presented above, with the same cut-off frequencies, except that the high-pass filter allows a fraction of components less than 20 Hz below the cut-off frequency to remain in the filtered signal. The frequency response of the filter, in terms of amplitude, looks like a straight line from zero at 20 Hz below the cut-off, to one at the cut-off. Despite the gentle nature of this filter, we see that the same filter-generated oscillations remain in the signal.

Adequate Representation

In the previous section we showed that the Fourier transform's amplitude spectrum alone was insufficient to identify high-frequency oscillations (HFOs) in EEG. In general, no matter what the appearance of the original signal, there will be other signals of dramatically different appearance with exactly the same amplitude spectrum. Their Fourier transforms are distinguished by the phases of their frequency components, not by the amplitudes. Thus the amplitude spectrum is not an adequate representation of the EEG signal for the identification of HFOs. In this section we present general-purpose definition of adequate representation that applies equally well to our Event Classifier and the Fourier spectrum.

Suppose we have an interval of EEG or some other signal, made up of M samples. Each sample xi has a one-dimensional value. In the case of the SCT system, the samples have integer values between 0 and 65535 as produced by the sixteen-bit analog to digital converters on the transmitters. We can think of each interval as being a point, x, in an M-dimensional space. Let S be the set of all possible intervals. In the case of the SCT system, S is an M-dimensional cube of side 65535 with one corner at the origin.

Let AS represent the set of all intervals of a particular type. Thus A might represent the set of all intervals that contain one or more HFOs, or it might represent the set of all intervals that contain seizure spikes. For any interval xS, we can examine the interval ourselves and determine whether x is in A or not. In the SCT system, we would look at x as it appears in the voltage versus time plot of the Neuroarchiver. Complimentary to A, we have a set SA which contains all intervals that are not members of A. Obviously, AS−A = S and AS−A = ∅.

Our objective is to pick out from a large number of intervals those that are members of A. We can do this by eye, but it takes several seconds per interval, and if we have a million intervals, the task becomes impractical. Thus we set out to perform the classification with a computer. This we do by transforming each interval into another form, and classifying the interval based upon some simple proximity or threshold rules we apply to this other form. In general, we transform points in S into points in another space T. In Similarity of Events we transform 512 consecutive sixteen-bit samples into six real-valued metrics, each of which is a number between zero and one. Thus T is a six-dimensional unit cube with one corner at the origin. In Harmonics and Fundamantals we transform 512 consecutive sixteen-bit samples into 256 components of an amplitude spectrum. Thus T is a 256-dimensional cube with one corner at the origin. (The sides of this cube have length 65535 × √2, but that's not so obvious.)

Let F be a function such that for any xS we have y = F(x) is the transform of x, with yT. We call F our transform function. Furthermore, we write F(A) to represent the set of all points yT such that there exists xA for which y = F(x). Thus F(A) is the transformed version of A. In the same way, F(S−A) is the transformed version of S−A.

The discrete Fourier transform of an interval, including both amplitude and phase information, is a one-to-one transformation of points in the time domain, S, into points in the frequency domain, T. Given any point in T we can deduce the one and only point in S from which that point could arise. The function F is invertible, which is to say that F−1(y) exists, where F−1(F(x)) = x. If we use only the amplitude spectrum of the Fourier transform, however, the transformation is no longer one-to-one. The function F−1(y) does not exist, and for any yT there are many xS such that F(x) = y. When we transform intervals into six metrics, it is clear that we cannot deduce the original signal from the transformed point. What we can hope for, however, is that y = F(x) contains enough information for us to deduce whether or not xA.

By construction, we have AS−A = ∅ and AS−A = S. We don't care if F(A)F(S−A) = F(S) = T. Indeed, when we restrict S to a finite number of recorded intervals, we will find that F(S) is a proper subset of T. But we certainly prefer to have F(A)F(S−A) = ∅. If not, there exist points yF(A)F(S−A) such that y is the transform of at least one point in A and one point in S−A. In other words, we cannot use y to determine if xA. We say F is ambiguous with respect to A. The set of points xS such that F(x)F(A)F(S−A) is the ambiguous region of S with respect to A. The set F(A)F(S−A) is the ambiguous region of T with respect to A.

If we are to use F to determine if xA, the function must be unambiguous with respect to A. If F is invertible, then it will certainly be unambiguous with respect to A. But we want to work with functions that greatly reduce the amount of information our computer must deal with when applying our classification rules. In that case, F will not be invertible. We can, however, prove that F is unambiguous with respect to A by going through all members of S, inspecting each by eye, transforming them, and constructing the ambiguous region of S with respect to A. If this region is empty, then F is unambiguous with respect to A.

But in practice, S will contain too many members for us to inspect all of them. If S is the set of all possible combinations of 512 sixteen-bit samples, for example, it will contain 65536512 points. Even if we restrict S to the set of all such combinations that we have actually recorded from a laboratory animal, the number of points in S will still be too large, because our original motivation for applying F to S was that it would take too much effort to inspect each and every point in S. In practice, therefore, F will not be invertible and it will be impractical to classify all members of S by visual inspection. But the requirement that F be unambiguous with respect to A remains.

Thus we need a practical procedure to show that F is unambiguous with respect to A. To do this, we will assume that S contains a finite number of points. This will indeed be the case in practice, even if that finite number is tens of millions of seconds of recorded signal. We now place further conditions upon F. Loosely speaking, we require that F be such that we can define F(A) with a finite set of spheres. In more precise language, F must be such that there exists a finite set of points LT and a radius r > 0 such that the union of all spheres of radius r centered on the members of L, which we will call U, is such that F(A)U and F(S−A)U = ∅. When such an L and r exist, we say that F separates A, and that U provides an adequate representation of A in F. We see that F separates A implies that F is unambiguous with respect to A.

Our task, therefore, is to find an adequate representation of A in F, because if we can do that we show at once that F is unambiguous with respect to A and we know F can separate A from S−A. Given any xS, we check to see if F(x)U, where U is our adequate representation of A in F, and if so, we know xA, but if not, we know xS−A.

In certain cases, we may find that there are other ways to determine if F(x)U besides going through all the spheres of U and checking to see if it contains F(x). For example, if one of our metrics is high-frequency power and another is intermittency, we might say that x is an HFO if the high-frequency power, intermittency, and purity of the signal all exceed certain thresholds, or lie within certain bounds. But it could be that the threshold for high-frequency power should be a function of the intermittency, in which case our fixed-threshold classification would perform poorly. Even if constant thresholds could give us effective classification, we would still be left with the task of determining the values of the thresholds. Indeed, after many hours spend adjusting such thresholds, we concluded that classification with constant thresholds is impractical.

We are left with the task of composing a list of points, LT, and choosing a radius, r, that together will provide us with an adequate representation of A in F. Consider the following separation algorithm.

  1. Let r = ∞, let LT be empty, and let VF(S−A) be empty. We define U as the union of all spheres of radius r centered upon the members of L.
  2. Pick at random some xS. Inspect x. If xA and F(x)U then add F(x) to L. If xA, add F(x) to V.
  3. Decrease r until UV = ∅
  4. Go back to Step 2.

If it is true that F separates A, this algorithm will eventually provide us with an adequate representation of A in F. If we find, after inspecting a sufficiently large number of points in S, that r has converged to some final value, and we are no longer adding points to L, we can be confident that we have an adequate representation. We will not be certain, but we can at least be confident.

In the case where A is a very small subset of S, such as when we are looking for rare features in a recorded signal, our algorithm is not efficient in its use of inspector time. It requires the inspector to examine a large number of uninteresting events. If possible, we should devise some simple criterion for initial rejection before inspection. The Event Classifier, for example, uses its first metric for initial rejection. If this metric is less than one half, the Event Classifier assumes that the interval is not an event (it assumes xA). Furthermore, the algorithm is inefficient in its use of the points in L because it allows them to extend influence only over a distance r. The Event Classifier overcomes this inefficiency by using a nearest-neighbor search through L to determine if F(x)F(A) or F(x)F(S−A). Thus the points in L have influence over all points in S to which they are nearest, and therefore all points in S are included in the sphere of influence of at least one member of L.

High Frequency Oscillations

We would like to detect wave packets of center-frequency 100-200 Hz in human EEG recordings. In the epilepsy community, these wave packets are called HFOs (high-frequency oscillations). Our sample EEG recordings come from intracranial electrodes implanted in the brains of adult patients with epilepsy. The recordings contain seizures, rumble, and mains hum. On some channels we see one or two HFOs per minute, but on others we see none. We would like to ignore the non-HFO features and extract from the signal the wave packets. At the same time, we are aware of how easily we can create wave packets by the action of high-pass filtering, as we discuss above. We resolve not to classify the EEG intervals by looking at a filtered version of the signal. We will look at the original signal.

Figure: HFO in Human EEG. Interval of 0.5 s and 1024 SPS. We can see the oscillation beginning at around 0.15 s and continuing until 0.3 s. We are able to see these small fluctuations because the spikes and swings in the underlying signal are not so large as to obscure the fluctuations from our eyes. We count roughly fifteen evenly-spaced fluctuations in 150 ms. The spetrum shows a matching peak at 100 Hz. We note that 100 Hz is the second harmonic of the local mains frequency. We do not know if these fluctuations are mains hum or neurological activity.

Although we will not use filtering in our visual identification of HFOs, we will use the power in the 80-260 Hz band to pick out intervals that might be HFOs. These intervals will be our events and the 80-260 Hz band will be our event band. Our hope is that we can train a computer to classify events as HFO or non-HFO. The upper limit of 260 Hz makes no sense for signals recorded at 512 SPS, but in this case our EEG data was recorded at 1024 SPS and filtered to 268 Hz. The power of our HFOs should lie entirely within the event band. We will use event-band power for baseline calibration. Our event-power metric is a measure of the ratio of the event power to the baseline power. After some examination of the data, we set our event trigger at ten times the baseline power.

We found that symmetry and spikiness were not useful in detecting HFOs, neither when applied to the 80-260 Hz signal nor to the 10-40 Hz signal. But intermittency remains useful, this being the power of fluctuations in the de-modulated event-band signal. We devised a new metric to measure the purity of a wave packet. A wave packet has a gaussian-shaped frequency spectrum. Our purity metric is a measure of how narrow this peak is compared to its center frequency. A sustained sinusoid will have maximum purity and a spike will have minimum purity. A wave packet is somewhere in-between.

Our ECP2 classification processor implements these three metrics. In the figure below we see a classification library arranged by purity and intermittency. The other two combinations of metrics give similar separation of HFO and Other events. All three metrics are necessary for adequate separation of the two classes.

Figure: HFO Classification Library. Purple events are non-HFO events called Other and orange are HFO events. We arrange the library by purity and intermittency. The white square is a current event that is classified as Other.

We built up our classification library by starting with channel No1 in an archive P9_HG_N4_REF_1.ndf for which ION had confirmed twelve HFOs. We used six of these in our library three others we found ourselves. We included twenty Other events from channel No1. We applied the library to all fourteen channels in the same archive. We found 5 New events and classified these. We applied the library again. We found 0 New events, 24 HFO, and 175 Other. Going through the 24 HFO events we find they all match our visual criteria. Our false positive rate among detected HFO events is less than 1/24 = 4%. Our false positive rate among randomly-selected 0.5-s intervals is less than 1/16800 = 0.006%. Going through the 175 Other events, we find seven or eight that we think should have been classified as HFO. These are either too close to the edge of the interval or accompanied by large artifacts that make detection difficult. Because we have roughly 32 HFO events and we miss 8 of them our false negative rate is 25%.