Saturday, April 24, 2010

setiQuest Kepler-Exo4 1420 MHz

This analysis is of the setiQuest Kepler Exoplanet 4 data file with the baudline signal analyzer. The quadrature data file has a base frequency of 1419.4464 MHz and a sample rate of 8.738133 Msamples per second. Not much information is given about this data file but I assume it is an observation of the Kepler Mission satellite collected at the Allen Telescope Array. But the SNR is far too low to be the telemetry of a near Earth satellite so the signal source could be the Kepler-4 planet (KIC 11853905). Will need source confirmation from the SETI Institute since they collected the signal. In any case there is some interesting stuff happening in this data file.

The following command line was used to stream the Kepler Exoplanet 4 data file into baudline:

cat 2010-01-22-kepler-exo4-1420mhz.dat | baudline -session setiquest -stdin -format s8 -channels 2 -quadrature -flipcomplex -samplerate 8738133 -fftsize 65536 -pause -utc 0
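For anyone who wants to poke at the same file outside of baudline, here is a minimal Python sketch of what the -format s8 -channels 2 -quadrature options describe: interleaved signed 8-bit pairs turned into complex samples. The function name load_s8_iq is made up for illustration, and the guess that -flipcomplex simply swaps the two channels is an assumption, not something taken from baudline's documentation.

```python
import numpy as np

def load_s8_iq(raw: bytes, flip: bool = True) -> np.ndarray:
    """Turn interleaved signed 8-bit sample pairs into complex samples."""
    pairs = np.frombuffer(raw, dtype=np.int8).astype(np.float32)
    pairs = pairs[: len(pairs) // 2 * 2].reshape(-1, 2)  # drop a dangling byte
    first, second = pairs[:, 0], pairs[:, 1]
    # Assumption: "flip" only swaps which channel is used as the real part.
    return second + 1j * first if flip else first + 1j * second

# Tiny synthetic buffer instead of the real multi-gigabyte data file.
demo = np.array([1, -2, 3, -4], dtype=np.int8).tobytes()
samples = load_s8_iq(demo)
print(samples)
```

With a real file the bytes could be read in chunks and passed through the same function.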

Full 8.738 MHz view
The Kepler Exo4 file was streamed into baudline's standard input. A 65536 point FFT was used for a bin resolution of 266.667 Hz / bin. The Welch window was used for a little more signal extraction SNR. The Histogram window shows a nice Gaussian noise shape with even-odd holes for the 8-bit samples. Optimal anti-alias beam slices were used to smooth the spectrogram and the Color Aperture window was tweaked to maximize the color resolution. Only 50 seconds of the spectrogram are shown because the run was RAM limited. The red Average spectrum shows a hump and 6 strong tones.
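The bin resolution figures quoted in these posts can be reproduced with a few lines of Python. This is only a sketch: the helper name bin_resolution is made up, and dividing by half the FFT size is an assumption chosen because it matches the quoted numbers, not a statement about baudline's internals.

```python
SAMPLE_RATE = 8_738_133  # samples per second, from the command line

def bin_resolution(decimate_by: int = 1, fft_size: int = 65536) -> float:
    # Assumption: the quoted figures correspond to the (decimated) sample
    # rate divided by fft_size / 2.
    return SAMPLE_RATE / decimate_by / (fft_size / 2)

for factor in (1, 512, 4096, 32768):
    print(f"decimate by {factor:5d}: {bin_resolution(factor):.5f} Hz / bin")
```

The full-rate case gives the 266.667 Hz / bin mentioned above, and the decimated cases line up with the values quoted later.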

From the spectrogram, the two strongest features are the stationary tone at -3874 kHz and the spectral hump at 1000 kHz. Looking at the red Average plot shows several more sharp tones. Here is a list of the potentially interesting targets (add the 1419.4464 MHz base frequency to each):
  • -3868800 Hz - very strong - stationary
  • -3470933 Hz - strong - random walk - drift
  • -2713067 Hz - weak - wild random walk - drift
  • -1482667 Hz - strong - wild random walk - drift
  • +586133 Hz - strong - drift - extremely interesting - "friend"
  • +977867 Hz in hump - weak - wild random walk - drift
  • +900000 ... +1111000 Hz (hump) - hydrogen - see below
All of these candidate signals are investigated individually below. Decimation by 4096 was used to increase the extraction power for all of them except the hydrogen hump. Most of the analysis is quickly skimmed over except for the most interesting +586133 Hz signal which is analyzed at the end.
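Adding the base frequency to each offset in the list is simple arithmetic, sketched here in Python. The names BASE_HZ and sky_frequency_mhz are just illustrative.

```python
BASE_HZ = 1_419_446_400  # 1419.4464 MHz base frequency of the data file

# Offsets of the candidate tones from the list above, in Hz.
offsets_hz = [-3868800, -3470933, -2713067, -1482667, 586133, 977867]

def sky_frequency_mhz(offset_hz: int) -> float:
    """Absolute frequency in MHz for a baseband offset in Hz."""
    return (BASE_HZ + offset_hz) / 1e6

for off in offsets_hz:
    print(f"{off:+9d} Hz -> {sky_frequency_mhz(off):.6f} MHz")
```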

Hydrogen and Friend
Zooming the Average window into the frequency axis reveals this strong tone and spectral hump. Tone and hump, they make an interesting pair.

The spectral hump at +1 MHz is hydrogen. Interstellar hydrogen in space emits radio frequencies at 1.42 GHz, and with the 1419.4464 MHz base frequency added the hump is centered at about 1.4204 GHz. The strong tone to the left is just outside the "water hole" and because it requires a much more detailed analysis we will investigate it last.

-3868800 Hz
Decimating by 4096. Very strong stationary tone in the filter roll-off skirt. No drift. Not interesting.

-3470933 Hz
Decimating by 4096. Random walk tone with positive slope drift. +29 Hz drift / 326 seconds = +0.089 Hz / sec.

-2713067 Hz
Decimating by 4096. Random walk tone with positive slope drift. Difficult to measure, roughly +29 Hz drift / 326 seconds = +0.089 Hz / sec. It looks a bit like the previous signal but it is much weaker and it appears to be jumping around more. It could be a sideband of something but it doesn't seem to be harmonically related to the previous tone. More decimation might help pull out more signal.

-1482667 Hz
Decimating by 4096. Random walk tone with positive slope drift. Difficult to measure, roughly +29 Hz drift / 326 seconds = +0.089 Hz / sec. Looks almost exactly like a stronger version of the above signal.

+977867 Hz
Decimating by 4096. This signal is the weak tone that is in the hydrogen hump mentioned above. Very weak version of the above signal. Strength and random wander are almost identical to the -2713067 Hz signal. Why this signal is in the hydrogen hump is unknown.

This and the previous two signals are virtual copies of each other. They do not appear to be harmonically related. They could be sidebands or distortion products of the strong stationary tone but the harmonic relationship doesn't seem correct.

+586133 Hz
"Hydrogen's friend." This signal is extremely interesting. The true frequency of this tone is 1420.586133 MHz and it is just to the left of the hydrogen spectral hump. To zoom into the +586223 Hz tone the Input Devices window was set to decimate by 4096 with a +30 dB gain to improve the quantization SNR.1 The down mixer was set to a center frequency of +586133.3 Hz. The 4096 decimation along with a 65536 point FFT resulted in a bin resolution of 0.0651 Hz / bin.

The red spectrum in the Average window shows a strong +15 dB tone at +586223 Hz. The Histogram window shows the noise to have a nice Gaussian shaped curve. The Gaussian window was used to improve the spectral time resolution. The Color Aperture window was set to a -27 ... -61 dB range to improve the color resolution of the spectrogram. The green spectrogram window shows what previously was a constant stationary tone is now a drifting signal that has a slight random walk. Note that the spectrogram's horizontal zoom has been changed to Hz=1X. Here is a full screenshot:

Making some measurements in the spectrogram window shows that the +586223 Hz signal has a drift of +4.30 Hz / 326 seconds = +0.0132 Hz / second. It starts as what looks like a random walk as the tone zigzags back and forth. Then something really interesting happens half way down, it looks like the signal is being modulated. Zooming in on the lower half and increasing the Gaussian beta value to 11 shows: (click on image for higher resolution version)
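The drift rates quoted in this post are just the measured frequency change divided by the 326 second length of the recording. A quick check in Python (drift_rate is an illustrative helper, nothing from baudline):

```python
DURATION_S = 326.0  # length of the recording in seconds

def drift_rate(delta_hz: float, duration_s: float = DURATION_S) -> float:
    """Average drift in Hz per second over the observation."""
    return delta_hz / duration_s

print(f"FSK tone:          {drift_rate(4.30):+.4f} Hz / sec")
print(f"random walk tones: {drift_rate(29.0):+.3f} Hz / sec")
```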

This looks like FSK modulation with the delta between mark and space frequencies being about 1.2 Hz. Below is the Average spectrum showing the mark and space frequencies:

The purple spectrum is from the beginning of the modulated section. Since the signal is drifting with a positive slope the mark and space frequencies move to the right. The green spectrum is from the mid/bottom of the modulated section. This is clearly 2-tone FSK but at an extremely low baud rate with a very close mark and space frequency delta.

Enhance Resolution
The new blip Fourier transform enhances spectral resolution which is ideal for deep zooming down to the sample level. Time, frequency, and phase details are improved by using a new analysis primitive called the blip(let). A focus parameter allows for algorithm fine tuning on a signal by space by zoom basis.

Zooming into the FSK signal was done with a second decimation pass, for a total decimate by ratio of 524288 and a bin resolution of 0.01628 Hz / bin. With focus=1 the structure of the individual FSK bits is clearly visible in the magnitude space view below:

Also part of the blip Fourier transform is a blind phase lock algorithm that tracks changes in phase. The problem of spinning phase that is inherent in the short-time Fourier transform (STFT) is solved with blind phase locking. Now the other half of the spectrum, the phase half, contains visibly useful information. With focus=4 the phase of the FSK bits are fairly constant in the unwrapped phase space view below:

The visible phase changes follow what is expected for a random walk coupled with FSK mark/space transitions. There is a fair amount of phase noise present but it does not appear that any phase coding exists within the steady state or the bit transitions.

In the spectrum section above the dB axis is incorrect. Since this is phase space the units should be {-pi ... +pi}. This will be fixed in a future version of baudline.

Also note that the spectrogram timebase parameter for the above images was set to 3X. The overlap value was 1 so this means that baudline can zoom in three more scale factors before the digital bottom is reached at the discrete sample level.

Since the entire FSK signal is drifting somewhat randomly at about +0.0132 Hz/sec, machine demodulation is a bit difficult. I backed the decimation up a notch from the previous Enhance Resolution section since the following demodulation works better with a little less zoom. A second decimation pass was used, as was done previously, for a total decimate by ratio of 262144 and a bin resolution of 0.03255 Hz / bin. Baudline's periodicity bars were used to place and fine tune a horizontal grid that perfectly matched the modulated FSK symbols. See the two slightly overlapped spectrograms that have the periodicity bar overlays below: (click on image for higher resolution version)

Note that FSK2 modulation carries one bit per symbol, so the bit rate equals the baud rate. From the periodicity bars delta selected value the symbol period was measured to be 1.976 seconds, which is 0.5061 baud. This works out to a spectral efficiency of roughly 0.17 (bit/s)/Hz.[2] The periodicity bars sliced the symbols perfectly. With the periodicity bars up I was able to manually demodulate the individual bits. Here are the demodulated bits, which begin with a large number of leading zeroes:
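The baud rate follows directly from the measured symbol period. A small sketch, assuming one bit per symbol for 2-tone FSK and taking the 94 bit count from the zero and one totals given further down:

```python
SYMBOL_PERIOD_S = 1.976   # measured with the periodicity bars
BITS_PER_SYMBOL = 1       # 2-tone FSK: each symbol is either mark or space
N_BITS = 94               # 55 zero bits + 39 one bits, leading zeroes ignored

baud = 1.0 / SYMBOL_PERIOD_S        # symbols per second
bit_rate = baud * BITS_PER_SYMBOL   # bits per second
span_s = N_BITS * SYMBOL_PERIOD_S   # time taken by the 94 demodulated bits

print(f"{baud:.4f} baud, {bit_rate:.4f} bit/s, {span_s:.1f} seconds for {N_BITS} bits")
```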


In an attempt to make some sense of this bit stream here are the demodulated bits in a reduction grammar notation:[3]

0* 2(10) 00 1(10) 00 5(10) 0 1(10) 9(0) 5(10) 0 15(10) 0 10(10)+

where the 0's and 1's are bits and the bit string in parenthesis is repeated by the number before it. The pattern is mostly repeating 10's interspersed with an occasional 0 or two.
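The notation can be expanded back into a plain bit string with a short Python sketch. The leading 0* and the trailing + are left out here since they stand for an unknown number of repeats, and the function name expand is just for illustration.

```python
import re

# The grammar from above without the leading "0*" and the trailing "+".
GRAMMAR = "2(10) 00 1(10) 00 5(10) 0 1(10) 9(0) 5(10) 0 15(10) 0 10(10)"

def expand(grammar: str) -> str:
    """Expand tokens like 5(10) into 1010101010; plain bits pass through."""
    bits = []
    for token in grammar.split():
        match = re.fullmatch(r"(\d+)\(([01]+)\)", token)
        bits.append(match.group(2) * int(match.group(1)) if match else token)
    return "".join(bits)

bits = expand(GRAMMAR)
print(len(bits), "bits:", bits.count("0"), "zeros,", bits.count("1"), "ones")
print("contains a run of ones:", "11" in bits)
```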

Ignoring the leading zeroes, here is the bitstream broken down into 32-bit hexadecimal integers (big endian):

10100010 00101010 10100100 00000000 = 0xA22AA400
10101010 10010101 01010101 01010101 = 0xAA955555
01010101 00101010 10101010 101010.. = 0x552AAAA.
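As a cross-check, the grouping into big endian 32-bit words can be redone in Python. Only complete words are converted, so the final 30 bits are reported separately. The helper name to_hex_words is illustrative.

```python
# Bit groups copied from the three lines above (the ".." placeholder dropped).
groups = [
    "10100010 00101010 10100100 00000000",
    "10101010 10010101 01010101 01010101",
    "01010101 00101010 10101010 101010",
]
bits = "".join(groups).replace(" ", "")

def to_hex_words(bitstring: str, width: int = 32) -> list:
    """Convert each complete width-bit chunk (first bit is the MSB) to hex."""
    words = []
    for i in range(0, len(bitstring) - width + 1, width):
        words.append(f"0x{int(bitstring[i:i + width], 2):08X}")
    return words

words = to_hex_words(bits)
leftover = bits[len(words) * 32:]
print(words, "plus", len(leftover), "leftover bits")
```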

There are 55 zero bits and 39 one bits which is somewhat lopsided but the sample size is way too small for that to be significant. It is interesting that there is not a single run of ones (11) in the bit stream which would suggest some form of Non-Return-to-Zero Inverted (NRZI) coding.[4]
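If someone wants to try the NRZI idea on the bit stream, a decoder is only a few lines. This is a sketch under one assumed convention (a level change means 1, no change means 0, with the starting level taken to be 0). Other systems use the opposite mapping, and nothing here says which, if any, applies to this signal.

```python
def nrzi_decode(levels: str, start: str = "0") -> str:
    """Decode assuming a transition encodes 1 and no transition encodes 0."""
    out = []
    prev = start  # assumed line level before the first symbol
    for cur in levels:
        out.append("1" if cur != prev else "0")
        prev = cur
    return "".join(out)

# First 32 demodulated bits (0xA22AA400) as a small example.
print(nrzi_decode("10100010001010101010010000000000"))
```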

The bit stream is definitely not random but I haven't been able to decode a pattern out of it yet. It is also possible that the demodulation process produced a couple of bit errors. It is also unfortunate that the data file terminated when it did. Plugging this bit stream (and parts of it) into Google returns zero hits. Maybe some bit wackers[5] or crypto folk can pull meaning out of this bit stream.

The decimated quadrature FSK signal was mixed up to passband (real). To hear this signal download the kepler-exo4_FSK.wav file, load it into baudline, then open the Play Deck window to adjust the audio controls, and press play.

You can slow down the sample rate by adjusting the speed control or change the center down mix frequency by adjusting the shift control. Pressing the small arrow in the bottom right corner will pop down a section that has more controls. From there you can apply an equalization curve or adjust low and high pass filters to remove out-of-band noise.

The random walk wandering FSK signal was then run through baudline's Autocorrelation transform. The Autocorrelation transform shows the self similarity of a signal and it can also be utilized as a form of waveform trigger lock mechanism. Think of Autocorrelation as a sort of self syncing waveform raster display.
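To give a feel for what an autocorrelation does, here is a minimal NumPy sketch that computes it with FFTs and finds the repeat period of a test sine wave. This is a generic textbook approach and an assumption on my part; it is not how baudline's Autocorrelation transform is actually implemented.

```python
import numpy as np

def autocorrelation(x) -> np.ndarray:
    """Normalized autocorrelation for lags 0..len(x)-1, computed via FFT."""
    x = np.asarray(x) - np.mean(x)
    n = len(x)
    spectrum = np.fft.fft(x, 2 * n)  # zero-pad so lags do not wrap around
    acf = np.fft.ifft(spectrum * np.conj(spectrum))[:n].real
    return acf / acf[0]

# Test signal: a sine wave that repeats every 50 samples.
t = np.arange(400)
signal = np.sin(2 * np.pi * t / 50)
acf = autocorrelation(signal)

# Search for the strongest self similarity away from lag 0.
peak_lag = 25 + int(np.argmax(acf[25:75]))
print("strongest repeat at a lag of", peak_lag, "samples")
```

A signal that repeats shows up as peaks at multiples of its period, which is what makes the transform useful as a self syncing display.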

For reference the Color Aperture window parameters were set to upper=-48 dB and lower=-69 dB. All other parameters except the windowing function are default. The Kaiser window was used and the beta parameter was increased from 0. to 15. in steps to create the following Autocorrelation spectrogram images:

beta = 0. (square window)

beta = 5.
beta = 15.

The progression of the Kaiser beta value shows how the structure evolves as the window gets narrower. No beta value here is inherently correct but the structures seem to stabilize with the higher betas.
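The effect of the beta parameter can be illustrated with NumPy's Kaiser window. Whether baudline scales beta the same way as np.kaiser is an assumption; the sketch only shows the general trend that a higher beta narrows the window in time and widens its equivalent noise bandwidth.

```python
import numpy as np

N = 1024  # window length, arbitrary for this illustration

enbw = {}
for beta in (0.0, 5.0, 15.0):
    w = np.kaiser(N, beta)
    # Equivalent noise bandwidth in FFT bins (1.0 for a square window).
    enbw[beta] = N * np.sum(w ** 2) / np.sum(w) ** 2
    print(f"beta = {beta:4.1f}: edge value {w[0]:.6f}, ENBW {enbw[beta]:.2f} bins")
```

At beta = 0 the window is all ones, which is the square window case listed above.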

Here is an Autocorrelation movie of the variation of the Kaiser window beta. Notice how patterns and structures pop out of the noise as the beta parameter changes. The audio in the movie is the sound of the drifting random walking FSK signal that has been speed and frequency shift modified for the audio band. Make sure to watch this in fullscreen 720p HD so you can see all the details.

This is not random noise and this is not what the Autocorrelation of a random walk looks like. I was expecting to see the FSK bits flipping on and off from a synchronized waveform perspective. That didn't happen and what this is is a lot more than 94 bits worth of structure. Also the drifting random walk isn't random at all, it contains information. What I believe is happening is that the drifting random walk and the FSK bit stream are modulated together to create this image. I've never heard of a modulation scheme like this before. It does have elements of NTSC and Hellschreiber to it but at an extremely low data rate.

I tried different FFT sizes and different time domain operations from the Input Mapping window that cause various signal distortions. The basic image structure did not change. This tells me that the signal is fairly robust and not an artifact created by the analysis equipment.

The importance of this analysis depends greatly on the identity of the target source. Is it the Kepler satellite, the Kepler-4 planet, or something else? It is very unlikely an error in the collection or analysis caused the modulated bit section because other features in this data file are stationary or drifting differently. It is extremely unlikely that the modulated bits were created by natural phenomena. Decoding of the bit stream may prove enlightening in identifying the source. The Autocorrelation images are likely an interesting byproduct of the FSK data stream coupled with the drifting random walk.

I really don't know what to say or think at this point. The SETI Institute collected this signal and they, hopefully, will tell us what the celestial source is. [Update: This thread confirmed the signal source to be the Kepler 4b star.]

Some important questions about the FSK modulated signal:
  • Is the signal's proximity of -500 kHz to hydrogen significant or is it an aliasing artifact?
  • Are the other tones related in any way? (harmonically or temporally)
  • Why is the signal drifting at a +0.0132 Hz/second rate? What should it be drifting at?
  • Why is it undergoing a random walk?
  • Why are the mark and space frequencies so close together? (1.2 Hz)
  • Why is the 0.5061 baud rate so low?
  • Do these modulation parameters match any known modem or system?
  • Do the demodulated bits match any known line coding, preamble, or training sequence?
  • Why is there not a single run of ones (11) in the bit stream?
  • Are there any "interesting" sequences or patterns in the demodulated bits?
  • Is there any significance to the Autocorrelation images?
  • Will this signal ever be seen or collected again?
Does anyone have any answers or ideas?

[Update: The SETI Institute did a re-observation of the Kepler-4 target and the analysis report is here setiQuest Kepler-4b redux.]

1. Decimating by 4096 has the effect of increasing SNR but with the byproduct of reducing gain. Since baudline uses a 16-bit internal sample size this gain reduction can push any weak signal past the LSB thus truncating it. The +30 dB decimation gain setting improves the quantization SNR which eliminates the potential signal loss problem. Note that SNR has been used twice here in this note but in different contexts.
2. This spectral efficiency is roughly equal to that of a 110 baud Bell 101 FSK modem.
3. A context-free grammar is a Computer Science tool that is used to define a formal language. They are very useful in the design of finite automata. Their reduction ability can simplify a complex repetitive string down to its basic structure.
4. Non-return-to-zero (NRZ) is a telecommunication line coding technique that is useful for overcoming channel deficiencies and for dealing with clocking or synchronization issues.
5. Yes, "bit wacker" is a technical term.

Thursday, April 22, 2010

setiQuest amc7-3693.4464 MHz

Using the baudline signal analyzer to browse the setiQuest 2010-04-02-amc7-3693.4464 data file. It took most of a day to download the 3 parts of the amc7 data files (5.7 GB) and combine them. This radio telescope data file is way too big to load so it had to be streamed into baudline. Two benefits of streaming to standard input are that you can see the recorded signal data scroll by and that the Input Device's "decimate by" feature can be used to further increase the signal extraction power.

The following command line was used to stream the 5m 26s quadrature setiQuest signal into baudline:

cat ~/setiquest/2010-04-02-amc7-3693.4464-8bit_combined.dat | baudline -session setiquest -stdin -format s8 -channels 2 -quadrature -flipcomplex -samplerate 8738133 -fftsize 65536 -utc 0 -pause

Full 8.738 MHz view
Switching baudline to the record mode allowed the standard input to be collected and displayed. The red Average window reduces the variance of the noise floor and lets weak signals stand out. The green spectrogram is a time vs. frequency plot which shows the presence of several constant tones (straight lines). The Histogram window shows a Gaussian curve (see AWGN) which is customary for noise sampled from an analog-to-digital converter (ADC). Notice the alternating blank vertical lines caused by the signed 8-bit sample format. The Color Aperture window allows the upper and lower spectrogram intensity limits to be adjusted for maximum visual sensitivity. See the screenshot below (click for a larger image):

A 65536 point complex FFT was used for display and analysis. The frequency axis was zoomed in to a Hz=1X resolution to focus on the tones at +500 kHz. The screenshot of the zoomed in Average window is below:

The main tone is at 473 kHz with several weaker distortion sidebands. Next we want to zoom in even more to see what is going on.

Decimate by 512
The decimation and down mixer feature in the Input Devices window was used to zoom into the frequency axis which also has the side benefit of increasing the signal's SNR. Decimation by 512 was done combined with a 65536 point FFT which has the equivalent extraction power of a 32 million point FFT. This works out to a bin resolution of 0.52 Hz / bin. The down mixer is a digital down converter (DDC) which works a lot like turning a radio tuner. The down mixer was set to be centered on the strong tone at +473 kHz by setting the frequency range to +464533.3 ... +481600.0 Hz.
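The idea behind a down mixer plus decimator can be sketched in a few lines of NumPy: multiply by a complex exponential to move the frequency of interest to 0 Hz, low-pass filter, and keep every 512th sample. This is a rough illustration with a crude block-average filter and a made-up function name (down_convert); baudline's actual filters are certainly better than this.

```python
import numpy as np

FS = 8_738_133                           # samples per second
LOW_HZ, HIGH_HZ = 464_533.3, 481_600.0   # frequency range quoted above
CENTER_HZ = (LOW_HZ + HIGH_HZ) / 2
DECIMATE_BY = 512

def down_convert(x, fs, center_hz, decimate_by):
    """Mix center_hz down to 0 Hz, then low-pass and decimate."""
    n = np.arange(len(x))
    mixed = x * np.exp(-2j * np.pi * center_hz * n / fs)
    usable = len(mixed) // decimate_by * decimate_by
    # Crude low-pass filter: average each block of decimate_by samples.
    return mixed[:usable].reshape(-1, decimate_by).mean(axis=1)

# Synthetic test tone 3 Hz above the chosen center frequency.
n = np.arange(DECIMATE_BY * 64)
tone = np.exp(2j * np.pi * (CENTER_HZ + 3.0) * n / FS)
out = down_convert(tone, FS, CENTER_HZ, DECIMATE_BY)

print(len(out), "output samples at", round(FS / DECIMATE_BY, 1), "samples per second")
print("quoted span:", round(HIGH_HZ - LOW_HZ, 1), "Hz")
```

The output sample rate of about 17066.7 samples per second matches the width of the frequency range quoted above.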

Interesting side note is the 8.3 Msample calibration rate estimate for stdin. The sample rate estimate is a clock measurement of the speed baudline is collecting data, in this case from stdin. This means that baudline is collecting standard input data from a file, decimating, down mixing, calculating a 65536 point FFT, accumulating the Average window, calculating and drawing the sample Histogram, and rendering the scrolling spectrogram in almost real-time on a cheap $500 one-year-old 2.0 GHz Intel Core 2 Duo machine.

Next, the "transform cache" feature was enabled in the Drift Integrator which used 524 MB of RAM to cache the results of the 65536 point FFT for extremely fast frequency axis zooming and scrolling. The Drift Integrator has a number of other useful features such as beam slices, Auto Drift, a folding paste algorithm, and anti-alias on spectrogram zoom which I will explain in a future blog post.

Below is a full screenshot of the result of the decimation and down mixing:

The red Average spectral plot and the green spectrogram show the same range of frequency data but at different Hz scale factors. The strong tone at 473067 Hz and its sidebands are the main concern of interest here. Notice the first sidebands offset by ±979 Hz on both sides of this strong 473 kHz tone are wiggly. Next we will zoom in on one of them.

Zoom Hz=1X
The Command+Left key was pressed several times to change the spectrogram's frequency zoom factor from 32X to 1X. Since the "transform cache" was enabled the zooming and frequency scrolling was extremely fast and responsive. It was like exploring the spectrum with a real-time DSP microscope looking for interesting spectral features. A screenshot of the wandering tone (F2) at 472075 Hz is below:

It was interesting to discover that the sidebands, offset by ±979 Hz, are mirror images of each other. The sidebands (F2) are about 25 dB down from the main tone (F1). The third ±harmonics (F3) are also wandering mirror images of F2, the F4 harmonic is missing, while the F5 is a clean constant tone. Here is a screenshot of -F3:

The wandering -F3 tone is just a weaker version of -F2 with 2x the frequency stretch which is customary for harmonic progressions.

Amplitude modulation (AM) of the strong 473 kHz tone by an unknown signal would cause similar sidebands. They could be distortion products from the transmitter or the radio telescope collection equipment. The wandering looks like it could be oscillator drift of the ADC sampling clock but that is just a guess.

Let's move the frequency scrollbar to look at the strong 473069 Hz tone. Here is a screenshot of the carrier (F1):

It looks very stationary and popping up baudline's crosshair cursor verifies such at this magnification level.

Decimate by 4096
Let's zoom in a little more. Increasing the decimation factor to 4096 reduces the bin resolution to 0.0651 Hz / bin. Below is a spectrogram screenshot of the strong carrier (F1) at this increased frequency resolution:

The strong tone has a slight drift of +0.52 Hz over a course of 326 seconds. This is about equal to the bin resolution from the previous decimate by 512 case so it isn't surprising that the signal looked stationary in that view. The increased frequency zoom has made the signal start to look a bit wiggly. What we need is even more frequency resolution.

Decimate by 32768
Baudline has a maximum "decimate by" limit of 4096 so I used a 2-pass method of decimating by 4096, saving the file, then feeding that into standard input again but with a decimate by 8 factor. I call this multi-pass algorithm "decimate by ∞" where you keep taking the output of the decimator and feed it back into the input. You can keep doing this ad infinitum until you end up with zero samples. I could have kept decimating past 32768 but too much time information would have been lost from the spectrogram and resulted in a poor looking image. The bin resolution of decimate by 32768 is 0.00814 Hz / bin. The once stationary tone no longer looks straight in the spectrogram below:
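A little arithmetic shows why the decimation can't go on forever. The sketch below multiplies the per-pass factors and counts how many samples of the 326 second recording are left; the helper name samples_after is made up for illustration.

```python
SAMPLE_RATE = 8_738_133  # samples per second
DURATION_S = 326         # length of the recording in seconds
FFT_SIZE = 65536

def samples_after(passes):
    """Return the total decimation factor and the samples that remain."""
    total = 1
    for factor in passes:
        total *= factor
    return total, SAMPLE_RATE * DURATION_S // total

total, remaining = samples_after([4096, 8])
print(f"decimate by {total}: {remaining} samples remain")
print(f"that is only {remaining / FFT_SIZE:.2f} non-overlapping {FFT_SIZE} point FFT frames")
```

With so few samples left the spectrogram has to lean heavily on overlapping FFT frames, which is where the time detail goes.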

The tone isn't just drifting, it also has an incredible amount of wander to it. I measure a top to bottom drift of +0.35 Hz over 326 seconds. This isn't surprising, zoom in deep enough and even the world's best oscillator will have some variation but more likely you'll be seeing the error in the ADC clock!

I've seen this deep decimation frequency wandering before in this Mystery Signal.

For fun this is what the decimate by 32768 quadrature signal looks like in the Waveform view:

Note that most of the noise has been decimated away and a quadrature sine wave is visible. Nice 90° phase shift.

The combination of a large FFT and high decimation factor allow baudline to zoom in for a deep view of weak signal behavior. Using baudline's multiple features allowed for detailed signal measurements and interactive fast browsing of the time-frequency domain.

The setiQuest AMC-07 data file had several stationary tones with strong distortion sidebands. The wandering mirror sidebands are likely caused by oscillator drift of the ADC sampling clock. The non-drifting stationary nature of all tones in this data file suggests the source is of terrestrial origin.

[Update: This thread said that the signal is from the AMC-7 geosynchronous satellite.]

Tuesday, April 20, 2010

I joined setiQuest

I joined the setiQuest project that is being sponsored by the SETI Institute. As their blog states, today truly is an exciting day. The setiQuest project is placing data sets collected from the Allen Telescope Array into the public domain. Their goal is to encourage "citizen scientists" to help in the search for extraterrestrial intelligence by analyzing radio telescope data and looking for signals. Here is a baudline screenshot of the one second test .dat file:

The red spectral plot of the Average window shows a strong tone at -469 kHz and two weaker tones at around -2 MHz. The slopes on the left and right of the red Average spectral curve are from filters in the sampling unit or from a digital down conversion (DDC) process. The slight negative slope (-0.5 dB over 6 MHz) of the spectral curve is interesting, I'd expect it to be symmetrical around 0 Hz but it could be because this chunk of spectrum was extracted from a wider section of bandwidth.

The green spectrogram plot shows that these tones are stationary for the one second file duration which is not long enough to determine if they are stationary or are drifting. I need to look at the larger 1.9 GB data file, that is still downloading, to know for sure. Looks like there might be some modulation but that could just be the noise. The tones are fairly weak signals and further analysis is required.

The histogram on the right shows that sample data has a Gaussian distribution which is to be expected from radio telescope data. The histogram is centered at zero and it doesn't have any skew which is good.

This one second test data file was streamed into baudline's standard input. The data format is 2-channel quadrature signed 8-bit samples. The sample rate was calculated by dividing the one second file size by 2 (two bytes per quadrature sample) to be 8738133 samples/second. Here is the command line used:
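The same calculation in Python, as a rough sketch. The file size used here is not given in the post; it is a hypothetical value that is simply twice the quoted sample rate, and the function names are illustrative.

```python
BYTES_PER_SAMPLE = 2  # one signed 8-bit byte for each of the two channels

def sample_rate_from_size(file_bytes: int, duration_s: float = 1.0) -> float:
    """Quadrature samples per second for a file of known duration."""
    return file_bytes / BYTES_PER_SAMPLE / duration_s

def duration_from_size(file_bytes: float, sample_rate: float) -> float:
    """Recording length in seconds for a file at a known sample rate."""
    return file_bytes / BYTES_PER_SAMPLE / sample_rate

one_second_bytes = 17_476_266  # hypothetical size, implied by the quoted rate
rate = sample_rate_from_size(one_second_bytes)
print(rate, "samples/second")

# Run the other way, a 5.7 GB file at this rate lasts about 326 seconds.
full_duration = duration_from_size(5.7e9, rate)
print(round(full_duration), "seconds")
```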

cat 2010-04-02-amc7-3693.4464-8bit-one-second.dat | baudline -session setiquest -stdin -format s8 -channels 2 -quadrature -samplerate 8738133 -pause

So join setiQuest, download baudline, and start analyzing signals today.