OCaml Scientific Computing

1st Edition (in progress)
Back
Table of Contents

Signal Processing

Signal processing is an electrical engineering sub-field that focuses on analysing, modifying and synthesizing signals such as sound, images and biological measurements. (WIKI) It covers a wide range of techniques.

In this chapter we mainly focus on Fourier Transform, the core idea in signal processing and modern numerical computing. We introduce its basic idea, and then demonstrate how Owl support FFT with examples and applications. We also cover the relationship between FFT and Convolution, and filters.

Discrete Fourier Transform

One theme in numerical applications is the transformation of equations into a coordinate system so that the original question can be easily decoupled and simplified. One of the most important such transformation is the Fourier Transform, which decomposes a function of time into its constituent frequencies.

All these sound too vague. Let’s look at an example. Think about an audio that lasts for 10 seconds. This audio can surely be described in the time domain, which means plotting its sound intensity against time as x axis. On the other hand, maybe less obviously, the sound can also be described in the frequency domain. For example, if all the 10 seconds are filled with only playing the A# note, then you can describe this whole audio with one frequency number: 466.16 Hz. If it’s a C note, then the number is 523.25 Hz, etc. The thing is that, the real-world sound is not always so pure, they are quite likely compounded from different frequencies. Perhaps this 10 seconds are about water flowing, or wind whispering, what frequencies it is built from then?

That’s where Fourier Transform comes into play. It captures the idea of converting the two forms of representing a signal: in time domain and in frequency domain. We can represent a signal with the values of some quantity \(h\) as a function of time: \(h(t)\), or this signal can be represented by giving its amplitude \(H\) as function of frequency: \(H(f)\). We can think they are two representation of the same thing, and Fourier Transform change between them:

\[ h(f) = \int H(f)\exp^{-2\pi~ift}df\qquad(1)\] \[ H(f) = \int h(t)\exp^{2\pi~ift}dt\]

To put it simply: suppose Alice mix a unknown number of colour together, and let Bob to guess what those colours are, then perhaps Bob need a Fourier Transform machine of sorts.

In computer-based numerical computation, signals are often represented in a discrete way, i.e. a finite sequence of sampled data, instead of continuous. In that case, the method is called Discrete Fourier Transform (DFT). Suppose we have a complex vector \(y\) as signal, which contains \(n\) elements, then to get the Fourier Transform vector \(Y\), the discrete form of eq. 1 can be expressed with:

\[ Y_k = \sum_{j=0}^{n-1}y_j~\omega^{jk},\qquad(2)\] \[ y_k = \frac{1}{n}\sum_{j=0}^{n-1}Y_k~\omega^{-jk},\]

where \(\omega = \exp{-2\pi~i/n}\) and \(i = \sqrt{-1}\). \(j\) and \(k\) are indices that go from 0 to \(n-1\) and \(k = 0, 1, ..., n -1\).

We highly suggest you to checkout the video that’s named “But what is the Fourier Transform? A visual introduction” produced by 3Blue1Brown. It shows how this eq. 2 of Fourier Transform comes into being with beautiful and clear illustration. (TODO: follow the video, explain the idea of FT clearly, not just smashing an equation into readers’ faces.)

(maybe TODO: we can perhaps implement a naive DFT process, both to illustrate the theory with OCaml code, and lay the foundation for understanding FFT. Refer to: Matlab NC book, Chap 8.2)

You might be wondering, it’s cool that I can recognise how a sound is composed, but so what? Think of a classic example where you need to remove some high pitch noisy from some music. By using DFT, you can easily find out the frequency of this noisy, remove this frequency, and turn the signal back to the time domain by using something a reverse process.

Not just sound; you can also get a noisy image, recognise its noise by applying Fourier Transform, remove the noises, and reconstruct the image without noise. We will show such examples later.

Actually, the application of FT is more than on sound or image signal processing. We all use Fourier Transform every day without knowing it: mobile phones, image and audio compression, communication networks, large scale numerical physics and engineering, etc. It is the cornerstone fo computational mathematics. One important reason of its popularity is that it has an efficient algorithm in implementation: Fast Fourier Transform.

Fast Fourier Transform

One problem with DFT is that if you follow its definition in implementation, the algorithm computation complexity would be \(\mathcal{O}(n^2)\), since it involves a dense \(n\) by \(n\) matrix multiplication. It means that DFT doesn’t scale well with input size. The Fast Fourier Transform algorithm, first formulated by Gauss in 1805 and then developed by James Cooley and John Tukey in 1965, drops the complexity down to \(\mathcal{O}(n\log{}n)\). To put it in a simple way, the FFT algorithm finds out that, any DFT can be represented by the sum of two sub-DFTs: one consists of the elements on even index in the signal, and the other consists of elements on odd positions:

\[ Y_k = \sum_{even j}\omega^{jk}y_j + \sum_{odd j}\omega^{jk}y_j\qquad(3)\] \[ = \sum_{j=0}^{n/2-1}\omega^{2jk}y_{2j} + \omega^k\sum_{j=0}^{n/2-1}\omega^{2jk}y_{2j+1}.\]

The key to this step is the fact that \(\omega_{2n}^2 = \omega_n\). According to eq. 3, you only need to compute FFT for half of the signal, the other half can be gotten by multiplied with \(\omega^{k}\), and then concatenate these two sums. The half signal can further be halved, so on and so forth. Therefore the computation can be reduced to a \(log\) level in a recursive process.

(TODO: but what about the \(y_{2j+1}\) part in the second sum? how come the second could be the same as the first sum? Need figure it out.)

To introduce Fourier Transform in detailed math and analysis of its properties is beyond the scope of this book, we encourage the readers to refer to other classic textbook on this topic (Phillips, Parr, and Riskin 2003). In this chapter, we focus on introducing how to use FFT in Owl and its applications with Owl code. Hopefully these materials are enough to interest you to investigate more.

The implementation of the FFT module in Owl interfaces to the FFTPack C implementation. Owl provides these basic FFT functions, listed in Tabel tbl. 1

Table 1: FFT functions in Owl
Functions Description
fft ~axis x Compute the one-dimensional discrete Fourier Transform
ifft ~axis x Compute the one-dimensional inverse discrete Fourier Transform
rfft ~axis otyp x Compute the one-dimensional discrete Fourier Transform for real input
irfft ~axis ~n otyp x Compute the one-dimensional inverse discrete Fourier Transform for real input

Examples

We then show how to use these functions with some simple example. More complex and interesting will follow in the next section.

1-D Discrete Fourier transforms

Let start with the most basic fft and it reverse transform function ifft.

let a = [|1.;2.;1.;-1.;1.5;1.0|];;
>val a : float array = [|1.; 2.; 1.; -1.; 1.5; 1.|]
let b = Arr.of_array a [|6|] |> Dense.Ndarray.Generic.cast_d2z;;
>val b : (Complex.t, complex64_elt) Dense.Ndarray.Generic.t =
>
>       C0      C1      C2       C3        C4      C5
>R (1, 0i) (2, 0i) (1, 0i) (-1, 0i) (1.5, 0i) (1, 0i)
let c = Owl_fft.D.fft b;;
>val c : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>         C0                 C1                 C2                  C3                C4                C5
>R (5.5, 0i) (2.25, -0.433013i) (-2.75, -1.29904i) (1.5, 1.94289E-16i) (-2.75, 1.29904i) (2.25, 0.433013i)
let d = Owl_fft.D.ifft c;;
>val d : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>                 C0                C1                 C2                  C3                  C4                C5
>R (1, 1.38778E-17i) (2, 1.15186E-15i) (1, -8.65641E-17i) (-1, -1.52188E-15i) (1.5, 1.69831E-16i) (1, 2.72882E-16i)

In the result returned by fft, the first half contain the positive-frequency terms, and the second half contain the negative-frequency terms, in order of decreasingly negative frequency. Typically, only the FFT corresponding to positive frequencies is plotted.

The next example plots the FFT of the sum of two sines, showing the power of FFT to separate signals of different frequency.

module G = Dense.Ndarray.Generic;;
>module G = Owl.Dense.Ndarray.Generic
let n = 600. (* sample points *);;
>val n : float = 600.
let t = 1. /. 800. (* sample spacing *);;
>val t : float = 0.00125
let x = Arr.linspace 0. (n *. t) (int_of_float n);;
>val x : Arr.arr =
>
>  C0         C1         C2         C3         C4         C595     C596     C597     C598 C599
>R  0 0.00125209 0.00250417 0.00375626 0.00500835 ... 0.744992 0.746244 0.747496 0.748748 0.75
let y1 = Arr.((50. *. 2. *. Owl_const.pi) $* x |> sin);;
>val y1 : Arr.arr =
>
>  C0       C1       C2      C3       C4         C595    C596     C597     C598        C599
>R  0 0.383289 0.708033 0.92463 0.999997 ... 0.999997 0.92463 0.708033 0.383289 1.27376E-14
let y2 = Arr.(0.5 $* ((80. *. 2. *. Owl_const.pi) $* x |> sin));;
>val y2 : (float, float64_elt) Owl_dense_ndarray_generic.t =
>
>  C0       C1       C2      C3       C4          C595     C596      C597      C598         C599
>R  0 0.294317 0.475851 0.47504 0.292193 ... -0.292193 -0.47504 -0.475851 -0.294317 -2.15587E-14
let y = Arr.(y1 + y2) |> G.cast_d2z;;
>val y : (Complex.t, complex64_elt) G.t =
>
>       C0             C1            C2            C3            C4               C595           C596           C597            C598               C599
>R (0, 0i) (0.677606, 0i) (1.18388, 0i) (1.39967, 0i) (1.29219, 0i) ... (0.707804, 0i) (0.449591, 0i) (0.232182, 0i) (0.0889723, 0i) (-8.82117E-15, 0i)
let yf = Owl_fft.D.fft y;;
>val yf : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>             C0                    C1                    C2                    C3                  C4                     C595                 C596                   C597                   C598                   C599
>R (5.01874, 0i) (5.02225, 0.0182513i) (5.03281, 0.0366004i) (5.05051, 0.0551465i) (5.0755, 0.073992i) ... (5.108, -0.0932438i) (5.0755, -0.073992i) (5.05051, -0.0551465i) (5.03281, -0.0366004i) (5.02225, -0.0182513i)
let z = Dense.Ndarray.Z.(abs yf |> re);;
>val z : Dense.Ndarray.Z.cast_arr =
>
>       C0      C1      C2      C3      C4        C595    C596    C597    C598    C599
>R 5.01874 5.02228 5.03294 5.05081 5.07604 ... 5.10886 5.07604 5.05081 5.03294 5.02228
Plot the result.
let h = Plot.create "plot_001.png" in
let xa = Arr.linspace 1. 600. 600 in
Plot.plot ~h xa z;
Plot.output h
;;
>- : unit = ()
Using FFT to separate two sine signals from their mixed signal

Next let’s see rfft and irfft. Function rfft calculates the FFT of a real signal input and generates the complex number FFT coefficients for half of the frequency domain range. The negative part is implied by the Hermitian symmetry of the FFT. Similarly, irfft performs the reverse step of rfft. First, let’s make the input even number.

let a = [|1.; 2.; 1.; -1.; 1.5; 1.0|];;
>val a : float array = [|1.; 2.; 1.; -1.; 1.5; 1.|]
let b = Arr.of_array a [|6|];;
>val b : Arr.arr =
>  C0 C1 C2 C3  C4 C5
>R  1  2  1 -1 1.5  1
let c = Owl_fft.D.rfft b;;
>val c : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>         C0                 C1                 C2        C3
>R (5.5, 0i) (2.25, -0.433013i) (-2.75, -1.29904i) (1.5, 0i)
let d = Owl_fft.D.irfft c;;
>val d : (float, float64_elt) Owl_dense_ndarray_generic.t =
>
>  C0 C1 C2 C3  C4 C5
>R  1  2  1 -1 1.5  1

And then we change the length of signal to odd.

let a = [|1.; 2.; 1.; -1.; 1.5;|];;
>val a : float array = [|1.; 2.; 1.; -1.; 1.5|]
let b = Arr.of_array a [|5|];;
>val b : Arr.arr =
>  C0 C1 C2 C3  C4
>R  1  2  1 -1 1.5
let c = Owl_fft.D.rfft b;;
>val c : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>         C0                  C1                   C2
>R (4.5, 0i) (2.08156, -1.6511i) (-1.83156, 1.60822i)

Notice that the rfft of odd and even length signals are of the same shape. (?)

N-D Discrete Fourier transforms

( TODO: This is not the real N-D FFT. Verify it with SciPy examples. IMPLEMENTATION required. TODO: explain briefly how 2D FFT can be built with 1D. Reference: Data-Driven Book, Chap2.6. )

The owl FFT functions also applies to multi-dimensional arrays, such as matrix. Example: the fft matrix.

let a = Dense.Matrix.Z.eye 5;;
>val a : Dense.Matrix.Z.mat =
>
>        C0      C1      C2      C3      C4
>R0 (1, 0i) (0, 0i) (0, 0i) (0, 0i) (0, 0i)
>R1 (0, 0i) (1, 0i) (0, 0i) (0, 0i) (0, 0i)
>R2 (0, 0i) (0, 0i) (1, 0i) (0, 0i) (0, 0i)
>R3 (0, 0i) (0, 0i) (0, 0i) (1, 0i) (0, 0i)
>R4 (0, 0i) (0, 0i) (0, 0i) (0, 0i) (1, 0i)
let b = Owl_fft.D.fft a;;
>val b : (Complex.t, complex64_elt) Owl_dense_ndarray_generic.t =
>
>        C0                      C1                      C2                      C3                      C4
>R0 (1, 0i)                 (1, 0i)                 (1, 0i)                 (1, 0i)                 (1, 0i)
>R1 (1, 0i)  (0.309017, -0.951057i) (-0.809017, -0.587785i)  (-0.809017, 0.587785i)   (0.309017, 0.951057i)
>R2 (1, 0i) (-0.809017, -0.587785i)   (0.309017, 0.951057i)  (0.309017, -0.951057i)  (-0.809017, 0.587785i)
>R3 (1, 0i)  (-0.809017, 0.587785i)  (0.309017, -0.951057i)   (0.309017, 0.951057i) (-0.809017, -0.587785i)
>R4 (1, 0i)   (0.309017, 0.951057i)  (-0.809017, 0.587785i) (-0.809017, -0.587785i)  (0.309017, -0.951057i)

IMAGE: plot x and y in to circle-like shape

Applications of FFT

As we said, the applications of FFT are numerous. Here we pick three to demonstrate the power of FFT and how to use it in Owl. The first is to find the period rules in the historical data of sunspots, and the second is about analysing the content of dial number according to audio information. Both applications are inspired by (Moler 2008). The third application is about image processing. These three applications together present a full picture about how the wide usage of FFT in various scenarios.

(A backup application can always be to separate noises; Refer to Data-Driven Book, Code 2.3.)

Find period of sunspots

Build data from a dataset from the Solar Influences Data Center. Explain the background of dataset, meaning of Wolfer index, and the dataset itself.

The dataset provided are of different granularity. Here we use the yearly data, from 1700 to 2020. You can also try the monthly data to get more detailed knowledge. First, load the data:

let data = Owl_io.read_csv ~sep:';' "sunspot_full.csv"
let data = Array.map (fun x -> Array.map float_of_string x) data |> Mat.of_arrays

let x = Mat.get_slice [[];[0]] data
let y = Mat.get_slice [[];[1]] data

We can then visualise the data:

let plot_sunspot x y =
  let h = Plot.create "plot_sunspot.png" in
  Plot.set_font_size h 8.;
  Plot.set_pen_size h 3.;
  Plot.set_xlabel h "Date";
  Plot.set_ylabel h "Sunspot number";
  Plot.plot ~h ~spec:[ RGB (255,0,0); LineStyle 1] x y;
  Plot.output h
Figure 1: Yearly sunspot data

We can see there is a cycle. We want to know exactly how long it is.

let y' = Owl_fft.D.rfft ~axis:0 y

To process the data, we first remove the first element of y, since it stores the sum of the data. The frequency is reduced to half, since we plot only half of the coefficients.

let y' = Dense.Ndarray.Z.get_slice [[1; (Dense.Ndarray.Z.shape y').(0) - 1];[]] y'
let p = Dense.Ndarray.Z.abs y' |> Dense.Ndarray.Z.re
let n = (Arr.shape p).(0)
let f = Arr.(mul_scalar (linspace 0. 1. n') 0.5)

The frequency cycle/year seems a bit confusing. To get the cyclical activity that is easier to interpret, we also plot the squared power as a function of years/cycle (periodogram). Both are plotted with:

let plot_sunspot_freq f p =
  let h = Plot.create ~m:1 ~n:2 "plot_sunspot_freq.png" in
  Plot.set_pen_size h 3.;
  Plot.subplot h 0 0;
  Plot.set_xlabel h "cycle/year";
  Plot.set_ylabel h "squared power";
  Plot.plot ~h ~spec:[ RGB (255,0,0); LineStyle 1] f p;

  Plot.subplot h 0 1;
  Plot.set_xlabel h "year/cycle";
  Plot.set_ylabel h "squared power";
  let f' = Arr.(scalar_div 1. (get_slice [[1; Stdlib.(n'-1)]] f)) in
  Plot.plot ~h ~spec:[ RGB (255,0,0); LineStyle 1] f' p;
  Plot.set_xrange h 0. 40.;
  Plot.output h

The result is shown in fig. 2. We can see the most prominent cycle is a little bit less than 11 years.

Figure 2: Find sunspot cycle with FFT

Decipher the Tone

This examples uses the data of (Moler 2008).

let data = Owl_io.read_csv ~sep:',' "touchtone.csv"
let data = Array.map (fun x -> Array.map float_of_string x) data |> Mat.of_arrays
let data = Mat.div_scalar data 128.

The dataset specifies a sampling rate of 8192.

let fs = 8192.

We have a segment of signal that shows the touch tone of dialling a phone number. We can visualise the signal:

let plot_tone x y filename =
  let h = Plot.create filename in
  Plot.set_font_size h 8.;
  Plot.set_pen_size h 3.;
  Plot.set_xlabel h "time(s)";
  Plot.set_ylabel h "signal magnitude";
  Plot.plot ~h ~spec:[ RGB (0, 0, 255); LineStyle 1] x y;
  Plot.output h

let x = Mat.div_scalar (Mat.sequential 1 (Arr.shape data).(1)) fs
let _ = plot_tone x y "plot_tone.png"

The result is shown in fig. 3(a). Apparently there are 11 digits in this phone number. The question we want to answer is: which numbers?

Figure 3: Recording of an 11-digit number and its FFT decomposition

This is a suitable question for FFT. We can apply the FFT to the original data.

let yf = Owl_fft.D.rfft data
let y' = Dense.Ndarray.Z.(abs yf |> re)
let n = (Arr.shape y').(1)
let x' = Mat.linspace 0. (fs /. 2.) n

We plot x' with y' similarly using the previous plotting function, and the result is shown in fig. 3(b). All the 11 digits are composed from 7 prominent frequencies.

Actually… (Explain the theory: how they combine into 10 digits. With IMAGE or TABLE)

We can use the first tone as an example. We get a subset:

let data2 = Arr.get_slice [[];[0; 4999]] data

And then perform the same process as before, the results are shown in fig. 4. We can see that the first digit is mainly composed from two frequencies: 600 and 1200. Looking it up in out table, we can see that the first digit is 1.

Figure 4: Recording of the first digit and its FFT decomposition

The tune of phone is combination of two different frequencies. You can investigate the whole phone number if you want.

Image Processing

Blurring an image with a two-dimensional FFT. IMPLEMENTATION required: FFT2D, iFFT2D, fftShift. Image utils also need to be made clean. Reference.

FFT on multi-dimensional signal is effective for image compression, because many Fourier frequencies in the image are small and can be neglected via using filters, leaving the major frequencies, and thus the image quality can be largely preserved.

We use the famous Lena image as example:

Figure 5: Lena

As the first step, we read in the image into Owl as a matrix. All the elements in this matrix are scaled to within 0 to 1.

code

Then we take the 2-D FFT, and centre the frequencies.

code: fft2 + fftshift; image output

IMAGE

The result is shown in a figure. It is clear that there are several small frequency bands that can be ignored. Let’s remove them using a Gaussian Filter.

code: using Mat.meshgrid to build a gaussian mask; matrix multiplication; image output

IMAGE

Now that we remove the insignificant frequencies, we can rebuild the image based on this filtered frequency with inverse 2-D FFT.

code:ifft2; show image

IMAGE + some comment about it: its blur but keep basic information; maybe we can do better with … (some advanced tricks + s-o-a if there is any).

Of course, following similar method as previous applications on 1-D signals, FFT is widely used for removing noise in images by isolate and manipulate particular frequency bands.

Filtering

In the N-dimensional Array chapter, we have introduced the idea of filter, which allows to get target data from the input according to some function. Filtering in signal processing is similar. It is a generic name for any system that modifies input signal in certain ways, most likely removing some unwanted features from it.

(Refer to “ThinkDSP” in writing.)

Example: Smoothing

Let’s start with a simple and common filter task: smoothing. Suppose you have a segment of noisy signal: the stock price. In many cases we hope to remove the extreme trends and see a long-term trend from the historical data. We take the stock price of Google in the past year, April 09, from 2019 to 2020. The data is taken from Yahoo Finance. We can load the data into a matrix:

let data = Owl_io.read_csv ~sep:',' "goog.csv"
let data = Array.map (fun x ->
    Array.map float_of_string (Array.sub x 1 6))
    (Array.sub data 1 (Array.length data - 1))
    |> Mat.of_arrays

The data y contains several columns, each representing opening price, volume, high price, etc. Here we use the daily closing price as example.

let y = Mat.get_slice [[];[3]] data

To compute the moving average of this signal, I’ll create a window with 10 elements. Here we only use a simple filter which normalises each element to the same 0.1.

let filter = Mat.of_array (Array.make 10 0.1 ) 1 10

Now, we can sliding this filter window along input signal to smooth the data step by step.

let y' = Mat.mapi (fun i _ ->
  let r = Mat.get_fancy [R [i; i+9]; R []] y in
  Mat.dot filter r |> Mat.sum'
) (Mat.get_slice [[0; (Arr.shape y).(0) - 10]; []] y)

Finally, we can plot the resulting smoothed data with the original data.

let plot_goog y y' =
  let n = (Arr.shape x).(0) in
  let x = Mat.sequential n 1 in
  let h = Plot.create "plot_goog.png" in
  Plot.set_font_size h 8.;
  Plot.set_pen_size h 3.;
  Plot.set_xlabel h "date";
  Plot.set_ylabel h "Google stock price ($)";
  Plot.plot ~h ~spec:[ RGB (255,0,0); LineStyle 1] x y;
  Plot.plot ~h ~spec:[ RGB (0,0,255); LineStyle 2] x y';
  Plot.(legend_on h ~position:NorthWest [|"original"; "smooth"|]);
  Plot.output h
Figure 6: Smoothed stock price of Google

The results are shown in fig. 6. The sudden drop in the last month might be related with the COVID-19.

Gaussian Filter

The filter we have used is a flat one: drops first, but then bouncing around (Check with experiment result). We can change to another one: the gaussian filter.

EXPLAIN

(numpy implementation and doc)

The code below generate a simple 1-D gaussian filter based on the equation: \(\exp{-\frac{x^2}{2\sigma^2}}\). Similar to the previous simple example, the filter also needs to be normalised. The range of radius is set to truncate standard deviations.

let gaussian_kernel sigma =
  let truncate = 4. in
  let radius = truncate *. sigma +. 0.5 |> int_of_float in
  let r = float_of_int radius in
  let x = Mat.linspace (-.r) r (2 * radius + 1) in
  let f a = Maths.exp (-0.5 *. a ** 2. /. (sigma *. sigma)) in
  let x = Mat.map f x in
  Mat.(div_scalar x (sum' x))

let filter = gaussian_kernel 3.

Computing the correlation between filter and the input data as before, we get a better smoothed curve in fig. 7.

Figure 7: Smoothed stock price of Google with Gaussian filtering

Filters can be generally by their usage into time domain filters and frequency domain filters. Time domain filters are used when the information is encoded in the shape of the signal’s waveform, and can be used for tasks such as smoothing, waveform shaping, etc. It includes filter methods such as moving average and single pole. Frequency filters are used to divide a band of frequencies from signals, and its input information is in the form of sinusoids. It includes filter methods such as Windowed-sinc and Chebyshev. There are many filters, each with different shape (or impulse response) and application scenarios, and we cannot cover them fully here. Please refer to some classical textbooks on signal processing such as (Smith and others 1997) for more information.

Signal Convolution

What we have done is called convolution. Formally it is mathematical operation on two functions that produces a third function expressing how the shape of one is modified by the other:

\[f(t) * g(t) = \sum_{\tau=-\infty}{\infty}f(\tau)g(t-\tau)\qquad(4)\]

In equation eq. 4, \(*\) denotes the convolution operation, and you can think of \(f\) as (discrete) input signal, and \(g\) be filter. Note that for computing \(f(\tau)g(t-\tau)\) for each \(\tau\) requires adding all product pairs. You can see that this process is computation-heavy. It is even more tricky for computing the convolution of two continuous signal following definition.

We have talked a lot about FFT, and here is the place we use it. Fourier Transformation can greatly reduce the complexity of computing the convolution and filtering. Specifically, the Convolution Theorem states that:

\[\textrm{DFT}(f * g) = \textrm{DFT}(f).\textrm{DFT}(g).\qquad(5)\]

To put equation eq. 5 into plain words, to get DFT of two signals’ convolution, we can simply get the DFT of each signal separately and then multiply them element-wise. Someone may also prefer to express it in this way: convolution in the time domain can be expressed in multiplication in the frequency domain. And multiplication we are very familiar with. Once you have the \(\textrm{DFT}(f * g)\), you can naturally apply the inverse transform and get \(f * g\).

Let’s apply the DFT approach to the previous data, and move it to the frequency domain:

let yf = Owl_fft.D.rfft ~axis:0 y

The resulting data yf looks like this:

val yf : (Complex.t, Bigarray.complex64_elt) Owl_dense_ndarray_generic.t =
                        C0
  R0          (312445, 0i)
  R1  (-2664.07, 17064.3i)
  R2  (-5272.52, 3899.16i)
  R3 (-2085.98, -3101.46i)
                       ...
R124   (23.0005, -38.841i)
R125   (153.294, 68.7544i)

We only keep the first most notable frequencies, and set the rest to zero.

let n = (Dense.Ndarray.Z.shape yf).(0)
let z = Dense.Ndarray.Z.zeros [|n-5; 1|]
let _ = Dense.Ndarray.Z.set_slice [[5;n-1];[]] yf z

Now, we can apply reverse FFT and get the smoothed data:

let y2 = Owl_fft.D.irfft ~axis:0 yf
Figure 8: Smoothed stock price of Google using FFT method

We can similarly check how the smoothing approach works in fig. 8.

FFT and Image Convolution

You might heard of the word “convolution” before, and yes you are right: convolution is also the core idea in the popular deep neural network (DNN) applications. The convolution in DNN is often applied on ndarrays. It is not complex: you start with an input image in the form of ndarray, and use another smaller ndarray called “kernel” to slide over the input image step by step, and at each position, an element-wise multiplication is applied, and the result is filled into corresponding position in an output ndarray. This process can be best illustrated in fig. 9 (inspired by the nice work by Andrej Karpathy:

Figure 9: Image convolution illustration

Owl has provided thorough support of convolution operation:

val conv1d : ?padding:padding -> (float, 'a) t -> (float, 'a) t -> int array -> (float, 'a) t

val conv2d : ?padding:padding -> (float, 'a) t -> (float, 'a) t -> int array -> (float, 'a) t

val conv3d : ?padding:padding -> (float, 'a) t -> (float, 'a) t -> int array -> (float, 'a) t

They corresponds to different dimension of inputs. Besides, Owl also support other derived convolution types, including dilated convolutions, transpose convolutions, and backward convolutions etc.

It’s OK if none of this makes sense to you now. We’ll explain the convolution and its usage in later chapter in detail. The point is that, if you look closely, you can find that the image convolution is but only a special high dimensional case of the convolution equation: a given input signal (the image), another similar but smaller filter signal (the kernel), and the filter slides across the input signal and perform element-wise multiplication.

Therefore, we can implement the convolution with FFT and vice versa. For example, we can use conv1d function in Owl to solve the previous simple smoothing problem:

let y3  = Arr.reshape y [|1;251;1|]
let f3  = Arr.reshape filter [|10;1;1|]
let y3' = Arr.conv1d y3 f3 [|1|]

The smoothed data is similar to that in fig. 6 since the calculation is the same, only with more concise code.

Also, FFT is a popular implementation method of convolution. There has been a lot of research on optimising and comparing its performance with other implementation methods such as Winograd, with practical considerations such as kernel size and implementation details of code, but we will omit these technical discussion for now.

Summary

References

Moler, Cleve B. 2008. Numerical Computing with Matlab: Revised Reprint. Vol. 87. Siam.

Phillips, Charles L, John M Parr, and Eve A Riskin. 2003. Signals, Systems, and Transforms. Prentice Hall Upper Saddle River.

Smith, Steven W, and others. 1997. “The Scientist and Engineer’s Guide to Digital Signal Processing.”

Next: Chapter 11Algorithmic Differentiation