## Hyperspectral video restoration using optical flow and sparse coding

Optics Express, Vol. 20, Issue 10, pp. 10658-10673 (2012)

http://dx.doi.org/10.1364/OE.20.010658


### Abstract

Hyperspectral video acquisition is a trade-off between spectral and temporal resolution. We present an algorithm for recovering dense hyperspectral video of dynamic scenes from a few measured multispectral bands per frame using optical flow and sparse coding. A different set of bands is measured in each video frame, and optical flow is used to register them. Optical flow errors are corrected by exploiting sparsity in the spectra and the spatial correlation between images of a scene at different wavelengths. A redundant dictionary of atoms is learned that can sparsely approximate training spectra. The restoration of correct spectra is formulated as an *ℓ*_{1} convex optimization problem that minimizes a Mahalanobis-like weighted distance between the restored and corrupt signals, as well as between the restored signal and the median of the eight-connected neighbours of the corrupt signal, such that the restored signal is a sparse linear combination of the dictionary atoms. Spectral restoration is followed by spatial restoration using a guided dictionary approach, where one dictionary is learned for the measured bands and another for the band that is to be spatially restored. By constraining the sparse coding coefficients of both dictionaries to be the same, the restoration of the corrupt band is guided by the more reliable measured bands. Experiments on real data and comparison with an existing volumetric image denoising technique show the superiority of our algorithm.

© 2012 OSA

## 1. Introduction

In this paper, a *hyperspectral frame* refers to a hyperspectral data cube acquired at a time instant *t*, and a *band* refers to a single 2D image of a scene within a hyperspectral frame measured at a particular wavelength (see Fig. 2). Hyperspectral video acquisition is a trade-off between spectral and temporal resolution. We consider the case where only a few bands are measured in each frame, so that there is no motion within the bands of a frame and a higher frame rate is maintained. To increase the spectral resolution, the next hyperspectral frame acquires bands with a wavelength offset from the previous frame. Figure 3 illustrates the process. Throughout this paper, it will be assumed that exactly five bands are measured in each frame and the bands of the next frame are offset by 10 nm from the previous ones, so that after six frames, 30 bands are sensed, covering the range from 430 to 720 nm at 10 nm resolution. Note that video is useful only for capturing a dynamic scene. A static scene can be fully acquired as a single hyperspectral cube at full spectral resolution without any time constraints.
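One plausible acquisition schedule consistent with the description above can be sketched as follows: 5 bands per frame, consecutive frames offset by 10 nm, so that 6 frames cover all 30 bands from 430 to 720 nm. The exact interleaving used by the authors' camera is an assumption here; this is only an illustration of the counting.

```python
def frame_bands(t, start_nm=430, step_nm=10, bands_per_frame=5, cycle=6):
    """Wavelengths (nm) measured in frame t under an assumed interleaved
    schedule: each frame samples 5 bands spaced 60 nm apart, shifted by
    10 nm relative to the previous frame."""
    offset = t % cycle
    return [start_nm + step_nm * (offset + cycle * k) for k in range(bands_per_frame)]

# After one full cycle of 6 frames, all 30 bands are covered.
covered = sorted({w for t in range(6) for w in frame_bands(t)})
```

With this schedule, frame 0 measures 430, 490, 550, 610 and 670 nm, and the union over six frames spans 430 to 720 nm at 10 nm resolution.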

## 2. Prior work

Kittle et al. [1] proposed multiframe image estimation for coded aperture snapshot spectral imagers, and Wagadarikar et al. [3] demonstrated video-rate spectral imaging using a coded aperture snapshot spectral imager.

A set of signals **x** ∈ ℝ^{n} is said to exhibit a *sparse structure* if each signal can be approximated as a linear combination of a few atoms from a dictionary **D** ∈ ℝ^{n×M}, where *M* is the number of atoms in the dictionary. The dictionary **D** contains prototype signal atoms and is usually overcomplete, i.e. the number of atoms is greater than the dimensionality of the signals (*M* > *n*). An input signal **x** is approximated as

**x** ≈ **D***α*.   (1)

Here *α* ∈ ℝ^{M} is sparse, i.e. it has only a few non-zero elements. The sparsity of *α* is ensured by minimizing its *ℓ*_{0} pseudo-norm (the number of non-zero entries) in the constrained optimization problem

min_{*α*} ‖*α*‖_{0} subject to ‖**x** − **D***α*‖_{2}^{2} ≤ *ε*,   (2)

or its unconstrained version

min_{*α*} ‖**x** − **D***α*‖_{2}^{2} + *γ*‖*α*‖_{0}.   (3)

The parameter *γ* sets the trade-off between the approximation error and the sparsity of *α*. Minimizing the *ℓ*_{0} pseudo-norm is NP-hard, so greedy algorithms such as Orthogonal Matching Pursuit (OMP) [4] are used to approximate the solution. Since *ℓ*_{1} regularization also results in a sparse solution for *α*, it is frequently used to formulate sparse coding as a convex optimization problem commonly known as the Lasso [5]:

min_{*α*} ‖**x** − **D***α*‖_{2}^{2} + *γ*_{1}‖*α*‖_{1}.   (4)

Here *γ*_{1} is a regularization parameter similar to *γ* in Eq. (3).

The learned dictionary **D** is critical, especially for the task of signal denoising and restoration. Aharon et al. [6] proposed the K-SVD algorithm for designing overcomplete dictionaries for sparse representation, and Elad and Aharon [7] used such learned dictionaries for image denoising, where *ℓ*_{0} regularization was used for sparsity:

{**Â**, **X̂**} = argmin_{**A**,**X**} Σ_{ij} *γ*_{ij}‖*α*_{ij}‖_{0} + Σ_{ij} ‖**D***α*_{ij} − **R**_{ij}**X**‖_{2}^{2} + *γ*_{2}‖**X** − **Y**‖_{2}^{2}.   (5)

In this expression, **A** represents the set of all *α*_{ij}, where *ij* runs over all pixels of the image; **Y** is the noisy image, **X** is its unknown denoised version, and **R**_{ij}**X** is a patch around pixel *ij* extracted from image **X**. This is similar to Eq. (3) except for the last self-regularization term, which forces the denoised image to be close to the original noisy image. The parameter *γ*_{ij} controls the relative importance of the sparsity of patch *ij*, and *γ*_{2} controls the relative importance of self-similarity of the complete reconstructed image. This approach worked well for lower noise levels, but the results deteriorated rapidly at higher levels of noise. For denoising high-dimensional signals such as volumetric images, the same technique [7] was extended to color image restoration [8] using *p* × *p* × 3 RGB patches, except for a new projection method in the OMP step. The inner product **y**^{T}**x** in the original OMP [4] is replaced by **y**^{T}(**I** + *γ***K**/*p*)**x**, where *γ* is an empirically selected control parameter and **K** is a fixed matrix defined in [8] to reduce color artifacts.

In the context of hyperspectral imagery, Othman and Qian [9] proposed a hybrid spatial-spectral derivative-domain wavelet shrinkage for noise reduction. Bourguignon et al. [10] denoised hyperspectral astrophysical data with colored noise using *ℓ*_{1}-norm regularization to impose sparsity constraints on their respective canonical and DCT bases. Results are reported on *simulated* data from the MUSE (Multi Unit Spectroscopic Explorer) consortium.
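The patch-based denoising objective described above can be illustrated with a toy NumPy implementation: every overlapping patch is sparsely approximated, the overlapping reconstructions are averaged, and the result is blended with the noisy input. The orthonormal 2D-DCT dictionary, patch size, and parameter values here are illustrative assumptions, not the learned dictionaries or settings of [7].

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the basis vectors."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def sparsify(D, patch, k):
    """Keep only the k largest-magnitude coefficients over orthonormal D."""
    a = D.T @ patch
    a[np.argsort(np.abs(a))[:-k]] = 0.0
    return a

def denoise(Y, p=4, k=3, gamma2=0.5):
    """Sparse-code every overlapping p-by-p patch, average the overlapping
    reconstructions, and blend with the noisy input (self-regularization)."""
    C = dct_matrix(p)
    D = np.kron(C, C).T                  # orthonormal 2D-DCT atoms as columns
    acc = np.zeros_like(Y)               # sum of patch reconstructions
    cnt = np.zeros_like(Y)               # number of patches covering each pixel
    H, W = Y.shape
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            a = sparsify(D, Y[i:i+p, j:j+p].reshape(-1), k)
            acc[i:i+p, j:j+p] += (D @ a).reshape(p, p)
            cnt[i:i+p, j:j+p] += 1
    return (gamma2 * Y + acc) / (gamma2 + cnt)

rng = np.random.default_rng(3)
clean = np.linspace(0.0, 1.0, 16)[:, None] * np.ones((1, 16))  # smooth ramp
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
denoised = denoise(noisy)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
```

Keeping only 3 of 16 DCT coefficients per patch discards most of the noise energy while a smooth patch is still well approximated, so the error drops relative to the noisy input.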

## 3. Hyperspectral frame registration using optical flow

Optical flow computes, for each pixel, a displacement (*δ*_{x}, *δ*_{y}) such that the error between two consecutive video frames is minimized:

(*δ*_{x}, *δ*_{y}) = argmin Σ_{x,y} ( *F*_{t}(*x* + *δ*_{x}, *y* + *δ*_{y}) − *F*_{t−1}(*x*, *y*) )^{2}.   (6)

We calculated the pixel displacements with Farnebäck's two-frame optical flow estimation [11] based on polynomial expansion. Each pixel neighbourhood is approximated by a quadratic polynomial *f*(**x**) ≈ **x**^{T}**Ax** + **b**^{T}**x** + *c*, and the displacement between two pixels is estimated as

**d** = −(1/2) **A**^{−1}(**b**_{2} − **b**_{1}),   (7)

where **A** is the symmetric matrix of coefficients common to both polynomials and **b**_{1} and **b**_{2} are the vectors of coefficients of their respective polynomials. Farnebäck also incorporates integration over a neighbourhood, multiscale analysis and an iterative process to increase optical flow accuracy [11]. However, as with other optical flow algorithms, image regions with minimal or ambiguous texture remain the main sources of error.
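The displacement formula **d** = −½**A**^{−1}(**b**₂ − **b**₁) is easy to sanity-check numerically. The sketch below uses synthetic polynomial coefficients and is not Farnebäck's full multi-scale pipeline: if the second frame is the first shifted by **d**, its linear coefficients become **b**₂ = **b**₁ − 2**Ad**, and the formula recovers **d** exactly.

```python
import numpy as np

def farneback_displacement(A, b1, b2):
    """Displacement estimate d = -0.5 * A^-1 (b2 - b1) from the polynomial
    expansion coefficients of two frames."""
    return -0.5 * np.linalg.solve(A, b2 - b1)

# Synthetic check: f2(x) = f1(x - d) shares A and has b2 = b1 - 2 A d.
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric, common to both frames
b1 = np.array([1.0, -1.0])
d_true = np.array([0.3, -0.8])
b2 = b1 - 2.0 * A @ d_true               # coefficients of the shifted patch
d = farneback_displacement(A, b1, b2)
```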

## 4. Hyperspectral video restoration

Our restoration builds on two observations: reflectance spectra of natural surfaces admit compact representations, as shown for the characteristic spectra of Munsell colors [12], and overcomplete dictionaries tailored to such signals can be learned with algorithms such as K-SVD [6].

### 4.1. Spectral restoration

Recall that the scene is *measured* at five out of 30 wavelengths in each hyperspectral frame. Therefore, these five measurements are more reliable compared to the remaining 25, which come from optical flow based registration and may contain errors. Let **s**_{ij} ∈ ℝ^{30} be the spectral response at pixel *i*, *j* of the 30-band hyperspectral frame (cube) obtained from optical flow, and **D**_{λ} ∈ ℝ^{30×M_{λ}} be the overcomplete (spectral) dictionary (where *M*_{λ} is the size of the spectral dictionary) learned from the static pixels of the hyperspectral frame. Then, according to sparse coding theory, the denoised hyperspectral frame can be recovered as

**Ĥ**_{ij} = **D**_{λ}*α̂*_{ij},   (8)

where the sparse coefficients *α̂*_{ij} are given by

*α̂*_{ij} = argmin_{*α*} ‖**s**_{ij} − **D**_{λ}*α*‖_{2}^{2} + *γ*_{1}‖*α*‖_{1},   (9)

and *γ*_{1} is the sparsity regularizer. Note that each vector *α̂*_{ij} is computed separately. The restored multispectral image **Ĥ** is the ensemble of spectra **Ĥ**_{ij} defined at each pixel position *ij*. Note that Eq. (9) minimizes the *ℓ*_{2} error between the input signal **s**_{ij} and its sparse approximation, giving equal weight to all dimensions (bands/wavelengths). However, **s**_{ij} is more reliable at the measured bands/wavelengths. Moreover, optical flow between neighbouring frames is less likely to have errors compared to optical flow accumulated over five consecutive frames. Therefore, we introduce a weighting term such that

*α̂*_{ij} = argmin_{*α*} ‖**W**(**s**_{ij} − **D**_{λ}*α*)‖_{2}^{2} + *γ*_{1}‖*α*‖_{1},   (10)

where **W** ∈ ℝ^{30×30} is a diagonal matrix of weights that gives the highest weight to the measured wavelengths, followed by those which are estimated from optical flow between nearest frames. Bands/wavelengths that are registered between distant frames get the least weight.

To further suppress registration errors, the restored spectrum is also constrained to be close to **s̄**_{ij}, the median of the eight-connected neighbours of **s**_{ij}, where *γ*_{2} is a mixing constant that regularizes the relative importance of **s**_{ij} and **s̄**_{ij}. The median is computed independently in each band. The coefficients *α̂*_{ij} are then computed using this median-filtered term as well:

*α̂*_{ij} = argmin_{*α*} ‖**W**(**s**_{ij} − **D**_{λ}*α*)‖_{2}^{2} + *γ*_{2}‖**s̄**_{ij} − **D**_{λ}*α*‖_{2}^{2} + *γ*_{1}‖*α*‖_{1}.   (11)

Equation (11) is convex for a known dictionary. The dictionary is learned in the formulation of Eq. (4) using the online dictionary learning algorithm [13]. The spectra in static regions of the scene are used as training data for dictionary learning. Note that **WD**_{λ} needs to be calculated only once. Equation (11) can be solved using Least Angle Regression (LARS) [14].

### 4.2. Spatial restoration

Spatial restoration proceeds patch-wise with the guidance of the five measured bands. **R**_{ij} is an operator that extracts *p* × *p* patch(es) from any number of bands (the 5 measured bands **H**_{m} or a single band to be restored **H**_{e}), **D**_{sm} ∈ ℝ^{5p²×M_{s}} is the spatial dictionary learned to approximate the *p* × *p* × 5 patches of the measured bands, and **D**_{se} ∈ ℝ^{p²×M_{s}} is the corresponding dictionary learned to approximate the *p* × *p* patches of the band to be restored. By enforcing the sparse coefficients *β*_{ij} (and hence the size *M*_{s}) of both dictionaries to be the same, we establish a link between the two dictionaries:

*β̂*_{ij} = argmin_{*β*} ‖**R**_{ij}**H**_{m} − **D**_{sm}*β*‖_{2}^{2} + *δ*‖**R**_{ij}**H**_{e} − **D**_{se}*β*‖_{2}^{2} + *γ*_{3}‖*β*‖_{1},   (12)

where *δ* is a trade-off parameter. By combining the two dictionaries into **D**_{s} = [**D**_{sm}; √*δ* **D**_{se}] and stacking the patches accordingly, the above expression simplifies to

*β̂*_{ij} = argmin_{*β*} ‖[**R**_{ij}**H**_{m}; √*δ* **R**_{ij}**H**_{e}] − **D**_{s}*β*‖_{2}^{2} + *γ*_{3}‖*β*‖_{1}.   (13)

We set *δ* = 1 so that the five measured (more reliable) bands get 5/6 of the weight and the estimated band gets 1/6. The restored patch of the corrupt band is then obtained as **D**_{se}*β̂*_{ij}. Our guided dictionary approach is similar to the work of Yang et al. [15] on image super-resolution via sparse representation, where coupled low- and high-resolution dictionaries share the same sparse codes.
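The guided-dictionary idea can be sketched as follows, using random dictionaries, a toy OMP solver, and an artificial corruption; the authors instead use learned dictionaries and a LARS solver. The key point is that the sparse code is fitted on the stacked signal, so the reliable measured bands dominate, and the estimated-band dictionary then reconstructs the corrupt band.

```python
import numpy as np

def omp(D, y, k):
    """Tiny OMP: greedily select k atoms of D, refitting by least squares."""
    r, S = y.copy(), []
    for _ in range(k):
        S.append(int(np.argmax(np.abs(D.T @ r))))
        coef, *_ = np.linalg.lstsq(D[:, S], y, rcond=None)
        r = y - D[:, S] @ coef
    beta = np.zeros(D.shape[1]); beta[S] = coef
    return beta

rng = np.random.default_rng(2)
p, Ms, delta = 3, 60, 1.0
D_sm = rng.standard_normal((5 * p * p, Ms))   # dictionary for 5 measured bands
D_se = rng.standard_normal((p * p, Ms))       # dictionary for the band to restore
D_s = np.vstack([D_sm, delta * D_se])         # combined guided dictionary
scale = np.linalg.norm(D_s, axis=0)
D_s_n = D_s / scale                           # unit-norm atoms for OMP

beta_true = np.zeros(Ms); beta_true[[5, 33]] = [1.0, -0.8]
clean_e = D_se @ beta_true                    # ground-truth patch of the band
corrupt_e = clean_e + 0.5 * rng.standard_normal(p * p)
y = np.concatenate([D_sm @ beta_true, delta * corrupt_e])  # stacked patches

beta = omp(D_s_n, y, k=2) / scale             # undo atom normalisation
restored = D_se @ beta                        # guided reconstruction of the band
```

Since 45 of the 54 stacked rows come from the clean measured bands, the fitted code is close to the true one and the restored patch is closer to the clean band than the corrupt input.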

## 5. Experimental setup and data collection

The regularization parameters were set to *γ*_{1} = *γ*_{3} = 0.15 and *γ*_{2} = 0.3. The spectral dictionary size was set to over three times the dimensionality of the spectral signal (the number of bands), i.e. *M*_{λ} = 100, so that the dictionary is overcomplete. The patch size *p* in spatial restoration was set to 3, since a patch smaller than 3 × 3 does not contain significant spatial information. Accordingly, the spatial dictionary size was set to over three times the dimensionality of the patch (3 × 3 × 6, where 6 corresponds to the 5 measured bands plus the one band to be restored), i.e. *M*_{s} = 256. We also report results for *p* = 5 and *M*_{s} = 700.

## 6. Hyperspectral video restoration results

The root mean squared error (RMSE) between a restored hyperspectral frame *H*^{r} ∈ ℝ^{u×v×n} and its corresponding ground truth *H*^{g} ∈ ℝ^{u×v×n} is given by

RMSE = √( (1/(*uvn*)) Σ_{x=1}^{u} Σ_{y=1}^{v} Σ_{λ=1}^{n} ( *H*^{r}(*x*, *y*, *λ*) − *H*^{g}(*x*, *y*, *λ*) )^{2} ),   (14)

where *n* is the spectral and *u* × *v* are the spatial dimensions. To avoid bias in the results, RMSE was always measured at only those pixels where motion was detected by optical flow. The RMSE of the full frames was much lower due to averaging with more reliable spectra in the static regions. The input frames were divided into static regions and motion regions using two conservative masks obtained from optical flow. The masks ensured that only static regions were used for learning the dictionaries and that RMSE was calculated at the motion pixels.
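The motion-masked RMSE protocol can be expressed directly; the frame sizes and the single "motion" pixel below are arbitrary synthetic values chosen only to show how masking changes the score.

```python
import numpy as np

def rmse(Hr, Hg, mask=None):
    """RMSE between restored and ground-truth hyperspectral frames,
    optionally restricted to pixels flagged as moving by optical flow."""
    diff2 = (Hr - Hg) ** 2
    if mask is not None:
        diff2 = diff2[mask]        # keep only motion pixels (all bands)
    return float(np.sqrt(diff2.mean()))

u, v, n = 4, 4, 30
Hg = np.zeros((u, v, n))
Hr = np.zeros((u, v, n)); Hr[0, 0, :] = 1.0        # error at one "motion" pixel
motion = np.zeros((u, v), dtype=bool); motion[0, 0] = True
full = rmse(Hr, Hg)                # diluted by the 15 error-free static pixels
moving = rmse(Hr, Hg, mask=motion) # evaluated at the motion pixel only
```

Averaging over the whole frame dilutes the error (here 0.25 instead of 1.0), which is exactly the bias the motion mask avoids.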

## 7. Processing time

The dominant computational cost is the sparse coding step of Eq. (11), which is solved with LARS [14].

## 8. Conclusion

## Acknowledgment

## References and links

1. D. Kittle, K. Choi, A. Wagadarikar, and D. Brady, “Multiframe image estimation for coded aperture snapshot spectral imagers,” Appl. Opt. **49**, 6824–6833 (2010). [CrossRef] [PubMed]
2. M. Shankar, N. Pitsianis, and D. Brady, “Compressive video sensors using multichannel imagers,” Appl. Opt. **49**, B9–B17 (2010). [CrossRef] [PubMed]
3. A. Wagadarikar, N. Pitsianis, X. Sun, and D. Brady, “Video rate spectral imaging using a coded aperture snapshot spectral imager,” Opt. Express **17**, 6368–6388 (2009). [CrossRef] [PubMed]
4. Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Proceedings of the 27th Asilomar Conference on Signals, Systems, and Computers (IEEE, 1993), 40–44. [CrossRef]
5. R. Tibshirani, “Regression shrinkage and selection via the Lasso,” J. R. Stat. Soc. Ser. B **58**, 267–288 (1996).
6. M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process. **54**, 4311–4322 (2006). [CrossRef]
7. M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process. **15**, 3736–3745 (2006). [CrossRef] [PubMed]
8. M. Elad and G. Sapiro, “Sparse representation for color image restoration,” IEEE Trans. Image Process. **17**, 53–69 (2008). [CrossRef] [PubMed]
9. H. Othman and S. Qian, “Noise reduction of hyperspectral imagery using hybrid spatial-spectral derivative-domain wavelet shrinkage,” IEEE Trans. Geosci. Remote Sens. **44**, 397–408 (2006). [CrossRef]
10. S. Bourguignon, D. Mary, and E. Slezak, “Sparsity-based denoising of hyperspectral astrophysical data with colored noise: Application to the MUSE instrument,” in 2nd Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (IEEE, 2010), 1–4. [CrossRef]
11. G. Farnebäck, “Two-frame motion estimation based on polynomial expansion,” in Proceedings of the 13th Scandinavian Conference on Image Analysis (Springer, 2003), 363–370.
12. J. Parkkinen, J. Hallikainen, and T. Jaaskelainen, “Characteristic spectra of Munsell colors,” J. Opt. Soc. Am. A **6**, 318–322 (1989). [CrossRef]
13. J. Mairal, J. Ponce, and G. Sapiro, “Online learning for matrix factorization and sparse coding,” J. Mach. Learn. Res. **11**, 19–60 (2010).
14. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle regression,” Ann. Stat. **32**, 407–499 (2004). [CrossRef]
15. J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process. **19**, 2861–2873 (2010). [CrossRef]
16. P. Ndajah, H. Kikuchi, M. Yukawa, H. Watanabe, and S. Muramatsu, “An investigation on the quality of denoised images,” Int. J. Circuits, Systems and Signal Process. **5**, 423–434 (2011).

**OCIS Codes**

(100.3020) Image processing : Image reconstruction-restoration

(100.4145) Image processing : Motion, hyperspectral image processing

**ToC Category:**

Image Processing

**History**

Original Manuscript: February 23, 2012

Revised Manuscript: April 19, 2012

Manuscript Accepted: April 19, 2012

Published: April 24, 2012

**Citation**

Ajmal Mian and Richard Hartley, "Hyperspectral video restoration using optical flow and sparse coding," Opt. Express **20**, 10658-10673 (2012)

http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-10-10658



### Figures

Fig. 1 – Fig. 13.
