示例程序023--傅里叶变换

原理:

对一张图像使用傅立叶变换就是将它分解成正弦余弦两部分。也就是将图像从空间域(spatial domain)转换到频域(frequency domain)。这一转换的理论基础来自于以下事实:任一函数都可以表示成无数个正弦和余弦函数的和的形式。傅立叶变换就是一个用来将函数分解的工具。 2维图像的傅立叶变换可以用以下数学公式表达:
示例程序023--傅里叶变换

 

式中 f 是空间域(spatial domain)值, F 则是频域(frequency domain)值。转换之后的频域值是复数, 因此,显示傅立叶变换之后的结果需要使用实数图像(real image) 加虚数图像(complex image), 或者幅度图像(magitude image)加相位图像(phase image)。在实际的图像处理过程中,仅仅使用了幅度图像,因为幅度图像包含了原图像的几乎所有我们需要的几何信息。然而,如果你想通过修改幅度图像或者相位图像的方法来间接修改原空间图像,你需要使用逆傅立叶变换得到修改后的空间图像,这样你就必须同时保留幅度图像和相位图像了。

在此示例中,展示如何计算以及显示傅立叶变换后的幅度图像。由于数字图像的离散性,像素值的取值范围也是有限的。比如在一张灰度图像中,像素灰度值一般在0到255之间。因此,我们这里讨论的也仅仅是离散傅立叶变换(DFT)。如何得到图像中的几何结构信息等详细的解释在代码中。

 

用到的函数:

void dft(InputArray src, OutputArray dst, int flags=0, int nonzeroRows=0)

  • src – Source array that could be real or complex.
  • dst – Destination array whose size and type depends on the flags .
  • flags –

Transformation flags representing a combination of the following values:

    • DFT_INVERSE performs an inverse 1D or 2D transform instead of the default forward transform.
    • DFT_SCALE scales the result: divide it by the number of array elements. Normally, it is combined with DFT_INVERSE .
    • DFT_ROWS performs a forward or inverse transform of every individual row of the input matrix. This flag enables you to transform multiple vectors simultaneously and can be used to decrease the overhead (which is sometimes several times larger than the processing itself) to perform 3D and higher-dimensional transforms and so forth.
    • DFT_COMPLEX_OUTPUT performs a forward transformation of 1D or 2D real array. The result, though being a complex array, has complex-conjugate symmetry (CCS, see the function description below for details). Such an array can be packed into a real array of the same size as input, which is the fastest option and which is what the function does by default. However, you may wish to get a full complex array (for simpler spectrum analysis, and so on). Pass the flag to enable the function to produce a full-size complex output array.
    • DFT_REAL_OUTPUT performs an inverse transformation of a 1D or 2D complex array. The result is normally a complex array of the same size. However, if the source array has conjugate-complex symmetry (for example, it is a result of forward transformation with DFT_COMPLEX_OUTPUT flag), the output is a real array. While the function itself does not check whether the input is symmetrical or not, you can pass the flag and then the function will assume the symmetry and produce the real output array. Note that when the input is packed into a real array and inverse transformation is executed, the function treats the input as a packed complex-conjugate symmetrical array. So, the output will also be a real array.
  • nonzeroRows – When the parameter is not zero, the function assumes that only the first nonzeroRows rows of the input array ( DFT_INVERSE is not set) or only the first nonzeroRows of the output array ( DFT_INVERSE is set) contain non-zeros. Thus, the function can handle the rest of the rows more efficiently and save some time. This technique is very useful for computing array cross-correlation or convolution using DFT.

 

int getOptimalDFTSize(int vecsize)

The function getOptimalDFTSize returns the minimum number N that is greater than or equal to vecsize so that the DFT of a vector of size N can be computed efficiently.

 

void copyMakeBorder(InputArray src, OutputArray dst, int top, int bottom, int left, intright, int borderType, const Scalar& value=Scalar())

src – Source image.

dst – Destination image of the same type as src and the size Size(src.cols+left+right, src.rows+top+bottom) .

top –

bottom –

left –

right – Parameter specifying how many pixels in each direction from the source image rectangle to extrapolate. For example, top=1, bottom=1, left=1, right=1 mean that 1 pixel-wide border needs to be built.

borderType – Border type. See borderInterpolate() for details.

value – Border value if borderType==BORDER_CONSTANT .

The function copies the source image into the middle of the destination image. The areas to the left, to the right, above and below the copied source image will be filled with extrapolated pixels. This is not what FilterEngine or filtering functions based on it do (they extrapolate pixels on-fly), but what other more complex functions, including your own, may do to simplify image boundary handling.

 

 

void merge(const Mat* mv, size_t count, OutputArray dst)

mv – Source array or vector of matrices to be merged. All the matrices in mv must have the same size and the same depth.

count – Number of source matrices when mv is a plain C array. It must be greater than zero.

dst – Destination array of the same size and the same depth as mv[0] . The number of channels will be the total number of channels in the matrix array.

 

The functions merge merge several arrays to make a single multi-channel array. That is, each element of the output array will be a concatenation of the elements of the input arrays, where elements of i-th input array are treated as mv[i].channels()-element vectors.

The function split() does the reverse operation. If you need to shuffle channels in some other advanced way, use mixChannels() .

 

 

void magnitude(InputArray x, InputArray y, OutputArray magnitude)

x – Floating-point array of x-coordinates of the vectors.

y – Floating-point array of y-coordinates of the vectors. It must have the same size as x .

dst – Destination array of the same size and type as x .

The function magnitude calculates the magnitude of 2D vectors formed from the corresponding elements of x and y arrays:示例程序023--傅里叶变换

 

 

代码:

// 030 傅里叶变换.cpp : 定义控制台应用程序的入口点。
//

#include "stdafx.h"

#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

#include <iostream>

using namespace cv;
using namespace std;

void help(char* progName)
{
    cout << endl
        <<  "This program demonstrated the use of the discrete Fourier transform (DFT). " << endl
        <<  "The dft of an image is taken and it's power spectrum is displayed."          << endl
        <<  "Usage:"                                                                      << endl
        << progName << " [image_name -- default lena.jpg] "                       << endl << endl;
}

int main(int argc, char ** argv)
{
 help(argv[0]);

    const char* filename = argc >=2 ? argv[1] : "lena.jpg";

 //灰度读入
    Mat I = imread(filename, CV_LOAD_IMAGE_GRAYSCALE);
    if( I.empty())
        return -1;
    

1.

将图像延扩到最佳尺寸. 离散傅立叶变换的运行速度与图片的尺寸息息相关。当图像的尺寸是2, 3,5的整数倍时,计算速度最快。
 因此,为了达到快速计算的目的,经常通过添凑新的边缘像素的方法获取最佳图像尺寸。
 函数 getOptimalDFTSize() 返回最佳尺寸,而函数 copyMakeBorder() 填充边缘像素:

)

/////////////////////////////////////

    Mat padded;                            //将输入图像延扩到最佳的尺寸
    int m = getOptimalDFTSize( I.rows );
    int n = getOptimalDFTSize( I.cols );
    copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));    //源图像拷贝进目标图像,添加的多余像素初始化为0.

 

/////////////////////////////////////
 (

 2.
 为傅立叶变换的结果(实部和虚部)分配存储空间. 傅立叶变换的结果是复数,这就是说对于每个原图像值,结果是两个图像值。
 此外,频域值范围远远超过空间值范围, 因此至少要将频域储存在 float 格式中。
 结果我们将输入图像转换成浮点类型,并多加一个额外通道来储存复数部分.

)

/////////////////////////////////////
    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};   //将延扩后的图像转float,且创建一个同样大小的矩阵
    Mat complexI;
    merge(planes, 2, complexI);         // 为延扩后的图像增添一个初始化为0的通道,单通道变为复通道

 


 //3.傅里叶变换
    dft(complexI, complexI);            // 原地计算,变换结果很好的保存在原始矩阵中

 

/////////////////////////////////////

(

 4.
 将复数转换为幅度.复数包含实数部分(Re)和复数部分 (imaginary - Im)。 离散傅立叶变换的结果是复数,对应的幅度可以表示为:
 sqrt(Re(DFT(I))^2 + Im(DFT(I))^2)
    其OpenCV代码:
)

/////////////////////////////////////

    split(complexI, planes);                   // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
    magnitude(planes[0], planes[1], planes[0]);// 幅度变换,planes[0] = magnitude 
    Mat magI = planes[0];                         
   

/////////////////////////////////////

 (

 5.
 对数尺度(logarithmic scale)缩放. 傅立叶变换的幅度值范围大到不适合在屏幕上显示。
 高值在屏幕上显示为白点,而低值为黑点,高低值的变化无法有效分辨。为了在屏幕上凸显出高低变化的连续性,
 我们可以用对数尺度来替换线性尺度:
 )

/////////////////////////////////////

    magI += Scalar::all(1);                    // 矩阵所有值+1,转换到对数尺度
    log(magI, magI);                          //对数尺度缩放,原地操作

 

 

/////////////////////////////////////

 (

 6.
 剪切和重分布幅度图象限. 第一步时延扩了图像, 那现在是时候将新添加的像素剔除了。
 为了方便显示,我们也可以重新分布幅度图象限位置(注:将第五步得到的幅度图从中间划开得到四张1/4子图像,
 将每张子图像看成幅度图的一个象限,重新分布即将四个角点重叠到图片中心)。 这样的话原点(0,0)就位移到图像中心
 )

/////////////////////////////////////

    // crop the spectrum, if it has an odd number of rows or columns
    magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
    // rearrange the quadrants of Fourier image  so that the origin is at the image center       
    int cx = magI.cols/2;
    int cy = magI.rows/2;

    Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left - 为每一个象限创建ROI
    Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
    Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
    Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right

    Mat tmp;                           // 交换象限 (Top-Left with Bottom-Right)
    q0.copyTo(tmp);
    q3.copyTo(q0);
    tmp.copyTo(q3);

    q1.copyTo(tmp);                    // 交换象限 (Top-Right with Bottom-Left)
    q2.copyTo(q1);
    tmp.copyTo(q2);

 

/////////////////////////////////////

 (

 7.
 归一化. 这一步的目的仍然是为了显示。
 现在我们有了重分布后的幅度图,但是幅度值仍然超过可显示范围[0,1] 。
 我们使用 normalize() 函数将幅度归一化到可显示范围。
 )

/////////////////////////////////////

    normalize(magI, magI, 0, 1, CV_MINMAX); // 将float类型的矩阵转换到可显示图像范围,(float [0, 1]).
                                          
    imshow("Input Image"       , I   );    // Show the result
    imshow("spectrum magnitude", magI);   
    waitKey();

    return 0;
}

 

运行结果:

示例程序023--傅里叶变换