Linking CUDA and C++: Undefined symbols for architecture i386

Question:

I have tried hard to get this working, without success, and I hope someone can help. I have two source files.

Main.cpp

#include <stdio.h> 
#include "Math.h" 
#include <math.h> 
#include <iostream> 

int cuda_function(int a, int b); 
int callKnn(void); 

int main(void) 
{ 
    int x = cuda_function(1, 2); 
    int f = callKnn(); 
    std::cout << f << std::endl; 
    return 1; 
} 

CudaFunctions.cu

#include <cuda.h> 
#include <stdio.h> 
#include "Math.h" 
#include <math.h> 
#include "cuda.h" 
#include <time.h> 
#include "knn_cuda_without_indexes.cu" 

__global__ void kernel(int a, int b) 
{ 
    //statements 
} 

int cuda_function2(int a, int b) 
{ 
    return 2; 
} 

int callKnn(void) 
{ 
    // Variables and parameters 
    float* ref;     // Pointer to reference point array 
    float* query;    // Pointer to query point array 
    float* dist;    // Pointer to distance array 
    int ref_nb  = 4096; // Reference point number, max=65535 
    int query_nb = 4096; // Query point number,  max=65535 
    int dim  = 32;  // Dimension of points 
    int k   = 20;  // Nearest neighbors to consider 
    int iterations = 100; 
    int i; 

    // Memory allocation 
    ref = (float *) malloc(ref_nb * dim * sizeof(float)); 
    query = (float *) malloc(query_nb * dim * sizeof(float)); 
    dist = (float *) malloc(query_nb * sizeof(float)); 

    // Init 
    srand(time(NULL)); 
    for (i=0 ; i<ref_nb * dim ; i++) ref[i] = (float)rand()/(float)RAND_MAX; 
    for (i=0 ; i<query_nb * dim ; i++) query[i] = (float)rand()/(float)RAND_MAX; 

    // Variables for duration evaluation 
    cudaEvent_t start, stop; 
    cudaEventCreate(&start); 
    cudaEventCreate(&stop); 
    float elapsed_time; 

    // Display informations 
    printf("Number of reference points  : %6d\n", ref_nb ); 
    printf("Number of query points   : %6d\n", query_nb); 
    printf("Dimension of points    : %4d\n", dim ); 
    printf("Number of neighbors to consider : %4d\n", k  ); 
    printf("Processing kNN search   :"    ); 

    // Call kNN search CUDA 
    cudaEventRecord(start, 0); 
    for (i=0; i<iterations; i++) 
     knn(ref, ref_nb, query, query_nb, dim, k, dist); 
    cudaEventRecord(stop, 0); 
    cudaEventSynchronize(stop); 
    cudaEventElapsedTime(&elapsed_time, start, stop); 
    printf(" done in %f s for %d iterations (%f s by iteration)\n", elapsed_time/1000, iterations, elapsed_time/(iterations*1000)); 

    // Destroy cuda event object and free memory 
    cudaEventDestroy(start); 
    cudaEventDestroy(stop); 
    free(dist); 
    free(query); 
    free(ref); 

    return 1; 
} 

I tried to build it from the terminal with the following commands:

g++ -c Main.cpp -m32 
nvcc -c CudaFunctions.cu -lcuda -D_CRT_SECURE_NO_DEPRECATE 
nvcc -o mytest Main.o CudaFunctions.o 

But I get the following errors:

Undefined symbols for architecture i386: 
    "cuda_function(int, int)", referenced from: 
     _main in Main.o 
    "_cuInit", referenced from: 
     knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o 
    "_cuCtxCreate_v2", referenced from: 
     knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o 
    "_cuMemGetInfo_v2", referenced from: 
     knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o 
    "_cuCtxDetach", referenced from: 
     knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o 
ld: symbol(s) not found for architecture i386 
collect2: ld returned 1 exit status 

I don't know whether this has something to do with the #include statements or the header files. I have run out of ideas to try.

Do you also need to tell nvcc that your CUDA code is 32-bit? – ChrisV

I have already tried that. Same error. –

The first undefined symbol

"cuda_function(int, int)", referenced from: 
    _main in Main.o 

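is caused by the fact that CudaFunctions.cu defines cuda_function2, not cuda_function. Correct the name in either CudaFunctions.cu or Main.cpp so that the declaration and the definition match.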
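
As a minimal sketch of that fix, assuming you choose to rename the definition in CudaFunctions.cu rather than the declaration in Main.cpp, the two pieces would line up like this. The body is copied from the question's cuda_function2, and the (void) casts are added here only to silence unused-parameter warnings.

```cpp
// Declaration, exactly as written in Main.cpp.
int cuda_function(int a, int b);

// Definition that CudaFunctions.cu would need to provide. The file in the
// question defines cuda_function2 instead, so the linker finds nothing that
// matches the declaration above and reports "cuda_function(int, int)" as
// undefined.
int cuda_function(int a, int b)
{
    (void)a;  // unused in this placeholder body
    (void)b;
    return 2; // same return value as cuda_function2 in the question
}
```

With the names matching, the "cuda_function(int, int)" entry should no longer appear in the linker output. The remaining _cu* symbols are a separate problem.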

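The rest of the undefined symbols are caused by not linking correctly against libcuda.dylib, because that is where those symbols live. Try moving the -lcuda argument to the second nvcc command line, which is the one that actually links the program together. Better yet, try omitting the argument entirely, because it is not necessary.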

Hi Jared. This is the error I get: ld: duplicate symbol _main in CudaFunctions.o and Main.o for architecture i386 collect2: ld returned 1 exit status –

I tried going to Developer/GPU Computing/C and typing make clean make x86_64=1 but got the same result. –

On another forum I came across another example of the same problem. http://forums.nvidia.com/index.php?showtopic=211771 –