Linking CUDA and C++: Undefined symbols for architecture i386
I have tried hard to get this working, without success, and I hope someone can help. I have two source files.
Main.cpp
#include <stdio.h>
#include "Math.h"
#include <math.h>
#include <iostream>

int cuda_function(int a, int b);
int callKnn(void);

int main(void)
{
    int x = cuda_function(1, 2);
    int f = callKnn();
    std::cout << f << std::endl;
    return 1;
}
CudaFunctions.cu
#include <cuda.h>
#include <stdio.h>
#include "Math.h"
#include <math.h>
#include "cuda.h"
#include <time.h>
#include "knn_cuda_without_indexes.cu"

__global__ void kernel(int a, int b)
{
    //statements
}

int cuda_function2(int a, int b)
{
    return 2;
}

int callKnn(void)
{
    // Variables and parameters
    float* ref;              // Pointer to reference point array
    float* query;            // Pointer to query point array
    float* dist;             // Pointer to distance array
    int ref_nb = 4096;       // Reference point number, max=65535
    int query_nb = 4096;     // Query point number, max=65535
    int dim = 32;            // Dimension of points
    int k = 20;              // Nearest neighbors to consider
    int iterations = 100;
    int i;

    // Memory allocation
    ref = (float *) malloc(ref_nb * dim * sizeof(float));
    query = (float *) malloc(query_nb * dim * sizeof(float));
    dist = (float *) malloc(query_nb * sizeof(float));

    // Init
    srand(time(NULL));
    for (i=0 ; i<ref_nb * dim ; i++) ref[i] = (float)rand()/(float)RAND_MAX;
    for (i=0 ; i<query_nb * dim ; i++) query[i] = (float)rand()/(float)RAND_MAX;

    // Variables for duration evaluation
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    float elapsed_time;

    // Display informations
    printf("Number of reference points : %6d\n", ref_nb );
    printf("Number of query points : %6d\n", query_nb);
    printf("Dimension of points : %4d\n", dim );
    printf("Number of neighbors to consider : %4d\n", k );
    printf("Processing kNN search :" );

    // Call kNN search CUDA
    cudaEventRecord(start, 0);
    for (i=0; i<iterations; i++)
        knn(ref, ref_nb, query, query_nb, dim, k, dist);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&elapsed_time, start, stop);
    printf(" done in %f s for %d iterations (%f s by iteration)\n", elapsed_time/1000, iterations, elapsed_time/(iterations*1000));

    // Destroy cuda event object and free memory
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    free(dist);
    free(query);
    free(ref);
    return 1;
}
I tried to build it from the terminal with the following commands:
g++ -c Main.cpp -m32
nvcc -c CudaFunctions.cu -lcuda -D_CRT_SECURE_NO_DEPRECATE
nvcc -o mytest Main.o CudaFunctions.o
But I get the following errors:
Undefined symbols for architecture i386:
  "cuda_function(int, int)", referenced from:
      _main in Main.o
  "_cuInit", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuCtxCreate_v2", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuMemGetInfo_v2", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
  "_cuCtxDetach", referenced from:
      knn(float*, int, float*, int, int, int, float*)in CudaFunctions.o
ld: symbol(s) not found for architecture i386
collect2: ld returned 1 exit status
I don't know whether this has something to do with the #include statements or the header files. I have run out of ideas to try.
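The first undefined symbol

  "cuda_function(int, int)", referenced from:
      _main in Main.o

is caused by the fact that CudaFunctions.cu defines cuda_function2, not cuda_function. Main.cpp declares and calls cuda_function, so the linker looks for a definition under that exact name and finds none. Correct the name in either CudaFunctions.cu or Main.cpp.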
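The name fix can be sketched as follows. This is only an illustration: the declaration and the definition are shown in a single file so the snippet compiles on its own, whereas in the question they live in Main.cpp and CudaFunctions.cu respectively. The function body is the placeholder from the question.

```cpp
// Sketch only: both halves are shown together so the example is
// self-contained. In the question, the declaration is in Main.cpp and
// the definition is in CudaFunctions.cu.

// Declaration, as written in Main.cpp.
int cuda_function(int a, int b);

// Definition, renamed from cuda_function2 so that it matches the
// declaration above. The body is the placeholder from the question.
int cuda_function(int a, int b)
{
    (void)a;  // unused in the placeholder body
    (void)b;
    return 2;
}
```

With the names in agreement, the call int x = cuda_function(1, 2); in main resolves at link time. Because C++ mangles the parameter types into the symbol name, the parameter list has to match as well as the name.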
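The rest of the undefined symbols are caused by not linking correctly against libcuda.dylib, which is where those symbols live. In the commands above, -lcuda is passed to the compile-only step (nvcc -c), where nothing is linked, so it has no effect. Try moving the -lcuda argument to the second nvcc command line, the one that actually links the program together: nvcc -o mytest Main.o CudaFunctions.o -lcuda. Even better, try omitting the -lcuda argument entirely, because it isn't necessary.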
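Hi Jared. This is the error I get: ld: duplicate symbol _main in CudaFunctions.o and Main.o for architecture i386 collect2: ld returned 1 exit status
I tried going to Developer/GPU Computing/C and typing make clean and make x86_64=1, but got the same result.
On another forum I came across another example of the same problem. http://forums.nvidia.com/index.php?showtopic=211771
Do you also need to tell nvcc that your CUDA code is 32-bit? – ChrisV
I have already tried that. Same error.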