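Bus error: 10 in a merge sort implementation on OS X
Problem description:
I want to implement merge sort in C. The code I wrote works for a list of 100,000 numbers, but when I run it on a list of 1,000,000 numbers I get "Bus error: 10".
The error occurs at the line I have marked with the comment "BUS ERROR HERE". When it happens, tmp_list_i == 65920 and pws->merge_cursor == 32776. The merge function
merges any number of sub-arrays, because I also use it to merge sub-arrays that were sorted by different threads. But the bus error occurs even when I use only a single thread (i.e., when only two sub-arrays have to be merged at a time).
Any ideas?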
// Represents a sub-array in the list.
typedef struct
{
    int begin_i; // inclusive
    int end_i; // exclusive
    int already_sorted; // if the partition was sorted before runtime
    pthread_t tid; // thread associated with this partition, if any
    int merge_cursor; // index used for merging
} Partition;
// O(n log(n))
// n = number of comparisons in a merge
// log(n) = number of merges
void* merge_sort(void* partition)
{
    Partition* part = (Partition*) partition;
    // Base case. One item, so partition is sorted
    int len = part->end_i - part->begin_i;
    if (len < 2)
    {
        part->already_sorted = TRUE;
        return 0;
    }
    // Recursion
    Partition left_part;
    left_part.begin_i = part->begin_i;
    left_part.end_i = part->begin_i + (len/2);
    left_part.merge_cursor = left_part.begin_i;
    Partition right_part;
    right_part.begin_i = part->begin_i + (len/2);
    right_part.end_i = part->end_i;
    right_part.merge_cursor = right_part.begin_i;
    merge_sort(&left_part);
    merge_sort(&right_part);
    if (left_part.already_sorted && right_part.already_sorted)
        part->already_sorted = TRUE;
    // Create parts array to pass to merge
    Partition* parts[] = {&left_part, &right_part};
    if (merge(parts, 2, len) == FALSE)
        part->already_sorted = FALSE;
    return 0;
}
// O(n) but more specifically O(n * p + n) where p is num_parts
int merge(Partition* parts[], int num_parts, int total_num)
{
    int already_sorted = TRUE; // whether the partitions were already sorted
    int tmp_list[total_num];
    int tmp_list_i;
    for (tmp_list_i = 0; tmp_list_i < total_num; tmp_list_i++)
    {
        // find (P)artition (W)ith (S)mallest number under its merge cursor
        Partition* pws = NULL;
        int parts_i;
        for (parts_i = 0; parts_i < num_parts; parts_i++)
        {
            Partition* this_part = parts[parts_i];
            if (this_part->merge_cursor == MERGE_CURSOR_DONE)
                continue;
            if (pws == NULL)
                pws = this_part;
            int this_part_num = list[this_part->merge_cursor];
            int smallest_part_num = list[pws->merge_cursor];
            if (this_part_num < smallest_part_num)
            {
                pws = this_part;
                already_sorted = FALSE;
            }
        }
        // add the smallest of the numbers to current spot in tmp array
        tmp_list[tmp_list_i] = list[pws->merge_cursor]; // BUS ERROR HERE
        // increment the merge cursor for pws and set to NULL if done
        (pws->merge_cursor)++;
        if (pws->merge_cursor == pws->end_i)
            pws->merge_cursor = MERGE_CURSOR_DONE;
    }
    // Copy back to list from tmp_list. Costs an extra n.
    int list_i = parts[0]->begin_i; // start where we should in list
    for (tmp_list_i = 0; tmp_list_i < total_num; tmp_list_i++)
    {
        list[list_i] = tmp_list[tmp_list_i];
        list_i++;
    }
    return already_sorted;
}
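Edit: When I allocate everything on the heap instead of on the stack, I get a different problem. The assignment int this_part_num = list[this_part->merge_cursor];
does not seem to be evaluated correctly, and I eventually get a segfault: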
141 int this_part_num = list[this_part->merge_cursor];
(gdb) s
142 int smallest_part_num = list[pws->merge_cursor];
(gdb) print this_part_num
$5 = 1
(gdb) print list[this_part->merge_cursor]
$6 = 6
Answer
Figured it out. list was defined as int* list in a separate file,
but it was declared as extern int list[]
in the file that contains the merge_sort function.
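Why that mismatch bites: a pointer and an array are different objects, so the two declarations disagree about what lives at the symbol list. The behavior is undefined, but on typical toolchains code compiled against extern int list[] treats the storage of the pointer variable itself as the first array elements, so list[0] yields bytes of the address rather than the data. The following is only a single-file sketch of that effect, not the original two-file setup; the helper names read_through_pointer and read_as_if_array are hypothetical, and it assumes sizeof(int) <= sizeof(int *).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real definition in the file that owns the data:
   a pointer to heap storage. */
static int *list;

/* Correct access (what `extern int *list;` gives you):
   load the pointer, then index the block it points to. */
static int read_through_pointer(size_t i)
{
    return list[i];
}

/* Simulates what code built against `extern int list[];` typically does
   for list[0]: it reads the bytes of the pointer variable itself as if
   they were the first array element. Hypothetical helper for illustration. */
static int read_as_if_array(void)
{
    int value;
    assert(sizeof(int) <= sizeof list); /* assumption stated in the lead-in */
    memcpy(&value, &list, sizeof value);
    return value;
}
```

Changing list[0] changes what read_through_pointer(0) returns but not what read_as_if_array() returns, because the latter never touches the heap block. That would also be consistent with gdb printing a different value than the program computed, if the debugger used the pointer definition while the compiled code used the array declaration. The fix is to make both files agree, for example by putting extern int *list; in one shared header.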
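What is the value of 'total_num' when the error occurs? And what is the size of the 'list' array? By the way, I don't see a declaration for 'list'; is it somewhere? – user3386109 2014-08-28 05:33:53
On further inspection, declaring 'tmp_list' as a local variable may be the problem. As a local variable, 'tmp_list' is allocated on the stack, and an array of 1 million ints on the stack could cause a stack overflow (whereas 100K ints might actually fit on the stack). I suggest you 'malloc' the 'tmp_list' array and 'free' it at the end of the function. – user3386109 2014-08-28 05:42:03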
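I tried creating everything on the heap. But now things get even weirder. I end up with a seg fault, but before that, the assignment 'int this_part_num = list[this_part->merge_cursor];' is not evaluated correctly. See my edit above. – Sinclair 2014-08-28 12:11:57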
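For reference, the malloc suggestion from the comments could look roughly like the sketch below. It did not turn out to be the root cause here, but it is still a reasonable precaution: assuming a 4-byte int, a 1,000,000-element temporary array needs about 4 MB, which is a lot to ask of a thread stack. The function merge_two_runs is a hypothetical two-way helper, not the poster's merge; it takes the array as a parameter instead of using the global list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the original post: merges the sorted runs
   data[begin..mid) and data[mid..end) through a heap-allocated scratch
   buffer. Returns 0 on success, -1 if the allocation fails. */
static int merge_two_runs(int *data, size_t begin, size_t mid, size_t end)
{
    assert(begin <= mid && mid <= end);

    size_t total = end - begin;
    if (total == 0)
        return 0;

    /* Scratch space lives on the heap, so its size is not limited by the stack. */
    int *tmp = malloc(total * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    size_t left = begin, right = mid, out = 0;

    /* Take the smaller head element; on ties take the left one (stable). */
    while (left < mid && right < end)
        tmp[out++] = (data[right] < data[left]) ? data[right++] : data[left++];

    /* Drain whichever run still has elements. */
    while (left < mid)
        tmp[out++] = data[left++];
    while (right < end)
        tmp[out++] = data[right++];

    /* Copy the merged result back and release the scratch buffer. */
    memcpy(data + begin, tmp, total * sizeof *tmp);
    free(tmp);
    return 0;
}
```

Calling merge_two_runs(a, 0, 3, 6) on an array holding 1, 4, 9, 2, 3, 10 leaves it as 1, 2, 3, 4, 9, 10. Allocating one scratch buffer up front and reusing it across all merges would avoid the repeated malloc/free calls; the per-call allocation is used here only to keep the sketch short.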