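Changing the gcc optimization level changes the behavior of a C program

Question:

I'm seeing behavior I don't expect when compiling this code with different optimization levels in gcc.

The function test should fill a 64-bit unsigned integer with ones, shift it left by shift_size bits, and return the low 32 bits as a 32-bit unsigned integer.

When I compile with -O0, I get the results I expect.

When I compile with -O2, I do not if I try to shift by 32 bits or more.

In fact, I get exactly the results I would expect if I were shifting a 32-bit integer by an amount greater than or equal to its bit width on x86: a shift that uses only the low 5 bits of the shift count.

But I'm shifting a 64-bit number, so shifts < 64 should be legal, right?

I assume this is a bug in my understanding and not in the compiler, but I haven't been able to figure it out.

My machine: gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5, i686-linux-gnu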

#include <stdint.h> 
#include <stdio.h> 
#include <inttypes.h> 

uint32_t test(unsigned int shift_size) { 
    uint64_t res = 0; 
    res = ~res; 
    res = res << shift_size; //Shift size < uint64_t width so this should work 
    return res; //Implicit cast to uint32_t 
} 

int main(int argc, char *argv[]) 
{ 
    int dst; 
    sscanf(argv[1], "%d", &dst); //Get arg from outside so optimizer doesn't eat everything 
    printf("%" PRIu32 "l\n", test(dst)); 
    return 0; 
}

Usage:

$ gcc -Wall -O0 test.c 
$ ./a.out 32 
0l 
$ gcc -Wall -O2 test.c 
$ ./a.out 32 
4294967295l

gcc -S -Wall -O0 test.c

gcc -S -Wall -O2 test.c

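Could you also post the relevant part of the assembly code generated by your compiler? ('-S' for gcc) – 2012-01-11 09:50:55

FWIW, it gives the correct result ('0l') for me at all optimization levels from '-O0' to '-O3' with gcc 4.2.1, so I suspect you *may* have found a gcc bug. – 2012-01-11 09:52:46

Hm, the code works fine at both optimization levels for me... (gcc version 4.5.0 20100604, openSUSE 11.3 (x86_64)) – Bort 2012-01-11 09:54:18

Answer

It looks like a bug. My guess is that the compiler has folded the last two lines from: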

res = res << shift_size; 
return (uint32_t)res;

into:

return ((uint32_t)res) << shift_size;

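The latter is not well defined for shift counts of 32 or more, because shifting a 32-bit value by its full width or more is undefined behavior in C.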
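To make the difference concrete, here is a minimal sketch comparing the two orderings. The function names are hypothetical, and the second function is only a model of the suspected folding: it masks the count to its low 5 bits, as the question describes for 32-bit shifts on x86, so that the C code itself stays well defined.

```c
#include <assert.h>
#include <stdint.h>

/* What the source asks for: shift in 64 bits, then keep the low 32 bits. */
uint32_t shift_then_truncate(unsigned int shift_size) {
    assert(shift_size < 64);            /* a count of 64 or more would be undefined */
    uint64_t res = ~(uint64_t)0;
    res <<= shift_size;
    return (uint32_t)res;
}

/* Hypothetical model of the suspected folding: truncate first, then shift in
   32 bits, using only the low 5 bits of the count as an x86 32-bit shift does. */
uint32_t truncate_then_shift_x86(unsigned int shift_size) {
    uint32_t res = UINT32_MAX;
    return res << (shift_size & 31u);
}
```

For a count of 32, the first function yields 0 and the second yields 4294967295, which matches the -O0 and -O2 outputs reported in the question. For counts below 32 the two agree.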

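Answer

It looks to me like it could be a 32-bit-specific compiler bug. With the code from the question and gcc 4.2.1, I can reproduce the bug as long as I compile with gcc -m32 -O2 ....

However, if I add a debug printf: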

uint32_t test(unsigned int shift_size) { 
    uint64_t res = 0; 
    res = ~res; 
    res = res << shift_size; //Shift size < uint64_t width so this should work 
    printf("res = %llx\n", res); 
    return res; //Implicit cast to uint32_t 
}

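then the problem goes away.

The next step would be to look at the generated code to try to identify/confirm the bug.

As a temporary workaround, maybe print to '/dev/null': 'FILE *null; null = fopen("/dev/null", "w"); fprintf(null, "res = %llx\n", res); fclose(null);' – 2012-01-11 10:10:04

As a temporary workaround, making 'res' 'volatile' works just fine and doesn't require file operations. – Fanael 2012-01-11 11:34:38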
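The volatile workaround suggested in the comment above might look like the sketch below. The name test_volatile and the assert on the shift count are additions for illustration. Whether this actually sidesteps the miscompilation depends on the compiler version, so treat it as something to try rather than a guaranteed fix.

```c
#include <assert.h>
#include <stdint.h>

uint32_t test_volatile(unsigned int shift_size) {
    assert(shift_size < 64);     /* a count of 64 or more would be undefined */
    /* volatile forces every read and write of res to be performed as written,
       which keeps the optimizer from folding the shift and the truncation. */
    volatile uint64_t res = 0;
    res = ~res;
    res = res << shift_size;
    return (uint32_t)res;        /* explicit truncation to the low 32 bits */
}
```

On a correct compiler this returns the same values as the original test function, for example 0 for a shift of 32.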

Answer

"%u" (or "%lu") and uint32_t are not necessarily compatible. Try

#include <inttypes.h> 

    //printf("%ul\n", test(dst)); 
    printf("%" PRIu32 "l\n", test(dst));

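Printing a uint32_t value with the "%u" (or "%lu") specifier can invoke undefined behavior.

Good call, uint32_t may not be unsigned int. – dreamlax 2012-01-11 10:18:03

@dreamlax: I think the problem is that the OP mistakenly used "%ul" rather than "%lu", which printf ends up treating as "%u" followed by a literal "l". Using PRIu32 removes the possibility of using an incorrect specifier. – tinman 2012-01-11 10:21:13

The same point applies to both '"%u"' and '"%lu"'. – pmg 2012-01-11 10:22:25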
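As a small illustration of the two portable options, the sketch below formats a uint32_t either with the PRIu32 macro or by converting to unsigned long and using "%lu". The helper names format_u32 and format_u32_cast are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Formats value as decimal using the specifier that matches uint32_t. */
int format_u32(char *buf, size_t size, uint32_t value) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" PRIu32, value);
}

/* Alternative: convert to a type whose specifier is fixed by the standard.
   unsigned long is at least 32 bits wide, so no value is lost. */
int format_u32_cast(char *buf, size_t size, uint32_t value) {
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%lu", (unsigned long)value);
}
```

Both helpers return the number of characters snprintf would write, so a caller can check the result against the buffer size.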

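Answer

I don't remember what C99 says, but it seems that in GCC uint32_t contains at least 32 bits and may contain more, so when you optimize, it uses the 64-bit variant, which is faster on 64-bit machines.

No, uint32_t (if available) is exactly 32 bits. (uint_fast32_t or uint_least32_t might be larger, but not uint32_t.) – nos 2012-01-11 10:30:20

uint64_t is not faster than uint32_t in all cases: it is faster for indexing arrays if and only if you have a 64-bit address space on Intel/AMD64, because 16-bit and 32-bit indices have to be converted when only 8-bit and 64-bit offsets are legal in that assembly language. But multiplication and division are slower with larger bit sizes. – comonad 2012-01-11 14:12:41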
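The point about exact widths can be checked on any given platform with a short sketch like this one. The function names are hypothetical. It relies on the C99 rule that exact-width types such as uint32_t have no padding bits, so their width equals their storage size in bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Width of uint32_t in bits; exact-width types have no padding bits. */
unsigned int width_u32(void) {
    return (unsigned int)(sizeof(uint32_t) * CHAR_BIT);
}

/* Storage size in bits of the "least" and "fast" variants, which may be larger. */
unsigned int storage_bits_least32(void) {
    return (unsigned int)(sizeof(uint_least32_t) * CHAR_BIT);
}

unsigned int storage_bits_fast32(void) {
    return (unsigned int)(sizeof(uint_fast32_t) * CHAR_BIT);
}

void check_widths(void) {
    assert(width_u32() == 32);
    assert(UINT32_MAX == 4294967295u);
    assert(storage_bits_least32() >= 32);
    assert(storage_bits_fast32() >= 32);
}
```

Calling check_widths() succeeds on any conforming implementation that provides uint32_t. Only the sizes of the least and fast variants vary between platforms.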

Answer

I can reproduce this. Here is the relevant part of the code generated with -O2:

movl $-1, %eax 
    movl $-1, %edx 
    sall %cl, %eax 
    xorl %edx, %edx 
    testb   $32, %cl 
    cmovne  %eax, %edx 
    cmovne  %edx, %eax ; This appears to be the instruction in error. 
          ; It looks as though gcc thought that %edx might still 
          ; be zero at this point, because if the shift count is 
          ; >= 32 then %eax should be zero after this.

and here is the equivalent part with -O0:

movl -16(%ebp), %eax 
    movl -12(%ebp), %edx 
    shldl %cl,%eax, %edx 
    sall %cl, %eax 
    testb   $32, %cl 
    je      L3 
    movl    %eax, %edx 
    xorl    %eax, %eax   ; correctly zeros %eax if shift count >= 32 
L3: 
    movl %eax, -16(%ebp) 
    movl %edx, -12(%ebp)

The compiler is:

i686-apple-darwin11-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
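To follow the two listings step by step, here is a rough C model of what each sequence leaves in %eax, the low word that test returns. The function names are hypothetical and the instruction comments reflect my reading of the assembly above, not verified compiler internals. Variables are named after the registers they stand for.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the -O0 sequence: shldl / sall / testb / je / movl / xorl. */
uint32_t low_word_O0(unsigned int count) {
    assert(count < 64);
    uint32_t eax = UINT32_MAX;                      /* low half of ~0 */
    uint32_t edx = UINT32_MAX;                      /* high half of ~0 */
    unsigned int cl = count & 31u;                  /* 32-bit shifts use the count mod 32 */
    if (cl != 0)
        edx = (edx << cl) | (eax >> (32u - cl));    /* shldl %cl, %eax, %edx */
    eax <<= cl;                                     /* sall %cl, %eax */
    if (count & 32u) {                              /* testb $32, %cl */
        edx = eax;                                  /* movl %eax, %edx */
        eax = 0;                                    /* xorl %eax, %eax */
    }
    (void)edx;                                      /* high word is discarded by test */
    return eax;
}

/* Model of the -O2 sequence, including the suspect second cmovne. */
uint32_t low_word_O2(unsigned int count) {
    assert(count < 64);
    uint32_t eax = UINT32_MAX;                      /* movl $-1, %eax */
    uint32_t edx = UINT32_MAX;                      /* movl $-1, %edx */
    eax <<= (count & 31u);                          /* sall %cl, %eax */
    edx = 0;                                        /* xorl %edx, %edx */
    if (count & 32u) {                              /* testb $32, %cl */
        edx = eax;                                  /* cmovne %eax, %edx */
        eax = edx;                                  /* cmovne %edx, %eax: edx is no longer 0 */
    }
    return eax;
}
```

With a count of 32, low_word_O0 returns 0 while low_word_O2 returns 4294967295, the two outputs shown in the question. The second conditional move copies back the value that was just placed in %edx instead of the zero it held a moment earlier.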

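Thanks for posting your gcc -S output. I took a look, and although it is slightly different, the key part has the same error that I was seeing on my machine.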
