Checking GPU memory on Linux; TensorFlow error: CUDA_ERROR_OUT_OF_MEMORY (insufficient GPU memory)

lspci | grep -i vga
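View graphics card information. Cards that lspci lists as "3D controller", such as the Tesla cards below, do not match "vga", so filter on the vendor name instead: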

$ lspci | grep -i nvidia
83:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
84:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)

This machine has two cards, at bus addresses 83:00.0 and 84:00.0.
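
The bus addresses can also be extracted from the lspci output programmatically. The following is a minimal Python sketch, assuming the output format shown above; the function name nvidia_bus_addresses and the sample_lspci variable are illustrative and not part of any tool.

```python
import re
from typing import List


def nvidia_bus_addresses(lspci_output: str) -> List[str]:
    """Return PCI bus addresses (e.g. '83:00.0') of NVIDIA devices in lspci output."""
    addresses = []
    for line in lspci_output.splitlines():
        # Each device line starts with bus:device.function, e.g. "83:00.0".
        match = re.match(r"^([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F])\s", line)
        if match and "nvidia" in line.lower():
            addresses.append(match.group(1))
    return addresses


# Sample text copied from the lspci output above.
sample_lspci = """\
83:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
84:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
"""
print(nvidia_bus_addresses(sample_lspci))  # ['83:00.0', '84:00.0']
```

Each address returned can be passed to lspci -v -s to inspect one card at a time.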

lspci -v -s 83:00.0
View more detailed information for a single card

$ lspci -v -s 83:00.0
83:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
Subsystem: NVIDIA Corporation 12GB Computational Accelerator
Flags: bus master, fast devsel, latency 0, IRQ 92
Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Memory at 381800000000 (64-bit, prefetchable) [size=16G]
Memory at 381c00000000 (64-bit, prefetchable) [size=32M]
Capabilities:
Kernel driver in use: nvidia

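The largest prefetchable region is 16G. Note that this is a PCI address window (BAR) rather than the installed memory; the Subsystem line identifies this card as a 12GB accelerator.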
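
The "Memory at" lines can be parsed to list the region sizes. This is a rough Python sketch, assuming the lspci -v layout shown above; memory_regions and sample_verbose are hypothetical names used only for illustration. The sizes it returns are those of the PCI memory regions that lspci reports, so they are best cross-checked against what the driver reports.

```python
import re
from typing import List, Tuple


def memory_regions(lspci_verbose: str) -> List[Tuple[str, str]]:
    """Return (address, size) pairs from the 'Memory at' lines of `lspci -v` output."""
    # Matches e.g. "Memory at fa000000 (32-bit, non-prefetchable) [size=16M]".
    pattern = re.compile(r"Memory at ([0-9a-f]+) \(.*?\) \[size=(\w+)\]")
    return pattern.findall(lspci_verbose)


# Sample text copied from the lspci -v output above.
sample_verbose = """\
Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Memory at 381800000000 (64-bit, prefetchable) [size=16G]
Memory at 381c00000000 (64-bit, prefetchable) [size=32M]
"""
for address, size in memory_regions(sample_verbose):
    print(address, size)
```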

NVIDIA cards can also be inspected with nvidia-smi.
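Meaning of the table header fields:
Fan: fan speed, from 0 to 100%. This is the speed the system intends the fan to run at; if the card is not fan-cooled or the fan has failed, N/A is shown;
Temp: temperature inside the card, in degrees Celsius;
Perf: performance state, from P0 to P12, where P0 is maximum performance and P12 is minimum performance;
Pwr: power draw, shown as usage / cap;
Bus-Id: the PCI bus address of the GPU;
Disp.A: Display Active, whether a display is initialized on the GPU;
Memory Usage: GPU memory in use, shown as used / total;
Volatile GPU-Util: GPU utilization, in percent. In the nvidia-smi table, "Volatile" belongs to the "Volatile Uncorr. ECC" header printed above this column;
Compute M: compute mode;
The Processes section below the table shows how much GPU memory each process is using on each GPU.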
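
The same fields can be read in machine-friendly form through nvidia-smi's --query-gpu and --format=csv options. The sketch below is only illustrative: the helper names (parse_gpu_csv, query_gpus, gpu_with_most_free_memory) are hypothetical, and the numbers in sample_csv are made up for the example rather than measured on the machine above.

```python
import csv
import subprocess
from typing import Dict, List

# Field names accepted by `nvidia-smi --query-gpu=...`.
QUERY_FIELDS = ["index", "name", "memory.total", "memory.used", "utilization.gpu"]


def parse_gpu_csv(text: str) -> List[Dict[str, str]]:
    """Parse the output of nvidia-smi --format=csv,noheader,nounits into dicts."""
    rows = csv.reader(text.strip().splitlines(), skipinitialspace=True)
    return [dict(zip(QUERY_FIELDS, row)) for row in rows if row]


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi (requires the NVIDIA driver to be installed) and parse the result."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def gpu_with_most_free_memory(gpus: List[Dict[str, str]]) -> str:
    """Return the index of the GPU with the largest (total - used) memory, in MiB."""
    best = max(gpus, key=lambda g: int(g["memory.total"]) - int(g["memory.used"]))
    return best["index"]


# Illustrative values only; on a real machine call query_gpus() instead.
sample_csv = "0, Tesla K40m, 11441, 10800, 95\n1, Tesla K40m, 11441, 120, 0\n"
gpus = parse_gpu_csv(sample_csv)
print(gpu_with_most_free_memory(gpus))
```

When CUDA_ERROR_OUT_OF_MEMORY appears because another process already occupies a card, one common approach is to set the CUDA_VISIBLE_DEVICES environment variable to a less busy GPU index before TensorFlow is imported.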
