Unable to open a TensorFlow session

Problem description:

I get the following error when I try to open a TensorFlow session:

2017-09-24 10:49:20.526121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 970 
major: 5 minor: 2 memoryClockRate (GHz) 1.342 
pciBusID 0000:03:00.0 
Total memory: 3.94GiB 
Free memory: 3.87GiB 
2017-09-24 10:49:20.599629: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x3dcf7e0 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that. 
2017-09-24 10:49:20.599947: E tensorflow/core/common_runtime/direct_session.cc:171] Internal: failed initializing StreamExecutor for CUDA device ordinal 1: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1486, in __init__ 
    super(Session, self).__init__(target, graph, config=config) 
    File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 621, in __init__ 
    self._session = tf_session.TF_NewDeprecatedSession(opts, status) 
    File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__ 
    self.gen.next() 
    File "/home/user/python-envs/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status 
    pywrap_tensorflow.TF_GetCode(status)) 
tensorflow.python.framework.errors_impl.InternalError: Failed to create session. 

I have two GPUs in my system, one for display and the other for computation:

GPU0 (display) : Nvidia NVS 310 
GPU1 (compute) : Nvidia Geforce GTX 970 
Graphics Driver: 384.66 
CUDA version : 8 
cuDNN version : v6 for CUDA 8 (April 27, 2017) 
Operating Sys. : Ubuntu 16.04 

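Has anyone else run into this problem? How can I go about debugging or fixing it?

Note: I tried to open an issue on GitHub, but before I could finish I was asked to look for existing questions on SO or to ask there instead.

Thanks!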

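It appears that TensorFlow tries to grab every available GPU for computation, as described in the GitHub issue linked below. Setting the environment variable CUDA_VISIBLE_DEVICES to the device I wanted to use for computation did the trick.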
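
As a rough sketch of that workaround, the variable can be set from Python before TensorFlow is imported. The index 1 below is an assumption taken from the GPU listing in the question, and COMPUTE_GPU_INDEX is just an illustrative name. Setting CUDA_DEVICE_ORDER as well is my own addition, not something the original fix mentions: by default CUDA may number devices "fastest first", which does not necessarily match the order reported by nvidia-smi.

```python
import os

# Assumption: with PCI bus ordering, index 1 is the compute GPU (the GTX 970
# in the listing above). Substitute the index shown on your own machine.
COMPUTE_GPU_INDEX = 1

# Make CUDA enumerate devices in PCI bus order, which is the order nvidia-smi
# uses; CUDA's default ordering is "fastest first" and can differ.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"

# Expose only the chosen GPU to CUDA. Inside this process it shows up as device 0.
os.environ["CUDA_VISIBLE_DEVICES"] = str(COMPUTE_GPU_INDEX)

# Both variables must be set before TensorFlow initializes CUDA, so import
# TensorFlow only after this point:
# import tensorflow as tf
# sess = tf.Session()
```

The same effect can be had from the shell by prefixing the command, for example `CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 python`, which avoids touching the script at all.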

Possibly related issues on GitHub include: Segmentation fault when GPUs are already used

On Ubuntu, you can check the device IDs by running the nvidia-smi program.
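
If you would rather read the IDs from a script than from the nvidia-smi table, something along these lines should work. It is only a sketch: parse_gpu_list and list_gpus are hypothetical helper names, and it assumes nvidia-smi is on the PATH and supports the --query-gpu and --format=csv,noheader options.

```python
import subprocess


def parse_gpu_list(csv_text):
    """Turn lines such as '1, GeForce GTX 970' into (index, name) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first comma so the name itself stays intact.
        index, name = line.split(",", 1)
        gpus.append((int(index), name.strip()))
    return gpus


def list_gpus():
    """Ask nvidia-smi for the index and name of every GPU it can see."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"]
    )
    return parse_gpu_list(output.decode("utf-8"))
```

Calling `list_gpus()` on the affected machine returns one (index, name) pair per GPU. Note that these indices follow PCI bus order, so they only line up with CUDA_VISIBLE_DEVICES if CUDA is told to use the same ordering (CUDA_DEVICE_ORDER=PCI_BUS_ID).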