Fixing the StyleGAN2 error "undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs"
-
System environment
Python and gcc environment of the local machine:
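The environment details relevant to this error can be collected with a short script. The following is a minimal sketch and not part of the original post: the helper names (`tool_version`, `tensorflow_version`, `collect_environment`) are made up for illustration, and it assumes `gcc` is on the `PATH` and that TensorFlow, if installed, is importable from the same interpreter that runs StyleGAN2.

```python
import platform
import shutil
import subprocess


def tool_version(executable):
    """Return the first line of `<executable> --version`, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            check=False,
        )
    except OSError:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


def tensorflow_version():
    """Return the installed TensorFlow version, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


def collect_environment():
    """Gather the versions that matter when building the custom CUDA ops."""
    return {
        "python": platform.python_version(),
        "gcc": tool_version("gcc"),
        "tensorflow": tensorflow_version(),
    }


if __name__ == "__main__":
    for name, value in collect_environment().items():
        print("{:<11}{}".format(name, value if value is not None else "not found"))
```

Running it in the same conda environment as `run_training.py` prints one line per component, which makes it easier to compare a failing setup with a working one.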
-
Problem description
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
Dataset shape = [3, 512, 512]
Dynamic range = [0, 255]
Label size = 0
2019-12-19 17:15:03.506648: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
Constructing networks...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Failed!
Traceback (most recent call last):
  File "run_training.py", line 194, in <module>
    main()
  File "run_training.py", line 189, in main
    run(**vars(args))
  File "run_training.py", line 122, in run
    dnnlib.submit_run(**kwargs)
  File "/data2/stylegan2/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/data2/stylegan2/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/data2/stylegan2/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/data2/stylegan2/training/training_loop.py", line 149, in training_loop
    G = tflib.Network('G', num_channels=training_set.shape[0], resolution=training_set.shape[1], label_size=training_set.label_size, **G_args)
  File "/data2/stylegan2/dnnlib/tflib/network.py", line 97, in __init__
    self._init_graph()
  File "/data2/stylegan2/dnnlib/tflib/network.py", line 154, in _init_graph
    out_expr = self._build_func(*self.input_templates, **build_kwargs)
  File "/data2//stylegan2/training/networks_stylegan2.py", line 186, in G_main
    components.synthesis = tflib.Network('G_synthesis', func_name=globals()[synthesis_func], **kwargs)
  File "/data2/stylegan2/dnnlib/tflib/network.py", line 97, in __init__
    self._init_graph()
  File "/data2/stylegan2/dnnlib/tflib/network.py", line 154, in _init_graph
    out_expr = self._build_func(*self.input_templates, **build_kwargs)
  File "/data2/stylegan2/training/networks_stylegan2.py", line 365, in G_synthesis_stylegan_revised
    x = layer(x, layer_idx=0, fmaps=nf(1), kernel=3)
  File "/data2/stylegan2/training/networks_stylegan2.py", line 350, in layer
    x = modulated_conv2d_layer(x, dlatents_in[:, layer_idx], fmaps=fmaps, kernel=kernel, up=up, resample_kernel=resample_kernel, fused_modconv=fused_modconv)
  File "/data2/stylegan2/training/networks_stylegan2.py", line 99, in modulated_conv2d_layer
    s = apply_bias_act(s, bias_var=mod_bias_var) + 1 # [BI] Add bias (initially 1).
  File "/data2/stylegan2/training/networks_stylegan2.py", line 68, in apply_bias_act
    return fused_bias_act(x, b=tf.cast(b, x.dtype), act=act, alpha=alpha, gain=gain)
  File "/data2/stylegan2/dnnlib/tflib/ops/fused_bias_act.py", line 68, in fused_bias_act
    return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain)
  File "/data2/stylegan2/dnnlib/tflib/ops/fused_bias_act.py", line 122, in _fused_bias_act_cuda
    cuda_kernel = _get_plugin().fused_bias_act
  File "/data2/stylegan2/dnnlib/tflib/ops/fused_bias_act.py", line 16, in _get_plugin
    return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
  File "/data2/stylegan2/dnnlib/tflib/custom_ops.py", line 156, in get_plugin
    plugin = tf.load_op_library(bin_file)
  File "/opt/conda/envs/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /data2/stylegan2/dnnlib/tflib/_cudacache/fused_bias_act_2d3e0715d2c9295a97a8ac95e846af6a.so: undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs
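The missing symbol name itself hints at the cause. In the Itanium C++ name mangling used by gcc, `Ss` abbreviates the old-ABI `std::string`, while the new libstdc++ ABI places the type in `std::__cxx11`, so `__cxx11` appears in the mangled name. The sketch below is a rough heuristic, not a real demangler; the function names `undefined_symbol` and `guess_string_abi` are hypothetical, and a tool such as `c++filt` remains the reliable way to inspect a symbol.

```python
import re


def undefined_symbol(error_message):
    """Extract the symbol name from an 'undefined symbol: ...' loader message."""
    match = re.search(r"undefined symbol: (\S+)", error_message)
    return match.group(1) if match else None


def guess_string_abi(mangled):
    """Guess which libstdc++ std::string ABI a mangled symbol refers to.

    Heuristic only: "__cxx11" marks the new ABI; "Ss" is the mangling
    abbreviation for the old-ABI std::string. A symbol with neither
    does not involve std::string in a way this check can see.
    """
    if "__cxx11" in mangled:
        return "new"
    if "Ss" in mangled:
        return "old"
    return "unknown"


if __name__ == "__main__":
    message = (
        "fused_bias_act_2d3e0715d2c9295a97a8ac95e846af6a.so: "
        "undefined symbol: _ZN10tensorflow12OpDefBuilder6OutputESs"
    )
    symbol = undefined_symbol(message)
    print(symbol, "->", guess_string_abi(symbol))
```

For the symbol in the traceback above the heuristic reports `old`, meaning the plugin expects an old-ABI `tensorflow::OpDefBuilder::Output(std::string)` that the loaded TensorFlow library does not provide.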
-
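Solution
The error comes from the gcc compile options used when the .so plugin is built. The trailing Ss in the missing symbol is the mangled form of the old (pre-C++11 ABI) std::string, so the plugin was compiled with -D_GLIBCXX_USE_CXX11_ABI=0, while the installed TensorFlow library apparently exports only the new-ABI version of that function. gcc versions newer than 4 (gcc 5 and later) use the C++11 ABI by default, so the =0 override is not needed here. Open custom_ops.py and, at line 127, change --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=0 to --compiler-options \'-fPIC -D_GLIBCXX_USE_CXX11_ABI=1.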
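Rather than hard-coding 0 or 1, the ABI value can be read from the installed TensorFlow, since `tf.sysconfig.get_compile_flags()` returns the flags TensorFlow recommends for building custom ops. The following is a sketch under that assumption and is not how the StyleGAN2 repository does it; `abi_value_from_flags`, `nvcc_compiler_options`, and `tensorflow_abi_value` are illustrative names.

```python
import re


def abi_value_from_flags(compile_flags):
    """Return 0 or 1 from a list of compiler flags, or None if the macro is absent."""
    for flag in compile_flags:
        match = re.fullmatch(r"-D_GLIBCXX_USE_CXX11_ABI=([01])", flag)
        if match:
            return int(match.group(1))
    return None


def nvcc_compiler_options(abi_value):
    """Build the --compiler-options fragment with the given ABI value."""
    return "--compiler-options '-fPIC -D_GLIBCXX_USE_CXX11_ABI={}'".format(abi_value)


def tensorflow_abi_value():
    """Ask the installed TensorFlow which ABI it was built with (None if unknown)."""
    try:
        import tensorflow as tf  # lazy import: the helpers above work without it
    except ImportError:
        return None
    return abi_value_from_flags(tf.sysconfig.get_compile_flags())


if __name__ == "__main__":
    value = tensorflow_abi_value()
    if value is None:
        print("Could not determine the ABI flag from TensorFlow")
    else:
        print(nvcc_compiler_options(value))
```

If the printed value is 1, it agrees with the manual edit described above; if it is 0, the mismatch probably lies elsewhere. Should a stale .so remain in the _cudacache directory shown in the traceback, deleting it forces the plugin to be rebuilt with the new options.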
-
References:
- https://blog.****.net/HelloJinYe/article/details/103302362
- https://blog.****.net/zhe_****/article/details/90345511
- https://blog.****.net/u012435142/article/details/81811487