CVMFS arch is not compatible with modern TensorFlow wheels #2

Open
matthewfeickert opened this issue Feb 10, 2022 · 1 comment
Labels
wontfix This will not be worked on

Comments

@matthewfeickert
Owner

CVMFS LCG views have architectures that are not necessarily compatible with modern machine learning library wheels. For example, the CVMFS view LCG_98python3 x86_64-centos7-gcc8-opt is compatible with tensorflow v2.1.0 but not with tensorflow v2.8.0.
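
For reference when comparing a view against a wheel, here is a minimal sketch that prints the interpreter properties pip consults when matching wheel tags (Python version, machine, libc). The helper name `wheel_relevant_info` is hypothetical and not part of cvmfs-venv. Matching tags only mean a wheel can be installed; they do not guarantee that it imports at runtime.

```python
import platform
import sys
import sysconfig


def wheel_relevant_info():
    """Collect interpreter and platform details relevant to wheel tag matching."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "platform": sysconfig.get_platform(),
        # May be empty on platforms where the libc cannot be detected.
        "libc": f"{libc_name} {libc_version}".strip(),
    }


for key, value in wheel_relevant_info().items():
    print(f"{key}: {value}")
```

Running this both with the LCG view's Python and inside the virtual environment shows whether the two interpreters report the same values.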

Example

$ ssh uchicago
[17:38] login02.af.uchicago.edu:~ $ mkdir debug && cd debug
[17:38] login02.af.uchicago.edu:~/debug $ curl -sLO https://raw.githubusercontent.com/matthewfeickert/cvmfs-venv/2a6831069b4164925736efc9e4f25549ae831b4a/atlas_setup.sh
[17:38] login02.af.uchicago.edu:~/debug $ . atlas_setup.sh debug
(debug) [17:39] login02.af.uchicago.edu:~/debug $ deactivate
[17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow
[17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow-cpu
Name: tensorflow-cpu
Version: 2.1.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages
Requires: grpcio, wrapt, opt-einsum, six, gast, wheel, scipy, astor, google-pasta, tensorflow-estimator, tensorboard, keras-preprocessing, protobuf, termcolor, numpy, absl-py, keras-applications
Required-by:
[17:39] login02.af.uchicago.edu:~/debug $ . debug/bin/activate
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -m pip install --upgrade 'tensorflow==2.1.0'
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -m pip show tensorflow
Name: tensorflow
Version: 2.1.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/feickert/debug/debug/lib/python3.7/site-packages
Requires: absl-py, astor, gast, google-pasta, grpcio, keras-applications, keras-preprocessing, numpy, opt-einsum, protobuf, scipy, six, tensorboard, tensorflow-estimator, termcolor, wheel, wrapt
Required-by: 
(debug) [17:39] login02.af.uchicago.edu:~/debug $ python -c 'import tensorflow as tf; import keras'
2022-02-10 17:36:11.112165: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cvmfs/sft.cern.ch/lcg/releases/MCGenerators/thepeg/2.2.1-c1b37/x86_64-centos7-gcc8-opt/lib/ThePEG:/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/herwig++/7.2.1-71099/x86_64-centos7-gcc8-opt/lib/Herwig:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/torch/lib:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/tensorflow_core:/cvmfs/sft.cern.ch/lcg/releases/java/8u222-884d8/x86_64-centos7-gcc8-opt/jre/lib/amd64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/R/3.6.3-2dabd/x86_64-centos7-gcc8-opt/lib64/R/library/readr/rcon
2022-02-10 17:36:11.112273: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cvmfs/sft.cern.ch/lcg/releases/MCGenerators/thepeg/2.2.1-c1b37/x86_64-centos7-gcc8-opt/lib/ThePEG:/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/herwig++/7.2.1-71099/x86_64-centos7-gcc8-opt/lib/Herwig:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/torch/lib:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib/python3.7/site-packages/tensorflow_core:/cvmfs/sft.cern.ch/lcg/releases/java/8u222-884d8/x86_64-centos7-gcc8-opt/jre/lib/amd64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib64:/cvmfs/sft.cern.ch/lcg/views/LCG_98python3/x86_64-centos7-gcc8-opt/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/lib64:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib:/cvmfs/sft.cern.ch/lcg/releases/R/3.6.3-2dabd/x86_64-centos7-gcc8-opt/lib64/R/library/readr/rcon
2022-02-10 17:36:11.112286: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Using TensorFlow backend.
(debug) $ python -m pip install --upgrade tensorflow
(debug) $ python -m pip show tensorflow
Name: tensorflow
Version: 2.8.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/feickert/debug/debug/lib/python3.7/site-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, keras-preprocessing, libclang, numpy, opt-einsum, protobuf, setuptools, six, tensorboard, tensorflow-io-gcs-filesystem, termcolor, tf-estimator-nightly, typing-extensions, wrapt
Required-by: 
(debug) [17:40] login02.af.uchicago.edu:~/debug $ python -c 'import tensorflow as tf; import keras'
Traceback (most recent call last):
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 60, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/__init__.py", line 37, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/__init__.py", line 36, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 76, in <module>
    f'{traceback.format_exc()}'
ImportError: Traceback (most recent call last):
  File "/home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 60, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: /home/feickert/debug/debug/lib/python3.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb


Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors for some common causes and solutions.
If you need help, create an issue at https://github.com/tensorflow/tensorflow/issues and include the entire stack trace above this error message.
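
The undefined symbol in the traceback demangles to `tensorflow::OpKernel::TraceString(tensorflow::OpKernelContext const&, bool) const`. One possible explanation, which is an assumption and not confirmed in this issue, is that the `LD_LIBRARY_PATH` set by the LCG view (it includes a `site-packages/tensorflow_core` directory, as the warnings above show) causes an older TensorFlow shared library to be loaded ahead of the one shipped in the virtual environment's wheel. A minimal sketch for listing such entries follows; the helper name `find_shadowing_dirs` is hypothetical.

```python
import os
import sys


def find_shadowing_dirs(ld_library_path, venv_prefix, needle="tensorflow"):
    """Return LD_LIBRARY_PATH entries named like `needle` that live outside the venv."""
    hits = []
    # LD_LIBRARY_PATH is colon-separated on Linux.
    for entry in ld_library_path.split(":"):
        if not entry:
            continue
        directory_name = os.path.basename(entry.rstrip("/"))
        if needle in directory_name and not entry.startswith(venv_prefix):
            hits.append(entry)
    return hits


# Inside an activated virtual environment, sys.prefix points at the venv.
for directory in find_shadowing_dirs(os.environ.get("LD_LIBRARY_PATH", ""), sys.prefix):
    print(directory)
```

If this prints a directory from the LCG view, that is a candidate for further inspection. It does not prove that the directory is the cause of the import failure.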
matthewfeickert added the wontfix label (This will not be worked on) on Feb 10, 2022
@matthewfeickert
Owner Author

matthewfeickert commented Feb 11, 2022

As an example, I'm able to get an example script (example.tar.gz) to run in a virtual environment on a CVMFS machine with

(debug) $ python -m pip install --upgrade --force-reinstall 'tensorflow==2.1.0'

and

(debug) $ python -m pip install --upgrade --force-reinstall 'tensorflow==2.5.0'

but the tensorflow>=2.6.0 wheels fail with errors that suggest they are incompatible with the arch.
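
When repeating this check across several pinned versions, a small helper that tries the import in a fresh interpreter can make the pass/fail result explicit. This is a sketch under the assumption that a successful `import tensorflow` is an adequate smoke test; the name `can_import` is hypothetical.

```python
import subprocess
import sys


def can_import(module, python=sys.executable):
    """Try `import module` in a fresh interpreter.

    Returns (ok, last_stderr_line). The stderr line is mainly useful when
    ok is False; a successful import may still print warnings to stderr.
    """
    result = subprocess.run(
        [python, "-c", f"import {module}"],
        capture_output=True,
        text=True,
    )
    lines = result.stderr.strip().splitlines()
    return result.returncode == 0, (lines[-1] if lines else "")


ok, message = can_import("tensorflow")
print("import ok" if ok else f"import failed: {message}")
```

Run it inside the virtual environment after each `pip install --upgrade --force-reinstall` of a pinned version to record which versions import.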
