Reduce batch size & general questions #1

Open
Romeo1-1 opened this issue Jan 26, 2023 · 0 comments

Hello, and thank you for this very interesting and exciting package!

I have 2 issues and 1 general question:

  1. I am working on a Windows laptop with WSL2. I tried running your Docker image but always ran into this error:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/775f320175cbf2f849eb43680d71fb362bd58a8c7b33ea54ab99002f75bc476a/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown

This issue does not seem to be related only to your package. As mentioned in NVIDIA/nvidia-docker#1699 (comment), I rebuilt a new Docker image with the Dockerfile below, and it now works.

FROM deepair1/deepair:latest

# Remove the NVIDIA driver libraries and CUDA compat files baked into the
# image so the NVIDIA container runtime can mount the host's copies instead
# (the conflict behind the "file exists" mount error above).
RUN rm -rf \
    /usr/lib/x86_64-linux-gnu/libcuda.so* \
    /usr/lib/x86_64-linux-gnu/libnvcuvid.so* \
    /usr/lib/x86_64-linux-gnu/libnvidia-*.so* \
    /usr/lib/firmware \
    /usr/local/cuda/compat/lib
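(In case it helps anyone else: I built the patched image with a plain docker build -t deepair-wsl2 . and then started it with docker run --gpus all as before; the deepair-wsl2 tag is just an example name I chose.)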
  2. When I try to run your examples, I always run out of memory immediately. This is the end of the error message I receive:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: Exception encountered when calling layer "output" (type TFBertOutput).

failed to allocate memory [Op:AddV2]

Call arguments received:
  • hidden_states=tf.Tensor(shape=(3, 5, 4096), dtype=float32)
  • input_tensor=tf.Tensor(shape=(3, 5, 1024), dtype=float32)
  • training=False

I am running your example on a laptop with an NVIDIA GTX 1050 Ti GPU, a 7th-gen i5 with 4 cores (2.5 GHz), and 16 GB of RAM. After trying some workarounds (tensorflow/tensorflow#51354), I think the issue comes from the batch size, which seems to be set to 3 (if I understand correctly), but I couldn't find any way to simply lower it. Where could I modify it?
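For context, this is roughly what I tried based on that TensorFlow thread; the memory-growth part is the standard workaround discussed there, and the batch_size variable at the end is only a placeholder for wherever your package sets it, since I could not find the real parameter:

import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all up front
# (the workaround discussed in tensorflow/tensorflow#51354).
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

# Placeholder: if the batch size were exposed as a parameter, I would
# simply lower it from 3 to 1 here; I could not find where it is set.
batch_size = 1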

  3. I am very interested in trying your package with TCRs that were sequenced with the 10x Genomics VDJ solution. I read your pre-print thoroughly, but I am very new to deep learning. If I wanted to run specific sequences from my own TCR runs against selected epitopes, I could directly use your model, right? And as I understand it, for those runs I would just need the TCR chain sequences, but I would also need the sequences and AlphaFold2 results of my epitopes of interest?

Thank you very much in advance for your help
