import spacy
import pandas as pd
from multiprocessing.dummy import Pool as ThreadPool

# Ten identical inputs, one per worker thread.
data = ['string'] * 10

# Load one pipeline per language.
langs = [
    ("English", spacy.load("en_core_web_md")),
    ("Chinese", spacy.load("zh_core_web_md")),
    ("Japanese", spacy.load("ja_core_news_md"))
]

for (lang, model) in langs:
    print(lang)
    # Call the same pipeline object from 10 threads at once.
    pool = ThreadPool(10)
    pool.map(model, data)
    pool.close()
    pool.join()
Output:
English
[string, string, string, string, string, string, string, string, string, string]
Chinese
[string, string, string, string, string, string, string, string, string, string]
Japanese
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/home/jbartlett/.pyenv/versions/3.11.11/lib/python3.11/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jbartlett/.pyenv/versions/3.11.11/lib/python3.11/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/jbartlett/.pyenv/versions/3.11.11/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/jbartlett/.pyenv/versions/3.11.11/lib/python3.11/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/home/jbartlett/data-airflow/workspace/ml-services/slack-ml/.venv/lib/python3.11/site-packages/spacy/language.py", line 1037, in __call__
    doc = self._ensure_doc(text)
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jbartlett/data-airflow/workspace/ml-services/slack-ml/.venv/lib/python3.11/site-packages/spacy/language.py", line 1128, in _ensure_doc
    return self.make_doc(doc_like)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jbartlett/data-airflow/workspace/ml-services/slack-ml/.venv/lib/python3.11/site-packages/spacy/language.py", line 1120, in make_doc
    return self.tokenizer(text)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/jbartlett/data-airflow/workspace/ml-services/slack-ml/.venv/lib/python3.11/site-packages/spacy/lang/ja/__init__.py", line 56, in __call__
    sudachipy_tokens = self.tokenizer.tokenize(text)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Already borrowed
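The Japanese model (ja_core_news_md) raises RuntimeError: Already borrowed when receiving multi-threaded requests, while the English and Chinese models handle the same calls without error.

How to reproduce the behaviour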
python -m spacy download en_core_web_md
python -m spacy download zh_core_web_md
python -m spacy download ja_core_news_md
python3
(open the shell and paste in the script shown at the top of this report)
Info about spaCy
python -m spacy info --markdown
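
A possible workaround, not verified against spaCy or SudachiPy: since the traceback ends inside the Japanese tokenizer call, one option is to make sure only one thread enters the pipeline at a time by guarding it with a lock. The sketch below shows the pattern without needing spaCy installed. `Serialized` and `BorrowCheckedTokenizer` are hypothetical names invented for this illustration; the latter is only a stand-in that raises the same error message when two threads enter it at once, and says nothing about how SudachiPy works internally.

```python
import threading
import time
from multiprocessing.dummy import Pool as ThreadPool


class BorrowCheckedTokenizer:
    """Hypothetical stand-in for a callable that is not thread-safe.

    It raises RuntimeError if a second thread enters while a first
    thread is still inside the call.
    """

    def __init__(self):
        self._busy = threading.Lock()

    def __call__(self, text):
        # A non-blocking acquire fails when another thread is already inside.
        if not self._busy.acquire(blocking=False):
            raise RuntimeError("Already borrowed")
        try:
            time.sleep(0.01)  # simulate some work so calls can overlap
            return text.split()
        finally:
            self._busy.release()


class Serialized:
    """Wrap any callable so that only one thread runs it at a time."""

    def __init__(self, func):
        self._func = func
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Other threads block here until the current call has finished.
        with self._lock:
            return self._func(*args, **kwargs)


data = ["a b c"] * 10
safe = Serialized(BorrowCheckedTokenizer())

# Same thread-pool shape as in the report, but every call goes through the lock.
with ThreadPool(10) as pool:
    results = pool.map(safe, data)

print(results)
```

Applied to the script in this report, the equivalent would be wrapping the Japanese pipeline, e.g. `pool.map(Serialized(model), data)`. This assumes the error comes only from concurrent calls into the same pipeline object. It also gives up any parallelism for that pipeline, so it is a stopgap rather than a fix.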