Skip to content
This repository has been archived by the owner on Jul 24, 2018. It is now read-only.

Database connection closed issues when using multiprocessing #20

Open
fmontei opened this issue Mar 30, 2018 · 0 comments
Open

Database connection closed issues when using multiprocessing #20

fmontei opened this issue Mar 30, 2018 · 0 comments

Comments

@fmontei
Copy link
Contributor

fmontei commented Mar 30, 2018

There is a bug in Deckhand with connection being closed when uwsgi workers > 1:

[SQL: 'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 73, in _connect_ping_listener
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     connection.scalar(select([1]))
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 880, in scalar
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     return self.execute(object, *multiparams, **params).scalar()
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     return meth(self, multiparams, params)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     return connection._execute_clauseelement(self, multiparams, params)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     compiled_sql, distilled_params
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     context)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     util.raise_from_cause(newraise, exc_info)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     reraise(type(exception), exception, tb=exc_tb, cause=cause)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     context)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines   File "/opt/stack/ucp/deckhand/.tox/functional/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines     cursor.execute(statement, parameters)
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (psycopg2.OperationalError) server closed the connection unexpectedly
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines  This probably means the server terminated abnormally
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines  before or while processing the request.
2018-03-30 18:19:12.897 21802 ERROR oslo_db.sqlalchemy.engines  [SQL: 'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)

The DB error from connection being reset has to do to with the fact that the DB connection pool is being shared across multiple processes (workers) when uwsgi workers > 1. There are ways to fix that but still don't know the fix without reading a bunch of long sqlalchemy documentation. And BTW, this issue would manifest itself with or without sqlalchemy. One way is to use a NullPool which doesn't pool any connection strings but I'm guessing has a slight impact on DB performance and the other is to use sqlalchemy event listeners and process PIDs. According to the docs anyway. (edited) The native OpenStack pattern is to only provide thread locking support, not multiprocess support.

More info:
http://docs.sqlalchemy.org/en/rel_1_0/faq/connections.html#how-do-i-use-engines-connections-sessions-with-python-multiprocessing-or-os-fork
http://docs.sqlalchemy.org/en/rel_1_0/core/pooling.html#using-connection-pools-with-multiprocessing

jenkins-attcomdev pushed a commit that referenced this issue Apr 2, 2018
This sets multiple threads in Deckhand's chart config (4)
and set workers to just 1.

Deckhand's database is not configured to work with multiprocessing.
Currently there is a data race on acquiring shared SQLAlchemy
engine pooled connection strings when workers > 1. As a
workaround, we use multiple threads but only 1 worker. For more
information, see:

#20

Change-Id: I60adeffff5461fdda957124232bc5a606baae413
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant