Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Best way to integrate with django migration system #280

Closed
agateblue opened this issue Jul 18, 2020 · 16 comments · Fixed by #283
Closed

Best way to integrate with django migration system #280

agateblue opened this issue Jul 18, 2020 · 16 comments · Fixed by #283
Labels
Issue appropriate for: Occasional contributors 😉 This issue will be best tackled by people with a minimum of experience on the project Issue contains: Some Python 🐍 This issue involves writing some Python code Issue contains: Some SQL 🐘 This features require changing the SQL model

Comments

@agateblue
Copy link
Contributor

This isn't a bug or feature request but rather a discussion to understand the best way to integrate procrastinate within a Django project, in particular, at the SQL level, as Django as its own migration system.

My first attempt was to use raw SQL and execute each sql/migrations/delta* delta file, like this:

# Generated by Django 3.0.8 on 2020-07-18 08:39
import os
from django.db import connection
from django.db import migrations
from django.db import transaction


def apply_sql(apps, schema_editor):
    import procrastinate
    sql_path = os.path.join(
        os.path.dirname(procrastinate.__file__),
        'sql',
        'migrations',
    )
     
    migrations = [
        "baseline-0.5.0.sql",
        "delta_0.5.0_001_drop_started_at_column.sql",
        "delta_0.5.0_002_drop_started_at_column.sql",
        "delta_0.5.0_003_drop_procrastinate_version_table.sql",
        "delta_0.6.0_001_fix_procrastinate_fetch_job.sql",
        "delta_0.7.1_001_fix_trigger_status_events_insert.sql",
        "delta_0.8.1_001_add_queueing_lock_column.sql",
        "delta_0.10.0_001_close_fetch_job_race_condition.sql",
        "delta_0.10.0_002_add_defer_job_function.sql",
        "delta_0.11.0_003_add_procrastinate_periodic_defers.sql",
        "delta_0.12.0_001_add_foreign_key_index.sql",
    ]

    with connection.cursor() as cursor:
        with transaction.atomic():
            for name in migrations:
                full_path = os.path.join(sql_path, name)
                with open(full_path, 'rb') as f:
                    print('Applying {}'.format(full_path))
                    sql = f.read().decode()
                    cursor.execute(sql)


def rewind(*args, **kwargs):
    pass


class Migration(migrations.Migration):

    dependencies = [
        ('common', '0008_auto_20200701_1317'),
    ]

    operations = [
        migrations.RunPython(apply_sql, rewind)
    ]

However, this crashes, with the following output:

python manage.py migrate common 0009

Operations to perform:
  Target specific migration: 0009_procrastinate, from common
Running migrations:
  Applying common.0009_procrastinate...Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/baseline-0.5.0.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.5.0_001_drop_started_at_column.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.5.0_002_drop_started_at_column.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.5.0_003_drop_procrastinate_version_table.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.6.0_001_fix_procrastinate_fetch_job.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.7.1_001_fix_trigger_status_events_insert.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.8.1_001_add_queueing_lock_column.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.10.0_001_close_fetch_job_race_condition.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.10.0_002_add_defer_job_function.sql
Applying /venv/lib/python3.7/site-packages/procrastinate/sql/migrations/delta_0.11.0_003_add_procrastinate_periodic_defers.sql
Traceback (most recent call last):
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql)
psycopg2.errors.SyntaxError: syntax error at or near ";"
LINE 9: DROP FUNCTION IF EXISTS procrastinate_defer_job;
                                                       ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "manage.py", line 27, in <module>
    execute_from_command_line(sys.argv)
  File "/venv/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/venv/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/venv/lib/python3.7/site-packages/django/core/management/base.py", line 328, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/venv/lib/python3.7/site-packages/django/core/management/base.py", line 369, in execute
    output = self.handle(*args, **options)
  File "/venv/lib/python3.7/site-packages/django/core/management/base.py", line 83, in wrapped
    res = handle_func(*args, **kwargs)
  File "/venv/lib/python3.7/site-packages/django/core/management/commands/migrate.py", line 233, in handle
    fake_initial=fake_initial,
  File "/venv/lib/python3.7/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/venv/lib/python3.7/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/venv/lib/python3.7/site-packages/django/db/migrations/executor.py", line 245, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/venv/lib/python3.7/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/venv/lib/python3.7/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/funkwhale_api/common/migrations/0009_procrastinate.py", line 37, in apply_sql
    cursor.execute(sql)
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 100, in execute
    return super().execute(sql, params)
  File "/venv/lib/python3.7/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/venv/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/venv/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql)
django.db.utils.ProgrammingError: syntax error at or near ";"
LINE 9: DROP FUNCTION IF EXISTS procrastinate_defer_job;

Execution of delta_0.11.0_003_add_procrastinate_periodic_defers.sql fails with a syntax error, which I don't understand unfortunately. If you have some ideas, I can definitely try these, and possibly document Django integration when I have a working setup :)

@ewjoachim
Copy link
Member

Hm, you're right, if we expect people using Django to use procrastinate, we'll need to ship a way to transform PG migrations SQL files into Django migrations. Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration). But that's a detail, and doesn't help your problem.

Can I ask why you used a RunPython and not a RunSQL ?

I can try to have a look, but I don't have a clue at this point why this fails. Or you can have a look on your side. Or we can do this in pair programming if you want, like in the good ol' days :)

I'm perfectly fine adding a Django submodule for simplifying the usage of procrastinate with django, if it's deemed preferable to a django-procrastinate dedicated package.

@agateblue
Copy link
Contributor Author

Hm, you're right, if we expect people using Django to use procrastinate, we'll need to ship a way to transform PG migrations SQL files into Django migrations. Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration). But that's a detail, and doesn't help your problem.

I'm not sure about the "1 procrastinate migration <-> 1 django migration" approach. As far as I can tell, it doesn't bring any additional value and make integration with the app way harder (a dozen files to add, instead of only one).

Can I ask why you used a RunPython and not a RunSQL ?

I can try that, I wanted to avoid file I/O at import time, which RunSQL would require.

IMHO, the easiest option would be as you suggested, have a contrib.django package which would be a django app, and contain the proper migrations. Integrating procrastinate in a django project would then only require INSTALLED_APPS = [..., "procrastinate.contrib.django", ...] and python manage.py migrate.

However, it would increase the maintenance burden on procrastinate side, possibly by having duplicated SQL code (though it could probably be automated).

@agateblue
Copy link
Contributor Author

If I use RunSQL, I end up with the same error. This is the code I used:

# Generated by Django 3.0.8 on 2020-07-18 08:39
import os
from django.db import migrations


def get_sql(name):
    import procrastinate
    sql_path = os.path.join(
        os.path.dirname(procrastinate.__file__),
        'sql',
        'migrations',
    )
     
    full_path = os.path.join(sql_path, name)
    with open(full_path, 'rb') as f:
        return f.read().decode()


class Migration(migrations.Migration):

    dependencies = [
        ('common', '0008_auto_20200701_1317'),
    ]

    operations = [
        migrations.RunSQL(get_sql("baseline-0.5.0.sql")),
        migrations.RunSQL(get_sql("delta_0.5.0_001_drop_started_at_column.sql")),
        migrations.RunSQL(get_sql("delta_0.5.0_002_drop_started_at_column.sql")),
        migrations.RunSQL(get_sql("delta_0.5.0_003_drop_procrastinate_version_table.sql")),
        migrations.RunSQL(get_sql("delta_0.6.0_001_fix_procrastinate_fetch_job.sql")),
        migrations.RunSQL(get_sql("delta_0.7.1_001_fix_trigger_status_events_insert.sql")),
        migrations.RunSQL(get_sql("delta_0.8.1_001_add_queueing_lock_column.sql")),
        migrations.RunSQL(get_sql("delta_0.10.0_001_close_fetch_job_race_condition.sql")),
        migrations.RunSQL(get_sql("delta_0.10.0_002_add_defer_job_function.sql")),
        migrations.RunSQL(get_sql("delta_0.11.0_003_add_procrastinate_periodic_defers.sql")),
        migrations.RunSQL(get_sql("delta_0.12.0_001_add_foreign_key_index.sql")),
    ]

Maybe it's related to my postgresql installation, though it's a pretty basic postgres:11 docker installation. I'll try piping the SQL files directly to postres.

@agateblue
Copy link
Contributor Author

So, piping the migrations directly to postgresql raise the same bug:

docker-compose -f dev.yml exec -T postgres psql -U postgres -P < ../../procrastinate/procrastinate/sql/migrations/delta_0.11.0_003_add_procrastinate_periodic_defers.sql

CREATE TABLE
ERROR:  syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_job;
                                                       ^
CREATE FUNCTION
ERROR:  syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
                                                                ^
CREATE FUNCTION

The error is still there, but because I piped a whole file, it doesn't crash immediatly.

As far as I know, it's because the DROP FUNCTION statement requires you to include the arguments of the function, like this: DROP FUNCTION sqrt(integer);

It's probably failing in other setups but silently ignored.

@ewjoachim
Copy link
Member

Hm, nice find, thanks !

@ewjoachim
Copy link
Member

Hm, the doc says:

If the function name is unique in its schema, it can be referred to without an argument list:

DROP FUNCTION update_employee_salaries;

So, I still don't understand how this is a syntax error.

@ewjoachim
Copy link
Member

On my machine:

$ psql
psql (12.3, server 10.10 (Debian 10.10-1.pgdg90+1))
Type "help" for help.

procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_job;
DROP FUNCTION
procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_job;
NOTICE:  function procrastinate_defer_job() does not exist, skipping
DROP FUNCTION
procrastinate=# DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
DROP FUNCTION

@agateblue
Copy link
Contributor Author

agateblue commented Jul 18, 2020

I thought it may be related to an inconsistent state in my setup, so I'v tried with a clean postgres 9.6 container. Launch a container:

docker run -e POSTGRES_HOST_AUTH_METHOD=trust --name test-procrastinate postgres:9.6 postgres -c log_min_duration_statement=0

Apply the migrations:

docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/baseline-0.5.0.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_001_drop_started_at_column.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_002_drop_started_at_column.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.5.0_003_drop_procrastinate_version_table.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.6.0_001_fix_procrastinate_fetch_job.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.7.1_001_fix_trigger_status_events_insert.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.8.1_001_add_queueing_lock_column.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.10.0_001_close_fetch_job_race_condition.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.10.0_002_add_defer_job_function.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.11.0_003_add_procrastinate_periodic_defers.sql
docker exec -i test-procrastinate psql -Upostgres  < ../../procrastinate/procrastinate/sql/migrations/delta_0.12.0_001_add_foreign_key_index.sql

Which gives the following output:

CREATE FUNCTION
ALTER TABLE
DROP TABLE
CREATE FUNCTION
DROP TRIGGER
CREATE TRIGGER
ALTER TABLE
CREATE INDEX
DROP TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TABLE
ERROR:  syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_job;
                                                       ^
CREATE FUNCTION
ERROR:  syntax error at or near ";"
LINE 1: DROP FUNCTION IF EXISTS procrastinate_defer_periodic_job;
                                                                ^
CREATE FUNCTION
CREATE INDEX

Now, trying with a postgres 10 container:

NOTICE:  extension "plpgsql" already exists, skipping
CREATE EXTENSION
CREATE TABLE
CREATE TYPE
CREATE TYPE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE INDEX
CREATE TRIGGER
CREATE TRIGGER
CREATE TRIGGER
CREATE TRIGGER
CREATE INDEX
CREATE FUNCTION
ALTER TABLE
DROP TABLE
CREATE FUNCTION
DROP TRIGGER
CREATE TRIGGER
ALTER TABLE
CREATE INDEX
DROP TABLE
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TABLE
DROP FUNCTION
CREATE FUNCTION
DROP FUNCTION
NOTICE:  function procrastinate_defer_periodic_job() does not exist, skipping
CREATE FUNCTION
CREATE INDEX

So we can reasonnably say this is related to my Postgres version, and the behaviour change starting in Postgres 10.

Now, back to the initial discussion, regarding the opportunity to package procrastinate as a django app, would you accept a contribution based on #280 (comment)?

@ewjoachim
Copy link
Member

would you accept a contribution based on #280 (comment)?

Yes! Definitely! As I said:

I'm perfectly fine adding a Django submodule for simplifying the usage of procrastinate with django, if it's deemed preferable to a django-procrastinate dedicated package.

Also:

Your approach seems nice, but I'd tend to try and generate one Django migration per SQL migration file (whereas if I understand your code, you have all SQL migrations in a single Django migration).

And I stand by this if it's not too hard to do. This will make it much easier when we issue releases in the future: people will update procrastinate, then manage.py migrate and it should "just work".

@ewjoachim
Copy link
Member

ewjoachim commented Jul 18, 2020

Oh, also, we might try to explicit supported PG versions, and test them in the CI. In theory, this would be a good thing. In practice, I'm afraid testing will make the CI much much longer, and much more complicated. I'll create a ticket. (EDIT: #282)

@ewjoachim ewjoachim added Issue contains: Some SQL 🐘 This features require changing the SQL model Issue contains: Some Python 🐍 This issue involves writing some Python code Issue appropriate for: Occasional contributors 😉 This issue will be best tackled by people with a minimum of experience on the project Type: Bug labels Jul 18, 2020
@ewjoachim
Copy link
Member

If it's not possible to do that cleanly within Django (and reading the code, I'm doubting it is), it could also be acceptable to generate migration files in setup.py (so they'd end up in the release) and have them be gitignored.

@agateblue
Copy link
Contributor Author

I'd probably hardcode the migrations, because generating migrations on the fly during release comes with two potential pitfalls:

  1. It's really hard to test that the migrations are actually working if files aren't commited to the repo
  2. If the migration-generating code changes slightly, we could face hard-to-debug inconsistencies

Would it be acceptable to add a manual step when a new migration is created, basically copy-pasting 10 lines of code in the django app as a new migration?

@agateblue agateblue mentioned this issue Jul 18, 2020
4 tasks
@ewjoachim
Copy link
Member

I understand those points. I'll just note:

It's really hard to test that the migrations are actually working if files aren't committed to the repo

Hm, not sure about that. If migrations are generated every time we launch setup.py, then they will be OK in the CI, and then can be regenerated in the dev env by re-launching pip install -e, same as when a metadata value changes.

If the migration-generating code changes slightly, we could face hard-to-debug inconsistencies

With the current implementation, it's the same: if the SQL migration code changes, then the django migration will change. And I can't really see what we may change in the Django migration part if the only thing it does is a RunSQL. But your approach is leaning on the safe side, and I can't blame that.

That being said, I'm definitely not against your proposed changes. I'd rather we solve problems and iterate. We'll see how it goes.

@agateblue
Copy link
Contributor Author

Hm, not sure about that. If migrations are generated every time we launch setup.py, then they will be OK in the CI, and then can be regenerated in the dev env by re-launching pip install -e, same as when a metadata value changes.

Oh, I didn't thought about that. I'm not sure how to trigger a command when pip install -e . is called though, but I can definitely write something to generate the migration files once we figure that out!

@agateblue
Copy link
Contributor Author

@ewjoachim I've made an attempt at generating migrations on the fly (see my last commit) when running pip install -e .. I had to tweak the order, since the default order of SQL files isn't correct, and it work. However, it doesn't work with pip install . (without the -e flag), I need to find out why

@ewjoachim
Copy link
Member

Continuing the discussion in the PR !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue appropriate for: Occasional contributors 😉 This issue will be best tackled by people with a minimum of experience on the project Issue contains: Some Python 🐍 This issue involves writing some Python code Issue contains: Some SQL 🐘 This features require changing the SQL model
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants