Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbreak Debian support with custom encoding #1564

Merged
merged 7 commits into from
Jan 10, 2025

Conversation

smortex
Copy link
Collaborator

@smortex smortex commented Jan 3, 2024

On Debian, version 10.0.0+ of the module cause a broken catalog due to dependecy cycles:

Anchor[postgresql::server::service::begin::main] =>
Postgresql_psql[Set template1 encoding to UTF8] =>
Postgresql::Server::Instance::Late_initdb[main] =>
Postgresql::Server::Instance::Initdb[main] =>
Class[Postgresql::Server::Initdb] =>
Postgresql_conf[port_for_instance_main] =>
Class[Postgresql::Server::Service] =>
Postgresql::Server::Instance::Service[main] =>
Anchor[postgresql::server::service::begin::main]

Reenable the test that trigger this issue (taken from #1547) and try to fix it.

@smortex smortex force-pushed the unbreak-debian-support-with-encoding branch from 37dfadf to ed8981e Compare January 4, 2024 00:57
@antaflos
Copy link
Contributor

We are successfully running this branch and it works as advertised. No more dependency cycles. 👍

@smortex
Copy link
Collaborator Author

smortex commented Jan 11, 2024

@antaflos Beware that this is unfortunately still WIP because it breaks EL for reasons I had not time to discover yet. I have been busy the last week with other $WORK related things, and am about to go on vacation for a week, so will not be able to have a look before I am back.

But thank your for your feedback 👍 : for now I only tested it with the CI and as you can see with d15ac61 there is some nasty bug because A -> B + A -> B -> C and A -> B + B -> C should not change much regarding dependencies.

@ekohl ekohl linked an issue Apr 5, 2024 that may be closed by this pull request
@smortex smortex force-pushed the unbreak-debian-support-with-encoding branch 5 times, most recently from 6414904 to b2284f1 Compare April 16, 2024 02:37
@@ -135,11 +135,7 @@ def matches(value)
end

autorequire(:anchor) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as an alternative, can we maybe autorequire the postgresql_conn_validator resource?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That quite make sense. I'll dig into this.

puts '-------------------------------'
puts LitmusHelper.instance.run_shell('systemctl status postgresql*').stdout
puts '-------------------------------'
expect(port(5432)).to be_listening.on('127.0.0.1').with('tcp')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem I had in the past: some systems report tcp, some tcp4. that was really ugly so I switched to matching the IP address. and that sometimes didn't work because some CI systems have IPv6 so it binds to ::1 but sometimes it's only 127.0.0.1. all ugly...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am doing more test because this is quite weird. Looking at the code it should work, looking at ss output it seems to be working, but it is not and that makes me feel angry. Similar code work in other areas of the project so I guess something is very wrong… I would like to nail it rather than ignoring it if possible.

Copy link
Collaborator

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, I think the initdb (which doesn't need a DB running) shouldn't include late_initdb (which does need a server running). Instead, postgresql::server_instance should be modified to split it.

The reason you run into this is that Debian has $needs_initdb set to false by default:

$needs_initdb = pick($needs_initdb, $postgresql::globals::manage_package_repo == true)

So this code path is triggered:

postgresql::server::instance::late_initdb { $name:
encoding => $encoding,
user => $user,
group => $group,
module_workdir => $module_workdir,
psql_path => $psql_path,
port => $port,
}

Wouldn't it make more sense to move it to postgresql::server_instance as:

if !$initdb_settings['needs_initdb'] and $initdb_settings['encoding'] {
  postgresql::server::instance::late_initdb { $instance_name:
    encoding       => $initdb_settings['encoding'],
    user           => $initdb_settings['user'],
    group          => $initdb_settings['group'],
    module_workdir => $initdb_settings['module_workdir'],
    psql_path      => $initdb_settings['psql_path'],
    port           => $initdb_settings['port'],
  }
}

It may break because those values may not be undef, but I hope the idea is clear.

["postgresql::server::service::begin::#{self[:instance]}"]
end

autorequire(:service) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand the logic. Don't you always need the server to be running to execute any SQL statements?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but the service doesn't seem to be available immediately after start. There's a custom resource that waits until the tcp port is open and I think we should depend on that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, that makes sense. Though with the newest versions (Fedora 38 with PostgreSQL 15) I see the systemd unit is Type=notify so I'd assume that there it will only really be ready once it's listening (spoiler: the patch sends READY=1 right after logging database system is ready to accept connections).

Digging into this, it was introduced in postgres/postgres@7d17e68 so with PostgreSQL 9.6 it became possible to support. Looking at EL8 that's built with Type=notify, but EL7 probably isn't. Sadly Debian 11 & 12 are also Type=forking. That means you're right and we still need to depend on the connection validator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, that makes sense. Though with the newest versions (Fedora 38 with PostgreSQL 15) I see the systemd unit is Type=notify so I'd assume that there it will only really be ready once it's listening (spoiler: the patch sends READY=1 right after logging database system is ready to accept connections).

Digging into this, it was introduced in postgres/postgres@7d17e68 so with PostgreSQL 9.6 it became possible to support. Looking at EL8 that's built with Type=notify, but EL7 probably isn't. Sadly Debian 11 & 12 are also Type=forking. That means you're right and we still need to depend on the connection validator.

These seem to be the relevant bugs about this:

@ShadNex
Copy link

ShadNex commented Jul 12, 2024

I've tested this PR as suggested by @bastelfreak and indeed it seems to solve the dependency issue for our environment as seen in #1541 👍 Would be great if this gets released soon!

@@ -42,6 +42,10 @@
status => $service_status,
}

Anchor["postgresql::server::service::begin::${name}"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may deeply regret asking when I do not have a full background on this but my understanding was Anchor was a redundant pattern as the contain function came in?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes anchors are from the past. Not sure if attempting to remove them in this PR will help however (and that would be a backwards incompatible change). But in the long term, having them gone is probably a good idea.

@ShadNex
Copy link

ShadNex commented Sep 11, 2024

Is there someting missing for this PR to get merged? Would be nice to have it in the next relase.

@bastelfreak
Copy link
Collaborator

@ShadNex yes, passing tests :)

@smortex can you rebase again please?

@smortex smortex force-pushed the unbreak-debian-support-with-encoding branch from 692f6e1 to 2d19e18 Compare October 17, 2024 21:01
@smortex smortex closed this Oct 17, 2024
@smortex smortex deleted the unbreak-debian-support-with-encoding branch October 17, 2024 21:04
@smortex smortex restored the unbreak-debian-support-with-encoding branch October 17, 2024 21:05
@smortex
Copy link
Collaborator Author

smortex commented Oct 17, 2024

Please bear with me while my fat fingers type crap on the keyboard 😒 .

@smortex smortex reopened this Oct 17, 2024
@smortex smortex force-pushed the unbreak-debian-support-with-encoding branch from 2893af5 to 38c9189 Compare October 18, 2024 00:12
@smortex
Copy link
Collaborator Author

smortex commented Oct 18, 2024

Please bear with me while my fat fingers type crap on the keyboard 😒 .

Thank you for your patience.

Rebasing done, I pushed each commit individually to have a step by step view of what works or not. I obviously lost track since January, and was (and still am) out of ideas so if you want to try something you are very welcome to do so. It seems that failures are test-order dependent…

@praj1001 praj1001 marked this pull request as ready for review November 19, 2024 12:08
@praj1001 praj1001 force-pushed the unbreak-debian-support-with-encoding branch from 38c9189 to 6c3748e Compare November 25, 2024 05:36
@malikparvez malikparvez force-pushed the unbreak-debian-support-with-encoding branch from 6c3748e to 8e184b9 Compare December 3, 2024 12:05
bastelfreak and others added 6 commits December 18, 2024 12:15
When using anchors, resources should be sandwiched those. An ordering
relation was missing for the "end" anchor.
Some resources where not properly ordered between the "begin" and "end"
anchors.  Make sure they are properly bound to them.
It does not really make sense to require the "begin" anchor, the
`postgresql_psql` type communicating with the server, it makes more
sense to require the "end" anchor which mean that the server is running
and reachable.
We already have a relationship in `manifests/server/instance/service.pp`
that ensure `Service["postgresqld_instance_${name}"]` is realized before
`Anchor["postgresql::server::service::end::${name}"]`.

Fun fact, removing this duplaciate fix the circular dependency reported
by puppet.
ss(8) report the same for a working and non-working port as far as
serverspec is concerned.  Maybe ordering has an impact?  We know these
tests fail when running at the end of the test suite, so attempt to run
them earlier.
@amitkarsale amitkarsale force-pushed the unbreak-debian-support-with-encoding branch from 8e184b9 to 256971d Compare December 18, 2024 06:45
@span786
Copy link
Contributor

span786 commented Jan 10, 2025

Failures seems unreleted to me, we can ge this merged.

@amitkarsale amitkarsale merged commit 3ea94c5 into main Jan 10, 2025
51 of 58 checks passed
@amitkarsale amitkarsale deleted the unbreak-debian-support-with-encoding branch January 10, 2025 12:58
@amitkarsale
Copy link
Contributor

Merged based on review comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

setting the encoding parameter causes a dependency cycle
10 participants