Fix MQTT test flake (backport #12687) #12690

Merged 1 commit on Nov 8, 2024


@mergify mergify bot commented Nov 8, 2024

This is an automatic backport of pull request #12687 done by [Mergify](https://mergify.com).

Prior to this commit, the test
```
make -C deps/rabbitmq_mqtt ct-mqtt_shared t=[mqtt,cluster_size_1,v4]:non_clean_sess_reconnect_qos0_and_qos1
```

flaked in CI with the error:
```
{mqtt_shared_SUITE,non_clean_sess_reconnect_qos0_and_qos1,972}
{badmatch,{publish_not_received,<<"msg-0">>}}
```

The cause was the following race condition:
* The MQTT v4 client sends an asynchronous DISCONNECT.
* The global MQTT consumer metric is decremented, but the classic queue
  still has the old MQTT connection process registered as a consumer.
* The test case sends a message.
* The classic queue checks out the message to the old connection instead
  of the new one.
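
The ordering problem in the list above can be illustrated with a toy model. This is a sketch only, not RabbitMQ code: `ToyQueue`, `run`, `global_consumer_metric`, and the connection names are hypothetical stand-ins, and the interleaving is fixed by hand instead of being left to a scheduler. In this model a message handed to the stale consumer is simply never seen by the new one.

```python
class ToyQueue:
    """Hypothetical stand-in for a queue that delivers to its first registered consumer."""

    def __init__(self):
        self.consumers = []   # connection names, in registration order
        self.delivered = {}   # connection name -> messages checked out to it
        self.backlog = []     # messages waiting for a consumer

    def register(self, conn):
        self.consumers.append(conn)
        self.delivered.setdefault(conn, [])
        # A newly registered consumer receives whatever is still queued.
        self.delivered[conn].extend(self.backlog)
        self.backlog.clear()

    def deregister(self, conn):
        self.consumers.remove(conn)

    def publish(self, msg):
        if self.consumers:
            self.delivered[self.consumers[0]].append(msg)
        else:
            self.backlog.append(msg)


def run(wait_for_queue):
    """Return what the new connection receives in this toy scenario."""
    queue = ToyQueue()
    queue.register("old-conn")
    global_consumer_metric = 1

    # Asynchronous DISCONNECT: the global metric drops immediately.
    global_consumer_metric -= 1
    if wait_for_queue:
        # The test waits until the queue itself has dropped the consumer.
        queue.deregister("old-conn")

    # Looking only at the global metric, it seems safe to publish.
    assert global_consumer_metric == 0
    queue.publish("msg-0")

    if not wait_for_queue:
        # The queue catches up with the disconnect too late.
        queue.deregister("old-conn")

    queue.register("new-conn")
    return queue.delivered["new-conn"]
```

With `wait_for_queue=False` the new connection receives nothing, which mirrors the `publish_not_received` failure; with `wait_for_queue=True` it receives `msg-0`.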

The fix in this commit is to check the classic queue's own consumer
count after the disconnect, before the test sends the message.
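
One common way to express such a check in a test is to poll a condition until it holds or a timeout expires. The helper below is a generic sketch under that assumption; `await_condition` and the `queue_consumer_count` name in the usage note are hypothetical, and the description does not say whether the actual commit polls or checks once.

```python
import time


def await_condition(predicate, timeout_s=5.0, interval_s=0.05):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns True if the condition held in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A test could then call something like `await_condition(lambda: queue_consumer_count() == 0)` after the disconnect and publish only once it returns `True`, where `queue_consumer_count` is whatever the suite uses to read the queue's consumer count.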

(cherry picked from commit 40bf778)
@mergify mergify bot assigned ansd Nov 8, 2024
@ansd ansd merged commit 24cbca1 into v4.0.x Nov 8, 2024
195 checks passed
@ansd ansd deleted the mergify/bp/v4.0.x/pr-12687 branch November 8, 2024 10:45