Skip to content

Conversation

@MarcialRosales
Copy link
Contributor

Proposed Changes

Separate invalid client test from the valid one onto separate group and this way they use different SSL certs

Types of Changes

What types of changes does your code introduce to this project?
Put an x in the boxes that apply

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Build system and/or CI

@MarcialRosales MarcialRosales self-assigned this Jan 30, 2025
@MarcialRosales MarcialRosales requested a review from ansd January 31, 2025 08:32
Copy link
Member

@ansd ansd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI fails

@MarcialRosales
Copy link
Contributor Author

This fix was only for the flake on the mqtt auth_SUITE ... I cannot see any test suite related to mqtt or mqtt's auth_SUITE failing besides the dialyzer suite.
I have noticed though that the selenium test is failing. The change done on this PR is totally unrelated. I am looking at this selenium failure now locally. I have run it two times in CI and it is failing in the same test.

@MarcialRosales MarcialRosales force-pushed the fix-flake-on-mqtt-auth branch 4 times, most recently from 46ff092 to dab66e9 Compare February 7, 2025 07:12
@MarcialRosales MarcialRosales force-pushed the fix-flake-on-mqtt-auth branch 2 times, most recently from 74b3ee4 to d702ce8 Compare February 11, 2025 15:59
MarcialRosales added a commit that referenced this pull request Feb 12, 2025
@MarcialRosales MarcialRosales marked this pull request as ready for review February 12, 2025 10:18
@MarcialRosales MarcialRosales requested a review from ansd February 12, 2025 10:18
MqttClientId = <<"other_client_id">>,
{ok, C} = connect_ssl(MqttClientId, Config),
?assertMatch({error, _}, emqtt:connect(C)),
{error, {client_identifier_not_valid, _}} = emqtt:connect(C),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't fix the flake. I can repro the flake locally as follows:

    {error, {client_identifier_not_valid, _}} = emqtt:connect(C),
    timer:sleep(1),
    unlink(C).

which makes the test fail with:

=== === Reason: {'EXIT',{shutdown,client_identifier_not_valid}}

That's the flake which we also observed in CI.

The correct fix is to unlink first, then connect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i cannot reproduce it locally and i dont see CI failing because of this . I can modify the test as you suggest though.

Copy link
Contributor Author

@MarcialRosales MarcialRosales Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still dont understand what you are suggesting. I am expecting that when the user connects it fails because the client_id is not valid. That is exactly what it is doing line 594.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you send me the link to the failed job in CI where this test case is failing ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you send me the link to the failed job in CI where this test case is failing ?

Yes, here are two flakes failing due to {'EXIT',{shutdown,client_identifier_not_valid}}:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i cannot reproduce it locally and i dont see CI failing because of this .

The CT test case process runs concurrently with the MQTT connection process. CI can fail if the CT test case process executes unlink(C) after receiving the EXIT signal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still dont understand what you are suggesting.

I'm suggesting to flip the two lines, i.e. to unlink before the expected failure as done in many other places in this test suite, e.g. in:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those ci failures are because this change has not been merged yet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless i have committed that change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in this PR, with just my changes you dont see the flake. You will see the flake in other CI which do not have this fix.

@ansd ansd merged commit 2ab890f into main Feb 12, 2025
270 checks passed
@ansd ansd deleted the fix-flake-on-mqtt-auth branch February 12, 2025 16:15
mergify bot pushed a commit that referenced this pull request Feb 12, 2025
* Separate invalid client test from the valid one

* Apply same changes from pr #13197

* Deal with stalereferences caused by timing issues

looking up objects in the DOM

* Unlink before assertion

(cherry picked from commit 2ab890f)
ansd pushed a commit that referenced this pull request Feb 12, 2025
* Separate invalid client test from the valid one

* Apply same changes from pr #13197

* Deal with stalereferences caused by timing issues

looking up objects in the DOM

* Unlink before assertion

(cherry picked from commit 2ab890f)
mergify bot pushed a commit that referenced this pull request Feb 12, 2025
* Separate invalid client test from the valid one

* Apply same changes from pr #13197

* Deal with stalereferences caused by timing issues

looking up objects in the DOM

* Unlink before assertion

(cherry picked from commit 2ab890f)
(cherry picked from commit e84a516)

# Conflicts:
#	deps/rabbitmq_mqtt/test/auth_SUITE.erl
#	selenium/package.json
michaelklishin added a commit that referenced this pull request Feb 12, 2025
Fix flake on rabbitmq_mqtt auth_SUITE (backport #13180) (backport #13246)
mergify bot pushed a commit that referenced this pull request Feb 13, 2025
(cherry picked from commit bf7de92)

# Conflicts:
#	deps/rabbitmq_mqtt/test/auth_SUITE.erl
michaelklishin added a commit that referenced this pull request Feb 13, 2025
michaelklishin pushed a commit that referenced this pull request Mar 17, 2025
michaelklishin pushed a commit that referenced this pull request Mar 17, 2025
* Separate invalid client test from the valid one

* Apply same changes from pr #13197

* Deal with stalereferences caused by timing issues

looking up objects in the DOM

* Unlink before assertion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants