Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shut_down() the service if TLS negotiation fails #5134

Merged

Conversation

ksmurchison
Copy link
Contributor

sending a cleartext response to the client on failure is nonsense

@ksmurchison
Copy link
Contributor Author

Prometheus.connection_setup_failure_imapd fails.
@elliefm You might know how to fix this better than I do

@elliefm
Copy link
Contributor

elliefm commented Nov 17, 2024

So that test connects to a secure IMAP port but then tries to speak plain text to it (which will look like a TLS negotiation failure), with the intention of creating "bad connection" situations, and it then examines the prometheus stats to find out whether we counted them correctly. Looks like most of the test still works, but the numbers are wrong at the end.

Up until now it's expected clean shutdowns, because that's how imapd used to handle this situation, but these aren't clean shutdowns anymore, so the count doesn't match. There's a comment from me expecting this to change to "setup failures" in the future. I guess we'll want to modify/replace this block and comments near the end with a similar one that counts and verifies the new behaviour instead:

    # should be $badconn shutdowns counted (imapd treats this condition
    # as an ok shutdown, not an error)
    $self->assert_num_equals(
        $badconns,
        $shutdown->{"$service_label,status=\"ok\""}->{value}
    );

    # XXX someday: expect to find $badconns setup failures counted

I guess the trick will be figuring out what prom stat gets counted when TLS negotiation fails, and update the test to check that stat instead of shutdown/ok, expecting to find $badconns of them. It might actually still be "cyrus_imap_shutdown_total" just with status="error" instead of status="ok", or something like that.

I don't remember exactly I meant by "setup failure" -- I don't see anything like it in promdata.p, but maybe I meant cyrus_master_ready_fails_total which is counted directly by master.c. That stat would've been zero before, but now I think it might match the error shutdowns count. I don't think you need to worry about it for this PR, but it's there to look at if you want to go deep.

@elliefm
Copy link
Contributor

elliefm commented Nov 18, 2024

I actually get a different error when I run it locally vs when CI runs it, curious. When I run it locally, it chokes out earlier on the connection being dropped

@elliefm
Copy link
Contributor

elliefm commented Nov 18, 2024

This works for me locally:

diff --git a/cassandane/Cassandane/Cyrus/Prometheus.pm b/cassandane/Cassandane/Cyrus/Prometheus.pm
index 8fde326b6..8c07a1517 100644
--- a/cassandane/Cassandane/Cyrus/Prometheus.pm
+++ b/cassandane/Cassandane/Cyrus/Prometheus.pm
@@ -374,7 +374,7 @@ sub test_connection_setup_failure_imapd
             $store->get_client();
         };
         my $error = $@;
-        $self->assert_matches(qr{Connection closed by other end}, $error);
+        $self->assert_not_null($error);
     }
 
     # wait a bit for the prometheus report to refresh
@@ -412,11 +412,10 @@ sub test_connection_setup_failure_imapd
     # should not have had any successful connections to imaps
     $self->assert(not exists $total->{$service_label});
 
-    # should be $badconn shutdowns counted (imapd treats this condition
-    # as an ok shutdown, not an error)
+    # should be $badconns error shutdowns counted
     $self->assert_num_equals(
         $badconns,
-        $shutdown->{"$service_label,status=\"ok\""}->{value}
+        $shutdown->{"$service_label,status=\"error\""}->{value}
     );
 
     # XXX someday: expect to find $badconns setup failures counted

sending a cleartext response to the client on failure is nonsense
@ksmurchison ksmurchison merged commit 7d92853 into cyrusimap:master Nov 20, 2024
1 check passed
@ksmurchison ksmurchison deleted the cyr-1400-shutdown-on-failed-tls branch November 21, 2024 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants