Skip to content

Rework TLS to OpenSSL-native style (memory BIOs, explicit status codes)#8002

Draft
Copilot wants to merge 8 commits into
mainfrom
copilot/rework-tls-to-match-openssl-style
Draft

Rework TLS to OpenSSL-native style (memory BIOs, explicit status codes)#8002
Copilot wants to merge 8 commits into
mainfrom
copilot/rework-tls-to-match-openssl-style

Conversation

Copilot AI commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Now that MbedTLS is gone from the tree, the TLS layer no longer needs to emulate its style. This reworks ccf::tls::Context and TLSSession to use OpenSSL idioms directly, removing the callback machinery and the "negative return" overloading. Net ~330 fewer lines.

Changes

  • In-memory BIOs replace BIO callbacks (context.h): set_bio() attaches plain BIO_s_mem read/write BIOs to the SSL. TLSSession feeds inbound ciphertext via recv() and drains outbound ciphertext via pending_write()/send() after each SSL operation. No more send_callback/recv_callback indirection.
  • Status separated from byte count (context.h, tls.h): handshake/read/write/close return 0 on success or an OpenSSL SSL_ERROR_* code (from SSL_get_error); bytes transferred are reported through an out-param. Removes the negative-return hack. Only TLS_ERR_X509_VERIFY remains as a CCF-specific sentinel to distinguish a handshake cert-verification failure (reported by OpenSSL as a generic SSL_ERROR_SSL) so callers can treat it as an auth failure.
  • Callers check status explicitly (tls_session.h): read/flush/do_handshake/close switch on SSL_ERROR_WANT_READ/WANT_WRITE/ZERO_RETURN/etc. instead of inspecting negated/overloaded returns. Outbound ciphertext is drained into pending_out and retried on ring-buffer backpressure. Deletes the obsolete handle_send/handle_recv/*_callback_openssl/write_some paths.
  • PlaintextServer (plaintext_server.h): updated to the new virtual signatures over its own pair of memory BIOs.
  • Unit test (tls/test/main.cpp): rewritten from a threaded socketpair + callback BIOs to a single-threaded BIO pump; all existing cases preserved.
  • Docs (tls/README.md, architecture/tls_internals.rst): describe the memory-BIO design and status-code contract; drop the MbedTLS-emulation and "future pure-OpenSSL" sections.

Read path, before/after

// Before: status smuggled through the return value (negated on error)
int n = ctx->read(buf, len);
if (n == TLS_ERR_WANT_READ) { ... }
else if (n < 0) { /* error */ }
else { /* n bytes */ }

// After: status and byte count are distinct
size_t readbytes = 0;
auto rc = ctx->read(buf, len, readbytes);
switch (rc)
{
  case 0: /* readbytes valid */ break;
  case SSL_ERROR_WANT_READ:
  case SSL_ERROR_WANT_WRITE: /* retry later */ break;
  case SSL_ERROR_ZERO_RETURN: /* peer closed */ break;
  default: /* error */ break;
}

Notes for reviewers

  • This is security-critical code; the cert-verification classification (peer_cert_ok() on SSL_ERROR_SSL) is the one place where OpenSSL's status is reinterpreted.
  • A full local build/tls_test run was not possible in the sandbox (requires OpenSSL 3.3 + Rust toolchain); logic was validated via syntax checks and standalone OpenSSL programs covering handshake, >16k records, close, and verify-failure. Please confirm tls_test is green in CI.

Copilot AI and others added 3 commits June 29, 2026 17:42
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor TLS implementation to align with OpenSSL style Rework TLS to OpenSSL-native style (memory BIOs, explicit status codes) Jun 29, 2026
Copilot AI requested a review from achamayou June 29, 2026 17:48
@achamayou

Copy link
Copy Markdown
Member

@copilot check previous CI runs, find out what failed and fix it.

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

@copilot check previous CI runs, find out what failed and fix it.

Investigated the failing runs (ACI SNP Genoa/Milan + VMSS Virtual C): all hung in programmability_and_jwt / test_jwt_without_key_policy. The node logs show the server generated the expected 400 but never delivered it; the client saw Server disconnected without sending a response. main is green on this test, so it was a regression from this PR.

Root cause: in the new memory-BIO design, TLSSession::read() only drained the write BIO before the read (via flush()). Ciphertext that the TLS layer produces during SSL_read (TLS 1.3 NewSessionTickets / post-handshake messages / key updates) was left stranded in the write BIO. The original BIO send-callback pushed those bytes to the host synchronously, so this is an asymmetry the rework introduced.

Fix (3a8... commit): drain outbound with flush_outbound() immediately after ctx->read(), restoring parity with the original behaviour. Please confirm tls_test and programmability_and_jwt are green in CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants