Skip to content

feat: TlsSocket AsyncWriteSome and AsyncReadSome #376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

kostasrim
Copy link
Collaborator

@kostasrim kostasrim commented Jan 31, 2025

Implement async io for TlsSocket.

  • add AsyncReadSome
  • add AsyncWriteSome
  • add tests
  • fix a bug in io::AsyncRead

Resolves #361 when ready.

@kostasrim
Copy link
Collaborator Author

Do not review, it's not ready

@@ -388,6 +388,7 @@ void EpollSocket::AsyncWriteSome(const iovec* v, uint32_t len, io::AsyncProgress
async_write_pending_ = 1;
}

// TODO implement async functionality
void EpollSocket::AsyncReadSome(const iovec* v, uint32_t len, io::AsyncProgressCb cb) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

epoll AsyncRead is synchronous

@@ -0,0 +1,34 @@
-----BEGIN CERTIFICATE-----
Copy link
Collaborator Author

@kostasrim kostasrim Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 50 years will be enough for this to last 🤓 (that's the --days I used to generate it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO I realized that we can generate the certificates with cmake as well. Not sure if it's worth it or not 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the better idea is to add a custom step for generating all keys and certificates on the cmake step.
You never know who or how can use these certificates and blame authors for that.

@codecov-commenter
Copy link

codecov-commenter commented Feb 7, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 91.07612% with 34 lines in your changes missing coverage. Please review.

Project coverage is 78.67%. Comparing base (d30de86) to head (e39ae8b).
Report is 169 commits behind head on master.

Files with missing lines Patch % Lines
util/tls/tls_socket.cc 81.50% 32 Missing ⚠️
util/tls/tls_socket_test.cc 99.01% 2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #376      +/-   ##
==========================================
+ Coverage   77.60%   78.67%   +1.06%     
==========================================
  Files         103      111       +8     
  Lines        7824     9825    +2001     
==========================================
+ Hits         6072     7730    +1658     
- Misses       1752     2095     +343     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@@ -50,6 +50,7 @@ struct AsyncReadState {

AsyncReadState(AsyncSource* source, const iovec* v, uint32_t length)
: arr(length), owner(source) {
cur = arr.data();
Copy link
Collaborator Author

@kostasrim kostasrim Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch -- we now got unit tests though 😄


// Main loop
void Run() override;
virtual void CompleteAsyncReq(io::Result<size_t> result) override;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

override always requires to override virtual method.
virtual keyword is optional in such cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I will fix. I also need to address a few things functionality/testing wise :)

@kostasrim kostasrim requested a review from Copilot April 7, 2025 14:34
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.

Files not reviewed (5)
  • io/io.cc: Language not supported
  • util/fibers/epoll_socket.cc: Language not supported
  • util/tls/tls_socket.cc: Language not supported
  • util/tls/tls_socket.h: Language not supported
  • util/tls/tls_socket_test.cc: Language not supported

@kostasrim kostasrim changed the title [do not review] feat: TlsSocket AsyncWriteSome and AsyncReadSome feat: TlsSocket AsyncWriteSome and AsyncReadSome Apr 7, 2025
@kostasrim kostasrim marked this pull request as ready for review April 7, 2025 14:36
iovec send_vec{.iov_base = &send_buf, .iov_len = payload_sz_};
iovec read_vec{.iov_base = &res, .iov_len = payload_sz_};

// We don't need to call ssl_renegotiate here, the first read will also negotiate the protocol
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first AsyncRead causes the read to write, so the AsyncWrite that follows can't progress and it subscribes itself to the pending list. When AsyncRead finishes the renegotiation, it wakes up the pending (the AsyncWrite) and it makes progress. We would have deadlocked otherwise. This is equivalent to Yielding when WRITE/READ in progress

@romange
Copy link
Owner

romange commented Apr 27, 2025

Can I review it @kostasrim ?

void CompleteAsyncReq(io::Result<size_t> result) override;
};

friend AsyncWriteReq;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it will work without friend decl


// TODO clean up the optional before we yield such that progress callback can dispatch another
// async operation
std::optional<AsyncWriteReq> async_write_req_;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of our use with tls is synchronous, so I suggest to make them at least unique_ptr (could even be plain pointers), to reduce the size of TlsSocket for most cases.

this will also allow to move Async state definitions to cc file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that makes sense. It's 130 bytes roughly extra per object

void TlsSocket::AsyncReadSome(const iovec* v, uint32_t len, io::AsyncProgressCb cb) {
io::Result<size_t> res = ReadSome(v, len);
cb(res);
CHECK(!async_read_req_.has_value());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not how I would design it.
TlsEngine could already have a ready read buffer. you could return quickly by fetching it without doing any async stuff. Only if the buffer is empty it's worth dispatching the upstream read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, isn't that what I do here? We don't start/submit any async operation. We construct
the object and call Run() and within this function we will read the EngineBuffer first if there are bytes available -- so we are on the fast path already, that is, we don't dispatch upstream.
As I am changing it now to a unique ptr, we now always get an allocation. I think the cost would be insignificant because the allocator will be able to satisfy that without actually requesting more memory and this is not a common path anyways.

}
}

void TlsSocket::AsyncWriteReq::Run() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have while (true) here but we return in most cases?
it's hard to understand this logic - at first I thought Run is fiber-blocking.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I am trying to understand - what is the path to the second loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no fiber blocks, everything is async and yes we need that. An example would be:

  1. We push to the engine
  2. We call HandleOpAsync because last_push.engine_opcode < 0 and we submit an async operation to the underline socket. Its completion handler will call Run() again.
  3. The fiber continues doing its work while the underline async operation is in flight
  4. The async operation completes, invokes the completion handler which calls Run() again. Our state now is MaybeSendOutputAsyncTag so we reach that branch but the condition last_push.written() > 0 is not true. Now we loop, push to the engine again and proceed.

return;
}
last_push = *push_res;
state = AsyncWriteReq::HandleOpAsyncTag;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like switch case without breaks would work here

void TlsSocket::AsyncReadReq::Run() {
DCHECK_GT(len, 0u);

while (true) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here as well, why do we need a loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to check this one

@romange
Copy link
Owner

romange commented Apr 27, 2025

In general this PR is fairly big and it is hard to reason about design choices at this point. And also do a proper code review.

I suggest to start smaller. Say, implement only the read path, while use synchronous stubs to handle Engine::NEED_WRITE states. (I would even expect that for simple hello worlds scenarios, NEED_WRITE state won't even happen.)

@kostasrim
Copy link
Collaborator Author

In general this PR is fairly big and it is hard to reason about design choices at this point. And also do a proper code review.

I suggest to start smaller. Say, implement only the read path, while use synchronous stubs to handle Engine::NEED_WRITE states. (I would even expect that for simple hello worlds scenarios, NEED_WRITE state won't even happen.)

I followed a similar design to what you did with async reads/writes for plain sockets. The difference is that the state machine for the tls stuff is a little bit more complicated and I find it harder to express those branches/flows async. For example, if we had coros, I would not need to do state tracking on some of the loops because once the completion handler for the inflight async operation completes, it would just call the coro handler which would resume from where it was suspended before the async operation started. Now it has to reach Run() first and drive down to its path. continuations would also help here, and I even took that route on a certain path (ofc not proper proper continuation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement tls_socket async write/read
4 participants