Discussion: Improve DirectIO Directory for Java 24/25 #14928
Currently the DirectIODirectory allocates direct byte buffers outside of the heap (direct IO needs that to work). It also needs to align them to the block size. The current code may also be wrong if mergeBufferSize is not a multiple of blockSize. This PR fixes that so the buffer is correctly aligned and has the correct length.
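To illustrate the length problem, here is a minimal sketch of rounding a requested buffer size up to the next multiple of the block size. The class and method names are hypothetical and not taken from this PR; the actual fix may compute the length differently.

```java
// Hypothetical helper, for illustration only (not the code in this PR).
final class BlockAlignment {
    private BlockAlignment() {}

    /** Rounds size up to the next multiple of blockSize. */
    static int roundUpToBlockSize(int size, int blockSize) {
        if (size < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and blockSize > 0");
        }
        // Compute in long so the intermediate sum cannot overflow an int.
        long blocks = ((long) size + blockSize - 1) / blockSize;
        // toIntExact throws ArithmeticException if the rounded size exceeds int range.
        return Math.toIntExact(blocks * blockSize);
    }
}
```

With a block size of 512, a requested size of 1000 becomes 1024, while a size that is already a multiple of the block size is left unchanged.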
With MemorySegments we can improve that: allocate an aligned segment and view it as a ByteBuffer with MemorySegment#asByteBuffer(). The resulting buffer is compatible with direct IO.
As IndexOutputs are only used by one thread, we can use a confined arena and allocate the buffer there.
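A minimal sketch of that idea, assuming the finalized java.lang.foreign API (Java 22 or later). The class name AlignedOutputBuffer is hypothetical; this is not the code in the PR, only the allocation pattern it describes.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.nio.ByteBuffer;

// Hypothetical holder for an output buffer, for illustration only.
final class AlignedOutputBuffer implements AutoCloseable {
    // Confined arena: only the creating thread may access or close it.
    private final Arena arena = Arena.ofConfined();
    private final MemorySegment segment;
    private final ByteBuffer buffer;

    AlignedOutputBuffer(long byteSize, long blockSize) {
        // The second argument is the byte alignment; it must be a power of two.
        this.segment = arena.allocate(byteSize, blockSize);
        // View of the native segment as a direct ByteBuffer.
        this.buffer = segment.asByteBuffer();
    }

    ByteBuffer buffer() {
        return buffer;
    }

    long address() {
        return segment.address();
    }

    @Override
    public void close() {
        // Frees the native memory deterministically; the buffer is unusable afterwards.
        arena.close();
    }
}
```

Because the arena is confined, the memory is released as soon as the output is closed, without waiting for the garbage collector.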
With IndexInputs it is more complicated. Theoretically they should also be used from only one thread (RandomAccessInputs too, as far as I remember), but unfortunately the buffer is allocated at clone time, and the cloning thread is not necessarily the thread that later uses the clone. The buffering code is a bit cryptic to me and I had no time to look into it closely. As in BufferedIndexInput, the buffer should be lazily initialized on the first real READ access (not on cloning, and not on the first seek after cloning). To implement this correctly we may need to refactor the buffer code a bit.
Therefore, in this mockup I use an AUTO arena (Arena.ofAuto()), which lets the garbage collector free the buffer. A shared arena is too expensive.
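For comparison, a sketch of the automatic-arena variant, again assuming Java 22 or later. The class name is hypothetical; the point is only that the buffer has no thread owner and no explicit close.

```java
import java.lang.foreign.Arena;
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class AutoArenaBuffers {
    private AutoArenaBuffers() {}

    /** Allocates an aligned direct buffer whose memory is freed by the garbage collector. */
    static ByteBuffer allocate(int byteSize, int blockSize) {
        // An automatic arena cannot be closed explicitly; segments allocated from it
        // may be accessed by any thread and stay valid while they are reachable.
        return Arena.ofAuto().allocate(byteSize, blockSize).asByteBuffer();
    }
}
```

This works no matter which thread clones and which thread reads, at the cost of non-deterministic deallocation.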
If you have an idea how to make the IndexInput use a lazy buffer like BufferedIndexInput without mixing everything up, tell me. The buffer should be confined and allocated only from the thread that actually uses the clone. An alternative is a pool of reusable buffers per thread (ThreadLocal). The JDK internally uses a ThreadLocal for such buffers when implementing Java's IO layer.
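One possible shape for the lazy variant, as a rough sketch only. LazyAlignedBuffer and its methods are hypothetical names, not existing Lucene classes, and the sketch leaves open who closes the arena of a clone, which a real implementation would have to answer.

```java
import java.lang.foreign.Arena;
import java.nio.ByteBuffer;

// Hypothetical holder that defers allocation until the first read.
final class LazyAlignedBuffer implements AutoCloseable {
    private final int byteSize;
    private final int blockSize;
    private Arena arena;       // created on first use, confined to the reading thread
    private ByteBuffer buffer; // null until the first call to get()

    LazyAlignedBuffer(int byteSize, int blockSize) {
        this.byteSize = byteSize;
        this.blockSize = blockSize;
    }

    /** Cloning copies only the sizes, so the cloning thread allocates nothing. */
    LazyAlignedBuffer cloneHolder() {
        return new LazyAlignedBuffer(byteSize, blockSize);
    }

    /** Called from the read path; the first caller becomes the owner thread. */
    ByteBuffer get() {
        if (buffer == null) {
            arena = Arena.ofConfined();
            buffer = arena.allocate(byteSize, blockSize).asByteBuffer();
        }
        return buffer;
    }

    boolean isAllocated() {
        return buffer != null;
    }

    @Override
    public void close() {
        // A confined arena must be closed by its owner thread.
        if (arena != null) {
            arena.close();
            arena = null;
            buffer = null;
        }
    }
}
```

Seeking would only record the target position; the call to get() would sit in the code path that actually refills the buffer.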