-
Notifications
You must be signed in to change notification settings - Fork 0
Feature/hd 4220 last block size #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development/1.0
Are you sure you want to change the base?
Conversation
Hello fferrandis,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
159884f to
6c8187c
Compare
- separate C implementations in dedicated files - improve encoding and decoding functionality - add new tests - improvements to the encoding and decoding processes
6c8187c to
181e135
Compare
- enable new buffer style - rework all tests to only use *Matrix
041a7c1 to
12234f8
Compare
4bbbda1 to
44ad647
Compare
44ad647 to
fd5d424
Compare
Store in EncodeData structure the realsize With old format, it contains the FragLen itself With new format, this is sticked to the real data. In DecodeMatrix, we also manage the last stripe size to ensure we read the header.ChunkSize value instead of reading piecesize bytes
fd5d424 to
5decc2c
Compare
| @@ -0,0 +1,408 @@ | |||
| #include <liberasurecode/erasurecode.h> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only a move from code that was directly inline in GO file to a proper dedicated C file.
| max: params.MaxBlockSize, | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we dont use jerasure, this is removed. (this new feature can't work with jerasure anyway)
| wg.Add(int(ctx.number_of_subgroup)) | ||
|
|
||
| for i := 0; i < int(ctx.number_of_subgroup); i++ { | ||
| go func(nth int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the main diff of the encode part. We make a distinction when it comes to the last lines/stripes. All the stripes are supposed to be "chunksize" long, but not the last one because of the new implementation
| C.my_liberasurecode_encode_buffermatrix_cleanup( | ||
| backend.libecDesc, C.size_t(ctx.frags_len), ctx.datas, ctx.codings) | ||
| }}, | ||
| }, totLen}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we store the real len of the slice. We could also reslice it to this len, but it was easier from the debug point of view to keep the slice with the same slice regardless of the usage of the new format, and store the real len alongside.
| }}, nil | ||
| } | ||
|
|
||
| // EncodeMatrix encodes data in small subpart of chunkSize bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EncodeMatrix isn't used in hdcontroller. So all the tests using it has been updated to replace the usage of EncodeMatrix by EncodeMatrixWithBufferMatrix
|
|
||
| var totLen int64 | ||
| for i := 0; i < chunkInfo.NrChunk; i++ { | ||
| vect := make([][]byte, len(frags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when decoding stripes, we can't assume that all stripes are "chunksized long". When using the new format, the last stripe of a fragment is often smaller (that is the prupose of this PR and this new format). The nextbound is computed accordingly.
| * .. [-[*]- -] | ||
| * p4 [-[*]- -] | ||
| * | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is mainly variable renaming to help to understand what we are manipulating in the matrix. That why we have lines, celles and columns.
| totalLines := (maxDataSize + lineSize - 1) / lineSize | ||
| isLastStripe := (lineEnd == totalLines-1) | ||
|
|
||
| /* When wrapping around, we read the full groups. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the main change of GetRangeMatrix : we are going to read a full group (ie a full line of stripes) in case of two condition :
- first one already existed
- the second one is new and is true when data is in the last stripe line.
As we cannot guess how the data is stored (using the old format or the new one) we chose to read all the stripes and then, based on the stripes header, get properly the data
fferrandis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments to help the review
This pull request introduces significant improvements and refactoring to the buffer matrix management for erasure coding, focusing on more efficient handling of data fragments, especially for cases where the last subgroup (stripe) is not a full block. The changes include a new buffer format, more flexible offset calculations, improved test coverage, and a new C header for backend interoperability.
Key changes:
Buffer Matrix Refactoring and New Format Support
Introduced a new
BufferInfostruct to encapsulate buffer metadata, separated from the mainBufferMatrixdata structure, and refactored related logic for clarity and maintainability. Added methods to support both old and new buffer layouts, with explicitUseNewFormatandUseOldFormatsetters and corresponding offset calculation logic. (buffer.go[1] [2] [3] [4] [5] [6]Implemented logic to handle the last subgroup differently when the data size is not a multiple of the block size, ensuring correct data alignment and minimizing wasted space in the last fragments. (
buffer.go[1] [2] [3]Expanded and Improved Testing
buffer_test.go[1] [2] [3]Backend Interoperability
backend.hdefining theencode_chunk_contextstructure and a suite of utility functions to facilitate interaction between Go and C code for erasure coding backends. (backend.hbackend.hR1-R70)Minor Improvements
// nolint:goseccomment to suppress a linter warning for pointer arithmetic inmemalign.go. (memalign.gomemalign.goR9)These changes collectively modernize the buffer management for erasure coding, improve performance for non-aligned data sizes, and lay groundwork for further backend integration.
References: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]