* lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST @ 2024-08-15 21:16 Jacob Keller 2024-08-16 23:37 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-15 21:16 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org Hi Vladimir! I am recently attempting to refactor some bespoke code for packing data in the ice networking driver into a bit packed hardware array, using the <linux/packing.h> and lib/packing.c infrastructure. My hardware packed buffer needs QUIRK_LITTLE_ENDIAN and QUIRK_LSW32_IS_FIRST, as the entire buffer is packed as little endian, and the lower words come first in the buffer. Everything was working ok in initial testing, until I tried packing a buffer that was 22 bytes long. Everything seemed to shift by 2 bytes: Here, I print the initial details such as offset, lsb, width, then the hex dump of the u64 value I'm trying to insert. I do the call to packing() with the appropriate quirks set, and then hex dump the 22-byte buffer after. > kernel: offset=0, size=8, lsb=0, width=57 > kernel: value: 60 9b fe 01 00 00 00 00 > kernel: 0x0000000001fe9b60 -> 000-056: fe 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > kernel: 0x0000000001fe9b60 -> 000-056: 00 00 00 00 00 00 It seems to have failed to copy the first two bytes in to the buffer. I discovered that if I pretend the buffer is 24 bytes (a multiple of 4 bytes, i.e. a word size), then everything works as expected: > kernel: offset=0, size=8, lsb=0, width=57 > kernel: value: 60 fc fe 01 00 00 00 00 > kernel: 0x0000000001fefc60 -> 000-056: 60 fc fe 01 00 00 00 00 00 00 00 00 00 00 00 00 > kernel: 0x0000000001fefc60 -> 000-056: 00 00 00 00 00 00 00 00 It seems like this is either a misunderstanding of a fundamental requirement of the packing() infrastructure, a bug in the QUIRK_LSW32_IS_FIRST, or I need a new quirk for the packing infrastructure? Essentially, I think the problem is that it uses the length of the buffer to find the word, but when the length is not a multiple of 2, the word offset picked is incorrect. Using a larger length than the size of the buffer "works" as long as I never use a bit offset thats larger than the *actual* buffer.. but that seems like a poor solution. Essentially, it seems like the default indexing for big endian is searching for each byte from the end of the array and then using the quirks to swap back to the inverse ending. I think the fix is probably just to do a round-up division on the len/4 check in get_reverse_lsw32_offset, since its calculating the total number of words in the length, and effectively rounds down for lengths that aren't multiples of 4. Does this seem like the correct analysis to you? Thanks, Jake ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-15 21:16 lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST Jacob Keller @ 2024-08-16 23:37 ` Jacob Keller 2024-08-18 13:29 ` Vladimir Oltean 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-16 23:37 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/15/2024 2:16 PM, Jacob Keller wrote: > Hi Vladimir! > > I am recently attempting to refactor some bespoke code for packing data > in the ice networking driver into a bit packed hardware array, using the > <linux/packing.h> and lib/packing.c infrastructure. > > My hardware packed buffer needs QUIRK_LITTLE_ENDIAN and > QUIRK_LSW32_IS_FIRST, as the entire buffer is packed as little endian, > and the lower words come first in the buffer. > > Everything was working ok in initial testing, until I tried packing a > buffer that was 22 bytes long. Everything seemed to shift by 2 bytes: > > Here, I print the initial details such as offset, lsb, width, then the > hex dump of the u64 value I'm trying to insert. > > I do the call to packing() with the appropriate quirks set, and then hex > dump the 22-byte buffer after. > >> kernel: offset=0, size=8, lsb=0, width=57 >> kernel: value: 60 9b fe 01 00 00 00 00 >> kernel: 0x0000000001fe9b60 -> 000-056: fe 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> kernel: 0x0000000001fe9b60 -> 000-056: 00 00 00 00 00 00 > > It seems to have failed to copy the first two bytes in to the buffer. > > I discovered that if I pretend the buffer is 24 bytes (a multiple of 4 > bytes, i.e. a word size), then everything works as expected: > >> kernel: offset=0, size=8, lsb=0, width=57 >> kernel: value: 60 fc fe 01 00 00 00 00 >> kernel: 0x0000000001fefc60 -> 000-056: 60 fc fe 01 00 00 00 00 00 00 00 00 00 00 00 00 >> kernel: 0x0000000001fefc60 -> 000-056: 00 00 00 00 00 00 00 00 > > It seems like this is either a misunderstanding of a fundamental > requirement of the packing() infrastructure, a bug in the > QUIRK_LSW32_IS_FIRST, or I need a new quirk for the packing infrastructure? > > Essentially, I think the problem is that it uses the length of the > buffer to find the word, but when the length is not a multiple of 2, the > word offset picked is incorrect. > > Using a larger length than the size of the buffer "works" as long as I > never use a bit offset thats larger than the *actual* buffer.. but that > seems like a poor solution. > > Essentially, it seems like the default indexing for big endian is > searching for each byte from the end of the array and then using the > quirks to swap back to the inverse ending. > > I think the fix is probably just to do a round-up division on the len/4 > check in get_reverse_lsw32_offset, since its calculating the total > number of words in the length, and effectively rounds down for lengths > that aren't multiples of 4. > This doesn't fix the problem. I printed out the logical box number and its mapped box address in the various modes with a length of 22 bytes: > kernel: default: 00 => 21 > kernel: default: 01 => 20 > kernel: default: 02 => 19 > kernel: default: 03 => 18 > kernel: default: 04 => 17 > kernel: default: 05 => 16 > kernel: default: 06 => 15 > kernel: default: 07 => 14 > kernel: default: 08 => 13 > kernel: default: 09 => 12 > kernel: default: 10 => 11 > kernel: default: 11 => 10 > kernel: default: 12 => 09 > kernel: default: 13 => 08 > kernel: default: 14 => 07 > kernel: default: 15 => 06 > kernel: default: 16 => 05 > kernel: default: 17 => 04 > kernel: default: 18 => 03 > kernel: default: 19 => 02 > kernel: default: 20 => 01 > kernel: default: 21 => 00 Here, you can see that it correctly maps in "big endian" order the entire array. Basically it just reverses the ordering. Ok, this seems reasonable. With LITTLE_ENDIAN set, things get weird: > kernel: LITTLE_ENDIAN: 00 => 22 > kernel: LITTLE_ENDIAN: 01 => 23 > kernel: LITTLE_ENDIAN: 02 => 16 > kernel: LITTLE_ENDIAN: 03 => 17 > kernel: LITTLE_ENDIAN: 04 => 18 > kernel: LITTLE_ENDIAN: 05 => 19 > kernel: LITTLE_ENDIAN: 06 => 12 > kernel: LITTLE_ENDIAN: 07 => 13 > kernel: LITTLE_ENDIAN: 08 => 14 > kernel: LITTLE_ENDIAN: 09 => 15 > kernel: LITTLE_ENDIAN: 10 => 08 > kernel: LITTLE_ENDIAN: 11 => 09 > kernel: LITTLE_ENDIAN: 12 => 10 > kernel: LITTLE_ENDIAN: 13 => 11 > kernel: LITTLE_ENDIAN: 14 => 04 > kernel: LITTLE_ENDIAN: 15 => 05 > kernel: LITTLE_ENDIAN: 16 => 06 > kernel: LITTLE_ENDIAN: 17 => 07 > kernel: LITTLE_ENDIAN: 18 => 00 > kernel: LITTLE_ENDIAN: 19 => 01 > kernel: LITTLE_ENDIAN: 20 => 02 > kernel: LITTLE_ENDIAN: 21 => 03 We map *outside* the buffer! 0 goes to 22, 1 goes to 23! Nothing at all maps into 20 or 21. This is extremely not-intuitive. In practice, without having read the docs I would have expected LITTLE_ENDIAN to directly map 0 to 0. After having read docs, (where it mentions that LITTLE_ENDIAN only does by word), I honestly have no idea what I would expect. If we try to just do 4 byte blocks at a time, we end up having to swap bytes from outside the length. With LSW32_IS_FIRST set, things are even weirder: > kernel: LSW32_IS_FIRST: 00 => -3 > kernel: LSW32_IS_FIRST: 01 => -4 > kernel: LSW32_IS_FIRST: 02 => 03 > kernel: LSW32_IS_FIRST: 03 => 02 > kernel: LSW32_IS_FIRST: 04 => 01 > kernel: LSW32_IS_FIRST: 05 => 00 > kernel: LSW32_IS_FIRST: 06 => 07 > kernel: LSW32_IS_FIRST: 07 => 06 > kernel: LSW32_IS_FIRST: 08 => 05 > kernel: LSW32_IS_FIRST: 09 => 04 > kernel: LSW32_IS_FIRST: 10 => 11 > kernel: LSW32_IS_FIRST: 11 => 10 > kernel: LSW32_IS_FIRST: 12 => 09 > kernel: LSW32_IS_FIRST: 13 => 08 > kernel: LSW32_IS_FIRST: 14 => 15 > kernel: LSW32_IS_FIRST: 15 => 14 > kernel: LSW32_IS_FIRST: 16 => 13 > kernel: LSW32_IS_FIRST: 17 => 12 > kernel: LSW32_IS_FIRST: 18 => 19 > kernel: LSW32_IS_FIRST: 19 => 18 > kernel: LSW32_IS_FIRST: 20 => 17 > kernel: LSW32_IS_FIRST: 21 => 16 We access bytes with negative indexes... And with both set: > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 00 => -2 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 01 => -1 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 02 => 00 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 03 => 01 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 04 => 02 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 05 => 03 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 06 => 04 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 07 => 05 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 08 => 06 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 09 => 07 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 10 => 08 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 11 => 09 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 12 => 10 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 13 => 11 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 14 => 12 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 15 => 13 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 16 => 14 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 17 => 15 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 18 => 16 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 19 => 17 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 20 => 18 > kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 21 => 19 We still get negative indexes, but things are close. Everything is off by 2 bytes from what I would expect. I'm honestly not sure what the right solution here is, because the way LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require* word-aligned sizes. If we use a word-aligned size, then they both make sense, but my hardware buffer isn't word aligned. I can cheat, and just make sure I never use bits that access the invalid parts of the buffer.. but that seems like the wrong solution... A larger size would break normal Big endian ordering without quirks... Really, what my hardware buffer wants is to map the lowest byte of the data to the lowest byte of the buffer. This is what i would consider traditionally little endian ordering. This also happens to be is equivalent to LSW32_IS_FIRST and LITTLE_ENDIAN when sizes are multiples of 4. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-16 23:37 ` Jacob Keller @ 2024-08-18 13:29 ` Vladimir Oltean 2024-08-19 18:45 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Vladimir Oltean @ 2024-08-18 13:29 UTC (permalink / raw) To: Jacob Keller; +Cc: netdev@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1479 bytes --] Hi Jake, On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote: > I'm honestly not sure what the right solution here is, because the way > LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require* > word-aligned sizes. If we use a word-aligned size, then they both make > sense, but my hardware buffer isn't word aligned. I can cheat, and just > make sure I never use bits that access the invalid parts of the buffer.. > but that seems like the wrong solution... A larger size would break > normal Big endian ordering without quirks... It is a use case that I would like to support. Thanks for having the patience to explain the issue to me. > Really, what my hardware buffer wants is to map the lowest byte of the > data to the lowest byte of the buffer. This is what i would consider > traditionally little endian ordering. > > This also happens to be is equivalent to LSW32_IS_FIRST and > LITTLE_ENDIAN when sizes are multiples of 4. Yes, "traditionally little endian" would indeed translate into QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems correct. I did need that further distinction between "little endian within a group of 4 bytes" and "little endian among groups of 4 bytes" because the NXP SJA1105 memory layout is weird like that, and is "little endian" in one way but not in another. Anyway.. I've attached 2 patches which hopefully make the API usable for your driver. I've tested them locally and did not notice issues. [-- Attachment #2: 0001-lib-packing-refuse-operating-on-bit-indices-which-ex.patch --] [-- Type: text/x-diff, Size: 1509 bytes --] From 8314f556f30e5e76973b1bcf1c552ff4a23621b9 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <vladimir.oltean@nxp.com> Date: Sun, 18 Aug 2024 15:50:56 +0300 Subject: [PATCH 1/2] lib: packing: refuse operating on bit indices which exceed size of buffer While reworking the implementation, it became apparent that this check does not exist. There is no functional issue yet, because at call sites, "startbit" and "endbit" are always hardcoded to correct values, and never come from the user. Even with the upcoming support of arbitrary buffer lengths, the "startbit >= 8 * pbuflen" check will remain correct. This is because we intend to always interpret the packed buffer in a way that avoids discontinuities in the available bit indices. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- lib/packing.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/packing.c b/lib/packing.c index 3f656167c17e..1cb3e0b2a24e 100644 --- a/lib/packing.c +++ b/lib/packing.c @@ -86,8 +86,10 @@ int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen, */ int plogical_first_u8, plogical_last_u8, box; - /* startbit is expected to be larger than endbit */ - if (startbit < endbit) + /* startbit is expected to be larger than endbit, and both are + * expected to be within the logically addressable range of the buffer. + */ + if (startbit < endbit || startbit >= 8 * pbuflen || endbit < 0) /* Invalid function call */ return -EINVAL; -- 2.34.1 [-- Attachment #3: 0002-lib-packing-adjust-definitions-and-implementation-fo.patch --] [-- Type: text/x-diff, Size: 9050 bytes --] From c37c61b4de3dcc4b7837445f0884f7c67e73f623 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <vladimir.oltean@nxp.com> Date: Sun, 18 Aug 2024 15:48:26 +0300 Subject: [PATCH 2/2] lib: packing: adjust definitions and implementation for arbitrary buffer lengths Jacob Keller has a use case for packing() in the intel/ice networking driver, but it cannot be used as-is. Simply put, the API quirks for LSW32_IS_FIRST and LITTLE_ENDIAN are naively implemented with the undocumented assumption that the buffer length must be a multiple of 4. All calculations of group offsets and offsets of bytes within groups assume that this is the case. But in the ice case, this does not hold true. For example, packing into a buffer of 22 bytes would yield wrong results, but pretending it was a 24 byte buffer would work. Rather than requiring such hacks, and leaving a big question mark when it comes to discontinuities in the accessible bit fields of such buffer, we should extend the packing API to support this use case. It turns out that we can keep the design in terms of groups of 4 bytes, but also make it work if the total length is not a multiple of 4. Just like before, imagine the buffer as a big number, and its most significant bytes (the ones that would make up to a multiple of 4) are missing. Thus, with a big endian (no quirks) interpretation of the buffer, those most significant bytes would be absent from the beginning of the buffer, and with a LSW32_IS_FIRST interpretation, they would be absent from the end of the buffer. The LITTLE_ENDIAN quirk, in the packing() API world, only affects byte ordering within groups of 4. Thus, it does not change which bytes are missing. Only the significance of the remaining bytes within the (smaller) group. No change intended for buffer sizes which are multiples of 4. Tested with the sja1105 driver and with downstream unit tests. Link: https://lore.kernel.org/netdev/a0338310-e66c-497c-bc1f-a597e50aa3ff@intel.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- Documentation/core-api/packing.rst | 71 ++++++++++++++++++++++++++++++ lib/packing.c | 70 +++++++++++++++++------------ 2 files changed, 114 insertions(+), 27 deletions(-) diff --git a/Documentation/core-api/packing.rst b/Documentation/core-api/packing.rst index 3ed13bc9a195..821691f23c54 100644 --- a/Documentation/core-api/packing.rst +++ b/Documentation/core-api/packing.rst @@ -151,6 +151,77 @@ the more significant 4-byte word. We always think of our offsets as if there were no quirk, and we translate them afterwards, before accessing the memory region. +Note on buffer lengths not multiple of 4 +---------------------------------------- + +To deal with memory layout quirks where groups of 4 bytes are laid out "little +endian" relative to each other, but "big endian" within the group itself, the +concept of groups of 4 bytes is intrinsic to the packing API (not to be +confused with the memory access, which is performed byte by byte, though). + +With buffer lengths not multiple of 4, this means one group will be incomplete. +Depending on the quirks, this may lead to discontinuities in the bit fields +accessible through the buffer. The packing API assumes discontinuities were not +the intention of the memory layout, so it avoids them by effectively logically +shortening the most significant group of 4 octets to the number of octets +actually available. + +Example with a 31 byte sized buffer given below. Physical buffer offsets are +implicit, and increase from left to right within a group, and from top to +bottom within a column. + +No quirks: + +:: + + 31 29 28 | Group 7 (most significant) + 27 26 25 24 | Group 6 + 23 22 21 20 | Group 5 + 19 18 17 16 | Group 4 + 15 14 13 12 | Group 3 + 11 10 9 8 | Group 2 + 7 6 5 4 | Group 1 + 3 2 1 0 | Group 0 (least significant) + +QUIRK_LSW32_IS_FIRST: + +:: + + 3 2 1 0 | Group 0 (least significant) + 7 6 5 4 | Group 1 + 11 10 9 8 | Group 2 + 15 14 13 12 | Group 3 + 19 18 17 16 | Group 4 + 23 22 21 20 | Group 5 + 27 26 25 24 | Group 6 + 30 29 28 | Group 7 (most significant) + +QUIRK_LITTLE_ENDIAN: + +:: + + 30 28 29 | Group 7 (most significant) + 24 25 26 27 | Group 6 + 20 21 22 23 | Group 5 + 16 17 18 19 | Group 4 + 12 13 14 15 | Group 3 + 8 9 10 11 | Group 2 + 4 5 6 7 | Group 1 + 0 1 2 3 | Group 0 (least significant) + +QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST: + +:: + + 0 1 2 3 | Group 0 (least significant) + 4 5 6 7 | Group 1 + 8 9 10 11 | Group 2 + 12 13 14 15 | Group 3 + 16 17 18 19 | Group 4 + 20 21 22 23 | Group 5 + 24 25 26 27 | Group 6 + 28 29 30 | Group 7 (most significant) + Intended use ------------ diff --git a/lib/packing.c b/lib/packing.c index 1cb3e0b2a24e..bfc40103f821 100644 --- a/lib/packing.c +++ b/lib/packing.c @@ -9,27 +9,6 @@ #include <linux/types.h> #include <linux/bitrev.h> -static int get_le_offset(int offset) -{ - int closest_multiple_of_4; - - closest_multiple_of_4 = (offset / 4) * 4; - offset -= closest_multiple_of_4; - return closest_multiple_of_4 + (3 - offset); -} - -static int get_reverse_lsw32_offset(int offset, size_t len) -{ - int closest_multiple_of_4; - int word_index; - - word_index = offset / 4; - closest_multiple_of_4 = word_index * 4; - offset -= closest_multiple_of_4; - word_index = (len / 4) - word_index - 1; - return word_index * 4 + offset; -} - static void adjust_for_msb_right_quirk(u64 *to_write, int *box_start_bit, int *box_end_bit, u8 *box_mask) { @@ -47,6 +26,48 @@ static void adjust_for_msb_right_quirk(u64 *to_write, int *box_start_bit, *box_end_bit = new_box_end_bit; } +/** + * calculate_box_addr - Determine physical location of byte in buffer + * @box: Index of byte within buffer seen as a logical big-endian big number + * @len: Size of buffer in bytes + * @quirks: mask of QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN + * + * Function interprets the buffer as a @len byte sized big number, and returns + * the physical offset of the @box logical octet within it. Internally, it + * treats the big number as groups of 4 bytes. If @len is not a multiple of 4, + * the last group may be shorter. + * + * @QUIRK_LSW32_IS_FIRST gives the ordering of groups of 4 octets relative to + * each other. If set, the most significant group of 4 octets is last in the + * buffer (and may be truncated if @len is not a multiple of 4). + * + * @QUIRK_LITTLE_ENDIAN gives the ordering of bytes within each group of 4. + * If set, the most significant byte is last in the group. If @len takes the + * form of 4k+3, the last group will only be able to represent 24 bits, and its + * most significant octet is byte 2. + * + * Returns the physical offset into the buffer corresponding to the logical box. + */ +static int calculate_box_addr(int box, size_t len, int quirks) +{ + size_t offset_of_group, offset_in_group, this_group = box / 4; + size_t group_size; + + if (quirks & QUIRK_LSW32_IS_FIRST) + offset_of_group = this_group * 4; + else + offset_of_group = len - ((this_group + 1) * 4); + + group_size = min(4, len - offset_of_group); + + if (quirks & QUIRK_LITTLE_ENDIAN) + offset_in_group = box - this_group * 4; + else + offset_in_group = group_size - (box - this_group * 4) - 1; + + return offset_of_group + offset_in_group; +} + /** * packing - Convert numbers (currently u64) between a packed and an unpacked * format. Unpacked means laid out in memory in the CPU's native @@ -157,12 +178,7 @@ int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen, * effective addressing inside the pbuf (so it's not * logical any longer). */ - box_addr = pbuflen - box - 1; - if (quirks & QUIRK_LITTLE_ENDIAN) - box_addr = get_le_offset(box_addr); - if (quirks & QUIRK_LSW32_IS_FIRST) - box_addr = get_reverse_lsw32_offset(box_addr, - pbuflen); + box_addr = calculate_box_addr(box, pbuflen, quirks); if (op == UNPACK) { u64 pval; -- 2.34.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-18 13:29 ` Vladimir Oltean @ 2024-08-19 18:45 ` Jacob Keller 2024-08-19 21:53 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-19 18:45 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/18/2024 6:29 AM, Vladimir Oltean wrote: > Hi Jake, > > On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote: >> I'm honestly not sure what the right solution here is, because the way >> LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require* >> word-aligned sizes. If we use a word-aligned size, then they both make >> sense, but my hardware buffer isn't word aligned. I can cheat, and just >> make sure I never use bits that access the invalid parts of the buffer.. >> but that seems like the wrong solution... A larger size would break >> normal Big endian ordering without quirks... > > It is a use case that I would like to support. Thanks for having the > patience to explain the issue to me. > Great, thank! >> Really, what my hardware buffer wants is to map the lowest byte of the >> data to the lowest byte of the buffer. This is what i would consider >> traditionally little endian ordering. >> >> This also happens to be is equivalent to LSW32_IS_FIRST and >> LITTLE_ENDIAN when sizes are multiples of 4. > > Yes, "traditionally little endian" would indeed translate into > QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems > correct. I did need that further distinction between "little endian > within a group of 4 bytes" and "little endian among groups of 4 bytes" > because the NXP SJA1105 memory layout is weird like that, and is > "little endian" in one way but not in another. Anyway.. Yea, I figured the distinction was based on real hardware. > > I've attached 2 patches which hopefully make the API usable for your > driver. I've tested them locally and did not notice issues. I'll check these out and get back to you! Thanks, Jake ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-19 18:45 ` Jacob Keller @ 2024-08-19 21:53 ` Jacob Keller 2024-08-21 13:58 ` Vladimir Oltean 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-19 21:53 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/19/2024 11:45 AM, Jacob Keller wrote: > > > On 8/18/2024 6:29 AM, Vladimir Oltean wrote: >> Hi Jake, >> >> On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote: >>> I'm honestly not sure what the right solution here is, because the way >>> LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require* >>> word-aligned sizes. If we use a word-aligned size, then they both make >>> sense, but my hardware buffer isn't word aligned. I can cheat, and just >>> make sure I never use bits that access the invalid parts of the buffer.. >>> but that seems like the wrong solution... A larger size would break >>> normal Big endian ordering without quirks... >> >> It is a use case that I would like to support. Thanks for having the >> patience to explain the issue to me. >> > > Great, thank! > >>> Really, what my hardware buffer wants is to map the lowest byte of the >>> data to the lowest byte of the buffer. This is what i would consider >>> traditionally little endian ordering. >>> >>> This also happens to be is equivalent to LSW32_IS_FIRST and >>> LITTLE_ENDIAN when sizes are multiples of 4. >> >> Yes, "traditionally little endian" would indeed translate into >> QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems >> correct. I did need that further distinction between "little endian >> within a group of 4 bytes" and "little endian among groups of 4 bytes" >> because the NXP SJA1105 memory layout is weird like that, and is >> "little endian" in one way but not in another. Anyway.. > > > Yea, I figured the distinction was based on real hardware. > >> >> I've attached 2 patches which hopefully make the API usable for your >> driver. I've tested them locally and did not notice issues. > > I'll check these out and get back to you! > > Thanks, > Jake > The patches work for my use-case! I also think it might be helpful to add some Kunit tests to cover the packing and unpacking, which I wouldn't mind trying to do. If/when you send the patches, feel free to add: Tested-by: Jacob Keller <jacob.e.keller@intel.com> Alternatively, I could send them as part of the series where I implement the changes to use lib/packing in the ice driver. Thanks, Jake ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-19 21:53 ` Jacob Keller @ 2024-08-21 13:58 ` Vladimir Oltean 2024-08-21 19:12 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Vladimir Oltean @ 2024-08-21 13:58 UTC (permalink / raw) To: Jacob Keller; +Cc: netdev@vger.kernel.org On Mon, Aug 19, 2024 at 02:53:34PM -0700, Jacob Keller wrote: > The patches work for my use-case! I also think it might be helpful to > add some Kunit tests to cover the packing and unpacking, which I > wouldn't mind trying to do. > > If/when you send the patches, feel free to add: > > Tested-by: Jacob Keller <jacob.e.keller@intel.com> > > Alternatively, I could send them as part of the series where I implement > the changes to use lib/packing in the ice driver. This is good to hear! I did have some boot-time selftests written, albeit in a very bare-bones format. I'm afraid I do not have as much energy as I'd wish to push this patch set forward. Especially not to learn to use KUnit and to adapt the tests to that format. If you wish, here is a Github branch with what I have. You can pick from it for your submissions and adapt as you see fit. https://github.com/vladimiroltean/linux/tree/packing-selftests ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-21 13:58 ` Vladimir Oltean @ 2024-08-21 19:12 ` Jacob Keller 2024-08-21 20:21 ` Vladimir Oltean 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-21 19:12 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/21/2024 6:58 AM, Vladimir Oltean wrote: > On Mon, Aug 19, 2024 at 02:53:34PM -0700, Jacob Keller wrote: >> The patches work for my use-case! I also think it might be helpful to >> add some Kunit tests to cover the packing and unpacking, which I >> wouldn't mind trying to do. >> >> If/when you send the patches, feel free to add: >> >> Tested-by: Jacob Keller <jacob.e.keller@intel.com> >> >> Alternatively, I could send them as part of the series where I implement >> the changes to use lib/packing in the ice driver. > > This is good to hear! > > I did have some boot-time selftests written, albeit in a very bare-bones > format. I'm afraid I do not have as much energy as I'd wish to push this > patch set forward. Especially not to learn to use KUnit and to adapt the > tests to that format. No problem. I'm quite happy to work on that part. > > If you wish, here is a Github branch with what I have. You can pick from > it for your submissions and adapt as you see fit. > https://github.com/vladimiroltean/linux/tree/packing-selftests Ok. I'll investigate this, and I will send the two fixes for lib/packing in my series to implement the support in ice. That would help on our end with managing the changes since it avoids an interdependence between multiple series in flight. Regards, Jake ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-21 19:12 ` Jacob Keller @ 2024-08-21 20:21 ` Vladimir Oltean 2024-08-21 23:41 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Vladimir Oltean @ 2024-08-21 20:21 UTC (permalink / raw) To: Jacob Keller; +Cc: netdev@vger.kernel.org On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote: > Ok. I'll investigate this, and I will send the two fixes for lib/packing > in my series to implement the support in ice. That would help on our end > with managing the changes since it avoids an interdependence between > multiple series in flight. There's one patch in there which replaces the packing(PACK) call with a dedicated pack() function, and packing(UNPACK) with unpack(). The idea being that it helps with const correctness. I still have some mixed feelings about this, because a multiplexed packing() call is in some ways more flexible, but apparently others felt bad enough about the packing() API to tell me about it, and that stuck with me. I'm mentioning it because if you're going to use the API, you could at least consider using the const-correct form, so that there's one less driver to refactor later. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-21 20:21 ` Vladimir Oltean @ 2024-08-21 23:41 ` Jacob Keller 2024-08-23 1:41 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-21 23:41 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/21/2024 1:21 PM, Vladimir Oltean wrote: > On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote: >> Ok. I'll investigate this, and I will send the two fixes for lib/packing >> in my series to implement the support in ice. That would help on our end >> with managing the changes since it avoids an interdependence between >> multiple series in flight. > > There's one patch in there which replaces the packing(PACK) call with a > dedicated pack() function, and packing(UNPACK) with unpack(). The idea > being that it helps with const correctness. I still have some mixed > feelings about this, because a multiplexed packing() call is in some > ways more flexible, but apparently others felt bad enough about the > packing() API to tell me about it, and that stuck with me. > > I'm mentioning it because if you're going to use the API, you could at > least consider using the const-correct form, so that there's one less > driver to refactor later. Yep! I've got those patches in my series now. Though I should note that I did not include any of the patches for the other drivers. I'll CC you when I send the series out, though it may likely go through our Intel-Wired-LAN tree first. I've refactored your self tests into KUnit tests as well! ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-21 23:41 ` Jacob Keller @ 2024-08-23 1:41 ` Jacob Keller 2024-08-23 19:53 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-23 1:41 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/21/2024 4:41 PM, Jacob Keller wrote: > > > On 8/21/2024 1:21 PM, Vladimir Oltean wrote: >> On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote: >>> Ok. I'll investigate this, and I will send the two fixes for lib/packing >>> in my series to implement the support in ice. That would help on our end >>> with managing the changes since it avoids an interdependence between >>> multiple series in flight. >> >> There's one patch in there which replaces the packing(PACK) call with a >> dedicated pack() function, and packing(UNPACK) with unpack(). The idea >> being that it helps with const correctness. I still have some mixed >> feelings about this, because a multiplexed packing() call is in some >> ways more flexible, but apparently others felt bad enough about the >> packing() API to tell me about it, and that stuck with me. >> >> I'm mentioning it because if you're going to use the API, you could at >> least consider using the const-correct form, so that there's one less >> driver to refactor later. > > Yep! I've got those patches in my series now. Though I should note that > I did not include any of the patches for the other drivers. I'll CC you > when I send the series out, though it may likely go through our > Intel-Wired-LAN tree first. > > I've refactored your self tests into KUnit tests as well! > I was writing additional tests and I think I ran into another issue with QUIRK_MSB_ON_THE_RIGHT, when the bit offsets are not aligned to a byte boundary: When trying to unpack 0x1122334455667788 from the buffer between offsets 106-43, the calculation appears to completely break. When packing: > [18:34:50] box_bit_width = 3 > [18:34:50] box_start_bit = 2 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 2 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 5 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 3 > [18:34:50] new_box_start_bit = 1 > [18:34:50] new_box_end_bit = -3 > [18:34:50] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264 > [18:34:50] Expected uval == params->uval, but > [18:34:50] uval == 1234605616436508544 (0x1122334455667780) > [18:34:50] params->uval == 1234605616436508552 (0x1122334455667788) > [18:34:50] [FAILED] msb right, 16 bytes, non-aligned > [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20 Notice that the box end bit is now negative. Specifically this is because the width is smaller than the start bit, so subtraction underflows. When unpacking: > [18:34:50] box_bit_width = 3 > [18:34:50] box_start_bit = 2 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 2 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 8 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 0 > [18:34:50] new_box_start_bit = 7 > [18:34:50] new_box_end_bit = 0 > [18:34:50] box_bit_width = 5 > [18:34:50] box_start_bit = 7 > [18:34:50] box_end_bit = 3 > [18:34:50] new_box_start_bit = 1 > [18:34:50] new_box_end_bit = -3 > [18:34:50] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264 > [18:34:50] Expected uval == params->uval, but > [18:34:50] uval == 1234605616436508544 (0x1122334455667780) > [18:34:50] params->uval == 1234605616436508552 (0x1122334455667788) > [18:34:50] [FAILED] msb right, 16 bytes, non-aligned > [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20 Specifically, it looks like we basically fail to calculate valid new box offsets. What's weird to me is that when the box width is larger than the start bit position, we just calculate the same exact offsets, so I don't see why the existing calculations are there at all. Something is obviously wrong here. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-23 1:41 ` Jacob Keller @ 2024-08-23 19:53 ` Jacob Keller 2024-08-26 22:03 ` Jacob Keller 0 siblings, 1 reply; 12+ messages in thread From: Jacob Keller @ 2024-08-23 19:53 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 8845 bytes --] On 8/22/2024 6:41 PM, Jacob Keller wrote: > > > On 8/21/2024 4:41 PM, Jacob Keller wrote: >> >> >> On 8/21/2024 1:21 PM, Vladimir Oltean wrote: >>> On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote: >>>> Ok. I'll investigate this, and I will send the two fixes for lib/packing >>>> in my series to implement the support in ice. That would help on our end >>>> with managing the changes since it avoids an interdependence between >>>> multiple series in flight. >>> >>> There's one patch in there which replaces the packing(PACK) call with a >>> dedicated pack() function, and packing(UNPACK) with unpack(). The idea >>> being that it helps with const correctness. I still have some mixed >>> feelings about this, because a multiplexed packing() call is in some >>> ways more flexible, but apparently others felt bad enough about the >>> packing() API to tell me about it, and that stuck with me. >>> >>> I'm mentioning it because if you're going to use the API, you could at >>> least consider using the const-correct form, so that there's one less >>> driver to refactor later. >> >> Yep! I've got those patches in my series now. Though I should note that >> I did not include any of the patches for the other drivers. I'll CC you >> when I send the series out, though it may likely go through our >> Intel-Wired-LAN tree first. >> >> I've refactored your self tests into KUnit tests as well! >> > > I was writing additional tests and I think I ran into another issue with > QUIRK_MSB_ON_THE_RIGHT, when the bit offsets are not aligned to a byte > boundary: > > When trying to unpack 0x1122334455667788 from the buffer between offsets > 106-43, the calculation appears to completely break. > > When packing: > >> [18:34:50] box_bit_width = 3 >> [18:34:50] box_start_bit = 2 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 2 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 5 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 3 >> [18:34:50] new_box_start_bit = 1 >> [18:34:50] new_box_end_bit = -3 >> [18:34:50] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264 >> [18:34:50] Expected uval == params->uval, but >> [18:34:50] uval == 1234605616436508544 (0x1122334455667780) >> [18:34:50] params->uval == 1234605616436508552 (0x1122334455667788) >> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned >> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20 > > Notice that the box end bit is now negative. Specifically this is > because the width is smaller than the start bit, so subtraction underflows. > > When unpacking: >> [18:34:50] box_bit_width = 3 >> [18:34:50] box_start_bit = 2 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 2 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 8 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 0 >> [18:34:50] new_box_start_bit = 7 >> [18:34:50] new_box_end_bit = 0 >> [18:34:50] box_bit_width = 5 >> [18:34:50] box_start_bit = 7 >> [18:34:50] box_end_bit = 3 >> [18:34:50] new_box_start_bit = 1 >> [18:34:50] new_box_end_bit = -3 >> [18:34:50] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264 >> [18:34:50] Expected uval == params->uval, but >> [18:34:50] uval == 1234605616436508544 (0x1122334455667780) >> [18:34:50] params->uval == 1234605616436508552 (0x1122334455667788) >> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned >> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20 > > Specifically, it looks like we basically fail to calculate valid new box > offsets. > > What's weird to me is that when the box width is larger than the start > bit position, we just calculate the same exact offsets, so I don't see > why the existing calculations are there at all. Something is obviously > wrong here. > Specifically this is making me question: Does QUIRK_MSB_ON_THE_RIGHT mean that the msb of each bit field is on the right? or does it mean that every byte in the buffer has its 8 bits flipped? It seems to document that it applies to the bits within the byte, but not to the byte ordering. The code seems to be trying to do a mix of different things though, and my attempts at fixing it haven't worked properly yet. I think I was able to get pack() to behave correctly, but unpacking seems to still be erratic. I tried adding a few more tests but so far haven't figured out what is wrong with the unpacking code. I think the attempt to re-use adjust_for_msb_right on both unpacking and packing isn't working correctly. When unpacking, I think we end up masking the wrong bits. With the attached patch, I was able to get my test case for packing fixed, but the following couple of tests fail: > { > .desc = "msb right, 16 bytes, non-aligned", > PBUF(0x00, 0x00, 0x00, 0x91, 0x88, 0x59, 0x44, 0xd5, > 0xcc, 0x3d, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00), > .uval = 0x1122334455667788, > .start_bit = 106, > .end_bit = 43, > .quirks = QUIRK_MSB_ON_THE_RIGHT, > }, packing for this test works with the attached patch, but fails to unpack: > [12:49:53] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:282 > [12:49:53] Expected uval == params->uval, but > [12:49:53] uval == 1234605616436508544 (0x1122334455667780) > [12:49:53] params->uval == 1234605616436508552 (0x1122334455667788) the last few bits don't seem to be included properly. I also tried a test case with all bits of the u64 set: > { > .desc = "msb right, 16 bytes, non-aligned, 0xff", > PBUF(0x00, 0x00, 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff, > 0xff, 0xff, 0x1f, 0x00, 0x00, 0x00, 0x00, 0x00), > .uval = 0xffffffffffffffff, > .start_bit = 106, > .end_bit = 43, > .quirks = QUIRK_MSB_ON_THE_RIGHT, > }, > [12:49:53] # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:282 > [12:49:53] Expected uval == params->uval, but > [12:49:53] uval == 2305843009213693944 (0x1ffffffffffffff8) > [12:49:53] params->uval == -1 (0xffffffffffffffff) In this case, again it seems like the bits on the tail end partial bytes don't get unpacked properly. This test also passes packing with the attempted fix. I so far think the likely issue is with the way we handle the box and mask offsets. Somehow we must be shifting things in a strange way that causes us to discard bits, but only when we deal with the QUIRK_MSB_ON_THE_RIGHT. Thanks, Jake [-- Attachment #2: maybe-fixed-packing.patch --] [-- Type: text/plain, Size: 1388 bytes --] diff --git c/lib/packing.c i/lib/packing.c index 80c95dacbfaa..c7aab00b106c 100644 --- c/lib/packing.c +++ i/lib/packing.c @@ -14,14 +14,16 @@ static void adjust_for_msb_right_quirk(u64 *to_write, size_t *box_start_bit, { size_t box_bit_width = *box_start_bit - *box_end_bit + 1; size_t new_box_start_bit, new_box_end_bit; + u8 new_box_mask; - *to_write >>= *box_end_bit; - *to_write = bitrev8(*to_write) >> (8 - box_bit_width); - *to_write <<= *box_end_bit; - new_box_end_bit = box_bit_width - *box_start_bit - 1; - new_box_start_bit = box_bit_width - *box_end_bit - 1; - *box_mask = GENMASK_ULL(new_box_start_bit, new_box_end_bit); + *to_write = bitrev8(*to_write); + + new_box_mask = bitrev8(*box_mask); + new_box_end_bit = ffs(new_box_mask) - 1; + new_box_start_bit = fls(new_box_mask) - 1; + + *box_mask = new_box_mask; *box_start_bit = new_box_start_bit; *box_end_bit = new_box_end_bit; } @@ -170,13 +172,13 @@ int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen, /* Write to pbuf, read from uval */ pval = uval & proj_mask; pval >>= proj_end_bit; + pval <<= box_end_bit; if (quirks & QUIRK_MSB_ON_THE_RIGHT) adjust_for_msb_right_quirk(&pval, &box_start_bit, &box_end_bit, &box_mask); - pval <<= box_end_bit; ((u8 *)pbuf)[box_addr] &= ~box_mask; ((u8 *)pbuf)[box_addr] |= pval; } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST 2024-08-23 19:53 ` Jacob Keller @ 2024-08-26 22:03 ` Jacob Keller 0 siblings, 0 replies; 12+ messages in thread From: Jacob Keller @ 2024-08-26 22:03 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev@vger.kernel.org On 8/23/2024 12:53 PM, Jacob Keller wrote: > I so far think the likely issue is with the way we handle the box and > mask offsets. Somehow we must be shifting things in a strange way that > causes us to discard bits, but only when we deal with the > QUIRK_MSB_ON_THE_RIGHT. > > Thanks, > Jake I eventually figured out what i believe is wrong here, and have a fix. I'm going to include it in the series I send for using <linux/packing.h> in the ice driver, along with the additional KUnit tests. Thanks, Jake ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-08-26 22:03 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-15 21:16 lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST Jacob Keller 2024-08-16 23:37 ` Jacob Keller 2024-08-18 13:29 ` Vladimir Oltean 2024-08-19 18:45 ` Jacob Keller 2024-08-19 21:53 ` Jacob Keller 2024-08-21 13:58 ` Vladimir Oltean 2024-08-21 19:12 ` Jacob Keller 2024-08-21 20:21 ` Vladimir Oltean 2024-08-21 23:41 ` Jacob Keller 2024-08-23 1:41 ` Jacob Keller 2024-08-23 19:53 ` Jacob Keller 2024-08-26 22:03 ` Jacob Keller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).