netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
@ 2024-08-15 21:16 Jacob Keller
  2024-08-16 23:37 ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-15 21:16 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org

Hi Vladimir!

I am recently attempting to refactor some bespoke code for packing data
in the ice networking driver into a bit packed hardware array, using the
<linux/packing.h> and lib/packing.c infrastructure.

My hardware packed buffer needs QUIRK_LITTLE_ENDIAN and
QUIRK_LSW32_IS_FIRST, as the entire buffer is packed as little endian,
and the lower words come first in the buffer.

Everything was working ok in initial testing, until I tried packing a
buffer that was 22 bytes long. Everything seemed to shift by 2 bytes:

Here, I print the initial details such as offset, lsb, width, then the
hex dump of the u64 value I'm trying to insert.

I do the call to packing() with the appropriate quirks set, and then hex
dump the 22-byte buffer after.

> kernel: offset=0, size=8, lsb=0, width=57
> kernel: value: 60 9b fe 01 00 00 00 00
> kernel: 0x0000000001fe9b60 -> 000-056: fe 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> kernel: 0x0000000001fe9b60 -> 000-056: 00 00 00 00 00 00

It seems to have failed to copy the first two bytes in to the buffer.

I discovered that if I pretend the buffer is 24 bytes (a multiple of 4
bytes, i.e. a word size), then everything works as expected:

> kernel: offset=0, size=8, lsb=0, width=57
> kernel: value: 60 fc fe 01 00 00 00 00
> kernel: 0x0000000001fefc60 -> 000-056: 60 fc fe 01 00 00 00 00 00 00 00 00 00 00 00 00
> kernel: 0x0000000001fefc60 -> 000-056: 00 00 00 00 00 00 00 00

It seems like this is either a misunderstanding of a fundamental
requirement of the packing() infrastructure, a bug in the
QUIRK_LSW32_IS_FIRST, or I need a new quirk for the packing infrastructure?

Essentially, I think the problem is that it uses the length of the
buffer to find the word, but when the length is not a multiple of 2, the
word offset picked is incorrect.

Using a larger length than the size of the buffer "works" as long as I
never use a bit offset thats larger than the *actual* buffer.. but that
seems like a poor solution.

Essentially, it seems like the default indexing for big endian is
searching for each byte from the end of the array and then using the
quirks to swap back to the inverse ending.

I think the fix is probably just to do a round-up division on the len/4
check in get_reverse_lsw32_offset, since its calculating the total
number of words in the length, and effectively rounds down for lengths
that aren't multiples of 4.

Does this seem like the correct analysis to you?

Thanks,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-15 21:16 lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST Jacob Keller
@ 2024-08-16 23:37 ` Jacob Keller
  2024-08-18 13:29   ` Vladimir Oltean
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-16 23:37 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/15/2024 2:16 PM, Jacob Keller wrote:
> Hi Vladimir!
> 
> I am recently attempting to refactor some bespoke code for packing data
> in the ice networking driver into a bit packed hardware array, using the
> <linux/packing.h> and lib/packing.c infrastructure.
> 
> My hardware packed buffer needs QUIRK_LITTLE_ENDIAN and
> QUIRK_LSW32_IS_FIRST, as the entire buffer is packed as little endian,
> and the lower words come first in the buffer.
> 
> Everything was working ok in initial testing, until I tried packing a
> buffer that was 22 bytes long. Everything seemed to shift by 2 bytes:
> 
> Here, I print the initial details such as offset, lsb, width, then the
> hex dump of the u64 value I'm trying to insert.
> 
> I do the call to packing() with the appropriate quirks set, and then hex
> dump the 22-byte buffer after.
> 
>> kernel: offset=0, size=8, lsb=0, width=57
>> kernel: value: 60 9b fe 01 00 00 00 00
>> kernel: 0x0000000001fe9b60 -> 000-056: fe 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> kernel: 0x0000000001fe9b60 -> 000-056: 00 00 00 00 00 00
> 
> It seems to have failed to copy the first two bytes in to the buffer.
> 
> I discovered that if I pretend the buffer is 24 bytes (a multiple of 4
> bytes, i.e. a word size), then everything works as expected:
> 
>> kernel: offset=0, size=8, lsb=0, width=57
>> kernel: value: 60 fc fe 01 00 00 00 00
>> kernel: 0x0000000001fefc60 -> 000-056: 60 fc fe 01 00 00 00 00 00 00 00 00 00 00 00 00
>> kernel: 0x0000000001fefc60 -> 000-056: 00 00 00 00 00 00 00 00
> 
> It seems like this is either a misunderstanding of a fundamental
> requirement of the packing() infrastructure, a bug in the
> QUIRK_LSW32_IS_FIRST, or I need a new quirk for the packing infrastructure?
> 
> Essentially, I think the problem is that it uses the length of the
> buffer to find the word, but when the length is not a multiple of 2, the
> word offset picked is incorrect.
> 
> Using a larger length than the size of the buffer "works" as long as I
> never use a bit offset thats larger than the *actual* buffer.. but that
> seems like a poor solution.
> 
> Essentially, it seems like the default indexing for big endian is
> searching for each byte from the end of the array and then using the
> quirks to swap back to the inverse ending.
> 
> I think the fix is probably just to do a round-up division on the len/4
> check in get_reverse_lsw32_offset, since its calculating the total
> number of words in the length, and effectively rounds down for lengths
> that aren't multiples of 4.
> 

This doesn't fix the problem. I printed out the logical box number and
its mapped box address in the various modes with a length of 22 bytes:

> kernel: default: 00 => 21
> kernel: default: 01 => 20
> kernel: default: 02 => 19
> kernel: default: 03 => 18
> kernel: default: 04 => 17
> kernel: default: 05 => 16
> kernel: default: 06 => 15
> kernel: default: 07 => 14
> kernel: default: 08 => 13
> kernel: default: 09 => 12
> kernel: default: 10 => 11
> kernel: default: 11 => 10
> kernel: default: 12 => 09
> kernel: default: 13 => 08
> kernel: default: 14 => 07
> kernel: default: 15 => 06
> kernel: default: 16 => 05
> kernel: default: 17 => 04
> kernel: default: 18 => 03
> kernel: default: 19 => 02
> kernel: default: 20 => 01
> kernel: default: 21 => 00

Here, you can see that it correctly maps in "big endian" order the
entire array. Basically it just reverses the ordering. Ok, this seems
reasonable.

With LITTLE_ENDIAN set, things get weird:
> kernel: LITTLE_ENDIAN: 00 => 22
> kernel: LITTLE_ENDIAN: 01 => 23
> kernel: LITTLE_ENDIAN: 02 => 16
> kernel: LITTLE_ENDIAN: 03 => 17
> kernel: LITTLE_ENDIAN: 04 => 18
> kernel: LITTLE_ENDIAN: 05 => 19
> kernel: LITTLE_ENDIAN: 06 => 12
> kernel: LITTLE_ENDIAN: 07 => 13
> kernel: LITTLE_ENDIAN: 08 => 14
> kernel: LITTLE_ENDIAN: 09 => 15
> kernel: LITTLE_ENDIAN: 10 => 08
> kernel: LITTLE_ENDIAN: 11 => 09
> kernel: LITTLE_ENDIAN: 12 => 10
> kernel: LITTLE_ENDIAN: 13 => 11
> kernel: LITTLE_ENDIAN: 14 => 04
> kernel: LITTLE_ENDIAN: 15 => 05
> kernel: LITTLE_ENDIAN: 16 => 06
> kernel: LITTLE_ENDIAN: 17 => 07
> kernel: LITTLE_ENDIAN: 18 => 00
> kernel: LITTLE_ENDIAN: 19 => 01
> kernel: LITTLE_ENDIAN: 20 => 02
> kernel: LITTLE_ENDIAN: 21 => 03

We map *outside* the buffer! 0 goes to 22, 1 goes to 23! Nothing at all
maps into 20 or 21. This is extremely not-intuitive. In practice,
without having read the docs I would have expected LITTLE_ENDIAN to
directly map 0 to 0. After having read docs, (where it mentions that
LITTLE_ENDIAN only does by word), I honestly have no idea what I would
expect. If we try to just do 4 byte blocks at a time, we end up having
to swap bytes from outside the length.

With LSW32_IS_FIRST set, things are even weirder:

> kernel: LSW32_IS_FIRST: 00 => -3
> kernel: LSW32_IS_FIRST: 01 => -4
> kernel: LSW32_IS_FIRST: 02 => 03
> kernel: LSW32_IS_FIRST: 03 => 02
> kernel: LSW32_IS_FIRST: 04 => 01
> kernel: LSW32_IS_FIRST: 05 => 00
> kernel: LSW32_IS_FIRST: 06 => 07
> kernel: LSW32_IS_FIRST: 07 => 06
> kernel: LSW32_IS_FIRST: 08 => 05
> kernel: LSW32_IS_FIRST: 09 => 04
> kernel: LSW32_IS_FIRST: 10 => 11
> kernel: LSW32_IS_FIRST: 11 => 10
> kernel: LSW32_IS_FIRST: 12 => 09
> kernel: LSW32_IS_FIRST: 13 => 08
> kernel: LSW32_IS_FIRST: 14 => 15
> kernel: LSW32_IS_FIRST: 15 => 14
> kernel: LSW32_IS_FIRST: 16 => 13
> kernel: LSW32_IS_FIRST: 17 => 12
> kernel: LSW32_IS_FIRST: 18 => 19
> kernel: LSW32_IS_FIRST: 19 => 18
> kernel: LSW32_IS_FIRST: 20 => 17
> kernel: LSW32_IS_FIRST: 21 => 16

We access bytes with negative indexes...

And with both set:

> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 00 => -2
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 01 => -1
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 02 => 00
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 03 => 01
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 04 => 02
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 05 => 03
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 06 => 04
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 07 => 05
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 08 => 06
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 09 => 07
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 10 => 08
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 11 => 09
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 12 => 10
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 13 => 11
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 14 => 12
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 15 => 13
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 16 => 14
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 17 => 15
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 18 => 16
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 19 => 17
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 20 => 18
> kernel: LITTLE_ENDIAN | LSW32_IS_FIRST: 21 => 19

We still get negative indexes, but things are close. Everything is off
by 2 bytes from what I would expect.

I'm honestly not sure what the right solution here is, because the way
LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require*
word-aligned sizes. If we use a word-aligned size, then they both make
sense, but my hardware buffer isn't word aligned. I can cheat, and just
make sure I never use bits that access the invalid parts of the buffer..
but that seems like the wrong solution... A larger size would break
normal Big endian ordering without quirks...

Really, what my hardware buffer wants is to map the lowest byte of the
data to the lowest byte of the buffer. This is what i would consider
traditionally little endian ordering.

This also happens to be is equivalent to LSW32_IS_FIRST and
LITTLE_ENDIAN when sizes are multiples of 4.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-16 23:37 ` Jacob Keller
@ 2024-08-18 13:29   ` Vladimir Oltean
  2024-08-19 18:45     ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Vladimir Oltean @ 2024-08-18 13:29 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1479 bytes --]

Hi Jake,

On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote:
> I'm honestly not sure what the right solution here is, because the way
> LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require*
> word-aligned sizes. If we use a word-aligned size, then they both make
> sense, but my hardware buffer isn't word aligned. I can cheat, and just
> make sure I never use bits that access the invalid parts of the buffer..
> but that seems like the wrong solution... A larger size would break
> normal Big endian ordering without quirks...

It is a use case that I would like to support. Thanks for having the
patience to explain the issue to me.

> Really, what my hardware buffer wants is to map the lowest byte of the
> data to the lowest byte of the buffer. This is what i would consider
> traditionally little endian ordering.
> 
> This also happens to be is equivalent to LSW32_IS_FIRST and
> LITTLE_ENDIAN when sizes are multiples of 4.

Yes, "traditionally little endian" would indeed translate into
QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems
correct. I did need that further distinction between "little endian
within a group of 4 bytes" and "little endian among groups of 4 bytes"
because the NXP SJA1105 memory layout is weird like that, and is
"little endian" in one way but not in another. Anyway..

I've attached 2 patches which hopefully make the API usable for your
driver. I've tested them locally and did not notice issues.

[-- Attachment #2: 0001-lib-packing-refuse-operating-on-bit-indices-which-ex.patch --]
[-- Type: text/x-diff, Size: 1509 bytes --]

From 8314f556f30e5e76973b1bcf1c552ff4a23621b9 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Sun, 18 Aug 2024 15:50:56 +0300
Subject: [PATCH 1/2] lib: packing: refuse operating on bit indices which
 exceed size of buffer

While reworking the implementation, it became apparent that this check
does not exist.

There is no functional issue yet, because at call sites, "startbit" and
"endbit" are always hardcoded to correct values, and never come from the
user.

Even with the upcoming support of arbitrary buffer lengths, the
"startbit >= 8 * pbuflen" check will remain correct. This is because
we intend to always interpret the packed buffer in a way that avoids
discontinuities in the available bit indices.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 lib/packing.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/packing.c b/lib/packing.c
index 3f656167c17e..1cb3e0b2a24e 100644
--- a/lib/packing.c
+++ b/lib/packing.c
@@ -86,8 +86,10 @@ int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
 	 */
 	int plogical_first_u8, plogical_last_u8, box;
 
-	/* startbit is expected to be larger than endbit */
-	if (startbit < endbit)
+	/* startbit is expected to be larger than endbit, and both are
+	 * expected to be within the logically addressable range of the buffer.
+	 */
+	if (startbit < endbit || startbit >= 8 * pbuflen || endbit < 0)
 		/* Invalid function call */
 		return -EINVAL;
 
-- 
2.34.1


[-- Attachment #3: 0002-lib-packing-adjust-definitions-and-implementation-fo.patch --]
[-- Type: text/x-diff, Size: 9050 bytes --]

From c37c61b4de3dcc4b7837445f0884f7c67e73f623 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Sun, 18 Aug 2024 15:48:26 +0300
Subject: [PATCH 2/2] lib: packing: adjust definitions and implementation for
 arbitrary buffer lengths

Jacob Keller has a use case for packing() in the intel/ice networking
driver, but it cannot be used as-is.

Simply put, the API quirks for LSW32_IS_FIRST and LITTLE_ENDIAN are
naively implemented with the undocumented assumption that the buffer
length must be a multiple of 4. All calculations of group offsets and
offsets of bytes within groups assume that this is the case. But in the
ice case, this does not hold true. For example, packing into a buffer
of 22 bytes would yield wrong results, but pretending it was a 24 byte
buffer would work.

Rather than requiring such hacks, and leaving a big question mark when
it comes to discontinuities in the accessible bit fields of such buffer,
we should extend the packing API to support this use case.

It turns out that we can keep the design in terms of groups of 4 bytes,
but also make it work if the total length is not a multiple of 4.
Just like before, imagine the buffer as a big number, and its most
significant bytes (the ones that would make up to a multiple of 4) are
missing. Thus, with a big endian (no quirks) interpretation of the
buffer, those most significant bytes would be absent from the beginning
of the buffer, and with a LSW32_IS_FIRST interpretation, they would be
absent from the end of the buffer. The LITTLE_ENDIAN quirk, in the
packing() API world, only affects byte ordering within groups of 4.
Thus, it does not change which bytes are missing. Only the significance
of the remaining bytes within the (smaller) group.

No change intended for buffer sizes which are multiples of 4. Tested
with the sja1105 driver and with downstream unit tests.

Link: https://lore.kernel.org/netdev/a0338310-e66c-497c-bc1f-a597e50aa3ff@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 Documentation/core-api/packing.rst | 71 ++++++++++++++++++++++++++++++
 lib/packing.c                      | 70 +++++++++++++++++------------
 2 files changed, 114 insertions(+), 27 deletions(-)

diff --git a/Documentation/core-api/packing.rst b/Documentation/core-api/packing.rst
index 3ed13bc9a195..821691f23c54 100644
--- a/Documentation/core-api/packing.rst
+++ b/Documentation/core-api/packing.rst
@@ -151,6 +151,77 @@ the more significant 4-byte word.
 We always think of our offsets as if there were no quirk, and we translate
 them afterwards, before accessing the memory region.
 
+Note on buffer lengths not multiple of 4
+----------------------------------------
+
+To deal with memory layout quirks where groups of 4 bytes are laid out "little
+endian" relative to each other, but "big endian" within the group itself, the
+concept of groups of 4 bytes is intrinsic to the packing API (not to be
+confused with the memory access, which is performed byte by byte, though).
+
+With buffer lengths not multiple of 4, this means one group will be incomplete.
+Depending on the quirks, this may lead to discontinuities in the bit fields
+accessible through the buffer. The packing API assumes discontinuities were not
+the intention of the memory layout, so it avoids them by effectively logically
+shortening the most significant group of 4 octets to the number of octets
+actually available.
+
+Example with a 31 byte sized buffer given below. Physical buffer offsets are
+implicit, and increase from left to right within a group, and from top to
+bottom within a column.
+
+No quirks:
+
+::
+
+            31         29         28        |   Group 7 (most significant)
+ 27         26         25         24        |   Group 6
+ 23         22         21         20        |   Group 5
+ 19         18         17         16        |   Group 4
+ 15         14         13         12        |   Group 3
+ 11         10          9          8        |   Group 2
+  7          6          5          4        |   Group 1
+  3          2          1          0        |   Group 0 (least significant)
+
+QUIRK_LSW32_IS_FIRST:
+
+::
+
+  3          2          1          0        |   Group 0 (least significant)
+  7          6          5          4        |   Group 1
+ 11         10          9          8        |   Group 2
+ 15         14         13         12        |   Group 3
+ 19         18         17         16        |   Group 4
+ 23         22         21         20        |   Group 5
+ 27         26         25         24        |   Group 6
+ 30         29         28                   |   Group 7 (most significant)
+
+QUIRK_LITTLE_ENDIAN:
+
+::
+
+            30         28         29        |   Group 7 (most significant)
+ 24         25         26         27        |   Group 6
+ 20         21         22         23        |   Group 5
+ 16         17         18         19        |   Group 4
+ 12         13         14         15        |   Group 3
+  8          9         10         11        |   Group 2
+  4          5          6          7        |   Group 1
+  0          1          2          3        |   Group 0 (least significant)
+
+QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST:
+
+::
+
+  0          1          2          3        |   Group 0 (least significant)
+  4          5          6          7        |   Group 1
+  8          9         10         11        |   Group 2
+ 12         13         14         15        |   Group 3
+ 16         17         18         19        |   Group 4
+ 20         21         22         23        |   Group 5
+ 24         25         26         27        |   Group 6
+ 28         29         30                   |   Group 7 (most significant)
+
 Intended use
 ------------
 
diff --git a/lib/packing.c b/lib/packing.c
index 1cb3e0b2a24e..bfc40103f821 100644
--- a/lib/packing.c
+++ b/lib/packing.c
@@ -9,27 +9,6 @@
 #include <linux/types.h>
 #include <linux/bitrev.h>
 
-static int get_le_offset(int offset)
-{
-	int closest_multiple_of_4;
-
-	closest_multiple_of_4 = (offset / 4) * 4;
-	offset -= closest_multiple_of_4;
-	return closest_multiple_of_4 + (3 - offset);
-}
-
-static int get_reverse_lsw32_offset(int offset, size_t len)
-{
-	int closest_multiple_of_4;
-	int word_index;
-
-	word_index = offset / 4;
-	closest_multiple_of_4 = word_index * 4;
-	offset -= closest_multiple_of_4;
-	word_index = (len / 4) - word_index - 1;
-	return word_index * 4 + offset;
-}
-
 static void adjust_for_msb_right_quirk(u64 *to_write, int *box_start_bit,
 				       int *box_end_bit, u8 *box_mask)
 {
@@ -47,6 +26,48 @@ static void adjust_for_msb_right_quirk(u64 *to_write, int *box_start_bit,
 	*box_end_bit   = new_box_end_bit;
 }
 
+/**
+ * calculate_box_addr - Determine physical location of byte in buffer
+ * @box: Index of byte within buffer seen as a logical big-endian big number
+ * @len: Size of buffer in bytes
+ * @quirks: mask of QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN
+ *
+ * Function interprets the buffer as a @len byte sized big number, and returns
+ * the physical offset of the @box logical octet within it. Internally, it
+ * treats the big number as groups of 4 bytes. If @len is not a multiple of 4,
+ * the last group may be shorter.
+ *
+ * @QUIRK_LSW32_IS_FIRST gives the ordering of groups of 4 octets relative to
+ * each other. If set, the most significant group of 4 octets is last in the
+ * buffer (and may be truncated if @len is not a multiple of 4).
+ *
+ * @QUIRK_LITTLE_ENDIAN gives the ordering of bytes within each group of 4.
+ * If set, the most significant byte is last in the group. If @len takes the
+ * form of 4k+3, the last group will only be able to represent 24 bits, and its
+ * most significant octet is byte 2.
+ *
+ * Returns the physical offset into the buffer corresponding to the logical box.
+ */
+static int calculate_box_addr(int box, size_t len, int quirks)
+{
+	size_t offset_of_group, offset_in_group, this_group = box / 4;
+	size_t group_size;
+
+	if (quirks & QUIRK_LSW32_IS_FIRST)
+		offset_of_group = this_group * 4;
+	else
+		offset_of_group = len - ((this_group + 1) * 4);
+
+	group_size = min(4, len - offset_of_group);
+
+	if (quirks & QUIRK_LITTLE_ENDIAN)
+		offset_in_group = box - this_group * 4;
+	else
+		offset_in_group = group_size - (box - this_group * 4) - 1;
+
+	return offset_of_group + offset_in_group;
+}
+
 /**
  * packing - Convert numbers (currently u64) between a packed and an unpacked
  *	     format. Unpacked means laid out in memory in the CPU's native
@@ -157,12 +178,7 @@ int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
 		 * effective addressing inside the pbuf (so it's not
 		 * logical any longer).
 		 */
-		box_addr = pbuflen - box - 1;
-		if (quirks & QUIRK_LITTLE_ENDIAN)
-			box_addr = get_le_offset(box_addr);
-		if (quirks & QUIRK_LSW32_IS_FIRST)
-			box_addr = get_reverse_lsw32_offset(box_addr,
-							    pbuflen);
+		box_addr = calculate_box_addr(box, pbuflen, quirks);
 
 		if (op == UNPACK) {
 			u64 pval;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-18 13:29   ` Vladimir Oltean
@ 2024-08-19 18:45     ` Jacob Keller
  2024-08-19 21:53       ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-19 18:45 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/18/2024 6:29 AM, Vladimir Oltean wrote:
> Hi Jake,
> 
> On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote:
>> I'm honestly not sure what the right solution here is, because the way
>> LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require*
>> word-aligned sizes. If we use a word-aligned size, then they both make
>> sense, but my hardware buffer isn't word aligned. I can cheat, and just
>> make sure I never use bits that access the invalid parts of the buffer..
>> but that seems like the wrong solution... A larger size would break
>> normal Big endian ordering without quirks...
> 
> It is a use case that I would like to support. Thanks for having the
> patience to explain the issue to me.
> 

Great, thank!

>> Really, what my hardware buffer wants is to map the lowest byte of the
>> data to the lowest byte of the buffer. This is what i would consider
>> traditionally little endian ordering.
>>
>> This also happens to be is equivalent to LSW32_IS_FIRST and
>> LITTLE_ENDIAN when sizes are multiples of 4.
> 
> Yes, "traditionally little endian" would indeed translate into
> QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems
> correct. I did need that further distinction between "little endian
> within a group of 4 bytes" and "little endian among groups of 4 bytes"
> because the NXP SJA1105 memory layout is weird like that, and is
> "little endian" in one way but not in another. Anyway..


Yea, I figured the distinction was based on real hardware.

> 
> I've attached 2 patches which hopefully make the API usable for your
> driver. I've tested them locally and did not notice issues.

I'll check these out and get back to you!

Thanks,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-19 18:45     ` Jacob Keller
@ 2024-08-19 21:53       ` Jacob Keller
  2024-08-21 13:58         ` Vladimir Oltean
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-19 21:53 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/19/2024 11:45 AM, Jacob Keller wrote:
> 
> 
> On 8/18/2024 6:29 AM, Vladimir Oltean wrote:
>> Hi Jake,
>>
>> On Fri, Aug 16, 2024 at 04:37:22PM -0700, Jacob Keller wrote:
>>> I'm honestly not sure what the right solution here is, because the way
>>> LITTLE_ENDIAN and LSW32_IS_FIRST work they effectively *require*
>>> word-aligned sizes. If we use a word-aligned size, then they both make
>>> sense, but my hardware buffer isn't word aligned. I can cheat, and just
>>> make sure I never use bits that access the invalid parts of the buffer..
>>> but that seems like the wrong solution... A larger size would break
>>> normal Big endian ordering without quirks...
>>
>> It is a use case that I would like to support. Thanks for having the
>> patience to explain the issue to me.
>>
> 
> Great, thank!
> 
>>> Really, what my hardware buffer wants is to map the lowest byte of the
>>> data to the lowest byte of the buffer. This is what i would consider
>>> traditionally little endian ordering.
>>>
>>> This also happens to be is equivalent to LSW32_IS_FIRST and
>>> LITTLE_ENDIAN when sizes are multiples of 4.
>>
>> Yes, "traditionally little endian" would indeed translate into
>> QUIRK_LSW32_IS_FIRST | QUIRK_LITTLE_ENDIAN. Your use of the API seems
>> correct. I did need that further distinction between "little endian
>> within a group of 4 bytes" and "little endian among groups of 4 bytes"
>> because the NXP SJA1105 memory layout is weird like that, and is
>> "little endian" in one way but not in another. Anyway..
> 
> 
> Yea, I figured the distinction was based on real hardware.
> 
>>
>> I've attached 2 patches which hopefully make the API usable for your
>> driver. I've tested them locally and did not notice issues.
> 
> I'll check these out and get back to you!
> 
> Thanks,
> Jake
> 

The patches work for my use-case! I also think it might be helpful to
add some Kunit tests to cover the packing and unpacking, which I
wouldn't mind trying to do.

If/when you send the patches, feel free to add:

Tested-by: Jacob Keller <jacob.e.keller@intel.com>

Alternatively, I could send them as part of the series where I implement
the changes to use lib/packing in the ice driver.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-19 21:53       ` Jacob Keller
@ 2024-08-21 13:58         ` Vladimir Oltean
  2024-08-21 19:12           ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Vladimir Oltean @ 2024-08-21 13:58 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev@vger.kernel.org

On Mon, Aug 19, 2024 at 02:53:34PM -0700, Jacob Keller wrote:
> The patches work for my use-case! I also think it might be helpful to
> add some Kunit tests to cover the packing and unpacking, which I
> wouldn't mind trying to do.
> 
> If/when you send the patches, feel free to add:
> 
> Tested-by: Jacob Keller <jacob.e.keller@intel.com>
> 
> Alternatively, I could send them as part of the series where I implement
> the changes to use lib/packing in the ice driver.

This is good to hear!

I did have some boot-time selftests written, albeit in a very bare-bones
format. I'm afraid I do not have as much energy as I'd wish to push this
patch set forward. Especially not to learn to use KUnit and to adapt the
tests to that format.

If you wish, here is a Github branch with what I have. You can pick from
it for your submissions and adapt as you see fit.
https://github.com/vladimiroltean/linux/tree/packing-selftests

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-21 13:58         ` Vladimir Oltean
@ 2024-08-21 19:12           ` Jacob Keller
  2024-08-21 20:21             ` Vladimir Oltean
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-21 19:12 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/21/2024 6:58 AM, Vladimir Oltean wrote:
> On Mon, Aug 19, 2024 at 02:53:34PM -0700, Jacob Keller wrote:
>> The patches work for my use-case! I also think it might be helpful to
>> add some Kunit tests to cover the packing and unpacking, which I
>> wouldn't mind trying to do.
>>
>> If/when you send the patches, feel free to add:
>>
>> Tested-by: Jacob Keller <jacob.e.keller@intel.com>
>>
>> Alternatively, I could send them as part of the series where I implement
>> the changes to use lib/packing in the ice driver.
> 
> This is good to hear!
> 
> I did have some boot-time selftests written, albeit in a very bare-bones
> format. I'm afraid I do not have as much energy as I'd wish to push this
> patch set forward. Especially not to learn to use KUnit and to adapt the
> tests to that format.

No problem. I'm quite happy to work on that part.

> 
> If you wish, here is a Github branch with what I have. You can pick from
> it for your submissions and adapt as you see fit.
> https://github.com/vladimiroltean/linux/tree/packing-selftests

Ok. I'll investigate this, and I will send the two fixes for lib/packing
in my series to implement the support in ice. That would help on our end
with managing the changes since it avoids an interdependence between
multiple series in flight.

Regards,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-21 19:12           ` Jacob Keller
@ 2024-08-21 20:21             ` Vladimir Oltean
  2024-08-21 23:41               ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Vladimir Oltean @ 2024-08-21 20:21 UTC (permalink / raw)
  To: Jacob Keller; +Cc: netdev@vger.kernel.org

On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote:
> Ok. I'll investigate this, and I will send the two fixes for lib/packing
> in my series to implement the support in ice. That would help on our end
> with managing the changes since it avoids an interdependence between
> multiple series in flight.

There's one patch in there which replaces the packing(PACK) call with a
dedicated pack() function, and packing(UNPACK) with unpack(). The idea
being that it helps with const correctness. I still have some mixed
feelings about this, because a multiplexed packing() call is in some
ways more flexible, but apparently others felt bad enough about the
packing() API to tell me about it, and that stuck with me.

I'm mentioning it because if you're going to use the API, you could at
least consider using the const-correct form, so that there's one less
driver to refactor later.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-21 20:21             ` Vladimir Oltean
@ 2024-08-21 23:41               ` Jacob Keller
  2024-08-23  1:41                 ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-21 23:41 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/21/2024 1:21 PM, Vladimir Oltean wrote:
> On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote:
>> Ok. I'll investigate this, and I will send the two fixes for lib/packing
>> in my series to implement the support in ice. That would help on our end
>> with managing the changes since it avoids an interdependence between
>> multiple series in flight.
> 
> There's one patch in there which replaces the packing(PACK) call with a
> dedicated pack() function, and packing(UNPACK) with unpack(). The idea
> being that it helps with const correctness. I still have some mixed
> feelings about this, because a multiplexed packing() call is in some
> ways more flexible, but apparently others felt bad enough about the
> packing() API to tell me about it, and that stuck with me.
> 
> I'm mentioning it because if you're going to use the API, you could at
> least consider using the const-correct form, so that there's one less
> driver to refactor later.

Yep! I've got those patches in my series now. Though I should note that
I did not include any of the patches for the other drivers. I'll CC you
when I send the series out, though it may likely go through our
Intel-Wired-LAN tree first.

I've refactored your self tests into KUnit tests as well!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-21 23:41               ` Jacob Keller
@ 2024-08-23  1:41                 ` Jacob Keller
  2024-08-23 19:53                   ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-23  1:41 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/21/2024 4:41 PM, Jacob Keller wrote:
> 
> 
> On 8/21/2024 1:21 PM, Vladimir Oltean wrote:
>> On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote:
>>> Ok. I'll investigate this, and I will send the two fixes for lib/packing
>>> in my series to implement the support in ice. That would help on our end
>>> with managing the changes since it avoids an interdependence between
>>> multiple series in flight.
>>
>> There's one patch in there which replaces the packing(PACK) call with a
>> dedicated pack() function, and packing(UNPACK) with unpack(). The idea
>> being that it helps with const correctness. I still have some mixed
>> feelings about this, because a multiplexed packing() call is in some
>> ways more flexible, but apparently others felt bad enough about the
>> packing() API to tell me about it, and that stuck with me.
>>
>> I'm mentioning it because if you're going to use the API, you could at
>> least consider using the const-correct form, so that there's one less
>> driver to refactor later.
> 
> Yep! I've got those patches in my series now. Though I should note that
> I did not include any of the patches for the other drivers. I'll CC you
> when I send the series out, though it may likely go through our
> Intel-Wired-LAN tree first.
> 
> I've refactored your self tests into KUnit tests as well!
> 

I was writing additional tests and I think I ran into another issue with
QUIRK_MSB_ON_THE_RIGHT, when the bit offsets are not aligned to a byte
boundary:

When trying to unpack 0x1122334455667788 from the buffer between offsets
106-43, the calculation appears to completely break.

When packing:

> [18:34:50] box_bit_width = 3
> [18:34:50] box_start_bit = 2
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 2
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 5
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 3
> [18:34:50] new_box_start_bit = 1
> [18:34:50] new_box_end_bit = -3
> [18:34:50]     # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264
> [18:34:50]     Expected uval == params->uval, but
> [18:34:50]         uval == 1234605616436508544 (0x1122334455667780)
> [18:34:50]         params->uval == 1234605616436508552 (0x1122334455667788)
> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned
> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20

Notice that the box end bit is now negative. Specifically this is
because the width is smaller than the start bit, so subtraction underflows.

When unpacking:
> [18:34:50] box_bit_width = 3
> [18:34:50] box_start_bit = 2
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 2
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 8
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 0
> [18:34:50] new_box_start_bit = 7
> [18:34:50] new_box_end_bit = 0
> [18:34:50] box_bit_width = 5
> [18:34:50] box_start_bit = 7
> [18:34:50] box_end_bit = 3
> [18:34:50] new_box_start_bit = 1
> [18:34:50] new_box_end_bit = -3
> [18:34:50]     # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264
> [18:34:50]     Expected uval == params->uval, but
> [18:34:50]         uval == 1234605616436508544 (0x1122334455667780)
> [18:34:50]         params->uval == 1234605616436508552 (0x1122334455667788)
> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned
> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20

Specifically, it looks like we basically fail to calculate valid new box
offsets.

What's weird to me is that when the box width is larger than the start
bit position, we just calculate the same exact offsets, so I don't see
why the existing calculations are there at all. Something is obviously
wrong here.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-23  1:41                 ` Jacob Keller
@ 2024-08-23 19:53                   ` Jacob Keller
  2024-08-26 22:03                     ` Jacob Keller
  0 siblings, 1 reply; 12+ messages in thread
From: Jacob Keller @ 2024-08-23 19:53 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 8845 bytes --]



On 8/22/2024 6:41 PM, Jacob Keller wrote:
> 
> 
> On 8/21/2024 4:41 PM, Jacob Keller wrote:
>>
>>
>> On 8/21/2024 1:21 PM, Vladimir Oltean wrote:
>>> On Wed, Aug 21, 2024 at 12:12:00PM -0700, Jacob Keller wrote:
>>>> Ok. I'll investigate this, and I will send the two fixes for lib/packing
>>>> in my series to implement the support in ice. That would help on our end
>>>> with managing the changes since it avoids an interdependence between
>>>> multiple series in flight.
>>>
>>> There's one patch in there which replaces the packing(PACK) call with a
>>> dedicated pack() function, and packing(UNPACK) with unpack(). The idea
>>> being that it helps with const correctness. I still have some mixed
>>> feelings about this, because a multiplexed packing() call is in some
>>> ways more flexible, but apparently others felt bad enough about the
>>> packing() API to tell me about it, and that stuck with me.
>>>
>>> I'm mentioning it because if you're going to use the API, you could at
>>> least consider using the const-correct form, so that there's one less
>>> driver to refactor later.
>>
>> Yep! I've got those patches in my series now. Though I should note that
>> I did not include any of the patches for the other drivers. I'll CC you
>> when I send the series out, though it may likely go through our
>> Intel-Wired-LAN tree first.
>>
>> I've refactored your self tests into KUnit tests as well!
>>
> 
> I was writing additional tests and I think I ran into another issue with
> QUIRK_MSB_ON_THE_RIGHT, when the bit offsets are not aligned to a byte
> boundary:
> 
> When trying to unpack 0x1122334455667788 from the buffer between offsets
> 106-43, the calculation appears to completely break.
> 
> When packing:
> 
>> [18:34:50] box_bit_width = 3
>> [18:34:50] box_start_bit = 2
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 2
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 5
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 3
>> [18:34:50] new_box_start_bit = 1
>> [18:34:50] new_box_end_bit = -3
>> [18:34:50]     # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264
>> [18:34:50]     Expected uval == params->uval, but
>> [18:34:50]         uval == 1234605616436508544 (0x1122334455667780)
>> [18:34:50]         params->uval == 1234605616436508552 (0x1122334455667788)
>> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned
>> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20
> 
> Notice that the box end bit is now negative. Specifically this is
> because the width is smaller than the start bit, so subtraction underflows.
> 
> When unpacking:
>> [18:34:50] box_bit_width = 3
>> [18:34:50] box_start_bit = 2
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 2
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 8
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 0
>> [18:34:50] new_box_start_bit = 7
>> [18:34:50] new_box_end_bit = 0
>> [18:34:50] box_bit_width = 5
>> [18:34:50] box_start_bit = 7
>> [18:34:50] box_end_bit = 3
>> [18:34:50] new_box_start_bit = 1
>> [18:34:50] new_box_end_bit = -3
>> [18:34:50]     # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:264
>> [18:34:50]     Expected uval == params->uval, but
>> [18:34:50]         uval == 1234605616436508544 (0x1122334455667780)
>> [18:34:50]         params->uval == 1234605616436508552 (0x1122334455667788)
>> [18:34:50] [FAILED] msb right, 16 bytes, non-aligned
>> [18:34:50] # packing_test_unpack: pass:19 fail:1 skip:0 total:20
> 
> Specifically, it looks like we basically fail to calculate valid new box
> offsets.
> 
> What's weird to me is that when the box width is larger than the start
> bit position, we just calculate the same exact offsets, so I don't see
> why the existing calculations are there at all. Something is obviously
> wrong here.
> 

Specifically this is making me question:

Does QUIRK_MSB_ON_THE_RIGHT mean that the msb of each bit field is on
the right? or does it mean that every byte in the buffer has its 8 bits
flipped? It seems to document that it applies to the bits within the
byte, but not to the byte ordering.

The code seems to be trying to do a mix of different things though, and
my attempts at fixing it haven't worked properly yet.

I think I was able to get pack() to behave correctly, but unpacking
seems to still be erratic. I tried adding a few more tests but so far
haven't figured out what is wrong with the unpacking code.

I think the attempt to re-use adjust_for_msb_right on both unpacking and
packing isn't working correctly. When unpacking, I think we end up
masking the wrong bits.

With the attached patch, I was able to get my test case for packing
fixed, but the following couple of tests fail:

>         {
>                 .desc = "msb right, 16 bytes, non-aligned",
>                 PBUF(0x00, 0x00, 0x00, 0x91, 0x88, 0x59, 0x44, 0xd5,
>                      0xcc, 0x3d, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00),
>                 .uval = 0x1122334455667788,
>                 .start_bit = 106,
>                 .end_bit = 43,
>                 .quirks = QUIRK_MSB_ON_THE_RIGHT,
>         },

packing for this test works with the attached patch, but fails to unpack:

 > [12:49:53]     # packing_test_unpack: EXPECTATION FAILED at
lib/packing_test.c:282
> [12:49:53]     Expected uval == params->uval, but
> [12:49:53]         uval == 1234605616436508544 (0x1122334455667780)
> [12:49:53]         params->uval == 1234605616436508552 (0x1122334455667788)


the last few bits don't seem to be included properly.

I also tried a test case with all bits of the u64 set:

>         {
>                 .desc = "msb right, 16 bytes, non-aligned, 0xff",
>                 PBUF(0x00, 0x00, 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff,
>                      0xff, 0xff, 0x1f, 0x00, 0x00, 0x00, 0x00, 0x00),
>                 .uval = 0xffffffffffffffff,
>                 .start_bit = 106,
>                 .end_bit = 43,
>                 .quirks = QUIRK_MSB_ON_THE_RIGHT,
>         },


> [12:49:53]     # packing_test_unpack: EXPECTATION FAILED at lib/packing_test.c:282
> [12:49:53]     Expected uval == params->uval, but
> [12:49:53]         uval == 2305843009213693944 (0x1ffffffffffffff8)
> [12:49:53]         params->uval == -1 (0xffffffffffffffff)

In this case, again it seems like the bits on the tail end partial bytes
don't get unpacked properly.

This test also passes packing with the attempted fix.

I so far think the likely issue is with the way we handle the box and
mask offsets. Somehow we must be shifting things in a strange way that
causes us to discard bits, but only when we deal with the
QUIRK_MSB_ON_THE_RIGHT.

Thanks,
Jake

[-- Attachment #2: maybe-fixed-packing.patch --]
[-- Type: text/plain, Size: 1388 bytes --]

diff --git c/lib/packing.c i/lib/packing.c
index 80c95dacbfaa..c7aab00b106c 100644
--- c/lib/packing.c
+++ i/lib/packing.c
@@ -14,14 +14,16 @@ static void adjust_for_msb_right_quirk(u64 *to_write, size_t *box_start_bit,
 {
 	size_t box_bit_width = *box_start_bit - *box_end_bit + 1;
 	size_t new_box_start_bit, new_box_end_bit;
+	u8 new_box_mask;
 
-	*to_write >>= *box_end_bit;
-	*to_write = bitrev8(*to_write) >> (8 - box_bit_width);
-	*to_write <<= *box_end_bit;
 
-	new_box_end_bit   = box_bit_width - *box_start_bit - 1;
-	new_box_start_bit = box_bit_width - *box_end_bit - 1;
-	*box_mask = GENMASK_ULL(new_box_start_bit, new_box_end_bit);
+	*to_write = bitrev8(*to_write);
+
+	new_box_mask = bitrev8(*box_mask);
+	new_box_end_bit = ffs(new_box_mask) - 1;
+	new_box_start_bit = fls(new_box_mask) - 1;
+
+	*box_mask = new_box_mask;
 	*box_start_bit = new_box_start_bit;
 	*box_end_bit   = new_box_end_bit;
 }
@@ -170,13 +172,13 @@ int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen,
 		/* Write to pbuf, read from uval */
 		pval = uval & proj_mask;
 		pval >>= proj_end_bit;
+		pval <<= box_end_bit;
 		if (quirks & QUIRK_MSB_ON_THE_RIGHT)
 			adjust_for_msb_right_quirk(&pval,
 						   &box_start_bit,
 						   &box_end_bit,
 						   &box_mask);
 
-		pval <<= box_end_bit;
 		((u8 *)pbuf)[box_addr] &= ~box_mask;
 		((u8 *)pbuf)[box_addr] |= pval;
 	}

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST
  2024-08-23 19:53                   ` Jacob Keller
@ 2024-08-26 22:03                     ` Jacob Keller
  0 siblings, 0 replies; 12+ messages in thread
From: Jacob Keller @ 2024-08-26 22:03 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: netdev@vger.kernel.org



On 8/23/2024 12:53 PM, Jacob Keller wrote:
> I so far think the likely issue is with the way we handle the box and
> mask offsets. Somehow we must be shifting things in a strange way that
> causes us to discard bits, but only when we deal with the
> QUIRK_MSB_ON_THE_RIGHT.
> 
> Thanks,
> Jake

I eventually figured out what i believe is wrong here, and have a fix.
I'm going to include it in the series I send for using <linux/packing.h>
in the ice driver, along with the additional KUnit tests.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-08-26 22:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-15 21:16 lib/packing.c behaving weird if buffer length is not multiple of 4 with QUIRK_LSW32_IS_FIRST Jacob Keller
2024-08-16 23:37 ` Jacob Keller
2024-08-18 13:29   ` Vladimir Oltean
2024-08-19 18:45     ` Jacob Keller
2024-08-19 21:53       ` Jacob Keller
2024-08-21 13:58         ` Vladimir Oltean
2024-08-21 19:12           ` Jacob Keller
2024-08-21 20:21             ` Vladimir Oltean
2024-08-21 23:41               ` Jacob Keller
2024-08-23  1:41                 ` Jacob Keller
2024-08-23 19:53                   ` Jacob Keller
2024-08-26 22:03                     ` Jacob Keller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).