* [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
@ 2023-12-18 8:27 Suman Ghosh
2023-12-18 20:44 ` Jacob Keller
0 siblings, 1 reply; 6+ messages in thread
From: Suman Ghosh @ 2023-12-18 8:27 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, sgoutham, sbhatta, jerinj, gakula,
hkelam, lcherian, netdev, linux-kernel
Cc: Suman Ghosh
A couple of structures were not marked as __packed, which may have some
performance implications. This patch fixes that by marking them as
__packed.
Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
---
drivers/net/ethernet/marvell/octeontx2/af/npc.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
index ab3e39eef2eb..8c0732c9a7ee 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
@@ -528,7 +528,7 @@ struct npc_lt_def {
u8 ltype_mask;
u8 ltype_match;
u8 lid;
-};
+} __packed;
struct npc_lt_def_ipsec {
u8 ltype_mask;
@@ -536,7 +536,7 @@ struct npc_lt_def_ipsec {
u8 lid;
u8 spi_offset;
u8 spi_nz;
-};
+} __packed;
struct npc_lt_def_apad {
u8 ltype_mask;
--
2.25.1
* Re: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
2023-12-18 8:27 [net PATCH] octeontx2-af: Fix marking couple of structure as __packed Suman Ghosh
@ 2023-12-18 20:44 ` Jacob Keller
2023-12-19 14:22 ` [EXT] " Suman Ghosh
2023-12-19 15:26 ` David Laight
0 siblings, 2 replies; 6+ messages in thread
From: Jacob Keller @ 2023-12-18 20:44 UTC (permalink / raw)
To: Suman Ghosh, davem, edumazet, kuba, pabeni, sgoutham, sbhatta,
jerinj, gakula, hkelam, lcherian, netdev, linux-kernel
On 12/18/2023 12:27 AM, Suman Ghosh wrote:
> Couple of structures was not marked as __packed which may have some
> performance implication. This patch fixes the same and mark them as
> __packed.
Not sure I follow why lack of __packed would have performance
implications? I get that __packed is important to ensure layout is
correct or to ensure the whole structure has the right size rather than
unexpected gaps. I'd guess maybe because the structure's size would
include padding without __packed, leading to a lot of gaps when
combining several structures together...
I did test on my system with pahole, and even without __packed, I don't
get any gaps in the npc_lt_def_cfg structure:
> struct npc_lt_def_cfg {
> struct npc_lt_def rx_ol2; /* 0 3 */
> struct npc_lt_def rx_oip4; /* 3 3 */
> struct npc_lt_def rx_iip4; /* 6 3 */
> struct npc_lt_def rx_oip6; /* 9 3 */
> struct npc_lt_def rx_iip6; /* 12 3 */
> struct npc_lt_def rx_otcp; /* 15 3 */
> struct npc_lt_def rx_itcp; /* 18 3 */
> struct npc_lt_def rx_oudp; /* 21 3 */
> struct npc_lt_def rx_iudp; /* 24 3 */
> struct npc_lt_def rx_osctp; /* 27 3 */
> struct npc_lt_def rx_isctp; /* 30 3 */
> struct npc_lt_def_ipsec rx_ipsec[2]; /* 33 10 */
> struct npc_lt_def pck_ol2; /* 43 3 */
> struct npc_lt_def pck_oip4; /* 46 3 */
> struct npc_lt_def pck_oip6; /* 49 3 */
> struct npc_lt_def pck_iip4; /* 52 3 */
> struct npc_lt_def_apad rx_apad0; /* 55 4 */
> struct npc_lt_def_apad rx_apad1; /* 59 4 */
> struct npc_lt_def_color ovlan; /* 63 5 */
> /* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
> struct npc_lt_def_color ivlan; /* 68 5 */
> struct npc_lt_def_color rx_gen0_color; /* 73 5 */
> struct npc_lt_def_color rx_gen1_color; /* 78 5 */
> struct npc_lt_def_et rx_et[2]; /* 83 10 */
>
> /* size: 93, cachelines: 2, members: 23 */
> /* last cacheline: 29 bytes */
> };
However that may not be true across all compilers etc. Also all the
other structures are __packed. Makes sense.
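One way to guard against the compiler dependence mentioned above is a build-time check. The following is a minimal sketch, not code from the driver: struct lt_def mirrors the three u8 members of npc_lt_def shown in the patch, while lt_def_pair and lt_def_pair_size() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Mirrors the members of struct npc_lt_def shown in the patch. */
struct lt_def {
	u8 ltype_mask;
	u8 ltype_match;
	u8 lid;
};

/* Hypothetical container: two entries back to back, as in the pahole dump. */
struct lt_def_pair {
	struct lt_def first;
	struct lt_def second;
};

/* Fail the build if some ABI pads the 3-byte struct or leaves a gap. */
static_assert(sizeof(struct lt_def) == 3, "lt_def must be 3 bytes");
static_assert(offsetof(struct lt_def_pair, second) == 3,
	      "no gap expected between entries");

/* Illustrative accessor so the layout can also be checked at run time. */
size_t lt_def_pair_size(void)
{
	return sizeof(struct lt_def_pair);
}
```

In kernel code the same idea is usually spelled with static_assert() or BUILD_BUG_ON(); whether such checks would be wanted in this driver is an open question, not something the thread settles.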
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>
> Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data")
> Signed-off-by: Suman Ghosh <sumang@marvell.com>
> ---
> drivers/net/ethernet/marvell/octeontx2/af/npc.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
> index ab3e39eef2eb..8c0732c9a7ee 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/npc.h
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/npc.h
> @@ -528,7 +528,7 @@ struct npc_lt_def {
> u8 ltype_mask;
> u8 ltype_match;
> u8 lid;
> -};
> +} __packed;
>
> struct npc_lt_def_ipsec {
> u8 ltype_mask;
> @@ -536,7 +536,7 @@ struct npc_lt_def_ipsec {
> u8 lid;
> u8 spi_offset;
> u8 spi_nz;
> -};
> +} __packed;
>
> struct npc_lt_def_apad {
> u8 ltype_mask;
* RE: [EXT] Re: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
2023-12-18 20:44 ` Jacob Keller
@ 2023-12-19 14:22 ` Suman Ghosh
2023-12-19 15:26 ` David Laight
1 sibling, 0 replies; 6+ messages in thread
From: Suman Ghosh @ 2023-12-19 14:22 UTC (permalink / raw)
To: Jacob Keller, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, Sunil Kovvuri Goutham,
Subbaraya Sundeep Bhatta, Jerin Jacob Kollanukkaran,
Geethasowjanya Akula, Hariprasad Kelam, Linu Cherian,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
>Not sure I follow why lack of __packed would have performance
>implications? I get that __packed is important to ensure layout is
>correct or to ensure the whole structure has the right size rather than
>unexpected gaps. I'd guess maybe because the structures size would
>include padding without __packed, leading to a lot of gaps when
>combining several structures together...
>
>I did test on my system with pahole, and even without __packed, I don't
>get any gaps in the npc_lt_def_cfg structure:
>
>
>> struct npc_lt_def_cfg {
>> struct npc_lt_def rx_ol2; /* 0 3 */
>> struct npc_lt_def rx_oip4; /* 3 3 */
>> struct npc_lt_def rx_iip4; /* 6 3 */
>> struct npc_lt_def rx_oip6; /* 9 3 */
>> struct npc_lt_def rx_iip6; /* 12 3 */
>> struct npc_lt_def rx_otcp; /* 15 3 */
>> struct npc_lt_def rx_itcp; /* 18 3 */
>> struct npc_lt_def rx_oudp; /* 21 3 */
>> struct npc_lt_def rx_iudp; /* 24 3 */
>> struct npc_lt_def rx_osctp; /* 27 3 */
>> struct npc_lt_def rx_isctp; /* 30 3 */
>> struct npc_lt_def_ipsec rx_ipsec[2]; /* 33 10 */
>> struct npc_lt_def pck_ol2; /* 43 3 */
>> struct npc_lt_def pck_oip4; /* 46 3 */
>> struct npc_lt_def pck_oip6; /* 49 3 */
>> struct npc_lt_def pck_iip4; /* 52 3 */
>> struct npc_lt_def_apad rx_apad0; /* 55 4 */
>> struct npc_lt_def_apad rx_apad1; /* 59 4 */
>> struct npc_lt_def_color ovlan; /* 63 5 */
>> /* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
>> struct npc_lt_def_color ivlan; /* 68 5 */
>> struct npc_lt_def_color rx_gen0_color; /* 73 5 */
>> struct npc_lt_def_color rx_gen1_color; /* 78 5 */
>> struct npc_lt_def_et rx_et[2]; /* 83 10 */
>>
>> /* size: 93, cachelines: 2, members: 23 */
>> /* last cacheline: 29 bytes */
>> };
>
>
>However that may not be true across all compilers etc. Also all the
>other structures are __packed. Makes sense.
>
>Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>
[Suman] I agree, "having performance impact" is a wrong statement. I will update the same in v2.
* RE: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
2023-12-18 20:44 ` Jacob Keller
2023-12-19 14:22 ` [EXT] " Suman Ghosh
@ 2023-12-19 15:26 ` David Laight
2023-12-19 21:05 ` Jacob Keller
1 sibling, 1 reply; 6+ messages in thread
From: David Laight @ 2023-12-19 15:26 UTC (permalink / raw)
To: 'Jacob Keller', Suman Ghosh, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
sgoutham@marvell.com, sbhatta@marvell.com, jerinj@marvell.com,
gakula@marvell.com, hkelam@marvell.com, lcherian@marvell.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
From: Jacob Keller
> Sent: 18 December 2023 20:44
>
> On 12/18/2023 12:27 AM, Suman Ghosh wrote:
> > Couple of structures was not marked as __packed which may have some
> > performance implication. This patch fixes the same and mark them as
> > __packed.
>
> Not sure I follow why lack of __packed would have performance
> implications? I get that __packed is important to ensure layout is
> correct or to ensure the whole structure has the right size rather than
> unexpected gaps. I'd guess maybe because the structures size would
> include padding without __packed, leading to a lot of gaps when
> combining several structures together...
>
> I did test on my system with pahole, and even without __packed, I don't
> get any gaps in the npc_lt_def_cfg structure:
>
>
> > struct npc_lt_def_cfg {
> > struct npc_lt_def rx_ol2; /* 0 3 */
> > struct npc_lt_def rx_oip4; /* 3 3 */
> > struct npc_lt_def rx_iip4; /* 6 3 */
> > struct npc_lt_def rx_oip6; /* 9 3 */
> > struct npc_lt_def rx_iip6; /* 12 3 */
> > struct npc_lt_def rx_otcp; /* 15 3 */
> > struct npc_lt_def rx_itcp; /* 18 3 */
> > struct npc_lt_def rx_oudp; /* 21 3 */
> > struct npc_lt_def rx_iudp; /* 24 3 */
> > struct npc_lt_def rx_osctp; /* 27 3 */
> > struct npc_lt_def rx_isctp; /* 30 3 */
> > struct npc_lt_def_ipsec rx_ipsec[2]; /* 33 10 */
> > struct npc_lt_def pck_ol2; /* 43 3 */
> > struct npc_lt_def pck_oip4; /* 46 3 */
> > struct npc_lt_def pck_oip6; /* 49 3 */
> > struct npc_lt_def pck_iip4; /* 52 3 */
> > struct npc_lt_def_apad rx_apad0; /* 55 4 */
> > struct npc_lt_def_apad rx_apad1; /* 59 4 */
> > struct npc_lt_def_color ovlan; /* 63 5 */
> > /* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
> > struct npc_lt_def_color ivlan; /* 68 5 */
> > struct npc_lt_def_color rx_gen0_color; /* 73 5 */
> > struct npc_lt_def_color rx_gen1_color; /* 78 5 */
> > struct npc_lt_def_et rx_et[2]; /* 83 10 */
> >
> > /* size: 93, cachelines: 2, members: 23 */
> > /* last cacheline: 29 bytes */
> > };
>
>
> However that may not be true across all compilers etc. Also all the
> other structures are __packed. Makes sense.
Or not - maybe all the __packed should be removed instead!
Unless these structures (or any others) appear in 'messages' which
get transferred between systems they really shouldn't be __packed.
And a 93 byte 'message' with all those fields seems rather odd.
The above breakdown seems to imply everything is 'unsigned char'
so the __packed makes no difference.
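The point about 'unsigned char' members can be checked directly. This is a small sketch, assuming GCC or Clang; PACKED is a local stand-in for the kernel's __packed macro, and lt_def_plain, lt_def_packed and packed_is_redundant() are hypothetical names that only mirror the three u8 members of npc_lt_def from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's __packed (a GCC/Clang attribute). */
#define PACKED __attribute__((__packed__))

typedef uint8_t u8;

struct lt_def_plain {
	u8 ltype_mask;
	u8 ltype_match;
	u8 lid;
};

struct lt_def_packed {
	u8 ltype_mask;
	u8 ltype_match;
	u8 lid;
} PACKED;

/* The packed variant is guaranteed to have no padding and byte alignment. */
static_assert(sizeof(struct lt_def_packed) == 3, "packed size is 3");
static_assert(_Alignof(struct lt_def_packed) == 1, "packed alignment is 1");

/* Returns 1 when the attribute changed nothing on this ABI, 0 otherwise. */
int packed_is_redundant(void)
{
	return sizeof(struct lt_def_plain) == sizeof(struct lt_def_packed) &&
	       _Alignof(struct lt_def_plain) == _Alignof(struct lt_def_packed) &&
	       offsetof(struct lt_def_plain, lid) ==
	       offsetof(struct lt_def_packed, lid);
}
```

On common ABIs such as x86-64 and arm64 the function is expected to return 1; an ABI that pads small structures could still give a different answer for the plain variant.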
Using __packed requires the compiler to generate byte loads/stores
with shifts (etc) on many architectures and should really be avoided
unless it is absolutely needed for binary compatibility.
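To illustrate where the cost comes from, here is a sketch (GCC/Clang assumed, all names hypothetical) with a wider member. Once the struct is packed, its alignment drops to 1 and the 32-bit field can sit at an odd offset, so the compiler can no longer assume an aligned access when reading it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's __packed (a GCC/Clang attribute). */
#define PACKED __attribute__((__packed__))

struct hdr_plain {
	uint8_t tag;
	uint32_t len;	/* padded so that len is naturally aligned */
};

struct hdr_packed {
	uint8_t tag;
	uint32_t len;	/* sits at offset 1, so it may be misaligned */
} PACKED;

static_assert(offsetof(struct hdr_packed, len) == 1, "len follows tag directly");
static_assert(_Alignof(struct hdr_packed) == 1, "packed struct is byte aligned");

/* The compiler must emit an access that is safe for any address here. */
uint32_t read_len_packed(const struct hdr_packed *h)
{
	return h->len;
}

/* Here the compiler may assume len is aligned and use a plain word load. */
uint32_t read_len_plain(const struct hdr_plain *h)
{
	return h->len;
}

size_t plain_len_offset(void)
{
	return offsetof(struct hdr_plain, len);
}
```

Comparing the generated assembly of the two read functions on a strict-alignment target is one way to see the difference; on x86 the two are typically identical because unaligned loads are cheap there.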
Even then if the problem is a 64bit field that only needs to be
32bit aligned (as is common for some compat32 code) then the 64bit
fields should be marked as being 32bit aligned.
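A sketch of that alternative, assuming GCC or Clang: instead of packing the whole structure, only the 64-bit field gets a reduced alignment through a typedef. The names u64_align4 and compat_msg are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * A 64-bit integer type that only requires 4-byte alignment.
 * With GCC/Clang, the aligned attribute on a typedef may lower alignment.
 */
typedef uint64_t u64_align4 __attribute__((aligned(4)));

static_assert(_Alignof(u64_align4) == 4, "typedef lowers alignment to 4");

/* Hypothetical message layout: no padding between id and value. */
struct compat_msg {
	uint32_t id;
	u64_align4 value;
};

size_t compat_value_offset(void)
{
	return offsetof(struct compat_msg, value);
}

size_t compat_msg_size(void)
{
	return sizeof(struct compat_msg);
}

size_t compat_msg_align(void)
{
	return _Alignof(struct compat_msg);
}
```

The structure keeps 4-byte alignment overall, so the 32-bit member can still be accessed with ordinary word loads, and only the 64-bit member needs special care.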
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
* Re: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
2023-12-19 15:26 ` David Laight
@ 2023-12-19 21:05 ` Jacob Keller
2023-12-20 13:04 ` [EXT] " Suman Ghosh
0 siblings, 1 reply; 6+ messages in thread
From: Jacob Keller @ 2023-12-19 21:05 UTC (permalink / raw)
To: David Laight, Suman Ghosh, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
sgoutham@marvell.com, sbhatta@marvell.com, jerinj@marvell.com,
gakula@marvell.com, hkelam@marvell.com, lcherian@marvell.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
On 12/19/2023 7:26 AM, David Laight wrote:
> From: Jacob Keller
>> Sent: 18 December 2023 20:44
>>
>> On 12/18/2023 12:27 AM, Suman Ghosh wrote:
>>> Couple of structures was not marked as __packed which may have some
>>> performance implication. This patch fixes the same and mark them as
>>> __packed.
>>
>> Not sure I follow why lack of __packed would have performance
>> implications? I get that __packed is important to ensure layout is
>> correct or to ensure the whole structure has the right size rather than
>> unexpected gaps. I'd guess maybe because the structures size would
>> include padding without __packed, leading to a lot of gaps when
>> combining several structures together...
>>
>> I did test on my system with pahole, and even without __packed, I don't
>> get any gaps in the npc_lt_def_cfg structure:
>>
>>
>>> struct npc_lt_def_cfg {
>>> struct npc_lt_def rx_ol2; /* 0 3 */
>>> struct npc_lt_def rx_oip4; /* 3 3 */
>>> struct npc_lt_def rx_iip4; /* 6 3 */
>>> struct npc_lt_def rx_oip6; /* 9 3 */
>>> struct npc_lt_def rx_iip6; /* 12 3 */
>>> struct npc_lt_def rx_otcp; /* 15 3 */
>>> struct npc_lt_def rx_itcp; /* 18 3 */
>>> struct npc_lt_def rx_oudp; /* 21 3 */
>>> struct npc_lt_def rx_iudp; /* 24 3 */
>>> struct npc_lt_def rx_osctp; /* 27 3 */
>>> struct npc_lt_def rx_isctp; /* 30 3 */
>>> struct npc_lt_def_ipsec rx_ipsec[2]; /* 33 10 */
>>> struct npc_lt_def pck_ol2; /* 43 3 */
>>> struct npc_lt_def pck_oip4; /* 46 3 */
>>> struct npc_lt_def pck_oip6; /* 49 3 */
>>> struct npc_lt_def pck_iip4; /* 52 3 */
>>> struct npc_lt_def_apad rx_apad0; /* 55 4 */
>>> struct npc_lt_def_apad rx_apad1; /* 59 4 */
>>> struct npc_lt_def_color ovlan; /* 63 5 */
>>> /* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
>>> struct npc_lt_def_color ivlan; /* 68 5 */
>>> struct npc_lt_def_color rx_gen0_color; /* 73 5 */
>>> struct npc_lt_def_color rx_gen1_color; /* 78 5 */
>>> struct npc_lt_def_et rx_et[2]; /* 83 10 */
>>>
>>> /* size: 93, cachelines: 2, members: 23 */
>>> /* last cacheline: 29 bytes */
>>> };
>>
>>
>> However that may not be true across all compilers etc. Also all the
>> other structures are __packed. Makes sense.
>
> Or not - maybe all the __packed should be removed instead!
>
> Unless these structures (or any others) appear in 'messages' which
> get transferred between systems they really shouldn't be __packed.
> And a 93 byte 'message' with all those fields seems rather odd.
>
> The above breakdown seems to imply everything is 'unsigned char'
> so the __packed makes no difference.
>
> Using __packed requires the compiler generate byte loads/store
> with shifts (etc) on many architectures and should really be avoided
> unless it is absolutely needed for binary compatibility.
>
> Even then if the problem is a 64bit field that only needs to be
> 32bit aligned (as is common for some compat32 code) then the 64bit
> fields should be marked as being 32bit aligned.
>
> David
>
Right. Typically packed is only required when dealing with something
where the exact binary layout matters (i.e. copying to/from hardware or
across systems in such a way that the layout might change with different
compilers/arch).
If that isn't how this structure is used, then ya, removing __packed
seems reasonable. And at least for one system I see no difference in the
actual generated layout, making __packed redundant.
However, it's not clear to me at a glance how this structure is used and
whether it really is copied between places where binary compatibility is
a requirement.
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
* RE: [EXT] Re: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed
2023-12-19 21:05 ` Jacob Keller
@ 2023-12-20 13:04 ` Suman Ghosh
0 siblings, 0 replies; 6+ messages in thread
From: Suman Ghosh @ 2023-12-20 13:04 UTC (permalink / raw)
To: Jacob Keller, David Laight, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
Sunil Kovvuri Goutham, Subbaraya Sundeep Bhatta,
Jerin Jacob Kollanukkaran, Geethasowjanya Akula, Hariprasad Kelam,
Linu Cherian, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
>>>
>>> However that may not be true across all compilers etc. Also all the
>>> other structures are __packed. Makes sense.
>>
>> Or not - maybe all the __packed should be removed instead!
>>
>> Unless these structures (or any others) appear in 'messages' which get
>> transferred between systems they really shouldn't be __packed.
>> And a 93 byte 'message' with all those fields seems rather odd.
>>
>> The above breakdown seems to imply everything is 'unsigned char'
>> so the __packed makes no difference.
>>
>> Using __packed requires the compiler generate byte loads/store with
>> shifts (etc) on many architectures and should really be avoided unless
>> it is absolutely needed for binary compatibility.
>>
>> Even then if the problem is a 64bit field that only needs to be 32bit
>> aligned (as is common for some compat32 code) then the 64bit fields
>> should be marked as being 32bit aligned.
>>
>> David
>>
>Right. Typically packed is only required when dealing with something
>where the exact binary layout matters (i.e. copying to/from hardware or
>across systems in such a way that the layout might change with different
>compilers/arch).
>
>If that isn't how this structure is used, then ya, removing __packed
>seems reasonable. And at least for one system I see no difference in the
>actual generated layout, making __packed redundant.
>
>However, its not clear to me at a glance how this structure is used and
>whether it really is copied between places where binary compatibility is
>a requirement.
>
>> -
>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>> MK1 1PT, UK Registration No: 1397386 (Wales)
[Suman] Yes, these structures are copied from firmware. They are needed to inform the kernel about parsing information required by the hardware. That is why the structures are packed, and these two were missed.
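For readers unfamiliar with the pattern, copying a firmware-provided byte layout into a packed structure might look roughly like the sketch below. This is not the driver's code: load_lt_def() and the blob contents are hypothetical, PACKED stands in for the kernel's __packed, and only the three members of npc_lt_def shown in the patch are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the kernel's __packed (a GCC/Clang attribute). */
#define PACKED __attribute__((__packed__))

typedef uint8_t u8;

/* Mirrors the members of struct npc_lt_def shown in the patch. */
struct lt_def {
	u8 ltype_mask;
	u8 ltype_match;
	u8 lid;
} PACKED;

/* The in-memory layout must match the byte layout of the blob. */
static_assert(sizeof(struct lt_def) == 3, "lt_def must match blob layout");

/*
 * Hypothetical helper: copy one entry out of a firmware blob.
 * Returns 0 on success, -1 if the blob is too short.
 */
int load_lt_def(const u8 *blob, size_t len, struct lt_def *out)
{
	if (len < sizeof(*out))
		return -1;
	memcpy(out, blob, sizeof(*out));
	return 0;
}
```

Because the copy is a raw memcpy, any padding the compiler inserted would shift later fields, which is the binary-compatibility concern raised earlier in the thread.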