* [PATCH net] net: dsa: validate source trunk against lags_len
@ 2025-07-31 5:35 Luke Howard
2025-07-31 8:23 ` Dawid Osuchowski
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Luke Howard @ 2025-07-31 5:35 UTC (permalink / raw)
To: netdev
A DSA frame with an invalid source trunk ID could cause an out-of-bounds read
access of dst->lags.
This patch adds a check to dsa_lag_by_id() to validate the LAG ID is not zero,
and is less than or equal to dst->lags_len. (The LAG ID is derived by adding
one to the source trunk ID.)
Note: this is in the fast path for any frames within a trunk.
Signed-off-by: Luke Howard <lukeh@padl.com>
---
include/net/dsa.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 2e148823c366c..67672c5ff22e5 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -180,6 +180,9 @@ struct dsa_switch_tree {
static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst,
unsigned int id)
{
+ if (unlikely(id == 0 || id > dst->lags_len))
+ return NULL;
+
/* DSA LAG IDs are one-based, dst->lags is zero-based */
return dst->lags[id - 1];
}
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-07-31 5:35 [PATCH net] net: dsa: validate source trunk against lags_len Luke Howard @ 2025-07-31 8:23 ` Dawid Osuchowski 2025-07-31 9:07 ` Vladimir Oltean 2025-07-31 9:56 ` [PATCH v2 net-next] " Luke Howard 2 siblings, 0 replies; 11+ messages in thread From: Dawid Osuchowski @ 2025-07-31 8:23 UTC (permalink / raw) To: Luke Howard, netdev On 2025-07-31 7:35 AM, Luke Howard wrote: > A DSA frame with an invalid source trunk ID could cause an out-of-bounds read > access of dst->lags. > > This patch adds a check to dsa_lag_by_id() to validate the LAG ID is not zero, checkpatch is saying this line is not wrapped at 75 columns > and is less than or equal to dst->lags_len. (The LAG ID is derived by adding > one to the source trunk ID.) > > Note: this is in the fast path for any frames within a trunk. > > Signed-off-by: Luke Howard <lukeh@padl.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Thanks, Dawid > --- > include/net/dsa.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/net/dsa.h b/include/net/dsa.h > index 2e148823c366c..67672c5ff22e5 100644 > --- a/include/net/dsa.h > +++ b/include/net/dsa.h > @@ -180,6 +180,9 @@ struct dsa_switch_tree { > static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst, > unsigned int id) > { > + if (unlikely(id == 0 || id > dst->lags_len)) > + return NULL; > + > /* DSA LAG IDs are one-based, dst->lags is zero-based */ > return dst->lags[id - 1]; > } ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-07-31 5:35 [PATCH net] net: dsa: validate source trunk against lags_len Luke Howard 2025-07-31 8:23 ` Dawid Osuchowski @ 2025-07-31 9:07 ` Vladimir Oltean 2025-07-31 9:55 ` Luke Howard 2025-07-31 9:56 ` [PATCH v2 net-next] " Luke Howard 2 siblings, 1 reply; 11+ messages in thread From: Vladimir Oltean @ 2025-07-31 9:07 UTC (permalink / raw) To: Luke Howard; +Cc: netdev Hello Luke, On Thu, Jul 31, 2025 at 03:35:34PM +1000, Luke Howard wrote: > A DSA frame with an invalid source trunk ID could cause an out-of-bounds read > access of dst->lags. > > This patch adds a check to dsa_lag_by_id() to validate the LAG ID is not zero, > and is less than or equal to dst->lags_len. (The LAG ID is derived by adding > one to the source trunk ID.) > > Note: this is in the fast path for any frames within a trunk. > > Signed-off-by: Luke Howard <lukeh@padl.com> > --- > include/net/dsa.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/net/dsa.h b/include/net/dsa.h > index 2e148823c366c..67672c5ff22e5 100644 > --- a/include/net/dsa.h > +++ b/include/net/dsa.h > @@ -180,6 +180,9 @@ struct dsa_switch_tree { > static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst, > unsigned int id) > { > + if (unlikely(id == 0 || id > dst->lags_len)) > + return NULL; > + > /* DSA LAG IDs are one-based, dst->lags is zero-based */ > return dst->lags[id - 1]; > } > -- > 2.43.0 1. You need to add a Fixes: tag, like the following: Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") 2. The problem statement must not remain in the theoretical realm if you submit a patch intended as a bug fix. Normally the tagger is used to process data coming from the switch hardware, so to trigger an out-of-bounds array access would imply that the problem is elsewhere. That, or you can make it clear that the patch is to prevent a modified dsa_loop from crashing when receiving crafted packets over a regular network interface. But using dsa_loop with a modified dsa_loop_get_protocol() return value is a developer tool which involves modifying kernel sources. I would say any fix that doesn't fix any real life problem in production systems should be sent to 'net-next', not to 'net'. This is in accordance with Documentation/process/stable-kernel-rules.rst. 3. As per Documentation/process/submitting-patches.rst, you should replace the wording "This patch adds" with the imperative mood. 4. Please use ./scripts/get_maintainer.pl to generate the recipient list, don't send patches just to the mailing list, reviewers might miss them. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-07-31 9:07 ` Vladimir Oltean @ 2025-07-31 9:55 ` Luke Howard 2025-07-31 11:37 ` Vladimir Oltean 0 siblings, 1 reply; 11+ messages in thread From: Luke Howard @ 2025-07-31 9:55 UTC (permalink / raw) To: Vladimir Oltean; +Cc: netdev, Andrew Lunn, Ryan Wilkins Hi Vladimir, Thanks for helping me walk through my first kernel patch (I thought I would start with a one line one!). > 1. You need to add a Fixes: tag, like the following: > Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") Noted. > 2. The problem statement must not remain in the theoretical realm if you > submit a patch intended as a bug fix. Normally the tagger is used to > process data coming from the switch hardware, so to trigger an > out-of-bounds array access would imply that the problem is elsewhere. > That, or you can make it clear that the patch is to prevent a > modified dsa_loop from crashing when receiving crafted packets over a > regular network interface. But using dsa_loop with a modified > dsa_loop_get_protocol() return value is a developer tool which > involves modifying kernel sources. I would say any fix that doesn't > fix any real life problem in production systems should be sent to > 'net-next', not to 'net'. This is in accordance with > Documentation/process/stable-kernel-rules.rst. Thanks for the clarification, I was unsure which to send to: I’ll send the revised patch to net-next instead. Ryan (on cc) saw this crash with a Marvell switch, using some not yet submitted patches to support in-band switch management without MDIO. Exactly what caused the switch to deliver a malformed DSA packet is unknown, but it seems reasonable to expect the kernel to be resilient to this (although one could make an argument that the trust boundary extends to the switch chip). > 3. As per Documentation/process/submitting-patches.rst, you should > replace the wording "This patch adds" with the imperative mood. Noted. > 4. Please use ./scripts/get_maintainer.pl to generate the recipient > list, don't send patches just to the mailing list, reviewers might > miss them. Noted, I just added Andrew for now. Cheers, Luke ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-07-31 9:55 ` Luke Howard @ 2025-07-31 11:37 ` Vladimir Oltean [not found] ` <C867697B-7F5B-4500-8098-9C44630D7930@padl.com> 0 siblings, 1 reply; 11+ messages in thread From: Vladimir Oltean @ 2025-07-31 11:37 UTC (permalink / raw) To: Luke Howard; +Cc: netdev, Andrew Lunn, Ryan Wilkins On Thu, Jul 31, 2025 at 07:55:06PM +1000, Luke Howard wrote: > Hi Vladimir, > > Thanks for helping me walk through my first kernel patch (I thought I would start with a one line one!). > > > 1. You need to add a Fixes: tag, like the following: > > Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") > > Noted. Further clarification: add the Fixes: tag if you target the patch to the 'net' tree (based on which it will also be backported to stable kernels) or if the bug was introduced during this kernel development cycle and is only in 'net-next'. A patch with a 'Fixes' tag for an old commit is incompatible with submitting it via the 'net-next' tree. In my previous reply I was only presenting all possibilities; it's still not clear to me whether this patch should go through net or net-next, because I haven't understood the circumstances that lead to it yet. > > 2. The problem statement must not remain in the theoretical realm if you > > submit a patch intended as a bug fix. Normally the tagger is used to > > process data coming from the switch hardware, so to trigger an > > out-of-bounds array access would imply that the problem is elsewhere. > > That, or you can make it clear that the patch is to prevent a > > modified dsa_loop from crashing when receiving crafted packets over a > > regular network interface. But using dsa_loop with a modified > > dsa_loop_get_protocol() return value is a developer tool which > > involves modifying kernel sources. I would say any fix that doesn't > > fix any real life problem in production systems should be sent to > > 'net-next', not to 'net'. This is in accordance with > > Documentation/process/stable-kernel-rules.rst. > > Thanks for the clarification, I was unsure which to send to: I’ll send the revised patch to net-next instead. > > Ryan (on cc) saw this crash with a Marvell switch, using some not yet submitted patches to support in-band switch management without MDIO. > > Exactly what caused the switch to deliver a malformed DSA packet is unknown, but it seems reasonable to expect the kernel to be resilient to this (although one could make an argument that the trust boundary extends to the switch chip). It does seem reasonable considering dsa_loop, but if we can first characterize the problem to understand the impact, we should. Then, this characterization should go into the commit message, to justify to readers, backporters, etc, when they should worry and when they shouldn't. If you or Ryan still have access to the buggy system, does the problem reproduce without the in-band management patches? That is the most important argument for targetting the 'net' tree. Could you provide an skb_dump() of its contents? Are you even using offloaded LAG interfaces? Another rule from Documentation/process/maintainer-netdev.rst is to wait at least 24 hours between patch revisions. ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <C867697B-7F5B-4500-8098-9C44630D7930@padl.com>]
[parent not found: <CAD3ieB36VnKAQPXUGbnRdWYFThf-VfLkhZTRfb9=ddndZEW3=A@mail.gmail.com>]
* Re: [PATCH net] net: dsa: validate source trunk against lags_len [not found] ` <CAD3ieB36VnKAQPXUGbnRdWYFThf-VfLkhZTRfb9=ddndZEW3=A@mail.gmail.com> @ 2025-08-05 7:42 ` Vladimir Oltean 2025-08-05 11:52 ` Andrew Lunn 0 siblings, 1 reply; 11+ messages in thread From: Vladimir Oltean @ 2025-08-05 7:42 UTC (permalink / raw) To: Ryan Wilkins; +Cc: Luke Howard, netdev, Andrew Lunn Hello Ryan, On Tue, Aug 05, 2025 at 12:06:32AM -0400, Ryan Wilkins wrote: > I cannot confirm if the problem reproduces without the in-band management > patches applied as I’m testing with a Raspberry Pi that has no current > ability to connect to the switch chip via MDIO. Last I checked, the RMU patches did not offer the possibility to control the switch exclusively over Ethernet; an MDIO connection was still necessary, mainly as a fallback to mv88e6xxx_rmu_available(). Has any of that changed? Who enables RMU management in the first place? > It’s possible that I could get into a place where I could attempt to test > if the problem reproduces on an i.MX7D which is directly connected to the > switch chip, but this will take some time to work up a custom software > image for the board to enable this test. I think that would change too much, and the possibility of spending time in vain is too high. > Luke is correct in stating that I am not using LAG between my Pi and switch > chip. > > We could probably modify the system to capture skb_dump() output when the > error condition is detected, if that helps. That would help. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-08-05 7:42 ` Vladimir Oltean @ 2025-08-05 11:52 ` Andrew Lunn 2025-08-06 1:28 ` Luke Howard 0 siblings, 1 reply; 11+ messages in thread From: Andrew Lunn @ 2025-08-05 11:52 UTC (permalink / raw) To: Vladimir Oltean; +Cc: Ryan Wilkins, Luke Howard, netdev On Tue, Aug 05, 2025 at 10:42:26AM +0300, Vladimir Oltean wrote: > Hello Ryan, > > On Tue, Aug 05, 2025 at 12:06:32AM -0400, Ryan Wilkins wrote: > > I cannot confirm if the problem reproduces without the in-band management > > patches applied as I’m testing with a Raspberry Pi that has no current > > ability to connect to the switch chip via MDIO. > > Last I checked, the RMU patches did not offer the possibility to control > the switch exclusively over Ethernet; an MDIO connection was still > necessary, mainly as a fallback to mv88e6xxx_rmu_available(). Has any of > that changed? Who enables RMU management in the first place? Luke has extended my patches to add pure RMU support. It does require the EEPROM has the needed contents to get the switch into RMU mode, but with that, no MDIO is required. Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: validate source trunk against lags_len 2025-08-05 11:52 ` Andrew Lunn @ 2025-08-06 1:28 ` Luke Howard 0 siblings, 0 replies; 11+ messages in thread From: Luke Howard @ 2025-08-06 1:28 UTC (permalink / raw) To: Andrew Lunn; +Cc: Vladimir Oltean, Ryan Wilkins, netdev > Luke has extended my patches to add pure RMU support. It does require > the EEPROM has the needed contents to get the switch into RMU mode, > but with that, no MDIO is required. Or, a separate device connected via MDIO which places the device in RMU mode (optionally signalling readiness by setting a switch GPIO high). More information was posted at [1]. [1] https://www.spinics.net/lists/netdev/msg1097599.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 net-next] net: dsa: validate source trunk against lags_len 2025-07-31 5:35 [PATCH net] net: dsa: validate source trunk against lags_len Luke Howard 2025-07-31 8:23 ` Dawid Osuchowski 2025-07-31 9:07 ` Vladimir Oltean @ 2025-07-31 9:56 ` Luke Howard 2025-07-31 11:44 ` Vladimir Oltean 2025-07-31 13:07 ` Andrew Lunn 2 siblings, 2 replies; 11+ messages in thread From: Luke Howard @ 2025-07-31 9:56 UTC (permalink / raw) To: netdev; +Cc: Vladimir Oltean, Andrew Lunn, Ryan Wilkins A DSA frame with an invalid source trunk ID could cause an out-of-bounds read access of dst->lags. Add a check to dsa_lag_by_id() to validate the LAG ID is not zero, and is less than or equal to dst->lags_len. (The LAG ID is derived by adding one to the source trunk ID.) Note: this is in the fast path for any frames within a trunk. Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") Signed-off-by: Luke Howard <lukeh@padl.com> --- include/net/dsa.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/net/dsa.h b/include/net/dsa.h index 2e148823c366c..67672c5ff22e5 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -180,6 +180,9 @@ struct dsa_switch_tree { static inline struct dsa_lag *dsa_lag_by_id(struct dsa_switch_tree *dst, unsigned int id) { + if (unlikely(id == 0 || id > dst->lags_len)) + return NULL; + /* DSA LAG IDs are one-based, dst->lags is zero-based */ return dst->lags[id - 1]; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 net-next] net: dsa: validate source trunk against lags_len 2025-07-31 9:56 ` [PATCH v2 net-next] " Luke Howard @ 2025-07-31 11:44 ` Vladimir Oltean 2025-07-31 13:07 ` Andrew Lunn 1 sibling, 0 replies; 11+ messages in thread From: Vladimir Oltean @ 2025-07-31 11:44 UTC (permalink / raw) To: Luke Howard; +Cc: netdev, Andrew Lunn, Ryan Wilkins On Thu, Jul 31, 2025 at 07:56:37PM +1000, Luke Howard wrote: > A DSA frame with an invalid source trunk ID could cause an out-of-bounds > read access of dst->lags. > > Add a check to dsa_lag_by_id() to validate the LAG ID is not zero, and is > less than or equal to dst->lags_len. (The LAG ID is derived by adding one > to the source trunk ID.) > > Note: this is in the fast path for any frames within a trunk. > > Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") > Signed-off-by: Luke Howard <lukeh@padl.com> > --- 1. You need to carry over previous review tags (like from Dawid Osuchowski) when you send new patch versions. 2. As mentioned in the comments to the previous change, you can submit patches with a Fixes tag to 'net-next' only if the fixed commit is exclusively present in net-next. Not the case here. 3. The discussion on v1 is still ongoing, you shouldn't have posted v2 within one minute of replying to the v1 comments. Further changes will have to be made at least to the commit message of this patch, let's discuss them back on v1 after we have more data. pw-bot: changes-requested ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 net-next] net: dsa: validate source trunk against lags_len 2025-07-31 9:56 ` [PATCH v2 net-next] " Luke Howard 2025-07-31 11:44 ` Vladimir Oltean @ 2025-07-31 13:07 ` Andrew Lunn 1 sibling, 0 replies; 11+ messages in thread From: Andrew Lunn @ 2025-07-31 13:07 UTC (permalink / raw) To: Luke Howard; +Cc: netdev, Vladimir Oltean, Ryan Wilkins On Thu, Jul 31, 2025 at 07:56:37PM +1000, Luke Howard wrote: > A DSA frame with an invalid source trunk ID could cause an out-of-bounds > read access of dst->lags. > > Add a check to dsa_lag_by_id() to validate the LAG ID is not zero, and is > less than or equal to dst->lags_len. (The LAG ID is derived by adding one > to the source trunk ID.) > > Note: this is in the fast path for any frames within a trunk. > > Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") > Signed-off-by: Luke Howard <lukeh@padl.com> Adding to what Vladimir said, please also have each patchset in its own thread. There is a CI system which takes patches from the list and tests them. It does not always understand multiple patchsets in one thread. If the CI misses a patch, it pretty much means an automatic rejection of the patch and you will need to resubmit. Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-08-06 1:29 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-31 5:35 [PATCH net] net: dsa: validate source trunk against lags_len Luke Howard
2025-07-31 8:23 ` Dawid Osuchowski
2025-07-31 9:07 ` Vladimir Oltean
2025-07-31 9:55 ` Luke Howard
2025-07-31 11:37 ` Vladimir Oltean
[not found] ` <C867697B-7F5B-4500-8098-9C44630D7930@padl.com>
[not found] ` <CAD3ieB36VnKAQPXUGbnRdWYFThf-VfLkhZTRfb9=ddndZEW3=A@mail.gmail.com>
2025-08-05 7:42 ` Vladimir Oltean
2025-08-05 11:52 ` Andrew Lunn
2025-08-06 1:28 ` Luke Howard
2025-07-31 9:56 ` [PATCH v2 net-next] " Luke Howard
2025-07-31 11:44 ` Vladimir Oltean
2025-07-31 13:07 ` Andrew Lunn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox