* [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
@ 2015-05-02 16:24 roopa
2015-05-03 9:28 ` Rosen, Rami
2015-05-03 17:45 ` Scott Feldman
0 siblings, 2 replies; 7+ messages in thread
From: roopa @ 2015-05-02 16:24 UTC (permalink / raw)
To: davem, sfeldma, john.fastabend, jiri; +Cc: netdev
From: Roopa Prabhu <roopa@cumulusnetworks.com>
This basically removes the calls to netdev_switch_fib_ipv4_abort when
there is an error in programming fib entry in hardware.
On most systems where you can offload routes to hardware,
doing routing in software is not an option (the cpu's are not powerful
to route in software).
I understand that this was added to keep the first fib offload support
simple. This RFC patch is to start a discussion around why the fib
abort will not work for most hardware and to discuss alternate options
for default behaviour.
Available options:
a) Dont abort hardware offload and return appropriate error to the user on
fib offload failure by default (this patch)
b) make the behaviour in a) conditional on a global flag/sysctl (a per fib
entry flag will not work because by default user/routing-daemons dont care
if they are hardware offloaded)
c) for the users/routing-daemons interested in controlling the hardware
offload behaviour there is always the per fib entry flag RTF_F_EXTERNAL
(and special error codes -EOFFLOAD could perhaps indicate that the
hardware offload failed)
Considering the characteristics of the systems that support fib offloads,
a) above seems to be the right default.
PS: This patch currently removes use of the netdev_switch_fib_ipv4_abort
function but not the function itself. This will be fixed if we converge
on the default behaviour.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
net/ipv4/fib_trie.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index e13fcc6..05b6439 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1171,7 +1171,6 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
cfg->fc_nlflags,
tb->tb_id);
if (err) {
- netdev_switch_fib_ipv4_abort(fi);
kmem_cache_free(fn_alias_kmem, new_fa);
goto out;
}
@@ -1219,10 +1218,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
cfg->fc_type,
cfg->fc_nlflags,
tb->tb_id);
- if (err) {
- netdev_switch_fib_ipv4_abort(fi);
+ if (err)
goto out_free_new_fa;
- }
/* Insert new entry to the list. */
err = fib_insert_alias(t, tp, l, new_fa, fa, key);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* RE: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-02 16:24 [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware roopa
@ 2015-05-03 9:28 ` Rosen, Rami
2015-05-04 1:20 ` David Miller
2015-05-04 4:39 ` roopa
2015-05-03 17:45 ` Scott Feldman
1 sibling, 2 replies; 7+ messages in thread
From: Rosen, Rami @ 2015-05-03 9:28 UTC (permalink / raw)
To: roopa@cumulusnetworks.com, davem@davemloft.net, sfeldma@gmail.com,
john.fastabend@gmail.com, jiri@resnulli.us
Cc: netdev@vger.kernel.org
Hi,
Removing the netdev_switch_fib_ipv4_abort() when there is an error in programming fib entry in hardware seems reasonable to me. I Just want to note that this is not only a matter of CPU strength; even if the switches' CPUs were powerful enough to do routing in software, still doing so seems not a good option, as routing is implemented in different ways by different switch vendors.
Regards,
Rami
-----Original Message-----
From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of roopa@cumulusnetworks.com
Sent: Saturday, May 02, 2015 19:25
To: davem@davemloft.net; sfeldma@gmail.com; john.fastabend@gmail.com; jiri@resnulli.us
Cc: netdev@vger.kernel.org
Subject: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
From: Roopa Prabhu <roopa@cumulusnetworks.com>
This basically removes the calls to netdev_switch_fib_ipv4_abort when there is an error in programming fib entry in hardware.
On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu's are not powerful to route in software).
I understand that this was added to keep the first fib offload support simple. This RFC patch is to start a discussion around why the fib abort will not work for most hardware and to discuss alternate options for default behaviour.
Available options:
a) Dont abort hardware offload and return appropriate error to the user on fib offload failure by default (this patch)
b) make the behaviour in a) conditional on a global flag/sysctl (a per fib entry flag will not work because by default user/routing-daemons dont care if they are hardware offloaded)
c) for the users/routing-daemons interested in controlling the hardware offload behaviour there is always the per fib entry flag RTF_F_EXTERNAL (and special error codes -EOFFLOAD could perhaps indicate that the hardware offload failed)
Considering the characteristics of the systems that support fib offloads,
a) above seems to be the right default.
PS: This patch currently removes use of the netdev_switch_fib_ipv4_abort function but not the function itself. This will be fixed if we converge on the default behaviour.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
net/ipv4/fib_trie.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index e13fcc6..05b6439 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1171,7 +1171,6 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
cfg->fc_nlflags,
tb->tb_id);
if (err) {
- netdev_switch_fib_ipv4_abort(fi);
kmem_cache_free(fn_alias_kmem, new_fa);
goto out;
}
@@ -1219,10 +1218,8 @@ int fib_table_insert(struct fib_table *tb, struct fib_config *cfg)
cfg->fc_type,
cfg->fc_nlflags,
tb->tb_id);
- if (err) {
- netdev_switch_fib_ipv4_abort(fi);
+ if (err)
goto out_free_new_fa;
- }
/* Insert new entry to the list. */
err = fib_insert_alias(t, tp, l, new_fa, fa, key);
--
1.7.10.4
--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-02 16:24 [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware roopa
2015-05-03 9:28 ` Rosen, Rami
@ 2015-05-03 17:45 ` Scott Feldman
2015-05-04 3:56 ` roopa
1 sibling, 1 reply; 7+ messages in thread
From: Scott Feldman @ 2015-05-03 17:45 UTC (permalink / raw)
To: Roopa Prabhu
Cc: David S. Miller, john fastabend, Jiří Pírko,
Netdev
On Sat, May 2, 2015 at 9:24 AM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This basically removes the calls to netdev_switch_fib_ipv4_abort when
> there is an error in programming fib entry in hardware.
>
> On most systems where you can offload routes to hardware,
> doing routing in software is not an option (the cpu's are not powerful
> to route in software).
>
> I understand that this was added to keep the first fib offload support
> simple. This RFC patch is to start a discussion around why the fib
> abort will not work for most hardware and to discuss alternate options
> for default behaviour.
Thanks for revisiting this topic. Agreed, current arrangement isn't
practical for real offload devices with limited CPU support.
> Available options:
> a) Dont abort hardware offload and return appropriate error to the user on
> fib offload failure by default (this patch)
> b) make the behaviour in a) conditional on a global flag/sysctl (a per fib
> entry flag will not work because by default user/routing-daemons dont care
> if they are hardware offloaded)
> c) for the users/routing-daemons interested in controlling the hardware
> offload behaviour there is always the per fib entry flag RTF_F_EXTERNAL
> (and special error codes -EOFFLOAD could perhaps indicate that the
> hardware offload failed)
>
> Considering the characteristics of the systems that support fib offloads,
> a) above seems to be the right default.
Going with option a) puts a burden on the user: on failure, the user
may or may not understand why the operation failed. Or worse, the
routing protocol may not make the failure obvious to the user, so the
user is unaware or unsure why routes aren't getting installed. With
the current solution, routes are still installed even if there was a
failure to offload, so the user doesn't care (yes, whimpy CPU would
make the user care :).
Anyway, I'd prefer to see an option c) -like solution where we force
user to explicitly indicate route offload (by setting
RTNH_F_EXTERNAL). The default would be no offload. The apps
(iproute2 + routing protocols) would need to be updated to support
setting this new flag. This way, the user participates at the app
level in setting the flag and if there is a failure, the user would be
aware and can react. (switchdev driver can return meaningful err
codes, such as ENOSPC for hw route tables full, for example).
With c), the user must be aware that offloaded routes take priority
over non-offloaded routes. It's as if there is a new ip rule -1
that's higher priority than the kernel's ip rule 0. ip rule -1 says
look up in offload device first.
> PS: This patch currently removes use of the netdev_switch_fib_ipv4_abort
> function but not the function itself. This will be fixed if we converge
> on the default behaviour.
I'd prefer a full cleanup, including removal of
ipv4.fib_offload_disabled, rather than leaving orphaned code.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-03 9:28 ` Rosen, Rami
@ 2015-05-04 1:20 ` David Miller
2015-05-04 4:39 ` roopa
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2015-05-04 1:20 UTC (permalink / raw)
To: rami.rosen; +Cc: roopa, sfeldma, john.fastabend, jiri, netdev
Please do not top-post.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-03 17:45 ` Scott Feldman
@ 2015-05-04 3:56 ` roopa
2015-05-05 17:23 ` Scott Feldman
0 siblings, 1 reply; 7+ messages in thread
From: roopa @ 2015-05-04 3:56 UTC (permalink / raw)
To: Scott Feldman
Cc: David S. Miller, john fastabend, Jiří Pírko,
Netdev
On 5/3/15, 10:45 AM, Scott Feldman wrote:
> On Sat, May 2, 2015 at 9:24 AM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This basically removes the calls to netdev_switch_fib_ipv4_abort when
>> there is an error in programming fib entry in hardware.
>>
>> On most systems where you can offload routes to hardware,
>> doing routing in software is not an option (the cpu's are not powerful
>> to route in software).
>>
>> I understand that this was added to keep the first fib offload support
>> simple. This RFC patch is to start a discussion around why the fib
>> abort will not work for most hardware and to discuss alternate options
>> for default behaviour.
> Thanks for revisiting this topic. Agreed, current arrangement isn't
> practical for real offload devices with limited CPU support.
>
>> Available options:
>> a) Dont abort hardware offload and return appropriate error to the user on
>> fib offload failure by default (this patch)
>> b) make the behaviour in a) conditional on a global flag/sysctl (a per fib
>> entry flag will not work because by default user/routing-daemons dont care
>> if they are hardware offloaded)
>> c) for the users/routing-daemons interested in controlling the hardware
>> offload behaviour there is always the per fib entry flag RTF_F_EXTERNAL
>> (and special error codes -EOFFLOAD could perhaps indicate that the
>> hardware offload failed)
>>
>> Considering the characteristics of the systems that support fib offloads,
>> a) above seems to be the right default.
> Going with option a) puts a burden on the user: on failure, the user
> may or may not understand why the operation failed. Or worse, the
> routing protocol may not make the failure obvious to the user, so the
> user is unaware or unsure why routes aren't getting installed. With
> the current solution, routes are still installed even if there was a
> failure to offload, so the user doesn't care (yes, whimpy CPU would
> make the user care :).
>
> Anyway, I'd prefer to see an option c) -like solution where we force
> user to explicitly indicate route offload (by setting
> RTNH_F_EXTERNAL). The default would be no offload. The apps
> (iproute2 + routing protocols) would need to be updated to support
> setting this new flag. This way, the user participates at the app
> level in setting the flag and if there is a failure, the user would be
> aware and can react.
I agree with you that there should be an option c), but making it the
default is a problem
which i am trying to avoid. An example routing daemon like quagga should
work
on a server with no l3 acceleration and the same app should continue
working on a switch with l3 acceleration. Changing apps to use
hardware acceleration will not be a welcomed IMO. I am afraid it will slow
down adoption of these apps on switches. And we do want to see these
linux networking apps running on switches the same way they ran on a server.
As you know, we (@cumulus) can run many networking apps (more than one
routing daemon)
on our switches today only because apps can use the existing kernel API
without
having to change to use special flags.
> (switchdev driver can return meaningful err
> codes, such as ENOSPC for hw route tables full, for example).
>
> With c), the user must be aware that offloaded routes take priority
> over non-offloaded routes. It's as if there is a new ip rule -1
> that's higher priority than the kernel's ip rule 0. ip rule -1 says
> look up in offload device first.
I see that some systems could maybe use this (I dont know of any right
now..but
maybe in the future ?).
I can submit patches with the required implementation to cover option c) ...
but i would prefer a way to see a) being the default to enable more apps
to run
on a switch seamlessly.
>
>> PS: This patch currently removes use of the netdev_switch_fib_ipv4_abort
>> function but not the function itself. This will be fixed if we converge
>> on the default behaviour.
> I'd prefer a full cleanup, including removal of
> ipv4.fib_offload_disabled, rather than leaving orphaned code.
ack, I will post a full patch once we converge.
Thanks for the review.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-03 9:28 ` Rosen, Rami
2015-05-04 1:20 ` David Miller
@ 2015-05-04 4:39 ` roopa
1 sibling, 0 replies; 7+ messages in thread
From: roopa @ 2015-05-04 4:39 UTC (permalink / raw)
To: Rosen, Rami
Cc: davem@davemloft.net, sfeldma@gmail.com, john.fastabend@gmail.com,
jiri@resnulli.us, netdev@vger.kernel.org
On 5/3/15, 2:28 AM, Rosen, Rami wrote:
> Hi,
> Removing the netdev_switch_fib_ipv4_abort() when there is an error in programming fib entry in hardware seems reasonable to me. I Just want to note that this is not only a matter of CPU strength; even if the switches' CPUs were powerful enough to do routing in software, still doing so seems not a good option, as routing is implemented in different ways by different switch vendors.
>
>
agree,
thanks.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
2015-05-04 3:56 ` roopa
@ 2015-05-05 17:23 ` Scott Feldman
0 siblings, 0 replies; 7+ messages in thread
From: Scott Feldman @ 2015-05-05 17:23 UTC (permalink / raw)
To: roopa; +Cc: David S. Miller, john fastabend, Jiří Pírko,
Netdev
On Sun, May 3, 2015 at 8:56 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 5/3/15, 10:45 AM, Scott Feldman wrote:
>>
>> On Sat, May 2, 2015 at 9:24 AM, <roopa@cumulusnetworks.com> wrote:
>>>
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> This basically removes the calls to netdev_switch_fib_ipv4_abort when
>>> there is an error in programming fib entry in hardware.
[cut]
> ack, I will post a full patch once we converge.
I'm OK with option a) for now.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-05-05 17:23 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-02 16:24 [RFC PATCH net-next] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware roopa
2015-05-03 9:28 ` Rosen, Rami
2015-05-04 1:20 ` David Miller
2015-05-04 4:39 ` roopa
2015-05-03 17:45 ` Scott Feldman
2015-05-04 3:56 ` roopa
2015-05-05 17:23 ` Scott Feldman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).