* [PATCH net 1/5] sfc: Minimal support for 40G link speed
From: Ben Hutchings @ 2013-09-17 0:11 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1379376611.1945.11.camel@bwh-desktop.uk.level5networks.com>
Accept and handle 40G link events.
Accept ethtool link settings of speed == 40000 && duplex, and set the
appropriate MCDI PHY capability.
This does not include reporting of 40G media types, as those have not
yet been assigned numbers in the MCDI protocol.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/ethernet/sfc/Kconfig | 2 +-
drivers/net/ethernet/sfc/mcdi_port.c | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
index 8b71525..0889212 100644
--- a/drivers/net/ethernet/sfc/Kconfig
+++ b/drivers/net/ethernet/sfc/Kconfig
@@ -7,7 +7,7 @@ config SFC
select I2C_ALGOBIT
select PTP_1588_CLOCK
---help---
- This driver supports 10-gigabit Ethernet cards based on
+ This driver supports 10/40-gigabit Ethernet cards based on
the Solarflare SFC4000, SFC9000-family and SFC9100-family
controllers.
diff --git a/drivers/net/ethernet/sfc/mcdi_port.c b/drivers/net/ethernet/sfc/mcdi_port.c
index 8d33da6..7b6be61 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -556,6 +556,7 @@ static int efx_mcdi_phy_set_settings(struct efx_nic *efx, struct ethtool_cmd *ec
case 100: caps = 1 << MC_CMD_PHY_CAP_100FDX_LBN; break;
case 1000: caps = 1 << MC_CMD_PHY_CAP_1000FDX_LBN; break;
case 10000: caps = 1 << MC_CMD_PHY_CAP_10000FDX_LBN; break;
+ case 40000: caps = 1 << MC_CMD_PHY_CAP_40000FDX_LBN; break;
default: return -EINVAL;
}
} else {
@@ -841,6 +842,7 @@ static unsigned int efx_mcdi_event_link_speed[] = {
[MCDI_EVENT_LINKCHANGE_SPEED_100M] = 100,
[MCDI_EVENT_LINKCHANGE_SPEED_1G] = 1000,
[MCDI_EVENT_LINKCHANGE_SPEED_10G] = 10000,
+ [MCDI_EVENT_LINKCHANGE_SPEED_40G] = 40000,
};
void efx_mcdi_process_link_change(struct efx_nic *efx, efx_qword_t *ev)
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH net 2/5] sfc: Disable PTP on EF10 until we're ready to handle inline RX timestamps
From: Ben Hutchings @ 2013-09-17 0:12 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1379376611.1945.11.camel@bwh-desktop.uk.level5networks.com>
Unlike Siena where timestamping is provided by a peripheral, EF10
delivers RX timestamps in the packet prefix. However the driver
doesn't yet support this.
We are also creating a PHC device for each EF10 function, even though
the clock is really shared between all of them.
Disable hardware PTP/timestamping support until it's complete.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/ethernet/sfc/ef10.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 5f42313..357a6e5 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -260,8 +260,6 @@ static int efx_ef10_probe(struct efx_nic *efx)
if (rc)
goto fail3;
- efx_ptp_probe(efx);
-
return 0;
fail3:
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH net 3/5] sfc: Reset derived rx_bad_bytes statistic when EF10 MC is rebooted
From: Ben Hutchings @ 2013-09-17 0:12 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1379376611.1945.11.camel@bwh-desktop.uk.level5networks.com>
If the MC reboots then the stats it reports to us will have been
reset. We need to reset ours to get efx_update_diff_stat() working
properly.
(This is a re-run of commit 876be083b669 'sfc: Reset driver's
MAC stats after MC reboot seen'.)
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/ethernet/sfc/ef10.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 357a6e5..80a6eea 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -708,6 +708,11 @@ static int efx_ef10_mcdi_poll_reboot(struct efx_nic *efx)
nic_data->must_restore_filters = true;
nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID;
+ /* MAC statistics have been cleared on the NIC; clear the local
+ * statistic that we update with efx_update_diff_stat().
+ */
+ nic_data->stats[EF10_STAT_rx_bad_bytes] = 0;
+
return -EIO;
}
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH net 4/5] sfc: Clean up validation of datapath capabilities
From: Ben Hutchings @ 2013-09-17 0:13 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1379376611.1945.11.camel@bwh-desktop.uk.level5networks.com>
Rename efx_ef10_init_capabilities() to the more specific
efx_ef10_init_datapath_caps().
Stop accepting short responses to MC_CMD_GET_CAPABILITIES; we
don't need to support pre-production firmware.
Move the check for RX prefix support from efx_ef10_probe() into
efx_ef10_init_datapath_caps() and use consistent error messages
for missing TSO support and missing RX prefix support.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/ethernet/sfc/ef10.c | 41 ++++++++++++++++++++++-------------------
1 file changed, 22 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 80a6eea..a4956b8 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -94,7 +94,7 @@ static unsigned int efx_ef10_mem_map_size(struct efx_nic *efx)
return resource_size(&efx->pci_dev->resource[EFX_MEM_BAR]);
}
-static int efx_ef10_init_capabilities(struct efx_nic *efx)
+static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
{
MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_CAPABILITIES_OUT_LEN);
struct efx_ef10_nic_data *nic_data = efx->nic_data;
@@ -107,16 +107,27 @@ static int efx_ef10_init_capabilities(struct efx_nic *efx)
outbuf, sizeof(outbuf), &outlen);
if (rc)
return rc;
+ if (outlen < sizeof(outbuf)) {
+ netif_err(efx, drv, efx->net_dev,
+ "unable to read datapath firmware capabilities\n");
+ return -EIO;
+ }
- if (outlen >= sizeof(outbuf)) {
- nic_data->datapath_caps =
- MCDI_DWORD(outbuf, GET_CAPABILITIES_OUT_FLAGS1);
- if (!(nic_data->datapath_caps &
- (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN))) {
- netif_err(efx, drv, efx->net_dev,
- "Capabilities don't indicate TSO support.\n");
- return -ENODEV;
- }
+ nic_data->datapath_caps =
+ MCDI_DWORD(outbuf, GET_CAPABILITIES_OUT_FLAGS1);
+
+ if (!(nic_data->datapath_caps &
+ (1 << MC_CMD_GET_CAPABILITIES_OUT_TX_TSO_LBN))) {
+ netif_err(efx, drv, efx->net_dev,
+ "current firmware does not support TSO\n");
+ return -ENODEV;
+ }
+
+ if (!(nic_data->datapath_caps &
+ (1 << MC_CMD_GET_CAPABILITIES_OUT_RX_PREFIX_LEN_14_LBN))) {
+ netif_err(efx, probe, efx->net_dev,
+ "current firmware does not support an RX prefix\n");
+ return -ENODEV;
}
return 0;
@@ -217,21 +228,13 @@ static int efx_ef10_probe(struct efx_nic *efx)
if (rc)
goto fail3;
- rc = efx_ef10_init_capabilities(efx);
+ rc = efx_ef10_init_datapath_caps(efx);
if (rc < 0)
goto fail3;
efx->rx_packet_len_offset =
ES_DZ_RX_PREFIX_PKTLEN_OFST - ES_DZ_RX_PREFIX_SIZE;
- if (!(nic_data->datapath_caps &
- (1 << MC_CMD_GET_CAPABILITIES_OUT_RX_PREFIX_LEN_14_LBN))) {
- netif_err(efx, probe, efx->net_dev,
- "current firmware does not support an RX prefix\n");
- rc = -ENODEV;
- goto fail3;
- }
-
rc = efx_mcdi_port_get_number(efx);
if (rc < 0)
goto fail3;
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH net 5/5] sfc: Reinitialise and re-validate datapath caps after MC reboot
From: Ben Hutchings @ 2013-09-17 0:13 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1379376611.1945.11.camel@bwh-desktop.uk.level5networks.com>
After an MC reboot, the datapath may be running a different firmware
variant and have different capabilities. It is critical that we know
the current capabilities so that we can pass valid flags to
MC_CMD_INIT_EVQ.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
drivers/net/ethernet/sfc/ef10.c | 10 ++++++++++
drivers/net/ethernet/sfc/nic.h | 3 +++
2 files changed, 13 insertions(+)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index a4956b8..9f18ae9 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -343,6 +343,13 @@ static int efx_ef10_init_nic(struct efx_nic *efx)
struct efx_ef10_nic_data *nic_data = efx->nic_data;
int rc;
+ if (nic_data->must_check_datapath_caps) {
+ rc = efx_ef10_init_datapath_caps(efx);
+ if (rc)
+ return rc;
+ nic_data->must_check_datapath_caps = false;
+ }
+
if (nic_data->must_realloc_vis) {
/* We cannot let the number of VIs change now */
rc = efx_ef10_alloc_vis(efx, nic_data->n_allocated_vis,
@@ -711,6 +718,9 @@ static int efx_ef10_mcdi_poll_reboot(struct efx_nic *efx)
nic_data->must_restore_filters = true;
nic_data->rx_rss_context = EFX_EF10_RSS_CONTEXT_INVALID;
+ /* The datapath firmware might have been changed */
+ nic_data->must_check_datapath_caps = true;
+
/* MAC statistics have been cleared on the NIC; clear the local
* statistic that we update with efx_update_diff_stat().
*/
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index 4b1e188..fda29d3 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -400,6 +400,8 @@ enum {
* @rx_rss_context: Firmware handle for our RSS context
* @stats: Hardware statistics
* @workaround_35388: Flag: firmware supports workaround for bug 35388
+ * @must_check_datapath_caps: Flag: @datapath_caps needs to be revalidated
+ * after MC reboot
* @datapath_caps: Capabilities of datapath firmware (FLAGS1 field of
* %MC_CMD_GET_CAPABILITIES response)
*/
@@ -413,6 +415,7 @@ struct efx_ef10_nic_data {
u32 rx_rss_context;
u64 stats[EF10_STAT_COUNT];
bool workaround_35388;
+ bool must_check_datapath_caps;
u32 datapath_caps;
};
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* Re: [PATCH 1/1] net: race condition when removing virtual net_device
From: Eric W. Biederman @ 2013-09-17 0:25 UTC (permalink / raw)
To: Francesco Ruggeri
Cc: David S. Miller, Eric Dumazet, Jiri Pirko, Alexander Duyck,
Cong Wang, netdev
In-Reply-To: <CA+HUmGih9akzhpRrb_0WEapi4jGcuSV8qO==QeRWHoqnxzFyng@mail.gmail.com>
Francesco Ruggeri <fruggeri@aristanetworks.com> writes:
> On Mon, Sep 16, 2013 at 3:45 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>
>> I believe the weird reordering of veth devices is solved or largely
>> solved by:
>>
>> commit f45a5c267da35174e22cec955093a7513dc1623d
>
> I looked at d0e2c55e and f45a5c26 in the past and I do not think they
> are related to the problem I see, which I think is more related to the
> infrastructure. See more below.
If everything is in the same batch they definitely relate, as it stops
refiling one of the devices past the point where it's loopback device
is destroyed.
It is not the fundamental problem but it is related and those fixes.
>>> and this is the sequence after pushing the loopback_devs to the tail
>>> of net_todo_list:
>>>
>>> unregister_netdevice_queue: v0 (ns ffff8800379f8000)
>>> unregister_netdevice_queue: v1 (ns ffff8800378c0b00)
>>> unregister_netdevice_queue: lo (ns ffff8800379f8000)
>>> unregister_netdevice_queue: v1 (ns ffff8800378c0b00)
>>> unregister_netdevice_queue: v0 (ns ffff8800379f8000)
>>> unregister_netdevice_queue: lo (ns ffff8800378c0b00)
>>> unregister_netdevice_many: lo (ns ffff8800379f8000) v1 (ns
>>> ffff8800378c0b00) v0 (ns ffff8800379f8000) lo (ns ffff8800378c0b00)
>>> netdev_run_todo: v1 (ns ffff8800378c0b00) v0 (ns ffff8800379f8000) lo
>>> (ns ffff8800379f8000) lo (ns ffff8800378c0b00)
>>
>> I am scratching my head. That isn't what I would expect from putting
>> loopback devices at the end of the list.
>
> The way to read it is as follows:
> unregister_netdevice_many gets the sequence lo0 v1 v0 lo1 (lo0 is the
> occurrence of "lo" with the same namespace as v0, etc.).
> As the interfaces are unlisted the loopback_devs are queued at the end
> of net_todo_list, so when netdev_run_todo executes net_todo_list is v1
> v0 lo0 lo1.
My reading jumbled the unregister_netdevice_many line and the
netdev_run_todo line. So I was seeing v0, lo, v1, v0, lo, lo.
But since both loop devices are now at the end your that part of your
patch looks ok.
>> Even more I am not comfortable with any situation where preserving the
>> order in which the devices are unregistered is not sufficient to make
>> things work correctly. That quickly gets into fragile layering
>> problems, and I don't know how anyone would be able to maintain that.
>>
>
> I agree with your general statement, but unfortunately the order in
> which the interfaces are unregistered is not preserved even in the
> current code, since dev_close_many already rearranges them based on
> whether they are UP or not.
Ugh. That is a bug.
I have sent a patch to solve this by simply not reordering the device
deletions as that is easier to think about. And more robust if we
happen to care about anything besides the loopback device. Reording
happens so seldom I can't imagine it being tested properly.
I have also sent a patch to set loopback_dev to NULL so that it is more
likely we will see your class of bugs.
I think I see bugs with the handling of rtmsg_ifinfo, and
netpoll_rx_enable/disable in dev_close_many as well. But those are
significantly less serious issues.
If you could verify that my patch to dev_close solves the ordering
issues you were seeing I would appreciate that.
>> I still don't see a better solution than dev_change_net_namespace,
>> for the network devices that have this problem. Network devices are not
>> supposed to be findable after the dance at the start of cleanup_net.
>>
>
> I think the problem we are seeing is an infrastructure one. If I can restate it:
>
> 1) When destroying a namespace all net_devices in it should be
> disposed of before any non-device pernet subsystems exit. The existing
> code tries to do this but it does not take into consideration
> net_devices that may have been unlisted from the namespace by another
> process which is still in the process of destroying them (specifically
> in netdev_run_todo/netdev_wait_allrefs).
The existing code makes the assumption that is not possible to find
a network device and manipulate it outside the network namespace in
any way (outside of the network namespace cleanup methods) after the
network namespace reference count has dropped to 0 and the network
namespace is no longer in the network namespace list.
That assumption is clearly not true for veth pairs, vlans, macvlans and
the like, and it is an infrastructure problem.
The question is what is the most robust comprehensible and maintinable
fix? (That doesn't penalize the fast path of course).
> There is also another issue, separate but related (my diffs did not
> try to address it):
>
> 2) dev_change_net_namespace does not have the equivalent of
> netdev_wait_allrefs. If rebroadcasts are in fact needed when
> destroying a net_device then the same should apply when the net_device
> is moved out of a namespace. With existing code a net_device can be
> moved to a different namespace before its refcount drops to 0. Any
> later broadcasts will not trigger any action in the original namespace
> where they should have occurred.
Not having the broadcast in dev_change_net_namespace is not a bug. The
repeated broadcast is most definitely there for hysterical raisins, and
with a good audit of dev_hold/dev_put call sites can most definitely be
removed. At a practical level the repeated broadcast is just acting as
a hammer and saying to subsystems you messed up now let this device go.
Let go! Let go! Let go! And there is a slight hope that something will
let go when told multiple times.
> I suspect that this is not catastrophic but its effects could be
> subtle. This is another reason why I would avoid using
> dev_change_net_namespace as a driver level solution for 1) above,
> besides the fact that we would have to apply it to all drivers
> affected by this (veth, vlan, macvlan, maybe more).
We don't avoid the rebroadcast and wait, with dev_change_net_namespace,
they are just performed in a different network namespace. So if
something is wrong we will know.
>> It might be possible to actually build asynchronous network device
>> unregistration and opportunistick batching. Aka cleanup network devices
>> much like we clean up network namespaces now, and by funnelling them all
>> into a single threaded work queue that would probably stop the races, as
>> well as improve performance. I don't have the energy to do that right
>> now unfortunately :(
>>
>
> That would probably solve it, but as you suggest it will take a
> considerable amount of work and code re-structuring.
And it may be worth it for the speed up you get when destroying 384 mlag
ports.
Eric
^ permalink raw reply
* [PATCH v3 net-next 02/27] net: add RCU variant to search for netdev_adjacent link
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck, Cong Wang
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently we have only the RTNL flavour, however we can traverse it while
holding only RCU, so add the RCU search. Add only one function that will be
used further, other functions can be added easily afterwards, if anyone
would need them.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
rename neighbour_dev_list/all_dev_list to adj_list/all_adj_list and rework
the naming of functions - to make the use the pattern
netdev_(all_)(lower|upper)_* . Uninline functions to get better stack
traces - inline doesn't give much speed improvement, but this way the stack
traces are meaningful. Remove the unused (through the whole patchset)
functions - they can easily be added afterwards.
net/core/dev.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index f69a8ab..75eabef 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4380,6 +4380,33 @@ struct netdev_adjacent {
struct rcu_head rcu;
};
+static struct netdev_adjacent *__netdev_find_adj_rcu(struct net_device *dev,
+ struct net_device *adj_dev,
+ bool upper, bool neighbour)
+{
+ struct netdev_adjacent *adj;
+ struct list_head *adj_list;
+
+ if (neighbour)
+ adj_list = upper ? &dev->adj_list.upper :
+ &dev->adj_list.lower;
+ else
+ adj_list = upper ? &dev->all_adj_list.upper :
+ &dev->all_adj_list.lower;
+
+ list_for_each_entry_rcu(adj, adj_list, list) {
+ if (adj->dev == adj_dev)
+ return adj;
+ }
+ return NULL;
+}
+
+static struct netdev_adjacent *__netdev_lower_find_rcu(struct net_device *dev,
+ struct net_device *ldev)
+{
+ return __netdev_find_adj_rcu(dev, ldev, false, true);
+}
+
static struct netdev_adjacent *__netdev_find_adj(struct net_device *dev,
struct net_device *adj_dev,
bool upper, bool neighbour)
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 03/27] net: uninline netdev neighbour functions
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
They don't give almost any speed/size advantage, however it's really
useful to have them in the backtrace.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
New patch.
net/core/dev.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 75eabef..d277081 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4684,25 +4684,25 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
kfree_rcu(adj, rcu);
}
-static inline void __netdev_upper_dev_remove(struct net_device *dev,
- struct net_device *udev)
+static void __netdev_upper_dev_remove(struct net_device *dev,
+ struct net_device *udev)
{
return __netdev_adjacent_dev_remove(dev, udev, true, false);
}
-static inline void __netdev_upper_dev_remove_neighbour(struct net_device *dev,
+static void __netdev_upper_dev_remove_neighbour(struct net_device *dev,
struct net_device *udev)
{
return __netdev_adjacent_dev_remove(dev, udev, true, true);
}
-static inline void __netdev_lower_dev_remove(struct net_device *dev,
- struct net_device *ldev)
+static void __netdev_lower_dev_remove(struct net_device *dev,
+ struct net_device *ldev)
{
return __netdev_adjacent_dev_remove(dev, ldev, false, false);
}
-static inline void __netdev_lower_dev_remove_neighbour(struct net_device *dev,
+static void __netdev_lower_dev_remove_neighbour(struct net_device *dev,
struct net_device *ldev)
{
return __netdev_adjacent_dev_remove(dev, ldev, false, true);
@@ -4727,15 +4727,15 @@ int __netdev_adjacent_dev_insert_link(struct net_device *dev,
return 0;
}
-static inline int __netdev_adjacent_dev_link(struct net_device *dev,
- struct net_device *udev)
+static int __netdev_adjacent_dev_link(struct net_device *dev,
+ struct net_device *udev)
{
return __netdev_adjacent_dev_insert_link(dev, udev, false, false);
}
-static inline int __netdev_adjacent_dev_link_neighbour(struct net_device *dev,
- struct net_device *udev,
- bool master)
+static int __netdev_adjacent_dev_link_neighbour(struct net_device *dev,
+ struct net_device *udev,
+ bool master)
{
return __netdev_adjacent_dev_insert_link(dev, udev, master, true);
}
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 01/27] net: add adj_list to save only neighbours
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck, Cong Wang
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently, we distinguish neighbours (first-level linked devices) from
non-neighbours by the neighbour bool in the netdev_adjacent. This could be
quite time-consuming in case we would like to traverse *only* through
neighbours - cause we'd have to traverse through all devices and check for
this flag, and in a (quite common) scenario where we have lots of vlans on
top of bridge, which is on top of a bond - the bonding would have to go
through all those vlans to get its upper neighbour linked devices.
This situation is really unpleasant, cause there are already a lot of cases
when a device with slaves needs to go through them in hot path.
To fix this, introduce a new upper/lower device lists structure -
adj_list, which contains only the neighbours. It works always in
pair with the all_adj_list structure (renamed from upper/lower_dev_list),
i.e. both of them contain the same links, only that all_adj_list contains
also non-neighbour device links. It's really a small change visible,
currently, only for __netdev_adjacent_dev_insert/remove(), and doesn't
change the main linked logic at all.
Also, add some comments a fix a name collision in
netdev_for_each_upper_dev_rcu() and rework the naming by the following
rules:
netdev_(all_)(upper|lower)_*
If "all_" is present, then we work with the whole list of upper/lower
devices, otherwise - only with direct neighbours. Uninline functions - to
get better stack traces.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
Fix a bug when we've tried to add a non-neighbour dev1 to dev2, while dev2
already had dev1 as a neighbour - we shouldn't care about neighbour's
ref_nr at all cause we can only add one - so just add/remove it via
different functions. And it became more readable.
RFC -> v1:
rename neighbour_dev_list/all_dev_list to adj_list/all_adj_list and rework
the naming of functions - to make the use the pattern
netdev_(all_)(lower|upper)_* . Uninline functions to get better stack
traces - inline doesn't give much speed improvement, but this way the stack
traces are meaningful.
drivers/net/bonding/bond_alb.c | 2 +-
drivers/net/bonding/bond_main.c | 10 +-
include/linux/netdevice.h | 28 +++--
net/core/dev.c | 219 ++++++++++++++++++++++++++++------------
4 files changed, 179 insertions(+), 80 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 91f179d..c3dcc6b 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1019,7 +1019,7 @@ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
/* loop through vlans and send one packet for each */
rcu_read_lock();
- netdev_for_each_upper_dev_rcu(bond->dev, upper, iter) {
+ netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
if (upper->priv_flags & IFF_802_1Q_VLAN)
alb_send_lp_vid(slave, mac_addr,
vlan_dev_vlan_id(upper));
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 39e5b1c..72bdb8b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2267,7 +2267,7 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
return true;
rcu_read_lock();
- netdev_for_each_upper_dev_rcu(bond->dev, upper, iter) {
+ netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
if (ip == bond_confirm_addr(upper, 0, ip)) {
ret = true;
break;
@@ -2342,10 +2342,12 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
*
* TODO: QinQ?
*/
- netdev_for_each_upper_dev_rcu(bond->dev, vlan_upper, vlan_iter) {
+ netdev_for_each_all_upper_dev_rcu(bond->dev, vlan_upper,
+ vlan_iter) {
if (!is_vlan_dev(vlan_upper))
continue;
- netdev_for_each_upper_dev_rcu(vlan_upper, upper, iter) {
+ netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
+ iter) {
if (upper == rt->dst.dev) {
vlan_id = vlan_dev_vlan_id(vlan_upper);
rcu_read_unlock();
@@ -2358,7 +2360,7 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
* our upper vlans, then just search for any dev that
* matches, and in case it's a vlan - save the id
*/
- netdev_for_each_upper_dev_rcu(bond->dev, upper, iter) {
+ netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
if (upper == rt->dst.dev) {
/* if it's a vlan - get its VID */
if (is_vlan_dev(upper))
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 041b42a..2a944e5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1143,8 +1143,18 @@ struct net_device {
struct list_head dev_list;
struct list_head napi_list;
struct list_head unreg_list;
- struct list_head upper_dev_list; /* List of upper devices */
- struct list_head lower_dev_list;
+
+ /* directly linked devices, like slaves for bonding */
+ struct {
+ struct list_head upper;
+ struct list_head lower;
+ } adj_list;
+
+ /* all linked devices, *including* neighbours */
+ struct {
+ struct list_head upper;
+ struct list_head lower;
+ } all_adj_list;
/* currently active device features */
@@ -2813,15 +2823,15 @@ extern int bpf_jit_enable;
extern bool netdev_has_upper_dev(struct net_device *dev,
struct net_device *upper_dev);
extern bool netdev_has_any_upper_dev(struct net_device *dev);
-extern struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
- struct list_head **iter);
+extern struct net_device *netdev_all_upper_get_next_dev_rcu(struct net_device *dev,
+ struct list_head **iter);
/* iterate through upper list, must be called under RCU read lock */
-#define netdev_for_each_upper_dev_rcu(dev, upper, iter) \
- for (iter = &(dev)->upper_dev_list, \
- upper = netdev_upper_get_next_dev_rcu(dev, &(iter)); \
- upper; \
- upper = netdev_upper_get_next_dev_rcu(dev, &(iter)))
+#define netdev_for_each_all_upper_dev_rcu(dev, updev, iter) \
+ for (iter = &(dev)->all_adj_list.upper, \
+ updev = netdev_all_upper_get_next_dev_rcu(dev, &(iter)); \
+ updev; \
+ updev = netdev_all_upper_get_next_dev_rcu(dev, &(iter)))
extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index 5c713f2..f69a8ab 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4373,9 +4373,6 @@ struct netdev_adjacent {
/* upper master flag, there can only be one master device per list */
bool master;
- /* indicates that this dev is our first-level lower/upper device */
- bool neighbour;
-
/* counter for the number of times this device was added to us */
u16 ref_nr;
@@ -4385,30 +4382,47 @@ struct netdev_adjacent {
static struct netdev_adjacent *__netdev_find_adj(struct net_device *dev,
struct net_device *adj_dev,
- bool upper)
+ bool upper, bool neighbour)
{
struct netdev_adjacent *adj;
- struct list_head *dev_list;
+ struct list_head *adj_list;
- dev_list = upper ? &dev->upper_dev_list : &dev->lower_dev_list;
+ if (neighbour)
+ adj_list = upper ? &dev->adj_list.upper :
+ &dev->adj_list.lower;
+ else
+ adj_list = upper ? &dev->all_adj_list.upper :
+ &dev->all_adj_list.lower;
- list_for_each_entry(adj, dev_list, list) {
+ list_for_each_entry(adj, adj_list, list) {
if (adj->dev == adj_dev)
return adj;
}
return NULL;
}
-static inline struct netdev_adjacent *__netdev_find_upper(struct net_device *dev,
- struct net_device *udev)
+static struct netdev_adjacent *__netdev_all_upper_find(struct net_device *dev,
+ struct net_device *udev)
+{
+ return __netdev_find_adj(dev, udev, true, false);
+}
+
+static struct netdev_adjacent *__netdev_all_lower_find(struct net_device *dev,
+ struct net_device *ldev)
{
- return __netdev_find_adj(dev, udev, true);
+ return __netdev_find_adj(dev, ldev, false, false);
}
-static inline struct netdev_adjacent *__netdev_find_lower(struct net_device *dev,
- struct net_device *ldev)
+static struct netdev_adjacent *__netdev_upper_find(struct net_device *dev,
+ struct net_device *udev)
{
- return __netdev_find_adj(dev, ldev, false);
+ return __netdev_find_adj(dev, udev, true, true);
+}
+
+static struct netdev_adjacent *__netdev_lower_find(struct net_device *dev,
+ struct net_device *ldev)
+{
+ return __netdev_find_adj(dev, ldev, false, true);
}
/**
@@ -4425,7 +4439,7 @@ bool netdev_has_upper_dev(struct net_device *dev,
{
ASSERT_RTNL();
- return __netdev_find_upper(dev, upper_dev);
+ return __netdev_all_upper_find(dev, upper_dev);
}
EXPORT_SYMBOL(netdev_has_upper_dev);
@@ -4440,7 +4454,7 @@ bool netdev_has_any_upper_dev(struct net_device *dev)
{
ASSERT_RTNL();
- return !list_empty(&dev->upper_dev_list);
+ return !list_empty(&dev->all_adj_list.upper);
}
EXPORT_SYMBOL(netdev_has_any_upper_dev);
@@ -4457,10 +4471,10 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
ASSERT_RTNL();
- if (list_empty(&dev->upper_dev_list))
+ if (list_empty(&dev->adj_list.upper))
return NULL;
- upper = list_first_entry(&dev->upper_dev_list,
+ upper = list_first_entry(&dev->adj_list.upper,
struct netdev_adjacent, list);
if (likely(upper->master))
return upper->dev;
@@ -4468,15 +4482,15 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
}
EXPORT_SYMBOL(netdev_master_upper_dev_get);
-/* netdev_upper_get_next_dev_rcu - Get the next dev from upper list
+/* netdev_all_upper_get_next_dev_rcu - Get the next dev from upper list
* @dev: device
* @iter: list_head ** of the current position
*
* Gets the next device from the dev's upper list, starting from iter
* position. The caller must hold RCU read lock.
*/
-struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
- struct list_head **iter)
+struct net_device *netdev_all_upper_get_next_dev_rcu(struct net_device *dev,
+ struct list_head **iter)
{
struct netdev_adjacent *upper;
@@ -4484,14 +4498,14 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
upper = list_entry_rcu((*iter)->next, struct netdev_adjacent, list);
- if (&upper->list == &dev->upper_dev_list)
+ if (&upper->list == &dev->all_adj_list.upper)
return NULL;
*iter = &upper->list;
return upper->dev;
}
-EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu);
+EXPORT_SYMBOL(netdev_all_upper_get_next_dev_rcu);
/**
* netdev_master_upper_dev_get_rcu - Get master upper device
@@ -4504,7 +4518,7 @@ struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev)
{
struct netdev_adjacent *upper;
- upper = list_first_or_null_rcu(&dev->upper_dev_list,
+ upper = list_first_or_null_rcu(&dev->adj_list.upper,
struct netdev_adjacent, list);
if (upper && likely(upper->master))
return upper->dev;
@@ -4517,11 +4531,12 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
bool neighbour, bool master,
bool upper)
{
- struct netdev_adjacent *adj;
+ struct netdev_adjacent *adj, *neigh = NULL;
- adj = __netdev_find_adj(dev, adj_dev, upper);
+ adj = __netdev_find_adj(dev, adj_dev, upper, false);
if (adj) {
+ /* we cannot insert a neighbour device twice */
BUG_ON(neighbour);
adj->ref_nr++;
return 0;
@@ -4533,58 +4548,103 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj->dev = adj_dev;
adj->master = master;
- adj->neighbour = neighbour;
adj->ref_nr = 1;
-
dev_hold(adj_dev);
+
+ if (neighbour) {
+ neigh = kmalloc(sizeof(*neigh), GFP_KERNEL);
+ if (!neigh) {
+ kfree(adj);
+ return -ENOMEM;
+ }
+ neigh->dev = adj_dev;
+ neigh->master = master;
+ neigh->ref_nr = 1;
+ dev_hold(adj_dev);
+ }
+
pr_debug("dev_hold for %s, because of %s link added from %s to %s\n",
adj_dev->name, upper ? "upper" : "lower", dev->name,
adj_dev->name);
+ if (neigh)
+ pr_debug("dev_hold for %s, because of %s link added from %s to %s (neighbour)\n",
+ adj_dev->name, upper ? "upper" : "lower", dev->name,
+ adj_dev->name);
if (!upper) {
- list_add_tail_rcu(&adj->list, &dev->lower_dev_list);
+ if (neigh)
+ list_add_tail_rcu(&neigh->list,
+ &dev->adj_list.lower);
+ list_add_tail_rcu(&adj->list, &dev->all_adj_list.lower);
return 0;
}
/* Ensure that master upper link is always the first item in list. */
- if (master)
- list_add_rcu(&adj->list, &dev->upper_dev_list);
- else
- list_add_tail_rcu(&adj->list, &dev->upper_dev_list);
+ if (master) {
+ if (neigh)
+ list_add_rcu(&neigh->list,
+ &dev->adj_list.upper);
+ list_add_rcu(&adj->list, &dev->all_adj_list.upper);
+ } else {
+ if (neigh)
+ list_add_tail_rcu(&neigh->list,
+ &dev->adj_list.upper);
+ list_add_tail_rcu(&adj->list, &dev->all_adj_list.upper);
+ }
return 0;
}
-static inline int __netdev_upper_dev_insert(struct net_device *dev,
- struct net_device *udev,
- bool master, bool neighbour)
+static int __netdev_upper_dev_insert(struct net_device *dev,
+ struct net_device *udev,
+ bool master, bool neighbour)
{
return __netdev_adjacent_dev_insert(dev, udev, neighbour, master,
true);
}
-static inline int __netdev_lower_dev_insert(struct net_device *dev,
- struct net_device *ldev,
- bool neighbour)
+static int __netdev_lower_dev_insert(struct net_device *dev,
+ struct net_device *ldev,
+ bool neighbour)
{
return __netdev_adjacent_dev_insert(dev, ldev, neighbour, false,
false);
}
void __netdev_adjacent_dev_remove(struct net_device *dev,
- struct net_device *adj_dev, bool upper)
+ struct net_device *adj_dev, bool upper,
+ bool is_neigh)
{
- struct netdev_adjacent *adj;
+ struct netdev_adjacent *adj, *neighbour;
- if (upper)
- adj = __netdev_find_upper(dev, adj_dev);
- else
- adj = __netdev_find_lower(dev, adj_dev);
+ if (upper) {
+ adj = __netdev_all_upper_find(dev, adj_dev);
+ neighbour = __netdev_upper_find(dev, adj_dev);
+ } else {
+ adj = __netdev_all_lower_find(dev, adj_dev);
+ neighbour = __netdev_lower_find(dev, adj_dev);
+ }
- if (!adj)
+ if (!adj) {
+ pr_err("tried to remove %s device %s from %s\n",
+ upper ? "upper" : "lower", dev->name, adj_dev->name);
BUG();
+ }
+
+ if (is_neigh) {
+ BUG_ON(!neighbour || neighbour->ref_nr > 1);
+ pr_debug("dev_put for %s, because of %s link removed from %s to %s (neighbour)\n",
+ adj_dev->name, upper ? "upper" : "lower", dev->name,
+ adj_dev->name);
+ list_del_rcu(&neighbour->list);
+ dev_put(adj_dev);
+ kfree_rcu(neighbour, rcu);
+ }
if (adj->ref_nr > 1) {
+ pr_debug("rec_cnt-- for link to %s, because of %s link removed from %s to %s, remains %d\n",
+ adj_dev->name, upper ? "upper" : "lower", dev->name,
+ adj_dev->name, adj->ref_nr-1);
adj->ref_nr--;
return;
}
@@ -4600,13 +4660,25 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
static inline void __netdev_upper_dev_remove(struct net_device *dev,
struct net_device *udev)
{
- return __netdev_adjacent_dev_remove(dev, udev, true);
+ return __netdev_adjacent_dev_remove(dev, udev, true, false);
+}
+
+static inline void __netdev_upper_dev_remove_neighbour(struct net_device *dev,
+ struct net_device *udev)
+{
+ return __netdev_adjacent_dev_remove(dev, udev, true, true);
}
static inline void __netdev_lower_dev_remove(struct net_device *dev,
struct net_device *ldev)
{
- return __netdev_adjacent_dev_remove(dev, ldev, false);
+ return __netdev_adjacent_dev_remove(dev, ldev, false, false);
+}
+
+static inline void __netdev_lower_dev_remove_neighbour(struct net_device *dev,
+ struct net_device *ldev)
+{
+ return __netdev_adjacent_dev_remove(dev, ldev, false, true);
}
int __netdev_adjacent_dev_insert_link(struct net_device *dev,
@@ -4648,6 +4720,13 @@ void __netdev_adjacent_dev_unlink(struct net_device *dev,
__netdev_lower_dev_remove(upper_dev, dev);
}
+void __netdev_adjacent_dev_unlink_neighbour(struct net_device *dev,
+ struct net_device *upper_dev)
+{
+ __netdev_upper_dev_remove_neighbour(dev, upper_dev);
+ __netdev_lower_dev_remove_neighbour(upper_dev, dev);
+}
+
static int __netdev_upper_dev_link(struct net_device *dev,
struct net_device *upper_dev, bool master)
@@ -4661,10 +4740,10 @@ static int __netdev_upper_dev_link(struct net_device *dev,
return -EBUSY;
/* To prevent loops, check if dev is not upper device to upper_dev. */
- if (__netdev_find_upper(upper_dev, dev))
+ if (__netdev_all_upper_find(upper_dev, dev))
return -EBUSY;
- if (__netdev_find_upper(dev, upper_dev))
+ if (__netdev_all_upper_find(dev, upper_dev))
return -EEXIST;
if (master && netdev_master_upper_dev_get(dev))
@@ -4675,12 +4754,14 @@ static int __netdev_upper_dev_link(struct net_device *dev,
return ret;
/* Now that we linked these devs, make all the upper_dev's
- * upper_dev_list visible to every dev's lower_dev_list and vice
+ * all_adj_list.upper visible to every dev's all_adj_list.lower an
* versa, and don't forget the devices itself. All of these
* links are non-neighbours.
*/
- list_for_each_entry(i, &dev->lower_dev_list, list) {
- list_for_each_entry(j, &upper_dev->upper_dev_list, list) {
+ list_for_each_entry(i, &dev->all_adj_list.lower, list) {
+ list_for_each_entry(j, &upper_dev->all_adj_list.upper, list) {
+ pr_debug("Interlinking %s with %s, non-neighbour\n",
+ i->dev->name, j->dev->name);
ret = __netdev_adjacent_dev_link(i->dev, j->dev);
if (ret)
goto rollback_mesh;
@@ -4688,14 +4769,18 @@ static int __netdev_upper_dev_link(struct net_device *dev,
}
/* add dev to every upper_dev's upper device */
- list_for_each_entry(i, &upper_dev->upper_dev_list, list) {
+ list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
+ pr_debug("linking %s's upper device %s with %s\n",
+ upper_dev->name, i->dev->name, dev->name);
ret = __netdev_adjacent_dev_link(dev, i->dev);
if (ret)
goto rollback_upper_mesh;
}
/* add upper_dev to every dev's lower device */
- list_for_each_entry(i, &dev->lower_dev_list, list) {
+ list_for_each_entry(i, &dev->all_adj_list.lower, list) {
+ pr_debug("linking %s's lower device %s with %s\n", dev->name,
+ i->dev->name, upper_dev->name);
ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
if (ret)
goto rollback_lower_mesh;
@@ -4706,7 +4791,7 @@ static int __netdev_upper_dev_link(struct net_device *dev,
rollback_lower_mesh:
to_i = i;
- list_for_each_entry(i, &dev->lower_dev_list, list) {
+ list_for_each_entry(i, &dev->all_adj_list.lower, list) {
if (i == to_i)
break;
__netdev_adjacent_dev_unlink(i->dev, upper_dev);
@@ -4716,7 +4801,7 @@ rollback_lower_mesh:
rollback_upper_mesh:
to_i = i;
- list_for_each_entry(i, &upper_dev->upper_dev_list, list) {
+ list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
if (i == to_i)
break;
__netdev_adjacent_dev_unlink(dev, i->dev);
@@ -4727,8 +4812,8 @@ rollback_upper_mesh:
rollback_mesh:
to_i = i;
to_j = j;
- list_for_each_entry(i, &dev->lower_dev_list, list) {
- list_for_each_entry(j, &upper_dev->upper_dev_list, list) {
+ list_for_each_entry(i, &dev->all_adj_list.lower, list) {
+ list_for_each_entry(j, &upper_dev->all_adj_list.upper, list) {
if (i == to_i && j == to_j)
break;
__netdev_adjacent_dev_unlink(i->dev, j->dev);
@@ -4737,7 +4822,7 @@ rollback_mesh:
break;
}
- __netdev_adjacent_dev_unlink(dev, upper_dev);
+ __netdev_adjacent_dev_unlink_neighbour(dev, upper_dev);
return ret;
}
@@ -4791,23 +4876,23 @@ void netdev_upper_dev_unlink(struct net_device *dev,
struct netdev_adjacent *i, *j;
ASSERT_RTNL();
- __netdev_adjacent_dev_unlink(dev, upper_dev);
+ __netdev_adjacent_dev_unlink_neighbour(dev, upper_dev);
/* Here is the tricky part. We must remove all dev's lower
* devices from all upper_dev's upper devices and vice
* versa, to maintain the graph relationship.
*/
- list_for_each_entry(i, &dev->lower_dev_list, list)
- list_for_each_entry(j, &upper_dev->upper_dev_list, list)
+ list_for_each_entry(i, &dev->all_adj_list.lower, list)
+ list_for_each_entry(j, &upper_dev->all_adj_list.upper, list)
__netdev_adjacent_dev_unlink(i->dev, j->dev);
/* remove also the devices itself from lower/upper device
* list
*/
- list_for_each_entry(i, &dev->lower_dev_list, list)
+ list_for_each_entry(i, &dev->all_adj_list.lower, list)
__netdev_adjacent_dev_unlink(i->dev, upper_dev);
- list_for_each_entry(i, &upper_dev->upper_dev_list, list)
+ list_for_each_entry(i, &upper_dev->all_adj_list.upper, list)
__netdev_adjacent_dev_unlink(dev, i->dev);
call_netdevice_notifiers(NETDEV_CHANGEUPPER, dev);
@@ -6069,8 +6154,10 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
INIT_LIST_HEAD(&dev->napi_list);
INIT_LIST_HEAD(&dev->unreg_list);
INIT_LIST_HEAD(&dev->link_watch_list);
- INIT_LIST_HEAD(&dev->upper_dev_list);
- INIT_LIST_HEAD(&dev->lower_dev_list);
+ INIT_LIST_HEAD(&dev->adj_list.upper);
+ INIT_LIST_HEAD(&dev->adj_list.lower);
+ INIT_LIST_HEAD(&dev->all_adj_list.upper);
+ INIT_LIST_HEAD(&dev->all_adj_list.lower);
dev->priv_flags = IFF_XMIT_DST_RELEASE;
setup(dev);
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 04/27] net: add netdev_adjacent->private and allow to use it
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently, even though we can access any linked device, we can't attach
anything to it, which is vital to properly manage them.
To fix this, add a new void *private to netdev_adjacent and functions
setting/getting it (per link), so that we can save, per example, bonding's
slave structures there, per slave device.
netdev_master_upper_dev_link_private(dev, upper_dev, private) links dev to
upper dev and populates the neighbour link only with private.
netdev_lower_dev_get_private{,_rcu}() returns the private, if found.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
rename neighbour_dev_list/all_dev_list to adj_list/all_adj_list and rework
the naming of functions - to make the use the pattern
netdev_(all_)(lower|upper)_* .
include/linux/netdevice.h | 7 ++++
net/core/dev.c | 83 ++++++++++++++++++++++++++++++++++++++---------
2 files changed, 75 insertions(+), 15 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2a944e5..eab8e36 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2839,8 +2839,15 @@ extern int netdev_upper_dev_link(struct net_device *dev,
struct net_device *upper_dev);
extern int netdev_master_upper_dev_link(struct net_device *dev,
struct net_device *upper_dev);
+extern int netdev_master_upper_dev_link_private(struct net_device *dev,
+ struct net_device *upper_dev,
+ void *private);
extern void netdev_upper_dev_unlink(struct net_device *dev,
struct net_device *upper_dev);
+extern void *netdev_lower_dev_get_private_rcu(struct net_device *dev,
+ struct net_device *lower_dev);
+extern void *netdev_lower_dev_get_private(struct net_device *dev,
+ struct net_device *lower_dev);
extern int skb_checksum_help(struct sk_buff *skb);
extern struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
netdev_features_t features, bool tx_path);
diff --git a/net/core/dev.c b/net/core/dev.c
index d277081..6b2b12f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4376,6 +4376,9 @@ struct netdev_adjacent {
/* counter for the number of times this device was added to us */
u16 ref_nr;
+ /* private field for the users */
+ void *private;
+
struct list_head list;
struct rcu_head rcu;
};
@@ -4556,7 +4559,7 @@ EXPORT_SYMBOL(netdev_master_upper_dev_get_rcu);
static int __netdev_adjacent_dev_insert(struct net_device *dev,
struct net_device *adj_dev,
bool neighbour, bool master,
- bool upper)
+ bool upper, void *private)
{
struct netdev_adjacent *adj, *neigh = NULL;
@@ -4599,9 +4602,15 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj_dev->name);
if (!upper) {
- if (neigh)
+ if (neigh) {
+ /* we're backlinging the master device to its
+ * slave, so save the private in this link.
+ */
+ if (master)
+ neigh->private = private;
list_add_tail_rcu(&neigh->list,
&dev->adj_list.lower);
+ }
list_add_tail_rcu(&adj->list, &dev->all_adj_list.lower);
return 0;
}
@@ -4627,15 +4636,16 @@ static int __netdev_upper_dev_insert(struct net_device *dev,
bool master, bool neighbour)
{
return __netdev_adjacent_dev_insert(dev, udev, neighbour, master,
- true);
+ true, NULL);
}
static int __netdev_lower_dev_insert(struct net_device *dev,
struct net_device *ldev,
- bool neighbour)
+ bool master, bool neighbour,
+ void *private)
{
- return __netdev_adjacent_dev_insert(dev, ldev, neighbour, false,
- false);
+ return __netdev_adjacent_dev_insert(dev, ldev, neighbour, master,
+ false, private);
}
void __netdev_adjacent_dev_remove(struct net_device *dev,
@@ -4710,7 +4720,8 @@ static void __netdev_lower_dev_remove_neighbour(struct net_device *dev,
int __netdev_adjacent_dev_insert_link(struct net_device *dev,
struct net_device *upper_dev,
- bool master, bool neighbour)
+ bool master, bool neighbour,
+ void *private)
{
int ret;
@@ -4718,7 +4729,8 @@ int __netdev_adjacent_dev_insert_link(struct net_device *dev,
if (ret)
return ret;
- ret = __netdev_lower_dev_insert(upper_dev, dev, neighbour);
+ ret = __netdev_lower_dev_insert(upper_dev, dev, master, neighbour,
+ private);
if (ret) {
__netdev_upper_dev_remove(dev, upper_dev);
return ret;
@@ -4730,14 +4742,15 @@ int __netdev_adjacent_dev_insert_link(struct net_device *dev,
static int __netdev_adjacent_dev_link(struct net_device *dev,
struct net_device *udev)
{
- return __netdev_adjacent_dev_insert_link(dev, udev, false, false);
+ return __netdev_adjacent_dev_insert_link(dev, udev, false, false, NULL);
}
static int __netdev_adjacent_dev_link_neighbour(struct net_device *dev,
struct net_device *udev,
- bool master)
+ bool master, void *priv)
{
- return __netdev_adjacent_dev_insert_link(dev, udev, master, true);
+ return __netdev_adjacent_dev_insert_link(dev, udev, master, true,
+ priv);
}
void __netdev_adjacent_dev_unlink(struct net_device *dev,
@@ -4756,7 +4769,8 @@ void __netdev_adjacent_dev_unlink_neighbour(struct net_device *dev,
static int __netdev_upper_dev_link(struct net_device *dev,
- struct net_device *upper_dev, bool master)
+ struct net_device *upper_dev, bool master,
+ void *private)
{
struct netdev_adjacent *i, *j, *to_i, *to_j;
int ret = 0;
@@ -4776,7 +4790,8 @@ static int __netdev_upper_dev_link(struct net_device *dev,
if (master && netdev_master_upper_dev_get(dev))
return -EBUSY;
- ret = __netdev_adjacent_dev_link_neighbour(dev, upper_dev, master);
+ ret = __netdev_adjacent_dev_link_neighbour(dev, upper_dev, master,
+ private);
if (ret)
return ret;
@@ -4867,7 +4882,7 @@ rollback_mesh:
int netdev_upper_dev_link(struct net_device *dev,
struct net_device *upper_dev)
{
- return __netdev_upper_dev_link(dev, upper_dev, false);
+ return __netdev_upper_dev_link(dev, upper_dev, false, NULL);
}
EXPORT_SYMBOL(netdev_upper_dev_link);
@@ -4885,10 +4900,18 @@ EXPORT_SYMBOL(netdev_upper_dev_link);
int netdev_master_upper_dev_link(struct net_device *dev,
struct net_device *upper_dev)
{
- return __netdev_upper_dev_link(dev, upper_dev, true);
+ return __netdev_upper_dev_link(dev, upper_dev, true, NULL);
}
EXPORT_SYMBOL(netdev_master_upper_dev_link);
+int netdev_master_upper_dev_link_private(struct net_device *dev,
+ struct net_device *upper_dev,
+ void *private)
+{
+ return __netdev_upper_dev_link(dev, upper_dev, true, private);
+}
+EXPORT_SYMBOL(netdev_master_upper_dev_link_private);
+
/**
* netdev_upper_dev_unlink - Removes a link to upper device
* @dev: device
@@ -4926,6 +4949,36 @@ void netdev_upper_dev_unlink(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_upper_dev_unlink);
+void *netdev_lower_dev_get_private_rcu(struct net_device *dev,
+ struct net_device *lower_dev)
+{
+ struct netdev_adjacent *lower;
+
+ if (!lower_dev)
+ return NULL;
+ lower = __netdev_lower_find_rcu(dev, lower_dev);
+ if (!lower)
+ return NULL;
+
+ return lower->private;
+}
+EXPORT_SYMBOL(netdev_lower_dev_get_private_rcu);
+
+void *netdev_lower_dev_get_private(struct net_device *dev,
+ struct net_device *lower_dev)
+{
+ struct netdev_adjacent *lower;
+
+ if (!lower_dev)
+ return NULL;
+ lower = __netdev_lower_find(dev, lower_dev);
+ if (!lower)
+ return NULL;
+
+ return lower->private;
+}
+EXPORT_SYMBOL(netdev_lower_dev_get_private);
+
static void dev_change_rx_flags(struct net_device *dev, int flags)
{
const struct net_device_ops *ops = dev->netdev_ops;
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 00/27] bonding: use neighbours instead of own lists
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Jay Vosburgh, Andy Gospodarek, Dimitris Michailidis,
David S. Miller, Patrick McHardy, Eric Dumazet, Alexander Duyck,
Veaceslav Falico
This patchset introduces all the needed infrastructure, on top of current
adjacent lists, to be able to remove bond's slave_list/slave->list. The
overhead in memory/CPU is minimal, and after the patchset bonding can rely
on its slave-related functions, given the proper locking. I've done some
netperf benchmarks on a vm, and the delta was about 0.1gbps for 35gbps as a
whole, so no speed fluctuations.
It also automatically creates lower/upper and master symlinks in dev's
sysfs directory.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Dimitris Michailidis <dm@chelsio.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
v2 -> v3:
Fix the commit subject for patch 08 - forgot the _continue in
bond_for_each_slave_continue_reverse(), suggested by Cong Wang.
This version is already ready for the inclusion, hopefully.
Moved the changelog in cover letter to the bottom and cleaned it
up a bit.
v1 -> v2:
Fix a bug when we could remove try to remove a neighbour twice - it was
possible if we've had this neighbour already and added another devices
which had this neighbour in its lists. Fix it by following the rule - there
can be only one neighbour link, so add also _remove_neighbour() functions
to the generic _remove()s to handle the neighbour only removing, and fix
its refcounting while adding (all in patch 01).
Also, fix a vlan bug when it called unregister netdevice before unlinking -
it could possibly lead to a subte race condition - cause it was
unregistered but still in global-accessible list.
Rearrange sysfs/vlan patches so that first we fix vlan and only after we
add sysfs stuff.
RFC -> v1:
I've added proper, consistent naming for all variables/functions, uninlined
some helpers to get better backtraces, just in case (overhead is minimal,
no hot paths), rearranged patches for better review, dropped bondings
->prev and bond_for_each_slave_continue() functionality - to be able to
RCUify it easier, and renamed slave_* sysfs links to lower_* sysfs links to
maintain upper/lower naming. I've also dropped, thanks to bonding cleanup,
some heavy and ugly/intrusive patches.
drivers/net/bonding/bond_3ad.c | 54 +--
drivers/net/bonding/bond_alb.c | 81 ++--
drivers/net/bonding/bond_alb.h | 4 +-
drivers/net/bonding/bond_main.c | 296 ++++++++-------
drivers/net/bonding/bond_procfs.c | 5 +-
drivers/net/bonding/bond_sysfs.c | 62 +--
drivers/net/bonding/bonding.h | 74 ++--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +-
include/linux/netdevice.h | 57 ++-
net/8021q/vlan.c | 18 +-
net/core/dev.c | 482 +++++++++++++++++++-----
11 files changed, 724 insertions(+), 412 deletions(-)
^ permalink raw reply
* [PATCH v3 net-next 06/27] bonding: modify bond_get_slave_by_dev() to use neighbours
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
It should be used under rtnl/bonding lock, so use the non-RCU version.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Move the patch right after we've populated ->private with bonding.
drivers/net/bonding/bonding.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index f7ab161..6d1a893 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -275,13 +275,8 @@ struct bonding {
static inline struct slave *bond_get_slave_by_dev(struct bonding *bond,
struct net_device *slave_dev)
{
- struct slave *slave = NULL;
-
- bond_for_each_slave(bond, slave)
- if (slave->dev == slave_dev)
- return slave;
-
- return NULL;
+ return (struct slave *)netdev_lower_dev_get_private(bond->dev,
+ slave_dev);
}
static inline struct bonding *bond_get_bond_by_slave(struct slave *slave)
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 07/27] net: add for_each iterators through neighbour lower link's private
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Add a possibility to iterate through netdev_adjacent's private, currently
only for lower neighbours.
Add both RCU and RTNL/other locking variants of iterators.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
rename neighbour_dev_list/all_dev_list to adj_list/all_adj_list and rework
the naming of functions - to make the use the pattern
netdev_(all_)(lower|upper)_* . Also reword the commit message - I had to
read it three times to understand :-/, and move it closer to the beginning
of the patchset.
include/linux/netdevice.h | 17 ++++++++++++
net/core/dev.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index eab8e36..48431f0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2833,6 +2833,23 @@ extern struct net_device *netdev_all_upper_get_next_dev_rcu(struct net_device *d
updev; \
updev = netdev_all_upper_get_next_dev_rcu(dev, &(iter)))
+extern void *netdev_lower_get_next_private(struct net_device *dev,
+ struct list_head **iter);
+extern void *netdev_lower_get_next_private_rcu(struct net_device *dev,
+ struct list_head **iter);
+
+#define netdev_for_each_lower_private(dev, priv, iter) \
+ for (iter = &(dev)->adj_list.lower, \
+ priv = netdev_lower_get_next_private(dev, &(iter)); \
+ priv; \
+ priv = netdev_lower_get_next_private(dev, &(iter)))
+
+#define netdev_for_each_lower_private_rcu(dev, priv, iter) \
+ for (iter = &(dev)->adj_list.lower, \
+ priv = netdev_lower_get_next_private_rcu(dev, &(iter)); \
+ priv; \
+ priv = netdev_lower_get_next_private_rcu(dev, &(iter)))
+
extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
extern int netdev_upper_dev_link(struct net_device *dev,
diff --git a/net/core/dev.c b/net/core/dev.c
index 6b2b12f..11284c8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4537,6 +4537,72 @@ struct net_device *netdev_all_upper_get_next_dev_rcu(struct net_device *dev,
}
EXPORT_SYMBOL(netdev_all_upper_get_next_dev_rcu);
+/* netdev_lower_get_next_private - Get the next ->private from the
+ * lower neighbour list
+ * @dev: device
+ * @iter: list_head ** of the current position
+ *
+ * Gets the next netdev_adjacent->private from the dev's lower neighbour
+ * list, starting from iter position. The caller must hold either hold the
+ * RTNL lock or its own locking that guarantees that the neighbour lower
+ * list will remain unchainged. If iter is NULL - return the first private.
+ */
+void *netdev_lower_get_next_private(struct net_device *dev,
+ struct list_head **iter)
+{
+ struct netdev_adjacent *lower;
+
+ if (iter)
+ lower = list_entry((*iter)->next, struct netdev_adjacent,
+ list);
+ else
+ lower = list_entry(dev->adj_list.lower.next,
+ struct netdev_adjacent, list);
+
+ if (&lower->list == &dev->adj_list.lower)
+ return NULL;
+
+ if (iter)
+ *iter = &lower->list;
+
+ return lower->private;
+}
+EXPORT_SYMBOL(netdev_lower_get_next_private);
+
+/* netdev_lower_get_next_private_rcu - Get the next ->private from the
+ * lower neighbour list, RCU
+ * variant
+ * @dev: device
+ * @iter: list_head ** of the current position
+ *
+ * Gets the next netdev_adjacent->private from the dev's lower neighbour
+ * list, starting from iter position. The caller must hold RCU read lock.
+ * If iter is NULL - returns the first private.
+ */
+void *netdev_lower_get_next_private_rcu(struct net_device *dev,
+ struct list_head **iter)
+{
+ struct netdev_adjacent *lower;
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+
+ if (iter)
+ lower = list_entry_rcu((*iter)->next, struct netdev_adjacent,
+ list);
+ else
+ lower = list_entry_rcu(dev->adj_list.lower.next,
+ struct netdev_adjacent, list);
+
+ if (&lower->list == &dev->adj_list.lower)
+ return NULL;
+
+ if (iter)
+ *iter = &lower->list;
+
+ return lower->private;
+}
+EXPORT_SYMBOL(netdev_lower_get_next_private_rcu);
+
/**
* netdev_master_upper_dev_get_rcu - Get master upper device
* @dev: device
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 08/27] bonding: remove bond_for_each_slave_continue_reverse()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
We only use it in rollback scenarios and can easily use the standart
bond_for_each_dev() instead.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
Add the forgotten _continue to the subject.
v1 -> v2:
No changes.
RFC -> v1:
Move the patch right before we change bond_for_each_slave() conversion,
so that we don't have to convert bond_for_each_slave_reverse() too.
drivers/net/bonding/bond_alb.c | 14 ++++++++------
drivers/net/bonding/bond_main.c | 34 ++++++++++++++++++++++------------
drivers/net/bonding/bonding.h | 10 ----------
3 files changed, 30 insertions(+), 28 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index c3dcc6b..46f6b40 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1246,9 +1246,9 @@ static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
*/
static int alb_set_mac_address(struct bonding *bond, void *addr)
{
- char tmp_addr[ETH_ALEN];
- struct slave *slave;
+ struct slave *slave, *rollback_slave;
struct sockaddr sa;
+ char tmp_addr[ETH_ALEN];
int res;
if (bond->alb_info.rlb_enabled)
@@ -1274,10 +1274,12 @@ unwind:
sa.sa_family = bond->dev->type;
/* unwind from head to the slave that failed */
- bond_for_each_slave_continue_reverse(bond, slave) {
- memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
- dev_set_mac_address(slave->dev, &sa);
- memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN);
+ bond_for_each_slave(bond, rollback_slave) {
+ if (rollback_slave == slave)
+ break;
+ memcpy(tmp_addr, rollback_slave->dev->dev_addr, ETH_ALEN);
+ dev_set_mac_address(rollback_slave->dev, &sa);
+ memcpy(rollback_slave->dev->dev_addr, tmp_addr, ETH_ALEN);
}
return res;
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f862dc8..3956478 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -332,7 +332,7 @@ static int bond_vlan_rx_add_vid(struct net_device *bond_dev,
__be16 proto, u16 vid)
{
struct bonding *bond = netdev_priv(bond_dev);
- struct slave *slave;
+ struct slave *slave, *rollback_slave;
int res;
bond_for_each_slave(bond, slave) {
@@ -344,9 +344,13 @@ static int bond_vlan_rx_add_vid(struct net_device *bond_dev,
return 0;
unwind:
- /* unwind from the slave that failed */
- bond_for_each_slave_continue_reverse(bond, slave)
- vlan_vid_del(slave->dev, proto, vid);
+ /* unwind to the slave that failed */
+ bond_for_each_slave(bond, rollback_slave) {
+ if (rollback_slave == slave)
+ break;
+
+ vlan_vid_del(rollback_slave->dev, proto, vid);
+ }
return res;
}
@@ -3468,7 +3472,7 @@ static int bond_neigh_setup(struct net_device *dev,
static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
{
struct bonding *bond = netdev_priv(bond_dev);
- struct slave *slave;
+ struct slave *slave, *rollback_slave;
int res = 0;
pr_debug("bond=%p, name=%s, new_mtu=%d\n", bond,
@@ -3517,13 +3521,16 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
unwind:
/* unwind from head to the slave that failed */
- bond_for_each_slave_continue_reverse(bond, slave) {
+ bond_for_each_slave(bond, rollback_slave) {
int tmp_res;
- tmp_res = dev_set_mtu(slave->dev, bond_dev->mtu);
+ if (rollback_slave == slave);
+ break;
+
+ tmp_res = dev_set_mtu(rollback_slave->dev, bond_dev->mtu);
if (tmp_res) {
pr_debug("unwind err %d dev %s\n",
- tmp_res, slave->dev->name);
+ tmp_res, rollback_slave->dev->name);
}
}
@@ -3540,8 +3547,8 @@ unwind:
static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct slave *slave, *rollback_slave;
struct sockaddr *sa = addr, tmp_sa;
- struct slave *slave;
int res = 0;
if (bond->params.mode == BOND_MODE_ALB)
@@ -3607,13 +3614,16 @@ unwind:
tmp_sa.sa_family = bond_dev->type;
/* unwind from head to the slave that failed */
- bond_for_each_slave_continue_reverse(bond, slave) {
+ bond_for_each_slave(bond, rollback_slave) {
int tmp_res;
- tmp_res = dev_set_mac_address(slave->dev, &tmp_sa);
+ if (rollback_slave == slave)
+ break;
+
+ tmp_res = dev_set_mac_address(rollback_slave->dev, &tmp_sa);
if (tmp_res) {
pr_debug("unwind err %d dev %s\n",
- tmp_res, slave->dev->name);
+ tmp_res, rollback_slave->dev->name);
}
}
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 6d1a893..141ddc3 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -120,16 +120,6 @@
#define bond_for_each_slave_rcu(bond, pos) \
list_for_each_entry_rcu(pos, &(bond)->slave_list, list)
-/**
- * bond_for_each_slave_reverse - iterate in reverse from a given position
- * @bond: the bond holding this list
- * @pos: slave to continue from
- *
- * Caller must hold bond->lock
- */
-#define bond_for_each_slave_continue_reverse(bond, pos) \
- list_for_each_entry_continue_reverse(pos, &(bond)->slave_list, list)
-
#ifdef CONFIG_NET_POLL_CONTROLLER
extern atomic_t netpoll_block_tx;
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 10/27] bonding: use bond_for_each_slave() in bond_uninit()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
We're safe agains removal there, cause we use neighbours primitives.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Move the patch rigth after we start using neighbour lists for
bond_for_each_slave().
drivers/net/bonding/bond_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index cdd5c5f..2075321 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4090,12 +4090,13 @@ static void bond_setup(struct net_device *bond_dev)
static void bond_uninit(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
- struct slave *slave, *tmp_slave;
+ struct list_head *iter;
+ struct slave *slave;
bond_netpoll_cleanup(bond_dev);
/* Release the bonded slaves */
- list_for_each_entry_safe(slave, tmp_slave, &bond->slave_list, list)
+ bond_for_each_slave(bond, slave, iter)
__bond_release_one(bond_dev, slave->dev, true);
pr_info("%s: released all slaves\n", bond_dev->name);
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 09/27] bonding: make bond_for_each_slave() use lower neighbour's private
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek,
Dimitris Michailidis
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
It needs a list_head *iter, so add it wherever needed. Use both non-rcu and
rcu variants.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Move the patch right after we're ready to use it - for easier review.
Follow renaming strategy.
Add cxgb4 driver - it also began to use bond_for_each_slave().
drivers/net/bonding/bond_3ad.c | 6 +-
drivers/net/bonding/bond_alb.c | 18 ++++--
drivers/net/bonding/bond_main.c | 86 ++++++++++++++++---------
drivers/net/bonding/bond_procfs.c | 5 +-
drivers/net/bonding/bond_sysfs.c | 23 ++++---
drivers/net/bonding/bonding.h | 12 ++--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +-
7 files changed, 98 insertions(+), 55 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 0d8f427..3847aee 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2419,6 +2419,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
{
struct slave *slave, *start_at;
struct bonding *bond = netdev_priv(dev);
+ struct list_head *iter;
int slave_agg_no;
int slaves_in_agg;
int agg_id;
@@ -2444,7 +2445,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
if (agg && (agg->aggregator_identifier == agg_id)) {
@@ -2515,11 +2516,12 @@ int bond_3ad_lacpdu_recv(const struct sk_buff *skb, struct bonding *bond,
void bond_3ad_update_lacp_rate(struct bonding *bond)
{
struct port *port = NULL;
+ struct list_head *iter;
struct slave *slave;
int lacp_fast;
lacp_fast = bond->params.lacp_fast;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
port = &(SLAVE_AD_INFO(slave).port);
__get_state_machine_lock(port);
if (lacp_fast)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 46f6b40..caa437d 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -223,13 +223,14 @@ static long long compute_gap(struct slave *slave)
static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
{
struct slave *slave, *least_loaded;
+ struct list_head *iter;
long long max_gap;
least_loaded = NULL;
max_gap = LLONG_MIN;
/* Find the slave with the largest gap */
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (SLAVE_IS_OK(slave)) {
long long gap = compute_gap(slave);
@@ -1172,8 +1173,9 @@ static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *sla
*/
static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
{
- struct slave *tmp_slave1, *free_mac_slave = NULL;
struct slave *has_bond_addr = bond->curr_active_slave;
+ struct slave *tmp_slave1, *free_mac_slave = NULL;
+ struct list_head *iter;
if (list_empty(&bond->slave_list)) {
/* this is the first slave */
@@ -1196,7 +1198,7 @@ static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
/* The slave's address is equal to the address of the bond.
* Search for a spare address in the bond for this slave.
*/
- bond_for_each_slave(bond, tmp_slave1) {
+ bond_for_each_slave(bond, tmp_slave1, iter) {
if (!bond_slave_has_mac(bond, tmp_slave1->perm_hwaddr)) {
/* no slave has tmp_slave1's perm addr
* as its curr addr
@@ -1247,6 +1249,7 @@ static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
static int alb_set_mac_address(struct bonding *bond, void *addr)
{
struct slave *slave, *rollback_slave;
+ struct list_head *iter;
struct sockaddr sa;
char tmp_addr[ETH_ALEN];
int res;
@@ -1254,7 +1257,7 @@ static int alb_set_mac_address(struct bonding *bond, void *addr)
if (bond->alb_info.rlb_enabled)
return 0;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
/* save net_device's current hw address */
memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
@@ -1274,7 +1277,7 @@ unwind:
sa.sa_family = bond->dev->type;
/* unwind from head to the slave that failed */
- bond_for_each_slave(bond, rollback_slave) {
+ bond_for_each_slave(bond, rollback_slave, iter) {
if (rollback_slave == slave)
break;
memcpy(tmp_addr, rollback_slave->dev->dev_addr, ETH_ALEN);
@@ -1460,6 +1463,7 @@ void bond_alb_monitor(struct work_struct *work)
struct bonding *bond = container_of(work, struct bonding,
alb_work.work);
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ struct list_head *iter;
struct slave *slave;
read_lock(&bond->lock);
@@ -1482,7 +1486,7 @@ void bond_alb_monitor(struct work_struct *work)
*/
read_lock(&bond->curr_slave_lock);
- bond_for_each_slave(bond, slave)
+ bond_for_each_slave(bond, slave, iter)
alb_send_learning_packets(slave, slave->dev->dev_addr);
read_unlock(&bond->curr_slave_lock);
@@ -1495,7 +1499,7 @@ void bond_alb_monitor(struct work_struct *work)
read_lock(&bond->curr_slave_lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
tlb_clear_slave(bond, slave, 1);
if (slave == bond->curr_active_slave) {
SLAVE_TLB_INFO(slave).load =
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3956478..cdd5c5f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -333,9 +333,10 @@ static int bond_vlan_rx_add_vid(struct net_device *bond_dev,
{
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave, *rollback_slave;
+ struct list_head *iter;
int res;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
res = vlan_vid_add(slave->dev, proto, vid);
if (res)
goto unwind;
@@ -345,7 +346,7 @@ static int bond_vlan_rx_add_vid(struct net_device *bond_dev,
unwind:
/* unwind to the slave that failed */
- bond_for_each_slave(bond, rollback_slave) {
+ bond_for_each_slave(bond, rollback_slave, iter) {
if (rollback_slave == slave)
break;
@@ -364,9 +365,10 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
__be16 proto, u16 vid)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct list_head *iter;
struct slave *slave;
- bond_for_each_slave(bond, slave)
+ bond_for_each_slave(bond, slave, iter)
vlan_vid_del(slave->dev, proto, vid);
if (bond_is_lb(bond))
@@ -386,6 +388,7 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
*/
static int bond_set_carrier(struct bonding *bond)
{
+ struct list_head *iter;
struct slave *slave;
if (list_empty(&bond->slave_list))
@@ -394,7 +397,7 @@ static int bond_set_carrier(struct bonding *bond)
if (bond->params.mode == BOND_MODE_8023AD)
return bond_3ad_set_carrier(bond);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (slave->link == BOND_LINK_UP) {
if (!netif_carrier_ok(bond->dev)) {
netif_carrier_on(bond->dev);
@@ -526,7 +529,9 @@ static int bond_check_dev_link(struct bonding *bond,
*/
static int bond_set_promiscuity(struct bonding *bond, int inc)
{
+ struct list_head *iter;
int err = 0;
+
if (USES_PRIMARY(bond->params.mode)) {
/* write lock already acquired */
if (bond->curr_active_slave) {
@@ -536,7 +541,7 @@ static int bond_set_promiscuity(struct bonding *bond, int inc)
} else {
struct slave *slave;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
err = dev_set_promiscuity(slave->dev, inc);
if (err)
return err;
@@ -550,7 +555,9 @@ static int bond_set_promiscuity(struct bonding *bond, int inc)
*/
static int bond_set_allmulti(struct bonding *bond, int inc)
{
+ struct list_head *iter;
int err = 0;
+
if (USES_PRIMARY(bond->params.mode)) {
/* write lock already acquired */
if (bond->curr_active_slave) {
@@ -560,7 +567,7 @@ static int bond_set_allmulti(struct bonding *bond, int inc)
} else {
struct slave *slave;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
err = dev_set_allmulti(slave->dev, inc);
if (err)
return err;
@@ -1050,9 +1057,10 @@ static void bond_poll_controller(struct net_device *bond_dev)
static void bond_netpoll_cleanup(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct list_head *iter;
struct slave *slave;
- bond_for_each_slave(bond, slave)
+ bond_for_each_slave(bond, slave, iter)
if (IS_UP(slave->dev))
slave_disable_netpoll(slave);
}
@@ -1060,10 +1068,11 @@ static void bond_netpoll_cleanup(struct net_device *bond_dev)
static int bond_netpoll_setup(struct net_device *dev, struct netpoll_info *ni, gfp_t gfp)
{
struct bonding *bond = netdev_priv(dev);
+ struct list_head *iter;
struct slave *slave;
int err = 0;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
err = slave_enable_netpoll(slave);
if (err) {
bond_netpoll_cleanup(dev);
@@ -1091,6 +1100,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
netdev_features_t features)
{
struct bonding *bond = netdev_priv(dev);
+ struct list_head *iter;
netdev_features_t mask;
struct slave *slave;
@@ -1104,7 +1114,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
features &= ~NETIF_F_ONE_FOR_ALL;
features |= NETIF_F_ALL_FOR_ALL;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
features = netdev_increment_features(features,
slave->dev->features,
mask);
@@ -1122,16 +1132,17 @@ static void bond_compute_features(struct bonding *bond)
{
unsigned int flags, dst_release_flag = IFF_XMIT_DST_RELEASE;
netdev_features_t vlan_features = BOND_VLAN_FEATURES;
+ struct net_device *bond_dev = bond->dev;
+ struct list_head *iter;
+ struct slave *slave;
unsigned short max_hard_header_len = ETH_HLEN;
unsigned int gso_max_size = GSO_MAX_SIZE;
- struct net_device *bond_dev = bond->dev;
u16 gso_max_segs = GSO_MAX_SEGS;
- struct slave *slave;
if (list_empty(&bond->slave_list))
goto done;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
vlan_features = netdev_increment_features(vlan_features,
slave->dev->vlan_features, BOND_VLAN_FEATURES);
@@ -1993,11 +2004,12 @@ static int bond_info_query(struct net_device *bond_dev, struct ifbond *info)
static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *info)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct list_head *iter;
int i = 0, res = -ENODEV;
struct slave *slave;
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (i++ == (int)info->slave_id) {
res = 0;
strcpy(info->slave_name, slave->dev->name);
@@ -2018,12 +2030,13 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
static int bond_miimon_inspect(struct bonding *bond)
{
int link_state, commit = 0;
+ struct list_head *iter;
struct slave *slave;
bool ignore_updelay;
ignore_updelay = !bond->curr_active_slave ? true : false;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
slave->new_link = BOND_LINK_NOCHANGE;
link_state = bond_check_dev_link(bond, slave->dev, 0);
@@ -2117,9 +2130,10 @@ static int bond_miimon_inspect(struct bonding *bond)
static void bond_miimon_commit(struct bonding *bond)
{
+ struct list_head *iter;
struct slave *slave;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
switch (slave->new_link) {
case BOND_LINK_NOCHANGE:
continue;
@@ -2513,6 +2527,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
struct bonding *bond = container_of(work, struct bonding,
arp_work.work);
struct slave *slave, *oldcurrent;
+ struct list_head *iter;
int do_failover = 0;
read_lock(&bond->lock);
@@ -2529,7 +2544,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
* TODO: what about up/down delay in arp mode? it wasn't here before
* so it can wait
*/
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
unsigned long trans_start = dev_trans_start(slave->dev);
if (slave->link != BOND_LINK_UP) {
@@ -2620,10 +2635,11 @@ re_arm:
static int bond_ab_arp_inspect(struct bonding *bond)
{
unsigned long trans_start, last_rx;
+ struct list_head *iter;
struct slave *slave;
int commit = 0;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
slave->new_link = BOND_LINK_NOCHANGE;
last_rx = slave_last_rx(bond, slave);
@@ -2690,9 +2706,10 @@ static int bond_ab_arp_inspect(struct bonding *bond)
static void bond_ab_arp_commit(struct bonding *bond)
{
unsigned long trans_start;
+ struct list_head *iter;
struct slave *slave;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
switch (slave->new_link) {
case BOND_LINK_NOCHANGE:
continue;
@@ -3156,13 +3173,14 @@ static void bond_work_cancel_all(struct bonding *bond)
static int bond_open(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct list_head *iter;
struct slave *slave;
/* reset slave->backup and slave->inactive */
read_lock(&bond->lock);
if (!list_empty(&bond->slave_list)) {
read_lock(&bond->curr_slave_lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP)
&& (slave != bond->curr_active_slave)) {
bond_set_slave_inactive_flags(slave);
@@ -3222,12 +3240,13 @@ static struct rtnl_link_stats64 *bond_get_stats(struct net_device *bond_dev,
{
struct bonding *bond = netdev_priv(bond_dev);
struct rtnl_link_stats64 temp;
+ struct list_head *iter;
struct slave *slave;
memset(stats, 0, sizeof(*stats));
read_lock_bh(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
const struct rtnl_link_stats64 *sstats =
dev_get_stats(slave->dev, &temp);
@@ -3394,6 +3413,7 @@ static void bond_change_rx_flags(struct net_device *bond_dev, int change)
static void bond_set_rx_mode(struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
+ struct list_head *iter;
struct slave *slave;
ASSERT_RTNL();
@@ -3405,7 +3425,7 @@ static void bond_set_rx_mode(struct net_device *bond_dev)
dev_mc_sync(slave->dev, bond_dev);
}
} else {
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
dev_uc_sync_multiple(slave->dev, bond_dev);
dev_mc_sync_multiple(slave->dev, bond_dev);
}
@@ -3473,6 +3493,7 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
{
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave, *rollback_slave;
+ struct list_head *iter;
int res = 0;
pr_debug("bond=%p, name=%s, new_mtu=%d\n", bond,
@@ -3493,7 +3514,7 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
* call to the base driver.
*/
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
pr_debug("s %p s->p %p c_m %p\n",
slave,
bond_prev_slave(bond, slave),
@@ -3521,7 +3542,7 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
unwind:
/* unwind from head to the slave that failed */
- bond_for_each_slave(bond, rollback_slave) {
+ bond_for_each_slave(bond, rollback_slave, iter) {
int tmp_res;
if (rollback_slave == slave);
@@ -3549,6 +3570,7 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave, *rollback_slave;
struct sockaddr *sa = addr, tmp_sa;
+ struct list_head *iter;
int res = 0;
if (bond->params.mode == BOND_MODE_ALB)
@@ -3582,7 +3604,7 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
* call to the base driver.
*/
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
const struct net_device_ops *slave_ops = slave->dev->netdev_ops;
pr_debug("slave %p %s\n", slave, slave->dev->name);
@@ -3614,7 +3636,7 @@ unwind:
tmp_sa.sa_family = bond_dev->type;
/* unwind from head to the slave that failed */
- bond_for_each_slave(bond, rollback_slave) {
+ bond_for_each_slave(bond, rollback_slave, iter) {
int tmp_res;
if (rollback_slave == slave)
@@ -3642,11 +3664,12 @@ unwind:
*/
void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int slave_id)
{
+ struct list_head *iter;
struct slave *slave;
int i = slave_id;
/* Here we start from the slave with slave_id */
- bond_for_each_slave_rcu(bond, slave) {
+ bond_for_each_slave_rcu(bond, slave, iter) {
if (--i < 0) {
if (slave_can_tx(slave)) {
bond_dev_queue_xmit(bond, skb, slave->dev);
@@ -3657,7 +3680,7 @@ void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int slave_id)
/* Here we start from the first slave up to slave_id */
i = slave_id;
- bond_for_each_slave_rcu(bond, slave) {
+ bond_for_each_slave_rcu(bond, slave, iter) {
if (--i < 0)
break;
if (slave_can_tx(slave)) {
@@ -3734,8 +3757,9 @@ static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *bond_dev)
{
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave = NULL;
+ struct list_head *iter;
- bond_for_each_slave_rcu(bond, slave) {
+ bond_for_each_slave_rcu(bond, slave, iter) {
if (bond_is_last_slave(bond, slave))
break;
if (IS_UP(slave->dev) && slave->link == BOND_LINK_UP) {
@@ -3784,13 +3808,14 @@ static inline int bond_slave_override(struct bonding *bond,
{
struct slave *slave = NULL;
struct slave *check_slave;
+ struct list_head *iter;
int res = 1;
if (!skb->queue_mapping)
return 1;
/* Find out if any slaves have the same mapping as this skb. */
- bond_for_each_slave_rcu(bond, check_slave) {
+ bond_for_each_slave_rcu(bond, check_slave, iter) {
if (check_slave->queue_id == skb->queue_mapping) {
slave = check_slave;
break;
@@ -3922,6 +3947,7 @@ static int bond_ethtool_get_settings(struct net_device *bond_dev,
{
struct bonding *bond = netdev_priv(bond_dev);
unsigned long speed = 0;
+ struct list_head *iter;
struct slave *slave;
ecmd->duplex = DUPLEX_UNKNOWN;
@@ -3933,7 +3959,7 @@ static int bond_ethtool_get_settings(struct net_device *bond_dev,
* this is an accurate maximum.
*/
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (SLAVE_IS_OK(slave)) {
if (slave->speed != SPEED_UNKNOWN)
speed += slave->speed;
diff --git a/drivers/net/bonding/bond_procfs.c b/drivers/net/bonding/bond_procfs.c
index 20a6ee2..7af5646 100644
--- a/drivers/net/bonding/bond_procfs.c
+++ b/drivers/net/bonding/bond_procfs.c
@@ -10,8 +10,9 @@ static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos)
__acquires(&bond->lock)
{
struct bonding *bond = seq->private;
- loff_t off = 0;
+ struct list_head *iter;
struct slave *slave;
+ loff_t off = 0;
/* make sure the bond won't be taken away */
rcu_read_lock();
@@ -20,7 +21,7 @@ static void *bond_info_seq_start(struct seq_file *seq, loff_t *pos)
if (*pos == 0)
return SEQ_START_TOKEN;
- bond_for_each_slave(bond, slave)
+ bond_for_each_slave(bond, slave, iter)
if (++off == *pos)
return slave;
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index ce46776..939148a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -210,11 +210,12 @@ static ssize_t bonding_show_slaves(struct device *d,
struct device_attribute *attr, char *buf)
{
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
struct slave *slave;
int res = 0;
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (res > (PAGE_SIZE - IFNAMSIZ)) {
/* not enough space for another interface name */
if ((PAGE_SIZE - res) > 10)
@@ -637,6 +638,7 @@ static ssize_t bonding_store_arp_targets(struct device *d,
const char *buf, size_t count)
{
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
struct slave *slave;
__be32 newtarget, *targets;
unsigned long *targets_rx;
@@ -669,7 +671,7 @@ static ssize_t bonding_store_arp_targets(struct device *d,
&newtarget);
/* not to race with bond_arp_rcv */
write_lock_bh(&bond->lock);
- bond_for_each_slave(bond, slave)
+ bond_for_each_slave(bond, slave, iter)
slave->target_last_arp_rx[ind] = jiffies;
targets[ind] = newtarget;
write_unlock_bh(&bond->lock);
@@ -695,7 +697,7 @@ static ssize_t bonding_store_arp_targets(struct device *d,
&newtarget);
write_lock_bh(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
targets_rx = slave->target_last_arp_rx;
j = ind;
for (; (j < BOND_MAX_ARP_TARGETS-1) && targets[j+1]; j++)
@@ -1092,6 +1094,7 @@ static ssize_t bonding_store_primary(struct device *d,
const char *buf, size_t count)
{
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
char ifname[IFNAMSIZ];
struct slave *slave;
@@ -1119,7 +1122,7 @@ static ssize_t bonding_store_primary(struct device *d,
goto out;
}
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (strncmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
pr_info("%s: Setting %s as primary slave.\n",
bond->dev->name, slave->dev->name);
@@ -1267,6 +1270,7 @@ static ssize_t bonding_store_active_slave(struct device *d,
{
struct slave *slave, *old_active, *new_active;
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
char ifname[IFNAMSIZ];
if (!rtnl_trylock())
@@ -1294,7 +1298,7 @@ static ssize_t bonding_store_active_slave(struct device *d,
goto out;
}
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (strncmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
old_active = bond->curr_active_slave;
new_active = slave;
@@ -1474,6 +1478,7 @@ static ssize_t bonding_show_queue_id(struct device *d,
char *buf)
{
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
struct slave *slave;
int res = 0;
@@ -1481,7 +1486,7 @@ static ssize_t bonding_show_queue_id(struct device *d,
return restart_syscall();
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (res > (PAGE_SIZE - IFNAMSIZ - 6)) {
/* not enough space for another interface_name:queue_id pair */
if ((PAGE_SIZE - res) > 10)
@@ -1510,6 +1515,7 @@ static ssize_t bonding_store_queue_id(struct device *d,
{
struct slave *slave, *update_slave;
struct bonding *bond = to_bond(d);
+ struct list_head *iter;
u16 qid;
int ret = count;
char *delim;
@@ -1546,7 +1552,7 @@ static ssize_t bonding_store_queue_id(struct device *d,
/* Search for thes slave and check for duplicate qids */
update_slave = NULL;
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (sdev == slave->dev)
/*
* We don't need to check the matching
@@ -1600,6 +1606,7 @@ static ssize_t bonding_store_slaves_active(struct device *d,
{
struct bonding *bond = to_bond(d);
int new_value, ret = count;
+ struct list_head *iter;
struct slave *slave;
if (sscanf(buf, "%d", &new_value) != 1) {
@@ -1622,7 +1629,7 @@ static ssize_t bonding_store_slaves_active(struct device *d,
}
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (!bond_is_active_slave(slave)) {
if (new_value)
slave->inactive = 0;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 141ddc3..4d725a5 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -110,15 +110,16 @@
* bond_for_each_slave - iterate over all slaves
* @bond: the bond holding this list
* @pos: current slave
+ * @iter: list_head * iterator
*
* Caller must hold bond->lock
*/
-#define bond_for_each_slave(bond, pos) \
- list_for_each_entry(pos, &(bond)->slave_list, list)
+#define bond_for_each_slave(bond, pos, iter) \
+ netdev_for_each_lower_private((bond)->dev, pos, iter)
/* Caller must have rcu_read_lock */
-#define bond_for_each_slave_rcu(bond, pos) \
- list_for_each_entry_rcu(pos, &(bond)->slave_list, list)
+#define bond_for_each_slave_rcu(bond, pos, iter) \
+ netdev_for_each_lower_private_rcu((bond)->dev, pos, iter)
#ifdef CONFIG_NET_POLL_CONTROLLER
extern atomic_t netpoll_block_tx;
@@ -475,9 +476,10 @@ static inline void bond_destroy_proc_dir(struct bond_net *bn)
static inline struct slave *bond_slave_has_mac(struct bonding *bond,
const u8 *mac)
{
+ struct list_head *iter;
struct slave *tmp;
- bond_for_each_slave(bond, tmp)
+ bond_for_each_slave(bond, tmp, iter)
if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
return tmp;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 0d0665c..2cd6962 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -3983,6 +3983,7 @@ static int cxgb4_inet6addr_handler(struct notifier_block *this,
struct net_device *event_dev;
int ret = NOTIFY_DONE;
struct bonding *bond = netdev_priv(ifa->idev->dev);
+ struct list_head *iter;
struct slave *slave;
struct pci_dev *first_pdev = NULL;
@@ -3995,7 +3996,7 @@ static int cxgb4_inet6addr_handler(struct notifier_block *this,
* in all of them only once.
*/
read_lock(&bond->lock);
- bond_for_each_slave(bond, slave) {
+ bond_for_each_slave(bond, slave, iter) {
if (!first_pdev) {
ret = clip_add(slave->dev, ifa, event);
/* If clip_add is success then only initialize
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 11/27] bonding: rework bond_3ad_xmit_xor() to use bond_for_each_slave() only
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently, there are two loops - first we find the first slave in an
aggregator after the xmit_hash_policy() returned number, and after that we
loop from that slave, over bonding head, and till that slave to find any
suitable slave to send the packet through.
Replace it by just one bond_for_each_slave() loop, which first loops
through the requested number of slaves, saving the first suitable one, and
after that we've hit the requested number of slaves to skip - search for
any up slave to send the packet through. If we don't find such kind of
slave - then just send the packet through the first suitable slave found.
Logic remains unchainged, and we skip two loops. Also, refactor it a bit
for readability.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
New patch.
drivers/net/bonding/bond_3ad.c | 46 ++++++++++++++++++++----------------------
1 file changed, 22 insertions(+), 24 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 3847aee..c861ee7 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2417,15 +2417,15 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
{
- struct slave *slave, *start_at;
struct bonding *bond = netdev_priv(dev);
+ struct slave *slave, *first_ok_slave;
+ struct aggregator *agg;
+ struct ad_info ad_info;
struct list_head *iter;
- int slave_agg_no;
int slaves_in_agg;
- int agg_id;
- int i;
- struct ad_info ad_info;
+ int slave_agg_no;
int res = 1;
+ int agg_id;
read_lock(&bond->lock);
if (__bond_3ad_get_active_agg_info(bond, &ad_info)) {
@@ -2438,20 +2438,28 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
agg_id = ad_info.aggregator_id;
if (slaves_in_agg == 0) {
- /*the aggregator is empty*/
pr_debug("%s: Error: active aggregator is empty\n", dev->name);
goto out;
}
slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
+ first_ok_slave = NULL;
bond_for_each_slave(bond, slave, iter) {
- struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
+ agg = SLAVE_AD_INFO(slave).port.aggregator;
+ if (!agg || agg->aggregator_identifier != agg_id)
+ continue;
- if (agg && (agg->aggregator_identifier == agg_id)) {
+ if (slave_agg_no >= 0) {
+ if (!first_ok_slave && SLAVE_IS_OK(slave))
+ first_ok_slave = slave;
slave_agg_no--;
- if (slave_agg_no < 0)
- break;
+ continue;
+ }
+
+ if (SLAVE_IS_OK(slave)) {
+ res = bond_dev_queue_xmit(bond, skb, slave->dev);
+ goto out;
}
}
@@ -2461,20 +2469,10 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
goto out;
}
- start_at = slave;
-
- bond_for_each_slave_from(bond, slave, i, start_at) {
- int slave_agg_id = 0;
- struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
-
- if (agg)
- slave_agg_id = agg->aggregator_identifier;
-
- if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
- res = bond_dev_queue_xmit(bond, skb, slave->dev);
- break;
- }
- }
+ /* we couldn't find any suitable slave after the agg_no, so use the
+ * first suitable found, if found. */
+ if (first_ok_slave)
+ res = bond_dev_queue_xmit(bond, skb, first_ok_slave->dev);
out:
read_unlock(&bond->lock);
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 13/27] bonding: rework bond_find_best_slave() to use bond_for_each_slave()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
bond_find_best_slave() does not have to be balanced - i.e. return the slave
that is *after* some other slave, but rather return the best slave that
suits, except of bond->primary_slave - in which case we just return it if
it's suitable.
After that we just look through all the slaves and return either first up
slave or the slave whose link came back earliest.
We also don't care about curr_active_slave lock cause we use it in
bond_should_change_active() only and there we take it right away - i.e. it
won't go away.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
New patch.
drivers/net/bonding/bond_main.c | 43 ++++++++++++-----------------------------
1 file changed, 12 insertions(+), 31 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2075321..b5497e7 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -785,43 +785,24 @@ static bool bond_should_change_active(struct bonding *bond)
/**
* find_best_interface - select the best available slave to be the active one
* @bond: our bonding struct
- *
- * Warning: Caller must hold curr_slave_lock for writing.
*/
static struct slave *bond_find_best_slave(struct bonding *bond)
{
- struct slave *new_active, *old_active;
- struct slave *bestslave = NULL;
+ struct slave *slave, *bestslave = NULL;
+ struct list_head *iter;
int mintime = bond->params.updelay;
- int i;
- new_active = bond->curr_active_slave;
-
- if (!new_active) { /* there were no active slaves left */
- new_active = bond_first_slave(bond);
- if (!new_active)
- return NULL; /* still no slave, return NULL */
- }
+ if (bond->primary_slave && bond->primary_slave->link == BOND_LINK_UP &&
+ bond_should_change_active(bond))
+ return bond->primary_slave;
- if ((bond->primary_slave) &&
- bond->primary_slave->link == BOND_LINK_UP &&
- bond_should_change_active(bond)) {
- new_active = bond->primary_slave;
- }
-
- /* remember where to stop iterating over the slaves */
- old_active = new_active;
-
- bond_for_each_slave_from(bond, new_active, i, old_active) {
- if (new_active->link == BOND_LINK_UP) {
- return new_active;
- } else if (new_active->link == BOND_LINK_BACK &&
- IS_UP(new_active->dev)) {
- /* link up, but waiting for stabilization */
- if (new_active->delay < mintime) {
- mintime = new_active->delay;
- bestslave = new_active;
- }
+ bond_for_each_slave(bond, slave, iter) {
+ if (slave->link == BOND_LINK_UP)
+ return slave;
+ if (slave->link == BOND_LINK_BACK && IS_UP(slave->dev) &&
+ slave->delay < mintime) {
+ mintime = slave->delay;
+ bestslave = slave;
}
}
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 14/27] bonding: rework bond_ab_arp_probe() to use bond_for_each_slave()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently it uses the hard-to-rcuify bond_for_each_slave_from(), and also
it doesn't check every slave for disrepencies between the actual
IS_UP(slave) and the slave->link == BOND_LINK_UP, but only till we find the
next suitable slave.
Fix this by using bond_for_each_slave() and storing the first good slave in
*before till we find the current_arp_slave, after that we store the first good
slave in new_slave. If new_slave is empty - use the slave stored in before,
and if it's also empty - then we didn't find any suitable slave.
Also, in the meanwhile, check for each slave status.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
New patch.
drivers/net/bonding/bond_main.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b5497e7..eedfa8f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2761,8 +2761,9 @@ do_failover:
*/
static void bond_ab_arp_probe(struct bonding *bond)
{
- struct slave *slave, *next_slave;
- int i;
+ struct slave *slave, *before = NULL, *new_slave = NULL;
+ struct list_head *iter;
+ bool found = false;
read_lock(&bond->curr_slave_lock);
@@ -2792,18 +2793,12 @@ static void bond_ab_arp_probe(struct bonding *bond)
bond_set_slave_inactive_flags(bond->current_arp_slave);
- /* search for next candidate */
- next_slave = bond_next_slave(bond, bond->current_arp_slave);
- bond_for_each_slave_from(bond, slave, i, next_slave) {
- if (IS_UP(slave->dev)) {
- slave->link = BOND_LINK_BACK;
- bond_set_slave_active_flags(slave);
- bond_arp_send_all(bond, slave);
- slave->jiffies = jiffies;
- bond->current_arp_slave = slave;
- break;
- }
+ bond_for_each_slave(bond, slave, iter) {
+ if (!found && !before && IS_UP(slave->dev))
+ before = slave;
+ if (found && !new_slave && IS_UP(slave->dev))
+ new_slave = slave;
/* if the link state is up at this point, we
* mark it down - this can happen if we have
* simultaneous link failures and
@@ -2811,7 +2806,7 @@ static void bond_ab_arp_probe(struct bonding *bond)
* one the current slave so it is still marked
* up when it is actually down
*/
- if (slave->link == BOND_LINK_UP) {
+ if (!IS_UP(slave->dev) && slave->link == BOND_LINK_UP) {
slave->link = BOND_LINK_DOWN;
if (slave->link_failure_count < UINT_MAX)
slave->link_failure_count++;
@@ -2821,7 +2816,22 @@ static void bond_ab_arp_probe(struct bonding *bond)
pr_info("%s: backup interface %s is now down.\n",
bond->dev->name, slave->dev->name);
}
+ if (slave == bond->current_arp_slave)
+ found = true;
}
+
+ if (!new_slave && before)
+ new_slave = before;
+
+ if (!new_slave)
+ return;
+
+ new_slave->link = BOND_LINK_BACK;
+ bond_set_slave_active_flags(new_slave);
+ bond_arp_send_all(bond, new_slave);
+ new_slave->jiffies = jiffies;
+ bond->current_arp_slave = new_slave;
+
}
void bond_activebackup_arp_mon(struct work_struct *work)
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 15/27] bonding: remove unused bond_for_each_slave_from()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
It has no users, so we can remove it.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
New patch.
drivers/net/bonding/bonding.h | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 4d725a5..e484c38 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -94,19 +94,6 @@
bond_to_slave((pos)->list.prev))
/**
- * bond_for_each_slave_from - iterate the slaves list from a starting point
- * @bond: the bond holding this list.
- * @pos: current slave.
- * @cnt: counter for max number of moves
- * @start: starting point.
- *
- * Caller must hold bond->lock
- */
-#define bond_for_each_slave_from(bond, pos, cnt, start) \
- for (cnt = 0, pos = start; pos && cnt < (bond)->slave_cnt; \
- cnt++, pos = bond_next_slave(bond, pos))
-
-/**
* bond_for_each_slave - iterate over all slaves
* @bond: the bond holding this list
* @pos: current slave
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 16/27] bonding: add bond_has_slaves() and use it
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
Currently we verify if we have slaves by checking if bond->slave_list is
empty. Create a define bond_has_slaves() and use it, a bit more readable
and easier to change in the future.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
No change.
drivers/net/bonding/bond_3ad.c | 2 +-
drivers/net/bonding/bond_alb.c | 8 ++++----
drivers/net/bonding/bond_main.c | 32 ++++++++++++++++----------------
drivers/net/bonding/bond_sysfs.c | 4 ++--
drivers/net/bonding/bonding.h | 2 ++
5 files changed, 25 insertions(+), 23 deletions(-)
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c861ee7..1337eaf 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2117,7 +2117,7 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
read_lock(&bond->lock);
//check if there are any slaves
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto re_arm;
// check if agg_select_timer timer after initialize is timed out
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 1172474..f3e8114 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1178,7 +1178,7 @@ static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slav
struct slave *tmp_slave1, *free_mac_slave = NULL;
struct list_head *iter;
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
/* this is the first slave */
return 0;
}
@@ -1469,7 +1469,7 @@ void bond_alb_monitor(struct work_struct *work)
read_lock(&bond->lock);
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
bond_info->tx_rebalance_counter = 0;
bond_info->lp_counter = 0;
goto re_arm;
@@ -1606,7 +1606,7 @@ int bond_alb_init_slave(struct bonding *bond, struct slave *slave)
*/
void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
{
- if (!list_empty(&bond->slave_list))
+ if (bond_has_slaves(bond))
alb_change_hw_addr_on_detach(bond, slave);
tlb_clear_slave(bond, slave, 0);
@@ -1676,7 +1676,7 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
swap_slave = bond->curr_active_slave;
rcu_assign_pointer(bond->curr_active_slave, new_slave);
- if (!new_slave || list_empty(&bond->slave_list))
+ if (!new_slave || !bond_has_slaves(bond))
return;
/* set the new curr_active_slave to the bonds mac address
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index eedfa8f..c91b754 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -391,7 +391,7 @@ static int bond_set_carrier(struct bonding *bond)
struct list_head *iter;
struct slave *slave;
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto down;
if (bond->params.mode == BOND_MODE_8023AD)
@@ -1085,7 +1085,7 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
netdev_features_t mask;
struct slave *slave;
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
/* Disable adding VLANs to empty bond. But why? --mq */
features |= NETIF_F_VLAN_CHALLENGED;
return features;
@@ -1120,7 +1120,7 @@ static void bond_compute_features(struct bonding *bond)
unsigned int gso_max_size = GSO_MAX_SIZE;
u16 gso_max_segs = GSO_MAX_SEGS;
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto done;
bond_for_each_slave(bond, slave, iter) {
@@ -1310,7 +1310,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
* bond ether type mutual exclusion - don't allow slaves of dissimilar
* ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same bond
*/
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
if (bond_dev->type != slave_dev->type) {
pr_debug("%s: change device type from %d to %d\n",
bond_dev->name,
@@ -1349,7 +1349,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
}
if (slave_ops->ndo_set_mac_address == NULL) {
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
pr_warning("%s: Warning: The first slave device specified does not support setting the MAC address. Setting fail_over_mac to active.",
bond_dev->name);
bond->params.fail_over_mac = BOND_FOM_ACTIVE;
@@ -1365,7 +1365,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
/* If this is the first slave, then we need to set the master's hardware
* address to be the same as the slave's. */
- if (list_empty(&bond->slave_list) &&
+ if (!bond_has_slaves(bond) &&
bond->dev->addr_assign_type == NET_ADDR_RANDOM)
bond_set_dev_addr(bond->dev, slave_dev);
@@ -1696,7 +1696,7 @@ err_free:
err_undo_flags:
bond_compute_features(bond);
/* Enslave of first slave has failed and we need to fix master's mac */
- if (list_empty(&bond->slave_list) &&
+ if (!bond_has_slaves(bond) &&
ether_addr_equal(bond_dev->dev_addr, slave_dev->dev_addr))
eth_hw_addr_random(bond_dev);
@@ -1776,7 +1776,7 @@ static int __bond_release_one(struct net_device *bond_dev,
if (!all && !bond->params.fail_over_mac) {
if (ether_addr_equal(bond_dev->dev_addr, slave->perm_hwaddr) &&
- !list_empty(&bond->slave_list))
+ bond_has_slaves(bond))
pr_warn("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s. Set the HWaddr of %s to a different address to avoid conflicts.\n",
bond_dev->name, slave_dev->name,
slave->perm_hwaddr,
@@ -1819,7 +1819,7 @@ static int __bond_release_one(struct net_device *bond_dev,
write_lock_bh(&bond->lock);
}
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
bond_set_carrier(bond);
eth_hw_addr_random(bond_dev);
@@ -1835,7 +1835,7 @@ static int __bond_release_one(struct net_device *bond_dev,
unblock_netpoll_tx();
synchronize_rcu();
- if (list_empty(&bond->slave_list)) {
+ if (!bond_has_slaves(bond)) {
call_netdevice_notifiers(NETDEV_CHANGEADDR, bond->dev);
call_netdevice_notifiers(NETDEV_RELEASE, bond->dev);
}
@@ -1904,7 +1904,7 @@ static int bond_release_and_destroy(struct net_device *bond_dev,
int ret;
ret = bond_release(bond_dev, slave_dev);
- if (ret == 0 && list_empty(&bond->slave_list)) {
+ if (ret == 0 && !bond_has_slaves(bond)) {
bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
pr_info("%s: destroying bond %s.\n",
bond_dev->name, bond_dev->name);
@@ -2219,7 +2219,7 @@ void bond_mii_monitor(struct work_struct *work)
delay = msecs_to_jiffies(bond->params.miimon);
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto re_arm;
should_notify_peers = bond_should_notify_peers(bond);
@@ -2513,7 +2513,7 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
read_lock(&bond->lock);
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto re_arm;
oldcurrent = bond->curr_active_slave;
@@ -2845,7 +2845,7 @@ void bond_activebackup_arp_mon(struct work_struct *work)
delta_in_ticks = msecs_to_jiffies(bond->params.arp_interval);
- if (list_empty(&bond->slave_list))
+ if (!bond_has_slaves(bond))
goto re_arm;
should_notify_peers = bond_should_notify_peers(bond);
@@ -3169,7 +3169,7 @@ static int bond_open(struct net_device *bond_dev)
/* reset slave->backup and slave->inactive */
read_lock(&bond->lock);
- if (!list_empty(&bond->slave_list)) {
+ if (bond_has_slaves(bond)) {
read_lock(&bond->curr_slave_lock);
bond_for_each_slave(bond, slave, iter) {
if ((bond->params.mode == BOND_MODE_ACTIVEBACKUP)
@@ -3892,7 +3892,7 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
return NETDEV_TX_BUSY;
rcu_read_lock();
- if (!list_empty(&bond->slave_list))
+ if (bond_has_slaves(bond))
ret = __bond_start_xmit(skb, dev);
else
kfree_skb(skb);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 939148a..5d40889 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -327,7 +327,7 @@ static ssize_t bonding_store_mode(struct device *d,
goto out;
}
- if (!list_empty(&bond->slave_list)) {
+ if (bond_has_slaves(bond)) {
pr_err("unable to update mode of %s because it has slaves.\n",
bond->dev->name);
ret = -EPERM;
@@ -509,7 +509,7 @@ static ssize_t bonding_store_fail_over_mac(struct device *d,
if (!rtnl_trylock())
return restart_syscall();
- if (!list_empty(&bond->slave_list)) {
+ if (bond_has_slaves(bond)) {
pr_err("%s: Can't alter fail_over_mac with slaves in bond.\n",
bond->dev->name);
ret = -EPERM;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index e484c38..2908bd2 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -72,6 +72,8 @@
res; })
/* slave list primitives */
+#define bond_has_slaves(bond) !list_empty(&(bond)->slave_list)
+
#define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
/* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 17/27] bonding: convert bond_has_slaves() to use the neighbour list
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
The same way as it was used for its own slave_list.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Use the renamed adj_list instead of neighbour_dev_list.
drivers/net/bonding/bonding.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 2908bd2..8176842 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -72,7 +72,7 @@
res; })
/* slave list primitives */
-#define bond_has_slaves(bond) !list_empty(&(bond)->slave_list)
+#define bond_has_slaves(bond) !list_empty(&(bond)->dev->adj_list.lower)
#define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 18/27] net: add a possibility to get private from netdev_adjacent->list
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev
Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
Alexander Duyck
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
It will be useful to get first/last element.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
No changes.
include/linux/netdevice.h | 1 +
net/core/dev.c | 10 ++++++++++
2 files changed, 11 insertions(+)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 48431f0..72d1631 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2850,6 +2850,7 @@ extern void *netdev_lower_get_next_private_rcu(struct net_device *dev,
priv; \
priv = netdev_lower_get_next_private_rcu(dev, &(iter)))
+extern void *netdev_adjacent_get_private(struct list_head *adj_list);
extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
extern int netdev_upper_dev_link(struct net_device *dev,
diff --git a/net/core/dev.c b/net/core/dev.c
index 11284c8..24c7d17 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4512,6 +4512,16 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
}
EXPORT_SYMBOL(netdev_master_upper_dev_get);
+void *netdev_adjacent_get_private(struct list_head *adj_list)
+{
+ struct netdev_adjacent *adj;
+
+ adj = list_entry(adj_list, struct netdev_adjacent, list);
+
+ return adj->private;
+}
+EXPORT_SYMBOL(netdev_adjacent_get_private);
+
/* netdev_all_upper_get_next_dev_rcu - Get the next dev from upper list
* @dev: device
* @iter: list_head ** of the current position
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 19/27] bonding: convert first/last slave logic to use neighbours
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
For that, use netdev_adjacent_get_private(list_head) on bond's lower
neighbour list members. Also, add a small macro - bond_slave_list(bond),
which returns the bond list via neighbour list.
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Rename neieghour_dev_list to adj_list.
drivers/net/bonding/bonding.h | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 8176842..b82474d 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -72,19 +72,24 @@
res; })
/* slave list primitives */
-#define bond_has_slaves(bond) !list_empty(&(bond)->dev->adj_list.lower)
+#define bond_slave_list(bond) (&(bond)->dev->adj_list.lower)
+
+#define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
#define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
/* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
#define bond_first_slave(bond) \
- list_first_entry_or_null(&(bond)->slave_list, struct slave, list)
+ (bond_has_slaves(bond) ? \
+ netdev_adjacent_get_private(bond_slave_list(bond)->next) : \
+ NULL)
#define bond_last_slave(bond) \
- (list_empty(&(bond)->slave_list) ? NULL : \
- bond_to_slave((bond)->slave_list.prev))
+ (bond_has_slaves(bond) ? \
+ netdev_adjacent_get_private(bond_slave_list(bond)->prev) : \
+ NULL)
-#define bond_is_first_slave(bond, pos) ((pos)->list.prev == &(bond)->slave_list)
-#define bond_is_last_slave(bond, pos) ((pos)->list.next == &(bond)->slave_list)
+#define bond_is_first_slave(bond, pos) (pos == bond_first_slave(bond))
+#define bond_is_last_slave(bond, pos) (pos == bond_last_slave(bond))
/* Since bond_first/last_slave can return NULL, these can return NULL too */
#define bond_next_slave(bond, pos) \
--
1.8.4
^ permalink raw reply related
* [PATCH v3 net-next 22/27] bonding: use neighbours for bond_next_slave()
From: Veaceslav Falico @ 2013-09-17 0:46 UTC (permalink / raw)
To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1379378812-18346-1-git-send-email-vfalico@redhat.com>
If it's the last slave - return the first slave, otherwise - return the
next slave via netdev_lower_dev_get_next_private().
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
Notes:
v2 -> v3:
No change.
v1 -> v2:
No changes.
RFC -> v1:
Change to standard logic - if it's the last slave - return the first one,
otherwise - return the next via netdev_lower_dev_get_next_private().
Also, bond_prev_slave() was dropped - so we don't need it.
drivers/net/bonding/bonding.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 03daadd..9329509 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -76,8 +76,6 @@
#define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
-#define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
-
/* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
#define bond_first_slave(bond) \
(bond_has_slaves(bond) ? \
@@ -93,8 +91,9 @@
/* Since bond_first/last_slave can return NULL, these can return NULL too */
#define bond_next_slave(bond, pos) \
- (bond_is_last_slave(bond, pos) ? bond_first_slave(bond) : \
- bond_to_slave((pos)->list.next))
+ (bond_is_last_slave(bond, pos) ? \
+ bond_first_slave(bond) : \
+ netdev_lower_dev_get_next_private((bond)->dev, pos))
/**
* bond_for_each_slave - iterate over all slaves
--
1.8.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox