* Re: [PATCH net-next 0/2] extend sch_mqprio to distribute traffic not only by ETS TC
From: Amir Vadai @ 2012-05-14 19:24 UTC (permalink / raw)
To: John Fastabend
Cc: David S. Miller, netdev, Oren Duer, Liran Liss, Jamal Hadi Salim,
Diego Crupnicoff, Or Gerlitz
In-Reply-To: <4FAA0D2C.6040306@intel.com>
On 05/09/2012 09:22 AM, John Fastabend wrote:
> On 5/8/2012 6:56 AM, Amir Vadai wrote:
>> On 05/08/2012 03:54 AM, John Fastabend wrote:
>>> On 5/6/2012 12:05 AM, Amir Vadai wrote:
>>>> This series comes to revive the discussion initiated on the thread "net:
>>>> support tx_ring per UP in HW based QoS mechanism" (see
>>>> http://marc.info/?t=133165957200004&r=1&w=2). The major issue to be addressed
>>>> is how the sk_prio <=> TC mapping should be done, for both tagged and untagged traffic.
>>>> Following is a staged description addressing the background, problem
>>>> description, current situation, suggestion for the change and implementation of
>>>> it.
>>>
>>> OK but the mqprio qdisc is only concerned with mapping skb->priority to
>>> queue sets I perhaps unfortunately called the queue sets tc's. Try not
>>> to get hung up on my perhaps limiting naming of variables and functions.
>> I understand that.
>
> I figured you did. Just wanted to state it again.
>
>>
>>>
>>> mqprio is used outside of 802.1Q as well in these cases a traffic class
>>> is not usually even defined.
>>>
>>>>
>
> [...]
>
>>>> Implementation
>>>> --------------
>>>> Extended mqprio hw attribute:
>>>> * Bit 1: is queue offset/count owned by HW
>>>> * Bits 2-7: HW queueing type.
>>>> * 0 - by ETS TC
>>>> * 1 - by UP
>>>>
>>>> __skb_tx_hash() is now aware of the HW queuing type (pg_type): for pg_type
>>>> being ETS TC, traffic is distributed as it was before - tagged and untagged
>>>> packets are distributed by netdev_get_prio_tc_map. For pg_type being UP, tagged
>>>> and untagged packets are distributed by UP (taken from egress map for tagged
>>>> traffic, or netdev_get_prio_tc_map for untagged).
>>>
>>> I guess I don't see why we need this. If you keep the mqprio priority to
>>> queue set mapping set to 1:1. Then modify the egress map accordingly then
>>> this should work right?
>>>
>>> For example:
>>>
>>> If we want to map 8 802.1Q priority code points onto 4 traffic classes this
>>> should work,
>>>
>>> egress map: 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- vlan layer inserts correct tag
>>> mqprio up2tc: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7<-- mqprio with unfortunate 'tc' name maps priority to queues
>> But mqprio is mapping sk_priority to queue set, which is different from UP to queue set.
>> The term UP which we use, is the 8021q PCP in tagged traffic, and a priority mapped by the host admin for untagged traffic.
>> What you wrote, is actually, that the application will set the priority without enabling the host admin to control it.
>>> dcbnl up2tc: 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- dcbx pushes up2tc mapping correctly
>>
>> For example, let's take an application that calls setsockopt(SO_PRIORITY, 2):
>> according to egress map: 8021q PCP field in vlan tag should be set to 1 (=UP)
>> according to mqprio: a tx ring belonging to UP 2 will be selected.
>> according to dcbnl: traffic will have ETS attributes of TC 1
>>
>> Apart from some conceptual problems, it won't work:
>> 8021q PCP field is set by HW according to the tx ring (UP=2). which
>> is different from the one that the user configured in the egress_map
>> (UP=1).
>
> sorry I set it up wrong above but it _can_ be made to work as best I
> can tell (for a single egress_map at least)
>
> egress map: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7<-- ignored
> mqprio : 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7<-- map priority to up queue sets
> up2tc : 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- dcbx negotiated up2tc map
>
> With your example application does setsockopt(SO_PRIORITY, 2):
> according to egress map : insert PCP 2
> according to mqprio : tx ring belonging to UP2 is selected
> according to dcbnl : traffic will have ETS attributes of TC1
>
> The point is either you use the skb->priority to PCP map and then a 1:1
> mqprio map or you use the mqprio map directly. Agree?
>
> One more example translating the case with these patches onto a case
> without these patches as I understand them.
>
> first with these patches mapping:
>
> up2tc : 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- dcbnl up2tc map
> egress map: 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- sets pcp bits
> mqprio : 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7<-- with PGROUP_UP set
>
> equivalent mapping without PGROUP_UP set
>
> up2tc : 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- dcbnl negotiated up2tc map
> egress map: 0:0 1:0 2:0 3:0 4:0 5:0 6:0 7:0<-- default unset egress map
> mqprio : 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3<-- without PGROUP_UP set
>
> Application sets skb->priority to 2, in the first case egress map
> sets the PCP to 1 and mqprio maps it to queues for UP1 based on PCP bits.
>
> In case two egress map does nothing but skb->priority is still 2 so the
> mqprio maps these to queues associated with UP1 so everything works
> still.
>
> I apologize if I'm missing something here but it seems correct to me. Spell
> it out for me if I am still wrong.
>
>>
>>>
>>> We may need to fixup userspace tools some but I think the kernel mechanisms
>>> are in place.
>>>
>>> BTW I did think about this while implementing the existing code and decided
>>> that rather than create more if/else branches the case you are describing
>>> could be handled by independently controlling the priority to queue set mappings
>>> in mqprio and the egress map.
>>>
>>> Feel free to let me know where I went wrong or why this doesn't work on
>>> your hardware. I agree though we may need to fixup lldpad and maybe even
>>> 'tc' some to get this to look correct in user space and get automagically
>>> setup correctly.
>>>
>>> Thanks,
>>> .John
>>
>> I think I understand your stand, that mqprio mapping should be generic
>> and should not be TC nor UP oriented.
>>
>
> Right. Also I want to avoid user space having to somehow "know" which
> mode to put the qdisc in. Checking if the hardware is a mellanox card
> does not scale.
>
>> The problem is that our HW needs the queues to be selected by UP. ETS
>> TC mapping is configured before traffic starts, and therefore ETS
>> attributes are enforced by HW according to the UP, which the tx ring
>> belongs to. Same thing for vlan tag in tagged traffic. 8021q PCP is
>> placed by HW according to the tx ring.
>
> OK. Can you agree though in the case where we restrict the egress_map
> to be the same for all vlan's on a real_dev the existing mqprio map
> _can_ work?
>
>>
>> For backward compatibility, tagged traffic should be steered to a tx
>> ring by UP taken from egress map - skb_tx_hash() as it is today, can't
>> do it. Even if we implement ndo_select_queue(), we will need to
>> duplicate most of the code from skb_tx_hash(), because there are more
>> than one tx ring per UP, and we need skb_tx_hash() to be able to select
>> a ring from a range of rings belonging to a UP.
>
> agreed implementing this in select_queue() is no good.
>
>>
>> Also, there are some configurations that can't be done by mqprio and
>> egress map. For example when having two vlans with different egress
>> maps.
>
> This is a good example thanks. This currently doesn't work for hardware
> that maps tx_rings to traffic classes either. So if we need/want to
> solve this case then I guess we need something like your patches. Note
> I think the patch you submitted should work regardless if you map
> the PCP bits to UP queues or PCP bits to TC queues. We could use this
> in both cases which helps my user space concerns.
>
>>
>> And in the conceptual level, I think that kernel should not accept bad
>> configurations and rely on user space to protect it.
>
> I disagree policy should be managed in user space. Also I see no reason
> to create a strong coupling between egress maps and mqprio maps. I can
> imagine a case where they could be used independently. DCB is just one
> usage model for mqprio. Using a bit like you did seems fine though.
>
>>
>> - Amir
>
> Assuming we want the multiple different egress map case to work I guess
> we will have to do something like this. At least right now I can't think
> of anything better but I'll think on it tomorrow some more.
>
> The other thought would be to provide a qdisc hook before queue selection
> to attach 'tc filters' and provide a new action to hash a skb across
> queue sets.
>
> Thanks,
> John
John Hi,
After some internal discussions, it was agreed to line up with your
approach, to leave mqprio an abstract skb->priority <=> queue set
mapping and to ignore egress_map if mqprio is enabled.
It would be very nice if the term 'tc' in the kernel code were replaced
with 'queue set', since it is very misleading.
There still might be some small issues with skb_tx_hash for tagged
traffic, which I will work on tomorrow, and hopefully will send a new
patch set with the solution.
Thanks,
Amir
^ permalink raw reply
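The model agreed on in this thread, with mqprio as an abstract skb->priority <=> queue set mapping and flows spread across the queues of the selected set, can be sketched in user space roughly as follows. This is a simplified illustration under stated assumptions, not the kernel code: struct queue_set, struct mqprio_map, mqprio_map_init_identity() and select_tx_queue() are hypothetical names, priorities are limited to 0..7 to match the maps quoted in the thread, and the flow hash is passed in as a plain integer.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIO 8	/* priorities 0..7, as in the maps quoted in the thread */

/* Hypothetical: a contiguous range of tx queues (a "queue set"). */
struct queue_set {
	uint16_t offset;	/* first tx queue of the set */
	uint16_t count;		/* number of tx queues in the set */
};

/* Hypothetical stand-in for the per-device state mqprio keeps. */
struct mqprio_map {
	uint8_t prio_to_set[NUM_PRIO];	/* skb->priority -> queue set */
	struct queue_set sets[NUM_PRIO];
};

/* Build a 1:1 priority -> queue set map, queues_per_set queues per set. */
static void mqprio_map_init_identity(struct mqprio_map *m,
				     uint16_t queues_per_set)
{
	assert(queues_per_set > 0);
	for (int i = 0; i < NUM_PRIO; i++) {
		m->prio_to_set[i] = (uint8_t)i;
		m->sets[i].offset = (uint16_t)(i * queues_per_set);
		m->sets[i].count = queues_per_set;
	}
}

/* Pick the queue set by priority, then spread flows inside that set. */
static uint16_t select_tx_queue(const struct mqprio_map *m,
				uint32_t priority, uint32_t flow_hash)
{
	const struct queue_set *qs =
		&m->sets[m->prio_to_set[priority & (NUM_PRIO - 1)]];

	assert(qs->count > 0);
	return (uint16_t)(qs->offset + flow_hash % qs->count);
}

/* up2tc map from the thread: 0:0 1:0 2:1 3:1 4:2 5:2 6:3 7:3 */
static const uint8_t up2tc[NUM_PRIO] = { 0, 0, 1, 1, 2, 2, 3, 3 };
```

With two queues per set, priority 2 and flow hash 7 select queue set 2 (queues 4 and 5) and land on queue 5, while up2tc[2] gives TC 1. That matches the setsockopt(SO_PRIORITY, 2) example in the thread: a tx ring belonging to UP 2 and the ETS attributes of TC 1.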
* [PATCH] net/bridge/netfilter: Fix the randconfig warning
From: Devendra Naga @ 2012-05-14 19:30 UTC (permalink / raw)
To: Pablo Neira Ayuso, Patrick McHardy, Stephen Hemminger,
David S. Miller, netfilter-devel, netfilter, coreteam, bridge,
netdev
Cc: Devendra Naga
When running make randconfig, the following warning appears:
warning: (BRIDGE_NF_EBTABLES) selects NETFILTER_XTABLES which has unmet direct dependencies (NET && INET && NETFILTER)
Add the missing NET && INET dependencies to BRIDGE_NF_EBTABLES (NETFILTER was already listed).
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
---
net/bridge/netfilter/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/netfilter/Kconfig b/net/bridge/netfilter/Kconfig
index a9aff9c..34f6d9d 100644
--- a/net/bridge/netfilter/Kconfig
+++ b/net/bridge/netfilter/Kconfig
@@ -4,7 +4,7 @@
menuconfig BRIDGE_NF_EBTABLES
tristate "Ethernet Bridge tables (ebtables) support"
- depends on BRIDGE && NETFILTER
+ depends on NET && INET && BRIDGE && NETFILTER
select NETFILTER_XTABLES
help
ebtables is a general, extensible frame/packet identification
--
1.7.9.5
^ permalink raw reply related
* Re: [PATCH RFC] tun: experimental zero copy tx support
From: Michael S. Tsirkin @ 2012-05-14 19:31 UTC (permalink / raw)
To: Eric Dumazet
Cc: Stephen Hemminger, David S. Miller, Joe Perches, Jason Wang,
netdev, linux-kernel, Ian.Campbell, kvm
In-Reply-To: <1337023307.8512.617.camel@edumazet-glaptop>
On Mon, May 14, 2012 at 09:21:47PM +0200, Eric Dumazet wrote:
> On Mon, 2012-05-14 at 22:14 +0300, Michael S. Tsirkin wrote:
>
> > They seem to be in net-next, or did I miss something?
>
> Yes, you re-introduce in this patch some bugs already fixed in macvtap
Could explain why I see some problems in testing :)
Maybe common code should go into net/core?
I couldn't decide whether the increase in kernel
size is worth it.
--
MST
^ permalink raw reply
* Re[2]: [PATCH] netfilter: xt_HMARK: endian bugs
From: Hans Schillstrom @ 2012-05-14 19:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Hans Schillstrom,
Jan Engelhardt, kaber@trash.net, jengelh@medozas.de,
netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
dan.carpenter@oracle.com
>---- Original Message ----
>From: Eric Dumazet <eric.dumazet@gmail.com>
>To: "Pablo Neira Ayuso" <pablo@netfilter.org>
>Cc: "Jozsef Kadlecsik" <kadlec@blackhole.kfki.hu>, "Hans Schillstrom" <hans.schillstrom@ericsson.com>, "Jan Engelhardt" <jengelh@inai.de>, "kaber@trash.net" <kaber@trash.net>, "jengelh@medozas.de" <jengelh@medozas.de>, "netfilter-devel@vger.kernel.org" <netfilter-devel@vger.kernel.org>, "netdev@vger.kernel.org" <netdev@vger.kernel.org>, "dan.carpenter@oracle.com" <dan.carpenter@oracle.com>, "hans@schillstrom.com" <hans@schillstrom.com>
>Sent: Mon, May 14, 2012, 9:14 PM
>Subject: Re: [PATCH] netfilter: xt_HMARK: endian bugs
>
>On Mon, 2012-05-14 at 21:02 +0200, Pablo Neira Ayuso wrote:
>
>> IIRC, Hans wants that, in case you have a cluster composed of systems
>> with different endianness, the hash mark calculated will be the same
>> on both systems, so that the distribution is consistent regardless
>> of endianness.
>
>Then jhash() must be audited to make sure its output is OK with this
>requirement.
I'm preparing tests on mips & ppc right now to verify that as a first step.
^ permalink raw reply
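One common way to meet the requirement discussed here, the same hash mark on little-endian and big-endian hosts, is to hash a byte sequence with a fixed layout and not host-order words. The sketch below illustrates only that idea. It uses FNV-1a as a stand-in hash: it is not jhash() and not the xt_HMARK code, and put_be32(), fnv1a() and flow_hash() are hypothetical helper names. Addresses and ports are taken as numeric values, so the shifts produce the same bytes on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value most-significant byte first, whatever the host order. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* 32-bit FNV-1a over bytes: a stand-in hash, NOT the kernel's jhash(). */
static uint32_t fnv1a(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* Hypothetical flow hash: serialize the tuple in a fixed order, then hash. */
static uint32_t flow_hash(uint32_t saddr, uint32_t daddr,
			  uint16_t sport, uint16_t dport)
{
	uint8_t buf[12];

	put_be32(buf, saddr);
	put_be32(buf + 4, daddr);
	put_be32(buf + 8, ((uint32_t)sport << 16) | dport);
	return fnv1a(buf, sizeof(buf));
}
```

Because the hash only ever sees the serialized bytes, its result depends on the tuple values alone. A word-oriented hash such as jhash() would still need the audit Eric asks for, since it may consume the same memory differently on each architecture.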
* [PATCH -next] net/codel: Add missing #include <linux/prefetch.h>
From: Geert Uytterhoeven @ 2012-05-14 19:47 UTC (permalink / raw)
To: Eric Dumazet, Dave Taht, David S. Miller
Cc: netdev, linux-kernel, linux-next, Geert Uytterhoeven
m68k allmodconfig:
net/sched/sch_codel.c: In function ‘dequeue’:
net/sched/sch_codel.c:70: error: implicit declaration of function ‘prefetch’
make[1]: *** [net/sched/sch_codel.o] Error 1
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
http://kisskb.ellerman.id.au/kisskb/buildresult/6320394/
net/sched/sch_codel.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c
index b4a1a81..213ef60 100644
--- a/net/sched/sch_codel.c
+++ b/net/sched/sch_codel.c
@@ -46,6 +46,7 @@
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/skbuff.h>
+#include <linux/prefetch.h>
#include <net/pkt_sched.h>
#include <net/codel.h>
--
1.7.0.4
^ permalink raw reply related
* pull request: wireless-next 2012-05-14
From: John W. Linville @ 2012-05-14 19:55 UTC (permalink / raw)
To: davem; +Cc: linux-wireless, netdev
commit 341352d13dae752610342923c53ebe461624ee2c
Dave,
This is essentially the same batch of updates for 3.5 that I originally
offered ~1.5 weeks ago. The difference is that I rebased the tree
both to drop the offending bluetooth-next pull and to base on top of
a more recent net-next in order to resolve the continuing NLA_PUT_*
issues affecting linux-next. I also took the opportunity to resolve
a little merge damage resulting from a recent pull of the net tree
into net-next.
Other highlights of this pull request include some HT enhancements
for mac80211, an expansion of the ethtool support for cfg80211-
and mac80211-based drivers, and some more iwlwifi refactoring.
Please let me know if there are problems!
John
---
The following changes since commit 9bb862beb6e5839e92f709d33fda07678f062f20:
Merge branch 'master' of git://1984.lsi.us.es/net-next (2012-05-08 14:40:21 -0400)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next.git master
Amitkumar Karwar (1):
mwifiex: fix static checker warnings
Anisse Astier (2):
rt2x00: debugfs support - allow a register to be empty
rt2x00: Add debugfs access for rfcsr register
Ashok Nagarajan (4):
mac80211: Advertise HT protection mode in IEs
mac80211: Implement HT mixed protection mode
mac80211: Allow nonHT/HT peering in mesh
{nl,cfg,mac}80211: Allow user to see/configure HT protection mode
Ben Greear (4):
cfg80211: Add framework to support ethtool stats.
mac80211: Support getting sta_info stats via ethtool.
mac80211: Framework to get wifi-driver stats via ethtool.
mac80211: Add more ethtools stats: survey, rates, etc
Ben Hutchings (2):
ipw2200: Fix order of device registration
ipw2100: Fix order of device registration
Dan Carpenter (1):
wireless: at76c50x: allocating too much data
Emmanuel Grumbach (3):
iwlwifi: use IWL_* instead of dev_printk when possible
iwlwifi: don't init trans->reg_lock from the op_mode
cfg80211: fix BSS comparison
Franky Lin (4):
brcmfmac: stop releasing sdio host in irq handler
brcmfmac: check bus state for status
brcmfmac: postpone interrupt register function
brcmfmac: add out of band interrupt support
John W. Linville (1):
iwlwifi: fix-up some merge damage from commit 0d6c4a2
Luis R. Rodriguez (1):
libertas: include sched.h on firmware.c
Meenakshi Venkataraman (1):
iwlwifi: use correct released ucode version
Rajkumar Manoharan (1):
mac80211: fix rate control update on 2040 bss change
Stanislav Yakovlev (1):
net/wireless: ipw2200: Fix WARN_ON occurring in wiphy_register called by ipw_pci_probe
Stanislaw Gruszka (1):
iwlwifi: add option to disable 5GHz band
Thomas Pedersen (2):
mac80211: insert mesh peer after init
mac80211: don't transmit 40MHz frames to 20MHz peer
WarheadsSE (1):
mwifiex: add support for SD8786 sdio
Wey-Yi Guy (11):
iwlwifi: remove unused macros
iwlwifi: add BT reduced tx power flag
iwlwifi: add checking for the condition to reduce tx power
iwlwifi: add reduced tx power threshold define
iwlwifi: small define change
iwlwifi: send reduce tx power info in command
iwlwifi: change kill mask based on reduce power state
iwlwifi: add loose coex lut
iwlwifi: modify #ifdef to avoid sparse complain
iwlwifi: remove the iwl_shared reference
iwlwifi: use 6000G2B for 6030 device series
drivers/net/wireless/at76c50x-usb.c | 4 +-
drivers/net/wireless/brcm80211/Kconfig | 9 +
drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c | 97 ++++++++++-
.../net/wireless/brcm80211/brcmfmac/bcmsdh_sdmmc.c | 105 +++++++++++-
drivers/net/wireless/brcm80211/brcmfmac/dhd_sdio.c | 39 +++--
.../net/wireless/brcm80211/brcmfmac/sdio_host.h | 22 ++-
drivers/net/wireless/ipw2x00/ipw2100.c | 24 ++--
drivers/net/wireless/ipw2x00/ipw2200.c | 44 ++---
drivers/net/wireless/iwlwifi/iwl-agn-lib.c | 153 ++++++++---------
drivers/net/wireless/iwlwifi/iwl-agn-rx.c | 3 +-
drivers/net/wireless/iwlwifi/iwl-agn.c | 38 ++--
drivers/net/wireless/iwlwifi/iwl-agn.h | 2 +-
drivers/net/wireless/iwlwifi/iwl-commands.h | 21 ++-
drivers/net/wireless/iwlwifi/iwl-dev.h | 1 +
drivers/net/wireless/iwlwifi/iwl-drv.c | 12 +-
drivers/net/wireless/iwlwifi/iwl-modparams.h | 8 +-
drivers/net/wireless/iwlwifi/iwl-trans-pcie.c | 1 +
drivers/net/wireless/libertas/firmware.c | 1 +
drivers/net/wireless/mwifiex/Kconfig | 4 +-
drivers/net/wireless/mwifiex/fw.h | 3 +-
drivers/net/wireless/mwifiex/sdio.c | 7 +
drivers/net/wireless/mwifiex/sdio.h | 1 +
drivers/net/wireless/rt2x00/rt2800.h | 2 +
drivers/net/wireless/rt2x00/rt2800lib.c | 7 +
drivers/net/wireless/rt2x00/rt2x00debug.c | 82 +++++----
drivers/net/wireless/rt2x00/rt2x00debug.h | 1 +
include/linux/nl80211.h | 3 +
include/net/cfg80211.h | 18 ++
include/net/mac80211.h | 17 ++
net/mac80211/cfg.c | 182 ++++++++++++++++++++
net/mac80211/driver-ops.h | 37 ++++
net/mac80211/driver-trace.h | 15 ++
net/mac80211/ibss.c | 2 +-
net/mac80211/ieee80211_i.h | 3 +-
net/mac80211/mesh.c | 18 ++-
net/mac80211/mesh_plink.c | 96 ++++++++++-
net/mac80211/mlme.c | 2 +-
net/mac80211/sta_info.h | 1 +
net/mac80211/util.c | 9 +-
net/wireless/ethtool.c | 29 +++
net/wireless/mesh.c | 1 +
net/wireless/nl80211.c | 7 +-
net/wireless/scan.c | 6 +-
43 files changed, 893 insertions(+), 244 deletions(-)
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply
* [PATCH net-next 5/8] net/mlx4_core: Do not reset module-parameter num_vfs when fail to enable sriov
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Consider the following scenario: two HCAs, only one of which can run SRIOV.
If we reset the module parameter, all the VFs of the SRIOV HCA will be
claimed by the PPF host (-- the code relies on num_vfs being non-zero
to avoid this claiming, and num_vfs was reset when pci_enable_sriov failed
for the non-SRIOV HCA).
The solution is not to touch the num_vfs parameter.
Also, eliminate the unneeded check of num_vfs when disabling sriov
(the dev flag bit is sufficient).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 8bb05b4..8eed1f2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1865,7 +1865,6 @@ static int __mlx4_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
mlx4_err(dev, "Failed to enable sriov,"
"continuing without sriov enabled"
" (err = %d).\n", err);
- num_vfs = 0;
err = 0;
} else {
mlx4_warn(dev, "Running in master mode\n");
@@ -2022,7 +2021,7 @@ err_cmd:
mlx4_cmd_cleanup(dev);
err_sriov:
- if (num_vfs && (dev->flags & MLX4_FLAG_SRIOV))
+ if (dev->flags & MLX4_FLAG_SRIOV)
pci_disable_sriov(pdev);
err_rel_own:
@@ -2099,7 +2098,7 @@ static void mlx4_remove_one(struct pci_dev *pdev)
if (dev->flags & MLX4_FLAG_MSI_X)
pci_disable_msix(pdev);
- if (num_vfs && (dev->flags & MLX4_FLAG_SRIOV)) {
+ if (dev->flags & MLX4_FLAG_SRIOV) {
mlx4_warn(dev, "Disabling sriov\n");
pci_disable_sriov(pdev);
}
--
1.7.1
^ permalink raw reply related
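The failure mode described in patch 5/8 comes from one device's error path changing a module-wide parameter that other devices depend on. A small user-space model can make that visible. It is a sketch under assumptions, not driver code: struct fake_hca, probe_pf(), host_claims_vf() and vfs_claimed_by_host() are hypothetical, and the "host claims VFs when num_vfs is zero" check is a simplification of what the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_SRIOV 0x1	/* stands in for MLX4_FLAG_SRIOV */

static int num_vfs;	/* module-wide parameter shared by every device */

/* Hypothetical model of one HCA. */
struct fake_hca {
	bool sriov_capable;
	unsigned int flags;
};

/*
 * Model of probing a physical function. reset_on_failure selects the old
 * behaviour (clear num_vfs when enabling SRIOV fails) or the patched one.
 */
static void probe_pf(struct fake_hca *hca, bool reset_on_failure)
{
	hca->flags = 0;
	if (!num_vfs)
		return;				/* SRIOV not requested */
	if (hca->sriov_capable)
		hca->flags |= FLAG_SRIOV;	/* enabling SRIOV succeeded */
	else if (reset_on_failure)
		num_vfs = 0;			/* old code: hits all devices */
}

/* Assumed simplification: the host leaves VFs alone only if num_vfs != 0. */
static bool host_claims_vf(void)
{
	return num_vfs == 0;
}

/* Probe an SRIOV HCA and a non-SRIOV HCA, then ask who owns the VFs. */
static bool vfs_claimed_by_host(bool reset_on_failure)
{
	struct fake_hca sriov = { .sriov_capable = true };
	struct fake_hca plain = { .sriov_capable = false };

	num_vfs = 4;
	probe_pf(&sriov, reset_on_failure);
	probe_pf(&plain, reset_on_failure);
	return (sriov.flags & FLAG_SRIOV) && host_claims_vf();
}
```

In this model vfs_claimed_by_host(true) reports the problem and vfs_claimed_by_host(false) does not. Cleanup can then rely on the per-device flag alone, which is why the patch also drops the num_vfs test before pci_disable_sriov().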
* [PATCH net-next 4/8] net/mlx4_core: Remove unused *_str functions from the resource tracker
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Remove unused *_str helper functions from resource_tracker.c.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 --------------------
1 files changed, 0 insertions(+), 30 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index d8970cc..66df9e1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -89,17 +89,6 @@ enum res_qp_states {
RES_QP_HW
};
-static inline const char *qp_states_str(enum res_qp_states state)
-{
- switch (state) {
- case RES_QP_BUSY: return "RES_QP_BUSY";
- case RES_QP_RESERVED: return "RES_QP_RESERVED";
- case RES_QP_MAPPED: return "RES_QP_MAPPED";
- case RES_QP_HW: return "RES_QP_HW";
- default: return "Unknown";
- }
-}
-
struct res_qp {
struct res_common com;
struct res_mtt *mtt;
@@ -173,16 +162,6 @@ enum res_srq_states {
RES_SRQ_HW,
};
-static inline const char *srq_states_str(enum res_srq_states state)
-{
- switch (state) {
- case RES_SRQ_BUSY: return "RES_SRQ_BUSY";
- case RES_SRQ_ALLOCATED: return "RES_SRQ_ALLOCATED";
- case RES_SRQ_HW: return "RES_SRQ_HW";
- default: return "Unknown";
- }
-}
-
struct res_srq {
struct res_common com;
struct res_mtt *mtt;
@@ -195,15 +174,6 @@ enum res_counter_states {
RES_COUNTER_ALLOCATED,
};
-static inline const char *counter_states_str(enum res_counter_states state)
-{
- switch (state) {
- case RES_COUNTER_BUSY: return "RES_COUNTER_BUSY";
- case RES_COUNTER_ALLOCATED: return "RES_COUNTER_ALLOCATED";
- default: return "Unknown";
- }
-}
-
struct res_counter {
struct res_common com;
int port;
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 3/8] net/mlx4_core: Change SYNC_TPT to be native (not wrapped)
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
The "wrapped" was incorrect, since no wrapper function was defined.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/mr.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index cefa76f..af55b7c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -892,6 +892,6 @@ EXPORT_SYMBOL_GPL(mlx4_fmr_free);
int mlx4_SYNC_TPT(struct mlx4_dev *dev)
{
return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT, 1000,
- MLX4_CMD_WRAPPED);
+ MLX4_CMD_NATIVE);
}
EXPORT_SYMBOL_GPL(mlx4_SYNC_TPT);
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 2/8] net/mlx4_core: Fix init_port mask state for slaves
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
In function mlx4_INIT_PORT_wrapper, the port state mask for the
slave is only set if we are invoking the INIT_PORT fw command.
However, the reference count for the (initialized) port is
incremented anyway.
This creates a problem in that when we have multiple slaves,
then the CLOSE_PORT command will never be invoked. The
reason is that in the CLOSE_PORT wrapper, if the port-state
mask is zero for the slave (which it is), the wrapper returns
without doing anything. The only slave which will not return
immediately in the CLOSE_PORT wrapper is that slave for which
INIT_PORT was invoked.
The fix is to not have the port-state mask setting depend
on the logic for calling the INIT_PORT fw command.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/fw.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 2a02ba5..24429a9 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1164,9 +1164,8 @@ int mlx4_INIT_PORT_wrapper(struct mlx4_dev *dev, int slave,
MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE);
if (err)
return err;
- priv->mfunc.master.slave_state[slave].init_port_mask |=
- (1 << port);
}
+ priv->mfunc.master.slave_state[slave].init_port_mask |= (1 << port);
++priv->mfunc.master.init_port_ref[port];
return 0;
}
--
1.7.1
^ permalink raw reply related
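The imbalance fixed by patch 2/8 can be reproduced with a small user-space model of the bookkeeping. This is a sketch under assumptions and not the driver code: struct port_model and the functions below are hypothetical, firmware commands are reduced to a counter, all ports share one reference count, and the "issue CLOSE_PORT when the reference count is 1" rule is an assumption about the wrapper that the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLAVES 4

/* Hypothetical, simplified model of the master's bookkeeping for one port. */
struct port_model {
	int init_port_ref;			/* slaves that initialized the port */
	int fw_close_calls;			/* times CLOSE_PORT reached firmware */
	unsigned int init_port_mask[MAX_SLAVES];	/* per slave, one bit per port */
};

/* fixed == false models the old mask placement, true the patched one. */
static void init_port(struct port_model *m, int slave, int port, bool fixed)
{
	if (m->init_port_ref == 0) {
		/* first user: the INIT_PORT fw command would be issued here */
		if (!fixed)
			m->init_port_mask[slave] |= 1u << port;
	}
	if (fixed)
		m->init_port_mask[slave] |= 1u << port;
	++m->init_port_ref;		/* incremented in both variants */
}

static void close_port(struct port_model *m, int slave, int port)
{
	if (!(m->init_port_mask[slave] & (1u << port)))
		return;			/* slave never marked: nothing happens */
	if (m->init_port_ref == 1)
		++m->fw_close_calls;	/* last user: CLOSE_PORT fw command */
	m->init_port_mask[slave] &= ~(1u << port);
	--m->init_port_ref;
}

/* Two slaves open and close the same port; return the fw CLOSE_PORT count. */
static int run_two_slaves(bool fixed)
{
	struct port_model m = { .init_port_ref = 0 };

	init_port(&m, 0, 1, fixed);
	init_port(&m, 1, 1, fixed);
	close_port(&m, 0, 1);
	close_port(&m, 1, 1);
	return m.fw_close_calls;
}
```

With the old placement the second slave's mask bit is never set, so its close returns early, the reference count stays at 1 and the firmware command is never issued. With the patched placement both slaves are marked and the last close reaches the firmware.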
* [PATCH net-next 1/8] net/mlx4: Address build warnings on set but not used variables
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
Handle the compiler warnings on variables which are set but not used
by removing the relevant variable or casting a return value which is
ignored on purpose to void.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/cmd.c | 5 +----
drivers/net/ethernet/mellanox/mlx4/mcg.c | 2 --
drivers/net/ethernet/mellanox/mlx4/mr.c | 3 ---
drivers/net/ethernet/mellanox/mlx4/port.c | 3 +--
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 7 +++----
5 files changed, 5 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 773c70e..53b738b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -1254,7 +1254,6 @@ static void mlx4_master_do_cmd(struct mlx4_dev *dev, int slave, u8 cmd,
struct mlx4_priv *priv = mlx4_priv(dev);
struct mlx4_slave_state *slave_state = priv->mfunc.master.slave_state;
u32 reply;
- u32 slave_status = 0;
u8 is_going_down = 0;
int i;
@@ -1274,10 +1273,8 @@ static void mlx4_master_do_cmd(struct mlx4_dev *dev, int slave, u8 cmd,
}
/*check if we are in the middle of FLR process,
if so return "retry" status to the slave*/
- if (MLX4_COMM_CMD_FLR == slave_state[slave].last_cmd) {
- slave_status = MLX4_DELAY_RESET_SLAVE;
+ if (MLX4_COMM_CMD_FLR == slave_state[slave].last_cmd)
goto inform_slave_state;
- }
/* write the version in the event field */
reply |= mlx4_comm_get_version();
diff --git a/drivers/net/ethernet/mellanox/mlx4/mcg.c b/drivers/net/ethernet/mellanox/mlx4/mcg.c
index 4799e82..f4a8f98 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mcg.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mcg.c
@@ -357,7 +357,6 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
u32 prot;
int i;
bool found;
- int last_index;
int err;
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -419,7 +418,6 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
if (err)
goto out_mailbox;
}
- last_index = entry->index;
}
/* add the new qpn to list of promisc qps */
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index fe2ac84..cefa76f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -788,7 +788,6 @@ int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
int max_maps, u8 page_shift, struct mlx4_fmr *fmr)
{
struct mlx4_priv *priv = mlx4_priv(dev);
- u64 mtt_offset;
int err = -ENOMEM;
if (max_maps > dev->caps.max_fmr_maps)
@@ -811,8 +810,6 @@ int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
if (err)
return err;
- mtt_offset = fmr->mr.mtt.offset * dev->caps.mtt_entry_sz;
-
fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table,
fmr->mr.mtt.offset,
&fmr->dma_handle);
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index 55b12e6..70ef4f1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -338,11 +338,10 @@ EXPORT_SYMBOL_GPL(__mlx4_unregister_mac);
void mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac)
{
u64 out_param;
- int err;
if (mlx4_is_mfunc(dev)) {
set_param_l(&out_param, port);
- err = mlx4_cmd_imm(dev, mac, &out_param, RES_MAC,
+ (void) mlx4_cmd_imm(dev, mac, &out_param, RES_MAC,
RES_OP_RESERVE_AND_MAP, MLX4_CMD_FREE_RES,
MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
return;
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 8752e6e..d8970cc 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -2536,7 +2536,7 @@ int mlx4_QP_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_qp qp; /* dummy for calling attach/detach */
u8 *gid = inbox->buf;
enum mlx4_protocol prot = (vhcr->in_modifier >> 28) & 0x7;
- int err, err1;
+ int err;
int qpn;
struct res_qp *rqp;
int attach = vhcr->op_modifier;
@@ -2571,7 +2571,7 @@ int mlx4_QP_ATTACH_wrapper(struct mlx4_dev *dev, int slave,
ex_rem:
/* ignore error return below, already in error */
- err1 = rem_mcg_res(dev, slave, rqp, gid, prot, type);
+ (void) rem_mcg_res(dev, slave, rqp, gid, prot, type);
ex_put:
put_res(dev, slave, qpn, RES_QP);
@@ -2604,12 +2604,11 @@ static void detach_qp(struct mlx4_dev *dev, int slave, struct res_qp *rqp)
{
struct res_gid *rgid;
struct res_gid *tmp;
- int err;
struct mlx4_qp qp; /* dummy for calling attach/detach */
list_for_each_entry_safe(rgid, tmp, &rqp->mcg_list, list) {
qp.qpn = rqp->local_qpn;
- err = mlx4_qp_detach_common(dev, &qp, rgid->gid, rgid->prot,
+ (void) mlx4_qp_detach_common(dev, &qp, rgid->gid, rgid->prot,
rgid->steer);
list_del(&rgid->list);
kfree(rgid);
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 0/8] batch of mlx4 driver fixes
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Or Gerlitz, Jack Morgenstein
Hi Dave,
This is a batch of relatively small fixes for the mlx4_core driver. Except for
one cleanup patch from me, all the patches are from Jack Morgenstein, who
leads our SRIOV development efforts, and they relate to the SRIOV functionality.
With Yevgeny being mostly out this week, and as both of us run the internal
reviews for upstream patches, he delegated this submission to me. I hope this
is fine by you.
Or.
Jack Morgenstein (7):
net/mlx4_core: Fix init_port mask state for slaves
net/mlx4_core: Change SYNC_TPT to be native (not wrapped)
net/mlx4_core: Remove unused *_str functions from the resource tracker
net/mlx4_core: Do not reset module-parameter num_vfs when fail to enable sriov
net/mlx4_core: Fix potential kernel Oops in res tracker during Dom0 driver unload
net/mlx4_core: Add XRC domains and counters to resource tracker
net/mlx4_core: Fixed error flow in rem_slave_eqs
Or Gerlitz (1):
net/mlx4: Address build warnings on set but not used variables
drivers/net/ethernet/mellanox/mlx4/cmd.c | 7 +-
drivers/net/ethernet/mellanox/mlx4/fw.c | 3 +-
drivers/net/ethernet/mellanox/mlx4/main.c | 47 +++-
drivers/net/ethernet/mellanox/mlx4/mcg.c | 2 -
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 12 +-
drivers/net/ethernet/mellanox/mlx4/mr.c | 5 +-
drivers/net/ethernet/mellanox/mlx4/pd.c | 39 +++-
drivers/net/ethernet/mellanox/mlx4/port.c | 3 +-
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 269 ++++++++++++++++----
9 files changed, 316 insertions(+), 71 deletions(-)
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
^ permalink raw reply
* [PATCH net-next 8/8] net/mlx4_core: Fixed error flow in rem_slave_eqs
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 13 ++++++-------
1 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 626d1e3..2650f23 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -3152,14 +3152,13 @@ static void rem_slave_eqs(struct mlx4_dev *dev, int slave)
MLX4_CMD_HW2SW_EQ,
MLX4_CMD_TIME_CLASS_A,
MLX4_CMD_NATIVE);
- mlx4_dbg(dev, "rem_slave_eqs: failed"
- " to move slave %d eqs %d to"
- " SW ownership\n", slave, eqn);
+ if (err)
+ mlx4_dbg(dev, "rem_slave_eqs: failed"
+ " to move slave %d eqs %d to"
+ " SW ownership\n", slave, eqn);
mlx4_free_cmd_mailbox(dev, mailbox);
- if (!err) {
- atomic_dec(&eq->mtt->ref_count);
- state = RES_EQ_RESERVED;
- }
+ atomic_dec(&eq->mtt->ref_count);
+ state = RES_EQ_RESERVED;
break;
default:
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 6/8] net/mlx4_core: Fix potential kernel Oops in res tracker during Dom0 driver unload
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Currently the slave and master resources are deleted after the master has
freed all bitmaps. If any resources were not properly cleaned up during the
shutdown process, an Oops would result.
Fix this by deleting only the slave resources during cleanup. Master resources
are cleaned up during the unload process and need not be cleaned separately.
Note that during cleanup, we need to split the resource-tracker freeing
functionality.
Before removing all the bitmaps, we free any leftover slave resources.
However, we can only remove the resource tracker linked list after
all bitmap frees, since some of the freeing functions (e.g.,
mlx4_cleanup_eq_table) use paravirtualized FW commands which expect
the resource tracker linked list to be present.
Found-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
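A minimal user-space sketch of the split described in the changelog, for readers who want to see the two phases in isolation. All names here (struct tracker, tracker_free, the FREE_* values) are hypothetical stand-ins and not the mlx4 API; the per-slave "resources" are reduced to a counter, and the logic only mirrors the shape of the new mlx4_free_resource_tracker().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the mlx4 API. */
enum free_type { FREE_ALL, FREE_SLAVES_ONLY, FREE_STRUCTS_ONLY };

struct tracker {
	int *slave_list;	/* leftover resource count per function; NULL once freed */
	int num_slaves;
	int master;		/* index of the master function */
};

static void tracker_free(struct tracker *t, enum free_type type)
{
	int i;

	assert(t != NULL);
	if (!t->slave_list)
		return;

	/* Phase 1: drop leftover resources while the list still exists. */
	if (type != FREE_STRUCTS_ONLY)
		for (i = 0; i < t->num_slaves; i++)
			if (type == FREE_ALL || i != t->master)
				t->slave_list[i] = 0;

	/* Phase 2: release the bookkeeping and guard against reuse. */
	if (type != FREE_SLAVES_ONLY) {
		free(t->slave_list);
		t->slave_list = NULL;
	}
}
```

In this sketch, a caller would run tracker_free(&t, FREE_SLAVES_ONLY) before tearing down the bitmaps and tracker_free(&t, FREE_STRUCTS_ONLY) afterwards, which is the ordering the mlx4_remove_one() hunk below follows.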
drivers/net/ethernet/mellanox/mlx4/cmd.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/main.c | 7 ++++++-
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 8 +++++++-
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 17 ++++++++++++-----
4 files changed, 26 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 53b738b..1bcead1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -1554,7 +1554,7 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
return 0;
err_resource:
- mlx4_free_resource_tracker(dev);
+ mlx4_free_resource_tracker(dev, RES_TR_FREE_ALL);
err_thread:
flush_workqueue(priv->mfunc.master.comm_wq);
destroy_workqueue(priv->mfunc.master.comm_wq);
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 8eed1f2..2e94f76 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2069,6 +2069,10 @@ static void mlx4_remove_one(struct pci_dev *pdev)
mlx4_CLOSE_PORT(dev, p);
}
+ if (mlx4_is_master(dev))
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_SLAVES_ONLY);
+
mlx4_cleanup_counters_table(dev);
mlx4_cleanup_mcg_table(dev);
mlx4_cleanup_qp_table(dev);
@@ -2081,7 +2085,8 @@ static void mlx4_remove_one(struct pci_dev *pdev)
mlx4_cleanup_pd_table(dev);
if (mlx4_is_master(dev))
- mlx4_free_resource_tracker(dev);
+ mlx4_free_resource_tracker(dev,
+ RES_TR_FREE_STRUCTS_ONLY);
iounmap(priv->kar);
mlx4_uar_free(dev, &priv->driver_uar);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index cd56f1a..8767fbf 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -146,6 +146,11 @@ enum mlx4_alloc_mode {
RES_OP_MAP_ICM,
};
+enum mlx4_res_tracker_free_type {
+ RES_TR_FREE_ALL,
+ RES_TR_FREE_SLAVES_ONLY,
+ RES_TR_FREE_STRUCTS_ONLY,
+};
/*
*Virtual HCR structures.
@@ -1027,7 +1032,8 @@ int mlx4_get_slave_from_resource_id(struct mlx4_dev *dev,
void mlx4_delete_all_resources_for_slave(struct mlx4_dev *dev, int slave_id);
int mlx4_init_resource_tracker(struct mlx4_dev *dev);
-void mlx4_free_resource_tracker(struct mlx4_dev *dev);
+void mlx4_free_resource_tracker(struct mlx4_dev *dev,
+ enum mlx4_res_tracker_free_type type);
int mlx4_SET_PORT_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 66df9e1..e17d759 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -224,16 +224,23 @@ int mlx4_init_resource_tracker(struct mlx4_dev *dev)
return 0 ;
}
-void mlx4_free_resource_tracker(struct mlx4_dev *dev)
+void mlx4_free_resource_tracker(struct mlx4_dev *dev,
+ enum mlx4_res_tracker_free_type type)
{
struct mlx4_priv *priv = mlx4_priv(dev);
int i;
if (priv->mfunc.master.res_tracker.slave_list) {
- for (i = 0 ; i < dev->num_slaves; i++)
- mlx4_delete_all_resources_for_slave(dev, i);
-
- kfree(priv->mfunc.master.res_tracker.slave_list);
+ if (type != RES_TR_FREE_STRUCTS_ONLY)
+ for (i = 0 ; i < dev->num_slaves; i++)
+ if (type == RES_TR_FREE_ALL ||
+ dev->caps.function != i)
+ mlx4_delete_all_resources_for_slave(dev, i);
+
+ if (type != RES_TR_FREE_SLAVES_ONLY) {
+ kfree(priv->mfunc.master.res_tracker.slave_list);
+ priv->mfunc.master.res_tracker.slave_list = NULL;
+ }
}
}
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 7/8] net/mlx4_core: Add XRC domains and counters to resource tracker
From: Or Gerlitz @ 2012-05-14 20:04 UTC (permalink / raw)
To: davem; +Cc: roland, netdev, Jack Morgenstein, Or Gerlitz
In-Reply-To: <1337025853-26685-1-git-send-email-ogerlitz@mellanox.com>
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Add missing resource tracking for XRC domains and complete the tracking for HCA
network flow counters.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 35 ++++-
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 4 +
drivers/net/ethernet/mellanox/mlx4/pd.c | 39 ++++-
.../net/ethernet/mellanox/mlx4/resource_tracker.c | 202 +++++++++++++++++++-
4 files changed, 275 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 2e94f76..984ace4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1306,7 +1306,7 @@ static void mlx4_cleanup_counters_table(struct mlx4_dev *dev)
mlx4_bitmap_cleanup(&mlx4_priv(dev)->counters_bitmap);
}
-int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
+int __mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
{
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -1319,13 +1319,44 @@ int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
return 0;
}
+
+int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
+{
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param, RES_COUNTER,
+ RES_OP_RESERVE, MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (!err)
+ *idx = get_param_l(&out_param);
+
+ return err;
+ }
+ return __mlx4_counter_alloc(dev, idx);
+}
EXPORT_SYMBOL_GPL(mlx4_counter_alloc);
-void mlx4_counter_free(struct mlx4_dev *dev, u32 idx)
+void __mlx4_counter_free(struct mlx4_dev *dev, u32 idx)
{
mlx4_bitmap_free(&mlx4_priv(dev)->counters_bitmap, idx);
return;
}
+
+void mlx4_counter_free(struct mlx4_dev *dev, u32 idx)
+{
+ u64 in_param;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, idx);
+ mlx4_cmd(dev, in_param, RES_COUNTER, RES_OP_RESERVE,
+ MLX4_CMD_FREE_RES, MLX4_CMD_TIME_CLASS_A,
+ MLX4_CMD_WRAPPED);
+ return;
+ }
+ __mlx4_counter_free(dev, idx);
+}
EXPORT_SYMBOL_GPL(mlx4_counter_free);
static int mlx4_setup_hca(struct mlx4_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 8767fbf..86b6e5a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -876,6 +876,10 @@ void __mlx4_unregister_mac(struct mlx4_dev *dev, u8 port, u64 mac);
int __mlx4_replace_mac(struct mlx4_dev *dev, u8 port, int qpn, u64 new_mac);
int __mlx4_write_mtt(struct mlx4_dev *dev, struct mlx4_mtt *mtt,
int start_index, int npages, u64 *page_list);
+int __mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx);
+void __mlx4_counter_free(struct mlx4_dev *dev, u32 idx);
+int __mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn);
+void __mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn);
void mlx4_start_catas_poll(struct mlx4_dev *dev);
void mlx4_stop_catas_poll(struct mlx4_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index db4746d..1ac8863 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -63,7 +63,7 @@ void mlx4_pd_free(struct mlx4_dev *dev, u32 pdn)
}
EXPORT_SYMBOL_GPL(mlx4_pd_free);
-int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
+int __mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
{
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -73,12 +73,47 @@ int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
return 0;
}
+
+int mlx4_xrcd_alloc(struct mlx4_dev *dev, u32 *xrcdn)
+{
+ u64 out_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ err = mlx4_cmd_imm(dev, 0, &out_param,
+ RES_XRCD, RES_OP_RESERVE,
+ MLX4_CMD_ALLOC_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ return err;
+
+ *xrcdn = get_param_l(&out_param);
+ return 0;
+ }
+ return __mlx4_xrcd_alloc(dev, xrcdn);
+}
EXPORT_SYMBOL_GPL(mlx4_xrcd_alloc);
-void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn)
+void __mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn)
{
mlx4_bitmap_free(&mlx4_priv(dev)->xrcd_bitmap, xrcdn);
}
+
+void mlx4_xrcd_free(struct mlx4_dev *dev, u32 xrcdn)
+{
+ u64 in_param;
+ int err;
+
+ if (mlx4_is_mfunc(dev)) {
+ set_param_l(&in_param, xrcdn);
+ err = mlx4_cmd(dev, in_param, RES_XRCD,
+ RES_OP_RESERVE, MLX4_CMD_FREE_RES,
+ MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
+ if (err)
+ mlx4_warn(dev, "Failed to release xrcdn %d\n", xrcdn);
+ } else
+ __mlx4_xrcd_free(dev, xrcdn);
+}
EXPORT_SYMBOL_GPL(mlx4_xrcd_free);
int mlx4_init_pd_table(struct mlx4_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index e17d759..626d1e3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -179,6 +179,16 @@ struct res_counter {
int port;
};
+enum res_xrcdn_states {
+ RES_XRCD_BUSY = RES_ANY_BUSY,
+ RES_XRCD_ALLOCATED,
+};
+
+struct res_xrcdn {
+ struct res_common com;
+ int port;
+};
+
/* For Debug uses */
static const char *ResourceType(enum mlx4_resource rt)
{
@@ -191,6 +201,7 @@ static const char *ResourceType(enum mlx4_resource rt)
case RES_MAC: return "RES_MAC";
case RES_EQ: return "RES_EQ";
case RES_COUNTER: return "RES_COUNTER";
+ case RES_XRCD: return "RES_XRCD";
default: return "Unknown resource type !!!";
};
}
@@ -448,6 +459,20 @@ static struct res_common *alloc_counter_tr(int id)
return &ret->com;
}
+static struct res_common *alloc_xrcdn_tr(int id)
+{
+ struct res_xrcdn *ret;
+
+ ret = kzalloc(sizeof *ret, GFP_KERNEL);
+ if (!ret)
+ return NULL;
+
+ ret->com.res_id = id;
+ ret->com.state = RES_XRCD_ALLOCATED;
+
+ return &ret->com;
+}
+
static struct res_common *alloc_tr(int id, enum mlx4_resource type, int slave,
int extra)
{
@@ -478,7 +503,9 @@ static struct res_common *alloc_tr(int id, enum mlx4_resource type, int slave,
case RES_COUNTER:
ret = alloc_counter_tr(id);
break;
-
+ case RES_XRCD:
+ ret = alloc_xrcdn_tr(id);
+ break;
default:
return NULL;
}
@@ -601,6 +628,16 @@ static int remove_counter_ok(struct res_counter *res)
return 0;
}
+static int remove_xrcdn_ok(struct res_xrcdn *res)
+{
+ if (res->com.state == RES_XRCD_BUSY)
+ return -EBUSY;
+ else if (res->com.state != RES_XRCD_ALLOCATED)
+ return -EPERM;
+
+ return 0;
+}
+
static int remove_cq_ok(struct res_cq *res)
{
if (res->com.state == RES_CQ_BUSY)
@@ -640,6 +677,8 @@ static int remove_ok(struct res_common *res, enum mlx4_resource type, int extra)
return remove_eq_ok((struct res_eq *)res);
case RES_COUNTER:
return remove_counter_ok((struct res_counter *)res);
+ case RES_XRCD:
+ return remove_xrcdn_ok((struct res_xrcdn *)res);
default:
return -EINVAL;
}
@@ -1246,6 +1285,50 @@ static int vlan_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
return 0;
}
+static int counter_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ u32 index;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ err = __mlx4_counter_alloc(dev, &index);
+ if (err)
+ return err;
+
+ err = add_res_range(dev, slave, index, 1, RES_COUNTER, 0);
+ if (err)
+ __mlx4_counter_free(dev, index);
+ else
+ set_param_l(out_param, index);
+
+ return err;
+}
+
+static int xrcdn_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ u32 xrcdn;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ err = __mlx4_xrcd_alloc(dev, &xrcdn);
+ if (err)
+ return err;
+
+ err = add_res_range(dev, slave, xrcdn, 1, RES_XRCD, 0);
+ if (err)
+ __mlx4_xrcd_free(dev, xrcdn);
+ else
+ set_param_l(out_param, xrcdn);
+
+ return err;
+}
+
int mlx4_ALLOC_RES_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
@@ -1291,6 +1374,16 @@ int mlx4_ALLOC_RES_wrapper(struct mlx4_dev *dev, int slave,
vhcr->in_param, &vhcr->out_param);
break;
+ case RES_COUNTER:
+ err = counter_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_XRCD:
+ err = xrcdn_alloc_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
default:
err = -EINVAL;
break;
@@ -1473,6 +1566,44 @@ static int vlan_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
return 0;
}
+static int counter_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int index;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ index = get_param_l(&in_param);
+ err = rem_res_range(dev, slave, index, 1, RES_COUNTER, 0);
+ if (err)
+ return err;
+
+ __mlx4_counter_free(dev, index);
+
+ return err;
+}
+
+static int xrcdn_free_res(struct mlx4_dev *dev, int slave, int op, int cmd,
+ u64 in_param, u64 *out_param)
+{
+ int xrcdn;
+ int err;
+
+ if (op != RES_OP_RESERVE)
+ return -EINVAL;
+
+ xrcdn = get_param_l(&in_param);
+ err = rem_res_range(dev, slave, xrcdn, 1, RES_XRCD, 0);
+ if (err)
+ return err;
+
+ __mlx4_xrcd_free(dev, xrcdn);
+
+ return err;
+}
+
int mlx4_FREE_RES_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
@@ -1518,6 +1649,15 @@ int mlx4_FREE_RES_wrapper(struct mlx4_dev *dev, int slave,
vhcr->in_param, &vhcr->out_param);
break;
+ case RES_COUNTER:
+ err = counter_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+ break;
+
+ case RES_XRCD:
+ err = xrcdn_free_res(dev, slave, vhcr->op_modifier, alop,
+ vhcr->in_param, &vhcr->out_param);
+
default:
break;
}
@@ -3032,6 +3172,64 @@ static void rem_slave_eqs(struct mlx4_dev *dev, int slave)
spin_unlock_irq(mlx4_tlock(dev));
}
+static void rem_slave_counters(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *counter_list =
+ &tracker->slave_list[slave].res_list[RES_COUNTER];
+ struct res_counter *counter;
+ struct res_counter *tmp;
+ int err;
+ int index;
+
+ err = move_all_busy(dev, slave, RES_COUNTER);
+ if (err)
+ mlx4_warn(dev, "rem_slave_counters: Could not move all counters to "
+ "busy for slave %d\n", slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(counter, tmp, counter_list, com.list) {
+ if (counter->com.owner == slave) {
+ index = counter->com.res_id;
+ radix_tree_delete(&tracker->res_tree[RES_COUNTER], index);
+ list_del(&counter->com.list);
+ kfree(counter);
+ __mlx4_counter_free(dev, index);
+ }
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
+static void rem_slave_xrcdns(struct mlx4_dev *dev, int slave)
+{
+ struct mlx4_priv *priv = mlx4_priv(dev);
+ struct mlx4_resource_tracker *tracker = &priv->mfunc.master.res_tracker;
+ struct list_head *xrcdn_list =
+ &tracker->slave_list[slave].res_list[RES_XRCD];
+ struct res_xrcdn *xrcd;
+ struct res_xrcdn *tmp;
+ int err;
+ int xrcdn;
+
+ err = move_all_busy(dev, slave, RES_XRCD);
+ if (err)
+ mlx4_warn(dev, "rem_slave_xrcdns: Could not move all xrcdns to "
+ "busy for slave %d\n", slave);
+
+ spin_lock_irq(mlx4_tlock(dev));
+ list_for_each_entry_safe(xrcd, tmp, xrcdn_list, com.list) {
+ if (xrcd->com.owner == slave) {
+ xrcdn = xrcd->com.res_id;
+ radix_tree_delete(&tracker->res_tree[RES_XRCD], xrcdn);
+ list_del(&xrcd->com.list);
+ kfree(xrcd);
+ __mlx4_xrcd_free(dev, xrcdn);
+ }
+ }
+ spin_unlock_irq(mlx4_tlock(dev));
+}
+
void mlx4_delete_all_resources_for_slave(struct mlx4_dev *dev, int slave)
{
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -3045,5 +3243,7 @@ void mlx4_delete_all_resources_for_slave(struct mlx4_dev *dev, int slave)
rem_slave_mrs(dev, slave);
rem_slave_eqs(dev, slave);
rem_slave_mtts(dev, slave);
+ rem_slave_counters(dev, slave);
+ rem_slave_xrcdns(dev, slave);
mutex_unlock(&priv->mfunc.master.res_tracker.slave_list[slave].mutex);
}
--
1.7.1
^ permalink raw reply related
* Re: [PATCH 1/4] netfilter: ipset: fix timeout value overflow bug
From: Pablo Neira Ayuso @ 2012-05-14 20:10 UTC (permalink / raw)
To: Jozsef Kadlecsik
Cc: Eric Dumazet, David Laight, netfilter-devel, davem, netdev
In-Reply-To: <alpine.DEB.2.00.1205141943240.28291@blackhole.kfki.hu>
[-- Attachment #1: Type: text/plain, Size: 1062 bytes --]
On Mon, May 14, 2012 at 07:45:07PM +0200, Jozsef Kadlecsik wrote:
> On Mon, 14 May 2012, Eric Dumazet wrote:
>
> > On Mon, 2012-05-14 at 16:36 +0200, Pablo Neira Ayuso wrote:
> > > On Mon, May 14, 2012 at 03:19:49PM +0100, David Laight wrote:
> > > >
> > > > > --- a/include/linux/netfilter/ipset/ip_set_timeout.h
> > > > > +++ b/include/linux/netfilter/ipset/ip_set_timeout.h
> > > > > @@ -30,6 +30,10 @@ ip_set_timeout_uget(struct nlattr *tb)
> > > > > {
> > > > > unsigned int timeout = ip_set_get_h32(tb);
> > > > >
> > > > > + /* Normalize to fit into jiffies */
> > > > > + if (timeout > UINT_MAX/1000)
> > > > > + timeout = UINT_MAX/1000;
> > > > > +
> > > >
> > > > Doesn't that rather assume that HZ is 1000 ?
> > >
> > > Indeed. I overlooked that. Thanks David.
> >
> > I dont think so.
> >
> > 1000 here is really MSEC_PER_SEC
> >
> > (All occurrences should be changed in this file)
>
> Yes, exactly. Pablo, shall I produce a little patch or could you change
> 1000 to MSEC_PER_SEC?
New patch attached.
I have rebased my tree again.
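For readers following the MSEC_PER_SEC point, here is a small user-space sketch of the clamp. It assumes the timeout is supplied in seconds and later multiplied by MSEC_PER_SEC before the jiffies conversion; the helper names are hypothetical, and MSEC_PER_SEC is defined locally with the same value as the kernel macro.

```c
#include <assert.h>
#include <limits.h>

/* Local definition for the sketch; same value as the kernel's macro. */
#define MSEC_PER_SEC 1000U

/* Hypothetical helper: cap seconds so that secs * MSEC_PER_SEC still fits. */
static unsigned int clamp_timeout_secs(unsigned int secs)
{
	if (secs > UINT_MAX / MSEC_PER_SEC)
		secs = UINT_MAX / MSEC_PER_SEC;
	return secs;
}

/* Hypothetical helper: the multiplication that would wrap without the clamp. */
static unsigned int timeout_to_msecs(unsigned int secs)
{
	return clamp_timeout_secs(secs) * MSEC_PER_SEC;
}
```

The bound depends only on the seconds-to-milliseconds factor, not on HZ, which is why the constant in the check is MSEC_PER_SEC.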
[-- Attachment #2: 0001-netfilter-ipset-fix-timeout-value-overflow-bug.patch --]
[-- Type: text/x-diff, Size: 2715 bytes --]
From 9fdc90b2d00ecf98797c147d58cfe324fc92c9ed Mon Sep 17 00:00:00 2001
From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Date: Mon, 7 May 2012 02:35:44 +0000
Subject: [PATCH] netfilter: ipset: fix timeout value overflow bug
Large timeout parameters could result in wrong timeout values due to
an overflow in the msec to jiffies conversion (reported by Andreas Herz)
[ This patch was mangled by Pablo Neira Ayuso since David Laight and
Eric Dumazet noticed that we were using hardcoded 1000 instead of
MSEC_PER_SEC to calculate the timeout ]
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter/ipset/ip_set_timeout.h | 4 ++++
net/netfilter/xt_set.c | 15 +++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/include/linux/netfilter/ipset/ip_set_timeout.h b/include/linux/netfilter/ipset/ip_set_timeout.h
index 4792320..41d9cfa 100644
--- a/include/linux/netfilter/ipset/ip_set_timeout.h
+++ b/include/linux/netfilter/ipset/ip_set_timeout.h
@@ -30,6 +30,10 @@ ip_set_timeout_uget(struct nlattr *tb)
{
unsigned int timeout = ip_set_get_h32(tb);
+ /* Normalize to fit into jiffies */
+ if (timeout > UINT_MAX/MSEC_PER_SEC)
+ timeout = UINT_MAX/MSEC_PER_SEC;
+
/* Userspace supplied TIMEOUT parameter: adjust crazy size */
return timeout == IPSET_NO_TIMEOUT ? IPSET_NO_TIMEOUT - 1 : timeout;
}
diff --git a/net/netfilter/xt_set.c b/net/netfilter/xt_set.c
index 0ec8138..035960e 100644
--- a/net/netfilter/xt_set.c
+++ b/net/netfilter/xt_set.c
@@ -44,6 +44,14 @@ const struct ip_set_adt_opt n = { \
.cmdflags = cfs, \
.timeout = t, \
}
+#define ADT_MOPT(n, f, d, fs, cfs, t) \
+struct ip_set_adt_opt n = { \
+ .family = f, \
+ .dim = d, \
+ .flags = fs, \
+ .cmdflags = cfs, \
+ .timeout = t, \
+}
/* Revision 0 interface: backward compatible with netfilter/iptables */
@@ -296,11 +304,14 @@ static unsigned int
set_target_v2(struct sk_buff *skb, const struct xt_action_param *par)
{
const struct xt_set_info_target_v2 *info = par->targinfo;
- ADT_OPT(add_opt, par->family, info->add_set.dim,
- info->add_set.flags, info->flags, info->timeout);
+ ADT_MOPT(add_opt, par->family, info->add_set.dim,
+ info->add_set.flags, info->flags, info->timeout);
ADT_OPT(del_opt, par->family, info->del_set.dim,
info->del_set.flags, 0, UINT_MAX);
+ /* Normalize to fit into jiffies */
+ if (add_opt.timeout > UINT_MAX/MSEC_PER_SEC)
+ add_opt.timeout = UINT_MAX/MSEC_PER_SEC;
if (info->add_set.index != IPSET_INVALID_ID)
ip_set_add(info->add_set.index, skb, par, &add_opt);
if (info->del_set.index != IPSET_INVALID_ID)
--
1.7.10
^ permalink raw reply related
* Re: [PATCH 01/12] netvm: Prevent a stream-specific deadlock
From: David Miller @ 2012-05-14 20:26 UTC (permalink / raw)
To: mgorman
Cc: akpm, linux-mm, netdev, linux-nfs, linux-kernel, Trond.Myklebust,
neilb, hch, a.p.zijlstra, michaelc, emunson
In-Reply-To: <20120514105604.GB29102@suse.de>
From: Mel Gorman <mgorman@suse.de>
Date: Mon, 14 May 2012 11:56:04 +0100
> On Fri, May 11, 2012 at 01:10:34AM -0400, David Miller wrote:
>> From: Mel Gorman <mgorman@suse.de>
>> Date: Thu, 10 May 2012 14:54:14 +0100
>>
>> > It could happen that all !SOCK_MEMALLOC sockets have buffered so
>> > much data that we're over the global rmem limit. This will prevent
>> > SOCK_MEMALLOC buffers from receiving data, which will prevent userspace
>> > from running, which is needed to reduce the buffered data.
>> >
>> > Fix this by exempting the SOCK_MEMALLOC sockets from the rmem limit.
>> >
>> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> > Signed-off-by: Mel Gorman <mgorman@suse.de>
>>
>> This introduces an invariant which I am not so sure is enforced.
>>
>> With this change it is absolutely required that once a socket
>> becomes SOCK_MEMALLOC it must never _ever_ lose that attribute.
>>
>
> This is effectively true. In the NFS case, the flag is cleared on
> swapoff after all the entries have been paged in. In the NBD case,
> SOCK_MEMALLOC is left set until the socket is destroyed. I'll update the
> changelog.
Bugs happen, you need to find a way to assert that nobody ever does
this. Because if a bug is introduced which makes this happen, it will
otherwise be very difficult to debug.
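One possible shape for such an assertion, as a user-space sketch only. The struct and function names are hypothetical, and the condition chosen here (refuse to clear the flag while receive memory is still charged) is an assumption about what the invariant would look like, not something stated in this thread. In kernel code the failing branch might use WARN_ON() rather than a return value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket; not struct sock. */
struct sock_like {
	bool memalloc;		/* stands in for SOCK_MEMALLOC */
	unsigned int rmem;	/* bytes currently charged to the socket */
};

static void set_memalloc(struct sock_like *sk)
{
	sk->memalloc = true;
}

/* Returns false and keeps the flag set if clearing would break the invariant. */
static bool clear_memalloc(struct sock_like *sk)
{
	if (sk->rmem != 0)
		return false;	/* a kernel version might WARN_ON() here */
	sk->memalloc = false;
	return true;
}
```

Funnelling every clear through one helper gives a single place to catch a caller that drops the flag too early.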
^ permalink raw reply
* inconsistent null checking in ipx_ioctl()
From: Dan Carpenter @ 2012-05-14 20:56 UTC (permalink / raw)
To: netdev
Hi, I'm working on some new Smatch stuff and going through some warnings
in old code.
----
This is a semi-automatic email about new static checker warnings.
The patch b0d0d915d1d1: "ipx: remove the BKL" from Jan 25, 2011,
leads to the following Smatch complaint:
net/ipx/af_ipx.c:1928 ipx_ioctl()
error: we previously assumed 'sk' could be null (see line 1913)
net/ipx/af_ipx.c
1912 rc = -EINVAL;
1913 if (sk)
^^^^
Check.
1914 rc = sock_get_timestamp(sk, argp);
1915 break;
1916 case SIOCGIFDSTADDR:
1917 case SIOCSIFDSTADDR:
1918 case SIOCGIFBRDADDR:
1919 case SIOCSIFBRDADDR:
1920 case SIOCGIFNETMASK:
1921 case SIOCSIFNETMASK:
1922 rc = -EINVAL;
1923 break;
1924 default:
1925 rc = -ENOIOCTLCMD;
1926 break;
1927 }
1928 release_sock(sk);
^^^^^^^^^^^^^^^^^
The lock and release functions dereference "sk". Probably the check
can be removed. The rest of the function dereferences "sk" without
checking. A lot of this code goes back to 2.6.12.
1929
1930 return rc;
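The warning above comes down to a pointer that is tested in one branch and dereferenced unconditionally elsewhere. A stripped-down sketch of a consistent version follows, using hypothetical names rather than the IPX code: the pointer is validated once on entry, so later uses need no check. Dropping the check entirely, as suggested above, is the other consistent option if the pointer can never be NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; not struct sock. */
struct sock_like {
	int locked;
	int stamp;
};

static void lock_it(struct sock_like *sk)    { sk->locked = 1; }
static void release_it(struct sock_like *sk) { sk->locked = 0; }

/*
 * Flagged shape: "if (sk) ..." inside the body, then release_it(sk)
 * without a check. Consistent shape: validate once, then dereference.
 */
static int ioctl_like(struct sock_like *sk, int *out)
{
	int rc;

	if (!sk)
		return -1;

	lock_it(sk);
	*out = sk->stamp;
	rc = 0;
	release_it(sk);
	return rc;
}
```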
regards,
dan carpenter
^ permalink raw reply
* Re: [PATCH -next] net/codel: Add missing #include <linux/prefetch.h>
From: Eric Dumazet @ 2012-05-14 21:04 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Eric Dumazet, Dave Taht, David S. Miller, netdev, linux-kernel,
linux-next
In-Reply-To: <1337024825-20130-1-git-send-email-geert@linux-m68k.org>
On Mon, 2012-05-14 at 21:47 +0200, Geert Uytterhoeven wrote:
> m68k allmodconfig:
>
> net/sched/sch_codel.c: In function ‘dequeue’:
> net/sched/sch_codel.c:70: error: implicit declaration of function ‘prefetch’
> make[1]: *** [net/sched/sch_codel.o] Error 1
>
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> ---
I plead guilty
Acked-by: Eric Dumazet <edumazet@google.com>
Thanks
^ permalink raw reply
* Re: [PATCH] net: codel: fix build errors
From: Eric Dumazet @ 2012-05-14 21:08 UTC (permalink / raw)
To: Sasha Levin; +Cc: edumazet, dave.taht, davem, linux-kernel, netdev
In-Reply-To: <1337015687-17693-1-git-send-email-levinsasha928@gmail.com>
On Mon, 2012-05-14 at 19:14 +0200, Sasha Levin wrote:
> Fix the following build error:
>
> net/sched/sch_fq_codel.c: In function 'fq_codel_dump_stats':
> net/sched/sch_fq_codel.c:464:3: error: unknown field 'qdisc_stats' specified in initializer
> net/sched/sch_fq_codel.c:464:3: warning: missing braces around initializer
> net/sched/sch_fq_codel.c:464:3: warning: (near initialization for 'st.<anonymous>')
> net/sched/sch_fq_codel.c:465:3: error: unknown field 'qdisc_stats' specified in initializer
> net/sched/sch_fq_codel.c:465:3: warning: excess elements in struct initializer
> net/sched/sch_fq_codel.c:465:3: warning: (near initialization for 'st')
> net/sched/sch_fq_codel.c:466:3: error: unknown field 'qdisc_stats' specified in initializer
> net/sched/sch_fq_codel.c:466:3: warning: excess elements in struct initializer
> net/sched/sch_fq_codel.c:466:3: warning: (near initialization for 'st')
> net/sched/sch_fq_codel.c:467:3: error: unknown field 'qdisc_stats' specified in initializer
> net/sched/sch_fq_codel.c:467:3: warning: excess elements in struct initializer
> net/sched/sch_fq_codel.c:467:3: warning: (near initialization for 'st')
> make[1]: *** [net/sched/sch_fq_codel.o] Error 1
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
> include/linux/pkt_sched.h | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 32aef0a..c304eda 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -732,7 +732,9 @@ struct tc_fq_codel_xstats {
> union {
> struct tc_fq_codel_qd_stats qdisc_stats;
> struct tc_fq_codel_cl_stats class_stats;
> - };
> + } u;
> +#define qdisc_stats u.qdisc_stats
> +#define class_stats u.class_stats
> };
>
Anonymous unions are fine, we use them a lot in kernel.
Please fix the initializers instead in fq_codel_dump_stats(), because
these two #define are not very nice.
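One way to keep the anonymous union and still build with compilers that reject designators naming its members (the "unknown field" errors quoted above) is to initialize only the named field and assign the union members afterwards. The sketch below uses a hypothetical miniature struct, not the real tc_fq_codel_xstats, and the thread does not say whether this is the fix that was applied.

```c
#include <assert.h>

/* Hypothetical miniature of a stats struct with an anonymous union. */
enum { XSTATS_QDISC, XSTATS_CLASS };

struct qd_stats { unsigned int maxpacket; unsigned int drop_overlimit; };
struct cl_stats { int deficit; unsigned int count; };

struct xstats {
	unsigned int type;
	union {
		struct qd_stats qdisc_stats;
		struct cl_stats class_stats;
	};
};

static struct xstats make_qdisc_xstats(unsigned int maxpacket,
				       unsigned int drops)
{
	/* Only the named member gets a designator; the rest is zeroed. */
	struct xstats st = {
		.type = XSTATS_QDISC,
	};

	/* Anonymous-union members are filled by plain assignment. */
	st.qdisc_stats.maxpacket = maxpacket;
	st.qdisc_stats.drop_overlimit = drops;
	return st;
}
```

This does not depend on member order, so adding a field before the union later would not break it.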
^ permalink raw reply
* Re: [PATCH] net: codel: fix build errors
From: David Miller @ 2012-05-14 21:20 UTC (permalink / raw)
To: eric.dumazet; +Cc: levinsasha928, edumazet, dave.taht, linux-kernel, netdev
In-Reply-To: <1337029683.8512.626.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 May 2012 23:08:03 +0200
> Anonymous unions are fine, we use them a lot in kernel.
>
> Please fix the initializers instead in fq_codel_dump_stats(), because
> these two #define are not very nice.
Agreed.
^ permalink raw reply
* Question about be2net error field, rx_drops_no_pbuf
From: Marcelo Leitner @ 2012-05-14 21:25 UTC (permalink / raw)
To: netdev
Hi,
What does 'rx_drops_no_pbuf' mean in the be2net driver? I can see it is a
hardware counter for some type of error, which I would like to know more
about. What causes it?
All documentation I could find about it is a comment referring to the firmware
specification:
struct be_rxf_stats_v0 {
struct be_port_rxf_stats_v0 port[2];
u32 rx_drops_no_pbuf; /* dword 132*/
...
}
struct be_rxf_stats_v1 {
struct be_port_rxf_stats_v1 port[4];
u32 rsvd0[2];
u32 rx_drops_no_pbuf;
...
}
That, plus this:
be_get_stats64()
{
...
/* receiver fifo overrun */
/* drops_no_pbuf is no per i/f, it's per BE card */
stats->rx_fifo_errors = drvs->rxpp_fifo_overflow_drop +
drvs->rx_input_fifo_overflow_drop +
drvs->rx_drops_no_pbuf;
return stats;
}
Thanks,
Marcelo.
^ permalink raw reply
* Re: [PATCH] net: codel: fix build errors
From: Sasha Levin @ 2012-05-14 21:39 UTC (permalink / raw)
To: Eric Dumazet; +Cc: edumazet, dave.taht, davem, linux-kernel, netdev
In-Reply-To: <1337029683.8512.626.camel@edumazet-glaptop>
On Mon, May 14, 2012 at 11:08 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Anonymous unions are fine, we use them a lot in kernel.
While we use them a lot, we don't try initializing them that often.
> Please fix the initializers instead in fq_codel_dump_stats(), because
> these two #define are not very nice.
The only method I know of fixing that up is getting braces around them
in the initializer, which is hacky and will break every time a new
member is added to the struct before the anonymous union. Is there a
different solution?
^ permalink raw reply
* Re: [PATCH 1/4] netfilter: ipset: fix timeout value overflow bug
From: Jozsef Kadlecsik @ 2012-05-14 21:45 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Eric Dumazet, David Laight, netfilter-devel, davem, netdev
In-Reply-To: <20120514201043.GA15485@1984>
On Mon, 14 May 2012, Pablo Neira Ayuso wrote:
> On Mon, May 14, 2012 at 07:45:07PM +0200, Jozsef Kadlecsik wrote:
> > On Mon, 14 May 2012, Eric Dumazet wrote:
> >
> > > On Mon, 2012-05-14 at 16:36 +0200, Pablo Neira Ayuso wrote:
> > > > On Mon, May 14, 2012 at 03:19:49PM +0100, David Laight wrote:
> > > > >
> > > > > > --- a/include/linux/netfilter/ipset/ip_set_timeout.h
> > > > > > +++ b/include/linux/netfilter/ipset/ip_set_timeout.h
> > > > > > @@ -30,6 +30,10 @@ ip_set_timeout_uget(struct nlattr *tb)
> > > > > > {
> > > > > > unsigned int timeout = ip_set_get_h32(tb);
> > > > > >
> > > > > > + /* Normalize to fit into jiffies */
> > > > > > + if (timeout > UINT_MAX/1000)
> > > > > > + timeout = UINT_MAX/1000;
> > > > > > +
> > > > >
> > > > > Doesn't that rather assume that HZ is 1000 ?
> > > >
> > > > Indeed. I overlooked that. Thanks David.
> > >
> > > I dont think so.
> > >
> > > 1000 here is really MSEC_PER_SEC
> > >
> > > (All occurrences should be changed in this file)
> >
> > Yes, exactly. Pablo, shall I produce a little patch or could you change
> > 1000 to MSEC_PER_SEC?
>
> New patch attached.
>
> I have rebase my tree again.
Thanks Pablo, indeed!
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
H-1525 Budapest 114, POB. 49, Hungary
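The clamp discussed above can be sketched in userspace C. This is only an illustration, not the ipset code: clamp_timeout_secs() and timeout_to_msecs() are hypothetical names, MSEC_PER_SEC is redefined locally as 1000U (the kernel macro has the same value but is defined as 1000L), and it assumes the timeout in seconds is later multiplied by 1000 to get milliseconds. It shows why the divisor is a seconds-to-milliseconds factor and does not depend on HZ.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-in for the kernel macro (assumption: same value, 1000). */
#define MSEC_PER_SEC 1000U

/* Hypothetical helper: cap a timeout given in seconds so that the
 * later conversion to milliseconds cannot wrap an unsigned int. */
unsigned int clamp_timeout_secs(unsigned int timeout)
{
	if (timeout > UINT_MAX / MSEC_PER_SEC)
		timeout = UINT_MAX / MSEC_PER_SEC;
	return timeout;
}

/* Hypothetical helper: seconds -> milliseconds, safe after the clamp. */
unsigned int timeout_to_msecs(unsigned int timeout)
{
	return clamp_timeout_secs(timeout) * MSEC_PER_SEC;
}
```

Small values pass through unchanged, while anything above UINT_MAX / MSEC_PER_SEC is capped, so the multiplication stays within range whatever HZ is.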
^ permalink raw reply
* Re: [PATCH] net: codel: fix build errors
From: Eric Dumazet @ 2012-05-14 21:49 UTC (permalink / raw)
To: Sasha Levin; +Cc: edumazet, dave.taht, davem, linux-kernel, netdev
In-Reply-To: <CA+1xoqfWR+BA4xqqG+8SzVGYSH1z6nu_=iEFX7XoZzMhv_GvqA@mail.gmail.com>
On Mon, 2012-05-14 at 23:39 +0200, Sasha Levin wrote:
> On Mon, May 14, 2012 at 11:08 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Anonymous unions are fine, we use them a lot in kernel.
>
> While we use them a lot, we don't try initializing them that often.
>
> > Please fix the initializers instead in fq_codel_dump_stats(), because
> > these two #define are not very nice.
>
> The only method I know of fixing that up is getting braces around them
> in the initializer, which is hacky and will break every time a new
> member is added to the struct before the anonymous union. Is there a
> different solution?
Instead of:
struct foo x = {
	.field = value,
	.sub.f2 = xxx,
	...
};
use:
struct foo x = {
	.field = value,
};
x.sub.f2 = xxx;
...
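A compilable sketch of the pattern above, assuming the build error comes from a compiler that rejects designated initializers naming members of an anonymous union. The struct layout, make_foo() and check_make_foo() are hypothetical and only mirror the foo/sub/f2 names used in this mail; this is not the fq_codel_dump_stats() code.

```c
#include <assert.h>

/* Hypothetical struct with an anonymous union, mirroring the names above. */
struct foo {
	int field;
	union {			/* anonymous: members are reached as x.sub, x.raw */
		struct {
			int f2;
		} sub;
		long raw;
	};
};

/* Initialize only the named top-level member in the initializer,
 * then assign the anonymous-union member with a plain statement. */
struct foo make_foo(int value, int xxx)
{
	struct foo x = {
		.field = value,
	};

	x.sub.f2 = xxx;
	return x;
}

/* Small self-check of the two-step initialization. */
void check_make_foo(void)
{
	struct foo x = make_foo(1, 2);

	assert(x.field == 1);
	assert(x.sub.f2 == 2);
}
```

Because the union member is assigned after the initializer, nothing depends on member order, so adding a field before the union does not break it.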