* Re: [PATCH -next] netdev: inet_timewait_sock.h missing semi-colon when KMEMCHECK is enabled
From: Joe Perches @ 2013-10-14 19:53 UTC (permalink / raw)
To: Randy Dunlap
Cc: Thierry Reding, linux-next, linux-kernel, Mark Brown,
netdev@vger.kernel.org, David Miller
In-Reply-To: <525C47C0.2000907@infradead.org>
On Mon, 2013-10-14 at 12:36 -0700, Randy Dunlap wrote:
> From: Randy Dunlap <rdunlap@infradead.org>
>
> Fix (a few hundred) build errors due to missing semi-colon when
> KMEMCHECK is enabled:
>
> include/net/inet_timewait_sock.h:139:2: error: expected ',', ';' or '}' before 'int'
> include/net/inet_timewait_sock.h:148:28: error: 'const struct inet_timewait_sock' has no member named 'tw_death_node'
>
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> ---
> include/net/inet_timewait_sock.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- next-2013-1014.orig/include/net/inet_timewait_sock.h
> +++ next-2013-1014/include/net/inet_timewait_sock.h
> @@ -135,7 +135,7 @@ struct inet_timewait_sock {
> tw_transparent : 1,
> tw_pad : 6, /* 6 bits hole */
> tw_tos : 8,
> - tw_pad2 : 16 /* 16 bits hole */
> + tw_pad2 : 16; /* 16 bits hole */
> kmemcheck_bitfield_end(flags);
> u32 tw_ttd;
> struct inet_bind_bucket *tw_tb;
Shouldn't this be done in kmemcheck.h?
include/linux/kmemcheck.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/kmemcheck.h b/include/linux/kmemcheck.h
index 39f8453..b9ffad5 100644
--- a/include/linux/kmemcheck.h
+++ b/include/linux/kmemcheck.h
@@ -62,10 +62,10 @@ bool kmemcheck_is_obj_initialized(unsigned long addr, size_t size);
* kmemcheck_annotate_bitfield(a, flags);
*/
#define kmemcheck_bitfield_begin(name) \
- int name##_begin[0];
+ int name##_begin[0]
#define kmemcheck_bitfield_end(name) \
- int name##_end[0];
+ int name##_end[0]
#define kmemcheck_annotate_bitfield(ptr, name) \
do { \
^ permalink raw reply related
* Re: [PATCH 01/07] 8139too: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 19:52 UTC (permalink / raw)
To: netdev
In-Reply-To: <1381777891.2045.1.camel@edumazet-glaptop.roam.corp.google.com>
* Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2013-10-14 at 20:26 +0200, Tino Reichardt wrote:
> > Changes to 8139too to use byte queue limits.
> >
> > @@ -1733,6 +1735,7 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
> > tp->tx_flag | max(len, (unsigned int)ETH_ZLEN));
> >
> > tp->cur_tx++;
> > + netdev_sent_queue(dev, len);
> >
>
> This looks wrong if len < ETH_ZLEN
Fixed this (really stupid) issue, updated patch is located here:
http://www.mcmilk.de/projects/linux-bql/dl/0001-8139too-Support-for-byte-queue-limits.patch
The netif_dbg() statement will now also print the correct queued length.
--
Best regards, TR
^ permalink raw reply
* Re: [PATCH] staging: octeon-ethernet: trivial: Avoid OOPS if phydev is not set
From: Aaro Koskinen @ 2013-10-14 19:49 UTC (permalink / raw)
To: Dan Carpenter
Cc: support, David Daney, Greg KH, driverdev-devel,
Sebastian Pöhn, netdev
In-Reply-To: <20131014191541.GA11797@mwanda>
On Mon, Oct 14, 2013 at 10:16:49PM +0300, Dan Carpenter wrote:
> On Mon, Oct 14, 2013 at 09:39:06PM +0300, Aaro Koskinen wrote:
> > It's initialized in cvm_oct_phy_setup_device():
> >
> > priv->phydev = of_phy_connect(dev, phy_node, cvm_oct_adjust_link, 0,
> ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
>
> Sorry I should have explained better.
>
> We use cvm_oct_adjust_link() to initialize priv->phydev but
> cvm_oct_adjust_link() depends on priv->phydev. It seems like we would
> hit the NULL dereference every time. Weird huh?
It doesn't happen on my system (EdgeRouter Lite). I think you need to
explain even more better. :-)
What do you mean by "We use cvm_oct_adjust_link() to initialize
priv->phydev..."? Sorry, maybe I'm just missing something really
obvious...
A.
^ permalink raw reply
* [PATCH -next] netdev: inet_timewait_sock.h missing semi-colon when KMEMCHECK is enabled
From: Randy Dunlap @ 2013-10-14 19:36 UTC (permalink / raw)
To: Thierry Reding, linux-next, linux-kernel
Cc: Mark Brown, netdev@vger.kernel.org, David Miller
In-Reply-To: <1381762088-18880-1-git-send-email-treding@nvidia.com>
From: Randy Dunlap <rdunlap@infradead.org>
Fix (a few hundred) build errors due to missing semi-colon when
KMEMCHECK is enabled:
include/net/inet_timewait_sock.h:139:2: error: expected ',', ';' or '}' before 'int'
include/net/inet_timewait_sock.h:148:28: error: 'const struct inet_timewait_sock' has no member named 'tw_death_node'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
---
include/net/inet_timewait_sock.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- next-2013-1014.orig/include/net/inet_timewait_sock.h
+++ next-2013-1014/include/net/inet_timewait_sock.h
@@ -135,7 +135,7 @@ struct inet_timewait_sock {
tw_transparent : 1,
tw_pad : 6, /* 6 bits hole */
tw_tos : 8,
- tw_pad2 : 16 /* 16 bits hole */
+ tw_pad2 : 16; /* 16 bits hole */
kmemcheck_bitfield_end(flags);
u32 tw_ttd;
struct inet_bind_bucket *tw_tb;
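
A minimal sketch of why the missing semicolon only shows up with KMEMCHECK enabled. It assumes, as the removed lines in the kmemcheck.h diff elsewhere in this thread suggest, that the enabled macros expand to a zero-length array declaration with its own trailing ';', and that the disabled variants expand to nothing. All demo_* names and the DEMO_KMEMCHECK switch are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for CONFIG_KMEMCHECK. */
#define DEMO_KMEMCHECK 1

#if DEMO_KMEMCHECK
/* Enabled: each marker becomes a zero-length array declaration (GNU C). */
#define demo_bitfield_begin(name)	int name##_begin[0];
#define demo_bitfield_end(name)		int name##_end[0];
#else
/* Disabled (assumption): the markers expand to nothing. */
#define demo_bitfield_begin(name)
#define demo_bitfield_end(name)
#endif

struct demo_timewait {
	demo_bitfield_begin(flags);
	unsigned int	ipv6only    : 1,
			transparent : 1,
			pad         : 6,
			tos         : 8,
			/*
			 * With the markers enabled, omitting this ';' is a
			 * syntax error. With them disabled, the ';' written
			 * after demo_bitfield_end() hides the omission.
			 */
			pad2        : 16;
	demo_bitfield_end(flags);
	unsigned int	ttd;
};

#if DEMO_KMEMCHECK
/* Size in bytes of the region enclosed by the two markers. */
static size_t demo_flags_span(void)
{
	size_t begin = offsetof(struct demo_timewait, flags_begin);
	size_t end = offsetof(struct demo_timewait, flags_end);

	assert(end >= begin);
	return end - begin;
}
#endif
```

Setting DEMO_KMEMCHECK to 0 and deleting the ';' after pad2 still compiles, which would explain why the error was not seen in ordinary builds.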
^ permalink raw reply
* Re: [PATCH 01/07] 8139too: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 19:31 UTC (permalink / raw)
To: netdev
In-Reply-To: <1381777891.2045.1.camel@edumazet-glaptop.roam.corp.google.com>
* Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2013-10-14 at 20:26 +0200, Tino Reichardt wrote:
> > Changes to 8139too to use byte queue limits.
> >
> > This patch has not been tested on real hardware yet, but it compiles fine and
> > should work.
> >
> > tp->cur_tx = 0;
> > tp->dirty_tx = 0;
> > + netdev_reset_queue(tp->dev);
> >
> > /* XXX account for unsent Tx packets in tp->stats.tx_dropped */
> > }
> > @@ -1733,6 +1735,7 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
> > tp->tx_flag | max(len, (unsigned int)ETH_ZLEN));
> >
> > tp->cur_tx++;
> > + netdev_sent_queue(dev, len);
> >
>
> This looks wrong if len < ETH_ZLEN
Yes, you are right. I looked at the debug statement at the end of that
function and assumed len had to be correct ... but it isn't ... yes :(
netif_dbg(tp, tx_queued, dev, "Queued Tx packet size %u to slot %d\n"...
Should I set len = max(len, ETH_ZLEN) there, so that the debug statement
is also correct?
--
Best regards, TR
^ permalink raw reply
* Re: [PATCH] staging: octeon-ethernet: trivial: Avoid OOPS if phydev is not set
From: Dan Carpenter @ 2013-10-14 19:16 UTC (permalink / raw)
To: Aaro Koskinen
Cc: support, David Daney, Greg KH, driverdev-devel,
Sebastian Pöhn, netdev
In-Reply-To: <20131014183906.GA4260@blackmetal.musicnaut.iki.fi>
On Mon, Oct 14, 2013 at 09:39:06PM +0300, Aaro Koskinen wrote:
> Hi,
>
> On Mon, Oct 14, 2013 at 01:10:51PM +0300, Dan Carpenter wrote:
> > On Sun, Oct 13, 2013 at 02:28:10PM -0700, Greg KH wrote:
> > > On Sun, Oct 13, 2013 at 08:59:54PM +0200, Sebastian Pöhn wrote:
> > > > A zero pointer deref on priv->phydev->link was causing oops on our systems.
> > > > Might be related to improper configuration but we should fail gracefully here ;-)
> > > >
> > > > Signed-off-by: Sebastian Poehn <sebastian.poehn@googlemail.com>
> > > >
> > > > ---
> > > >
> > > > diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c
> > > > index 83b1030..bc8c503 100644
> > > > --- a/drivers/staging/octeon/ethernet-mdio.c
> > > > +++ b/drivers/staging/octeon/ethernet-mdio.c
> > > > @@ -121,6 +121,9 @@ static void cvm_oct_adjust_link(struct net_device *dev)
> > > > struct octeon_ethernet *priv = netdev_priv(dev);
> > > > cvmx_helper_link_info_t link_info;
> > > >
> > > > + if(!priv->phydev)
> > > > + return ;
> > >
> > > Please always run your patches through the scripts/checkpatch.pl tool so
> > > that maintainers don't have to point out the obvious coding style errors
> > > by hand each time :)
> >
> > Also it's whitespace damaged and doesn't apply.
> >
> > >
> > > Care to try again?
> > >
> > > Also, how was phydev NULL? What was causing that?
> >
> > To me it looks like phydev is always NULL.
>
> It's initialized in cvm_oct_phy_setup_device():
>
> priv->phydev = of_phy_connect(dev, phy_node, cvm_oct_adjust_link, 0,
^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
Sorry I should have explained better.
We use cvm_oct_adjust_link() to initialize priv->phydev but
cvm_oct_adjust_link() depends on priv->phydev. It seems like we would
hit the NULL dereference every time. Weird huh?
regards,
dan carpenter
^ permalink raw reply
* Re: [PATCH 01/07] 8139too: Support for byte queue limits
From: Eric Dumazet @ 2013-10-14 19:11 UTC (permalink / raw)
To: Tino Reichardt
Cc: netdev, David S. Miller, Joe Perches, Jiri Pirko, Bill Pemberton,
Greg Kroah-Hartman
In-Reply-To: <1381775183-24866-2-git-send-email-milky-kernel@mcmilk.de>
On Mon, 2013-10-14 at 20:26 +0200, Tino Reichardt wrote:
> Changes to 8139too to use byte queue limits.
>
> This patch has not been tested on real hardware yet, but it compiles fine and
> should work.
>
>
> Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
>
> ---
> drivers/net/ethernet/realtek/8139too.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
> index 3ccedeb..992ac57 100644
> --- a/drivers/net/ethernet/realtek/8139too.c
> +++ b/drivers/net/ethernet/realtek/8139too.c
> @@ -1409,6 +1409,7 @@ static void rtl8139_hw_start (struct net_device *dev)
> }
>
> netdev_dbg(dev, "init buffer addresses\n");
> + netdev_reset_queue(dev);
>
> /* Lock Config[01234] and BMCR register writes */
> RTL_W8 (Cfg9346, Cfg9346_Lock);
> @@ -1638,6 +1639,7 @@ static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
> {
> tp->cur_tx = 0;
> tp->dirty_tx = 0;
> + netdev_reset_queue(tp->dev);
>
> /* XXX account for unsent Tx packets in tp->stats.tx_dropped */
> }
> @@ -1733,6 +1735,7 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
> tp->tx_flag | max(len, (unsigned int)ETH_ZLEN));
>
> tp->cur_tx++;
> + netdev_sent_queue(dev, len);
>
This looks wrong if len < ETH_ZLEN
> if ((tp->cur_tx - NUM_TX_DESC) == tp->dirty_tx)
> netif_stop_queue (dev);
> @@ -1750,6 +1753,7 @@ static void rtl8139_tx_interrupt (struct net_device *dev,
> void __iomem *ioaddr)
> {
> unsigned long dirty_tx, tx_left;
> + unsigned bytes_compl = 0, pkts_compl = 0;
>
> assert (dev != NULL);
> assert (ioaddr != NULL);
> @@ -1792,6 +1796,8 @@ static void rtl8139_tx_interrupt (struct net_device *dev,
> u64_stats_update_begin(&tp->tx_stats.syncp);
> tp->tx_stats.packets++;
> tp->tx_stats.bytes += txstatus & 0x7ff;
> + pkts_compl++;
> + bytes_compl += txstatus & 0x7ff;
Because here len reported by NIC will be >= ETH_ZLEN
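
A minimal sketch of the mismatch pointed out above, assuming a toy byte counter in place of the kernel's real BQL state. The demo_* names are hypothetical. The value 60 for the minimum frame length matches ETH_ZLEN. As far as I know, the kernel's dynamic queue limit code treats completing more bytes than were queued as a bug, which the -1 return models here.

```c
#include <assert.h>

#define DEMO_ETH_ZLEN 60u	/* minimum frame length short packets are padded to */

/* Toy stand-in for the BQL byte counters; not the kernel's struct dql. */
struct demo_bql {
	unsigned int queued;	/* bytes reported at xmit time */
	unsigned int completed;	/* bytes reported at Tx-done time */
};

/* Length the NIC actually transmits, and later reports back, for a packet. */
static unsigned int demo_padded_len(unsigned int len)
{
	return len < DEMO_ETH_ZLEN ? DEMO_ETH_ZLEN : len;
}

/* Models the accounting done on the transmit path. */
static void demo_sent(struct demo_bql *q, unsigned int bytes)
{
	q->queued += bytes;
}

/* Returns 0 on success, -1 if more bytes complete than are outstanding. */
static int demo_completed(struct demo_bql *q, unsigned int bytes)
{
	assert(q->completed <= q->queued);
	if (bytes > q->queued - q->completed)
		return -1;
	q->completed += bytes;
	return 0;
}
```

Queueing a 42-byte packet with demo_sent(&q, 42) and then completing demo_padded_len(42) bytes returns -1. Passing the padded length to both calls keeps the two sides balanced.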
^ permalink raw reply
* Re: [PATCHv2] staging: octeon-ethernet: trivial: Avoid OOPS if phydev is not set
From: Aaro Koskinen @ 2013-10-14 19:06 UTC (permalink / raw)
To: Sebastian Pöhn
Cc: driverdev-devel@linuxdriverproject.org, netdev@vger.kernel.org,
dan.carpenter
In-Reply-To: <1381773535.2049.4.camel@alpha.Speedport_W723_V_Typ_A_1_00_098>
On Mon, Oct 14, 2013 at 07:58:55PM +0200, Sebastian Pöhn wrote:
> Sorry. Haven't signed off for a while now :(
>
> I bet that this is really an issue of incorrect OF information. If I find out more I'll let you know.
>
> @dan: The code works for some interfaces - so phydev is set correctly in some cases.
Kernel git log is not a chat forum; please write a proper changelog,
if possible include the original oops log.
Thanks,
A.
> Signed-off-by: Sebastian Poehn <sebastian.poehn@googlemail.com>
> ---
> diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c
> index 83b1030..bc8c503 100644
> --- a/drivers/staging/octeon/ethernet-mdio.c
> +++ b/drivers/staging/octeon/ethernet-mdio.c
> @@ -121,6 +121,9 @@ static void cvm_oct_adjust_link(struct net_device *dev)
> struct octeon_ethernet *priv = netdev_priv(dev);
> cvmx_helper_link_info_t link_info;
>
> + if (!priv->phydev)
> + return;
> +
> if (priv->last_link != priv->phydev->link) {
> priv->last_link = priv->phydev->link;
> link_info.u64 = 0;
>
^ permalink raw reply
* Re: [PATCHSET v1 00/07] Support for byte queue limits on various network interfaces
From: Tino Reichardt @ 2013-10-14 18:46 UTC (permalink / raw)
To: netdev
In-Reply-To: <20131014114117.5808e665@nehalam.linuxnetplumber.net>
* Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Mon, 14 Oct 2013 20:26:16 +0200
> Tino Reichardt <milky-kernel@mcmilk.de> wrote:
>
> > Hello,
> >
> > this patchset adds support for byte queue limits for various network drivers.
> >
> > These drivers are used as WAN interface on servers that I am managing. So
> > it would be nice, if support for BQL / codel for these drivers will make it
> > into the mainline... @ some time ;)
> >
> >
> > Any comments are welcome, thanks.
> > Tino Reichardt
>
> How many of these have been tested on real devices?
> BQL has a nasty way of exposing bugs.
Currently the only one tested is the Realtek r8169, for about two weeks.
All other hardware (except via-velocity) will be tested by me in the near
future. But maybe others will also test these patches, so I posted them to
the list.
--
Best regards, TR
^ permalink raw reply
* Re: [PATCHSET v1 00/07] Support for byte queue limits on various network interfaces
From: Stephen Hemminger @ 2013-10-14 18:41 UTC (permalink / raw)
To: Tino Reichardt; +Cc: netdev, David S. Miller
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
On Mon, 14 Oct 2013 20:26:16 +0200
Tino Reichardt <milky-kernel@mcmilk.de> wrote:
> Hello,
>
> this patchset adds support for byte queue limits for various network drivers.
>
> These drivers are used as WAN interface on servers that I am managing. So
> it would be nice, if support for BQL / codel for these drivers will make it
> into the mainline... @ some time ;)
>
>
> Any comments are welcome, thanks.
> Tino Reichardt
How many of these have been tested on real devices?
BQL has a nasty way of exposing bugs.
^ permalink raw reply
* Re: [PATCH] staging: octeon-ethernet: trivial: Avoid OOPS if phydev is not set
From: Aaro Koskinen @ 2013-10-14 18:39 UTC (permalink / raw)
To: Dan Carpenter
Cc: Greg KH, support, netdev, driverdev-devel, Sebastian Pöhn,
David Daney
In-Reply-To: <20131014101051.GH6192@mwanda>
Hi,
On Mon, Oct 14, 2013 at 01:10:51PM +0300, Dan Carpenter wrote:
> On Sun, Oct 13, 2013 at 02:28:10PM -0700, Greg KH wrote:
> > On Sun, Oct 13, 2013 at 08:59:54PM +0200, Sebastian Pöhn wrote:
> > > A zero pointer deref on priv->phydev->link was causing oops on our systems.
> > > Might be related to improper configuration but we should fail gracefully here ;-)
> > >
> > > Signed-off-by: Sebastian Poehn <sebastian.poehn@googlemail.com>
> > >
> > > ---
> > >
> > > diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c
> > > index 83b1030..bc8c503 100644
> > > --- a/drivers/staging/octeon/ethernet-mdio.c
> > > +++ b/drivers/staging/octeon/ethernet-mdio.c
> > > @@ -121,6 +121,9 @@ static void cvm_oct_adjust_link(struct net_device *dev)
> > > struct octeon_ethernet *priv = netdev_priv(dev);
> > > cvmx_helper_link_info_t link_info;
> > >
> > > + if(!priv->phydev)
> > > + return ;
> >
> > Please always run your patches through the scripts/checkpatch.pl tool so
> > that maintainers don't have to point out the obvious coding style errors
> > by hand each time :)
>
> Also it's whitespace damaged and doesn't apply.
>
> >
> > Care to try again?
> >
> > Also, how was phydev NULL? What was causing that?
>
> To me it looks like phydev is always NULL.
It's initialized in cvm_oct_phy_setup_device():
priv->phydev = of_phy_connect(dev, phy_node, cvm_oct_adjust_link, 0,
PHY_INTERFACE_MODE_GMII);
So maybe there is a chance that cvm_oct_adjust_link() callback gets called
already before the function returns? Getting a copy of the original OOPS
report/crash dump could help to confirm this.
A.
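
A minimal sketch of the ordering suspected above, under the assumption (not confirmed in this thread) that the connect routine can invoke the link callback before it returns. The demo_* types and functions are hypothetical models, not the octeon driver or the PHY library.

```c
#include <assert.h>
#include <stddef.h>

struct demo_phy {
	int link;
};

struct demo_priv {
	struct demo_phy *phydev;	/* NULL until connect has returned */
	int last_link;
	int early_calls;		/* times the callback saw phydev == NULL */
};

typedef void (*demo_link_cb)(struct demo_priv *);

/* Callback carrying the NULL guard from the patch under discussion. */
static void demo_adjust_link(struct demo_priv *priv)
{
	if (!priv->phydev) {
		priv->early_calls++;
		return;
	}
	if (priv->last_link != priv->phydev->link)
		priv->last_link = priv->phydev->link;
}

/*
 * Hypothetical connect routine that fires the callback before returning,
 * i.e. before the caller has stored the result in priv->phydev.
 */
static struct demo_phy *demo_connect(struct demo_priv *priv, demo_link_cb cb)
{
	static struct demo_phy phy = { .link = 1 };

	assert(cb != NULL);
	cb(priv);
	return &phy;
}
```

With `priv.phydev = demo_connect(&priv, demo_adjust_link);` the callback runs once while phydev is still NULL. Without the guard, that first call would dereference a NULL pointer.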
^ permalink raw reply
* [PATCH 05/07] via-velocity: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Francois Romieu
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to via-velocity to use byte queue limits.
I can't test this patch on real hardware :(
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/via/via-velocity.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c
index d022bf9..037f304 100644
--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -1751,6 +1751,8 @@ static void velocity_free_tx_buf(struct velocity_info *vptr,
le16_to_cpu(pktlen), DMA_TO_DEVICE);
}
}
+
+ netdev_reset_queue(vptr->netdev);
dev_kfree_skb_irq(skb);
tdinfo->skb = NULL;
}
@@ -1915,6 +1917,7 @@ static int velocity_tx_srv(struct velocity_info *vptr)
int works = 0;
struct velocity_td_info *tdinfo;
struct net_device_stats *stats = &vptr->netdev->stats;
+ unsigned int pkts_compl = 0, bytes_compl = 0;
for (qnum = 0; qnum < vptr->tx.numq; qnum++) {
for (idx = vptr->tx.tail[qnum]; vptr->tx.used[qnum] > 0;
@@ -1946,6 +1949,8 @@ static int velocity_tx_srv(struct velocity_info *vptr)
} else {
stats->tx_packets++;
stats->tx_bytes += tdinfo->skb->len;
+ pkts_compl++;
+ bytes_compl += tdinfo->skb->len;
}
velocity_free_tx_buf(vptr, tdinfo, td);
vptr->tx.used[qnum]--;
@@ -1955,6 +1960,9 @@ static int velocity_tx_srv(struct velocity_info *vptr)
if (AVAIL_TD(vptr, qnum) < 1)
full = 1;
}
+
+ netdev_completed_queue(vptr->netdev, pkts_compl, bytes_compl);
+
/*
* Look to see if we should kick the transmit network
* layer for more work.
@@ -2641,6 +2649,7 @@ static netdev_tx_t velocity_xmit(struct sk_buff *skb,
td_ptr->td_buf[0].size |= TD_QUEUE;
mac_tx_queue_wake(vptr->mac_regs, qnum);
+ netdev_sent_queue(vptr->netdev, skb->len);
spin_unlock_irqrestore(&vptr->lock, flags);
out:
return NETDEV_TX_OK;
--
1.8.4
^ permalink raw reply related
* [PATCH 06/07] 3c59x: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Steffen Klassert
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to 3c59x to use byte queue limits.
The checkpatch.pl script will raise this formatting error:
"WARNING: line over 80 characters" - but I don't want to change the whole
formatting of this driver ;)
This patch has not been tested on real hardware yet, but it compiles fine and
should work.
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/3com/3c59x.c | 37 +++++++++++++++++++++++++++++++------
1 file changed, 31 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index ad5272b..fd03165 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -1726,6 +1726,8 @@ vortex_up(struct net_device *dev)
iowrite16(vp->intr_enable, ioaddr + EL3_CMD);
if (vp->cb_fn_base) /* The PCMCIA people are idiots. */
iowrite32(0x8000, vp->cb_fn_base + 4);
+
+ netdev_reset_queue(dev);
netif_start_queue (dev);
err_out:
return err;
@@ -2080,10 +2082,13 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
spin_unlock_irq(&vp->window_lock);
vp->tx_skb = skb;
iowrite16(StartDMADown, ioaddr + EL3_CMD);
+ netdev_sent_queue(dev, len);
/* netif_wake_queue() will be called at the DMADone interrupt. */
} else {
/* ... and the packet rounded to a doubleword. */
- iowrite32_rep(ioaddr + TX_FIFO, skb->data, (skb->len + 3) >> 2);
+ int len = (skb->len + 3) >> 2;
+ iowrite32_rep(ioaddr + TX_FIFO, skb->data, len);
+ netdev_sent_queue(dev, len);
dev_kfree_skb (skb);
if (ioread16(ioaddr + TxFree) > 1536) {
netif_start_queue (dev); /* AKPM: redundant? */
@@ -2094,7 +2099,6 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
}
}
-
/* Clear the Tx status stack. */
{
int tx_status;
@@ -2164,12 +2168,14 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev)
vp->tx_ring[entry].frag[0].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data,
skb->len, PCI_DMA_TODEVICE));
vp->tx_ring[entry].frag[0].length = cpu_to_le32(skb->len | LAST_FRAG);
+ netdev_sent_queue(dev, skb->len);
} else {
- int i;
+ int i, len;
vp->tx_ring[entry].frag[0].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data,
skb_headlen(skb), PCI_DMA_TODEVICE));
vp->tx_ring[entry].frag[0].length = cpu_to_le32(skb_headlen(skb));
+ len = skb_headlen(skb);
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
@@ -2180,16 +2186,21 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev)
(void *)skb_frag_address(frag),
skb_frag_size(frag), PCI_DMA_TODEVICE));
- if (i == skb_shinfo(skb)->nr_frags-1)
+ if (i == skb_shinfo(skb)->nr_frags - 1) {
vp->tx_ring[entry].frag[i+1].length = cpu_to_le32(skb_frag_size(frag)|LAST_FRAG);
- else
+ len += skb_frag_size(frag) | LAST_FRAG;
+ } else {
vp->tx_ring[entry].frag[i+1].length = cpu_to_le32(skb_frag_size(frag));
+ len += skb_frag_size(frag);
+ }
}
+ netdev_sent_queue(dev, len);
}
#else
vp->tx_ring[entry].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, skb->len, PCI_DMA_TODEVICE));
vp->tx_ring[entry].length = cpu_to_le32(skb->len | LAST_FRAG);
vp->tx_ring[entry].status = cpu_to_le32(skb->len | TxIntrUploaded);
+ netdev_sent_queue(dev, skb->len | LAST_FRAG);
#endif
spin_lock_irqsave(&vp->lock, flags);
@@ -2234,6 +2245,7 @@ vortex_interrupt(int irq, void *dev_id)
int status;
int work_done = max_interrupt_work;
int handled = 0;
+ unsigned bytes_compl = 0, pkts_compl = 0;
ioaddr = vp->ioaddr;
spin_lock(&vp->lock);
@@ -2279,8 +2291,12 @@ vortex_interrupt(int irq, void *dev_id)
if (status & DMADone) {
if (ioread16(ioaddr + Wn7_MasterStatus) & 0x1000) {
+ int len;
iowrite16(0x1000, ioaddr + Wn7_MasterStatus); /* Ack the event. */
- pci_unmap_single(VORTEX_PCI(vp), vp->tx_skb_dma, (vp->tx_skb->len + 3) & ~3, PCI_DMA_TODEVICE);
+ len = (vp->tx_skb->len + 3) & ~3;
+ pci_unmap_single(VORTEX_PCI(vp), vp->tx_skb_dma, len, PCI_DMA_TODEVICE);
+ bytes_compl += len;
+ pkts_compl++;
dev_kfree_skb_irq(vp->tx_skb); /* Release the transferred buffer */
if (ioread16(ioaddr + TxFree) > 1536) {
/*
@@ -2327,6 +2343,8 @@ vortex_interrupt(int irq, void *dev_id)
spin_unlock(&vp->window_lock);
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
if (vortex_debug > 4)
pr_debug("%s: exiting interrupt, status %4.4x.\n",
dev->name, status);
@@ -2348,6 +2366,7 @@ boomerang_interrupt(int irq, void *dev_id)
void __iomem *ioaddr;
int status;
int work_done = max_interrupt_work;
+ unsigned bytes_compl = 0, pkts_compl = 0;
ioaddr = vp->ioaddr;
@@ -2420,6 +2439,8 @@ boomerang_interrupt(int irq, void *dev_id)
pci_unmap_single(VORTEX_PCI(vp),
le32_to_cpu(vp->tx_ring[entry].addr), skb->len, PCI_DMA_TODEVICE);
#endif
+ bytes_compl += skb->len;
+ pkts_compl++;
dev_kfree_skb_irq(skb);
vp->tx_skbuff[entry] = NULL;
} else {
@@ -2467,6 +2488,9 @@ boomerang_interrupt(int irq, void *dev_id)
handler_exit:
vp->handling_irq = 0;
spin_unlock(&vp->lock);
+
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
return IRQ_HANDLED;
}
@@ -2660,6 +2684,7 @@ vortex_down(struct net_device *dev, int final_down)
struct vortex_private *vp = netdev_priv(dev);
void __iomem *ioaddr = vp->ioaddr;
+ netdev_reset_queue(dev);
netif_stop_queue (dev);
del_timer_sync(&vp->rx_oom_timer);
--
1.8.4
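
A small worked example of the length arithmetic used in the patch above, since BQL counts bytes. It shows that `(len + 3) >> 2` yields a count of 32-bit words, that `(len + 3) & ~3` yields bytes rounded up to a multiple of four, and that OR-ing a flag bit into a length changes the number. The demo_* names are hypothetical, and the flag value is an assumption about the driver's LAST_FRAG definition. Whether the values passed to netdev_sent_queue() match those counted on completion is worth double-checking against the driver.

```c
#include <assert.h>

/* Assumed value of the driver's LAST_FRAG descriptor flag (top bit). */
#define DEMO_LAST_FRAG 0x80000000u

/* Number of 32-bit words needed to hold len bytes. */
static unsigned int demo_len_in_dwords(unsigned int len)
{
	assert(len <= 0xFFFFFFFFu - 3);	/* keep len + 3 from wrapping */
	return (len + 3) >> 2;
}

/* len rounded up to a multiple of four, still expressed in bytes. */
static unsigned int demo_len_rounded_bytes(unsigned int len)
{
	assert(len <= 0xFFFFFFFFu - 3);
	return (len + 3) & ~3u;
}

/* Strip the flag bit to recover the byte count from a descriptor length word. */
static unsigned int demo_desc_bytes(unsigned int desc_len)
{
	return desc_len & ~DEMO_LAST_FRAG;
}
```

For a 61-byte packet the word count is 16 while the rounded byte count is 64, so the two are not interchangeable as BQL byte totals.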
^ permalink raw reply related
* [PATCH 01/07] 8139too: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, David S. Miller, Joe Perches, Jiri Pirko, Bill Pemberton,
Greg Kroah-Hartman
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to 8139too to use byte queue limits.
This patch has not been tested on real hardware yet, but it compiles fine and
should work.
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/realtek/8139too.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 3ccedeb..992ac57 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -1409,6 +1409,7 @@ static void rtl8139_hw_start (struct net_device *dev)
}
netdev_dbg(dev, "init buffer addresses\n");
+ netdev_reset_queue(dev);
/* Lock Config[01234] and BMCR register writes */
RTL_W8 (Cfg9346, Cfg9346_Lock);
@@ -1638,6 +1639,7 @@ static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
{
tp->cur_tx = 0;
tp->dirty_tx = 0;
+ netdev_reset_queue(tp->dev);
/* XXX account for unsent Tx packets in tp->stats.tx_dropped */
}
@@ -1733,6 +1735,7 @@ static netdev_tx_t rtl8139_start_xmit (struct sk_buff *skb,
tp->tx_flag | max(len, (unsigned int)ETH_ZLEN));
tp->cur_tx++;
+ netdev_sent_queue(dev, len);
if ((tp->cur_tx - NUM_TX_DESC) == tp->dirty_tx)
netif_stop_queue (dev);
@@ -1750,6 +1753,7 @@ static void rtl8139_tx_interrupt (struct net_device *dev,
void __iomem *ioaddr)
{
unsigned long dirty_tx, tx_left;
+ unsigned bytes_compl = 0, pkts_compl = 0;
assert (dev != NULL);
assert (ioaddr != NULL);
@@ -1792,6 +1796,8 @@ static void rtl8139_tx_interrupt (struct net_device *dev,
u64_stats_update_begin(&tp->tx_stats.syncp);
tp->tx_stats.packets++;
tp->tx_stats.bytes += txstatus & 0x7ff;
+ pkts_compl++;
+ bytes_compl += txstatus & 0x7ff;
u64_stats_update_end(&tp->tx_stats.syncp);
}
@@ -1807,6 +1813,8 @@ static void rtl8139_tx_interrupt (struct net_device *dev,
}
#endif /* RTL8139_NDEBUG */
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
/* only wake the queue if we did work, and the queue is stopped */
if (tp->dirty_tx != dirty_tx) {
tp->dirty_tx = dirty_tx;
--
1.8.4
^ permalink raw reply related
* [PATCH 03/03] tulip: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Grant Grundler
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to tulip to use byte queue limits.
Nearly the same patch which was already sent by George Spelvin to the netdev
list, see here: http://thread.gmane.org/gmane.linux.network/276166
Maybe George could re-test it and give it an ACK/NACK?
This patch _was not_ tested on real hardware by me. But I have such a card in
an ADSL Linux router, which may be updated at some point, and then I would
like to use codel with it :)
Original-Patch-By: George Spelvin <linux@horizon.com>
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/dec/tulip/interrupt.c | 3 +++
drivers/net/ethernet/dec/tulip/tulip_core.c | 2 ++
2 files changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/dec/tulip/interrupt.c b/drivers/net/ethernet/dec/tulip/interrupt.c
index 92306b3..d74426e 100644
--- a/drivers/net/ethernet/dec/tulip/interrupt.c
+++ b/drivers/net/ethernet/dec/tulip/interrupt.c
@@ -532,6 +532,7 @@ irqreturn_t tulip_interrupt(int irq, void *dev_instance)
#endif
unsigned int work_count = tulip_max_interrupt_work;
unsigned int handled = 0;
+ unsigned int bytes_compl = 0;
/* Let's see whether the interrupt really is for us */
csr5 = ioread32(ioaddr + CSR5);
@@ -634,6 +635,7 @@ irqreturn_t tulip_interrupt(int irq, void *dev_instance)
PCI_DMA_TODEVICE);
/* Free the original skb. */
+ bytes_compl += tp->tx_buffers[entry].skb->len;
dev_kfree_skb_irq(tp->tx_buffers[entry].skb);
tp->tx_buffers[entry].skb = NULL;
tp->tx_buffers[entry].mapping = 0;
@@ -802,6 +804,7 @@ irqreturn_t tulip_interrupt(int irq, void *dev_instance)
}
#endif /* CONFIG_TULIP_NAPI */
+ netdev_completed_queue(dev, tx, bytes_compl);
if ((missed = ioread32(ioaddr + CSR8) & 0x1ffff)) {
dev->stats.rx_dropped += missed & 0x10000 ? 0x10000 : missed;
}
diff --git a/drivers/net/ethernet/dec/tulip/tulip_core.c b/drivers/net/ethernet/dec/tulip/tulip_core.c
index 4e8cfa2..69cdcff 100644
--- a/drivers/net/ethernet/dec/tulip/tulip_core.c
+++ b/drivers/net/ethernet/dec/tulip/tulip_core.c
@@ -703,6 +703,7 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev)
wmb();
tp->cur_tx++;
+ netdev_sent_queue(dev, skb->len);
/* Trigger an immediate transmit demand. */
iowrite32(0, tp->base_addr + CSR1);
@@ -746,6 +747,7 @@ static void tulip_clean_tx_ring(struct tulip_private *tp)
tp->tx_buffers[entry].skb = NULL;
tp->tx_buffers[entry].mapping = 0;
}
+ netdev_reset_queue(tp->dev);
}
static void tulip_down (struct net_device *dev)
--
1.8.4
^ permalink raw reply related
* [PATCH 07/07] natsemi: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Greg Kroah-Hartman, David S. Miller, Jiri Pirko,
Bill Pemberton
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to natsemi to use byte queue limits.
This patch has not yet been tested on real hardware, but it compiles cleanly
and should work.
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/natsemi/natsemi.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/natsemi/natsemi.c b/drivers/net/ethernet/natsemi/natsemi.c
index 7a5e295..3d738b9 100644
--- a/drivers/net/ethernet/natsemi/natsemi.c
+++ b/drivers/net/ethernet/natsemi/natsemi.c
@@ -1973,6 +1973,7 @@ static void init_ring(struct net_device *dev)
*((i+1)%TX_RING_SIZE+RX_RING_SIZE));
np->tx_ring[i].cmd_status = 0;
}
+ netdev_reset_queue(dev);
/* 2) RX ring */
np->dirty_rx = 0;
@@ -2012,6 +2013,7 @@ static void drain_tx(struct net_device *dev)
}
np->tx_skbuff[i] = NULL;
}
+ netdev_reset_queue(dev);
}
static void drain_rx(struct net_device *dev)
@@ -2116,6 +2118,8 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
dev_kfree_skb_irq(skb);
dev->stats.tx_dropped++;
}
+
+ netdev_sent_queue(dev, skb->len);
spin_unlock_irqrestore(&np->lock, flags);
if (netif_msg_tx_queued(np)) {
@@ -2128,6 +2132,7 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
static void netdev_tx_done(struct net_device *dev)
{
struct netdev_private *np = netdev_priv(dev);
+ unsigned bytes_compl = 0, pkts_compl = 0;
for (; np->cur_tx - np->dirty_tx > 0; np->dirty_tx++) {
int entry = np->dirty_tx % TX_RING_SIZE;
@@ -2158,9 +2163,14 @@ static void netdev_tx_done(struct net_device *dev)
np->tx_skbuff[entry]->len,
PCI_DMA_TODEVICE);
/* Free the original skb. */
+ bytes_compl += np->tx_skbuff[entry]->len;
+ pkts_compl++;
dev_kfree_skb_irq(np->tx_skbuff[entry]);
np->tx_skbuff[entry] = NULL;
}
+
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
if (netif_queue_stopped(dev) &&
np->cur_tx - np->dirty_tx < TX_QUEUE_LEN - 4) {
/* The ring is no longer full, wake queue. */
--
1.8.4
^ permalink raw reply related
* [PATCHSET v1 00/07] Support for byte queue limits on various network interfaces
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, David S. Miller
Hello,
this patchset adds support for byte queue limits (BQL) to various network
drivers. These drivers are used for the WAN interfaces on servers that I
manage, so it would be nice if BQL / codel support for them made it into
mainline at some point ;)
Any comments are welcome, thanks.
Tino Reichardt
This BQL patchset contains the following patches so far:
0001-8139too-Support-for-byte-queue-limits.patch
0002-r8169-Support-for-byte-queue-limits.patch
0003-tulip-Support-for-byte-queue-limits.patch
0004-via-rhine-Support-for-byte-queue-limits.patch
0005-via-velocity-Support-for-byte-queue-limits.patch
0006-3c59x-Support-for-byte-queue-limits.patch
0007-natsemi-Support-for-byte-queue-limits.patch
---
drivers/net/ethernet/3com/3c59x.c | 37 ++++++++++++++++++++++++-----
drivers/net/ethernet/dec/tulip/interrupt.c | 3 +++
drivers/net/ethernet/dec/tulip/tulip_core.c | 2 ++
drivers/net/ethernet/natsemi/natsemi.c | 10 ++++++++
drivers/net/ethernet/realtek/8139too.c | 8 +++++++
drivers/net/ethernet/realtek/r8169.c | 7 ++++++
drivers/net/ethernet/via/via-rhine.c | 10 ++++++++
drivers/net/ethernet/via/via-velocity.c | 9 +++++++
8 files changed, 80 insertions(+), 6 deletions(-)
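The three BQL hooks that each patch in this series adds can be pictured with a small userspace model. This is only a sketch: `struct bql_model`, its fixed `limit` and the `model_*` functions are hypothetical names made up for illustration, and the real kernel code adapts the limit dynamically and also takes a packet count on completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of BQL accounting; not the kernel API. */
struct bql_model {
	unsigned int inflight_bytes;	/* bytes queued to hardware, not yet completed */
	unsigned int limit;		/* assumed fixed limit; real BQL tunes this */
	bool stopped;			/* true while the stack should stop queueing */
};

/* Models netdev_sent_queue(): called from the xmit path for each packet. */
void model_sent(struct bql_model *q, unsigned int bytes)
{
	q->inflight_bytes += bytes;
	if (q->inflight_bytes >= q->limit)
		q->stopped = true;
}

/* Models netdev_completed_queue(): called once per TX completion pass. */
void model_completed(struct bql_model *q, unsigned int bytes)
{
	assert(bytes <= q->inflight_bytes);	/* cannot complete more than was sent */
	q->inflight_bytes -= bytes;
	if (q->inflight_bytes < q->limit)
		q->stopped = false;
}

/* Models netdev_reset_queue(): called when the TX ring is cleared or rebuilt. */
void model_reset(struct bql_model *q)
{
	q->inflight_bytes = 0;
	q->stopped = false;
}
```

The point of the model is that every byte reported as sent must later be reported as completed or wiped by a reset; otherwise the queue would stay stopped.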
^ permalink raw reply
* [PATCH 04/07] via-rhine: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Roger Luethi
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to via-rhine to use byte queue limits.
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/via/via-rhine.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index bdf697b..49bd6653 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -1220,6 +1220,7 @@ static void alloc_tbufs(struct net_device* dev)
}
rp->tx_ring[i-1].next_desc = cpu_to_le32(rp->tx_ring_dma);
+ netdev_reset_queue(dev);
}
static void free_tbufs(struct net_device* dev)
@@ -1719,6 +1720,7 @@ static netdev_tx_t rhine_start_tx(struct sk_buff *skb,
/* lock eth irq */
wmb();
rp->tx_ring[entry].tx_status |= cpu_to_le32(DescOwn);
+ netdev_sent_queue(dev, skb->len);
wmb();
rp->cur_tx++;
@@ -1783,6 +1785,7 @@ static void rhine_tx(struct net_device *dev)
{
struct rhine_private *rp = netdev_priv(dev);
int txstatus = 0, entry = rp->dirty_tx % TX_RING_SIZE;
+ unsigned int pkts_compl = 0, bytes_compl = 0;
/* find and cleanup dirty tx descriptors */
while (rp->dirty_tx != rp->cur_tx) {
@@ -1830,10 +1833,17 @@ static void rhine_tx(struct net_device *dev)
rp->tx_skbuff[entry]->len,
PCI_DMA_TODEVICE);
}
+
+ bytes_compl += rp->tx_skbuff[entry]->len;
+ pkts_compl++;
dev_kfree_skb(rp->tx_skbuff[entry]);
+
rp->tx_skbuff[entry] = NULL;
entry = (++rp->dirty_tx) % TX_RING_SIZE;
}
+
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
if ((rp->cur_tx - rp->dirty_tx) < TX_QUEUE_LEN - 4)
netif_wake_queue(dev);
}
--
1.8.4
^ permalink raw reply related
* [PATCH 02/07] r8169: Support for byte queue limits
From: Tino Reichardt @ 2013-10-14 18:26 UTC (permalink / raw)
To: netdev, Realtek linux nic maintainers, Igor Maravic,
Francois Romieu
In-Reply-To: <1381775183-24866-1-git-send-email-milky-kernel@mcmilk.de>
Changes to r8169 to use byte queue limits.
BQL was previously disabled in this driver because of issues in the old byte
queue limit code itself, which resulted in errors.
Here is the old thread for the revert of commit 036daf..7a0060:
http://thread.gmane.org/gmane.linux.network/238202
This patch now leaves the rtl8169_private tx_stats struct untouched.
I have tested this patch on a small home server and it has been working
without problems for about two weeks now (kernel 3.10.10 with fq_codel enabled).
Original-Patch-By: Igor Maravic <igorm@etf.rs>
Signed-off-by: Tino Reichardt <milky-kernel@mcmilk.de>
---
drivers/net/ethernet/realtek/r8169.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3397cee..9cefacc 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5841,6 +5841,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
{
rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC);
tp->cur_tx = tp->dirty_tx = 0;
+ netdev_reset_queue(tp->dev);
}
static void rtl_reset_work(struct rtl8169_private *tp)
@@ -6017,6 +6018,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
txd->opts2 = cpu_to_le32(opts[1]);
skb_tx_timestamp(skb);
+ netdev_sent_queue(dev, skb->len);
wmb();
@@ -6116,6 +6118,7 @@ static void rtl8169_pcierr_interrupt(struct net_device *dev)
static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
{
unsigned int dirty_tx, tx_left;
+ unsigned int pkts_compl = 0, bytes_compl = 0;
dirty_tx = tp->dirty_tx;
smp_rmb();
@@ -6138,6 +6141,9 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
tp->tx_stats.packets++;
tp->tx_stats.bytes += tx_skb->skb->len;
u64_stats_update_end(&tp->tx_stats.syncp);
+
+ bytes_compl += tx_skb->skb->len;
+ pkts_compl++;
dev_kfree_skb(tx_skb->skb);
tx_skb->skb = NULL;
}
@@ -6155,6 +6161,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
* ring status.
*/
smp_mb();
+ netdev_completed_queue(dev, pkts_compl, bytes_compl);
if (netif_queue_stopped(dev) &&
TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS)) {
netif_wake_queue(dev);
--
1.8.4
^ permalink raw reply related
* Re: [RFC PATCH v2 1/1] Workqueue based vhost workers
From: Stephen Hemminger @ 2013-10-14 18:27 UTC (permalink / raw)
To: Bandan Das; +Cc: kvm, netdev, Michael Tsirkin, Jason Wang, Bandan Das
In-Reply-To: <1381715743-13672-2-git-send-email-bsd@redhat.com>
On Sun, 13 Oct 2013 21:55:43 -0400
Bandan Das <bsd@redhat.com> wrote:
> +
> + if (cmwq_worker) {
> + ret = vhost_wq_init();
> + if (ret) {
> + pr_info("Enabling wq based vhost workers failed! "
> + "Switching to device based worker instead\n");
> + cmwq_worker = 0;
> + } else
> + pr_info("Enabled workqueues based vhost workers\n");
> + }
Why keep two mechanisms (and two potential code paths to maintain)
when the only way vhost_wq_init() can fail is by running out of memory?
You may have needed the messages and the fallback during development, but for
the final version just do it one way.
If alloc_workqueue fails, then the net_init function should propagate
the error code and fail as well.
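A rough userspace sketch of the single-path variant suggested above, with no fallback and the allocation error passed straight up. All names here (`fake_wq`, `fake_alloc_workqueue`, `vhost_wq_init_model`, `net_init_model`) are hypothetical stand-ins, not the actual vhost code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a workqueue object. */
struct fake_wq { int unused; };

struct fake_wq *vhost_wq_model;

/* Stand-in for alloc_workqueue(): returns NULL when allocation fails. */
struct fake_wq *fake_alloc_workqueue(bool simulate_oom)
{
	return simulate_oom ? NULL : malloc(sizeof(struct fake_wq));
}

/* The only failure mode is running out of memory. */
int vhost_wq_init_model(bool simulate_oom)
{
	vhost_wq_model = fake_alloc_workqueue(simulate_oom);
	return vhost_wq_model ? 0 : -ENOMEM;
}

/* Init propagates the error instead of switching to a second mechanism. */
int net_init_model(bool simulate_oom)
{
	int ret = vhost_wq_init_model(simulate_oom);

	if (ret)
		return ret;
	return 0;
}
```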
^ permalink raw reply
* [PATCHv2] staging: octeon-ethernet: trivial: Avoid OOPS if phydev is not set
From: Sebastian Pöhn @ 2013-10-14 17:58 UTC (permalink / raw)
To: driverdev-devel@linuxdriverproject.org
Cc: netdev@vger.kernel.org, dan.carpenter
Sorry. Haven't signed off for a while now :(
I bet that this is really an issue of incorrect OF information. If I find out more I'll let you know.
@dan: The code works for some interfaces - so phydev is set correctly in some cases.
Signed-off-by: Sebastian Poehn <sebastian.poehn@googlemail.com>
---
diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c
index 83b1030..bc8c503 100644
--- a/drivers/staging/octeon/ethernet-mdio.c
+++ b/drivers/staging/octeon/ethernet-mdio.c
@@ -121,6 +121,9 @@ static void cvm_oct_adjust_link(struct net_device *dev)
struct octeon_ethernet *priv = netdev_priv(dev);
cvmx_helper_link_info_t link_info;
+ if (!priv->phydev)
+ return;
+
if (priv->last_link != priv->phydev->link) {
priv->last_link = priv->phydev->link;
link_info.u64 = 0;
^ permalink raw reply related
* [PATCH 15/17] netfilter: nfnetlink: add batch support and use it from nf_tables
From: Pablo Neira Ayuso @ 2013-10-14 16:38 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, kaber, netdev
In-Reply-To: <1381768738-17739-1-git-send-email-pablo@netfilter.org>
This patch adds batch support to nfnetlink. Basically, it adds
two new control messages:
* NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch,
the nfgenmsg->res_id indicates the nfnetlink subsystem ID.
* NFNL_MSG_BATCH_END, that results in the invocation of the
ss->commit callback function. If not specified or an error
occurred in the batch, the ss->abort function is invoked
instead.
The end message represents the commit operation in nftables, the
lack of end message results in an abort. This patch also adds the
.call_batch function that is only called from the batch receive
path.
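The commit-or-abort decision described above can be modelled in a few lines of userspace C. This is a simplified sketch: the `model_*` enums and function are hypothetical, and message parsing, acks and the -EAGAIN replay are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message kinds; MODEL_MSG_BAD stands for any failing message. */
enum model_msg { MODEL_BATCH_BEGIN, MODEL_BATCH_END, MODEL_MSG_OK, MODEL_MSG_BAD };
enum model_result { MODEL_COMMIT, MODEL_ABORT };

/* msgs[] holds the messages that follow the initial batch-begin message. */
enum model_result model_rcv_batch(const enum model_msg *msgs, size_t n)
{
	bool success = true, saw_end = false;
	size_t i;

	assert(msgs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (msgs[i] == MODEL_BATCH_BEGIN) {
			success = false;	/* malformed: batch begin twice */
			break;
		}
		if (msgs[i] == MODEL_BATCH_END) {
			saw_end = true;
			break;
		}
		if (msgs[i] == MODEL_MSG_BAD)
			success = false;	/* keep going so every error is seen */
	}

	/* Commit only on a clean batch that was explicitly ended. */
	return (success && saw_end) ? MODEL_COMMIT : MODEL_ABORT;
}
```

A batch without an end message, or with any failing message, ends in an abort in this model.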
This patch adds atomic rule updates and dumps based on
bitmask generations. This allows a set of rule-set updates to be
committed atomically and incrementally without altering the internal
state of existing nf_tables expressions/matches/targets.
The idea consists of using a generation cursor of 1 bit and
a bitmask of 2 bits per rule. Assuming the gencursor is 0,
then the genmask (expressed as a bitmask) can be interpreted
as:
00 active in the present, will be active in the next generation.
01 inactive in the present, will be active in the next generation.
10 active in the present, will be deleted in the next generation.
^
gencursor
Once you invoke the transition to the next generation, the global
gencursor is updated:
00 active in the present, will be active in the next generation.
01 active in the present, needs to zero its future, it becomes 00.
10 inactive in the present, delete now.
^
gencursor
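The two tables above can be checked with a small standalone model of the 2-bit genmask. The `model_*` names are hypothetical and the cursor is passed explicitly rather than read from the netns, so this is a sketch of the idea and not the patch's code:

```c
#include <assert.h>
#include <stdbool.h>

/* Only the low 2 bits of genmask are used, one bit per generation. */
struct model_rule { unsigned int genmask; };

/* The cursor is a single bit, so the next generation is the other bit. */
unsigned int model_next(unsigned int gencursor)
{
	assert(gencursor <= 1);
	return gencursor ^ 1u;
}

/* A rule is active in a generation when that generation's bit is clear. */
bool model_is_active(unsigned int gencursor, const struct model_rule *r)
{
	return (r->genmask & (1u << gencursor)) == 0;
}

bool model_is_active_next(unsigned int gencursor, const struct model_rule *r)
{
	return (r->genmask & (1u << model_next(gencursor))) == 0;
}

/* New rule: inactive in the present, active in the next generation. */
void model_activate_next(unsigned int gencursor, struct model_rule *r)
{
	r->genmask = 1u << gencursor;
}

/* Deleted rule: active in the present, inactive in the next generation. */
void model_deactivate_next(unsigned int gencursor, struct model_rule *r)
{
	r->genmask = 1u << model_next(gencursor);
}
```

With the cursor at 0, a newly added rule gets genmask 01 and a rule marked for deletion gets 10; flipping the cursor to 1 makes the former active and the latter inactive without touching either rule.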
If a dump is in progress and nf_tables enters a new generation,
the dump will stop and return -EBUSY to let userspace know that
it has to retry. In order to invalidate dumps, a global
genctr counter is increased every time nf_tables enters a new
generation.
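The dump invalidation check amounts to comparing a snapshot taken at the start of the dump against the current values. A minimal sketch, assuming hypothetical `model_*` names and ignoring locking and memory ordering:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Per-namespace generation state, as in the patch: two u8 fields. */
struct model_gen { uint8_t gencursor; uint8_t genctr; };

/* Values captured when a dump pass starts. */
struct model_dump_state { uint8_t gencursor; uint8_t genctr; };

/* A commit bumps the counter and flips the cursor. */
void model_commit(struct model_gen *g)
{
	g->genctr++;
	g->gencursor ^= 1;
}

void model_dump_begin(const struct model_gen *g, struct model_dump_state *s)
{
	assert(g != NULL && s != NULL);
	s->gencursor = g->gencursor;
	s->genctr = g->genctr;
}

/* Returns 0 if the snapshot still matches, -EBUSY if a commit intervened. */
int model_dump_end(const struct model_gen *g, const struct model_dump_state *s)
{
	if (s->gencursor != g->gencursor || s->genctr != g->genctr)
		return -EBUSY;
	return 0;
}
```

Checking the counter as well as the cursor matters because two commits in a row return the cursor to its old value, while the counter still differs (until the 8-bit counter wraps).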
This new operation can be used from the user-space utility
that controls the firewall, e.g.
nft -f restore
The rule updates contained in `file' will be applied atomically.
cat file
-----
add filter INPUT ip saddr 1.1.1.1 counter accept #1
del filter INPUT ip daddr 2.2.2.2 counter drop #2
-EOF-
Note that the rule 1 will be inactive until the transition to the
next generation, the rule 2 will be evicted in the next generation.
There is a penalty during the rule update due to the branch
misprediction in the packet matching framework. But that should be
quickly resolved once the iteration over the commit list that
contains rules that require updates is finished.
Event notification happens once the rule-set update has been
committed. So we skip notifications in case the rule-set update
is aborted, which can happen when the rule-set is only being tested
to check that it applies correctly.
This patch squashes the following patches from Pablo:
* nf_tables: atomic rule updates and dumps
* nf_tables: get rid of per rule list_head for commits
* nf_tables: use per netns commit list
* nfnetlink: add batch support and use it from nf_tables
* nf_tables: all rule updates are transactional
* nf_tables: attach replacement rule after stale one
* nf_tables: do not allow deletion/replacement of stale rules
* nf_tables: remove unused NFTA_RULE_FLAGS
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter/nfnetlink.h | 5 +
include/net/netfilter/nf_tables.h | 25 +++-
include/net/netns/nftables.h | 3 +
include/uapi/linux/netfilter/nfnetlink.h | 4 +
net/netfilter/nf_tables_api.c | 202 +++++++++++++++++++++++++++---
net/netfilter/nf_tables_core.c | 10 ++
net/netfilter/nfnetlink.c | 175 +++++++++++++++++++++++++-
7 files changed, 401 insertions(+), 23 deletions(-)
diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h
index 4f68cd7..28c7436 100644
--- a/include/linux/netfilter/nfnetlink.h
+++ b/include/linux/netfilter/nfnetlink.h
@@ -14,6 +14,9 @@ struct nfnl_callback {
int (*call_rcu)(struct sock *nl, struct sk_buff *skb,
const struct nlmsghdr *nlh,
const struct nlattr * const cda[]);
+ int (*call_batch)(struct sock *nl, struct sk_buff *skb,
+ const struct nlmsghdr *nlh,
+ const struct nlattr * const cda[]);
const struct nla_policy *policy; /* netlink attribute policy */
const u_int16_t attr_count; /* number of nlattr's */
};
@@ -23,6 +26,8 @@ struct nfnetlink_subsystem {
__u8 subsys_id; /* nfnetlink subsystem ID */
__u8 cb_count; /* number of callbacks */
const struct nfnl_callback *cb; /* callback for individual types */
+ int (*commit)(struct sk_buff *skb);
+ int (*abort)(struct sk_buff *skb);
};
int nfnetlink_subsys_register(const struct nfnetlink_subsystem *n);
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index d3272e9..975ad3c 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -323,18 +323,39 @@ static inline void *nft_expr_priv(const struct nft_expr *expr)
* @list: used internally
* @rcu_head: used internally for rcu
* @handle: rule handle
+ * @genmask: generation mask
* @dlen: length of expression data
* @data: expression data
*/
struct nft_rule {
struct list_head list;
struct rcu_head rcu_head;
- u64 handle:48,
+ u64 handle:46,
+ genmask:2,
dlen:16;
unsigned char data[]
__attribute__((aligned(__alignof__(struct nft_expr))));
};
+/**
+ * struct nft_rule_trans - nf_tables rule update in transaction
+ *
+ * @list: used internally
+ * @rule: rule that needs to be updated
+ * @chain: chain that this rule belongs to
+ * @table: table for which this chain applies
+ * @nlh: netlink header of the message that contains this update
+ * @family: family expressed as AF_*
+ */
+struct nft_rule_trans {
+ struct list_head list;
+ struct nft_rule *rule;
+ const struct nft_chain *chain;
+ const struct nft_table *table;
+ const struct nlmsghdr *nlh;
+ u8 family;
+};
+
static inline struct nft_expr *nft_expr_first(const struct nft_rule *rule)
{
return (struct nft_expr *)&rule->data[0];
@@ -370,6 +391,7 @@ enum nft_chain_flags {
* @rules: list of rules in the chain
* @list: used internally
* @rcu_head: used internally
+ * @net: net namespace that this chain belongs to
* @handle: chain handle
* @flags: bitmask of enum nft_chain_flags
* @use: number of jump references to this chain
@@ -380,6 +402,7 @@ struct nft_chain {
struct list_head rules;
struct list_head list;
struct rcu_head rcu_head;
+ struct net *net;
u64 handle;
u8 flags;
u16 use;
diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h
index a98b1c5..08a4248 100644
--- a/include/net/netns/nftables.h
+++ b/include/net/netns/nftables.h
@@ -7,9 +7,12 @@ struct nft_af_info;
struct netns_nftables {
struct list_head af_info;
+ struct list_head commit_list;
struct nft_af_info *ipv4;
struct nft_af_info *ipv6;
struct nft_af_info *bridge;
+ u8 gencursor;
+ u8 genctr;
};
#endif
diff --git a/include/uapi/linux/netfilter/nfnetlink.h b/include/uapi/linux/netfilter/nfnetlink.h
index 2889594..596ddd4 100644
--- a/include/uapi/linux/netfilter/nfnetlink.h
+++ b/include/uapi/linux/netfilter/nfnetlink.h
@@ -57,4 +57,8 @@ struct nfgenmsg {
#define NFNL_SUBSYS_NFT_COMPAT 11
#define NFNL_SUBSYS_COUNT 12
+/* Reserved control nfnetlink messages */
+#define NFNL_MSG_BATCH_BEGIN NLMSG_MIN_TYPE
+#define NFNL_MSG_BATCH_END NLMSG_MIN_TYPE+1
+
#endif /* _UAPI_NFNETLINK_H */
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0f14066..79e1418 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -978,6 +978,7 @@ static int nf_tables_newchain(struct sock *nlsk, struct sk_buff *skb,
INIT_LIST_HEAD(&chain->rules);
chain->handle = nf_tables_alloc_handle(table);
+ chain->net = net;
nla_strlcpy(chain->name, name, NFT_CHAIN_MAXNAMELEN);
if (!(table->flags & NFT_TABLE_F_DORMANT) &&
@@ -1371,6 +1372,41 @@ err:
return err;
}
+static inline bool
+nft_rule_is_active(struct net *net, const struct nft_rule *rule)
+{
+ return (rule->genmask & (1 << net->nft.gencursor)) == 0;
+}
+
+static inline int gencursor_next(struct net *net)
+{
+ return net->nft.gencursor+1 == 1 ? 1 : 0;
+}
+
+static inline int
+nft_rule_is_active_next(struct net *net, const struct nft_rule *rule)
+{
+ return (rule->genmask & (1 << gencursor_next(net))) == 0;
+}
+
+static inline void
+nft_rule_activate_next(struct net *net, struct nft_rule *rule)
+{
+ /* Now inactive, will be active in the future */
+ rule->genmask = (1 << net->nft.gencursor);
+}
+
+static inline void
+nft_rule_disactivate_next(struct net *net, struct nft_rule *rule)
+{
+ rule->genmask = (1 << gencursor_next(net));
+}
+
+static inline void nft_rule_clear(struct net *net, struct nft_rule *rule)
+{
+ rule->genmask = 0;
+}
+
static int nf_tables_dump_rules(struct sk_buff *skb,
struct netlink_callback *cb)
{
@@ -1382,6 +1418,8 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
unsigned int idx = 0, s_idx = cb->args[0];
struct net *net = sock_net(skb->sk);
int family = nfmsg->nfgen_family;
+ u8 genctr = ACCESS_ONCE(net->nft.genctr);
+ u8 gencursor = ACCESS_ONCE(net->nft.gencursor);
list_for_each_entry(afi, &net->nft.af_info, list) {
if (family != NFPROTO_UNSPEC && family != afi->family)
@@ -1390,6 +1428,8 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
list_for_each_entry(table, &afi->tables, list) {
list_for_each_entry(chain, &table->chains, list) {
list_for_each_entry(rule, &chain->rules, list) {
+ if (!nft_rule_is_active(net, rule))
+ goto cont;
if (idx < s_idx)
goto cont;
if (idx > s_idx)
@@ -1408,6 +1448,10 @@ cont:
}
}
done:
+ /* Invalidate this dump, a transition to the new generation happened */
+ if (gencursor != net->nft.gencursor || genctr != net->nft.genctr)
+ return -EBUSY;
+
cb->args[0] = idx;
return skb->len;
}
@@ -1492,6 +1536,25 @@ static void nf_tables_rule_destroy(struct nft_rule *rule)
static struct nft_expr_info *info;
+static struct nft_rule_trans *
+nf_tables_trans_add(struct nft_rule *rule, const struct nft_ctx *ctx)
+{
+ struct nft_rule_trans *rupd;
+
+ rupd = kmalloc(sizeof(struct nft_rule_trans), GFP_KERNEL);
+ if (rupd == NULL)
+ return NULL;
+
+ rupd->chain = ctx->chain;
+ rupd->table = ctx->table;
+ rupd->rule = rule;
+ rupd->family = ctx->afi->family;
+ rupd->nlh = ctx->nlh;
+ list_add_tail(&rupd->list, &ctx->net->nft.commit_list);
+
+ return rupd;
+}
+
static int nf_tables_newrule(struct sock *nlsk, struct sk_buff *skb,
const struct nlmsghdr *nlh,
const struct nlattr * const nla[])
@@ -1502,6 +1565,7 @@ static int nf_tables_newrule(struct sock *nlsk, struct sk_buff *skb,
struct nft_table *table;
struct nft_chain *chain;
struct nft_rule *rule, *old_rule = NULL;
+ struct nft_rule_trans *repl = NULL;
struct nft_expr *expr;
struct nft_ctx ctx;
struct nlattr *tmp;
@@ -1576,6 +1640,8 @@ static int nf_tables_newrule(struct sock *nlsk, struct sk_buff *skb,
if (rule == NULL)
goto err1;
+ nft_rule_activate_next(net, rule);
+
rule->handle = handle;
rule->dlen = size;
@@ -1589,8 +1655,18 @@ static int nf_tables_newrule(struct sock *nlsk, struct sk_buff *skb,
}
if (nlh->nlmsg_flags & NLM_F_REPLACE) {
- list_replace_rcu(&old_rule->list, &rule->list);
- nf_tables_rule_destroy(old_rule);
+ if (nft_rule_is_active_next(net, old_rule)) {
+ repl = nf_tables_trans_add(old_rule, &ctx);
+ if (repl == NULL) {
+ err = -ENOMEM;
+ goto err2;
+ }
+ nft_rule_disactivate_next(net, old_rule);
+ list_add_tail(&rule->list, &old_rule->list);
+ } else {
+ err = -ENOENT;
+ goto err2;
+ }
} else if (nlh->nlmsg_flags & NLM_F_APPEND)
if (old_rule)
list_add_rcu(&rule->list, &old_rule->list);
@@ -1603,11 +1679,20 @@ static int nf_tables_newrule(struct sock *nlsk, struct sk_buff *skb,
list_add_rcu(&rule->list, &chain->rules);
}
- nf_tables_rule_notify(skb, nlh, table, chain, rule, NFT_MSG_NEWRULE,
- nlh->nlmsg_flags & (NLM_F_APPEND | NLM_F_REPLACE),
- nfmsg->nfgen_family);
+ if (nf_tables_trans_add(rule, &ctx) == NULL) {
+ err = -ENOMEM;
+ goto err3;
+ }
return 0;
+err3:
+ list_del_rcu(&rule->list);
+ if (repl) {
+ list_del_rcu(&repl->rule->list);
+ list_del(&repl->list);
+ nft_rule_clear(net, repl->rule);
+ kfree(repl);
+ }
err2:
nf_tables_rule_destroy(rule);
err1:
@@ -1618,6 +1703,19 @@ err1:
return err;
}
+static int
+nf_tables_delrule_one(struct nft_ctx *ctx, struct nft_rule *rule)
+{
+ /* You cannot delete the same rule twice */
+ if (nft_rule_is_active_next(ctx->net, rule)) {
+ if (nf_tables_trans_add(rule, ctx) == NULL)
+ return -ENOMEM;
+ nft_rule_disactivate_next(ctx->net, rule);
+ return 0;
+ }
+ return -ENOENT;
+}
+
static int nf_tables_delrule(struct sock *nlsk, struct sk_buff *skb,
const struct nlmsghdr *nlh,
const struct nlattr * const nla[])
@@ -1628,7 +1726,8 @@ static int nf_tables_delrule(struct sock *nlsk, struct sk_buff *skb,
const struct nft_table *table;
struct nft_chain *chain;
struct nft_rule *rule, *tmp;
- int family = nfmsg->nfgen_family;
+ int family = nfmsg->nfgen_family, err = 0;
+ struct nft_ctx ctx;
afi = nf_tables_afinfo_lookup(net, family, false);
if (IS_ERR(afi))
@@ -1642,31 +1741,95 @@ static int nf_tables_delrule(struct sock *nlsk, struct sk_buff *skb,
if (IS_ERR(chain))
return PTR_ERR(chain);
+ nft_ctx_init(&ctx, skb, nlh, afi, table, chain, nla);
+
if (nla[NFTA_RULE_HANDLE]) {
rule = nf_tables_rule_lookup(chain, nla[NFTA_RULE_HANDLE]);
if (IS_ERR(rule))
return PTR_ERR(rule);
- /* List removal must be visible before destroying expressions */
- list_del_rcu(&rule->list);
-
- nf_tables_rule_notify(skb, nlh, table, chain, rule,
- NFT_MSG_DELRULE, 0, family);
- nf_tables_rule_destroy(rule);
+ err = nf_tables_delrule_one(&ctx, rule);
} else {
/* Remove all rules in this chain */
list_for_each_entry_safe(rule, tmp, &chain->rules, list) {
- list_del_rcu(&rule->list);
+ err = nf_tables_delrule_one(&ctx, rule);
+ if (err < 0)
+ break;
+ }
+ }
+
+ return err;
+}
+
+static int nf_tables_commit(struct sk_buff *skb)
+{
+ struct net *net = sock_net(skb->sk);
+ struct nft_rule_trans *rupd, *tmp;
- nf_tables_rule_notify(skb, nlh, table, chain, rule,
- NFT_MSG_DELRULE, 0, family);
- nf_tables_rule_destroy(rule);
+ /* Bump generation counter, invalidate any dump in progress */
+ net->nft.genctr++;
+
+ /* A new generation has just started */
+ net->nft.gencursor = gencursor_next(net);
+
+ /* Make sure all packets have left the previous generation before
+ * purging old rules.
+ */
+ synchronize_rcu();
+
+ list_for_each_entry_safe(rupd, tmp, &net->nft.commit_list, list) {
+ /* Delete this rule from the dirty list */
+ list_del(&rupd->list);
+
+ /* This rule was inactive in the past and just became active.
+ * Clear the next bit of the genmask since its meaning has
+ * changed, now it is the future.
+ */
+ if (nft_rule_is_active(net, rupd->rule)) {
+ nft_rule_clear(net, rupd->rule);
+ nf_tables_rule_notify(skb, rupd->nlh, rupd->table,
+ rupd->chain, rupd->rule,
+ NFT_MSG_NEWRULE, 0,
+ rupd->family);
+ kfree(rupd);
+ continue;
}
+
+ /* This rule is in the past, get rid of it */
+ list_del_rcu(&rupd->rule->list);
+ nf_tables_rule_notify(skb, rupd->nlh, rupd->table, rupd->chain,
+ rupd->rule, NFT_MSG_DELRULE, 0,
+ rupd->family);
+ nf_tables_rule_destroy(rupd->rule);
+ kfree(rupd);
}
return 0;
}
+static int nf_tables_abort(struct sk_buff *skb)
+{
+ struct net *net = sock_net(skb->sk);
+ struct nft_rule_trans *rupd, *tmp;
+
+ list_for_each_entry_safe(rupd, tmp, &net->nft.commit_list, list) {
+ /* Delete all rules from the dirty list */
+ list_del(&rupd->list);
+
+ if (!nft_rule_is_active_next(net, rupd->rule)) {
+ nft_rule_clear(net, rupd->rule);
+ kfree(rupd);
+ continue;
+ }
+
+ /* This rule is inactive, get rid of it */
+ list_del_rcu(&rupd->rule->list);
+ nf_tables_rule_destroy(rupd->rule);
+ kfree(rupd);
+ }
+ return 0;
+}
+
/*
* Sets
*/
@@ -2634,7 +2797,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_chain_policy,
},
[NFT_MSG_NEWRULE] = {
- .call = nf_tables_newrule,
+ .call_batch = nf_tables_newrule,
.attr_count = NFTA_RULE_MAX,
.policy = nft_rule_policy,
},
@@ -2644,7 +2807,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_rule_policy,
},
[NFT_MSG_DELRULE] = {
- .call = nf_tables_delrule,
+ .call_batch = nf_tables_delrule,
.attr_count = NFTA_RULE_MAX,
.policy = nft_rule_policy,
},
@@ -2685,6 +2848,8 @@ static const struct nfnetlink_subsystem nf_tables_subsys = {
.subsys_id = NFNL_SUBSYS_NFTABLES,
.cb_count = NFT_MSG_MAX,
.cb = nf_tables_cb,
+ .commit = nf_tables_commit,
+ .abort = nf_tables_abort,
};
/*
@@ -3056,6 +3221,7 @@ EXPORT_SYMBOL_GPL(nft_data_dump);
static int nf_tables_init_net(struct net *net)
{
INIT_LIST_HEAD(&net->nft.af_info);
+ INIT_LIST_HEAD(&net->nft.commit_list);
return 0;
}
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 3c13007..d581ef6 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -88,12 +88,22 @@ nft_do_chain_pktinfo(struct nft_pktinfo *pkt, const struct nf_hook_ops *ops)
struct nft_data data[NFT_REG_MAX + 1];
unsigned int stackptr = 0;
struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
+ /*
+ * Cache cursor to avoid problems in case that the cursor is updated
+ * while traversing the ruleset.
+ */
+ unsigned int gencursor = ACCESS_ONCE(chain->net->nft.gencursor);
do_chain:
rule = list_entry(&chain->rules, struct nft_rule, list);
next_rule:
data[NFT_REG_VERDICT].verdict = NFT_CONTINUE;
list_for_each_entry_continue_rcu(rule, &chain->rules, list) {
+
+ /* This rule is not active, skip. */
+ if (unlikely(rule->genmask & (1 << gencursor)))
+ continue;
+
nft_rule_for_each_expr(expr, last, rule) {
if (expr->ops == &nft_cmp_fast_ops)
nft_cmp_fast_eval(expr, data);
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 572d87d..027f16a 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -147,9 +147,6 @@ static int nfnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
const struct nfnetlink_subsystem *ss;
int type, err;
- if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
- return -EPERM;
-
/* All the messages must at least contain nfgenmsg */
if (nlmsg_len(nlh) < sizeof(struct nfgenmsg))
return 0;
@@ -217,9 +214,179 @@ replay:
}
}
+static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
+ u_int16_t subsys_id)
+{
+ struct sk_buff *nskb, *oskb = skb;
+ struct net *net = sock_net(skb->sk);
+ const struct nfnetlink_subsystem *ss;
+ const struct nfnl_callback *nc;
+ bool success = true, done = false;
+ int err;
+
+ if (subsys_id >= NFNL_SUBSYS_COUNT)
+ return netlink_ack(skb, nlh, -EINVAL);
+replay:
+ nskb = netlink_skb_clone(oskb, GFP_KERNEL);
+ if (!nskb)
+ return netlink_ack(oskb, nlh, -ENOMEM);
+
+ nskb->sk = oskb->sk;
+ skb = nskb;
+
+ nfnl_lock(subsys_id);
+ ss = rcu_dereference_protected(table[subsys_id].subsys,
+ lockdep_is_held(&table[subsys_id].mutex));
+ if (!ss) {
+#ifdef CONFIG_MODULES
+ nfnl_unlock(subsys_id);
+ request_module("nfnetlink-subsys-%d", subsys_id);
+ nfnl_lock(subsys_id);
+ ss = rcu_dereference_protected(table[subsys_id].subsys,
+ lockdep_is_held(&table[subsys_id].mutex));
+ if (!ss)
+#endif
+ {
+ nfnl_unlock(subsys_id);
+ kfree_skb(nskb);
+ return netlink_ack(skb, nlh, -EOPNOTSUPP);
+ }
+ }
+
+ if (!ss->commit || !ss->abort) {
+ nfnl_unlock(subsys_id);
+ kfree_skb(nskb);
+ return netlink_ack(skb, nlh, -EOPNOTSUPP);
+ }
+
+ while (skb->len >= nlmsg_total_size(0)) {
+ int msglen, type;
+
+ nlh = nlmsg_hdr(skb);
+ err = 0;
+
+ if (nlh->nlmsg_len < NLMSG_HDRLEN) {
+ err = -EINVAL;
+ goto ack;
+ }
+
+ /* Only requests are handled by the kernel */
+ if (!(nlh->nlmsg_flags & NLM_F_REQUEST)) {
+ err = -EINVAL;
+ goto ack;
+ }
+
+ type = nlh->nlmsg_type;
+ if (type == NFNL_MSG_BATCH_BEGIN) {
+ /* Malformed: Batch begin twice */
+ success = false;
+ goto done;
+ } else if (type == NFNL_MSG_BATCH_END) {
+ done = true;
+ goto done;
+ } else if (type < NLMSG_MIN_TYPE) {
+ err = -EINVAL;
+ goto ack;
+ }
+
+ /* We only accept a batch with messages for the same
+ * subsystem.
+ */
+ if (NFNL_SUBSYS_ID(type) != subsys_id) {
+ err = -EINVAL;
+ goto ack;
+ }
+
+ nc = nfnetlink_find_client(type, ss);
+ if (!nc) {
+ err = -EINVAL;
+ goto ack;
+ }
+
+ {
+ int min_len = nlmsg_total_size(sizeof(struct nfgenmsg));
+ u_int8_t cb_id = NFNL_MSG_TYPE(nlh->nlmsg_type);
+ struct nlattr *cda[ss->cb[cb_id].attr_count + 1];
+ struct nlattr *attr = (void *)nlh + min_len;
+ int attrlen = nlh->nlmsg_len - min_len;
+
+ err = nla_parse(cda, ss->cb[cb_id].attr_count,
+ attr, attrlen, ss->cb[cb_id].policy);
+ if (err < 0)
+ goto ack;
+
+ if (nc->call_batch) {
+ err = nc->call_batch(net->nfnl, skb, nlh,
+ (const struct nlattr **)cda);
+ }
+
+ /* The lock was released to autoload some module, we
+ * have to abort and start from scratch using the
+ * original skb.
+ */
+ if (err == -EAGAIN) {
+ ss->abort(skb);
+ nfnl_unlock(subsys_id);
+ kfree_skb(nskb);
+ goto replay;
+ }
+ }
+ack:
+ if (nlh->nlmsg_flags & NLM_F_ACK || err) {
+ /* We don't stop processing the batch on errors, thus,
+ * userspace gets all the errors that the batch
+ * triggers.
+ */
+ netlink_ack(skb, nlh, err);
+ if (err)
+ success = false;
+ }
+
+ msglen = NLMSG_ALIGN(nlh->nlmsg_len);
+ if (msglen > skb->len)
+ msglen = skb->len;
+ skb_pull(skb, msglen);
+ }
+done:
+ if (success && done)
+ ss->commit(skb);
+ else
+ ss->abort(skb);
+
+ nfnl_unlock(subsys_id);
+ kfree_skb(nskb);
+}
+
static void nfnetlink_rcv(struct sk_buff *skb)
{
- netlink_rcv_skb(skb, &nfnetlink_rcv_msg);
+ struct nlmsghdr *nlh = nlmsg_hdr(skb);
+ struct net *net = sock_net(skb->sk);
+ int msglen;
+
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
+ return netlink_ack(skb, nlh, -EPERM);
+
+ if (nlh->nlmsg_len < NLMSG_HDRLEN ||
+ skb->len < nlh->nlmsg_len)
+ return;
+
+ if (nlh->nlmsg_type == NFNL_MSG_BATCH_BEGIN) {
+ struct nfgenmsg *nfgenmsg;
+
+ msglen = NLMSG_ALIGN(nlh->nlmsg_len);
+ if (msglen > skb->len)
+ msglen = skb->len;
+
+ if (nlh->nlmsg_len < NLMSG_HDRLEN ||
+ skb->len < NLMSG_HDRLEN + sizeof(struct nfgenmsg))
+ return;
+
+ nfgenmsg = nlmsg_data(nlh);
+ skb_pull(skb, msglen);
+ nfnetlink_rcv_batch(skb, nlh, nfgenmsg->res_id);
+ } else {
+ netlink_rcv_skb(skb, &nfnetlink_rcv_msg);
+ }
}
#ifdef CONFIG_MODULES
--
1.7.10.4
^ permalink raw reply related