Netdev List


* Re: Possible issue with Mellanox mlx4/port handling
From: Marcelo Ricardo Leitner @ 2012-09-03 17:45 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: netdev@vger.kernel.org, Or Gerlitz
In-Reply-To: <953B660C027164448AE903364AC447D28720CF7D@MTLDAG01.mtl.com>

On 09/03/2012 02:32 PM, Yevgeny Petrilin wrote:
>> Commit
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=4c41b3673759d096106e68bce586f103c51d4119
>> inserted changes like:
>>
>> @@ -361,7 +361,7 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
>>           int err;
>>           struct mlx4_priv *priv = mlx4_priv(dev);
>>
>> -       s_steer = &mlx4_priv(dev)->steer[0];
>> +       s_steer = &mlx4_priv(dev)->steer[port - 1];
>>
>>           mutex_lock(&priv->mcg_table.mutex);
>>
>> But I fear we missed one part of the deal. Concept patch:
>>
>> @@ -365,7 +365,7 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
>>
>>           mutex_lock(&priv->mcg_table.mutex);
>>
>> -       if (get_promisc_qp(dev, 0, steer, qpn)) {
>> +       if (get_promisc_qp(dev, port - 1, steer, qpn)) {
>>                   err = 0;  /* Noting to do, already exists */
>>                   goto out_mutex;
>>           }
>>
> ...
>>
>> As far as I can understand, we are changing a list for a port and checking for
>> duplicates on the other list. Points marked as A, B and C for highlighting. Am I
>> missing something? What do you think?
>>
>> FWIW, this call get_promisc_qp(dev, 0, ...) happens in other places too.
>>
>> Thank you,
>> Marcelo.
>
> Hi Marcelo,
> Thanks for this, you are absolutely right.
> We actually have a fix for this issue which we are now verifying, and it will be sent to the mailing list in a few days.

Hi Yevgeny,

Thanks for the fast confirmation.

If you can share, what can we expect it to look like? Something like the
chunk I suggested above, or is anything else needed? I found only 6
places calling get_promisc_qp() that way and couldn't find any other
issue like that.

Thanks,
Marcelo.
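
A minimal sketch of the mismatch discussed above, assuming a toy model
with one list per port. None of these names are the mlx4 API: toy_steer,
toy_has_qp() and toy_add_promisc_qp() are hypothetical stand-ins, and the
lookup index is made a parameter only so both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PORTS   2
#define TOY_MAX_QPS 8

/* Hypothetical stand-in for the per-port steering state. */
struct toy_steer {
	unsigned int qpn[TOY_MAX_QPS];
	size_t count;
};

/* Return true if qpn is already present in the given list. */
static bool toy_has_qp(const struct toy_steer *s, unsigned int qpn)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->qpn[i] == qpn)
			return true;
	return false;
}

/*
 * Add qpn to the list of a 1-based port. lookup_idx selects which list
 * the duplicate check consults: 0 mimics the questioned call, port - 1
 * mimics the concept patch. Returns true if an entry was added.
 */
static bool toy_add_promisc_qp(struct toy_steer steer[TOY_PORTS],
			       unsigned int port, unsigned int lookup_idx,
			       unsigned int qpn)
{
	struct toy_steer *s_steer;

	assert(port >= 1 && port <= TOY_PORTS);
	assert(lookup_idx < TOY_PORTS);
	s_steer = &steer[port - 1];

	if (toy_has_qp(&steer[lookup_idx], qpn))
		return false;	/* treated as "already exists" */
	if (s_steer->count == TOY_MAX_QPS)
		return false;	/* toy list is full */
	s_steer->qpn[s_steer->count++] = qpn;
	return true;
}
```

With a QP already registered on port 1, adding the same QP on port 2
with lookup_idx 0 is skipped even though the port 2 list is empty;
with lookup_idx equal to port - 1 it is added as expected.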

* RE: Possible issue with Mellanox mlx4/port handling
From: Yevgeny Petrilin @ 2012-09-03 17:32 UTC (permalink / raw)
  To: mleitner@redhat.com; +Cc: netdev@vger.kernel.org, Or Gerlitz

> Commit
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=4c41b3673759d096106e68bce586f103c51d4119
> inserted changes like:
> 
> @@ -361,7 +361,7 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
>          int err;
>          struct mlx4_priv *priv = mlx4_priv(dev);
> 
> -       s_steer = &mlx4_priv(dev)->steer[0];
> +       s_steer = &mlx4_priv(dev)->steer[port - 1];
> 
>          mutex_lock(&priv->mcg_table.mutex);
> 
> But I fear we missed one part of the deal. Concept patch:
> 
> @@ -365,7 +365,7 @@ static int add_promisc_qp(struct mlx4_dev *dev, u8 port,
> 
>          mutex_lock(&priv->mcg_table.mutex);
> 
> -       if (get_promisc_qp(dev, 0, steer, qpn)) {
> +       if (get_promisc_qp(dev, port - 1, steer, qpn)) {
>                  err = 0;  /* Noting to do, already exists */
>                  goto out_mutex;
>          }
> 
...
> 
> As far as I can understand, we are changing a list for a port and checking for
> duplicates on the other list. Points marked as A, B and C for highlighting. Am I
> missing something? What do you think?
> 
> FWIW, this call get_promisc_qp(dev, 0, ...) happens in other places too.
> 
> Thank you,
> Marcelo.

Hi Marcelo,
Thanks for this, you are absolutely right.
We actually have a fix for this issue which we are now verifying, and it will be sent to the mailing list in a few days.

Thanks,
Yevgeny

* Re: [PATCH] sctp: Don't charge for data in sndbuf again when transmitting packet
From: David Miller @ 2012-09-03 17:24 UTC (permalink / raw)
  To: vyasevich; +Cc: tgraf, linux-sctp, netdev, vyasevic, nhorman
In-Reply-To: <378685E0-7225-4C9D-AE0F-7B8767ECB16A@gmail.com>

From: Vlad Yasevich <vyasevich@gmail.com>
Date: Mon, 3 Sep 2012 11:02:51 -0400

> 
> 
> On Sep 3, 2012, at 10:27 AM, Thomas Graf <tgraf@suug.ch> wrote:
> 
>> SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
>> skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
>> sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be woken up
>> by __sctp_write_space().
>>
>> Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
>> which calls __sctp_write_space() directly.
>>
>> Buffer space charged via skb_set_owner_w() is released via sock_wfree()
>> which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
>> sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.
>>
>> Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
>> bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
>> interrupted by a signal.
>>
>> This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...
>>
>> Charging for the data twice does not make sense in the first place; it
>> leads to overcharging sndbuf by a factor of 2. Therefore this patch only
>> charges a single byte in wmem_alloc when transmitting an SCTP packet to
>> ensure that the socket stays alive until the packet has been released.
>>
>> This means that control chunks are no longer accounted for in wmem_alloc
>> which I believe is not a problem as skb->truesize will typically lead
>> to overcharging anyway and thus compensates for any control overhead.
> 
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>

Applied and queued up for -stable, thanks everyone.

* Re: [PATCH v2] netfilter: take care of timewait sockets
From: David Miller @ 2012-09-03 17:23 UTC (permalink / raw)
  To: eric.dumazet; +Cc: fw, hvtaifwkbgefbaei, netdev, e1000-devel
In-Reply-To: <1346666238.2563.113.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 03 Sep 2012 11:57:18 +0200

> David, tell me if you prefer to change TCP demux to avoid timewait,
> as I have no strong opinion.

It would be the stupidest thing ever to do the whole hash lookup
just to throw the result away just because it's a timewait socket.

> [PATCH] net: sock_edemux() should take care of timewait sockets
> 
> sock_edemux() can handle either a regular socket or a timewait socket
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.

* Re: [net-next.git 5/7] stmmac: add sysFs support
From: David Miller @ 2012-09-03 17:20 UTC (permalink / raw)
  To: bhutchings; +Cc: peppe.cavallaro, netdev
In-Reply-To: <1346676267.7787.82.camel@deadeye.wl.decadent.org.uk>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 3 Sep 2012 13:44:27 +0100

> On Mon, 2012-09-03 at 09:47 +0200, Giuseppe CAVALLARO wrote:
>> This patch adds the sysFs support.
>> Some internal driver parameters can be tuned by using some
>> entries exposed via sysFS. These parameters currently are,
>> for example, for internal timers used to mitigate the rx/tx
>> interrupts or for EEE.
> [...]
> 
> Why are you not exposing these through the standard ethtool operations?

Giuseppe, I'm not applying driver patches that add sysfs crap like
this.  Either use existing ethtool interfaces or create new ones that
provide the necessary functionality.

Adding unique configuration mechanisms to a device driver is always a
bug.

And I'm getting real fed up with driver writers simply not getting the
message.  Every time someone adds sysfs or ioctl crap, we push back,
so just don't do it and stop wasting our time.

* Re: [PATCH 2/2] netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue()
From: Pablo Neira Ayuso @ 2012-09-03 15:31 UTC (permalink / raw)
  To: Michael Wang
  Cc: LKML, netdev@vger.kernel.org, coreteam, netfilter,
	netfilter-devel, David Miller, kaber
In-Reply-To: <5035C6E6.6040001@linux.vnet.ibm.com>

On Thu, Aug 23, 2012 at 02:00:06PM +0800, Michael Wang wrote:
> From: Michael Wang <wangyun@linux.vnet.ibm.com>
> 
> Since 'list_for_each_continue_rcu' has already been replaced by
> 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_queue() as a
> parameter can not benefit us any more.
> 
> This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
> nf_queue() and __nf_queue() to save code.

Applied, thanks.

* Re: [PATCH 1/2] netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate()
From: Pablo Neira Ayuso @ 2012-09-03 15:31 UTC (permalink / raw)
  To: Michael Wang
  Cc: LKML, netdev@vger.kernel.org, coreteam, netfilter,
	netfilter-devel, David Miller, kaber
In-Reply-To: <5035C6DD.3000600@linux.vnet.ibm.com>

On Thu, Aug 23, 2012 at 01:59:57PM +0800, Michael Wang wrote:
> From: Michael Wang <wangyun@linux.vnet.ibm.com>
> 
> Since 'list_for_each_continue_rcu' has already been replaced by
> 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_iterate() as a
> parameter can not benefit us any more.
> 
> This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
> nf_iterate() to save code.

Applied, thanks.

* Re: [net-next.git 5/7] stmmac: add sysFs support
From: Ben Hutchings @ 2012-09-03 15:09 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev
In-Reply-To: <5044B245.5050703@st.com>

On Mon, 2012-09-03 at 15:36 +0200, Giuseppe CAVALLARO wrote:
> Hello Ben,
> 
> On 9/3/2012 2:44 PM, Ben Hutchings wrote:
> > On Mon, 2012-09-03 at 09:47 +0200, Giuseppe CAVALLARO wrote:
> >> This patch adds the sysFs support.
> >> Some internal driver parameters can be tuned by using some
> >> entries exposed via sysFS. These parameters currently are,
> >> for example, for internal timers used to mitigate the rx/tx
> >> interrupts or for EEE.
> > [...]
> > 
> > Why are you not exposing these through the standard ethtool operations?
> > 
> > Ben.
> 
> Yes, I want to expose them via ethtool, and I'll do this as soon as it is
> clear to me which ethtool parameters have to be used (
> http://marc.info/?l=linux-netdev&m=134561966226677&w=2 ).

Sorry, I meant to reply to that but didn't get round to it.

> For the reception side, I have the RI Watchdog Timer count field and I
> do not know what is the appropriate ethtool parameter to use.
> From the Synopsys databook, the RI Watchdog Timer count indicates the
> number of system clock cycles. When it runs out, the receive
> interrupt bit is set and the timer is stopped.
> No idea if it can be actually covered, for example, with
> rx_coalesce_usecs_irq.

As I understand it, interrupt moderation time is supposed to be the
minimum time between completion IRQs, not a minimum delay from
completion-with-IRQ-armed to assertion of the IRQ.  The timer should
start running again immediately after the associated IRQ is asserted.
But I don't know whether it's universally implemented this way.

The field names including '_irq' are to be used if the hardware can use
a different moderation time while the IRQ is still masked (i.e. NAPI or
hard interrupt handler is still running).  I think most hardware doesn't
support this.

> For the transmission I have a SW timer that periodically calls the tx
> function (stmmac_tx) and a threshold to also set the "Interrupt on
> completion" bit in the TDES when a frame is transmitted.
> I wonder (but am not sure) whether in this case these could be
> tx_coalesce_usecs and tx_max_coalesced_frames.
> From the kernel documentation IIUC these seem to have other meaning.

The semantics don't seem to match the documentation exactly but I think
this is probably close enough.

Ben.

> No problem extending ethtool to cover these kinds of parameters if
> necessary.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
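
The unit question above (a watchdog count expressed in clock cycles
versus ethtool coalescing values expressed in microseconds) can be
illustrated with a small conversion sketch. It assumes one count equals
one clock cycle, as described in the quoted mail; real hardware may scale
the count, and the function names are hypothetical, not part of any
driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a timer count in clock cycles to whole microseconds.
 * Assumption: one count is exactly one cycle of a clock running
 * at clk_hz. The result is rounded down.
 */
static uint64_t toy_cycles_to_usecs(uint32_t cycles, uint32_t clk_hz)
{
	assert(clk_hz != 0);
	return ((uint64_t)cycles * 1000000u) / clk_hz;
}

/*
 * Convert microseconds to the number of clock cycles to program.
 * The result is rounded down, so very short delays can become 0.
 */
static uint64_t toy_usecs_to_cycles(uint32_t usecs, uint32_t clk_hz)
{
	return ((uint64_t)usecs * clk_hz) / 1000000u;
}
```

For example, with an assumed 250 MHz clock, a request for 100
microseconds maps to 25000 cycles, and reading back 25000 cycles reports
100 microseconds.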

* Re: [PATCH] sctp: Don't charge for data in sndbuf again when transmitting packet
From: Vlad Yasevich @ 2012-09-03 15:02 UTC (permalink / raw)
  To: Thomas Graf
  Cc: linux-sctp@vger.kernel.org, netdev@vger.kernel.org, Vlad Yasevich,
	Neil Horman, David Miller
In-Reply-To: <ed0aceea14e890e55829f42d1594ae9417b8ca25.1346681739.git.tgraf@suug.ch>



On Sep 3, 2012, at 10:27 AM, Thomas Graf <tgraf@suug.ch> wrote:

> SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
> skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
> sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be woken up
> by __sctp_write_space().
>
> Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
> which calls __sctp_write_space() directly.
>
> Buffer space charged via skb_set_owner_w() is released via sock_wfree()
> which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
> sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.
>
> Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
> bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
> interrupted by a signal.
>
> This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...
>
> Charging for the data twice does not make sense in the first place; it
> leads to overcharging sndbuf by a factor of 2. Therefore this patch only
> charges a single byte in wmem_alloc when transmitting an SCTP packet to
> ensure that the socket stays alive until the packet has been released.
>
> This means that control chunks are no longer accounted for in wmem_alloc
> which I believe is not a problem as skb->truesize will typically lead
> to overcharging anyway and thus compensates for any control overhead.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

-vlad

> Signed-off-by: Thomas Graf <tgraf@suug.ch>
> CC: Vlad Yasevich <vyasevic@redhat.com>
> CC: Neil Horman <nhorman@tuxdriver.com>
> CC: David Miller <davem@davemloft.net>
> ---
> net/sctp/output.c | 21 ++++++++++++++++++++-
> 1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 838e18b..be50aa2 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -364,6 +364,25 @@ finish:
>    return retval;
> }
>
> +static void sctp_packet_release_owner(struct sk_buff *skb)
> +{
> +    sk_free(skb->sk);
> +}
> +
> +static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
> +{
> +    skb_orphan(skb);
> +    skb->sk = sk;
> +    skb->destructor = sctp_packet_release_owner;
> +
> +    /*
> +     * The data chunks have already been accounted for in sctp_sendmsg(),
> +     * therefore only reserve a single byte to keep socket around until
> +     * the packet has been transmitted.
> +     */
> +    atomic_inc(&sk->sk_wmem_alloc);
> +}
> +
> /* All packets are sent to the network through this function from
>  * sctp_outq_tail().
>  *
> @@ -405,7 +424,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>    /* Set the owning socket so that we know where to get the
>     * destination IP address.
>     */
> -    skb_set_owner_w(nskb, sk);
> +    sctp_packet_set_owner_w(nskb, sk);
>
>    if (!sctp_transport_dst_check(tp)) {
>        sctp_transport_route(tp, NULL, sctp_sk(sk));
> -- 
> 1.7.11.4
>

* [PATCH] sctp: Don't charge for data in sndbuf again when transmitting packet
From: Thomas Graf @ 2012-09-03 14:27 UTC (permalink / raw)
  To: linux-sctp; +Cc: netdev, Vlad Yasevich, Neil Horman, David Miller

SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be woken up
by __sctp_write_space().

Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
which calls __sctp_write_space() directly.

Buffer space charged via skb_set_owner_w() is released via sock_wfree()
which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.

Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
interrupted by a signal.

This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...

Charging for the data twice does not make sense in the first place; it
leads to overcharging sndbuf by a factor of 2. Therefore this patch only
charges a single byte in wmem_alloc when transmitting an SCTP packet to
ensure that the socket stays alive until the packet has been released.

This means that control chunks are no longer accounted for in wmem_alloc
which I believe is not a problem as skb->truesize will typically lead
to overcharging anyway and thus compensates for any control overhead.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
---
 net/sctp/output.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 838e18b..be50aa2 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -364,6 +364,25 @@ finish:
 	return retval;
 }

+static void sctp_packet_release_owner(struct sk_buff *skb)
+{
+	sk_free(skb->sk);
+}
+
+static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
+{
+	skb_orphan(skb);
+	skb->sk = sk;
+	skb->destructor = sctp_packet_release_owner;
+
+	/*
+	 * The data chunks have already been accounted for in sctp_sendmsg(),
+	 * therefore only reserve a single byte to keep socket around until
+	 * the packet has been transmitted.
+	 */
+	atomic_inc(&sk->sk_wmem_alloc);
+}
+
 /* All packets are sent to the network through this function from
  * sctp_outq_tail().
  *
@@ -405,7 +424,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	/* Set the owning socket so that we know where to get the
 	 * destination IP address.
 	 */
-	skb_set_owner_w(nskb, sk);
+	sctp_packet_set_owner_w(nskb, sk);

 	if (!sctp_transport_dst_check(tp)) {
 		sctp_transport_route(tp, NULL, sctp_sk(sk));
-- 
1.7.11.4
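
The accounting change described in the commit message can be sketched
with a toy counter. This is a simplified model under stated assumptions,
not kernel code: toy_sock and the toy_* helpers are hypothetical, the
charge taken at sendmsg time is reduced to the chunk length, and
"truesize" is just a number supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket's write-memory counter. */
struct toy_sock {
	size_t wmem_alloc;
};

/* Models the charge taken when a data chunk is queued at sendmsg time. */
static void toy_charge_chunk(struct toy_sock *sk, size_t chunk_len)
{
	sk->wmem_alloc += chunk_len;
}

/* Models the pre-patch transmit path: the buffer size is charged again. */
static void toy_charge_packet_old(struct toy_sock *sk, size_t truesize)
{
	sk->wmem_alloc += truesize;
}

/* Models the patched transmit path: one byte, only to pin the socket. */
static void toy_charge_packet_new(struct toy_sock *sk)
{
	sk->wmem_alloc += 1;
}

/* Charge outstanding while one chunk sits in one in-flight packet. */
static size_t toy_peak_charge(size_t chunk_len, size_t truesize, int patched)
{
	struct toy_sock sk = { 0 };

	assert(truesize >= chunk_len);
	toy_charge_chunk(&sk, chunk_len);
	if (patched)
		toy_charge_packet_new(&sk);
	else
		toy_charge_packet_old(&sk, truesize);
	return sk.wmem_alloc;
}
```

For a 1000-byte chunk carried in a buffer with an assumed truesize of
1200, the old scheme holds 2200 bytes against sndbuf while the packet is
in flight, and the patched scheme holds 1001.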

* Re: [net-next.git 5/7] stmmac: add sysFs support
From: Giuseppe CAVALLARO @ 2012-09-03 13:36 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev
In-Reply-To: <1346676267.7787.82.camel@deadeye.wl.decadent.org.uk>

Hello Ben,

On 9/3/2012 2:44 PM, Ben Hutchings wrote:
> On Mon, 2012-09-03 at 09:47 +0200, Giuseppe CAVALLARO wrote:
>> This patch adds the sysFs support.
>> Some internal driver parameters can be tuned by using some
>> entries exposed via sysFS. These parameters currently are,
>> for example, for internal timers used to mitigate the rx/tx
>> interrupts or for EEE.
> [...]
> 
> Why are you not exposing these through the standard ethtool operations?
> 
> Ben.

Yes, I want to expose them via ethtool, and I'll do this as soon as it is
clear to me which ethtool parameters have to be used (
http://marc.info/?l=linux-netdev&m=134561966226677&w=2 ).

For the reception side, I have the RI Watchdog Timer count field and I
do not know what is the appropriate ethtool parameter to use.
From the Synopsys databook, the RI Watchdog Timer count indicates the
number of system clock cycles. When it runs out, the receive
interrupt bit is set and the timer is stopped.
No idea if it can be actually covered, for example, with
rx_coalesce_usecs_irq.

For the transmission I have a SW timer that periodically calls the tx
function (stmmac_tx) and a threshold to also set the "Interrupt on
completion" bit in the TDES when a frame is transmitted.
I wonder (but am not sure) whether in this case these could be
tx_coalesce_usecs and tx_max_coalesced_frames.
From the kernel documentation IIUC these seem to have other meaning.

No problem extending ethtool to cover these kinds of parameters if
necessary.

Advice is welcome,
Peppe

* Re: [net-next.git 5/7] stmmac: add sysFs support
From: Ben Hutchings @ 2012-09-03 12:44 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev
In-Reply-To: <1346658422-1925-6-git-send-email-peppe.cavallaro@st.com>

On Mon, 2012-09-03 at 09:47 +0200, Giuseppe CAVALLARO wrote:
> This patch adds the sysFs support.
> Some internal driver parameters can be tuned by using some
> entries exposed via sysFS. These parameters currently are,
> for example, for internal timers used to mitigate the rx/tx
> interrupts or for EEE.
[...]

Why are you not exposing these through the standard ethtool operations?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

* Re: [PATCH RFC 0/2] Interface for TCP Metrics
From: Renato Westphal @ 2012-09-03 12:11 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Stephen Hemminger, netdev
In-Reply-To: <alpine.LFD.2.00.1208231908380.28424@ja.ssi.bg>

>         BTW, I noticed that the GENL part in iproute is
> not separated in common place. Should we create common
> place for functions like genl_ctrl_resolve_family and
> genl_parse_getfamily ?

I think it would be nice to extend libnetlink or create something like
'libgenetlink' with these GENL facilities you created. The tendency is
to increase the number of genetlink users over time, so it seems to
make sense to provide some basic infrastructure for these new users.
Every genetlink application needs to resolve family names and possibly
multicast group names, so I think putting these basic facilities in a
common place will help to reduce code duplication in the future.

-- 
Renato Westphal

* [PATCHv2] virtio-spec: virtio network device multiqueue support
From: Michael S. Tsirkin @ 2012-09-03 11:55 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, kvm, virtualization

At Jason's request, I am trying to help finalize the spec for
the new multiqueue feature.

Changes from Jason's rfc:
- reserved vq 3: this makes all rx vqs even and tx vqs odd, which
  looks nicer to me.
- documented packet steering, added a generalized steering programming
  command. Current modes are single queue and host driven multiqueue,
  but I envision support for guest driven multiqueue in the future.
- make default vqs unused when in mq mode - this wastes some memory
  but makes it more efficient to switch between modes as
  we can avoid this causing packet reordering.

Rusty, could you please take a look and comment?
If this looks OK to everyone, we can proceed with finalizing the
implementation.  This patch is against
eb9fc84d0d3c46438aaab190e2401a9e5409a052 in virtio-spec git tree.
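
The numbering in the change list (vq 3 reserved, so that multiqueue rx
vqs are even and tx vqs are odd) can be sketched as index helpers. This
is only an illustration of the layout proposed in this patch, with
receiveqN at 2N+2 and transmitqN at 2N+3; the toy_* names are
hypothetical and appear in neither the spec nor any driver.

```c
#include <assert.h>

/* Fixed indices: 0..2 as in the existing layout, 3 reserved by this patch. */
enum {
	TOY_VQ_RECEIVEQ  = 0,	/* default receive queue */
	TOY_VQ_TRANSMITQ = 1,	/* default transmit queue */
	TOY_VQ_CONTROLQ  = 2,	/* control queue, if negotiated */
	TOY_VQ_RESERVED  = 3	/* unused, keeps the pairs even/odd aligned */
};

/* Index of receiveqN for a multiqueue pair number n >= 1. */
static unsigned int toy_receiveq_index(unsigned int n)
{
	assert(n >= 1);
	return 2 * n + 2;
}

/* Index of transmitqN for a multiqueue pair number n >= 1. */
static unsigned int toy_transmitq_index(unsigned int n)
{
	assert(n >= 1);
	return 2 * n + 3;
}

/* Pair number that a multiqueue virtqueue index (>= 4) belongs to. */
static unsigned int toy_pair_of_index(unsigned int idx)
{
	assert(idx >= 4);
	return (idx - 2) / 2;
}
```

Under this layout a driver that sees a flow on receiveq2 (index 6) would
pick transmitq2 (index 7), since both indices map to pair 2.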

-->

virtio-spec: virtio network device multiqueue support

Add multiqueue support to virtio network device.
Add a new feature flag VIRTIO_NET_F_MULTIQUEUE for this feature, a new
configuration field max_virtqueue_pairs to detect supported number of
virtqueues as well as a new command VIRTIO_NET_CTRL_STEERING to program
packet steering.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

--

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 7a073f4..583debc 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -58,6 +58,7 @@
 \html_be_strict false
 \author -608949062 "Rusty Russell,,," 
 \author 1531152142 "Paolo Bonzini,,," 
+\author 1986246365 "Michael S. Tsirkin" 
 \end_header
 
 \begin_body
@@ -3896,6 +3897,37 @@ Only if VIRTIO_NET_F_CTRL_VQ set
 \end_inset
 
 
+\change_inserted 1986246365 1346663522
+ 3: reserved
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346663550
+4: receiveq1.
+ 5: transmitq1.
+ 6: receiveq2.
+ 7.
+ transmitq2.
+ ...
+ 2N+2:receiveqN, 2N+3:transmitqN
+\begin_inset Foot
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346663558
+Only if VIRTIO_NET_F_CTRL_VQ set.
+ N is indicated by max_virtqueue_pairs field.
+\change_unchanged
+
+\end_layout
+
+\end_inset
+
+
+\change_unchanged
+
 \end_layout
 
 \begin_layout Description
@@ -4056,6 +4088,17 @@ VIRTIO_NET_F_CTRL_VLAN
 
 \begin_layout Description
 VIRTIO_NET_F_GUEST_ANNOUNCE(21) Guest can send gratuitous packets.
+\change_inserted 1986246365 1346617842
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1986246365 1346618103
+VIRTIO_NET_F_MULTIQUEUE(22) Device has multiple receive and transmission
+ queues.
+\change_unchanged
+
 \end_layout
 
 \end_deeper
@@ -4068,11 +4111,45 @@ configuration
 \begin_inset space ~
 \end_inset
 
-layout Two configuration fields are currently defined.
+layout 
+\change_deleted 1986246365 1346671560
+Two
+\change_inserted 1986246365 1346671647
+Six
+\change_unchanged
+ configuration fields are currently defined.
  The mac address field always exists (though is only valid if VIRTIO_NET_F_MAC
  is set), and the status field only exists if VIRTIO_NET_F_STATUS is set.
  Two read-only bits are currently defined for the status field: VIRTIO_NET_S_LIN
 K_UP and VIRTIO_NET_S_ANNOUNCE.
+
+\change_inserted 1986246365 1346672138
+ The following four read-only fields only exist if VIRTIO_NET_F_MULTIQUEUE
+ is set.
+ The max_virtqueue_pairs field specifies the maximum number of each of transmit
+ and receive virtqueues that can be used for multiqueue operation.
+ The following read-only fields: 
+\emph on
+current_steering_rule
+\emph default
+, 
+\emph on
+reserved
+\emph default
+ and 
+\emph on
+current_steering_param
+\emph default
+ store the last successful VIRTIO_NET_CTRL_STEERING
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Transmit-Packet-Steering"
+
+\end_inset
+
+ command executed by driver, for debugging.
+
+\change_unchanged
  
 \begin_inset listings
 inline false
@@ -4105,6 +4182,40 @@ struct virtio_net_config {
 \begin_layout Plain Layout
 
     u16 status;
+\change_inserted 1986246365 1346671221
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671532
+
+    u16 max_virtqueue_pairs;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671531
+
+    u8 current_steering_rule;
+\change_unchanged
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671499
+
+    u8 reserved;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671530
+
+    u16 current_steering_param;
+\change_unchanged
+
 \end_layout
 
 \begin_layout Plain Layout
@@ -4151,6 +4262,18 @@ physical
 \begin_layout Enumerate
 If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify the control
  virtqueue.
+\change_inserted 1986246365 1346618052
+
+\end_layout
+
+\begin_layout Enumerate
+
+\change_inserted 1986246365 1346618175
+If VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, identify the receive
+ and transmission queues that are going to be used in multiqueue mode.
+ Only queues that are going to be used need to be initialized.
+\change_unchanged
+
 \end_layout
 
 \begin_layout Enumerate
@@ -4168,7 +4291,11 @@ status
 \end_layout
 
 \begin_layout Enumerate
-The receive virtqueue should be filled with receive buffers.
+The receive virtqueue
+\change_inserted 1986246365 1346618180
+(s)
+\change_unchanged
+ should be filled with receive buffers.
  This is described in detail below in 
 \begin_inset Quotes eld
 \end_inset
@@ -4516,6 +4643,201 @@ Note that the header will be two bytes longer for the VIRTIO_NET_F_MRG_RXBUF
 \end_layout
 
 \begin_layout Subsection*
+
+\change_inserted 1986246365 1346670975
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Transmit-Packet-Steering"
+
+\end_inset
+
+Transmit Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670592
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, guest can use any
+ of multiple configured transmit queues to transmit a given packet.
+ To avoid packet reordering by device (which generally leads to performance
+ degradation) driver should attempt to utilize the same transmit virtqueue
+ for all packets of a given transmit flow.
+ For bi-directional protocols (in practice, TCP), a given network connection
+ can utilize both transmit and receive queues.
+ For best performance, packets from a single connection should utilize the
+ paired transmit and receive queues from the same virtqueue pair; for example
+ both transmitqN and receiveqN.
+ This rule makes it possible to optimize processing on the device side,
+ but this is not a hard requirement: devices should function correctly even
+ when this rule is not followed.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670727
+Driver selects an active steering rule using VIRTIO_NET_CTRL_STEERING command
+ (this controls both which virtqueue is selected for a given packet for
+ receive and notifies the device which virtqueues are about to be used for
+ transmit).
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670594
+This command accepts a single out argument in the following format:
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670594
+\begin_inset listings
+inline false
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+#define VIRTIO_NET_CTRL_STEERING       4
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+struct virtio_net_ctrl_steering {
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671425
+
+	u8 current_steering_rule;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+    u8 reserved;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346671485
+
+	u16 current_steering_param;
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+};
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+#define VIRTIO_NET_CTRL_STEERING_SINGLE       0
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1986246365 1346670594
+
+#define VIRTIO_NET_CTRL_STEERING_HOST  1
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671803
+The field 
+\emph on
+rule
+\emph default
+ specifies the function used to select transmit virtqueue for a given packet;
+ the field 
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the
+ default virtqueue transmitq (1); param is unused; this is the default.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_HOST packets are steered by driver to
+ the first (
+\emph on
+param
+\emph default
++1) multiqueue virtqueues transmitq1...transmitqN; the default transmitq is
+ unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+ For best performance, driver should detect flow to virtqueue pair mapping
+ on receive and select the transmit virtqueue from the same virtqueue pair.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670762
+Supported steering rules can be added and removed in the future.
+ Driver should probe for supported rules by checking ack values of the command.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671315
+When steering rule is modified, some packets can still be outstanding in
+ one or more of the virtqueues.
+ For transmit, device is recommended to complete processing of the transmit
+ queue(s) utilized by the original steering before processing any packets
+ delivered by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671412
+For debugging, current steering rule can also be read from the configuration
+ space.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346670762
+See also receive steering description
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Receive-Packet-Steering"
+
+\end_inset
+
+.
+\end_layout
+
+\begin_layout Subsection*
 Packet Transmission Interrupt
 \end_layout
 
@@ -4988,8 +5310,24 @@ status open
 The Guest needs to check VIRTIO_NET_S_ANNOUNCE bit in status field when
  it notices the changes of device configuration.
  The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that driver
- has recevied the notification and device would clear the VIRTIO_NET_S_ANNOUNCE
- bit in the status filed after it received this command.
+ has rece
+\change_inserted 1986246365 1346663932
+i
+\change_unchanged
+v
+\change_deleted 1986246365 1346663934
+i
+\change_unchanged
+ed the notification and device would clear the VIRTIO_NET_S_ANNOUNCE bit
+ in the status fi
+\change_inserted 1986246365 1346663942
+e
+\change_unchanged
+l
+\change_deleted 1986246365 1346663943
+e
+\change_unchanged
+d after it received this command.
 \end_layout
 
 \begin_layout Standard
@@ -5004,10 +5342,101 @@ Sending the gratuitous packets or marking there are pending gratuitous packets
 \begin_layout Enumerate
 Sending VIRTIO_NET_CTRL_ANNOUNCE_ACK command through control vq.
  
+\change_deleted 1986246365 1346662247
+
 \end_layout
 
-\begin_layout Enumerate
+\begin_layout Subsection*
+
+\change_inserted 1986246365 1346670357
+\begin_inset CommandInset label
+LatexCommand label
+name "sub:Receive-Packet-Steering"
+
+\end_inset
+
+Receive Packet Steering
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671046
+When VIRTIO_NET_F_MULTIQUEUE feature bit is negotiated, device can use any
+ of multiple configured receive queues to pass a given packet to driver.
+ Driver controls which virtqueue is selected in practice by configuring
+ packet steering rule using VIRTIO_NET_CTRL_STEERING command, as described
+ above
+\begin_inset CommandInset ref
+LatexCommand ref
+reference "sub:Transmit-Packet-Steering"
+
+\end_inset
+
 .
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671818
+The field 
+\emph on
+rule
+\emph default
+ specifies the function used to select receive virtqueue for a given packet;
+ the field 
+\emph on
+param
+\emph default
+ makes it possible to pass an extra parameter if appropriate.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_SINGLE all packets are steered to the
+ default virtqueue receiveq (0); param is unused; this is the default.
+ When 
+\emph on
+rule
+\emph default
+ is set to VIRTIO_NET_CTRL_STEERING_HOST packets are steered by host using
+ a device-specific steering function to the first (
+\emph on
+param
+\emph default
++1) multiqueue virtqueues receiveq1...receiveqN; the default receiveq is unused.
+ Driver must have configured all these (
+\emph on
+param
+\emph default
++1) virtqueues beforehand.
+ For best performance, driver is expected to detect flow to virtqueue pair
+ mapping on receive and select the transmit virtqueue from the same virtqueue
+ pair.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346669564
+Supported steering rules can be added and removed in the future.
+ Driver should probe for supported rules by checking ack values of the command.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1986246365 1346671151
+When steering rule is modified, some packets can still be outstanding in
+ one or more of the virtqueues.
+ For receive, driver is recommended to complete processing of the receive
+ queue(s) utilized by the original steering before processing any packets
+ delivered by the modified steering rule.
+\end_layout
+
+\begin_layout Standard
+
+\change_deleted 1986246365 1346664095
+.
+
+\change_unchanged
  
 \end_layout

^ permalink raw reply related

* Re: [PATCH v2] netfilter: take care of timewait sockets
From: Eric Dumazet @ 2012-09-03  9:57 UTC (permalink / raw)
  To: Florian Westphal, David Miller; +Cc: Sami Farin, netdev, e1000-devel
In-Reply-To: <20120903074718.GA14750@breakpoint.cc>

From: Eric Dumazet <edumazet@google.com>

On Mon, 2012-09-03 at 09:47 +0200, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a
> > full blown socket.
> > 
> > But with TCP early demux, we can have skb->sk pointing to a timewait
> > socket.
> > 
> > Same fix is needed in netfnetlink_log
> 
> Looks good, but IMHO it is very un-intuitive that
> skb->sk might be a pointer to an object that is not struct sock (or
> a compatible object).

It's kind of a compatible object, if all skb->sk users are aware of it.

You are totally right, this is messy, but TCP edemux is a layering
violation that helps performance a bit...

sock_edemux() should also be fixed.

David, tell me if you prefer to change TCP demux to avoid timewait,
as I have no strong opinion.
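
To illustrate the sock_edemux() point, here is a minimal userspace sketch,
not kernel code. All names (model_common, model_full_sock, model_tw_sock,
model_edemux, ...) are hypothetical stand-ins, and the only assumption is
that both object kinds begin with the same header carrying the state and
the refcount. The destructor looks at the shared state before choosing the
release path:

```c
#include <assert.h>

/* Userspace model only: hypothetical stand-ins, not the kernel's
 * struct sock or struct inet_timewait_sock. */
enum model_state { MODEL_ESTABLISHED = 1, MODEL_TIME_WAIT = 6 };

/* Fields shared by both object kinds; must be the first member. */
struct model_common {
	enum model_state state;
	int refcnt;
};

struct model_full_sock {
	struct model_common common;
	int full_puts;	/* releases that went through the full-socket path */
};

struct model_tw_sock {
	struct model_common common;
	int tw_puts;	/* releases that went through the timewait path */
};

static void model_sock_put(struct model_full_sock *sk)
{
	assert(sk->common.refcnt > 0);
	sk->common.refcnt--;
	sk->full_puts++;
}

static void model_twsk_put(struct model_tw_sock *tw)
{
	assert(tw->common.refcnt > 0);
	tw->common.refcnt--;
	tw->tw_puts++;
}

/* Destructor: check the shared state, then pick the matching release. */
static void model_edemux(struct model_common *obj)
{
	if (obj->state == MODEL_TIME_WAIT)
		model_twsk_put((struct model_tw_sock *)obj);
	else
		model_sock_put((struct model_full_sock *)obj);
}
```

Releasing a timewait object through the full-socket path would touch fields
that do not exist in the smaller object, which is why the check comes first.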

[PATCH] net: sock_edemux() should take care of timewait sockets

sock_edemux() can handle either a regular socket or a timewait socket

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/sock.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 8f67ced..7f64467 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1523,7 +1523,12 @@ EXPORT_SYMBOL(sock_rfree);
 
 void sock_edemux(struct sk_buff *skb)
 {
-	sock_put(skb->sk);
+	struct sock *sk = skb->sk;
+
+	if (sk->sk_state == TCP_TIME_WAIT)
+		inet_twsk_put(inet_twsk(sk));
+	else
+		sock_put(sk);
 }
 EXPORT_SYMBOL(sock_edemux);
 

^ permalink raw reply related

* [PATCH net-next] net: cx82310_eth: use common match macro
From: Bjørn Mork @ 2012-09-03  9:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, Bjørn Mork

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/net/usb/cx82310_eth.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/usb/cx82310_eth.c b/drivers/net/usb/cx82310_eth.c
index 49ab45e..1e207f0 100644
--- a/drivers/net/usb/cx82310_eth.c
+++ b/drivers/net/usb/cx82310_eth.c
@@ -302,18 +302,9 @@ static const struct driver_info	cx82310_info = {
 	.tx_fixup	= cx82310_tx_fixup,
 };
 
-#define USB_DEVICE_CLASS(vend, prod, cl, sc, pr) \
-	.match_flags = USB_DEVICE_ID_MATCH_DEVICE | \
-		       USB_DEVICE_ID_MATCH_DEV_INFO, \
-	.idVendor = (vend), \
-	.idProduct = (prod), \
-	.bDeviceClass = (cl), \
-	.bDeviceSubClass = (sc), \
-	.bDeviceProtocol = (pr)
-
 static const struct usb_device_id products[] = {
 	{
-		USB_DEVICE_CLASS(0x0572, 0xcb01, 0xff, 0, 0),
+		USB_DEVICE_AND_INTERFACE_INFO(0x0572, 0xcb01, 0xff, 0, 0),
 		.driver_info = (unsigned long) &cx82310_info
 	},
 	{ },
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 1/2] net: sierra_net: make private symbols static
From: Bjørn Mork @ 2012-09-03  9:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, Bjørn Mork
In-Reply-To: <1346664033-30284-1-git-send-email-bjorn@mork.no>

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/net/usb/sierra_net.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index 7be49ea..596ddaa 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -560,7 +560,7 @@ static void sierra_net_defer_kevent(struct usbnet *dev, int work)
 /*
  * Sync Retransmit Timer Handler. On expiry, kick the work queue
  */
-void sierra_sync_timer(unsigned long syncdata)
+static void sierra_sync_timer(unsigned long syncdata)
 {
 	struct usbnet *dev = (struct usbnet *)syncdata;
 
@@ -866,8 +866,8 @@ static int sierra_net_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 }
 
 /* ---------------------------- Transmit data path ----------------------*/
-struct sk_buff *sierra_net_tx_fixup(struct usbnet *dev, struct sk_buff *skb,
-		gfp_t flags)
+static struct sk_buff *sierra_net_tx_fixup(struct usbnet *dev,
+					   struct sk_buff *skb, gfp_t flags)
 {
 	struct sierra_net_data *priv = sierra_net_get_private(dev);
 	u16 len;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 2/2] net: sierra_net: rx_urb_size is constant
From: Bjørn Mork @ 2012-09-03  9:20 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, Bjørn Mork
In-Reply-To: <1346664033-30284-1-git-send-email-bjorn-yOkvZcmFvRU@public.gmane.org>

The rx_urb_size is set to the same value for every device
supported by this driver.  No need to keep a per-device
data structure to do that. Replacing with a macro constant.

This was the last device specific info, and removing it
allows us to delete the sierra_net_info_data struct.

Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
---
 drivers/net/usb/sierra_net.c |   17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index 596ddaa..7ae70e9 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -68,9 +68,8 @@ static	atomic_t iface_counter = ATOMIC_INIT(0);
  */
 #define SIERRA_NET_USBCTL_BUF_LEN	1024
 
-struct sierra_net_info_data {
-	u16 rx_urb_size;
-};
+/* Overriding the default usbnet rx_urb_size */
+#define SIERRA_NET_RX_URB_SIZE		(8 * 1024)
 
 /* Private data structure */
 struct sierra_net_data {
@@ -678,9 +677,6 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf)
 	static const u8 shdwn_tmplate[sizeof(priv->shdwn_msg)] = {
 		0x00, 0x00, SIERRA_NET_HIP_SHUTD_ID, 0x00};
 
-	struct sierra_net_info_data *data =
-			(struct sierra_net_info_data *)dev->driver_info->data;
-
 	dev_dbg(&dev->udev->dev, "%s", __func__);
 
 	ifacenum = intf->cur_altsetting->desc.bInterfaceNumber;
@@ -725,9 +721,9 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf)
 	sierra_net_set_ctx_index(priv, 0);
 
 	/* decrease the rx_urb_size and max_tx_size to 4k on USB 1.1 */
-	dev->rx_urb_size  = data->rx_urb_size;
+	dev->rx_urb_size  = SIERRA_NET_RX_URB_SIZE;
 	if (dev->udev->speed != USB_SPEED_HIGH)
-		dev->rx_urb_size  = min_t(size_t, 4096, data->rx_urb_size);
+		dev->rx_urb_size  = min_t(size_t, 4096, SIERRA_NET_RX_URB_SIZE);
 
 	dev->net->hard_header_len += SIERRA_NET_HIP_EXT_HDR_LEN;
 	dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;
@@ -918,10 +914,6 @@ static struct sk_buff *sierra_net_tx_fixup(struct usbnet *dev,
 	return NULL;
 }
 
-static const struct sierra_net_info_data sierra_net_info_data_direct_ip = {
-	.rx_urb_size = 8 * 1024,
-};

^ permalink raw reply related

* Re: [PATCH] net/can:  rename peak_usb dump_mem function
From: Marc Kleine-Budde @ 2012-09-03  9:02 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: netdev, Geert Uytterhoeven, linux-kernel, David Miller,
	Stephane Grosjean, Wolfgang Grandegger, linux-can
In-Reply-To: <504393A7.8040007@xenotime.net>

[-- Attachment #1: Type: text/plain, Size: 900 bytes --]

On 09/02/2012 07:13 PM, Randy Dunlap wrote:
> From: Randy Dunlap <rdunlap@xenotime.net>
> 
> Rename generic-sounding function dump_mem() to pcan_dump_mem()
> so that it does not conflict with the dump_mem() function in
> arch/sh/include/asm/kdebug.h.
> 
> drivers/net/can/usb/peak_usb/pcan_usb_core.c: error: conflicting types for 'dump_mem':  => 56:6
> drivers/net/can/usb/peak_usb/pcan_usb_core.h: error: conflicting types for 'dump_mem':  => 134:6
> 
> Not tested.

:) I've converted the users of peak's dump_mem() function, too. Now it
compiles. Should this go into v3.6, or is v3.7 early enough?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply

* Re: Question: routing packets via specific router in LAN?
From: Cong Wang @ 2012-09-03  8:28 UTC (permalink / raw)
  To: netdev
In-Reply-To: <50444880.8080703@gmail.com>

On Mon, 03 Sep 2012 at 06:04 GMT, Yi Li <lovelylich@gmail.com> wrote:
> Hi All,
> I have server --- router ---client three machines,
> and they all have only one ip in the same LAN.
> I want to instruct the packets flowing through the router when the
> server and client communicates.
> I have do the following things to setup:
> on the server:
> # ip route add to unicast CLIENT_IP/32 via ROUTER_IP dev eth0
> # echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects
> # echo 0 > /proc/sys/net/ipv4/conf/eth0/accept_redirects
>
> on the client:
> /*modify route table*/
> # ip route add to unicast SERVER_IP/32 via ROUTER_IP dev eth0
> /*disable icmp-redirects accept*/
> # echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects
> # echo 0 > /proc/sys/net/ipv4/conf/eth0/accept_redirects
>
> on the router:
> /*enable forwarding*/
> # echo 1 > /proc/sys/net/ipv4/ip_forwarding
> /*disable icmp-redirects*/
> # echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
> # echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
>

Try to add some iptables rules like:

iptables -A FORWARD -j ACCEPT -s CLIENT_IP/xx -d  SERVER_IP/xx

^ permalink raw reply

* [PATCH net] net: usbnet: fix softirq storm on suspend
From: Bjørn Mork @ 2012-09-03  8:26 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, Bjørn Mork, Ming Lei,
	Oliver Neukum

Suspending an open usbnet device results in constant
rescheduling of usbnet_bh.

commit 65841fd5 "usbnet: handle remote wakeup asap"
refactored the usbnet_bh code to allow sharing the
urb allocate and submit code with usbnet_resume. In
this process, a test for, and immediate return on,
ENOLINK from rx_submit was unintentionally dropped.

The rx queue will not grow if rx_submit fails,
making usbnet_bh reschedule itself.  This results
in a softirq storm if the error is persistent.
rx_submit translates the usb_submit_urb error
EHOSTUNREACH into ENOLINK, so this is an expected
and persistent error for a suspended device. The
old code tested for this condition and avoided
rescheduling.  Putting this test back.

Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # v3.5
Cc: Ming Lei <ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Oliver Neukum <oneukum-l3A5Bk7waGM@public.gmane.org>
Signed-off-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
---
Sorry for not noticing this before, but commit 65841fd5
makes usbnet autosuspend completely unusable.  The device
is suspended fine, but burning one CPU core at full load
uses a tiny bit more power making the power saving 
negative...

I hope this can go into 3.6 and 3.5-stable ASAP. It is
a hard-to-notice regression, but a serious one all the
same.
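
For readers who want the control flow without the driver around it, here is
a small userspace model of the behaviour described above. It is only a
sketch: model_dev, model_rx_submit, model_rx_alloc_submit, model_bh and
MODEL_RX_QLEN are hypothetical names, and the "resched" counter stands in
for the bottom half scheduling itself again when the queue is still short.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_RX_QLEN 4	/* hypothetical target queue depth */

struct model_dev {
	int qlen;	/* URBs currently queued */
	int suspended;	/* nonzero: every submit fails with -ENOLINK */
	int resched;	/* times the bottom half asked to run again */
};

/* Stand-in for rx_submit(): fails persistently while suspended. */
static int model_rx_submit(struct model_dev *dev)
{
	if (dev->suspended)
		return -ENOLINK;
	dev->qlen++;
	return 0;
}

/* Refill, but stop at the first error and report it to the caller. */
static int model_rx_alloc_submit(struct model_dev *dev)
{
	int i, ret = 0;

	for (i = 0; i < 10 && dev->qlen < MODEL_RX_QLEN; i++) {
		ret = model_rx_submit(dev);
		if (ret)
			break;
	}
	return ret;
}

/* Bottom half: do not reschedule when the error cannot clear by itself. */
static void model_bh(struct model_dev *dev)
{
	if (dev->qlen < MODEL_RX_QLEN) {
		if (model_rx_alloc_submit(dev) == -ENOLINK)
			return;
		if (dev->qlen < MODEL_RX_QLEN)
			dev->resched++;
	}
}
```

Without the early return, a suspended device would leave the queue short on
every pass and the model would bump "resched" each time, which is the storm.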


Thanks,
Bjørn


 drivers/net/usb/usbnet.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index fd4b26d..fc9f578 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1201,19 +1201,26 @@ deferred:
 }
 EXPORT_SYMBOL_GPL(usbnet_start_xmit);
 
-static void rx_alloc_submit(struct usbnet *dev, gfp_t flags)
+static int rx_alloc_submit(struct usbnet *dev, gfp_t flags)
 {
 	struct urb	*urb;
 	int		i;
+	int		ret = 0;
 
 	/* don't refill the queue all at once */
 	for (i = 0; i < 10 && dev->rxq.qlen < RX_QLEN(dev); i++) {
 		urb = usb_alloc_urb(0, flags);
 		if (urb != NULL) {
-			if (rx_submit(dev, urb, flags) == -ENOLINK)
-				return;
+			ret = rx_submit(dev, urb, flags);
+			if (ret)
+				goto err;
+		} else {
+			ret = -ENOMEM;
+			goto err;
 		}
 	}
+err:
+	return ret;
 }
 
 /*-------------------------------------------------------------------------*/
@@ -1257,7 +1264,8 @@ static void usbnet_bh (unsigned long param)
 		int	temp = dev->rxq.qlen;
 
 		if (temp < RX_QLEN(dev)) {
-			rx_alloc_submit(dev, GFP_ATOMIC);
+			if (rx_alloc_submit(dev, GFP_ATOMIC) == -ENOLINK)
+				return;
 			if (temp != dev->rxq.qlen)
 				netif_dbg(dev, link, dev->net,
 					  "rxqlen %d --> %d\n",
-- 
1.7.10.4


^ permalink raw reply related

* Re: [PATCH v2 1/2] tcp: add generic netlink support for tcp_metrics
From: Julian Anastasov @ 2012-09-03  8:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Stephen Hemminger, Paul E. McKenney
In-Reply-To: <1346633144.2563.97.camel@edumazet-glaptop>


	Hello,

On Mon, 3 Sep 2012, Eric Dumazet wrote:

> On Sun, 2012-09-02 at 08:36 +0300, Julian Anastasov wrote:
> > +
> > +static int tcp_metrics_flush_all(struct net *net)
> > +{
> > +	unsigned int max_rows = 1U << net->ipv4.tcp_metrics_hash_log;
> > +	struct tcpm_hash_bucket *hb = net->ipv4.tcp_metrics_hash;
> > +	struct tcp_metrics_block *tm;
> > +	unsigned int sync_count = 0;
> > +	unsigned int row;
> > +
> > +	for (row = 0; row < max_rows; row++, hb++) {
> > +		spin_lock_bh(&tcp_metrics_lock);
> > +		tm = deref_locked_genl(hb->chain);
> > +		if (tm)
> > +			hb->chain = NULL;
> > +		spin_unlock_bh(&tcp_metrics_lock);
> > +		while (tm) {
> > +			struct tcp_metrics_block *next;
> > +
> > +			next = deref_genl(tm->tcpm_next);
> > +			kfree_rcu(tm, rcu_head);
> > +			if (!((++sync_count) & 2047))
> > +				synchronize_rcu();
> > +			tm = next;
> > +		}
> > +	}
> > +	return 0;
> > +}
> 
> It looks like the synchronize_rcu() call is not exactly what you wanted,
> but then net/ipv4/fib_trie.c has the same mistake.

	I used fib_trie as reference...

> What we want here is to force pending call_rcu() calls to complete, so
> that we dont consume too much memory. So it would probably better to
> call rcu_barrier() instead.

	I see

> If other cpus are idle or outside of rcu read lock sections,
> synchronize_rcu() should basically do nothing at all.
> 
> But I am not sure its worth the trouble ?
> 
> Commit c3059477fce2d956a0bb3e04357324780c5d8eeb (ipv4: Use
> synchronize_rcu() during trie_rebalance()) was needed because FIB TRIE
> can really use huge amounts of memory, thats hardly the case with
> tcp_metrics.

	I was worried about the case where
(TCP_METRICS_RECLAIM_DEPTH + 1) * tcpmhash_entries is
large, e.g. if some non-default value is configured. Maybe the
chance of the table being filled immediately is small. I'll remove it.
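
	To put rough numbers on that worry, here is a small standalone
sketch. The function names are made up for illustration, and the reclaim
depth, table size and per-entry size are parameters precisely because the
real values are not stated in this thread:

```c
#include <stddef.h>

/* Upper bound on entries, using the product quoted above:
 * (reclaim_depth + 1) * hash_entries. */
static size_t model_max_entries(size_t hash_entries, size_t reclaim_depth)
{
	return (reclaim_depth + 1) * hash_entries;
}

/* Same bound in bytes, for a caller-supplied per-entry size. */
static size_t model_max_bytes(size_t hash_entries, size_t reclaim_depth,
			      size_t block_size)
{
	return model_max_entries(hash_entries, reclaim_depth) * block_size;
}

/* How often the "!((++sync_count) & 2047)" test from the quoted loop
 * fires while freeing n entries: once per 2048 entries. */
static unsigned int model_sync_points(unsigned int n)
{
	unsigned int sync_count = 0, fired = 0, i;

	for (i = 0; i < n; i++)
		if (!((++sync_count) & 2047))
			fired++;
	return fired;
}
```

As an example with hypothetical inputs, 16384 buckets with a depth of 5 and
128-byte entries give 98304 entries, or 12 MiB, at the very worst.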

	BTW, is it appropriate to use kmem_cache for
metrics and, as a result, call_rcu for freeing?

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* [net-next.git 3/7] stmmac: add the initial tx coalesce schema
From: Giuseppe CAVALLARO @ 2012-09-03  7:46 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>

This patch adds a new schema used for mitigating the
number of transmit interrupts.
It is based on a SW timer and a threshold value.
The timer periodically calls the stmmac_tx function,
which the ISR can also invoke, but only for descriptors
that have the interrupt on completion field set. The
threshold controls how often that field is set.

Next step is to add the ability to tune these coalesce
values by ethtool.

For now I have set defaults that showed a real gain
on all the ARM/SH4 platforms where I ran benchmarks.
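
The threshold part can be sketched in plain C as below. This is a model of
the counting logic only, assuming the behaviour of the xmit hunk in this
patch; model_tx and model_xmit are hypothetical names, and the timer is
reduced to a counter of how many times it would have been re-armed.

```c
#include <assert.h>

struct model_tx {
	unsigned int coal_frames;	/* threshold; 0 disables coalescing */
	unsigned int count_frames;	/* frames since the last IC frame */
	unsigned int ic_set;		/* frames sent with IC left set */
	unsigned int ic_cleared;	/* frames sent with IC cleared */
	unsigned int timer_armed;	/* times the SW timer was re-armed */
};

/* One transmitted frame: clear IC until the threshold is reached. */
static void model_xmit(struct model_tx *tx)
{
	tx->count_frames++;
	if (tx->coal_frames > tx->count_frames) {
		tx->ic_cleared++;
		tx->timer_armed++;	/* timer reclaims if no IC frame follows */
	} else {
		tx->count_frames = 0;
		tx->ic_set++;
	}
}
```

With a threshold of 64, only every 64th frame raises an interrupt on
completion; the timer covers the frames in between when traffic stops.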

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/common.h       |    8 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    4 +
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |    9 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   86 +++++++++++++-------
 4 files changed, 72 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index bd32fe6..1d6bd3e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -95,11 +95,13 @@ struct stmmac_extra_stats {
 	unsigned long threshold;
 	unsigned long tx_pkt_n;
 	unsigned long rx_pkt_n;
-	unsigned long rx_napi_poll;
+	unsigned long normal_irq_n;
 	unsigned long rx_normal_irq_n;
+	unsigned long rx_napi_poll;
 	unsigned long tx_normal_irq_n;
-	unsigned long sched_timer_n;
-	unsigned long normal_irq_n;
+	unsigned long txtimer;
+	unsigned long tx_clean;
+	unsigned long tx_reset_ic_bit;
 	unsigned long mmc_tx_irq_n;
 	unsigned long mmc_rx_irq_n;
 	unsigned long mmc_rx_csum_offload_irq_n;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 9f35769..0f5ab28 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -88,6 +88,10 @@ struct stmmac_priv {
 	int eee_enabled;
 	int eee_active;
 	int tx_lpi_timer;
+	struct timer_list txtimer;
+	u32 tx_count_frames;
+	u32 tx_coal_frames;
+	u32 tx_coal_timer;
 };
 
 extern int phyaddr;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 505fe71..48ad0bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -90,12 +90,13 @@ static const struct stmmac_stats stmmac_gstrings_stats[] = {
 	STMMAC_STAT(threshold),
 	STMMAC_STAT(tx_pkt_n),
 	STMMAC_STAT(rx_pkt_n),
-	STMMAC_STAT(rx_napi_poll),
+	STMMAC_STAT(normal_irq_n),
 	STMMAC_STAT(rx_normal_irq_n),
+	STMMAC_STAT(rx_napi_poll),
 	STMMAC_STAT(tx_normal_irq_n),
-	STMMAC_STAT(sched_timer_n),
-	STMMAC_STAT(normal_irq_n),
-	STMMAC_STAT(normal_irq_n),
+	STMMAC_STAT(txtimer),
+	STMMAC_STAT(tx_clean),
+	STMMAC_STAT(tx_reset_ic_bit),
 	STMMAC_STAT(mmc_tx_irq_n),
 	STMMAC_STAT(mmc_rx_irq_n),
 	STMMAC_STAT(mmc_rx_csum_offload_irq_n),
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b247c39..d7f5482 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -77,6 +77,8 @@
 
 #define STMMAC_ALIGN(x)	L1_CACHE_ALIGN(x)
 #define JUMBO_LEN	9000
+#define	STMMAC_TX_TM	40000
+#define STMMAC_TX_MAX_FRAMES	64	/* Max coalesced frame */
 
 /* Module parameters */
 #define TX_TIMEO 5000 /* default 5 seconds */
@@ -695,8 +697,11 @@ static void stmmac_dma_operation_mode(struct stmmac_priv *priv)
 static void stmmac_tx(struct stmmac_priv *priv)
 {
 	unsigned int txsize = priv->dma_tx_size;
+	unsigned long flags;
+
+	spin_lock_irqsave(&priv->tx_lock, flags);
 
-	spin_lock(&priv->tx_lock);
+	priv->xstats.tx_clean++;
 
 	while (priv->dirty_tx != priv->cur_tx) {
 		int last;
@@ -765,7 +770,7 @@ static void stmmac_tx(struct stmmac_priv *priv)
 		stmmac_enable_eee_mode(priv);
 		mod_timer(&priv->eee_ctrl_timer, STMMAC_LPI_TIMER(eee_timer));
 	}
-	spin_unlock(&priv->tx_lock);
+	spin_unlock_irqrestore(&priv->tx_lock, flags);
 }
 
 static inline void stmmac_enable_irq(struct stmmac_priv *priv)
@@ -778,29 +783,16 @@ static inline void stmmac_disable_irq(struct stmmac_priv *priv)
 	priv->hw->dma->disable_dma_irq(priv->ioaddr);
 }
 
-static int stmmac_has_work(struct stmmac_priv *priv)
+static void stmmac_txtimer(unsigned long data)
 {
-	unsigned int has_work = 0;
-	int rxret, tx_work = 0;
+	struct stmmac_priv *priv = (struct stmmac_priv *)data;
 
-	rxret = priv->hw->desc->get_rx_owner(priv->dma_rx +
-		(priv->cur_rx % priv->dma_rx_size));
+	priv->xstats.txtimer++;
 
 	if (priv->dirty_tx != priv->cur_tx)
-		tx_work = 1;
-
-	if (likely(!rxret || tx_work))
-		has_work = 1;
+		stmmac_tx(priv);
 
-	return has_work;
-}
-
-static inline void _stmmac_schedule(struct stmmac_priv *priv)
-{
-	if (likely(stmmac_has_work(priv))) {
-		stmmac_disable_irq(priv);
-		napi_schedule(&priv->napi);
-	}
+	return;
 }
 
 /**
@@ -824,7 +816,7 @@ static void stmmac_tx_err(struct stmmac_priv *priv)
 	netif_wake_queue(priv->dev);
 }
 
-static inline void stmmac_rx_schedule(struct stmmac_priv *priv)
+static void stmmac_rx_schedule(struct stmmac_priv *priv)
 {
 	if (likely(napi_schedule_prep(&priv->napi))) {
 		stmmac_disable_irq(priv);
@@ -1001,6 +993,36 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 				   priv->dma_rx_phy);
 }
 
+static int stmmac_check_coal(int size, int max_coal_frames)
+{
+	int ret = 0;
+
+	if (max_coal_frames >= size)
+		return ret;
+
+	return max_coal_frames;
+}
+
+static int stmmac_init_tx_coalesce(struct stmmac_priv *priv)
+{
+	int ret = -EOPNOTSUPP;
+
+	priv->tx_coal_frames = stmmac_check_coal(priv->dma_tx_size,
+						 STMMAC_TX_MAX_FRAMES);
+	if (priv->tx_coal_frames) {
+		/* Set Tx coalesce parameters and timers */
+		priv->tx_coal_timer = jiffies + usecs_to_jiffies(STMMAC_TX_TM);
+		init_timer(&priv->txtimer);
+		priv->txtimer.expires = priv->tx_coal_timer;
+		priv->txtimer.data = (unsigned long)priv;
+		priv->txtimer.function = stmmac_txtimer;
+
+		ret = 0;
+	}
+
+	return ret;
+}
+
 /**
  *  stmmac_open - open entry point of the driver
  *  @dev : pointer to the device structure.
@@ -1113,6 +1135,10 @@ static int stmmac_open(struct net_device *dev)
 	priv->tx_lpi_timer = STMMAC_DEFAULT_TWT_LS_TIMER;
 	priv->eee_enabled = stmmac_eee_init(priv);
 
+	ret = stmmac_init_tx_coalesce(priv);
+	if (!ret)
+		add_timer(&priv->txtimer);
+
 	napi_enable(&priv->napi);
 	skb_queue_head_init(&priv->rx_recycle);
 	netif_start_queue(dev);
@@ -1202,6 +1228,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	int nfrags = skb_shinfo(skb)->nr_frags;
 	struct dma_desc *desc, *first;
 	unsigned int nopaged_len = skb_headlen(skb);
+	unsigned long flags;
 
 	if (unlikely(stmmac_tx_avail(priv) < nfrags + 1)) {
 		if (!netif_queue_stopped(dev)) {
@@ -1213,10 +1240,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_BUSY;
 	}
 
-	spin_lock(&priv->tx_lock);
-
-	if (priv->tx_path_in_lpi_mode)
-		stmmac_disable_eee_mode(priv);
+	spin_lock_irqsave(&priv->tx_lock, flags);
 
 	entry = priv->cur_tx % txsize;
 
@@ -1272,7 +1296,14 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* Interrupt on completition only for the latest segment */
 	priv->hw->desc->close_tx_desc(desc);
 
-	wmb();
+	/* Do not set the IC according to the coalesce parameters */
+	priv->tx_count_frames++;
+	if (priv->tx_coal_frames > priv->tx_count_frames) {
+		priv->hw->desc->clear_tx_ic(desc);
+		priv->xstats.tx_reset_ic_bit++;
+		mod_timer(&priv->txtimer, priv->tx_coal_timer);
+	} else
+		priv->tx_count_frames = 0;
 
 	/* To avoid raise condition */
 	priv->hw->desc->set_tx_owner(first);
@@ -1302,7 +1333,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	priv->hw->dma->enable_dma_transmission(priv->ioaddr);
 
-	spin_unlock(&priv->tx_lock);
+	spin_unlock_irqrestore(&priv->tx_lock, flags);
 
 	return NETDEV_TX_OK;
 }
@@ -1447,7 +1478,6 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit)
  *	      all interfaces.
  *  Description :
  *   This function implements the the reception process.
- *   Also it runs the TX completion thread
  */
 static int stmmac_poll(struct napi_struct *napi, int budget)
 {
-- 
1.7.4.4

^ permalink raw reply related

* [net-next.git 7/7] stmmac: update the driver version to August_2012
From: Giuseppe CAVALLARO @ 2012-09-03  7:47 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>

Many new features have been introduced in the driver:
sysFS, Rx HW watchdog... so this patch updates the
driver's version.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index 05f17184..ad95f26 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -24,7 +24,7 @@
 #define __STMMAC_H__
 
 #define STMMAC_RESOURCE_NAME   "stmmaceth"
-#define DRV_MODULE_VERSION	"March_2012"
+#define DRV_MODULE_VERSION	"August_2012"
 
 #include <linux/clk.h>
 #include <linux/stmmac.h>
-- 
1.7.4.4

^ permalink raw reply related

* [net-next.git 6/7] stmmac: add mitigation and sysfs info in the doc
From: Giuseppe CAVALLARO @ 2012-09-03  7:47 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1346658422-1925-1-git-send-email-peppe.cavallaro@st.com>

This patch updates the stmmac.txt adding some information
about the new rx/tx mitigation schema and the sysFs support.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 Documentation/networking/stmmac.txt |   34 +++++++++++++++++++++-------------
 1 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index ef9ee71..67eaa35 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -29,11 +29,9 @@ The kernel configuration option is STMMAC_ETH:
 	dma_txsize: DMA tx ring size;
 	buf_sz: DMA buffer size;
 	tc: control the HW FIFO threshold;
-	tx_coe: Enable/Disable Tx Checksum Offload engine;
 	watchdog: transmit timeout (in milliseconds);
 	flow_ctrl: Flow control ability [on/off];
 	pause: Flow Control Pause Time;
-	tmrate: timer period (only if timer optimisation is configured).
 
 3) Command line options
 Driver parameters can be also passed in command line by using:
@@ -60,17 +58,21 @@ Then the poll method will be scheduled at some future point.
 The incoming packets are stored, by the DMA, in a list of pre-allocated socket
 buffers in order to avoid the memcpy (Zero-copy).
 
-4.3) Timer-Driver Interrupt
-Instead of having the device that asynchronously notifies the frame receptions,
-the driver configures a timer to generate an interrupt at regular intervals.
-Based on the granularity of the timer, the frames that are received by the
-device will experience different levels of latency. Some NICs have dedicated
-timer device to perform this task. STMMAC can use either the RTC device or the
-TMU channel 2  on STLinux platforms.
-The timers frequency can be passed to the driver as parameter; when change it,
-take care of both hardware capability and network stability/performance impact.
-Several performance tests on STM platforms showed this optimisation allows to
-spare the CPU while having the maximum throughput.
+4.3) Interrupt Mitigation
+The driver is able to mitigate the number of its DMA interrupts
+using NAPI for the reception on chips older than the 3.50.
+New chips have an HW RX-Watchdog used for this mitigation.
+
+User can tune (also via sysfs) a parameter that is the RI Watchdog
+Timer count. It indicates the number of system clock cycles.
+
+On Tx-side, the mitigation schema is based on a SW timer that calls the
+tx function (stmmac_tx) to reclaim the resource after transmitting the
+frames.
+Also there is another parameter (like a threshold) used to program
+the descriptors so that the interrupt on completion bit is not set
+when the frame is sent (xmit).
+These parameters can be tuned by sysfs entries.
 
 4.4) WOL
 Wake up on Lan feature through Magic and Unicast frames are supported for the
@@ -324,6 +326,12 @@ To enter in Tx LPI mode the driver needs to have a software timer
 that enable and disable the LPI mode when there is nothing to be
 transmitted.
 
+7) sys FS interface
+Some internal driver parameters can be tuned by using some
+entries exposed via sysFS. These parameters are currently used,
+for example, for internal timers that mitigate the rx/tx
+interrupts or for EEE.
+
 7) TODO:
  o XGMAC is not supported.
  o Add the PTP - precision time protocol
-- 
1.7.4.4

^ permalink raw reply related
