Netdev List
* Re: [net] stmmac: fix driver Kconfig when built as module
From: Giuseppe CAVALLARO @ 2012-05-28  5:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings, lliubbo, rayagond
In-Reply-To: <20120523.140139.197296963542601285.davem@davemloft.net>

On 5/23/2012 8:01 PM, David Miller wrote:
> From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
> Date: Wed, 23 May 2012 08:05:42 +0200
> 
>> This patch fixes the driver when built as a dynamic module.
>> In fact the platform part cannot be built and the probe fails
>> (thanks to Bob Liu, who reported this bug).
>> The patch also makes the selection of Platform and PCI parts
>> mutually exclusive.
>>
>> Reported-by: Bob Liu <lliubbo@gmail.com>
>> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
>> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
> 
> We have drivers which support both OF (which is implemented as
> platform bus) and PCI at the same time.  For example,
> drivers/net/ethernet/sun/niu.c
> 
> I do not see why stmmac cannot support both at the same time as well.
> 
> I absolutely do not want such segregation unless it is absolutely
> necessary.  Because it means that no matter what is chosen, a piece
> of code is disabled and therefore not getting build and/or runtime
> validation.

Ok, I'll review it and resend all the patches asap.

Regards
Peppe

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: WARNING: at net/ipv4/tcp.c:1610 tcp_recvmsg+0xb1b/0xc70()
From: Jack Stone @ 2012-05-28  8:34 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, Linux Kernel
In-Reply-To: <1338164727.2240.14.camel@edumazet-glaptop>

On 05/28/2012 01:25 AM, Eric Dumazet wrote:
> On Sun, 2012-05-27 at 20:13 +0100, Jack Stone wrote:
> 
>> Could it be something to do with my staging network driver?
> 
> drivers/staging/rtl8712/rtl8712_recv.c
> 
> line 1096
> 
> precvframe->u.hdr.pkt = skb_clone(pskb, GFP_ATOMIC);
> 
> This looks very wrong.
> Make sure you never _never_ hit this path.
> 

I've applied the following debugging patch. Thanks for the suggestion.

diff --git a/drivers/staging/rtl8712/rtl8712_recv.c b/drivers/staging/rtl8712/rtl8712_recv.c
index 8e82ce2..fed62f8 100644
--- a/drivers/staging/rtl8712/rtl8712_recv.c
+++ b/drivers/staging/rtl8712/rtl8712_recv.c
@@ -1082,23 +1082,16 @@ static int recvbuf2recvframe(struct _adapter *padapter, struct sk_buff *pskb)
                 * 4 is for skb->data 4 bytes alignment. */
                alloc_sz += 6;
                pkt_copy = netdev_alloc_skb(padapter->pnetdev, alloc_sz);
-               if (pkt_copy) {
-                       pkt_copy->dev = padapter->pnetdev;
-                       precvframe->u.hdr.pkt = pkt_copy;
-                       skb_reserve(pkt_copy, 4 - ((addr_t)(pkt_copy->data)
-                                   % 4));
-                       skb_reserve(pkt_copy, shift_sz);
-                       memcpy(pkt_copy->data, pbuf, tmp_len);
-                       precvframe->u.hdr.rx_head = precvframe->u.hdr.rx_data =
-                                precvframe->u.hdr.rx_tail = pkt_copy->data;
-                       precvframe->u.hdr.rx_end = pkt_copy->data + alloc_sz;
-               } else {
-                       precvframe->u.hdr.pkt = skb_clone(pskb, GFP_ATOMIC);
-                       precvframe->u.hdr.rx_head = pbuf;
-                       precvframe->u.hdr.rx_data = pbuf;
-                       precvframe->u.hdr.rx_tail = pbuf;
-                       precvframe->u.hdr.rx_end = pbuf + alloc_sz;
-               }
+               WARN_ON(!pkt_copy);
+               pkt_copy->dev = padapter->pnetdev;
+               precvframe->u.hdr.pkt = pkt_copy;
+               skb_reserve(pkt_copy, 4 - ((addr_t)(pkt_copy->data)
+                           % 4));
+               skb_reserve(pkt_copy, shift_sz);
+               memcpy(pkt_copy->data, pbuf, tmp_len);
+               precvframe->u.hdr.rx_head = precvframe->u.hdr.rx_data =
+                        precvframe->u.hdr.rx_tail = pkt_copy->data;
+               precvframe->u.hdr.rx_end = pkt_copy->data + alloc_sz;
                recvframe_put(precvframe, tmp_len);
                recvframe_pull(precvframe, drvinfo_sz + RXDESC_SIZE);
                /* because the endian issue, driver avoid reference to the
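
Editor's sketch (not part of the patch): the debugging patch above warns
when the allocation fails but still goes on to use pkt_copy. One
alternative, shown here only as a user-space illustration, is to report
the failure so the caller can drop the frame instead of sharing the
source buffer. The names frame, copy_frame, alloc_fn and failing_alloc
are hypothetical stand-ins and are not the rtl8712 driver's API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a received frame; not the driver's struct. */
struct frame {
	unsigned char *data;
	size_t len;
};

/* Allocator signature, so that callers can inject allocation failures. */
typedef void *(*alloc_fn)(size_t size);

/* An allocator that always fails, to exercise the error path. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Copy len bytes from src into a newly allocated buffer.
 * Returns 0 on success, or -1 if the allocation failed. On failure the
 * frame is left empty and the caller is expected to drop the packet
 * rather than fall back to sharing the source buffer.
 */
static int copy_frame(struct frame *f, const unsigned char *src, size_t len,
		      alloc_fn alloc)
{
	unsigned char *buf = alloc(len);

	if (!buf) {
		f->data = NULL;
		f->len = 0;
		return -1;
	}
	memcpy(buf, src, len);
	f->data = buf;
	f->len = len;
	return 0;
}
```

Passing malloc as the allocator exercises the normal path; passing
failing_alloc shows that the frame stays empty and -1 is returned.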

^ permalink raw reply related

* Re: [PATCH] xen/netback: Calculate the number of SKB slots required correctly
From: Ian Campbell @ 2012-05-28  8:42 UTC (permalink / raw)
  To: David Miller
  Cc: Simon Graham, konrad.wilk@oracle.com,
	xen-devel@lists.xensource.com, netdev@vger.kernel.org,
	bhutchings@solarflare.com, adnan.misherfi@oracle.com
In-Reply-To: <20120524.162117.2167559109226305167.davem@davemloft.net>

On Thu, 2012-05-24 at 21:21 +0100, David Miller wrote:
> From: Simon Graham <simon.graham@citrix.com>
> Date: Thu, 24 May 2012 12:26:07 -0400
> 
> > When calculating the number of slots required for a packet header, the code
> > was reserving too many slots if the header crossed a page boundary. Since
> > netbk_gop_skb copies the header to the start of the page, the count of
> > slots required for the header should be based solely on the header size.
> > 
> > This problem is easy to reproduce if a VIF is bridged to a USB 3G modem
> > device as the skb->data value always starts near the end of the first page.
> > 
> > Signed-off-by: Simon Graham <simon.graham@citrix.com>
> 
> Applied.

Thanks both!

Ian.
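
Editor's sketch (not the netback code): the arithmetic behind the fix
described in the quoted commit message can be illustrated as follows.
If the header is copied to the start of a page, the slot count depends
only on its length; counting the offset within the source page as well
can yield one slot too many. The 4096-byte page size and both function
names are assumptions made for this illustration.

```c
/* Assumed page size for the illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Slots needed when the header is copied to the start of a page:
 * the result depends only on the header length.
 */
static unsigned int slots_for_copied_header(unsigned int len)
{
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Slots counted when the header's offset within its source page is
 * included as well; this over-counts if the header crosses a page
 * boundary in the source but is copied to the start of a page.
 */
static unsigned int slots_with_offset(unsigned int offset, unsigned int len)
{
	return (offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For a 128-byte header starting at offset 4000, the offset-based count is
2 while the length-based count is 1.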

^ permalink raw reply

* Re: [PATCH 01/17] netfilter: add struct nf_proto_net for register l4proto sysctl
From: Pablo Neira Ayuso @ 2012-05-28  9:53 UTC (permalink / raw)
  To: Gao feng
  Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano,
	Gao feng
In-Reply-To: <4FC03FD1.2050408@cn.fujitsu.com>

On Sat, May 26, 2012 at 10:28:33AM +0800, Gao feng wrote:
> On 2012-05-25 14:02, Gao feng wrote:
> > On 2012-05-25 10:54, Pablo Neira Ayuso wrote:
[...]
> >> Could you resolve this by checking pn->ctl_compat_header != NULL ?
> > 
> > pn->ctl_table_header and ctl_compat_header are shared by l4proto_tcp and l4proto_tcp6.
> > If we register both l4proto_tcp and l4proto_tcp6, then when l4proto_tcp6 is unregistered
> > pn->ctl_compat_header will still be non-NULL.
> > 
> 
> Maybe we can resolve this by checking nf_conntrack_l4proto.l3proto == AF_INET && pn->ctl_compat_header != NULL,
> because the compat sysctl is registered by AF_INET's proto only.

OK, as long as it lets us remove the compat field, I prefer it.
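
Editor's sketch (not from the patch series): the check proposed in the
quoted text can be written as a small predicate. The two structs below
are simplified, hypothetical stand-ins for the kernel's
nf_conntrack_l4proto and nf_proto_net, and should_unregister_compat is
an invented name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Hypothetical stand-in for the per-net protocol data. */
struct proto_net_sketch {
	void *ctl_compat_header;
};

/* Hypothetical stand-in for the l4 protocol tracker. */
struct l4proto_sketch {
	unsigned short l3proto;
};

/*
 * The compat sysctl is registered by the AF_INET tracker only, so only
 * the AF_INET tracker should tear it down, and only if it exists.
 */
static bool should_unregister_compat(const struct l4proto_sketch *l4,
				     const struct proto_net_sketch *pn)
{
	return l4->l3proto == AF_INET && pn->ctl_compat_header != NULL;
}
```

With this shape, unregistering the AF_INET6 tracker leaves a shared,
non-NULL compat header alone.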

^ permalink raw reply

* Re: [PATCH 04/17] netfilter: add namespace support for l4proto_generic
From: Pablo Neira Ayuso @ 2012-05-28  9:54 UTC (permalink / raw)
  To: Gao feng; +Cc: netfilter-devel, netdev, serge.hallyn, ebiederm, dlezcano
In-Reply-To: <4FC041B4.6080501@cn.fujitsu.com>

On Sat, May 26, 2012 at 10:36:36AM +0800, Gao feng wrote:
> >>>>>> @@ -1586,9 +1587,12 @@ static int nf_conntrack_init_net(struct net *net)
> >>>>>>  	ret = nf_conntrack_helper_init(net);
> >>>>>>  	if (ret < 0)
> >>>>>>  		goto err_helper;
> >>>>>> -
> >>>>>> +	ret = nf_conntrack_proto_generic_init(net);
> >>>>>> +	if (ret < 0)
> >>>>>> +		goto err_generic;
> >>>>>>  	return 0;
> >>>>>> -
> >>>>>> +err_generic:
> >>>>>> +	nf_conntrack_helper_fini(net);
> >>>>>>  err_helper:
> >>>>>>  	nf_conntrack_timeout_fini(net);
> >>>>>>  err_timeout:
> >>>>>> diff --git a/net/netfilter/nf_conntrack_proto.c b/net/netfilter/nf_conntrack_proto.c
> >>>>>> index 7ee6653..9b4bf6d 100644
> >>>>>> --- a/net/netfilter/nf_conntrack_proto.c
> >>>>>> +++ b/net/netfilter/nf_conntrack_proto.c
> >>>>>> @@ -287,10 +287,16 @@ EXPORT_SYMBOL_GPL(nf_conntrack_l3proto_unregister);
> >>>>>>  static struct nf_proto_net *nf_ct_l4proto_net(struct net *net,
> >>>>>>  					      struct nf_conntrack_l4proto *l4proto)
> >>>>>>  {
> >>>>>> -	if (l4proto->net_id)
> >>>>>> -		return net_generic(net, *l4proto->net_id);
> >>>>>> -	else
> >>>>>> -		return NULL;
> >>>>>> +	switch (l4proto->l4proto) {
> >>>>>> +	case 255: /* l4proto_generic */
> >>>>>> +		return (struct nf_proto_net *)&net->ct.proto.generic;
> >>>>>> +	default:
> >>>>>> +		if (l4proto->net_id)
> >>>>>> +			return net_generic(net, *l4proto->net_id);
> >>>>>> +		else
> >>>>>> +			return NULL;
> >>>>>> +	}
> >>>>>> +	return NULL;
> >>>>>>  }
> >>>>>>  
> >>>>>>  int nf_ct_l4proto_register_sysctl(struct net *net,
> >>>>>> @@ -457,11 +463,6 @@ EXPORT_SYMBOL_GPL(nf_conntrack_l4proto_unregister);
> >>>>>>  int nf_conntrack_proto_init(void)
> >>>>>>  {
> >>>>>>  	unsigned int i;
> >>>>>> -	int err;
> >>>>>> -
> >>>>>> -	err = nf_ct_l4proto_register_sysctl(&init_net, &nf_conntrack_l4proto_generic);
> >>>>>> -	if (err < 0)
> >>>>>> -		return err;
> >>>>>
> >>>>> I like that all protocols sysctl are registered by
> >>>>> nf_conntrack_proto_init. Can you keep using that?
> >>>>
> >>>> you mean per-net's generic_proto sysctl are registered by
> >>>> nf_conntrack_proto_init?
> >>>>
> >>>> such as
> >>>>
> >>>> int nf_conntrack_proto_init(struct net *net)
> >>>> {
> >>>> 	...
> >>>> 	err = nf_ct_l4proto_register_sysctl(net, &nf_conntrack_l4proto_generic);
> >>>
> >>> Yes, all protocol trackers included in nf_conntrack_proto_init:
> >>>
> >>>         err = nf_conntrack_proto_generic_init(net);
> >>>         ...
> >>>         err = nf_conntrack_proto_tcp_init(net);
> >>>         ...
> >>>
> >>> and so on.
> >>
> >> Sounds good, but the l4protos except l4proto_generic are enabled by
> >> loading modules (such as nf_conntrack_ipv4, nf_conntrack_proto_udplite).
> >>
> >> So I think it makes no sense to init all protocol here, unless we decide
> >> to put those protos into module nf_conntrack.
> > 
> > Sorry, I meant to say all protocols that are built-in.
> > 
> > So, just put there those that are built-in, like TCP, UDP and generic
> 
> AFAIK l4proto_generic is registered when the nf_conntrack module is loaded,
> but l4proto_tcp, l4proto_udp and l4proto_icmp are registered when the nf_conntrack_ipv4 module is loaded.
> 
> So we can only register generic proto here.

You are right.

^ permalink raw reply

* [PATCH 1/2] tc(8): Negative indent and missing "-" after an escape
From: Andreas Henriksson @ 2012-05-28 11:46 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Bjarni Ingi Gislason, Andreas Henriksson

From: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>

From "man ..." ("groff -ww -mandoc ..."):

<groff: tc.8>:51: warning: total indent cannot be negative
<groff: tc.8>:57: warning: escape character ignored before `i'

*********************

Space at end of line removed

  General considerations

a) Manuals should usually only be left justified.  Use ".ad l"
as the first regular command.

b) Each sentence should begin on a new line.  The conventions
about the amount of space between sentences are different.  This
also makes a check on the number of space characters between
words easier.

c) Separate numbers from units with a (no-break) space.  A
no-break space can be code 0xA0, "\ " (\<space>), or "\~"
(groff).

d) Use macros "TS/TE" for tables with more than two columns.
Then use

'\" t

as the first line in the source to tell "man" to use the "tbl"
preprocessor.

e) Protect last period (full stop) in abbreviations with "\&",
if it is or might be (through new formatting of source) at the
end of line, if it is also not an end of sentence.

*********************

Originally filed at: http://bugs.debian.org/674704

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
---
 man/man8/tc.8 |  178 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 89 insertions(+), 89 deletions(-)

diff --git a/man/man8/tc.8 b/man/man8/tc.8
index fc8095e..6576377 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -2,22 +2,22 @@
 .SH NAME
 tc \- show / manipulate traffic control settings
 .SH SYNOPSIS
-.B tc qdisc [ add | change | replace | link ] dev 
-DEV 
-.B 
-[ parent 
-qdisc-id 
-.B | root ] 
-.B [ handle 
+.B tc qdisc [ add | change | replace | link ] dev
+DEV
+.B
+[ parent
+qdisc-id
+.B | root ]
+.B [ handle
 qdisc-id ] qdisc
 [ qdisc specific parameters ]
 .P
 
 .B tc class [ add | change | replace ] dev
 DEV
-.B parent 
-qdisc-id 
-.B [ classid 
+.B parent
+qdisc-id
+.B [ classid
 class-id ] qdisc
 [ qdisc specific parameters ]
 .P
@@ -36,38 +36,38 @@ flow-id
 
 .B tc
 .RI "[ " FORMAT " ]"
-.B qdisc show [ dev 
-DEV 
+.B qdisc show [ dev
+DEV
 .B  ]
 .P
-.B tc 
+.B tc
 .RI "[ " FORMAT " ]"
-.B class show dev 
-DEV 
+.B class show dev
+DEV
 .P
-.B tc filter show dev 
-DEV 
+.B tc filter show dev
+DEV
 
-.ti -8
+.ti 8
 .IR FORMAT " := {"
 \fB\-s\fR[\fItatistics\fR] |
 \fB\-d\fR[\fIetails\fR] |
 \fB\-r\fR[\fIaw\fR] |
 \fB\-p\fR[\fIretty\fR] |
-\fB\i\fR[\fIec\fR] }
+\fB\-i\fR[\fIec\fR] }
 
 .SH DESCRIPTION
 .B Tc
-is used to configure Traffic Control in the Linux kernel. Traffic Control consists 
+is used to configure Traffic Control in the Linux kernel. Traffic Control consists
 of the following:
 
-.TP 
+.TP
 SHAPING
-When traffic is shaped, its rate of transmission is under control. Shaping may 
-be more than lowering the available bandwidth - it is also used to smooth out 
+When traffic is shaped, its rate of transmission is under control. Shaping may
+be more than lowering the available bandwidth - it is also used to smooth out
 bursts in traffic for better network behaviour. Shaping occurs on egress.
 
-.TP 
+.TP
 SCHEDULING
 By scheduling the transmission of packets it is possible to improve interactivity
 for traffic that needs it while still guaranteeing bandwidth to bulk transfers. Reordering
@@ -80,34 +80,34 @@ arriving. Policing thus occurs on ingress.
 
 .TP
 DROPPING
-Traffic exceeding a set bandwidth may also be dropped forthwith, both on 
+Traffic exceeding a set bandwidth may also be dropped forthwith, both on
 ingress and on egress.
 
 .P
-Processing of traffic is controlled by three kinds of objects: qdiscs, 
-classes and filters. 
+Processing of traffic is controlled by three kinds of objects: qdiscs,
+classes and filters.
 
 .SH QDISCS
-.B qdisc 
-is short for 'queueing discipline' and it is elementary to 
-understanding traffic control. Whenever the kernel needs to send a 
-packet to an interface, it is 
+.B qdisc
+is short for 'queueing discipline' and it is elementary to
+understanding traffic control. Whenever the kernel needs to send a
+packet to an interface, it is
 .B enqueued
 to the qdisc configured for that interface. Immediately afterwards, the kernel
 tries to get as many packets as possible from the qdisc, for giving them
 to the network adaptor driver.
 
-A simple QDISC is the 'pfifo' one, which does no processing at all and is a pure 
+A simple QDISC is the 'pfifo' one, which does no processing at all and is a pure
 First In, First Out queue. It does however store traffic when the network interface
 can't handle it momentarily.
 
 .SH CLASSES
-Some qdiscs can contain classes, which contain further qdiscs - traffic may 
+Some qdiscs can contain classes, which contain further qdiscs - traffic may
 then be enqueued in any of the inner qdiscs, which are within the
 .B classes.
-When the kernel tries to dequeue a packet from such a 
+When the kernel tries to dequeue a packet from such a
 .B classful qdisc
-it can come from any of the classes. A qdisc may for example prioritize 
+it can come from any of the classes. A qdisc may for example prioritize
 certain kinds of traffic by trying to dequeue from certain classes
 before others.
 
@@ -117,45 +117,45 @@ A
 is used by a classful qdisc to determine in which class a packet will
 be enqueued. Whenever traffic arrives at a class with subclasses, it needs
 to be classified. Various methods may be employed to do so, one of these
-are the filters. All filters attached to the class are called, until one of 
-them returns with a verdict. If no verdict was made, other criteria may be 
+are the filters. All filters attached to the class are called, until one of
+them returns with a verdict. If no verdict was made, other criteria may be
 available. This differs per qdisc.
 
-It is important to notice that filters reside 
+It is important to notice that filters reside
 .B within
 qdiscs - they are not masters of what happens.
 
 .SH CLASSLESS QDISCS
 The classless qdiscs are:
-.TP 
+.TP
 [p|b]fifo
-Simplest usable qdisc, pure First In, First Out behaviour. Limited in 
+Simplest usable qdisc, pure First In, First Out behaviour. Limited in
 packets or in bytes.
 .TP
 pfifo_fast
 Standard qdisc for 'Advanced Router' enabled kernels. Consists of a three-band
-queue which honors Type of Service flags, as well as the priority that may be 
+queue which honors Type of Service flags, as well as the priority that may be
 assigned to a packet.
 .TP
 red
 Random Early Detection simulates physical congestion by randomly dropping
 packets when nearing configured bandwidth allocation. Well suited to very
 large bandwidth applications.
-.TP 
+.TP
 sfq
 Stochastic Fairness Queueing reorders queued traffic so each 'session'
 gets to send a packet in turn.
 .TP
 tbf
 The Token Bucket Filter is suited for slowing traffic down to a precisely
-configured rate. Scales well to large bandwidths. 
+configured rate. Scales well to large bandwidths.
 .SH CONFIGURING CLASSLESS QDISCS
-In the absence of classful qdiscs, classless qdiscs can only be attached at 
+In the absence of classful qdiscs, classless qdiscs can only be attached at
 the root of a device. Full syntax:
 .P
-.B tc qdisc add dev 
-DEV 
-.B root 
+.B tc qdisc add dev
+DEV
+.B root
 QDISC QDISC-PARAMETERS
 
 To remove, issue
@@ -164,7 +164,7 @@ To remove, issue
 DEV
 .B root
 
-The  
+The
 .B pfifo_fast
 qdisc is the automatic default in the absence of a configured qdisc.
 
@@ -172,85 +172,85 @@ qdisc is the automatic default in the absence of a configured qdisc.
 The classful qdiscs are:
 .TP
 CBQ
-Class Based Queueing implements a rich linksharing hierarchy of classes. 
+Class Based Queueing implements a rich linksharing hierarchy of classes.
 It contains shaping elements as well as prioritizing capabilities. Shaping is
 performed using link idle time calculations based on average packet size and
 underlying link bandwidth. The latter may be ill-defined for some interfaces.
 .TP
 HTB
-The Hierarchy Token Bucket implements a rich linksharing hierarchy of 
+The Hierarchy Token Bucket implements a rich linksharing hierarchy of
 classes with an emphasis on conforming to existing practices. HTB facilitates
 guaranteeing bandwidth to classes, while also allowing specification of upper
 limits to inter-class sharing. It contains shaping elements, based on TBF and
-can prioritize classes.	
-.TP 
+can prioritize classes.
+.TP
 PRIO
-The PRIO qdisc is a non-shaping container for a configurable number of 
-classes which are dequeued in order. This allows for easy prioritization 
-of traffic, where lower classes are only able to send if higher ones have 
-no packets available. To facilitate configuration, Type Of Service bits are 
+The PRIO qdisc is a non-shaping container for a configurable number of
+classes which are dequeued in order. This allows for easy prioritization
+of traffic, where lower classes are only able to send if higher ones have
+no packets available. To facilitate configuration, Type Of Service bits are
 honored by default.
 .SH THEORY OF OPERATION
-Classes form a tree, where each class has a single parent. 
+Classes form a tree, where each class has a single parent.
 A class may have multiple children. Some qdiscs allow for runtime addition
-of classes (CBQ, HTB) while others (PRIO) are created with a static number of 
+of classes (CBQ, HTB) while others (PRIO) are created with a static number of
 children.
 
-Qdiscs which allow dynamic addition of classes can have zero or more 
-subclasses to which traffic may be enqueued. 
+Qdiscs which allow dynamic addition of classes can have zero or more
+subclasses to which traffic may be enqueued.
 
 Furthermore, each class contains a
 .B leaf qdisc
-which by default has 
-.B pfifo 
-behaviour though another qdisc can be attached in place. This qdisc may again 
-contain classes, but each class can have only one leaf qdisc. 
+which by default has
+.B pfifo
+behaviour though another qdisc can be attached in place. This qdisc may again
+contain classes, but each class can have only one leaf qdisc.
 
-When a packet enters a classful qdisc it can be 
+When a packet enters a classful qdisc it can be
 .B classified
-to one of the classes within. Three criteria are available, although not all 
+to one of the classes within. Three criteria are available, although not all
 qdiscs will use all three:
-.TP 
+.TP
 tc filters
-If tc filters are attached to a class, they are consulted first 
-for relevant instructions. Filters can match on all fields of a packet header, 
-as well as on the firewall mark applied by ipchains or iptables. 
+If tc filters are attached to a class, they are consulted first
+for relevant instructions. Filters can match on all fields of a packet header,
+as well as on the firewall mark applied by ipchains or iptables.
 .TP
 Type of Service
 Some qdiscs have built in rules for classifying packets based on the TOS field.
 .TP
 skb->priority
-Userspace programs can encode a class-id in the 'skb->priority' field using 
+Userspace programs can encode a class-id in the 'skb->priority' field using
 the SO_PRIORITY option.
 .P
 Each node within the tree can have its own filters but higher level filters
 may also point directly to lower classes.
 
-If classification did not succeed, packets are enqueued to the leaf qdisc 
+If classification did not succeed, packets are enqueued to the leaf qdisc
 attached to that class. Check qdisc specific manpages for details, however.
 
 .SH NAMING
 All qdiscs, classes and filters have IDs, which can either be specified
-or be automatically assigned. 
+or be automatically assigned.
 
 IDs consist of a major number and a minor number, separated by a colon.
 
-.TP 
+.TP
 QDISCS
-A qdisc, which potentially can have children, 
-gets assigned a major number, called a 'handle', leaving the minor 
-number namespace available for classes. The handle is expressed as '10:'. 
-It is customary to explicitly assign a handle to qdiscs expected to have 
+A qdisc, which potentially can have children,
+gets assigned a major number, called a 'handle', leaving the minor
+number namespace available for classes. The handle is expressed as '10:'.
+It is customary to explicitly assign a handle to qdiscs expected to have
 children.
 
-.TP 
+.TP
 CLASSES
 Classes residing under a qdisc share their qdisc major number, but each have
-a separate minor number called a 'classid' that has no relation to their 
-parent classes, only to their parent qdisc. The same naming custom as for 
+a separate minor number called a 'classid' that has no relation to their
+parent classes, only to their parent qdisc. The same naming custom as for
 qdiscs applies.
 
-.TP 
+.TP
 FILTERS
 Filters have a three part ID, which is only needed when using a hashed
 filter hierarchy.
@@ -258,7 +258,7 @@ filter hierarchy.
 All parameters accept a floating point number, possibly followed by a unit.
 .P
 Bandwidths or rates can be specified in:
-.TP 
+.TP
 kbps
 Kilobytes per second
 .TP
@@ -306,9 +306,9 @@ Microseconds.
 The following commands are available for qdiscs, classes and filter:
 .TP
 add
-Add a qdisc, class or filter to a node. For all entities, a 
+Add a qdisc, class or filter to a node. For all entities, a
 .B parent
-must be passed, either by passing its ID or by attaching directly to the root of a device. 
+must be passed, either by passing its ID or by attaching directly to the root of a device.
 When creating a qdisc or a filter, it can be named with the
 .B handle
 parameter. A class is named with the
@@ -317,15 +317,15 @@ parameter.
 
 .TP
 remove
-A qdisc can be removed by specifying its handle, which may also be 'root'. All subclasses and their leaf qdiscs 
+A qdisc can be removed by specifying its handle, which may also be 'root'. All subclasses and their leaf qdiscs
 are automatically deleted, as well as any filters attached to them.
 
 .TP
 change
 Some entities can be modified 'in place'. Shares the syntax of 'add', with the exception
-that the handle cannot be changed and neither can the parent. In other words, 
+that the handle cannot be changed and neither can the parent. In other words,
 .B
-change 
+change
 cannot move a node.
 
 .TP
@@ -335,7 +335,7 @@ it is created.
 
 .TP
 link
-Only available for qdiscs and performs a replace where the node 
+Only available for qdiscs and performs a replace where the node
 must exist already.
 
 .SH FORMAT
-- 
1.7.10

^ permalink raw reply related

* [PATCH 2/2] tc-drr(8): tab unquoted in a argument to a macro
From: Andreas Henriksson @ 2012-05-28 11:46 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Bjarni Ingi Gislason, Andreas Henriksson
In-Reply-To: <1338205565-11872-1-git-send-email-andreas@fatal.se>

From: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>

From "man ..." ("groff -ww -mandoc ..."):

<groff: tc-drr.8>:67: warning: tab character in unquoted macro argument
<groff: tc-drr.8>:69: warning: tab character in unquoted macro argument

*********************

Originally filed at: http://bugs.debian.org/674706

Signed-off-by: Andreas Henriksson <andreas@fatal.se>
---
 man/man8/tc-drr.8 |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man/man8/tc-drr.8 b/man/man8/tc-drr.8
index 16a8ec0..e25d6dd 100644
--- a/man/man8/tc-drr.8
+++ b/man/man8/tc-drr.8
@@ -64,9 +64,9 @@ flow filter:
 
 .B for i in .. 1024;do
 .br
-.B \ttc class add dev ..  classid $handle:$(print %x $i)
+.B "\ttc class add dev .. classid $handle:$(print %x $i)"
 .br
-.B \ttc qdisc add dev .. fifo limit 16
+.B "\ttc qdisc add dev .. fifo limit 16"
 .br
 .B done
 
-- 
1.7.10

^ permalink raw reply related

* Re: [PATCH 4/5] NFS: remove RPC PipeFS mount point reference from blocklayout routines
From: Boaz Harrosh @ 2012-05-28 11:43 UTC (permalink / raw)
  To: Peng Tao
  Cc: Trond Myklebust, J. Bruce Fields, tao.peng-mb1K0bWo544,
	skinsbursky-bzQdu9zFT3WakBO8gow8eQ,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, xemul-bzQdu9zFT3WakBO8gow8eQ,
	neilb-l3A5Bk7waGM, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	jbottomley-bzQdu9zFT3WakBO8gow8eQ, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	devel-GEFAQzZX7r8dnm+yROfE0A, Steve Dickson
In-Reply-To: <CA+a=Yy4bEKaeUihjYLRzXVbjA3fc2EuZ3ToAkf0w-oL3PnZJKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 11/29/2011 07:30 PM, Peng Tao wrote:

> On Wed, Nov 30, 2011 at 1:19 AM, Trond Myklebust
> <Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org> wrote:
>> On Tue, 2011-11-29 at 11:42 -0500, J. Bruce Fields wrote:
>>> On Tue, Nov 29, 2011 at 11:40:30AM -0500, Trond Myklebust wrote:
>>>> I mean that I'm perfectly entitled to do
>>>>
>>>> 'modprobe -r blocklayoutdriver'
>>>>
>>>> and when I do that, then I expect blkmapd to close the rpc pipe and wait
>>>> for a new one to be created just like rpc.idmapd and rpc.gssd do when I
>>>> remove the nfs and sunrpc modules.
>>>
>>> The rpc pipefs mount doesn't hold a reference on the sunrpc module?
>>
>> I stand corrected: the mount does hold a reference to the sunrpc
>> module.
>> However nothing holds a reference to the blocklayoutdriver module, so
>> the main point that the "blocklayout" pipe can disappear from underneath
>> the blkmapd stands.
> Thanks for the explanation. I agree it can cause problems if the user
> reloads the blocklayout module. I will look into a fix for blkmapd.
> 


You might want to consider converting to call_usermodehelper().

I know that it greatly simplified our code, both in the kernel and
in user mode, and it made the nfs-utils maintainer much happier
as well.

Speed is not critical here, I think. As with objects, it is
done once per new device_id.

> Best,
> Tao


Just my $0.017
Boaz

^ permalink raw reply

* [PATCH] r6040: disable pci device if the subsequent calls (after pci_enable_device) fails
From: Devendra Naga @ 2012-05-28 11:57 UTC (permalink / raw)
  To: Florian Fainelli, netdev, linux-kernel; +Cc: Devendra Naga

The calls after pci_enable_device() may fail, and the function then errors
out without disabling the device. Disable the device on the error paths.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
---
 drivers/net/ethernet/rdc/r6040.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/rdc/r6040.c b/drivers/net/ethernet/rdc/r6040.c
index 4de7364..8f5079a 100644
--- a/drivers/net/ethernet/rdc/r6040.c
+++ b/drivers/net/ethernet/rdc/r6040.c
@@ -1096,20 +1096,20 @@ static int __devinit r6040_init_one(struct pci_dev *pdev,
 	if (err) {
 		dev_err(&pdev->dev, "32-bit PCI DMA addresses"
 				"not supported by the card\n");
-		goto err_out;
+		goto err_out_disable_dev;
 	}
 	err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
 	if (err) {
 		dev_err(&pdev->dev, "32-bit PCI DMA addresses"
 				"not supported by the card\n");
-		goto err_out;
+		goto err_out_disable_dev;
 	}
 
 	/* IO Size check */
 	if (pci_resource_len(pdev, bar) < io_size) {
 		dev_err(&pdev->dev, "Insufficient PCI resources, aborting\n");
 		err = -EIO;
-		goto err_out;
+		goto err_out_disable_dev;
 	}
 
 	pci_set_master(pdev);
@@ -1117,7 +1117,7 @@ static int __devinit r6040_init_one(struct pci_dev *pdev,
 	dev = alloc_etherdev(sizeof(struct r6040_private));
 	if (!dev) {
 		err = -ENOMEM;
-		goto err_out;
+		goto err_out_disable_dev;
 	}
 	SET_NETDEV_DEV(dev, &pdev->dev);
 	lp = netdev_priv(dev);
@@ -1238,6 +1238,8 @@ err_out_free_res:
 	pci_release_regions(pdev);
 err_out_free_dev:
 	free_netdev(dev);
+err_out_disable_dev:
+	pci_disable_device(pdev);
 err_out:
 	return err;
 }
-- 
1.7.9.5
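
Editor's sketch (not part of the patch): the label ordering used above
follows the usual goto-unwinding pattern, where each label undoes one
setup step and falls through to the labels for earlier steps. The
user-space illustration below records the cleanup order so that it can
be inspected. probe, record, unwind_log and the fail_at parameter are
hypothetical names, and the steps are simplified relative to
r6040_init_one().

```c
#include <stddef.h>

/* Log of cleanup steps taken, so the unwind order can be inspected. */
static char unwind_log[8];
static size_t unwind_len;

static void record(char step)
{
	if (unwind_len < sizeof(unwind_log) - 1) {
		unwind_log[unwind_len++] = step;
		unwind_log[unwind_len] = '\0';
	}
}

/*
 * fail_at selects which setup step fails (0 = none):
 *   1 = enabling the device,
 *   2 = allocating the net device (device already enabled),
 *   3 = a later step (device enabled and net device allocated).
 * Cleanup is recorded as 'f' (free net device) and 'd' (disable device).
 */
static int probe(int fail_at)
{
	int err = -1;

	unwind_len = 0;
	unwind_log[0] = '\0';

	if (fail_at == 1)
		goto err_out;			/* nothing to undo yet */
	if (fail_at == 2)
		goto err_out_disable_dev;	/* only the enable to undo */
	if (fail_at == 3)
		goto err_out_free_dev;		/* undo both, newest first */
	return 0;

err_out_free_dev:
	record('f');
err_out_disable_dev:
	record('d');
err_out:
	return err;
}
```

A failure at step 3 leaves "fd" in unwind_log, a failure at step 2
leaves "d", and a failure at step 1 leaves it empty.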

^ permalink raw reply related

* [RFC PATCH 2/2] tcp: Early SYN limit and SYN cookie handling to mitigate SYN floods
From: Jesper Dangaard Brouer @ 2012-05-28 11:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev, Christoph Paasch, Eric Dumazet,
	David S. Miller, Martin Topholm
  Cc: Florian Westphal, opurdila, Hans Schillstrom
In-Reply-To: <20120528115102.12068.79994.stgit@localhost.localdomain>

TCP SYN handling is on the slow path via tcp_v4_rcv(), and is
performed while holding spinlock bh_lock_sock().

Real-life and testlab experiments show that the kernel chokes
when reaching 130Kpps SYN floods (powerful Nehalem 16 cores).
Measuring with perf reveals that it is caused by the
bh_lock_sock_nested() call in tcp_v4_rcv().

With this patch, the machine can handle 750Kpps (max of the SYN
flood generator) with cycles to spare. CPU load on the big machine
dropped from 100% to 1%.

Note that only SYN cookies are handled early; normal SYN packets
are still processed under bh_lock_sock().

Signed-off-by: Martin Topholm <mph@hoth.dk>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---

 net/ipv4/tcp_ipv4.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 15958b2..7480fc2 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1386,8 +1386,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 		goto drop;
 
 	/* SYN cookie handling */
-	if (tcp_v4_syn_conn_limit(sk, skb))
-		goto drop;
+//	if (tcp_v4_syn_conn_limit(sk, skb))
+//		goto drop;
 
 	req = inet_reqsk_alloc(&tcp_request_sock_ops);
 	if (!req)
@@ -1795,6 +1795,12 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!sk)
 		goto no_tcp_socket;
 
+	/* Early and parallel SYN limit check, that sends syncookies */
+	if (sk->sk_state == TCP_LISTEN && th->syn && !th->ack && !th->fin) {
+		if (tcp_v4_syn_conn_limit(sk, skb))
+			goto discard_and_relse;
+	}
+
 process:
 	if (sk->sk_state == TCP_TIME_WAIT)
 		goto do_time_wait;

^ permalink raw reply related
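As a rough illustration of the early check that patch 2/2 adds to tcp_v4_rcv(), the predicate can be written as a small user-space function. The struct below is a hypothetical, simplified stand-in for the TCP header flags; only the condition itself is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the TCP flag bits the check looks at. */
struct fake_tcp_flags {
	unsigned int syn:1;
	unsigned int ack:1;
	unsigned int fin:1;
};

/*
 * Mirrors the condition in the patch: the socket is in LISTEN state and
 * the segment has SYN set with neither ACK nor FIN.
 */
static bool is_initial_syn(bool listening, const struct fake_tcp_flags *th)
{
	assert(th);
	return listening && th->syn && !th->ack && !th->fin;
}
```

Only segments that pass this test reach tcp_v4_syn_conn_limit() before the socket lock is taken. Everything else continues to the "process:" label as before.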

* [RFC PATCH 0/2] Faster/parallel SYN handling to mitigate SYN floods
From: Jesper Dangaard Brouer @ 2012-05-28 11:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev, Christoph Paasch, Eric Dumazet,
	David S. Miller, Martin Topholm
  Cc: Florian Westphal, opurdila, Hans Schillstrom

The following series is an RFC (Request For Comments) for implementing
a faster and parallel handling of TCP SYN connections, to mitigate SYN
flood attacks.  This is against DaveM's net (f0d1b3c2bc), as net-next
is closed, as DaveM has mentioned numerous times ;-)

Only IPv4 TCP is handled here. The IPv6 TCP code also needs to be
updated, but I'll deal with that part after we have agreed on a
solution for IPv4 TCP.

 Patch 1/2: Is a cleanup, where I split out the SYN cookie handling
  from tcp_v4_conn_request() into tcp_v4_syn_conn_limit().

 Patch 2/2: Move tcp_v4_syn_conn_limit() outside bh_lock_sock() in
  tcp_v4_rcv().  I would like some input on (1) if this is safe without
  the lock, and (2) if we need to do some sock lookup before calling
  tcp_v4_syn_conn_limit() (Christoph Paasch
  <christoph.paasch@uclouvain.be> mentioned something about SYN
  retransmissions)

---

Jesper Dangaard Brouer (2):
      tcp: Early SYN limit and SYN cookie handling to mitigate SYN floods
      tcp: extract syncookie part of tcp_v4_conn_request()


 net/ipv4/tcp_ipv4.c |  131 ++++++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 107 insertions(+), 24 deletions(-)

^ permalink raw reply

* [RFC PATCH 1/2] tcp: extract syncookie part of tcp_v4_conn_request()
From: Jesper Dangaard Brouer @ 2012-05-28 11:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, netdev, Christoph Paasch, Eric Dumazet,
	David S. Miller, Martin Topholm
  Cc: Florian Westphal, opurdila, Hans Schillstrom
In-Reply-To: <20120528115102.12068.79994.stgit@localhost.localdomain>

Move SYN cookie handling from tcp_v4_conn_request() into a separate
function named tcp_v4_syn_conn_limit(). The semantics should be
almost the same.

Besides code cleanup, this patch prepares for handling SYN cookies
at an earlier step, to avoid a spinlock and achieve parallel processing.

Signed-off-by: Martin Topholm <mph@hoth.dk>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---

 net/ipv4/tcp_ipv4.c |  125 +++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 101 insertions(+), 24 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index a43b87d..15958b2 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1268,6 +1268,98 @@ static const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
 };
 #endif
 
+/* Check SYN connect limit and send SYN-ACK cookies
+ * - Return 0 = No limitation needed, continue processing
+ * - Return 1 = Stop processing, free SKB, SYN cookie sent (if enabled)
+ */
+int tcp_v4_syn_conn_limit(struct sock *sk, struct sk_buff *skb)
+{
+	struct request_sock *req;
+	struct inet_request_sock *ireq;
+	struct tcp_options_received tmp_opt;
+	__be32 saddr = ip_hdr(skb)->saddr;
+	__be32 daddr = ip_hdr(skb)->daddr;
+	__u32 isn = TCP_SKB_CB(skb)->when;
+	const u8 *hash_location; /* Not really used */
+
+//	WARN_ON(!tcp_hdr(skb)->syn); /* MUST only be called for SYN req */
+//	WARN_ON(!(sk->sk_state == TCP_LISTEN)); /* On a LISTEN socket */
+
+	/* Never answer to SYNs send to broadcast or multicast */
+	if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
+		goto drop;
+
+	/* If "isn" is not zero, this request hit alive timewait bucket */
+	if (isn)
+		goto no_limit;
+
+	/* Start sending SYN cookies when request sock queue is full*/
+	if (!inet_csk_reqsk_queue_is_full(sk))
+		goto no_limit;
+
+	/* Check if SYN cookies are enabled
+	 * - Side effect: NET_INC_STATS_BH counters + printk logging
+	 */
+	if (!tcp_syn_flood_action(sk, skb, "TCP"))
+		goto drop; /* Not enabled, indicate drop, due to queue full */
+
+	/* Allocate a request_sock */
+	req = inet_reqsk_alloc(&tcp_request_sock_ops);
+	if (!req) {
+		net_warn_ratelimited("%s: Could not alloc request_sock"
+				     ", drop conn from %pI4\n",
+				     __func__, &saddr);
+		goto drop;
+	}
+
+#ifdef CONFIG_TCP_MD5SIG
+	tcp_rsk(req)->af_specific = &tcp_request_sock_ipv4_ops;
+#endif
+
+	tcp_clear_options(&tmp_opt);
+	tmp_opt.mss_clamp = TCP_MSS_DEFAULT;
+	tmp_opt.user_mss  = tcp_sk(sk)->rx_opt.user_mss;
+	tcp_parse_options(skb, &tmp_opt, &hash_location, 0);
+
+	if (!tmp_opt.saw_tstamp)
+		tcp_clear_options(&tmp_opt);
+
+	tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
+	tcp_openreq_init(req, &tmp_opt, skb);
+
+	/* Update req as an inet_request_sock (typecast trick)*/
+	ireq = inet_rsk(req);
+	ireq->loc_addr = daddr;
+	ireq->rmt_addr = saddr;
+	ireq->no_srccheck = inet_sk(sk)->transparent;
+	ireq->opt = tcp_v4_save_options(sk, skb);
+
+	if (security_inet_conn_request(sk, skb, req))
+		goto drop_and_free;
+
+	/* Cookie support for ECN if TCP timestamp option avail */
+	if (tmp_opt.tstamp_ok)
+		TCP_ECN_create_request(req, skb);
+
+	/* Encode cookie in InitialSeqNum of SYN-ACK packet */
+	isn = cookie_v4_init_sequence(sk, skb, &req->mss);
+	req->cookie_ts = tmp_opt.tstamp_ok;
+
+	tcp_rsk(req)->snt_isn = isn;
+	tcp_rsk(req)->snt_synack = tcp_time_stamp;
+
+	/* Send SYN-ACK containing cookie */
+	tcp_v4_send_synack(sk, NULL, req, NULL);
+
+drop_and_free:
+	reqsk_free(req);
+drop:
+	return 1;
+no_limit:
+	return 0;
+}
+
+/* Handle SYN request */
 int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_extend_values tmp_ext;
@@ -1280,22 +1372,11 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	__be32 saddr = ip_hdr(skb)->saddr;
 	__be32 daddr = ip_hdr(skb)->daddr;
 	__u32 isn = TCP_SKB_CB(skb)->when;
-	bool want_cookie = false;
 
 	/* Never answer to SYNs send to broadcast or multicast */
 	if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
 		goto drop;
 
-	/* TW buckets are converted to open requests without
-	 * limitations, they conserve resources and peer is
-	 * evidently real one.
-	 */
-	if (inet_csk_reqsk_queue_is_full(sk) && !isn) {
-		want_cookie = tcp_syn_flood_action(sk, skb, "TCP");
-		if (!want_cookie)
-			goto drop;
-	}
-
 	/* Accept backlog is full. If we have already queued enough
 	 * of warm entries in syn queue, drop request. It is better than
 	 * clogging syn queue with openreqs with exponentially increasing
@@ -1304,6 +1385,10 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1)
 		goto drop;
 
+	/* SYN cookie handling */
+	if (tcp_v4_syn_conn_limit(sk, skb))
+		goto drop;
+
 	req = inet_reqsk_alloc(&tcp_request_sock_ops);
 	if (!req)
 		goto drop;
@@ -1317,6 +1402,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	tmp_opt.user_mss  = tp->rx_opt.user_mss;
 	tcp_parse_options(skb, &tmp_opt, &hash_location, 0);
 
+	/* Handle RFC6013 - TCP Cookie Transactions (TCPCT) options */
 	if (tmp_opt.cookie_plus > 0 &&
 	    tmp_opt.saw_tstamp &&
 	    !tp->rx_opt.cookie_out_never &&
@@ -1339,7 +1425,6 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 		while (l-- > 0)
 			*c++ ^= *hash_location++;
 
-		want_cookie = false;	/* not our kind of cookie */
 		tmp_ext.cookie_out_never = 0; /* false */
 		tmp_ext.cookie_plus = tmp_opt.cookie_plus;
 	} else if (!tp->rx_opt.cookie_in_always) {
@@ -1351,12 +1436,10 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	}
 	tmp_ext.cookie_in_always = tp->rx_opt.cookie_in_always;
 
-	if (want_cookie && !tmp_opt.saw_tstamp)
-		tcp_clear_options(&tmp_opt);
-
 	tmp_opt.tstamp_ok = tmp_opt.saw_tstamp;
 	tcp_openreq_init(req, &tmp_opt, skb);
 
+	/* Update req as an inet_request_sock (typecast trick)*/
 	ireq = inet_rsk(req);
 	ireq->loc_addr = daddr;
 	ireq->rmt_addr = saddr;
@@ -1366,13 +1449,9 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (security_inet_conn_request(sk, skb, req))
 		goto drop_and_free;
 
-	if (!want_cookie || tmp_opt.tstamp_ok)
-		TCP_ECN_create_request(req, skb);
+	TCP_ECN_create_request(req, skb);
 
-	if (want_cookie) {
-		isn = cookie_v4_init_sequence(sk, skb, &req->mss);
-		req->cookie_ts = tmp_opt.tstamp_ok;
-	} else if (!isn) {
+	if (!isn) { /* Timewait bucket handling */
 		struct inet_peer *peer = NULL;
 		struct flowi4 fl4;
 
@@ -1422,8 +1501,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	tcp_rsk(req)->snt_synack = tcp_time_stamp;
 
 	if (tcp_v4_send_synack(sk, dst, req,
-			       (struct request_values *)&tmp_ext) ||
-	    want_cookie)
+			       (struct request_values *)&tmp_ext))
 		goto drop_and_free;
 
 	inet_csk_reqsk_queue_hash_add(sk, req, TCP_TIMEOUT_INIT);
@@ -1438,7 +1516,6 @@ drop:
 }
 EXPORT_SYMBOL(tcp_v4_conn_request);
 
-
 /*
  * The three way handshake has completed - we got a valid synack -
  * now create the new socket.

^ permalink raw reply related
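The decision order inside tcp_v4_syn_conn_limit() from patch 1/2 can be sketched as a pure function. This is a simplified user-space model under stated assumptions: the struct, the enum and the cookie_sent out-parameter are hypothetical, and the request_sock allocation, option parsing and security hook (which can also end in "stop") are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Matches the return convention documented in the patch: 0 or 1. */
enum syn_verdict {
	SYN_CONTINUE = 0,	/* no limitation needed, continue processing */
	SYN_STOP = 1,		/* stop processing, caller frees the skb */
};

/* Hypothetical snapshot of the inputs the function examines. */
struct fake_syn_state {
	bool to_bcast_or_mcast;	/* SYN was sent to broadcast/multicast */
	unsigned int isn;	/* nonzero: hit a live timewait bucket */
	bool queue_full;	/* request sock queue is full */
	bool cookies_enabled;	/* result of the SYN flood action check */
};

static enum syn_verdict syn_conn_limit(const struct fake_syn_state *s,
				       bool *cookie_sent)
{
	assert(s && cookie_sent);
	*cookie_sent = false;

	if (s->to_bcast_or_mcast)
		return SYN_STOP;	/* never answer such SYNs */
	if (s->isn)
		return SYN_CONTINUE;	/* timewait hit, no limit applied */
	if (!s->queue_full)
		return SYN_CONTINUE;	/* normal SYN path handles it */
	if (!s->cookies_enabled)
		return SYN_STOP;	/* queue full and no cookies: drop */

	*cookie_sent = true;		/* SYN-ACK carrying the cookie */
	return SYN_STOP;
}
```

The model makes one property easy to see: SYN_STOP covers both "dropped" and "answered with a cookie", so the caller cannot tell the two apart from the return value alone.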

* Re: [PATCH RFC] virtio-net: remove useless disable on freeze
From: Michael S. Tsirkin @ 2012-05-28 12:53 UTC (permalink / raw)
  To: netdev; +Cc: Amit Shah, linux-kernel, kvm, virtualization
In-Reply-To: <20120404091954.GA3776@redhat.com>

On Wed, Apr 04, 2012 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> disable_cb is just an optimization: it
> can not guarantee that there are no callbacks.
> 
> I didn't yet figure out whether a callback
> in freeze will trigger a bug, but disable_cb
> won't address it in any case. So let's remove
> the useless calls as a first step.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Looks like this isn't in the 3.5 pull request -
just lost in the shuffle?
disable_cb is advisory so can't be relied upon.

> ---
>  drivers/net/virtio_net.c |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 019da01..971931e5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1182,11 +1182,6 @@ static int virtnet_freeze(struct virtio_device *vdev)
>  {
>  	struct virtnet_info *vi = vdev->priv;
>  
> -	virtqueue_disable_cb(vi->rvq);
> -	virtqueue_disable_cb(vi->svq);
> -	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ))
> -		virtqueue_disable_cb(vi->cvq);
> -
>  	netif_device_detach(vi->dev);
>  	cancel_delayed_work_sync(&vi->refill);
>  
> -- 
> 1.7.9.111.gf3fb0

^ permalink raw reply

* [PATCH] 9p: BUG before corrupting memory
From: Sasha Levin @ 2012-05-28 16:00 UTC (permalink / raw)
  To: davem, ericvh, aneesh.kumar, jvrao; +Cc: netdev, linux-kernel, Sasha Levin

The BUG_ON() in pack_sg_list() would only trigger after sg_set_buf() had
already written to an invalid sg entry and corrupted memory. Move the check
before the write and use >= so that index == limit is caught as well.

I'm still working on figuring out why I manage to trigger that bug...

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/9p/trans_virtio.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 5af18d1..2fd7305 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -192,10 +192,10 @@ static int pack_sg_list(struct scatterlist *sg, int start,
 		s = rest_of_page(data);
 		if (s > count)
 			s = count;
+		BUG_ON(index >= limit);
 		sg_set_buf(&sg[index++], data, s);
 		count -= s;
 		data += s;
-		BUG_ON(index > limit);
 	}
 
 	return index-start;
-- 
1.7.8.6

^ permalink raw reply related
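A small user-space sketch of the ordering the 9p patch above is about: validate the index before writing the entry, not after. The names are hypothetical, a fixed chunk_size stands in for rest_of_page(), and the function returns -1 where the kernel code uses BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct chunk {
	const char *buf;
	size_t len;
};

/*
 * Split data into pieces of at most chunk_size bytes and record them in
 * out[start..limit-1]. Returns the number of entries written, or -1 if
 * the data would not fit.
 */
static int pack_chunks(struct chunk *out, int start, int limit,
		       const char *data, size_t count, size_t chunk_size)
{
	int index = start;

	assert(out && chunk_size > 0);
	while (count) {
		size_t s = count < chunk_size ? count : chunk_size;

		/* Check first: out[limit] is already out of bounds. */
		if (index >= limit)
			return -1;

		out[index].buf = data;
		out[index].len = s;
		index++;
		count -= s;
		data += s;
	}

	return index - start;
}
```

With the check placed after the write and written as "index > limit", the entry at out[limit] would be written once before anything is reported.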

* some questions on virtual machine bridging.
From: Luigi Rizzo @ 2012-05-28 16:13 UTC (permalink / raw)
  To: netdev

I am doing some experiments with implementing a software bridge
between virtual machines, using netmap as the communication API.

I have a first prototype up and running and it is quite fast (10 Mpps
with 60-byte frames, 4 Mpps with 1500 byte frames, compared to the
~500-800Kpps @60 bytes that you get with the tap interface used by
openvswitch or the native linux bridging).

I was wondering if anyone has comments/suggestions on the following:

* what kind of API is used by the various virtualization solutions to
  do virtual machine switching ?
  - On linux, kvm seems to rely on "tap" interfaces and native linux
    bridging, which i believe is more or less the same solution used
    by FreeBSD.
  - Slightly less efficient is perhaps the use of a socket
    and multicast packets, or bpf.
  - and of course, using PCI passthrough you get more or less hw speed
    (constrained by the OS), but need support from an external switch
    or the NIC itself to do forwarding between different ports.
  anything else ?

* any high-performance virtual switching solution around ?
  As mentioned, i have measured native linux bridging and in-kernel ovs
  and the numbers are above (not surprising; the tap involves a syscall
  on each packet if i am not mistaken, and internally you need a
  data copy)

* how many ports should i support ?

* the hash function normally used for bridging (both in Linux and
  in FreeBSD -- see the latter below) is one of the Jenkins functions.
  It seems to take about 20ns to compute on my machine, which is a
  non-negligible amount of time (haven't tried to optimize it).
  Any reference on why this is so popular ?

cheers
luigi

--- below, the hash function used by FreeBSD bridging ---
/*
 * The following hash function is adapted from "Hash Functions" by Bob Jenkins
 * ("Algorithm Alley", Dr. Dobbs Journal, September 1997).
 *
 * http://www.burtleburtle.net/bob/hash/spooky.html
 */
#define mix(a, b, c)                                                    \
do {                                                                    \
        a -= b; a -= c; a ^= (c >> 13);                                 \
        b -= c; b -= a; b ^= (a << 8);                                  \
        c -= a; c -= b; c ^= (b >> 13);                                 \
        a -= b; a -= c; a ^= (c >> 12);                                 \
        b -= c; b -= a; b ^= (a << 16);                                 \
        c -= a; c -= b; c ^= (b >> 5);                                  \
        a -= b; a -= c; a ^= (c >> 3);                                  \
        b -= c; b -= a; b ^= (a << 10);                                 \
        c -= a; c -= b; c ^= (b >> 15);                                 \
} while (/*CONSTCOND*/0)

static __inline uint32_t
nm_bridge_rthash(const uint8_t *addr)
{
        uint32_t a = 0x9e3779b9, b = 0x9e3779b9, c = 0; // hash key

        b += addr[5] << 8;
        b += addr[4];
        a += addr[3] << 24;
        a += addr[2] << 16;
        a += addr[1] << 8;
        a += addr[0];

        mix(a, b, c);
#define BRIDGE_RTHASH_MASK      (NM_BDG_HASH-1)
        return (c & BRIDGE_RTHASH_MASK);
}
-----------------------------------------------------------------

^ permalink raw reply
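For anyone who wants to experiment with the hash quoted above outside the kernel, here is a user-space variant that compiles on its own. Two things are assumptions rather than part of the original mail: NM_BDG_HASH is set to 1024 (any power of two works with the mask), and the address bytes are cast to uint32_t before shifting, because addr[3] << 24 on a promoted int can overflow when the byte is 0x80 or larger.

```c
#include <assert.h>
#include <stdint.h>

#define NM_BDG_HASH		1024	/* assumed table size, power of two */
#define BRIDGE_RTHASH_MASK	(NM_BDG_HASH - 1)

/* Same mixing steps as in the quoted code. */
#define mix(a, b, c)					\
do {							\
	a -= b; a -= c; a ^= (c >> 13);			\
	b -= c; b -= a; b ^= (a << 8);			\
	c -= a; c -= b; c ^= (b >> 13);			\
	a -= b; a -= c; a ^= (c >> 12);			\
	b -= c; b -= a; b ^= (a << 16);			\
	c -= a; c -= b; c ^= (b >> 5);			\
	a -= b; a -= c; a ^= (c >> 3);			\
	b -= c; b -= a; b ^= (a << 10);			\
	c -= a; c -= b; c ^= (b >> 15);			\
} while (0)

/* Hash a 6-byte MAC address into a bucket index below NM_BDG_HASH. */
static inline uint32_t nm_bridge_rthash(const uint8_t *addr)
{
	uint32_t a = 0x9e3779b9, b = 0x9e3779b9, c = 0;

	assert(addr);
	b += (uint32_t)addr[5] << 8;
	b += addr[4];
	a += (uint32_t)addr[3] << 24;
	a += (uint32_t)addr[2] << 16;
	a += (uint32_t)addr[1] << 8;
	a += addr[0];

	mix(a, b, c);
	return c & BRIDGE_RTHASH_MASK;
}
```

Calling nm_bridge_rthash() in a loop over a set of MAC addresses and timing it with clock_gettime() would be one way to reproduce the per-call cost mentioned above.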

* Wrong usage of hash in L2TP leading to NULL ptr derefs
From: Sasha Levin @ 2012-05-28 16:12 UTC (permalink / raw)
  To: Eric Dumazet, David Miller, jchapman
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org

Hi all,

Looking at net/l2tp/l2tp_ip{6}.c, l2tp uses UDP for communications, but
uses inet_hash and inet_unhash for hashing - which appears to be wrong
(and causes NULL ptr derefs during runtime).

Since I'm not too familiar with the protocol, I'm not sure if the right
fix would be to switch it to use the UDP hashing code, or to actually
initialize everything inet_hash() expects so the current hashing would
work properly.

Help appreciated!

Thanks,
Sasha

^ permalink raw reply

* Re: [RFC PATCH 0/2] Faster/parallel SYN handling to mitigate SYN floods
From: Christoph Paasch @ 2012-05-28 16:14 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, Eric Dumazet, David S. Miller, Martin Topholm,
	Florian Westphal, opurdila, Hans Schillstrom
In-Reply-To: <20120528115102.12068.79994.stgit@localhost.localdomain>

Hello,

On 05/28/2012 01:52 PM, Jesper Dangaard Brouer wrote:
> The following series is a RFC (Request For Comments) for implementing
> a faster and parallel handling of TCP SYN connections, to mitigate SYN
> flood attacks.  This is against DaveM's net (f0d1b3c2bc), as net-next
> is closed, as DaveM has mentioned numerous times ;-)
> 
> Only IPv4 TCP is handled here. The IPv6 TCP code also need to be
> updated, but I'll deal with that part after we have agreed on a
> solution for IPv4 TCP.
> 
>  Patch 1/2: Is a cleanup, where I split out the SYN cookie handling
>   from tcp_v4_conn_request() into tcp_v4_syn_conn_limit().
> 
>  Patch 2/2: Move tcp_v4_syn_conn_limit() outside bh_lock_sock() in
>   tcp_v4_rcv().  I would like some input on, (1) if this safe without
>   the lock, (2) if we need to do some sock lookup, before calling
>   tcp_v4_syn_conn_limit() (Christoph Paasch
>   <christoph.paasch@uclouvain.be> mentioned something about SYN
>   retransmissions)

Concerning (1):
I think there are places where you may have trouble because you don't
hold the lock.
E.g., in tcp_make_synack (called by tcp_v4_send_synack from your
tcp_v4_syn_conn_limit) there is:

if (sk->sk_userlocks & SOCK_RCVBUF_LOCK &&
	(req->window_clamp > tcp_full_space(sk) ||
	 req->window_clamp == 0))
	req->window_clamp = tcp_full_space(sk);

Thus, tcp_full_space(sk) may have different values between the check and
setting req->window_clamp.
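
To make the concern concrete, here is a user-space sketch of the check-then-set pattern with the shared value read only once into a local. It is not a proposed kernel fix. The reader callback, the demo struct and the function names are hypothetical, and the SOCK_RCVBUF_LOCK part of the condition is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for reading a value another CPU may change. */
typedef uint32_t (*space_reader)(void *ctx);

/*
 * Clamp the window against a single snapshot of the available space, so
 * the value that is compared is also the value that is stored.
 */
static uint32_t clamp_window_once(uint32_t window_clamp,
				  space_reader read_space, void *ctx)
{
	uint32_t space;

	assert(read_space);
	space = read_space(ctx);	/* one read only */
	if (window_clamp > space || window_clamp == 0)
		window_clamp = space;
	return window_clamp;
}

/* Demo reader: pretends the space shrinks by 500 on every read. */
struct shrinking {
	uint32_t space;
	unsigned int reads;
};

static uint32_t read_shrinking(void *ctx)
{
	struct shrinking *s = ctx;
	uint32_t now = s->space;

	s->reads++;
	if (s->space >= 500)
		s->space -= 500;
	return now;
}
```

With two separate reads, as in the snippet quoted from tcp_make_synack, the demo reader would return 1000 for the comparison and 500 for the assignment.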


Concerning (2):

Imagine a SYN coming in when the reqsk-queue is not yet full. A
request-sock will be added to the reqsk-queue. Then a retransmission of
this SYN comes in, and the queue has become full in the meantime. This time
tcp_v4_syn_conn_limit will do syn-cookies and thus generate a different
seq-number for the SYN/ACK.


But I don't see how you could fix these issues in your proposed framework.

Cheers,
Christoph

> 
> ---
> 
> Jesper Dangaard Brouer (2):
>       tcp: Early SYN limit and SYN cookie handling to mitigate SYN floods
>       tcp: extract syncookie part of tcp_v4_conn_request()
> 
> 
>  net/ipv4/tcp_ipv4.c |  131 ++++++++++++++++++++++++++++++++++++++++++---------
>  1 files changed, 107 insertions(+), 24 deletions(-)
> 


-- 
Christoph Paasch
PhD Student

IP Networking Lab --- http://inl.info.ucl.ac.be
MultiPath TCP in the Linux Kernel --- http://mptcp.info.ucl.ac.be
Université Catholique de Louvain
-- 

^ permalink raw reply

* Re: Wrong usage of hash in L2TP leading to NULL ptr derefs
From: James Chapman @ 2012-05-28 16:19 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Eric Dumazet, David Miller, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <1338221539.4284.25.camel@lappy>

On 28/05/12 17:12, Sasha Levin wrote:
> Hi all,
> 
> Looking at net/l2tp/l2tp_ip{6}.c, l2tp uses UDP for communications, but
> uses inet_hash and inet_unhash for hashing - which appears to be wrong
> (and causes NULL ptr derefs during runtime).

L2TPv3 also supports IP encapsulation, which is L2TP directly in IP, no
UDP. That's what the l2tp_ip[6] code implements.

Can you post an oops with steps for how to reproduce it?


-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

^ permalink raw reply

* Re: Wrong usage of hash in L2TP leading to NULL ptr derefs
From: Sasha Levin @ 2012-05-28 17:21 UTC (permalink / raw)
  To: James Chapman
  Cc: Eric Dumazet, David Miller, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <4FC3A599.1040909@katalix.com>

Hi James,

On Mon, 2012-05-28 at 17:19 +0100, James Chapman wrote:
> On 28/05/12 17:12, Sasha Levin wrote:
> > Hi all,
> > 
> > Looking at net/l2tp/l2tp_ip{6}.c, l2tp uses UDP for communications, but
> > uses inet_hash and inet_unhash for hashing - which appears to be wrong
> > (and causes NULL ptr derefs during runtime).
> 
> L2TPv3 also supports IP encapsulation, which is L2TP directly in IP, no
> UDP. That's what the l2tp_ip[6] code implements.

Hm... Odd, I thought it used UDP because I saw it using udp_disconnect,
udp_ioctl, and friends...

> Can you post an oops with steps for how to reproduce it?

Sure!

The code is pretty simple:

#include <linux/l2tp.h>
#include <sys/types.h>
#include <sys/socket.h>

int main(void)
{
        struct sockaddr addr = { .sa_family = AF_UNSPEC };

        /* Connecting with AF_UNSPEC takes the disconnect path in the kernel. */
        connect(socket(AF_INET, SOCK_DGRAM, IPPROTO_L2TP), &addr, sizeof(addr));
        return 0;
}


And the BUG it produces:

[   18.161780] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[   18.162025] IP: [<ffffffff82e133b0>] inet_unhash+0x50/0xd0
[   18.162025] PGD 4066e067 PUD 40661067 PMD 0 
[   18.162025] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   18.162025] CPU 1 
[   18.162025] Pid: 5821, comm: a.out Tainted: G        W    3.4.0-next-20120528-sasha-00009-gd406307 #309 Bochs Bochs
[   18.162025] RIP: 0010:[<ffffffff82e133b0>]  [<ffffffff82e133b0>] inet_unhash+0x50/0xd0
[   18.162025] RSP: 0018:ffff88001989be28  EFLAGS: 00010293
[   18.162025] RAX: 0000000000000000 RBX: ffff8800407a8000 RCX: 0000000000000000
[   18.162025] RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff8800407a8000
[   18.162025] RBP: ffff88001989be38 R08: 0000000000000000 R09: 0000000000000001
[   18.162025] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8800407a8000
[   18.162025] R13: ffff88001989bec8 R14: 00007fff79818700 R15: 0000000000000000
[   18.162025] FS:  00007f312ff36700(0000) GS:ffff88001b800000(0000) knlGS:0000000000000000
[   18.162025] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   18.162025] CR2: 0000000000000014 CR3: 0000000040793000 CR4: 00000000000407e0
[   18.162025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   18.162025] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   18.162025] Process a.out (pid: 5821, threadinfo ffff88001989a000, task ffff880019c18000)
[   18.162025] Stack:
[   18.162025]  ffff8800407a8000 0000000000000000 ffff88001989be78 ffffffff82e3a249
[   18.162025]  ffffffff82e3a050 ffff88001989bec8 ffff88001989be88 ffff8800407a8000
[   18.162025]  0000000000000010 ffff88001989bec8 ffff88001989bea8 ffffffff82e42639
[   18.162025] Call Trace:
[   18.162025]  [<ffffffff82e3a249>] udp_disconnect+0x1f9/0x290
[   18.162025]  [<ffffffff82e3a050>] ? udp_rcv+0x20/0x20
[   18.162025]  [<ffffffff82e42639>] inet_dgram_connect+0x29/0x80
[   18.162025]  [<ffffffff82d00645>] ? move_addr_to_kernel+0x35/0x80
[   18.162025]  [<ffffffff82d012fc>] sys_connect+0x9c/0x100
[   18.162025]  [<ffffffff8325c319>] ? retint_swapgs+0x13/0x1b
[   18.162025]  [<ffffffff81968f7e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   18.162025]  [<ffffffff8325cbf9>] system_call_fastpath+0x16/0x1b
[   18.162025] Code: fa 0a 75 27 48 8b 57 30 0f b7 8f 2a 05 00 00 48 c1 ea 06 8d 14 11 83 e2 1f 48 8d 14 92 48 c1 e2 04 48 8d 5c 10 40 eb 15 0f 1f 00 <8b> 50 14 23 57 08 48 8d 1c d2 48 c1 e3 03 48 03 58 08 48 89 df 
[   18.162025] RIP  [<ffffffff82e133b0>] inet_unhash+0x50/0xd0
[   18.162025]  RSP <ffff88001989be28>
[   18.162025] CR2: 0000000000000014
[   18.409434] ---[ end trace 8f6ca168297608b9 ]---

^ permalink raw reply

* [PATCH resend] rds_rdma: don't assume infiniband device is PCI
From: Thadeu Lima de Souza Cascardo @ 2012-05-28 18:52 UTC (permalink / raw)
  To: Venkat Venkatsubra
  Cc: netdev, David S. Miller, Thadeu Lima de Souza Cascardo, dledford,
	Jes.Sorensen

RDS code assumes that the struct ib_device dma_device member, which is a
pointer, points to a struct device embedded in a struct pci_dev.

This is not the case for ehca, for example, which is an OF driver and
makes dma_device point to a struct device embedded in a struct
platform_device.

This will make the system crash when rds_rdma is loaded in a system
with ehca, since it will try to access the bus member of a non-existent
struct pci_dev.

The only reason rds_rdma uses the struct pci_dev is to get the NUMA node
the device is attached to. Using dev_to_node for that is much better,
since it won't assume which bus the infiniband is attached to.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: dledford@redhat.com
Cc: Jes.Sorensen@redhat.com
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
---

Hi, Venkat.

This patch is still not applied. Can you give your Ack?

Regards.
Cascardo.

---
 net/rds/ib.h |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index edfaaaf..8d2b3d5 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -186,8 +186,7 @@ struct rds_ib_device {
 	struct work_struct	free_work;
 };
 
-#define pcidev_to_node(pcidev) pcibus_to_node(pcidev->bus)
-#define ibdev_to_node(ibdev) pcidev_to_node(to_pci_dev(ibdev->dma_device))
+#define ibdev_to_node(ibdev) dev_to_node(ibdev->dma_device)
 #define rdsibdev_to_node(rdsibdev) ibdev_to_node(rdsibdev->dev)
 
 /* bits for i_ack_flags */
-- 
1.7.4.4

^ permalink raw reply related
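A user-space sketch of why the rds_rdma patch above prefers dev_to_node(): the node lives in the generic device itself, so reading it does not depend on which bus-specific structure embeds that device. All fake_* types and helpers are hypothetical simplifications of the kernel structures, and the container_of-style macro is only valid for a device that really is embedded in a fake_pci_dev.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified versions of the kernel structures. */
struct fake_device {
	int numa_node;
};

struct fake_pci_bus {
	int node;
};

struct fake_pci_dev {
	struct fake_pci_bus *bus;
	struct fake_device dev;
};

struct fake_platform_dev {
	const char *name;
	struct fake_device dev;
};

/* Only meaningful when d is the dev member of a fake_pci_dev. */
#define fake_to_pci_dev(d) \
	((struct fake_pci_dev *)((char *)(d) - offsetof(struct fake_pci_dev, dev)))

/* Old approach: goes through the PCI container, so it assumes PCI. */
static int fake_pcidev_to_node(struct fake_device *d)
{
	assert(d);
	return fake_to_pci_dev(d)->bus->node;
}

/* New approach: works for any embedding, no bus assumption. */
static int fake_dev_to_node(const struct fake_device *d)
{
	assert(d);
	return d->numa_node;
}
```

Passing the device of a fake_platform_dev to fake_pcidev_to_node() would compute a pointer to a fake_pci_dev that does not exist, which is the crash described in the commit message.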

* Re: [PATCH resend] rds_rdma: don't assume infiniband device is PCI
From: Venkat Venkatsubra @ 2012-05-28 20:15 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: netdev, David S. Miller, dledford, Jes.Sorensen
In-Reply-To: <1338231125-9005-1-git-send-email-cascardo@linux.vnet.ibm.com>

On 5/28/2012 1:52 PM, Thadeu Lima de Souza Cascardo wrote:
> RDS code assumes that the struct ib_device dma_device member, which is a
> pointer, points to a struct device embedded in a struct pci_dev.
>
> This is not the case for ehca, for example, which is a OF driver, and
> makes dma_device point to a struct device embedded in a struct
> platform_device.
>
> This will make the system crash when rds_rdma is loaded in a system
> with ehca, since it will try to access the bus member of a non-existent
> struct pci_dev.
>
> The only reason rds_rdma uses the struct pci_dev is to get the NUMA node
> the device is attached to. Using dev_to_node for that is much better,
> since it won't assume which bus the infiniband is attached to.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo<cascardo@linux.vnet.ibm.com>
> Cc: dledford@redhat.com
> Cc: Jes.Sorensen@redhat.com
> Cc: Venkat Venkatsubra<venkat.x.venkatsubra@oracle.com>
> ---
>
Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>

^ permalink raw reply
