* Re: [PATCH] vxlan: Fix error that was resulting in VXLAN MTU size being 10 bytes too large
From: Joseph Glanville @ 2012-12-03 15:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, alexander.h.duyck, netdev
In-Reply-To: <20121119080351.477525b3@nehalam.linuxnetplumber.net>
On 20 November 2012 03:03, Stephen Hemminger <shemminger@vyatta.com> wrote:
> On Mon, 19 Nov 2012 22:33:50 +1100
> Joseph Glanville <joseph.glanville@orionvm.com.au> wrote:
>
>> On 14 November 2012 08:33, Stephen Hemminger <shemminger@vyatta.com> wrote:
>> > On Tue, 13 Nov 2012 14:37:19 -0500 (EST)
>> > David Miller <davem@davemloft.net> wrote:
>> >
>> >> From: Alexander Duyck <alexander.h.duyck@intel.com>
>> >> Date: Fri, 09 Nov 2012 15:35:24 -0800
>> >>
>> >> > This change fixes an issue I found where VXLAN frames were fragmented when
>> >> > they were up to the VXLAN MTU size. I root caused the issue to the fact that
>> >> > the headroom was 4 + 20 + 8 + 8. This math doesn't appear to be correct
>> >> > because we are not inserting a VLAN header, but instead a 2nd Ethernet header.
>> >> > As such the math for the overhead should be 20 + 8 + 8 + 14 to account for the
>> >> > extra headers that are inserted for VXLAN.
>> >> >
>> >> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> >>
>> >> Applied, thanks for the detailed commit message.
>> >
>> > Probably need smarter code there to look at header length requirement
>> > of underlying device as well, maybe someone will be perverse and runn
>> > vxlan over a tunnel or IPoIB.
>>
>> Forgive my ignorance but why would running VXLAN on IPoIB require
>> special header handling? (and would it work or behave strangely?)
>>
>> I was planning on giving this a go when 3.7 is released but I might do
>> that sooner if problems are anticipated.
>>
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe netdev" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> Joseph.
>>
>
> Some lower layers require bigger (or smaller headers). As it was, vxlan
> was only allocating skb with a fixed amount of headroom. This would lead to
> lower layers having to copy the skb.
>
> My suggestion has already been addressed by a later patch.
Hi,
I have tested VXLAN on IPoIB and it works perfectly. :)
Joseph.
--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846
^ permalink raw reply
* Re: [Suggestion] net/atm : for sprintf, need check the total write length whether larger than a page.
From: chas williams - CONTRACTOR @ 2012-12-03 15:48 UTC (permalink / raw)
To: Chen Gang
Cc: David.Woodhouse, David Miller, krzysiek, Joe Perches, edumazet,
netdev
In-Reply-To: <50BC6958.6080803@asianux.com>
yes this seems like it should be done. maybe this week i will try to
put something together unless you already have a patch somewhere.
On Mon, 03 Dec 2012 16:56:56 +0800
Chen Gang <gang.chen@asianux.com> wrote:
> Hello Maintainers:
>
> was this suggestion replied ? (it seems not).
>
> and please help to check whether this suggestion is valid.
>
> thanks.
>
> gchen.
>
>
> 于 2012年11月21日 12:29, Chen Gang 写道:
> > Hello David Miller:
> >
> > in net/atm/atm_sysfs.c:
> > suggest to check the write length whether larger than a page.
> > the length of parameter buf is one page size (reference: fill_read_buffer at fs/sysfs/file.c)
> > and the count of atm adresses are not limited (reference: atm_dev_ioctl -> atm_add_addr)
> >
> > thanks.
> >
> > gchen.
> >
> > 34 static ssize_t show_atmaddress(struct device *cdev,
> > 35 struct device_attribute *attr, char *buf)
> > 36 {
> > 37 unsigned long flags;
> > 38 char *pos = buf;
> > 39 struct atm_dev *adev = to_atm_dev(cdev);
> > 40 struct atm_dev_addr *aaddr;
> > 41 int bin[] = { 1, 2, 10, 6, 1 }, *fmt = bin;
> > 42 int i, j;
> > 43
> > 44 spin_lock_irqsave(&adev->lock, flags);
> > 45 list_for_each_entry(aaddr, &adev->local, entry) {
> > 46 for (i = 0, j = 0; i < ATM_ESA_LEN; ++i, ++j) {
> > 47 if (j == *fmt) {
> > 48 pos += sprintf(pos, ".");
> > 49 ++fmt;
> > 50 j = 0;
> > 51 }
> > 52 pos += sprintf(pos, "%02x",
> > 53 aaddr->addr.sas_addr.prv[i]);
> > 54 }
> > 55 pos += sprintf(pos, "\n");
> > 56 }
> > 57 spin_unlock_irqrestore(&adev->lock, flags);
> > 58
> > 59 return pos - buf;
> > 60 }
> > 61
> >
> >
> >
> > in net/atm/addr.c
> >
> > 67 int atm_add_addr(struct atm_dev *dev, const struct sockaddr_atmsvc *addr,
> > 68 enum atm_addr_type_t atype)
> > 69 {
> > 70 unsigned long flags;
> > 71 struct atm_dev_addr *this;
> > 72 struct list_head *head;
> > 73 int error;
> > 74
> > 75 error = check_addr(addr);
> > 76 if (error)
> > 77 return error;
> > 78 spin_lock_irqsave(&dev->lock, flags);
> > 79 if (atype == ATM_ADDR_LECS)
> > 80 head = &dev->lecs;
> > 81 else
> > 82 head = &dev->local;
> > 83 list_for_each_entry(this, head, entry) {
> > 84 if (identical(&this->addr, addr)) {
> > 85 spin_unlock_irqrestore(&dev->lock, flags);
> > 86 return -EEXIST;
> > 87 }
> > 88 }
> > 89 this = kmalloc(sizeof(struct atm_dev_addr), GFP_ATOMIC);
> > 90 if (!this) {
> > 91 spin_unlock_irqrestore(&dev->lock, flags);
> > 92 return -ENOMEM;
> > 93 }
> > 94 this->addr = *addr;
> > 95 list_add(&this->entry, head);
> > 96 spin_unlock_irqrestore(&dev->lock, flags);
> > 97 if (head == &dev->local)
> > 98 notify_sigd(dev);
> > 99 return 0;
> > 100 }
> > 101
> >
> >
> > in net/atm/resources.c
> >
> > 195 int atm_dev_ioctl(unsigned int cmd, void __user *arg, int compat)
> > 196 {
> > 197 void __user *buf;
> > 198 int error, len, number, size = 0;
> > 199 struct atm_dev *dev;
> > 200 struct list_head *p;
> > 201 int *tmp_buf, *tmp_p;
> > 202 int __user *sioc_len;
> > 203 int __user *iobuf_len;
> > 204
> > 205 #ifndef CONFIG_COMPAT
> > 206 compat = 0; /* Just so the compiler _knows_ */
> > 207 #endif
> > 208
> > 209 switch (cmd) {
> > 210 case ATM_GETNAMES:
> > 211 if (compat) {
> > 212 #ifdef CONFIG_COMPAT
> > 213 struct compat_atm_iobuf __user *ciobuf = arg;
> > 214 compat_uptr_t cbuf;
> > 215 iobuf_len = &ciobuf->length;
> > 216 if (get_user(cbuf, &ciobuf->buffer))
> > 217 return -EFAULT;
> > 218 buf = compat_ptr(cbuf);
> > 219 #endif
> > 220 } else {
> > 221 struct atm_iobuf __user *iobuf = arg;
> > 222 iobuf_len = &iobuf->length;
> > 223 if (get_user(buf, &iobuf->buffer))
> > 224 return -EFAULT;
> > 225 }
> > 226 if (get_user(len, iobuf_len))
> > 227 return -EFAULT;
> > 228 mutex_lock(&atm_dev_mutex);
> > 229 list_for_each(p, &atm_devs)
> > 230 size += sizeof(int);
> > 231 if (size > len) {
> > 232 mutex_unlock(&atm_dev_mutex);
> > 233 return -E2BIG;
> > 234 }
> > 235 tmp_buf = kmalloc(size, GFP_ATOMIC);
> > 236 if (!tmp_buf) {
> > 237 mutex_unlock(&atm_dev_mutex);
> > 238 return -ENOMEM;
> > 239 }
> > 240 tmp_p = tmp_buf;
> > 241 list_for_each(p, &atm_devs) {
> > 242 dev = list_entry(p, struct atm_dev, dev_list);
> > 243 *tmp_p++ = dev->number;
> > 244 }
> > 245 mutex_unlock(&atm_dev_mutex);
> > 246 error = ((copy_to_user(buf, tmp_buf, size)) ||
> > 247 put_user(size, iobuf_len))
> > 248 ? -EFAULT : 0;
> > 249 kfree(tmp_buf);
> > 250 return error;
> > 251 default:
> > 252 break;
> > 253 }
> > 254
> > 255 if (compat) {
> > 256 #ifdef CONFIG_COMPAT
> > 257 struct compat_atmif_sioc __user *csioc = arg;
> > 258 compat_uptr_t carg;
> > 259
> > 260 sioc_len = &csioc->length;
> > 261 if (get_user(carg, &csioc->arg))
> > 262 return -EFAULT;
> > 263 buf = compat_ptr(carg);
> > 264
> > 265 if (get_user(len, &csioc->length))
> > 266 return -EFAULT;
> > 267 if (get_user(number, &csioc->number))
> > 268 return -EFAULT;
> > 269 #endif
> > 270 } else {
> > 271 struct atmif_sioc __user *sioc = arg;
> > 272
> > 273 sioc_len = &sioc->length;
> > 274 if (get_user(buf, &sioc->arg))
> > 275 return -EFAULT;
> > 276 if (get_user(len, &sioc->length))
> > 277 return -EFAULT;
> > 278 if (get_user(number, &sioc->number))
> > 279 return -EFAULT;
> > 280 }
> > 281
> > 282 dev = try_then_request_module(atm_dev_lookup(number), "atm-device-%d",
> > 283 number);
> > 284 if (!dev)
> > 285 return -ENODEV;
> > 286
> > 287 switch (cmd) {
> > 288 case ATM_GETTYPE:
> > 289 size = strlen(dev->type) + 1;
> > 290 if (copy_to_user(buf, dev->type, size)) {
> > 291 error = -EFAULT;
> > 292 goto done;
> > 293 }
> > 294 break;
> > 295 case ATM_GETESI:
> > 296 size = ESI_LEN;
> > 297 if (copy_to_user(buf, dev->esi, size)) {
> > 298 error = -EFAULT;
> > 299 goto done;
> > 300 }
> > 301 break;
> > 302 case ATM_SETESI:
> > 303 {
> > 304 int i;
> > 305
> > 306 for (i = 0; i < ESI_LEN; i++)
> > 307 if (dev->esi[i]) {
> > 308 error = -EEXIST;
> > 309 goto done;
> > 310 }
> > 311 }
> > 312 /* fall through */
> > 313 case ATM_SETESIF:
> > 314 {
> > 315 unsigned char esi[ESI_LEN];
> > 316
> > 317 if (!capable(CAP_NET_ADMIN)) {
> > 318 error = -EPERM;
> > 319 goto done;
> > 320 }
> > 321 if (copy_from_user(esi, buf, ESI_LEN)) {
> > 322 error = -EFAULT;
> > 323 goto done;
> > 324 }
> > 325 memcpy(dev->esi, esi, ESI_LEN);
> > 326 error = ESI_LEN;
> > 327 goto done;
> > 328 }
> > 329 case ATM_GETSTATZ:
> > 330 if (!capable(CAP_NET_ADMIN)) {
> > 331 error = -EPERM;
> > 332 goto done;
> > 333 }
> > 334 /* fall through */
> > 335 case ATM_GETSTAT:
> > 336 size = sizeof(struct atm_dev_stats);
> > 337 error = fetch_stats(dev, buf, cmd == ATM_GETSTATZ);
> > 338 if (error)
> > 339 goto done;
> > 340 break;
> > 341 case ATM_GETCIRANGE:
> > 342 size = sizeof(struct atm_cirange);
> > 343 if (copy_to_user(buf, &dev->ci_range, size)) {
> > 344 error = -EFAULT;
> > 345 goto done;
> > 346 }
> > 347 break;
> > 348 case ATM_GETLINKRATE:
> > 349 size = sizeof(int);
> > 350 if (copy_to_user(buf, &dev->link_rate, size)) {
> > 351 error = -EFAULT;
> > 352 goto done;
> > 353 }
> > 354 break;
> > 355 case ATM_RSTADDR:
> > 356 if (!capable(CAP_NET_ADMIN)) {
> > 357 error = -EPERM;
> > 358 goto done;
> > 359 }
> > 360 atm_reset_addr(dev, ATM_ADDR_LOCAL);
> > 361 break;
> > 362 case ATM_ADDADDR:
> > 363 case ATM_DELADDR:
> > 364 case ATM_ADDLECSADDR:
> > 365 case ATM_DELLECSADDR:
> > 366 {
> > 367 struct sockaddr_atmsvc addr;
> > 368
> > 369 if (!capable(CAP_NET_ADMIN)) {
> > 370 error = -EPERM;
> > 371 goto done;
> > 372 }
> > 373
> > 374 if (copy_from_user(&addr, buf, sizeof(addr))) {
> > 375 error = -EFAULT;
> > 376 goto done;
> > 377 }
> > 378 if (cmd == ATM_ADDADDR || cmd == ATM_ADDLECSADDR)
> > 379 error = atm_add_addr(dev, &addr,
> > 380 (cmd == ATM_ADDADDR ?
> > 381 ATM_ADDR_LOCAL : ATM_ADDR_LECS));
> > 382 else
> > 383 error = atm_del_addr(dev, &addr,
> > 384 (cmd == ATM_DELADDR ?
> > 385 ATM_ADDR_LOCAL : ATM_ADDR_LECS));
> > 386 goto done;
> > 387 }
> > ... ...
> > ... ...
> >
> >
> >
>
>
^ permalink raw reply
* Re: [PATCH net-next] bridge: implement multicast fast leave
From: Stephen Hemminger @ 2012-12-03 15:53 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, bridge, David S. Miller, Herbert Xu
In-Reply-To: <1354545363-11953-1-git-send-email-amwang@redhat.com>
On Mon, 3 Dec 2012 22:36:03 +0800
Cong Wang <amwang@redhat.com> wrote:
> Fast leave allows bridge to immediately stops the multicast
> traffic on the port receives IGMP Leave when IGMP snooping is enabled,
> no timeouts are observed.
>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
I like the feature, and it looks like an oversight in the initial design.
Why is this not the default, adding more options obscures it.
^ permalink raw reply
* Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent
From: Stephen Hemminger @ 2012-12-03 16:01 UTC (permalink / raw)
To: Jason Wang; +Cc: mst, netdev, linux-kernel, virtualization, davem
In-Reply-To: <2663254.1vRFru9xhh@jason-thinkpad-t430s>
On Mon, 03 Dec 2012 14:45:46 +0800
Jason Wang <jasowang@redhat.com> wrote:
> On Tuesday, November 27, 2012 08:49:19 AM Stephen Hemminger wrote:
> > On Tue, 27 Nov 2012 14:45:13 +0800
> >
> > Jason Wang <jasowang@redhat.com> wrote:
> > > On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> > > > On Mon, 26 Nov 2012 15:56:52 +0800
> > > >
> > > > Jason Wang <jasowang@redhat.com> wrote:
> > > >> Some deivces do not free the old tx skbs immediately after it has been
> > > >> sent
> > > >> (usually in tx interrupt). One such example is virtio-net which
> > > >> optimizes for virt and only free the possible old tx skbs during the
> > > >> next packet sending. This would lead the pktgen to wait forever in the
> > > >> refcount of the skb if no other pakcet will be sent afterwards.
> > > >>
> > > >> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY
> > > >> which could notify the pktgen that the device does not free skb
> > > >> immediately after it has been sent and let it not to wait for the
> > > >> refcount to be one.
> > > >>
> > > >> Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > >
> > > > Another alternative would be using skb_orphan() and skb->destructor.
> > > > There are other cases where skb's are not freed right away.
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >
> > > Hi Stephen:
> > >
> > > Do you mean registering a skb->destructor for pktgen then set and check
> > > bits in skb->tx_flag?
> >
> > Yes. Register a destructor that does something like update a counter (number
> > of packets pending), then just spin while number of packets pending is over
> > threshold.
>
> Have some experiments on this, looks like it does not work weel when clone_skb
> is used. For driver that call skb_orphan() in ndo_start_xmit, the destructor
> is only called when the first packet were sent, but what we need to know is
> when the last were sent. Any thoughts on this or we can just introduce another
> flag (anyway we have something like IFF_TX_SKB_SHARING) ?
>
The SKB_SHARING flag looks like the best solution then.
Surprisingly, transmit buffer completion is a major bottleneck for 10G
devices, and I suspect more changes will come.
^ permalink raw reply
* [RFT PATCH] 8139cp: properly support change of MTU values [v2]
From: John Greene @ 2012-12-03 16:19 UTC (permalink / raw)
To: netdev; +Cc: John Greene, David S. Miller
The 8139cp driver has a change_mtu function that has not been
enabled since the dawn of the git repository. However, the
generic eth_change_mtu is not used in its place, so that
invalid MTU values can be set on the interface.
Original patch salvages the broken code for the single case of
setting the MTU while the interface is down, which is safe
and also includes the range check. Now enhanced to support up
or down interface.
v2: fix case where rxbufsz isn't changed in the up state case
Original patch from
http://lkml.indiana.edu/hypermail/linux/kernel/1202.2/00770.html
Testing: has been test on virtual 8139cp setup without issue,
have no access real hardware 8139cp, need testing help.
Signed-off-by: "John Greene" <jogreene@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
---
drivers/net/ethernet/realtek/8139cp.c | 23 ++++-------------------
1 file changed, 4 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 6cb96b4..0da3f5e 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -1226,12 +1226,9 @@ static void cp_tx_timeout(struct net_device *dev)
spin_unlock_irqrestore(&cp->lock, flags);
}
-#ifdef BROKEN
static int cp_change_mtu(struct net_device *dev, int new_mtu)
{
struct cp_private *cp = netdev_priv(dev);
- int rc;
- unsigned long flags;
/* check for invalid MTU, according to hardware limits */
if (new_mtu < CP_MIN_MTU || new_mtu > CP_MAX_MTU)
@@ -1244,22 +1241,12 @@ static int cp_change_mtu(struct net_device *dev, int new_mtu)
return 0;
}
- spin_lock_irqsave(&cp->lock, flags);
-
- cp_stop_hw(cp); /* stop h/w and free rings */
- cp_clean_rings(cp);
-
+ /* network IS up, close it, reset MTU, and come up again. */
+ cp_close(dev);
dev->mtu = new_mtu;
- cp_set_rxbufsize(cp); /* set new rx buf size */
-
- rc = cp_init_rings(cp); /* realloc and restart h/w */
- cp_start_hw(cp);
-
- spin_unlock_irqrestore(&cp->lock, flags);
-
- return rc;
+ cp_set_rxbufsize(cp);
+ return cp_open(dev);
}
-#endif /* BROKEN */
static const char mii_2_8139_map[8] = {
BasicModeCtrl,
@@ -1835,9 +1822,7 @@ static const struct net_device_ops cp_netdev_ops = {
.ndo_start_xmit = cp_start_xmit,
.ndo_tx_timeout = cp_tx_timeout,
.ndo_set_features = cp_set_features,
-#ifdef BROKEN
.ndo_change_mtu = cp_change_mtu,
-#endif
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = cp_poll_controller,
--
1.7.11.7
^ permalink raw reply related
* Re: [RFC PATCH 2/2] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-03 16:22 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <50BC7BCE.7000502@redhat.com>
On Monday, December 03, 2012 06:15:42 PM Jason Wang wrote:
> On 11/30/2012 06:06 AM, Paul Moore wrote:
> > This patch corrects some problems with LSM/SELinux that were introduced
> > with the multiqueue patchset. The problem stems from the fact that the
> > multiqueue work changed the relationship between the tun device and its
> > associated socket; before the socket persisted for the life of the
> > device, however after the multiqueue changes the socket only persisted
> > for the life of the userspace connection (fd open). For non-persistent
> > devices this is not an issue, but for persistent devices this can cause
> > the tun device to lose its SELinux label.
> >
> > We correct this problem by adding an opaque LSM security blob to the
> > tun device struct which allows us to have the LSM security state, e.g.
> > SELinux labeling information, persist for the lifetime of the tun
> > device.
...
> > -static int selinux_tun_dev_attach(struct sock *sk)
> > +static int selinux_tun_dev_attach(struct sock *sk, void *security)
> >
> > {
> >
> > + struct tun_security_struct *tunsec = security;
> >
> > struct sk_security_struct *sksec = sk->sk_security;
> > u32 sid = current_sid();
> > int err;
> >
> > + /* we don't currently perform any NetLabel based labeling here ...
> >
> > err = avc_has_perm(sid, sksec->sid, SECCLASS_TUN_SOCKET,
> >
> > TUN_SOCKET__RELABELFROM, NULL);
> >
> > if (err)
> >
> > return err;
> >
> > - err = avc_has_perm(sid, sid, SECCLASS_TUN_SOCKET,
> > + err = avc_has_perm(sid, tunsec->sid, SECCLASS_TUN_SOCKET,
> >
> > TUN_SOCKET__RELABELTO, NULL);
> >
> > if (err)
> >
> > return err;
> >
> > - sksec->sid = sid;
> > + sksec->sid = tunsec->sid;
> > + sksec->sclass = SECCLASS_TUN_SOCKET;
>
> I'm not sure whether this is correct, looks like we need to differ between
> TUNSETQUEUE and TUNSETIFF. When userspace call TUNSETIFF for persistent
> device, looks like we need change the sid of tunsec like in the past.
It may be that I'm misunderstanding TUNSETQUEUE and/or TUNSETIFF. Can you
elaborate as to why they should be different?
One thing that I think we probably should change is the relabelto/from
permissions in the function above (selinux_tun_dev_attach()); in the case
where the socket does not yet have a label, e.g. 'sksec->sid == 0', we should
probably skip the relabel permissions since we want to assign the TUN device
label regardless in this case.
--
paul moore
security and virtualization @ redhat
^ permalink raw reply
* Re: [PATCH v4 net-next] sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call
From: Vlad Yasevich @ 2012-12-03 16:54 UTC (permalink / raw)
To: Michele Baldessari
Cc: linux-sctp, Neil Horman, Thomas Graf, netdev, David S. Miller
In-Reply-To: <1354373382-15230-1-git-send-email-michele@acksyn.org>
On 12/01/2012 09:49 AM, Michele Baldessari wrote:
> The current SCTP stack is lacking a mechanism to have per association
> statistics. This is an implementation modeled after OpenSolaris'
> SCTP_GET_ASSOC_STATS.
>
> Userspace part will follow on lksctp if/when there is a general ACK on
> this.
> V4:
> - Move ipackets++ before q->immediate.func() for consistency reasons
> - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
> returning bogus RTO values
> - return asoc->rto_min when max_obs_rto value has not changed
>
> V3:
> - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
> - Move ipackets++ to sctp_inq_push()
> - return 0 when no rto updates took place since the last call
>
> V2:
> - Implement partial retrieval of stat struct to cope for future expansion
> - Kill the rtxpackets counter as it cannot be precise anyway
> - Rename outseqtsns to outofseqtsns to make it clearer that these are out
> of sequence unexpected TSNs
> - Move asoc->ipackets++ under a lock to avoid potential miscounts
> - Fold asoc->opackets++ into the already existing asoc check
> - Kill unneeded (q->asoc) test when increasing rtxchunks
> - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
> - Don't count SHUTDOWNs as SACKs
> - Move SCTP_GET_ASSOC_STATS to the private space API
> - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
> future struct growth
> - Move association statistics in their own struct
> - Update idupchunks when we send a SACK with dup TSNs
> - return min_rto in max_rto when RTO has not changed. Also return the
> transport when max_rto last changed.
>
> Signed-off: Michele Baldessari <michele@acksyn.org>
Looks good
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> ---
> include/net/sctp/sctp.h | 12 ++++++++
> include/net/sctp/structs.h | 36 ++++++++++++++++++++++++
> include/net/sctp/user.h | 27 ++++++++++++++++++
> net/sctp/associola.c | 10 ++++++-
> net/sctp/endpointola.c | 5 +++-
> net/sctp/inqueue.c | 2 ++
> net/sctp/output.c | 14 ++++++----
> net/sctp/outqueue.c | 12 ++++++--
> net/sctp/sm_make_chunk.c | 5 ++--
> net/sctp/sm_sideeffect.c | 1 +
> net/sctp/sm_statefuns.c | 10 +++++--
> net/sctp/socket.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++
> net/sctp/transport.c | 2 ++
> 13 files changed, 192 insertions(+), 13 deletions(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 9c6414f..7fdf298 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -272,6 +272,18 @@ struct sctp_mib {
> unsigned long mibs[SCTP_MIB_MAX];
> };
>
> +/* helper function to track stats about max rto and related transport */
> +static inline void sctp_max_rto(struct sctp_association *asoc,
> + struct sctp_transport *trans)
> +{
> + if (asoc->stats.max_obs_rto < (__u64)trans->rto) {
> + asoc->stats.max_obs_rto = trans->rto;
> + memset(&asoc->stats.obs_rto_ipaddr, 0,
> + sizeof(struct sockaddr_storage));
> + memcpy(&asoc->stats.obs_rto_ipaddr, &trans->ipaddr,
> + trans->af_specific->sockaddr_len);
> + }
> +}
>
> /* Print debugging messages. */
> #if SCTP_DEBUG
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 2b2f61d..c252101 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -1312,6 +1312,40 @@ struct sctp_inithdr_host {
> __u32 initial_tsn;
> };
>
> +/* SCTP_GET_ASSOC_STATS counters */
> +struct sctp_priv_assoc_stats {
> + /* Maximum observed rto in the association during subsequent
> + * observations. Value is set to 0 if no RTO measurement took place
> + * The transport where the max_rto was observed is returned in
> + * obs_rto_ipaddr
> + */
> + struct sockaddr_storage obs_rto_ipaddr;
> + __u64 max_obs_rto;
> + /* Total In and Out SACKs received and sent */
> + __u64 isacks;
> + __u64 osacks;
> + /* Total In and Out packets received and sent */
> + __u64 opackets;
> + __u64 ipackets;
> + /* Total retransmitted chunks */
> + __u64 rtxchunks;
> + /* TSN received > next expected */
> + __u64 outofseqtsns;
> + /* Duplicate Chunks received */
> + __u64 idupchunks;
> + /* Gap Ack Blocks received */
> + __u64 gapcnt;
> + /* Unordered data chunks sent and received */
> + __u64 ouodchunks;
> + __u64 iuodchunks;
> + /* Ordered data chunks sent and received */
> + __u64 oodchunks;
> + __u64 iodchunks;
> + /* Control chunks sent and received */
> + __u64 octrlchunks;
> + __u64 ictrlchunks;
> +};
> +
> /* RFC2960
> *
> * 12. Recommended Transmission Control Block (TCB) Parameters
> @@ -1830,6 +1864,8 @@ struct sctp_association {
>
> __u8 need_ecne:1, /* Need to send an ECNE Chunk? */
> temp:1; /* Is it a temporary association? */
> +
> + struct sctp_priv_assoc_stats stats;
> };
>
>
> diff --git a/include/net/sctp/user.h b/include/net/sctp/user.h
> index 1b02d7a..9a0ae09 100644
> --- a/include/net/sctp/user.h
> +++ b/include/net/sctp/user.h
> @@ -107,6 +107,7 @@ typedef __s32 sctp_assoc_t;
> #define SCTP_GET_LOCAL_ADDRS 109 /* Get all local address. */
> #define SCTP_SOCKOPT_CONNECTX 110 /* CONNECTX requests. */
> #define SCTP_SOCKOPT_CONNECTX3 111 /* CONNECTX requests (updated) */
> +#define SCTP_GET_ASSOC_STATS 112 /* Read only */
>
> /*
> * 5.2.1 SCTP Initiation Structure (SCTP_INIT)
> @@ -719,6 +720,32 @@ struct sctp_getaddrs {
> __u8 addrs[0]; /*output, variable size*/
> };
>
> +/* A socket user request obtained via SCTP_GET_ASSOC_STATS that retrieves
> + * association stats. All stats are counts except sas_maxrto and
> + * sas_obs_rto_ipaddr. maxrto is the max observed rto + transport since
> + * the last call. Will return 0 when RTO was not update since last call
> + */
> +struct sctp_assoc_stats {
> + sctp_assoc_t sas_assoc_id; /* Input */
> + /* Transport of observed max RTO */
> + struct sockaddr_storage sas_obs_rto_ipaddr;
> + __u64 sas_maxrto; /* Maximum Observed RTO for period */
> + __u64 sas_isacks; /* SACKs received */
> + __u64 sas_osacks; /* SACKs sent */
> + __u64 sas_opackets; /* Packets sent */
> + __u64 sas_ipackets; /* Packets received */
> + __u64 sas_rtxchunks; /* Retransmitted Chunks */
> + __u64 sas_outofseqtsns;/* TSN received > next expected */
> + __u64 sas_idupchunks; /* Dups received (ordered+unordered) */
> + __u64 sas_gapcnt; /* Gap Acknowledgements Received */
> + __u64 sas_ouodchunks; /* Unordered data chunks sent */
> + __u64 sas_iuodchunks; /* Unordered data chunks received */
> + __u64 sas_oodchunks; /* Ordered data chunks sent */
> + __u64 sas_iodchunks; /* Ordered data chunks received */
> + __u64 sas_octrlchunks; /* Control chunks sent */
> + __u64 sas_ictrlchunks; /* Control chunks received */
> +};
> +
> /* These are bit fields for msghdr->msg_flags. See section 5.1. */
> /* On user space Linux, these live in <bits/socket.h> as an enum. */
> enum sctp_msg_flags {
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index b1ef3bc..ba3f9cc 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -321,6 +321,9 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
> asoc->default_timetolive = sp->default_timetolive;
> asoc->default_rcv_context = sp->default_rcv_context;
>
> + /* SCTP_GET_ASSOC_STATS COUNTERS */
> + memset(&asoc->stats, 0, sizeof(struct sctp_priv_assoc_stats));
> +
> /* AUTH related initializations */
> INIT_LIST_HEAD(&asoc->endpoint_shared_keys);
> err = sctp_auth_asoc_copy_shkeys(ep, asoc, gfp);
> @@ -760,6 +763,7 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
>
> /* Set the transport's RTO.initial value */
> peer->rto = asoc->rto_initial;
> + sctp_max_rto(asoc, peer);
>
> /* Set the peer's active state. */
> peer->state = peer_state;
> @@ -1152,8 +1156,12 @@ static void sctp_assoc_bh_rcv(struct work_struct *work)
> */
> if (sctp_chunk_is_data(chunk))
> asoc->peer.last_data_from = chunk->transport;
> - else
> + else {
> SCTP_INC_STATS(net, SCTP_MIB_INCTRLCHUNKS);
> + asoc->stats.ictrlchunks++;
> + if (chunk->chunk_hdr->type == SCTP_CID_SACK)
> + asoc->stats.isacks++;
> + }
>
> if (chunk->transport)
> chunk->transport->last_time_heard = jiffies;
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 1859e2b..32ab55b 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -480,8 +480,11 @@ normal:
> */
> if (asoc && sctp_chunk_is_data(chunk))
> asoc->peer.last_data_from = chunk->transport;
> - else
> + else {
> SCTP_INC_STATS(sock_net(ep->base.sk), SCTP_MIB_INCTRLCHUNKS);
> + if (asoc)
> + asoc->stats.ictrlchunks++;
> + }
>
> if (chunk->transport)
> chunk->transport->last_time_heard = jiffies;
> diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
> index 397296f..2d5ad28 100644
> --- a/net/sctp/inqueue.c
> +++ b/net/sctp/inqueue.c
> @@ -104,6 +104,8 @@ void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *chunk)
> * on the BH related data structures.
> */
> list_add_tail(&chunk->list, &q->in_chunk_list);
> + if (chunk->asoc)
> + chunk->asoc->stats.ipackets++;
> q->immediate.func(&q->immediate);
> }
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 4e90188bf..f5200a2 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -311,6 +311,8 @@ static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
>
> case SCTP_CID_SACK:
> packet->has_sack = 1;
> + if (chunk->asoc)
> + chunk->asoc->stats.osacks++;
> break;
>
> case SCTP_CID_AUTH:
> @@ -584,11 +586,13 @@ int sctp_packet_transmit(struct sctp_packet *packet)
> */
>
> /* Dump that on IP! */
> - if (asoc && asoc->peer.last_sent_to != tp) {
> - /* Considering the multiple CPU scenario, this is a
> - * "correcter" place for last_sent_to. --xguo
> - */
> - asoc->peer.last_sent_to = tp;
> + if (asoc) {
> + asoc->stats.opackets++;
> + if (asoc->peer.last_sent_to != tp)
> + /* Considering the multiple CPU scenario, this is a
> + * "correcter" place for last_sent_to. --xguo
> + */
> + asoc->peer.last_sent_to = tp;
> }
>
> if (has_data) {
> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
> index 1b4a7f8..379c81d 100644
> --- a/net/sctp/outqueue.c
> +++ b/net/sctp/outqueue.c
> @@ -667,6 +667,7 @@ redo:
> chunk->fast_retransmit = SCTP_DONT_FRTX;
>
> q->empty = 0;
> + q->asoc->stats.rtxchunks++;
> break;
> }
>
> @@ -876,12 +877,14 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
> if (status != SCTP_XMIT_OK) {
> /* put the chunk back */
> list_add(&chunk->list, &q->control_chunk_list);
> - } else if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN) {
> + } else {
> + asoc->stats.octrlchunks++;
> /* PR-SCTP C5) If a FORWARD TSN is sent, the
> * sender MUST assure that at least one T3-rtx
> * timer is running.
> */
> - sctp_transport_reset_timers(transport);
> + if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN)
> + sctp_transport_reset_timers(transport);
> }
> break;
>
> @@ -1055,6 +1058,10 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
> */
> if (asoc->state == SCTP_STATE_SHUTDOWN_PENDING)
> chunk->chunk_hdr->flags |= SCTP_DATA_SACK_IMM;
> + if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
> + asoc->stats.ouodchunks++;
> + else
> + asoc->stats.oodchunks++;
>
> break;
>
> @@ -1162,6 +1169,7 @@ int sctp_outq_sack(struct sctp_outq *q, struct sctp_chunk *chunk)
>
> sack_ctsn = ntohl(sack->cum_tsn_ack);
> gap_ack_blocks = ntohs(sack->num_gap_ack_blocks);
> + asoc->stats.gapcnt += gap_ack_blocks;
> /*
> * SFR-CACC algorithm:
> * On receipt of a SACK the sender SHOULD execute the
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index e0f01a4..e1c5fc2 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -804,10 +804,11 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
> gabs);
>
> /* Add the duplicate TSN information. */
> - if (num_dup_tsns)
> + if (num_dup_tsns) {
> + aptr->stats.idupchunks += num_dup_tsns;
> sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
> sctp_tsnmap_get_dups(map));
> -
> + }
> /* Once we have a sack generated, check to see what our sack
> * generation is, if its 0, reset the transports to 0, and reset
> * the association generation to 1
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index c076956..c957775 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -542,6 +542,7 @@ static void sctp_do_8_2_transport_strike(sctp_cmd_seq_t *commands,
> */
> if (!is_hb || transport->hb_sent) {
> transport->rto = min((transport->rto * 2), transport->asoc->rto_max);
> + sctp_max_rto(asoc, transport);
> }
> }
>
> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index b6adef8..ecf7a17 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -6127,6 +6127,8 @@ static int sctp_eat_data(const struct sctp_association *asoc,
> /* The TSN is too high--silently discard the chunk and
> * count on it getting retransmitted later.
> */
> + if (chunk->asoc)
> + chunk->asoc->stats.outofseqtsns++;
> return SCTP_IERROR_HIGH_TSN;
> } else if (tmp > 0) {
> /* This is a duplicate. Record it. */
> @@ -6226,10 +6228,14 @@ static int sctp_eat_data(const struct sctp_association *asoc,
> /* Note: Some chunks may get overcounted (if we drop) or overcounted
> * if we renege and the chunk arrives again.
> */
> - if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
> + if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED) {
> SCTP_INC_STATS(net, SCTP_MIB_INUNORDERCHUNKS);
> - else {
> + if (chunk->asoc)
> + chunk->asoc->stats.iuodchunks++;
> + } else {
> SCTP_INC_STATS(net, SCTP_MIB_INORDERCHUNKS);
> + if (chunk->asoc)
> + chunk->asoc->stats.iodchunks++;
> ordered = 1;
> }
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 2e89706..7b8d01a 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -611,6 +611,7 @@ static int sctp_send_asconf_add_ip(struct sock *sk,
> 2*asoc->pathmtu, 4380));
> trans->ssthresh = asoc->peer.i.a_rwnd;
> trans->rto = asoc->rto_initial;
> + sctp_max_rto(asoc, trans);
> trans->rtt = trans->srtt = trans->rttvar = 0;
> sctp_transport_route(trans, NULL,
> sctp_sk(asoc->base.sk));
> @@ -5635,6 +5636,71 @@ static int sctp_getsockopt_paddr_thresholds(struct sock *sk,
> return 0;
> }
>
> +/*
> + * SCTP_GET_ASSOC_STATS
> + *
> + * This option retrieves local per endpoint statistics. It is modeled
> + * after OpenSolaris' implementation
> + */
> +static int sctp_getsockopt_assoc_stats(struct sock *sk, int len,
> + char __user *optval,
> + int __user *optlen)
> +{
> + struct sctp_assoc_stats sas;
> + struct sctp_association *asoc = NULL;
> +
> + /* User must provide at least the assoc id */
> + if (len < sizeof(sctp_assoc_t))
> + return -EINVAL;
> +
> + if (copy_from_user(&sas, optval, len))
> + return -EFAULT;
> +
> + asoc = sctp_id2assoc(sk, sas.sas_assoc_id);
> + if (!asoc)
> + return -EINVAL;
> +
> + sas.sas_rtxchunks = asoc->stats.rtxchunks;
> + sas.sas_gapcnt = asoc->stats.gapcnt;
> + sas.sas_outofseqtsns = asoc->stats.outofseqtsns;
> + sas.sas_osacks = asoc->stats.osacks;
> + sas.sas_isacks = asoc->stats.isacks;
> + sas.sas_octrlchunks = asoc->stats.octrlchunks;
> + sas.sas_ictrlchunks = asoc->stats.ictrlchunks;
> + sas.sas_oodchunks = asoc->stats.oodchunks;
> + sas.sas_iodchunks = asoc->stats.iodchunks;
> + sas.sas_ouodchunks = asoc->stats.ouodchunks;
> + sas.sas_iuodchunks = asoc->stats.iuodchunks;
> + sas.sas_idupchunks = asoc->stats.idupchunks;
> + sas.sas_opackets = asoc->stats.opackets;
> + sas.sas_ipackets = asoc->stats.ipackets;
> +
> + /* New high max rto observed, will return 0 if not a single
> + * RTO update took place. obs_rto_ipaddr will be bogus
> + * in such a case
> + */
> + sas.sas_maxrto = asoc->stats.max_obs_rto;
> + memcpy(&sas.sas_obs_rto_ipaddr, &asoc->stats.obs_rto_ipaddr,
> + sizeof(struct sockaddr_storage));
> +
> + /* Mark beginning of a new observation period */
> + asoc->stats.max_obs_rto = asoc->rto_min;
> +
> + /* Allow the struct to grow and fill in as much as possible */
> + len = min_t(size_t, len, sizeof(sas));
> +
> + if (put_user(len, optlen))
> + return -EFAULT;
> +
> + SCTP_DEBUG_PRINTK("sctp_getsockopt_assoc_stat(%d): %d\n",
> + len, sas.sas_assoc_id);
> +
> + if (copy_to_user(optval, &sas, len))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> SCTP_STATIC int sctp_getsockopt(struct sock *sk, int level, int optname,
> char __user *optval, int __user *optlen)
> {
> @@ -5776,6 +5842,9 @@ SCTP_STATIC int sctp_getsockopt(struct sock *sk, int level, int optname,
> case SCTP_PEER_ADDR_THLDS:
> retval = sctp_getsockopt_paddr_thresholds(sk, optval, len, optlen);
> break;
> + case SCTP_GET_ASSOC_STATS:
> + retval = sctp_getsockopt_assoc_stats(sk, len, optval, optlen);
> + break;
> default:
> retval = -ENOPROTOOPT;
> break;
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index 953c21e..40574ad 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -363,6 +363,7 @@ void sctp_transport_update_rto(struct sctp_transport *tp, __u32 rtt)
> if (tp->rto > tp->asoc->rto_max)
> tp->rto = tp->asoc->rto_max;
>
> + sctp_max_rto(tp->asoc, tp);
> tp->rtt = rtt;
>
> /* Reset rto_pending so that a new RTT measurement is started when a
> @@ -620,6 +621,7 @@ void sctp_transport_reset(struct sctp_transport *t)
> t->burst_limited = 0;
> t->ssthresh = asoc->peer.i.a_rwnd;
> t->rto = asoc->rto_initial;
> + sctp_max_rto(asoc, t);
> t->rtt = 0;
> t->srtt = 0;
> t->rttvar = 0;
>
^ permalink raw reply
* Re: [PATCH 2/3] net: cpsw: verify correct number of slaves in DT
From: Mugunthan V N @ 2012-12-03 17:00 UTC (permalink / raw)
To: Jan Luebbe
Cc: netdev, David S. Miller, Vaibhav Hiremath, linux-arm-kernel,
linux-omap
In-Reply-To: <1354542569-6165-2-git-send-email-jlu@pengutronix.de>
On 12/3/2012 7:19 PM, Jan Luebbe wrote:
> Check that the number of available slaves passed from DT matches the
> value of the "slaves" property in the cpsw node. Otherwise, priv->slaves
> would be the wrong size.
>
> Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
> ---
> drivers/net/ethernet/ti/cpsw.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index c0e676a..8de3e92 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -832,6 +832,16 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> const char *phy_id = NULL;
> const void *mac_addr = NULL;
>
> + if (!of_device_is_available(slave_node))
> + continue;
> +
> + if (i >= data->slaves) {
> + dev_err(&pdev->dev, "Too many slaves in the DT (> %d).\n",
> + data->slaves);
> + ret = -EINVAL;
> + goto error_ret;
> + }
> +
> if (of_property_read_string(slave_node, "phy_id", &phy_id)) {
> dev_err(&pdev->dev, "Missing slave[%d] phy_id property.\n", i);
> ret = -EINVAL;
> @@ -861,6 +871,13 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> i++;
> }
>
> + if (i < data->slaves) {
> + dev_err(&pdev->dev, "Not enough slaves in the DT (< %d).\n",
> + data->slaves);
> + ret = -EINVAL;
> + goto error_ret;
> + }
> +
> /*
> * Populate all the child nodes here...
> */
The patches look good to me.
Acked-by: Mugunthan V N<mugunthanvnm@ti.com>
Regards
Mugunthan V N
^ permalink raw reply
* Re: [PATCH 1/3] net: cpsw: replace pr_xxx with dev_xxx functions
From: Mugunthan V N @ 2012-12-03 16:59 UTC (permalink / raw)
To: Jan Luebbe
Cc: netdev, David S. Miller, Vaibhav Hiremath, linux-arm-kernel,
linux-omap
In-Reply-To: <1354542569-6165-1-git-send-email-jlu@pengutronix.de>
On 12/3/2012 7:19 PM, Jan Luebbe wrote:
> Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
> ---
> drivers/net/ethernet/ti/cpsw.c | 47 ++++++++++++++++++++--------------------
> 1 file changed, 23 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index fb1a692..c0e676a 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -729,7 +729,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> return -EINVAL;
>
> if (of_property_read_u32(node, "slaves", &prop)) {
> - pr_err("Missing slaves property in the DT.\n");
> + dev_err(&pdev->dev, "Missing slaves property in the DT.\n");
> return -EINVAL;
> }
> data->slaves = prop;
> @@ -737,91 +737,91 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> data->slave_data = kzalloc(sizeof(struct cpsw_slave_data) *
> data->slaves, GFP_KERNEL);
> if (!data->slave_data) {
> - pr_err("Could not allocate slave memory.\n");
> + dev_err(&pdev->dev, "Could not allocate slave memory.\n");
> return -EINVAL;
> }
>
> data->no_bd_ram = of_property_read_bool(node, "no_bd_ram");
>
> if (of_property_read_u32(node, "cpdma_channels", &prop)) {
> - pr_err("Missing cpdma_channels property in the DT.\n");
> + dev_err(&pdev->dev, "Missing cpdma_channels property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->channels = prop;
>
> if (of_property_read_u32(node, "host_port_no", &prop)) {
> - pr_err("Missing host_port_no property in the DT.\n");
> + dev_err(&pdev->dev, "Missing host_port_no property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->host_port_num = prop;
>
> if (of_property_read_u32(node, "cpdma_reg_ofs", &prop)) {
> - pr_err("Missing cpdma_reg_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing cpdma_reg_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->cpdma_reg_ofs = prop;
>
> if (of_property_read_u32(node, "cpdma_sram_ofs", &prop)) {
> - pr_err("Missing cpdma_sram_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing cpdma_sram_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->cpdma_sram_ofs = prop;
>
> if (of_property_read_u32(node, "ale_reg_ofs", &prop)) {
> - pr_err("Missing ale_reg_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing ale_reg_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->ale_reg_ofs = prop;
>
> if (of_property_read_u32(node, "ale_entries", &prop)) {
> - pr_err("Missing ale_entries property in the DT.\n");
> + dev_err(&pdev->dev, "Missing ale_entries property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->ale_entries = prop;
>
> if (of_property_read_u32(node, "host_port_reg_ofs", &prop)) {
> - pr_err("Missing host_port_reg_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing host_port_reg_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->host_port_reg_ofs = prop;
>
> if (of_property_read_u32(node, "hw_stats_reg_ofs", &prop)) {
> - pr_err("Missing hw_stats_reg_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing hw_stats_reg_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->hw_stats_reg_ofs = prop;
>
> if (of_property_read_u32(node, "bd_ram_ofs", &prop)) {
> - pr_err("Missing bd_ram_ofs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing bd_ram_ofs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->bd_ram_ofs = prop;
>
> if (of_property_read_u32(node, "bd_ram_size", &prop)) {
> - pr_err("Missing bd_ram_size property in the DT.\n");
> + dev_err(&pdev->dev, "Missing bd_ram_size property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->bd_ram_size = prop;
>
> if (of_property_read_u32(node, "rx_descs", &prop)) {
> - pr_err("Missing rx_descs property in the DT.\n");
> + dev_err(&pdev->dev, "Missing rx_descs property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> data->rx_descs = prop;
>
> if (of_property_read_u32(node, "mac_control", &prop)) {
> - pr_err("Missing mac_control property in the DT.\n");
> + dev_err(&pdev->dev, "Missing mac_control property in the DT.\n");
> ret = -EINVAL;
> goto error_ret;
> }
> @@ -833,14 +833,14 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> const void *mac_addr = NULL;
>
> if (of_property_read_string(slave_node, "phy_id", &phy_id)) {
> - pr_err("Missing slave[%d] phy_id property\n", i);
> + dev_err(&pdev->dev, "Missing slave[%d] phy_id property.\n", i);
> ret = -EINVAL;
> goto error_ret;
> }
> slave_data->phy_id = phy_id;
>
> if (of_property_read_u32(slave_node, "slave_reg_ofs", &prop)) {
> - pr_err("Missing slave[%d] slave_reg_ofs property\n", i);
> + dev_err(&pdev->dev, "Missing slave[%d] slave_reg_ofs property.\n", i);
> ret = -EINVAL;
> goto error_ret;
> }
> @@ -848,8 +848,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
>
> if (of_property_read_u32(slave_node, "sliver_reg_ofs",
> &prop)) {
> - pr_err("Missing slave[%d] sliver_reg_ofs property\n",
> - i);
> + dev_err(&pdev->dev, "Missing slave[%d] sliver_reg_ofs property.\n", i);
> ret = -EINVAL;
> goto error_ret;
> }
> @@ -868,7 +867,7 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
> ret = of_platform_populate(node, NULL, NULL, &pdev->dev);
> /* We do not want to force this, as in some cases may not have child */
> if (ret)
> - pr_warn("Doesn't have any child node\n");
> + dev_warn(&pdev->dev, "Doesn't have any child node\n");
>
> return 0;
>
> @@ -890,7 +889,7 @@ static int __devinit cpsw_probe(struct platform_device *pdev)
>
> ndev = alloc_etherdev(sizeof(struct cpsw_priv));
> if (!ndev) {
> - pr_err("error allocating net_device\n");
> + pr_err("cpsw: error allocating net_device\n");
> return -ENOMEM;
> }
>
> @@ -909,7 +908,7 @@ static int __devinit cpsw_probe(struct platform_device *pdev)
> pm_runtime_enable(&pdev->dev);
>
> if (cpsw_probe_dt(&priv->data, pdev)) {
> - pr_err("cpsw: platform data missing\n");
> + dev_err(&pdev->dev, "platform data missing\n");
> ret = -ENODEV;
> goto clean_ndev_ret;
> }
> @@ -917,10 +916,10 @@ static int __devinit cpsw_probe(struct platform_device *pdev)
>
> if (is_valid_ether_addr(data->slave_data[0].mac_addr)) {
> memcpy(priv->mac_addr, data->slave_data[0].mac_addr, ETH_ALEN);
> - pr_info("Detected MACID = %pM", priv->mac_addr);
> + dev_info(&pdev->dev, "Detected MACID = %pM", priv->mac_addr);
> } else {
> eth_random_addr(priv->mac_addr);
> - pr_info("Random MACID = %pM", priv->mac_addr);
> + dev_info(&pdev->dev, "Random MACID = %pM", priv->mac_addr);
> }
>
> memcpy(ndev->dev_addr, priv->mac_addr, ETH_ALEN);
> @@ -1120,7 +1119,7 @@ static int __devexit cpsw_remove(struct platform_device *pdev)
> struct net_device *ndev = platform_get_drvdata(pdev);
> struct cpsw_priv *priv = netdev_priv(ndev);
>
> - pr_info("removing device");
> + dev_info(&pdev->dev, "removing device");
> platform_set_drvdata(pdev, NULL);
>
> free_irq(ndev->irq, priv);
The patches look good to me.
Acked-by: Mugunthan V N<mugunthanvnm@ti.com>
^ permalink raw reply
* Re: [PATCH 3/3] net: cpsw: implement ioctl for MII
From: Mugunthan V N @ 2012-12-03 17:04 UTC (permalink / raw)
To: Jan Luebbe
Cc: netdev, David S. Miller, Vaibhav Hiremath, linux-arm-kernel,
linux-omap
In-Reply-To: <1354542569-6165-3-git-send-email-jlu@pengutronix.de>
On 12/3/2012 7:19 PM, Jan Luebbe wrote:
> This allows using tools like mii-diag on CPSW.
>
> Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
> ---
> drivers/net/ethernet/ti/cpsw.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index 8de3e92..f476c03 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -629,6 +629,20 @@ static void cpsw_ndo_change_rx_flags(struct net_device *ndev, int flags)
> dev_err(&ndev->dev, "multicast traffic cannot be filtered!\n");
> }
>
> +static int cpsw_ndo_do_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd)
> +{
> + struct cpsw_priv *priv = netdev_priv(ndev);
> + struct phy_device *phy = priv->slaves[0].phy;
> +
> + if (!netif_running(ndev))
> + return -EINVAL;
> +
> + if (!phy)
> + return -ENODEV;
> +
> + return phy_mii_ioctl(phy, rq, cmd);
> +}
> +
> static void cpsw_ndo_tx_timeout(struct net_device *ndev)
> {
> struct cpsw_priv *priv = netdev_priv(ndev);
> @@ -670,6 +684,7 @@ static const struct net_device_ops cpsw_netdev_ops = {
> .ndo_start_xmit = cpsw_ndo_start_xmit,
> .ndo_change_rx_flags = cpsw_ndo_change_rx_flags,
> .ndo_validate_addr = eth_validate_addr,
> + .ndo_do_ioctl = cpsw_ndo_do_ioctl,
> .ndo_change_mtu = eth_change_mtu,
> .ndo_tx_timeout = cpsw_ndo_tx_timeout,
> .ndo_get_stats = cpsw_ndo_get_stats,
Already ndo_do_ioctl is already implemented. Can you rebase the patch
with latest git repo
and resubmit the patch
Regards
Mugunthan V N
^ permalink raw reply
* Re: [net-next PATCH V2 5/9] net: frag, per CPU resource, mem limit and LRU list accounting
From: David Miller @ 2012-12-03 17:25 UTC (permalink / raw)
To: brouer
Cc: eric.dumazet, fw, netdev, pablo, tgraf, amwang, kaber, paulmck,
herbert
In-Reply-To: <1354543361.20888.10.camel@localhost>
From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Mon, 03 Dec 2012 15:02:41 +0100
> On Thu, 2012-11-29 at 09:06 -0800, Eric Dumazet wrote:
>> On Thu, 2012-11-29 at 17:13 +0100, Jesper Dangaard Brouer wrote:
>> > The major performance bottleneck on NUMA systems, is the mem limit
>> > counter which is based an atomic counter. This patch removes the
>> > cache-bouncing of the atomic counter, by moving this accounting to be
>> > bound to each CPU. The LRU list also need to be done per CPU,
>> > in-order to keep the accounting straight.
>> >
>> > If fragments belonging together is "sprayed" across CPUs, performance
>> > will still suffer, but due to NIC rxhashing this is not very common.
>> > Correct accounting in this situation is maintained by recording and
>> > "assigning" a CPU to a frag queue when its allocated (caused by the
>> > first packet associated packet).
>> >
> [...]
>> > +/* Need to maintain these resource limits per CPU, else we will kill
>> > + * performance due to cache-line bouncing
>> > + */
>> > +struct frag_cpu_limit {
>> > + atomic_t mem;
>> > + struct list_head lru_list;
>> > + spinlock_t lru_lock;
>> > +} ____cacheline_aligned_in_smp;
>> > +
>>
>> This looks like a big patch introducing a specific infrastructure, while
>> we already have lib/percpu_counter.c
>
> For the record, I cannot use the lib/percpu_counter, because this
> accounting is not kept strictly per CPU, if the fragments are "sprayed"
> across CPUs (as described in the commit message above).
The percpu infrastructure allows precise counts and comparisons even
in that case. It uses the cheap test when possible, and defers to a
more expensive test when necessary.
^ permalink raw reply
* Re: [PATCH net-next 3/7] ipv6: improve ipv6_find_hdr() to skip empty routing headers
From: Jesse Gross @ 2012-12-03 17:28 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David Miller, Ansis Atteka
In-Reply-To: <20121203140458.GA1596@1984>
On Mon, Dec 3, 2012 at 6:04 AM, Pablo Neira Ayuso <pablo-Cap9r6Oaw4JrovVCs/uTlw@public.gmane.org> wrote:
> On Thu, Nov 29, 2012 at 10:35:45AM -0800, Jesse Gross wrote:
>> @@ -159,9 +162,10 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
>> }
>> len = skb->len - start;
>>
>> - while (nexthdr != target) {
>
> If the offset is set as parameter via ipv6_find_hdr, we now are always
> entering the loop even if we found the target header we're looking
> for, before that didn't happen.
>
> Something seems wrong here to me.
If the target header is a routing header then you might still need to
continue searching because the first one that you see could be empty.
^ permalink raw reply
* [PATCH] vhost-net: initialize zcopy packet counters
From: Michael S. Tsirkin @ 2012-12-03 17:31 UTC (permalink / raw)
To: davem; +Cc: netdev, virtualization, linux-kernel, kvm, Michael S. Tsirkin
These packet counters are used to drive the zercopy
selection heuristic so nothing too bad happens if they are off a bit -
and they are also reset once in a while.
But it's cleaner to clear them when backend is set so that
we start in a known state.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
drivers/vhost/net.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 67898fa..ff6c9199 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -823,6 +823,9 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
r = vhost_init_used(vq);
if (r)
goto err_vq;
+
+ n->tx_packets = 0;
+ n->tx_zcopy_err = 0;
}
mutex_unlock(&vq->mutex);
--
MST
^ permalink raw reply related
* Re: Optics (SFP) monitoring on ixgbe and igbe
From: Ben Hutchings @ 2012-12-03 17:43 UTC (permalink / raw)
To: footplus; +Cc: netdev
In-Reply-To: <CAPN4dA8+RHq1BAEjXv2tQHQw_h0JhEoah0HCq65k63gmszjLCA@mail.gmail.com>
On Sun, 2012-12-02 at 23:00 +0100, Aurélien wrote:
> On Sun, Dec 2, 2012 at 10:47 PM, Aurélien <footplus@gmail.com> wrote:
> >>
> >> This version drops the -lm completely, so it doesn't link. Maybe you
> >> edited the generated Makefile or Makefile.in?
> >
> > No, I just stupidly forgot to make distclean & autogen after removing
> > all libm checks. Re-added AC_CHECK_LIB to link with it.
> >
>
> Just after re-reading this, I thought it was silly; I instead edited
> Makefile.am and reverted the configure.ac change. Here's a new
> full-patch with that fix along with the rest.
>
> Sorry for the noise.
Applied, thanks.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] ixgbe: Use is_valid_ether_addr
From: Joe Perches @ 2012-12-03 17:47 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: netdev, linux-kernel
In-Reply-To: <1350750159.10791.3.camel@joe-AO722>
On Sat, 2012-10-20 at 09:22 -0700, Joe Perches wrote:
> Use the normal kernel test instead of a module specific one.
ping?
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> found when doing that larger style conversion,
> might as well submit it.
>
> drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c | 2 +-
> drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 27 +------------------------
> drivers/net/ethernet/intel/ixgbe/ixgbe_common.h | 1 -
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
> drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 9 ---------
> drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 2 +-
> 6 files changed, 4 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
> index 1077cb2..89fe00d 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_82599.c
> @@ -1022,7 +1022,7 @@ mac_reset_top:
> hw->mac.ops.get_san_mac_addr(hw, hw->mac.san_addr);
>
> /* Add the SAN MAC address to the RAR only if it's a valid address */
> - if (ixgbe_validate_mac_addr(hw->mac.san_addr) == 0) {
> + if (is_valid_ether_addr(hw->mac.san_addr)) {
> hw->mac.ops.set_rar(hw, hw->mac.num_rar_entries - 1,
> hw->mac.san_addr, 0, IXGBE_RAH_AV);
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> index dbf37e4..2d8f76d 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
> @@ -1762,30 +1762,6 @@ s32 ixgbe_update_eeprom_checksum_generic(struct ixgbe_hw *hw)
> }
>
> /**
> - * ixgbe_validate_mac_addr - Validate MAC address
> - * @mac_addr: pointer to MAC address.
> - *
> - * Tests a MAC address to ensure it is a valid Individual Address
> - **/
> -s32 ixgbe_validate_mac_addr(u8 *mac_addr)
> -{
> - s32 status = 0;
> -
> - /* Make sure it is not a multicast address */
> - if (IXGBE_IS_MULTICAST(mac_addr))
> - status = IXGBE_ERR_INVALID_MAC_ADDR;
> - /* Not a broadcast address */
> - else if (IXGBE_IS_BROADCAST(mac_addr))
> - status = IXGBE_ERR_INVALID_MAC_ADDR;
> - /* Reject the zero address */
> - else if (mac_addr[0] == 0 && mac_addr[1] == 0 && mac_addr[2] == 0 &&
> - mac_addr[3] == 0 && mac_addr[4] == 0 && mac_addr[5] == 0)
> - status = IXGBE_ERR_INVALID_MAC_ADDR;
> -
> - return status;
> -}
> -
> -/**
> * ixgbe_set_rar_generic - Set Rx address register
> * @hw: pointer to hardware structure
> * @index: Receive address register to write
> @@ -1889,8 +1865,7 @@ s32 ixgbe_init_rx_addrs_generic(struct ixgbe_hw *hw)
> * to the permanent address.
> * Otherwise, use the permanent address from the eeprom.
> */
> - if (ixgbe_validate_mac_addr(hw->mac.addr) ==
> - IXGBE_ERR_INVALID_MAC_ADDR) {
> + if (!is_valid_ether_addr(hw->mac.addr)) {
> /* Get the MAC address from the RAR0 for later reference */
> hw->mac.ops.get_mac_addr(hw, hw->mac.addr);
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.h
> index d813d11..f49ca84 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.h
> @@ -80,7 +80,6 @@ s32 ixgbe_enable_rx_dma_generic(struct ixgbe_hw *hw, u32 regval);
> s32 ixgbe_fc_enable_generic(struct ixgbe_hw *hw);
> void ixgbe_fc_autoneg(struct ixgbe_hw *hw);
>
> -s32 ixgbe_validate_mac_addr(u8 *mac_addr);
> s32 ixgbe_acquire_swfw_sync(struct ixgbe_hw *hw, u16 mask);
> void ixgbe_release_swfw_sync(struct ixgbe_hw *hw, u16 mask);
> s32 ixgbe_get_san_mac_addr_generic(struct ixgbe_hw *hw, u8 *san_mac_addr);
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index e2a6691..3bb3485 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7345,7 +7345,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
> memcpy(netdev->dev_addr, hw->mac.perm_addr, netdev->addr_len);
> memcpy(netdev->perm_addr, hw->mac.perm_addr, netdev->addr_len);
>
> - if (ixgbe_validate_mac_addr(netdev->perm_addr)) {
> + if (!is_valid_ether_addr(netdev->perm_addr)) {
> e_dev_err("invalid MAC address\n");
> err = -EIO;
> goto err_sw_init;
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> index 0722f33..9ddac64 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> @@ -1833,15 +1833,6 @@ enum {
> /* Number of 100 microseconds we wait for PCI Express master disable */
> #define IXGBE_PCI_MASTER_DISABLE_TIMEOUT 800
>
> -/* Check whether address is multicast. This is little-endian specific check.*/
> -#define IXGBE_IS_MULTICAST(Address) \
> - (bool)(((u8 *)(Address))[0] & ((u8)0x01))
> -
> -/* Check whether an address is broadcast. */
> -#define IXGBE_IS_BROADCAST(Address) \
> - ((((u8 *)(Address))[0] == ((u8)0xff)) && \
> - (((u8 *)(Address))[1] == ((u8)0xff)))
> -
> /* RAH */
> #define IXGBE_RAH_VIND_MASK 0x003C0000
> #define IXGBE_RAH_VIND_SHIFT 18
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> index de4da52..c73b929 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
> @@ -152,7 +152,7 @@ mac_reset_top:
> hw->mac.ops.get_san_mac_addr(hw, hw->mac.san_addr);
>
> /* Add the SAN MAC address to the RAR only if it's a valid address */
> - if (ixgbe_validate_mac_addr(hw->mac.san_addr) == 0) {
> + if (is_valid_ether_addr(hw->mac.san_addr)) {
> hw->mac.ops.set_rar(hw, hw->mac.num_rar_entries - 1,
> hw->mac.san_addr, 0, IXGBE_RAH_AV);
>
>
^ permalink raw reply
* Re: [PATCH net-next 3/7] ipv6: improve ipv6_find_hdr() to skip empty routing headers
From: Pablo Neira Ayuso @ 2012-12-03 18:06 UTC (permalink / raw)
To: Jesse Gross; +Cc: David Miller, netdev, dev, Ansis Atteka
In-Reply-To: <CAEP_g=-D6f-O1F0B4+xLKfZnY_LjFiHF-UaXD4zVUYctGG3-Ug@mail.gmail.com>
On Mon, Dec 03, 2012 at 09:28:55AM -0800, Jesse Gross wrote:
> On Mon, Dec 3, 2012 at 6:04 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Thu, Nov 29, 2012 at 10:35:45AM -0800, Jesse Gross wrote:
> >> @@ -159,9 +162,10 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
> >> }
> >> len = skb->len - start;
> >>
> >> - while (nexthdr != target) {
> >
> > If the offset is set as parameter via ipv6_find_hdr, we now are always
> > entering the loop even if we found the target header we're looking
> > for, before that didn't happen.
> >
> > Something seems wrong here to me.
>
> If the target header is a routing header then you might still need to
> continue searching because the first one that you see could be empty.
OK, but if it's not a routing header what we're searching for (which
seems to be the case of netfilter/IPVS) we waste way more cycles on
copying the IPv6 header again and with way more things that are
completely useless.
^ permalink raw reply
* Re: [PATCH 1/4 net-next] tg3: PTP - Add header definitions, initialization and hw access functions.
From: Richard Cochran @ 2012-12-03 18:23 UTC (permalink / raw)
To: Michael Chan; +Cc: davem, netdev, nsujir
In-Reply-To: <1354506171-1646-1-git-send-email-mchan@broadcom.com>
Please put me on CC for any PTP related patches.
I have a few comments, below.
On Sun, Dec 02, 2012 at 07:42:48PM -0800, Michael Chan wrote:
> From: Matt Carlson <mcarlson@broadcom.com>
>
> This patch adds code to register/unregister the ptp clock and write
> the reference clock. If a chip reset is performed, the hwclock is
> reinitialized with the adjusted kernel time
>
> Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>
> ---
> drivers/net/ethernet/broadcom/Kconfig | 1 +
> drivers/net/ethernet/broadcom/tg3.c | 84 +++++++++++++++++++++++++++++++--
> drivers/net/ethernet/broadcom/tg3.h | 60 ++++++++++++++++++++++-
> 3 files changed, 137 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig
> index 4bd416b..f552673 100644
> --- a/drivers/net/ethernet/broadcom/Kconfig
> +++ b/drivers/net/ethernet/broadcom/Kconfig
> @@ -102,6 +102,7 @@ config TIGON3
> depends on PCI
> select PHYLIB
> select HWMON
> + select PTP_1588_CLOCK
> ---help---
> This driver supports Broadcom Tigon3 based gigabit Ethernet cards.
>
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index 5cc976d..38047a9 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -54,6 +54,9 @@
> #include <asm/byteorder.h>
> #include <linux/uaccess.h>
>
> +#include <uapi/linux/net_tstamp.h>
> +#include <linux/ptp_clock_kernel.h>
> +
> #ifdef CONFIG_SPARC
> #include <asm/idprom.h>
> #include <asm/prom.h>
> @@ -5516,6 +5519,57 @@ static int tg3_setup_phy(struct tg3 *tp, int force_reset)
> return err;
> }
>
> +static void tg3_refclk_write(struct tg3 *tp, u64 newval)
> +{
> + tw32(TG3_EAV_REF_CLCK_CTL, TG3_EAV_REF_CLCK_CTL_STOP);
> + tw32(TG3_EAV_REF_CLCK_LSB, newval & 0xffffffff);
> + tw32(TG3_EAV_REF_CLCK_MSB, newval >> 32);
> + tw32_f(TG3_EAV_REF_CLCK_CTL, TG3_EAV_REF_CLCK_CTL_RESUME);
> +}
> +
> +static const struct ptp_clock_info tg3_ptp_caps = {
> + .owner = THIS_MODULE,
> + .name = "",
Please use a static name here, something related to the driver, like
"tigon3 clock" perhaps. There used to be drivers doing other things
with this, but now the kernel doc from ptp_clock_kernel.h says:
@name: A short "friendly name" to identify the clock and to
help distinguish PHY based devices from MAC based ones.
The string is not meant to be a unique id.
> + .max_adj = 0,
> + .n_alarm = 0,
> + .n_ext_ts = 0,
> + .n_per_out = 0,
> + .pps = 0,
You have left the methods as a bunch of NULL pointers. This will not
fly, since someone might land on this commit during a bisect. In
general, every patch in a series should result in a working kernel.
> +};
> +
> +static void tg3_ptp_init(struct tg3 *tp)
> +{
> + if (!tg3_flag(tp, PTP_CAPABLE))
> + return;
> +
> + /* Initialize the hardware clock to the system time. */
> + tg3_refclk_write(tp, ktime_to_ns(ktime_get_real()));
> + tp->ptp_adjust = 0;
> +
> + tp->ptp_info = tg3_ptp_caps;
> + strncpy(tp->ptp_info.name, tp->dev->name, IFNAMSIZ);
> +}
> +
> +static void tg3_ptp_resume(struct tg3 *tp)
> +{
> + if (!tg3_flag(tp, PTP_CAPABLE))
> + return;
> +
> + tg3_refclk_write(tp, ktime_to_ns(ktime_get_real()) + tp->ptp_adjust);
> + tp->ptp_adjust = 0;
> +}
> +
> +static void tg3_ptp_fini(struct tg3 *tp)
> +{
> + if (!tg3_flag(tp, PTP_CAPABLE) ||
> + !tp->ptp_clock)
Overzealous line break.
> + return;
> +
> + ptp_clock_unregister(tp->ptp_clock);
> + tp->ptp_clock = NULL;
> + tp->ptp_adjust = 0;
> +}
> +
> static inline int tg3_irq_sync(struct tg3 *tp)
> {
> return tp->irq_sync;
> @@ -6527,6 +6581,8 @@ static inline void tg3_netif_stop(struct tg3 *tp)
>
> static inline void tg3_netif_start(struct tg3 *tp)
> {
> + tg3_ptp_resume(tp);
> +
> /* NOTE: unconditional netif_tx_wake_all_queues is only
> * appropriate so long as all callers are assured to
> * have free tx slots (such as after tg3_init_hw)
> @@ -10364,7 +10420,8 @@ static void tg3_ints_fini(struct tg3 *tp)
> tg3_flag_clear(tp, ENABLE_TSS);
> }
>
> -static int tg3_start(struct tg3 *tp, bool reset_phy, bool test_irq)
> +static int tg3_start(struct tg3 *tp, bool reset_phy, bool test_irq,
> + bool init)
> {
> struct net_device *dev = tp->dev;
> int i, err;
> @@ -10443,6 +10500,12 @@ static int tg3_start(struct tg3 *tp, bool reset_phy, bool test_irq)
> tg3_flag_set(tp, INIT_COMPLETE);
> tg3_enable_ints(tp);
>
> + if (init)
> + tg3_ptp_init(tp);
> + else
> + tg3_ptp_resume(tp);
> +
> +
> tg3_full_unlock(tp);
>
> netif_tx_start_all_queues(dev);
> @@ -10540,11 +10603,19 @@ static int tg3_open(struct net_device *dev)
>
> tg3_full_unlock(tp);
>
> - err = tg3_start(tp, true, true);
> + err = tg3_start(tp, true, true, true);
> if (err) {
> tg3_frob_aux_power(tp, false);
> pci_set_power_state(tp->pdev, PCI_D3hot);
> }
> +
> + if (tg3_flag(tp, PTP_CAPABLE)) {
> + tp->ptp_clock = ptp_clock_register(&tp->ptp_info,
> + &tp->pdev->dev);
> + if (IS_ERR(tp->ptp_clock))
> + tp->ptp_clock = NULL;
> + }
> +
> return err;
> }
>
> @@ -10552,6 +10623,8 @@ static int tg3_close(struct net_device *dev)
> {
> struct tg3 *tp = netdev_priv(dev);
>
> + tg3_ptp_fini(tp);
> +
> tg3_stop(tp);
>
> /* Clear stats across close / open calls */
> @@ -11454,7 +11527,7 @@ static int tg3_set_channels(struct net_device *dev,
>
> tg3_carrier_off(tp);
>
> - tg3_start(tp, true, false);
> + tg3_start(tp, true, false, false);
>
> return 0;
> }
> @@ -12507,7 +12580,6 @@ static void tg3_self_test(struct net_device *dev, struct ethtool_test *etest,
> }
>
> tg3_full_lock(tp, irq_sync);
> -
> tg3_halt(tp, RESET_KIND_SUSPEND, 1);
> err = tg3_nvram_lock(tp);
> tg3_halt_cpu(tp, RX_CPU_BASE);
> @@ -16598,8 +16670,8 @@ static void tg3_io_resume(struct pci_dev *pdev)
> tg3_full_lock(tp, 0);
> tg3_flag_set(tp, INIT_COMPLETE);
> err = tg3_restart_hw(tp, 1);
> - tg3_full_unlock(tp);
> if (err) {
> + tg3_full_unlock(tp);
This is hunk or the next one somehow related to the PTP code?
If not, they it should go into their own patch.
> netdev_err(netdev, "Cannot restart hardware after reset.\n");
> goto done;
> }
> @@ -16610,6 +16682,8 @@ static void tg3_io_resume(struct pci_dev *pdev)
>
> tg3_netif_start(tp);
>
> + tg3_full_unlock(tp);
> +
> tg3_phy_start(tp);
>
> done:
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH v4 net-next] sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call
From: David Miller @ 2012-12-03 18:33 UTC (permalink / raw)
To: vyasevich; +Cc: michele, linux-sctp, nhorman, tgraf, netdev
In-Reply-To: <50BCD956.2090706@gmail.com>
From: Vlad Yasevich <vyasevich@gmail.com>
Date: Mon, 03 Dec 2012 11:54:46 -0500
> On 12/01/2012 09:49 AM, Michele Baldessari wrote:
>> The current SCTP stack is lacking a mechanism to have per association
>> statistics. This is an implementation modeled after OpenSolaris'
>> SCTP_GET_ASSOC_STATS.
>>
>> Userspace part will follow on lksctp if/when there is a general ACK on
>> this.
...
>> Signed-off: Michele Baldessari <michele@acksyn.org>
>
> Looks good
>
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH] irda: ep7211-sir: Convert to platform_diver
From: David Miller @ 2012-12-03 18:33 UTC (permalink / raw)
To: arnd; +Cc: shc_work, netdev, samuel
In-Reply-To: <201212012050.22382.arnd@arndb.de>
From: Arnd Bergmann <arnd@arndb.de>
Date: Sat, 1 Dec 2012 20:50:22 +0000
> On Saturday 01 December 2012, Alexander Shiyan wrote:
>> This patch converts ep7211-sir driver to platform_driver.
>> Since driver can be used not only for EP7211 CPU, function names
>> was be renamed to generic clps711x...
>>
>> Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] sundance: Enable WoL support
From: David Miller @ 2012-12-03 18:33 UTC (permalink / raw)
To: kda; +Cc: netdev
In-Reply-To: <1354387159-18793-1-git-send-email-kda@linux-powerpc.org>
From: Denis Kirjanov <kda@linux-powerpc.org>
Date: Sat, 1 Dec 2012 22:39:19 +0400
> Enable WoL support.
>
> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Applied.
^ permalink raw reply
* Re: [PATCH] net: phy: smsc: force all capable mode if the phy is started in powerdown mode
From: David Miller @ 2012-12-03 18:34 UTC (permalink / raw)
To: tremyfr
Cc: netdev, linux-kernel, otavio, javier, jkosina, eric.jarrige,
julien.boibessot, thomas.petazzoni
In-Reply-To: <1354394689-28281-1-git-send-email-tremyfr@yahoo.fr>
From: Philippe Reynes <tremyfr@yahoo.fr>
Date: Sat, 1 Dec 2012 21:44:49 +0100
> static int smsc_phy_config_init(struct phy_device *phydev)
> {
> - int rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
> + /*
> + * If the SMSC PHY is in power down mode, then set it
> + * in all capable mode before using it.
> + */
> + int rc = phy_read(phydev, MII_LAN83C185_SPECIAL_MODES);
This is formatted poorly. Do not put comments above the basic
block variable declarations, it looks terrible.
Also, comments in the networking code should be formatted:
/* Like
* this.
*/
/*
* Not like
* this.
*/
Thanks.
^ permalink raw reply
* Re: [PATCH] stmmac: remove two repeated macros
From: David Miller @ 2012-12-03 18:35 UTC (permalink / raw)
To: walimisdev; +Cc: peppe.cavallaro, netdev
In-Reply-To: <1354537184-1423-1-git-send-email-walimisdev@gmail.com>
From: Liming Wang <walimisdev@gmail.com>
Date: Mon, 3 Dec 2012 20:19:44 +0800
> Two macros have been defined twice, remove them.
>
> Signed-off-by: Liming Wang <walimisdev@gmail.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH] tun: only queue packets on device
From: David Miller @ 2012-12-03 18:41 UTC (permalink / raw)
To: mst; +Cc: netdev, jasowang, nhorman, ramirose, linux-kernel
In-Reply-To: <20121203131942.GA27953@redhat.com>
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Mon, 3 Dec 2012 15:19:43 +0200
> Historically tun supported two modes of operation:
> - in default mode, a small number of packets would get queued
> at the device, the rest would be queued in qdisc
> - in one queue mode, all packets would get queued at the device
>
> This might have made sense up to a point where we made the
> queue depth for both modes the same and set it to
> a huge value (500) so unless the consumer
> is stuck the chance of losing packets is small.
>
> Thus in practice both modes behave the same, but the
> default mode has some problems:
> - if packets are never consumed, fragments are never orphaned
> which cases a DOS for sender using zero copy transmit
> - overrun errors are hard to diagnose: fifo error is incremented
> only once so you can not distinguish between
> userspace that is stuck and a transient failure,
> tcpdump on the device does not show any traffic
>
> Userspace solves this simply by enabling IFF_ONE_QUEUE
> but there seems to be little point in not doing the
> right thing for everyone, by default.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that TUN_NO_QUEUE has no real effect and is a NOP, please document
it as such both in if_tun.h and the places in the driver that flip the
bit based upon userspace requests.
Thanks.
^ permalink raw reply
* Re: [PATCH] vhost-net: initialize zcopy packet counters
From: David Miller @ 2012-12-03 18:42 UTC (permalink / raw)
To: mst; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20121203173151.GA30949@redhat.com>
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Mon, 3 Dec 2012 19:31:51 +0200
> These packet counters are used to drive the zercopy
> selection heuristic so nothing too bad happens if they are off a bit -
> and they are also reset once in a while.
> But it's cleaner to clear them when backend is set so that
> we start in a known state.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Applied to net-next, thanks.
^ permalink raw reply
* Re: [RFC PATCH] tcp: introduce raw access to experimental options
From: Einar Lueck @ 2012-12-03 18:44 UTC (permalink / raw)
To: Yuchung Cheng; +Cc: netdev, frankbla, raspl, ubacher, samudrala, davem
In-Reply-To: <CAK6E8=cXin1rSvQNMPife692wkd5tswsR_7RVG6FNB0rqymHeg@mail.gmail.com>
On 11/28/2012 01:01 PM, Yuchung Cheng wrote:
> On Sat, Nov 17, 2012 at 12:54 AM, <elelueck@linux.vnet.ibm.com> wrote:
>> From: Einar Lueck <elelueck@linux.vnet.ibm.com>
>>
>> This patch adds means for raw acces to TCP expirimental options
>> 253 and 254. The intention of this is to enable user space
>> applications to implement communication behaviour that depends
>> on experimental options. For that, new (set|get)sockopts are
>
> Could you elaborate on the use case? I am having a hard time
> understanding that. If you need to use experimental options for your
> applications, why not just use another magic number according to
> draft-ietf-tcpm-experimental-options-02 (since you cite that too)?
We want to enable application programmers to exploit this without
having to change the kernel with all the corresponding implications.
Einar.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox