Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] virtio-net: fix data corruption with OOM
From: Michael S. Tsirkin @ 2009-10-26 18:42 UTC (permalink / raw)
  To: Rusty Russell; +Cc: virtualization, kvm, netdev
In-Reply-To: <200910261211.52148.rusty@rustcorp.com.au>

On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
> > virtio net used to unlink skbs from send queues on error,
> > but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> > we do not do this. This causes guest data corruption and crashes
> > with vhost since net core can requeue the skb or free it without
> > it being taken off the list.
> > 
> > This patch fixes this by queueing the skb after successfull
> > transmit.
> 
> I originally thought that this was racy: as soon as we do add_buf, we need to
> make sure we're ready for the callback (for virtio_pci, it's ->kick, but we
> shouldn't rely on that).

Modified the guest slightly, and I am getting crashes again.
I didn't have time to debug this, but based on previous experience,
I reverted 48925e372f04f5e35fec6269127c62b2c71ab794,
and the crash went away.
Rusty, what do you say we just revert 48925e372f04f5e35fec6269127c62b2c71ab794
for now?

How to reproduce: I used my vhost trees, and modified drivers/vhost/vhost.c :
-       vhost_workqueue = create_workqueue("vhost");
+       vhost_workqueue = create_singlethread_workqueue("vhost");

My guess is this modifies timing and uncovers more races,
but of course there is a possibility that the bug is in vhost.
Still, the fact that 2.6.31 and 48925e372f04f5e35fec6269127c62b2c71ab794
as a guest are both fine, this is a strong hint that
48925e372f04f5e35fec6269127c62b2c71ab794 is to blame.

[   24.555691] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008                      
[   24.556658] IP: [<ffffffffa003f1b1>] free_old_xmit_skbs+0x66/0xcd [virtio_net]                             
[   24.556658] PGD 3e9ee067 PUD 3f38d067 PMD 0                                                                
[   24.556658] Thread overran stack, or stack corrupted                                                       
[   24.556658] Oops: 0002 [#1] SMP                                                                            
[   24.556658] last sysfs file: /sys/devices/virtual/input/input1/capabilities/sw                             
[   24.556658] CPU 0                                                                                          
[   24.556658] Modules linked in: virtio_net virtio_blk virtio_pci virtio_ring virtio af_packet aacraid [last unloaded: scsi_wait_scan]                                                                                     
[   24.556658] Pid: 0, comm: swapper Tainted: G        W  2.6.32-rc4-net #6                                   
[   24.556658] RIP: 0010:[<ffffffffa003f1b1>]  [<ffffffffa003f1b1>] free_old_xmit_skbs+0x66/0xcd [virtio_net] 
[   24.556658] RSP: 0018:ffff880001c03d70  EFLAGS: 00010202                                                   
[   24.556658] RAX: ffff88003e951418 RBX: ffff88003e953398 RCX: 0000000000000000                              
[   24.556658] RDX: 0000000000000000 RSI: ffff880001c03d84 RDI: ffff88003e953398                              
[   24.556658] RBP: ffff880001c03db0 R08: ffff88003e2c949c R09: 00000000ffffffff                              
[   24.556658] R10: ffff880001c03f78 R11: 00000000fffbcc57 R12: ffff88003e65cdc0                              
[   24.556658] R13: 0000000000000000 R14: 2000000000000000 R15: ffff880001c03d84                              
[   24.556658] FS:  0000000000000000(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000                   
[   24.556658] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b                                              
[   24.556658] CR2: 0000000000000008 CR3: 000000003eee4000 CR4: 00000000000006b0                              
[   24.556658] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000                              
[   24.556658] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400                              
[   24.556658] Process swapper (pid: 0, threadinfo ffffffff8174e000, task ffffffff817c09f0)                   
[   24.556658] Stack:                                                                                         
[   24.556658]  0000000000000002 0000000000000000 0000000000000000 ffff88003e953398                           
[   24.556658] <0> ffff88003e953398 ffff88003e65cdc0 ffff88003e65c800 ffff88003e65ce70                        
[   24.556658] <0> ffff880001c03df0 ffffffffa003fb35 ffff88003e65cc28 ffff88003e953398                        
[   24.556658] Call Trace:                                                                                    
[   24.556658]  <IRQ>                                                                                         
[   24.556658]  [<ffffffffa003fb35>] start_xmit+0x38/0x15f [virtio_net]                                       
[   24.556658]  [<ffffffff813ff768>] dev_hard_start_xmit+0x26c/0x2d3                                          
[   24.556658]  [<ffffffff81412016>] sch_direct_xmit+0x5a/0x157                                               
[   24.556658]  [<ffffffff814121cf>] __qdisc_run+0xbc/0xdd                                                    
[   24.556658]  [<ffffffff813fce1c>] net_tx_action+0xc2/0x120                                                 
[   24.556658]  [<ffffffff81047efe>] __do_softirq+0xd8/0x192                                                  
[   24.556658]  [<ffffffff8100cb3c>] call_softirq+0x1c/0x28                                                   
[   24.556658]  [<ffffffff8100ddb7>] do_softirq+0x33/0x6b                                                     
[   24.556658]  [<ffffffff81047d5c>] irq_exit+0x36/0x75                                                       
[   24.556658]  [<ffffffff8100d692>] do_IRQ+0xa8/0xbf                                                         
[   24.556658]  [<ffffffff8100c3d3>] ret_from_intr+0x0/0xa                                                    
[   24.556658]  <EOI>                                                                                         
[   24.556658]  [<ffffffff81011de3>] ? default_idle+0x31/0x46                                                 
[   24.556658]  [<ffffffff81011dc5>] ? default_idle+0x13/0x46                                                 
[   24.556658]  [<ffffffff8100ae53>] ? cpu_idle+0x55/0x8d                                                     
[   24.556658]  [<ffffffff814d1982>] ? rest_init+0x66/0x68                                                    
[   24.556658]  [<ffffffff818adc5d>] ? start_kernel+0x360/0x36b                                               
[   24.556658]  [<ffffffff818ad29a>] ? x86_64_start_reservations+0xaa/0xae                                    
[   24.556658]  [<ffffffff818ad37f>] ? x86_64_start_kernel+0xe1/0xe8                                          
[   24.556658] Code: fc 26 00 00 00 75 75 41 ff 8c 24 c0 00 00 00 48 89 df 48 8b 13 48 8b 43 08 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 48 89 10 <48> 89 42 08 49 8b 54 24 20 8b 43 68 48 01 82 98 00 00 00 49 8b      
[   24.556658] RIP  [<ffffffffa003f1b1>] free_old_xmit_skbs+0x66/0xcd [virtio_net]                            
[   24.556658]  RSP <ffff880001c03d70>                                                                        
[   24.556658] CR2: 0000000000000008                                                                          
[   24.722629] ---[ end trace 6ac04221a0ae018b ]---                                                           
[   24.725010] Kernel panic - not syncing: Fatal exception in interrupt                                       
[   24.727696] Pid: 0, comm: swapper Tainted: G      D W  2.6.32-rc4-net #6                                   
[   24.730447] Call Trace:                                                                                    
[   24.732443]  <IRQ>  [<ffffffff814eb553>] panic+0x75/0x127                                                  
[   24.735097]  [<ffffffff814ee350>] oops_end+0xaa/0xba                                                       
[   24.737520]  [<ffffffff81029002>] no_context+0x1ea/0x1f9                                                   
[   24.740024]  [<ffffffff810291c4>] __bad_area_nosemaphore+0x1b3/0x1d9                                       
[   24.742779]  [<ffffffff810291f8>] bad_area_nosemaphore+0xe/0x10                                            
[   24.745399]  [<ffffffff814ef73c>] do_page_fault+0x186/0x2c3                                                
[   24.748009]  [<ffffffff814ed8bf>] page_fault+0x1f/0x30                                                     
[   24.750463]  [<ffffffffa003f1b1>] ? free_old_xmit_skbs+0x66/0xcd [virtio_net]                              
[   24.753299]  [<ffffffffa003fb35>] start_xmit+0x38/0x15f [virtio_net]                                       
[   24.755990]  [<ffffffff813ff768>] dev_hard_start_xmit+0x26c/0x2d3                                          
[   24.758635]  [<ffffffff81412016>] sch_direct_xmit+0x5a/0x157                                               
[   24.761204]  [<ffffffff814121cf>] __qdisc_run+0xbc/0xdd                                                    
[   24.763693]  [<ffffffff813fce1c>] net_tx_action+0xc2/0x120                                                 
[   24.766236]  [<ffffffff81047efe>] __do_softirq+0xd8/0x192                                                  
[   24.768754]  [<ffffffff8100cb3c>] call_softirq+0x1c/0x28                                                   
[   24.771326]  [<ffffffff8100ddb7>] do_softirq+0x33/0x6b                                                     
[   24.773793]  [<ffffffff81047d5c>] irq_exit+0x36/0x75                                                       
[   24.776241]  [<ffffffff8100d692>] do_IRQ+0xa8/0xbf                                                         
[   24.778705]  [<ffffffff8100c3d3>] ret_from_intr+0x0/0xa                                                    
[   24.781191]  <EOI>  [<ffffffff81011de3>] ? default_idle+0x31/0x46                                          
[   24.783961]  [<ffffffff81011dc5>] ? default_idle+0x13/0x46                                                 
[   24.786487]  [<ffffffff8100ae53>] ? cpu_idle+0x55/0x8d                                                     
[   24.788967]  [<ffffffff814d1982>] ? rest_init+0x66/0x68                                                    
[   24.791448]  [<ffffffff818adc5d>] ? start_kernel+0x360/0x36b                                               
[   24.794014]  [<ffffffff818ad29a>] ? x86_64_start_reservations+0xaa/0xae                                    
[   24.796747]  [<ffffffff818ad37f>] ? x86_64_start_kernel+0xe1/0xe8       



^ permalink raw reply

* 2.6.32-rc5-git3: Reported regressions from 2.6.31
From: Rafael J. Wysocki @ 2009-10-26 18:45 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

This message contains a list of some regressions from 2.6.31, for which there
are no fixes in the mainline I know of.  If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.31, please let me know
either and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-26       66       42          37
  2009-10-12       48       31          27
  2009-10-02       22       15           9


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14485
Subject		: System lockup running "cat /sys/kernel/debug/dri/0/i915_regs"
Submitter	: Miles Lane <miles.lane-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-26 4:00 (1 days old)
References	: http://marc.info/?l=linux-kernel&m=125652968117713&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14484
Subject		: no video output after suspend
Submitter	: Riccardo Magliocchetti <riccardo.magliocchetti-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-25 20:57 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125650430123713&w=4
Handled-By	: Jesse Barnes <jbarnes-Y1mF5jBUw70BENJcbMCuUQ@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14483
Subject		: WARNING: at drivers/base/sys.c:353 __sysdev_resume+0x54/0xca()
Submitter	: Justin Mattock <justinmattock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-25 19:58 (2 days old)
References	: http://marc.info/?l=linux-kernel&m=125650070420168&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14482
Subject		: kernel BUG at fs/dcache.c:670 +lvm +md +ext3
Submitter	: Alexander Clouter <alex-L4GPcECwBoDe9xe1eoZjHA@public.gmane.org>
Date		: 2009-10-23 10:30 (4 days old)
References	: http://lkml.org/lkml/2009/10/23/50


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14481
Subject		: umount blocked for more than 120 seconds after USB drive removal
Submitter	: Robert Hancock <hancockrwd-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-21 5:26 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=125610280532245&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14479
Subject		: nfs oops
Submitter	: Egon Alter <egon.alter-hi6Y0CQ0nG0@public.gmane.org>
Date		: 2009-10-19 16:03 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=125596822630410&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14477
Subject		: possible circular locking dependency in ISDN PPP
Submitter	: Tilman Schmidt <tilman-ZTO5kqT2PaM@public.gmane.org>
Date		: 2009-10-18 22:16 (9 days old)
References	: http://marc.info/?l=linux-kernel&m=125590423416087&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14473
Subject		: ATA related kernel warning after resume
Submitter	: Tino Keitel <tino.keitel-rAwCM5oiXHA@public.gmane.org>
Date		: 2009-10-14 6:55 (13 days old)
References	: http://marc.info/?l=linux-kernel&m=125550466624678&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14472
Subject		: EXT4 corruption
Submitter	: Shawn Starr <shawn.starr-bJEeYj9oJeDQT0dZR+AlfA@public.gmane.org>
Date		: 2009-10-13 2:07 (14 days old)
References	: http://marc.info/?l=linux-kernel&m=125539997508256&w=4
Handled-By	: Theodore Tso <tytso-3s7WtUTddSA@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14467
Subject		: Linker errors on ia64 with NR_CPUS=4096
Submitter	: Jeff Mahoney <jeffm-IBi9RG/b67k@public.gmane.org>
Date		: 2009-10-18 22:28 (9 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=34d76c41554a05425613d16efebb3069c4c545f0
References	: http://marc.info/?l=linux-kernel&m=125590493116720&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14466
Subject		: EFI boot on x86 fails in .32
Submitter	: Matthew Garrett <mjg59-1xO5oi07KQx4cg9Nei1l7Q@public.gmane.org>
Date		: 2009-10-20 0:34 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7bd867dfb4e0357e06a3211ab2bd0e714110def3
References	: http://marc.info/?l=linux-kernel&m=125599887314290&w=4
Handled-By	: Feng Tang <feng.tang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14442
Subject		: resume after hibernate: /dev/sdb drops and returns as /dev/sde
Submitter	: Duncan <1i5t5.duncan-j9pdmedNgrk@public.gmane.org>
Date		: 2009-10-20 01:52 (7 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14430
Subject		: sync() hangs in bdi_sched_wait
Submitter	: Petr Vandrovec <petr-vPk2MGR0e28uaRcfnNAh7A@public.gmane.org>
Date		: 2009-10-17 19:14 (10 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14415
Subject		: Reboot on kernel load
Submitter	: Brian Beardall <brian-sVkzCUl/XCrR7s880joybQ@public.gmane.org>
Date		: 2009-10-15 23:57 (12 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14408
Subject		: sysctl check failed
Submitter	: Peter Teoh <htmldeveloper-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-14 22:59 (13 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14406
Subject		: uvcvideo stopped work on Toshiba
Submitter	: okias <d.okias-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-14 19:08 (13 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14390
Subject		: "bind" a device to a driver doesn't not work anymore
Submitter	: Éric Piel <Eric.Piel-VkQ1JFuSMpfAbQlEx87xDw@public.gmane.org>
Date		: 2009-10-11 0:04 (16 days old)
References	: http://marc.info/?l=linux-kernel&m=125521979921241&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14389
Subject		: Build system issue
Submitter	: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Date		: 2009-10-09 8:58 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=575543347b5baed0ca927cb90ba8807396fe9cc9
References	: http://marc.info/?l=linux-kernel&m=125507914909152&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14387
Subject		: deadlock with fallocate
Submitter	: Thomas Neumann <tneumann-Rn4VEauK+AKRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Date		: 2009-10-07 3:00 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125488495526471&w=4
Handled-By	: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14384
Subject		: tbench regression with 2.6.32-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Date		: 2009-10-09 9:51 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=59abf02644c45f1591e1374ee7bb45dc757fcb88
References	: http://marc.info/?l=linux-kernel&m=125508216713138&w=4
Handled-By	: Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14383
Subject		: hackbench regression with kernel 2.6.32-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Date		: 2009-10-09 9:19 (18 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=29cd8bae396583a2ee9a3340db8c5102acf9f6fd
References	: http://marc.info/?l=linux-kernel&m=125508007510274&w=4
Handled-By	: Peter Zijlstra <a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14381
Subject		: iwlagn lost connection after s2ram (with warnings)
Submitter	: Carlos R. Mafra <crmafra2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-07 14:20 (20 days old)
References	: http://marc.info/?l=linux-kernel&m=125492569119947&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14378
Subject		: Problems with net/core/skbuff.c
Submitter	: Massimo Cetra <mcetra-BBpJ+9iBSNKonA0d6jMUrA@public.gmane.org>
Date		: 2009-10-08 14:51 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=125501488220358&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14376
Subject		: Kernel NULL pointer dereference/ kvm subsystem
Submitter	: Don Dupuis <dondster-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-06 14:38 (21 days old)
References	: http://marc.info/?l=linux-kernel&m=125484025021737&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14373
Subject		: Task blocked for more than 120 seconds
Submitter	: Zeno Davatz <zdavatz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-02 10:16 (25 days old)
References	: http://marc.info/?l=linux-kernel&m=125447858618412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14372
Subject		: ath5k wireless not working after suspend-resume - eeepc
Submitter	: Fabio Comolli <fabio.comolli-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-03 15:36 (24 days old)
References	: http://lkml.org/lkml/2009/10/3/91


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14355
Subject		: USB serial regression after 2.6.31.1 with Huawei E169 GSM modem
Submitter	: Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
Date		: 2009-10-10 03:07 (17 days old)
References	: http://marc.info/?l=linux-kernel&m=125513456327542&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14354
Subject		: Bad corruption with 2.6.32-rc1 and upwards
Submitter	: Holger Freyther <zecke-MQnelBtSfJRAfugRpC6u6w@public.gmane.org>
Date		: 2009-10-09 15:42 (18 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14353
Subject		: BUG: sleeping function called from invalid context at kernel/mutex.c:280
Submitter	: Miles Lane <miles.lane-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-05 3:39 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=125471432208671&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14352
Subject		: WARNING: at net/mac80211/scan.c:267
Submitter	: Maciej Rutecki <maciej.rutecki-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-08 00:30 (19 days old)
References	: http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2089#c7


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14334
Subject		: pcmcia suspend regression from 2.6.31.1 to 2.6.31.2 - Dell Inspiron 600m
Submitter	: Jose Marino <braket-PkbjNfxxIARBDgjK7y7TUQ@public.gmane.org>
Date		: 2009-10-06 15:44 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14331
Subject		: Radeon XPRESS 200M: System hang with radeon DRI and Fedora 10 userspace unless DRI=off
Submitter	: Alex Villacis Lasso <avillaci-x0m+Mc+nT7uljOmnV8AmnkElSqmLX1BE@public.gmane.org>
Date		: 2009-10-06 00:29 (21 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14299
Subject		: oops in wireless, iwl3945 related?
Submitter	: Pavel Machek <pavel-+ZI9xUNit7I@public.gmane.org>
Date		: 2009-09-29 17:12 (28 days old)
References	: http://marc.info/?l=linux-kernel&m=125424439725743&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14298
Subject		: warning at manage.c:361 (set_irq_wake), matrix-keypad related?
Submitter	: Pavel Machek <pavel-+ZI9xUNit7I@public.gmane.org>
Date		: 2009-09-30 20:07 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125434130703538&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14297
Subject		: console resume broken since ba15ab0e8d
Submitter	: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
Date		: 2009-09-30 15:11 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125432349404060&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14296
Subject		: spitz boots but suspend/resume is broken
Submitter	: Pavel Machek <pavel-+ZI9xUNit7I@public.gmane.org>
Date		: 2009-09-30 12:06 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125431244516449&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14277
Subject		: Caught 8-bit read from freed memory in b43 driver at association
Submitter	: Christian Casteyde <casteyde.christian-GANU6spQydw@public.gmane.org>
Date		: 2009-09-30 18:06 (27 days old)


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14480
Subject		: 2 locks held by cat -- running "find /sys | head -c 4" --> system hang
Submitter	: Miles Lane <miles.lane-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-20 16:11 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=125605511728088&w=4
Handled-By	: Chris Wilson <chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn@public.gmane.org>
Patch		: http://patchwork.kernel.org/patch/54974/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14380
Subject		: Video tearing/glitching with T400 laptops
Submitter	: Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org>
Date		: 2009-10-02 22:40 (25 days old)
References	: http://marc.info/?l=linux-kernel&m=125452324520623&w=4
Handled-By	: Jesse Barnes <jbarnes-Y1mF5jBUw70BENJcbMCuUQ@public.gmane.org>
Patch		: http://marc.info/?l=linux-kernel&m=125591495325000&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14379
Subject		: ACPI Warning for _SB_.BAT0._BIF: Converted Buffer to expected String
Submitter	: Justin Mattock <justinmattock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-08 21:46 (19 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d9adc2e031bd22d5d9607a53a8d3b30e0b675f39
References	: http://marc.info/?l=linux-kernel&m=125504031328941&w=4
Handled-By	: Alexey Starikovskiy <astarikovskiy-l3A5Bk7waGM@public.gmane.org>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=23347


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14375
Subject		: Intel(R) I/OAT DMA Engine init failed
Submitter	: Alexander Beregalov <a.beregalov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-02 9:46 (25 days old)
References	: http://marc.info/?l=linux-kernel&m=125447680016160&w=4
Handled-By	: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Patch		: http://patchwork.kernel.org/patch/51808/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14302
Subject		: Kernel panic on i386 machine when booting with profile=2
Submitter	: Shi, Alex <alex.shi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Date		: 2009-10-01 3:23 (26 days old)
References	: http://marc.info/?l=linux-kernel&m=125436749607199&w=4
Handled-By	: Alex Shi <alex.shi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Patch		: http://patchwork.kernel.org/patch/50813/


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.31,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=14230

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* performance regression in virtio-net in 2.6.32-rc4
From: Michael S. Tsirkin @ 2009-10-26 18:48 UTC (permalink / raw)
  To: Rusty Russell; +Cc: virtualization, kvm, netdev

Hi!
I noticed a performance regression in virtio net: going from
2.6.31 to 2.6.32-rc4 I see this, for guest to host communication:

[mst@tuck ~]$ ssh robin sh streamtest1
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.3
(11.0.0.3) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.20    7806.48


[mst@tuck ~]$ ssh robin sh streamtest1
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.3
(11.0.0.3) port 0 AF_INET : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    6814.60


Note: I had to revert 48925e372f04f5e35fec6269127c62b2c71ab794,
and I applied a patch
	virtio-pci: fix per-vq MSI-X request logic
which fixes a bug introduced by f68d24082e22ccee3077d11aeb6dc5354f0ca7f1.

Any tips on debugging this?

-- 
MST

^ permalink raw reply

* 2.6.32-rc5-git3: Reported regressions 2.6.30 -> 2.6.31
From: Rafael J. Wysocki @ 2009-10-26 19:26 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List, Linux Wireless List, DRI

This message contains a list of some regressions introduced between 2.6.30 and
2.6.31, for which there are no fixes in the mainline I know of.  If any of them
have been fixed already, please let me know.

If you know of any other unresolved regressions introduced between 2.6.30
and 2.6.31, please let me know either and I'll add them to the list.
Also, please let me know if any of the entries below are invalid.

Each entry from the list will be sent additionally in an automatic reply to
this message with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2009-10-26      170       37          32
  2009-10-12      161       45          35
  2009-10-02      151       49          42
  2009-09-06      123       34          27
  2009-08-26      108       33          26
  2009-08-20      102       32          29
  2009-08-10       89       27          24
  2009-08-02       76       36          28
  2009-07-27       70       51          43
  2009-07-07       35       25          21
  2009-06-29       22       22          15


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14476
Subject		: Unable to handle kernel paging request in nfs_write_mapping
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-10-14 9:53 (13 days old)
References	: http://marc.info/?l=linux-kernel&m=125551421405656&w=4
Handled-By	: Trond Myklebust <Trond.Myklebust@netapp.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14474
Subject		: restorecond going crazy on 2.6.31.4 - inotify regression?
Submitter	: Robert Hancock <hancockrwd@gmail.com>
Date		: 2009-10-16 0:03 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=125565159520489&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14446
Subject		: battery status info broken/useless in 2.6.32-rc3 - MSI PR200 (possibly others, too)
Submitter	: Eddy Petrișor <eddy.petrisor+linbug@gmail.com>
Date		: 2009-10-20 08:25 (7 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14417
Subject		: [Regression] Wireless driver iwlagn+iwlcore doesn't work after resume (needs reloading)
Submitter	: Eddy Petrișor <eddy.petrisor+linbug@gmail.com>
Date		: 2009-10-16 11:07 (11 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14402
Subject		: Atheros ath9k module is not working with 2.6.31.1 on an Acer Extensa 7630EZ
Submitter	: Bernhard <berndl81@gmx.at>
Date		: 2009-10-14 11:17 (13 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14400
Subject		: disable/enable wlan broken with ath5k
Submitter	: Daniel Bumke <danielbumke@gmail.com>
Date		: 2009-10-13 12:35 (14 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14391
Subject		: use after free of struct powernow_k8_data
Submitter	: Michal Schmidt <mschmidt@redhat.com>
Date		: 2009-09-24 14:51 (33 days old)
References	: http://marc.info/?l=linux-kernel&m=125380383515615&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14388
Subject		: keyboard under X with 2.6.31
Submitter	: Frédéric L. W. Meunier <fredlwm@gmail.com>
Date		: 2009-10-07 20:19 (20 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e043e42bdb66885b3ac10d27a01ccb9972e2b0a3
References	: http://marc.info/?l=linux-kernel&m=125494753228217&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14385
Subject		: DMAR regression in 2.6.31 leads to ext4 corruption?
Submitter	: Andy Isaacson <adi@hexapodia.org>
Date		: 2009-10-08 23:56 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=125504643703877&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14294
Subject		: kernel BUG at drivers/ide/ide-disk.c:187
Submitter	: Santiago Garcia Mantinan <manty@manty.net>
Date		: 2009-09-30 11:05 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=125430926311466&w=4
Handled-By	: David Miller <davem@davemloft.net>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14267
Subject		: Disassociating atheros wlan
Submitter	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
Date		: 2009-09-24 10:16 (33 days old)
References	: http://marc.info/?l=linux-kernel&m=125378723723384&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14265
Subject		: ifconfig: page allocation failure. order:5, mode:0x8020 w/ e100
Submitter	: Karol Lewandowski <karol.k.lewandowski@gmail.com>
Date		: 2009-09-15 12:05 (42 days old)
References	: http://marc.info/?l=linux-kernel&m=125301636509517&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14257
Subject		: Not able to boot on 32 bit System
Submitter	: Rishikesh <risrajak@linux.vnet.ibm.com>
Date		: 2009-09-21 15:25 (36 days old)
References	: http://marc.info/?l=linux-kernel&m=125354604314412&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14256
Subject		: kernel BUG at fs/ext3/super.c:435
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2009-09-21 7:29 (36 days old)
References	: http://marc.info/?l=linux-kernel&m=125351816109264&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14252
Subject		: WARNING: at include/linux/skbuff.h:1382 w/ e1000
Submitter	: Stephan von Krawczynski <skraw@ithnet.com>
Date		: 2009-09-20 11:26 (37 days old)
References	: http://marc.info/?l=linux-kernel&m=125344599006033&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14249
Subject		: BUG: oops in gss_validate on 2.6.31
Submitter	: Bastian Blank <bastian@waldi.eu.org>
Date		: 2009-09-16 10:29 (41 days old)
References	: http://marc.info/?l=linux-kernel&m=125309700417283&w=4
Handled-By	: Trond Myklebust <trond.myklebust@fys.uio.no>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14248
Subject		: 2.6.31 wireless: WARNING: at net/wireless/ibss.c:34
Submitter	: Jurriaan <thunder8@xs4all.nl>
Date		: 2009-09-13 7:32 (44 days old)
References	: http://marc.info/?l=linux-kernel&m=125282721113553&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14204
Subject		: MCE prevent booting on my computer(pentium iii @500Mhz)
Submitter	: GNUtoo <GNUtoo@no-log.org>
Date		: 2009-09-21 20:36 (36 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14181
Subject		: b43 causes panic at ifconfig down / shutdown
Submitter	: Jeremy Huddleston <jeremyhu@freedesktop.org>
Date		: 2009-09-15 18:34 (42 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14157
Subject		: end_request: I/O error, dev cciss/cXdX, sector 0
Submitter	:  <jiri.harcarik@gmail.com>
Date		: 2009-09-11 07:42 (46 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14141
Subject		: order 2 page allocation failures in iwlagn
Submitter	: Frans Pop <elendil@planet.nl>
Date		: 2009-09-06 7:40 (51 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ff05b2b4eac2e63d345fc731ea151a060247f53
References	: http://marc.info/?l=linux-kernel&m=125222287419691&w=4
		  http://lkml.org/lkml/2009/10/2/86
		  http://lkml.org/lkml/2009/10/5/24
		  http://lkml.indiana.edu/hypermail/linux/kernel/0910.1/01395.html
Handled-By	: Pekka Enberg <penberg@cs.helsinki.fi>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14114
Subject		: Tuning a saa7134 based card is broken in kernel 2.6.31-rc7
Submitter	: Tsvety Petrov <Tsvetoslav.Petrov@itron.com>
Date		: 2009-09-03 21:06 (54 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14090
Subject		: WARNING: at fs/notify/inotify/inotify_user.c:394
Submitter	: Joerg Platte <bugzilla@jako.ping.de>
Date		: 2009-08-30 15:21 (58 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14058
Subject		: Oops in fsnotify
Submitter	: Grant Wilson <grant.wilson@zen.co.uk>
Date		: 2009-08-20 15:48 (68 days old)
References	: http://marc.info/?l=linux-kernel&m=125078450923133&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13987
Subject		: Received NMI interrupt at resume
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2009-08-15 07:55 (73 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f41f3f373dd72344c65d801d6381fe83ef3a2c54


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13943
Subject		: WARNING: at net/mac80211/mlme.c:2292 with ath5k
Submitter	: Fabio Comolli <fabio.comolli@gmail.com>
Date		: 2009-08-06 20:15 (82 days old)
References	: http://marc.info/?l=linux-kernel&m=124958978600600&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13941
Subject		: x86 Geode issue
Submitter	: Martin-Éric Racine <q-funk@iki.fi>
Date		: 2009-08-03 12:58 (85 days old)
References	: http://marc.info/?l=linux-kernel&m=124930434732481&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13906
Subject		: Huawei E169 GPRS connection causes Ooops
Submitter	: Clemens Eisserer <linuxhippy@gmail.com>
Date		: 2009-08-04 09:02 (84 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13836
Subject		: suspend script fails, related to stdout?
Submitter	: Tomas M. <tmezzadra@gmail.com>
Date		: 2009-07-17 21:24 (102 days old)
References	: http://marc.info/?l=linux-kernel&m=124785853811667&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13809
Subject		: oprofile: possible circular locking dependency detected
Submitter	: Jerome Marchand <jmarchan@redhat.com>
Date		: 2009-07-22 13:35 (97 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13733
Subject		: 2.6.31-rc2: irq 16: nobody cared
Submitter	: Niel Lambrechts <niel.lambrechts@gmail.com>
Date		: 2009-07-06 18:32 (113 days old)
References	: http://marc.info/?l=linux-kernel&m=124690524027166&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13645
Subject		: NULL pointer dereference at (null) (level2_spare_pgt)
Submitter	: poornima nayak <mpnayak@linux.vnet.ibm.com>
Date		: 2009-06-17 17:56 (132 days old)
References	: http://lkml.org/lkml/2009/6/17/194


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14340
Subject		: speedstep-ich driver not working in 2.6.31
Submitter	:  <dave.mueller@gmx.ch>
Date		: 2009-10-07 08:16 (20 days old)
Handled-By	: Eric Pielbug <e.a.b.piel@tudelft.nl>
		  Rusty Russell <rusty@rustcorp.com.au>
Patch		: http://patchwork.kernel.org/patch/54672/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14258
Subject		: Memory leak in SCSI initialization
Submitter	: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Date		: 2009-09-22 4:18 (35 days old)
References	: http://marc.info/?l=linux-kernel&m=125359311312243&w=4
Handled-By	: Michael Ellerman <michael@ellerman.id.au>
		  James Bottomley <James.Bottomley@suse.de>
Patch		: http://patchwork.kernel.org/patch/51412/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14253
Subject		: Oops in driversbasefirmware_class
Submitter	: Lars Ericsson <Lars_Ericsson@telia.com>
Date		: 2009-09-16 20:44 (41 days old)
References	: http://lkml.org/lkml/2009/9/16/461
Handled-By	: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Patch		: http://patchwork.kernel.org/patch/49914/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14137
Subject		: usb console regressions
Submitter	: Jason Wessel <jason.wessel@windriver.com>
Date		: 2009-09-05 21:08 (52 days old)
References	: http://marc.info/?l=linux-kernel&m=125218501310512&w=4
Handled-By	: Jason Wessel <jason.wessel@windriver.com>
Patch		: http://patchwork.kernel.org/patch/45953/
		  http://patchwork.kernel.org/patch/45952/


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14017
Subject		: _end symbol missing from Symbol.map
Submitter	: Hannes Reinecke <hare@suse.de>
Date		: 2009-08-13 6:45 (75 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=091e52c3551d3031343df24b573b770b4c6c72b6
References	: http://marc.info/?l=linux-kernel&m=125014649102253&w=4
Handled-By	: Hannes Reinecke <hare@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=125014649102253&w=4


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions introduced
between 2.6.30 and 2.6.31, unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=13615

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] virtio-net: fix data corruption with OOM
From: Michael S. Tsirkin @ 2009-10-26 19:34 UTC (permalink / raw)
  To: Rusty Russell; +Cc: virtualization, kvm, netdev
In-Reply-To: <20091026184243.GA26473@redhat.com>

On Mon, Oct 26, 2009 at 08:42:43PM +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 26, 2009 at 12:11:51PM +1030, Rusty Russell wrote:
> > On Mon, 26 Oct 2009 03:33:40 am Michael S. Tsirkin wrote:
> > > virtio net used to unlink skbs from send queues on error,
> > > but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
> > > we do not do this. This causes guest data corruption and crashes
> > > with vhost since net core can requeue the skb or free it without
> > > it being taken off the list.
> > > 
> > > This patch fixes this by queueing the skb after successfull
> > > transmit.
> > 
> > I originally thought that this was racy: as soon as we do add_buf, we need to
> > make sure we're ready for the callback (for virtio_pci, it's ->kick, but we
> > shouldn't rely on that).
> 
> Modified the guest slightly, and I am getting crashes again.
> I didn't have time to debug this, but based on previous experience,
> I reverted 48925e372f04f5e35fec6269127c62b2c71ab794,
> and the crash went away.
> Rusty, what do you say we just revert 48925e372f04f5e35fec6269127c62b2c71ab794
> for now?

Hmm. Can't reproduce the crash anymore.
There is a small chance that the problem was my error,
so I guess I should try to reproduce and debug this,
after all.

^ permalink raw reply

* [PATCH] PPPoE: Fix flush/close races.
From: Michal Ostrowski @ 2009-10-26 19:51 UTC (permalink / raw)
  To: linux-ppp, netdev, Cyrill Gorcunov
In-Reply-To: <1256586498-6230-1-git-send-email-mostrows@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 975 bytes --]

Be more careful about the state of pointers during tear-down.
The "pppoe_dev" field can only be looked at safely while holding socket locks.
This subsequently allows for the flush_lock to be killed.

We depend on the PPPOX_CONNECTED state to tell us that that those fields are
valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
the dev_put() call.

We also have to ensure that we delete_item() on all sockets before they are
cleaned up.

The need for these changes has been exposed by scenarios wherein namespace
bindings of ethernet devices change while there are ongoing PPPoE sessions,
which resulted in oopses due to unusual socket connection termination paths,
exposing these issues.

Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
---
 drivers/net/pppoe.c |  129 +++++++++++++++++++++++++++------------------------
 1 files changed, 68 insertions(+), 61 deletions(-)

[-- Attachment #2: 0001-PPPoE-Fix-flush-close-races.patch --]
[-- Type: text/x-patch, Size: 7871 bytes --]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 7cbf6f9..2559991 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -111,9 +111,6 @@ struct pppoe_net {
 	rwlock_t hash_lock;
 };
 
-/* to eliminate a race btw pppoe_flush_dev and pppoe_release */
-static DEFINE_SPINLOCK(flush_lock);
-
 /*
  * PPPoE could be in the following stages:
  * 1) Discovery stage (to obtain remote MAC and Session ID)
@@ -303,45 +300,48 @@ static void pppoe_flush_dev(struct net_device *dev)
 	write_lock_bh(&pn->hash_lock);
 	for (i = 0; i < PPPOE_HASH_SIZE; i++) {
 		struct pppox_sock *po = pn->hash_table[i];
+		struct sock *sk;
 
-		while (po != NULL) {
-			struct sock *sk;
-			if (po->pppoe_dev != dev) {
+		while (po) {
+			while (po && po->pppoe_dev != dev) {
 				po = po->next;
-				continue;
 			}
+
+			if (!po)
+				break;
+
 			sk = sk_pppox(po);
-			spin_lock(&flush_lock);
-			po->pppoe_dev = NULL;
-			spin_unlock(&flush_lock);
-			dev_put(dev);
 
 			/* We always grab the socket lock, followed by the
-			 * hash_lock, in that order.  Since we should
-			 * hold the sock lock while doing any unbinding,
-			 * we need to release the lock we're holding.
-			 * Hold a reference to the sock so it doesn't disappear
-			 * as we're jumping between locks.
+			 * hash_lock, in that order.  Since we should hold the
+			 * sock lock while doing any unbinding, we need to
+			 * release the lock we're holding.  Hold a reference to
+			 * the sock so it doesn't disappear as we're jumping
+			 * between locks.
 			 */
 
 			sock_hold(sk);
-
 			write_unlock_bh(&pn->hash_lock);
 			lock_sock(sk);
 
-			if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+			if (po->pppoe_dev == dev
+			    && sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
 				pppox_unbind_sock(sk);
 				sk->sk_state = PPPOX_ZOMBIE;
 				sk->sk_state_change(sk);
+				po->pppoe_dev = NULL;
+				dev_put(dev);
 			}
 
 			release_sock(sk);
 			sock_put(sk);
 
-			/* Restart scan at the beginning of this hash chain.
-			 * While the lock was dropped the chain contents may
-			 * have changed.
+			/* Restart the process from the start of the current
+			 * hash chain. We dropped locks so the world may have
+			 * change from underneath us.
 			 */
+
+			BUG_ON(pppoe_pernet(dev_net(dev)) == NULL);
 			write_lock_bh(&pn->hash_lock);
 			po = pn->hash_table[i];
 		}
@@ -388,11 +388,16 @@ static int pppoe_rcv_core(struct sock *sk, struct sk_buff *skb)
 	struct pppox_sock *po = pppox_sk(sk);
 	struct pppox_sock *relay_po;
 
+	/* Backlog receive. Semantics of backlog rcv preclude any code from
+	 * executing in lock_sock()/release_sock() bounds; meaning sk->sk_state
+	 * can't change.
+	 */
+
 	if (sk->sk_state & PPPOX_BOUND) {
 		ppp_input(&po->chan, skb);
 	} else if (sk->sk_state & PPPOX_RELAY) {
-		relay_po = get_item_by_addr(dev_net(po->pppoe_dev),
-						&po->pppoe_relay);
+		relay_po = get_item_by_addr(sock_net(sk),
+					    &po->pppoe_relay);
 		if (relay_po == NULL)
 			goto abort_kfree;
 
@@ -447,6 +452,10 @@ static int pppoe_rcv(struct sk_buff *skb, struct net_device *dev,
 		goto drop;
 
 	pn = pppoe_pernet(dev_net(dev));
+
+	/* Note that get_item does a sock_hold(), so sk_pppox(po)
+	 * is known to be safe.
+	 */
 	po = get_item(pn, ph->sid, eth_hdr(skb)->h_source, dev->ifindex);
 	if (!po)
 		goto drop;
@@ -561,6 +570,7 @@ static int pppoe_release(struct socket *sock)
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po;
 	struct pppoe_net *pn;
+	struct net *net = NULL;
 
 	if (!sk)
 		return 0;
@@ -571,44 +581,28 @@ static int pppoe_release(struct socket *sock)
 		return -EBADF;
 	}
 
+	po = pppox_sk(sk);
+
+	if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+		dev_put(po->pppoe_dev);
+		po->pppoe_dev = NULL;
+	}
+
 	pppox_unbind_sock(sk);
 
 	/* Signal the death of the socket. */
 	sk->sk_state = PPPOX_DEAD;
 
-	/*
-	 * pppoe_flush_dev could lead to a race with
-	 * this routine so we use flush_lock to eliminate
-	 * such a case (we only need per-net specific data)
-	 */
-	spin_lock(&flush_lock);
-	po = pppox_sk(sk);
-	if (!po->pppoe_dev) {
-		spin_unlock(&flush_lock);
-		goto out;
-	}
-	pn = pppoe_pernet(dev_net(po->pppoe_dev));
-	spin_unlock(&flush_lock);
+	net = sock_net(sk);
+	pn = pppoe_pernet(net);
 
 	/*
 	 * protect "po" from concurrent updates
 	 * on pppoe_flush_dev
 	 */
-	write_lock_bh(&pn->hash_lock);
+	delete_item(pn, po->pppoe_pa.sid, po->pppoe_pa.remote,
+		    po->pppoe_ifindex);
 
-	po = pppox_sk(sk);
-	if (stage_session(po->pppoe_pa.sid))
-		__delete_item(pn, po->pppoe_pa.sid, po->pppoe_pa.remote,
-				po->pppoe_ifindex);
-
-	if (po->pppoe_dev) {
-		dev_put(po->pppoe_dev);
-		po->pppoe_dev = NULL;
-	}
-
-	write_unlock_bh(&pn->hash_lock);
-
-out:
 	sock_orphan(sk);
 	sock->sk = NULL;
 
@@ -625,8 +619,9 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	struct sock *sk = sock->sk;
 	struct sockaddr_pppox *sp = (struct sockaddr_pppox *)uservaddr;
 	struct pppox_sock *po = pppox_sk(sk);
-	struct net_device *dev;
+	struct net_device *dev = NULL;
 	struct pppoe_net *pn;
+	struct net *net = NULL;
 	int error;
 
 	lock_sock(sk);
@@ -652,12 +647,14 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	/* Delete the old binding */
 	if (stage_session(po->pppoe_pa.sid)) {
 		pppox_unbind_sock(sk);
+		pn = pppoe_pernet(sock_net(sk));
+		delete_item(pn, po->pppoe_pa.sid,
+			    po->pppoe_pa.remote, po->pppoe_ifindex);
 		if (po->pppoe_dev) {
-			pn = pppoe_pernet(dev_net(po->pppoe_dev));
-			delete_item(pn, po->pppoe_pa.sid,
-				po->pppoe_pa.remote, po->pppoe_ifindex);
 			dev_put(po->pppoe_dev);
+			po->pppoe_dev = NULL;
 		}
+
 		memset(sk_pppox(po) + 1, 0,
 		       sizeof(struct pppox_sock) - sizeof(struct sock));
 		sk->sk_state = PPPOX_NONE;
@@ -666,16 +663,15 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	/* Re-bind in session stage only */
 	if (stage_session(sp->sa_addr.pppoe.sid)) {
 		error = -ENODEV;
-		dev = dev_get_by_name(sock_net(sk), sp->sa_addr.pppoe.dev);
+		net = sock_net(sk);
+		dev = dev_get_by_name(net, sp->sa_addr.pppoe.dev);
 		if (!dev)
-			goto end;
+			goto err_put;
 
 		po->pppoe_dev = dev;
 		po->pppoe_ifindex = dev->ifindex;
-		pn = pppoe_pernet(dev_net(dev));
-		write_lock_bh(&pn->hash_lock);
+		pn = pppoe_pernet(net);
 		if (!(dev->flags & IFF_UP)) {
-			write_unlock_bh(&pn->hash_lock);
 			goto err_put;
 		}
 
@@ -683,6 +679,7 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 		       &sp->sa_addr.pppoe,
 		       sizeof(struct pppoe_addr));
 
+		write_lock_bh(&pn->hash_lock);
 		error = __set_item(pn, po);
 		write_unlock_bh(&pn->hash_lock);
 		if (error < 0)
@@ -696,8 +693,11 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 		po->chan.ops = &pppoe_chan_ops;
 
 		error = ppp_register_net_channel(dev_net(dev), &po->chan);
-		if (error)
+		if (error) {
+			delete_item(pn, po->pppoe_pa.sid,
+				    po->pppoe_pa.remote, po->pppoe_ifindex);
 			goto err_put;
+		}
 
 		sk->sk_state = PPPOX_CONNECTED;
 	}
@@ -915,6 +915,14 @@ static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
 	struct pppoe_hdr *ph;
 	int data_len = skb->len;
 
+	/* The higher-level PPP code (ppp_unregister_channel()) ensures the PPP
+	 * xmit operations conclude prior to an unregistration call.  Thus
+	 * sk->sk_state cannot change, so we don't need to do lock_sock().
+	 * But, we also can't do a lock_sock since that introduces a potential
+	 * deadlock as we'd reverse the lock ordering used when calling
+	 * ppp_unregister_channel().
+	 */
+
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
 		goto abort;
 
@@ -944,7 +952,6 @@ static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
 			po->pppoe_pa.remote, NULL, data_len);
 
 	dev_queue_xmit(skb);
-
 	return 1;
 
 abort:

^ permalink raw reply related

* Re: [PATCH] PPPoE: Fix flush/close races.
From: Cyrill Gorcunov @ 2009-10-26 19:59 UTC (permalink / raw)
  To: Michal Ostrowski; +Cc: linux-ppp, netdev, Denys Fedoryschenko, Eric Dumazet
In-Reply-To: <e6d1cecd0910261251w721b8258n7dfc1bac9d01af8b@mail.gmail.com>

[Michal Ostrowski - Mon, Oct 26, 2009 at 02:51:52PM -0500]
| Be more careful about the state of pointers during tear-down.
| The "pppoe_dev" field can only be looked at safely while holding socket locks.
| This subsequently allows for the flush_lock to be killed.
| 
| We depend on the PPPOX_CONNECTED state to tell us that that those fields are
| valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
| the dev_put() call.
| 
| We also have to ensure that we delete_item() on all sockets before they are
| cleaned up.
| 
| The need for these changes has been exposed by scenarios wherein namespace
| bindings of ethernet devices change while there are ongoing PPPoE sessions,
| which resulted in oopses due to unusual socket connection termination paths,
| exposing these issues.
| 
| Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
| Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
...

Thanks a lot Michal!

I think we should add as well

Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Tested-by: Denys Fedoryschenko <denys@visp.net.lb>

	-- Cyrill

^ permalink raw reply

* Re: [PATCH] PPPoE: Fix flush/close races.
From: Denys Fedoryschenko @ 2009-10-26 20:05 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Michal Ostrowski, linux-ppp, netdev, Eric Dumazet
In-Reply-To: <20091026195933.GC5321@lenovo>

On Monday 26 October 2009 21:59:33 Cyrill Gorcunov wrote:
> [Michal Ostrowski - Mon, Oct 26, 2009 at 02:51:52PM -0500]
>
> | Be more careful about the state of pointers during tear-down.
> | The "pppoe_dev" field can only be looked at safely while holding socket
> | locks. This subsequently allows for the flush_lock to be killed.
> |
> | We depend on the PPPOX_CONNECTED state to tell us that that those fields
> | are valid, so whoever clears that state (pppox_unbind_sock()) is
> | responsible for the dev_put() call.
> |
> | We also have to ensure that we delete_item() on all sockets before they
> | are cleaned up.
> |
> | The need for these changes has been exposed by scenarios wherein
> | namespace bindings of ethernet devices change while there are ongoing
> | PPPoE sessions, which resulted in oopses due to unusual socket connection
> | termination paths, exposing these issues.
> |
> | Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
> | Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
>
> ...
>
> Thanks a lot Michal!
>
> I think we should add as well
>
> Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
> Tested-by: Denys Fedoryschenko <denys@visp.net.lb>
>
> 	-- Cyrill

Yes, till now everything working perfectly. Confirming :-)

^ permalink raw reply

* [PATCH][Revised Log] PPPoE: Fix flush/close races.
From: Michal Ostrowski @ 2009-10-26 20:06 UTC (permalink / raw)
  To: linux-ppp, netdev, Cyrill Gorcunov, Denys Fedoryschenko
In-Reply-To: <1256587405-7073-1-git-send-email-mostrows@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1081 bytes --]

Be more careful about the state of pointers during tear-down.
The "pppoe_dev" field can only be looked at safely while holding socket locks.
This subsequently allows for the flush_lock to be killed.

We depend on the PPPOX_CONNECTED state to tell us that that those fields are
valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
the dev_put() call.

We also have to ensure that we delete_item() on all sockets before they are
cleaned up.

The need for these changes has been exposed by scenarios wherein namespace
bindings of ethernet devices change while there are ongoing PPPoE sessions,
which resulted in oopses due to unusual socket connection termination paths,
exposing these issues.

Signed-off-by: Michal Ostrowski <mostrows@gmail.com>
Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com>
Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Tested-by: Denys Fedoryschenko <denys@visp.net.lb>
---
 drivers/net/pppoe.c |  129 +++++++++++++++++++++++++++------------------------
 1 files changed, 68 insertions(+), 61 deletions(-)

[-- Attachment #2: 0001-PPPoE-Fix-flush-close-races.patch --]
[-- Type: text/x-patch, Size: 7871 bytes --]

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 7cbf6f9..2559991 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -111,9 +111,6 @@ struct pppoe_net {
 	rwlock_t hash_lock;
 };
 
-/* to eliminate a race btw pppoe_flush_dev and pppoe_release */
-static DEFINE_SPINLOCK(flush_lock);
-
 /*
  * PPPoE could be in the following stages:
  * 1) Discovery stage (to obtain remote MAC and Session ID)
@@ -303,45 +300,48 @@ static void pppoe_flush_dev(struct net_device *dev)
 	write_lock_bh(&pn->hash_lock);
 	for (i = 0; i < PPPOE_HASH_SIZE; i++) {
 		struct pppox_sock *po = pn->hash_table[i];
+		struct sock *sk;
 
-		while (po != NULL) {
-			struct sock *sk;
-			if (po->pppoe_dev != dev) {
+		while (po) {
+			while (po && po->pppoe_dev != dev) {
 				po = po->next;
-				continue;
 			}
+
+			if (!po)
+				break;
+
 			sk = sk_pppox(po);
-			spin_lock(&flush_lock);
-			po->pppoe_dev = NULL;
-			spin_unlock(&flush_lock);
-			dev_put(dev);
 
 			/* We always grab the socket lock, followed by the
-			 * hash_lock, in that order.  Since we should
-			 * hold the sock lock while doing any unbinding,
-			 * we need to release the lock we're holding.
-			 * Hold a reference to the sock so it doesn't disappear
-			 * as we're jumping between locks.
+			 * hash_lock, in that order.  Since we should hold the
+			 * sock lock while doing any unbinding, we need to
+			 * release the lock we're holding.  Hold a reference to
+			 * the sock so it doesn't disappear as we're jumping
+			 * between locks.
 			 */
 
 			sock_hold(sk);
-
 			write_unlock_bh(&pn->hash_lock);
 			lock_sock(sk);
 
-			if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+			if (po->pppoe_dev == dev
+			    && sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
 				pppox_unbind_sock(sk);
 				sk->sk_state = PPPOX_ZOMBIE;
 				sk->sk_state_change(sk);
+				po->pppoe_dev = NULL;
+				dev_put(dev);
 			}
 
 			release_sock(sk);
 			sock_put(sk);
 
-			/* Restart scan at the beginning of this hash chain.
-			 * While the lock was dropped the chain contents may
-			 * have changed.
+			/* Restart the process from the start of the current
+			 * hash chain. We dropped locks so the world may have
+			 * change from underneath us.
 			 */
+
+			BUG_ON(pppoe_pernet(dev_net(dev)) == NULL);
 			write_lock_bh(&pn->hash_lock);
 			po = pn->hash_table[i];
 		}
@@ -388,11 +388,16 @@ static int pppoe_rcv_core(struct sock *sk, struct sk_buff *skb)
 	struct pppox_sock *po = pppox_sk(sk);
 	struct pppox_sock *relay_po;
 
+	/* Backlog receive. Semantics of backlog rcv preclude any code from
+	 * executing in lock_sock()/release_sock() bounds; meaning sk->sk_state
+	 * can't change.
+	 */
+
 	if (sk->sk_state & PPPOX_BOUND) {
 		ppp_input(&po->chan, skb);
 	} else if (sk->sk_state & PPPOX_RELAY) {
-		relay_po = get_item_by_addr(dev_net(po->pppoe_dev),
-						&po->pppoe_relay);
+		relay_po = get_item_by_addr(sock_net(sk),
+					    &po->pppoe_relay);
 		if (relay_po == NULL)
 			goto abort_kfree;
 
@@ -447,6 +452,10 @@ static int pppoe_rcv(struct sk_buff *skb, struct net_device *dev,
 		goto drop;
 
 	pn = pppoe_pernet(dev_net(dev));
+
+	/* Note that get_item does a sock_hold(), so sk_pppox(po)
+	 * is known to be safe.
+	 */
 	po = get_item(pn, ph->sid, eth_hdr(skb)->h_source, dev->ifindex);
 	if (!po)
 		goto drop;
@@ -561,6 +570,7 @@ static int pppoe_release(struct socket *sock)
 	struct sock *sk = sock->sk;
 	struct pppox_sock *po;
 	struct pppoe_net *pn;
+	struct net *net = NULL;
 
 	if (!sk)
 		return 0;
@@ -571,44 +581,28 @@ static int pppoe_release(struct socket *sock)
 		return -EBADF;
 	}
 
+	po = pppox_sk(sk);
+
+	if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+		dev_put(po->pppoe_dev);
+		po->pppoe_dev = NULL;
+	}
+
 	pppox_unbind_sock(sk);
 
 	/* Signal the death of the socket. */
 	sk->sk_state = PPPOX_DEAD;
 
-	/*
-	 * pppoe_flush_dev could lead to a race with
-	 * this routine so we use flush_lock to eliminate
-	 * such a case (we only need per-net specific data)
-	 */
-	spin_lock(&flush_lock);
-	po = pppox_sk(sk);
-	if (!po->pppoe_dev) {
-		spin_unlock(&flush_lock);
-		goto out;
-	}
-	pn = pppoe_pernet(dev_net(po->pppoe_dev));
-	spin_unlock(&flush_lock);
+	net = sock_net(sk);
+	pn = pppoe_pernet(net);
 
 	/*
 	 * protect "po" from concurrent updates
 	 * on pppoe_flush_dev
 	 */
-	write_lock_bh(&pn->hash_lock);
+	delete_item(pn, po->pppoe_pa.sid, po->pppoe_pa.remote,
+		    po->pppoe_ifindex);
 
-	po = pppox_sk(sk);
-	if (stage_session(po->pppoe_pa.sid))
-		__delete_item(pn, po->pppoe_pa.sid, po->pppoe_pa.remote,
-				po->pppoe_ifindex);
-
-	if (po->pppoe_dev) {
-		dev_put(po->pppoe_dev);
-		po->pppoe_dev = NULL;
-	}
-
-	write_unlock_bh(&pn->hash_lock);
-
-out:
 	sock_orphan(sk);
 	sock->sk = NULL;
 
@@ -625,8 +619,9 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	struct sock *sk = sock->sk;
 	struct sockaddr_pppox *sp = (struct sockaddr_pppox *)uservaddr;
 	struct pppox_sock *po = pppox_sk(sk);
-	struct net_device *dev;
+	struct net_device *dev = NULL;
 	struct pppoe_net *pn;
+	struct net *net = NULL;
 	int error;
 
 	lock_sock(sk);
@@ -652,12 +647,14 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	/* Delete the old binding */
 	if (stage_session(po->pppoe_pa.sid)) {
 		pppox_unbind_sock(sk);
+		pn = pppoe_pernet(sock_net(sk));
+		delete_item(pn, po->pppoe_pa.sid,
+			    po->pppoe_pa.remote, po->pppoe_ifindex);
 		if (po->pppoe_dev) {
-			pn = pppoe_pernet(dev_net(po->pppoe_dev));
-			delete_item(pn, po->pppoe_pa.sid,
-				po->pppoe_pa.remote, po->pppoe_ifindex);
 			dev_put(po->pppoe_dev);
+			po->pppoe_dev = NULL;
 		}
+
 		memset(sk_pppox(po) + 1, 0,
 		       sizeof(struct pppox_sock) - sizeof(struct sock));
 		sk->sk_state = PPPOX_NONE;
@@ -666,16 +663,15 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	/* Re-bind in session stage only */
 	if (stage_session(sp->sa_addr.pppoe.sid)) {
 		error = -ENODEV;
-		dev = dev_get_by_name(sock_net(sk), sp->sa_addr.pppoe.dev);
+		net = sock_net(sk);
+		dev = dev_get_by_name(net, sp->sa_addr.pppoe.dev);
 		if (!dev)
-			goto end;
+			goto err_put;
 
 		po->pppoe_dev = dev;
 		po->pppoe_ifindex = dev->ifindex;
-		pn = pppoe_pernet(dev_net(dev));
-		write_lock_bh(&pn->hash_lock);
+		pn = pppoe_pernet(net);
 		if (!(dev->flags & IFF_UP)) {
-			write_unlock_bh(&pn->hash_lock);
 			goto err_put;
 		}
 
@@ -683,6 +679,7 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 		       &sp->sa_addr.pppoe,
 		       sizeof(struct pppoe_addr));
 
+		write_lock_bh(&pn->hash_lock);
 		error = __set_item(pn, po);
 		write_unlock_bh(&pn->hash_lock);
 		if (error < 0)
@@ -696,8 +693,11 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 		po->chan.ops = &pppoe_chan_ops;
 
 		error = ppp_register_net_channel(dev_net(dev), &po->chan);
-		if (error)
+		if (error) {
+			delete_item(pn, po->pppoe_pa.sid,
+				    po->pppoe_pa.remote, po->pppoe_ifindex);
 			goto err_put;
+		}
 
 		sk->sk_state = PPPOX_CONNECTED;
 	}
@@ -915,6 +915,14 @@ static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
 	struct pppoe_hdr *ph;
 	int data_len = skb->len;
 
+	/* The higher-level PPP code (ppp_unregister_channel()) ensures the PPP
+	 * xmit operations conclude prior to an unregistration call.  Thus
+	 * sk->sk_state cannot change, so we don't need to do lock_sock().
+	 * But, we also can't do a lock_sock since that introduces a potential
+	 * deadlock as we'd reverse the lock ordering used when calling
+	 * ppp_unregister_channel().
+	 */
+
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED))
 		goto abort;
 
@@ -944,7 +952,6 @@ static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
 			po->pppoe_pa.remote, NULL, data_len);
 
 	dev_queue_xmit(skb);
-
 	return 1;
 
 abort:

^ permalink raw reply related

* wanPMC-CxT1E1
From: Bob Beers @ 2009-10-26 20:22 UTC (permalink / raw)
  To: netdev; +Cc: Greg KH

Hello List,

I am using subject named card[1] and it has a GPL driver that is
 out-of-tree.  I have permission from the manufacturer to post and
 discuss it, so I'd like to get started.  Where/how shall I post the code,
 and my first patch which allows it to compile and run with the
 kernel of RHEL/CentOS 5?  The tarball includes driver source and
 a utility pmcc4cfg, and is 174 093 bytes.

Thanks,

-Bob Beers

[1] <http://www.onestopsystems.com/communication_wan_e.php>

^ permalink raw reply

* Re: wanPMC-CxT1E1
From: Greg KH @ 2009-10-26 20:41 UTC (permalink / raw)
  To: Bob Beers; +Cc: netdev
In-Reply-To: <4f6ba3b0910261322j273c977fm356506c46f095832@mail.gmail.com>

On Mon, Oct 26, 2009 at 04:22:08PM -0400, Bob Beers wrote:
> Hello List,
> 
> I am using subject named card[1] and it has a GPL driver that is
>  out-of-tree.  I have permission from the manufacturer to post and
>  discuss it, so I'd like to get started.  Where/how shall I post the code,
>  and my first patch which allows it to compile and run with the
>  kernel of RHEL/CentOS 5?  The tarball includes driver source and
>  a utility pmcc4cfg, and is 174 093 bytes.

Getting it to build on 2.6.31 is more important than RHEL5, we can't do
anything with an old kernel like that.

Do you have a pointer to where we can download the tarball?  I'll be
glad to add it to the staging tree so it can be cleaned up there before
being submitted to the main portion of the kernel tree.

thanks,

greg k-h

^ permalink raw reply

* Re: wanPMC-CxT1E1
From: Bob Beers @ 2009-10-26 20:57 UTC (permalink / raw)
  To: Greg KH; +Cc: netdev
In-Reply-To: <20091026204144.GA28436@kroah.com>

On Mon, Oct 26, 2009 at 4:41 PM, Greg KH <greg@kroah.com> wrote:
> On Mon, Oct 26, 2009 at 04:22:08PM -0400, Bob Beers wrote:
>> Hello List,
>>
>> I am using subject named card[1] and it has a GPL driver that is
>>  out-of-tree.  I have permission from the manufacturer to post and
>>  discuss it, so I'd like to get started.  Where/how shall I post the code,
>>  and my first patch which allows it to compile and run with the
>>  kernel of RHEL/CentOS 5?  The tarball includes driver source and
>>  a utility pmcc4cfg, and is 174 093 bytes.
>
> Getting it to build on 2.6.31 is more important than RHEL5, we can't do
> anything with an old kernel like that.
>

I know, but I'll need to back port it as soon as possible.

> Do you have a pointer to where we can download the tarball?  I'll be
> glad to add it to the staging tree so it can be cleaned up there before
> being submitted to the main portion of the kernel tree.
>

<http://rapidshare.com/files/298314343/wanPMC_CxT1E1_Linux_3_1B.tgz.html>
MD5: 563FC98A1A0EA625F72C45B89E5C726B

> thanks,
>
> greg k-h
>

Thanks,

-Bob Beers

^ permalink raw reply

* r8169: Fix card drop incoming VLAN tagged MTU byte large jumbo frames
From: Raimonds Cicans @ 2009-10-26 20:52 UTC (permalink / raw)
  To: romieu; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 0 bytes --]



[-- Attachment #2: r8169-vlan-jumbo.patch --]
[-- Type: text/plain, Size: 783 bytes --]

r8169 card drop incoming VLAN tagged MTU byte large jumbo frames

It looks to compare current and maximal packet sizes hardware use
'<' operator, not '<='.

Bug introduced by patch:
r8169: fix crash when large packets are received

Signed-off-by: Raimonds Cicans <ray@apollo.lv>

---

--- linux-2.6.31/drivers/net/r8169.c.orig	2009-10-26 20:57:47.256658618 +0200
+++ linux-2.6.31/drivers/net/r8169.c	2009-10-26 19:48:25.807252812 +0200
@@ -2365,7 +2365,7 @@ static u16 rtl_rw_cpluscmd(void __iomem 
 static void rtl_set_rx_max_size(void __iomem *ioaddr, unsigned int rx_buf_sz)
 {
 	/* Low hurts. Let's disable the filtering. */
-	RTL_W16(RxMaxSize, rx_buf_sz);
+	RTL_W16(RxMaxSize, rx_buf_sz + 1);
 }
 
 static void rtl8169_set_magic_reg(void __iomem *ioaddr, unsigned mac_version)

^ permalink raw reply

* Re: wanPMC-CxT1E1
From: Greg KH @ 2009-10-26 21:20 UTC (permalink / raw)
  To: Bob Beers; +Cc: netdev
In-Reply-To: <4f6ba3b0910261357x2e5bdf5byb03b1debd177ec86@mail.gmail.com>

On Mon, Oct 26, 2009 at 04:57:35PM -0400, Bob Beers wrote:
> On Mon, Oct 26, 2009 at 4:41 PM, Greg KH <greg@kroah.com> wrote:
> > On Mon, Oct 26, 2009 at 04:22:08PM -0400, Bob Beers wrote:
> >> Hello List,
> >>
> >> I am using subject named card[1] and it has a GPL driver that is
> >>  out-of-tree.  I have permission from the manufacturer to post and
> >>  discuss it, so I'd like to get started.  Where/how shall I post the code,
> >>  and my first patch which allows it to compile and run with the
> >>  kernel of RHEL/CentOS 5?  The tarball includes driver source and
> >>  a utility pmcc4cfg, and is 174 093 bytes.
> >
> > Getting it to build on 2.6.31 is more important than RHEL5, we can't do
> > anything with an old kernel like that.
> >
> 
> I know, but I'll need to back port it as soon as possible.

Sure, but that's nothing we can do, nor are able to even care about :)

> > Do you have a pointer to where we can download the tarball?  I'll be
> > glad to add it to the staging tree so it can be cleaned up there before
> > being submitted to the main portion of the kernel tree.
> >
> 
> <http://rapidshare.com/files/298314343/wanPMC_CxT1E1_Linux_3_1B.tgz.html>
> MD5: 563FC98A1A0EA625F72C45B89E5C726B

Care to put it somewhere that will actually work?  That site refuses to
give it to me.

Heck, if it's only 174kb, can you just send it in email to me?

thanks,

greg k-h

^ permalink raw reply

* [net-2.6 PATCH 1/5] e1000e: clear PHY wakeup bit after LCD reset on 82577/82578
From: Jeff Kirsher @ 2009-10-26 21:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher

From: Bruce Allan <bruce.w.allan@intel.com>

Performing a dummy read of the PHY Wakeup Control (WUC) register clears the
wakeup enable bit set by an PHY reset.  If this bit remains set, link
problems may occur.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/ich8lan.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index aa0ab0e..ead6651 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -844,7 +844,7 @@ static s32 e1000_phy_hw_reset_ich8lan(struct e1000_hw *hw)
 	u32 i;
 	u32 data, cnf_size, cnf_base_addr, sw_cfg_mask;
 	s32 ret_val;
-	u16 word_addr, reg_data, reg_addr, phy_page = 0;
+	u16 reg, word_addr, reg_data, reg_addr, phy_page = 0;
 
 	ret_val = e1000e_phy_hw_reset_generic(hw);
 	if (ret_val)
@@ -859,6 +859,10 @@ static s32 e1000_phy_hw_reset_ich8lan(struct e1000_hw *hw)
 			return ret_val;
 	}
 
+	/* Dummy read to clear the phy wakeup bit after lcd reset */
+	if (hw->mac.type == e1000_pchlan)
+		e1e_rphy(hw, BM_WUC, &reg);
+
 	/*
 	 * Initialize the PHY from the NVM on ICH platforms.  This
 	 * is needed due to an issue where the NVM configuration is
@@ -2229,6 +2233,7 @@ static s32 e1000_get_bus_info_ich8lan(struct e1000_hw *hw)
  **/
 static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw)
 {
+	u16 reg;
 	u32 ctrl, icr, kab;
 	s32 ret_val;
 
@@ -2304,6 +2309,9 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw)
 			hw_dbg(hw, "Auto Read Done did not complete\n");
 		}
 	}
+	/* Dummy read to clear the phy wakeup bit after lcd reset */
+	if (hw->mac.type == e1000_pchlan)
+		e1e_rphy(hw, BM_WUC, &reg);
 
 	/*
 	 * For PCH, this write will make sure that any noise


^ permalink raw reply related

* [net-2.6 PATCH 2/5] e1000e: increase swflag acquisition timeout for ICHx/PCH
From: Jeff Kirsher @ 2009-10-26 21:23 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher
In-Reply-To: <20091026212242.9682.25442.stgit@localhost.localdomain>

From: Bruce Allan <bruce.w.allan@intel.com>

In some conditions (e.g. when AMT is enabled on the system), it is possible
to take an extended period of time to for the driver to acquire the sw/fw/hw
hardware semaphore used to protect against concurrent access of a shared
resource (e.g. PHY registers).  This could cause PHY registers to not get
configured properly resulting in link issues.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/ich8lan.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index ead6651..fb2222d 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -122,6 +122,8 @@
 
 #define HV_LED_CONFIG		PHY_REG(768, 30) /* LED Configuration */
 
+#define SW_FLAG_TIMEOUT    1000 /* SW Semaphore flag timeout in milliseconds */
+
 /* ICH GbE Flash Hardware Sequencing Flash Status Register bit breakdown */
 /* Offset 04h HSFSTS */
 union ich8_hws_flash_status {
@@ -599,7 +601,7 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
 		goto out;
 	}
 
-	timeout = PHY_CFG_TIMEOUT * 2;
+	timeout = SW_FLAG_TIMEOUT;
 
 	extcnf_ctrl |= E1000_EXTCNF_CTRL_SWFLAG;
 	ew32(EXTCNF_CTRL, extcnf_ctrl);


^ permalink raw reply related

* [net-2.6 PATCH 3/5] e1000e: 82577/82578 requires a different method to configure LPLU
From: Jeff Kirsher @ 2009-10-26 21:23 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher
In-Reply-To: <20091026212242.9682.25442.stgit@localhost.localdomain>

From: Bruce Allan <bruce.w.allan@intel.com>

Unlike previous ICHx-based parts, the PCH-based parts (82577/82578) require
LPLU (Low Power Link Up, or "reverse auto-negotiation") to be configured in
the PHY rather than the MAC.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/ich8lan.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index fb2222d..2451dc8 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -124,6 +124,11 @@
 
 #define SW_FLAG_TIMEOUT    1000 /* SW Semaphore flag timeout in milliseconds */
 
+/* OEM Bits Phy Register */
+#define HV_OEM_BITS            PHY_REG(768, 25)
+#define HV_OEM_BITS_LPLU       0x0004 /* Low Power Link Up */
+#define HV_OEM_BITS_RESTART_AN 0x0400 /* Restart Auto-negotiation */
+
 /* ICH GbE Flash Hardware Sequencing Flash Status Register bit breakdown */
 /* Offset 04h HSFSTS */
 union ich8_hws_flash_status {
@@ -202,6 +207,7 @@ static s32 e1000_setup_led_pchlan(struct e1000_hw *hw);
 static s32 e1000_cleanup_led_pchlan(struct e1000_hw *hw);
 static s32 e1000_led_on_pchlan(struct e1000_hw *hw);
 static s32 e1000_led_off_pchlan(struct e1000_hw *hw);
+static s32 e1000_set_lplu_state_pchlan(struct e1000_hw *hw, bool active);
 
 static inline u16 __er16flash(struct e1000_hw *hw, unsigned long reg)
 {
@@ -244,6 +250,8 @@ static s32 e1000_init_phy_params_pchlan(struct e1000_hw *hw)
 
 	phy->ops.check_polarity       = e1000_check_polarity_ife_ich8lan;
 	phy->ops.read_phy_reg         = e1000_read_phy_reg_hv;
+	phy->ops.set_d0_lplu_state    = e1000_set_lplu_state_pchlan;
+	phy->ops.set_d3_lplu_state    = e1000_set_lplu_state_pchlan;
 	phy->ops.write_phy_reg        = e1000_write_phy_reg_hv;
 	phy->autoneg_mask             = AUTONEG_ADVERTISE_SPEED_DEFAULT;
 
@@ -1060,6 +1068,38 @@ static s32 e1000_check_polarity_ife_ich8lan(struct e1000_hw *hw)
 }
 
 /**
+ *  e1000_set_lplu_state_pchlan - Set Low Power Link Up state
+ *  @hw: pointer to the HW structure
+ *  @active: true to enable LPLU, false to disable
+ *
+ *  Sets the LPLU state according to the active flag.  For PCH, if OEM write
+ *  bit are disabled in the NVM, writing the LPLU bits in the MAC will not set
+ *  the phy speed. This function will manually set the LPLU bit and restart
+ *  auto-neg as hw would do. D3 and D0 LPLU will call the same function
+ *  since it configures the same bit.
+ **/
+static s32 e1000_set_lplu_state_pchlan(struct e1000_hw *hw, bool active)
+{
+	s32 ret_val = 0;
+	u16 oem_reg;
+
+	ret_val = e1e_rphy(hw, HV_OEM_BITS, &oem_reg);
+	if (ret_val)
+		goto out;
+
+	if (active)
+		oem_reg |= HV_OEM_BITS_LPLU;
+	else
+		oem_reg &= ~HV_OEM_BITS_LPLU;
+
+	oem_reg |= HV_OEM_BITS_RESTART_AN;
+	ret_val = e1e_wphy(hw, HV_OEM_BITS, oem_reg);
+
+out:
+	return ret_val;
+}
+
+/**
  *  e1000_set_d0_lplu_state_ich8lan - Set Low Power Linkup D0 state
  *  @hw: pointer to the HW structure
  *  @active: TRUE to enable LPLU, FALSE to disable


^ permalink raw reply related

* [net-2.6 PATCH 4/5] e1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCH
From: Jeff Kirsher @ 2009-10-26 21:23 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher
In-Reply-To: <20091026212242.9682.25442.stgit@localhost.localdomain>

From: Bruce Allan <bruce.w.allan@intel.com>

Accesses to NVM and PHY/CSR registers on ICHx/PCH-based parts are protected
from concurrent accesses with a mutex that is acquired when the access is
initiated and released when the access has completed.  However, the two
types of accesses should not be protected by the same mutex because the
driver may have to access the NVM while already holding the mutex over
several consecutive PHY/CSR accesses which would result in livelock.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/ich8lan.c |   89 +++++++++++++++++++++++++++---------------
 1 files changed, 58 insertions(+), 31 deletions(-)

diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index 2451dc8..aaaaf2c 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -578,12 +578,39 @@ static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
 static DEFINE_MUTEX(nvm_mutex);
 
 /**
+ *  e1000_acquire_nvm_ich8lan - Acquire NVM mutex
+ *  @hw: pointer to the HW structure
+ *
+ *  Acquires the mutex for performing NVM operations.
+ **/
+static s32 e1000_acquire_nvm_ich8lan(struct e1000_hw *hw)
+{
+	mutex_lock(&nvm_mutex);
+
+	return 0;
+}
+
+/**
+ *  e1000_release_nvm_ich8lan - Release NVM mutex
+ *  @hw: pointer to the HW structure
+ *
+ *  Releases the mutex used while performing NVM operations.
+ **/
+static void e1000_release_nvm_ich8lan(struct e1000_hw *hw)
+{
+	mutex_unlock(&nvm_mutex);
+
+	return;
+}
+
+static DEFINE_MUTEX(swflag_mutex);
+
+/**
  *  e1000_acquire_swflag_ich8lan - Acquire software control flag
  *  @hw: pointer to the HW structure
  *
- *  Acquires the software control flag for performing NVM and PHY
- *  operations.  This is a function pointer entry point only called by
- *  read/write routines for the PHY and NVM parts.
+ *  Acquires the software control flag for performing PHY and select
+ *  MAC CSR accesses.
  **/
 static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
 {
@@ -592,7 +619,7 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
 
 	might_sleep();
 
-	mutex_lock(&nvm_mutex);
+	mutex_lock(&swflag_mutex);
 
 	while (timeout) {
 		extcnf_ctrl = er32(EXTCNF_CTRL);
@@ -633,7 +660,7 @@ static s32 e1000_acquire_swflag_ich8lan(struct e1000_hw *hw)
 
 out:
 	if (ret_val)
-		mutex_unlock(&nvm_mutex);
+		mutex_unlock(&swflag_mutex);
 
 	return ret_val;
 }
@@ -642,9 +669,8 @@ out:
  *  e1000_release_swflag_ich8lan - Release software control flag
  *  @hw: pointer to the HW structure
  *
- *  Releases the software control flag for performing NVM and PHY operations.
- *  This is a function pointer entry point only called by read/write
- *  routines for the PHY and NVM parts.
+ *  Releases the software control flag for performing PHY and select
+ *  MAC CSR accesses.
  **/
 static void e1000_release_swflag_ich8lan(struct e1000_hw *hw)
 {
@@ -654,7 +680,9 @@ static void e1000_release_swflag_ich8lan(struct e1000_hw *hw)
 	extcnf_ctrl &= ~E1000_EXTCNF_CTRL_SWFLAG;
 	ew32(EXTCNF_CTRL, extcnf_ctrl);
 
-	mutex_unlock(&nvm_mutex);
+	mutex_unlock(&swflag_mutex);
+
+	return;
 }
 
 /**
@@ -1360,12 +1388,11 @@ static s32 e1000_read_nvm_ich8lan(struct e1000_hw *hw, u16 offset, u16 words,
 	if ((offset >= nvm->word_size) || (words > nvm->word_size - offset) ||
 	    (words == 0)) {
 		hw_dbg(hw, "nvm parameter(s) out of bounds\n");
-		return -E1000_ERR_NVM;
+		ret_val = -E1000_ERR_NVM;
+		goto out;
 	}
 
-	ret_val = e1000_acquire_swflag_ich8lan(hw);
-	if (ret_val)
-		goto out;
+	nvm->ops.acquire_nvm(hw);
 
 	ret_val = e1000_valid_nvm_bank_detect_ich8lan(hw, &bank);
 	if (ret_val) {
@@ -1391,7 +1418,7 @@ static s32 e1000_read_nvm_ich8lan(struct e1000_hw *hw, u16 offset, u16 words,
 		}
 	}
 
-	e1000_release_swflag_ich8lan(hw);
+	nvm->ops.release_nvm(hw);
 
 out:
 	if (ret_val)
@@ -1649,11 +1676,15 @@ static s32 e1000_write_nvm_ich8lan(struct e1000_hw *hw, u16 offset, u16 words,
 		return -E1000_ERR_NVM;
 	}
 
+	nvm->ops.acquire_nvm(hw);
+
 	for (i = 0; i < words; i++) {
 		dev_spec->shadow_ram[offset+i].modified = 1;
 		dev_spec->shadow_ram[offset+i].value = data[i];
 	}
 
+	nvm->ops.release_nvm(hw);
+
 	return 0;
 }
 
@@ -1683,9 +1714,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 	if (nvm->type != e1000_nvm_flash_sw)
 		goto out;
 
-	ret_val = e1000_acquire_swflag_ich8lan(hw);
-	if (ret_val)
-		goto out;
+	nvm->ops.acquire_nvm(hw);
 
 	/*
 	 * We're writing to the opposite bank so if we're on bank 1,
@@ -1703,7 +1732,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 		old_bank_offset = 0;
 		ret_val = e1000_erase_flash_bank_ich8lan(hw, 1);
 		if (ret_val) {
-			e1000_release_swflag_ich8lan(hw);
+			nvm->ops.release_nvm(hw);
 			goto out;
 		}
 	} else {
@@ -1711,7 +1740,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 		new_bank_offset = 0;
 		ret_val = e1000_erase_flash_bank_ich8lan(hw, 0);
 		if (ret_val) {
-			e1000_release_swflag_ich8lan(hw);
+			nvm->ops.release_nvm(hw);
 			goto out;
 		}
 	}
@@ -1769,7 +1798,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 	if (ret_val) {
 		/* Possibly read-only, see e1000e_write_protect_nvm_ich8lan() */
 		hw_dbg(hw, "Flash commit failed.\n");
-		e1000_release_swflag_ich8lan(hw);
+		nvm->ops.release_nvm(hw);
 		goto out;
 	}
 
@@ -1782,7 +1811,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 	act_offset = new_bank_offset + E1000_ICH_NVM_SIG_WORD;
 	ret_val = e1000_read_flash_word_ich8lan(hw, act_offset, &data);
 	if (ret_val) {
-		e1000_release_swflag_ich8lan(hw);
+		nvm->ops.release_nvm(hw);
 		goto out;
 	}
 	data &= 0xBFFF;
@@ -1790,7 +1819,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 						       act_offset * 2 + 1,
 						       (u8)(data >> 8));
 	if (ret_val) {
-		e1000_release_swflag_ich8lan(hw);
+		nvm->ops.release_nvm(hw);
 		goto out;
 	}
 
@@ -1803,7 +1832,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 	act_offset = (old_bank_offset + E1000_ICH_NVM_SIG_WORD) * 2 + 1;
 	ret_val = e1000_retry_write_flash_byte_ich8lan(hw, act_offset, 0);
 	if (ret_val) {
-		e1000_release_swflag_ich8lan(hw);
+		nvm->ops.release_nvm(hw);
 		goto out;
 	}
 
@@ -1813,7 +1842,7 @@ static s32 e1000_update_nvm_checksum_ich8lan(struct e1000_hw *hw)
 		dev_spec->shadow_ram[i].value = 0xFFFF;
 	}
 
-	e1000_release_swflag_ich8lan(hw);
+	nvm->ops.release_nvm(hw);
 
 	/*
 	 * Reload the EEPROM, or else modifications will not appear
@@ -1877,14 +1906,12 @@ static s32 e1000_validate_nvm_checksum_ich8lan(struct e1000_hw *hw)
  **/
 void e1000e_write_protect_nvm_ich8lan(struct e1000_hw *hw)
 {
+	struct e1000_nvm_info *nvm = &hw->nvm;
 	union ich8_flash_protected_range pr0;
 	union ich8_hws_flash_status hsfsts;
 	u32 gfpreg;
-	s32 ret_val;
 
-	ret_val = e1000_acquire_swflag_ich8lan(hw);
-	if (ret_val)
-		return;
+	nvm->ops.acquire_nvm(hw);
 
 	gfpreg = er32flash(ICH_FLASH_GFPREG);
 
@@ -1905,7 +1932,7 @@ void e1000e_write_protect_nvm_ich8lan(struct e1000_hw *hw)
 	hsfsts.hsf_status.flockdn = true;
 	ew32flash(ICH_FLASH_HSFSTS, hsfsts.regval);
 
-	e1000_release_swflag_ich8lan(hw);
+	nvm->ops.release_nvm(hw);
 }
 
 /**
@@ -3162,9 +3189,9 @@ static struct e1000_phy_operations ich8_phy_ops = {
 };
 
 static struct e1000_nvm_operations ich8_nvm_ops = {
-	.acquire_nvm		= e1000_acquire_swflag_ich8lan,
+	.acquire_nvm		= e1000_acquire_nvm_ich8lan,
 	.read_nvm	 	= e1000_read_nvm_ich8lan,
-	.release_nvm		= e1000_release_swflag_ich8lan,
+	.release_nvm		= e1000_release_nvm_ich8lan,
 	.update_nvm		= e1000_update_nvm_checksum_ich8lan,
 	.valid_led_default	= e1000_valid_led_default_ich8lan,
 	.validate_nvm		= e1000_validate_nvm_checksum_ich8lan,


^ permalink raw reply related

* [net-2.6 PATCH 5/5] e1000e: allow for swflag to be held over consecutive PHY accesses
From: Jeff Kirsher @ 2009-10-26 21:24 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher
In-Reply-To: <20091026212242.9682.25442.stgit@localhost.localdomain>

From: Bruce Allan <bruce.w.allan@intel.com>

PCH-based parts (82577/82578) and some ICH8-based parts (82566) need to
hold the swflag (sw/fw/hw hardware semaphore) over consecutive PHY accesses
in order to perform sw-driven PHY configuration during initialization to
workaround known hardware issues (see follow-on patch).  This patch
provides new PHY read/write functions (and function pointers) that will
allow accessing the PHY registers assuming the swflag has already been
acquired.  The actual PHY register access code has moved into helper
functions that are called with a flag indicating whether or not the swflag
has already been acquired and acquires/releases it if not.

The functions called from within the updated PHY access functions had to be
updated to assume the swflag was already acquired, and other functions that
called those functions were also updated to acquire/release the swflag.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000e/e1000.h   |   12 +
 drivers/net/e1000e/hw.h      |    2 
 drivers/net/e1000e/ich8lan.c |    4 
 drivers/net/e1000e/phy.c     |  469 +++++++++++++++++++++++++++++-------------
 4 files changed, 346 insertions(+), 141 deletions(-)

diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
index 981936c..405a144 100644
--- a/drivers/net/e1000e/e1000.h
+++ b/drivers/net/e1000e/e1000.h
@@ -519,9 +519,13 @@ extern s32 e1000e_phy_force_speed_duplex_igp(struct e1000_hw *hw);
 extern s32 e1000e_get_cable_length_igp_2(struct e1000_hw *hw);
 extern s32 e1000e_get_phy_info_igp(struct e1000_hw *hw);
 extern s32 e1000e_read_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 *data);
+extern s32 e1000e_read_phy_reg_igp_locked(struct e1000_hw *hw, u32 offset,
+                                          u16 *data);
 extern s32 e1000e_phy_hw_reset_generic(struct e1000_hw *hw);
 extern s32 e1000e_set_d3_lplu_state(struct e1000_hw *hw, bool active);
 extern s32 e1000e_write_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 data);
+extern s32 e1000e_write_phy_reg_igp_locked(struct e1000_hw *hw, u32 offset,
+                                           u16 data);
 extern s32 e1000e_phy_sw_reset(struct e1000_hw *hw);
 extern s32 e1000e_phy_force_speed_duplex_m88(struct e1000_hw *hw);
 extern s32 e1000e_get_cfg_done(struct e1000_hw *hw);
@@ -538,7 +542,11 @@ extern s32 e1000e_read_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 *data);
 extern s32 e1000e_write_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 data);
 extern void e1000e_phy_force_speed_duplex_setup(struct e1000_hw *hw, u16 *phy_ctrl);
 extern s32 e1000e_write_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 data);
+extern s32 e1000e_write_kmrn_reg_locked(struct e1000_hw *hw, u32 offset,
+                                        u16 data);
 extern s32 e1000e_read_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 *data);
+extern s32 e1000e_read_kmrn_reg_locked(struct e1000_hw *hw, u32 offset,
+                                       u16 *data);
 extern s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 iterations,
 			       u32 usec_interval, bool *success);
 extern s32 e1000e_phy_reset_dsp(struct e1000_hw *hw);
@@ -546,7 +554,11 @@ extern s32 e1000e_read_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 *data);
 extern s32 e1000e_write_phy_reg_mdic(struct e1000_hw *hw, u32 offset, u16 data);
 extern s32 e1000e_check_downshift(struct e1000_hw *hw);
 extern s32 e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data);
+extern s32 e1000_read_phy_reg_hv_locked(struct e1000_hw *hw, u32 offset,
+                                        u16 *data);
 extern s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data);
+extern s32 e1000_write_phy_reg_hv_locked(struct e1000_hw *hw, u32 offset,
+                                         u16 data);
 extern s32 e1000_set_mdio_slow_mode_hv(struct e1000_hw *hw, bool slow);
 extern s32 e1000_link_stall_workaround_hv(struct e1000_hw *hw);
 extern s32 e1000_copper_link_setup_82577(struct e1000_hw *hw);
diff --git a/drivers/net/e1000e/hw.h b/drivers/net/e1000e/hw.h
index fd44d9f..7b05cf4 100644
--- a/drivers/net/e1000e/hw.h
+++ b/drivers/net/e1000e/hw.h
@@ -764,11 +764,13 @@ struct e1000_phy_operations {
 	s32  (*get_cable_length)(struct e1000_hw *);
 	s32  (*get_phy_info)(struct e1000_hw *);
 	s32  (*read_phy_reg)(struct e1000_hw *, u32, u16 *);
+	s32  (*read_phy_reg_locked)(struct e1000_hw *, u32, u16 *);
 	void (*release_phy)(struct e1000_hw *);
 	s32  (*reset_phy)(struct e1000_hw *);
 	s32  (*set_d0_lplu_state)(struct e1000_hw *, bool);
 	s32  (*set_d3_lplu_state)(struct e1000_hw *, bool);
 	s32  (*write_phy_reg)(struct e1000_hw *, u32, u16);
+	s32  (*write_phy_reg_locked)(struct e1000_hw *, u32, u16);
 	s32  (*cfg_on_link_up)(struct e1000_hw *);
 };
 
diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c
index aaaaf2c..b6388b9 100644
--- a/drivers/net/e1000e/ich8lan.c
+++ b/drivers/net/e1000e/ich8lan.c
@@ -250,9 +250,11 @@ static s32 e1000_init_phy_params_pchlan(struct e1000_hw *hw)
 
 	phy->ops.check_polarity       = e1000_check_polarity_ife_ich8lan;
 	phy->ops.read_phy_reg         = e1000_read_phy_reg_hv;
+	phy->ops.read_phy_reg_locked  = e1000_read_phy_reg_hv_locked;
 	phy->ops.set_d0_lplu_state    = e1000_set_lplu_state_pchlan;
 	phy->ops.set_d3_lplu_state    = e1000_set_lplu_state_pchlan;
 	phy->ops.write_phy_reg        = e1000_write_phy_reg_hv;
+	phy->ops.write_phy_reg_locked = e1000_write_phy_reg_hv_locked;
 	phy->autoneg_mask             = AUTONEG_ADVERTISE_SPEED_DEFAULT;
 
 	phy->id = e1000_phy_unknown;
@@ -313,6 +315,8 @@ static s32 e1000_init_phy_params_ich8lan(struct e1000_hw *hw)
 	case IGP03E1000_E_PHY_ID:
 		phy->type = e1000_phy_igp_3;
 		phy->autoneg_mask = AUTONEG_ADVERTISE_SPEED_DEFAULT;
+		phy->ops.read_phy_reg_locked = e1000e_read_phy_reg_igp_locked;
+		phy->ops.write_phy_reg_locked = e1000e_write_phy_reg_igp_locked;
 		break;
 	case IFE_E_PHY_ID:
 	case IFE_PLUS_E_PHY_ID:
diff --git a/drivers/net/e1000e/phy.c b/drivers/net/e1000e/phy.c
index 994401f..f9d33ab 100644
--- a/drivers/net/e1000e/phy.c
+++ b/drivers/net/e1000e/phy.c
@@ -164,16 +164,25 @@ s32 e1000e_get_phy_id(struct e1000_hw *hw)
 		 * MDIC mode. No harm in trying again in this case since
 		 * the PHY ID is unknown at this point anyway
 		 */
+		ret_val = phy->ops.acquire_phy(hw);
+		if (ret_val)
+			goto out;
 		ret_val = e1000_set_mdio_slow_mode_hv(hw, true);
 		if (ret_val)
 			goto out;
+		phy->ops.release_phy(hw);
 
 		retry_count++;
 	}
 out:
 	/* Revert to MDIO fast mode, if applicable */
-	if (retry_count)
+	if (retry_count) {
+		ret_val = phy->ops.acquire_phy(hw);
+		if (ret_val)
+			return ret_val;
 		ret_val = e1000_set_mdio_slow_mode_hv(hw, false);
+		phy->ops.release_phy(hw);
+	}
 
 	return ret_val;
 }
@@ -354,94 +363,173 @@ s32 e1000e_write_phy_reg_m88(struct e1000_hw *hw, u32 offset, u16 data)
 }
 
 /**
- *  e1000e_read_phy_reg_igp - Read igp PHY register
+ *  __e1000e_read_phy_reg_igp - Read igp PHY register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to be read
  *  @data: pointer to the read data
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary, then reads the PHY register at offset
- *  and storing the retrieved information in data.  Release any acquired
+ *  and stores the retrieved information in data.  Release any acquired
  *  semaphores before exiting.
  **/
-s32 e1000e_read_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 *data)
+static s32 __e1000e_read_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 *data,
+                                    bool locked)
 {
-	s32 ret_val;
+	s32 ret_val = 0;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
+	if (!locked) {
+		if (!(hw->phy.ops.acquire_phy))
+			goto out;
+
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			goto out;
+	}
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
 		ret_val = e1000e_write_phy_reg_mdic(hw,
 						    IGP01E1000_PHY_PAGE_SELECT,
 						    (u16)offset);
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
-			return ret_val;
-		}
+		if (ret_val)
+			goto release;
 	}
 
 	ret_val = e1000e_read_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
-					   data);
-
-	hw->phy.ops.release_phy(hw);
+	                                  data);
 
+release:
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
+out:
 	return ret_val;
 }
 
 /**
+ *  e1000e_read_phy_reg_igp - Read igp PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Acquires semaphore then reads the PHY register at offset and stores the
+ *  retrieved information in data.
+ *  Release the acquired semaphore before exiting.
+ **/
+s32 e1000e_read_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000e_read_phy_reg_igp(hw, offset, data, false);
+}
+
+/**
+ *  e1000e_read_phy_reg_igp_locked - Read igp PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Reads the PHY register at offset and stores the retrieved information
+ *  in data.  Assumes semaphore already acquired.
+ **/
+s32 e1000e_read_phy_reg_igp_locked(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000e_read_phy_reg_igp(hw, offset, data, true);
+}
+
+/**
  *  e1000e_write_phy_reg_igp - Write igp PHY register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to write to
  *  @data: data to write at register offset
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary, then writes the data to PHY register
  *  at the offset.  Release any acquired semaphores before exiting.
  **/
-s32 e1000e_write_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 data)
+static s32 __e1000e_write_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 data,
+                                     bool locked)
 {
-	s32 ret_val;
+	s32 ret_val = 0;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
+	if (!locked) {
+		if (!(hw->phy.ops.acquire_phy))
+			goto out;
+
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			goto out;
+	}
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
 		ret_val = e1000e_write_phy_reg_mdic(hw,
 						    IGP01E1000_PHY_PAGE_SELECT,
 						    (u16)offset);
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
-			return ret_val;
-		}
+		if (ret_val)
+			goto release;
 	}
 
 	ret_val = e1000e_write_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
 					    data);
 
-	hw->phy.ops.release_phy(hw);
+release:
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
 
+out:
 	return ret_val;
 }
 
 /**
- *  e1000e_read_kmrn_reg - Read kumeran register
+ *  e1000e_write_phy_reg_igp - Write igp PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Acquires semaphore then writes the data to PHY register
+ *  at the offset.  Release any acquired semaphores before exiting.
+ **/
+s32 e1000e_write_phy_reg_igp(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000e_write_phy_reg_igp(hw, offset, data, false);
+}
+
+/**
+ *  e1000e_write_phy_reg_igp_locked - Write igp PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Writes the data to PHY register at the offset.
+ *  Assumes semaphore already acquired.
+ **/
+s32 e1000e_write_phy_reg_igp_locked(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000e_write_phy_reg_igp(hw, offset, data, true);
+}
+
+/**
+ *  __e1000_read_kmrn_reg - Read kumeran register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to be read
  *  @data: pointer to the read data
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary.  Then reads the PHY register at offset
  *  using the kumeran interface.  The information retrieved is stored in data.
  *  Release any acquired semaphores before exiting.
  **/
-s32 e1000e_read_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 *data)
+static s32 __e1000_read_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 *data,
+                                 bool locked)
 {
 	u32 kmrnctrlsta;
-	s32 ret_val;
+	s32 ret_val = 0;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
+	if (!locked) {
+		if (!(hw->phy.ops.acquire_phy))
+			goto out;
+
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			goto out;
+	}
 
 	kmrnctrlsta = ((offset << E1000_KMRNCTRLSTA_OFFSET_SHIFT) &
 		       E1000_KMRNCTRLSTA_OFFSET) | E1000_KMRNCTRLSTA_REN;
@@ -452,41 +540,111 @@ s32 e1000e_read_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 *data)
 	kmrnctrlsta = er32(KMRNCTRLSTA);
 	*data = (u16)kmrnctrlsta;
 
-	hw->phy.ops.release_phy(hw);
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
 
+out:
 	return ret_val;
 }
 
 /**
- *  e1000e_write_kmrn_reg - Write kumeran register
+ *  e1000e_read_kmrn_reg -  Read kumeran register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Acquires semaphore then reads the PHY register at offset using the
+ *  kumeran interface.  The information retrieved is stored in data.
+ *  Release the acquired semaphore before exiting.
+ **/
+s32 e1000e_read_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000_read_kmrn_reg(hw, offset, data, false);
+}
+
+/**
+ *  e1000_read_kmrn_reg_locked -  Read kumeran register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Reads the PHY register at offset using the kumeran interface.  The
+ *  information retrieved is stored in data.
+ *  Assumes semaphore already acquired.
+ **/
+s32 e1000_read_kmrn_reg_locked(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000_read_kmrn_reg(hw, offset, data, true);
+}
+
+/**
+ *  __e1000_write_kmrn_reg - Write kumeran register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to write to
  *  @data: data to write at register offset
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary.  Then write the data to PHY register
  *  at the offset using the kumeran interface.  Release any acquired semaphores
  *  before exiting.
  **/
-s32 e1000e_write_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 data)
+static s32 __e1000_write_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 data,
+                                  bool locked)
 {
 	u32 kmrnctrlsta;
-	s32 ret_val;
+	s32 ret_val = 0;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
+	if (!locked) {
+		if (!(hw->phy.ops.acquire_phy))
+			goto out;
+
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			goto out;
+	}
 
 	kmrnctrlsta = ((offset << E1000_KMRNCTRLSTA_OFFSET_SHIFT) &
 		       E1000_KMRNCTRLSTA_OFFSET) | data;
 	ew32(KMRNCTRLSTA, kmrnctrlsta);
 
 	udelay(2);
-	hw->phy.ops.release_phy(hw);
 
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
+
+out:
 	return ret_val;
 }
 
 /**
+ *  e1000e_write_kmrn_reg -  Write kumeran register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Acquires semaphore then writes the data to the PHY register at the offset
+ *  using the kumeran interface.  Release the acquired semaphore before exiting.
+ **/
+s32 e1000e_write_kmrn_reg(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000_write_kmrn_reg(hw, offset, data, false);
+}
+
+/**
+ *  e1000_write_kmrn_reg_locked -  Write kumeran register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Write the data to PHY register at the offset using the kumeran interface.
+ *  Assumes semaphore already acquired.
+ **/
+s32 e1000_write_kmrn_reg_locked(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000_write_kmrn_reg(hw, offset, data, true);
+}
+
+/**
  *  e1000_copper_link_setup_82577 - Setup 82577 PHY for copper link
  *  @hw: pointer to the HW structure
  *
@@ -2105,6 +2263,10 @@ s32 e1000e_write_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 data)
 	u32 page = offset >> IGP_PAGE_SHIFT;
 	u32 page_shift = 0;
 
+	ret_val = hw->phy.ops.acquire_phy(hw);
+	if (ret_val)
+		return ret_val;
+
 	/* Page 800 works differently than the rest so it has its own func */
 	if (page == BM_WUC_PAGE) {
 		ret_val = e1000_access_phy_wakeup_reg_bm(hw, offset, &data,
@@ -2112,10 +2274,6 @@ s32 e1000e_write_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 data)
 		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		goto out;
-
 	hw->phy.addr = e1000_get_phy_addr_for_bm_page(page, offset);
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
@@ -2135,18 +2293,15 @@ s32 e1000e_write_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 data)
 		/* Page is shifted left, PHY expects (page x 32) */
 		ret_val = e1000e_write_phy_reg_mdic(hw, page_select,
 		                                    (page << page_shift));
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
+		if (ret_val)
 			goto out;
-		}
 	}
 
 	ret_val = e1000e_write_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
 	                                    data);
 
-	hw->phy.ops.release_phy(hw);
-
 out:
+	hw->phy.ops.release_phy(hw);
 	return ret_val;
 }
 
@@ -2167,6 +2322,10 @@ s32 e1000e_read_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 *data)
 	u32 page = offset >> IGP_PAGE_SHIFT;
 	u32 page_shift = 0;
 
+	ret_val = hw->phy.ops.acquire_phy(hw);
+	if (ret_val)
+		return ret_val;
+
 	/* Page 800 works differently than the rest so it has its own func */
 	if (page == BM_WUC_PAGE) {
 		ret_val = e1000_access_phy_wakeup_reg_bm(hw, offset, data,
@@ -2174,10 +2333,6 @@ s32 e1000e_read_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 *data)
 		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		goto out;
-
 	hw->phy.addr = e1000_get_phy_addr_for_bm_page(page, offset);
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
@@ -2197,17 +2352,14 @@ s32 e1000e_read_phy_reg_bm(struct e1000_hw *hw, u32 offset, u16 *data)
 		/* Page is shifted left, PHY expects (page x 32) */
 		ret_val = e1000e_write_phy_reg_mdic(hw, page_select,
 		                                    (page << page_shift));
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
+		if (ret_val)
 			goto out;
-		}
 	}
 
 	ret_val = e1000e_read_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
 	                                   data);
-	hw->phy.ops.release_phy(hw);
-
 out:
+	hw->phy.ops.release_phy(hw);
 	return ret_val;
 }
 
@@ -2226,17 +2378,17 @@ s32 e1000e_read_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 *data)
 	s32 ret_val;
 	u16 page = (u16)(offset >> IGP_PAGE_SHIFT);
 
+	ret_val = hw->phy.ops.acquire_phy(hw);
+	if (ret_val)
+		return ret_val;
+
 	/* Page 800 works differently than the rest so it has its own func */
 	if (page == BM_WUC_PAGE) {
 		ret_val = e1000_access_phy_wakeup_reg_bm(hw, offset, data,
 							 true);
-		return ret_val;
+		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
-
 	hw->phy.addr = 1;
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
@@ -2245,16 +2397,14 @@ s32 e1000e_read_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 *data)
 		ret_val = e1000e_write_phy_reg_mdic(hw, BM_PHY_PAGE_SELECT,
 						    page);
 
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
-			return ret_val;
-		}
+		if (ret_val)
+			goto out;
 	}
 
 	ret_val = e1000e_read_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
 					   data);
+out:
 	hw->phy.ops.release_phy(hw);
-
 	return ret_val;
 }
 
@@ -2272,17 +2422,17 @@ s32 e1000e_write_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 data)
 	s32 ret_val;
 	u16 page = (u16)(offset >> IGP_PAGE_SHIFT);
 
+	ret_val = hw->phy.ops.acquire_phy(hw);
+	if (ret_val)
+		return ret_val;
+
 	/* Page 800 works differently than the rest so it has its own func */
 	if (page == BM_WUC_PAGE) {
 		ret_val = e1000_access_phy_wakeup_reg_bm(hw, offset, &data,
 							 false);
-		return ret_val;
+		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
-
 	hw->phy.addr = 1;
 
 	if (offset > MAX_PHY_MULTI_PAGE_REG) {
@@ -2290,17 +2440,15 @@ s32 e1000e_write_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 data)
 		ret_val = e1000e_write_phy_reg_mdic(hw, BM_PHY_PAGE_SELECT,
 						    page);
 
-		if (ret_val) {
-			hw->phy.ops.release_phy(hw);
-			return ret_val;
-		}
+		if (ret_val)
+			goto out;
 	}
 
 	ret_val = e1000e_write_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & offset,
 					    data);
 
+out:
 	hw->phy.ops.release_phy(hw);
-
 	return ret_val;
 }
 
@@ -2320,6 +2468,8 @@ s32 e1000e_write_phy_reg_bm2(struct e1000_hw *hw, u32 offset, u16 data)
  *  3) Write the address using the address opcode (0x11)
  *  4) Read or write the data using the data opcode (0x12)
  *  5) Restore 769_17.2 to its original value
+ *
+ *  Assumes semaphore already acquired.
  **/
 static s32 e1000_access_phy_wakeup_reg_bm(struct e1000_hw *hw, u32 offset,
 					  u16 *data, bool read)
@@ -2327,20 +2477,12 @@ static s32 e1000_access_phy_wakeup_reg_bm(struct e1000_hw *hw, u32 offset,
 	s32 ret_val;
 	u16 reg = BM_PHY_REG_NUM(offset);
 	u16 phy_reg = 0;
-	u8  phy_acquired = 1;
-
 
 	/* Gig must be disabled for MDIO accesses to page 800 */
 	if ((hw->mac.type == e1000_pchlan) &&
 	   (!(er32(PHY_CTRL) & E1000_PHY_CTRL_GBE_DISABLE)))
 		hw_dbg(hw, "Attempting to access page 800 while gig enabled\n");
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val) {
-		phy_acquired = 0;
-		goto out;
-	}
-
 	/* All operations in this function are phy address 1 */
 	hw->phy.addr = 1;
 
@@ -2397,8 +2539,6 @@ static s32 e1000_access_phy_wakeup_reg_bm(struct e1000_hw *hw, u32 offset,
 	ret_val = e1000e_write_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, phy_reg);
 
 out:
-	if (phy_acquired == 1)
-		hw->phy.ops.release_phy(hw);
 	return ret_val;
 }
 
@@ -2439,52 +2579,63 @@ static s32 e1000_set_d0_lplu_state(struct e1000_hw *hw, bool active)
 	return 0;
 }
 
+/**
+ *  e1000_set_mdio_slow_mode_hv - Set slow MDIO access mode
+ *  @hw:   pointer to the HW structure
+ *  @slow: true for slow mode, false for normal mode
+ *
+ *  Assumes semaphore already acquired.
+ **/
 s32 e1000_set_mdio_slow_mode_hv(struct e1000_hw *hw, bool slow)
 {
 	s32 ret_val = 0;
 	u16 data = 0;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		return ret_val;
-
 	/* Set MDIO mode - page 769, register 16: 0x2580==slow, 0x2180==fast */
 	hw->phy.addr = 1;
 	ret_val = e1000e_write_phy_reg_mdic(hw, IGP01E1000_PHY_PAGE_SELECT,
 				         (BM_PORT_CTRL_PAGE << IGP_PAGE_SHIFT));
-	if (ret_val) {
-		hw->phy.ops.release_phy(hw);
-		return ret_val;
-	}
+	if (ret_val)
+		goto out;
+
 	ret_val = e1000e_write_phy_reg_mdic(hw, BM_CS_CTRL1,
 	                                   (0x2180 | (slow << 10)));
+	if (ret_val)
+		goto out;
 
 	/* dummy read when reverting to fast mode - throw away result */
 	if (!slow)
-		e1000e_read_phy_reg_mdic(hw, BM_CS_CTRL1, &data);
-
-	hw->phy.ops.release_phy(hw);
+		ret_val = e1000e_read_phy_reg_mdic(hw, BM_CS_CTRL1, &data);
 
+out:
 	return ret_val;
 }
 
 /**
- *  e1000_read_phy_reg_hv -  Read HV PHY register
+ *  __e1000_read_phy_reg_hv -  Read HV PHY register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to be read
  *  @data: pointer to the read data
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary, then reads the PHY register at offset
- *  and storing the retrieved information in data.  Release any acquired
+ *  and stores the retrieved information in data.  Release any acquired
  *  semaphore before exiting.
  **/
-s32 e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data)
+static s32 __e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data,
+                                   bool locked)
 {
 	s32 ret_val;
 	u16 page = BM_PHY_REG_PAGE(offset);
 	u16 reg = BM_PHY_REG_NUM(offset);
 	bool in_slow_mode = false;
 
+	if (!locked) {
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			return ret_val;
+	}
+
 	/* Workaround failure in MDIO access while cable is disconnected */
 	if ((hw->phy.type == e1000_phy_82577) &&
 	    !(er32(STATUS) & E1000_STATUS_LU)) {
@@ -2508,10 +2659,6 @@ s32 e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data)
 		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		goto out;
-
 	hw->phy.addr = e1000_get_phy_addr_for_hv_page(page);
 
 	if (page == HV_INTC_FC_PAGE_START)
@@ -2529,42 +2676,76 @@ s32 e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data)
 			ret_val = e1000e_write_phy_reg_mdic(hw,
 			                             IGP01E1000_PHY_PAGE_SELECT,
 			                             (page << IGP_PAGE_SHIFT));
-			if (ret_val) {
-				hw->phy.ops.release_phy(hw);
-				goto out;
-			}
 			hw->phy.addr = phy_addr;
 		}
 	}
 
 	ret_val = e1000e_read_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & reg,
 	                                  data);
-	hw->phy.ops.release_phy(hw);
-
 out:
 	/* Revert to MDIO fast mode, if applicable */
 	if ((hw->phy.type == e1000_phy_82577) && in_slow_mode)
 		ret_val = e1000_set_mdio_slow_mode_hv(hw, false);
 
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
+
 	return ret_val;
 }
 
 /**
- *  e1000_write_phy_reg_hv - Write HV PHY register
+ *  e1000_read_phy_reg_hv -  Read HV PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Acquires semaphore then reads the PHY register at offset and stores
+ *  the retrieved information in data.  Release the acquired semaphore
+ *  before exiting.
+ **/
+s32 e1000_read_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000_read_phy_reg_hv(hw, offset, data, false);
+}
+
+/**
+ *  e1000_read_phy_reg_hv_locked -  Read HV PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to be read
+ *  @data: pointer to the read data
+ *
+ *  Reads the PHY register at offset and stores the retrieved information
+ *  in data.  Assumes semaphore already acquired.
+ **/
+s32 e1000_read_phy_reg_hv_locked(struct e1000_hw *hw, u32 offset, u16 *data)
+{
+	return __e1000_read_phy_reg_hv(hw, offset, data, true);
+}
+
+/**
+ *  __e1000_write_phy_reg_hv - Write HV PHY register
  *  @hw: pointer to the HW structure
  *  @offset: register offset to write to
  *  @data: data to write at register offset
+ *  @locked: semaphore has already been acquired or not
  *
  *  Acquires semaphore, if necessary, then writes the data to PHY register
  *  at the offset.  Release any acquired semaphores before exiting.
  **/
-s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data)
+static s32 __e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data,
+                                    bool locked)
 {
 	s32 ret_val;
 	u16 page = BM_PHY_REG_PAGE(offset);
 	u16 reg = BM_PHY_REG_NUM(offset);
 	bool in_slow_mode = false;
 
+	if (!locked) {
+		ret_val = hw->phy.ops.acquire_phy(hw);
+		if (ret_val)
+			return ret_val;
+	}
+
 	/* Workaround failure in MDIO access while cable is disconnected */
 	if ((hw->phy.type == e1000_phy_82577) &&
 	    !(er32(STATUS) & E1000_STATUS_LU)) {
@@ -2588,10 +2769,6 @@ s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data)
 		goto out;
 	}
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val)
-		goto out;
-
 	hw->phy.addr = e1000_get_phy_addr_for_hv_page(page);
 
 	if (page == HV_INTC_FC_PAGE_START)
@@ -2607,15 +2784,10 @@ s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data)
 	    ((MAX_PHY_REG_ADDRESS & reg) == 0) &&
 	    (data & (1 << 11))) {
 		u16 data2 = 0x7EFF;
-		hw->phy.ops.release_phy(hw);
 		ret_val = e1000_access_phy_debug_regs_hv(hw, (1 << 6) | 0x3,
 		                                         &data2, false);
 		if (ret_val)
 			goto out;
-
-		ret_val = hw->phy.ops.acquire_phy(hw);
-		if (ret_val)
-			goto out;
 	}
 
 	if (reg > MAX_PHY_MULTI_PAGE_REG) {
@@ -2630,27 +2802,53 @@ s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data)
 			ret_val = e1000e_write_phy_reg_mdic(hw,
 			                             IGP01E1000_PHY_PAGE_SELECT,
 			                             (page << IGP_PAGE_SHIFT));
-			if (ret_val) {
-				hw->phy.ops.release_phy(hw);
-				goto out;
-			}
 			hw->phy.addr = phy_addr;
 		}
 	}
 
 	ret_val = e1000e_write_phy_reg_mdic(hw, MAX_PHY_REG_ADDRESS & reg,
 	                                  data);
-	hw->phy.ops.release_phy(hw);
 
 out:
 	/* Revert to MDIO fast mode, if applicable */
 	if ((hw->phy.type == e1000_phy_82577) && in_slow_mode)
 		ret_val = e1000_set_mdio_slow_mode_hv(hw, false);
 
+	if (!locked)
+		hw->phy.ops.release_phy(hw);
+
 	return ret_val;
 }
 
 /**
+ *  e1000_write_phy_reg_hv - Write HV PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Acquires semaphore then writes the data to PHY register at the offset.
+ *  Release the acquired semaphores before exiting.
+ **/
+s32 e1000_write_phy_reg_hv(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000_write_phy_reg_hv(hw, offset, data, false);
+}
+
+/**
+ *  e1000_write_phy_reg_hv_locked - Write HV PHY register
+ *  @hw: pointer to the HW structure
+ *  @offset: register offset to write to
+ *  @data: data to write at register offset
+ *
+ *  Writes the data to PHY register at the offset.  Assumes semaphore
+ *  already acquired.
+ **/
+s32 e1000_write_phy_reg_hv_locked(struct e1000_hw *hw, u32 offset, u16 data)
+{
+	return __e1000_write_phy_reg_hv(hw, offset, data, true);
+}
+
+/**
  *  e1000_get_phy_addr_for_hv_page - Get PHY adrress based on page
  *  @page: page to be accessed
  **/
@@ -2671,10 +2869,9 @@ static u32 e1000_get_phy_addr_for_hv_page(u32 page)
  *  @data: pointer to the data to be read or written
  *  @read: determines if operation is read or written
  *
- *  Acquires semaphore, if necessary, then reads the PHY register at offset
- *  and storing the retreived information in data.  Release any acquired
- *  semaphores before exiting.  Note that the procedure to read these regs
- *  uses the address port and data port to read/write.
+ *  Reads the PHY register at offset and stores the retreived information
+ *  in data.  Assumes semaphore already acquired.  Note that the procedure
+ *  to read these regs uses the address port and data port to read/write.
  **/
 static s32 e1000_access_phy_debug_regs_hv(struct e1000_hw *hw, u32 offset,
                                           u16 *data, bool read)
@@ -2682,20 +2879,12 @@ static s32 e1000_access_phy_debug_regs_hv(struct e1000_hw *hw, u32 offset,
 	s32 ret_val;
 	u32 addr_reg = 0;
 	u32 data_reg = 0;
-	u8  phy_acquired = 1;
 
 	/* This takes care of the difference with desktop vs mobile phy */
 	addr_reg = (hw->phy.type == e1000_phy_82578) ?
 	           I82578_ADDR_REG : I82577_ADDR_REG;
 	data_reg = addr_reg + 1;
 
-	ret_val = hw->phy.ops.acquire_phy(hw);
-	if (ret_val) {
-		hw_dbg(hw, "Could not acquire PHY\n");
-		phy_acquired = 0;
-		goto out;
-	}
-
 	/* All operations in this function are phy address 2 */
 	hw->phy.addr = 2;
 
@@ -2718,8 +2907,6 @@ static s32 e1000_access_phy_debug_regs_hv(struct e1000_hw *hw, u32 offset,
 	}
 
 out:
-	if (phy_acquired == 1)
-		hw->phy.ops.release_phy(hw);
 	return ret_val;
 }
 


^ permalink raw reply related

* [net-2.6 PATCH 1/3] igb: fix memory leak when setting ring size while interface is down
From: Jeff Kirsher @ 2009-10-26 21:31 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

Changing ring sizes while the interface was down was causing a double
allocation of the receive and transmit rings.  This issue is amplified when
there are multiple rings enabled.  To prevent this we need to add an
additional check which will just update the ring counts when the interface
is not up and skip the allocation steps.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igb/igb_ethtool.c |   35 ++++++++++++++++++++++-------------
 1 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index d004c35..aab3d97 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -731,7 +731,7 @@ static int igb_set_ringparam(struct net_device *netdev,
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct igb_ring *temp_ring;
-	int i, err;
+	int i, err = 0;
 	u32 new_rx_count, new_tx_count;
 
 	if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
@@ -751,18 +751,30 @@ static int igb_set_ringparam(struct net_device *netdev,
 		return 0;
 	}
 
+	while (test_and_set_bit(__IGB_RESETTING, &adapter->state))
+		msleep(1);
+
+	if (!netif_running(adapter->netdev)) {
+		for (i = 0; i < adapter->num_tx_queues; i++)
+			adapter->tx_ring[i].count = new_tx_count;
+		for (i = 0; i < adapter->num_rx_queues; i++)
+			adapter->rx_ring[i].count = new_rx_count;
+		adapter->tx_ring_count = new_tx_count;
+		adapter->rx_ring_count = new_rx_count;
+		goto clear_reset;
+	}
+
 	if (adapter->num_tx_queues > adapter->num_rx_queues)
 		temp_ring = vmalloc(adapter->num_tx_queues * sizeof(struct igb_ring));
 	else
 		temp_ring = vmalloc(adapter->num_rx_queues * sizeof(struct igb_ring));
-	if (!temp_ring)
-		return -ENOMEM;
 
-	while (test_and_set_bit(__IGB_RESETTING, &adapter->state))
-		msleep(1);
+	if (!temp_ring) {
+		err = -ENOMEM;
+		goto clear_reset;
+	}
 
-	if (netif_running(adapter->netdev))
-		igb_down(adapter);
+	igb_down(adapter);
 
 	/*
 	 * We can't just free everything and then setup again,
@@ -819,14 +831,11 @@ static int igb_set_ringparam(struct net_device *netdev,
 
 		adapter->rx_ring_count = new_rx_count;
 	}
-
-	err = 0;
 err_setup:
-	if (netif_running(adapter->netdev))
-		igb_up(adapter);
-
-	clear_bit(__IGB_RESETTING, &adapter->state);
+	igb_up(adapter);
 	vfree(temp_ring);
+clear_reset:
+	clear_bit(__IGB_RESETTING, &adapter->state);
 	return err;
 }
 


^ permalink raw reply related

* [net-2.6 PATCH 2/3] ixgbe: fix memory leak when resizing rings while interface is down
From: Jeff Kirsher @ 2009-10-26 21:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20091026213147.9993.9778.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch resolves a memory leak that occurs when you resize the rings via
the ethtool -G option while the interface is down.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_ethtool.c |   22 +++++++++++++---------
 1 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index fa314cb..856c18c 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -798,7 +798,7 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 	struct ixgbe_ring *temp_tx_ring, *temp_rx_ring;
-	int i, err;
+	int i, err = 0;
 	u32 new_rx_count, new_tx_count;
 	bool need_update = false;
 
@@ -822,6 +822,16 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 	while (test_and_set_bit(__IXGBE_RESETTING, &adapter->state))
 		msleep(1);
 
+	if (!netif_running(adapter->netdev)) {
+		for (i = 0; i < adapter->num_tx_queues; i++)
+			adapter->tx_ring[i].count = new_tx_count;
+		for (i = 0; i < adapter->num_rx_queues; i++)
+			adapter->rx_ring[i].count = new_rx_count;
+		adapter->tx_ring_count = new_tx_count;
+		adapter->rx_ring_count = new_rx_count;
+		goto err_setup;
+	}
+
 	temp_tx_ring = kcalloc(adapter->num_tx_queues,
 	                       sizeof(struct ixgbe_ring), GFP_KERNEL);
 	if (!temp_tx_ring) {
@@ -879,8 +889,7 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 
 	/* if rings need to be updated, here's the place to do it in one shot */
 	if (need_update) {
-		if (netif_running(netdev))
-			ixgbe_down(adapter);
+		ixgbe_down(adapter);
 
 		/* tx */
 		if (new_tx_count != adapter->tx_ring_count) {
@@ -897,13 +906,8 @@ static int ixgbe_set_ringparam(struct net_device *netdev,
 			temp_rx_ring = NULL;
 			adapter->rx_ring_count = new_rx_count;
 		}
-	}
-
-	/* success! */
-	err = 0;
-	if (netif_running(netdev))
 		ixgbe_up(adapter);
-
+	}
 err_setup:
 	clear_bit(__IXGBE_RESETTING, &adapter->state);
 	return err;


^ permalink raw reply related

* [net-2.6 PATCH 3/3] igbvf: fix memory leak when ring size changed while interface down
From: Jeff Kirsher @ 2009-10-26 21:32 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Alexander Duyck, Jeff Kirsher
In-Reply-To: <20091026213147.9993.9778.stgit@localhost.localdomain>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch resolves a memory leak which occurs while changing the ring size
while the interface is down.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/igbvf/ethtool.c |   30 +++++++++++++++++-------------
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/net/igbvf/ethtool.c b/drivers/net/igbvf/ethtool.c
index ee17a09..c68265b 100644
--- a/drivers/net/igbvf/ethtool.c
+++ b/drivers/net/igbvf/ethtool.c
@@ -279,7 +279,7 @@ static int igbvf_set_ringparam(struct net_device *netdev,
 {
 	struct igbvf_adapter *adapter = netdev_priv(netdev);
 	struct igbvf_ring *temp_ring;
-	int err;
+	int err = 0;
 	u32 new_rx_count, new_tx_count;
 
 	if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
@@ -299,15 +299,22 @@ static int igbvf_set_ringparam(struct net_device *netdev,
 		return 0;
 	}
 
-	temp_ring = vmalloc(sizeof(struct igbvf_ring));
-	if (!temp_ring)
-		return -ENOMEM;
-
 	while (test_and_set_bit(__IGBVF_RESETTING, &adapter->state))
 		msleep(1);
 
-	if (netif_running(adapter->netdev))
-		igbvf_down(adapter);
+	if (!netif_running(adapter->netdev)) {
+		adapter->tx_ring->count = new_tx_count;
+		adapter->rx_ring->count = new_rx_count;
+		goto clear_reset;
+	}
+
+	temp_ring = vmalloc(sizeof(struct igbvf_ring));
+	if (!temp_ring) {
+		err = -ENOMEM;
+		goto clear_reset;
+	}
+
+	igbvf_down(adapter);
 
 	/*
 	 * We can't just free everything and then setup again,
@@ -339,14 +346,11 @@ static int igbvf_set_ringparam(struct net_device *netdev,
 
 		memcpy(adapter->rx_ring, temp_ring,sizeof(struct igbvf_ring));
 	}
-
-	err = 0;
 err_setup:
-	if (netif_running(adapter->netdev))
-		igbvf_up(adapter);
-
-	clear_bit(__IGBVF_RESETTING, &adapter->state);
+	igbvf_up(adapter);
 	vfree(temp_ring);
+clear_reset:
+	clear_bit(__IGBVF_RESETTING, &adapter->state);
 	return err;
 }
 


^ permalink raw reply related

* [PATCH -next] netxen: fix builds for SYSFS=n or MODULES=n
From: Randy Dunlap @ 2009-10-26 22:09 UTC (permalink / raw)
  To: Stephen Rothwell, netdev; +Cc: linux-next, LKML, davem, Dhananjay Phadke
In-Reply-To: <20091026172104.205f15af.sfr@canb.auug.org.au>

From: Randy Dunlap <randy.dunlap@oracle.com>

When CONFIG_MODULES=n:
drivers/net/netxen/netxen_nic_main.c:2751: error: dereferencing pointer to incomplete type
drivers/net/netxen/netxen_nic_main.c:2764: error: dereferencing pointer to incomplete type

Also needs addition of <linux/sysfs.h> for sysfs function prototypes or
stubs when CONFIG_SYSFS=n.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 drivers/net/netxen/netxen_nic_main.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- linux-next-20091026.orig/drivers/net/netxen/netxen_nic_main.c
+++ linux-next-20091026/drivers/net/netxen/netxen_nic_main.c
@@ -34,6 +34,7 @@
 #include <net/ip.h>
 #include <linux/ipv6.h>
 #include <linux/inetdevice.h>
+#include <linux/sysfs.h>
 
 MODULE_DESCRIPTION("NetXen Multi port (1/10) Gigabit Network Driver");
 MODULE_LICENSE("GPL");
@@ -2500,6 +2501,7 @@ static struct bin_attribute bin_attr_mem
 	.write = netxen_sysfs_write_mem,
 };
 
+#ifdef CONFIG_MODULES
 static ssize_t
 netxen_store_auto_fw_reset(struct module_attribute *mattr,
 		struct module *mod, const char *buf, size_t count)
@@ -2534,6 +2536,7 @@ static struct module_attribute mod_attr_
 	.show = netxen_show_auto_fw_reset,
 	.store = netxen_store_auto_fw_reset,
 };
+#endif
 
 static void
 netxen_create_sysfs_entries(struct netxen_adapter *adapter)
@@ -2739,7 +2742,9 @@ static struct pci_driver netxen_driver =
 
 static int __init netxen_init_module(void)
 {
+#ifdef CONFIG_MODULES
 	struct module *mod = THIS_MODULE;
+#endif
 
 	printk(KERN_INFO "%s\n", netxen_nic_driver_string);
 
@@ -2748,9 +2753,11 @@ static int __init netxen_init_module(voi
 	register_inetaddr_notifier(&netxen_inetaddr_cb);
 #endif
 
+#ifdef CONFIG_MODULES
 	if (sysfs_create_file(&mod->mkobj.kobj, &mod_attr_fw_reset.attr))
 		printk(KERN_ERR "%s: Failed to create auto_fw_reset "
 				"sysfs entry.", netxen_nic_driver_name);
+#endif
 
 	return pci_register_driver(&netxen_driver);
 }
@@ -2759,9 +2766,11 @@ module_init(netxen_init_module);
 
 static void __exit netxen_exit_module(void)
 {
+#ifdef CONFIG_MODULES
 	struct module *mod = THIS_MODULE;
 
 	sysfs_remove_file(&mod->mkobj.kobj, &mod_attr_fw_reset.attr);
+#endif
 
 	pci_unregister_driver(&netxen_driver);
 

^ permalink raw reply

* Re: [PATCH 0/5] Candidate fix for increased number of GFP_ATOMIC failures V2
From: Frans Pop @ 2009-10-26 22:17 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Jiri Kosina, Sven Geggus, Karol Lewandowski, Tobias Oetiker,
	Rafael J. Wysocki, David Miller, Reinette Chatre, Kalle Valo,
	David Rientjes, KOSAKI Motohiro, Mohamed Abbas, Jens Axboe,
	John W. Linville, Pekka Enberg, Bartlomiej Zolnierkiewicz,
	Greg Kroah-Hartman, Stephan von Krawczynski, Kernel Testers List,
	netdev, linux-kernel, linux-mm@kvack.org
In-Reply-To: <1256221356-26049-1-git-send-email-mel@csn.ul.ie>

On Thursday 22 October 2009, Mel Gorman wrote:
> Test 1: Verify your problem occurs on 2.6.32-rc5 if you can

I've tested against 2.6.31.1 as it's easier for me to compare behaviors 
with that than with .32. All patches applied without problems against .31.

I've also tested 2.6.31.1 with SLAB instead of SLUB, but that does not seem 
to make a significant difference for my test.

> Test 2: Apply the following two patches and test again
>   1/5 page allocator: Always wake kswapd when restarting an allocation
>       attempt after direct reclaim failed
>   2/5 page allocator: Do not allow interrupts to use ALLOC_HARDER

Does not look to make any difference. Possibly causes more variation in the 
duration of the test (increases timing effects)?

> Test 3: If you are getting allocation failures, try with the following
> patch
>   3/5 vmscan: Force kswapd to take notice faster when high-order
>       watermarks are being hit

Applied on top of patches 1-2. Does not look to make any difference.

> Test 4: If you are still getting failures, apply the following
>   4/5 page allocator: Pre-emptively wake kswapd when high-order
>       watermarks are hit

Applied on top of patches 1-3. Does not look to make any difference.

> Test 5: If things are still screwed, apply the following
>   5/5 Revert 373c0a7e, 8aa7e847: Fix congestion_wait() sync/async vs
>       read/write confusion

Applied on top of patches 1-4. Despite Jens' scepticism is this still the 
patch that makes the most significant difference in my test.
The reading of commits in gitk is much more fluent and music skips are a 
lot less severe. But most important is that there is no long total freeze 
of the system halfway during the reading of commits and gitk loads 
fastest. It also gives by far the most consistent results.
The likelyhood of SKB allocation errors during the test is a lot smaller.
See also http://lkml.org/lkml/2009/10/26/455.


Detailed test results follow. I've done 2 test runs with each kernel (3 for 
the last).

The columns below give the following info:
- time at which all commits have been read by gitk
- time at which gitk fills in "branch", "follows" and "precedes" data for
  the current commit
- time at which there's no longer any disk activity, i.e. when gitk is
  fully loaded and all swapping is done
- total number of SKB allocation errors during the test
A "freeze" during the reading of commits is indicated by an "f" (short 
freeze) or "F" (long "hard" freeze). An "S" shows when there were SKB 
allocation errors.

		end commits	show branch	done		SKB errs
1) vanilla .31.1
run 1:		1:20 fFS	2:10 S		2:30		44 a)
run 2:		1:35 FS		1:45		2:10		13

2) .31.1 + patches 1-2
run1:		2:30 fFS	2:45		3:00		58
run2:		1:15 fS		2:00		2:20		2 a)

3) .31.1 + patches 1-3
run1:		1:00 fS		1:15		1:45		1 *)
run2:		3:00 fFS	3:15		3:30		33
*) unexpected; fortunate timing?

4) .31.1 + patches 1-4
run1:		1:10 ffS	1:55 S		2:20		35 a)
run2:		3:05 fFS	3:15		3:25		36

5) .31.1 + patches 1-5
run1:		1:00		1:15		1:35		0
run2:		0:50		1:15 S		1:45		45 *)
run3:		1:00		1:15		1:45		0
*) unexpected; unfortunate timing?

a) fast in 1st phase; slow in 2nd and 3rd

Note that without the congestion_wait() reverts occurrence of SKB errors, 
the long freezes and time it takes for gitk to load seem roughly related; 
with the reverts total time is not affected even with many SKB errors.

Cheers,
FJP

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH net-next] bnx2x: Do Tx handling in a separate tasklet.
From: David Miller @ 2009-10-26 22:28 UTC (permalink / raw)
  To: vladz; +Cc: eilong, netdev
In-Reply-To: <8628FE4E7912BF47A96AE7DD7BAC0AADCB2CFF1C42@SJEXCHCCR02.corp.ad.broadcom.com>

From: "Vladislav Zolotarov" <vladz@broadcom.com>
Date: Mon, 26 Oct 2009 07:42:27 -0700

> The separation of Tx and Rx interrupt handling gives us the
> possibility to properly affinitize the Rx (heavy CPU consuming task)
> and Tx (low CPU consuming task) and to ensure that Tx work is done
> not long after the Tx interrupt without interference of Rx work thus
> letting the user benefit from Tx coalescing configuration in order
> to achieve the best performance in each specific scenario. This is
> most important in heavy load scenarios with mixed traffic (UDP + TCP
> for instance). If we didn't separate Tx and Rx interrupt handling Tx
> coalescing configuration was not worth much.

There are other issues:

1) Actually, it makes sense to do TX and RX work together, since TX
   packet liberation makes fresh CPU local packets available for
   responses generated by RX packet reception.

2) TX packet liberation is not low CPU consumption, it has to perform
   many atomic instructions, reference socket state, enter the SLAB
   allocator, potentially liberate netfilter state, etc.

Using NAPI also moves the TX freeing into softirq context.

If you do it from a hardirq you are making it more expensive.  From
hardirq the free just puts the SKB on a list, schedules a softirq,
then does the real SKB free work from the softirq.

This needless SKB list management and softirq scheduling you'll
avoid if you do things from softirqs, and thus using NAPI makes
sense here.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox