* Linux 2.4.33-rc1
@ 2006-06-16 18:14 Marcelo Tosatti
2006-06-16 22:21 ` Grant Coady
0 siblings, 1 reply; 36+ messages in thread
From: Marcelo Tosatti @ 2006-06-16 18:14 UTC (permalink / raw)
To: linux-kernel; +Cc: Willy Tarreau
Here is 2.4.33-rc1.
Spread collection of fixes, including a number of security fixes
in networking.
Refer to the changelog for details.
Summary of changes from v2.4.33-pre3 to v2.4.33-rc1
============================================
Jean Delvare:
scx200_acb: Fix resource name use after free
i2c: Delete 2 out-of-date, colliding ioctl defines
Jesper Juhl:
fix mem-leak in netfilter
Kirill Korotaev:
[NETFILTER]: Fix possible overflow in netfilters do_replace()
Marcelo Tosatti:
Contact information update
Vadim Egorov: ext3 link/unlink race
Fix vfs_unlink issue introduced by link/unlink race correction
Merge master.kernel.org:/.../davem/net-2.4
Update VERSION to v2.4.33-rc1
Olaf Kirch:
smbfs chroot issue (CVE-2006-1864)
Patrick McHardy:
[NETFILTER]: SNMP NAT: fix memory corruption
Pradeep Vincent:
Neighbour Cache (ARP) State machine bug Fixed
Sergei Shtylyov:
AMD Au1xx0: fix ethernet TX stats
Shaun Tancheff:
USB: Gadget RNDIS fix alloc bug. (buffer overflow)
Sridhar Samudrala:
[SCTP]: Fix state table entries for chunks received in CLOSED state. (CVE-2006-2271)
[SCTP]: Fix panic's when receiving fragmented SCTP control chunks. (CVE-2006-2272)
Stephen Hemminger:
[IPV4]: ip_route_input panic fix (CVE-2006-1525)
Theodore Ts'o:
Fix memory leak when the ext3's journal file is corrupted
Vladislav Yasevich:
[SCTP]: Prevent possible infinite recursion with multiple bundled DATA. (CVE-2006-2274)
Willy TARREAU:
fix /proc/partitions display with USB-FDD geometry
Merge branch 'origin'
fix /proc/partitions display with USB-FDD geometry
ext2: update inode ctime on rename()
bonding: remove a warning with gcc 3
block: remove the annoying 'blk: queue %p I/O limit %luMb' messages
scripts : ver_linux does not report recent binutils version
3c59x: reload EEPROM values at rmmod for needy cards
netdrv: fix b44 loading after bcm4400
netdrv: b44 driver must ignore carrier lost errors
drm: gcc complains that print_heap() in radeon_mem.c is not used.
ide: trying to enable DMA may cause an oops
Merge branch 'updates'
JBD: avoid panic on corrupted journal superblock (from akpm)
Willy Tarreau:
forcedeth update to 0.50
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: Linux 2.4.33-rc1 2006-06-16 18:14 Linux 2.4.33-rc1 Marcelo Tosatti @ 2006-06-16 22:21 ` Grant Coady 2006-06-16 22:38 ` Michal Piotrowski 2006-06-18 13:37 ` Marcelo Tosatti 0 siblings, 2 replies; 36+ messages in thread From: Grant Coady @ 2006-06-16 22:21 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: linux-kernel, Willy Tarreau On Fri, 16 Jun 2006 15:14:19 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: >Here is 2.4.33-rc1. I provoked a network related oops as user 'grant', with this CLI input boo-boo on an ssh terminal: grant@sempro:~$ rm /home/share/config-2.6.17-rc6-mm1a dmesg-2.6.17-rc6-mm1a /home/share is an NFS mounted directory Was able to login direct as root and copy the oops info with mouse (gpm) copy/paste, but console locked up on localnet access, here's the guff: ksymoops 2.4.11 on i686 2.4.33-rc1. Options used -v /home/grant/linux/linux-2.4.33-rc1/vmlinux (specified) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.33-rc1/ (default) -m /boot/System.map-2.4.33-rc1 (specified) Unable to handle kernel NULL pointer dereference at virtual address 00000088 *pde = 00000000 Oops: 0002 CPU: 0 EIP: 0010:[<c013eeb4>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010282 eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 esi: f6e2fd08 edi: f5839ac0 ebp: f6e2fc80 esp: f5889f6c ds: 0018 es: 0018 ss: 0018 Process rm (pid: 243, stackpage=f5889000) Stack: f6e2fc80 f5839ac0 f5839ac0 f7bdd000 f5889f90 f5839ac0 c013f066 f6e2fc80 f5839ac0 f6e36440 c19ac440 f7bdd00c 00000016 be8f2661 00000010 00000000 00000004 f5888000 bffff96b 08051050 bffff758 c0106eff bffff96b 00000002 Call Trace: [<c013f066>] [<c0106eff>] Code: ff 80 88 00 00 00 0f 8e 2e 16 00 00 85 db 74 16 89 d8 8b 5c >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== >>esi; f6e2fd08 <_end+36a9405c/386be3d4> >>edi; f5839ac0 <_end+3549de14/386be3d4> >>ebp; f6e2fc80 <_end+36a93fd4/386be3d4> >>esp; f5889f6c <_end+354ee2c0/386be3d4> Trace; c013f066 <sys_unlink+b6/120> Trace; c0106eff <system_call+33/38> Code; c013eeb4 <vfs_unlink+a4/1a0> 00000000 <_EIP>: Code; c013eeb4 <vfs_unlink+a4/1a0> <===== 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== Code; c013eeba <vfs_unlink+aa/1a0> 6: 0f 8e 2e 16 00 00 jle 163a <_EIP+0x163a> Code; c013eec0 <vfs_unlink+b0/1a0> c: 85 db test %ebx,%ebx Code; c013eec2 <vfs_unlink+b2/1a0> e: 74 16 je 26 <_EIP+0x26> Code; c013eec4 <vfs_unlink+b4/1a0> 10: 89 d8 mov %ebx,%eax Code; c013eec6 <vfs_unlink+b6/1a0> 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx Just done it again, not my finger trouble at all, again via ssh terminal, deleting a file from an NFS mounted directory: grant@sempro:~$ rm /home/share/dmesg-2.6.17-rc6-mm1a Segmentation fault Note: the file was successfully deleted. No reboot for this one: root@sempro:~# ksymoops -v /home/grant/linux/linux-2.4.33-rc1/vmlinux -m /boot/System.map-2.4.33-rc1 oops2 ksymoops 2.4.11 on i686 2.4.33-rc1. Options used -v /home/grant/linux/linux-2.4.33-rc1/vmlinux (specified) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.33-rc1/ (default) -m /boot/System.map-2.4.33-rc1 (specified) Unable to handle kernel NULL pointer dereference at virtual address 00000088 *pde = 00000000 Oops: 0002 CPU: 0 EIP: 0010:[<c013eeb4>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010282 eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 esi: f6e2cd08 edi: f56cf640 ebp: f6e2cc80 esp: f5715f6c ds: 0018 es: 0018 ss: 0018 Process rm (pid: 311, stackpage=f5715000) Stack: f6e2cc80 f56cf640 f56cf640 f5d18000 f5715f90 f56cf640 c013f066 f6e2cc80 f56cf640 f6fdf440 c19ac440 f5d1800c 00000015 eab76344 00000010 00000000 00000004 f5714000 bffff982 08051050 bffff768 c0106eff bffff982 00000002 Call Trace: [<c013f066>] [<c0106eff>] Code: ff 80 88 00 00 00 0f 8e 2e 16 00 00 85 db 74 16 89 d8 8b 5c >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== >>esi; f6e2cd08 <_end+36a9105c/386be3d4> >>edi; f56cf640 <_end+35333994/386be3d4> >>ebp; f6e2cc80 <_end+36a90fd4/386be3d4> >>esp; f5715f6c <_end+3537a2c0/386be3d4> Trace; c013f066 <sys_unlink+b6/120> Trace; c0106eff <system_call+33/38> Code; c013eeb4 <vfs_unlink+a4/1a0> 00000000 <_EIP>: Code; c013eeb4 <vfs_unlink+a4/1a0> <===== 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== Code; c013eeba <vfs_unlink+aa/1a0> 6: 0f 8e 2e 16 00 00 jle 163a <_EIP+0x163a> Code; c013eec0 <vfs_unlink+b0/1a0> c: 85 db test %ebx,%ebx Code; c013eec2 <vfs_unlink+b2/1a0> e: 74 16 je 26 <_EIP+0x26> Code; c013eec4 <vfs_unlink+b4/1a0> 10: 89 d8 mov %ebx,%eax Code; c013eec6 <vfs_unlink+b6/1a0> 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx Since I use NFS here, no further testing 33-rc1 on the other boxen. More info, testing on request. The server exporting the NFS share is running 2.4.32-hf-latest: Now running: 2.4.32-hf32.6 Uptime: 08:19:43 up 16 days, 14:27, 1 user, load average: 0.00, 0.00, 0.00 <http://bugsplatter.mine.nu/test/boxen/deltree/> Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-16 22:21 ` Grant Coady @ 2006-06-16 22:38 ` Michal Piotrowski 2006-06-16 23:24 ` Grant Coady 2006-06-18 13:37 ` Marcelo Tosatti 1 sibling, 1 reply; 36+ messages in thread From: Michal Piotrowski @ 2006-06-16 22:38 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Willy Tarreau Hi Grant, On 17/06/06, Grant Coady <gcoady.lk@gmail.com> wrote: > On Fri, 16 Jun 2006 15:14:19 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: > > >Here is 2.4.33-rc1. > > I provoked a network related oops as user 'grant', with this CLI input > boo-boo on an ssh terminal: > > grant@sempro:~$ rm /home/share/config-2.6.17-rc6-mm1a dmesg-2.6.17-rc6-mm1a > > /home/share is an NFS mounted directory > > Was able to login direct as root and copy the oops info with mouse (gpm) > copy/paste, but console locked up on localnet access, here's the guff: > > ksymoops 2.4.11 on i686 2.4.33-rc1. Options used > -v /home/grant/linux/linux-2.4.33-rc1/vmlinux (specified) > -k /proc/ksyms (default) > -l /proc/modules (default) > -o /lib/modules/2.4.33-rc1/ (default) > -m /boot/System.map-2.4.33-rc1 (specified) > > Unable to handle kernel NULL pointer dereference at virtual address 00000088 > *pde = 00000000 > Oops: 0002 > CPU: 0 > EIP: 0010:[<c013eeb4>] Not tainted > Using defaults from ksymoops -t elf32-i386 -a i386 > EFLAGS: 00010282 > eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 > esi: f6e2fd08 edi: f5839ac0 ebp: f6e2fc80 esp: f5889f6c > ds: 0018 es: 0018 ss: 0018 > Process rm (pid: 243, stackpage=f5889000) > Stack: f6e2fc80 f5839ac0 f5839ac0 f7bdd000 f5889f90 f5839ac0 c013f066 f6e2fc80 > f5839ac0 f6e36440 c19ac440 f7bdd00c 00000016 be8f2661 00000010 00000000 > 00000004 f5888000 bffff96b 08051050 bffff758 c0106eff bffff96b 00000002 > Call Trace: [<c013f066>] [<c0106eff>] > Code: ff 80 88 00 00 00 0f 8e 2e 16 00 00 85 db 74 16 89 d8 8b 5c > > > >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== > > >>esi; f6e2fd08 <_end+36a9405c/386be3d4> > >>edi; f5839ac0 <_end+3549de14/386be3d4> > >>ebp; f6e2fc80 <_end+36a93fd4/386be3d4> > >>esp; f5889f6c <_end+354ee2c0/386be3d4> > > Trace; c013f066 <sys_unlink+b6/120> > Trace; c0106eff <system_call+33/38> > > Code; c013eeb4 <vfs_unlink+a4/1a0> > 00000000 <_EIP>: > Code; c013eeb4 <vfs_unlink+a4/1a0> <===== > 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== > Code; c013eeba <vfs_unlink+aa/1a0> > 6: 0f 8e 2e 16 00 00 jle 163a <_EIP+0x163a> > Code; c013eec0 <vfs_unlink+b0/1a0> > c: 85 db test %ebx,%ebx > Code; c013eec2 <vfs_unlink+b2/1a0> > e: 74 16 je 26 <_EIP+0x26> > Code; c013eec4 <vfs_unlink+b4/1a0> > 10: 89 d8 mov %ebx,%eax > Code; c013eec6 <vfs_unlink+b6/1a0> > 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx > > > Just done it again, not my finger trouble at all, again via ssh terminal, > deleting a file from an NFS mounted directory: > > grant@sempro:~$ rm /home/share/dmesg-2.6.17-rc6-mm1a > Segmentation fault > > Note: the file was successfully deleted. No reboot for this one: > > root@sempro:~# ksymoops -v /home/grant/linux/linux-2.4.33-rc1/vmlinux -m /boot/System.map-2.4.33-rc1 oops2 > ksymoops 2.4.11 on i686 2.4.33-rc1. Options used > -v /home/grant/linux/linux-2.4.33-rc1/vmlinux (specified) > -k /proc/ksyms (default) > -l /proc/modules (default) > -o /lib/modules/2.4.33-rc1/ (default) > -m /boot/System.map-2.4.33-rc1 (specified) > > Unable to handle kernel NULL pointer dereference at virtual address 00000088 > *pde = 00000000 > Oops: 0002 > CPU: 0 > EIP: 0010:[<c013eeb4>] Not tainted > Using defaults from ksymoops -t elf32-i386 -a i386 > EFLAGS: 00010282 > eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 > esi: f6e2cd08 edi: f56cf640 ebp: f6e2cc80 esp: f5715f6c > ds: 0018 es: 0018 ss: 0018 > Process rm (pid: 311, stackpage=f5715000) > Stack: f6e2cc80 f56cf640 f56cf640 f5d18000 f5715f90 f56cf640 c013f066 f6e2cc80 > f56cf640 f6fdf440 c19ac440 f5d1800c 00000015 eab76344 00000010 00000000 > 00000004 f5714000 bffff982 08051050 bffff768 c0106eff bffff982 00000002 > Call Trace: [<c013f066>] [<c0106eff>] > Code: ff 80 88 00 00 00 0f 8e 2e 16 00 00 85 db 74 16 89 d8 8b 5c > > > >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== > > >>esi; f6e2cd08 <_end+36a9105c/386be3d4> > >>edi; f56cf640 <_end+35333994/386be3d4> > >>ebp; f6e2cc80 <_end+36a90fd4/386be3d4> > >>esp; f5715f6c <_end+3537a2c0/386be3d4> > > Trace; c013f066 <sys_unlink+b6/120> > Trace; c0106eff <system_call+33/38> > > Code; c013eeb4 <vfs_unlink+a4/1a0> > 00000000 <_EIP>: > Code; c013eeb4 <vfs_unlink+a4/1a0> <===== > 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== > Code; c013eeba <vfs_unlink+aa/1a0> > 6: 0f 8e 2e 16 00 00 jle 163a <_EIP+0x163a> > Code; c013eec0 <vfs_unlink+b0/1a0> > c: 85 db test %ebx,%ebx > Code; c013eec2 <vfs_unlink+b2/1a0> > e: 74 16 je 26 <_EIP+0x26> > Code; c013eec4 <vfs_unlink+b4/1a0> > 10: 89 d8 mov %ebx,%eax > Code; c013eec6 <vfs_unlink+b6/1a0> > 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx > > > Since I use NFS here, no further testing 33-rc1 on the other boxen. > More info, testing on request. The server exporting the NFS share is > running 2.4.32-hf-latest: > > Now running: 2.4.32-hf32.6 > Uptime: 08:19:43 up 16 days, 14:27, 1 user, load average: 0.00, 0.00, 0.00 > > <http://bugsplatter.mine.nu/test/boxen/deltree/> > > Grant. Can you revert 42cce987d63d3048595b8b00d74be786707d5e5d commit? Regards, Michal -- Michal K. K. Piotrowski LTG - Linux Testers Group (http://www.stardust.webpages.pl/ltg/wiki/) ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-16 22:38 ` Michal Piotrowski @ 2006-06-16 23:24 ` Grant Coady 2006-06-17 5:13 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-16 23:24 UTC (permalink / raw) To: Michal Piotrowski; +Cc: Marcelo Tosatti, linux-kernel, Willy Tarreau On Sat, 17 Jun 2006 00:38:10 +0200, "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote: >Can you revert 42cce987d63d3048595b8b00d74be786707d5e5d commit? Only if you point me at the patch, that string seems not to be in the patch-2.4.33-rc1.gz file. Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-16 23:24 ` Grant Coady @ 2006-06-17 5:13 ` Willy Tarreau 2006-06-17 6:24 ` Grant Coady 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-17 5:13 UTC (permalink / raw) To: Grant Coady; +Cc: Michal Piotrowski, Marcelo Tosatti, linux-kernel Hi Grant, On Sat, Jun 17, 2006 at 09:24:02AM +1000, Grant Coady wrote: > On Sat, 17 Jun 2006 00:38:10 +0200, "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote: > > >Can you revert 42cce987d63d3048595b8b00d74be786707d5e5d commit? > > Only if you point me at the patch, that string seems not to be > in the patch-2.4.33-rc1.gz file. First, revert this one : http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=efc95599c0261dd7ab3a1d9071024ca140b4c644;hp=6601095e2de35f00325a33c8be6b548f81fe76d5 --- a/fs/namei.c +++ b/fs/namei.c @@ -1479,19 +1479,20 @@ int vfs_unlink(struct inode *dir, struct { int error; - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); error = may_delete(dir, dentry, 0); - if (!error) { - error = -EPERM; - if (dir->i_op && dir->i_op->unlink) { - DQUOT_INIT(dir); - if (d_mountpoint(dentry)) - error = -EBUSY; - else { - lock_kernel(); - error = dir->i_op->unlink(dir, dentry); - unlock_kernel(); - } + if (error) + return error; + + double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); + error = -EPERM; + if (dir->i_op && dir->i_op->unlink) { + DQUOT_INIT(dir); + if (d_mountpoint(dentry)) + error = -EBUSY; + else { + lock_kernel(); + error = dir->i_op->unlink(dir, dentry); + unlock_kernel(); } } double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); It will put you back to the state where all your machines hanged at boot with -hf32.5, then revert this one : http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=f41e0ce901260d3d1ae5bd8bae34266891b4a65d;hp=925c7ce0a2d9a676cd8e4a2baf411b23cf6762d6 --- a/fs/namei.c +++ b/fs/namei.c @@ -1479,7 +1479,7 @@ int vfs_unlink(struct inode *dir, struct { int error; - down(&dir->i_zombie); + double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); error = may_delete(dir, dentry, 0); if (!error) { error = -EPERM; @@ -1491,14 +1491,14 @@ int vfs_unlink(struct inode *dir, struct lock_kernel(); error = dir->i_op->unlink(dir, dentry); unlock_kernel(); - if (!error) - d_delete(dentry); } } } - up(&dir->i_zombie); - if (!error) + double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); + if (!error) { + d_delete(dentry); inode_dir_notify(dir, DN_DELETE); + } return error; } @@ -1607,18 +1607,19 @@ int vfs_link(struct dentry *old_dentry, struct inode *inode; int error; - down(&dir->i_zombie); error = -ENOENT; inode = old_dentry->d_inode; if (!inode) - goto exit_lock; - - error = may_create(dir, new_dentry); - if (error) - goto exit_lock; + goto exit; error = -EXDEV; if (dir->i_dev != inode->i_dev) + goto exit; + + double_down(&dir->i_zombie, &old_dentry->d_inode->i_zombie); + + error = may_create(dir, new_dentry); + if (error) goto exit_lock; /* @@ -1636,9 +1637,10 @@ int vfs_link(struct dentry *old_dentry, unlock_kernel(); exit_lock: - up(&dir->i_zombie); + double_up(&dir->i_zombie, &old_dentry->d_inode->i_zombie); if (!error) inode_dir_notify(dir, DN_CREATE); +exit: return error; } And you will have something equivalent to -hf32.6 (you remember, I only reverted the fix in -hf32.6, and did not merge Marcelo's fix for the fix). Then we will have to decide whether we can fix it again or revert it completely. I would have liked to Cc: Vadim Egorov, but I cannot find his email. > Grant. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-17 5:13 ` Willy Tarreau @ 2006-06-17 6:24 ` Grant Coady 2006-06-17 7:10 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-17 6:24 UTC (permalink / raw) To: Willy Tarreau; +Cc: Michal Piotrowski, Marcelo Tosatti, linux-kernel On Sat, 17 Jun 2006 07:13:57 +0200, Willy Tarreau <w@1wt.eu> wrote: >Hi Grant, > >On Sat, Jun 17, 2006 at 09:24:02AM +1000, Grant Coady wrote: >> On Sat, 17 Jun 2006 00:38:10 +0200, "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote: >> >> >Can you revert 42cce987d63d3048595b8b00d74be786707d5e5d commit? >> >> Only if you point me at the patch, that string seems not to be >> in the patch-2.4.33-rc1.gz file. > >First, revert this one : > > http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=efc95599c0261dd7ab3a1d9071024ca140b4c644;hp=6601095e2de35f00325a33c8be6b548f81fe76d5 > >--- a/fs/namei.c >+++ b/fs/namei.c >@@ -1479,19 +1479,20 @@ int vfs_unlink(struct inode *dir, struct > { > int error; > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > error = may_delete(dir, dentry, 0); >- if (!error) { >- error = -EPERM; >- if (dir->i_op && dir->i_op->unlink) { >- DQUOT_INIT(dir); >- if (d_mountpoint(dentry)) >- error = -EBUSY; >- else { >- lock_kernel(); >- error = dir->i_op->unlink(dir, dentry); >- unlock_kernel(); >- } >+ if (error) >+ return error; >+ >+ double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ error = -EPERM; >+ if (dir->i_op && dir->i_op->unlink) { >+ DQUOT_INIT(dir); >+ if (d_mountpoint(dentry)) >+ error = -EBUSY; >+ else { >+ lock_kernel(); >+ error = dir->i_op->unlink(dir, dentry); >+ unlock_kernel(); > } > } > double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > > >It will put you back to the state where all your machines hanged at boot >with -hf32.5, then revert this one : > > http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=f41e0ce901260d3d1ae5bd8bae34266891b4a65d;hp=925c7ce0a2d9a676cd8e4a2baf411b23cf6762d6 > >--- a/fs/namei.c >+++ b/fs/namei.c >@@ -1479,7 +1479,7 @@ int vfs_unlink(struct inode *dir, struct > { > int error; > >- down(&dir->i_zombie); >+ double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > error = may_delete(dir, dentry, 0); > if (!error) { > error = -EPERM; >@@ -1491,14 +1491,14 @@ int vfs_unlink(struct inode *dir, struct > lock_kernel(); > error = dir->i_op->unlink(dir, dentry); > unlock_kernel(); >- if (!error) >- d_delete(dentry); > } > } > } >- up(&dir->i_zombie); >- if (!error) >+ double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ if (!error) { >+ d_delete(dentry); > inode_dir_notify(dir, DN_DELETE); >+ } > return error; > } > >@@ -1607,18 +1607,19 @@ int vfs_link(struct dentry *old_dentry, > struct inode *inode; > int error; > >- down(&dir->i_zombie); > error = -ENOENT; > inode = old_dentry->d_inode; > if (!inode) >- goto exit_lock; >- >- error = may_create(dir, new_dentry); >- if (error) >- goto exit_lock; >+ goto exit; > > error = -EXDEV; > if (dir->i_dev != inode->i_dev) >+ goto exit; >+ >+ double_down(&dir->i_zombie, &old_dentry->d_inode->i_zombie); >+ >+ error = may_create(dir, new_dentry); >+ if (error) > goto exit_lock; > > /* >@@ -1636,9 +1637,10 @@ int vfs_link(struct dentry *old_dentry, > unlock_kernel(); > > exit_lock: >- up(&dir->i_zombie); >+ double_up(&dir->i_zombie, &old_dentry->d_inode->i_zombie); > if (!error) > inode_dir_notify(dir, DN_CREATE); >+exit: > return error; > } > > >And you will have something equivalent to -hf32.6 (you remember, I only >reverted the fix in -hf32.6, and did not merge Marcelo's fix for the fix). >Then we will have to decide whether we can fix it again or revert it >completely. I would have liked to Cc: Vadim Egorov, but I cannot find his >email. > Hi Willy, this worked. Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-17 6:24 ` Grant Coady @ 2006-06-17 7:10 ` Willy Tarreau 2006-06-17 17:15 ` Need to format twice /dev/ramX with reiserfs to be able to mount it ? sebastien cabaniols 2006-06-17 19:55 ` Linux 2.4.33-rc1 Marcelo Tosatti 0 siblings, 2 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-17 7:10 UTC (permalink / raw) To: Grant Coady; +Cc: Michal Piotrowski, Marcelo Tosatti, linux-kernel On Sat, Jun 17, 2006 at 04:24:00PM +1000, Grant Coady wrote: > On Sat, 17 Jun 2006 07:13:57 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >Hi Grant, > > > >On Sat, Jun 17, 2006 at 09:24:02AM +1000, Grant Coady wrote: > >> On Sat, 17 Jun 2006 00:38:10 +0200, "Michal Piotrowski" <michal.k.k.piotrowski@gmail.com> wrote: > >> > >> >Can you revert 42cce987d63d3048595b8b00d74be786707d5e5d commit? > >> > >> Only if you point me at the patch, that string seems not to be > >> in the patch-2.4.33-rc1.gz file. > > > >First, revert this one : > > > > http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=efc95599c0261dd7ab3a1d9071024ca140b4c644;hp=6601095e2de35f00325a33c8be6b548f81fe76d5 > > > >--- a/fs/namei.c > >+++ b/fs/namei.c > >@@ -1479,19 +1479,20 @@ int vfs_unlink(struct inode *dir, struct > > { > > int error; > > > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > > error = may_delete(dir, dentry, 0); > >- if (!error) { > >- error = -EPERM; > >- if (dir->i_op && dir->i_op->unlink) { > >- DQUOT_INIT(dir); > >- if (d_mountpoint(dentry)) > >- error = -EBUSY; > >- else { > >- lock_kernel(); > >- error = dir->i_op->unlink(dir, dentry); > >- unlock_kernel(); > >- } > >+ if (error) > >+ return error; > >+ > >+ double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ error = -EPERM; > >+ if (dir->i_op && dir->i_op->unlink) { > >+ DQUOT_INIT(dir); > >+ if (d_mountpoint(dentry)) > >+ error = -EBUSY; > >+ else { > >+ lock_kernel(); > >+ error = dir->i_op->unlink(dir, dentry); > >+ unlock_kernel(); > > } > > } > > double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > > > > > >It will put you back to the state where all your machines hanged at boot > >with -hf32.5, then revert this one : > > > > http://kernel.org/git/?p=linux/kernel/git/marcelo/linux-2.4.git;a=commitdiff_plain;h=f41e0ce901260d3d1ae5bd8bae34266891b4a65d;hp=925c7ce0a2d9a676cd8e4a2baf411b23cf6762d6 > > > >--- a/fs/namei.c > >+++ b/fs/namei.c > >@@ -1479,7 +1479,7 @@ int vfs_unlink(struct inode *dir, struct > > { > > int error; > > > >- down(&dir->i_zombie); > >+ double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > > error = may_delete(dir, dentry, 0); > > if (!error) { > > error = -EPERM; > >@@ -1491,14 +1491,14 @@ int vfs_unlink(struct inode *dir, struct > > lock_kernel(); > > error = dir->i_op->unlink(dir, dentry); > > unlock_kernel(); > >- if (!error) > >- d_delete(dentry); > > } > > } > > } > >- up(&dir->i_zombie); > >- if (!error) > >+ double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ if (!error) { > >+ d_delete(dentry); > > inode_dir_notify(dir, DN_DELETE); > >+ } > > return error; > > } > > > >@@ -1607,18 +1607,19 @@ int vfs_link(struct dentry *old_dentry, > > struct inode *inode; > > int error; > > > >- down(&dir->i_zombie); > > error = -ENOENT; > > inode = old_dentry->d_inode; > > if (!inode) > >- goto exit_lock; > >- > >- error = may_create(dir, new_dentry); > >- if (error) > >- goto exit_lock; > >+ goto exit; > > > > error = -EXDEV; > > if (dir->i_dev != inode->i_dev) > >+ goto exit; > >+ > >+ double_down(&dir->i_zombie, &old_dentry->d_inode->i_zombie); > >+ > >+ error = may_create(dir, new_dentry); > >+ if (error) > > goto exit_lock; > > > > /* > >@@ -1636,9 +1637,10 @@ int vfs_link(struct dentry *old_dentry, > > unlock_kernel(); > > > > exit_lock: > >- up(&dir->i_zombie); > >+ double_up(&dir->i_zombie, &old_dentry->d_inode->i_zombie); > > if (!error) > > inode_dir_notify(dir, DN_CREATE); > >+exit: > > return error; > > } > > > > > >And you will have something equivalent to -hf32.6 (you remember, I only > >reverted the fix in -hf32.6, and did not merge Marcelo's fix for the fix). > >Then we will have to decide whether we can fix it again or revert it > >completely. I would have liked to Cc: Vadim Egorov, but I cannot find his > >email. > > > > Hi Willy, this worked. OK Fine. However, I don't understand. From you oops, it looks like dentry->d_inode suddenly gets NULL before calling this line : double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); I'm wondering if the unlink(dir, dentry) above can produce this. If this is the case, the Vadim's fix is terribly broken. I also think that in vfs_link(), we can encounter the same problem that Marcelo had to fix, because double_down() is performed without prior checking that old_dentry->d_inode is valid. Marcelo, do you have Vadim's email somewhere ? I think he should help us on this otherwise it would be better to revert his fix, as it has introduced lots of sensible changes in the error path. > Grant. Thanks for the tests, Grant. Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Need to format twice /dev/ramX with reiserfs to be able to mount it ? 2006-06-17 7:10 ` Willy Tarreau @ 2006-06-17 17:15 ` sebastien cabaniols 2006-06-17 18:14 ` Bernd Eckenfels 2006-06-17 19:55 ` Linux 2.4.33-rc1 Marcelo Tosatti 1 sibling, 1 reply; 36+ messages in thread From: sebastien cabaniols @ 2006-06-17 17:15 UTC (permalink / raw) To: linux-kernel Hello list, I am trying to format with reiserfs a ramdrive /dev/ramX (I want to nfs export it, so tmpfs is not an option and ext2/ext3 is not an option for other reasons) After formating is okay I can't mount it, but If I reformat it, I usually can mount it. I have also tried after first formating to dump it to a file and to loopback mount the file, this works: I am puzzled. Here is the sample session: opteron:/home/seb # mkreiserfs -q /dev/ram8 mkreiserfs 3.6.18 (2003 www.namesys.com) opteron:/home/seb # mount /dev/ram8 /tmp/mountpoint/ mount: wrong fs type, bad option, bad superblock on /dev/ram8, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so opteron:/home/seb # mkreiserfs -q /dev/ram8 mkreiserfs 3.6.18 (2003 www.namesys.com) opteron:/home/seb # mount /dev/ram8 /tmp/mountpoint/ opteron:/home/seb # df -h Filesystem Size Used Avail Use% Mounted on /dev/sda5 18G 4.1G 13G 24% / tmpfs 1004M 12K 1004M 1% /dev/shm /dev/sda2 79G 9.9G 65G 14% /home /dev/sda6 18G 2.5G 15G 15% /sys2 /dev/sda7 18G 2.5G 15G 15% /sys3 /dev/sda8 18G 129M 17G 1% /sys4 athlon:/data 111G 103G 1.6G 99% /data /dev/ram8 125M 33M 93M 26% /tmp/mountpoint opteron:/home/seb # The filesystem looks usable (I have copied data into it and used it... it looks functionnal) I tried on several configurations: 32bit kernel 2.6.13-15.10 from SUSE 10 on a centrino laptop. 64bit kernel 2.6.13-15.10 from SUSE 10 on an opteron box. 64bit kernel 2.6.16 from kernel.org on an EMT64 box. and when mount fails, dmesg tells me the following: ReiserFS: ram8: warning: sh-2011: read_super_block: can't find a reiserfs filesystem on (dev ram8, block 16, size 4096) ReiserFS: ram8: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on ram8 ReiserFS: ram8: found reiserfs format "3.6" with standard journal ReiserFS: ram8: using ordered data mode ReiserFS: ram8: journal params: device ram8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 ReiserFS: ram8: checking transaction log (ram8) ReiserFS: ram8: Using r5 hash to sort names ReiserFS: ram8: warning: Created .reiserfs_priv on ram8 - reserved for xattr storage. Thanks for any suggestion/idea. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Need to format twice /dev/ramX with reiserfs to be able to mount it ? 2006-06-17 17:15 ` Need to format twice /dev/ramX with reiserfs to be able to mount it ? sebastien cabaniols @ 2006-06-17 18:14 ` Bernd Eckenfels 2006-06-17 18:37 ` sebastien cabaniols 0 siblings, 1 reply; 36+ messages in thread From: Bernd Eckenfels @ 2006-06-17 18:14 UTC (permalink / raw) To: linux-kernel sebastien cabaniols <sebastien.cabaniols@laposte.net> wrote: > it, so tmpfs is not an option and ext2/ext3 is not an option for other > reasons) just out of couriostity, what might that reason be? Gruss Bernd ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Need to format twice /dev/ramX with reiserfs to be able to mount it ? 2006-06-17 18:14 ` Bernd Eckenfels @ 2006-06-17 18:37 ` sebastien cabaniols 0 siblings, 0 replies; 36+ messages in thread From: sebastien cabaniols @ 2006-06-17 18:37 UTC (permalink / raw) To: Bernd Eckenfels; +Cc: linux-kernel On Samedi 17 Juin 2006 20:14, Bernd Eckenfels wrote: > sebastien cabaniols <sebastien.cabaniols@laposte.net> wrote: > > it, so tmpfs is not an option and ext2/ext3 is not an option for other > > reasons) > > just out of couriostity, what might that reason be? It is a long story, In fact, when using ext2 instead, I am hitting a second kernel bug or a PXE bug (Alan Cox supposition) that I cannot narrow down to something simple. For a description of the problem see Rick Warner's email (1), he had the very same problem (and also probably failed to narrow it down to something easy to reproduce) in the past see: (1) http://lkml.org/lkml/2005/4/28/145 When removing ext2/ext3 from the way, the bug referenced in (1) disappears but I need to export the ramdrive data via NFS, hence it excludes tmpfs... But now reiserfs don't wan't to mount :-(((((( Since the second bug is very easy to reproduce, I prefered to report this one. I am still trying to reduce the original bug to something easy to reproduce but it is a lot of work. Regarding my initial bug report,: I just manage to reproduce it(the one I am reporting in the previous email) on an AMD64/Debian Sarge running 2.6.8. I will appreciate any help. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-17 7:10 ` Willy Tarreau 2006-06-17 17:15 ` Need to format twice /dev/ramX with reiserfs to be able to mount it ? sebastien cabaniols @ 2006-06-17 19:55 ` Marcelo Tosatti 1 sibling, 0 replies; 36+ messages in thread From: Marcelo Tosatti @ 2006-06-17 19:55 UTC (permalink / raw) To: Willy Tarreau; +Cc: Grant Coady, Michal Piotrowski, linux-kernel > > Hi Willy, this worked. > > OK Fine. However, I don't understand. From you oops, it looks like > dentry->d_inode suddenly gets NULL before calling this line : > > double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > > I'm wondering if the unlink(dir, dentry) above can produce this. If this > is the case, the Vadim's fix is terribly broken. I also think that in > vfs_link(), we can encounter the same problem that Marcelo had to fix, > because double_down() is performed without prior checking that > old_dentry->d_inode is valid. Right, it should just check for the NULL case. > Marcelo, do you have Vadim's email somewhere ? I think he should help us > on this otherwise it would be better to revert his fix, as it has introduced > lots of sensible changes in the error path. The patch fixes a serious SMP race. > > > Grant. > > Thanks for the tests, Grant. > Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-16 22:21 ` Grant Coady 2006-06-16 22:38 ` Michal Piotrowski @ 2006-06-18 13:37 ` Marcelo Tosatti 2006-06-18 22:25 ` Grant Coady 2006-06-18 22:33 ` Grant Coady 1 sibling, 2 replies; 36+ messages in thread From: Marcelo Tosatti @ 2006-06-18 13:37 UTC (permalink / raw) To: Grant Coady; +Cc: linux-kernel, Willy Tarreau, Al Viro On Sat, Jun 17, 2006 at 08:21:44AM +1000, Grant Coady wrote: > On Fri, 16 Jun 2006 15:14:19 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: > > >Here is 2.4.33-rc1. > > I provoked a network related oops as user 'grant', with this CLI input > boo-boo on an ssh terminal: > > grant@sempro:~$ rm /home/share/config-2.6.17-rc6-mm1a dmesg-2.6.17-rc6-mm1a > > /home/share is an NFS mounted directory > > Was able to login direct as root and copy the oops info with mouse (gpm) > copy/paste, but console locked up on localnet access, here's the guff: > > ksymoops 2.4.11 on i686 2.4.33-rc1. Options used > -v /home/grant/linux/linux-2.4.33-rc1/vmlinux (specified) > -k /proc/ksyms (default) > -l /proc/modules (default) > -o /lib/modules/2.4.33-rc1/ (default) > -m /boot/System.map-2.4.33-rc1 (specified) > > Unable to handle kernel NULL pointer dereference at virtual address 00000088 > *pde = 00000000 > Oops: 0002 > CPU: 0 > EIP: 0010:[<c013eeb4>] Not tainted > Using defaults from ksymoops -t elf32-i386 -a i386 > EFLAGS: 00010282 > eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 > esi: f6e2fd08 edi: f5839ac0 ebp: f6e2fc80 esp: f5889f6c > ds: 0018 es: 0018 ss: 0018 > Process rm (pid: 243, stackpage=f5889000) > Stack: f6e2fc80 f5839ac0 f5839ac0 f7bdd000 f5889f90 f5839ac0 c013f066 f6e2fc80 > f5839ac0 f6e36440 c19ac440 f7bdd00c 00000016 be8f2661 00000010 00000000 > 00000004 f5888000 bffff96b 08051050 bffff758 c0106eff bffff96b 00000002 > Call Trace: [<c013f066>] [<c0106eff>] > Code: ff 80 88 00 00 00 0f 8e 2e 16 00 00 85 db 74 16 89 d8 8b 5c > > > >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== > > >>esi; f6e2fd08 <_end+36a9405c/386be3d4> > >>edi; f5839ac0 <_end+3549de14/386be3d4> > >>ebp; f6e2fc80 <_end+36a93fd4/386be3d4> > >>esp; f5889f6c <_end+354ee2c0/386be3d4> > > Trace; c013f066 <sys_unlink+b6/120> > Trace; c0106eff <system_call+33/38> > > Code; c013eeb4 <vfs_unlink+a4/1a0> > 00000000 <_EIP>: > Code; c013eeb4 <vfs_unlink+a4/1a0> <===== > 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== > Code; c013eeba <vfs_unlink+aa/1a0> > 6: 0f 8e 2e 16 00 00 jle 163a <_EIP+0x163a> > Code; c013eec0 <vfs_unlink+b0/1a0> > c: 85 db test %ebx,%ebx > Code; c013eec2 <vfs_unlink+b2/1a0> > e: 74 16 je 26 <_EIP+0x26> > Code; c013eec4 <vfs_unlink+b4/1a0> > 10: 89 d8 mov %ebx,%eax > Code; c013eec6 <vfs_unlink+b6/1a0> > 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx Grant, Can you please try the attached patch. Grab a reference to the victim inode before calling vfs_unlink() to avoid it vanishing under us. diff --git a/fs/namei.c b/fs/namei.c index 42cce98..7993283 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * char * name; struct dentry *dentry; struct nameidata nd; + struct inode *inode = NULL; name = getname(pathname); if(IS_ERR(name)) @@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * /* Why not before? Because we want correct error value */ if (nd.last.name[nd.last.len]) goto slashes; + inode = dentry->d_inode; + if (inode) + atomic_inc(&inode->i_count); error = vfs_unlink(nd.dentry->d_inode, dentry); exit2: dput(dentry); } up(&nd.dentry->d_inode->i_sem); + if (inode) + iput(inode); exit1: path_release(&nd); exit: ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 13:37 ` Marcelo Tosatti @ 2006-06-18 22:25 ` Grant Coady 2006-06-18 22:37 ` Willy Tarreau 2006-06-18 22:33 ` Grant Coady 1 sibling, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-18 22:25 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: linux-kernel, Willy Tarreau, Al Viro On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: >Can you please try the attached patch. > >Grab a reference to the victim inode before calling vfs_unlink() to avoid >it vanishing under us. > >diff --git a/fs/namei.c b/fs/namei.c >index 42cce98..7993283 100644 >--- a/fs/namei.c >+++ b/fs/namei.c >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * > char * name; > struct dentry *dentry; > struct nameidata nd; >+ struct inode *inode = NULL; > > name = getname(pathname); > if(IS_ERR(name)) >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * > /* Why not before? Because we want correct error value */ > if (nd.last.name[nd.last.len]) > goto slashes; >+ inode = dentry->d_inode; >+ if (inode) >+ atomic_inc(&inode->i_count); > error = vfs_unlink(nd.dentry->d_inode, dentry); > exit2: > dput(dentry); > } > up(&nd.dentry->d_inode->i_sem); >+ if (inode) >+ iput(inode); > exit1: > path_release(&nd); > exit: /home/share is an NFS mounted directory, via ssh terminal: grant@sempro:~$ dmesg >/home/share/dmesg-2.4.33-rc1a grant@sempro:~$ rm /home/share/dmesg-2.4.33-rc1a Segmentation fault Network connection lost, copy / paste oops from screen to file, reboot, and... ksymoops 2.4.11 on i686 2.4.33-rc1a. Options used -v /home/grant/linux/linux-2.4.33-rc1a/vmlinux (specified) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.33-rc1a/ (default) -m /boot/System.map-2.4.33-rc1a (specified) Unable to handle kernel NULL pointer dereference at virtual address 00000088 *pde = 00000000 Oops: 0002 CPU: 0 EIP: 0010:[<c013eeb4>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010282 eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 esi: f6e2ed08 edi: f5954e40 ebp: f6e2ec80 esp: f587ff68 ds: 0018 es: 0018 ss: 0018 Process rm (pid: 241, stackpage=f587f000) Stack: f6e2ec80 f5954e40 f5954e40 f75a7000 f58ca0c0 f587ff90 c013f078 f6e2ec80 f5954e40 f5954e40 f6eb8440 c19ac440 f75a700c 00000011 c1bbcfcb 00000010 00000000 00000004 f587e000 bffff986 08051050 bffff768 c0106eff bffff986 Call Trace: [<c013f078>] [<c0106eff>] Code: ff 80 88 00 00 00 0f 8e 58 16 00 00 85 db 74 16 89 d8 8b 5c >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== >>esi; f6e2ed08 <_end+36a9305c/386be3d4> >>edi; f5954e40 <_end+355b9194/386be3d4> >>ebp; f6e2ec80 <_end+36a92fd4/386be3d4> >>esp; f587ff68 <_end+354e42bc/386be3d4> Trace; c013f078 <sys_unlink+c8/140> Trace; c0106eff <system_call+33/38> Code; c013eeb4 <vfs_unlink+a4/1a0> 00000000 <_EIP>: Code; c013eeb4 <vfs_unlink+a4/1a0> <===== 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== Code; c013eeba <vfs_unlink+aa/1a0> 6: 0f 8e 58 16 00 00 jle 1664 <_EIP+0x1664> Code; c013eec0 <vfs_unlink+b0/1a0> c: 85 db test %ebx,%ebx Code; c013eec2 <vfs_unlink+b2/1a0> e: 74 16 je 26 <_EIP+0x26> Code; c013eec4 <vfs_unlink+b4/1a0> 10: 89 d8 mov %ebx,%eax Code; c013eec6 <vfs_unlink+b6/1a0> 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx Sorry for bad news. As before, the 'rm file' succeeded, prior to the segfault. I put the dmesg (before oops) and 'grep = .config' up on <http://bugsplatter.mine.nu/test/linux-2.4/sempro/> with -rc1a suffix Repeat with extract 2.4.32 + patches --> same, note that the oops is only on deleting file over NFS, I noticed 2.6.16.20 has extra NFS stuff around this area. grant@sempro:~$ dmesg >dmesg grant@sempro:~$ rm dmesg grant@sempro:~$ dmesg >/home/share/dmesg-test grant@sempro:~$ rm /home/share/dmesg-test Segmentation fault Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 22:25 ` Grant Coady @ 2006-06-18 22:37 ` Willy Tarreau 2006-06-18 23:07 ` Grant Coady 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-18 22:37 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro Hi Grant, On Mon, Jun 19, 2006 at 08:25:06AM +1000, Grant Coady wrote: > On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: > > >Can you please try the attached patch. > > > >Grab a reference to the victim inode before calling vfs_unlink() to avoid > >it vanishing under us. > > > >diff --git a/fs/namei.c b/fs/namei.c > >index 42cce98..7993283 100644 > >--- a/fs/namei.c > >+++ b/fs/namei.c > >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * > > char * name; > > struct dentry *dentry; > > struct nameidata nd; > >+ struct inode *inode = NULL; > > > > name = getname(pathname); > > if(IS_ERR(name)) > >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * > > /* Why not before? Because we want correct error value */ > > if (nd.last.name[nd.last.len]) > > goto slashes; > >+ inode = dentry->d_inode; > >+ if (inode) > >+ atomic_inc(&inode->i_count); > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > exit2: > > dput(dentry); > > } Could you add this line here, because your oops still looks like the NULL is close to this area : + printk(KERN_DEBUG "nd.dentry->d_inode = %p\n", nd.dentry->d_inode); > > up(&nd.dentry->d_inode->i_sem); > >+ if (inode) > >+ iput(inode); > > exit1: > > path_release(&nd); > > exit: > > /home/share is an NFS mounted directory, via ssh terminal: > > grant@sempro:~$ dmesg >/home/share/dmesg-2.4.33-rc1a > grant@sempro:~$ rm /home/share/dmesg-2.4.33-rc1a > Segmentation fault > > Network connection lost, copy / paste oops from screen to file, reboot, > and... > > ksymoops 2.4.11 on i686 2.4.33-rc1a. Options used > -v /home/grant/linux/linux-2.4.33-rc1a/vmlinux (specified) > -k /proc/ksyms (default) > -l /proc/modules (default) > -o /lib/modules/2.4.33-rc1a/ (default) > -m /boot/System.map-2.4.33-rc1a (specified) > > Unable to handle kernel NULL pointer dereference at virtual address 00000088 > *pde = 00000000 > Oops: 0002 > CPU: 0 > EIP: 0010:[<c013eeb4>] Not tainted > Using defaults from ksymoops -t elf32-i386 -a i386 > EFLAGS: 00010282 > eax: 00000000 ebx: 00000000 ecx: 00000088 edx: 00000088 > esi: f6e2ed08 edi: f5954e40 ebp: f6e2ec80 esp: f587ff68 > ds: 0018 es: 0018 ss: 0018 > Process rm (pid: 241, stackpage=f587f000) > Stack: f6e2ec80 f5954e40 f5954e40 f75a7000 f58ca0c0 f587ff90 c013f078 f6e2ec80 > f5954e40 f5954e40 f6eb8440 c19ac440 f75a700c 00000011 c1bbcfcb 00000010 > 00000000 00000004 f587e000 bffff986 08051050 bffff768 c0106eff bffff986 > Call Trace: [<c013f078>] [<c0106eff>] > Code: ff 80 88 00 00 00 0f 8e 58 16 00 00 85 db 74 16 89 d8 8b 5c > > > >>EIP; c013eeb4 <vfs_unlink+a4/1a0> <===== > > >>esi; f6e2ed08 <_end+36a9305c/386be3d4> > >>edi; f5954e40 <_end+355b9194/386be3d4> > >>ebp; f6e2ec80 <_end+36a92fd4/386be3d4> > >>esp; f587ff68 <_end+354e42bc/386be3d4> > > Trace; c013f078 <sys_unlink+c8/140> > Trace; c0106eff <system_call+33/38> > > Code; c013eeb4 <vfs_unlink+a4/1a0> > 00000000 <_EIP>: > Code; c013eeb4 <vfs_unlink+a4/1a0> <===== > 0: ff 80 88 00 00 00 incl 0x88(%eax) <===== > Code; c013eeba <vfs_unlink+aa/1a0> > 6: 0f 8e 58 16 00 00 jle 1664 <_EIP+0x1664> > Code; c013eec0 <vfs_unlink+b0/1a0> > c: 85 db test %ebx,%ebx > Code; c013eec2 <vfs_unlink+b2/1a0> > e: 74 16 je 26 <_EIP+0x26> > Code; c013eec4 <vfs_unlink+b4/1a0> > 10: 89 d8 mov %ebx,%eax > Code; c013eec6 <vfs_unlink+b6/1a0> > 12: 8b 5c 00 00 mov 0x0(%eax,%eax,1),%ebx > > > Sorry for bad news. As before, the 'rm file' succeeded, prior to the > segfault. I put the dmesg (before oops) and 'grep = .config' up on > <http://bugsplatter.mine.nu/test/linux-2.4/sempro/> with -rc1a suffix > > Repeat with extract 2.4.32 + patches --> same, note that the oops is > only on deleting file over NFS, I noticed 2.6.16.20 has extra NFS > stuff around this area. Thanks for the info and the tests, maybe Al will have some insight here ? > grant@sempro:~$ dmesg >dmesg > grant@sempro:~$ rm dmesg > grant@sempro:~$ dmesg >/home/share/dmesg-test > grant@sempro:~$ rm /home/share/dmesg-test > Segmentation fault > > Grant. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 22:37 ` Willy Tarreau @ 2006-06-18 23:07 ` Grant Coady 2006-06-19 4:01 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-18 23:07 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, 19 Jun 2006 00:37:36 +0200, Willy Tarreau <w@1wt.eu> wrote: >Hi Grant, > >On Mon, Jun 19, 2006 at 08:25:06AM +1000, Grant Coady wrote: >> On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: >> >> >Can you please try the attached patch. >> > >> >Grab a reference to the victim inode before calling vfs_unlink() to avoid >> >it vanishing under us. >> > >> >diff --git a/fs/namei.c b/fs/namei.c >> >index 42cce98..7993283 100644 >> >--- a/fs/namei.c >> >+++ b/fs/namei.c >> >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * >> > char * name; >> > struct dentry *dentry; >> > struct nameidata nd; >> >+ struct inode *inode = NULL; >> > >> > name = getname(pathname); >> > if(IS_ERR(name)) >> >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * >> > /* Why not before? Because we want correct error value */ >> > if (nd.last.name[nd.last.len]) >> > goto slashes; >> >+ inode = dentry->d_inode; >> >+ if (inode) >> >+ atomic_inc(&inode->i_count); >> > error = vfs_unlink(nd.dentry->d_inode, dentry); >> > exit2: >> > dput(dentry); >> > } > >Could you add this line here, because your oops still looks like the NULL >is close to this area : > >+ printk(KERN_DEBUG "nd.dentry->d_inode = %p\n", nd.dentry->d_inode); It didn't get there for the segfault case, gets there for local file delete After: grant@sempro:~$ dmesg >dmesg grant@sempro:~$ rm dmesg Jun 19 08:49:17 sempro kernel: nd.dentry->d_inode = f73f4b80 After: grant@sempro:~$ dmesg >/home/share/dmesg-test grant@sempro:~$ rm /home/share/dmesg-test Segmentation fault Nothing reported by debug or syslog, oops in messages. Cheers, Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 23:07 ` Grant Coady @ 2006-06-19 4:01 ` Willy Tarreau 2006-06-19 5:03 ` Grant Coady 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 4:01 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 09:07:03AM +1000, Grant Coady wrote: > On Mon, 19 Jun 2006 00:37:36 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >Hi Grant, > > > >On Mon, Jun 19, 2006 at 08:25:06AM +1000, Grant Coady wrote: > >> On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: > >> > >> >Can you please try the attached patch. > >> > > >> >Grab a reference to the victim inode before calling vfs_unlink() to avoid > >> >it vanishing under us. > >> > > >> >diff --git a/fs/namei.c b/fs/namei.c > >> >index 42cce98..7993283 100644 > >> >--- a/fs/namei.c > >> >+++ b/fs/namei.c > >> >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * > >> > char * name; > >> > struct dentry *dentry; > >> > struct nameidata nd; > >> >+ struct inode *inode = NULL; > >> > > >> > name = getname(pathname); > >> > if(IS_ERR(name)) > >> >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * > >> > /* Why not before? Because we want correct error value */ > >> > if (nd.last.name[nd.last.len]) > >> > goto slashes; > >> >+ inode = dentry->d_inode; > >> >+ if (inode) > >> >+ atomic_inc(&inode->i_count); > >> > error = vfs_unlink(nd.dentry->d_inode, dentry); > >> > exit2: > >> > dput(dentry); > >> > } > > > >Could you add this line here, because your oops still looks like the NULL > >is close to this area : > > > >+ printk(KERN_DEBUG "nd.dentry->d_inode = %p\n", nd.dentry->d_inode); > > It didn't get there for the segfault case, gets there for local file > delete > > After: > grant@sempro:~$ dmesg >dmesg > grant@sempro:~$ rm dmesg > > Jun 19 08:49:17 sempro kernel: nd.dentry->d_inode = f73f4b80 > > After: > grant@sempro:~$ dmesg >/home/share/dmesg-test > grant@sempro:~$ rm /home/share/dmesg-test > Segmentation fault > > Nothing reported by debug or syslog, oops in messages. Thanks. Then, could you send us your 'namei.o' file please ? Mine does not produce the same content as yours and it makes it difficult to find the exact position in the code. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 4:01 ` Willy Tarreau @ 2006-06-19 5:03 ` Grant Coady 2006-06-19 8:06 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-19 5:03 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro [-- Attachment #1: Type: text/plain, Size: 2190 bytes --] On Mon, 19 Jun 2006 06:01:52 +0200, Willy Tarreau <w@1wt.eu> wrote: >On Mon, Jun 19, 2006 at 09:07:03AM +1000, Grant Coady wrote: >> On Mon, 19 Jun 2006 00:37:36 +0200, Willy Tarreau <w@1wt.eu> wrote: >> >> >Hi Grant, >> > >> >On Mon, Jun 19, 2006 at 08:25:06AM +1000, Grant Coady wrote: >> >> On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: >> >> >> >> >Can you please try the attached patch. >> >> > >> >> >Grab a reference to the victim inode before calling vfs_unlink() to avoid >> >> >it vanishing under us. >> >> > >> >> >diff --git a/fs/namei.c b/fs/namei.c >> >> >index 42cce98..7993283 100644 >> >> >--- a/fs/namei.c >> >> >+++ b/fs/namei.c >> >> >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * >> >> > char * name; >> >> > struct dentry *dentry; >> >> > struct nameidata nd; >> >> >+ struct inode *inode = NULL; >> >> > >> >> > name = getname(pathname); >> >> > if(IS_ERR(name)) >> >> >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * >> >> > /* Why not before? Because we want correct error value */ >> >> > if (nd.last.name[nd.last.len]) >> >> > goto slashes; >> >> >+ inode = dentry->d_inode; >> >> >+ if (inode) >> >> >+ atomic_inc(&inode->i_count); >> >> > error = vfs_unlink(nd.dentry->d_inode, dentry); >> >> > exit2: >> >> > dput(dentry); >> >> > } >> > >> >Could you add this line here, because your oops still looks like the NULL >> >is close to this area : >> > >> >+ printk(KERN_DEBUG "nd.dentry->d_inode = %p\n", nd.dentry->d_inode); >> >> It didn't get there for the segfault case, gets there for local file >> delete >> >> After: >> grant@sempro:~$ dmesg >dmesg >> grant@sempro:~$ rm dmesg >> >> Jun 19 08:49:17 sempro kernel: nd.dentry->d_inode = f73f4b80 >> >> After: >> grant@sempro:~$ dmesg >/home/share/dmesg-test >> grant@sempro:~$ rm /home/share/dmesg-test >> Segmentation fault >> >> Nothing reported by debug or syslog, oops in messages. > >Thanks. Then, could you send us your 'namei.o' file please ? Mine does >not produce the same content as yours and it makes it difficult to find >the exact position in the code. gzipped, attached. Cheers, Grant. [-- Attachment #2: fs-namei.o.gz --] [-- Type: application/octet-stream, Size: 9529 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 5:03 ` Grant Coady @ 2006-06-19 8:06 ` Willy Tarreau 2006-06-19 8:53 ` Grant Coady ` (2 more replies) 0 siblings, 3 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 8:06 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro Hi Grant, OK, it does *really* crash in vfs_unlink(), during the double_up on dentry->inode-i_zombie (dentry->inode = NULL). I suggest the following fix, I hope that it is correct and is not subject to any race condition : --- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 +++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 @@ -1478,12 +1478,14 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry) { int error; + struct inode *inode; error = may_delete(dir, dentry, 0); if (error) return error; - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); + inode = dentry->d_inode; + double_down(&dir->i_zombie, &inode->i_zombie); error = -EPERM; if (dir->i_op && dir->i_op->unlink) { DQUOT_INIT(dir); @@ -1495,7 +1497,7 @@ unlock_kernel(); } } - double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); + double_up(&dir->i_zombie, &inode->i_zombie); if (!error) { d_delete(dentry); inode_dir_notify(dir, DN_DELETE); I think it will *not* oops anymore with this fix, but I'd like someone to review it to ensure that it is valid. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 8:06 ` Willy Tarreau @ 2006-06-19 8:53 ` Grant Coady 2006-06-19 9:08 ` Willy Tarreau 2006-06-19 9:12 ` Grant Coady 2006-06-19 22:04 ` Marcelo Tosatti 2 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-19 8:53 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: >Hi Grant, > >OK, it does *really* crash in vfs_unlink(), during the double_up on >dentry->inode-i_zombie (dentry->inode = NULL). > >I suggest the following fix, I hope that it is correct and is not subject >to any race condition : > >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 >@@ -1478,12 +1478,14 @@ > int vfs_unlink(struct inode *dir, struct dentry *dentry) > { > int error; >+ struct inode *inode; > > error = may_delete(dir, dentry, 0); > if (error) > return error; > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ inode = dentry->d_inode; >+ double_down(&dir->i_zombie, &inode->i_zombie); > error = -EPERM; > if (dir->i_op && dir->i_op->unlink) { > DQUOT_INIT(dir); >@@ -1495,7 +1497,7 @@ > unlock_kernel(); > } > } >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ double_up(&dir->i_zombie, &inode->i_zombie); > if (!error) { > d_delete(dentry); > inode_dir_notify(dir, DN_DELETE); > >I think it will *not* oops anymore with this fix, but I'd like someone to >review it to ensure that it is valid. Hi Willy, Still corrupts a vim edit backup filename as previously reported, instead of /etc/lilo.conf~ I get /etc/lilo.co~ :( Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 8:53 ` Grant Coady @ 2006-06-19 9:08 ` Willy Tarreau 0 siblings, 0 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 9:08 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 06:53:31PM +1000, Grant Coady wrote: > On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >Hi Grant, > > > >OK, it does *really* crash in vfs_unlink(), during the double_up on > >dentry->inode-i_zombie (dentry->inode = NULL). > > > >I suggest the following fix, I hope that it is correct and is not subject > >to any race condition : > > > >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 > >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 > >@@ -1478,12 +1478,14 @@ > > int vfs_unlink(struct inode *dir, struct dentry *dentry) > > { > > int error; > >+ struct inode *inode; > > > > error = may_delete(dir, dentry, 0); > > if (error) > > return error; > > > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ inode = dentry->d_inode; > >+ double_down(&dir->i_zombie, &inode->i_zombie); > > error = -EPERM; > > if (dir->i_op && dir->i_op->unlink) { > > DQUOT_INIT(dir); > >@@ -1495,7 +1497,7 @@ > > unlock_kernel(); > > } > > } > >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ double_up(&dir->i_zombie, &inode->i_zombie); > > if (!error) { > > d_delete(dentry); > > inode_dir_notify(dir, DN_DELETE); > > > >I think it will *not* oops anymore with this fix, but I'd like someone to > >review it to ensure that it is valid. > > Hi Willy, > > Still corrupts a vim edit backup filename as previously reported, > instead of /etc/lilo.conf~ I get /etc/lilo.co~ :( Ok, thanks. At least we're making progress. Could you try on another file, with fewer chars after the dot ? I suspect that one particular error is reported to vim and that it retries with a shorter name, for instance to be compatible with 8.3. If this is the case, the problem might be in vfs_link() (but I don't see why). > Grant. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 8:06 ` Willy Tarreau 2006-06-19 8:53 ` Grant Coady @ 2006-06-19 9:12 ` Grant Coady 2006-06-19 9:24 ` Willy Tarreau 2006-06-19 22:04 ` Marcelo Tosatti 2 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-19 9:12 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: >Hi Grant, > >OK, it does *really* crash in vfs_unlink(), during the double_up on >dentry->inode-i_zombie (dentry->inode = NULL). > >I suggest the following fix, I hope that it is correct and is not subject >to any race condition : > >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 >@@ -1478,12 +1478,14 @@ > int vfs_unlink(struct inode *dir, struct dentry *dentry) > { > int error; >+ struct inode *inode; > > error = may_delete(dir, dentry, 0); > if (error) > return error; > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ inode = dentry->d_inode; >+ double_down(&dir->i_zombie, &inode->i_zombie); > error = -EPERM; > if (dir->i_op && dir->i_op->unlink) { > DQUOT_INIT(dir); >@@ -1495,7 +1497,7 @@ > unlock_kernel(); > } > } >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); >+ double_up(&dir->i_zombie, &inode->i_zombie); > if (!error) { > d_delete(dentry); > inode_dir_notify(dir, DN_DELETE); > >I think it will *not* oops anymore with this fix, but I'd like someone to >review it to ensure that it is valid. Strangely, the /etc/lilo.conf~ is as expected on a different box, 500MHz Celeron (Coppermine) + PATA HDD okay, whereas the Sempron SktA 2600+ with SATA HDD has something munch a couple chars off a filename during whatever vim does to make its backup file. Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 9:12 ` Grant Coady @ 2006-06-19 9:24 ` Willy Tarreau 2006-06-19 10:27 ` Grant Coady 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 9:24 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 07:12:22PM +1000, Grant Coady wrote: > On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >Hi Grant, > > > >OK, it does *really* crash in vfs_unlink(), during the double_up on > >dentry->inode-i_zombie (dentry->inode = NULL). > > > >I suggest the following fix, I hope that it is correct and is not subject > >to any race condition : > > > >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 > >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 > >@@ -1478,12 +1478,14 @@ > > int vfs_unlink(struct inode *dir, struct dentry *dentry) > > { > > int error; > >+ struct inode *inode; > > > > error = may_delete(dir, dentry, 0); > > if (error) > > return error; > > > >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ inode = dentry->d_inode; > >+ double_down(&dir->i_zombie, &inode->i_zombie); > > error = -EPERM; > > if (dir->i_op && dir->i_op->unlink) { > > DQUOT_INIT(dir); > >@@ -1495,7 +1497,7 @@ > > unlock_kernel(); > > } > > } > >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > >+ double_up(&dir->i_zombie, &inode->i_zombie); > > if (!error) { > > d_delete(dentry); > > inode_dir_notify(dir, DN_DELETE); > > > >I think it will *not* oops anymore with this fix, but I'd like someone to > >review it to ensure that it is valid. > > Strangely, the /etc/lilo.conf~ is as expected on a different box, > 500MHz Celeron (Coppermine) + PATA HDD okay, whereas the Sempron > SktA 2600+ with SATA HDD has something munch a couple chars off > a filename during whatever vim does to make its backup file. I would not suspect the hardware. Instead, you should strace vim when it write the file : # strace -s 1000 -o /tmp/vim.trace vim /etc/lilo.conf Grep for "lilo.co" in it, I'm fairly sure that you will find "lilo.co~". > Grant. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 9:24 ` Willy Tarreau @ 2006-06-19 10:27 ` Grant Coady 2006-06-19 10:31 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-19 10:27 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, 19 Jun 2006 11:24:26 +0200, Willy Tarreau <w@1wt.eu> wrote: >On Mon, Jun 19, 2006 at 07:12:22PM +1000, Grant Coady wrote: >> On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: >> >> >Hi Grant, >> > >> >OK, it does *really* crash in vfs_unlink(), during the double_up on >> >dentry->inode-i_zombie (dentry->inode = NULL). >> > >> >I suggest the following fix, I hope that it is correct and is not subject >> >to any race condition : >> > >> >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 >> >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 >> >@@ -1478,12 +1478,14 @@ >> > int vfs_unlink(struct inode *dir, struct dentry *dentry) >> > { >> > int error; >> >+ struct inode *inode; >> > >> > error = may_delete(dir, dentry, 0); >> > if (error) >> > return error; >> > >> >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); >> >+ inode = dentry->d_inode; >> >+ double_down(&dir->i_zombie, &inode->i_zombie); >> > error = -EPERM; >> > if (dir->i_op && dir->i_op->unlink) { >> > DQUOT_INIT(dir); >> >@@ -1495,7 +1497,7 @@ >> > unlock_kernel(); >> > } >> > } >> >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); >> >+ double_up(&dir->i_zombie, &inode->i_zombie); >> > if (!error) { >> > d_delete(dentry); >> > inode_dir_notify(dir, DN_DELETE); >> > >> >I think it will *not* oops anymore with this fix, but I'd like someone to >> >review it to ensure that it is valid. >> >> Strangely, the /etc/lilo.conf~ is as expected on a different box, >> 500MHz Celeron (Coppermine) + PATA HDD okay, whereas the Sempron >> SktA 2600+ with SATA HDD has something munch a couple chars off >> a filename during whatever vim does to make its backup file. > >I would not suspect the hardware. Instead, you should strace vim when it >write the file : > > # strace -s 1000 -o /tmp/vim.trace vim /etc/lilo.conf > >Grep for "lilo.co" in it, I'm fairly sure that you will find "lilo.co~". stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 access("/etc/lilo.conf", W_OK) = 0 open("/etc/lilo.conf", O_RDONLY) = 3 ## munch a char: stat64("/etc/lilo_con.swp", 0xbfffee8c) = -1 ENOENT (No such file or directory) lstat64("/etc/lilo_con.swp", 0xbfffef0c) = -1 ENOENT (No such file or directory) lstat64("/etc/lilo_con.swp", 0xbffff38c) = -1 ENOENT (No such file or directory) open("/etc/lilo_con.swp", O_RDWR|O_CREAT|O_EXCL, 0600) = 4 ##munch another: write(1, "\"/etc/lilo.conf\"", 16) = 16 stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 access("/etc/lilo.conf", W_OK) = 0 lstat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 lstat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 unlink("/etc/lilo.co~") = 0 rename("/etc/lilo.conf", "/etc/lilo.co~") = 0 <http://bugsplatter.mine.nu/test/boxen/sempro/2.4.xx/vim.trace.gz> If you want the whole trace (168k -> 26k gzipped). Cheers, Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 10:27 ` Grant Coady @ 2006-06-19 10:31 ` Willy Tarreau 2006-06-19 20:11 ` Grant Coady 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 10:31 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 08:27:29PM +1000, Grant Coady wrote: > On Mon, 19 Jun 2006 11:24:26 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >On Mon, Jun 19, 2006 at 07:12:22PM +1000, Grant Coady wrote: > >> On Mon, 19 Jun 2006 10:06:51 +0200, Willy Tarreau <w@1wt.eu> wrote: > >> > >> >Hi Grant, > >> > > >> >OK, it does *really* crash in vfs_unlink(), during the double_up on > >> >dentry->inode-i_zombie (dentry->inode = NULL). > >> > > >> >I suggest the following fix, I hope that it is correct and is not subject > >> >to any race condition : > >> > > >> >--- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 > >> >+++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 > >> >@@ -1478,12 +1478,14 @@ > >> > int vfs_unlink(struct inode *dir, struct dentry *dentry) > >> > { > >> > int error; > >> >+ struct inode *inode; > >> > > >> > error = may_delete(dir, dentry, 0); > >> > if (error) > >> > return error; > >> > > >> >- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > >> >+ inode = dentry->d_inode; > >> >+ double_down(&dir->i_zombie, &inode->i_zombie); > >> > error = -EPERM; > >> > if (dir->i_op && dir->i_op->unlink) { > >> > DQUOT_INIT(dir); > >> >@@ -1495,7 +1497,7 @@ > >> > unlock_kernel(); > >> > } > >> > } > >> >- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > >> >+ double_up(&dir->i_zombie, &inode->i_zombie); > >> > if (!error) { > >> > d_delete(dentry); > >> > inode_dir_notify(dir, DN_DELETE); > >> > > >> >I think it will *not* oops anymore with this fix, but I'd like someone to > >> >review it to ensure that it is valid. > >> > >> Strangely, the /etc/lilo.conf~ is as expected on a different box, > >> 500MHz Celeron (Coppermine) + PATA HDD okay, whereas the Sempron > >> SktA 2600+ with SATA HDD has something munch a couple chars off > >> a filename during whatever vim does to make its backup file. > > > >I would not suspect the hardware. Instead, you should strace vim when it > >write the file : > > > > # strace -s 1000 -o /tmp/vim.trace vim /etc/lilo.conf > > > >Grep for "lilo.co" in it, I'm fairly sure that you will find "lilo.co~". > > stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > access("/etc/lilo.conf", W_OK) = 0 > open("/etc/lilo.conf", O_RDONLY) = 3 > > ## munch a char: > stat64("/etc/lilo_con.swp", 0xbfffee8c) = -1 ENOENT (No such file or directory) > lstat64("/etc/lilo_con.swp", 0xbfffef0c) = -1 ENOENT (No such file or directory) > lstat64("/etc/lilo_con.swp", 0xbffff38c) = -1 ENOENT (No such file or directory) > open("/etc/lilo_con.swp", O_RDWR|O_CREAT|O_EXCL, 0600) = 4 > > ##munch another: > write(1, "\"/etc/lilo.conf\"", 16) = 16 > stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > access("/etc/lilo.conf", W_OK) = 0 > lstat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > lstat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > stat64("/etc/lilo.conf", {st_mode=S_IFREG|0644, st_size=778, ...}) = 0 > unlink("/etc/lilo.co~") = 0 > rename("/etc/lilo.conf", "/etc/lilo.co~") = 0 > > <http://bugsplatter.mine.nu/test/boxen/sempro/2.4.xx/vim.trace.gz> > > If you want the whole trace (168k -> 26k gzipped). Not needed, it really seems that your vim does name the file like this on purpose. I see nothing abnormal right here. So possibly the kernel bug is fixed (but I'd like to get some reviewer comments). Thanks, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 10:31 ` Willy Tarreau @ 2006-06-19 20:11 ` Grant Coady 2006-06-19 20:20 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-19 20:11 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, 19 Jun 2006 12:31:14 +0200, Willy Tarreau <w@1wt.eu> wrote: >Not needed, it really seems that your vim does name the file like this on >purpose. I see nothing abnormal right here. So testing -rc1 has munched the filesystem, I'll need to reinstall :( Thanks, Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 20:11 ` Grant Coady @ 2006-06-19 20:20 ` Willy Tarreau 0 siblings, 0 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 20:20 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Tue, Jun 20, 2006 at 06:11:25AM +1000, Grant Coady wrote: > On Mon, 19 Jun 2006 12:31:14 +0200, Willy Tarreau <w@1wt.eu> wrote: > > >Not needed, it really seems that your vim does name the file like this on > >purpose. I see nothing abnormal right here. > > So testing -rc1 has munched the filesystem, I'll need to reinstall :( Boot from a CD and force a full e2fsck. It is very smart and will do a very good job. At most, you'll get new friends in /lost+found. But do not reinstall for this ! > Thanks, > Grant. Regards, willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 8:06 ` Willy Tarreau 2006-06-19 8:53 ` Grant Coady 2006-06-19 9:12 ` Grant Coady @ 2006-06-19 22:04 ` Marcelo Tosatti 2006-06-19 23:00 ` Willy Tarreau 2 siblings, 1 reply; 36+ messages in thread From: Marcelo Tosatti @ 2006-06-19 22:04 UTC (permalink / raw) To: Willy Tarreau; +Cc: Grant Coady, linux-kernel, Al Viro Willy, On Mon, Jun 19, 2006 at 10:06:51AM +0200, Willy Tarreau wrote: > Hi Grant, > > OK, it does *really* crash in vfs_unlink(), during the double_up on > dentry->inode-i_zombie (dentry->inode = NULL). > > I suggest the following fix, I hope that it is correct and is not subject > to any race condition : > > --- ./fs/namei.c.orig 2006-06-19 09:39:52.000000000 +0200 > +++ ./fs/namei.c 2006-06-19 09:51:09.000000000 +0200 > @@ -1478,12 +1478,14 @@ > int vfs_unlink(struct inode *dir, struct dentry *dentry) > { > int error; > + struct inode *inode; > > error = may_delete(dir, dentry, 0); > if (error) > return error; > > - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > + inode = dentry->d_inode; > + double_down(&dir->i_zombie, &inode->i_zombie); > error = -EPERM; > if (dir->i_op && dir->i_op->unlink) { > DQUOT_INIT(dir); > @@ -1495,7 +1497,7 @@ > unlock_kernel(); > } > } > - double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > + double_up(&dir->i_zombie, &inode->i_zombie); > if (!error) { > d_delete(dentry); > inode_dir_notify(dir, DN_DELETE); > > I think it will *not* oops anymore with this fix, but I'd like someone to > review it to ensure that it is valid. Think this is the right thing to do, except that it must be guaranteed that the inode struct won't be freed in the meantime, need to grab a reference to it. Thanks! -- v2.4.33-pre introduced a fix for lack of synchronization between link/unlink which requires vfs_unlink to grab i_zombie of both the victim and its parent with double_down(). Problem is that NFS client deletes the victim dentry on ->unlink, resulting in a NULL dereference when vfs_unlink() tries to up dentry->d_inode->i_zombie. Keep a copy of the inode pointer, incrementing its reference counter, to fix the situation. diff --git a/fs/namei.c b/fs/namei.c index 42cce98..69da199 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1478,12 +1478,14 @@ exit: int vfs_unlink(struct inode *dir, struct dentry *dentry) { int error; + struct inode *inode; error = may_delete(dir, dentry, 0); if (error) return error; - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); + inode = dentry->d_inode; + double_down(&dir->i_zombie, &inode->i_zombie); error = -EPERM; if (dir->i_op && dir->i_op->unlink) { DQUOT_INIT(dir); @@ -1495,7 +1497,7 @@ int vfs_unlink(struct inode *dir, struct unlock_kernel(); } } - double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); + double_up(&dir->i_zombie, &inode->i_zombie); if (!error) { d_delete(dentry); inode_dir_notify(dir, DN_DELETE); @@ -1509,6 +1511,7 @@ asmlinkage long sys_unlink(const char * char * name; struct dentry *dentry; struct nameidata nd; + struct inode *inode = NULL; name = getname(pathname); if(IS_ERR(name)) @@ -1527,11 +1530,16 @@ asmlinkage long sys_unlink(const char * /* Why not before? Because we want correct error value */ if (nd.last.name[nd.last.len]) goto slashes; + inode = dentry->d_inode; + if (inode) + atomic_inc(&inode->i_count); error = vfs_unlink(nd.dentry->d_inode, dentry); exit2: dput(dentry); } up(&nd.dentry->d_inode->i_sem); + if (inode) + iput(inode); exit1: path_release(&nd); exit: ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 22:04 ` Marcelo Tosatti @ 2006-06-19 23:00 ` Willy Tarreau 2006-06-19 23:45 ` Marcelo Tosatti 0 siblings, 1 reply; 36+ messages in thread From: Willy Tarreau @ 2006-06-19 23:00 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Grant Coady, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 07:04:05PM -0300, Marcelo Tosatti wrote: > Think this is the right thing to do, except that it must be guaranteed > that the inode struct won't be freed in the meantime, need to grab a > reference to it. OK, I believe it will be right this time. I took inspiration from your precedent patch to sys_unlink(). diff --git a/fs/namei.c b/fs/namei.c index 42cce98..374b767 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1478,12 +1478,16 @@ exit: int vfs_unlink(struct inode *dir, struct dentry *dentry) { int error; + struct inode *inode; error = may_delete(dir, dentry, 0); if (error) return error; - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); + inode = dentry->d_inode; + atomic_inc(&inode->i_count); + double_down(&dir->i_zombie, &inode->i_zombie); + error = -EPERM; if (dir->i_op && dir->i_op->unlink) { DQUOT_INIT(dir); @@ -1495,7 +1499,9 @@ int vfs_unlink(struct inode *dir, struct unlock_kernel(); } } - double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); + double_up(&dir->i_zombie, &inode->i_zombie); + iput(inode); + if (!error) { d_delete(dentry); inode_dir_notify(dir, DN_DELETE); BTW, I might be wrong because my knowledge in this area is rather poor, but I now believe that your previously proposed fix below indeed is not needed at all : > diff --git a/fs/namei.c b/fs/namei.c > index 42cce98..69da199 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -1509,6 +1511,7 @@ asmlinkage long sys_unlink(const char * > char * name; > struct dentry *dentry; > struct nameidata nd; > + struct inode *inode = NULL; > > name = getname(pathname); > if(IS_ERR(name)) > @@ -1527,11 +1530,16 @@ asmlinkage long sys_unlink(const char * > /* Why not before? Because we want correct error value */ > if (nd.last.name[nd.last.len]) > goto slashes; ---- from here ---- > + inode = dentry->d_inode; > + if (inode) > + atomic_inc(&inode->i_count); > error = vfs_unlink(nd.dentry->d_inode, dentry); > exit2: > dput(dentry); > } > up(&nd.dentry->d_inode->i_sem); > + if (inode) > + iput(inode); ---- to here ---- I believe that nd.dentry->d_inode cannot vanish because it is protected by the down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing something important ? > exit1: > path_release(&nd); > exit: Thanks, Willy ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 23:00 ` Willy Tarreau @ 2006-06-19 23:45 ` Marcelo Tosatti 2006-06-20 22:23 ` Willy Tarreau 0 siblings, 1 reply; 36+ messages in thread From: Marcelo Tosatti @ 2006-06-19 23:45 UTC (permalink / raw) To: Willy Tarreau; +Cc: Grant Coady, linux-kernel, Al Viro On Tue, Jun 20, 2006 at 01:00:07AM +0200, Willy Tarreau wrote: > On Mon, Jun 19, 2006 at 07:04:05PM -0300, Marcelo Tosatti wrote: > > > Think this is the right thing to do, except that it must be guaranteed > > that the inode struct won't be freed in the meantime, need to grab a > > reference to it. > > OK, I believe it will be right this time. I took inspiration from your > precedent patch to sys_unlink(). > > diff --git a/fs/namei.c b/fs/namei.c > index 42cce98..374b767 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -1478,12 +1478,16 @@ exit: > int vfs_unlink(struct inode *dir, struct dentry *dentry) > { > int error; > + struct inode *inode; > > error = may_delete(dir, dentry, 0); > if (error) > return error; > > - double_down(&dir->i_zombie, &dentry->d_inode->i_zombie); > + inode = dentry->d_inode; > + atomic_inc(&inode->i_count); > + double_down(&dir->i_zombie, &inode->i_zombie); > + > error = -EPERM; > if (dir->i_op && dir->i_op->unlink) { > DQUOT_INIT(dir); > @@ -1495,7 +1499,9 @@ int vfs_unlink(struct inode *dir, struct > unlock_kernel(); > } > } > - double_up(&dir->i_zombie, &dentry->d_inode->i_zombie); > + double_up(&dir->i_zombie, &inode->i_zombie); > + iput(inode); > + > if (!error) { > d_delete(dentry); > inode_dir_notify(dir, DN_DELETE); Yeah thats better. > BTW, I might be wrong because my knowledge in this area is rather poor, but > I now believe that your previously proposed fix below indeed is not needed > at all : > > > diff --git a/fs/namei.c b/fs/namei.c > > index 42cce98..69da199 100644 > > --- a/fs/namei.c > > +++ b/fs/namei.c > > @@ -1509,6 +1511,7 @@ asmlinkage long sys_unlink(const char * > > char * name; > > struct dentry *dentry; > > struct nameidata nd; > > + struct inode *inode = NULL; > > > > name = getname(pathname); > > if(IS_ERR(name)) > > @@ -1527,11 +1530,16 @@ asmlinkage long sys_unlink(const char * > > /* Why not before? Because we want correct error value */ > > if (nd.last.name[nd.last.len]) > > goto slashes; > > ---- from here ---- > > > > + inode = dentry->d_inode; > > + if (inode) > > + atomic_inc(&inode->i_count); > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > exit2: > > dput(dentry); > > } > > up(&nd.dentry->d_inode->i_sem); > > + if (inode) > > + iput(inode); > > ---- to here ---- > > I believe that nd.dentry->d_inode cannot vanish because it is protected by the > down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing > something important ? Indeed it can't, but dentry->d_inode will be set to NULL by nfs_unlink->nfs_safe_remove->d_delete. Thus the problem. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-19 23:45 ` Marcelo Tosatti @ 2006-06-20 22:23 ` Willy Tarreau 2006-06-21 1:09 ` Grant Coady 2006-06-21 13:50 ` Marcelo Tosatti 0 siblings, 2 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-20 22:23 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Grant Coady, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 08:45:06PM -0300, Marcelo Tosatti wrote: > > > > ---- from here ---- > > > > > > > + inode = dentry->d_inode; > > > + if (inode) > > > + atomic_inc(&inode->i_count); > > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > > exit2: > > > dput(dentry); > > > } > > > up(&nd.dentry->d_inode->i_sem); > > > + if (inode) > > > + iput(inode); > > > > ---- to here ---- > > > > I believe that nd.dentry->d_inode cannot vanish because it is protected by the > > down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing > > something important ? > > Indeed it can't, but dentry->d_inode will be set to NULL by > nfs_unlink->nfs_safe_remove->d_delete. Thus the problem. What puzzles me is how are we supposed to up(&nd.dentry->d_inode->i_sem) if dentry->d_inode can become NULL ? simply by keeping a copy of it ? I thought that the down() protected the whole thing, but may be that's stupid anyway. I've been running rc1 without this patch for a few hours and during kernel compiles without a problem, so I'm not sure about what to think about the other changes which were apparently harmless too :-/ Well, if I resume it right, we only need to merge your patch and mine and it *should* be OK. BTW, I've been reviewing the PaX patch and found *at least* one patch that should be merged (fix for oops). I'll send it separately, and it's queued in -upstream. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-20 22:23 ` Willy Tarreau @ 2006-06-21 1:09 ` Grant Coady 2006-06-21 4:05 ` Willy Tarreau 2006-06-21 13:50 ` Marcelo Tosatti 1 sibling, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-21 1:09 UTC (permalink / raw) To: Willy Tarreau; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Wed, 21 Jun 2006 00:23:57 +0200, Willy Tarreau <w@1wt.eu> wrote: >On Mon, Jun 19, 2006 at 08:45:06PM -0300, Marcelo Tosatti wrote: >> > >> > ---- from here ---- >> > >> > >> > > + inode = dentry->d_inode; >> > > + if (inode) >> > > + atomic_inc(&inode->i_count); >> > > error = vfs_unlink(nd.dentry->d_inode, dentry); >> > > exit2: >> > > dput(dentry); >> > > } >> > > up(&nd.dentry->d_inode->i_sem); >> > > + if (inode) >> > > + iput(inode); >> > >> > ---- to here ---- >> > >> > I believe that nd.dentry->d_inode cannot vanish because it is protected by the >> > down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing >> > something important ? >> >> Indeed it can't, but dentry->d_inode will be set to NULL by >> nfs_unlink->nfs_safe_remove->d_delete. Thus the problem. > >What puzzles me is how are we supposed to up(&nd.dentry->d_inode->i_sem) if >dentry->d_inode can become NULL ? simply by keeping a copy of it ? I thought >that the down() protected the whole thing, but may be that's stupid anyway. >I've been running rc1 without this patch for a few hours and during kernel >compiles without a problem, so I'm not sure about what to think about the >other changes which were apparently harmless too :-/ So what's the final fixup? Last two patches don't seem to cause the problems previously reported by me. They don't play together though, so I'll add my general sense of confusion to this issue ;) Should I run the thing (which patch?) and compile a hundred kernels or something to see what (if anything) breaks. Shortest day of year here, I don't mind running the test box as part of room heating :o) Thanks, Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-21 1:09 ` Grant Coady @ 2006-06-21 4:05 ` Willy Tarreau 0 siblings, 0 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-21 4:05 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro Hi Grant, On Wed, Jun 21, 2006 at 11:09:15AM +1000, Grant Coady wrote: > >What puzzles me is how are we supposed to up(&nd.dentry->d_inode->i_sem) if > >dentry->d_inode can become NULL ? simply by keeping a copy of it ? I thought > >that the down() protected the whole thing, but may be that's stupid anyway. > >I've been running rc1 without this patch for a few hours and during kernel > >compiles without a problem, so I'm not sure about what to think about the > >other changes which were apparently harmless too :-/ > > So what's the final fixup? Last two patches don't seem to cause the > problems previously reported by me. They don't play together though, > so I'll add my general sense of confusion to this issue ;) :-) Marcelo's patch applies to sys_unlink(). It prevents sys_unlink() from oopsing while releasing the inode's semaphore after vfs_unlink() has nullified the inode pointer. Mine did nearly the same within vfs_unlink(), where you got your original oopses. > Should I run the thing (which patch?) and compile a hundred kernels > or something to see what (if anything) breaks. Shortest day of year > here, I don't mind running the test box as part of room heating :o) If you want to heat your room, you should effectively apply both patches, then run a hundred kernel compiles. Removing the "-pipe" option to gcc would help a lot since it will have to create temp files. > Thanks, > Grant. Thanks for your time, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-20 22:23 ` Willy Tarreau 2006-06-21 1:09 ` Grant Coady @ 2006-06-21 13:50 ` Marcelo Tosatti 2006-06-21 20:26 ` Willy Tarreau 1 sibling, 1 reply; 36+ messages in thread From: Marcelo Tosatti @ 2006-06-21 13:50 UTC (permalink / raw) To: Willy Tarreau; +Cc: Grant Coady, linux-kernel, Al Viro Hi Willy, On Wed, Jun 21, 2006 at 12:23:57AM +0200, Willy Tarreau wrote: > On Mon, Jun 19, 2006 at 08:45:06PM -0300, Marcelo Tosatti wrote: > > > > > > ---- from here ---- > > > > > > > > > > + inode = dentry->d_inode; > > > > + if (inode) > > > > + atomic_inc(&inode->i_count); > > > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > > > exit2: > > > > dput(dentry); > > > > } > > > > up(&nd.dentry->d_inode->i_sem); > > > > + if (inode) > > > > + iput(inode); > > > > > > ---- to here ---- > > > > > > I believe that nd.dentry->d_inode cannot vanish because it is protected by the > > > down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing > > > something important ? > > > > Indeed it can't, but dentry->d_inode will be set to NULL by > > nfs_unlink->nfs_safe_remove->d_delete. Thus the problem. > > What puzzles me is how are we supposed to up(&nd.dentry->d_inode->i_sem) if > dentry->d_inode can become NULL ? down(&nd.dentry->d_inode->i_sem); dentry = lookup_hash(&nd.last, nd.dentry); error = vfs_unlink(nd.dentry->d_inode, dentry); ^^^^^^^ It does vfs_unlink(parent, child). The child is killed. > simply by keeping a copy of it ? I thought > that the down() protected the whole thing, but may be that's stupid anyway. It does not protect from last reference going away, which is what happens when NFS d_delete's the dentry. dentry_iput() gets called, and maybe the last reference to "struct inode" might be gone too. > I've been running rc1 without this patch for a few hours and during kernel > compiles without a problem, so I'm not sure about what to think about the > other changes which were apparently harmless too :-/ Use NFS... :) > Well, if I resume it right, we only need to merge your patch and mine and > it *should* be OK. > > BTW, I've been reviewing the PaX patch and found *at least* one patch > that should be merged (fix for oops). I'll send it separately, and it's > queued in -upstream. Merged, thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-21 13:50 ` Marcelo Tosatti @ 2006-06-21 20:26 ` Willy Tarreau 0 siblings, 0 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-21 20:26 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Grant Coady, linux-kernel, Al Viro Hi Marcelo, On Wed, Jun 21, 2006 at 10:50:15AM -0300, Marcelo Tosatti wrote: > Hi Willy, > > On Wed, Jun 21, 2006 at 12:23:57AM +0200, Willy Tarreau wrote: > > On Mon, Jun 19, 2006 at 08:45:06PM -0300, Marcelo Tosatti wrote: > > > > > > > > ---- from here ---- > > > > > > > > > > > > > + inode = dentry->d_inode; > > > > > + if (inode) > > > > > + atomic_inc(&inode->i_count); > > > > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > > > > exit2: > > > > > dput(dentry); > > > > > } > > > > > up(&nd.dentry->d_inode->i_sem); > > > > > + if (inode) > > > > > + iput(inode); > > > > > > > > ---- to here ---- > > > > > > > > I believe that nd.dentry->d_inode cannot vanish because it is protected by the > > > > down(->i_sem) before and the up(->i_sem) after. Am I right or am I missing > > > > something important ? > > > > > > Indeed it can't, but dentry->d_inode will be set to NULL by > > > nfs_unlink->nfs_safe_remove->d_delete. Thus the problem. > > > > What puzzles me is how are we supposed to up(&nd.dentry->d_inode->i_sem) if > > dentry->d_inode can become NULL ? > > down(&nd.dentry->d_inode->i_sem); > dentry = lookup_hash(&nd.last, nd.dentry); > error = vfs_unlink(nd.dentry->d_inode, dentry); > ^^^^^^^ > > It does vfs_unlink(parent, child). The child is killed. > > > simply by keeping a copy of it ? I thought > > that the down() protected the whole thing, but may be that's stupid anyway. > > It does not protect from last reference going away, which is what > happens when NFS d_delete's the dentry. dentry_iput() gets called, and > maybe the last reference to "struct inode" might be gone too. OK for this one, but what I mean is that vfs_unlink() will kill dentry (its second argument, which your fix protects), while the down()/up() we initially suspected, and tried to protect, play with nd.dentry->d_inode, which is the directory in which we are removing dentry. And I can't see how nd.dentry->d_inode can be killed during vfs_unlink(), reason why I thought the fix above did not provide any benefit. > > I've been running rc1 without this patch for a few hours and during kernel > > compiles without a problem, so I'm not sure about what to think about the > > other changes which were apparently harmless too :-/ > > Use NFS... :) I'm using it at home right now, doing lots of git-checkout -f and make -j dep ; make -j 4 bzImage modules on SMP without a glitch. Right now I'm running rc2 without trouble and I noticed you did not merge the fix for sys_unlink() that we discussed above. May be you simply have forgotten it or may be we are in sync but were talking about different things. > > Well, if I resume it right, we only need to merge your patch and mine and > > it *should* be OK. > > > > BTW, I've been reviewing the PaX patch and found *at least* one patch > > that should be merged (fix for oops). I'll send it separately, and it's > > queued in -upstream. > > Merged, thanks. Thanks, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 13:37 ` Marcelo Tosatti 2006-06-18 22:25 ` Grant Coady @ 2006-06-18 22:33 ` Grant Coady 2006-06-18 22:40 ` Willy Tarreau 1 sibling, 1 reply; 36+ messages in thread From: Grant Coady @ 2006-06-18 22:33 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: linux-kernel, Willy Tarreau, Al Viro On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: >Can you please try the attached patch. > >Grab a reference to the victim inode before calling vfs_unlink() to avoid >it vanishing under us. > >diff --git a/fs/namei.c b/fs/namei.c >index 42cce98..7993283 100644 >--- a/fs/namei.c >+++ b/fs/namei.c >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * > char * name; > struct dentry *dentry; > struct nameidata nd; >+ struct inode *inode = NULL; > > name = getname(pathname); > if(IS_ERR(name)) >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * > /* Why not before? Because we want correct error value */ > if (nd.last.name[nd.last.len]) > goto slashes; >+ inode = dentry->d_inode; >+ if (inode) >+ atomic_inc(&inode->i_count); > error = vfs_unlink(nd.dentry->d_inode, dentry); > exit2: > dput(dentry); > } > up(&nd.dentry->d_inode->i_sem); >+ if (inode) >+ iput(inode); > exit1: > path_release(&nd); > exit: An odd thing happening is that after editing /etc/lilo.conf I get the vim backup file: "/etc/lilo.co~" instead of "/etc/lilo.conf~" -- this was happening before the above patch, with -rc1. Very weird. I'm editing lilo.conf to have the box reboot to 2.6.16.20, since it is semi-headless (has KVM, but I usually it run via ssh terminal). Grant. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Linux 2.4.33-rc1 2006-06-18 22:33 ` Grant Coady @ 2006-06-18 22:40 ` Willy Tarreau 0 siblings, 0 replies; 36+ messages in thread From: Willy Tarreau @ 2006-06-18 22:40 UTC (permalink / raw) To: Grant Coady; +Cc: Marcelo Tosatti, linux-kernel, Al Viro On Mon, Jun 19, 2006 at 08:33:19AM +1000, Grant Coady wrote: > On Sun, 18 Jun 2006 10:37:18 -0300, Marcelo Tosatti <marcelo@kvack.org> wrote: > > >Can you please try the attached patch. > > > >Grab a reference to the victim inode before calling vfs_unlink() to avoid > >it vanishing under us. > > > >diff --git a/fs/namei.c b/fs/namei.c > >index 42cce98..7993283 100644 > >--- a/fs/namei.c > >+++ b/fs/namei.c > >@@ -1509,6 +1509,7 @@ asmlinkage long sys_unlink(const char * > > char * name; > > struct dentry *dentry; > > struct nameidata nd; > >+ struct inode *inode = NULL; > > > > name = getname(pathname); > > if(IS_ERR(name)) > >@@ -1527,11 +1528,16 @@ asmlinkage long sys_unlink(const char * > > /* Why not before? Because we want correct error value */ > > if (nd.last.name[nd.last.len]) > > goto slashes; > >+ inode = dentry->d_inode; > >+ if (inode) > >+ atomic_inc(&inode->i_count); > > error = vfs_unlink(nd.dentry->d_inode, dentry); > > exit2: > > dput(dentry); > > } > > up(&nd.dentry->d_inode->i_sem); > >+ if (inode) > >+ iput(inode); > > exit1: > > path_release(&nd); > > exit: > > An odd thing happening is that after editing /etc/lilo.conf I get the > vim backup file: "/etc/lilo.co~" instead of "/etc/lilo.conf~" -- this > was happening before the above patch, with -rc1. Very weird. I'm > editing lilo.conf to have the box reboot to 2.6.16.20, since it is > semi-headless (has KVM, but I usually it run via ssh terminal). OK, if you doubt about the kernel you're running, the build # and date are reported by 'uname -a'. I often abuse this because sometimes I'm not sure about what I see ! > Grant. Cheers, Willy ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2006-06-21 20:32 UTC | newest] Thread overview: 36+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-06-16 18:14 Linux 2.4.33-rc1 Marcelo Tosatti 2006-06-16 22:21 ` Grant Coady 2006-06-16 22:38 ` Michal Piotrowski 2006-06-16 23:24 ` Grant Coady 2006-06-17 5:13 ` Willy Tarreau 2006-06-17 6:24 ` Grant Coady 2006-06-17 7:10 ` Willy Tarreau 2006-06-17 17:15 ` Need to format twice /dev/ramX with reiserfs to be able to mount it ? sebastien cabaniols 2006-06-17 18:14 ` Bernd Eckenfels 2006-06-17 18:37 ` sebastien cabaniols 2006-06-17 19:55 ` Linux 2.4.33-rc1 Marcelo Tosatti 2006-06-18 13:37 ` Marcelo Tosatti 2006-06-18 22:25 ` Grant Coady 2006-06-18 22:37 ` Willy Tarreau 2006-06-18 23:07 ` Grant Coady 2006-06-19 4:01 ` Willy Tarreau 2006-06-19 5:03 ` Grant Coady 2006-06-19 8:06 ` Willy Tarreau 2006-06-19 8:53 ` Grant Coady 2006-06-19 9:08 ` Willy Tarreau 2006-06-19 9:12 ` Grant Coady 2006-06-19 9:24 ` Willy Tarreau 2006-06-19 10:27 ` Grant Coady 2006-06-19 10:31 ` Willy Tarreau 2006-06-19 20:11 ` Grant Coady 2006-06-19 20:20 ` Willy Tarreau 2006-06-19 22:04 ` Marcelo Tosatti 2006-06-19 23:00 ` Willy Tarreau 2006-06-19 23:45 ` Marcelo Tosatti 2006-06-20 22:23 ` Willy Tarreau 2006-06-21 1:09 ` Grant Coady 2006-06-21 4:05 ` Willy Tarreau 2006-06-21 13:50 ` Marcelo Tosatti 2006-06-21 20:26 ` Willy Tarreau 2006-06-18 22:33 ` Grant Coady 2006-06-18 22:40 ` Willy Tarreau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox