Bug in reiserfsck 3.6.5

* Bug in reiserfsck 3.6.5
@ 2003-04-09 10:57 Kelledin
  2003-04-09 10:09 ` Yury Umanets
  2003-04-09 10:15 ` Oleg Drokin
  0 siblings, 2 replies; 5+ messages in thread
From: Kelledin @ 2003-04-09 10:57 UTC (permalink / raw)
  To: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 2184 bytes --]

(I'd send this directly to Yura, but I'm having a bit of trouble 
getting mail through.  Twice it's bounced off the 
namesys.botik.ru mailserver as suspected spam, and when I 
finally employ a mailertable trick, namesys.botik.ru no longer 
resolves.  I think some angry god is conspiring against me and 
my bug report...)

I recently found that when installing reiserfsprogs with
/sbin/fsck.reiserfs symlinked to reiserfsck, fsck.reiserfs
generates a SIGABRT when called as an fsck backend via "fsck -a
-A -C -T" (fairly common command used in some system boot
scripts).  It was quite interesting to troubleshoot, as the
problem _didn't_ occur if I turned fsck.reiserfs into a wrapper
script, or called "fsck.reiserfs -a /dev/sda14" directly from
the bash prompt...

I traced it down to this line in lib/io.c:

--
void flush_buffers (dev_t dev)
{
    if (!dev)
        die ("flush_buffers: device is not specifed");
        ^^^^^
--

I'm fairly certain what is happening is that when fsck calls the
fsck.reiserfs backend, it's closing all default stream
descriptors (stdin, stdout, stderr) before exec'ing it.  So if
fsck.reiserfs opens the device file (/dev/sda14) before anything
else, then fs->fs_dev gets a descriptor value of zero.  This
eventually trickles down to flush_buffers(), which thinks
something is wrong with this and croaks.

(This is obviously incorrect thinking on the part of
flush_buffers().  Having a general-purpose file descriptor with
a value of 0 is unusual, but not really incorrect.)
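
To make the descriptor-zero case concrete, here is a minimal sketch in C.
It is not reiserfsprogs code; the helper name fd_from_open_without_stdin
is made up for illustration, and /dev/null stands in for the device file.
It relies only on the POSIX rule that open() returns the lowest-numbered
descriptor not currently in use.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical, not part of reiserfsprogs): close
 * descriptor 0 the way a parent might before exec, open `path`, and report
 * which descriptor number open() handed back.  Descriptor 0 is restored
 * before returning so the caller's stdin is left as it was.
 */
int fd_from_open_without_stdin(const char *path)
{
    assert(path != NULL);

    int saved = dup(0);            /* remember stdin; -1 if it was already closed */
    close(0);                      /* descriptor 0 is now free */

    int fd = open(path, O_RDONLY); /* POSIX: lowest-numbered free descriptor */
    int got = fd;

    if (fd >= 0)
        close(fd);
    if (saved >= 0) {              /* put stdin back where it was */
        dup2(saved, 0);
        close(saved);
    }
    return got;
}
```

Calling fd_from_open_without_stdin("/dev/null") should hand back 0, which
is a perfectly usable descriptor but is exactly the value an "if (!dev)"
test rejects.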

When reiserfsck is called directly from the shell prompt, or is
executed via a wrapper script, it actually gets its own
stdin/stdout/stderr sitting on descriptors 0/1/2 and thus
doesn't trip over this bug.  So creating a wrapper script works
as a quick band-aid fix.

The proper solution is to change the flush_buffers() way of
thinking; the attached patch might be enough.  Or it might not 
be.  If some other bit of code is actually setting fs->fs_dev to 
0 to signify a real error condition, then a real fix is going to 
require more far-reaching changes.
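
One possible shape for such a far-reaching change, sketched under loud
assumptions: the names NO_DEVICE and flush_buffers_sketch are invented
here, the descriptor is modelled as a plain int rather than dev_t, and
this is not the fix that went into reiserfsprogs.  The idea is to reserve
a value that can never be a descriptor as the "not specified" marker, so
that 0 stops being special.

```c
#include <assert.h>

/* Hypothetical marker for "no device"; -1 can never be a valid descriptor. */
#define NO_DEVICE (-1)

/* The 3.6.5-style test: treats descriptor 0 as "device not specified". */
int dev_rejected_by_old_test(int dev)
{
    return !dev;
}

/*
 * Stand-in for flush_buffers(): refuse only a genuinely unset device.
 * Returns -1 where the real code would call die(), and 0 where it would
 * go on to sync_buffers().
 */
int flush_buffers_sketch(int dev)
{
    assert(dev >= NO_DEVICE);   /* anything below the marker is a caller bug */
    if (dev == NO_DEVICE)
        return -1;
    return 0;
}
```

This only helps if every place that creates or clears the filesystem
structure also sets the field to NO_DEVICE instead of leaving it
zero-initialized, which is why the change would reach well beyond
flush_buffers() itself.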

--
Kelledin
"If a server crashes in a server farm and no one pings it, does
it still cost four figures to fix?"

[-- Attachment #2: reiserfsprogs-3.6.5-fdzero.patch --]
[-- Type: text/x-diff, Size: 603 bytes --]

diff -Naur reiserfsprogs-3.6.5/lib/io.c reiserfsprogs-3.6.5-fdzero/lib/io.c
--- reiserfsprogs-3.6.5/lib/io.c	2003-03-12 11:34:43.000000000 -0600
+++ reiserfsprogs-3.6.5-fdzero/lib/io.c	2003-04-09 05:43:05.000000000 -0500
@@ -390,8 +390,12 @@

 void flush_buffers (dev_t dev)
 {
+/* Scrap this test.  A file descriptor with a value of zero is perfectly
+ * valid.
+ *                                                        --kelledin
     if (!dev)
 	die ("flush_buffers: device is not specifed");
+*/
     sync_buffers (&Buffer_list_head, dev, 0/*all*/);
     buffer_soft_limit = BUFFER_SOFT_LIMIT;
 }


* Re: Bug in reiserfsck 3.6.5
  2003-04-09 10:57 Bug in reiserfsck 3.6.5 Kelledin
@ 2003-04-09 10:09 ` Yury Umanets
  2003-04-09 10:15 ` Oleg Drokin
  1 sibling, 0 replies; 5+ messages in thread
From: Yury Umanets @ 2003-04-09 10:09 UTC (permalink / raw)
  To: Kelledin (by way of Kelledin <kelledin+RFS@skarpsey.dyndns.org>)
  Cc: reiserfs-list

Kelledin (by way of Kelledin) wrote:

>[...]

Can you please try the latest reiserfsprogs pre-release:
ftp://ftp.namesys.com/pub/reiserfsprogs/pre/reiserfsprogs-3.6.6-pre1.tar.gz
It seems this bug has already been fixed there.

-- 
Yury Umanets 
"We're flying high, we're watching the world passes by..."





* Re: Bug in reiserfsck 3.6.5
  2003-04-09 10:57 Bug in reiserfsck 3.6.5 Kelledin
  2003-04-09 10:09 ` Yury Umanets
@ 2003-04-09 10:15 ` Oleg Drokin
  2003-04-09 17:17   ` Hans Reiser
  1 sibling, 1 reply; 5+ messages in thread
From: Oleg Drokin @ 2003-04-09 10:15 UTC (permalink / raw)
  To: Kelledin; +Cc: reiserfs-list

Hello!

On Wed, Apr 09, 2003 at 05:57:29AM -0500, Kelledin wrote:
> I recently found that when installing reiserfsprogs with
> /sbin/fsck.reiserfs symlinked to reiserfsck, fsck.reiserfs
> generates a SIGABRT when called as an fsck backend via "fsck -a
> -A -C -T" (fairly common command used in some system boot
> scripts).  It was quite interesting to troubleshoot, as the
> problem _didn't_ occur if I turned fsck.reiserfs into a wrapper
> script, or called "fsck.reiserfs -a /dev/sda14" directly from
> the bash prompt...
[...]

Thanks for the analysis. Actually we already performed it some time ago.
Grab the patch for reiserfsprogs-3.6.5 from 
ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.5-flush_buffers-bug.patch

Or you can grab reiserfsprogs 3.6.6-pre1, which also contains the bugfix.

Bye,
    Oleg


* Re: Bug in reiserfsck 3.6.5
  2003-04-09 10:15 ` Oleg Drokin
@ 2003-04-09 17:17   ` Hans Reiser
  2003-04-10  4:12     ` Kelledin
  0 siblings, 1 reply; 5+ messages in thread
From: Hans Reiser @ 2003-04-09 17:17 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Kelledin, reiserfs-list

Oleg Drokin wrote:

>Hello!
>
>On Wed, Apr 09, 2003 at 05:57:29AM -0500, Kelledin wrote:
>
>>I recently found that when installing reiserfsprogs with
>>/sbin/fsck.reiserfs symlinked to reiserfsck, fsck.reiserfs
>>generates a SIGABRT when called as an fsck backend via "fsck -a
>>-A -C -T" (fairly common command used in some system boot
>>scripts).  It was quite interesting to troubleshoot, as the
>>problem _didn't_ occur if I turned fsck.reiserfs into a wrapper
>>script, or called "fsck.reiserfs -a /dev/sda14" directly from
>>the bash prompt...
>
>[...]
>
>Thanks for the analysis. Actually we already performed it some time ago.
>Grab the patch for reiserfsprogs-3.6.5 from 
>ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.5-flush_buffers-bug.patch
>
>Or you can grab reiserfsprogs 3.6.6-pre1, which also contains the bugfix.
>
>Bye,
>    Oleg

please feel encouraged to do more debugging for us though..... :)  best,

-- 
Hans




* Re: Bug in reiserfsck 3.6.5
  2003-04-09 17:17   ` Hans Reiser
@ 2003-04-10  4:12     ` Kelledin
  0 siblings, 0 replies; 5+ messages in thread
From: Kelledin @ 2003-04-10  4:12 UTC (permalink / raw)
  To: reiserfs-list

On Wednesday 09 April 2003 12:17 pm, Hans Reiser wrote:
> >Thanks for the analysis. Actually we already performed it
> > some time ago. Grab the patch for reiserfsprogs-3.6.5 from
> >ftp://ftp.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.5-flush_buffers-bug.patch

Ahhh, thnx, that took care of it. :)  I'd probably have noticed 
that patch if I checked the FTP directory out of habit...

> please feel encouraged to do more debugging for us though.....

Heh, I can't promise much; I am shamelessly self-centered when it 
comes to debugging. ;)  I only go out of my way like this for 
stuff that affects me, else I'd lose all my hair by the time I 
was thirty!

-- 
Kelledin
"If a server crashes in a server farm and no one pings it, does 
it still cost four figures to fix?"
