public inbox for linux-xfs@vger.kernel.org
* Regression- XFS won't mount on partitioned md array
@ 2008-05-16 17:11 David Greaves
  2008-05-16 17:16 ` Justin Piszcz
  2008-05-16 18:59 ` Eric Sandeen
  0 siblings, 2 replies; 17+ messages in thread
From: David Greaves @ 2008-05-16 17:11 UTC (permalink / raw)
  To: David Chinner; +Cc: LinuxRaid, xfs, 'linux-kernel@vger.kernel.org'

I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
mounts my xfs filesystem.

I bisected it to around
a67d7c5f5d25d0b13a4dfb182697135b014fa478
[XFS] Move platform specific mount option parse out of core XFS code

I have a RAID5 array with partitions:

Partition Table for /dev/md_d0

               First       Last
 # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
-- ------- ----------- ----------- ------ ----------- -------------------- ----
 1 Primary           0  2500288279      4  2500288280 Linux (83)           None
 2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None


when I attempt to mount /media:
/dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0

I get:
 md_d0: p1 p2
XFS mounting filesystem md_d0p1
attempt to access beyond end of device
md_d0p2: rw=0, want=195311, limit=195304
I/O error in filesystem ("md_d0p1") meta-data dev md_d0p2 block 0x2fae7
("xlog_bread") error 5 buf count 512
XFS: empty log check failed
XFS: log mount/recovery failed: error 5
XFS: log mount failed
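
The numbers in the log above can be pulled apart mechanically. The following is a minimal sketch, assuming that "want" is the end sector of the rejected request and "limit" is the device size in 512-byte sectors; the function name and regular expression are illustrative only and are not part of any kernel or XFS tooling.

```python
import re

# Matches the detail line that follows "attempt to access beyond end of device",
# e.g. "md_d0p2: rw=0, want=195311, limit=195304".
BAD_SECTOR_RE = re.compile(
    r"^\s*(?P<dev>\S+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)\s*$"
)


def parse_bad_sector(line):
    """Return the device, want, limit and overshoot (in sectors), or None."""
    match = BAD_SECTOR_RE.match(line)
    if match is None:
        return None
    want = int(match.group("want"))
    limit = int(match.group("limit"))
    return {
        "dev": match.group("dev"),
        "want": want,
        "limit": limit,
        "overshoot": want - limit,  # sectors requested past the end
    }


info = parse_bad_sector("md_d0p2: rw=0, want=195311, limit=195304")
print(info["dev"], info["overshoot"])  # md_d0p2 7

# The failing buffer is reported at block 0x2fae7; in decimal that is the
# last valid sector of a 195304-sector device.
failing_block = int("0x2fae7", 16)
print(failing_block, info["limit"] - 1)  # 195303 195303
```

Under those assumptions the request starts on the last sector of md_d0p2 and runs 7 sectors past the end of the device.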

A repair:
  xfs_repair /dev/md_d0p1 -l /dev/md_d0p2
gives no errors.

Phase 1 - find and verify superblock...
Phase 2 - using external log on /dev/md_d0p2
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
...


David

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 17:11 Regression- XFS won't mount on partitioned md array David Greaves
@ 2008-05-16 17:16 ` Justin Piszcz
  2008-05-16 18:05   ` David Greaves
  2008-05-16 18:59 ` Eric Sandeen
  1 sibling, 1 reply; 17+ messages in thread
From: Justin Piszcz @ 2008-05-16 17:16 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'



On Fri, 16 May 2008, David Greaves wrote:

> I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
> mounts my xfs filesystem.
>
> I bisected it to around
> a67d7c5f5d25d0b13a4dfb182697135b014fa478
> [XFS] Move platform specific mount option parse out of core XFS code
>
> I have a RAID5 array with partitions:
>
> Partition Table for /dev/md_d0
>
>               First       Last
> # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
> -- ------- ----------- ----------- ------ ----------- -------------------- ----
> 1 Primary           0  2500288279      4  2500288280 Linux (83)           None
> 2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None
>
>
> when I attempt to mount /media:
> /dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0
>
> I get:
> md_d0: p1 p2
> XFS mounting filesystem md_d0p1
> attempt to access beyond end of device
> md_d0p2: rw=0, want=195311, limit=195304
> I/O error in filesystem ("md_d0p1") meta-data dev md_d0p2 block 0x2fae7
> ("xlog_bread") error 5 buf count 512
> XFS: empty log check failed
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed
>
> A repair:
>  xfs_repair /dev/md_d0p1 -l /dev/md_d0p2
> gives no errors.
>
> Phase 1 - find and verify superblock...
> Phase 2 - using external log on /dev/md_d0p2
>        - zero log...
>        - scan filesystem freespace and inode maps...
>        - found root inode chunk
> ...
>
>
> David
>
>

Ouch. I'm still on 2.6.25.1 here and haven't rebooted yet, but I do not use
mdraid'ed partitions, just regular mdraid. If you boot back into 2.6.20.7,
does it work again?

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 17:16 ` Justin Piszcz
@ 2008-05-16 18:05   ` David Greaves
  2008-05-16 18:35     ` Oliver Pinter
  0 siblings, 1 reply; 17+ messages in thread
From: David Greaves @ 2008-05-16 18:05 UTC (permalink / raw)
  To: Justin Piszcz
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'

> Ouch, still on 2.6.25.1 here, didn't reboot yet, but I do not use
> mdraid'ed partitions, just regular mdraid, if you boot back to 2.6.20.7
> does it work again?
Yes, no probs.

It came in prior to 2.6.25-rc1
The machine has a root xfs filesystem with an internal log on a sata disk and a
data filesystem on a partitioned array with an external log (also on the
partitioned array).
Only the partitioned array/external-log filesystem is affected.

David

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 18:05   ` David Greaves
@ 2008-05-16 18:35     ` Oliver Pinter
  2008-05-17 14:48       ` David Greaves
  0 siblings, 1 reply; 17+ messages in thread
From: Oliver Pinter @ 2008-05-16 18:35 UTC (permalink / raw)
  To: David Greaves
  Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs,
	linux-kernel@vger.kernel.org

Does this[1] patch fix it?

This patch is for the 2.6.25.5 kernel.

1: http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.25/block-do_mounts-accept-root-non-existant-partition.patch;h=097cda9928b434994dbb157065b2ca38e7cec3a1;hb=8cc4c3b370d59deb16c2e92165a466c82e914020

On 5/16/08, David Greaves <david@dgreaves.com> wrote:
>> Ouch, still on 2.6.25.1 here, didn't reboot yet, but I do not use
>> mdraid'ed partitions, just regular mdraid, if you boot back to 2.6.20.7
>> does it work again?
> Yes, no probs.
>
> It came in prior to 2.6.25-rc1
> The machine has a root xfs filesystem with an internal log on a sata disk
> and a
> data filesystem on a partitioned array with an external log (also on the
> partitioned array).
> Only the partitioned array/external-log filesystem is affected.
>
> David
>


-- 
Thanks,
Oliver

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 17:11 Regression- XFS won't mount on partitioned md array David Greaves
  2008-05-16 17:16 ` Justin Piszcz
@ 2008-05-16 18:59 ` Eric Sandeen
  2008-05-17 14:46   ` David Greaves
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Sandeen @ 2008-05-16 18:59 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'

David Greaves wrote:
> I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
> mounts my xfs filesystem.
> 
> I bisected it to around
> a67d7c5f5d25d0b13a4dfb182697135b014fa478
> [XFS] Move platform specific mount option parse out of core XFS code

around that... not exactly?  That commit should have been largely a code
move, which is not to say that it can't contain a bug... :)

> I have a RAID5 array with partitions:
> 
> Partition Table for /dev/md_d0
> 
>                First       Last
>  # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
> -- ------- ----------- ----------- ------ ----------- -------------------- ----
>  1 Primary           0  2500288279      4  2500288280 Linux (83)           None
>  2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None
> 
> 
> when I attempt to mount /media:
> /dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0

mythbox?  :)

Hm, so it's the external log size that it doesn't much like...

> I get:
>  md_d0: p1 p2
> XFS mounting filesystem md_d0p1
> attempt to access beyond end of device
> md_d0p2: rw=0, want=195311, limit=195304

what does /proc/partitions say about md_d0p1 and p2?  Is it different
between the older & newer kernel?

What does xfs_info /mount/point say about the filesystem when you mount
it under the older kernel?  Or... if you can't mount it,

xfs_db -r -c "sb 0" -c p /dev/md_d0p1

-Eric

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 18:59 ` Eric Sandeen
@ 2008-05-17 14:46   ` David Greaves
  2008-05-17 15:15     ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: David Greaves @ 2008-05-17 14:46 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'

Eric Sandeen wrote:
> David Greaves wrote:
>> I just attempted a kernel upgrade from 2.6.20.7 to 2.6.25.3 and it no longer
>> mounts my xfs filesystem.
>>
>> I bisected it to around
>> a67d7c5f5d25d0b13a4dfb182697135b014fa478
>> [XFS] Move platform specific mount option parse out of core XFS code
> 
> around that... not exactly?  That commit should have been largely a code
> move, which is not to say that it can't contain a bug... :)
I got to within 4 commits on the bisect when my xfs partition containing the
kernel src and the bisect history blew up, telling me that files were
directories and then exploding in a heap of lost+found/ fragments. Quite, erm,
"interesting" really.

At that point I decided I was close enough to ask for advice, looked at the
commits and took this one as the most likely to cause the bug :)

But, thinking about it, I can decode the kernel extraversion tags in /boot. From
that I think my bounds were:
40ebd81d1a7635cf92a59c387a599fce4863206b
[XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity
and:
3ed6526441053d79b85d206b14d75125e6f51cc2
[XFS] Implement fallocate.

so those bound:
[XFS] Remove the BPCSHIFT and NB* based macros from XFS.
[XFS] Remove bogus assert
[XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config
[XFS] Move platform specific mount option parse out of core XFS code

and just glancing through the patches I didn't see any changes that looked
likely in the others...


> 
>> I have a RAID5 array with partitions:
>>
>> Partition Table for /dev/md_d0
>>
>>                First       Last
>>  # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
>> -- ------- ----------- ----------- ------ ----------- -------------------- ----
>>  1 Primary           0  2500288279      4  2500288280 Linux (83)           None
>>  2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None
>>
>>
>> when I attempt to mount /media:
>> /dev/md_d0p1 /media xfs rw,nobarrier,noatime,logdev=/dev/md_d0p2,allocsize=512m 0 0
> 
> mythbox?  :)
Hey - we test some interesting corner cases... :)
My *wife* just told *me* to buy, and I quote "No more than 10" 1TB Samsung
drives... I decided 5 would be plenty.

> Hm, so it's the external log size that it doesn't much like...
Yep - I noticed that - and ISTR that Neil has been fiddling in the md
partitioning code over the last 6 months or so.
I wondered where it got the larger figure from and whether md was somehow
changing the partition size...


>> I get:
>>  md_d0: p1 p2
>> XFS mounting filesystem md_d0p1
>> attempt to access beyond end of device
>> md_d0p2: rw=0, want=195311, limit=195304
> 
> what does /proc/partitions say about md_d0p1 and p2?  Is it different
> between the older & newer kernel?
2.6.20.7 (good)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

2.6.25.3 (bad)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

2.6.25.4 (bad)
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2

So nothing obvious there then...
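
A comparison like the one above can also be done mechanically. This is a small sketch, assuming the usual four-column /proc/partitions layout (major, minor, #blocks in 1 KiB units, name); the helper names are illustrative only.

```python
def parse_partitions(text):
    """Map device name -> size in 1 KiB blocks from /proc/partitions-style rows."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields; the header row and blank lines are skipped.
        if len(fields) == 4 and fields[2].isdigit():
            sizes[fields[3]] = int(fields[2])
    return sizes


def changed_devices(old, new):
    """Return {name: (old_size, new_size)} for every device whose size differs."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


# Listings copied from the 2.6.20.7 (good) and 2.6.25.3 (bad) boots.
GOOD = """
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2
"""
BAD = """
 254     0 1250241792 md_d0
 254     1 1250144138 md_d0p1
 254     2      97652 md_d0p2
"""

print(changed_devices(parse_partitions(GOOD), parse_partitions(BAD)))  # {}
```

An empty result means the block layer reports the same partition sizes under both kernels.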

> 
> What does xfs_info /mount/point say about the filesystem when you mount
> it under the older kernel?  Or... if you can't mount it,
teak:~# xfs_info /media/
meta-data=/dev/md_d0p1           isize=256    agcount=32, agsize=9766751 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=312536032, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =external               bsize=4096   blocks=24413, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=65536  blocks=0, rtextents=0

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-16 18:35     ` Oliver Pinter
@ 2008-05-17 14:48       ` David Greaves
  2008-05-17 15:20         ` David Greaves
  0 siblings, 1 reply; 17+ messages in thread
From: David Greaves @ 2008-05-17 14:48 UTC (permalink / raw)
  To: Oliver Pinter
  Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs,
	linux-kernel@vger.kernel.org, Eric Sandeen

Oliver Pinter wrote:
> this[1] patch fixed?
>
> 1:
http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.25/block-do_mounts-accept-root-non-existant-partition.patch;h=097cda9928b434994dbb157065b2ca38e7cec3a1;hb=8cc4c3b370d59deb16c2e92165a466c82e914020


Looks like a possible candidate - thanks.

I think this patch is for mounting root on an md device when the partitions
aren't yet initialised.

However:
* When I run cfdisk I can read the partition table.
* Subsequent attempts to mount the xfs when /proc/partitions is clearly present
still fail.

> this patch is for 2.6.25.5 kernel
? there isn't a 2.6.25.5
It doesn't apply to 2.6.25.4

I'll see if I can make it apply...

David

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-17 14:46   ` David Greaves
@ 2008-05-17 15:15     ` Eric Sandeen
  2008-05-17 23:18       ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Sandeen @ 2008-05-17 15:15 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'

David Greaves wrote:
> Eric Sandeen wrote:

>>> I get:
>>>  md_d0: p1 p2
>>> XFS mounting filesystem md_d0p1
>>> attempt to access beyond end of device
>>> md_d0p2: rw=0, want=195311, limit=195304
>> what does /proc/partitions say about md_d0p1 and p2?  Is it different
>> between the older & newer kernel?

...

> 2.6.25.4 (bad)
>  254     0 1250241792 md_d0
>  254     1 1250144138 md_d0p1
>  254     2      97652 md_d0p2
> 
> So nothing obvious there then...
> 
>> What does xfs_info /mount/point say about the filesystem when you mount
>> it under the older kernel?  Or... if you can't mount it,
> teak:~# xfs_info /media/
> meta-data=/dev/md_d0p1           isize=256    agcount=32, agsize=9766751 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=312536032, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096
> log      =external               bsize=4096   blocks=24413, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=65536  blocks=0, rtextents=0

ok, and with:

> Partition Table for /dev/md_d0
> 
>                First       Last
>  # Type       Sector      Sector   Offset    Length   Filesystem Type (ID) Flag
> -- ------- ----------- ----------- ------ ----------- -------------------- ----
>  1 Primary           0  2500288279      4  2500288280 Linux (83)           None
>  2 Primary  2500288280  2500483583      0      195304 Non-FS data (DA)     None

So, xfs thinks the external log is 24413 4k blocks (from the sb geometry
printed by xfs_info).  This is 97652 1k units (matching your
/proc/partitions output) and 195304 512-byte sectors (matching the
partition table output).  So that all looks consistent.
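
The unit conversions above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are illustrative, and the input values are the ones reported by xfs_info, /proc/partitions and the partition table earlier in the thread.

```python
FS_BLOCK_SIZE = 4096   # bsize of the external log, from xfs_info
KIB = 1024             # unit used by /proc/partitions
SECTOR_SIZE = 512      # unit used by the partition table


def log_size_in_units(log_blocks, block_size=FS_BLOCK_SIZE):
    """Return the log size as (1 KiB units, 512-byte sectors)."""
    log_bytes = log_blocks * block_size
    return log_bytes // KIB, log_bytes // SECTOR_SIZE


kib, sectors = log_size_in_units(24413)
print(kib)      # 97652, matching md_d0p2 in /proc/partitions
print(sectors)  # 195304, matching the Length column for partition 2
```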

So if xfs is doing:

>>> md_d0p2: rw=0, want=195311, limit=195304
>>> XFS: empty log check failed

it surely does seem to be trying to read past what even it thinks is the
end of its log.

And, with your geometry, I can reproduce this w/o md, partitioned or not.
So it looks like xfs itself is busted:

loop5: rw=0, want=195311, limit=195304

I'll see if I have a little time today to track down the problem.

Thanks,

-Eric

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-17 14:48       ` David Greaves
@ 2008-05-17 15:20         ` David Greaves
  0 siblings, 0 replies; 17+ messages in thread
From: David Greaves @ 2008-05-17 15:20 UTC (permalink / raw)
  To: Oliver Pinter
  Cc: Justin Piszcz, David Chinner, LinuxRaid, xfs,
	linux-kernel@vger.kernel.org, Eric Sandeen

David Greaves wrote:
> Oliver Pinter wrote:
>> this[1] patch fixed?
>>
>> 1:
> http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob;f=queue-2.6.25/block-do_mounts-accept-root-non-existant-partition.patch;h=097cda9928b434994dbb157065b2ca38e7cec3a1;hb=8cc4c3b370d59deb16c2e92165a466c82e914020
> 
> 
> Looks like a possible candidate - thanks.
> 
> I think this patch is for mounting root on an md device when the partitions
> aren't yet initialised.
> 
> However:
> * When I run cfdisk I can read the partition table.
> * Subsequent attempts to mount the xfs when /proc/partitions is clearly present
> still fail.
> 
>> this patch is for 2.6.25.5 kernel
> ? there isn't a 2.6.25.5
Sorry, I understand; it's queued for 2.6.25.5.

> It doesn't apply to 2.6.25.4
> 
> I'll see if I can make it apply...
Yep - it was download corruption.

Applied but it didn't help.

David

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-17 15:15     ` Eric Sandeen
@ 2008-05-17 23:18       ` Eric Sandeen
  2008-05-18  5:23         ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Sandeen @ 2008-05-17 23:18 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, LinuxRaid, xfs,
	'linux-kernel@vger.kernel.org'

Eric Sandeen wrote:

> I'll see if I have a little time today to track down the problem.


Does this patch fix it for you?  Does for me though I can't yet explain
why ;)

http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html

-Eric

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-17 23:18       ` Eric Sandeen
@ 2008-05-18  5:23         ` Eric Sandeen
  2008-05-18  8:48           ` David Greaves
  2008-05-19  3:46           ` Timothy Shimmin
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Sandeen @ 2008-05-18  5:23 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, xfs, 'linux-kernel@vger.kernel.org',
	Christoph Hellwig

Eric Sandeen wrote:
> Eric Sandeen wrote:
> 
>> I'll see if I have a little time today to track down the problem.
> 
> 
> Does this patch fix it for you?  Does for me though I can't yet explain
> why ;)
> 
> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
> 
> -Eric

So what's happening is that xfs is trying to read a page-sized IO from
the last sector of the log... which goes off the end of the device.
This looks like another regression introduced by
a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
in the URL above, which should be headed towards -stable.
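
The arithmetic behind that explanation can be sketched as follows. This assumes a 4096-byte page size and 512-byte sectors, and takes the starting sector from the "block 0x2fae7" line in the original report; the helper names are illustrative and do not correspond to any kernel function.

```python
SECTOR_SIZE = 512
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def end_sector(start_sector, io_bytes):
    """First sector after the request, i.e. the value the block layer compares."""
    return start_sector + io_bytes // SECTOR_SIZE


def fits_on_device(start_sector, io_bytes, device_sectors):
    """True if the whole request lies within the device."""
    return end_sector(start_sector, io_bytes) <= device_sectors


DEVICE_SECTORS = 195304            # limit= from the kernel message
LAST_SECTOR = DEVICE_SECTORS - 1   # 195303, i.e. block 0x2fae7

# A single-sector read of the last sector stays within the device.
print(end_sector(LAST_SECTOR, SECTOR_SIZE),
      fits_on_device(LAST_SECTOR, SECTOR_SIZE, DEVICE_SECTORS))  # 195304 True

# A page-sized read from the same sector runs 7 sectors past the end.
print(end_sector(LAST_SECTOR, PAGE_SIZE),
      fits_on_device(LAST_SECTOR, PAGE_SIZE, DEVICE_SECTORS))    # 195311 False
```

The second result reproduces want=195311 from the kernel log, which is consistent with a page-sized read being issued where a 512-byte read was intended.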

(aside: it seems that this breaks any external log setup where the log
consists of the entire device... but I'd have expected the xfsqa suite
to catch this...?)

The patch avoids the problem by looking for some extra locking but it
seems to me that the root cause is that the buffer being read at this
point doesn't have its b_offset, the offset in its page, set.  Might
be another little buglet but harmless it seems.

-Eric

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-18  5:23         ` Eric Sandeen
@ 2008-05-18  8:48           ` David Greaves
  2008-05-18 15:38             ` Eric Sandeen
  2008-05-24 13:33             ` RFI for 2.6.25.5 : " David Greaves
  2008-05-19  3:46           ` Timothy Shimmin
  1 sibling, 2 replies; 17+ messages in thread
From: David Greaves @ 2008-05-18  8:48 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: David Chinner, xfs, 'linux-kernel@vger.kernel.org',
	Christoph Hellwig, LinuxRaid

Eric Sandeen wrote:
> Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>
>>> I'll see if I have a little time today to track down the problem.
>>
>> Does this patch fix it for you?  Does for me though I can't yet explain
>> why ;)
>>
>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
>>
>> -Eric
Yes, this fixes it for me - thanks :)

> So what's happening is that xfs is trying to read a page-sized IO from
> the last sector of the log... which goes off the end of the device.
> This looks like another regression introduced by
> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
> in the URL above, which should be headed towards -stable.
Damn, I guess I misread my bisect readings when things crashed then.
Still, I said 'around' :)

> (aside: it seems that this breaks any external log setup where the log
> consists of the entire device... but I'd have expected the xfsqa suite
> to catch this...?)
> 
> The patch avoids the problem by looking for some extra locking but it
> seems to me that the root cause is that the buffer being read at this
> point doesn't have it's b_offset, the offset in it's page, set.  Might
> be another little buglet but harmless it seems.
mmmm
'little buglets' in the filesystem holding a few TB of data...
mmmm
Anything I can do to help find that? I suspect not if you can reproduce it.

Anyhow - thanks again.

David
PS I'll be back soon, back in 2.6.23 I was hitting a hibernate/xfs bug which
I've been avoiding by powering down. Well, it's still there in 2.6.25.3...

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-18  8:48           ` David Greaves
@ 2008-05-18 15:38             ` Eric Sandeen
  2008-05-24 13:33             ` RFI for 2.6.25.5 : " David Greaves
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Sandeen @ 2008-05-18 15:38 UTC (permalink / raw)
  To: David Greaves
  Cc: David Chinner, xfs, 'linux-kernel@vger.kernel.org',
	Christoph Hellwig, LinuxRaid

David Greaves wrote:
> Eric Sandeen wrote:


> mmmm
> 'little buglets' in the filesystem holding a few Tb of data...
> mmmm
> Anything I can do to help find that? I suspect not if you can reproduce it.

Nah.  I'll ask the sgi guys about it, it just seems a little
inconsistent but maybe by design...

-Eric

> Anyhow - thanks again.
> 
> David
> PS I'll be back soon, back in 2.6.23 I was hitting a hibernate/xfs bug which
> I've been avoiding by powering down. Well, it's still there in 2.6.25.3...
> 

* Re: Regression- XFS won't mount on partitioned md array
  2008-05-18  5:23         ` Eric Sandeen
  2008-05-18  8:48           ` David Greaves
@ 2008-05-19  3:46           ` Timothy Shimmin
  1 sibling, 0 replies; 17+ messages in thread
From: Timothy Shimmin @ 2008-05-19  3:46 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: David Greaves, David Chinner, xfs,
	'linux-kernel@vger.kernel.org', Christoph Hellwig, asg-qa

Eric Sandeen wrote:
> Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>
>>> I'll see if I have a little time today to track down the problem.
>>
>> Does this patch fix it for you?  Does for me though I can't yet explain
>> why ;)
>>
>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
>>
>> -Eric
> 
> So what's happening is that xfs is trying to read a page-sized IO from
> the last sector of the log... which goes off the end of the device.
> This looks like another regression introduced by
> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
> in the URL above, which should be headed towards -stable.
> 
> (aside: it seems that this breaks any external log setup where the log
> consists of the entire device... but I'd have expected the xfsqa suite
> to catch this...?)
> 
The only way I can see that we'd catch this (by testing external logs)
in the current qa setup is if one sets up SCRATCH_LOGDEV and/or TEST_LOGDEV
and USE_EXTERNAL.
There are no specific tests that exercise external logs (there is 044, but
it requires the env vars to be set anyway), such as one using a loopback
device for the log. Perhaps we should do this.
I should check that our QA group are setting the vars in some runs.
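
A quick way to audit a run's environment for the condition described above might look like this. It is only a sketch: the variable names come from the message, but the rule that USE_EXTERNAL must equal "yes" and the function name are assumptions, not the QA harness's actual logic.

```python
def external_log_testing_enabled(env):
    """Guess whether a QA run would exercise external logs.

    Assumption: USE_EXTERNAL must be "yes" and at least one of
    SCRATCH_LOGDEV / TEST_LOGDEV must name a device.
    """
    use_external = env.get("USE_EXTERNAL", "").strip().lower() == "yes"
    has_logdev = any(
        env.get(name, "").strip() for name in ("SCRATCH_LOGDEV", "TEST_LOGDEV")
    )
    return use_external and has_logdev


# Hypothetical environments; the device path is a placeholder.
print(external_log_testing_enabled(
    {"USE_EXTERNAL": "yes", "SCRATCH_LOGDEV": "/dev/loop5"}))   # True
print(external_log_testing_enabled({"SCRATCH_LOGDEV": "/dev/loop5"}))  # False
```

In practice the same check could be run against os.environ before starting a QA run.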

--Tim

* RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
  2008-05-18  8:48           ` David Greaves
  2008-05-18 15:38             ` Eric Sandeen
@ 2008-05-24 13:33             ` David Greaves
  2008-05-24 13:52               ` Willy Tarreau
  1 sibling, 1 reply; 17+ messages in thread
From: David Greaves @ 2008-05-24 13:33 UTC (permalink / raw)
  To: Greg KH
  Cc: Eric Sandeen, David Chinner, xfs,
	'linux-kernel@vger.kernel.org', Christoph Hellwig,
	LinuxRaid

Hi Greg
Perusing:
  http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git
doesn't show the patch referenced below as being in the queue for 2.6.25.5.

David

David Greaves wrote:
> Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>> Eric Sandeen wrote:
>>>
>>>> I'll see if I have a little time today to track down the problem.
>>> Does this patch fix it for you?  Does for me though I can't yet explain
>>> why ;)
>>>
>>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
>>>
>>> -Eric
> Yes, this fixes it for me - thanks :)
> 
>> So what's happening is that xfs is trying to read a page-sized IO from
>> the last sector of the log... which goes off the end of the device.
>> This looks like another regression introduced by
>> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
>> in the URL above, which should be headed towards -stable.
> Damn, I guess I misread my bisect readings when things crashed then.
> Still, I said 'around' :)
> 
>> (aside: it seems that this breaks any external log setup where the log
>> consists of the entire device... but I'd have expected the xfsqa suite
>> to catch this...?)
>>
>> The patch avoids the problem by looking for some extra locking but it
>> seems to me that the root cause is that the buffer being read at this
>> point doesn't have it's b_offset, the offset in it's page, set.  Might
>> be another little buglet but harmless it seems.

* Re: RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
  2008-05-24 13:33             ` RFI for 2.6.25.5 : " David Greaves
@ 2008-05-24 13:52               ` Willy Tarreau
  2008-05-24 15:39                 ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: Willy Tarreau @ 2008-05-24 13:52 UTC (permalink / raw)
  To: David Greaves
  Cc: Greg KH, Eric Sandeen, David Chinner, xfs,
	'linux-kernel@vger.kernel.org', Christoph Hellwig,
	LinuxRaid, stable

Hi David,

On Sat, May 24, 2008 at 02:33:35PM +0100, David Greaves wrote:
> Hi Greg
> Perusing:
>   http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git
> doesn't show the patch referenced below as in the queue for 2.6.25.5

First, please avoid top-posting.

> David Greaves wrote:
> > Eric Sandeen wrote:
> >> Eric Sandeen wrote:
> >>> Eric Sandeen wrote:
> >>>
> >>>> I'll see if I have a little time today to track down the problem.
> >>> Does this patch fix it for you?  Does for me though I can't yet explain
> >>> why ;)
> >>>
> >>> http://www.linux.sgi.com/archives/xfs/2008-05/msg00190.html
> >>>
> >>> -Eric
> > Yes, this fixes it for me - thanks :)
> > 
> >> So what's happening is that xfs is trying to read a page-sized IO from
> >> the last sector of the log... which goes off the end of the device.
> >> This looks like another regression introduced by
> >> a9759f2de38a3443d5107bddde03b4f3f550060e, but fixed by Christoph's patch
> >> in the URL above, which should be headed towards -stable.
> > Damn, I guess I misread my bisect readings when things crashed then.
> > Still, I said 'around' :)
> > 
> >> (aside: it seems that this breaks any external log setup where the log
> >> consists of the entire device... but I'd have expected the xfsqa suite
> >> to catch this...?)
> >>
> >> The patch avoids the problem by looking for some extra locking but it
> >> seems to me that the root cause is that the buffer being read at this
> >> point doesn't have it's b_offset, the offset in it's page, set.  Might
> >> be another little buglet but harmless it seems.

It would have helped to CC stable (fixed) and to give the mainline commit
ID since the stable branch only holds already merged patches. Greg, the
commit is 6ab455ee...

Willy

* Re: RFI for 2.6.25.5 : Re: Regression- XFS won't mount on partitioned md array
  2008-05-24 13:52               ` Willy Tarreau
@ 2008-05-24 15:39                 ` Eric Sandeen
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sandeen @ 2008-05-24 15:39 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: David Greaves, Greg KH, David Chinner, xfs,
	'linux-kernel@vger.kernel.org', Christoph Hellwig,
	LinuxRaid, stable

Willy Tarreau wrote:

> It would have helped to CC stable (fixed) and to give the mainline commit
> ID since the stable branch only holds already merged patches. Greg, the
> commit is 6ab455ee...
> 
> Willy

Yup I'll agree that this should probably go to -stable unless hch or
dchinner disagree.

FWIW I've already put it in the Fedora kernels.

-Eric
