* XFS issue under 2.6.25.13 kernel
@ 2008-08-22 10:03 Sławomir Nowakowski
  2008-08-23  1:05 ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread

From: Sławomir Nowakowski @ 2008-08-22 10:03 UTC (permalink / raw)
To: xfs

Dear All,

We have a problem deploying the XFS file system on Linux. The problem
appears after mounting, on a 2.6.25.13 kernel, an XFS file system that
was created on a 2.6.17.13 kernel.

The file system is created on a logical volume as follows:

lvcreate -L 4G volume1 -n test
mkfs.xfs /dev/volume1/test
mount /dev/volume1/test /mnt/x

After mounting it on the 2.6.17.13 kernel, "df -B 1" output looks like this:

/dev/volume1/test 4284481536  147456 4284334080  1% /mnt/x

but on the 2.6.25.13 kernel:

/dev/volume1/test 4284481536 4489216 4279992320  1% /mnt/x

The same happens with the 2.6.26.3 kernel.

As shown, after mounting the volume on the newer kernel the file system
appears smaller. The problem shows up when the volume contains one big
file occupying all available space: after mounting it under the newer
kernel, the file is cut short because the available free space is
smaller.

Is this a known issue, and does a solution or workaround exist?

Thank you in advance for your help.

Best Regards
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-22 10:03 XFS issue under 2.6.25.13 kernel Sławomir Nowakowski
@ 2008-08-23  1:05 ` Dave Chinner
  2008-08-25 11:08   ` Sławomir Nowakowski
  0 siblings, 1 reply; 8+ messages in thread

From: Dave Chinner @ 2008-08-23  1:05 UTC (permalink / raw)
To: Sławomir Nowakowski; +Cc: xfs

On Fri, Aug 22, 2008 at 12:03:40PM +0200, Sławomir Nowakowski wrote:
> Dear All,
>
> We have a problem deploying the XFS file system on Linux. The problem
> appears after mounting, on a 2.6.25.13 kernel, an XFS file system that
> was created on a 2.6.17.13 kernel.
>
> The file system is created on a logical volume as follows:
>
> lvcreate -L 4G volume1 -n test
> mkfs.xfs /dev/volume1/test
> mount /dev/volume1/test /mnt/x
>
> After mounting it on the 2.6.17.13 kernel, "df -B 1" output looks like this:
>
> /dev/volume1/test 4284481536  147456 4284334080  1% /mnt/x
>
> but on the 2.6.25.13 kernel:
>
> /dev/volume1/test 4284481536 4489216 4279992320  1% /mnt/x
>
> The same happens with the 2.6.26.3 kernel.

Yeah, we reserved 4MB of space for unreserved delayed metadata
allocations to allow transactions to succeed when at ENOSPC. That
reservation is accounted as 'used space' to prevent it being used by
data.

> As shown, after mounting the volume on the newer kernel the file system
> appears smaller. The problem shows up when the volume contains one big
> file occupying all available space: after mounting it under the newer
> kernel, the file is cut short because the available free space is
> smaller.

What is on disk will not change - the reservation is purely an
in-memory construct. i.e. if the file already exists then it won't
change on upgrade. Can you show how the file changes just by booting
a different kernel (e.g. ls -l output, md5sums, etc.)?

> Is this a known issue, and does a solution or workaround exist?

$ sudo xfs_io -x -c 'resblks 0' <file in filesystem>

will remove the reservation. This means your filesystem can shut down
or lose data at ENOSPC in certain circumstances....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-23  1:05 ` Dave Chinner
@ 2008-08-25 11:08   ` Sławomir Nowakowski
  2008-08-26  1:41     ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread

From: Sławomir Nowakowski @ 2008-08-25 11:08 UTC (permalink / raw)
To: Sławomir Nowakowski, xfs

2008/8/23 Dave Chinner <david@fromorbit.com>:
> On Fri, Aug 22, 2008 at 12:03:40PM +0200, Sławomir Nowakowski wrote:
>> Dear All,
>>
>> We have a problem deploying the XFS file system on Linux. The problem
>> appears after mounting, on a 2.6.25.13 kernel, an XFS file system that
>> was created on a 2.6.17.13 kernel.
>>
>> The file system is created on a logical volume as follows:
>>
>> lvcreate -L 4G volume1 -n test
>> mkfs.xfs /dev/volume1/test
>> mount /dev/volume1/test /mnt/x
>>
>> After mounting it on the 2.6.17.13 kernel, "df -B 1" output looks like this:
>>
>> /dev/volume1/test 4284481536  147456 4284334080  1% /mnt/x
>>
>> but on the 2.6.25.13 kernel:
>>
>> /dev/volume1/test 4284481536 4489216 4279992320  1% /mnt/x
>>
>> The same happens with the 2.6.26.3 kernel.
>
> Yeah, we reserved 4MB of space for unreserved delayed metadata
> allocations to allow transactions to succeed when at ENOSPC. That
> reservation is accounted as 'used space' to prevent it being used by
> data.

Thank you for the information.

>> As shown, after mounting the volume on the newer kernel the file system
>> appears smaller. The problem shows up when the volume contains one big
>> file occupying all available space: after mounting it under the newer
>> kernel, the file is cut short because the available free space is
>> smaller.
>
> What is on disk will not change - the reservation is purely an
> in-memory construct. i.e. if the file already exists then it won't
> change on upgrade. Can you show how the file changes just by booting
> a different kernel (e.g. ls -l output, md5sums, etc.)?

For the tests we used two kernels: 2.6.17.13 and 2.6.25.13.

We created the following partition map:

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         123      987966   83  Linux
/dev/sda2             124         155      257040   83  Linux
/dev/sda3             156         778     5004247+  83  Linux

Next, under kernel 2.6.17.13, we created an XFS file system on sda3:

# mkfs.xfs /dev/sda3

and mounted it:

# mount /dev/sda3 /mnt/z

Then we created some files:
- one big file called "bigfile", 5109497856 bytes in size
- two small text files called "file1" and "file2"

At this stage it looked as follows:

root@localhost:~# ls -la /mnt/z; df
total 4989773
drwxr-xr-x  2 root root         44 Aug 25 09:35 .
drwxr-xr-x 25 root root       1024 Aug 25 08:30 ..
-rw-r--r--  1 root root 5109497856 Aug 25 08:33 bigfile
-rw-r--r--  1 root root      15132 May 30 15:04 file1
-rw-------  1 root root       7537 Aug  7 15:32 file2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3              4993984   4989916      4068 100% /mnt/z

Then we booted the system with the 2.6.25.13 kernel and checked again:

root@localhost:~# ls -la /mnt/z; df
total 4989773
drwxr-xr-x  2 root root         44 Aug 25 09:35 .
drwxr-xr-x 25 root root       1024 Aug 25 08:30 ..
-rw-r--r--  1 root root 5109497856 Aug 25 08:33 bigfile
-rw-r--r--  1 root root      15132 May 30 15:04 file1
-rw-------  1 root root       7537 Aug  7 15:32 file2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3              4993984   4993984         0 100% /mnt/z

As shown, with the 2.6.25.13 kernel the system reports no free space,
while under the 2.6.17.13 kernel there is 4068 kB of free space.

At this stage, when editing file1 with e.g. mcedit and trying to write
the changes, the system truncates the file to 0 bytes! The same does
not happen when we use:

cat file2 >> file1

In this case the two files are concatenated properly.

>> Is this a known issue, and does a solution or workaround exist?
>
> $ sudo xfs_io -x -c 'resblks 0' <file in filesystem>
>
> will remove the reservation. This means your filesystem can shut down
> or lose data at ENOSPC in certain circumstances....

A question: does using the command

$ sudo xfs_io -x -c 'resblks 0' <file in filesystem>

on a 2.6.25.13 kernel give a higher risk of losing data than on a
2.6.17.13 kernel?

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Thank you very much for your help

Roland
nailman23@gmail.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
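The check Dave asked for ("ls -l output, md5sums, etc.") can be scripted. A minimal sketch, assuming the /mnt/z paths from the test description above; the helper names `record` and `verify` are ours, not existing tools:

```shell
# Record size/permissions and an md5 checksum before rebooting into the
# new kernel, then verify afterwards. If only the *reported free space*
# changed, the checksum still matches and the on-disk file is unchanged.
record() { ls -l "$1" > "$1.stat"; md5sum "$1" > "$1.md5"; }
verify() { md5sum -c "$1.md5" && ls -l "$1"; }

# e.g. before reboot:  record /mnt/z/bigfile
#      after reboot:   verify /mnt/z/bigfile
```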
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-25 11:08 ` Sławomir Nowakowski
@ 2008-08-26  1:41   ` Dave Chinner
  2008-08-26 12:53     ` Sławomir Nowakowski
  0 siblings, 1 reply; 8+ messages in thread

From: Dave Chinner @ 2008-08-26  1:41 UTC (permalink / raw)
To: Sławomir Nowakowski; +Cc: xfs

On Mon, Aug 25, 2008 at 01:08:29PM +0200, Sławomir Nowakowski wrote:
> 2008/8/23 Dave Chinner <david@fromorbit.com>:
> Then we created some files:
> - one big file called "bigfile", 5109497856 bytes in size
> - two small text files called "file1" and "file2"
>
> At this stage it looked as follows:
....
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda3              4993984   4989916      4068 100% /mnt/z
>
> Then we booted the system with the 2.6.25.13 kernel and checked again:
.....
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda3              4993984   4993984         0 100% /mnt/z
>
> As shown, with the 2.6.25.13 kernel the system reports no free space,
> while under the 2.6.17.13 kernel there is 4068 kB of free space.
>
> At this stage, when editing file1 with e.g. mcedit and trying to write
> the changes, the system truncates the file to 0 bytes!

Oh, look, yet another editor that doesn't safely handle ENOSPC and
trashes files when it can't overwrite them. That's not an XFS
problem - I suggest raising a bug against the editor....

>>> Is this a known issue, and does a solution or workaround exist?
>>
>> $ sudo xfs_io -x -c 'resblks 0' <file in filesystem>
>>
>> will remove the reservation. This means your filesystem can shut down
>> or lose data at ENOSPC in certain circumstances....
>
> A question: does using the command
>
> $ sudo xfs_io -x -c 'resblks 0' <file in filesystem>
>
> on a 2.6.25.13 kernel give a higher risk of losing data than on a
> 2.6.17.13 kernel?

Hard to say. If you don't run to ENOSPC then there is no difference.
If you do run to ENOSPC then I think there is a slightly higher risk
of tripping problems on 2.6.25.x because of other ENOSPC fixes that
have been included since 2.6.17.13. The reservation really is a safety
net in that it allows the system to continue without problems in
conditions where it would previously have done a bad thing...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
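The editor failure Dave points at is the classic unsafe in-place save. A hedged sketch of the safe alternative — write to a temporary file and rename over the original only on success; the `save_safely` name is ours, not an existing tool:

```shell
# ENOSPC-safe save: write the new contents (from stdin) to a temporary
# file in the same directory first, and only rename it over the target
# once the write has fully succeeded. If the write fails (e.g. with
# ENOSPC), the original file is left untouched.
save_safely() {
    target=$1
    tmpf=$(mktemp "$(dirname "$target")/.tmp.XXXXXX") || return 1
    if cat > "$tmpf"; then
        mv "$tmpf" "$target"        # rename(2): atomic within one filesystem
    else
        rm -f "$tmpf"; return 1     # original survives the failed write
    fi
}
```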
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-26  1:41 ` Dave Chinner
@ 2008-08-26 12:53   ` Sławomir Nowakowski
  2008-08-27  0:52     ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread

From: Sławomir Nowakowski @ 2008-08-26 12:53 UTC (permalink / raw)
To: Sławomir Nowakowski, xfs

2008/8/26 Dave Chinner <david@fromorbit.com>:
> On Mon, Aug 25, 2008 at 01:08:29PM +0200, Sławomir Nowakowski wrote:
>> 2008/8/23 Dave Chinner <david@fromorbit.com>:
>> Then we created some files:
>> - one big file called "bigfile", 5109497856 bytes in size
>> - two small text files called "file1" and "file2"
>>
>> At this stage it looked as follows:
> ....
>> Filesystem           1K-blocks      Used Available Use% Mounted on
>> /dev/sda3              4993984   4989916      4068 100% /mnt/z
>>
>> Then we booted the system with the 2.6.25.13 kernel and checked again:
> .....
>> Filesystem           1K-blocks      Used Available Use% Mounted on
>> /dev/sda3              4993984   4993984         0 100% /mnt/z
>>
>> As shown, with the 2.6.25.13 kernel the system reports no free space,
>> while under the 2.6.17.13 kernel there is 4068 kB of free space.
>>
>> At this stage, when editing file1 with e.g. mcedit and trying to write
>> the changes, the system truncates the file to 0 bytes!
>
> Oh, look, yet another editor that doesn't safely handle ENOSPC and
> trashes files when it can't overwrite them. That's not an XFS
> problem - I suggest raising a bug against the editor....
>
>>>> Is this a known issue, and does a solution or workaround exist?
>>>
>>> $ sudo xfs_io -x -c 'resblks 0' <file in filesystem>
>>>
>>> will remove the reservation. This means your filesystem can shut down
>>> or lose data at ENOSPC in certain circumstances....
>>
>> A question: does using the command
>>
>> $ sudo xfs_io -x -c 'resblks 0' <file in filesystem>
>>
>> on a 2.6.25.13 kernel give a higher risk of losing data than on a
>> 2.6.17.13 kernel?
>
> Hard to say. If you don't run to ENOSPC then there is no difference.
> If you do run to ENOSPC then I think there is a slightly higher risk
> of tripping problems on 2.6.25.x because of other ENOSPC fixes that
> have been included since 2.6.17.13. The reservation really is a safety
> net in that it allows the system to continue without problems in
> conditions where it would previously have done a bad thing...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Dear Dave,

Can you please take a look at the following outputs of some commands
run under the 2.6.17.13 and 2.6.25.13 kernels?

Here is the situation on the 2.6.17.13 kernel:

xfs_io -x -c 'statfs' /mnt/point

fd.path = "/mnt/sda"
statfs.f_bsize = 4096
statfs.f_blocks = 487416
statfs.f_bavail = 6
statfs.f_files = 160
statfs.f_ffree = 154
geom.bsize = 4096
geom.agcount = 8
geom.agblocks = 61247
geom.datablocks = 489976
geom.rtblocks = 0
geom.rtextents = 0
geom.rtextsize = 1
geom.sunit = 0
geom.swidth = 0
counts.freedata = 6
counts.freertx = 0
counts.freeino = 58
counts.allocino = 64

xfs_io -x -c 'resblks' /mnt/point

reserved blocks = 0
available reserved blocks = 0

xfs_info /mnt/point

meta-data=/dev/sda4          isize=256    agcount=8, agsize=61247 blks
         =                   sectsz=512   attr=0
data     =                   bsize=4096   blocks=489976, imaxpct=25
         =                   sunit=0      swidth=0 blks, unwritten=1
naming   =version 2          bsize=4096
log      =internal           bsize=4096   blocks=2560, version=1
         =                   sectsz=512   sunit=0 blks, lazy-count=0
realtime =none               extsz=4096   blocks=0, rtextents=0

But under the 2.6.25.13 kernel the situation looks different:

xfs_io -x -c 'statfs' /mnt/point:

fd.path = "/mnt/-sda4"
statfs.f_bsize = 4096
statfs.f_blocks = 487416
statfs.f_bavail = 30
statfs.f_files = 544
statfs.f_ffree = 538
geom.bsize = 4096
geom.agcount = 8
geom.agblocks = 61247
geom.datablocks = 489976
geom.rtblocks = 0
geom.rtextents = 0
geom.rtextsize = 1
geom.sunit = 0
geom.swidth = 0
counts.freedata = 30
counts.freertx = 0
counts.freeino = 58
counts.allocino = 64

xfs_io -x -c 'resblks' /mnt/point:

reserved blocks = 18446744073709551586
available reserved blocks = 18446744073709551586

xfs_info /mnt/point

meta-data=/dev/sda4          isize=256    agcount=8, agsize=61247 blks
         =                   sectsz=512   attr=0
data     =                   bsize=4096   blocks=489976, imaxpct=25
         =                   sunit=0      swidth=0 blks, unwritten=1
naming   =version 2          bsize=4096
log      =internal           bsize=4096   blocks=2560, version=1
         =                   sectsz=512   sunit=0 blks
realtime =none               extsz=4096   blocks=0, rtextents=0

As you can easily see, the statfs.f_bavail, statfs.f_files,
statfs.f_ffree and counts.freedata values are different. Can you
explain why?

Also, after applying your suggestion
"xfs_io -x -c 'resblks 0' <file in filesystem>", the command
xfs_io -x -c 'resblks' /mnt/point gives this output:

reserved blocks = 0
available reserved blocks = 18446744073709551586

Is that OK?

Another question: do you have any advice on tuning XFS file systems
that will contain at most 10 files?

Thank you very much for your help! I really appreciate it.

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-26 12:53 ` Sławomir Nowakowski
@ 2008-08-27  0:52   ` Dave Chinner
  2008-08-27 18:09     ` Sławomir Nowakowski
  0 siblings, 1 reply; 8+ messages in thread

From: Dave Chinner @ 2008-08-27  0:52 UTC (permalink / raw)
To: Sławomir Nowakowski; +Cc: xfs

On Tue, Aug 26, 2008 at 02:53:23PM +0200, Sławomir Nowakowski wrote:
> 2008/8/26 Dave Chinner <david@fromorbit.com>:
> run under the 2.6.17.13 and 2.6.25.13 kernels?
>
> Here is the situation on the 2.6.17.13 kernel:
>
> xfs_io -x -c 'statfs' /mnt/point
>
> fd.path = "/mnt/sda"
> statfs.f_bsize = 4096
> statfs.f_blocks = 487416
> statfs.f_bavail = 6
> statfs.f_files = 160
> statfs.f_ffree = 154
> geom.bsize = 4096
> geom.agcount = 8
> geom.agblocks = 61247
> geom.datablocks = 489976
> geom.rtblocks = 0
> geom.rtextents = 0
> geom.rtextsize = 1
> geom.sunit = 0
> geom.swidth = 0
> counts.freedata = 6
> counts.freertx = 0
> counts.freeino = 58
> counts.allocino = 64

The counts.* numbers are the real numbers, not the statfs numbers,
which are somewhat made up - the inode count, for example, is
influenced by the amount of free space....

> xfs_io -x -c 'resblks' /mnt/point
>
> reserved blocks = 0
> available reserved blocks = 0
....
>
> But under the 2.6.25.13 kernel the situation looks different:
>
> xfs_io -x -c 'statfs' /mnt/point:
>
> fd.path = "/mnt/-sda4"
> statfs.f_bsize = 4096
> statfs.f_blocks = 487416
> statfs.f_bavail = 30
> statfs.f_files = 544
> statfs.f_ffree = 538

More free space, therefore more inodes....

> geom.bsize = 4096
> geom.agcount = 8
> geom.agblocks = 61247
> geom.datablocks = 489976
> geom.rtblocks = 0
> geom.rtextents = 0
> geom.rtextsize = 1
> geom.sunit = 0
> geom.swidth = 0
> counts.freedata = 30
> counts.freertx = 0
> counts.freeino = 58
> counts.allocino = 64

but the counts.* values show that the inode counts are the same.
However, the free space is different, partially due to a different
set of ENOSPC deadlock fixes that were done that required different
calculations of space usage....

> xfs_io -x -c 'resblks' /mnt/point:
>
> reserved blocks = 18446744073709551586
> available reserved blocks = 18446744073709551586

Well, that is wrong - that's a large negative number.

FWIW, I can't reproduce this on a pure 2.6.24 kernel on ia32 or a
2.6.27-rc4 kernel on x86_64 UML:

# mount /mnt/xfs2
# df -k /mnt/xfs2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ubd/2             2086912      1176   2085736   1% /mnt/xfs2
# xfs_io -x -c 'resblks 0' /mnt/xfs2
reserved blocks = 0
available reserved blocks = 0
# df -k /mnt/xfs2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ubd/2             2086912       160   2086752   1% /mnt/xfs2
# xfs_io -f -c 'truncate 2g' -c 'resvsp 0 2086720k' /mnt/xfs2/fred
# df -k /mnt/xfs2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ubd/2             2086912   2086880        32 100% /mnt/xfs2
# xfs_io -x -c statfs /mnt/xfs2
fd.path = "/mnt/xfs2"
statfs.f_bsize = 4096
statfs.f_blocks = 521728
statfs.f_bavail = 8
statfs.f_files = 192
statfs.f_ffree = 188
....
counts.freedata = 8
counts.freertx = 0
counts.freeino = 60
counts.allocino = 64
death:/mnt# umount /mnt/xfs2
death:/mnt# mount /mnt/xfs2
# xfs_io -x -c statfs /mnt/xfs2
fd.path = "/mnt/xfs2"
statfs.f_bsize = 4096
statfs.f_blocks = 521728
statfs.f_bavail = 0
statfs.f_files = 64
statfs.f_ffree = 60
....
counts.freedata = 0
counts.freertx = 0
counts.freeino = 60
counts.allocino = 64
# df -k /mnt/xfs2
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ubd/2             2086912   2086912         0 100% /mnt/xfs2
# xfs_io -x -c resblks /mnt/xfs2
reserved blocks = 8
available reserved blocks = 8

Can you produce a metadump of the filesystem image you have produced
on 2.6.17 that results in bad behaviour on later kernels, so I can
see if I can reproduce the same results here? If you've only got a
handful of files the image will be small enough to mail to me....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-27  0:52 ` Dave Chinner
@ 2008-08-27 18:09   ` Sławomir Nowakowski
  2008-08-28  0:20     ` Dave Chinner
  0 siblings, 1 reply; 8+ messages in thread

From: Sławomir Nowakowski @ 2008-08-27 18:09 UTC (permalink / raw)
To: Sławomir Nowakowski, xfs

[-- Attachment #1: Type: text/plain, Size: 4951 bytes --]

Dear Dave,

We really appreciate your help.

In relation to the previous correspondence about the differences
between the 2.6.17.13 and 2.6.25.13 kernels, we'd like to ask some
questions.

We worked from the git repository:

git://git.kernel.org

We have reverted some changes for XFS in the 2.6.25.13 kernel. We used
3 commits:

- 94E1E99F11... (SGI-PV: 964468)
- 4BE536DEBE... (SGI-PV: 955674)
- 4CA488EB4... (SGI-PV: 971186)

With these changes we have created a patch for the 2.6.25.13 kernel.
This patch should eliminate the additional reservation of disk space
in the XFS file system. Our intention was to get similar disk space
reporting between the 2.6.17.13 and 2.6.25.13 kernels.

Does the patch attached to this mail do everything properly? Is it
100% compatible with the XFS API?

If you want anything more from us, just ask. We will deliver it.

Thank you very much for your help.

Roland

2008/8/27, Dave Chinner <david@fromorbit.com>:
> On Tue, Aug 26, 2008 at 02:53:23PM +0200, Sławomir Nowakowski wrote:
> > 2008/8/26 Dave Chinner <david@fromorbit.com>:
> > run under the 2.6.17.13 and 2.6.25.13 kernels?
> >
> > Here is the situation on the 2.6.17.13 kernel:
> >
> > xfs_io -x -c 'statfs' /mnt/point
> >
> > fd.path = "/mnt/sda"
> > statfs.f_bsize = 4096
> > statfs.f_blocks = 487416
> > statfs.f_bavail = 6
> > statfs.f_files = 160
> > statfs.f_ffree = 154
> > geom.bsize = 4096
> > geom.agcount = 8
> > geom.agblocks = 61247
> > geom.datablocks = 489976
> > geom.rtblocks = 0
> > geom.rtextents = 0
> > geom.rtextsize = 1
> > geom.sunit = 0
> > geom.swidth = 0
> > counts.freedata = 6
> > counts.freertx = 0
> > counts.freeino = 58
> > counts.allocino = 64
>
> The counts.* numbers are the real numbers, not the statfs numbers,
> which are somewhat made up - the inode count, for example, is
> influenced by the amount of free space....
>
> > xfs_io -x -c 'resblks' /mnt/point
> >
> > reserved blocks = 0
> > available reserved blocks = 0
> ....
> >
> > But under the 2.6.25.13 kernel the situation looks different:
> >
> > xfs_io -x -c 'statfs' /mnt/point:
> >
> > fd.path = "/mnt/-sda4"
> > statfs.f_bsize = 4096
> > statfs.f_blocks = 487416
> > statfs.f_bavail = 30
> > statfs.f_files = 544
> > statfs.f_ffree = 538
>
> More free space, therefore more inodes....
>
> > geom.bsize = 4096
> > geom.agcount = 8
> > geom.agblocks = 61247
> > geom.datablocks = 489976
> > geom.rtblocks = 0
> > geom.rtextents = 0
> > geom.rtextsize = 1
> > geom.sunit = 0
> > geom.swidth = 0
> > counts.freedata = 30
> > counts.freertx = 0
> > counts.freeino = 58
> > counts.allocino = 64
>
> but the counts.* values show that the inode counts are the same.
> However, the free space is different, partially due to a different
> set of ENOSPC deadlock fixes that were done that required different
> calculations of space usage....
>
> > xfs_io -x -c 'resblks' /mnt/point:
> >
> > reserved blocks = 18446744073709551586
> > available reserved blocks = 18446744073709551586
>
> Well, that is wrong - that's a large negative number.
>
> FWIW, I can't reproduce this on a pure 2.6.24 kernel on ia32 or a
> 2.6.27-rc4 kernel on x86_64 UML:
>
> # mount /mnt/xfs2
> # df -k /mnt/xfs2
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/ubd/2             2086912      1176   2085736   1% /mnt/xfs2
> # xfs_io -x -c 'resblks 0' /mnt/xfs2
> reserved blocks = 0
> available reserved blocks = 0
> # df -k /mnt/xfs2
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/ubd/2             2086912       160   2086752   1% /mnt/xfs2
> # xfs_io -f -c 'truncate 2g' -c 'resvsp 0 2086720k' /mnt/xfs2/fred
> # df -k /mnt/xfs2
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/ubd/2             2086912   2086880        32 100% /mnt/xfs2
> # xfs_io -x -c statfs /mnt/xfs2
> fd.path = "/mnt/xfs2"
> statfs.f_bsize = 4096
> statfs.f_blocks = 521728
> statfs.f_bavail = 8
> statfs.f_files = 192
> statfs.f_ffree = 188
> ....
> counts.freedata = 8
> counts.freertx = 0
> counts.freeino = 60
> counts.allocino = 64
> death:/mnt# umount /mnt/xfs2
> death:/mnt# mount /mnt/xfs2
> # xfs_io -x -c statfs /mnt/xfs2
> fd.path = "/mnt/xfs2"
> statfs.f_bsize = 4096
> statfs.f_blocks = 521728
> statfs.f_bavail = 0
> statfs.f_files = 64
> statfs.f_ffree = 60
> ....
> counts.freedata = 0
> counts.freertx = 0
> counts.freeino = 60
> counts.allocino = 64
> # df -k /mnt/xfs2
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/ubd/2             2086912   2086912         0 100% /mnt/xfs2
> # xfs_io -x -c resblks /mnt/xfs2
> reserved blocks = 8
> available reserved blocks = 8
>
> Can you produce a metadump of the filesystem image you have produced
> on 2.6.17 that results in bad behaviour on later kernels, so I can
> see if I can reproduce the same results here? If you've only got a
> handful of files the image will be small enough to mail to me....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>

[-- Attachment #2: d10.diff.txt --]
[-- Type: text/plain, Size: 5180 bytes --]

diff -rNup xfs/linux-2.6/xfs_super.c xfs/linux-2.6/xfs_super.c
--- xfs/linux-2.6/xfs_super.c	2008-08-25 14:25:11.000000000 +0200
+++ xfs/linux-2.6/xfs_super.c	2008-08-27 11:55:33.000000000 +0200
@@ -61,6 +61,9 @@
 #include <linux/kthread.h>
 #include <linux/freezer.h>
 
+
+#define NO_2618_XFS
+
 static struct quotactl_ops xfs_quotactl_operations;
 static struct super_operations xfs_super_operations;
 static kmem_zone_t *xfs_vnode_zone;
@@ -1187,8 +1190,13 @@ xfs_fs_statfs(
 	statp->f_bsize = sbp->sb_blocksize;
 	lsize = sbp->sb_logstart ? sbp->sb_logblocks : 0;
 	statp->f_blocks = sbp->sb_dblocks - lsize;
+#ifndef NO_2618_XFS
 	statp->f_bfree = statp->f_bavail =
 		sbp->sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+#else
+	statp->f_bfree = statp->f_bavail = sbp->sb_fdblocks;
+#endif
+
 	fakeinos = statp->f_bfree << sbp->sb_inopblog;
 #if XFS_BIG_INUMS
 	fakeinos += mp->m_inoadd;
diff -rNup xfs/xfs_fsops.c xfs/xfs_fsops.c
--- xfs/xfs_fsops.c	2008-08-25 14:25:13.000000000 +0200
+++ xfs/xfs_fsops.c	2008-08-27 11:56:30.000000000 +0200
@@ -46,6 +46,8 @@
 #include "xfs_rw.h"
 #include "xfs_filestream.h"
 
+#define NO_2618_XFS
+
 /*
  * File system operations
  */
@@ -464,7 +466,11 @@ xfs_fs_counts(
 {
 	xfs_icsb_sync_counters_flags(mp, XFS_ICSB_LAZY_COUNT);
 	spin_lock(&mp->m_sb_lock);
+#ifdef NO_2618_XFS
+	cnt->freedata = mp->m_sb.sb_fdblocks;
+#else
 	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+#endif
 	cnt->freertx = mp->m_sb.sb_frextents;
 	cnt->freeino = mp->m_sb.sb_ifree;
 	cnt->allocino = mp->m_sb.sb_icount;
@@ -539,24 +545,42 @@ retry:
 		}
 		mp->m_resblks = request;
 	} else {
+#ifndef NO_2618_XFS
 		__int64_t	free;
 
 		free = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 		if (!free)
 			goto out; /* ENOSPC and fdblks_delta = 0 */
-
+
+#endif
 		delta = request - mp->m_resblks;
+
+#ifndef NO_2618_XFS
 		lcounter = free - delta;
+#else
+		lcounter = mp->m_sb.sb_fdblocks - delta;
+#endif
 		if (lcounter < 0) {
 			/* We can't satisfy the request, just get what we can */
+#ifndef NO_2618_XFS
 			mp->m_resblks += free;
 			mp->m_resblks_avail += free;
 			fdblks_delta = -free;
 			mp->m_sb.sb_fdblocks = XFS_ALLOC_SET_ASIDE(mp);
+#else
+			mp->m_resblks += mp->m_sb.sb_fdblocks;
+			mp->m_resblks_avail += mp->m_sb.sb_fdblocks;
+			fdblks_delta = -mp->m_sb.sb_fdblocks;
+			mp->m_sb.sb_fdblocks = 0;
+#endif
 		} else {
 			fdblks_delta = -delta;
+#ifndef NO_2618_XFS
 			mp->m_sb.sb_fdblocks =
 				lcounter + XFS_ALLOC_SET_ASIDE(mp);
+#else
+			mp->m_sb.sb_fdblocks = lcounter;
+#endif
 			mp->m_resblks = request;
 			mp->m_resblks_avail += delta;
 		}
diff -rNup xfs/xfs_mount.c xfs/xfs_mount.c
--- xfs/xfs_mount.c	2008-08-25 14:25:14.000000000 +0200
+++ xfs/xfs_mount.c	2008-08-27 14:42:02.000000000 +0200
@@ -44,6 +44,9 @@
 #include "xfs_quota.h"
 #include "xfs_fsops.h"
 
+
+#define NO_2618_XFS
+
 STATIC void	xfs_mount_log_sb(xfs_mount_t *, __int64_t);
 STATIC int	xfs_uuid_mount(xfs_mount_t *);
 STATIC void	xfs_uuid_unmount(xfs_mount_t *mp);
@@ -1525,6 +1528,11 @@ xfs_mod_sb(xfs_trans_t *tp, __int64_t fi
  *
  * The m_sb_lock must be held when this routine is called.
  */
+
+#ifdef NO_2618_XFS
+ #define SET_ASIDE_BLOCKS 8
+#endif
+
 int
 xfs_mod_incore_sb_unlocked(
 	xfs_mount_t	*mp,
@@ -1562,8 +1570,12 @@ xfs_mod_incore_sb_unlocked(
 		mp->m_sb.sb_ifree = lcounter;
 		return 0;
 	case XFS_SBS_FDBLOCKS:
+#ifndef NO_2618_XFS
 		lcounter = (long long)
 			mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+#else
+		lcounter = (long long)mp->m_sb.sb_fdblocks - SET_ASIDE_BLOCKS;
+#endif
 		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
 		if (delta > 0) {		/* Putting blocks back */
@@ -1596,8 +1608,11 @@ xfs_mod_incore_sb_unlocked(
 				}
 			}
 		}
-
+#ifndef NO_2618_XFS
 		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
+#else
+		mp->m_sb.sb_fdblocks = lcounter + SET_ASIDE_BLOCKS;
+#endif
 		return 0;
 	case XFS_SBS_FREXTENTS:
 		lcounter = (long long)mp->m_sb.sb_frextents;
@@ -2321,8 +2336,14 @@ xfs_icsb_sync_counters(
  */
 #define XFS_ICSB_INO_CNTR_REENABLE	(uint64_t)64
 
+
+#ifndef NO_2618_XFS
 #define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \
 	(uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp))
+#else
+#define XFS_ICSB_FDBLK_CNTR_REENABLE	512
+#endif
+
 STATIC void
 xfs_icsb_balance_counter(
 	xfs_mount_t	*mp,
@@ -2357,7 +2378,11 @@ xfs_icsb_balance_counter(
 	case XFS_SBS_FDBLOCKS:
 		count = mp->m_sb.sb_fdblocks;
 		resid = do_div(count, weight);
+#ifndef NO_2618_XFS
 		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp)))
+#else
+		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE))
+#endif
 			goto out;
 		break;
 	default:
@@ -2418,12 +2443,19 @@ again:
 	case XFS_SBS_FDBLOCKS:
 		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
-
+#ifndef NO_2618_XFS
 		lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+#else
+		lcounter = icsbp->icsb_fdblocks;
+#endif
 		lcounter += delta;
 		if (unlikely(lcounter < 0))
 			goto balance_counter;
+#ifndef NO_2618_XFS
 		icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
+#else
+		icsbp->icsb_fdblocks = lcounter;
+#endif
 		break;
 	default:
 		BUG();

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: XFS issue under 2.6.25.13 kernel
  2008-08-27 18:09 ` Sławomir Nowakowski
@ 2008-08-28  0:20   ` Dave Chinner
  0 siblings, 0 replies; 8+ messages in thread

From: Dave Chinner @ 2008-08-28  0:20 UTC (permalink / raw)
To: Sławomir Nowakowski; +Cc: xfs

On Wed, Aug 27, 2008 at 08:09:18PM +0200, Sławomir Nowakowski wrote:
> Dear Dave,
>
> We really appreciate your help.
>
> In relation to the previous correspondence about the differences
> between the 2.6.17.13 and 2.6.25.13 kernels, we'd like to ask some
> questions.
>
> We worked from the git repository:
>
> git://git.kernel.org
>
> We have reverted some changes for XFS in the 2.6.25.13 kernel. We used
> 3 commits:
>
> - 94E1E99F11... (SGI-PV: 964468)
> - 4BE536DEBE... (SGI-PV: 955674)
> - 4CA488EB4... (SGI-PV: 971186)
>
> With these changes we have created a patch for the 2.6.25.13 kernel.
> This patch should eliminate the additional reservation of disk space
> in the XFS file system. Our intention was to get similar disk space
> reporting between the 2.6.17.13 and 2.6.25.13 kernels.

After removing the reservation with xfs_io (the big difference), I
don't see why you need to hack the kernel as well. Have you got so
little margin in your filesystem provisioning that you can't spare 4
blocks per AG?

> Does the patch attached to this mail do everything properly?

Don't know. You've taken away a bunch of reserved blocks other code
relies on existing for correct operation at ENOSPC. Given that you
are doing this because you are running so close to ENOSPC, there's a
good chance that you've broken something. I don't have the time (or
the desire) to analyse the impact of the changes being made, but I
bet that the XFSQA tests that exercise behaviour at ENOSPC will start
to deadlock again...

> Is it 100% compatible with the XFS API?

You've changed statfs. You'll have to make sure it reports the
correct thing in all cases (there's an XFSQA test for this).

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-08-28  0:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2008-08-22 10:03 XFS issue under 2.6.25.13 kernel Sławomir Nowakowski
2008-08-23  1:05 ` Dave Chinner
2008-08-25 11:08   ` Sławomir Nowakowski
2008-08-26  1:41     ` Dave Chinner
2008-08-26 12:53       ` Sławomir Nowakowski
2008-08-27  0:52         ` Dave Chinner
2008-08-27 18:09           ` Sławomir Nowakowski
2008-08-28  0:20             ` Dave Chinner