* Samba speed
@ 2008-12-08 18:21 Jeremy Allison
2008-12-08 22:39 ` Theodore Tso
0 siblings, 1 reply; 10+ messages in thread
From: Jeremy Allison @ 2008-12-08 18:21 UTC (permalink / raw)
To: samba-technical; +Cc: linux-fsdevel, linux-cifs-client
Here's a really interesting paper from Intel
that they recently brought to my attention.
http://software.intel.com/en-us/articles/windows-client-cifs-behavior-can-slow-linux-nas-performance
Looks like using XFS for your Linux Samba
server, or setting "strict allocate = yes" can make
a big difference due to sparse file issues.
Comments welcome !
Jeremy.
* Re: Samba speed
2008-12-08 18:21 Samba speed Jeremy Allison
@ 2008-12-08 22:39 ` Theodore Tso
2008-12-08 23:12 ` Jeremy Allison
0 siblings, 1 reply; 10+ messages in thread
From: Theodore Tso @ 2008-12-08 22:39 UTC (permalink / raw)
To: Jeremy Allison; +Cc: samba-technical, linux-fsdevel, linux-cifs-client
On Mon, Dec 08, 2008 at 10:21:14AM -0800, Jeremy Allison wrote:
> Here's a really interesting paper from Intel
> that they recently brought to my attention.
>
> http://software.intel.com/en-us/articles/windows-client-cifs-behavior-can-slow-linux-nas-performance
>
> Looks like using XFS for your Linux Samba
> server, or setting "strict allocate = yes" can make
> a big difference due to sparse file issues.
Glibc 2.7 (as shipped in Ubuntu Hardy) has posix_fallocate wired up to
the fallocate system call, and ext4 supports delayed allocation as
well as preallocation. There are a number of userspace applications ---
rsync, samba, and most bittorrent applications come to mind --- where
use of fallocate would be a big win.
- Ted
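As a rough illustration of the preallocation Ted mentions, the sketch below reserves space for a file before any data is written. It uses Python's `os.posix_fallocate`, which wraps posix_fallocate(3); the helper name `preallocate` and the file name are hypothetical, and this is not how Samba or rsync actually implement it.

```python
import os
import tempfile


def preallocate(path: str, size: int) -> int:
    """Reserve `size` bytes for `path` up front and return the file size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Wraps posix_fallocate(3). Where the filesystem has no native
        # support, glibc may fall back to touching every block instead.
        os.posix_fallocate(fd, 0, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Hypothetical target file; a server would do this on the share.
        print(preallocate(os.path.join(d, "incoming.bin"), 1 << 20))
```

The point of the design is that the filesystem learns the final size in one call, so it can pick a contiguous region instead of guessing from a stream of small extending writes.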
* Re: Samba speed
2008-12-08 22:39 ` Theodore Tso
@ 2008-12-08 23:12 ` Jeremy Allison
2008-12-08 23:38 ` Theodore Tso
0 siblings, 1 reply; 10+ messages in thread
From: Jeremy Allison @ 2008-12-08 23:12 UTC (permalink / raw)
To: Theodore Tso
Cc: Jeremy Allison, samba-technical, linux-fsdevel, linux-cifs-client
On Mon, Dec 08, 2008 at 05:39:24PM -0500, Theodore Tso wrote:
> On Mon, Dec 08, 2008 at 10:21:14AM -0800, Jeremy Allison wrote:
> > Here's a really interesting paper from Intel
> > that they recently brought to my attention.
> >
> > http://software.intel.com/en-us/articles/windows-client-cifs-behavior-can-slow-linux-nas-performance
> >
> > Looks like using XFS for your Linux Samba
> > server, or setting "strict allocate = yes" can make
> > a big difference due to sparse file issues.
>
> Glibc 2.7 (as shipped in Ubuntu Hardy) has posix_fallocate wired up to
> the fallocate system call, and ext4 supports delayed allocation as
> well as preallocation. There are a number of userspace applications ---
> rsync, samba, and most bittorrent applications come to mind --- where
> use of fallocate would be a big win.
Turns out that ext4 doesn't suffer from the slowdown in the
first place. The paper is extremely interesting; I'm looking
at the implications for our default settings (most users
are still using Samba on ext3 on Linux).
Jeremy.
* Re: Samba speed
2008-12-08 23:12 ` Jeremy Allison
@ 2008-12-08 23:38 ` Theodore Tso
2008-12-09 0:37 ` Andreas Dilger
0 siblings, 1 reply; 10+ messages in thread
From: Theodore Tso @ 2008-12-08 23:38 UTC (permalink / raw)
To: Jeremy Allison; +Cc: samba-technical, linux-fsdevel, linux-cifs-client
On Mon, Dec 08, 2008 at 03:12:33PM -0800, Jeremy Allison wrote:
>
> Turns out that ext4 doesn't suffer from the slowdown in the
> first place. The paper is extremely interesting; I'm looking
> at the implications for our default settings (most users
> are still using Samba on ext3 on Linux).
I thought the paper only talked about ext3, and theorized that delayed
allocation in ext4 might be enough to make the problem go away, but
they had not actually done any measurements to confirm this
supposition. Have there been any more recent benchmarks comparing
ext3, ext4, and XFS running Samba serving Windows clients?
- Ted
P.S. I'll be on the Google campus tomorrow and Wednesday attending
the Ubuntu developer's conference; we should get together for lunch or
dinner or some such....
* Re: Samba speed
2008-12-08 23:38 ` Theodore Tso
@ 2008-12-09 0:37 ` Andreas Dilger
2008-12-09 6:06 ` Theodore Tso
0 siblings, 1 reply; 10+ messages in thread
From: Andreas Dilger @ 2008-12-09 0:37 UTC (permalink / raw)
To: Theodore Tso
Cc: Jeremy Allison, samba-technical, linux-fsdevel, linux-cifs-client
On Dec 08, 2008 18:38 -0500, Theodore Ts'o wrote:
> On Mon, Dec 08, 2008 at 03:12:33PM -0800, Jeremy Allison wrote:
> >
> > Turns out that ext4 doesn't suffer from the slowdown in the
> > first place. The paper is extremely interesting; I'm looking
> > at the implications for our default settings (most users
> > are still using Samba on ext3 on Linux).
>
> I thought the paper only talked about ext3, and theorized that delayed
> allocation in ext4 might be enough to make the problem go away, but
> they had not actually done any measurements to confirm this
> supposition. Have there been any more recent benchmarks comparing
> ext3, ext4, and XFS running Samba serving Windows clients?
It wouldn't be a bad idea to use this hint in the kernel to call
fallocate(), given the fact that this is used by a number of apps
(i.e. all of them) that predate fallocate().
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
* Re: Samba speed
2008-12-09 0:37 ` Andreas Dilger
@ 2008-12-09 6:06 ` Theodore Tso
2008-12-09 6:25 ` ronnie sahlberg
0 siblings, 1 reply; 10+ messages in thread
From: Theodore Tso @ 2008-12-09 6:06 UTC (permalink / raw)
To: Andreas Dilger
Cc: Jeremy Allison, samba-technical, linux-fsdevel, linux-cifs-client
On Mon, Dec 08, 2008 at 04:37:01PM -0800, Andreas Dilger wrote:
> On Dec 08, 2008 18:38 -0500, Theodore Ts'o wrote:
> > On Mon, Dec 08, 2008 at 03:12:33PM -0800, Jeremy Allison wrote:
> > >
> > > Turns out that ext4 doesn't suffer from the slowdown in the
> > > > first place. The paper is extremely interesting; I'm looking
> > > at the implications for our default settings (most users
> > > are still using Samba on ext3 on Linux).
> >
> > I thought the paper only talked about ext3, and theorized that delayed
> > allocation in ext4 might be enough to make the problem go away, but
> > they had not actually done any measurements to confirm this
> > supposition. Have there been any more recent benchmarks comparing
> > ext3, ext4, and XFS running Samba serving Windows clients?
>
> It wouldn't be a bad idea to use this hint in the kernel to call
> fallocate(), given the fact that this is used by a number of apps
> (i.e. all of them) that predate fallocate().
What, a one byte write that extends a file should be translated into
an fallocate()? How.... crude. The question is, do we really want to
be encouraging Microsoft in that way? :-)
Also, as it turns out, Microsoft is only doing this every 128k (i.e.,
touch one byte 128k after the end of the file, then write 128k of
data, then write another 1 byte of garbage 128k past the end of the
file, etc.), so ext4's delayed allocation algorithms seem to be able
to handle things just fine.
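The write pattern described above can be approximated locally to see what the filesystem is asked to do. This is only a sketch under assumptions: the marker byte is placed at the last offset of the next stride (one reading of "128k after the end of the file"), the stride is fixed at 128k, and `redirector_style_write` is a hypothetical name, not anything from Samba or Windows.

```python
import os
import tempfile

STRIDE = 128 * 1024  # assumed stride; the real client may use other sizes


def redirector_style_write(path: str, chunks: int, stride: int = STRIDE) -> int:
    """Write `chunks` strides, each preceded by a 1-byte extending write."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for i in range(chunks):
            start = i * stride
            # One-byte write far past the current end of file: this extends
            # the file and leaves an unallocated hole behind the byte.
            os.pwrite(fd, b"\0", start + stride - 1)
            # The real data then fills the hole, overwriting the marker.
            os.pwrite(fd, b"x" * stride, start)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(redirector_style_write(os.path.join(d, "copy.bin"), 4))
```

On a filesystem that allocates blocks immediately, each marker write forces an allocation decision before the data for the hole has arrived, which is the situation delayed allocation avoids.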
I also suspect that if someone tried recompiling a kernel changing the
value of EXT3_DEFAULT_RESERVE_BLOCKS from 8 to 32, or changing Samba
to use the EXT3_IOC_SETRSVSZ ioctl immediately after opening a file
for writing to set the block allocation reservation size for that
inode to 32 blocks (128k), this might also be enough of a kludge to solve
most of the performance problems of Samba running on ext3 versus a
Windows XP client. If someone *does* manage to try this experiment,
please let us know if it works...
- Ted
* Re: Samba speed
2008-12-09 6:06 ` Theodore Tso
@ 2008-12-09 6:25 ` ronnie sahlberg
2008-12-09 6:55 ` Theodore Tso
0 siblings, 1 reply; 10+ messages in thread
From: ronnie sahlberg @ 2008-12-09 6:25 UTC (permalink / raw)
To: Theodore Tso
Cc: linux-fsdevel, Andreas Dilger, linux-cifs-client, Jeremy Allison,
samba-technical
On Tue, Dec 9, 2008 at 5:06 PM, Theodore Tso <tytso@mit.edu> wrote:
> On Mon, Dec 08, 2008 at 04:37:01PM -0800, Andreas Dilger wrote:
>> On Dec 08, 2008 18:38 -0500, Theodore Ts'o wrote:
>> > On Mon, Dec 08, 2008 at 03:12:33PM -0800, Jeremy Allison wrote:
>> > >
>> > > Turns out that ext4 doesn't suffer from the slowdown in the
>> > > first place. The paper is extremely interesting; I'm looking
>> > > at the implications for our default settings (most users
>> > > are still using Samba on ext3 on Linux).
>> >
>> > I thought the paper only talked about ext3, and theorized that delayed
>> > allocation in ext4 might be enough to make the problem go away, but
>> > they had not actually done any measurements to confirm this
>> > supposition. Have there been any more recent benchmarks comparing
>> > ext3, ext4, and XFS running Samba serving Windows clients?
>>
>> It wouldn't be a bad idea to use this hint in the kernel to call
>> fallocate(), given the fact that this is used by a number of apps
>> (i.e. all of them) that predate fallocate().
>
> What, a one byte write that extends a file should be translated into
> an fallocate()? How.... crude. The question is, do we really want to
> be encouraging Microsoft in that way? :-)
>
> Also, as it turns out, Microsoft is only doing this every 128k (i.e.,
> touch one byte 128k after the end of the file, then write 128k of
> data, then write another 1 byte of garbage 128k past the end of the
> file, etc.), so ext4's delayed allocation algorithms seem to be able
> to handle things just fine.
>
> I also suspect that if someone tried recompiling a kernel changing the
> value of EXT3_DEFAULT_RESERVE_BLOCKS from 8 to 32, or changing Samba
> to use the EXT3_IOC_SETRSVSZ ioctl immediately after opening a file
> for writing to set the block allocation reservation size for that
> inode to 32 blocks (128k), this might also be enough of a kludge to solve
> most of the performance problems of Samba running on ext3 versus a
> Windows XP client. If someone *does* manage to try this experiment,
> please let us know if it works...
>
> - Ted
>
It's not as simple as "the redirector does it every 128k". The
redirector does this, but it varies from run to run and from client to
client.
It is very common to see this happening in 60-64 KB strides and other
strides as well.
It is probably some interaction between the size of the actual I/O that
the application did internally to the cache and something else.
But anyway, it varies a lot; it is not always 128k.
* Re: Samba speed
2008-12-09 6:25 ` ronnie sahlberg
@ 2008-12-09 6:55 ` Theodore Tso
2008-12-09 7:50 ` Volker Lendecke
0 siblings, 1 reply; 10+ messages in thread
From: Theodore Tso @ 2008-12-09 6:55 UTC (permalink / raw)
To: ronnie sahlberg
Cc: Andreas Dilger, linux-fsdevel, linux-cifs-client, Jeremy Allison,
samba-technical
On Tue, Dec 09, 2008 at 05:25:29PM +1100, ronnie sahlberg wrote:
>
> It's not as simple as "the redirector does it every 128k". the
> redirector does this but it varies from run to run and from client to
> client.
> It is very common to see this happening in 60-64kb strides and other
> strides as well.
>
> It is probably some interaction with how large the actual i/o that the
> application did internally to the cache and some other thing.
> but anyway, it varies a lot. it is not always 128k.
Is there a maximum stride size used by the redirector? i.e., will it
use something bigger than 128k? In any case, increasing the ext3
reservation window size should still be helpful. It's OK if we
increase it to 32 blocks (128k, on a 4k block filesystem), and the
stride size is smaller than that. But if it is often bigger than
128k, then it would probably be better if samba used the
EXT3_IOC_SETRSVSZ ioctl to dynamically set the reservation size as
appropriate.
- Ted
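The block arithmetic behind "32 blocks (128k, on a 4k block filesystem)" generalises to other strides as a ceiling division. The sketch below is illustrative only; `reservation_blocks` is a hypothetical helper, and the 60k and 64k inputs are the stride sizes reported earlier in the thread.

```python
def reservation_blocks(stride_bytes: int, block_size: int = 4096) -> int:
    """Smallest whole number of blocks that covers one stride."""
    return -(-stride_bytes // block_size)  # ceiling division


if __name__ == "__main__":
    for stride_kb in (60, 64, 128):
        print(stride_kb, reservation_blocks(stride_kb * 1024))
```

With 4k blocks this gives 15, 16 and 32 blocks respectively, so a fixed 32-block window covers all three reported strides, while anything larger than 128k would need the window set per file.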
* Re: Samba speed
2008-12-09 6:55 ` Theodore Tso
@ 2008-12-09 7:50 ` Volker Lendecke
2008-12-09 15:40 ` Richard Sharpe
0 siblings, 1 reply; 10+ messages in thread
From: Volker Lendecke @ 2008-12-09 7:50 UTC (permalink / raw)
To: Theodore Tso
Cc: Andreas Dilger, samba-technical, linux-fsdevel, linux-cifs-client,
Jeremy Allison
On Tue, Dec 09, 2008 at 01:55:09AM -0500, Theodore Tso wrote:
> On Tue, Dec 09, 2008 at 05:25:29PM +1100, ronnie sahlberg wrote:
> >
> > It's not as simple as "the redirector does it every 128k". the
> > redirector does this but it varies from run to run and from client to
> > client.
> > It is very common to see this happening in 60-64kb strides and other
> > strides as well.
> >
> > It is probably some interaction with how large the actual i/o that the
> > application did internally to the cache and some other thing.
> > but anyway, it varies a lot. it is not always 128k.
>
> Is there a maximum stride size used by the redirector? i.e., will it
> use something bigger than 128k? In any case, increasing the ext3
> reservation window size should still be helpful. It's OK if we
> increase it to 32 blocks (128k, on a 4k block filesystem), and the
> stride size is smaller than that. But if it is often bigger than
> 128k, then it would probably be better if samba used the
> EXT3_IOC_SETRSVSZ ioctl to dynamically set the reservation size as
> appropriate.
One might try to use "dd" from Cygwin on Windows. When I
once analyzed this behaviour, the 1-byte writes were exactly
at the end of the block the Win32 app gave to the kernel as
seen by the sysinternals filemon tool.
Volker
* Re: Samba speed
2008-12-09 7:50 ` Volker Lendecke
@ 2008-12-09 15:40 ` Richard Sharpe
0 siblings, 0 replies; 10+ messages in thread
From: Richard Sharpe @ 2008-12-09 15:40 UTC (permalink / raw)
To: Volker Lendecke
Cc: Theodore Tso, Andreas Dilger, samba-technical, linux-fsdevel,
linux-cifs-client, Jeremy Allison
On Tue, 9 Dec 2008, Volker Lendecke wrote:
>> Is there a maximum stride size used by the redirector? i.e., will it
>> use something bigger than 128k? In any case, increasing the ext3
>> reservation window size should still be helpful. It's OK if we
>> increase it to 32 blocks (128k, on a 4k block filesystem), and the
>> stride size is smaller than that. But if it is often bigger than
>> 128k, then it would probably be better if samba used the
>> EXT3_IOC_SETRSVSZ ioctl to dynamically set the reservation size as
>> appropriate.
>
> One might try to use "dd" from Cygwin on Windows. When I
> once analyzed this behaviour, the 1-byte writes were exactly
> at the end of the block the Win32 app gave to the kernel as
> seen by the sysinternals filemon tool.
Hmmm, I always thought the network behavior was to prevent the app from
being given an ENOSPC error when it closed the file and the OS tried to
flush the cache only to find that there was no space remaining on the
remote file system to actually perform the writes the local cache
accepted.
However, this suggests that the behavior is the same for local and remote
file systems.
I do believe, as Ronnie said and as your evidence suggests, that these
writes are at the end of the IO written into the cache by the app.
Regards
-------
Richard Sharpe, rsharpe[at]richardsharpe.com, rsharpe[at]samba.org,
sharpe[at]ethereal.com, http://www.richardsharpe.com