public inbox for linux-ext4@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Gernot Hillier <gernot.hillier@siemens.com>
Cc: Theodore Ts'o <tytso@mit.edu>,
	linux-scsi@vger.kernel.org, MPT-FusionLinux.pdl@broadcom.com,
	linux-ext4@vger.kernel.org, sathya.prakash@broadcom.com,
	chaitra.basappa@broadcom.com,
	suganath-prabu.subramani@broadcom.com
Subject: Re: unexpected sync delays in dpkg for small pre-allocated files on ext4
Date: Tue, 31 May 2016 10:21:52 +1000
Message-ID: <20160531002152.GQ26977@dastard>
In-Reply-To: <574BF988.5050202@siemens.com>

On Mon, May 30, 2016 at 10:27:52AM +0200, Gernot Hillier wrote:
> Hi!
> 
> On 25.05.2016 01:13, Theodore Ts'o wrote:
> > On Tue, May 24, 2016 at 07:07:41PM +0200, Gernot Hillier wrote:
> >> We experience strange delays with kernel 4.1.18 during dpkg
> >> package installation on an ext4 filesystem after switching from
> >> Ubuntu 14.04 to 16.04. We can reproduce the issue with kernel 4.6.
> >> Installation of the same package takes 2s with ext3 and 31s with
> >> ext4 on the same partition.
> >>
> >> Hardware is an Intel-based server with Supermicro X8DTH board and
> >> Seagate ST973451SS disks connected to an LSI SAS2008 controller (PCI
> >> 0x1000:0x0072, mpt2sas driver).
> [...]
> >> To me, the problem looks comparable to
> >> https://bugzilla.kernel.org/show_bug.cgi?id=56821 (even if we don't see
> >> a full hang and there's no RAID involved for us), so a closer look on
> >> the SCSI layer or driver might be the next step?
> > 
> > What I would suggest is to create a small test case which compares the
> > time it takes to allocate 1 megabyte of memory, zero it, and then
> > write one megabytes of zeros using the write(2) system call.  Then try
> > writing one megabytes of zero using the BLKZEROOUT ioctl.
> 
> Ok, this is my test code:
> 
> 	const int SIZE = 1*1024*1024;
> 	char* buffer = malloc(SIZE);
> 	uint64_t range[2] = { 0, SIZE };
> 	int fd = open("/dev/sdb2", O_WRONLY);
> 
> 	bzero(buffer, SIZE);
> 	write(fd, buffer, SIZE);
> 	sync_file_range(fd, 0, 0, 2);
> 
> 	ioctl (fd, BLKZEROOUT, range);
> 
> 	close(fd);
> 	free(buffer);
> 
> # strace -tt ./test-tytso
> [...]
> 15:46:27.481636 open("/dev/sdb2", O_WRONLY) = 3
> 15:46:27.482004 write(3, "\0\0\0\0\0\0"..., 1048576) = 1048576
> 15:46:27.482438 sync_file_range(3, 0, 0, SYNC_FILE_RANGE_WRITE) = 0
> 15:46:27.482698 ioctl(3, BLKZEROOUT, [0, 100000]) = 0
> 15:46:27.546971 close(3)                = 0
> 
> So the write() and sync_file_range() in the first case takes ~400 us
> each while BLKZEROOUT takes... 60 ms. Wow.

Comparing apples to oranges.

Contrary to what the name implies, sync_file_range() does not provide
any data integrity semantics whatsoever: SYNC_FILE_RANGE_WRITE only
submits IO to clean dirty pages, and that only takes 400us of CPU
time.  It does not wait for completion, nor does it flush the drive
cache, so by the time the syscall returns to userspace the IO may not
even have been sent to the device (e.g. it could be queued by the IO
scheduler in the block layer). i.e. you're not timing IO, you're
timing the CPU overhead of IO submission.
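
To make the distinction concrete, here is a rough sketch of the two
ways sync_file_range() can be called. The helper names submit_only()
and submit_and_wait() are made up for illustration. Note that even the
waiting variant issues no drive cache flush and writes no metadata, so
neither is a data integrity operation.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: start writeback of dirty pages in [0, len) and
 * return without waiting. This only measures IO submission cost.
 */
static int submit_only(int fd, off_t len)
{
	return sync_file_range(fd, 0, len, SYNC_FILE_RANGE_WRITE);
}

/*
 * Hypothetical helper: start writeback and wait for those pages to
 * complete. No drive cache flush is issued and no metadata is written,
 * so this still gives no guarantee that the data survives a crash.
 */
static int submit_and_wait(int fd, off_t len)
{
	return sync_file_range(fd, 0, len,
			       SYNC_FILE_RANGE_WAIT_BEFORE |
			       SYNC_FILE_RANGE_WRITE |
			       SYNC_FILE_RANGE_WAIT_AFTER);
}
```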

For an apples to apples comparison, you need to use fsync() or
fdatasync() to physically force the written data to stable storage
and wait for completion. This is what BLKZEROOUT is effectively doing,
so I think you'll find fdatasync() also takes around 60ms...
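
Something along these lines would give comparable numbers. It is only
a sketch: the helper names are invented here, error handling is
minimal, and both helpers overwrite the start of whatever fd points
at, so only run them against a scratch device. Each returns elapsed
milliseconds, or a negative value on failure.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* BLKZEROOUT */
#include <time.h>
#include <unistd.h>

static double elapsed_ms(const struct timespec *a, const struct timespec *b)
{
	return (b->tv_sec - a->tv_sec) * 1e3 + (b->tv_nsec - a->tv_nsec) / 1e6;
}

/*
 * Write len zero bytes at offset 0, then fdatasync() so the timing
 * covers IO completion and the cache flush, not just submission.
 */
static double time_write_fdatasync(int fd, size_t len)
{
	struct timespec t0, t1;
	char *buf = calloc(1, len);
	double ms = -1.0;

	if (!buf)
		return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (pwrite(fd, buf, len, 0) == (ssize_t)len && fdatasync(fd) == 0) {
		clock_gettime(CLOCK_MONOTONIC, &t1);
		ms = elapsed_ms(&t0, &t1);
	}
	free(buf);
	return ms;
}

/* Zero the first len bytes via BLKZEROOUT. Block devices only. */
static double time_blkzeroout(int fd, uint64_t len)
{
	struct timespec t0, t1;
	uint64_t range[2] = { 0, len };	/* { start, length } in bytes */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (ioctl(fd, BLKZEROOUT, range) != 0)
		return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return elapsed_ms(&t0, &t1);
}
```

Open the device with O_WRONLY as in the test above, call both helpers
with a 1MB length, and compare the two results.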

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 12+ messages
2016-05-24 17:07 unexpected sync delays in dpkg for small pre-allocated files on ext4 Gernot Hillier
2016-05-24 23:13 ` Theodore Ts'o
2016-05-30  8:27   ` Gernot Hillier
2016-05-31  0:21     ` Dave Chinner [this message]
2016-06-01  9:44       ` Gernot Hillier
2016-06-01 13:17         ` Gernot Hillier
2016-06-01 14:12           ` Theodore Ts'o
2016-06-02 16:23             ` Gernot Hillier
2016-07-13 13:57             ` Gernot Hillier
2016-05-26  2:20 ` Dave Chinner
2016-05-26  7:02   ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2016-05-30 19:04 Jun He
