From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Andreas Dilger <adilger@dilger.ca>, Jens Axboe <axboe@kernel.dk>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Mingming Cao <mcao@us.ibm.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [RFC] block integrity: Fix write after checksum calculation problem
Date: Wed, 23 Feb 2011 09:53:51 +1100
Message-ID: <20110222225351.GG3166@dastard>
In-Reply-To: <20110222194538.GU27190@tux1.beaverton.ibm.com>

On Tue, Feb 22, 2011 at 11:45:38AM -0800, Darrick J. Wong wrote:
> On Tue, Feb 22, 2011 at 09:13:49AM -0700, Andreas Dilger wrote:
> > On 2011-02-21, at 19:00, "Darrick J. Wong" <djwong@us.ibm.com> wrote:
> > > Last summer there was a long thread entitled "Wrong DIF guard tag on ext2
> > > write" (http://marc.info/?l=linux-scsi&m=127530531808556&w=2) that started a
> > > discussion about how to deal with the situation where one program tells the
> > > kernel to write a block to disk, the kernel computes the checksum of that data,
> > > and then a second program begins writing to that same block before the disk HBA
> > > can DMA the memory block, thereby causing the disk to complain about being sent
> > > invalid checksums.
> > > 
> > > I was able to write a trivial program to trigger the write problem, so
> > > I'm pretty sure that this has not been fixed upstream.  (FYI, using
> > > O_DIRECT still seems fine.)
> > 
> > Can you please attach your reproducer? IIRC it needed mmap() to hit this
> > problem?  Did you measure CPU usage during your testing?
> 
> I didn't need mmap; a lot of threads using write() was enough.  (The reproducer
> program does have a mmap mode though).  Basically it creates a lot of threads
> to write small blobs to random offsets in a file, with optional mmap, dio, and
> sync options.
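
For readers without the reproducer to hand, the buffered write() mode
described above might look roughly like the sketch below. This is not
Darrick's program: the names run_writers, writer_thread and BLOB_SIZE are
made up for illustration, and the mmap, dio and sync options are left out.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOB_SIZE 64   /* hypothetical "small blob" size */

struct writer_args {
    int fd;
    off_t file_size;
    int nwrites;
    unsigned int seed;   /* per-thread state for rand_r() */
    int failed;
};

/* Each thread rewrites small blobs at random offsets in the shared file. */
static void *writer_thread(void *p)
{
    struct writer_args *a = p;
    char blob[BLOB_SIZE];

    for (int i = 0; i < a->nwrites; i++) {
        off_t off = rand_r(&a->seed) % (a->file_size - BLOB_SIZE + 1);

        memset(blob, 'a' + (i % 26), sizeof(blob));
        if (pwrite(a->fd, blob, sizeof(blob), off) != (ssize_t)sizeof(blob)) {
            a->failed = 1;
            break;
        }
    }
    return NULL;
}

/* Returns 0 if every thread completed all of its writes, -1 otherwise. */
int run_writers(const char *path, int nthreads, int nwrites, off_t file_size)
{
    if (nthreads < 1 || nwrites < 0 || file_size < BLOB_SIZE)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, file_size) != 0) {
        close(fd);
        return -1;
    }

    pthread_t *tids = calloc(nthreads, sizeof(*tids));
    struct writer_args *args = calloc(nthreads, sizeof(*args));
    int started = 0, rc = (tids && args) ? 0 : -1;

    for (int i = 0; rc == 0 && i < nthreads; i++) {
        args[i] = (struct writer_args){ fd, file_size, nwrites, 1u + i, 0 };
        if (pthread_create(&tids[i], NULL, writer_thread, &args[i]) != 0)
            rc = -1;
        else
            started++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (args[i].failed)
            rc = -1;
    }
    free(tids);
    free(args);
    close(fd);
    return rc;
}
```

Something like run_writers("/mnt/dif/testfile", 64, 100000, 1 << 20) on a
DIF-capable device would be the intended use; the path and numbers there
are placeholders, not values from this thread.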

*nod*

Both the mmap and write paths need to block on
wait_on_page_writeback(page) once they have a locked page ready for
modification. btrfs does this in btrfs_page_mkwrite() and
prepare_pages(), so adding similar calls into block_page_mkwrite()
and grab_cache_page_write_begin() would probably fix the problem for
the other major filesystems....
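
To make the ordering concrete, here is a small userspace model of "lock
the page, wait for writeback to finish, then modify". It is only a sketch
with pthreads, not kernel code: struct model_page and every model_*
function are invented for illustration, the checksum is a toy rather than
the DIF guard tag, and a mutex plus condition variable stand in for the
page lock and the writeback flag.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Userspace stand-in for a page cache page; not a kernel structure. */
struct model_page {
    pthread_mutex_t lock;       /* plays the role of the page lock */
    pthread_cond_t  wb_done;    /* signalled when writeback ends */
    bool            writeback;  /* plays the role of the writeback flag */
    unsigned char   data[MODEL_PAGE_SIZE];
};

/* Toy checksum, standing in for the integrity guard tag. */
static uint32_t model_csum(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = sum * 31 + p[i];
    return sum;
}

void model_page_init(struct model_page *pg)
{
    pthread_mutex_init(&pg->lock, NULL);
    pthread_cond_init(&pg->wb_done, NULL);
    pg->writeback = false;
    memset(pg->data, 0, sizeof(pg->data));
}

/* Submission: compute the checksum and mark the page under writeback. */
uint32_t model_start_writeback(struct model_page *pg)
{
    pthread_mutex_lock(&pg->lock);
    uint32_t csum = model_csum(pg->data, sizeof(pg->data));
    pg->writeback = true;
    pthread_mutex_unlock(&pg->lock);
    return csum;
}

/* Completion: checksum what the "device" saw, then wake any waiters. */
uint32_t model_end_writeback(struct model_page *pg)
{
    pthread_mutex_lock(&pg->lock);
    uint32_t csum = model_csum(pg->data, sizeof(pg->data));
    pg->writeback = false;
    pthread_cond_broadcast(&pg->wb_done);
    pthread_mutex_unlock(&pg->lock);
    return csum;
}

/* Write path: take the lock, wait out writeback, only then modify. */
int model_write(struct model_page *pg, size_t off, const void *src, size_t len)
{
    if (off > MODEL_PAGE_SIZE || len > MODEL_PAGE_SIZE - off)
        return -1;
    pthread_mutex_lock(&pg->lock);
    while (pg->writeback)   /* analogous to waiting on page writeback */
        pthread_cond_wait(&pg->wb_done, &pg->lock);
    memcpy(pg->data + off, src, len);
    pthread_mutex_unlock(&pg->lock);
    return 0;
}

static void *model_writer(void *arg)
{
    model_write(arg, 0, "new", 3);
    return NULL;
}

/* Returns 1 if the checksum survived a concurrent writer, -1 on error. */
int model_demo(void)
{
    struct model_page pg;
    pthread_t tid;

    model_page_init(&pg);
    model_write(&pg, 0, "old", 3);
    uint32_t submitted = model_start_writeback(&pg);
    if (pthread_create(&tid, NULL, model_writer, &pg) != 0)
        return -1;
    uint32_t completed = model_end_writeback(&pg);
    pthread_join(tid, NULL);
    return submitted == completed && memcmp(pg.data, "new", 3) == 0;
}
```

One difference from the kernel worth noting: pthread_cond_wait() drops the
mutex while sleeping, whereas a kernel writer keeps the page locked while
it waits. The point of the model is only the ordering, so the second
writer's data lands after the first I/O has completed and the checksum
computed at submission still matches what was sent.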

> Agreed.  I too am curious to study which circumstances favor copying vs
> blocking.

IMO blocking is generally preferable in high throughput threaded
workloads as there is always another thread that can do useful work
while we wait for IO to complete. Most use cases for DIF center
around high throughput environments....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

