From: Boaz Harrosh <bharrosh@panasas.com>
To: James Bottomley <James.Bottomley@suse.de>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-scsi@vger.kernel.org
Subject: Re: DIF/DIX updates for 2.6.32
Date: Thu, 27 Aug 2009 17:40:20 +0300
Message-ID: <4A969AD4.7070309@panasas.com>
In-Reply-To: <1251380803.6426.16.camel@mulgrave.site>

On 08/27/2009 04:46 PM, James Bottomley wrote:
> On Thu, 2009-08-27 at 12:49 +0300, Boaz Harrosh wrote:
>> On 08/27/2009 09:34 AM, Martin K. Petersen wrote:
>>>>>>>> "Boaz" == Boaz Harrosh <bharrosh@panasas.com> writes:
>>>
>>> Boaz> I know that we also have the above problem with iscsi and
>>> Boaz> data-digest such that when we come to sign the data it might
>>> Boaz> change on us before the target receives it.
>>>
>>> Yep, I have the same problem.  I talked to Andrew Morton a couple of
>>> months ago and he said that modifying pages in flight is "a feature" as
>>> far as ext[234] is concerned.
>>>
>>
>> As you might know, I have a filesystem copied from the ext2 code base.
>> I'm experimenting with altering the behavior so that pages written to
>> while being IOed will page fault, then sleep, until IO is done.
>> Clearly this is a good "feature" until you get to systems like mirroring
>> or signed data, which are forced to reallocate and copy all IO due to
>> the 2% optimization that thing gives you.
> 
> What about reads to the page?  If you allow them, you get the situation
> where something signals a write intent, tries to write and gets put into
> wait, then the readers get the old data still.
> 

Is there any guarantee between a parallel write and read about which comes
first? But I think in my case the reads will also page-fault, so I'm not sure
yet. Thanks for asking, that's a good question that should be taken into
consideration.

>> As the final outcome I hope for VFS support at the flip of a flag or
>> something, so an underlying device can turn that "feature" off when it
>> means great performance gains in its operations.
>>
>> If anyone has thought about that problem and has some preliminary strategies,
>> please, I'm all ears. I've just started on this subject and currently I do not
>> have a clue.
> 
> The correct way to handle this is simply to dump the page being written.
> It's dirty and was updated after the last write attempt, so it gets
> re-written out.  It costs nothing and it's incredibly fast.
> 

This is not an option on a mirror system, and the performance gain/loss
depends on the round-trip speed. If every digest error costs me an
error recovery cycle, delays, and stalls, then no, it is not better. Not
to mention that some iscsi-targets reset, and then the whole session must be
re-established.
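
To spell out the digest failure mode in userspace terms, here is a rough
sketch. It is not kernel or iscsi initiator code; every demo_* name is made
up for illustration. It only assumes that the data digest is CRC32C, computed
either over the live page (which can change afterwards) or over a private
bounce copy (which cannot, at the price of the copy).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32C (Castagnoli polynomial, reflected form). Slow but simple. */
static uint32_t demo_crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Sign the live page in place: racy if the page may be modified later. */
static uint32_t demo_sign_in_place(const uint8_t *page, size_t len)
{
    return demo_crc32c(page, len);
}

/*
 * Copy the page into a private bounce buffer and sign the copy.
 * The bounce buffer, not the live page, is what would go on the wire.
 */
static uint32_t demo_sign_bounce(const uint8_t *page, size_t len,
                                 uint8_t *bounce)
{
    assert(page != NULL && bounce != NULL);
    memcpy(bounce, page, len);
    return demo_crc32c(bounce, len);
}
```

With the in-place variant, a write to the page after signing makes the
receiver's recomputed digest disagree with the one sent. With the bounce
variant the digest always matches what was sent, but every IO pays for an
allocation and a copy.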

> What you likely want is a way of telling that the page got re-written so
> you don't need to print out scary warning messages about parity
> problems.
> 

Maybe that is a start. I guess I could signal a fast abort for these. What
would be the cost of this knowledge? I guess O(sglist-size), right? Loop
over all pages and check? Is there anything better we can do?
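
Roughly what I have in mind for the O(sglist-size) check, as a userspace
sketch only. The demo_page and demo_sg_entry structs and the "redirtied" flag
are hypothetical stand-ins, not the kernel's struct page or struct
scatterlist, and I'm assuming something sets the flag when a page is written
to after the IO was submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct demo_page {
    bool redirtied;    /* assumed: set if written to after IO submission */
};

/* Hypothetical stand-in for one scatter-gather entry. */
struct demo_sg_entry {
    struct demo_page *page;
};

/*
 * Walk the list once and report whether any page was modified while the
 * IO was in flight. Worst case is linear in the number of entries; it
 * returns early on the first modified page.
 */
static bool demo_sglist_was_redirtied(const struct demo_sg_entry *sgl,
                                      size_t nents)
{
    assert(sgl != NULL || nents == 0);

    for (size_t i = 0; i < nents; i++) {
        if (sgl[i].page->redirtied)
            return true;
    }
    return false;
}
```

If this returns true on a digest error, the command could be aborted quickly
and quietly instead of going through full error recovery.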

> James
> 
> 

Thanks
Boaz
