linux-raid.vger.kernel.org archive mirror
From: Jeff Garzik <jeff@garzik.org>
To: Steve Byan <smb@egenera.com>
Cc: Jens Axboe <axboe@suse.de>, Mark Lord <liml@rtr.ca>,
	Gentoopower <gentoopower@yahoo.de>,
	"Raz Ben-Jehuda(caro)" <raziebe@gmail.com>,
	Linux RAID Mailing List <linux-raid@vger.kernel.org>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>
Subject: Re: NCQ general question
Date: Fri, 03 Mar 2006 17:19:05 -0500
Message-ID: <4408C0D9.4010202@garzik.org>
In-Reply-To: <C6838977-5411-400F-9548-2F22598DC2AE@egenera.com>

Steve Byan wrote:
> On Mar 1, 2006, at 8:55 AM, Jens Axboe wrote:
> The problem with TCQ is that the host can't disconnect on writes after
> sending the data to the drive but before receiving the status.  The host
> can only disconnect between sending the command and moving the data.

That, but also:  The standard PCI IDE hardware interface prevents the 
device from selecting command $N's DMA data out of $M active write commands.

With reads, the device has more freedom to process the requests 
asynchronously.


> Consequently TCQ is useless for writes, which is where you really need

Agreed.


> it. It works OK for reads. TCQ was really invented as a way to allow
> CD-ROM drives to play nice on the same ATA bus as disks.

Disagree.  Are you perhaps thinking of the bus disconnect associated with
the overlapped command set?  AFAIK TCQ has -never- applied to ATAPI.


> The reason you need write queuing is for data integrity reasons, not
> for performance. ATA disks effectively get command-queuing on writes
> even without TCQ and NCQ - they simply park the data in a volatile RAM
> cache, tell the host that the data is saved on persistent storage, and
> then asynchronously write the queued data to the physical media. The
> drive reorders those writes and will gather sequential writes.

Data integrity -and- performance.  Performance increases for all the 
standard reasons that an asynchronous pipeline increases performance 
over a synchronous one.

The write cache means that requests on the device can be processed 
asynchronously, but without NCQ there is still a synchronous bottleneck: 
the device<->controller pipe.
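
To put rough numbers on the pipeline argument, here is a toy timing model.
It is only a sketch: the function name, the two-stage split (issue over the
link, then service on the device), and the microsecond figures are all
assumptions made for illustration, not measurements of any real drive or
controller.

```python
def total_time(n_cmds, issue_us, service_us, queue_depth):
    """Toy model: time for n_cmds to complete when at most queue_depth
    commands may be outstanding between host and device.

    issue_us   - assumed cost of pushing one command over the link
    service_us - assumed cost of the device servicing one command
    """
    link_free = 0      # time at which the link can carry the next command
    device_free = 0    # time at which the device can start the next command
    completions = []   # completion time of each command, in issue order

    for i in range(n_cmds):
        # Command i cannot be issued until command (i - queue_depth) is done.
        gate = completions[i - queue_depth] if i >= queue_depth else 0
        issued = max(link_free, gate) + issue_us
        link_free = issued
        done = max(issued, device_free) + service_us
        device_free = done
        completions.append(done)

    return completions[-1] if completions else 0


# Hypothetical figures: 100 commands, 50 us to issue, 200 us to service.
print(total_time(100, 50, 200, queue_depth=1))  # one command at a time
print(total_time(100, 50, 200, queue_depth=2))  # issue overlaps service
```

With a queue depth of 1, every command pays the issue cost and the service
cost back to back.  With a depth of 2 or more, the issue of the next command
overlaps the service of the current one, so in this model only the slower
stage limits throughput.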


> However, note that all filesystems that make even a pretense of trying
> to maintain filesystem integrity after a power failure (note that the
> Windows NT implementation of FAT32 does not attempt to maintain
> filesystem integrity after a power failure) depend on knowing when data
> makes it to persistent storage, so they can order their writes

True.


> correctly. ATA disk write caching breaks this guarantee.  To restore
> filesystem integrity on a careful-write filesystem like most unix
> filesystems, you have to disable write-caching in the drive. This

False, as Linux has proven:  barriers can be implemented with 
flush-cache commands.

Disabling write cache is not your only choice, and using flush-cache 
gives you better performance than flat-out disabling the write cache.
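
As a userspace analogy of the ordering a barrier provides, here is a minimal
sketch that uses fsync() as the flush point between a payload and its commit
record.  The function names, the file layout, and the commit marker are
hypothetical.  Whether fsync() actually causes a flush-cache command to reach
the drive depends on the kernel, the filesystem, and its mount options, so
treat that as an assumption rather than a guarantee.

```python
import os


def write_all(fd, data):
    """Write every byte of data, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def commit_with_barrier(path, payload, commit_marker=b"COMMIT\n"):
    """Append payload, flush, then append a commit marker and flush again.

    The first fsync() plays the role of the barrier: the marker is only
    written after the payload has been pushed toward stable storage.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        write_all(fd, payload)
        os.fsync(fd)               # barrier: payload before commit record
        write_all(fd, commit_marker)
        os.fsync(fd)               # make the commit record durable too
    finally:
        os.close(fd)
```

The drive stays free to cache and reorder writes between the two flush
points, which is why this approach costs less than turning the write cache
off entirely.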

	Jeff


