From: Andrew Morton <akpm@linux-foundation.org>
To: Bernd Schubert <bs@q-leap.de>
Cc: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
Jens Axboe <jens.axboe@oracle.com>
Subject: Re: cache flush timeouts by blk_queue_ordered()
Date: Mon, 26 Jan 2009 11:55:40 -0800 [thread overview]
Message-ID: <20090126115540.694ea262.akpm@linux-foundation.org> (raw)
In-Reply-To: <20081120181919.GA3818@lanczos.q-leap.de>
On Thu, 20 Nov 2008 19:19:19 +0100
Bernd Schubert <bs@q-leap.de> wrote:
> Hello,
>
> with some FC hardware-raid units we have the problem that the
> SYNCHRONIZE_CACHE command reproducibly fails.
>
> [658715.827428] sd 6:0:2:2: last recovery: 4311805647, now: 4459793681
> [658715.833980] sd 6:0:2:2: [sdk] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
> [658715.842288] sd 6:0:2:2: [sdk] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
> [658715.850954] sd 6:0:2:2: Activating scsi error recovery (1)
> [658715.856793] sd 6:0:2:2: trying to abort command
> [658715.861820] qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df2 2002.
> [658746.004124] sd 6:0:2:2: last recovery: 4459793692, now: 4459801236
> [658746.010686] sd 6:0:2:2: [sdk] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
> [658746.019004] sd 6:0:2:2: [sdk] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
> [658746.027680] sd 6:0:2:2: Activating scsi error recovery (2)
> [658746.033526] sd 6:0:2:2: trying to abort command
> [658746.038543] qla2xxx 0000:07:02.0: scsi(6:2:2): Abort command issued -- 1 36e2df4 2002.
Does this result in a dead machine, or does the system still struggle
along somehow?
> My guess is that these units flush their cache when this command is send
> even though they have a battery backup unit and flushing 2GB cache may
> take some time... Since I can only reproduce it on systems in production
> I can't do any experiments, but I guess the default timeout of 30s is not
> sufficient.
>
> Problem is now that this timeout cannot be adjusted by the sysfs scsi device
> timeout, since sd_prepare_flush() doesn't have the required device
> structure. The reason for that is blk_queue_ordered(). It neither
> gets a timeout argument, nor any pointer to the device.
>
> I already tried to use container_of() in sd_prepare_flush, but somehow
> that doesn't seem work if the structure member is a pointer.
>
> The next solution that comes into my my mind is to add the timeout argument
> to blk_queue_ordered() and subsequentely to modifiy all callers.
> Would such a patch be accepted? Or is there any better solution?
>
> Any help is appreciated.
Is this problem still open?
>
> Thanks,
> Bernd
>
> PS: (I'm also discussing the cache flush issue with one of our
> hardware vendors, but fixing their firmware might take ages).
Making the timeout tunable sounds like a suitable (if minimal)
approach. It appears to presently be a qlogic-specific thing?
Command_Entry.time_out?
next prev parent reply other threads:[~2009-01-26 19:56 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-11-20 18:19 cache flush timeouts by blk_queue_ordered() Bernd Schubert
2009-01-26 19:55 ` Andrew Morton [this message]
2009-01-26 22:39 ` Bernd Schubert
2009-01-26 22:57 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090126115540.694ea262.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=bs@q-leap.de \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox