From: Andrew Patterson <andrew.patterson@hp.com>
To: scameron@beardog.cca.cpqcorp.net
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
linux-kernel@vger.kernel.org, mike.miller@hp.com,
jens.axboe@oracle.com, akpm@linux-foundation.org,
linux-scsi@vger.kernel.org, coldwell@redhat.com, hare@novell.com,
iss_storagedev@hp.com
Subject: Re: [PATCH] hpsa: SCSI driver for HP Smart Array controllers
Date: Thu, 05 Mar 2009 16:54:19 +0000 [thread overview]
Message-ID: <1236272059.7050.14.camel@grinch> (raw)
In-Reply-To: <20090305142114.GG15340@beardog.cca.cpqcorp.net>
On Thu, 2009-03-05 at 08:21 -0600, scameron@beardog.cca.cpqcorp.net
wrote:
> On Thu, Mar 05, 2009 at 02:48:09PM +0900, FUJITA Tomonori wrote:
> > On Tue, 3 Mar 2009 10:28:21 -0600
> > scameron@beardog.cca.cpqcorp.net wrote:
> >
> > > On Tue, Mar 03, 2009 at 03:35:26PM +0900, FUJITA Tomonori wrote:
> > > > On Mon, 2 Mar 2009 08:56:50 -0600
> > > > scameron@beardog.cca.cpqcorp.net wrote:
> > > >
> > > > > [...]
> > > > > > > + .this_id = -1,
> > > > > > > + .sg_tablesize = MAXSGENTRIES,
> > > > > >
> > > > > > MAXSGENTRIES (32) is the limitation of hardware? If not, it might be
> > > > > > better to enlarge this for better performance?
> > > > >
> > > > > Yes, definitely, though this value varies from controller to controller,
> > > > > so this is just a default value that needs to be overridden, probably
> > > > > in hpsa_scsi_detect().
> > > >
> > > > I see. If we override this in hpsa_scsi_detect(), we need a trick for
> > > > SG in CommandList_struct, I guess.
> > >
> > > Yes. There are some limits to what can be put into CommandList_struct
> > > directly, but there is also scatter gather chaining, in which we use
> > > the last element in the CommandList_struct to point to another buffer
> > > of SG entries.
> > >
> > > If you have a system with a lot of controllers, having a large number of
> > > scatter gathers can be a bit of a memory hog, and since this memory is all
> > > via pci_alloc_consistent, that can be a concern. It would be nice if
> > > there was a way for the user to specify differing amounts of scatter
> > > gathers for different controller instances so for instance the controller
> > > which he's running his big oracle database, or webserver or whatever on
> > > gets lots, while the controller he's booted from that's mostly idle
> > > gets not so many. I don't know what a good way for a user to identify
> > > what controller he's talking about in a module parameter would be
> > > though. Maybe by pci domain/bus/device/function? Maybe something along
> > > the lines of:
> > >
> > > modprobe hpsa dev1=0:0e:00.0 sg1=1000 dev2=0:0b:00.0 sg2=31
> > >
> > > to say that one controller gets 1000 scatter gather elements, but
> > > another gets only 31. But PCI busses can change if hardware
> > > configuration changes, and this isn't exactly obvious, so seems less
> > > than ideal. Any bright ideas on that front?
> >
> > We have /sys/class/scsi_host/host*/sg_tablesize:
> >
> > How about modifying this value on the fly?
> >
> > fujita@clover:/sys/class/scsi_host/host3$ echo 1000 > sg_tablesize
> >
>
> We pci_alloc_consistent that space, so... I think that would mean
> we'd have to do things considerably differently. I think we'd have
> to quit allocating commands in big chunks, and instead of indexing
> into that chunk we'd probably have to have an array of pointers or
> something. If we wanted sg_tablesize adjustable down to single
> command counts, we'd probably have to allocate each command separately
> and have an array of pointers to those...
>
> e.g. if you did
>
> echo 1000 > sg_tablesize
> echo 999 > sg_tablesize
>
> you probably wouldn't want to keep the 1000 commands around,
> and then allocate 999 additional, then let all the outstanding
> commands using the first 1000 block complete, then finally free
> the first block of 1000, leaving just the 999. You'd probably want
> instead to free one of the 1000 to get to 999.
>
> Likewise with this:
>
> echo 999 > sg_tablesize
> echo 1000 > sg_tablesize
>
> These are somewhat pathological cases, granted.
>
> I'm not sure dynamically modifying the number of SGs a controller
> can do is something that comes up enough to be worth implementing
> something so complicated.
>
> If it's settable at init time, that would probably be enough for
> the vast majority of uses (and more flexible than what we have now)
> and a lot easier to implement.
>
> >
> > Well, this needs more changes (to both the block layer and the scsi
> > mid layer) but is it nice to change this value dynamically?
> >
> > Anyway, I think that it's better to address this fancy feature later
> > on (after the mainline inclusion). Let's put hpsa driver into mainline
> > first.
>
> Agreed, we can think about all that stuff later.
>
> Another fancy feature to think about later which would be nice:
>
> On Smart arrays you can expand logical drives on the fly by
> adding physical disks, or portions of physical disks into them.
> Would be nice if there was a non-i/o-interrupting way to notify
> the scsi layer of this new space (maybe there already is?) so
> that if there's, say, a filesystem which can also dynamically
> grow on the fly on that embiggened logical drive, it can take
> advantage of that extra space.
>
> Right now, the driver will do scsi_remove_device() and then
> scsi_add_device() if a logical drive changes size, which isn't
> very nice.
>
The following might help:
There are several ways to "kick off" a device size change:
1. For SCSI devices do:
# echo 1 > /sys/class/scsi_device/<device>/device/rescan
or
# blockdev --rereadpt <device file>
2. Other devices (not device mapper)
# blockdev --rereadpt <device file>
See http://marc.info/?l=linux-kernel&m=122056065131792&w=2
Andrew
> >
> >
> > > > > Hmm, this doesn't seem all that complicated to me, and this code snippet
> > > > > has been pretty stable for about 10 years. it's nearly identical to what's in
> > > > > cpqarray in the 2.2.13 kernel from 1999:
> > > > >
> > > > > do {
> > > > > i = find_first_zero_bit(h->cmd_pool_bits, NR_CMDS);
> > > > > if (i == NR_CMDS)
> > > > > return NULL;
> > > > > } while(test_and_set_bit(i%32, h->cmd_pool_bits+(i/32)) != 0)
> > > > >
> > > > > It's fast, works well, and has needed very little maintenance over the
> > > > > years. Without knowing what you have in mind specifically, I don't see a
> > > > > big need to change this.
> > > >
> > > > I see. Seems that some drivers want something similar. I might come
> > > > back later on with a patch to replace this with library
> > > > functions.
> > >
> > > There was some other discussion about pushing this sort of thing to
> > > upper layers, using a tag generated in the scsi layer as a means
> > > of allocating driver command buffers, since, presumably there's a
> > > one to one mapping. (I didn't completely grok it all though.)
> >
> > Oops, I meant that I might come back with a patch to convert hpsa to
> > use the the block layer tagging, which you and Mike Christie are
> > talking about (yeah, my first suggestion to use lists was wrong. using
> > the block layer tagging looks much better).
> >
> >
> > By the way, have you guys started to work on the review comments for
>
> We haven't really done much. It's obvious that there's a lot to do
> based on the comments, and it's also obvious how to do most of it,
> and not hard, (e.g. ripping out /proc stuff, etc.), there's just a
> lot of other non-kernel related work keeping us busy at the moment.
>
> > the next submission? The driver has some minor style issues that have
> > not been mentioned yet. For example, the comment style in the driver
> > is not preferred:
> >
> > /* If this device a non-zero lun of a multi-lun device */
> > /* byte 4 of the 8-byte LUN addr will contain the logical */
> > /* unit no, zero otherise. */
> >
> > The preferred style is:
> >
> > /*
> > * If this device a non-zero lun of a multi-lun device
> > * byte 4 of the 8-byte LUN addr will contain the logical
> > * unit no, zero otherise.
> > */
>
> ok.
>
> >
> > Another example, I think that the SCSI-ml preferred style is (not
> > documented in CodingStyle though):
> >
> > 'if (!ptr)' rather than 'if (ptr == NULL)'
> > 'if (!value)' rather than 'if (value == 0)'
> > 'if (ptr)' rather than 'if (ptr != NULL)'
> > 'if (value)' rather than 'if (value != 0)'
>
> Ok.
>
> >
> >
> > If you are already addressing the review comments, I just wait for the
> > next submission, then I'll send such minor patches. If you are not,
> > I'll send patches to address the review comments (including such minor
> > patches).
>
> Ok, thanks.
>
> -- steve
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2009-03-05 16:54 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-02 14:56 [PATCH] hpsa: SCSI driver for HP Smart Array controllers scameron
2009-03-03 6:35 ` FUJITA Tomonori
2009-03-03 16:28 ` scameron
2009-03-05 5:48 ` FUJITA Tomonori
2009-03-05 14:21 ` scameron
2009-03-05 16:54 ` Andrew Patterson [this message]
2009-03-06 8:55 ` Jens Axboe
2009-03-06 9:13 ` FUJITA Tomonori
2009-03-06 9:21 ` Jens Axboe
2009-03-06 9:27 ` FUJITA Tomonori
2009-03-06 9:35 ` Jens Axboe
2009-03-06 14:38 ` scameron
2009-03-06 19:06 ` Jens Axboe
2009-03-06 20:59 ` Grant Grundler
2009-03-06 20:59 ` Grant Grundler
2009-03-06 21:18 ` scameron
2009-03-06 21:18 ` scameron
2009-03-06 21:55 ` Grant Grundler
2009-03-06 21:55 ` Grant Grundler
2009-03-06 21:59 ` James Bottomley
2009-03-05 14:55 ` Miller, Mike (OS Dev)
2009-03-03 16:49 ` Mike Christie
2009-03-03 21:28 ` scameron
-- strict thread matches above, loose matches on Subject: below --
2009-02-27 23:09 Mike Miller
2009-03-01 13:49 ` Rolf Eike Beer
2009-03-02 6:32 ` FUJITA Tomonori
2009-03-02 17:19 ` Grant Grundler
2009-03-02 17:19 ` Grant Grundler
2009-03-02 18:20 ` Mike Christie
2009-03-02 18:36 ` Jens Axboe
2009-03-02 20:33 ` Mike Christie
2009-03-02 20:37 ` Mike Christie
2009-03-03 9:43 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1236272059.7050.14.camel@grinch \
--to=andrew.patterson@hp.com \
--cc=akpm@linux-foundation.org \
--cc=coldwell@redhat.com \
--cc=fujita.tomonori@lab.ntt.co.jp \
--cc=hare@novell.com \
--cc=iss_storagedev@hp.com \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=mike.miller@hp.com \
--cc=scameron@beardog.cca.cpqcorp.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.