All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid.
@ 2006-09-07  3:10 Mathieu Avila
  2006-09-08 11:28 ` Mark Fasheh
  0 siblings, 1 reply; 4+ messages in thread
From: Mathieu Avila @ 2006-09-07  3:10 UTC (permalink / raw)
  To: ocfs2-devel

Hello OCFS2 team,

The function is defined so:

static int compute_max_sectors(struct block_device *bdev)
{
    int max_pages, max_sectors, pow_two_sectors;

    struct request_queue *q;

    q = bdev_get_queue(bdev);
    max_pages = q->max_sectors >> (PAGE_SHIFT - 9);
    if (max_pages > BIO_MAX_PAGES)
        max_pages = BIO_MAX_PAGES;
    if (max_pages > q->max_phys_segments)
        max_pages = q->max_phys_segments;
    if (max_pages > q->max_hw_segments)
        max_pages = q->max_hw_segments;
    max_pages--; /* Handle I/Os that straddle a page */

    max_sectors = max_pages << (PAGE_SHIFT - 9);

    /* Why is fls() 1-based???? */
    pow_two_sectors = 1 << (fls(max_sectors) - 1);

    return pow_two_sectors;
}


One of the device i use has "q->max_sectors"=8. Therefore, max_pages 
should be 1. This is not the case, due to the :
max_pages--; /* Handle I/Os that straddle a page */
Therefore heartbeating doesn't work on my cluster, as it is defined. 
When i force pow_two_sectors to be 8, it works fine, it heartbeats and i 
can mount the device. But the problem remains with the standard code and 
i just don't understand this statement. With the normal code, it breaks 
a little later at "o2hb_setup_one_bio", because "bio_add_page" returns 
0, which is perfectly valid for the device, as specified in the 
definition of bio_add_page (fs/bio.c):
/**
* bio_add_page - attempt to add page to bio
* @bio: destination bio
* @page: page to add
* @len: vec entry length
* @offset: vec entry offset
*
* Attempt to add a page to the bio_vec maplist. This can fail for a
* number of reasons, such as the bio being full or target block
* device limitations. The target block device must allow bio's
* smaller than PAGE_SIZE, so it is always possible to add a single
* page to an empty bio.
*/

Allocating BIOs in advance by making assumptions on the behaviour of the 
device (o2hb_compute_request_limits) is a bad idea, it seems. Instead, 
it should add pages until the device refuses to add another one. With 
the current code, i don't know whether it would take time to correct 
that, however...

--
Mathieu

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid.
  2006-09-07  3:10 [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid Mathieu Avila
@ 2006-09-08 11:28 ` Mark Fasheh
  2006-09-13  1:44   ` Mathieu Avila
  0 siblings, 1 reply; 4+ messages in thread
From: Mark Fasheh @ 2006-09-08 11:28 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Sep 07, 2006 at 12:05:58PM +0200, Mathieu Avila wrote:
> One of the device i use has "q->max_sectors"=8. Therefore, max_pages 
> should be 1. This is not the case, due to the :
> max_pages--; /* Handle I/Os that straddle a page */
> Therefore heartbeating doesn't work on my cluster, as it is defined. 
> When i force pow_two_sectors to be 8, it works fine, it heartbeats and i 
> can mount the device. 
Hmm, interesting. Can I trouble you to send us a patch for that? It's
trivial, but since I don't have access to any of that hardware around here I
can't test it.


> Allocating BIOs in advance by making assumptions on the behaviour of the 
> device (o2hb_compute_request_limits) is a bad idea, it seems. Instead, 
> it should add pages until the device refuses to add another one. With 
> the current code, i don't know whether it would take time to correct 
> that, however...
It doesn't allocate bios in advance. That function, "setup_one_bio()" does
the allocation. One bio at a time. We just figure out how many bio's we'll
need ahead of time so that we can size an array. Your problem was initially
with o2hb_compute_request_limits() which definitely has a bug. Everything
that happened after that was a result of the bad calculation.
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid.
  2006-09-08 11:28 ` Mark Fasheh
@ 2006-09-13  1:44   ` Mathieu Avila
  2006-09-13 11:17     ` Mark Fasheh
  0 siblings, 1 reply; 4+ messages in thread
From: Mathieu Avila @ 2006-09-13  1:44 UTC (permalink / raw)
  To: ocfs2-devel

Le Fri, 8 Sep 2006 11:28:33 -0700,
Mark Fasheh <mark.fasheh@oracle.com> a ?crit :

> On Thu, Sep 07, 2006 at 12:05:58PM +0200, Mathieu Avila wrote:
> > One of the device i use has "q->max_sectors"=8. Therefore,
> > max_pages should be 1. This is not the case, due to the :
> > max_pages--; /* Handle I/Os that straddle a page */
> > Therefore heartbeating doesn't work on my cluster, as it is
> > defined. When i force pow_two_sectors to be 8, it works fine, it
> > heartbeats and i can mount the device. 
> Hmm, interesting. Can I trouble you to send us a patch for that? It's
> trivial, but since I don't have access to any of that hardware around
> here I can't test it.
> 

A proposal patch (against the 2.6.17.11 vanilla kernel) for this is
attached. It works for me, and shouldn't break any existing device.

> 
> 
> It doesn't allocate bios in advance. That function, "setup_one_bio()"
> does the allocation. One bio at a time. We just figure out how many
> bio's we'll need ahead of time so that we can size an array. 

And i still think that it is a bad idea to try figuring out how many
BIOs are needed, because it really depends on the underlying
architecture of the device. Although it might work on most hardware,
some strange one might break one day...

> Your
> problem was initially with o2hb_compute_request_limits() which
> definitely has a bug. Everything that happened after that was a
> result of the bad calculation. --Mark
> 

Agreed.

--
Mathieu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: heartbeat-sector-calculation.patch
Type: text/x-patch
Size: 514 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20060913/29bea8f2/heartbeat-sector-calculation.bin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid.
  2006-09-13  1:44   ` Mathieu Avila
@ 2006-09-13 11:17     ` Mark Fasheh
  0 siblings, 0 replies; 4+ messages in thread
From: Mark Fasheh @ 2006-09-13 11:17 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Sep 13, 2006 at 10:44:42AM +0200, Mathieu Avila wrote:
> A proposal patch (against the 2.6.17.11 vanilla kernel) for this is
> attached. It works for me, and shouldn't break any existing device.
Ok, that patch looks good - I've put it in my 2.6.19 queue.


> And i still think that it is a bad idea to try figuring out how many
> BIOs are needed, because it really depends on the underlying
> architecture of the device. Although it might work on most hardware,
> some strange one might break one day...
Well, I'm pretty sure some other software uses that approach, but that might
just mean other software might have similar problems :)

So yeah. I don't reallly have any qualms changing ocfs2 heartbeat to a more
"natural" usage of the bio API.


Thanks!
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2006-09-13 11:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-07  3:10 [Ocfs2-devel] cluster/heartbeat.c::compute_max_sectors is invalid Mathieu Avila
2006-09-08 11:28 ` Mark Fasheh
2006-09-13  1:44   ` Mathieu Avila
2006-09-13 11:17     ` Mark Fasheh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.