* [RFC] New interface for dm-io to handle timed requests
@ 2006-03-14 10:19 Stefan Bader
2006-03-14 20:15 ` Mike Christie
0 siblings, 1 reply; 4+ messages in thread
From: Stefan Bader @ 2006-03-14 10:19 UTC (permalink / raw)
To: dm-devel
Some days ago we proposed an extension to the device mapper that allows
a timeout to be specified, after which a given request returns as
successful even if some of the target devices have not responded by that
time. As we cannot return a request to the upper layers as long as some
io is still running and possibly modifying referenced pages, we also
need a way to handle those requests.
The ideal solution would be an interface in the block layer that
allows us to cancel any submitted request. But since such a change will
take quite a lot of discussion and work, we want to emulate that
behavior in the dm core for now.
The rough idea is as follows:
- The dm core has to keep track of running ios, so each client has to
create a dm_io_client structure by calling dm_io_client_create.
This is also required for better scaling of targets that use
dm-io, since it allows memory pools private to each
target instance.
- Any io is submitted via dm_io. Details on timeouts, what callback
function to use, etc. are submitted via a struct dm_io_control.
- The notify function will be called multiple times, usually once for
each region. It's the job of the client to wait for all regions to
complete.
- The state of a region can be OK, TIMEOUT, CANCELLED or ERROR. If the
state is TIMEOUT, the io is still running and can complete later by
itself. In that case the callback is called again with the new
state.
If the client doesn't want to wait, it can call
dm_io_cancel_by_device or dm_io_cancel_by_handle to cancel the
outstanding io.
- Once all regions have returned with a state of OK, CANCELLED or ERROR,
the io request can be returned to the originator.
- Synchronous calls are done by setting the SYNC bit in the rw attribute
(only one function call instead of multiple ones). The call will wait
until all regions are done (but will call the notify function if
supplied). If no notify function is supplied, the caller will only
know whether some region had an error or all regions completed.
Without a notify function but with a timeout, the regions will be
cancelled automatically.
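The bookkeeping a client needs for the rules above can be sketched in
plain user-space C. This is only a model under assumptions, not part of
the proposed interface: the names io_tracker, tracker_init,
tracker_notify and MAX_REGIONS are hypothetical, and the REGION_* values
merely mirror the OK, TIMEOUT, CANCELLED and ERROR states. It shows that
TIMEOUT is not a final state and that the request may only be handed
back once every region has reported a final one.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model of the region states named in the proposal. */
enum region_state { REGION_OK, REGION_TIMEOUT, REGION_CANCELLED, REGION_ERROR };

#define MAX_REGIONS 8	/* assumption: arbitrary limit for this sketch */

/* Hypothetical per-request bookkeeping kept by the client. */
struct io_tracker {
	unsigned int num_regions;
	unsigned int pending;		/* regions without a final state */
	bool final[MAX_REGIONS];	/* true once OK, CANCELLED or ERROR was seen */
	int first_error;		/* first error code reported, 0 if none */
};

/* Returns 0 on success, -1 if the region count does not fit the sketch. */
static int tracker_init(struct io_tracker *t, unsigned int num_regions)
{
	unsigned int i;

	if (num_regions == 0 || num_regions > MAX_REGIONS)
		return -1;
	t->num_regions = num_regions;
	t->pending = num_regions;
	t->first_error = 0;
	for (i = 0; i < MAX_REGIONS; i++)
		t->final[i] = false;
	return 0;
}

/*
 * Models the work of a notify callback for one region.
 * Returns true once the request may be returned to the originator.
 */
static bool tracker_notify(struct io_tracker *t, unsigned int index,
			   enum region_state state, int error_code)
{
	assert(index < t->num_regions);

	/* TIMEOUT is not final; repeats for a finished region are ignored. */
	if (state == REGION_TIMEOUT || t->final[index])
		return t->pending == 0;

	if (state == REGION_ERROR && t->first_error == 0)
		t->first_error = error_code;
	t->final[index] = true;
	t->pending--;
	return t->pending == 0;
}
```

A real notify function would receive a struct dm_io_region_state and
would need locking or atomic counters, since callbacks for different
regions may run concurrently; the sketch leaves that out.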
Regards,
Stefan Bader
----------------------------------------------------------------------
Here comes the proposed new header:
[-- Attachment #2: dm-io-v2.h --]
[-- Type: text/plain, Size: 4838 bytes --]
#include <linux/bio.h>
#include "dm.h"
#define dm_io_page_list page_list
/*=============================================================================
* Structures and functions to manage different I/O clients.
*=============================================================================
*/
struct dm_io_client;
/*
* NOTE: We need the number of requests (ios) that the target wants to have
* running on (devices) devices in parallel. The size is sort of bad.
* We need it to simulate cancellation since there we have to have
* enough memory to store the bio_vecs content. Otherwise we would have
* to reserve the maximum memory size a bio_vec can address which is a
* waste of memory.
* Another proposal would be:
* dm_io_client_create(dm_target *, uint, uint, dm_io_client **)
*/
/*-----------------------------------------------------------------------------
* Register as a new I/O client.
*
* Arguments: devices = how many devices will be used for each request.
* min_ios = the minimum number of I/O requests that should run
* in parallel.
* max_size = the biggest amount of memory that will be packed into
* one bio_vec.
* cl = address into which the pointer to the new dm_io_client
* will be written.
*
* Returns: 0 on success
* -ENOMEM if there is not enough memory to build all memory
* pools and data structures.
*-----------------------------------------------------------------------------
*/
int dm_io_client_create(
unsigned int devices,
unsigned int min_ios,
unsigned int max_size,
struct dm_io_client ** cl);
/*-----------------------------------------------------------------------------
* Unregister as a client.
*
* Arguments: cl = pointer to the client context to release.
*-----------------------------------------------------------------------------
*/
void dm_io_client_destroy(struct dm_io_client *cl);
/*=============================================================================
* Structures and functions to do the actual I/O. The dm_io_region is a
* container to pass in the destination(s) for write- and the source for
* read-requests.
*=============================================================================
*/
struct dm_io_region {
struct block_device * bdev;
sector_t sector;
sector_t count;
};
/*
* The dm_io_handle is in place for future extensions where it is necessary
* to identify a certain I/O job in calls to dm_io functions.
*/
struct dm_io_handle;
struct dm_io_region_state {
unsigned int index;
enum {
OK,
TIMEOUT,
CANCELLED,
ERROR,
} state;
int error_code;
struct dm_io_handle * hdl;
};
/*
* Note: It is guaranteed that the contents of region_state will not change
* while in the notify function.
* Note: The dm_io_handle is only valid during the call. If the caller stores
* it somewhere else it has to use dm_io_handle_get().
*/
typedef void (*dm_io_notify_fn)(
struct dm_io_region_state *state,
void *context);
struct dm_io_page_list {
struct dm_io_page_list * next;
struct page * page;
};
struct dm_io_memory {
enum {
IO_PAGE_LIST,
IO_BVEC,
IO_VM,
} type;
union {
void * vma;
struct bio_vec * bv;
struct dm_io_page_list * pl;
} ptr;
unsigned int offset;
};
/*
* Optional flags for dm_io_control:
*/
#define DM_IO_CANCEL_ON_TIMEOUT 1
struct dm_io_control {
struct dm_io_memory memory;
int rw; // SYNC flag supported...
dm_io_notify_fn notify;
void * context;
struct dm_io_client * client;
unsigned long timeout; // What time base (seconds)?
unsigned int flags;
};
/*
* Note: If the caller supplies a place to store the io_handle it has to
* release it by calling dm_io_handle_put().
* Note: By issuing a SYNC I/O the call will return when all I/O has
* completed but the notify function is called as it would be with
* asynchronous calls.
*/
int dm_io(
struct dm_io_control * ctrl,
unsigned int num_regions,
struct dm_io_region * regions,
struct dm_io_handle ** hdl);
/*
* Since the interface allows references to the io handle to be passed to
* the caller, we need to supply a way to manage them.
* The *_get variant might be unnecessary but IMHO it should be there to
* allow clients to store the reference to additional locations. Comments?
*/
struct dm_io_handle *dm_io_handle_get(struct dm_io_handle *io);
struct dm_io_handle *dm_io_handle_put(struct dm_io_handle *io);
/*
* Cancellation functions for several I/O entities.
*/
int dm_io_cancel_by_device(struct dm_io_client *cl, struct block_device *bdev);
int dm_io_cancel_by_handle(struct dm_io_client *cl, struct dm_io_handle *hdl);
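The get/put rules in the header comments amount to reference counting on
the io handle. The following user-space sketch models that idea only; the
names io_handle, io_handle_alloc, io_handle_get and io_handle_put are
hypothetical stand-ins, and the meaning given to the return value of the
put function (NULL once the last reference is gone) is an assumption,
since the header does not define it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque struct dm_io_handle. */
struct io_handle {
	unsigned int refs;	/* number of holders of this handle */
};

/* The submitter starts out holding one reference. */
static struct io_handle *io_handle_alloc(void)
{
	struct io_handle *h = calloc(1, sizeof(*h));

	if (h)
		h->refs = 1;
	return h;
}

/* Take an extra reference before storing the pointer somewhere else. */
static struct io_handle *io_handle_get(struct io_handle *h)
{
	h->refs++;
	return h;
}

/*
 * Drop one reference. Assumed semantics: returns NULL when the last
 * reference was dropped and the handle was freed, else the handle.
 */
static struct io_handle *io_handle_put(struct io_handle *h)
{
	if (--h->refs == 0) {
		free(h);
		return NULL;
	}
	return h;
}
```

The counter here is a plain integer for brevity. In the kernel the count
would have to be atomic (for example via kref), because notify callbacks
and the submitter can drop references from different contexts.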
* Re: [RFC] New interface for dm-io to handle timed requests
2006-03-14 10:19 [RFC] New interface for dm-io to handle timed requests Stefan Bader
@ 2006-03-14 20:15 ` Mike Christie
2006-03-15 15:33 ` Stefan Bader
0 siblings, 1 reply; 4+ messages in thread
From: Mike Christie @ 2006-03-14 20:15 UTC (permalink / raw)
To: device-mapper development
Stefan Bader wrote:
> Some days ago we proposed an extension to the device mapper that allows
> to specify a timeout after which a given request should return as
> successful, even if some of the target devices did not react by that
> time. As we cannot return a request to the upper layers as long as some
> io is still running and possibly modifying referenced pages, we also
> need a way to handle those requests.
> The ideal solution would be to have an interface in the block layer that
> allows us to cancel any submitted requests. But since such a change will
> take quite a lot discussions and work, we want to emulate such a
> behavior in the dm core for now.
>
The scsi people and some block people have been talking about moving
more error handling functionality into the block layer for a while and
it is slowly moving that way. It could probably be done faster if
people did not concentrate on one subsystem :)
Maybe you should post to lkml and linux-scsi and get some responses from
them before adding it to dm core. If a post already went out to those
lists, my fault for missing it.
* Re: [RFC] New interface for dm-io to handle timed requests
2006-03-14 20:15 ` Mike Christie
@ 2006-03-15 15:33 ` Stefan Bader
2006-03-15 19:16 ` Mike Christie
0 siblings, 1 reply; 4+ messages in thread
From: Stefan Bader @ 2006-03-15 15:33 UTC (permalink / raw)
To: device-mapper development
dm-devel-bounces@redhat.com wrote on 14.03.2006 21:15:10:
> Stefan Bader wrote:
> > The ideal solution would be to have an interface in the block layer that
> > allows us to cancel any submitted requests. But since such a change will
> > take quite a lot discussions and work, we want to emulate such a
> > behavior in the dm core for now.
> >
>
> The scsi people and some block people have been talking about moving
> more error handling functionality into the block layer for a while and
> it is slowing moving that way. It could probably be done faster if
> people did not concentrate on one subsystem :)
>
It is not so much about error handling as about policy. If the
subsystem decides that io took long enough, or maybe it just doesn't
want to go on (could be a forced umount...), it would be nice to be
able to stop lower level drivers from doing error recovery. Thus
the idea of stopping a submitted request.
The changes to the core are meant to fake this for as long as the
kernel has no such functionality with an interface that can handle
this.
> Maybe you should post to lkml and linux-scsi and get some responses from
> them before adding it to dm core. If a post already went out to those
> list my fault for missing it.
>
No, it didn't. I guess lkml is a good place. I am not sure about
linux-scsi, as it is not related specifically to scsi...
Stefan Bader
SW Linux on zSeries Development & Services
Stefan.Bader@de.ibm.com
----------------------------------------------------------------------------------
When all other means of communication fail, try words.
* Re: [RFC] New interface for dm-io to handle timed requests
2006-03-15 15:33 ` Stefan Bader
@ 2006-03-15 19:16 ` Mike Christie
0 siblings, 0 replies; 4+ messages in thread
From: Mike Christie @ 2006-03-15 19:16 UTC (permalink / raw)
To: device-mapper development
Stefan Bader wrote:
> dm-devel-bounces@redhat.com wrote on 14.03.2006 21:15:10:
>
>
>>Stefan Bader wrote:
>>
>>>The ideal solution would be to have an interface in the block layer that
>>>allows us to cancel any submitted requests. But since such a change will
>>>take quite a lot discussions and work, we want to emulate such a
>>>behavior in the dm core for now.
>>>
>>
>>The scsi people and some block people have been talking about moving
>>more error handling functionality into the block layer for a while and
>>it is slowing moving that way. It could probably be done faster if
>>people did not concentrate on one subsystem :)
>>
>
>
> It is not that much about error handling. More about policy. If the
> subsystem decides that io took enough time or maybe it just doesn't
> want to go on (could be a force umount...) it would be nice to be
> able to stop lower level drivers from doing error recovery. Thus
> the idea of stopping a submitted request.
Ah ok, sorry. I think I call it error handling because if a command is
running on the disk or on the transport, then for LLDs like scsi
canceling the command is part of our error handling code. Sorry for the
confusion.
But setting the limit for the command's running time can be moved to the
block layer, away from the LLDs and higher levels, so we can all
coordinate this. If the command times out, the block layer can begin to
cancel the command and call into the LLD (we would actually have to
stack the cancel command callout like the request_fns or do the block
request queue as message queue junk) to handle the lower level details
of how to cancel it.
>
> The changes to the core now shall fake this as long as there isn't
> such a functionality in the kernel with an interface that can handle
> this.
>
>
>>Maybe you should post to lkml and linux-scsi and get some responses from
>>them before adding it to dm core. If a post already went out to those
>>list my fault for missing it.
>>
>
> No it didn't. I guess lkml is a good point. I am not sure about
> linux-scsi.
> As it is not related specifically to scsi...
>
Well, you would need to convert linux-scsi, and many people on the list
have been thinking about how to do it, so that is why I suggested it.