Re: [RFC][PATCH] dm-cache (block level disk cache target): UPDATE

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Andi Kleen <andi@firstfloor.org>
To: Stephen Bromfield <s.bromfield@gmail.com>
Cc: linux-kernel@vger.kernel.org, dm-cache@googlegroups.com
Subject: Re: [RFC][PATCH] dm-cache (block level disk cache target): UPDATE
Date: Tue, 18 Oct 2011 15:19:06 -0700	[thread overview]
Message-ID: <m2aa8yymcl.fsf@firstfloor.org> (raw)
In-Reply-To: <CAGWyFtpaHvgDVV6DrM8L18eiDGrTL=teH6TZqfJNpwafE+-Big@mail.gmail.com> (Stephen Bromfield's message of "Tue, 18 Oct 2011 16:46:00 -0400")

Stephen Bromfield <s.bromfield@gmail.com> writes:

> A new patch for the 2.6.39 kernel. Please CC all comments to

Why 2.6.39? Like living in the past? 3.1 is current.


> + ****************************************************************************/
> +#include <linux/blk_types.h>
> +#include <asm/atomic.h>

Should be linux/atomic.h

> +
> +#define DMC_DEBUG 0
> +
> +#define DM_MSG_PREFIX "cache"
> +#define DMC_PREFIX "dm-cache: "
> +
> +#if DMC_DEBUG
> +#define DPRINTK( s, arg... ) printk(DMC_PREFIX s "\n", ##arg)
> +#else
> +#define DPRINTK( s, arg... )
> +#endif

Please use the standard pr_* calls for all of this.

> +
> +/* Default cache parameters */
> +#define DEFAULT_CACHE_SIZE	65536
> +#define DEFAULT_CACHE_ASSOC	1024
> +#define DEFAULT_BLOCK_SIZE	8
> +#define CONSECUTIVE_BLOCKS	512
> +
> +/* Write policy */
> +#define WRITE_THROUGH 0
> +#define WRITE_BACK 1
> +#define DEFAULT_WRITE_POLICY WRITE_THROUGH
> +
> +/* Number of pages for I/O */
> +#define DMCACHE_COPY_PAGES 1024

Runtime tunable?


> +#define is_state(x, y)		(x & y)
> +#define set_state(x, y)		(x |= y)
> +#define clear_state(x, y)	(x &= ~y)

Brackets around macro arguments. In fact i don't see what you need the
macros for. Just seems like obfuscation.

> +/****************************************************************************
> + *  Wrapper functions for using the new dm_io API
> + ****************************************************************************/
> +static int dm_io_sync_vm(unsigned int num_regions, struct dm_io_region
> +	*where, int rw, void *data, unsigned long *error_bits, struct
> cache_c *dmc)

Functions with that many parameters are usually a mistake. You'll need
more eventually. And nobody can read the calls. Same below.

> +{
> +	struct dm_io_request iorq;
> +
> +	iorq.bi_rw= rw;
> +	iorq.mem.type = DM_IO_VMA;
> +	iorq.mem.ptr.vma = data;
> +	iorq.notify.fn = NULL;
> +	iorq.client = dmc->io_client;
> +
> +	return dm_io(&iorq, num_regions, where, error_bits);
> +}
> +
> +/*
> + * Functions for handling pages used by async I/O.
> + * The data asked by a bio request may not be aligned with cache blocks, in
> + * which case additional pages are required for the request that is forwarded
> + * to the server. A pool of pages are reserved for this purpose.
> + */

I don't think you need the separate page_list structures, you could
just use page->lru. That would drop a lot of code.
 

> +		spin_unlock(&dmc->lock);
> +		return -ENOMEM;
> +	}
> +
> +	dmc->nr_free_pages -= nr;
> +	for (*pages = pl = dmc->pages; --nr; pl = pl->next)
> +		;

I'm sure there are clearer and safer ways to write such a loop.

For once what happens when nr == 0?

> +static void kcached_put_pages(struct cache_c *dmc, struct page_list *pl)
> +{
> +	struct page_list *cursor;
> +
> +	spin_lock(&dmc->lock);
> +	for (cursor = pl; cursor->next; cursor = cursor->next)
> +		dmc->nr_free_pages++;
> +
> +	dmc->nr_free_pages++;
> +	cursor->next = dmc->pages;
> +	dmc->pages = pl;
> +
> +	spin_unlock(&dmc->lock);

Does that actually put/free anything? It's at least misnamed.

> +static DEFINE_SPINLOCK(_job_lock);

Locks are supposed to come with documentation what they protect.

And why the _s ? 

> +

> +	spin_lock_irqsave(&_job_lock, flags);
> +	list_add_tail(&job->list, jobs);
> +	spin_unlock_irqrestore(&_job_lock, flags);

Hmm so you have a global lock for a list used in every IO?

That does not look like it will scale to multiple devices or larger systems.

It would be better to have this locking local to each volume at least.

> +}
> +
> +
> +/****************************************************************************
> + * Functions for asynchronously fetching data from source device and storing
> + * data in cache device. Because the requested data may not align with the
> + * cache blocks, extra handling is required to pad a block request and extract
> + * the requested data from the results.
> + ****************************************************************************/
> +
> +static void io_callback(unsigned long error, void *context)
> +{
> +	struct kcached_job *job = (struct kcached_job *) context;
> +
> +	if (error) {
> +		/* TODO */

So what happens? Not handling IO errors seems like a big omission.

Will they just not get reported or will you leak memory or corrupt data?

> +		DMERR("io_callback: io error");
> +		return;
> +	}
> +		while (head) {
> +			bvec[i].bv_len = min(head, (unsigned
> int)PAGE_SIZE);

Use min_t with the right type like the warning suggested (think of sign issues)
Same below.

> +		}
> +
> +		job->bvec = bvec;
> +		r = dm_io_async_bvec(1, &job->src, READ, job->bvec, io_callback, job);
> +		return r;
> +	} else { /* The original request is a WRITE */
> +		pl = job->pages;
> +
> +		if (head && tail) { /* Special case */
> +			bvec = kmalloc(job->nr_pages * sizeof(*bvec),
> GFP_KERNEL);

Wait a moment. Is that in the IO path? 

Then it should be GFP_NOFS at least to avoid recursion? Same below in
lots of other places (did you ever test that under low memory?)


Stopped reading here so far, but looks like there is plenty to do still.


-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

next prev parent reply	other threads:[~2011-10-18 22:19 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-18 20:46 [RFC][PATCH] dm-cache (block level disk cache target): UPDATE Stephen Bromfield
2011-10-18 22:19 ` Andi Kleen [this message]
2011-10-19 16:17   ` Stephen Bromfield
2011-10-19 16:25     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2aa8yymcl.fsf@firstfloor.org \
    --to=andi@firstfloor.org \
    --cc=dm-cache@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=s.bromfield@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox