All of lore.kernel.org
 help / color / mirror / Atom feed
From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
To: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: xen-devel@lists.xensource.com,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, dm-devel@redhat.com,
	agk@sourceware.org, s-uchida@ap.jp.nec.com,
	balbir@linux.vnet.ibm.com, ngupta@google.com,
	fernando@oss.ntt.co.jp, vgoyal@redhat.com, xemul@openvz.org
Subject: [PATCH 6/8]Re: bio-cgroup: The body of bio-cgroup
Date: Mon, 17 Nov 2008 16:50:51 +0900	[thread overview]
Message-ID: <4921225B.4000503@oss.ntt.co.jp> (raw)
In-Reply-To: <20081116.204609.65892670.taka@valinux.co.jp>

Hi i/o controller/scheduler developers,

Hirokazu Takahashi wrote:
> Hi,
> 
> As you pointed out, cgroup iocontext stuff isn't well designed yet
> since the current implementation of dm-iband doesn't need it.
> I'd like to leave the iocontext stuff to you I/O scheduler guys
> if you want to implement the bio-cgroup infrastructure to handle iocotexts
> as the I/O schedulers expect.

Yes, I want to contribute to this because I think the io context tracking part
can be used for a lot of other things. So it would be nice if we can talk about
"the most useful desing for everyone."

> 
>> Hi,
>>
>> Ryo Tsuruta wrote:
>>>  
>>> +void init_io_context(struct io_context *ioc)
>>> +{
>>> +	atomic_set(&ioc->refcount, 1);
>>> +	atomic_set(&ioc->nr_tasks, 1);
>>> +	spin_lock_init(&ioc->lock);
>>> +	ioc->ioprio_changed = 0;
>>> +	ioc->ioprio = 0;
>>> +	ioc->last_waited = jiffies; /* doesn't matter... */
>>> +	ioc->nr_batch_requests = 0; /* because this is 0 */
>>> +	ioc->aic = NULL;
>>> +	INIT_RADIX_TREE(&ioc->radix_root, GFP_ATOMIC | __GFP_HIGH);
>>> +	INIT_HLIST_HEAD(&ioc->cic_list);
>>> +	ioc->ioc_data = NULL;
>>> +}
>>> +
>>>  struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
>>>  {
>>>  	struct io_context *ret;
>>>  
>>>  	ret = kmem_cache_alloc_node(iocontext_cachep, gfp_flags, node);
>>> -	if (ret) {
>>> -		atomic_set(&ret->refcount, 1);
>>> -		atomic_set(&ret->nr_tasks, 1);
>>> -		spin_lock_init(&ret->lock);
>>> -		ret->ioprio_changed = 0;
>>> -		ret->ioprio = 0;
>>> -		ret->last_waited = jiffies; /* doesn't matter... */
>>> -		ret->nr_batch_requests = 0; /* because this is 0 */
>>> -		ret->aic = NULL;
>>> -		INIT_RADIX_TREE(&ret->radix_root, GFP_ATOMIC | __GFP_HIGH);
>>> -		INIT_HLIST_HEAD(&ret->cic_list);
>>> -		ret->ioc_data = NULL;
>>> -	}
>>> +	if (ret)
>>> +		init_io_context(ret);
>>>  
>>>  	return ret;
>>>  }
>>
>>> +
>>> +/* Create a new bio-cgroup. */
>>> +static struct cgroup_subsys_state *
>>> +bio_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
>>> +{
>>> +	struct bio_cgroup *biog;
>>> +	struct io_context *ioc;
>>> +	int ret;
>>> +
>>> +	if (!cgrp->parent) {
>>> +		biog = &default_bio_cgroup;
>>> +		init_io_context(biog->io_context);
>>> +		/* Increment the referrence count not to be released ever. */
>>> +		atomic_inc(&biog->io_context->refcount);
>>> +		idr_init(&bio_cgroup_id);
>>> +		return &biog->css;
>>> +	}
>>> +
>>> +	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
>>> +	ioc = alloc_io_context(GFP_KERNEL, -1);
>>> +	if (!ioc || !biog) {
>>> +		ret = -ENOMEM;
>>> +		goto out_err;
>>> +	}
>>> +	biog->io_context = ioc;
>> If I understand correctly io_contexts allocated by alloc_io_context() should
>> be owned by some tasks. In this case, the newly created io_context has no
>> owner though biog->io_context->nr_tasks == 1. With the cfq scheduler this may
>> cause some problems especially when cic_free_func() is called because
>> cfq_exit_io_context() would never be called. Am I wrong?
> 
> I think iocontext allocated for a certain cgroup should be owned by the
> cgroup itself, whose code isn't implemented yet. I think you need to
> enhance it a bit if the owner is a cgroup.
> 
>>> +retry:
>>> +	if (!idr_pre_get(&bio_cgroup_id, GFP_KERNEL)) {
>>> +		ret = -EAGAIN;
>>> +		goto out_err;
>>> +	}
>>> +	spin_lock_irq(&bio_cgroup_idr_lock);
>>> +	ret = idr_get_new_above(&bio_cgroup_id, (void *)biog, 1, &biog->id);
>>> +	spin_unlock_irq(&bio_cgroup_idr_lock);
>>> +	if (ret == -EAGAIN)
>>> +		goto retry;
>>> +	else if (ret)
>>> +		goto out_err;
>>> +
>>> +	return &biog->css;
>>> +out_err:
>>> +	if (biog)
>>> +		kfree(biog);
>>> +	if (ioc)
>>> +		put_io_context(ioc);
>>> +	return ERR_PTR(ret);
>>> +}
>>> +
>>> +/* Delete the bio-cgroup. */
>>> +static void bio_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
>>> +{
>>> +	struct bio_cgroup *biog = cgroup_bio(cgrp);
>>> +
>>> +	put_io_context(biog->io_context);
>> Here, I suspects what will happen under the following condition:
>> biog->io_context->refcount: 1 --> 0
>> biog->io_context->nr_tasks: 1 --> 1
>> ==> cfq_dtor() will be called but cfq_exit() has not be called yet.
>>
>>> +
>>> +	spin_lock_irq(&bio_cgroup_idr_lock);
>>> +	idr_remove(&bio_cgroup_id, biog->id);
>>> +	spin_unlock_irq(&bio_cgroup_idr_lock);
>>> +
>>> +	kfree(biog);
>>> +}
>>> +
>>> +
>>> +/* Determine the iocontext of the bio-cgroup that issued a given bio. */
>>> +struct io_context *get_bio_cgroup_iocontext(struct bio *bio)
>>> +{
>>> +	struct bio_cgroup *biog = NULL;
>>> +	struct io_context *ioc;
>>> +	int	id = 0;
>>> +
>>> +	id = get_bio_cgroup_id(bio);
>>> +	if (id)
>>> +		biog = find_bio_cgroup(id);
>>> +	if (!biog)
>>> +		biog = &default_bio_cgroup;
>>> +	ioc = biog->io_context;	/* default io_context for this cgroup */
>>> +	atomic_inc(&ioc->refcount);
>>> +	return ioc;
>>> +}
>>> +EXPORT_SYMBOL(get_bio_cgroup_iocontext);
>>
>> Thanks,
>> Takuya Yoshikawa
> 
> Thank you,
> Hirokazu Takahashi.
> 

WARNING: multiple messages have this Message-ID (diff)
From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
To: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: ryov@valinux.co.jp, xen-devel@lists.xensource.com,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, dm-devel@redhat.com,
	agk@sourceware.org, xemul@openvz.org, fernando@oss.ntt.co.jp,
	balbir@linux.vnet.ibm.com, s-uchida@ap.jp.nec.com,
	ngupta@google.com, vgoyal@redhat.com
Subject: [PATCH 6/8]Re: bio-cgroup: The body of bio-cgroup
Date: Mon, 17 Nov 2008 16:50:51 +0900	[thread overview]
Message-ID: <4921225B.4000503@oss.ntt.co.jp> (raw)
In-Reply-To: <20081116.204609.65892670.taka@valinux.co.jp>

Hi i/o controller/scheduler developers,

Hirokazu Takahashi wrote:
> Hi,
> 
> As you pointed out, cgroup iocontext stuff isn't well designed yet
> since the current implementation of dm-iband doesn't need it.
> I'd like to leave the iocontext stuff to you I/O scheduler guys
> if you want to implement the bio-cgroup infrastructure to handle iocotexts
> as the I/O schedulers expect.

Yes, I want to contribute to this because I think the io context tracking part
can be used for a lot of other things. So it would be nice if we can talk about
"the most useful desing for everyone."

> 
>> Hi,
>>
>> Ryo Tsuruta wrote:
>>>  
>>> +void init_io_context(struct io_context *ioc)
>>> +{
>>> +	atomic_set(&ioc->refcount, 1);
>>> +	atomic_set(&ioc->nr_tasks, 1);
>>> +	spin_lock_init(&ioc->lock);
>>> +	ioc->ioprio_changed = 0;
>>> +	ioc->ioprio = 0;
>>> +	ioc->last_waited = jiffies; /* doesn't matter... */
>>> +	ioc->nr_batch_requests = 0; /* because this is 0 */
>>> +	ioc->aic = NULL;
>>> +	INIT_RADIX_TREE(&ioc->radix_root, GFP_ATOMIC | __GFP_HIGH);
>>> +	INIT_HLIST_HEAD(&ioc->cic_list);
>>> +	ioc->ioc_data = NULL;
>>> +}
>>> +
>>>  struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
>>>  {
>>>  	struct io_context *ret;
>>>  
>>>  	ret = kmem_cache_alloc_node(iocontext_cachep, gfp_flags, node);
>>> -	if (ret) {
>>> -		atomic_set(&ret->refcount, 1);
>>> -		atomic_set(&ret->nr_tasks, 1);
>>> -		spin_lock_init(&ret->lock);
>>> -		ret->ioprio_changed = 0;
>>> -		ret->ioprio = 0;
>>> -		ret->last_waited = jiffies; /* doesn't matter... */
>>> -		ret->nr_batch_requests = 0; /* because this is 0 */
>>> -		ret->aic = NULL;
>>> -		INIT_RADIX_TREE(&ret->radix_root, GFP_ATOMIC | __GFP_HIGH);
>>> -		INIT_HLIST_HEAD(&ret->cic_list);
>>> -		ret->ioc_data = NULL;
>>> -	}
>>> +	if (ret)
>>> +		init_io_context(ret);
>>>  
>>>  	return ret;
>>>  }
>>
>>> +
>>> +/* Create a new bio-cgroup. */
>>> +static struct cgroup_subsys_state *
>>> +bio_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
>>> +{
>>> +	struct bio_cgroup *biog;
>>> +	struct io_context *ioc;
>>> +	int ret;
>>> +
>>> +	if (!cgrp->parent) {
>>> +		biog = &default_bio_cgroup;
>>> +		init_io_context(biog->io_context);
>>> +		/* Increment the referrence count not to be released ever. */
>>> +		atomic_inc(&biog->io_context->refcount);
>>> +		idr_init(&bio_cgroup_id);
>>> +		return &biog->css;
>>> +	}
>>> +
>>> +	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
>>> +	ioc = alloc_io_context(GFP_KERNEL, -1);
>>> +	if (!ioc || !biog) {
>>> +		ret = -ENOMEM;
>>> +		goto out_err;
>>> +	}
>>> +	biog->io_context = ioc;
>> If I understand correctly io_contexts allocated by alloc_io_context() should
>> be owned by some tasks. In this case, the newly created io_context has no
>> owner though biog->io_context->nr_tasks == 1. With the cfq scheduler this may
>> cause some problems especially when cic_free_func() is called because
>> cfq_exit_io_context() would never be called. Am I wrong?
> 
> I think iocontext allocated for a certain cgroup should be owned by the
> cgroup itself, whose code isn't implemented yet. I think you need to
> enhance it a bit if the owner is a cgroup.
> 
>>> +retry:
>>> +	if (!idr_pre_get(&bio_cgroup_id, GFP_KERNEL)) {
>>> +		ret = -EAGAIN;
>>> +		goto out_err;
>>> +	}
>>> +	spin_lock_irq(&bio_cgroup_idr_lock);
>>> +	ret = idr_get_new_above(&bio_cgroup_id, (void *)biog, 1, &biog->id);
>>> +	spin_unlock_irq(&bio_cgroup_idr_lock);
>>> +	if (ret == -EAGAIN)
>>> +		goto retry;
>>> +	else if (ret)
>>> +		goto out_err;
>>> +
>>> +	return &biog->css;
>>> +out_err:
>>> +	if (biog)
>>> +		kfree(biog);
>>> +	if (ioc)
>>> +		put_io_context(ioc);
>>> +	return ERR_PTR(ret);
>>> +}
>>> +
>>> +/* Delete the bio-cgroup. */
>>> +static void bio_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
>>> +{
>>> +	struct bio_cgroup *biog = cgroup_bio(cgrp);
>>> +
>>> +	put_io_context(biog->io_context);
>> Here, I suspects what will happen under the following condition:
>> biog->io_context->refcount: 1 --> 0
>> biog->io_context->nr_tasks: 1 --> 1
>> ==> cfq_dtor() will be called but cfq_exit() has not be called yet.
>>
>>> +
>>> +	spin_lock_irq(&bio_cgroup_idr_lock);
>>> +	idr_remove(&bio_cgroup_id, biog->id);
>>> +	spin_unlock_irq(&bio_cgroup_idr_lock);
>>> +
>>> +	kfree(biog);
>>> +}
>>> +
>>> +
>>> +/* Determine the iocontext of the bio-cgroup that issued a given bio. */
>>> +struct io_context *get_bio_cgroup_iocontext(struct bio *bio)
>>> +{
>>> +	struct bio_cgroup *biog = NULL;
>>> +	struct io_context *ioc;
>>> +	int	id = 0;
>>> +
>>> +	id = get_bio_cgroup_id(bio);
>>> +	if (id)
>>> +		biog = find_bio_cgroup(id);
>>> +	if (!biog)
>>> +		biog = &default_bio_cgroup;
>>> +	ioc = biog->io_context;	/* default io_context for this cgroup */
>>> +	atomic_inc(&ioc->refcount);
>>> +	return ioc;
>>> +}
>>> +EXPORT_SYMBOL(get_bio_cgroup_iocontext);
>>
>> Thanks,
>> Takuya Yoshikawa
> 
> Thank you,
> Hirokazu Takahashi.
> 


  parent reply	other threads:[~2008-11-17  7:50 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-13  3:10 [PATCH 0/8] I/O bandwidth controller and BIO tracking Ryo Tsuruta
2008-11-13  3:10 ` Ryo Tsuruta
2008-11-13  3:11 ` [PATCH 1/8] dm-ioband: Introduction Ryo Tsuruta
2008-11-13  3:11 ` Ryo Tsuruta
2008-11-13  3:12   ` [PATCH 2/8] dm-ioband: Source code and patch Ryo Tsuruta
2008-11-13  3:12     ` Ryo Tsuruta
     [not found]     ` <20081113.121221.220301508585560320.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:12       ` [PATCH 3/8] dm-ioband: Document Ryo Tsuruta
2008-11-13  3:12     ` Ryo Tsuruta
2008-11-13  3:12     ` Ryo Tsuruta
2008-11-13  3:12       ` Ryo Tsuruta
2008-11-13  3:13       ` [PATCH 4/8] bio-cgroup: Introduction Ryo Tsuruta
2008-11-13  3:13         ` Ryo Tsuruta
2008-11-13  3:13         ` [PATCH 5/8] bio-cgroup: The new page_cgroup framework Ryo Tsuruta
2008-11-13  3:13         ` Ryo Tsuruta
2008-11-13  3:13           ` Ryo Tsuruta
2008-11-13  3:14           ` [PATCH 6/8] bio-cgroup: The body of bio-cgroup Ryo Tsuruta
2008-11-13  3:14             ` Ryo Tsuruta
2008-11-13  3:15             ` [PATCH 7/8] bio-cgroup: Page tracking hooks Ryo Tsuruta
     [not found]             ` <20081113.121433.632868945383117577.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:15               ` Ryo Tsuruta
2008-11-13  3:15             ` Ryo Tsuruta
     [not found]               ` <20081113.121507.722554765817591488.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:15                 ` [PATCH 8/8] bio-cgroup: Add a cgroup support to dm-ioband Ryo Tsuruta
2008-11-13  3:15               ` Ryo Tsuruta
2008-11-13  3:15               ` Ryo Tsuruta
2008-11-14  7:01             ` [PATCH 6/8] bio-cgroup: The body of bio-cgroup Takuya Yoshikawa
2008-11-14  7:01               ` Takuya Yoshikawa
     [not found]               ` <491D2251.7010702-gVGce1chcLdL9jVzuh4AOg@public.gmane.org>
2008-11-16 11:46                 ` Hirokazu Takahashi
2008-11-16 11:46               ` Hirokazu Takahashi
2008-11-16 11:46                 ` Hirokazu Takahashi
     [not found]                 ` <20081116.204609.65892670.taka-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-17  7:50                   ` [PATCH 6/8]Re: " Takuya Yoshikawa
2008-11-17  7:50                 ` Takuya Yoshikawa
2008-11-17  7:50                 ` Takuya Yoshikawa [this message]
2008-11-17  7:50                   ` Takuya Yoshikawa
2008-11-16 11:46               ` [PATCH 6/8] " Hirokazu Takahashi
2008-11-13  3:14           ` Ryo Tsuruta
     [not found]         ` <20081113.121322.653026525707351102.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:13           ` [PATCH 5/8] bio-cgroup: The new page_cgroup framework Ryo Tsuruta
2008-11-13  3:13       ` [PATCH 4/8] bio-cgroup: Introduction Ryo Tsuruta
     [not found]       ` <20081113.121254.625859878544534829.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:13         ` Ryo Tsuruta
2008-11-13  3:12   ` [PATCH 2/8] dm-ioband: Source code and patch Ryo Tsuruta
     [not found]   ` <20081113.121146.623571555980959797.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:12     ` Ryo Tsuruta
2008-11-13 19:36   ` [PATCH 1/8] dm-ioband: Introduction Phillip Susi
2008-11-13 19:36     ` [dm-devel] " Phillip Susi
2008-11-13 23:18     ` Hirokazu Takahashi
2008-11-13 23:18       ` Hirokazu Takahashi
2008-11-14  1:38       ` Ryo Tsuruta
2008-11-14  1:38         ` Ryo Tsuruta
2008-11-13  3:50 ` [PATCH 0/8] I/O bandwidth controller and BIO tracking Ryo Tsuruta
2008-12-05 12:19   ` Ryo Tsuruta
2008-12-24  8:36     ` Ryo Tsuruta
     [not found] ` <20081113.121019.104237699577193919.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
2008-11-13  3:11   ` [PATCH 1/8] dm-ioband: Introduction Ryo Tsuruta
2008-11-13  6:30   ` [PATCH 0/8] I/O bandwidth controller and BIO tracking KAMEZAWA Hiroyuki
2008-11-13  6:30 ` KAMEZAWA Hiroyuki
2008-11-13  6:30   ` KAMEZAWA Hiroyuki
     [not found]   ` <20081113153019.cc50d42c.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2008-11-13  7:26     ` [dm-devel] " Hirokazu Takahashi
2008-11-13  7:26   ` Hirokazu Takahashi
2008-11-13  7:26     ` Hirokazu Takahashi
2008-11-13 16:20     ` Balbir Singh
2008-11-13 23:15       ` Hirokazu Takahashi
2008-11-13 23:15         ` Hirokazu Takahashi
2008-11-13  7:26   ` Hirokazu Takahashi
2008-11-13  6:30 ` KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4921225B.4000503@oss.ntt.co.jp \
    --to=yoshikawa.takuya@oss.ntt.co.jp \
    --cc=agk@sourceware.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=dm-devel@redhat.com \
    --cc=fernando@oss.ntt.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ngupta@google.com \
    --cc=s-uchida@ap.jp.nec.com \
    --cc=taka@valinux.co.jp \
    --cc=vgoyal@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xemul@openvz.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.