From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner+w=401wt.eu-S1753853AbZKCX5m@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753853AbZKCX5m (ORCPT <rfc822;w@1wt.eu>);
	Tue, 3 Nov 2009 18:57:42 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753045AbZKCX5l
	(ORCPT <rfc822;linux-kernel-outgoing>);
	Tue, 3 Nov 2009 18:57:41 -0500
Received: from e7.ny.us.ibm.com ([32.97.182.137]:48390 "EHLO e7.ny.us.ibm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752831AbZKCX5k (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 3 Nov 2009 18:57:40 -0500
Date: Tue, 3 Nov 2009 15:57:44 -0800
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Gregory Haskins <gregory.haskins@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
       "Michael S. Tsirkin" <mst@redhat.com>, netdev@vger.kernel.org,
       virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
       linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org,
       akpm@linux-foundation.org, hpa@zytor.com,
       Rusty Russell <rusty@rustcorp.com.au>, s.hetze@linux-ag.com
Subject: Re: [PATCHv7 3/3] vhost_net: a kernel-level virtio server
Message-ID: <20091103235744.GF6726@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <cover.1257267892.git.mst@redhat.com> <20091103172422.GD5591@redhat.com> <4AF0708B.4020406@gmail.com> <4AF07199.2020601@gmail.com> <4AF072EE.9020202@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <4AF072EE.9020202@gmail.com>
User-Agent: Mutt/1.5.15+20070412 (2007-04-11)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 03, 2009 at 01:14:06PM -0500, Gregory Haskins wrote:
> Gregory Haskins wrote:
> > Eric Dumazet wrote:
> >> Michael S. Tsirkin wrote:
> >>> +static void handle_tx(struct vhost_net *net)
> >>> +{
> >>> +	struct vhost_virtqueue *vq = &net->dev.vqs[VHOST_NET_VQ_TX];
> >>> +	unsigned head, out, in, s;
> >>> +	struct msghdr msg = {
> >>> +		.msg_name = NULL,
> >>> +		.msg_namelen = 0,
> >>> +		.msg_control = NULL,
> >>> +		.msg_controllen = 0,
> >>> +		.msg_iov = vq->iov,
> >>> +		.msg_flags = MSG_DONTWAIT,
> >>> +	};
> >>> +	size_t len, total_len = 0;
> >>> +	int err, wmem;
> >>> +	size_t hdr_size;
> >>> +	struct socket *sock = rcu_dereference(vq->private_data);
> >>> +	if (!sock)
> >>> +		return;
> >>> +
> >>> +	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> >>> +	if (wmem >= sock->sk->sk_sndbuf)
> >>> +		return;
> >>> +
> >>> +	use_mm(net->dev.mm);
> >>> +	mutex_lock(&vq->mutex);
> >>> +	vhost_no_notify(vq);
> >>> +
> >> Using rcu_dereference() and mutex_lock() at the same time seems wrong; I suspect
> >> that your use of RCU is not correct.
> >>
> >> 1) rcu_dereference() should be done inside an rcu_read_lock() section, and
> >>    we are not allowed to sleep in such a section.
> >>    (Quoting Documentation/RCU/whatisRCU.txt :
> >>      It is illegal to block while in an RCU read-side critical section, )
> >>
> >> 2) mutex_lock() can sleep (i.e. block)
> >>
> > 
> > 
> > Michael,
> >   I warned you that this needed better documentation ;)
> > 
> > Eric,
> >   I think I flagged this once before, but Michael convinced me that it
> > was indeed "ok", if perhaps a bit unconventional.  I will try to
> > find the thread.
> > 
> > Kind Regards,
> > -Greg
> > 
> 
> Here it is:
> 
> http://lkml.org/lkml/2009/8/12/173

What was happening in that case was that the rcu_dereference()
was being used in a workqueue item.  The role of rcu_read_lock()
was taken on by the start of execution of the workqueue item, that of
rcu_read_unlock() by the end of execution of the workqueue item, and
that of synchronize_rcu() by flush_workqueue().  This does work, at least
assuming that flush_workqueue() operates as advertised, which it appears
to at first glance.
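
To make that mapping concrete, here is a minimal userspace sketch of the
pattern, not the vhost code itself.  It assumes a single worker thread
standing in for the workqueue, and every name in it (sock_model,
handle_tx_model(), queue_work_model(), flush_model(), set_socket_model(),
run_demo()) is made up for illustration.  A C11 atomic load stands in for
rcu_dereference(), and a wait-until-idle loop stands in for
flush_workqueue().

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the socket hanging off vq->private_data. */
struct sock_model { int id; };

static _Atomic(struct sock_model *) private_data;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cond = PTHREAD_COND_INITIALIZER;
static int queued, running, stopping;	/* all protected by q_lock */
static long id_sum;			/* written only by the worker */

/* Work item: the "read side" spans from entry to return. */
static void handle_tx_model(void)
{
	/* Plays the role of rcu_dereference(). */
	struct sock_model *sock =
		atomic_load_explicit(&private_data, memory_order_acquire);

	if (!sock)
		return;
	/* Blocking here is fine: sock is not freed before we return. */
	id_sum += sock->id;
}

static void *worker(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&q_lock);
	for (;;) {
		while (!queued && !stopping)
			pthread_cond_wait(&q_cond, &q_lock);
		if (!queued)
			break;		/* stopping and fully drained */
		queued--;
		running = 1;		/* rcu_read_lock() analogue */
		pthread_mutex_unlock(&q_lock);
		handle_tx_model();
		pthread_mutex_lock(&q_lock);
		running = 0;		/* rcu_read_unlock() analogue */
		pthread_cond_broadcast(&q_cond);
	}
	pthread_mutex_unlock(&q_lock);
	return NULL;
}

static void queue_work_model(void)
{
	pthread_mutex_lock(&q_lock);
	queued++;
	pthread_cond_broadcast(&q_cond);
	pthread_mutex_unlock(&q_lock);
}

/* flush_workqueue() analogue: return once nothing is queued or running. */
static void flush_model(void)
{
	pthread_mutex_lock(&q_lock);
	while (queued || running)
		pthread_cond_wait(&q_cond, &q_lock);
	pthread_mutex_unlock(&q_lock);
}

/* Updater: publish the new pointer, wait out readers, free the old one. */
static void set_socket_model(struct sock_model *newsock)
{
	struct sock_model *old = atomic_exchange_explicit(&private_data,
				newsock, memory_order_acq_rel);

	flush_model();		/* synchronize_rcu() analogue */
	free(old);		/* free(NULL) is a no-op */
}

/* Swaps the pointer 100 times while work may be in flight; 0 on success. */
int run_demo(void)
{
	pthread_t t;
	int i;

	if (pthread_create(&t, NULL, worker, NULL))
		return -1;
	for (i = 0; i < 100; i++) {
		struct sock_model *s = malloc(sizeof(*s));

		if (!s)
			abort();
		s->id = i;
		set_socket_model(s);	/* may overlap the previous item */
		queue_work_model();
	}
	set_socket_model(NULL);
	pthread_mutex_lock(&q_lock);
	stopping = 1;
	pthread_cond_broadcast(&q_cond);
	pthread_mutex_unlock(&q_lock);
	pthread_join(t, NULL);
	return 0;
}
```

Calling run_demo() returns 0 once every swap has completed.  The point of
the sketch is that the work item is free to block, because nothing frees
the old pointer until flush_model() has seen the queue go idle.  This only
holds if every reader runs as a work item on the queue being flushed.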

The above code looks somewhat different, however -- I don't see
handle_tx() being executed in the context of a workqueue.  Instead,
it appears to run in an interrupt handler.

So what is the story?  Using synchronize_irq() or some such?

							Thanx, Paul
