linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Devesh Sharma <devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org>
Cc: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Jason Gunthorpe
	<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Subject: Re: [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC
Date: Tue, 13 Oct 2015 11:24:20 +0300	[thread overview]
Message-ID: <561CBFB4.1020504@mellanox.com> (raw)
In-Reply-To: <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>



On 10/12/2015 3:59 PM, Devesh Sharma wrote:
> Looks good, just one doubt inline:
>

Thanks for looking at this patch.

> On Sun, Oct 11, 2015 at 6:28 PM, Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> When IP based addressing was introduced, ib_create_ah_from_wc was
>> changed in order to support a suitable AH. Since this AH should
>> now contains the DMAC (which isn't a simple derivative of the GID).
>> In order to find the DMAC, an ARP should sometime be sent. This ARP
>> is a sleeping context.
>>
>> ib_create_ah_from_wc is called from cm_alloc_response_msg, which is
>> sometimes called from an atomic context. This caused a
>> sleeping-while-atomic bug. Fixing this by splitting
>> cm_alloc_response_msg to an atomic and sleep-able part. When
>> cm_alloc_response_msg is used in an atomic context, we try to create
>> the AH before entering the atomic context.
>>
>> Fixes: 66bd20a72d2f ('IB/core: Ethernet L2 attributes in verbs/cm structures')
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>
>> Hi Doug,
>> This patch fixes an old bug in the CM. IP based addressing requires
>> ARP resolution which isn't sleep-able by its nature. This resolution
>> was sometimes done in non sleep-able context. Our regression tests
>> picked up this bug and verified this fix.
>>
>> Thanks,
>> Matan
>>
>>   drivers/infiniband/core/cm.c | 60 ++++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 49 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index ea4db9c..f5cf1c4 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -287,17 +287,12 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
>>          return 0;
>>   }
>>
>> -static int cm_alloc_response_msg(struct cm_port *port,
>> -                                struct ib_mad_recv_wc *mad_recv_wc,
>> -                                struct ib_mad_send_buf **msg)
>> +static int _cm_alloc_response_msg(struct cm_port *port,
>> +                                 struct ib_mad_recv_wc *mad_recv_wc,
>> +                                 struct ib_ah *ah,
>> +                                 struct ib_mad_send_buf **msg)
>>   {
>>          struct ib_mad_send_buf *m;
>> -       struct ib_ah *ah;
>> -
>> -       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> -                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> -       if (IS_ERR(ah))
>> -               return PTR_ERR(ah);
>>
>>          m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
>>                                 0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
>> @@ -312,6 +307,20 @@ static int cm_alloc_response_msg(struct cm_port *port,
>>          return 0;
>>   }
>>
>> +static int cm_alloc_response_msg(struct cm_port *port,
>> +                                struct ib_mad_recv_wc *mad_recv_wc,
>> +                                struct ib_mad_send_buf **msg)
>> +{
>> +       struct ib_ah *ah;
>> +
>> +       ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
>> +                                 mad_recv_wc->recv_buf.grh, port->port_num);
>> +       if (IS_ERR(ah))
>> +               return PTR_ERR(ah);
>> +
>> +       return _cm_alloc_response_msg(port, mad_recv_wc, ah, msg);
>> +}
>> +
>>   static void cm_free_msg(struct ib_mad_send_buf *msg)
>>   {
>>          ib_destroy_ah(msg->ah);
>> @@ -2201,6 +2210,7 @@ static int cm_dreq_handler(struct cm_work *work)
>>          struct cm_id_private *cm_id_priv;
>>          struct cm_dreq_msg *dreq_msg;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
>> @@ -2213,6 +2223,11 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  return -EINVAL;
>>          }
>>
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>
> Shouldn't below IS_ERR(ah) on ah be here, instead of there?
>

I don't think we want to fail on error if the state != IB_CM_TIMEWAIT 
(other states don't use this ah).

>>          work->cm_event.private_data = &dreq_msg->private_data;
>>
>>          spin_lock_irq(&cm_id_priv->lock);
>> @@ -2234,9 +2249,13 @@ static int cm_dreq_handler(struct cm_work *work)
>>          case IB_CM_TIMEWAIT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_DREQ_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
>>                                 cm_id_priv->private_data,
>>                                 cm_id_priv->private_data_len);
>> @@ -2259,6 +2278,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2266,6 +2287,8 @@ static int cm_dreq_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> @@ -2761,6 +2784,7 @@ static int cm_lap_handler(struct cm_work *work)
>>          struct cm_lap_msg *lap_msg;
>>          struct ib_cm_lap_event_param *param;
>>          struct ib_mad_send_buf *msg = NULL;
>> +       struct ib_ah *ah;
>>          int ret;
>>
>>          /* todo: verify LAP request and send reject APR if invalid. */
>> @@ -2770,6 +2794,12 @@ static int cm_lap_handler(struct cm_work *work)
>>          if (!cm_id_priv)
>>                  return -EINVAL;
>>
>> +
>> +       ah = ib_create_ah_from_wc(work->port->mad_agent->qp->pd,
>> +                                 work->mad_recv_wc->wc,
>> +                                 work->mad_recv_wc->recv_buf.grh,
>> +                                 work->port->port_num);
>> +
>>          param = &work->cm_event.param.lap_rcvd;
>>          param->alternate_path = &work->path[0];
>>          cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
>> @@ -2786,9 +2816,13 @@ static int cm_lap_handler(struct cm_work *work)
>>          case IB_CM_MRA_LAP_SENT:
>>                  atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
>>                                  counter[CM_LAP_COUNTER]);
>> -               if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
>> +               if (IS_ERR(ah))
>> +                       goto unlock;
>> +               if (_cm_alloc_response_msg(work->port, work->mad_recv_wc, ah,
>> +                                          &msg))
>>                          goto unlock;
>>
>> +               ah = NULL;
>>                  cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
>>                                CM_MSG_RESPONSE_OTHER,
>>                                cm_id_priv->service_timeout,
>> @@ -2818,6 +2852,8 @@ static int cm_lap_handler(struct cm_work *work)
>>                  list_add_tail(&work->list, &cm_id_priv->work_list);
>>          spin_unlock_irq(&cm_id_priv->lock);
>>
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>          if (ret)
>>                  cm_process_work(cm_id_priv, work);
>>          else
>> @@ -2825,6 +2861,8 @@ static int cm_lap_handler(struct cm_work *work)
>>          return 0;
>>
>>   unlock:        spin_unlock_irq(&cm_id_priv->lock);
>> +       if (!IS_ERR_OR_NULL(ah))
>> +               ib_destroy_ah(ah);
>>   deref: cm_deref_id(cm_id_priv);
>>          return -EINVAL;
>>   }
>> --
>> 2.1.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-10-13  8:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-11 12:58 [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Matan Barak
     [not found] ` <1444568298-17289-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-11 12:58   ` [PATCH rdma-RC] IB/cm: Fix sleeping while atomic when creating AH from WC Matan Barak
     [not found]     ` <1444568298-17289-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-12 12:59       ` Devesh Sharma
     [not found]         ` <CANjDDBjGe9a_gg-51=ysxMyDPjuD6rQg4FLyfZH8E1TCoEYLKQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13  8:24           ` Matan Barak [this message]
2015-10-12 16:42       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A82373A9734356-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-13  8:22           ` Matan Barak
     [not found]             ` <CAAKD3BCe_LAuyxifm=j-Am44S1k4nT328WrGBC+Day+XxMxk9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-13 16:18               ` Hefty, Sean
     [not found]                 ` <1828884A29C6694DAF28B7E6B8A82373A9734A57-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-14  7:44                   ` Matan Barak
     [not found]                     ` <CAAKD3BAF763brdhsrHtpm_peHk--g3iza53AioiUefepHM_s2w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-15 16:58                       ` Hefty, Sean
     [not found]                         ` <1828884A29C6694DAF28B7E6B8A82373A9735914-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-18  7:28                           ` Matan Barak
     [not found]                             ` <CAAKD3BCdbD8MC4PJGuVzUxiq5wEbCNjX4e91vBkcoErJVM8FQg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-20 15:57                               ` Hefty, Sean
     [not found]                                 ` <1828884A29C6694DAF28B7E6B8A82373A9736832-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-20 16:03                                   ` Hal Rosenstock
2015-10-20 16:36                                   ` Jason Gunthorpe
2015-12-23 20:04                           ` Doug Ledford
     [not found]                             ` <567AFE42.2080107-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-12-24  7:46                               ` Matan Barak
     [not found]                                 ` <CAAKD3BBGhWqp7kJfBQAhYHDVz5JgjNg4M+0GBTwg7Us4hioO-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-05 11:33                                   ` Matan Barak
2015-10-11 15:28   ` [PATCH rdma-RC] IB/cm: Fix rb-tree duplicate free and use-after-free Or Gerlitz
     [not found]     ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310@HE1PR05MB1466.eurprd05.prod.outlook.com>
     [not found]       ` <HE1PR05MB1466FC3AF0B30533033EC1B0B0310-eBadYZ65MZ+I1hPkL3GmLNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-10-12 13:14         ` Or Gerlitz
2015-10-12 16:37   ` Hefty, Sean
     [not found]     ` <1828884A29C6694DAF28B7E6B8A82373A9734333-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2015-10-15 15:15       ` Matan Barak
     [not found]         ` <561FC309.2030102-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-20 20:27           ` Doug Ledford
     [not found]             ` <5626A39D.6030906-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-21 19:58               ` Doug Ledford
     [not found]                 ` <5627EE5A.7030303-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-26 17:39                   ` Hefty, Sean

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=561CBFB4.1020504@mellanox.com \
    --to=matanb-vpraknaxozvwk0htik3j/w@public.gmane.org \
    --cc=devesh.sharma-1wcpHE2jlwO1Z/+hSey0Gg@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).