From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mescal.linbit (office.linbit [213.229.1.138])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.linbit.com (LINBIT Mail Daemon) with ESMTP id 84DCD2CE82C8
	for ; Wed, 16 Aug 2006 10:51:48 +0200 (CEST)
From: Philipp Reisner
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] DRBD-8 - crash due to NULL page* in drbd_send_page
Date: Wed, 16 Aug 2006 10:52:00 +0200
References: <342BAC0A5467384983B586A6B0B37671036252EC@EXNA.corp.stratus.com>
	<200608161044.31669.philipp.reisner@linbit.com>
In-Reply-To: <200608161044.31669.philipp.reisner@linbit.com>
MIME-Version: 1.0
Content-Type: Multipart/Mixed;
	boundary="Boundary-00=_wyt4EJrL7MruoFc"
Message-Id: <200608161052.00607.philipp.reisner@linbit.com>
List-Id: Coordination of development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

--Boundary-00=_wyt4EJrL7MruoFc
Content-Type: text/plain; charset="iso-8859-15"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

> > Assuming for the minute that this IS the cause, what would a suitable
> > solution be? We really need to delay processing the Ack until the
> > send-dblock/send-block has finished -- i.e. we should wait until the
> > RQ_DRBD_ON_WIRE flag is set in the request -- is there something
> > suitable we could issue a wait_event_interruptible() on in
> > got_BlockAck() to wait for this?
[...]
> I attached the patch. I guess you will rerun your tests with this
> patch. [ it is completely untested ]
>

And the second version of that patch...

-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :

--Boundary-00=_wyt4EJrL7MruoFc
Content-Type: text/x-diff; charset="iso-8859-15"; name="for_simon2.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="for_simon2.diff"

Index: drbd_worker.c
===================================================================
--- drbd_worker.c	(revision 2373)
+++ drbd_worker.c	(working copy)
@@ -564,12 +564,10 @@
 
 	ok = drbd_send_dblock(mdev,req);
 	if (ok) {
-		spin_lock_irq(&mdev->req_lock);
-		req->rq_status |= RQ_DRBD_ON_WIRE;
-		spin_unlock_irq(&mdev->req_lock);
-
 		inc_ap_pending(mdev);
 
+		drbd_end_req(req,RQ_DRBD_ON_WIRE,1,drbd_req_get_sector(req));
+
 		if(mdev->net_conf->wire_protocol == DRBD_PROT_A) {
 			dec_ap_pending(mdev);
 			drbd_end_req(req, RQ_DRBD_SENT, 1,
Index: drbd_req.c
===================================================================
--- drbd_req.c	(revision 2373)
+++ drbd_req.c	(working copy)
@@ -341,7 +341,7 @@
 	if (!local)
 		req->rq_status |= RQ_DRBD_LOCAL;
 	if (!remote)
-		req->rq_status |= RQ_DRBD_SENT;
+		req->rq_status |= RQ_DRBD_SENT | RQ_DRBD_ON_WIRE;
 
 	/* we need to plug ALWAYS since we possibly need to kick lo_dev */
 	drbd_plug_device(mdev);
Index: drbd_int.h
===================================================================
--- drbd_int.h	(revision 2373)
+++ drbd_int.h	(working copy)
@@ -233,9 +233,9 @@
 #define RQ_DRBD_NOTHING   0x0001
 #define RQ_DRBD_SENT      0x0010 // We got an ack
 #define RQ_DRBD_LOCAL     0x0020 // We wrote it to the local disk
-#define RQ_DRBD_DONE      0x0030 // We are done ;)
 #define RQ_DRBD_IN_TL     0x0040 // Set when it is in the TL
 #define RQ_DRBD_ON_WIRE   0x0080 // Set as soon as it is on the socket...
+#define RQ_DRBD_DONE      ( RQ_DRBD_SENT + RQ_DRBD_LOCAL + RQ_DRBD_ON_WIRE )
 
 /* drbd_meta-data.c (still in drbd_main.c) */
 #define DRBD_MD_MAGIC (DRBD_MAGIC+4) // 4th incarnation of the disk layout.
--Boundary-00=_wyt4EJrL7MruoFc--
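
A note on the flag arithmetic in the attached diff, as a minimal standalone
sketch and not DRBD code. The old RQ_DRBD_DONE value 0x0030 equals
RQ_DRBD_SENT + RQ_DRBD_LOCAL, so a request could count as done once the ack
and the local write had both been seen, even if the sender had not yet set
RQ_DRBD_ON_WIRE. With the patched definition the mask is 0x00B0, so all three
events are required. The flag values below are copied from the diff; the
struct sketch_req type and the sketch_end_req() helper are hypothetical
stand-ins, and the assumption that completion is tested by comparing the
status word against RQ_DRBD_DONE is mine (the real drbd_end_req() is not
shown in this mail).

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as they appear in the patched drbd_int.h. */
#define RQ_DRBD_SENT    0x0010  /* We got an ack */
#define RQ_DRBD_LOCAL   0x0020  /* We wrote it to the local disk */
#define RQ_DRBD_ON_WIRE 0x0080  /* Set as soon as it is on the socket */
#define RQ_DRBD_DONE    (RQ_DRBD_SENT + RQ_DRBD_LOCAL + RQ_DRBD_ON_WIRE)

/* Hypothetical stand-in for a request: only the status word is modelled. */
struct sketch_req {
    unsigned int rq_status;
};

/*
 * Hypothetical helper: record one completion event and report whether
 * every bit of RQ_DRBD_DONE is now set. Locking is left out on purpose;
 * the sketch is single-threaded.
 */
static bool sketch_end_req(struct sketch_req *req, unsigned int flag)
{
    assert(flag == RQ_DRBD_SENT ||
           flag == RQ_DRBD_LOCAL ||
           flag == RQ_DRBD_ON_WIRE);

    req->rq_status |= flag;
    return (req->rq_status & RQ_DRBD_DONE) == RQ_DRBD_DONE;
}

/*
 * The ordering discussed in this thread: the ack is processed before the
 * sender has marked the request as on the wire. Returns whether the
 * request would already be considered done at that point.
 */
static bool sketch_done_after_early_ack(void)
{
    struct sketch_req req = { 0 };

    sketch_end_req(&req, RQ_DRBD_LOCAL);        /* local write completed */
    return sketch_end_req(&req, RQ_DRBD_SENT);  /* ack arrives early */
}
```

Under these assumptions the early-ack ordering leaves the request incomplete
until the sending side reports RQ_DRBD_ON_WIRE as well, which appears to be
the point of routing that flag through drbd_end_req() in drbd_worker.c.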