From mboxrd@z Thu Jan 1 00:00:00 1970 From: Bob Peterson Date: Tue, 22 Dec 2020 21:47:37 -0500 (EST) Subject: [Cluster-devel] [GFS2 PATCH] gfs2: make recovery workqueue operate on a gfs2 mount point, not journal In-Reply-To: References: <290202568.38904309.1608669529163.JavaMail.zimbra@redhat.com> <2125295377.38904313.1608669538740.JavaMail.zimbra@redhat.com> Message-ID: <1734775184.38918835.1608691657308.JavaMail.zimbra@redhat.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit ----- Original Message ----- > On Tue, Dec 22, 2020 at 9:39 PM Bob Peterson wrote: > > > > Hi, > > > > Before this patch, journal recovery was done by a workqueue function that > > operated on a per-journal basis. The problem is, these could run > > simultaneously > > which meant that they could all use the same bio, sd_log_bio, to do their > > writing to all the various journals. These operations overwrote one another > > eventually causing memory corruption. > > > > This patch makes the recovery workqueue operate on a per-superblock basis, > > which means a mount point using, for example journal0, could do recovery > > for all journals that need recovery. This is done consecutively so the > > sd_log_bio is only referenced by one recovery at a time, thus avoiding the > > chaos. > > > > Since the journal recovery requests can come in any order, and > > unpredictably, > > the new work func loops until there are no more journals to be recovered. > > > > Since multiple processes may request recovery of a journal, and since they > > all now use the same sdp-based workqueue, it's okay for them to get an > > error from queue_work: Queueing work while there's already work queued. > > > > Signed-off-by: Bob Peterson > > Can't this be a simple loop like below? > > repeat: > spin_lock(&sdp->sd_jindex_spin); > list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) { > if (test_bit(JDF_RECOVERY, &jd->jd_flags)) { > spin_unlock(&sdp->sd_jindex_spin); > gfs2_recover_one(jd); > goto repeat; > } > } > spin_unlock(&sdp->sd_jindex_spin); > Yes, that's just as effective. I just hate gotos. Whichever you prefer is fine with me. I'm okay with not reporting the count too. Bob