From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from lithops.sigma-star.at ([195.201.40.130])
	by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux))
	id 1gEVY6-0004sJ-Vm
	for linux-mtd@lists.infradead.org; Mon, 22 Oct 2018 08:26:58 +0000
From: Richard Weinberger
To: Rafał Miłecki
Cc: Amir Goldstein, Miklos Szeredi, linux-unionfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Artem Bityutskiy, Adrian Hunter,
	linux-mtd@lists.infradead.org, Russell Senior,
	OpenWrt Development List
Subject: Re: Regression in handling power cuts since 3a1e819b4e80 ("ovl: store file handle of lower inode on copy up")
Date: Mon, 22 Oct 2018 10:26:29 +0200
Message-ID: <3000620.g91H2S8sUk@blindfold>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
List-Id: Linux MTD discussion mailing list

On Monday, 22 October 2018, 09:14:08 CEST, Rafał Miłecki wrote:
> On Fri, 19 Oct 2018 at 14:31, Rafał Miłecki wrote:
> > Since OpenWrt switch from kernel 4.9 to 4.14 users started randomly
> > reporting file system corruptions. OpenWrt uses overlay(fs) with
> > squashfs as lowerdir and ubifs as upperdir. Russell managed to isolate
> > & describe test case for reproducing corruption when doing a power cut
> > after first boot.
> >
> > (...)
> >
> > Can I ask you to check if there is something possibly wrong with the
> > above ovl commit? Or does it expose some problem with the ubifs? Or
> > maybe the whole UBI?
> >
> > FWIW testing above commit (and one before it) always results in single
> > error in the kernel log:
> > [ 14.250184] UBIFS error (ubi0:1 pid 637): ubifs_add_orphan: orphaned twice
> >
> > That UBIFS error doesn't occur with 4.12.14. Unfortunately it's
> > impossible to cleanly revert 3a1e819b4e80 from the top of 4.12.14.
>
> Let me provide a summary of all relevant commits & tests:
>
> By "Corruption" I mean file system corruption after power cut

Well, is the filesystem not consistent anymore?
From what Russell explained to me, I thought the main problem is that no writeback happens.
IOW the inode is present, has the correct length, but no content is there (all zeros).

Just like the typical case where userspace does not fsync.
But in your case sooner or later writeback should have happened because the writeback timer
fires at some point.

Thanks,
//richard
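
For illustration only, the "userspace does not fsync" case mentioned above can be contrasted with a write that is explicitly flushed. This is a minimal sketch and not code from this thread: the helper name durable_write and the ".tmp" suffix are assumptions, and it shows only the common write, fsync, rename, fsync-directory pattern on a POSIX system. It says nothing about whether the overlayfs/UBIFS regression discussed here is affected by it.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Hypothetical helper: write data to path and flush it to stable storage."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # assumed naming, for illustration only

    # Write to a temporary file so the target is never seen half-written.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may still sit in the page cache when
        # power is lost; it is only written out later by kernel writeback.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Atomically move the fully written file over the target name.
    os.replace(tmp_path, path)

    # Flush the directory entry so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A caller would use durable_write(some_path, payload) in place of a plain open/write/close sequence; a program that skips the fsync calls relies entirely on the periodic writeback mentioned above.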