From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from ns.sciencehorizons.net ([71.41.210.147]) by bombadil.infradead.org with smtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1bC1cF-0005kw-KE for linux-mtd@lists.infradead.org; Sun, 12 Jun 2016 09:23:36 +0000 Date: 12 Jun 2016 05:23:13 -0400 Message-ID: <20160612092313.4046.qmail@ns.sciencehorizons.net> From: "George Spelvin" To: boris.brezillon@free-electrons.com, linux@sciencehorizons.net Subject: Re: [PATCH 2/4] mtd: nand: implement two pairing scheme Cc: computersforpeace@gmail.com, linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org, richard@nod.at In-Reply-To: <20160612092019.79b57b7f@bbrezillon> List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , >> It also applies an offset of +1, to avoid negative numbers and the >> problems of signed divides. > It seems to cover all cases. I wasn't sure why you used a signed int for the interface. (Another thing I thought of, but am less sure of, is packing the group and pair numbers into a register-passable int rather than a structure. Even 2 bits for the group is probably the most that will ever be needed, but it's easy to say the low 4 bits are the group and the high 28 are the pair. Just create a few access macros to pull them apart. This was inspired by Linus's hash_len abstraction, recently moved to ) >> (or you could add an mtd->write_per_erase field). > Okay. Actually I'd like to avoid adding new 'conversion' fields to the > mtd_info struct. Not sure we are really improving perfs when doing that, > since what takes long is the I/O ops between the flash and the > controller not the conversion operations. Well, yes, but you may need to do conversion ops for in-memory cache lookups or searching for free blocks, or wear-levelling computations, all of which may involve a great many conversions per actual I/O. (In hindsight, I'd wish for writesize and write_per_erase, and not store erasesize explicitly. Not only is the multiply more efficient, but it abolishes the error of an erase size which is not a multiple of the write size by making it impossible.) > Can we have a boolean to make it clearer? > > bool lastpage = ((page + 1) * mtd->writesize) == mtd->erasesize; An improvement IMHO. You can use the same name in all four functions to make the equivalence clear. > Also, the page update is quite obscure for people that did not have the > explanation you gave above. Can we make it > /* > * The first and last pages are not surrounded by other pages, > * and are thus less sensitive to read/write disturbance. > * That's why NAND vendors decided to use a different distance > * for these 2 specific case, which complicates a bit the > * pairing scheme logic. Um... this is, as far as I can tell, complete nonsense. I realize you know this about a thousand times better than I do, so I'm hesitant to make such a strong statement, but one thing that I do know is that paired pages are stored in the exact same transistors. The pairing is purely a logical addressing distance. The physical distance is exactly zero. The qustion is why they chose this particular *logival* addressing scheme, and I believe the reason is write bandwidth for the common case of streaming writes to consecutive pages. The obvious thing to do is pair consecutive even and odd pages (pages 0 and 1, then 2 and 3, then...), but that makes it hard to pipeline programming of the two pages. You can't start programming page 1 until page 0 is finished. The next obvious thing is stride-2: 0<->2, 1<->3, 4<->6, 5<->7, etc. This is done in some MLC chips. See p. 98 of this Micron data sheet: http://pdf.datasheet.directory/datasheets-0/micron_technology/MT29F32G08CBACAWP_C.pdf which has a stride-4 pairing. 0..3 pair with 4..8, then 9..11 with 12..15, and so on. However, it's desirable to alternate group-0 and group-1 pages, since the write operations are rather different and even take different amounts of time. Alternating them makes it possible to: 1) Possibly overlap parts of the writes that use different on-chip resources, 2) Average the non-overlapping times for minimum jitter. This leads naturally to the stride-3 solution. You want to minimize the stride because you can read both pages in a pair with one read disturbance, and the shorter the distance, the more likely you'll want both pages (and the less buffering you'll need to make both available). Stride-3 does have those two awkward edge cases, and changing the stride is simply the simplest way to special-case them. > Thanks for your valuable review/suggestions. > > Just out of curiosity, why are you interested in the pairing scheme > concept? Are you working with NANDs? Not at present, but I do embedded hardware and might some day. Also, the data sheets are a real PITA to find. I have yet to see an actual data sheet that documents the stride-3 pairing scheme.