* bitmap chunk size
From: Darius S. Naqvi @ 2009-11-10 16:39 UTC
  To: linux-raid

Is there any possibility of having a bitmap chunk size of 512 bytes?
I know that mdadm rejects anything under 4k.  I fear that the
assumption of the 4k minimum is embedded fairly strongly in the code.
Can my fear be alleviated?

-- 
Darius S. Naqvi
dnaqvi@datagardens.com
http://www.datagardens.com
* Re: bitmap chunk size
From: Doug Ledford @ 2009-11-10 18:01 UTC
  To: Darius S. Naqvi; +Cc: linux-raid

On 11/10/2009 11:39 AM, Darius S. Naqvi wrote:
> Is there any possibility of having a bitmap chunk size of 512 bytes?
> I know that mdadm rejects anything under 4k.  I fear that the
> assumption of the 4k minimum is embedded fairly strongly in the code.
> Can my fear be alleviated?
>

If you're putting any normal filesystem (with a block size of 4k) on
this, then it makes absolutely no sense to have a bitmap size less than
4k as any given filesystem block is either dirty or clean, sub-block
semantics make no sense in this scenario.  That said, unless you have a
specific need for this level of granularity, it is a really bad idea
(performance-wise and space-wise) to go with anything even close to
resembling the granularity you are requesting.  I usually go with
--bitmap-chunk=32768 (which, since that's expressed in k, means
32 megabytes).  I would actually suspect that if you have a truly
pressing need for a 512-byte bitmap chunk, then you probably don't need
a bare raid, you need some sort of database underlying your data or
something else.

-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
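To put rough numbers on the space side of that trade-off, here is a
minimal Python sketch. It assumes a hypothetical 1 TiB array and assumes
the bitmap costs one bit per chunk; neither figure comes from the
thread, and the helper names are made up for illustration.

```python
KIB = 1024


def bitmap_chunks(array_bytes: int, chunk_bytes: int) -> int:
    """Number of bitmap chunks needed to cover the array (rounded up)."""
    if array_bytes <= 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be positive")
    return (array_bytes + chunk_bytes - 1) // chunk_bytes


def bitmap_bytes(array_bytes: int, chunk_bytes: int) -> int:
    """Bitmap size in bytes, ASSUMING one bit per chunk."""
    return (bitmap_chunks(array_bytes, chunk_bytes) + 7) // 8


if __name__ == "__main__":
    array_bytes = 1024 ** 4  # hypothetical 1 TiB array
    candidates = [
        ("512 B", 512),
        ("4 KiB", 4 * KIB),
        ("32 MiB (--bitmap-chunk=32768)", 32768 * KIB),
    ]
    for label, chunk in candidates:
        chunks = bitmap_chunks(array_bytes, chunk)
        size = bitmap_bytes(array_bytes, chunk)
        print(f"{label:>30}: {chunks:>13,} chunks, {size:>13,} bitmap bytes")
```

Under those assumptions, 512-byte chunks need 2,147,483,648 bits
(256 MiB of bitmap), 4k chunks need 32 MiB, and 32 MiB chunks need
only 4 KiB.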
* Re: bitmap chunk size
From: Darius S. Naqvi @ 2009-11-10 18:36 UTC
  To: linux-raid

On Tue, 10 Nov 2009, Doug Ledford wrote:

> On 11/10/2009 11:39 AM, Darius S. Naqvi wrote:
>> Is there any possibility of having a bitmap chunk size of 512 bytes?
>> I know that mdadm rejects anything under 4k.  I fear that the
>> assumption of the 4k minimum is embedded fairly strongly in the code.
>> Can my fear be alleviated?
>>
>
> If you're putting any normal filesystem (with a block size of 4k) on
> this, then it makes absolutely no sense to have a bitmap size less than
> 4k as any given filesystem block is either dirty or clean, sub-block
> semantics make no sense in this scenario.  That said, unless you have a

Well, the mke2fs(8) man page says, "Valid block size values are 1024,
2048 and 4096 bytes per block.  If omitted, mke2fs block-size is
heuristically determined by the file system size..."

I always thought that filesystems typically used a block size of 4k,
but apparently there is no guarantee that that is the case.  Also, I'm
not sure what Windows uses as a filesystem block size.  This is
important to me, because we're trying to use md raid 1's to
periodically synchronize blocks from a filesystem, and having
sub-chunk writes messes things up for us.  What we want is that a
whole chunk gets written, then we fiddle with things so that a
bitmap-driven resync copies those whole chunks.  The chunks were not
necessarily initialized before the write, so a sub-chunk write means
garbage data is copied in the remainder of the chunk.

We've seen this problem occur in practice, meaning either filesystems
are not using 4k (or multiple thereof) chunks, or for some reason,
writes are not 4k-aligned.  Does anyone with more knowledge of
filesystems know about this?  Perhaps we can force block size and
alignment of filesystems to make this work.

-- 
Darius S. Naqvi
dnaqvi@datagardens.com
http://www.datagardens.com
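The sub-chunk problem described above can be sketched in a few lines of
Python. This is only an illustration of the arithmetic, assuming a
resync copies every whole bitmap chunk that a write touches; the
function names and the example offsets are hypothetical and are not
taken from md or mdadm.

```python
def resync_span(offset: int, length: int, chunk: int) -> tuple[int, int]:
    """Chunk-aligned byte range [start, end) covering a write.

    ASSUMPTION: a bitmap-driven resync copies whole chunks, so any
    chunk the write touches is copied in full.
    """
    if offset < 0 or length <= 0 or chunk <= 0:
        raise ValueError("offset must be >= 0; length and chunk must be > 0")
    start = (offset // chunk) * chunk              # round down to a chunk boundary
    end = -(-(offset + length) // chunk) * chunk   # round up to a chunk boundary
    return start, end


def unwritten_bytes(offset: int, length: int, chunk: int) -> int:
    """Bytes inside the resync span that the write itself never filled."""
    start, end = resync_span(offset, length, chunk)
    return (end - start) - length


if __name__ == "__main__":
    # A 1 KiB filesystem block written at byte offset 5120.
    print(resync_span(5120, 1024, 4096), unwritten_bytes(5120, 1024, 4096))
    print(resync_span(5120, 1024, 512), unwritten_bytes(5120, 1024, 512))
    # A 4 KiB write that is not 4k-aligned touches two 4k chunks.
    print(resync_span(2048, 4096, 4096), unwritten_bytes(2048, 4096, 4096))
```

With a 4k chunk, the 1 KiB write drags along 3072 bytes it never wrote;
with a 512-byte chunk it drags along none. The last case shows that a
full 4k write still pulls in 4096 extra bytes when it is misaligned,
which matches the two explanations (small blocks or unaligned writes)
raised in the message.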
* Re: bitmap chunk size
From: Doug Ledford @ 2009-11-10 19:46 UTC
  To: Darius S. Naqvi; +Cc: linux-raid

On 11/10/2009 01:36 PM, Darius S. Naqvi wrote:
> On Tue, 10 Nov 2009, Doug Ledford wrote:
>
>> On 11/10/2009 11:39 AM, Darius S. Naqvi wrote:
>>> Is there any possibility of having a bitmap chunk size of 512 bytes?
>>> I know that mdadm rejects anything under 4k.  I fear that the
>>> assumption of the 4k minimum is embedded fairly strongly in the code.
>>> Can my fear be alleviated?
>>>
>>
>> If you're putting any normal filesystem (with a block size of 4k) on
>> this, then it makes absolutely no sense to have a bitmap size less than
>> 4k as any given filesystem block is either dirty or clean, sub-block
>> semantics make no sense in this scenario.  That said, unless you have a
>
> Well, the mke2fs(8) man page says, "Valid block size values are 1024,
> 2048 and 4096 bytes per block.  If omitted, mke2fs block-size is
> heuristically determined by the file system size..."

Yes, but you can always supply the -b option and tell it what you want.
I'm actually a little confused as to why you would quote this specific
part of the man page and then, at the end of the mail, ask how you can
force block size on the filesystem...well, the option you just quoted
is how.

> I always thought that filesystems typically used a block size of 4k,

They do, especially if you use any of the modern distributions.  I think
they all pass -b 4096 in when calling mke2fs.  But, even if they didn't,
I think you need a pretty small filesystem before mke2fs will
voluntarily grab a less than 4k block size.

> but apparently there is no guarantee that that is the case.  Also, I'm
> not sure what Windows uses as a filesystem block size.  This is
> important to me, because we're trying to use md raid 1's to
> periodically synchronize blocks from a filesystem, and having
> sub-chunk writes messes things up for us.  What we want is that a
> whole chunk gets written, then we fiddle with things so that a
> bitmap-driven resync copies those whole chunks.  The chunks were not
> necessarily initialized before the write, so a sub-chunk write means
> garbage data is copied in the remainder of the chunk.

So?  If this is truly a raid1, and it wasn't initialized prior to use,
and you copy a larger than write size chunk with garbage at the end, it
doesn't matter, the other drive has garbage there too so you are just
overwriting garbage with garbage.  The only reason it would ever matter
if you copied garbage at the end is if you are trying to do a partial
raid1, where they aren't really fully raid1 mirror copies, but you are
using the bitmap to signal a sort of mask that you want copied and you
want other parts untouched.  What you would be doing something like this
for I don't know, but good luck getting it to work with the md raid1
code.  It simply isn't intended to be used in that way, and even if you
manage to get it to work, I suspect it would be *VERY* fragile.

> We've seen this problem occur in practice, meaning either filesystems
> are not using 4k (or multiple thereof) chunks, or for some reason,
> writes are not 4k-aligned.  Does anyone with more knowledge of
> filesystems know about this?  Perhaps we can force block size and
> alignment of filesystems to make this work.
>

-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband