From: Vyacheslav Dubeyko <slava@dubeyko.com>
To: htl10@users.sourceforge.net
Cc: linux-fsdevel@vger.kernel.org,
Till Kamppeter <till.kamppeter@gmail.com>,
Naohiro Aota <naota@elisp.net>
Subject: Re: hfsplus BUG(), kmap and journalling.
Date: Fri, 19 Oct 2012 16:45:12 +0400
Message-ID: <1350650712.2028.50.camel@slavad-ubuntu>
In-Reply-To: <1350579334.54535.YahooMailClassic@web29404.mail.ird.yahoo.com>

Hi Hin-Tak,

On Thu, 2012-10-18 at 17:55 +0100, Hin-Tak Leung wrote:
> Hi,
>
> While looking at a few of the older BUG() traces I consistently get
> when running du on a somewhat large directory with lots of small
> files and small directories, I noticed that they tend to have two
> sleeping "? hfs_bnode_read()" entries towards the top. As it is a very
> small and simple function which just reads a b-tree node record,
> sometimes only a few bytes between a kmap/kunmap, I suspect the cause
> might just be the number of simultaneous kmap() calls in flight. So I
> put a mutex around it to make sure only one copy of hfs_bnode_read()
> runs at a time.

Yes, you have touched on a very important problem. The hfsplus driver
needs to be reworked to stop using kmap()/kunmap(), because kmap() is
slow, can theoretically deadlock, and is deprecated. The alternative is
kmap_atomic()/kunmap_atomic(), but that requires looking closely at
every use of kmap() in the hfsplus driver.

The mutex is useless. It simply hides the issue.

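To illustrate the kind of rework meant here, below is a minimal
user-space sketch, not driver code. It assumes a node that spans
several pages and copies a record out of it while holding at most one
mapping at a time. The names model_kmap_atomic(), model_kunmap_atomic(),
model_bnode_read(), struct page as defined here, and MODEL_PAGE_SIZE are
all hypothetical stand-ins; in the kernel the mapping calls would be
kmap_atomic() and kunmap_atomic(), and no sleeping would be allowed
between them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed page size for this model */

/* Hypothetical stand-in for the kernel's struct page: a plain buffer. */
struct page { unsigned char data[MODEL_PAGE_SIZE]; };

/* Counters used only to show how many mappings are held at once. */
static int mapped_now;
static int max_mapped;

/* User-space stub standing in for kmap_atomic(). */
static void *model_kmap_atomic(struct page *p)
{
    mapped_now++;
    if (mapped_now > max_mapped)
        max_mapped = mapped_now;
    return p->data;
}

/* User-space stub standing in for kunmap_atomic(). */
static void model_kunmap_atomic(void *addr)
{
    (void)addr;
    mapped_now--;
}

/*
 * Copy len bytes starting at byte offset off of a multi-page node.
 * Each page is mapped, copied from, and unmapped before the next
 * page is touched, so only one mapping is ever held.
 */
static void model_bnode_read(struct page **pages, size_t npages,
                             void *buf, size_t off, size_t len)
{
    unsigned char *dst = buf;
    size_t idx = off / MODEL_PAGE_SIZE;
    size_t in_page = off % MODEL_PAGE_SIZE;

    while (len > 0) {
        size_t chunk = MODEL_PAGE_SIZE - in_page;
        unsigned char *src;

        if (chunk > len)
            chunk = len;
        assert(idx < npages);          /* read must stay inside the node */
        src = model_kmap_atomic(pages[idx]);
        memcpy(dst, src + in_page, chunk);
        model_kunmap_atomic(src);      /* unmap before moving on */

        dst += chunk;
        len -= chunk;
        idx++;
        in_page = 0;
    }
}
```

Reading 8 bytes that straddle a page boundary goes through two
map/copy/unmap rounds, and max_mapped stays at 1. Whether a given call
site in hfsplus can be converted this way depends on whether it sleeps
or nests mappings, which is why every use has to be checked separately.
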
> This seems to make it much harder to get a BUG() - I needed to run du
> a few times over and over to get it again. Of course it might just be
> a mutex slowing the driver down to make it less likely to get
> confused, but as I read that the number of simultaneous kmap() in the
> kernel is limited, I think I might be on to something.
> Also this shifts the problem onto multiple copies of "?
> hfsplus_bmap()". (which also kmap()/kunmap()'s, but much more
> complicated).

That is exactly the point: the mutex only hides the issue.

> I thought of doing hfsplus_kmap()/etc. (which seems to have existed a
> long time ago but was removed!), but this might cause deadlocks since
> some of the hfsplus code is kmapping/kunmapping all the time, and
> recursively. So a better way might be just to make sure that some of
> the routines only run one instance at a time, i.e. multiple mutexes.
> This is both ugly and sounds like voodoo though. Also I am not sure
> why the existing mutexes, which protect some of the internal
> structures, don't protect against too many kmaps. (Maybe they
> protect "writes", but not against too many simultaneous reads.)
> So does anybody have an idea how many kmaps are allowed and how to
> tell that I am close to my machine's limit?

As far as I can tell, hfsplus_kmap() doesn't do anything useful. What
is really needed is to rework the kmap()/kunmap() usage rather than to
add mutexes. Could you try to fix this issue? :-)

> Also a side note on the Netgear journalling code: I see that it
> journals the volume header and some of the special files (the catalog,
> allocation bitmap, etc), but (1) it has some code to journal the
> attribute file, which was actually non-functional, since without
> Vyacheslav's recent patches the linux kernel doesn't even read/write
> that correctly, let alone do *journalled* read/write correctly, and
> (2) there is a part which tries to do data-page journalling, but it
> seems to be wrong - or at least, not quite working. (This I found
> while I was looking at some curious warning messages and how they
> come about.) Luckily that code just bails out when it gets confused -
> i.e. it does non-journalled writes, rather than writing a wrong
> journal to disk. So it doesn't harm data under routine normal use
> (i.e. mount/unmount cleanly).
> But that got me worrying a bit about inter-operability: it is probably
> unsafe to use Linux to replay the journal written by Mac OS X, and
> vice versa. i.e. if you have a dual boot machine, or a portable disk
> that you use between two OSes, and it disconnects/unplugs/crashes
> under one OS, it is better to plug it right back and let the same OS
> replay the journal, then unmount cleanly before using it under the
> other OS.

The journal should be replayed on every mount whenever it contains
valid transactions. An HFS+ volume shouldn't be mounted without
replaying the journal; otherwise you can end up with a corrupted
partition. Just imagine: you mount an HFS+ partition with a non-empty
journal, without replaying it, and then add some data to the volume.
That means you modify metadata. If you then mount that HFS+ volume
under Mac OS X, the journal will be replayed over the modified
metadata, and the metadata will be corrupted.

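The rule above can be sketched as a small decision function. This is
only an illustrative model under stated assumptions, not the hfsplus
mount path: struct model_volume, model_replay_journal() and
model_mount() are hypothetical names, and the journal is reduced to a
count of valid pending transactions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a volume; not the real structures. */
struct model_volume {
    bool journaled;            /* volume has a journal */
    int pending_transactions;  /* valid transactions still in the journal */
};

enum mount_result { MOUNT_OK, MOUNT_REFUSED };

/* Stand-in for replay: apply every pending transaction, leaving none. */
static void model_replay_journal(struct model_volume *v)
{
    v->pending_transactions = 0;
}

/*
 * Mount only once the journal is empty. If valid transactions are
 * present and this implementation cannot replay them, refuse the
 * mount instead of modifying metadata behind the journal's back.
 */
static enum mount_result model_mount(struct model_volume *v, bool can_replay)
{
    if (v->journaled && v->pending_transactions > 0) {
        if (!can_replay)
            return MOUNT_REFUSED;
        model_replay_journal(v);
    }
    return MOUNT_OK;
}
```

In this model a volume with pending transactions is either replayed
first or not mounted at all; the case described above, mounting and
writing with a non-empty journal, cannot happen.
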
With the best regards,
Vyacheslav Dubeyko.
> I'll be interested in hearing any tips on finding out kmap's limit at
> run time, if anybody has any idea...
>
> Hin-Tak
Thread overview: 7+ messages
2012-10-18 16:55 hfsplus BUG(), kmap and journalling Hin-Tak Leung
2012-10-19 12:45 ` Vyacheslav Dubeyko [this message]
2012-10-20 6:24 ` Hin-Tak Leung
2012-10-22 9:02 ` Vyacheslav Dubeyko
2012-10-30 8:24 ` Hin-Tak Leung
2012-10-30 11:45 ` Vyacheslav Dubeyko
2012-11-02 5:43 ` hfsplus foldercount (Re: hfsplus BUG(), kmap and journalling.) Hin-Tak Leung