From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: applications hang on a btrfs spanning two partitions
Date: Thu, 17 Jan 2019 11:15:49 +0000 (UTC)
Message-ID: <pan$b1351$9e1f5c6d$ffd38bb8$2fa1b304@cox.net>
In-Reply-To: 2671305.1QxYQ0Ocz6@thetick
Marc Joliet posted on Tue, 15 Jan 2019 23:40:18 +0100 as excerpted:
> Am Dienstag, 15. Januar 2019, 09:33:40 CET schrieb Duncan:
>> Marc Joliet posted on Mon, 14 Jan 2019 12:35:05 +0100 as excerpted:
>> > Am Montag, 14. Januar 2019, 06:49:58 CET schrieb Duncan:
>> >
>> >> ... noatime ...
>> >
>> > The one reason I decided to remove noatime from my systems' mount
>> > options is because I use systemd-tmpfiles to clean up cache
>> > directories, for which it is necessary to leave atime intact
>> > (since caches are often Write Once Read Many).
>>
>> Thanks for the reply. I hadn't really thought of that use, but it
>> makes sense...
I really enjoy these "tips" subthreads. As I said I hadn't really
thought of that use, and seeing and understanding other people's
solutions helps when I later find reason to review/change my own. =:^)
One example is an ssd brand reliability discussion from a couple years
ago. I had the main system on ssds then and wasn't planning on an
immediate upgrade, but later on, I got tired of the media partition and a
main system backup being on slow spinning rust, and dug out that ssd
discussion to help me decide what to buy. (Samsung 1 TB evo 850s, FWIW.)
> Specifically, I mean ~/.cache/ (plus a separate entry for
> ~/.cache/thumbnails/, since I want thumbnails to live longer):
Here, ~/.cache -> tmp/cache/ and ~/tmp -> /tmp/tmp-$USER/, plus
XDG_CACHE_HOME=$HOME/tmp/cache/, with /tmp being tmpfs.
So as I said, user cache is on tmpfs.
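For anyone wanting to try that layout, here's a minimal Python sketch that builds the same pair of symlinks inside a scratch directory rather than a real home. The function name, the "demo" user and the scratch paths are made up for illustration; on a real system the tmpfs root would be /tmp, and XDG_CACHE_HOME would be set separately in the environment.

```python
import os
import tempfile
from pathlib import Path


def build_layout(home: Path, tmpfs_root: Path, user: str) -> Path:
    """Create ~/tmp -> <tmpfs_root>/tmp-<user> and ~/.cache -> tmp/cache.

    Returns the path of the ~/.cache symlink.
    """
    user_tmp = tmpfs_root / f"tmp-{user}"
    (user_tmp / "cache").mkdir(parents=True)
    # Absolute link: ~/tmp points into the (stand-in) tmpfs.
    (home / "tmp").symlink_to(user_tmp, target_is_directory=True)
    # Relative link: ~/.cache goes through ~/tmp, so it follows wherever
    # ~/tmp points.
    (home / ".cache").symlink_to(Path("tmp") / "cache", target_is_directory=True)
    return home / ".cache"


with tempfile.TemporaryDirectory() as scratch:
    home = Path(scratch) / "home"
    tmpfs_root = Path(scratch) / "tmp"  # stand-in for the real tmpfs /tmp
    home.mkdir()
    tmpfs_root.mkdir()

    cache = build_layout(home, tmpfs_root, "demo")
    link_target = os.readlink(cache)
    resolved = cache.resolve()
    expected = (tmpfs_root / "tmp-demo" / "cache").resolve()

print(link_target)
print(resolved == expected)
```

Since ~/.cache is relative and goes via ~/tmp, repointing ~/tmp moves the cache along with it.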
Thumbnails... I actually did an experiment with the .thumbnails backed up
elsewhere and empty, and found that with my ssds anyway, rethumbnailing
was close enough to having them cached that it didn't really matter to my
visual browsing experience. So not only do I not mind thumbnails being
on tmpfs, I actually have gwenview, my primary image browser, set to
delete its thumbnails dir on close.
> I haven't bothered configuring /var/cache/, other than making it a
> subvolume so it's not a part of my snapshots (overriding the systemd
> default of creating it as a directory). It appears to me that it's
> managed just fine by pre-existing tmpfiles.d snippets and by the
> applications that use it cleaning up after themselves (except for
> portage, see below).
Here, /var/cache/ is on /, which remains mounted read-only by default.
The only things using it are package-updates related, and I obviously
have to mount / rw for package updates, so it works fine. (My sync
script mounts the dedicated packages filesystem containing the repos,
ccache, distdir, and binpkgs, and remounts / rw. It's the first thing
I run when doing an update, so I don't even have to worry about doing
the mounts manually.)
>> FWIW systemd here too, but I suppose it depends on what's being cached
>> and particularly on the expense of recreation of cached data. I
>> actually have many of my caches (user/browser caches, etc) on tmpfs and
>> reboot several times a week, so much of the cached data is only
>> trivially cached as it's trivial to recreate/redownload.
>
> While that sort of tmpfs hackery is definitely cool, my system is,
> despite its age, fast enough for me that I don't want to bother with
> that (plus I like my 8 GB of RAM to be used just for applications and
> whatever Linux decides to cache in RAM). Also, modern SSDs live long
> enough that I'm not worried about wearing them out through my daily
> usage (which IIRC was a major reason for you to do things that way).
16 gigs RAM here, and except for building chromium (in tmpfs), I seldom
fill it even with cache -- most of the time several gigs remain entirely
empty. With 8 gig I'd obviously have to worry a bit more about what I
put in tmpfs, but given that I have the RAM space, I might as well use it.
When I set up this system I was upgrading from a 4-core (original
2-socket dual-core 3-digit Opterons, purchased in 2003 and run until the
caps started dying in 2011) to a 6-core fx-series. Based on my
experience with the quad-core, I figured 12 gig RAM for the 6-core.
But with pairs of RAM sticks for dual-channel, powers of two worked
better, so it was 8 gig or 16 gig. I had worked with 8 gig on the
quad-core, so I knew that would be OK, but 12 gig would mean less cache
dumping, so 16 gig it was.
And my estimate was right on. Since 2011, I've typically run up to ~12
gigs RAM used including cache, leaving ~4 gigs of the 16 entirely unused
most of the time, tho I do use the full 16 gig sometimes when doing
updates, since I have PORTAGE_TMPDIR set to tmpfs.
Of course since my purchase in 2011 I've upgraded to SSDs and RAM-based
storage cache isn't as important as it was back on spinning rust, so for
my routine usage 8 gig RAM with ssds would be just fine, today.
But building chromium on tmpfs is the exception.
Until recently I was running firefox, but a few months ago I switched to
chromium, for various reasons. Among them: upstream firefox now
requires pulse-audio, so I can't just run upstream firefox binaries.
Gentoo's firefox updates are unfortunately sometimes uncomfortably late
for a security-minded user aware that their primary browser is the
single most security-exposed application they run. And there were often
build or run problems after gentoo /did/ have a firefox build, making
reliably running a secure-as-possible firefox even *more* of a problem.
And chromium is over a half-gig of compressed sources that expands to
several gigs of build dir. Put that in tmpfs along with the memory
requirements of a multi-threaded build, with USE=jumbo-build and a couple
gigs of other stuff (an X/kde-plasma session, building in a konsole
window, often with chromium and minitube running) in memory too, and...
That 16 gig RAM isn't enough for that sort of chromium build. =:^(
So for the first time on the ssds, I reconfigured and rebuilt the kernel
with swap support, and added a pair of 16-gig swap partitions on the
ssds, so it's now 16 gig RAM and 32 gig swap.
With the parallel-jobs cut down slightly via a package.env setting to
better control memory usage, to -j7 from the normal -j8, and with
PORTAGE_TMPDIR still pointed at tmpfs, I run about 16 gig into swap
building chromium now. So for that I could now use 32 gig of RAM.
Meanwhile, it's 2019, and this 2011 system's starting to feel a bit dated
in other ways too, now, and I'm already at the ~8 years my last system
lasted, so I'm thinking about upgrading. I've upgraded to SSDs and to
big-screen monitors (a 65-inch/165cm 4K TV as primary) on this system,
but I've not done the CPU or memory upgrades on it that I did on the last
one, and having to enable swap to build chromium just seems so last
century.
So I'm thinking about upgrading later this year, probably to a
zen-2-based system with hardware spectre mitigations.
And I want at least 32-gig RAM when I do, depending on the number of
cores/threads. I'm figuring 4-gig/thread now, 4-core/8-thread minimum,
which would be the 32-gig. But 8-core/16-thread, 64-gig RAM, would be
nice.
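That rule of thumb is simple enough to write down. A quick Python sketch of the arithmetic (the function name and the list of candidate configs are mine, just for illustration):

```python
GIB_PER_THREAD = 4  # the 4-gig/thread rule of thumb from above


def ram_for_threads(threads: int, gib_per_thread: int = GIB_PER_THREAD) -> int:
    """Return the RAM target in gigs for a given hardware thread count."""
    return threads * gib_per_thread


# The two configurations mentioned above: (cores, threads)
for cores, threads in [(4, 8), (8, 16)]:
    print(f"{cores}-core/{threads}-thread: {ram_for_threads(threads)} gig")
```

That gives 32 gig for the 4-core/8-thread minimum and 64 gig for 8-core/16-thread, matching the figures above.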
But I'm moving this spring and am busy with that first. When that's done
and I'm settled in the new place I'll see what my financials look like
and go from there.
>> OTOH, running gentoo, my ccache and binpkg cache are seriously
>> CPU-cycle expensive to recreate, so you can bet those are _not_ tmpfs,
>> but OTTH, they're not managed by systemd-tmpfiles either. (Ccache
>> manages its own cache and together with the source-tarballs cache and
>> git-managed repo trees along with binpkgs, I have a dedicated packages
>> btrfs containing all of them, so I eclean binpkgs and distfiles
>> whenever the 24-gigs space (48-gig total, 24-gig each on pair-device
>> btrfs raid1) gets too close to full, then btrfs balance with -dusage=
>> to reclaim partial chunks to unallocated.)
>
> For distfiles I just have a weekly systemd timer that runs "eclean-dist
> -d" (I stopped using the buildpkg feature, so no eclean-pkg), and have
> moved both $DISTDIR and $PKGDIR to their future default locations in
> /var/cache/. (They used to reside on my desktop's HDD RAID1 as distinct
> subvolumes, but I recently bought a larger SSD, so I set up the above
> and got rid of two fstab entries.)
I like short paths.
So my packages filesystem mountpoint is /p, with /p/gentoo and /p/kde
being my main repos, DISTDIR=/p/src, PKGDIR=/p/pkw (w=workstation; back
when I had my 32-bit netbook and 32-bit chroot build image on the
workstation too, I had its packages in pkn, IIRC), /p/linux for the linux
git tree, /p/kpatch for local kernel patches, /p/cc for ccache, and
/p/initramfs for my (dracut-generated) initramfs.
And FWIW, /h is the home mountpoint, /lg the log mountpoint (with
/var/log -> /lg), /l the system-local dir (with /var/local -> /l) on /,
/mnt for auxiliary mounts, /bk the root-backup mountpoint, etc.
You stopped using binpkgs? I can't imagine doing that. Not only does it
make the occasional downgrade easier, older binpkgs come in handy for
checking whether a file location moved in recent versions, looking up
default configs and seeing how they've changed, checking the dates on
them to know when I was running version X or whether I upgraded package Y
before or after package Z, etc.
Of course I could use btrfs snapshotting for most of that and could get
the other info in other ways, but I had this setup working and tested
long before btrfs, and it seems less risky and easier to quantify and
manage than btrfs snapshotting. But surely that's because I /did/ have
it up, running and tested, before btrfs, so it's old hat to me now. If I
were starting with it now, I imagine I might well find the btrfs
snapshotting thing simpler to manage, and covering a broader use-case too.
>> tho I'd still keep the atime effects in mind and switch to noatime if
>> you end up in a recovery situation that requires writable mounting.
>> (Losing a device in btrfs raid1 and mounting writable in order to
>> replace it and rebalance comes to mind as one example of a
>> writable-mount recovery scenario where noatime until full
>> replace/rebalance/scrub completion would prevent unnecessary writes
>> until the raid1 is safely complete and scrub-verified again.)
>
> That all makes sense. I was going to argue that I can't imagine
> randomly reading files in a recovery situation, but eventually realized
> that "ls" would be enough to trigger a directory atime update. So yeah,
> one should keep the above in mind.
Not just ls, etc, either. Consider manpage access, etc, as well. Plus
of course any executable binaries you run, the libs they load,
scripts... If atime's on, all those otherwise read-only accesses will
trigger atime-update writes, and with btrfs, updating that bit of
metadata copies and writes the entire updated metadata block, triggering
an update and thus a COW of the metadata block tracking the one just
written... all the way up the metadata tree. In a recovery situation
where every write is an additional risk, that's a lot of additional risk,
all for not-so-necessary atime updates!
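To make the "all the way up the tree" part concrete, here's a toy Python model of copy-on-write path copying. It is only a sketch: the Block class, the block names and the three-level tree are invented for illustration, it doesn't reflect btrfs's real on-disk structures, and it ignores that updates landing in the same transaction share the rewrite.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """A toy metadata block that points at its child blocks."""
    name: str
    children: list[Block] = field(default_factory=list)


def path_to(root: Block, target_name: str) -> list[Block]:
    """Return the blocks from root down to the named block, or [] if absent."""
    if root.name == target_name:
        return [root]
    for child in root.children:
        sub = path_to(child, target_name)
        if sub:
            return [root] + sub
    return []


def blocks_rewritten(root: Block, leaf_name: str) -> list[str]:
    """Names of the blocks a COW update of one leaf has to write, leaf first.

    The changed leaf is written to a new location, so its parent needs a new
    pointer and is itself copied, and so on up to the root.
    """
    return [block.name for block in reversed(path_to(root, leaf_name))]


tree = Block("root", [
    Block("node-a", [Block("leaf-1"), Block("leaf-2")]),
    Block("node-b", [Block("leaf-3")]),
])

# One atime change in leaf-2 rewrites every block on its path to the root.
print(blocks_rewritten(tree, "leaf-2"))  # ['leaf-2', 'node-a', 'root']
```

In this model a single timestamp change costs one block write per tree level, which is the write amplification to avoid during recovery.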
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman