tools.linux.kernel.org archive mirror
* Does anyone have any tips for efficient lei and dovecot integration?
@ 2025-05-30 18:46 James Bottomley
  2025-06-06 18:02 ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: James Bottomley @ 2025-05-30 18:46 UTC
  To: tools, users

This will only interest the diminishing number of you actually running
your own email server (or possibly if you have a shell login to the
actual dovecot system).

I was looking at replacing mail subscriptions on my main email server
(which runs postfix and dovecot using Maildir fs layout) with lei for
at least some of the lists (meaning I'd be trying to run a mixed
subscription and lei setup on a per list basis).  The first thing I had
to do was fix a lei bug which should bite anyone with a git directory
in their $HOME and a '.' in path:

https://public-inbox.org/meta/a67340a12b17379ad947f8ac96cd5c0524831741.camel@HansenPartnership.com/

But once that was done, it seems to work.

The next problem I have is that I don't really want to use double the
storage, so I need to use my dovecot maildir backend as the only
storage.  This leads to trying to use the --no-import-remote on the
queries, but it doesn't seem to work: I still get allocations in the
store greater in size than the contents of the existing mailbox ... so
if anyone has a fix for that, I'm all ears?

The command I'm using (after unsubscribing from a list) is:

lei q --augment --no-import-remote --dedupe=mid -o <maildir list location> -I <lore list location> "d:<currentdate>.."

Don't forget --augment (I did once and it destroyed my entire mailbox).
Then I have to get dovecot to index it:

doveadm force-resync -u jejb <maildir list location>

So far so good.  lei and dovecot use different unique file ids in the
Maildir, but so far it seems to recognize already received email and
not duplicate.  I came up with a cron script to automate the pull every
5 minutes or so:

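#!/bin/bash
# Run every lei saved search, then resync each Maildir that got new mail.
set -o pipefail
# awk keeps the output path ($5) of each "# N written to ..." line where N != 0
f=$(lei up --all 2>&1 | awk '/^# [0-9]* written to/{if ($2 != 0) print $5}')
if [ $? != 0 ]; then
	echo "lei up failed"
	# try to get the failure message by repeating the action
	lei up --all
	exit 1
fi
if [ -n "$f" ]; then
	for l in $f; do
		# expr patterns are already anchored at the start, so no leading ^
		m=$(expr "$l" : '/home/jejb/Maildir/\(.*\)/$')
		doveadm force-resync -u jejb "$m"
	done
fi
exit 0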


Which means I can simply add new lists with 'lei q' and the saved
search will get executed as part of the cron job.
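
The two text-processing steps of the cron script can be pulled out into
small functions and tried without lei or dovecot installed.  This is
only a sketch: the function names and the sample input line are
hypothetical, the awk filter is the one from the script above, and the
path mapping uses shell parameter expansion in place of expr.

```shell
#!/bin/sh
# Sketch only: helper names and the sample line format are assumptions.

# Read `lei up --all` output on stdin and print the output path of every
# search that wrote at least one message (same awk filter as the script).
changed_maildirs() {
	awk '/^# [0-9]* written to/{if ($2 != 0) print $5}'
}

# Map a Maildir path to a mailbox name by stripping the Maildir root
# (second argument, defaulting to the path used in the script) and the
# trailing slash.  The body runs in a subshell so variables stay local.
maildir_to_mailbox() (
	root=${2:-/home/jejb/Maildir}
	path=${1%/}
	printf '%s\n' "${path#"$root"/}"
)
```

For example, feeding changed_maildirs a line such as
"# 3 written to /home/jejb/Maildir/linux-scsi/ (3 matches)" prints the
path, and maildir_to_mailbox turns that path into linux-scsi.  Unlike
the expr version, a path outside the root is returned unchanged rather
than as an empty string, so a caller may want to check for that.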

So far I've been converting over my lists slowly (especially as it
takes this lei shard command ages to run over a large existing mbox)
but it seems to be working.  Actually the most annoying issue now is
the confirmation round trip to unsubscribe from every list ...

Regards,

James



* Re: Does anyone have any tips for efficient lei and dovecot integration?
  2025-05-30 18:46 Does anyone have any tips for efficient lei and dovecot integration? James Bottomley
@ 2025-06-06 18:02 ` Eric Wong
  2025-06-06 20:24   ` James Bottomley
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Wong @ 2025-06-06 18:02 UTC
  To: James Bottomley; +Cc: tools, users, meta

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> This will only interest the diminishing number of you actually running
> your own email server (or possibly if you have a shell login to the
> actual dovecot system).

+Cc: meta@public-inbox.org

Fwiw, lei can output directly to IMAP and not just Maildir.  The
main problem is Mail::IMAPClient is awful on high-latency links
due to a lack of pipelining right now.

> The next problem I have is that I don't really want to use double the
> storage, so I need to use my dovecot maildir backend as the only
> storage.  This leads to trying to use the --no-import-remote on the
> queries, but it doesn't seem to work: I still get allocations in the
> store greater in size than the contents of the existing mailbox ... so
> if anyone has a fix for that, I'm all ears?

I think that's answered below...

> The command I'm using (after unsubscribing from a list) is:
> 
> lei q --augment --no-import-remote --dedupe=mid -o <maildir list location> -I <lore list location> "d:<currentdate>.."
 
> Don't forget --augment (I did once and it destroyed my entire mailbox).
> Then I have to get dovecot to index it:

Oops, yeah, --augment being off by default is a mairix behavior
I imitated :x.  Unless you used --no-import-before, lei imported
your mailbox into ~/.local/share/lei/store before writing to
the -o destination; so you could dump everything it imported
back out: lei q --augment -o OUTPUT z:0..
(z:0.. means message size >= 0)

> doveadm force-resync -u jejb <maildir list location>

AFAIK, dovecot uses inotify||kevent to detect changes to its
Maildirs.  I never needed to run `doveadm force-resync' after
writing with 3rd-party tools (e.g. lei, mutt) to Maildirs used
by dovecot.

> So far I've been converting over my lists slowly (especially as it
> takes this lei shard command ages to run over a large existing mbox)
> but it seems to be working.

Yeah, performance is pretty awful, but I have some long-term goals
to work on there (a separate discussion later/elsewhere).

> Actually the most annoying issue now is
> the confirmation round trip to unsubscribe from every list ...

Probably easier to set up bounces on your MTA and let the
mailing list software auto-unsubscribe you :>


* Re: Does anyone have any tips for efficient lei and dovecot integration?
  2025-06-06 18:02 ` Eric Wong
@ 2025-06-06 20:24   ` James Bottomley
  2025-06-07  0:04     ` Eric Wong
  0 siblings, 1 reply; 4+ messages in thread
From: James Bottomley @ 2025-06-06 20:24 UTC
  To: Eric Wong; +Cc: tools, users, meta

On Fri, 2025-06-06 at 18:02 +0000, Eric Wong wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > This will only interest the diminishing number of you actually
> > running your own email server (or possibly if you have a shell
> > login to the actual dovecot system).
> 
> +Cc: meta@public-inbox.org
> 
> Fwiw, lei can output directly to IMAP and not just Maildir.  The
> main problem is Mail::IMAPClient is awful on high-latency links
> due to a lack of pipelining right now.

Yes, I know.  Konstantin's example was of exactly that:

https://people.kernel.org/monsieuricon/lore-lei-part-2-now-with-imap

But I wanted the unauthenticated version to run on the imap server, not
the local client because it's easier and more secure.

> > The next problem I have is that I don't really want to use double
> > the storage, so I need to use my dovecot maildir backend as the
> > only storage.  This leads to trying to use the --no-import-remote
> > on the queries, but it doesn't seem to work: I still get
> > allocations in the store greater in size than the contents of the
> > existing mailbox ... so if anyone has a fix for that, I'm all ears?
> 
> I think that's answered below...
> 
> > The command I'm using (after unsubscribing from a list) is:
> > 
> > lei q --augment --no-import-remote --dedupe=mid -o <maildir list
> > location> -I <lore list location> "d:<currentdate>.."
>  
> > Don't forget --augment (I did once and it destroyed my entire
> > mailbox). Then I have to get dovecot to index it:
> 
> Oops, yeah, --augment being off by default is a mairix behavior
> I imitated :x.  Unless you used --no-import-before, lei imported
> your mailbox into ~/.local/share/lei/store before writing to
> the -o destination; so you could dump everything it imported
> back out: lei q --augment -o OUTPUT z:0..
> (z:0.. means message size >= 0)

Thanks.  --no-import-before seems to have stopped store growing ... is
there any way I can shrink it?
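
For reference, the query from the start of the thread with
--no-import-before added could be assembled as in the sketch below.  It
only prints the command rather than running it.  The function name is
hypothetical and the three arguments stand in for the placeholders used
earlier (<maildir list location>, <lore list location>, <currentdate>);
every flag is one already mentioned in this thread.

```shell
#!/bin/sh
# Sketch: print (do not run) the initial query for one list.
# lei_q_cmd is a hypothetical helper; all arguments are caller-supplied
# placeholders, not values taken from the thread.
lei_q_cmd() {
	maildir=$1 lore=$2 since=$3
	printf 'lei q --augment --no-import-before --no-import-remote --dedupe=mid -o %s -I %s d:%s..\n' \
		"$maildir" "$lore" "$since"
}
```

Printing first makes it easy to confirm --augment is present before
anything touches an existing mailbox.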

> > doveadm force-resync -u jejb <maildir list location>
> 
> AFAIK, dovecot uses inotify||kevent to detect changes to its
> Maildirs.  I never needed to run `doveadm force-resync' after
> writing with 3rd-party tools (e.g. lei, mutt) to Maildirs used
> by dovecot.

Well, it can, yes, but my setup is a fraction of a terabyte of imap
running all my subscriptions, which is pretty huge and fanotify can be
a bit unscalable in something that big, so I have it turned off and I
simply used dovecot-lda for local delivery via procmail to make sure
everything got noticed immediately.

> > So far I've been converting over my lists slowly (especially as it
> > takes this lei shard command ages to run over a large existing
> > mbox) but it seems to be working.
> 
> Yeah, performance is pretty awful, but I have some long-term goals
> to work on there (a separate discussion later/elsewhere).
> 
> > Actually the most annoying issue now is
> > the confirmation round trip to unsubscribe from every list ...
> 
> Probably easier to set up bounces on your MTA and let the
> mailing list software auto-unsubscribe you :>

Heh, I'll let Konstantin yell at you for that one ...

Regards,

James



* Re: Does anyone have any tips for efficient lei and dovecot integration?
  2025-06-06 20:24   ` James Bottomley
@ 2025-06-07  0:04     ` Eric Wong
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Wong @ 2025-06-07  0:04 UTC
  To: James Bottomley; +Cc: tools, users, meta

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> Thanks.  --no-import-before seems to have stopped store growing ... is
> there any way I can shrink it?

Not easily in a fine-grained way.  I *think*
`rm -rf ~/.local/share/lei/store' (but keeping
~/.local/share/lei/saved-searches) should be OK but I haven't
tested.  Theoretically we could add support for purging
individual messages from it (like public-inbox-purge).

You can also use normal git-gc || SQLite VACUUM ||
xapian-compact to reduce space losslessly.  We can probably add
`lei compact' to wrap xapian-compact similar to what
public-inbox-compact does; and probably add SQLite VACUUM
support to both of those xapian-compact wrappers.
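
A dry-run sketch of the git-gc and SQLite VACUUM steps follows.  It
runs nothing and only prints candidate commands for review.  It assumes
(unverified here) that git repositories under the store are directories
named *.git and that SQLite databases are files named *.sqlite3; the
function name is hypothetical.  xapian-compact is left out because it
needs a source and a destination directory and the shard layout is not
described in this thread.

```shell
#!/bin/sh
# Dry-run sketch: prints maintenance commands, executes none of them.
# Assumption: git repos are directories named *.git and SQLite
# databases are files named *.sqlite3 somewhere under the store.
print_store_maintenance() {
	store=${1:-$HOME/.local/share/lei/store}
	# -prune stops find from descending into each repository it reports
	find "$store" -type d -name '*.git' -prune |
		while IFS= read -r g; do
			printf 'git --git-dir=%s gc\n' "$g"
		done
	find "$store" -type f -name '*.sqlite3' |
		while IFS= read -r db; do
			printf "sqlite3 %s 'VACUUM;'\n" "$db"
		done
}
```

Calling print_store_maintenance with no argument inspects the default
store path named above.  It is probably safest to run any of the printed
commands only while no lei process is using the store.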

> On Fri, 2025-06-06 at 18:02 +0000, Eric Wong wrote:
> > AFAIK, dovecot uses inotify||kevent to detect changes to its
> > Maildirs.  I never needed to run `doveadm force-resync' after
> > writing with 3rd-party tools (e.g. lei, mutt) to Maildirs used
> > by dovecot.
> 
> Well, it can, yes, but my setup is a fraction of a terabyte of imap
> running all my subscriptions, which is pretty huge and fanotify can be
> a bit unscalable in something that big, so I have it turned off and I
> simply used dovecot-lda for local delivery via procmail to make sure
> everything got noticed immediately.

Ouch, yeah.

I actually have an idea to flip things by exposing lei/store
data as Maildirs via FUSE3 so there'd be no need for Maildirs
on regular FSes at all.

I'm not sure if the performance of the mainline perl
implementation + FUSE3 shim will be fast enough to handle
lots of small files even w/ readdirplus.  Another option is to
bypass Perl and use FUSE3+SQLite+git directly, but there's still
syscall amplification from all those layers.


sidenote: I completely forgot fanotify exists since the
early versions required CAP_SYS_ADMIN :x
