From: Ben Schmidt <mail_ben_schmidt@yahoo.com.au>
To: mlmmj@mlmmj.org
Subject: Re: [mlmmj] List archive howto?
Date: Wed, 06 Oct 2010 08:03:31 +0000
Message-ID: <4CAC2D53.9000704@yahoo.com.au>
In-Reply-To: <4C6B3882.6070904@gmail.com>
I'm afraid you've completely lost me there, Robin, possibly because I
know basically nothing about mhonarc. The concepts sound great, though.
One question: is there anything specific and useful in your reply
that I should be working into the documentation about archive
solutions? Or is this too complex a solution, or not detailed enough,
and best left out and revealed only if someone finds the other
solutions aren't adequate for them?
Cheers,
Ben.
On 4/10/10 8:37 AM, Robin H. Johnson wrote:
> On Tue, Aug 17, 2010 at 07:33:54PM -0600, Morgan Gangwere wrote:
>> Howdy y'all
>>
>> I'm kinda curious as to what is used for the archiving of the mailing list?
>>
>> I've got my own list set up and I'm poring over the documentation
>> looking for what actually generates the html. Any help?
> For Gentoo, we use mhonarc, with some minor modifications.
>
> One of the key modifications was ultimately caused by the message
> numbering issue in threads.
>
> Specifically, what URL should a message have if it arrives out of
> sequence due to a delay? Some archive tools renumber the URLs, others keep an
> index based on some part of the mail. Both of these caused some degree
> of problems for us, and we had to come up with an alternative.
>
> Specifically, when the message is being received, we add two headers:
> X-Archives-Salt: ${UUID}
> then we hash the entire message file, including the new header, and
> generate:
> X-Archives-Hash: ${SHA1}
>
> This gets used in the filename for the individual message files we
> write to disk:
> http://archives.gentoo.org/gentoo-scm/msg_c0f2f8f123f85bb8b664827b4a1dcb09.xml
> It's consistent regardless of how many times we have to rebuild a
> message archive.
>
> We've got nearly 588k messages in the archive, and now the next problem
> is starting to be the storage/access scalability of some of the larger
> lists. Also need to work on better parallelization for building archives
> for each list.
>
> Largest lists, by mails:
> 72808 gentoo-dev
> 97619 gentoo-user
> 283036 gentoo-commits
>
> Footnote:
> Yes, we generate XML, that's to integrate with our templating, but you
> could generate HTML even more easily.
>
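The salt-and-hash scheme Robin describes could be sketched roughly as follows. This is only an illustration in Python, not Gentoo's actual code: the function names `stamp_message` and `archive_filename` are hypothetical, the choice to prepend the headers at the top of the message is an assumption, and so is the way the hash is turned into a filename (the name in the quoted URL is shorter than a full SHA-1 hex digest, so the real derivation may well differ).

```python
import hashlib
import uuid
from typing import Optional, Tuple


def stamp_message(raw: bytes, salt: Optional[str] = None) -> Tuple[bytes, str]:
    """Add X-Archives-Salt, hash the salted message, then add X-Archives-Hash.

    Hypothetical helper: header placement (prepended) is an assumption.
    """
    # A fresh UUID per message keeps identical mails from colliding.
    salt = salt or str(uuid.uuid4())
    salted = b"X-Archives-Salt: " + salt.encode("ascii") + b"\n" + raw

    # Hash the entire message, including the new salt header.
    digest = hashlib.sha1(salted).hexdigest()

    stamped = b"X-Archives-Hash: " + digest.encode("ascii") + b"\n" + salted
    return stamped, digest


def archive_filename(digest: str) -> str:
    """Hypothetical naming rule: the stored hash decides the file name."""
    return "msg_" + digest + ".xml"


if __name__ == "__main__":
    message = b"Subject: test\n\nHello, list.\n"
    stamped, digest = stamp_message(message)
    print(stamped.decode("ascii"))
    print(archive_filename(digest))
```

Because the salt and hash are stored in the message itself at receive time, a rebuild only has to read the existing X-Archives-Hash header rather than recompute anything, which is what keeps the URL stable however often the archive is regenerated or in whatever order messages arrived.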
Thread overview: 9+ messages
2010-08-18 1:33 [mlmmj] List archive howto? Morgan Gangwere
2010-08-18 6:38 ` Andreas Schneider
2010-08-18 12:06 ` Wolf Bergenheim
2010-10-03 13:35 ` Ben Schmidt
2010-10-03 21:37 ` Robin H. Johnson
2010-10-06 8:03 ` Ben Schmidt [this message]
2010-10-06 19:57 ` Robin H. Johnson
2010-10-06 22:19 ` Ben Schmidt
2010-10-11 10:41 ` Florian Effenberger