Discussion:
[Dovecot] Recommended FS for Dovecot Maildir
Daniel Watts
2006-05-05 21:05:59 UTC
Permalink
Hi,

I've heard that for Dovecot/Maildir systems there are filesystems that
are optimised for the situation of many small files in one folder.

Could I possibly have some feedback on what the recommended filesystems
are? I've heard of ReiserFS but was wondering what other options there
are and how they compare.

If I get a good comprehensive response I'll build a wiki summary page
out of the data gathered.

Best wishes,
Daniel
Brent Clark
2006-05-05 21:13:46 UTC
Permalink
Post by Daniel Watts
Could I possibly have some feedback on what the recommended filesystems
are? I've heard of ReiserFS but was wondering what other options there
are and how they compare.
HTH

http://www.debian-administration.org/articles/388
http://fsbench.netnation.com/
http://linuxgazette.net/122/TWDT.html#piszcz

Kind Regards
Brent Clark
Marc Perkel
2006-05-05 23:44:58 UTC
Permalink
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because it
has infinite inodes, it is very fast on directories with large numbers
of files, and it does sub allocation so small files take less space. And
it's very fast. Maildir is the area where Reiser does best.
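
To illustrate the sub-allocation point: on a filesystem that rounds every file up to whole blocks, a small message wastes the unused part of its last block, and that is the space tail packing tries to reclaim. The following is a minimal sketch of that arithmetic only. The 4096-byte block size and the message sizes are assumptions chosen for illustration, not measurements from a real mail store.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes occupied when a file is rounded up to whole blocks."""
    blocks = (file_size + block_size - 1) // block_size
    return blocks * block_size


def slack_fraction(file_sizes, block_size: int = 4096) -> float:
    """Fraction of allocated space that holds no message data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(size, block_size) for size in file_sizes)
    return (allocated - used) / allocated if allocated else 0.0


if __name__ == "__main__":
    # Hypothetical message sizes in bytes (assumed, not measured).
    sizes = [1500, 3000, 6000, 12000]
    print(f"slack without sub-allocation: {slack_fraction(sizes):.1%}")
```

The smaller the typical message relative to the block size, the larger the wasted fraction becomes, which is why this matters more for maildir than for a few large files.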
Daniel Watts
2006-05-06 00:00:36 UTC
Permalink
Post by Marc Perkel
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability? I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's
concern is that it is less well tested for bugs and corruption.

Eg i see many people saying xfs is great but who wouldn't think of
having it put into production.
Les Mikesell
2006-05-06 02:15:19 UTC
Permalink
Post by Daniel Watts
Post by Marc Perkel
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability?
I've used it from the days when it was the only journaled fs around
and an fsck of the stock e2fs drives caused hours of downtime
after any crash and have not had any more problems with it than
any other fs type. The one thing you might have to be careful
about is that there have been different versions and the
tools need to match the filesystem version. But you see the
same thing with ext3 on Linux. You can't, for example create
ext3 filesystems on a current version, restore an old system
on them, and have it come up working even though both claim
to be ext3.
Post by Daniel Watts
I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's concern is that it
is less well tested for bugs and corruption.
I don't think I'd install the OS on it - just create new
filesystems afterwards and move /home and /var onto them.

--
Les Mikesell
***@gmail.com
Udo Rader
2006-05-06 04:39:21 UTC
Permalink
Post by Daniel Watts
Post by Marc Perkel
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability? I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's concern is that it
is less well tested for bugs and corruption.
three or four years ago we had a major reiserfs corruption due to faulty
memory modules (as I know now). The result was a completely unusable and
unrepairable partition that could eventually, after two weeks or so, be
repaired with the help of Hans Reiser himself. Even though I really
appreciated direct involvement of the main developer, I will avoid it by
any means.
Post by Daniel Watts
Eg i see many people saying xfs is great but who wouldn't think of
having it put into production.
XFS is one of the most mature filesystems around for UNIX systems. It
has been developed by SGI for its IRIX OS for a _very_ long time. If
POSIX compliance (ACLs!!), out of the box quota support, scalability (up
to 9 million terabytes max. capacity, IIRC), sophisticated
backup/restore facilities (including snapshots and more), online file
defragmentation and much more are of importance for you, then go for
XFS ... well, and you certainly guessed it, we are using it for almost
all of our servers :-)

Regards

Udo Rader
--
BestSolution.at EDV Systemhaus GmbH
http://www.bestsolution.at
Richard Laager
2006-05-06 06:23:47 UTC
Permalink
Post by Udo Rader
three or four years ago we had a major reiserfs corruption due to faulty
memory modules (as I know now). The result was a completely unusable and
unrepairable partition that could eventually, after two weeks or so, be
repaired with the help of Hans Reiser himself. Even though I really
appreciated direct involvement of the main developer, I will avoid it by
any means.
Not that I really care what you use, but don't you think it's a bit
illogical to avoid a filesystem based on problems you know were caused
by faulty hardware? Just how is the filesystem supposed to avoid
problems from bad RAM?

Richard
Udo Rader
2006-05-06 18:56:55 UTC
Permalink
Post by Richard Laager
Post by Udo Rader
three or four years ago we had a major reiserfs corruption due to faulty
memory modules (as I know now). The result was a completely unusable and
unrepairable partition that could eventually, after two weeks or so, be
repaired with the help of Hans Reiser himself. Even though I really
appreciated direct involvement of the main developer, I will avoid it by
any means.
Not that I really care what you use, but don't you think it's a bit
illogical to avoid a filesystem based on problems you know were caused
by faulty hardware? Just how is the filesystem supposed to avoid
problems from bad RAM?
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then. And that definitely left a bad aftertaste for
me.

It may be pure coincidence, but I never experienced anything equally
disastrous with any other FS (knocking on wood right now :-).

Udo Rader
--
BestSolution.at EDV Systemhaus GmbH
http://www.bestsolution.at
Charles Marcus
2006-05-08 03:44:40 UTC
Permalink
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO file
system repair tools could repair.
--
Best regards,

Charles
Ben
2006-05-08 03:52:30 UTC
Permalink
Certainly, but my completely anecdotal experience is that I've seen
lots of people complain about bad memory causing corruption on their
reiser partitions, while far fewer people complain about the same
problem with other file systems. I'm not saying reiser is inherently
fragile, but it's a suspicious correlation.

If reiser *wasn't* fragile, I'd expect to see more people complaining
how bad memory corrupted their ext3/xfs/whatever partition.
Post by Charles Marcus
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO
file system repair tools could repair.
--
Best regards,
Charles
Stewart Dean
2006-05-08 20:24:55 UTC
Permalink
Does anyone have experience with the JFS2 filesystem under AIX with DC?
Post by Ben
Certainly, but my completely anecdotal experience is that I've seen
lots of people complain about bad memory causing corruption on their
reiser partitions, while far fewer people complain about the same
problem with other file systems. I'm not saying reiser is inherently
fragile, but it's a suspicious correlation.
If reiser *wasn't* fragile, I'd expect to see more people complaining
how bad memory corrupted their ext3/xfs/whatever partition.
Post by Charles Marcus
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO file
system repair tools could repair.
--
Best regards,
Charles
--
====
Stewart Dean, Unix System Admin, Henderson Computer Resources
Center of Bard College, Annandale-on-Hudson, New York 12504
***@bard.edu voice: 845-758-7475, fax: 845-758-7035
Udo Rader
2006-05-08 05:56:14 UTC
Permalink
Post by Charles Marcus
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO file
system repair tools could repair.
Hmm, what kind of corruption should that be? I was not talking about
individual files being lost but an entire partition being inaccessible.

At least in my naive world this is something that should never happen at
all (unless the storage media breaks). AFAIK, any modern filesystem
keeps backups of mission critical data like for example superblocks, but
please feel free to correct me, if I am missing something here.

But in order to become on-topic again, what I was trying to say was that
the quality and availability of disaster recovery tools/procedures is of
major importance when choosing a FS for any server and that is where
reiserfs failed at least for my part.

Udo Rader
--
BestSolution.at EDV Systemhaus GmbH
http://www.bestsolution.at
Les Mikesell
2006-05-08 06:48:39 UTC
Permalink
Post by Udo Rader
Post by Charles Marcus
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO file
system repair tools could repair.
Hmm, what kind of corruption should that be? I was not talking about
individual files being lost but an entire partition being inaccessible.
Everything that's on the disk was written there from a memory
buffer.
Post by Udo Rader
At least in my naive world this is something that should never happen at
all (unless the storage media breaks). AFAIK, any modern filesystem
keeps backups of mission critical data like for example superblocks, but
please feel free to correct me, if I am missing something here.
If the memory buffer does not retain what the OS attempted to
store there, the on-disk copy isn't going to be correct
either - including as many copies as you might try to make.
Post by Udo Rader
But in order to become on-topic again, what I was trying to say was that
the quality and availability of disaster recovery tools/procedures is of
major importance when choosing a FS for any server and that is where
reiserfs failed at least for my part.
There's a reason that servers usually have ECC memory. You are
better off having uncorrectable errors stop the machine than
continuing with corruption.
--
Les Mikesell
***@gmail.com
grant beattie
2006-05-08 08:13:16 UTC
Permalink
Post by Udo Rader
Post by Charles Marcus
Post by Udo Rader
Yes, you are right, the cause for this incident was faulty memory and I
don't blame reiserfs for failing due to this. But the effect was an
unrepairable filesystem and that again was a problem with the repair
tools available then.
Not necessarily... faulty memory could cause corruption that NO file
system repair tools could repair.
Hmm, what kind of corruption should that be? I was not talking about
individual files being lost but an entire partition being inaccessible.
At least in my naive world this is something that should never happen at
all (unless the storage media breaks). AFAIK, any modern filesystem
keeps backups of mission critical data like for example superblocks, but
please feel free to correct me, if I am missing something here.
if you are really worried about silent filesystem corruption caused by
bad hardware, I strongly recommend investigating Solaris' ZFS, which
can detect and repair data corruption assuming you have some
redundancy (ie. mirror or raidz). plus it's always consistent on disk
(no fsck, ever). and you get ACLs, snapshots, online expansion, ...
the list goes on.
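
As a rough illustration of the detect-and-repair idea (this is a toy model, not ZFS's actual on-disk format or code), the sketch below keeps a checksum for a block alongside two mirrored copies. On read, a copy whose checksum does not match is treated as corrupt and rewritten from a copy that does match. SHA-256 and the function names are assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum stored separately from the data block itself."""
    return hashlib.sha256(data).hexdigest()


def read_with_repair(copies: list, expected: str) -> bytes:
    """Return verified data and overwrite any copy that fails verification.

    `copies` is a list of bytearray objects standing in for mirrored disks.
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("every copy failed its checksum; nothing to repair from")
    for index, copy in enumerate(copies):
        if checksum(bytes(copy)) != expected:
            copies[index] = bytearray(good)  # self-heal from the good copy
    return good


if __name__ == "__main__":
    block = b"From: someone@example.org\r\n\r\nhello\r\n"
    stored_sum = checksum(block)
    mirror = [bytearray(block), bytearray(block)]
    mirror[0][0] ^= 0xFF  # simulate silent corruption on one side
    assert read_with_repair(mirror, stored_sum) == block
    assert bytes(mirror[0]) == block
```

A checksum of this kind only catches data that changed after the checksum was computed. If a buffer is already wrong in RAM when it is checksummed, the bad data is stored and later verified as if it were correct.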

and getting back on topic, ZFS works very well for Maildir :)

grant.
Sergey Ivanov
2006-05-06 05:50:26 UTC
Permalink
Post by Daniel Watts
Post by Marc Perkel
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability? I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's concern is that it
is less well tested for bugs and corruption.
I was using reiserfs with maildir till September 2004, and since then
switched to reiser4. It works much better for me, that means economy in
space, improvement in speed and reliability. Including successful
repairing not only after power failures, but even after my own mistakes
like running for a short time fsck.ext3 on reiser4 partition with
"fixing" it.
But you should be aware that Reiser4 is now trying to get into the mainline
kernel, and to comply with the requirements they are rewriting their code. In my
opinion the most stable version was the patch for the 2.6.12 kernel. With newer
versions you may experience problems under a very high load (approx.
2,000,000 messages with ~100 msg/sec being stored).
I have never experienced such problems on a small server with 10 users,
4 GB of storage and about a thousand msg/day received.
Reiser4 outperforms reiserfs in all parameters, especially in reliability.
--
Sergey.
borsilinux at gmx.net ()
2006-05-06 21:41:15 UTC
Permalink
Post by Daniel Watts
Post by Marc Perkel
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability? I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's concern is that it
is less well tested for bugs and corruption.
Hi,
when i switched from mbox to Maildir half a year ago i wanted to put my
"Mail-Partition" on a reiserfs. The migration failed 3 times due to
reiserfs-corruption. I don't remember the exact error message, sorry.
Admitted, the transition put a heavy load on the fs (~4GB of Mails),
still it should have worked. I reformatted the partition with XFS and
all problems went away :).

Before that i used and recommended ReiserFS, now i use XFS whenever
possible and didn't have problems until now...
Post by Daniel Watts
Eg i see many people saying xfs is great but who wouldn't think of
having it put into production.
Wouter Van Hemel
2006-05-10 05:35:36 UTC
Permalink
On Fri, 05 May 2006 17:24:29 +0100
Post by Daniel Watts
Post by Marc Perkel
Post by Daniel Watts
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what
other options there are and how they compare.
Reiser has traditionally been a very good choice for maildir because
it has infinite inodes, it is very fast on directories with large
numbers of files, and it does sub allocation so small files take less
space. And it's very fast. Maildir is the area where Reiser does best.
Thanks for this - I have heard many maildir admins laud Reiser. How is
it for ongoing stability and reliability? I suppose with using any
non-mainstream technology (i.e. anything other than ext) the admin's concern is that it
is less well tested for bugs and corruption.
Eg i see many people saying xfs is great but who wouldn't think of
having it put into production.
XFS is in the kernel for quite some time already. I've been quite
doubtful about trying other filesystems in the past, but last year I
started some tests and my experiences are very positive. I've both xfs and
ext3 in use on production machines -- and ufs with softupdates enabled on
the BSD side of things.

I'm not going to make a recommendation because I don't know enough about
filesystems vs mailserver performance, but those of the filesystems I've
tried have been working very reliably and integrate as well with my
systems as classic ext2 does. I've had one case of minor fs corruption,
and that turned out to be a faulty disk.

In real life on general purpose servers, the gains have been quite
marginal, though. Filesystem change isn't a miracle cure for performance
problems, obviously; if that's the problem, more disks to spread the
transactions over make a much bigger difference I/O wise.


Regards,

Wouter
Les Mikesell
2006-05-10 05:49:54 UTC
Permalink
Post by Wouter Van Hemel
In real life on general purpose servers, the gains have been quite
marginal, though. Filesystem change isn't a miracle cure for performance
problems, obviously; if that's the problem, more disks to spread the
transactions over make a much bigger difference I/O wise.
If you put a huge number of files in the same directory, the
filesystem type can make a big difference in access time.
Remember that before you can create a new file you must
scan the current list first to see if that name already
exists and the whole operation has to happen atomically with
the directory locked. Filesystems that index the directories
can help compared to a linear scan although there are some
tradeoffs. Also some never shrink a directory when files
are removed so you continue to scan all the empty slots.

Long ago I used a benchmark program called 'postmark' to
test the speed of file creation/deletion operations that
are typical in maildir environments. I haven't been able
to find it recently although the last time I mentioned it
someone said it was in the debian repositories and available
via apt-get.
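
For readers without postmark at hand, a very rough stand-in for the create/delete part of such a test can be sketched as below. This is not postmark and makes no attempt to reproduce its transaction mix. The file count, the 2048-byte payload and the function name are assumptions for illustration, and the timings are only meaningful when compared on the same hardware.

```python
import os
import tempfile
import time


def time_create_delete(directory: str, count: int, size: int = 2048):
    """Create `count` small files in one directory, then delete them.

    Returns (seconds spent creating, seconds spent deleting).
    """
    payload = b"x" * size
    names = [os.path.join(directory, f"msg{i:08d}") for i in range(count)]

    start = time.perf_counter()
    for name in names:
        with open(name, "wb") as handle:
            handle.write(payload)
    created = time.perf_counter()

    for name in names:
        os.unlink(name)
    deleted = time.perf_counter()

    return created - start, deleted - created


if __name__ == "__main__":
    # Point `dir=` at a mount of the filesystem under test; the default
    # temporary location is used here only so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as workdir:
        create_s, delete_s = time_create_delete(workdir, 10000)
        print(f"create: {create_s:.2f}s  delete: {delete_s:.2f}s")
```

With enough RAM most of this is absorbed by the page cache, so a run that never forces data to disk mostly measures directory handling rather than disk throughput.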
--
Les Mikesell
***@gmail.com
Daniel Watts
2006-05-10 07:24:40 UTC
Permalink
Post by Les Mikesell
Long ago I used a benchmark program called 'postmark' to
test the speed of file creation/deletion operations that
are typical in maildir environments.
Can you remember any general findings?

Thanks to all for responding in this thread - I am reading them all and
making notes. All very useful information.

Can I phrase the question in a different way? Of those of you who are
running 10,000+ user system - what FS do you use?
Or do you all use special file servers (netapps, emc etc?) with
proprietary filesystems?

Daniel
Wouter Van Hemel
2006-05-10 08:32:47 UTC
Permalink
On Tue, 09 May 2006 16:49:33 -0500
Post by Les Mikesell
Post by Wouter Van Hemel
In real life on general purpose servers, the gains have been quite
marginal, though. Filesystem change isn't a miracle cure for
performance problems, obviously; if that's the problem, more disks to
spread the transactions over make a much bigger difference I/O wise.
If you put a huge number of files in the same directory, the
filesystem type can make a big difference in access time.
Remember that before you can create a new file you must
scan the current list first to see if that name already
exists and the whole operation has to happen atomically with
the directory locked. Filesystems that index the directories
can help compared to a linear scan although there are some
tradeoffs. Also some never shrink a directory when files
are removed so you continue to scan all the empty slots.
IIRC all typical filesystems for Linux (ext3, xfs, jfs, reiserfs) use
directory indexing, usually by means of a b-tree.

It's important to note that these filesystems each have their own
strengths, and performance will depend on many factors such as the size
and number of files, parallelism, number and type of disks,
fragmentation, i/o load, possibly even cpu load. Are we talking about a
relaying mailserver or end-user storage? Do the users move or delete a
lot of files? Do they rather use imap, or pop3? What other activities run
on the machine? How do you see the reliability/performance trade-off?

In real life, things aren't as clean-cut as in most of those generic
benchmarks, and people tend to attach too much importance to them and
then usually get into silly flamewars. :)
Post by Les Mikesell
Long ago I used a benchmark program called 'postmark' to
test the speed of file creation/deletion operations that
are typical in maildir environments. I haven't been able
to find it recently although the last time I mentioned it
someone said it was in the debian repositories and available
via apt-get.
I seem to remember to have used it once too, also in a very vague past.

Now, I'm not sure how valid the results would be if, for instance, there's
a webserver serving dynamic webmail pages at the same time...

In the past, I've spent (wasted) quite some time benchmarking things like
FreeBSD vs Linux, Perl vs PHP, template systems, etc. Now I believe that
people should just pick what they feel comfortable with, because the
differences are often not that large and it's rarely worth their time and
money.

(Though, that's often not what people want to hear. :) )
Les Mikesell
2006-05-10 08:48:32 UTC
Permalink
Post by Wouter Van Hemel
IIRC all typical filesystems for Linux (ext3, xfs, jfs, reiserfs) use
directory indexing, usually by means of a b-tree.
They are optional on ext3 and I don't think they are on
by default.
Post by Wouter Van Hemel
In real life, things aren't as clean-cut as in most of those generic
benchmarks, and people tend to attach too much importance to them and
then usually get into silly flamewars. :)
If the main use is a mail server with maildir storage the speed
of creating/deleting files is going to be the main factor.
Post by Wouter Van Hemel
In the past, I've spent (wasted) quite some time benchmarking things like
FreeBSD vs Linux, Perl vs PHP, template systems, etc. Now I believe that
people should just pick what they feel comfortable with, because the
differences are often not that large and it's rarely worth their time and
money.
Well, now you can usually afford to throw a few more gigs of
RAM in and let buffering solve the problem. That used to
be much more expensive.
--
Les Mikesell
***@gmail.com
Wouter Van Hemel
2006-05-10 09:39:18 UTC
Permalink
On Tue, 09 May 2006 19:48:19 -0500
Post by Les Mikesell
Post by Wouter Van Hemel
IIRC all typical filesystems for Linux (ext3, xfs, jfs, reiserfs) use
directory indexing, usually by means of a b-tree.
They are optional on ext3 and I don't think they are on
by default.
That, I don't know either, but it would make more sense if they would be.
Post by Les Mikesell
If the main use is a mail server with maildir storage the speed
of creating/deleting files is going to be the main factor.
That depends on one's priorities.

I would go for reliability and recovery possibility. Those are different
requirements, but as valid. Web content can be uploaded easily, but
mailbox recovery is a messy affair. I've not had a machine with i/o being
the limiting factor -- at least for imap mailbox storage.

I suppose one could shave off some milliseconds, but that hasn't been my
priority for mailbox storage servers. And personally, I would first try to
get the mailqueue (if any) and dovecot's indexes on another disk.

Apart from those considerations, I totally agree that it makes sense to
look at filesystems that deal well with operations on small files.
Post by Les Mikesell
Post by Wouter Van Hemel
In the past, I've spent (wasted) quite some time benchmarking things
like FreeBSD vs Linux, Perl vs PHP, template systems, etc. Now I
believe that people should just pick what they feel comfortable with,
because the differences are often not that large and it's rarely
worth their time and money.
Well, now you can usually afford to throw a few more gigs of
RAM in and let buffering solve the problem. That used to
be much more expensive.
True.
Jan Kundrát
2006-05-11 01:20:40 UTC
Permalink
Post by Wouter Van Hemel
Post by Les Mikesell
Post by Wouter Van Hemel
IIRC all typical filesystems for Linux (ext3, xfs, jfs, reiserfs) use
directory indexing, usually by means of a b-tree.
They are optional on ext3 and I don't think they are on
by default.
They aren't.
Post by Wouter Van Hemel
That, I don't know either, but it would make more sense if they would be.
FYI, from `man mke2fs` ("mke2fs 1.38 (30-Jun-2005), Using EXT2FS Library
version 1.38"):

-O feature[,...]
       Create filesystem with given features (filesystem options),
       overriding the default filesystem options. *Currently, the
       sparse_super and filetype features are turned on by default*
       when mke2fs is run on a system with Linux 2.2 or later (unless
       creator-os is set to the Hurd).
       [..]
       dir_index
              Use hashed b-trees to speed up lookups in large
              directories.
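
To check whether an existing ext2/ext3 filesystem already has the feature, one option is to look at the "Filesystem features:" line that `tune2fs -l` prints. The sketch below only parses such output. The sample text is made up for illustration and the function names are hypothetical.

```python
def filesystem_features(tune2fs_output: str) -> set:
    """Extract the feature names from `tune2fs -l` style output."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return set(value.split())
    return set()


def has_dir_index(tune2fs_output: str) -> bool:
    """True if the dir_index feature is listed."""
    return "dir_index" in filesystem_features(tune2fs_output)


# Illustrative sample only, not captured from a real system.
SAMPLE = """\
Filesystem volume name:   <none>
Filesystem features:      has_journal filetype sparse_super dir_index
Block size:               4096
"""

if __name__ == "__main__":
    print("dir_index enabled:", has_dir_index(SAMPLE))
```

If the feature is missing, it can be turned on for an existing filesystem with `tune2fs -O dir_index`. Directories that already exist are only indexed after an `e2fsck -fD` run on the unmounted filesystem.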
--
cd /local/pub && more beer > /dev/mouth
Lorens
2006-05-10 16:02:28 UTC
Permalink
Post by Wouter Van Hemel
It's important to note that these filesystems each have their own
strengths, and performance will depend on many factors such as the size
and number of files, parallelism, number and type of disks,
fragmentation, i/o load, possibly even cpu load. Are we talking about a
relaying mailserver or end-user storage? Do the users move or delete a
lot of files? Do they rather use imap, or pop3? What other activities run
on the machine? How do you see the reliability/performance trade-off?
In real life, things aren't as clean-cut as in most of those generic
benchmarks, and people tend to attach too much importance to them and
then usually get into silly flamewars. :)
Agreed, mostly. I feel I should point out the only time I have
ever noticed a real-life difference between journaling
file systems during normal operation.

I use an rsync backup scheme that depends heavily on hard links
(googling rsync backup hard link should give you details). A
typical day's backup would consist of maybe a thousand files
spread across a directory tree consisting of some ten thousand
directories and some hundred thousand+ hard links to files
shared between daily backups. I moved the storage from ext3 to
reiser3. I noted that deleting the directory tree representing a
day's backup went from insignificant minutes to tens of minutes,
hundreds of minutes, sometimes up to a thousand-some minutes.
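
For anyone unfamiliar with the scheme, the idea is that each day's tree contains a real copy only of files that changed, and a hard link to the previous day's copy for everything else (rsync offers this through its `--link-dest` option). The following is a simplified sketch of that idea and not what rsync does internally. The function name and directory layout are assumptions.

```python
import filecmp
import os
import shutil
from typing import Optional


def snapshot(source: str, previous: Optional[str], dest: str) -> None:
    """Build a snapshot of `source` in `dest`.

    Files identical to the copy in `previous` are hard-linked to it;
    new or changed files are copied.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(dest, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dest, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: share the same inode
            else:
                shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

Deleting one day's tree then means unlinking every one of those directory entries, which is exactly the operation whose cost differed so much between the two filesystems.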

I found benchmarks that seemed to confirm this advantage of
ext3 over reiser3, so I didn't try any tuning. I didn't feel
strongly enough about it to move the server back to ext3, I just
separated the deleting from the backing up so that the one did
not delay the other.

Of course this use is probably not relevant to a server used for
Maildir or mbox, but it's worth noting that while there can be
differences in performance, the usage pattern has to be really
extreme before you see any difference at the application layer.
Post by Wouter Van Hemel
Post by Les Mikesell
Long ago I used a benchmark program called 'postmark' to
test the speed of file creation/deletion operations that
are typical in maildir environments. I haven't been able
to find it recently although the last time I mentioned it
someone said it was in the debian repositories and available
via apt-get.
I seem to remember to have used it once too, also in a very vague past.
Well, I have a debian machine, so it's easy to check!

% postmark 1.51-3 File system benchmark from NetApp
%
% Benchmark that's based around small file operations similar to
% those used on large mail servers and news servers. Has been
% ported to NT so should be good for comparing OSs.
%
% http://www.netapp.com/tech_library/postmark.html

HTH.
Les Mikesell
2006-05-10 20:44:35 UTC
Permalink
Post by Lorens
Post by Wouter Van Hemel
I seem to remember to have used it once too, also in a very vague past.
Well, I have a debian machine, so it's easy to check!
% postmark 1.51-3 File system benchmark from NetApp
%
% Benchmark that's based around small file operations similar to
% those used on large mail servers and news servers. Has been
% ported to NT so should be good for comparing OSs.
%
% http://www.netapp.com/tech_library/postmark.html
HTH.
That link no longer works, which is why I couldn't find
it before - but the source is probably somewhere in the
debian repository too. Maybe I can fire up an Ubuntu
livecd and find it.
--
Les Mikesell
***@gmail.com
Daniel Watts
2006-05-10 18:14:07 UTC
Permalink
Post by Wouter Van Hemel
IIRC all typical filesystems for Linux (ext3, xfs, jfs, reiserfs) use
directory indexing, usually by means of a b-tree.
It's important to note that these filesystems each have their own
strengths, and performance will depend on many factors such as the size
and number of files, parallelism, number and type of disks,
fragmentation, i/o load, possibly even cpu load. Are we talking about a
relaying mailserver or end-user storage? Do the users move or delete a
lot of files? Do they rather use imap, or pop3? What other activities run
on the machine? How do you see the reliability/performance trade-off?
In real life, things aren't as clean-cut as in most of those generic
benchmarks, and people tend to attach too much importance to them and
then usually get into silly flamewars. :)
For the sake of this discussion I think it would be most relevant to
assume a machine dedicated to Dovecot IMAP Maildir storage without any
other services running on it.
In terms of load type does it matter whether there are a lot of light
usage users or just a few massively heavy usage users? Maximal mailbox
reads and writes is probably the best abstract way to avoid the 'how
many users / how many mail files?' problem.
Post by Wouter Van Hemel
Now I believe that
people should just pick what they feel comfortable with, because the
differences are often not that large and it's rarely worth their time and
money.
(Though, that's often not what people want to hear. :) )
Actually that is my gut feeling too. However some people don't really
have a preference and so might as well spend a bit of time choosing the
best system in terms of (theoretical) performance for their particular
application.
Ben Winslow
2006-05-10 21:21:26 UTC
Permalink
Post by Les Mikesell
Long ago I used a benchmark program called 'postmark' to
test the speed of file creation/deletion operations that
are typical in maildir environments.
Aha. I hadn't heard of postmark before, but I'll have to check it out.
It sounds useful enough. Thanks!
Post by Les Mikesell
I haven't been able to find it recently although the last time
I mentioned it someone said it was in the debian repositories
and available via apt-get.
The upside of this is that it means the original source is mirrored
(even though the URL is gone):
http://http.us.debian.org/debian/pool/main/p/postmark/postmark_1.51.orig.tar.gz
--
Ben Winslow <***@bluecherry.net>

Ben Winslow
2006-05-08 12:59:43 UTC
Permalink
Post by Daniel Watts
Hi,
I've heard that for Dovecot/Maildir systems there are filesystems that
are optimised for the situation of many small files in one folder.
Could I possibly have some feedback on what the recommended filesystems
are? I've heard of ReiserFS but was wondering what other options there
are and how they compare.
If I get a good comprehensive response I'll build a wiki summary page
out of the data gathered.
Best wishes,
Daniel
From my personal experiences, I'd heartily recommend xfs.

I've been using reiserfs since around the time it was merged into the
stock kernel and was the only journalling fs in the main kernel tree. I
still use reiserfs in a few places where it hasn't been practical to
convert to xfs.

I started using xfs on my workstation shortly before it became part of
the main kernel tree, because I was quite interested in POSIX ACLs and
it also performed better than reiserfs in my testing. Since that time,
usage has fanned out to most of the boxes I administer, and I've found
it performs quite a bit better than reiserfs for me -- especially when
dealing with lots of small files (e.g. Maildir.)

I'm echoing some of the more recent conversation now, but perhaps just
as important as raw performance, or more so, is failure recovery: 4-5
years of experience with each FS is ample time to see some hardware
failures, and reiserfs has dealt rather poorly with filesystem
corruption in my experience.

Most recently, I had a handful of sectors go bad on a drive full of
Maildirs, and this was brought to my attention not by kernel errors
being logged, but by the system spontaneously and repeatedly rebooting.
xfs, on the other hand, has been extremely graceful when it runs into fs
corruption -- something especially important when physical access to the
system isn't readily available (a few of the boxes I admin are ~900mi
away.)

My other complaint with reiserfs is that reiserfsck is painstakingly
slow -- especially when you need to resort to --rebuild-tree (as I did
in the above scenario) -- which means more downtime when something
Really Bad(tm) happens. I don't remember how long it took to repair
that filesystem once I'd moved it to another drive, but I'm sure it was
at least a couple of hours.

Unfortunately, between xfs and reiserfs, I haven't extensively used any
other filesystems recently enough to have a good idea of Maildir
performance or how well they deal with hardware failures. I would
recommend xfs over reiserfs in a heartbeat, though, after having dealt
with both on failing drives.

YMMV, of course -- these are just my experiences.

HTH,
--
Ben Winslow <***@bluecherry.net>
Fran Fabrizio
2006-05-10 23:45:22 UTC
Permalink
Post by Daniel Watts
I've heard that for Dovecot/Maildir systems there are filesystems that
are optimised for the situation of many small files in one folder.
Could I possibly have some feedback on what the recommended
filesystems are? I've heard of ReiserFS but was wondering what other
options there are and how they compare.
I happen to also be investigating this, and I too keep hearing that
Reiser is the way to go for Maildirs. However, when I went looking for
benchmarks to support this, the information out there was sparse,
confusing and contradictory. For instance, here's a benchmark showing
Reiser being among the slowest filesystems in many areas...
http://linuxgazette.net/122/piszcz.html

Here's a guy who claims that Maildir + ReiserFS is a win but then when
you look at his numbers it appears that Reiser was slower than ext3...
http://www.decisionsoft.com/pdw/mailbench.html

This guy has Reiser as the slowest...
http://www.thesmbexchange.com/eng/qmail_fs_benchmark.html

And so on. You can find benchmarks that show Reiser as the best, ext3
as the best, xfs as the best, jfs as the best. All for maildirs. So,
I'm not sure if I can trust any of the benchmarks.

One thing that seemed consistent is that everyone had xfs performing
pretty well. I am configuring a new RAID array for my Maildirs and so I
am in a position where I can test various filesystems now. I tried xfs
on Fedora Core 5 and it crashed the kernel, so that was a bummer. As
soon as I did the mkfs -t xfs and rebooted, the system hung during startup
right when it was looking at the disk array to find the filesystems. I
am probably going to go to suse enterprise linux or redhat enterprise
linux for this server and see if stability is there, but I'm really
starting to lean towards the "ext3 because it works good enough" camp,
and optimize elsewhere. More disks in the array, more memory in the
servers, etc...

-Fran
--
Fran Fabrizio
Senior Systems Analyst
Department of Computer and Information Sciences
University of Alabama at Birmingham
http://www.cis.uab.edu/
205.934.0653
Roger Weeks
2006-05-11 00:16:39 UTC
Permalink
Date: Wed, 10 May 2006 00:24:23 +0100
Subject: Re: [Dovecot] Recommended FS for Dovecot Maildir
Can you remember any general findings?
Thanks to all for responding in this thread - I am reading them all and
making notes. All very useful information.
Can I phrase the question in a different way? Of those of you who are
running 10,000+ user system - what FS do you use?
Or do you all use special file servers (netapps, emc etc?) with
proprietary filesystems?
Daniel
We're probably right at your 10000 user threshold, but we're an ISP
so we get a LOT of mail traffic. Our mail system consists of dual
exim servers processing incoming mail, writing to maildirs on a home-
built NFS system. The home-built system is a dual-xeon machine with
lots of SATA disks in a RAID5. We then have two client-facing
machines running dovecot and squirrelmail.

These are all SuperMicro servers running RedHat ES 4. They all use
SATA disks. We are using EXT3 for the RAID5 filesystem, and I have
had no performance issues to speak of. I had considered Reiser or
xfs, but as I have limited experience with either filesystem I didn't
feel comfortable using them in production.

--
Roger J. Weeks
Systems & Network Administrator
Mendocino Community Network