Discussion:
Architecture for large Dovecot cluster
Murray Trainer
2014-01-05 13:06:37 UTC
Permalink
Hi All,

I am trying to determine whether a mail server cluster based on Dovecot
will be capable of supporting 500,000+ mailboxes with about 50,000 IMAP
and 5000 active POP3 connections. I have looked at the Dovecot
clustering suggestions here:

http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html

and some other Dovecot mailing list threads but I am not sure how many
users such a setup will handle. I have a concern about the I/O
performance of NFS in the suggested architecture above. One possible
option available to us is to split up the mailboxes over multiple
clusters with subsets of domains. Is there anyone out there currently
running this many users on a Dovecot based mail cluster? Some
suggestions or advice on the best way to go would be greatly appreciated.

Thanks

Murray
Robert Schetterer
2014-01-05 19:33:53 UTC
Permalink
Post by Murray Trainer
Hi All,
I am trying to determine whether a mail server cluster based on Dovecot
will be capable of supporting 500,000+ mailboxes with about 50,000 IMAP
and 5000 active POP3 connections. I have looked at the Dovecot
As long as you have some load balancing and/or a proxy/director with a few
servers on good modern hardware, you don't have to worry about POP3; 5,000
POP3 logins per minute should work with minor tuning. No idea about the
asked number of IMAP connections, though.
Post by Murray Trainer
http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html
Good article, but there are many ways to reach this goal, depending
on your budget etc. For example, you don't have to use NFS; you may consider
cluster file systems with DRBD and/or Ceph or the like, and there are also
many paid solutions for storage I/O,
which is the most sensitive part. Think about using dbox or mdbox as the
mailbox format, what mailbox quota you'd like to offer, etc.
Post by Murray Trainer
and some other Dovecot mailing list threads but I am not sure how many
users such a setup will handle. I have a concern about the I/O
performance of NFS in the suggested architecture above. One possible
option available to us is to split up the mailboxes over multiple
clusters with subsets of domains. Is there anyone out there currently
running this many users on a Dovecot based mail cluster? Some
suggestions or advice on the best way to go would be greatly appreciated.
Look through the list archive for similar setups, ask Timo or other people
for paid support, and wait for people to report their big setups.
Post by Murray Trainer
Thanks
Murray
Best Regards
MfG Robert Schetterer
--
[*] sys4 AG

http://sys4.de, +49 (89) 30 90 46 64
Franziskanerstraße 15, 81669 München

Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263
Vorstand: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
Aufsichtsratsvorsitzender: Florian Kirstein
Ed W
2014-01-23 16:57:22 UTC
Permalink
Hi
Post by Robert Schetterer
Post by Murray Trainer
and some other Dovecot mailing list threads but I am not sure how many
users such a setup will handle. I have a concern about the I/O
performance of NFS in the suggested architecture above. One possible
option available to us is to split up the mailboxes over multiple
clusters with subsets of domains. Is there anyone out there currently
running this many users on a Dovecot based mail cluster? Some
suggestions or advice on the best way to go would be greatly appreciated.
look about list archive for equal setups , ask Timo or other people for
paid support, wait for people reporting their big setups
It's difficult for me (on the outside) to gauge how many people do pay
Timo, et al for services. However, just to put a stake in the ground, I
have "employed" Timo on a couple of occasions, just for small projects,
but in my case to add new features or fix bugs which are specific to my
requirements. I can very positively recommend this: I found Timo
extremely helpful, and although I only paid an affordable amount to have
a feature added, he has kindly continued to maintain these features as
part of the core software (for which I am extremely grateful).

I'm very satisfied and can highly recommend Timo. His prices were
extremely reasonable and the service he offered was excellent.

This is obviously a glowing endorsement, take that as you wish. However,
I suspect that sometimes we are all guilty of forgetting that there are
humans on the far side of these projects and for relatively affordable
sums we can employ them to both help us out (and possibly benefit all
users of the software). I don't have big pockets, but I have
successfully asked for enhancements to several open source projects
(dovecot/dnsmasq/shorewall/squid and some others) and the whole
experience has worked very well for me.

Please feel encouraged to employ Timo if you use Dovecot!

Good luck

Ed W
Charles Marcus
2014-01-23 17:04:22 UTC
Permalink
Post by Ed W
I'm very satisfied and have to highly recommend Timo. His prices were
extremely reasonable and he offered service excellent.
...snip...
Please feel encouraged to employ Timo if you use Dovecot!
I will add a hearty 'seconded!' to this endorsement.

Timo helped migrate our old courier-imap setup and did so quickly and
efficiently. A few legacy config issues prevented us from switching to
the dovecot LDA at the time, but he explained in detail what I needed to
do, and when I migrated our old bare metal gentoo mail server to a shiny
new virtualized one, I made the changes and everything just worked (with
a few minor issues I had to fix, also related to the same legacy config
issues)...

I just wish my boss was more open to spending money on technology so I
could engage Timo to do a few more things...
--
Best regards,

Charles
Stan Hoeppner
2014-01-24 12:10:01 UTC
Permalink
Sven, why didn't you chime in? Your setup is similar scale and I think
your insights would be valuable here. Or maybe you could repost your
last on this topic. Or was that discussion off list? I can't recall.

Anyway, I missed this post Murray. Thanks Ed for dredging this up.
Maybe this will give you some insight, or possibly confuse you. :)
Post by Murray Trainer
Hi All,
I am trying to determine whether a mail server cluster based on Dovecot
will be capable of supporting 500,000+ mailboxes with about 50,000 IMAP
and 5000 active POP3 connections. I have looked at the Dovecot
http://blog.dovecot.org/2012/02/dovecot-clustering-with-dsync-based.html
and some other Dovecot mailing list threads but I am not sure how many
users such a setup will handle. I have a concern about the I/O
performance of NFS in the suggested architecture above. One possible
option available to us is to split up the mailboxes over multiple
clusters with subsets of domains. Is there anyone out there currently
running this many users on a Dovecot based mail cluster? Some
suggestions or advice on the best way to go would be greatly appreciated.
As with MTAs, Dovecot requires minuscule CPU power for most tasks. Body
searches are the only operations that eat meaningful CPU, and only when
indexes aren't up to date.

As with MTAs, mailbox server performance is limited by disk IO, but it
is also limited by memory capacity, as IMAP connections are long-lived,
unlike MTA connections, which last only a few seconds each.

Thus, very similar to the advice I gave you WRT MTAs, you can do this
with as few as two hosts in the cluster, or as many as you want. You
simply need sufficient memory for concurrent user connections, and
sufficient disk IO.

The architecture of the IO subsystem depends greatly on which mailbox
format you plan to use. Maildir is extremely metadata heavy and thus
does not perform all that well with cluster filesystems such as OCFS or
GFS, no matter how fast the SAN array controller and disks may be. It
can work well with NFS. Mdbox isn't metadata heavy and works much
better with cluster filesystems.

Neither NFS nor a cluster filesystem setup can match the performance of
a standalone filesystem on direct attached disk or a SAN LUN. But
standalone filesystems make less efficient use of total storage
capacity. And if using DAS, failover, resiliency, etc. are far less
than optimal.

With correct mail routing from your MTAs to your Dovecot servers, and
with Dovecot director, you can use any of these architectures. Which
one you choose boils down to:

1. Ease of management
2. Budget
3. Storage efficiency

The NFS and cluster filesystem solutions are generally significantly
more expensive than filesystem on DAS, because the NFS server and SAN
array required for 500,000 mailboxes are costly. If you go NFS you
better get a NetApp filer. Not just for the hardware, snapshots, etc,
but for the engineering support expertise. They know NFS better than
the Pope knows Jesus and can get you tuned for max performance.

Standalone servers/filesystems with local disk give you dramatically
more bang for the buck. You can handle the same load with fewer servers
and with quicker response times. You can use SAN storage instead of
direct attach, but at cost equivalent to the cluster filesystem
architecture. You'll then benefit from storage efficiency, PIT
snapshots, etc.

Again, random disk IOPS is the most important factor with mailbox
storage. With 50K logged-in IMAP users and 5K POP3 users, we simply
have to guesstimate IOPS if you don't already have this data. I assume
you don't as you didn't provide it. It is the KEY information required
to size your architecture properly, and in the most cost effective manner.

Let's assume for argument's sake that your 50K concurrent IMAP users and
your 5K POP3 users generate 8,000 IOPS, which is probably a high guess.
10K SAS drives do ~225 IOPS.

8000/225= 36 disks * 2 for RAID10 = 72

So as a wild-ass guesstimate you'd need approximately 72 SAS drives at
10K spindle speed, across multiple enclosures, for this workload. If you
need to use high-capacity 7.2K SATA or SAS drives to meet your offered
mailbox capacity you'll need 144 drives.
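The spindle math above can be wrapped in a quick back-of-the-envelope sketch. The per-drive IOPS figures are the post's rough rules of thumb, not measurements, and the 7.2K figure assumes roughly half the IOPS of a 10K drive:

```python
import math

# Back-of-the-envelope spindle sizing for a random-IO mailbox workload.
# 225 IOPS per 10K SAS drive and ~112 per 7.2K drive are rough
# rule-of-thumb assumptions, not measured values.
def spindles_needed(workload_iops, iops_per_drive, raid10=True):
    """Drives needed to absorb a random-IO workload; doubled for RAID10 mirrors."""
    data_drives = math.ceil(workload_iops / iops_per_drive)
    return data_drives * 2 if raid10 else data_drives

print(spindles_needed(8000, 225))  # 10K SAS       -> 72 drives
print(spindles_needed(8000, 112))  # 7.2K SATA/SAS -> 144 drives
```

Note the rounding direction: always round the data-drive count up before doubling for mirrors, or a fractional drive's worth of IOPS quietly disappears from the budget.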

Whether you go NFS, cluster on SAN, or standalone filesystems on SAN,
VMware with HA, Vmotion, etc. is a must, as it gives you instant host
failover and far easier management than KVM, Xen, etc.

One possible hardware solution consists of:

Qty 1. HP 4730 SAN controller with 25x 600GB 10K SAS drives
Qty 3. Expansion chassis for 75 drives, 45TB raw capacity, 21.6TB
net after one spare per chassis and RAID10, 8100 IOPS.
Qty 2. Dell PowerEdge 320, 4 core Xeon and 96GB RAM, Dovecot
Qty 1. HP ProLiant DL320e with 8GB RAM running Dovecot Director

You'd run ESX on each Dell with one Linux guest per physical box. Each
guest would be allocated 46GB of RAM to facilitate failover. This much
RAM is rather costly, but Vmware licenses are far more, so it saves
money using a beefy 2 box cluster vs a 3/4 box cluster of weaker
machines. You'd create multiple RAID10 arrays using a 32KB strip size
on the 4730 of equal numbers of disks, and span the RAID sets into 2
volumes. You'd export each volume as a LUN to both ESX hosts. You'd
create an RDM of each LUN and assign one RDM to each of your guests.
Each guest would format its RDM with

~# mkfs.xfs -d agcount=24 /dev/[device]

giving you 24 allocation groups for parallelism. Do -not- align XFS
(sunit/swidth) with a small file random IO workload. It will murder
performance. You get two 10TB filesystems, each for 250,000 mailboxes,
or ~44MB average per mailbox. If that's not enough storage, buy the
900GB drives for 66MB/mailbox. If that's still not enough, use more
expansion chassis and more RAID sets per volume, or switch to a large
cap SAS/SATA model. With 50K concurrent users, don't even think about
using RAID5/6. The RMW will murder performance and then urinate on its
grave.

With HA configured, if one box or one guest dies, the guest will
automatically be restarted on the remaining host. Since both hosts see
both LUNs, and RDMs, the guest boots up and has its filesystem. This is
an infinitely better solution than a single shared cluster filesystem.
The dual XFS filesystems will be much faster. If the CFS gets corrupted
all your users are down--with two local filesystems only half the users
are down. Check/repair of a 20TB GFS2/OCFS2 filesystem will take -much-
longer than xfs_repair on a 10TB FS, possibly hours once you have all
500K mailboxes on it. Etc, etc.
--
Stan
Javier de Miguel Rodríguez
2014-01-24 12:24:50 UTC
Permalink
Great mail, Stan

Another trick: you can save storage (both space & IOPS) by using mdbox and
compression. CPU power is far cheaper than IOPS; the less data you
read/write, the fewer IOPS you need.

You can use gzip, bzip2 or even LZMA/xz compression for the LDA. If you also
use Single Instance Storage and alternate (cheap) storage for old mail,
you can save a lot of money on storage. Also consider using mdbox + SSD
for indexes (HP StoreVirtual VSA + a couple of ESXi hosts with SSD disks will
give you a real-time replicated SSD iSCSI LUN for indexes)

Just my 2 cents.

Regards

Javier
Stan Hoeppner
2014-01-25 00:49:09 UTC
Permalink
Post by Javier de Miguel Rodríguez
Great mail, Stan
Another trick: you can save storage (both space & IOPS) by using mdbox and
compression. CPU power is far cheaper than IOPS; the less data you
read/write, the fewer IOPS you need.
Yeah, the cost of enterprise storage is insane. But I'd be wary of
using compression on primary storage with 50K concurrent IMAP users plus
5K POP users. Even with dozens of cores of horsepower it'll still add
latency. For alt storage sure. Using compression on primary storage
would make system sizing much more difficult WRT core counts, clock
speed, and memory requirements. And it would need much load testing.
Post by Javier de Miguel Rodríguez
You can use gzip, bzip2 or even LZMA/xz compression for the LDA. If you also
use Single Instance Storage and alternate (cheap) storage for old mail,
you can save a lot of money on storage. Also consider using mdbox + SSD
for indexes (HP StoreVirtual VSA + a couple of ESXi hosts with SSD disks will
give you a real-time replicated SSD iSCSI LUN for indexes)
I don't know how much SIS would benefit an Australian service provider.
I don't know the culture or people's "forwarding" habits. If it's like
parts of The States it may help some. Alt storage definitely would. To
me your SSD suggestion just puts extra write wear on the SSDs. A form
of SAN flash cache would be better. In the case of the VSAs they have
tons of memory, 12 slots, so having fast hot indexes probably wouldn't
be an issue. But obviously the HP gear isn't the only game in town.
--
Stan
Sven Hartge
2014-01-24 12:42:59 UTC
Permalink
Post by Stan Hoeppner
Sven, why didn't you chime in? Your setup is similar scale and I
think your insights would be valuable here. Or maybe you could repost
your last on this topic. Or was that discussion off list? I can't
recall.
Rather busy right now with a large-scale Identity Management+AD rollout
here, so unfortunately not too much time to elaborate on my setup in
great detail.

But after testing the shared-nothing 6-node cluster setup with imapc as
the backend for shared folders, I concluded that this does not scale very
well (the imapc part, that is) and changed my plans to a director-based,
NFS-backed (NetApp 3240) setup, which is much more common.

I reckoned I'd be nearly the only one on this planet crazy enough to
try a backwards normal-user-as-master-user-for-imapc setup for
shared folders, and that having anyone other than me understand that
setup, let alone getting support for it, would be too big a hassle.

So I put the mdbox storage on two 15k-SAS NetApps with 1TB FlashCache,
connected with 2x 10GBit to the SAN, using NFS to mount the volumes on
my 6 backend Dovecot servers, with 2 director Dovecots in front, which
will sit behind a Linux IPVS load balancer. All systems are VMs on
ESX.

I recently added two more shelves with SATA drives to the NetApp to use
as storage for Dovecot's alt-storage feature, which automatically
migrates mails older than 180 days to less expensive storage.

As of now, the system is not yet live (see IDM rollout above), I hope to
resume my migration in late spring, early summer.

But initial synthetic benchmarks have shown that this setup will
be more than sufficient to provide the needed oomph for my 15k users,
with enough room to grow.

Interesting datapoint: NetApp deduplication only recovered about 1% of
storage space with mdbox-based mail storage, while on maildir-based
mail storage the rate was about 15%. (This was tested with a copy of
real user data, so it is accurate for my workload.)

Grüße,
Sven.
--
Sigmentation fault. Core dumped.
Stan Hoeppner
2014-01-25 03:23:26 UTC
Permalink
This went to me only, so I'm bringing it back on the list.
Is anybody using the Object Storage plugin for large-scale
installations?
I've not used it.
We're considering it, but are thinking of an in-house
S3 storage system (riak, or ceph, or ?) Looking to support perhaps
300k users. I was thinking that if we use a bank of dovecot servers
(with director) with ssds as cache, we might be able to consolidate
all the storage on something like a riak cluster, which would make
scaling simple and inexpensive - certainly much less than a NetApp
solution.
Everything costs less than a NetApp...except an EMC.
If anyone has any first-hand experience (or even
off-the-top-of-their-head thoughts), I'd love to hear them)
Distributed filesystems give you the advantage of a single filesystem
namespace with massive amounts of storage, fairly easy addition of
storage space, and distributed replication to allow failure of a storage
node without service interruption.

Replication mitigates node failure, but not disk failure, so you still
need RAID in each node. So you have RAID6 in a node plus filesystem
block mirroring amongst nodes. Thus storage utilization is -worse- than
direct attach, CFS on SAN, or an NFS head with attached RAID10, and far
worse than RAID6 in these 3 setups. And if using a large SSD cache you'd
surely use RAID6 with DAS, CFS, or NFS. You'd need half as many disk
drives vs DFS.

Each DFS expansion, assuming the typical model, entails the cost of a
server, a RAID HBA (unless using md) and disks, rather than just buying
disks as with DAS, CFS/SAN, or an NFS filer. Then you also need more switch
ports, more power connections, greater UPS capacity due to all the CPUs,
RAM, etc in the nodes. And you'll have a higher electric bill.

So while a distributed filesystem storage architecture may seem less
expensive, it may not be. And just as one can build a DIY DFS cluster,
one can build a DIY NFS cluster instead of buying a NetApp, saving
significant cash on the front-end box and on disks, since you'd need half
as many as with a distributed filesystem architecture, though failure of
one node may not be quite as graceful as a NetApp losing a controller
board.
--
Stan
Tom Johnson
2014-01-26 17:45:31 UTC
Permalink
Post by Stan Hoeppner
Is anybody using the Object Storage plugin for large-scale
installations?
I've not used it.
We're considering it, but are thinking of an in-house
S3 storage system (riak, or ceph, or ?) Looking to support perhaps
300k users. I was thinking that if we use a bank of dovecot servers
(with director) with ssds as cache, we might be able to consolidate
all the storage on something like a riak cluster, which would make
scaling simple and inexpensive - certainly much less than a NetApp
solution.
Everything costs less than a NetApp...except an EMC.
If anyone has any first-hand experience (or even
off-the-top-of-their-head thoughts), I'd love to hear them)
(Stan gives a great run-down on the economics of using a NetApp or even homegrown NFS filer versus using an object storage backend.)

I am quite familiar with NetApp, and EMC - I used to have a number of Celerra file servers back in my BigFish/FrontBridge days.

But now I'm in a situation where I have dozens of servers with spare storage bays and unused CPU cycles sitting in data centers where the power is already provisioned, and a DFS is what makes most sense for me now.

So, I would like to ask once again: is anyone on the list using the object storage plugin for dovecot at any reasonably large scale, whether it's an in-house storage solution or S3?

Thanks-

Tom
Stan Hoeppner
2014-01-28 04:56:55 UTC
Permalink
Post by Tom Johnson
Post by Stan Hoeppner
On 1/24/2014 11:09 AM, Tom Johnson wrote: Is anybody using the
Object Storage plugin for large-scale installations?
I've not used it.
We're considering it, but are thinking of an in-house S3 storage
system (riak, or ceph, or ?) Looking to support perhaps 300k
users. I was thinking that if we use a bank of dovecot servers
(with director) with ssds as cache, we might be able to
consolidate all the storage on something like a riak cluster,
which would make scaling simple and inexpensive - certainly much
less than a NetApp solution.
Everything costs less than a NetApp...except an EMC.
If anyone has any first-hand experience (or even
off-the-top-of-their-head thoughts), I'd love to hear them)
(Stan gives a great run-down on the economics of using a NetApp or
even homegrown NFS filer versus using an object storage backend.)
Tom I'm sorry I wasted your time with my initial response.
Post by Tom Johnson
I am quite familiar with NetApp, and EMC - I used to have a number of
Celera file servers back in my BigFish/FrontBridge days.
But now I'm in a situation where I have dozens of servers with spare
storage bays and unused CPU cycles sitting in data centers where the
power is already provisioned, and a DFS is what makes most sense for
me now.
Had I known these details up front I wouldn't have responded. I
incorrectly assumed you were designing new infrastructure, wading into
new waters, because few have yet deployed DFS for mailbox storage
these days.
Post by Tom Johnson
So, I would like to ask once again- is anyone on the list using the
object storage plugin for dovecot at any reasonably large scale,
whether it's an in-house storage solution or S3?
I'm hoping, as I'd guess you are, that someone will respond who is
already doing this. If someone has it working well it offers others
more storage options, which is always a good thing. Whether it costs
more or less than the other solutions, it may still be a better option
for some folks either way.
--
Stan
Thomas Johnson
2014-01-28 05:25:10 UTC
Permalink
Hi Stan-
Post by Stan Hoeppner
Post by Tom Johnson
(Stan gives a great run-down on the economics of using a NetApp or
even homegrown NFS filer versus using an object storage backend.)
Tom I'm sorry I wasted your time with my initial response.
No, you absolutely didn't waste my time, and it was certainly of great advantage to the list. I think it was a great write-up of the advantages and disadvantages of each different option. I know my situation isn't the standard one...
Post by Stan Hoeppner
Post by Tom Johnson
I am quite familiar with NetApp, and EMC - I used to have a number of
Celera file servers back in my BigFish/FrontBridge days.
But now I'm in a situation where I have dozens of servers with spare
storage bays and unused CPU cycles sitting in data centers where the
power is already provisioned, and a DFS is what makes most sense for
me now.
Had I known these details above up front I wouldn't have responded. I
incorrectly assumed you were designing new infrastructure, wading into
new waters, because few are yet to deploy DFS for mailbox storage these
days.
I think it's great that you did respond, and thanks for doing so. I know that this is wading into new waters...I'm just hoping I'm not really the very first :)
Post by Stan Hoeppner
Post by Tom Johnson
So, I would like to ask once again- is anyone on the list using the
object storage plugin for dovecot at any reasonably large scale,
whether it's an in-house storage solution or S3?
I'm hoping, as I'd guess you are, that someone will respond who is
already doing this. If someone has it working well it offers others
more storage options, which is always a good thing. Whether it costs
more or less than the other solutions, it may still be a better option
for some folks either way.
Dovecot's commercial arm is certainly marketing the object storage. I'm just hoping someone is actually using it and can offer some guidance.

Tom
Stan Hoeppner
2014-01-28 07:53:44 UTC
Permalink
Post by Thomas Johnson
Hi Stan-
Post by Stan Hoeppner
Post by Tom Johnson
(Stan gives a great run-down on the economics of using a NetApp
or even homegrown NFS filer versus using an object storage
backend.)
Tom I'm sorry I wasted your time with my initial response.
No, you absolutely didn't waste my time, and it was certainly of
great advantage to the list. I think it was a great write-up of the
advantages and disadvantages of each different option. I know my
situation isn't the standard one...
Post by Stan Hoeppner
Post by Tom Johnson
I am quite familiar with NetApp, and EMC - I used to have a
number of Celera file servers back in my BigFish/FrontBridge
days.
But now I'm in a situation where I have dozens of servers with
spare storage bays and unused CPU cycles sitting in data centers
where the power is already provisioned, and a DFS is what makes
most sense for me now.
Had I known these details above up front I wouldn't have responded.
I incorrectly assumed you were designing new infrastructure, wading
into new waters, because few are yet to deploy DFS for mailbox
storage these days.
I think it's great that you did respond, and thanks for doing so. I
know that this is wading into new waters...I'm just hoping I'm not
really the very first :)
Post by Stan Hoeppner
Post by Tom Johnson
So, I would like to ask once again- is anyone on the list using
the object storage plugin for dovecot at any reasonably large
scale, whether it's an in-house storage solution or S3?
I'm hoping, as I'd guess you are, that someone will respond who is
already doing this. If someone has it working well it offers
others more storage options, which is always a good thing. Whether
it costs more or less than the other solutions, it may still be a
better option for some folks either way.
Dovecot's commercial arm is certainly marketing the object storage.
I'm just hoping someone is actually using it and can offer some
guidance.
Tom
This may be a dumb suggestion, and maybe you have already done so, but
since this is a commercial-only option, maybe you should contact Timo
directly and see if he can point you to other customers who have
deployed it.
--
Stan
Urban Loesch
2014-01-24 15:09:23 UTC
Permalink
Hi,
and some other Dovecot mailing list threads but I am not sure how many users such a setup will handle. I have a concern about the I/O performance of
NFS in the suggested architecture above. One possible option available to us is to split up the mailboxes over multiple clusters with subsets of
domains. Is there anyone out there currently running this many users on a Dovecot based mail cluster? Some suggestions or advice on the best way to
go would be greatly appreciated.
we're only running a setup with 35k users (2,000 IMAP and 300 POP3
simultaneous sessions). But we split all users and domains across 9
virtual containers. Until now all containers run on 1 bare metal
machine, because the server is fast enough and quite new.

In front of our backend servers we use two IMAP/POP3 proxies which get
their static routing information for IMAP/POP3/SMTP/LMTP from dedicated
MySQL databases (master-master mode; multiple slaves are also possible).
Same for the SMTP relay.

This setup allows us to scale out as wide as we need. In theory it's
possible to use a separate storage backend for each account, scaled out
on multiple servers. Connections between proxies and backends are made
over IPv6 on layer 2, with no routers in between. So we have no problems
with tight IPv4 space :-)

Some info on the storage backends:
- Mailbox format is mdbox with the zlib plugin. Each file has a max of 10MB.
- Dovecot's internal caches for authentication etc. do a good job. Without the caches the database becomes busy.
- Central administration functions are implemented in our internal admin frontend to, for example, clear caches, change account passwords, or get/change user quota.
- Mail indexes are stored on RAID 1 SLC SSDs (about 20GB now)
- Mail data is stored on RAID 10 SATA 7.2k rpm disks (10 disks)
- Incoming mail queue and OS for the containers on RAID 1 SAS disks (10k rpm)
- All backends are in HA with a passive machine and DRBD over 10GBit cross links

The IMAP/POP3/SMTP proxies run on 2 dedicated mid-range servers (HA):
- IMAP/POP3 proxies are clustered and load balanced with the iptables CLUSTERIP module (poor man's load balancer)
- Same on the SMTP relay server for outgoing email.
- MX servers for incoming mail are load balanced by DNS priority as usual.

Each setup has its advantages and disadvantages. For example, we have no
idea how we could use shared folders within one domain if the accounts
are spread out over multiple backends. But at the moment we don't need
that. For our needs this setup works very well.

Also thanks to Timo for his great work on dovecot.

Regards
Urban
Rick Romero
2014-01-24 15:15:59 UTC
Permalink
Post by Urban Loesch
Hi,
Post by Murray Trainer
and some other Dovecot mailing list threads but I am not sure how many
users such a setup will handle. I have a concern about the I/O
performance of
NFS in the suggested architecture above. One possible option available
to us is to split up the mailboxes over multiple clusters with subsets
of
domains. Is there anyone out there currently running this many users
on a Dovecot based mail cluster? Some suggestions or advice on the
best way to
go would be greatly appreciated.
we only have running a setup with 35k Users (2000 imap and 300 pop3 sessions simultaneous).
But we split all users and domains accross 9 virtual containers. Until
now all containers are running on 1 bare metal machine, because
the server is fast enough and quite new.
- all Backends are in HA with a passive machine and DRBD with 10GBIT Cross Links
How do you do backups?
Urban Loesch
2014-01-24 15:29:16 UTC
Permalink
Post by Rick Romero
Post by Urban Loesch
- all Backends are in HA with a passive machine and DRBD with 10GBIT Cross Links
How do you do backups?
The underlying storage is based on LVM, so we can take a daily snapshot
on the passive server, mount it read-only, and have no load impact on
the active machine during the backup window.

Mail data etc. is synced via rsync to a small storage system in a
separate datacenter over a dedicated 1Gbit dark fiber link. It works
very well for us and is within our budget.
Joseph Tam
2014-01-28 01:50:40 UTC
Permalink
Post by Sven Hartge
Interesting datapoint: NetApp Deduplication did only recover about 1% of
storage space with mdbox-based mail storage, while on an maildir-based
mail storage, the rate was about 15%. (This was tested with a copy of
real user data, so is accurate for my workload.)
Just a guess, but I expect the difference is because NetApp de-dupes by
checksumming blocks and marks whole blocks as duplicates if they have
the same checksum.

The message body has the same block offset in maildir (i.e. the start of
a message is at byte 0), whereas mdbox might align message body anywhere
in a block, so you might have 512 different block configurations for
the same message.

I don't know whether message alignment would be a worthwhile optimization
for mdbox.
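This effect can be illustrated with a toy sketch of block-level dedup. This is a generic model, not NetApp's actual implementation; the 4KiB block size and SHA-256 hashing are assumptions for illustration:

```python
import hashlib
import random

BLOCK = 4096  # assumed dedup block size; real appliances vary

def block_hashes(data):
    """Checksums of fixed-size blocks, the way a block-dedup engine sees a file."""
    return {hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)}

random.seed(0)  # deterministic stand-in for a message body
body = bytes(random.getrandbits(8) for _ in range(BLOCK * 4))

copy_a = body                          # maildir-like: body starts at byte 0
copy_b = b"X-Hdr: abc\r\n" * 8 + body  # mdbox-like: same body behind a 96-byte offset

print(len(block_hashes(copy_a) & block_hashes(body)))    # same alignment: all 4 blocks dedupe
print(len(block_hashes(copy_a) & block_hashes(copy_b)))  # shifted alignment: nothing dedupes
```

A 96-byte shift is enough to change every block's checksum, so the engine sees two unrelated files even though the payload is byte-identical.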

Joseph Tam <jtam.home at gmail.com>
Sven Hartge
2014-01-28 09:46:30 UTC
Permalink
Post by Joseph Tam
Post by Sven Hartge
Interesting datapoint: NetApp Deduplication did only recover about 1%
of storage space with mdbox-based mail storage, while on an
maildir-based mail storage, the rate was about 15%. (This was tested
with a copy of real user data, so is accurate for my workload.)
Just a guess, but I expect the difference is because NetApp de-dupes
by checksumming blocks and mark whole blocks as duplicates if they
have the same checksum.
The message body has the same block offset in maildir (i.e. the start
of a message is at byte 0), whereas mdbox might align message body
anywhere in a block, so you might have 512 different block
configurations for the same message.
True, the start of the message is always at byte 0, but because of
different header lengths per user for the same message (different mail
addresses with different lengths), the body will never start at the same
byte.

In the end, a slight compression (gzip 3) via Dovecot resulted in better
space savings than compression and deduplication via NetApp.
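As a rough sketch of that tradeoff (the sample text is invented, and real savings depend on your mail mix), a light and a heavy zlib level can be compared like this:

```python
import zlib

# Repetitive mail-like sample; headers and boilerplate compress very well.
sample = (b"Received: from mx.example.org by mail.example.org with LMTP\r\n"
          b"Subject: weekly report\r\n"
          b"The quarterly numbers are attached for your review.\r\n") * 200

light = zlib.compress(sample, 3)  # cheap CPU-wise, like a gzip level 3 setting
heavy = zlib.compress(sample, 9)  # best ratio, noticeably more CPU

print(len(sample), len(light), len(heavy))
```

On repetitive mail-like data the level 3 and level 9 outputs usually end up close in size, which is why a light level is often the sweet spot for primary mail storage.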

The most space can obviously be saved via SIS of attachments in Dovecot,
but to be frank, that feature scares me a bit.

Grüße,
Sven.
--
Sigmentation fault. Core dumped.
Joseph Tam
2014-01-29 02:05:50 UTC
Permalink
Post by Sven Hartge
Post by Joseph Tam
Just a guess, but I expect the difference is because NetApp de-dupes
by checksumming blocks and mark whole blocks as duplicates if they
have the same checksum.
True, the start of the message is always at byte 0, but because of
different header length per user for the same message (different mail
address with different lengths) the body will never start at the same
byte.
Oh yes, that's right. I confused maildir format with sendmail
queue files, which separate header and body. There is still some
similarity for mass mailouts to the same mail domain: they will
have almost identical headers, +/- message IDs and a few bytes
here and there, but as you say, SIS is the way to go for deduping
bulky message attachments.

Joseph Tam <jtam.home at gmail.com>
