ejabberd “cloud edition alpha”
Eric Cestari - cstar - May 14, 2009
Objectives
It’s ejabberd, the best XMPP server, with a set of custom modules that aim to make it stateless and very scalable on the storage backend.
All state data (including user accounts, roster information, persistent conference rooms, pubsub nodes and subscriptions) is stored in AWS web services, either S3 or SimpleDB.
This helps with scaling up and down, and keeps costs proportional to usage. AWS services scale out very widely, and massively parallel access is what they’re all about.
The default ejabberd configuration uses mnesia, but Process One recommends switching some services, like roster or auth, to ODBC when load increases.
But DBMSs have their own scaling problems, and a DBMS is yet another piece of software to administer.
CouchDB seems like loads of fun, and I’d like to put some effort into running ejabberd over it later on. Some work has started, but there is not much progress yet (and CouchDB is still software one needs to manage).
Current state
ejabberd_auth_sdb: stores users in SimpleDB. The version on GitHub stores the password encrypted but forces PLAIN authentication over XMPP, which means that TLS is required (really !). I have a version somewhere that exchanges hashes on the wire but stores the password in clear in SimpleDB. Your call.
mod_roster_sdb: roster information is stored in SimpleDB.
mod_pubsub: nodetree data is stored in S3 along with items. Subscriptions are stored in SimpleDB. I reimplemented nodetreedefault and nodedefault, which means that PEP works fine too.
mod_muc: uses modular_muc with the S3 storage for persisting rooms.
mod_last_sdb: stores last activity in SimpleDB.
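The trade-off in ejabberd_auth_sdb (protected password at rest, but PLAIN on the wire) can be sketched in a few lines. This is a minimal Python illustration, not the module's code. The post does not say which scheme the GitHub version uses, so the salted PBKDF2 hash, the `user_store` dict standing in for a SimpleDB domain, and the function names are all assumptions.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory stand-in for a SimpleDB domain: username -> attributes.
user_store = {}

ITERATIONS = 100_000  # assumed work factor, for illustration only


def register(username, password):
    """Store a salted hash of the password, never the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    user_store[username] = {"salt": salt.hex(), "hash": digest.hex()}


def check_plain(username, password):
    """Verify a password received in clear (SASL PLAIN, so TLS is a must)."""
    record = user_store.get(username)
    if record is None:
        return False
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison of the recomputed hash with the stored one.
    return hmac.compare_digest(digest.hex(), record["hash"])


register("alice", "s3cret")
print(check_plain("alice", "s3cret"))  # True
print(check_plain("alice", "wrong"))   # False
```

The server can only recompute the stored hash if it sees the actual password, which is why this variant needs PLAIN. The alternative version mentioned above flips the trade-off: hashes travel on the wire, but the store holds the clear password.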
Still lacking :
Listed by module name, with where I think each one should store its data:
mod_shared_roster: in SimpleDB
mod_vcard: VCards in S3, index in SimpleDB
mod_private: S3
mod_privacy: S3
mod_muc_log: S3 (with a specific setting for direct serving, maybe)
These modules are the only ones that have state which should be persisted on disk. Mnesia is of course still used for routing and configuration, but that’s transient data.
Transactions and latency
We lose transactions by switching away from mnesia or ODBC. That may or may not be a problem. I think it won’t be, but I don’t have data to prove it one way or the other.
Latency also grows, but erlsdb and erls3, the libraries on which the modules are built, can interface with memcached (and are ketama-enabled) if you use merle. Additionally, using merle will keep usage costs down, since cached reads never reach AWS.
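To see why a cache in front of AWS helps with both latency and cost, here is a minimal read-through cache sketch in Python. It is not how erlsdb, erls3 or merle are implemented: `CountingBackend` is a hypothetical stand-in for a billed, slow store such as SimpleDB, and a plain dict plays the role of memcached.

```python
class CountingBackend:
    """Hypothetical stand-in for a billed, high-latency store (SimpleDB or S3)."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0  # every read here would be a paid, slow request

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class ReadThroughCache:
    """A dict plays the role of memcached in front of the backend."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit: no backend request
        value = self.backend.get(key)   # cache miss: one backend request
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.backend.put(key, value)    # write to the durable store first
        self.cache[key] = value         # then refresh the cached copy


backend = CountingBackend({"roster:alice": ["bob", "carol"]})
store = ReadThroughCache(backend)
for _ in range(5):
    store.get("roster:alice")
print(backend.reads)  # 1
```

Five roster lookups cost a single backend request. With several memcached servers, ketama-style consistent hashing decides which server holds a given key, which this single-dict sketch leaves out.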
ejabberd’s mod_pubsub underwent several optimizations recently, and that improved the performance of the non-memcached AWS mod_pubsub. The initial code had a latency of around 10 seconds between publishing and receiving the event. Since last week’s improvements, performance is much better.
Down the road
I’d like to see an EC2 AMI based on this code: just pass the domain name or the ejabberd.cfg file to ec2-start-instance and boom ! You have an ejabberd server up and running.
Want more horse power ? Start another one on the same domain in the same EC2 security group, the ejabberd nodes autodiscover each other and you’ve got a cluster. ec2nodefinder is designed for this use.
Combined with the very neat upcoming load-balancing and autoscaling services from Amazon Web Services, there’s a great opportunity for deploying big and cheap!
Alternatives to AWS load balancing would be pen, or a “native” XMPP solution.
A few things would need to be implemented for this to work well, like XMPP fast reconnect via resumption and/or C2S/S2S process migration between servers, because scaling down is as important as scaling up in the cloud.
If you want to participate, you’d be very welcome. Porting the modules I did not write, or testing and sending feedback would be … lovely.
And of course if Process One wants to integrate this code in one way or another, that would also be lovely !
Get it
Get it, clone it, fork it ! There’s a bit of documentation on the README page.
[edited : added links to XEP-0198 and rfc3920bis-08, thanks to Zsombor Szabó for pointing me to them]
Comments
Why have you put whole ejabberd source to the repository? You could just put your modules to avoid constant merging from upstream.
Posted by Anton on 15 May 2009 at 12:15
I find it simpler to do the merge within git with all the ejabberd source, as it’s one command :
github pull bjc master
and another simple one to push over to my github page :
git push origin master
Can’t get any simpler. Of course conflicts may arise, but whatever the technique I use, I’d have to fix them.
Posted by cstar on 16 May 2009 at 12:47
There’s a detailed follow-up to my previous comment here :
http://www.cestari.info/2009/5/16/why-fork-the-whole-ejabberd-tree