Getting started with distributed Erlang - Mnesia table relocation

21st Century Code Works - Best of Erlang - noreply@blogger.com (Benjamin Nortier) - April 10, 2008

Mnesia is a distributed database that ships as part of the Erlang/OTP distribution. One of its features that I think is potentially powerful is transparent table relocation across machines. With Mnesia, you can replicate tables to any nodes you wish in your network, and Mnesia takes care of all the back-end bits for you. By “transparent”, I mean that you don’t need to do anything in your clients to make them “aware” of the new tables. Reads that were served from a table on one machine can then be served by replicas on multiple nodes (where the nodes reside on single or multiple machines).

I wanted to see how difficult it is to achieve this. For the setup, I installed two virtual Ubuntu 7.10 machines using VMware Player. You can get images for most Ubuntu distros at http://isv-image.ubuntu.com/vmware/. FYI, the username and password for these images is ubuntu:ubuntu. I named the two nodes

node1.21ccw.blogspot.com and
node2.21ccw.blogspot.com

You’ll need to edit the network configurations with the IP addresses if you want to reproduce this experiment. If you need some help, post a question as a comment :)

I now had two machines that could ping each other using the full names, and a warm and fuzzy feeling inside.


The next step was to start up an Erlang node on each machine. There’s a catch here though. I ran into problems using erl -sname, probably because of the way I set up the hostnames of the machines. So, I had to specify the fully qualified names manually with erl -name:


ubuntu@node1:~/node1$ erl -name 'node1@node1.21ccw.blogspot.com'
Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [kernel-poll:false]

Eshell V5.5.5 (abort with ^G)
(node1@node1.21ccw.blogspot.com)1> nodes().
[]


ubuntu@node2:~/node2$ erl -name 'node2@node2.21ccw.blogspot.com'
Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [kernel-poll:false]

Eshell V5.5.5 (abort with ^G)


Notice the output of the nodes() command. It returns a list of the other Erlang nodes that this node is connected to. Initially there are none. To let a node know of another node, you can use net_adm:ping/1 to ping the other node. Both nodes will then be aware of each other:


(node1@node1.21ccw.blogspot.com)4> net_adm:ping('node2@node2.21ccw.blogspot.com').
pong

(node1@node1.21ccw.blogspot.com)5> nodes().
['node2@node2.21ccw.blogspot.com']


(node2@node2.21ccw.blogspot.com)1> nodes().
['node1@node1.21ccw.blogspot.com']
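
The ping step can also be wrapped in a small helper. This is only a sketch: the module name cluster_sketch and the function connect/1 are made up for illustration, while net_adm:ping/1 and nodes/0 are the same calls used in the shell sessions above.

```erlang
-module(cluster_sketch).
-export([connect/1]).

%% Hypothetical helper: try to connect this node to Node.
%% net_adm:ping/1 returns pong on success and pang on failure.
connect(Node) ->
    case net_adm:ping(Node) of
        pong -> {ok, nodes()};                    % list of nodes we are now connected to
        pang -> {error, {unreachable, Node}}
    end.
```

Calling cluster_sketch:connect('node2@node2.21ccw.blogspot.com') from node1 should behave like the manual ping followed by nodes().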


Cool. Now the nodes know of each other. To get Mnesia started, you have to create a schema on each node. A single call to mnesia:create_schema/1 with the full node list does this for all of them, as long as Mnesia is not yet running on any of them. A schema is located on the file system, in the same location where the actual disc copies of tables will reside. [node()|nodes()] creates a list of the current node and all the other connected nodes. ls() shows the directory that Mnesia has created for the database.


(node1@node1.21ccw.blogspot.com)5> mnesia:create_schema([node()|nodes()]).
ok

(node1@node1.21ccw.blogspot.com)6> ls().
Mnesia.node1@node1.21ccw.blogspot.com
ok



(node2@node2.21ccw.blogspot.com)2> ls().
Mnesia.node2@node2.21ccw.blogspot.com
ok


Now we have to start Mnesia on both nodes. Notice that mnesia:info() on node2 at this point shows both nodes as running database nodes.



(node1@node1.21ccw.blogspot.com)8> mnesia:start().
ok



(node2@node2.21ccw.blogspot.com)3> mnesia:start().
ok

(node2@node2.21ccw.blogspot.com)4> mnesia:info().
...
running db nodes = ['node1@node1.21ccw.blogspot.com','node2@node2.21ccw.blogspot.com']
...
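
The schema creation and start-up steps could be scripted along these lines. Treat it as a rough sketch rather than a recommended procedure: the module name mnesia_setup and the function init/1 are hypothetical, and it assumes the nodes in the list are already connected, have no existing schema, and are not yet running Mnesia.

```erlang
-module(mnesia_setup).
-export([init/1]).

%% Hypothetical helper: create a disc schema on all Nodes, then start
%% Mnesia on each of them.
init(Nodes) ->
    case mnesia:create_schema(Nodes) of
        ok ->
            %% rpc:multicall/4 runs mnesia:start() on every node and
            %% returns the per-node results plus any unreachable nodes.
            {Results, BadNodes} = rpc:multicall(Nodes, mnesia, start, []),
            {ok, Results, BadNodes};
        {error, Reason} ->
            {error, Reason}
    end.
```

From node1, mnesia_setup:init([node()|nodes()]) would cover the same ground as the manual create_schema and start calls.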

Next we’ll create an actual database table, and populate it with some data. We define a record using rd(), then create a table on node1, write a record to it and then read the record again. Because we pass {disc_copies, [node()]}, the table resides in RAM and also has a disc copy on node1 (without that option, the default is a RAM-only copy on the local node). The primary key of the table is the first field of the record, i.e. the name.


(node1@node1.21ccw.blogspot.com)9> rd(person, {name, email_address}).
person

(node1@node1.21ccw.blogspot.com)10> mnesia:create_table(person, [{attributes, record_info(fields, person)}, {disc_copies, [node()]}]).
{atomic,ok}

(node1@node1.21ccw.blogspot.com)11> mnesia:transaction(fun() -> mnesia:write(#person{name = "John", email_address = "john@21ccw.blogspot.com"}) end).
{atomic,ok}

(node1@node1.21ccw.blogspot.com)14> mnesia:transaction(fun() -> mnesia:read({person, "John"}) end).
{atomic,[#person{name = "John",email_address = "john@21ccw.blogspot.com"}]}

(node1@node1.21ccw.blogspot.com)15> mnesia:info().
...
[{'node1@node1.21ccw.blogspot.com',disc_copies}] = [person]
...
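
Outside the shell, rd() is not available, so the record has to be declared in a module. The following is a minimal sketch of the same create, write and read steps; the module name person_db and its function names are invented for this example, while the Mnesia calls are the ones used in the session above.

```erlang
-module(person_db).
-export([create_table/1, add/2, lookup/1]).

%% Same record as the rd(person, ...) shell definition.
-record(person, {name, email_address}).

%% Create the person table with a RAM + disc copy on each node in Nodes.
create_table(Nodes) ->
    mnesia:create_table(person,
                        [{attributes, record_info(fields, person)},
                         {disc_copies, Nodes}]).

%% Write one person record inside a transaction.
add(Name, Email) ->
    mnesia:transaction(
      fun() ->
              mnesia:write(#person{name = Name, email_address = Email})
      end).

%% Read by primary key (the name field) inside a transaction.
lookup(Name) ->
    mnesia:transaction(fun() -> mnesia:read({person, Name}) end).
```

With Mnesia running on a disc schema, person_db:create_table([node()]) followed by person_db:add("John", "john@21ccw.blogspot.com") and person_db:lookup("John") mirrors the shell session.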


What happens when we do the same read on node2? Remember that node2 has access to the person table only via the network, since the table resides in RAM and on disc on node1.

(node2@node2.21ccw.blogspot.com)5> mnesia:transaction(fun() -> mnesia:read({person, "John"}) end).
{atomic,[{person,"John","john@21ccw.blogspot.com"}]}

Nice. Mnesia has transparently read the record from a table that’s on another machine :) (The result prints as a plain tuple here because the person record was only defined with rd() in node1’s shell.)

Now we decide to copy the table to node2. This requires a single command. Mnesia copies the actual data to the other machine for you, and when you look at the file system on node2, there will now be a “person.DCD” file, which is the disc copy of the table.


(node1@node1.21ccw.blogspot.com)15> mnesia:add_table_copy(person, 'node2@node2.21ccw.blogspot.com', disc_copies).
{atomic,ok}



(node2@node2.21ccw.blogspot.com)9> ls("Mnesia.node2@node2.21ccw.blogspot.com").
DECISION_TAB.LOG LATEST.LOG person.DCD
schema.DAT
ok


At this point, when you do a query on the person table, the actual data can come from either node. I’m not sure how Mnesia decides which replica serves a read; that’s something to investigate further.
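
One place to start that investigation is mnesia:table_info/2, which can report where a node currently reads a table from. The expressions below are a sketch to try in the shell on either node with Mnesia running. My understanding is that a node prefers its own replica when it has one, but that is an assumption to verify, not something this experiment confirmed.

```erlang
%% Run in the Erlang shell on either node, with Mnesia started.
%% Returns the node this node currently reads the person table from.
mnesia:table_info(person, where_to_read).

%% Lists the nodes that hold a disc_copies replica of the table.
mnesia:table_info(person, disc_copies).
```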

Since the table is resident on both nodes, we can actually delete the copy on node1, and a read on node1 will now fetch the data over the network from node2:


(node1@node1.21ccw.blogspot.com)23> mnesia:del_table_copy(person, node()).
{atomic,ok}

(node1@node1.21ccw.blogspot.com)19> mnesia:info().
...
[{'node2@node2.21ccw.blogspot.com',disc_copies}] = [person]
...

(node1@node1.21ccw.blogspot.com)18> mnesia:transaction(fun() -> mnesia:read({person, "John"}) end).
{atomic,[#person{name = "John",email_address = "john@21ccw.blogspot.com"}]}
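
The copy-then-delete sequence can be combined into one function. This is a sketch under a few assumptions: the module name table_mover and the function relocate/4 are hypothetical, and the error handling is deliberately minimal (the old copy is only removed if the new one was added).

```erlang
-module(table_mover);
```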


Cool.

What I’ve shown is how to start up an Erlang/Mnesia node on each of two machines that are networked together, create tables on either node, and move the tables to other nodes by copying and then deleting them. Mnesia can configure tables to be RAM only, RAM and disc, or disc only, which gives you lots of power for optimisation. Couple this with the fact that you can change your configuration dynamically and you have a powerful, dynamically configurable distributed database!
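
As a last sketch, the storage type of an existing copy can be changed at runtime with mnesia:change_table_copy_type/3. The example assumes the person table now lives only on node2, as at the end of the walkthrough, and can be run from the shell on either node.

```erlang
%% Switch node2's copy of person from RAM + disc to RAM only.
%% Returns {atomic, ok} on success or {aborted, Reason} on failure.
mnesia:change_table_copy_type(person,
                              'node2@node2.21ccw.blogspot.com',
                              ram_copies).
```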


