Part 1: Introducing riak_core
Hypothetical Labs - kevin - July 30, 2010
What is riak_core?
riak_core is a single OTP application which provides all the services necessary to write a modern, well-behaved distributed application. riak_core began as part of Riak. Since the code was generally useful in building all kinds of distributed applications, we decided to refactor and separate the core bits into their own codebase to make it easier to use.
Distributed systems are complex, and some of that complexity shows in the number of features available in riak_core. Rather than dive deeply into code, I’m going to separate the features into broad categories and give an overview of each.
Note: If you’re the impatient type and want to skip ahead and start reading code, you can check out the source to riak_core via hg or git.
Node Liveness & Membership
riak_core_node_watcher is the process responsible for tracking the status of nodes within a riak_core cluster. It uses net_kernel to efficiently monitor many nodes. riak_core_node_watcher can also take a node out of the cluster programmatically. This is useful in situations where a brief node outage is necessary but you don’t want to stop the server software completely.
riak_core_node_watcher also provides an API for advertising and locating services around the cluster. This is useful in clusters where some nodes provide a specialized service, such as a CUDA compute node, which is used by other nodes in the cluster.
riak_core_node_watch_events cooperates with riak_core_node_watcher to generate events based on node activity, e.g. a node joining or leaving the cluster. Interested parties can register callback functions which will be called as events occur.
Partitioning & Distributing Work
riak_core uses a master/worker configuration on each node to manage the execution of work units. Consistent hashing is used to determine which target node(s) should receive a request, and the master process on each node farms out the request to the actual workers. riak_core calls worker processes vnodes. The coordinating process is the vnode_master.
The partitioning and distribution logic inside riak_core also handles hinted handoff when required. Hinted handoff occurs as a result of a node failure or outage. To ensure availability, most clustered systems will use operational nodes in place of down nodes. When the down node comes back, the cluster needs to migrate the data from its temporary home on the substitute nodes to the data’s permanent home on the restored node. This process is called hinted handoff and is managed by components inside riak_core. riak_core also handles migrating partitions to new nodes when they join the cluster, so that work continues to be evenly partitioned across all cluster members.
riak_core_vnode_master starts all the worker vnodes on a given node and routes requests to the vnodes as the cluster runs.
riak_core_vnode is an OTP behavior wrapping all the boilerplate logic required to implement a vnode. Application-specific vnodes need to implement a handful of callback functions in order to participate in handoff sessions and receive work units from the master.
Cluster State
A riak_core cluster stores global state in a ring structure. The state information is transferred between nodes in the cluster in a controlled manner to keep all cluster members in sync. This process is referred to as “gossiping”.
riak_core_ring is the module used to create and manipulate the ring state data shared by all nodes in the cluster. Ring state data includes items like partition ownership and cluster-specific ring metadata. Riak KV stores bucket metadata in the ring metadata, for example.
riak_core_ring_manager manages the cluster ring for a node. It is the main entry point for application code accessing the ring, via riak_core_ring_manager:get_my_ring/0, and also keeps a persistent snapshot of the ring in sync with the current ring state.
riak_core_gossip manages the ring gossip process and ensures the ring is generally consistent across the cluster.
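To make the idea of partition ownership a little more concrete, here is a minimal sketch of a ring as a plain list of {PartitionIndex, Node} pairs, with a consistent-hashing lookup on top. This is not the riak_core_ring API: the module name ring_sketch and the functions new/2, owner/2 and transfer/3 are hypothetical names made up for illustration, and the choices of a 160-bit SHA-1 hash space, round-robin ownership, and "a key belongs to the slice its hash falls in" are assumptions of this sketch that may differ from what riak_core actually does.

```erlang
-module(ring_sketch).
-export([new/2, owner/2, transfer/3]).

%% Size of the hash space (assumption of this sketch): SHA-1 digests
%% are 160 bits wide, so hashes fall in 0 .. 2^160 - 1.
-define(RING_TOP, (1 bsl 160)).

%% Split the hash space into NumPartitions equal slices and hand them
%% out to Nodes round-robin. Returns [{PartitionIndex, Node}] in
%% ascending index order.
new(NumPartitions, Nodes)
  when is_integer(NumPartitions), NumPartitions > 0, Nodes =/= [] ->
    Inc = ?RING_TOP div NumPartitions,
    NodeCount = length(Nodes),
    [{I * Inc, lists:nth((I rem NodeCount) + 1, Nodes)}
     || I <- lists:seq(0, NumPartitions - 1)].

%% Hash a binary key into the hash space and return the
%% {PartitionIndex, Node} pair for the slice the hash falls in.
owner(Key, Ring) when is_binary(Key), Ring =/= [] ->
    <<Hash:160/integer>> = crypto:hash(sha, Key),
    Count = length(Ring),
    Inc = ?RING_TOP div Count,
    %% Clamp so the rounding remainder at the top of the hash space
    %% still maps to the last slice.
    Slot = min(Hash div Inc, Count - 1),
    lists:nth(Slot + 1, Ring).

%% Reassign one partition to another node, e.g. after a node joins.
%% Only the ownership entry changes; keys keep hashing to the same
%% partition index.
transfer(Index, NewNode, Ring) ->
    lists:keyreplace(Index, 1, Ring, {Index, NewNode}).
```

From an Erlang shell, ring_sketch:new(8, [a, b, c]) builds an eight-partition ring, ring_sketch:owner(<<"some key">>, Ring) finds the owning partition and node, and ring_sketch:transfer/3 moves a single partition to a new owner. The point of the design is that a key's partition never changes; only the node that owns the partition does, which is what lets ownership be shipped around the cluster as shared state.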
What’s the plan?
Over the next several months I’m going to cover the process of building a real application in a series of posts to this blog, where each post covers some aspect of system building with riak_core. All of the source to the application will be published under the Apache 2 license and shared via a public repo on GitHub.
And what type of application will we build? Since the goal of this series is to illustrate how to build distributed systems using riak_core and also satisfy my own technical curiosity, I’ve decided to build a distributed graph database. A graph database should provide enough use cases to really exercise riak_core while at the same time not obscuring the core learning experience in tons of complexity.
Thanks to Sean Cribbs and Andy Gross for providing helpful review and feedback.