ICU collation in Erlang
Damien Katz - September 06, 2009
Right now we have a big performance problem in CouchDB view indexing when Erlang calls the ICU collation routines. The problem is that the facilities in Erlang to make C callouts are all dog slow, and collation of strings is something that happens a lot. So right now we have a big CPU bottleneck from collation in the indexing code, and it’s mostly overhead just marshaling the Erlang data to a C “port”.
To optimize, I had the idea that we could do just the basic ASCII string collation in Erlang, and when we hit non-ASCII we fail over to the ICU callouts. That makes collation faster in the common case, but it stays slow for anyone whose strings are not plain ASCII, which is to say anyone not American.
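A minimal sketch of that fast path, written in Python for brevity and not taken from CouchDB. It assumes the fast path must return exactly what ICU would, so it only handles strings made of digits and lowercase ASCII letters, where plain byte order is assumed to agree with ICU's default collation. Full ASCII would not work with a byte comparison, because ICU sorts "a" before "A" before "b". The name `slow_collate` is a hypothetical stand-in for the ICU callout.

```python
from typing import Callable

# Assumption: for strings built only from these characters, code point order
# matches ICU's default (root) collation: digits sort before letters, and
# there are no case or accent differences to break ties.
SAFE_CHARS = frozenset("0123456789abcdefghijklmnopqrstuvwxyz")


def is_fast_path(s: str) -> bool:
    """True if every character is in the restricted ASCII subset."""
    return all(ch in SAFE_CHARS for ch in s)


def collate(a: str, b: str, slow_collate: Callable[[str, str], int]) -> int:
    """Return -1, 0 or 1, like a C-style comparison.

    slow_collate is a hypothetical stand-in for the expensive ICU callout;
    it is only invoked when either string leaves the safe subset.
    """
    if is_fast_path(a) and is_fast_path(b):
        return (a > b) - (a < b)
    return slow_collate(a, b)
```

Anything outside the safe subset, including uppercase ASCII, still pays the callout cost, so the gain depends on how much of the indexed data stays inside it.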
That got me thinking: how hard would it be to implement all of ICU collation in Erlang? It’s my understanding that the ICU code is generated from parsable data, and that the source for the C and Java versions is generated from that. How hard would it be to generate the Erlang code to do that, and would it be efficient? And what about case and accent insensitive sorting, something we don’t have now but probably will in the future? Any ICU experts out there have ideas?
Right now, the only thing we use ICU for is collation, so that makes the problem easier.
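To make the code generation idea concrete, here is a rough sketch of a generator that turns a weight table into Erlang function clauses. It assumes input lines shaped like the Unicode Collation Algorithm's allkeys.txt table (code point, semicolon, one bracketed collation element). Whether ICU's own build works from this exact file is not confirmed here. The function name `weights` and the sample values are made up for illustration, and entries with several code points or several collation elements are simply skipped.

```python
import re

# Matches single-code-point, single-element lines shaped like:
#   0061 ; [.0010.0020.0002] # LATIN SMALL LETTER A
LINE = re.compile(
    r"^([0-9A-F]{4,6})\s*;\s*"
    r"\[[.*]([0-9A-F]{4})\.([0-9A-F]{4})\.([0-9A-F]{4})\]\s*(?:#.*)?$"
)

# Illustrative weights only, not real collation data.
SAMPLE = """\
# code point ; [primary.secondary.tertiary]
0061 ; [.0010.0020.0002] # LATIN SMALL LETTER A
0041 ; [.0010.0020.0008] # LATIN CAPITAL LETTER A
0062 ; [.0011.0020.0002] # LATIN SMALL LETTER B
"""


def erlang_clauses(table: str, fun_name: str = "weights") -> str:
    """Emit Erlang source: one clause per code point, plus a catch-all."""
    clauses = []
    for line in table.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            continue  # comments, blanks, and entries this sketch ignores
        cp, primary, secondary, tertiary = (int(g, 16) for g in m.groups())
        clauses.append(
            f"{fun_name}(16#{cp:04X}) -> "
            f"{{16#{primary:04X}, 16#{secondary:04X}, 16#{tertiary:04X}}}"
        )
    clauses.append(f"{fun_name}(_) -> undefined")
    return ";\n".join(clauses) + ".\n"


if __name__ == "__main__":
    print(erlang_clauses(SAMPLE), end="")
```

In this made-up table, "a" and "A" share primary and secondary weights and differ only in the tertiary one, so a comparison that ignores the tertiary level would treat them as equal. That is roughly how case insensitive sorting could fall out of generated tables like these.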