processi

about processes and engines

Archive for the ‘tokyocabinet’ Category

retiring rufus-tokyo

As you probably know, rufus-tokyo is a Ruby FFI wrapper for Tokyo Cabinet|Tyrant, the fine pieces of software delivered by Hirabayashi Mikio.

Rufus-tokyo is 12 or 13 months old, but it’s time to retire it (maintenance mode).

James Edward Gray II is building his Oklahoma Mixer which will, hopefully, completely overlap rufus-tokyo and simply be better, very soon. Once the mixer covers Ruby 1.9.x, JRuby and Tokyo Tyrant (see todo list), rufus-tokyo will be irrelevant.

This is great for me (and for you), we will get a better Ruby library for TC/TT. Please note that for Tokyo Tyrant we already have a very fast option with Flinn Mueller’s ruby-tokyotyrant, which will always be faster than a FFI Tokyo Tyrant library.

James is exposing his motivations for Oklahoma Mixer on its front page. I have a few clarifications to make about them and rufus-tokyo. I use this blog to make these clarifications more accessible.

Why not just use rufus-tokyo?

There is already a Ruby FFI interface to Tokyo Cabinet and more called rufus-tokyo. I am a huge fan of rufus-tokyo and have contributed to that project. I have learned a ton from working with rufus-tokyo and that code was my inspiration for this library.

That said, I did want to change some things that would require big changes in rufus-tokyo. Here are the things I plan to do differently with Oklahoma Mixer:

* Tokyo Cabinet’s B+Tree Database has some neat features, like custom ordering functions, that are hard to expose through rufus-tokyo due to the way it is designed. I would have had to rewrite pretty much that entire section of the library anyway to add these features.
* There are some places where rufus-tokyo uses more memory than it absolutely must or slows itself down a bit with extra iterations of the data. Again, this is a result of how it is designed. It allows more code reuse at the cost of some efficiency. I wanted to improve performance in those areas.
* I’m working on some higher level abstractions for Tokyo Cabinet that I eventually plan to include in this library. These extra tools are the reason I needed to make these changes and additions.
* Finally, rufus-tokyo paved the way to a Ruby-like interface for Tokyo Cabinet. Previous choices were scary in comparison. I wanted to push that movement even further though and get an even more Hash- and File-like interface, where possible.

It’s important to note though that rufus-tokyo does do some things better and it always will. Its advantages are:

* It offers a nice Ruby interface over a raw C extension which is faster than using FFI. I have no intention of duplicating that interface, so rufus-tokyo will remain the right choice for raw speed when using MRI.
* For now, rufus-tokyo is more full featured. It provides interfaces for Tokyo Tyrant and Tokyo Dystopia. I would like to add these eventually, but I’m starting with just Tokyo Cabinet.
* It offers a pure Ruby interface for communicating with Tokyo Tyrant without needing the C libraries installed. I won’t be copying that either, so it will remain the right choice for a no dependency install.
* For now, it’s more mature. A lot of developers have used it and contributed to it. It’s probably the safer library to trust for production applications at this time.

The “raw C extension” mentioned is provided by Hirabayashi-san. Rufus-tokyo only provides a (mostly) unified interface to it (same interface for FFI and the C exts).

The “pure ruby interface for communicating with Tokyo Tyrant” again is provided by Hirabayashi-san.

That doesn’t leave me much merit. All this wrapping effort was code-named Rufus-Edo and you can read more about the motivations and mechanisms in ‘rufus-tokyo 0.1.9 (Rufus::Edo)’. You will also find more details in my Edo Cabinet presentation.

So I’d advise TC/TT Rubyists to support the effort of James and Flinn. For those of you who want to participate in the larger Tokyo Cabinet|Tyrant community, Flinn has set up the Tokyo Cabinet Users mailing list, where lots of people help and share.

Rufus-tokyo is now in maintenance mode. I will still help people with the occasional bug or mem leak. The mailing list is still open (it’s the one for my rufus libraries) and if you know how to write an issue report, you will always find help (random tweets do not count).

Thanks for the fun on the way.

 

Written by John Mettraux

February 26, 2010 at 12:25 am

rufus-tokyo 1.0.4

東京 from its bay

This release contains a fix for a memory leak cornered by Jeremy Hinegardner and James Edward Gray II.

The gem is available via gemcutter and the source is on github.

Many thanks to James and Jeremy and merry Christmas to you all !

 

Written by John Mettraux

December 25, 2009 at 12:40 am

rufus-tokyo 1.0.1

yokohama cabinetrufus-tokyo is a ruby-ffi based library for accessing Tokyo Cabinet and Tokyo Tyrant databases. It also feature a Rufus::Edo side where the native ruby/c extensions provided by the TC/TT author are used (for speed) instead of ruby-ffi.

rufus-tokyo contains ffi bindings for Tokyo Dystopia as well, thanks to Jeremy Hinegardner.

This is the changelog for this 1.0.1 release :

== rufus-tokyo - 1.0.1    released 2009/09/18

- todo : add #putcat to Cabinet / Tyrant (Edo and Tokyo)
- todo : implemented search/union/intersection/difference for tables
- todo : added #putdup and #get4 to Cabinet / Tyrant (Edo and Tokyo)
- todo : better dylib 'detection' (Pietro Ferrari)
- todo : aliased lget to mget (thanks Runa)
- todo : proper Abort exception (Kenneth Kalmer)

 

putcat is used to append data to already stored value. getdup / get4 is only relevant with b+ trees structures. They allow for multiple values stored under one key. This get4/getdup returns all the values stored under one key.

Since Tokyo Cabinet 1.4.29 (Tyrant 1.1.30), cabinet tables and tyrant tables have this new metasearch feature. This is what is meant by “search/union/intersection/difference” in the changelog.

Hirabayashi-san is providing us with a way to combine queries. From rufus-tokyo, it looks like :

require 'rubygems'
require 'rufus/tokyo' # sudo gem install rufus-tokyo

t = Rufus::Tokyo::Table.new('table.tct')

t['pk0'] = { 'name' => 'alfred', 'age' => '22', 'hobby' => 'fishing' }
t['pk1'] = { 'name' => 'bob', 'age' => '18', 'hobby' => 'hunting' }
t['pk2'] = { 'name' => 'charly', 'age' => '45', 'hobby' => 'fishing' }
t['pk3'] = { 'name' => 'doug', 'age' => '77', 'hobby' => 'fencing' }
t['pk4'] = { 'name' => 'ephrem', 'age' => '32', 'hobby' => 'swimming' }

rs = t.union(
  t.prepare_query { |q| q.add 'hobby', :equals, 'fishing' },
  t.prepare_query { |q| q.add 'age', :numgt, '20' }
)

rs.each { |r| p r }
  # ==>
  # ["pk2", {"name"=>"charly", "hobby"=>"fishing", "age"=>"45"}]
  # ["pk3", {"name"=>"doug", "hobby"=>"fencing", "age"=>"77"}]
  # ["pk4", {"name"=>"ephrem", "hobby"=>"swimming", "age"=>"32"}]
  # ["pk0", {"name"=>"alfred", "hobby"=>"fishing", "age"=>"22"}]

t.close

This example returns the persons in the table who do fishing for a hobby OR whose age is greater than 20. The other methods understood by the table are intersection and difference.

For more information, see the rufus-tokyo source on github and the rufus-tokyo rdoc.

Many thanks to all the persons who contributed to rufus-tokyo.

 

On the general Tokyo Cabinet / Tyrant front, Flinn Mueller, the author of ruby-tokyotyrant started a couple months ago a mailing list about Tokyo Cabinet / Tyrant. Lots of knowledge about the Tokyo products is exchanged there.

There is also the release of tyrantmanager by Jeremy Hinegardner (the author of the Dystopia bindings in rufus-tokyo). It’s a neat command line tool for managing Tokyo Tyrant instances, individually or in batch.

 

Written by John Mettraux

September 18, 2009 at 1:47 am

edo cabinet – rubykaigi2009

Here are my slides for the “edo cabinet” talk at the RubyKaigi2009.

I presented about Ruby-FFI, Tokyo Cabinet|Tyrant and then rufus-tokyo and Rufus::Edo.

These are just slides, here is the english transcript of the actual talk.

 

Many thanks to the RubyKaigi team for organizing this great event ! Special kudos to Leonard Chin for the on-the-fly translations, throughout the 3 days.

You’ll find videos of most of the talks on ustream.tv (thanks to the KaigiFreaks) and most of the slides are available from Slideshare.

 

Written by John Mettraux

July 28, 2009 at 11:57 pm

rufus-tokyo 1.0.0

climaxJust released rufus-tokyo 1.0.0

This is mostly a “cleanup” release (spec reorg, to_s enforcement on key and values, and dropped backward compatibility with older TC/TT releases). Hence the 1.0.0.

There are new features though :

* Matthew King helped me add process method for table queries.

* Jeremy Hinegardner did an initial contribution on the front FFI / Tokyo Dystopia.

* Adam Keys contributed the ext method for the table structures (sorry, I had forgotten).

* full text search operators for table queries (:ftsphrase, :ftsec, :ftsor: ftsand) and the accompanying :token, :qgram and :opt inverted index types. I haven’t played too much with them. Feedback is welcome.

For those of you interested in extending Tokyo Tyrant via Lua functions, I warmly recommend Ilya Grigork’s post about it.

Many thanks to all the people who helped.

There will be a next release soon, as I’m busy trying to catch up with Hirabayashi-san.

rufus-tokyo : http://github.com/jmettraux/rufus-tokyo/

(picture graciously from @michelledasilva)

 

Written by John Mettraux

July 23, 2009 at 1:24 am

rufus-tokyo 0.1.13

shinjukuWith the initial releases of rufus-tokyo, I happily cut corners and went with C strings (ending with NUL).

This isn’t optimal, often you need to store binary data as the value, the resulting ‘string’ contains NUL characters and values get truncated at restitution. I was working around that with Base64 encoding. Fine, the perf cost isn’t too heavy. But then, binary data was working fine on the Rufus::Edo side (wrapping the original tokyo{cabinet|tyrant}-ruby), why should Rufus::Tokyo be an exception ?

rufus-tokyo 0.1.13 which just got released fixes that issue.

Thanks to Ben Mabey for pointing out the issue.

Kamal Fariz Mahyuddin was kind enough to fork rufus-tokyo and add the #putkeep(k, v) method which only put if there was no previous value associated with the key. The collaboration with Kamal also prompted me to provide #add_int and #add_double for int/double counters in cabinet/tyrant.

http://github.com/jmettraux/rufus-tokyo

 

Written by John Mettraux

June 2, 2009 at 12:47 am

rufus-tokyo 0.1.12, ext(lua)

tokyo-luaJust released version 0.1.12 of rufus-tokyo, a Ruby library for accessing Tokyo Cabinet and Tokyo Tyrant (via FFI or via the Ruby classes provided with the Tokyo products).

Tokyo Tyrant, once successfully compiled with –enable-lua, is open to lots of interesting usages.

In his last post on the Mixi dev blog, the author of Tokyo Cabinet / Tyrant is exposing his ideas about an application of map/reduce leveraging Tokyo Tyrant (post in Japanese, machine translation here). This demonstration relies on the embedded Lua runtime for the mapper/reducer implementation.

For this post, I’ll just show the vanilla “incr” function found in the Tokyo Tyrant documentation :

--
-- incr.lua
--
function incr (key, i)
  i = tonumber(i)
  if not i then
    return nil
  end
  local old = tonumber(_get(key))
  if old then
    i = old + i
  end
  if not _put(key, i) then
    return nil
  end
  return i
end

This Lua function increments the current value bound for a key by a given amount.

You start your Lua-enabled Tokyo Tyrant with something like :

  ttserver -ext incr.lua test.tch
 

and then hit it with code like :

# test.rb

require 'rubygems'
require 'rufus/tokyo/tyrant' # sudo gem install rufus-tokyo

t = Rufus::Tokyo::Tyrant.new('127.0.0.1', 1978)

5.times do
  puts t.ext(:incr, 'my-counter', 2).to_i
end

t.close

The “ext” method is available with the latest release of rufus-tokyo, for the Rufus::Tokyo::Tyrant and Rufus::Edo::NetTyrant classes.

I updated my documentation about Tokyo Cabinet / Tyrant install with some –enable-lua information.

This release of rufus-tokyo also adds transaction support for the Rufus::Tokyo::Cabinet class (requires Tokyo Cabinet 1.4.13).

 

Written by John Mettraux

April 7, 2009 at 5:32 am

rufus-tokyo 0.1.10, limit(count, offset)

musicJust released rufus-tokyo 0.1.10. It supports the new setlimit method in Tokyo Cabinet 1.4.10 tables and Tokyo Tyrant 1.1.17 tables.

(I put up a piece of documentation on how to install TC and TT at http://openwferu.rubyforge.org/tokyo.html)

What’s the deal with this setlimit method ? Well, it’s an improvement over setmax which only accepted a count parameter.

require 'rubygems'
require 'rufus/tokyo'

t = Rufus::Tokyo::Table.new('table.tct')

t['pk0'] = { 'name' => 'alfred', 'age' => '22' }
t['pk1'] = { 'name' => 'bob', 'age' => '18' }
t['pk2'] = { 'name' => 'charly', 'age' => '45' }
t['pk3'] = { 'name' => 'doug', 'age' => '77' }
t['pk4'] = { 'name' => 'ephrem', 'age' => '32' }

p t.query { |q|
  q.order_by 'age'
  q.limit(2, 3) # 2 records max, skip 3 records
}
  # => [ {"name"=>"ephrem", :pk=>"pk4", "age"=>"32"},
  #      {"name"=>"charly", :pk=>"pk2", "age"=>"45"} ]

t.close

(Oops, I noticed the max/count parameter is not always respected, submitted a bug report… And not sure if this offset/skip is ‘pagination’ out of the box, it seems more like a ‘skip’, literally)

This release of rufus-tokyo supports previous versions of Tokyo Cabinet and Tokyo Tyrant as well.

sudo gem install rufus-tokyo or http://github.com/jmettraux/rufus-tokyo/

Many thanks to Benjamin Yu for pointing out the issue.

 

Written by John Mettraux

March 19, 2009 at 5:50 am

rufus-tokyo 0.1.9 (Rufus::Edo)

picture-11I have just released rufus-tokyo 0.1.9 (sudo gem install rufus-tokyo)

Rufus-tokyo provides rubyist friendly classes for accessing Tokyo Cabinet and Tokyo Tyrant structures. It does so by binding Tokyo C libraries via FFI and …

Hirabayashi-san, the author of Tokyo Cabinet and Tyrant provides a ‘native’ Ruby library for Tokyo Cabinet and a pure Ruby library for Tokyo Tyrant. My initial motivation for writing rufus-tokyo was that those classes are very C oriented and I wanted to interact in an elegant way with the Tokyo ‘products’.

Another issue was that the author was not providing his libraries via gem install. But this has changed, Dane Jensen is now mirroring the libraries (cabinet, tyrant) at github and releasing them as gems (sudo gem install careo-tokyocabinet careo-tokyotyrant –source http://gems.github.com)

As Ilya Grigorik pointed out FFI is slower than extconf.rb classical Ruby to C bindings. I wanted to benefit from the speed of the ‘native’ bindings.

I have added to rufus-tokyo a Rufus::Edo namespace with 2 classes (cabinet, table) wrapping the ‘native’ ruby gem and 2 classes wrapping the ‘pure’ Ruby classes (cabinet, table) for Tyrant. The ‘pure’ ruby classes for Tyrant, despite being a bit slow are interesting because they don’t require the installation of the Cabinet + Tyrant libraries.

There is a bit of documentation at :

http://github.com/jmettraux/rufus-tokyo/tree/master/lib/rufus/edo

The Rufus::Tokyo classes and the Rufus::Edo classes have the same methods, making it easy to plug one for the other. For instance, in my workflow engine (ruote), I’ve made sure the native bindings are used if present. At instantiation/connection time, the right class (native vs FFI, Edo vs Tokyo) is chosen, the rest of the code is the same.

Here is a tiny decision table for choosing which ruby lib to use :

decision table

If only need a btree or a hash (no tables) served by a Tyrant, a native libmemcached wrapper might be a good option if speed is a must (yes, Tokyo Tyrant speaks ‘memcached’).

I have some very naive benchmarks at http://gist.github.com/71285 (Tokyo Cabinet 1.4.8 and Tyrant 1.1.16). Unfortunately the memcached part is not done with a native libmemcached wrapper.

rufus-tokyo : http://github.com/jmettraux/rufus-tokyo/

Written by John Mettraux

February 27, 2009 at 6:38 am

rufus-tokyo 0.1.5, hail to the Tyrant

golfIf you go to the the Tokyo Cabinet page, you’ll sometimes be welcome with a banner saying “Nagoya Cabinet” or “Shinjuku Cupboard”, no worries, you’re at the right place.

The two main projects of Mikio Hirabayashi are Tokyo Cabinet (local hash / table) and Tokyo Tyrant (remote Tokyo Cabinet).

After supporting Tokyo Cabinet in rufus-tokyo 0.1.4, Justin and I included support for Tokyo Tyrant in rufus-tokyo 0.1.5.

Getting Tokyo Cabinet and Tyrant is relatively easy, it boils down to getting the source from sf.net and then unpacking them and running “./configure && make && sudo make” for both of them (a more detailed description of the compile/install steps).

#
# start a tyrant server from the command line :
#
#   ttserver -port 45000 data.tch
#

require 'rubygems'
require 'rufus/tokyo' # sudo gem install 'rufus-tokyo'

t = Rufus::Tokyo::Tyrant.new('localhost', 45000)

t['shinjuku'] = 'twin towers'
t['ikebukuro'] = 'pond bag'

p t['ikebukuro'] # => 'pond bag'

t.close

Looks great, but since Tokyo Tyrant speaks kling… memcached, one is probably better served (in terms of performance) by a classic ruby memcached client library.

Still there is one area where rufus-tokyo + Tokyo Tyrant could make sense :

#
# on the command line, launch a tyrant server with a stable structure :
#
#   ttserver -port 45001 data.tct
#
# (note the .tct suffix, it indicates to the Tyrant it has to create
# a table structure)
#

require 'rubygems'
require 'rufus/tokyo/tyrant' # sudo gem install 'rufus-tokyo'

t = Rufus::Tokyo::TyrantTable.new('localhost', 45001)

t['pk0'] = { 'name' => 'jim', 'age' => '25', 'lang' => 'ja,en' }
t['pk1'] = { 'name' => 'jeff', 'age' => '32', 'lang' => 'en,es' }
t['pk2'] = { 'name' => 'jack', 'age' => '44', 'lang' => 'en' }
t['pk3'] = { 'name' => 'jake', 'age' => '45', 'lang' => 'en,li' }

# ...

p t.keys
  # => [ 'pk0', 'pk1', 'pk2', 'pk3' ]

p t.query { |q|
  q.add 'lang', :includes, 'en'
  q.limit 2
}
  # =>
  # [{"name"=>"jim", :pk=>"pk0", "lang"=>"ja,en", "age"=>"25"},
  #  {"name"=>"jeff", :pk=>"pk1", "lang"=>"en,es", "age"=>"32"}]

# ...

t.close

There is one potential and interesting development for rufus-tokyo : Yehuda Katz integrated it into Moneta, his “unified interface to key/value stores”. Looking forward to see how it develops.

Many thanks to Justin, Yehuda and Zev for their work/support/help.

http://github.com/jmettraux/rufus-tokyo

Tested with Ruby 1.8.6 and Ruby 1.9.1p0 (There is a small issue with JRuby 1.1.6, I hope to solve it very soon and release a 0.1.6 for it).
tyrant

Written by John Mettraux

February 13, 2009 at 3:08 am