processi

about processes and engines


workflow, bpm, selected resources

I have started to gather posts and blogs I think are worth a read in the workflow, BPM, Adaptive Case Management, rules, etc fields.

The list is at http://ruote.rubyforge.org/resources.html

I hope to list resources there that are sincere and passionate, and that challenge my point of view on workflow engines and enterprisey programming in general. I will try to avoid things that read like press releases, that include words like “leading” or “fortune 500”, or that are too closed, leaving no room for a slice of doubt.

 

Written by John Mettraux

March 31, 2010 at 4:35 am

Posted in acm, bpm, rules, ruote, workflow

ruote and decision tables

ruote 2.1.7 is out and it felt like it was time to update the CsvParticipant that could be found in ruote 0.9.

Ruote has been relying on rufus-decision for its decision table needs and ruote 0.9 was integrating a CsvParticipant directly. For ruote 2.1, the DecisionParticipant is left in the rufus-decision gem.

This isn’t the first time I’ve written about decision tables. But I noticed that I hadn’t really written anything about how to mix ruote processes and decision tables.

Let’s consider the decision table in the adjacent figure.

Given two input values, “type of visit” and “participating physician ?”, the level of “reimbursement” is decided.

The CSV version of this decision table is available online.

Clearly this decision has to take place after the customer has filled in his reimbursement claim and it has been reviewed by a clerk in the organization.

Up until now, applying the reimbursement rules was part of the tasks of the clerk (with probably some downstream checks), and our goal is to remove some of that burden.

Our process definition would look like :

Ruote.process_definition :name => 'reimbursement', :revision => '1.0' do
  cursor do
    customer :task => 'fill form'
    clerk :task => 'control form'
    rewind :if => '${f:missing_info}'
    determine_reimbursement_level
    finance :task => 'reimburse customer'
    customer :task => 'reimbursement notification'
  end
end

There are four participants involved in that process definition. Customer, Clerk and Finance are implemented with classical inbox/worklist participants : workitems for them get placed in their workqueue. (With ruote 2.1 I tend to use StorageParticipant for that, or I emit the workitem via a ruote-amqp participant towards the target worklist or task manager).

“rewind” isn’t a participant, it’s a ruote expression working with the cursor expression.

That leaves us with the participant “determine_reimbursement_level” to implement.

The straightforward Ruby implementation of the participant would leverage a BlockParticipant and look like :

engine.register_participant 'determine_reimbursement_level' do |workitem|
  workitem.fields['reimbursement'] =
    case workitem.fields['type of visit']
      when 'doctor office'
        workitem.fields['participating physician ?'] == 'yes' ? '90%' : '50%'
      when 'hospital visit'
        workitem.fields['participating physician ?'] == 'yes' ? '0%' : '80%'
      when 'lab visit'
        workitem.fields['participating physician ?'] == 'yes' ? '0%' : '70%'
      else
        '0%'
    end
end

Given the values in the fields ‘type of visit’ and ‘participating physician ?’ this participant sets the value of the field ‘reimbursement’.

All is well, but you may have noticed that the decision table is rather simplistic. And how does it cope with change ? What will it look like when the rules get hairy ? Will the Ruby developer have to intervene for each change ?

(it might not be a bad thing to rely on the Ruby developer, they tend to test their creations carefully)

Those rules are produced by ‘business users’ and one of their favourite tools is the spreadsheet. Rufus-decision can read CSV output of decision tables. The decision participant would look like :

require 'rufus/decision/participant'
  # gem install rufus-decision

engine.register_participant(
  'determine_reimbursement_level',
  Rufus::Decision::Participant,
  :table => %{
in:type of visit,in:participating physician ?,out:reimbursement
doctor office,yes,90%
doctor office,no,50%
hospital visit,yes,0%
hospital visit,no,80%
lab visit,yes,0%
lab visit,no,70%
  })

Well, OK, that’s nice, but… We still have to restart the application at each change to the decision table or at least re-register the participant with the updated table. Tedious.

If our business people publish the decision table as CSV somewhere over the intranet (yes, I know, it’s an old-fashioned word), the decision participant can be simplified to :

require 'rufus/decision/participant'
  # gem install rufus-decision

engine.register_participant(
  'determine_reimbursement_level',
  Rufus::Decision::Participant,
  :table => 'http://somewhere.example.com/decision_tables/reimbursement.csv')

Each time a decision will have to be taken, the table will be read. This lets business users modify the rules at will.

The whole example is packaged as reimbursement.rb. It runs the process from the command line and grabs the decision table online. It shouldn’t be too difficult to take inspiration from it and integrate decision tables in web applications (like ruote-kit or barley).

There are many open possibilities when combining a workflow engine and a decision mechanism. One immediate advantage is that they can evolve with different rhythms. They can also be used in isolation : the clerk in our process could talk the claim over the phone with the customer and run blank decisions to help the customer with incomplete forms.

Of course, you can use rufus-decision without ruote, it’s a plain ruby gem.
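The mechanics of a decision table are simple enough to sketch in plain Ruby. This is an illustration of the row-matching idea only, not the rufus-decision API (the real gem adds ranges, regexes, CSV parsing and more) :

```ruby
# A minimal decision table : the first row whose "in" values both match
# the input wins, and its "out" value is merged into the result.
TABLE = [
  # type of visit,   participating physician ?, reimbursement
  ['doctor office',  'yes', '90%'],
  ['doctor office',  'no',  '50%'],
  ['hospital visit', 'yes', '0%'],
  ['hospital visit', 'no',  '80%'],
  ['lab visit',      'yes', '0%'],
  ['lab visit',      'no',  '70%']
]

def decide(fields)
  row = TABLE.find do |visit, physician, _|
    visit == fields['type of visit'] &&
      physician == fields['participating physician ?']
  end
  fields.merge('reimbursement' => row ? row.last : '0%')
end

puts decide(
  'type of visit' => 'hospital visit',
  'participating physician ?' => 'no')['reimbursement']
    # prints "80%"
```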

 

http://github.com/jmettraux/rufus-decision – the source code of rufus-decision
http://github.com/jmettraux/ruote – the source code of ruote
http://ruote.rubyforge.org – ruote’s documentation
http://groups.google.com/group/openwferu-users – the mailing list for ruote and rufus-decision

 

Written by John Mettraux

February 17, 2010 at 4:32 am

Posted in bpm, bpms, decision, ruby, rufus, rules, ruote

it’s barley a task manager

In the comments of the previous post, Karl asked me about ruote : “how to resume execution of the workflow from a web application ?”.

Though Karl’s question was more about receivers, I took the time to explore a sample ruote application.

Barley is a [for fun] task manager. It takes its name from the “barely repeatable processes”. Of course, it can’t compare with the excellent Thingamy work processor.

Barley is a Sinatra web application wrapping a ruote workflow engine. Ruote can run many process instances issued from different process definitions, but barley’s processes are all instantiated from a single process definition.

The cursor expression does most of the work. It runs in sequence a ‘trace’ participant and a ‘next’ participant. Then if the field ‘next’ is set in the workitem (most likely it has been chosen by the ‘current’ participant), the cursor is rewound and the task ends up on the ‘next’ participant’s desk.

PDEF = Ruote.process_definition :name => 'barley' do
  cursor do
    trace
    participant '${f:next}'
    rewind :if => '${f:next}'
  end
end

If the ‘next’ field is empty, the cursor will be over, letting the process terminate.
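The routing can be simulated in plain Ruby (toy stand-ins, not barley’s actual code) : as long as a participant sets the ‘next’ field, the task lands on that desk ; an empty ‘next’ lets the run terminate.

```ruby
# A toy version of barley's cursor / rewind routing : each 'desk' is a
# participant that handles the workitem and may set 'next' to push the
# task to another desk ; when 'next' stays empty, the process is over.
def run(workitem, desks)
  trace = []
  loop do
    name = workitem['next']
    break if name.nil? || name.empty?
    workitem['next'] = nil            # consumed ; the desk may set it again
    trace << name
    desks[name].call(workitem)
  end
  trace
end

desks = {
  'alice' => lambda { |wi| wi['next'] = 'bob' },  # alice passes the task on
  'bob'   => lambda { |wi| }                      # bob keeps it : run ends
}

p run({ 'next' => 'alice' }, desks)  # prints ["alice", "bob"]
```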

Barley simply lets users push tasks around. There is no explicit “delegation”, “escalation” or whatever…

For the rest, look at barley itself, it’s all in one single file : a workflow engine, two participants and the webservice, with the haml template at the bottom.

Torsten has deployed a demo instance at http://barley-plain.torstenschoenebaum.de/work. (he also has a version hooked to twitter auth).

You can install it and run it locally with :

  curl http://github.com/jmettraux/ruote/raw/ruote2.1/examples/barley.rb -o barley.rb
  gem install sinatra haml json ruote
  ruby barley.rb

Then head to http://127.0.0.1:4567/

P.S.

to remove ruote and its dependencies from your gems :

  gem uninstall ruote rufus-json rufus-cloche rufus-dollar rufus-lru rufus-mnemo rufus-scheduler rufus-treechecker

 

Written by John Mettraux

January 29, 2010 at 8:22 am

Posted in ruby, ruote, workflow

ruote 2.1 released

Just released ruote 2.1.1. That requires a long, boring, explanatory blog post, especially since I was promising a ruote 2.0.0 release and now I’m jumping to 2.1.1.

My prose isn’t that great, the transitions from one paragraph to the next are often poor. Let’s try anyway.

Ruote is an open source ruby workflow engine. Workflow as in “business process”. Sorry, no photo workflow or beats per minute.

0.9, 2.0, 2.1

In May 2009, I had the feeling that ruote 0.9 was a bit swollen and very baroque in some aspects, some early design decisions had forced me to do some juggling and it felt painful. At that point, I was working with other languages than Ruby and I had written the beginning of a ruote clone in Lua.

Fresh ideas are tempting, after a while I got back to Ruby and started a ruote 2.0 rewrite.

I was fortunate enough to benefit from the feedback of a startup company that pushed ruote into terrain it was not meant to travel. Whereas I built ruote and the systems that came before it for small organizations, and trusted integrators to divide the work among multiple engines, these people started stuffing as much work as possible into one engine, with extra wide concurrent steps.

Those experiments and deployments brought back invaluable lessons and matured the 2.0 line very quickly, but, in October, I was seriously tempted to venture into a second rewrite.

Ruote 2.0 was a complete rewrite, ruote 2.1 is only a rewrite of the core. The test suite is mostly the same and working from the 2.0 test suite allowed me to build a 2.1 very quickly.

things learnt

So what triggered the 2.1 rewrite ?

At first, a feeling of unease with the way certain things were implemented in 2.0. Some of them were inherited from the previous ruote 0.9, while others came from the rewrite, in both cases, I made bad decisions.

Second, those load reports. The 2.0 workqueue is only a refinement of the 0.9 one. The workqueue is where all the tasks for the expressions that make up the process instances are gathered. It’s a transient workqueue, processed very quickly, internal to the engine.

Ruote, being a workflow engine, is meant to coordinate work among participants. Handing work to a participant occurs when the process execution hits a ‘participant expression’. This usually implies some IO operations since the ‘real’ participant is something like a worklist (a workitem table in some DB), a process listening somewhere else and reachable via a message queue (XMPP, AMQP, Stomp, …).

As IO operations depend on things outside of the engine, the dispatch operation was usually done in a new thread, so that, meanwhile, the engine’s workqueue could go on with its work (and the other process instances).

When farming out work to ‘real’ participants with wide concurrency, as the engine is spawning a thread for each dispatch, it gets crowded. The engine became vulnerable at this point : right in the middle of dispatching, with lots of tasks still in the transient workqueue, is the best time for things to go wrong, for the engine to crash, and for processes to stall.

Third ? Is there a third ? Ah yes, I started playing with CouchDB (Thanks Kenneth), and many of the ideas in it are seducing and inspiring.

rubies

My passion for workflow engines comes from my [naive] hopes of reducing human labour, reducing total work by reducing/removing the coordination chores, the friction. Tasks left should be automated or left to humans. Left to humans… the bottleneck is in the human participants… Then there is this half-conviction that a workflow engine doesn’t need to be that fast, since most of the time, it sits waiting for the human-powered services to reply (or to time out).

Also I had settled on Ruby, a beautiful language, with not so fast interpreters (though this is changing, on many fronts, with so many people liking Ruby and giving lots of their time to make it better). The value of a workflow engine lies at its edge, where coordination meets execution, in the participants. That’s where Ruby shines (once more), since it makes it easy to write custom participants very quickly and to leverage all the crazy gems out there.

(I say Ruby shines once more, because I think it shines as well as the implementation language for the engine itself. It feels like the boundary between pseudo-code and code vanished)

Slow people, slow me, slow rubies.

storage and workers

A few lines back, I mentioned CouchDB, that database made “out of the web”. Writing things to disk is a necessity for a workflow engine. It’s an interpreter, but its processes are meant to last (longer than the uptime of a CPU).

One of CouchDB’s strengths comes from its eventual consistency, and it made me wonder whether, for ruote, it would be worth sacrificing a bit of short-term performance for medium-term scalability / robustness.

A frequent complaint about Ruby is its global interpreter lock and its green threads, making it not so sexy in a 2+ core world. Threads are easy to play with from Ruby, but they’re quite expensive (some patches address that) and if you want to get past the global lock, there seems to be only JRuby for now.

Workqueue, queue ? Worker. Why not multiple workers ? One [Ruby] process per worker, that would be ideal ! Let them share a storage for Expressions, Errors, Schedules, Configuration and… Messages.

Ruote 2.1 has a persisted workqueue. For the items placed in the queue, I use the term “message”, it covers “tasks” (sent for execution by expressions) and “events” (intra-engine notifications).

An interesting consequence : a persisted workqueue can be shared by multiple workers, workers not in the same Ruby process.

A message ordering the launch of a ruote process looks like :

  {"type": "msgs",
   "_id": "1984-2151883888-1261900864.59903",
   "_rev": 0,
   "action": "launch",
   "wfid": "20091227-fujemepo",
   "tree":
     ["define", {"name": "test"}, [
       ["sequence", {}, [
         ["participant", {"ref": "alice"}, []],
         ["participant", {"ref": "bob"}, []]]]]],
   "workitem": {"fields": {}},
   "variables": {},
   "put_at": "2009/12/27 08:01:04.599054 UTC"}

Note the process definition tree passed in to the ‘tree’ part of the message, the empty workitem, the freshly minted process instance id in ‘wfid’, and yes, this is JSON. The process, named ‘test’ is simply passing work from alice to bob (in a sequence).
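The tree is plain nested arrays of the form [name, attributes, children], so it can be inspected without ruote at all. For instance, a little helper (hypothetical, not part of ruote’s API) collecting the participant refs in definition order :

```ruby
# Walks a ruote process definition tree -- nested [name, attrs, children]
# arrays -- and collects the 'ref' of every participant expression.
def participant_refs(tree)
  name, attrs, children = tree
  refs = (name == 'participant' && attrs['ref']) ? [attrs['ref']] : []
  refs + children.flat_map { |child| participant_refs(child) }
end

tree =
  ['define', { 'name' => 'test' }, [
    ['sequence', {}, [
      ['participant', { 'ref' => 'alice' }, []],
      ['participant', { 'ref' => 'bob' }, []]]]]]

p participant_refs(tree)  # prints ["alice", "bob"]
```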

What about the worker that has to handle this message ? It would look like :

  require 'ruote'
  require 'ruote/storage/fs_storage'

  storage = Ruote::FsStorage.new('ruote_work')
  worker = Ruote::Worker.new(storage)

  worker.run
    # current thread is now running worker

Whereas the 2.0 workqueue was transient, 2.1 has an external workqueue exploitable by 1+ workers. Simply set the workers to share the same storage. 1+ workers collaborating for the execution of business process instances.

Apart from tasks that need to be handled ASAP, some tasks are scheduled for later handling. A workflow engine thus needs some scheduling capabilities : there are activity timeouts, deliberate sleep periods, recurring tasks… The 0.9 and 2.0 engines were using rufus-scheduler for this, with the scheduler living in its own thread. The 2.1 worker integrates the scheduler in the same thread that handles the messages.

(Since I wrote rufus-scheduler (with the help of many many people), it was easy for me to write/integrate a scheduler specifically for/in ruote)

A storage is meant as a pivot between workers, containing all the run data of an engine, ‘engine’ as a system (a couple storage + worker at its smallest). Workers collaborate to get things done, each task (message) is handled by only one worker and expression implementations are collision resilient.

The result is a system where each expression is an actor. When a message comes, the target expression is re-hydrated from the storage by the worker and is handed the message. The expression emits further messages and is then de-hydrated or discarded.
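That cycle can be caricatured with in-memory stand-ins (toy code, not the actual Ruote::Worker) : a worker pops one message, hands it to the target expression, and queues whatever messages the expression emits.

```ruby
# A toy worker loop : pop one message from the (shared) queue, re-hydrate
# the target expression from storage, let it handle the message, and
# queue the messages it emits. In ruote, many such workers share a storage.
class ToyWorker
  def initialize(storage)
    @storage = storage  # { 'msgs' => [...], 'expressions' => {...} }
  end

  # handles one message ; returns false when the queue is empty
  def step
    msg = @storage['msgs'].shift or return false
    handler = @storage['expressions'][msg['fei']]  # re-hydrate the expression
    @storage['msgs'].concat(handler.call(msg))     # it may emit more messages
    true
  end
end

# a two-step 'process' : expression 0 emits a message for expression 1
storage = {
  'msgs' => [{ 'fei' => 0, 'action' => 'apply' }],
  'expressions' => {
    0 => lambda { |m| [{ 'fei' => 1, 'action' => 'apply' }] },
    1 => lambda { |m| [] }  # last expression : nothing more to do
  }
}

worker = ToyWorker.new(storage)
steps = 0
steps += 1 while worker.step
p steps  # prints 2
```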

Thus the code of the worker is rather small, there’s nothing fancy there.

There will be some daemon-packaging for the worker, thanks to daemon-kit.

storage, workers and engine

We have a storage and workers, but what about the engine itself ? Is there still a need for an Engine class as an entry point ?

I think so, if only to make 2.1 familiar to people used to ruote pre-2.1. At some point, I wanted to rename the Engine class of 0.9 / 2.0 as ‘Dashboard’, forcing the term ‘engine’ to designate the storage + workers + dashboards system. But well, let’s stick with ‘Engine’.

Throughout the versions, the engine class shrank. There is now no execution power in it, just the methods needed to start work, cancel work and query about work.

Here is an engine, directly tapping a storage :

  require 'ruote'
  require 'ruote/storage/fs_storage'

  storage = Ruote::FsStorage.new('ruote_work')
  engine = Ruote::Engine.new(storage)
  
  puts "currently running process instances :"
  engine.processes.each do |ps|
    puts "- #{ps.wfid}"
  end

No worker here, it’s elsewhere. That may prove useful when wiring the engine into, say, a web application. The workers may be working in dedicated processes while the web application gives the commands, by instantiating and using an Engine.

Launching a process looks like :

  pdef = Ruote.process_definition :name => 'test' do
    concurrence do
      alpha
      bravo
    end
  end

  p pdef
    # ["define", {}, [
    #   ["concurrence", {}, [
    #     ["alpha", {}, []],
    #     ["bravo", {}, []]]]]]

  wfid = engine.launch(pdef)

The execution will be done by the workers wired to the storage, somewhere else.

It’s OK to wrap a worker within an engine :

  require 'ruote'
  require 'ruote/storage/fs_storage'

  storage = Ruote::FsStorage.new('ruote_work')
  worker = Ruote::Worker.new(storage)
  engine = Ruote::Engine.new(worker)
    # the engine starts the worker in its own thread
  
  puts "currently running process instances :"
  engine.processes.each do |ps|
    puts "- #{ps.wfid}"
  end

This is the pattern for a traditional, one-worker ruote engine. With the exception that it’s OK to add further workers, later.

In fact, as long as there is one worker and it can reach the storage, the process instances will resume.

storage implementations

There are currently 3 storage implementations.

The first one is memory-only, transient. It’s meant for testing or for temporary workflow engines.

The second one is file based. It’s using rufus-cloche. All the workflow information is stored in a three level JSON file hierarchy. Workers on the same system can easily share the same storage. It uses file locks to prevent overlap.
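The locking idea can be sketched with Ruby’s File#flock (a simplification ; rufus-cloche’s actual strategy may differ) :

```ruby
require 'json'
require 'tmpdir'

# Updates a JSON document on disk under an exclusive lock, so that two
# workers on the same machine can't overwrite each other's changes.
def locked_update(path)
  File.open(path, File::RDWR | File::CREAT) do |f|
    f.flock(File::LOCK_EX)             # blocks until the lock is ours
    raw = f.read
    doc = raw.empty? ? {} : JSON.parse(raw)
    yield doc                          # mutate the document
    f.rewind
    f.write(JSON.generate(doc))
    f.truncate(f.pos)                  # drop any leftover bytes
  end                                  # lock released when the file closes
end

path = File.join(Dir.tmpdir, 'ruote_sketch_expression.json')
File.delete(path) if File.exist?(path)

locked_update(path) { |doc| doc['state'] = 'applied' }
```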

The third one is Apache CouchDB based. It’s quite slow, but we’re hoping that optimizations on the client (worker) side, like using em-http-request, will make it faster. Speed is decent anyway and, on the plus side, you get an engine that spans multiple hosts, and Futon as a low-level tool for tinkering with the engine data.

Those three initial storage implementations are quite naive. Future, mixed-implementation storages are possible. Why not a storage where CouchDB is used for storing expressions while messages are handled by RabbitMQ ? (Note the doubly underlying Erlang here)

dispatch threads

As mentioned previously, the default ruote implementation, when it comes to dispatching workitems to participants, will spawn a new thread each time.

People complained about that, and asked me about putting an upper limit to the number of such threads. I haven’t put that limit, but I made some provisions for an implementation of it.

There is in ruote 2.1 a dedicated service for the dispatching, it’s called dispatch_pool. For now it only understands new thread / no new thread and has no upper limit. Integrators can now simply provide their own implementation of that service.

Note that already since 0.9, a participant implementation is consulted at dispatch time. If it replies to the method do_not_thread and answers ‘true’, the dispatching will occur in the work thread and not in a new thread.
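The dispatch decision amounts to something like this sketch (hypothetical participant classes, not ruote’s actual dispatch_pool code) :

```ruby
# Sketch of the dispatch decision : a participant answering true to
# do_not_thread is run inline, in the worker's own thread ; otherwise
# a new thread is spawned for the dispatch.
def dispatch(participant, workitem)
  if participant.respond_to?(:do_not_thread) && participant.do_not_thread
    participant.consume(workitem)
    nil                                   # ran inline, no thread spawned
  else
    Thread.new { participant.consume(workitem) }
  end
end

# two hypothetical participants
class QuickParticipant
  attr_reader :consumed
  def do_not_thread; true; end
  def consume(workitem); @consumed = workitem; end
end

class SlowParticipant
  attr_reader :consumed
  def consume(workitem); @consumed = workitem; end
end

quick = QuickParticipant.new
dispatch(quick, 'wi')         # nil : consumed in the current thread

slow = SlowParticipant.new
dispatch(slow, 'wi').join     # dispatched in a fresh thread
```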

2.1 participants

Speaking of participants, there is a catch with ruote 2.1. Participants are now divided into two categories. I could say stateless versus stateful, but that would bring confusion with state machines or whatever. Perhaps, something like “by class” vs “by instance” is better. By class participants are instantiated at each dispatch and are thus manageable by any worker, while by instance participants are already instantiated and are thus only manageable via the engine (the Engine instance where they were registered).

BlockParticipants fall into the by instance category.

  engine.register_participant 'total' do |workitem|
    workitem.fields['total'] =
      workitem.fields['articles'].inject(0) do |t, art|
        t + art['price'] * art['count']
      end
  end

will thus only be handled by the worker associated with the engine.

A by class participant is registered via code like :

  engine.register_participant(
    'alpha', Ruote::Amqp::AmqpParticipant, :queue => 'ruote')

Whatever worker gets the task of delivering a workitem to participant alpha will be able to do it.

from listeners to receivers

Ruote has always had a distinction between participants that reply by themselves to the engine (like BlockParticipant) and participants that only focus on delivering the workitem to the real (remote) participant.

For those participants, the work is over when the workitem has been successfully dispatched to the ‘real’ participant. The reply channel is not handled by those participants but by what used to be called ‘listeners’. For example, the AmqpParticipant had to be paired with an AmqpListener.

On the mailing list and in the IRC (#ruote) channel, I saw lots of confusion between listeners and the listen expression. That made me decide to rename listeners as ‘receivers’.

A receiver could also get packaged as a daemon.

I still need to decide if a receiver should be able to “receive” ‘launch’ or ‘cancel’ messages (orders). Oh, and the engine is a receiver.

releasing

I released ruote 2.1.1 at the gemcutter.

For some people, a gem is a necessity, while for others a tag in the source is a sufficient release.

I hope that by releasing a gem from time to time and tagging the repo appropriately I might satisfy a wide range of people.

I’m planning to release gems quite often, especially in the beginning.

The unfortunate thing with a workflow engine, is that many business process instances outlive the version they started at and it’s not that easy to stay backward compatible all along… That explains some of my reticence about releasing.

ruote-kit

ruote-kit is an engine + worklist instance of ruote accessible over HTTP (speaking JSON and HTML), it has a RESTful ideal.

It’s being developed by Kenneth, with the help of Torsten. My next step is to help them upgrade ruote-kit from ruote 2.0 to 2.1. It should not be that difficult : apart from the engine initialization steps, the rest of ruote is quite similar from 2.0 to 2.1.

conclusion

Two major versions of ruote for 2009, that’s a lot. I hope this post will help understand why I went for a second rewrite and how ruote matured and became, hopefully, less ugly.

Happy new year 2010 !

 

source : http://github.com/jmettraux/ruote
mailing-list : http://groups.google.com/group/openwferu-users
irc : freenode.net #ruote

 

Written by John Mettraux

December 31, 2009 at 9:57 am

Ruedas y Cervezas, first instance

“ruote” is the Italian for “wheels”. It’s the name of our ruby workflow engine. The name was proposed by Tomaso Tosolini, it stuck and we liked how people pronounced it “route” which is a good fit for a tool that routes work among participants.

If you translate “ruote” to Spanish you get “ruedas”. Add a thirsty community and you get “ruedas y cervezas”.

This event was organized and hosted by abstra.cc, a madrilenian company doing wonders with Java, JRuby, Scala and their blend of agile magic.

The meeting was divided into three parts. First, abstra.cc presented their work with ruote. They’ve bound it into their “Blue Mountain ERP” to drive the necessary business processes. Workitems are communicated to participants over XMPP. The platform is Java and the languages are JRuby and Scala. As you might have guessed, they run ruote on JRuby.

Then, Iván Martínez, student in the Universidad Politécnica de Madrid presented his work on a graphical process editor for ruote. It’s based on Flex and looks very promising. The abstra.cc guys encouraged him to share his code on github.com, I hope he will follow this suggestion.

For the part when people really get thirsty, I cooked up a bunch of slides about ruote 2.0. They were presented by Nando Sola. I have embedded them here. The first half of them is about why ruote 2.0, while the latter half is about the enhancements to the ruote process definition language.

I felt it was necessary to communicate a bit about ruote 2.0, especially since, in Madrid, people have tremendously contributed to ruote 0.9 (many thanks guys !). Ruote 2.0 is a jump from 0.9, literally a jump, not a step, so explanations are required.

Then beer ensued. A ruedas y cervezas (rc2) is planned for next month, I wish I could attend. So if you’re in Madrid and you’re interested in ruote or, less narrowly, in Ruby / JRuby / Scala / Java, contact Nando Sola and join the abstra.cc and UPM guys for the rc2.

 

 

Written by John Mettraux

October 15, 2009 at 10:35 am

Posted in bpm, ruby, ruote, workflow

Kenneth’s ruote talk

Kenneth Kalmer uploaded his “ruote in twenty minutes” video.


I liked it a lot, an excellent mix of ruote idealism and south-african pragmatism.

That was six months ago though. We’re now working on ruote 2.0, ruote-amqp, ruote-dm, and the latest addition is ruote-activerecord.

 

Written by John Mettraux

August 27, 2009 at 6:04 am

Posted in ruby, ruote

state machine != workflow engine

update (2010) : this resource lifecycle post features two quotes that are enlightening when thinking about state and workflow.

update (2011) : a discussion between two engineers about state machines and workflow (guest post at Engine Yard)

This post is intended for Ruby developers. The idea for it came after numerous discussions with fellow programmers about state machines and workflow engines.

What motivates me for posting is the publication of the 3rd issue of the excellent Rails Magazine. It contains an article named “Workflow solutions with AASM”.

First, a word of warning : I wrote these lines with no intention of minimizing the potential of state machines vs workflow engines. I’m listing workflow engine and state machine implementations for Ruby at the end of this post.

There are a number of open questions in this post, I don’t intend to answer them now or later. There are so many different use cases in the wild.

spark

The article “Workflow solutions with AASM” starts with :

There are two main forms of workflows: sequential and state-machine. Sequential workflows are predictable. They utilize the rules and conditions we provide at the beginning to progress a process forward. The workflow is in control of the process.

A state-machine workflow is the reverse. It is driven by external events to its completion. We define the states and required transitions between those states. The workflow sits and waits for an external event to occur before transitioning to one of the defined states. The decision making process happens externally outside of the workflow – there is a structure to be followed still like a sequential workflow but control is passed to its external environment.

I’m clearly in the “sequential workflow” faction. But I refute the ‘sequential’ label. Most of the state/transition sets out there can be qualified as “sequential”. Any workflow engine, in order to present any interest to its users, has to implement a certain number of [control flow] workflow patterns. The number 1 control-flow pattern is named Sequence, granted. But if you take a look at the pattern that immediately follows, it’s Parallel Split. I can’t adhere to the “sequential workflow” appellation.

Note that most workflow engines (you can call them “sequential workflow” engines) strive to implement a large set of those control-flow patterns. This is usually done via their ‘process definition’ language. The goal is to let users express their scenarii / patterns in a not too verbose way. We can argue that state machines may implement any of the control-flow patterns, my guess is ‘yes’, but what’s the cost in code ?

The difference between “sequential workflows” and “state machines” seems to lie not in the predictability of the former, but rather in how the flow between activities or state is weaved, especially how/where it’s concentrated.

Are “sequential workflows” and “state machines” excluding one another ?

The article states “sequential workflow are predictable. They utilize the rules and conditions we provide at the beginning”. It’s the same for state machines. The vast majority of [Ruby] state machine implementations rely on behaviour specified at implementation time (or right before the application’s [re]start).

The second paragraph I quoted up here says that “decision making process happens externally outside of the workflow (…) control is passed to its external environment”. I argue that a “sequential workflow” engine should do the same. A workflow [engine] can’t take a decision by itself, this task is usually handled by a human or a dedicated algorithm (a piece of code, a rule engine, a coin tossing engine wired to internet, whatever).

case

Wait, we’ve been using this ‘workflow’ term quite liberally until now, what is it about ? Why do state machines seem to be the perfect vessel for it, at least in the eyes of the hard-working developer ?

Let’s look at how state machines are sold, at their use cases. Here is a classical example from a [Rails-oriented] state machine library :

  class Document < ActiveRecord::Base
    include StateFu

    def update_rss
      puts "new feed!"
      # ... do something here
    end

    machine( :status ) do
      state :draft do
        event :publish, :to => :published
      end

      state :published do
        on_entry :update_rss
        requires :author  # a database column
      end

      event :delete, :from => :ALL, :to => :deleted do
        execute :destroy
      end
    end
  end

It attaches behaviour (state and transitions) to a resource. If you look at the example of any Ruby state machine library out there, you will see this pattern : a set of states and transitions attached to a resource in the system (a Model).

Let’s move to an adjacent example, it’s also about a document, but it’s in the “sequential workflow” faction (ruote). (The ‘cursor’ expression in the process definition allows for the flow to be rewound or skipped…) :

Ruote.process_definition :name => 'doc review', :revision => '0.2' do

  cursor do

    participant '${f:author}', :step => 'finalize document'
    participant '${f:review_team}', :step => 'review document'

    rewind :unless => '${f:approved}'

    concurrence do
      participant '${f:publisher}', :step => 'publish document'
      participant '${f:author}', :step => 'publication notification'
    end
  end
end

There is a document yes, but it could be a folder of documents. The process definition is a separate thing, meant to be turned into multiple process instances.

Workflow engines are mostly process definitions interpreters, graduating as “operating systems for business processes” (think ‘ps’ and ‘kill dash 9’ but with business processes / process instances).

As said, most of the Ruby state machine libraries out there are all about binding behaviour to resources / models. What about moving to the workflow engine camp : instead of binding to a system artefact (document, order, invoice, book, customer, …), why not attach state and transitions to a virtual artefact, a so-called “process instance” ? The state machine would move out of its shell and turn into a full workflow engine. Or is that so ?

cases

Back to the 90% of cases : the state machine attached to a model. What if there is more than one “stateable” model ? Easy. But what if a transition in model A provokes a transition in model B ? The “real workflow” now lies at the intersection of the two models.
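To make that concrete, here is a toy sketch (hypothetical Order and Invoice models, not from any real codebase) : the cross-model transition hides in a callback, and the overall flow is nowhere to be read.

```ruby
# Two "stateable" models : approving the order provokes a transition
# in the invoice. The real workflow straddles both classes.
class Invoice
  attr_reader :state
  def initialize; @state = :pending; end
  def issue!; @state = :issued; end
end

class Order
  attr_reader :state, :invoice

  def initialize
    @state = :placed
    @invoice = Invoice.new
  end

  def approve!
    @state = :approved
    @invoice.issue! # cross-model transition, buried in a callback
  end
end
```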

The workflow engine will expose a process definition while, with state machines, we’ll have to extract it from two or more models. Another advantage of workflow engines is process definition versioning : most of them will happily run process instances launched from version 2 of a definition concurrently with instances from version 4.
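The versioning point can be sketched like this (a hypothetical store, not ruote’s API) : each instance pins the revision it was launched from, so old and new instances run side by side.

```ruby
# Sketch of definition versioning : definitions are keyed by name and
# revision, each launched instance carries the tree of its own revision.
class DefinitionStore

  def initialize
    @defs = {}
  end

  def register(name, revision, tree)
    @defs[[name, revision]] = tree
  end

  def launch(name, revision)
    { :name => name, :revision => revision,
      :tree => @defs.fetch([name, revision]) }
  end
end
```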

On the other hand, when a process instance goes ballistic it might take some time to repair it (align it with the reality of the business process). It might be easier with state machines, a simple SQL update usually, but what if there are multiple inter-related behaviours ? Pain on both sides.

Whatever the tool, you’ll have to carefully avoid locking yourself into your private hell of a system.

nuances

Kenneth Kalmer is using a mix of state_machine and ruote in his ISP in a box product. The state machine library is in charge of the state of its various models, while the workflow engine does the orchestration : “processes that drives the provision of services”.

The two techniques may co-exist : tinier state / transition sets and more concise process definitions, with models managing their own states and process instances coordinating the states of multiple models.
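A compressed sketch of that co-existence (hypothetical names, and the “process” is just a plain object standing in for a process instance) : each model keeps its tiny state set, the coordinator sequences them.

```ruby
# Each model manages its own tiny state set...
class Account
  attr_reader :state
  def initialize; @state = :requested; end
  def activate!; @state = :active; end
end

class Mailbox
  attr_reader :state
  def initialize; @state = :absent; end
  def create!; @state = :present; end
end

# ...while a "process instance" coordinates their states in sequence,
# instead of the models poking at each other from callbacks.
class ProvisioningProcess

  def initialize(account, mailbox)
    @account, @mailbox = account, mailbox
  end

  def run
    @account.activate!
    @mailbox.create!
  end
end
```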

It’s OK to remove the workflow engine and replace it with a “super” state machine, but then you’re entering real workflow territory. The mindset there is a bit different : at that level, you’ll be exposed to a flavour of change that directly involves your customer / end-user. Nowhere to hide.

machines

I’ve been linking profusely to my ruote workflow engine. The only other workflow engine I’ve spotted in the wild is rbpm, but its development stopped a long time ago.

Here is a non-exhaustive list of Ruby state machine implementations, in no specific order :

There’s a plethora of state machines versus very few (one or two) workflow engines. This could be interpreted in multiple ways.

State machines follow a single, well-defined pattern, while workflow engines have to implement a whole set of patterns and strive to be “operating systems for business processes”. State machines are more humble.

So Rails Magazine #3 is out. Great read.

Written by John Mettraux

July 3, 2009 at 2:48 pm

ruote 0.9.20

I just released ruote 0.9.20. The original release text is on github, but I’m reproducing it here, with more details.

This release is not backward compatible. Changing the engine and restarting with a batch of persisted processes will not work. Start with new processes or migrate them.

Speaking of persistence, this release features revised and new persistence mechanisms. The most common one, bound directly to the file system, has been revised to use Ruby marshalling instead of YAML, with a net performance increase. Tokyo Cabinet and Tokyo Tyrant persistence mechanisms have been added, along with a DataMapper one.

Should the need to move from one persistence mechanism to another arise, a pooltool.ru has been included for easy migrations, back and forth.

On the expressions front, ‘sleep’ and ‘wait’ have been merged ; as was pointed out to me, “wait ‘2d’” sounds more businessy than “sleep ‘2d’”.
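Those duration strings come from rufus-scheduler. A simplified sketch of the parsing idea (not rufus’ actual code) :

```ruby
# Simplified sketch : turn a duration string like '2d' or '1h30m'
# into a number of seconds.
DURATION_UNITS = { 's' => 1, 'm' => 60, 'h' => 3_600, 'd' => 86_400 }

def parse_duration(s)
  s.scan(/(\d+)([smhd])/).inject(0) do |seconds, (count, unit)|
    seconds + count.to_i * DURATION_UNITS[unit]
  end
end
```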

Each expression may now have an on_cancel and/or on_error attribute, pointing to a subprocess or participant to be triggered in case of cancel or error respectively. This was suggested by Raphael Simon and Kenneth Kalmer.
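The idea behind on_error, sketched in plain Ruby (this is not ruote’s implementation, just the concept) : when a participant raises, the workitem is rerouted to the designated handler.

```ruby
# Concept sketch of on_error : dispatch a workitem to a participant
# and, on error, hand it to the designated handler instead of crashing.
def dispatch(participants, name, workitem, opts = {})
  participants.fetch(name).call(workitem)
rescue => e
  handler = opts[:on_error] or raise
  workitem[:error] = e.message
  participants.fetch(handler).call(workitem)
end
```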

There is a new way to define processes via Ruby :

  OpenWFE.process_definition :name => 'cfp' do
    sequence do
      prepare_call
      concurrence do
        vendor1
        vendor2
        vendor3
      end
      decide_which
    end
  end
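A guess at how such a DSL works under the hood (a much simplified sketch, not OpenWFE’s actual implementation) : the block is instance_eval’ed and method_missing turns unknown calls like ‘prepare_call’ into nodes of a tree.

```ruby
# Much simplified sketch of a process definition DSL : instance_eval
# the block, turn unknown method calls into [ name, children ] nodes.
class TreeBuilder

  attr_reader :tree

  def initialize(name, &block)
    @tree = [ name, [] ]
    @stack = [ @tree ]
    instance_eval(&block)
  end

  def method_missing(name, *args, &block)
    node = [ name.to_s, [] ]
    @stack.last[1] << node
    if block
      @stack.push(node)
      instance_eval(&block)
      @stack.pop
    end
    node
  end
end
```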

Kenneth Kalmer came up with a JabberParticipant and a JabberListener and Torsten Schoenebaum implemented an ActiveResourceParticipant for modifying Rails resources from Ruote.

Many thanks to everybody who contributed code, feedback, ideas to Ruote !

Ruote’s source : http://github.com/jmettraux/ruote/

Written by John Mettraux

March 18, 2009 at 7:59 am

Posted in bpm, openwferu, ruby, ruote, workflow

ruote in 20 minutes

Kenneth Kalmer presented ruote in 20 minutes. The slides are available from his blog post.

I linked to his presentation from the ruote documentation and I also added another example to the ruote quickstart.

What’s great about Kenneth’s presentation is that he starts with a state machine example and builds on it to reach ruote territory.

Kenneth’s use of ruote in his company has one particularly interesting aspect : it leverages XMPP participants (contributed by Kenneth). The engine communicates with participants in business processes over XMPP (Jabber / Gtalk). Very nice.

Written by John Mettraux

March 6, 2009 at 7:12 am

Posted in bpm, ruby, ruote, workflow

new ruote quickstart

Ruote has a new website ; like the one for Rufus, it’s based on the excellent Webby.

I wrote yesterday about renovating the quickstart. I had initially written an example showing a mini todo tool, but this time I switched to a “mechanical turk” like example where a process instance fetches flickr pictures and presents them to human participants for evaluation and choice.

The process definition boils down to :

class PicSelectionProcess < OpenWFE::ProcessDefinition

  sequence do

    get_pictures

    concurrence :merge_type => 'mix' do
      user_alice
      user_bob
      user_charly
    end

    show_results
      # display the pictures chosen by the users
  end
end

The participant for fetching the pictures is called ‘get_pictures’ :

engine.register_participant :get_pictures do |workitem|

  feed = Atom::Feed.new(
    "http://api.flickr.com/services/feeds/photos_public.gne" +
    "?tags=#{workitem.tags.join(',')}&format=atom")
  feed.update!

  workitem.pictures = feed.entries.inject([]) do |a, entry|
    a << [ entry.title, entry.authors.first.name, entry.links.first.href ]
  end
end

That’s it for the quickstart. Now I have to update the tea tasting team example and the japanese website.

Written by John Mettraux

December 16, 2008 at 6:57 am

Posted in atom, bpm, openwferu, ruby, ruote, workflow