Thursday 19 September 2013

The Tortoise

 
I'm keeping the animal theme going after @tingenek's last post about riding the camel, with a reference to Aesop's fable about slow and steady winning the race.

It's a good year since we got funding thanks to Geovation, and we are finally on the verge of launching the first app for Android and its accompanying website.

Barring any last-minute hiccups, the "Get Community Payback" app will be available to download from Google Play for Android devices from October 1st 2013.


Users will be able to take photos of fly-tipping or other ugly sights, tag the location on a map and submit the information to their local Probation Trust. We'll forward nominations from other areas to the Probation Trust concerned.

In Staffordshire & West Midlands, staff can assess and review projects on a website we've developed. As projects are updated, automatic notifications go back to the person who suggested the project.

Other Probation Trusts are free to use our website to track projects in their area. The code is open source anyway so they can adopt it, adapt it or completely ignore it! Luckily, most seem interested.

People can keep track of projects on the public-facing website. There they can search a map and click on pins to reveal photos and project descriptions.

But while we've been treading carefully, learning and developing our skills, we have been overtaken by a company offering a rival service!


Paradex released a Community Payback project reporting app a few months ago, offering to e-mail project nominations to Probation Trusts for a monthly charge. They didn't ask if we were interested at Staffordshire & West Midlands, nor did they ask permission to add our video to their website!

That website ends with "org", despite Paradex looking to make a profit, and they use "official" Ministry of Justice and Community Payback logos and colour scheme. If you look very closely, you will see the small print admitting Paradex is "independent of Probation and the Ministry of Justice".

I don't have any beef with small software companies trying to identify opportunities to make a few quid, but asking cash-strapped public services for a fee so you can send them the odd e-mail?

In the end, if there is demand for the service and two apps are available, then people will use the best one and only they will be the judge of that.

We are hopeful that our methodical approach to service design and user experience will ultimately pay off as the finish line comes into view.

Keep an eye on @swmcpvisibility on Twitter for more updates in the next few weeks.


Thursday 20 June 2013

Riding the Camel

When I originally thought about the technical design for the CP Visibility application, I had in mind a series of blocks that were loosely coupled together for maximum flexibility:

  • An xml database (eXist)
  • A public search engine (SOLR)
  • A publication mechanism to push updates to the phones (MQTT)

However, at that time I hadn't considered exactly how that was going to happen - I'd assumed an api or library or REST game of some sort would present itself (optimism is an integral part of professional IT).

Move on a few months and it's clear that although there are well-documented ways to do this in Java (the Paho client library for MQTT and the SOLRJ library for SOLR), each library introduces its own layer of complexity:
  • The library needs to be understood.
  • Extra Java code needs to be written and tested.
  • It has to work in eXist as a module.
Hold on though: each new use of the data, like a Twitter or a websocket feed, would again require its own mini 'solution', created from scratch each time. That doesn't sound like a good idea.

Enter Apache Camel, which I'd used in another project at work in the meantime and which seemed ideal for this task. Basically, Camel knows how to connect to a load of technologies and has a ruleset that tells it what to route where, and when. It's a 'mediation engine'.
[Have a look at their web site, it does a much better job of describing what Camel does.]


Now for this project the important aspects of Camel are:
  • It can be deployed as a servlet, so it slots right into the existing stack.
  • It understands both MQTT and SOLR.
  • It has XQuery/XPath etc built in so it can understand our project files.
What it doesn't have is any understanding of eXist. However, that turns out not to be insurmountable. eXist has a 'file' module that lets us write out an xml project file to a folder on the server. It's not pretty, but it also solves the problem of queuing up updates. Sometimes old-school is the way to go.

We start then by writing an xml file to a folder on the server called camel-in each time there is a new or updated project. Next comes the Camel magic.
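
For a rough idea of what lands in camel-in, here's a cut-down project file. Treat it as a sketch: only the /document/id path is taken from the route itself, the title and location element names are assumptions made up for illustration, and the values simply echo the SOLR example in this post.

```xml
<!-- Illustrative project file; only /document/id is confirmed by the route's xpath -->
<document>
    <id>538</id>
    <!-- ASSUMPTION: the element names below are hypothetical -->
    <title>Example project title</title>
    <location>51.6812,-2.23541</location>
</document>
```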

Camel consists of <routes> which we can configure in a simple xml file on the server.
The first part of the route, the <from> tag, picks up any files that appear in camel-in and deletes them after successful processing; the xml content becomes the 'body' or payload for the rest of the route:

<route>

    <from uri="file:/opt/tomcat/temp/camel-in?delete=true"/>

    <setHeader headerName="myid">
        <xpath>/document/id/text()</xpath>
    </setHeader>

    <multicast stopOnException="true">
        <to uri="direct:solr"/>
        <to uri="direct:mqtt"/>
    </multicast>

</route>  

Next, we set up a header variable with the id of the project as we need it for mqtt later. Notice <xpath> is built in, so we can read it out directly.  Lastly, we 'multicast' to other routes for mqtt and solr, a bit like calling a sub-routine.
Multicasting is worth explaining; in Camel, a normal route with a couple of steps acts like a pipeline, pouring the body from the output of one into the input of the next. Usually, this is fine, unless you want to alter the body within the route. If you do that, the altered body gets poured into the next step, not the original. We need very different body data for mqtt (text) and solr (xml), so we have to <multicast>. This makes sure a separate copy of the body is sent to each route. First up is SOLR:

<route>
     <from uri="direct:solr"/>

     <to uri="xslt:file:/opt/tomcat/temp/camel-xsl/proj2solr.xsl"/>

     <log message="SOLR Update"/>

     <convertBodyTo type="java.lang.String"/>

     <setHeader headerName="SolrOperation">
        <constant>INSERT</constant>
     </setHeader>

     <to uri="solr://localhost/solr/cpsv0"/>
</route>

This route takes a project and uses xslt to put together the right xml format for a solr insert:

<add>
   <doc>
      <field name="id">538</field>
       more fields......
      <field name="location">51.6812,-2.23541</field>
   </doc>
</add>

Next, the body gets converted to a string (rather than xml) and sent to the solr end-point with the appropriate instruction, i.e. insert the data.
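
To give a feel for the xslt step, here's a minimal sketch of what a stylesheet like proj2solr.xsl might contain. It isn't the real file: it assumes a project document with /document/id (the path the first route reads), plus hypothetical title and location elements, and it assumes matching field names exist in the SOLR schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

    <!-- Wrap the single project in solr's <add><doc> update envelope -->
    <xsl:template match="/document">
        <add>
            <doc>
                <!-- /document/id is the same path the first route reads with xpath -->
                <field name="id"><xsl:value-of select="id"/></field>
                <!-- ASSUMPTION: title and location are hypothetical element names -->
                <field name="title"><xsl:value-of select="title"/></field>
                <field name="location"><xsl:value-of select="location"/></field>
            </doc>
        </add>
    </xsl:template>

</xsl:stylesheet>
```

Adding another searchable field would then be a one-line change to the stylesheet rather than a change to any Java code.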

Now, this might seem a little complex, but look at the advantages. No-one needs to know SOLRJ, no code is written that we have to maintain and it's dead easy to alter. Next up is MQTT:

<route>
    <from uri="direct:mqtt"/>

    <to uri="xslt:file:/opt/tomcat/temp/camel-xsl/proj2mqtt.xsl"/>

    <log message="MQTT Update"/>

    <recipientList ignoreInvalidEndpoints="false">

        <simple>mqtt:camel?host=tcp://localhost:1883&amp;publishTopicName=projects/${header.myid}</simple>

    </recipientList>
</route>

This route is a little more complex, but not by much. For the mqtt side we're publishing a message to the projects/N topic, where N is the project number held in the myid header we set in the first route. The message content is created using the same approach that we used before, since xslt will also output plain text.
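
As a sketch of the plain-text trick, a stylesheet along the lines of proj2mqtt.xsl could be as small as the one below. Again, this is an illustration rather than the real file: the status element and the wording of the message are assumptions, and only /document/id comes from the routes above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- method="text" makes the transform emit plain text instead of xml -->
    <xsl:output method="text" encoding="UTF-8"/>

    <xsl:template match="/document">
        <xsl:text>Project </xsl:text>
        <xsl:value-of select="id"/>
        <xsl:text> updated: </xsl:text>
        <!-- ASSUMPTION: status is a hypothetical element name -->
        <xsl:value-of select="normalize-space(status)"/>
    </xsl:template>

</xsl:stylesheet>
```

With an id of 538 and a status of "Work scheduled", the message body would read "Project 538 updated: Work scheduled".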
Curiously, one thing Camel doesn't do easily is allow you to just drop a variable into a url. Instead, you make up a <recipientList>, which allows route end-point strings to be calculated at run-time. The <simple> tag is the language we're using to make up that string. It could just as easily be <javascript> or <xpath> etc.; there are a few you can use.

That's it: in a few lines of xml, the updates we need are done. Camel is a very handy bit of kit, and we can adapt it as the project progresses without having to invest in apis or worry about some bit of code working with another. Best of all, it maintains the loose coupling and flexibility we need.