InstantSOAP

Introduction

This is a guide to exposing command-line applications through InstantSOAP. By the end of this tutorial, you should be able to expose command-line applications to users, with fine-grained control over how the service inputs and outputs are mapped to the application's options and standard pipes.

This guide assumes that you have worked through the Getting Started guide and have deployed the echo-test war at the default location on the current machine. It also assumes that you will develop your client using CXF as your web-services toolkit, and that you are on a *nix operating system rather than Windows.

Your First Command-line Application

In this example, we will expose the cat command-line application via InstantSOAP. Although this is not very exciting in itself, it will demonstrate several of the features of InstantSOAP's command-line bindings.

The first step is to create a file called META-INF/applications/applicationList. Each line of this file gives the relative location of a resource containing a single command-line application description.

echoViaCat.xml

Next, each named resource needs to be populated with the XML descriptor for the command-line to be exposed. This file effectively contains the JAXB serialization of an Application bean. It starts with the application tag.

<cl:application xmlns:cl="http://cs.ncl.ac.uk/instantsoap/commandLine"
                 xmlns:svc="http://cs.ncl.ac.uk/instantsoap/service">

  <!-- ... configuration goes here -->

</cl:application>

We need to give the application a name and a description so that this information can be reported back to users.

<metaData>
  <name>echoMessage</name>
  <description>Echo your message back to you by interacting with the command-line.</description>
</metaData>

Next, we need to describe how options are mapped onto strings in the command-line. This varies from application to application, and from platform to platform. In this case, we set the prefix to -, the long form of the prefix (for options that are whole words rather than single letters) to --, indicate that we prefer single-letter prefixes and disable merging of multiple no-args options.

<commandFormat>
  <longForm>--</longForm>
  <mergeEmptyOptions>false</mergeEmptyOptions>
  <preferShortPrefixes>true</preferShortPrefixes>
  <prefix>-</prefix>
</commandFormat>

The command-line itself needs a working directory, the name of the executable, and information specifying any arguments to be passed in. In our case, the working directory can be /tmp, the executable is cat and we don't use arguments.

<commandLinePattern>
  <arguments/>
  <executable>cat</executable>
  <workingDirectory>/tmp</workingDirectory>
</commandLinePattern>

In the case of cat, we want to route the user input to the STDIN pipe of the command. We want the user to see an input named message but we want this to be routed to the application standard input, which is called #STDIN.

<inputs>
  <input>
    <metadata>
      <name>message</name>
      <description>The message that you wish to be echoed</description>
    </metadata>
    <renameTo>#STDIN</renameTo>
  </input>
</inputs>

Similarly, we want to route the standard output of cat to the user. We will name it response. The standard output is initially named #STDOUT.

<outputs>
  <output>
    <metadata>
      <name>response</name>
      <description>The result of echoing the message.</description>
    </metadata>
    <renameFrom>#STDOUT</renameFrom>
  </output>
</outputs>

If we had other inputs and outputs, we could have described them in the inputs and outputs elements in the same way. If a name should remain unchanged, there is no need to supply a renameTo or renameFrom element.
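
Putting the fragments above together, the whole of echoViaCat.xml might look like the sketch below. It only reassembles the elements already shown; the order of the child elements follows the order in which they were introduced, and the comments are added here for orientation.

```xml
<cl:application xmlns:cl="http://cs.ncl.ac.uk/instantsoap/commandLine"
                xmlns:svc="http://cs.ncl.ac.uk/instantsoap/service">
  <!-- Name and description reported back to users -->
  <metaData>
    <name>echoMessage</name>
    <description>Echo your message back to you by interacting with the command-line.</description>
  </metaData>
  <!-- How options are rendered as command-line strings -->
  <commandFormat>
    <longForm>--</longForm>
    <mergeEmptyOptions>false</mergeEmptyOptions>
    <preferShortPrefixes>true</preferShortPrefixes>
    <prefix>-</prefix>
  </commandFormat>
  <!-- cat is run in /tmp with no arguments -->
  <commandLinePattern>
    <arguments/>
    <executable>cat</executable>
    <workingDirectory>/tmp</workingDirectory>
  </commandLinePattern>
  <!-- The user-facing input "message" is fed to standard input -->
  <inputs>
    <input>
      <metadata>
        <name>message</name>
        <description>The message that you wish to be echoed</description>
      </metadata>
      <renameTo>#STDIN</renameTo>
    </input>
  </inputs>
  <!-- Standard output is returned to the user as "response" -->
  <outputs>
    <output>
      <metadata>
        <name>response</name>
        <description>The result of echoing the message.</description>
      </metadata>
      <renameFrom>#STDOUT</renameFrom>
    </output>
  </outputs>
</cl:application>
```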

To deploy this application, you need to package these files into a jar and put that jar into the WEB-INF/lib directory of a modified instantsoap-ws-echoTest.war. When you deploy the web service, you should see the echoMessage application deployed and should be able to invoke it just like any other InstantSOAP application.

You can obtain the complete source for this example, including the pom, in tar.gz or zip formats.

Special inputs and outputs

Some inputs and outputs are special. They are bound to things with particular behavior in the underlying execution of the application. The command-line bindings indicate this by prefixing them with a # (hash) sign. If these are not renamed in the input and output bindings, then they are dropped, or given empty values.

Special Handling

Direction	Name	Description
Input	#STDIN	Standard input to the application; defaults to the empty string.
Output	#STDOUT	Standard output of the application.
Output	#STDERR	Standard error of the application.
Output	#EXIT_STATUS	The exit status of the application, usually 0 on success.

Further special inputs and outputs may be added in the future. They will have the same input/output mapping semantics as the existing ones.
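
As an illustration of the table above, a service that also wants to report errors and the exit code could rename the other special outputs in the same way as #STDOUT. This is a sketch only: the user-facing names errors and exitStatus are hypothetical, chosen for this example, and the fragment belongs inside an application element like the one shown earlier.

```xml
<outputs>
  <!-- Standard output, as in the cat example -->
  <output>
    <metadata>
      <name>response</name>
      <description>The result of echoing the message.</description>
    </metadata>
    <renameFrom>#STDOUT</renameFrom>
  </output>
  <!-- Hypothetical name: exposes standard error to the user -->
  <output>
    <metadata>
      <name>errors</name>
      <description>Anything the application wrote to standard error.</description>
    </metadata>
    <renameFrom>#STDERR</renameFrom>
  </output>
  <!-- Hypothetical name: exposes the exit status, usually 0 on success -->
  <output>
    <metadata>
      <name>exitStatus</name>
      <description>The exit status of the application.</description>
    </metadata>
    <renameFrom>#EXIT_STATUS</renameFrom>
  </output>
</outputs>
```

Special outputs that are left out of the outputs element are not renamed, so they are dropped rather than returned.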

Exposing Blast

No bioinformatics web-application framework is complete without an example of how to expose blast. To follow this example through, you need to have blast installed locally. If you are on Ubuntu, you can install blast by typing:

sudo aptitude install blast2

If you do not have sudo access, or are on another operating system, contact your systems administrator.

The blast application accepts a plethora of options. When exposing it as a service, the vast majority of them are not relevant and should be hidden. Also, the type of alignment to perform is specified by one of these options, -p, and should not be configurable by the user. The -d option is used to select a database to search against. In the case of a service, it is better for people to use a meaningful name, and to map this to a database name server-side.

You can obtain the complete source for this example, including the pom, in tar.gz or zip formats. You should download this now, as it contains sequence files under src/blast that are needed to work through this example.

Download the archive to your home directory.
Create a directory called /tmp/isExample/scratch
Change into this directory
Type tar xzf ~/site-1.0-catCommandline.tar.gz, substituting the location you downloaded the archive to as needed.
Create a directory called /tmp/isExample/blastDBs
Change into this directory

Type:

formatdb -t "Silicon Proteins" -i /tmp/isExample/scratch/blastCommandline/blast/siliconProteins.fasta -p T -n silicon

You should now see a series of files starting with silicon in your current directory.

You now have a formatted database that we can use for blast. The next step is to create an application file to expose this database via blast. We will put this file into blastp in META-INF/applications as before.

This file starts with similar information to the previous example: it opens the XML, names the application, and specifies the command-line style to use.

<ns3:application xmlns:ns3="http://cs.ncl.ac.uk/instantsoap/commandLine"
                 xmlns:ns2="http://cs.ncl.ac.uk/instantsoap/service">
  <metaData>
    <description>Run blastp against a set of silicon-processing proteins.</description>
    <name>blastp-silicon</name>
  </metaData>
  <commandFormat>
    <longForm>--</longForm>
    <mergeEmptyOptions>false</mergeEmptyOptions>
    <preferShortPrefixes>true</preferShortPrefixes>
    <prefix>-</prefix>
  </commandFormat>

In the last example, there were no command-line arguments needed. In this case, we do need to specify some. Firstly, we will specify the -p switch and give it the value blastp.

<commandLinePattern>
  <arguments>
    <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:type="ns3:option">
      <shortName>p</shortName>
      <values>
        <value xsi:type="ns3:variableExpression">
          <components>
            <component xsi:type="ns3:constant">
              <value>blastp</value>
            </component>
          </components>
        </value>
      </values>
    </argument>

This opens the arguments list, and then supplies the value of the first argument. This has the short name p, and we assign it a single constant value, blastp.

Next we must do something similar for the -d option. We will give it the constant value silicon, so as to match up with the name we gave the database above.

  <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="ns3:option">
    <shortName>d</shortName>
    <values>
      <value xsi:type="ns3:variableExpression">
        <components>
          <component xsi:type="ns3:constant">
            <value>silicon</value>
          </component>
        </components>
      </value>
    </values>
  </argument>
</arguments>

Now, all that is left is to set the working directory to /tmp/isExample/blastDBs, the executable to blastall, and to route query and results to the standard input and output of blast, respectively.

    <executable>blastall</executable>
    <workingDirectory>/tmp/isExample/blastDBs/</workingDirectory>
  </commandLinePattern>
  <inputs>
    <input>
      <metadata>
        <name>query</name>
      </metadata>
      <renameTo>#STDIN</renameTo>
    </input>
  </inputs>
  <outputs>
    <output>
      <metadata>
        <name>results</name>
      </metadata>
      <renameFrom>#STDOUT</renameFrom>
    </output>
  </outputs>
</ns3:application>

This application can be deployed in the same way as the previous example. Congratulations! You have exposed a sequence database via blast through InstantSOAP.

Wiring inputs to options

In the previous section, we exposed blastp and a protein database via InstantSOAP. Here, we will expose two nucleotide databases via blastn, and allow the user to specify which one to search by name. This will introduce command-line variables.

Firstly, create formatdb databases for selenium.fasta and lead.fasta, using selenium and lead as their names.

We will now create the application file blastn.xml to tell InstantSOAP about the databases. This will be quite similar to the description of blastp earlier, so here we will only show interesting differences.

Firstly, you need different application meta-data, to name this uniquely and provide a description to the users.

<metaData>
  <description>Run blastn against a set of lead or selenium nucleotide sequences.</description>
  <name>blastn</name>
</metaData>

Next, we need to provide an input for the database name.

<input>
  <metadata>
    <name>database</name>
    <description>The database to search against. One of selenium or lead.</description>
  </metadata>
</input>
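
For context, the new input sits alongside the query input inside the same inputs element. The sketch below assumes that the query input is carried over unchanged from the blastp description; the database input has no renameTo element because its name is used as-is.

```xml
<inputs>
  <!-- Carried over from the blastp description: query is fed to standard input -->
  <input>
    <metadata>
      <name>query</name>
    </metadata>
    <renameTo>#STDIN</renameTo>
  </input>
  <!-- New for blastn: keeps its own name, so no renameTo is supplied -->
  <input>
    <metadata>
      <name>database</name>
      <description>The database to search against. One of selenium or lead.</description>
    </metadata>
  </input>
</inputs>
```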

Lastly, we need to bind the database input to the -d option. This is done by using a variable component rather than a constant one.

<argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:type="ns3:option">
  <shortName>d</shortName>
  <values>
    <value xsi:type="ns3:variableExpression">
      <components>
        <component xsi:type="ns3:variable">
          <name>database</name>
        </component>
      </components>
    </value>
  </values>
</argument>
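
Since only the differences from the blastp description are shown here, it may help to see the arguments list as a whole. The sketch below assumes that the constant given to the -p switch is changed from blastp to blastn, a step this guide does not show explicitly, and that the ns3 prefix is bound on the enclosing application element as before.

```xml
<arguments>
  <!-- Assumption: the program is fixed server-side to blastn -->
  <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="ns3:option">
    <shortName>p</shortName>
    <values>
      <value xsi:type="ns3:variableExpression">
        <components>
          <component xsi:type="ns3:constant">
            <value>blastn</value>
          </component>
        </components>
      </value>
    </values>
  </argument>
  <!-- The database is taken from the user-supplied database input -->
  <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:type="ns3:option">
    <shortName>d</shortName>
    <values>
      <value xsi:type="ns3:variableExpression">
        <components>
          <component xsi:type="ns3:variable">
            <name>database</name>
          </component>
        </components>
      </value>
    </values>
  </argument>
</arguments>
```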

If an option takes multiple values, then supply multiple value elements. It is also possible to include a list of components for a single value. These can each be variables or constants. When the application is run, the variables will be replaced with their bindings, and the results will be concatenated together. So, if you had the constant foo followed by the variable bar and then the constant baz, and an invocation bound bar to PIDGEN, then the option would expand out to fooPIDGENbaz.
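
The foo, bar and baz case can be written out as a single value with three components. This is a sketch using only the component types shown earlier; foo, bar and baz are placeholder names, and the fragment assumes the xsi and ns3 prefixes are bound on an enclosing element.

```xml
<values>
  <value xsi:type="ns3:variableExpression">
    <components>
      <!-- Constant text placed before the variable -->
      <component xsi:type="ns3:constant">
        <value>foo</value>
      </component>
      <!-- Replaced with whatever the invocation binds to bar -->
      <component xsi:type="ns3:variable">
        <name>bar</name>
      </component>
      <!-- Constant text placed after the variable -->
      <component xsi:type="ns3:constant">
        <value>baz</value>
      </component>
    </components>
  </value>
</values>
```

With bar bound to PIDGEN, concatenating the three components in order gives fooPIDGENbaz. An option that takes several values would instead list several value elements inside values.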
