This is a guide to exposing command-line applications through InstantSOAP. By the end of this tutorial, you should be able to expose command-lines to users with fine-grained control over how the application inputs and outputs are mapped to the application options and standard pipes.
This guide assumes that you have run through the Getting Started guide and have deployed the echo-test war at the default location on the current machine. We also assume that you will be developing your client using CXF as your web-services toolkit. It also assumes that you are on a *nix operating system rather than Windows.
In this example, we will expose the cat command-line application via InstantSOAP. Although this is not very exciting in itself, it will demonstrate several of the features of InstantSOAP's command-line bindings.
The first step is to create a file called META-INF/applications/applicationList. Each line of this file gives the relative location of a resource containing a single command-line application description. For this example, the file contains a single line:
echoViaCat.xml
Next, each named resource needs to be populated with the XML descriptor for the command-line to be exposed. This file effectively contains the JAXB serialization of an Application bean.
It starts with the application tag.
<cl:application xmlns:cl="http://cs.ncl.ac.uk/instantsoap/commandLine" xmlns:svc="http://cs.ncl.ac.uk/instantsoap/service">
... configuration goes here
</cl:application>
We need to give an application name and description so that this info can be reported back.
<metaData>
  <name>echoMessage</name>
  <description>Echo your message back to you by interacting with the command-line.</description>
</metaData>
Next, we need to describe how options are mapped onto strings in the command-line. This varies from application to application, and from platform to platform. In this case, we set the prefix to -, the long form of the prefix (for options that are whole words rather than single letters) to --, indicate that we prefer single-letter prefixes, and disable merging of multiple no-args options.
<commandFormat>
  <longForm>--</longForm>
  <mergeEmptyOptions>false</mergeEmptyOptions>
  <preferShortPrefixes>true</preferShortPrefixes>
  <prefix>-</prefix>
</commandFormat>
The command-line itself needs a working directory, the name of the executable, and information specifying any arguments to be passed in. In our case, the working directory can be /tmp, the executable is cat, and we don't use arguments.
<commandLinePattern>
  <arguments/>
  <executable>cat</executable>
  <workingDirectory>/tmp</workingDirectory>
</commandLinePattern>
In the case of cat, we want to route the user input to the STDIN pipe of the command. We want the user to see an input named message, but we want this to be routed to the application's standard input, which is called #STDIN.
<inputs>
  <input>
    <metadata>
      <name>message</name>
      <description>The message that you wish to be echoed</description>
    </metadata>
    <renameTo>#STDIN</renameTo>
  </input>
</inputs>
Similarly, we want to route the standard output of cat to the user. We will name it response. The standard output is initially named #STDOUT.
<outputs>
  <output>
    <metadata>
      <name>response</name>
      <description>The result of echoing the message.</description>
    </metadata>
    <renameFrom>#STDOUT</renameFrom>
  </output>
</outputs>
If we had other inputs and outputs, we could have described them in the inputs and outputs elements. If we want a name to remain unchanged, then there is no need to supply a renameTo or renameFrom element.
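Putting the fragments above together, the complete echoViaCat.xml might look like the following sketch. The element order is an assumption, taken from the order in which the fragments are introduced in this guide; everything else is copied from the fragments as shown.

```xml
<cl:application xmlns:cl="http://cs.ncl.ac.uk/instantsoap/commandLine"
                xmlns:svc="http://cs.ncl.ac.uk/instantsoap/service">
  <!-- Name and description reported back to users. -->
  <metaData>
    <name>echoMessage</name>
    <description>Echo your message back to you by interacting with the command-line.</description>
  </metaData>
  <!-- How options are rendered as command-line strings. -->
  <commandFormat>
    <longForm>--</longForm>
    <mergeEmptyOptions>false</mergeEmptyOptions>
    <preferShortPrefixes>true</preferShortPrefixes>
    <prefix>-</prefix>
  </commandFormat>
  <!-- Run cat in /tmp with no arguments. -->
  <commandLinePattern>
    <arguments/>
    <executable>cat</executable>
    <workingDirectory>/tmp</workingDirectory>
  </commandLinePattern>
  <!-- The user-visible input "message" is fed to standard input. -->
  <inputs>
    <input>
      <metadata>
        <name>message</name>
        <description>The message that you wish to be echoed</description>
      </metadata>
      <renameTo>#STDIN</renameTo>
    </input>
  </inputs>
  <!-- Standard output is returned to the user as "response". -->
  <outputs>
    <output>
      <metadata>
        <name>response</name>
        <description>The result of echoing the message.</description>
      </metadata>
      <renameFrom>#STDOUT</renameFrom>
    </output>
  </outputs>
</cl:application>
```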
To deploy this application, you need to package these files up into a jar and put it into the WEB-INF/lib of a modified instantsoap-ws-echoTest.war.
When you deploy the web service, you should see the echoMessage application deployed and should be able to invoke it just like any other InstantSOAP application.
You can obtain the complete source for this example, including the pom, in tar.gz or zip formats.
Some inputs and outputs are special. They are bound to things with particular behavior in the underlying execution of the application. The command-line bindings indicate this by prefixing them with a # (hash) sign. If these are not renamed in the input and output bindings, then they are dropped, or given empty values.
Name | Description |
---|---|
Inputs | |
#STDIN | Standard input to the application, defaults to the empty string. |
Outputs | |
#STDOUT | Standard output of the application. |
#STDERR | Standard error of the application. |
#EXIT_STATUS | The exit status of the application, usually 0 on success. |
Further special inputs and outputs may be added in the future. They will have the same input/output mapping semantics as the existing ones.
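As a sketch of how the other special outputs could be exposed, the outputs element below maps #STDERR and #EXIT_STATUS alongside #STDOUT, using the same renameFrom mechanism as in the cat example. The user-visible names errors and exitStatus, and their descriptions, are hypothetical and chosen only for illustration.

```xml
<outputs>
  <output>
    <metadata>
      <name>response</name>
      <description>The result of echoing the message.</description>
    </metadata>
    <renameFrom>#STDOUT</renameFrom>
  </output>
  <!-- Hypothetical name: expose standard error instead of dropping it. -->
  <output>
    <metadata>
      <name>errors</name>
      <description>Anything the application wrote to standard error.</description>
    </metadata>
    <renameFrom>#STDERR</renameFrom>
  </output>
  <!-- Hypothetical name: expose the exit status, usually 0 on success. -->
  <output>
    <metadata>
      <name>exitStatus</name>
      <description>The exit status of the application.</description>
    </metadata>
    <renameFrom>#EXIT_STATUS</renameFrom>
  </output>
</outputs>
```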
No bioinformatics web-application framework is complete without an example of how to expose blast. To follow this example through, you need to have blast installed locally. If you are on Ubuntu, you can install blast by typing:
sudo aptitude install blast2
If you do not have sudo access, or are on another operating system, contact your systems administrator.
The blast application accepts a plethora of options. When exposing this as a service, the vast majority of them are not relevant and should be hidden. Also, the type of alignment to perform is specified by one of these options, -p, and should not be configurable by the user. The -d option is used to select a database to search against. In the case of a service, it is good for people to use a meaningful name, and to map this down to a database name server-side.
You can obtain the complete source for this example, including the pom, in tar.gz or zip formats. You should download this now, as it contains sequence files under src/blast that are needed to work through this example.
Unpack the archive in /tmp/isExample/scratch:
tar xzf ~/site-1.0-catCommandline.tar.gz
substituting the location you downloaded the archive to as needed. Then, in /tmp/isExample/blastDBs, format the silicon protein database:
formatdb -t "Silicon Proteins" -i /tmp/isExample/scratch/blastCommandline/blast/siliconProteins.fasta -p T -n silicon
You now have a formatted database that we can use for blast. The next step is to create an application file to expose this database via blast. We will put this file into blastp in META-INF/applications as before.
This file starts with similar information to the previous example: it opens the XML, names the application, and specifies the command-line style to use.
<ns3:application xmlns:ns3="http://cs.ncl.ac.uk/instantsoap/commandLine" xmlns:ns2="http://cs.ncl.ac.uk/instantsoap/service">
  <metaData>
    <description>Run blastp against a set of silicon-processing proteins.</description>
    <name>blastp-silicon</name>
  </metaData>
  <commandFormat>
    <longForm>--</longForm>
    <mergeEmptyOptions>false</mergeEmptyOptions>
    <preferShortPrefixes>true</preferShortPrefixes>
    <prefix>-</prefix>
  </commandFormat>
In the last example, there were no command-line arguments needed. In this case, we do need to specify some. Firstly, we will specify the -p switch and give it the value blastp.
  <commandLinePattern>
    <arguments>
      <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
        <shortName>p</shortName>
        <values>
          <value xsi:type="ns3:variableExpression">
            <components>
              <component xsi:type="ns3:constant">
                <value>blastp</value>
              </component>
            </components>
          </value>
        </values>
      </argument>
This opens the arguments list, and then supplies the value of the first argument. This has the short name p, and we assign it a single constant value blastp.
Next we must do something similar for the -d option. We will give it the constant value silicon, so as to match up with the name we gave the database above.
      <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
        <shortName>d</shortName>
        <values>
          <value xsi:type="ns3:variableExpression">
            <components>
              <component xsi:type="ns3:constant">
                <value>silicon</value>
              </component>
            </components>
          </value>
        </values>
      </argument>
    </arguments>
Now, all that is left is to set the working directory to /tmp/isExample/blastDBs, the executable to blastall, and to route query and results to the standard input and output of blast, respectively.
    <executable>blastall</executable>
    <workingDirectory>/tmp/isExample/blastDBs/</workingDirectory>
  </commandLinePattern>
  <inputs>
    <input>
      <metadata>
        <name>query</name>
      </metadata>
      <renameTo>#STDIN</renameTo>
    </input>
  </inputs>
  <outputs>
    <output>
      <metadata>
        <name>results</name>
      </metadata>
      <renameFrom>#STDOUT</renameFrom>
    </output>
  </outputs>
</ns3:application>
This application can be deployed in the same way as the previous example. Congratulations! You have exposed a sequence database via blast through InstantSOAP.
In the previous section, we exposed blastp and a protein database via InstantSOAP. Here, we will expose two nucleotide databases via blastn, and allow the user to specify which one to search by name. This will introduce command-line variables.
Firstly, create formatdb databases for selenium.fasta and lead.fasta, using selenium and lead as their names.
We will now create the application file blastn.xml to tell InstantSOAP about the databases. This will be quite similar to the description of blastp earlier, so here we will only show the interesting differences.
Firstly, you need different application meta-data, to name this uniquely and provide a description to the users.
<metaData>
  <description>Run blastn against a set of lead or selenium nucleotide sequences.</description>
  <name>blastn</name>
</metaData>
Next, we need to provide an input for the database name.
<input>
  <metadata>
    <name>database</name>
    <description>The database to search against. One of selenium or lead.</description>
  </metadata>
</input>
Lastly, we need to bind the database input to the -d option. This is done by using a variable component rather than a constant one.
<argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
  <shortName>d</shortName>
  <values>
    <value xsi:type="ns3:variableExpression">
      <components>
        <component xsi:type="ns3:variable">
          <name>database</name>
        </component>
      </components>
    </value>
  </values>
</argument>
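Combining these differences with the blastp description, the command-line pattern and inputs of blastn.xml might look like the sketch below. It rests on several assumptions that this guide does not spell out: that the -p value becomes blastn, that the selenium and lead databases were formatted in /tmp/isExample/blastDBs, and that the query input carries over unchanged from the blastp example. The fragment sits inside the ns3:application root element, which declares the ns3 prefix.

```xml
<commandLinePattern>
  <arguments>
    <!-- Assumed: fixed program selection, as for blastp but with blastn. -->
    <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
      <shortName>p</shortName>
      <values>
        <value xsi:type="ns3:variableExpression">
          <components>
            <component xsi:type="ns3:constant">
              <value>blastn</value>
            </component>
          </components>
        </value>
      </values>
    </argument>
    <!-- The database name is taken from the user-supplied "database" input. -->
    <argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
      <shortName>d</shortName>
      <values>
        <value xsi:type="ns3:variableExpression">
          <components>
            <component xsi:type="ns3:variable">
              <name>database</name>
            </component>
          </components>
        </value>
      </values>
    </argument>
  </arguments>
  <executable>blastall</executable>
  <!-- Assumed: the same database directory as the blastp example. -->
  <workingDirectory>/tmp/isExample/blastDBs/</workingDirectory>
</commandLinePattern>
<inputs>
  <!-- No renameTo: the variable above refers to this input by its own name. -->
  <input>
    <metadata>
      <name>database</name>
      <description>The database to search against. One of selenium or lead.</description>
    </metadata>
  </input>
  <!-- Assumed: the query is routed to standard input, as for blastp. -->
  <input>
    <metadata>
      <name>query</name>
    </metadata>
    <renameTo>#STDIN</renameTo>
  </input>
</inputs>
```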
If an option takes multiple values, then supply multiple value elements.
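A minimal sketch of an option with two values follows. The short name x and the constants first and second are hypothetical, chosen only to show the shape of the values list; how the values are laid out on the final command line is not specified in this guide.

```xml
<argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
  <!-- Hypothetical option name, for illustration only. -->
  <shortName>x</shortName>
  <values>
    <!-- First value of the option. -->
    <value xsi:type="ns3:variableExpression">
      <components>
        <component xsi:type="ns3:constant">
          <value>first</value>
        </component>
      </components>
    </value>
    <!-- Second value of the option. -->
    <value xsi:type="ns3:variableExpression">
      <components>
        <component xsi:type="ns3:constant">
          <value>second</value>
        </component>
      </components>
    </value>
  </values>
</argument>
```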
It is possible to include a list of components for a single value. These can each be variables or constants. When the application is run, the variables will be replaced with their bindings, and the results will be concatenated together. So, if you had the constant foo followed by the variable bar and then the constant baz, and an invocation bound bar to PIDGEN, then the option would expand out to fooPIDGENbaz.
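The foo, bar, baz example could be written as the sketch below. The option short name x is hypothetical; the component types are the same constant and variable types used in the blast examples. With bar bound to PIDGEN, the three components concatenate in order to give fooPIDGENbaz.

```xml
<argument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns3:option">
  <!-- Hypothetical option name, for illustration only. -->
  <shortName>x</shortName>
  <values>
    <value xsi:type="ns3:variableExpression">
      <components>
        <!-- Constant prefix. -->
        <component xsi:type="ns3:constant">
          <value>foo</value>
        </component>
        <!-- Replaced by whatever the invocation binds to "bar". -->
        <component xsi:type="ns3:variable">
          <name>bar</name>
        </component>
        <!-- Constant suffix. -->
        <component xsi:type="ns3:constant">
          <value>baz</value>
        </component>
      </components>
    </value>
  </values>
</argument>
```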