Longhorn

From Okapi Framework
Revision as of 17:56, 13 August 2018 by Kuro (talk | contribs) (Java version requirement updated)

Overview

Longhorn is a server application that allows you to execute Batch Configurations remotely on any set of input files. Batch Configurations, which include pre-defined pipelines and filter configurations, can be exported from Rainbow.

The distribution also includes a client library to access the Longhorn Web services.

Download and Installation

To install Longhorn:

  • Unzip the distribution file on your server.
  • Follow the instructions provided with the readme file of the distribution.
  • Starting with M36, Longhorn requires Java 1.8.

Functionality

To process files with Longhorn these steps are required:

  1. Create a temporary project
  2. Upload a Batch Configuration file into that project
  3. Upload the input files into that project
  4. Execute the project
  5. Download the output files
  6. Delete the project
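
As a rough illustration, the six steps can be mapped onto the REST calls described under REST-Interface. The sketch below is not part of the Longhorn distribution: the function name process_with_longhorn and its parameters are hypothetical, and http is assumed to be any object with requests-style post, get and delete methods (for example a requests.Session). The form field names are taken from the curl examples on this page. Error handling is left out.

```python
def process_with_longhorn(http, base_url, bconf_path, input_path, input_name):
    """Run the six steps for one input file and return the zipped output as bytes.

    `http` is assumed to offer requests-style post/get/delete methods.
    """
    # 1. Create a temporary project; its URI is returned in the Location header.
    created = http.post(base_url.rstrip("/") + "/projects/new")
    project = created.headers["Location"].rstrip("/")
    try:
        # 2. Upload the Batch Configuration as multipart/form-data.
        with open(bconf_path, "rb") as bconf:
            http.post(project + "/batchConfiguration",
                      files={"batchConfiguration": bconf})
        # 3. Upload the input file under the name it should have in the project.
        with open(input_path, "rb") as source:
            http.post(project + "/inputFiles/" + input_name,
                      files={"inputFile": source})
        # 4. Execute the Batch Configuration on the uploaded files.
        http.post(project + "/tasks/execute")
        # 5. Download all output files as one zip archive.
        return http.get(project + "/outputFiles.zip").content
    finally:
        # 6. Delete the project, even if an earlier step raised an exception.
        http.delete(project)
```

With the requests package installed, a call might look like process_with_longhorn(requests.Session(), 'http://localhost:8080/okapi-longhorn', 'batchconfig.bconf', 'help.html', 'help.html').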

Usage

There are three ways to access Longhorn's functionality:

  • a REST interface,
  • a Java API and
  • an HTML client.

They can be used as described below.

REST-Interface

Longhorn can be accessed directly via HTTP methods:

POST http://{host}/okapi-longhorn/projects/new 
Creates a new temporary project and returns its URI (e.g. http://localhost/okapi-longhorn/projects/1) in the Location header of the response.
POST http://{host}/okapi-longhorn/projects/1/batchConfiguration 
Uploads a Batch Configuration file (must be given as multipart/form-data)
POST http://{host}/okapi-longhorn/projects/1/inputFiles.zip 
Adds input files as a zip archive (the zip will be extracted and the included files will be used as input files)
PUT http://{host}/okapi-longhorn/projects/1/inputFiles/help.html 
Uploads a file that will have the name 'help.html'
GET http://{host}/okapi-longhorn/projects/1/inputFiles/help.html
Retrieves an input file that was previously added with PUT or POST
POST http://{host}/okapi-longhorn/projects/1/tasks/execute 
Executes the Batch Configuration on the uploaded input files
POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US/de-DE 
Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and the target language set to 'de-DE'
POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US?targets=de-DE&targets=fr-FR 
Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and multiple target languages, 'de-DE' and 'fr-FR'
GET http://{host}/okapi-longhorn/projects/1/outputFiles 
Returns a list of the output files generated
GET http://{host}/okapi-longhorn/projects/1/outputFiles/help.out.html 
Accesses the output file 'help.out.html' directly
GET http://{host}/okapi-longhorn/projects/1/outputFiles.zip 
Returns all output files in a zip archive
DELETE http://{host}/okapi-longhorn/projects/1
Deletes the project
GET http://{host}/okapi-longhorn/projects 
Returns a list of all projects on the server
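
The three forms of the execute call differ only in how the languages are encoded in the URL. As a small sketch, the helper below (execute_url is a hypothetical name, not part of any Longhorn library) builds each form from a project URI. The listing shows no form with a source language but no target, so this sketch rejects that combination.

```python
from urllib.parse import quote, urlencode


def execute_url(project_url, source=None, targets=()):
    """Build the URL for the execute call of one project."""
    url = project_url.rstrip("/") + "/tasks/execute"
    if source is None:
        # No languages given: plain execute call.
        return url
    if not targets:
        raise ValueError("a source language needs at least one target language")
    url += "/" + quote(source, safe="")
    if len(targets) == 1:
        # One target: both languages are path segments.
        return url + "/" + quote(targets[0], safe="")
    # Several targets: one repeated 'targets' query parameter per language.
    return url + "?" + urlencode([("targets", target) for target in targets])


project = "http://localhost:8080/okapi-longhorn/projects/1"
print(execute_url(project, "en-US", ["de-DE", "fr-FR"]))
# prints http://localhost:8080/okapi-longhorn/projects/1/tasks/execute/en-US?targets=de-DE&targets=fr-FR
```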

REST-Interface Sample code: Python

This example uses the requests package; minidom is used to parse the XML project list.

   import requests
   from xml.dom import minidom
   url = 'http://localhost:8080/okapi-longhorn/'

Code to create a new project

   r = requests.post(url+'projects/new')
   print(r.text)
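
Since the URI of the new project is returned in the Location header, the project ID can also be read from the response itself. The following is a minimal sketch; project_id_from_location is a hypothetical helper name, and it assumes the ID is the last path segment of the URI, as in the example URI given above.

```python
from urllib.parse import urlparse


def project_id_from_location(location):
    """Return the last path segment of a project URI as the project ID."""
    return urlparse(location).path.rstrip("/").rsplit("/", 1)[-1]


# With the response r from the call above, this would be:
#   project_id = project_id_from_location(r.headers['Location'])
print(project_id_from_location("http://localhost/okapi-longhorn/projects/1"))  # prints 1
```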

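Code to list existing projects (e.g. to check whether the project was created, and to get the ID of the last project)

   r = requests.get(url+'projects/')
   xmlstring = minidom.parseString(r.text)
   itemlist = xmlstring.getElementsByTagName('e')
   # Assumes project IDs are assigned sequentially and no project has been deleted,
   # so that the number of listed projects equals the ID of the newest one
   lastproject = len(itemlist)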


Code to post a batch config file

   batchfile = open('/home/user/batchconfig.bconf', 'rb')
   r = requests.post(url+'projects/'+str(lastproject)+'/batchConfiguration', files=dict(batchConfiguration=batchfile))

Code to put a string as a file

   payload = "hello world!"
   r = requests.put(url+'projects/'+str(lastproject)+'/inputFiles/test.txt', files=dict(inputFile=payload))

Code to post a file

   payload = open('/home/user/test.txt', 'rb')
   r = requests.post(url+'projects/'+str(lastproject)+'/inputFiles/test.txt', files=dict(inputFile=payload))
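
For several input files, the inputFiles.zip call listed above takes a zip archive instead. The sketch below only builds such an archive in memory. zip_input_files is a hypothetical helper, and storing each file under its base name is a choice made for this example. The upload itself is left out because the multipart field name expected by inputFiles.zip is not stated on this page.

```python
import io
import os
import zipfile


def zip_input_files(paths):
    """Return the bytes of a zip archive containing the given files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store each file under its base name, without local directories.
            archive.write(path, arcname=os.path.basename(path))
    return buffer.getvalue()
```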

REST-Interface access by curl

The Longhorn API can also be accessed with the curl command, as shown below. The '-X GET' option is not necessary but is included for clarity.


   $ curl -X POST -i  'http://localhost:8080/okapi-longhorn/projects/new'

This creates a new project. The "Location:" header in the response shows the URL for managing this project.


   $ curl -X POST -F batchConfiguration=@tmp/batchconfig.bconf 'http://localhost:8080/okapi-longhorn/projects/1/batchConfiguration'

This submits the batch config file exported from Rainbow.


   $ curl -X POST -F inputFile=@tmp/commonmark_original.md 'http://localhost:8080/okapi-longhorn/projects/1/inputFiles/commonmark.md'

This sends a raw file to be extracted.


   $ curl -X POST 'http://localhost:8080/okapi-longhorn/projects/1/tasks/execute'

Longhorn executes the project after receiving this.


   $ curl -X GET 'http://localhost:8080/okapi-longhorn/projects/1/outputFiles'

Longhorn returns the list of output files. For example:

   <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
   <l>
    <e>pack1/manifest.rkm</e>
    <e>pack1/original/commonmark.md</e>
    <e>pack1/work/commonmark.md.xlf</e>
   </l>
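
In a script, a list like the one above can be turned into plain file names with the standard library. This is a sketch only: parse_file_list is a hypothetical name, and the element name 'e' is taken from the sample response shown here.

```python
import xml.etree.ElementTree as ET


def parse_file_list(xml_bytes):
    """Return the text of every <e> element in a Longhorn list response."""
    root = ET.fromstring(xml_bytes)
    return [entry.text for entry in root.iter("e")]


# The sample response from above, as raw bytes (e.g. r.content with requests).
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b"<l>"
    b"<e>pack1/manifest.rkm</e>"
    b"<e>pack1/original/commonmark.md</e>"
    b"<e>pack1/work/commonmark.md.xlf</e>"
    b"</l>"
)
print(parse_file_list(SAMPLE))
# prints ['pack1/manifest.rkm', 'pack1/original/commonmark.md', 'pack1/work/commonmark.md.xlf']
```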


   $ curl -X GET 'http://localhost:8080/okapi-longhorn/projects/1/outputFiles/pack1/work/commonmark.md.xlf'

You can obtain one of the output files like this.


   $ curl -X GET -o out.zip 'http://localhost:8080/okapi-longhorn/projects/1/outputFiles.zip'

Or get all the output files in a zip file.
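
Once downloaded, the archive can be unpacked with the Python standard library. A minimal sketch follows. extract_outputs is a hypothetical helper that takes the archive as bytes, for example the contents of out.zip.

```python
import io
import zipfile


def extract_outputs(zip_bytes, dest_dir):
    """Unpack a downloaded outputFiles.zip into dest_dir and return the entry names."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```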


   $ curl -X DELETE 'http://localhost:8080/okapi-longhorn/projects/1'

This deletes the project.


Java API

The API is distributed as a .jar file in the Longhorn distribution package. You can also build it from the Okapi source code via Maven from the project lib-longhorn-api.

Maven

The API is available as a Maven dependency. Add this repository to your pom.xml:

   <repository>
       <id>okapi-longhorn-release</id>
       <name>Okapi Longhorn Release</name>
       <url>http://repository-opentag.forge.cloudbees.com/release/</url>
   </repository>

Then add this dependency, substituting a valid version number (e.g., 0.27) for ${okapi.version}:

   <dependency>
     <groupId>net.sf.okapi.lib</groupId>
     <artifactId>okapi-lib-longhorn-api</artifactId>
     <version>${okapi.version}</version>
   </dependency>

Sample Code

LonghornService ws = new RESTService(new URI("http://localhost:9095/okapi-longhorn"));

// Create project
LonghornProject proj = ws.createProject();

// Post batch configuration
File bconfFile = new File("C:\\setup.bconf");
proj.addBatchConfiguration(bconfFile);

// Send input files

// First by single upload...
File file1 = new File("C:\\help.html");
// * in the root directory
proj.addInputFile(file1, file1.getName());
// * and in a sub-directory
proj.addInputFile(file1, "samefile/" + file1.getName());

// ...then by package upload
File inputPackage = new File("C:\\more_files.zip");
proj.addInputFilesFromZip(inputPackage);

// Execute pipeline
// Either without specifying languages...
proj.executePipeline();
// ...or with a source and a target language
proj.executePipeline("en-US", "de-DE");

// Get output files
ArrayList<LonghornFile> outputFiles = proj.getOutputFiles();

// Fetch each output file
for (LonghornFile of : outputFiles) {
	// try-with-resources closes the stream when the block ends
	try (InputStream is = of.openStream()) {
		//TODO save InputStream to local file
	}
}

// Delete project
proj.delete();

HTML-Client

You can create projects and upload/download files via an integrated HTML client, too. Uploading input files (and downloading output files) as a zip archive is currently not implemented for the HTML client.

[Screenshot: Longhorn HTML client]

Configuration

Since Okapi M22, Longhorn can be built so that multiple instances run in one JBoss application server. To do so, call the build with an additional parameter:

mvn clean verify -DuseUniqueContextRoot

Configure working directory path

Longhorn offers two ways to configure its working directory (listed in order of priority):

  1. the system parameter "LONGHORN_WORKDIR"
  2. the configuration file "okapi-longhorn-configuration.xml" in user.home

If neither is defined, the working directory is the folder "Okapi-Longhorn-Files" in user.home. Example of a Longhorn configuration file:

<longhorn-config>
    <use-unique-working-directory>True</use-unique-working-directory>
    <working-directory>D:\testData\longhorn-files</working-directory>
</longhorn-config>

Configuration Options

option | description | data type
working-directory | path of the working directory | string
use-unique-working-directory | if set to true, the version of Longhorn is added to the working directory name, e.g. path/to/working/directory_M0.21 | boolean (True or False)
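
To make the lookup order concrete, here is a sketch of the resolution described above. It is written in Python for illustration and is not Longhorn's actual code. resolve_workdir and its parameters are hypothetical, the caller passes in the value of LONGHORN_WORKDIR and the text of the configuration file, and the "_" plus version suffix follows the example in the table. How use-unique-working-directory interacts with the system parameter or the default folder is not stated here, so this sketch applies it only to the configured path.

```python
import os
import xml.etree.ElementTree as ET

# The configuration file example from this page (raw string keeps the backslashes).
EXAMPLE_CONFIG = r"""<longhorn-config>
    <use-unique-working-directory>True</use-unique-working-directory>
    <working-directory>D:\testData\longhorn-files</working-directory>
</longhorn-config>"""


def resolve_workdir(system_param, config_xml, user_home, version):
    """Pick the working directory following the documented priority order."""
    # 1. Highest priority: the LONGHORN_WORKDIR system parameter.
    if system_param:
        return system_param
    # 2. Next: the working-directory entry of the configuration file.
    if config_xml:
        root = ET.fromstring(config_xml)
        workdir = (root.findtext("working-directory") or "").strip()
        if workdir:
            flag = (root.findtext("use-unique-working-directory") or "").strip()
            if flag.lower() == "true":
                # Assumption: the version is appended as "_<version>".
                return workdir + "_" + version
            return workdir
    # 3. Fallback: the default folder in user.home.
    return os.path.join(user_home, "Okapi-Longhorn-Files")


print(resolve_workdir(None, EXAMPLE_CONFIG, "/home/user", "M0.21"))
# prints D:\testData\longhorn-files_M0.21
```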