eRez Imaging Server 3.2

Configuring Plug-Ins

 

This document describes how the Plug-Ins bundled with eRez Imaging Server 3.0 and later can be customized and configured.

 

Table of contents

Configuring Plug-Ins

Table of contents

Configuring the Generic Plug-In

Location of the files

Adding a new format

Activating the new formats

Configuring the PS plug-in

Obtaining Ghostscript

Obtaining XPDF

Location of the files

Configuring Ghostscript

Configuring PDFtoText

 

Configuring the Generic Plug-In

 

This section describes how the Generic Plug-In bundled with eRez Imaging Server 3.0 and later can be used to add support for new file formats.

 

Location of the files

 

The Plug-In code and support files are located inside the webapps/erez3/WEB-INF/plugins/generic folder in the Tomcat folder.

 

The folder “icons” contains small TIFF images used to represent the various file formats when eRez displays them. You will find icons for a small set of common formats in there already.

 

The file “plugin.xml” contains information for use by the eRez Imaging Server as well as a description of each of the file formats. This file can be edited with a standard text editor.

 

Adding a new format

 

To add support for a new file format, insert a new line like this into “plugin.xml”:

 

<format name="StuffIt Archive" extension=".sit" mimetype="application/x-stuffit" preview="icons/stuffit.tif"/>

 

The easiest way to do this is to copy an existing format and then modify the value of the attributes. The following attributes can be specified:

 

Name          Description

name          The name of the format as displayed by the eRez Imaging Server.

extension     The file name extension used to detect this format, including the leading “.”. Example: “.doc”.

mimetype      The MIME type used to identify this format, as defined by IANA. You can use the MIME type “application/octet-stream” if you don’t know the MIME type for the file format.

preview       Relative link to the thumbnail TIFF file to use. Example: icons/stuffit.tif

ole2format    If this attribute is present and its value is “true”, the Plug-In will try to read metadata from the file as an OLE2 document and present it as IPTC data. This makes it possible to extract metadata from formats such as Microsoft Office documents and display it as IPTC information.

xmp           If this attribute is present and its value is “true”, the Plug-In will try to locate XMP metadata in the file. If a thumbnail image is present, it will be used to generate a preview, and the IPTC-compatible metadata will be displayed and indexed for searching.
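
Before restarting the server it can be useful to check the edited entries for simple mistakes. The following Python sketch is not part of eRez; it only illustrates the attribute rules described above. It assumes that the format elements sit somewhere below a single root element (the root name "plugin" in the sample is a guess), and it treats name, extension, mimetype and preview as required, which this document does not state explicitly. The “Word Document” entry and the icon “icons/word.tif” are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: these four attributes are treated as required by this check.
REQUIRED = ("name", "extension", "mimetype", "preview")


def check_formats(xml_text):
    """Return a list of human-readable problems found in <format> entries."""
    problems = []
    root = ET.fromstring(xml_text)
    for fmt in root.iter("format"):
        label = fmt.get("name") or fmt.get("extension") or "<unnamed>"
        for attr in REQUIRED:
            if not fmt.get(attr):
                problems.append(f"{label}: missing attribute '{attr}'")
        ext = fmt.get("extension", "")
        if ext and not ext.startswith("."):
            problems.append(f"{label}: extension '{ext}' needs a leading '.'")
    return problems


# Hypothetical sample: the second entry lacks the leading "." in its extension.
SAMPLE = """<plugin>
  <format name="StuffIt Archive" extension=".sit" mimetype="application/x-stuffit" preview="icons/stuffit.tif"/>
  <format name="Word Document" extension="doc" mimetype="application/msword" preview="icons/word.tif" ole2format="true"/>
</plugin>"""

for problem in check_formats(SAMPLE):
    print(problem)
```

To check a real file, read its contents and pass the text to check_formats instead of SAMPLE.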

Activating the new formats

 

In order for your new file formats to become active you must restart the eRez Server.


 

Configuring the PS plug-in

 

This section describes how the PS Plug-In bundled with eRez Imaging Server 3.0 and later can be configured to create previews for EPS, PDF and Adobe Illustrator documents and include text from PDF documents in the searchable index.

 

These features require installation of two freely available external applications:

 

1)     Ghostscript – used for rendering documents for preview.

2)     XPDF – used to extract text from PDF documents.

 

For legal reasons we cannot bundle these applications with eRez, so you will need to download and install them separately.

Obtaining Ghostscript

 

Ghostscript is a set of software built around an interpreter for the PostScript language and the Portable Document Format (PDF).

In simple terms, this means that Ghostscript can read a PostScript or PDF file and display the result on the screen or convert it into a form you can print on a non-PostScript printer. Together with one of several popular previewers, Ghostscript lets you view or print an entire document or isolated pages, even if your computer doesn't have Display PostScript and your printer doesn't handle PostScript itself.

 

Ghostscript is available in several versions and for many platforms. These links may help you find the right version for your platform:

  1. http://www.ghostscript.com/ Here you will find the freely available (AFPL) release of Ghostscript as platform-independent source code as well as precompiled binaries for Windows.

  2. http://sourceforge.net/projects/espgs A GNU Ghostscript distribution with Linux RPMs.

  3. http://ii2.sourceforge.net/ “i-Installer Home Page” - a network-aware installer application for Mac OS X 10.2 or higher which can be used to install Ghostscript 8 very easily.

 

You can install Ghostscript anywhere on your system.

 

Obtaining XPDF

 

Xpdf is an open source viewer for Portable Document Format (PDF) files. (These are sometimes also called 'Acrobat' files, from the name of Adobe's PDF software.) The Xpdf project also includes a PDF text extractor, a PDF-to-PostScript converter, and various other utilities.

 

Xpdf runs under the X Window System on UNIX, VMS, and OS/2. The non-X components (pdftops, pdftotext, etc.) also run on Win32 systems and should run on pretty much any system with a decent C++ compiler.

 

Xpdf is designed to be small and efficient. It can use Type 1, TrueType, or standard X fonts.

 

Xpdf should work on pretty much any system which runs X11 and has Unix-like (POSIX) libraries. You'll need ANSI C++ and C compilers to compile it. The Xpdf author asks to hear about builds on systems not listed on the Xpdf web page and offers help if it does not compile on your system.

 

The PS Plug-in only requires the pdftotext tool.

 

The home of XPDF with binaries for common platforms such as Windows, Linux, Solaris and Mac OS X is here: http://www.foolabs.com/xpdf/home.html.

 

You can install XPDF anywhere on your system.


 

Location of the files

 

The Plug-In code and support files are located inside the webapps/erez3/WEB-INF/plugins/ps folder in the Tomcat folder.

 

 

The file “plugin.xml” contains information for use by the eRez Imaging Server as well as the settings for using Ghostscript and XPDF. This is what the default plugin.xml looks like:

 

<?xml version="1.0" encoding="UTF-8"?>
<plugin class="com.yawah.erez.plugins.ps.PSPlugIn">
      <ghostscript
            enabled="false"
            executable="C:\gs\gs8.14\bin\gswin32c.exe"
            timeout="300"
            >
            <parameter value="-r150"/>
            <parameter value="-dQUIET"/>
            <parameter value="-dSAFER"/>
            <parameter value="-dBATCH"/>
            <parameter value="-dNOPAUSE"/>
            <parameter value="-dNOPROMPT"/>
            <parameter value="-sDEVICE=tiff24nc"/>
            <parameter value="-dUseCIEColor"/>
            <parameter value="-dTextAlphaBits=4"/>
            <parameter value="-dGraphicsAlphaBits=4"/>
            <parameter value="-dEPSCrop"/>
      </ghostscript>

      <pdftotext
            enabled="false"
            executable="C:\Program Files\xpdf-3.00-win32\pdftotext.exe"
            timeout="300"
            >
            <parameter value="-q"/>
      </pdftotext>
</plugin>

 

The ghostscript and pdftotext elements are configured separately, as described in the following sections.

 


Configuring Ghostscript

 

The ghostscript element contains the three attributes “enabled”, “executable” and “timeout”, followed by a series of Ghostscript parameters. To enable Ghostscript you must change the value of “enabled” to “true” and set “executable” to point to your Ghostscript application as shown below:

 

      <ghostscript
            enabled="true"
            executable="C:\gs\gs8.14\bin\gswin32c.exe"
            timeout="300"
            >
            <parameter value="-r150"/>
            <parameter value="-dQUIET"/>
            <parameter value="-dSAFER"/>
            <parameter value="-dBATCH"/>
            <parameter value="-dNOPAUSE"/>
            <parameter value="-dNOPROMPT"/>
            <parameter value="-sDEVICE=tiff24nc"/>
            <parameter value="-dUseCIEColor"/>
            <parameter value="-dTextAlphaBits=4"/>
            <parameter value="-dGraphicsAlphaBits=4"/>
            <parameter value="-dEPSCrop"/>
      </ghostscript>

 

When the PS Plug-in calls the Ghostscript application, it waits for a maximum of “timeout” seconds for Ghostscript to complete the rendering. If Ghostscript has not completed within that period, it is terminated by the Plug-in.
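
The timeout behaviour can be illustrated with a short Python sketch. This is not the Plug-in's own code; it only shows the general pattern of starting an external program, waiting a limited time and terminating it when the limit is exceeded. The function name run_with_timeout is made up for this example, and a sleeping Python process stands in for a slow Ghostscript run.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_seconds):
    """Run a command and return its exit code, or None if it timed out."""
    try:
        completed = subprocess.run(
            args, capture_output=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this error.
        return None
    return completed.returncode


# Stand-in for a rendering job that takes longer than the allowed time.
slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(slow_job, 0.5))  # prints None
```

With the default configuration the limit would be 300 seconds instead of the 0.5 seconds used here to keep the demonstration short.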

 

If Ghostscript encounters an error or is terminated by the Plug-in, a log file is saved in the hidden folder “.erez” as “.erez/{filename}/GhostScriptLog.txt”, where {filename} represents the name of the EPS, PDF or Illustrator file.

As long as the file “GhostScriptLog.txt” is present, no further attempt is made to render a preview, so remove it when you want the Plug-in to try again. You can inspect the file to see the output from Ghostscript, including possible error messages. This is handy when experimenting with parameters.
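
When experimenting with parameters it helps to know which files currently have a log file blocking further attempts. The Python sketch below is a hypothetical helper, not part of eRez. It assumes only that the “.erez” folders live somewhere below the folder you pass in, and it looks for the “.erez/{filename}/GhostScriptLog.txt” layout described above.

```python
import os


def find_ghostscript_logs(root):
    """Yield paths matching .erez/<filename>/GhostScriptLog.txt below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # dirpath is ".../.erez/<filename>" when its parent folder is ".erez".
        parent = os.path.basename(os.path.dirname(dirpath))
        if parent == ".erez" and "GhostScriptLog.txt" in filenames:
            yield os.path.join(dirpath, "GhostScriptLog.txt")


# List the log files below the current folder.
for log_path in find_ghostscript_logs("."):
    print(log_path)
```

Replace "." with the folder that holds your images. The sketch only lists the files; inspect or delete them by hand.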

 

The Ghostscript parameters are passed to Ghostscript in the order they appear in the XML file, followed by the parameters "-sOutputFile={tmpfile} {srcfile}", where {tmpfile} represents the path and filename of a temporary TIFF file and {srcfile} the path and filename of the original EPS, PDF or Illustrator file.
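
To see what the resulting command line looks like, for example to reproduce a failure by hand, the ordering rule can be sketched in Python. This is an illustration under stated assumptions and not the Plug-in's own code: the function name ghostscript_command and both file paths are hypothetical, and the sample element is a shortened version of the default configuration.

```python
import xml.etree.ElementTree as ET


def ghostscript_command(element_xml, tmpfile, srcfile):
    """Build the argument list: executable, parameters in file order, then
    -sOutputFile={tmpfile} and {srcfile}."""
    gs = ET.fromstring(element_xml)
    args = [gs.get("executable")]
    args += [p.get("value") for p in gs.findall("parameter")]
    args += [f"-sOutputFile={tmpfile}", srcfile]
    return args


# Shortened sample of the ghostscript element from plugin.xml.
SAMPLE = r"""<ghostscript enabled="true"
    executable="C:\gs\gs8.14\bin\gswin32c.exe" timeout="300">
    <parameter value="-r150"/>
    <parameter value="-sDEVICE=tiff24nc"/>
</ghostscript>"""

# Hypothetical temporary and source file paths.
command = ghostscript_command(
    SAMPLE, r"C:\temp\preview.tif", r"C:\images\logo.eps"
)
for argument in command:
    print(argument)
```

The printed arguments, joined with spaces, form a command you can try in a console window to see the Ghostscript output directly.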

 

The example configuration that comes with the Plug-in is optimized for AFPL Ghostscript version 8.14. Depending on the Ghostscript version you are using, you may need to remove some of the parameters. In particular, the parameters “TextAlphaBits”, “GraphicsAlphaBits” and “EPSCrop” may not be supported by earlier versions of Ghostscript and may cause Ghostscript to fail. Likewise, you may want to add extra parameters to take advantage of features implemented after this document was written.

 


Configuring PDFtoText

 

The pdftotext element contains the three attributes “enabled”, “executable” and “timeout”, followed by a series of pdftotext parameters. To enable pdftotext you must change the value of “enabled” to “true” and set “executable” to point to your pdftotext application as shown below:

 

      <pdftotext
            enabled="true"
            executable="C:\Program Files\xpdf-3.00-win32\pdftotext.exe"
            timeout="300"
            >
            <parameter value="-q"/>
      </pdftotext>

 

When the PS Plug-in calls the pdftotext application, it waits for a maximum of “timeout” seconds for pdftotext to complete the text extraction. If pdftotext has not completed within that period, it is terminated by the Plug-in.

 

The pdftotext parameters are passed to pdftotext in the order they appear in the XML file followed by the parameters "{srcfile} -" where {srcfile} represents the path and filename of the original PDF file. The final "-" instructs pdftotext to send the text output to the console where it is captured by the Plug-in.
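
The same call can be tried outside eRez to check that pdftotext is installed correctly. The Python sketch below is an illustration, not the Plug-in's own code: the function names are made up, "manual.pdf" is a hypothetical file, and the executable is assumed to be reachable as "pdftotext". The text is returned as raw bytes because its encoding depends on how pdftotext is configured.

```python
import subprocess


def pdftotext_command(executable, parameters, srcfile):
    """Build the argument list: parameters in order, then {srcfile} and "-"."""
    return [executable, *parameters, srcfile, "-"]


def extract_text(executable, parameters, srcfile, timeout_seconds=300):
    """Run pdftotext and return what it wrote to standard output (bytes)."""
    result = subprocess.run(
        pdftotext_command(executable, parameters, srcfile),
        capture_output=True,
        timeout=timeout_seconds,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Show the command for a hypothetical PDF file without running it.
print(pdftotext_command("pdftotext", ["-q"], "manual.pdf"))
# prints ['pdftotext', '-q', 'manual.pdf', '-']
```

Calling extract_text with the path to your own pdftotext executable and a real PDF file returns the extracted text, or raises an error if the tool fails or exceeds the time limit.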