JHOVE Distribution

JHOVE can be downloaded from the JHOVE SourceForge site.

1 Requirements

  1. Java J2SE 1.4.
    (JHOVE was originally implemented using the Sun J2SE SDK 1.4.1, and has also been tested under the Sun J2SE SDK 5.)
  2. Apache Ant, if you want to recompile the JHOVE source code.
    Note that the JAVA_HOME environment variable must be set correctly for Ant to work.
    (JHOVE was implemented and tested using Ant 1.5.1.)

JHOVE should be usable on any Unix, Windows, or OS X platform with the appropriate J2SE installation.

2 Distribution

JHOVE is distributed as GZIP-compressed TAR and ZIP files under the terms of the GNU Lesser General Public License (LGPL).

3 Third-party modules

Modules for additional file formats have been written by authors outside of the Harvard University Library (HUL). These are available for downloading. Please note that these modules are not supported by HUL.

4 Installation

The standard HUL-supported distribution packages can be unpacked with:

gunzip jhove-1_1.tar.gz
tar xvf jhove-1_1.tar

gunzip jhove-examples.tar.gz
tar xvf jhove-examples.tar

or

unzip jhove-1_1.zip
unzip jhove-example.zip

which will produce the following directory structure:

4.1 Directory Structure

jhove/
      COPYING                          # GNU Lesser General Public License
      LICENSE                          # JHOVE license information
      README
      RELEASENOTES                     # JHOVE release notes
      bin/
               jhove.jar
               jhove-handler.jar
               jhove-module.jar
               JhoveApp.jar
               JhoveView.jar
      build.xml
      classes/
               build.xml
               edu/harvard/hul/ois/jhove/
                                         build.xml
                                         *.class
                                         *.java
                                         handler/
                                                 build.xml
                                                 *.class
                                                 *.java
                                                 META-INF/
                                                          MANIFEST.MF
                                                 audit/
                                                     build.xml
                                                     *.class
                                                     *.java
                                         META-INF/
                                                  MANIFEST.MF
                                         module/
                                                 build.xml
                                                 *.class
                                                 *.java
                                                 META-INF/
                                                          MANIFEST.MF
                                                 aiff/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 gif/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 html/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                      xhtml-lat1.ent
                                                      xhtml-special.ent
                                                      xhtml-symbol.ent
                                                      xhtml1-frameset.dtd 
                                                      xhtml1-strict.dtd
                                                      xhtml1-transitional.dtd 
                                                      xhtml11-flat.dtd
                                                 iff/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 jpeg/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 jpeg2000/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 pdf/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 tiff/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 wave/
                                                      build.xml
                                                      *.class
                                                      *.java
                                                 xml/
                                                      build.xml
                                                      *.class
                                                      *.java
                                         viewer/
                                                 build.xml
                                                 *.class
                                                 *.java
                                                 META-INF/
                                                          MANIFEST.MF
               Jhove.*
               META-INF/
                        MANIFEST.MF
               ADump.*
               GDump.*
               JDump.*
               J2Dump.*
               PDump.*
               TDump.*
               WDump.*
      conf/
               jhove.conf
      doc/
               *.html
               ...
      examples/
               ascii/...
               gif/  ...
               pdf/  ...
               tiff/ ...
               utf-8/...
      adump
      adump.bat
      j2dump
      j2dump.bat
      jdump
      jdump.bat
      jhove
      jhove.bat
      pdump
      pdump.bat
      tdump
      tdump.bat
      wdump
      wdump.bat

4.2 Installation

After unpacking the distribution, edit the configuration file, jhove/conf/jhove.conf. Set the <jhoveHome> element to the absolute pathname of the JHOVE installation (home) directory, and the <tempDirectory> element to the directory in which temporary files are created:

<jhoveHome>jhove-installation-directory</jhoveHome>
<tempDirectory>temporary-directory</tempDirectory>

The JHOVE home directory is the top-most directory in the distribution TAR or ZIP file. On Unix systems, /var/tmp is an appropriate temporary directory; on Windows, C:\Temp. For example, if the distribution TAR file is unpacked on a Unix system in the directory "/users/sampleuser/projects", then the configuration file should read:

<jhoveHome>/users/sampleuser/projects/jhove</jhoveHome>
<tempDirectory>/var/tmp</tempDirectory>

In the JHOVE home directory, edit the JHOVE Bourne shell driver script, jhove, and set the JHOVE home directory, Java home directory, and Java interpreter:

JHOVE_HOME=jhove-home-directory
JAVA_HOME=java-home-directory
JAVA=java-interpreter

where JHOVE_HOME is the absolute pathname of the JHOVE home directory, JAVA_HOME is the absolute pathname of the Java home directory, and JAVA is the absolute pathname of the Java interpreter. For example:

JHOVE_HOME=/users/sampleuser/projects/jhove
JAVA_HOME=/usr/local/j2re1.4.1_02
JAVA=$JAVA_HOME/bin/java

On Unix systems with Perl installed, a Perl script is provided that sets the appropriate values without manual editing. From the JHOVE home directory, type:

configure.pl jhove-home-directory java-home-directory java-interpreter

This configures the jhove script, the various dump utilities, and conf/jhove.conf with the values you select. The script does not work on Windows systems.

For Windows systems, the DOS shell driver script, jhove.bat, should be edited to supply the correct values for the JHOVE home directory, Java home directory, and Java interpreter:

SET JHOVE_HOME=jhove-home-directory
SET JAVA_HOME=java-home-directory
SET JAVA=%JAVA_HOME%\bin\java

For example:

SET JHOVE_HOME="C:\Program Files\jhove"
SET JAVA_HOME="C:\Program Files\java\j2re1.4.1_02"
SET JAVA=%JAVA_HOME%\bin\java

The quotation marks are necessary because of the embedded space characters. On Windows platforms it may also be necessary to add the Java bin subdirectory to the System PATH environment variable:

PATH=C:\Program Files\java\j2re1.4.1_02\bin;...

Specific instructions on installing JHOVE in a Windows XP environment are available. For additional information on setting a Windows environment variable, consult your local documentation or system administrator.

4.3 Configuring JHOVE

At the time of its invocation, JHOVE dynamically configures its modules and output handlers from an XML-formatted configuration file. The configuration file is taken from the first valid value among:

  1. The -c config command line argument (only for the command-line interface);
  2. The file ${user.home}/jhove/conf/jhove.conf, where ${user.home} is the standard Java user.home property; or
  3. The edu.harvard.hul.ois.jhove.config property in the properties file ${user.home}/jhove/jhove.properties.

Note that the GUI interface only searches for the configuration file at the second and third locations listed above; it does not make use of the -c config option.
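
The lookup order above can be sketched in a few lines of Python. This is only an illustration, not JHOVE's actual code: the function names resolve_config and read_property are hypothetical, "valid" is assumed to mean "the file exists", and the properties parser handles only simple key=value lines.

```python
from pathlib import Path
from typing import Optional

CONFIG_PROPERTY = "edu.harvard.hul.ois.jhove.config"


def read_property(properties_file: Path, key: str) -> Optional[str]:
    """Return the value of `key` from a simple key=value file, or None."""
    if not properties_file.is_file():
        return None
    for line in properties_file.read_text(encoding="latin-1").splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def resolve_config(user_home: Path, cli_config: Optional[str] = None) -> Optional[Path]:
    """Return the first existing candidate, in the documented order.

    Assumption: a candidate is "valid" when the file exists.
    Pass cli_config=None to mimic the GUI, which has no -c option.
    """
    candidates = []
    if cli_config is not None:
        candidates.append(Path(cli_config))                      # 1. -c config
    candidates.append(user_home / "jhove" / "conf" / "jhove.conf")  # 2. default file
    prop = read_property(user_home / "jhove" / "jhove.properties", CONFIG_PROPERTY)
    if prop:
        candidates.append(Path(prop))                            # 3. property value
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling resolve_config(Path.home()) corresponds to the GUI case; passing the value of -c as cli_config corresponds to the command-line case.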

All format modules and output handlers must be specified in the XML-formatted configuration file, which can be validated against the XML Schema <http://hul.harvard.edu/ois/xml/xsd/jhove/jhoveConfig.xsd>. (In the following display, brackets [ and ] enclose optional configuration file elements.)

<?xml version="1.0"?>
<jhoveConfig version="1.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns="http://hul.harvard.edu/ois/xml/ns/jhove/jhoveConfig"
 xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/jhove/jhoveConfig
                     http://hul.harvard.edu/ois/xml/xsd/jhove/jhoveConfig.xsd">
  <jhoveHome>jhove-home-directory</jhoveHome>
[ <defaultEncoding>encoding</defaultEncoding> ]
[ <tempDirectory>directory</tempDirectory> ]
[ <bufferSize>buffer</bufferSize> ]
[ <mixVersion>version</mixVersion> ]
[ <sigBytes>n</sigBytes> ]
  <module>
    <class>module-class-name</class>
  [ <init>optional-module-init-argument</init> ]
  [ <param>optional-module-parameter</param> ]
    ...
  </module>
  ...
  <outputHandler>
    <class>output-handler-class-name</class>
  </outputHandler>
  ...
[ <logLevel>logging-level</logLevel> ]
</jhoveConfig>

The optional <defaultEncoding> element specifies the default character encoding used by output handlers. This option can also be specified by the -e encoding command line argument. The default output encoding is UTF-8.

The optional <tempDirectory> element specifies the pathname of the directory in which temporary files are created. This option can also be specified by the -t directory command line argument. On most Unix systems, a reasonable temporary directory is "/var/tmp"; on Windows, "C:\temp".

The optional <bufferSize> element specifies the buffer size used for buffered I/O. This option can also be specified by the -b buffer command line argument.

The optional <mixVersion> element specifies the MIX schema version conformance for the output produced by the XML output handler. By default the handler output conforms to version 0.2 of the schema. For version 1.0 conformance, specify:

<mixVersion>1.0</mixVersion>

The optional <sigBytes> element specifies the maximum number of bytes that JHOVE modules will examine looking for an internal signature (or magic number). The default value is 1024.
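
As a rough illustration of what a signature limit means, the Python sketch below looks for a byte signature only within a leading window of the data. The function name has_signature is hypothetical, and the assumption that the whole signature must lie inside the window is a simplification; the actual matching logic of each JHOVE module may differ.

```python
def has_signature(data: bytes, signature: bytes, sig_bytes: int = 1024) -> bool:
    """Return True if `signature` lies entirely within the first `sig_bytes` bytes.

    The default of 1024 mirrors the documented <sigBytes> default.
    """
    return signature in data[:sig_bytes]


# Example: a PDF header ("%PDF-") preceded by 40 bytes of leading junk.
sample = b"junk" * 10 + b"%PDF-1.4\n"
found_default = has_signature(sample, b"%PDF-")               # window of 1024 bytes
found_small = has_signature(sample, b"%PDF-", sig_bytes=16)   # window too small
```

With the default window the signature is found; with a 16-byte window it is not, because the header starts at offset 40.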

The optional <logLevel> element specifies the logging level, used by calls to the logging API. This option can also be specified by the -l log-level command line argument. The default is SEVERE.
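
To make the defaults above concrete, the following Python sketch reads a configuration document and reports the effective value of each top-level option. The function name effective_settings is made up for illustration; the defaults are the ones stated in the text, and tempDirectory and bufferSize are left unset because no default is given here.

```python
import xml.etree.ElementTree as ET

# Namespace used by the configuration file shown above.
NS = {"j": "http://hul.harvard.edu/ois/xml/ns/jhove/jhoveConfig"}

# Defaults as documented in the preceding paragraphs.
DEFAULTS = {
    "defaultEncoding": "UTF-8",
    "mixVersion": "0.2",
    "sigBytes": "1024",
    "logLevel": "SEVERE",
}

OPTION_NAMES = (
    "jhoveHome", "defaultEncoding", "tempDirectory",
    "bufferSize", "mixVersion", "sigBytes", "logLevel",
)


def effective_settings(config_xml: str) -> dict:
    """Return top-level option values, falling back to the documented defaults."""
    root = ET.fromstring(config_xml)
    settings = dict(DEFAULTS)
    for name in OPTION_NAMES:
        value = root.findtext(f"j:{name}", namespaces=NS)
        if value is not None:
            settings[name] = value.strip()
    return settings
```

Command-line arguments such as -e, -t, -b, and -l are not modelled here; the sketch only covers what the file itself supplies.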

All class names must be fully qualified with their package name, for example:

edu.harvard.hul.ois.jhove.module.AsciiModule
edu.harvard.hul.ois.jhove.module.PdfModule
edu.harvard.hul.ois.jhove.module.TiffModule
edu.harvard.hul.ois.jhove.module.Utf8Module

The order in which format modules are defined is important: when performing a format identification operation, JHOVE searches for a matching module in the order in which the modules are defined in the configuration file. In general, the modules for more generic formats should come later in the list. For example, the standard ASCII module should be defined before the UTF-8 module, since all ASCII objects are, by definition, UTF-8 objects, but not vice versa.
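
The effect of module order can be illustrated with a toy identification loop in Python. The predicates is_ascii and is_utf8 and the function identify are simplified stand-ins written for this example; they are not JHOVE's module code.

```python
def is_ascii(data: bytes) -> bool:
    """True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


def is_utf8(data: bytes) -> bool:
    """True if the bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def identify(data: bytes, modules):
    """Return the name of the first module whose predicate accepts the data."""
    for name, matches in modules:
        if matches(data):
            return name
    return None


specific_first = [("ASCII", is_ascii), ("UTF-8", is_utf8)]
generic_first = [("UTF-8", is_utf8), ("ASCII", is_ascii)]

plain = b"plain text"
good = identify(plain, specific_first)   # the more specific format wins
bad = identify(plain, generic_first)     # the generic module shadows ASCII
```

With the generic module listed first, pure ASCII input is reported as UTF-8 and the ASCII module is never reached.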

The optional <init> element is used to pass a module-specific argument to a module at the time it is first instantiated within JHOVE. See the details for the individual modules to see if such an argument is defined. The use of the <init> argument is currently not defined for any of the standard JHOVE modules.

The optional and repeatable <param> element is used to pass a module-specific parameter to a module immediately prior to each invocation of the module's parse() method. See the details for the individual modules to see if such a parameter is defined.
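
The difference in timing between <init> and <param> can be sketched as follows. SketchModule and run are hypothetical names used only for this example, and the sketch assumes that every configured <param> value is passed again before each parse; it does not reproduce the real module interface.

```python
class SketchModule:
    """Hypothetical stand-in for a format module; records the calls it receives."""

    def __init__(self, init_arg=None):
        # The <init> argument arrives once, when the module is instantiated.
        self.calls = [("init", init_arg)]

    def param(self, value):
        self.calls.append(("param", value))

    def parse(self, path):
        self.calls.append(("parse", path))


def run(module_cls, init_arg, params, paths):
    """Instantiate once, then pass every parameter before each parse."""
    module = module_cls(init_arg)
    for path in paths:
        for value in params:
            module.param(value)
        module.parse(path)
    return module.calls


calls = run(SketchModule, "x", ["a", "b"], ["f1", "f2"])
```

The recorded call list shows one init entry followed by the full set of parameters ahead of each parse.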

In addition to the modules and output handlers specified in the configuration file, JHOVE is also always statically linked with the standard Bytestream module and Text and XML output handlers.

4.4 Testing

Invoking JHOVE on the example file control.txt:

jhove -c conf/jhove.conf -k examples/ascii/control.txt

should generate the following output:

Jhove (Rel. 1.0, 2005-05-26)
 Date: 2005-05-12 10:20:43 EDT
 RepresentationInformation: examples/ascii/control.txt
  ReportingModule: ASCII-hul, Rel. 1.1 (2005-01-11)
  LastModified: 2003-09-09 16:56:50 EDT
  Size: 51
  Format: ASCII
  Status: Well-formed and valid
  MIMEtype: text/plain; charset=US-ASCII
  ASCIIMetadata: 
   LineEndings: LF
   ControlCharacters: TAB (0x09), VT (0x0B), FF (0x0C), SUB (0x1A)
  Checksum: bae406b6
   Type: CRC32
  Checksum: 2774395cac046bf2fd1898ebed8a5f9a
   Type: MD5
  Checksum: 6753be3059ba36fd970ff620a7ebe3fff95c13fe
   Type: SHA-1

with the appropriate date and time values.
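
Because the date and time values differ from run to run, a comparison against the expected output has to ignore them. The Python sketch below is one way to do that; the function name stable_lines and the choice of Date and LastModified as the fields to skip are assumptions for this example. Release strings in the banner and ReportingModule lines are kept, so they may need the same treatment when comparing across JHOVE versions.

```python
# Fields whose values are expected to vary between runs or installations.
VOLATILE_KEYS = {"Date", "LastModified"}


def stable_lines(output: str) -> list:
    """Return the non-empty, stripped output lines, minus the volatile fields."""
    kept = []
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        key = stripped.split(":", 1)[0]
        if key in VOLATILE_KEYS:
            continue
        kept.append(stripped)
    return kept


def same_result(expected: str, actual: str) -> bool:
    """True if two outputs agree on everything except the volatile fields."""
    return stable_lines(expected) == stable_lines(actual)
```

The actual text would come from capturing the output of the jhove command shown above and passing it in as a string.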

The JHOVE Swing-based GUI interface can be invoked from a command shell:

java -jar bin/JhoveView.jar

5 Tutorial

A tutorial on how to use JHOVE is available.

Acknowledgements

Development of JHOVE is funded in part by the Andrew W. Mellon Foundation through a grant to JSTOR for the recently launched Electronic-Archiving Initiative.

Copyright 2003-2011 by JSTOR and the President and Fellows of Harvard College. Used by permission.
Last updated 2011-01-04