Literal (JHOVE Documentation)

edu.harvard.hul.ois.jhove.module.pdf
Class Literal

java.lang.Object
  edu.harvard.hul.ois.jhove.module.pdf.Token
      edu.harvard.hul.ois.jhove.module.pdf.StringValuedToken
          edu.harvard.hul.ois.jhove.module.pdf.Literal

Direct Known Subclasses: Hexadecimal

public class Literal
extends StringValuedToken

Class for Tokens which represent PDF strings. The class maintains a field recording whether the string is encoded in PDFDocEncoding or UTF-16. The encoding is determined while the characters of the token are being analyzed.

Field Summary
`static char[]`	`PDFDOCENCODING` Mapping between PDFDocEncoding and Unicode code points.

Fields inherited from class edu.harvard.hul.ois.jhove.module.pdf.StringValuedToken
`_rawBytes, _value`

Constructor Summary
`Literal()` Creates an instance of a string literal.

Method Summary
`void`	`appendHex(int ch)` Append a hex character.
`void`	`convertHex()` Convert the raw hex data.
`boolean`	`isDate()` Returns `true` if the string value is a parsable date.
`boolean`	`isPDFACompliant()` Returns `true` if this token doesn't violate any PDF/A rules, `false` if it does.
`boolean`	`isPDFDocEncoding()` Returns `true` if this string is in PDFDocEncoding, `false` if UTF-16.
`java.util.Date`	`parseDate()` Parse the string value to a date.
`long`	`processLiteral(Tokenizer tok)` Process the incoming characters into a string literal.
`void`	`setPDFDocEncoding(boolean pdfDocEncoding)` Sets whether this string is in PDFDocEncoding (`true`) or UTF-16 (`false`).

Methods inherited from class edu.harvard.hul.ois.jhove.module.pdf.StringValuedToken
`getRawBytes, getValue, setValue`

Methods inherited from class edu.harvard.hul.ois.jhove.module.pdf.Token
`isPdfACompliant, isSimpleToken`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

PDFDOCENCODING

public static char[] PDFDOCENCODING

Mapping between PDFDocEncoding and Unicode code points.

Constructor Detail

Literal

public Literal()

Creates an instance of a string literal.

Method Detail

appendHex

public void appendHex(int ch)
               throws PdfException

Append a hex character. This is used only for hex literals (those that start with '<').

Parameters: ch - The integer 8-bit code for a hex character
Throws: PdfException

processLiteral

public long processLiteral(Tokenizer tok)
                    throws java.io.IOException

Process the incoming characters into a string literal. This is used for literals delimited by parentheses, as opposed to hex strings.

Parameters: tok - The tokenizer, passed to give access to its getChar function.
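Returns: a long value. The source comment describes the result as true if the character was processed normally and false if a terminating parenthesis was reached, which does not match the declared return type.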
Throws: java.io.IOException
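
To illustrate what processing a parenthesis-delimited literal involves, here is a minimal sketch based on the PDF string syntax (backslash escapes, octal codes, balanced nested parentheses). It is not JHOVE's implementation: the class ParenLiteralSketch and the method readLiteral are hypothetical names, and the sketch reads from a String rather than from a Tokenizer.

```java
class ParenLiteralSketch {
    /**
     * Hypothetical helper: reads the body of a literal from src, which is
     * assumed to start just after the opening '('. Returns the decoded text
     * up to the matching ')'.
     */
    static String readLiteral(String src) {
        StringBuilder out = new StringBuilder();
        int depth = 1;              // the opening '(' was already consumed
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i++);
            if (c == '\\' && i < src.length()) {
                char e = src.charAt(i++);
                switch (e) {
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    case 't': out.append('\t'); break;
                    case 'b': out.append('\b'); break;
                    case 'f': out.append('\f'); break;
                    case '\n': break;   // backslash + newline: line continuation
                    default:
                        if (e >= '0' && e <= '7') {
                            // Octal escape: one to three octal digits.
                            int v = e - '0';
                            for (int n = 0; n < 2 && i < src.length()
                                    && src.charAt(i) >= '0' && src.charAt(i) <= '7'; n++) {
                                v = v * 8 + (src.charAt(i++) - '0');
                            }
                            out.append((char) (v & 0xFF));
                        } else {
                            // Covers \( \) \\ ; for unknown escapes the backslash is dropped.
                            out.append(e);
                        }
                }
            } else if (c == '(') {
                depth++;                // nested, unescaped parenthesis
                out.append(c);
            } else if (c == ')') {
                if (--depth == 0) {
                    return out.toString();   // matching close reached
                }
                out.append(c);
            } else {
                out.append(c);
            }
        }
        throw new IllegalArgumentException("Unterminated literal");
    }

    public static void main(String[] args) {
        System.out.println(readLiteral("Hello \\(world\\))"));
    }
}
```

The depth counter is what lets unescaped but balanced parentheses appear inside the string without ending it early.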

convertHex

public void convertHex()
                throws PdfException

Convert the raw hex data. Two buffers are saved: _rawBytes for the untranslated hex-encoded data, and _value for the string decoded as PDFDocEncoding or UTF-16.

Throws: PdfException
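
As a rough illustration of the hex-to-bytes step, the following sketch converts the digits of a PDF hex string into raw bytes. It follows the PDF syntax rules that whitespace between digits is ignored and that a missing final digit is treated as 0. The class HexStringSketch and the method hexToBytes are hypothetical and are not part of JHOVE.

```java
import java.io.ByteArrayOutputStream;

class HexStringSketch {
    /**
     * Hypothetical helper: converts the text between '<' and '>' of a PDF
     * hex string into bytes.
     */
    static byte[] hexToBytes(String hex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pending = -1;                       // high nibble waiting for its partner
        for (int i = 0; i < hex.length(); i++) {
            char c = hex.charAt(i);
            if (Character.isWhitespace(c)) {
                continue;                       // whitespace is not significant
            }
            int d = Character.digit(c, 16);
            if (d < 0) {
                throw new IllegalArgumentException("Not a hex digit: " + c);
            }
            if (pending < 0) {
                pending = d;
            } else {
                out.write((pending << 4) | d);
                pending = -1;
            }
        }
        if (pending >= 0) {
            out.write(pending << 4);            // odd digit count: assume trailing 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = hexToBytes("48 65 6C 6C 6F");
        System.out.println(new String(bytes, java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```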

isPDFDocEncoding

public boolean isPDFDocEncoding()

Returns true if this string is in PDFDocEncoding, false if UTF-16.
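
One way such a distinction can be drawn is sketched below, assuming the usual PDF convention that a text string beginning with the bytes FE FF is UTF-16BE and anything else is PDFDocEncoding. The class TextStringSketch and its methods are hypothetical, and the identity table used in main is only a stand-in for a real 256-entry mapping such as PDFDOCENCODING; how Literal itself makes the decision is not shown on this page.

```java
import java.nio.charset.StandardCharsets;

class TextStringSketch {
    /** True if the raw bytes start with the UTF-16BE byte order mark FE FF. */
    static boolean hasUtf16Bom(byte[] raw) {
        return raw.length >= 2
                && (raw[0] & 0xFF) == 0xFE
                && (raw[1] & 0xFF) == 0xFF;
    }

    /**
     * Decodes raw string bytes. The table parameter is assumed to hold 256
     * entries mapping each single-byte code to a Unicode character.
     */
    static String decode(byte[] raw, char[] table) {
        if (hasUtf16Bom(raw)) {
            // Skip the two BOM bytes and decode the rest as UTF-16BE.
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            sb.append(table[b & 0xFF]);
        }
        return sb.toString();
    }

    /** Stand-in table: maps every byte to the code point of the same value. */
    static char[] identityTable() {
        char[] t = new char[256];
        for (int i = 0; i < t.length; i++) {
            t[i] = (char) i;
        }
        return t;
    }

    public static void main(String[] args) {
        byte[] utf16 = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        System.out.println(decode(utf16, identityTable()));
    }
}
```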

setPDFDocEncoding

public void setPDFDocEncoding(boolean pdfDocEncoding)

Sets whether this string is in PDFDocEncoding (true) or UTF-16 (false).

isDate

public boolean isDate()

Returns true if the string value is a parsable date. Conforms to the ASN.1 date format: D:YYYYMMDDHHmmSSOHH'mm' where everything before and after YYYY is optional. If we take this literally, the format is frighteningly ambiguous (imagine, for instance, leaving out hours but not minutes and seconds), so the checking is a bit loose.

parseDate

public java.util.Date parseDate()

Parse the string value to a date. PDF dates conform to the ASN.1 date format. This consists of D:YYYYMMDDHHmmSSOHH'mm' where everything before and after YYYY is optional. Adobe doesn't actually say so, but I'm assuming that if a field is included, everything to its left must be included, e.g., you can't have seconds but leave out minutes.
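
The left-to-right rule described above can be sketched with nested optional groups in a regular expression. This is an illustrative sketch, not JHOVE's parser: the class PdfDateSketch is hypothetical, missing month and day default to 01, missing time fields default to 00, and a missing time zone is assumed to be UTC.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PdfDateSketch {
    // Each optional group nests inside the previous one, so a field can
    // appear only when everything to its left is present.
    // Groups: 1 year, 2 month, 3 day, 4 hour, 5 minute, 6 second,
    //         7 'Z', 8 sign, 9 offset hours, 10 offset minutes.
    private static final Pattern DATE = Pattern.compile(
            "(?:D:)?(\\d{4})(?:(\\d{2})(?:(\\d{2})(?:(\\d{2})(?:(\\d{2})(?:(\\d{2}))?)?)?)?)?"
            + "(?:(Z)|([+-])(\\d{2})'(?:(\\d{2})')?)?");

    private static int field(Matcher m, int group, int dflt) {
        String g = m.group(group);
        return g == null ? dflt : Integer.parseInt(g);
    }

    /** Returns the parsed date, or null if the string is not a valid date. */
    static Date parse(String s) {
        Matcher m = DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        // Assumption: no offset (or 'Z') means UTC.
        TimeZone tz = TimeZone.getTimeZone("UTC");
        if (m.group(8) != null) {
            String minutes = m.group(10) != null ? m.group(10) : "00";
            tz = TimeZone.getTimeZone("GMT" + m.group(8) + m.group(9) + ":" + minutes);
        }
        Calendar cal = new GregorianCalendar(tz);
        cal.clear();
        cal.setLenient(false);   // reject out-of-range values such as month 13
        cal.set(field(m, 1, 0), field(m, 2, 1) - 1, field(m, 3, 1),
                field(m, 4, 0), field(m, 5, 0), field(m, 6, 0));
        try {
            return cal.getTime();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("D:20040229120000Z"));
    }
}
```

A loose check like isDate() could then amount to testing whether parse returns a non-null value, although the real method may apply different rules.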

isPDFACompliant

public boolean isPDFACompliant()

Returns true if this token doesn't violate any PDF/A rules, false if it does.