java.lang.Object
  edu.harvard.hul.ois.jhove.module.pdf.Parser
public class Parser
The Parser class implements some limited syntactic analysis for PDF. It isn't by any means intended to be a full parser. Its main job is to track nesting of syntactic elements such as dictionary and array beginnings and ends.
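The nesting bookkeeping described above can be illustrated with a small self-contained sketch. It is not JHOVE code: the class `NestingSketch`, its pre-split string tokens, and the `loose` flag are hypothetical stand-ins, and treating an unmatched end as an error outside loose mode is an assumption about what "nesting errors" means here.

```java
import java.util.Arrays;

/** Hypothetical sketch, not JHOVE code: counts unmatched "<<" and "[" tokens. */
final class NestingSketch {
    private int dictDepth;   // "<<" seen without a matching ">>"
    private int arrayDepth;  // "[" seen without a matching "]"
    private boolean loose;   // assumption: when true, unmatched ends are ignored

    /** Updates the counters for one pre-split token; other tokens are ignored. */
    void accept(String token) {
        switch (token) {
            case "<<": dictDepth++; break;
            case ">>": dictDepth = close(dictDepth, token); break;
            case "[":  arrayDepth++; break;
            case "]":  arrayDepth = close(arrayDepth, token); break;
            default:   break;
        }
    }

    private int close(int depth, String token) {
        if (depth > 0) return depth - 1;
        if (!loose) throw new IllegalStateException("unmatched " + token);
        return 0; // loose mode: ignore the nesting error
    }

    int getDictDepth()  { return dictDepth; }
    int getArrayDepth() { return arrayDepth; }

    /** Forgets open dictionaries and arrays, as before reading elsewhere in a file. */
    void reset()      { dictDepth = 0; arrayDepth = 0; loose = false; }

    /** Same as reset(), but later unmatched ends are tolerated. */
    void resetLoose() { dictDepth = 0; arrayDepth = 0; loose = true; }

    public static void main(String[] args) {
        NestingSketch s = new NestingSketch();
        for (String t : Arrays.asList("<<", "/Kids", "[", "1", "0", "R")) {
            s.accept(t);
        }
        System.out.println(s.getDictDepth() + " " + s.getArrayDepth()); // prints "1 1"
        s.reset();
        System.out.println(s.getDictDepth() + " " + s.getArrayDepth()); // prints "0 0"
    }
}
```

Keeping two independent counters rather than a single stack mirrors the separate `getDictDepth()` and `getArrayDepth()` accessors; a sketch like this would not detect interleaved errors such as `<< [ >> ]`.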
Constructor Summary | |
---|---|
Parser(Tokenizer tokenizer) | Constructor. |
Method Summary | |
---|---|
int | getArrayDepth(): Returns the number of array starts not yet matched by array ends. |
int | getDictDepth(): Returns the number of dictionary starts not yet matched by dictionary ends. |
java.util.Set | getLanguageCodes(): Returns the language code set from the Tokenizer. |
Token | getNext(): Gets a token. |
Token | getNext(java.lang.Class clas, java.lang.String errMsg): A class-sensitive version of getNext. |
Token | getNext(long max): Gets a token. |
long | getOffset(): Returns the current offset into the file. |
boolean | getPDFACompliant(): Returns false if either the parser or the tokenizer has detected non-compliance with PDF/A restrictions. |
java.lang.String | getWSString(): Returns the Tokenizer's current whitespace string. |
PdfArray | readArray(): Reads an array. |
PdfDictionary | readDictionary(): Reads a dictionary. |
PdfObject | readObject(): Reads an object. |
PdfObject | readObjectDef(): Reads an object definition, from wherever we are in the stream to the completion of one full object after the obj keyword. |
PdfObject | readObjectDef(Numeric objNumTok): Reads an object definition, given the first numeric object, which has already been read and is passed as an argument. |
void | reset(): Clear the state of the parser so that it can start reading at a different place in the file. |
void | resetLoose(): Clear the state of the parser so that it can start reading at a different place in the file and ignore any nesting errors. |
void | scanMode(boolean flag): If true, do not attempt to parse non-whitespace delimited tokens, e.g., literal and hexadecimal strings. |
void | seek(long offset): Positions the file to the specified offset, and resets the state for a new token stream. |
void | setEncrypted(boolean encrypted): Tells this Parser, and its Tokenizer, whether the file is encrypted. |
void | setObjectMap(java.util.Map objectMap): Set the object map on which the parser will work. |
void | setPDFACompliant(boolean pdfACompliant): Set the value of the pdfACompliant flag. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Parser(Tokenizer tokenizer)
Parameters: tokenizer - The Tokenizer which the parser will use
Method Detail |
---|
public void setObjectMap(java.util.Map objectMap)
public void reset()
public void resetLoose()
public Token getNext() throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public Token getNext(long max) throws java.io.IOException, PdfException
Parameters: max - Maximum allowable size of the token
Throws: java.io.IOException, PdfException
public Token getNext(java.lang.Class clas, java.lang.String errMsg) throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
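One plausible reading of a "class-sensitive" `getNext` is: fetch the next token, and fail with the supplied message if it is not an instance of the requested class. The sketch below illustrates only that idea. `TypedNextSketch`, the plain `Iterator` token source, and the use of `IllegalStateException` are hypothetical; the real method is declared to throw `PdfException`, and its exact behavior is not stated on this page.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical sketch, not JHOVE code: a type-checked "next token" helper. */
final class TypedNextSketch {

    /** Returns the next element if it is an instance of expected, else fails with errMsg. */
    static <T> T next(Iterator<?> source, Class<T> expected, String errMsg) {
        Object token = source.next();
        if (!expected.isInstance(token)) {
            // Assumption: the caller's message is reported on a type mismatch.
            throw new IllegalStateException(errMsg);
        }
        return expected.cast(token);
    }

    public static void main(String[] args) {
        Iterator<Object> tokens = Arrays.<Object>asList(12, "obj").iterator();
        Integer objNum = next(tokens, Integer.class, "expected a number");
        System.out.println(objNum); // prints "12"
        try {
            next(tokens, Integer.class, "expected a number");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "expected a number"
        }
    }
}
```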
public int getDictDepth()
public void setEncrypted(boolean encrypted)
public int getArrayDepth()
public java.lang.String getWSString()
public java.util.Set getLanguageCodes()
public boolean getPDFACompliant()
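Returns false if either the parser or the tokenizer has detected non-compliance with PDF/A restrictions. A value of true is no guarantee that the file is compliant.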
public void setPDFACompliant(boolean pdfACompliant)
Sets the value of the pdfACompliant flag. If the value is true, the tokenizer's pdfACompliant flag is also set to true.
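The interplay between the parser's flag and the tokenizer's flag can be sketched as follows. The classes `ParserStub` and `TokenizerStub` are hypothetical stand-ins, not JHOVE classes; the initial value of `true` and the choice not to propagate `false` to the tokenizer are assumptions made for the sketch.

```java
/** Hypothetical sketch, not JHOVE code: a compliance flag shared by two layers. */
final class ComplianceSketch {

    static final class TokenizerStub {
        private boolean pdfACompliant = true; // assumption: starts out compliant
        boolean getPDFACompliant() { return pdfACompliant; }
        void setPDFACompliant(boolean value) { pdfACompliant = value; }
    }

    static final class ParserStub {
        private final TokenizerStub tokenizer;
        private boolean pdfACompliant = true; // assumption: starts out compliant

        ParserStub(TokenizerStub tokenizer) { this.tokenizer = tokenizer; }

        /** False if either layer has recorded a violation. */
        boolean getPDFACompliant() {
            return pdfACompliant && tokenizer.getPDFACompliant();
        }

        /** Sets the parser flag; only a value of true is passed on to the tokenizer. */
        void setPDFACompliant(boolean value) {
            pdfACompliant = value;
            if (value) {
                tokenizer.setPDFACompliant(true);
            }
        }
    }

    public static void main(String[] args) {
        TokenizerStub tok = new TokenizerStub();
        ParserStub parser = new ParserStub(tok);
        tok.setPDFACompliant(false);                   // tokenizer notices a violation
        System.out.println(parser.getPDFACompliant()); // prints "false"
        parser.setPDFACompliant(true);                 // clears both flags
        System.out.println(parser.getPDFACompliant()); // prints "true"
    }
}
```

Because the getter combines both flags, a `true` result only means that neither layer has recorded a problem so far, which is why it cannot serve as proof of compliance.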
public PdfObject readObjectDef() throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public PdfObject readObjectDef(Numeric objNumTok) throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
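In PDF syntax an indirect object definition begins with an object number, a generation number, and the `obj` keyword. The two `readObjectDef` overloads differ in whether the first of those numbers has already been consumed. The sketch below shows that overload pattern for the header only. `ObjectDefSketch`, its string tokens, and the `int[]` result are hypothetical; it does not read the object body, and it is not how JHOVE represents tokens or objects.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical sketch, not JHOVE code: reads the "N G obj" header of a definition. */
final class ObjectDefSketch {

    /** Reads object number, generation number, and the obj keyword. */
    static int[] readHeader(Iterator<String> tokens) {
        int objNum = Integer.parseInt(tokens.next());
        return readHeader(objNum, tokens);
    }

    /** Same, for callers that have already consumed the object number. */
    static int[] readHeader(int objNum, Iterator<String> tokens) {
        int genNum = Integer.parseInt(tokens.next());
        String keyword = tokens.next();
        if (!"obj".equals(keyword)) {
            throw new IllegalArgumentException("expected 'obj' but found '" + keyword + "'");
        }
        return new int[] {objNum, genNum};
    }

    public static void main(String[] args) {
        int[] full = readHeader(Arrays.asList("12", "0", "obj").iterator());
        System.out.println(full[0] + " " + full[1]); // prints "12 0"

        // The caller has already read "7" while deciding what kind of token follows.
        int[] partial = readHeader(7, Arrays.asList("2", "obj").iterator());
        System.out.println(partial[0] + " " + partial[1]); // prints "7 2"
    }
}
```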
public PdfObject readObject() throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public PdfArray readArray() throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public PdfDictionary readDictionary() throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public long getOffset()
public void seek(long offset) throws java.io.IOException, PdfException
Throws: java.io.IOException, PdfException
public void scanMode(boolean flag)
Parameters: flag - Scan mode flag