types module

Defines custom Python classes used throughout the library.

These definitions include Python representations of PDF objects (section 7.3 of the Standard), the Lexer’s output tokens, and XRefTable entry types.

class pdf4py.types.PDFDictDelimiter(value)

[Internal] Represents tokens << and >>.

value

Alias for field number 0

class pdf4py.types.PDFHexString(value)

Represents the PDF Object ‘Hexadecimal string’.

A hexadecimal string is used mainly to encode a small quantity of binary data. The sequence of hexadecimal digits is not decoded from ASCII but stored directly as bytes in the value attribute, because you typically want to pass that value to the binascii.unhexlify function.

value

Alias for field number 0
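As an illustration of the note above, the following minimal sketch passes the stored digits to binascii.unhexlify. The PDFHexString defined here is a local stand-in (assumed to be a one-field named tuple, as the field alias suggests), not an import from pdf4py, and the sample bytes are made up.

```python
import binascii
from collections import namedtuple

# Local stand-in for pdf4py.types.PDFHexString (assumption: a one-field named tuple).
PDFHexString = namedtuple("PDFHexString", ["value"])

# The hex digits are kept as raw ASCII bytes, as they appear between < and >.
token = PDFHexString(b"48656C6C6F")

# Turn the digits into the binary data they encode.
decoded = binascii.unhexlify(token.value)
print(decoded)  # b'Hello'
```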

class pdf4py.types.PDFIndirectObject(object_number, generation_number, value)

Represents a PDF indirect object.

The value attribute contains the PDF object that the indirect object wraps.

generation_number

Alias for field number 1

object_number

Alias for field number 0

value

Alias for field number 2

class pdf4py.types.PDFKeyword(value)

[Internal] Represents a keyword in the PDF grammar, for example xref.

value

Alias for field number 0

class pdf4py.types.PDFLiteralString(value)

Represents the PDF Object ‘Literal string’.

In theory, a literal string is a sequence of ASCII characters. In practice, so many PDF writers store non-ASCII strings using this object type that it is best to leave the associated value as bytes and let the user choose the right decoding scheme.

value

Alias for field number 0
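One possible decoding scheme is sketched below, under stated assumptions. PDFLiteralString is a local stand-in named tuple and decode_text is a hypothetical helper, not part of pdf4py. The sketch treats a leading UTF-16BE byte order mark as the signal for UTF-16 text and otherwise falls back to Latin-1, which only approximates PDFDocEncoding.

```python
import codecs
from collections import namedtuple

# Local stand-in for pdf4py.types.PDFLiteralString (assumption: a one-field named tuple).
PDFLiteralString = namedtuple("PDFLiteralString", ["value"])

def decode_text(string):
    """Hypothetical helper: pick a decoding for the raw bytes of a literal string."""
    raw = string.value
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Strip the two-byte mark (FE FF) and decode the rest as UTF-16BE.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Fallback: Latin-1 never fails, but it is only an approximation of PDFDocEncoding.
    return raw.decode("latin-1")

utf16_text = decode_text(PDFLiteralString(b"\xfe\xff\x00H\x00i"))
latin_text = decode_text(PDFLiteralString(b"caf\xe9"))
```

Other writers may use different encodings entirely, which is why the library leaves this choice to the caller.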

class pdf4py.types.PDFOperator(value)

Represents an operator appearing in a ContentStream.

value

Alias for field number 0

class pdf4py.types.PDFReference(object_number, generation_number)

Represents a PDF reference to a PDF indirect object.

generation_number

Alias for field number 1

object_number

Alias for field number 0
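To illustrate how a reference relates to the indirect object it names, here is a small sketch. Both classes are local stand-ins assumed to be named tuples with the fields listed above, and the objects dictionary is a hypothetical lookup table, not how pdf4py resolves references.

```python
from collections import namedtuple

# Local stand-ins for the pdf4py.types classes (assumption: plain named tuples).
PDFReference = namedtuple("PDFReference", ["object_number", "generation_number"])
PDFIndirectObject = namedtuple(
    "PDFIndirectObject", ["object_number", "generation_number", "value"]
)

# Hypothetical table of already parsed objects, keyed by (object number, generation number).
objects = {
    (3, 0): PDFIndirectObject(3, 0, 42),
}

ref = PDFReference(3, 0)

# A named tuple hashes and compares like a plain tuple, so it can be used as the key directly.
target = objects[ref]
print(target.value)  # 42
```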

class pdf4py.types.PDFSingleton(value)

[Internal] Represents a singleton in the PDF grammar, for example {.

value

Alias for field number 0

class pdf4py.types.PDFStream(dictionary, stream)

Represents a PDF stream.

The attribute dictionary points to the stream dictionary. The attribute stream is a callable object that takes no arguments and, when called, returns the stream content as bytes. The content is read from the source only when stream is called, following the lazy loading philosophy around which pdf4py is built.

dictionary

Alias for field number 0

stream

Alias for field number 1
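The lazy behaviour described above can be sketched as follows. PDFStream is a local stand-in named tuple, read_content is a hypothetical reader, and the dictionary contents and key type are invented for illustration; none of this is taken from pdf4py itself.

```python
from collections import namedtuple

# Local stand-in for pdf4py.types.PDFStream (assumption: a two-field named tuple).
PDFStream = namedtuple("PDFStream", ["dictionary", "stream"])

read_count = []  # records every time the content is actually fetched

def read_content():
    """Hypothetical reader standing in for the code that pulls bytes from the source."""
    read_count.append(1)
    return b"BT /F1 12 Tf ET"

# Building the object does not touch the content.
pdf_stream = PDFStream({"Length": 15}, read_content)
reads_before = len(read_count)

# The bytes are produced only when the callable is invoked.
content = pdf_stream.stream()
reads_after = len(read_count)
```

In this sketch each call runs the reader again, so a caller that needs the bytes more than once would keep the returned value.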

class pdf4py.types.PDFStreamReader(value)

[Internal] A wrapper around a function f(length) returned by the Lexer to the Parser when parsing a PDF stream object.

value

Alias for field number 0

class pdf4py.types.XrefCompressedEntry(object_number, objstm_number, index)

Represents an entry in the Cross Reference Table pointing to an object that currently contributes to the final PDF render but is stored in a compressed object stream to reduce the size of the PDF file.

index

Alias for field number 2

object_number

Alias for field number 0

objstm_number

Alias for field number 1

class pdf4py.types.XrefInUseEntry(offset, object_number, generation_number)

Represents an entry in the Cross Reference Table pointing to an object that currently contributes to the final PDF render (as opposed to removed, i.e. free, objects).

generation_number

Alias for field number 2

object_number

Alias for field number 1

offset

Alias for field number 0
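The two entry types carry different location information, so code that consumes them usually branches on the type. The sketch below uses local stand-in named tuples with the field order documented above (note that offset comes first in XrefInUseEntry); describe is a hypothetical helper, not a pdf4py function.

```python
from collections import namedtuple

# Local stand-ins for the pdf4py.types classes (assumption: plain named tuples).
XrefInUseEntry = namedtuple(
    "XrefInUseEntry", ["offset", "object_number", "generation_number"]
)
XrefCompressedEntry = namedtuple(
    "XrefCompressedEntry", ["object_number", "objstm_number", "index"]
)

def describe(entry):
    """Hypothetical helper: say where the object behind an entry is stored."""
    if isinstance(entry, XrefInUseEntry):
        return f"object {entry.object_number} at byte offset {entry.offset}"
    if isinstance(entry, XrefCompressedEntry):
        return (
            f"object {entry.object_number} is item {entry.index} "
            f"of object stream {entry.objstm_number}"
        )
    raise TypeError(f"unsupported entry type: {type(entry).__name__}")

in_use = describe(XrefInUseEntry(1024, 7, 0))
compressed = describe(XrefCompressedEntry(8, 12, 3))
```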