Extract and analyze images with DynaPDF
Did you know that you can extract and analyze images from a PDF file with DynaPDF? To do this, use the function DynaPDF.GetImage. One of its parameters specifies which piece of information you want to get for an image. The following values are available:
Value | Description |
---|---|
BufSize | The size of the image buffer in bytes. |
Buffer | The image data as JPEG or FILE. |
Picture | The image as a picture container. Either JPEG or TIFF. |
Filter | The format of the image data: the decode filter that is required if the image is still compressed. Possible values are dfDCTDecode (JPEG), dfJPXDecode (JPEG2000), and dfJBIG2Decode. Other filters are already removed by DynaPDF since a conversion to a native file format is then always required. |
OrgFilter | The image was compressed with this filter in the PDF file. This info is useful to determine which compression filter should be used when creating a new image file from the image buffer. |
BitsPerPixel | Bit depth of the image buffer. Possible values are 1, 2, 4, 8, 24, 32, and 64. |
ColorSpace | The color space refers either to the image buffer or to the color table if set. |
NumComponents | The number of components stored in the image buffer. |
MinIsWhite | If 1, the colors of 1 bit images are reversed. |
ColorCount | The number of colors in the color table. |
Width | Image width in pixels. |
Height | Image height in pixels. |
ScanLineLength | The length of a scanline in bytes. |
InlineImage | If 1, the image is an inline image. |
Interpolate | If 1, image interpolation should be performed. |
Transparent | The meaning is different depending on the bit depth and whether a color table is available. If the image is a 1 bit image and if no color table is available, black pixels must be drawn with the current fill color. If the image contains a color table, ColorMask contains the range of indexes in the form min/max index which should appear transparent. If no color table is present ColorMask contains the transparent ranges in the form min/max for every color component. |
Intent | The rendering intent. Default is none. |
MetadataSize | Length of Metadata in bytes. |
ResolutionX | Image resolution on the x-axis. |
ResolutionY | Image resolution on the y-axis. |
Metadata | Optional XML Metadata stream as text. |
ICCProfile | ICC Color Profile of the colorspace (can be empty). |
MaskImage | If set, a 1 bit image is used as a transparency mask. Returns index of that image. |
SoftMask | If set, a grayscale image is used as alpha channel. Returns index of that image. |
FillColor | The current fill color. An image mask is drawn with the current fill color. |
FillColorSpace | The color space in which FillColor is defined. |
For example, the size query would look like this:
Set Field [ Images::Width ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Width" ) ]
Set Field [ Images::Height; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Height") ]
In the parameters we first specify the PDF working environment in which the file with the analyzed images is located, then the index of the image to determine which image should be analyzed and finally the type of information that is requested.
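The same pattern works for any other value from the table above. As a minimal sketch, the following steps store a few more properties of the same image. The fields Images::BitsPerPixel, Images::ColorSpace and Images::Filter are hypothetical names chosen for illustration and are not part of the example file described here:

```
# Sketch only: the target fields below are assumed to exist in your own table
Set Field [ Images::BitsPerPixel ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "BitsPerPixel" ) ]
Set Field [ Images::ColorSpace ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "ColorSpace" ) ]
Set Field [ Images::Filter ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Filter" ) ]
```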
As already mentioned, we can not only analyze images but also extract them from the file. The image itself is just another piece of information that we query, so we can use the same function again.
Set Field [ Images::image ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Picture"; $i & "_imag.png"; "PNG" ) ]
Here the function accepts additional parameters, so we also pass the file name and the format we want for the image.
If we want to extract all images of a document, we can use the functions in a loop. The loop will then run as many times as there are images. We determine the number of images with the function DynaPDF.GetImageCount. The index of the addressed images starts at 0, and ends at DynaPDF.GetImageCount-1.
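Such a loop could look roughly like the sketch below. It assumes that $PDF already holds a DynaPDF working environment with the PDF loaded, that DynaPDF.GetImageCount takes this reference as its only parameter, and that each image should go into a new record of the Images table. These are assumptions for illustration, so please check them against the plugin documentation:

```
# Sketch only: $PDF is assumed to reference a DynaPDF environment with the PDF already loaded
Set Variable [ $count ; Value: MBS( "DynaPDF.GetImageCount"; $PDF ) ]
Set Variable [ $i ; Value: 0 ]
Loop
  # Image indexes run from 0 to $count - 1
  Exit Loop If [ $i ≥ $count ]
  New Record/Request
  Set Field [ Images::Width ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Width" ) ]
  Set Field [ Images::Height ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Height" ) ]
  Set Field [ Images::image ; Value: MBS( "DynaPDF.GetImage"; $PDF; $i; "Picture"; $i & "_imag.png"; "PNG" ) ]
  Set Variable [ $i ; Value: $i + 1 ]
End Loop
```

Placing the exit condition at the top of the loop means a PDF without any images creates no records.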
If you are interested in this topic, have a look at our new example Extract and analyze images.fmp12, which will be included with the next pre-release.
To use these functions you need a DynaPDF Lite license.