
DynaPDF Parser for Xojo

With MBS Xojo DynaPDF Plugin 24.0 we include the DynaPDFParserMBS class. This class provides a high-level interface to the parser in DynaPDF. You can perform various operations with this class:

  • Parse a page.
  • Find text on the page.
  • Extract text from the page.
  • Delete text within a rectangle.
  • Query coordinates for found text.
  • Replace found text with new text.
  • Set an alternative font for new text.
  • Finally, write the changes back to the page.

Combining these operations allows for a lot of things, such as finding text on a page and replacing it with new text for a little PDF editor. Or you can find text and then:

  • Draw rectangles around the found text to show it.
  • Put highlight annotations on the text.
  • Find website names or keywords and put WebLink annotations on them.
  • Use the DeleteText function to remove the text from the PDF.
  • Find each character to learn the coordinates of every letter.

The possibilities from this handful of functions are enormous. Let's take a look at some sample code:

// import all pages and close file
Call pdf.ImportPDFFile(1, 1.0, 1.0)

// now do search and replace
Dim FindText As String = "PDF"
Dim ReplaceText As String = "Test"

// initialize parser
Dim Parser As New DynaPDFParserMBS(pdf)

// search options allow to search a rectangle and
// decide whether to do case sensitive or insensitive search
Dim area As DynaPDFRectMBS = Nil // whole page
Dim SearchType As Integer = DynaPDFParserMBS.kstCaseInSensitive
Dim ContentParsingFlags As Integer = DynaPDFParserMBS.kcpfEnableTextSelection

// we loop over all pages
Dim count As Integer = pdf.GetPageCount
For i As Integer = 1 To count
  Dim needWrite As Boolean

  If Parser.ParsePage(i, ContentParsingFlags) Then
    // we run the search
    Dim found As Boolean = Parser.FindText(area, SearchType, FindText)

    While found
      // we replace the found text
      If Parser.ReplaceSelText(ReplaceText) Then
        needWrite = True
      End If

      // and continue search
      found = Parser.FindText(area, SearchType, FindText, True)
    Wend

    // only if we changed something, we should write changes back
    If needWrite Then
      Call Parser.WriteToPage
    End If
  End If
Next

This does a search and replace while looping over all pages. Instead of replacing the text, we could of course query its coordinates on the page through the SelBBox property and keep them for later. Once we have found all the positions, we can call EditPage to make changes, for example to draw rectangles around the found places or to add annotations.
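
As a rough illustration of that idea, the body of the page loop could look something like the sketch below. It reuses pdf, Parser, area, SearchType, FindText, ContentParsingFlags and the page number i from the sample above. Two things here are assumptions, not confirmed by this post: that SelBBox returns a DynaPDFRectMBS with Left, Bottom, Right and Top values, and that those values can be passed directly to Rectangle without converting between coordinate systems. Please check the plugin documentation for the actual property type before relying on it.

```xojo
// Sketch only: collect the bounding box of every match on page i,
// then draw a stroked rectangle around each one.
// Assumption: SelBBox returns a DynaPDFRectMBS (Left, Bottom, Right, Top).
Dim boxes() As DynaPDFRectMBS

If Parser.ParsePage(i, ContentParsingFlags) Then
  Dim found As Boolean = Parser.FindText(area, SearchType, FindText)

  While found
    // remember where the match is instead of replacing it
    boxes.Append Parser.SelBBox

    // and continue search
    found = Parser.FindText(area, SearchType, FindText, True)
  Wend
End If

// only open the page for editing if we found something
If boxes.Ubound >= 0 And pdf.EditPage(i) Then
  For Each r As DynaPDFRectMBS In boxes
    // Rectangle takes position, width and height
    Call pdf.Rectangle(r.Left, r.Bottom, r.Right - r.Left, r.Top - r.Bottom, pdf.kfmStroke)
  Next

  Call pdf.EndPage
End If
```

Collecting the positions first and drawing afterwards keeps the search and the page editing separate. On rotated pages, or when the page coordinate settings differ, the values may need adjusting before drawing.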

Please try the new functions and let us know.

15 01 24 - 08:23