API for scanning a pdf

Hello,

Is there an API for scanning a PDF file, i.e. for calculating the width and height of its existing content? I need this to automatically calculate the clipping width and height for an ImportedPageArea.

The PDF solution I need to provide requires very dynamic assembly of parts into a PDF document. There are many part types that can be combined, so a template-based solution would not work (combinatorial explosion): it would require an unmanageable number of templates.

Thanks in advance,
Best regards,
Giovanni

Posted by a ceTe Software moderator

Hello Giovanni,

You can get the margins or the dimensions of a page by using the PageDimensions class. You can refer to the Dimensions class in our help documentation.

If this does not help you then please be more specific about the requirement you have so that we can help you further.

Thanks,
ceTe Software Support Team.

Thank you for the prompt response.

I have a header template. This is only one part of the final PDF document. The header is created by the user directly as a Word document and exported to PDF. So I end up with a PDF file that contains *only* the header template. Other "part templates" are created and saved as PDF files in the same way.

Now I need to build a final real PDF document by assembling all these "part templates". I found today that I can use the ImportedPageArea API for this purpose.

My problem now is that the "part templates" should be maintainable by the users, so these small templates change. When I assemble the final document I need to know how big these "part templates" are in order to space them and lay them out correctly in the final PDF document.

I need an API to calculate, for example, how long the longest sentence is and how many lines are actually occupied in these "part templates".

I hope this is clear now.

TIA,
Best regards,
Giovanni
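
Editor's note: the layout step described above (stacking "part templates" once their sizes are known) can be sketched in plain Java. This is only an illustration under stated assumptions: PartStacker and yOffsets are hypothetical names, not part of DynamicPDF, and the heights are made-up sample values that would have to come from somewhere else, for example a value maintained alongside each part file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a DynamicPDF class): stacks parts top to bottom.
final class PartStacker {

    // Returns the y offset at which each part starts, given each part's
    // height, the y position of the first part, and a fixed gap between parts.
    static List<Float> yOffsets(List<Float> partHeights, float startY, float gap) {
        List<Float> offsets = new ArrayList<>();
        float y = startY;
        for (float height : partHeights) {
            offsets.add(y);      // this part starts at the current position
            y += height + gap;   // the next part starts below it
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Sample heights only; real values depend on the actual part files.
        List<Float> heights = Arrays.asList(120f, 300f, 80f);
        System.out.println(yOffsets(heights, 0f, 10f));
    }
}
```

Each resulting offset could then be used as the y position of the corresponding imported area.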

Posted by a ceTe Software moderator

Hello Giovanni,

It looks like you want to know the page size or dimensions of the "part templates" you are getting from your clients.

Actually the "part templates" you are getting are PDF documents with varying sizes depending on the content (header, footer etc). You can get their size by using the Dimensions class.

Below is sample code for this.

   PdfDocument pdf = new PdfDocument("C:/Header.pdf");
   MergeDocument document = new MergeDocument(pdf);

   Page page = document.getPages().getPage(0);
   float bodyH = page.getDimensions().getBody().getHeight();
   float bodyW = page.getDimensions().getBody().getWidth();
   float topMargin = page.getDimensions().getTopMargin();

Thanks,
ceTe Software Support Team.

Hello,

Thank you for your support!

The version 4.0.3 build_70511 dated 11.05.2007 that we have does not offer the method getBody():

float bodyH = page.getDimensions().getBody().getHeight();

I tried instead:

float bodyH = page.getDimensions().getHeight();

but this will always return the same sizes regardless of the actual content:
842.0
595.22
0.0

TIA,
Best regards,
Giovanni
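
Editor's note: the first two values above can be checked against standard paper sizes. Assuming they are in points (1/72 inch, the default PDF unit), a small conversion shows that 595.22 x 842 corresponds to roughly 210 x 297 mm, which is the size of an A4 sheet. That would be consistent with each exported file containing a full page regardless of how much of it is filled. PointsToMillimetres is a hypothetical name used only for this sketch.

```java
import java.util.Locale;

// Hypothetical helper for illustration: converts points to millimetres.
final class PointsToMillimetres {
    static final double POINTS_PER_INCH = 72.0;
    static final double MM_PER_INCH = 25.4;

    static double toMillimetres(double points) {
        return points / POINTS_PER_INCH * MM_PER_INCH;
    }

    public static void main(String[] args) {
        // Width and height as reported in the post above.
        System.out.printf(Locale.ROOT, "%.1f x %.1f mm%n",
                toMillimetres(595.22), toMillimetres(842.0));
    }
}
```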

Posted by a ceTe Software moderator

Hello Giovanni,

Are you resetting the page variable with the page of the new document each time? It is possible that the value stored in the page object is not getting updated and therefore you see the same value returned all the time.

The code you are using should work fine and return the height of the page for each document differently if the pages are of different size.

Thanks,
ceTe Software Support Team

Hello,

>Are you resetting the page variable with the page of the
>new document each time? It is possible that the value stored
>in the page object is not getting updated and therefore
>you see the same value returned all the time.
>
I am not sure what you mean by "resetting the page variable", or by the second part. The imported documents I need to scan already exist as files, i.e. they are not generated in memory, so I would expect them to be up to date.

Please find below the full code containing the issue. Note that two existing files are referenced: header.pdf and footer.pdf.

Thanks in advance,
Best regards,
Giovanni

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// DynamicPDF packages (package names assumed; adjust to the installed version)
import com.cete.dynamicpdf.*;
import com.cete.dynamicpdf.merger.*;

/**
 * Example for generating one of the letters as shown in the SRS
 */
public class ExampleLetter {
        /**
         * @param args
         */
        public static void main(String[] args) {
                // create a PDF document
                Document document = new Document();

                // add one blank page to the document
                document.getPages().add(new Page(PageSize.LETTER));

                // create a template and add elements to it
                Template template = new Template();

                // build the page
                template.getElements().add(getHeader());
                template.getElements().add(getFooter());

                // add the template to the document
                document.setTemplate(template);

                document.draw("letter.pdf");
        }

        /**
         * Returns the header part
         *
         * @return the header part
         */
        private static PageElement getHeader() {
                PdfDocument pdfDocument = new PdfDocument("header.pdf");
                showDimensions(pdfDocument);

                int pageNumber = 0;
                int xOffset = 100;
                int yOffset = 0;
                PdfPage pdfPage = pdfDocument.getPages().getPdfPage(pageNumber);
                ImportedPageArea headerArea = new ImportedPageArea(pdfPage, pageNumber, xOffset, yOffset);
                headerArea.getContents().setClipLeft(50);
                headerArea.getContents().setClipTop(50);
                headerArea.getContents().setClipRight(50);
                headerArea.getContents().setClipBottom(50);

                return headerArea;
        }

        /**
         * Returns the footer {@link PageElement}
         *
         * @return the footer {@link PageElement}
         */
        private static PageElement getFooter() {
                // fill placeholders on template footer
                MergeDocument mergeDocument = new MergeDocument("footer.pdf", new MergeOptions(true));
                mergeDocument.getForm().getFields().getFormField("ph0001").setValue("Juan Perez");
                mergeDocument.getForm().getFields().getFormField("ph0002").setValue("Head of Customer Acquisition");
                mergeDocument.getForm().getFields().getFormField("ph0003").setValue("Dayana Gonzales");
                mergeDocument.getForm().getFields().getFormField("ph0004").setValue("Head of Customer Satisfaction");

                // output the populated template to a PdfDocument
                ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                mergeDocument.draw(outputStream);
                PdfDocument pdfDocument = new PdfDocument(new ByteArrayInputStream(outputStream.toByteArray()));
                showDimensions(pdfDocument);

                // create an imported area from the in-memory pdfDocument
                int pageNumber = 0;
                int xOffset = 0;
                int yOffset = 400;
                PdfPage pdfPage = pdfDocument.getPages().getPdfPage(pageNumber);
                ImportedPageArea importedArea = new ImportedPageArea(pdfPage, pageNumber, xOffset, yOffset);
                importedArea.getContents().setClipLeft(50);
                importedArea.getContents().setClipTop(50);
                importedArea.getContents().setClipRight(100);
                importedArea.getContents().setClipBottom(100);

                return importedArea;
        }

        // prints the dimensions of the first page of the given PDF document
        private static void showDimensions(PdfDocument pdfDocument) {
                showDimensions(new MergeDocument(pdfDocument));
        }

        // prints height, width and top margin of the first page
        private static void showDimensions(MergeDocument mergeDocument) {
                Page page = mergeDocument.getPages().getPage(0);
                System.out.println(page.getDimensions().getHeight());
                System.out.println(page.getDimensions().getWidth());
                System.out.println(page.getDimensions().getTopMargin());
        }
}

Posted by a ceTe Software moderator

Hello,

You can get the dimensions and the margins of a page using our product. It is not possible to get the dimensions of the visible content on a page.

The code you posted will return the page height, width and top margin. If the pages have different sizes, it will return different values. Your pages are probably all the same size.

Thanks,
ceTe Software Support Team.
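
Editor's note: since the visible content cannot be measured by the product, the content box of each part would have to be supplied some other way. Once it is known, the four clip values follow from simple arithmetic. The sketch below is plain Java and does not call DynamicPDF; ClipCalculator and forContentBox are hypothetical names, and it assumes that clip values are measured inward from each page edge and that x and y are measured from the top-left corner of the page.

```java
// Hypothetical helper for illustration (not a DynamicPDF class).
final class ClipCalculator {

    // Clip amounts, assumed to be measured inward from each page edge.
    static final class Clip {
        final float left, top, right, bottom;

        Clip(float left, float top, float right, float bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }
    }

    // Computes how much to clip from each edge so that only the box
    // at (x, y) with the given width and height remains visible.
    static Clip forContentBox(float pageWidth, float pageHeight,
                              float x, float y, float width, float height) {
        if (x < 0 || y < 0 || width < 0 || height < 0
                || x + width > pageWidth || y + height > pageHeight) {
            throw new IllegalArgumentException("content box must lie inside the page");
        }
        return new Clip(x, y, pageWidth - (x + width), pageHeight - (y + height));
    }

    public static void main(String[] args) {
        // Sample numbers only: a 600 x 800 page with a 500 x 100 header box.
        Clip clip = forContentBox(600f, 800f, 50f, 40f, 500f, 100f);
        System.out.println(clip.left + " " + clip.top + " " + clip.right + " " + clip.bottom);
    }
}
```

The resulting values could be passed to setClipLeft, setClipTop, setClipRight and setClipBottom as in the code posted above, provided the assumption about how clip values are interpreted holds for the version in use.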

DynamicPDF Generator for Java (v4) Forum