PdfPage.GetText
Overloads
GetText() | Gets the text in the page. |
GetText(Single, Single, Single, Single) | Gets the text in the specified rectangle of the page. |
GetText(Single, Single, Single, Single, TextExtractionOrder) | Gets the text in the specified rectangle of the page. |
GetText(TextExtractionOrder) | Gets the text in the page. |
GetText()
Gets the text in the page.
public string GetText()
Function GetText() As String
Returns
A string containing the text in the page.
Licensing Info
This method is a full DynamicPDF Core Suite feature. One of the following is required for non-evaluation usage:
- An active DynamicPDF Ultimate Subscription
- An active DynamicPDF Professional or Professional Plus Subscription with DynamicPDF Core Suite selected.
- A DynamicPDF Core Suite for .NET v12.X Developer License.
Examples
The following example extracts the entire text of the specified page in the given PDF document.
Imports System
Imports ceTe.DynamicPDF
Imports ceTe.DynamicPDF.Merger
Module MyModule
    Sub Main()
        ' Create PDF document object
        Dim pdfA As PdfDocument = New PdfDocument("C:\Invoice.pdf")
        ' Call GetText on the first page (index 0) to extract its text
        Dim extractedText As String = pdfA.Pages.Item(0).GetText()
    End Sub
End Module
using System;
using ceTe.DynamicPDF.Merger;
public class Example
{
    public static void GetText(string inputPath)
    {
        // Create PDF document object
        PdfDocument pdfA = new PdfDocument(inputPath);
        // Call GetText on the first page (index 0) to extract its text
        string extractedText = pdfA.Pages[0].GetText();
    }
}
Remarks
This method extracts the text in the same order in which the PDF operators are loaded. Text extraction skips characters that are more than 2 bytes. With some .NET runtimes (for example, .NET Core 2.0), text extraction fails with the error "No data is available for encoding 1252. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method." To resolve this error, refer to the user manual page Encoding Considerations.
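The error message quoted above points at Encoding.RegisterProvider. The following is a minimal sketch of that registration using the standard .NET CodePagesEncodingProvider, to be run once before any text extraction. It assumes the System.Text.Encoding.CodePages package is referenced where the runtime does not already include it (as on .NET Core 2.0). The class and method names are hypothetical, and the Encoding Considerations page remains the authoritative guidance.

```csharp
using System;
using System.Text;

public static class EncodingSetup
{
    // Hypothetical helper: registers the code-page encodings provider so that
    // Encoding.GetEncoding(1252) can be resolved, then returns that encoding.
    public static Encoding EnsureWindows1252()
    {
        // Registering the same provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }
}
```

Calling EncodingSetup.EnsureWindows1252() at application startup, before the first GetText call, is one way to apply it.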
GetText(Single, Single, Single, Single)
Gets the text in the specified rectangle of the page.
public string GetText(float x, float y, float width, float height)
Function GetText(x As Single, y As Single, width As Single, height As Single) As String
Parameters
- x
- Single
X coordinate of the rectangle.
- y
- Single
Y coordinate of the rectangle.
- width
- Single
Width of the rectangle.
- height
- Single
Height of the rectangle.
Returns
A string containing the text in the specified rectangle of the page.
Licensing Info
This method is a full DynamicPDF Core Suite feature. One of the following is required for non-evaluation usage:
- An active DynamicPDF Ultimate Subscription
- An active DynamicPDF Professional or Professional Plus Subscription with DynamicPDF Core Suite selected.
- A DynamicPDF Core Suite for .NET v12.X Developer License.
Remarks
This method extracts the text in the same order in which the PDF operators are loaded. Text extraction skips characters that are more than 2 bytes.
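As an illustration of the rectangle overload, here is a short sketch that reads the text inside one region of the first page. The class name, method name, and parameter names are hypothetical. The coordinate units and the origin of the rectangle are not stated in this section, so any values the caller passes are placeholders to be checked against the product documentation.

```csharp
using System;
using ceTe.DynamicPDF.Merger;

public static class RegionTextExample
{
    // Hypothetical helper: opens the PDF at inputPath and returns the text
    // found inside the given rectangle on the first page (index 0).
    public static string ExtractRegion(string inputPath, float x, float y, float width, float height)
    {
        PdfDocument document = new PdfDocument(inputPath);
        return document.Pages[0].GetText(x, y, width, height);
    }
}
```

A call such as RegionTextExample.ExtractRegion(path, 0f, 0f, 300f, 100f) would then return whatever text falls within that area.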
GetText(Single, Single, Single, Single, TextExtractionOrder)
Gets the text in the specified rectangle of the page.
public string GetText(float x, float y, float width, float height, TextExtractionOrder textExtractionOrder)
Function GetText(x As Single, y As Single, width As Single, height As Single, textExtractionOrder As TextExtractionOrder) As String
Parameters
- x
- Single
X coordinate of the rectangle.
- y
- Single
Y coordinate of the rectangle.
- width
- Single
Width of the rectangle.
- height
- Single
Height of the rectangle.
- textExtractionOrder
- TextExtractionOrder
Order in which text has to be extracted.
Returns
A string containing the text in the specified rectangle of the page.
Licensing Info
This method is a full DynamicPDF Core Suite feature. One of the following is required for non-evaluation usage:
- An active DynamicPDF Ultimate Subscription
- An active DynamicPDF Professional or Professional Plus Subscription with DynamicPDF Core Suite selected.
- A DynamicPDF Core Suite for .NET v12.X Developer License.
Remarks
This method extracts the text in the same order in which the PDF operators are loaded. Text extraction skips characters that are more than 2 bytes.
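The sketch below shows how a caller might pass an extraction order through to this overload. The members of TextExtractionOrder are not listed in this section, so the helper takes the value as a parameter rather than naming one. Both ceTe.DynamicPDF namespaces are imported because the namespace that declares TextExtractionOrder is an assumption here. The class and method names are hypothetical.

```csharp
using ceTe.DynamicPDF;
using ceTe.DynamicPDF.Merger;

public static class OrderedRegionTextExample
{
    // Hypothetical helper: returns the text inside the rectangle on the given
    // page, extracted in the order chosen by the caller.
    public static string ExtractRegion(
        PdfPage page, float x, float y, float width, float height,
        TextExtractionOrder order)
    {
        return page.GetText(x, y, width, height, order);
    }
}
```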
GetText(TextExtractionOrder)
Gets the text in the page.
public string GetText(TextExtractionOrder textExtractionOrder)
Function GetText(textExtractionOrder As TextExtractionOrder) As String
Parameters
- textExtractionOrder
- TextExtractionOrder
Order in which text has to be extracted.
Returns
A string containing the text in the page.
Licensing Info
This method is a full DynamicPDF Core Suite feature. One of the following is required for non-evaluation usage:
- An active DynamicPDF Ultimate Subscription
- An active DynamicPDF Professional or Professional Plus Subscription with DynamicPDF Core Suite selected.
- A DynamicPDF Core Suite for .NET v12.X Developer License.
Examples
The following example extracts the entire text of the specified page in the given PDF document.
Imports System
Imports ceTe.DynamicPDF
Imports ceTe.DynamicPDF.Merger
Module MyModule
    Sub Main()
        ' Create PDF document object
        Dim pdfA As PdfDocument = New PdfDocument("C:\Invoice.pdf")
        ' Call GetText on the first page (index 0) to extract its text
        Dim extractedText As String = pdfA.Pages.Item(0).GetText()
    End Sub
End Module
using System;
using ceTe.DynamicPDF.Merger;
public class Example
{
    public static void GetText(string inputPath)
    {
        // Create PDF document object
        PdfDocument pdfA = new PdfDocument(inputPath);
        // Call GetText on the first page (index 0) to extract its text
        string extractedText = pdfA.Pages[0].GetText();
    }
}
Remarks
This method extracts the text in the same order in which the PDF operators are loaded. Text extraction skips characters that are more than 2 bytes. With some .NET runtimes (for example, .NET Core 2.0), text extraction fails with the error "No data is available for encoding 1252. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method." To resolve this error, refer to the user manual page Encoding Considerations.