Parse PDF documents via .NET SDK
Extract forms, images, tables and text boxes from PDF documents using the Cloud .NET SDK
How to parse PDF documents to extract forms, images, tables and text boxes using Cloud .NET SDK
To parse PDF documents, we'll use the Aspose.PDF Cloud .NET SDK. This Cloud SDK lets you build cloud-based PDF creator, editor and converter apps in C#, ASP.NET, or other .NET languages for various cloud platforms. Open the NuGet package manager, search for Aspose.PDF Cloud, and install it. You can also run the following command from the Package Manager Console.
Package Manager Console Command
PM> Install-Package Aspose.Pdf-Cloud
Steps to parse PDF documents via .NET SDK
Aspose.PDF Cloud developers can easily load & parse PDF documents in just a few lines of code.
- Create an instance of PdfApi using your AppSid and AppSecret from the Aspose Cloud Dashboard.
- Upload the PDF to Cloud Storage.
- Extract tables from the PDF document.
- Print the operation status to the console.
This sample code shows how to parse a PDF document to extract tables:
// Namespaces the sample relies on.
using System;
using System.IO;
using System.Threading.Tasks;
using Aspose.Pdf.Cloud.Sdk.Api;
using Aspose.Pdf.Cloud.Sdk.Model;
using Newtonsoft.Json;

public static async Task ParseTables()
{
    const string localPdfFileName = @"C:\Samples\sample.pdf";
    const string storageFileName = "sample.pdf";
    const string storageTempFolder = "YourTempFolder";

    // Get your AppSid and AppSecret at https://dashboard.aspose.cloud (free registration required).
    // AppSecret and AppSid are assumed to be strings defined elsewhere that hold those credentials.
    var pdfApi = new PdfApi(AppSecret, AppSid);

    // Upload the local PDF to cloud storage.
    using var file = File.OpenRead(localPdfFileName);
    var uploadResult = pdfApi.UploadFile(Path.Combine(storageTempFolder, storageFileName), file);
    Console.WriteLine(uploadResult.Uploaded[0]);

    // Ask the service for the tables it recognizes in the uploaded document.
    TablesRecognizedResponse response = await pdfApi.GetDocumentTablesAsync(storageFileName, folder: storageTempFolder);

    if (response == null)
        Console.WriteLine("GetTables(): Unexpected error!");
    else if (response.Code < 200 || response.Code > 299)
        Console.WriteLine("GetTables(): Failed to receive Tables from the document.");
    else if (response.Tables == null || response.Tables.List == null || response.Tables.List.Count == 0)
        Console.WriteLine("GetTables(): Tables not found in the document '{0}'.", storageFileName);
    else
    {
        Console.WriteLine("GetTables(): Tables successfully received from the document '{0}'.", storageFileName);

        // Fetch each recognized table by its ID and print it as indented JSON.
        foreach (var table in response.Tables.List)
        {
            var tabResp = await pdfApi.GetTableAsync(storageFileName, table.Id, folder: storageTempFolder);
            Console.WriteLine(JsonConvert.SerializeObject(tabResp.Table, Formatting.Indented));
        }
    }
}
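The status checks in the sample can also be pulled into a small helper, so the "Print the operation status" step can be exercised without calling the cloud service. The following is a minimal sketch, not part of the Aspose.PDF Cloud SDK: the class name TableParseStatus and the method Describe are hypothetical, and the helper only mirrors the conditions and messages used in the sample above.

```csharp
using System;

// Hypothetical helper (not part of the Aspose.PDF Cloud SDK).
// It reproduces the if/else chain from the sample as a pure function.
public static class TableParseStatus
{
    // code:       HTTP-style status code from the response, or null if there was no response.
    // tableCount: number of tables found, or null if the table list was missing.
    // fileName:   name of the document in cloud storage, used only in the message.
    public static string Describe(int? code, int? tableCount, string fileName)
    {
        if (code == null)
            return "GetTables(): Unexpected error!";

        // Anything outside the 2xx range is treated as a failure, as in the sample.
        if (code < 200 || code > 299)
            return "GetTables(): Failed to receive Tables from the document.";

        if (tableCount == null || tableCount == 0)
            return $"GetTables(): Tables not found in the document '{fileName}'.";

        return $"GetTables(): Tables successfully received from the document '{fileName}'.";
    }
}
```

For example, TableParseStatus.Describe(404, null, "sample.pdf") returns the "Failed to receive Tables" message, while TableParseStatus.Describe(200, 0, "sample.pdf") returns the "Tables not found" message. In the sample, the arguments would come from response.Code and response.Tables.List.Count.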