Parse PDF documents via .NET SDK

Extract forms, images, tables and text boxes from PDF documents using the Cloud .NET SDK

How to parse PDF documents to extract forms, images, tables and text boxes using the Cloud .NET SDK

To parse PDF documents via the Cloud .NET SDK, we’ll use the Aspose.PDF Cloud .NET SDK. This Cloud SDK lets you build cloud-based PDF creator, editor and converter apps in C#, ASP.NET, or other .NET languages for various cloud platforms. Open the NuGet package manager, search for Aspose.PDF Cloud, and install it. You can also run the following command from the Package Manager Console.

Package Manager Console Command


    PM> Install-Package Aspose.Pdf-Cloud
     

Steps to parse PDF documents via .NET SDK

Aspose.PDF Cloud developers can easily load & parse PDF documents in just a few lines of code.

  1. Create an instance of PdfApi using your AppSid and AppSecret from the Aspose Cloud Dashboard.
  2. Upload the PDF to Cloud Storage.
  3. Extract the tables from the PDF document.
  4. Print the operation status to the console.
 

This sample code shows how to parse a PDF document to extract its tables:


    // Assumed usings: System, System.IO, System.Threading.Tasks, Newtonsoft.Json,
    // Aspose.Pdf.Cloud.Sdk.Api and Aspose.Pdf.Cloud.Sdk.Model.
    public static async Task ParseTables()
    {
        const string localPdfFileName = @"C:\Samples\sample.pdf";
        const string storageFileName = "sample.pdf";
        const string storageTempFolder = "YourTempFolder";

        // Get your AppSid and AppSecret at https://dashboard.aspose.cloud (free registration required).
        // AppSecret and AppSid are assumed to be string members defined elsewhere in the class.
        var pdfApi = new PdfApi(AppSecret, AppSid);

        // Upload the local PDF to cloud storage. The storage path is built with a
        // forward slash so it does not depend on the local platform's separator.
        using var file = File.OpenRead(localPdfFileName);
        var uploadResult = pdfApi.UploadFile($"{storageTempFolder}/{storageFileName}", file);
        Console.WriteLine(uploadResult.Uploaded[0]);

        // Request the list of tables recognized in the uploaded document.
        TablesRecognizedResponse response = await pdfApi.GetDocumentTablesAsync(storageFileName, folder: storageTempFolder);
        if (response == null)
            Console.WriteLine("GetTables(): Unexpected error!");
        else if (response.Code < 200 || response.Code > 299)
            Console.WriteLine("GetTables(): Failed to receive Tables from the document.");
        else if (response.Tables == null || response.Tables.List == null || response.Tables.List.Count == 0)
            Console.WriteLine("GetTables(): Tables not found in the document '{0}'.", storageFileName);
        else
        {
            Console.WriteLine("GetTables(): Tables successfully received from the document '{0}'.", storageFileName);
            foreach (var table in response.Tables.List)
            {
                // Fetch each table by its Id and print it as indented JSON.
                var tabResp = await pdfApi.GetTableAsync(storageFileName, table.Id, folder: storageTempFolder);
                Console.WriteLine(JsonConvert.SerializeObject(tabResp.Table, Formatting.Indented));
            }
        }
    }
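
The status handling in the sample can be separated from the API calls so that it is easy to check in isolation. The following is a minimal sketch and not part of the Aspose.PDF Cloud SDK: `TableParseStatus`, `TableParseStatusHelper` and `Classify` are hypothetical names introduced here for illustration. It assumes only what the sample shows, namely that the response carries a numeric HTTP status code and a possibly empty list of tables.

```csharp
// Hypothetical helper, not part of the Aspose.PDF Cloud SDK.
public enum TableParseStatus
{
    NoResponse,     // the call returned null
    RequestFailed,  // status code outside the 2xx range
    NoTables,       // request succeeded but no tables were recognized
    TablesFound     // request succeeded and at least one table is available
}

public static class TableParseStatusHelper
{
    // Mirrors the branch order of the sample:
    // null response, non-2xx code, empty table list, success.
    public static TableParseStatus Classify(bool hasResponse, int code, int tableCount)
    {
        if (!hasResponse)
            return TableParseStatus.NoResponse;
        if (code < 200 || code > 299)
            return TableParseStatus.RequestFailed;
        if (tableCount <= 0)
            return TableParseStatus.NoTables;
        return TableParseStatus.TablesFound;
    }
}
```

With the sample's `response` variable, one way to call it would be `TableParseStatusHelper.Classify(response != null, response?.Code ?? 0, response?.Tables?.List?.Count ?? 0)`, and then to switch on the returned value to choose the console message.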