Aspose.PDF for Python

Extract Pages from PDF document via Cloud Python SDK

API for working with Pages in PDF documents using Cloud Python SDK.


How to get Pages from PDF via Cloud Python SDK

To extract pages from a PDF, we’ll use the Aspose.PDF Cloud Python SDK. This Cloud SDK helps Python programmers develop cloud-based PDF creator, annotator, editor, and converter apps in Python via the Aspose.PDF REST API. Simply create an account at Aspose for Cloud and get your application information. Once you have the App SID and key, you are ready to give the Aspose.PDF Cloud Python SDK a try. You can install the Python package directly from GitHub or with pip:

Installation from GitHub


     
    pip install git+https://github.com/aspose-pdf-cloud/aspose-pdf-cloud-python.git

Installation from PyPI with pip

     
    pip install asposepdfcloud
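
After running either command, it can help to confirm that the package is importable before writing any API code. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the SDK.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


# "asposepdfcloud" is the import name used by the sample on this page.
if not is_installed("asposepdfcloud"):
    print("asposepdfcloud is not installed; run: pip install asposepdfcloud")
```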

Steps to get Pages from PDF via Python SDK

Aspose.PDF Cloud developers can easily load & extract pages from PDF in just a few lines of code.

  1. Install the Python SDK
  2. Upload a PDF document to the Aspose Cloud server
  3. Get page information of the PDF document
  4. Convert the page to PNG and download the result from the Aspose Cloud server
 

Extract Pages from PDF using Python


    import shutil
    import json
    import logging
    from pathlib import Path
    from asposepdfcloud import ApiClient, PdfApi, DocumentPagesResponse

    # Configure logging
    logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")


    class Config:
        """Configuration parameters."""
        CREDENTIALS_FILE = Path(r"C:\Projects\ASPOSE\Pdf.Cloud\Credentials\credentials.json")
        LOCAL_FOLDER = Path(r"C:\Samples")
        PDF_DOCUMENT_NAME = "sample.pdf"
        LOCAL_RESULT_DOCUMENT_NAME = "output_sample.png"
        PAGE_NUMBER = 2

    class PdfPages:
        """ Class for managing PDF pages using Aspose PDF Cloud API. """
        def __init__(self, credentials_file: Path = Config.CREDENTIALS_FILE):
            self.pdf_api = None
            self._init_api(credentials_file)

        def _init_api(self, credentials_file: Path):
            """ Initialize the API client. """
            try:
                with credentials_file.open("r", encoding="utf-8") as file:
                    credentials = json.load(file)
                    api_key, app_id = credentials.get("key"), credentials.get("id")
                    if not api_key or not app_id:
                        raise ValueError("init_api(): Error: Missing API keys in the credentials file.")
                    self.pdf_api = PdfApi(ApiClient(api_key, app_id))
            except (FileNotFoundError, json.JSONDecodeError, ValueError) as e:
                logging.error(f"init_api(): Failed to load credentials: {e}")

        def upload_document(self):
            """ Upload a PDF document to the Aspose Cloud server. """
            if self.pdf_api:
                file_path = Config.LOCAL_FOLDER / Config.PDF_DOCUMENT_NAME
                try:
                    self.pdf_api.upload_file(Config.PDF_DOCUMENT_NAME, str(file_path))
                    logging.info(f"upload_document(): File {Config.PDF_DOCUMENT_NAME} uploaded successfully.")
                except Exception as e:
                    logging.error(f"upload_document(): Failed to upload file: {e}")

        def get_page_info(self):
            """ Get page information of the PDF document. """
            if self.pdf_api:
                try:
                    result_pages: DocumentPagesResponse = self.pdf_api.get_page(Config.PDF_DOCUMENT_NAME, Config.PAGE_NUMBER)

                    if result_pages.code == 200:
                        logging.info(f"get_page_info(): Page #{Config.PAGE_NUMBER} information: {result_pages.page}")
                    else:
                        logging.error(f"get_page_info(): Failed to get the page #{Config.PAGE_NUMBER}.")
                except Exception as e:
                    logging.error(f"get_page_info(): Failed to get the page #{Config.PAGE_NUMBER}: {e}")

        def get_page_as_png(self):
            """ Convert a page of the PDF document to PNG and save it to the local folder. """
            if self.pdf_api:
                try:
                    # The call returns the path of a temporary file, which is then moved into place.
                    result_pages = self.pdf_api.get_page_convert_to_png(Config.PDF_DOCUMENT_NAME, Config.PAGE_NUMBER)
                    local_path = Config.LOCAL_FOLDER / Config.LOCAL_RESULT_DOCUMENT_NAME
                    shutil.move(result_pages, str(local_path))
                    logging.info(f"get_page_as_png(): File successfully downloaded: {local_path}")
                except Exception as e:
                    logging.error(f"get_page_as_png(): Failed to download file: {e}")

    if __name__ == "__main__":
        pdf_pages = PdfPages()
        pdf_pages.upload_document()
        pdf_pages.get_page_info()
        pdf_pages.get_page_as_png()
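
The sample reads the App SID and key from a JSON file with the fields `id` and `key`, but the file itself is not shown. The following is a minimal sketch of what such a file could look like and how it might be validated on its own. The helper names `write_sample_credentials` and `load_credentials` and the placeholder values are assumptions for illustration; only the `key` and `id` field names come from the sample.

```python
import json
from pathlib import Path
from typing import Tuple


def write_sample_credentials(credentials_file: Path) -> None:
    """Write a placeholder credentials file; replace the values with your own."""
    sample = {"key": "YOUR_APP_KEY", "id": "YOUR_APP_SID"}
    credentials_file.write_text(json.dumps(sample, indent=2), encoding="utf-8")


def load_credentials(credentials_file: Path) -> Tuple[str, str]:
    """Return (key, id) from the JSON file, or raise ValueError if either is missing."""
    with credentials_file.open("r", encoding="utf-8") as file:
        credentials = json.load(file)
    api_key, app_id = credentials.get("key"), credentials.get("id")
    if not api_key or not app_id:
        raise ValueError("The credentials file must contain non-empty 'key' and 'id' values.")
    return api_key, app_id
```

Keeping the credentials in a separate file, as the sample does, keeps the secret values out of the source code.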
 

Work with Pages in PDF

Extracting pages from a PDF is a common task that serves various purposes across different contexts. This process involves selecting specific pages from a larger document to create a new, separate PDF file. Understanding the reasons behind this practice can help in effectively managing and utilizing PDF documents.

Large PDF files can be cumbersome to share or store. By extracting only the necessary pages, users can create smaller, more manageable files. This is particularly useful when only a portion of the document is relevant for a specific purpose. For instance, removing unnecessary pages can significantly decrease the file size, making it easier to handle and distribute.

Extracting pages allows users to repurpose content for different applications. For example, one might extract pages from a comprehensive report to create a standalone summary or to isolate specific data for analysis. This enables the reuse of existing content without the need to recreate information from scratch. Extract the Pages from PDF documents with Aspose.PDF Cloud Python SDK.
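
When only some pages are needed, the single-page call from the sample can be repeated over a list of page numbers. The following is a rough sketch. The function name `save_pages_as_png` and the output naming scheme are hypothetical, `pdf_api` is expected to be a `PdfApi` instance created as in the sample, and the code assumes, as the sample does, that `get_page_convert_to_png` returns the path of a temporary file.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def save_pages_as_png(pdf_api, document_name: str,
                      page_numbers: Iterable[int], output_folder: Path) -> List[Path]:
    """Convert each requested page to PNG and move the result into output_folder."""
    output_folder.mkdir(parents=True, exist_ok=True)
    stem = Path(document_name).stem
    saved = []
    for page_number in page_numbers:
        # Assumption: the call returns the path of a temporary file, as in the sample.
        temp_file = pdf_api.get_page_convert_to_png(document_name, page_number)
        target = output_folder / f"{stem}_page_{page_number}.png"
        shutil.move(temp_file, str(target))
        saved.append(target)
    return saved
```

For example, `save_pages_as_png(pdf_pages.pdf_api, "sample.pdf", [1, 3], Path(r"C:\Samples"))` would be expected to produce `sample_page_1.png` and `sample_page_3.png`, provided the document has already been uploaded.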

With our Python library you can:

  • Combine PDF documents.
  • Split PDF Files.
  • Convert PDF to other formats, and vice versa.
  • Manipulate Annotations.
  • Work with Images in PDF, etc.

You can try out our free App to test the functionality online.