Extract Pages from PDF document via Cloud Python SDK
API for working with Pages in PDF documents using Cloud Python SDK.
How to get Pages from PDF via Cloud Python SDK
To extract Pages from PDF, we’ll use Aspose.PDF Cloud Python SDK. This Cloud SDK helps Python programmers develop cloud-based PDF creator, annotator, editor, and converter apps via the Aspose.PDF REST API. Simply create an account at Aspose for Cloud and get your application information. Once you have the App SID & key, you are ready to give the Aspose.PDF Cloud Python SDK a try. The Python package is hosted on GitHub, so you can install it directly from GitHub or with pip:
Installation from Github
pip install git+https://github.com/aspose-pdf-cloud/aspose-pdf-cloud-python.git
Installation via pip
pip install asposepdfcloud
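After either install command, it can help to confirm that Python can actually find the package before running the sample. The following is a minimal sketch using only the standard library; the helper name `is_package_installed` is hypothetical, and the module name `asposepdfcloud` is taken from the import statement in the sample code.

```python
import importlib.util


def is_package_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Module name assumed from the sample's "from asposepdfcloud import ..." line.
    if is_package_installed("asposepdfcloud"):
        print("asposepdfcloud is installed.")
    else:
        print("asposepdfcloud was not found; run one of the pip commands above.")
```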
Steps to get Pages from PDF via Python SDK
Aspose.PDF Cloud developers can easily load & extract pages from PDF in just a few lines of code.
- Install Python SDK
- Upload a PDF document to the Aspose Cloud server
- Get page information of the PDF document
- Download the extracted page (converted to a PNG image) from the Aspose Cloud server
Extract Pages from PDF using Python
import shutil
import json
import logging
from pathlib import Path
from asposepdfcloud import ApiClient, PdfApi, DocumentPagesResponse

# Configure logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")


class Config:
    """Configuration parameters."""
    CREDENTIALS_FILE = Path(r"C:\Projects\ASPOSE\Pdf.Cloud\Credentials\credentials.json")
    LOCAL_FOLDER = Path(r"C:\Samples")
    PDF_DOCUMENT_NAME = "sample.pdf"
    LOCAL_RESULT_DOCUMENT_NAME = "output_sample.png"
    PAGE_NUMBER = 2


class PdfPages:
    """ Class for managing PDF pages using Aspose PDF Cloud API. """

    def __init__(self, credentials_file: Path = Config.CREDENTIALS_FILE):
        self.pdf_api = None
        self._init_api(credentials_file)

    def _init_api(self, credentials_file: Path):
        """ Initialize the API client. """
        try:
            # The credentials file is JSON with "key" and "id" fields.
            with credentials_file.open("r", encoding="utf-8") as file:
                credentials = json.load(file)
                api_key, app_id = credentials.get("key"), credentials.get("id")
                if not api_key or not app_id:
                    raise ValueError("_init_api(): Error: Missing API keys in the credentials file.")
                self.pdf_api = PdfApi(ApiClient(api_key, app_id))
        except (FileNotFoundError, json.JSONDecodeError, ValueError) as e:
            logging.error(f"_init_api(): Failed to load credentials: {e}")

    def upload_document(self):
        """ Upload a PDF document to the Aspose Cloud server. """
        if self.pdf_api:
            file_path = Config.LOCAL_FOLDER / Config.PDF_DOCUMENT_NAME
            try:
                self.pdf_api.upload_file(Config.PDF_DOCUMENT_NAME, str(file_path))
                logging.info(f"upload_document(): File {Config.PDF_DOCUMENT_NAME} uploaded successfully.")
            except Exception as e:
                logging.error(f"upload_document(): Failed to upload file: {e}")

    def get_page_info(self):
        """ Get page information of the PDF document. """
        if self.pdf_api:
            result_pages: DocumentPagesResponse = self.pdf_api.get_page(Config.PDF_DOCUMENT_NAME, Config.PAGE_NUMBER)
            if result_pages.code == 200:
                logging.info(f"Page #{Config.PAGE_NUMBER} information: {result_pages.page}")
            else:
                logging.error(f"Failed to get the page #{Config.PAGE_NUMBER}.")

    def get_page_as_png(self):
        """ Convert a page of the PDF document to PNG and save it locally. """
        if self.pdf_api:
            try:
                # The call returns the path of a downloaded temporary file.
                result_pages = self.pdf_api.get_page_convert_to_png(Config.PDF_DOCUMENT_NAME, Config.PAGE_NUMBER)
                local_path = Config.LOCAL_FOLDER / Config.LOCAL_RESULT_DOCUMENT_NAME
                shutil.move(result_pages, str(local_path))
                logging.info(f"get_page_as_png(): File successfully downloaded: {local_path}")
            except Exception as e:
                logging.error(f"get_page_as_png(): Failed to download file: {e}")


if __name__ == "__main__":
    pdf_pages = PdfPages()
    pdf_pages.upload_document()
    pdf_pages.get_page_info()
    pdf_pages.get_page_as_png()
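The sample reads the App SID and key from a JSON file whose fields are named `id` and `key`. The sketch below isolates that loading step as a standalone function so the file format is easy to see and check; the function name `load_credentials` is hypothetical, and the values shown in the comment are placeholders rather than real credentials.

```python
import json
from pathlib import Path
from typing import Tuple

# Expected layout of credentials.json (placeholder values):
# {
#     "key": "your-app-key",
#     "id": "your-app-sid"
# }


def load_credentials(credentials_file: Path) -> Tuple[str, str]:
    """Return (key, id) from a JSON credentials file.

    Raises ValueError if either field is missing or empty.
    """
    with credentials_file.open("r", encoding="utf-8") as file:
        credentials = json.load(file)
    api_key, app_id = credentials.get("key"), credentials.get("id")
    if not api_key or not app_id:
        raise ValueError("Missing 'key' or 'id' in the credentials file.")
    return api_key, app_id
```

Unlike the sample, which logs the error and leaves `pdf_api` unset, this variant raises so that a bad credentials file fails immediately at the call site.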
Work with Pages in PDF
Extracting pages from a PDF is a common task that serves various purposes. It involves selecting specific pages from a larger document to create a new, separate file.

Large PDF files can be cumbersome to share or store. By extracting only the necessary pages, users can create smaller, more manageable files. This is particularly useful when only a portion of the document is relevant for a specific purpose: leaving out unnecessary pages decreases the file size, making the result easier to handle and distribute.

Extracting pages also allows users to repurpose content for different applications. For example, one might extract pages from a comprehensive report to create a standalone summary or to isolate specific data for analysis. This enables the reuse of existing content without recreating it from scratch.

Extract the Pages from PDF documents with Aspose.PDF Cloud Python SDK.
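When more than one page is needed, the single-page call from the sample can be repeated in a loop. The following is a rough sketch, not an official SDK feature: the helper name `export_pages_as_png` and the `page_<n>.png` naming scheme are hypothetical, and it assumes `get_page_convert_to_png` is called the same way as in the sample and returns the path of a downloaded temporary file.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def export_pages_as_png(pdf_api, document_name: str,
                        page_numbers: Iterable[int],
                        output_folder: Path) -> List[Path]:
    """Convert each requested page to PNG and move it into output_folder.

    pdf_api is expected to be an initialized PdfApi instance (or any object
    exposing get_page_convert_to_png(name, page_number)).
    """
    output_folder.mkdir(parents=True, exist_ok=True)
    saved_files = []
    for page_number in page_numbers:
        # Assumption: the call returns the path of a temporary file,
        # as it is used in the sample's get_page_as_png().
        temp_file = pdf_api.get_page_convert_to_png(document_name, page_number)
        target = output_folder / f"page_{page_number}.png"
        shutil.move(str(temp_file), str(target))
        saved_files.append(target)
    return saved_files
```

With the sample's class, this could be called as `export_pages_as_png(pdf_pages.pdf_api, Config.PDF_DOCUMENT_NAME, [1, 2], Config.LOCAL_FOLDER)` after the document has been uploaded.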
With our Python library you can:
- Combine PDF documents.
- Split PDF Files.
- Convert PDF to other formats, and vice versa.
- Manipulate Annotations.
- Work with Images in PDF, etc.
You can try out our free App to test the functionality online.