
Free PDF to XML conversion via Python

Use the free app or the Python SDK to convert between PDF and XML, as well as several popular formats from Microsoft® Word.

Python Code for Converting PDF to XML

# Converting PDF to HTML
# Assumes file, remote_name, remote_folder, GetClientId() and GetClientSecret() are defined by the caller.
import asposewordscloud
import asposewordscloud.models.requests
import asposecellscloud.apis.cells_api

wordsApi = asposewordscloud.apis.words_api.WordsApi(GetClientId(), GetClientSecret(), "v3.0")

# Name the intermediate HTML file so that the second step can pick it up.
request_save_options_data = asposewordscloud.HtmlSaveOptionsData(file_name=file + ".HTML")
request = asposewordscloud.models.requests.SaveAsRequest(name=remote_name, save_options_data=request_save_options_data, folder=remote_folder)
result = wordsApi.save_as(request)

# Converting HTML to XML
cellsApi = asposecellscloud.apis.cells_api.CellsApi(GetClientId(), GetClientSecret(), "v3.0")
result = cellsApi.cells_workbook_put_convert_workbook(file + ".HTML", format="XML")

How to convert PDF to XML in Cloud Apps

  1. Initialize WordsApi and CellsApi with the Client Id, Client Secret, Base URL and API version
  2. Create a ConvertDocumentRequest with the local file name and the format set to HTML
  3. Call WordsApi convertDocument to convert the PDF document to HTML
  4. Initialize SaveOption from CellsApi with SaveFormat set to XML
  5. Call the cellsSaveAsPostDocumentSaveAs method to convert the intermediate HTML file to XML
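
The steps above describe a two-stage pipeline: PDF to HTML first, then HTML to XML. The following is a minimal sketch of that orchestration only, not of the Aspose SDK itself. The function name convert_pdf_to_xml and the two callables pdf_to_html and html_to_xml are hypothetical; in practice they would wrap the Aspose.Words and Aspose.Cells Cloud calls. The sketch also assumes the intermediate and final files share the source file's base name.

```python
from pathlib import PurePosixPath
from typing import Callable, List, Tuple


def convert_pdf_to_xml(
    pdf_path: str,
    pdf_to_html: Callable[[str, str], None],
    html_to_xml: Callable[[str, str], None],
) -> str:
    """Run the two-stage conversion and return the path of the XML output.

    pdf_to_html and html_to_xml are hypothetical callables that take
    (source_path, target_path). They stand in for the cloud API calls.
    """
    source = PurePosixPath(pdf_path)
    if source.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {pdf_path!r}")

    html_path = str(source.with_suffix(".html"))  # intermediate file
    xml_path = str(source.with_suffix(".xml"))    # final output

    pdf_to_html(pdf_path, html_path)  # stage 1: PDF -> HTML
    html_to_xml(html_path, xml_path)  # stage 2: HTML -> XML
    return xml_path


# Usage with stand-in converters that only record what they were asked to do.
calls: List[Tuple[str, str, str]] = []
result = convert_pdf_to_xml(
    "docs/report.pdf",
    lambda src, dst: calls.append(("words", src, dst)),
    lambda src, dst: calls.append(("cells", src, dst)),
)
print(result)
print(calls)
```

Passing the converters in as arguments keeps the stage ordering testable without credentials or network access. The real SDK calls can be substituted later without changing the pipeline.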

Get Started with Aspose.Total REST APIs

  1. Create an account at Dashboard to get free API quota & authorization details
  2. Get Aspose.Words and Aspose.Cells Cloud SDKs for Python source code from Aspose.Words GitHub and Aspose.Cells GitHub repos to compile/use the SDK yourself or head to the Releases for alternative download options.
  3. Also have a look at Swagger-based API Reference for Aspose.Words and Aspose.Cells to know more about the REST API.

How to Convert PDF to other formats

You can use Aspose.Words to transform PDF files into HTML format. You can then pass the HTML files to any of the APIs in Aspose.Total, such as Aspose.Cells, Aspose.PDF, Aspose.Email, Aspose.Slides, Aspose.Diagram, Aspose.Tasks, Aspose.3D, or Aspose.HTML. This allows you to output the files in hundreds of different formats.

To see the full list of supported formats, please check the Aspose.Total Cloud page.

How to Convert PDF to Image formats

Aspose.Words Cloud SDK offers a few quick and easy ways to convert PDF files to various image formats, similar to the XML conversion above: direct REST API calls or the SDKs. Aspose.Words Cloud APIs support several image output formats: JPEG, PNG, BMP, GIF, and TIFF.

  1. Create a ConvertDocumentRequest object for the PDF document
  2. Call the ConvertDocument method of the WordsApi class instance to convert from PDF

How to Convert PDF to PDF

For PDF to PDF, you need to go to the web page PDF to PDF and upload the PDF file from your device. Then, you need to click on the “Convert” button and wait for the conversion to finish. After that, you can download the PDF file to your device.

How to Convert Webpage to XML format

For Webpage to XML format conversion, you need to go to the website Webpage to XML and enter the URL of the webpage you want to convert in the input box. Then, you need to click on the “Convert” button and wait for the conversion to finish. After that, you can download the XML file to your device.

FAQ

  • What is PDF Format?
    Portable Document Format (PDF) is a document format created by Adobe in the 1990s. Its purpose was to introduce a standard for representing documents and other reference material independently of application software, hardware and operating system. PDF files can be opened in Adobe Acrobat Reader/Writer as well as in most modern browsers such as Chrome, Safari and Firefox via extensions or plug-ins. Most commercially available software suites can also export their documents to PDF without any additional software component.
  • What is XML Format?
    XML stands for Extensible Markup Language. It is similar to HTML but differs in how it uses tags to define objects. The idea behind the XML file format was to store and transport data without depending on particular software or hardware tools. Its popularity is due to it being both human and machine readable. This makes it suitable for common data protocols in which objects are stored and shared over a network such as the World Wide Web (WWW). The “X” in XML stands for extensible, which implies that the language can be extended to any number of symbols as per user requirements. These features are why many standard file formats build on it, such as Microsoft Open XML, LibreOffice OpenDocument, XHTML and SVG.
  • How can I get started with Aspose.Total REST APIs?
    Quickstart not only guides you through the initialization of Aspose.Total Cloud API, it also helps you install the required libraries.
  • Where can I see the release notes for Aspose.Total Cloud API?
    Complete release notes can be reviewed at Aspose.Total Cloud Documentation.
  • Is it safe to convert PDF to XML in the Cloud?
    Of course! Aspose Cloud uses Amazon EC2 cloud servers that guarantee the security and resilience of the service. Please read more about Aspose's Security Practices.
  • What file formats are supported by Aspose.Total Cloud API?
    Aspose.Total Cloud can convert files from any product family to any other product family, as well as to PDF, DOCX, XPS, image (TIFF, JPEG, PNG, BMP), MD and more. Check out the complete list of supported file formats.
  • I cannot find the SDK for my favorite language. What should I do?
    Aspose.Total Cloud is also available as a Docker Container. Try using it with cURL in case your required SDK is not available yet.
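
The FAQ entry on XML notes that the format is both human and machine readable and that its tags are extensible. The short sketch below illustrates this with Python's standard xml.etree.ElementTree module. The element names invoice, customer and total are made up for the example and are not part of any Aspose output.

```python
import xml.etree.ElementTree as ET

# Build a small document using application-defined (made-up) tag names.
invoice = ET.Element("invoice", attrib={"id": "42"})
ET.SubElement(invoice, "customer").text = "Example Ltd"
ET.SubElement(invoice, "total", attrib={"currency": "EUR"}).text = "19.99"

# Serialize to text: the result is plain, human-readable markup.
xml_text = ET.tostring(invoice, encoding="unicode")
print(xml_text)

# Parse the text back: the same data is recoverable by a program.
parsed = ET.fromstring(xml_text)
total = parsed.find("total")
print(parsed.get("id"), total.text, total.get("currency"))
```

Because the tag names are chosen by the application, the same mechanism extends to any vocabulary, which is how formats such as XHTML and SVG are built on XML.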