Verifying PDF file data in Playwright

by Sumit Kumar Pradhan on May 21, 2025 in Playwright

In this example, we will explore how to verify PDF text in Playwright. Playwright, an open-source automation framework, has no built-in PDF validation capabilities, but you can validate PDF content with a third-party Node.js library such as pdf-parse by importing it into your Playwright project.

To install the pdf-parse library

Run the npm command below to install the library in your Playwright project.

npm install pdf-parse

In this example, we will automate clicking a download button, save the PDF, and verify its text content and number of pages.

import fs from "fs";
import pdfParse from "pdf-parse";
import { test, expect } from "@playwright/test";

test("pdf verification example", async ({ page }) => {
  await page.goto("https://examplefile.com/document/pdf/1-mb-pdf");
  // Path where the downloaded PDF is saved.
  const filePath = "../download";

  // Start waiting for the download before clicking.
  const downloadPromise = page.waitForEvent("download");
  await page.locator("[class='lnr lnr-download']").click();
  const download = await downloadPromise;
  // saveAs waits for the download to finish, then copies it to filePath.
  await download.saveAs(filePath);

  // Read the saved file and parse it with pdf-parse.
  const dataBuffer = fs.readFileSync(filePath);
  const data = await pdfParse(dataBuffer);

  // PDF text
  console.log(data.text);
  // PDF info
  console.log(data.info);
  // PDF metadata
  console.log(data.metadata);
  // Number of pages
  console.log(data.numpages);

  expect(data.text).toContain(`can save time and effort in your workflow. Download your free 1 MB sample PDF file today and start`);
  expect(data.numpages).toEqual(324);
});
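
The checks in the test above can be made a little more robust with two small helpers. This is only a sketch: hasPdfSignature and normalizeWhitespace are hypothetical names, not part of Playwright or pdf-parse. The first confirms that the saved file starts with the PDF signature "%PDF-", so a failed download (for example, an HTML error page) is caught before parsing. The second collapses whitespace, on the assumption that the extracted text may contain line breaks wherever the PDF wraps a sentence.

```typescript
// Hypothetical helpers for illustration; not part of Playwright or pdf-parse.

// Returns true when the bytes begin with the PDF file signature "%PDF-".
function hasPdfSignature(bytes: Uint8Array): boolean {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  if (bytes.length < signature.length) {
    return false;
  }
  return signature.every((byte, i) => bytes[i] === byte);
}

// Collapses every run of whitespace (spaces, tabs, newlines) into a single
// space and trims both ends, so phrases split across lines still match.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}
```

In the test, these could be used as expect(hasPdfSignature(dataBuffer)).toBe(true) before parsing, and expect(normalizeWhitespace(data.text)).toContain("...") for the text check. A Node.js Buffer is a Uint8Array, so the value returned by fs.readFileSync can be passed in directly.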

That covers PDF validation in Playwright.
