How to Generate and Download PDFs From HTML Templates using Node.js (Express) & Puppeteer.

In a freelancing project I was working on last week, I encountered this feature, and it took me quite a while to collect information from different sources to be able to implement it. So I just decided to document the procedures. Hope it enlightens someone :).

PDFs are awesome because, well, they are portable whilst maintaining their format. I'm going to be using Express as it's one of the most popular and powerful frameworks for building web apps in the JS world. NextJS on the front end is just my choice, but you can go with whatever you enjoy working with. All you need is to send a request anyway, right? Enough chitchat...

Server Setup (Express)

For this, you will need to install a couple of things. To begin, create your project folder and, inside it, another folder named server. Then, in your terminal, navigate to the server folder and run:

# first initialize npm
npm init -y

# then install the server dependencies, plus nodemon as a dev dependency
npm install express cors
npm install --save-dev nodemon

Then create a simple server and configure CORS so that the frontend is allowed to call it:

// server/app.js

const express = require('express');
const cors = require('cors');

const app = express();
const port = 7000;

app.use(
  cors({
    origin: 'http://localhost:3000', // we will set up our frontend at port 3000
  })
);

app.get('/', (req, res) => {
  res.send('Hello, world!');
});

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});

Now make sure it works, then set it aside for a while:

npx nodemon ./app.js # Server is running on http://localhost:7000

Frontend Setup (NextJS)

Next.js is a full-stack framework built on React, and that's what we'll be using. Open your project folder in the terminal and run the following to generate a new Next.js app (version 14.0.1 at the time of writing):

npx create-next-app@latest

You need Node.js installed for the above command to work. It will ask you a couple of questions, including the project's name. Just go with the defaults, or choose whatever you prefer.


Connecting Backend to Frontend

Let's test what we have so far. In your Next.js app, navigate to app/page.tsx (or page.jsx if you went with the JavaScript option). Remove everything in this file and create a very simple component that sends a request to our backend and displays whatever response it receives. This is a good point to install axios to help us with the requests:

# In your frontend app terminal
npm i axios

Awesome. Then add this to your app/page.tsx:

// client/app/page.tsx

'use client';
import axios from 'axios';
import { useEffect, useState } from 'react';

// where our server is running
const BACKEND_URL = 'http://localhost:7000';

const Home = () => {
  // to store the message from the server
  const [message, setMessage] = useState('');

  // fetch data from the server
  const fetchData = async () => {
    try {
      const res = await axios.get(BACKEND_URL);
      setMessage(res.data);
    } catch (err) {
      setMessage('Error occurred');
      console.error(err);
    }
  };

  useEffect(() => {
    fetchData();
  }, []);

  // render the message from the server
  return <div>{message || 'Fetching...'}</div>;
};

export default Home;

The code is quite simple. It sends a single request to our backend on port 7000 and renders the message it receives.

Now you have to run both the frontend and backend servers in two separate terminals. In your frontend terminal, run:

npm run dev # dev server by default listens at port 3000

Then in your backend terminal:

npx nodemon ./app.js # in our case opens port 7000

Now open your favorite browser and visit localhost:3000. If all goes well, you should see Fetching... for a moment, then the text Hello, world! that came from the backend. If you see any CORS-related errors, check that your frontend URL exactly matches the one passed in the origin option when setting up cors in the server.
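If you expect the frontend URL to change between environments, one option is to read the allowed origins from configuration instead of hard-coding them. The sketch below is only an illustration: the parseAllowedOrigins helper and the ALLOWED_ORIGINS variable are names I made up for this example. It relies on the fact that the cors middleware accepts an array for its origin option, and it strips trailing slashes because browsers send the Origin header without one.

```javascript
// Hypothetical helper: turn "http://a.com, http://b.com" into an array of origins.
function parseAllowedOrigins(value, fallback = ['http://localhost:3000']) {
  if (!value) return fallback;
  const origins = value
    .split(',')
    .map((origin) => origin.trim().replace(/\/+$/, '')) // drop trailing slashes
    .filter(Boolean);
  return origins.length > 0 ? origins : fallback;
}

// ALLOWED_ORIGINS is an assumed environment variable name.
const allowedOrigins = parseAllowedOrigins(process.env.ALLOWED_ORIGINS);
console.log(allowedOrigins);

// In server/app.js this could then be passed to the cors middleware:
// app.use(cors({ origin: allowedOrigins }));
```

With no variable set, this falls back to the same http://localhost:3000 origin used in this article.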


PDF Backend Init

Now let's set up a few things in the backend to get it ready for PDF generation. In this example, we will use Puppeteer, which is:

a Node library that provides a high-level API to control headless Chrome over the DevTools Protocol.

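In this walkthrough we point Puppeteer at a manually downloaded Chromium build for whatever OS you are using. (The puppeteer package normally downloads a compatible browser of its own during installation, so this step is optional if that download works for you.) Download Chromium from the page below, then move it to a preferred location on your PC and extract it: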

Download Chromium from Appspot

Now copy the path to the Chrome executable (chrome.exe on Windows), as we are about to use it. Then install Puppeteer in your backend:

npm i puppeteer

Now we can create an HTML template to be used in generating the PDF. This is just simple HTML with some internal CSS styling. Inside your server folder, create a templates folder and add a sample.html file with content such as:

<!-- server/templates/sample.html -->

<!DOCTYPE html>
<html>
  <head>
    <style>
      body {
        max-width: 800px;
        margin: 0 auto;
        padding: 20px;
        font-family: Arial, sans-serif;
      }

      header {
        text-align: center;
        padding: 20px;
        background-color: #f2f2f2;
      }

      h1 {
        color: #333;
      }

      table {
        width: 100%;
        border-collapse: collapse;
        margin-bottom: 20px;
      }

      th,
      td {
        padding: 10px;
        text-align: left;
        border-bottom: 1px solid #ddd;
      }

      footer {
        text-align: center;
        padding: 20px;
        background-color: #f2f2f2;
      }
    </style>
  </head>
  <body>
    <header>
      <h1>Sample Header</h1>
    </header>

    <div>
      <h2>Content</h2>
      <p>This is a sample content.</p>
    </div>

    <table>
      <thead>
        <tr>
          <th>ID</th>
          <th>Name</th>
          <th>Age</th>
          <th>Major</th>
        </tr>
      </thead>
      <tbody>
        {{DataHere}}
      </tbody>
    </table>

    <footer>
      <p>Sample Footer</p>
    </footer>
  </body>
</html>

This is a simple page with a header, content with a table and a footer at the bottom. The {{DataHere}} is a placeholder where we will include our data.
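To make the placeholder idea concrete, here is a minimal sketch of how it could be filled in. The helper names (escapeHtml, renderRows, fillTemplate) are mine and are not part of the project. It differs from the route handler in this article in three ways, all of which are my own assumptions about what you might want: it escapes the values, since names typed by a user could contain characters like < or &; it numbers the rows from 1; and it passes a function to replace so that sequences such as $& inside the data are not treated as special replacement patterns.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Build one <tr> per student; IDs start at 1.
function renderRows(students) {
  return students
    .map(
      (student, index) =>
        `<tr><td>${index + 1}</td><td>${escapeHtml(student.name)}</td>` +
        `<td>${escapeHtml(student.age)}</td><td>${escapeHtml(student.major)}</td></tr>`
    )
    .join('');
}

// A function replacement inserts the rows literally, without "$" pattern handling.
function fillTemplate(template, students) {
  return template.replace('{{DataHere}}', () => renderRows(students));
}

const template = '<tbody>{{DataHere}}</tbody>';
console.log(fillTemplate(template, [{ name: 'Jane <Doe>', age: 20, major: 'R&D' }]));
// <tbody><tr><td>1</td><td>Jane &lt;Doe&gt;</td><td>20</td><td>R&amp;D</td></tr></tbody>
```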

Almost there... Now create a new file, helper.js, to hold a generatePDF helper function that handles the PDF generation logic. Add the following code to it:

// server/helper.js

const fs = require('fs');
const puppeteer = require('puppeteer');

const generatePDF = async (htmlTemplate, pdfName, pdfFilePath) => {
  let browser;
  try {
    // Create a browser instance
    browser = await puppeteer.launch({
      headless: 'new',
      executablePath:
        'c:\\Users\\Hassan\\Development\\drivers\\chrome-win\\chrome.exe', // set your chrome.exe path here
    });

    // Create a new page
    const page = await browser.newPage();

    // Load your HTML template into the page
    await page.setContent(htmlTemplate, {
      waitUntil: 'domcontentloaded',
    });

    // Emulate screen media type
    await page.emulateMediaType('screen');

    // Wait for fonts to load
    await page.evaluateHandle('document.fonts.ready');

    // Get a PDF buffer from the page
    const pdfBytes = await page.pdf({
      printBackground: true,
      format: 'A4',
      displayHeaderFooter: true,
    });

    // Write the PDF buffer to a file
    await fs.promises.writeFile(pdfFilePath, pdfBytes);

    // Return the name of the file that was saved
    return pdfName;
  } catch (error) {
    console.log(error, 'Error generating PDF');
    return null;
  } finally {
    // Close the browser even if something failed along the way
    if (browser) {
      await browser.close();
    }
  }
};

module.exports = generatePDF;

This code is as straightforward as it gets. We create a browser instance, create a new page, load our HTML template into the page, and then generate a PDF from the page. We then write the PDF to a file and return the name of the file that was saved. If an error occurs, we log it to the console and return null.
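The hard-coded executablePath only works on my machine, so you will want to change it. One possible approach, sketched below under my own assumptions, is to read the path from an environment variable and leave executablePath out when it is not set, in which case Puppeteer falls back to the browser it downloaded during installation. The buildLaunchOptions helper and the CHROME_PATH variable name are hypothetical and not part of the project.

```javascript
const fs = require('node:fs');

// Hypothetical helper: build Puppeteer launch options from an env-like object.
function buildLaunchOptions(env = process.env) {
  // Same headless setting as in helper.js
  const options = { headless: 'new' };

  // CHROME_PATH is an assumed variable name; use whatever fits your setup.
  const chromePath = env.CHROME_PATH;
  if (chromePath && fs.existsSync(chromePath)) {
    options.executablePath = chromePath;
  }
  return options;
}

console.log(buildLaunchOptions());

// In helper.js this could replace the inline options object:
// const browser = await puppeteer.launch(buildLaunchOptions());
```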

Now all that is left in the backend is to set up a route and its handler. So let's quickly do that. Adjust your app.js to look like this:

// server/app.js

const express = require('express');
const cors = require('cors');
const fs = require('fs');
const path = require('path');

const generatePDF = require('./helper');

const app = express();
const port = 7000;

// parse req body
app.use(express.json());

app.use(
  cors({
    origin: 'http://localhost:3000',
  })
);

app.get('/', (req, res) => {
  res.send('Hello, world!');
});

app.post('/generate-pdf', async (req, res) => {
  // get the students array from req body
  const students = req.body;
  if (!Array.isArray(students)) {
    return res.status(400).send('Expected an array of students');
  }

  // read the html template
  const htmlTemplate = fs.readFileSync(
    path.resolve(__dirname, './templates/sample.html'),
    'utf-8'
  );

  // create a folder to store the pdfs
  const pdfFolderPath = path.resolve(__dirname, './pdfs');
  // create the folder if it doesn't exist
  if (!fs.existsSync(pdfFolderPath)) {
    fs.mkdirSync(pdfFolderPath);
  }

  // create a unique pdf name
  const pdfName = `students_${Date.now()}.pdf`;
  const pdfFilePath = path.join(pdfFolderPath, pdfName);

  // create rows with students data
  const tableRows = students
    .map(
      (student, id) => `
    <tr>
      <td>${id}</td>
      <td>${student.name}</td>
      <td>${student.age}</td>
      <td>${student.major}</td>
    </tr>
  `
    )
    .join('');

  // replace the placeholder in html template with actual table rows with student data
  const html = htmlTemplate.replace('{{DataHere}}', tableRows);

  try {
    // generate pdf
    const file = await generatePDF(html, pdfName, pdfFilePath);

    if (!file) {
      throw new Error('Error generating PDF');
    }

    // convert pdf to base64
    const bitmap = await fs.promises.readFile(pdfFilePath);
    const pdfBase64 = Buffer.from(bitmap).toString('base64');

    // unlink the file - optional - deletes file from server
    if (fs.existsSync(pdfFilePath)) {
      fs.unlinkSync(pdfFilePath);
    }

    // send the base64 pdf as response
    res.send({
      file: pdfBase64,
      message: 'Success!',
    });
  } catch (error) {
    console.log(error);
    res.status(500).send('Error generating PDF');
  }
});

app.listen(port, () => {
  console.log(`Server is running on http://localhost:${port}`);
});

A few things are happening here:

  • We first import the fs and path modules built into Node, which are used for managing files and file paths respectively. We also import the generatePDF function we created earlier.

  • The app.use(express.json()); call is an Express middleware that lets us extract data from the request body. You can read more about middleware in the Express documentation.

  • The app.post('/generate-pdf', ...) handler is our controller for anything POSTed to the /generate-pdf path in our backend. It reads the students from the request body, reads the sample template, and replaces the placeholder text with the mapped table rows. It then generates the PDF from the resulting HTML through our generatePDF function, converts it to base64 so it can travel inside a JSON response (note that base64 is roughly a third larger than the raw file, not smaller), deletes the PDF created on the server, and sends the converted version.
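To see what the base64 step does to the payload, here is a small standalone sketch. It uses a dummy buffer as a stand-in for the PDF bytes, so the numbers are only illustrative. Base64 turns every three bytes into four text characters, which makes the data safe to embed in JSON at the cost of some extra size.

```javascript
// Stand-in for the bytes of a generated PDF (a real file would come from fs.readFile).
const pdfBytes = Buffer.alloc(3000, 1);

// Encode to base64 text, as the route handler does before responding.
const pdfBase64 = pdfBytes.toString('base64');

// Decode it again to confirm nothing is lost in the round trip.
const decoded = Buffer.from(pdfBase64, 'base64');

console.log(pdfBytes.length); // 3000
console.log(pdfBase64.length); // 4000
console.log(decoded.equals(pdfBytes)); // true
```

If the extra size ever becomes a problem, sending the raw bytes with a PDF content type is an alternative worth considering, though that is outside what this article covers.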


Now that we have the controller ready, let's trigger it from our front end. Back in page.tsx, replace the component we previously had with the following:

// client/app/page.tsx

'use client';
import axios from 'axios';
import { useState } from 'react';

// where our server is running
const BACKEND_URL = 'http://localhost:7000';

// sample list of students each with a name, age and major
const students = [
  {
    name: 'Yuqee Chen',
    age: 21,
    major: 'Computer Science',
  },
  {
    name: 'Jane Doe',
    age: 20,
    major: 'Engineering',
  },
  {
    name: 'Tiffany Wei',
    age: 22,
    major: 'Business',
  },
];

const Home = () => {
  // to track the error and loading states
  const [error, setError] = useState('');
  const [isLoading, setIsLoading] = useState(false);

  // request the PDF from the server and trigger the download
  const fetchData = async () => {
    setIsLoading(() => true);
    try {
      const { data } = await axios.post(
        `${BACKEND_URL}/generate-pdf`,
        students
      );

      const a = document.createElement('a');

      // Set the href attribute to the data URL of the base64 file
      a.href = 'data:application/pdf;base64,' + data.file;

      // Set the download attribute to the file name
      a.download = 'my-server-doc.pdf';

      // Trigger the download by clicking the element
      a.click();
    } catch (err) {
      setError('Error occurred');
      console.error(err);
    }
    setIsLoading(() => false);
  };

  // render the download button and any error message
  return (
    <div>
      <button type='button' onClick={fetchData}>
        {isLoading ? 'Downloading...' : 'Click to download pdf!'}
      </button>

      {error && <span>{error}</span>}
    </div>
  );
};

export default Home;

What happens here is also quite straightforward:

  • We add two useState hooks to track the error and loading states.

  • We then add a static list of students' data.

  • We create a fetchData function that sends a POST request to our backend with the list of students above.

  • It then creates an anchor element whose href is a data URL built from the base64 PDF data destructured from the backend response, and this link is clicked programmatically, through a.click(), to download the PDF.

  • Finally, we have a button that calls fetchData when clicked and shows the loading state while the request is in flight, plus an error message if something goes wrong.
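A data URL works fine for small documents like ours, but it can become unwieldy for large files. As an alternative, here is a hedged sketch that decodes the base64 string into a Blob and downloads it through an object URL. The base64ToBlob helper is a name I made up, and the download lines are left as comments because they need a browser with the anchor element from the component above.

```javascript
// Decode a base64 string into a Blob (atob and Blob exist in browsers and in Node 18+).
function base64ToBlob(base64, mimeType = 'application/pdf') {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// In the browser, the Blob could then be downloaded like this:
// const url = URL.createObjectURL(base64ToBlob(data.file));
// a.href = url;
// a.download = 'my-server-doc.pdf';
// a.click();
// URL.revokeObjectURL(url); // release the object URL once the download has started

const blob = base64ToBlob('JVBERg=='); // base64 of the four characters "%PDF"
console.log(blob.size, blob.type); // 4 application/pdf
```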

Finally, the setup is done. All that remains is to make sure the two servers are up and running, then visit localhost:3000 and click the button. The magic will happen, and the PDF will be downloaded with our data.


The source code is in my GitHub repo at hassanShakur/pdf-downloader.


And that's it. Hope you suffered enough reading this first article of mine. I enjoy seeing people s... Forget it. Hope you enjoyed the show. Till later InshaAllah.