PDF without worries

29 Mar

PDF: One of the most used file formats.

It is also one of the simplest to use: since even browsers can open PDFs, we don’t need any additional software besides the operating system.

One of the first pieces of advice we received was to always print files using PDF. Even in today’s age when unnecessary printing is being discouraged, we still need an easy way to create and share documents without worrying about formatting issues caused by different versions of software.

But what about us, developers?

What tools do we have at our disposal to create invoices, certificates, or whatever it is your boss sends to other bosses? Well, as most of you already know, that is a little more difficult than one would expect. There are tons of possible solutions: every language offers at least a few libraries (going by Google, PHP seems to have the most), and there are also many paid services. So, which one to use?

Well, if you are searching for a simple answer that says "Use XYZ!", I will disappoint you. There is no single solution that fits everyone. But let me show you some (spoiler alert: 2) simple ways to create PDF documents without much worry and see where that gets us.

Before we start, I think we might want to define what we want to achieve. I compiled requirements I consider important:

Styling (fonts, colors, etc)
Tables with header on every page
Header/footer on every page with page numbers
Running custom code (for charts, maps, etc…)

Other features, such as signed documents or editable forms, are nice to have but not essential.

React-pdf

An open source library that combines React (to define layout and content) and PDFKit (for rendering). One of its biggest advantages is simplicity: anyone with knowledge of React will be able to generate documents in a matter of minutes. Also, the rendering process doesn’t use any HTML-to-PDF conversion but creates documents directly in PDF format (although, to be honest, I am still not sure how much of an advantage that is). Styling is done with CSS properties and flexbox, either through inline styles or a StyleSheet object.

Introduction

Example of a simple document:

import { Document, Page, Text, Image } from "@react-pdf/renderer";

// A one-page document with centered text and an image.
const OneLinePDF = () => (
  <Document>
    <Page>
      <Text style={{ marginLeft: "auto", marginRight: "auto" }}>
        Test ME!
      </Text>
      <Image
        style={{ width: 200, marginLeft: "auto", marginRight: "auto" }}
        src="https://picsum.photos/200"
      />
    </Page>
  </Document>
);

Now we have multiple options on how to get our document to the user. React-pdf supports both client-side and server-side rendering. This is how our document looks rendered in a browser:

<PDFViewer width="800px" height="500px">
  <OneLinePDF />
</PDFViewer>

Simple and self-explanatory. But we might not want to display documents; we might want to let users download them instead. React-pdf provides the PDFDownloadLink component for that:

<PDFDownloadLink document={<OneLinePDF />} fileName="react-pdf.pdf">
  {({ blob, url, loading, error }) => loading ? "Loading..." : "Download document!"}
</PDFDownloadLink>

OK, client-side rendering might be cool, but it is less practical in the real world. Luckily, server-side rendering is just as easy.

const { default: ReactPDF } = require("@react-pdf/renderer");

ReactPDF.render(<OneLinePDF />, `react-pdf.pdf`);

And that’s it. Combine it with Express.js, add some parameters, and a report-generating endpoint is ready. Now let’s check which of the required features are supported.

Tables

React-pdf has a set of available components that we can use to define documents. A quick glance does not reveal any table components, and that seems to be true for PDFKit as well. But we need our tables! It looks like the only way to get tables is to style <Text> and <View> components to look like them. At least we can use flexbox.

<View
  style={{
    display: "flex",
    flexDirection: "row",
    borderColor: "black",
    borderStyle: "solid",
    borderWidth: "2",
    height: "35px",
    alignItems: "center",
    backgroundColor: "lightgrey",
  }}
>
  <Text
    style={{
      flex: 1,
      flexGrow: 1,
    }}
  >
    Firstname
  </Text>
  <Text
    style={{
      flex: 1,
      flexGrow: 1,
    }}
  >
    Lastname
  </Text>
</View>
 
{
  data.map((d) => (
    <View
      style={{
        marginTop: -2,
        display: "flex",
        flexDirection: "row",
        borderColor: "black",
        borderStyle: "solid",
        borderWidth: "2",
        height: "25px",
      }}
      wrap={false}
    >
      <Text
        style={{
          flex: 1,
          flexGrow: 1,
        }}
      >
        {d.firstName}
      </Text>
      <Text
        style={{
          flex: 1,
          flexGrow: 1,
        }}
      >
        {d.lastName}
      </Text>
    </View>
  ))
}

This gives us a simple table (admittedly terrible-looking, but hey, I am not going to steal the fun of writing CSS from you).

We can also create our own Table, TableCell, etc. components to avoid repetition. But look at the end of the first page.

A table row is split across the page break. It might not be obvious, since all values are the same, but we lost one John Smith. Yeah, this is going to Jira. The fix isn’t complicated: set wrap={false} on every row <View> (the data rows in the example above already include it) and try again.

Perfect. This property can also be used on any part of the document that must not be split. Also be sure to check the documentation for more page-wrapping options.

Now we would like to have a table header on every page. I am going to let you down on this one. I wasn’t able to find any way to achieve that, except for counting how many rows fit on one page and rendering the header at the start of each page. If rows have different heights, you are probably out of luck.

While we are on the subject of tables, I believe it is also worth mentioning the react-pdf-table library. I haven’t really used it, but from a quick look at the source, it looks like all its components are wrappers around Text and View, similar to my example. One issue I had with this library was cells breaking in the middle, which I couldn’t find a way to configure.
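The row-counting workaround can be sketched as a small helper. This is only a sketch under the assumption that every row has the same fixed height; chunkRows and rowsPerPage are hypothetical names and are not part of react-pdf.

```javascript
// Split rows into page-sized chunks so that a header row can be rendered
// before each chunk. Assumes all rows share the same fixed height.
function chunkRows(rows, rowsPerPage) {
  if (!Number.isInteger(rowsPerPage) || rowsPerPage < 1) {
    throw new RangeError("rowsPerPage must be a positive integer");
  }
  const pages = [];
  for (let i = 0; i < rows.length; i += rowsPerPage) {
    pages.push(rows.slice(i, i + rowsPerPage));
  }
  return pages;
}

// Hypothetical data: five rows, and we assume two rows fit on one page.
const people = [
  { firstName: "John", lastName: "Smith" },
  { firstName: "John", lastName: "Smith" },
  { firstName: "John", lastName: "Smith" },
  { firstName: "John", lastName: "Smith" },
  { firstName: "John", lastName: "Smith" },
];
const pages = chunkRows(people, 2);
console.log(pages.map((page) => page.length)); // [ 2, 2, 1 ]
```

Each chunk can then be rendered as a header row followed by its data rows, one chunk per page. How many rows actually fit depends on the page size, margins and row height, so that number has to be worked out for the specific template.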

Header/footer

To display header/footer in our document, we need to create a View element and set it to fixed. This means that our component will be rendered on every page. Now it is only a matter of proper styling (we can use absolute positioning).

If we need to display the page number or the total page count, we can pass a render function to <Text> or <View>, which will receive those two values as parameters.

Example of a footer with a page number:

<View
  fixed
  render={({ pageNumber }) => (
    <View>
      <Text>{pageNumber.toString()}</Text>
    </View>
  )}
  style={{
    height: "30px",
    position: "absolute",
    fontSize: 12,
    bottom: 10,
    left: 0,
    right: 0,
    textAlign: "center",
    color: "grey",
  }}
/>

We can also use pageNumber to apply styles depending on which page we are on (left/right margins depending on odd/even numbers come to mind). Just one thing to mention: for pageNumber to work correctly, the outer Page component needs to have the wrap property set to true.
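The odd/even idea could look roughly like the helper below. It is a sketch only: pagePadding and the default values 40 and 20 are made up for illustration, and it assumes that odd pages are right-hand pages in a bound document.

```javascript
// Return horizontal padding for a page, with a wider inner (binding) edge.
// Assumes odd pages are right-hand pages, so their inner edge is on the left.
function pagePadding(pageNumber, inner = 40, outer = 20) {
  const isOdd = pageNumber % 2 === 1;
  return isOdd
    ? { paddingLeft: inner, paddingRight: outer }
    : { paddingLeft: outer, paddingRight: inner };
}

console.log(pagePadding(1)); // { paddingLeft: 40, paddingRight: 20 }
console.log(pagePadding(2)); // { paddingLeft: 20, paddingRight: 40 }
```

The returned object can be used as the style of an element created inside the render function, which receives pageNumber as shown in the footer example.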

Custom code

React-pdf supports a Canvas component that can be used for drawing. Unfortunately, I can’t quite imagine how it would work with other charting libraries (implementing our own adapters would be anything but worry-free). Also, since we cannot use standard HTML tags, I wasn’t able to combine it with Google Maps.

Conclusion

React-pdf is an easy-to-use library that avoids conversion between HTML and PDF and generates documents directly in PDF format using a predefined set of components (so only limited reuse of the front-end code is possible).

It might be useful for simple documents that don’t need features that would require other libraries. Also, templates written for react-pdf cannot be used by any other library, so replacing it would mean rewriting all existing templates.

Nice features are the possibility to add metadata to the final document, and client-side rendering, which can be useful in SPA applications (e.g. for live preview).

Puppeteer

The second of my web-like solutions is Puppeteer. As most of you already know, Puppeteer is a Node library providing an API to control Chrome (paraphrasing the documentation). So basically, it is a browser you can run from Node.js. Most developers are probably familiar with this library, since one of its main uses is automated testing of web applications, but it can also be used to convert a page to PDF.

Introduction

Let’s start with the same document as before, only this time we will use HTML.

<h1 style="margin: auto;display: block;text-align: center;">Test ME!</h1>
<img src="https://picsum.photos/200" style="margin: auto;display: block;" />

Now, obviously, we cannot run puppeteer inside browsers (or…can we?). So let’s add code to our server endpoint.

const puppeteer = require("puppeteer");

// The code below runs inside an async function (for example, the endpoint handler).
const html = `
<h1 style="margin: auto;display: block;text-align: center;">Test ME!</h1>
<img src="https://picsum.photos/200" style="margin: auto;display: block;" />
`;
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent(html);
await page.pdf({
  path: "puppeteer.pdf"
});
await browser.close();

So, what did we do?

we launched a browser instance
we opened a new empty page
we set our HTML as the content of the page
we created the PDF and saved it to a file
we closed the browser instance

Simple as that.

Also, with just one small change, we can generate a PDF from any accessible web page:

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto("https://google.com"); // redirect browser to url
await page.pdf({
  path: "puppeteer.pdf",
});
await browser.close();

We can also add CSS stylesheets or JavaScript files.

Now let’s take a look at what PDF features Puppeteer supports.

Tables

Adding tables using Puppeteer is really a non-issue. Just create a correct HTML table and that’s it. To have the table header on every page, define it in the table’s <thead>, and Puppeteer will repeat it on every page. Puppeteer also seems to avoid breaking cells at the end of the page out of the box.
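As a rough sketch of what such a table could be generated from, the helper below builds the markup from plain data. The names buildTable and escapeHtml and the column format are assumptions made for this example; they are not part of Puppeteer.

```javascript
// Escape text so that data values cannot break the generated markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an HTML table whose header row lives in <thead>,
// so the browser can repeat it on every printed page.
function buildTable(columns, rows) {
  const head = columns
    .map((column) => `<th>${escapeHtml(column.label)}</th>`)
    .join("");
  const body = rows
    .map((row) => {
      const cells = columns
        .map((column) => `<td>${escapeHtml(row[column.key])}</td>`)
        .join("");
      return `<tr>${cells}</tr>`;
    })
    .join("");
  return `<table><thead><tr>${head}</tr></thead><tbody>${body}</tbody></table>`;
}

const tableHtml = buildTable(
  [
    { key: "firstName", label: "Firstname" },
    { key: "lastName", label: "Lastname" },
  ],
  [{ firstName: "John", lastName: "Smith" }]
);
console.log(tableHtml);
```

The resulting string can be passed to page.setContent() in the same way as the HTML in the introduction.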

Header / footer

Header and footer templates are set by passing the headerTemplate / footerTemplate options to the pdf() function. Just don’t forget to set the displayHeaderFooter property to true. It is possible to display the current page number (or another of the allowed values) just by creating an element with the correct class. For example, our footer will display the current page / total page count.

await page.pdf({
  // other options not shown for simplicity
  displayHeaderFooter: true,
  headerTemplate: `<div style="font-size: 16px; text-align: center; width: 100%">Test Header</div>`,
  footerTemplate: `<div style="font-size: 16px;display: flex;width: 100%;justify-content: center;">
    <div class="pageNumber"></div>
    <div>/</div>
    <div class="totalPages"></div>
  </div>`,
});

Custom code

In my opinion, this is one of the greatest features of Puppeteer: since we have a full JS engine at our disposal, we can run anything that the browser can run. For example, here I used Google Maps to render a map of Bratislava.

Sometimes Puppeteer renders the page before all resources are loaded, for example before all map images have been downloaded.

We can fix that by passing the waitUntil option:

await page.setContent(html, {
  waitUntil: 'networkidle0'
});

Now the browser instance will wait until all network requests have finished and only then start the rendering process (we can also wait for other events).

Conclusion

Using Puppeteer is a great (and free) way to convert HTML to PDF documents. Its greatest strength is that we can use the full power of JavaScript, so including charts, maps, or anything else should be no problem.

Disadvantages include the lack of support for document metadata, document encryption, and editable forms.

Performance might be an issue, depending on the application. Unfortunately, I don’t have any benchmarks on hand, but Chrome is stereotypically resource-hungry and might kill the server in high-load PDF rendering scenarios.

Also, usage in the cloud, for example in an AWS Lambda function, seems to be non-trivial.

A nice thing is that, since we are using HTML to define content, switching to another HTML-to-PDF solution should be easy, which makes Puppeteer ideal in the early phase of a project, when not all requirements are fully defined.
