
Rearranging Pages of a PDF

Jorj X. McKie edited this page Mar 6, 2018 · 6 revisions

Using Document.select()

Since v1.9.0, the Document class has a method select([...]). Its only parameter is a sequence of zero-based page numbers; these are the pages that remain in the document (are "selected").

Successful execution alters the document's representation in memory. For example, after select([0]) only the first page remains, all other pages are gone, pageCount is 1, and so on. If you then save the document with save(...), the result is a new 1-page PDF reflecting these changes.

Note that all links, bookmarks and annotations are preserved, provided they do not point to deleted pages.

How can this method be used?

If you know how to manipulate Python lists, you are only limited by your imagination. For example:

  • Delete pages containing no text or a specific text
  • Only include odd / even pages, e.g. to support double sided printing on some printer hardware
  • Re-arrange pages, e.g. the whole document from back to front: take lst = list(range(doc.pageCount-1, -1, -1)) as the list to be selected.
  • "Concatenate" a document with itself by specifying lst + lst as the list of pages to be taken
  • doc.select([1,1,1,5,5,5,9,9,9]) does what it looks like: create a 9-page document of 3 times 3 equal pages
  • Take the first / last 10 pages: lst = list(range(10)), lst = list(range(doc.pageCount - 10, doc.pageCount)), respectively.
  • etc.
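The page lists described above can be built with ordinary list code. The following is a minimal sketch: the helper names are hypothetical (they are not part of the library), and each function takes the page count as a plain integer, so you would pass in doc.pageCount yourself and hand the result to doc.select(...). How the text of each page is obtained for non_empty_pages() is left to the caller.

```python
def reversed_pages(page_count):
    """All pages from back to front."""
    return list(range(page_count - 1, -1, -1))


def every_other_page(page_count, start=0):
    """Pages start, start+2, ... (start=0 yields the 1st, 3rd, 5th ... page)."""
    return list(range(start, page_count, 2))


def first_pages(page_count, n=10):
    """The first n pages, or all pages if the document is shorter."""
    return list(range(min(n, page_count)))


def last_pages(page_count, n=10):
    """The last n pages, or all pages if the document is shorter."""
    return list(range(max(page_count - n, 0), page_count))


def non_empty_pages(page_texts):
    """Numbers of the pages whose extracted text is not blank.

    page_texts is assumed to be a list with one string per page.
    """
    return [i for i, text in enumerate(page_texts) if text.strip()]


# Hypothetical usage with an open document `doc`:
#   doc.select(reversed_pages(doc.pageCount))
```

Because the helpers only return lists, they can be combined freely, for instance by concatenating or slicing the results, before a single select() call.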

You can apply several selects in a row. The document structure is updated after each one, so pageCount and doc.loadPage always reflect the current set of pages.
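Since each select works on the document as the previous one left it, the page numbers in a later select refer to the current numbering, not the original one. The sketch below illustrates this with plain lists only; compose_selections is a hypothetical helper, not a library function, that computes which original pages end up in the document after two successive selects.

```python
def compose_selections(first, second):
    """Original page numbers that remain after select(first), then select(second).

    `second` indexes into the document as it looks after `first` was applied.
    """
    return [first[i] for i in second]


# A 5-page document reversed, then cut down to its first two pages:
first = [4, 3, 2, 1, 0]   # back to front
second = [0, 1]           # first two pages of the reversed document
remaining = compose_selections(first, second)  # original pages 4 and 3
```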

After a select, the original page set is no longer accessible in memory. To discard the changes, close the document and re-open it.

Save your work using doc.save(...). Be sure to include the garbage=4 option if you have deleted many pages (to reduce the PDF file size).
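A minimal sketch of the select-then-save step follows. The function name select_and_save and the file names are assumptions made for illustration; the function expects an already opened document object and uses only select() and save() with the garbage=4 option mentioned above.

```python
def select_and_save(doc, pages, out_path):
    """Keep only `pages` in `doc`, then write the result to `out_path`."""
    doc.select(pages)
    # garbage=4 asks for the most thorough cleanup of unused objects,
    # which keeps the file small after many pages were dropped.
    doc.save(out_path, garbage=4)


# Hypothetical usage (requires PyMuPDF and an existing input file):
#   import fitz
#   doc = fitz.open("input.pdf")
#   select_and_save(doc, [0], "first-page.pdf")
#   doc.close()
```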

Using Other Methods

When select() is too heavy a tool for something simple, consider using the Document methods deletePage(), deletePageRange(), copyPage() or movePage().

As a general rule, use select() when many pages are involved and / or when some algorithm computes the required pages. Otherwise these single-page methods may be more appropriate.
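To make the trade-off concrete, here is how removing a single page looks when written as a selection. This is only an illustrative sketch: without_page is a hypothetical helper, and it merely builds the page list; for a one-off deletion, calling deletePage() directly is the shorter route.

```python
def without_page(page_count, pno):
    """Page list that keeps every page except page number `pno`."""
    if not 0 <= pno < page_count:
        raise IndexError("page number out of range")
    return [i for i in range(page_count) if i != pno]


# Hypothetical usage with an open document `doc`:
#   doc.select(without_page(doc.pageCount, 1))   # drops the second page
```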
