CHANGELOG

﻿Version 1.26.1, 2024-07-17
--------------------------

 - Added isCurrentlyEncrypted to PdfFileReader to return the current encryption status

Version 1.26.0, 2016-05-18
--------------------------

 - NOTE: Active maintenance on PyPDF2 is resuming after a hiatus

 - Fixed a bug where image resources where incorrectly
   overwritten when merging pages

 - Added dictionary for JavaScript actions to the root (louib)

 - Added unit tests for the JS functionality (louib)

 - Add more Python 3 compatibility when reading inline images (im2703
   and (VyacheslavHashov)

 - Return NullObject instead of raising error when failing to resolve
   object (ctate)

 - Don't output warning for non-zeroed xref table when strict=False
   (BenRussert)

 - Remove extraneous zeroes from output formatting (speedplane)

 - Fix bug where reading an inline image would cut off prematurely
   in certain cases (speedplane)


Patch 1.25.1, 2015-07-20

 - Fix bug when parsing inline images. Occurred when merging
   certain pages with inline images

 - Fixed type error when creating outlines by utilizing the
   isString() test

Version 1.25, 2015-07-07
------------------------

BUGFIXES:

 - Added Python 3 algorithm for ASCII85Decode. Fixes issue when
   reading reportlab-generated files with Py 3 (jerickbixly)

 - Recognize more escape sequence which would otherwise throw an
   exception (manuelzs, robertsoakes)

 - Fixed overflow error in generic.py. Occurred
   when reading a too-large int in Python 2 (by Raja Jamwal)

 - Allow access to files which were encrypted with an empty
   password. Previously threw a "File has not been decrypted"
   exception (Elena Williams)

 - Do not attempt to decode an empty data stream. Previously
   would cause an error in decode algorithms (vladir)

 - Fixed some type issues specific to Py 2 or Py 3

 - Fix issue when stream data begins with whitespace (soloma83)

 - Recognize abbreviated filter names (AlmightyOatmeal and
   Matthew Weiss)

 - Copy decryption key from PdfFileReader to PdfFileMerger.
   Allows usage of PdfFileMerger with encrypted files (twolfson)

 - Fixed bug which occurred when a NameObject is present at end
   of a file stream. Threw a "Stream has ended unexpectedly"
   exception (speedplane)

FEATURES:

 - Initial work on a test suite; to be expanded in future.
   Tests and Resources directory added, README updated (robertsoakes)

 - Added document cloning methods to PdfFileWriter:
   appendPagesFromReader, cloneReaderDocumentRoot, and
   cloneDocumentFromReader. See official documentation (robertsoakes)

 - Added method for writing to form fields: updatePageFormFieldValues.
   This will be enhanced in the future. See official documentation
   (robertsoakes)

 - New addAttachment method. See documentation. Support for adding
   and extracting embedded files to be enhanced in the future
   (moshekaplan)

 - Added methods to get page number of given PageObject or
   Destination: getPageNumber and getDestinationPageNumber.
   See documentation (mozbugbox)

OTHER ENHANCEMENTS:

 - Enhanced type handling (Brent Amrhein)

 - Enhanced exception handling in NameObject (sbywater)

 - Enhanced extractText method output (peircej)

 - Better exception handling

 - Enhanced regex usage in NameObject class (speedplane)


Version 1.24, 2014-12-31
------------------------

 - Bugfixes for reading files in Python 3 (by Anthony Tuininga and
   pqqp)

 - Appropriate errors are now raised instead of infinite loops (by
   naure and Cyrus Vafadari)

 - Bugfix for parsing number tokens with leading spaces (by Maxim
   Kamenkov)

 - Don't crash on bad /Outlines reference (by eshellman)

 - Conform tabs/spaces and blank lines to PEP 8 standards

 - Utilize the readUntilRegex method when reading Number Objects
   (by Brendan Jurd)

 - More bugfixes for Python 3 and clearer exception handling

 - Fixed encoding issue in merger (with eshellman)

 - Created separate folder for scripts


Version 1.23, 2014-08-11
------------------------

 - Documentation now available at http://pythonhosted.org//PyPDF2

 - Bugfix in pagerange.py for when __init__.__doc__ has no value (by
   Vladir Cruz)

 - Fix typos in OutlinesObject().add() (by shilluc)

 - Re-added a missing return statement in a utils.py method

 - Corrected viewing mode names (by Jason Scheirer)

 - New PdfFileWriter method: addJS() (by vfigueiro)

 - New bookmark features: color, boldness, italics, and page fit
   (by Joshua Arnott)

 - New PdfFileReader method: getFields(). Used to extract field
   information from PDFs with interactive forms. See documentation
   for details

 - Converted README file to markdown format (by Stephen Bussard)

 - Several improvements to overall performance and efficiency
   (by mozbugbox)

 - Fixed a bug where geospatial information was not scaling along with
   its page

 - Fixed a type issue and a Python 3 issue in the decryption algorithms
   (with Francisco Vieira and koba-ninkigumi)

 - Fixed a bug causing an infinite loop in the ASCII 85 decoding
   algorithm (by madmaardigan)

 - Annotations (links, comment windows, etc.) are now preserved when
   pages are merged together

 - Used the Destination class in addLink() and addBookmark() so that 
   the page fit option could be properly customized


Version 1.22, 2014-05-29
------------------------

 - Added .DS_Store to .gitignore (for Mac users) (by Steve Witham)

 - Removed __init__() implementation in NameObject (by Steve Witham)

 - Fixed bug (inf. loop) when merging pages in Python 3 (by commx)

 - Corrected error when calculating height in scaleTo()

 - Removed unnecessary code from DictionaryObject (by Georges Dubus)

 - Fixed bug where an exception was thrown upon reading a NULL string
   (by speedplane)

 - Allow string literals (non-unicode strings in Python 2) to be passed 
   to PdfFileReader

 - Allow ConvertFunctionsToVirtualList to be indexed with slices and
   longs (in Python 2) (by Matt Gilson)

 - Major improvements and bugfixes to addLink() method (see documentation
   in source code) (by Henry Keiter)

 - General code clean-up and improvements (with Steve Witham and Henry Keiter)

 - Fixed bug that caused crash when comments are present at end of 
   dictionary


Version 1.21, 2014-04-21
------------------------

 - Fix for when /Type isn't present in the Pages dictionary (by Rob1080)

 - More tolerance for extra whitespace in Indirect Objects

 - Improved Exception handling

 - Fixed error in getHeight() method (by Simon Kaempflein)

 - implement use of utils.string_type to resolve Py2-3 compatibility issues

 - Prevent exception for multiple definitions in a dictionary (with carlosfunk)
   (only when strict = False)

 - Fixed errors when parsing a slice using pdfcat on command line (by
   Steve Witham)

 - Tolerance for EOF markers within 1024 bytes of the actual end of the
   file (with David Wolever)

 - Added overwriteWarnings parameter to PdfFileReader constructor, if False
   PyPDF2 will NOT overwrite methods from Python's warnings.py module with
   a custom implementation.

 - Fix NumberObject and NameObject constructors for compatibility with PyPy
   (Rüdiger Jungbeck, Xavier Dupré, shezadkhan137, Steven Witham)

 - Utilize  utils.Str in pdf.py and pagerange.py to resolve type issues (by
   egbutter)

 - Improvements in implementing StringIO for Python 2 and BytesIO for
   Python 3 (by Xavier Dupré)

 - Added /x00 to Whitespaces, defined utils.WHITESPACES to clarify code (by
   Maxim Kamenkov)

 - Bugfix for merging 3 or more resources with the same name (by lucky-user)

 - Improvements to Xref parsing algorithm (by speedplane)


Version 1.20, 2014-01-27
------------------------

 - Official Python 3+ support (with contributions from TWAC and cgammans)
   Support for Python versions 2.6 and 2.7 will be maintained

 - Command line concatenation (see pdfcat in sample code) (by Steve Witham)

 - New FAQ; link included in README

 - Allow more (although unnecessary) escape sequences

 - Prevent exception when reading a null object in decoding parameters

 - Corrected error in reading destination types (added a slash since they
   are name objects)

 - Corrected TypeError in scaleTo() method

 - addBookmark() method in PdfFileMerger now returns bookmark (so nested
   bookmarks can be created)

 - Additions to Sample Code and Sample PDFs

 - changes to allow 2up script to work (see sample code) (by Dylan McNamee)

 - changes to metadata encoding (by Chris Hiestand)

 - New methods for links: addLink() (by Enrico Lambertini) and removeLinks()

 - Bugfix to handle nested bookmarks correctly (by Jamie Lentin)

 - New methods removeImages() and removeText() available for PdfFileWriter
   (by Tien Haï)

 - Exception handling for illegal characters in Name Objects


Version 1.19, 2013-10-08
------------------------

BUGFIXES:
 - Removed pop in sweepIndirectReferences to prevent infinite loop
   (provided by ian-su-sirca)

 - Fixed bug caused by whitespace when parsing PDFs generated by AutoCad

 - Fixed a bug caused by reading a 'null' ASCII value in a dictionary
   object (primarily in PDFs generated by AutoCad).

FEATURES:
 - Added new folders for PyPDF2 sample code and example PDFs; see README
   for each folder

 - Added a method for debugging purposes to show current location while
   parsing

 - Ability to create custom metadata (by jamma313)

 - Ability to access and customize document layout and view mode
   (by Joshua Arnott)

OTHER:
 - Added and corrected some documentation

 - Added some more warnings and exception messages

 - Removed old test/debugging code

UPCOMING:
 - More bugfixes (We have received many problematic PDFs via email, we
   will work with them)
 
 - Documentation - It's time for PyPDF2 to get its own documentation
   since it has grown much since the original pyPdf

 - A FAQ to answer common questions


Version 1.18, 2013-08-19
------------------------

 - Fixed a bug where older verions of objects were incorrectly added to the 
   cache, resulting in outdated or missing pages, images, and other objects
   (from speedplane)

 - Fixed a bug in parsing the xref table where new xref values were 
   overwritten; also cleaned up code (from speedplane)

 - New method mergeRotatedAroundPointPage which merges a page while rotating
   it around a point (from speedplane)

 - Updated Destination syntax to respect PDF 1.6 specifications (from
   jamma313)

 - Prevented infinite loop when a PdfFileReader object was instantiated
   with an empty file (from Jerome Nexedi)

Other Changes:

 - Downloads now available via PyPI
   https://pypi.python.org/pypi?:action=display&name=PyPDF2

 - Installation through pip library is fixed


Version 1.17, 2013-07-25
------------------------

 - Removed one (from pdf.py) of the two Destination classes. Both 
   classes had the same name, but were slightly different in content, 
   causing some errors. (from Janne Vanhala)

 - Corrected and Expanded README file to demonstrate PdfFileMerger

 - Added filter for LZW encoded streams (from Michal Horejsek)

 - PyPDF2 issue tracker enabled on Github to allow community
   discussion and collaboration


Versions -1.16, -2013-06-30
---------------------------

 - Note: This ChangeLog has not been kept up-to-date for a while.
   Hopefully we can keep better track of it from now on. Some of the
   changes listed here come from previous versions 1.14 and 1.15; they
   were only vaguely defined. With the new _version.py file we should 
   have more structured and better documented versioning from now on.
 
 - Defined PyPDF2.__version__

 - Fixed encrypt() method (from Martijn The)

 - Improved error handling on PDFs with truncated streams (from cecilkorik)

 - Python 3 support (from kushal-kumaran)

 - Fixed example code in README (from Jeremy Bethmont)

 - Fixed an bug caused by DecimalError Exception (from Adam Morris)

 - Many other bug fixes and features by: 
	
	jeansch
	Anton Vlasenko
	Joseph Walton
	Jan Oliver Oelerich
	Fabian Henze
	And any others I missed. 
	Thanks for contributing!


Version 1.13, 2010-12-04
------------------------

 - Fixed a typo in code for reading a "\b" escape character in strings.

 - Improved __repr__ in FloatObject.

 - Fixed a bug in reading octal escape sequences in strings.

 - Added getWidth and getHeight methods to the RectangleObject class.

 - Fixed compatibility warnings with Python 2.4 and 2.5.

 - Added addBlankPage and insertBlankPage methods on PdfFileWriter class.

 - Fixed a bug with circular references in page's object trees (typically
   annotations) that prevented correctly writing out a copy of those pages.

 - New merge page functions allow application of a transformation matrix.

 - To all patch contributors: I did a poor job of keeping this ChangeLog
   up-to-date for this release, so I am missing attributions here for any
   changes you submitted.  Sorry!  I'll do better in the future.


Version 1.12, 2008-09-02
------------------------

 - Added support for XMP metadata.

 - Fix reading files with xref streams with multiple /Index values.

 - Fix extracting content streams that use graphics operators longer than 2
   characters.  Affects merging PDF files.


Version 1.11, 2008-05-09
------------------------

 - Patch from Hartmut Goebel to permit RectangleObjects to accept NumberObject
   or FloatObject values.

 - PDF compatibility fixes.

 - Fix to read object xref stream in correct order.

 - Fix for comments inside content streams.


Version 1.10, 2007-10-04
------------------------

 - Text strings from PDF files are returned as Unicode string objects when
 pyPdf determines that they can be decoded (as UTF-16 strings, or as
 PDFDocEncoding strings).  Unicode objects are also written out when
 necessary.  This means that string objects in pyPdf can be either
 generic.ByteStringObject instances, or generic.TextStringObject instances.

 - The extractText method now returns a unicode string object.

 - All document information properties now return unicode string objects.  In
 the event that a document provides docinfo properties that are not decoded by
 pyPdf, the raw byte strings can be accessed with an "_raw" property (ie.
 title_raw rather than title)

 - generic.DictionaryObject instances have been enhanced to be easier to use.
 Values coming out of dictionary objects will automatically be de-referenced
 (.getObject will be called on them), unless accessed by the new "raw_get"
 method.  DictionaryObjects can now only contain PdfObject instances (as keys
 and values), making it easier to debug where non-PdfObject values (which
 cannot be written out) are entering dictionaries.

 - Support for reading named destinations and outlines in PDF files.  Original
 patch by Ashish Kulkarni.

 - Stream compatibility reading enhancements for malformed PDF files.

 - Cross reference table reading enhancements for malformed PDF files.

 - Encryption documentation.

 - Replace some "assert" statements with error raising.

 - Minor optimizations to FlateDecode algorithm increase speed when using PNG
 predictors.

Version 1.9, 2006-12-15
-----------------------

 - Fix several serious bugs introduced in version 1.8, caused by a failure to
   run through our PDF test suite before releasing that version.

 - Fix bug in NullObject reading and writing.

Version 1.8, 2006-12-14
-----------------------

 - Add support for decryption with the standard PDF security handler.  This
   allows for decrypting PDF files given the proper user or owner password.

 - Add support for encryption with the standard PDF security handler.

 - Add new pythondoc documentation.

 - Fix bug in ASCII85 decode that occurs when whitespace exists inside the
   two terminating characters of the stream.

Version 1.7, 2006-12-10
-----------------------

 - Fix a bug when using a single page object in two PdfFileWriter objects.

 - Adjust PyPDF to be tolerant of whitespace characters that don't belong
   during a stream object.

 - Add documentInfo property to PdfFileReader.

 - Add numPages property to PdfFileReader.

 - Add pages property to PdfFileReader.

 - Add extractText function to PdfFileReader.


Version 1.6, 2006-06-06
-----------------------

 - Add basic support for comments in PDF files.  This allows us to read some
   ReportLab PDFs that could not be read before.

 - Add "auto-repair" for finding xref table at slightly bad locations.

 - New StreamObject backend, cleaner and more powerful.  Allows the use of
   stream filters more easily, including compressed streams.

 - Add a graphics state push/pop around page merges.  Improves quality of
   page merges when one page's content stream leaves the graphics 
   in an abnormal state.

 - Add PageObject.compressContentStreams function, which filters all content
   streams and compresses them.  This will reduce the size of PDF pages,
   especially after they could have been decompressed in a mergePage
   operation.

 - Support inline images in PDF content streams.

 - Add support for using .NET framework compression when zlib is not
   available.  This does not make pyPdf compatible with IronPython, but it
   is a first step.

 - Add support for reading the document information dictionary, and extracting
   title, author, subject, producer and creator tags.

 - Add patch to support NullObject and multiple xref streams, from Bradley
   Lawrence.


Version 1.5, 2006-01-28
-----------------------

- Fix a bug where merging pages did not work in "no-rename" cases when the
  second page has an array of content streams.

- Remove some debugging output that should not have been present.


Version 1.4, 2006-01-27
-----------------------

- Add capability to merge pages from multiple PDF files into a single page
  using the PageObject.mergePage function.  See example code (README or web
  site) for more information.

- Add ability to modify a page's MediaBox, CropBox, BleedBox, TrimBox, and
  ArtBox properties through PageObject.  See example code (README or web site)
  for more information.

- Refactor pdf.py into multiple files: generic.py (contains objects like
  NameObject, DictionaryObject), filters.py (contains filter code),
  utils.py (various).  This does not affect importing PdfFileReader
  or PdfFileWriter.

- Add new decoding functions for standard PDF filters ASCIIHexDecode and
  ASCII85Decode.

- Change url and download_url to refer to new pybrary.net web site.


Version 1.3, 2006-01-23
-----------------------

- Fix new bug introduced in 1.2 where PDF files with \r line endings did not
  work properly anymore.  A new test suite developed with various PDF files
  should prevent regression bugs from now on.

- Fix a bug where inheriting attributes from page nodes did not work.


Version 1.2, 2006-01-23
-----------------------

- Improved support for files with CRLF-based line endings, fixing a common
  reported problem stating "assertion error: assert line == "%%EOF"".

- Software author/maintainer is now officially a proud married person, which
  is sure to result in better software... somehow.


Version 1.1, 2006-01-18
-----------------------

- Add capability to rotate pages.

- Improved PDF reading support to properly manage inherited attributes from
  /Type=/Pages nodes.  This means that page groups that are rotated or have
  different media boxes or whatever will now work properly.

- Added PDF 1.5 support.  Namely cross-reference streams and object streams.
  This release can mangle Adobe's PDFReference16.pdf successfully.


Version 1.0, 2006-01-17
-----------------------

- First distutils-capable true public release.  Supports a wide variety of PDF
  files that I found sitting around on my system.

- Does not support some PDF 1.5 features, such as object streams,
  cross-reference streams.