Closing round bracket encoded in hexadecimal format breaks parsing #715

Open
krzyc opened this issue May 24, 2024 · 2 comments

krzyc commented May 24, 2024

  • PHP Version: 8.2.10
  • PDFParser Version: master (4b86c66)

Description:

A closing round bracket, ')', encoded in hexadecimal format breaks parsing: the decoded string is truncated at the bracket.
The truncation happens in this code:

// Find next ')' not escaped.
$cur_start_text = $start_search_end = 0;
while (false !== ($cur_start_pos = strpos($name, ')', $start_search_end))) {
    $cur_extract = substr($name, $cur_start_text, $cur_start_pos - $cur_start_text);
    preg_match('/(?P<escape>[\\\]*)$/s', $cur_extract, $match);
    if (!(\strlen($match['escape']) % 2)) {
        break;
    }
    $start_search_end = $cur_start_pos + 1;
}
// Extract string.
$name = substr($name, 0, (int) $cur_start_pos);
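
To make the truncation easier to see, here is a minimal sketch that runs the loop above in isolation. The function name cutAtFirstUnescapedParen is hypothetical and not part of pdfparser, and the sketch assumes the bytes decoded from the hex string reach this loop unchanged, as the report suggests.

```php
<?php
// Hypothetical wrapper around the loop quoted above; not a pdfparser API.
function cutAtFirstUnescapedParen(string $name): string
{
    $cur_start_text = $start_search_end = 0;
    while (false !== ($cur_start_pos = strpos($name, ')', $start_search_end))) {
        $cur_extract = substr($name, $cur_start_text, $cur_start_pos - $cur_start_text);
        // Capture the run of backslashes directly before this ')'.
        preg_match('/(?P<escape>[\\\]*)$/s', $cur_extract, $match);
        // An even count (including zero) means the ')' is not escaped: stop here.
        if (!(\strlen($match['escape']) % 2)) {
            break;
        }
        $start_search_end = $cur_start_pos + 1;
    }

    // Everything from the first unescaped ')' onward is dropped.
    return substr($name, 0, (int) $cur_start_pos);
}

// Bytes decoded from the hex string <2829>; nothing in them is escaped.
$decoded = hex2bin('2829');
var_dump(cutAtFirstUnescapedParen($decoded));    // string(1) "("

// A backslash-escaped ')' is skipped, so the cut happens at the second ')'.
var_dump(cutAtFirstUnescapedParen('a\\)b)c'));   // string(4) "a\)b"
```

Hex-decoded data carries no backslash escaping, so any 0x29 byte in it looks like an unescaped terminator to this loop.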

Since this is my first contact with pdfparser, I am probably not competent to provide a safe patch.

Test

public function testHexadecimalEncodedBracket(): void
{
    $document = new Document();

    $testString = '()';
    $content = '<< /Contents <'.bin2hex($testString).'> >>';
    $header = Header::parse($content, $document);
    $this->assertEquals($testString, (string) $header->get('Contents'));
}
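
For reference, a small sketch of the input this test builds, using only the PHP built-ins bin2hex and hex2bin. It shows the hex digits that end up between the angle brackets; the variable names $hex and $roundTrip are illustrative and not taken from the test.

```php
<?php
$testString = '()';

// bin2hex encodes each byte as two hex digits: 0x28 is '(' and 0x29 is ')'.
$hex = bin2hex($testString);               // "2829"

// The dictionary handed to the parser in the test.
$content = '<< /Contents <'.$hex.'> >>';   // "<< /Contents <2829> >>"

// Decoding the hex digits gives back the original two bytes.
$roundTrip = hex2bin($hex);                // "()"

var_dump($content, $roundTrip);
```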

Expected output & actual output

Test should pass, but returns:
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'()'
+'('

k00ni (Collaborator) commented May 27, 2024

@krzyc this looks decent enough to me to have a deeper look. Can you create a pull request with your changes + the test and we will discuss there how to proceed?

krzyc (Author) commented May 27, 2024

@k00ni I have opened a PR and am awaiting suggestions. It works for my case (extracting binary Contents from a Sig object). All tests are passing.
Edit: there are more possible problems with round bracket parsing. I have added further test cases, which I think should pass.

k00ni linked a pull request on Jun 5, 2024 that will close this issue