Exception not handling on "Allowed memory size exhausted" #519

Open
bastienuh opened this issue Mar 8, 2022 · 7 comments

bastienuh commented Mar 8, 2022

Hi,

I get the following error:

Symfony\Component\ErrorHandler\Error\FatalError

  Allowed memory size of 536870912 bytes exhausted (tried to allocate 167772160 bytes)

  at vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php:221
    217▕                         $char_to = hexdec($matches['to'][$key]);
    218▕                         $offset = hexdec($matches['offset'][$key]);
    219▕
    220▕                         for ($char = $char_from; $char <= $char_to; ++$char) {
  ➜ 221▕                             $this->table[$char] = self::uchr($char - $char_from + $offset);
    222▕                         }
    223▕                     }
    224▕
    225▕                     // Support for : <srcCode1> <srcCodeN> [<dstString1> <dstString2> ... <dstStringN>]

I tried to handle it with a try/catch, but it does not work:

try {
    echo 'A';
    $pdf = $pdfParser->parseFile($pathname.'.pdf');
}
catch (\Whoops\Exception\ErrorException $e) {
    echo ' - Error in $pdf->parseFile() : '.$e->getMessage().' - ';
    return $new_document;
}
catch (\Symfony\Component\ErrorHandler\Error\FatalError $e) {
    echo ' - Error in $pdf->parseFile() : '.$e->getMessage().' - ';
    return $new_document;
}
catch (\Exception $e) {
    echo ' - Error in $pdf->parseFile() : '.$e->getMessage().' - ';
    return $new_document;
}
echo 'B';

The error occurs inside parseFile(): "A" is echoed, but "B" is not.

Do you know why the exception is not being caught?
Maybe a solution would be to add a try/catch around the self::uchr call?

Thanks for your help and advice :-)

(In case it helps, the PDF being parsed is here: https://www.assemblee-nationale.fr/dyn/opendata/PIONANR5L15TAP0528.pdf)
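
A likely explanation, offered as a sketch and not as a confirmed diagnosis of this report: PHP raises "Allowed memory size exhausted" as a fatal error (E_ERROR), not as a Throwable, so no catch block runs, whichever class it names. The Symfony FatalError and Whoops ErrorException shown in the output appear to be built by those libraries' own handlers after the script has already failed. One way to at least detect the failure is a shutdown function combined with error_get_last(). The helper name isMemoryExhaustion below is hypothetical, and the log message is only an illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decides whether the array returned by
 * error_get_last() describes a memory-limit fatal error.
 */
function isMemoryExhaustion(?array $lastError): bool
{
    return $lastError !== null
        && $lastError['type'] === E_ERROR
        && str_contains($lastError['message'], 'Allowed memory size');
}

// Shutdown functions still run after a fatal error, unlike catch blocks.
register_shutdown_function(static function (): void {
    $lastError = error_get_last();
    if (isMemoryExhaustion($lastError)) {
        // Keep the work here small: very little memory is left at this point.
        error_log('PDF parsing aborted: ' . $lastError['message']);
    }
});
```

This only reports the failure; the script cannot resume after the fatal error, so "B" would still not be echoed.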

k00ni added the bug label Mar 9, 2022
Collaborator

j0k3r commented Mar 11, 2022

What PHP version are you using? Which version of PDFParser are you using?

@bastienuh
Author

I'm running PHP 8.0.12 and using PDFParser 2.1.0 :-)

@dsuurlant
Contributor

I ran into a similar problem which might have the same cause.

While debugging the code, I found that $offset had a very strange value:

char: 65018, char_from: 65012, offset: 2.1254268986858E+96

This led to a TypeError in self::uchr.

It's possible to add a check so that if the result of $char - $char_from + $offset is not an integer type, it is skipped. (Trying to cast 2.1254268986858E+96 to an integer definitely doesn't work.)

But I'm unsure what effect doing that will have on the parsing of the PDF.

However, I was able to catch the error using Error (PHP native, not Symfony). Ideally, though, the PDF would be parsed without errors.
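
The check described above could look like the following sketch. toCodePoint is a hypothetical helper and not part of PDFParser. The upper bound 0x10FFFF is the highest Unicode code point, which is an assumption about which values are meaningful here. hexdec() returns a float when the hex string exceeds the integer range, which would explain a value such as 2.1254268986858E+96.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical guard: returns the value as a code point, or null when it
 * is a float (overflowed hexdec() result) or outside the Unicode range.
 */
function toCodePoint(int|float $value): ?int
{
    if (!is_int($value) || $value < 0 || $value > 0x10FFFF) {
        return null;
    }

    return $value;
}

// Illustrative values only, mirroring the shape of the debug output above.
$charFrom = 65012;
$char = 65018;
$offset = hexdec(str_repeat('F', 80)); // far beyond PHP_INT_MAX, so a float

$codePoint = toCodePoint($char - $charFrom + $offset);
if ($codePoint === null) {
    // Skip this mapping instead of passing a float to a function expecting int.
    echo "skipped invalid mapping\n";
}
```

As noted in the comment above, the effect of skipping such entries on the extracted text is unknown, so this is a defensive measure and not a fix for the underlying parsing.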

@GreyWyvern
Contributor

This is probably resolved by #623. The sample PDF is still accessible, and it's working in my copy.

Collaborator

k00ni commented Aug 10, 2023

@bastienuh please try again and get back to us. Thanks

k00ni added the stale needs decision label Aug 10, 2023
k00ni removed the stale needs decision label Feb 9, 2024
Collaborator

k00ni commented Feb 9, 2024

Probably solved. If not, comment here.

k00ni closed this as completed Feb 9, 2024
@UnnitMetaliya

> Probably solved. If not, comment here.

@k00ni I think it's still an issue.

I ran the command with a 3 GB memory limit:

php -d memory_limit=3G artisan pdf:process

I am on "smalot/pdfparser": "^2.11", and I get this error:

Fatal error: Allowed memory size of 3221225472 bytes exhausted (tried to allocate 1342177280 bytes) in ../vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php on line 150
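
For scale, the byte counts in these messages can be converted to MiB. This is a small arithmetic sketch; toMiB is a hypothetical helper, and it assumes a 64-bit PHP build so that the values fit in an integer.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: whole mebibytes in a byte count. */
function toMiB(int $bytes): int
{
    return intdiv($bytes, 1024 * 1024);
}

// Figures taken from the two error messages in this thread.
$figures = [
    'limit (this report)'          => 3221225472,
    'allocation (this report)'     => 1342177280,
    'limit (original report)'      => 536870912,
    'allocation (original report)' => 167772160,
];

foreach ($figures as $label => $bytes) {
    printf("%s: %d MiB\n", $label, toMiB($bytes));
}
```

The single failed allocation here is 1280 MiB against a 3072 MiB limit, compared with 160 MiB against 512 MiB in the original report, which suggests that raising memory_limit alone is unlikely to help.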

Symfony\Component\ErrorHandler\Error\FatalError

Allowed memory size of 3221225472 bytes exhausted (tried to allocate 1342177280 bytes)

at vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php:150
146▕
147▕ if (!isset(self::$uchrCache[$code])) {
148▕ // html_entity_decode() will not work with UTF-16 or UTF-32 char entities,
149▕ // therefore, we use mb_convert_encoding() instead
➜ 150▕ self::$uchrCache[$code] = mb_convert_encoding("&#{$code};", 'UTF-8', 'HTML-ENTITIES');
151▕ }
152▕
153▕ return self::$uchrCache[$code];
154▕ }

Whoops\Exception\ErrorException

Allowed memory size of 3221225472 bytes exhausted (tried to allocate 1342177280 bytes)

at vendor/smalot/pdfparser/src/Smalot/PdfParser/Font.php:150
146▕
147▕ if (!isset(self::$uchrCache[$code])) {
148▕ // html_entity_decode() will not work with UTF-16 or UTF-32 char entities,
149▕ // therefore, we use mb_convert_encoding() instead
➜ 150▕ self::$uchrCache[$code] = mb_convert_encoding("&#{$code};", 'UTF-8', 'HTML-ENTITIES');
151▕ }
152▕
153▕ return self::$uchrCache[$code];
154▕ }

  +1 vendor frames

2 [internal]:0
Whoops\Run::handleShutdown()

I opened this issue with more details.

k00ni reopened this Sep 17, 2024