OP_SUBSTR_CHOP - a specialised OP_SUBSTR variant #22785
base: blead
d6f958e to f258d43
A couple of small comments but overall nothing troubling-looking here.
I wonder a bit about the name though. I've usually seen the word "nibble" to mean a half-byte; i.e. a 4-bit value. I wondered if that is what is going on here at first. If there are other candidate names to call it, perhaps something else would be better? Not a huge problem though.
How about Food related alternatives: |
Not sure what's going on with the ABRT test failures. I don't get them locally.
Looks like an op_private flags assertion. I'll dig into it soon.
f258d43 to 98d187d
I'm rebasing and renaming it to |
Doesn't perl's |
The Perl |
@richardleach, merge conflicts ^^
This commit adds OP_SUBSTR_CHOP and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

where EXPR is a scalar lexical, the OFFSET is zero, and either there is no REPLACEMENT or it is the empty string. LENGTH can be anything that OP_SUBSTR supports. These constraints allow for a very stripped-back and optimised version of pp_substr.

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know exactly how it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed-size chunks. For example, given:

    my $x = '';
    my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and `$y = substr($x, 0, 5, '')` runs 45% faster.

Note that this is "chop" in the sense of Perl_sv_chop, which it efficiently calls, not the Perl language's "chop" function.
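As an illustration of the destructive fixed-size chunking the commit message describes, here is a minimal sketch. The variable names and the sample string are assumptions made for this example, not taken from the PR. It tests `length $str` rather than plain truthiness, since `while ($str)` would stop early if the remaining string were exactly "0".

```perl
use strict;
use warnings;

# Illustrative input (an assumption): twelve characters, so the last chunk is short.
my $str = join '', 'a' .. 'l';    # "abcdefghijkl"
my @chunks;

# Repeatedly remove up to five characters from the front of the lexical.
# Offset 0 plus an empty-string replacement is the shape the commit targets.
while (length $str) {
    push @chunks, substr($str, 0, 5, '');
}

print join(',', @chunks), "\n";
```

Whether a particular perl binary applies the specialised op to this loop depends on the build; the code behaves the same either way, only the speed differs.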
98d187d to 24108a7
Oh wow. Huh. In that case, might as well call this one Otherwise my thoughts were going to be something like |
Consider ltrim, with inspiration from PHP and Redis (or lstrip à la Ruby/Python, but that sounds more whitespace-specific). Though it is also unrelated to builtin::trim, I think it's a bit more descriptive at least.
Hmmm, I'm not sure about this. It seems only more descriptive to someone who already is familiar with |
This commit adds OP_SUBSTR_NIBBLE and associated machinery for fast handling of the constructions:

    substr EXPR,0,LENGTH,''

and

    substr EXPR,0,LENGTH

where EXPR is a scalar lexical, the OFFSET is zero, and either there is no REPLACEMENT or it is the empty string. LENGTH can be anything that OP_SUBSTR supports. These constraints allow for a very stripped-back and optimised version of pp_substr.

The primary motivation was for situations where a scalar, containing some network packets or other binary data structure, is being parsed piecemeal. Nibbling away at the scalar can be useful when you don't know exactly how it will be parsed and unpacked until you get started. It also means that you don't need to worry about correctly updating a separate offset variable.

This operator also turns out to be an efficient way to (destructively) break an expression up into fixed-size chunks. For example, given:

    my $x = '';
    my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }

Compared with blead, $y = substr($x, 0, 5) runs 40% faster and $y = substr($x, 0, 5, '') runs 45% faster.
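To make the "parsed piecemeal" motivation concrete, here is a hedged sketch of nibbling records off the front of a buffer. The wire format (a 1-byte type, a 2-byte big-endian length, then the payload) and the `parse_records` helper are hypothetical, invented for illustration only; they do not come from the PR.

```perl
use strict;
use warnings;

# Hypothetical record layout (an assumption for this example):
#   1-byte type, 2-byte big-endian payload length, then the payload bytes.
sub parse_records {
    my ($buf) = @_;    # lexical copy of the input, consumed from the front
    my @records;
    while (length($buf) >= 3) {
        # Nibble the fixed-size header off the front of the buffer.
        my ($type, $len) = unpack 'C n', substr($buf, 0, 3, '');
        last if length($buf) < $len;    # truncated record: stop parsing
        # Nibble the variable-length payload; no separate offset to maintain.
        my $payload = substr($buf, 0, $len, '');
        push @records, [ $type, $payload ];
    }
    return \@records;
}

my $data = pack('C n a*', 1, 5, 'hello') . pack('C n a*', 2, 2, 'hi');
my $recs = parse_records($data);
printf "type=%d payload=%s\n", @$_ for @$recs;
```

Each `substr($buf, 0, N, '')` call has the lexical-scalar, zero-offset, empty-replacement shape described above, so the payload length can be decided only after the header has been read, without tracking a position variable.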