bpo-32147: Improved perfomance of binascii.unhexlify().#4586
bpo-32147: Improved perfomance of binascii.unhexlify().#4586serhiy-storchaka merged 6 commits intopython:masterfrom
Conversation
Modules/binascii.c
Outdated
There was a problem hiding this comment.
Py_CHARMASK returns values in the range [0, 255], so this table needs to be 256 bytes long. See for example _Py_ctype_tolower.
Modules/binascii.c
Outdated
There was a problem hiding this comment.
Sorry for not noticing this earlier: this isn't needed, because Py_CHARMASK already does the cast. You can just use table_hex[Py_CHARMASK(argbuf[i])].
076fda7 to
d6d2365
Compare
d6d2365 to
4215f0a
Compare
serhiy-storchaka
left a comment
There was a problem hiding this comment.
_PyLong_DigitValue could be used also in binascii.a2b_qp.
| self.assertEqual(s, u) | ||
| self.assertRaises(binascii.Error, binascii.a2b_hex, t[:-1]) | ||
| self.assertRaises(binascii.Error, binascii.a2b_hex, t[:-1] + b'q') | ||
| self.assertRaises(binascii.Error, binascii.a2b_hex, bytes([255, 255])) |
There was a problem hiding this comment.
Add tests for b'0G', b'G0', b'0g' and b'g0'.
There was a problem hiding this comment.
It's on to you. Since your initial code shared the table with binascii.a2b_qp I thought that you perhaps want to keep the relation between binascii.a2b_hex and binascii.a2b_qp.
It is better to not squish intermediate commits. Squishing them makes reviewing harder.
There was a problem hiding this comment.
@serhiy-storchaka, sorry I removed my previous comment. I'll modify PR to use _PyLong_DigitValue in binascii.a2b_qp a bit later.
| (ascii_data[in+1] >= '0' && ascii_data[in+1] <= '9'))) { | ||
| /* hexval */ | ||
| ch = hexval(ascii_data[in]) << 4; | ||
| ch = _PyLong_DigitValue[ascii_data[in]] << 4; |
There was a problem hiding this comment.
What if replace the above checks with _PyLong_DigitValue[ascii_data[in]] < 16 && _PyLong_DigitValue[ascii_data[in+1]] < 16 (with saving intermediate results in variables)?
There was a problem hiding this comment.
Do you mean like here? bf57e77#diff-37791d6fab7c332f1ef6b333d28b7aceR1261
There was a problem hiding this comment.
Yes, exactly. What were the problems with this?
There was a problem hiding this comment.
It becomes slower than it was for some reason.
Before:
$ ./python -m timeit -s "from binascii import a2b_qp; b = b'=AF'*2**20" "a2b_qp(b)"
100 loops, best of 5: 3.39 msec per loop
After:
$ ./python -m timeit -s "from binascii import a2b_qp; b = b'=AF'*2**20" "a2b_qp(b)"
100 loops, best of 5: 3.72 msec per loop
| @@ -0,0 +1 @@ | |||
| :func:`binascii.unhexlify` is now up to 2 times faster. | |||
There was a problem hiding this comment.
Add "Patch by your name." Add your name in Misc/ACKS.
78f937e to
d42cb6d
Compare
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Sorry, I forgot to merge this PR before 3.7b1. @ned-deily, is it good to backport this minor optimization to 3.7? I didn't merge it in November just because I wanted to find the opportunity to apply the same optimizations in other code, not because there were any doubts about it.
|
@serhiy-storchaka, I don't think we should be considering backporting something like this at this point in the 3.7 cycle when it has not even landed in master yet and had buildbot exposure. Let's get it in there first. |
https://bugs.python.org/issue32147