BUG : add __hash__ to AffineBase#2909
Conversation
python3 requires that if a class defines __eq__ then in must define __hash__ (so that if a == b then hash(a) == hash(b)). Define 64bit hash by pulling the first 8 characters out of a sha1 hash of the matrix closes matplotlib#2828
lib/matplotlib/transforms.py
Outdated
There was a problem hiding this comment.
Seems like a correct approach. But I wonder if it's faster to do:
hash(self.get_matrix().tostring())
(Don't know if it is, just though the code would spend more time in C that way...)
There was a problem hiding this comment.
indeed
In [51]: tt = np.random.rand(3, 3)
In [52]: %timeit int(hashlib.sha1(tt).hexdigest()[:8], 16)
1000000 loops, best of 3: 1.58 µs per loop
In [53]: %timeit hash(tt.tostring())
1000000 loops, best of 3: 270 ns per loop
|
I think this is the right approach, but it is subtlely different than the old Python 2.x behavior -- which is that these objects were hashed on their identity (their pointer in memory essentially), and not on their content. I think this is probably more correct, but we should verify there aren't unintended consequences (though beyond running the test suite, I'm not sure how to anticipate any of those consequences). |
|
@pelson: The PS backend maintains a dictionary where the keys are On Python 2, objects are always hashable, based on their memory pointer. On Python 3, objects are hashable unless they customize |
|
However, this opens up an issue that (existed before) that this could be badly abused else where in the code I don't know the transform stack well, but skimming the code It might be better to fix the problem where it is and use |
|
I think with this PR merged, the net effect will be that |
|
However it leaves with
With in this context it's ok (as the transforms should probably not be changing on the way out to disk) but is probably not good. |
|
I'm on my mobile so can't see everything within this PR, but as the transform is mutable it should only implement hash based on the object id and not its actual content. |
|
@pelson That introduces an inconsistency a different way, in that |
Sorry, I'm not following your notation - if I'm reading this correctly then you are raising concern that an object which is equal has a different hash? If so, that is a perfectly valid definition - anything else would mean we are computing hashes which are mutable, which is asking for trouble. This is also the existing behaviour in python2. |
|
The inconsistency of having equal objects with different hashes is the reason that py3k requires objects that define |
|
I am closing this in favor of #2927 |
python3 requires that if a class defines
__eq__then in must define__hash__(so that if a == b then hash(a) == hash(b)). Define 64bithash by pulling the first 8 characters out of a sha1 hash of the matrix
closes #2828