X Tutup
Skip to content

Fixed reference issue in LatexManager#5144

Closed
kwankyu wants to merge 1 commit intomatplotlib:masterfrom
kwankyu:pgf_backend_issue
Closed

Fixed reference issue in LatexManager#5144
kwankyu wants to merge 1 commit intomatplotlib:masterfrom
kwankyu:pgf_backend_issue

Conversation

@kwankyu
Copy link
Copy Markdown

@kwankyu kwankyu commented Sep 25, 2015

This fixes a potential bug in the pgf backend. The issue is that the class LatexManagerFactory keeps a strong reference to a LatexManager object while the class LatexManager has __del__ method to clean up its objects. While the python interpreter shuts down, the LatexManager destructor (__del__) will run because of the presence of the strong reference, but this potentially can lead to an exception (but ignored) since it relies on, for instance, os.path module, which possibly can be in the state of destruction.

Actually this causes an issue in matplotlib embedded in Sage. Look http://trac.sagemath.org/ticket/14798

The fix is to use weak referencing.

@tacaswell
Copy link
Copy Markdown
Member

attn @pwuertz

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Sep 26, 2015

Unfortunately i wont be alle to review this due to a 2 week vacation starting total. I remember using weak references for similar reasons, maybe I forgot it at this place. So, +1 probably :)

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Sep 27, 2015

Hm, wouldnt it be better to keep a strong reference to os.path within Latex Manager then? Wouldnt that be the only way to ensure that the module required in del isnt unloaded by python?

@kwankyu
Copy link
Copy Markdown
Author

kwankyu commented Sep 27, 2015

The 'cleanup', that is what the del does, is actually done by 'atexit' by
the time when the del is invoked as the last strong reference to the
LatexManager object is gone. So there is indeed nothing that the del
invocation should do (but it causes the exception to raise!) It seems to me
that it is not only 'os.path' that may go wrong at shutdown...
On Sep 27, 2015 3:17 PM, "Peter Würtz" notifications@github.com wrote:

Hm, wouldnt it be better to keep a strong reference to os.path within
Latex Manager then? Wouldnt that be the only way to ensure that the module
required in del isnt unloaded by python?


Reply to this email directly or view it on GitHub
#5144 (comment)
.

@tacaswell tacaswell added this to the next bug fix release (2.0.1) milestone Oct 8, 2015
@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 15, 2015

This is not what I had in mind. I'd like the temporary directory to be deleted as soon as the LatexManager instance that created it is deleted. So ideally the atexit approach shouldn't be required at all. Deleting files and directories on the windows file system however turns out to be pretty difficult due to locking mechanisms. So the idea is that "hopefully" the locks are gone at the time atexit is called and I attempt to delete any residual temp files there.

The thing I missed is that in python it's better / more safe to keep references to imported modules if they are used within __del__. I'll propose a fix for that.

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 15, 2015

Does that other fix also solve the problem for you?

@kwankyu
Copy link
Copy Markdown
Author

kwankyu commented Oct 15, 2015

There will be no problem if all LatexManager instances are cleaned up and deleted before the atexit is called. When atexit is called, as you said, you want to delete (cleaning up the temporary diretories) all remaining LatexManager instances. But the LatexManager instance, whose strong reference is kept in LatexManagerFactory, is not deleted by this atexit mechanism. When this instance is finally deleted when the python interpreter shuts down and __del__ is invoked for this instance, the imported modules might be already gone, causing a failure.

I think that changing the strong reference kept in LatexManagerFactory to a weak reference is simple and more elegant way to fix the issue than keeping references to the imported modules. I also worry that these new references themselves are safe to rely on at shutdown time... But I don't know well about python at this level.

Testing your fix in Sage is tricky to me. It would take some time for me to figure out how to do this...

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 15, 2015

... the strong reference kept in LatexManagerFactory to a weak reference is simple and more elegant way to fix the issue than keeping references to the imported modules.

I think replacing the normal reference with a weak reference defeats the whole point of LatexManagerFactory. If you change the reference to a weak ref, python is free to delete the LatexManager at any time once the external references are dropped (maybe in between two figsaves). Starting up a new latex process takes a lot of time and I'd like to reuse an already running LatexManager if possible.

When atexit is called, as you said, you want to delete (cleaning up the temporary diretories) all remaining LatexManager instances.

You can't delete python instances on purpose. At exit I just make sure that cleanup is called for any objects that are still alive, tracked by the weakref set. Python by the way doesn't even guarantee that del is called at all before the the whole process is terminated.

I also worry that these new references themselves are safe to rely on at shutdown time... But I don't know well about python at this level.

Maybe, I think I encountered this problem one or two times before, and I can't remember if this fixed the problem or if I had to reimport the modules again. I also forgot how to provoke such a situation for testing purposes, which is why I asked if you could test this solution.
The next best thing I'd like to try then is to add a clean-flag that could be tested in del instead of relying on os for the first check. As you said, ideally the remaining instances alive should be clean after the atexit procedure, so after testing self._clean they shouldn't be running into cleanup(). But well, maybe they aren't (windows?) and trying again after atexit might succeed.

Testing your fix in Sage is tricky to me. It would take some time for me to figure out how to do this...

But didn't you write and test your solution in sage too? Using another backend_pgf.py shouldn't be any different, right?

@WeatherGod
Copy link
Copy Markdown
Member

Just wondering... couldn't we use a context manager to create/destroy the
temporary directory within a savefig() call? What's the reason for keeping
a single temporary directory across multiple savefig() calls? Or is that
not relevant to the issue at hand?

On Thu, Oct 15, 2015 at 2:09 PM, Peter Würtz notifications@github.com
wrote:

... the strong reference kept in LatexManagerFactory to a weak reference
is simple and more elegant way to fix the issue than keeping references to
the imported modules.

I think replacing the normal reference with a weak reference defeats the
whole point of LatexManagerFactory. If you change the reference to a weak
ref, python is free to delete the LatexManager at any time once the
external references are dropped (maybe in between two figsaves). Starting
up a new latex process takes a lot of time and I'd like to reuse an already
running LatexManager if possible.

When atexit is called, as you said, you want to delete (cleaning up the
temporary diretories) all remaining LatexManager instances.

You can't delete python instances on purpose. At exit I just make sure
that cleanup is called for any objects that are still alive, tracked by the
weakref set. Python by the way doesn't even guarantee that del is
called at all before the the whole process is terminated.

I also worry that these new references themselves are safe to rely on at
shutdown time... But I don't know well about python at this level.

Maybe, I think I encountered this problem one or two times before, and I
can't remember if this fixed the problem or if I had to reimport the
modules again. I also forgot how to provoke such a situation for testing
purposes, which is why I asked if you could test this solution.
The next best thing I'd like to try then is to add a clean-flag that could
be tested in del instead of relying on os for the first check. As you
said, ideally the remaining instances alive should be clean after the
atexit procedure, so after testing self._clean they shouldn't be running
into cleanup(). But well, maybe they aren't (windows?) and trying again
after atexit might succeed.

Testing your fix in Sage is tricky to me. It would take some time for me
to figure out how to do this...

But didn't you write and test your solution in sage too? Using another
backend_pgf.py shouldn't be any different, right?


Reply to this email directly or view it on GitHub
#5144 (comment)
.

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 15, 2015

The LatexManager and latex process ideally transcends multiple savefig calls. For that reason you can't run it in a temporary directory for which the lifetime is tied to savefig.

@WeatherGod
Copy link
Copy Markdown
Member

Ok, crazy idea, but how about wrapping the latex process with a tempdir
creation/deletion? Essentially, instead of calling out directly to latex,
you call out to a python script that wraps the latex process with a context
manager?

On Thu, Oct 15, 2015 at 2:31 PM, Peter Würtz notifications@github.com
wrote:

The LatexManager and latex process ideally transcends multiple savefig
calls. For that reason you can't run it in a temporary directory for which
the lifetime is tied to savefig.


Reply to this email directly or view it on GitHub
#5144 (comment)
.

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 15, 2015

That could be a solution, but at this point I don't see the benefit in putting much effort into a new wrapper. The current solution for cleaning up the tempdirs works, has been tested, and has only a simple aesthetic flaw on exit that can be fixed easily, right?

@kwankyu
Copy link
Copy Markdown
Author

kwankyu commented Oct 16, 2015

I tested your fix. It works fine in Sage as far as I am concerned.

On the other hand, I also tested my fix again to see if LatexManagerFactory loses the LatexManager instance kept by a weak reference. It did not in my test. Of course, this does not defeat your concern.

Anyway, it is up to you to choose which fix to merge to solve the issue I raised. I respect your expertise :-)

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 16, 2015

True, I'm sure that in many cases some reference will be floating around and keep the LatexManager instance alive, but yes, my concern is that only a reference within the Factory guarantees it.

The second concern is that you can reason how the use of weakref and the atexit cleanup affect the lifetime of LatexManager and prevent the situation of __del__ being called while the os global is missing for your specific use case (SAGE). But also for this there are are no guarantees. The actual bug is that I'm using modules in the destructor for which I forgot to keep a strong reference to (or perhaps reimport them? TBD..). This bug might be triggered again by another chain of events (a user might grab a LatexManager instance and work with it directly, who knows).

For those two reasons I'm in favour of fixing the __del__ method instead of the weakref in the factory.

Other than that the idea from @WeatherGod is really appealing. Running the tempdir context within another interpreter additionally has the benefit of being more robust in case of application segfaults too. I just can't pursue this at the time.

@pwuertz
Copy link
Copy Markdown
Contributor

pwuertz commented Oct 16, 2015

Oh, and thanks @kwankyu for reporting and working on this bug. If I read your bug tracker correctly this was the last item on the list, other issues were fixed in earlier matplotlib releases. If anything else should pop up just give me a ping.

@tacaswell
Copy link
Copy Markdown
Member

Closing this as we will go with #5249 or a variation on it.

@kwankyu Thank you for your work!

@tacaswell tacaswell closed this Nov 5, 2015
@QuLogic QuLogic modified the milestones: unassigned, 2.0.1 (next bug fix release) Jul 18, 2016
@kwankyu kwankyu deleted the pgf_backend_issue branch March 24, 2018 10:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

X Tutup