X Tutup
Skip to content

Fix container cleanup on daemon restart#36249

Merged
vdemeester merged 1 commit intomoby:masterfrom
cpuguy83:36145_fix_container_reload
Feb 12, 2018
Merged

Fix container cleanup on daemon restart#36249
vdemeester merged 1 commit intomoby:masterfrom
cpuguy83:36145_fix_container_reload

Conversation

@cpuguy83
Copy link
Member

@cpuguy83 cpuguy83 commented Feb 8, 2018

When the daemon restores containers on daemon restart, it syncs up with
containerd to determine the existing state. For stopped containers it
then removes the container metadata from containerd.

In some cases this is not handled properly and causes an error when
someone attempts to start that container again.
In particular, this case is just a bad error check.

Fixes #36145

Copy link
Contributor

@boaz0 boaz0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert.NoError(t, err)

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check necessary? won't it fail on the next line, and the pid will be in the pathError message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this (line 62-79) should be extracted into a helper so it doesn't distract from the test logic:

ppid := getContainerdShimPID(t, inspect.State.Pid)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure where this test should go. It seems like it has more to do with daemon state than the container API/.

s/daemon_linux_test/daemon_test.go/ and add a skip.If() because the relevant OS is where the daemon is running, not where the test client is running, so this would be skipped incorrectly if the client was windows, and the daemon was linux.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will not compile on Windows and requires being on the same host since it's starts up a new daemon, kills processes, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it has more to do with daemon state than the container API

It's kind of everything, sadly.

When the daemon restores containers on daemon restart, it syncs up with
containerd to determine the existing state. For stopped containers it
then removes the container metadata from containerd.

In some cases this is not handled properly and causes an error when
someone attempts to start that container again.
In particular, this case is just a bad error check.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@cpuguy83 cpuguy83 force-pushed the 36145_fix_container_reload branch from d3c4444 to c0d56ab Compare February 9, 2018 19:36
@cpuguy83
Copy link
Member Author

cpuguy83 commented Feb 9, 2018

Updated

Copy link
Member

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

X Tutup