X Tutup
Skip to content

Add verbose flag to network inspect to show all services & tasks in swarm mode#31710

Merged
mavenugo merged 2 commits intomoby:masterfrom
sanimej:drillerrr
Mar 14, 2017
Merged

Add verbose flag to network inspect to show all services & tasks in swarm mode#31710
mavenugo merged 2 commits intomoby:masterfrom
sanimej:drillerrr

Conversation

@sanimej
Copy link

@sanimej sanimej commented Mar 9, 2017

For swarm mode networks currently network inspect only shows endpoints local to that host. Service Discovery and overlay network reachability information gets exchanged through the gossip channel between the nodes. There have been issues where failures in the gossip channel can lead to inconsistent state across clusters. But there was no easy way to identify it.

This change adds a verbose flag to the network inspect output to display all services on that network with all the task IPs and host IP where the container is running. This will be very useful to quickly identify any inconsistent state across hosts (this can show up stale or incorrect IPs in DNS queries).

Edit: libnetwork PR has been merged. Updated the vendoring.

Fixes docker #24186

Example output from a 3 node cluster. s1 has 3 replicas and s2 has 1 replica.

vagrant@net-3:~$ docker network inspect --verbose ov1
[
    {
        "Name": "ov1",
        "Id": "ybmyjvao9vtzy3oorxbssj13b",
        "Created": "2017-03-07T22:08:56.471374351Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "5f125d4718f351333decbf34e02c3227a4455731b067cf6921cf9ffaa6779148": {
                "Name": "s1.2.mnzsnw7vadg2ra6mboz4ccuyb",
                "EndpointID": "54a324b9ebe5fc37c65e29cdb8eaa3becfa736e3a8a78a858bd4352a744c6879",
                "MacAddress": "02:42:0a:00:00:04",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "net-3-3090671942ce",
                "IP": "192.168.33.13"
            },
            {
                "Name": "net-2-fb80208efd75",
                "IP": "192.168.33.12"
            },
            {
                "Name": "net-1-6ecbc0040a73",
                "IP": "192.168.33.11"
            }
        ],
        "Services": {
            "s1": {
                "VIP": "10.0.0.2",
                "Ports": [],
                "Tasks": [
                    {
                        "Name": "s1.1.530e00o1oj9l1kivoci7oupzt",
                        "EndpointID": "f0a46deaf7ce1d87bf42437188a7ba4d8d6ae3e98fe439db95171d8330e278a0",
                        "EndpointIP": "10.0.0.3",
                        "Info": "Host IP 192.168.33.12"
                    },
                    {
                        "Name": "s1.3.98jplsdb5j7ozibrbv0bfs3fg",
                        "EndpointID": "005c24c13b1ddb0395de122ae7c75c889cebc52d4cc61ee918ef3a9dbffa2b27",
                        "EndpointIP": "10.0.0.5",
                        "Info": "Host IP 192.168.33.11"
                    },
                    {
                        "Name": "s1.2.mnzsnw7vadg2ra6mboz4ccuyb",
                        "EndpointID": "54a324b9ebe5fc37c65e29cdb8eaa3becfa736e3a8a78a858bd4352a744c6879",
                        "EndpointIP": "10.0.0.4",
                        "Info": "Host IP 192.168.33.13"
                    }
                ]
            },
            "s2": {
                "VIP": "10.0.0.6",
                "Ports": [],
                "Tasks": [
                    {
                        "Name": "s2.1.spc20i5to6crco9wdh5t3gzh2",
                        "EndpointID": "9d36c87e72236b0c5e0c5794b6963c84c3449ecbab5648ccb861993c760c30b9",
                        "EndpointIP": "10.0.0.7",
                        "Info": "Host IP 192.168.33.12"
                    }
                ]
            }
        }
    }
]

@aaronlehmann aaronlehmann added status/1-design-review status/failing-ci Indicates that the PR in its current state fails the test suite and removed status/0-triage labels Mar 10, 2017
@aaronlehmann
Copy link

There is a gofmt issue preventing CI from running.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you will need a documentation comment here to get golint to succeed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and here.

@aaronlehmann
Copy link

Rather than changing the API to include this information, could we do this on the client side by adding network selectors to ListTasksRequest? I suggested doing this in #31551 (comment) for something else. It would be pretty easy to implement that, and it probably makes sense to do.

@sanimej sanimej force-pushed the drillerrr branch 3 times, most recently from 7a8f807 to b29326b Compare March 10, 2017 02:07
@sanimej
Copy link
Author

sanimej commented Mar 10, 2017

@aaronlehmann One of the reasons for implementing this is to have a quick way to check if the network control plane state distributed by gossip is consistent across all nodes. So this has to work on all nodes, mainly the workers. I am working on a diagnostics container which will probe the kernel state and make sure its consistent (for ex: LB entries in IPVS matches the number of tasks for a given service). So using the swarm control api will not work in this case. The service/task information presented here is fetched from libnetwork's networkDB.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder we can return map[string]string here (or json string maybe)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the driver to return the endpointID and a string that can be presented to the user. The reason for this is the way networkDB API is currently designed. NetworkDB is a service provided by the libnetwork and the network drivers use it. But libnetwork is not aware of what exactly is the key used by the driver. Please see this moby/libnetwork#1674 (comment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for confusion, what I meant was converting a human-friendly string returned as the 2nd retvalue here to a machine-friendly map.
i.e.

return key, map[string]string{"HostIP": peer.TunnelEndpointIP}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think its a good idea. If the driver wants to return more info in future a map would be better than cramming it all into one string. Will change it in the libnetwork PR.

@sanimej sanimej force-pushed the drillerrr branch 3 times, most recently from 0f28505 to 401fd76 Compare March 12, 2017 01:28
@mavenugo
Copy link
Contributor

Thanks @sanimej . Yes, this is a very useful addition.

@aaronlehmann as @sanimej suggested, the main purpose of this change is provide a way to perform consistency check between the distributed control-plane (via Gossip) and the distributed data-plane that is built using various tools such as iptables, l2/l3 table, ipvs, etc...

Copy link
Contributor

@mavenugo mavenugo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanimej couple of minor comments. rest of it LGTM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pls move these network specific Task and Service info to api/types/network/network.go ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@sanimej
Copy link
Author

sanimej commented Mar 12, 2017

@mavenugo Addressed the comments. PTAL.

vendor.conf Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is being vendored to the master, should we also consider picking up 4610dd67c7b9828bb4719d8aa2ac53a7f1f739d2 ?

@aboch can you share your opinion ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it in 17.04.

But I was hoping to bring it in along with moby/libnetwork#1678 if that one and moby/swarmkit#2028 get merged in time.

Maybe better be safe and carry it in now along with this vendoring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanimej btw, this is not a mandatory request for this PR :-) ... so i will approve this PR for its own merits. If you can update to the above vendor, its a icing on the cake. thanks

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mavenugo Updated the vendoring to libnetwork SHA 4610dd67c7b9828bb4719d8aa2ac53a7f1f739d2 (brings moby/libnetwork#1354)

Copy link
Contributor

@mavenugo mavenugo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah removed the status/failing-ci Indicates that the PR in its current state fails the test suite label Mar 12, 2017
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
@sanimej
Copy link
Author

sanimej commented Mar 13, 2017

Vendoring also fixes docker #30727

@sanimej sanimej force-pushed the drillerrr branch 2 times, most recently from 1cb8ced to ffa17d4 Compare March 13, 2017 17:54

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "...like the service's VIP"? Sorry to keep nitpicking this. The sentence just doesn't sound correct without something between "like" and "VIP".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this sound better ?

docker network inspect --verbose for swarm mode overlay networks shows service-specific details like its VIP and port mappings.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I didn't have the context of this conversation, I would read that its as meaning that the network's VIP and port mappings are shown.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this ?
docker network inspect --verbose for swarm mode overlay networks shows service-specific details such as service's VIP and port mappings.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good if you change service's to a service's or the service's.

@aaronlehmann
Copy link

LGTM

@thaJeztah thaJeztah added this to the 17.04.0 milestone Mar 14, 2017
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing space after $

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If youre still updating, can you wrap these paragraphs to 80-chars?

@thaJeztah
Copy link
Member

Left some small nits, but no show-stoppers

…m mode

Signed-off-by: Santhosh Manohar <santhosh@docker.com>
@sanimej
Copy link
Author

sanimej commented Mar 14, 2017

@thaJeztah Updated the PR.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mavenugo mavenugo merged commit cdf66ba into moby:master Mar 14, 2017
@sanimej sanimej mentioned this pull request Mar 14, 2017
dnephin pushed a commit to dnephin/docker that referenced this pull request Apr 17, 2017
Add verbose flag to network inspect to show all services & tasks in swarm mode
dnephin pushed a commit to dnephin/docker that referenced this pull request Apr 17, 2017
Add verbose flag to network inspect to show all services & tasks in swarm mode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

X Tutup