
Uninstall instructions #571

Open
nandanrao opened this issue Jun 21, 2016 · 5 comments

@nandanrao

It's really nice and easy to launch an Elasticsearch cluster (in my case, into a DC/OS cluster) with this library. However, it's a little unclear to me how to remove a cluster / uninstall. It could use some mention in the docs!

@nandanrao
Author

(If the only correct way is to manually /teardown via the Mesos framework ID, I'm happy to make a docs PR, but I suspect there may be another way?)

@frankscholten
Contributor

@nandanrao Thanks for opening this issue.

Indeed, tearing down via

curl -XPOST $MASTER/teardown -d 'frameworkId=794b66f4-2c4f-45cd-920b-8ee0b3555259-0001'

is the way to do it.
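
If you don't have the framework ID to hand, you should be able to look it up from the Mesos master's state endpoint. Something along these lines (a rough sketch: it assumes jq is installed, and the exact path may be /state or /master/state depending on your Mesos version):

curl -s $MASTER/state | jq -r '.frameworks[] | "\(.id)  \(.name)"'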

However, when testing this I might have found a bug. I did a teardown, and the scheduler and executors were killed as expected; Marathon then restarted the scheduler, but it did not launch a new executor. Instead the logs contained the following:

[DEBUG] 2016-06-22 12:37:16,061 class org.apache.mesos.elasticsearch.scheduler.ElasticsearchScheduler resourceOffers - Declined offer: id { value: "794b66f4-2c4f-45cd-920b-8ee0b3555259-O245" }, framework_id { value: "794b66f4-2c4f-45cd-920b-8ee0b3555259-0001" }, slave_id { value: "794b66f4-2c4f-45cd-920b-8ee0b3555259-S3" }, hostname: "172.17.0.8", resources { name: "ports",  type: RANGES,  ranges {  range {   begin: 37000,    end: 38000,   },  },  role: "*" }, resources { name: "cpus",  type: SCALAR,  scalar {  value: 2.0,  },  role: "*" }, resources { name: "mem",  type: SCALAR,  scalar {  value: 4096.0,  },  role: "*" }, resources { name: "disk",  type: SCALAR,  scalar {  value: 20000.0,  },  role: "*" }, url { scheme: "http",  address {  hostname: "172.17.0.8",   ip: "172.17.0.8",   port: 5051,  },  path: "/slave(1)" }, Reason: Cluster size already fulfilled
[DEBUG] 2016-06-22 12:37:22,042 class org.apache.mesos.elasticsearch.scheduler.ElasticsearchScheduler isHostnameResolveable - Attempting to resolve hostname: 172.17.0.5
[DEBUG] 2016-06-22 12:37:22,047 org.apache.mesos.Protos$TaskStatus <init> - Task status for elasticsearch_172.17.0.6_20160622T123139.777Z exists, using old state: TASK_RUNNING

"Cluster size already fulfilled" means that the scheduler thinks a task is still running, based on the state in ZooKeeper, even though that executor has been killed.

I am looking into the Mesos code to understand how teardown works at a lower level. We might have to add code to do proper ZK state cleanup on teardown. The question is how to do this and where in the framework. I asked a question in #general on the DC/OS community Slack: https://dcos-community.slack.com
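
In the meantime you can inspect what the scheduler has persisted in ZooKeeper yourself. Roughly (a sketch, assuming zkCli.sh is on your path and the default elasticsearch zNode name; the node is named after the cluster/framework name you configured):

zkCli.sh -server $ZK_HOST:2181 ls /elasticsearch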

@nandanrao
Author

Yes, I saw this as well and ran the Docker cleanup script, which I believe in this case ONLY removed the ZooKeeper node named after the Elasticsearch cluster. That SEEMS to have fixed it, although I did not look very closely.

@philwinder
Contributor

Personally, I always viewed shutdown as being of secondary importance. Who wants their ES cluster to be destroyed? ;-)

But seriously, in the past I simply stopped the scheduler. There would be some state left in ZooKeeper, which is kept in case the scheduler fails on its own and needs to recover. You can manually delete that, or just ignore it; there's very little in there.
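
Deleting it by hand would look something like this (again just a sketch, assuming the default elasticsearch zNode name; newer ZooKeeper releases use deleteall instead of rmr):

zkCli.sh -server $ZK_HOST:2181 rmr /elasticsearch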

@philwinder
Contributor

Oh, and also check out #550. If the scheduler is shut down, then the executors are shut down, and then the scheduler starts again, it will still think that the executors are running, because we never receive any updates from Mesos to tell us that they've gone. An issue with Mesos IMO, but a "ping" mechanism to make sure they are still there would work around it.
