Unable to open shell error

We sometimes see an issue when running an install with MetroAE that results in the following error:

2018-10-27 15:42:48,890 p=17607 u=root |  fatal: [ -> localhost]: FAILED! => {
    "changed": false,
    "failed": true,
    "msg": "unable to open shell. Please see:"

If you follow the URL provided in the error message, you will find instructions for putting Ansible in DEBUG mode to get more information about what caused the error. In this case, however, the key to diagnosing the problem is the following debug output:

2018-10-27 15:42:38,774 p=30156 u=root |  control socket path is /root/.ansible/pc/6f207cff52
2018-10-27 15:42:38,774 p=30156 u=root |  current working directory is /root/nuage-metro
2018-10-27 15:42:38,774 p=30156 u=root |  using connection plugin network_cli
2018-10-27 15:42:38,815 p=30156 u=root |  failed to create control socket for host
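
When the debug log is long, it can help to pull out the relevant lines automatically. The following is a minimal sketch, assuming the log lines have the same wording as the excerpt above; the function names `find_control_sockets` and `has_socket_failure` are hypothetical helpers for illustration and are not part of MetroAE or Ansible.

```python
import re

# Matches the debug line that reports the persistent-connection socket path.
# Assumption: the wording is the same as in the log excerpt above.
SOCKET_LINE = re.compile(r"control socket path is (\S+)")
FAILURE_TEXT = "failed to create control socket"


def find_control_sockets(log_text):
    """Return every control socket path mentioned in the log, in order."""
    return SOCKET_LINE.findall(log_text)


def has_socket_failure(log_text):
    """Return True if the log reports a control socket creation failure."""
    return FAILURE_TEXT in log_text


sample_log = """\
2018-10-27 15:42:38,774 p=30156 u=root |  control socket path is /root/.ansible/pc/6f207cff52
2018-10-27 15:42:38,774 p=30156 u=root |  current working directory is /root/nuage-metro
2018-10-27 15:42:38,774 p=30156 u=root |  using connection plugin network_cli
2018-10-27 15:42:38,815 p=30156 u=root |  failed to create control socket for host
"""

if has_socket_failure(sample_log):
    for path in find_control_sockets(sample_log):
        print("socket involved in the failure:", path)
```

The socket path reported here is the one to look at if a stale socket file turns out to be the cause.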

MetroAE uses Ansible’s sros_command module to send commands to the VSC. Under the covers, Ansible uses network_cli and the paramiko library to connect to the VSC and execute the commands. There are a few issues to consider:

  1. Some versions of paramiko do not work with VSC and produce errors much like this one. Check the output of pip show paramiko on your Ansible host to verify that you are running one of the versions we have tested and validated: 2.2.1 or 2.4.1. If the error output also includes GSSException, it could signal a wrong version of paramiko gssapi. The fix is to install a tested paramiko version, e.g. pip install paramiko==2.4.1.
  2. Some versions of Ansible are known to have bugs in their handling of the sockets that network_cli uses. To improve performance, Ansible leaves a socket open for some period of time so that it can be reused by subsequent tasks. We have seen problems with some Ansible versions not cleaning up properly after the timeout. Check the output of ansible --version and make sure you are running Ansible 2.4.1, the version we have tested and validated.
  3. Even if the proper versions of paramiko and Ansible are being used, we have seen that sometimes a stale socket file is left. If all else fails, try manually deleting the socket file shown in the output, /root/.ansible/pc/6f207cff52 in this case.
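
The three checks above could be scripted roughly as follows. This is a sketch under stated assumptions, not a MetroAE tool: it assumes a Python 3.8+ interpreter (for `importlib.metadata`) in the same environment where Ansible and paramiko are installed, and the helper names `installed_version`, `is_tested`, and `remove_stale_socket` are hypothetical. On older interpreters, pip show paramiko and ansible --version give the same information.

```python
import os
import stat
from importlib import metadata  # assumption: Python 3.8 or later

# Versions listed as tested and validated in the steps above.
TESTED_VERSIONS = {
    "paramiko": {"2.2.1", "2.4.1"},
    "ansible": {"2.4.1"},
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_tested(package, version):
    """Return True if the version is one of the tested versions for the package."""
    return version in TESTED_VERSIONS.get(package, set())


def remove_stale_socket(path):
    """Delete path only if it exists and is a Unix socket; return True if removed."""
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False
    if not stat.S_ISSOCK(mode):
        # Refuse to delete anything that is not a socket file.
        return False
    os.remove(path)
    return True


for package in TESTED_VERSIONS:
    found = installed_version(package)
    status = "tested" if is_tested(package, found) else "not a tested version"
    print(f"{package}: {found} ({status})")

# Step 3, run only when no playbook is using the connection. The path is the
# one from the example log; substitute the path from your own output.
# remove_stale_socket("/root/.ansible/pc/6f207cff52")
```

The socket removal is left commented out on purpose: deleting the file while a playbook is still running would break the connection that the socket belongs to.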

This is the place to start when encountering this error.