
I'm running into intermittent issues with Windows hosts in my Ansible playbooks. I'm running Ansible 2.3 with pywinrm 0.2.2 installed, using basic authentication with the local Administrator user.

Sometimes I get this error when I run a task:

 [WARNING]: FATAL ERROR DURING FILE TRANSFER: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/ansible/plugins/connection/winrm.py", line 267, in _winrm_exec
  self._winrm_send_input(self.protocol, self.shell_id, command_id, data, eof=is_last)
File "/usr/local/lib/python2.7/dist-packages/ansible/plugins/connection/winrm.py", line 248, in _winrm_send_input
  protocol.send_message(xmltodict.unparse(rq))
File "/usr/local/lib/python2.7/dist-packages/winrm/protocol.py", line 207, in send_message
   return self.transport.send_message(message)
File "/usr/local/lib/python2.7/dist-packages/winrm/transport.py", line 191, in send_message
   raise WinRMTransportError('http', error_message)
WinRMTransportError: (u'http', u'Bad HTTP response returned from server. Code 500')

Other times, when I run a win_shell/win_command/raw module with with_items against a group of Windows hosts, it seems to fail on temporary files created by Ansible.

The task I'm trying to run is:

- name: Check services up
  win_command: 'sc queryex {{ item }} | Findstr RUNNING'
  with_items: '{{ component_services }}'
  register: command_result
  ignore_errors: yes

And the error I may get is:

changed: [172.16.104.169] => (item=Dnscache)
failed: [172.16.104.176] (item=Dnscache) => {"failed": true, "item": "Dnscache", 
  "module_stderr": "Exception calling \"Run\" with \"1\" argument(s): \"Exception calling \"Invoke\" with \r\n\"0\" 
     argument(s): \"The running command stopped because 
           the preference variable \r\n\"ErrorActionPreference\" 
           or common parameter is set to 
   Stop: (0) : cannot open \r\nC:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\RESB3FF.tmp 
  for writing\r\n(1) : 
     using System;\r\n\"\"\r\nAt line:45 char:1\r\n+ 
     $output = $entrypoint.Run($payload)\r\n+ 
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n+ 
  CategoryInfo          : NotSpecified: (:) [], ParentContainsErrorRecordE \r\nxception\r\n+ 
  FullyQualifiedErrorId : ScriptMethodRuntimeException\r\n", 
  "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
changed: [172.16.104.141] => (item=Dnscache)
changed: [172.16.104.168] => (item=Dnscache)
changed: [172.16.104.145] => (item=Dnscache)

Both issues are completely random and may not appear at all across several consecutive runs.

Any assistance?

Pierre.Vriens
Asaf Haim

2 Answers


You should probably create an Ansible issue for this as it's most likely a bug in Ansible.

The first error makes me suspect WinRM pipelining:

  • Ansible 2.3.0 introduced an always-on WinRM pipelining feature (similar to SSH pipelining), and that may be behind this.
  • SSH pipelining can cause issues in Ansible for Linux, and it can be useful to turn it off, but this isn't yet possible for WinRM pipelining.

This related issue includes some Git commits that will re-enable 'non-pipelined' mode in a future release (it should now be released in 2.4, possibly with a backport as part of 2.3.2; see this comment).

Try upgrading to Ansible 2.4.1+ (which generally works well) to get the fix. Or try downgrading to Ansible 2.2.3 to see if this helps - this will disable WinRM pipelining and may avoid other regression bugs in this area.

  • If you installed Ansible using pip, you can run pip install ansible==2.4.1 to upgrade (or pip install ansible==2.2.3 to downgrade). If that doesn't help, do the same with 2.3.1 to upgrade again.
  • You should also upgrade to the latest pywinrm, as mentioned in the issues above.
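Before changing versions, it can help to confirm what is actually installed on the control host. The following is only a minimal sketch, not part of Ansible: the helper names (parse_version, in_suggested_range, installed_version) are hypothetical, it assumes Python 3.8+ for importlib.metadata (the question's host runs Python 2.7, where pip show ansible pywinrm gives the same information), and treating "anything before 2.3.0" as equivalent to the 2.2.3 suggestion is an assumption.

```python
from importlib import metadata  # Python 3.8+ (assumption; the question's host runs 2.7)


def parse_version(text):
    """Turn '2.3.0' into (2, 3, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad to three components so '2.4' compares like '2.4.0'.
    return tuple((parts + [0, 0, 0])[:3])


def in_suggested_range(ansible_version):
    """True for versions this answer suggests trying: 2.4.1+ or pre-2.3.0 (assumption)."""
    v = parse_version(ansible_version)
    return v >= (2, 4, 1) or v < (2, 3, 0)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("ansible", "pywinrm"):
        print(name, installed_version(name) or "not installed")
    current = installed_version("ansible")
    if current is not None and not in_suggested_range(current):
        print("Ansible", current, "is in the 2.3.0 to 2.4.0 range discussed above")
```

Running the script prints the installed ansible and pywinrm versions and flags an Ansible release that falls between 2.3.0 and 2.4.0, which is the range where pinning to a different release as described above may be worth trying.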
RichVel

I've found Ansible 2.3.2 to be the most stable, though I haven't spent much time with 2.4.1 yet. 2.4.0 definitely has some stability issues when it comes to WinRM.

Trondh