18

I've got an Ansible playbook where I want to update a number of flaky devices in sequence. I can use serial:1, but I want to stop the playbook altogether if I get a failure so I can fix it before proceeding instead of accumulating errors.

I'd also like to restart the playbook at the same host I stopped on. Currently using Ansible v2.0, but can also switch to a newer version if that sort of a feature is only available in newer versions.

Woodland
Peter Turner

2 Answers

16

According to the documentation, your playbook should already stop when a failure occurs if you're using serial: 1.

By default, Ansible will continue executing actions as long as there are hosts in the group that have not yet failed.

That said, there seems to be some confusion in the community over the default behavior, and it appears to have changed (or been buggy) somewhere between 1.8 and 2.1.

So, if serial: 1 doesn't suffice, use this additional setting:

max_fail_percentage: 0

In some situations, such as with the rolling updates described above, it may be desirable to abort the play when a certain threshold of failures have been reached. To achieve this, as of version 1.3 you can set a maximum failure percentage...
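As a rough illustration, the two settings could sit together at the play level as in the sketch below. The group name flaky_devices, the task, and the command path are placeholders I made up, not anything from the question. Since the play aborts only when the failure percentage is exceeded, a single failure in a batch of one host (100%) is enough to stop it.

```yaml
# Minimal sketch; "flaky_devices" and the update command are hypothetical placeholders.
- hosts: flaky_devices
  serial: 1                 # work through the hosts one at a time
  max_fail_percentage: 0    # abort the play as soon as a batch has any failure
  tasks:
    - name: Update device (placeholder task)
      command: /usr/local/bin/update-device   # hypothetical command
```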

==

As for retrying your playbook, you should be seeing a failure message like this:

to retry, use: --limit @/home/user/site.retry

Use that --limit flag on your next execution of ansible-playbook and it will continue from where it failed.

Retry files will be created unless you've set retry_files_enabled = False in your configuration.

Alternatively, --start-at-task may also work.
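For context, --start-at-task takes the name of a task, so it only helps if your tasks are named. A minimal sketch, assuming a playbook file called site.yml with made-up task names and placeholder commands:

```yaml
# Hypothetical playbook (site.yml); task names and commands are placeholders.
# To skip ahead to the second task, something like:
#   ansible-playbook site.yml --start-at-task="Apply update"
- hosts: flaky_devices
  serial: 1
  tasks:
    - name: Stage update files
      command: /usr/local/bin/stage-update    # hypothetical command
    - name: Apply update
      command: /usr/local/bin/apply-update    # hypothetical command
```

Note that this resumes at a task, not at a host, so you would probably still combine it with --limit to target the host you stopped on.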

Sources:

https://github.com/ansible/ansible/issues/1663

https://github.com/ansible/ansible/issues/16241

http://docs.ansible.com/ansible/playbooks_delegation.html#rolling-update-batch-size

http://docs.ansible.com/ansible/playbooks_delegation.html#maximum-failure-percentage

http://docs.ansible.com/ansible/intro_configuration.html#retry-files-enabled

http://docs.ansible.com/ansible/playbooks_startnstep.html#start-at-task

Woodland
2

In 2.5+ (released well after the question was asked), there is the playbook debugger, which covers most of this: https://docs.ansible.com/ansible/latest/user_guide/playbooks_debugger.html

As for running one host at a time, passing "--forks 1" on the command line makes Ansible connect to only one system at a time, which is handy if you want this ad hoc rather than on every run.