
I'm trying to debug a caching issue with Puppet on Red Hat 7. My versions are listed at the bottom of this question.

Below is an excerpt from my site.pp manifest. This works as expected, and the Nagios check is installed on the foo.example.com node.

node 'foo.example.com' {

  nagios::service {'my_database':
    check_command => 'check_tcp_nrpe!3306',
    service_description => 'My Database',
  }

}

Now, if I add another nagios::service check in site.pp it also gets picked up by,

puppet agent --noop --test

but if I remove the same nagios::service call and run the agent again, it still sees the check. These are dry runs, so I don't understand why anything is being cached. This has been happening in many different scenarios across multiple manifests. If I remove the PuppetDB database and run the agent, the database is re-created and everything goes back to normal for a while.

Any suggestions on where to look before I go down the route of upgrading puppet, or re-installing the latest version? I'm not sure what other information to provide, so please let me know if there's something that might help.

My versions,

puppetlabs-release-7-12.noarch
puppet-server-3.8.6-1.el7.noarch
puppetdb-terminus-2.3.8-1.el7.noarch
puppet-3.8.6-1.el7.noarch
puppetdb-2.3.8-1.el7.noarch

Update 1

Below is the output from running # puppet agent --noop --test,

# puppet agent --noop --test
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for foo.example.com
Info: Applying configuration version '1522355276'
.
.
.
Notice: /Stage[main]/Nagios::Server/Nagios::Service_file[/etc/nagios/conf.d/services/foo-my_database_nagios_service.cfg]/File[/etc/nagios/conf.d/services/foo-my_database_nagios_service.cfg]/ensure: current_value absent, should be present (noop)
.
.
.
Notice: Finished catalog run in 21.10 seconds

The notice saying that this file should be present is bogus.

All I did was add,

nagios::service {'my_database':
    check_command => 'check_tcp_nrpe!3306',
    service_description => 'My Database',
}

run the agent, then removed it, and ran the agent again. Every time I run the agent it still thinks that check should be present even though it's not defined in any of my manifests.
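One way to double-check that the check really is gone from the manifests is to search the manifest tree for the resource title. This is only a sketch: the function name `find_resource_refs` is made up for illustration, the directories in the usage comment are the usual Puppet 3 defaults and may not match your layout, and `--include` assumes GNU grep (which Red Hat 7 ships).

```shell
#!/bin/sh
# Hypothetical helper: list every line in .pp files that mentions a resource title.
# Usage: find_resource_refs TITLE DIR [DIR...]
# Exits non-zero (grep's status) when nothing matches.
find_resource_refs() {
    title=$1
    shift
    # -r: recurse, -n: show line numbers, -F: treat the title as a fixed string
    grep -rnF --include='*.pp' -- "$title" "$@"
}

# Example (paths are assumptions; adjust to your layout):
# find_resource_refs my_database /etc/puppet/manifests /etc/puppet/modules
```

If this prints nothing, the title is not present in any manifest under those directories, which points away from the manifests themselves as the source of the stale resource.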

Update 2

These are the steps I use to remove the cached item. After running these steps it no longer tries to add that my_database check.

cd /var/lib/puppetdb
sudo mv db db.`date +%F` # create a backup
sudo systemctl restart puppetmaster
sudo systemctl restart puppetdb
wsams
  • 161

1 Answer


This issue turned out to be related to what I was doing in "Update 2". When the PuppetDB database was deleted, PuppetDB lost track of all its resources. Once puppet agent --test --noop had been run on all of our servers, it knew where to find the resources again and everything could be found in the catalog.

Basically, once puppetdb is deleted you should run puppet agent --test --noop on all the hosts.
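For more than a handful of hosts, that step can be scripted. The following is a minimal sketch, not a tested procedure: the function name `run_noop_on_hosts`, the `RUNNER` variable, and the `hosts.txt` file are all hypothetical, and it assumes you can ssh to each node and run sudo there without a password prompt.

```shell
#!/bin/sh
# Hypothetical helper: run a noop agent pass on every host named on stdin.
# RUNNER defaults to ssh; it can be overridden (e.g. RUNNER=echo) for a dry run.
run_noop_on_hosts() {
    runner=${RUNNER:-ssh}
    while IFS= read -r host; do
        # Skip blank lines and comment lines in the host list
        case $host in ''|\#*) continue ;; esac
        # </dev/null keeps the remote command from consuming the host list.
        # Report non-zero statuses and carry on instead of stopping the loop.
        "$runner" "$host" 'sudo puppet agent --test --noop' </dev/null ||
            echo "agent on $host exited with status $?" >&2
    done
}

# Example (hosts.txt is an assumed file with one hostname per line):
# run_noop_on_hosts < hosts.txt
```

Running it first with `RUNNER=echo` prints the commands that would be executed, which is a cheap way to check the host list before touching any node.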

wsams
  • 161