Problem: In this scenario, we have a potentially long list of usernames (accounts) and a directory containing SSH public keys, one per file with each file named after the user that owns that key. We want to deploy all the keys for the users in our list.
The sets of users and keys are not identical; there may be more users without keys (a common situation, alas), and we may have many other keys belonging to users not in this list.
(You might instead store your users’ keys directly in a list or dictionary, which would obviate the need for the code below, but you’d forever be copy-and-pasting keys into a YAML data structure unless you have some automated way to keep it up to date. Come to that, I hear those crazy kids even store public keys in LDAP directories these days.)
First issue: a file lookup throws an error if the file doesn’t exist. We could simply iterate over the list of users, pass over the ones that don’t have keys, and supply a null default for the lookup:
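A sketch of that naive approach might look something like this (the `users` variable and the `keys/` directory are illustrative names, not part of the original):

```yaml
- name: Deploy a key for every user, defaulting to nothing
  authorized_key:
    user: "{{ item }}"
    # The file lookup raises an error for a missing file before
    # the default filter ever gets a chance to apply.
    key: "{{ lookup('file', 'keys/' + item) | default('') }}"
  loop: "{{ users }}"
```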
But this is messy and long-winded, as we’re incurring a file lookup (even if it fails) and a remote call for every user. (Does default even work with lookup? I have a feeling it may not…)
A naïve first pass at solving this might be to go through the list of users, call the stat module to see if a local key file exists for each one and save the results in a list, and then use a conditional test before trying the lookup for the key:
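A sketch of that two-step version, assuming an illustrative `users` list and a `keys/` directory on the controller:

```yaml
- name: Check for a local key file for each user
  local_action:
    module: stat
    path: "keys/{{ item }}"
    # Skip file attributes we don't need, to save a little work
    get_checksum: no
    get_mime: no
    get_attributes: no
  loop: "{{ users }}"
  register: key_stats

- name: Deploy keys only for the users that have one
  authorized_key:
    user: "{{ item.item }}"
    key: "{{ lookup('file', 'keys/' + item.item) }}"
  loop: "{{ key_stats.results }}"
  when: item.stat.exists
```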
(Remember that a registered variable for a looped task contains a list of hashes, one for each item in the loop, each comprising the original object, named item, and the task result for that object, named after the module, in this case stat.)
We use local_action because we’re looking for the key files on the Ansible controller node rather than the client. This at least saves us some remote calls, as we only deploy actual keys that we find, rather than trying to do so for all the users. Here, we reduce the overhead of the stat module slightly by disabling the retrieval of file attributes that we don’t need, such as checksums. But we’re effectively iterating over the entire list of users twice, once for the users and again for their potential keys, many of which may not exist, which is slow and mostly wasted effort.
A better approach would be to iterate over only a list of keys that we know exist:
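Something along these lines, assuming the stat results from the previous step were registered as `key_stats` (names are illustrative):

```yaml
- name: Deploy only the keys that we know exist
  authorized_key:
    user: "{{ item | basename }}"
    key: "{{ lookup('file', item) }}"
  # Pull out the stat results, keep only the existing files,
  # and loop over their paths
  loop: "{{ key_stats.results | map(attribute='stat') | selectattr('exists') | map(attribute='path') | list }}"
```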
Here we pull out all the stat elements from the results list and then select only the ones that have an exists attribute that is true. This may reduce the number of iterations considerably, but it is still a loop stepping through one item at a time, and we haven’t avoided doing all that file I/O for the interminable stat lookups.
Ideally, we’d instead generate a listing of all the key files in a single pass (like an ‘ls’ of the directory), then take the intersection of that list with our list of users, i.e. the list of users for whom we have keys.
My first thought was to use the fileglob lookup to create a list of all the key files. However, fileglob returns the full paths for all the objects it finds. You might think it would be possible to use the Jinja2 map function to apply the basename filter to every element of the list, thus stripping the paths and leaving only the filenames:
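The attempt might look something like this (a sketch; the `keys/` directory and the `key_names` variable are illustrative):

```yaml
- name: Try to build a list of bare key file names (does not work as hoped)
  set_fact:
    key_names: "{{ lookup('fileglob', 'keys/*') | map('basename') | list }}"
```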
But this doesn’t work; instead it breaks the filenames up into a list of single-character strings. (The likely culprit: a lookup returns its results as a single comma-separated string unless you ask for a list, so map iterates over the characters of that string rather than over a list of filenames.)
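For illustration, with a single key file at a hypothetical keys/alice, debugging that value shows each character of the path string as a separate element, something like:

```
"key_names": ["k", "e", "y", "s", "/", "a", "l", "i", "c", "e"]
```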
However, the find module does give us a list we can filter in that way:
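A sketch of that, again with an illustrative `keys/` directory and variable names:

```yaml
- name: Find all the key files on the controller
  local_action:
    module: find
    paths: keys
    # Ignore editor backup files (excludes needs Ansible 2.5+)
    excludes: '*~'
  register: key_files

- name: Reduce the results to a list of bare filenames (i.e. usernames)
  set_fact:
    key_names: "{{ key_files.files | map(attribute='path') | map('basename') | list }}"
```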
(Note that the find excludes parameter, used here to remove editor backup files from the results, is only available from Ansible 2.5. It isn’t strictly necessary, as only the files that match actual usernames will be used anyway. Alternatively, you could ensure that all your key files are named username.pubkey instead, which is perhaps a bit more intuitive, and then use patterns: '*.pubkey' with find. But you’d have to strip the extensions as well in the next step.)
(Instead of find, we could just run an ls command and process the standard output with split to make a list, or even shell out and call echo dir/*. But that’s spawning another process, which is cheating.)
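That cheat might look like this (a sketch; the path is illustrative):

```yaml
- name: List the key files the lazy way
  local_action: command ls keys/
  register: ls_out

# ls_out.stdout_lines is already a list of filenames,
# so no basename step would be needed.
```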
Again, the find module is run locally, as the key files are stored on our Ansible controller node. We extract the path attributes from the files key in the return value of the find module, as those contain the path for each file found, and then run basename over them to strip the directory part.
Now we can obtain the intersection of our sets of users and key files with one simple Jinja2 filter, and deploy the precise set of keys that we need:
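A sketch of the final step, assuming the list of key file names from find was saved as `key_names` (names and paths illustrative):

```yaml
- name: Deploy keys for exactly the users that have them
  authorized_key:
    user: "{{ item }}"
    key: "{{ lookup('file', 'keys/' + item) }}"
  # intersect gives us only the users that appear in both lists
  loop: "{{ users | intersect(key_names) }}"
```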
This seems to me quite a neat pattern if you ever need to process a set of files according to some selective criteria or by cross-referencing against a second list. And it’s a lot quicker than watching a list of values scroll steadily up the screen.