Nothing’s worse than when your domain controllers won’t talk to each other.
In an ideal world, I’d have redundant file servers with redundant power, feeding redundant virtual machine servers ready to fail over at a moment’s notice. But life is rarely ideal, so when a few freak accidents come along, like my virtualization server kernel panicking during backups (more on that later) and some odd ATA errors on my main domain controller, the resulting corruption can be a pain.
This problem presented itself as it often does when these two machines go awry: my two (at least something’s redundant in this network) FreeIPA masters stop updating each other. No big deal, right? I’ve seen this before. Just log into one and run ipa-replica-manage force-sync. Watch it time out because there actually is a replication issue, and move on to ipa-replica-manage re-initialize.
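For reference, those two steps look roughly like this when run on the master that is falling behind. This is only a sketch: the hostname ipa1.example.com is a placeholder for the other master, and `run` is a hypothetical dry-run helper (not part of FreeIPA) that prints each command so it can be checked before anything is executed.

```shell
# Placeholder hostname of the master to pull data from (an assumption).
PEER="${PEER:-ipa1.example.com}"
# DRY_RUN=1 (the default here) only prints commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Ask the peer to send its pending changes to this server right away.
force_sync() {
    run ipa-replica-manage force-sync --from "$PEER"
}

# Discard the local copy of the data and pull a full copy from the peer.
reinitialize() {
    run ipa-replica-manage re-initialize --from "$PEER"
}

force_sync
reinitialize
```

Keeping the dry run as the default is deliberate, since re-initialize overwrites the local data with whatever the peer has.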
If I’m thinking of the two servers as primary and backup, that line of thought makes sense. When that doesn’t work, remake the replication agreement with ipa-replica-manage connect.
Nope.
Creation of IPA replication agreement is deprecated with managed IPA replication topology. Please use ipa topologysegment-* commands to manage the topology.
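A minimal sketch of where that message points: listing the existing segments and asking FreeIPA to check the topology. It assumes an admin Kerberos ticket (kinit admin) and the default "domain" suffix. The `run` helper is again a hypothetical dry-run wrapper, not something FreeIPA ships.

```shell
# DRY_RUN=1 (the default here) only prints commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

inspect_topology() {
    # List every replication segment in the "domain" suffix.
    run ipa topologysegment-find domain
    # Check the suffix for disconnected servers and similar problems.
    run ipa topologysuffix-verify domain
}

inspect_topology
```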
For the sake of clarity, I’ll refer to the “primary” master as ipa1, and the “secondary” master as ipa2.
Without an idea of how to proceed, I rebuilt ipa2 from ipa1 by fully recreating its VM. I was never happy with that idea, and my gut feeling was right: I am now convinced that the corruption was on ipa1, and this move just made it worse.
With the replacement ipa2 working, and replication still showing as “connectivity left-right”, I ran ipa-server-install --uninstall on ipa1 and attempted to rebuild it as a replica of ipa2, which failed with the error “Incorrect number of results (0) searching for public key”.
I restored the full-disk backup of ipa2 from March and rebuilt ipa1 as a replica of it. This finally resulted in a working, bi-directional replication agreement between ipa1 and ipa2. However, blowing away one of only two IPA masters in my network is not a solution I’m happy with.
In a better world, I could have just broken and reformed the replication agreement and moved on with my life, without the interference of FreeIPA trying to preserve the topology. I’m hesitant to think this would fix everything, and I’m afraid doing this very thing in the past led to replication conflicts.
In an even better world, I would’ve started running ipa-backup on cron much earlier, but then I wouldn’t have learned any of the following:
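For what it’s worth, a cron-driven backup can be as small as the sketch below. The script path, the schedule, and the `run` dry-run helper are all assumptions for illustration. The `--data --online` options back up only the LDAP data without stopping IPA, while a plain ipa-backup takes a full backup and stops the IPA services while it runs.

```shell
# Hypothetical wrapper, e.g. saved as /usr/local/sbin/ipa-nightly-backup
# DRY_RUN=1 (the default here) only prints commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

nightly_backup() {
    # Data-only backup taken while the IPA services keep running.
    run ipa-backup --data --online
}

nightly_backup

# Example line for root's crontab (03:15 every night, schedule is arbitrary):
# 15 3 * * * DRY_RUN=0 /usr/local/sbin/ipa-nightly-backup
```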
- When IPA breaks down, it’s time to learn ldapsearch, ldapmodify, and ldapdelete.
- Even when you edit LDAP directly, you still can’t change ipaReplTopoSegmentDirection.
- LDIF is a format useful in modifying LDAP entries.
- FreeIPA switched away from manual connections to topology.
- FreeIPA really doesn’t like you doing anything that could disconnect the topology.
- Once replication is working, Red Hat has a chapter on resolving replication conflicts.
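To make the LDAP items above a little more concrete, here is a rough sketch of an ldapsearch against the topology segments and a small LDIF file for ldapmodify. Everything specific in it is an assumption: the dc=example,dc=com suffix, the jdoe user, and the exact location of the segment entries under cn=topology, which is worth confirming on your own server. The functions only print text, so nothing is changed until you run the commands yourself.

```shell
# Placeholder base DN; replace with the suffix of your own realm.
SUFFIX="${SUFFIX:-dc=example,dc=com}"

# Print an ldapsearch command that lists the domain topology segments
# and the direction attribute of each one.
segment_search_cmd() {
    printf '%s\n' "ldapsearch -x -D 'cn=Directory Manager' -W -b 'cn=domain,cn=topology,cn=ipa,cn=etc,$SUFFIX' '(objectClass=ipaReplTopoSegment)' ipaReplTopoSegmentDirection"
}

# Print an LDIF change record that replaces one attribute on one entry.
# The user jdoe is hypothetical.
make_ldif() {
    cat <<EOF
dn: uid=jdoe,cn=users,cn=accounts,$SUFFIX
changetype: modify
replace: description
description: edited with ldapmodify
EOF
}

segment_search_cmd
make_ldif

# To apply the change for real:
#   make_ldif > change.ldif
#   ldapmodify -x -D 'cn=Directory Manager' -W -f change.ldif
```

The LDIF has the usual shape: the dn line names the entry, changetype says what kind of operation it is, and the replace line names the attribute whose new value follows.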
I’m still somewhat confused about one of the solutions I tried in the interim. I created a new, third IPA master and connected it to both the primary and secondary in the topology editor (an amazingly simple three-click operation for each master). This allowed me to break the connection between the two machines and create a new link between them, which showed as “Connectivity both” in the domain topology segment. When I removed the third server, however, the two original masters stopped talking to each other. This happened despite the topology segment still showing bidirectional communication.
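I did this in the web UI, but the same sequence can be sketched with the ipa topologysegment-* commands. Treat it as an approximation: the name ipa3, the example.com domain, the segment names, and the `run` dry-run helper are all assumptions. The real name of the existing segment should be checked with ipa topologysegment-find domain first, since it may be written in the opposite order.

```shell
# Placeholder DNS domain for the three masters (an assumption).
DOMAIN="${DOMAIN:-example.com}"
# DRY_RUN=1 (the default here) only prints commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a bidirectional segment between two masters given by short name.
add_segment() {
    run ipa topologysegment-add domain "${1}.${DOMAIN}-to-${2}.${DOMAIN}" \
        --leftnode="${1}.${DOMAIN}" --rightnode="${2}.${DOMAIN}" --direction=both
}

# Delete the segment with the matching left-to-right name.
del_segment() {
    run ipa topologysegment-del domain "${1}.${DOMAIN}-to-${2}.${DOMAIN}"
}

add_segment ipa3 ipa1   # connect the temporary third master to ipa1
add_segment ipa3 ipa2   # and to ipa2, so the topology stays connected
del_segment ipa1 ipa2   # now the original link can be removed
add_segment ipa1 ipa2   # and recreated from scratch
```

The order matters because FreeIPA refuses changes that would disconnect the topology, which is why the third master has to be linked to both servers before the original segment is deleted.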