We are using an AWS stack that includes OpenSearch. We have several AWS accounts for different customers, with a largely identical OpenSearch setup in each: every AWS account has one PROD and one STAGING OpenSearch domain (PROD and STAGING are our own way of configuring and using the domains, not anything AWS provides).
The problem is that, from time to time, our master users stop working. Luckily this has only happened on the staging domains so far, but who knows? When it happens, I can no longer log in to the OpenSearch dashboard with the master user, and my app can no longer authenticate via the API either.
Our workaround is to "create (another?) master user" on the affected domain, using a DIFFERENT password.
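For reference, this is roughly what the workaround looks like when scripted. It is a minimal sketch, assuming the domain uses the internal user database and that boto3's OpenSearch `update_domain_config` call with `AdvancedSecurityOptions.MasterUserOptions` is the equivalent of the console action. The helper names, domain name, user name, and password below are made up for illustration.

```python
def build_master_user_reset(domain_name, username, new_password):
    """Build the keyword arguments for update_domain_config.

    Only the master user options are set; all other domain settings
    are left untouched.
    """
    return {
        "DomainName": domain_name,
        "AdvancedSecurityOptions": {
            "MasterUserOptions": {
                "MasterUserName": username,
                "MasterUserPassword": new_password,
            }
        },
    }


def reset_master_user(client, domain_name, username, new_password):
    """Send the reset request using a boto3 'opensearch' client."""
    kwargs = build_master_user_reset(domain_name, username, new_password)
    return client.update_domain_config(**kwargs)


if __name__ == "__main__":
    import boto3  # imported here so the helpers work without boto3 installed

    # Hypothetical values; replace with the real domain and credentials.
    client = boto3.client("opensearch")
    reset_master_user(client, "staging-domain", "master-user", "a-NEW-password")
```

The request builder is kept separate from the call so the payload can be inspected without touching AWS.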
So far, it has only happened once on each account, and only in staging. Still, it is a really uncomfortable prospect to imagine this happening on PROD, or more often than roughly once a month on a single product.
Do you know what might be happening here? I considered something like upgrades losing the master user DB, or AWS blocking leaked passwords. I wouldn't know how a KeePass-generated password would leak, though, unless it REALLY LEAKED, which would mean we were in much bigger trouble, and in that case I'd expect a message from AWS. My most probable guess is that the cluster's single instance has been replaced and the user DB has gone with it. That would explain why the PROD domains do not have this problem, but we would like to avoid keeping too many resources on hold for our staging ENV...
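To compare the domains with that guess in mind, this is a small sketch of the fields I look at. It assumes the response shape of boto3's OpenSearch `describe_domain` call (`DomainStatus` with `ClusterConfig`, `AdvancedSecurityOptions`, and `ServiceSoftwareOptions`); the function name and the domain name are made up. It does not prove that an instance was replaced, it only shows whether a domain is single-node and what its service software status is.

```python
def summarize_domain(response):
    """Pick out the fields relevant to the 'single instance replaced' guess."""
    status = response["DomainStatus"]
    cluster = status.get("ClusterConfig", {})
    security = status.get("AdvancedSecurityOptions", {})
    software = status.get("ServiceSoftwareOptions", {})
    return {
        "instance_count": cluster.get("InstanceCount"),
        "single_node": cluster.get("InstanceCount") == 1,
        "internal_user_db": security.get("InternalUserDatabaseEnabled"),
        "software_version": software.get("CurrentVersion"),
        "update_status": software.get("UpdateStatus"),
    }


if __name__ == "__main__":
    import boto3  # imported here so the helper works without boto3 installed

    client = boto3.client("opensearch")
    # Hypothetical domain name.
    print(summarize_domain(client.describe_domain(DomainName="staging-domain")))
```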
Any other ideas?