Hi guys, I need help with HBase replication troubleshooting. On active master I get the following:
Traceback (most recent call last): File "/var/lib/ambari-agent/cache/host_scripts/alert_hbase_replication.py", line 123, in execute raise Exception('The following region servers breached log/replication thresholds:\n'+'\n'.join(breached)) Exception: The following region servers breached log/replication thresholds: Region server AAA has breached SizeOfLogQueue limit of 20. Has values of: PeerID=XXX SizeOfLogQueue=3908 TimeStampsOfLastShippedOp=Mon Jan 16 06:44:10 UTC 2023 Replication Lag=953231333 Region server BBB has breached SizeOfLogQueue limit of 20. Has values of: PeerID=XXX SizeOfLogQueue=2711 TimeStampsOfLastShippedOp=Fri Jan 27 07:31:21 UTC 2023 Replication Lag=979 Region server CCC has breached SizeOfLogQueue limit of 20. Has values of: PeerID=XXX SizeOfLogQueue=589 TimeStampsOfLastShippedOp=Mon Jan 16 06:46:26 UTC 2023 Replication Lag=953096639