vSphere Replication Sync Fails With Exception: com.vmware.hms.replication.sync.DeltaAbortedException

There are a few instances where a vSphere Replication sync (RPO-based or manual) fails with a DeltaAbortedException. This in turn also affects a test or planned migration performed with Site Recovery Manager.

In the hms.log located under /opt/vmware/hms/logs on the vSphere Replication Server, you will notice something like:

2018-01-10 14:35:59.950 ERROR com.vmware.hms.replication.sync.ReplicationSyncManager [hms-sync-progress-thread-0] (..replication.sync.ReplicationSyncManager) operationID=fd66efca-f070-429c-bc89-f2164e9dbb7a-HMS-23613 | Completing sync operation because of error: {OnlineSyncOperation, OpId=fd66efca-f070-429c-bc89-f2164e9dbb7a-HMS-23613, GroupMoId=GID-2319507d-e668-4eea-aea9-4d7d241dd886, ExpInstSeqNr=48694, TaskMoId=HTID-56fd57dd-408b-4861-a124-70d8c53a1194, InstanceId=2f900595-2822-4f2b-987d-4361f703505c, OpState=started, VcVmMoid=vm-28686, createInstanceRetryCount=2, fullSyncOngoing=false, operationId=null}
com.vmware.hms.replication.sync.DeltaAbortedException
        at com.vmware.hms.replication.sync.SyncOperation.checkHealth(SyncOperation.java:911)
        at com.vmware.hms.replication.sync.SyncOperation$4.run(SyncOperation.java:735)
        at com.vmware.hms.util.executor.LoggerOpIdConfigurator$RunnableWithDiagnosticContext.run(LoggerOpIdConfigurator.java:133)
        at com.vmware.hms.util.executor.LoggerOpIdConfigurator$2.run(LoggerOpIdConfigurator.java:100)
        at com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2.run(TlsPreservingWrapper.java:47)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
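To gauge how frequently the sync is aborting, counting the occurrences of the exception in the log is a quick check (the grep usage here is just an illustration):

# grep -c "DeltaAbortedException" /opt/vmware/hms/logs/hms.log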

This occurs when the outgoingeventlogentity and incomingeventlogentity tables in the vR database contain a large number of entries.

The following fix should be applied at your own risk. Take a snapshot and/or a backup of the vR server before performing the change.

1. Navigate to the VRMS database's bin directory:
# cd /opt/vmware/vpostgres/9.x/bin
The postgres version varies depending on the replication server release.
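If you are unsure which version directory is present on your appliance, listing the parent path will show it:

# ls -d /opt/vmware/vpostgres/*/bin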

2. Back up the replication database using the below command:
# ./pg_dump -U vrmsdb -Fp -c > /tmp/DBBackup.bak
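Should you ever need to roll back, a plain-format dump like this one can be replayed through psql (a sketch using the same file name as above):

# ./psql -U vrmsdb < /tmp/DBBackup.bak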

3. Connect to the vR database using:
# ./psql -U vrmsdb

4. Run the below queries to get the row counts of the two logentity tables:

select count(*) from outgoingeventlogentity; 
select count(*) from incomingeventlogentity;

In my case, the output on the production site vR was:

vrmsdb=# select count(*) from incomingeventlogentity;
 count
-------
 21099
(1 row)

vrmsdb=# select count(*) from outgoingeventlogentity;
 count
-------
   146
(1 row)

And on the recovery site, the outgoingeventlogentity table had 21k+ events.
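As an optional check, standard Postgres size functions can show how much disk space these tables occupy:

vrmsdb=# select pg_size_pretty(pg_total_relation_size('incomingeventlogentity'));
vrmsdb=# select pg_size_pretty(pg_total_relation_size('outgoingeventlogentity'));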

5. First, you can change the max event age limit to 10800 in the hms-configuration.xml file:
# vi /opt/vmware/hms/conf/hms-configuration.xml

The edited line should look like this:
<hms-eventlog-maxage>10800</hms-eventlog-maxage>
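If you prefer to make the change non-interactively, a sed one-liner along these lines should work (an illustration; double-check the file before and after):

# sed -i 's|<hms-eventlog-maxage>[0-9]*</hms-eventlog-maxage>|<hms-eventlog-maxage>10800</hms-eventlog-maxage>|' /opt/vmware/hms/conf/hms-configuration.xml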

6. Next, we will have to purge the event logs from the above-mentioned tables. There are a lot of columns in these tables (visible if you run select * from <table-name>;), but the one column we need is "timestamp".

The timestamp column holds an epoch value in milliseconds, for example: 1515479242006
To convert this to a human-readable date, you will have to:

> Remove the last 3 digits (the milliseconds) from the value.
So 1515479242006 becomes 1515479242. Then convert this epoch time to a normal date using any epoch/Unix time converter.
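Alternatively, the conversion can be done right on the appliance with GNU date (assuming it is present on your build):

# date -d @1515479242
Tue Jan  9 06:27:22 UTC 2018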

Now, choose a cutoff timestamp such that everything older than it gets purged from the database. During the purge, the timestamp has to be the complete millisecond value as stored in the timestamp column. Then run the below queries:

DELETE from incomingeventlogentity WHERE timestamp < 1515479242006;
DELETE from outgoingeventlogentity WHERE timestamp < 1515479242006;
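Once the deletes complete, re-running the counts from step 4 confirms the tables have shrunk:

vrmsdb=# select count(*) from incomingeventlogentity;
vrmsdb=# select count(*) from outgoingeventlogentity;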

7. Then restart the hms service using:
# systemctl stop hms
# systemctl start hms

The above is applicable to vR 6.1.2 and later. On older versions, use:
# service hms restart
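Either way, it is worth confirming the service came back up before moving on (shown here for 6.1.2 and later; older builds use the service equivalent):

# systemctl status hms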

8. Re-pair the sites, then perform a sync now operation, and you should be good to go.

Hope this helps!

