
Configuring Log Insight 4.3 For A Fresh Deployment

VMware vRealize Log Insight is a product that collects logs from various solutions and helps administrators filter and analyze them. It is useful for monitoring environments and performing security audits for each configured solution. Log Insight is deployed as a virtual appliance from an OVA template.

I will skip the OVA deployment part, as most of you are familiar with how an OVA deployment goes. Once the OVA deployment completes and the appliance is powered on, it performs certain initialization tasks and then restarts once more. Once the restart completes, you are all set to configure the appliance.

Log Insight has an HTML5-based client to configure and administer the solution. To access this page for configuration, go to:

https://Log_insight_IP/admin

This will bring you to the following page:


Click Next to start the configuration.

We will be configuring a new deployment of Log Insight, so click Start New Deployment.

Provide the admin password and an optional administrator email for notifications.



Enter the License Key for the product and click Add License. If the license is valid, you will get a table confirming the same. Click Save And Continue. If you do not have a license, click Skip.



In the General Configuration, provide an email ID for system alerts and notifications. Click Save And Continue.



Configure a time server for your Log Insight appliance. If you have an NTP server, drop down Sync Server Time With, select NTP, and provide the NTP server address. You will have to click Test to validate the NTP server. If you do not have an NTP server, you can sync the time with the ESXi host. Click Save And Continue.


For system notifications to be forwarded, SMTP has to be configured. Enter the SMTP server and the email address you would like to send notifications to, and click Send Test Email. Once you confirm the test email was received successfully, click Save And Continue.



And with that, the basic setup of your Log Insight is complete. Click Finish to proceed further.
Next, you can integrate solutions like vCenter Server so that they forward their logs to this Log Insight appliance.

Hope this was helpful.


Unable To Configure ESXi Syslog In Log Insight 4.x: Details: Client received SOAP Fault from server

When you try to configure syslog for an ESXi host under /admin > vSphere (Integration), you might run into the below error:

Syslog configuration failed. See http://kb.vmware.com/kb/2003322 for manual configuration. (Details: Client received SOAP Fault from server: A general system error occurred: Internal error Please see the server log to find more detail regarding exact cause of the failure)


If you look at the ESXi host's syslog field, Syslog.global.logHost, under Host > Configuration > Advanced Settings > Syslog, you will notice this field is either empty or incorrectly configured. Populate it with the IP of your Log Insight machine; it should look something like the below. Click OK to save the settings.


If it is udp, it should be:
udp://<log-insight-ip>:514

For tcp it should be:
tcp://<log-insight-ip>:1514

Save the settings, and also make sure the syslog firewall rule is open under Security Profile. Once confirmed, you can proceed to reconfigure the syslog via Log Insight, and it should now complete successfully.
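If you prefer to make the same change from the ESXi command line instead of the client, esxcli can set the syslog target and enable the firewall rule as well (a quick sketch; substitute your own Log Insight IP and protocol):

# Point the host's syslog at Log Insight and reload the syslog daemon
esxcli system syslog config set --loghost=tcp://<log-insight-ip>:1514
esxcli system syslog reload
# Make sure the outbound syslog firewall rule is enabled
esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true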


You should then be able to view the host's events under your Log Insight dashboards.

Hope this helps!

Shell Script To Create VMs From Command Line

While I agree PowerCLI APIs are the right way to deploy multiple VMs on an ESXi host, I had to fall back to a shell script for a project I have been working on lately. The script is pretty simple. It is divided into 6 functions:

Function 1 is for the VMX file template, which takes variable input obtained from the remaining functions
Function 2 is for creating a VMDK of the required size and provisioning type
Function 3 is for MAC address generation
Function 4 is for VM UUID and VC UUID generation
Function 5 is for VM registration
Function 6 is for VM power on

Let's have a look at this:

Since ESXi does not run bash, I had to go with the #!/bin/sh shebang to define the interpreter.
The VMX file function uses a pre-created template with certain options that take variable input, which the user provides while executing the script:

Create_VM ()
{
read -p "Enter the VM name: " VM_name
read -p "Enter the path of the datastore. /vmfs/volumes/<storage-name>/: " datastore_name
cd /vmfs/volumes/$datastore_name
mkdir $VM_name && cd $VM_name && touch $VM_name.vmx
read -p "Enter the Hardware version for the VM: " HW_version
read -p "Enter the Memory required for the VM: " Memory
read -p "Enter the network type, e1000 / VMXNET3: " Net_type
read -p "Enter the VM Port group name: " Port_group
# VMX File Entries
cat << EOF >$VM_name.vmx
.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "$HW_version"
nvram = "$VM_name.nvram"
pciBridge0.present = "TRUE"
svga.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
memSize = "$Memory"
scsi0.virtualDev = "lsisas1068"
scsi0.present = "TRUE"
ide1:0.startConnected = "FALSE"
ide1:0.deviceType = "cdrom-raw"
ide1:0.clientDevice = "TRUE"
ide1:0.fileName = "emptyBackingString"
ide1:0.present = "TRUE"
floppy0.startConnected = "FALSE"
floppy0.clientDevice = "TRUE"
floppy0.fileName = "vmware-null-remote-floppy"
ethernet0.virtualDev = "$Net_type"
ethernet0.networkName = "$Port_group"
ethernet0.checkMACAddress = "false"
ethernet0.addressType = "static"
ethernet0.Address = "$final_mac"
ethernet0.present = "TRUE"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "$VM_name.vmdk"
scsi0:0.present = "TRUE"
displayName = "$VM_name"
guestOS = "windows8srv-64"
disk.EnableUUID = "TRUE"
toolScripts.afterPowerOn = "TRUE"
toolScripts.afterResume = "TRUE"
toolScripts.beforeSuspend = "TRUE"
toolScripts.beforePowerOff = "TRUE"
uuid.bios = "$uuid"
vc.uuid = "$vcid"
ctkEnabled = "TRUE"
scsi0:0.ctkEnabled = "TRUE"
EOF
}

The Create_VMDK function is simple; it uses vmkfstools -c to get the job done.

Create_VMDK ()
{
read -p "Enter disk format. thin / zeroedthick / eagerzeroedthick: " format
read -p "Enter size: " size
vmkfstools -c "$size"G -d $format $VM_name.vmdk
}

The MAC address function keeps the MAC static by writing it into the VMX: it combines the constant VMware-defined prefix with a randomly generated number for the last two octets.

MAC_address ()
{
mac=$(awk -v min=1000 -v max=9000 'BEGIN{srand(); print int(min+rand()*(max-min+1))}' | sed -e 's/.\{2\}/&:/g;s/.$//')
final_mac=00:50:56:00:$mac
}

A similar approach is used for the BIOS UUID and VC UUID generation, where the trailing digits are constant and only the leading octets are randomized.

UUID_generate ()
{
uuid_postfix="1a c2 4e fe 1a 8c d2-db 90 02 81 ce d8 31 15"
vcid_postfix="1a c9 91 4b 4a b9 93-79 23 12 1f b2 c5 37 f8"
uuid_prefix=$(awk -v min=10 -v max=99 'BEGIN{srand(); print int(min+rand()*(max-min+1))}')
vcid_prefix=$(awk -v min=10 -v max=99 'BEGIN{srand(); print int(min+rand()*(max-min+1))}')
uuid="$uuid_prefix $uuid_postfix"
vcid="$vcid_prefix $vcid_postfix"
}
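Functions 5 and 6 (registration and power on) are not reproduced here; on ESXi they essentially come down to vim-cmd calls. A minimal sketch, not the exact code from the repo (the function names here are mine):

Register_VM ()
{
# Register the VMX with the host inventory; vim-cmd prints the new VM ID
vm_id=$(vim-cmd solo/registervm /vmfs/volumes/$datastore_name/$VM_name/$VM_name.vmx)
}

Power_On_VM ()
{
# Power on the freshly registered VM using the ID captured above
vim-cmd vmsvc/power.on $vm_id
}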

The complete source code can be accessed here:
https://github.com/happycow92/Lab-Deploy/blob/master/additional-vm-deploy.sh

A while loop is defined in the full script for users who want to deploy multiple VMs.
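A minimal sketch of what that loop looks like, assuming the function names above (including the hypothetical Register_VM / Power_On_VM helpers sketched earlier):

answer="y"
while [ "$answer" = "y" ]
do
MAC_address      # sets $final_mac referenced by the VMX template
UUID_generate    # sets $uuid and $vcid
Create_VM
Create_VMDK
Register_VM
Power_On_VM
read -p "Deploy another VM? (y/n): " answer
done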

Well, that's pretty much it.

Bash Script For Backup Details.

Earlier, I had written a script to count the number of backups for each client in VDP. This was a very basic script, and you can access it from the below link:
http://www.virtuallypeculiar.com/2017/06/bash-script-to-list-number-of-backups.html

I have added a couple more features to this script to make it more readable and insightful.
Along with counting the number of backups, the script now reports the size of the VM, the type of OS, whether it is a partial backup, and whether the backup resides on local VDP storage or a Data Domain. The previous script also did not account for agent-level backups; this script takes care of the agent-level backup count as well.

Currently, I am setting up replication, so the /REPLICATE domain can later be included to count the number of replicated restore points.

The Complete Script can be accessed from my repository below:
https://github.com/happycow92/shellscripts/blob/master/list-backup.sh
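The full logic lives in the repo; at its core it is built around two mccli calls, one to enumerate the protected clients and one to list the restore points per client, whose output the script then counts and annotates. A rough sketch with placeholder paths, not the exact commands from the script:

# List the protected vCenter clients registered with this VDP
mccli client show --recursive=true | grep -i VirtualMachines
# List the restore points for one client; the script counts and annotates these lines
mccli backup show --name=/<vcenter-fqdn>/VirtualMachines/<client-name>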

The output would be similar to the below (two separate outputs from two VDP servers):

I will update the change log here.
You can run this script in a production environment; it will not make any changes to the system. Make sure you give it execute permissions before you run it, of course!

Hope this helps.

Bash Script To Determine Backup Protocol

In vSphere Data Protection, you have a handful of backup protocols: SAN mode, HotAdd, NBD and NBD over SSL. HotAdd is always the recommended protocol, as data handling and transfer are much faster than with the rest. If your backups are running slow, the first thing to check is the backup protocol mode; then we move on to the VDP load and finally the VMFS / array performance.

If you have a few VMs, you can easily find the protocol type from the logs. However, if you have a ton of VMs and would like to determine the protocol for each, you can use this script that I have written.
https://github.com/happycow92/shellscripts/blob/master/backup-protocol-type.sh

#!/bin/bash
clear
IFS=$(echo -en "\n\b")
echo "This script should be executed on a proxy machine"
echo "Checking current Machine......"
directory="/usr/local/vdr"
if [ ! -d "$directory" ]
then
printf "Current machine is Proxy machine"
else
printf "Current machine is VDP Server"
fi
echo && echo
sleep 2s
echo -e "--------------------------------------------------------"
echo -e "| Client Name | Backup Type | Proxy Used |"
echo -e "--------------------------------------------------------"
cd /usr/local/avamarclient/var
backupLogList=$(ls -lh | grep -i "vmimagew.log\|vmimagel.log" | awk '{for (i=1; i<=8; i++) $i=""; print $0}' | sed 's/^ *//')
for i in $backupLogList
do
clientName=$(cat $i | grep -i "<11982>" | awk '{print $NF}' | cut -d '/' -f 1)
protocolType=$(cat $i | grep -i "<9675>" | awk '{print $7}' | head -n 1)
proxyName=$(cat $i | grep -i "<11979>" | cut -d ',' -f 2)
if [ "$protocolType" == "hotadd" ]
then
protocol="hotadd"
elif [ "$protocolType" == "nbdssl" ]
then
protocol="nbdssl"
elif [ "$protocolType" == "nbd" ]
then
protocol="nbd"
else
protocol="SAN Mode"
fi
# Print the normalized protocol name determined above
printf "| %-20s| %14s| %12s|\n" "$clientName" "$protocol" "$proxyName"
done
echo && echo
A few things:
> The script must always be executed on a proxy machine. If your VDP uses the internal proxy, run it on the VDP machine itself.
> If you are using one or more external proxies, you need to run this on each of the proxy machines.
> Note, this works on VDP 6.x and above.

I have added an IFS (Internal Field Separator) to handle spaces in backup job names; the rough version of the script had issues handling those.

It's a very lightweight script, takes seconds to execute and does not make any changes to your system.

Hope this helps.

Bash Script To Determine Retired Clients.

While VDP has a built-in feature to list unprotected VMs (that is, VMs not added to any VDP backup job), you might need a script to determine whether VMs have gone missing from a backup job.

The script has a simple algorithm:
> The first time it runs, it creates a file to gather the full protected client list.
> The next time it runs, it checks what is missing compared to that last protected client list.
> Newly added VMs will not be considered missing; however, on the next iteration of the script, it will check whether the new clients are missing.
> If you remove the first generated protected list after your second execution, the third iteration will be void, as it will generate a new protected client list.

The script has an email feature to send the output to a mailing address. If you want to exclude this, discard line-21 to line-32. If you want to run the script as a cron job, you can add it to crontab -e, but you cannot keep the interactive email address prompt in the script; you will have to define a constant with your email address and reference it in the heredoc.
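For example, after hard-coding the TO address, a crontab entry along these lines would run the check daily at 06:00 (the script path and schedule here are only examples):

# Added via crontab -e
0 6 * * * /root/missing-client.sh >> /var/log/missing-client.log 2>&1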

The script can be accessed from my repository here:
https://github.com/happycow92/shellscripts/blob/master/missing-client.sh

The code:

#!/bin/bash
IFS=$(echo -en "\n\b")
FILE=/tmp/protected_client.txt
if [ ! -f $FILE ]
then
client_list=$(mccli client show --recursive=true | grep -i /$(cat /usr/local/vdr/etc/vcenterinfo.cfg | grep vcenter-hostname | cut -d '=' -f 2)/VirtualMachines | awk -F/ '{print $(NF-2)}')
echo "$client_list" &> /tmp/protected_client.txt
sort /tmp/protected_client.txt -o /tmp/protected_client.txt
else
new_list=$(mccli client show --recursive=true | grep -i /$(cat /usr/local/vdr/etc/vcenterinfo.cfg | grep vcenter-hostname | cut -d '=' -f 2)/VirtualMachines | awk -F/ '{print $(NF-2)}')
echo "$new_list" &> /tmp/new_list.txt
sort /tmp/new_list.txt -o /tmp/new_list.txt
missing=$(comm -3 /tmp/protected_client.txt /tmp/new_list.txt | sed 's/^ *//g')
if [ -z "$missing" ]
then
printf "\nNo Client's missing\n"
else
printf "\nMissing Client is:\n" | tee -a /tmp/email_list.txt
printf "$missing\n\n" | tee -a /tmp/email_list.txt
printf "Emailing the list\n"
FILE=/tmp/email_list.txt
read -p "Enter Your Email: " TO
FROM=admin@$(hostname)
(cat - $FILE) << EOF | /usr/sbin/sendmail -f $FROM -t $TO
Subject: Missing VMs from Jobs
To: $TO
EOF
sleep 2s
printf "\nEmail Sent. Exiting Script\n\n"
fi
rm /tmp/new_list.txt
rm -f /tmp/email_list.txt
fi

Feel free to reply for any issues. Hope this helps!

Bash Script To Export VDP Backup Job Details

You can use this script to export your current backup and replication job configurations to a text file and save it to your local desktop. If you run into a redeployment situation and are unsure of the backup configuration, you can have a look at the exported text file.

The script exports the job name, the state of the job, the clients in the job, the schedule, the retention and the type.
It currently does not export agent-level backup jobs such as SQL, Exchange and SharePoint.

The script needs the MCS service to be up, as it relies on it. I am planning to export the details from psql instead, which would work even when MCS is down.

This is what I have for right now. The script can be accessed from the below link:
https://github.com/happycow92/shellscripts/blob/master/backup-job-detail.sh
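For reference, the script leans on the mccli group commands to enumerate the jobs and their settings, roughly along these lines (a sketch only; check the flags against your mccli reference before relying on them):

# List all backup jobs (groups) configured on the appliance
mccli group show --recursive=true
# Show the configuration of a single job
mccli group show --name=/<vcenter-fqdn>/VirtualMachines/<job-name>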

Suggestions and bugs are always welcome. Drop a comment for anything.

Hope this helps!

Bash Script To Extract vSphere Replication Job Information

Below is a bash script that extracts information about the VMs configured for replication. It displays the name of the virtual machine, yes or no for Quiesce Guest OS and Network Compression, the RPO tabulated in minutes (since "bc" is not available on the vR SUSE appliance to perform floating-point hour calculations), and finally the datastore MoRef ID.

The complete updated script can be accessed from my GitHub Repo:
https://github.com/happycow92/shellscripts/blob/master/vR-jobs.sh

As and when I add more or reformat the information the script in the link will be updated.

#!/bin/bash
clear
echo -e " -----------------------------------------------------------------------------------------------------------"
echo -e "| Virtual Machine | Network Compression | Quiesce | RPO | Datastore MoRef ID |"
echo -e " -----------------------------------------------------------------------------------------------------------"
cd /opt/vmware/vpostgres/9.3/bin
./psql -U vrmsdb << EOF
\o /tmp/info.txt
select name from groupentity;
select networkcompressionenabled from groupentity;
select rpo from groupentity;
select quiesceguestenabled from groupentity;
select configfilesdatastoremoid from virtualmachineentity;
EOF
cd /tmp
# Each awk below pulls the block of values printed under the corresponding column header in info.txt
name_array=($(awk '/name/{i=1;next}/ro*/{i=0}{if (i==1){i++;next}}i' info.txt))
compression_array=($(awk '/networkcompressionenabled/{i=1;next}/ro*/{i=0}{if (i==1){i++;next}}i' info.txt))
quiesce_array=($(awk '/quiesceguestenabled/{i=1;next}/ro*/{i=0}{if (i==1){i++;next}}i' info.txt))
rpo_array=($(awk '/rpo/{i=1;next}/ro*/{i=0}{if (i==1){i++;next}}i' info.txt))
datastore_array=($(awk '/configfilesdatastoremoid/{i=1;next}/ro/{i=0} {if (i==1){i++;next}}i' info.txt))
length=${#name_array[@]}
for ((i=0; i<$length; i++));
do
printf "| %-32s | %-23s | %-10s | %-10s| %-20s|\n" "${name_array[$i]}" "${compression_array[$i]}" "${quiesce_array[$i]}" "${rpo_array[$i]}" "${datastore_array[$i]}"
done
rm -f info.txt
echo && echo

For any questions, do let me know. Hope this helps. Thanks.


VDP Expired Certificate

There have been a lot of issues with VDP deployments lately due to an expired certificate issued for the OVA template.

Basically, if you are running vCenter 6.5, the web client is the only option to deploy the OVA files, and you cannot move past the step where it flags the certificate as expired. If you are using a pre-6.5 vCenter, you can deploy this through the Windows C# client; even though it says the certificate is "Invalid", you can still click Next and proceed further.

If you are on 6.5, then the workaround is this:
1. Download the required version of the VDP server. All of them have certificates that expired around September.
2. Use a 7-Zip utility to extract the OVA template. This will give you 4 files: the VMDK, OVF, MF and the CER (a command-line alternative is shown after this list).
3. In the web client, when you deploy the OVA, you can multi-select files. Select the 3 files (vmdk, ovf and mf), excluding the .cer file.
4. The deployment then displays No Certificate and lets you proceed further.
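Since an OVA is simply a tar archive, the extraction in step 2 can also be done from a command line instead of 7-Zip (the file name below is only an example):

# Unpacks the OVF, VMDK, MF and certificate files into the current directory
tar -xvf vSphereDataProtection-6.1.x.ova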

This certificate is signed just for the OVA template and not for any particular port / service on the VDP itself.

EMC is currently working to update the certificate information for these templates. Hope this helps!

Unable To Protect a VM In SRM: "Object not found"

So there's a rare instance where you will be unable to protect a VM, and the error it throws is:
Internal error: class Vmacore::NotFoundException "Object not found"

Under Protection Groups > Related Objects > Virtual Machines, you will see the VM coming up as Not Configured.


And when you try to right-click this and select Configure Protection, you will notice that the Device Status comes up as Non-replicated.



And if you browse the recovery location and provide the path of the replicated VMDK, you will run into this error.

In the web client logs, you will see:

[2017-11-28T09:27:50.156-06:00] [ERROR] srm-client-thread-1253 70015389 101315 201173 com.vmware.srm.client.infraservice.tasks.FakeTaskImpl [DrVmodlFakeTask:srm-fake-task-11:fake-server-guid]: com.vmware.vim.binding.dr.fault.DrRuntimeFault: Task Failed
at com.vmware.srm.client.infraservice.util.ExceptionUtil.newRuntimeFault(ExceptionUtil.java:92)
at com.vmware.srm.client.infraservice.util.ExceptionUtil.newRuntimeFault(ExceptionUtil.java:68)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl.getSingleError(MultiTaskProgressUpdaterImpl.java:89)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl.updateProgress(MultiTaskProgressUpdaterImpl.java:222)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl$3.run(MultiTaskProgressUpdaterImpl.java:431)
at $java.lang.Runnable$$FastClassByCGLIB$$36fc6471.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
at com.vmware.srm.client.topology.impl.osgi.aop.HttpRequestContextAdvice$CallInterceptor.intercept(HttpRequestContextAdvice.java:53)
at com.vmware.srm.client.topology.impl.osgi.aop.HttpRequestContextAdvice$Base$$EnhancerByCGLIB$$b6ab80b4.run(<generated>)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl$4.run(MultiTaskProgressUpdaterImpl.java:442)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.vmware.vim.binding.dr.fault.InternalError: Internal error: class Vmacore::NotFoundException "Object not found"
[context]zKq7AVMEAQAAAHjHWwAUdm13YXJlLWRyAACoLwpkci1yZXBsaWNhdGlvbi5kbGwAAGEbCgASaT8AAy5BAOv/QACT9EABuSMCY29ubmVjdGlvbi1iYXNlLmRsbAABx3QCAccrAgGg8AABPUMBAccrAgGSLgMBdwgDARb3AgHHKwIBuSMCAXcIAwEW9wIBxysC[/context].
at sun.reflect.GeneratedConstructorAccessor614.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)



One reason for this is that the source VM's VMX file has some corrupt or incorrect entries.
So let's have a look at the VM's VMX file.

I will be looking for lines in this file that have a datastore path reference, like:

vmx.log.filename = "/vmfs/volumes/58780b1d-045e1100-0efa-0025b5e01a45/Test-1/vmware.log"
sched.swap.derivedName = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/Test-1-932448b9.vswp"

I have two UUIDs here, 58780b1d-045e1100-0efa-0025b5e01a45 and 59a30e4d-647fd9f2-2e66-000c295e9f61

But, when I run:

[root@Wendy:/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1] esxcfg-scsidevs -m
mpx.vmhba1:C0:T0:L0:3                                            /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0:3 599ffcb3-d9ece508-7576-000c295e9f61  0  Wendy-Local
mpx.vmhba1:C0:T1:L0:1                                            /vmfs/devices/disks/mpx.vmhba1:C0:T1:L0:1 59a30e4d-647fd9f2-2e66-000c295e9f61  0  VDP-Storage

I only have these two UUIDs on the host, and the 58780b1d reference in the VMX file does not match either of them. This stale reference is what causes the device status to show as non-replicated, which in turn causes issues with VM protection.
You might have one or more such entries in the VMX file.
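A quick way to spot all such entries at once is to grep the VMX for datastore paths (the path below is taken from this example):

# List every datastore path reference in the VM's configuration file
grep -n "/vmfs/volumes/" /vmfs/volumes/VDP-Storage/Test-1/Test-1.vmx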

Power off the virtual machine on the source, back up the VMX file, and then edit it to use the UUID of the datastore where the VM resides (or the appropriate UUID where the respective files should reside). In my case the Test-1 VM runs on VDP-Storage, which is 59a30e4d-647fd9f2-2e66-000c295e9f61.

So the new VMX entries look like this:

vmx.log.filename = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/vmware.log"
sched.swap.derivedName = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/Test-1-932448b9.vswp"

Reload the VMX using:

# vim-cmd vmsvc/reload <vm-id>

The vm-id can be obtained from

# vim-cmd vmsvc/getallvms

Then power on the VM, right-click it in the protection group and configure protection again; this time the hard disk status will be displayed as Replicated.


And that's pretty much it. This is usually seen when the vmware.log file is configured to reside on a different datastore and that particular datastore is no longer available.

Hope this helps.

SRM Service Crashes After A Failed Recovery With "abrRecoveryEngine" Backtrace

In some instances, when you are running Array Based Replication with SRM, a failed planned migration might cause the SRM service to crash. In the vmware-dr.log found on the SRM machine, you will notice the following backtrace:

2017-12-06T09:55:38.620-05:00 panic vmware-dr[06076] [Originator@6876 sub=Default] 
--> 
--> Panic: Assert Failed: "ok (Dr::Providers::Abr::AbrRecoveryEngine::AbrRecoveryEngineImpl::LoadFromDb: Unable to insert post failover info object 212337205 for group vm-protection-group-121101624 array pair array-pair-7065)" @ d:/build/ob/bora-6014840/srm/src/providers/abr/common/abrRecoveryEngine/abrRecoveryEngine.cpp:244
--> Backtrace:
--> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 6.5.1, build: build-6014840, tag: vmware-dr, cpu: x86_64, os: windows, buildType: release
--> backtrace[00] vmacore.dll[0x001F29FA]
--> backtrace[01] vmacore.dll[0x00067D60]
--> backtrace[02] vmacore.dll[0x0006A20E]
--> backtrace[03] vmacore.dll[0x002245A7]
--> backtrace[04] vmacore.dll[0x00224771]
--> backtrace[05] vmacore.dll[0x00059C0D]
--> backtrace[06] dr-abr-recoveryEngine.dll[0x00028A91]
--> backtrace[07] dr-abr-recoveryEngine.dll[0x00015199]
--> backtrace[08] dr-abr-recoveryEngine.dll[0x002DB368]
--> backtrace[09] dr-abr-recoveryEngine.dll[0x002DB913]
--> backtrace[10] vmacore.dll[0x001D6ACC]
--> backtrace[11] vmacore.dll[0x001865AB]
--> backtrace[12] vmacore.dll[0x0018759C]
--> backtrace[13] vmacore.dll[0x002202E9]
--> backtrace[14] MSVCR120.dll[0x00024F7F]
--> backtrace[15] MSVCR120.dll[0x00025126]
--> backtrace[16] KERNEL32.DLL[0x000013D2]
--> backtrace[17] ntdll.dll[0x000154E4]
--> [backtrace end]

This is seen when there are issues unmounting or demoting the source datastore.

Disclaimer: Modifying database tables is something normally done by VMware Support. Do this at your own risk.

The fix is:

1. Make sure the SRM service is stopped on both sites
2. Back up the SRM databases on both sites
3. Log in to the database using either pgAdmin or SQL Management Studio, depending on the type of database used
4. Open the table "pda_grouppostfailoverinfo"
5. Here we need to remove the entry for the db_id shown in the backtrace. In my case it is 212337205 (see the example delete after this list)
6. Once this is done, start the SRM service. If it crashes again, it usually logs another object ID; repeat the process for that ID
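For the record, the removal in step 5 is a simple delete; something along these lines (the column name db_id is taken from the reference above, so verify it against your actual table schema, and always work with a backup in place):

DELETE FROM pda_grouppostfailoverinfo WHERE db_id = 212337205;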

And that should be it.

Unable To Connect VDP To Web Client

In a couple of cases, either on a fresh VDP deployment or an existing one, the connection from VDP to the web client via the plugin might fail.

It would generically say:

Unable to connect to the requested VDP Appliance. Would you like to be directed to the VDP Configuration utility to troubleshoot the issue? 



The vdp-configure page does not tell much about any of these errors, and all the services seem to be running fine. You can also try restarting the tomcat service on VDP using emwebapp.sh --restart; however, that might not help.

In case of the above error, the first logs you need to look at are the web client logs on the vCenter. In this case, the following was logged:

[2018-01-09T19:42:06.634Z] [INFO ] http-bio-9443-exec-27         com.emc.vdp2.api.impl.BaseApi                                     Connecting to VDP at: [https://x.x.x.x:8543/vdr-server/auth/login]
[2018-01-09T19:42:06.646Z] [INFO ] http-bio-9443-exec-27         com.emc.vdp2.api.impl.BaseApi                                     Setting the session ID to: null
[2018-01-09T19:42:06.656Z] [WARN ] http-bio-9443-exec-27         org.springframework.flex.core.DefaultExceptionLogger              The following exception occurred during request processing by the BlazeDS MessageBroker and will be serialized back to the client:  flex.messaging.MessageException: java.lang.NullPointerException : null

[2018-01-09T19:42:06.889Z] [WARN ] http-bio-9443-exec-26 org.springframework.flex.core.DefaultExceptionLogger The following exception occurred during request processing by the BlazeDS MessageBroker and will be serialized back to the client: flex.messaging.MessageException: org.eclipse.gemini.blueprint.service.ServiceUnavailableException : service matching filter
=[(objectClass=com.emc.vdp2.api.ActionApiIf)] unavailable
at flex.messaging.services.remoting.adapters.JavaAdapter.invoke(JavaAdapter.java:444)
at com.vmware.vise.messaging.remoting.JavaAdapterEx.invoke(JavaAdapterEx.java:50)
at flex.messaging.services.RemotingService.serviceMessage(RemotingService.java:183)
at flex.messaging.MessageBroker.routeMessageToService(MessageBroker.java:1400)
at flex.messaging.endpoints.AbstractEndpoint.serviceMessage(AbstractEndpoint.java:1011)
at flex.messaging.endpoints.AbstractEndpoint$$FastClassByCGLIB$$1a3ef066.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.flex.core.MessageInterceptionAdvice.invoke(MessageInterceptionAdvice.java:66)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.adapter.ThrowsAdviceInterceptor.invoke(ThrowsAdviceInterceptor.java:124)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.Cglib2AopProxy$FixedChainStaticTargetInterceptor.intercept(Cglib2AopProxy.java:573)
at flex.messaging.endpoints.AMFEndpoint$$EnhancerByCGLIB$$72c7df65.serviceMessage(<generated>)
at flex.messaging.endpoints.amf.MessageBrokerFilter.invoke(MessageBrokerFilter.java:103)
at flex.messaging.endpoints.amf.LegacyFilter.invoke(LegacyFilter.java:158)
at flex.messaging.endpoints.amf.SessionFilter.invoke(SessionFilter.java:44)
at flex.messaging.endpoints.amf.BatchProcessFilter.invoke(BatchProcessFilter.java:67)
at flex.messaging.endpoints.amf.SerializationFilter.invoke(SerializationFilter.java:166)
at flex.messaging.endpoints.BaseHTTPEndpoint.service(BaseHTTPEndpoint.java:291)
at flex.messaging.endpoints.AMFEndpoint$$EnhancerByCGLIB$$72c7df65.service(<generated>)
at org.springframework.flex.servlet.MessageBrokerHandlerAdapter.handle(MessageBrokerHandlerAdapter.java:109)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)

When we see this backtrace referencing com.emc.vdp2.api.ActionApi, the connection call is unavailable when a login request is sent by VDP, and because of this the connection fails.

To resolve this, clear the SerenityDB on the vCenter Server. Follow this Knowledge Base article for the procedure.

Post this, log back in to the web client and then connect to the VDP appliance; it should now go through successfully.

Hope this helps.

ISO Undetected During VDP 6.1.6 Upgrade

With the release of VDP 6.1.6 come more issues with upgrades, especially the main one: the ISO is not detected in the vdp-configure page. You might see something like:

To upgrade your VDP appliance, please connect a valid upgrade ISO image to the appliance. 


On the command line, you will see the ISO is already mounted:

root@Jimbo:~/#: df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        32G  5.9G   25G  20% /
udev            1.9G  148K  1.9G   1% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
/dev/sda1       128M   37M   85M  31% /boot
/dev/sda7       1.5G  158M  1.3G  12% /var
/dev/sda9       138G  5.1G  126G   4% /space
/dev/sdd1       256G  208M  256G   1% /data02
/dev/sdc1       256G  208M  256G   1% /data03
/dev/sdb1       256G  4.3G  252G   2% /data01
/dev/sr0        4.1G  4.1G     0 100% /mnt/auto/cdrom

If it is not, use the below command to mount:
# mount /dev/sr0 /mnt/auto/cdrom

If it still does not help, then proceed further.

For all upgrade issues, we look at avinstaller.log located at /usr/local/avamar/var/avi/server_log/.
However, I noticed here that the last update time on the log was Nov 22:

-rw-r--r-- 1 root  root  1686327 Nov 22 21:12 avinstaller.log.0

This tells us the avinstaller process is not updating the log. A restart of avinstaller using the below command will not help either:
# avinstaller.pl --restart

The process is running fine though:

root@Jimbo:/usr/local/avamar/var/avi/server_log/#: avinstaller.pl --test
Avistart process: 31230

INFO: AVI is running.

For this, we will have to rename the web server folder "jetty-0.0.0.0-7543-avi.war-_avi-any-" located under:
# cd /usr/local/avamar/lib/jetty/work

To rename use:
# mv jetty-0.0.0.0-7543-avi.war-_avi-any- jetty-0.0.0.0-7543-avi.war-_avi-any-old

Then restart the avinstaller script using avinstaller.pl --restart

Now the avinstaller.log is updated:
-rw-r--r-- 1 root  root  1750549 Jan 10 19:38 avinstaller.log.0

And the ISO is detected in the vdp-configure page.


If this still does not work, then we will have to dig into the avinstaller.log, which can point to a ton of causes. Best to reach out to VMware.

Hope this helps.

vSphere Replication Sync Fails With Exception: com.vmware.hms.replication.sync.DeltaAbortedException

There are a few instances when a vSphere Replication sync (RPO-based or manual) fails with a DeltaAbortedException. This in turn will also affect a test / planned migration performed with Site Recovery Manager.

In the hms.log located under /opt/vmware/hms/logs on the vSphere Replication Server, you will notice something like:

2018-01-10 14:35:59.950 ERROR com.vmware.hms.replication.sync.ReplicationSyncManager [hms-sync-progress-thread-0] (..replication.sync.ReplicationSyncManager) operationID=fd66efca-f070-429c-bc89-f2164e9dbb7a-HMS-23613 | Completing sync operation because of error: {OnlineSyncOperation, OpId=fd66efca-f070-429c-bc89-f2164e9dbb7a-HMS-23613, GroupMoId=GID-2319507d-e668-4eea-aea9-4d7d241dd886, ExpInstSeqNr=48694, TaskMoId=HTID-56fd57dd-408b-4861-a124-70d8c53a1194, InstanceId=2f900595-2822-4f2b-987d-4361f7035
05c, OpState=started, VcVmMoid=vm-28686, createInstanceRetryCount=2, fullSyncOngoing=false, operationId=null}
com.vmware.hms.replication.sync.DeltaAbortedException
        at com.vmware.hms.replication.sync.SyncOperation.checkHealth(SyncOperation.java:911)
        at com.vmware.hms.replication.sync.SyncOperation$4.run(SyncOperation.java:735)
        at com.vmware.hms.util.executor.LoggerOpIdConfigurator$RunnableWithDiagnosticContext.run(LoggerOpIdConfigurator.java:133)
        at com.vmware.hms.util.executor.LoggerOpIdConfigurator$2.run(LoggerOpIdConfigurator.java:100)
        at com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2.run(TlsPreservingWrapper.java:47)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

This occurs when the outgoingeventlogentity and incomingeventlogentity tables in the vR database have a large number of entries.

The following fix should be applied at your own risk. Take a snapshot and/or a backup of the vR server before performing the change.

1. Navigate to the VRMS database's bin directory:
# cd /opt/vmware/vpostgres/9.x/bin
The postgres version varies depending on the replication server release.

2. Backup the replication database using the below command:
# ./pg_dump -U vrmsdb -Fp -c > /tmp/DBBackup.bak

3. Connect to the vR database using:
# ./psql -U vrmsdb

4. Run the below queries to extract the number of events for the logentity tables:

select count(*) from outgoingeventlogentity; 
select count(*) from incomingeventlogentity;

In my case, the output on the production site vR was:

vrmsdb=# select count(*) from incomingeventlogentity;
 count
-------
 21099
(1 row)

vrmsdb=# select count(*) from outgoingeventlogentity;
 count
-------
   146
(1 row)

And on the recovery site, the outgoingeventlogentity had 21k+ events.

5. First, you can change the max event age limit to 10800 in the hms-configuration.xml file, located at:
/opt/vmware/hms/conf/hms-configuration.xml

This should be the output after the edit:
<hms-eventlog-maxage>10800</hms-eventlog-maxage>
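If you prefer to make that edit non-interactively, a sed one-liner works too (just a sketch; back up the file first):

sed -i 's|<hms-eventlog-maxage>.*</hms-eventlog-maxage>|<hms-eventlog-maxage>10800</hms-eventlog-maxage>|' /opt/vmware/hms/conf/hms-configuration.xml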

6. Next, we will have to purge the event logs from the above mentioned tables. There are a lot of fields in these tables if you run select * from <table-name>;
the one column we need is the "timestamp" column.

The timestamp column holds a value like 1515479242006.
To convert this to a human-readable date, you will have to:

> Remove the last 3 digits from the above value.
So 1515479242006 becomes 1515479242. Then convert this epoch time to a normal date using the link here.
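Alternatively, the appliance's own date command can do the conversion for you:

# 1515479242 is the value after dropping the last three digits
date -d @1515479242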

Now pick a timestamp such that anything older than it will be purged from the database. During the purge, the timestamp has to be the complete value as it appears in the timestamp column (including the trailing digits). Then run the below queries:

DELETE from incomingeventlogentity WHERE timestamp < 1515479242006;
DELETE from outgoingeventlogentity WHERE timestamp < 1515479242006;

7. Then restart the hms service using:
# systemctl stop hms
# systemctl start hms

The above is applicable from 6.1.2 vR onward. For lower versions:
# service hms restart

8. Re-pair the sites and then perform a sync now operation and we should be good to go. 

Hope this helps!

VDP 6.1.6 Does Not Connect To Web Client After An Upgrade

Recently, after I upgraded my VDP to 6.1.6, there were issues connecting the appliance to the web client. The screen stayed grayed out forever, and the vdr-server.log did not have any information about the cause.

When we ran the below command, we saw that there were 5 vCenter connections down:
# mccli server show-services

Name                               Status
---------------------------------- ---------------------------
/cartman.southpark.local           5 vCenter connection(s) down.

The MCS restart failed with the following in the mcserver.out log file, located under:
/usr/local/avamar/var/mc/server_log/mcserver.out

Caught Exception :  Exception : org.apache.axis.AxisFault Message : ; nested exception is:
        javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: java.security.cert.CertPathBuilderException: Could not build a validated path. StackTrace :
AxisFault
 faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
 faultSubcode:
 faultString: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: java.security.cert.CertPathBuilderException: Could not build a validated path.
 faultActor:
 faultNode:
 faultDetail:
        {http://xml.apache.org/axis/}stackTrace:javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: java.security.cert.CertPathBuilderException: Could not build a validated path.


Caused by: sun.security.validator.ValidatorException: PKIX path building failed: java.security.cert.CertPathBuilderException: Could not build a validated path.
        at sun.security.validator.PKIXValidator.doBuild(Unknown Source)
        at sun.security.validator.PKIXValidator.engineValidate(Unknown Source)
        at sun.security.validator.Validator.validate(Unknown Source)
        at sun.security.ssl.X509TrustManagerImpl.validate(Unknown Source)
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(Unknown Source)
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(Unknown Source)
        ... 30 more
Caused by: java.security.cert.CertPathBuilderException: Could not build a validated path.
        at com.rsa.cryptoj.o.qb.engineBuild(Unknown Source)

This is because ignore_vc_cert is set to false:
# grep ignore_vc /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml
<entry key="ignore_vc_cert" value="false" />

To fix this, either edit the mcserver.xml file manually and change the value from false to true, or run the below command (make sure a backup of mcserver.xml is taken first):
# sed -i -e 's/entry key="ignore_vc_cert" value="false"/entry key="ignore_vc_cert" value="true"/g'  /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml

Restart the MCS (From admin mode) using:
# mcserver.sh --restart

That should fix the connection issue. Hope this helps!


vSphere Replication 6.5.1 With vRealize Orchestrator 7.3

Here we will be looking into how to configure and use vSphere Replication with vRealize Orchestrator. The versions in my setup are:

vCenter Appliance 6.5 U1
vSphere Replication 6.5.1
vRealize Orchestrator 7.3

In brief, deploy the vRealize Orchestrator OVA template. Then navigate to "https://<vro-fqdn>:8283/vco-controlcenter/" to begin the configuration.

I have a standalone Orchestrator deployment with vSphere Authentication mode.


An SSO user name and password are required to complete the registration. A restart of vRO is needed to complete the configuration.


Next, download the vSphere Replication vmoapp file from this link here.

To install this file, click on the Manage Plugins tab in the Orchestrator control center and browse for the downloaded vmoapp file.


Then accept the EULA to Install the Plugin.


If prompted, click Save Changes and this should show the vR plugin is available and enabled in the plugin page.


Next, register the vCenter site for the replication server using the "Register VC Site" workflow below. All the subsequent tasks are done from the Orchestrator client.


Once done, you can verify the vSphere Replication site is now visible under Administer mode of vRO.


Next, we will configure replication for one virtual machine. With the Run mode execute the "Configure Replication" workflow.

The Site (source) will be selected first.


Selecting virtual machine will be the next task.


Target site vR selection will be next. I am replicating within the same vCenter, so the source and target vR site is the same machine.


Next, we will select the target datastore where the replicated files should reside.


Lastly, we will choose the RPO and other required parameters to complete the replication task and click Submit.


Finally, you can see the VM under Outgoing Replication tab for vCenter.


That's pretty much it!

Resetting Site Recovery Manager's Embedded DB Password

This article applies only to the embedded PostgreSQL database, which is the embedded DB option available during install. If you have forgotten the database password, you will not be able to log in to the DB to alter tables or perform a repair / modify on the SRM instance.

Before resetting the password, make sure the SRM machine is on a snapshot and a backup is available for the DB.

1. First, we will need to edit the pg_hba.conf file to mark local users as trusted so that password-less authentication is performed. The pg_hba.conf file is located under:
C:\ProgramData\VMware\VMware vCenter Site Recovery Manager Embedded Database\data\pg_hba.conf
Make a backup of the file before editing it.

Locate this section in the conf file:

# TYPE DATABASE USER ADDRESS METHOD
# IPv4 local connections:
host all all 127.0.0.1/32 md5
# IPv6 local connections:
host all all ::1/128 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
#host replication postgres 127.0.0.1/32 md5
#host replication postgres ::1/128 md5

Replace that complete set with this:

# TYPE DATABASE USER ADDRESS METHOD
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#host replication postgres 127.0.0.1/32 md5
#host replication postgres ::1/128 md5

Save the file.

2. Restart the SRM Embedded database service from services.msc

3. Open a command prompt in admin mode; now we will log in to the database. Navigate to the below bin directory:
C:\Program Files\VMware\VMware vCenter Site Recovery Manager Embedded Database\bin

4. Connect to the postgres database using:
psql -U postgres -p 5678

The port might vary if you performed a custom install of the database.

5. Run the below query to change the password:
ALTER USER "enter srm db user here" PASSWORD 'new_password';

The SRM DB user / port information can be found in the 64-bit ODBC connection.
A successful execution will return the output: "ALTER ROLE"

6. Revert the changes performed on the pg_hba.conf file so that md5 authentication is again required for users to log in to the SRM database.

7. Restart the SRM Embedded DB service again

Post this, the SRM service will fail to start, and you will notice the following backtrace in vmware-dr.log:

2018-01-17T02:25:15.628Z [01748 error 'WindowsService'] Application error:
--> std::exception 'class Vmacore::Exception'"DBManager error: Could not initialize Vdb connection: ODBC error: (08001) - FATAL:  password authentication failed for user "srmadmin"
--> "
--> 
--> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 6.1.1, build: build-3884620, tag: -
--> backtrace[00] vmacore.dll[0x001C568A]
--> backtrace[01] vmacore.dll[0x0005CA8F]
--> backtrace[02] vmacore.dll[0x0005DBDE]
--> backtrace[03] vmacore.dll[0x0001362B]
--> backtrace[04] vmware-dr.exe[0x0015C59A]
--> backtrace[05] MSVCR90.dll[0x00074830]
--> backtrace[06] MSVCR90.dll[0x00043B3C]
--> backtrace[07] ntdll.dll[0x0009CED3]
--> backtrace[08] vmware-dr.exe[0x000060AF]
--> backtrace[09] vmware-dr.exe[0x00006A5E]
--> backtrace[10] windowsService.dll[0x00002BCE]
--> backtrace[11] windowsService.dll[0x000020DD]
--> backtrace[12] sechost.dll[0x000081D5]
--> backtrace[13] KERNEL32.DLL[0x0000168D]
--> backtrace[14] ntdll.dll[0x00074629]
--> [backtrace end]

Run a Modify on the SRM instance from Add/Remove Programs and provide the new database password during the process, and the service will start up just fine.

Hope this helps.

VDP Upgrade ISO Not Detected

There are a ton of reasons why the VDP ISO might not be detected in the vdp-configure page (a few of those are covered on my blog here). This is another one of the possible causes.

> When we run df -h, the ISO shows as mounted.
> The avinstaller.log.0 under /usr/local/avamar/var/avi/server_log/ shows that the package extraction and package checksum verification completed successfully.

The avinstaller.pl process is responsible for all the upgrade related tasks, and a restart of this service should help put a few things back in place. However, when I restarted this service, I ended up with the below backtrace:

2017-07-19 10:29:32.441 1776 [main] WARN  o.e.jetty.webapp.WebAppContext - Failed startup of context o.e.j.w.WebAppContext@5d11346a{/avi,file:/usr/local/avamar/lib/jetty/work/jetty-0.0.0.0-7543-avi.war-_avi-any-/webapp/,STARTING}{/usr/local/avamar/lib/jetty/avi.war}
java.lang.IllegalStateException: java.lang.IllegalStateException: Failed to locate resource[avinstaller.properties] in ClassLoader
        at com.avamar.avinstaller.util.LifecycleListener.contextInitialized(LifecycleListener.java:64) ~[na:na]
        at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:788) ~[jetty-server-9.0.6.v20130930.jar:9.0.6.v20130930]
        at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:434) ~[jetty-servlet-9.0.6.v20130930.jar:9.0.6.v20130930]
========================
Cutting down log snippet
========================
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:69) [jetty-util-9.0.6.v20130930.jar:9.0.6.v20130930]
        at com.avamar.avinstaller.AVIMain.start(AVIMain.java:313) [avistart.jar:na]
        at com.avamar.avinstaller.AVIMain.main(AVIMain.java:345) [avistart.jar:na]
Caused by: java.lang.IllegalStateException: Failed to locate resource[avinstaller.properties] in ClassLoader
        at com.avamar.avinstaller.util.AppConfig.getInstance(AppConfig.java:52) ~[na:na]
        at com.avamar.avinstaller.util.LifecycleListener.contextInitialized(LifecycleListener.java:59) ~[na:na]

So there was a missing repo symlink in the webapp directory. You can verify this by navigating to:
# cd /usr/local/avamar/lib/jetty/work/jetty-0.0.0.0-7543-avi.war-_avi-any-/webapp

If you run ls -lh, you should see this symbolic link:
lrwxrwxrwx 1 root root    19 Jan 11 20:25 datalink -> /data01/avamar/repo

If this is missing, recreate it by using:
# ln -s /data01/avamar/repo datalink

Next, navigate to the below location:
# cd /usr/local/avamar/lib/jetty/work/jetty-0.0.0.0-7543-avi.war-_avi-any-/webapp/WEB-INF/classes/

Here you should have 13 files; in my case the avinstaller.properties file was missing, so I had to recreate it. If you have a second VDP appliance, copy this file from that server. Otherwise, create a new file named avinstaller.properties using the vi editor and paste in the below content:

messageLogDirectory=/tmp/av

# db settings
dbserver=localhost
dbport=5555
dbname=avidb
dbuser=admin
dbpass=
dbmax=10


# db settings for EM
dbserverEM=localhost
dbportEM=5556
dbnameEM=emdb
dbuserEM=admin
dbpassEM=

# local repository
repo_location=/tmp/repo
repo_packages=/tmp/repo/packages
repo_downloads=/tmp/repo/downloads
repo_temp=/tmp/repo/temp

#client package
plugin_location=/usr/local/avamar/lib/plugins
waiting_timer_interval_in_min=60

The avinstaller.pl was still failing with this backtrace:

java.lang.IllegalStateException: org.jbpm.api.JbpmException: resource jbpm.cfg.xml does not exist
        at com.avamar.avinstaller.util.LifecycleListener.contextInitialized(LifecycleListener.java:64) ~[na:na]
        at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:788) ~[jetty-server-9.0.6.v20130930.jar:9.0.6.v20130930]
        at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:434) ~[jetty-servlet-9.0.6.v20130930.jar:9.0.6.v20130930]
                ========================
Cutting down log snippet
========================
        at com.avamar.avinstaller.AVIMain.start(AVIMain.java:313) [avistart.jar:na]
        at com.avamar.avinstaller.AVIMain.main(AVIMain.java:345) [avistart.jar:na]
Caused by: org.jbpm.api.JbpmException: resource jbpm.cfg.xml does not exist
        at org.jbpm.pvm.internal.stream.ResourceStreamInput.openStream(ResourceStreamInput.java:60) ~[na:na]
        at org.jbpm.pvm.internal.xml.Parse.getInputSource(Parse.java:146) ~[na:na]
        at org.jbpm.pvm.internal.xml.Parser.buildDocument(Parser.java:453) ~[na:na]
                ========================
Cutting down log snippet
========================
        at com.avamar.avinstaller.process.PVM.getInstance(PVM.java:65) ~[na:na]
        at com.avamar.avinstaller.util.LifecycleListener.contextInitialized(LifecycleListener.java:60) ~[na:na]
        ... 20 common frames omitted

I was also missing the jbpm.cfg.xml file. Recreate this file and enter the below contents:

<?xml version="1.0" encoding="UTF-8"?>

<jbpm-configuration>

  <import resource="jbpm.default.cfg.xml" />
  <import resource="jbpm.businesscalendar.cfg.xml" />
  <import resource="jbpm.tx.hibernate.cfg.xml" />
  <import resource="jbpm.jpdl.cfg.xml" />
  <import resource="jbpm.identity.cfg.xml" />

  <!-- Job executor is excluded for running the example test cases. -->
  <!-- To enable timers and messages in production use, this should be included. -->
  <!--
  <import resource="jbpm.jobexecutor.cfg.xml" />
  -->

  <import resource="jbpm.mail.templates.examples.xml" />

</jbpm-configuration>

Restart the avinstaller again using avinstaller.pl --restart; this time everything went fine and the ISO was detected in the configure page.

Hope this helps!

MCS Start Up Fails In VDP With: Permission Denied At MCServer.pm

In a few rare cases, the management services might fail to start up with a permission denied message.
If you run dpnctl start mcs, you will notice the almost generic failure:

root@vdp:/#: dpnctl start mcs
Identity added: /home/dpn/.ssh/dpnid (/home/dpn/.ssh/dpnid)
dpnctl: INFO: Starting MCS...
dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-mcs-start-output-21990
dpnctl: ERROR: error return from "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/mcserver.sh --start" - exit status 13
dpnctl: INFO: [see log file "/usr/local/avamar/var/log/dpnctl.log"]

If you start MCS from admin mode using mcserver.sh --start --verbose you will notice:

admin@vdp:~/>: mcserver.sh --start --verbose
Permission deniedPermission denied at /usr/lib/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi/XML/LibXML.pm line 587.
Could not create file parser context for file "/usr/local/avamar/var/mc/server_data/prefs/mcserver.xml": Permission denied at /usr/local/avamar/lib/MCServer.pm line 128

The same is seen if you run mcserver.sh --test.

In the dpnctl.log you will notice the same:

2018/01/17-15:13:55 dpnctl: INFO: - - - - - - - - - - - - - - - BEGIN
2018/01/17-15:13:55 Permission deniedPermission denied at /usr/lib/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi/XML/LibXML.pm line 587.
2018/01/17-15:13:55 Could not create file parser context for file "/usr/local/avamar/var/mc/server_data/prefs/mcserver.xml": Permission denied at /usr/local/avamar/lib/MCServer.pm line 128
2018/01/17-15:13:55 dpnctl: INFO: - - - - - - - - - - - - - - - END
2018/01/17-15:13:55 /bin/cat /tmp/dpnctl-get-mcs-status-status-21389 2>&1
2018/01/17-15:13:55 [ "/bin/cat /tmp/dpnctl-get-mcs-status-status-21389 2>&1" exit status = 0 ]
2018/01/17-15:13:55 dpnctl: INFO: "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/mcserver.sh --test" exit status = 13
2018/01/17-15:13:55 rm -f /tmp/dpnctl-get-mcs-status-status-21389 /tmp/dpnctl-get-mcs-status-output-21389
2018/01/17-15:13:55 dpnctl: INFO: "rm -f /tmp/dpnctl-get-mcs-status-status-21389 /tmp/dpnctl-get-mcs-status-output-21389" - exit status 0
2018/01/17-15:13:55 dpnctl: INFO: MCS status - - - - - - - - - - END
2018/01/17-15:13:55 dpnctl: INFO: MCS status: down.

This is because of incorrect permissions and ownership on the mcserver.xml file. I do not know why the permissions and ownership were flipped.

The ideal permission and ownership on this file should be:
-rwxrwx--- 1 admin admin 50320 Jan 13 01:48 /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml

To change the ownership (be logged in as root):
# chown admin:admin mcserver.xml

To change the permissions, run the below:
# chmod u+rwx,g+rwx,o-rwx mcserver.xml

Then you should be able to proceed further with the restart of the MCS service.

Hope this helps!

MCS Service Crashes Due To Locked User Account

This issue is only seen on an upgraded 6.1.5 instance, where the MCS service constantly crashes due to a locked user account. In some cases MCS might be running, but none of the mccli commands work, and in a few cases the backup scheduler service will not start.

If you try to start MCS from the admin mode using mcserver.sh --start --verbose, it starts successfully but crashes immediately.

If you run the below command:
# grep locked /usr/local/avamar/var/mc/server_log/mcserver.log.0

You will notice the account being locked:

WARNING: The user MCUser@/ is locked. Product VDP
WARNING: The user MCUser@/ is locked. Product VDP
WARNING: The user MCUser@/ is locked. Product MCCLI
WARNING: The user MCUser@/ is locked. Product VDP
WARNING: The user MCUser@/ is locked. Product VDP

When you start the backup scheduler, you might see it fail with:

root@Jimbo:/usr/local/avamar-tomcat/lib/#: dpnctl start sched
Identity added: /home/dpn/.ssh/dpnid (/home/dpn/.ssh/dpnid)
dpnctl: INFO: Resuming backup scheduler...
dpnctl: ERROR: error return from "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/mccli mcs resume-scheduler" - exit status 1
dpnctl: INFO: [see log file "/usr/local/avamar/var/log/dpnctl.log"]

In the dpnctl.log you will notice the following:

2018/01/19-06:13:55 output of "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/mccli mcs resume-scheduler":
2018/01/19-06:13:55 - - - - - - - - - - - - - - - BEGIN
2018/01/19-06:13:55 1,22801,User login failure.
2018/01/19-06:13:55 Attribute Value
2018/01/19-06:13:55 --------- --------------------
2018/01/19-06:13:55 reason    Locked user account.
2018/01/19-06:13:55 - - - - - - - - - - - - - - - END

This is because of a missing Java archive symlink for the MCS service, due to which the account locks out.

If you run the below command, you should see the symbolic link:
# ls -lh /usr/local/avamar-tomcat/lib/mccommons.jar

lrwxrwxrwx 1 root root 35 Jan 19 11:44 /usr/local/avamar-tomcat/lib/mccommons.jar -> /usr/local/avamar/lib/mccommons.jar

If this is missing, then you will run into the above issue. To fix this:

1. Be logged in as root user into VDP
2. Navigate to the below directory:
# cd /usr/local/avamar-tomcat/lib/

3. Then run the below command to recreate the symbolic link:
# /bin/ln -fs /usr/local/avamar/lib/mccommons.jar .

The trailing . is also required, as you are linking the file into the current working directory.

4. Restart tomcat service using:
# emwebapp.sh --restart

5. Restart MCS using:
# mcserver.sh --restart --verbose

That should fix it. Hope this helps!
