Tuesday, April 29, 2014

The Story of the 30-second freeze on the NetApp filer every five hours at 9am, 2pm, 7pm and 12am

We have two NetApp filers in active/active HA. One filer holds the SAS drives, mostly storing the OS disks for the VMware VMs. The second filer, filer02, hosts the data virtual disks for our VMware VMs, the CIFS shares, and the Exchange databases and logs.

A few weeks ago, we started noticing a 30-45 second freeze in Outlook at 9:00am and 2:00pm. We only noticed the freeze at these times because they fall within business hours. At first we thought it had something to do with the Exchange servers: there were events related to delayed writes on the database and log LUNs. However, the issue was occurring on all the Exchange mailbox servers. We take a backup every 4 hours, so we changed the backup schedules on all the servers, but the issue persisted.
We then thought it could be a network issue. However, we use dedicated switches for storage traffic, our VMs ran fine, and we did not notice the freeze on the VMs. When we checked the performance logs for the SAN, we noticed a spike in NFS and CIFS latency around 9:01am, 2:01pm, 7:01pm and 12:00am. None of the de-dup schedules ran at all four of these times. In the filer's syslog we saw the following entries for each of the Exchange servers' iSCSI initiators at the times mentioned above, but we still couldn't understand what was causing the LUN resets.
Event: iscsi.notice
Severity: notice
Message: ISCSI: Initiator (iqn.1991-05.com.microsoft:server-mbx-05.domain.com) sent LUN Reset request, aborting all SCSI commands on lun 7
Triggered: Sun Apr 27 00:01:32 PDT

Event: iscsi.notice
Severity: notice
Message: ISCSI: New session from initiator iqn.1991-05.com.microsoft:srv-mbx-07.domain.com at IP addr 172.20.1.64
Triggered: Fri Apr 11 09:01:19 PDT

We changed the schedule for all our VMware backups and any other backups that ran on the hour. There was still no change in this behavior.
Then I learned about a known bug involving NetApp and NFS. Because the issue had started appearing after we added a new host, we applied the workaround of changing NFS.MaxQueueDepth to 64 on all our hosts.
It still did not make a difference. Something was off and didn't make sense, because our VMs were not showing the freeze condition.
We went through every NetApp log and every VMware log on each host, but the only thing that pertained to this situation was the following in /var/log/vmkernel.log on each host at the times mentioned.

2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410013247ce8  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x4100132692a8  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001219b328  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001218a228  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410013242c68  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001219fae8  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001219e5e8  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001219fe68  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410012190268  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410013234fe8  3
2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410012190968  3
2014-04-28T21:01:51.783Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013246468 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001219b328 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410012190268 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013247ce8 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001219fe68 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013242c68 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410012190968 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001219fae8 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001218a228 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x4100132692a8 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013234fe8 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001323dda8 again
2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001219e5e8 again
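
To put a number on the pause, the Stop and Start entries above can be paired up per file descriptor. The following is only a sketch: the function name Get-NfsLockPause is made up for illustration, and the regular expression assumes the log format matches the lines shown above. The sample lines are copied from that excerpt.

```powershell
# Sample lines copied from the vmkernel.log excerpt above
$logLines = @(
    '2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x410013247ce8  3',
    '2014-04-28T21:01:42.787Z cpu9:10653)NFSLock: 608: Stop accessing fd 0x41001219b328  3',
    '2014-04-28T21:01:51.783Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013246468 again',
    '2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x41001219b328 again',
    '2014-04-28T21:01:51.801Z cpu6:8198)NFSLock: 568: Start accessing fd 0x410013247ce8 again'
)

# Hypothetical helper: pairs each "Stop accessing" with the next "Start accessing"
# for the same fd and reports how long access was paused.
function Get-NfsLockPause {
    param([string[]]$Lines)

    $pattern = '^(?<time>\S+)\s.*NFSLock: \d+: (?<action>Stop|Start) accessing fd (?<fd>0x[0-9a-fA-F]+)'
    $stopped = @{}   # fd -> time access was stopped

    foreach ($line in $Lines) {
        if ($line -match $pattern) {
            $fd     = $Matches['fd']
            $action = $Matches['action']
            $time   = [DateTimeOffset]::Parse($Matches['time'], [Globalization.CultureInfo]::InvariantCulture).UtcDateTime

            if ($action -eq 'Stop') {
                $stopped[$fd] = $time
            }
            elseif ($stopped.ContainsKey($fd)) {
                [pscustomobject]@{
                    Fd           = $fd
                    PauseSeconds = ($time - $stopped[$fd]).TotalSeconds
                }
                $stopped.Remove($fd)
            }
            # A Start without a matching Stop in the input is ignored
        }
    }
}

Get-NfsLockPause -Lines $logLines | Sort-Object PauseSeconds -Descending
```

For the lines above, each paired file descriptor shows a pause of roughly nine seconds. To analyse a full log you could pass the output of Get-Content on a copy of vmkernel.log as -Lines.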


We were about to give up and call VMware and NetApp support when I thought of aggregate snapshots. When I checked, there it was: the aggregate snapshots were being taken at exactly the times mentioned. I changed the schedule to off-peak hours, and the iscsi.notice events moved to the new schedule times. I would never have thought aggregate snapshots could cause a freeze.



Saturday, February 22, 2014

Script to monitor Exchange 2010 queue and send email alerts if threshold is reached

This script can be run as a scheduled task every few minutes on a server that has the Exchange management tools installed. It monitors the Exchange transport queues (the script queries every Hub Transport server) and sends an email alert if any queue exceeds the threshold you specify. Make sure you include an external email address among the recipients as well.

# Script:   Exch2010QueueMonitor.ps1
# Purpose:  This script can be set as a scheduled task to run every 30 minutes and will
#           monitor all Exchange 2010 queues. If any queue holds more than $threshold
#           messages, an output file with the queue details is e-mailed to all the
#           admins listed in the e-mail settings.
# Comments: Populate the e-mail settings below (SMTP server, From and To addresses)
#           with your own values.
# Notes:
#           - Tested with Exchange 2010 SP1 - SP3
#           - The log report output file will be created under "C:\Support\Scripts\queue.txt"

# Load the Exchange snap-in only if it is not already loaded
if (-not (Get-PSSnapin | Where-Object { $_.Name -match "Exchange" }))
{
    Add-PSSnapin *Exchange*
}

$filename  = "C:\Support\Scripts\queue.txt"
$threshold = 30    # alert when a queue holds more than this many messages
Start-Sleep -s 10

# Query the queues once and reuse the result for both the check and the report
$queues = Get-ExchangeServer | Where-Object { $_.IsHubTransportServer -eq $true } | Get-Queue | Where-Object { $_.MessageCount -gt $threshold }

if ($queues)
{
    $queues | Format-Table -Wrap -AutoSize | Out-File -FilePath $filename
    Start-Sleep -s 10

    $smtpServer = "smtprelay.domain.com"   # your SMTP server
    $msg = New-Object Net.Mail.MailMessage
    $att = New-Object Net.Mail.Attachment($filename)
    $smtp = New-Object Net.Mail.SmtpClient($smtpServer)
    $msg.From = "noreply_exchange@domain.com"   # send-as address
    $msg.To.Add("admin1@domain.com")            # change this to your admin address
    $msg.To.Add("admin2@externaldomain.com")    # add an external email address (Gmail etc.)
    $msg.Subject = "CAS SERVER QUEUE THRESHOLD REACHED - PLEASE CHECK EXCHANGE QUEUES"
    $msg.Body = "Please see attached queue log file for queue information"
    $msg.Attachments.Add($att)
    $smtp.Send($msg)
}

Citrix Xenapp 6 - Published Desktop disconnects when published application is launched from within a published desktop

When you launch a published application from within a published desktop session, the published desktop session disconnects. This happens because Citrix Receiver, by default, tries to reconnect to all your open sessions on launch, so starting the published application disconnects the session of the published desktop. You can change this behavior by setting the following registry value.

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Citrix\Dazzle]
"WSCReconnectMode"="0" 

Change this value on your published desktop. Once it is modified, relaunch the published application and your desktop session should remain intact.
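
If you prefer to script the change, here is a minimal sketch in PowerShell. It assumes the value is stored as a string (REG_SZ), as the export above suggests, and that it is run from an elevated prompt on the published desktop server.

```powershell
# Assumption: WSCReconnectMode is a string (REG_SZ) value, as in the export above
$path = 'HKLM:\SOFTWARE\Wow6432Node\Citrix\Dazzle'

# Create the key if it does not exist yet
if (-not (Test-Path $path)) {
    New-Item -Path $path -Force | Out-Null
}

# Set the value, then read it back to confirm
Set-ItemProperty -Path $path -Name 'WSCReconnectMode' -Value '0'
Get-ItemProperty -Path $path -Name 'WSCReconnectMode'
```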

Monday, February 17, 2014

SOLVED - Citrix Xenapp 6.0 hotfix rollup fails to install - Error 1904 Module C:\program files (x86)\citrix\system32\rpm.dll failed to register


When you try to install a hotfix rollup on Citrix XenApp 6.0, you receive Error 1904, reporting that the module C:\program files (x86)\citrix\system32\rpm.dll failed to register.

To resolve this, leave the error window open, browse to C:\windows\system32\ and rename the file cutildll64.dll to cutildll64.dll.old. Now hit Retry and the hotfix should install successfully. After the install completes, reboot the server and then copy the new file from C:\program files (x86)\citrix\system32\cutildll64.dll to C:\windows\system32\.

Restore items that had last modified date of a particular day from a Netapp snapshot


We ran into a problem where CryptoLocker hit one of our CIFS shares and encrypted a large number of files. This particular CIFS share held around 500GB of data. We managed to restore the share from a snapshot, but the users had modified a large number of files just before the virus hit. So the dilemma was how to restore only the files that were changed on a particular day.

Well, I love PowerShell for a reason. Before you start, make sure of the following:
1. The snapshot directory (~snapshot when browsing over CIFS) is visible on the NetApp CIFS share.
2. You have the name of the snapshot for the day (or hour) you want to restore from. In this example, it is nightly.4.

The command is

Get-ChildItem -Path Y:\~snapshot\nightly.4\share\operations -Filter *.* -Recurse | Where-Object {$_.LastWriteTime.Day -eq 5 -and $_.LastWriteTime.Month -eq 9 -and $_.LastWriteTime.Year -eq 2013} | Copy-Item -Destination Y:\share\temp\restored


You can change the following values to match the last-modified date you need:
1. $_.LastWriteTime.Day to the day of the month
2. $_.LastWriteTime.Month to the month of the year
3. $_.LastWriteTime.Year to the year
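
The three comparisons can also be collapsed into a single date comparison. The sketch below is one possible way to do it; the function name Get-FilesModifiedOn is made up for illustration, and the paths in the commented usage line are placeholders you would replace with your own.

```powershell
# Hypothetical helper: returns the files under $Path whose last-modified
# date (ignoring the time of day) falls on $Day.
function Get-FilesModifiedOn {
    param(
        [string]$Path,
        [datetime]$Day
    )

    Get-ChildItem -Path $Path -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime.Date -eq $Day.Date }
}

# Usage with placeholder paths:
# Get-FilesModifiedOn -Path 'Y:\~snapshot\nightly.4\share\operations' -Day '2013-09-05' |
#     Copy-Item -Destination 'Y:\share\temp\restored'
```

Comparing on .Date keeps the filter to one condition, and skipping folders avoids copying empty directories whose timestamp happens to match.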




Monday, February 3, 2014

How to resolve Citrix desktop delivery service console discovery process errors



The discovery process might fail with the following error:
“Errors occurred when using servername in the discovery process”

- If the local computer is a member of the farm, start the discovery process again, add the local computer to the list of servers and run the discovery again.

- If the discovery still fails, check that the server (the data collector) is up, and that the MFCOM service and the IMA service are running on that server and on the local computer.

- If the MFCOM service is not running, the server will need to be rebooted.

- Run the command qfarm /load to check whether the local server and the other servers are in the list. If a server is missing, run the following commands on that server:
Net stop imaservice
Net start imaservice
Then rerun discovery.

- If the process still fails, check that the licensing server is up and has no Citrix-related errors in the event log.

- Also check that the datastore server (SQL Server) is up and that the instance hosting the datastore is running on it.
Once you are sure that the datastore is up, open a command prompt on any of the XenApp servers (make sure the command prompt is run as administrator) and run the following command:
Dscheck
See if any inconsistencies show up. If there are inconsistencies, run the following command to clear them:
Dscheck /clean

Run the discovery again. This should fix the problem.

Thursday, January 30, 2014

Powershell script to remove smtp addresses with a domain from mailboxes in Exchange 2010

This script removes an SMTP domain from a client's mailboxes. Email address policies are how these domains get added to the mailboxes. However, email address policies are additive only and cannot be used to remove a domain that was added through a policy. The addresses have to be removed manually from each mailbox.

When will we need this?

A good scenario is when a client company ABC has changed its SMTP domain from abc.com to xyz.com and no longer wants to receive any email on the old domain abc.com. That is, anybody who sends an email to an abc.com address should get a bounceback. The old domain abc.com has been removed from the accepted domains and the MX records for abc.com no longer point to your Exchange server. Externally this works correctly, because the DNS has been modified and abc.com has been removed from the accepted domains in Exchange. But internally, on the same Exchange server, users will still be able to send email to and receive email at the abc.com addresses. The abc.com email addresses either need to be removed manually from each user's Exchange properties, or you can use PowerShell to do it for you.

How will this work?


1       I am using a custom attribute to filter the Get-Mailbox output, but you can use -OrganizationalUnit to filter by OU instead, or select all the users. In the script, change the CustomAttribute1 value to the client's code and the domain name (domain.com) to the target SMTP domain, then copy the script
2       Open the Exchange Management Shell and paste the script into the shell window (press Enter once)

All email addresses in the SMTP domain ‘domain.com’ will be removed from the mailboxes of the client with client code ‘clientcode’.

#Script to remove email address for a particular domain as EAP is additive Only
# BEFORE USING - please change the domain name and custom attribute as mentioned in comments
# IMPORTANT: DOMAIN IS NOT YOUR AD DOMAIN BUT THE SMTP DOMAIN YOU WANT TO REMOVE


#Gets the client mailboxes for the users with customattribute1 set as 'abc'

foreach($Clientmailbox in Get-mailbox -ResultSize Unlimited | where{$_.CustomAttribute1 -eq 'abc'})
{
#for each mailbox, grab the email addresses and filter the address list
#for the smtp domain that needs to be removed,
#then remove each matching email address
#CHANGE THE DOMAIN from domain.com to the corresponding domain
$Clientmailbox.EmailAddresses |
    ?{$_.AddressString -like '*@domain.com'} | %{
      Set-Mailbox $clientmailbox -EmailAddresses @{remove=$_}
    }
}
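
Before running the removal, it can help to see which addresses the -like pattern will actually match. The sketch below is a stand-alone illustration that runs without Exchange: the function name Select-AddressesInDomain and the sample addresses are made up, and it works on plain strings rather than on the address objects the script above receives from Get-Mailbox.

```powershell
# Hypothetical helper: returns the addresses that end in @<Domain>.
# -like is case-insensitive, and the leading * requires the domain to sit
# directly after the @, so sub.abc.com is NOT matched by 'abc.com'.
function Select-AddressesInDomain {
    param(
        [string[]]$Addresses,
        [string]$Domain
    )

    $Addresses | Where-Object { $_ -like "*@$Domain" }
}

# Made-up sample addresses for illustration
$sample = @(
    'SMTP:jane@xyz.com',
    'smtp:jane@abc.com',
    'smtp:jane@sub.abc.com',
    'smtp:jane@ABC.COM'
)

Select-AddressesInDomain -Addresses $sample -Domain 'abc.com'
```

With the sample list, only the two addresses directly under abc.com are returned. In the shell itself, appending -WhatIf to the Set-Mailbox call is another way to preview the changes without applying them.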



Wednesday, January 29, 2014

Powershell script to create a report for whether or not a Citrix Xenapp Hotfix is installed on Citrix Xenapp farm servers or not


The second biggest pain for me as a farm administrator has always been figuring out whether the XenApp servers in the farm have the latest rollup (or hotfix) installed, and which servers are missing it. The biggest pain, of course, is installing it remotely on all the XenApp servers (I am still working on that one).

So I have this script that creates a report for a particular rollup pack or hotfix, showing which servers have the hotfix installed and which don’t. It also creates a list of the servers where the hotfix is not installed, so I can install the hotfix on those servers.

I have tested this script on a XenApp 6.0 farm. I ran the script from a XenApp server in the farm.



# Script to create a report on whether the hotfix is installed or not on the
# XenApp servers
# Created by: Ruby Nahal


# Register snap-ins if missing
$snapin = Get-PSSnapin | where {$_.name -eq 'Citrix.Common.Commands'}
if($snapin -eq $null){ Add-PSSnapin Citrix.Common.Commands }
$snapin = Get-PSSnapin | where {$_.name -eq 'Citrix.XenApp.Commands'}
if($snapin -eq $null){ Add-PSSnapin Citrix.XenApp.Commands }

$farm = Get-XAfarm                          # get farm info
$servers = Get-XAServer                     # get all the XenApp servers in the farm
[string]$HotfixName = 'XA600W2K8R2X64R02'   # hotfix in question
$logfile = "C:\Support\XAreport.txt"        # log file to store the CXA name and whether the hotfix is installed
$CXAList = "C:\Support\CXAlist.txt"         # list of servers that do not have the hotfix installed

# To check several hotfixes, the loop below can be wrapped in a function
# that takes $HotfixName as a parameter.

Foreach($CXA in $servers)                   # run for each CXA (XenApp server) in the farm
{
    [Boolean]$RUisInstalled = $false        # tracks whether the hotfix was found on this server
    [string]$CXAName = $CXA.ServerName      # server name for the CXA in the current loop
    Write-Output $CXAName                   # write the CXA name

    foreach($hotfix in (Get-XAServerHotFix -ServerName $CXAName))   # run for each hotfix installed on the server
    {
        if($hotfix.HotfixName -eq $HotfixName)   # a match means the hotfix is installed
        {
            $RUisInstalled = $true
        }
    }

    # write the CXA name to the log file
    "`n****** $CXAName *******" | Out-File $logfile -Append -Width 240

    # record whether the hotfix is installed
    if ($RUisInstalled)
    {
        "Rollup Installed" | Out-File $logfile -Append -Width 240
    }
    else
    {
        "Rollup was Not Installed" | Out-File $logfile -Append -Width 240
        $CXAName | Out-File $CXAList -Append -Width 240
    }
}

Tuesday, January 28, 2014

Workspace control on Citrix Xenapp published desktop


What does this mean?

-          If you log in to a published desktop from computer 1 and then log in from computer 2, your session is automatically disconnected from computer 1 and launched on computer 2. You will NOT get the error “you have reached the concurrent session limit”.

-          Auto-launch is disabled. This is Citrix's default behavior: when workspace control is enabled, auto-launch is disabled. This means that when you open the login page and enter your username and password, the session will NOT launch automatically. You will have to click on the application icon to launch your session.

Where does this not work?

-          If a session is running on computer 1 and you attempt to launch another session to the same desktop, you WILL get the concurrent limit error.

How is this done?

In Citrix AppCenter, expand “Applications”, right-click the application you want to enable this on and select Application properties. Select the “Name” section and, under “Application description”, type in the following:

KEYWORDS:TreatAsApp Auto

TreatAsApp forces the desktop to be treated as a published application, which enables workspace control.

Auto automatically subscribes the user to the application in StoreFront, so the user does not need to add the application manually.

Tuesday, January 7, 2014

Script to backup ESXi configurations for all hosts in the cluster

This script backs up the ESXi configuration of all the hosts in the cluster. The backup is useful if a VMware host has to be rebuilt, or if the configuration of a host becomes corrupt or compromised. This is a PowerShell script that requires the VMware PowerCLI module.

# This script backs up the configuration of each host
# Created By: Ruby Nahal

# Run as a scheduled task using a service account that has
# administrative rights on vCenter. vCenter must have PowerCLI installed.
# The script creates the backups in D:\esxibackups and each backup is
# replaced by the new one.
# It also copies the backups to \\fileserver\share\backups

# Load the VMware PowerShell snap-in
Add-PSSnapin VMware.VimAutomation.Core

# Connect to the vCenter Server
Connect-VIServer vcenter-01

# Grab the list of host names
$esxihosts = Get-VMHost | Select-Object Name

# Print the list
Write-Output $esxihosts

Foreach ($esxihost in $esxihosts)
{
    # String variable holding the name of the current host object
    [string]$esxihostname = $esxihost.Name
    Write-Output $esxihostname

    Get-VMHostFirmware -VMHost $esxihostname -BackupConfiguration -DestinationPath D:\esxibackups
}

Copy-Item -Path D:\esxibackups -Destination \\fileserver\share\backups -Recurse -Force
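
Because each run replaces the previous backup, you may prefer to keep some history. One possible variation, sketched below, is to write each run into a folder named after the date. The function name New-DatedBackupFolder is made up for illustration, and the yyyy-MM-dd folder layout is an assumption rather than part of the script above.

```powershell
# Hypothetical helper: creates (if needed) and returns a dated subfolder,
# for example D:\esxibackups\2014-01-07
function New-DatedBackupFolder {
    param(
        [string]$Root,
        [datetime]$Date = (Get-Date)
    )

    $folder = Join-Path $Root $Date.ToString('yyyy-MM-dd')
    if (-not (Test-Path $folder)) {
        New-Item -ItemType Directory -Path $folder | Out-Null
    }
    $folder
}

# Usage: pass the returned folder as the backup destination, e.g.
# $target = New-DatedBackupFolder -Root 'D:\esxibackups'
# Get-VMHostFirmware -VMHost $esxihostname -BackupConfiguration -DestinationPath $target
```

If you go this way, remember to prune old folders so the backup drive does not fill up.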