Tuesday 14 December 2010

SNMP setup on Windows for Zenoss

Internally we use Zenoss to monitor our servers and systems. So that Zenoss can query lots of detail from the Windows servers, we run the free SNMP Informant program on each Windows host: http://www.snmp-informant.com/. The free version has distribution restrictions but can be used for any purpose at no cost.

To get this working (based on Win 2003 - other versions of Windows might deviate slightly):

  1. In Add/Remove Programs, under Windows Components, open "Management and Monitoring Tools" and select "Simple Network Management Protocol". This will probably require the Windows installation media.
  2. While in there, install the "WMI SNMP Provider", which is also needed by Zenoss.
  3. In the properties of the Windows SNMP service, on the Security tab, add the community name you want to use to secure this service and set it to read only. Set the service to accept SNMP packets only from specific hosts.
  4. Install SNMP Informant.
  5. Allow UDP port 161 through the firewall for SNMP traffic.
A Zenoss server will now be able to report on CPU, HDD, software installed etc.
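
Before adding the host to Zenoss it is worth checking that it actually answers SNMP queries. A minimal sketch, assuming the Net-SNMP tools are installed on the machine you test from: the community string "mycommunity" and the address 192.168.0.10 are made-up placeholders, and the function only prints the command so you can check it before running it.

```shell
# Build the Net-SNMP command that walks the "system" subtree of a host.
# $1 = community string, $2 = host name or IP (both placeholders here).
snmp_check_cmd() {
    printf 'snmpwalk -v 2c -c %s %s system\n' "$1" "$2"
}

# Print the command for review; paste it into a shell to run it.
snmp_check_cmd mycommunity 192.168.0.10
```

If the walk times out, the usual suspects are the firewall rule for UDP 161 or the "accept packets from these hosts" list on the SNMP service.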

Troubleshooting:
If WMI is not connecting, check the firewall. To allow WMI through the Windows firewall, run:
# netsh firewall set service RemoteAdmin enable
A great page for identifying issues is http://community.zenoss.org/docs/DOC-2520 which lists most common errors with monitoring windows devices.

Friday 10 December 2010

HP DL360 G5 very slow hard drive performance

One of our servers became free recently (HP ProLiant DL360 G5 with 2 dual core 3GHz processors, 9GB RAM and 4 x 147GB 15k rpm HDDs in RAID 0+1, or RAID 10, I am never sure of the difference). Nice fast server I thought.

So I install VMWare ESXi 4.1 and start testing. Abysmal performance on the hard drives! After much testing I work out that although read speeds are OK, write is useless (26MB/s read, <1MB/s write). I was expecting double that on the read speeds and far more on the write. I try a cheap iSCSI setup with FreeNAS and get ~12MB/s read and write, so the server itself is fine. I'm using Iometer for this testing, a Windows application from 2006 which takes a bit of setting up but is free and very powerful once you figure out how it works.

Eventually, after much testing, I figure out that although this server has a write cache, it does not have a backup battery on this cache. VMWare is detecting the missing battery and deciding not to use the cache, since a power failure would lead to corrupt data. I've just ordered HP parts 398648-001 (battery for the P400i RAID controller cache) and 409125-001 (power cable for this part), which should get this server back up to speed. The battery was substituted with part 381573-001, which is compatible and replaces 398648-001.

I had some trouble finding where to buy the parts but eventually came across www.chilternitparts.com who seem to have lots of parts for HP and Dell systems.

UPDATE:
The battery and cable arrived. Installed with no problems (there is a handy guide printed on the inside of the server cover) and performance has increased greatly. The battery is still charging 4 hours after installing it but VMWare has obviously decided to trust it as write speeds have gone from 0.63MB/s to 23.49MB/s (iometer settings - 1 worker with 50MB file on c:\ drive; 4k; 0%read; 0% random, left to run for about 5 min).
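
To put those write figures in terms of operations per second: with 4k blocks, throughput divided by block size gives the IOPS. A rough sketch, assuming Iometer's "MB" means 1024 KB (the function name is just for illustration):

```shell
# Convert a throughput in MB/s into I/O operations per second for a given
# block size in KB. Assumes 1 MB = 1024 KB.
mbps_to_iops() {
    awk -v mbps="$1" -v kb="$2" 'BEGIN { printf "%.0f\n", mbps * 1024 / kb }'
}

mbps_to_iops 0.63 4     # write speed before the battery was fitted
mbps_to_iops 23.49 4    # write speed after the battery was fitted
```

That works out at roughly 160 writes per second without the cache and about 6,000 with it.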

My domain controller (2003 Small Business Server, PowerEdge 2600) currently gets 30MB/s during backups - normally it gets nowhere close to this in daily use. However the tests I have been doing are deliberately hard on the throughput - by making the tests easier (32k block size instead of 4k, a queue of 32 outstanding I/Os, 10GB test file to make sure I exhaust the cache and measure actual disk throughput) I get an easy 60MB/s write speed and 75MB/s read speed. Writes using the cache peak at 125MB/s, and when I used a smaller file to read purely from cache I was getting a rock solid 250MB/s. Tests over - it's now time to start thinking about actually getting the Domain Controller virtualised. Probably an overnight job to do before Christmas. Luckily with another IT guy one of us can do the 6pm start cold clone part and the other one can do the 6am check clone thoroughly and bring live.
UPDATE: Nope that did not work.  Cold clone over the network took far longer than expected (17 hours for ~ 115GB of files over 3 partitions totaling ~ 260GB), partly due to us resizing the disks, partly due to it just being a known slow process.  Plan B is to find a long weekend, might have to delay as don't want to do it during our busy period (Jan, Feb, Mar).
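
For planning the next attempt it helps to turn those numbers into an average rate. A quick sketch, assuming 1 GB = 1024 MB and counting only the ~115GB of actual files (the function name is made up for this example):

```shell
# Average throughput in MB/s for copying a number of GB in a number of hours.
# Assumes 1 GB = 1024 MB.
avg_mbps() {
    awk -v gb="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", gb * 1024 / (hours * 3600) }'
}

avg_mbps 115 17
```

That is a little under 2MB/s, a small fraction of what the disks managed in the Iometer tests, which suggests the cold clone process itself was the bottleneck and not the storage.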

VMWare iSCSI targets with FreeNAS

Due to budgets (I think I remember what they are, it's been so long since I had one with anything left in it..... :) we do not have the money to get a decent SAN system for our VMWare environment at this point. Local storage works fine - it's not as flexible as a SAN and we lose some functionality, but it is fast enough and reliable.  However, as we have just purchased Veeam Essentials for backup of our virtual machines, I thought it would be worth trying a few things out to see how it would work. Definitely a productive exercise: although iSCSI is not as fast as the directly attached storage, it gives us more flexibility, and having a few hundred spare GB for testing is very useful as speed is not such a big issue there.

Steps required to get this going:
  1. Install FreeNAS and set up an iSCSI target. A good guide (ignoring the first bit, which deals with installing it in a virtual environment) is here: http://www.sysprobs.com/nas-vmware-workstation-iscsi-target 
  2. Set up Win7 to connect to the iSCSI target (if you want to, for testing).
  3. Set up VMWare ESXi to connect to the iSCSI target: http://www.sysprobs.com/connect-freenas-iscsi-disks-vmware-vsphere-4-esxi4 
  4. Profit!
If you are using Veeam to do backups you can increase the backup speed greatly by allowing Veeam to connect directly to the iSCSI target rather than using the ESXi host as a proxy. This also reduces the workload on the ESXi host, so it is a good thing all round. Handy guide to doing this here:
http://jpaul.me/?p=334

Wednesday 24 November 2010

VMWare backup comparison

Looking for a suitable backup program for our use. Our systems are 3 dual socket VMWare ESXi servers with direct attached storage running a selection of virtual machines. One is a Windows 2003 SBS server running Exchange and some file shares. We also have a NAS storage box with Windows file shares and NFS file shares which other systems back up to.

We purchased an LTO3 tape drive and tapes with Backup Exec 2010 for Small Business. It works well, except that it cannot back up the whole Windows SBS virtual machine when running inside that virtual machine. That, along with Exchange granular restores, was something I explicitly asked for when talking to BT Businessdirect sales support. I was hoping it would create a snapshot and then back up from the snapshot or something, but no such luck.  UPDATE: Currently negotiating with BT Businessdirect to get a refund, as we purchased that program based on them saying it would work 100% in exactly this setup and I can't really afford/justify the extra £1k to get the full version of Backup Exec 2010.  So option 1 below has been eliminated.

Options:
  1. Full Backup Exec with agents for Exchange etc., which I could then install on a separate server. More expensive and not sure what server I could use... Waiting for prices.
  2. Trilead VM Explorer - Does not seem to support Exchange granular restores, or Exchange at all, going by its features page. Does have a completely free version with a few limitations.
  3. Veeam - Looking fairly good. They do a "Veeam Essentials" bundle which tallies closely with the VMWare Essentials bundle - 3 servers with 2 sockets per server etc. For non VMWare virtual machine backups it has the ability to copy files to/from any location which is visible to the server as a network share/local drive/iSCSI drive etc.
    Veeam does not back up to tape - would need a separate process to do this for offsite backup. Could experiment with replication to offsite ESX hosts but that would need a more expensive licence. Could use something like Bacula to go from HDD backup to tape fairly easily, or just use an old copy of Backup Exec or similar - at this point it is just copying files to tape, no fancy open files or anything to worry about.
  4. vRanger - Another promising possibility, requested a quote a few days ago and not heard anything yet.
  5. Arkeia - Possibility, not investigated too much yet. They provide a free version which is limited to two VM backups etc.

Friday 5 November 2010

VMWare with Adaptec 29320 and Tandberg LTO3 tape drive

Just installed VMWare ESXi 4.1 (and later ESXi 5.0.0 U1) on a server with a new Adaptec 29320 SCSI card and Tandberg HH LTO3 external tape drive, and VMWare reports the tape drive as dead. Nothing shows in the devices section under storage controllers, but on the paths screen there is a dead path shown. I know the card is good because the same SCSI card has a Certance LTO2 tape drive connected which is working fine. Whoot. Half an hour of slightly concerned googling later I came across these posts, which explain a quick workaround.
http://www.experts-exchange.com/Software/VMWare/Q_26389860.html
http://www.experts-exchange.com/Software/VMWare/Q_26366697.html

Basically use the troubleshooting section of the local console to enable local tech support, log on via Alt-F1 and enter the following command:
# esxcli nmp satp addrule --satp VMW_SATP_LOCAL --driver="aic79xx" --description="Specific rule for Adaptec Card"
NOTE: for ESXi version 5 the above command does not work - the syntax has changed slightly. The following command worked for me though:
# esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --driver="aic79xx" --description="Specific rule for Adaptec Card"
Reboot server and the tape drive is live and good to assign to a virtual machine.

No idea why this happened for this specific hardware, assume it will be fixed in a future release of VMWare. For some reason the ESXi host does not know which driver to load for the Adaptec SCSI card and the line above forces ESXi to load the correct driver. (UPDATE: Unfortunately not yet fixed in ESXi 4.1U1 or ESXi 5.0.0U1)
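
Since the syntax differs between releases, here is a small helper that prints the right form of the command for a given major version. This is only a sketch for keeping the two variants straight: the function name is made up, it does nothing but echo the commands quoted above, and you still type the result into the ESXi console yourself.

```shell
# Print the SATP rule command for a given ESXi major version (4 or 5).
# The command text is taken from the two variants shown above.
satp_rule_cmd() {
    opts='--satp VMW_SATP_LOCAL --driver="aic79xx" --description="Specific rule for Adaptec Card"'
    case "$1" in
        4) echo "esxcli nmp satp addrule $opts" ;;
        5) echo "esxcli storage nmp satp rule add $opts" ;;
        *) echo "unsupported ESXi version: $1" >&2; return 1 ;;
    esac
}

satp_rule_cmd 5
```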

Friday 29 October 2010

IPv6

Starting to look at IPv6 and trying to figure it out. So far I know that BT has no IPv6 available yet as of end Oct 2010 (we have a leased line with a 100Mb bearer, so definitely business class). Therefore it looks like I'll be doing a 6to4 setup for testing. Hopefully BT will at least get a 6rd system up and running fairly soon, as that should give us better performance than 6to4, which currently would go to the closest 192.88.99.1 host. I think that might be in France based on the traceroute results I get. Still getting decent pings to it though, so not too bad.
Update Jan 2011: Just found http://ipv6.bt.com/ which claims BT's IPv6 trial is finished and services are now available. Hopefully this means we will get a properly allocated block of IPv6 addresses rather than using a tunnel. If we ever get it set up. I've chased my contact again for information.

Our firewall (Netscreen SSG140) supports IPv6 which is perfect and for testing I should be able to get our old Netscreen 5GT going, not sure if that will support 6to4 though. UPDATE: Yes - it does with newer firmware.
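
For reference, the address block you get with 6to4 is derived directly from your public IPv4 address: it is 2002: followed by the four IPv4 octets in hex, as a /48. A minimal sketch (the function name is made up, and 192.0.2.4 is a documentation address, not ours):

```shell
# Derive the 6to4 /48 prefix (2002:AABB:CCDD::/48) from a public IPv4 address.
sixto4_prefix() {
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<EOF
$1
EOF
    # Print each octet as two hex digits behind the 2002: prefix.
    printf '2002:%02x%02x:%02x%02x::/48\n' "$a" "$b" "$c" "$d"
}

sixto4_prefix 192.0.2.4
```

Everything under that /48 is yours to subnet internally, which is plenty for testing until a properly allocated block turns up.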

Monday 26 July 2010

Quick and dirty NFS share mounting on linux

To quickly access files on an NFS share which already exists:
# mkdir /mnt/remote-files && mount -t nfs 192.168.0.3:/raid/data/store /mnt/remote-files
This makes a local folder under /mnt/ and then mounts the NFS share "/raid/data/store" on that folder.  192.168.0.3 is of course the machine hosting the NFS share.

Tuesday 20 July 2010

Scan hard drive for errors with badblocks

To scan a suspect hard drive for errors I use the Linux badblocks program. Windows often seems to silently try to "fix" problems without telling you what, if anything, was wrong, which does not help if I want to know whether a HDD is starting to get bad sectors.  Be careful: in destructive mode (-w) this program will overwrite your data. The write modes do give a more thorough test though, as they try writing to the disk as well as reading.
# badblocks -nvs /dev/sda
-n = non-destructive read-write mode (existing data is put back after each test; without -n or -w the scan is read only)
-v = verbose
-s = print out progress
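
badblocks has three test modes and it is easy to mix them up, so here is a small sketch that prints the command line for each. The mode names and the function are made up for this example; the flags themselves (-n, -w, -s, -v) are the standard badblocks ones.

```shell
# Print the badblocks command for a given test mode and device.
#   readonly       - default mode, only reads the disk
#   nondestructive - -n, read-write test that restores the original data
#   destructive    - -w, write test that WIPES the disk
badblocks_cmd() {
    case "$1" in
        readonly)       echo "badblocks -sv $2" ;;
        nondestructive) echo "badblocks -nsv $2" ;;
        destructive)    echo "badblocks -wsv $2" ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

badblocks_cmd nondestructive /dev/sda
```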

Thursday 15 April 2010

VMWare ESXi backups with VSphere

With VSphere purchased you can run backups over the network of virtual machines without taking them offline. The downside is that this will only be a crash consistent backup, although you can do things to improve this. The process works by taking a snapshot of the VM, then backing up the snapshot while the machine is still running. Once done the snapshot is merged back in to the running VM.

The command to run is (from "c:\program files\VMware\VMware Consolidated Backup Framework" by default):
# vcbmounter -h <vsphere host> -u <username> -a name:<VM Machine Name> -r <Local backup directory> -t fullvm -m nbd -M 1
The command above will prompt for your password on the command line. You can include the password in the command instead, but that leaves it in the logs in plain text.

-a name: - This uses the VM name shown in VSphere (case sensitive) and will allow you to back up VMs which are powered off. Other options allow you to use the network name or IP of the Virtual Machine, but these require the VM to be on.

-t fullvm - the other option is to mount the VM so you can access the files in the filesystems. Full VM gives you a VMWare image you can import elsewhere.

-M 1 - This sets the monolithic flag, giving one big file for the filesystem. Otherwise it would split the filesystem into 2GB chunks.

Earlier I mentioned that this backup would only be crash consistent. You can reduce potential problems by installing VMWare Tools to the guest VM, and by setting up scripts which the backup process can run before the snapshot to unmount databases etc.  For example you could turn off any databases, do the snapshot then turn the databases back on - databases would only be off for a couple of minutes and the backup would not risk corrupted database tables.

mountvm gives remote access to the guest file system so you can back up specific files
vcbvmname lists details of all VMs on a server

Tuesday 30 March 2010

Setting the correct time on Linux

Main ntp server I use: 0.uk.pool.ntp.org. Find one suitable for your location at http://www.pool.ntp.org/en/.  Internally hosts update from Brain or the Firewall.

Install ntp:
# yum install ntp (Redhat / CentOS)
# apt-get install ntp (Debian)
Edit the config file so it has a server line pointing at 10.3.1.1:
# vi /etc/ntp.conf
Force sync time:
# ntpdate -b <IP OF NTP SERVER>
Set up crontab to auto sync every 15 min:
# crontab -e
*/15 * * * * /usr/sbin/ntpd -q -u ntp:ntp
To set the timezone, edit the /etc/sysconfig/clock file and set ZONE="Europe/London". Then link the localtime file to the correct zone with:
# ln -sf /usr/share/zoneinfo/Europe/London /etc/localtime
To manually set the date use this command:
# date --set HH:MM:SS
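
A note on the crontab minute field, since it is easy to get wrong: a plain "15" means "at 15 minutes past every hour", while "*/15" means "every 15 minutes". The little sketch below expands the two forms so the difference is visible. The function is purely illustrative and only understands a plain number or a "*/N" step, not the full cron syntax.

```shell
# List the minutes of the hour at which a cron minute field fires.
# Only handles a plain number ("15") or a step over the whole hour ("*/15").
cron_minutes() {
    case "$1" in
        '*/'*)
            step=${1#*/}          # strip the leading "*/" to get the step
            minute=0
            out=
            while [ "$minute" -lt 60 ]; do
                out="$out${out:+ }$minute"
                minute=$((minute + step))
            done
            echo "$out"
            ;;
        *)
            echo "$1"
            ;;
    esac
}

cron_minutes '15'
cron_minutes '*/15'
```

The first call prints a single minute and the second prints four, one for each quarter of the hour.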

Wednesday 17 February 2010

Connecting to a Postgres database from Windows via ODBC

Getting the ODBC link working so that a SQL Server can see a Postgres server was fairly easy, but it had a few gotchas that took me a while to work out.  The first step is to install the official Postgres ODBC driver and set up a DSN for the server using the standard ODBC interface in Windows.

In the options for the ODBC link, tick "Use Declare/Fetch", as otherwise the ODBC driver tries to cache the whole table in memory. For huge tables this will fail with errors about memory and being unable to read tuples, even if there is spare memory on the machine (the ODBC driver cannot access all the system memory - it's limited, so you get memory errors even when your machine and the Postgres machine both have lots of free memory). With this ticked the driver grabs blocks of 100 rows at a time.

You also need to untick the "Bools as Char" box so the driver passes booleans as bits to SQL Server 2005, which is how it views them.  Otherwise it tries to put "yes/no" in a field which will only accept "1/0".

If using 64 bit Windows you still need the 32 bit ODBC driver - it will not show up in the Data Sources (ODBC) section of the control panel but will be accessible through the SQL Server Business Intelligence Development Studio. Or you can run the 32 bit ODBC manager at C:\WINDOWS\SysWOW64\odbcad32.exe.  I never did work out why, on 64 bit Windows (Server 2003) with 64 bit SQL Server (2005), it uses the 32 bit ODBC connection.
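
If you prefer a DSN-less connection, the same two options can go in the connection string instead of the tick boxes. A hedged sketch that just assembles such a string: the driver name "PostgreSQL Unicode", the server name and the database name are assumptions for illustration (check the exact driver name in your ODBC manager), while UseDeclareFetch, Fetch and BoolsAsChar are the psqlODBC keywords behind the options described above.

```shell
# Build a psqlODBC connection string with Declare/Fetch on (100 rows per
# fetch) and "Bools as Char" off. $1 = server, $2 = database (placeholders).
pg_odbc_connstr() {
    printf 'Driver={PostgreSQL Unicode};Server=%s;Port=5432;Database=%s;UseDeclareFetch=1;Fetch=100;BoolsAsChar=0;\n' "$1" "$2"
}

pg_odbc_connstr pgserver.example.com mydb
```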