Openfiler 2.3 Active/Passive Cluster (heartbeat,DRBD) With Offsite Replication Node


Openfiler is a Linux based NAS/SAN application which can deliver storage over NFS/SMB/iSCSI and FTP. It has a web interface through which you can control these services. This howto is based on the howto by Kyle Gililland. A lot of thanks to him for his work.

The cluster we build will consist of two nodes that replicate each other and take over services and storage in case of emergency. In addition we have an offsite replication server, which ideally stands in a physically separate location and replicates the configuration and storage from whichever node is active. In an emergency this offsite replication server can be used to restore the cluster and to deliver the services.

I used the Openfiler 2.3 x86 VMware images and the Openfiler 2.3 x86 installation media to test this howto. Hopefully you will be able to repeat my steps and end up with a safe and cheap storage solution.

 

Overview

  • 1. Installation
  • 2. Prepare the partitions for DRBD
  • 3. Create basic heartbeat configuration
  • 4. DRBD Configuration
  • 5. Prepare the configuration partition
  • 6. Initiate Data/Storage partition
  • 7. Create final heartbeat configuration
  • 8. Enable/Disable system services
  • 9. Test failover on node1 and node2
  • 10. Test recovery of filer01 and filer02
    • 10.1 DRBD Configuration
    • 10.2 filer01 and filer02 redo configuration
    • 10.3 Retake resources and run cluster again
  • 11. Use Replication Node as Main Node
    • 11.1 Finished Replication, how to turn the Replication Node into standby again
  • 12. Add another Storage Partition

Requirements

Hardware

  • 3x boxes that meet the minimum hardware requirements of Openfiler
  • 2x ethernet interfaces in each box

 

1. Installation

During the installation of Openfiler, leave the portion of the disk you want to use for storage services unused. In this case it was a second hard disk in each system. We create the partitions for storage replication after the installation.

Network Configuration

  • on filer01:
    • eth0: 10.10.11.101
    • eth1: 10.10.50.101
  • on filer02:
    • eth0: 10.10.11.102
    • eth1: 10.10.50.102
  • on filer03:
    • eth0: 10.10.11.103
    • eth1: 10.10.50.103
  • virtual devices: ( these will be created by heartbeat, don't add these IPs to any interface )
    • eth0:0 10.10.11.100 ( iscsi target server ip )
    • eth1:0 10.10.50.100 ( replication interface )

After finishing the installation, build yourself a nice working environment while you update the Openfiler systems. It's important to update the systems now, as the Openfiler 2.3 standard installation media don't include DRBD 8.3.x, which is needed for 3-node replication.

conary update conary
conary updateall --replace-files --no-conflict-check

While the systems update, add the other filers to the /etc/hosts file on each node and generate and exchange some SSH keys.

root@filer01 ~# nano /etc/hosts

#Installed by rBuilder
127.0.0.1       filer01 localhost.localdomain   localhost
10.10.50.102	filer02
10.10.50.103	filer03

root@filer02 ~# nano /etc/hosts

#Installed by rBuilder
127.0.0.1       filer02 localhost.localdomain   localhost
10.10.50.101	filer01
10.10.50.103	filer03

root@filer03 ~# nano /etc/hosts

#Installed by rBuilder
127.0.0.1       filer03 localhost.localdomain   localhost
10.10.50.101	filer01
10.10.50.102	filer02

Then we create SSH keys without a passphrase, which we use for easier file exchange.

root@filer01 ~# ssh-keygen -t dsa

Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
4c:76:ec:61:10:2a:07:9f:35:8d:9b:16:89:5c:bc:b9 root@filer01

Do the same on the other 2 nodes:

root@filer02 ~# ssh-keygen -t dsa
root@filer03 ~# ssh-keygen -t dsa

Exchange the public keys between the nodes ( each node first receives one key as authorized_keys2 and one as authorized_keys; these get merged in the next step ):

root@filer01 ~# scp .ssh/id_dsa.pub root@filer02:~/.ssh/authorized_keys2
root@filer01 ~# scp .ssh/id_dsa.pub root@filer03:~/.ssh/authorized_keys2

root@filer02 ~# scp .ssh/id_dsa.pub root@filer01:~/.ssh/authorized_keys2
root@filer02 ~# scp .ssh/id_dsa.pub root@filer03:~/.ssh/authorized_keys

root@filer03 ~# scp .ssh/id_dsa.pub root@filer01:~/.ssh/authorized_keys
root@filer03 ~# scp .ssh/id_dsa.pub root@filer02:~/.ssh/authorized_keys

root@filer01 ~# cat .ssh/authorized_keys >> .ssh/authorized_keys2
root@filer02 ~# cat .ssh/authorized_keys >> .ssh/authorized_keys2
root@filer03 ~# cat .ssh/authorized_keys >> .ssh/authorized_keys2
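
You can verify the key exchange worked by running a remote command from each node; apart from the one-time host key confirmation it should complete without asking for a password:

root@filer01 ~# ssh root@filer02 hostname
root@filer01 ~# ssh root@filer03 hostname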

 

2. Prepare The Partitions For DRBD

Set up the partitions for DRBD, which will contain the configuration files for Openfiler and its storage services. Don't add any of them to /etc/fstab. If you created the partitions during the installation process, remove their entries from /etc/fstab.

You have to do these steps on all three machines.

fdisk /dev/sdb

Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1566, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1566, default 1566): 67
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (68-1566, default 68):
Using default value 68
Last cylinder, +cylinders or +size{K,M,G} (68-1566, default 1566):
Using default value 1566
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 8e
Changed system type of partition 2 to 8e (Linux LVM)

You have to exit fdisk with w to write the changes to the disk.

You can check the partitions after this:

fdisk -l

Disk /dev/sda: 2355 MB, 2355978240 bytes
16 heads, 63 sectors/track, 4565 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        4565     2300696   83  Linux

Disk /dev/sdb: 12.9 GB, 12884901888 bytes
255 heads, 63 sectors/track, 1566 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xff9e0345

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1          67      538146   83  Linux
/dev/sdb2              68        1566    12040717+  8e  Linux LVM

NOTE: Before you continue, the systems should have finished updating.
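
To confirm the update brought in a DRBD 8.3.x release ( required for the stacked 3-node resources ), you can load the module and check the version:

modprobe drbd
cat /proc/drbd

The first line of the output should report version 8.3.x.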

 

3. Create A Basic Heartbeat Configuration

This step is needed to create the virtual IP that DRBD is going to use for the offsite replication.

Create /etc/ha.d/authkeys ( filer01 and filer02 ):

auth 2
2 crc

Change the access rights of the /etc/ha.d/authkeys file:

chmod 600 /etc/ha.d/authkeys

Create /etc/ha.d/ha.cf ( filer01 and filer02 ):

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
bcast eth1
keepalive 5
warntime 10
deadtime 120
initdead 120
udpport 694
auto_failback off
node filer01
node filer02

Edit /opt/openfiler/etc/cluster.xml ( filer01 ):

<?xml version="1.0" ?>
<cluster>
<clustering state="on" />
<nodename value="filer01" />
<resource value="MailTo::it@company.com::ClusterFailover"/>
<resource value="IPaddr::10.10.50.100/24/eth1" />
</cluster>

Go to https://10.10.11.101:446 and start the iSCSI target service ( this will create /etc/ha.d/haresources on filer01 ).

Copy the haresources file to filer02 and start heartbeat on both nodes:

root@filer01 ~# scp /etc/ha.d/haresources root@filer02:/etc/ha.d/haresources
root@filer01 ~# service heartbeat start
root@filer02 ~# service heartbeat start

Wait till heartbeat is up on both nodes, then stop heartbeat on filer02:

root@filer02 ~# service heartbeat stop

This should trigger a failover in which filer01 takes over the IP address 10.10.50.100. Check this with ifconfig on filer01; the output should look like this:

root@filer01 ~# ifconfig

eth0 Link encap:Ethernet HWaddr 00:0C:29:D9:66:27
inet addr:10.10.11.101 Bcast:10.10.11.255 Mask:255.255.255.0

eth1 Link encap:Ethernet HWaddr 00:0C:29:D9:66:31
inet addr:10.10.50.101 Bcast:10.10.50.255 Mask:255.255.255.0

eth1:0 Link encap:Ethernet HWaddr 00:0C:29:D9:66:31
inet addr:10.10.50.100 Bcast:10.10.50.255 Mask:255.255.255.0

If this is set, you can continue and create the DRBD devices. If not, check the output of /var/log/ha-log and try to figure out what is wrong.
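
For example, to follow the log live while heartbeat comes up:

root@filer01 ~# tail -f /var/log/ha-log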

 

4. DRBD Configuration

This manual assumes that you have worked with DRBD in the past and know how to tune the DRBD common section for your hardware. Using protocol C is safe on stacked resources, but not optimal for network performance.

Remove the old drbd.conf and create a new one:

root@filer01 ~# rm /etc/drbd.conf

root@filer01 ~# nano /etc/drbd.conf

global {
        usage-count yes;
}
common {
        protocol C;
        disk {
                on-io-error detach;
        }
        net {
                after-sb-0pri disconnect;
                after-sb-1pri disconnect;
                after-sb-2pri disconnect;
                rr-conflict disconnect;
        }
        syncer {
                al-extents 257;
                rate 100M;
        }
        handlers {
                pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
                pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
                local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
        }
        startup {
                degr-wfc-timeout 120; # 2 minutes.
        }
}
resource meta {
 on filer01 {
  device /dev/drbd0;
  disk /dev/sdb1;
  address 10.10.50.101:7788;
  meta-disk internal;
 }
 on filer02 {
  device /dev/drbd0;
  disk /dev/sdb1;
  address 10.10.50.102:7788;
  meta-disk internal;
 }
}
resource data {
 on filer01 {
  device /dev/drbd1;
  disk /dev/sdb2;
  address 10.10.50.101:7789;
  meta-disk internal;
 }
 on filer02 {
  device /dev/drbd1;
  disk /dev/sdb2;
  address 10.10.50.102:7789;
  meta-disk internal;
 }
}
resource meta-U {
 stacked-on-top-of meta {
  device /dev/drbd10;
  address 10.10.50.100:7788;
 }
 on filer03 {
  device /dev/drbd10;
  disk /dev/sdb1;
  address 10.10.50.103:7788;
  meta-disk internal;
 }
}
resource data-U {
 stacked-on-top-of data {
  device /dev/drbd11;
  address 10.10.50.100:7789;
 }
 on filer03 {
  device /dev/drbd11;
  disk /dev/sdb2;
  address 10.10.50.103:7789;
  meta-disk internal;
 }
}

Copy this file over to the other nodes:

root@filer01 ~# scp /etc/drbd.conf root@filer02:/etc/drbd.conf
root@filer01 ~# scp /etc/drbd.conf root@filer03:/etc/drbd.conf

Initiate the lower resources:

root@filer01 ~# drbdadm create-md meta
root@filer01 ~# drbdadm create-md data

root@filer02 ~# drbdadm create-md meta
root@filer02 ~# drbdadm create-md data

Start DRBD on filer01 and filer02:

root@filer01 ~# service drbd start
root@filer02 ~# service drbd start

Set the lower drbd resources primary on filer01:

root@filer01 ~# drbdsetup /dev/drbd0 primary -o
root@filer01 ~# drbdsetup /dev/drbd1 primary -o

Create the DRBD metadata on the stacked resources:

root@filer01 ~# drbdadm --stacked create-md meta-U
root@filer01 ~# drbdadm --stacked create-md data-U

Enable the stacked resources and set them to primary:

root@filer01 ~# drbdadm --stacked up meta-U
root@filer01 ~# drbdadm --stacked up data-U
root@filer01 ~# drbdsetup /dev/drbd10 primary -o
root@filer01 ~# drbdsetup /dev/drbd11 primary -o

Create DRBD metadata on filer03 and start the resources:

root@filer03 ~# drbdadm create-md meta-U
root@filer03 ~# drbdadm create-md data-U
root@filer03 ~# service drbd start

Monitor the sync process on the nodes with:

root@filer01 ~# service drbd status

drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil@fat-tyre, 2010-01-13 17:17:27
m:res cs ro ds p mounted fstype
0:meta Connected Primary/Secondary UpToDate/UpToDate C
1:data Connected Primary/Secondary UpToDate/UpToDate C
10:meta-U^^0 Connected Primary/Secondary UpToDate/UpToDate C
11:data-U^^1 Connected Primary/Secondary UpToDate/UpToDate C

This is how it should look when all nodes are synced and filer01 is the primary.
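
If you want to follow the initial sync continuously instead of re-running the status command by hand, something like this works ( assuming the watch utility is installed ):

root@filer01 ~# watch -n 10 cat /proc/drbd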

 

5. Prepare The Configuration Partition

Filer01

Initiate meta partition:

NOTE: Here we are going to use the stacked resource instead of the lower level one!

root@filer01 ~# mkfs.ext3 /dev/drbd10

Openfiler to Meta-Partition:

root@filer01 ~# mkdir /meta
root@filer01 ~# mount /dev/drbd10 /meta
root@filer01 ~# mv /opt/openfiler/ /opt/openfiler.local
root@filer01 ~# mkdir /meta/opt
root@filer01 ~# cp -a /opt/openfiler.local /meta/opt/openfiler
root@filer01 ~# ln -s /meta/opt/openfiler /opt/openfiler
root@filer01 ~# rm /meta/opt/openfiler/sbin/openfiler
root@filer01 ~# ln -s /usr/sbin/httpd /meta/opt/openfiler/sbin/openfiler
root@filer01 ~# rm /meta/opt/openfiler/etc/rsync.xml
root@filer01 ~# ln -s /opt/openfiler.local/etc/rsync.xml /meta/opt/openfiler/etc/
root@filer01 ~# mkdir -p /meta/etc/httpd/conf.d

Samba/NFS/ISCSI/PROFTPD Configuration Files to Meta Partition:

root@filer01 ~# service nfslock stop
root@filer01 ~# service nfs stop
root@filer01 ~# service rpcidmapd stop
root@filer01 ~# umount -a -t rpc-pipefs
root@filer01 ~# mkdir -p /meta/etc
root@filer01 ~# mv /etc/samba/ /meta/etc/
root@filer01 ~# ln -s /meta/etc/samba/ /etc/samba
root@filer01 ~# mkdir -p /meta/var/spool
root@filer01 ~# mv /var/spool/samba/ /meta/var/spool/
root@filer01 ~# ln -s /meta/var/spool/samba/ /var/spool/samba
root@filer01 ~# mkdir -p /meta/var/lib
root@filer01 ~# mv /var/lib/nfs/ /meta/var/lib/
root@filer01 ~# ln -s /meta/var/lib/nfs/ /var/lib/nfs
root@filer01 ~# mv /etc/exports /meta/etc/
root@filer01 ~# ln -s /meta/etc/exports /etc/exports
root@filer01 ~# mv /etc/ietd.conf /meta/etc/
root@filer01 ~# ln -s /meta/etc/ietd.conf /etc/ietd.conf
root@filer01 ~# mv /etc/initiators.allow /meta/etc/
root@filer01 ~# ln -s /meta/etc/initiators.allow /etc/initiators.allow
root@filer01 ~# mv /etc/initiators.deny /meta/etc/
root@filer01 ~# ln -s /meta/etc/initiators.deny /etc/initiators.deny
root@filer01 ~# mv /etc/proftpd /meta/etc/
root@filer01 ~# ln -s /meta/etc/proftpd/ /etc/proftpd

httpd Modules for Openfiler ( use lib64 for 64bit openfiler ):

root@filer01 ~# rm /opt/openfiler/etc/httpd/modules
root@filer01 ~# ln -s /usr/lib/httpd/modules /opt/openfiler/etc/httpd/modules

Restart Openfiler to see if everything worked:

root@filer01 ~# service openfiler restart

NOTE: If it doesn't restart, check whether you have linked the httpd modules correctly; you probably need lib64 instead of lib!
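
A quick way to check the module link ( use lib64 instead of lib on 64bit systems ):

root@filer01 ~# ls -l /opt/openfiler/etc/httpd/modules

The symlink should point to /usr/lib/httpd/modules.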

 

Filer02 and Filer03

Openfiler Configuration:

mkdir /meta
mv /opt/openfiler/ /opt/openfiler.local
ln -s /meta/opt/openfiler /opt/openfiler

Samba/NFS/ISCSI/PROFTPD Configuration Files to Meta Partition:

service nfslock stop
service nfs stop
service rpcidmapd stop
umount -a -t rpc-pipefs
rm -rf /etc/samba/
ln -s /meta/etc/samba/ /etc/samba
rm -rf /var/spool/samba/
ln -s /meta/var/spool/samba/ /var/spool/samba
rm -rf /var/lib/nfs/
ln -s /meta/var/lib/nfs/ /var/lib/nfs
rm -rf /etc/exports
ln -s /meta/etc/exports /etc/exports
rm /etc/ietd.conf
ln -s /meta/etc/ietd.conf /etc/ietd.conf
rm /etc/initiators.allow
ln -s /meta/etc/initiators.allow /etc/initiators.allow
rm /etc/initiators.deny
ln -s /meta/etc/initiators.deny /etc/initiators.deny
rm -rf /etc/proftpd
ln -s /meta/etc/proftpd/ /etc/proftpd

6. Initiate Data/Storage Partition

Change Line 53 in /etc/lvm/lvm.conf on all nodes:

filter = [ "a/.*/" ]

to

filter = [ "a|drbd1[0-9]|", "r|.*|" ]
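
Instead of editing the file three times you can change it on filer01 and copy it to the other nodes:

root@filer01 ~# scp /etc/lvm/lvm.conf root@filer02:/etc/lvm/lvm.conf
root@filer01 ~# scp /etc/lvm/lvm.conf root@filer03:/etc/lvm/lvm.conf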

Initiate data Partition:

NOTE: It's important to use the stacked resource device here, like you did with the meta partition!

root@filer01 ~# pvcreate /dev/drbd11
root@filer01 ~# vgcreate data /dev/drbd11
root@filer01 ~# lvcreate -L 400M -n filer data
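
You can verify that LVM sees only the stacked DRBD device and that the VolumeGroup and LogicalVolume were created:

root@filer01 ~# pvs
root@filer01 ~# vgs
root@filer01 ~# lvs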

 

7. Create Final Heartbeat Configuration

Change /meta/opt/openfiler/etc/cluster.xml on filer01:

<?xml version="1.0" ?>
<cluster>
<clustering state="on" />
<nodename value="filer01" />
<resource value="MailTo::it@company.com::ClusterFailover"/>
<resource value="IPaddr::10.10.11.100/24/eth0" />
<resource value="IPaddr::10.10.50.100/24/eth1" />
<resource value="drbdupper::meta-U"/>
<resource value="drbdupper::data-U"/>
<resource value="LVM::data"/>
<resource value="Filesystem::/dev/drbd10::/meta::ext3::defaults,noatime"/>
<resource value="MakeMounts"/>
</cluster>

Go to https://10.10.11.101:446 and start/stop the iSCSI target service ( this will recreate /etc/ha.d/haresources on filer01 ).

Copy haresources to filer02:

root@filer01 ~# scp /etc/ha.d/haresources root@filer02:/etc/ha.d/haresources

 

8. Enable/Disable System Services

We enable heartbeat and drbd at boot, and disable the services that heartbeat handles:

root@filer01 ~# chkconfig --level 2345 heartbeat on
root@filer01 ~# chkconfig --level 2345 drbd on
root@filer01 ~# chkconfig --level 2345 openfiler off
root@filer01 ~# chkconfig --level 2345 open-iscsi off

root@filer02 ~# chkconfig --level 2345 heartbeat on
root@filer02 ~# chkconfig --level 2345 drbd on
root@filer02 ~# chkconfig --level 2345 openfiler off
root@filer02 ~# chkconfig --level 2345 open-iscsi off

root@filer03 ~# chkconfig --level 2345 drbd on
root@filer03 ~# chkconfig --level 2345 openfiler off
root@filer03 ~# chkconfig --level 2345 open-iscsi off
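
You can double-check the runlevel settings afterwards, for example:

root@filer01 ~# chkconfig --list heartbeat
root@filer01 ~# chkconfig --list openfiler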

9. Test Failover On node1 And node2

NOTE : ALL DISCS SHOULD BE IN SYNC BEFORE DOING THIS!
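
A quick way to confirm this is the DRBD status; every resource should report Connected and UpToDate/UpToDate:

root@filer01 ~# service drbd status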

Start heartbeat on node 2:

root@filer02 ~# service heartbeat start

Shutdown node 1:

root@filer01 ~# shutdown -h now

Watch /var/log/ha-log on filer02:

root@filer02 ~# tail -n 50 /var/log/ha-log

Check if you can login on https://10.10.11.100:446.

Start filer01 and do a manual failover:

root@filer01 ~# /usr/lib/heartbeat/hb_takeover

10. Test Recovery Of filer01 And filer02

Now we are going to see what happens if filer01 and filer02 are both destroyed and we have to rebuild everything from our replication node.

First shutdown filer01 and filer02:

root@filer01 ~# shutdown -h now

root@filer02 ~# shutdown -h now

Now set up two completely new installations for filer01 and filer02 from step 1 to step 3. From there on, our recovery will be slightly different from the installation.

 

10.1 DRBD Configuration

Copy the drbd.conf and lvm.conf file from filer03 to filer01 and filer02:

root@filer03 ~# scp /etc/drbd.conf root@filer01:/etc/drbd.conf
root@filer03 ~# scp /etc/drbd.conf root@filer02:/etc/drbd.conf
root@filer03 ~# scp /etc/lvm/lvm.conf root@filer01:/etc/lvm/lvm.conf
root@filer03 ~# scp /etc/lvm/lvm.conf root@filer02:/etc/lvm/lvm.conf

Initiate the lower resources:

root@filer01 ~# drbdadm create-md meta
root@filer01 ~# drbdadm create-md data

root@filer02 ~# drbdadm create-md meta
root@filer02 ~# drbdadm create-md data

Start DRBD on filer01 and filer02:

root@filer01 ~# service drbd start
root@filer02 ~# service drbd start

Set the lower drbd resources primary on filer01:

root@filer01 ~# drbdsetup /dev/drbd0 primary -o
root@filer01 ~# drbdsetup /dev/drbd1 primary -o

Create the DRBD metadata on the stacked resources:

root@filer01 ~# drbdadm --stacked create-md meta-U
root@filer01 ~# drbdadm --stacked create-md data-U

Enable the stacked resources:

root@filer01 ~# drbdadm --stacked up meta-U
root@filer01 ~# drbdadm --stacked up data-U

At this point DRBD will recognize the inconsistent data and start to sync from filer03.

root@filer01 ~# service drbd status

drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by phil@fat-tyre, 2010-01-13 17:17:27
m:res cs ro ds p mounted fstype
... sync'ed: 0.2% (11740/11756)M
... sync'ed: 1.8% (11560/11756)M
... sync'ed: 35.7% (351792/538088)K
... sync'ed: 6.1% (509624/538032)K
0:meta SyncSource Primary/Secondary UpToDate/Inconsistent C
1:data SyncSource Primary/Secondary UpToDate/Inconsistent C
10:meta-U^^0 SyncTarget Secondary/Secondary Inconsistent/UpToDate C
11:data-U^^1 SyncTarget Secondary/Secondary Inconsistent/UpToDate C

For the lower resources meta and data, filer01 is the SyncSource, while for the upper resources meta-U and data-U it is the SyncTarget. This shows us that the rebuild process has started.

While you wait for the synchronisation to finish, you can already prepare the configuration for Openfiler and its storage services.

 

10.2 filer01 And filer02 Redo Configuration

As we have a fresh installation on filer01 and filer02 again, we need to redo the Openfiler configuration on these nodes, just as we did on filer02 and filer03 during the installation process.

Openfiler Configuration:

mkdir /meta
mv /opt/openfiler/ /opt/openfiler.local
ln -s /meta/opt/openfiler /opt/openfiler

Samba/NFS/ISCSI/PROFTPD Configuration Files to Meta Partition:

service nfslock stop
service nfs stop
service rpcidmapd stop
umount -a -t rpc-pipefs
rm -rf /etc/samba/
ln -s /meta/etc/samba/ /etc/samba
rm -rf /var/spool/samba/
ln -s /meta/var/spool/samba/ /var/spool/samba
rm -rf /var/lib/nfs/
ln -s /meta/var/lib/nfs/ /var/lib/nfs
rm -rf /etc/exports
ln -s /meta/etc/exports /etc/exports
rm /etc/ietd.conf
ln -s /meta/etc/ietd.conf /etc/ietd.conf
rm /etc/initiators.allow
ln -s /meta/etc/initiators.allow /etc/initiators.allow
rm /etc/initiators.deny
ln -s /meta/etc/initiators.deny /etc/initiators.deny
rm -rf /etc/proftpd
ln -s /meta/etc/proftpd/ /etc/proftpd

We need to disable the services that are handled by heartbeat again:

root@filer01 ~# chkconfig --level 2345 heartbeat on
root@filer01 ~# chkconfig --level 2345 drbd on
root@filer01 ~# chkconfig --level 2345 openfiler off
root@filer01 ~# chkconfig --level 2345 open-iscsi off

root@filer02 ~# chkconfig --level 2345 heartbeat on
root@filer02 ~# chkconfig --level 2345 drbd on
root@filer02 ~# chkconfig --level 2345 openfiler off
root@filer02 ~# chkconfig --level 2345 open-iscsi off

 

10.3 Retake Resources And Run Cluster Again

NOTE : ALL DISCS SHOULD BE IN SYNC AND FILER03 IN STANDBY BEFORE DOING THIS!

When the synchronisation process has finished, we can prepare the cluster to run the services on filer01 again. If you ran the cluster services on filer03 ( Step 11 ), you have to stop these services as described in Step 11.1 before you can continue.

Set the stacked resources primary on filer01:

root@filer01 ~# drbdadm --stacked primary meta-U
root@filer01 ~# drbdadm --stacked primary data-U

Mount the meta partition and generate a new haresources file with Openfiler:

root@filer01 ~# mount -t ext3 /dev/drbd10 /meta
root@filer01 ~# service openfiler restart

Now log in to https://10.10.11.101:446/ and start/stop some service you don't use to regenerate the /etc/ha.d/haresources file.

Then we can copy this file to filer02, start the heartbeat services on both machines and do a takeover.

root@filer01 ~# service openfiler stop
root@filer01 ~# scp /etc/ha.d/haresources root@filer02:/etc/ha.d/haresources
root@filer01 ~# service heartbeat start
root@filer02 ~# service heartbeat start
root@filer01 ~# /usr/lib/heartbeat/hb_takeover

After the network and filesystem resources have come up, you should see everything running fine again under the cluster IP 10.10.11.100.

You can check this by trying to login to https://10.10.11.100:446. Try a manual failover on filer02 now, too.

root@filer02 ~# /usr/lib/heartbeat/hb_takeover

 

11. Use Replication Node As Main Node

There are scenarios where you may want to use the replication node to deliver the storage, so you can keep services running until you recover the hardware for filer01 and filer02. This can even be done while filer01 and filer02 are recovering from filer03.

Set the drbd resources to primary and bring up the partitions:

root@filer03 ~# drbdadm primary meta-U
root@filer03 ~# drbdadm primary data-U
root@filer03 ~# mount -t ext3 /dev/drbd10 /meta
root@filer03 ~# /etc/ha.d/resource.d/LVM data start

At this point we are able to start Openfiler and the services we need, but first we need the virtual IP which the cluster used to deliver its services. We use the resource.d scripts from heartbeat to do this.

root@filer03 ~# /etc/ha.d/resource.d/IPaddr 10.10.11.100/24/eth0 start

Then start all the services you need on filer03:

root@filer03 ~# service openfiler start

 

11.1 Finished Replication, How To Turn The Replication Node Into Standby Again

First disable the services that you started on the machine ( openfiler, iscsi, etc. ):

root@filer03 ~# service openfiler stop

Give up the cluster IP by using the resource.d scripts from heartbeat again.

root@filer03 ~# /etc/ha.d/resource.d/IPaddr 10.10.11.100/24/eth0 stop

Unmount the partitions and bring drbd into secondary mode.

root@filer03 ~# umount /dev/drbd10
root@filer03 ~# /etc/ha.d/resource.d/LVM data stop
root@filer03 ~# drbdadm secondary meta-U
root@filer03 ~# drbdadm secondary data-U

After this you can take over all services on filer01 again as described in Step 10.3.

 

12. Add Another Storage Partition

12 GB isn't that much, so you might want to add more storage to your cluster at a later point.

This is a very easy process: first shut down the passive nodes and install your additional storage, then create an LVM partition on it with fdisk as described in Step 2. Note: You don't need another Linux type partition for configuration files, only another LVM partition.

After this you add your new partition to the drbd.conf file on each node.

Add this to the drbd.conf file on filer01 and copy it over to filer02 and filer03.

resource data2 {
 on filer01 {
  device /dev/drbd2;
  disk /dev/sdc1;
  address 10.10.50.101:7790;
  meta-disk internal;
 }
 on filer02 {
  device /dev/drbd2;
  disk /dev/sdc1;
  address 10.10.50.102:7790;
  meta-disk internal;
 }
}
resource data2-U {
 stacked-on-top-of data2 {
  device /dev/drbd12;
  address 10.10.50.100:7790;
 }
 on filer03 {
  device /dev/drbd12;
  disk /dev/sdc1;
  address 10.10.50.103:7790;
  meta-disk internal;
 }
}

Note: filer01 must be the active node for this to work!

Create the metadata on the lower resource before we start the upper resource:

root@filer01 ~# drbdadm create-md data2

root@filer02 ~# drbdadm create-md data2

Start the lower resource:

root@filer01 ~# drbdadm up data2

root@filer02 ~# drbdadm up data2

Make it primary:

root@filer01 ~# drbdsetup /dev/drbd2 primary -o

Create the upper resource and make it primary, too.

root@filer01 ~# drbdadm --stacked create-md data2-U
root@filer01 ~# drbdadm --stacked up data2-U
root@filer01 ~# drbdsetup /dev/drbd12 primary -o

Create the meta-data on filer03 and start the resource:

root@filer03 ~# drbdadm create-md data2-U
root@filer03 ~# drbdadm up data2-U

After this we are ready to add the new device to our existing LVM device and increase our storage. Note: Resizing the storage that you actually use on it is beyond the scope of this manual.

Now we create a PV on the new stacked resource device and add it to the existing VolumeGroup:

root@filer01 ~# pvcreate /dev/drbd12
root@filer01 ~# vgextend data /dev/drbd12
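
Check that the VolumeGroup now contains both physical volumes and shows the additional free space:

root@filer01 ~# vgdisplay data
root@filer01 ~# pvs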

Don’t forget to add your new device to your heartbeat configuration:

<?xml version="1.0" ?>
<cluster>
<clustering state="on" />
<nodename value="filer01" />
<resource value="MailTo::it@company.com::ClusterFailover"/>
<resource value="IPaddr::10.10.11.100/24/eth0" />
<resource value="IPaddr::10.10.50.100/24/eth1" />
<resource value="drbdupper::meta-U"/>
<resource value="drbdupper::data-U"/>
<resource value="drbdupper::data2-U"/>
<resource value="LVM::data"/>
<resource value="Filesystem::/dev/drbd10::/meta::ext3::defaults,noatime"/>
<resource value="MakeMounts"/>
</cluster>

Recreate the /etc/ha.d/haresources file like we've done before by restarting some unused service over the Openfiler GUI, then copy the new haresources file to filer02.

After this you can log into your Openfiler cluster IP and use the extended data storage. Instead of extending the existing VolumeGroup you could also just create another one; refer to Step 6 for this.
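
If you prefer a separate VolumeGroup over extending data, a minimal sketch ( the name data2 is just an example ) would replace the vgextend above:

root@filer01 ~# pvcreate /dev/drbd12
root@filer01 ~# vgcreate data2 /dev/drbd12

In that case remember to add a matching LVM::data2 resource to cluster.xml as well.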

 

Misc: Openfiler iSCSI Citrix Xen Modifications

Openfiler has some problems with the storage created by Citrix Xen, so after a reboot you will have problems adding and finding your LUNs. The main cause seems to be the AoE ( ATA over Ethernet ) service, which can be disabled with the following command. Do this on all 3 nodes.

chkconfig --level 2345 aoe off

Another problem seems to be the discovery of LVM devices by Openfiler. The LVM config I posted is good for a system with stacked resources, but probably not right for a DRBD-only system. The DRBD documentation mentions the following LVM configurations for DRBD and LVM, which expose only the DRBD or stacked DRBD resources to LVM:

filter = [ "a|drbd.*|", "r|.*|" ]

and

filter = [ "a|drbd1[0-9]|", "r|.*|" ]

as used in this howto. This exposes the devices /dev/drbd10 - /dev/drbd19 to LVM. If you need more devices, you have to adjust your LVM configuration accordingly. You can find the example configurations in the DRBD documentation.

Edit the /etc/rc.sysinit file at lines 333-337 and comment out these lines:

From:

     if [ -x /sbin/lvm.static ]; then
                if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
                        action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
                fi
        fi

to:

#    if [ -x /sbin/lvm.static ]; then
#                if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
#                        action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
#                fi
#        fi

Restart your filers now to apply the changes. You should now be fine discovering the iSCSI LUNs from your Citrix Xen systems.

 

Misc: Notes About Openfiler Clusters

Not all services are HA with this setup; some original configuration files which can be modified by Openfiler remain on the individual nodes' partitions. You can move these files to the meta partition in the same way as in Step 5 ( a sketch follows the list ). The affected files are:

  • /etc/ldap.conf
  • /etc/openldap/ldap.conf
  • /etc/ldap.secret
  • /etc/nsswitch.conf
  • /etc/krb5.conf
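
A sketch for one of these files, following the same mv/ln pattern as in Step 5 ( run this on the active node while /meta is mounted; on the passive nodes remove the original file and create only the symlink ):

root@filer01 ~# mv /etc/ldap.conf /meta/etc/
root@filer01 ~# ln -s /meta/etc/ldap.conf /etc/ldap.conf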

At the time of writing this howto, rPath Linux ( which Openfiler is based on ) ships heartbeat version 2.1.3, which in theory would be able to create n+1 clusters, but I haven't found any report of even basic CRM cluster configurations running successfully. I tried to create cib.xml files with the onboard script /usr/lib/heartbeat/haresources2cib.py, but the cluster did not start with them.

If you finished all steps of this howto successfully, it's time to have one of your favourite drinks; you earned it.
