Installing and Configuring Openfiler with DRBD and Heartbeat


Introduction

Openfiler is a high performance operating system tailored for use as a SAN/NAS appliance. This configuration will enable two Openfiler appliances to work in an Active/Passive high availability scenario.

 

Requirements

Hardware

  • 2 x boxes that meet the minimum spec of Openfiler’s hardware specifications.
  • 2 x ethernet interfaces in each box
  • Openfiler 2.3 installation media.
  • Both boxes should have drives of the same size to avoid replication inconsistencies.

Software

Install Openfiler 2.3 on both boxes utilizing a disk setup such as the following:

  • 3 GB root (“/”) partition
  • 2 GB “swap” partition
  • 512 MB “/meta” partition (used for DRBD0)
  • Data partition configured as an unmounted LVM (used for DRBD1)

 

Configuration

Network

Each Openfiler appliance has two NICs: one connected to the LAN, and one connected directly to the other node by cable for replication. The LAN interface is also used for administration, to reach each node individually.

A third, “virtual” address is managed by the Heartbeat service; it is the address that computers on the LAN use to reach the active node.

Below is what is used:

filer01

  • LAN Interface (eth0) 192.168.1.18
  • Replication Interface (eth1) 10.188.188.1

filer02

  • LAN Interface (eth0) 192.168.1.19
  • Replication Interface (eth1) 10.188.188.2

HA NAS/SAN Address (eth0) 192.168.1.17

  • This is configured in the cluster.xml file (do not attempt to configure anywhere else)

 

Hostname Setup

For both nodes to be able to recognize each other by name, configure the hosts file on each computer.
Modify our /etc/hosts (on filer01):

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 filer01 localhost.localdomain localhost
10.188.188.2 filer02

Modify our /etc/hosts (on filer02):

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 filer02 localhost.localdomain localhost
10.188.188.1 filer01
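
Before moving on, it may be worth confirming that each node's hosts file really maps the peer's name to its replication address. The following is a minimal sketch; the helper name check_peer_entry is hypothetical (not part of Openfiler), and the file path is a parameter so it can be tried on a copy first.

```shell
#!/bin/sh
# check_peer_entry FILE NAME IP   (hypothetical helper)
# Succeeds when FILE has a non-comment line that maps IP to NAME.
check_peer_entry() {
    file=$1 name=$2 ip=$3
    awk -v ip="$ip" -v name="$name" '
        /^[ \t]*#/ { next }                 # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit !found }
    ' "$file"
}

# Example (on filer01):
#   check_peer_entry /etc/hosts filer02 10.188.188.2 && echo "peer entry OK"
```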

 

SSH Shared keys

To allow the two Openfiler appliances to talk to each other without having to use a password, use SSH shared keys.

On filer01:

root@filer01 ~# ssh-keygen -t dsa

Hit enter at the prompts (don’t set a password on the key).

On filer02:

root@filer02 ~# ssh-keygen -t dsa

Hit enter at the prompts (don’t set a password on the key).

The above command will generate a file called “id_dsa.pub” in ~/.ssh/, which is the public key that will need to be copied to
the other node:

root@filer01 ~# scp .ssh/id_dsa.pub root@filer02:~/.ssh/authorized_keys2
root@filer02 ~# scp .ssh/id_dsa.pub root@filer01:~/.ssh/authorized_keys2
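
Note that copying the public key straight onto authorized_keys2 replaces any keys already stored in that file. If the file may already contain keys, appending is safer. A minimal sketch, assuming the peer's public key has first been copied to a temporary location; the helper name append_pubkey and the /tmp/peer.pub path are hypothetical.

```shell
#!/bin/sh
# append_pubkey PUBKEY_FILE AUTH_FILE   (hypothetical helper)
# Appends the key to AUTH_FILE only if that exact line is not already there,
# then restricts the file to its owner.
append_pubkey() {
    pub=$1 auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}

# Example (on filer02, after copying filer01's key to /tmp/peer.pub):
#   append_pubkey /tmp/peer.pub ~/.ssh/authorized_keys2
```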

 

Configure DRBD

DRBD is what will keep the data between the two nodes consistent.

On filer01:

root@filer01 ~# mv /etc/drbd.conf /etc/drbd.conf.org

Then create a new /etc/drbd.conf (DRBD version 8) with the following contents:

global {
 # minor-count 64;
 # dialog-refresh 5; # 5 seconds
 # disable-ip-verification;
 usage-count ask;
}

common {
 syncer { rate 100M; }
}

resource cluster_metadata {
 protocol C;
 handlers {
  pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
  pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
  local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
  # outdate-peer "/usr/sbin/drbd-peer-outdater";
 }
 
 startup {
  # wfc-timeout 0;
  degr-wfc-timeout 120; # 2 minutes.
 }

 disk {
  on-io-error detach;
 }

 net {
  after-sb-0pri disconnect;
  after-sb-1pri disconnect;
  after-sb-2pri disconnect;
  rr-conflict disconnect;
 }

 syncer {
  # rate 10M;
  # after "r2";
  al-extents 257;
 }

 on filer01 {
  device /dev/drbd0;
  disk /dev/sda3;
  address 10.188.188.1:7788;
  meta-disk internal;
 }

 on filer02 {
  device /dev/drbd0;
  disk /dev/sda3;
  address 10.188.188.2:7788;
  meta-disk internal;
 }
}

resource vg0drbd {
 protocol C;
 startup {
  wfc-timeout 0; ## Infinite!
  degr-wfc-timeout 120; ## 2 minutes.
 }

 disk {
  on-io-error detach;
 }

 net {
  # timeout 60;
  # connect-int 10;
  # ping-int 10;
  # max-buffers 2048;
  # max-epoch-size 2048;
 }

 syncer {
  after "cluster_metadata";
 }

 on filer01 {
  device /dev/drbd1;
  disk /dev/sda5;
  address 10.188.188.1:7789;
  meta-disk internal;
 }

 on filer02 {
  device /dev/drbd1;
  disk /dev/sda5;
  address 10.188.188.2:7789;
  meta-disk internal;
 }
}

Both hosts need the same drbd.conf, so the drbd.conf file from filer01 will be copied to filer02:

root@filer01 ~# scp /etc/drbd.conf root@filer02:/etc/drbd.conf

Initialise metadata on /dev/drbd0 (cluster_metadata) and /dev/drbd1 (vg0drbd) on both nodes:

root@filer01 ~# drbdadm create-md cluster_metadata
root@filer01 ~# drbdadm create-md vg0drbd
root@filer02 ~# drbdadm create-md cluster_metadata
root@filer02 ~# drbdadm create-md vg0drbd

Note: if the commands above generate errors about needing to zero out the file system, use the following command:

root@filer01 ~# dd if=/dev/zero of=/dev/sda3

Be careful with this command: it overwrites the target partition, so make sure it points at the correct device.
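
The dd command above has no size limit, so it overwrites the whole partition and can take a long time on a large device. Zeroing only the start of the partition is often enough to clear an old filesystem signature, although that is an assumption worth verifying on your own setup. A minimal sketch with a guard against mounted targets; the helper name zero_head and its 128 MiB default are hypothetical.

```shell
#!/bin/sh
# zero_head TARGET [MIB] [MOUNTS_FILE]   (hypothetical helper)
# Overwrites only the first MIB mebibytes of TARGET (default 128) with zeros.
# Refuses to run if TARGET appears as a mounted source in MOUNTS_FILE
# (default /proc/mounts).
zero_head() {
    target=$1 mib=${2:-128} mounts=${3:-/proc/mounts}
    if awk -v t="$target" '$1 == t { hit = 1 } END { exit !hit }' "$mounts"; then
        echo "refusing: $target is mounted" >&2
        return 1
    fi
    # conv=notrunc keeps dd from truncating TARGET when it is a regular file.
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=notrunc 2>/dev/null
}

# Example: zero_head /dev/sda3
```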

Before starting the DRBD service, make sure that the partition used for drbd0 (in the cluster_metadata resource in the drbd.conf file) is not already mounted (which it will be by default if it was created during the installation).

root@filer01 ~# umount /dev/sda3

Now, start DRBD on both hosts:

root@filer01 ~# service drbd start

root@filer02 ~# service drbd start

If all goes well, they should connect and running “service drbd status” should present output similar to the following:

root@filer01 /# service drbd status

drbd driver loaded OK; device status:
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by phil@mescal, 2008-04-24 13:29:44
m:res cs st ds p mounted fstype
0:cluster_metadata Connected Secondary/Secondary Inconsistent/Inconsistent C
1:vg0drbd Connected Secondary/Secondary Inconsistent/Inconsistent C
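
Rather than reading the status table by eye, the check can be scripted. The sketch below assumes the column layout shown above (resource, cs, st, ds, p); the function name drbd_ready is hypothetical.

```shell
#!/bin/sh
# drbd_ready   (hypothetical helper)
# Reads "service drbd status" output on stdin. Succeeds only if at least one
# resource line is present and every resource is Connected and
# Secondary/Secondary. Resource lines look like "0:name cs st ds p ...".
drbd_ready() {
    awk '
        /^[0-9]+:/ {
            seen++
            if ($2 != "Connected" || $3 != "Secondary/Secondary") bad++
        }
        END { exit !(seen > 0 && bad == 0) }
    '
}

# Example:
#   service drbd status | drbd_ready && echo "both resources connected"
```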

Once both drbd resources are connected and both nodes are in Secondary state (as above), set a Primary node:

root@filer01 ~# drbdsetup /dev/drbd0 primary -o
root@filer01 ~# drbdsetup /dev/drbd1 primary -o

This should give you a status result of something like the following:

root@filer01 /# service drbd status

drbd driver loaded OK; device status:
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by phil@mescal, 2008-04-24 13:29:44
m:res cs st ds p mounted fstype
… sync’ed: 17.9% (247232/297152)K
0:cluster_metadata SyncSource Primary/Secondary UpToDate/Inconsistent C
1:vg0drbd PausedSyncS Primary/Secondary UpToDate/Inconsistent C

Note: if the vg0drbd LVM is large, it will take a long time to sync (perhaps overnight).

Enable DRBD to startup at boot:

root@filer01 ~# chkconfig --level 2345 drbd on
root@filer02 ~# chkconfig --level 2345 drbd on

Now create the cluster_metadata filesystem. This 512 MB partition will hold all of the Openfiler configuration data and the data for the services that should be available in HA (e.g. NFS, iSCSI, SMB).

root@filer01 ~# mkfs.ext3 /dev/drbd0

Don’t add this partition to /etc/fstab, as it is managed by Heartbeat (and will be configured shortly).

Configure LVM Partition

Set up /dev/drbd1 as a PV (Physical Volume) for the data volume group, which will be used to create Logical Volumes for data.

First, edit /etc/lvm/lvm.conf and modify the filter line:

From:

filter = [ "a/.*/" ]

To:

filter = [ "r|/dev/sda5|" ]

Note: Change /dev/sda5 to reflect the partition of your LVM. Also remember to apply these changes on both filer01 and filer02.
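
Since the same edit has to be made on both nodes, it can be scripted. A minimal sketch using sed, assuming lvm.conf contains a single uncommented filter line; the helper name set_lvm_filter is hypothetical, and a backup copy is kept next to the original.

```shell
#!/bin/sh
# set_lvm_filter CONF PARTITION   (hypothetical helper)
# Rewrites the active (uncommented) "filter = [ ... ]" line in CONF so that
# LVM rejects PARTITION. The original file is kept as CONF.bak.
set_lvm_filter() {
    conf=$1 part=$2
    cp -p "$conf" "$conf.bak"
    # '#' is used as the sed delimiter because PARTITION contains slashes.
    sed "s#^\([[:space:]]*\)filter[[:space:]]*=.*#\1filter = [ \"r|$part|\" ]#" \
        "$conf.bak" > "$conf"
}

# Example: set_lvm_filter /etc/lvm/lvm.conf /dev/sda5
```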

Create the LVM Physical Volume (only do this on the Primary node, as it will replicate to the Secondary node via DRBD):

root@filer01 /# pvcreate /dev/drbd1

Physical volume “/dev/drbd1” successfully created

Configure Heartbeat

As mentioned before, Heartbeat controls failover between hosts. Both nodes run the Heartbeat service, which sends out a heartbeat pulse on the secondary interface (eth1). If one node dies, Heartbeat detects this and promotes the surviving node to Primary (if it wasn’t already) using the startup scripts available in /etc/ha.d/resources.d.

Make modifications to /etc/ha.d/ha.cf and /etc/ha.d/authkeys. Make these changes on both nodes.

In /etc/ha.d/authkeys, add:

auth 2
2 crc

The /etc/ha.d/authkeys file may not exist and will need to be created, as it does not appear to exist in Openfiler 2.3.

Next, restrict permissions to authkeys to just “root”:

root@filer01 ~# chmod 600 /etc/ha.d/authkeys
root@filer02 ~# chmod 600 /etc/ha.d/authkeys
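
Creating the file and setting its permissions can be combined into one step that is easy to repeat on both nodes. A minimal sketch; the helper name write_authkeys is hypothetical, and it writes exactly the two lines given above.

```shell
#!/bin/sh
# write_authkeys [DIR]   (hypothetical helper)
# Creates DIR/authkeys (default DIR: /etc/ha.d) with the two lines used in
# this guide and mode 600.
write_authkeys() {
    dir=${1:-/etc/ha.d}
    (
        # The restrictive umask applies only inside this subshell, so a newly
        # created file is never briefly readable by other users.
        umask 077
        printf 'auth 2\n2 crc\n' > "$dir/authkeys"
    )
    chmod 600 "$dir/authkeys"
}

# Example: write_authkeys
```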

Create a /etc/ha.d/ha.cf on both nodes (needs to be identical on both, just like /etc/drbd.conf):

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
bcast eth1
keepalive 5
warntime 10
deadtime 120
initdead 120
udpport 694
auto_failback off
node filer01
node filer02
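
A quick sanity check of ha.cf can catch a missing node line or mistyped timings before Heartbeat is started. The sketch below is not a Heartbeat tool: the helper name check_ha_cf is hypothetical, it assumes the timing values are plain seconds as in the file above, and the keepalive < warntime < deadtime ordering it enforces is a rule of thumb rather than a documented requirement.

```shell
#!/bin/sh
# check_ha_cf FILE NODE...   (hypothetical helper)
# Succeeds if FILE has a "node" directive for every NODE and its timings
# satisfy keepalive < warntime < deadtime (values assumed to be in seconds).
check_ha_cf() {
    file=$1
    shift
    for n in "$@"; do
        if ! awk -v n="$n" '
                $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) ok = 1 }
                END { exit !ok }
            ' "$file"; then
            echo "ha.cf: no node directive for $n" >&2
            return 1
        fi
    done
    awk '
        $1 == "keepalive" { k = $2 }
        $1 == "warntime"  { w = $2 }
        $1 == "deadtime"  { d = $2 }
        END { exit !(k + 0 < w + 0 && w + 0 < d + 0) }
    ' "$file"
}

# Example: check_ha_cf /etc/ha.d/ha.cf filer01 filer02 && echo "ha.cf looks sane"
```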

Enable Heartbeat to startup at boot:

root@filer01 ~# chkconfig --level 2345 heartbeat on
root@filer02 ~# chkconfig --level 2345 heartbeat on

 

Openfiler Data Configuration

As mentioned above, a 512 MB partition was created to keep the configuration and HA services available during a failover. To get this working, copy the services and Openfiler configuration data over to the new partition, then symbolically link it back to its original location.

filer01:

root@filer01 ~# mkdir /cluster_metadata
root@filer01 ~# mount /dev/drbd0 /cluster_metadata
root@filer01 ~# mv /opt/openfiler/ /opt/openfiler.local
root@filer01 ~# mkdir /cluster_metadata/opt
root@filer01 ~# cp -a /opt/openfiler.local /cluster_metadata/opt/openfiler
root@filer01 ~# ln -s /cluster_metadata/opt/openfiler /opt/openfiler
root@filer01 ~# rm /cluster_metadata/opt/openfiler/sbin/openfiler
root@filer01 ~# ln -s /usr/sbin/httpd /cluster_metadata/opt/openfiler/sbin/openfiler
root@filer01 ~# rm /cluster_metadata/opt/openfiler/etc/rsync.xml
root@filer01 ~# ln -s /opt/openfiler.local/etc/rsync.xml /cluster_metadata/opt/openfiler/etc/

Then edit our /opt/openfiler.local/etc/rsync.xml file:

<?xml version="1.0" ?>
<rsync>
<remote hostname="10.188.188.2"/> ## IP address of peer node.
<item path="/etc/ha.d/haresources"/>
<item path="/etc/ha.d/ha.cf"/>
<item path="/etc/ldap.conf"/>
<item path="/etc/openldap/ldap.conf"/>
<item path="/etc/ldap.secret"/>
<item path="/etc/nsswitch.conf"/>
<item path="/etc/krb5.conf"/>
</rsync>

root@filer01 ~# mkdir -p /cluster_metadata/etc/httpd/conf.d

filer02:

root@filer02 ~# mkdir /cluster_metadata
root@filer02 ~# mv /opt/openfiler/ /opt/openfiler.local
root@filer02 ~# ln -s /cluster_metadata/opt/openfiler /opt/openfiler

Change /opt/openfiler.local/etc/rsync.xml to match the following:

<?xml version="1.0" ?>
<rsync>
<remote hostname="10.188.188.1"/> ## IP address of peer node.
<item path="/etc/ha.d/haresources"/>
<item path="/etc/ha.d/ha.cf"/>
<item path="/etc/ldap.conf"/>
<item path="/etc/openldap/ldap.conf"/>
<item path="/etc/ldap.secret"/>
<item path="/etc/nsswitch.conf"/>
<item path="/etc/krb5.conf"/>
</rsync>

 

Heartbeat Cluster Configuration

Then modify the /cluster_metadata/opt/openfiler/etc/cluster.xml config file. This config file generates the /etc/ha.d/haresources file, which tells Heartbeat what it should do in a failover.

filer01 only:

<?xml version="1.0" ?>
<cluster>
<clustering state="on" />
<nodename value="filer01" />
<resource value="MailTo::it@company.com::ClusterFailover"/>
<resource value="IPaddr::192.168.1.17/24" />
<resource value="drbddisk::" />
<resource value="LVM::vg0drbd" />
<resource value="Filesystem::/dev/drbd0::/cluster_metadata::ext3::defaults,noatime" />
<resource value="MakeMounts"/>
</cluster>

Note how the HA IP address is declared here (192.168.1.17). As mentioned before, Heartbeat controls the setup of the network interface, the mounting of the LVM volume group, and the mounting of drbd0 (/cluster_metadata).
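
Every resource element in cluster.xml has to be self-closing (ending in "/>") for the file to be well-formed XML. A rough line-based check is sketched below; it is not a real XML parser, and the helper name unclosed_resources is hypothetical.

```shell
#!/bin/sh
# unclosed_resources FILE   (hypothetical helper)
# Prints every "<resource ...>" line (with its line number) that does not end
# in "/>" and returns non-zero if any were found.
unclosed_resources() {
    if grep -n '<resource' "$1" | grep -v '/>[[:space:]]*$'; then
        return 1
    fi
    return 0
}

# Example:
#   unclosed_resources /cluster_metadata/opt/openfiler/etc/cluster.xml \
#       && echo "all resource elements are closed"
```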

 

Samba and NFS Support

Move the Samba and NFS configuration and state onto the /cluster_metadata DRBD resource so that they are available on whichever node is active.

filer01:

root@filer01 ~# mkdir -p /cluster_metadata/etc
root@filer01 ~# mv /etc/samba/ /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/samba/ /etc/samba
root@filer01 ~# mkdir -p /cluster_metadata/var/spool
root@filer01 ~# mv /var/spool/samba/ /cluster_metadata/var/spool/
root@filer01 ~# ln -s /cluster_metadata/var/spool/samba/ /var/spool/samba
root@filer01 ~# mkdir -p /cluster_metadata/var/lib
root@filer01 ~# mv /var/lib/nfs/ /cluster_metadata/var/lib/
root@filer01 ~# ln -s /cluster_metadata/var/lib/nfs/ /var/lib/nfs
root@filer01 ~# mv /etc/exports /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/exports /etc/exports

Note: This moves /var/spool/samba into /cluster_metadata, which is only a 512 MB partition. If large print jobs are put through Samba, the free space on this volume will be used up quickly. In that case, either create a separate DRBD resource for the /var directory or host print services on a different server.

filer02:

root@filer02 ~# rm -rf /etc/samba/
root@filer02 ~# ln -s /cluster_metadata/etc/samba/ /etc/samba
root@filer02 ~# rm -rf /var/spool/samba/
root@filer02 ~# ln -s /cluster_metadata/var/spool/samba/ /var/spool/samba
root@filer02 ~# rm -rf /var/lib/nfs/
root@filer02 ~# ln -s /cluster_metadata/var/lib/nfs/ /var/lib/nfs
root@filer02 ~# rm -rf /etc/exports
root@filer02 ~# ln -s /cluster_metadata/etc/exports /etc/exports
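
The commands in this section follow two repeating patterns: on filer01, move a path onto the shared partition and link it back; on filer02, delete the local copy and link to the shared path. They could be wrapped in two small helpers, sketched below. The names relocate and link_only are hypothetical, and the sketch assumes absolute paths are given.

```shell
#!/bin/sh
# relocate PATH [SHARED_ROOT]   (hypothetical helper, Primary-node pattern)
# Moves PATH under SHARED_ROOT (keeping its absolute layout) and leaves a
# symlink at the original location. Does nothing if PATH is already a symlink.
relocate() {
    src=${1%/} root=${2:-/cluster_metadata}
    [ -L "$src" ] && return 0
    mkdir -p "$root$(dirname "$src")"
    mv "$src" "$root$src"
    ln -s "$root$src" "$src"
}

# link_only PATH [SHARED_ROOT]   (hypothetical helper, Secondary-node pattern)
# Removes the local copy of PATH and points a symlink at the shared location.
link_only() {
    src=${1%/} root=${2:-/cluster_metadata}
    rm -rf "$src"
    ln -s "$root$src" "$src"
}

# Example:
#   on filer01: relocate /etc/samba
#   on filer02: link_only /etc/samba
```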

 

iSCSI Support

filer01:

root@filer01 ~# mv /etc/ietd.conf /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/ietd.conf /etc/ietd.conf
root@filer01 ~# mv /etc/initiators.allow /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/initiators.allow /etc/initiators.allow
root@filer01 ~# mv /etc/initiators.deny /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/initiators.deny /etc/initiators.deny

filer02:

root@filer02 ~# rm /etc/ietd.conf
root@filer02 ~# ln -s /cluster_metadata/etc/ietd.conf /etc/ietd.conf
root@filer02 ~# rm /etc/initiators.allow
root@filer02 ~# ln -s /cluster_metadata/etc/initiators.allow /etc/initiators.allow
root@filer02 ~# rm /etc/initiators.deny
root@filer02 ~# ln -s /cluster_metadata/etc/initiators.deny /etc/initiators.deny

 

FTP Support

filer01:

root@filer01 ~# mv /etc/proftpd /cluster_metadata/etc/
root@filer01 ~# ln -s /cluster_metadata/etc/proftpd/ /etc/proftpd

filer02:

root@filer02 ~# rm -rf /etc/proftpd
root@filer02 ~# ln -s /cluster_metadata/etc/proftpd/ /etc/proftpd

 

Configure Volume Group

filer01:

Create a Volume group from /dev/drbd1:

root@filer01 etc# vgcreate vg0drbd /dev/drbd1

Volume group “vg0drbd” successfully created

Note: If planning on using Windows to connect to these iSCSI targets, do not use the “_” character or any other special characters when creating the volume group.

Once the Heartbeat service has been configured and started (see below), the Openfiler web administration GUI should be available on https://192.168.1.17:446. Once there, LVM volumes can be created and exported via iSCSI, etc.

 

Starting Heartbeat and First-Time Configuration

In order to get Openfiler to write the /etc/ha.d/haresources file based on the cluster.xml config file, restart the Openfiler service and then log onto the web interface (using the Primary node’s direct IP) and click on Services and enable iSCSI.

Make sure to do this on the Primary node (filer01).

root@filer01 ~# rm /opt/openfiler/etc/httpd/modules
root@filer01 ~# ln -s /usr/lib64/httpd/modules /opt/openfiler/etc/httpd/modules

Note: If you use a 32-bit system, just take out the “64”.

root@filer01 ~# service openfiler restart

With any luck, Openfiler has now written out /etc/ha.d/haresources. If the file was created, copy it over to filer02.

Note: Before starting Heartbeat, a volume must be created:

root@filer01 ~# lvcreate -L 400M -n filer vg0drbd

Logging onto the web interface and activating a service such as NFS or iSCSI appears to force Openfiler to rewrite the /etc/ha.d/haresources file. Copy this file over to the second node via scp as follows:

root@filer01 ~# scp /etc/ha.d/haresources root@filer02:/etc/ha.d/haresources

Since heartbeat was added to the start up scripts earlier, reboot filer01, then reboot filer02.

If all goes well, the active node should be reachable in a web browser at the high-availability IP address: https://192.168.1.17:446.

If the web server is not accessible, a good place to look for errors is in /var/log/ha-log or /var/log/ha-debug files.
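
To narrow the logs down to likely problems, the relevant lines can be filtered out. The sketch below assumes Heartbeat marks such lines with the words ERROR or WARN; the helper name ha_problems is hypothetical.

```shell
#!/bin/sh
# ha_problems [LOGFILE] [COUNT]   (hypothetical helper)
# Prints the last COUNT lines (default 20) of LOGFILE (default
# /var/log/ha-log) that contain ERROR or WARN.
ha_problems() {
    log=${1:-/var/log/ha-log} count=${2:-20}
    grep -E 'ERROR|WARN' "$log" | tail -n "$count"
}

# Example: ha_problems /var/log/ha-debug 50
```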

Note: The rsync configuration is meant to synchronize changes between the two nodes, but this will not happen unless the most recent build is in use. If it is not, see the fix at the following page:

https://forums.openfiler.com/viewtopic.php?id=2380

Once Openfiler is up and running, delete the filer volume created earlier and then create new volumes (be sure to create the new volumes before stopping the Heartbeat service, or it will not start).
