D-Cache is a flexible storage virtualisation system that can aggregate the storage resources of large numbers of computers into a coherent whole. Unfortunately, the more computers a D-Cache cluster uses, the more frequently some of them will fail or need upgrading.
When administering a D-Cache storage cluster, replacing computers and upgrading the system are an essential part of keeping the service running. This should be done without losing data that users have chosen to store within the D-Cache service. This chapter is about maintaining the service through upgrades and hardware failures.
D-Cache is a mature project and its configuration very rarely changes between releases. This helps the upgrade process: to upgrade D-Cache, all that is typically required is to shut down D-Cache, upgrade the RPMs, and then restart D-Cache. Rerunning YAIM or any other configuration assistant is not recommended, as the configuration typically does not require upgrading. Before you upgrade to a new release, do check with others who may already have tested the upgrade.
To stop D-Cache, run the following commands.
[root@dev01 root]# /etc/init.d/dcache-opt stop
Shutting down dcache services:
Stopping srmDomain (pid=29835) 0 1 2 3 4 5 6 7 Done
Stopping gridftpdoorDomain (pid=29681) 0 1 2 3 4 5 6 7 Done
Stopping gsidcapdoorDomain (pid=29760) 0 1 2 3 4 5 6 7 Done
[root@dev01 root]# /etc/init.d/dcache-pool stop
Shutting down dcache pool:
Stopping dev01Domain (pid=29943) 0 1 2 3 4 5 6 7 Done
[root@dev01 root]# /etc/init.d/dcache-core stop
Shutting down dcache services:
Stopping utilityDomain (pid=30662) 0 1 2 3 4 5 6 7 Done
Stopping httpdDomain (pid=30569) 0 1 2 3 4 5 6 7 Done
Stopping pnfsDomain (pid=30482) 0 1 2 3 4 5 6 7 Done
Stopping adminDoorDomain (pid=30394) 0 1 2 3 4 5 6 7 Done
Stopping doorDomain (pid=30309) 0 1 2 3 4 5 6 7 Done
Stopping dirDomain (pid=30220) 0 1 2 3 4 5 6 7 Done
Stopping dCacheDomain (pid=30122) 0 1 2 3 4 5 6 7 Done
Stopping lmDomain (pid=30044) 0 1 2 3 4 5 6 7 Done
[root@dev01 root]# /etc/init.d/pnfs stop
Shutting down dcache services:
Stopping Heartbeat .... Ready
Killing pnfsd . Done
Killing pmountd Done
Killing dbserver . Done
Removing 8 Clients 0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+
Removing 8 Servers 0+ 1+ 2+ 3+ 4+ 5+ 6+ 7+
Removing main switchboard ... O.K.
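The stop sequence above could be wrapped in a small script so that the order is not typed by hand each time. The following is only a sketch: the variables INIT_DIR, DRY_RUN and STOP_ORDER and the function stop_dcache are names made up for this illustration and are not part of D-Cache. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: stop the D-Cache init scripts in the order used in the transcript above.
# INIT_DIR, DRY_RUN, STOP_ORDER and stop_dcache are illustrative names only.
INIT_DIR="${INIT_DIR:-/etc/init.d}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them

# Order taken from the transcript: optional doors, pools, core services, then pnfs.
STOP_ORDER="dcache-opt dcache-pool dcache-core pnfs"

stop_dcache() {
    for svc in $STOP_ORDER; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$INIT_DIR/$svc stop"
        else
            # Give up at the first script that reports a failure.
            "$INIT_DIR/$svc" stop || { echo "failed to stop $svc" >&2; return 1; }
        fi
    done
}

stop_dcache
```

Running the script with DRY_RUN=0 as root would execute the init scripts instead of printing them.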
Typical D-Cache installs will include the following RPMs.
pnfs-3.1.10-15
d-cache-lcg-5.0.0-1
d-cache-opt-1.5.3-84
d-cache-gpp-v1.2.2-1
d-cache-core-1.5.2-83
d-cache-client-1.0-100
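Before upgrading, it can help to record which of these RPMs are actually installed on a node. A minimal sketch follows, assuming the packages keep the pnfs and d-cache name prefixes shown in the list above; filter_dcache_rpms is a name invented for this example.

```shell
#!/bin/sh
# Sketch: list the installed RPMs whose names start with "pnfs-" or "d-cache-".
# filter_dcache_rpms is an illustrative helper, not part of rpm or D-Cache.
filter_dcache_rpms() {
    # Reads package names on stdin, one per line, and keeps the matching ones.
    grep -E '^(pnfs|d-cache)-'
}

# "rpm -qa" prints every installed package; skip this step where rpm is absent.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa | filter_dcache_rpms | sort
fi
```

Saving the output before and after the upgrade gives a simple record of which versions changed.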
Once the new RPMs are downloaded, they can be upgraded using rpm, one by one or all in a single command. At this stage administrators should check the new release to see whether new mandatory fields have been added to the configuration system. D-Cache has a good record of informing users of their configuration upgrade path.
rpm -Uvh pnfs-3.1.10-15.i386.rpm \
    d-cache-lcg-5.0.0-1.i386.rpm \
    d-cache-opt-1.5.3-84.i386.rpm \
    d-cache-gpp-v1.2.2-1.i386.rpm \
    d-cache-core-1.5.2-83.i386.rpm \
    d-cache-client-1.0-100.i386.rpm
Because of bad previous experiences with the rpm command, I no longer use the approved upgrade option within rpm. Instead I remove each RPM and then install the fresh version, because on occasion an rpm upgrade has updated the RPM database without upgrading the files referred to in the database. This experience may be out of date, but to upgrade I follow the practice shown below for a single RPM.
[root@dev01 root]# rpm -e --nodeps pnfs
[root@dev01 root]# rpm -i ./oms/dcache_deploy/pnfs-3.1.10-15.i386.rpm
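The remove-then-install practice above could be captured in a small helper so that the same two steps are applied to each RPM in turn. This is a sketch only: reinstall_rpm and DRY_RUN are names invented here. When it is not in dry-run mode, it refuses to remove a package whose replacement file is missing.

```shell
#!/bin/sh
# Sketch: remove one RPM and install the fresh version, as in the example above.
# DRY_RUN and reinstall_rpm are illustrative names, not part of rpm or D-Cache.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them

# reinstall_rpm NAME FILE
#   NAME: name of the installed package, for example pnfs
#   FILE: path to the new .rpm file
reinstall_rpm() {
    name="$1"
    file="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "rpm -e --nodeps $name"
        echo "rpm -i $file"
        return 0
    fi
    # Never remove the old package if the new file is not there.
    if [ ! -f "$file" ]; then
        echo "missing RPM file: $file" >&2
        return 1
    fi
    # Install only if the removal succeeded.
    rpm -e --nodeps "$name" && rpm -i "$file"
}

reinstall_rpm pnfs ./oms/dcache_deploy/pnfs-3.1.10-15.i386.rpm
```

The same function can be called once per package, with DRY_RUN=0 set only after the printed commands have been reviewed.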
Once all the D-Cache RPMs have been upgraded, D-Cache should be started again.
[root@dev01 root]# /etc/init.d/pnfs start
Starting dcache services:
Shmcom : Installed 8 Clients and 8 Servers
Starting database server for admin (/opt/pnfsdb/pnfs/databases/admin) ... O.K.
Starting database server for data1 (/opt/pnfsdb/pnfs/databases/data1) ... O.K.
Waiting for dbservers to register ... Ready
Starting Mountd : pmountd
Starting nfsd : pnfsd
[root@dev01 root]# /etc/init.d/dcache-core start
Starting dcache services:
Starting lmDomain 6 5 4 3 2 1 0 Done (pid=12383)
Starting dCacheDomain 6 5 4 3 2 1 0 Done (pid=12455)
Starting dirDomain 6 5 4 3 2 1 0 Done (pid=12539)
Starting doorDomain 6 5 4 3 2 1 0 Done (pid=12620)
Starting adminDoorDomain 6 5 4 3 2 1 0 Done (pid=12705)
Starting pnfsDomain 6 5 4 3 2 1 0 Done (pid=12793)
Starting httpdDomain 6 5 4 3 2 1 0 Done (pid=12880)
Starting utilityDomain 6 5 4 3 2 1 0 Done (pid=12973)
[root@dev01 root]# /etc/init.d/dcache-pool start
Starting dcache pool:
Starting dev01Domain 6 5 4 3 2 1 0 Done (pid=13095)
[root@dev01 root]# /etc/init.d/dcache-opt start
Starting dcache services:
Starting gridftpdoorDomain 6 5 4 3 2 1 0 Done (pid=13182)
Starting gsidcapdoorDomain 6 5 4 3 2 1 0 Done (pid=13267)
Starting srmDomain 6 5 4 3 2 1 0 Done (pid=13350)
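The services are started in the reverse of the order in which they were stopped: pnfs first, then the core services, the pools, and finally the optional doors. A minimal sketch that derives the start order from the stop order is given here; INIT_DIR, DRY_RUN, STOP_ORDER, reverse_words and start_dcache are all names invented for this illustration, and by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: start the D-Cache init scripts in the reverse of the stop order.
# All variable and function names here are illustrative, not part of D-Cache.
INIT_DIR="${INIT_DIR:-/etc/init.d}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them

# Stop order as used earlier in this chapter.
STOP_ORDER="dcache-opt dcache-pool dcache-core pnfs"

# Print the arguments in reverse order, separated by single spaces.
reverse_words() {
    rev=""
    for w in "$@"; do
        rev="$w${rev:+ $rev}"
    done
    echo "$rev"
}

start_dcache() {
    for svc in $(reverse_words $STOP_ORDER); do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$INIT_DIR/$svc start"
        else
            # Give up at the first script that reports a failure.
            "$INIT_DIR/$svc" start || { echo "failed to start $svc" >&2; return 1; }
        fi
    done
}

start_dcache
```

Keeping a single list and reversing it means the stop and start orders cannot drift apart if a service is added later.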