Chapter 2. YAIM Install

Table of Contents

Dcache install using YAIM on a fresh SL3 OS
Installed SL3.0.5 basic install + apt
Copy host certificates to correct location.
Installing Java
Create the relevant pointers to the rpm repositories
Install YAIM
Setup site-info.def file.
Checks
Firewall
YAIM install target
YAIM configure script
Updating the authorised user list

Dcache install using YAIM on a fresh SL3 OS

The quick and easy way to install D-Cache is to use YAIM (Yet Another Install Manager). This greatly simplifies the process of installing D-Cache on an LHC system, as it resolves the dependencies automatically and handles all the configuration for D-Cache within a single site-wide configuration file.

This document may move to http://www.dcache.org/ as the principal author of this document is joining the D-Cache team.

Following the instructions in this section is only valid if you are installing D-Cache for LHC. If this is not the case, we recommend skipping the next chapter and reading the following chapters, which cover manually installing D-Cache and provided the basis of the YAIM D-Cache integration.

Installed SL3.0.5 basic install + apt

Make sure /sbin and /usr/sbin are in your PATH.
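
For example, you can check the current PATH and, if necessary, extend it for the current shell (adjust your login scripts if you want this to be permanent):

echo $PATH
export PATH=$PATH:/sbin:/usr/sbin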

Copy host certificates to correct location.

The required openssl commands to generate the public and private keys from the .pfx (or .p12) certificate are:

openssl pkcs12 -in cert.pfx -clcerts -nokeys -out hostcert.pem
openssl pkcs12 -in cert.pfx -nocerts -nodes  -out hostkey.pem

mkdir -p /etc/grid-security
cp hostcert.pem hostkey.pem /etc/grid-security

Make sure that hostkey.pem is unencrypted.
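
A simple heuristic check is to look for the ENCRYPTED marker that an encrypted PEM key normally carries; the -nodes option used above should already have removed any passphrase:

grep ENCRYPTED /etc/grid-security/hostkey.pem && echo "hostkey.pem is still encrypted"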

chmod 400 hostkey.pem
chmod 644 hostcert.pem   

Installing Java

Install j2sdk-1_4_2_12-linux-i586.rpm by downloading the .bin file from the Java website. This is a self-extracting package that must be executed; it asks for acceptance of the Java licensing conditions and then releases an RPM. Installing the Java RPM directly using the rpm command line is simple and is shown below for the version of Java used here.
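
For example, assuming the downloaded self-extracting file is called j2sdk-1_4_2_12-linux-i586.bin (the exact name depends on the download), the extraction step is simply:

sh j2sdk-1_4_2_12-linux-i586.bin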

# rpm -i j2sdk-1.4.2_12-fcs.i586.rpm

Java 1.5 is now supported for use with D-Cache, but we recommend using the version currently recommended by LCG/gLite. If errors occur, filing a bug with D-Cache would be appreciated.
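
To confirm which Java package and version ended up installed, something like the following can be used (the install path is an assumption based on the RPM above; adjust it to your version):

rpm -q j2sdk
/usr/java/j2sdk1.4.2_12/bin/java -version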

Create the relevant pointers to the rpm repositories

echo 'rpm http://storage.esc.rl.ac.uk/ apt/datastore/sl3.0.5 stable obsolete' \
    > /etc/apt/sources.list.d/gpp_storage.list

echo 'rpm http://linuxsoft.cern.ch/ LCG-CAs/current production' \
    > /etc/apt/sources.list.d/glite-ca.list

echo 'rpm http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/ rhel30 externals Release3.0 updates' \
    > /etc/apt/sources.list.d/glite.list

The first repository is not strictly necessary but is used for GridPP-specific upgrades.

Install YAIM

apt-get update
apt-get install glite-yaim

Setup site-info.def file.

The glite-yaim package is an installation tool that takes a single configuration file for an entire cluster. The file

/opt/glite/yaim/examples/site-info.def

is distributed as an example site-info.def file and should be copied to a new location and edited with the following key value pairs.

MY_DOMAIN=your.domain
SE_HOST=srm.$MY_DOMAIN
RB_HOST=lxn1188.cern.ch
BDII_HOST=lxn1189.cern.ch
DCACHE_ADMIN="<FQDN of admin node>"
DCACHE_POOLS="<FQDN of admin node>:size:/pool"

In the above example site-info.def the D-Cache variables were set up to create a pool on the admin node. Other pool nodes can be added to the system at a later date. In production services the load on the admin node is often too high to also host pools on this node, even with quite small setups of only 4 or 5 hosts.

The SE_HOST variable identifies the Computing Element's (CE) close storage resource. It is used to optimise job execution at your cluster, as jobs are meant to go to hosts where the data is rather than randomly around the grid. For this reason its value should be set to the node hosting the SRM door. The CE did have a dependency upon the storage element as well, but I believe this is being phased out.

DCACHE_ADMIN is usually just set to the fully qualified host name.

DCACHE_POOLS requires a colon separating the host name from the pool's mount point. By default a pool will occupy all the space on its partition; if this is not desired, placing a size in gigabytes between the two colons will set the size of the pool on creation.

DCACHE_POOLS="poolhost.esc.rl.ac.uk:15:/pool"

This will add a pool on host poolhost.esc.rl.ac.uk with a size of 15 gigabytes, stored in the file system under the directory /pool. It will not change the sizes of existing pools, as that is a costly operation to script.

Many other variables can be used to configure D-Cache but these are optional.

# DCACHE_PORT_RANGE="20000,25000"
# Set to "off" if you don't want the door to start on any host
# DCACHE_DOOR_SRM="door_node1[:port]"
# DCACHE_DOOR_GSIFTP="door_node1[:port] door_node2[:port]"
# DCACHE_DOOR_GSIDCAP="door_node1[:port] door_node2[:port]"
# DCACHE_DOOR_DCAP="door_node1[:port] door_node2[:port]"

# Only change if your site has an existing D-Cache installed
# To a different storage root.
# DCACHE_PNFS_VO_DIR="/pnfs/${MY_DOMAIN}/data"

DCACHE_PORT_RANGE is used to set the Globus port range. It is needed for sites that cannot accommodate the "standard" Globus port range of opening all ports between 20000 and 25000 at the site firewall.

DCACHE_DOOR_SRM sets the host, and optionally the port, upon which the SRM service will listen for clients. If this variable is unset the SRM will be hosted on the admin node.

DCACHE_DOOR_GSIFTP, DCACHE_DOOR_GSIDCAP, and DCACHE_DOOR_DCAP operate in a similar way, taking a host and optionally a port. If these variables are unset the doors will be opened on the pool nodes listed in DCACHE_POOLS.

Particular care should be taken over the domain you set. Any errors at all in the site-info.def file may produce failures that are quite difficult to debug. Fortunately YAIM can be rerun multiple times without breaking the service.

Three variables control the scope of YAIM's action.

RESET_DCACHE_CONFIGURATION
RESET_DCACHE_PNFS
RESET_DCACHE_RDBMS

Each of these three variables limits (or should limit) the scope of YAIM's interaction. All fresh installs should set all three of these variables to "yes", but for upgrades sites should back up their configuration and set RESET_DCACHE_CONFIGURATION to "yes" with the other values set to "no". RESET_DCACHE_PNFS will lose all data from PNFS. PNFS is the D-Cache name server, so the file contents will not be lost, but the names identifying the files will be. RESET_DCACHE_RDBMS=yes will wipe all data used by D-Cache within PostgreSQL; this includes all PNFS data if the PostgreSQL-based PNFS is used.
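
For a fresh install the corresponding site-info.def entries would therefore be:

RESET_DCACHE_CONFIGURATION=yes
RESET_DCACHE_PNFS=yes
RESET_DCACHE_RDBMS=yes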

Admins may repeat all stages of D-Cache configuration after this step. This is sometimes required to cope with sites which have issues installing D-Cache.

Checks

A very important thing to check before continuing is that the `search xxx.yyy.ac.uk` entry in `/etc/resolv.conf` matches the output of `hostname -d`; otherwise the YAIM installation will fail, and this must be fixed before proceeding.
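
A quick way to compare the two values is:

grep '^search' /etc/resolv.conf
hostname -d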

Firewall

Turn the firewall off for the duration of the installation. A full list of ports that should be open for dCache is given in the "Setup Examples" chapter of this document. YAIM will not touch your site firewall configuration, as this is felt to be something that the site administrator will know better than a simple configuration script. To stop the firewall run the following command.

service iptables stop

YAIM install target

Use the YAIM install target `SE_dcache`:

/opt/glite/yaim/scripts/install_node ${PATH_TO_ADMIN_SPACE}/site-info.def ${TARGET}

YAIM takes two parameters for both the install and configure stages: the path to the site-info.def file, followed by the "target".

The target is an abstract identifier of the service the host will run and that YAIM should install. At the moment (2006/07/03) the value of ${TARGET} should be set to

glite-SE_dcache
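
For example, a concrete invocation might look like the following (assuming, purely for illustration, that the edited site-info.def was placed in /root):

/opt/glite/yaim/scripts/install_node /root/site-info.def glite-SE_dcache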

YAIM will soon decompose D-Cache into three separate services and will then have these three targets:

glite-SE_dcache_pool
glite-SE_dcache_admin_postgres
glite-SE_dcache_admin_gdbm

This reflects the dependencies of the various installs; the variables in site-info.def define what type of D-Cache node is installed. The "glite-SE_dcache_pool" target has the fewest dependencies and is intended for door and pool nodes. The targets "glite-SE_dcache_admin_postgres" and "glite-SE_dcache_admin_gdbm" merely reflect the two versions of PNFS available. Due to technical limitations of the gdbm database, D-Cache recommends the glite-SE_dcache_admin_postgres version. It may be worth using apt or yum to search for RPMs corresponding to more recent targets which may not yet be documented, since install targets map one to one with RPM "meta" packages. These "meta" packages contain only dependencies, so yum or apt will install all of their dependencies and thereby install D-Cache; YAIM provides some additional error checking in the install process.
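
One way to look for such meta packages, using the apt tools installed earlier, is a simple package search (this is a general apt query rather than a documented YAIM step):

apt-cache search glite-SE_dcache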

YAIM configure script

Run the YAIM configure script for ${TARGET} to configure D-Cache. If you wish to experiment with configuring D-Cache by hand, setting the environment variable DESYOVERRIDE=yes will prevent D-Cache being configured by YAIM.

/opt/glite/yaim/scripts/configure_node ${PATH_TO_ADMIN_SPACE}/site-info.def ${TARGET}  [ | tee /tmp/dcache_conf ]

The YAIM install includes all of the edg, vdt, postgres and pnfs software that is required to get everything up and running. The ${TARGET} will cause the script to set up everything that is required, including the PostgreSQL database and the pnfs and postgres users if the target is an admin node.

Current versions of YAIM have a bug when the output is piped to "tee": postgresql and pnfs are found to crash when the process is cancelled. It is not currently recommended to "tee" the output, although restarting the crashed services does help fix the issue.

The set-up of recent gLite versions of YAIM has changed the format of the user account and group account files. With a default YAIM install, users may find the configure_node script fails complaining about a missing users.conf file, as shown below:

Configuring config_users
/opt/glite/yaim/etc/users.conf not found.
Error configuring config_users

If this occurs it is worth copying the example files from the YAIM RPM into a newly created /opt/glite/yaim/etc/ directory, as sketched after the file list below.

/opt/glite/yaim/examples/groups.conf
/opt/glite/yaim/examples/groups.conf.README
/opt/glite/yaim/examples/users.conf
/opt/glite/yaim/examples/users.conf.README
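
A minimal sketch of that copy, assuming the example files listed above are present:

mkdir -p /opt/glite/yaim/etc
cp /opt/glite/yaim/examples/users.conf /opt/glite/yaim/examples/groups.conf /opt/glite/yaim/etc/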

It should be noted that some sites experience services failing after running the configure script. Two commonly reported issues are that pnfs is mounted twice, and that the SRM door has not started. The recommended first step is to check the mount points; if pnfs is mounted twice, unmount the pnfs file system and then try restarting D-Cache as shown below.
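
Before restarting, a quick way to check for a double mount (the pnfs mount point is site specific; /pnfs is typical) is:

mount | grep pnfs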

service dcache-pool stop
service dcache-core stop
service pnfs stop
service pnfs start
service dcache-core start
service dcache-pool start

For doors which do not start, it is also worth restarting dCache if the web interface shows doors offline for longer than 10 minutes. Alternatively a restart of the host operating system may help.

Updating the authorised user list

Although YAIM calls this process, it is occasionally useful to update the authorised user list manually.

grid-mapfile2dcache-kpwd is used to synchronise the grid map file, which is typically used by Globus utilities to map a certificate's distinguished name to a local user group and identity. D-Cache must import this user-to-VO table.

The following script should be placed in the directory "/etc/cron.hourly"; I suggest the name "grid-mapfile2dcache-kpwd".

#!/bin/sh
/opt/d-cache/bin/grid-mapfile2dcache-kpwd

Set the correct permissions on the cron job:

chmod 755 /etc/cron.hourly/grid-mapfile2dcache-kpwd

If you have not joined a VO yet, note that all admins should join dteam. For the moment, just add an entry to /etc/grid-security/grid-mapfile:

"/C=UK/O=eScience/OU=Edinburgh/L=NeSC/CN=greig cowan" .dteam