Attending: Jens, Duncan, Elena, John B, Steve, Lukasz, Winnie, Raja, Marcus, Daniel, Tom, Sam, Govind, Pete

Apologies - Brian is away for Networkshop in Manchester

0. Operational blog post (you guessed it!)

Marcus has written some really nice blog posts about ZFS! Can one move a disk to another RAID controller? Wouldn't hurt with a few more from others, either.

Elena reports a problem with DPM: apparently copying a file with rfcp from the head node reports a "buffer overflow"; doing the same from a WN works. No entry in either the DPM or RFIO logs. In approximately increasing order of desperation one may suggest the following:
- check whether the namespace lookup is happening
- strace rfcp to check what it's doing
- check disk servers' logs?
- Anything in dmesg?
- tcpdump the network connection (sometimes a bit fiddly if it's on the same host)
- turn cores on (in the shell) in case rfcp dumps core

1. End of quarter/project milestoney things.

So we have three things this time:
- DiRAC Leicester moving data - they have a lot of directories but little data; apparently Jon is debugging something.
- ATLAS cleanup at sites: seems to be OK, but it is hard to determine whether sites have not cleaned up their old directories.
- And a related ATLAS thing, uploading the catalogues to the canonical location. This also seems to be happening; Alessandra has been following up and ticketing sites.
So generally they are not in a terrible state but there may still be a few pieces to mop up.

2. The round table - it's been two months since we last had one. Things you are working on or worried about (storage and data related stuff only, please!)

Marcus: ZFS blog. Local storage servers - extended storage. Are there concerns about the future of ZFS? The kernel driver has a different licence so needs to be managed independently, but ZFS is stable and works. Ubuntu may switch to ZFS?

Tom: working on user guide. Python API to DIRAC, cataloguing.
Integration with Ganga; Pravda is using Ganga (proton therapy, working with Mark at Bham). Expecting some nice stuff on Ganga at GridPP36.

Daniel: Lustre 2.8 - minor update.

Duncan: Federation of CMS in London. Expected to lead to increased load, e.g. on Brunel. ATLAS xrootd test.

Elena: Puppet modules for DPM? Asked for Ewan's scripts; check again with Kashif. Also others are interested (e.g. Liverpool).

Govind: Cleaning dark data. SL5 head node - could it be migrated to CentOS7? Sam points out that pool nodes are OK on CentOS7 but not head nodes, although it seems to be "only" configuration problems.

John: also looked at DPM puppet, tried CentOS7. Currently using SL6 so no great hurry. Problems were mostly missing packages, where for example the package was available only for the previous OS.

Lukasz and Winnie: all new hardware arrived, expecting to double storage capacity over the next couple of weeks.

Pete: chasing outstanding tickets. ATLAS tidy up, CMS AAA, HTTP.

Sam: Puppet stuff. More storage, more hardware. Also wants to restart the LIGO stuff: how to integrate their catalogue with our infrastructure.

Steve: same as John.

Jens: Workflow for provenance; GLUE2 for CASTOR (much neglected ticket) and CASTOR WebDAV with Rob Appleyard, Indigo (think iRODS++?), CloudExpo London (which includes some Cloud Security stuff).

3. AOB

Tom Whyntie: (23/03/2016 10:02:07) Hello - apologies in advance - I'm going to have to leave at 10:35am for a safety course.
Jens Jensen: (10:02 AM) ta
John Bland: (10:06 AM) also maybe check dmesg/system messages for anything else that's crashing, in case it's a system malfunction
Marcus Ebert: (10:15 AM) Apologies from me too, I'll have to leave at 10:30 today
Peter Gronbech: (10:22 AM) Alessandra has had a look at Oxford and we do have stuff to be deleted, but I'm not yet back up to speed on this
Kashif will be able to share the Oxford puppet scripts
John Bland: (10:30 AM) yes, please
Peter Gronbech: (10:30 AM) Although Ewan (with Sam's help) did all the work
NO SL7 failed
We used SL6 for both pool and head node in the end
Exactly, we needed a stable situation before Ewan left
Lukasz Kreczko: (10:41 AM) http://cassandra.apache.org/ ?