Attending: Winnie, Jens, John B, Sam, Duncan, Steve, Chris B, John H, Robert, Elena, Matt D, Gareth, Ewan, Jeremy, Brian
Apologies: Wahid

0. Operational blog posts. So far we have a grand total of 0 blog posts this quarter.

No operational issues. Also no blog posts! Please remember to write something if you do (and only if you do) something Interesting(tm).

1. SE housekeeping. Should we spring clean our SEs and get rid of obsolete or inactive VOs? (Or are they just in the BDII?)

We have a number of VOs which are advertised by the SEs: some are closed, some presumed defunct, some inactive, and a few are mum despite their official contact having been contacted. We agreed it was not a huge problem; only a few "small" VOs are not small. So there are a few TB, or tens of TB, lying around: no biggie. However, can we delete the data? Ewan points out T2s are non-custodial, so in principle we can, yet users might expect data to be there even after being inactive for a little while. Jeremy points out EGI have a policy of marking a VO as unapproved if it is inactive for a while. Sam points out that the only "small" VOs that aren't small are T2K, HyperK, and Supernemo.

2. Status of DPM 1.8.9 - issues with installation, Matt's SAM test failures?

Matt had found the problem - a site configuration issue. So there don't seem to be any technical problems (other than the niggles Wahid pointed out in December), and the remaining question is whether the step from 1.8.8 to 1.8.9 is big enough that it would be worth completing the upgrade across GridPP prior to run 2. A further question is whether there are enough new goodies in 1.8.9 to warrant an upgrade. The baseline does not require 1.8.9 yet, and Brian doesn't think the baseline will change prior to run 2. New stuff in 1.8.9 includes changes in GridFTP redirect and performance (making it more efficient) and a move to puppet, and it excludes support for YAIM.

3. Other baseline things - are we ready for run 2?
We can't do IPv6; experiments are probably OK. There is the question about deletion rates. dCache: Chris will upgrade to 2.10; downtime is already scheduled for the 3rd week of Feb. Imperial have been running 2.10 for a while and Duncan reports that it works well, except for some initial IPv6 issues. However, holding one's breath waiting for RAL to switch to IPv6 is not recommended, so it should not be an issue for RALPP. Deletion rate over WebDAV may be a concern, but not one we can reasonably address within the time leading up to run 2.

4. For your entertainment and lunchtime reading (and perhaps discussion tomorrow?), here are a couple of Exascale papers you might find interesting:
http://www.eiow.org/home/E10-Architecture.pdf
http://arxiv.org/abs/1501.05367

Sam points out that we already do distributed checkpointing. If WLCG already has 1/2 exabyte, are we getting towards the exascale ourselves? Ian Bird writes (in the paper) that he thinks both ATLAS and CMS will have exabyte data within the timescale of SKA ramping up. How much disk could you fit into a rack these days? Probably up to 5PB. Jens points out that the climate community is also expecting growth (by pers. comm.), and will see if there are any papers referencing the growth.

Matt Doidge: (28/01/2015 10:06:53) Has a VO become "unapproved" that hasn't already become defunct? I think we regularly disapprove of VOs :-)
Ewan Mac Mahon: (10:07 AM) I don't think so, but we have recognised the defunctness of several.
Steve Jones: (10:09 AM) Here are the ones: babar, camont.gridpp.ac.uk, cedar, ltwo, minos.vo.gridpp.ac.uk, na48, ngs.ac.uk, supernemo.vo.eu-egee.org, totalep
Jens Jensen: (10:10 AM) https://www.gridpp.ac.uk/wiki/GridPP_approved_VOs https://www.gridpp.ac.uk/wiki/Maintaining_GridPP_approved_VOs
Matt Doidge: (10:13 AM) Are all the VOs in that list beyond the data "grace" period?
Steve Jones: (10:14 AM) https://www.gridpp.ac.uk/wiki/Policies_for_GridPP_approved_VOs
Jeremy Coles: (10:17 AM) I'll check the EGI process again with regards to data policy.
Matt Doidge: (10:17 AM) Turned out to be a site misconfiguration: missing xroot from shift.conf on our headnode. Having another issue on a disk node where gridftp isn't working.
Ewan Mac Mahon: (10:19 AM) I'm not massively enthusiastic about 1.8.9 at all. But no-one's relying on the gridftp change are they - none of the VOs are using that 'classic se' interface at scale.
Steve Jones: (10:22 AM) Just to cover the issue, I added this (clumsy) draft policy: "The collaboration should discuss and decide on the data and storage implications prior to the removal of any of the VO's data." Please edit it if you think that's wrong.
Matt Doidge: (10:22 AM) I have one node that can't access the dpm namespace via gridftp and I can't for the life of me figure out why.
John Bland: (10:23 AM) matt: did you puppet-config your 1.8.9?
Matt Doidge: (10:23 AM) No, unless I count as a Puppet :-) I largely did okay, but there was just the shift.conf edit I missed. Our reason for upgrading was trying to leverage the uniform dmlite logging - I had underestimated the impact of the upgrade. It's not great - a bit noisy even with LogLevel 1 set.
Ewan Mac Mahon: (10:29 AM) Actually, just jumping back a bit to the VO decommissioning discussion; Tier 2s are still 'non custodial', so we should avoid any implication that they're suddenly responsible for caring for a deleted VO's data more than they were while it was an active VO.
Matt Doidge: (10:29 AM) Nice for it to keep throwing the Token into the syslog
Samuel Cadellin Skipsey: (10:30 AM) Ewan: sure, I just feel bad about removing data for small VOs (as they don't have much of it anyway)
Ewan Mac Mahon: (10:30 AM) And any long term 'archival' storage that the project as a whole wants to do for a former VO should be done by the Tier 1 because they have the gear.
Matt Doidge: (10:31 AM) All VOs need to leave a forwarding address to have DVD-Rs with their data on posted to.
John Bland: (10:34 AM) matt: they should leave a SAE, we're not made of stamps
Steve Jones: (10:35 AM) Ewan: I agree on "custodial". But we would always "consider storage implications" prior to unapproving a VO, even if that means that we delete it anyway. ... in the end, after at least thinking about it.
Ewan Mac Mahon: (10:37 AM) That's different though. We would definitely think about it before unapproving a VO, we would not necessarily think about it before removing their data. Since we may well remove their data entirely by accident, or just as a convenience, that being the nature of non-custodial storage.
Matt Doidge: (10:38 AM) I'm pretty sure you can get a PB in a rack using the normal supermicro kit
Ewan Mac Mahon: (10:39 AM) About two, I think. 36 bays x 6TB drives x 10 servers ~= 2PB
Samuel Cadellin Skipsey: (10:39 AM) Yep, now consider you can buy 45 disk units
Steve Jones: (10:40 AM) Yes; prior to making them "unapproved", this would be considered. I.e. we would not unapprove them without doing so.
Samuel Cadellin Skipsey: (10:40 AM) So, you can possibly get up to 3 PB
Ewan Mac Mahon: (10:40 AM) OK; 45 bays x 6TB drives x 10 servers ~= 2.7PB
Samuel Cadellin Skipsey: (10:40 AM) Yeah.
Ewan Mac Mahon: (10:40 AM) Less a bit for RAIDage.
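
As a rough cross-check of the rack-capacity arithmetic in the chat above, here is a minimal Python sketch. The function names are made up for illustration, and the RAID overhead (2 parity drives per 15-drive group) is purely an assumption standing in for "less a bit for RAIDage"; nobody in the meeting specified a RAID layout.

```python
def rack_raw_capacity_pb(bays_per_server, drive_tb, servers_per_rack):
    """Raw (pre-RAID) capacity of one rack, in decimal petabytes."""
    return bays_per_server * drive_tb * servers_per_rack / 1000.0


def usable_capacity_pb(raw_pb, parity_fraction):
    """Capacity left after giving up a fraction of the drives to parity.

    parity_fraction is an assumed figure, not one quoted in the meeting.
    """
    return raw_pb * (1.0 - parity_fraction)


if __name__ == "__main__":
    # Assumed layout for illustration only: 2 parity drives per 15-drive group.
    assumed_parity_fraction = 2.0 / 15.0
    for bays in (36, 45):
        raw = rack_raw_capacity_pb(bays, drive_tb=6, servers_per_rack=10)
        usable = usable_capacity_pb(raw, assumed_parity_fraction)
        print(f"{bays} bays x 6TB x 10 servers: "
              f"raw {raw:.2f} PB, usable {usable:.2f} PB (assumed RAID overhead)")
```

The raw figures match the numbers quoted in the chat (about 2PB for 36-bay servers and 2.7PB for 45-bay units); the usable figures depend entirely on the assumed parity fraction.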