Attendees: Brian Davies, Daniel Traynor, Duncan Rand, John Hill, Matt Doidge, Paige Winslowe Lacesso, Rob Currie (=> notes), Sam Skipsey, Ste Jones

Sam was having technical issues, so Rob led a round-table of those who had attended. Apologies, I forgot to copy the minutes from Vidyo; most of the important points have been added into these minutes, so I don't think too much has been lost.

Brian: ntr

Daniel: Speaking to vendors; will put out a tender for 2PB of storage (1PB IRIS, 1PB GridPP), with a separate compute purchase after that. Looking to deploy Lustre, and will aim to add the new storage to the existing Lustre instance. Had some instabilities in site storage; it seems to be working now, ahead of leave.
Matt: Plans to present this to IRIS? Answer: add it to Lustre and present it through StoRM (a new version of StoRM is coming in the summer; convince users to use POSIX to read and StoRM to write).
Duncan: Which community within IRIS are you supporting? Unclear as yet.

Duncan: Looking at the perfSONAR review.

John: At the ATLAS meeting on Thursday; will drop 40TB of storage and (re)install it as an XCache. Has been in touch with Mark. If successful, will decommission DPM afterwards.
Matt: It would be good to have an idiot's guide to XCache. That is Mark's plan; he is waiting until XCache is working before publishing instructions. (A minimal XCache configuration sketch is appended at the end of these minutes.)

Matt: Waiting on DOME 1.13 (might install from epel-testing). Has been maintaining an old RAID which has gone funny (had to remove disks); plans to update the disk nodes. Rob questioned Matt on this: healthy paranoia. LSST are using storage, a couple of tens of TB.
Duncan: Can we see how much LSST data is at each site? Not really. Otherwise just plodding along.

Paige: Needs to ask Sam for help with looking into an XRootD storage element at the site.
Sam: On the todo list. This is something which works; it's just that we've never put something like this into production before.

Rob: Actual storage is working without issues; trying to keep up to date with the latest DPM (without DOME). Trying to avoid the migration to DPM+DOME for the time being due to reduced manpower. There is active RUCIO development going on at Edinburgh, and we've been in contact with the multi-VO and SKA RUCIO people at RAL. Rob will chat to the Edinburgh people and see if we can produce a few slides for the storage group describing the work which has been done.

Sam: Benchmarking/testing CEPH for the new site storage, with XRootD frontends.

Ste: Nothing much to report; not a lot of effort, so storage is plodding along. Not too clear what the plan for the future is: DOME, XCache, lots of technologies and a lack of manpower.
Rob: The deprecation of SRM is unfortunately driving the change, which will be required, but per-site discussions on each site's future are still needed.
Matt: DPM without DOME is deprecated in September, which means we should push people to move by then, as otherwise sites will be running deprecated software in production.

GridPP43:

Rob: As of September, SRM is not supported in DPM. Shall we try to have a clear message at GridPP, with this in mind, about what the future should be?
Matt: As GridPP43 is focussing on Tier-2s, we may end up with multiple storage-related talks there, so it would be good to all be on the same page and deliver the same message.
Sam: ATLAS are using SRM less, which is good. There was a lack of clear direction from the experiments until recently (the past 6 months).
Matt: CMS are using XRootD all the way, but LHCb have no clear direction yet.
Brian listed the sites impacted most (the smaller Tier-2s in the UK): Edinburgh, Liverpool, Brunel, RHUL.
Duncan: If sites are moving away, is DPM+DOME going to scale? What do we do with the mid-sized sites?
Matt: DPM starts to stress when the cluster gets too big.
Sam: Glasgow is moving to new kit and supporting CEPH as a replacement.
Duncan: Worth noting that the early grid models had cache-like storage at Tier-2s, and we seem to be moving back in that direction. How about dCache?
Matt: Lancaster used to use it. dCache has improved recently, but it used to be a configuration nightmare. Glasgow and QMUL are in a really good position to test/evaluate new technology, so we should see what strategy they follow.
Duncan: Given the large purchases, can we make a decision now on how to scale for the next few years, rather than later?
Matt: Unfortunately the storage at Lancaster was purchased to be integrated into the existing setup.
Dan: Can DPM be made to use a POSIX filesystem, with a new technology behind the scenes to cluster the storage?
Matt: When upgrading the disk nodes we can look at newer backend technologies.
Sam: We should really recommend common technologies to improve support.
Ste: Might this include ZFS?
Matt: Potentially, but it's the distributed part of the storage which is tricky, and that is what we would want to replace DPM with.
Duncan: HDFS seems to be favoured by the US.
Matt: XRootD supports HDFS and HTTPS, so HDFS+XRootD might be an option. (A rough configuration sketch is appended at the end of these minutes.)
Duncan: Should we be discussing this here or at Ambleside?
Matt: We can discuss it at the next GridPP, but manpower is being reduced, so large changes are difficult.
Sam: We've been saying for a while that large changes are painful/difficult, but this will be needed.
Duncan: On which technologies to use: Glasgow is following RAL with CEPH, so not breaking new ground. Should we be recommending CEPH to large sites?
Matt: But ECHO had a lot of FTE to support it, which may not scale to other sites.
Duncan: Isn't this the argument for moving storage away from the smaller sites?
Matt: Yes, but there's a lot of less well supported older storage out there.
Duncan: We need ATLAS/LHC to make a clear decision on what they want, which would help us.
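
Appendix: XCache configuration sketch

For reference, following John's round-table item and Matt's request for an idiot's guide: a minimal sketch of an XCache (XRootD proxy cache) configuration. The hostname, paths and sizing values below are illustrative assumptions, not John's actual setup; the plugin directives are the standard XRootD proxy-cache ones.

    # /etc/xrootd/xrootd-xcache.cfg -- minimal illustrative XCache config
    all.role server
    all.export /atlas
    # Load the proxy storage system and the disk-caching plugin
    ofs.osslib libXrdPss.so
    pss.cachelib libXrdFileCache.so
    # Upstream data source to cache from (hypothetical redirector)
    pss.origin atlas-redirector.example.org:1094
    # Local disk partition holding the cached files
    oss.localroot /data/xcache
    # Cache sizing (illustrative values): RAM buffer, then low/high disk watermarks
    pfc.ram 8g
    pfc.diskusage 0.90 0.95

With something like this in place, clients read through the cache endpoint and misses are fetched from the origin and retained on local disk, which is what makes it a candidate replacement for a full DPM storage element at smaller sites.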
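Appendix: HDFS+XRootD sketch

For reference, following the dCache/HDFS discussion: a rough sketch of how XRootD could front an HDFS cluster, assuming the OSG xrootd-hdfs plugin that US sites use. The plugin paths, export path and port are illustrative assumptions; HTTPS would additionally need TLS certificate directives on top of this.

    # Illustrative XRootD config fronting HDFS (assumes the OSG xrootd-hdfs plugin)
    all.role server
    all.export /store
    # Delegate the storage layer to HDFS instead of local disk
    ofs.osslib /usr/lib64/libXrdHdfs.so
    # Optional HTTP access via the standard XrdHttp protocol plugin
    xrd.protocol http:1094 /usr/lib64/libXrdHttp.so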