Attending: Jens (chair + minutes), Winnie, Gareth, Sam, Brian, John H, Robert, John B, Ewan, David, Pete, Raul, Elena

0. Operational blog posts as usual, plus end-of-quarter stuff for Brian and Sam in particular

For the end-of-quarter reporting, Jens needs to report to the PMB on the publications etc.; since we had a lot of interesting events this quarter - ASGC, CHEP, HEPiX, hepsysman, and a WLCG workshop - there ought to be a year's worth of good stuff. Which is fine: this part of the system was designed to allow a year's worth of publications in a single quarter.

Speaking of reporting targets, we have five blog posts (Jens four, Brian one), which is below the target of eight; but then we also had the purdah, and there are only eight posts altogether for the whole year so far on the GridPP planet aggregator, two of which are from SSI (or are some missing?). So we are doing OK, relatively speaking.

Winnie raised an operational issue with the publication of storage paths. Many storage systems manage storage across aggregated resources and provide a unified addressing mechanism; Bristol, however, run GPFS, so they already have a unified namespace underneath their SE. The DPNS namespace is thus broadly similar to the GPFS namespace, except that you would probably chop a top-level piece off one and prepend something to the other - something like /dpm/gridpp/vo.gridpp.ac.uk/ <-> /gpfs/gridpp/vo.gridpp.ac.uk ? CMS and ATLAS do not use the path published in the information system, nor the "close SE" environment variables, but it still makes sense to publish this information because some tools and some VOs may use it.

1. Semi-regular update on new/small(ish)/non-LHC VOs, and update on WebFTS?

Update on DiRAC: ticking along, some 40-60 TB in 15K files, coming in at ~250 MB/s, with a desire to scale to ~400 MB/s if possible during the summer months, when students' Netflix and Facebook traffic is evicted from Durham's networks. Useful as another GridPP case study. The currently outstanding issues are:

(a) Proxy - probably needs a robot certificate, but the maximum lifetime of a VOMS proxy is 24 hours, which is a policy decision. Transfers can work by DN alone, so do not strictly need vomsification, but accounting would be improved with VOMS data.

(b) Ownership - a new user is needed for each of the other sites. Local file ownership at Durham is of course lost, since only the file is copied, not its associated metadata (as it would be if you had tarred it or dumped the filesystem), so perhaps the best solution is to maintain a file of chmod and chown instructions alongside the data - crude, but it would work (a rough sketch of such a manifest follows at the end of this item).

(c) Data security in general: if they start doing griddy things at the end-user level then we may need to start thinking about access control permissions - this may also be an issue for other VOs. For example, Hydra can manage fine-grained ACLs, but GridPP is not running Hydra.

(d) The BDII publishes the resource as D1T0 or D0T0 when it is actually D0T1 - probably a configuration issue; Brian has raised a ticket in RAL's helpdesk.

No other news on other VOs.
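As a purely illustrative sketch of the manifest idea in (b) - not an agreed tool, and the one-line-per-file format is invented here - something along these lines (Python) could record owner, group and mode for everything under a directory:

    #!/usr/bin/env python
    # Illustrative sketch only: walk a directory tree and write one line per
    # entry recording mode, owner, group and relative path, so that ownership
    # and permissions can be reapplied after a copy that only moves contents.
    import os, pwd, grp, stat, sys

    def write_manifest(root, out):
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                st = os.lstat(path)
                owner = pwd.getpwuid(st.st_uid).pw_name
                group = grp.getgrgid(st.st_gid).gr_name
                mode = oct(stat.S_IMODE(st.st_mode))
                # one chown/chmod "instruction" per line
                out.write("%s %s %s %s\n" % (mode, owner, group,
                                             os.path.relpath(path, root)))

    if __name__ == "__main__":
        write_manifest(sys.argv[1], sys.stdout)

A companion script (or a simple shell loop over the manifest) would then apply the corresponding chown and chmod calls at the receiving site, once Durham usernames had been mapped to local ones.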
2. Yes, let's discuss the CMS thing Brian sent around on 30 Jun 2015.

Background information - the CMS proposal for sites: https://twiki.cern.ch/twiki/bin/view/CMSPublic/SpaceMonSiteAdmin and, more generally, the syncat format for SEs, or as close as we can get: https://twiki.cern.ch/twiki/bin/view/LCG/ConsistencyChecksSEsDumps#Format_of_SE_dumps

Preparing a dump weekly and uploading it manually may be too onerous. Brunel had a go, dumping about 1M files, but ended up with too much metadata(?) and would like to see the requirements for the whole exercise (a rough sketch of what a minimal dump might look like follows at the end of this item).

Clearly the format is different for different SEs, which is not great, so we have some sympathy for CMS's position/proposal. The UK sites that advertise resources for CMS are (in the order dumped from the BDII) RALT1, QMUL, RALPP, Brunel, ECDF, RHUL, UCL, SHEF, IC, BRIS, DURHAM, OX, LIV, BHAM, LANCS, CAM, GLASGOW, MAN - i.e. pretty much everyone. It is also not 100% clear whether the dump is done by SE type or by underlying storage or both (for example, StoRM can manage several types of filesystem, whereas dCache is more of an integrated "solution" and CASTOR is Different(tm)).

To summarise our feedback arising from this discussion:

1. We would like to see the requirements documented, so T2s can see how they can contribute other than just providing technical feedback.
2. It would be best to use a single format, such as syncat.
3. We suggest that the catalogue be stored in a single well-defined location, e.g. /acct/syncat.xml. A VO may have more than one VO path - probably at a T1 - but this should be figure-out-able.

In that case, no extra tools would be required and everybody's lives would be easier.
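For illustration only - the element and attribute names below are made up, and a real dump would have to follow the schema described on the twiki pages above - producing a basic namespace listing need not be much more than a walk over the filesystem (or over the SE's namespace database):

    #!/usr/bin/env python
    # Illustrative sketch only: walk a filesystem and emit a simple XML
    # listing with one <entry> per file, giving its path and size. The tag
    # and attribute names here are invented for illustration, not taken from
    # the CMS/LCG twiki pages.
    import os, sys
    from datetime import datetime
    from xml.sax.saxutils import quoteattr

    def dump_namespace(root, out):
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<dump recorded=%s>\n' % quoteattr(datetime.utcnow().isoformat()))
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                size = os.stat(path).st_size
                out.write('  <entry name=%s size="%d"/>\n' % (quoteattr(path), size))
        out.write('</dump>\n')

    if __name__ == "__main__":
        dump_namespace(sys.argv[1], sys.stdout)

Since this streams one line per file it would cope with Brunel's ~1M entries; for a DPM or dCache the same information would more naturally come from the namespace database than from a filesystem walk, and dropping the result at a well-known location such as the /acct/syncat.xml suggested in point 3 would avoid any separate upload step.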
3. AOB

Sam and Ewan suggested a discussion on the technical layout of what a future T2 might look like. With the advance of CEPH and of archive drives (e.g. shingled storage), a T2 could support predominantly-read access patterns on such storage and thus achieve a better capacity/cost ratio than traditional high-end storage systems. Such a T2 would be like a Classic SE, providing GridFTP and xroot access. A test would need to be conducted at a useful scale to be, er, useful - say a PB. As there are risks associated with it, we need something like a business case: here is an opportunity, here are the risks, here is what we would plan to do and what it would cost. We then present the case to the PMB and see what they say. Ewan had started a document, but only just, so he will send it around to some people for suggestions and contributions.

Sam and Ewan had different things they wanted to test:
1. Ewan wanted a new setup, say the aforementioned PB, with a realistic but predominantly reading access pattern;
2. Sam wanted to look at the case of new disks in old chassis;
3. and also pointed out the need for a migration path to a classic SE of this form.

The outcome, other than the study itself, could be the model of a future T2 with even better bang for the buck than today. However, we recognise there are risks, and the presentation (document) to the PMB would need a careful risk register. For example, CERN have run a large-scale CEPH instance (30 PB of disk, around 7500 OSDs) and seen "some scaling problems"; at T2 scale, with far fewer OSDs, those problems should not apply. Also, although access patterns are currently different, Brian suggests that ATLAS and CMS may in the near future begin to use T2 disk with usage patterns similar to T1 disk (i.e. more scratch use), which would rather scupper the case for archival drives. And CEPH may need SSDs for the metadata/journalling: the T1 CEPH team are known to have "played it safe" on the hardware, and new nodes tend to have non-RAIDed disks but with SSDs for the metadata.

Paige Winslowe Lacesso: (01/07/2015 10:03:51)
I'm asking if the pool accounts USE environment variables that YAIM puts into grid-env.sh - it's NOT DPM-specific; I know CMS doesn't, and I think ops does not either? There are *no* DP-anything env vars set for pool accounts.
Partly, yes. Email reply = I could've studied it further & less sound interference - can you send that data in an email to me?

Ewan Mac Mahon: (10:13 AM)
It's also what a tarball does. But I think the 'fork' file idea with the metadata in a magic file would do fine. Especially if they can create them on their filesystem at backup time.

Jens Jensen: (10:14 AM)
Thanks...

Samuel Cadellin Skipsey: (10:14 AM)
Ewan: so, we've just invented the old OS7 filesystem from the 1990s? (Not that I object to this.)

Ewan Mac Mahon: (10:15 AM)
It does have its similarities. Though it could be a single file (basically an ls -lR) with the metadata for everything in, rather than a per-file resource. It's syncat with a /lot/ of knobs on, by the look of it.

Jens Jensen: (10:18 AM)
Could you do an XSLT to convert one XML format to another?

Ewan Mac Mahon: (10:19 AM)
Not quite sure why they need any of the knobs. The 'make a namespace dump' part shouldn't be too hard work, but the whole quasi-PhEDEx validation and upload stage looks rather over-cumbersome. (Also, on a minor point, I'm not sure I expect Tony Wildish to be subscribed to the gridpp-storage list, and, while I haven't asked, I'm not sure Daniela did either.)

Jens Jensen: (10:44 AM)
As I understand CEPH it needs to have a fairly large number of nodes... (from the T1 CEPH team). Or we increase the capacity by being able to buy more.

Ewan Mac Mahon: (10:48 AM)
I'm actually not too worried about migration, btw; there's a lot of churn on our ATLAS data, so if we marked the old thing offline and the new thing online a lot would shift 'naturally'. And then you'd FTS the rest. There are several ways to get 'cheap' kit, of course: there's low spec from a performance POV, and there's low spec from a resilience POV. You can get a lot of oomph for not too much money if you're using i7 CPUs, non-ECC RAM etc. We pay a significant premium for making our machines reliable, and maybe they don't need to be.

Jens Jensen: (10:51 AM)
But even the T1 will move D1T0 to CEPH.

Gareth Douglas Roy: (10:52 AM)
I agree, but I'm not sure how far down the cost scales.

Ewan Mac Mahon: (10:55 AM)
Be fun to find out though, wouldn't it? We've got this question from DB about why we're spending more than the ~£30/TB that he can buy a USB disk for, and we should try to answer it.

Robert Wolfgang Frank: (10:57 AM)
http://www.storagereview.com/seagate_archive_hdd_review_8tb