Attending: Jens (chair+mins), Daniel, Winnie, John B, Steve, Govind, Marcus, Duncan, David, Gareth, Sam, Ewan, Matt, Brian
Apologies: Tom (Internet access issues)

0. Operational blog posts

Marcus sent his preferred address for blogging and has been invited to the blog.

Winnie's primary name node went kablooie and the fallback failed to operate properly. Luke fixed it; it is not clear why it broke, but that would be worth understanding.

Govind's issues were reported to the list. On the file system listing: since it dies for a specific file, it may be a database problem and Govind should run the dpm_dpck tool.
* Don't run it on a live system, or at least not with automated fixing turned on.
* Aim to use the next version, coming out in the next release, as it will have bugfixes and goodness.
* Sam will double check and mail the list.
* Ewan has run the tool. The suggestion is to run it manually, i.e. manually check the fixes it proposes; in Ewan's case it had suggested fixing things on non-existent and long-gone disk servers.

Govind's second issue was on the file listing: biomed want a file listing with timestamps. If the DPM listing is like CASTOR's nsls (with -lR switches), then it generates a directory name followed by a listing of the contents of that directory; these could be put together to provide full path names for files using something simple like an Emacs macro or a Perl script. In Govind's case, biomed only have 23K files.

As an aside, if each VO wants a different flavour of listing, that obviously makes life harder for us; it would be easier if a joint listing (such as syncat) could be produced and the local flavours derived from it. However, ATLAS had looked at syncat and declared it "too unwieldy" - something flexible might still be possible locally, though.

1. Storage related summary of last week's hepsysmen and women - particularly Sam's storage singalong on Friday

The background to this is that Sam has been nominated to speak for sites - all WLCG sites - at the coming WLCG event in Feb. The topic is "medium term evolution", meaning ~2-10 years. Presentations are expected from the experiments (a joint one!), from developers (likely led by dCache but obviously with input from others), and from sites (Sam). Sam's search for input from sites has until now been relatively fruitless, but Alastair Dewhurst from the T1 has promised input, and hepsysman was another opportunity.

Future thinking can include things like what future interfaces would look like and, if we also need to support broader user bases beyond WLCG, whether we should look at the "other" interfaces even if they are not standard ones and are less efficient.

Also relevant to the future is hardware evolution (for storage): Sam presented thoughts that RAID6 would no longer provide the protection it currently provides, due to longer rebuild times (see the illustrative numbers sketched below). Disks are expected to continue to grow in capacity over the next 10 years, but not in performance (cf. shingled drives). Other things may become attractive, like erasure codes, or specific implementations like HDFS or CEPH. Whatever it turns out to be, if it is not the same as today we will need a migration strategy, so this should be a focus for the workshop - and could also usefully be raised at the ATLAS Jamboree.

There is no DPM Skunk Works developing secret stealth DPMs or super high velocity back ends. RAL famously has a CEPH team that has been working for a while on getting CEPH into production; CERN have an EOS-CEPH interface, and RAL has been testing *->CEPH.
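To put rough numbers on the rebuild-time and replication-versus-erasure-coding points above, a quick sketch in python; all figures are illustrative assumptions, not measurements from any site.

# Illustrative only: assumed disk sizes and rebuild throughput, not benchmarks.
SEQ_WRITE_MB_S = 150  # assumed sustained per-disk rebuild rate in MB/s

for tb in (2, 8, 16):                          # nominal disk capacities in TB
    hours = tb * 1e6 / SEQ_WRITE_MB_S / 3600   # time to rewrite one whole disk
    print("%2d TB disk: ~%.0f h rebuild at %d MB/s" % (tb, hours, SEQ_WRITE_MB_S))

# Raw-capacity overhead of redundancy schemes (3x replication vs an 8+3 erasure code):
k, m = 8, 3
print("3x replication: 200%% overhead; EC %d+%d: %.0f%% overhead" % (k, m, 100.0 * m / k))

The point is that as capacities grow while per-disk throughput stays roughly flat, a single-disk rebuild stretches from a few hours towards a day or more, which is the window in which RAID6 has to survive further failures - hence the interest in erasure codes and object stores.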
For the grid we would of course need GridFTP, xroot, and BDII on top of any such back end - and maybe HTTP/WebDAV. There was an attempt at DPM-to-HDFS which did strange and slightly hacky things to the metadata. Is there a role for HDFS or CEPH in a future T2? Some T2s are already running HDFS, although they are predominantly in the US and use BeStMan. dCache has a CEPH backend and may be the better way to run a grid interface to CEPH. RAL has spent a lot of time getting its CEPH into production - far more time than a T2 could allocate to it - and a T2 might be able to benefit from that work, as long as things don't come out with "RAL" written on them (Quattor, anyone?). In which case, *if* CEPH is to be an option for a future T2, we need a volunteer T2 to test it as a secondary SE. A secondary SE could be added to ATLAS's AGIS [thanks to Brian for explaining]. Edinburgh currently have something like a second SE configured. Migration could be accomplished by marking the old SE read-only, after which churn would eventually move much of the data - in theory. Ewan, who is about to leave GridPP, cheerfully volunteered Glasgow to be such a site.

2. Catalogue topics revisited (cf. Govind's question and the ongoing saga of producing catalogues for the experiments)

Apparently the DPM listing is like the CASTOR one, which lists the directory name and then the list of files (or subdirectories) in said directory, e.g.

/castor/ads.rl.ac.uk/prod/biomed/disk/:
drwxrwxr-x 2 bio001 bio 0 Sep 18 22:17 01788d86-2629-4127-9799-335afee36e0d
drwxrwxr-x 2 bio001 bio 0 Mar 05 2015 026a272d-5310-4d15-9d72-4f6b6231ea5d
drwxrwxrwx 2 bio001 bio 0 Dec 11 2013 0bc26966-6ad1-4432-99e6-6bc10eb16de7
drwxrwxr-x 2 bio001 bio 0 Nov 07 2014 0c25fd70-58de-4f4f-ba4c-42f4c45618de
drwxrwxr-x 2 bio001 bio 0 Sep 18 21:13 17d17510-e029-4959-bee4-2ad333a0b8d7
drwxrwxr-x 2 bio001 bio 0 Jul 13 2012 232bcc1a-38d1-4cf8-9d7f-3ef40bdafded
drwxrwxrwx 2 bio001 bio 0 Jan 09 2014 27231445-5da2-43f4-bb2c-8366093b284e
drwxrwxr-x 2 bio001 bio 0 Jun 29 2015 28f628e4-bd1d-4273-9f54-076e4f326349

So what one needs to write is a script that takes this information, throws away what is not needed, and glues the filename onto the directory name. Simple! Whoever does it gets to choose a sensible scripting language (Perl, bash, python, scheme, ruby); a minimal sketch is appended after the chat log below. Maybe the timestamps need a bit of reformatting?

3. AOB

NOB

Chat log:

Steve Jones: (20/01/2016 10:03:43)
Went Kablui like the Nipigon Bridge?

Paige Winslowe Lacesso: (10:05 AM)
Kablooey

Steve Jones: (10:06 AM)
I stand corrected...

Ewan Mac Mahon: (10:20 AM)
The mental image that comes to mind for a 'joint position of all the experiments' is a game of Twister.
IMO there are two site positions: a) everything's going to have to change to be more Ceph; b) please don't change anything, we have neither money nor effort (with people holding both of those, not picking one).

Daniel Traynor: (10:28 AM)
Dynamic Disk Pools instead of raid6
https://github.com/mar-file-system/marfs

Samuel Cadellin Skipsey: (10:32 AM)
(This is because DPM doesn't actually have the manpower.)

Ewan Mac Mahon: (10:34 AM)
I think the long term future of DPM is that there isn't one.

Daniel Traynor: (10:34 AM)
marfs looks like an interesting solution to glue storage systems together + tiered storage; we have got ceph / lustre / hdfs solutions working

Samuel Cadellin Skipsey: (10:37 AM)
with StoRM on top, Dan?
Daniel Traynor: (10:37 AM)
on top of lustre, yes
I meant ceph / hdfs working at other sites

Brian Davies @RAL-LCG2: (10:38 AM)
https://twiki.grid.iu.edu/bin/view/Documentation/Release3/HadoopOverview

Ewan Mac Mahon: (10:38 AM)
The way this works with academic funding, though, is that you have to jump off the top of the cliff and hope someone funds the trampoline at the bottom before you get there. If you don't jump, it's seen as there being no need for the trampoline.

Daniel Traynor: (10:40 AM)
although Terry has had ceph working for his business for at least a year now (in production, earning money)

Ewan Mac Mahon: (10:42 AM)
Ceph clearly works. The questionable bit is the gridftp/xrootd interfaces. But they're not /that/ questionable, since the Tier 1 has basically committed to them. They did already jump off the metaphorical cliff, so they're going to have to make it work. AIUI they don't have a "let's call the whole thing off" option in the plan.

Steve Jones: (10:44 AM)
Let's draw straws!

Paige Winslowe Lacesso: (10:45 AM)
Sorry, another meeting, must -

Ewan Mac Mahon: (10:48 AM)
I think ceph is arguably the conservative option - ceph itself demonstrably works at scale and it's Red Hat backed, which is always nice. And assuming that the Tier 1 stick to (what I think is) the plan, then they'll definitely be doing that, so you'd be going the same way rather than another way, and regardless of the merits of the individual ways, the consistency has some value. Also, does HDFS do erasure coding? Because I'm assuming that Tier 2s aren't going to wear the cost increase of going for replication.

Duncan Rand: (10:51 AM)
There is always dCache...

Daniel Traynor: (10:51 AM)
let's have a storage bakeoff
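Appendix: a starting point for the script discussed in item 2, sketched in python (one of the sensible languages on offer). It assumes the listing looks like the nsls -lR output shown in item 2 and arrives on standard input; the nine-column field layout and the choice to keep files only are assumptions that would need checking against real DPM output.

# Minimal sketch (assumptions as above): flatten an nsls -lR style recursive
# listing read from stdin into "full path <TAB> timestamp" lines.
import sys

current_dir = None
for raw in sys.stdin:
    line = raw.rstrip("\n")
    if not line.strip():
        continue                       # blank separator lines between directories
    if line.endswith(":"):             # directory header, e.g. "/castor/.../disk/:"
        current_dir = line[:-1].rstrip("/")
        continue
    fields = line.split(None, 8)       # perms links user group size month day time/year name
    if current_dir is None or len(fields) < 9:
        continue                       # ignore anything that doesn't look like an entry
    perms, name = fields[0], fields[8]
    if perms.startswith("d"):
        continue                       # biomed want files, not directories
    timestamp = " ".join(fields[5:8])  # e.g. "Sep 18 22:17" or "Mar 05 2015"
    print("%s/%s\t%s" % (current_dir, name, timestamp))

Whether the mixed timestamp formats (HH:MM for recent entries, a year for older ones) need normalising is a question for biomed.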