Attending in joinological order: Jens, Gareth, Daniel, John H, John B, Marcus, Raja, Brian, Steve, Matt D, Duncan, Winnie, Sam, Elena, Ewan, Govind

Apologies in apolological order: Tom

Jens thanked Sam for the notes from last week and welcomed Marcus to the community.

0. Operational blog posts

Most recent blog post was a month ago by Brian. dCache 2.1.14 is out.

1. ATLAS space token cleanup and next steps (Brian?)

T2s still have PRODDISK and are cleaning it up, or will be ticketed by Brian soon. Cleanup is fairly slow. Sam discovered that DAVIX's rm (ie HTTP based delete) is quicker, as it seems to have received most attention from Fabricio and is parallelised properly.

Deletion needs to delete the actual file on the disk server, as opposed to just deleting the nameserver entry (a la unlink), because otherwise the file will sit on the disk server forever; there is nothing special to garbage collect it. This is also why it is slow: rfrm et al open a connection to the relevant disk server for each file, so clearing lots of small files will be slower than clearing the same volume of larger files. Sam will try to get order-of-magnitude numbers from the most recent single process deletion.

The DAVIX method, like the rfrm one, may not need certificate authentication if you are root on a trusted system such as the head node (thus potentially speeding up the process, although keeping the connection open to the disk server would be quicker still). The DAVIX method is expected to delete empty directories(?) but this will need checking.

ATLAS would like to recover the site ASAP but it is not a blocker if a site hasn't cleared its PRODDISK. There may be a few other test space tokens at 2-3 T2 sites which need clearing.

2. Syncat requirements and experiences follow up

Need to upload syncat file (even if it is not syncat format?) to specific location; John H will test, and so will Matt D, although Matt is currently focusing on the PRODDISK cleanup (which currently has 1.5TB left).
John got the script working by installing python 2.6 because it was written for 2.6 and the system had 2.7 by default. ATLAS want relative paths, ie relative to what ATLAS think is the endpoint, which may be the same as what your BDII publishes as the endpoint. This also assumes that there is a 1-1 correspondence between your endpoints and space tokens. Is it worth starting with the dCache sites? Or John H and Matt can have a go.

3. The Tier3 question (from 1.5 weeks ago) revisited?

The issue is with world readable files: what is the world? Are they certificate authenticated users (but otherwise anonymous, in the sense that no additional authorisation is needed) or unauthenticated anonymous users? DPM had a default of allowing unauthenticated read access to world readable files, but this is "fixed" in the most recent build, as it may be surprising to some VOs and can have other unintended consequences:
- the SE can be indexed by bots unless it presents a robots.txt file, which sites previously had trouble presenting because they could not publish files in the "canonical" location because of the fixed file path (such as /dpm) - can they present robots.txt now?
- Unauthenticated users may be given lower priority on the system

Our current recommendation is:
- turn off unauthenticated access (look for a ns anon key in two configuration files and remove them)
- only turn it on again (selectively) if asked by VOs, such as CERN@School for whom it was actually useful
- it is recommended to check with a browser (with and without certificates) whether it actually works :-)

4. Quick notes from last week's EGI Community Forum

- GridFTP presented as "workhorse" of WLCG data transfer, used also by Globus(Online) and EUDAT. Case for interoperation between implementations - not just grid SEs but also iRODS etc.
- New user communities such as EISCAT3D with 4PB/yr; as they are just starting their "grid" stuff, it is interesting to see their perceptions - eg "WLCG is dCache", and they haven't really thought about data transfers yet.
- There is a gLibrary developed by INFN for accessing data in EGI; previous versions assumed AMGA and Postgres but more recent versions should be more generic. In turn it was used by DARIAH to implement "Storing and Accessing DARIAH in EGI" = SADE.
- More generally EGI is interested in "citizen science" and open data, looking to EUDAT for the latter.
- Discussions between EGI, EUDAT, and IndigoDC - may be of interest for GridPP, ie moving data between projects - ideally if needed by user community, or if they are really low hanging fruit.

5. Anyone submitting anything to ISGC (cfp closes tomorrow (which will be today when we discuss it))

- Today is deadline for abstracts. ISGC is actually useful for "escience" type presentations because it brings together Asia, Europe, and (at least North) Americas; for example Australians do interesting stuff. However, it is a bit of a way to travel, of course (and to persuade your projects to fund, particularly if you also want to go to CHEPiX).

6. AOB

NOB

Matt Doidge: (18/11/2015 10:04:17) Lancaster is *still* deleting proddisk. Not really an issue, more of a mild pain in the bum.
John Bland: (10:05 AM) matt: are you running multiple rmrf in parallel on different paths?
Matt Doidge: (10:06 AM) I was at first in the top of proddisk just down to rucio now, I need to recurse up into it I think. Our problem is that we keep hitting files that were on apparently improperly drained pools. Sounds cool - what package gives these tools?
Samuel Cadellin Skipsey: (10:09 AM) https://dmc.web.cern.ch/projects/davix/documentation
Matt Doidge: (10:10 AM) Ta!
Elena Korolkova: (10:11 AM) there is also storage dumps which should be done monthly; atlas is asking for this
Ewan Mac Mahon: (10:19 AM) Interesting assumption.
John Hill: (10:23 AM) I will try it out properly later this week
Daniel Peter Traynor: (10:24 AM) works for cernatschool at QM https://se03.esc.qmul.ac.uk:8443/cernatschool.org/
John Bland: (10:25 AM) that's ATLAS's problem, surely
Ewan Mac Mahon: (10:25 AM) This is one of the occasions when 'world readable' does what it says. Our wiki lets you edit it with any grid cert, IIRC.
John Bland: (10:27 AM) why should I limit our DPM for all VOs based on ATLAS's lack of foresight
Raja Nandakumar: (10:28 AM) Apologies - I was offline briefly. Could you repeat the question (if still relevant)
Daniel Peter Traynor: (10:30 AM) in storm we have a per spacetoken setting e.g. STORM_CERNATSCHOOLORG_ANONYMOUS_HTTP_READ=true|false
Jens Jensen: (10:30 AM) Raja I know you need to leave now but the question was about dumps of SE contents - whether you need something from UK sites
Raja Nandakumar: (10:31 AM) No - we do not need the dumps - at least as yet
Jens Jensen: (10:32 AM) ok ta
Raja Nandakumar: (10:32 AM) We do not have the manpower / technology as yet to be able to use them. Thanks!
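
Note on item 2: ATLAS want the paths in a storage dump to be relative to the endpoint. A minimal sketch of that conversion is below, assuming the dump starts from a plain list of full namespace paths. The endpoint root and file names are made up for illustration, the helper names to_relative and dump_lines are hypothetical, and nothing here reflects the actual dump format or upload location, which still need to be confirmed with ATLAS.

```python
import posixpath


def to_relative(full_path, endpoint_root):
    """Return full_path relative to endpoint_root.

    Raises ValueError if full_path does not live under endpoint_root.
    """
    root = posixpath.normpath(endpoint_root)
    path = posixpath.normpath(full_path)
    if path != root and not path.startswith(root + "/"):
        raise ValueError("%r is not under %r" % (full_path, endpoint_root))
    return posixpath.relpath(path, root)


def dump_lines(full_paths, endpoint_root):
    """Yield one relative path per file, skipping files outside the endpoint."""
    for full_path in full_paths:
        try:
            yield to_relative(full_path, endpoint_root)
        except ValueError:
            # Not part of this space token's area, so leave it out of the dump.
            continue


if __name__ == "__main__":
    # Hypothetical endpoint root; use whatever ATLAS think your endpoint is.
    root = "/dpm/example.ac.uk/home/atlas/atlasdatadisk/"
    files = [
        root + "rucio/mc15/aa/bb/file1.root",
        "/dpm/example.ac.uk/home/other/file2.root",
    ]
    for line in dump_lines(files, root):
        print(line)
```

Doing the prefix check on normalised paths means a trailing slash on the endpoint root does not matter, and a sibling directory that merely shares a name prefix with the endpoint is not included by mistake.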