Brian, Bruno, Chris, David, Elena, Ewan, Gareth, Jens, John H, Matt, Raja, Raul, Robert, Sam, Steve, Tom, Wahid

0. Operational blog posts

1. Post DPM workshop

Wahid reported (and has also sent a report to the list; it should appear eventually).

Collaborations in general: CERN have contributed a lot; other sites perhaps less than planned, or sometimes differently than originally intended.

Load limiting was discussed: DPM is of the opinion that it should go into xroot, but then they said that the last couple of times, too.

DAV stability was discussed: well, at least the discussion has started.

YAIM and puppet: YAIM will still work for 1.8.9 but is deprecated; sites should switch to puppet. It would be useful to do a GridPP tutorial. Andrea's tutorial highlighted some minor gotchas. Later there will be a single metamodule to make it easier/simpler.

GridFTP redirection was discussed.

A longer term "solution" for "space tokens" or equivalent in non-SRM protocols was discussed: in the short term there will be a sort of reporting facility, which needs to be enabled. Here, too, one should be aware of certain gotchas if it is switched on.

xrootd4 is not in 1.8.9, as we have seen.

CMS stress test: Brunel worked with Fabricio, leading to improved metadata handling.

There should be a WebDAV SAM test (not critical), enabled by EGI. DMLite shell: more info in the Edinburgh workshop. Also real-time monitoring and improvements to logging.

2. Alistair Dewhurst guest speaker on Ceph

Alistair gave a presentation: http://storage.esc.rl.ac.uk/Ceph20141015.pdf

It is good to have T1 stuff coming out, particularly if a T2 in a few years' time will look like such a thing. Conversely, some T2s already have experience with Ceph which could be useful for the T1.

Direct IO with RADOS - but RADOS != POSIX. Discussed S3 interface, no CDMI.

If Rucio used GUIDs, these, too, could be the natural identifier for the object store? People use catalogues and metadata lookup anyway to find their GUIDs.
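
The GUID question above can be made concrete with a small sketch. It assumes a flat object store addressed only by GUID, a separate catalogue mapping logical names to GUIDs, and a copy of the human-readable name kept in the object's metadata for debugging (as suggested in the chat). The classes `ObjectStore` and `Catalogue` and the function `store_file` are hypothetical in-memory stand-ins; nothing here is Ceph, RADOS or Rucio API.

```python
import uuid


class ObjectStore:
    """Toy flat object store: objects are addressed by GUID only (assumption)."""

    def __init__(self):
        self._objects = {}

    def put(self, guid, data, metadata=None):
        # Store the payload together with a private copy of its key/value metadata.
        self._objects[guid] = (bytes(data), dict(metadata or {}))

    def get(self, guid):
        return self._objects[guid][0]

    def metadata(self, guid):
        return dict(self._objects[guid][1])


class Catalogue:
    """Toy catalogue: maps a logical (human-readable) name to a GUID."""

    def __init__(self):
        self._by_name = {}

    def register(self, logical_name):
        guid = str(uuid.uuid4())
        self._by_name[logical_name] = guid
        return guid

    def lookup(self, logical_name):
        return self._by_name[logical_name]


def store_file(catalogue, store, logical_name, data):
    """Register a name, then write the object under its GUID.

    The logical name is also kept in the object metadata, purely as a
    debugging aid; lookups go through the catalogue.
    """
    guid = catalogue.register(logical_name)
    store.put(guid, data, {"logical_name": logical_name})
    return guid
```

In this sketch a client never builds a path: it asks the catalogue for the GUID and fetches the object by that key, which is the lookup people are described as doing anyway.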
Concern about transfers streaming data through the FTS server itself: this will not scale. May need to keep GridFTP around for 3rd party copying (and, presumably, interoperation with other SRMs).

IPv6: could you run the infrastructure on a private subnet, akin to 192.168.0.0/16, then switch to "proper" IPv6 addresses once they become available?

3. AOB

Jens Jensen: (15/10/2014 10:00:11) I was talking...! Maybe my mike isn't working - again!
Gareth Douglas Roy: (10:00 AM) Couldn't hear you Jens
Bruno: (10:00 AM) I'm here. Bruno Canning from RAL
Samuel Cadellin Skipsey: (10:00 AM) apparently I'm also only vaguely understandable over voice, too. (but people can hear me)
Jens Jensen: (10:00 AM) Damn
Samuel Cadellin Skipsey: (10:00 AM) Yay Vidyo.
Jens Jensen: (10:00 AM) Zut alors! It's a desktop
Samuel Cadellin Skipsey: (10:01 AM) I should say, I couldn't hear the very quiet person.
Jens Jensen: (10:01 AM) I will try to phone in and keep the window open
Samuel Cadellin Skipsey: (10:01 AM) So it could just be the voices in Wahid's head.
Jens Jensen: (10:02 AM) :-)
Christopher John Walker: (10:03 AM) What command?
brian: (10:03 AM) gfal-ls
Christopher John Walker: (10:05 AM) [cwalker@lxplus0063]~% gfal-ls srm://se03.esc.qmul.ac.uk/dteam
  dugs-lhcb-testing
  s.fayer
  dbauer
  generated
  sdjp
  cj
  cjw-atlas.cern.ch.tar.gz
  SelUCCR.txt
That's the lxplus-ipv6.cern.ch
Samuel Cadellin Skipsey: (10:07 AM) (sorry Wahid ;) )
wahid: (10:07 AM) https://indico.cern.ch/event/324705/timetable/#20141009.detailed
John Hill: (10:11 AM) Wahid's mail just arrived
Ewan Mac Mahon: (10:11 AM) I'm sort of OK in principle with the move to puppet for pool nodes, but I'm really wary about it for the head node. Especially an existing head node.
Matt Doidge: (10:13 AM) I'm wary too, it all depends how reverse-engineerable and, well, modular the puppet module is.
Ewan Mac Mahon: (10:15 AM) I like the idea of doing a UK tutorial/workshop though.
If we could get the right people together we could quite possibly actually move people's production systems to puppet configs, rather than just teaching people how to. Which means if Wahid & Sam blow up our DPMs, it's totally not our fault :-)
Christopher John Walker: (10:17 AM) http://storage.esc.rl.ac.uk/Ceph20141015.pdf
wahid: (10:20 AM) live migration of production systems eh Ewan ... I'm not taking that responsibility. I indeed have also not braved the headnode but (at least on 1.8.9 release) and report..
Ewan Mac Mahon: (10:22 AM) Well, at some point, someone's got to do it, and notional responsibility aside (I don't actually think that's a thing we care about at all), I think it would be a lot easier doing it in a room full of knowledgeable supportive people than not. And we'll do Matt's one first.
wahid: (10:23 AM) PS - the Aussie guys' Ceph have several replicas and also on top of RAID - so they are not afraid of using resources (!) ..
Samuel Cadellin Skipsey: (10:23 AM) Well, if you *have* the money, Wahid...
Of course, on almost all POSIX filesystems that Rucio is writing to, its hashed directories *do not* have the effect of spreading load.
Jens Jensen: (10:31 AM) Kind of like SRM1 ... /ducks/
Samuel Cadellin Skipsey: (10:31 AM) (Since most of them are either not POSIX filesystems, or are filesystems that don't spread metadata load on a directory basis)
I just assumed the Rucio system was just to provide a "guaranteed unique" path.
Ewan Mac Mahon: (10:38 AM) Can you have 'hard links' in ceph - two names for one object? In ceph ceph, not cephfs ceph.
Samuel Cadellin Skipsey: (10:40 AM) Ewan: just reading the RADOS API, will get you an answer.
Gareth Douglas Roy: (10:42 AM) Can't you store the object named as a GUID and store the human readable "tag" in the key/value object metadata stored with the object?
Samuel Cadellin Skipsey: (10:42 AM) Not as far as I can tell, which makes sense for an object store.
Gareth Douglas Roy: (10:42 AM) Then it's not separate, it's still welded to the object
Samuel Cadellin Skipsey: (10:42 AM) Gareth: yeah, that's what cephfs does, basically. And the SRB ceph interface, too.
Gareth Douglas Roy: (10:42 AM) I think that's how WOS works as well...not Ceph but same idea and shows it scales
Samuel Cadellin Skipsey: (10:44 AM) The point was more to note that "making things look like posixy paths" was not the optimal approach in the first instance (esp when Rucio itself isn't that POSIXy)
Gareth Douglas Roy: (10:45 AM) Yes I agree, just working through the use cases. If the human readable path is for debugging then use a name that aids fast lookup and just store the "path" in the object metadata
Jens Jensen: (10:46 AM) Chris you're very faint...
Much better..
Ewan Mac Mahon: (10:47 AM) 16+2 = 18 fits nicely into the 36-bay supermicros, fwiw.
Christopher John Walker: (10:49 AM) Don't you need gridftp if you are transferring from a gridftp node?
Ewan Mac Mahon: (10:49 AM) That FTS thing isn't going to scale. If we're going to move to underlying storage systems that expect a pure client/server model and don't understand 3rd party copies, then we might need an FTS at each site.
Samuel Cadellin Skipsey: (10:50 AM) What's FTS' underlying meta layer? Is it still based on GFAL? (I ask because GFAL2-ceph is not a hard thing to write...)
wahid: (10:59 AM) Chris - is there any progress on the Dav on Storm? btw is there a ticket link tracking it..
Christopher John Walker: (10:59 AM) Not that I'm aware of. Certainly nothing in the ticket. I'll prod
wahid: (10:59 AM) ok - can you send the link (Someone was asking me if they should use storm to provide dav on posix)
Ewan Mac Mahon: (11:00 AM) I say 'baking it in forever', but can you switch a ceph from v4 to v6, even if it can't be both together?
Christopher John Walker: (11:01 AM) wahid: https://ggus.eu/?mode=ticket_info&ticket_id=105361
Samuel Cadellin Skipsey: (11:01 AM) Erm, it strongly implies in the manual that you can't easily (the ip addresses are used to determine identity)
Christopher John Walker: (11:01 AM) Actually there does look to be movement: https://issues.infn.it/jira/browse/STOR-607
Enrico Vianello added a comment - 30/Sep/14 11:01 AM
TO-DO: Update Milton version on branch fix/STOR-632 (currently it's 2.6.2.1-SNAPSHOT) and merge with develop
Samuel Cadellin Skipsey: (11:01 AM) so you can do so, but I suspect you'd have to "remove" the IPv4 nodes and "add" IPv6 versions of them one by one?
(Off the top of my head, have not reread the manual today ;) )
(Certainly, not a topic the manual wants to suggest.)
At least earlier this year, it was considered impossible to switch (http://blog.widodh.nl/2014/05/deploying-ceph-over-ipv6/ is a good blog post on the nature of the config)
Ewan Mac Mahon: (11:04 AM) This is a real problem. If you're building a storage-only cluster that is only accessed via gateways, then it's not at all, but if you're expecting to have rados accesses from worker nodes, then now is not a good time to build in a permanent reliance on IPv4.
Samuel Cadellin Skipsey: (11:04 AM) Indeed. The general feeling from the ceph community is "why aren't you already using IPv6 anyway?"
Ewan Mac Mahon: (11:07 AM) Right back at them - www.ceph.org has no AAAA record.
Gareth Douglas Roy: (11:07 AM) Sorry another meeting gotta go
Tom Whyntie: (11:08 AM) Thanks, bye
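
On the question under item 2 of a private IPv6 subnet "akin to 192.168.0.0/16": the closest IPv6 analogue to the RFC 1918 private ranges is the unique local address block fc00::/7 (RFC 4193). A minimal sketch using Python's standard `ipaddress` module shows how an address plan could be checked against these ranges. The function name `classify` and the sample addresses are illustrative assumptions, and the sketch says nothing about whether a Ceph cluster can be renumbered later, which the chat above suggests is not easy.

```python
import ipaddress

# RFC 4193 unique local addresses: the IPv6 counterpart of RFC 1918 space.
ULA = ipaddress.ip_network("fc00::/7")

# RFC 1918 private IPv4 ranges, including the 192.168.0.0/16 from the minutes.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def classify(addr):
    """Return a coarse label for an address string (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        return "ipv6-unique-local" if ip in ULA else "ipv6-other"
    if any(ip in net for net in RFC1918):
        return "ipv4-private"
    return "ipv4-other"


if __name__ == "__main__":
    # Sample addresses only; none of these come from a real site.
    for sample in ("192.168.1.10", "fd12:3456:789a::1", "2001:db8::1"):
        print(sample, classify(sample))
```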