Attending: Brian, Jens, John B, Wahid, John H, Steve, Rob, Adam, Elena, Sam, Ewan, Matt D, Chris W
Special guest stars: Tom Byrne, Alastair Dewhurst, Bruno Canning (RAL T1)
Apologies: Tom W

0. Operations and blog posts
Matt D had a problem on DPM 1.8.9 which caused SAM tests to fail but did not affect jobs, as it hit the head node rather than the pools. So if you haven't upgraded to 1.8.9, don't do so just yet, but you weren't going to do that the last week before Christmas anyway, were you?
Five posts - still room for a few more - maybe your storage-related Christmas Programming Challenge? Can't spend all the time eating mince pies.

1. Reports from Argus and GDB last week (if we can find someone who attended, even remotely?)
Actually nobody attended, so we'd need to see if we can get a report by some other means.

2. Report from the T1 CEPH team
CEPH has the concept of a "gateway" which does what you think it does: you talk some protocol to it (S3, say) and it interfaces to RADOS and gets your data back to you. The xroot gateway can be distributed.
Found, perhaps unsurprisingly, uneven filling of OSDs with large files: it is a knapsack problem. CERN had the same problem and decided to chunk and stripe; RAL will play with it a bit and see what comes out.
There is a case for an S3 gateway, but the case seems to be Amazon funding BNL to do it, rather than a technical need. The RAL team sees it as an easy means to get some stress testing done (some time in 2015).
Generally aiming for a production infrastructure "some time in 2015", so on that timescale it will still need support for both GridFTP and xroot. There's also Brian Bockelman's interface to HDFS, although it will need access control added.
There were slides, which were probably sent to the list.
Bruno gave a talk about the configuration and setup of the cluster, to aid in development and testing of new clusters. Using Aquilon with Quattor.
Installation of a new node takes 20-25 minutes, and deployment into the cluster then takes another minute, so there is a reasonably rapid response to running out of resources (although it will of course take time for the system to rebalance itself).

3. AOB
Who is going to the xroot workshop in San Diego, 27-29 Jan? Apparently so far only Rob A from RAL T1.

Merry and Happy!

Bruno @ RAL: (17/12/2014 10:16:20) Hi All
Apologies for late entry, I was on another teleconference with CERN.
Jens Jensen: (10:17 AM) Hi Bruno
Welcome on board
wahid: (10:17 AM) you mean davix for ROOT
Ewan Mac Mahon: (10:23 AM) We should just use NFS. It's a tried and tested solution.
Tom Byrne: (10:23 AM) yes I did wahid, cheers!
Ewan Mac Mahon: (10:26 AM) Isn't the key advantage of S3 that it's a proper native object store interface, and has multiple widely supported implementations? Not sure anything else hits both of those points.
Tom Byrne: (10:26 AM) That was what I thought
Alastair Dewhurst: (10:27 AM) I took Jens' question to mean why politically, rather than why technically
also, I can't answer any technical questions
Ewan Mac Mahon: (10:27 AM) :-)
I'm afraid I've slightly lost the thread of why the S3 interface wasn't felt to be adequate - what was the problem that makes xroot better?
(possibly related) the AWS docs suggest that their S3 at least does support range requests for http accesses.
Bruno @ RAL: (10:44 AM) https://archive.fosdem.org/2014/schedule/event/hpc_devroom_aquilon/attachments/slides/499/export/events/attachments/hpc_devroom_aquilon/slides/499/FOSDEM14_HPC_devroom_04_Aquilon.pdf
wahid: (10:44 AM) CERN on the Huawei S3 appliance
Alastair Dewhurst: (10:45 AM) if you want, I can unmute and type in the background
wahid: (10:45 AM) had to add some things to get Range / vector reads to work on S3.
but then I think they got similar performance to xrootd
also at the last CHEP there was some BES talk where they used S3 with some success
but xrootd is super optimised to do what we want to do - it will be good / the best (whether we care and can use something more 'standard' is an open question)
Ewan Mac Mahon: (10:52 AM) I think if we can get http/S3 to the point where it's good enough, that has a big advantage on the client side - an xroot system is pretty much 100% PP specific; anyone else is basically stuck short of something like xrdcp file staging.
And "good enough for us, works at all for other people" is probably going to be the best possible case we can hope for in the future.
wahid: (10:56 AM) quite - and we have seen it is good enough
Samuel Cadellin Skipsey: (10:56 AM) I agree entirely. But also I have to flee to another meeting :(
Jens Jensen: (10:57 AM) Yes, we should be wrapping up...
wahid: (11:00 AM) Got to go - MERRY CHRISTMAS!
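
Note on item 2 (uneven filling of OSDs with large files): the effect, and why chunking and striping helps, can be illustrated with a toy simulation. This is only a sketch under stated assumptions. Uniform random placement stands in for CEPH's real placement algorithm (CRUSH), the file sizes and OSD count are made up, and fill_imbalance is a hypothetical helper, not part of any CEPH tool.

```python
import random


def fill_imbalance(file_sizes, n_osds, chunk=None, seed=0):
    """Return max/mean OSD fill after placing files at random.

    Sizes are in arbitrary units. With chunk=None each file lands whole
    on one OSD; otherwise it is cut into pieces of at most `chunk` units
    and each piece is placed independently (a crude model of striping).
    """
    rng = random.Random(seed)
    fill = [0] * n_osds
    for size in file_sizes:
        step = size if chunk is None else chunk
        remaining = size
        while remaining > 0:
            piece = min(step, remaining)
            # Assumption: uniform random choice, not real CRUSH placement.
            fill[rng.randrange(n_osds)] += piece
            remaining -= piece
    mean = sum(fill) / n_osds
    return max(fill) / mean


# Hypothetical workload: 200 files of widely varying size on 20 OSDs.
workload_rng = random.Random(2014)
sizes = [workload_rng.randint(1, 1000) for _ in range(200)]

whole = fill_imbalance(sizes, n_osds=20)
striped = fill_imbalance(sizes, n_osds=20, chunk=4)
print(f"whole files: max/mean fill = {whole:.2f}")
print(f"striped:     max/mean fill = {striped:.2f}")
```

With whole files, a handful of large files decides which OSD fills first. With small chunks, every OSD receives many similar-sized pieces, so the fullest OSD stays close to the mean. Real clusters will differ in the details (placement groups, replication, weights), so treat the numbers as qualitative only.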