Attending: Elena, Gareth, Wahid, Jens, Raja, Brian, Jeremy, Robert, Sam, Matt, Ewan, David, Adam, Chris W
Apologies: John H

0. Elena is seeing dashboard errors; Wahid suggests using xroot instead of rfio.

1. Raja joined to explain LHCb's use of T2 storage resources and to answer questions.
- "T2D" - selected T2s providing storage for D (Data Analysis?). Need >= 300 TB.
- Currently asking for files via SRM, using two space tokens.
- Preferred access and transfer protocols are file > xroot > dcap > rfio (and possibly others).
- Sites in the UK are currently Manchester and RALPP, but there will also be sites in other countries, including Brazil and Switzerland.
- Was planning four sites in the UK, but currently managing with two.
- Selection criteria unknown (see chat), as are the funding criteria, but Andrew McNab and Pete Clarke could say.
- Aiming not just for storage but also for reliability.
- Normal job processing will be as before, so there shouldn't be significant changes in the other sites' transfers.
- Using CERN FTS with RAL as backup (which means we won't have the statistics data, only the overall summaries; see links in chat).

2. Sam and Wahid gave summaries from CHEP:
- dCache and DPM "awesome."
- CMS reported their increasingly network-centric model, where they fetch a file from wherever it is - this, of course, can affect sites with limited bandwidth (and we're all limited somewhere) as they can see increased network usage.
- Brian B gave a plenary talk, emphasising "federation" as important, and pointing out that we don't do "big data" (see discussion in chat). We also don't do data-driven research in the sense that many industry players mean when they talk about big data. Nevertheless, we do have the volume...
- Talk about using Ceph as an object store (so not the filesystem), and Rucio seems to be able to use an object store.
- Huawei presentation on storage, focusing on ease of use.
- BES3 experiment, looking at S3, getting decent performance.
- People have also looked at Hadoop and relational databases for HEP, but not necessarily doing better than the current stuff (see also chat).
- HTTP: lightning talk and poster on using HTTP for transport, with a DMLite/Rucio plugin - Mario.
- HTTP "ecosystem", talk about DAV.

3. AOB
- Sites that have yet to upgrade their stuff - Brian tracking (end of Nov?).
- Jens out next Wed for dept'l meeting; someone else to chair.
- Sam reports ATLAS jobs fail with a bad magic number from the LFC; Elena will have a look.

Chat log:
[10:12:29] Jens Jensen May just be that Andrew = Manchester, and Raja = RAL?
[10:12:32] Ewan Mac Mahon There is an extent to which hosting LHCb is a bit of a liability - they're not paying as much for the disk space as ATLAS would.
[10:12:43] Ewan Mac Mahon So don't be too sad about it.
[10:13:18] Sam Skipsey Sure, but if funding for non-T2Ds reduces (for example) as a result of funding shifting to T2Ds, then those non-T2Ds will naturally reduce their share for LHCb.
[10:15:45] Ewan Mac Mahon Indeed. We're basically all ATLAS all the time now. Other VOs get a look in when ATLAS break something central.
[10:16:13] Sam Skipsey Well, we *had* been giving LHCb more than their fair share to be nice to them...
[10:16:37] Ewan Mac Mahon The PMB have been warned that their funding mechanisms are also communications, but they've not changed things (yet).
[10:18:46] Brian Davies main wlcg fts monitoring for lhcb:
[10:18:47] Ewan Mac Mahon Arguably the stricter 'you get what you pay for' approach is actually a reflection of a cloud computing model, and therefore clearly the way of the future.
[10:18:47] Brian Davies http://dashb-wlcg-transfers.cern.ch/ui/#date.interval=40320&grouping.dst=%28country,site%29&grouping.src=%28country,site%29&m.content=%28efficiency,errors,successes,throughput%29&technology=%28fts%29&vo=%28lhcb%29
[10:18:57] Wahid Bhimji we had a report in the ops meeting
[10:19:30] Wahid Bhimji Don't repeat what you said yesterday
[10:20:22] Raja Nandakumar LHCb data transfer rates:
[10:20:24] Raja Nandakumar http://lhcbweb.pic.es/DIRAC/LHCb-Production/visitor/systems/accountingPlots/dataOperation#ds9:_plotNames10:Throughputs9:_groupings11:FinalStatuss13:_timeSelectors5:86400s14:_OperationTypes35:putAndRegister,replicateAndRegisters7:_Sources488:LCG.BHAM-HEP.uk,LCG.Bristol-HPC.uk,LCG.Bristol.uk,LCG.Brunel.uk,LCG.Cambridge.uk,LCG.Durham.uk,LCG.ECDF.uk,LCG.EFDA.uk,LCG.Glasgow.uk,LCG.Imperial.uk,LCG.Lancashire.uk,LCG.Liverpool.uk,LCG.LT2-IC-HEP.uk,LCG.Manchester.uk,LCG.Oxford.uk,LCG.QMUL.uk,LCG.RAL-HEP.uk,LCG.RHUL.uk,LCG.Sheffield.uk,LCG.UKI-LT2-Brunel.uk,LCG.UKI-LT2-IC-HEP.uk,LCG.UCL.uk,LCG.UKI-LT2-IC-LeSC.uk,LCG.UKI-LT2-QMUL.uk,LCG.UKI-LT2-RHUL.uk,LCG.UKI-SCOTGRID-DURHAM.uk,LCG.UKI-SCOTGRID-ECDF.uk,LCG.UKI-SCOTGRID-GLASGOW.uks9:_typeNames13:DataOperatione
[10:20:38] Jens Jensen
[10:20:38] Raja Nandakumar Please use Firefox if possible.
[10:21:07] Raja Nandakumar And you can choose from various options in the left-hand pane as you prefer.
[10:21:15] Jeremy Coles CMS sites in the USA support just the one VO.
[10:22:05] Ewan Mac Mahon But their network pipes support just one VO and the entire rest of the institution.
[10:22:22] Sam Skipsey (We should note that *Ceph*/Inktank also don't consider the filesystem interface to be production.)
[10:22:35] Ewan Mac Mahon It's not like we haven't had pushback from university networking people about high usage in the past.
[10:22:51] Sam Skipsey Sure, but the difference is that in the US, sites tend to be single-VO.
[10:23:02] Sam Skipsey In the UK, not to as great an extent.
[10:23:08] Ewan Mac Mahon Personally, though, I think the UK network should be well up to it; it's a really good network.
[10:23:24] Christopher Walker Flying visit - have a meeting at 10:30.
[10:24:01] Christopher Walker Object store: don't you then need a database to go from the file you want to the object you have?
[10:24:10] Christopher Walker Not that this is a bad idea...
[10:24:26] Jeremy Coles http://indico.cern.ch/getFile.py/access?contribId=523&sessionId=10&resId=0&materialId=slides&confId=214784
[10:25:17] Ewan Mac Mahon @Chris AIUI that's exactly what DPM is turning itself into with the possible object-store-based DMLite backends.
[10:25:39] Sam Skipsey (Brian is of course biasest as one of the people who originally invented the diskless T3 thing)
[10:25:45] Sam Skipsey ...biased, even
[10:25:56] Ewan Mac Mahon He's the biasedest?
[10:27:06] Matt Doidge Isn't a biasest someone who's biased against biase?
[10:29:10] Christopher Walker Got to go. Bye
[10:29:11] Jens Jensen We have some Vs: volume, velocity, veracity. There's one more which is not variability but I can't remember.
[10:29:23] Sam Skipsey Variety, Jens.
[10:29:58] Jens Jensen Variety. Ta.
[10:30:55] Ewan Mac Mahon A lot of 'big data' solutions are applicable to our stuff too, with the likely exception of the map/reduce idea.
[10:31:22] Ewan Mac Mahon And that's not sensibly applicable to a lot of the things that people are using it for in an attempt to be all big data-y.
[10:31:32] Sam Skipsey Well, you *can* map/reduce stuff, if you change your file formats...
[10:31:55] Jens Jensen Some stuff might work better on grid; other stuff on Hadoop, and yet other things on SQL.
[10:32:57] Jens Jensen Finding out which stuff works best where is the key...
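[Editor's note: Sam's point that HEP data *can* be map/reduced once it is in a splittable, record-oriented format can be sketched minimally as below. The event records and field names are invented for illustration; this is toy Python, not Hadoop.]

```python
from functools import reduce
from collections import Counter

# Toy "events" in a record-oriented layout - the shape map/reduce wants,
# as opposed to one monolithic file. All field names here are invented.
events = [
    {"run": 1, "channel": "mu", "pt": 21.3},
    {"run": 1, "channel": "e",  "pt": 45.0},
    {"run": 2, "channel": "mu", "pt": 12.7},
    {"run": 2, "channel": "mu", "pt": 60.1},
]

# Map step: each selected event becomes a partial count keyed by channel.
mapped = [Counter({ev["channel"]: 1}) for ev in events if ev["pt"] > 15.0]

# Reduce step: merge the partial counts (associative, so it parallelises).
totals = reduce(lambda a, b: a + b, mapped, Counter())

print(dict(totals))  # {'mu': 2, 'e': 1}
```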
[10:33:33] Ewan Mac Mahon I do think one of the interesting things from the 'big data' workshop was the (clearly biasest) Intel figures on how well a flat Lustre system did on a Hadoop workload compared with HDFS.
[10:34:24] Ewan Mac Mahon A lot of Hadoop clusters seem to use rubbish networking because they're already jumping through the hoops to avoid using the network, rather than just using a decent network and being done with it.
[10:35:40] Wahid Bhimji https://indico.cern.ch/contributionDisplay.py?sessionId=10&contribId=523&confId=214784
[10:35:44] Wahid Bhimji is my summary slides
[10:36:25] Jens Jensen Thanks, Wahid.
[10:36:52] Wahid Bhimji I think the people you need to chase are not in here ... apart from Matt, who will do it
[10:39:04] Matt Doidge I'll take a look at Lancaster, it might have got lost in our interesting times.
[10:39:29] Jens Jensen Is that (bad (magic number)) or ((bad magic) number) :-P
[10:39:49] Sam Skipsey Get error: Failed to get LFC replicas: -1 (lfc_getreplicas failed with: 2704, Bad magic number)
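[Editor's note: Chris Walker's question at 10:24:01 - that an object store still needs a catalogue to go from the file you want to the object you have - can be sketched as below. The class, the filename, and the use of a content hash as object ID are all invented for illustration; in a real system (e.g. DPM's DMLite backends) the catalogue is a database, not a dict.]

```python
import hashlib

class ToyCatalogue:
    """Minimal namespace-on-top-of-object-store sketch (hypothetical)."""

    def __init__(self):
        self._names = {}  # logical filename -> object ID (the "database")
        self._store = {}  # object ID -> bytes (stands in for the object store)

    def put(self, lfn, data):
        # Content-addressed object ID; real stores may assign IDs differently.
        oid = hashlib.sha256(data).hexdigest()
        self._store[oid] = data
        self._names[lfn] = oid
        return oid

    def get(self, lfn):
        # Two lookups: name -> object ID, then object ID -> data.
        return self._store[self._names[lfn]]

cat = ToyCatalogue()
cat.put("/lhcb/data/run1234/file.dst", b"event payload")  # invented name
print(cat.get("/lhcb/data/run1234/file.dst"))  # b'event payload'
```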