Present: Wahid, Stephen, John, Matt, Alessandra, Brian, Sam, David, Ewan, Chris, Santanu, Rob, Mark
Apologies:

1. The incredible exploding DPMs??

The bug seems to be with versions prior to 1.8.3. Lancaster saw the problem, as did RHUL, on 1.8.2. Could it be connected to certain usage of the DPM? Admins may cause it with draining, but users can also trigger it. Sam had one segfault with 1.8.2 but also has problems with 1.8.3, which had to be restarted a few times and stopped logging at some point. Should people be generating cores? The 1.8.2 crashes are understood; Sam's on 1.8.3 are not (yet). Sam will check for residual problems with 1.8.3; for people with many problems (Lancaster, RHUL) workarounds may be sufficient. 1.8.3 is the current version in EMI.

People can also try cleaning the requests table as Ewan suggested (they should in any case). Does the current cleaning actually clean all the requests in the tables? Sam seems to see that they keep growing anyway. At upgrade, Sam dumped and dropped his requests table.

gLite 3.2 is the repository, but will it not be supported after the end of the month? Ewan "upgraded" to EMI and found it fairly straightforward. Should everybody now move to EMI? Alessandra would prefer gLite on SL6 to minimise disruptions. Will EMI2 be only SL6? No, also SL5. An announcement was made that there were some problems with SL6 in EMI2 (and Debian), but SL5 was on track.

... and what happened at Cambridge? I am confused... The ops certificate may have caused the problem, when a WLCG certificate which probably wasn't used anymore was removed; Steve Lloyd submitted as ATLAS. Seems to be passing tests. A file was created and deleted but then requested again after an hour. Cambridge also get better transfer rates on the ATLAS sonar tests. There were some network problems recently, though these should have been fixed as of yesterday evening.

2. Round up of stuff from GridPP last week?

LFC/FTS incorporating - but some experiments are moving away from LFC. snoplus have their own catalogue anyway, but do use the LFC.
CERN also has an interest in NA62, which T1 also supports. Chris suggests having a document with the data model recommended for small VOs. This has so far just been "use gLite", but they may need more recommendations on how to use it, e.g. should they be lcg-cping files across the world? We can cover recommendations: a subset of all the possibilities which seems to work. Could also recommend the use of DIRAC.

3. Milestone stuff - maybe things we'd do anyway for hepsysman.

... so we ran out of time here, but the discussion provided a few useful pointers.

Note that Pete is at HEPiX. Also, (Pete:) Martin Bly, Dave Kelsey, John Gordon, Ian Collier and James Adams

[10:02:32] Wahid Bhimji joined
[10:02:46] Stephen Jones joined
[10:02:48] John Bland joined
[10:02:58] Stephen Jones try again
[10:03:12] Matthew Doidge joined
[10:03:33] Alessandra Forti joined
[10:03:36] Brian Davies joined
[10:03:47] Sam Skipsey joined
[10:03:49] David Crooks joined
[10:03:53] Ewan Mac Mahon joined
[10:05:00] PPRC QMUL joined
[10:05:03] Matthew Doidge It seems to be connected to drains or pool nodes going wrong
[10:05:07] Santanu Das joined
[10:05:20] Matthew Doidge haven't installed it yet
[10:05:30] Matthew Doidge yep
[10:05:34] Rob Fay joined
[10:05:47] Mark Norman joined
[10:06:12] Matthew Doidge we're doing okay
[10:06:26] Matthew Doidge crashing every few weeks rather than 5 seconds
[10:06:40] Matthew Doidge yes, but on the admin side (sorry, no mike)
[10:06:55] Matthew Doidge I don't know
[10:07:42] Matthew Doidge not quite, your database could still be "dirty"
[10:08:57] Ewan Mac Mahon Didn't quite a lot of people's crashes go away when we all cleaned up our old request tables? Or were we just imagining that?
[10:09:33] Alessandra Forti that worked for me
[10:10:44] Ewan Mac Mahon I /think/ it worked for me too, but it wasn't long after we did that that we also upgraded to a whole new DPM head node, so it's not exactly a fair test.
[10:12:29] Ewan Mac Mahon We're certainly still on 1.8.2
[10:12:29] Matthew Doidge is it in the gLite repos yet?
[10:12:36] Ewan Mac Mahon (it's in EMI)
[10:12:37] Sam Skipsey no, just EMI
[10:17:11] Matthew Doidge our bad problems were caused by requests to retired pools
[10:17:26] Matthew Doidge not sure on the cause of the very intermittent problems
[10:17:37] Wahid Bhimji why doesn't the workaround work 100% for you
[10:17:58] Wahid Bhimji (I mean why do you still get the odd segfault if you have followed the workaround)
[10:17:59] Matthew Doidge don't know, I need to dive through the databases to find out, or read the core dump
[10:18:20] Wahid Bhimji ok - maybe just send it to ricardo - if the workaround doesn't work then would be worth knowing
[10:18:48] Matthew Doidge it worked to some degree, before we fixed things we didn't stay up for a few seconds
[10:19:16] Wahid Bhimji the fact that you screwed it up first time shows why there is reluctance
[10:20:32] Matthew Doidge me too
[10:20:44] Ewan Mac Mahon Hmm, there is that, true.
[10:21:01] Ewan Mac Mahon It's not so much not wanting to move, as only wanting to move the once?
[10:21:18] Wahid Bhimji Yeah I think once they have an EMI2/SL6 they could consider dropping gLite once people have one hop.
[10:21:28] Wahid Bhimji I also thought this was strange: "upgrade gLite 3.2 DPMs to EMI-2 without a re-installation."
[10:23:23] Alessandra Forti not going to try that on the head node for sure.
[10:24:49] Wahid Bhimji correct
[10:24:54] Ewan Mac Mahon \o/
[10:30:28] Wahid Bhimji the dinner
[10:30:39] Jens Jensen
[10:32:04] Ewan Mac Mahon I'm not sure there's much option though, is there? Other than everyone build their own.
[10:32:11] Ewan Mac Mahon Which would suck.
[10:34:47] Wahid Bhimji I think they will probably want their own catalogues anyway if they want any other info in it.
so adding the lfc info isn't that much
[10:35:16] Wahid Bhimji but anyway I agree that we need a good recommendation with also an idea of how it may work in future.
[10:36:04] Jens Jensen We could have some case studies on how small VOs do it.
[10:38:25] Ewan Mac Mahon How does xrootd do this sort of thing? Doesn't that give you a single namespace and just use the redirectors to find the actual stuff?
[10:38:32] Wahid Bhimji yes - dirac sounds good to me; but we have minimal experience of it, right?
[10:38:41] Ewan Mac Mahon i.e. no LFC required?
[10:39:03] Wahid Bhimji xrootd - depends on experiment. ALICE use xrootd with the Alien file catalogue
[10:39:06] Ewan Mac Mahon On the jobs front I think it's between ganga/dirac and glideinWMS
[10:39:16] Ewan Mac Mahon Both look interesting.
[10:39:47] Jens Jensen I once cowrote a proposal (with Imperial) to reuse Dirac for climate
[10:39:54] Wahid Bhimji ATLAS have built a xroot->lfc lookup thing -
[10:39:58] Alessandra Forti you still need a file catalogue with xrootd
[10:40:08] Alessandra Forti that's how babar used to do it
[10:40:22] Alessandra Forti and the metadata catalogue and file catalogue were the same
[10:40:31] Mark Norman left
[10:40:33] David Crooks left
[10:40:33] John Bland left
[10:40:34] Wahid Bhimji left
[10:40:35] Rob Fay left
[10:40:35] Sam Skipsey left
[10:40:36] Brian Davies left
[10:40:37] Alessandra Forti that's where atlas is going now
[10:40:38] Santanu Das left