Jens is away; the Honourable W Bhimji presiding.

0. Operations issues, e.g. xrootd issues: are there any common threads (no pun intended)? Sheffield is probably OK. Lancaster found a config problem, so may be OK after that (both are/were fedredir_atlas issues only). ECDF and Glasgow have seen two (rarish) problems: one thread limit reached (clear from the logs) and one crash in the night (core sent to David Smith; no solution yet). Brunel have a much more frequent problem (maybe caused by the CMS redirector); see chat below. There don't seem to be common issues, but we should keep an eye on it. No other ops issues from the ops team.

1. Hardware discussion: overflow / updates from last week. Manchester, Oxford: tenders out. Wahid had further emails with Dell, which Dave Colling is hopefully taking forward in a meeting with Dave Coughlin et al. Otherwise people seemed happy.

2. Storage ticket roundup: perhaps Brian can paste his list of storage tickets (PS how do you make them; is it manual?). See other mail from Brian. We may have such a list to look at each week, but won't go through each item individually; only if someone notices something on the list that they want to talk about.

3. DPM workshop issues (register at: https://indico.cern.ch/confRegistrationFormDisplay.py?confId=273864). How to check if you've already registered: Wahid will add a list of people registered (if you have an Indico account it will be obvious if you have). Dinner on Thu: there will be something, but this is a further reason to register by next week, as around Christmas things need booking. (There is a tick box on the form.)

4. AOB: IPv6 chat. Brian is interested; test systems at Glasgow and Oxford. Have other sites asked their campus network folks? (See chat window responses below.)

Chat window:
[10:03:42] David Crooks Sam's just coming
[10:04:49] Elena Korolkova Many thanks for your help, Wahid
[10:07:04] Ewan Mac Mahon Morning
[10:07:21] Ewan Mac Mahon I am muted, and I've turned the camera off too.
[10:08:29] Wahid Bhimji have you looked into it Sam ?
[10:08:44] Sam Skipsey I have not.
[10:08:57] Sam Skipsey The "default" max threads are implied to be rather high for xrootd
[10:10:04] Wahid Bhimji we can't hear you Raul if you are speaking - you can type
[10:10:10] Raul Lopes headset down
[10:11:03] Matt Doidge I broke my xrootd
[10:11:39] Raul Lopes I've seen xroot failing 5 times a day. always a general protection failure apparently related to the redirector
[10:12:29] Raul Lopes Not that we know
[10:12:45] Raul Lopes it was fine before 1.87
[10:12:45] Matt Doidge Once xrootd is fixed, expect to hear from me about my webdav problems.
[10:13:12] Raul Lopes i'm interacting with D Smith
[10:14:56] Wahid Bhimji OK - well let me know what he says ... happy also to share config details etc. if you want to try and figure out why it's different ... is it related to CMS federation part
[10:16:59] Raul Lopes apparently something to do with CMS federation. Only it was perfect before the upgrade to 1.87. and I haven't changed it and actually David and Fabrizio to go through them
[10:18:42] Ewan Mac Mahon Our tender closing date is the 5th of December, so in eight days from now.
[10:19:12] Wahid Bhimji that's quite demanding on them ...
[10:19:38] Wahid Bhimji (Raul) - but the problem with cms fed brings down also the other xrootd server processes
[10:19:39] Wahid Bhimji ?
[10:20:22] Ewan Mac Mahon Yes and no. All three vendors were pre-warned and had direct discussions, and what we want is fairly simple - almost certainly a bunch of quad motherboard CPU servers and some big disk boxes.
[10:20:48] Ewan Mac Mahon It shouldn't be much more complex than just generating some quotes would have been, so it shouldn't take weeks.
[10:21:04] Ewan Mac Mahon It's certainly a short tender period, but it's a pretty minimal tender.
[10:22:00] Matt Doidge Feel free to forward me any tickets that you want brought up.
[10:22:24] Raul Lopes mostly the redir service dies. frequently, not always, the federedir_cms goes down as well. xroot stays alive. however, after a certain xroot also freezes, even if it shows a status of OK
[10:22:58] Raul Lopes when I say it freezes, I mean: it seems to be alive, but Atlas local access stops working
[10:24:27] Elena Korolkova Will it be evo/vidyo connection to the meeting?
[10:28:46] Wahid Bhimji why don't you have dteam ?
[10:31:00] Wahid Bhimji good at having power cuts
[10:32:29] Wahid Bhimji Raul - perhaps you could stop the cms fedredir for a while and see if at least the rest all works for a while.. (maybe that is not popular/possible with cms...)
[10:34:23] Wahid Bhimji well if you do install dpm the dpm guys may be interested... I also have regular dpm tests I could add at oxford... if you do install it - no hurry
[10:35:12] Wahid Bhimji nowhere
[10:35:38] Wahid Bhimji no
[10:35:49] John Bland we're not talking to our local CSD network guys at the moment (long story)
[10:36:17] John Hill Our campus people are ready and willing - I just haven't had time to do anything
[10:36:27] John Bland but there is some odd ipv6-over-ipv4 stuff going on
[10:36:52] Matt Doidge IPv6 a good topic for next HEPSYSMAN?
[10:37:36] Raul Lopes Wahid: impossible to switch off fedcms_dir for a small site like Brunel. David C and CMS ops would destroy me
[10:38:10] Raul Lopes we expect to activate IPv6 in a few weeks at Brunel. early Jan?

GRIDPP2: Deployment and support of SRM and local storage management [GRIDPP-STORAGE@JISCMAIL.AC.UK] on behalf of Brian Davies [brian.davies@STFC.AC.UK]
gridpp-storage
27 November 2013 09:48

Unfortunately, since not everyone fills in the ToP on their tickets (and sometimes fills in more than one), I can't search fully by that. But I start from here:
https://ggus.eu/ws/ticket_search.php?show_columns_check[]=REQUEST_ID&show_columns_check[]=TICKET_TYPE&show_columns_check[]=AFFECTED_VO&show_columns_check[]=AFFECTED_SITE&show_columns_check[]=PRIORITY&show_columns_check[]=RESPONSIBLE_UNIT&show_columns_check[]=STATUS&show_columns_check[]=DATE_OF_CREATION&show_columns_check[]=LAST_UPDATE&show_columns_check[]=TYPE_OF_PROBLEM&show_columns_check[]=SUBJECT&ticket=&supportunit=NGI_UK&su_hierarchy=all&vo=all&user=&keyword=&involvedsupporter=&assignto=&affectedsite=&specattrib=0&status=open&priority=all&typeofproblem=all&ticketcategory=&mouarea=&date_type=creation+date&radiotf=1&timeframe=any&from_date=27+Nov+2013&to_date=28+Nov+2013&untouched_date=&orderticketsby=GHD_INT_REQUEST_ID&orderhow=descending
and then search by hand.
A good check is to go through the tickets picked up by
https://ggus.eu/ws/ticket_search.php?show_columns_check[]=REQUEST_ID&show_columns_check[]=TICKET_TYPE&show_columns_check[]=AFFECTED_VO&show_columns_check[]=AFFECTED_SITE&show_columns_check[]=PRIORITY&show_columns_check[]=RESPONSIBLE_UNIT&show_columns_check[]=STATUS&show_columns_check[]=DATE_OF_CREATION&show_columns_check[]=LAST_UPDATE&show_columns_check[]=TYPE_OF_PROBLEM&show_columns_check[]=SUBJECT&ticket=&supportunit=NGI_UK&su_hierarchy=all&vo=all&user=&keyword=&involvedsupporter=&assignto=&affectedsite=&specattrib=0&status=open&priority=all&typeofproblem=File+Transfer&ticketcategory=&mouarea=&date_type=creation+date&radiotf=1&timeframe=any&from_date=27+Nov+2013&to_date=28+Nov+2013&untouched_date=&orderticketsby=GHD_INT_REQUEST_ID&orderhow=descending
and
https://ggus.eu/ws/ticket_search.php?show_columns_check[]=REQUEST_ID&show_columns_check[]=TICKET_TYPE&show_columns_check[]=AFFECTED_VO&show_columns_check[]=AFFECTED_SITE&show_columns_check[]=PRIORITY&show_columns_check[]=RESPONSIBLE_UNIT&show_columns_check[]=STATUS&show_columns_check[]=DATE_OF_CREATION&show_columns_check[]=LAST_UPDATE&show_columns_check[]=TYPE_OF_PROBLEM&show_columns_check[]=SUBJECT&ticket=&supportunit=NGI_UK&su_hierarchy=all&vo=all&user=&keyword=&involvedsupporter=&assignto=&affectedsite=&specattrib=0&status=open&priority=all&typeofproblem=File+Access&ticketcategory=&mouarea=&date_type=creation+date&radiotf=1&timeframe=any&from_date=27+Nov+2013&to_date=28+Nov+2013&untouched_date=&orderticketsby=GHD_INT_REQUEST_ID&orderhow=descending
and
https://ggus.eu/ws/ticket_search.php?show_columns_check[]=REQUEST_ID&show_columns_check[]=TICKET_TYPE&show_columns_check[]=AFFECTED_VO&show_columns_check[]=AFFECTED_SITE&show_columns_check[]=PRIORITY&show_columns_check[]=RESPONSIBLE_UNIT&show_columns_check[]=STATUS&show_columns_check[]=DATE_OF_CREATION&show_columns_check[]=LAST_UPDATE&show_columns_check[]=TYPE_OF_PROBLEM&show_columns_check[]=SUBJECT&ticket=&supportunit=NGI_UK&su_hierarchy=all&vo=all&user=&keyword=&involvedsupporter=&assignto=&affectedsite=&specattrib=0&status=open&priority=all&typeofproblem=Storage+Systems&ticketcategory=&mouarea=&date_type=creation+date&radiotf=1&timeframe=any&from_date=27+Nov+2013&to_date=28+Nov+2013&untouched_date=&orderticketsby=GHD_INT_REQUEST_ID&orderhow=descending

currently:

Ticket-ID | Type | VO | Site | Priority | Resp. Unit | Status | Date | Last Update | ToP | Subject
98923 | | ops | UKI-SOUTHGRID-RALPP | urgent | NGI_UK | in progress | 2013-11-15 | 2013-11-26 16:27 | COD Operations | NAGIOS *eu.egi.sec.dCache-SHA-2* failed on heplnx...
98882 | | atlas | UKI-SOUTHGRID-SUSX | very urgent | NGI_UK | in progress | 2013-11-14 | 2013-11-26 10:01 | File Transfer | UKI-SOUTHGRID-SUSX all file transfers fail
98719 | | none | UKI-LT2-UCL-HEP | less urgent | NGI_UK | on hold | 2013-11-07 | 2013-11-25 21:02 | Storage Systems | Please upgrade your storage system to minimum leve...
98594 | | lhcb | UKI-NORTHGRID-SHEF-HEP | urgent | NGI_UK | in progress | 2013-11-04 | 2013-11-27 08:00 | Other | File uploading problem at UKI-NORTHGRID-SHEF-HEP
98125 | | atlas | UKI-LT2-UCL-HEP | less urgent | NGI_UK | on hold | 2013-10-17 | 2013-11-25 21:03 | File Transfer | UKI-LT2-UCL-HEP:[gfalt_copy_file][plugin_filecopy]...
94746 | | biomed | UKI-LT2-QMUL | less urgent | NGI_UK | on hold | 2013-06-10 | 2013-11-25 14:03 | Other | SE se04.esc.qmul.ac.uk shows in BDII for biomed VO
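
The search URLs above differ only in the typeofproblem value (all, File Transfer, File Access, Storage Systems), so the weekly list could probably be generated rather than assembled by hand. A minimal sketch of that idea in Python follows. The query parameter names and values are copied from the URLs in this mail; the function name ticket_search_url and the constants are hypothetical, the dates are the ones hard-coded in the mail, and nothing here has been checked against the live GGUS service.

```python
from urllib.parse import urlencode

# Base URL and column names are taken from the search links in the mail.
GGUS_SEARCH = "https://ggus.eu/ws/ticket_search.php"
COLUMNS = [
    "REQUEST_ID", "TICKET_TYPE", "AFFECTED_VO", "AFFECTED_SITE", "PRIORITY",
    "RESPONSIBLE_UNIT", "STATUS", "DATE_OF_CREATION", "LAST_UPDATE",
    "TYPE_OF_PROBLEM", "SUBJECT",
]
# The three "type of problem" values used for the storage check.
STORAGE_TOPS = ["File Transfer", "File Access", "Storage Systems"]


def ticket_search_url(type_of_problem="all"):
    """Build one search URL (hypothetical helper, not a GGUS API)."""
    query = [("show_columns_check[]", column) for column in COLUMNS]
    query += [
        ("ticket", ""), ("supportunit", "NGI_UK"), ("su_hierarchy", "all"),
        ("vo", "all"), ("user", ""), ("keyword", ""),
        ("involvedsupporter", ""), ("assignto", ""), ("affectedsite", ""),
        ("specattrib", "0"), ("status", "open"), ("priority", "all"),
        ("typeofproblem", type_of_problem), ("ticketcategory", ""),
        ("mouarea", ""), ("date_type", "creation date"), ("radiotf", "1"),
        ("timeframe", "any"),
        # Dates as they appear in the mail; adjust as needed.
        ("from_date", "27 Nov 2013"), ("to_date", "28 Nov 2013"),
        ("untouched_date", ""), ("orderticketsby", "GHD_INT_REQUEST_ID"),
        ("orderhow", "descending"),
    ]
    # safe="[]" keeps the brackets literal; spaces are encoded as "+".
    return GGUS_SEARCH + "?" + urlencode(query, safe="[]")


if __name__ == "__main__":
    for top in STORAGE_TOPS:
        print(top)
        print(ticket_search_url(top))
```

Run as a script, this prints one URL per type of problem. The tickets that have no ToP filled in would still need the search by hand described above.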