Subject: Notes from today's meeting
From: Wahid Bhimji <wbhimji@staffmail.ed.ac.uk>
Date: 27/08/2014 10:40
To: "SC) Jensen Jens (STFC RAL" <jens.jensen@stfc.ac.uk>


DPM workshop (in Naples 9-10 Oct) 
https://indico.cern.ch/event/324705/overview
Contact GridPP if you want travel expenses to attend, then sign up.

Operations 
WebDAV - Edinburgh was contacted / ticketed by ATLAS for instability - we believe this is on the DPM 1.8.7 disk servers.
Ewan - believes he has also seen instability (less frequently) on 1.8.8?
Is there a regular test for DAV? There should be a SAM test, but yes, we should follow up.
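
A regular DAV check could be as simple as the sketch below. This is only an illustration, not the SAM test itself: the function names, host, path and certificate arguments are hypothetical, and it assumes the endpoint answers a Depth: 0 PROPFIND with 207 Multi-Status, as RFC 4918 specifies for success. A real grid endpoint would also need the grid CA certificates added to the SSL context.

```python
import http.client
import ssl


def dav_probe_ok(status: int) -> bool:
    """A successful PROPFIND returns 207 Multi-Status (RFC 4918)."""
    return status == 207


def probe_webdav(host, path, port=443, cert=None, key=None, timeout=30):
    """Send a Depth: 0 PROPFIND and report whether the endpoint looks healthy.

    host, path, cert and key are placeholders for a site's own values.
    """
    ctx = ssl.create_default_context()
    if cert:
        # Client certificate (e.g. a grid proxy); an assumption, not required by DAV.
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    conn = http.client.HTTPSConnection(host, port, timeout=timeout, context=ctx)
    try:
        conn.request("PROPFIND", path, headers={"Depth": "0"})
        return dav_probe_ok(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and TLS failures all count as "down".
        return False
    finally:
        conn.close()
```

Run from cron against each disk server, something like this would catch the "httpd dead or stopped" case without waiting for an experiment ticket.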

ATLAS xrootd federation redirector crashed at some sites - down to a particular client request -
is it fixed in an xrootd 3 version or only in xrootd 4? Will find out.
Meanwhile, all UK sites seem to have been restarted, and ATLAS will avoid triggering the crash again.

GridPP meeting outcomes

Ceph - Sam asked  presentation at Network Shack on Ceph 

Cloud storage - no discussion at GridPP, which reflects the level of interest.
It could be said we already use cloud storage.
Maybe if people wanted to offer resources in this way they would be interested.
Also, if we had a unified thing, then people could make use of it?
Or maybe people have to run on their own disk.

Many tools for S3 
DPM plugin for S3 doesn't have long term support.

Ewan - could the Ceph S3 interface be used on top of other things (with e.g. GridSite)?
What about the T1 proposal of CephFS + GridFTP?

Gfalfs plugin for ceph from Sam … have a map of what he's doing.

LFC/Rucio discussion
How does one now translate from logical names to files?
If the file is at your site, then you can use the algorithmic method (Wahid can forward details).
If you want to know which sites hold it, then you have to contact Rucio (Wahid will check whether this can be done by a non-ATLAS member).
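
For illustration, here is a minimal sketch of the algorithmic translation, assuming it refers to Rucio's deterministic hash-based path convention (an MD5 of "scope:name" supplying two directory levels). The function name and the example prefix are hypothetical; each site has its own prefix.

```python
import hashlib


def rucio_deterministic_path(scope: str, name: str, prefix: str = "") -> str:
    """Build <prefix>/<scope>/<h[0:2]>/<h[2:4]>/<name> from a scope and file name."""
    # The hash is taken over the original "scope:name" string.
    digest = hashlib.md5(f"{scope}:{name}".encode("utf-8")).hexdigest()
    # user.* and group.* scopes are split into sub-directories.
    if scope.startswith("user") or scope.startswith("group"):
        scope = scope.replace(".", "/")
    return f"{prefix.rstrip('/')}/{scope}/{digest[0:2]}/{digest[2:4]}/{name}"


# Hypothetical site prefix, for illustration only.
print(rucio_deterministic_path("user.jdoe", "file.root",
                               "/dpm/example.ac.uk/home/atlas/atlasdatadisk/rucio"))
```

No catalogue lookup is needed for this step, which is why it only works for files already known to be at the local site.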

CMS sent a request to point the fed redir at something else. Pleasant surprise: it actually points to a public page!

AOB 
Ewan - Argus / DPM ban syncing: see blog post
http://gridpp-storage.blogspot.co.uk/2014/08/argus-user-suspension-with-dpm.html 

Chat window

wahid: (27/08/2014 10:01)

https://indico.cern.ch/event/324705/page/0

https://ggus.eu/index.php?mode=ticket_info&ticket_id=107884

Ewan Mac Mahon: (10:06 AM)

Apache is running on ~20% of my nodes.

It's variously dead or stopped on all the others.

Samuel Cadellin Skipsey: (10:07 AM)

We actually have a service to restart our httpd to keep them up.

wahid: (10:07 AM)

http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteview#currentView=FAX+endpoints&highlight=false

Ewan Mac Mahon: (10:08 AM)

Speaking of test failures; are there automated tests for the webdav? Because if it doesn't appear in the dashboard, it's not really reasonable to consider it a production service.

John Bland: (10:08 AM)

our puppet keeps httpd going on our headnode, but not the nodes. I've not seen any crashing

Duncan Rand: (10:09 AM)

The fact the chat window has no history is irritating...

Ah, if I type something it all magically appears..

Jeremy Coles: (10:10 AM)

Agreed. I have fed a few issues back.

Ewan Mac Mahon: (10:10 AM)

Ceph, ceph, ceph, ceph, ceph.

That's about it for storage.

Jeremy Coles: (10:10 AM)

back = fedback.

Duncan Rand: (10:12 AM)

Ewan: which 'dashboard' are you referring to?

Ewan Mac Mahon: (10:12 AM)

The one that makes the ROD people ticket sites when things break.

Bah, humbug.

Maybe CephFS and StoRM?

Clearly the way of the future.

Gareth Douglas Roy: (10:26 AM)

Yeah but this means anything you want to use has to provide POSIX......

Ewan Mac Mahon: (10:31 AM)

That's a wonderful use of the phrase 'well understood' :-)

Duncan Rand: (10:36 AM)

Post a link to the list?



