Minutes of the storage EVO meeting, 11 Aug 2010

Present:
Glasgow: Sam, David
Edinburgh: Wahid
Liverpool: John, Stephen
Bristol: Winnie
QMUL: Chris W
Lancaster: Matt
Manchester: Alessandra
RAL: Brian, James, Jens (chair+mins)

*** Don't forget the tasklist on ***
https://savannah.cern.ch/projects/srmsupportuk/

0. We probably need a semi-regular tasklet on wiki updates

Jens updated the front page, removing obsolete mailing lists. Sam will
update the information about the DPM toolkit.

1. Progress with T2K syncat exercise at Lancaster - Sam+Matt plus others

The checker has been updated. Chris is debugging the StoRM version at
QMUL. Brian has also obtained a dump from the LFC via the DBAs,
essentially a CSV of guid, size, and SURLs. It contains about 3E5
files, but it is not clear whether they all belong to T2K, nor how
many are at Lancaster.

ATLAS have a database->syncat tool, which is probably generic enough
to be useful in a wider context. Syncat allows for guid, filesize and
checksum. It could also check ACLs, but the physicists don't care too
much about those.

The plan needs fleshing out a bit now - Jens

2. ext4 vs xfs revisited (again) - cf Alessandra's find on the list

Sam has a DPM with ext4 disk servers in production, and will have more
when some hardware problems have been resolved. How is ext4 supported
in the SL5 kernel? Via an extra package (ext4-progs), but it is
stable. It is not clear whether such filesystems can be dumped yet,
but that is a concern for normal machines, not an issue for a DPM disk
server.

Wahid is testing the 64 bit support: current ext4 filesystems are
limited to 16 TB because they cannot create enough inodes. Ted Ts'o
has written a version for 64 bit, but it is highly experimental at
this stage and seems to have unresolved compilation dependencies
against some kernels (as expected for experimental code).

3. Detecting and coping with uneven load (unbalanced datasets) on disk
servers - progress, recommendations?

Sam's tool does the check and calculates stats for the file
distribution.
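A minimal sketch of such a distribution check - this is not Sam's
actual tool, and it assumes the input is simply an iterable of
(disk server, file size) pairs, e.g. parsed from a DPM database dump:

```python
# Hypothetical sketch: summarise how files and bytes are spread
# across disk servers, so imbalanced servers stand out.
from collections import defaultdict

def distribution_stats(rows):
    """rows: iterable of (server, size_in_bytes) pairs.

    Returns, per server: file count, total bytes, and that server's
    share of the pool's total bytes.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for server, size in rows:
        counts[server] += 1
        totals[server] += size
    grand_total = sum(totals.values())
    stats = {}
    for server in counts:
        stats[server] = {
            "files": counts[server],
            "bytes": totals[server],
            "share": totals[server] / grand_total if grand_total else 0.0,
        }
    return stats

if __name__ == "__main__":
    # Toy input; in practice this would come from a catalogue dump.
    rows = [("se01", 4000000), ("se01", 6000000), ("se02", 1000000)]
    for server, s in sorted(distribution_stats(rows).items()):
        print(server, s["files"], s["bytes"], round(s["share"], 2))
```

A rebalancer could then pick the servers with the largest share and
move individual files to those with the smallest, within the same
pool.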
The tool can be released to others if they're interested, although it
needs some code tidying and some more work. It assumes an ATLAS-like
setup where filesets are in directories.

DPM doesn't quite round-robin, at least not when other transfers are
going on at the same time. In fact, under some circumstances it may
even round-robin to a full disk server. And it does not fail
gracefully when a transfer into a disk server fails - in the sense
that that particular file transfer will then fail. Also, when new disk
servers are added to a pool, they will of course (hopefully) be empty
initially.

Ideally, a tool should be able to rebalance the files, e.g. by calling
out to a tool that transfers individual files to other disk servers
(in the same pool). dCache and CASTOR are different because they have
more complex placement algorithms; dCache probably does the Right
Thing(tm) by default.

4. Experiment liaison revisited - questions to ask at Ambleside

Particularly questions for the non-LHC experiments, who are normally
less well represented. Do they even know we exist and are able and
willing to help them? They may be interfacing to us via Jeremy.

* Which services do they expect from us (or would they find useful,
  e.g. syncatting, integrity checking, etc.)?
* Any future changes in data management, like those currently being
  discussed by the WLCG experiments?

5. Progress on dCache/Hadoop tests/writeups

To be revisited next week.

6. AOB

NOB

== CHAT ==

[10:00:19] Stephen Jones joined
[10:01:46] Brian Davies joined
[10:01:58] Wahid Bhimji i can do some
[10:02:03] Wahid Bhimji exactly
[10:02:25] Wahid Bhimji I was stalled trying think of suitable puns
[10:02:58] Jens Jensen http://www.gridpp.ac.uk/wiki/Dark_Data_clearance
[10:03:33] Christopher Walker joined
[10:07:42] Winnie Lacesso joined
[10:11:38] Matthew Doidge joined
[10:35:14] Wahid Bhimji thanks bye
[10:35:15] John Bland left
[10:35:16] James Thorne left
[10:35:16] Wahid Bhimji left
[10:35:18] Matthew Doidge left
[10:35:20] Winnie Lacesso left
[10:35:23] Brian Davies left
[10:35:24] David Crooks left
[10:35:27] Sam Skipsey left
[10:35:30] Alessandra Forti left
[10:35:33] Christopher Walker left