Minutes of the storage EVO meeting 9 Feb 2011

Present:
Glasgow: Sam, David
QMUL: Chris (briefly)
Liverpool: John, Stephen
Manchester: Alessandra
Bristol: Winnie
RHUL: Govind
RAL T1: Brian, Jens (chair+mins)

Apologies:
Edinburgh: Wahid
RAL T1: James

0. Review of action items (see below)

1. File deletion shenanigans? (cf discussion on list.)

This is probably usual unusual stuff, rather than unusual unusual stuff. Nevertheless, it is an interesting consistency problem. Maybe worth contacting J.-P. to hear what he thinks.

2. Quickies (possibly):

2a. Quick pass through the (current) agenda for the storage workshop

Main open question is what time we need to finish, and we need someone to invite Dell to speak (Chris?)

2b. CVMFS - what happened to it?

Currently commissioned at CERN. Not yet asked for by ATLAS. Current status in GridPP is "feel free to experiment but don't put it on production services", as upgraded from "don't touch it" :-)

2c. Technology topics (hardware) revisited - suggestions?

Question about whether people would be interested in SSD? Answers came there none. Other hardware? No.

2d. Hadoop - revisited

See also cloud presentation. Brian reports that there is a Hadoop users group this week which he will attend, and he can then report back to the group.

2e. Remaining BeStMan evaluations

Nothing new to report AFAWK

2f. CEPH?

Currently seen as less reliable than, say, Hadoop or Lustre, but a stable release is expected by the end of the month. Tentatively assigning to Sam the task of evaluating it before the workshop. There is a FUSE interface and it has the advantage of running with normal kernels (as opposed to Lustre-patched ones). However, it will require more recent kernels than those found on SL5.

2g. pNFS?

Also a possibility (cf NFS4) - maybe as lowest common denominator for local access?

2h. Brian points out that SL4 support is coming to an end, in April(?), and sites should upgrade. If your DPM only runs on SL4 then it's time to upgrade the DPM as well.
Sam said dpm-users had noted some problems with DPM 1.8 upgrades where the schema version wasn't updated correctly. If you see this, please report to the list. Easily fixable but odd.

3. Disk server load/scaling testing study discussion

Some interest in modelling disk servers and their behaviour, with several discussions within T1. J.-P. is also interested in instrumenting DPM, which will give a better view of the system.

WAN transfer testing/tuning: Interestingly, seeing asymmetric connections: RAL-Gla 960 Mb/s, but the other way around 3,000 Mb/s. Both are 3x iperf. Not too hard to get permission to run iperf, but ran out of hours. Investigating why the numbers are different.

4. Local site activities - infrastructure and users

Postponed again!

5. AOB

Don't forget the cloud storage talk in the NGS surgery.

Action items:

401  02/06/2010  Clean up the wiki                                           ALL    Low   Open
427  19/01/2011  Send Areca stuff to Sam                                     Ewan   High  Open
429  19/01/2011  Contact Sanger about storage (Lustre workshop and GridPP)   Jens   Med   Open
430  19/01/2011  Test the rest of Chris's StoRM (cf 425)                     ALL    Med   Open

All of these actions are unchanged from last week!

Chat log:

[09:58:21] David Crooks joined
[09:58:24] John Bland joined
[10:01:34] Stephen Jones joined
[10:02:18] Brian Davies joined
[10:03:04] Brian Davies left
[10:03:54] Govind Songara joined
[10:04:10] Winnie Lacesso joined
[10:04:47] Alessandra Forti joined
[10:04:50] Elena Korolkova joined
[10:05:00] Brian Davies joined
[10:06:38] Alessandra Forti when I had this problem in manchester
[10:06:57] Alessandra Forti I could remove the "non existent files" with lcg-del
[10:07:06] Alessandra Forti with atlas prodproxy
[10:07:21] Sam Skipsey Aye, but the issue Govind is having is that the remaining *directories* are then not being deletable...
[10:07:25] Govind Songara i think in my case they are not file
[10:07:43] Alessandra Forti I had the same problem
[10:08:00] Alessandra Forti I couldn't delete the dirs because it thought the file was there
[10:08:07] Alessandra Forti but the file wasn't.
[10:08:42] Alessandra Forti I deleted the file from outside.
[10:09:08] Govind Songara OK..let me get list of dir from dpns-ls and will send across
[10:09:42] Alessandra Forti the worst that can happen is that it doesn't work anyway
[10:15:45] Winnie Lacesso Dr Metson's Hadoop experimenting at Bris is going slow due to overload with other.
[10:21:42] Queen Mary, U London London, U.K. joined
[10:25:13] Queen Mary, U London London, U.K. left
[10:26:02] Queen Mary, U London London, U.K. joined
[10:26:02] Queen Mary, U London London, U.K. left
[10:28:40] Queen Mary, U London London, U.K. left
[10:32:34] Elena Korolkova left
[10:32:35] Winnie Lacesso left
[10:32:38] Brian Davies left
[10:32:38] Alessandra Forti left
[10:32:38] John Bland left
[10:32:41] Sam Skipsey left
[10:32:46] Stephen Jones left
[10:32:47] David Crooks left
[10:32:49] Govind Songara left