Sam, Jens (chair+mins), John B, Simon, Rob C, Winnie, John H, Brian, Marcus, Robert, Daniel, David, Chris, Govind, Luke, Steve, Elena

0. Blog posts, operational issues

No other operational issues. Only three blog posts from the previous quarter; we had two more drafts but they weren't finished in time...

1. End of quarter stuff

Anything reportable which is not in the minutes of the storage meeting, please report!

2. GDB and networking

The GDB today has a bit on accounting and another on networking: http://indico.cern.ch/event/578982/ - the latter being from yesterday's pre-GDB. Note in particular Duncan's call for more sites with dual-stack storage.

Re dual-stacked storage, there is an April deadline to provide access via IPv6. Experiments didn't like proxies - or complexity... There is no IPv6 support for CASTOR currently. Glasgow could have IPv6 but it would not be performant, and only HEP asked for IPv6 (so far).

3. ZFS discussion

(The other items should be quick so we can spend most of the time on the ZFS discussion.)

Generally ZFS has worked reasonably well out of the box, although some device settings are non-trivial. What hardware should sites buy for ZFS storage? How is performance compared to the more traditional RAID6 approach? Simon: comparable; Marcus got better performance, 1.2 Gb/s, a motherboard limitation.

Performance, hardware, tuning - are there "go faster stripes" that could be painted on it to make it go faster? [This may be a RALism; "go faster stripes" were performance tunings in Oracle.] There is not much tuning to do in ZFS itself; the main thing is to disable the RAID functionality in the controller. Getting HBAs instead of RAID controllers offers possible cost savings, or at least complexity savings (with ZFS), at the same performance.

Matt: 36 bays, but with 2 extra (hidden) drives for the OS; Chris got those too. RALPP has 3 x 12-disk ZFS pools. Would the network interface be a bottleneck, cf. the discussion of the decreasing access latency for HDDs? Chris reports that the disk servers can saturate the network but tend not to; he could add extra 10Gb interfaces but there is no great need at the moment. With 6TB disks, there is 150TB of usable storage. Generally data is balanced well enough that the rack gets saturated before individual disk servers.

On CentOS 7, Chris reports the kernel module rebuild is not reliable for the ZFS module. Sam has CentOS 6, which is OK. Matt reports SL6 is easier. Ubuntu should have it in the latest release. Marcus had no problems with the rebuild.

Chris also reports that the Nagios check covers hardware and software RAID but not ZFS. Sam points out there are various failure modes. Marcus prefers monitoring with the ZFS event daemon (zed) rather than Nagios. (A minimal sketch of a ZFS-aware check is appended after the minutes.)

Lustre works well with ZFS; LLNL use ZFS on RAID and Lustre on top.

Maybe there should be a "GridPP Seal of Approval" type system, e.g. based on ZFS if it ends up being easier than RAID and just as performant and/or cost effective, with associated hardware recommendations. Could we have a test system? People tend to run a few tests first when they get new hardware.

4. Other events

Networkshop is coming up; however, it turns out they only have all-or-nothing tickets.

5. AOB

Two sites are being nagged about visible SL5 nodes. People tend to move disk servers first, but head nodes are also visible and need to be moved.
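Appendix: since the existing Nagios check covers only hardware and software RAID, a ZFS-aware check could be added alongside (or instead of) event-driven monitoring with zed. The following is a minimal sketch, not a tested production plugin: it assumes only that zpool is on the PATH, that the check runs with enough privilege to query pool status, and that the usual Nagios exit-code convention applies.

#!/usr/bin/env python
"""Minimal sketch of a ZFS-aware Nagios-style check (illustrative only).

Exit codes follow the usual Nagios convention:
0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
"""
import subprocess
import sys


def check_pools():
    try:
        # -H: no header, tab-separated output; request only name and health.
        out = subprocess.check_output(
            ["zpool", "list", "-H", "-o", "name,health"],
            universal_newlines=True)
    except (OSError, subprocess.CalledProcessError) as err:
        print("UNKNOWN: could not run zpool: %s" % err)
        return 3

    degraded, broken = [], []
    for line in out.splitlines():
        if not line.strip():
            continue
        name, health = line.split("\t")
        if health == "ONLINE":
            continue
        if health == "DEGRADED":
            degraded.append(name)
        else:  # FAULTED, UNAVAIL, OFFLINE, REMOVED, ...
            broken.append(name)

    if broken:
        print("CRITICAL: pools not usable: %s" % ", ".join(broken))
        return 2
    if degraded:
        print("WARNING: pools degraded: %s" % ", ".join(degraded))
        return 1
    print("OK: all ZFS pools ONLINE")
    return 0


if __name__ == "__main__":
    sys.exit(check_pools())

A poll-based check like this complements zed, which reacts to pool events as they happen; Nagios simply gets something to page on if an event notification is missed.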
Simon George: (11/01/2017 10:12:04)
Hello!

John Bland: (10:23 AM)
all our 2U SM servers have rear-2.5" SATAs on the motherboard, MD-raid

Lukasz Kreczko: (10:29 AM)
ya
yay Lustre, HDFS on ZFS has been done
HDFS requires no RAID ;)

Marcus Ebert: (10:43 AM)
one more comment to ZFS: What do others think about the need of 12Gbps ports on the controller or using 6gbps ports only for Grid Storage?

Matt Doidge: (10:45 AM)
It depends if the 6gbps HVA can "fill the NIC" - which I suspect it can.
*HBA
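On Marcus's chat question about 12 Gbps versus 6 Gbps controller ports, and Matt's point that it depends on whether the 6 Gbps HBA can "fill the NIC": a back-of-envelope calculation suggests it normally can. The figures below (8 SAS lanes per HBA, ~150 MB/s sustained per spinning disk, a single 10GbE NIC, a 36-bay chassis) are illustrative assumptions, not measurements from any site.

"""Back-of-envelope check: can a 6 Gbps SAS HBA 'fill' a 10GbE NIC?
All figures are illustrative assumptions, not site measurements."""

# SAS-2 runs at 6 Gbit/s per lane with 8b/10b encoding,
# i.e. roughly 600 MB/s of payload per lane.
SAS2_LANE_MBPS = 600
LANES_PER_HBA = 8          # a typical 8-lane HBA (assumption)

# A 10GbE NIC tops out around 1250 MB/s raw; call it ~1100 MB/s usable.
NIC_MBPS = 1100

# Sustained sequential rate assumed for a 6TB nearline drive.
DISK_MBPS = 150
DISKS = 36                 # 36-bay chassis, as discussed in the minutes

hba_limit = SAS2_LANE_MBPS * LANES_PER_HBA   # ~4800 MB/s
disk_aggregate = DISK_MBPS * DISKS           # ~5400 MB/s

bottleneck = min(hba_limit, disk_aggregate, NIC_MBPS)
print("HBA limit:      %5d MB/s" % hba_limit)
print("Disk aggregate: %5d MB/s" % disk_aggregate)
print("10GbE NIC:      %5d MB/s" % NIC_MBPS)
print("Bottleneck:     %5d MB/s (%s)" %
      (bottleneck, "NIC" if bottleneck == NIC_MBPS else "HBA/disks"))

Even with only four 6 Gbps lanes behind an expander (~2.4 GB/s), the single 10GbE interface remains the bottleneck, which supports Matt's suspicion; 12 Gbps ports would mainly start to matter with multiple 10GbE (or faster) interfaces per server.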