Present: Jens (chair+minutes), John B, John H, Matt D, Winnie, Rob, Robert, Steve, Wahid, Raul, Raja, Elena, Brian, Tom, Sam, Ewan, Jeremy (briefly)
Special guest star: Paul Hopkins (Cardiff/LIGO)
Apologies: Jeremy

0. Operational Blog Posts

Wahid is leaving!! - next time will be his last time!! - he's going to NERSC!!

DPM 1.8.9 - could write up some lessons learned. E.g. the new head node was set up with Puppet - what is needed if you're not using Puppet? Is symlinking libraries really necessary?

Regarding backups, there are two databases: cns_db, which holds the namespace, and dpm_db, which is mostly static configuration plus request records. It is not in general necessary to back up old requests, so those tables could be omitted. Could one do incremental backups of the database? It should be possible using the binary logs. We should back up at least once a day. A migration could then use a checkpoint restore, i.e. restore the most recent full backup and replay the incrementals on top.

Brunel - GridFTP segfaults; the node died and was rebooted with IPMI. After the upgrade the node could not contact MySQL - now fixed. Currently at 40 TB and running out of space. Also, xrootd wasn't configured properly with Puppet - it used the wrong library (feedback from David Smith). Could we run DPM in Docker? Raul is willing and has a testbed.

1. Jeremy listened in to the pre-GDB and will give a report next week.

2. An audience with LIGO (Paul Hopkins, Cardiff)

The VO is about 10^3 people, studying (or trying to detect?) gravitational waves. There are six compute clusters in the US and one in DE, all custom built for LIGO. Cardiff also runs an HTCondor cluster with 700 Westmere cores, with a further 1400 Haswell cores in acceptance testing. LIGO (in the UK) is STFC funded.

Paul is investigating running a data-analysis pipeline on GridPP resources. He has already talked to Catalin from the RAL Tier 1 about CVMFS, but still needs to install the client. We should set up a VO: there is no EGI VO, and there used to be an OSG one but it is no longer maintained. Candidate UK sites would be GLA, BHM, OXF and SHF.
There are already gravitywaveologists in Glasgow, though they are not involved with the LIGO activity; there may be benefits in bringing them together. This is also a good example of IT and science working together (many of us IT folks have science backgrounds anyway, so it is not too much of a stretch).

How do people do LIGO work currently? They log into a cluster and submit a job. Working with the grid - say, with the DIRAC server at IC - they would submit "to the grid" instead. They currently hold 8 TB of raw data at Cardiff, but new observing runs are expected, initially producing 10 TB/yr and perhaps ramping up to 100 TB/yr. A single job needs data on the order of gigabytes; some jobs are long running (~a week), some are short. Suggestion: start with data at one site. Some people also run numerical simulations of black holes, which generate further output data.

We should set up a VO. LIGO own www.ligo.org, so vo.ligo.org would be a good name for a new VO. This is also a good opportunity to debug our to-be-updated instructions for setting up a new VO. Paul should not wait for the VO to be set up but could get started using the GridPP VO.

3. AOB

Brian reports problems with FTS transfers from CERN to RAL tape for ATLAS, and speculates whether there is a generic CERN-to-X problem with FTS for ATLAS due to a below-baseline version of GFAL being used (2.6.8, where the baseline is 2.7.8), or alternatively whether there are other X-to-RAL problems. Recommendation: check whether GFAL components are up to date, particularly since the SAM tests will switch to GFAL. There are known issues with CASTOR which are GFAL related.

Paige Winslowe Lacesso: (11/03/2015 10:03:05) WILL MISS YOU WAHID!!!
wahid: (10:04 AM) yes we are talking https://www.gridpp.ac.uk/w/index.php?title=DPMUpgradeTips ggus at https://ggus.eu/index.php?mode=ticket_info&ticket_id=112272
raul: (10:08 AM) Very sorry that Wahid is leaving. Sam and Wahid are my preferred source of knowledge/help in DPM.
wahid: (10:11 AM) does anyone actually have a problem backing up the whole thing though?
Ewan Mac Mahon: (10:12 AM) Back up the binary logs? Though I'm not quite sure why you'd bother - is it actually at all troublesome to back up the whole thing?
wahid: (10:12 AM) --incremental - maybe - but anyway, is this an academic discussion?
Ewan Mac Mahon: (10:14 AM) There used to be a problem with the dump scripts locking the database for the duration, but I think that went away with the move to InnoDB tables (?) I think this might be a case for 'everyone post their backup scripts to the list/wiki'
wahid: (10:15 AM) https://www.gridpp.ac.uk/wiki/Performance_and_Tuning - it's on there, I think
Ewan Mac Mahon: (10:16 AM) The odd thing is I don't think I have binary logs on. Hmm.
Samuel Cadellin Skipsey: (10:16 AM) Ewan: I shall check, then. InnoDB tables dump faster anyway, though.
John Bland: (10:16 AM) ewan: one line essentially, mysqldump --all-databases --single-transaction, which I use successfully for other dbs as well
Ewan Mac Mahon: (10:19 AM) Ours is basically the same - IIRC the '--single-transaction' is the critical don't-lock-the-entire-db bit.
Samuel Cadellin Skipsey: (10:19 AM) Yes, it is.
Ewan Mac Mahon: (10:20 AM) We've got a bit of boilerplate around it to name the dumps after the day of the week, so after seven days it starts overwriting the old one. Also our tiny little shell script is apparently a tiny little perl script, which is a bit surprising. But apparently we cribbed it from Glasgow.
Samuel Cadellin Skipsey: (10:21 AM) Ah, so with --single-transaction, mysql already does a binary log dump at that point (even without binary logs explicitly enabled). So, yeah, you're fine, Ewan. (We turned them on as it also helped with some other database operations we wanted to do, and avoided locking for them)
Ewan Mac Mahon: (10:28 AM) Usual PSA: Catalin uses male pronouns.
Elena Korolkova: (10:30 AM) We can support LIGO in Sheffield. I'll talk to our LIGO guy.
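Putting together the backup recipe from the chat above - John Bland's mysqldump one-liner plus the day-of-week file rotation Ewan describes - a minimal sketch might look like the following. The backup directory and the use of gzip are assumptions, and the final command is echoed rather than executed so it can be checked (and credentials added) before being run for real.

```shell
#!/bin/sh
# Nightly DPM MySQL dump, a sketch assembled from the chat above:
# John Bland's one-liner plus day-of-week naming, so seven dumps
# rotate automatically as each weekday's file is overwritten.
# BACKUP_DIR is an assumption -- point it wherever your site keeps dumps.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/dpm}"
DAY=$(date +%A)                         # e.g. "Wednesday"
DUMP_FILE="$BACKUP_DIR/dpm-$DAY.sql.gz"

# --single-transaction is the critical "don't lock the entire DB" option
# (it takes a consistent snapshot, which relies on InnoDB tables);
# --flush-logs rotates the binary logs at dump time, marking a clean
# checkpoint for any later incremental replay.
DUMP_CMD="mysqldump --all-databases --single-transaction --flush-logs"

# Echoed as a sketch; drop the echo (and supply credentials) to run it.
echo "$DUMP_CMD | gzip > $DUMP_FILE"
```

Essentially the same script could live in cron; the point of --single-transaction is that the dump no longer blocks the running DPM head node.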
Tom Whyntie: (10:34 AM) There's an overview here: https://www.gridpp.ac.uk/w/images/5/5f/Twhyntie_DRN000024-v1-0_DIRAC-CVMFS-CERNVM_mk01.jpg
Ewan Mac Mahon: (10:35 AM) Is there actually a ligo VO atm, or do we need to set one up?
Jens Jensen: (10:35 AM) Sounds like we need to set one up (with EGI)
Ewan Mac Mahon: (10:35 AM) I think we might be skipping the regional incubator phase here. (and I think we should skip the regional incubator phase in this case) So, who wants to be the one place? I'm thinking either the Tier 1 or Glasgow, given who's involved. But the Tier 1 has better networking. Define 'a lot' - it's always important to use numbers, since we tend to find that our idea of 'big' can be somewhat divergent from other people's. So we need: a VO, stuff in the cvmfs, space at the Tier 1, and CPU-only resources at the Tier 2s.
Paul Hopkins: (10:45 AM) paul.hopkins@astro.cf.ac.uk
Tom Whyntie: (10:45 AM) https://www.gridpp.ac.uk/wiki/Quick_Guide_to_Dirac https://www.gridpp.ac.uk/wiki/DIRAC_new_user_checklist
Ewan Mac Mahon: (10:46 AM) How long does it take to set up a VO? An EGI one. We should ping Jeremy, I think he's most up to date. Unless Tom is?
Elena Korolkova: (10:48 AM) I think the name should be chosen carefully. I mean, we don't want to have to change ligo to ligo.org as we did for t2k.
Ewan Mac Mahon: (10:50 AM) That's true, we'll need a DNS name. Ewan, Jens, Sam, I think. And Elena. Er. Jeremy?
wahid: (10:50 AM) was he ever here, I mean, today
Ewan Mac Mahon: (10:51 AM) That was going a bit existentialist for a moment there.
Tom Whyntie: (10:53 AM) Thanks Paul.
Steve Jones: (10:54 AM) Note to Tom: sounds like good progress is being made on the prototyping process. Good work. Please document it so it can be reused with other VOs. With other _new_ VOs, I mean.
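For completeness, the "checkpoint restore" mentioned under item 0 (restore the most recent full backup, then add in the incrementals) would amount to something like the sketch below. All file and binlog paths are illustrative assumptions, and the commands are echoed rather than run, since a real migration would do this by hand with credentials on the new head node.

```shell
#!/bin/sh
# Sketch of a checkpoint restore for the DPM databases, per item 0:
# load the latest full dump, then replay the binary logs written since.
# Both paths below are assumptions for illustration.
DUMP_FILE=/var/backups/dpm/dpm-Wednesday.sql.gz   # most recent full dump
BINLOG_DIR=/var/lib/mysql                         # where the binlogs live

# Step 1: restore the full dump (the checkpoint).
RESTORE_CMD="gunzip < $DUMP_FILE | mysql"

# Step 2: replay the binlogs newer than the dump. If the dump was taken
# with --flush-logs, the logs were rotated at dump time, so the files to
# replay contain only post-dump changes.
REPLAY_CMD="mysqlbinlog $BINLOG_DIR/mysql-bin.0* | mysql"

# Echoed as a sketch; run by hand (with credentials) during a migration.
echo "$RESTORE_CMD"
echo "$REPLAY_CMD"
```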