Attending: RobC, Sam, Jens (chair+mins), Raja, JohnH, Matt, Winnie, Ste, Teng, Brian, Dan

0. OBP
NOI
Thanks to all the blog posters who have posted blogs! And lots of very interesting stuff, too!

1. Any questions/wishlist topics for the folks attending the Rucio workshop next week
From RAL, IanJ, Katy, and Alastair will be attending, also James Perry from Edinburgh.
* Is there a roadmap?
* Is it possible to limit data rates (through FTS, anything else)?
* Support for object stores - should be able to upload/download, what about replication?

2. Update (if any) on the "data lakes" (any flavour)
- CERN data lake: Brian chasing permissions.
- Non-CERN data lake (i.e. GridPP-EOS): not enough effort to do anything about this atm
- Commercial cloud data lakes generally give you analytics too (a la Hadoop), so they also expect to know the data format (and by default it'd be something simple like CSV). The same is true for data movers that are designed for use with data lakes.

3. Last week's GDB had a few interesting things we should look at: did anyone call in? (I was thinking specifically of the information systems and the storage synchronisation but other bits might be of interest, too)
https://indico.cern.ch/event/739875/

The information system raises the point of service discovery, service information discovery, service stats and status, etc. Previously these were handled by the GLUE schema through the BDII. Is the new work an aggregation "portal" on top of these or do they propose to replace everything...?

On the "cloud synchronisation" talk, it turns out not to be about cloud synchronisation at all but rather about interactive work on notebooks on data, like you would get from a CCSP. Part of the discussion is about how to bring compute to data and how we might have got away from this Noble Principle.
This might have happened for political reasons - data should be at a specific place - or because people have got used to faster networks leading to remote data access being "solved". Of course we don't support the interactive stuff very well (though it might be fun to try!?), as we're designed for a later stage in the process where things are batch processed. So there is a use case for people doing notebook style stuff; or they could do them at Tier 3s? This is also why we have pilot jobs, so we can have a late binding of slot->job, but again this is a production feature rather than exploratory.

Notebooks, as the presentation indicates, are also a great way to share your analytics with others, which can help make the data more open and accessible - and/or validate the results of what will eventually be your publications. LSST use Jupyter notebooks and SKA expressed an interest.

4. AOB
NOB

Brian
should we be offering it to small VOs?
transfer rates are limited at the FTS level, yes
what system is used/planned for object stores? is this dynafed?
https://docs.google.com/document/d/1qvIGoK2d6lCZGQNRVNRnX8N3IO4dDUORj1AFMXXD5wo/edit#heading=h.od6lvm5rgyps

Robert
no, I think this is more native support in Rucio for object stores

Brian
new student starting this month iirc
no, I'm not aware of any other
considering its dev history (and where it's developed), why use anything other than FTS? if FTS is missing object stores, fix FTS...
I'll chase up
if we are asked for one, make sure we know what they actually mean by one

Daniel
I had a data lake, it's just a shame it was a pint of liquid and my laptop ;(

Brian
https://indico.cern.ch/event/739875/
Service discovery at the moment without BDII is hard!
combination with SWAN/Jupyter is probably a likely focus from the HSF.
focus is two-part, 1: user interaction, 2: work for bulk production activities
this talk is about 1?

Raja
Apologies - got to leave now

Thanks Raja

Brian
I think ATLAS are under-estimating the amount of CPU RSS which will be at diskless sites.
need to get from them what % at diskless sites works for them