DPM 1.8.10

Brian: do we need to be a bit firmer about uniform updates to releases across entire sites? (New functionality, like gridftp redirection, needs everything to be up to date.)

Yaim v Puppet configuration changes (a blocker for moving to 1.8.9+): there is a difference between "installing a new thing with Puppet" and "moving from a system which is yaim-configured to a system which is puppet-configured, while in production".
- (Glasgow's plan was to just build a new Puppetized head and then move the database to it.)

Brian: "we need to be in a position where, when the experiments do turn around and say 'we really need this functionality', we can provide it".

Sam: "this also, of course, blocks xrootd 3->4 changes etc., which all assume that you have puppet scripts".

Matt's "adhoc" installs: "we have a system of configs already prepared, which we just roll out with scripts". Have been looking at Ansible (we just don't like puppet here, or yaim!). [How do we adjust to new configuration requirements? Read puppet scripts and watch what puppet does.] In response to Gareth's question: quite like Ansible.

Gareth notes that Ansible is a different perspective on system config management: it's more active ("directing the system to change in this way" rather than managing at a step removed). Also dislikes Puppet's esotericism.

At the moment Lancs is scripts+pdsh (so Ansible is essentially the polished version of that).

(Some discussion of the issues with Puppet happened.)

- DPM Workshop discussion and agenda.
- Ewan: one of the general problems with Puppet is that it isn't very good at scoping/isolation (so it's hard to integrate the DPM Puppet modules with an existing Puppet configuration system). {We tend to run the modules separately from the "site puppet" at Glasgow.}

Ewan: "how much work would it be to make the older YAIM configuration up-to-date with doing xrootd4?"
Luke noted that the problem with puppet is that you need a sufficiently large infrastructure for it to be worthwhile. But how is Puppet really that different to YAIM?

Ewan noted the inertia factor.

Gareth notes that, actually, most of our systems don't use YAIM (ARC doesn't, for example).

Sam noted that storage is inertiaful in a different way to other services (it must be persistent).

Ewan: the problem with DPM is also that the configuration hasn't actually simplified; it may have actually gotten more complex! So, moving from YAIMy CREAM to ARC is also a process of simplifying your configuration scripts.

Steve Jones noted that the "key issue is that there's a standard way to build these things." However we decide to build these things, we should have a standard way. YAIM gave us a standard way. Whereas, when we have more than one way, we don't have a standard. [But Puppet *is* the officially supported standard.]

Steve notes that "Acceptance of the Standard has to be there, before the standard is a standard". [Some discussion about feedback.]

Ewan notes that the puppet instructions for setting up disk nodes still have a line noting that they don't duplicate material that is shared with head nodes. (But disk nodes are simpler than head nodes, so the ordering is the wrong way round.)
- Can we fix this? (The instructions have improved generally over time, though.)

Steve Jones: "so, perhaps the thing someone needs to do is to make the leap and try a transition".
*action - Sam to look at this.

Brian: who has looked at the DPM / ATLAS consistency dumps? Was going to bring up John Hill's question about the time of the dump, but is there anything else useful we need to ask? (Brian sees people simply cronning the script and forgetting about it.)

Has anyone done it?
- John Hill has done it.
- Sam was waiting until his PRODDISK cleanup was finished.
- Sub-item here on how slow rfrm is. (See dpm user list for discussion.)

Is the last Friday of the month a good time for them?
- Ewan suggests the 28th of every month to make it cleaner.
- Brian notes that making it a Friday avoids clashing with upgrades (people avoid doing upgrades on Fridays).

Steve suggested adding the DPM upgrade/puppet config stuff to the HEPSYSMAN agenda. Ewan noted that we should probably keep the second day from being bogged down with "Griddy specific stuff".

Chat log

Lukasz Kreczko: (11/11/2015 10:04) Bristol T2 fully puppetised DPM. puppet agent -t --noop for dry-runs is a good way to compare
Matt Doidge: (10:10 AM) The last dozen pool node installs we've adhoc installed
Ewan Mac Mahon: (10:11 AM) As in, not with YAIM or puppet?
Matt Doidge: (10:12 AM) The orinals were made with yaim *originals
Gareth Douglas Roy: (10:13 AM) @Matt how do you like Ansible as a management solution, we've been auditing CM systems and I'd be interested in your take.
Ewan Mac Mahon: (10:14 AM) I've been tossing up using ansible for the DPM nodes and having it run the DPM puppet scripts in masterless 'puppet apply' style. My feeling about puppet as a general solution seems similar to Gareth's; it's just not actually good.
Matt Doidge: (10:17 AM) I'd be happy to help out with it
Lukasz Kreczko: (10:18 AM) going and giving a talk
Ewan Mac Mahon: (10:18 AM) Yup, not going, will try to keep an ear on it if I can. Haven't got it in the calendar though.
Duncan Rand: (10:18 AM) Is there an agenda page?
Lukasz Kreczko: (10:18 AM) https://indico.cern.ch/event/432642/other-view?view=standard
Duncan Rand: (10:19 AM) Matt I don't think you'd need to be giving a talk to go.
Ewan Mac Mahon: (10:21 AM) I quite like YAIM, FWIW.
John Bland: (10:21 AM) I like puppet. I just don't want to balls up my DPM
Lukasz Kreczko: (10:21 AM) Bristol IT does r10k + puppet, so it is the option for us
Matt Doidge: (10:22 AM) I dislike them equally, but that might be my grumpy-old-manness talking.
Lukasz Kreczko: (10:23 AM) as with software, it is difficult to write good non-clashing puppet modules so feedback is needed to improve
Ewan Mac Mahon: (10:24 AM) It seems to be much harder to do in puppet than it ever was in cfengine.
Matt Doidge: (10:24 AM) I also blame yaim for my penchant for resorting to dodgy bash scripts. If I was to diff dpm configs at two atlas sites, would the only real difference be host and site names? Perhaps a few IP ranges.
Ewan Mac Mahon: (10:31 AM) Yup, we have a standard, we just don't like it. Or at least, I don't. We're basically hovering between the 'bargaining' and 'depression' steps.
Lukasz Kreczko: (10:33 AM) We need someone with an axe to enforce standards
Ewan Mac Mahon: (10:35 AM) Fundamentally that's the VOs. I can more or less duck the issue for now, because no-one's jumping up and down and absolutely demanding the new features. But they will.
Matt Doidge: (10:36 AM) I think a lot of us would simply be okay with some example configs and some documentation, which could have been generated by someone with puppet, aka manual instructions.
Elena Korolkova: (10:37 AM) Sorry, I was late. Perhaps it was discussed already. I was trying to update dpm on one of the disk servers. 1.8.10 was in the emi3 directory but I was getting an error message: package emi-dpm_disk-1.8.10-1.el6.x86_64.rpm is not signed. I support Matt. I'm still using yaim with modifying some configuration files
Matt Doidge: (10:41 AM) I know the emi-dpm_disk-1.8.10-1 was just put in, they may have either forgotten to sign it or not got round to it
Elena Korolkova: (10:42 AM) Thanks, Matt. Should we wait for the signature?
Ewan Mac Mahon: (10:42 AM) Probably best. I think that a nudge on the dpm-users list might help that along.
Matt Doidge: (10:43 AM) It'll be worth bringing it up with the dpm-user-forum - I need to install a few disk servers later so I can do it
John Hill: (10:43 AM) That was what I was doing on my disk node when I hit the checksum issue
Tom Whyntie: (10:44 AM) To be fair Steve J (and Steve J Jnr) have been providing some fantastic feedback for the GridPP UserGuide :-)
Steve Jones: (10:46 AM) Thanks Tom...
John Hill: (10:47 AM) I've tested that it works. You need python 2.6 added to a SL5 head node for it to work
Tom Whyntie: (10:47 AM) Have to leave early - apologies - thanks all
John Hill: (10:48 AM) I also saw a mysql exception which seemed not to matter
Matt Doidge: (10:50 AM) Hopefully I'll have news on this tomorrow

Brian: There's only 4-5 proper working weeks left this year