<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet title="XSL formatting" type="text/xsl" href="http://blog.endemics.info/feed/rss2/xslt" ?><rss version="2.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:wfw="http://wellformedweb.org/CommentAPI/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>On IT Operations and Infrastructures</title>
  <link>http://blog.endemics.info/</link>
  <atom:link href="http://blog.endemics.info:82/feed/rss2" rel="self" type="application/rss+xml"/>
  <description></description>
  <language>en</language>
  <pubDate>Mon, 20 May 2013 22:58:12 +0200</pubDate>
  <copyright></copyright>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Dotclear</generator>
  
    
  <item>
    <title>Reply to the Institut Agile post: &quot;Devops - premières rencontres et survol&quot;</title>
    <link>http://blog.endemics.info/post/2010/12/03/R%C3%A9ponse-au-post-de-l-Institut-Agile-%3A-%22Devops-premi%C3%A8res-rencontres-et-survol%22</link>
    <guid isPermaLink="false">urn:md5:dba25567fa767ceddf1ebe6d38c34e0d</guid>
    <pubDate>Fri, 03 Dec 2010 16:37:00 +0100</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;I wasn't able to comment directly on the &lt;a href=&quot;http://blog.institut-agile.fr/2010/12/devops-premieres-rencontres-et-survol.html&quot; hreflang=&quot;fr&quot;&gt;institut agile blog&lt;/a&gt;, so I am exceptionally replying here.
The reply was originally written in French.&lt;/p&gt;
&lt;p&gt;--&lt;/p&gt;
&lt;p&gt;Laurent,&lt;/p&gt;
&lt;p&gt;Thank you for this write-up; unfortunately I wasn't able to come to this
first Paris meetup.&lt;/p&gt;
&lt;p&gt;I take some issue with the impression given by your sentence &amp;quot;You will have
understood that we are among technicians...&amp;quot;.&lt;/p&gt;
&lt;p&gt;I agree that the choice of the name devops is somewhat unfortunate and
likely to give the impression that this is a movement by technicians for
technicians.&lt;/p&gt;
&lt;p&gt;We have come a long way since the first Devopsdays in Ghent in 2009, where
the name appeared, and even back then the name already seemed too narrow,
because it covered only part of the interests of the people
present.&lt;/p&gt;
&lt;p&gt;In fact the difficulties between dev and prod were only the tip of the
iceberg, and the problems discussed were organisational and human as much as
technical.&lt;/p&gt;
&lt;p&gt;Because of the conference's international reach, and because it filled a
void, the term devops (1) spread like wildfire and
stuck.&lt;/p&gt;
&lt;p&gt;While one may regret that the term is misleading, that flaw is in my view
offset by the fact that there is now a label under which
many of us can gather to exchange ideas.&lt;/p&gt;
&lt;p&gt;I have no doubt that, given the profiles of the people who took part in the
first Paris devops meetup, most of the issues discussed were rather
technical, and it is therefore logical that this is what comes through in
your write-up, but it would seem to me a pity if this reinforced in your
readers' minds the ambiguity already attributable to the name.&lt;/p&gt;
&lt;p&gt;In my view devops is a movement that deals with the problems of
enterprise IT, which is a vast subject! Indeed, I completely share
Damon Edwards's opinion: devops is not the answer to a
technology problem but to a business problem (2). I invite readers who are
curious or averse to English to go and read the short presentation that
I posted several months ago on devops.fr (3).&lt;/p&gt;
&lt;p&gt;Regards, Gildas Le Nadan @endemics --&lt;/p&gt;
&lt;p&gt;1- Most often, incidentally, in a form, &amp;quot;DevOps&amp;quot;, different from the one
originally intended by Patrick Debois, for whom the capital letters are an
unfortunate reminder of the separation between dev and prod.&lt;/p&gt;
&lt;p&gt;2- &lt;a href=&quot;http://dev2ops.org/blog/2010/11/7/devops-is-not-a-technology-problem-devops-is-a-business-prob.html&quot; hreflang=&quot;en&quot;&gt;http://dev2ops.org/blog/2010/11/7/devops-is-not-a-technology-problem-devops-is-a-business-prob.html&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;3- &lt;a href=&quot;http://www.devops.fr/&quot; hreflang=&quot;fr&quot;&gt;http://www.devops.fr/&lt;/a&gt;
which I urgently need to update&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2010/12/03/R%C3%A9ponse-au-post-de-l-Institut-Agile-%3A-%22Devops-premi%C3%A8res-rencontres-et-survol%22#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2010/12/03/R%C3%A9ponse-au-post-de-l-Institut-Agile-%3A-%22Devops-premi%C3%A8res-rencontres-et-survol%22#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/567264</wfw:commentRss>
      </item>
    
  <item>
    <title>Devops Meetups and Devops Dojos</title>
    <link>http://blog.endemics.info/post/2010/07/27/Devops-Meetups-and-Devops-Dojos</link>
    <guid isPermaLink="false">urn:md5:a46830f3ecbd9d486151f08687749988</guid>
    <pubDate>Tue, 27 Jul 2010 23:30:00 +0200</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;h2&gt;Devopsday USA 2010 and the first Silicon Valley Devops Meetup&lt;/h2&gt;
&lt;p&gt;In late June/early July this year I went to San Francisco for &lt;a href=&quot;http://www.devopsdays.org/2010-us/&quot; hreflang=&quot;en&quot;&gt;devopsday USA 2010&lt;/a&gt;, which
I had the pleasure of co-organizing with &lt;a href=&quot;http://twitter.com/damonedwards&quot; hreflang=&quot;en&quot;&gt;Damon Edwards&lt;/a&gt;, &lt;a href=&quot;http://twitter.com/patrickdebois&quot; hreflang=&quot;en&quot;&gt;Patrick Debois&lt;/a&gt; and
&lt;a href=&quot;http://twitter.com/littleidea&quot; hreflang=&quot;en&quot;&gt;Andrew Shafer&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I really enjoyed the experience and am glad so many people came to attend
the conference (spéciale dédicace to the French Diaspora: Alexis, Olivier,
Patrice and Jérôme). I now look forward to another chance to contribute to the
next events!&lt;/p&gt;
&lt;p&gt;I was still in the Bay Area on July 6th when the first Silicon Valley
Devops Meetup was organized by &lt;a href=&quot;http://twitter.com/davenielsen&quot; hreflang=&quot;en&quot;&gt;Dave Nielsen&lt;/a&gt; in Mountain View, and I decided to attend
it.&lt;/p&gt;
&lt;p&gt;Although Patrick and I had been in contact and working on presentations
about &amp;quot;Agile and Operations&amp;quot; and &amp;quot;Continuous Deployment pipelines&amp;quot; months
before he coined the devops term and decided to create the &lt;a href=&quot;http://www.devopsdays.org/ghent09/&quot; hreflang=&quot;en&quot;&gt;first Devopsdays&lt;/a&gt;, we
don't live close enough to one another to be able to see each other regularly,
and I don't yet know enough devops-minded people locally to be able to start
regular meetups, so it was interesting for me to see what form it would take
(sadly I haven't managed to attend the &lt;a href=&quot;http://londondevops.org/&quot; hreflang=&quot;en&quot;&gt;popular London meetups&lt;/a&gt; yet).&lt;/p&gt;
&lt;p&gt;The meetup started with a little discussion on the group name and on what
the content and form should be for the following meetups.&lt;/p&gt;
&lt;p&gt;The first Devopsdays was a two-day conference with speakers in the morning
and openspaces/unconference in the afternoon. I felt it was a nice format since
the morning presentations would raise interest in specific subjects and fuel the
afternoon debates without restricting them. (We were more constrained by time,
with only one day for devopsday USA 2010, and had plenty of speakers, so we decided
to only have panels and a few lightning talks to raise interest in and awareness of
other subjects.)&lt;/p&gt;
&lt;p&gt;I guess I felt that I was passing on the torch somehow and since some of the
topics and discussions that took place during (and after) &lt;a href=&quot;http://www.devopsdays.org/ghent09/&quot; hreflang=&quot;en&quot;&gt;Devopsdays&lt;/a&gt; came back, it
was an opportunity to share what was said and done back then. I think that the
biggest benefit of the devops movement is that it enables people to share their
experience with one another, and I believe this is one of the ways we can solve
the problem I addressed in my first &lt;a href=&quot;http://blog.endemics.info/post/2009/01/16/On-the-Shortcomings-Of-Systems-and-Networks-Engineers-Training&quot; hreflang=&quot;en&quot;&gt;post&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One of the things from Ghent that I mentioned was the very nice experiment
by &lt;a href=&quot;http://twitter.com/auxesis&quot; hreflang=&quot;en&quot;&gt;Lindsay Holmwood&lt;/a&gt; when
he proposed a &lt;a href=&quot;http://holmwood.id.au/~lindsay/2009/11/01/using-cucumber-as-a-scripting-language/&quot; hreflang=&quot;en&quot;&gt;1-hour gang-development session on &amp;quot;cucumber as a script
language&amp;quot;&lt;/a&gt;. Not only because the subject was cool, but also because there
was actually concrete code produced after this session, and I believe this is
great if we can not only exchange ideas but also produce something that goes in
the right direction.&lt;/p&gt;
&lt;p&gt;Even though the devops movement is very much about people, about having the
right mindsets, about breaking silos and about business alignment and change
management, it is also about tools. And I think that since developers and ops
(and network and security and QA) people meet together during the meetups and
conferences, it is also probably the right place for new tools to emerge, tools
that can efficiently and elegantly solve the daily pain points and bring people
together/help them concentrate on what's really important.&lt;/p&gt;
&lt;p&gt;This is why I was really happy to see that the meetup was then followed by a
nice presentation by Alex Honor on the &amp;quot;devops toolchain project&amp;quot;. I'm glad I
had the opportunity to meet Alex several times during my stay in the USA, as he
had also been thinking about those issues for a long time. His work on the
toolchain helps point out the gaps, the same way the &amp;quot;missing tools?&amp;quot; session
during Ghent's Devopsdays did, and there is a lot to do!&lt;/p&gt;
&lt;h2&gt;Devops Dojos?&lt;/h2&gt;
&lt;p&gt;Before I was involved in the devops movement, I was very much influenced by
the Agile community, thanks to my friend &lt;a href=&quot;http://twitter.com/perafoo&quot; hreflang=&quot;fr&quot;&gt;Raphaël Pierquin&lt;/a&gt; (I also met Patrick thanks to him). He is
the one who introduced me to the notion of &lt;a href=&quot;http://www.agiledesign.co.uk/technical/dojo-kata-or-randori/&quot; hreflang=&quot;en&quot;&gt;&amp;quot;Coding Dojos&amp;quot;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm not sure who invented the Coding Dojos in the first place (it might have
been Laurent Bossavit et al.), but the idea is roughly &amp;quot;how come you are
supposed to become a Java expert after a one-week course when it takes a lifetime
of regular training to become a martial arts expert?&amp;quot;, and as a martial arts
practitioner myself I find this idea sound.&lt;/p&gt;
&lt;p&gt;Still, while I'm sure regular training on devops ideas makes sense, I'm not
sure exactly how this should be done:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Do we need to train on a specific problem, a specific tool or on a specific
method?&lt;/li&gt;
&lt;li&gt;Maybe we could do retrospectives on a problem we've had and the solution
we've implemented, to see how others would have fixed it?&lt;/li&gt;
&lt;li&gt;Maybe this could be an opportunity to design a tool that would solve a
specific problem, or a modification on an existing tool so it would be a better
fit?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you guys have an idea about this, I'd be really interested in hearing
it!&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2010/07/27/Devops-Meetups-and-Devops-Dojos#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2010/07/27/Devops-Meetups-and-Devops-Dojos#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/535577</wfw:commentRss>
      </item>
    
  <item>
    <title>Why I don't want a 1024x600 screen</title>
    <link>http://blog.endemics.info/post/2010/06/02/Why-I-don-t-want-a-1024x600-screen</link>
    <guid isPermaLink="false">urn:md5:71a2f6516bba28f6e01fad22fc4562fd</guid>
    <pubDate>Wed, 02 Jun 2010 17:13:00 +0200</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;I know this is slightly off topic, but I've been complaining a lot lately
about 1024x600 screens on &lt;a href=&quot;http://www.twitter.com/endemics&quot; hreflang=&quot;en&quot;&gt;twitter&lt;/a&gt; and I probably need to explain why :)&lt;/p&gt;
&lt;p&gt;Apparently, lately 4:3 was declared deprecated and bad, so everyone moved to
16:10 or 16:9 formats. It was alright for me as long as the resolution was
above 1024x768, but unfortunately almost all the netbooks and tablets seem to
be afflicted with a 1024x600 screen. Except Apple's iPad.&lt;/p&gt;
&lt;p&gt;I might not want an Apple device for other reasons, but I don't think
1024x768 is a &lt;a href=&quot;http://www.techeye.net/hardware/ipad-display-is-a-bastard-screen-size&quot; hreflang=&quot;en&quot;&gt;bastard size&lt;/a&gt;. It certainly was the standard not long ago on
CRTs and LCDs. So much so, in fact, that (for better or worse) almost all
websites are designed for that size.&lt;/p&gt;
&lt;p&gt;So what happens when I try to view a website on my 1024x600 netbook? I
scroll. Vertically or horizontally. All the time. And I swear a lot too.&lt;/p&gt;
&lt;p&gt;Watching a video on YouTube is a pain unless I'm browsing full screen. Same
for all the Flash stuff. All in all, the user experience is really not
enjoyable, and most of the time comparable to having a 800x600 screen (I even
wonder if all this scrolling isn't causing me carpal tunnel syndrome on top of
all the other pains). And that's not limited to browsing the web. Using a
regular OS (not optimized for this form-factor) or editing a document is a pain
too.&lt;/p&gt;
&lt;p&gt;So, no thanks, no 1024x600 screen for me in the future. If you really insist
on the 16:9/16:10 format because it gives a nicer form factor for the device or
is cheaper to produce, then fine, but I would gladly pay a premium for a
1366x768 screen and avoid the broken 1024x600 resolution.&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2010/06/02/Why-I-don-t-want-a-1024x600-screen#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2010/06/02/Why-I-don-t-want-a-1024x600-screen#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/523927</wfw:commentRss>
      </item>
    
  <item>
    <title>the certified DBA</title>
    <link>http://blog.endemics.info/post/2010/03/25/the-certified-DBA</link>
    <guid isPermaLink="false">urn:md5:0ec84a5d1b5f2faede8a8d2c7310892d</guid>
    <pubDate>Thu, 25 Mar 2010 11:18:00 +0100</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;Yesterday my friend and ex-colleague &lt;a href=&quot;http://iain.cx/&quot; hreflang=&quot;en&quot;&gt;Iain&lt;/a&gt; sent me a link to &lt;a href=&quot;http://thedailywtf.com/Comments/The-Certified-DBA.aspx&quot; hreflang=&quot;en&quot;&gt;this
dailyWTF article&lt;/a&gt;. I felt the content of the article and of (most of) the
comments were so wrong on so many levels I had to write something about
it...&lt;/p&gt;
&lt;h2&gt;the RH Performance Tuning course&lt;/h2&gt;
&lt;p&gt;I suspect he did this because he remembered a heated discussion I had in a
team meeting with our team leader back when Iain and myself worked together:
our team leader was coming back from a &amp;quot;Red Hat Performance Tuning&amp;quot; course and
said there were a lot of things that we could do to improve the performance of
our systems, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ensure that all systems had swap defined as twice the amount of RAM&lt;/li&gt;
&lt;li&gt;ensure that the /tmp partitions were created on the outside parts of the
&amp;quot;spindles&amp;quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I expressed serious doubts about the validity of those assumptions in a
modern IT environment.&lt;/p&gt;
&lt;p&gt;First of all, memory is cheap nowadays and QoS matters. In most cases, a
server that is swapping is all but guaranteed to be unable to offer the
right level of service: swapping is either an indication that there is something
wrong with the software, like a memory leak, or that the server is not properly
sized for the task.&lt;/p&gt;
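&lt;p&gt;To make this concrete, here is a minimal Python sketch (not taken from any of the systems discussed here) that reports how much swap is in use. It assumes a Linux host whose /proc/meminfo exposes the SwapTotal and SwapFree fields in kB; the function name and the sample figures are made up for illustration.&lt;/p&gt;
&lt;pre&gt;
```python
def swap_used_kb(meminfo_text):
    """Return the swap in use, in kB, from text in the /proc/meminfo format."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token after the colon is the numeric value.
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


# Hypothetical sample; on a real Linux host, read the text from /proc/meminfo instead.
SAMPLE = "MemTotal: 8000000 kB\nSwapTotal: 2000000 kB\nSwapFree: 1500000 kB\n"
print(swap_used_kb(SAMPLE))
```
&lt;/pre&gt;
&lt;p&gt;A single reading proves little; it is the trend over time, next to the service's response times, that tells you whether you have a leak or an undersized server.&lt;/p&gt;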
&lt;p&gt;The partitioning issue is very similar to the case described in the dailyWTF
article. It is based on physical assumptions that are not necessarily true
nowadays, especially when we are talking about partitions made on a hardware
RAID1 volume using multi-platter drives. In my opinion, there was no guarantee
that the firmware of either the RAID controller or the drives would do
what we think it does.&lt;/p&gt;
&lt;h2&gt;Proof versus Belief&lt;/h2&gt;
&lt;p&gt;Interestingly enough, it seems that I was wrong and that drive
manufacturers do their best to keep a mapping that is still in sync with the
belief system in place, as proved by the &lt;a href=&quot;http://www.coker.com.au/bonnie++/zcav/results.html&quot; hreflang=&quot;en&quot;&gt;zcav
tests&lt;/a&gt; pointed out in one of the article's comments.&lt;/p&gt;
&lt;p&gt;What is important here is the experimental evidence as opposed to beliefs or
possibly outdated knowledge.&lt;/p&gt;
&lt;p&gt;Still, it is important to remember that the published zcav data is only
valid in the context of the tests: it might not be valid for your production
system with your set of drives, your RAID controller and, moreover, for your
application needs.&lt;/p&gt;
&lt;p&gt;Of course, with enough experience with a specific application, there is a
possibility that generic rules can be deduced, as long as they are
methodically derived rules and not just wild assertions.&lt;/p&gt;
&lt;p&gt;Alas, if the conditions change, the rules are no longer valid, so you can't
blindly follow them when you do performance optimization. You can only use them
as hints or possible things to try: only in-situ measurements can validate a
hypothesis and prove performance increases.&lt;/p&gt;
&lt;p&gt;Which means that if you have to optimize performance, you have to use your
brains and common sense, and produce reproducible test results!&lt;/p&gt;
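&lt;p&gt;As an illustration only, here is a minimal Python sketch of what a repeatable measurement could look like: the same workload is run several times and summarized by its median and spread, so that a before/after comparison does not rest on a single lucky run. The function names and the sample workload are hypothetical stand-ins for whatever you are actually tuning.&lt;/p&gt;
&lt;pre&gt;
```python
import statistics
import time


def measure(workload, runs=5):
    """Run the workload several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, spread) so two configurations are compared on the same terms."""
    return statistics.median(durations), max(durations) - min(durations)


def sample_workload():
    # Hypothetical stand-in for the real operation being tuned (a query, a batch job...).
    sum(i * i for i in range(100_000))


median, spread = summarize(measure(sample_workload))
print(f"median={median:.4f}s spread={spread:.4f}s")
```
&lt;/pre&gt;
&lt;p&gt;Run it before and after the change, on the production-like system itself; if the difference between the two medians is smaller than the spread, you have not proved anything yet.&lt;/p&gt;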
&lt;h2&gt;Overcomplicated setup&lt;/h2&gt;
&lt;p&gt;With such a complicated setup, it can be difficult to measure the right
thing and there can be plenty of unwanted interactions.&lt;/p&gt;
&lt;p&gt;On the other hand, if you can't prove it makes a difference, you just
over-complicate for the sake of it, which means it will be more difficult to
maintain and diagnose for no provable benefit.&lt;/p&gt;
&lt;p&gt;If it brings more performance, the benefits will still have to be evaluated
against the operational risks that the complexity brought.&lt;/p&gt;
&lt;p&gt;Not only the amount of complexity but also where you add the complexity
matters: the more complexity you push down the stack, the harder it is to
change things. A configuration option in an application is easier to change (and
revert) than the version of the software.&lt;/p&gt;
&lt;p&gt;If you depend on a specific software and hardware stack for your
system/application to work, you are tied and have very limited ways to make
your solution evolve or adapt. This tends to create systems where changes induce
more risks.&lt;/p&gt;
&lt;p&gt;It might not necessarily be a problem, especially if your system does not
evolve a lot either in functionality or scale, and the risks can be mitigated
by tests, but those costs must be clearly understood when the decision is
taken.&lt;/p&gt;
&lt;p&gt;Usually a lot of optimization can be done at the highest level, i.e. the user
side, with limited risk and effort. However, it can't be achieved if you don't
understand what you're doing or what the client
application/users are doing.&lt;/p&gt;
&lt;p&gt;Instead of shooting in the dark by applying random recipes, talking to the
users to get the Big Picture can help make the system more in sync with the
actual needs, and will let you identify which paths can be explored.&lt;/p&gt;
&lt;p&gt;Sometimes it can be as easy as spreading the load over the course of a
day/week/month instead of having everyone doing their queries at the same
time.&lt;/p&gt;
&lt;p&gt;Also, provided the DB is not used by a blackbox system, there are different
things that can be done either on the DB or at the application level:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;pruning the tables&lt;/li&gt;
&lt;li&gt;optimizing the schema/indexes/queries&lt;/li&gt;
&lt;li&gt;queuing/asynchronous queries&lt;/li&gt;
&lt;li&gt;splitting/sharding the tables&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From my experience, the need to do performance tuning usually comes from a
bad design and the inability to scale horizontally. The IT industry has been
relying too much on the ability to do vertical scaling, and unfortunately, it
seems that apart from the big web players, only a few companies have realized
that vertical scaling is barely an option now.&lt;/p&gt;
&lt;h2&gt;The communication problem&lt;/h2&gt;
&lt;p&gt;I believe the communication issue and lack of trust to be the most
fundamental problem.&lt;/p&gt;
&lt;p&gt;It is obvious that the company has an &amp;quot;Us vs Them&amp;quot; syndrome between DBAs and
Ops, indicating a big silo problem, and I doubt this problem is limited to
the &lt;a href=&quot;http://www.krisbuytaert.be/blog/devops-secops-dbaops-netops&quot; hreflang=&quot;en&quot;&gt;Ops DBA interaction&lt;/a&gt;: it probably extends to other teams'
interactions as well.&lt;/p&gt;
&lt;p&gt;I think the Ops person and his boss did the wrong thing there by sweeping
things under the rug. I believe that it was only pride that made the Ops guy
behave the way he did, and I think this will only create more problems for him
in the future.&lt;/p&gt;
&lt;p&gt;Maybe a better approach would have been to show the DBA that there was a better way to
do things on a system level, to create a trust relationship with him and to
encourage communication with the DB users.&lt;/p&gt;
&lt;p&gt;Pouring oil on the fire is not going to stop the fire, or the
false beliefs, from spreading...&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2010/03/25/the-certified-DBA#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2010/03/25/the-certified-DBA#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/502417</wfw:commentRss>
      </item>
    
  <item>
    <title>self documented agile infrastructure</title>
    <link>http://blog.endemics.info/post/2009/03/02/self-documented-agile-infrastructure</link>
    <guid isPermaLink="false">urn:md5:ff0aa745be481feb0b9af97284b59c43</guid>
    <pubDate>Mon, 02 Mar 2009 14:13:00 +0100</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;In my latest position, as an IT Operations Manager, I was confronted with the
classic problems of an immature Operations team: we were understaffed, in a
fire-fighting mode, there was poor documentation (either missing or not
up-to-date, often misleading), almost no backup, and the team members had
almost no overlap in their skillsets and were demotivated.&lt;/p&gt;
&lt;p&gt;I couldn't afford to lose a single person from my team as the loss of knowledge
would be dire for the company, and to make things even more complicated, our
CEO wanted us to be able to deploy our home-made software to remote client
sites.&lt;/p&gt;
&lt;p&gt;On the good side, one of my team members had an excellent knowledge of the
home-made software, another was a good Perl developer, there was a good
knowledge of Suse and rpm packaging, and they had already set up a subversion
repository and a basic puppet setup.&lt;/p&gt;
&lt;p&gt;To consolidate the knowledge and move away from manual operations, it was
decided to use svn, puppet, Suse and pxe to build a self-documented agile
infrastructure where anyone would be able to deploy new services.&lt;/p&gt;
&lt;h2&gt;The basic blocks&lt;/h2&gt;
&lt;p&gt;The applications were packaged using rpm and the latest valid version stored
on a file server, but all the configuration files (including those needed to
build the packages) were stored in subversion.&lt;/p&gt;
&lt;p&gt;This way, it was possible to keep track of the changes (who, why) while at
the same time having a way to retrieve the latest valid version using a simple
'svn co'. The svn commits were sent to all team members, so it kept everyone
informed of what was going on.&lt;/p&gt;
&lt;h2&gt;The recipes&lt;/h2&gt;
&lt;p&gt;The services and server setup were described in puppet and stored in
subversion. The services were described in a generic manner using templates as
configuration files so you could instantiate a new service by deploying the
needed rpms and creating &amp;quot;on the fly&amp;quot; the configuration files adapted to that
specific instance. The important idea was that no manual operation was needed
to deploy a new service, thus allowing it to be perfectly reproducible.&lt;/p&gt;
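&lt;p&gt;The idea of a generic description instantiated per service can be sketched in a few lines of Python. This is not puppet's own template syntax and not our actual recipes: it uses Python's string.Template as a stand-in, and the configuration keys, paths and service name are invented for the example.&lt;/p&gt;
&lt;pre&gt;
```python
from string import Template

# Hypothetical generic template for one service instance; the keys are illustrative only.
SERVICE_TEMPLATE = Template(
    "service.name=$name\n"
    "service.port=$port\n"
    "service.datadir=/srv/$name/data\n"
)


def render_instance(name, port):
    """Build the configuration text for one instance from the generic template."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # so an incomplete instance definition fails loudly instead of deploying.
    return SERVICE_TEMPLATE.substitute(name=name, port=port)


print(render_instance("billing", 8081))
```
&lt;/pre&gt;
&lt;p&gt;The same template rendered with different parameters gives a second instance, which is what makes a deployment repeatable without manual edits.&lt;/p&gt;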
&lt;p&gt;Thanks to this solution, one could easily deploy a new instance of a service
on either a physical or virtual machine. As we were in a j2ee world with a
multi-tiered application, you could either stack several services on a machine
(for development or testing for instance) or one service per machine, depending
on your needs.&lt;/p&gt;
&lt;p&gt;The nice side effect is that puppet is the live documentation of your
systems as it defines and enforces the active configurations! Since the puppet
files are also stored in svn, it is possible to see all the changes for a file
through time with the associated comments.&lt;/p&gt;
&lt;p&gt;The drawback of the system is that extreme care must be taken not to
manually tamper with the configuration of the servers: everything MUST go
through puppet, and the comments must be kept relevant.&lt;/p&gt;
&lt;h2&gt;The deployment system&lt;/h2&gt;
&lt;p&gt;The machines could be either physical or virtual machines, and pxe combined
with kickstart was used to deploy a basic setup consisting of a basic Suse +
puppet. Of course the kickstart files were stored in svn. Once the server was
deployed, puppet could then populate the server with a set of
services/configuration.&lt;/p&gt;
&lt;h2&gt;The backup server&lt;/h2&gt;
&lt;p&gt;Since a service/server could be easily reinstalled using this solution,
there was no need to back them up, which is a big time and tape saver.&lt;/p&gt;
&lt;p&gt;This way you can concentrate on saving your application data, that is your
production dataset as well as the files on the file server and the subversion
repository.&lt;/p&gt;
&lt;p&gt;In our setup, it was decided to sync the subversion repository and the files
stored on the fileserver between 2 sites. Also, thanks to the use of
subversion, everyone in the team had the files on their own machine.&lt;/p&gt;
&lt;h2&gt;Disaster recovery&lt;/h2&gt;
&lt;p&gt;During the implementation, cross-dependencies between the subversion,
installation, puppet, file and backup servers were considered in order to allow
a complete restoration of the infrastructure, provided that we had access to
the backup tapes and could reinstall the backup server manually using a Suse
install media.&lt;/p&gt;
&lt;p&gt;It was decided that the subversion, file, build and installation services
would be installed on a single machine. From there, you could reinstall the
puppet server via a very limited set of operations that were documented with
care (basically, installing the packages and checking out the svn
repository).&lt;/p&gt;
&lt;p&gt;Once this is done, and provided &lt;em&gt;all&lt;/em&gt; your infrastructure is
described using puppet recipes, you can easily repopulate your servers in
the event of a disaster, but the same approach could also be used to install everything on a
remote site, provided you have a machine where you can bootstrap your
infrastructure.&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2009/03/02/self-documented-agile-infrastructure#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2009/03/02/self-documented-agile-infrastructure#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/317622</wfw:commentRss>
      </item>
    
  <item>
    <title>On the Shortcomings Of Systems and Networks Engineers Training</title>
    <link>http://blog.endemics.info/post/2009/01/16/On-the-Shortcomings-Of-Systems-and-Networks-Engineers-Training</link>
    <guid isPermaLink="false">urn:md5:731fd1888f9f7aff3c8da62c03ceec8c</guid>
    <pubDate>Fri, 16 Jan 2009 20:29:00 +0100</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;As far as I know, there is no course to become a Systems and Networks Engineer,
aside from courses to learn (and gain certification in) a given vendor's
product. In fact, back in my university years, I remember that my teachers
seemed to assume that there was no interest in this kind of thing as learning
the options and caveats of a particular product was all you needed. In their
eyes, algorithmic and development approaches (RAD and OO at the time) were
where the real focus lay.&lt;/p&gt;
&lt;p&gt;In my case, the situation might have been worsened by the traditional
friction in France between university (where the &amp;quot;real, pure, academic&amp;quot; research
is done) and the Ecoles d'Ingénieur (where you learn about engineering and
sometimes conduct &amp;quot;applied research&amp;quot;), but I'm not so sure the situation would
have been so different in an engineering school or another country (I'll be
interested in your feedback there to prove me wrong!).&lt;/p&gt;
&lt;p&gt;So, how does one become a Systems and Networks Engineer? Well, it's easy,
you learn by yourself, usually starting with a small set of machines and mainly
by a trial-and-error approach. If you're lucky enough, you might benefit from
someone else's experience and coaching. But still, it remains mostly an ad-hoc
approach.&lt;/p&gt;
&lt;p&gt;Of course, you quickly learn to avoid tinkering with the production platform
on a Friday evening, and given enough experience you can even begin to
&amp;quot;guesstimate&amp;quot; - to a greater or lesser degree of accuracy - the impact of
such-and-such a modification, then hopefully the number of systems you manage
will increase until eventually you find out the hard way that complexity
doesn't grow linearly with the number of systems.&lt;/p&gt;
&lt;p&gt;I would even claim that given the chance to work with different environments
and large scale platforms (highly available, highly loaded web platforms; HPC
clusters; heterogeneous banking environments), one might infer common rules of
thumb and even have the hubris to try to find a meaning in the chaos.&lt;/p&gt;
&lt;p&gt;The fact, however, is that I believe this ad-hoc approach to learning the
job and the lack of (field proven) best-practice references to be The Source Of
All Evil.&lt;/p&gt;
&lt;p&gt;First of all, this learning process produces an approach built on
unproven beliefs, mythology and carved-in-stone rules (&amp;quot;swap space must be
twice the amount of RAM&amp;quot;). It also makes it difficult to assess someone's
ability as a Systems and Networks Engineer other than by considering her technical
knowledge/certifications or previous experience in a similar position.&lt;/p&gt;
&lt;p&gt;Secondly, the good practice of &amp;quot;not changing what works&amp;quot;, forged by the
trial-and-error approach, tends to encourage cruft accumulation and creates a
certain reluctance to change anything at all. As a result, risk-mitigation
approaches such as continuous integration and small incremental steps are replaced by
&amp;quot;big-bang&amp;quot; style changes with an increased risk of failure.&lt;/p&gt;
&lt;p&gt;All in all, I believe that it has created a situation whereby IT Operations
is working against the (in my eyes desirable) goal of becoming agile and
business-oriented - a true competitive differentiator and not just a &amp;quot;cost
center&amp;quot; working in firefighting mode.&lt;/p&gt;
&lt;p&gt;The &amp;quot;cost center&amp;quot; aspect has motivated the few approaches trying to address
the lack of maturity in IT Operations: ITIL, COBIT and so on. To the best of my
knowledge, they are all process-oriented and mostly address the problem from a
financial perspective (ROI, risk management).&lt;/p&gt;
&lt;p&gt;While I believe there are interesting ideas in all of them, and that cost is
an important factor in the need - solution equation, I am not too convinced by
the &amp;quot;process&amp;quot; approach which limits risk but adds weight and inertia to the
organisation and kills pleasure and innovation. I confess I might be too
influenced by the ideas of the &lt;a href=&quot;http://agilemanifesto.org/&quot; hreflang=&quot;en&quot;&gt;Agile Manifesto&lt;/a&gt; here, but I can't help thinking that neither
Google nor Facebook used ITIL to get where they are.&lt;/p&gt;
&lt;p&gt;I also find them too complicated to be real enablers and believe that even
though they warn against it, they incite dogmatism where pragmatism should
rule. Because of this, I think they fight against the exact goals they are
trying to achieve.&lt;/p&gt;
&lt;p&gt;So how can we get out of this mess?&lt;/p&gt;
&lt;p&gt;We would definitely benefit from an increase in interest from the academic
world towards IT Operations and Infrastructure realities. Consider Google's
study on &lt;a href=&quot;http://research.google.com/archive/disk_failures.pdf&quot; hreflang=&quot;en&quot;&gt;hard drive failures&lt;/a&gt;. Before its publication, different people
had wildly differing beliefs about disk failures based on factors such as
their own experience with a statistically insignificant sample of drives,
manufacturer advertising (propaganda), or luck. With a large-scale, scientific
study to turn to, people gained a much better understanding of the subject
matter.&lt;/p&gt;
&lt;p&gt;Naturally, courses on availability, scalability, and the design and
management of large-scale systems and networks would be welcome in universities.&lt;/p&gt;
&lt;p&gt;But successful companies such as Google or Amazon couldn't have emerged
without good IT engineering practices and a sound infrastructure (after all,
Amazon even sells its services now via EC2 and S3!), so it is certainly
possible &lt;strong&gt;today&lt;/strong&gt; to build an IT infrastructure that makes a
difference.&lt;/p&gt;
&lt;p&gt;Then we definitely have the responsibility to learn from those leaders and
spread that information around if we want IT Operations and Infrastructures to
mature and serve the business and our own users (kudos here to websites such as
&lt;a href=&quot;http://highscalability.com/&quot; hreflang=&quot;en&quot;&gt;High Scalability&lt;/a&gt; or
&lt;a href=&quot;http://www.storagemojo.com/&quot; hreflang=&quot;en&quot;&gt;Storage Mojo&lt;/a&gt; for their
excellent work).&lt;/p&gt;
&lt;p&gt;Undoubtedly most of the technologies those companies use to manage their
infrastructures are purpose-built in-house developments that won't be
published, so we as a community need to build the tools we need, in the same way
developers have started open-source re-implementations of well-known building
blocks such as MapReduce (for instance &lt;a href=&quot;http://hadoop.apache.org/core/&quot; hreflang=&quot;en&quot;&gt;Hadoop&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Tools such as &lt;a href=&quot;http://madstop.com/&quot; hreflang=&quot;en&quot;&gt;Luke Kanies'
Puppet configuration management&lt;/a&gt;, rapid deployment tools such as &lt;a href=&quot;http://www.openqrm.org/&quot; hreflang=&quot;en&quot;&gt;openQRM&lt;/a&gt; or easily adaptable and
scalable monitoring systems such as &lt;a href=&quot;http://hobbitmon.sourceforge.net/&quot; hreflang=&quot;en&quot;&gt;Hobbit (now renamed Xymon)&lt;/a&gt; should be &lt;strong&gt;endemic&lt;/strong&gt;
to our infrastructures, yet they are sadly too often an exception.&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2009/01/16/On-the-Shortcomings-Of-Systems-and-Networks-Engineers-Training#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2009/01/16/On-the-Shortcomings-Of-Systems-and-Networks-Engineers-Training#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/317616</wfw:commentRss>
      </item>
    
  <item>
    <title>Yet Another Blog?</title>
    <link>http://blog.endemics.info/post/2009/01/13/Yet-Another-Blog</link>
    <guid isPermaLink="false">urn:md5:a75737ed6c91c3dc66ee452cdffce420</guid>
    <pubDate>Tue, 13 Jan 2009 14:53:00 +0100</pubDate>
    <dc:creator>Gildas LE NADAN</dc:creator>
            
    <description>    &lt;p&gt;Hello there!&lt;/p&gt;
&lt;p&gt;In this introductory post I will try to explain why on earth I've started
Yet Another Blog.&lt;/p&gt;
&lt;p&gt;For years now I've exchanged ideas about IT Infrastructure and Operations
with my colleagues and friends, be they IT Ops guys or dev dudes (or even from
a completely different background). I've learned a lot from those discussions
and I believe my work has matured as a result.&lt;/p&gt;
&lt;p&gt;Lately though, this flow of communication has dried up for several reasons
and I've grown frustrated about it, hence the idea of this blog. Hopefully it
will allow for fruitful interaction with people I know and indeed others that I
don't know. People with whom I am impatient to share ideas and experience!&lt;/p&gt;
&lt;p&gt;So, welcome aboard!&lt;/p&gt;
&lt;p&gt;Gildas&lt;/p&gt;</description>
    
    
    
          <comments>http://blog.endemics.info/post/2009/01/13/Yet-Another-Blog#comment-form</comments>
      <wfw:comment>http://blog.endemics.info/post/2009/01/13/Yet-Another-Blog#comment-form</wfw:comment>
      <wfw:commentRss>http://blog.endemics.info/feed/atom/comments/317306</wfw:commentRss>
      </item>
    
</channel>
</rss>