X.Org Is Getting Their Cloud / Continuous Integration Costs Under Control

  • X.Org Is Getting Their Cloud / Continuous Integration Costs Under Control

    Phoronix: X.Org Is Getting Their Cloud / Continuous Integration Costs Under Control

    You may recall from earlier this year that the X.Org/FreeDesktop.org cloud costs were growing out of control, primarily due to their continuous integration setup. They were seeking sponsorships to help out with these costs, and ultimately they've attracted new sponsors while also better configuring/optimizing their CI setup to get those costs back to more manageable levels...


  • #2
    Surely even at $3k/month they would be better off self-hosting everything. If your costs are really that high, is it wise to continue pursuing the cloud?



    • #3
      I used to work on the CI flow at a major semiconductor firm. It's amazing how quickly costs can spiral out of control when a seemingly minor change is run thousands of times per day. Some yahoo decides to improve his build speeds by using `make -j8`, and suddenly your build speeds drop precipitously as the entire compute farm melts down, NFS servers fall over, and scratch disks fill up.
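
      One low-tech mitigation, purely as a sketch (the values are illustrative and have nothing to do with X.Org's setup): GNU make can be told not to start new jobs once the system load average passes a threshold, which blunts the damage from everyone's favourite -j value.

      ```
      # -j caps the number of parallel jobs; -l / --load-average stops make
      # from spawning new jobs while the load average is above the given value.
      # Both flags are GNU make; the numbers here are illustrative only.
      make -j8 --load-average=4
      ```

      On a shared farm the same idea usually has to live in the wrapper scripts or the scheduler rather than in each developer's habits.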

      Not to mention CI causes a 'throw it over the wall' mentality, where people code crap and count on the CI system to just reject it if it's bad.

      Don't get me wrong, I'm a big advocate, but this story rings very true to me.



      • #4
        The cloud scales right through the budget; better to use a distributed system on the client side.



        • #5
          Very hard to understand this story. So much focus on dollar costs, which didn't seem like that much?
          > " ... $6k per month down to around $3k per month ... "
          More interesting to me:
          > " ... Among the optimizations as a result were configuring their RAID array, garbage collecting registry images, and remediations on their artifacts dropped bandwidth usage from 3.5TB per week down to around 150GB. Google Compute costs were reduced from an updated Nginx, Git tweaking, and other tuning."
          The environmental damage is greatly reduced here. So much of our bandwidth is badly used and overloaded. If engineering changes can be this effective, we should focus on that, not on the immediate financial costs. The real story is that bandwidth is now roughly one twenty-third (1/23) of what it used to be: 3.5 TB/week is about 3,500 GB, and 3,500 / 150 ≈ 23.
          The other major concern is the cause, treatment and cure. An engineering design change was needed: bringing in an outsider to examine the issue. A simple, low-cost, "easy" treatment made a dramatic change. Lateral-thinking engineers like myself excel at this kind of work; so much engineering of every type is done by closed-minded code crunchers. The "remedy" is really simple, as this case shows.
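          On the "garbage collecting registry images" point in that quote, a minimal sketch of what that typically looks like, assuming a stock Docker Distribution registry (the command and config path are the registry's defaults, not anything confirmed about the fd.o setup):

          ```
          # Dry run first: report blobs no longer referenced by any manifest.
          registry garbage-collect --dry-run /etc/docker/registry/config.yml
          # Then actually delete the unreferenced blobs to reclaim disk space.
          registry garbage-collect /etc/docker/registry/config.yml
          ```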
          Last edited by gregzeng; 19 September 2020, 05:16 AM.



          • #6
            Originally posted by gregzeng View Post
            Very hard to understand this story. So much focus on dollar costs, which didn't seem like that much?
            The thing is, they're spending an absurd amount of money that they don't have to. For what they're paying, they could easily buy a fully decked-out 128-core EPYC server, then spend a vastly smaller amount colocating it in a data center and use it as their CI/CD server, and frankly, with a 128-core beast like that, probably the rest of their infrastructure too. This is absurd financial mismanagement.
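
            A rough back-of-the-envelope, with made-up hardware and colo prices (neither figure comes from the article), just to show the order of magnitude:

            ```
            # Hypothetical numbers: ~$30k for a decked-out 128-core EPYC box and
            # ~$500/month of colocation, versus the ~$3k/month cloud bill cited above.
            cloud_per_month=3000
            colo_per_month=500
            server_price=30000
            echo "Months to pay off the server: $(( server_price / (cloud_per_month - colo_per_month) ))"
            # -> 12
            ```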



            • #7
              Originally posted by gregzeng View Post
              Very hard to understand this story. So much focus on dollar costs, which didn't seem like that much?
              I think the issue was that it was showing exponential growth month to month. It was going to become a problem pretty quickly, and they spent six months not looking at their bill and not realizing it was going to be an issue, then suddenly found out.
