The GCC Git Conversion Heats Up With Hopes Of Converting Over The Holidays


  • #21
    Originally posted by stevecrox
    It feels like gnu started that and forgot the check-in part and now it's been two years
    It should be noted that GCC has put a stake in the ground: on Dec 16th, the most complete conversion solution will be chosen, and GCC intends for the project to be on git at the beginning of the new year (there are a few days over the extended holiday period when the final conversion would be done against a frozen svn repo). None of the choices gets everything correct. Right now Maxim's tools are arguably the most ready/complete, but ESR has another week to finish his reposurgeon tool and fix his self-admitted critical bugs. Nothing focuses you like having a hard deadline.



    • #22
      If they really need to switch to Git, why not simply make a new branch and only do that on the new Git, while the old SVN is archived as read-only for the time being? It must be cheaper and easier to keep it that way than to convert something that, most likely, won't be trusted in the end anyway.



      • #23
        Originally posted by Spam
        If they really need to switch to Git
        Technically, they probably don't need to switch, although switching has some well understood benefits for a large code base such as gcc.
        why not simply make a new branch and only do that on the new Git
        There are certainly some projects that have made that choice. But there are those in the gcc project who do not wish to lose the author attribution, code fragment, and dead branch information in the historical record when using the common tools they will be using going forward (git). Note that, to add to the various challenges, the svn repo was itself migrated from cvs, which generated a different set of artifacts.
        while the old SVN is archived as read-only for the time being
        The svn repo will probably be kept (essentially) forever, as no conversion is likely to be absolutely perfect, and having access to the historical base might also prove valuable to some computing archivist from the 22nd century trying to understand what the grey beards contributed. Of course, whether one will be able to find a working subversion binary in the 22nd century is a different question.
        Last edited by CommunityMember; 12-08-2019, 07:21 PM.



        • #24
          Originally posted by discordian
          Conversion is one thing, but I don't get why they did not convert to git and then do the cleanup there. Finding identical (sub)directories in git is rather easy with the hashes being used, and access is generally faster than with svn.
          Cleanup of history means rewriting history. It affects everyone who forked your repo, i.e. probably hundreds of people, who all have to rebase their branches onto your new history. But it is not clear why anyone needs a full, ideal history in git. Make something that includes the still-supported versions; everything else can be cleaned up in a separate legacy repo without writers.
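
As a rough illustration of the quoted point about hashes: git stores each directory as a tree object, and two directories with identical contents get the same tree hash. The sketch below groups directory paths by hash, assuming input in the format printed by `git ls-tree -r -t <rev>` (mode, type, hash, then a tab and the path). The function name `identical_dirs`, the sample paths, and the shortened placeholder hashes are all hypothetical, and this is not what any of the GCC conversion tools actually do.

```python
from collections import defaultdict


def identical_dirs(ls_tree_output):
    """Group directory paths that share a git tree hash.

    Expects text in the `git ls-tree -r -t` format:
    "<mode> <type> <hash>\t<path>" per line.
    """
    by_hash = defaultdict(list)
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        meta, path = line.split("\t", 1)
        _mode, obj_type, obj_hash = meta.split()
        if obj_type == "tree":  # only directories, not blobs
            by_hash[obj_hash].append(path)
    # Keep only hashes seen at more than one path.
    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}


# Hypothetical sample; real hashes are 40 hex characters.
SAMPLE = "\n".join([
    "040000 tree 1111\ttrunk/libfoo",
    "100644 blob 2222\ttrunk/libfoo/a.c",
    "040000 tree 1111\tbranches/old/libfoo",
    "100644 blob 2222\tbranches/old/libfoo/a.c",
    "040000 tree 3333\ttrunk/libbar",
])

if __name__ == "__main__":
    for tree_hash, paths in identical_dirs(SAMPLE).items():
        print(tree_hash, paths)
```

Note that this only compares directories within a single revision; comparing across history would mean running it per commit, and it says nothing about the cost of rewriting history afterwards.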



          • #25
            Originally posted by stevecrox
            It feels like gnu started that and forgot the check-in part and now it's been two years
            They don't want him to get back to other more important stuff, that much is obvious.



            • #26
              Originally posted by CommunityMember
              Technically, they probably don't need to switch, although switching has some well understood benefits for a large code base such as gcc.

              There are certainly some projects that have made that choice. But there are those in the gcc project who do not wish to lose the author attribution, code fragment, and dead branch information in the historical record when using the common tools they will be using going forward (git). Note that, to add to the various challenges, the svn repo was itself migrated from cvs, which generated a different set of artifacts.

              The svn repo will probably be kept (essentially) forever, as no conversion is likely to be absolutely perfect, and having access to the historical base might also prove valuable to some computing archivist from the 22nd century trying to understand what the grey beards contributed. Of course, whether one will be able to find a working subversion binary in the 22nd century is a different question.
              Except, converting this information to git is rather simple. Even a shell script can accomplish this. This is absolutely mindblowing to me. It looks to me like they are trying not to freeze the repo for long, for fear of developers not being able to contribute. By doing that, they've created a mess, spending more development hours creating a 'solution' than it would take if they simply used off-the-shelf tools. Admittedly, I haven't looked at the GCC codebase/history, nor do I have the desire to, but I've seen or performed large conversions in the past. You might waste a weekend, maybe even a week (though I've never seen it take that long); however, during that time you simply ask developers to hold onto their changes or improve them. Once the conversion is completed, as long as you are using something like GitHub or Bitbucket, you can simply have them do pull requests, and deal with the last-minute nonsense when the new git tree is live. If you aren't using something like GitHub or Bitbucket, then I have to question your sanity.

              To further prove how open source projects often suffer from such insanity: I've read elsewhere that KDE refused to use GitHub because 'it isn't FOSS'. At the end of the day, git is git. Even if GitHub disappeared tomorrow, that would not affect the project in the least. It would be relatively easy to use a different platform or host your own. If the loss of 'bugs' or 'issues' on GitHub's issue tracker is something you are worried about, you can simply have your own API that logs these to a database as they come in, as GitHub not only has a full API but also has event hooks. What did KDE do? They instead opted to go with an incomplete and immature platform. A platform where you can't even reset your own password if you don't remember your 'username'. A platform that is awkward to navigate. A platform that has likely cost a ton of development time and an insignificant amount of money for hosting. All in the name of keeping everything FOSS. While one can appreciate the spirit of that approach, that platform limits developer contributions to those who can be bothered to have yet another account, learn yet another system that isn't an established industry standard, and deal with the inevitable bugs of that platform. Their choice has cost them both talent and money, and this is evidenced by their constant call for help in the way of both developers and dollars.

              Don't even get me started on some of the other Open Source nonsense we've been witness to in the past. This is not to say that this can't happen in closed source environments as well. However, when a company's bottom line is erm...on the line, they tend to value developer/support time and cost of implementation over all else, which in turn tends to work out well for everyone (except in some of the crazier 'big business' scenarios I've seen, which are almost always related to incompetence).



              • #27
                LOL.
                Just archive this history and start a fresh git repo with 1 commit.
                The Linux kernel does this every few years to keep things manageable in <4GB RAM.
                I hope they use git-submodules if they opt to keep the history.



                • #28
                  So far the responses on the gcc mailing list are pushing back, albeit very mildly, against ESR's lack of trust in git-svn. Some could even be considered 'jabs'. Popcorn might be in order.



                  • #29
                    Originally posted by cybertraveler
                    I don't know why some of you are so salty or angry about this.

                    I just see a guy who wants to carefully and cleanly convert the data and has been working on doing this in the spare time he has available.

                    Any cowboy with some basic coding knowledge could do a fast conversion using existing tools or even some homebrew software. However, the result would almost certainly be an imperfect conversion that would leave a trail of confusion and problems for years to come. But that cowboy would have "done his job" and received praise for doing it, so he probably wouldn't care. I've met those kind of coders; they are numerous. There are far fewer coders who are methodical, far-thinking and considerate.

                    If I was a GCC dev I'd prefer that this guy takes another 2 years doing this task properly instead of it being done quickly with potential messy fallout.
                    True, given how old the gcc codebase is, there are a lot of commits, branches etc. and cleanly copying over the history will be quite tough.



                    • #30
                      This page gives some indication of the difficulty of converting the gcc repository.
                      https://gcc.gnu.org/wiki/GitConversion
                      Basically, it seems that it is not a simple import: some revisions/changesets have to be tweaked in order to fix artifacts left over from the past cvs->svn conversion. On that page, Maxim's solution is described as inferior to ESR's conversion via the reposurgeon tool.

                      Anyway the GCC history starts from 1988, and the current gcc repository contains about 350 branches spread over ~280000 commits.

