9 Years After Starting, AppStream 1.0 Is Coming For Cross-Distribution Package Metadata


  • #11
    Originally posted by Ximion View Post
    "Barely adopted" depends on your point of view.
    Okay, maybe "less adopted outside the Debian world" would be more appropriate to say.

    Originally posted by Ximion View Post
    The AppStream reference implementation is used by default in all of KDE
    Except for org.kde.Platform and org.kde.Sdk, as well as all KDE apps at kdeapps and Flathub: they use org.freedesktop.appstream-glib instead.

    Originally posted by Ximion View Post
    by Elementary Linux for the AppCenter
    And e.g. Endless OS uses a different implementation.

    Originally posted by Ximion View Post
    by Debian, Ubuntu
    You're right, I should mention that. The thing is that Red Hat/Fedora provides the AppData file even when upstream doesn't provide it. I haven't seen a similar practice in Debian, but it may have changed since I last checked it.
    Moreover, as far as I know, snaps don't require AppData at all (although they can incorporate external metadata thanks to the parse-info function), while flatpaks explicitly require it.

    Originally posted by Ximion View Post
    Arch Linux to generate, validate and maintain their metadata
    How does Arch use appstream-cli? I have not seen it in any PKGBUILD.

    Originally posted by Ximion View Post
    and at Purism for metadata generation and online presentation
    But they said that their store is centered around Flatpaks as the packaging format for applications.

    Originally posted by Ximion View Post
    Not to mention projects using the tools directly to maintain their metainfo files.
    From what I gather, most projects use appstream-glib (appstream-util) or leave the matter to package maintainers. Can you give examples of any projects that use appstream (appstream-cli) in the Makefile/CMake/Meson file or Travis CI?

    Originally posted by Ximion View Post
    It's unfortunate that there are two implementations, but Richard and I talk regularly and we worked on quite a few projects together. In the long run we maybe will be able to unify the projects, but since they have very different internals that's much harder than it might look, especially now that existing stuff depends on it. Both implementations parse AppStream properly though and are standard-compliant. AppStream-GLib will permit a few non-standard things that shouldn't be used and its validator is more brutal on styling checks (as it was basically meant to check for "looks good in GNOME Software", while AppStream itself checks for "is it standard compliant?" and then might give info-hints to improve the looks of app metadata), but that's about it.
    Because org.freedesktop.appstream-glib is explicitly required by Flathub (if it fails, then the whole build will fail as well), they enforce their policies on other software. KDE may use appstream-cli internally, but it still has to adapt to the org.freedesktop.appstream-glib rules, created with GNOME Software in mind. I don't think it should look like this. That's why I said that appstream-glib is widely adopted, even if software developers don't necessarily want it.

    Originally posted by Ximion View Post
    This is a thing I wasn't aware of... When validating with `appstreamcli`, there is unofficial support for modifying the relevance of certain hints to customize the result (e.g. some projects may want to make certain style checks that aren't fatal fatal). That could maybe help here (I will have a closer look at the actual divergences, it's likely not an issue - a fork sounds a bit drastic).
    First of all, they wanted to force OARS everywhere, which is completely unnecessary in my opinion (and even harmful), because it doesn't make much sense outside of games or gambling software.
    Later they came up with the idea that they would change the validation level from "validate-relax" to "validate". They thought that in this way they would improve the quality of Linux software. However, the rules were constantly changing, and instead of improving changelogs, people began to completely remove them. So they lowered the criteria by patches, until finally it came to the point that "validate" from org.freedesktop.appstream-glib is extremely close to "validate-relax" from appstream-glib, and really far from the original "validate".


    • #12
      Originally posted by Ximion View Post
      The latter part sounds like unintentional behavior in AppStream-GLib, maybe it's worth reporting a bug for that? In any case, the situation isn't comparable with OOXML, as both implementations will read the same stuff and interpret it the same way (if something doesn't behave spec compliant, it's a bug that will be fixed). They do make different requirements for validation though, which is due to the different scope of the projects. As per specification, an AppStream Metainfo file without a content_rating is perfectly compliant, so `appstreamcli` will pass the validation. If the validation tool has additional agenda though, e.g. having requirements for Flathub, they may make that tag mandatory and fail validation if it isn't present.
      The version comparison algorithms are different. Although the AppData specification doesn't specify any version scheme, it requires the items to be sorted in a latest-to-oldest order.
      The <releases> tag contains <release/> child tags which describe some metainformation about the current release of the described software. Each release of the software component should have a <release/> tag describing it, but at least one release child must be present for the current release of the software. The release children should be sorted in a latest-to-oldest order to simplify reading the metadata file.
      What is more, appstream explicitly states that "the version compare algorithm is also used by RPM".
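One way to sanity-check the latest-to-oldest requirement without depending on any version comparison algorithm is to look at the `date` attributes instead. The sketch below is only an illustration: `releases_sorted_by_date` is a hypothetical helper, not something either validator provides, and it assumes every `<release/>` tag carries an ISO 8601 `date` attribute.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Sample data: the release entries are the Widelands ones quoted in this thread.
SAMPLE = """\
<component>
  <releases>
    <release date="2019-05-02" version="Build 20"/>
    <release date="2016-11-11" version="Build 19"/>
    <release date="2014-02-22" version="Build 18"/>
  </releases>
</component>
"""


def releases_sorted_by_date(metainfo_xml: str) -> bool:
    """Return True if <release/> entries appear in latest-to-oldest date order.

    Hypothetical helper: it ignores the version strings entirely and assumes
    every <release/> has a "date" attribute in YYYY-MM-DD form.
    """
    root = ET.fromstring(metainfo_xml)
    dates = [date.fromisoformat(r.attrib["date"]) for r in root.iter("release")]
    # Each entry must be at least as recent as the one that follows it.
    return all(newer >= older for newer, older in zip(dates, dates[1:]))


print(releases_sorted_by_date(SAMPLE))
```

A check like this would not replace validation, but it gives the same answer on every distribution, whichever version algorithm the local tool was built with.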

      In RHEL 7 we have:
      $ appstream-util --version
      Version:    0.7.8
      $ appstream-util vercmp "Build 9half" "Build 10"
      Build 9half > Build 10
      $ appstream-util vercmp "Build 9.5" "Build 10"
      Build 9.5 > Build 10
      $ appstream-util vercmp "Build 9+" "Build 10"
      Build 9+ > Build 10
      $ appstream-util vercmp "Build 9a" "Build 10"
      Build 9a > Build 10
      $ appstream-util vercmp "Build 9" "Build 10"
      Build 9 > Build 10
      $ appstream-util vercmp "9half" "10"
      9half > 10
      $ appstream-util vercmp "9.5" "10"
      9.5 > 10
      $ appstream-util vercmp "9+" "10"
      9+ > 10
      $ appstream-util vercmp "9a" "10"
      9a > 10
      $ appstream-util vercmp "9" "10"
      9 < 10
      Let's say that this version is outdated. However, it is still the latest version in RHEL 7 and it is widely used during package building with rpmbuild/mock. It is used to validate many apps for EPEL7, and it probably won't be updated. Never ever!

      Anyway, let's check the latest org.freedesktop.appstream-glib from Flathub:
      $ flatpak run org.freedesktop.appstream-glib --version
      Version:    0.7.16
      $ flatpak run org.freedesktop.appstream-glib vercmp "Build 9half" "Build 10"
      Build 9half > Build 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "Build 9.5" "Build 10"
      Build 9.5 > Build 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "Build 9+" "Build 10"
      Build 9+ > Build 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "Build 9a" "Build 10"
      Build 9a > Build 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "Build 9" "Build 10"
      Build 9 > Build 10 
      $ flatpak run org.freedesktop.appstream-glib vercmp "9half" "10"
      9half < 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "9.5" "10"
      9.5 < 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "9+" "10"
      9+ < 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "9a" "10"
      9a < 10
      $ flatpak run org.freedesktop.appstream-glib vercmp "9" "10"
      9 < 10
      Better, but still not 100% correct.

      As you can guess, it should be:
      $ rpm -q --qf "%{VERSION}\n" rpmdevtools
      $ rpmdev-vercmp "Build 9half" "Build 10"
      Build 9half < Build 10
      $ rpmdev-vercmp "Build 9.5" "Build 10"
      Build 9.5 < Build 10
      $ rpmdev-vercmp "Build 9+" "Build 10"
      Build 9+ < Build 10
      $ rpmdev-vercmp "Build 9a" "Build 10"
      Build 9a < Build 10
      $ rpmdev-vercmp "Build 9" "Build 10"
      Build 9 < Build 10
      $ rpmdev-vercmp "9half" "10"
      9half < 10
      $ rpmdev-vercmp "9.5" "10"
      9.5 < 10
      $ rpmdev-vercmp "9+" "10"
      9+ < 10
      $ rpmdev-vercmp "9a" "10"
      9a < 10
      $ rpmdev-vercmp "9" "10"
      9 < 10
      Of course, appstream-cli handles it correctly.
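To make the expected results above easier to follow, here is a simplified sketch of the segment-wise comparison that RPM-style algorithms perform. It is not the actual code from RPM, appstream or appstream-glib: `rpm_style_vercmp` is a hypothetical name, only ASCII letters and digits are considered, the tilde is handled but the caret is treated as a plain separator, and epochs and releases are ignored.

```python
import string

_DIGITS = set(string.digits)
_ALPHA = set(string.ascii_letters)
_KEEP = _DIGITS | _ALPHA | {"~"}


def rpm_style_vercmp(a: str, b: str) -> int:
    """Return -1, 0 or 1 if a is older than, equal to or newer than b."""
    if a == b:
        return 0
    i = j = 0
    while i < len(a) or j < len(b):
        # Skip separators: everything that is not a letter, a digit or '~'.
        while i < len(a) and a[i] not in _KEEP:
            i += 1
        while j < len(b) and b[j] not in _KEEP:
            j += 1
        # A tilde sorts before anything else, even the end of the string.
        a_tilde = i < len(a) and a[i] == "~"
        b_tilde = j < len(b) and b[j] == "~"
        if a_tilde or b_tilde:
            if not a_tilde:
                return 1
            if not b_tilde:
                return -1
            i += 1
            j += 1
            continue
        if i >= len(a) or j >= len(b):
            break
        # Take one run of digits or one run of letters from both strings;
        # the kind of run is decided by the first string.
        numeric = a[i] in _DIGITS
        kind = _DIGITS if numeric else _ALPHA
        start_a, start_b = i, j
        while i < len(a) and a[i] in kind:
            i += 1
        while j < len(b) and b[j] in kind:
            j += 1
        seg_a, seg_b = a[start_a:i], b[start_b:j]
        if not seg_b:
            # Different kinds of segment: a numeric one counts as newer.
            return 1 if numeric else -1
        if numeric:
            # Compare as numbers: drop leading zeros, the longer run wins.
            seg_a, seg_b = seg_a.lstrip("0"), seg_b.lstrip("0")
            if len(seg_a) != len(seg_b):
                return 1 if len(seg_a) > len(seg_b) else -1
        if seg_a != seg_b:
            return 1 if seg_a > seg_b else -1
    # All compared segments were equal: the string with leftovers is newer.
    if i >= len(a) and j >= len(b):
        return 0
    return 1 if i < len(a) else -1


for old, new in [("Build 9half", "Build 10"), ("9.5", "10"), ("9a", "10")]:
    print(old, "<" if rpm_style_vercmp(old, new) < 0 else ">=", new)
```

The point of the sketch is that "9" and "10" are compared as numbers as soon as a digit run is found, so a prefix like "Build " or a suffix like "half" cannot flip the result.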

      Actually, we hit this problem with Widelands. It uses an unusual version scheme, like "Build 19", "Build 20", etc. Until recently, this project was hosted on Launchpad. It was created under Debian/Ubuntu. This relationship is very strong. Widelands developers are still trying to support Ubuntu 14.04 LTS. Anyway, from a Debian/Ubuntu/appstream/appstream-cli perspective, the AppData file is valid.
      Now we have to "fix" it for Flathub. What's worse, Flathub members have requested that this be corrected with a patch file. Of course, the patch must be updated with each release of (or even a small change to) the AppData file. We tried to do it for a while, but it is too much work. I ignored the Flathub recommendations and used xmlstarlet to remove all releases except the first one, so I was threatened with being stripped of push access to any repository if I kept going like this.

      Anyway, I reported a bug regarding appstream-glib in 2018. I suggested using the rpm algorithm or rewriting it in glib style. I even made a PR for it. However, it was rejected. Instead, we got a small improvement to the current algorithm, and optional use of the RPM algorithm when built with rpm. However, we already know Debian, Ubuntu, Gentoo, Arch and Slackware won't build this package with rpm support. Flathub doesn't do that either. Personally, I think that it only hurts the project. We should not allow a situation in which the same file is valid on one system and not valid on another, especially when we have portable packages, such as flatpaks, snaps and appimages. This standard should be portable across various Linux distributions.
      If distro maintainers want the version number to match the package version, then maybe we should have an additional field for it, i.e. pkg_version? Or maybe we should allow additional fields for each package system?
           <release date="2019-05-02" version="Build 20">
             <pkg_version type="deb">20</pkg_version>
             <pkg_version type="rpm" variant="mga">b20</pkg_version>
             <pkg_version type="arch">20</pkg_version>
             <pkg_version type="slk">build20</pkg_version>
           </release>
           <release date="2016-11-11" version="Build 19">
             <pkg_version type="deb">19</pkg_version>
             <pkg_version type="rpm" variant="mga">b19</pkg_version>
             <pkg_version type="arch">19</pkg_version>
             <pkg_version type="slk">build19</pkg_version>
           </release>
           <release date="2014-02-22" version="Build 18">
             <pkg_version type="deb">18</pkg_version>
             <pkg_version type="rpm" variant="mga">b18</pkg_version>
             <pkg_version type="arch">18</pkg_version>
             <pkg_version type="slk">build18</pkg_version>
           </release>
      Maybe we should develop a new standard? We already have a situation where the GitHub release does not match the package version. For example, the "1.2.3-rc1+20200129" release in RPM would look like this: "1.2.3~rc1-20200129".
      We already know that semver is not enough, so maybe we should choose the least common denominator? Or at least something that will be compatible with both RPM and DEB (for now, there are some differences here, e.g. RPM allows the version string to start with a non-digit character, while DEB does not)?
      There is room for discussion here, but it should be discussed among all the parties: Red Hat/Fedora, SUSE/openSUSE, Debian, Canonical/Ubuntu, maybe even Gentoo and Arch. I wanted to say Slackware as well, but they still ignore the AppData standard.

      Anyway, such an important thing shouldn't depend on just one person who changes his mind several times a year.

      We have already had cases when the algorithm has been changed for specific software, e.g. Blender or some firmware.

      What is worse, it uses weird heuristics (see: as_utils_version_parse).
      $ appstream-util --version
      Version:    0.7.8
      $ appstream-util vercmp "20150915" "30150915"
      20150915 > 30150915
      I asked hughsie several times for documentation, but I never got an answer. How should developers adapt to something that is completely undocumented and changes several times a year?
      My point is that you can't say that everyone has to change the release numbers 20 years into the past, because you made some changes last weekend and this is the "new Linux standard". It won't work.

      AppStream has clearly defined the version comparison algorithm (it should be compatible with rpmvercmp) and should remain so, at least until the next standard is developed (AppStream 2.0?). In the meantime, we can discuss what to do about this problem in the future. That's what I think.


      • #13
        Speaking of things that are not completely thought out, we should also mention the Application ID. Within five years, some of us have had to change the Application ID more than once.
        Let's imagine that we are LinuxDevelopers and our application is LinuxApp or rather linuxapp-gtk, since we also offer a cli interface, called linuxapp-cli.
        Five years ago our ID probably would be "appname.desktop", as it was the standard for RHEL 7.
        Then we had to change it to the new reverse-DNS style format. Let it be: "net.sourceforge.linuxapp".
        But as everyone was switching from SF to GH, we also decided to do this. Our new ID would be something like this: "com.github.linux-app". We tried to use "com.github.linuxapp", but linuxapp was already taken.
        However, it can be problematic, because while com.github.linux-app may be unique to our project, it does not include a project specific identifier. So let's rename it again to "com.github.linux-app.linuxapp-gtk".
        Unfortunately, this ID is not valid according to the DBus specification. We have to switch to "com.github.linux_app.linuxapp-gtk".
        Someone in our team noticed that most projects tend to capitalize the last part. We decided to change our ID to "com.github.linux_app.LinuxApp-gtk". That's what GNOME guys do, so it must be good.
        Then Flathub instructed us that we don't have any control over the github.com domain and should use io.github instead. So once again, we changed our ID. The new ID is: "io.github.linux_app.LinuxApp-gtk".
        We want to join the elementary project. Unfortunately, an RDNN with underscores is not allowed there. In the meantime, Microsoft took over GitHub and half of our members threatened to leave the project if we stayed there. Screw this. We are switching to self-hosted GitLab with our own domain. Our new ID is "com.8linuxhaters.LinuxApp-gtk" (of the original 20 of us, only 8 remained).
        Unfortunately, if a component starts with a digit, it is required to prepend underscore to it. The final ID would be "com._8linuxhaters.LinuxApp-gtk", but we decided to give up on the Linux software development front. We came to the conclusion that UWP is not so bad, and Xamarin is actually pretty cool...
        Fun fact: Guys from Papirus icon theme put a price on our heads.
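The two mechanical renames in the story above (hyphens in the domain part becoming underscores, and an underscore being prepended to a component that starts with a digit) can be sketched in a few lines. This is only an illustration of the rules as told in this post: `normalize_app_id` is a hypothetical helper, and leaving the last component untouched apart from the digit rule is an assumption that mirrors the IDs used in the story, not a statement of what any specification or validator requires.

```python
def normalize_app_id(app_id: str) -> str:
    """Apply the two renaming rules from the story to a reverse-DNS style ID.

    Hypothetical helper, not part of any AppStream tool:
    - hyphens become underscores in every component except the last one
      (assumption: the last component is left as written, as in the story);
    - a component starting with a digit gets an underscore prepended.
    """
    parts = app_id.split(".")
    fixed = []
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1
        if not is_last:
            part = part.replace("-", "_")
        if part[:1].isdigit():
            part = "_" + part
        fixed.append(part)
    return ".".join(fixed)


for candidate in ("com.github.linux-app.linuxapp-gtk", "com.8linuxhaters.LinuxApp-gtk"):
    print(candidate, "->", normalize_app_id(candidate))
```

Even with such a helper, every rename still triggers the manual follow-up work that comes with a new ID, which is the real cost described here.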

        Anyway, the facts are that a lot of developers have refused to change ID to reverse-DNS style format. We also have many examples of "outdated" identifiers:
        I won't even mention the wrong ones, like "linuxapp" or "tk.linux-haters.linuxapp"!

        Please keep in mind that a change of ID is often associated with:
        - renaming the AppData file
        - renaming the desktop file
        - renaming the icon file - this is the reason why people from the Papirus icon theme want us dead
        - renaming the D-Bus interface - this one may be super painful for others
        - updating ID in the AppData file
        - adding old <id> in the <provides> section of the AppData file
        - providing the X-Flatpak-RenamedFrom entry in the desktop file
        - providing the StartupWMClass entry in the desktop file
        - recreating the pot file
        - updating po files
        - providing some changes in source code
        - updating Autotools/CMake/Meson files
        - updating RPM/DEB packages
        - updating DMG bundle for macOS (because of renaming icon)
        - updating Windows installer (same as above)
        - updating Travis CI
        - updating flatpak manifest
        - renaming repo on Flathub

        Of course, flatpak-builder has some interesting options:
        - rename-appdata-file
        - rename-desktop-file
        - rename-icon
        - copy-icon

        However, Flathub members treat it as a "dirty hack" and often force the actual ID change.

        Originally posted by Ximion View Post
        This is precisely the reason why `appstreamcli` has an issue severity system for its validator hints, to make it a bit less opinionated: Errors and Warnings fail the validation for serious violations of the specification, and other issue severities are hints for the developer to help them improve their metadata and are not fatal by default. That includes e.g. style checks like summary-ends-with-dot.
        I'm not completely sure, but if I remember correctly, in the past an AppData file didn't pass appstream-cli validation when it used OARS 1.1 or kudos. However, currently it only gives a non-fatal warning.


        • #14
          Originally posted by Ximion View Post

          Hehe ^^ I wasn't the one who picked XML originally, but it definitely is a solid choice for the purpose of AppStream, as it is translatable with existing tools and can be extended without breaking backwards compatibility far easier than a format like YAML or JSON. AppStream also is pretty straightforward XML, not the insanity that you can create with it.
          There is no objectively best markup format, there's always tradeoffs involved which make a format more or less suitable for a particular task. And for XML: You don't have to love it to acknowledge that it is useful sometimes ;-)
          - JSON is nice for quick and dirty data exchange, for example "internal" exchanges or very simplistic data formats (little complexity / little control).
          - YAML is great for minimally-codified structured data (meaning by that there are little regulations on what content is actually put inside keys, or those regulations are taken care of externally, but you want something easy to read and parse and possibly some basic structure regulation).
          - XML is great for every other professional need, above all the need for a describable and readable format that can be enforced with a list of technical/business rules on both keys and values, so that whoever sends it and whoever uses it can be confident in the integrity of the data for their particular business.
          There are probably many other formats I don't know of that can be better than those three for a particular need, but imo that's a nice summary.


          • #15
            the_scx Thank you for your feedback! I actually worked on addressing a few of these points, the Flatpak case will be a bit trickier though.

            Originally posted by the_scx View Post
            Okay, maybe "less adopted outside the Debian world" would be more appropriate to say.
            How does Arch use appstream-cli? I have not seen it in any PKGBUILD.
            There is more to this than just "depends on appstream-util vs depends on appstreamcli". The AppStream reference implementation provides a whole library to work with this metadata, and appstream-generator is built on top of that. And that is what Debian/Ubuntu/Arch/etc. are using. It of course validates with libappstream's validator (the same thing appstreamcli also exposes) and generates a HTML report from that. On Debian, this feature is integrated with package QA, so package maintainers might fix some of the issues the validator raises.

            Originally posted by the_scx View Post
            First of all, they wanted to force OARS everywhere, which is completely unnecessary in my opinion (and even harmful), because it doesn't make much sense outside of games or gambling software.
            Later they came up with the idea that they would change the validation level from "validate-relax" to "validate". They thought that in this way they would improve the quality of Linux software. However, the rules were constantly changing, and instead of improving changelogs, people began to completely remove them. So they lowered the criteria by patches, until finally it came to the point that "validate" from org.freedesktop.appstream-glib is extremely close to "validate-relax" from appstream-glib, and really far from the original "validate".
            That's pretty much a problem of Flathub, that just happens to affect AppStream - which is of course annoying.

            Thanks for raising these issues! Also, don't hesitate to file bug reports in case anything is a direct issue with AppStream itself or a general problem of the ecosystem that can be improved with explicit guidance from the reference implementation.

            In the meantime, and also based on other users' feedback, I've put together a web application that will help people to create better metadata faster: