Enhancing PTS for FreeBSD regression testing

  • #1

    It's time for the FreeBSD project to get a continuous benchmark system to test for performance regressions. At this point, we're weighing PTS vs a homegrown solution. But PTS is missing some features that we would like. We have four FreeBSD developers willing to work part time on adding them, if it can be done easily and if Phoronix would be willing to merge our changes upstream. However, we are all complete n00bs to PTS. Can some more experienced PTS developers please advise us on the ease of making these particular enhancements?

    1) We'll need to add some extra benchmark programs. This looks easy.

    2) We'll need to detect far more hardware and software configuration elements. Some of these (like sysctl settings) will be FreeBSD-specific and won't apply to other platforms. When viewing results, we'll want to be able to query by these elements. For numeric elements, like the kernel version or the amount of installed RAM, we'll want to be able to query a range.

    3) We'll need more graph types. In particular, we need to be able to do a scatter or line graph of one or more results against another variable, like kernel version.

    4) We need to incorporate networking. AFAICT, PTS benchmarks are restricted to a single host. But networking services like nfsd and pf are very important to FreeBSD users, so any benchmark suite must be able to exercise them. This means we'll need a small cluster of slave nodes to generate load for the machine under test. We'll probably have to record the slaves' hardware and software configurations, and we'll need PTS to command and control them. We'll also need an exclusive reservation system, so multiple systems running PTS don't try to use the same clients at the same time. Is this sort of functionality on the roadmap for PTS? Is it even possible with PTS's architecture?

  • #2
    Hi, yes, we would be interested in working with you to enhance FreeBSD support... I'll reply in more detail to your post later today or tomorrow as I have been travelling all day.

    -- Michael
    Michael Larabel
    https://www.michaellarabel.com/



    • #3
      Originally posted by Alan Somers View Post
      It's time for the FreeBSD project to get a continuous benchmark system to test for performance regressions. At this point, we're weighing PTS vs a homegrown solution. But PTS is missing some features that we would like. We have four FreeBSD developers willing to work part time on adding them, if it can be done easily and if Phoronix would be willing to merge our changes upstream. However, we are all complete n00bs to PTS. Can some more experienced PTS developers please advise us on the ease of making these particular enhancements?

      1) We'll need to add some extra benchmark programs. This looks easy.
      Yes, this should be fairly easy. A test profile is basically a small XML file plus some shell scripts. If you can write a script to automate the execution of a particular test, it's only a small step away from becoming a test profile. If you have such an automated script, I'm more than happy to help turn it into an upstreamable profile.
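
      Just to illustrate the kind of starting point I mean (this isn't a PTS profile, just a standalone sketch where "my-benchmark" is a placeholder for whatever command your test actually runs), an automation script can be as simple as:

      import re
      import subprocess

      # Hypothetical example: run a benchmark command and pull one number out of
      # its output. "my-benchmark --quick" stands in for the real command line.
      output = subprocess.run(["my-benchmark", "--quick"],
                              capture_output=True, text=True, check=True).stdout

      # Assume the tool prints a line such as "elapsed: 12.34 seconds".
      match = re.search(r"elapsed:\s*([0-9.]+)", output)
      if match is None:
          raise SystemExit("could not find a result in the benchmark output")
      print(f"Result: {float(match.group(1))} seconds")

      Once something like that exists, wrapping it in the profile's XML metadata and install/run scripts is the relatively easy part.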

      Originally posted by Alan Somers View Post
      2) We'll need to detect far more hardware and software configuration elements. Some of these (like sysctl settings) will be FreeBSD-specific and won't apply to other platforms. When viewing results, we'll want to be able to query by these elements. For numeric elements, like the kernel version or the amount of installed RAM, we'll want to be able to query a range.
      I'm happy to see more upstream FreeBSD/BSD PTS improvements. I'm the lead/main developer of PTS and don't have any commercial customers or anything on FreeBSD PTS that I'm aware of, so all of my FreeBSD PTS porting up to this point has just been about scratching my own itch and making a best effort at a good port when time allows. Basically, whenever I want to run tests at Phoronix.com or just want to toy around with the latest PTS release, I'll see what works and what doesn't. There's a lot more that could be exposed through PTS than what the current FreeBSD support offers, in terms of hardware/software reporting, etc. Patches are always welcome. In many of the areas where I've tried expanding the FreeBSD port, I've been blocked by not having documentation or not knowing the FreeBSD commands or sysctl names for reading certain information.
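
      As a concrete example of the kind of information I mean, here's a rough sketch (not PTS code; it just shells out to sysctl(8) for a handful of well-known FreeBSD sysctl names):

      import subprocess

      def sysctl(name):
          """Read a single FreeBSD sysctl value as a string via sysctl(8)."""
          return subprocess.run(["sysctl", "-n", name],
                                capture_output=True, text=True, check=True).stdout.strip()

      # A few standard FreeBSD sysctls; a real detector would cover many more,
      # including FreeBSD-specific tunables that don't exist on other platforms.
      info = {
          "os": sysctl("kern.ostype"),
          "kernel": sysctl("kern.osrelease"),
          "cpu_model": sysctl("hw.model"),
          "cpu_count": int(sysctl("hw.ncpu")),
          "ram_bytes": int(sysctl("hw.physmem")),
      }
      for key, value in info.items():
          print(f"{key}: {value}")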

      Originally posted by Alan Somers View Post
      3) We'll need more graph types. In particular, we need to be able to do a scatter or line graph of one or more results against another variable, like kernel version.
      PTS already has scatter and line graph support. You can see some Phoronix articles as examples, with real-time frame-rate reporting, real-time thermal information during a test (for example, set the MONITOR=all environment variable prior to testing and saving results), etc. PTS can even do image quality comparisons with automated screenshot captures.
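
      As a usage sketch (the profile name is just a placeholder), enabling the sensor monitoring for a run looks roughly like this:

      import os
      import subprocess

      # Enable PTS sensor monitoring by setting MONITOR=all for the run, then
      # launch a benchmark; "pts/some-test" is a placeholder profile name.
      env = dict(os.environ, MONITOR="all")
      subprocess.run(["phoronix-test-suite", "benchmark", "pts/some-test"],
                     env=env, check=True)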

      Originally posted by Alan Somers View Post
      4) We need to incorporate networking. AFAICT, PTS benchmarks are restricted to a single host. But networking services like nfsd and pf are very important to FreeBSD users, so any benchmark suite must be able to exercise them. This means we'll need a small cluster of slave nodes to generate load for the machine under test. We'll probably have to record the slaves' hardware and software configurations, and we'll need PTS to command and control them. We'll also need an exclusive reservation system, so multiple systems running PTS don't try to use the same clients at the same time. Is this sort of functionality on the roadmap for PTS? Is it even possible with PTS's architecture?
      It should be possible to run some basic network tests with PTS. I know of some PTS users that internally have basic network benchmarks running that involve multiple systems. There just isn't much upstream for that, since I personally haven't had a need or interest in network testing and so haven't done much in that area, but I'm happy to help out and advise.
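
      As a very rough sketch of what a two-host network test could look like outside of PTS (the load-generator host name and passwordless SSH access are assumptions, and iperf3 is just one convenient traffic generator):

      import json
      import subprocess
      import time

      REMOTE = "loadgen1.example.org"  # hypothetical load-generator node

      # Start a one-off iperf3 server on the remote load generator over ssh.
      server = subprocess.Popen(["ssh", REMOTE, "iperf3", "-s", "-1"])
      try:
          time.sleep(2)  # crude wait for the remote server to come up
          # Drive traffic from the machine under test and request JSON output.
          out = subprocess.run(["iperf3", "-c", REMOTE, "-t", "30", "-J"],
                               capture_output=True, text=True, check=True).stdout
          gbps = json.loads(out)["end"]["sum_received"]["bits_per_second"] / 1e9
          print(f"throughput: {gbps:.2f} Gbit/s")
      finally:
          server.terminate()
          server.wait()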

      Let me know if you have any further questions, and sorry about the delayed response; I've been travelling the past two days.
      Michael Larabel
      https://www.michaellarabel.com/



      • #4
        Originally posted by Michael View Post
        PTS already has scatter and line graph support. You can see some Phoronix articles as examples, with real-time frame-rate reporting, real-time thermal information during a test (for example, set the MONITOR=all environment variable prior to testing and saving results), etc. PTS can even do image quality comparisons with automated screenshot captures.
        We will likely want to do graphs based on multiple experimental runs. For example, graphing the result of a certain test against the SVN rev of the kernel. For that, we will need a good way to query the results database. A SQL-like interface would be ideal. When I run the phoronix-test-suite on my laptop, locally saved results get stored in XML files, one per test. Does Phoromatic also store results in XML files or does it use a database? Is there an API spec for Phoromatic?



        • #5
          Originally posted by Alan Somers View Post
          We will likely want to do graphs based on multiple experimental runs. For example, graphing the result of a certain test against the SVN rev of the kernel. For that, we will need a good way to query the results database. A SQL-like interface would be ideal. When I run the phoronix-test-suite on my laptop, locally saved results get stored in XML files, one per test. Does Phoromatic also store results in XML files or does it use a database? Is there an API spec for Phoromatic?
          Phoromatic and OpenBenchmarking.org do preserve the original data in the XML result files. OpenBenchmarking.org additionally indexes results in (My)SQL, but when interacting with the data, the current code still deals directly with the XML files. It is possible to parse and deal with the data in various ways, including what you mention of plotting against a revision/version over time. There's http://kernel-tracker.phoromatic.com/ and, as another use-case example, benchmarks of Wine revisions over time - http://www.phoronix.com/scan.php?pag...tem&px=MTAwMjQ.
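
          For example, pulling the numbers straight out of a saved result file only takes a few lines; the element names below are from memory of the composite.xml layout, so double-check them against a result file you've actually saved:

          import xml.etree.ElementTree as ET

          # Parse a locally saved PTS result file; the path and the element names
          # (Result/Title/Scale/Data/Entry/Identifier/Value) should be verified
          # against a real composite.xml before relying on them.
          tree = ET.parse("composite.xml")
          for result in tree.getroot().iter("Result"):
              title = result.findtext("Title")
              scale = result.findtext("Scale")
              for entry in result.iter("Entry"):
                  system = entry.findtext("Identifier")
                  value = entry.findtext("Value")
                  print(f"{title} [{scale}] {system}: {value}")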

          The current implementations, though, deal with the result objects directly; it would be possible to implement a SQL-like front-end without too much burden, and it would be a great addition, but unfortunately I haven't found the time to work on such a public interface yet.
          Michael Larabel
          https://www.michaellarabel.com/



          • #6
            Originally posted by Michael View Post
            Phoromatic and OpenBenchmarking.org do preserve the original data in the XML result files. OpenBenchmarking.org additionally indexes results in (My)SQL, but when interacting with the data, the current code still deals directly with the XML files. It is possible to parse and deal with the data in various ways, including what you mention of plotting against a revision/version over time. There's http://kernel-tracker.phoromatic.com/ and, as another use-case example, benchmarks of Wine revisions over time - http://www.phoronix.com/scan.php?pag...tem&px=MTAwMjQ.

            The current implementations, though, deal with the result objects directly; it would be possible to implement a SQL-like front-end without too much burden, and it would be a great addition, but unfortunately I haven't found the time to work on such a public interface yet.
            After thinking about it some more this afternoon, the barrier for adding a SQL-like interface will be rather low after the Phoronix Test Suite 5.0 release, since it's already working in that direction a bit with the WebSockets interface, etc. (http://www.phoronix.com/scan.php?pag...tem&px=MTU2NTg). After that point it would mostly be a matter of restructuring code and defining the interface.
            Michael Larabel
            https://www.michaellarabel.com/



            • #7
              Originally posted by Michael View Post
              Phoromatic and OpenBenchmarking.org do preserve the original data in the XML result files. OpenBenchmarking.org additionally indexes results in (My)SQL, but when interacting with the data, the current code still deals directly with the XML files. It is possible to parse and deal with the data in various ways, including what you mention of plotting against a revision/version over time. There's http://kernel-tracker.phoromatic.com/ and, as another use-case example, benchmarks of Wine revisions over time - http://www.phoronix.com/scan.php?pag...tem&px=MTAwMjQ.

              The current implementations, though, deal with the result objects directly; it would be possible to implement a SQL-like front-end without too much burden, and it would be a great addition, but unfortunately I haven't found the time to work on such a public interface yet.
              So what is the current interface to OpenBenchmarking and Phoromatic?



              • #8
                Originally posted by Alan Somers View Post
                So what is the current interface to OpenBenchmarking and Phoromatic?
                There's support for cloning any result file on OpenBenchmarking.org (via the phoronix-test-suite) so you have the XML data (and system logs, etc.) locally, either for running comparisons or for further analyzing and manipulating the data.

                There are also some basic interfaces for reading the available tests, finding popular tests, and querying other meta-data.

                The rest of the interfaces and features have grown organically as demand arises and are hosted on OpenBenchmarking.org, e.g. for various analytics features, but there isn't any uniform SQL/JSON interface available to the general public at this time for carrying out server-side queries.

                I'm open to other methods for interfacing; it's probably easier to continue that email thread with Matthew and me - just let us know what you're after.
                Michael Larabel
                https://www.michaellarabel.com/



                • #9
                  FreeBSD is going to take the homegrown approach

                  We've discussed it for a week, and decided to build a homegrown tool rather than use the Phoronix Test Suite. Our reasons are:
                  • Architecture. PTS and Phoromatic both store their results in flat XML files. For the types of analyses that we want to do, that's not ideal. The system that we're designing will store all results in a SQL database and be designed from the start for rich comparisons (a rough sketch of what I mean follows this list).
                  • Code Reuse. Over time, we realized that fewer and fewer of PTS's components would be useful to us. Most of its benchmarks aren't relevant, because they're designed mainly to test the hardware, whereas we are mostly concerned with the software. The Phoromatic GUI doesn't provide many of the options we want, like comparisons, so it would need to be heavily extended. PTS's sensors and configuration analyzer are nice, but even those would need to be heavily extended.
                  • Networking. Networking is critical to many FreeBSD users. We feel that the benchmark system must be designed to incorporate network I/O from the start. It won't be an easy problem to solve, but the added utility will be worth it.
                  • Licensing. Phoromatic is closed-source. That's probably not a problem for a single-vendor project, but it is for FreeBSD. It would be very useful for individual users and vendors to be able to locally install a copy of our system. With closed-source components, that's not possible. Plus, it seems like many of the enhancements we want to make would have to be done within Phoromatic. We don't want to be in a position where we can't share our work freely.
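
                  To give an idea of the kind of schema and queries we have in mind (purely a sketch of the planned design; the table layout, test name, and revision range below are all made up):

                  import sqlite3

                  # Hypothetical homegrown schema: each row records one measurement,
                  # the kernel revision it ran against, and some configuration data.
                  conn = sqlite3.connect("results.db")
                  conn.executescript("""
                      CREATE TABLE IF NOT EXISTS results (
                          test        TEXT NOT NULL,
                          kernel_rev  INTEGER NOT NULL,  -- SVN revision under test
                          ram_bytes   INTEGER NOT NULL,
                          value       REAL NOT NULL
                      );
                  """)

                  # The kind of query we want to be easy: one test's results across a
                  # revision range, ready to plot as a line or scatter graph.
                  rows = conn.execute(
                      "SELECT kernel_rev, value FROM results "
                      "WHERE test = ? AND kernel_rev BETWEEN ? AND ? ORDER BY kernel_rev",
                      ("nfsd-throughput", 250000, 260000),
                  ).fetchall()
                  for rev, value in rows:
                      print(rev, value)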


                  There are certainly some penalties that we'll have to pay for starting from scratch.
                  • Reuse. We won't be able to reuse any of PTS's many available benchmark programs. That will be a drag, especially for CPU and graphics benchmarks.
                  • Duplication of effort. We'll also have to duplicate a lot of effort to do the boring things, like write the CLI and the sensors.
                  • Social Networking. OpenBenchmarking.org is neat. It's a great way to harness the community. But that's not our core problem. We're going to focus on other things, and we'll probably never have a community site as good as OpenBenchmarking.org.
                  • Cross-platform. Writing a FreeBSD-specific system will make it more difficult for us to compare different operating systems, but it will also allow us to focus on what matters the most: our own operating system. It will probably still be possible to run most of our benchmarks on other OSes, but it won't be automated, and it won't have sensors or automatic system configuration detection. That's the price we'll pay.


                  I still think that PTS is great at what it does; it's just not the solution for us. And Phoronix is still my favorite website for hardware news. Thanks for all the work you've done over the years.



                  • #10
                    Reuse. We won't be able to reuse any of PTS's many available benchmark programs. That will be a drag, especially for CPU and graphics benchmarks.
                    You can download the test scripts from openbenchmarking.org. That should help in finding good runtime arguments to pass to each benchmark.
