where do I find git sources for a test?

  • #21
    Originally posted by nevion:
    Michael another bump - as you requested previously.
    Hi nevion,
    I've been playing with it this morning. I built stock ArrayFire and installed it in /usr/local (the build script mentioned in the test profile contents didn't work).

    Is there a reason ArrayFire isn't being built by the test profile itself? It seems like it could be automated easily, aside from getting all the dependencies, and then built based upon what OpenCL/CUDA/etc. is available.

    It's been working fine for me. Here is the example I used for selecting multiple options:

    local/arrayfire:
    Test Installation 1 of 1
    Installation Size: 1.0 MB
    Installing Test @ 10:35:22



    ArrayFire 1.0:
    local/arrayfire
    Processor Test Configuration
    1: OpenCL
    2: CUDA
    3: CPU
    4: Test All Options
    Platform: 3


    1: Accumulate_1D_f32
    2: Accumulate_1D_f64
    3: Accumulate_2D_f32
    4: Accumulate_2D_f64
    5: Bandwidth_f32
    6: Bandwidth_f64
    7: BilateralFilter_f32
    8: BilateralFilter_f64
    9: Convolve_f32_11x11
    10: Convolve_f32_5x5
    11: Convolve_f32_9x9
    12: Convolve_f64_11x11
    13: Convolve_f64_5x5
    14: Convolve_f64_9x9
    15: Data_f32_CONSTANT
    16: Data_f32_IDENTITY
    17: Data_f32_RANDN
    18: Data_f32_RANDU
    19: Data_f32_RANGE
    20: Data_f64_CONSTANT
    21: Data_f64_IDENTITY
    22: Data_f64_RANDN
    23: Data_f64_RANDU
    24: Data_f64_RANGE
    25: ELWISE_f32_ADD
    26: ELWISE_f32_ADD_CONSTANT
    27: ELWISE_f32_ARC_COS
    28: ELWISE_f32_ARC_SIN
    29: ELWISE_f32_ARC_TAN
    30: ELWISE_f32_ATAN2
    31: ELWISE_f32_CBRT
    32: ELWISE_f32_COS
    33: ELWISE_f32_DIVIDE
    34: ELWISE_f32_DIVIDE_CONSTANT
    35: ELWISE_f32_ERF
    36: ELWISE_f32_ERFC
    37: ELWISE_f32_EXP
    38: ELWISE_f32_EXP_M1
    39: ELWISE_f32_HYPOT
    40: ELWISE_f32_HYP_ARC_COS
    41: ELWISE_f32_HYP_ARC_SIN
    42: ELWISE_f32_HYP_ARC_TAN
    43: ELWISE_f32_HYP_COS
    44: ELWISE_f32_HYP_SIN
    45: ELWISE_f32_HYP_TAN
    46: ELWISE_f32_IS_INF
    47: ELWISE_f32_IS_NAN
    48: ELWISE_f32_IS_ZERO
    49: ELWISE_f32_LGAMMA
    50: ELWISE_f32_LOG10
    51: ELWISE_f32_LOG_1P
    52: ELWISE_f32_LOG_E
    53: ELWISE_f32_MAX
    54: ELWISE_f32_MIN
    55: ELWISE_f32_MODULO
    56: ELWISE_f32_MULTIPLY
    57: ELWISE_f32_MULTIPY_CONSTANT
    58: ELWISE_f32_POW
    59: ELWISE_f32_REMAINDER
    60: ELWISE_f32_SIN
    61: ELWISE_f32_SQRT
    62: ELWISE_f32_SUBTRACT
    63: ELWISE_f32_SUBTRACT_CONSTANT
    64: ELWISE_f32_TAN
    65: ELWISE_f32_TGAMMA
    66: ELWISE_f64_ADD
    67: ELWISE_f64_ADD_CONSTANT
    68: ELWISE_f64_ARC_COS
    69: ELWISE_f64_ARC_SIN
    70: ELWISE_f64_ARC_TAN
    71: ELWISE_f64_ATAN2
    72: ELWISE_f64_CBRT
    73: ELWISE_f64_COS
    74: ELWISE_f64_DIVIDE
    75: ELWISE_f64_DIVIDE_CONSTANT
    76: ELWISE_f64_ERF
    77: ELWISE_f64_ERFC
    78: ELWISE_f64_EXP
    79: ELWISE_f64_EXP_M1
    80: ELWISE_f64_HYPOT
    81: ELWISE_f64_HYP_ARC_COS
    82: ELWISE_f64_HYP_ARC_SIN
    83: ELWISE_f64_HYP_ARC_TAN
    84: ELWISE_f64_HYP_COS
    85: ELWISE_f64_HYP_SIN
    86: ELWISE_f64_HYP_TAN
    87: ELWISE_f64_IS_INF
    88: ELWISE_f64_IS_NAN
    89: ELWISE_f64_IS_ZERO
    90: ELWISE_f64_LGAMMA
    91: ELWISE_f64_LOG10
    92: ELWISE_f64_LOG_1P
    93: ELWISE_f64_LOG_E
    94: ELWISE_f64_MAX
    95: ELWISE_f64_MIN
    96: ELWISE_f64_MODULO
    97: ELWISE_f64_MULTIPLY
    98: ELWISE_f64_MULTIPY_CONSTANT
    99: ELWISE_f64_POW
    100: ELWISE_f64_REMAINDER
    101: ELWISE_f64_SIN
    102: ELWISE_f64_SQRT
    103: ELWISE_f64_SUBTRACT
    104: ELWISE_f64_SUBTRACT_CONSTANT
    105: ELWISE_f64_TAN
    106: ELWISE_f64_TGAMMA
    107: Erode_f32_5x5
    108: Erode_f64_5x5
    109: FFT_1D_f32
    110: FFT_1D_f64
    111: FFT_2D_f32
    112: FFT_2D_f64
    113: GFOR_FOR_LOOP_SUM
    114: GFOR_NO_LOOP_SUM
    115: GFOR_SUM
    116: Histogram_f32
    117: Histogram_f64
    118: Image_Bilateral_11x11
    119: Image_Bilateral_5x5
    120: Image_Bilateral_9x9
    121: Image_Convolve_11x11
    122: Image_Convolve_5x5
    123: Image_Convolve_9x9
    124: Image_Erode_11x11
    125: Image_Erode_5x5
    126: Image_Erode_9x9
    127: Image_FAST
    128: Image_Histogram
    129: Image_ORB
    130: Image_Resize_Expand_2x
    131: Image_Resize_Shrink_2x
    132: Cholesky_f32
    133: Cholesky_f64
    134: LU_f32
    135: LU_f64
    136: MatrixMultiply_f32
    137: MatrixMultiply_f64
    138: MedianFilter_f32_4x4_PAD_SYM
    139: MedianFilter_f32_4x4_PAD_ZERO
    140: MedianFilter_f64_4x4_PAD_SYM
    141: MedianFilter_f64_4x4_PAD_ZERO
    142: PinnedMemory_f32_Bandwidth
    143: PinnedMemory_f64_Bandwidth
    144: Expand_2D_f32_AF_INTERP_BILINEAR
    145: Expand_2D_f32_AF_INTERP_NEAREST
    146: Expand_2D_f64_AF_INTERP_BILINEAR
    147: Expand_2D_f64_AF_INTERP_NEAREST
    148: Shrink_2D_f32_AF_INTERP_BILINEAR
    149: Shrink_2D_f32_AF_INTERP_NEAREST
    150: Shrink_2D_f64_AF_INTERP_BILINEAR
    151: Shrink_2D_f64_AF_INTERP_NEAREST
    152: Rotate_f32_INTERP_NEAREST
    153: Rotate_f64_INTERP_NEAREST
    154: Sort_f32_ASCENDING
    155: Sort_f32_DESCENDING
    156: Sort_f64_ASCENDING
    157: Sort_f64_DESCENDING
    158: Sum_1D_f32
    159: Sum_1D_f64
    160: Sum_2D_f32
    161: Sum_2D_f64
    162: Transpose_f32
    163: Transpose_f64
    164: Test All Options
    Benchmark: 1-10

    That is working fine, and you could also enter something like 1-10,20-40,57,164.
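For anyone curious how a selection string like the one above expands, here is a minimal sketch. It only illustrates the comma-and-range syntax; it is not the Phoronix Test Suite's own parser, and the function name `parse_selection` is made up for this example. It also treats every number as a plain option, so it does not special-case the "Test All Options" entry.

```python
def parse_selection(spec, max_option):
    """Expand a selection such as "1-10,20-40,57" into sorted option numbers.

    Hypothetical helper for illustration only; not part of any real tool.
    """
    chosen = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start > end or start < 1 or end > max_option:
            raise ValueError(f"selection {part!r} is outside 1..{max_option}")
        chosen.update(range(start, end + 1))
    return sorted(chosen)
```

Calling `parse_selection("1-10,20-40,57,164", 164)` would give the ten options 1 to 10, the twenty-one options 20 to 40, plus 57 and 164.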
    Michael Larabel
    https://www.michaellarabel.com/



    • #22
      Michael, it takes quite some time to compile ArrayFire, which is why I didn't have it built automatically. Using the prepackaged ArrayFire also makes for a more stable basis, so I download and install that out of band, although the updateLibraries script should do that too. I think it is possible to put the download for this software in your download list to make it a bit more self-contained, but it's a heavy package, near a gig, and they're hosting it on Amazon (= bandwidth costs per download). I figured it was more like CUDA in terms of installation cost, in that it should just be installed out of band prior to the benchmark run. What do you think?

      Do you have interest in incorporating this testing into your repertoire - are there some tests you'd like to see added or removed? Any other problems? I'd like to extend or contract it where you think that could be useful; and potentially add more. For instance, I am going to add int{8,16,32,64} benchmarks for the majority of these.



      • #23
        Originally posted by nevion:
        Michael, it takes quite some time to compile ArrayFire, which is why I didn't have it built automatically. Using the prepackaged ArrayFire also makes for a more stable basis, so I download and install that out of band, although the updateLibraries script should do that too. I think it is possible to put the download for this software in your download list to make it a bit more self-contained, but it's a heavy package, near a gig, and they're hosting it on Amazon (= bandwidth costs per download). I figured it was more like CUDA in terms of installation cost, in that it should just be installed out of band prior to the benchmark run. What do you think?

        Hmm okay. I didn't think it took too incredibly long to build ArrayFire. Maybe only build it in the test profile if it's not found on the system otherwise? Just trying to think how to make it easier to setup/deploy.
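A rough sketch of the "only build it if it's not found" idea, for illustration. The library names (`af`, `afcpu`, `afopencl`, `afcuda`), the `arrayfire.h` header, and the `/opt/arrayfire` prefix are assumptions about a typical ArrayFire install, not something confirmed in this thread; the function names are made up. `ctypes.util.find_library` is the standard-library lookup used for the shared-library check.

```python
import ctypes.util
import os


def find_arrayfire(prefixes=("/usr/local", "/opt/arrayfire"),
                   lib_names=("af", "afcpu", "afopencl", "afcuda")):
    """Return where ArrayFire was found, or None if nothing was detected.

    The default names and prefixes are assumptions for this sketch.
    """
    # First ask the platform's library search for any candidate name.
    for name in lib_names:
        found = ctypes.util.find_library(name)
        if found:
            return found
    # Fall back to looking for the main header under known prefixes.
    for prefix in prefixes:
        header = os.path.join(prefix, "include", "arrayfire.h")
        if os.path.isfile(header):
            return header
    return None


def plan_install(found):
    """Decide what an install script would do given the detection result."""
    return "use-system" if found else "build-from-source"
```

An install script could call `plan_install(find_arrayfire())` and only start the long source build when the answer is `"build-from-source"`.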

        Originally posted by nevion:
        Do you have interest in incorporating this testing into your repertoire - are there some tests you'd like to see added or removed? Any other problems? I'd like to extend or contract it where you think that could be useful; and potentially add more. For instance, I am going to add int{8,16,32,64} benchmarks for the majority of these.
        Sure, I would be interested in promoting it to the official test repository. I'm currently running all 168 tests on the CPU, with many of them failing. Is it known that a number of them fail, at least when running on the CPU?

        After doing that, I will likely try out a CUDA build, etc. Were there any other improvements you planned to make to the test profile? Unfortunately I'm not familiar enough with ArrayFire to know if there are any big pieces missing, etc.
        Michael Larabel
        https://www.michaellarabel.com/



        • #24
          The problem I had in building ArrayFire was making sure it pointed to the right OpenCL and that, against ROCm, things would execute at all. ROCm's last release introduced some mismatched runtime vs. driver versioning that currently trips it up (a bad call IMO), which is one more argument for prebuilt binaries, for now. We can do a local build or fetch-install if it's not on the system though, keeping in mind the OpenCL caveat just mentioned.

          With the max problem size I am running at, those CPU jobs will not finish in a timely manner (many would take a very, very long time). As I mentioned previously, I only allow each job so many seconds to compute before it is killed, regardless of platform (CPU, CUDA, or OpenCL). When a test times out, it fails. It's a tough problem with no right answer: if you change the problem sizes to fit the CPUs, it opens up other cans of worms or mixes and matches results. We can increase the threshold, but the behavior is always going to be there. The timeout also deals with hung or nearly hung GPUs (where a bug or performance issue is present, which I am also experiencing now and have reported to ROCm upstream).
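To make the kill-after-N-seconds behavior concrete, here is a minimal sketch using Python's standard `subprocess.run` with its `timeout` argument. This is not the actual arrayfire-benchmark or test-profile code; the function name and the result strings are made up for illustration.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run one benchmark command and report a failure if it runs too long."""
    try:
        completed = subprocess.run(
            command,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "fail (timed out)"
    if completed.returncode == 0:
        return "pass"
    return "fail (exit code %d)" % completed.returncode


if __name__ == "__main__":
    # Stand-in for a slow CPU job: sleeps far longer than the limit allows.
    slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_job, 0.5))
```

The same limit applies whatever the backend, so a slow CPU run and a hung GPU both end up reported as failures, which matches the trade-off described above.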

          I don't know of an in-arrayfire profile change I'd make atm beyond datatype extension, I'll have to look it over - but it's already a good haul.



          • #25
            Originally posted by nevion:
            The problem I had in building ArrayFire was making sure it pointed to the right OpenCL and that, against ROCm, things would execute at all. ROCm's last release introduced some mismatched runtime vs. driver versioning that currently trips it up (a bad call IMO), which is one more argument for prebuilt binaries, for now. We can do a local build or fetch-install if it's not on the system though, keeping in mind the OpenCL caveat just mentioned.

            With the max problem size I am running at, those CPU jobs will not finish in a timely manner (many would take a very, very long time). As I mentioned previously, I only allow each job so many seconds to compute before it is killed, regardless of platform (CPU, CUDA, or OpenCL). When a test times out, it fails. It's a tough problem with no right answer: if you change the problem sizes to fit the CPUs, it opens up other cans of worms or mixes and matches results. We can increase the threshold, but the behavior is always going to be there. The timeout also deals with hung or nearly hung GPUs (where a bug or performance issue is present, which I am also experiencing now and have reported to ROCm upstream).

            I don't know of an in-arrayfire profile change I'd make atm beyond datatype extension, I'll have to look it over - but it's already a good haul.
            I may then add a step so that if ArrayFire isn't detected on the system, the test profile goes ahead and does a stock build, trying to guess sane defaults.

            Okay, I'll let this CPU run continue for an hour or two (since I'm busy with other work at the moment anyhow). If I don't hear from you with other changes, I will then go through and ensure everything is tidy and working fine with the test profile before pushing it to OpenBenchmarking.org!

            One more thing: right now arrayfire-benchmark is just cloned from Git. Is it sane enough that I can package up a Git snapshot and host it on Phoronixtestsuite.com for the test profile to download? That way we don't have to worry about any Git changes there breaking the test profile, etc.
            Michael Larabel
            https://www.michaellarabel.com/



            • #26
              Michael, I'll update the scripts tonight to clone tagged versions, and to do a local install of prebuilt ArrayFire when it's not on the system already. At that point you can put it upstream, though I want to make sure that I can get the datatype-extended variant tests in on short notice after that; otherwise I'd prefer to wait another week or two.
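As a sketch of what cloning a tagged version could look like from an install script: the helper below only builds the command line, using the real `git clone` options `--branch` (which also accepts tag names) and `--depth`. The function name is made up, and the URL and tag in the usage note are placeholders, not the project's real values.

```python
def clone_command(repo_url, tag, dest):
    """Build a git command that fetches only the given tag (or branch).

    Hypothetical helper for illustration; pass the result to subprocess.run.
    """
    if not tag:
        raise ValueError("a tag is required so the checkout is reproducible")
    # --branch accepts a tag name; --depth 1 skips the rest of the history.
    return ["git", "clone", "--depth", "1", "--branch", tag, repo_url, dest]
```

For example, `clone_command("https://example.invalid/arrayfire-benchmark.git", "pts-1.0", "arrayfire-benchmark")` gives a command that pins the checkout to one tag, so later commits in the repository cannot change what the test profile builds.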
              Last edited by nevion; 12 January 2017, 04:16 PM.



              • #27
                Originally posted by nevion:
                Michael, I'll update the scripts tonight to clone tagged versions, and to do a local install of prebuilt ArrayFire when it's not on the system already. At that point you can put it upstream, though I want to make sure that I can get the datatype-extended variant tests in on short notice after that; otherwise I'd prefer to wait another week or two.
                Just checking if those updates have been pushed yet?
                Michael Larabel
                https://www.michaellarabel.com/



                • #28
                  Michael - nope, I'm trying to get those datatype extensions in first. I'll ping you again when I'm ready, but hopefully it'll be tonight.



                  • #29
                    Michael - OK, I had some issues with ROCm that slowed me down, but I pushed through and got those datatype extensions in. I also switched to local, from-git, no-GL installations of ArrayFire and created pts branches for arrayfire and arrayfire-benchmark, which will serve as mainline for bug fixes and, at the moment, for some fixes that let ArrayFire work on ROCm _now_. I'm still trying to fix CUDA builds, but I'm getting bogged down fixing them on an openSUSE box, where it's a bit harder to satisfy ArrayFire's build deps (subtle CMake bugs). It probably works on Ubuntu with CUDA, not that I can test.

                    I'll probably fix any remaining issues I'm having on SUSE, but it's working pretty well now. I also submitted a pull request for a couple of package deps on Ubuntu/SUSE, for local ArrayFire builds. See if things work fine for you on a clean install, if you can. Perhaps it's good enough to be merged now; fixing SUSE+CUDA is taking about 20 minutes for every test change...



                    • #30
                      Originally posted by nevion:
                      Michael - OK, I had some issues with ROCm that slowed me down, but I pushed through and got those datatype extensions in. I also switched to local, from-git, no-GL installations of ArrayFire and created pts branches for arrayfire and arrayfire-benchmark, which will serve as mainline for bug fixes and, at the moment, for some fixes that let ArrayFire work on ROCm _now_. I'm still trying to fix CUDA builds, but I'm getting bogged down fixing them on an openSUSE box, where it's a bit harder to satisfy ArrayFire's build deps (subtle CMake bugs). It probably works on Ubuntu with CUDA, not that I can test.

                      I'll probably fix any remaining issues I'm having on SUSE, but it's working pretty well now. I also submitted a pull request for a couple of package deps on Ubuntu/SUSE, for local ArrayFire builds. See if things work fine for you on a clean install, if you can. Perhaps it's good enough to be merged now; fixing SUSE+CUDA is taking about 20 minutes for every test change...
                      Great, thanks. Trying it out today.
                      Michael Larabel
                      https://www.michaellarabel.com/
