Many Power Management Updates For The Linux 4.10 Kernel


  • #11
    Originally posted by gilboa

    Actually, I have such library for simple stuff, rdtsc, atomic counters, atomic bitmaps, etc.
    Sounds good, does it happen to be open source?

    Specifically atomic counters work well in std C, though, or are they in some way special? Just curious...

    [Hope I'm not derailing the thread, but it's a quiet older one anyway.]



    • #12
      Originally posted by indepe

      Sounds good, does it happen to be open source?
      Hopefully, down the road.

      Specifically atomic counters work well in std C, though, or are they in some way special? Just curious...
      Back when I started working on it, it wasn't in the standard library.
      Plus, I'd rather not be limited to modern compilers (I still support older GCC and MSVC).

      [Hope I'm not derailing the thread, but it's a quiet older one anyway.]
      *We* are, but as you pointed out, it's a quiet thread.

      - Gilboa

      oVirt-HV1: Intel S2600C0, 2xE5-2658V2, 128GB, 8x2TB, 4x480GB SSD, GTX1080 (to-VM), Dell U3219Q, U2415, U2412M.
      oVirt-HV2: Intel S2400GP2, 2xE5-2448L, 120GB, 8x2TB, 4x480GB SSD, GTX730 (to-VM).
      oVirt-HV3: Gigabyte B85M-HD3, E3-1245V3, 32GB, 4x1TB, 2x480GB SSD, GTX980 (to-VM).
      Devel-2: Asus H110M-K, i5-6500, 16GB, 3x1TB + 128GB-SSD, F33.



      • #13
        Originally posted by gilboa
        Back when I started working on it, it wasn't in the standard library.
        Plus, I'd rather not be limited to modern compilers (I still support older GCC and MSVC).


        Also, as I just looked up, LOCK ADD may be a cycle faster than LOCK XADD, which std C probably uses, based on the API, unless the compiler optimizes that when the return value is not used.
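
To make the question concrete, here is a minimal C11 sketch of the two cases. The counter and function names are made up for illustration. As far as I know, GCC and Clang implement `atomic_fetch_add` as a compiler builtin rather than as opaque inline asm, so they are free to pick LOCK ADD when the fetched value is discarded; compiling with `-O2 -S` and reading the output is the only way to be sure for a given compiler.

```c
#include <stdatomic.h>

/* Hypothetical counter; static storage is zero-initialized. */
static atomic_ulong hits;

/* Result discarded: the compiler does not need the old value,
 * so a plain LOCK ADD would be enough here. */
static void count_hit(void)
{
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

/* Result used: the old value is required, which maps naturally
 * onto LOCK XADD on x86. */
static unsigned long take_ticket(void)
{
    return atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}
```

Relaxed ordering is used only because a bare counter needs no ordering guarantees; the default `atomic_fetch_add` is sequentially consistent.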



        • #14
          Originally posted by indepe

          Also, as I just looked up, LOCK ADD may be a cycle faster than LOCK XADD, which std C probably uses, based on the API, unless the compiler optimizes that when the return value is not used.
          Doubt it.
          ASM code usually stays untouched.

          - Gilboa


          • #15
            Originally posted by gilboa
            Doubt it.
            ASM code usually stays untouched.
            And in the case of C, I wouldn't expect compilers to have optimizations specific to a function like atomic_fetch_add.



            • #16
              Originally posted by indepe

              And in the case of C, I wouldn't expect compilers to have optimizations specific to a function like atomic_fetch_add.
              In the end, someone wrote a bunch of asm ops and shoved them in a place the GCC optimizer cannot touch.

              - Gilboa


              • #17
                Originally posted by gilboa

                I'm not sure I understand your request: You want the kernel to enforce zero-PM on machines that don't support constant TSC?
                Why? Kernel / user-land developers on such platforms should be aware of this limitation and use other clock sources instead of TSC...

                - Gilboa
                I'm asking whether C1E and/or EIST should be disabled in the BIOS when the CPU supports only constant TSC (constant = not invariant). The TSC is based on the processor clock, so if the TSC is constant, how does it cope with C1E or EIST states? Thanks.



                • #18
                  Originally posted by Azrael5

                  I'm asking whether C1E and/or EIST should be disabled in the BIOS when the CPU supports only constant TSC (constant = not invariant). The TSC is based on the processor clock, so if the TSC is constant, how does it cope with C1E or EIST states? Thanks.
                  AFAIK, if constant TSC is supported by the CPU, C1E / EIST should not make a difference.
                  However, take what I say with a *large* grain of salt, as I didn't test it myself.

                  I'd write a small tester - comparing rdtsc vs wall clock with EIST and w/o it. See if it makes a difference.
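
Something along these lines, as a rough sketch only. It assumes x86 Linux with GCC or Clang (`__rdtsc()` from `<x86intrin.h>`, `CLOCK_MONOTONIC` as the wall clock); the function names are made up, and it ignores thread migration between cores.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#include <x86intrin.h>   /* __rdtsc(); x86 GCC/Clang only */

/* Monotonic wall-clock time in nanoseconds. */
static uint64_t wall_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* TSC ticks per microsecond of wall-clock time over roughly interval_ns
 * (interval_ns must be non-zero).
 * idle != 0: sleep through the interval so the core may enter C-states.
 * idle == 0: spin, which keeps the core busy. */
static double tsc_ticks_per_us(uint64_t interval_ns, int idle)
{
    uint64_t ns0 = wall_ns();
    uint64_t t0 = __rdtsc();
    uint64_t ns1;

    if (idle) {
        struct timespec req = { (time_t)(interval_ns / 1000000000ull),
                                (long)(interval_ns % 1000000000ull) };
        nanosleep(&req, NULL);
        ns1 = wall_ns();
    } else {
        do {
            ns1 = wall_ns();
        } while (ns1 - ns0 < interval_ns);
    }
    uint64_t t1 = __rdtsc();

    return (double)(t1 - t0) * 1000.0 / (double)(ns1 - ns0);
}
```

Print `tsc_ticks_per_us(100000000, 0)` and `tsc_ticks_per_us(100000000, 1)` a few times with EIST / C1E enabled and again with them disabled. If the TSC stops or slows in some state, the idle figure should come out visibly lower than the busy one.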

                  - Gilboa


                  • #19
                    Originally posted by gilboa

                    AFAIK, if constant TSC is supported by the CPU, C1E / EIST should not make a difference.
                    However, take what I say with a *large* grain of salt, as I didn't test it myself.

                    I'd write a small tester - comparing rdtsc vs wall clock with EIST and w/o it. See if it makes a difference.

                    - Gilboa
                    OK, anyway, have you examined how constant or invariant TSC interacts with the CPU states under EIST or C1E?


                    «The Linux kernel uses different time sources. The most interesting are the HPET (High Precision Event Timer) and the TSC (Time Stamp Counter).
                    The TSC is the preferred clocksource between the two counters, as it is the fastest one, however it can only be used if it is stable. Currently there are 4 types of TSC present:
                    1. Constant. Constant TSC means that the TSC does not change with CPU frequency changes, however it does change on C state transitions.
                    2. Invariant. As described in the Intel manual: “The invariant TSC will run at a constant rate in all ACPI P-, C- and T-states”
                    3. Non-stop. The Non-stop TSC has the properties of both Constant and Invariant TSC.
                    4. None of the above. The TSC changes with the C, P and S state transitions».
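
For checking which of these a given machine reports, a small sketch: on Linux the kernel exposes `constant_tsc` and `nonstop_tsc` in the `flags` line of `/proc/cpuinfo`. The helper names below are made up, and this only reads what the kernel reports; it does not measure anything.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` appears as a whole whitespace-separated word in `flags`. */
static int cpu_has_flag(const char *flags, const char *name)
{
    size_t n = strlen(name);
    const char *p = flags;

    if (n == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/* Copy the first "flags" line of /proc/cpuinfo into buf.
 * Returns 0 on success, -1 if the file or the line is missing. */
static int read_cpu_flags(char *buf, size_t len)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    int rc = -1;

    if (!f)
        return -1;
    while (fgets(buf, (int)len, f)) {
        if (strncmp(buf, "flags", 5) == 0) {
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}
```

Call `read_cpu_flags()` with a generous buffer (the flags line is long, so something like 8192 bytes), then test the result with `cpu_has_flag(buf, "constant_tsc")` and `cpu_has_flag(buf, "nonstop_tsc")`.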
                    Last edited by Azrael5; 31 January 2017, 07:26 AM.



                    • #20
                      Originally posted by Azrael5

                      OK, anyway, have you examined how constant or invariant TSC interacts with the CPU states under EIST or C1E?


                      «The Linux kernel uses different time sources. The most interesting are the HPET (High Precision Event Timer) and the TSC (Time Stamp Counter).
                      The TSC is the preferred clocksource between the two counters, as it is the fastest one, however it can only be used if it is stable. Currently there are 4 types of TSC present:
                      1. Constant. Constant TSC means that the TSC does not change with CPU frequency changes, however it does change on C state transitions.
                      2. Invariant. As described in the Intel manual: “The invariant TSC will run at a constant rate in all ACPI P-, C- and T-states”
                      3. Non-stop. The Non-stop TSC has the properties of both Constant and Invariant TSC.
                      4. None of the above. The TSC changes with the C, P and S state transitions».

                      Hi,

                      Our software is installed on both mid to high-end servers (1/2/4S and above) and high-end demo laptops (running Fedora, that are shipped to our sales people).
                      While we tested TSC heavily on servers, we usually only verify that the software more-or-less works on laptops.

                      That said, our software relies heavily on TSC for timing calculations and throws a big fat warning if it encounters any major TSC clock drift (more than a couple hundred ns).
                      As far as I remember, I've yet to see any clock drift on Skylake laptops with all the power management options enabled (I would imagine that both C1E and EIST are enabled by default).
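
To illustrate the kind of check meant here (a generic sketch, not our actual code): convert a TSC delta to nanoseconds using a calibrated frequency, subtract the wall-clock delta over the same period, and warn when the difference exceeds a limit. All names and the kHz-based calibration value are assumptions for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Convert a TSC tick delta to nanoseconds for a TSC frequency given in kHz.
 * The division is split so the multiplication cannot overflow 64 bits. */
static int64_t tsc_delta_to_ns(uint64_t ticks, uint64_t tsc_khz)
{
    uint64_t whole = (ticks / tsc_khz) * 1000000ull;
    uint64_t frac = ((ticks % tsc_khz) * 1000000ull) / tsc_khz;
    return (int64_t)(whole + frac);
}

/* Signed drift: TSC-derived elapsed time minus wall-clock elapsed time. */
static int64_t tsc_drift_ns(uint64_t ticks, uint64_t tsc_khz, uint64_t wall_ns)
{
    return tsc_delta_to_ns(ticks, tsc_khz) - (int64_t)wall_ns;
}

/* Return 1 and print a warning if |drift| exceeds limit_ns, else return 0. */
static int tsc_drift_exceeds(uint64_t ticks, uint64_t tsc_khz,
                             uint64_t wall_ns, int64_t limit_ns)
{
    int64_t d = tsc_drift_ns(ticks, tsc_khz, wall_ns);

    if (d < 0)
        d = -d;
    if (d > limit_ns) {
        fprintf(stderr, "WARNING: TSC drift of %lld ns exceeds %lld ns\n",
                (long long)d, (long long)limit_ns);
        return 1;
    }
    return 0;
}
```

For example, at 3 GHz (`tsc_khz` = 3000000), 3000300 ticks over 1 ms of wall-clock time is a drift of 100 ns, which stays under a 200 ns limit.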

                      - Gilboa