
nVidia GPU performance tuning with AGPM


Hervé


  • Administrators

Much has been written about GPU tuning with AppleGraphicsPowerManagement (AGPM). There are good threads on the matter at InsanelyMac and Olarila, on RampageDev's blog, and in many other places that I cannot list exhaustively.
http://olarila.com/forum/viewtopic.php?f=18&t=1839
http://www.rampagedev.com/?page_id=311
 
A couple of years ago I did some research with Bronxteck on the matter of CPU/GPU performance tuning with FakeSMC. Whilst we achieved good results, I noticed ages ago that my D630 nVidia had no GPU throttling after waking from sleep: the GPU remained at high speed, leading to very high temperatures. As my motherboard recently failed again due to the well-known nVidia GPU weakness and got revived through "baking", I decided to look into the matter to avoid unnecessary GPU fast clocking and the consequent lengthy overheating.
 
I sought to fix the issue with AGPM. Here are my findings...
 
Recap of the D630 GPU-related specs & configuration:

  • nVidia Quadro NVS 135M GPU (based on GeForce 8400M GS (G86M) chip).
  • GPU PCI ven/dev ids: 10de/042b (in hex)
  • 3 known GPU PStates (=power states: GPU Core/RAM speed): 168/100MHz, 275/300MHz, 400/594MHz
  • SMBIOS used: MacBookPro5,1

In order for graphics power management to work as expected, the Info plist of the AGPM kext must normally have an entry for the targeted Mac model and targeted GPU. The plist does have an entry for the MacBookPro5,1 and a quick look at it shows:

<key>MacBookPro5,1</key>
<dict>
        <key>GFX0</key>
        <dict>
                <key>BoostPState</key>
                <array>
                        <integer>0</integer>
                        <integer>1</integer>
                        <integer>2</integer>
                        <integer>3</integer>
                </array>
                <key>BoostTime</key>
                <array>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                </array>
                <key>Heuristic</key>
                <dict>
                        <key>ID</key>
                        <integer>0</integer>
                        <key>IdleInterval</key>
                        <integer>250</integer>
                        <key>SensorOption</key>
                        <integer>1</integer>
                        <key>SensorSampleRate</key>
                        <integer>4</integer>
                        <key>TargetCount</key>
                        <integer>1</integer>
                        <key>Threshold_High</key>
                        <array>
                                <integer>57</integer>
                                <integer>65</integer>
                                <integer>82</integer>
                                <integer>100</integer>
                        </array>
                        <key>Threshold_Low</key>
                        <array>
                                <integer>0</integer>
                                <integer>68</integer>
                                <integer>75</integer>
                                <integer>95</integer>
                        </array>
                </dict>
                <key>control-id</key>
                <integer>17</integer>
        </dict>
        <key>IGPU</key>
        <dict>
                <key>BoostPState</key>
                <array>
                        <integer>0</integer>
                        <integer>1</integer>
                        <integer>2</integer>
                        <integer>3</integer>
                </array>
                <key>BoostTime</key>
                <array>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                </array>
                <key>Heuristic</key>
                <dict>
                        <key>ID</key>
                        <integer>1</integer>
                        <key>IdleInterval</key>
                        <integer>100</integer>
                        <key>Threshold_High</key>
                        <array>
                                <integer>50</integer>
                                <integer>75</integer>
                                <integer>96</integer>
                                <integer>100</integer>
                        </array>
                        <key>Threshold_High_v</key>
                        <array>
                                <integer>70</integer>
                                <integer>85</integer>
                                <integer>94</integer>
                                <integer>100</integer>
                        </array>
                        <key>Threshold_Low</key>
                        <array>
                                <integer>0</integer>
                                <integer>40</integer>
                                <integer>55</integer>
                                <integer>92</integer>
                        </array>
                        <key>Threshold_Low_v</key>
                        <array>
                                <integer>0</integer>
                                <integer>50</integer>
                                <integer>60</integer>
                                <integer>92</integer>
                        </array>
                </dict>
                <key>control-id</key>
                <integer>16</integer>
        </dict>
        <key>LogControl</key>
        <integer>0</integer>
</dict>

Amongst others, Olarila forum + RampageDev's blog provide detailed information on the meaning of the above code. I'll just highlight the fact that the values under the "Threshold..." lines are GPU idle state % (not load %) of P (=power) states 0, 1, 2 and 3 respectively. PState 0 is lowest idle/highest load and PState 3 is highest idle/lowest load. The available literature on AGPM also explains that, in order to obtain good GPU throttling, the AGPM kext should have an entry matching the device listed in the DSDT or corresponding to the computer hardware in the form "VendorDevice".
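To make the pairing between PStates and idle thresholds easier to read, here is a minimal Python sketch that lines up the Threshold_Low and Threshold_High arrays of the stock GFX0 entry per PState. The function name `pstate_table` is hypothetical, and the sanity checks (equal array lengths, values between 0 and 100) are my own assumptions about what a sensible entry looks like, not documented AGPM behaviour.

```python
def pstate_table(heuristic):
    """Return (pstate, threshold_low, threshold_high) tuples for a Heuristic dict.

    Assumption: both arrays hold one idle-% value per PState, in PState order.
    """
    high = heuristic["Threshold_High"]
    low = heuristic["Threshold_Low"]
    if len(high) != len(low):
        raise ValueError("Threshold_High and Threshold_Low differ in length")
    for value in list(high) + list(low):
        if not 0 <= value <= 100:
            raise ValueError(f"idle % out of range: {value}")
    return [(pstate, low[pstate], high[pstate]) for pstate in range(len(high))]


# Values copied from the stock MacBookPro5,1 GFX0 entry quoted above.
stock = {
    "Threshold_High": [57, 65, 82, 100],
    "Threshold_Low": [0, 68, 75, 95],
}

for pstate, low, high in pstate_table(stock):
    print(f"PState {pstate}: Threshold_Low={low:3d}% idle, Threshold_High={high:3d}% idle")
```

This only tabulates and checks the values; it does not attempt to reproduce the heuristic AGPM itself applies.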
 
In the case of the MacBookPro5,1 section of the Info plist, we can see an entry for GFX0 and one for IGPU. GFX0 is usually for discrete/add-on GPUs (e.g. nVidia or AMD graphics) whilst IGPU is for integrated GPUs (e.g. Intel GMA, Intel HD graphics).
 
If we look at the "regular" DSDT for the D630n, we clearly see that the GPU is described under a device called VID. There is no sign of a GFX0 or an IGPU device. I was therefore a little surprised that GPU throttling actually worked natively, even if only under certain conditions.

I did 2 separate things to try and fix the loss of GPU throttling after wake:

  • rename the VID device to GFX0 in the DSDT
  • keep the DSDT untouched and add a Vendor10deDevice042b entry in the MacBookPro5,1 section of the AGPM kext Info plist

I also changed the LogControl parameter of the MBP5,1 section of the Info plist to 1 to be able to track GPU state changes in the kernel log.

In both cases, results were disappointing: whilst I regained GPU throttling after wake, I could only see 2 GPU states: 275/300MHz and 400/594MHz. There was no sign of the lowest state, 168/100MHz. I tried to tune the various idle % levels specified in the GFX0 entry, unfortunately to no avail...
Mar 14 20:14:21 D630n kernel[0]: AGPM: updateGPUHwPstate(1, 0): fHwPstate = 0 fFB = 0xffffff800b28d800
Mar 14 20:14:21 D630n kernel[0]: AGPM: updateGPUHwPstate(): state = 1. Calling fFB->setAggressiveness()...
Mar 14 20:14:21 D630n kernel[0]: AGPM: GPU = GFX0 G-state set to 1 from 0, ControlID = 17. SW occupancy updated.

 
Browsing through the rest of the Info plist, I noticed several nVidia device entries which all had one parameter that differed from the GFX0 and IGPU entries: control-id. In the MBP5,1 section of the Info plist, the GFX0 entry holds a value of 17 and the IGPU entry a value of 16, whereas some nVidia device entries hold a value of 18. I therefore tried changing the GFX0 control-id value from 17 to 18. On reboot, bingo! The HWMonitor app displayed all 3 GPU PStates. The kernel log entries also showed GFX0 power states varying between 0 (high), 1 (medium) and 2 (low) as GPU load varied. Applying the same value to the alternative Vendor10deDevice042b configuration produced the exact same result. As desired, AGPM now worked after waking the laptop, so the days of the nVidia GPU stuck on highest speed after sleep are effectively gone. :)
Mar 14 21:24:42 D630n kernel[0]: AGPM: updateGPUHwPstate(1, 0): fHwPstate = 0 fFB = 0xffffff800b28d800
Mar 14 21:24:42 D630n kernel[0]: AGPM: updateGPUHwPstate(): state = 1. Calling fFB->setAggressiveness()...
Mar 14 21:24:42 D630n kernel[0]: AGPM: GPU = VID G-state set to 1 from 0, ControlID = 18. SW occupancy updated.
Mar 14 21:24:23 D630n kernel[0]: AGPM: updateGPUHwPstate(2, 0): fHwPstate = 1 fFB = 0xffffff800b28d800
Mar 14 21:24:23 D630n kernel[0]: AGPM: updateGPUHwPstate(): state = 2. Calling fFB->setAggressiveness()...
Mar 14 21:24:23 D630n kernel[0]: AGPM: GPU = VID G-state set to 2 from 1, ControlID = 18. SW occupancy updated.
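For anyone who wants to check which G-states actually get reached without reading the log by eye, here is a rough Python sketch that extracts the "G-state set to X from Y" transitions. The regular expression is written against the log lines quoted above only; the helper name `gstate_transitions` is hypothetical and the log format may well differ on other OS X versions.

```python
import re

# Pattern derived from the "AGPM: GPU = ... G-state set to ..." lines quoted above.
GSTATE_RE = re.compile(
    r"AGPM: GPU = (?P<gpu>\S+) G-state set to (?P<new>\d+) from (?P<old>\d+), "
    r"ControlID = (?P<cid>\d+)"
)


def gstate_transitions(lines):
    """Return (gpu, old_state, new_state, control_id) for each matching line."""
    found = []
    for line in lines:
        match = GSTATE_RE.search(line)
        if match:
            found.append(
                (match["gpu"], int(match["old"]), int(match["new"]), int(match["cid"]))
            )
    return found


# Two of the lines quoted above, used as sample input.
sample = [
    "Mar 14 21:24:42 D630n kernel[0]: AGPM: updateGPUHwPstate(): state = 1. Calling fFB->setAggressiveness()...",
    "Mar 14 21:24:42 D630n kernel[0]: AGPM: GPU = VID G-state set to 1 from 0, ControlID = 18. SW occupancy updated.",
    "Mar 14 21:24:23 D630n kernel[0]: AGPM: GPU = VID G-state set to 2 from 1, ControlID = 18. SW occupancy updated.",
]

transitions = gstate_transitions(sample)
states_seen = sorted({state for _, old, new, _ in transitions for state in (old, new)})
print(transitions)
print("G-states seen:", states_seen)
```

If only 2 distinct states ever show up over a long capture, that would match the symptom described earlier with control-id 17.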


So there we are, GPU throttling at all times on the D630 nVidia using one of the following 2 solutions:

  • Rename the correct VID device to GFX0 in the DSDT and change the control-id of the MBP5,1 GFX0 entry in the AGPM kext Info plist to 18. Be careful here, as there are 2 different VID devices in the DSDT: one under the AGP device and one under the PCI0 device. Only rename the VID device under AGP, then replace all references to AGP.VID with AGP.GFX0.
  • Keep DSDT as is and clone the GFX0 section of the MBP5,1 entry of AGPM kext Info plist to a new Vendor10deDevice042b entry with control-id set to 18.

Obviously, solution #2 is the preferred method as it does not require the somewhat risky DSDT patch and it is in line with existing entries in the AGPM kext Info plist. It is also more identifiable due to the clear Vendor/Device section:

<key>MacBookPro5,1</key>
<dict>
        <key>Vendor10deDevice042b</key>
        <dict>
                <key>BoostPState</key>
                <array>
                        <integer>0</integer>
                        <integer>1</integer>
                        <integer>2</integer>
                        <integer>3</integer>
                </array>
                <key>BoostTime</key>
                <array>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                        <integer>3</integer>
                </array>
                <key>Heuristic</key>
                <dict>
                        <key>ID</key>
                        <integer>0</integer>
                        <key>IdleInterval</key>
                        <integer>250</integer>
                        <key>SensorOption</key>
                        <integer>1</integer>
                        <key>SensorSampleRate</key>
                        <integer>4</integer>
                        <key>TargetCount</key>
                        <integer>1</integer>
                        <key>Threshold_High</key>
                        <array>
                                <integer>37</integer>
                                <integer>45</integer>
                                <integer>54</integer>
                                <integer>80</integer>
                        </array>
                        <key>Threshold_Low</key>
                        <array>
                                <integer>0</integer>
                                <integer>40</integer>
                                <integer>60</integer>
                                <integer>75</integer>
                        </array>
                </dict>
                <key>control-id</key>
                <integer>18</integer>
        </dict>
        <key>LogControl</key>
        <integer>0</integer>
</dict>
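The cloning step of solution #2 can also be scripted rather than done by hand. Below is a minimal Python sketch using the standard plistlib module; it works on a cut-down, in-memory stand-in for the MacBookPro5,1 section, so where that section sits inside the real AGPM Info.plist is deliberately left out. The function name `add_vendor_entry` is hypothetical, and using GFX0 as the template is simply the approach described above.

```python
import copy
import plistlib


def add_vendor_entry(machine, vendor_id, device_id, control_id=18, template_key="GFX0"):
    """Clone machine[template_key] into a VendorXXXXDeviceYYYY entry.

    The original template entry is left untouched; returns the new key name.
    """
    key = f"Vendor{vendor_id:04x}Device{device_id:04x}"
    entry = copy.deepcopy(machine[template_key])
    entry["control-id"] = control_id
    machine[key] = entry
    return key


# Cut-down stand-in for the MacBookPro5,1 section (not the full stock entry).
machine = {
    "GFX0": {
        "BoostPState": [0, 1, 2, 3],
        "BoostTime": [3, 3, 3, 3],
        "Heuristic": {
            "ID": 0,
            "IdleInterval": 250,
            "Threshold_High": [57, 65, 82, 100],
            "Threshold_Low": [0, 68, 75, 95],
        },
        "control-id": 17,
    },
    "LogControl": 0,
}

key = add_vendor_entry(machine, 0x10DE, 0x042B)
machine["LogControl"] = 1  # optional: log G-state changes while testing

# Serialise to XML plist text for inspection; nothing is written to disk here.
xml_text = plistlib.dumps({"MacBookPro5,1": machine}).decode("utf-8")
print(xml_text)
```

Work on a copy of the kext's Info.plist and keep a backup of the original; the sketch only prints the result.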

The idle % levels can be adjusted as desired or required (as in the above example) to optimise performance. This GPU throttling was tested and verified under Mountain Lion 10.8.5, Mavericks 10.9.5 and Yosemite 10.10.3. The same principles are of course re-usable for other machines.

 

Note: the patched AGPM kext must remain in /S/L/E (/System/Library/Extensions). It will be ineffective if placed in /E/E (/Extra/Extensions) for handling by MyHack...


  • Administrators

I've just achieved identical results on my D620 nVidia model, using the exact same principles but a different device id for the Quadro NVS 110M GPU (id 0x01d7). I've no monitoring other than the kernel logs though...

<key>MacBookPro5,1</key>
<dict>
        <key>Vendor10deDevice01d7</key>
        <dict>
                <key>BoostPState</key>
                <array>
                        <integer>0</integer>
                        <integer>1</integer>
                        <integer>2</integer>
                        <integer>3</integer>
                </array>
[...]

Provided the LogControl parameter is set to 1 in the AGPM Info plist, the kernel log shows entries such as the following, even after wake:

Mar 18 13:55:22 d620_nvidia kernel[0]: AGPM: GPU = VID G-state set to 1 from 0, ControlID = 18. SW occupancy updated.

Mar 18 13:55:24 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(2, 0): fHwPstate = 1 fFB = 0x8039c00
Mar 18 13:55:24 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(): state = 2. Calling fFB->setAggressiveness()...
Mar 18 13:55:24 d620_nvidia kernel[0]: AGPM: GPU = VID G-state set to 2 from 1, ControlID = 18. SW occupancy updated.
Mar 18 13:56:28 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(1, 0): fHwPstate = 2 fFB = 0x8039c00
Mar 18 13:56:28 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(): state = 1. Calling fFB->setAggressiveness()...
Mar 18 13:56:28 d620_nvidia kernel[0]: AGPM: GPU = VID G-state set to 1 from 2, ControlID = 18. SW occupancy updated.
Mar 18 13:56:30 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(2, 0): fHwPstate = 1 fFB = 0x8039c00
Mar 18 13:56:30 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(): state = 2. Calling fFB->setAggressiveness()...
Mar 18 13:56:30 d620_nvidia kernel[0]: AGPM: GPU = VID G-state set to 2 from 1, ControlID = 18. SW occupancy updated.
Mar 18 13:56:35 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(0, 0): fHwPstate = 2 fFB = 0x8039c00
Mar 18 13:56:35 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(): state = 0. Calling fFB->setAggressiveness()...
Mar 18 13:56:35 d620_nvidia kernel[0]: AGPM: GPU = VID G-state set to 0 from 2, ControlID = 18. SW occupancy updated.
Mar 18 13:56:38 d620_nvidia kernel[0]: AGPM: updateGPUHwPstate(1, 0): fHwPstate = 0 fFB = 0x8039c00
 
It's pretty safe to say this extends to D830, M2300 or M4300 models of similar specifications, though I have not tested those myself.

 


  • Administrators

Confirmed to work on the E6400 with nVidia Quadro NVS 160M 256MB (PCI id 10de/06eb) too. The same patch as provided for the D630 above can safely be re-used on that laptop, the NVS 160M being based on the GeForce 9300M GS; just change the device id in the patch (obviously).

<key>Threshold_High</key>
<array>
        <integer>37</integer>
        <integer>45</integer>
        <integer>54</integer>
        <integer>80</integer>
</array>
<key>Threshold_Low</key>
<array>
        <integer>0</integer>
        <integer>40</integer>
        <integer>60</integer>
        <integer>75</integer>
</array>
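Since only the device id changes from one laptop to the next, the entry name can be derived from the PCI ven/dev notation used in this thread. A small Python sketch, assuming ids are written as "vendor/device" in hex; the helper name `agpm_key` is hypothetical.

```python
def agpm_key(pci_id):
    """Turn a 'vendor/device' hex pair such as '10de/06eb' into an AGPM entry name."""
    vendor, device = pci_id.split("/")
    # int(..., 16) raises ValueError on anything that is not valid hex.
    return f"Vendor{int(vendor, 16):04x}Device{int(device, 16):04x}"


# PCI ids of the three laptops covered in this thread.
for model, pci_id in [("D630", "10de/042b"), ("D620", "10de/01d7"), ("E6400", "10de/06eb")]:
    print(model, agpm_key(pci_id))
```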

  • Administrators

Thanks to a hint mentioned by DuongTH, I've experimented with AGPM injection in FakeSMC. It works perfectly. It's a simple matter of adding an "AGPM" section to the kext's Info.plist file with all the above parameters. This basically avoids the hassle of repatching the vanilla AGPM kext after each OS X installation or update.

 

AGPM injection sample provided below:

Info.plist_FakeSMC_with_AGPM_injection.zip

 

Re-usable at will, with any necessary parameter adjustments of course.

 

By the way, these are the typical GPU temperatures I see on my D630 nVidia (I've got a copper shim on the dGPU chip), showing that things are well under control:

Temp#1.jpg Temp#2.jpg Temp#3.jpg


  • Administrators

Revised tuning after renewed testing on D630n under Mojave and Catalina. The following values bring better throttling and lower GPU temperatures:

<key>Threshold_High</key> 
<array>
        <integer>37</integer>
        <integer>45</integer>
        <integer>54</integer>
        <integer>80</integer>
</array>

 

All of the above updated accordingly.

