820-2849, random kernel panics - possibly audio?

RandomInsano

New member
Used machine, no history. Has a large dent from a hard drop on the front-right side. Currently hitting random kernel panics whether the machine is hot or cold. No visible water damage on the back of the motherboard.

Because coreaudiod seems to be on the CPU most often during death, I wonder if the sound hardware is to blame. Test details below.

MacBook Pro (17-inch, Mid 2010)
2.53 GHz Intel Core i5
BTO/CTO

I'm expecting this is the particular motherboard given the date and processor specs, but I can get more details if needed.
http://www.powerbookmedic.com/MacBoo...2-p-24188.html

Processes on the current thread during crash:
coreaudiod (more than twice)
kernel_task (more than twice)
com.apple.geod (once)

Steps to troubleshoot:

1. Swapped RAM and SSD, reinstalled macOS 10.13.6
2. Ran heavy OpenGL and CPU workloads
3. Ran Ubuntu Linux 19.04, similar random apps crashing
4. Ran YouTube videos
5. Disabled on-board graphics, then discrete graphics

I've managed to keep the machine fairly stable overnight playing audio *while muted* (left Google Play running for over eight hours without a crash), and I've pushed the discrete graphics hard with the WebGL Aquarium demo, confirming the GPU is being taxed with a tool called OpenGL Extensions Viewer:
https://webglsamples.org/aquarium/aquarium.html
https://apps.apple.com/us/app/opengl...er/id444052073

I've also run some 15-20 minute compiles and Cinebench R20 to make sure the CPU isn't the problem, and it seems to be unaffected.

To rule out the audio circuit, I tried playing YouTube videos over Bluetooth audio, and the crashes still occur.

I've tried disabling the internal graphics from the power pane in macOS and it still crashed. I've disabled the discrete graphics with the NVRAM trick and it still crashes.

All of the above was done while plugged into power and fully charged. When the battery dropped to ~80%, crashes happened in quick succession, every three minutes or so (too short to save the crash report text), until I plugged into power. I'll try to replicate this again.


I'd appreciate some ideas on what to try and narrow down where on the PCB I should start checking for voltage/cracked traces.
 

RandomInsano

New member
Hmm. Noticed after the last crash that the discrete GPU is back. No upgrade or anything. I was under the impression the setting would stick a little longer, and "involuntary NVRAM reset" is a hell of a thing to search for on Google.

Questions from me so far on this thread:

1. What are the motherboard voltages / tests I can do here?
2. Does the NVRAM get reset often with macOS 10.13.6?
 

2informaticos

Administrator
Staff member
Run ASD and see if it reports any errors.
Also use gfxcardstatus to switch between graphics for tests.
Does it still crash on charger only (no battery)?
 

RandomInsano

New member
ASD - Downloading pack now. Unsure which parts of the 52GB torrent are needed, but I'll figure it out.
gfxcardstatus - crashed while forced to integrated. Testing discrete now.
charger only - I'll unplug the battery tomorrow.
 

RandomInsano

New member
Definitely helpful!

ASD - EFI passed (except S.M.A.R.T., DMA skipped), OS failed with 27 failed cases
gfxcardstatus - Crashes regardless of integrated or discrete
charger only - boot loop. Never manages to bring up the GUI before dying and blinking five times.


Notable thing about the failed tests is that I could visibly see their output (lines, triangles, foggy boxes). I'm not sure if the particular version of the test suite matters, but I chose the .3 version assuming it was the newest.
 

Attachments

  • ASD 3S138 - 1.zip
    6.4 KB · Views: 0

2informaticos

Administrator
Staff member
Did you try one RAM module at a time, in each slot separately?
Possibly cracked solder joints/traces under the PCH/CPU, if it was dropped...
 

RandomInsano

New member
No, I didn't think it was necessary given the boot loop and the memory tests that passed both in the EFI and OS test suites.

I can give them a wiggle while running a memory test. Anything else I can check/do while I have the toolbox out?

I'm wondering if there's a voltage regulator circuit just on the edge of unusable, but I don't do this sort of thing that often.
 

RandomInsano

New member
Alrighty, tests with RAM in the bottom slot resulted in multiple kernel panics in the OS test suite. RAM in the top slot did the same.

I tapped on the RAM and various parts of the board while the test was happening and it seems to fail only when doing the video test.

I'm up for more ideas.



Random extra adventures that may not be relevant: I noticed the gmux gets too hot to touch seemingly randomly then cools off quickly.

On a whim I started a Windows XP installation with the battery disconnected (a state in which macOS does not even boot to the login screen), and it ran most of the way through until it got to initializing hardware and then hit a BSOD. On the first run it died without naming a specific module; on the second run it died in CDFS.sys. I'm considering running some tests without a GUI.
 

RandomInsano

New member
Downloaded OpenBoardView and the schematics and did some poking around this evening.

Code:
PPVCORE_GPU      = 0.856V (when discrete graphics enabled, 0V with integrated)
PPBUS_G3H        = 7.8V   (measured from C5305)
PP0V75_S0_DDRVTT = 0.767V (measured from C7360)
PP5V_S0          = 5.00V  (measured from J5660)
PPVCORE_S0_CPU   = 1.096V (booting) ~0.772V (idle, measured from C1651)

PPBUS_G3H should be 6V according to quadrant D7 on page 8 of the schematic, but that seems impossible given that PPVBAT_G3H_CHGR_REG and CHGR_PHASE_MID are both 8.4V on the opposite side of F7040/F7041 (page 70).

Unplugging power and running off battery, PPBUS_G3H = 7.70V; with the battery + charger it eventually climbed to 8.35V.


While the machine was being its grumpy self, I didn't see any serious dips in power on PPVCORE_S0_CPU or PP0V75_S0_DDRVTT so I'm assuming things are happy voltage-wise for both RAM and CPU. Any ideas of where to poke considering it *seems* to be graphics related?
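
One way to sanity-check readings like these is to compare each one against the nominal voltage encoded in the rail name. This is only a minimal sketch: it assumes the `PP<volts>V<fraction>` naming convention (so `PP0V75` is read as 0.75V), and the function names are made up for illustration rather than taken from any tool. Rails with no voltage in the name, such as PPBUS_G3H or PPVCORE_S0_CPU, are skipped because their targets have to come from the schematic.

```python
import re

# Assumed naming convention: "PP5V_S0" -> 5 V, "PP0V75_S0_DDRVTT" -> 0.75 V.
_RAIL = re.compile(r"^PP(\d+)V(\d*)_")


def nominal_voltage(rail):
    """Return the nominal voltage implied by the rail name, or None."""
    match = _RAIL.match(rail)
    if match is None:
        return None
    whole, frac = match.groups()
    return float(f"{whole}.{frac or '0'}")


def deviation_percent(rail, measured):
    """Percent deviation of a reading from the name-implied nominal value."""
    nominal = nominal_voltage(rail)
    if nominal is None:
        return None
    return (measured - nominal) / nominal * 100.0


# Readings copied from the measurements in this post.
readings = {
    "PPBUS_G3H": 7.8,
    "PP0V75_S0_DDRVTT": 0.767,
    "PP5V_S0": 5.00,
}

for rail, volts in readings.items():
    dev = deviation_percent(rail, volts)
    if dev is None:
        print(f"{rail}: {volts}V (no nominal in name, check schematic)")
    else:
        print(f"{rail}: {volts}V ({dev:+.1f}% from nominal)")
```

A small deviation only says the rail looked fine at the moment it was probed; a multimeter will not catch short dips.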
 

RandomInsano

New member
I've already tried enabling and disabling the GPU, and it didn't make much difference. Swapping the GPU chip itself is going to be more than I'm capable of, so if that's the next step I think you can stop here. It's a hobby on my side, not a job, so I don't want to burn your time budget on my fooling around.

I've done a few more things though:
  1. Running a background process to occupy two cores extends the uptime from 30-60 minutes to eight hours while doing the same workflow (watching YouTube, editing files, music). The toastier the better it seems, so I was wrong in my original post.
  2. Running Ubuntu Server so that the graphics aren't pushed too hard still crashes when the machine is cooler.
  3. The machine is using an older 85W charger. Model number A1172.
I compiled a tool called SMCkit that lists the temperatures in the system. Here are the temperatures when things are "stable":

Code:
-- Temperature --
CPU_0_DIODE          84.0°C
CPU_0_PROXIMITY      67.0°C
ENCLOSURE_BASE_0     34.0°C
ENCLOSURE_BASE_1     34.0°C
ENCLOSURE_BASE_2     31.0°C
GPU_0_DIODE          -1.0°C   # This comes and goes, but when it's here, it's below 100°C
GPU_0_PROXIMITY      58.0°C
HEATSINK_1           51.0°C
HEATSINK_2           50.0°C
MISC_PROXIMITY       51.0°C
PALM_REST            32.0°C
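
For comparing dumps like this across several runs, a small parser helps. The following is a rough sketch that assumes the "NAME  value" layout shown above and treats negative readings (like the -1.0 on GPU_0_DIODE) as a sensor that isn't reporting; that interpretation is my assumption, and the function names are made up for illustration.

```python
import re

# Matches lines such as "CPU_0_DIODE   84.0°C"; the unit is ignored.
_LINE = re.compile(r"^(\w+)\s+(-?\d+(?:\.\d+)?)")


def parse_temperatures(text):
    """Map sensor name -> reading for every 'NAME  value' line in the dump."""
    temps = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            temps[match.group(1)] = float(match.group(2))
    return temps


def split_valid(temps):
    """Separate plausible readings from placeholders (assumed: negative values)."""
    valid = {name: t for name, t in temps.items() if t >= 0}
    missing = sorted(name for name, t in temps.items() if t < 0)
    return valid, missing


dump = """-- Temperature --
CPU_0_DIODE          84.0
GPU_0_DIODE          -1.0
GPU_0_PROXIMITY      58.0
PALM_REST            32.0
"""

valid, missing = split_valid(parse_temperatures(dump))
hottest = max(valid, key=valid.get)
print("hottest:", hottest, valid[hottest])
print("not reporting:", missing)
```

Logging one dump per crash would show whether GPU_0_DIODE dropping out lines up with the panics.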

I've also taken a few more voltage measurements. Most are right in line with their tolerances, but PPBUS_CPU_IMVP_ISNS seems to fluctuate:

6.95V - under load
7.47V - idle
8.11V - with charger
8.37V - with charger, no battery connected. Same value as PPBUS_G3H

I found that when PPBUS_CPU_IMVP_ISNS = PPBUS_G3H, the system rebooted quite often.


Does anyone know what PPBUS_CPU_IMVP_ISNS is used for? It seems to feed into PPVCORE_S0_CPU from page 74 of the schematic.
 

2informaticos

Administrator
Staff member
PPBUS_CPU_IMVP_ISNS is practically the same as PPBUS_G3H; there is only a 10 milliohm resistor between the two rails.
In fact I'm not even sure it is present; it appears as OMIT, which normally means not soldered.
Please confirm whether R5388 and U5388 are there.
If yes, change the resistor, or use 5 milliohm instead; just solder another 10 milliohm on top of the original.
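
The arithmetic behind stacking a second resistor can be sketched as below. Two equal resistors in parallel halve the resistance, and Ohm's law gives the drop across the sense resistor. The 4 A load current is purely illustrative and was not measured on this board.

```python
def parallel(*resistances_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances_ohm)


def sense_drop_mv(current_a, resistance_ohm):
    """Voltage drop across a resistor in millivolts (V = I * R)."""
    return current_a * resistance_ohm * 1000.0


R_SENSE = 0.010                        # 10 milliohm, as described above
stacked = parallel(R_SENSE, R_SENSE)   # second 10 milliohm soldered on top

EXAMPLE_CURRENT_A = 4.0                # hypothetical load current, for illustration

print(f"stacked resistance: {stacked * 1000:.1f} milliohm")
print(f"drop at 10 milliohm: {sense_drop_mv(EXAMPLE_CURRENT_A, R_SENSE):.0f} mV")
print(f"drop at 5 milliohm: {sense_drop_mv(EXAMPLE_CURRENT_A, stacked):.0f} mV")
```

With a healthy resistor of this size the two rails should differ by only tens of millivolts at a few amps, so a much larger gap measured at the same moment would point at the resistor or its solder joints.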
 