3662/3787/00138/etc CPU/Board Issue Question

SMMRepair

Member
Sorry to post about this every month, but I have some potentially new information about the terrible CPU/board failure on the Late 2013-2015 A1398 boards.

I replaced two customer boards this week with this issue. When inspecting the boards, I found that U7310 had been replaced on both. There was no sign of liquid in that area (no discolored cap ends or anything), and neither was shorted, but both had definitely been replaced. The work was pretty clean and looked like it had been done by Apple (not perfect, but good enough: a small amount of flux, a few pins not "perfect", etc). Both boards had the same CPU failure, and both had issues coming out of deep sleep/hibernation (i.e. I would leave them overnight, then open the unit to find the fans running high and no video). I'm going to go through the pile of boards with the issue and see how many have had U7310 replaced.

Knowing this is a Vcore issue, could it be that one of the regulators/power supplies (U7310/20/30) is failing, which prevents Vcore from recovering/stabilizing? I know it's possible these had s5_hs_computing shorted to ground, but since that kills the CPU so often (90%+), it would seem odd that both "survived" it well enough to be sold as refurbished by Apple (wouldn't put it past them, though). Additionally, there was zero sign of liquid at all--no discolored caps like you sometimes see when liquid touches there, etc.

To add a bit to this: I repaired a third 00138 board this week that DID have a tiny spot of liquid on U7310. Expecting the CPU to be dead, I did the repair with low hopes, but surprisingly the board powered right up and passed diagnostics. It ran Heaven for ~1 hour, at which point I closed the unit for the night. I came back the next morning and it, too, had high fans and no video...which makes me wonder if the replacement U7310 I had pulled (from a board with the CPU issue, undoubtedly) might have been bad, and caused this board (that had survived a shorted s5_hs...) to develop the CPU issue? Most likely a huge coincidence, but could faulty U7310/20/30 chips be causing this whole issue? I know you, duke, have done WAY more testing on this than I'll ever hope to have time to do, but I did think it was interesting. I know the current thought is that there are open lines/traces inside the board, and that it isn't repairable. It's so painful telling retail customers that this board is not repairable.

Just a thought, curious to hear your input.
 

dukefawks

Administrator
Only one way to find out for sure. Replace U7310/20/30 on a board with NEW ones. I have swapped them, but they were all pulled from garbage boards so they could have been duds too. I have not ordered new ones and tried that so this may be the solution...

Another good possibility is that "Apple" is replacing U7310 because they don't even know what the hell is going on, and the boards were never really fixed to begin with. On 820-2850 boards, "Apple" was replacing the GPUs at first, when it later turned out to be a VRAM power issue.
 

SMMRepair

Member
That's a good point. I'm going to order some and see...it would seem odd that they'd be bad/defective across 3 years of production. Guess we'll see! I'll order some tonight and update in the next week or so. Thanks, Duke!
 

dukefawks

Administrator
Replacing U7310 does make sense in a way. As these crashes only happen during low-load operation, U7200 will probably have shut down phases 2 and 3 and be running in diode emulation mode on phase 1.
Of course I cannot find the datasheet for U7200, which is a shame, as there may be a few other approaches here too. The SLOPE and PROG1/2/3 pins are probably used to set boot voltage, max. current and slew rates. Messing with those may also be a solution and/or an improvement. But without the datasheet I'm just stabbing in the dark.

Having a 100% sure way to reproduce the crashes would also help. These boards can run for hours without any issues, which makes diagnosing them just a total PIA.
 

SMMRepair

Member
Yeah, I've noticed the inconsistent crashing, too. The most reliable way I've found to reproduce the failure is to put them to sleep/hibernate for a long time (i.e. overnight, or preferably over the weekend). Most that are failing (or failed "enough") will then crash, and the fans kick on high while the unit is still closed (before I even open/touch it). Also, many of the failed boards I receive for repair (from retail customers) include the note "trouble coming out of sleep", which supports that method a bit, too. That's what gave me the idea to use it for testing.

Other than that, the symptoms seem to vary slightly: some crashed under load after a while, others seemingly at random. I had a few that froze up completely immediately after closing Heaven (it would run fine for 1-2 hours, but as soon as I closed it out, bam, locked up). I've had a few that wouldn't boot up at all (high fans), but those seem to possibly be a different issue altogether. Like you say, it's hard to tie them all together when they're a bit inconsistent in their failures.

I wonder if this could be because different combinations of U7310/20/30 fail (same model chip and all), i.e. if phase 1 fails, "X" symptom set; if phase 2 fails, "Y" symptom set; if phase 1 and phase 2 fail together, "Z" symptom set. Just thinking out loud, because this is honestly getting beyond my understanding.

With FDMF6808N being used only on the boards in question...it seems like a good candidate. I've contacted a supplier about some "new" chips. Only seems to be one or two selling them...and I'm hoping they aren't just pulling them from boards. We'll see I guess.
 

dukefawks

Administrator
Crashing when it is asleep is strange. In sleep, CPU Vcore should be shut down, as it is an S0 rail. ALL_SYS will also be down, and that is the enable for U7200, so Vcore is not on. It means that something must be waking the system and making it crash. Fans are also powered from 5V_S0 and should not be able to spin up in sleep mode. That raises the question: did the machine really go to sleep fully, or did it just hang half way? You should check whether Vcore actually shuts down, then wait until the system crashes in sleep and measure whether all S0 rails have come on again.
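The two-step check described above can be written down as a small decision helper, which may be handy when logging several boards. This is only a sketch: `RailReading` and `interpret` are made-up names, each field is a plain yes/no multimeter observation rather than a voltage threshold, and the "after crash" reading is assumed to be taken while the fans are already running high.

```python
from dataclasses import dataclass


@dataclass
class RailReading:
    """One set of yes/no observations (True = rail is up). Hypothetical helper type."""
    vcore_on: bool    # CPU Vcore present
    all_sys_on: bool  # ALL_SYS, the enable for U7200
    s0_5v_on: bool    # 5V_S0, which also powers the fans


def interpret(after_sleep: RailReading, after_crash: RailReading) -> str:
    """Classify a board from a reading taken after sleep entry and one after the crash."""
    # Step 1: did the machine really go to sleep? Any S0 rail still up says no.
    if after_sleep.vcore_on or after_sleep.all_sys_on or after_sleep.s0_5v_on:
        return "never fully slept: S0 rails stayed up, likely hung half way"
    # Step 2: it did sleep; have the S0 rails come back at crash time?
    if after_crash.vcore_on and after_crash.all_sys_on and after_crash.s0_5v_on:
        return "slept, then something woke it and it crashed in S0"
    # Fans running without 5V_S0 contradicts the power tree, so re-measure.
    if not after_crash.s0_5v_on:
        return "inconsistent: fans need 5V_S0, so re-check the measurements"
    return "partial wake: some S0 rails up, others down"


if __name__ == "__main__":
    asleep = RailReading(False, False, False)
    awake = RailReading(True, True, True)
    print(interpret(asleep, awake))
```

A "partial wake" result (5V_S0 up but Vcore missing) would probably be the most interesting one to chase, since it points at the Vcore stage itself.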

Check the date codes on U7310/20/30 on the boards where you think U7310 was replaced. If it really was replaced, the U7310 date code should differ from the other two.
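If the date codes get noted for a whole pile of boards, the comparison is easy to automate. A minimal sketch, assuming the markings are typed in by hand as strings; the function name `odd_one_out` and the date codes in the example are placeholders, not real readings.

```python
from collections import Counter


def odd_one_out(date_codes: dict[str, str]) -> list[str]:
    """Return the designators whose date code differs from the most common one."""
    counts = Counter(date_codes.values())
    if len(counts) <= 1:
        return []  # all markings match: no sign of a swapped chip
    if len(counts) == len(date_codes):
        return sorted(date_codes)  # all different: no majority to compare against
    majority, _ = counts.most_common(1)[0]
    return sorted(ref for ref, code in date_codes.items() if code != majority)


if __name__ == "__main__":
    # Placeholder date codes, purely for illustration.
    board = {"U7310": "1621", "U7320": "1412", "U7330": "1412"}
    print(odd_one_out(board))
```

A matching date code does not prove the chip is original (a donor part could come from the same batch), so this only flags likely replacements.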

I have not seen this failure on 00138 boards, yet the Vcore circuit is 100% identical apart from R7230, which should not matter. Of course, the 00138 boards are newer, and my impression is that fewer of them were sold.

AliExpress is full of these chips. I doubt they will be pulling them from boards, as they are quite cheap.
 

SMMRepair

Member
Thanks, Duke. Yeah, I'm going to grab some chips now and will see how that goes. That's a good point about the unit failing during sleep as well. I've noticed that the boards will hardly ever fail in light sleep (S3), i.e. after a few hours or so. It's only when they sleep/hibernate (S4, I assume) for a while that they fail. I've let a board sit for 3-4 hours and it comes out just fine (over and over--3 or 4 "sessions" of this). As soon as it goes for a longer duration (12-24 hours), it fails. It seems to be a duration thing. At what point (in terms of time--minutes/hours/etc) do these units go from S3 to S4? What specifically determines that? It would be nice to be able to force the unit into S4 quicker, maybe via Terminal or an actual circuit modification or something?
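On the Terminal side, macOS exposes the sleep-to-hibernate behaviour through `pmset` (`hibernatemode`, `standby`, `standbydelay`), so shortening the wait looks plausible without touching the board. The sketch below only builds and prints the commands by default. The values (hibernate mode 25, a 60-second standby delay) are assumptions for a test bench and have not been verified on these boards, and the helper names are made up. Note the `pmset -g` output first so the original settings can be restored afterwards.

```python
import subprocess


def pmset_commands(hibernate_mode: int = 25, standby_delay_s: int = 60) -> list[list[str]]:
    """Build pmset invocations meant to shorten the wait before hibernation."""
    return [
        ["pmset", "-g"],  # show current settings so they can be restored later
        ["sudo", "pmset", "-a", "hibernatemode", str(hibernate_mode)],
        ["sudo", "pmset", "-a", "standby", "1"],
        ["sudo", "pmset", "-a", "standbydelay", str(standby_delay_s)],
    ]


def apply(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; only execute it when dry_run is False (macOS only)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(pmset_commands())  # dry run: prints the commands, changes nothing
```

With settings like these applied, closing the lid (or running `pmset sleepnow`) should send the unit toward hibernation much sooner than a 12-24 hour wait, which would make each test session far shorter if the failure really is tied to S4.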

I'm going to take some time soon and really go through and make some very specific notes on these boards. I'll try to separate the 3662/3787 boards from the newer 00138/00426. I'm going to put together 4 or 5 test units, get them set up, subject them to exactly the same conditions and then note everything. Are there any specific things you would recommend I check/note or pay attention to? I want to try to get this thing figured out. I have literally 40+ of these sitting here (probably 30 3662/3787 and 10 or so 00138/00426, which I'd imagine does speak to the 2015 models just being newer and having fewer failures at this point), so if they're repairable, it would be a nice payoff. I'd be happy to send you a few boards if you're interested, though I'm sure you have plenty yourself.
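For the note-taking, something as simple as one record per sleep session would already make the two board groups comparable. A rough sketch, with made-up field names and made-up example entries:

```python
from collections import Counter
from dataclasses import dataclass

OLDER = {"3662", "3787"}
NEWER = {"00138", "00426"}


@dataclass
class SleepTest:
    """One sleep session on one test unit (hypothetical record layout)."""
    board: str           # board number, e.g. "3662" or "00138"
    hours_asleep: float  # how long the lid stayed closed
    crashed: bool        # high fans / no video when checked
    note: str = ""       # e.g. "U7310 date code differs"


def family(board: str) -> str:
    """Group a board number into the older or newer family."""
    if board in OLDER:
        return "older (3662/3787)"
    if board in NEWER:
        return "newer (00138/00426)"
    raise ValueError(f"unknown board number: {board}")


def crash_rate_by_family(tests: list[SleepTest]) -> dict[str, float]:
    """Fraction of sessions that ended in a crash, per board family."""
    totals = Counter()
    crashes = Counter()
    for t in tests:
        fam = family(t.board)
        totals[fam] += 1
        crashes[fam] += int(t.crashed)
    return {fam: crashes[fam] / totals[fam] for fam in totals}


if __name__ == "__main__":
    # Invented entries, only to show the shape of the log.
    log = [
        SleepTest("3662", 4, False),
        SleepTest("3662", 18, True, "fans high before lid opened"),
        SleepTest("00138", 24, True),
    ]
    print(crash_rate_by_family(log))
```

Keeping `hours_asleep` in every record also makes it possible to check later whether the failures really cluster above some sleep duration.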
 

Inwerp

New member
Call me dumb, but did you try testing it with Windows/Linux/the "native" OS version (internet recovery with Command+R installs the "factory" version instead of the latest)?
Microsoft, at least, is well known for breaking ACPI support on old devices.
From my PC experience, wake-up problems often come from video card switching. For example, Sony Vaio laptops have a hardware button for switching video cards, which is integrated with the driver and BIOS. Needless to say, there are no updates for W10, and I literally sacrificed my sanity fixing sleep mode.
 

dukefawks

Administrator
Apple does not break anything simple like that on their machines. This is the beauty of a limited hardware pool: no surprises with weird combinations of hardware. All machines that a version of OS X supports are tested, and a clean OS X install will not have stupid issues like crashing in sleep.
 