
Tesla m40 burned my motherboard VRM


    Tesla m40 burned my motherboard VRM

    Hi Folks:

    I just bought a used Tesla M40 12 GB on eBay, along with an NVIDIA CPU-to-PCIe power adapter (two wires to one). I plugged in the two PCIe connectors and put the card in the PCIe x16 slot. It all started fine: the computer booted and the Linux driver installation program detected the M40. I started installing the driver from the internet, since the installation program told me it was available in the Linux channels. But after a while (I was browsing the internet) my computer shut down and let out the magic smoke. The VRMs burned. The board (ASRock Z77 Extreme4) uses discrete MOSFETs. Right now I am getting a new board (and may try to replace the MOSFETs on the old one), but I would still like to try the Tesla, as these cards are hard to find now. I am worried I could burn the new board too, so I am wondering how I could test the card before putting it in the new one. My fear is that the card itself is OK but the driver turns on some circuit that is shorted or draws abnormal current. What are your thoughts?

    #2
    Re: Tesla m40 burned my motherboard VRM

    I forgot to mention that the Tesla was quite hot when I removed it from the motherboard. I wasn't running any calculations on it; I had just started the driver installation.



      #3
      Re: Tesla m40 burned my motherboard VRM

      What power supply are you using (brand and model #)?
      The PSU should have shut down, unless it's one of those huge single 12V-rail designs... which isn't really a good thing, IMO.
      As for the motherboard burning - that could have been a freak accident / coincidence when you had the GPU installed. Or it could be that the extra heat from the Tesla triggered something to go bad on the mobo. Or the PSU was overloaded and giving out extra ripple, taking out a component on the motherboard.

      Kind of hard to guess what happened from this info alone. Pictures and more info would be helpful.

      Most likely the Tesla card is OK. But you may want to use a different PSU that shuts down quicker in case of a problem. Also, make sure that power connector you used can carry sufficient current. Otherwise, a large voltage drop on the cables could also be another reason why something failed.



        #4
        Re: Tesla m40 burned my motherboard VRM

        The power supply is a Seasonic Prime 1200 Platinum. The power connector was an original NVIDIA CPU-to-PCIe cable. I am sending you some pictures. I now have an ASUS board with DIGI VRM and DrMOS. I suppose that will keep me from burning my VRM again in the future, even with an overload. But I am still worried I could burn the new board once it arrives. The previous board was running perfectly with zero issues before this. I had some people smell the Tesla card and they say it doesn't smell burnt.
        Attached Files



          #5
          Re: Tesla m40 burned my motherboard VRM

          This is exactly the power supply:

          SSR-1200PD

          Says

          Protection OPP, OVP, UVP, OCP, OTP, SCP

          We tested the voltages with meters and all were in range. We also powered another computer with it and all was OK, though not under any big load. I have a USB 40 MHz oscilloscope around. I will have to figure out how to safely add some load to the PSU.

          https://seasonic.com/prime-platinum?...#specification
          Last edited by waldoalvarez00; 05-30-2021, 03:23 PM.



            #6
            Re: Tesla m40 burned my motherboard VRM

            Probably not the PSU then.

            I suspect this was just a "freak" accident - likely something to do with things moving around a bit when the card was installed. For example, I see you have some custom bolt mods going through the board for the CPU heatsink. Even with proper insulating washers under the nuts, you still have to be careful how you install these. The one in the upper-right corner of the first picture looks especially accident-prone with the way the steel washer gets close to SMD components there. If you don't have good thick plastic insulating washers (and make sure they don't apply pressure over any SMD component or near it), then that could well have shorted when the card was inserted and things were moved a little. As to why the card/board didn't show problems right away but only when you installed the drivers - well, something could have been super-close to shorting from those bolt mods, but just had to "wait" for the "right" moment.

            In any case, I don't see why/how the Tesla could short out the CPU VRM. Perhaps it has to do with PCI-E lines on modern CPUs going directly to the PCI-E slots?? But even if that's the case, there are signal coupling caps on those lines, so I think damage should have been prevented.

            To be on the safe side, perhaps try the Tesla card in an old (and maybe less valuable) board that doesn't have PCI-E bus lanes from the CPU going directly to the card. Something like socket 775 (Intel) or 939/AM2 (AMD) might be a good test bed. If the card works there and installs the drivers without issues, then I'd say the whole VRM burning thing must have been a coincidence or freak accident. But if not, then you may want to check the Tesla card for any obvious defects (missing/chipped SMDs, shorted ceramic caps on the PCI-E slot used for signal coupling, etc.).
            Last edited by momaka; 05-30-2021, 10:56 PM.



              #7
              Re: Tesla m40 burned my motherboard VRM

              I had all the PCIe slots occupied when it happened: a video card (a small one with no power input), a Wi-Fi card, and a sound card, plus the Tesla.

              I am wondering: is it possible that the card is drawing power only from the PCIe slot rather than from the PCIe power cables?

              Maybe using all those cards overloaded the VRM? Maybe the card only started to draw a lot of power once the driver turned it on?
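For a sense of scale on the slot-versus-cable question, here is a minimal Python sketch of the power budget. The slot limits (75 W total, at most 5.5 A on the +12 V pins) are the standard PCIe x16 graphics slot figures; the 250 W board power for the M40 is an assumption taken from its commonly quoted rating, not something measured in this thread. The function names are hypothetical. A healthy card stays within the slot limit, while a faulty one may not.

```python
# Standard PCIe x16 graphics slot limits.
SLOT_MAX_W = 75.0        # total power a card may take from the slot
SLOT_12V_MAX_A = 5.5     # current limit on the slot's +12 V pins

# Assumption: commonly quoted board power for a Tesla M40 (not measured here).
CARD_BOARD_POWER_W = 250.0


def slot_12v_max_w() -> float:
    """Maximum power available from the slot's +12 V pins alone."""
    return 12.0 * SLOT_12V_MAX_A


def min_aux_power_w(board_power_w: float) -> float:
    """Power that must come from the auxiliary cable at full load."""
    return max(0.0, board_power_w - SLOT_MAX_W)


if __name__ == "__main__":
    print(f"Slot +12 V budget: {slot_12v_max_w():.0f} W")
    print(f"Needed from aux cable at full load: "
          f"{min_aux_power_w(CARD_BOARD_POWER_W):.0f} W")
```

Under these assumptions the slot can cover less than a third of the card's rated power, so a card running at full load has to take most of it from the auxiliary cable.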

              The worst case, I think, is if the card turns on some circuit that is shorted when the driver starts. That is my biggest concern.

              I tend not to think the screws caused the issue. Yes, they had plastic washers, and they never caused anything before. They were also not super tight; I only tightened them enough for the thermal pad to touch the FETs. They were water-cooled. No leaks at all. It seems the CPU survived. It could not be properly tested, because the guy didn't have a compatible motherboard and 1155 boards are not easy to find these days. The CPU got hot when put in the motherboard he tested with, which is a good sign. The motherboard beeped. The RAM was OK. It looks to me like the VRM received some sort of overload. When I tested with a meter, only one row of the MOSFETs read shorted. The other row seems to be fine.

              I was thinking of maybe testing the card with one of those PCIe risers that have a chip in between.

              It would be great if I could use this GPU for my calculations, because these cards are hard to find and are now more expensive with this silicon crisis. I even considered getting my hands on one of those, but they are expensive.
              Attached Files
              Last edited by waldoalvarez00; 05-31-2021, 07:42 PM.



                #8
                Re: Tesla m40 burned my motherboard VRM

                I have an update. I decided to remove the heatsink and found something strange: one of the SMD inductors is cracked.
                Attached Files



                  #9
                  Re: Tesla m40 burned my motherboard VRM

                  I don't like the look of those long threaded bars.



                    #10
                    Re: Tesla m40 burned my motherboard VRM

                    I found the explanation from this guy.

                    https://www.youtube.com/watch?v=6BBEasRiM_M

                    It seems the memory VRM failed somehow; that is why the L14 inductor is cracked and the nearby shunt is also a little bulged. Since it takes power from the PCIe slot, that is what caused the motherboard VRM to burn. If 12 V didn't get to the VRAM, the card is repairable. Otherwise, according to a commenter on the video, 12 V going into the VRAM even partially damages the GPU.

                    Conclusion: it is not 100% safe to put this card in another motherboard.

                    It is just a bad design from NVIDIA. Not even a fuse there.



                      #11
                      Re: Tesla m40 burned my motherboard VRM

                      Originally posted by waldoalvarez00:
                      It seems the memory VRM failed somehow; that is why the L14 inductor is cracked and the nearby shunt is also a little bulged. Since it takes power from the PCIe slot, that is what caused the motherboard VRM to burn.
                      I agree 100% with the video, but this is NOT why your motherboard's CPU VRM burned. The two are not connected in any way and it's just NOT possible.
                      I've connected many video cards with burned MOSFETs and inductors to motherboards (and even watched as one puffed a ton of magic smoke due to a powerful single-rail PSU not shutting down when it should have), and in none of those cases was the motherboard damaged. The only exception is if the RAM VRM on the GPU takes power from the 12V rail of the PCI-E connector - then you can get melted pins on the PCI-E connector if the PSU doesn't shut down and there is a severe overload there.

                      But regardless, motherboard's CPU VRM -cannot- be damaged by video card.

                      Whatever happened in your case, I can't say with 100% certainty because I don't have the hardware in front of me. However, I am pretty certain of the above: the motherboard wasn't damaged by your GPU.

                      That aside, of course the cracked inductor on your GPU should probably be addressed. To me, that looks like a factory installation defect and not one from an overload. If it was overloaded, you would have seen some soot and discoloration like in the pictures in the video above. Your card doesn't have that and your R33 inductors look good. The fact that your card even booted to desktop means they are still working. Now, should you do a preemptive replacement of the R33 inductors so that your card doesn't fail? - That's up to you. But to me, it seems that your Tesla video card is working OK at the moment, and I again would strongly suggest you test it on another (older / less valuable) motherboard to confirm its operation. I think you will find that it works normally, and that your AsRock z77 burned its CPU VRM for some other (unknown) reason. (Though, like petehall347, I too don't like the looks of those long threaded rods and suspect foul play with the custom cooling you had on there.)

                      Originally posted by waldoalvarez00:
                      It is just a bad design from NVIDIA. Not even a fuse there.
                      It's not a bad design. Just cheap construction for cost savings during manufacturing. The design is good.
                      They just used smaller inductors to save on part costs, regardless that this is an expensive GPU, which is typical of today's cost-manufactured shit.

                      Also, a fuse may not help. It takes time for a fuse to blow - even for a fast-blow type. In some cases, that alone may not be enough to prevent damage to the GPU and its RAM. A good VRM will ramp up T_on time on its bottom-side MOSFET and ideally leave it turned ON indefinitely when it starts detecting over-voltage on the output. But it seems that not all VRMs know how to respond well (or quickly enough) to an over-voltage / high-side MOSFET failure.

                      I've had this happen on an Intel motherboard whose upper MOSFET shorted multiple times on me while trying to repair its VRM (issue turned out to be a partially-dud PWM driver.) I had to replace the same upper MOSFET no less than 4 times. And not once did it damage my CPU. Each time the upper MOSFET shorted, the lower one closed and made the over-load / short-circuit protection on the PSU kick in, thus completely shutting everything down. Of course, I was using a modest PSU with a low 12V current rating, so it shut down quickly. I -don't- like high-power single 12V rail PSUs that can do more than 30 Amps without any sort of limit. I honestly consider these a bit of a fire hazard, especially anything over 700-800 Watts, because even with 8-pin and 6-pin PCI-E power connectors, that can still push upwards of 60 Amps through those 6-8x 18-16 AWG wires... or about 8-10 Amps per wire - and that's without the PSU even thinking it's overloaded.
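The per-wire figure is plain division, and a minimal Python sketch makes it easy to rerun with other numbers. The 60 A total and the 6 to 8 wire count are taken from the post above; the function name is hypothetical, and the sketch assumes the current splits evenly across the wires, which real connectors only approximate.

```python
def amps_per_wire(total_amps: float, wire_count: int) -> float:
    """Current per wire, assuming an even split across all 12 V wires."""
    if wire_count <= 0:
        raise ValueError("wire_count must be positive")
    return total_amps / wire_count


if __name__ == "__main__":
    total_amps = 60.0  # figure quoted in the post
    for wires in (6, 8):
        per_wire = amps_per_wire(total_amps, wires)
        print(f"{wires} wires: {per_wire:.1f} A per wire "
              f"({total_amps * 12.0:.0f} W total at 12 V)")
```

With 6 wires this gives 10 A per wire and with 8 wires 7.5 A, which matches the rough 8-10 A range quoted.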

                      So if you want to stay on the safe side when testing hardware, use a PSU that has several separate 12V rails with 18-22 Amps limit. This is by far the quickest way to prevent damage to video card, motherboard, and even PSU wires.
                      Last edited by momaka; 06-01-2021, 07:47 PM.



                        #12
                        Re: Tesla m40 burned my motherboard VRM

                        OK, you are probably right about the motherboard VRM, since the card runs from 12 V and that shouldn't be going through the motherboard VRM.

                        I did, however, measure the 12 V PCIe pins (the first three on side B, per the pinout below) to ground, and they read as shorted.

                        https://pinouts.ru/Slots/pci_express_pinout.shtml

                        I don't know much about video cards, but that doesn't look right to me.
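To keep track of which card-edge pins belong to which supply while probing, a small lookup table can help. This is a minimal Python sketch listing only the power and nearby ground pins of the standard PCIe card edge; the table and helper names are hypothetical, and the pin list should be checked against the pinout linked above before relying on it.

```python
# Power-related pins at the start of a PCIe card edge.
# "B" and "A" are the two sides of the connector.
PCIE_POWER_PINS = {
    "B1": "+12V", "B2": "+12V", "B3": "+12V",
    "A2": "+12V", "A3": "+12V",
    "B8": "+3.3V", "A9": "+3.3V", "A10": "+3.3V",
    "B10": "+3.3Vaux",
    "A4": "GND", "B4": "GND", "B7": "GND",
}


def pins_on_rail(rail: str) -> list[str]:
    """Return the pins carrying the given rail, ordered by side then number."""
    pins = [pin for pin, name in PCIE_POWER_PINS.items() if name == rail]
    return sorted(pins, key=lambda pin: (pin[0], int(pin[1:])))


if __name__ == "__main__":
    for rail in ("+12V", "+3.3V", "GND"):
        print(rail, "->", ", ".join(pins_on_rail(rail)))
```

The three side-B pins measured in the post (B1 to B3) are all on the +12 V rail, along with A2 and A3 on the other side.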

                        I am thinking of testing the 3.3 V lines too. Just guessing.

                        The FETs on the VRAM VRM don't read as shorted. They look OK and I saw no shorts there.

                        I did some measurements on the small SMD caps before the vCore VRM and some read shorted (though I guess that doesn't necessarily mean they are bad). I was thinking of desoldering them with the station and testing them out of circuit.

                        Yes, I will probably replace those R33 inductors preemptively if I manage to make this card work. It shouldn't be that difficult with the station, since they are relatively big.

                        Do you know what kind of measurements I could do there? Is there maybe a tutorial here on the forum? I saw this YouTube video that looks interesting:

                        https://www.youtube.com/watch?v=vHGn2pZ0Cv8
                        Last edited by waldoalvarez00; 06-01-2021, 08:41 PM. Reason: no 5v in pciex



                          #13
                          Re: Tesla m40 burned my motherboard VRM

                          I found this YouTube video with a guy diagnosing something similar: a short on those PCIe lines (with a different shunt). He says it looks repairable. I hope my GPU didn't fry.

                          From what I understood there, one of the FETs is fed from the PCIe slot and got shorted. Going by the test he did on those small caps, my second DrMOS from top to bottom is shorted.

                          https://www.youtube.com/watch?v=MunL4-DqFm8

                          So I guess the safe move here would be to remove that FET and see if the short is gone. If it is, fit a new one and replace the cracked choke and the bulged shunt, check that things look right, then power it up and pray.

                          Question: is there a way I can tell whether the GPU is a rock now? Its resistance reads 2.9-3 ohms, but I don't think my meter is that good: with the probes touching each other it reads 0.6 ohms. The RAM reads about 50 ohms, so it looks fine.
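Since the meter reads 0.6 ohms with the probes shorted together, that offset can be subtracted from the low readings. A minimal Python sketch, using only the numbers from the post (the function name is hypothetical):

```python
def corrected_ohms(reading_ohms: float, lead_ohms: float) -> float:
    """Subtract the probes-shorted reading from a resistance measurement."""
    return round(max(0.0, reading_ohms - lead_ohms), 2)


if __name__ == "__main__":
    lead = 0.6  # meter reading with the probes touching each other
    for label, reading in (("GPU (low)", 2.9), ("GPU (high)", 3.0), ("RAM", 50.0)):
        print(f"{label}: {reading} ohm measured, "
              f"about {corrected_ohms(reading, lead)} ohm after correction")
```

This only removes the meter's own offset, putting the GPU reading at roughly 2.3-2.4 ohms. It does not by itself say whether the GPU is healthy.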

