4 Operator
•
2.4K Posts
0
2187
August 11th, 2023 19:45
XPS 8940, Nvidia RTX 2060 6GB, lock up starting again!
XPS 8940
I've been running Nvidia driver 536.67 for a short while, a week or two.
On Wed., the MS Patch Tuesday update was installed.
Had a classic lock-up: I had a window open, clicked on a desktop icon to open it, and the system locked up.
Today, I had Revo Uninstaller open, clicked on a selection in the left tab, and it locked up again.
I see there IS a new Nvidia driver though, 536.99, which came out on 8/8. I discovered it while checking which version I was running.
So I opened the Event Viewer and for Tues. timestamped after the MS Update, I saw this in the Event Viewer:
H/W error, and it happened during the time I went to open the desktop Icon. On other lock-ups I've seen this error too so I dismissed it.
Today though, no error other than the normal Windows Error 41 for using the power button on a lock up:
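For anyone who would rather pull those entries from a script than click through Event Viewer, here is a minimal sketch. It assumes the "Error 41" above is the usual Kernel-Power Event ID 41 in the System log, and it only wraps the built-in `wevtutil` tool. The function names are made up for illustration.

```python
import subprocess

# XPath filter for Event ID 41 from the Kernel-Power provider in the System log.
# Assumption: this is the "Error 41" the post refers to.
QUERY = "*[System[Provider[@Name='Microsoft-Windows-Kernel-Power'] and (EventID=41)]]"


def kernel_power_41_command(count=5):
    """Build a wevtutil command that prints the newest `count` matching events."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [
        "wevtutil", "qe", "System",
        f"/q:{QUERY}",
        f"/c:{count}",
        "/rd:true",  # read newest events first
        "/f:text",   # plain-text output instead of XML
    ]


def recent_unexpected_shutdowns(count=5):
    """Run the query and return its text output ('' if wevtutil is unavailable)."""
    try:
        result = subprocess.run(
            kernel_power_41_command(count),
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    print(recent_unexpected_shutdowns())
```

Comparing the timestamps it prints against the times of the lock-ups shows whether each forced power-off left a matching entry.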
So I used F12 on boot this time to run the Diagnostics (I've run it before with no error):
I ran the exhaustive Memory test that does more h/w testing too, Return code 0000, no problems found.
I also had the odd MS Money problem: when opened, it should open to HOME, and it does, but it flashes and shows no data. When I manually select HOME it works fine. That started MANY Nvidia drivers ago, maybe 5 or 6; 2 drivers ago it was working fine again, but not on the last 2. I figure that is an Nvidia buffer problem?
Sort of makes me think that the RTX 2060 is the root problem again?
I updated the Nvidia driver to the latest, but I still had the MS Money oddity.
I have had WEEKS of no lock-ups prior to this. Usually, if I did get one, it was during the install of an Nvidia driver, and not the classic kind: the driver install window never closed and stayed on top of the other windows on the desktop.
Same thing with Thunderbird recently too. I could not close it after it reported a filter problem (writing to disk; I think it said the disk was full, but it is not). After I attempted to close it, it was unresponsive and never closed.
I was thinking Memory (RAM or RTX2060 card) but all the tests I've run fail to show that to be the case.
So now my thoughts are the Nvidia driver or an MS Update?




DELL-Nat M
Community Manager
•
3.4K Posts
0
October 18th, 2023 16:59
(Marking as solution for visibility)
Hi all,
We would like to share the most recent workaround provided by the team:
If the issue persists, please contact Dell Support by using the "Get Help Now" button located at the bottom left of your screen.
ispalten
August 12th, 2023 00:12
Well, after messing around and doing some research, I discovered a LiveKernel DMP file @
C:\Windows\LiveKernelReports\WATCHDOG\WATCHDOG-20230810-1639.dmp
and in it, "VIDEO_ENGINE_TIMEOUT_DETECTED (141)".
I suspect an Nvidia driver error actually caused the lock-up.
You might want to look in that path, C:\Windows\LiveKernelReports\WATCHDOG\,
and see if you have a DMP file there, if you have ever had a lock-up.
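A minimal sketch for checking that folder from a script rather than by hand. It assumes every dump is named like the one above (`WATCHDOG-YYYYMMDD-HHMM.dmp`), a pattern inferred from that single example, and the helper names are hypothetical. It lists the dumps oldest first so they can be lined up against the dates of the lock-ups.

```python
from datetime import datetime
from pathlib import Path
import re

# Folder taken from the post above; pass a different one to try it elsewhere.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")

# Assumed naming pattern: WATCHDOG-<YYYYMMDD>-<HHMM>.dmp (inferred from one example).
_NAME_RE = re.compile(r"^WATCHDOG-(\d{8})-(\d{4})\.dmp$", re.IGNORECASE)


def parse_dump_time(filename):
    """Return the datetime encoded in a watchdog dump name, or None."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M")
    except ValueError:
        # Digits were present but did not form a valid date/time.
        return None


def list_watchdog_dumps(folder=WATCHDOG_DIR):
    """Return (timestamp, path) pairs for every parseable dump, oldest first."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    found = []
    for path in folder.glob("*.dmp"):
        stamp = parse_dump_time(path.name)
        if stamp is not None:
            found.append((stamp, path))
    return sorted(found)


if __name__ == "__main__":
    for stamp, path in list_watchdog_dumps():
        print(stamp.strftime("%Y-%m-%d %H:%M"), path)
```

An empty result only means no file matched the assumed pattern; it does not rule out dumps in the other LiveKernelReports subfolders.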
wcypierre
19 Posts
0
August 13th, 2023 10:59
@ispalten oh my the dreaded issue again...... have you tried rolling back to an older nvidia version?
ispalten
August 13th, 2023 20:30
@wcypierre
Yes, I did, and a newer one too...
On the newest one now,
JamieLinux
2 Intern
•
278 Posts
0
August 15th, 2023 20:09
@ispalten
The VIDEO_ENGINE_TIMEOUT_DETECTED live dump has a value of 0x00000141. This indicates that one of the display engines failed to respond in a timely fashion.
(This code can never be used for a real bug check; it is used to identify live dumps.)
Anyway, you could use the !analyze command to read the dump file. Since it is a live dump and not a normal BSOD, I do not think a BSOD viewer will work with it.
windows-driver-docs/windows-driver-docs-pr/debugger/using-the--analyze-extension.md at staging · MicrosoftDocs/windows-driver-docs · GitHub
You can go there, download the tool, and use it to analyze the dump, which should tell you which driver caused the issue. If it comes up as a memory error, it could very well be that one of the memory chips on the GPU is going faulty or has a bad or broken solder joint.
Since there are so many different things that can actually cause this crash, test everything. To test RAM I would use MemTest86 over the Windows built-in or Dell built-in memory diagnostics.
ispalten
August 15th, 2023 22:51
@JamieLinux
Yes, I tested RAM, clean (MEMTEST86), ran some card tests, and F12 tests... all pass.
Understand the error, did use !analyze to get the error.
Yes, not a BSOD, never is with the hang.
Taking into account the MS Money problem and what happened, I'd suspect the driver more than anything else. I used DDU in Safe Mode to remove the driver completely and reinstalled it (and everything else, GeForce, CC) to no avail. A few drivers ago MS Money was working; it had not been for a few drivers before that one, and it is not working correctly now.
Since that is/was the case, I sort of mentally ruled out a h/w issue. I could be wrong.
Could really be a timing issue, and those are hard to solve.
I have tried a few dump readers (Sysinternals, NirSoft), and they don't show any info that I think helps.
JamieLinux
August 16th, 2023 04:55
Any way for you to test the GPU in another system? I only ask because software crashes typically won't report hardware failure faults like the one shown in your first screenshot. Also, what happens if you remove the Nvidia GPU from the system? If you are not OK with doing that, that is fine.
ispalten
August 16th, 2023 14:48
@JamieLinux
Well, I do have an XPS8700 and XPS8500 I could try moving it to, but I suspect there could be some cabling problems...
As for the card itself being bad, I'm thinking possible, but there are four reasons I don't suspect it:
I should also mention that after just moving the HDMI cable to the Intel HD750 and DISABLING the Nvidia card in Device Manager, I (and others) have NOT had a lock-up.
Again, this is random. At times, with different versions of the BIOS and Nvidia drivers, I have even gone almost a month without a lock-up (as have others), only to have it come back, and the basic changes in between were MS W11 updates and the Nvidia driver. Also very interesting is the latest Nvidia driver's release date, 8/8/2023, one day before a major Win11 KB install from MS...
No matter what I do, MS Update continues to push Nvidia installs down on me too. I didn't even KNOW it was happening... it might contribute to some problems...
I had stopped that before, but I guess the last major release reset that:
I guess I have to find that setting again; this is not something I want to have happen. I didn't realize it until I looked just now.
JamieLinux
August 17th, 2023 02:42
@ispalten Honestly, if you could just downgrade the BIOS and lock it from upgrading, I would do that; there isn't much the new BIOS versions have added to really warrant running the updates. The other thing you could do, if this is a critical system, would be to check out eBay or Amazon Warehouse. You could find an 11th-generation board, case, and PSU that was returned (they can't sell returns as new if the boxes were opened) for dirt cheap, transplant everything to the new case, and then eBay your old case and PSU to recoup the money.
Ultimately, if Dell knows there is an issue with this, it should be fixed with a firmware update, or anyone with an affected board should be given a free upgrade from Dell to a motherboard that doesn't have the problem. Some people have the issue and I'm sure many people, like a few million, don't, so I can almost bet there are different board revisions.
If you feel the GPU is good, that's fine. I would still test it in another system and/or take it to a shop that has a way to hook the GPU to a tester and test it that way.
ispalten
August 17th, 2023 14:05
@JamieLinux
Ahh, I guess you've not read the many threads here on the issue.
A couple of problems, not the least of which is that there was no problem running BIOS V2.3.0; it all started with V2.4.0. The ONLY change made to the PC was updating the BIOS. So, where would you think the problem lies, in the base XPS h/w or the new BIOS code? Ahh, but what IS BIOS code? First of all, it is not made by Dell but by a 3rd party. However, there are components within the BIOS package that are made by neither that 3rd party nor Dell, the Intel Management Engine Interface for one, and there are others...
Also, back-leveling the BIOS doesn't do a 100% back-level. For instance, when doing so, some components, such as the Intel Management Engine Interface, see a newer version already installed and don't install the old one. Programs might experience problems due to this.
I should add, after BIOS V2.4.0 was installed and one hit the problem lock-up, back leveling to BIOS V2.3.0 fixed the problem. However, once BIOS V2.5.0 was installed, you could no longer solve the problem by going back to BIOS V2.3.0. Not clear why not?
Lastly, not 100% of XPS8940s with an Nvidia card have the problem. As a matter of fact, one member of this forum bought 2 basically identical ones and only ONE had the problem.
Add to that that Dell used a few motherboards on the XPS8940, and every one of those had users reporting the problem.
Dell has never been able to reproduce the problem.
Dell has replaced my motherboard once and back-leveled it once, and also replaced the Nvidia card; none of that resolved the problem and it continued. They offered to exchange my XPS for a 'reconditioned one' of equal or better configuration. One caveat: I can NOT alter anything on the XPS. It must be in the state (and with the s/w) I am using it in. No go, I had too many single-license programs installed... and I could not uninstall them nor delete the license (if possible). Others were offered the same deal and, to the best of my knowledge, no one accepted it.
I appreciate your suggestions and 'trouble-shooting', but most has either been done or is not possible.
My PC is not critical by any means... I am retired... however, Windows doesn't like being turned off in any way other than using SHUTDOWN, so it can write back any buffers that need it and close open files safely. Using the Power Button (as required by the lock-up) can cause file corruption. I did have it happen, mostly to programs and Windows files. Programs, at least 3 that I recall, lost their settings and options and when opened were at the default ones... I now back those settings up. However, it is even possible that the PC will not be bootable if the wrong Windows file is corrupted.
Yes, I agree, Dell (or its 3rd party vendors) should (or should have) fixed it. They did not.
The hardest 'bugs' to fix are random ones you don't know how to recreate. As is this one. Also, timing issues, that depend on many factors and operations. In both cases, normally .DMP files are created, but none are created during this problem. Even if one were to put the system under a Debug trace, it might not happen as that changes timing.
The only 'saving grace' seems to be as the BIOS versions increase, the rate of failure due to lock ups decreased... at least right now, since I posted the original post here in this thread I've not had another one.
bdtnr
1 Rookie
•
43 Posts
0
August 17th, 2023 15:59
Count me in too as having the same issue. It just started back up again for me as well; it seems like a new Nvidia driver update started the problems. I'm currently on 536.99 with the latest BIOS. The other oddity for me is that the GeForce Experience application throws an error when I try to open it, so I attribute all of this to the new driver.
ispalten
August 17th, 2023 20:57
@bdtnr
Thanks for the verification. I'm not having GeForce problems though, it still seems to work fine?
See my post a few up. I did have Driver Update OFF, and somehow it was reset, assume an MS Update. You'll see I had a failed Update of the Nvidia Driver... and I wonder if that could be contributing to the problem?
MS Money started having a problem for me, and at that point I decided to use DDU in Safe Mode and clean out all instances of any Nvidia driver and utilities completely. Rebooted to normal mode and before I could install the latest driver, Windows did it. So I am not sure I'm 'ok' totally. However for one Nvidia driver after that, MS Money worked fine, but the next driver broke it again.
I am on the latest too, as well as GeForce:
chipschap
1 Rookie
•
26 Posts
1
August 17th, 2023 22:02
@ispalten et alia
Just one thing to add, I'm the Linux user to whom @ispalten refers. With Turbo turned off I had zero lockups for ten months. (As noted I did experiment with turning Turbo on and very predictably got multiple lockups within a couple of days.)
Then on August 6 I got a classical lockup, right out of the blue. I had been on BIOS 2.13.1 since June 25. Nvidia driver updates on Linux are infrequent and generally a full revision behind Windows. There was no driver update. Of course there are very frequent kernel updates. I am at a complete loss for why I got this seemingly random lockup. It is now August 17 and there have been no further lockups. A new Nvidia driver dropped this week; let's see how it goes.
Note that the classical lockups on my system have never been when the video card is under heavy load. They were seemingly limited to very light load, or no load, or once in a while alt-tabbing between full screen windows. The most recent was with the system sitting idle for a couple of hours and my finding it in lockup when I returned.
The classical lockup always is preceded by a journal entry stating "the GPU has fallen off the bus." After this last lockup I tried resetting the Nvidia card to maximum performance to hopefully keep it out of an idle state (where possibly it may seem to "fall off the bus" but I don't know that for sure).
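A minimal sketch of how one might check for that journal entry after a forced reboot. It assumes a systemd machine with a persistent journal (so the previous boot's kernel log is still available) and simply searches for the phrase quoted above. The function names are made up for illustration.

```python
import re
import subprocess

# Phrase quoted in the post above; matched case-insensitively.
_PATTERN = re.compile(r"fallen off the bus", re.IGNORECASE)


def find_bus_drops(journal_text):
    """Return the journal lines that mention the GPU falling off the bus."""
    return [line for line in journal_text.splitlines() if _PATTERN.search(line)]


def read_kernel_journal(boot=-1):
    """Return kernel messages for one boot (-1 is the previous boot).

    Returns '' if journalctl is not available on this machine.
    """
    try:
        result = subprocess.run(
            ["journalctl", "--dmesg", f"--boot={boot}", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in find_bus_drops(read_kernel_journal()):
        print(line)
```

Running it right after rebooting from a lockup looks at the boot that hung; no output could mean either no such entry or no persistent journal.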
Also I should point out there is no equivalent of GeForce Experience on Linux.
bdtnr
August 25th, 2023 14:25
@ispalten - I followed your advice and ran DDU in Safe Mode to uninstall all the drivers/software for NVidia. I restarted windows and it just installed a generic MS driver for the card. I then installed the latest NVidia Game Ready Driver (it was 537.13). I immediately started to have problems and had the lockups. Probably 4-5 in a 2 day period. I didn't have time to investigate and try to resolve, so I have reverted back to using the Intel GPU and disabling the NVidia GPU and as usual, have had NO problems since! My computer did tell me this morning that there is a newer NVidia driver available. They've released at least 3 this month, which may be a sign of issues on their end.
Have you been able to resolve or get back to a more stable system? Have you updated NVidia drivers and did they help?
ispalten
August 25th, 2023 18:55
@bdtnr
It appeared that I only had problems with the driver I listed originally... but not the case. The next one was more stable, less lock ups, but then I got a really odd one I never saw before. Basically the XPS either was going to sleep or something woke it up. I was not there when it happened, but when I came back, I saw a Black Screen with 2 mouse pointers about 1/2 apart, like something was moving the mouse fast across the screen? Don't know if it was going to sleep or something was coming 'alive' making it open (don't think this is even possible?).
3 days later, another new driver... and a week later, another.
I suspect Nvidia found a problem, and the last one was probably due to a change needed to match MS's Patch Tuesday, as it came out on Wed.?
Running 537.13 released on 8/22 (Tues) and I installed it Wed., and so far, no lock ups... but maybe too short to tell if it is OK or not?
I did get 'fooled' using DDU the last time. I did it in Safe Mode, and then booted normally. Windows used its generic driver, which was fine but in lower resolution (I'm normally running 3840x2160 on a 32" Dell monitor). Before I had a chance to install the Nvidia driver, Windows installed an older one from MS Update, which was 2 versions back as I recall... I had to run DDU again, but this time I was ready to install the latest Nvidia driver immediately once I finished booting on the MS driver. PITA....