Post subject: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 14 Oct 2022, 08:24
Interested
Registered: 08 Mar 2016, 18:22 Cash on hand: 247.86 Posts: 143 Location: Irkutsk
Greetings, colleagues! I picked up one of these cards for myself. The seller had several identical ones. At purchase I ran the "donut" (FurMark) and 3DMark: all OK. I ran MATS\MODS: a perfect PASS! At home I ran the donut again and the Superposition benchmark, also OK. Then I launched Warface, and when a smoke grenade is thrown (and only then) I get a freeze with a black screen. In the Heaven benchmark I get the same freeze, guaranteed, at the same spots within the first 10-15 seconds of the test. Outside these situations I see neither artifacts nor freezes. I exchanged the card at the seller's for an identical one, and we ran Heaven on both cards on his PC (something on socket 775). No problems. I took the other card. At home everything repeated with the second card, absolutely identically. I tried it on different PCs (sockets 1155 and 1151v2). I reinstalled the OS and the drivers. I flashed a VBIOS from what I found on techpowerup.com. In short, I'm stumped. Can anyone suggest how to localize the problem?
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 14 Oct 2022, 12:49
Advanced forum member
Registered: 03 Nov 2017, 01:24 Cash on hand: 5,011.04 Posts: 3985 Location: Budapest
superposition openGL/CL
SectoR
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 14 Oct 2022, 14:07
Just passing through
Registered: 27 Sep 2019, 23:40 Cash on hand: 6.01 Posts: 15 Location: Chelyabinsk
Run MATS with the memory frequency at 110%.
demonis
[OP]
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 17 Oct 2022, 11:18
Interested
Registered: 08 Mar 2016, 18:22 Cash on hand: 247.86 Posts: 143 Location: Irkutsk
SectoR wrote:
Run MATS with the memory frequency at 110%.
Maybe I'm being a bit slow; I've only recently started using MATS. But if I understand correctly, the memory frequency (dram_clk) can only be set in MODS; MATS does not allow setting the memory frequency. If I'm wrong, please correct me. Anyway, the nonsense continues. After your message I ran the second card through MODS\MATS; the first time the result was MODS: PASS, MATS: FAIL, but without serious errors across the banks.
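For reference, here is what "110%" works out to for this card, as a minimal sketch. It takes the 7001.0 MHz DRAM clock that the MODS log below reports and lists intermediate steps as well, on the assumption that stepping up gradually is more informative than jumping straight to 110%. The function name and the 2% step are illustrative choices; how the resulting value is actually passed to MODS depends on the build and is deliberately not shown.

```python
def clock_sweep(base_mhz, start_pct=100, stop_pct=110, step_pct=2):
    """Return (percent, MHz) pairs from start_pct to stop_pct inclusive.

    base_mhz is the stock clock; each step scales it by the given percentage.
    """
    return [
        (pct, round(base_mhz * pct / 100.0, 1))
        for pct in range(start_pct, stop_pct + 1, step_pct)
    ]


if __name__ == "__main__":
    # 7001.0 MHz is the "DRAM Clock" target value printed in the MODS log.
    for pct, mhz in clock_sweep(7001.0):
        print(f"{pct:3d}% -> {mhz:.1f} MHz")
```

The last step, 110% of 7001.0 MHz, is 7701.1 MHz.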
MODS - pass
MODS start: Sat Oct 15 21:49:09 2022
Command Line : gputest.js -short -test 275 -no_gold -adc_cal_check_ignore -matsinfo
CPU
Foundry : GenuineIntel
Name : Intel(R) Core(TM) i3-9100F CPU @ 3.60GHz
Family : 6
Model : 14
Stepping : 10
Version
MODS : 400.281
OperatingSystem: Linux (x86_64)
Kernel : 4.17.4-gentoo
KernelDriver : 3.87
SBIOS Version : 1.E1
SBIOS Date : 11/17/2021
HostName : tinylinux
GPU 0 [01:00.0] dev.sub 0.0
----------------------------------------
DevInst : 0
PCI Location : 0x00, 0x01, 0x00, 0x00
GPU DID : 0x1f08
PDI : 0x1d5b75aa2815da13
Raw ECID : 0x013ff6400000000e00061d91
Raw ECID (GHS) : 0x1640e00061c00000019ff8240
ECID : P3W067-25_x-1_y09
Device Id : TU106
Revision : a1
NV Base : 0xa3000000
FB Base : 0x90000000
IRQ : 11
WARNING: GPU 0 [01:00.0] PCIE speed capability (8000Gbps) higher than down stream port link speed (2500Gbps)
Warning: MODS console is unable to alter terminal parameters
Foundry : TSMC
Subsystem VID : 0x1462
Subsystem DID : 0x3755
Board ID : 0x010c
Chip SKU : 200-A
Project : G161-0042
Fuse File Fmt : JSON
Display : 0x00001000 (id)
SBIOS Init : Primary
Native Mode : 1920x1080
Memory Size : 6144 MB
FB Vendor : Hynix
RAM Protocol : GDDR6
RAM Config : 2
ROM Version : 90.06.69.00.32
ROM Type : Partner Production
ROM OEM Vendor : NVIDIA
ROM Partner : msi
ROM Project ID : 113696msi
ROM Timestamp : 2021-4-1 08:13:27
ROM Expiration : 2021-9-28 08:13:27
ROM GUID : 0166B0D9864642E48501B3F9B3AB7113
PPC Pri/Sec FW : Unknown
PPC Driver Ver : Unknown
PState (mode) : 8 5 3 2 [0]
PState Version : 3.5
EDC : Disabled
GPC Clock : 1950.0/1946.9 MHz NAFLL
DRAM Clock : 7001.0/6981.3 MHz DEFAULT
Host Clock : 1380.0/1380.0 MHz NAFLL
XBar Clock : 1860.0/1851.1 MHz NAFLL
Sys Clock : 1950.0/1936.9 MHz NAFLL
Power Clock : 540.0/541.3 MHz DEFAULT
NVDec Clock : 1800.0/1799.8 MHz NAFLL
Display Clock : 1330.0/1330.3 MHz DEFAULT
NVVDD : 1056 mV
GPC Mask : 0x07 (3 GPCs)
TPC Mask : [3e 3e 3e] (15 TPCs)
FB Mask : 0x0e (3 FB Partitions)
ROP/L2 Mask : [3 3 3 x] (6 ROP/L2s)
PES Mask : [7 7 7] (9 PESes)
FBIO Mask : 0x0e (3 FBIO Partitions)
FBIO Shift Mask: 0x00
XP Mask : 0x03 (2 3gio Pads)
Nvdec Mask : 0x01 (1 engine)
Nvenc Mask : 0x01 (1 engine)
Gpu Temp : 43 deg C
PEX Rx Lanes : 0xffff
PEX Tx Lanes : 0xffff
PEX Det. Lanes : 0xffff
PEX Width, ASLM: 16 lanes, Not Supported
PEX Link Speed : 8.0 Gbit/s
PEX BandWidth : 128.0 Gbit/s
ASPM, ASPM-CYA : (L0s/L1, Disabled)
ASPM L1SS, CYA : (Disabled, L1.1/L1.2)
LTR : Enabled
Chipset VID : 8086 (Intel)
Chipset DID : A305 (IntelZ390)
Chipset ASPM : L0s/L1
Chipset LTR : Enabled
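A note on the PCIe figures in this header. The "PEX BandWidth" value is simply lanes times per-lane rate, and the "8000Gbps"/"2500Gbps" in the warning presumably mean 8.0 and 2.5 GT/s per lane (Gen3 versus Gen1), with the unit label being a logging quirk. The sketch below reproduces the arithmetic. The table and function names are illustrative, and the encoding overheads (8b/10b for Gen1/Gen2, 128b/130b for Gen3) come from the PCIe specification rather than from this log.

```python
# Per-lane signalling rate in GT/s for each PCIe generation.
RATE_GT_S = {1: 2.5, 2: 5.0, 3: 8.0}
# Line encoding as (payload bits, total bits on the wire).
ENCODING = {1: (8, 10), 2: (8, 10), 3: (128, 130)}


def pcie_bandwidth_gbit(gen, lanes):
    """Return (raw, effective) one-direction bandwidth in Gbit/s."""
    raw = RATE_GT_S[gen] * lanes
    payload_bits, total_bits = ENCODING[gen]
    return raw, raw * payload_bits / total_bits


if __name__ == "__main__":
    for gen in (1, 3):
        raw, eff = pcie_bandwidth_gbit(gen, lanes=16)
        print(f"Gen{gen} x16: raw {raw:.1f} Gbit/s, effective {eff:.2f} Gbit/s")
```

For Gen3 x16 the raw figure is 128.0 Gbit/s, matching the log. The 2500 figure in the warning is consistent with the link idling at Gen1 when it was sampled, since the same log later shows Pcie Speed=2500 in PState 8, so the warning by itself does not necessarily indicate a fault.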
Running test(s) on GPU 0 [01:00.0] (DID: 0x1f08) Enter SetPState (test 0) Switched to PState 0 (0.max). Pcie Speed=8000, x16 ClkM = 7000.98 MHz ClkHost = 1380.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1950.00 MHz ClkXbar = 1860.00 MHz ClkSys = 1950.00 MHz ClkHub = 810.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1800.00 MHz ClkPexGen = 3.00 NVVDD = 1050 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:49:16 from 43 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 71 s Preheat done GPU=0 Tj=69 Tach=998 P=171 V=1.043 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:50:29 at 65 C for 30 sec. BaseClock done GPU=0 Tj=62 Tach=1887.7333333333333 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:51:01 at 59 C for 30 sec. BoostClock done GPU=0 Tj=58 Tach=3319.866666666667 P=113 V=0.906 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 0 (0.intersect.nvvdd). Pcie Speed=8000, x16 ClkM = 7000.98 MHz ClkHost = 1155.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1305.00 MHz ClkXbar = 1245.00 MHz ClkSys = 1305.00 MHz ClkHub = 810.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1200.00 MHz ClkPexGen = 3.00 NVVDD = 706 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. 
Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:51:31 from 55 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 32 s Preheat done GPU=0 Tj=69 Tach=994 P=170 V=1.043 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:52:06 at 65 C for 30 sec. BaseClock done GPU=0 Tj=63 Tach=1888.5333333333333 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:52:37 at 60 C for 30 sec. BoostClock done GPU=0 Tj=59 Tach=3294.6666666666665 P=113 V=0.887 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 2 (2.max). Pcie Speed=8000, x16 ClkM = 6801.00 MHz ClkHost = 1380.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1920.00 MHz ClkXbar = 1830.00 MHz ClkSys = 1920.00 MHz ClkHub = 810.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1785.00 MHz ClkPexGen = 3.00 NVVDD = 1050 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:53:08 from 56 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 28 s Preheat done GPU=0 Tj=69 Tach=999 P=172 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:53:38 at 65 C for 30 sec. 
BaseClock done GPU=0 Tj=63 Tach=1888.3333333333333 P=106 V=0.731 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:54:09 at 60 C for 30 sec. BoostClock done GPU=0 Tj=60 Tach=3276.6 P=114 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 2 (2.intersect.nvvdd). Pcie Speed=8000, x16 ClkM = 6801.00 MHz ClkHost = 1110.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 1260.00 MHz ClkXbar = 1200.00 MHz ClkSys = 1260.00 MHz ClkHub = 648.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1170.00 MHz ClkPexGen = 3.00 NVVDD = 700 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:54:40 from 57 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 26 s Preheat done GPU=0 Tj=69 Tach=1004 P=172 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:55:07 at 65 C for 30 sec. BaseClock done GPU=0 Tj=63 Tach=1887.7333333333333 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:55:38 at 61 C for 30 sec. BoostClock done GPU=0 Tj=60 Tach=3260.733333333333 P=114 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 3 (3.max). 
Pcie Speed=8000, x16 ClkM = 5000.98 MHz ClkHost = 1380.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1905.00 MHz ClkXbar = 1815.00 MHz ClkSys = 1905.00 MHz ClkHub = 810.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1770.00 MHz ClkPexGen = 3.00 NVVDD = 1043 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:56:09 from 57 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 24 s Preheat done GPU=0 Tj=69 Tach=1006 P=173 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:56:36 at 66 C for 30 sec. BaseClock done GPU=0 Tj=64 Tach=1888.8 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:57:08 at 61 C for 30 sec. BoostClock done GPU=0 Tj=61 Tach=3244.9333333333334 P=115 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 3 (3.intersect.nvvdd). Pcie Speed=8000, x16 ClkM = 5000.98 MHz ClkHost = 1110.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 1260.00 MHz ClkXbar = 1200.00 MHz ClkSys = 1260.00 MHz ClkHub = 648.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1170.00 MHz ClkPexGen = 3.00 NVVDD = 700 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. 
Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:57:38 from 57 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 22 s Preheat done GPU=0 Tj=69 Tach=996 P=170 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:58:03 at 65 C for 30 sec. BaseClock done GPU=0 Tj=64 Tach=1888.8666666666666 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:58:34 at 61 C for 30 sec. BoostClock done GPU=0 Tj=61 Tach=3246.8 P=115 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 5 (5.max). Pcie Speed=5000, x16 ClkM = 810.00 MHz ClkHost = 1350.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1905.00 MHz ClkXbar = 1815.00 MHz ClkSys = 1905.00 MHz ClkHub = 405.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1770.00 MHz ClkPexGen = 2.00 NVVDD = 1043 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 21:59:05 from 58 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 20 s Preheat done GPU=0 Tj=69 Tach=996 P=172 V=1.037 G=1710.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 21:59:28 at 65 C for 30 sec. BaseClock done GPU=0 Tj=64 Tach=1888.2 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 21:59:59 at 61 C for 30 sec. 
BoostClock done GPU=0 Tj=61 Tach=3244.733333333333 P=115 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 5 (5.intersect.nvvdd). Pcie Speed=5000, x16 ClkM = 810.00 MHz ClkHost = 1110.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 1260.00 MHz ClkXbar = 1200.00 MHz ClkSys = 1260.00 MHz ClkHub = 405.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1170.00 MHz ClkPexGen = 2.00 NVVDD = 700 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 22:00:29 from 58 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 21 s Preheat done GPU=0 Tj=69 Tach=1005 P=172 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 22:00:52 at 65 C for 30 sec. BaseClock done GPU=0 Tj=64 Tach=1888.2 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 22:01:23 at 61 C for 30 sec. BoostClock done GPU=0 Tj=61 Tach=3240.4666666666667 P=115 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 8 (8.max). Pcie Speed=2500, x16 ClkM = 405.00 MHz ClkHost = 555.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 645.00 MHz ClkXbar = 600.00 MHz ClkSys = 630.00 MHz ClkHub = 202.50 MHz ClkPwr = 540.00 MHz ClkNvd = 585.00 MHz ClkPexGen = 1.00 NVVDD = 700 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. 
Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 22:01:54 from 58 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 20 s Preheat done GPU=0 Tj=69 Tach=994 P=171 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 22:02:17 at 65 C for 30 sec. BaseClock done GPU=0 Tj=64 Tach=1887.2 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 22:02:48 at 61 C for 30 sec. BoostClock done GPU=0 Tj=61 Tach=3244.8 P=116 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 8 (8.intersect.nvvdd). Pcie Speed=2500, x16 ClkM = 405.00 MHz ClkHost = 555.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 645.00 MHz ClkXbar = 600.00 MHz ClkSys = 630.00 MHz ClkHub = 202.50 MHz ClkPwr = 540.00 MHz ClkNvd = 585.00 MHz ClkPexGen = 1.00 NVVDD = 700 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 22:03:18 from 58 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 20 s Preheat done GPU=0 Tj=69 Tach=999 P=170 V=1.037 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 22:03:41 at 66 C for 30 sec. 
BaseClock done GPU=0 Tj=64 Tach=1888.4 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 22:04:12 at 61 C for 30 sec. BoostClock done GPU=0 Tj=61 Tach=3244.866666666667 P=115 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok GPU tests completed.
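The passing run above is easier to compare across PStates once the "... done" lines are pulled out of the wall of text. Below is a minimal sketch of such a parser. The reading of the fields (Tj as temperature in C, Tach as fan RPM, P as power in W, V as volts, G as GPC clock in MHz) is an assumption based on how the values line up with the rest of the log, and the class and function names are illustrative.

```python
import re
from dataclasses import dataclass

# Matches e.g. "BaseClock done GPU=0 Tj=62 Tach=1887.73 P=106 V=0.743 G=1365.0"
RESULT = re.compile(
    r"(?P<phase>Preheat|BaseClock|BoostClock) done GPU=(?P<gpu>\d+) "
    r"Tj=(?P<tj>\d+) Tach=(?P<tach>\d+(?:\.\d+)?) "
    r"P=(?P<power>\d+) V=(?P<volts>\d+(?:\.\d+)?) G=(?P<gpc>\d+(?:\.\d+)?)"
)


@dataclass
class PhaseResult:
    phase: str
    tj_c: int          # assumed: temperature in deg C
    tach_rpm: float    # assumed: fan speed in RPM
    power_w: int       # assumed: board power in W
    volts: float       # assumed: core voltage in V
    gpc_mhz: float     # assumed: GPC clock in MHz


def parse_phase_results(log_text):
    """Extract every Preheat/BaseClock/BoostClock result from a MODS log."""
    return [
        PhaseResult(m["phase"], int(m["tj"]), float(m["tach"]),
                    int(m["power"]), float(m["volts"]), float(m["gpc"]))
        for m in RESULT.finditer(log_text)
    ]


# Excerpt copied from the first PState 0 pass in the log above.
SAMPLE = (
    "Preheat done GPU=0 Tj=69 Tach=998 P=171 V=1.043 G=1725.0 TDP "
    "BaseClockTest starting GPU=0, RTP power=0 W "
    "BaseClock done GPU=0 Tj=62 Tach=1887.7333333333333 P=106 V=0.743 G=1365.0 TDP "
    "BoostClockTest starting GPU=0, RTP = 0 W "
    "BoostClock done GPU=0 Tj=58 Tach=3319.866666666667 P=113 V=0.906 G=1710.0 Exit"
)

if __name__ == "__main__":
    for r in parse_phase_results(SAMPLE):
        print(f"{r.phase:<10} {r.gpc_mhz:7.1f} MHz  {r.volts:.3f} V  {r.power_w} W")
```

Run over the whole passing log, this gives one row per phase per PState, which makes it clear that every phase reached the same clocks at similar voltage and power.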
Running test(s) on GPU 0 [01:00.0] (DID: 0x1f08) Enter SetPState (test 0) Switched to PState 0 (0.max). Pcie Speed=8000, x16 ClkM = 7000.98 MHz ClkHost = 1380.00 MHz ClkDisp = 1330.00 MHz ClkGpc = 1935.00 MHz ClkXbar = 1845.00 MHz ClkSys = 1935.00 MHz ClkHub = 810.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1785.00 MHz ClkPexGen = 3.00 NVVDD = 1043 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 22:45:20 from 48 C to 69 C, for up to 90 sec. MonitorTemp GPU=0 Tj=69 over temp limit of 69 at 49 s Preheat done GPU=0 Tj=69 Tach=999 P=172 V=1.043 G=1725.0 TDP BaseClockTest starting GPU=0, RTP power=0 W, compensated AcousticTC temperature = 83 C, TGP limit = 172 W Heating GPU=0 at 22:46:12 at 65 C for 30 sec. BaseClock done GPU=0 Tj=63 Tach=1886.7333333333333 P=106 V=0.743 G=1365.0 TDP BoostClockTest starting GPU=0, RTP = 0 W, compensated AcousticTC = 83 C, TGP limit = 172 W Heating GPU=0 at 22:46:43 at 61 C for 30 sec. BoostClock done GPU=0 Tj=59 Tach=3256.733333333333 P=113 V=0.912 G=1710.0 Exit 000000000000 : BaseBoostClockTest (test 275) ok Enter SetPState (test 0) Switched to PState 0 (0.intersect.nvvdd). Pcie Speed=8000, x16 ClkM = 7000.98 MHz ClkHost = 1155.00 MHz ClkDisp = 1080.00 MHz ClkGpc = 1305.00 MHz ClkXbar = 1245.00 MHz ClkSys = 1305.00 MHz ClkHub = 648.00 MHz ClkPwr = 540.00 MHz ClkNvd = 1200.00 MHz ClkPexGen = 3.00 NVVDD = 706 mV Exit 000000000000 : SetPState (test 0) ok Enter BaseBoostClockTest (test 275) Found board name PG161-0042 WARNING: RTP power policy is not supported in the VBIOS on GPU 0 Could not open/detect USB temperature sensor. 
Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Starting Preheat GPU=0, setting max power limit TGP=172 W, RTP=0 W, AcousticTC = 87 C Heating GPU=0 at 22:47:14 from 56 C to 69 C, for up to 90 sec.
ERROR: ** ModsDrvBreakPoint **
------------------------- BEGIN ASSERT INFO DUMP ------------------------- rLimitMaxW": 0, "maxAcousticTC": 87, "minAcousticTC": 65, "tgpPowerLimitMaxW": 172.5, "Description": "PG150 sku510", "heaterTestNum": 200, "preHeatTestVector": {}, "baseTestNum": 200, "baseTestVector": { "MNKMode": 2, "Msize": 7168, "Nsize": 256, "Ksize": 1024 }, "boostTestNum": 200, "boostTestVector": { "MNKMode": 2, "Msize": 2048, "Nsize": 256, "Ksize": 24 }, "powerSanityCheckThresholdW": 0, "overTempMargin": 5, "clocksPercentile": 50 } pcsensor: Could not find the USB device. Aborting USB setup. pcsensor: Could not find the USB device. Aborting USB setup. pcsensor: Could not find the USB device. Aborting USB setup. pcsensor: Could not find the USB device. Aborting USB setup. pcsensor: Could not find the USB device. Aborting USB setup. pcsensor: Could not find the USB device. Aborting USB setup. Could not open/detect USB temperature sensor. Disabling reads from the sensor, defaulting ambient temperature to 30C, and continuing. Setting RPM to 1000 Heating GPU=0 at 22:47:14 from 56 C to 69 C, for up to 90 sec. 
Found CUDA device "GeForce RTX 2060" with GPU instance 0 +++ thread_name: 'HeaterThread' thread ID: 30 +++ Entering RunThread Calling my start listeners: About to call my run function ModsTest Js Properties: Test name: CudaLinpackSgemm MassertVerboseFlags: 0x0000 MassertAllowed: 0x0 MassertDisabled: false UnexpectedHwIntVerboseFlags: 0x0000 UnexpectedHwIntAllowed: 0x0 UnexpectedHwIntDisabled: false EccErrCountVerboseFlags: 0x0000 EccErrCountAllowed: 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 EccErrCountDisabled: false EdcErrCountVerboseFlags: 0x0000 EdcErrCountAllowed: 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 EdcErrCountDisabled: false OverTempCountVerboseFlags: 0x0000 OverTempCountAllowed: 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 OverTempCountDisabled: false PrintQueueOverflowVerboseFlags: 0x0000 PrintQueueOverflowAllowed: 0x0 PrintQueueOverflowDisabled: false Testing device : GPU 0 [01:00.0] DisableRcWatchdog : false PwrSampleIntervalMs : 500 TargetNvSwitch : false TestConfiguration Js Properties: StartLoop: 0 RestartSkipCount: 100 Loops: 1000 Seed: 0x00001234 TimeoutMs: 1000.000000 Display: 1024x768x32 60Hz ZDepth: 32 FSAAMode: Disabled Surface: 1024x768 PushBufferLocation: Memory::Coherent DstLocation: Memory::Optimal SrcLocation: Memory::Optimal MemoryType: Memory::Coherent UseIndMem: false ChannelType: UseNewestAvailable (multiple channels NOT allowed) ChannelSize: 0x00100000 (1048576) UseTiledSurface: false DisableCrt: false EarlyExitOnErrCount: false Verbose: false ShortenTestForSim: false Dma Protocol: Default NotifierLocation: Memory::Coherent GpFifoEntries: 0x00000200 (512) AutoFlush: true AutoFlushThresh: 256 ChannelLogging: false AllowVIC: true SemaphorePayloadSize: Default DisplayMgrRequirements: RequireNullDisplay GoldenValues Js Properties: PlatformName: TU106 NameSuffix: Action: 
Golden.Check SkipCount: 100 Codes: 0 NumCodeBins: 97 StopOnError: true BufferFetchHint: opCpuDma CalculationAlgorithm: CpuCalcAlgorithm CheckDmaOnFail: false RetryDmaOnFail: false SendTrigger: false TriggerLoop: 0 TriggerSubdevice: 0 PrintCsv: false Print: 0 (Never) Interact: 0 (Never) DumpTga: 0 (Never) DumpPng: 0 (Never) CheckLoops 1000 RuntimeMs: 0 KeepRunning: true Test type: Normal MNK Align mode: Strict Max data dump: 4294967295 NumNewMatrices: 0 Dump Miscompares: false Dump Matrices: false PrintPerf: false GflopsLowerBound: 0.000000 GflopsUpperBound: 0.000000 Verify Results: true Naive Init: false Alpha: 0.500000 Beta: 0.500000 Synchronous Mode: false LaunchDelay(uSec): 0 UseCrcToVerify: false SkipAlphaBetaCheck: false CMatrixScale: 1 CtaSwizzle: false Mtx fill data type: Random Mtx fill mean: 0.000000 Mtx fill std dev: 5.000000 Warning: MSI for device 0000:1:00.0 not serviced Monitor gpu=0 0 sec temp=56 tach=3284 pwr=49 g=1920 v=1050 NVRM: ctxshareConstruct_IMPL CtxShare Ptr: 7F2A3899B910 ChanGrp: 7F2A3899ACE0 ! 
Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Loading CUDA module linpack75.cubin Loading CUDA module arch_sgemm_nt75.cubin Warning: MSI for device 0000:1:00.0 not serviced Function name turing_sgemm_128x128_mods_nt Buffer size: 34537472 elements Allocating on GPU: 135 MB Num SM: 30 Block size 256 1 Grid size 30 2 Msize: 3840 (x128) Nsize: 256 (x128) Ksize: 8192 (x1) Matrix A: 0x3ff6023c0000 Matrix B: 0x3ff609bc0000 Matrix C: 0x3ff602000000 Matrix RefC: 0x3ff60a400000 CudaLinpack run: M=3840 N=256 K=8192 Loops=1000 TimePerLoop=2225us NVRM: BIND_ERROR pending with error code ffffffff! NVRM: SCHED_ERROR pending with error code ffffffff! NVRM: CHSW_ERROR pending with error code ffffffff! NVRM: FB flush timeout NVRM: LB_ERROR pending with error code ffffffff! 
NVRM fifoServicePreemptRunlist_GM107: OBJSCHEDMGR get runlist 0x4 SCHED failed NVRM: fifoServicePBDMA_TU101: Error 0x00000021 returned from fifoPBDMAGetChannel_HAL(pGpu, pFifo, index, &pFifoData, NV_FALSE). NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 0 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 11 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 12 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 8 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 9 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 10 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 1 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 3 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 2 NVRM: NV_PFB_FBPA_1_ECC_STATUS(0)_SEC_INTR_PENDING NVRM fbEccReadAddrRBC_GP100: Row 0x1ffff Bank 0xf Col 0x7ff ExtCol 0xff ExtBank 0x1 PseudoChannel 0x1 NVRM: _pmuMutexIdGen_GK104: The PMU mutex ID generator returned 0xFFFFFFFF suggesting there may be an error with BAR0. Verify BAR0 is functional before filing a bug. NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:1594
ERROR: ** ModsDrvBreakPoint **
-------------------------- END ASSERT INFO DUMP --------------------------
ERROR: ** ModsDrvBreakPoint **
------------------------- BEGIN ASSERT INFO DUMP ------------------------- leCrt: false EarlyExitOnErrCount: false Verbose: false ShortenTestForSim: false Dma Protocol: Default NotifierLocation: Memory::Coherent GpFifoEntries: 0x00000200 (512) AutoFlush: true AutoFlushThresh: 256 ChannelLogging: false AllowVIC: true SemaphorePayloadSize: Default DisplayMgrRequirements: RequireNullDisplay GoldenValues Js Properties: PlatformName: TU106 NameSuffix: Action: Golden.Check SkipCount: 100 Codes: 0 NumCodeBins: 97 StopOnError: true BufferFetchHint: opCpuDma CalculationAlgorithm: CpuCalcAlgorithm CheckDmaOnFail: false RetryDmaOnFail: false SendTrigger: false TriggerLoop: 0 TriggerSubdevice: 0 PrintCsv: false Print: 0 (Never) Interact: 0 (Never) DumpTga: 0 (Never) DumpPng: 0 (Never) CheckLoops 1000 RuntimeMs: 0 KeepRunning: true Test type: Normal MNK Align mode: Strict Max data dump: 4294967295 NumNewMatrices: 0 Dump Miscompares: false Dump Matrices: false PrintPerf: false GflopsLowerBound: 0.000000 GflopsUpperBound: 0.000000 Verify Results: true Naive Init: false Alpha: 0.500000 Beta: 0.500000 Synchronous Mode: false LaunchDelay(uSec): 0 UseCrcToVerify: false SkipAlphaBetaCheck: false CMatrixScale: 1 CtaSwizzle: false Mtx fill data type: Random Mtx fill mean: 0.000000 Mtx fill std dev: 5.000000 Warning: MSI for device 0000:1:00.0 not serviced Monitor gpu=0 0 sec temp=56 tach=3284 pwr=49 g=1920 v=1050 NVRM: ctxshareConstruct_IMPL CtxShare Ptr: 7F2A3899B910 ChanGrp: 7F2A3899ACE0 ! 
Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Loading CUDA module linpack75.cubin Loading CUDA module arch_sgemm_nt75.cubin Warning: MSI for device 0000:1:00.0 not serviced Function name turing_sgemm_128x128_mods_nt Buffer size: 34537472 elements Allocating on GPU: 135 MB Num SM: 30 Block size 256 1 Grid size 30 2 Msize: 3840 (x128) Nsize: 256 (x128) Ksize: 8192 (x1) Matrix A: 0x3ff6023c0000 Matrix B: 0x3ff609bc0000 Matrix C: 0x3ff602000000 Matrix RefC: 0x3ff60a400000 CudaLinpack run: M=3840 N=256 K=8192 Loops=1000 TimePerLoop=2225us NVRM: BIND_ERROR pending with error code ffffffff! NVRM: SCHED_ERROR pending with error code ffffffff! NVRM: CHSW_ERROR pending with error code ffffffff! NVRM: FB flush timeout NVRM: LB_ERROR pending with error code ffffffff! 
NVRM fifoServicePreemptRunlist_GM107: OBJSCHEDMGR get runlist 0x4 SCHED failed
NVRM: fifoServicePBDMA_TU101: Error 0x00000021 returned from fifoPBDMAGetChannel_HAL(pGpu, pFifo, index, &pFifoData, NV_FALSE).
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 0
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 11
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 12
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 8
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 9
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 10
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 1
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 3
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 2
NVRM: NV_PFB_FBPA_1_ECC_STATUS(0)_SEC_INTR_PENDING
NVRM fbEccReadAddrRBC_GP100: Row 0x1ffff Bank 0xf Col 0x7ff ExtCol 0xff ExtBank 0x1 PseudoChannel 0x1
NVRM: _pmuMutexIdGen_GK104: The PMU mutex ID generator returned 0xFFFFFFFF suggesting there may be an error with BAR0. Verify BAR0 is functional before filing a bug.
NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:1594
ERROR: ** ModsDrvBreakPoint **
First error recorded: 818 Mods detected an assertion failure
JavaScript stack trace:
NVRM: pmuMutexAcquireByIndex_GK104: error generating a mutex identifer.
NVRM mcServiceList_GK104: Failed GPU reg read : 0xffffffff. Check whether GPU is present on the bus
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! ** PMU HALTED ** !!
!! Please file a bug with the following information. !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - START >>>>>>>>>>>>
Falcon engine: : 0x0000000c
Falcon core revision : 0x00f3
Falcon security model : 0x03
PMU_DEBUG(00) : 0xffffffff
PMU_DEBUG(01) : 0xffffffff
PMU_DEBUG(02) : 0xffffffff
PMU_DEBUG(03) : 0xffffffff
PMU_MAILBOX(00) : 0xffffffff
PMU_MAILBOX(01) : 0xffffffff
PMU_MAILBOX(02) : 0xffffffff
PMU_MAILBOX(03) : 0xffffffff
PMU_MAILBOX(04) : 0xffffffff
PMU_MAILBOX(05) : 0xffffffff
PMU_MAILBOX(06) : 0xffffffff
PMU_MAILBOX(07) : 0xffffffff
PMU_MAILBOX(08) : 0xffffffff
PMU_MAILBOX(09) : 0xffffffff
PMU_MAILBOX(10) : 0xffffffff
PMU_MAILBOX(11) : 0xffffffff
Falcon is in HS mode. We won't be able to update TRACEPC index through priv.
We can only dump the branch indicated by the current value of TRACEIDX_IDX, instead of the entire buffer.
FALCON_TRACEPC(255) = 0x00ffffff
PMU_BAR0_ERROR_STATUS : 0xffffffff
PMU_BAR0_ADDR : 0xffffffff
PMU_BAR0_DATA : 0xffffffff
PMU_BAR0_TIMEOUT : 0xffffffff
PMU_BAR0_CTL : 0xffffffff
** Internal special purpose registers are inaccessible through ICD.**
** RSTAT registers are inaccessible through ICD.**
OS : 0xffffffff
CPUCTL : 0xffffffff
IDLESTATE : 0xffffffff
MAILBOX0 : 0xffffffff
MAILBOX1 : 0xffffffff
IRQSTAT : 0xffffffff
IRQMODE : 0xffffffff
IRQMASK : 0xffffffff
IRQDEST : 0xffffffff
DMACTL : 0xffffffff
DMATRFCMD : 0xffffffff
DMATRFBASE : 0xffffffff
DMATRFMOFFS : 0xffffffff
DMATRFFBOFFS : 0xffffffff
BOOTVEC : 0xffffffff
HWCFG : 0xffffffff
HWCFG1 : 0xffffffff
ENGCTL : 0xffffffff
CURCTX : 0xffffffff
NXTCTX : 0xffffffff
EXTERRSTAT : 0xffffffff ** EXTERR Detected!
EXTERRADDR : 0xffffffff
RSTAT0 : 0xffffffff
RSTAT3 : 0xffffffff
EXTERR_INFO : 0xffffffff
SCTL : 0xffffffff
SCTL1 : 0xffffffff
Last Exception info : EXCEPTION TYPE: UNKOWN at PC 0xffffffff.
<<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - END >>>>>>>>>>>>>
NVRM: bp @ ../../../../resman/kernel/pmu/maxwell/pmugm200.c:351 (reason=112)
ERROR: ** ModsDrvBreakPoint **
-------------------------- END ASSERT INFO DUMP --------------------------
ERROR: ** ModsDrvBreakPoint **
------------------------- BEGIN ASSERT INFO DUMP ------------------------- device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Warning: MSI for device 0000:1:00.0 not serviced Loading CUDA module linpack75.cubin Loading CUDA module arch_sgemm_nt75.cubin Warning: MSI for device 0000:1:00.0 not serviced Function name turing_sgemm_128x128_mods_nt Buffer size: 34537472 elements Allocating on GPU: 135 MB Num SM: 30 Block size 256 1 Grid size 30 2 Msize: 3840 (x128) Nsize: 256 (x128) Ksize: 8192 (x1) Matrix A: 0x3ff6023c0000 Matrix B: 0x3ff609bc0000 Matrix C: 0x3ff602000000 Matrix RefC: 0x3ff60a400000 CudaLinpack run: M=3840 N=256 K=8192 Loops=1000 TimePerLoop=2225us NVRM: BIND_ERROR pending with error code ffffffff! NVRM: SCHED_ERROR pending with error code ffffffff! NVRM: CHSW_ERROR pending with error code ffffffff! NVRM: FB flush timeout NVRM: LB_ERROR pending with error code ffffffff! NVRM fifoServicePreemptRunlist_GM107: OBJSCHEDMGR get runlist 0x4 SCHED failed NVRM: fifoServicePBDMA_TU101: Error 0x00000021 returned from fifoPBDMAGetChannel_HAL(pGpu, pFifo, index, &pFifoData, NV_FALSE). 
NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 0 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 11 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 12 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 8 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 9 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 10 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 1 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 3 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 2 NVRM: NV_PFB_FBPA_1_ECC_STATUS(0)_SEC_INTR_PENDING NVRM fbEccReadAddrRBC_GP100: Row 0x1ffff Bank 0xf Col 0x7ff ExtCol 0xff ExtBank 0x1 PseudoChannel 0x1 NVRM: _pmuMutexIdGen_GK104: The PMU mutex ID generator returned 0xFFFFFFFF suggesting there may be an error with BAR0. Verify BAR0 is functional before filing a bug. NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:1594
ERROR: ** ModsDrvBreakPoint ** First error recorded: 818 Mods detected an assertion failure JavaScript stack trace: NVRM: pmuMutexAcquireByIndex_GK104: error generating a mutex identifer. NVRM mcServiceList_GK104: Failed GPU reg read : 0xffffffff. Check whether GPU is present on the bus !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !! ** PMU HALTED ** !! !! Please file a bug with the following information. !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - START >>>>>>>>>>>> Falcon engine: : 0x0000000c Falcon core revision : 0x00f3 Falcon security model : 0x03 PMU_DEBUG(00) : 0xffffffff PMU_DEBUG(01) : 0xffffffff PMU_DEBUG(02) : 0xffffffff PMU_DEBUG(03) : 0xffffffff PMU_MAILBOX(00) : 0xffffffff PMU_MAILBOX(01) : 0xffffffff PMU_MAILBOX(02) : 0xffffffff PMU_MAILBOX(03) : 0xffffffff PMU_MAILBOX(04) : 0xffffffff PMU_MAILBOX(05) : 0xffffffff PMU_MAILBOX(06) : 0xffffffff PMU_MAILBOX(07) : 0xffffffff PMU_MAILBOX(08) : 0xffffffff PMU_MAILBOX(09) : 0xffffffff PMU_MAILBOX(10) : 0xffffffff PMU_MAILBOX(11) : 0xffffffff Falcon is in HS mode. We won't be able to update TRACEPC index through priv. We can only dump the branch indicated by the current value of TRACEIDX_IDX, instead of the entire buffer. 
FALCON_TRACEPC(255) = 0x00ffffff PMU_BAR0_ERROR_STATUS : 0xffffffff PMU_BAR0_ADDR : 0xffffffff PMU_BAR0_DATA : 0xffffffff PMU_BAR0_TIMEOUT : 0xffffffff PMU_BAR0_CTL : 0xffffffff ** Internal special purpose registers are inaccessible through ICD.** ** RSTAT registers are inaccessible through ICD.** OS : 0xffffffff CPUCTL : 0xffffffff IDLESTATE : 0xffffffff MAILBOX0 : 0xffffffff MAILBOX1 : 0xffffffff IRQSTAT : 0xffffffff IRQMODE : 0xffffffff IRQMASK : 0xffffffff IRQDEST : 0xffffffff DMACTL : 0xffffffff DMATRFCMD : 0xffffffff DMATRFBASE : 0xffffffff DMATRFMOFFS : 0xffffffff DMATRFFBOFFS : 0xffffffff BOOTVEC : 0xffffffff HWCFG : 0xffffffff HWCFG1 : 0xffffffff ENGCTL : 0xffffffff CURCTX : 0xffffffff NXTCTX : 0xffffffff EXTERRSTAT : 0xffffffff ** EXTERR Detected! EXTERRADDR : 0xffffffff RSTAT0 : 0xffffffff RSTAT3 : 0xffffffff EXTERR_INFO : 0xffffffff SCTL : 0xffffffff SCTL1 : 0xffffffff Last Exception info : EXCEPTION TYPE: UNKOWN at PC 0xffffffff. <<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - END >>>>>>>>>>>>> NVRM: bp @ ../../../../resman/kernel/pmu/maxwell/pmugm200.c:351 (reason=112)
ERROR: ** ModsDrvBreakPoint ** <<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - START >>>>>>>>>>>> Falcon engine: : 0x0000000c Falcon core revision : 0x00f3 Falcon security model : 0x03 PMU_DEBUG(00) : 0xffffffff PMU_DEBUG(01) : 0xffffffff PMU_DEBUG(02) : 0xffffffff PMU_DEBUG(03) : 0xffffffff PMU_MAILBOX(00) : 0xffffffff PMU_MAILBOX(01) : 0xffffffff PMU_MAILBOX(02) : 0xffffffff PMU_MAILBOX(03) : 0xffffffff PMU_MAILBOX(04) : 0xffffffff PMU_MAILBOX(05) : 0xffffffff PMU_MAILBOX(06) : 0xffffffff PMU_MAILBOX(07) : 0xffffffff PMU_MAILBOX(08) : 0xffffffff PMU_MAILBOX(09) : 0xffffffff PMU_MAILBOX(10) : 0xffffffff PMU_MAILBOX(11) : 0xffffffff Falcon is in HS mode. We won't be able to update TRACEPC index through priv. We can only dump the branch indicated by the current value of TRACEIDX_IDX, instead of the entire buffer. FALCON_TRACEPC(255) = 0x00ffffff PMU_BAR0_ERROR_STATUS : 0xffffffff PMU_BAR0_ADDR : 0xffffffff PMU_BAR0_DATA : 0xffffffff PMU_BAR0_TIMEOUT : 0xffffffff PMU_BAR0_CTL : 0xffffffff ** Internal special purpose registers are inaccessible through ICD.** ** RSTAT registers are inaccessible through ICD.** OS : 0xffffffff CPUCTL : 0xffffffff IDLESTATE : 0xffffffff MAILBOX0 : 0xffffffff MAILBOX1 : 0xffffffff IRQSTAT : 0xffffffff IRQMODE : 0xffffffff IRQMASK : 0xffffffff IRQDEST : 0xffffffff DMACTL : 0xffffffff DMATRFCMD : 0xffffffff DMATRFBASE : 0xffffffff DMATRFMOFFS : 0xffffffff DMATRFFBOFFS : 0xffffffff BOOTVEC : 0xffffffff HWCFG : 0xffffffff HWCFG1 : 0xffffffff ENGCTL : 0xffffffff CURCTX : 0xffffffff NXTCTX : 0xffffffff EXTERRSTAT : 0xffffffff ** EXTERR Detected! EXTERRADDR : 0xffffffff RSTAT0 : 0xffffffff RSTAT3 : 0xffffffff EXTERR_INFO : 0xffffffff SCTL : 0xffffffff SCTL1 : 0xffffffff Last Exception info : EXCEPTION TYPE: UNKOWN at PC 0xffffffff. <<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - END >>>>>>>>>>>>> NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:2646 (reason=112)
ERROR: ** ModsDrvBreakPoint **
-------------------------- END ASSERT INFO DUMP --------------------------
ERROR: ** ModsDrvBreakPoint **
------------------------- BEGIN ASSERT INFO DUMP ------------------------- 8192 (x1) Matrix A: 0x3ff6023c0000 Matrix B: 0x3ff609bc0000 Matrix C: 0x3ff602000000 Matrix RefC: 0x3ff60a400000 CudaLinpack run: M=3840 N=256 K=8192 Loops=1000 TimePerLoop=2225us NVRM: BIND_ERROR pending with error code ffffffff! NVRM: SCHED_ERROR pending with error code ffffffff! NVRM: CHSW_ERROR pending with error code ffffffff! NVRM: FB flush timeout NVRM: LB_ERROR pending with error code ffffffff! NVRM fifoServicePreemptRunlist_GM107: OBJSCHEDMGR get runlist 0x4 SCHED failed NVRM: fifoServicePBDMA_TU101: Error 0x00000021 returned from fifoPBDMAGetChannel_HAL(pGpu, pFifo, index, &pFifoData, NV_FALSE). NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 0 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 11 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 12 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 8 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 9 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 10 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 1 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 3 NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 2 NVRM: NV_PFB_FBPA_1_ECC_STATUS(0)_SEC_INTR_PENDING NVRM fbEccReadAddrRBC_GP100: Row 0x1ffff Bank 0xf Col 0x7ff ExtCol 0xff ExtBank 0x1 PseudoChannel 0x1 NVRM: _pmuMutexIdGen_GK104: The PMU mutex ID generator returned 0xFFFFFFFF suggesting there may be an error with BAR0. Verify BAR0 is functional before filing a bug. NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:1594
ERROR: ** ModsDrvBreakPoint ** First error recorded: 818 Mods detected an assertion failure JavaScript stack trace: NVRM: pmuMutexAcquireByIndex_GK104: error generating a mutex identifer. NVRM mcServiceList_GK104: Failed GPU reg read : 0xffffffff. Check whether GPU is present on the bus !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! !! ** PMU HALTED ** !! !! Please file a bug with the following information. !! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! <<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - START >>>>>>>>>>>> Falcon engine: : 0x0000000c Falcon core revision : 0x00f3 Falcon security model : 0x03 PMU_DEBUG(00) : 0xffffffff PMU_DEBUG(01) : 0xffffffff PMU_DEBUG(02) : 0xffffffff PMU_DEBUG(03) : 0xffffffff PMU_MAILBOX(00) : 0xffffffff PMU_MAILBOX(01) : 0xffffffff PMU_MAILBOX(02) : 0xffffffff PMU_MAILBOX(03) : 0xffffffff PMU_MAILBOX(04) : 0xffffffff PMU_MAILBOX(05) : 0xffffffff PMU_MAILBOX(06) : 0xffffffff PMU_MAILBOX(07) : 0xffffffff PMU_MAILBOX(08) : 0xffffffff PMU_MAILBOX(09) : 0xffffffff PMU_MAILBOX(10) : 0xffffffff PMU_MAILBOX(11) : 0xffffffff Falcon is in HS mode. We won't be able to update TRACEPC index through priv. We can only dump the branch indicated by the current value of TRACEIDX_IDX, instead of the entire buffer. 
FALCON_TRACEPC(255) = 0x00ffffff PMU_BAR0_ERROR_STATUS : 0xffffffff PMU_BAR0_ADDR : 0xffffffff PMU_BAR0_DATA : 0xffffffff PMU_BAR0_TIMEOUT : 0xffffffff PMU_BAR0_CTL : 0xffffffff ** Internal special purpose registers are inaccessible through ICD.** ** RSTAT registers are inaccessible through ICD.** OS : 0xffffffff CPUCTL : 0xffffffff IDLESTATE : 0xffffffff MAILBOX0 : 0xffffffff MAILBOX1 : 0xffffffff IRQSTAT : 0xffffffff IRQMODE : 0xffffffff IRQMASK : 0xffffffff IRQDEST : 0xffffffff DMACTL : 0xffffffff DMATRFCMD : 0xffffffff DMATRFBASE : 0xffffffff DMATRFMOFFS : 0xffffffff DMATRFFBOFFS : 0xffffffff BOOTVEC : 0xffffffff HWCFG : 0xffffffff HWCFG1 : 0xffffffff ENGCTL : 0xffffffff CURCTX : 0xffffffff NXTCTX : 0xffffffff EXTERRSTAT : 0xffffffff ** EXTERR Detected! EXTERRADDR : 0xffffffff RSTAT0 : 0xffffffff RSTAT3 : 0xffffffff EXTERR_INFO : 0xffffffff SCTL : 0xffffffff SCTL1 : 0xffffffff Last Exception info : EXCEPTION TYPE: UNKOWN at PC 0xffffffff. <<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - END >>>>>>>>>>>>> NVRM: bp @ ../../../../resman/kernel/pmu/maxwell/pmugm200.c:351 (reason=112)
ERROR: ** ModsDrvBreakPoint ** <<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - START >>>>>>>>>>>> Falcon engine: : 0x0000000c Falcon core revision : 0x00f3 Falcon security model : 0x03 PMU_DEBUG(00) : 0xffffffff PMU_DEBUG(01) : 0xffffffff PMU_DEBUG(02) : 0xffffffff PMU_DEBUG(03) : 0xffffffff PMU_MAILBOX(00) : 0xffffffff PMU_MAILBOX(01) : 0xffffffff PMU_MAILBOX(02) : 0xffffffff PMU_MAILBOX(03) : 0xffffffff PMU_MAILBOX(04) : 0xffffffff PMU_MAILBOX(05) : 0xffffffff PMU_MAILBOX(06) : 0xffffffff PMU_MAILBOX(07) : 0xffffffff PMU_MAILBOX(08) : 0xffffffff PMU_MAILBOX(09) : 0xffffffff PMU_MAILBOX(10) : 0xffffffff PMU_MAILBOX(11) : 0xffffffff Falcon is in HS mode. We won't be able to update TRACEPC index through priv. We can only dump the branch indicated by the current value of TRACEIDX_IDX, instead of the entire buffer. FALCON_TRACEPC(255) = 0x00ffffff PMU_BAR0_ERROR_STATUS : 0xffffffff PMU_BAR0_ADDR : 0xffffffff PMU_BAR0_DATA : 0xffffffff PMU_BAR0_TIMEOUT : 0xffffffff PMU_BAR0_CTL : 0xffffffff ** Internal special purpose registers are inaccessible through ICD.** ** RSTAT registers are inaccessible through ICD.** OS : 0xffffffff CPUCTL : 0xffffffff IDLESTATE : 0xffffffff MAILBOX0 : 0xffffffff MAILBOX1 : 0xffffffff IRQSTAT : 0xffffffff IRQMODE : 0xffffffff IRQMASK : 0xffffffff IRQDEST : 0xffffffff DMACTL : 0xffffffff DMATRFCMD : 0xffffffff DMATRFBASE : 0xffffffff DMATRFMOFFS : 0xffffffff DMATRFFBOFFS : 0xffffffff BOOTVEC : 0xffffffff HWCFG : 0xffffffff HWCFG1 : 0xffffffff ENGCTL : 0xffffffff CURCTX : 0xffffffff NXTCTX : 0xffffffff EXTERRSTAT : 0xffffffff ** EXTERR Detected! EXTERRADDR : 0xffffffff RSTAT0 : 0xffffffff RSTAT3 : 0xffffffff EXTERR_INFO : 0xffffffff SCTL : 0xffffffff SCTL1 : 0xffffffff Last Exception info : EXCEPTION TYPE: UNKOWN at PC 0xffffffff. <<<<<<<<<<<<< FALCON DEBUG INFO GPU 0 - END >>>>>>>>>>>>> NVRM: bp @ ../../../../resman/kernel/pmu/kepler/pmu_gk104.c:2646 (reason=112)
ERROR: ** ModsDrvBreakPoint **
NVRM-RC: tmrGetTimeEx_GK104: Consistently Bad TimeLo value ffffffff
NVRM: bp @ ../../../../resman/kernel/tmr/kepler/tmrgk104.c:97
ERROR: ** ModsDrvBreakPoint **
-------------------------- END ASSERT INFO DUMP --------------------------
MODS is exiting with failure after assert #5. Breakpoint limit exceeded.
Error Code = 000000000818 (Mods detected an assertion failure)
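A note on reading this dump: almost every register comes back as 0xffffffff, and the log itself says "Check whether GPU is present on the bus". All-ones reads are what a PCI device typically returns when it has stopped responding, so the dump may say more about the card dropping off the bus than about a particular memory chip. Below is a minimal Python sketch for summarizing such a log. It is not part of MODS; the function name summarize_mods_log and the choice of patterns are assumptions made for illustration.

```python
import re

# Patterns are taken from the log text above. Treating them together as a
# "GPU fell off the bus" indicator is an assumption, not a MODS feature.
ALL_ONES = re.compile(r"\b0xf{8}\b", re.IGNORECASE)
CTXSW = re.compile(r"NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : (\d+)")
BUS_HINT = "Check whether GPU is present on the bus"


def summarize_mods_log(text):
    """Count all-ones values, list timed-out engines, and flag the bus hint."""
    return {
        "all_ones_values": len(ALL_ONES.findall(text)),
        "ctxsw_timeout_engines": sorted({int(n) for n in CTXSW.findall(text)}),
        "bus_hint": BUS_HINT in text,
    }


if __name__ == "__main__":
    sample = (
        "NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 0 "
        "NVRM: NV_PFIFO_INTR_CTXSW_TIMEOUT pending for engine : 11 "
        "NVRM mcServiceList_GK104: Failed GPU reg read : 0xffffffff. "
        "Check whether GPU is present on the bus "
        "PMU_DEBUG(00) : 0xffffffff"
    )
    print(summarize_mods_log(sample))
```

If the count of all-ones values is high and the bus hint is present, it is probably worth checking power delivery, the slot and the platform before suspecting the card.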
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 17 Oct 2022, 20:19
Advanced forum member
Registered: 03 Nov 2017, 01:24 Cash on hand: 5,011.04 Posts: 3985 Location: Budapest
demonis wrote:
MODS
What does it report if you run the test like this?
./mods gputest.js -oqa -old_gold -test 94 -fan_speed 70 -dramclk_percent 110 -ignore_fatal_errors -run_on_error -matsinfo
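If one run at 110% is not conclusive, the same command could be repeated at a few memory clock percentages to see where the failures start. Here is a minimal Python sketch that only builds the command lines and does not execute anything. The flags are copied from the command above; the helper names mods_command and sweep_commands are hypothetical, and whether MODS accepts values other than 110 for -dramclk_percent is an assumption.

```python
import shlex


def mods_command(dramclk_percent):
    """Build the argument list from the post with a chosen memory clock percent."""
    return [
        "./mods", "gputest.js", "-oqa", "-old_gold",
        "-test", "94",
        "-fan_speed", "70",
        "-dramclk_percent", str(dramclk_percent),
        "-ignore_fatal_errors", "-run_on_error", "-matsinfo",
    ]


def sweep_commands(percents):
    """Return one argument list per memory clock percentage."""
    return [mods_command(p) for p in percents]


if __name__ == "__main__":
    # Print the commands for manual use; nothing is run here.
    for cmd in sweep_commands([100, 105, 110]):
        print(shlex.join(cmd))
```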
demonis
[OP]
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations.
Posted: 18 Oct 2022, 08:47
Interested user
Registered: 08 Mar 2016, 18:22 Cash on hand: 247.86 Posts: 143 Location: Irkutsk
I also ran Superposition: at the default settings the test passes, but on Ultra it crashes the same way. Heaven crashes at the same spot under both DX11 and OpenGL.
I will also try to test at a friend's place on another platform. I am still hoping that some component on my side is at fault. The problem is too specific and identical on both of the seller's cards, and on his PC the same Heaven ran flawlessly.
Testing on another platform is a sensible idea given this line: NVRM mcServiceList_GK104: Failed GPU reg read : 0xffffffff. Check whether GPU is present on the bus
demonis
[OP]
Post subject: Re: MSI GeForce RTX2060 Ventus OS RU hangs in certain situations. [SOLVED]
Posted: 13 Nov 2022, 07:52
Interested user
Registered: 08 Mar 2016, 18:22 Cash on hand: 247.86 Posts: 143 Location: Irkutsk
Testing on another system gave a partially positive result: the test ran for about 20 minutes without hanging, then hung the same way. The PSU there was 400 W. In the end I replaced the PSU in my main PC, swapping the 500 W unit for a 650 W Chieftec. That solved the problem completely. Apparently the capacitors in the old PSU had dried out a bit, and it could not keep up under certain peak loads.
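For anyone who wants to sanity-check this kind of conclusion, here is a rough back-of-the-envelope sketch in Python. Only the PSU ratings (500 W and 650 W) come from this thread. The 160 W figure is NVIDIA's reference board power for the RTX 2060 and is assumed to apply to this card; the rest-of-system draw, the transient spike multiplier and the aging derate are pure assumptions chosen for illustration, not measurements.

```python
def psu_headroom_w(rated_w, gpu_w, rest_of_system_w, spike_factor=1.0, derate=1.0):
    """Return the watts left when the GPU briefly draws gpu_w * spike_factor.

    derate models an aged supply that no longer delivers its label rating.
    A negative result means the momentary demand exceeds what is available.
    """
    available = rated_w * derate
    demand = gpu_w * spike_factor + rest_of_system_w
    return available - demand


# PSU ratings come from the thread; every other number is an assumption.
GPU_W = 160    # reference RTX 2060 board power, assumed for this card
REST_W = 150   # assumed CPU + motherboard + drives
SPIKE = 1.8    # assumed short transient multiplier
AGED = 0.8     # assumed derating for dried-out capacitors

if __name__ == "__main__":
    for label, rated, derate in [("old 500 W", 500, AGED), ("new 650 W", 650, 1.0)]:
        margin = psu_headroom_w(rated, GPU_W, REST_W, SPIKE, derate)
        print(f"{label}: {margin:+.0f} W margin at peak")
```

With these assumed numbers the aged 500 W unit comes out with a negative margin during a spike while the 650 W unit keeps a comfortable reserve. That is consistent with hangs that only appear in specific heavy scenes, but it does not prove the cause.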