Greetings, gentlemen. I have a MicroServer Gen8 with a G1610T and 4 GB of RAM, running Jun's loader 1.02b with DSM 6.1.7-15284 Update 3 on ESXi 5.5.
Every so often this shows up in the logs:
[ 992.620742] BUG: unable to handle kernel paging request at 0000000c0000055c
[ 992.622724] IP: [<ffffffff814abbc4>] mutex_lock+0x4/0x20
[ 992.624154] PGD 4d152067 PUD 0
[ 992.625059] Oops: 0002 [#1] SMP
[ 992.625974] Modules linked in: snd_usb_hiface snd_pcm_oss snd_mixer_oss snd_usb_audio snd_pcm snd_timer snd_hwdep snd_usbmidi_lib snd_rawmidi snd_seq_device snd snd_page_alloc soundcore bridge stp aufs macvlan veth xt_conntrack xt_addrtype nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_MASQUERADE xt_REDIRECT xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_recent xt_iprange xt_limit xt_state xt_tcpudp xt_multiport xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables cifs udf isofs loop iscsi_target_mod(O) target_core_ep(O) target_core_file(O) target_core_iblock(O) target_core_mod(O) syno_extent_pool(PO) rodsp_ep(O) hid_generic usbhid hid usblp nf_conntrack x_tables bromolow_synobios(PO) button ax88179_178a usbnet tg3 r8169 cnic bnx2 vmxnet3 pcnet32 e1000 sfc netxen_nic qlge qlcnic
[ 992.647692] qla3xxx pch_gbe ptp_pch sky2 skge jme ipg uio alx atl1c atl1e atl1 libphy mii exfat(O) btrfs synoacl_vfs(PO) zlib_deflate hfsplus md4 hmac bnx2x(O) libcrc32c mdio mlx5_core(O) mlx4_en(O) mlx4_core(O) mlx_compat(O) compat(O) qede(O) qed(O) atlantic(O) r8168(O) tn40xx(O) i40e(O) ixgbe(O) be2net(O) igb(O) i2c_algo_bit e1000e(O) dca fuse vfat fat crc32c_intel glue_helper lrw gf128mul ablk_helper sha512_generic arc4 cryptd ecryptfs sha256_generic sha1_generic ecb aes_x86_64 authenc des_generic ansi_cprng cts md5 cbc cpufreq_conservative cpufreq_powersave cpufreq_performance cpufreq_ondemand mperf processor thermal_sys cpufreq_stats freq_table dm_snapshot crc_itu_t crc_ccitt quota_v2 quota_tree psnap p8022 llc sit tunnel4 ip_tunnel ipv6 zram(C) sg etxhci_hcd mpt3sas mpt2sas(O) megaraid_sas
[ 992.671692] ata_piix mptctl mptsas mptspi mptscsih mptbase scsi_transport_spi megaraid megaraid_mbox megaraid_mm vmw_pvscsi BusLogic usb_storage xhci_hcd uhci_hcd ohci_hcd ehci_pci ehci_hcd usbcore usb_common el000(O) [last unloaded: ip_tables]
[ 992.677941] CPU: 1 PID: 21982 Comm: SYNO.DSM.Info_2 Tainted: P C O 3.10.102 #15284
[ 992.680066] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
[ 992.682813] task: ffff88005f3fd800 ti: ffff880028e94000 task.ti: ffff880028e94000
[ 992.684758] RIP: 0010:[<ffffffff814abbc4>] [<ffffffff814abbc4>] mutex_lock+0x4/0x20
[ 992.686811] RSP: 0018:ffff880028e97d08 EFLAGS: 00010246
[ 992.688218] RAX: ffff88007842e200 RBX: 0000000c0000055c RCX: 0000000000000000
[ 992.690069] RDX: 000000000000ffff RSI: 0000000000000002 RDI: 0000000c0000055c
[ 992.691915] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 992.693761] R10: 000000000000010f R11: 0000000000000246 R12: ffffffff8184e4b0
[ 992.695606] R13: ffff880028e97d80 R14: 000000000000fa80 R15: 0000000c0000055c
[ 992.697460] FS: 00007f956aea97c0(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 992.699556] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 992.701055] CR2: 0000000c0000055c CR3: 0000000056279000 CR4: 00000000001407e0
[ 992.702942] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 992.704811] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 992.706664] Stack:
[ 992.707215] 0000000c00000404 ffffffff8137ab5b 0000000000000000 ffff880028e97d80
[ 992.709332] 00007ffd08067f30 ffff880028e97d80 00007ffd08067f30 0000000000000000
[ 992.711430] 0000000000000001 ffffffffa0caa21e ffffffffffffffff ffffffffa0ca6e27
[ 992.713527] Call Trace:
[ 992.714191] [<ffffffff8137ab5b>] ? syno_cpu_temperature+0xfb/0x1d0
[ 992.715832] [<ffffffffa0caa21e>] ? GetCpuTemperatureDenlowI3Transfer+0xe/0x80 [bromolow_synobios]
[ 992.718160] [<ffffffffa0ca6e27>] ? synobios_ioctl+0x4f7/0x10f0 [bromolow_synobios]
[ 992.720141] [<ffffffff8111120e>] ? dput+0x1e/0x2a0
[ 992.721419] [<ffffffff811193f3>] ? mntput_no_expire+0x13/0x130
[ 992.722959] [<ffffffff81109bb6>] ? path_openat.isra.45+0x146/0x4f0
[ 992.724592] [<ffffffff810d243e>] ? handle_mm_fault+0x13e/0x2a0
[ 992.726132] [<ffffffff8110ac9f>] ? do_filp_open+0x2f/0x70
[ 992.727572] [<ffffffff8111120e>] ? dput+0x1e/0x2a0
[ 992.728848] [<ffffffff8110d0ce>] ? do_vfs_ioctl+0x20e/0x880
[ 992.730321] [<ffffffff8110d7c0>] ? SyS_ioctl+0x80/0xa0
[ 992.731687] [<ffffffff814afe32>] ? system_call_fastpath+0x16/0x1b
[ 992.733293] Code: 31 c0 48 8d 74 24 20 4c 89 e7 89 44 24 0c e8 54 40 ba ff 8b 44 24 0c 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 53 48 89 fb <f0> ff 0f 79 05 e8 b2 0c 00 00 65 48 8b 04 25 c0 a7 00 00 48 89
[ 992.741078] RIP [<ffffffff814abbc4>] mutex_lock+0x4/0x20
[ 992.742530] RSP <ffff880028e97d08>
[ 992.743455] CR2: 0000000c0000055c
[ 992.744455] ---[ end trace b3cdc3a5e3e34adb ]---
After that everything keeps working for a while, and then it falls over. The load average climbs gradually; at 100-150 swapping starts, and as it approaches 200 the system locks up, all while CPU usage stays at 0-5%. It is always triggered by SYNO.DSM.Info_2, and judging by the backtrace it has something to do with reading the CPU temperature. The bug can show up after 15 minutes or after a couple of days. I have checked the disks and the memory. I have tried newer hypervisor versions. On bare metal, without a hypervisor, I ran it for two or three days, but the fan noise wore me out :). This error did not occur there.
Has anyone run into this?
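In case it helps anyone compare their own logs against mine, here is a rough Python sketch that pulls the interesting bits out of an Oops like the one above: the faulting address, the function at the instruction pointer, the process name, and the call-trace frames. The function name `parse_oops`, the regexes, and the `SAMPLE` constant are my own illustration and only assume the 3.10-style log format shown in this post; other kernel versions may format these lines differently.

```python
import re
from typing import Optional

# Patterns assume the log format shown in this post (kernel 3.10.102).
FAULT_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")
IP_RE = re.compile(r"\bIP: \[<[0-9a-f]+>\] ([\w.]+)\+0x")
TASK_RE = re.compile(r"CPU: \d+ PID: (\d+) Comm: (\S+)")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] \? ([\w.]+)\+0x")


def parse_oops(text: str) -> Optional[dict]:
    """Return a summary of the first paging-request Oops in text, or None."""
    fault = FAULT_RE.search(text)
    if fault is None:
        return None
    ip = IP_RE.search(text)
    task = TASK_RE.search(text)
    return {
        "address": fault.group(1),
        "ip": ip.group(1) if ip else None,
        "pid": int(task.group(1)) if task else None,
        "comm": task.group(2) if task else None,
        # Call-trace frames in the order they appear in the log.
        "trace": FRAME_RE.findall(text),
    }


# A few lines copied from the log above, used as a quick self-check.
SAMPLE = """\
[ 992.620742] BUG: unable to handle kernel paging request at 0000000c0000055c
[ 992.622724] IP: [<ffffffff814abbc4>] mutex_lock+0x4/0x20
[ 992.677941] CPU: 1 PID: 21982 Comm: SYNO.DSM.Info_2 Tainted: P C O 3.10.102 #15284
[ 992.714191] [<ffffffff8137ab5b>] ? syno_cpu_temperature+0xfb/0x1d0
[ 992.715832] [<ffffffffa0caa21e>] ? GetCpuTemperatureDenlowI3Transfer+0xe/0x80 [bromolow_synobios]
[ 992.718160] [<ffffffffa0ca6e27>] ? synobios_ioctl+0x4f7/0x10f0 [bromolow_synobios]
"""

if __name__ == "__main__":
    print(parse_oops(SAMPLE))
```

To try it on a real box, save the output of `dmesg` to a file and pass its contents to `parse_oops`; if the `comm` and the first `trace` frames match mine, it is probably the same problem.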