asheenlevrai Posted June 21, 2021 (edited) #1

Hi,

I am currently setting up an Xpen rig with the following specs:
- Z77-based MB
- i7 3770K
- DDR3 1066 RAM
- 2x 120GB SSDs (RAID1 array -> volume1)
- 4x 3TB HDDs (RAID5 array -> volume2)
- HDDs are connected to a JMB585-based 5-port SATA controller (PCIe 3.0 x4), while SSDs are connected to the onboard SATA3 (6Gbps) ports
- quad-port GbE NIC AIC (PCIe 2.0 x4) -> Bond connection, 4x GbE using LACP (network infrastructure is compatible with LACP)
- 550W PSU

I used DS3617xs (loader 1.03b), DSM 6.2.3-25426 Update 3.

Unfortunately, this setup is unstable. On occasion, the connection is lost. The rig is no longer detected by Synology Assistant either. All I can do is a hard reset, and then it comes back online.

I wonder if it could be a problem with the network connection or if it could result from an issue with sleep management. I already tried disabling memory compression and HDD hibernation.

What should I do next to troubleshoot this problem?

Thank you very much in advance for your help.

Best,
-a-

PS: after the last hard reset, DSM started data scrubbing on Volume2 upon reboot. Is this useful to locate the origin of the issue?

Edited June 21, 2021 by asheenlevrai

asheenlevrai Posted June 23, 2021 (edited) #2

I tried using only one Ethernet cable (normal GbE, no longer using LACP aggregation), but the problem remains.

I am now testing with a different (single-port) GbE NIC.

Edited June 23, 2021 by asheenlevrai

nemesis122 Posted June 23, 2021 #3

Hi, I had the same issue with 3617 and 1.03b. Try 3615 with 1.03b and all is fine.

asheenlevrai Posted June 23, 2021 (edited) #4

3 hours ago, nemesis122 said:
> Hi I had the same issue with 3617 and 1.03b try 3615 with 1.03b and all is fine.

Thanks @nemesis122 :s

Well it would be quite inconvenient to reset the whole system up, right? I mean, there is no way to go straight from ds3617xs to ds3615xs without erasing the disks, right?

Note: so far the test with the other NIC (single-port GbE) didn't lead to any connection failure. This seems to indicate that the problem is caused by the quad-port NIC.
- driver?
- configuration/setting?
- hw? This quad-port NIC is brand new.

What is different between ds3617xs and ds3615xs?

Thanks again.
-a-

Edited June 23, 2021 by asheenlevrai

flyride Posted June 23, 2021 #5

2 hours ago, asheenlevrai said:
> Well it would be quite inconvenient to reset the whole system up, right? I mean, there is no way to go straight from ds3617xs to ds3615xs without erasing the disks, right?

You can switch platforms; it's called a migration install.

2 hours ago, asheenlevrai said:
> What is different between ds3617xs and ds3615xs?

https://xpenology.com/forum/topic/13333-tutorialreference-6x-loaders-and-platforms/

asheenlevrai Posted June 24, 2021 #6

9 hours ago, flyride said:
> You can switch platforms, it's called a migration install.

Thanks. For some reason I keep forgetting that a migration does not necessarily have to go from one model to a newer one. 🥴

9 hours ago, flyride said:
> https://xpenology.com/forum/topic/13333-tutorialreference-6x-loaders-and-platforms/

The only difference in this table (AFAICT) is the max number of CPU threads. Since my CPU is 4c/8t, this won't be limiting.

Now, I wonder why the NIC would be problematic on a ds3617xs rig and work fine on a ds3615xs rig, but I guess it would be easier to migrate and see what happens than to actually troubleshoot the issue with the ds3615xs.

Note: so far the test with the alternative NIC (single-port GbE) still didn't lead to any connection failure.

IG-88 Posted June 24, 2021 #7

10 hours ago, asheenlevrai said:
> The only difference in this table (AFAICT) is the max number of CPU threads. Since my CPU is 4c/8t, this won't be limiting.

There are some differences in the default drivers from Synology, like newer LSI SAS drivers and newer Mellanox 10G NIC drivers, but the kernel with DSM 6.2 is the same for both.

With 7.0 there will be more differences: 3617 gets the 4.4 kernel like 918+ has, and 3615 stays on the 3.10 kernel. Whether that is of importance depends on what loader(s) we might see for 7.0 (atm 6.2.4 and 7.0 are off limits with loader 1.03b/1.04b).

10 hours ago, asheenlevrai said:
> Now, I wonder why the NIC would be problematic on a ds3617xs rig and work fine on a ds3615xs rig, but I guess it would be easier to migrate and see what happens than to actually troubleshoot the issue with the ds3615xs note: so far the test with the alternative NIC (single port GbE) still didn't lead to any connection failure.

Did you check that the hardware of the 4-port NIC is reliable? Maybe boot up a live Linux and copy some data (that way anything hardware-related is the same as with DSM).

asheenlevrai Posted June 24, 2021 #8

3 hours ago, IG-88 said:
> did you check that the hardware of the 4port nic is reliable? maybe boot up a live linux and copy some data (that way anything hardware related is the same as with dsm)

Hardware? This quad-port NIC is brand new. But I'll check that too.

Thanks

asheenlevrai Posted June 30, 2021 #9

I'm wondering: how can I tell whether my problem really comes from a loss of connection rather than from a crash of the device?

asheenlevrai Posted July 5, 2021 (edited) #10

On 6/23/2021 at 11:36 PM, flyride said:
> You can switch platforms, it's called a migration install.
> https://xpenology.com/forum/topic/13333-tutorialreference-6x-loaders-and-platforms/

Hi @flyride

I'm currently trying to migrate from 3617xs to 3615xs in order to see if this solves my NIC problems on this rig (as suggested by @nemesis122 in the 3rd post of this thread).

I made a new USB loader using 1.03b for 3615xs (I assumed it would be better than using 1.02b and DSM 6.1.x, right?).

I rebooted using this USB dongle, and Synology Assistant detects the rig as migratable.

Then I start the migration process by providing the .pat file I could find here:
https://archive.synology.com/download/Os/DSM/6.2.3-25426-3 -> synology_bromolow_3615xs.pat

I get an error 13 (file corrupted).

I guess that's a noob mistake, but I cannot figure out what I'm doing wrong...

Please help

Best,
-a-

Edited July 5, 2021 by asheenlevrai

flyride Posted July 5, 2021 #11

Error 13 is usually a vid/pid mistake.

asheenlevrai Posted July 5, 2021 #12

Thanks @flyride

I paid attention to using the right vid & pid when making the USB loader for 3615xs, but I must have made a mistake somewhere, I guess.

I'll try fixing that by pressing C at boot and re-entering vid and pid. If it still fails, I'll re-make the dongle.

asheenlevrai Posted July 6, 2021 #13

8 hours ago, asheenlevrai said:
> Thanks @flyride I paid attention using the right vid & pid when making the USB loader for 3615xs but I must have made a mistake somewhere I guess. I'll try fixing that by pressing C at boot and re-entering vid and pid. If it still fails, I'll re-make the dongle.

I did all that. I still get error 13, even after re-making the USB dongle.

asheenlevrai Posted July 6, 2021 #14

vid is 058F

Should I rather have put 058f into grub.cfg? Could the capital F be misinterpreted?

flyride Posted July 6, 2021 #15

No, but many people forget the prefix "0x", i.e. 0x058F

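The "0x" prefix check described above can be sketched as a small script. This is only a minimal sketch under stated assumptions: it assumes the loader's grub.cfg stores the values as `set vid=...` and `set pid=...` lines (the same `set key=value` form as the `set netif_num=1` line quoted later in this thread), and the helper names `parse_assignments` and `check_usb_ids` as well as the pid value in the sample are hypothetical placeholders.

```python
import re

# Assumed line format: "set key=value" (hypothetical helper, not part of any loader tool).
ASSIGNMENT = re.compile(r"^\s*set\s+(\w+)=(\S+)\s*$")
# A USB ID is expected as "0x" followed by exactly four hex digits, in either case.
USB_ID = re.compile(r"0x[0-9A-Fa-f]{4}")


def parse_assignments(cfg_text):
    """Collect every 'set key=value' line of the given text into a dict."""
    values = {}
    for line in cfg_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def check_usb_ids(cfg_text):
    """Return a list of human-readable problems found with vid and pid."""
    values = parse_assignments(cfg_text)
    problems = []
    for key in ("vid", "pid"):
        value = values.get(key)
        if value is None:
            problems.append(f"{key} is not set")
        elif not USB_ID.fullmatch(value):
            problems.append(f"{key}={value} is not 0x followed by 4 hex digits")
    return problems


# Sample text: the vid comes from this thread, the pid is a made-up placeholder
# that deliberately lacks the prefix.
sample = """
set vid=0x058F
set pid=1234
"""
print(check_usb_ids(sample))
```

An empty list means both values have the expected shape; it does not prove they match the actual USB stick.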
asheenlevrai Posted July 6, 2021 #16

51 minutes ago, flyride said:
> No, but many people forget the prefix "0x" i.e. 0x058F

Thanks. I was careful not to forget the 0x.

asheenlevrai Posted July 7, 2021 #17

After reading this, I wondered if the USB medium could be the source of the error 13. So I tried burning the loader image onto a USB dongle that I know previously worked in another Xpen rig.
-> I still got the error 13, though.

Now, for my 3617xs rigs I used the serial generator from here. It worked OK. I'm wondering if the version for 3615xs might return invalid serials or something, and maybe this leads to error 13?

Tx
-a-

asheenlevrai Posted July 8, 2021 #18

On 7/5/2021 at 4:00 PM, asheenlevrai said:
> Then I start the migration process by providing the .pat file I could find here: https://archive.synology.com/download/Os/DSM/6.2.3-25426-3 -> synology_bromolow_3615xs.pat I get an error 13 (file corrupted).

I figured out that this was the origin of the problem for the migration install.

I shouldn't use this file for DSM 6.2.3-25426-3 but rather the install file for DSM 6.2.3-25426 (no Update 3):
https://archive.synology.com/download/Os/DSM/6.2.3-25426 -> DSM_DS3615xs_25426.pat

🤪

Then -> no error 13 -> migration OK

flyride Posted July 8, 2021 #19

The -3 indicates a patch. Look at the file sizes: one is 273MB and the other is 44MB.

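The file-size hint above can be turned into a quick sanity check before starting an install. This is a rough sketch, not a rule from Synology: the 100 MB cut-off is an arbitrary assumption chosen only because it sits between the two sizes mentioned in this thread (273MB and 44MB), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption for illustration only: anything below this size is treated as an
# update patch rather than a full install image.
PATCH_SIZE_LIMIT_MB = 100


def classify_pat_size(size_bytes, limit_mb=PATCH_SIZE_LIMIT_MB):
    """Guess from the size alone whether a .pat file is a full image or a patch."""
    size_mb = size_bytes / (1024 * 1024)
    return "update patch" if size_mb < limit_mb else "full install image"


def classify_pat_file(path):
    """Same guess, reading the size of a file on disk."""
    return classify_pat_size(Path(path).stat().st_size)


# The two sizes quoted in this thread.
print(classify_pat_size(273 * 1024 * 1024))
print(classify_pat_size(44 * 1024 * 1024))
```

A size check like this is only a heuristic; the archive folder name (with or without the -3 suffix) remains the more direct indicator.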
asheenlevrai Posted July 8, 2021 (edited) #20

Yes @flyride, I didn't pay attention. My bad... 🙏

Edited July 8, 2021 by asheenlevrai

asheenlevrai Posted August 10, 2021 #21

By reading a bit more, I just realized that grub.cfg contains the following argument:

set netif_num=1

I wonder if this could be the source of all my problems with my NICs. I never touched it (I left "=1" while I have multiple LAN ports).

Where can I find more information about what it does and how I am supposed to set it up? Especially when the onboard LAN is disabled: does it still count as one or not?

Thanks a lot for your help.

best,
-a-

asheenlevrai Posted August 11, 2021 (edited) #22

OK... AFAICU from this post, set netif_num= should not matter too much, since it should automagically be corrected according to what is declared as set mac1=, set mac2=, etc.

(I still wonder what would happen if grub.cfg sets more MAC addresses than there are actual physical LAN ports, though. For instance, if 4 MACs are set in grub.cfg while only one LAN port is present. In this case netif_num would be 4.)

Anyways. I decided to migrate the rig to 1.04b (918+) using supported hardware:
- 4770k
- z87-express based MB
- Same PSU, same disks and same RAM
- no more PCIe SATA controller required
- Same PCIe quad-port NIC (based on RTL8111G chipset)
- the onboard LAN is disabled in BIOS

After migration, DSM only detects 2 LAN ports out of the 4. This is new. A new problem...

I'm currently testing whether LACP over 2 ports is stable or not (single-port Ethernet was stable for 24h, which still might be luck).

I tried another unit of the same quad-port NIC -> same thing (only 2 LAN ports detected)

I tried another quad-port NIC (based on i350) -> same thing (only 2 LAN ports detected)

Any ideas?

Thank you very much in advance for your help.

best,
-a-

Edited August 11, 2021 by asheenlevrai

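On the netif_num question above, a small script can at least show whether the declared count and the number of macN entries in a grub.cfg agree. This is a minimal sketch under stated assumptions: it only reads `set netif_num=`, `set mac1=`, `set mac2=`, ... lines as text and says nothing about what the loader actually does with them. The helper names and the MAC values in the sample are hypothetical placeholders.

```python
import re

# Assumed line format: "set key=value" (hypothetical helper, not part of any loader tool).
ASSIGNMENT = re.compile(r"^\s*set\s+(\w+)=(\S+)\s*$")
MAC_KEY = re.compile(r"mac(\d+)")


def summarize_nics(cfg_text):
    """Return (netif_num, {index: mac}) as declared in the given grub.cfg text."""
    netif_num = None
    macs = {}
    for line in cfg_text.splitlines():
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "netif_num":
            netif_num = int(value)
        else:
            mac_match = MAC_KEY.fullmatch(key)
            if mac_match:
                macs[int(mac_match.group(1))] = value
    return netif_num, macs


def describe(cfg_text):
    """Report whether netif_num matches the number of macN entries."""
    netif_num, macs = summarize_nics(cfg_text)
    if netif_num == len(macs):
        return f"consistent: netif_num={netif_num}, {len(macs)} MAC(s) declared"
    return f"mismatch: netif_num={netif_num}, {len(macs)} MAC(s) declared"


# Placeholder MAC values, for illustration only.
sample = """
set netif_num=1
set mac1=001132000001
set mac2=001132000002
"""
print(describe(sample))
```

Comparing this count with the number of ports DSM actually detects is a separate step that the sketch does not cover.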
asheenlevrai Posted August 13, 2021 #23

@flyride, @IG-88 have you ever heard of quad-port NICs where only 2 ports are detected?

Thanks
-a-

flyride Posted August 13, 2021 #24

On the machine you are testing the card with, is it DS918+? If so, did you set maxlanport?

asheenlevrai Posted August 13, 2021 #25

Thanks

Yes, I migrated from 1.03b 3617xs to 1.04b 918+.

I don't know what maxlanport is, so I didn't change anything. I'll google that and see if I can find information. Or maybe you can point to a link if you have time.

Thanks a lot
-a-