XPEnology Community

Problem with quad-port GbE NIC based on Realtek RTL8111G


asheenlevrai

Question

Hi

 

In order to explain my problem more easily, let me list here the xpen machines I use for reference:

 

machineA (shutdown issue reported here)

DS3617xs 1.03b

MB : ga-z68ap-d3 rev2.0 (onboard NIC is dead and disabled in BIOS)

CPU : i5 3470

RAM : 2x 2GB DDR3 1066MHz

NIC : GbE AIC based on Broadcom BCM95722A2202G

sATA : 5ports sATA3 AIC based on jmb585 (2 disks on MB sATA3 ports, 5 disks on AIC)

 

machineB (instability issue reported here)

DS3617xs 1.03b

MB :  Asus p8z77-m (onboard NIC disabled in BIOS)

CPU : i7 3770k

RAM : 4x 4GB DDR3 1066MHz

NIC : PCIe2.0 x4 quad-port GbE NIC based on Realtek RTL8111G

sATA : 5ports sATA3 AIC based on jmb585 (2 disks on MB sATA3 ports, 4 disks on AIC)

 

machineC (works OK)

DS918+ 1.04b

MB : ga-z87-hd3

CPU : i7 4770s

RAM : 2x 2GB DDR3 1333MHz

NIC : onboard GbE

sATA : 5ports sATA3 AIC based on jmb585 (6 disks on MB sATA3 ports, 2 disks on AIC)

 

machineD (for testing, OK)

DS918+ 1.04b

MB : optiplex 9020 sff (onboard NIC disabled in BIOS. The i217-LM doesn't work in Windows 10: Error 10, driver issue)

CPU : i7 4770

RAM : 1x 4GB DDR3 1600MHz

NIC : random GbE AIC (works) or PCIe2.0 x4 quad-port GbE NIC based on Realtek RTL8111G (doesn't)

sATA : onboard (1 disk)

 

After installing and configuring machineB, I noticed I had stability issues. The machine would randomly become unresponsive and Synology Assistant would no longer detect it. Following a hard reboot to revive it, there was no message mentioning that it had been shut down improperly, which has me thinking the system froze rather than just lost its Ethernet connection. At first I thought it might be due to the bond connection (LACP), so I disabled that and started testing with only 1 port (without LACP). The problem remains when using only 1 port at a time.

 

If I put a random GbE AIC or use the onboard GbE, it seems to work alright.

 

Then I needed to test the quad-port GbE AIC. I tried it (actually tried them, since I have 2 units) in Windows 10, and they seem to work alright.

Then I tried using this card (using only 1 port) in machineA and machineC, and when installing machineD. This card won't work in any of these!?!

For machineA and machineC, I know DSM starts up but doesn't get any Ethernet connectivity since a hard reboot leads to a report of improper shutdown after I put back the original GbE NIC.

For machineD, I couldn't install using this card (I tried both ds3617xs 1.03b and ds918+ 1.04b). I also tried installing with a random GbE NIC and then replacing it with the quad-port NIC, but again, it doesn't work (no Ethernet connectivity).

 

Declaring mac1 to mac4 in grub.cfg seems to have no effect functionally for the tests I am currently running.

 

I don't understand how on earth I am able to use this quad-port card on machineB. What is different with this HW compared to the 3 other rigs?

 

Thank you very much in advance for your help.

 

Best,

-a-


12 answers to this question


  • 0

A few comments:

  1. There are two times when network connectivity matters - first, on the initial boot for the install, and second, when DSM finally boots after install.  Just because it works for install doesn't mean it will work when the DSM flavor of Linux is initialized.
     
  2. If DSM boots post-install and you observe connectivity, it isn't "lost" due to instability unless you have a NIC hardware failure, which is incredibly unlikely.
     
  3. System instability is not a typical problem with DSM.  If that is occurring, I would check 1) memory, 2) don't overclock and 3) system and CPU cooling.

Now this:

 

5 hours ago, asheenlevrai said:

machineB (instability issue reported here)

DS3617xs 1.03b

MB :  Asus p8z77-m (onboard NIC disabled in BIOS)

CPU : i7 3770k

RAM : 4x 4GB DDR3 1066MHz

 

In your other thread, you didn't post the DIMM configuration, but I note that this is the only one of the four machines that has four DIMMs.  Most desktop systems are less stable with 4 DIMMS instead of 2, and may need a voltage bump or more conservative speed/timing settings to remain stable.  Have you run hardware/memory/stress tests on this box?  Have you tried pulling two of the DIMMs to see if it is then stable?  No idea if this is your issue, but never assume the hardware is working great, especially if it is repurposed/old.

 

5 hours ago, asheenlevrai said:

I don't understand how on earth I am able to use this quad-port card on machineB. What is different with this HW compared to the 3 other rigs?

 

No real clue here.  Make sure you are plugging it into a real 4x slot (many motherboards have long slots wired for 1x or 2x to save PCI lanes). Did you add extra.lzma on the system that works?  Also you have two different DSM platforms in play (DS3617xs and DS918+).  Nothing wrong with that, but variability is doubled for troubleshooting.
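If you want to confirm the slot width rather than trust the board manual, one option is to boot a Linux live environment and compare the `LnkCap` (what the device supports) and `LnkSta` (what was actually negotiated) lines that `lspci -vv` prints for the card. Below is a minimal sketch that pulls the two widths out of that text. The helper name `link_widths` and the sample text are illustrative only, not captured from any of the machines in this thread.

```python
import re

def link_widths(lspci_vv_text):
    """Return (capable_width, negotiated_width) from one device's `lspci -vv` text.

    Either value is None when the corresponding line is missing.
    """
    cap = re.search(r"LnkCap:.*?Width x(\d+)", lspci_vv_text)
    sta = re.search(r"LnkSta:.*?Width x(\d+)", lspci_vv_text)
    return (int(cap.group(1)) if cap else None,
            int(sta.group(1)) if sta else None)

# Illustrative excerpt (hypothetical values, not real output from this card)
sample = """\
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1
LnkSta: Speed 5GT/s, Width x1
"""

print(link_widths(sample))  # (4, 1)
```

A negotiated width lower than the capable width would suggest the slot is wired for fewer lanes than its physical length implies.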


  • 0

Thank you very much for your help :D

 

10 hours ago, flyride said:

A few comments:

  1. There are two times when network connectivity matters - first, on the initial boot for the install, and second, when DSM finally boots after install.  Just because it works for install doesn't mean it will work when the DSM flavor of Linux is initialized

I understand that. Makes sense. Thanks

 

10 hours ago, flyride said:
  1. System instability is not a typical problem with DSM.  If that is occurring, I would check 1) memory, 2) don't overclock and 3) system and CPU cooling

 

 

You're right. Memory should be checked, I guess. How? (See below.)

I don't overclock.

How would you suggest I monitor/log system and CPU cooling? I have alerts set up in the BIOS for when the CPU goes above 80°C, but I never heard that beep. Or at least I wasn't next to the machine to hear it.

 

10 hours ago, flyride said:

Most desktop systems are less stable with 4 DIMMS instead of 2, and may need a voltage bump or more conservative speed/timing settings to remain stable.

Do you mean most PCs? Or most DSM systems?

 

10 hours ago, flyride said:

Have you run hardware/memory/stress tests on this box?  Have you tried pulling two of the DIMMs to see if it is then stable?  No idea if this is your issue, but never assume the hardware is working great, especially if it is repurposed/old.

I haven't. I will. How would you recommend I do that? Using something like SystemRescue or UltimateBootCD, for instance?

 

10 hours ago, flyride said:

No real clue here.  Make sure you are plugging it into a real 4x slot (many motherboards have long slots wired for 1x or 2x to save PCI lanes). Did you add extra.lzma on the system that works?  Also you have two different DSM platforms in play (DS3617xs and DS918+).  Nothing wrong with that, but variability is doubled for troubleshooting.

Yes, I use real PCIe x4 slots (or an x16 slot running at x4), and none of them drops to a lower width when another AIC is present in a slot sharing the same PCIe lanes.

 

I didn't use extra.lzma in any of my machines.

 

I started with machineC and chose 918+ because I had a Haswell CPU available. Then I went on with machineA & B and had to use 3617xs since their CPUs wouldn't support 918+. For machineD, I actually tried both (as I mentioned in the OP). This one is for testing only right now. Anyway, for some obscure reason this quad-port NIC I need to use doesn't seem to work on any hardware other than machineB, which is very surprising and inconvenient to me. It doesn't make a whole lot of sense, right?

 

Once again, thank you very much for your highly valuable help. I appreciate it.

Best,

-a-

 


  • 0

I am currently running MemTest86+ v5.01 from SystemRescue.

I have no experience with that so I am not sure how to interpret the results.

So far, 2 passes, 0 errors. I don't know what that means (how many passes with no errors is considered significant?) but at least it's not a bad sign.

CPU temp during the test: 52-55°C

 

EDIT: at least 8 passes are required apparently

Edited by asheenlevrai

  • 0
On 7/8/2021 at 5:24 PM, asheenlevrai said:

After installing and configuring machineB, I noticed I had stability issues. The machine would randomly become unresponsive and Synology Assistant would no longer detect it. Following a hard reboot to revive it, there was no message mentioning that it had been shut down improperly, which has me thinking the system froze rather than just lost its Ethernet connection.

I made a mistake.

Actually, I DO get an error message mentioning that the system shut down improperly after a hard reset (required because the machine became unresponsive).

I guess this tells me that DSM was still running (the system didn't freeze) but somehow lost its network connection, right?


  • 0

I tried the following after machineB lost network connection again

(2 RAM modules, quad-port NIC in bond connection, no longer detected in Synology Assistant)

 

 - I went into the BIOS and re-enabled the onboard NIC

 - I searched for the NAS with Synology Assistant

 - The NAS was now found twice:

1x for the quad-port NIC -> connection failed /ready

1x for the onboard NIC -> ready

 - I could only connect to the NAS using the IP for the onboard NIC

 - In DSM both the bond connection (quad-port NIC) and the new LAN5 (onboard GbE) are indicated as OK and connected.

 - After a reboot, Synology Assistant only detects the NAS once (LAN5). No longer via the quad-port NIC. Although DSM still reports the bond connection as connected.
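Since DSM reports the bond as connected while nothing is reachable through it, it may help to look at what the Linux bonding driver itself reports per port. The sketch below assumes DSM exposes the standard Linux status file `/proc/net/bonding/bond0` over SSH (I have not verified this on these machines). The helper name `bond_slave_status` and the sample text are illustrative only.

```python
def bond_slave_status(bonding_text):
    """Map each slave interface to its MII status.

    Expects the text of a Linux /proc/net/bonding/<bond> file. The first
    "MII Status" line belongs to the bond itself and is skipped.
    """
    status = {}
    current = None
    for line in bonding_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            status[current] = value
    return status

# Illustrative excerpt (hypothetical, not captured from machineB)
sample = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""

print(bond_slave_status(sample))  # {'eth0': 'up', 'eth1': 'down'}
```

If every port shows "up" while the NAS is still unreachable through the bond, the problem is more likely in the driver or on the switch side (LACP negotiation) than in the physical link.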


Edited by asheenlevrai

  • 0

As mentioned in this thread, I ordered 2 quad-port NICs based on the Intel i350.

One for "machineB" and one for a future setup.

 

Both of them work in random Windows PCs I have around.

Neither of them works in machineB. I mean the system won't even POST with this NIC (while it works OK with a single-port NIC).

 

This is a nightmare... but it's not related to XPEnology in this case, I guess, since the system won't even POST...

Edited by asheenlevrai

  • 0

By reading a bit more, I just realized that grub.cfg contains the following argument:

 

set netif_num=1

 

I wonder if this could be the source of all my problems with my NICs.

I never touched it (left "=1" although I have multiple LAN ports).

 

Where can I find more information about what it does and how I am supposed to set it up?

Especially when the onboard LAN is disabled. Does it still count as one or not?
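In case it helps anyone checking the same thing, here is a minimal sketch that compares `netif_num` with the number of `macN` entries declared in a grub.cfg. It rests on an assumption not confirmed anywhere in this thread: that `netif_num` is meant to equal the number of declared MAC addresses. The helper name `check_netif_num` and the MAC values are placeholders.

```python
import re

def check_netif_num(grub_cfg_text):
    """Return (netif_num, mac_count, consistent) for a grub.cfg body.

    Assumption: netif_num should equal the number of `set macN=` lines.
    Commented-out lines are ignored.
    """
    netif = None
    macs = set()
    for line in grub_cfg_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue
        m = re.match(r"set\s+netif_num=(\d+)", line)
        if m:
            netif = int(m.group(1))
            continue
        m = re.match(r"set\s+mac(\d+)=(\S+)", line)
        if m:
            macs.add(int(m.group(1)))
    return netif, len(macs), netif == len(macs)

# Placeholder MAC addresses, not real ones
sample = """\
set mac1=001122334455
set mac2=001122334456
set mac3=001122334457
set mac4=001122334458
set netif_num=1
"""

print(check_netif_num(sample))  # (1, 4, False)
```

With four `macN` lines and `netif_num=1`, as in my current setup, the check flags a mismatch. Whether that mismatch actually matters is exactly what I am asking.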

 

Thanks a lot for your help.

best,

-a-
