10GbE NIC issues / suddenly died

Hey there brilliant minds.
We have 2 of the brilliant HL15’s at our little audio studios. Love these things. Both are pretty identical using the X11SPH-nCTF motherboard.
Both on truenas scale.
Anyway - working away at various bits of tech today, I got a message that our primary NAS went offline.
After a little sniffing around, I noticed there was no activity on our switch connected to the NAS - and then NO activity on the eth01 port on the back of the NAS.
Replaced cable. Changed port (even plugged in working cable from the second nas to make sure it wasn’t just super unlucky.)
So - dead ports.
While sniffing around and before I’d put much of a plan together, I noticed the machine rebooted itself 4 times. But hasn’t now in the last couple hours.

I’m connected to the console only - but at least I could get into that.

Drive pool seems healthy.

But there are some strange things coming up from the little snooping I know how to do. (OK, I know that I don’t know enough about what I’m doing, but I’m giving it a go…)

So…

Here’s what I’ve found out.

ip -br link | egrep "eno1|eno2"

2: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000

 ethtool eno1 | egrep "Link detected|Speed|Duplex|Auto-negotiation"

        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: off
        Link detected: no

dmesg -T | egrep -i "i40e|eno1|eno2|unsupported SFP|Rx/Tx is disabled|link"

[Wed Feb 11 16:06:57 2026] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[Wed Feb 11 16:06:57 2026] audit: initializing netlink subsys (disabled)
[Wed Feb 11 16:06:57 2026] pci 0000:04:00.0: 31.504 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x4 link at 0000:00:1d.0 (capable of 63.012 Gb/s with 16.0 GT/s PCIe x4 link)
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKA configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKB configured for IRQ 10
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKC configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKD configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKE configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKF configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKG configured for IRQ 11
[Wed Feb 11 16:06:57 2026] ACPI: PCI: Interrupt link LNKH configured for IRQ 11
[Wed Feb 11 16:06:58 2026] i40e: Intel(R) Ethernet Connection XL710 Network Driver
[Wed Feb 11 16:06:58 2026] i40e: Copyright (c) 2013 - 2019 Intel Corporation.
[Wed Feb 11 16:06:58 2026] i40e 0000:67:00.0: fw 5.5.67510 api 1.12 nvm 5.50 0x800032e8 1.3082.0 [8086:37d2] [15d9:37d2]
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.0: MAC address: 7c:c2:55:ab:f0:a8
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.0: FW LLDP is enabled
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.0: Added LAN device PF0 bus=0x67 dev=0x00 func=0x00
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.0: Features: PF-id[0] VFs: 32 VSIs: 66 QP: 20 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.1: fw 5.5.67510 api 1.12 nvm 5.50 0x800032e8 1.3082.0 [8086:37d2] [15d9:37d2]
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.1: MAC address: 7c:c2:55:ab:f0:a9
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.1: FW LLDP is enabled
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.1: Added LAN device PF1 bus=0x67 dev=0x00 func=0x01
[Wed Feb 11 16:06:59 2026] i40e 0000:67:00.1: Features: PF-id[1] VFs: 32 VSIs: 66 QP: 20 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[Wed Feb 11 16:06:59 2026] ata2: SATA link down (SStatus 0 SControl 300)
[Wed Feb 11 16:06:59 2026] ata1: SATA link down (SStatus 0 SControl 300)
[Wed Feb 11 16:06:59 2026] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:06:59 2026] ata10: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Wed Feb 11 16:07:00 2026] i40e 0000:67:00.0 eno1: renamed from eth0
[Wed Feb 11 16:07:00 2026] i40e 0000:67:00.1 eno2: renamed from eth1
[Wed Feb 11 16:23:20 2026] i40e 0000:67:00.0: Rx/Tx is disabled on this device because an unsupported SFP module type was detected.
[Wed Feb 11 16:23:20 2026] i40e 0000:67:00.0: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for a list of supported modules.
[Wed Feb 11 16:34:16 2026] i40e 0000:67:00.1: Rx/Tx is disabled on this device because an unsupported SFP module type was detected.
[Wed Feb 11 16:34:16 2026] i40e 0000:67:00.1: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for a list of supported modules.
[Wed Feb 11 16:47:54 2026] i40e 0000:67:00.0: Rx/Tx is disabled on this device because an unsupported SFP module type was detected.
[Wed Feb 11 16:47:54 2026] i40e 0000:67:00.0: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for a list of supported modules.

lspci -nn | egrep -i "ethernet|network"

67:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GBASE-T [8086:37d2] (rev 09)
67:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X722 for 10GBASE-T [8086:37d2] (rev 09)

ethtool -i eno1
driver: i40e
version: 6.6.32-production+truenas
firmware-version: 5.50 0x800032e8 1.3082.0
expansion-rom-version:
bus-info: 0000:67:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

So - of all that (commands from other folks on the Proxmox forum, plus some attempts to get ChatGPT to help me with my lack of Linux knowledge… gah!), the strangest bit to me is where it reports that an unsupported SFP module type was detected, yet my motherboard only has 10GBASE-T ports and NO SFP connections.

Anyhow. I’m at a bit of a loss. I can SEE that my pools are all healthy, so there’s that. Nothing strange is coming up in the logs other than 4 restarts about 2 hours ago. No temp issues.

Can anyone suggest a way forward?

We do have a second nas - which itself was just rebuilt - so the quick way forward is for me to transfer drives into that HL15 and get the storage pools back online there. I think. And then figure out how we will do backups while this NAS is offline.

I feel like it’s a hardware failure, since it was so sudden and the machine has been working well for the last 18 months.

Does anyone else have any other suggestions?

You could rule out software by booting some other OS like a Live USB version of Ubuntu or some other Linux. Maybe not Debian if we’re trying to eliminate TrueNAS (Debian) driver updates. But, I suspect you are right about it being a hardware failure. Something may have corrupted or killed the NVM (non-volatile memory) chip, like heat. Maybe inspect the motherboard for any loose heatsinks or discoloration.

It sounds lame, but the first and easiest thing to try is a deep power cycle. Because NICs have wake-on-LAN features, a simple reboot doesn’t always do a full reset of their state. Shut down the server, unplug the power cable entirely or turn off the external switch on the PSU, and hold the main power button for 15-30 seconds to drain all capacitors. Wait 5 minutes, then reconnect and boot.

Check NVM Integrity. If the card is visible in a Live Linux environment, try to read its info using Intel’s nvmupdate utility (bootable ISO or Linux command line version). Run the utility in “inventory” mode (usually just running the executable without flags will scan). If the utility hangs, crashes, or reports “Flash update failed” / “Access denied,” the NVM is likely corrupted.

Since the X722 is part of the Platform Controller Hub (PCH), a failure there could indicate a dying chipset. It is often isolated to just the network lanes, and you could install a PCIe NIC to replace the dead ports. But the failure might be a harbinger of a broader PCH failure, so I’d keep an eye on the SATA and USB connections to make sure they all still seem OK. The rebooting was probably a sign of whatever component it was being in the process of dying, developing a short or whatever.
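
One way to keep an eye on that, as a minimal sketch: filter the kernel log for the usual SATA and USB trouble signs. The function name `pch_warnings` and the pattern list are my own guesses at what is worth flagging, not an exhaustive or official list. "SATA link down" is deliberately left out, because empty ports (like ata1 and ata2 in the log above) report that on every boot.

```shell
#!/bin/sh
# Hypothetical filter: reads kernel log lines on stdin and prints only the
# ones that hint at SATA or USB trouble. The patterns are illustrative
# guesses; adjust them to whatever shows up on your own system.
pch_warnings() {
    grep -E -i 'ata[0-9]+(\.[0-9]+)?: .*(hard resetting|failed command|exception|error)|usb [0-9.-]+: .*(disconnect|error|reset)'
}

# Usage (as root):
#   dmesg -T | pch_warnings
```

No output means none of those patterns matched, which is good news but not proof that the chipset is healthy.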

Sometimes re-flashing the BIOS can help in cases where memory or firmware has become corrupted, but I suspect that won’t help in this case.

I’m not sure about RMA and all that.

Thanks for this.

I have only really had time to do a deep power cycle. Good idea, but the NICs did not come back online.

I was able to export our main project pool (10 drives) and import it into our second HL15. Always a little heart-in-mouth stuff, but it works, and we have backups, but it’s not pleasant.

Now I just need to set up all the access on our second NAS to match the main NAS. It was always just used for backups (and deliberately the same hardware for this very type of occasion!)

I’ll get to some of your other things to look at in a bit. There’s a lot in your message that I don’t know, but nothing that some web searches and just messing about won’t likely help me figure out. Thanks for the suggestions. It will be good to somehow actually nail down what part of the motherboard has died. The chipset, as you said, feels like a possibility.

I’m sorry to hear you’re having issues with one of your HL15 servers, and thank you for being a part of the 45Drives community.

Please reach out to info@45homelab.com, and a support member will be able to assist you with this issue.

Certainly, reach out to 45HomeLab as suggested by Hutch.

Since you have two nearly identical HL15s, I think you’re actually in a lucky spot and can compare between the systems. You can run the same checks on the working HL15 to see which of these might be a red herring. If the other system also has dmesg messages about SFP modules but works, then you can have some confidence they’re unrelated to your immediate problem.
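
A rough sketch of that comparison, assuming both boxes run the same TrueNAS SCALE build and you can get a text file from one to the other (USB stick, or pasting from the console). The function names and the file paths in the usage comment are made up for illustration; the filter reuses the i40e patterns from the first post.

```shell
#!/bin/sh
# Strip the leading "[timestamp]" that dmesg -T prints, so logs captured at
# different times on the two machines can be compared line by line.
strip_dmesg_time() {
    sed -E 's/^\[[^]]*\] *//'
}

# Hypothetical helper: print the NIC-related kernel messages with the
# timestamps removed and duplicates collapsed. Reads the live kernel log,
# so it needs root.
nic_messages() {
    dmesg -T | grep -E -i 'i40e|unsupported SFP|Rx/Tx is disabled' \
        | strip_dmesg_time | sort -u
}

# Usage: run on each NAS, copy one file across, then compare
# (file names are placeholders):
#   nic_messages > /tmp/nic-primary.txt    # on the broken NAS
#   nic_messages > /tmp/nic-backup.txt     # on the working NAS
#   diff /tmp/nic-primary.txt /tmp/nic-backup.txt
```

The MAC address lines will differ between the two machines regardless. What matters is whether the "unsupported SFP module" lines show up only on the broken one.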
