I’m upgrading old hardware and TrueNAS (Core [Supermicro X9 MB] → Scale [HL15, Supermicro X11 MB]). Both systems have 32 GB RAM, 5 spinning disks in a RAIDZ2 config, a 500 GB SSD for log, and a 500 GB SSD for cache. Both datasets use “standard” sync.
Both systems have 10Gb SFP+ links to an aggregation switch, and iperf3 results show the links are fully functional.
A remote VM on a Proxmox system has two NFS mounts, one to CORE and one to SCALE, but the VM is limited to a 2.5 Gb/s link.
An fio test script (doing random and sequential read/write) shows markedly different results between the two NFS mounts:
CORE : ~145 MB/s write and 265 MB/s read, average IOPS 31,543
SCALE: ~15 MB/s write and 39 MB/s read, average IOPS 3,654
Running the fio random-write command directly on the CORE system shows 2.5 GB/s write and 659k average IOPS.
Running the same command directly on the SCALE system shows 98 MB/s write and 23,875 average IOPS.
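For context, an illustrative version of the kind of fio calls the script makes (paths, sizes, and queue depths here are placeholders, not my exact commands):

fio --name=randwrite --filename=/mnt/nfs-test/fio.dat --rw=randwrite --bs=4k --size=4G --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based --group_reporting
fio --name=seqread --filename=/mnt/nfs-test/fio.dat --rw=read --bs=1M --size=4G --ioengine=libaio --iodepth=16 --direct=1 --runtime=60 --time_based --group_reporting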
Any pointers on where to start looking to find the cause of the discrepancy?
Boy, I’m really struggling on this one. You’re seeing a big difference between the two so I have to believe something is wrong. I guess I’d start with these questions:
What parameters are you using with fio?
What drives are you using? Have you tried testing without the log and/or cache vdev?
Is the pool on the HL15 brand new? Is record size also the same on the two datasets?
Maybe use iostat and zpool iostat to confirm that the SLOG is actually being used on Scale. It seems like it is not. You might also want to check the autotrim setting on Scale and turn it off for this type of testing.
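For example, something along these lines while a test is running (pool name is a placeholder):

zpool iostat -v tank 2
zpool get autotrim tank
sudo zpool set autotrim=off tank

If the log vdev shows essentially no writes during a sync-write test, the SLOG isn’t being used.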
Also, for comparison testing you probably want to be sure you disable (or at least limit) the ARC and clear any file system cache before each test.
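One way to do that on the dataset under test (dataset name is a placeholder):

sudo zfs set primarycache=metadata tank/bench
sudo zfs set secondarycache=none tank/bench
# run the tests, then restore the defaults:
sudo zfs set primarycache=all tank/bench
sudo zfs set secondarycache=all tank/bench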
I’d get R/W working as expected on the system directly before investigating NFS.
Tell us a bit more about the HL15 system; what bays are the HDDs and SSDs in and are those bays the ones going to the SAS controller or the SATA controller? Is the motherboard BIOS and SAS controller BIOS up to date?
So are you trying to test synchronous or asynchronous writes? Adding --sync=1 to the fio command might be more representative of the writes the VMs should be able to achieve. For spitballing performance, I assume the "500 GB SSD"s you refer to are SATA or SAS SSDs, not NVMe.
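For example, a sync-write run along these lines (path and sizing are placeholders):

fio --name=syncwrite --filename=/mnt/test/fio.dat --rw=randwrite --bs=4k --size=1G --ioengine=libaio --iodepth=1 --direct=1 --sync=1 --runtime=60 --time_based --group_reporting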
Maybe set up a VM locally on the Scale box and see what the NFS performance is? Perhaps there’s something going on with the networking on the Proxmox box?
What are rsize and wsize set to for the NFS mounts?
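On the client you can check what was actually negotiated with:

nfsstat -m

which lists rsize, wsize, NFS version, and transport for each mount.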
I’m trying to stick with synchronous writes, for a safe VM and database environment.
I ran the above fio command again on the TrueNAS SCALE system, adding ‘--sync=1’, and got these results:
Random write: 36.1 MB/s, IOPS: 9,185.
So, what does the “sync” command in my test line (“sudo sync; fio….”) do?
As this is a new system, I get to experiment and try to learn and understand various configs. From my YouTube and online research, I’m thinking I’d like a vdev of 5 Seagate EXOS 20 TB drives in a RAIDZ2 config, with LOG and CACHE devices and a three-way mirrored NVMe METADATA (special) vdev. The metadata vdev would be new territory for me to configure.
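Roughly what I have in mind, as a zpool create sketch (pool and device names are placeholders; I’d use /dev/disk/by-id paths on the real system):

sudo zpool create tank \
  raidz2 sda sdb sdc sdd sde \
  special mirror nvme0n1 nvme1n1 nvme2n1 \
  log nvme3n1 \
  cache nvme4n1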
However, when I noted the stark difference between my old NAS system (CORE v13) and the new 45Drives system (SCALE 25.04 and/or 25.10), I started digging.
Your comment about Proxmox being the issue might be valid, yet my CORE TrueNAS system has significantly higher throughput from that same VM. Both systems use 10GbE (via direct SFP+ connections) through the USW Aggregation switch (see original post).
The Linux sync command forces all pending changes and buffered data held in RAM (e.g. the page cache and ZFS ARC) to be written out to the physical disks immediately. This ensures that all pending writes are committed before the next fio test starts. It has nothing to do with the sync/async acknowledgement mode of the test itself.
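If the goal is just a clean starting state before each run, the usual pattern on Linux is something like this (as root):

sync
echo 3 > /proc/sys/vm/drop_caches

That flushes dirty data and asks the kernel to drop clean caches (it should also prompt the ARC to release memory), but it doesn’t change how fio itself issues writes; that’s what --sync=1 is for.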
I agree it should be looked into. I’m not an expert here, just trying to think through the general process I would use. There are lots of things that can affect read and write performance, but they’re often a few percent, or maybe 10% here and there, not an order of magnitude difference. Also read and write follow different paths and (under-)performance in one may or may not be related to the other. Finally, my understanding is BSD has a bit better support for sync writes and NFS than Linux, but there again I don’t think the difference is an order of magnitude one.
If you are getting a random sync write of 36.1 MB/s, I believe (but could be corrected) that is consistent with the expectation for a SATA SSD. So I guess you could proceed one of three ways: try to dig into getting the remote NFS performance up to 36.1 MB/s, add in the SLOG and HDDs and improve the local performance of the pool, or continue to compare back to your Core system. For that last option I’d suggest just using Core as a final target rather than comparing a bunch of interim tests to it.
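As a quick sanity check on that 36.1 MB/s figure (assuming the test used 4 KiB blocks): 9,185 IOPS × 4 KiB ≈ 37.6 MB/s, which lines up with the throughput you reported, so the IOPS and bandwidth numbers are at least internally consistent for a 4 KiB sync-write workload.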
I’m not an NFS expert, but if it was me, my next step would probably be to set up a local VM and test the NFS performance over the bridged network connection. If that seemed OK, I’d add back a pool with a VDEV and SLOG.
Is there some reason you’re mounting the NFS shares within a VM and not the Proxmox host directly? That might be the desired end state, but for testing it might be better to mount and test the shares just in the host first.
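A quick host-side check could look like this (server address, export path, and mount point are placeholders):

sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs -o vers=4.2,hard 10.0.0.10:/mnt/tank/share /mnt/nfs-test
fio --name=hosttest --filename=/mnt/nfs-test/fio.dat --rw=randwrite --bs=4k --size=1G --ioengine=libaio --iodepth=16 --direct=1 --sync=1 --runtime=60 --time_based --group_reporting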