HL15 Build - Sanity Check - Going to be ingesting + working with a lot of audio

Hey folks. I just ordered my HL15 chassis + backplane a day ago and now I’m figuring out what I’m going to put in it.

My planned usage:

Primarily storage - I plan to store hundreds of thousands of hours of audio, ingesting about 24GB / day. I also plan to read all of it remotely from another machine on the same network multiple times a day.

I think I’m going to be optimizing for storage space + transfer speed.

I think I need a bit of a sanity check on my list of parts. I’m used to working with consumer parts but this is my first time building something enterprisey from scratch.

Let me know what you think. Thanks!

CPU: AMD EPYC 7402P
Motherboard: Supermicro H12SSL-i
RAM: 256GB - 8x32GB 2133P DDR4 ECC REG

(Just a tugm4470 bundle on eBay: AMD EPYC 7402P CPU + Supermicro H12SSL-i + 2133P RAM, multiple choices)

Cooler: Arctic Freezer 4U SP3

HBA: LSI-9305-16i

Boot drives: 2 x 1TB M.2 NVMe SSDs in RAID 1. (I’ll probably grab whatever Samsung 990 / Sabrent is in stock on Amazon)

Networking: Mellanox ConnectX-3 Pro EN Dual Port 40GbE QSFP+ PCIe Ethernet Adapter

Storage add-on: ASUS Hyper M.2 X16 PCIe 4.0 X4 Expansion Card - 4 NVMe M.2

Drives: Probably get started with 4 x Seagate Exos X22 20TB drives

Power Supply: Corsair RM750e


How much is “all of it”? The daily 24GB or the hundreds of thousands of hours? How much data do you need to move across the network each day and how fast does that need to happen?

What are your plans/needs for RAID and backup of the data?

You might want more smaller drives than fewer larger drives, or SAS drives, if you want to get that data over the network at 40Gb/s.

Every job will process all of it (the hundreds of thousands of hours). At some point, running that job will take more than a day and that’s fine. I don’t need overlapping runs.

I just want it to be as fast as possible without spending too much over $3k (an artificial limit, I can go higher if I need to).

Backup + RAID: No plan for backups. I’m ok with data loss here. Possibly multiple RAID 0 arrays?

If you’re reading the same files multiple times a day it will come from ARC most probably. Those 4 SATA drives won’t give you much speed directly so the first read might be slow.


You won’t need a CPU that powerful just to serve files, even at 40gbps. The fastest access to files will always be locally. I’ll ask the dumb question, just to be sure; the other machine has some proprietary software or hardware, or for some other reason needs to be physically separate from the HL15 and can’t be run in/on it either directly in the OS or a VM solution?


Do you know the minimum I can get away with for saturating 40Gb/s (assuming everything else is in place)? I thought a 7402P was being conservative.

The reason I’m keeping them separate is because the box doing all the processing is going to have multiple GPUs and I don’t think they’d fit in my HL15 after I add networking, storage add-in cards and the HBA.


My main point was just about CPU cores. You’ll probably find the CPU running at only a few percent of load. The CPU doesn’t have to do a lot when reading the disks or sending data over the network. A bit like some gaming builds that don’t really need a powerful CPU because the GPUs are doing the majority of the workload. It seems like the HBA and NIC you listed are both PCIe 3 x8 cards, so in a sense the minimum would be any CPU that supports that, since you haven’t defined much of any additional workload (transcoding, encryption, VMs, …). I would think even the HL15 “Full Build”, even though the CPU used has 1/15th the compute of your Epyc, can still push data around about the same. Having more compute available is certainly not a bad thing. I didn’t go with the full build. But, if someone wants to come in and correct me that’s fine.
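To put a rough number on the PCIe 3.0 x8 point, here is a back-of-the-envelope sketch. It assumes the standard PCIe 3.0 figures (8 GT/s per lane, 128b/130b encoding) and ignores packet and protocol overhead, so real-world numbers will be somewhat lower. The function name is just for illustration.

```python
# Rough PCIe 3.0 link bandwidth estimate (assumption: raw line rate with
# 128b/130b encoding only; TLP/protocol overhead is NOT accounted for).

PCIE3_GT_PER_LANE = 8.0          # gigatransfers per second per lane
PCIE3_ENCODING = 128 / 130       # 128b/130b line encoding efficiency


def pcie3_gbps(lanes: int) -> float:
    """Approximate usable bandwidth of a PCIe 3.0 link in Gb/s, one direction."""
    return lanes * PCIE3_GT_PER_LANE * PCIE3_ENCODING


x8 = pcie3_gbps(8)
print(f"PCIe 3.0 x8: ~{x8:.1f} Gb/s")
print("Enough for one 40GbE port:", x8 > 40)
print("Enough for both 40GbE ports at once:", x8 > 80)
```

So by this estimate an x8 slot has headroom for one saturated 40GbE port, but not for both ports of a dual-port card at the same time.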

That’s what I thought. Just felt like I needed to ask.

As Fossil said, the bottleneck isn’t going to be with the HBA or the NIC or the CPU. It’s going to be with the “spinning rust”. A single Seagate Exos X22 lists a maximum sustained throughput of 285MB/s. That’s very much ‘best-case’. A typical sustained transfer rate for a 7200 RPM drive is more like half of that. You can have some data cached in RAM or SSD, and the drives have a small cache, but a 4-disk RAID 0 is only going to return data at, say, a typical sustained (4 disks * 125MB/s * 8 bits/byte =) 4 Gb/s. I think you’d be better off with fifteen 6TB or 8TB drives than four 20TB drives. Even then that might only put you in the 15-20 Gb/s range. 15K RPM drives might get you to 25 Gb/s. To get to 40, I think you’d need dual-actuator disks, SSDs, or a chassis like the 45Drives Q30 or something from a different company like Supermicro that has 24 or 36 3.5" drive bays.
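If you want to play with the numbers above, here is a minimal sketch of the same arithmetic. It assumes perfect RAID 0 striping with no filesystem, controller, or network overhead, and uses decimal units (1 Gb = 1000 Mb), so treat the results as upper bounds. The function names are made up for this example.

```python
import math


def raid0_gbps(disks: int, mb_per_s_per_disk: float) -> float:
    """Ideal aggregate sequential throughput of a stripe, in Gb/s."""
    return disks * mb_per_s_per_disk * 8 / 1000


def disks_needed(target_gbps: float, mb_per_s_per_disk: float) -> int:
    """Smallest number of disks whose ideal stripe reaches the target rate."""
    return math.ceil(target_gbps * 1000 / (mb_per_s_per_disk * 8))


# Per-disk rates taken from the post: 125 MB/s typical, 285 MB/s best case.
print(raid0_gbps(4, 125))      # four drives at the typical rate
print(raid0_gbps(15, 125))     # fifteen drives at the typical rate
print(disks_needed(40, 125))   # drives needed for 40 Gb/s at the typical rate
print(disks_needed(40, 285))   # drives needed for 40 Gb/s at the best-case rate
```

At the typical rate, four drives come out at 4 Gb/s and fifteen at 15 Gb/s, and it takes about 40 drives to reach 40 Gb/s; even at the best-case spec-sheet rate it takes 18, more than the HL15's 15 bays.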
