I’d guess it’s something to do with length.
Replaced all four cables (not taking chances) with Supermicro/Molex direct from their eStore. I also swapped them around - slots 1-8 are now connected to SAS and 9-15 are connected to SATA.
No issues so far. Will rebuild the HL15 into the “production” version soon.
I was the one who called out that 45drives was probably on reduced staff for the holidays and that no one should expect a response until after the holidays were over.
My disappointment stems from the lack of response from 45drives after the holidays were over and that someone had responded to @rymandle05 several days before my previous update where I mentioned the lack of response.
Further, it doesn’t look like the person who took the support ticket that was opened was aware of this thread or that multiple users have experienced problems with the cables that came from 45Drives. I just responded and let them know I replaced all four cables (about 100USD) and sent them a link to the thread.
45Drives needs to look into the cable issue and determine whether or not they have more defective cables in the field.
I’m willing to send the cables that came with the HL15 back to 45Drives at your cost if that would be useful.
Hey @pxpunx ,
I am sorry for the inconvenience you had. Our service team sent you an email on 01/08/2024, and it seems that you did not receive the email. I have requested that our enterprise support team reach out to you again. I have also provided them with the history of our conversation.
Thank you for your patience again.
FYI - 45Drives reached out to me again to have the original SAS cables sent back for further inspection. I’m working on getting them sent out and will continue to check in with Corey to see what conclusion they come to.
Well, I finally swapped out the 10Gtek cables for the replacements from 45Drives this evening. I’ll be putting these cables through their paces over the next few days. Let’s see how they fare.
So what was the final outcome of this investigation? Was it a bad batch of cables? If so, have the affected units shipped with the bad batch been identified?
I can confirm the new SAS cables 45Drives sent me have been working just fine. I’ll check in with Corey to see if they have more info they can share to close the loop here.
@Vikram-45HomeLabs I noticed I can’t edit the original post for this thread anymore. Is that by design? I’d like to keep updating it to let newcomers know that it’s been RESOLVED and point to all the major posts in the thread.
Hi @rymandle05,
I am checking on this and I will get back to you as soon as I hear back from our team.
Hi @rymandle05, I updated the heading of the post to reflect the resolved status.
Thanks for the great idea!
Can also confirm no issues since I replaced mine w/ cables from Supermicro.
Thank you @rymandle05 for all your research - I don’t have a homelab box, but your diagnosis and test results helped me nonetheless, especially your lsiutil writeup.
I have a mix of Exos and HGST drives, and only the Exos were giving me random write errors with no data or SMART errors.
Changing to SuperMicro cables didn’t help, but limiting the link speed to 6Gbps did, I think. Time will tell, but based on my experience over the past couple of days, I should have already seen some errors by now.
I can’t really make use of the 12Gbps speeds anyways, so for me this is an acceptable solution, the other being I swap out all my Exos drives for HGST.
You’re absolutely welcome! I appreciate you letting me know this was helpful. It’s interesting that the Exos drives were impacted and not the HGST drives. I suppose either the Exos drives have less tolerance for signal degradation or the HGST drives are better at handling it.
Were you seeing anything logged in the Protocol Specific port log page for SAS SSP section when running smartctl -x, like Invalid DWORD count or Loss of DWORD synchronization incrementing?
I didn’t check prior, but I do see this currently when checking one of the Exos drives:
Invalid DWORD count = 4
Running disparity error count = 0
Loss of DWORD synchronization = 1
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 4
Running disparity error count: 0
Loss of dword synchronization count: 1
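For anyone who wants to check whether these counters are still climbing, here is a minimal Python sketch. It assumes you have saved the text of smartctl -x from two different times and that the counters are printed in the "name = value" form quoted above. The function names parse_phy_counters and increased_counters are made up for illustration and are not part of smartmontools.

```python
import re

# Counter names as they appear in the smartctl output quoted above.
PHY_COUNTERS = (
    "Invalid DWORD count",
    "Running disparity error count",
    "Loss of DWORD synchronization",
    "Phy reset problem",
)


def parse_phy_counters(text):
    """Return {counter name: total} summed over every phy listed in the text.

    Only lines of the form "<name> = <number>" are counted; a counter that
    never appears is left out of the result.
    """
    counters = {}
    for name in PHY_COUNTERS:
        pattern = rf"^\s*{re.escape(name)}\s*=\s*(\d+)"
        values = re.findall(pattern, text, re.MULTILINE)
        if values:
            counters[name] = sum(int(v) for v in values)
    return counters


def increased_counters(before, after):
    """Return {counter name: delta} for counters that grew between snapshots."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }
```

Take one snapshot before a scrub or resilver and one after, then pass both parsed results to increased_counters. An empty result means none of the four counters moved in between.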
It really does seem to be working, 75% through the resilver and still no errors on any drives… I had definitely accumulated a few when I did this two days ago.
What would we do without the internet??
The internet is indeed a wonderful and scary place
The DWORD errors I was seeing were my first clue that the signal was somehow an issue at 12Gbps. Seeing that you have them too is a good indicator that it is indeed a similar problem and not something different.
I have no doubts you’ll be fine running at 6Gbps. If you ever put in some SAS SSDs then you’d probably want to have SAS3 speeds. Too bad the Supermicro cables didn’t work out. Maybe you hit the unlucky lottery and those cables too are marginal?
What about firmware updates to your HBA or the drives? I also see something about Molex-to-SATA power adapters sometimes being an issue. Not sure how many drives you have or how they are connected power-wise.
Yep, absolutely no errors after limiting to 6Gbps, and went through two resilvers today. At this point I’m happy as a clam. Thank you all so much.
@rymandle05 I’m really curious as to whether or not I’ll experience the same issue with SSDs if I’m able to retrofit them into the 3.5" caddies at some point in the future.
If I ever do need that kind of performance, and it probably won’t be for a while, as I really just need file syncing between my employees and clients, and hosting OpenProject, I’d probably go for a 2U 24-bay SuperMicro chassis in which case I wouldn’t expect to have any issues.
@DigitalGarden I did make sure to update all the firmware. Updating the firmware on one of my Exos drives actually fixed a horrible buzzing sound it was making at idle (thanks to a post on Reddit).
I’ve got a 2U 12-bay RackChoice chassis with an LSI 9305-16i and a no-name backplane. Honestly I’d bet the issue is with the backplane - the marketing page makes no mention of 12Gbps support. Do you think the SFF-8087 connectors on the backplane are a limiting factor/problem?
The backplane accepts Molex connectors… maybe I can try fudging with which PSU power cable (SATA vs. Peripheral) I’m plugging in to see if it’ll help… then again, at this point I don’t want to keep powering my server on and off as it needs to be ready and running during business hours, so as long as it’s stable I’m quite happy.
Agreed. 12Gbps for spinning rust has limited benefit depending on type of workload anyway.
SFF-8087 should support 12gbps SAS. We are talking about SAS drives, right? And, looking only really briefly, it seems like those backplanes would be direct wired and not using some sort of SAS expander or SATA port multiplier, so there shouldn’t be any real logic between the drive and the HBA. But, I do see some of the listings on Amazon and Newegg do explicitly call out 6 gbps in the title. Not sure if that is a result of it by default shipping with SATA reverse breakout cables, or something more definite about the capabilities of the backplane.
Yes, I have all SAS drives. Good point about the lack of logic.
The case did come with SFF-8087 to 4x SATA reverse breakout cables (I should probably label them in case I try to use them later on…), so that very well could be why 6Gbps is listed. I do believe I was seeing the 12Gbps successfully negotiated with my drives prior to limiting the speeds.
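If you want to double-check what was actually negotiated, here is a rough read-only Python sketch. It assumes a Linux host whose HBA driver exposes the SAS transport class under /sys/class/sas_phy, with attribute files such as negotiated_linkrate and invalid_dword_count. Whether every attribute is populated depends on the driver, so treat missing values as expected. The function name read_sas_phys is made up for illustration.

```python
from pathlib import Path

# Attribute files read from each phy directory (assumed layout, see above).
PHY_ATTRS = (
    "negotiated_linkrate",
    "maximum_linkrate",
    "invalid_dword_count",
    "loss_of_dword_sync_count",
)


def read_sas_phys(base="/sys/class/sas_phy"):
    """Return {phy name: {attribute: value}} for each phy found under base.

    Attributes that are missing or unreadable are skipped. If base does not
    exist, the result is an empty dict.
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    result = {}
    for phy in sorted(root.iterdir()):
        values = {}
        for attr in PHY_ATTRS:
            attr_file = phy / attr
            if attr_file.is_file():
                try:
                    values[attr] = attr_file.read_text().strip()
                except OSError:
                    pass  # some drivers refuse reads on certain attributes
        result[phy.name] = values
    return result


if __name__ == "__main__":
    for name, values in read_sas_phys().items():
        print(name, values)
```

This only reads values and does not change the link speed. Comparing negotiated_linkrate before and after applying the 6Gbps limit should show whether the limit took effect on each phy.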