geekhack

geekhack Community => Other Geeky Stuff => Topic started by: suicidal_orange on Fri, 18 November 2016, 09:04:47

Title: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 09:04:47
My computer is making a strange clicking sound, not often enough to be a wire in a fan so I'm thinking hard drive.

I ran smartctl in Linux and got this back:

Code:
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       3977
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       1256
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       31
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   078   067   000    Old_age   Always       -       22
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   253   253   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       138
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       4895084500

So many 'Old_age' and 'Pre-fail' entries, that can't be good.  Then I realised this was actually my boot SSD :eek:

My HDD actually reports this:
Code:
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   171   021    Pre-fail  Always       -       6300
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       664
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1778
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       582
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       53
193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       31817
194 Temperature_Celsius     0x0022   122   108   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   195   000    Old_age   Always       -       1812
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

Different, but not much better.

Should I be scared?  I'm one of those lucky people who's never needed backups :-\
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: davkol on Fri, 18 November 2016, 09:37:55
Ummm, I'm not sure you're reading the smartctl output correctly.

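The most important column is VALUE, the normalized health score. It starts high and counts down; when it drops to THRESH or below, it's a sign of problems. The TYPE column ('Pre-fail' or 'Old_age') only says what kind of attribute it is, not what state the drive is in.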
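
To make the VALUE vs THRESH comparison concrete, here is a minimal Python sketch. It assumes the attribute table is laid out as in the pastes above (ten whitespace-separated columns); the function names are made up for illustration and are not part of smartmontools. The three sample rows are copied from the HDD output in the first post.

```python
def parse_smart_rows(text):
    """Parse attribute rows from a smartctl-style attribute table into dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "type": parts[6],
            "when_failed": parts[8],
            "raw": " ".join(parts[9:]),
        })
    return rows


def at_or_below_threshold(rows):
    """Return names of attributes whose normalized VALUE has reached THRESH."""
    # A threshold of 0 can never be reached in practice, so skip those rows.
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] <= r["thresh"]]


# Rows copied from the HDD output in the first post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1778
199 UDMA_CRC_Error_Count    0x0032   200   195   000    Old_age   Always       -       1812
"""

rows = parse_smart_rows(SAMPLE)
print(at_or_below_threshold(rows))  # prints [] for these rows
```

An empty list means no attribute has reached its threshold, which matches the "all well over the threshold" reading of both drives.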
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: CSCoder4ever on Fri, 18 November 2016, 09:44:15
Looks fine to me, the WHEN_FAILED values are all empty ('-').

You could also download CrystalDiskInfo and check it out from the Windows side of things.
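
The WHEN_FAILED check can also be scripted. This is only a sketch: it assumes the same ten-column layout as the pastes above, `failed_attributes` is a made-up helper name, and the second sample is an invented row (not from this thread) that only shows what a hit would look like.

```python
def failed_attributes(text):
    """Return (name, when_failed) for rows whose WHEN_FAILED column is not '-'."""
    failed = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and anything that is not a full attribute row.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        when_failed = parts[8]
        if when_failed != "-":
            failed.append((parts[1], when_failed))
    return failed


# Rows copied from the SSD output in the first post.
THREAD_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       31
"""

# Hypothetical row, invented for illustration only.
MADE_UP_ROW = "  5 Reallocated_Sector_Ct   0x0033   009   009   010    Pre-fail  Always   FAILING_NOW 2048"

print(failed_attributes(THREAD_ROWS))  # prints []
print(failed_attributes(MADE_UP_ROW))
```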
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 09:51:08
Ah that's good, I was reading it the other way round and they are all well over the threshold :))

I've rebooted to Windows and am running a full test. Between the random clicking, the unresponsive mouse and the screen flickering because the graphics driver stops responding, I'm pretty sure I have a problem.  The test might finish in 120 hours (still rising...), or hopefully it will get past the bad bit and be done in around 8, as it suggested at the beginning...
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: tp4tissue on Fri, 18 November 2016, 09:52:42
Backup, and you never have to think about this stuff again..
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: Tactile on Fri, 18 November 2016, 09:55:12
The only thing that looks weird to me is the UDMA_CRC_Error_Count.

Here's my info gathered using "Disks" in Ubuntu:

[attach=1]

The first things to try before replacement would be to check if there's a firmware update for the drive & replace the cable.
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 10:16:11
Quote from: tp4tissue
Backup, and you never have to think about this stuff again..

:p

The only time I've lost data was when I ran 'dd if=/dev/urandom of=/dev/sdc', forgetting that I was booted into a test install at the end of a data drive, so my drives didn't have their usual letters.  First it overwrote all my music, then it started on the partition I was booted from...  It wasn't long after that it crashed!
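
One cheap guard against that kind of mix-up is to check whether the target disk has anything mounted before pointing dd at it. The Python sketch below is an illustration under assumptions: `mounted_partitions` is a made-up helper, the sample text is invented to mimic the /proc/mounts layout, and it only catches mounts that are listed by plain device name (not by UUID or a mapper device).

```python
import re


def mounted_partitions(device, mounts_text):
    """Return (source, mountpoint) pairs for `device` or any of its partitions."""
    # Matches e.g. /dev/sdc, /dev/sdc1, or /dev/nvme0n1p2 for device /dev/nvme0n1.
    pattern = re.compile(re.escape(device) + r"(p?\d+)?$")
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and pattern.match(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits


# Invented sample in /proc/mounts layout; on a live system you would read
# the real file with open("/proc/mounts").read() instead.
SAMPLE_MOUNTS = """\
/dev/sdc2 / ext4 rw,relatime 0 0
/dev/sda1 /home ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw 0 0
"""

in_use = mounted_partitions("/dev/sdc", SAMPLE_MOUNTS)
if in_use:
    print("refusing to write, mounted:", in_use)
```

In the sample the root filesystem lives on /dev/sdc2, so the check refuses, which is the situation described above.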

Quote from: Tactile
The only thing that looks weird to me is the UDMA_CRC_Error_Count. [...] The first things to try before replacement would be to check if there's a firmware update for the drive & replace the cable.

Got to 255 hours remaining, then the computer crashed.  Not sure if coincidence...

Thanks for the pointer to Disks - it looks like this:

(http://i.imgur.com/LHFfYtn.png)

Will run a test there and see if I get similar results to the unusable Windows ones...
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: jaffers on Fri, 18 November 2016, 11:54:28
CLICK OF DEATHHHHHHHHHHHH
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 13:09:44
Quote from: jaffers
CLICK OF DEATHHHHHHHHHHHH

Apparently so. The test died silently at around 30%, so I'm now copying 1TB of data off it, at 8.7MB/sec.  This could take a while...
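
For a rough idea of "a while", the arithmetic is size divided by rate. A quick Python sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and that the 8.7MB/sec rate holds for the whole copy:

```python
def transfer_hours(size_bytes, rate_bytes_per_sec):
    """Hours needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_sec / 3600


hours = transfer_hours(1e12, 8.7e6)
print(f"{hours:.1f} hours")  # prints 31.9 hours
```

So a little under a day and a half, if the drive keeps up that speed.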

Edit:  Got interesting stuff in my dmesg, and UDMA_CRC_Error_Count has increased. Searching suggests it may be a PSU problem?  Hmm...
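
To see whether the counter is still climbing, compare the raw value of attribute 199 between two readings. The Python sketch below assumes the same table layout as the pastes above; `raw_value` is a made-up helper. The "before" row is the one from the first post, while the "after" row uses an invented number, since the later reading was not posted.

```python
def raw_value(text, attr_id):
    """Return the integer RAW_VALUE of the attribute with the given ID, or None."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[0] == str(attr_id):
            return int(parts[9])
    return None


# Row from the first post.
BEFORE = "199 UDMA_CRC_Error_Count    0x0032   200   195   000    Old_age   Always       -       1812"
# Hypothetical later reading; 1900 is an invented number for illustration.
AFTER = "199 UDMA_CRC_Error_Count    0x0032   200   195   000    Old_age   Always       -       1900"

delta = raw_value(AFTER, 199) - raw_value(BEFORE, 199)
print("new CRC errors since last reading:", delta)
```

A delta that stays at zero after a cable swap would point at the cable; one that keeps growing would point somewhere else.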
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: animal on Fri, 18 November 2016, 13:18:42
Since you are using Linux, you could also try GSmartControl. It gives a nice explanation for every setting when you hover the mouse over it. Yes, it is graphical.
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 13:31:17
Quote from: animal
Since you are using Linux, you could also try GSmartControl. It gives a nice explanation for every setting when you hover the mouse over it. Yes, it is graphical.

Thanks - this revealed that the test died with the status 'Interrupted (host reset)' :thumb:
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: tp4tissue on Fri, 18 November 2016, 14:25:33
If you suspect hardware fault,

DO NOT TEST the drive..

Start copying immediately ..
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: ygor on Fri, 18 November 2016, 14:31:06
This thread is triggering my PTSD.
Title: Re: Anyone speak S.M.A.R.T ? (possible dying hard drives)
Post by: suicidal_orange on Fri, 18 November 2016, 14:59:35
Quote from: tp4tissue
If you suspect hardware fault, DO NOT TEST the drive.. Start copying immediately ..

Can a dodgy cable cause clicking?  After several slowdown/reboot cycles I swapped the cable, and now it's happily transferred 100GB at reasonable speed with no more clicking.

Just have to hope I have enough space lying around to get everything off, then I can format and zero fill it overnight...

Quote from: ygor
This thread is triggering my PTSD.

What happened and what did you learn from it?