DJViking, posted 25 November 2011 (edited)

I had problems with an external hard drive. Some files suddenly could no longer be read, in either Linux or Windows. The file system was NTFS. I then wiped the entire drive, removed it from its enclosure (which had been dropped on the floor once, a long time ago) and installed it in my Linux box. Then I formatted the drive with XFS and mounted it.

After running SMART on this drive I got a lot of information that I did not quite know how to interpret. Can anyone help me interpret it? I have attached the SMART output in the file smartSDH.txt.

First:

djviking@linux-1wa3:~> sudo /usr/sbin/smartctl -d ata -H /dev/sdh
root's password:
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.1.0-1.2-desktop] (SUSE RPM)
Copyright © 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Attachment: smartSDH.txt
brgr, posted 25 November 2011

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   194   193   051    Pre-fail  Always       -       8547
  3 Spin_Up_Time            0x0003   178   174   021    Pre-fail  Always       -       8075
  4 Start_Stop_Count        0x0032   096   096   000    Old_age   Always       -       4156
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15027
 10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       647
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       134
193 Load_Cycle_Count        0x0032   172   172   000    Old_age   Always       -       86019
194 Temperature_Celsius     0x0022   127   097   000    Old_age   Always       -       25
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   198   000    Old_age   Always       -       7
198 Offline_Uncorrectable   0x0010   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   051    Old_age   Offline      -       0

As far as I understand, these key figures are the most important part. The easiest approach is simply to compare them against the Wikipedia S.M.A.R.T. page and see what it says about each attribute.
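For anyone who would rather not read the table by eye, here is a minimal sketch of how the attribute rows above could be parsed and the sector-related counters picked out. The set of "watched" IDs (5, 196, 197, 198) is an assumption made for illustration, not an official list, and the function names are made up for this example.

```python
import re

# Attribute IDs watched in this sketch (an illustrative choice, not a standard):
# reallocated sectors, reallocation events, pending sectors, offline uncorrectable.
WATCHED_IDS = {5, 196, 197, 198}

# One row of the "Vendor Specific SMART Attributes" table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute id: fields} for every attribute row found in text."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines and anything else are ignored
        attrs[int(m.group(1))] = {
            "name": m.group(2),
            "value": int(m.group(4)),
            "worst": int(m.group(5)),
            "thresh": int(m.group(6)),
            "raw": int(m.group(10)),  # leading integer of RAW_VALUE only
        }
    return attrs


def nonzero_watched(attrs):
    """Return {id: raw value} for watched attributes whose raw value is not 0."""
    return {i: a["raw"] for i, a in attrs.items()
            if i in WATCHED_IDS and a["raw"] != 0}


SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   198   000    Old_age   Always       -       7
198 Offline_Uncorrectable   0x0010   100   253   000    Old_age   Offline      -       0
"""

print(nonzero_watched(parse_attributes(SAMPLE)))
```

On the four sample rows, taken from the table above, only Current_Pending_Sector has a non-zero raw value.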
kyrsjo, posted 29 November 2011

If you get a lot of messages containing "{" and the word "CRC" (I think) in the terminal, that is a good sign that the disk is ready for the scrap heap.

This line:

  1 Raw_Read_Error_Rate     0x000f   194   193   051    Pre-fail  Always       -       8547

with "value" >> "threshold" honestly doesn't bode well either.
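A note on reading the VALUE and THRESH columns: smartctl flags an attribute as failing when the normalized VALUE has dropped to or below THRESH, so a value far above the threshold (194 against 51 here) is the healthy direction. The raw value is vendor-specific and is not compared against the threshold. A minimal sketch of that comparison, with a hypothetical function name:

```python
def attribute_status(value, thresh):
    """Classify an attribute from its normalized VALUE and THRESH columns.

    Assumption for this sketch: a threshold of 0 is treated as
    "cannot trip", since it is never reached by the normalized value.
    """
    if thresh == 0:
        return "no threshold"
    return "failing" if value <= thresh else "ok"


# Raw_Read_Error_Rate from the line quoted above: VALUE 194, THRESH 051
print(attribute_status(194, 51))
# A hypothetical attribute that has degraded to its threshold
print(attribute_status(51, 51))
```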
mikeys, posted 29 November 2011 (edited)

  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   198   000    Old_age   Always       -       7

You have 7 sectors that need reallocating. It may be a good idea to run badblocks on the disk so that it gets the chance to reallocate the sectors that need it. If you have no important data on the disk, it can pay off to run badblocks with overwriting to test it thoroughly.

After running badblocks it may be wise to run "smartctl -t long <disk>" to refresh the SMART data. If SMART finds bad sectors again, it will tell you which one.

EDIT: As for the "self-assessment", it is up to each manufacturer to decide what counts as a pass. It is therefore quite possible that they are somewhat liberal about what is OK and what is not. If you ever do get FAILED, that may be reason to replace the disk.
FraXinuS, posted 29 November 2011 (edited)

I am also having problems with a disk; it would not mount after a reboot and I got error messages.

Error messages in dmesg:

[279100.756477] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[279100.756485] ata4.00: irq_stat 0x40000001
[279100.756492] ata4.00: failed command: READ DMA EXT
[279100.756503] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in
[279100.756506]          res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error)
[279100.756511] ata4.00: status: { DRDY ERR }
[279100.756515] ata4.00: error: { UNC }
[279100.796727] ata4.00: configured for UDMA/133
[279100.796746] ata4: EH complete
[279101.109993] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[279101.110001] ata4.00: irq_stat 0x40000001
[279101.110009] ata4.00: failed command: READ DMA EXT
[279101.110021] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in
[279101.110024]          res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error)
[279101.110029] ata4.00: status: { DRDY ERR }
[279101.110034] ata4.00: error: { UNC }
[279101.220153] ata4.00: configured for UDMA/133
[279101.220178] ata4: EH complete
[279101.584736] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[279101.584744] ata4.00: irq_stat 0x40000001
[279101.584752] ata4.00: failed command: READ DMA EXT
[279101.584763] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in
[279101.584766]          res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error)
[279101.584772] ata4.00: status: { DRDY ERR }
[279101.584776] ata4.00: error: { UNC }
[279101.684786] ata4.00: configured for UDMA/133
[279101.684804] ata4: EH complete
[279101.998847] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[279101.998856] ata4.00: irq_stat 0x40000001
[279101.998863] ata4.00: failed command: READ DMA EXT
[279101.998875] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in
[279101.998878]          res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error)
[279101.998883] ata4.00: status: { DRDY ERR }
[279101.998887] ata4.00: error: { UNC }
[279102.039113] ata4.00: configured for UDMA/133
[279102.039134] ata4: EH complete
[279102.362485] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[279102.362493] ata4.00: irq_stat 0x40000001
[279102.362500] ata4.00: failed command: READ DMA EXT
[279102.362512] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in
[279102.362514]          res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error)
[279102.362520] ata4.00: status: { DRDY ERR }
[279102.362524] ata4.00: error: { UNC }
[279102.472640] ata4.00: configured for UDMA/133
[279102.472667] sd 3:0:0:0: [sdd] Unhandled sense code
[279102.472672] sd 3:0:0:0: [sdd] Result: hostbyte=0x00 driverbyte=0x08
[279102.472679] sd 3:0:0:0: [sdd] Sense Key : 0x3 [current] [descriptor]
[279102.472687] Descriptor sense data with sense descriptors (in hex):
[279102.472690]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[279102.472704]         58 00 00 47
[279102.472710] sd 3:0:0:0: [sdd] ASC=0x11 ASCQ=0x4
[279102.472716] sd 3:0:0:0: [sdd] CDB: cdb[0]=0x28: 28 00 58 00 00 47 00 00 08 00
[279102.472729] end_request: I/O error, dev sdd, sector 1476395079
[279102.472736] quiet_error: 6 callbacks suppressed
[279102.472741] Buffer I/O error on device sdd1, logical block 1476395016
[279102.472748] Buffer I/O error on device sdd1, logical block 1476395017
[279102.472753] Buffer I/O error on device sdd1, logical block 1476395018
[279102.472758] Buffer I/O error on device sdd1, logical block 1476395019
[279102.472762] Buffer I/O error on device sdd1, logical block 1476395020
[279102.472766] Buffer I/O error on device sdd1, logical block 1476395021
[279102.472771] Buffer I/O error on device sdd1, logical block 1476395022
[279102.472776] Buffer I/O error on device sdd1, logical block 1476395023
[279102.472800] ata4: EH complete

SMART:

# smartctl -a /dev/sdd
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.1.2-1-ARCH] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda Green (Adv. Format)
Device Model:     ST2000DL003-9VT166
Serial Number:    5YD3KN3X
LU WWN Device Id: 5 000c50 03871056f
Firmware Version: CC32
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Tue Nov 29 19:49:31 2011 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                 ( 612) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x30b7) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   079   006    Pre-fail  Always       -       5740464
  3 Spin_Up_Time            0x0003   093   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   099   099   036    Pre-fail  Always       -       1272
  7 Seek_Error_Rate         0x000f   069   060   030    Pre-fail  Always       -       8928938
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4158
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       22
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       3839
188 Command_Timeout         0x0032   100   091   000    Old_age   Always       -       68720656405
189 High_Fly_Writes         0x003a   099   099   000    Old_age   Always       -       1
190 Airflow_Temperature_Cel 0x0022   070   062   045    Old_age   Always       -       30 (Min/Max 17/32)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       10
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       26
194 Temperature_Celsius     0x0022   030   040   000    Old_age   Always       -       30 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   028   004   000    Old_age   Always       -       5740464
197 Current_Pending_Sector  0x0012   098   095   000    Old_age   Always       -       188
198 Offline_Uncorrectable   0x0010   098   095   000    Old_age   Offline      -       188
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       53596896890938
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2760731740
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3046742494

SMART Error Log Version: 1
ATA Error Count: 3001 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 3001 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:11:09.514  READ DMA EXT
  25 00 00 ff ff ff ef 00   3d+09:11:09.269  READ DMA EXT
  c8 00 08 3f 00 80 e8 00   3d+09:11:09.264  READ DMA
  c8 00 08 3f 00 40 e8 00   3d+09:11:09.262  READ DMA
  c8 00 08 3f 00 00 e8 00   3d+09:11:09.253  READ DMA

Error 3000 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:51.430  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:51.429  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:51.428  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:51.428  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:51.428  READ NATIVE MAX ADDRESS EXT

Error 2999 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:51.430  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:51.429  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:51.428  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:51.428  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:51.428  READ NATIVE MAX ADDRESS EXT

Error 2998 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:49.773  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:49.772  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:49.772  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:49.771  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:49.771  READ NATIVE MAX ADDRESS EXT

Error 2997 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:49.773  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:49.772  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:49.772  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:49.771  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:49.771  READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      4158         58703839
# 2  Short offline       Completed: read failure       90%      4156         49153870
# 3  Short offline       Completed: read failure       90%      4156         55197795
# 4  Short offline       Completed: read failure       20%      4156         55197795
# 5  Short offline       Completed: read failure       10%      4154         55197795
# 6  Short offline       Completed: read failure       90%      4153         37028432
# 7  Short offline       Completed: read failure       90%      4153         1073741880
# 8  Short offline       Completed: read failure       90%      4153         1073741880
# 9  Extended offline    Completed: read failure       90%      4151         58703839
#10  Extended offline    Completed: read failure       90%      4072         376166080
#11  Short offline       Completed without error       00%      4072         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I don't much like the SMART values, and it fails the self-test. Is the disk dying?
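The numbers in the dmesg output above can be tied together with a little arithmetic. The sketch below decodes the 48-bit LBA from the "cmd" line and relates it to the logical block numbers reported for sdd1. Two things are assumptions here: the field order of the libata cmd string (given in the docstring), and the partition start of sector 63, which is inferred only from the difference between the failing sector and the first failing logical block in this log.

```python
def lba48_from_taskfile(cmd):
    """Decode the 48-bit LBA from a libata 'cmd' string such as
    '25/00:08:47:00:00/00:00:58:00:00/e0'.

    Assumed field order (not confirmed in the thread):
    command / feature:count:lbal:lbam:lbah /
    hob_feature:hob_count:hob_lbal:hob_lbam:hob_lbah / device
    """
    _command, low, high, _device = cmd.split("/")
    _feat, _count, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, _hcount, hlbal, hlbam, hlbah = (int(x, 16) for x in high.split(":"))
    # The "hob" (high order byte) fields carry LBA bits 24..47.
    return ((hlbah << 40) | (hlbam << 32) | (hlbal << 24)
            | (lbah << 16) | (lbam << 8) | lbal)


# Assumption: sdd1 begins at sector 63, inferred from this log only.
PARTITION_START = 63

lba = lba48_from_taskfile("25/00:08:47:00:00/00:00:58:00:00/e0")
print(lba)                    # sector on the whole disk (sdd)
print(lba - PARTITION_START)  # first failing block relative to sdd1
```

Under these assumptions the decoded LBA is 0x58000047 = 1476395079, the same sector the kernel reports in "end_request: I/O error, dev sdd, sector 1476395079", and subtracting 63 gives 1476395016, the first "logical block" listed for sdd1. In other words, all five retries and all eight buffer errors point at the same 8-sector (4096-byte) read.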
DJViking (author), posted 29 November 2011

mikeys wrote: "You have 7 sectors that need reallocating. It may be a good idea to run badblocks on the disk so that it gets the chance to reallocate the sectors that need it. [...]"

How do I run badblocks? I can't find any parameter for it in smartctl.
FraXinuS, posted 29 November 2011

badblocks is a separate program. But it can take a long time to run: I aborted after 50 hours, by which point it had reached about 10%. That was in "non-destructive read-write mode"; in read-only mode it took about 6 hours. Be careful if you have data on the disk, though: if you use the wrong parameter you can lose data.
mikeys, posted 29 November 2011

DJViking wrote: "How do I run badblocks? I can't find any parameter for it in smartctl."

Sorry, I left that part out. If there is nothing on the disk you want to keep, you can run badblocks directly; badblocks is an ordinary program that you run in the terminal. You can, for example, run "badblocks -w -s <disk>", which performs a destructive test of the disk. This can take quite a long time.
DJViking (author), posted 29 November 2011 (edited)

FraXinuS wrote: "badblocks is a separate program. But it can take a long time to run [...]"

I am now running badblocks against the disk:

djviking@linux-1wa3:~> sudo /sbin/badblocks -nvs /dev/sdh
Checking for bad blocks in non-destructive read-write mode
From block 0 to 976762583
Checking for bad blocks (non-destructive read-write test)

So far it has reached 1% in 1 hour. If it continues at this rate it will take 100 hours.

Better that the fix takes a long time than having to buy a new disk right now, should this one fail. Hard drive prices have risen to enormous heights. I usually buy the WD Caviar Black 2TB, and its price has gone up by 150%.
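The 100-hour figure is a straight linear extrapolation from the progress so far. A tiny sketch of that arithmetic follows; the function name is made up for this example, and it assumes the scan keeps a constant pace, which a disk with bad sectors may well not do.

```python
def estimate_total_hours(elapsed_hours, percent_done):
    """Linear extrapolation: total run time if the current pace holds."""
    if percent_done <= 0:
        raise ValueError("percent_done must be greater than 0")
    return elapsed_hours * 100.0 / percent_done


# 1% after 1 hour, as reported above
total = estimate_total_hours(1, 1)
print(total, "hours in total,", total - 1, "hours remaining")

# 10% after 50 hours, as reported earlier in the thread for another disk
print(estimate_total_hours(50, 10), "hours in total")
```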
FraXinuS, posted 30 November 2011

Well, my disk gave up the ghost during the night. It disappeared from /dev, and I cannot get into the BIOS as long as it is connected. I hope I get a replacement, since it was only about 5 months old.

badblocks will not fix your damaged sectors; it will find them. Afterwards you can hand the list from badblocks to the file system so that it knows not to use those blocks. But then it is important to use the correct block size (-b) when you run badblocks, otherwise the numbers you get will be wrong. It may be possible to convert the numbers afterwards, but I don't know anything about that. That is why it is recommended to use e2fsck (with -c -c); then you can be sure the block numbers are right. But that applies to ext3/ext4; I don't know how it works with XFS.
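On the question of converting the numbers afterwards: in principle a block number is just a byte offset divided by the block size, so a list produced with one block size can be mapped to another. The sketch below is only that arithmetic, under two assumptions: both numberings start at the same byte offset (the same device or partition), and the target block size is at least as large as the source one. The function names are hypothetical; this is not something badblocks or e2fsck provides.

```python
def convert_block(block, from_size, to_size):
    """Map a block counted in from_size-byte units to the to_size-byte
    block that contains its first byte."""
    return block * from_size // to_size


def convert_block_list(blocks, from_size, to_size):
    """Convert a whole list, dropping duplicates created by the coarser grid.

    Assumes to_size >= from_size; going to a smaller block size would
    need each block expanded into several smaller ones instead.
    """
    if to_size < from_size:
        raise ValueError("this sketch only converts to a larger block size")
    return sorted({convert_block(b, from_size, to_size) for b in blocks})


# Hypothetical list of bad 1024-byte blocks, mapped to 4096-byte blocks
print(convert_block_list([8, 9, 10, 11, 12], 1024, 4096))
```

In this made-up example, 1024-byte blocks 8 to 11 all fall inside 4096-byte block 2, and block 12 falls in block 3.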
DJViking (author), posted 30 November 2011 (edited)

I would rather have bought a new hard drive, had this hard drive crisis not hit. A WD 1TB disk now costs as much as a 2TB one did before the crisis (1000 kr). Fortunately I have another hard drive with a full backup of this one in case something goes wrong.

I tried a hard drive repair in Windows on this drive, and after 2 days I aborted the whole operation.
kyrsjo Skrevet 30. november 2011 Del Skrevet 30. november 2011 Jeg har også problemer med en disk, den ville ikke mounte etter en reboot og jeg fikk feilmeldinger. feilmeldinger i dmesg: [279100.756477] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [279100.756485] ata4.00: irq_stat 0x40000001 [279100.756492] ata4.00: failed command: READ DMA EXT [279100.756503] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in [279100.756506] res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error) [279100.756511] ata4.00: status: { DRDY ERR } [279100.756515] ata4.00: error: { UNC } [279100.796727] ata4.00: configured for UDMA/133 [279100.796746] ata4: EH complete [279101.109993] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [279101.110001] ata4.00: irq_stat 0x40000001 [279101.110009] ata4.00: failed command: READ DMA EXT [279101.110021] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in [279101.110024] res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error) [279101.110029] ata4.00: status: { DRDY ERR } [279101.110034] ata4.00: error: { UNC } [279101.220153] ata4.00: configured for UDMA/133 [279101.220178] ata4: EH complete [279101.584736] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [279101.584744] ata4.00: irq_stat 0x40000001 [279101.584752] ata4.00: failed command: READ DMA EXT [279101.584763] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in [279101.584766] res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error) [279101.584772] ata4.00: status: { DRDY ERR } [279101.584776] ata4.00: error: { UNC } [279101.684786] ata4.00: configured for UDMA/133 [279101.684804] ata4: EH complete [279101.998847] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [279101.998856] ata4.00: irq_stat 0x40000001 [279101.998863] ata4.00: failed command: READ DMA EXT [279101.998875] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in [279101.998878] res 
51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error) [279101.998883] ata4.00: status: { DRDY ERR } [279101.998887] ata4.00: error: { UNC } [279102.039113] ata4.00: configured for UDMA/133 [279102.039134] ata4: EH complete [279102.362485] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 [279102.362493] ata4.00: irq_stat 0x40000001 [279102.362500] ata4.00: failed command: READ DMA EXT [279102.362512] ata4.00: cmd 25/00:08:47:00:00/00:00:58:00:00/e0 tag 0 dma 4096 in [279102.362514] res 51/40:00:47:00:00/00:00:58:00:00/00 Emask 0x9 (media error) [279102.362520] ata4.00: status: { DRDY ERR } [279102.362524] ata4.00: error: { UNC } [279102.472640] ata4.00: configured for UDMA/133 [279102.472667] sd 3:0:0:0: [sdd] Unhandled sense code [279102.472672] sd 3:0:0:0: [sdd] Result: hostbyte=0x00 driverbyte=0x08 [279102.472679] sd 3:0:0:0: [sdd] Sense Key : 0x3 [current] [descriptor] [279102.472687] Descriptor sense data with sense descriptors (in hex): [279102.472690] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 [279102.472704] 58 00 00 47 [279102.472710] sd 3:0:0:0: [sdd] ASC=0x11 ASCQ=0x4 [279102.472716] sd 3:0:0:0: [sdd] CDB: cdb[0]=0x28: 28 00 58 00 00 47 00 00 08 00 [279102.472729] end_request: I/O error, dev sdd, sector 1476395079 [279102.472736] quiet_error: 6 callbacks suppressed [279102.472741] Buffer I/O error on device sdd1, logical block 1476395016 [279102.472748] Buffer I/O error on device sdd1, logical block 1476395017 [279102.472753] Buffer I/O error on device sdd1, logical block 1476395018 [279102.472758] Buffer I/O error on device sdd1, logical block 1476395019 [279102.472762] Buffer I/O error on device sdd1, logical block 1476395020 [279102.472766] Buffer I/O error on device sdd1, logical block 1476395021 [279102.472771] Buffer I/O error on device sdd1, logical block 1476395022 [279102.472776] Buffer I/O error on device sdd1, logical block 1476395023 [279102.472800] ata4: EH complete SMART: # smartctl -a /dev/sdd smartctl 5.42 2011-10-20 
r3458 [x86_64-linux-3.1.2-1-ARCH] (local build) Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Model Family: Seagate Barracuda Green (Adv. Format) Device Model: ST2000DL003-9VT166 Serial Number: 5YD3KN3X LU WWN Device Id: 5 000c50 03871056f Firmware Version: CC32 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Size: 512 bytes logical/physical Device is: In smartctl database [for details use: -P show] ATA Version is: 8 ATA Standard is: ATA-8-ACS revision 4 Local Time is: Tue Nov 29 19:49:31 2011 CET SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 121) The previous self-test completed having the read element of the test failed. Total time to complete Offline data collection: ( 612) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 2) minutes. SCT capabilities: (0x30b7) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 100 079 006 Pre-fail Always - 5740464 3 Spin_Up_Time 0x0003 093 093 000 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 26 5 Reallocated_Sector_Ct 0x0033 099 099 036 Pre-fail Always - 1272 7 Seek_Error_Rate 0x000f 069 060 030 Pre-fail Always - 8928938 9 Power_On_Hours 0x0032 096 096 000 Old_age Always - 4158 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 22 183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0 184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0 187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 3839 188 Command_Timeout 0x0032 100 091 000 Old_age Always - 68720656405 189 High_Fly_Writes 0x003a 099 099 000 Old_age Always - 1 190 Airflow_Temperature_Cel 0x0022 070 062 045 Old_age Always - 30 (Min/Max 17/32) 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 10 193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 26 194 Temperature_Celsius 0x0022 030 040 000 Old_age Always - 30 (0 17 0 0 0) 195 Hardware_ECC_Recovered 0x001a 028 004 000 Old_age Always - 5740464 197 Current_Pending_Sector 0x0012 098 095 000 Old_age Always - 188 198 Offline_Uncorrectable 0x0010 098 095 000 Old_age Offline - 188 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 53596896890938 241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 2760731740 242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3046742494 SMART Error Log Version: 1 ATA Error Count: 3001 (device log contains only the most recent five errors) CR = Command Register [HEX] FR = Features Register [HEX] SC = Sector Count Register [HEX] SN 
= Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 3001 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:11:09.514  READ DMA EXT
  25 00 00 ff ff ff ef 00   3d+09:11:09.269  READ DMA EXT
  c8 00 08 3f 00 80 e8 00   3d+09:11:09.264  READ DMA
  c8 00 08 3f 00 40 e8 00   3d+09:11:09.262  READ DMA
  c8 00 08 3f 00 00 e8 00   3d+09:11:09.253  READ DMA

Error 3000 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:51.430  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:51.429  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:51.428  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:51.428  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:51.428  READ NATIVE MAX ADDRESS EXT

Error 2999 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:51.430  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:51.429  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:51.428  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:51.428  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:51.428  READ NATIVE MAX ADDRESS EXT

Error 2998 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:49.773  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:49.772  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:49.772  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:49.771  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:49.771  READ NATIVE MAX ADDRESS EXT

Error 2997 occurred at disk power-on lifetime: 4157 hours (173 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff ef 00   3d+09:07:49.773  READ DMA EXT
  27 00 00 00 00 00 e0 00   3d+09:07:49.772  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00   3d+09:07:49.772  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00   3d+09:07:49.771  SET FEATURES [set transfer mode]
  27 00 00 00 00 00 e0 00   3d+09:07:49.771  READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      4158         58703839
# 2  Short offline       Completed: read failure       90%      4156         49153870
# 3  Short offline       Completed: read failure       90%      4156         55197795
# 4  Short offline       Completed: read failure       20%      4156         55197795
# 5  Short offline       Completed: read failure       10%      4154         55197795
# 6  Short offline       Completed: read failure       90%      4153         37028432
# 7  Short offline       Completed: read failure       90%      4153         1073741880
# 8  Short offline       Completed: read failure       90%      4153         1073741880
# 9  Extended offline    Completed: read failure       90%      4151         58703839
#10  Extended offline    Completed: read failure       90%      4072         376166080
#11  Short offline       Completed without error       00%      4072         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I don't much like the SMART values, and the disk fails the self-test. Is the disk dying?

dmesg says "media error", so yes, it has one foot and then some in the grave. Make sure you have a fresh backup, and buy a new disk before you end up holding a wake for your data instead.
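The register rows in the error log above can be decoded by hand. As a minimal sketch, assuming the classic 28-bit ATA layout (SN, CL and CH hold the low 24 bits of the LBA, and the low nibble of DH holds the top 4 bits), the following reproduces the "LBA = 0x0fffffff" figure that smartctl prints. The function name is made up for this illustration.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumed layout: SN = bits 0-7, CL = bits 8-15, CH = bits 16-23,
    low nibble of DH = bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register row copied from the log: ER ST SC SN CL CH DH
row = "40 51 00 ff ff ff 0f"
er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in row.split())

lba = lba28_from_registers(sn, cl, ch, dh)
unc = bool(er & 0x40)  # bit 6 of the error register: uncorrectable data (UNC)

print(lba)              # 268435455
print(lba == 2**28 - 1) # True
print(unc)              # True
```

The decoded value is the largest number a 28-bit field can hold, and all five errors report exactly the same address. That suggests, though the log alone does not prove it, that this field is a saturated placeholder rather than the real failing sector. The LBA_of_first_error column in the self-test log is probably the more useful pointer.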
FraXinuS Posted 30 November 2011
My disk died last night, and I had no backup.
kyrsjo Posted 30 November 2011
"My disk died last night, and I had no backup."
Oh well, at least you have an excuse for a beer. Tough guys don't take backups, but they cry a lot...
DJViking (author) Posted 30 November 2011
Backup is expensive. I only have a backup of one disk, the one holding the most important data. When you have a home-built NAS with 8 hard drives, you need a fairly large backup unit. So far my home-built backup unit is only at the planning stage.
mikeys Posted 30 November 2011
"Badblocks won't fix your bad sectors; it will find them. Afterwards you can give the list from badblocks to the file system so it knows not to use those blocks. But then it is important to use the right block size (-b) when you run badblocks, otherwise the numbers you get will be wrong. It may be possible to convert the numbers afterwards, but I don't know anything about that. That is why it is recommended to use e2fsck (with -c -c), so you can be sure the block numbers are right. But that is for ext3/ext4; I don't know how it works with XFS."
That's right. Badblocks doesn't fix anything itself, but it usually manages to provoke the disk into remapping the bad sector. The list of bad sectors that badblocks prints can be used to manually provoke the disk into remapping. The howto below shows how you can go about fixing it yourself.
http://smartmontools.sourceforge.net/badblockhowto.html
Sorry to hear about the disk that called it a night. That's never fun.
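On the question of converting badblocks numbers afterwards: the conversion is plain arithmetic once the block size is known. The sketch below assumes badblocks was run with its default block size of 1024 bytes and that the disk uses 512-byte logical sectors; both are assumptions to check against your own setup, and the function names are made up for this illustration. The sample block number is chosen to line up with LBA 58703839 from the self-test log earlier in the thread.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def block_to_sectors(block: int, block_size: int = 1024,
                     sector_size: int = SECTOR_SIZE) -> range:
    """Return the sector LBAs covered by one badblocks block."""
    if block_size % sector_size:
        raise ValueError("block size must be a multiple of the sector size")
    per_block = block_size // sector_size
    first = block * per_block
    return range(first, first + per_block)


def block_to_fs_block(block: int, block_size: int = 1024,
                      fs_block_size: int = 4096) -> int:
    """Return the file system block that contains a badblocks block."""
    return block * block_size // fs_block_size


# Illustrative block number, not output from a real badblocks run.
print(list(block_to_sectors(29351919)))  # [58703838, 58703839]
print(block_to_fs_block(29351919))       # 7337979
```

Running badblocks with -b set to the file system's block size avoids the conversion entirely, which is the point made in the quoted post.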
DJViking (author) Posted 30 November 2011 (edited)
The howto says: "These examples assume the Linux operating system and either the ext2/ext3 or ReiserFS file system."
I use XFS as my file system.
Edit:
djviking@linux-1wa3:~> sudo /sbin/badblocks -vs /dev/sdh
Checking blocks 0 to 976762583
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
So I suppose that means the hard drive is fine.
Edited 30 November 2011 by DJViking
FraXinuS Posted 1 December 2011
"You can also try running a self-test on the disk:
smartctl -t long /dev/sdh
It runs in the background and takes maybe 2-3 hours, but it stops at the first error. You check the result with:
smartctl -l selftest /dev/sdh"
"Hi there, I'll order a new disk for you, and you can just throw away the old one. Have a great day!"
I like the customer service at dustinhome.no.
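Because each self-test stops at the first read error, repeated runs can report different sectors. A small sketch for pulling the distinct failing LBAs out of the text that `smartctl -l selftest` prints is shown below. It assumes the column layout seen in the self-test log earlier in this thread; the regular expression and function name are made up for this illustration and are not part of smartmontools.

```python
import re

# Matches rows such as:
# "# 1  Short offline  Completed: read failure  90%  4158  58703839"
ROW = re.compile(r"^#\s*(\d+)\s+(.*?)\s+(\d+)%\s+(\d+)\s+(\d+|-)\s*$")


def failing_lbas(log_text: str) -> list[int]:
    """Return the sorted, distinct LBA_of_first_error values in a self-test log."""
    lbas = set()
    for line in log_text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group(5) != "-":  # "-" means the test found no error
            lbas.add(int(m.group(5)))
    return sorted(lbas)


# Rows copied from the self-test log posted earlier in the thread.
SAMPLE = """\
# 1  Short offline       Completed: read failure       90%      4158         58703839
# 2  Short offline       Completed: read failure       90%      4156         49153870
# 3  Short offline       Completed: read failure       90%      4156         55197795
# 4  Short offline       Completed: read failure       20%      4156         55197795
#11  Short offline       Completed without error       00%      4072         -
"""

print(failing_lbas(SAMPLE))  # [49153870, 55197795, 58703839]
```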
mikeys Posted 1 December 2011 (edited)
"These examples assume the Linux operating system and either the ext2/ext3 or ReiserFS file system."
Most of the guide is devoted to finding the right block on your file system. You can do with xfs_db the same things that the guide uses debugfs for.
Edited 1 December 2011 by mikeys
DJViking (author) Posted 1 December 2011 (edited)
What is the TYPE column in the S.M.A.R.T. information about? All my disks have either Pre-fail or Old_age in this column, even my very newest WD Caviar Black drives.
Edit: I found an explanation of the values in this column:
The term "Old_age" is describing a parameter type for some of the named attributes, which indicates it is expected to change with age, rather than predict a failure. You would see those same words in that output even for a new disk - but with different actual values.
Edited 1 December 2011 by DJViking
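To make the TYPE column concrete: it only labels what kind of attribute a row is, while the health signal comes from comparing VALUE with THRESH. As documented for smartctl, an attribute counts as failed when its normalized VALUE is less than or equal to THRESH. The sketch below applies that rule to rows copied from the attribute table earlier in the thread. The watch list of raw counters is a common rule of thumb added here as an assumption, not something smartctl itself reports, and all names are made up for this illustration.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    ident: int
    name: str
    value: int
    worst: int
    thresh: int
    kind: str  # TYPE column: "Pre-fail" or "Old_age"
    raw: str


def parse_row(line: str) -> Attr:
    """Parse one row of the smartctl attribute table (whitespace separated)."""
    f = line.split()
    return Attr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]),
                f[6], " ".join(f[9:]))


# Assumption: raw counters that are worth a look when non-zero.
WATCH_RAW = {5, 196, 197, 198}


def assess(a: Attr) -> str:
    if a.value <= a.thresh:
        # TYPE decides how to read a failed attribute.
        if a.kind == "Pre-fail":
            return "failing: imminent failure"
        return "failing: worn out"
    if a.ident in WATCH_RAW and a.raw.split()[0] != "0":
        return "watch: non-zero raw count"
    return "ok"


ROWS = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15027
197 Current_Pending_Sector  0x0012   200   198   000    Old_age   Always       -       7
"""

for line in ROWS.splitlines():
    a = parse_row(line)
    print(a.name, a.kind, assess(a))
```

With these rows, only Current_Pending_Sector is flagged, because its raw count is 7 even though its normalized value is far above the threshold.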