How to Check Disk Health with `smartctl` on Arch Linux
Monitoring the health of your storage devices is a crucial aspect of system administration and desktop maintenance. Disks, whether HDDs or SSDs, are prone to failure over time due to mechanical wear or flash memory degradation. Luckily, most modern drives support SMART (Self-Monitoring, Analysis, and Reporting Technology), a monitoring system included in the firmware of storage devices. On Arch Linux, the `smartmontools` package and its utility `smartctl` provide the tools you need to assess disk health.
In this article, we’ll explore how to install and use `smartctl` to monitor your disks on Arch Linux. We’ll go over key SMART attributes, interpret common outputs, and schedule regular health checks.
What is SMART?
SMART is a system built into most modern HDDs and SSDs that continuously monitors various parameters of the drive, such as reallocated sectors, temperature, read/write errors, and more. These metrics help predict impending drive failure and inform the user about overall disk reliability.
However, SMART only works if you proactively check it. That’s where `smartctl` comes in.
Installing `smartmontools` on Arch Linux
Before using `smartctl`, ensure the necessary tools are installed. On Arch Linux, this is simple:
sudo pacman -S smartmontools
This installs `smartctl`, the command-line utility used to interact with SMART-enabled devices, and the `smartd` daemon, which can automate SMART monitoring and send alerts.
Checking if a Disk Supports SMART
Not all disks support SMART, especially older or cheap USB drives. To check if a device supports SMART, run:
sudo smartctl -i /dev/sdX
Replace `/dev/sdX` with your actual device (like `/dev/sda` or `/dev/nvme0n1`).
Example output:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.1-arch1-1] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Samsung SSD 980 NVMe Series
Device Model: Samsung SSD 980 1TB
Firmware Version: 5B2QGXA7
...
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
If SMART support is available but not enabled, you can enable it with:
sudo smartctl -s on /dev/sdX
Running a Basic Health Check
To get a quick summary of the drive’s health:
sudo smartctl -H /dev/sdX
Example output:
SMART overall-health self-assessment test result: PASSED
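A “PASSED” result is good news, though it only means the drive has not flagged itself as failing; the individual attributes can still show early warning signs. If it says “FAILED” or “UNKNOWN,” your drive may be experiencing issues or SMART may be improperly configured.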
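In scripts, the exit status of `smartctl` is often more convenient than its text output. The status is a bitmask documented under EXIT STATUS in the smartctl(8) man page. The sketch below decodes it; `decode_smartctl_status` is a hypothetical helper name, not part of smartmontools, and the messages are paraphrased from the man page.

```shell
# Decode the bitmask that smartctl returns as its exit status.
# decode_smartctl_status is an illustrative helper, not a smartmontools command.
# Usage: sudo smartctl -H /dev/sdX; decode_smartctl_status "$?"
decode_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    bit=0
    # Messages are listed in bit order, bit 0 first.
    for msg in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed" \
        "SMART status: DISK FAILING" \
        "pre-fail attribute at or below threshold" \
        "attribute was at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "$msg"
        fi
        bit=$((bit + 1))
    done
}
```

Several bits can be set at once, so the function prints one line per set bit rather than a single verdict.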
Getting Detailed SMART Data
For the drive’s attribute table (use `-a` instead of `-A` to print all available SMART information):
sudo smartctl -A /dev/sdX
You’ll see a list of SMART attributes. Here’s an example snippet from an HDD:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       7529
194 Temperature_Celsius     0x0022   046   054   000    Old_age   Always       -       46 (Min/Max 22/54)
Let’s break down the important columns:
- ID: Attribute identifier.
- ATTRIBUTE_NAME: Description of the parameter.
- VALUE: Normalized value (usually 1–100 or 1–200, higher is better).
- WORST: Lowest normalized value recorded so far.
- THRESH: Failure threshold; the attribute counts as failed when VALUE drops to or below it.
- TYPE: Indicates whether it is a pre-fail or old-age metric.
- WHEN_FAILED: If this field is populated, the value crossed its threshold.
- RAW_VALUE: The actual value reported by the drive; interpretation varies by manufacturer.
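The VALUE and THRESH columns can be compared mechanically. The following is a minimal sketch that reads `smartctl -A` output on standard input and prints every attribute whose normalized value is at or below its threshold. It assumes the ten-column ATA table layout shown above; `failing_attributes` is a hypothetical helper name, and skipping attributes with a threshold of 000 is a choice made here because such attributes cannot meaningfully fail.

```shell
# Print "ID NAME" for attributes whose VALUE is at or below THRESH.
# Reads `smartctl -A` output on stdin; assumes the ATA attribute table layout.
# Usage: sudo smartctl -A /dev/sdX | failing_attributes
failing_attributes() {
    awk '
        # Table rows start with a numeric ID and have at least 10 fields.
        # $4 is VALUE and $6 is THRESH; adding 0 forces numeric comparison.
        $1 ~ /^[0-9]+$/ && NF >= 10 && $6 + 0 > 0 && $4 + 0 <= $6 + 0 {
            print $1, $2
        }
    '
}
```

No output means no attribute is currently at or below its threshold.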
Key SMART Attributes to Watch
While many SMART attributes are recorded, some are more critical than others:
1. Reallocated_Sector_Ct
Indicates how many sectors have been moved due to read/write errors. A non-zero value may signal impending failure.
2. Current_Pending_Sector
Sectors that are waiting to be reallocated. If these increase, data integrity may be compromised.
3. Offline_Uncorrectable
Indicates sectors that cannot be read during an offline scan. Should be zero.
4. Temperature_Celsius
Higher temperatures can shorten disk lifespan. Ideal values vary by model, but 30–50°C is typical.
5. Power_On_Hours
Tracks how long the drive has been powered on. Useful for judging a drive’s age; SSD wear depends more on the amount of data written.
6. Wear_Leveling_Count (on SSDs)
Estimates the amount of wear experienced. SSDs have a finite number of program/erase cycles.
7. Media_Wearout_Indicator (on some Intel SSDs)
Starts at 100 and decreases to 0 as the device wears out.
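For the three sector-related attributes, the raw count matters more than the normalized value. As a rough sketch, the function below reads `smartctl -A` output and reports any of attributes 5, 197, and 198 whose raw value is above zero. The helper name `nonzero_sector_counts` is hypothetical, and the script assumes these IDs carry the names listed above, which is common but vendor-dependent.

```shell
# Report sector-related attributes (IDs 5, 197, 198) with a non-zero RAW_VALUE.
# Reads `smartctl -A` output on stdin; assumes the ATA attribute table layout.
# Usage: sudo smartctl -A /dev/sdX | nonzero_sector_counts
nonzero_sector_counts() {
    awk '
        # $2 is ATTRIBUTE_NAME and $10 is the start of RAW_VALUE.
        $1 ~ /^(5|197|198)$/ && NF >= 10 && $10 + 0 > 0 {
            print $2 "=" $10
        }
    '
}
```

Running it periodically and comparing the numbers over time shows whether the counts are growing.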
Running Self-Tests
SMART drives support built-in self-tests to validate their health.
Short Test (~1-2 minutes)
sudo smartctl -t short /dev/sdX
To check the results:
sudo smartctl -l selftest /dev/sdX
Long Test (~10+ minutes)
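A more thorough test that checks the entire disk. On large HDDs it can take several hours; `smartctl` prints an estimated completion time when the test starts.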
sudo smartctl -t long /dev/sdX
Note: Tests run in the background. You’ll need to wait until the test completes before viewing results.
Example output of self-test log:
# 1 Short offline Completed without error 00% 13456
If the test shows errors, it may be time to back up data and replace the disk.
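To scan the self-test log without reading it by eye, a small filter is enough. This sketch prints every log row that is neither "Completed without error" nor still in progress, so aborted or interrupted tests also show up and are left for you to judge. It assumes the ATA log layout, where rows start with `#` and a number; `selftest_problems` is a hypothetical helper name.

```shell
# Print self-test log rows that did not complete cleanly.
# Reads `smartctl -l selftest` output on stdin; assumes ATA-style log rows.
# Usage: sudo smartctl -l selftest /dev/sdX | selftest_problems
selftest_problems() {
    awk '
        # Log rows look like "# 1  Short offline  <status>  <remaining>  <hours> ...".
        /^# *[0-9]+/ && !/Completed without error/ && !/in progress/ { print }
    '
}
```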
Checking SMART on NVMe SSDs
NVMe drives use a different interface, so the device names and the reported fields differ, but the `smartctl` commands are the same:
Basic Info
sudo smartctl -i /dev/nvme0n1
Health Summary
sudo smartctl -H /dev/nvme0n1
Detailed Health Report
sudo smartctl -a /dev/nvme0n1
Typical NVMe attributes include:
- Percentage Used: How much of the drive’s lifespan has been consumed.
- Data Units Written: Useful for tracking SSD wear.
- Media and Data Integrity Errors: Should be zero.
- Temperature: Operating temp of the controller.
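NVMe health fields are printed as `Name: value` lines, which makes them easy to extract. As an illustration, the function below pulls the Percentage Used figure out of `smartctl -a` output. It assumes the field is labelled exactly `Percentage Used:`; `nvme_percentage_used` is a hypothetical helper name.

```shell
# Extract the "Percentage Used" value (without the % sign) from NVMe output.
# Reads `smartctl -a` output on stdin.
# Usage: sudo smartctl -a /dev/nvme0n1 | nvme_percentage_used
nvme_percentage_used() {
    awk -F: '
        /^Percentage Used:/ {
            gsub(/[ %]/, "", $2)   # strip padding spaces and the percent sign
            print $2
        }
    '
}
```

The same pattern works for the other fields by changing the label in the match.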
Scheduling Regular Checks
You can use `smartd` for automatic health monitoring. To enable the daemon:
- Edit the configuration file:
sudo nano /etc/smartd.conf
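Add a line like the one below. Place it above the default DEVICESCAN entry, or comment that entry out, because `smartd` ignores every line that follows DEVICESCAN: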
/dev/sdX -a -o on -S on -s (S/../.././03|L/../../6/04) -W 4,40,45 -m your@email.com
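This config:
- Monitors all standard SMART properties (-a) and turns on automatic offline data collection (-o on) and attribute autosave (-S on).
- Runs short tests daily at 3 AM.
- Runs long tests every Saturday at 4 AM.
- Emails the given address when a problem is detected (-m); this needs a working mail setup on the system.
- Tracks temperature (-W 4,40,45): logs changes of 4°C or more, logs when 40°C is reached, and warns at 45°C.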
- Enable and start the daemon:
sudo systemctl enable smartd
sudo systemctl start smartd
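To test email notifications, temporarily append -M test to the configuration line and restart smartd, which then sends a test message at startup. To check the daemon’s logs: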
journalctl -u smartd
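The `-s` schedule in the configuration line is a regular expression. According to smartd.conf(5), `smartd` compares it against a string of the form `T/MM/DD/d/HH`: test type, month, day of month, weekday (1 is Monday, 7 is Sunday), and hour. The sketch below approximates that comparison with `grep -E` so you can check when a schedule would fire. `schedule_matches` is a hypothetical helper, and anchoring the pattern with `^` and `$` is an assumption made here for clarity.

```shell
# Check whether a T/MM/DD/d/HH stamp matches the schedule used in this article.
# Approximates smartd's matching with grep -E; not smartd's own code.
# Usage for the current hour: schedule_matches "S/$(date +%m/%d/%u/%H)"
schedule_matches() {
    printf '%s\n' "$1" | grep -Eq '^(S/../.././03|L/../../6/04)$'
}
```

For example, `schedule_matches "L/03/16/6/04"` succeeds because weekday 6 is Saturday and the hour is 04, while the same stamp with weekday 5 does not.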
Troubleshooting
- SMART not available: Some USB enclosures block SMART passthrough. Consider connecting the drive via SATA.
- SMART enabled but no data: Try using the `-d` option to specify the device type:
sudo smartctl -a -d sat /dev/sdX
- Access denied: Run as root (`sudo`).
Conclusion
Using `smartctl` on Arch Linux is a powerful and reliable way to monitor the health of your storage devices. Whether you’re a system administrator or a curious desktop user, checking SMART data regularly can help prevent data loss and ensure optimal performance.
To summarize:
- Install `smartmontools`.
- Use `smartctl -i`, `-H`, and `-A` to identify the drive and check its health.
- Run periodic self-tests.
- Monitor critical attributes like reallocated sectors, pending sectors, and temperature.
- Enable `smartd` for automated monitoring and alerts.
With just a few commands, you gain deep insights into your storage devices and can act before failures occur. Don’t wait for a disaster—start monitoring your drives today.