What to do when your Japanese cloud server disk is almost full? An automatic alert script can solve the problem.
Time : 2025-12-22 16:06:45
Edit : Jtti

Running out of disk space on a Japanese cloud server can block new writes, crash databases, stop logging, and even take the whole service down. Unlike CPU or memory usage, which can spike instantly, disk usage usually grows slowly until the disk is full, which leaves time for automated early warning and intervention. Writing a Linux shell script that automatically alerts when file system usage reaches 90% is a fundamental skill every server administrator should master. The script's core task is clear: check periodically, assess accurately, and notify promptly.

This relies primarily on Linux's built-in `df` command, but the script needs output that is easy to parse. The commonly used `df -h` is human-friendly, yet its size columns carry unit suffixes such as `G` and `T`, which makes them unsuitable for precise numeric comparison. Therefore, the script uses either `df --output=pcent,target` (a GNU coreutils option) or the more portable `df -P` combined with `awk` to obtain clean percentage numbers. The `-P` option produces POSIX-format output, which keeps each file system on a single line so that long device names do not wrap onto a second line. An example command to get the root partition usage is: `df -P / | awk 'NR==2 {print $5}' | tr -d '%'`, which outputs a bare integer (e.g., `85`).
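As a minimal sketch of this parsing step, the helper below reads `df -P` style output on standard input and prints one `percentage mountpoint` pair per line. The function name `parse_usage` is hypothetical and chosen only for illustration; the `df -P` column layout and the use of `awk` come from the paragraph above.

```shell
#!/bin/bash
# parse_usage: hypothetical helper (not part of the original script).
# Reads `df -P` output on stdin, skips the header line, and prints
# "<integer percent> <mount point>" for each file system.
# Limitation: a mount point containing spaces is cut at the first space.
parse_usage() {
    awk 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)      # strip the trailing percent sign
        print pct, $6
    }'
}

# Usage: df -P | parse_usage
```

Keeping the parser separate from the `df` call makes it easy to check against saved sample output before trusting it on a live server.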

Based on this, we can build the core logic of the script. First, it needs to define a clear alarm threshold (e.g., 90%). Then, it iterates through all mounted file systems in the system, or monitors only important partitions (e.g., `/`, `/home`, `/var`). For each partition, it extracts its usage value and compares it to the threshold. If the threshold is exceeded, an alarm is triggered. A basic script framework is as follows:

#!/bin/bash
# Filename: disk_usage_monitor.sh
# Function: Check disk usage and send an alarm when it exceeds the threshold

# Set the alarm threshold (percentage)
THRESHOLD=90

# Set the mount points to monitor. To monitor all, use: df -P | awk 'NR>1 {print $6}'
MOUNTPOINTS="/ /home /var /boot"

# Loop through each mount point
for mountpoint in $MOUNTPOINTS; do
    # Get the usage rate of the mount point (remove the percentage sign)
    USAGE=$(df -P "$mountpoint" 2>/dev/null | awk 'NR==2 {print $5}' | tr -d '%')

    # Skip paths that do not exist or returned no numeric value
    case "$USAGE" in
        ''|*[!0-9]*) continue ;;
    esac

    # Check if the usage rate has reached the threshold
    if [ "$USAGE" -ge "$THRESHOLD" ]; then
        # Collect the current time and hostname to construct the alert message
        CURRENT_TIME=$(date "+%Y-%m-%d %H:%M:%S")
        HOST=$(hostname)
        MESSAGE="【Disk Space Alert】 Time: $CURRENT_TIME Host: $HOST Mountpoint: $mountpoint Usage: ${USAGE}% has reached the ${THRESHOLD}% threshold. Please clean up immediately."

        # The alert sending function should be called here
        echo "$MESSAGE" >&2   # First print to standard error as a demonstration
        # send_alert "$MESSAGE"   # Actually send the alert (the function must be defined earlier in the script)
    fi
done
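The framework above checks a fixed list of mount points. As a sketch of the alternative mentioned earlier, iterating through all mounted file systems, the helper below filters `df -P` output by threshold. The function name `over_threshold` and the choice to skip `tmpfs` and `devtmpfs` are assumptions made for illustration, not part of the original script.

```shell
#!/bin/bash
# over_threshold: hypothetical helper (not part of the original script).
# Reads `df -P` output on stdin and prints "<percent> <mount point>" for
# every file system whose usage is at or above the given threshold.
# Pseudo file systems (tmpfs, devtmpfs) are skipped; adjust this list as needed.
over_threshold() {
    local threshold="$1"
    awk -v limit="$threshold" '
        NR == 1 { next }                              # skip the header line
        $1 == "tmpfs" || $1 == "devtmpfs" { next }    # skip pseudo file systems
        {
            pct = $5
            sub(/%$/, "", pct)
            # ignore non-numeric values such as "-"
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print pct, $6
        }'
}

# Usage: df -P | over_threshold 90
```

Each output line can then be read in a `while read -r usage mountpoint` loop to build the same alert message as in the framework.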

Conditional logic alone is not enough. A robust alert script must include an effective notification mechanism to ensure alerts reach the administrator. The simplest way is to send an email using the `mailx` or `sendmail` command. Because emails may be delayed or ignored, production environments commonly integrate more immediate channels such as DingTalk, WeChat Work, Slack, or SMS APIs. Below is an example of an extended alert function that supports both logging and email sending; place it above the monitoring loop so that it is defined before it is called:

# Alert and Log Functions
LOG_FILE="/var/log/disk_alert.log"

send_alert() {
    local msg="$1"

    # 1. Log to a local log file (ensure the log directory exists and has write permissions)
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $msg" >> "$LOG_FILE"

    # 2. Send an email (assuming email sending is configured)
    echo "$msg" | mailx -s "【Important】Server Disk Space Alert - $(hostname)" admin@yourdomain.com

    # 3. Send to the DingTalk bot (requires network connectivity and a Webhook URL)
    # DING_WEBHOOK_URL="https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN"
    # curl -s "$DING_WEBHOOK_URL" -H 'Content-Type: application/json' -d "{\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"}}" > /dev/null

    # 4. In severe cases, you can try restarting certain services or cleaning up temporary files (operate with caution)
    # USAGE is the global variable set by the monitoring loop
    # if [ "$USAGE" -ge 95 ]; then
    #     echo "Attempting to clean up the /tmp directory..."
    #     find /tmp -type f -mtime +7 -delete 2>/dev/null
    # fi
}
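One caveat with the commented `curl` line: if the message ever contains a double quote or a backslash, the hand-built JSON body becomes invalid. The sketch below shows one way to escape the message first. The helper names `json_escape` and `build_payload` are hypothetical, and the sketch assumes a single-line message (newlines and control characters are not handled).

```shell
#!/bin/bash
# json_escape: hypothetical helper (not part of the original script).
# Escapes backslashes and double quotes so the text can sit inside a JSON string.
# Assumes a single-line message; newlines and control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload: wraps the escaped message in the same text-message body
# that the commented curl line builds by hand.
build_payload() {
    printf '{"msgtype":"text","text":{"content":"%s"}}' "$(json_escape "$1")"
}

# Usage inside send_alert (DING_WEBHOOK_URL as defined in the function above):
#   curl -s "$DING_WEBHOOK_URL" -H 'Content-Type: application/json' \
#        -d "$(build_payload "$msg")" > /dev/null
```

Backslashes are escaped before quotes so that the backslashes added for the quotes are not escaped a second time.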

After saving the script, grant execute permissions via:

chmod +x disk_usage_monitor.sh

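The most crucial step is to schedule it with cron for regular automatic checks. For example, assuming the script has been copied to `/usr/local/bin`, add the following line to `/etc/crontab` (which, unlike a per-user crontab, includes a user field) to run it every 10 minutes as root: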

*/10 * * * * root /usr/local/bin/disk_usage_monitor.sh

For production environments, the following important aspects need to be considered to improve script reliability:

Preventing alert storms: Add a cooldown mechanism to the alert logic. The script can write a status file that records the time of the last alert. If the same condition triggers again within a short period (e.g., within an hour), no new alert is sent, which keeps the notification channel from being flooded.

More granular troubleshooting: When an alert is triggered, the script can automatically run:

du -sh /path/to/mountpoint/* | sort -rh | head -10

This lists the 10 largest directories or files directly under the mount point. Attaching the result to the alert message saves considerable manual troubleshooting time.

Configuration-based: Move parameters such as `THRESHOLD`, `MOUNTPOINTS`, the email address for receiving alerts, and the Webhook URL into a separate configuration file, so the script can be reused and adjusted across different servers without altering the main script.

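Security and Permissions: Set appropriate ownership and permissions on the script and log files to prevent sensitive information leakage (e.g., owner `root:root`, with `750` for the script, which must remain executable, and `640` for the log and any configuration file that holds a Webhook token). If special commands are invoked, ensure the executing user has the corresponding permissions.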
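The cooldown idea from the first point above can be sketched as follows. The names `should_alert`, `STATE_DIR`, and `COOLDOWN_SECONDS`, along with the default state directory, are assumptions made for illustration. The sketch also relies on `date +%s`, which GNU and BSD `date` support but POSIX does not require.

```shell
#!/bin/bash
# Hypothetical settings (not part of the original script); override via environment.
STATE_DIR="${STATE_DIR:-/var/tmp/disk_alert_state}"
COOLDOWN_SECONDS="${COOLDOWN_SECONDS:-3600}"

# should_alert: returns 0 if no alert was recorded for this mount point within
# the cooldown window (and records the current time); returns 1 otherwise.
should_alert() {
    local mountpoint="$1"
    local now last key state_file
    now=$(date +%s)
    # Turn "/" and "/var/log" into file-name-safe keys such as "_" and "_var_log".
    key=$(printf '%s' "$mountpoint" | tr '/' '_')
    state_file="$STATE_DIR/$key"

    mkdir -p "$STATE_DIR" || return 0    # if state cannot be stored, alert anyway
    if [ -f "$state_file" ]; then
        last=$(cat "$state_file")
        case "$last" in
            ''|*[!0-9]*) last=0 ;;       # ignore a corrupt state file
        esac
        if [ $((now - last)) -lt "$COOLDOWN_SECONDS" ]; then
            return 1                     # still inside the cooldown window
        fi
    fi
    printf '%s\n' "$now" > "$state_file"
    return 0
}

# Usage inside the monitoring loop:
#   if should_alert "$mountpoint"; then send_alert "$MESSAGE"; fi
```

Keeping one state file per mount point means a full `/var` does not suppress a later alert for `/home`.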

Ultimately, the value of this script extends far beyond a single conditional statement. It forms an automated closed loop that covers monitoring, evaluation, alerting, and assisted troubleshooting. In a Japanese cloud server environment, you can further integrate it with monitoring systems (such as Zabbix or Prometheus) or use configuration management tools (such as Ansible) to deploy it in batches to all servers.
