Bash scripts can be used for many purposes, from automating system administration tasks to creating tools for data analysis. One area where Bash shines is in its ability to manipulate characters and strings. GNU Bash provides several character classes that can be used in pattern matching and regular expressions. In this blog post, we will explore these character classes and learn how to use them in a Bash script that tests and categorizes ASCII characters in a detailed ASCII table.
What are character classes in Bash?
In Bash, a character class is a named pattern that stands for a group of characters. The class name is written between [: and :], and the class must itself appear inside a bracket expression, which is why the square brackets are doubled. For example, the character class [[:digit:]] matches any digit character (0-9), and [[:alpha:]] matches any alphabetic character.
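As a minimal sketch of the two classes just mentioned, the following uses them as case patterns. The function name describe_char is made up for this illustration and is not part of Bash.

```shell
#!/usr/bin/env bash
# describe_char is a hypothetical helper name used only for this illustration.
# It reports whether a single-character argument is a digit, a letter, or neither.
describe_char() {
    local c=$1
    case $c in
        [[:digit:]]) echo "digit" ;;
        [[:alpha:]]) echo "letter" ;;
        *)           echo "other" ;;
    esac
}

describe_char 7    # prints: digit
describe_char q    # prints: letter
describe_char '#'  # prints: other
```

Each pattern matches exactly one character, so a longer string such as "ab" falls through to the default branch.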
Bash supports several built-in character classes, which are described below:
Class | Description |
---|---|
alnum | The [:alnum:] character class matches all alphabetic and numeric characters. |
alpha | The [:alpha:] character class matches all alphabetic characters. |
ascii | The [:ascii:] character class matches all ASCII characters. |
blank | The [:blank:] character class matches spaces and tabs. |
cntrl | The [:cntrl:] character class matches all control characters. |
digit | The [:digit:] character class matches all numeric digits. |
graph | The [:graph:] character class matches all printable characters except space. |
lower | The [:lower:] character class matches all lowercase alphabetic characters. |
print | The [:print:] character class matches all printable characters. |
punct | The [:punct:] character class matches all punctuation characters. |
space | The [:space:] character class matches all whitespace characters. |
upper | The [:upper:] character class matches all uppercase alphabetic characters. |
word | The [:word:] character class matches all alphanumeric characters and the underscore. |
xdigit | The [:xdigit:] character class matches all hexadecimal digits. |
To use a character class in a Bash script, write its name between [: and :] and place that inside a bracket expression. For example, to match an alphabetic character, we can use [[:alpha:]] as follows:
# Using an if statement
if [[ $char =~ [[:alpha:]] ]]; then
    : # do something
fi
# Or using the ls command
ls -ld [[:alpha:]]*
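The difference between glob matching and regular-expression matching is easy to miss, so here is a small sketch of both, along with the single-bracket pitfall. The helper names is_alpha_glob and is_alpha_regex are assumptions made for this example.

```shell
#!/usr/bin/env bash
# is_alpha_glob and is_alpha_regex are hypothetical helper names for this sketch.

# Glob match: with ==, the whole string must match the pattern.
is_alpha_glob() {
    [[ $1 == [[:alpha:]] ]]
}

# Regex match: =~ succeeds if any substring matches, so anchor with ^ and $
# to test a string that must consist of exactly one alphabetic character.
is_alpha_regex() {
    local re='^[[:alpha:]]$'
    [[ $1 =~ $re ]]
}

for c in z 7; do
    is_alpha_glob  "$c" && echo "$c: glob says alphabetic"
    is_alpha_regex "$c" && echo "$c: regex says alphabetic"
done

# Pitfall: a single pair of brackets is an ordinary set of the literal
# characters ':', 'a', 'l', 'p' and 'h', so 'z' does not match it.
[[ z == [:alpha:] ]] || echo "z does not match [:alpha:]"
```

Storing the regular expression in a variable first, as done here, is a common way to avoid quoting surprises on the right-hand side of =~.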
We can also use negation with character classes. For example, to match all non-alphabetic characters, we can place the caret symbol ^ right after the opening bracket of the bracket expression, as follows:
# Using an if statement
if [[ $char =~ [^[:alpha:]] ]]; then
    : # do something
fi
# Or using the ls command
ls -ld [^[:alpha:]]*
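Negated classes are also handy in parameter expansion and in glob tests on whole strings. The following is a sketch under assumptions: the sample input and the helper name has_non_digit are invented for illustration.

```shell
#!/usr/bin/env bash
# Sample input invented for this sketch.
input='user_42@example.com!'

# Delete every character that is not a letter or a digit.
only_alnum=${input//[^[:alnum:]]/}
echo "$only_alnum"

# has_non_digit is a hypothetical helper: it succeeds when the string
# contains at least one character that is not a digit.
has_non_digit() {
    [[ $1 == *[^[:digit:]]* ]]
}

has_non_digit "12a4" && echo "12a4 contains a non-digit"
has_non_digit "1234" || echo "1234 contains only digits"
```

The same negated expression works in both places because parameter expansion and the == operator inside [[ ... ]] use the same pattern-matching rules.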
Example Script to Generate an ASCII Table with Character Classes
The script we will explore uses character classes in Bash to test and categorize ASCII characters. It iterates through the character codes 32 to 126 (the loop stops before 127) and, for each character, checks whether it belongs to any of the following character classes: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
for ((i=32; i<127; i++)); do
    # Convert the decimal code to its character via a \xHH escape.
    printf -v char "\x$(printf '%x' "$i")"
    as_class=()
    for class in alnum alpha ascii blank cntrl digit graph lower print punct space upper word xdigit; do
        # Store the class name on a match, or an empty string to keep the columns aligned.
        [[ "$char" = @([[:$class:]]) ]] && as_class+=("$class") || as_class+=('')
    done
    # printf reuses its format for every remaining argument: one column per class.
    printf '%3d | %s' "$i" "$char"
    printf ' | %-6s' "${as_class[@]}"
    printf '\n'
done
Let’s break down this snippet and see what it does:
- The outer for loop iterates through the ASCII codes from 32 to 126.
- The printf command converts each ASCII code to its corresponding character and stores it in the char variable.
- The as_class Bash array is initialized to collect the results of the class tests.
- The inner for loop checks whether the character belongs to each of the 14 character classes, using the [[ ... ]] conditional command together with the @() extended pattern-matching operator.
- If the character belongs to a class, the class name is added to the as_class array; otherwise an empty string is added so that the columns stay aligned.
- Finally, printf prints the character code, the character itself, and the results of the character class tests in a formatted table.
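The inner loop described above can also be wrapped in a reusable function. This is only a sketch of one possible adaptation: the name classes_of is hypothetical, and unlike the script it prints only the matching class names instead of padded columns.

```shell
#!/usr/bin/env bash
# classes_of is a hypothetical helper name. It prints, separated by spaces,
# the names of all character classes that a single character belongs to.
classes_of() {
    local char=$1 class
    local -a matched=()
    for class in alnum alpha ascii blank cntrl digit graph lower print punct space upper word xdigit; do
        # Same test as in the script: one character against one class.
        [[ $char == @([[:$class:]]) ]] && matched+=("$class")
    done
    echo "${matched[*]}"
}

classes_of a
classes_of _
classes_of ' '
```

Returning a compact list makes the function easier to use from other scripts, for example to check whether a given character counts as punct before processing it.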
This Bash script is a helpful tool for testing and categorizing ASCII characters based on their properties. By using character classes, the script can quickly and easily identify characters belonging to specific categories, such as digits, letters, or punctuation. This script is an excellent example of how Bash can be used to manipulate characters and strings and can be easily modified to fit specific use cases.
Conclusion
In this blog post, we explored the Bash built-in character classes and learned how to use them in a Bash script. Character classes are a powerful feature of Bash and can be used in various ways to match patterns and manipulate strings. With this knowledge, you can make your Bash scripts more efficient and powerful.