Python isAlpha, isAlnum, isDigit, isDecimal, isNumeric, & Other String Methods

Introduction

Characters are marks or symbols - like letters, numbers, and currency signs - that convey information. They're such an integral part of programming that some languages, like C, even make them an explicit data type to store a single character value.

Python is not one of those languages, however. Instead, the Python string data type encapsulates zero or more characters as a single object.

Still, we understand that different characters can represent different types of information and should, therefore, be classified differently. For example, while "a" and "1" are both strings, the former is an alphabetic character while the latter is numeric. This may be an important distinction in your program, though Python will consider them to be the same data type (class 'str'). Thankfully, there is a way to distinguish them!

In this article, you'll learn eight different string methods that you can use to check just what kind of character data is contained within a given string.

string.isalpha()

This character classification method returns True if the string is non-empty and every character in it is classified as a letter in the Unicode character database, and False otherwise:

>>> letters_only = "InitialCommit"
>>> letters_only.isalpha()
True

If the string contains any character other than a letter, such as a whitespace character, then the result will be False:

>>> letters_and_spaces = "Initial Commit"
>>> letters_and_spaces.isalpha()
False

Note that you can also call these classification methods directly on the string itself, without assigning it to a variable first:

>>> "InitialCommit".isalpha()
True

This convention will be used throughout the remainder of the article.
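
As a small illustration of what "letter in the Unicode character database" means in practice, the sketch below checks a few strings chosen for this example (they do not come from the original text). It suggests that accented and non-Latin letters count as alphabetic, while an empty string does not:

```python
# isalpha() consults the Unicode database, so non-ASCII letters count too.
accented = "café".isalpha()        # True: "é" is a letter
greek = "αβγ".isalpha()            # True: Greek letters are letters as well
empty = "".isalpha()               # False: at least one character is required
with_digit = "Commit2".isalpha()   # False: "2" is not a letter

print(accented, greek, empty, with_digit)
```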

string.isdecimal()

This character classification method returns True if the string is non-empty and every character in it is a decimal character, that is, a digit used to write numbers in the Base-10 number system:

>>> "0".isdecimal()
True

Note that this classification method does not work for what you may intuitively think of as a decimal number, that is, a positive or negative number that may or may not include a decimal point:

>>> "3.14159".isdecimal()
False
>>> "-1".isdecimal()
False

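In other words, this method returns True only when every character is one of the digits {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} (or an equivalent decimal digit in another script), so a sign or a decimal point anywhere in the string makes the result False.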
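
If you do need to accept a leading sign, one option is to strip it before calling isdecimal(). The helper below, looks_like_integer(), is a hypothetical name invented for this sketch, not something provided by Python:

```python
def looks_like_integer(text):
    """Return True for an optional leading sign followed by decimal digits."""
    # Hypothetical helper: drop a single leading "+" or "-" before checking.
    unsigned = text[1:] if text[:1] in ("+", "-") else text
    return unsigned.isdecimal()


print(looks_like_integer("-1"))       # True: the sign is removed first
print(looks_like_integer("2024"))     # True
print(looks_like_integer("3.14159"))  # False: "." is not a decimal digit
print(looks_like_integer("-"))        # False: nothing is left after the sign
```

This still rejects decimal points, so it only covers whole numbers.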

string.isdigit()

This character classification method returns True if the string is non-empty and every character in it is a digit. That covers the decimal characters from the previous section as well as digits that need special handling, such as Unicode subscripts and superscripts:

>>> "\u00B2".isdigit() # superscript 2
True

This classification still does not work for negative numbers:

>>> "-1".isdigit()
False

As the minus sign ( - ) is not a digit itself, this is to be expected.
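
One practical consequence is worth sketching: a True result from isdigit() does not guarantee that int() can convert the string. The variable names below are made up for this example:

```python
superscript_two = "\u00B2"

is_digit = superscript_two.isdigit()      # True: a superscript counts as a digit
is_decimal = superscript_two.isdecimal()  # False: it is not a Base-10 decimal character

# int() only accepts decimal characters, so the conversion fails here.
try:
    int(superscript_two)
    converted = True
except ValueError:
    converted = False

print(is_digit, is_decimal, converted)
```

If the goal is to validate input before calling int(), isdecimal() is usually the safer check.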

string.isnumeric()

This character classification method returns True if the string is non-empty and every character in it can be conceptually interpreted as a number, even if the decimal digits themselves aren't used. This adds support for roman numerals, fractions, currency numerators, and much more:

>>> "\u2168".isnumeric() # roman numeral for the number 9
True

Again, strings for negative numbers are not considered numeric in Python, because the minus sign is not a numeric character:

>>> "-1".isnumeric()
False
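
To see how isdecimal(), isdigit(), and isnumeric() relate to one another, it can help to run all three on the same inputs. The sample strings and labels below are chosen for this sketch only:

```python
samples = {
    "digit five": "5",
    "superscript two": "\u00B2",
    "vulgar fraction one half": "\u00BD",
    "roman numeral nine": "\u2168",
    "negative one": "-1",
}

# Map each label to a tuple of (isdecimal, isdigit, isnumeric).
results = {
    label: (text.isdecimal(), text.isdigit(), text.isnumeric())
    for label, text in samples.items()
}

for label, flags in results.items():
    print(f"{label:26} decimal={flags[0]!s:5} digit={flags[1]!s:5} numeric={flags[2]}")
```

For these samples, each method accepts everything the previous one accepts plus more: every decimal character is a digit, and every digit is numeric.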

string.isalnum()

This character classification method returns True if the string is non-empty and, for each character in it, at least one of the previous four conditions is true:

>>> "abcdefg".isalnum() # letters: isalpha
True
>>> "12345".isalnum() # numbers: isdecimal
True
>>> "\u00B2".isalnum() # superscript 2: isdigit
True
>>> "\u2168".isalnum() # roman numeral 9: isnumeric
True
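
Because the check is applied character by character, a string that mixes letters and numbers can be alphanumeric even when none of the four narrower methods accepts it as a whole. A minimal sketch, using strings invented for this example:

```python
mixed = "abc123".isalnum()               # True: every character is a letter or a number
with_space = "abc 123".isalnum()         # False: the space is neither
with_underscore = "user_name".isalnum()  # False: the underscore is neither
empty = "".isalnum()                     # False: at least one character is required

# None of the four narrower methods accepts the mixed string on its own.
narrower = (
    "abc123".isalpha(),
    "abc123".isdecimal(),
    "abc123".isdigit(),
    "abc123".isnumeric(),
)

print(mixed, with_space, with_underscore, empty, narrower)
```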

string.isspace()

This character classification method returns True if the string is non-empty and consists entirely of whitespace characters, such as spaces, tabs (\t), carriage returns (\r), and newlines (\n):

>>> "    ".isspace() # four spaces
True
>>> "   ".isspace() # a tab
True
>>> """
...
... """.isspace() # a multiline string
True

You can use this method on escape sequences as well:

>>> "\n".isspace() # newline
True
>>> "\t".isspace() # tab
True
>>> "\r".isspace() # carriage return
True
>>> "\f".isspace() # form feed
True
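
One possible use for isspace() is dropping blank lines from a block of text. The sketch below assumes a made-up text value. Note that an empty string is not whitespace, so it needs its own check:

```python
text = "first\n   \n\t\nsecond\n\nthird"
lines = text.split("\n")

# "".isspace() is False, so test for empty lines separately.
non_blank = [line for line in lines if line and not line.isspace()]

empty_is_space = "".isspace()          # False: at least one character is required
nbsp_is_space = "\u00A0".isspace()     # True: a no-break space is Unicode whitespace

print(non_blank)
```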

string.isprintable()

This character classification method returns True if all characters in the string are printable, or if the string is empty. In general, control characters like the ones seen in the previous section that specify newlines, carriage returns, tabs, and so on are not considered printable:

>>> "n".isprintable() # a letter
True
>>> "\n".isprintable() # newline
False
>>> "\t".isprintable() # tab
False
>>> "\r".isprintable() # carriage return
False
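
A plain space is the one whitespace character that does count as printable. The sketch below shows that, along with one way to make hidden characters visible before logging a string. The helper name escape_nonprintable() is hypothetical and exists only in this example:

```python
def escape_nonprintable(text):
    """Replace each non-printable character with its backslash escape."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )


space_is_printable = " ".isprintable()  # True: the space is printable
empty_is_printable = "".isprintable()   # True: an empty string passes as well

print(escape_nonprintable("a\tb\n"))    # a\tb\n
```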

Bonus: string.isidentifier()

This character classification method returns True if a string is a valid name for a Python object, be that a variable, a function, a class, a module, etc. The string must follow Python's rules for naming objects; in other words, it must start with a letter or an underscore, and may otherwise contain letters, the decimal digits 0-9, or underscores:

>>> "hello".isidentifier()
True
>>> "1hello".isidentifier()
False
>>> "h_e_l_l_o".isidentifier()
True

However, this method cannot check whether a desired identifier is a reserved keyword. These are words, like and, for, and else, that have a special meaning in Python's syntax and are not available for object assignment:

>>> "and".isidentifier()
True

In this case, the string "and" is determined to be a valid identifier, even though it would not be allowed in Python due to its status as a reserved keyword. Trying to assign a Python object to this word would result in a SyntaxError:

>>> and = "my new string"
  File "<stdin>", line 1
    and = "my new string"
    ^
SyntaxError: invalid syntax

When using this string method, you'll need to take an additional step to ensure the chosen string is not a reserved keyword. You can do this by calling the iskeyword() function from the keyword module in the standard library:

>>> from keyword import iskeyword
>>> iskeyword('and')
True

Then, you can check that the result of str.isidentifier() is True while the result of iskeyword() is False:

>>> "hello".isidentifier()
True
>>> iskeyword("hello")
False

That way, you can confirm that the desired string is actually available for use as an object identifier.
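
The two checks can be wrapped in a single function. This is only a sketch, and is_available_identifier() is a name invented for this example, not part of Python:

```python
from keyword import iskeyword


def is_available_identifier(name):
    """Return True if name is a valid identifier and not a reserved keyword."""
    # Hypothetical helper combining the two checks described above.
    return name.isidentifier() and not iskeyword(name)


print(is_available_identifier("hello"))     # True
print(is_available_identifier("_private"))  # True: a leading underscore is allowed
print(is_available_identifier("and"))       # False: reserved keyword
print(is_available_identifier("1hello"))    # False: starts with a digit
```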

Summary

In this article, you learned how to use string methods to identify what type of character data is contained within a given string.

You saw various methods for classifying common symbols in the form of letters and numbers, as well as methods for working with special characters such as roman numerals, subscripts and superscripts, whitespace characters, and more.

Lastly, you saw how you could use the str.isidentifier() method in combination with the iskeyword() function from the keyword module to ensure that a given string is indeed available for use as a Python identifier.

Next Steps

In this article, you saw eight different string methods for classifying character data in Python. There are other operations you can perform on strings as well, including making strings lowercase and checking if a string contains a substring.

If you're interested in learning more about the basics of Python, coding, and software development, check out our Coding Essentials Guidebook for Developers, where we cover the essential languages, concepts, and tools that you'll need to become a professional developer.

Thanks and happy coding! We hope you enjoyed this article. If you have any questions or comments, feel free to reach out to jacob@initialcommit.io.

Final Notes