Python comes with many powerful tools for handling text formatting. Here I cover the standard string and string-related operations.
In Python, think of a string as a sequence of characters, which means you can use index and slice notation.
Strings as sequence of characters |
    x = "Hello World"
    print(x[0])      # results in H
    print(x[-1])     # results in d
    print(x[:-1])    # results in Hello Worl
    print(len(x))    # results in 11
Concatenation of strings |
    x = "Hello " + "World"    # results in 'Hello World'
    y8 = 8 * "y"              # results in 'yyyyyyyy'
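As a small sketch of the slice notation mentioned above, the following extends the same "Hello World" example with start, stop and step values. The variable names are only illustrative.

```python
x = "Hello World"

first_word = x[:5]       # characters at index 0 up to (not including) 5
last_word = x[6:]        # from index 6 to the end
every_other = x[::2]     # a step of 2 takes every second character
reversed_x = x[::-1]     # a negative step walks the string backwards

print(first_word)        # Hello
print(last_word)         # World
print(every_other)       # HloWrd
print(reversed_x)        # dlroW olleH
```

Slicing always produces a new string; x itself is left untouched.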
Python has a number of special characters and escape sequences. Python 3 strings are Unicode, which includes the standard ASCII character set. I have listed the common escape sequences in the table below.
\' | Single-quote character |
\" | Double-quote character |
\\ | Backslash character |
\a | Bell character |
\b | Backspace character |
\f | Formfeed character |
\n | Newline character |
\r | Carriage-Return character (not the same as \n) |
\t | Tab character |
\xhh | Character with hexadecimal value hh |
\v | Vertical tab character |
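To see how the escape sequences in the table behave, here is a minimal sketch. The raw-string prefix r"..." is not covered in the table; I include it only as a contrast, since it switches escape processing off.

```python
tabbed = "a\tb"
print(len(tabbed))        # 3, the escape \t counts as a single character

hex_a = "\x41"            # \x followed by two hexadecimal digits
print(hex_a)              # A

raw = r"a\tb"             # raw string: the backslash is kept literally
print(len(raw))           # 4

path = "C:\\temp\\new"    # each doubled backslash yields one backslash
print(path)               # C:\temp\new
```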
Python strings have a number of built-in methods that can be used to manipulate the string. You use the dot (.) operator between the string variable name and the method. Strings are immutable, so these methods never change the original string; they return a new value instead. The methods can be seen below.
Join |
    " ".join(["join", "puts", "spaces", "between", "elements"])    # 'join puts spaces between elements'
    "::".join(["Separated", "with", "colons"])                     # 'Separated::with::colons'
    "".join(["Separated", "by", "nothing"])                        # 'Separatedbynothing'
Split |
    x = "You\t\t can have tabs\t\n \t and newlines \n\n mixed in"
    x.split()          # ['You', 'can', 'have', 'tabs', 'and', 'newlines', 'mixed', 'in']
    x = "Mississippi"
    x.split("ss")      # ['Mi', 'i', 'ippi']
    x = 'a b c d'
    x.split(' ', 1)    # ['a', 'b c d'], the second argument is the maximum number of splits
    x.split(' ', 2)    # ['a', 'b', 'c d']
    x.split(' ', 9)    # ['a', 'b', 'c', 'd']
Converting Strings to Numbers |
    float('123.45')    # results in 123.45
    int('3333')        # results in 3333
    int('3333.33')     # raises ValueError at runtime, int cannot parse a decimal point (use float instead)
    int('101', 2)      # results in 5, the second argument is the base, 2 in this case
    int('ff', 16)      # results in 255, we are using base 16
Removing whitespace or specific characters |
    x = " Hello, World\t\t "
    x.strip()        # 'Hello, World'
    x.lstrip()       # 'Hello, World\t\t '
    x.rstrip()       # ' Hello, World'
    x = "www.python.org"
    x.strip("w")     # '.python.org', as we specified to remove w's
    x = "wdwdwx.python.org"
    x.strip("wd")    # 'x.python.org', removes any leading or trailing w's and d's, not just the sequence "wd"
Note: strip removes whitespace (or the given characters) from both ends of the string; lstrip and rstrip do the same at only the left or right end. The original string is not changed.
Searching |
    x = "Mississippi"
    x.find("ss")              # returns the index of the first match, 2 in this case
    x.find("zz")              # returns -1 as nothing was found
    x.find("ss", 3)           # returns 5, the second argument is the start position
    x.find("ss", 0, 3)        # returns -1, the second/third arguments are the start and end positions
    x = "hello World"
    x.rfind("l")              # returns 9, rfind searches from the end of the string, you can also use start/end positions
    x.count("l")              # returns 3 as there are 3 l's
    x = "Mississippi"
    x.startswith("Miss")      # True
    x.startswith("Mist")      # False
    x.endswith("pi")          # True
    x.endswith("p")           # False
    x.endswith(("p", "i"))    # True, you can pass a tuple of alternative suffixes
Modifying |
    x = "Mississippi"
    x.replace("ss", "+++")    # Mi+++i+++ippi
    x.upper()                 # MISSISSIPPI
    x.lower()                 # mississippi
    x.capitalize()            # Mississippi
Note: each call returns a new string and leaves x unchanged. There are many other methods such as title, ljust, rjust, center, swapcase, expandtabs, etc
List manipulations |
    text = "Hello, World"
    wordList = list(text)        # convert the text into a list of characters
    wordList[6:] = []            # removes everything after the comma
    wordList.reverse()           # reverse the list in place
    text = "".join(wordList)     # joins with no separator between the characters, results in ,olleH
Useful Methods |
    x = "123"
    x.isdigit()    # True
    x.isalpha()    # False
    x = "M"
    x.islower()    # False
    x.isupper()    # True
Note: there are other methods like isalnum, isspace, istitle, etc
Convert Objects to Strings |
    repr([1, 2, 3])                 # '[1, 2, 3]', repr converts the list to a string
    x = [1]
    x.append(2)
    x.append([3, 4])
    'the list x is ' + repr(x)      # 'the list x is [1, 2, [3, 4]]'
Note: almost anything can be converted to some sort of string representation by using the built-in repr function. You can also use str, which produces a printable string representation
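The methods above can be chained, because each one returns a new string rather than changing the original. The sketch below shows this, along with one way to guard a string-to-number conversion. The helper names normalize and to_int are my own, for illustration only; they are not built-ins.

```python
def normalize(text):
    """Trim the ends, lowercase, and collapse whitespace runs to single spaces."""
    return " ".join(text.strip().lower().split())


def to_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


raw = "  Hello,\t World \n"
clean = normalize(raw)
print(repr(clean))           # 'hello, world'
print(repr(raw))             # the original string is unchanged

print(to_int("3333"))        # 3333
print(to_int("3333.33"))     # None, int() rejects the decimal point
```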
Python offers more than one way to format strings: the format method, the older % operator, and f-strings. I give examples of each.
Using positional parameters |
    "{0} is the {1} of {2}".format("Ambrosia", "food", "the gods")    # 'Ambrosia is the food of the gods', the replacement positions start at 0
    "{{Ambrosia}} is the {0} of {1}".format("food", "the gods")       # '{Ambrosia} is the food of the gods', doubled braces produce literal braces
Using named parameters |
    "{food} is the food of {user}".format(food="Ambrosia", user="the gods")    # each {name} field is replaced by the keyword argument with that name
Format specifiers |
    "{0:10} is the food of gods".format("Ambrosia")                          # make the field 10 characters wide and pad with spaces
    "{0:{1}} is the food of gods".format("Ambrosia", 10)                     # takes the width from the second parameter
    "{food:{width}} is the food of gods".format(food="Ambrosia", width=10)   # same as above, using named parameters
    "{0:>10} is the food of gods".format("Ambrosia")                         # forces right-justification of the field and pads with spaces
    "{0:&>10} is the food of gods".format("Ambrosia")                        # forces right-justification and pads with & instead of spaces
Formatting strings with % |
    "%s is the %s of %s" % ("Ambrosia", "food", "the gods")    # 'Ambrosia is the food of the gods'
    x = [1, 2, "three"]
    "The %s contains: %s" % ("list", x)                        # "The list contains: [1, 2, 'three']"
    "Pi is <%-6.2f>" % 3.14159                                 # 'Pi is <3.14  >', uses the formatting sequence %-6.2f, C uses similar formatting
String interpolation |
    value = 42
    message = f"The answer is {value}"    # f-string (Python 3.6+): put the variable name inside braces
More Examples of above |
    age = 24
    print("My age is " + str(age) + " years")
    print("My age is {0} years".format(age))
    city = "Milton Keynes"
    print("My age is {0} years and lives in {1}".format(age, city))

    ## Older %-style formatting (common in Python 2, still supported in Python 3)
    print("My age is %d years and lives in %s" % (age, city))
    for i in range(1, 12):
        print("No, %2d squared is %d and cubed is %d" % (i, i ** 2, i ** 3))
    print("PI is approx %12f" % (22 / 7))
    print("PI is approx %12.50f" % (22 / 7))

    ## The newer format() style, generally preferred going forward
    for i in range(1, 12):
        print("No, {0:2} squared is {1:<4} and cubed is {2:<4}".format(i, i ** 2, i ** 3))
    print("PI is approx {0:12}".format(22 / 7))
    print("PI is approx {0:12.50}".format(22 / 7))
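To compare the formatting styles side by side, here is a short sketch that builds the same output three ways. The variable names and the choice of a 10-character field are my own, picked only for illustration.

```python
name, width, pi = "Ambrosia", 10, 3.14159

percent_style = "%-10s|%6.2f" % (name, pi)             # % operator
format_style = "{0:<10}|{1:6.2f}".format(name, pi)     # format method
fstring_style = f"{name:<{width}}|{pi:6.2f}"           # f-string with a nested width

# All three produce a left-justified 10-character name
# and a 6-character number with 2 decimal places.
assert percent_style == format_style == fstring_style
print(fstring_style)    # Ambrosia  |  3.14
```

The f-string accepts the same format specifiers after the colon as the format method, so anything shown in the format specifiers row should carry over directly.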