Understanding Python strings
A guide to understanding Python strings and string formatting
What are strings?
Strings are sequences of characters, written inside either single quotes or double quotes.
Example: "hello" or 'hello'
Fun fact:
What if the string already has a quote in it? For example, consider the sentence: I don't like her. We enclose it in double quotes, since a single quote is already used in "don't".
"I don't like her"
>>> 'Raj doesn't like him.'
File "<ipython-input-4-cc76e3b3f34e>", line 1
'Raj doesn't like him.'
^
SyntaxError: invalid syntax
The error occurs because the word doesn't contains a single quote, which Python reads as the end of the string. To avoid this, we enclose the string in double quotes.
>>> "Raj doesn't like him."
"Raj doesn't like him."
Operations on strings:
Strings are ordered sequences, so we can perform the following operations on them:
1. Indexing
Indexing uses [] with a number inside it to fetch the single character at that position in the string. In Python, indexing starts at 0.
For example, in "hello", index 1 grabs 'e'. To grab the last character, we can use negative (reverse) indexing with -1.
>>> string = "Python is fun"
>>> print(string)
Python is fun
>>> string[0]
'P'
>>> string[5]
'n'
>>> string[-1]
'n'
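As a small sketch of the indexing rules above (the variable names here are only illustrative), a negative index counts from the end, and an index past the end raises an IndexError:

```python
string = "Python is fun"

first = string[0]                      # first character
last = string[-1]                      # last character, counted from the end
same_last = string[len(string) - 1]    # -1 is shorthand for len(string) - 1

# Asking for a position that does not exist raises IndexError.
try:
    string[100]
except IndexError as err:
    message = str(err)

print(first, last, same_last)
print(message)
```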
2. Slicing
Slicing fetches a subsection of the string. It's like slicing a piece from a whole. The syntax is [start:stop:step]. Start is the index where the slice begins. Stop is the index where the slice ends; the character at stop is not included. Step is the size of the jump between characters (1 by default).
>>> string[0:2]
'Py'
>>> string[0:]
'Python is fun'
>>> string[2:]
'thon is fun'
With a step size of 2, every second character is taken. With a step of -1, the string is read backwards.
>>> string[::2]
'Pto sfn'
>>> string[::-1]
'nuf si nohtyP'
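Here is a short sketch that puts start, stop and step together (variable names are illustrative). It also shows one behaviour not covered above: unlike indexing, a slice with an out-of-range bound does not raise an error, it is simply clipped to the string.

```python
string = "Python is fun"

first_word = string[:6]          # start defaults to 0, stop 6 is excluded
last_word = string[-3:]          # negative start counts from the end
every_second = string[::2]       # step of 2 takes every second character
reversed_string = string[::-1]   # step of -1 walks the string backwards
clipped = string[7:100]          # stop beyond the end is clipped, no error

print(first_word, last_word, every_second, reversed_string, clipped, sep="\n")
```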
String properties — Immutability
Immutability means that a string cannot be changed in place once it has been created.
>>> word = "python"
>>> word[0] = 'q'
--------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-d61bfcaff2b2> in <module>
----> 1 word[0] = 'q'

TypeError: 'str' object does not support item assignment
The error is raised because strings are immutable, so item assignment cannot be performed. To get a changed string, we have to build a new one using slicing and string concatenation (+).
Change python -> qython using the + operator
>>> last=word[1:]
>>> 'q' + last
'qython'
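The sketch below repeats the slicing approach and, as an alternative not used in the text above, the built-in str.replace() method. In both cases a new string is created and the original is left untouched.

```python
word = "python"

# Build a new string from a new first letter plus the rest of the old one.
changed = "q" + word[1:]

# str.replace() also returns a new string; the third argument limits
# the number of replacements to one.
replaced = word.replace("p", "q", 1)

print(changed, replaced, word)
```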
Multiplication of strings (repetition)
>>> string1 = "j"
>>> string1 * 10
'jjjjjjjjjj'
When numbers are written as strings and combined using the + operator, they are not added but joined.
>>> '5' + '2'
'52'
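If the goal is arithmetic, the strings have to be converted first. A minimal sketch, assuming the strings hold valid integers (int() is the standard built-in conversion; the variable names are illustrative):

```python
joined = '5' + '2'             # string concatenation
added = int('5') + int('2')    # convert to integers, then add
repeated = '5' * 3             # repetition, not multiplication

# Mixing a string and a number with + is an error rather than a guess.
try:
    '5' + 2
    mixed_fails = False
except TypeError:
    mixed_fails = True

print(joined, added, repeated, mixed_fails)
```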
Built-in string methods
Objects in Python generally have built-in methods. Let's look at some useful string methods.
Upper case
>>> str1 = "Python is fun"
>>> str1.upper()
'PYTHON IS FUN'
Lower case
>>> str1.lower()
'python is fun'
Split method
It splits the string on the separator we pass and returns a list of the pieces. With no argument, it splits on white space.
Splitting on white spaces
>>> str1.split()
['Python', 'is', 'fun']
Splitting on ‘n’
>>> str1.split('n')
['Pytho', ' is fu', '']
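A short sketch of split() together with its counterpart join(), which is not covered above but is the usual way to glue the pieces back together. The optional second argument to split() limits the number of splits.

```python
str1 = "Python is fun"

words = str1.split()             # split on white space
rejoined = " ".join(words)       # join the list back with spaces
hyphenated = "-".join(words)     # or with any other separator
limited = str1.split(" ", 1)     # split at most once

print(words, rejoined, hyphenated, limited, sep="\n")
```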
String formatting
String formatting lets you inject items into a string rather than chaining items together using commas or string concatenation. This is known as string interpolation. There are three ways to perform string formatting:
- The oldest method involves placeholders using the modulo % character.
- An improved technique uses the .format() string method.
- The newest method, introduced with Python 3.6, uses formatted string literals, called f-strings.
1) Placeholder method
You can use %s to inject strings into your print statements. The modulo % is referred to as a "string formatting operator".
>>> print("Python is %s." %'fun')
Python is fun.
>>> print("Python is %s and %s" %('fun','easy'))
Python is fun and easy
>>> x = 'fun'
>>> y = 'easy'
>>> print("Python is %s and %s" %(x,y))
Python is fun and easy
Format conversion using placeholders
Two conversion specifiers, %s and %r, convert any Python object to a string using str() and repr() respectively. %r, like repr(), delivers the string representation of the object, including quotation marks and any escape characters.
>>> print('Python is %s.' %'fun')
Python is fun.
>>> print('Python is %r.' %'fun')
Python is 'fun'.
Insert tab into a string
>>> print('Python is %s.' %'\tfun')
Python is       fun.
>>> print('Python is %r.' %'\tfun')
Python is '\tfun'.
The %s specifier converts whatever it sees into a string, including integers and floats. The %d specifier converts numbers to integers first, dropping the decimal part rather than rounding.
>>> print("I am %s kgs" %45.67)
I am 45.67 kgs
>>> print("I am %d kgs" %45.67)
I am 45 kgs
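The three specifiers can be compared side by side in a short sketch (variable names are illustrative). The negative value is an extra case, added to show that %d truncates toward zero instead of rounding.

```python
weight = 45.67

as_str = "I am %s kgs" % weight      # str() of the float
as_int = "I am %d kgs" % weight      # decimal part is dropped
as_repr = "Python is %r." % 'fun'    # repr() keeps the quotes
negative = "%d" % -45.67             # truncates toward zero, not down

print(as_str, as_int, as_repr, negative, sep="\n")
```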
Padding and Precision of Floating Point Numbers using placeholders
Floating point numbers use a format such as %5.2f. Here, 5 is the minimum number of characters the formatted number should occupy, counting the decimal point; it is padded with white space on the left if the number is shorter than this. Next, .2f is the number of digits to show past the decimal point.
>>> print('Number is: %5.2f' %(87.345))
Number is: 87.34
>>> print('Number is: %5.8f' %(87.345))
Number is: 87.34500000
>>> print('Number is: %10.2f' %(87.345))
Number is:      87.34
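To make the padding visible, here is a small sketch with a hypothetical value of 3.14159. It also uses two standard printf-style flags that are not discussed above: - for left alignment and 0 for zero padding.

```python
value = 3.14159  # hypothetical value, chosen only for illustration

padded = '%10.2f' % value        # right-aligned in a field of width 10
left = '%-10.2f' % value         # '-' flag: left-aligned in the same width
zero_padded = '%010.2f' % value  # '0' flag: pad with zeros instead of spaces

# repr() shows the surrounding white space explicitly.
print(repr(padded), repr(left), repr(zero_padded), sep="\n")
```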
2) .format() method
Syntax: "String here {} then also {}".format('something1', 'something2')
>>> print("Python is {}".format("fun"))
Python is fun
>>> print('I am {} {}'.format('jane', 'penne'))
I am jane penne
Inserting using index position
>>> print('I am {1} {0}'.format('jane', 'penne'))
I am penne jane
Inserting using keywords
>>> print('I am {j} {p}'.format(j='jane', p='penne'))
I am jane penne
For floats, the syntax is "{value:width.precisionf}"
>>> number = 121/544
>>> number
0.22242647058823528
>>> print("The number is {r}".format(r=number))
The number is 0.22242647058823528
Rounding to 3 decimal places with width 1. The width only adds white space on the left when the number is shorter than the width.
>>> print("The number is {r:1.3f}".format(r=number))
The number is 0.222
>>> print("The number is {r:10.3f}".format(r=number))
The number is      0.222
Alignment, padding and precision using .format() method
Within the curly braces you can assign field widths, left/right alignments, rounding parameters and more.
>>> print('{0:8} | {1:6}'.format('Name', 'Age'))
Name     | Age
>>> print('{0:8} | {1:6}'.format('Ravi', 13.))
Ravi     |   13.0
>>> print('{0:8} | {1:6}'.format('Raj', 19))
Raj      |     19
By default, .format() aligns text to the left and numbers to the right. You can pass an optional <, ^ or > to set a left, center or right alignment.
>>> print('{0:<8} | {1:^8} | {2:>8}'.format('Left','Center','Right'))
Left     |  Center  |    Right
>>> print('{0:<8} | {1:^8} | {2:>8}'.format(10,20,30))
10       |    20    |       30
You can precede the alignment operator with a padding character.
>>> print('{0:=<8} | {1:-^8} | {2:.>8}'.format('Left','Center','Right'))
Left==== | -Center- | ...Right
>>> print('{0:=<8} | {1:-^8} | {2:.>8}'.format(11,22,33))
11====== | ---22--- | ......33
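One way these alignment options might be combined is a small table where every row shares a single format template. This is only a sketch; the template and the list of rows are made up for illustration.

```python
# Hypothetical template: names left-aligned in 8 columns,
# ages right-aligned in 6 columns.
template = '{0:<8} | {1:>6}'
rows = [('Ravi', 13.0), ('Raj', 19)]

lines = [template.format('Name', 'Age')]
for name, age in rows:
    lines.append(template.format(name, age))

table = '\n'.join(lines)
print(table)
```

Because every row goes through the same template, all lines have the same length and the | separators line up.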
3) f-strings (formatted string literals)
The advantage of this method is that you can bring outside variables directly into the string rather than passing them as arguments through .format(var).
>>> word = 'fun'
>>> print(f'python is {word}')
python is fun
>>> word1 = 'python'
>>> print(f'{word1} is {word}')
python is fun
To get the repr() string representation of a value, add !r after the name as shown below:
>>> print(f"Python is {word!r}")
Python is 'fun'
For floats, the syntax is "{value:{width}.{precision}}". Here, because no f type code is given, precision refers to the total number of significant digits and not just those following the decimal point.
>>> print(f"The number is {number:{10}.{4}}")
The number is     0.2224
Also, without the f type code, an f-string does not pad with zeros to the right of the decimal point. If that is necessary, use the same format specification as in the .format() method (for example 10.4f) inside the f-string.
>>> num1 = 98.21
>>> print("The number is:{r:10.4f}".format(r=num1))
The number is:   98.2100
>>> print(f"The number is:{num1:{10}.{4}}")
The number is:     98.21
>>> print(f"The number is:{num1:10.4f}")
The number is:   98.2100
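The difference between the two precision styles can be checked in a short sketch. The names width and precision are illustrative; nesting them in braces inside the format specification is standard f-string syntax.

```python
num1 = 98.21

general = f"{num1:10.4}"    # no type code: 4 significant digits
fixed = f"{num1:10.4f}"     # 'f' type code: 4 digits after the decimal point

# Width and precision can also come from variables.
width, precision = 10, 4
nested = f"{num1:{width}.{precision}f}"

# repr() makes the leading white space visible.
print(repr(general), repr(fixed), repr(nested), sep="\n")
```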