String Data Type
- A string is a sequence of characters
- A string literal uses quotes: 'Hello' or "Hello"
- For strings, + means “concatenate”
- When a string contains numbers, it is still a string
- We can convert numbers in a string into a number using int()
>>> str1 = "Hello"
>>> str2 = 'there'
>>> bob = str1 + str2
>>> print(bob)
Hellothere
>>> str3 = '123'
>>> x = int(str3) + 1
>>> print(x)
124
Reading and Converting
- We prefer to read data in using strings and then parse and convert the data as we need
- This gives us more control over error situations and/or bad user input
- Numbers read as raw input arrive as strings and must be converted before we can do arithmetic with them
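A minimal sketch of the read-then-convert idea above. The helper name to_int and the variable raw are illustrative assumptions, not part of these notes; raw stands in for text that would normally come from the user.

```python
def to_int(text, default=-1):
    """Convert text to an int, returning default when it is not a valid number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for text such as 'forty-two' or ''
        return default

raw = '42'                     # stand-in for a string read from the user
good = to_int(raw)
bad = to_int('forty-two')
print(good + 1)                # 43
print(bad)                     # -1
```

Because the data starts out as a string, we decide what happens on bad input instead of letting the program stop with a traceback.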
Looking Inside Strings
- We can get at any single character in a string using an index specified in square brackets
- The index value must be an integer and starts at zero
- The index value can be an expression that is computed
>>> fruit = 'banana'
>>> letter = fruit[1]
>>> print(letter)
a
>>> x = 3
>>> w = fruit[x-1]
>>> print(w)
n
A Character Too Far
- You will get a Python error (an IndexError) if you attempt to index beyond the end of a string
- So be careful when constructing index values and slices
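A short sketch of what indexing too far looks like and one way to guard against it. The variable names are illustrative; the len() check is only one possible approach.

```python
fruit = 'banana'

try:
    letter = fruit[6]          # valid indexes are 0 through 5
except IndexError as err:
    letter = None
    print('Index error:', err)

# Checking against len() first avoids the error entirely
index = 6
if index < len(fruit):
    print(fruit[index])
else:
    print('index', index, 'is past the end')
```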
Strings Have Length
The built-in function len gives us the length of a string
>>> fruit = 'banana'
>>> print(len(fruit))
6
Len Function
A function is some stored code that we use. A function takes some input and produces an output.
Looping Through Strings
Using a while statement, an iteration variable, and the len function, we can construct a loop that looks at each letter in a string individually.
>>> fruit = 'banana'
>>> index = 0
>>> while index < len(fruit):
...     letter = fruit[index]
...     print(index, letter)
...     index = index + 1
...
0 b
1 a
2 n
3 a
4 n
5 a
- A definite loop using a for statement is much more elegant
- The iteration variable is completely taken care of by the for loop
fruit = 'banana'
for letter in fruit:
    print(letter)
Looping and Counting
This is a simple loop that loops through each letter in a string and counts the number of times the loop encounters the 'a' character.
word = 'banana'
count = 0
for letter in word:
    if letter == 'a':
        count = count + 1
print(count)
Looking deeper into in
- The iteration variable “iterates” through the sequence (ordered set)
- The block (body) of code is executed once for each value in the sequence
- The iteration variable moves through all of the values in the sequence
for letter in 'banana':
    print(letter)
Slicing Strings
- We can also look at any continuous section of a string using a colon operator
- The second number is one beyond the end of the slice - “up to but not including”
- If the second number is beyond the end of the string, it stops at the end
>>> s = 'Monty Python'
>>> print(s[0:4])
Mont
>>> print(s[6:7])
P
>>> print(s[6:20])
Python
If we leave off the first number or the last number of the slice, it is assumed to be the beginning or end of the string respectively.
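A small sketch of slices with one or both numbers left off, using the same string as above. The variable names are illustrative.

```python
s = 'Monty Python'

first_two = s[:2]      # start omitted: begins at index 0
from_eight = s[8:]     # end omitted: runs to the end of the string
whole = s[:]           # both omitted: a copy of the whole string

print(first_two)       # Mo
print(from_eight)      # thon
print(whole)           # Monty Python
```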
String Concatenation
When the + operator is applied to strings, it means “concatenation”
>>> a = 'Hello'
>>> b = a + 'There'
>>> print(b)
HelloThere
>>> c = a + ' ' + 'There'
>>> print(c)
Hello There
Using in as a Logical Operator
- The in keyword can also be used to check to see if one string is “in” another string
- The in expression is a logical expression that returns True or False and can be used in an if statement
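A brief sketch of in used as a logical expression. The variable names and the printed message are illustrative.

```python
fruit = 'banana'

has_n = 'n' in fruit        # True: single character present
has_m = 'm' in fruit        # False: character absent
has_nan = 'nan' in fruit    # True: works for longer substrings too

print(has_n, has_m, has_nan)

# Because the result is True or False, it can drive an if statement
if 'a' in fruit:
    print('Found it!')
```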
String Library
- Python has a number of string functions which are in the string library
- These functions are already built into every string - we invoke them by appending the function to the string variable
- These functions do not modify the original string, instead they return a new string that has been altered
>>> greet = 'Hello Bob'
>>> zap = greet.lower()
>>> print(zap)
hello bob
>>> print(greet)
Hello Bob
>>> print('Hi There'.lower())
hi there
- Strings are also objects.
str.capitalize()
str.center(width[, fillchar])
str.endswith(suffix[, start[, end]])
str.find(sub[, start[, end]])
str.lstrip([chars])
str.replace(old, new[, count])
str.lower()
str.rstrip([chars])
str.strip([chars])
str.upper()
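A quick sketch that exercises a few of the methods listed above. The sample strings are made up for illustration; each call returns a new value and leaves the original string unchanged.

```python
word = 'python'

capitalized = word.capitalize()      # first letter upper case
centered = word.center(10, '*')      # padded to width 10 with '*'
ends_on = word.endswith('on')        # True or False
trimmed = 'xxhixx'.strip('x')        # strips the given characters from both ends

print(capitalized)                   # Python
print(centered)                      # **python**
print(ends_on)                       # True
print(trimmed)                       # hi
print(word)                          # python (unchanged)
```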
Searching a String
- We use the find() function to search for a substring within another string
- find() finds the first occurrence of the substring
- If the substring is not found, find() returns -1
- Remember that string positions start at zero
>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print(pos)
2
>>> aa = fruit.find('z')
>>> print(aa)
-1
Making everything UPPER CASE
- You can make a copy of a string in lower case or upper case
- Often when we are searching for a string using find() we first convert the string to lower case so we can search a string regardless of case
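A minimal sketch of searching regardless of case by lowering the string before calling find(). The sample line and variable names are illustrative.

```python
line = 'Please Have a Nice Day'

pos_exact = line.find('nice')              # case-sensitive: no match
pos_any_case = line.lower().find('nice')   # search a lower-case copy instead

print(pos_exact)       # -1
print(pos_any_case)    # 14
print(line.upper())    # PLEASE HAVE A NICE DAY
```

The position found in the lower-case copy is also valid in the original string, since changing case here does not change its length.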
Search and Replace
- The replace() function is like a “search and replace” operation in a word processor
- It replaces all occurrences of the search string with the replacement string
>>> greet = 'Hello Bob'
>>> nstr = greet.replace('Bob', 'Jane')
>>> print(nstr)
Hello Jane
>>> nstr = greet.replace('o', 'X')
>>> print(nstr)
HellX BXb
Stripping Whitespace
- Sometimes we want to take a string and remove whitespace at the beginning and/or end
- lstrip() and rstrip() remove whitespace at the left or right
- strip() removes both beginning and ending whitespace
>>> greet = ' Hello Bob '
>>> greet.lstrip()
'Hello Bob '
>>> greet.rstrip()
' Hello Bob'
>>> greet.strip()
'Hello Bob'
Prefixes
>>> line = 'Please have a nice day'
>>> line.startswith('Please')
True
>>> line.startswith('p')
False
Parsing and Extracting