Python Basics 0#
In this notebook we will go over some of the basics of Python.
Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data.
After this introduction, we recommend that you work through Python Basics 1-5, which cover these concepts (as well as others) in more detail, with some hands-on activities for practice.
from Python Basics 1
Python Syntax and Commands#
Expressions & Operators
Data Types (Integers, Floats, and Strings)
Functions
Variables
Loops
Comparison and Equality
Conditionals
Input
Importing Libraries
more on Python syntax
Expressions and Operators#
The simplest form of Python programming is an expression using an operator. An expression is a simple mathematical statement like:
1 + 1
The operator in this case is +, sometimes called “plus” or “addition”. Try this operation in the code box below. Remember to click the “Run” button or press Ctrl + Enter (Windows) or shift + return (OS X) on your keyboard to run the code.
# Type the expression in this code block. Then run it.
We can also do multiplication (*) and division (/). While you may have used an “×” to represent multiplication in grade school, Python uses an asterisk (*). In Python,
2 × 2
is written as
2 * 2
When you run, or evaluate, an expression in Python, the order of operations is followed. This means that expressions are evaluated in this order:
Parentheses
Exponents
Multiplication and Division (from left to right)
Addition and Subtraction (from left to right)
Python can evaluate parentheses and exponents, as well as a number of additional operators you may not have learned in grade school.
# Try operations in this code cell.
# What happens when you add in parentheses?
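As a small illustration of the order of operations, the sketch below compares the same numbers with and without parentheses. The numbers and variable names are arbitrary choices for this example; `**` is Python's exponent operator.

```python
# Multiplication is evaluated before addition
without_parentheses = 2 + 3 * 4   # 3 * 4 first, then add 2

# Parentheses are evaluated first
with_parentheses = (2 + 3) * 4    # 2 + 3 first, then multiply by 4

# Exponents are evaluated before multiplication
with_exponent = 2 * 3 ** 2        # 3 ** 2 first, then multiply by 2

print(without_parentheses)  # 14
print(with_parentheses)     # 20
print(with_exponent)        # 18
```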
Data Types (Integers, Floats, and Strings)#
All expressions evaluate to a single value. In the above examples, our expressions evaluated to a single numerical value. Numerical values come in two basic forms:
integer
float (or floating-point number)
An integer, what we sometimes call a “whole number”, is a number without a decimal point that can be positive or negative. When a value uses a decimal, it is called a float or floating-point number.
Python can also help us manipulate text. A snippet of text in Python is called a string. A string can be written with single or double quotes. A string can use letters, spaces, line breaks, and numbers. So 5 is an integer, 5.0 is a float, but ‘5’ and ‘5.0’ are strings. A string can also be blank, such as ‘’.
| Familiar Name | Programming name | Examples |
|---|---|---|
| Whole number | integer | -3, 0, 2, 534 |
| Decimal | float | 6.3, -19.23, 5.0, 0.01 |
| Text | string | ‘Hello world’, ‘1700 butterflies’, ‘’, ‘1823’ |
The distinction between each of these data types may seem unimportant, but Python treats each one differently. For example, an integer can be equal to a float, but a string is never equal to an integer or a float, even when the string contains the same digits.
To evaluate whether two values are equal, we can use two equals signs between them. The expression will evaluate to either True or False.
# Run this code cell to determine whether the values are equal
42 == 42.0
True
# Run this code cell to compare an integer with a string
15 == 'fifteen'
False
# Run this code cell to compare an integer with a string
15 == '15'
False
When we use the addition operator on integers or floats, they are added to create a sum. When we use the addition operator on strings, they are combined into a single, longer string. This is called concatenation.
# Combine the strings 'hello' and 'world'
'hello' + 'world'
'helloworld'
Notice that the strings are combined exactly as they are written. There is no space between the strings. If we want to include a space, we need to add the space to the end of ‘hello’ or the beginning of ‘world’. We can also concatenate multiple strings.
# Combine three strings
When we use the addition operator, the values must be all numbers or all strings. Mixing the two will raise an error.
# Try adding a string to an integer
'55' + 23
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[8], line 2
1 # Try adding a string to an integer
----> 2 '55' + 23
TypeError: can only concatenate str (not "int") to str
Here, we receive the error can only concatenate str (not "int") to str. Python assumes we would like to join two strings together, but it does not know how to join a string to an integer. Put another way, Python is unsure if we want:
‘55’ + 23
to become
‘5523’
or
78
We can multiply a string by an integer. The result is simply the string repeated the appropriate number of times.
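A minimal sketch of multiplying a string by an integer; the string 'ha' and the variable name are arbitrary choices for illustration.

```python
# Multiply a string by an integer to repeat it
repeated_string = 'ha' * 3
print(repeated_string)  # hahaha
```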
Variables#
A variable is like a container that stores information. There are many kinds of information that can be stored in a variable, including the data types we have already discussed (integers, floats, and strings). We create (or initialize) a variable with an assignment statement. The assignment statement gives the variable an initial value.
# Initialize an integer variable and add 22
new_integer_variable = 5
new_integer_variable + 22
The value of a variable can be overwritten with a new value.
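For example, a second assignment statement replaces the value stored in the variable. The sketch below reuses the variable name from the cell above; the new values are arbitrary.

```python
# Initialize the variable
new_integer_variable = 5
print(new_integer_variable)  # 5

# Overwrite the variable with a new value
new_integer_variable = 10
print(new_integer_variable)  # 10

# A variable can also be overwritten using its own current value
new_integer_variable = new_integer_variable + 1
print(new_integer_variable)  # 11
```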
You can create a variable with almost any name, but there are a few guidelines that are recommended.
Variable Names Should be Descriptive#
If we create a variable that stores the day of the month, it is helpful to give it a name that makes the value stored inside it clear, like day_of_month. From a logical perspective, we could call the variable almost anything (hotdog, rabbit, flat_tire). As long as we are consistent, the code will execute the same. When it comes time to read, modify, and understand the code, however, it will be confusing to you and others. Consider this simple program that lets us change the days variable to compute the number of seconds in that many days.
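The original program is not reproduced here. The following is a minimal sketch of what such a program might look like; every variable name other than days is an assumption chosen for illustration.

```python
# Compute the number of seconds in a given number of days
days = 3                  # Change this value to try other numbers of days
hours_in_day = 24
minutes_in_hour = 60
seconds_in_minute = 60

seconds_in_days = days * hours_in_day * minutes_in_hour * seconds_in_minute
print(seconds_in_days)  # 259200
```

Try renaming the variables to hotdog, rabbit, and flat_tire. The program runs the same way, but it becomes much harder to read.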
As code gets longer and more complex, having clear variable names and explanatory comments is very important.
In addition to being descriptive, variable names must follow 3 basic rules:
Must be one word (no spaces allowed)
Only letters, numbers and the underscore character (_)
Cannot begin with a number
The print() and input() Functions#
Many different kinds of programs often need to do very similar operations. Instead of writing the same code over again, you can use a function. Essentially, a function is a small snippet of code that can be quickly referenced. There are three kinds of functions:
Native functions built into Python
Functions others have written that you can import
Functions you write yourself
For now, let’s look at a few of the native functions. One of the most common functions used in Python is the print() function, which simply prints a string.
# A print function that prints: Hello World!
print('Hello World!')
Hello World!
We could also define a variable with our string 'Hello World!' and then pass that variable into the print() function. It is common for functions to take an input, called an argument, that is placed inside the parentheses ().
# Define a string and then print it
our_string = 'Hello World!'
print(our_string)
Hello World!
There is also an input() function for taking user input.
# A program to greet the user by name
print('Hi. What is your name?') # Ask the user for their name
user_name = input() # Take the user's input and put it into the variable user_name
print('Pleased to meet you, ' + user_name) # Print a greeting with the user's name
Hi. What is your name?
We defined a string variable user_name to hold the user’s input. We then called the print() function to print the concatenation of ‘Pleased to meet you, ‘ and the user’s input that was captured in the variable user_name. Remember that we can use a + to concatenate, meaning join these strings together.
Here are a couple more tricks we can use. You can pass a string into the input() function as a prompt, and you can use an f string to add the variable into the printed string without using the + operator to concatenate both strings.
# A program to greet the user by name
# Passing the prompt into the input() function
# prints it automatically before taking the input
user_name = input('Hi. What is your name? ')
An f string starts with the letter f and automatically inserts the values of variables enclosed in curly braces, like {variable_name}.
# Using an f string to automatically concatenate
# Without using the plus (+) operator
print(f'Pleased to meet you, {user_name}')
Pleased to meet you, ri
We can concatenate many strings together, but we cannot concatenate strings with integers or floats.
# Concatenating many strings within a print function
print('Hello, ' + 'all ' + 'these ' + 'strings ' + 'are ' + 'being ' + 'connected ' + 'together.')
Hello, all these strings are being connected together.
The str(), int(), and float() functions#
We can transform one variable type into another variable type with the str(), int(), and float() functions.
# Converting an integer into a string
print('There are ' + str(7) + ' continents.')
There are 7 continents.
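The same idea works in the other direction. As a sketch, the conversions below resolve the earlier '55' + 23 ambiguity both ways; the variable names are chosen for illustration.

```python
number_as_string = '55'

# Convert the string to an integer so the values are added
total = int(number_as_string) + 23
print(total)  # 78

# Convert the integer to a string so the values are concatenated
joined = number_as_string + str(23)
print(joined)  # 5523

# Convert the string to a float
as_float = float(number_as_string)
print(as_float)  # 55.0
```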
from Python Basics 2
Flow Control statements#
if statements
else statements
elif statements
while and for loop statements
Handling errors with try and except
Used in this section:
random to generate random numbers
Flow Control Statements#
To write programs that can do multiple tasks you’ll need a way for your programs to decide which action comes next. We can control when (or if) code gets executed with flow control statements. If a program is a set of steps for accomplishing a task, then flow control statements help the program decide the next action.
Flow control statements work like a flowchart. For example, let’s say your goal is to hang out and relax with friends. There are a number of steps you might take, depending on whether your friends are available or you feel like making some new friends.
Each diamond in our flowchart represents a decision that has to be made about the best step to take next. This is the essence of flow control statements. They help a program decide what the next step should be given the current circumstances.
Boolean Values#
One way to create flow control statements is with boolean values, which have two possible values: True or False. In our example above, we could consider a “Yes” to be “True” and a “No” to be “False.” When we have the data we need to answer each question, we could store that answer in a variable, like:
are_friends_available = False
make_new_friends = True
new_friend_available = True
This would allow us to determine which action to take next. When we assign boolean values to a variable, the first letter must be capitalized (True and False, not true and false).
Comparison Operators#
Now that we have a way to store integers, floats, strings, and boolean values in variables, we can use a comparison operator to help make decisions based on those values. We used the comparison operator == in Python Basics 1. This operator asks whether two expressions are equal to each other.
There are additional comparison operators that can help us with flow control statements.
| Operator | Meaning |
|---|---|
| == | Equal to |
| != | Not equal to |
| < | Less than |
| > | Greater than |
| <= | Less than or equal to |
| >= | Greater than or equal to |
A string cannot be equal to a float or an integer, but an integer can be equal to a float. We can also use comparison operators with variables.
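As a short sketch, the comparisons below use two variables whose names and values are made up for illustration. Each comparison evaluates to a boolean value.

```python
number_of_cats = 3
number_of_dogs = 5

print(number_of_cats == number_of_dogs)  # False
print(number_of_cats != number_of_dogs)  # True
print(number_of_cats < number_of_dogs)   # True
print(number_of_cats >= 3)               # True

# An integer can be equal to a float, but not to a string
print(number_of_cats == 3.0)  # True
print(number_of_cats == '3')  # False
```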
Boolean Operators (and/or/not)#
We can also use Boolean operators (and/or/not) to create expressions that evaluate to a single Boolean value (True/False).
Using the Boolean Operator and#
The and operator determines whether both conditions are True.
# If condition one is True AND condition two is True
# What will the evaluation be?
True and True
# If condition one is True AND condition two is False
# What will the evaluation be?
True and False
In order for an and expression to evaluate to True, every condition must be True. Here is the “Truth Table” for every pair:
| Expression | Evaluation |
|---|---|
| True and True | True |
| True and False | False |
| False and True | False |
| False and False | False |
Since and expressions require all conditions to be True, they can easily result in False evaluations.
Using the Boolean Operator or#
The or operator determines whether any condition is True.
An or expression evaluates to True if any condition is True. Here is the “Truth Table” for every pair:
| Expression | Evaluation |
|---|---|
| True or True | True |
| True or False | True |
| False or True | True |
| False or False | False |
Since or expressions only require a single condition to be True, they can easily result in True evaluations.
Using the Boolean Operator not#
The not operator only operates on a single expression, essentially flipping True to False or False to True.
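A brief sketch of the not operator; the variable names is_raining and stay_dry are hypothetical.

```python
print(not True)   # False
print(not False)  # True

# not also works on a variable that holds a boolean value
is_raining = False
stay_dry = not is_raining
print(stay_dry)  # True
```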
Combining Boolean and Comparison Operators#
We can combine Boolean operators and comparison operators to create even more nuanced Truth tests.
So far, we have evaluated one or two conditions at once, but we could compare even more at once. (In practice, this is rare since it creates code that can be difficult to read.) Boolean operators also have an order of operations like mathematical operators. They resolve in the order of not, and, then or.
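To make the order concrete, here is a sketch that combines comparison and Boolean operators. The variables age and has_ticket are hypothetical. Comparisons are evaluated before not, and parentheses can be used to change the grouping.

```python
age = 20            # Hypothetical values for illustration
has_ticket = False

# Comparison first: age < 18 is False
# Then not:         not False is True
# Then and:         True and False is False
without_parentheses = not age < 18 and has_ticket
print(without_parentheses)  # False

# Parentheses first: (False and False) is False
# Then not:          not False is True
with_parentheses = not (age < 18 and has_ticket)
print(with_parentheses)  # True
```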
Writing a Flow Control Statement#
The general form of a flow control statement in Python is a condition followed by an action clause:
In this condition:
    perform this action
Let’s return to part of our flowchart for hanging out with friends.
We can imagine a flow control statement that would look something like:
if have_homework == True:
    complete assignment
The condition is given followed by a colon (:). The action clause then follows on the next line, indented into a code block.
If the condition is fulfilled (evaluates to True), the action clause in the block of code is executed.
If the condition is not fulfilled (evaluates to False), the action clause in the block of code is skipped over.
Code Blocks#
A code block is a snippet of code that begins with an indentation. A code block can be a single line or many lines long. Blocks can contain other blocks, forming a hierarchical structure. In such a case, the second block is indented an additional degree. Any given block ends when the number of indentations in the current line is less than the number that started the block.
Since the level of indentation describes which code block will be executed, improper indentations will make your code crash. When using indentations to create code blocks, look carefully to make sure you are working in the code block you intend. Each indentation for a code block is created by pressing the tab key.
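The sketch below shows one block nested inside another. The variables and messages are invented for illustration, and each level of indentation here is four spaces, which is what the tab key inserts in many notebook editors.

```python
temperature = 30     # Hypothetical values for illustration
is_raining = False
message = 'Stay inside'

if temperature > 20:
    # This line starts the first code block
    message = 'It is warm'
    if not is_raining:
        # This line starts a second block, nested inside the first
        message = 'It is warm and dry'

# This line has no indentation, so both blocks have ended
print(message)  # It is warm and dry
```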
Types of Flow Control Statements#
The code example above uses an if statement, but there are other kinds of flow control statements available in Python.
| Statement | Means | Condition for execution |
|---|---|---|
| if | if | if the condition is fulfilled |
| elif | else if | if no previous conditions were met and this condition is met |
| else | else | if no condition is met (no condition is supplied for an else statement) |
| while | while | while condition is true |
| for | for | execute in a loop for this many times |
| try | try | try this and run the except code block if an error occurs |
Let’s take a look at each of these flow control statement types.
Our program works fairly well so long as the user inputs ‘Yes’ or ‘yes’. If they type ‘no’ or something else, it simply ends. If we want to have our program still respond, we can use an else statement.
else Statements#
An else statement does not require a condition to evaluate to True or False. It simply executes when none of the previous conditions are met. The form looks like this:
else:
    perform this action
Our updated flowchart now contains a second branch for our program.
Our new program is more robust. The new else statement still gives the user a response if they do not respond “Yes” or “yes”. But what if we wanted to add an option for when a user says “No”? Or when a user inputs something besides “Yes” or “No”? We could use a series of elif statements.
elif Statements#
An elif statement, short for “else if,” allows us to create a list of possible conditions where one (and only one) action will be executed. elif statements come after an initial if statement and before an else statement:
if condition A is True:
    perform action A
elif condition B is True:
    perform action B
elif condition C is True:
    perform action C
elif condition D is True:
    perform action D
else:
    perform action E
For example, we could add an elif statement to our program so it responds to both “Yes” and “No” with unique answers. We could then add an else statement that responds to any user input that is not “Yes” or “No”.
# A program that responds to whether the user is having a good or bad day
having_good_day = input('Are you having a good day? (Yes or No) ') # Define a variable having_good_day to hold the user's input
if having_good_day == 'Yes' or having_good_day == 'yes': # If the user has input the string 'Yes' or 'yes'
    print('Glad to hear your day is going well!') # Print: Glad to hear your day is going well!
# Write an elif statement for having_good_day == 'No'
# An else statement that catches if the answer is not 'yes' or 'no'
else: # Execute this if none of the other branches executes
    print('Sorry, I only understand "Yes" or "No"') # Note that we can use double quotations in our string because it begins and ends with single quotes
while Loop Statements#
So far, we have used flow control statements like decision-making branches to decide what action should be taken next. Sometimes, however, we want a particular action to loop (or repeat) until some condition is met. We can accomplish this with a while loop statement that takes the form:
while condition is True:
    take this action
After the code block is executed, the program loops back to check and see if the while loop condition has changed from True to False. The code block stops looping when the condition becomes False.
In the following program, the user will guess a number until they get it correct.
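The interactive program is not reproduced here. As a stand-in, the sketch below has the computer make the guesses itself (1, 2, 3, and so on) so that it runs without user input. The secret number and all variable names are assumptions made for this example.

```python
secret_number = 4      # Hypothetical value; an interactive version might pick this at random
guess = 1
number_of_guesses = 1

# Keep looping while the guess is wrong
while guess != secret_number:
    print(f'{guess} is not correct.')
    guess = guess + 1
    number_of_guesses = number_of_guesses + 1

# The loop ends once the condition guess != secret_number becomes False
print(f'{guess} is correct! That took {number_of_guesses} guesses.')
```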
Try changing the variable name i to something else. What effect does that have on the program?
from Python Basics 3
lists and dictionaries#
Lists and dictionaries help us store many values inside a single variable. This is helpful for a few reasons.
We can store many items in a single list or dictionary, making it easier to keep the data together
Lists and dictionaries only require a single assignment statement
Lists and dictionaries have additional capabilities that will make organizing our data easier
The fundamental difference between a list and a dictionary is that a list stores items in sequential order (starting from 0) while a dictionary stores items in key/value pairs. When we want to retrieve an item in a list, we use an index number or a set of index numbers called a slice as a reference. When we want to retrieve an item from a dictionary, we supply a key that returns the value (or set of values) associated with that key. Each of these approaches can be beneficial depending on what kind of data we are working with (and what we intend to do with the data).
Lists#
A list can store anywhere from zero to millions of items. The items that can be stored in a list include the data types we have already learned: integers, floats, and strings. A list assignment statement takes the form:
my_list = [item1, item2, item3, item4...]
# A list containing integers
my_favorite_numbers = [7, 21, 100]
print(my_favorite_numbers)
# A list containing strings
my_inspirations = ['Harriet Tubman', 'Rosa Parks', 'Pauli Murray']
print(my_inspirations)
Both my_favorite_numbers and my_inspirations have three items, but we could have also initialized them with no items (my_favorite_numbers = []) or many more items.
List Index#
Each item has an index number that depends on its position in the list. The first item is 0, the second item is 1, the third item is 2, etc.
Lists can also contain other lists. To retrieve a value from a list within a list, we use two indexes (or indices).
We can also select items from a list beginning from the end/right side of a list by using negative index numbers.
It is not uncommon for lists to be hundreds or thousands of items long. It would be a chore to count all those items to create a slice. If you want to know the length of a list, you can use the len() function.
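The sketch below pulls these ideas together using the my_inspirations list from above. The nested list is a made-up example.

```python
my_inspirations = ['Harriet Tubman', 'Rosa Parks', 'Pauli Murray']

# Retrieve items by index number, counting from 0
first_item = my_inspirations[0]
print(first_item)  # Harriet Tubman

# Negative index numbers count from the end of the list
last_item = my_inspirations[-1]
print(last_item)  # Pauli Murray

# A list within a list needs two indexes: first the inner list, then the item
nested_list = [[1, 2, 3], ['a', 'b', 'c']]  # Hypothetical example
letter = nested_list[1][0]
print(letter)  # a

# The len() function returns the number of items in a list
list_length = len(my_inspirations)
print(list_length)  # 3
```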
Slicing a list#
We can also retrieve a group of consecutive items from a list using slices instead of a single index number. We create a slice by indicating a starting and ending index number. The slice is a smaller list containing all the items between our starting and stopping index number.
Notice that the second index in a slice is the stopping point. That is, the returned list contains staff[1] ('Brianna Barton') and staff[2] ('Carla Cameron'), but it does not include staff[3] ('Delia Darcy'). This can be confusing if you were expecting three items instead of two. One way to remember this is by subtracting the indexes in your head (3 - 1 = 2 items).
The staff list is 7 items long, meaning the whole list is within the slice staff[0:7]. When we take a slice of a list we can also leave out the first index number (0 is assumed) or the stopping index number (the slice then runs through the last item).
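Here is a sketch of these slices. Only the names at indexes 1, 2, and 3 come from the text above; the other four names are hypothetical placeholders added so the list has 7 items.

```python
staff = ['Alice Adams',      # Hypothetical name
         'Brianna Barton',
         'Carla Cameron',
         'Delia Darcy',
         'Evelyn Easton',    # Hypothetical name
         'Farah Fields',     # Hypothetical name
         'Gina Gomez']       # Hypothetical name

# A slice from index 1 up to, but not including, index 3
middle_slice = staff[1:3]
print(middle_slice)  # ['Brianna Barton', 'Carla Cameron']

# Leaving out the first index number starts the slice at index 0
first_two = staff[:2]
print(first_two)  # ['Alice Adams', 'Brianna Barton']

# Leaving out the stopping index number runs the slice through the last item
last_two = staff[5:]
print(last_two)  # ['Farah Fields', 'Gina Gomez']
```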
The in and not in Operators#
If we have a long list, it may be helpful to check whether a value is in the list. We can do this with the in and not in operators, which return a boolean value: True or False.
We can change the value of any item in a list using an assignment statement that contains the item’s index number.
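A short sketch of both ideas, reusing the my_inspirations list. The replacement name 'Ida B. Wells' is an arbitrary choice for illustration.

```python
my_inspirations = ['Harriet Tubman', 'Rosa Parks', 'Pauli Murray']

# Check whether a value is in the list
is_rosa_listed = 'Rosa Parks' in my_inspirations
print(is_rosa_listed)  # True

# Check whether a value is missing from the list
is_ida_missing = 'Ida B. Wells' not in my_inspirations
print(is_ida_missing)  # True

# Change the third item (index 2) with an assignment statement
my_inspirations[2] = 'Ida B. Wells'
print(my_inspirations)  # ['Harriet Tubman', 'Rosa Parks', 'Ida B. Wells']
```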
List Methods#
A method is a kind of function. (Remember, functions end in parentheses.) Methods, however, act on objects (like lists) so they have a slightly different written form. We will take a look at five useful methods for working with lists.
| Method Name | Purpose | Form |
|---|---|---|
| index() | search for an item in a list and return the index number | list_name.index(item_name) |
| append() | add an item to the end of a list | list_name.append(item_name) |
| insert() | insert an item in the middle of a list | list_name.insert(index_number, item_name) |
| remove() | remove an item from a list based on value | list_name.remove(‘item_value’) |
| sort() | sort the order of a list | list_name.sort() |
The index() Method#
The index() method checks to see if a value is in a list. If the value is found, it returns the index number for the first item with that value. (Keep in mind, there could be multiple items with a single value in a list.) If the value is not found, the index() method raises a ValueError.
Lists can contain multiple identical items. If there are multiple identical items, we need to use some flow control to return all the indices. The enumerate() function allows us to keep track of the index and element at the same time. Notice that this for loop defines two variables. The first variable keeps track of the current index number in the loop, the second variable keeps track of the value for the current element.
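As a sketch of this pattern, the loop below collects every index where 'cat' appears. The pets list and the cat_indices variable are made up for this example, and the append() method used to collect the results is described in the next section.

```python
pets = ['cat', 'dog', 'cat', 'fish', 'cat']  # Hypothetical list with repeated items

# index() only returns the index number of the first match
first_cat = pets.index('cat')
print(first_cat)  # 0

# enumerate() supplies the index number and the element on each pass of the loop
cat_indices = []
for index, element in enumerate(pets):
    if element == 'cat':
        cat_indices.append(index)

print(cat_indices)  # [0, 2, 4]
```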
The append() Method#
The append() method adds a value to the end of a list.
We can also add an item to the end of a list by using an assignment statement.
The insert() Method#
The insert() method is similar to append() but it takes an argument that lets us choose an index number to insert the new item.
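The sketch below shows append() and insert() on a made-up colors list. The final step is one way to read "using an assignment statement": concatenating a one-item list and assigning the result back to the variable.

```python
colors = ['red', 'green']  # Hypothetical list for illustration

# Add an item to the end of the list
colors.append('blue')
print(colors)  # ['red', 'green', 'blue']

# Insert an item at index 1; later items shift to the right
colors.insert(1, 'yellow')
print(colors)  # ['red', 'yellow', 'green', 'blue']

# Add an item to the end with an assignment statement and concatenation
colors = colors + ['purple']
print(colors)  # ['red', 'yellow', 'green', 'blue', 'purple']
```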
The remove() Method#
The remove() method removes the first item from the list that has a matching value.
Like the .index() method, the .remove() method only works on the first item it finds in the list. If there are repeating items on the list, you can use some flow control to make sure you remove them all.
If you know the value you wish to remove then the remove() method is the best option. If you know the index number of the item, you can use a del statement to delete list items.
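A sketch of all three approaches on a made-up pets list: removing the first match, using a while loop to remove every remaining match, and deleting by index number with del.

```python
pets = ['cat', 'dog', 'cat', 'fish', 'cat']  # Hypothetical list with repeated items

# remove() only removes the first matching item
pets.remove('cat')
print(pets)  # ['dog', 'cat', 'fish', 'cat']

# Keep removing while any matching item is still in the list
while 'cat' in pets:
    pets.remove('cat')
print(pets)  # ['dog', 'fish']

# Delete an item by its index number
del pets[0]
print(pets)  # ['fish']
```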
The sort() Method#
The sort() method sorts a list in alphabetical order, where strings beginning with capital letters are sorted A-Z first, then strings beginning with lowercase letters are sorted a-z.
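For example, in this sketch the capitalized strings end up ahead of the lowercase ones. The list contents are arbitrary.

```python
fruits = ['banana', 'Cherry', 'apple', 'Date']  # Hypothetical list for illustration

# sort() changes the list in place
fruits.sort()
print(fruits)  # ['Cherry', 'Date', 'apple', 'banana']
```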
Iterate through a list with a for loop#
We can use a for loop to iterate through all the items in a list. The for loop will create a new temporary variable to store the current item in the list (no assignment statement required).
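A minimal sketch using the my_inspirations list. The temporary variable name person is our own choice and could be any valid variable name.

```python
my_inspirations = ['Harriet Tubman', 'Rosa Parks', 'Pauli Murray']

# On each pass, `person` holds the current item in the list
for person in my_inspirations:
    print(f'{person} inspires me.')
```

This prints one line per item, in list order.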
Dictionaries#
Like a list, a dictionary can hold many values within a single variable. We have seen that the items of a list are stored in a strictly-ordered fashion, starting from item 0. In a dictionary, each value is stored in relation to a descriptive key forming a key/value pair. Technically, as of Python 3.7 (June 2018), dictionaries are also ordered by insertion. In practice, however, the most useful aspect of a dictionary is the ability to supply a key and receive a value without reference to indices. Whereas a list is typed with brackets [], a dictionary is typed with braces {}. The key and/or value can be an integer, float, or string.
example_dictionary = {key1 : value1, key2 : value2, key3 : value3}
# An example of a dictionary storing names and occupations
contacts = {
'Amanda Bennett': 'Engineer, electrical',
'Bryan Miller': 'Radiation protection practitioner',
'Christopher Garrison': 'Planning and development surveyor',
'Debra Allen': 'Intelligence analyst',
'Donna Decker': 'Architect',
'Heather Bullock': 'Media planner',
'Jason Brown': 'Energy manager',
'Jason Soto': 'Lighting technician, broadcasting/film/video',
'Marissa Munoz': 'Further education lecturer',
'Matthew Mccall': 'Chief Technology Officer',
'Michael Norman': 'Translator',
'Nicole Leblanc': 'Financial controller',
'Noah Delgado': 'Engineer, land',
'Rachel Charles': 'Physicist, medical',
'Stephanie Petty': 'Architect'}
from pprint import pprint # We import the pretty print function which prints out dictionaries in a neater fashion than the built-in print() function
pprint(contacts) # Use the pretty print function to print `contacts`
We can add a new key/value pair to our dictionary using an assignment statement.
# Adding the key 'Rafia Mirza' with the value 'Digital Scholarship Librarian' to the dictionary contacts
contacts['Rafia Mirza'] = 'Digital Scholarship Librarian'
pprint(contacts) # Use the pretty print function to print `contacts`
Similar to deleting an item from a list, we can use a del statement to delete a key/value pair. We do not need to worry about duplicates because every key in a dictionary must be unique.
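As a sketch, the del statement below removes one pair from a small, made-up dictionary. It is kept separate from contacts so the examples that follow are unaffected.

```python
pet_types = {'Rex': 'dog', 'Tom': 'cat', 'Nemo': 'fish'}  # Hypothetical dictionary

# Delete the key/value pair with the key 'Tom'
del pet_types['Tom']
print(pet_types)  # {'Rex': 'dog', 'Nemo': 'fish'}
```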
Dictionary Methods#
We’ll take a look at five useful methods for working with dictionaries: update(), keys(), values(), items(), and get().
| Method Name | Purpose | Form |
|---|---|---|
| update() | add new key/value pairs to a dictionary | dict_name.update({key1:value1, key2:value2}) |
| | combine two dictionaries | dict_name.update(dict_name2) |
| keys() | check if a key is in a dictionary (True/False) | key_name in dict_name.keys() |
| | Loop through the keys in a dictionary | for k in dict.keys(): |
| values() | check if a value is in a dictionary (True/False) | value_name in dict_name.values() |
| | Loop through the values in a dictionary | for v in dict.values(): |
| items() | Loop through the keys and values in a dictionary | for k, v in dict.items(): |
| get() | retrieve the value for a specific key | dict_name.get(key_name) |
The update() Method#
The update() method is useful for adding many key/value pairs to a dictionary at once. The update() method accepts a single key/value pair, multiple pairs, or even other dictionaries.
# Add a single key/value pair to the dictionary contacts using the update() method
contacts.update(
{'Rafia Mirza': 'Digital Scholarship Librarian'}
)
pprint(contacts) # Use the pretty print function to print `contacts`
# Adding several key/value pairs to the dictionary contacts using the update() method
contacts.update(
{"Matt Lincoln": "Software Engineer",
'Ian DesJardins': 'Software Engineer',
'Zhuo Chen': 'Text Analysis Instructor'}
)
pprint(contacts) # Use the pretty print function to print `contacts`
The keys() and values() Methods#
The keys(), values(), and items() methods are useful for checking whether a particular key or value exists in a dictionary. We can pair them with the in or not in operators to check whether a value is in our dictionary (just like we did with lists).
# Checking if a key is in the contacts dictionary
# Do I know a Noah Delgado?
'Noah Delgado' in contacts.keys()
# Checking if a value is in the contacts dictionary
# Do I know an Architect?
'Architect' in contacts.values()
The get() Method#
If we are sure a key exists, we can return the corresponding value using:
dict_name[key_name]
# Return a value for a particular key
contacts['Noah Delgado']
However, if the [key](https://constellate.org/docs/key-terms/#key-value-pair) is not found, the result will be a KeyError.
# Asking for a key that does not exist
contacts['Mickey Mouse']
The more robust approach is to use the get() method. If the key is not found, the None value will be returned. (Optionally, we can also specify a default message to return.)
dict_name.get('key_name', 'key_not_found_message')
# Using the get() method to retrieve the value for the key 'Marissa Munoz'
contacts.get('Marissa Munoz')
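The sketch below shows all three outcomes of get(). To stay self-contained it uses a small stand-in dictionary, small_contacts, with two entries copied from contacts. The default message text is our own.

```python
small_contacts = {
    'Marissa Munoz': 'Further education lecturer',
    'Noah Delgado': 'Engineer, land',
}

# The key exists, so the matching value is returned
found = small_contacts.get('Marissa Munoz')
print(found)  # Further education lecturer

# The key does not exist, so None is returned instead of a KeyError
missing = small_contacts.get('Mickey Mouse')
print(missing)  # None

# The key does not exist, so the default message is returned
with_default = small_contacts.get('Mickey Mouse', 'No contact with that name')
print(with_default)  # No contact with that name
```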
Combining keys(), values(), and items() with Flow Control Statements#
It is often useful to combine for loops with the keys(), values(), or items() methods to repeat a task for each entry in a dictionary. We have the following options:
.keys() iterates through only the dictionary keys
.values() iterates through only the dictionary values
.items() iterates through the keys and values
Just like a list for loop, a temporary variable will be created based on whatever name comes after for.
# Print every key in our contacts dictionary
for name in contacts.keys(): # The variable `name` could be any variable name we choose
    print(name)
# Print every value in our contacts dictionary
for occupation in contacts.values(): # The variable `occupation` here could be any variable name we choose
    print(occupation)
If we use the .items() method, we need to define two variable names. It is valid Python to define two variables at once.
# Define two variables at once in Python and assign two strings
word1, word2 = 'Python', 'Basics'
# Verify the variables have been properly assigned
print(word1)
print(word2)
# Print every key and value in our contacts dictionary
for name, occupation in contacts.items():
    print(f'{name} has the job: {occupation}')
from Python Basics 4
Functions#
To write your own functions use:
This section concludes with a description of popular Python packages.
Python Libraries Used:
[time](https://www.geeksforgeeks.org/python/python-time-module/) to make the computer wait a few seconds
Functions#
We have used several Python functions already, including print(), input(), and range(). You can identify a function by the fact that it ends with a set of parentheses () where arguments can be passed into the function. Depending on the function (and your goals for using it), a function may accept no arguments, a single argument, or many arguments. For example, when we use the print() function, a string (or a variable containing a string) is passed as an argument.
Functions are a convenient shorthand, like a mini-program, that makes our code more modular. We don’t need to know all the details of how the print() function works in order to use it. Functions are sometimes called “black boxes”, in that we can put an argument into the box and a return value comes out. We don’t need to know the inner details of the “black box” to use it. (Of course, as you advance your programming skills, you may become curious about how certain functions work. And if you work with sensitive data, you may need to peer in the black box to ensure the security and accuracy of the output.)
Libraries and Modules#
While Python comes with many functions, there are thousands more that others have written. Adding them all to Python would create mass confusion, since many people could use the same name for functions that do different things. The solution then is that functions are stored in modules that can be imported for use. A module is a Python file (extension “.py”) that contains the definitions for the functions written in Python. These modules (individual Python files) can then be collected into even larger groups called packages and libraries. Depending on how many functions you need for the program you are writing, you may import a single module, a package of modules, or a whole library.
The general form of importing a module is:
import module_name
You may recall from an earlier lesson that we imported the time module and used the sleep() function to wait 5 seconds.
# A program that waits five seconds then prints "Done"
import time # We import all the functions in the `time` module
print('Waiting 5 seconds...')
time.sleep(5) # We run the sleep() function from the time module using `time.sleep()`
print('Done')
We can also just import the sleep() function without importing the whole time module. The syntax is:
from module import function
# A program that waits five seconds then prints "Done"
from time import sleep # We import just the sleep() function from the time module
print('Waiting 5 seconds...')
sleep(5) # Notice that we just call the sleep() function, not time.sleep()
print('Done')
Writing a Function#
In the above examples, we called a function that was already written. However, we can also create our own functions!
The first step is to define the function before we call it. We use a function definition statement followed by a function description and a code block containing the function’s actions:
def my_function():
    """Description of what the function does"""
    python code to be executed
After the function is defined, we can call on it to do us a favor whenever we need by simply executing the function like so:
my_function()
After the function is defined, we can call it as many times as we want without having to rewrite its code. In the example below, we create a function called complimenter_function then call it twice.
# Create a complimenter function
def complimenter_function():
    """prints a compliment""" # Function description (docstring)
    print('You are looking great today!')

# Call the function twice
complimenter_function()
complimenter_function()
After you define a function, don’t forget to call it to make it do the work!
Ideally, a function’s description (its docstring) should specify the data that the function takes and whether it returns any data. The triple quote notation can use single or double quotes, and it allows the description to extend over multiple lines in Python. If you would like to see a function’s description, you can use the help() function to check it out.
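Here is a minimal sketch of a multi-line description and of how help() displays it. The wording of the docstring is only an example.

```python
def complimenter_function():
    """Prints a compliment.

    Takes no arguments and returns None.
    """
    print('You are looking great today!')

# help() shows the function's name, parameters, and description
help(complimenter_function)

# The description is also stored on the function itself
print(complimenter_function.__doc__)
```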
Parameters vs. Arguments#
When we write a function definition, we can define a parameter to work with the function. We use the word parameter to describe the variable in parentheses within a function definition:
def my_function(input_variable):
    """Takes in X and returns Y"""
    do this task
In the pseudo-code above, input_variable is a parameter because it is being used within the context of a function definition. When we actually call and run our function, the actual variable or value we pass to the function is called an argument.
Arguments can be passed in based on parameter order (positional) or they can be explicitly passed by name using an = (keyword arguments). (This could be useful if we wanted to pass an argument for the 10th parameter, but we did not want to pass arguments for the nine other parameters defined before it, which works when those parameters have default values.)
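The sketch below assumes a version of complimenter_function that accepts a name; the user_name and greeting parameters are hypothetical names chosen for illustration. It shows the same function called with positional and keyword arguments.

```python
# A hypothetical complimenter that takes two parameters;
# `greeting` has a default value, so it is optional
def complimenter_function(user_name, greeting='Hello'):
    """Takes a name and an optional greeting, then prints a compliment."""
    print(greeting + ' ' + user_name + ', you are looking great today!')

complimenter_function('Sam')                          # positional
complimenter_function('Sam', 'Hi')                    # both positional
complimenter_function(user_name='Sam')                # keyword
complimenter_function(greeting='Hey', user_name='Sam')  # keywords in any order
```

With keyword arguments the order no longer matters, because each value is matched to its parameter by name.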
In the above example, we passed a string into our function, but we could also pass a variable. Try this next. Since the complimenter_function has already been defined, you can call it in the next cell without defining it again.
A variable passed into a function could contain a list or dictionary. The type of objects that should be passed in and returned can optionally be suggested through type hinting in Python.
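As a rough sketch of type hinting, the hypothetical function below suggests that it expects a list of strings and returns an integer. The function and variable names are made up for this example, and the list[str] notation assumes Python 3.9 or later. Python does not enforce the hints; they are documentation for readers and tools.

```python
# Type hints suggest (but do not enforce) the expected types
def count_long_names(names: list[str], min_length: int = 5) -> int:
    """Takes a list of names and returns how many have at least min_length characters."""
    count = 0
    for name in names:
        if len(name) >= min_length:
            count += 1
    return count

name_list = ['Sam', 'Delilah', 'Jordan']   # a variable containing a list
print(count_long_names(name_list))
```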
The Importance of Avoiding Duplication#
Using functions makes it easier for us to update our code. Let’s say we wanted to change our compliment. We can simply change the function definition one time to make the change everywhere. See if you can change the compliment given by our complimenter function.
By changing our function definition just one time, we were able to make our program behave differently every time it was called. If our program was large, it might call our custom function hundreds of times. If we had instead repeated the same code in each of those places, we would need to change it in every one!
Generally, it is good practice to avoid duplicating program code to avoid having to change it in multiple places. When programmers edit their code, they may spend time deduplicating (getting rid of code that repeats). This makes the code easier to read and maintain.
Function Return Values#
Whether or not a function takes an argument, it will always return a value. If we do not specify that return value in our function definition, it is automatically set to None, a special value like the Boolean True and False that simply means null or nothing. (None is not the same thing as, say, the integer 0.) We can also specify the return value ourselves by using a return statement in the function’s code block.
If you don’t write a return statement in your function, a None value will be returned.
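A short sketch of this behavior: the function below only prints, so the value it hands back is None.

```python
def complimenter_function():
    """Prints a compliment and has no return statement."""
    print('You are looking great today!')

result = complimenter_function()   # the function runs and prints
print(result)                      # the returned value is None
print(result is None)
```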
Instead of automatically printing inside the function, the better approach is to return a string value and let the user decide whether to print it or do something else with it. Ideally, our function definition statement should indicate what goes into the function and what is returned by the function.
Returning the string allows the programmer to use the output instead of just printing it automatically. This is usually the better practice.
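One possible version of a complimenter that returns its string instead of printing it is sketched here. The user_name parameter is an assumption made for the example.

```python
def complimenter_function(user_name):
    """Takes a name (string) and returns a compliment (string)."""
    return user_name + ', you are looking great today!'

compliment = complimenter_function('Sam')
print(compliment)            # we can print the returned string...
print(compliment.upper())    # ...or do something else with it
```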
We can also offer multiple return statements with flow control. Let’s write a function for telling fortunes. We can call it fortune_picker and it will accept a number (1-6) then return a string for the fortune.
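A minimal sketch of such a function follows. Only the fortune for the number 3 comes from this lesson; the other strings are placeholders invented for illustration, as is the message for numbers outside 1-6.

```python
def fortune_picker(fortune_number):
    """Takes an integer (1-6) and returns a fortune string."""
    # Only fortune 3 comes from the lesson text; the rest are placeholders
    if fortune_number == 1:
        return 'Placeholder fortune one'
    elif fortune_number == 2:
        return 'Placeholder fortune two'
    elif fortune_number == 3:
        return 'A new friend will help you find yourself'
    elif fortune_number == 4:
        return 'Placeholder fortune four'
    elif fortune_number == 5:
        return 'Placeholder fortune five'
    elif fortune_number == 6:
        return 'Placeholder fortune six'
    else:
        return 'Please pass a number from 1 to 6'

print(fortune_picker(3))
```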
In our example, we passed the argument 3, which returned the string 'A new friend will help you find yourself'. To change the fortune, we would have to pass a different integer into the function. To make our fortune-teller random, we could import the function randint() from the random module, which chooses a random integer between two integers (both included). We pass the two integers as arguments separated by a comma.
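For example, a random fortune number could be chosen like this. The variable name fortune_number is just an illustration; the result could then be passed to a function such as fortune_picker.

```python
from random import randint

# randint(1, 6) returns a random integer from 1 to 6, including both ends
fortune_number = randint(1, 6)
print(fortune_number)
```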
Local and Global Scope#
We have seen that functions make maintaining code easier by avoiding duplication. One of the most dangerous areas for duplication is variable names. As programming projects become larger, the possibility that a variable will be re-used goes up. This can cause weird errors in our programs that are hard to track down. We can alleviate the problem of duplicate variable names through the concepts of local scope and global scope.
We use the phrase local scope to describe what happens within a function. The local scope of a function may contain local variables, but once that function has completed, the local variables and their contents are erased.
On the other hand, we can also create global variables that persist at the top-level of the program and also within the local scope of a function.
* In the global scope, Python does not recognize any local variable from within the program's functions
* In the local scope of a function, Python can recognize any global variables
* It is possible for there to be a global variable and a local variable with the same name
Ideally, Python programs should limit the number of global variables and create most variables in a local scope. This keeps confounding variables localized in functions where they are used and then discarded.
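The following is a sketch of a small program that demonstrates both scopes, using the names global_string, local_string, and print_strings. One assumption to note: the final print is wrapped in try/except so the cell finishes running; without it, the program would stop with a NameError at that line.

```python
global_string = 'global'      # a global variable

def print_strings():
    """Prints a local variable and a global variable."""
    local_string = 'local'    # exists only while the function runs
    print(local_string)
    print(global_string)      # global variables are visible inside functions

print_strings()

# Back in the global scope
print(global_string)
try:
    print(local_string)       # not defined in the global scope
except NameError as error:
    print('NameError:', error)
```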
The code above defines a global variable global_string with the value of ‘global’. A function, called print_strings, then defines a local variable local_string with a value of ‘local’. When we call the print_strings() function, it prints the local variable and the global variable.
After the print_strings() function completes, we try to print both variables in a global scope. The program prints global_string but crashes when trying to print local_string in a global scope.
It’s a good practice not to name a local variable the same thing as a global variable. If we define a variable with the same name in a local scope, it becomes a local variable within that scope. Once the function is closed, the global variable retains its original value.
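A brief sketch of what happens when a local variable shares a name with a global one. The names message and change_message are made up for this example.

```python
message = 'global value'

def change_message():
    """Assigns to a local variable that shares a global variable's name."""
    message = 'local value'   # creates a new local variable
    return message

print(change_message())   # the local variable's value
print(message)            # the global variable is unchanged
```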
Popular Python Packages#
Modules containing functions for a similar type of task are often grouped together into a package. Here are some of the most popular packages used in Python:
Processing and cleaning data#
Visualizing data#
matplotlib- Creates static, animated, and interactive visualizations.
Seaborn- An expansion of matplotlib that provides a “high-level interface for drawing attractive and informative statistical graphics”
Plotly- Create graphs, analytics, and statistics visualizations
Dash- Create interactive web applications and dashboards
Text Analysis#
Artificial Intelligence and Machine Learning#
scikit-learn- Implement machine learning in areas such as classification, predictive analytics, regression, and clustering
Keras- Implement deep learning using neural networks
TensorFlow- Implement machine learning with a particular focus on training and deep neural networks
🤗 Transformers- Easily work with a variety of models based on Hugging Face 🤗
Data Gathering#
Requests- An HTTP client that helps connect to websites and download files
urllib3- Another HTTP client that helps connect to websites and download files
Beautiful Soup- Pull data out of HTML or XML files, helpful for scraping information from websites
Scrapy- Helps extract data from websites
Textual Digitization#
Tesseract- Use optical character recognition to convert images into plaintext
Pillow- Read and manipulate images with Python
Packages are generally installed from PyPI, the Python Package Index, which is the official repository of Python packages. As of April 2022, there are over 350,000 packages available.
Installing a Python Package in Jupyter Lab#
If you would like to install a package that is not in Jupyter Lab, we recommend using the pip installer with packages from the Python Package Index. In a code cell insert the following code:
!pip install package_name
for the relevant package you would like to install. The exclamation point indicates the line should be run as a terminal command.
Refer to the package’s documentation for guidance.
from Python Basics 5
strings#
To use strings, you will use:
Escape characters
String methods
The print() function#
Often when working with strings, we use the print() function. A deeper understanding of print() will help us work with strings more flexibly.
Escape characters#
Python strings can use single or double quotes. If the string contains a single quote character, it may be beneficial to use double quotes. Try printing out the string in the next code cell:
# Print out a string using single or double quotes
string = 'Hello World: Here's a string.'
print(string)
An easy solution would be to use double quotes, such as:
string = "Hello World: Here's a string."
The use of double quotes keeps Python from ending the string prematurely. But what if your string contains both single and double quotes? Escape characters help us insert certain characters into a string. An escape character begins with a \. For example, we could insert a single quote into a string surrounded by single quotes by using an escape character.
# Print out a single quote in a Python string
string = 'There\'s an escape character in this string.'
print(string)
The backslash character \ in front of the single quote tells Python not to end the string prematurely. Of course, this opens a new question: How do we create a string with a backslash? The answer is another escape character using two backslashes.
# Print a backslash using an escape character
string = 'Adding a backslash \\ requires an escape character.'
print(string)
Another option is to use a raw string, which ignores any escape characters. A raw string simply starts with an r before the opening quote, similar to the f that begins an f-string.
string = r'No escape characters \ here'
print(string)
Escape characters also do more than just allow us to add quotes and backslashes. They are also responsible for string formatting for aspects such as tabs and new lines.
Code | Result |
---|---|
\' | ' |
\\ | \ |
\t | tab |
\n | new line |
# Print out a string with two lines
# Print out a string with a tab
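One possible way to complete the two prompts above is sketched here. The text inside the strings is arbitrary.

```python
# Print out a string with two lines
two_lines = 'First line\nSecond line'
print(two_lines)

# Print out a string with a tab
tabbed = 'Name:\tSam'
print(tabbed)
```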
The newline escape character \n can hurt readability when a string contains many lines. Consider this string containing four lines of a Shakespeare sonnet.
string = 'Shall I compare thee to a summer’s day?\nThou art more lovely and more temperate:\nRough winds do shake the darling buds of May,\nAnd summer’s lease hath all too short a date;\n'
print(string)
A more readable option is to create a string with triple quotes (single or double). A triple-quoted string keeps the new lines and tabs typed inside it, so no escape characters are needed for them.
# Print out Shakespeare's Sonnet 18
string = """Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature’s changing course untrimm'd;
But thy eternal summer shall not fade,
Nor lose possession of that fair thou ow’st;
Nor shall death brag thou wander’st in his shade,
When in eternal lines to time thou grow’st:
So long as men can breathe or eyes can see,
So long lives this, and this gives life to thee."""
print(string)
Formatted strings (f-strings)#
An f-string can help us insert a variable inside of a string. Consider this example where a print function must concatenate three strings:
# Greeting a user with a concatenated string
username = input('Hi. What is your name? ')
print('Hello ' + username + '!')
We used the + operator twice to concatenate username between the strings 'Hello ' and '!'. A simpler method would be to use an f-string. Similar to the way a raw string begins with an r (r'string'), the formatted string begins with an f (f'string'). The variable to be concatenated is then included in curly brackets {}.
# Print the username inside a formatted string
print(f'Hello {username}!')
Using print() with a sep or end argument#
The print() function can accept additional arguments such as sep or end. These can help format a string appropriately for output. When several objects are passed to print() as arguments separated by commas, they are printed separated by a single space by default.
# Print multiple objects with a single print() statement
string1 = 'Hello'
string2 = 'World'
string3 = '!'
print(string1, string2, string3)
We can even remove the separator by specifying an empty string.
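A short sketch of the sep argument, reusing the three strings from the previous cell:

```python
string1 = 'Hello'
string2 = 'World'
string3 = '!'

print(string1, string2, string3)            # default separator is a space
print(string1, string2, string3, sep='')    # no separator
print(string1, string2, string3, sep='-')   # a custom separator
```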
The print() function also adds a new line at the end by default. This is specified in the default argument end='\n'.
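For instance, changing end lets two print() calls share a single line of output. The strings used are arbitrary.

```python
# The first call ends with a space instead of a new line
print('Hello', end=' ')
# The second call continues on the same line and ends it
print('World', end='!\n')
```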
String slices and methods#
String slices#
The characters of a string can also be indexed and sliced like the items of a list.
# Using a string index
string = 'Python Basics'
string[0]
# Slicing a string
string = 'Python Basics'
string[0:6]
We can use flow control on a string the same way we would with a list.
# Use a for loop on the string
# To print each character except any letter 'o'
string = 'Hello World'
for character in string:
    if character != 'o':
        print(character)
String methods#
There are a variety of methods for manipulating strings.
Method | Purpose | Form |
---|---|---|
.lower() | change the string to lowercase | string.lower() |
.upper() | change the string to uppercase | string.upper() |
.join() | joins together a list of strings | ' '.join(string_list) |
.split() | splits strings apart | string.split() |
.replace() | replaces characters in a string | string.replace(oldvalue, newvalue) |
.rjust(), .ljust(), .center() | pad out a string | string.rjust(5) |
.rstrip(), .lstrip(), .strip() | strip out whitespace | string.rstrip() |
All of the characters in a string can be lowercased with .lower() or uppercased with .upper().
# Lowercase a string
string = 'Hello World'
string.lower()
These methods do not change the original string, but they return a string that can be saved to a new variable.
# The original string is unchanged
print(string)
# The returned string can be assigned to a new variable
new_string = string.upper()
print(new_string)
A string can be split on any character, or set of characters, passed into .split(). By default, strings are split on any whitespace including spaces, new lines, and tabs.
# Splitting a string on white space
string = 'This string will be split on whitespace.'
string.split()
# Splitting a phone string based on the '-' character
phone_string = '313-555-3434'
phone_string.split('-')
Similarly, lists of strings can be joined together by passing them into .join(). A joining string must be specified before the .join(), even if it is the empty string ''.
# List of strings joined together
name_list = ['Sam', 'Delilah', 'Jordan']
', '.join(name_list)
The .strip() method will strip leading and trailing whitespace (including spaces, tabs, and new lines) from a string. Remember, these changes will not affect the original string, but they can be assigned to a new variable.
# Stripping leading and trailing whitespaces from a string
string = ' Python Basics '
string.strip()
It is also possible to only strip whitespace from the right or left of a string.
# Stripping leading whitespace from the left side of a string
string = ' Python Basics '
string.lstrip()
Characters in a string can be replaced with other characters using the .replace() method.
# Replacing characters in a string with .replace()
string = 'Hello world'
string.replace('l', 'x')
# Removing characters from a string
# using .replace with an empty string
string = 'Hello! World!'
string.replace('!', '')
Finally, strings can be justified (or padded out) with characters leading, trailing, or both. By default, strings are justified with spaces but other characters can be specified by passing a second argument.
# Left justifying a string
string1 = 'Hello'
string2 = 'world!'
print(string1.ljust(10) + string2)
# Left justifying a string with pluses
string1 = 'Hello'
string2 = 'world!'
print(string1.ljust(10, '+') + string2)
# Right justifying a string
string1 = 'Hello'
string2 = 'world!'
print(string1 + string2.rjust(10))
# Center a string
string = 'Hello world!'
print('|' + string.center(20) + '|')
# Center a string with pluses
string = 'Hello world!'
print('|' + string.center(20, '+') + '|')
# Printing a dictionary of contacts in neat columns
contacts = {
    'Amanda Bennett': 'Engineer, electrical',
    'Bryan Miller': 'Radiation protection practitioner',
    'Christopher Garrison': 'Planning and development surveyor',
    'Debra Allen': 'Intelligence analyst'}
print('Name', 'Occupation')
for name, occupation in contacts.items():
    print(name, occupation)
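One way to line the output up in neat columns is to pad every name with .ljust(). This is a sketch, not the only approach; the width variable and the two extra spaces of padding are choices made for this example.

```python
contacts = {
    'Amanda Bennett': 'Engineer, electrical',
    'Bryan Miller': 'Radiation protection practitioner',
    'Christopher Garrison': 'Planning and development surveyor',
    'Debra Allen': 'Intelligence analyst'}

# Pad each name to the length of the longest name plus two spaces
width = max(len(name) for name in contacts) + 2

print('Name'.ljust(width) + 'Occupation')
for name, occupation in contacts.items():
    print(name.ljust(width) + occupation)
```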
Checking string contents#
There are a variety of ways to verify the contents of a string. These return a Boolean True or False and are useful for flow control. For example, we can check if a particular set of characters is inside of a string with the in and not in operators.
# Check whether a set of characters can be found in a string
string = 'Python Basics'
'Basics' in string
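The following string methods also return Boolean True or False values.
Method | Purpose | Form |
---|---|---|
.startswith(), .endswith() | returns True if the string starts/ends with the given characters | string.startswith('abc') |
.isupper(), .islower() | returns True if the string has at least one letter and all of its letters are uppercase/lowercase | string.isupper() |
.isalpha() | returns True if the string contains only letters and is not empty | string.isalpha() |
.isalnum() | returns True if the string contains only letters and numbers and is not empty | string.isalnum() |
.isdigit() | returns True if the string contains only digits and is not empty | string.isdigit() |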
# Checking if a string starts
# with a particular set of characters
string = 'Python Basics'
string.startswith('Python')
# Checking if a string is lowercased
string = 'python basics'
string.islower()
# Checking if a string is only alphabetic characters
string = 'PythonBasics'
string.isalpha()
# Checking if a string contains only
# alphabetic characters and numbers
string = 'PythonBasics5'
string.isalnum()
# Checking if a string is only numbers
string = '50'
string.isdigit()
The .isdigit() method checks each character to verify it is a digit between 0-9. It will return False if there is a negative (-) or decimal point (.) character.
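The sketch below illustrates this behavior and one common use, checking text before converting it to an integer. The variable user_text stands in for a value that might come from input().

```python
print('50'.isdigit())     # only digits
print('-50'.isdigit())    # contains a negative sign
print('5.0'.isdigit())    # contains a decimal point

# Check a string before converting it to an integer
user_text = '42'          # stand-in for text typed by a user
if user_text.isdigit():
    number = int(user_text)
    print(number + 1)
else:
    print('Please enter a whole number.')
```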
Attribution
Created by Nathan Kelber and Ted Lawless for JSTOR Labs under Creative Commons CC BY License