Python Basics 1#
Description: This lesson describes operators, expressions, data types, variables, and basic functions.
Required: Ability to run Jupyter Notebooks
Introduction#
Python is one of the fastest-growing languages in computer programming. Learning Python is a great choice because Python is:
Widely-adopted in the digital humanities and data science
Regarded as an easy-to-learn language
Flexible, having wide support for working with numerical and textual data
A skill desired by employers in academic, non-profit, and private sectors
The second most-popular language for digital humanities and data science work is R.
The skills you’ll learn in Python Basics are general-purpose Python skills, applicable for any of the text analysis notebooks that you may explore later. They are also widely applicable to many other kinds of tasks in Python beyond text analysis.
Making Mistakes is Important
Every programmer at every skill level gets errors in their code. Making mistakes is how we all learn to program. Programming is a little like solving a puzzle where the goal is to get the desired outcome through a series of attempts. You won’t solve the puzzle if you’re afraid to test if the pieces match. An error message will not break your computer. Remember, you can always reload a notebook if it stops working properly or you misplace an important piece of code. Under the edit menu, there is an option to undo changes. (Alternatively, you can use command z on Mac and control z on Windows.) To learn any skill, you need to be willing to play and experiment. Programming is no different.
Expressions and Operators#
The simplest form of Python programming is an expression using an operator. An expression is a simple mathematical statement like:
1 + 1
The operator in this case is +, sometimes called “plus” or “addition”. Try this operation in the code box below. Remember to click the “Run” button or press Ctrl + Enter (Windows) or Shift + Return (OS X) on your keyboard to run the code.
# Type the expression in this code block. Then run it.
Python can handle a large variety of expressions. Let’s try subtraction in the next code cell.
# Type an expression that uses subtraction in this cell. Then run it.
We can also do multiplication (*) and division (/). While you may have used an “×” to represent multiplication in grade school, Python uses an asterisk (*). In Python,
2 × 2
is written as
2 * 2
Try a multiplication and a division in the next code cell.
# Try a multiplication in this cell. Then try a division.
# What happens if you combine them? What if you combine them with addition and/or subtraction?
When you run, or evaluate, an expression in Python, the order of operations is followed. (In grade school, you may remember learning the shorthand “PEMDAS”.) This means that expressions are evaluated in this order:
Parentheses
Exponents
Multiplication and Division (from left to right)
Addition and Subtraction (from left to right)
Python can evaluate parentheses and exponents, as well as a number of additional operators you may not have learned in grade school. Here are the main operators that you might use, listed from highest to lowest precedence. Operators that share a precedence level (%, /, and * on one level; - and + on the next) are evaluated from left to right:
Operator | Operation | Example | Evaluation |
---|---|---|---|
** | Exponent/Power | 3 ** 3 | 27 |
% | Modulus/Remainder | 34 % 6 | 4 |
/ | Division | 30 / 6 | 5.0 |
* | Multiplication | 7 * 8 | 56 |
- | Subtraction | 18 - 4 | 14 |
+ | Addition | 4 + 3 | 7 |
Division with / always returns a number with a decimal point, so 30 / 6 evaluates to 5.0 rather than 5.
# Try operations in this code cell.
# What happens when you add in parentheses?
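As a minimal sketch of the order of operations described above, the cell below compares the same numbers with and without parentheses. The numbers are arbitrary examples, and the print() function used to display each result is explained later in this lesson.

```python
# Multiplication is evaluated before addition: 3 * 4 first, then 2 + 12
print(2 + 3 * 4)      # 14

# Parentheses are evaluated first: 2 + 3 first, then 5 * 4
print((2 + 3) * 4)    # 20

# Exponents are evaluated before multiplication: 3 ** 2 first, then 2 * 9
print(2 * 3 ** 2)     # 18
```

Adding parentheses is also a good way to make the intended order clear to a reader, even when Python would evaluate the expression the same way without them.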
Data Types (Integers, Floats, and Strings)#
All expressions evaluate to a single value. In the above examples, our expressions evaluated to a single numerical value. Numerical values come in two basic forms:
integer
float (or floating-point number)
An integer, what we sometimes call a “whole number”, is a number without a decimal point that can be positive or negative. When a value uses a decimal point, it is called a float or floating-point number. Two numbers that are mathematically equivalent could be in two different data types. For example, mathematically 5 is equal to 5.0, yet the former is an integer while the latter is a float.
Of course, Python can also help us manipulate text. A snippet of text in Python is called a string. A string can be written with single or double quotes. A string can use letters, spaces, line breaks, and numbers. So 5 is an integer, 5.0 is a float, but ‘5’ and ‘5.0’ are strings. A string can also be blank, such as ‘’.
Familiar name | Programming name | Examples |
---|---|---|
Whole number | integer | -3, 0, 2, 534 |
Decimal | float | 6.3, -19.23, 5.0, 0.01 |
Text | string | ‘Hello world’, ‘1700 butterflies’, ‘’, ‘1823’ |
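If you are ever unsure which data type a value has, one way to check is Python's built-in type() function. This lesson does not otherwise cover type(), so treat the cell below as a small optional sketch; the values are arbitrary examples.

```python
# type() is a built-in function that reports the data type of a value
print(type(5))      # <class 'int'>
print(type(5.0))    # <class 'float'>
print(type('5'))    # <class 'str'>
print(type(''))     # <class 'str'>  (an empty string is still a string)
```

Here int, float, and str are Python's names for the integer, float, and string data types.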
The distinction between each of these data types may seem unimportant, but Python treats each one differently. For example, an integer can be equal to a float, but a string is never equal to an integer or a float, even when the string looks like the same number.
To evaluate whether two values are equal, we can use two equals signs (==) between them. The expression will evaluate to either True or False.
# Run this code cell to determine whether the values are equal
42 == 42.0
True
# Run this code cell to compare an integer with a string
15 == 'fifteen'
False
# Run this code cell to compare an integer with a string that contains the same digits
15 == '15'
False
When we use the addition operator on integers or floats, they are added to create a sum. When we use the addition operator on strings, they are combined into a single, longer string. This is called concatenation.
# Combine the strings 'Hello' and 'World'
'Hello' + 'World'
'HelloWorld'
Notice that the strings are combined exactly as they are written. There is no space between the strings. If we want to include a space, we need to add the space to the end of ‘Hello’ or the beginning of ‘World’. We can also concatenate multiple strings.
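The two ways of adding a space described above could look like the following sketch. Both lines produce the same result; print() is used only to display each one and is explained later in this lesson.

```python
# Option 1: add the space to the end of the first string
print('Hello ' + 'World')    # Hello World

# Option 2: add the space to the beginning of the second string
print('Hello' + ' World')    # Hello World
```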
# Combine three strings
When we use the addition operator, the values must be all numbers or all strings. Mixing numbers and strings will raise an error.
# Try adding a string to an integer
'55' + 23
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[10], line 2
1 # Try adding a string to an integer
----> 2 '55' + 23
TypeError: can only concatenate str (not "int") to str
Here, we receive the error: can only concatenate str (not "int") to str. Python assumes we would like to join two strings together, but it does not know how to join a string to an integer. Put another way, Python is unsure if we want:
‘55’ + 23
to become
‘5523’
or
78
We can multiply a string by an integer. The result is simply the string repeated the appropriate number of times.
# Multiply a string by an integer
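A short sketch of string repetition, using arbitrary example strings:

```python
# Multiplying a string by an integer repeats the string
print('ha' * 3)    # hahaha
print('-' * 10)    # ----------

# The integer can also come first; the result is the same
print(3 * 'ha')    # hahaha
```

Repetition like this is handy for small formatting tasks, such as printing a divider line of dashes.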
Variables#
A variable is like a container that stores information. There are many kinds of information that can be stored in a variable, including the data types we have already discussed (integers, floats, and strings). We create (or initialize) a variable with an assignment statement. The assignment statement gives the variable an initial value.
# Initialize an integer variable and add 22
new_integer_variable = 5
new_integer_variable + 22
The value of a variable can be overwritten with a new value.
# Overwrite the value of my_favorite_number when the commented out line of code is executed.
# Remove the # in the line "#my_favorite_number = 9" to turn the line into executable code.
my_favorite_number = 7
#my_favorite_number = 9
my_favorite_number
# Overwriting the value of a variable using its original value
cats_in_house = 1
cats_in_house = cats_in_house + 2
cats_in_house
# A shorthand version
cats_in_house += 2
cats_in_house
# Initialize a string variable and concatenate another string
new_string_variable = 'Hello '
new_string_variable + 'World!'
You can create a variable with almost any name, but there are a few guidelines that are recommended.
Variable Names Should be Descriptive#
If we create a variable that stores the day of the month, it is helpful to give it a name that makes the value stored inside it clear, like day_of_month. From a logical perspective, we could call the variable almost anything (hotdog, rabbit, flat_tire). As long as we are consistent, the code will execute the same. When it comes time to read, modify, and understand the code, however, it will be confusing to you and others. Consider this simple program that lets us change the days variable to compute the number of seconds in that many days.
# Compute the number of seconds in 3 days
days = 3
hours_in_day = 24
minutes_in_hour = 60
seconds_in_minute = 60
days * hours_in_day * minutes_in_hour * seconds_in_minute
We could write a program that is logically the same, but uses confusing variable names.
hotdogs = 60
sasquatch = 24
example = 3
answer = 60
answer * sasquatch * example * hotdogs
This code gives us the same answer as the first example, but it is confusing. Not only does this code use variable names that are confusing, it also does not include any comments to explain what the code does. It is not clear that we would change example to set a different number of days. It is not even clear what the purpose of the code is. As code gets longer and more complex, having clear variable names and explanatory comments is very important.
Variable Naming Rules#
In addition to being descriptive, variable names must follow 3 basic rules:
Must be one word (no spaces allowed)
Only letters, numbers and the underscore character (_)
Cannot begin with a number
# Which of these variable names are acceptable?
# Comment out the variables that are not allowed in Python and run this cell to check if the variable assignment works.
# If you get an error, the variable name is not allowed in Python.
$variable = 1
a variable = 2
a_variable = 3
4variable = 4
variable5 = 5
variable-6 = 6
variAble = 7
Avariable = 8
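Another way to test a name without triggering an error is the string method isidentifier(), which is built into Python but is not otherwise covered in this lesson. The sketch below checks a few hypothetical names, chosen only for illustration, against the three rules above.

```python
# Each candidate name is written as a string, then checked with .isidentifier()
print('day_of_month'.isidentifier())    # True:  letters and underscores only
print('day2'.isidentifier())            # True:  a number is allowed after the first character
print('day of month'.isidentifier())    # False: contains spaces
print('day-of-month'.isidentifier())    # False: contains hyphens
print('2nd_day'.isidentifier())         # False: begins with a number
```

One caveat: isidentifier() also returns True for words that Python reserves for itself, such as class, even though those cannot be used as variable names.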
Variable Naming Style Guidelines#
The three rules above describe absolute rules of Python variable naming. If you break those rules, your code will raise an error and fail to execute properly. There are also style guidelines that, while they won’t break your code, are generally advised for making your code readable and understandable. These style guidelines are written in PEP 8, the Python Enhancement Proposal (PEP) that serves as the style guide for Python code.
The current version of the style guide advises that variable names should be written:
lowercase, with words separated by underscores as necessary to improve readability.
If you have written code before, you may be familiar with other styles, but these notebooks will attempt to follow the PEP guidelines for style. Ultimately, the most important thing is that your variable names are consistent so that someone who reads your code can follow what it is doing. As your code becomes more complicated, writing detailed comments with # will also become more important.
The print() and input() Functions#
Many different kinds of programs often need to do very similar operations. Instead of writing the same code over again, you can use a function. Essentially, a function is a small snippet of code that can be quickly referenced. There are three kinds of functions:
Native functions built into Python
Functions others have written that you can import
Functions you write yourself
For now, let’s look at a few of the native functions. One of the most common functions used in Python is the print() function, which simply prints a string.
# A print function that prints: Hello World!
print('Hello World!')
We could also define a variable with our string 'Hello World!' and then pass that variable into the print() function. It is common for functions to take an input, called an argument, that is placed inside the parentheses ().
# Define a string and then print it
our_string = 'Hello World!'
print(our_string)
There is also an input() function for taking user input.
# A program to greet the user by name
print('Hi. What is your name?') # Ask the user for their name
user_name = input() # Take the user's input and put it into the variable user_name
print('Pleased to meet you, ' + user_name) # Print a greeting with the user's name
We defined a string variable user_name to hold the user’s input. We then called the print() function to print the concatenation of ‘Pleased to meet you, ‘ and the user’s input that was captured in the variable user_name. Remember that we can use a + to concatenate, meaning join these strings together.
Here are a couple more tricks we can use. You can pass a string into the input() function to serve as a prompt, and you can use an f-string to add the variable into the printed string without using the + operator to concatenate both strings.
# A program to greet the user by name
# Passing the prompt into the input() function
# prints it automatically before taking the input
user_name = input('Hi. What is your name? ')
An f-string starts with the letter f before the opening quote and automatically inserts the value of any variable enclosed in curly braces, like {variable_name}.
# Using an f string to automatically concatenate
# Without using the plus (+) operator
print(f'Pleased to meet you, {user_name}')
We can concatenate many strings together, but we cannot concatenate strings with integers or floats.
# Concatenating many strings within a print function
print('Hello, ' + 'all ' + 'these ' + 'strings ' + 'are ' + 'being ' + 'connected ' + 'together.')
# Trying to concatenate a string with an integer causes an error
print('There are ' + 7 + 'continents.')
The str(), int(), and float() functions#
We can transform one variable type into another variable type with the str(), int(), and float() functions. Let’s convert the integer above into a string so we can concatenate it.
# Converting an integer into a string
print('There are ' + str(7) + ' continents.')
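The three conversion functions can be compared side by side. The cell below is a small sketch; the variable text_number and its value are hypothetical and chosen only for illustration.

```python
# A string that happens to contain digits (hypothetical example value)
text_number = '12'

print(int(text_number) + 3)      # 15    (string converted to an integer, then added)
print(float(text_number) + 3)    # 15.0  (string converted to a float, then added)
print(str(12) + '3')             # 123   (integer converted to a string, then concatenated)
print(int(7.9))                  # 7     (int() drops the decimal part; it does not round)
print(15 == int('15'))           # True  (after conversion, the two values are equal)
```

Notice that the same + operator adds or concatenates depending on the data types it is given, which is why converting first matters.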
# Convert the variable `number` to an integer
# Then add 10 to it and print `number`
number = '5'
Mixing strings with floats and integers can have unexpected results. See if you can spot the problem with the program below.
# A program to tell a user how many months old they are
user_age = input('How old are you? ') # Take the user input and put it into the variable user_age
number_of_months = user_age * 12 # Define a new variable number_of_months that multiplies the user's age by 12
print('That is more than ' + number_of_months + ' months old!' ) # Print a response that tells the user they are at least number_of_months old
In order to compute the variable number_of_months, we multiply user_age by 12. The problem is that user_age is a string. Multiplying a string by 12 simply makes the string repeat 12 times. After the user gives us their age, we need that input to be converted to an integer. Can you fix the program?
Attribution
Created by Nathan Kelber and Ted Lawless for JSTOR Labs under Creative Commons CC BY License