This notebook is to help you learn the basics of Python data structures, particularly lists and dictionaries. You have hopefully seen a little bit of these two data structures in 2308, as list
or vector
and hash_map
. Python and JavaScript both make these structures very easy to work with, and web development depends heavily on them.
Python provides a variety of basic data types that support common operations.
You can write numbers and strings:
a_num = 5
a_str = 'puppy'
You can do math with these:
a_num * 3
And manipulate strings:
a_str + ' dog'
Python has another really cool feature for strings: string formatting. You can write a string that has some placeholders ({}
), and then fill those in with values:
'the {} says {} {} times'.format('cat', 'meow', 5)
This is most useful when you have variables that you want to put in to a string:
'the {} says {}'.format(a_str, 'woof')
To find out how long a string is, use the len
function:
len(a_str)
Lists do what they say on the tin: they store a list of things. Think like a grocery list, or a to-do list: it is a sequence of items.
For example, the following code creates a new list and assigns it to a variable called beatles
:
beatles = ['John', 'Paul', 'George', 'Ringo']
We can see that the list has 4 items, again with the len
function:
len(beatles)
The len
function applies to lots of different kinds of Python objects. If you want to find out how ‘big’ something is, try len
.
Lists function a lot like arrays in C++. You can get a specific item (indexes start from 0, just like C++):
beatles[0]
You can also reassign a particular value:
beatles[2] = 'Albert'
One of the most common things to do with a list is to iterate over it. Python's for
loop does this: it loops over the list, running the code inside it once for each element in the list. The loop also defines a variable that, during each time through the loop, stores that particular value. For example, if we want to print the Beatles:
for beatle in beatles:
print(beatle)
We can see the different members of the Beatles. Well, with one change — the member we swapped out back in line 10.
What if we want to know which beatle number we're at? In C++, when you loop over an array using an index variable, e.g.
for (int i = 0; i < n_items; i++) {
}
you have the i
variable that tells you where you are. In Python, to do that, we use the enumerate
function and loop over its results:
for i, beatle in enumerate(beatles):
print('Beatle number {} is {}'.format(i+1, beatle))
What enumerate
does is associate a position with each element in the list, and then we loop over those pairs.
We can also loop with an index i
, but it is not idiomatic Python:
for i in range(len(beatles)):
print('Beatle number {} is {}'.format(i+1, beatles[i]))
OK, so we can make lists, and just write them out — that's easier than C++! — and we can iterate over them (with for
), and find out how long they are (with len
). What else can we do with them?
One really useful thing is to append to them. This is where they get much more interesting than C++ arrays. Let's start by storing an empty list in a variable:
results = []
Now results
references an empty list — there's nothing in it:
len(results)
Let's add something:
results.append('wumpus')
Each call to results.append
added a new element to the end of results
. We can see that there's an item now:
len(results)
And what it is:
results[0]
We can also delete it:
del results[0]
len(results)
Let's put several things in it. Maybe, things about the Beatles:
# Even though results is empty again, it's good practice to create our
# empty accumulator right before the loop:
results = []
for i, beatle in enumerate(beatles):
results.append('Beatle number {} is {}'.format(i+1, beatle))
Now how many results do we have?
len(results)
And what, praytell, are these results?
results
(The Python notebook dumps out a list in Python syntax if we just ask it to display a list. Pretty nifty.)
There are a bunch more things that we can do with lists, but those are the basics. The big things to remember are:
append
) and removed (with del
).Lists let us store a bunch of things in some order, and we can access them by position (or index) in the list.
But what if we have things that have names, and maybe we don't care about order?
A dictionary allows us to store a bunch of things and access them by name (called a key) rather than by position. A name, not a number.
Let's make a simple dictionary that describes a British monarch:
liz_the_first = {
'name': 'Elizabeth I',
'region': 'England',
'birth_date': '1533-09-07',
'death_date': '1603-03-24',
'coronation_date': '1559-01-15'
}
The squiggly braces denote a dictionary, and the keys and values are separated by colons (:
). We say that the dictionary maps its keys to values.
Again, len
tells us how many things are in it:
len(liz_the_first)
We can get individual items:
liz_the_first['name']
We see here that items are identified by their keys, or names, instead of numbers. So we can think of a dictionary as being sort-of like a list, with two differences:
Very useful for many kinds of things.
While we can theoretically use any kind of value for the key, we will generally use strings or numbers.
We can also assign new values:
liz_the_first['mother'] = 'Anne Boleyn'
liz_the_first['family'] = 'Tudor'
Now we have two new values:
len(liz_the_first)
We can get the list of keys (or names) in the dictionary:
liz_the_first.keys()
It says dict_keys
at the beginning, but we can treat it just like a list:
for k in liz_the_first.keys():
print('field: {}'.format(k))
Notice that the keys are in a seemingly random order. This is because they are arranged for the internal convenience of the dictionary implementation; we get no choice in the order. However, since we care about names, not numbers, that is (generally) not a problem.
We can also delete keys, like we did with list items, and we can add any kind of value, not just strings. For example, Liz was the queen of multiple regions, so let's be more precise:
del liz_the_first['region']
liz_the_first['regions'] = ['England', 'Ireland']
liz_the_first
The notebook has helpfully printed out the dictionary, so we can see that there is no longer a ‘region’ key, but ‘regions’ is now mapped to a list of the countries of which Elizabeth I was the queen.
Now, we can also iterate over the items (key-value pairs) in a dictionary, in a manner very similar to the enumerate
thing we did to the list:
for key, value in liz_the_first.items():
print('{}: {}'.format(key, value))
We can also get the values of the dictionary, again as a list:
liz_the_first.values()
We're going to be using lists and dictionaries a lot in both Python and JavaScript (the concepts are very similar between two languages, and I will be providing a JavaScript version of this notebook when the time comes).