Lecture 8.1 (March 8)
Tuesday, we saw data access:
- linking to
- defining data types in
- adding data (
- saving changes (
We want to manage user logins in the database. To handle that, we will create a User model:
class User(db.Model):
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    login = db.Column(db.String(20))
    pw_hash = db.Column(db.String(64))
Then we will add a Create Account form to the Login page.
Create Account forms generally have two password fields, so the user types the password twice and we can check that the two entries match.
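A minimal sketch of that check, written as a plain function so it can be called from any form handler. The function name, the parameter names `password` and `confirm`, and the 8-character minimum are illustrative assumptions, not part of the course code.

```python
def validate_new_password(password, confirm, min_length=8):
    """Return a list of error messages; an empty list means the input is acceptable."""
    errors = []
    # The two fields exist only to catch typos, so they must be identical.
    if password != confirm:
        errors.append("Passwords do not match.")
    # Assumed policy: reject very short passwords.
    if len(password) < min_length:
        errors.append(f"Password must be at least {min_length} characters.")
    return errors
```

The handler would re-render the form with these messages when the list is non-empty, and go on to store the password only when it is empty.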
Storing the Password
Once we have validated the password, we need to store it.
- Do not store the password in clear text
- Do not just run the password through MD5 or SHA1 and store it
- Do use a dedicated password hashing function or key-derivation function
Reasonable choices for password hashing:
- bcrypt
- PBKDF2 (Password-Based Key Derivation Function 2)
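For comparison, PBKDF2 is available in the Python standard library. The sketch below shows the general shape of a key-derivation approach: a random salt, a deliberately slow derivation, and a constant-time comparison. The helper names and the iteration count are assumptions chosen for illustration, not recommendations from the lecture.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; higher is slower and harder to brute-force


def pbkdf2_hash(password, salt=None):
    """Derive a 32-byte key from the password; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf8"), salt, ITERATIONS)
    return salt, digest


def pbkdf2_verify(password, salt, expected):
    """Re-derive with the stored salt and compare in constant time."""
    _, digest = pbkdf2_hash(password, salt)
    return hmac.compare_digest(digest, expected)
```

Unlike a bcrypt hash, this scheme needs the salt stored alongside the digest.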
We will use bcrypt, as it has a nice Python module.
For more information, see How to Safely Store a Password.
To hash the password:
pw_hash = bcrypt.hashpw(password.encode('utf8'), bcrypt.gensalt())
pw_hash is a byte string, so we need to decode it to a string before saving it. bcrypt hashes are ASCII, so pw_hash.decode('ascii') works.
Verifying the Password
Now we need to modify our login handler to validate the user's password!
We also need to store user IDs instead of user names in the session cookie.
To verify a password:
# pw_hash must be bytes here; if it came from the database as a string, encode it first
check_hash = bcrypt.hashpw(password.encode('utf8'), pw_hash)
if check_hash == pw_hash:
    pass  # it's good
So we see this encode and decode. What's up with that?
We have to have some way of representing text in the computer. This is done through character encodings. A character encoding describes how the letter 'A', for example, is stored in the computer's memory or storage.
One of the oldest encodings is ASCII, which can store English letters, numbers, and common symbols.
But the whole world doesn't speak English. So other encodings were developed; many of them extended ASCII. Programs that worked with text had to be aware of the specific encoding that the text was stored in, and the encoded bytes were manipulated directly.
In the early 1990s, people decided it'd be nice to have a common way to work with text. So Unicode was developed to describe every character used in written human language. It defines code points representing the many different characters, and mappings to and from encodings.
A Unicode string is a sequence of code points.
We then encode that string into bytes, using an encoding like 'utf8' or 'ascii' or 'SHIFT-JIS'.
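A small sketch of the difference between code points and encoded bytes, using only built-in Python. The sample strings are arbitrary and written with escapes so the exact code points are unambiguous.

```python
text = "na\u00efve caf\u00e9"  # "naive cafe" with two accented letters

# A Unicode string is a sequence of code points: one integer per character.
code_points = [ord(ch) for ch in text]

# The same string becomes different bytes under different encodings.
utf8_bytes = text.encode("utf8")        # each accented letter takes 2 bytes
latin1_bytes = text.encode("latin-1")   # every character takes 1 byte

# ASCII has no way to represent the accented letters at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Japanese text: 3 code points, 9 bytes in UTF-8, 6 bytes in Shift-JIS.
japanese = "\u65e5\u672c\u8a9e"
japanese_utf8 = japanese.encode("utf8")
japanese_sjis = japanese.encode("shift_jis")
```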
In Python, we have two kinds of strings:
- Unicode strings, which are our ordinary strings (in Python 3 and later). They represent a sequence of code points. Internally, CPython picks a fixed-width storage (1, 2, or 4 bytes per code point, depending on the string's contents), but that does not matter. We can think of them as an array of code points.
- Byte strings, which store bytes.
To convert a Unicode string to a byte string, we encode it with the .encode method. UTF-8 is the most common and generally easiest-to-work-with encoding.
To convert a byte string to Unicode, we decode it with the .decode method. To do this, we must know its encoding! Decoding Latin-1 bytes as UTF-8 typically raises UnicodeDecodeError, and decoding with any wrong encoding can silently produce the wrong characters.
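The round trip, and what happens when the encoding is guessed wrong, can be sketched with the built-in str and bytes types. The sample word is arbitrary.

```python
original = "caf\u00e9"

utf8_bytes = original.encode("utf8")        # b'caf\xc3\xa9'
latin1_bytes = original.encode("latin-1")   # b'caf\xe9'

# Decoding with the encoding that produced the bytes gives the string back.
round_trip = utf8_bytes.decode("utf8")

# Latin-1 bytes are not valid UTF-8 here, so decoding fails loudly.
try:
    latin1_bytes.decode("utf8")
    wrong_decode_error = None
except UnicodeDecodeError as exc:
    wrong_decode_error = exc

# The reverse mistake does not fail: every byte is valid Latin-1,
# so we silently get two wrong characters in place of the accented letter.
mojibake = utf8_bytes.decode("latin-1")
```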
The bcrypt API works on byte strings. We will encode the password using UTF-8 (so it can contain any valid Unicode character). The resulting hash is encoded in ASCII, which we can decode to save it in the database.
Internally, the database will encode the strings, of course. But our database library exposes a string-based API, and it works well to decode these particular byte strings as ASCII.
OK, so we have users. Let's make users own the animals they create!
- Add a user ID column to Animal. This will be a foreign key referencing the id column of User.
- Add a relationship to Animal, to make it easy to work with the user from Python.
- Show who added the animal in the template.
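To see what the foreign key means at the database level, here is a sketch using the standard-library sqlite3 module directly rather than the db.Model classes used in this course. The table layout, the column name `creator_id`, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY AUTOINCREMENT, login TEXT)")
conn.execute(
    "CREATE TABLE animal ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT,"
    " creator_id INTEGER REFERENCES user(id))"  # hypothetical column name
)

cur = conn.execute("INSERT INTO user (login) VALUES (?)", ("alice",))
alice_id = cur.lastrowid
conn.execute("INSERT INTO animal (name, creator_id) VALUES (?, ?)", ("otter", alice_id))
conn.execute("INSERT INTO animal (name, creator_id) VALUES (?, ?)", ("heron", alice_id))

# "Show who added the animal": join each animal to the user that created it.
rows = conn.execute(
    "SELECT animal.name, user.login FROM animal"
    " JOIN user ON animal.creator_id = user.id ORDER BY animal.id"
).fetchall()

# The foreign key rejects an animal whose creator does not exist.
try:
    conn.execute("INSERT INTO animal (name, creator_id) VALUES (?, ?)", ("ghost", 999))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

In the model classes, the relationship plays the role of the JOIN: it lets Python code reach the user object from an animal without writing the query by hand.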
We will write the example code to do this.
Add a backref to the user relationship.
Show animals for a user!