Python Programming in Context: Chapter 5 -- Session 5.7

One big change in Python 3.0 that we did not notice until after the book had gone to press affects the module for reading from the Internet: the urllib module was changed substantially.

As the session below shows, the urlopen function is no longer part of urllib itself. It is now part of urllib.request, which must be imported explicitly.



>>> import urllib
>>> page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
Traceback (most recent call last):
  File "", line 1, in <module>
    page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
AttributeError: 'module' object has no attribute 'request'
>>> import urllib.request
>>> page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
>>> pageText = page.read()
>>> pageText
b'\n\n\n\t\n\tTest Page\n\t\n\t\n\t\n\n\nHello Python Programmer!
\nThis is a test page for the urllib2 module program
\n\n\n'
>>> type(pageText)
<class 'bytes'>

If moving the urlopen function to urllib.request had been the only change, that would not have been too bad. The more difficult change is the very subtle addition of the b before the quotes in the value of pageText. In fact, pageText refers not to a string but to an object of type bytes.
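Here is a minimal sketch of the same steps written as a script rather than typed at the prompt. It is not a listing from the book. So that it runs without a network connection, it assumes a data: URL as a stand-in for the test page (urllib.request accepts these in Python 3.4 and later); the names SAMPLE_URL and fetch_bytes are hypothetical.

```python
import urllib.request

# Hypothetical stand-in for the test page: a data: URL needs no network access.
SAMPLE_URL = 'data:text/plain;charset=utf-8,Hello%20Python%20Programmer!'

def fetch_bytes(url):
    """Open url and return the raw response body, which is a bytes object."""
    with urllib.request.urlopen(url) as page:
        return page.read()

page_bytes = fetch_bytes(SAMPLE_URL)
print(type(page_bytes))   # the type is bytes, not str
print(page_bytes)         # displayed with the leading b
```

Replacing SAMPLE_URL with an http URL should behave the same way, since read() returns bytes in either case.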

The good news is that bytes objects act very similarly to strings. The bad news is that you cannot simply mix and match strings with bytes.
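To make "very similarly" concrete, here is a small sketch comparing a string with a bytes object that holds the same characters. The variable names text and data are our own, chosen only for illustration.

```python
text = 'Hello Python Programmer!'
data = b'Hello Python Programmer!'

# Many str methods have bytes counterparts that take and return bytes.
print(text.upper(), data.upper())
print(text.split(), data.split())
print(text.find('Python'), data.find(b'Python'))

# One difference: indexing a str gives a one-character str,
# while indexing a bytes object gives an int (the byte value).
print(text[0], data[0])
```

Note that the arguments must match the object: data.find takes b'Python', not 'Python'.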

The session below illustrates the difficulty:


>>> 'foo' + b'bar'
Traceback (most recent call last):
  File "", line 1, in <module>
    'foo' + b'bar'
TypeError: Can't convert 'bytes' object to str implicitly
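One common way around this error is to convert explicitly so that both operands have the same type. The sketch below assumes the data is UTF-8 encoded, which is our assumption and not something the page itself tells us. The exact wording of the TypeError message also varies between Python 3 versions.

```python
page_bytes = b'Hello Python Programmer!'

# bytes -> str: decode using the encoding the data was written in (assumed UTF-8).
page_text = page_bytes.decode('utf-8')
greeting = 'foo ' + page_text

# str -> bytes: encode before combining with other bytes.
combined = 'foo'.encode('utf-8') + b'bar'

# Mixing the two types directly still raises TypeError.
try:
    'foo' + b'bar'
except TypeError as err:
    print('TypeError:', err)

print(greeting)
print(combined)
```

Which direction to convert depends on what the rest of the program expects: decode when you want to treat the page as text, encode when you need to send or store raw bytes.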

We will work through these differences in subsequent posts about the rest of Chapter 5.

Wednesday, March 18, 2009
