As the session below shows, the urlopen function is no longer part of urllib; it is now part of urllib.request.
>>> import urllib
>>> page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
Traceback (most recent call last):
File "", line 1, in
page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
AttributeError: 'module' object has no attribute 'request'
>>> import urllib.request
>>> page = urllib.request.urlopen('http://www.cs.luther.edu/python/test.html')
>>> pageText = page.read()
>>> pageText
b'\n\n\n\t\n\tTest Page \n\t\n\t\n\t\n\n\nHello Python Programmer!
\nThis is a test page for the urllib2 module program
\n\n\n'
>>> type(pageText)
<class 'bytes'>
If moving the urlopen function to urllib.request were the only change, that would not have been too bad. The more difficult change is the very subtle addition of the b before the quotes in the value of pageText. In fact, you can see that the variable pageText refers to a bytes object rather than a string.
The good news is that bytes objects act very similarly to strings. The bad news is that you cannot simply mix and match strings with bytes.
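To make the good news concrete, here is a small sketch of a few string-like operations on a bytes object. The sample value is made up for illustration and is not the actual page content; note also that the arguments passed to bytes methods must themselves be bytes.

```python
# A made-up bytes value standing in for downloaded data.
data = b'Hello Python Programmer!'

# Many familiar str methods also exist on bytes.
words = data.split()              # a list of bytes objects
upper = data.upper()              # still bytes, not str
position = data.find(b'Python')   # the search argument must be bytes too

# One difference: indexing bytes gives an integer, not a one-character string.
first = data[0]

print(words)
print(upper)
print(position)
print(first)
```

The last line is the one to watch for: data[0] is the integer 72 (the code for 'H'), whereas indexing a string would give 'H'.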
The session below illustrates the difficulty:
>>> 'foo' + b'bar'
Traceback (most recent call last):
File "", line 1, in
'foo' + b'bar'
TypeError: Can't convert 'bytes' object to str implicitly
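One way around this error is to convert explicitly so that both operands have the same type. The sketch below uses a made-up bytes value in place of the result of page.read(), and it assumes the data is UTF-8 encoded, which the session above does not actually tell us.

```python
# Made-up sample standing in for the bytes returned by page.read().
page_bytes = b'Hello Python Programmer!'

# bytes -> str: decode using the encoding the data was written in.
# UTF-8 is an assumption here; a real page may use something else.
page_text = page_bytes.decode('utf-8')
greeting = 'foo' + page_text              # str + str works

# str -> bytes: encode the string so both operands are bytes.
combined = 'foo'.encode('utf-8') + b'bar'  # bytes + bytes works

print(greeting)
print(combined)
```

Which direction to convert depends on what you plan to do next: decode to str if you want to treat the data as text, or encode to bytes if you are about to write it to a socket or a binary file.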
We will work through these differences in subsequent posts about the rest of chapter 5.