Python urllib / urllib2 POST

25-02-18 16

This article looks at sending POST requests with Python's urllib and urllib2. It also collects related material: an overview of urllib and urllib2, a urllib2 URLError question, and example code for fetching web pages with urllib, urllib2 and httplib.

Contents:

Python urllib / urllib2 POST
Python urllib urllib2
Python urllib2 URLError
Python urllib, urllib2 and httplib: example code for fetching web pages

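Python urllib / urllib2 POST

I'm trying to create a super-simple virtual in/out board using wx/Python. One of my requests to the server where the data is stored uses the following code:

import urllib
import urllib2

data = urllib.urlencode({'q': 'Status'})
u = urllib2.urlopen('http://myserver/inout-tracker', data)
for line in u.readlines():
    print line

Nothing special. The problem is that, based on how I read the documentation, this should perform a POST request because I've supplied the data parameter, but that isn't happening. The index page at that URL contains the following code:

if (!isset($_POST['q'])) { die ('No action specified'); }
echo $_POST['q'];

Every time I run my Python app, the text "No action specified" is printed to my console. I'm going to try implementing it with Request objects, since I've seen a few demos that use them, but I'm wondering whether anyone can help explain why I don't get a POST request with this code. Thanks!

-EDIT-

This code does work and posts to my web page correctly:

import httplib
import urllib

data = urllib.urlencode({'q': 'Status'})
h = httplib.HTTPConnection('myserver:8080')
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain"}
h.request('POST', '/inout-tracker/index.php', data, headers)
r = h.getresponse()
print r.read()

I'm still not sure why the urllib2 library doesn't POST when I supply the data parameter; to me the documentation indicates that it should.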

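Answer 1

u = urllib2.urlopen('http://myserver/inout-tracker', data)
h.request('POST', '/inout-tracker/index.php', data, headers)

Requesting the path /inout-tracker without a trailing / does not fetch index.php. Instead, the server issues a 302 redirect to the version with the trailing /.

A 302 typically causes clients to convert the POST into a GET request, so the form data never reaches index.php and $_POST is empty. Posting directly to the URL with the trailing slash (or to /inout-tracker/index.php, as the httplib version does) should avoid the redirect.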
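The behaviour described in the answer can be reproduced without the original server. The following is a minimal sketch in Python 3, where urllib2's functionality lives in urllib.request. It starts a throwaway local server that only imitates the redirect described above; `TrackerHandler` and `method_seen_by_server` are hypothetical names introduced for illustration.

```python
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class TrackerHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the PHP site: redirects the slash-less path."""

    def _handle(self):
        # Drain any request body so the socket is left in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        if length:
            self.rfile.read(length)
        if self.path == "/inout-tracker":
            self.send_response(302)
            self.send_header("Location", "/inout-tracker/")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # Echo the HTTP method that actually reached this path.
            body = self.command.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    do_GET = do_POST = _handle

    def log_message(self, *args):
        pass  # keep the demo quiet


def method_seen_by_server(path):
    """POST form data to `path` and return the method the final handler saw."""
    server = HTTPServer(("127.0.0.1", 0), TrackerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # An empty ProxyHandler keeps environment proxy settings out of the demo.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        data = urllib.parse.urlencode({"q": "Status"}).encode("ascii")
        url = "http://127.0.0.1:%d%s" % (server.server_port, path)
        with opener.open(url, data, timeout=5) as resp:
            return resp.read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(method_seen_by_server("/inout-tracker"))   # redirected request
    print(method_seen_by_server("/inout-tracker/"))  # direct request
```

With the slash-less path the final request arrives as a GET without the form data, while the path with the trailing slash receives the POST unchanged.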

Python urllib urllib2

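urllib2 is an extension of urllib.

Similarities and differences:

The most commonly used functions, urllib.urlopen and urllib2.urlopen, are similar, but their parameters differ: urllib.urlopen takes a proxies argument, while urllib2.urlopen takes a timeout argument and configures proxies through a ProxyHandler.

urllib.urlopen accepts only a URL string, while urllib2.urlopen accepts either a URL string or a Request object. Headers can be set on a Request object, whereas urllib.urlopen offers no way to set them.

urllib has the urlencode function for encoding parameters and urllib2 does not, so the two are often used together.

urllib2 is the more capable of the two and includes a range of handlers and openers.

There is also the httplib module, which provides the lowest-level HTTP request methods and can issue GET, POST, PUT and other requests.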
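The division of labour between encoding and requesting can be seen without touching the network. This is a minimal sketch in Python 3, where urlencode lives in urllib.parse and Request in urllib.request; `build_request` is a hypothetical helper, not part of either library.

```python
import urllib.parse
import urllib.request


def build_request(url, fields=None, headers=None):
    """Return a Request; it becomes a POST only when form fields are supplied."""
    data = None
    if fields is not None:
        # urlencode produces text; the request body must be bytes in Python 3.
        data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data, headers=headers or {})


get_req = build_request("http://www.example.com/")
post_req = build_request(
    "http://www.example.com/",
    fields={"q": "Status"},
    headers={"User-Agent": "demo-client"},
)

print(get_req.get_method())   # method chosen when no data is attached
print(post_req.get_method())  # method chosen when data is attached
print(post_req.data)
```

Nothing is sent until the Request is passed to urlopen, so the method and headers can be inspected or adjusted first.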

Reference: http://blog.csdn.net/column/details/why-bug.html

The most basic usage:

import urllib2
response = urllib2.urlopen('http://www.baidu.com/')
html = response.read()
print html

Using a Request object:

import urllib2
req = urllib2.Request('http://www.baidu.com')
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

Sending form data:

import urllib
import urllib2

url = 'http://www.someserver.com/register.cgi'

values = {'name': 'WHY',
          'location': 'SDU',
          'language': 'Python'}

data = urllib.urlencode(values)  # encode the form fields
req = urllib2.Request(url, data)  # build the request with the form data attached
response = urllib2.urlopen(req)  # send the request and receive the response
the_page = response.read()  # read the response body

Sending the same data as a GET query string instead:

import urllib2
import urllib

data = {}

data['name'] = 'WHY'
data['location'] = 'SDU'
data['language'] = 'Python'

url_values = urllib.urlencode(data)
print url_values
# Prints the encoded query string, for example
# name=WHY&language=Python&location=SDU (key order may vary)

url = 'http://www.example.com/example.cgi'
full_url = url + '?' + url_values

data = urllib2.urlopen(full_url)
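To see what urlencode does to values that contain spaces, the query string can be built and then decoded again offline. This is a small Python 3 sketch using urllib.parse; `with_query` is a hypothetical helper and the field values are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_query(url, params):
    """Append URL-encoded parameters to a URL that has no query string yet."""
    return url + "?" + urlencode(params)


full_url = with_query(
    "http://www.example.com/example.cgi",
    {"name": "Somebody Here", "language": "Python"},
)
print(full_url)

# Decoding the query string recovers the original values as lists.
decoded = parse_qs(urlsplit(full_url).query)
print(decoded)
```

Spaces are encoded as `+`, and parse_qs returns a list per key because a key may legitimately appear more than once in a query string.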

Setting headers on an HTTP request:

import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'WHY',
          'location': 'SDU',
          'language': 'Python'}

headers = {'User-Agent': user_agent}
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()

The following examples lead up to openers and handlers, starting with geturl():

from urllib2 import Request, urlopen


old_url = 'http://t.cn/RIxkRnO'
req = Request(old_url)
response = urlopen(req)
print 'Old url :' + old_url
print 'Real url :' + response.geturl()

The URL returned by response.geturl() differs from old_url because the request was redirected.

Inspecting the response headers with info():

from urllib2 import Request, urlopen

old_url = 'http://www.baidu.com'
req = Request(old_url)
response = urlopen(req)
print 'Info():'
print response.info()

An example of an opener and a handler:

# -*- coding: utf-8 -*-
import urllib2

# Create a password manager
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()

# Add the username and password

top_level_url = "http://example.com/foo/"

# If we knew the realm, we could use it instead of None.
# password_mgr.add_password(None, top_level_url, username, password)
password_mgr.add_password(None, top_level_url, 'why', '1223')

# Create a new handler
handler = urllib2.HTTPBasicAuthHandler(password_mgr)

# Create an "opener" (an OpenerDirector instance)
opener = urllib2.build_opener(handler)

# Note: the credentials are only offered for URLs under top_level_url
a_url = 'http://www.baidu.com/'

# Use the opener to fetch a URL
opener.open(a_url)

# Install the opener.
# From now on, all calls to urllib2.urlopen use our opener.
urllib2.install_opener(opener)
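Which URLs the stored credentials apply to can be checked without sending a request. The sketch below uses the Python 3 counterpart in urllib.request and the password manager's find_user_password lookup; `make_password_manager` is a hypothetical helper, and the URLs and credentials are the placeholder values from the example above.

```python
import urllib.request


def make_password_manager():
    """Register one set of credentials for everything under /foo/."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, "http://example.com/foo/", "why", "1223")
    return mgr


mgr = make_password_manager()

# A URL below the registered prefix matches, whatever realm the server names.
inside = mgr.find_user_password("Some Realm", "http://example.com/foo/bar.html")

# A URL on another host does not match, so no credentials would be sent.
outside = mgr.find_user_password(None, "http://www.baidu.com/")

print(inside)
print(outside)
```

This is why opening a URL on a different host through the opener behaves like an ordinary unauthenticated request.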

Some tips follow.

Proxy settings:

import urllib2
enable_proxy = True
proxy_handler = urllib2.ProxyHandler({"http": 'http://some-proxy.com:8080'})
null_proxy_handler = urllib2.ProxyHandler({})
if enable_proxy:
    opener = urllib2.build_opener(proxy_handler)
else:
    opener = urllib2.build_opener(null_proxy_handler)
urllib2.install_opener(opener)

Setting a timeout.

Before Python 2.6:

import urllib2
import socket
socket.setdefaulttimeout(10)  # time out after 10 seconds
urllib2.socket.setdefaulttimeout(10)  # another way to do the same thing

From Python 2.6 on:

import urllib2
response = urllib2.urlopen('http://www.google.com', timeout=10)
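What a timeout looks like to the caller can be tried locally. The following Python 3 sketch points urllib.request at a socket that accepts connections but never answers; `fetch_or_none`, `silent_server` and `timed_out_against_silent_server` are hypothetical names, and the half-second timeout is an arbitrary choice for the demo.

```python
import socket
import urllib.error
import urllib.request


def fetch_or_none(url, timeout):
    """Return the body, or None if the server stays silent for `timeout` seconds."""
    # An empty ProxyHandler keeps environment proxy settings out of the demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.read()
    except socket.timeout:
        # Raised directly when the wait for the response times out.
        return None
    except urllib.error.URLError as exc:
        # A timeout while connecting is wrapped in URLError instead.
        if isinstance(exc.reason, socket.timeout):
            return None
        raise


def silent_server():
    """Listening socket that queues incoming connections but never replies."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    return sock


def timed_out_against_silent_server(timeout=0.5):
    """Return True if a request to the silent server gives up after `timeout`."""
    sock = silent_server()
    try:
        url = "http://127.0.0.1:%d/" % sock.getsockname()[1]
        return fetch_or_none(url, timeout) is None
    finally:
        sock.close()


if __name__ == "__main__":
    print(timed_out_against_silent_server())
```

The timeout applies to each blocking socket operation rather than to the request as a whole, so a slow but steadily responding server may still take longer than the value given.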

Adding a header to a Request:

import urllib2
request = urllib2.Request('http://www.baidu.com/')
request.add_header('User-Agent', 'fake-client')
response = urllib2.urlopen(request)
print response.read()

Redirects:

import urllib2
my_url = 'http://www.google.cn'
response = urllib2.urlopen(my_url)
redirected = response.geturl() != my_url  # True when the final URL differs
print redirected

my_url = 'http://rrurl.cn/b1UZuP'
response = urllib2.urlopen(my_url)
redirected = response.geturl() != my_url
print redirected

Intercepting redirects with a custom handler:

import urllib2
class RedirectHandler(urllib2.HTTPRedirectHandler):
    # These overrides only log the status code. Because they return None,
    # the redirect is not followed and urllib2 raises HTTPError instead.
    def http_error_301(self, req, fp, code, msg, headers):
        print "301"
    def http_error_302(self, req, fp, code, msg, headers):
        print "302"

opener = urllib2.build_opener(RedirectHandler)
opener.open('http://rrurl.cn/b1UZuP')
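The effect of declining a redirect can be observed against a local server. This is a Python 3 sketch using urllib.request, where overriding redirect_request to return None is one way to refuse a redirect; `NoRedirect`, `ShortLinkHandler` and `probe` are hypothetical names, and the server merely imitates a URL shortener.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Declining here makes urllib raise HTTPError for the 3xx response.
        return None


class ShortLinkHandler(BaseHTTPRequestHandler):
    """Hypothetical short-link service: /short redirects to /real."""

    def do_GET(self):
        if self.path == "/short":
            self.send_response(302)
            self.send_header("Location", "/real")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Length", "2")
            self.end_headers()
            self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe(follow):
    """Fetch /short and report (status, detail) with or without following."""
    server = HTTPServer(("127.0.0.1", 0), ShortLinkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    handlers = [urllib.request.ProxyHandler({})]  # ignore environment proxies
    if not follow:
        handlers.append(NoRedirect())
    opener = urllib.request.build_opener(*handlers)
    url = "http://127.0.0.1:%d/short" % server.server_port
    try:
        with opener.open(url, timeout=5) as resp:
            # detail: whether the final URL differs from the requested one
            return resp.status, resp.geturl() != url
    except urllib.error.HTTPError as exc:
        # detail: where the server wanted to send us
        return exc.code, exc.headers.get("Location")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(probe(follow=True))
    print(probe(follow=False))
```

When the redirect is declined, the status code and the Location header are still available on the raised HTTPError, which is useful for resolving short links without fetching the target.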

Cookies:

import urllib2
import cookielib
cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
response = opener.open('http://www.baidu.com')
for item in cookie:
    print 'Name = ' + item.name
    print 'Value = ' + item.value
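The cookie jar does two jobs: it stores cookies from responses and replays them on later requests through the same opener. The Python 3 sketch below (http.cookiejar replaces cookielib) shows both against a local server; `CookieHandler` and `cookie_round_trip` are hypothetical names and the cookie value is made up.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieHandler(BaseHTTPRequestHandler):
    """Sets a cookie on every response and echoes back the Cookie header it got."""

    def do_GET(self):
        body = (self.headers.get("Cookie") or "").encode("ascii")
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def cookie_round_trip():
    """Return (first body, second body, cookies stored in the jar)."""
    server = HTTPServer(("127.0.0.1", 0), CookieHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore environment proxies
        urllib.request.HTTPCookieProcessor(jar),
    )
    url = "http://127.0.0.1:%d/" % server.server_port
    try:
        with opener.open(url, timeout=5) as resp:
            first = resp.read().decode("ascii")   # no cookie sent yet
        with opener.open(url, timeout=5) as resp:
            second = resp.read().decode("ascii")  # jar replays the cookie
    finally:
        server.shutdown()
        server.server_close()
    stored = {c.name: c.value for c in jar}
    return first, second, stored


if __name__ == "__main__":
    print(cookie_round_trip())
```

The first request carries no Cookie header, and the second one does because the jar captured the Set-Cookie value in between.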

HTTP PUT and DELETE methods:

import urllib2
# uri and data are assumed to be defined by the caller
request = urllib2.Request(uri, data=data)
request.get_method = lambda: 'PUT'  # or 'DELETE'
response = urllib2.urlopen(request)
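In Python 3 the get_method override is no longer needed, because urllib.request.Request accepts a method argument. A minimal offline sketch follows; `make_request` is a hypothetical helper and the URL and payload are placeholders.

```python
import urllib.request


def make_request(url, method, body=None):
    """Build a Request with an explicit HTTP method."""
    return urllib.request.Request(url, data=body, method=method)


put_req = make_request("http://www.example.com/item/1", "PUT", b"payload")
delete_req = make_request("http://www.example.com/item/1", "DELETE")

# Without an explicit method, attaching data still defaults to POST.
default_req = urllib.request.Request("http://www.example.com/item/1", data=b"payload")

print(put_req.get_method(), delete_req.get_method(), default_req.get_method())
```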

Getting the HTTP status code of a failed request:

import urllib2
try:
    response = urllib2.urlopen('http://bbs.csdn.net/why')
except urllib2.HTTPError as e:
    print e.code
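HTTPError is a subclass of URLError, so when both are handled the HTTPError clause has to come first. The Python 3 sketch below shows the ordering with errors raised by hand rather than by a real request; `describe`, `fake_404` and `fake_dns_failure` are hypothetical names.

```python
import io
import urllib.error


def describe(action):
    """Run `action` and summarise how it failed, if it did."""
    try:
        action()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return "HTTP %d" % exc.code
    except urllib.error.URLError as exc:
        # No HTTP response at all (DNS failure, refused connection, ...).
        return "no response: %s" % exc.reason
    return "ok"


def fake_404():
    # Simulated server error; no network access is involved.
    raise urllib.error.HTTPError(
        "http://www.example.com/why", 404, "Not Found", None, io.BytesIO(b"")
    )


def fake_dns_failure():
    # Simulated transport failure.
    raise urllib.error.URLError("name resolution failed")


print(describe(fake_404))
print(describe(fake_dns_failure))
print(describe(lambda: None))
```

If the URLError clause were listed first, it would also catch HTTPError and the status code branch would never run.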

Debug log:

import urllib2
httpHandler = urllib2.HTTPHandler(debuglevel=1)
httpsHandler = urllib2.HTTPSHandler(debuglevel=1)
opener = urllib2.build_opener(httpHandler, httpsHandler)
urllib2.install_opener(opener)
response = urllib2.urlopen('http://www.google.com')

Python urllib2 URLError

I'm making multiple connections to an API to run DELETE queries. At the 3000th query I got this error.

The code looks like this:

def delete_request(self, path):
    opener = urllib2.build_opener(urllib2.HTTPHandler)
    request = urllib2.Request('%s%s' % (self.endpoint, path))
    signature = self._gen_auth('DELETE', path, '')
    request.add_header('X-COMPANY-SIGNATURE-AUTH', signature)
    request.get_method = lambda: 'DELETE'
    resp = opener.open(request)

Then, in the console:

for i in xrange(300000):
    con.delete_request('/integration/sitemap/item.xml/media/%d/' % i)

After the 3000th request, it says:

URLError: urlopen error [Errno 10048]
Only one usage of each socket address (protocol/network address/port)
is normally permitted
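The source gives no answer to this question. One common reading, offered here as an assumption, is that every call opens a fresh TCP connection and the closed sockets linger long enough for Windows to run out of local ports (Errno 10048 is WSAEADDRINUSE). Under that assumption, reusing a single keep-alive connection is one way to avoid the problem. The Python 3 sketch below demonstrates connection reuse with http.client against a local server; `CountingHandler`, `delete_many` and `run_demo` are hypothetical names, and the signature header from the question is omitted.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open
    client_addresses = set()       # one entry per TCP connection seen

    def do_DELETE(self):
        CountingHandler.client_addresses.add(self.client_address)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def delete_many(host, port, paths):
    """Send one DELETE per path over a single reused connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    statuses = []
    try:
        for path in paths:
            conn.request("DELETE", path)
            resp = conn.getresponse()
            resp.read()  # drain the body so the connection can be reused
            statuses.append(resp.status)
    finally:
        conn.close()
    return statuses


def run_demo(count):
    """Return (list of statuses, number of TCP connections the server saw)."""
    CountingHandler.client_addresses.clear()
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        paths = ["/integration/sitemap/item.xml/media/%d/" % i for i in range(count)]
        statuses = delete_many("127.0.0.1", server.server_port, paths)
    finally:
        server.shutdown()
        server.server_close()
    return statuses, len(CountingHandler.client_addresses)


if __name__ == "__main__":
    statuses, connections = run_demo(50)
    print(len(statuses), "requests over", connections, "connection(s)")
```

Whether this resolves the original error depends on the real API honouring keep-alive, which the question does not say.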

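Python urllib, urllib2 and httplib: example code for fetching web pages

urllib2 is remarkably capable.
I tried using it to log in through a proxy and pull cookies, follow redirects, fetch images...
Documentation: http://docs.python.org/library/urllib2.html

Straight to the demo code.
It covers: direct fetching, using Request (POST/GET), using a proxy, cookies, and redirect handling.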

#!/usr/bin/python
# -*- coding:utf-8 -*-
# urllib2_test.py
# author: wklken
# 2012-03-17 wklken@yeah.net


import urllib, urllib2, cookielib, socket

url = "http://www.testurl....."  # change yourself


# Simplest approach
def use_urllib2():
    try:
        f = urllib2.urlopen(url, timeout=5).read()
    except urllib2.URLError as e:
        print e.reason
        return  # f is undefined if the request failed
    print len(f)


# Using Request
def get_request():
    # A default timeout can be set
    socket.setdefaulttimeout(5)
    # Parameters can be added [with no data the request is a GET;
    # passing data as in the commented-out line below makes it a POST]
    params = {"wd": "a", "b": "2"}
    # Request headers can be added so the server can identify the client
    i_headers = {"User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1) Gecko/20090624 Firefox/3.5",
                 "Accept": "text/plain"}
    # Use POST to send params to the server; raises an exception if unsupported
    #req = urllib2.Request(url, data=urllib.urlencode(params), headers=i_headers)
    req = urllib2.Request(url, headers=i_headers)

    # After creating the request, more headers can be added;
    # if a key is repeated, the later value wins
    #req.add_header('Accept', 'application/json')
    # The request method can be overridden
    #req.get_method = lambda: 'PUT'
    try:
        page = urllib2.urlopen(req)
        print len(page.read())
        # The GET equivalent
        #url_params = urllib.urlencode({"a": "1", "b": "2"})
        #final_url = url + "?" + url_params
        #print final_url
        #data = urllib2.urlopen(final_url).read()
        #print "Method:get ", len(data)
    except urllib2.HTTPError as e:
        print "Error Code:", e.code
    except urllib2.URLError as e:
        print "Error Reason:", e.reason


def use_proxy():
    enable_proxy = False
    proxy_handler = urllib2.ProxyHandler({"http": "http://proxyurlXXXX.com:8080"})
    null_proxy_handler = urllib2.ProxyHandler({})
    if enable_proxy:
        opener = urllib2.build_opener(proxy_handler, urllib2.HTTPHandler)
    else:
        opener = urllib2.build_opener(null_proxy_handler, urllib2.HTTPHandler)
    # This sets the global opener for urllib2
    urllib2.install_opener(opener)
    content = urllib2.urlopen(url).read()
    print "proxy len:", len(content)


# Cookie processor that returns the response for these error codes
# instead of raising an exception
class NoExceptionCookieProcesser(urllib2.HTTPCookieProcessor):
    def http_error_403(self, req, fp, code, msg, hdrs):
        return fp
    def http_error_400(self, req, fp, code, msg, hdrs):
        return fp
    def http_error_500(self, req, fp, code, msg, hdrs):
        return fp


def hand_cookie():
    cookie = cookielib.CookieJar()
    #cookie_handler = urllib2.HTTPCookieProcessor(cookie)
    # after adding the error-tolerant handler
    cookie_handler = NoExceptionCookieProcesser(cookie)
    opener = urllib2.build_opener(cookie_handler, urllib2.HTTPHandler)
    url_login = "https://www.yourwebsite/?login"
    params = {"username": "user", "password": "111111"}
    opener.open(url_login, urllib.urlencode(params))
    for item in cookie:
        print item.name, item.value
    #urllib2.install_opener(opener)
    #content = urllib2.urlopen(url).read()
    #print len(content)


# Get the final page URL after N redirects
def get_request_direct():
    import httplib
    httplib.HTTPConnection.debuglevel = 1
    request = urllib2.Request("http://www.google.com")
    request.add_header("Accept", "text/html,*/*")
    request.add_header("Connection", "Keep-Alive")
    opener = urllib2.build_opener()
    f = opener.open(request)
    print f.url
    print f.headers.dict
    print len(f.read())


if __name__ == "__main__":
    use_urllib2()
    get_request()
    get_request_direct()
    use_proxy()
    hand_cookie()
