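I am very new to coding and am learning as I go.
I am trying to create a Python script that fetches and prints all the headers for a list of URLs in a txt file.
It seems to be getting there, but I am stuck in an infinite loop on one of the URLs and I have no idea why. Also, for some reason "-h" or "--help" does not return usage(). Any help would be appreciated.
Here is what I have so far: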
#!/usr/bin/python
import pycurl
import cStringIO
import sys, getopt

buf = cStringIO.StringIO()
c = pycurl.Curl()

def usage():
    print "-h --help, -i --urlist, -o --proxy"
    sys.exit()

def main(argv):
    iurlist = None
    proxy = None
    try:
        opts, args = getopt.getopt(argv,"hi:o:t",["help", "iurlist=","proxy="])
        if not opts:
            print "No options supplied"
            print "Type -h for help"
            sys.exit()
    except getopt.GetoptError as err:
        print str(err)
        usage()
        sys.exit(2)

    for opt, arg in opts:
        if opt == ("-h", "--help"):
            usage()
            sys.exit()
        elif opt in ("-i", "--iurlist"):
            iurlist = arg
        elif opt in ("-o", "--proxy"):
            proxy = arg
        else:
            assert False, "Unhandeled option"

    with open(iurlist) as f:
        iurlist = f.readlines()
        print iurlist

    try:
        for i in iurlist:
            c.setopt(c.URL, i)
            c.setopt(c.PROXY, proxy)
            c.setopt(c.HEADER, 1)
            c.setopt(c.FOLLOWLOCATION, 1)
            c.setopt(c.MAXREDIRS, 30)
            c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0')
            c.setopt(c.TIMEOUT, 8)
            c.setopt(c.CONNECTTIMEOUT, 5)
            c.setopt(c.NOBODY, 1)
            c.setopt(c.PROXY, proxy)
            c.setopt(c.WRITEFUNCTION, buf.write)
            c.setopt(c.SSL_VERIFYPEER, 0)
            c.perform()
            print buf.getvalue()
            buf.close
    except pycurl.error, error:
        errno, errstr = error
        print 'An error has occurred: ', errstr

if __name__ == "__main__":
    main(sys.argv[1:])
Here is the latest code:
#!/usr/bin/python
import pycurl
import cStringIO
import sys, getopt

c = pycurl.Curl()

def usage():
    print "-h --help, -i --urlist, -o --proxy"
    print "Example Usage: cURLdect.py -i urlist.txt -o http://192.168.1.64:8080"
    sys.exit()

def main(argv):
    iurlist = None
    proxy = None
    try:
        opts, args = getopt.getopt(argv,"hi:o:t",["help", "iurlist=","proxy="])
        if not opts:
            print "No options supplied"
            print "Type -h for help"
            sys.exit()
    except getopt.GetoptError as err:
        print str(err)
        usage()
        sys.exit(2)

    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage()
            sys.exit()
        elif opt in ("-i", "--iurlist"):
            iurlist = arg
        elif opt in ("-o", "--proxy"):
            proxy = arg
        else:
            assert False, "Unhandeled option"

    with open(iurlist) as f:
        iurlist = f.readlines()
        print iurlist

    try:
        for i in iurlist:
            buf = cStringIO.StringIO()
            c.setopt(c.WRITEFUNCTION, buf.write)
            c.setopt(c.PROXY, proxy)
            c.setopt(c.HEADER, 1)
            c.setopt(c.FOLLOWLOCATION, 1)
            c.setopt(c.MAXREDIRS, 300)
            c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0')
            c.setopt(c.TIMEOUT, 8)
            c.setopt(c.CONNECTTIMEOUT, 5)
            c.setopt(c.NOBODY, 1)
            c.setopt(c.SSL_VERIFYPEER, 0)
            c.setopt(c.URL, i)
            c.perform()
            print buf.getvalue()
            buf.close()
    except pycurl.error, error:
        errno, errstr = error
        print 'An error has occurred: ', errstr

if __name__ == "__main__":
    main(sys.argv[1:])
Posted on 2014-01-13 09:26:08
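If you are learning, pycurl may not be the best choice. Its authors assume you are already familiar with the library. From http://pycurl.sourceforge.net/:
PycURL is targeted at the advanced developer - if you need dozens of concurrent, fast and reliable connections or any of the sophisticated features listed above, then PycURL is for you. The main drawback of PycURL is that it is a relatively thin layer over libcurl without any of those nice Pythonic class hierarchies. This means it has a somewhat steep learning curve unless you are already familiar with libcurl's C API.
This is how they do a multi-fetch: https://github.com/pycurl/pycurl/blob/master/examples/retriever-multi.py
To fetch the headers, install the requests library and simply do: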
import requests

for url in list_of_urls:
    r = requests.get(url)
    print r.headers
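The loop above takes list_of_urls as given. As a minimal sketch of one way to build it from the text file in the question, the helper below reads one URL per line and strips the trailing newline that readlines() leaves on each entry. The name load_urls is hypothetical and not part of the original answer, and the sketch uses Python 3 syntax, unlike the Python 2 snippets in this thread.

```python
def load_urls(path):
    """Return the non-empty lines of a text file with whitespace stripped."""
    with open(path) as f:
        # strip() drops the trailing "\n" that readlines() would keep,
        # and the filter skips blank lines in the file.
        return [line.strip() for line in f if line.strip()]
```

Under these assumptions, list_of_urls = load_urls("urlist.txt") would give the loop clean URLs to work with.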
To handle command-line arguments, use argparse, one of the batteries included with Python.
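As a rough sketch of what that could look like for the options in the question, the snippet below rebuilds -i/--iurlist and -o/--proxy with argparse. The function name parse_args, the help strings, and the choice to make --iurlist required are assumptions for illustration, not something stated in the thread. It is written in Python 3 syntax.

```python
import argparse


def parse_args(argv=None):
    """Parse the same options the getopt version in the question accepts."""
    parser = argparse.ArgumentParser(
        description="Print the response headers for each URL in a file.")
    # argparse adds -h/--help automatically, so no usage() function is needed.
    parser.add_argument("-i", "--iurlist", required=True,
                        help="text file with one URL per line")
    parser.add_argument("-o", "--proxy", default=None,
                        help="proxy URL, e.g. http://192.168.1.64:8080")
    return parser.parse_args(argv)


# Passing a list instead of reading sys.argv keeps the example self-contained.
args = parse_args(["-i", "urlist.txt", "-o", "http://192.168.1.64:8080"])
print(args.iurlist, args.proxy)
```

In a real script, calling parse_args() with no argument reads sys.argv[1:], and running the script with -h prints a generated help message and exits.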
Posted on 2014-01-13 07:41:08
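You are using
if opt == ("-h", "--help"):
for the help option, but
if opt in (...)
for all the other options. opt is either -h or --help, but never both, so you need to use in there as well to check whether opt is one of the two.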
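The difference can be seen in a small sketch that feeds -h through the same getopt call as the question. The variable names equals_tuple and in_tuple are made up for illustration, and the snippet uses Python 3 syntax.

```python
import getopt

# Same option specification as in the question, parsing just "-h".
opts, args = getopt.getopt(["-h"], "hi:o:t", ["help", "iurlist=", "proxy="])
opt, arg = opts[0]  # opt is the string "-h", arg is the empty string

# A string never equals a tuple, so this branch can never be taken.
equals_tuple = opt == ("-h", "--help")

# Membership test: true when opt matches either spelling.
in_tuple = opt in ("-h", "--help")

print(equals_tuple, in_tuple)  # False True
```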
https://stackoverflow.com/questions/21095055