Error: Can't connect to HTTPS URL because the SSL module is not available
This is an Anaconda environment issue.
Copy
libcrypto-1_1-x64.dll
libssl-1_1-x64.dll
from
D:\Anaconda\Library\bin
to
D:\Anaconda\DLLs
and the error goes away.
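A quick way to confirm the fix worked (a minimal check; nothing beyond the standard library and requests is assumed):
import ssl
import requests
# If the DLLs are found, the ssl module imports and HTTPS requests succeed.
print(ssl.OPENSSL_VERSION)
print(requests.get('https://pypi.org').status_code)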
References
https://blog.csdn.net/Sky_Tree_Delivery/article/details/109078288
https://github.com/conda/conda/issues/8273
Manually installing the requests module
Find the requests package archive under Anaconda3\pkgs.
Copy the
requests
requests-2.21.0.dist-info
directories from the archive's lib\site-packages\ into anaconda3\lib\site-packages.
If it still cannot be imported, add the directory to the module search path.
Check the current search path:
>>> import sys
>>> sys.path
['', '/usr/local/lib/python35.zip', '/usr/local/lib/python3.5', '/usr/local/lib/python3.5/plat-linux', '/usr/local/lib/python3.5/lib-dynload', '/usr/local/lib/python3.5/site-packages']
Add to the module search path:
set PYTHONPATH=c:\programdata\anaconda3\lib\site-packages
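Setting PYTHONPATH only takes effect for newly started interpreters; inside a running interpreter the path can also be appended at runtime (a minimal sketch, the directory is just an example):
import sys
# Append the site-packages directory so subsequent imports can find it.
sys.path.append(r'c:\programdata\anaconda3\lib\site-packages')
import requests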
Reference: Python module search path
https://blog.csdn.net/liang19890820/article/details/76219560
Simulating a curl request with the requests module
import requests
headers = {
'Content-type': 'application/json',
}
data = '{"text":"Hello, World!"}'
response = requests.post('https://hooks.slack.com/services/asdfasdfasdf', headers=headers, data=data)
References
https://stackoverflow.com/questions/25491090/how-to-use-python-to-execute-a-curl-command
The requests module in Python 3
https://www.cnblogs.com/wang-yc/p/5623711.html
requests error: https certificate verify failed
Either disable HTTPS certificate verification, or supply a certificate.
To disable verification:
requests.get('https://example.com', verify=False)
Disabling verification makes the console print an InsecureRequestWarning; to suppress it:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
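The other option mentioned above is to point requests at a certificate instead of disabling verification (a sketch; the bundle path is just a placeholder):
import requests
# verify accepts the path to a CA bundle file or a directory of certificates.
response = requests.get('https://example.com', verify='/path/to/ca_bundle.pem')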
References
How to solve SSLError in Requests?
https://www.jianshu.com/p/8deb13738d2c
Fixing InsecureRequestWarning output in the Python 3 console
https://www.cnblogs.com/helloworldcc/p/11107920.html
Handling JSON data with the json module
import json
js = json.loads(jsontext)             # parse a JSON string into a dict
field = js['field']                   # read a field
json.dump(dict1, out_file, indent=6)  # write a dict to an open file object; ensure_ascii controls escaping of non-ASCII text
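A self-contained round trip, assuming a dict that contains non-ASCII text (the file name is just an example):
import json
data = {'text': '你好', 'count': 1}
s = json.dumps(data, ensure_ascii=False, indent=2)    # dict -> JSON string
restored = json.loads(s)                              # JSON string -> dict
with open('out.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=2)  # dict -> file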
Enabling TCP keepalive for Python requests
Setting TCP keepalive in Python
Windows and Linux are configured differently.
Linux:
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)                 # enable keepalive
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, after_idle_sec)   # idle time before the first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_sec)    # interval between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, max_fails)         # failed probes before the connection is dropped
On Windows this raises AttributeError: 'module' object has no attribute 'TCP_KEEPIDLE'.
Windows has to set it through ioctl instead:
sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, 10000, 3000))  # (enable, idle time in ms, probe interval in ms)
The difference comes from the operating systems' inconsistent socket implementations.
Reference: [notes] How to keep a TCP heartbeat alive in Python
http://www.jyguagua.com/?p=3066
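A minimal cross-platform sketch combining the two branches above (the timing values are just examples):
import socket
import sys

def enable_keepalive(sock, after_idle_sec=10, interval_sec=3, max_fails=5):
    # Turn on TCP keepalive for an existing socket, with the Windows and Linux variants.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if sys.platform == 'win32':
        # Windows: ioctl takes (enable, idle time in ms, probe interval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, after_idle_sec * 1000, interval_sec * 1000))
    else:
        # Linux: the TCP_KEEP* options are set individually.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, after_idle_sec)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_sec)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, max_fails)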
Setting TCP keepalive for requests
Why set keepalive at all? The trigger was that some requests calls that take a long time to return fail with
Python: [WinError 10054] An existing connection was forcibly closed by the remote host
This means the remote side closed the connection. But the same long-running call returns fine in Chrome without errors, so something should be adjustable on the Python side.
Some searching suggested it is related to TCP keepalive: Chrome enables TCP keepalive by default and sends a probe every 45 seconds, while Python leaves TCP keepalive off by default.
Ways to set it
1 Work within the requests API
requests lets you supply a custom adapter that initializes its own PoolManager,
and socket_options can be passed in when the PoolManager is constructed. The approach is shown below.
https://requests.readthedocs.io/en/master/user/advanced/#transport-adapters
https://urllib3.readthedocs.io/en/latest/reference/urllib3.connection.html
# Reference: https://stackoverflow.com/questions/24569428/how-to-specify-socket-options-in-python-requests-lib-since-urllib3-v1-8-3-has
# How to specify "socket_options" in python-requests lib since urllib3 v1.8.3 has added the "socket_options" feature?
# The complete code is given there
import requests
import socket
from requests.adapters import HTTPAdapter
from urllib3.poolmanager import PoolManager
from urllib3.connection import HTTPConnection

class SockOpsAdapter(HTTPAdapter):
    def __init__(self, options, **kwargs):
        self.options = options
        super().__init__(**kwargs)

    def init_poolmanager(self, connections, maxsize, block=False, **pool_kwargs):
        print("init_poolmanager")
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       socket_options=self.options)

options = HTTPConnection.default_socket_options + [
    (socket.SOL_SOCKET, socket.SO_REUSEADDR, 1),
]

print("build session")
s = requests.Session()
s.mount('http://', SockOpsAdapter(options))
s.mount('https://', SockOpsAdapter(options))

for i in range(0, 10):
    print("sending request %i" % i)
    url = 'http://host:port'  # put in a host/port here
    headers = {'Content-Type': 'text/plain', 'Accept': 'text/plain'}
    post_status = s.get(url, headers=headers)
    print("Post Status Code = %s" % str(post_status.status_code))
    print(post_status.content[0:50])
2 Patch the urllib3 source directly
Because Windows has to go through ioctl, the approach above does not help; I could not find anywhere that urllib3 exposes an ioctl setting.
Reading the source shows that the socket_options passed in above are only applied via sock.setsockopt(*opt); there is no ioctl hook.
In the end I edited urllib3's util/connection.py and crudely added code, as follows.
https://github.com/urllib3/urllib3/blob/d0b20763f55536aec43caae9d180aa16c7b77d09/src/urllib3/util/connection.py
def _set_socket_options(sock, options):
    if options is None:
        return
    for opt in options:
        sock.setsockopt(*opt)
    # added here: enable TCP keepalive via ioctl
    sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, 10000, 3000))
For the default Anaconda3 environment, the file is at
c:\programdata\anaconda3\lib\site-packages\urllib3\util\connection.py
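A less intrusive variant of the same hack is to monkey-patch the function at runtime instead of editing the installed file (a sketch; it assumes the private helper _set_socket_options still exists under that name and that the code runs on Windows):
import socket
import urllib3.util.connection as urllib3_connection

_original_set_socket_options = urllib3_connection._set_socket_options

def _patched_set_socket_options(sock, options):
    # Apply the normal socket options first, then force keepalive via ioctl.
    _original_set_socket_options(sock, options)
    sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, 10000, 3000))

urllib3_connection._set_socket_options = _patched_set_socket_options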
3 Get the socket object from the request's fileno
https://stackoverflow.com/questions/32310951/how-to-get-the-underlying-socket-when-using-python-requests
For streamed connections (opened with stream=True), you can call .raw.fileno() on the response object to get the open file descriptor, then build a Python socket object from it with socket.fromfd(...).
>>> import requests
>>> import socket
>>> r = requests.get('http://google.com/', stream=True)
>>> s = socket.fromfd(r.raw.fileno(), socket.AF_INET, socket.SOCK_STREAM)
>>> s.getpeername()
('74.125.226.49', 80)
>>> s.getsockname()
('192.168.1.60', 41323)
The comments there also say the socket can be obtained from a response hook callback, which would avoid the stream-only restriction.
But searching the docs, hooks can only be registered for the response event, so this feature did not end up being useful to me.
For how to use hooks, see the article below; the official documentation is thin, and you end up reading the source anyway.
Using hooks for custom behaviour in requests
https://alexwlchan.net/2017/10/requests-hooks/
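For reference, registering a response hook looks roughly like this (a minimal sketch; the callback just logs, it does not dig out the socket):
import requests

def log_response(response, *args, **kwargs):
    # Hook callbacks receive the Response plus extra positional/keyword arguments.
    print(response.url, response.status_code, response.elapsed)

s = requests.Session()
s.hooks['response'].append(log_response)  # runs for every request made through this session
r = s.get('https://example.com/')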
Note: SO_KEEPALIVE is not the same thing as HTTP Keep-Alive.
References
I wrongly blamed urllib3; don't read too much into the name
https://steemit.com/python/@oflyhigh/urllib3
Other
A summary of Python Requests tips
https://blog.csdn.net/xie_0723/article/details/52790786
References
Exploring TCP keepalive (2): the browser's keepalive mechanism
https://blog.chionlab.moe/2016/11/07/tcp-keepalive-on-chrome/
How Chrome keeps TCP connections alive
The packet capture above shows that 45 seconds after the server returned the complete HTTP 200 response (Time=72), the local side sent its first TCP keepalive probe and received an ACK from the server.
This shows that for reusable TCP connections, Chrome relies on the keepalive mechanism built into the TCP (transport) layer, implemented with TCP keepalive probes, rather than defining its own layer-7 protocol messages to carry extra data.
TCP keepalive
https://zhuanlan.zhihu.com/p/82035839