Protocol Buffers in Python: UnicodeDecodeError

运维开发网 https://www.qedev.com 2020-06-21 19:42 Source: the web. Author: compiled by 运维开发网
I need to receive Protocol Buffers messages on my Python Tornado server and extract the content from the binary message.

postContent = self.request.body
message = prototemp.ReqMessage()
message.ParseFromString(postContent)

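This works perfectly with a test tool. But when I run it in a sandbox environment and simulate 1000 requests from my client, it works in some cases, while most requests raise this exception: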

File "server1.py", line 21, in post
    message.ParseFromString(postContent)
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/message.py", line 179, in ParseFromString
    self.MergeFromString(serialized)
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/internal/python_message.py", line 755, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/internal/python_message.py", line 782, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/internal/decoder.py", line 544, in DecodeField
    if value._InternalParse(buffer, pos, new_pos) != new_pos:
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/internal/python_message.py", line 782, in InternalParse
    pos = field_decoder(buffer, new_pos, end, self, field_dict)
  File "/usr/lib/python2.6/site-packages/protobuf-2.4.1-py2.6.egg/google/protobuf/internal/decoder.py", line 410, in DecodeField
    field_dict[key] = local_unicode(buffer[pos:new_pos], 'utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xce in position 1: invalid continuation byte

In some other cases it raises these errors:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 3: invalid start byte

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 3: unexpected end of data

What could be the cause?
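
For reference, the three messages above are the standard ways a UTF-8 decode can fail, and the traceback shows the decode happening on a string field (decoder.py, line 410). The snippet below is only a minimal sketch that reproduces each message on hand-picked bytes. The sample byte strings and the helper name utf8_failure are made up for illustration and are not taken from the actual request bodies.

```python
# Byte strings chosen only to reproduce the three failure modes.
samples = [
    b"A\xceA",   # 0xce opens a 2-byte sequence, but the next byte is not a continuation byte
    b"abc\xbf",  # 0xbf is a continuation byte and can never start a sequence
    b"abc\xe7",  # 0xe7 opens a 3-byte sequence, but the data ends right there
]


def utf8_failure(data):
    """Return (position, reason) of the first UTF-8 decoding error, or None if data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


for sample in samples:
    print(utf8_failure(sample))
```

Seeing any of these on a string field suggests that the bytes reaching the parser are not the bytes the client serialized, or that the field never held valid UTF-8 in the first place.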

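I had exactly the same problem with RabbitMQ and Protocol Buffers. The problem is that Protocol Buffers assumes its input is of type str, while RabbitMQ seems, in some cases, to decode the message into unicode (when the byte array contains bytes greater than 127). The same may be happening with Tornado. So far, the following code seems to solve the problem: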

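body = self.request.body
# Python 2: if the body arrived as a unicode object, turn it back into a byte string.
if isinstance(body, unicode):
    body = body.encode("utf-8")
# "whatever" is a placeholder for your generated message class.
message = whatever.FromString(body)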

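This code converts the unicode string into a Python bytes object, which the Protocol Buffers message can then parse happily. I don't know if there is a better way to do this, but at least this seems to work.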
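
The same idea can be wrapped in a small helper so that the check lives in one place. This is only a sketch: ensure_bytes is a hypothetical name, not part of Tornado or Protocol Buffers, and it assumes that UTF-8 is the right encoding for a body that arrived as text, which is the same assumption the code above makes.

```python
def ensure_bytes(body, encoding="utf-8"):
    """Return body as a byte string, encoding it first if it is a text string."""
    if isinstance(body, bytes):
        # Already raw bytes (str on Python 2, bytes on Python 3): pass through untouched.
        return body
    # Text string (unicode on Python 2, str on Python 3): encode it back to bytes.
    return body.encode(encoding)


raw = ensure_bytes(u"\u03b1")   # text input is encoded
same = ensure_bytes(b"\x08\x01")  # byte input is returned unchanged
```

In the handler, the result of ensure_bytes(self.request.body) would be passed to ParseFromString. If the text was produced with an encoding other than UTF-8, the original bytes cannot be recovered this way, so it is worth checking where the body gets decoded in the first place.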
