pybind11 tutorial notes 4.1: data type conversion (strings)

1. Strings, bytes and Unicode conversions

Passing Python strings to C++

When a Python str is passed to a C++ function whose parameter is std::string or char *, pybind11 automatically encodes the Python string as UTF-8. Every Python str can be encoded as UTF-8, so this conversion normally succeeds.
The C++ language is encoding agnostic. It is the responsibility of the programmer to track encodings. It’s often easiest to simply use UTF-8 everywhere.

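// C++ bindings
#include <pybind11/pybind11.h>
#include <iostream>
#include <string>

PYBIND11_MODULE(py_string_to_cpp, m){
    // std::string parameter: receives the UTF-8 encoded bytes of the str
    m.def("utf8_test", [](const std::string &s ){
        std::cout<<"utf-8 is icing on cake!!";
        std::cout<< s << std::endl;
    });
    
    // const char* parameter: same UTF-8 data, null-terminated
    m.def("utf8_charptr", [](const char* s){
        std::cout<<"my favorite food is "<< s <<std::endl;
    });
}

# Python test
from py_string_to_cpp import utf8_test, utf8_charptr

s = "cake noodles"
utf8_test(s)
utf8_charptr(s)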

The test result is the same whether the C++ parameter is taken by value or by reference, and with or without const.
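Because the conversion targets UTF-8, a std::string parameter holds bytes rather than code points, so its size() can be larger than the Python len() of the original str. The following is a minimal Python-side sketch of that difference using only str.encode; the variable names and sample strings are illustrative and not part of the module above.

```python
# Python-side sketch: the UTF-8 bytes that a std::string parameter would hold.
ascii_text = "cake noodles"
mixed_text = "蛋糕 cake"  # two CJK characters, a space, four ASCII letters

ascii_bytes = ascii_text.encode("utf-8")
mixed_bytes = mixed_text.encode("utf-8")

# Pure ASCII: one byte per character.
print(len(ascii_text), len(ascii_bytes))
# Non-ASCII: each of these CJK characters takes three bytes in UTF-8.
print(len(mixed_text), len(mixed_bytes))
```

Any C++ code that indexes or truncates such a string works on bytes, so it should take care not to cut a multi-byte character in half.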

Passing bytes to C++

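A Python bytes object passed to a C++ function whose parameter is std::string or char* is handed over as-is, with no conversion.
To make a function accept only bytes (and not str) in Python 3, declare the parameter as py::bytes in C++.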
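From the caller's point of view, a py::bytes parameter means the str must be encoded explicitly before the call. The sketch below uses a hypothetical pure-Python stand-in, accept_bytes_only, to mimic that type check; it is not a pybind11 function, and the error message text is made up for illustration.

```python
def accept_bytes_only(data):
    """Hypothetical stand-in for a C++ function whose parameter is py::bytes."""
    if not isinstance(data, bytes):
        raise TypeError("expected bytes, got " + type(data).__name__)
    return len(data)

text = "cake noodles"
payload = text.encode("utf-8")  # explicit str -> bytes conversion by the caller

n = accept_bytes_only(payload)

try:
    accept_bytes_only(text)  # a plain str is rejected
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)
```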

2. Returning C++ strings to Python

When a C++ function returns std::string or char* to Python, pybind11 assumes the string is valid UTF-8 and decodes it into a Python str (using the same API as Python uses to perform bytes.decode('utf-8')). If decoding fails, pybind11 raises a UnicodeDecodeError.

    // C++ bindings (inside PYBIND11_MODULE)
    m.def("std_string_return", [](){
        return std::string("this std::string needs to be UTF-8 encoded!");
    });
    
    m.def("char_ptr_return", [](){
        // string literals are const in C++, so the pointer must be const char*
        const char * s = "this string needs to be UTF-8 encoded!";
        return s;
    });

# Python test
from py_string_to_cpp import std_string_return, char_ptr_return

print(std_string_return())
print(char_ptr_return())

print(isinstance(std_string_return(), str))
print(isinstance(char_ptr_return(), str))

Output:
this std::string needs to be UTF-8 encoded!
this string needs to be UTF-8 encoded!
True
True

Because UTF-8 is inclusive of pure ASCII, there is never any issue with returning a pure ASCII string to Python. If there is any possibility that the string is not pure ASCII, it is necessary to ensure the encoding is valid UTF-8.
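The decoding step can be reproduced in pure Python with bytes.decode('utf-8'). The sketch below assumes a C++ string that happens to hold Latin-1 data; the sample bytes are invented for illustration and do not come from the module above.

```python
# Bytes as a C++ function might return them in a std::string.
valid_utf8 = "café".encode("utf-8")      # b'caf\xc3\xa9'
latin1_data = "café".encode("latin-1")   # b'caf\xe9', not valid UTF-8

decoded = valid_utf8.decode("utf-8")

try:
    latin1_data.decode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print("decode failed:", exc.reason)
```

In a case like the second one, the C++ side would need to transcode to UTF-8 before returning, or return py::bytes and leave the decoding to the Python caller.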

Wide character strings

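When a Python str is passed to a C++ function whose parameter is std::wstring, wchar_t*, std::u16string or std::u32string, the str is encoded as UTF-16 or UTF-32, depending on how the C++ compiler implements each type. When strings of these types are returned from C++ to Python, they are assumed to contain valid UTF-16 or UTF-32 and are decoded into a Python str.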

#define UNICODE
#include <windows.h>

m.def("set_window_text",
    [](HWND hwnd, std::wstring s) {
        // Call SetWindowText with null-terminated UTF-16 string
        ::SetWindowText(hwnd, s.c_str());
    }
);
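The difference between the two wide encodings shows up for characters outside the Basic Multilingual Plane. Here is a Python-side sketch of the code-unit counts; it uses the little-endian codecs only so that no byte-order mark is included in the count, and the sample string is illustrative.

```python
text = "a😀"  # one ASCII letter plus one emoji (U+1F600)

# Number of 16-bit units a UTF-16 string type would hold.
utf16_units = len(text.encode("utf-16-le")) // 2
# Number of 32-bit units a UTF-32 string type would hold.
utf32_units = len(text.encode("utf-32-le")) // 4

print(len(text), utf16_units, utf32_units)
```

The emoji needs a surrogate pair in UTF-16, so a std::wstring on a platform with a 16-bit wchar_t (such as Windows) holds one more element than the Python len() suggests, while a 32-bit wchar_t holds one element per code point.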

Character literals

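A C++ function whose parameter is char or wchar_t accepts a Python str containing a single character. The pybind11 documentation says that the first character is used and trailing characters are ignored, but in the test below a multi-character str is rejected with the error "Expected a character, but multi-character string found".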

When C++ returns a character literal, it is converted to a Python str containing exactly one character.

    m.def("pass_char", [](char c){
        return c;
    });

    m.def("pass_wchar", [](wchar_t wc){
        return wc;
    });

# Python test
from py_string_to_cpp import pass_char, pass_wchar

try:
    print(pass_char("abcde"))
    print(pass_wchar("abcde"))
except Exception as e:
    print(e)
else:
    print("pass_char can accept multiple characters")
finally:
    print(pass_char("a"))
    print(pass_wchar("a"))

Output:
Expected a character, but multi-character string found
a
a

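Copyright notice: this is an original article by weixin_41521681, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.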