Converting the data type of a column in a .txt file with a DataFrame

Example 1: converting to int

import pandas as pd

# Remember to change path to your own file
path = '/Mydata/test0.txt'
print(path)
df = pd.read_csv(path, sep=' ', header=None)
# Assign column names
new_co = ['new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6', 'new_7']
df.columns = new_co
print(df.columns)
# Convert the last column to int
df['new_7'] = df['new_7'].astype(int)
# Save the data without the index or column names
df.to_csv('/MyData1/test0.txt', header=False, index=False)

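The saved file is comma-separated, because to_csv uses ',' as its default separator. Reading it back with sep=' ' therefore gives each row as a single string, and the columns can no longer be told apart. Here I took a crude approach and replaced the commas between columns with spaces. The code is as follows: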

# Replace every comma with a space
# (adjust the paths: test0.txt is the file saved in the previous step)
old_signal = ","
new_signal = " "
with open(r'test0.txt', 'r', encoding='UTF-8') as file:
    data = file.read()
    data = data.replace(old_signal, new_signal)

with open(r'test.txt', 'w', encoding='UTF-8') as file:
    file.write(data)
    print("REPLACE!")
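As an alternative to the replace step, pandas can write space-separated output directly through the sep parameter of to_csv. The following is only a minimal sketch: the sample rows are made up to stand in for test0.txt, and io.StringIO is used so the example runs without any files on disk.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for test0.txt (two rows, seven columns).
raw = "1.0 2.0 3.0 4.0 5.0 6.0 7.0\n1.5 2.5 3.5 4.5 5.5 6.5 8.0\n"
df = pd.read_csv(io.StringIO(raw), sep=' ', header=None)
df.columns = ['new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6', 'new_7']
df['new_7'] = df['new_7'].astype(int)

# With no path argument, to_csv returns the text instead of writing a file.
# The default separator is a comma.
comma_text = df.to_csv(header=False, index=False)

# Passing sep=' ' writes space-separated output directly.
space_text = df.to_csv(sep=' ', header=False, index=False)

# Reading the space-separated text back gives seven separate columns again.
df_back = pd.read_csv(io.StringIO(space_text), sep=' ', header=None)
print(comma_text)
print(space_text)
print(df_back.dtypes)
```

With a real file, the same sep=' ' argument can be passed together with the output path, which would make the separate comma-replacement pass unnecessary.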

Example 2: keeping three decimal places

import pandas as pd

# Remember to change path to your own file
path = '/MyData/test0.txt'
print(path)
df = pd.read_csv(path, sep=' ', header=None)
new_co = ['new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6', 'new_7']
df.columns = new_co
# Keep three decimal places in the first three columns
# (format() returns strings such as '1.235')
df['new_1'] = df['new_1'].apply(lambda x: format(x, '.3f'))
df['new_2'] = df['new_2'].apply(lambda x: format(x, '.3f'))
df['new_3'] = df['new_3'].apply(lambda x: format(x, '.3f'))
# Save the data without the index or column names
df.to_csv('/MyData1/test0.txt', header=False, index=False)
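Because format() turns each value into a string, the columns stop being numeric inside the DataFrame. The sketch below compares that with two other options that I believe behave as commented: Series.round, which keeps the values as floats, and the float_format parameter of to_csv, which pads the decimals only at write time. The sample values are made up, and note that float_format applies to every float column, not just the first three.

```python
import io
import pandas as pd

# Hypothetical sample data with three float columns.
raw = "1.23456 2.0 3.14159\n4.5 5.55555 6.0\n"
df = pd.read_csv(io.StringIO(raw), sep=' ', header=None)
df.columns = ['new_1', 'new_2', 'new_3']

# format() returns strings, so the result is no longer a float column.
as_text = df['new_1'].apply(lambda x: format(x, '.3f'))

# round() keeps the column numeric (float64) but does not pad trailing zeros.
as_float = df['new_1'].round(3)

# float_format pads to three decimals when writing; the DataFrame stays numeric.
out_text = df.to_csv(sep=' ', float_format='%.3f', header=False, index=False)
print(as_text.tolist())
print(as_float.dtype)
print(out_text)
```

Which option fits depends on whether the fixed three-digit padding is needed in the file or only rounded values in memory.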

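The saved file has the same problem (it is comma-separated, so each row reads back as a single string), so the commas are again replaced with spaces: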

# Replace every comma with a space
# (adjust the paths: test0.txt is the file saved in the previous step)
old_signal = ","
new_signal = " "
with open(r'test0.txt', 'r', encoding='UTF-8') as file:
    data = file.read()
    data = data.replace(old_signal, new_signal)

with open(r'test.txt', 'w', encoding='UTF-8') as file:
    file.write(data)
    print("REPLACE!")

The newly saved test.txt now has the converted data type in the target columns, and specific columns can be selected by index.
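To check the result, the file can be read back and a column taken by its position. This is a small sketch under the assumption that test.txt looks like the made-up text below (space-separated, last column already int); io.StringIO stands in for the real file.

```python
import io
import pandas as pd

# Hypothetical contents of the rewritten test.txt.
text = "1.000 2.000 3.000 4.0 5.0 6.0 7\n1.500 2.500 3.500 4.5 5.5 6.5 8\n"
df = pd.read_csv(io.StringIO(text), sep=' ', header=None)

# With header=None the columns are labelled 0 to 6,
# so a column can be selected by its index.
last_col = df[6]
first_col = df.iloc[:, 0]

print(df.dtypes)
print(last_col.tolist())
print(first_col.tolist())
```

Here df[6] selects by column label and df.iloc[:, 0] selects by position; with header=None the two numbering schemes coincide.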


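Copyright notice: This is an original article by LiuXu11111, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.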