Speeding Up Operations in Python Pandas

1. Cell-by-cell assignment vs. whole-row assignment

(1) Looping over the column names: took 0.75 s

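    import time

    import pandas as pd

    df = pd.DataFrame(columns=['A', 'B', 'C', 'D', 'E'])
    start = time.time()
    for i in range(1000):
        num = i
        # Fill row i one cell at a time, walking the column names
        for col in df.columns:
            df.loc[i, col] = num
            num += 1
    end = time.time()
    print(end - start)  # elapsed seconds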

(2) Writing out each column name by hand and assigning one cell at a time: took 0.83 s

    df = pd.DataFrame(columns=['A', 'B', 'C', 'D', 'E'])
    start = time.time()
    for i in range(1000):
        # One .loc call per cell, with the column names spelled out
        df.loc[i, 'A'] = i
        df.loc[i, 'B'] = i + 1
        df.loc[i, 'C'] = i + 2
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    end = time.time()
    print(end - start)  # elapsed seconds

(3) Assigning the whole row in one statement: took 0.84 s

    df = pd.DataFrame(columns=['A', 'B', 'C', 'D', 'E'])
    start = time.time()
    for i in range(1000):
        # A single .loc call writes all five cells of row i
        df.loc[i, ['A', 'B', 'C', 'D', 'E']] = [i, i + 1, i + 2, i + 3, i + 4]
    end = time.time()
    print(end - start)  # elapsed seconds
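The three variants above can be run side by side with a small timing harness. The following is only a sketch: the function names, the `timed` helper and the `N_ROWS` value are made up for illustration and are not from the original measurements, and `time.perf_counter` is swapped in for `time.time` because it is intended for measuring short intervals. The timings it prints will vary with the machine and the pandas version, so they should not be expected to match the figures quoted above.

```python
import time

import pandas as pd

COLUMNS = ['A', 'B', 'C', 'D', 'E']
N_ROWS = 200  # assumption: fewer rows than the 1000 used above, to keep the run short


def fill_by_column_loop(n):
    """Variant (1): loop over the column names for every row."""
    df = pd.DataFrame(columns=COLUMNS)
    for i in range(n):
        for offset, col in enumerate(df.columns):
            df.loc[i, col] = i + offset
    return df


def fill_by_named_cells(n):
    """Variant (2): one .loc call per cell, column names written by hand."""
    df = pd.DataFrame(columns=COLUMNS)
    for i in range(n):
        df.loc[i, 'A'] = i
        df.loc[i, 'B'] = i + 1
        df.loc[i, 'C'] = i + 2
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    return df


def fill_by_row(n):
    """Variant (3): one .loc call per row."""
    df = pd.DataFrame(columns=COLUMNS)
    for i in range(n):
        df.loc[i, COLUMNS] = [i, i + 1, i + 2, i + 3, i + 4]
    return df


def timed(func, n):
    """Run func(n) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


results = {}
for func in (fill_by_column_loop, fill_by_named_cells, fill_by_row):
    frame, elapsed = timed(func, N_ROWS)
    results[func.__name__] = frame
    print(f"{func.__name__}: {elapsed:.3f} s")
```

Keeping the resulting frames in `results` makes it possible to confirm that all three variants build the same table, so that only the running time differs.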

2. Using replace to fill in missing data

(1) Assigning every cell in turn, with A, B and C set to the fill value: took 0.75 s

    df = pd.DataFrame(columns=['A', 'B', 'C', 'D', 'E'])
    start = time.time()
    for i in range(1000):
        # A, B and C get the fill value 0, written explicitly for every row
        df.loc[i, 'A'] = 0
        df.loc[i, 'B'] = 0
        df.loc[i, 'C'] = 0
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    end = time.time()
    print(end - start)  # elapsed seconds

(2) Filling with replace, with A, B and C taking the fill value: took 0.55 s

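    import numpy as np

    df = pd.DataFrame(columns=['A', 'B', 'C', 'D', 'E'])
    start = time.time()
    for i in range(1000):
        # Only D and E are written; A, B and C are left as NaN
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    # Fill every remaining NaN with 0 in a single call
    df.replace(np.nan, 0, inplace=True)
    end = time.time()
    print(end - start)  # elapsed seconds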
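The comparison above only makes sense if both approaches end up with the same table. A minimal sketch of that check follows; the helper names and the `N_ROWS` value are made up for illustration. It also tries `DataFrame.fillna(0)`, which is not used in the original but is another pandas call for replacing NaN with a fixed value.

```python
import numpy as np
import pandas as pd

COLUMNS = ['A', 'B', 'C', 'D', 'E']
N_ROWS = 100  # assumption: a small row count, enough to compare the results


def explicit_fill(n):
    """Write the fill value 0 into A, B and C for every row."""
    df = pd.DataFrame(columns=COLUMNS)
    for i in range(n):
        df.loc[i, 'A'] = 0
        df.loc[i, 'B'] = 0
        df.loc[i, 'C'] = 0
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    return df


def write_d_and_e(n):
    """Write only D and E; A, B and C stay NaN."""
    df = pd.DataFrame(columns=COLUMNS)
    for i in range(n):
        df.loc[i, 'D'] = i + 3
        df.loc[i, 'E'] = i + 4
    return df


explicit = explicit_fill(N_ROWS)

# Same call as in the section above: swap NaN for 0 in place
via_replace = write_d_and_e(N_ROWS)
via_replace.replace(np.nan, 0, inplace=True)

# Alternative not used in the original: fillna returns a new frame
via_fillna = write_d_and_e(N_ROWS).fillna(0)

# Cast to int before comparing so that dtype differences do not matter
same_replace = explicit.astype(int).equals(via_replace.astype(int))
same_fillna = explicit.astype(int).equals(via_fillna.astype(int))
print(same_replace, same_fillna)
```

The cast to `int` is there because the columns may keep the `object` dtype after cell-by-cell assignment, depending on the pandas version.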


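Copyright notice: This is an original article by weixin_39405468, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.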