Python之pandas的简单使用

提示:文章写完后,目录可以自动生成,如何生成可参考右边的帮助文档


前言

前天在给用户通过SQL操作数据时,由于数据过多超过10万以上的数据,在DB直接弄出来挺费劲,发现Python里这pandas工具做数据还是挺管用的,几十万的数据Python轻松搞定,学习了下pandas使用


提示:以下是本篇文章正文内容,下面案例可供参考

一、pandas是什么?

示例:pandas 是基于NumPy 的一种工具,该工具是为了解决数据分析任务而创建的。Pandas 纳入了大量库和一些标准的数据模型,提供了高效地操作大型数据集所需的工具。pandas提供了大量能使我们快速便捷地处理数据的函数和方法。

二、使用步骤

1.开发工具

操作系统:Windows 10
开发软件:spyder 3
Python版本:3.6

2.详细代码

代码如下(示例):

# -*- coding: utf-8 -*-

import datetime
import numpy as np
import sqlalchemy
import pandas as pd

#db_engine = sqlalchemy.create_engine('oracle+cx_oracle://mesai:ai#525@WBOEE_UAT',echo=True) # oracle
db_engine = sqlalchemy.create_engine('oracle+cx_oracle://mesai:ai#525@wboee.world',echo=True) 


try:
    # SQL 查询语句
   
    sql = "SELECT * FROM  PDD_ACTIVE_DEVICE WHERE ACT_FLAG='Y' AND DEVICE_TYPE='PRD' AND STATUS='Released' AND IDENTIFY_STATE='Active' AND create_time>'20201230000000'"
    df_ai_pdd = pd.read_sql_query(sql, db_engine)
    
    sql = "SELECT * FROM mesai.pdd_active_wirebond"
    df_ai_wirebond = pd.read_sql_query(sql, db_engine)
  
    df_ai = pd.merge(df_ai_pdd,df_ai_wirebond,left_on='sysid', right_on='deviceid',how='left')
    
    col_name=df_ai.columns.tolist() 
    
    col_name.insert(1,"sub_con")
    df_ai=df_ai.reindex(columns=col_name) 
    df_ai['sub_con']=""
    
    df_ai["tpn"]=""
   
    df_ai["bonding"] = np.select([df_ai['cust_code'] == "GDS",
                    df_ai['cust_code'] == "SLW"],
                    [df_ai['s_scc_dwg_no'], df_ai['s_scc_dwg_no']+df_ai['s_scc_dwg_rev']],
                    default = 'XXX')
    df_ai["wiretype"]=""
   
    df_ai["wiretype"] = np.where(df_ai["wire_type"].isnull(),"", df_ai["wire_type"])
   
    df_ai["wiretype"] =  np.select([df_ai['wiretype'] == "AG WIRE",df_ai['wiretype'] == "GOLD WIRE",
                    df_ai['wiretype'] == "COPPER WIRE"],
                    ["A","G","C"],
                    default = '')
    
    df_ai["pin1方向"]="对缺角"
    
    df_ai["pin1"]=""
    df_ai["pin1"]=np.where(df_ai["cust_code"]=="SLW","NA", "")
    df_ai["str"]=""
    df_ai["m_q"]=""
    
    data = df_ai[(df_ai["cust_code"].isin(['GDS','SLW']))]
    
  
    
    data_01 = data[data["cust_code"]=="GDS"]
    data_01 = data_01[['sub_con','s_cust_device_desc','cust_device','cust_code','s_scc_dwg_no', 's_scc_dwg_rev','s_marking_spec','tpn','bonding','pin1方向','wiretype','pin1','str','m_q']].rename(columns={'s_cust_device_desc':'cpn','cust_device':'ipn','s_marking_spec':'marking'})
    #CUST_DEVICE NOT LIKE 'G%'
    data_02 = data[(data["cust_code"]=="SLW")&(~df_ai_pdd["cust_device"].str.startswith("G"))]
    data_02 = data_02[['sub_con','s_cust_device_desc','cust_device','cust_code','s_scc_dwg_no', 's_scc_dwg_rev','s_marking_spec','tpn','bonding','pin1方向','wiretype','pin1','str','m_q']].rename(columns={'s_cust_device_desc':'cpn','cust_device':'ipn','s_marking_spec':'marking'})
    
    data_new = pd.concat([data_01,data_02]) 
    
    print(data_new.head(50))
    
    print(data_new.shape)
    
    data_new.to_excel("E:\\WebFiles\\{0}.xlsx".format(datetime.datetime.now().strftime('%Y%m%d%H%M%S')),index=False)
    print("执行成功!!")
except Exception as e:
    print('exception!!!' + e)
finally:
    #db_conn.close()
    db_engine.dispose()

在这里插入图片描述


总结

记录点点滴滴


版权声明:本文为panda_225400原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接和本声明。