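Today I ran into a problem that required fetching rows from a pandas DataFrame in bulk by index, so here are some notes.
Master table: records.csv, which holds all the data records.
Master table contents: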
| date | data |
|---|---|
| 2014/3/1 | 0.947014982 |
| 2014/6/1 | 0.746103818 |
| 2014/9/1 | 0.736764841 |
| 2014/12/1 | 0.724937624 |
| 2015/3/1 | 0.85043738 |
| 2015/6/1 | 0.332503212 |
| 2015/9/1 | 0.75289366 |
| 2015/12/1 | 0.358275104 |
| 2016/3/1 | 0.077250716 |
| 2016/6/1 | 0.436182277 |
| 2016/9/1 | 0.424714671 |
| 2016/12/1 | 0.842471104 |
| 2017/3/1 | 0.740035625 |
| 2017/6/1 | 0.183588529 |
| 2017/9/1 | 0.143363207 |
Index of the sub table (sub.csv):
| date |
|---|
| 2016/12/1 |
| 2016/12/1 |
| 2014/3/1 |
| 2014/9/1 |
| 2014/3/1 |
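The goal is to pull rows from the master table using the index of the sub table. Duplicate labels in the sub table should produce duplicate rows in the result, in the same order.
The code: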
In [47]: import pandas as pd
    ...: df = pd.read_csv("records.csv", index_col='date')   # master table, indexed by date
    ...: df1 = pd.read_csv("sub.csv", index_col='date')      # sub table, only its index is used
    ...:

In [48]: df.loc[df1.index]
Out[48]:
               data
date
2016/12/1  0.842471
2016/12/1  0.842471
2014/3/1   0.947015
2014/9/1   0.736765
2014/3/1   0.947015
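One caveat worth sketching: in recent pandas versions, `df.loc[...]` raises a `KeyError` if any requested label is missing from the master table. The snippet below is a minimal sketch, not part of the original session. It builds a small subset of the tables in memory instead of reading the CSV files, and the label `2015/1/1` is a hypothetical missing date added only for illustration. It shows two ways to handle that case: filtering the requested labels with `Index.isin`, or using `reindex` to keep the missing label as a NaN row.

```python
import pandas as pd

# Small in-memory stand-in for records.csv (a subset of the rows above).
df = pd.DataFrame(
    {"data": [0.947014982, 0.746103818, 0.736764841, 0.842471104]},
    index=pd.Index(["2014/3/1", "2014/6/1", "2014/9/1", "2016/12/1"], name="date"),
)

# Stand-in for the sub table index; "2015/1/1" is a hypothetical label
# that does not exist in the master table.
wanted = pd.Index(["2016/12/1", "2016/12/1", "2014/3/1", "2015/1/1"], name="date")

# Option 1: keep only the labels that exist, preserving order and duplicates.
present = wanted[wanted.isin(df.index)]
picked = df.loc[present]

# Option 2: keep every requested label; missing ones become NaN rows.
# This requires the master table index itself to be unique.
padded = df.reindex(wanted)

print(picked)
print(padded)
```

Which option fits depends on whether a missing label should be dropped silently or stay visible as a gap in the result.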
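To add a marker column for the matched rows, assign to a new column through the same index. Rows that are not matched are left as NaN: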
In [49]: df.loc[df1.index, 'ST'] = "st"

In [50]: df
Out[50]:
               data   ST
date
2014/3/1   0.947015   st
2014/6/1   0.746104  NaN
2014/9/1   0.736765   st
2014/12/1  0.724938  NaN
2015/3/1   0.850437  NaN
2015/6/1   0.332503  NaN
2015/9/1   0.752894  NaN
2015/12/1  0.358275  NaN
2016/3/1   0.077251  NaN
2016/6/1   0.436182  NaN
2016/9/1   0.424715  NaN
2016/12/1  0.842471   st
2017/3/1   0.740036  NaN
2017/6/1   0.183589  NaN
2017/9/1   0.143363  NaN
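If a plain yes/no flag is enough, a boolean column may be easier to work with than a string-or-NaN column. The following is a minimal sketch under that assumption, again using a small in-memory subset of the tables; the column name `matched` is made up for this example and does not appear in the original code.

```python
import pandas as pd

# Small in-memory stand-in for records.csv (a subset of the rows above).
df = pd.DataFrame(
    {"data": [0.947014982, 0.746103818, 0.736764841, 0.842471104]},
    index=pd.Index(["2014/3/1", "2014/6/1", "2014/9/1", "2016/12/1"], name="date"),
)

# The sub table index from the article, duplicates included.
wanted = pd.Index(
    ["2016/12/1", "2016/12/1", "2014/3/1", "2014/9/1", "2014/3/1"], name="date"
)

# True where the row's date appears in the sub table, False otherwise.
# Duplicates in `wanted` do not matter here, and no NaN is introduced.
df["matched"] = df.index.isin(wanted)

# The flag can then be used directly as a row filter.
flagged = df[df["matched"]]

print(df)
print(flagged)
```

Unlike `df.loc[df1.index]`, this returns each matched row once, because it tests membership rather than looking up every requested label.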
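Copyright notice: this is an original article by leenuxcore, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.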