
DataFrame replace not working with encoding='ISO-8859-1', Python 3.6

Stack Overflow user
Asked 2019-01-18 14:39:16
2 answers, 1K views, 0 followers, 0 votes

I am importing some CSV files into a DataFrame.

Data = pd.read_csv(filePath, encoding = 'ISO-8859-1', dtype=object)

I am replacing some values in the 'Indicator' column.

DataT['Indicator'] = DataT['Indicator'].str.replace('export(us$ mil)', 'exports (in us$ mil)')
DataT['Indicator'] = DataT['Indicator'].str.replace('import(us$ mil)', 'imports (in us$ mil)')

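But the replacement has no effect, which I assume is due to an encoding issue.

Please suggest how to resolve this.

Files downloaded from: allyears.zip

Code to import all the CSV files: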

import os
import pandas as pd

DataAll = pd.DataFrame()  # accumulator for all files
for file in os.listdir(sourcePath):
    if file.upper().endswith('.CSV'):
        filePath = os.path.join(sourcePath, file)
        Data = pd.read_csv(filePath, encoding='ISO-8859-1', dtype=object)

        Data['FileName'] = file  # record which file each row came from
        DataAll = pd.concat([DataAll, Data], sort=False)

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-01-18 15:40:00

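After several attempts I arrived at the solution below, which only requires importing the re module. In the pandas versions of that time, Series.str.replace treated the pattern as a regular expression by default, so the characters (, ) and $ in the search string were interpreted as regex metacharacters rather than literal text. This is not an encoding problem.

In addition, you can simplify your loading code to: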

import pandas as pd
import glob
import re

# glob is a module; the function to call is glob.glob
for f in glob.glob('/your_Dir_path/somefiles*.csv'):
    Data = pd.read_csv(f, encoding='ISO-8859-1', dtype=object)
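
If the goal is one combined DataFrame, as in the question, the loop can collect each file and concatenate once at the end. The following is only a sketch under assumptions: the helper name `load_all_csv` and the temporary demo files are hypothetical and not part of the original post, and the `*.csv` pattern is case-sensitive on some platforms, unlike the `upper()` check in the question.

```python
import glob
import os
import tempfile

import pandas as pd


def load_all_csv(source_dir):
    """Read every *.csv file in source_dir into a single DataFrame (hypothetical helper)."""
    frames = []
    for path in sorted(glob.glob(os.path.join(source_dir, "*.csv"))):
        frame = pd.read_csv(path, encoding="ISO-8859-1", dtype=object)
        frame["FileName"] = os.path.basename(path)  # remember the source file
        frames.append(frame)
    # pd.concat raises ValueError on an empty list, so at least one file is assumed.
    return pd.concat(frames, sort=False, ignore_index=True)


# Demo with two small made-up files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv"):
        with open(os.path.join(tmp, name), "w", encoding="ISO-8859-1") as fh:
            fh.write("Indicator,Value\nExport(US$ Mil),1\n")
    data_all = load_all_csv(tmp)

print(data_all)
```

Concatenating once avoids re-copying the accumulated frame on every iteration, which can matter when there are many files.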

Dataset:

>>> Data['Indicator'].head()
0     GDP (current US$ Mil)
1    No. Of Export partners
2    No. Of Export products
3    No. Of Import partners
4    No. Of Import products
Name: Indicator, dtype: object
>>> Data['Indicator'].head(100)
0                     GDP (current US$ Mil)
1                    No. Of Export partners
2                    No. Of Export products
3                    No. Of Import partners
4                    No. Of Import products
5                   No. Of Tariff Agreement
6           Trade Balance (current US$ Mil)
7      Trade (US$ Mil)-Top 5 Export Partner
8      Trade (US$ Mil)-Top 5 Export Partner
9      Trade (US$ Mil)-Top 5 Export Partner
10     Trade (US$ Mil)-Top 5 Export Partner
11     Trade (US$ Mil)-Top 5 Import Partner
12     Trade (US$ Mil)-Top 5 Export Partner
13     Trade (US$ Mil)-Top 5 Import Partner
14     Trade (US$ Mil)-Top 5 Export Partner
15     Trade (US$ Mil)-Top 5 Import Partner
16     Trade (US$ Mil)-Top 5 Export Partner
17     Trade (US$ Mil)-Top 5 Export Partner
18     Trade (US$ Mil)-Top 5 Import Partner

Result:

>>> Data['Indicator'].str.replace(re.escape("Trade (US$ Mil)"), "IN Trade (US$ Mil)").head(100)
0                       GDP (current US$ Mil)
1                      No. Of Export partners
2                      No. Of Export products
3                      No. Of Import partners
4                      No. Of Import products
5                     No. Of Tariff Agreement
6             Trade Balance (current US$ Mil)
7     IN Trade (US$ Mil)-Top 5 Export Partner
8     IN Trade (US$ Mil)-Top 5 Export Partner
9     IN Trade (US$ Mil)-Top 5 Export Partner
10    IN Trade (US$ Mil)-Top 5 Export Partner
11    IN Trade (US$ Mil)-Top 5 Import Partner
12    IN Trade (US$ Mil)-Top 5 Export Partner
13    IN Trade (US$ Mil)-Top 5 Import Partner
14    IN Trade (US$ Mil)-Top 5 Export Partner
15    IN Trade (US$ Mil)-Top 5 Import Partner
16    IN Trade (US$ Mil)-Top 5 Export Partner
17    IN Trade (US$ Mil)-Top 5 Export Partner
18    IN Trade (US$ Mil)-Top 5 Import Partner
19    IN Trade (US$ Mil)-Top 5 Import Partner
20    IN Trade (US$ Mil)-Top 5 Import Partner
21    IN Trade (US$ Mil)-Top 5 Export Partner
22    IN Trade (US$ Mil)-Top 5 Export Partner
23    IN Trade (US$ Mil)-Top 5 Export Partner
24    IN Trade (US$ Mil)-Top 5 Export Partner
25    IN Trade (US$ Mil)-Top 5 Export Partner
26    IN Trade (US$ Mil)-Top 5 Export Partner
27    IN Trade (US$ Mil)-Top 5 Export Partner
28    IN Trade (US$ Mil)-Top 5 Import Partner
29    IN Trade (US$ Mil)-Top 5 Export Partner
                       ...
70      Partner share(%)-Top 5 Export Partner
71      Partner share(%)-Top 5 Import Partner
72      Partner share(%)-Top 5 Export Partner
73      Partner share(%)-Top 5 Import Partner
74      Partner share(%)-Top 5 Export Partner
75      Partner share(%)-Top 5 Export Partner
76      Partner share(%)-Top 5 Import Partner
77      Partner share(%)-Top 5 Import Partner
78      Partner share(%)-Top 5 Import Partner
79      Partner share(%)-Top 5 Export Partner
80      Partner share(%)-Top 5 Export Partner
81      Partner share(%)-Top 5 Export Partner
82      Partner share(%)-Top 5 Export Partner
83      Partner share(%)-Top 5 Export Partner
84      Partner share(%)-Top 5 Export Partner
85      Partner share(%)-Top 5 Export Partner
86      Partner share(%)-Top 5 Import Partner
87      Partner share(%)-Top 5 Export Partner
88      Partner share(%)-Top 5 Import Partner
89      Partner share(%)-Top 5 Export Partner
90                         Country Growth (%)
91           Duty Free Tariff Lines Share (%)
92                    Export Product share(%)
93                    Export Product share(%)
94                    Export Product share(%)
95                    Export Product share(%)
96                    Export Product share(%)
97                    Export Product share(%)
98                    Export Product share(%)
99                    Export Product share(%)
Name: Indicator, Length: 100, dtype: object

For your example, you should try the following:

import re

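# regex=True is stated explicitly because newer pandas versions default to literal matching
DataT['Indicator'] = DataT['Indicator'].str.replace(re.escape('export(us$ mil)'), 'exports (in us$ mil)', regex=True)
DataT['Indicator'] = DataT['Indicator'].str.replace(re.escape('import(us$ mil)'), 'imports (in us$ mil)', regex=True)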
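
To illustrate why the unescaped pattern fails, here is a minimal sketch on a small made-up Series (the sample values are assumptions modelled on the output above, not the real file). Under regex matching, `(`, `)` and `$` are metacharacters, so the pattern never matches the literal text. Escaping it with `re.escape`, or passing `regex=False` (a parameter available in pandas 0.23 and later), makes the match literal.

```python
import re

import pandas as pd

# Hypothetical sample values, for illustration only.
s = pd.Series(["Trade (US$ Mil)-Top 5 Export Partner", "Country Growth (%)"])

# As a regex, "(US$ Mil)" is a group in which "$" anchors the end of the string,
# so text can never follow it and nothing is replaced.
as_regex = s.str.replace("Trade (US$ Mil)", "IN Trade (US$ Mil)", regex=True)

# Escaping the metacharacters makes the regex match the literal text.
escaped = s.str.replace(re.escape("Trade (US$ Mil)"), "IN Trade (US$ Mil)", regex=True)

# Alternatively, switch off regex handling and pass the plain string.
literal = s.str.replace("Trade (US$ Mil)", "IN Trade (US$ Mil)", regex=False)

print(as_regex.equals(s))       # the unescaped regex changed nothing
print(escaped.equals(literal))  # both literal approaches agree
```

Either literal approach should work; `regex=False` avoids the extra import, while `re.escape` is useful when the escaped text is only one part of a larger pattern.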
Votes: 1

Stack Overflow user

Posted 2019-01-18 15:11:35

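After loading a sample of the data, I noticed that the values in the 'Indicator' column are not all lowercase, i.e. 'Export(US$ Mil)' rather than 'export(us$ mil)'. You need to search for the values exactly as they appear, or lowercase the column first: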

DataT['Indicator'] = DataT['Indicator'].str.lower().replace('export(us$ mil)',
                                                            'exports (in us$ mil)')

You can always check a column's unique values with df[col].unique().
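
A minimal sketch of that check, followed by a case-insensitive literal replacement. The sample values are hypothetical, and combining `case=False` with an escaped pattern is just one possible way to avoid lowercasing the whole column; it is not something the original answer proposes.

```python
import re

import pandas as pd

# Hypothetical sample data with mixed-case indicator names.
df = pd.DataFrame(
    {"Indicator": ["Export(US$ Mil)", "Import(US$ Mil)", "Export(US$ Mil)"]}
)

# Inspect the distinct values to see the exact spelling and casing.
unique_values = df["Indicator"].unique()
print(unique_values)

# Match the literal text regardless of case, leaving other values untouched.
df["Indicator"] = df["Indicator"].str.replace(
    re.escape("export(us$ mil)"), "exports (in us$ mil)", case=False, regex=True
)
print(df["Indicator"].tolist())
```

Unlike calling `str.lower()` on the column, this keeps the original casing of every value that is not replaced.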

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/54256149
