Converting XML data in records

Stack Overflow user
Asked 2020-12-02 23:11:39

Here is my XML data:

<?xml version="1.0" ?>
<ALLRECORDS>
<RECORD>
<RecordType>Forward</RecordType>
<Env SINGLE="Y" TYPE="EntList">
<ENV>
<InputDate>20201118</InputDate>
<Company>CompanyName1</Company>
<Book>Book1</Book>
<Status>DONE</Status>
</ENV>
</Env>
<Back SINGLE="Y" TYPE="EntList">
<BACK>
<PendingUser>User1</PendingUser>
<FairValueLevel>LEVEL2</FairValueLevel>
</BACK>
</Back>
<Financial SINGLE="Y" TYPE="EntList">
<FINANCIAL>
<PorS>S</PorS>
<UserSelectedFlag>SELL</UserSelectedFlag>
<BoughtAmt TYPE="Numeric">860000000</BoughtAmt>
<Valdate>20220513</Valdate>
<Rate TYPE="Numeric">103.0575</Rate>
</FINANCIAL>
</Financial>
</RECORD>
<RECORD>
<RecordType>Forward</RecordType>
<Env SINGLE="Y" TYPE="EntList">
<ENV>
<InputDate>20201119</InputDate>
<Company>CompanyName2</Company>
<Book>Book2</Book>
<Status>DONE</Status>
</ENV>
</Env>
<Back SINGLE="Y" TYPE="EntList">
<BACK>
<PendingUser>User2</PendingUser>
<FairValueLevel>LEVEL3</FairValueLevel>
</BACK>
</Back>
<Financial SINGLE="Y" TYPE="EntList">
<FINANCIAL>
<PorS>S</PorS>
<UserSelectedFlag>SELL</UserSelectedFlag>
<BoughtAmt TYPE="Numeric">860000000</BoughtAmt>
<Valdate>20220512</Valdate>
<Rate TYPE="Numeric">103.09</Rate>
</FINANCIAL>
</Financial>
</RECORD>
</ALLRECORDS>

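So, basically, there is a single top-level tag, ALLRECORDS. Each record is represented by a RECORD element, and every RECORD has three child tags: Env, Back, and Financial.

The question is: how do I create one row of data per record? (Don't worry about the column names, because the tags under FINANCIAL will change.)

Here is what I have done so far: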

import xml.dom.minidom
import xml.etree.ElementTree as ET
import pandas as pd
from os import listdir
from os.path import isfile, join
import datetime


def get_row(record):

    envrecords = record.find('Env')
    env = envrecords.find('ENV')
    InputDate = env.find('InputDate').text
    Company = env.find('Company').text
    Book = env.find('Book').text
    Status = env.find('Status').text

    backrecords = record.find('Back')
    back = backrecords.find('BACK')
    PendingUser = back.find('PendingUser').text
    FairValueLevel = '' if back.find('FairValueLevel') is None else back.find('FairValueLevel').text

    finrecords = record.find('Financial')
    fin = finrecords.find('FINANCIAL')
    PorS = fin.find('PorS').text
    UserSelectedFlag = fin.find('UserSelectedFlag').text
    BoughtAmt = fin.find('BoughtAmt').text
    Valdate = fin.find('Valdate').text
    Rate = fin.find('Rate').text

    row = [InputDate, Company, Book, Status, PendingUser, FairValueLevel,
           PorS, UserSelectedFlag, BoughtAmt, Valdate, Rate]

    return row

def get_xml_data(xmlpath):

    cols = ['InputDate', 'Company', 'Book', 'Status', 'PendingUser', 'FairValueLevel',
            'PorS', 'UserSelectedFlag', 'BoughtAmt', 'Valdate', 'Rate']

    rows = []
    allfiles = [f for f in listdir(xmlpath) if isfile(join(xmlpath, f)) and f.endswith('xml')]
    for filename in allfiles:
        fullfilename = join(xmlpath, filename)
        root = ET.parse(fullfilename).getroot()
        tradetypes = ['RECORD']
        for tt in tradetypes:
            for rec in root.findall(tt):
                eachrow = get_row(rec)
                eachrowdict = dict(zip(cols,eachrow))
                rows.append(eachrowdict)

    return rows

filename = 'C:/Temp/FX/xmldata/stupidsample'
rows = get_xml_data(filename)
for a in rows:
    print(a)

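As you can see, I added a special case for the "FairValueLevel" tag. In the Financial section, however, I don't know which new column names will show up. If a new tag named "BusinessDate" appears under FINANCIAL, how should I change the code?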

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-12-05 10:59:58

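If the nodes change, consider a dynamic solution with no hard-coded node names. The code below merges dictionaries with the `**` unpacking syntax available in Python 3.5+: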

import xml.etree.ElementTree as ET
import pandas as pd

doc = ET.parse("FinancialXMLSingleRow.xml")

df = pd.DataFrame([ {**{i.tag:i.text.strip() for i in rec.findall("Env/ENV/*")},
                     **{i.tag:i.text.strip() for i in rec.findall("Back/BACK/*")},
                     **{i.tag:i.text.strip() for i in rec.findall("Financial/FINANCIAL/*")} }
                           for rec in doc.findall(".//RECORD")])
                    
print(df)
#   InputDate       Company   Book Status PendingUser FairValueLevel PorS UserSelectedFlag  BoughtAmt   Valdate      Rate
# 0  20201118  CompanyName1  Book1   DONE       User1         LEVEL2    S             SELL  860000000  20220513  103.0575
# 1  20201119  CompanyName2  Book2   DONE       User2         LEVEL3    S             SELL  860000000  20220512    103.09
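
To illustrate the "BusinessDate" case from the question, here is a minimal sketch of a variant that does not name Env, Back, or Financial at all and that tolerates empty elements. It rests on assumptions not found in the original post: the `SAMPLE` string is a shortened, hypothetical version of the data (with an added `BusinessDate` tag and an empty `PendingUser`), and `record_to_row` is an illustrative helper name. It also assumes tag names are unique across the three sections; if two sections shared a tag name, the later one would overwrite the earlier one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample. The second record adds a BusinessDate tag
# and has an empty PendingUser; neither appears in the original data.
SAMPLE = """<ALLRECORDS>
<RECORD>
<RecordType>Forward</RecordType>
<Env SINGLE="Y" TYPE="EntList"><ENV><InputDate>20201118</InputDate></ENV></Env>
<Back SINGLE="Y" TYPE="EntList"><BACK><PendingUser>User1</PendingUser></BACK></Back>
<Financial SINGLE="Y" TYPE="EntList"><FINANCIAL>
<Rate TYPE="Numeric">103.0575</Rate>
</FINANCIAL></Financial>
</RECORD>
<RECORD>
<RecordType>Forward</RecordType>
<Env SINGLE="Y" TYPE="EntList"><ENV><InputDate>20201119</InputDate></ENV></Env>
<Back SINGLE="Y" TYPE="EntList"><BACK><PendingUser/></BACK></Back>
<Financial SINGLE="Y" TYPE="EntList"><FINANCIAL>
<Rate TYPE="Numeric">103.09</Rate>
<BusinessDate>20201120</BusinessDate>
</FINANCIAL></Financial>
</RECORD>
</ALLRECORDS>"""


def record_to_row(record):
    """Flatten one RECORD element into a dict keyed by leaf tag name."""
    row = {}
    # "*/*/*" matches RECORD/<section>/<wrapper>/<field>,
    # for example Financial/FINANCIAL/Rate. RecordType is skipped
    # because it has no nested children.
    for field in record.findall("*/*/*"):
        # field.text is None for an empty element such as <PendingUser/>.
        row[field.tag] = (field.text or "").strip()
    return row


root = ET.fromstring(SAMPLE)
rows = [record_to_row(rec) for rec in root.findall("RECORD")]

for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` should then give one row per record, with NaN in the BusinessDate column for records that lack that tag.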
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/65110808
