Python Web Scraping: Parsing XML Network Messages

Author: Rambo_gor, posted in Python编程 (Python Programming)

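    This section explains how to parse an XML message obtained in a project and extract the fields you need from it.
    Tutorial on Python's standard-library XML parsing modules: http://www.runoob.com/python/python-xml.html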

    The XML file is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
      <RegistrationInfo>
        <Date>2018-03-19T03:57:44.2908045</Date>
        <Author>FANBINGLIN\Administrator</Author>
        <Description>开机提醒事件</Description>
      </RegistrationInfo>
      <Triggers>
        <LogonTrigger>
          <Enabled>true</Enabled>
        </LogonTrigger>
      </Triggers>
      <Principals>
        <Principal id="Author">
          <UserId>FANBINGLIN\Administrator</UserId>
          <LogonType>InteractiveToken</LogonType>
          <RunLevel>LeastPrivilege</RunLevel>
        </Principal>
      </Principals>
      <Settings>
        <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
        <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
        <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
        <AllowHardTerminate>true</AllowHardTerminate>
        <StartWhenAvailable>false</StartWhenAvailable>
        <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
        <IdleSettings>
          <StopOnIdleEnd>true</StopOnIdleEnd>
          <RestartOnIdle>false</RestartOnIdle>
        </IdleSettings>
        <AllowStartOnDemand>true</AllowStartOnDemand>
        <Enabled>true</Enabled>
        <Hidden>false</Hidden>
        <RunOnlyIfIdle>false</RunOnlyIfIdle>
        <DisallowStartOnRemoteAppSession>false</DisallowStartOnRemoteAppSession>
        <UseUnifiedSchedulingEngine>false</UseUnifiedSchedulingEngine>
        <WakeToRun>false</WakeToRun>
        <ExecutionTimeLimit>P3D</ExecutionTimeLimit>
        <Priority>7</Priority>
      </Settings>
      <Actions Context="Author">
        <ShowMessage>
          <Title>每日提醒</Title>
          <Body>
    1、掌握python基本语法,3.19-3.24 
    2、VBA程序研究
    3、工作任务总结</Body>
        </ShowMessage>
      </Actions>
    </Task>
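
    Before walking the tree, it may help to see how minidom represents a file like this one. The sketch below is only an illustration: `SAMPLE` is a trimmed-down copy of the file above, and `describe_children` is a hypothetical helper, not part of any library. It shows that the whitespace between tags becomes text nodes (nodeType 3), which is why the parsing code has to skip them.

```python
import xml.dom.minidom
from xml.dom import Node

# Trimmed-down copy of the task file, inlined so the sketch runs on its own.
SAMPLE = """<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <Triggers>
    <LogonTrigger>
      <Enabled>true</Enabled>
    </LogonTrigger>
  </Triggers>
  <Actions Context="Author">
    <ShowMessage>
      <Title>每日提醒</Title>
    </ShowMessage>
  </Actions>
</Task>"""


def describe_children(element):
    """Hypothetical helper: (nodeName, nodeType) for each direct child."""
    return [(child.nodeName, child.nodeType) for child in element.childNodes]


task = xml.dom.minidom.parseString(SAMPLE).documentElement

# Text nodes show up as '#text' with nodeType 3, elements with nodeType 1.
children = describe_children(task)
for name, node_type in children:
    print(name, node_type)

# Keeping only element nodes drops the whitespace text nodes.
element_names = [child.nodeName for child in task.childNodes
                 if child.nodeType == Node.ELEMENT_NODE]
print(element_names)
```

    Comparing against `Node.TEXT_NODE` or `Node.ELEMENT_NODE` rather than the bare number 3 is purely a readability choice; the values are the same.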

    Parsing code (some debugging statements are left in as comments):

    #!/usr/bin/python3
    # coding: utf-8
    
    import xml.dom.minidom
    from xml.dom import Node
    
    # Parse the file into a DOM tree and take the root <Task> element.
    dom_tree = xml.dom.minidom.parse('开机提醒.xml')
    task = dom_tree.documentElement
    # print(dir(task))  # debugging: list the attributes of a DOM node
    
    for section in task.childNodes:
        # print('nodeName:', section.nodeName, 'nodeType:', section.nodeType)  # debugging
        # Skip the whitespace-only text nodes (nodeType 3) between elements.
        if section.nodeType == Node.TEXT_NODE:
            continue
        if section.nodeName != 'Actions':
            continue
        for action in section.childNodes:        # e.g. <ShowMessage>
            if action.nodeType == Node.TEXT_NODE:
                continue
            for field in action.childNodes:      # <Title>, <Body>
                if field.nodeType == Node.TEXT_NODE:
                    continue
                # The text of a field is stored in its child text nodes.
                for text in field.childNodes:
                    print(text.nodeValue)
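
    The nested loops above can probably be shortened with `getElementsByTagName`, which searches all descendants of an element by tag name. What follows is a minimal sketch under stated assumptions, not the author's method: `SAMPLE` is a trimmed-down inline copy of the file, and `element_text` and `first_text` are hypothetical helpers introduced here for illustration.

```python
import xml.dom.minidom

# Trimmed-down copy of the task file, inlined so the sketch runs on its own.
SAMPLE = """<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Description>开机提醒事件</Description>
  </RegistrationInfo>
  <Actions Context="Author">
    <ShowMessage>
      <Title>每日提醒</Title>
      <Body>
1、掌握python基本语法,3.19-3.24
2、VBA程序研究
3、工作任务总结</Body>
    </ShowMessage>
  </Actions>
</Task>"""


def element_text(element):
    """Hypothetical helper: join the text nodes directly under element."""
    return ''.join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE)


def first_text(parent, tag_name):
    """Hypothetical helper: text of the first <tag_name> descendant, or None."""
    matches = parent.getElementsByTagName(tag_name)
    return element_text(matches[0]) if matches else None


task = xml.dom.minidom.parseString(SAMPLE).documentElement

title = first_text(task, 'Title')
body = first_text(task, 'Body')
print(title)
print(body.strip())
```

    Returning None for a missing tag is a design choice of this sketch; a caller scraping real responses may prefer to raise an error instead, so that an unexpected message format is noticed early.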

    Result: the script prints the text of the <Title> and <Body> elements.

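    Copyright notice: the author reserves all rights, and this post does not represent the position of this site. To repost it, please contact the site and the author.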
