[Python] Fetching Wikipedia abstracts (revised): a.k.a. plain text processing | 無気力ラボ

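The other day, in "[Python3.x] Brute-forcing a huge XML file with lxml 3.1.0", I put a lot of effort into extracting Wikipedia abstracts with an XML parser. But as I mentioned at the end of that post, it dawned on me that skipping the parser and just treating the file as plain text would probably be faster, so I'm writing that version down here for the record.
The previous post was just Python practice, okay!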

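if __name__ == '__main__':
    # Read the Wikipedia abstract dump line by line instead of parsing the XML.
    with open('jawiki-20130216-abstract.xml', encoding='utf-8') as ifile:
        with open('abstract.txt', mode='w', encoding='utf-8') as ofile:
            for line in ifile:
                # Keep only the lines that contain an <abstract> element.
                if '<abstract>' in line:
                    # Strip the opening and closing tags and keep the text.
                    line = line.replace('<abstract>', '')
                    ofile.write(line.replace('</abstract>', ''))
    print("finish!")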
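
For what it's worth, here is a rough sketch of the same line filter pulled out into a function, so it can be tried on a few sample lines without the full dump. The names `extract_abstracts`, `OPEN_TAG`, and `CLOSE_TAG` are made up for this example. Like the script above, it assumes each `<abstract>` element sits on a single line. One thing plain text processing does not do for you is decode XML entities such as `&amp;`, so this sketch runs the text through `xml.sax.saxutils.unescape` as well.

```python
from xml.sax.saxutils import unescape

# Hypothetical names for this sketch; the tags match the script above.
OPEN_TAG = '<abstract>'
CLOSE_TAG = '</abstract>'


def extract_abstracts(lines):
    """Yield the text of every single-line <abstract> element in lines."""
    for line in lines:
        start = line.find(OPEN_TAG)
        if start == -1:
            continue  # no opening tag on this line
        start += len(OPEN_TAG)
        end = line.find(CLOSE_TAG, start)
        if end == -1:
            end = len(line)  # closing tag missing; take the rest of the line
        # Decode &amp;, &lt; and &gt;, which a real parser would handle for us.
        yield unescape(line[start:end].strip())


if __name__ == '__main__':
    sample = [
        '<title>Wikipedia: Python</title>\n',
        '<abstract>A &amp; B</abstract>\n',
    ]
    for text in extract_abstracts(sample):
        print(text)
```

Since it takes any iterable of lines, an open file object can be passed in directly, which keeps the memory use as low as the original loop.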

~~Yep, once I've been drinking I... (rest omitted)~~
