 
106#
Posted on 2016-9-12 23:20

Reply to 102# c_c_lai
Here's another approach for your reference (parse the page, process the rows, then save them as CSV):

import requests
from bs4 import BeautifulSoup
import csv

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'
myDate = '105/09/08'  # query date in ROC (民國) calendar format
payload = {'download': '',
           'qdate': myDate,
           'selectType': 'ALL'}
res = requests.post(url, headers=headers, data=payload)
soup = BeautifulSoup(res.text, 'lxml')
trs = soup.select('table tr')

myList = []
subList = []
# Fixed two-row CSV header; it replaces rows 6 and 7 of the scraped table
header1 = ['股票代號', '股票名稱', '融資(單位: 交易單位)', '', '', '', '', '',
           '融券(單位: 交易單位)', '', '', '', '', '', '資券互抵', '註記']
header2 = ['', '', '買進', '賣出', '現金償還', '前日餘額', '今日餘額', '限額',
           '買進', '賣出', '現金償還', '前日餘額', '今日餘額', '限額', '', '']

for i, tr in enumerate(trs):
    if i == 6:
        subList = header1
    elif i == 7:
        subList = header2
    else:
        for td in tr.find_all('td'):
            subList.append(td.text)

    myList.append(subList)
    subList = []
    if i == 5:
        # blank row to separate the summary table from the per-stock table
        myList.append(subList)

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    f.write('\ufeff')  # write a BOM so Excel opens the UTF-8 file correctly
    w = csv.writer(f)
    for sub in myList:
        w.writerow(sub)
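
The qdate field expects the date in ROC (民國) calendar form, e.g. '105/09/08' for 2016-09-08. A minimal sketch of a conversion helper (the function name is my own, not part of the original post):

from datetime import date

def to_roc_date(d):
    """Format a datetime.date as the 'yyy/mm/dd' ROC-calendar string
    expected by qdate (ROC year = Western year - 1911)."""
    return '{:03d}/{:02d}/{:02d}'.format(d.year - 1911, d.month, d.day)

print(to_roc_date(date(2016, 9, 8)))  # -> 105/09/08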

If you don't need the output to look polished, you can use pandas directly:

import requests
import pandas as pd

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'
myDate = '105/09/07'
payload = {'download': '',
           'qdate': myDate,
           'selectType': 'ALL'}
res = requests.post(url, headers=headers, data=payload)
dfs = pd.read_html(res.text)
# In IPython, dfs[1] is the main (per-stock) table
dfs[1]
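
If you also want a file from the pandas route, a minimal follow-up sketch (assuming dfs[1] holds the per-stock table as above; the filename is only an example):

# utf-8-sig adds a BOM so Excel recognizes the CSV as UTF-8
dfs[1].to_csv('output_pandas.csv', index=False, encoding='utf-8-sig')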