
[Original] Python: downloading the daily net buy/sell reports of the three major institutional investors for TWSE- and OTC-listed stocks

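Reply to #100 zyzzyva
No wonder everyone is turning to Python; what it can do in so few lines is reason enough.
So it really is that concise and to the point. I've learned something again. Thank you for your generous guidance!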

TOP

Reply to #100 zyzzyva
Suppose I don't save to csv and instead want to display the rows one by one. How would that be handled?
Using res directly doesn't seem to do anything. (I'm liking Python more and more.)
Thanks!

TOP

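This post was last edited by koshi0413 on 2016-9-12 20:33

I can finally reply. I'm so happy!!
I've been following this thread for a long time, and it is what got me started learning Python.
I've learned a lot of VBA on this forum and a lot from c_c_lai's posts. I happened to see c_c_lai's question, so here is my way of giving something back.
Below is what c_c_lai asked for. Downloading the csv is convenient and fast, but importing it into SQL is a hassle because the encoding has to be converted first, so I extract the data directly instead. (In truth, I don't yet know how to convert the csv encoding in Python and then import it into SQL automatically.)
PS: the code is a modified version of zyzzyva's.
PPS: the lines commented out with # are leftovers from repeated experiments while extracting the page. For this page I spent about an hour trying different parts of the page source before BeautifulSoup extracted the section I needed.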
    # -*- coding: utf-8 -*-
    # Python 2 code (note the print statement in the last line).
    import requests
    import time
    import os
    from bs4 import BeautifulSoup as BS
    from datetime import date

    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}
    url1 = "My forum level is too low to post links; please enter the URL yourself"
    payload = {"download":'',
               "qdate":'105/09/10',
               "selectType":"ALL"}
    res = requests.post(url1, headers=headers, data=payload)
    #print res.text

    soup = BS(res.text)  # no parser named, so bs4 picks one itself
    #print soup.select('.board_trad')[0].text
    #tb = soup.select('#main-content')
    #print tb
    #tb = soup.findAll('table')
    soup = soup.select('table')[1]    # second table on the page
    #print tb
    for ta in soup.select('tr')[3:]:  # skip the leading header rows
        print ta.select("td")[0].text,ta.select("td")[1].text,ta.select("td")[2].text,ta.select("td")[3].text,ta.select("td")[4].text,ta.select("td")[5].text,ta.select("td")[6].text,ta.select("td")[7].text,ta.select("td")[8].text,ta.select("td")[9].text,ta.select("td")[10].text,ta.select("td")[11].text,ta.select("td")[12].text,ta.select("td")[13].text,ta.select("td")[14].text
This is the result. I'm only posting the first few rows.
0050   元大台灣50       194 104 0 1,056 1,146 194,125 0 5 0 496 501 194,125 19
0051   元大中型100      0 0 0 2 2 4,500 0 0 0 0 0 4,500 0
0052   FB科技           0 0 0 0 0 2,000 0 0 0 0 0 2,000 0
0053   元大電子         0 0 0 0 0 3,247 0 0 0 0 0 3,247 0
0054   元大台商50       0 0 0 0 0 4,656 0 0 0 0 0 4,656 0
0055   元大MSCI金融     20 5 0 622 637 17,163 0 4 0 118 122 17,163 0
0056   元大高股息       7 1 0 94 100 67,508 4 0 0 25 21 67,508 3
0057   FB摩台           0 0 0 0 0 2,006 0 0 0 0 0 2,006 0
0058   FB發達           0 0 0 0 0 1,299 0 0 0 0 0 1,299 0
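
The step mentioned in this post as not yet worked out, converting the downloaded csv's encoding in Python and loading it into SQL, could look roughly like the sketch below. It rests on several assumptions: that the file is Big5-encoded (`cp950`), that the standard-library `sqlite3` is an acceptable stand-in for the SQL database, and that only the first two columns are wanted. The function name `import_csv` and the table name `margin` are made up for illustration.

```python
import csv
import sqlite3


def import_csv(path, conn, encoding="cp950"):
    """Decode a CSV file and insert its first two columns into SQLite.

    The encoding, table name and column layout are assumptions made for
    illustration; they are not taken from the exchange's actual file.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS margin (code TEXT, name TEXT)")
    # Opening the file with an explicit encoding is the "conversion":
    # Python decodes the bytes to str, and sqlite3 stores str as Unicode.
    with open(path, newline="", encoding=encoding) as f:
        rows = [(r[0].strip(), r[1].strip())
                for r in csv.reader(f) if len(r) >= 2]
    conn.executemany("INSERT INTO margin VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)
```

With hypothetical file names, usage would be something like `import_csv("download.csv", sqlite3.connect("twse.db"))`. Rows with fewer than two fields (blank lines, footers) are skipped rather than raising an error.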

TOP

Reply to #103 koshi0413
Thank you!
I'll study this carefully. I changed BS(res.text) to BS(res.text, 'lxml').
As for the final print( ... ) (my version is 3.5), I'll think about whether there is a better way to write it.
May I ask why it is [1] in soup.select('table')[1]? And what does the [3:] in soup.select('tr')[3:] refer to?
In Python I'm still a kindergartner, so I have to keep asking you all for advice. Thanks!

TOP

Reply to #103 koshi0413
You're already scraping data after such a short time. You picked it up really fast.
Trial and error is normal with BeautifulSoup; I often need several attempts before I find the right element too.

TOP

Reply to #102 c_c_lai
Here is another approach for your reference (parse the page, process the data, then save it as csv):
    import requests
    from bs4 import BeautifulSoup
    import csv

    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

    url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'

    myDate = '105/09/08'

    payload={'download':'',
            'qdate':myDate,
            'selectType':'ALL'}

    res = requests.post(url, headers=headers, data=payload)
    soup = BeautifulSoup(res.text, 'lxml')

    trs = soup.select('table tr')

    myList = []
    subList = []

    # Two-level column headers written in place of the page's own header rows
    header1 = ['股票代號','股票名稱','融資(單位: 交易單位)','','','','','',
                '融券(單位: 交易單位)','','','','','','資券互抵','註記']

    header2 = ['','','買進','賣出','現金償還','前日餘額','今日餘額','限額',
                '買進','賣出','現金償還','前日餘額','今日餘額','限額','','']

    for i, tr in enumerate(trs):
        if i == 6:
            subList = header1
        elif i == 7:
            subList = header2
        else:
            for td in tr.find_all('td'):
                subList.append(td.text)

        myList.append(subList)
        subList = []
        if i == 5:
            myList.append(subList)  # extra blank row before the headers

    # The keyword is newline='' (the forum displayed it as `new`).
    with open('output.csv', 'w', newline='', encoding='utf-8') as f:
        f.write('\ufeff')  # byte order mark, helps Excel detect UTF-8
        w = csv.writer(f)
        for sub in myList:
            w.writerow(sub)
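
A note on the `open()` call at the end of that listing: the `csv` module writes its own `\r\n` line endings, and opening the file with `newline=''` keeps them from being translated a second time, which otherwise tends to show up as blank rows on Windows. The manual `'\ufeff'` write can also be left to the `utf-8-sig` codec. A minimal sketch of that variant follows; `write_rows` is an illustrative name, not part of the original code.

```python
import csv


def write_rows(path, rows):
    """Write rows to a UTF-8 CSV file that starts with a byte order mark."""
    # 'utf-8-sig' emits the BOM automatically, so no explicit
    # f.write('\ufeff') is needed; newline='' leaves the csv module's
    # own line endings untouched.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        csv.writer(f).writerows(rows)
```

In the listing above this would replace the whole `with` block, for example `write_rows('output.csv', myList)`.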
If you don't care about how it looks, you can use pandas directly:
    import requests
    import pandas as pd

    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

    url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'

    myDate = '105/09/07'

    payload={'download':'',
            'qdate':myDate,
            'selectType':'ALL'}

    res = requests.post(url, headers=headers, data=payload)

    dfs = pd.read_html(res.text)

    # In IPython, dfs[1] is enough to pull out the main data
    dfs[1]
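
The `qdate` values in these payloads use the ROC (Minguo) calendar, in which the year is the Gregorian year minus 1911, so `105/09/08` is 8 September 2016. Instead of typing the string by hand, it could be built from a `datetime.date`. In the sketch below, `to_roc_date` is a hypothetical helper name, and the zero-padded `YYY/MM/DD` layout is assumed from the examples in this thread rather than from any documentation of the site.

```python
from datetime import date


def to_roc_date(d):
    """Format a date as ROC-year/MM/DD, the layout used for qdate above."""
    return "{}/{:02d}/{:02d}".format(d.year - 1911, d.month, d.day)


myDate = to_roc_date(date(2016, 9, 8))
print(myDate)
```

Passing `date.today()` would request the current day, although whether the site has data for a given day (weekends, holidays) is a separate question.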

TOP

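Reply to #104 c_c_lai

Why is it [1] in soup.select('table')[1]?
It takes the second 'table' in the page source (indexes start at 0).

And the [3:] in soup.select('tr')[3:]?
It takes every 'tr' in the page source from index 3 (the fourth row) onward.

The rows before that are headers. I import straight into SQL, so I don't need them and didn't extract them.
zyzzyva has posted a newer reply, and his code is more correct. Mine is the quick-and-simple kind: as long as it gets the data in a form that can be imported into SQL, that's enough.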
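
To make the two index expressions concrete: `select()` returns a list-like result, so ordinary Python indexing applies. `[1]` picks the second item, because counting starts at 0, and `[3:]` is a slice that keeps everything from index 3 to the end rather than a single row. The lists below are plain-string stand-ins with made-up labels, not the real tags BeautifulSoup would return.

```python
# Stand-ins for soup.select('table') and for table.select('tr');
# the labels are invented purely to show which items get picked.
tables = ["first table", "second table"]
rows = ["row 0", "row 1", "row 2", "row 3", "row 4", "row 5"]

data_table = tables[1]   # second element, because indexes start at 0
data_rows = rows[3:]     # slice: index 3 through the end of the list

print(data_table)
print(data_rows)
```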

TOP

By the way, I feel this part could be written as a loop, the way VBA uses i = 1 to 14.
I just don't know how to do that yet.

for ta in soup.select('tr')[3:]:
    print ta.select("td")[0].text,ta.select("td")[1].text,ta.select("td")[2].text,ta.select("td")[3].text,ta.select("td")[4].text,ta.select("td")[5].text,ta.select("td")[6].text,ta.select("td")[7].text,ta.select("td")[8].text,ta.select("td")[9].text,ta.select("td")[10].text,ta.select("td")[11].text,ta.select("td")[12].text,ta.select("td")[13].text,ta.select("td")[14].text
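
One way to get the `i = 1 to 14` effect in Python is to loop over the cells themselves instead of indexing them one by one. The sketch below assumes each row has already been turned into a list of cell strings; with BeautifulSoup that could be `[td.text for td in ta.select("td")]`. `format_row` is a hypothetical helper name, and unlike the original line it also strips surrounding whitespace from each cell.

```python
def format_row(cells, count=15):
    """Join the first `count` cell texts with single spaces."""
    return " ".join(cell.strip() for cell in cells[:count])


# Sample row copied from the output posted earlier in the thread
sample = ["0050", "元大台灣50", "194", "104", "0", "1,056", "1,146",
          "194,125", "0", "5", "0", "496", "501", "194,125", "19"]
print(format_row(sample))
```

Because it calls `print(...)` with a single argument, the same line runs on both Python 2 and Python 3.5.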

TOP

This post was last edited by lpk187 on 2016-9-12 23:46

Reply to #103 koshi0413

It looks like there are quite a few Python enthusiasts on this board.
Python's I/O can write many formats: csv, .xlsx, even SQL.

When c_c_lai started discussing this URL, http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php,
I thought that tables are best left to the table specialist, pandas, so I borrowed most of your code to try it out, and the result is pretty good.
    import requests
    from bs4 import BeautifulSoup  # imported in the original post, not used below
    import pandas as pd
    import io                      # imported in the original post, not used below

    url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}
    payload = {"download":'',
               "qdate":'105/09/10',
               "selectType":"ALL"}
    res = requests.post(url, headers=headers, data=payload)
    tbl=pd.read_html(res.text)

    deta=tbl[1]  # the second table holds the main data
    deta.columns = ['股票代號','股票名稱','買進','賣出','現金償還','前日餘額','今日餘額','限額','買進','賣出','現券償還','前日餘額','今日餘額','限額','資券互抵','註記']
    deta.to_csv('test1.csv')  # the DataFrame index is written as the first column
[Attached image: a.png, 2016-9-12 23:46]

TOP

This post was last edited by c_c_lai on 2016-9-13 08:42

Reply to #106 zyzzyva
Reply to #109 lpk187
Reply to #107 koshi0413
zyzzyva, I can't thank you enough!
Still the same old problem: in with open('output.csv', 'w', new='', encoding='utf-8') as f:
the new really has to be changed to newline
(which the forum will not let me type) before it works.
koshi0413,
I found that displaying the result with pandas here takes slightly longer, but the output looks great. (Well worth adopting.)
# In IPython, dfs[1] is enough to pull out the main data
#dfs[1]
#dfs       # similar to the result in post #103 (the difference: one uses pandas, the other uses BeautifulSoup)
lpk187,
The result in post #109 is similar to the answer in post #100. A counter has been added as the first column, but the process uses more packages.
I found that with requests.post(url, headers=headers, data=payload, stream=True) there is no need for the extra step of
adding deta.columns = [...] that pandas requires. Doesn't tbl=pd.read_html(res.text) bring the column names in automatically?

Seeing three different fine solutions made me realise how shallow my knowledge of Python is. Thank you all so much for approaching
the same topic from different angles; that is how learners come to truly understand and absorb it. Thanks again. This is also why I love this forum.
[Attached image: 新增派工.png, 2016-9-13 08:42]

TOP
