
[Original] Python: downloading the daily net buy/sell reports of the three major institutional investors for TWSE- and OTC-listed stocks

This post was last edited by c_c_lai on 2016-9-13 09:31

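Reply to 111# koshi0413
In typical use, SQL data can be pulled straight into Excel, and Excel can then be used to draw the statistical charts.
Some people store the stock-market records for every minute, or even every second, of each day in a database,
such as MS SQL, Access, MySQL, Oracle, or PostgreSQL,
and then filter and process them from there, whichever programming language or tool they use.
On this topic you can ask the moderators and senior members, such as 准提部林, GBKEE, Hsieh, and others, for advice.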
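
As a rough illustration of "store the records in a database, then filter", here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption chosen for convenience (the post lists other databases), and the table name, column names, and rows are made up for the example.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticks (stock_id TEXT, ts TEXT, price REAL, volume INTEGER)"
)

# Made-up sample rows, not real market data.
sample = [
    ("1101", "2016-09-13 09:00:00", 35.0, 120),
    ("1101", "2016-09-13 09:01:00", 35.1, 80),
    ("2330", "2016-09-13 09:00:00", 180.0, 500),
]
conn.executemany("INSERT INTO ticks VALUES (?, ?, ?, ?)", sample)

# Filter step: total volume per stock, largest first.
totals = conn.execute(
    "SELECT stock_id, SUM(volume) AS total "
    "FROM ticks GROUP BY stock_id ORDER BY total DESC"
).fetchall()
print(totals)
conn.close()
```

The same SELECT could be issued from Excel or any other client once the data sits in a real database server.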

TOP

Reply to 108# koshi0413
I had originally considered using a for loop for this as well, but the problem is that every print() call sends a line feed when it finishes.
for ta in soup.select('tr')[3:]:
    for ct in range(15):
        print(ta.select("td")[ct].text)
But the result puts every cell on its own line.
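
One possible way around the line feed, sketched under the assumption that Python 3 is in use: print() accepts an `end` argument that replaces the default newline. The `cells` list is a made-up stand-in for the text of the `td` elements, and the `io.StringIO` buffer is only there so the output can be inspected.

```python
import io

# Made-up cell texts standing in for [td.text for td in ta.select("td")].
cells = ["1101", "Sample", "10", "20"]

buf = io.StringIO()  # captures the output so it can be inspected
for ct in range(len(cells)):
    # end=" " replaces the default line feed with a space
    print(cells[ct], end=" ", file=buf)
print(file=buf)  # one line feed to finish the row

output = buf.getvalue()
print(output, end="")
```

Dropping the `file=buf` arguments prints straight to the console, one table row per line.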

TOP

Reply to 109# lpk187
import io

Something is wrong with my packages and this will not run. What is more, conda install io cannot install it, which is bizarre.
After work I will try pip install io or easy_install io.
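
For what it is worth, `io` is part of Python's standard library, so there is nothing for conda or pip to install; if `import io` fails, the cause is more likely somewhere else in the environment. A minimal sketch of the module in use, with made-up rows:

```python
import csv
import io

buf = io.StringIO()  # an in-memory text file from the standard library
writer = csv.writer(buf)
writer.writerow(["stock_id", "name"])  # made-up header
writer.writerow(["1101", "sample"])    # made-up row
text = buf.getvalue()
print(text)
```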

TOP

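Reply to 110# c_c_lai

c, any method that gets the job done is a good method; the rest of the code just reflects each person's own way of doing things.
zyzzyva and lpk187 export the table to CSV, which is a great approach and a real plus for anyone working with Excel sheets.
Besides, c is quite active in VBA ^^
I learned VBA slowly by reading your posts too~

I originally also wanted to save the extracted data as CSV,
but I saw that going from CSV to SQL has a lot of encoding-conversion problems, so I import into SQL directly from Python.

I read that SQL can do things like joined queries across multiple tables, which is why I started on it (I have only just started; so far I can only import into SQL from Python and export from SQL to Excel).
Joined queries across multiple tables in different files should also be possible in Excel, right? I have not tried it; I suppose it would need VBA??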
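
A joined query across two tables might look like the sketch below. It uses the built-in sqlite3 module as an assumption (the post does not say which database is used), and the table names, column names, and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE margin (stock_id TEXT, margin_balance INTEGER);
CREATE TABLE institutional (stock_id TEXT, net_buy INTEGER);
""")
# Made-up figures, one row per stock in each table.
conn.executemany("INSERT INTO margin VALUES (?, ?)",
                 [("1101", 5000), ("2330", 12000)])
conn.executemany("INSERT INTO institutional VALUES (?, ?)",
                 [("1101", -300), ("2330", 1500)])

# Join the two tables on the shared stock_id column.
joined = conn.execute("""
SELECT m.stock_id, m.margin_balance, i.net_buy
FROM margin AS m
JOIN institutional AS i ON i.stock_id = m.stock_id
ORDER BY m.stock_id
""").fetchall()
print(joined)
conn.close()
```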

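May I ask: can SQL data be pulled into Excel and then charted in Excel?
As for drawing charts with VBA, I may need to ask c for pointers later @@
Python can draw charts too, but I do not know whether the integration is any worse. I have not looked into it yet, so for now I am sticking with Excel, which I know best, to tie everything together.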

TOP

This post was last edited by c_c_lai on 2016-9-13 08:42

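Reply to 106# zyzzyva
Reply to 109# lpk187
Reply to 107# koshi0413
zyzzyva, I cannot thank you enough!
It is the same old problem: in with open('output.csv', 'w', new='', encoding='utf-8') as f: the
keyword new really does have to be changed, giving with open('output.csv', 'w', newline='', encoding='utf-8') as f:
with newline (the forum would not let me type the full word) before it works.
koshi0413,
I found that displaying the result with pandas here takes slightly longer, but the output looks very good. (Well worth using as a reference.)
# In IPython, dfs[1] is enough to pull out the main data
# dfs[1]
# dfs       # similar to the result in post #103 (the difference: one uses pandas, the other uses BeautifulSoup)
lpk187,
# The solution in post #109 gives a result similar to the answer in post #100. It adds a Counter in the first column, but it uses more packages along the way.
I found that using requests.post(url, headers=headers, data=payload, stream=True) does not need, as pandas does,
the extra corrective step of adding deta.columns = [...]. When tbl=pd.read_html(res.text) runs, does it not bring the column names in automatically?

Seeing three different, excellent solutions made me realise how shallow my knowledge of Python is. I am truly grateful that the three of you explored
the same topic from different angles; that is how learners come to really understand and absorb it. Thank you again. This is also why I am so fond of this forum.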
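
On the open() call discussed in this post: the keyword that the csv module expects is `newline=''`. A minimal sketch, assuming Python 3, with made-up rows and a temporary file path chosen only for the example:

```python
import csv
import os
import tempfile

rows = [["stock_id", "name"], ["1101", "sample"]]  # made-up rows

path = os.path.join(tempfile.mkdtemp(), "output.csv")
# newline='' lets the csv module control line endings itself;
# without it, Windows would write an extra blank line after each row.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read the file back to confirm that the rows round-trip unchanged.
with open(path, "r", newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))
print(read_back)
```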

TOP

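This post was last edited by lpk187 on 2016-9-12 23:46

Reply to 103# koshi0413

It looks like there are quite a few Python enthusiasts in this forum too.
Python's I/O can output many formats: csv, .xlsx, even sql (in pandas, via to_csv, to_excel and to_sql).

When c started discussing this URL http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php
I thought tables are best left to the table specialist, pandas, so I borrowed most of your code to give it a try, and it works quite well.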
import requests
from bs4 import BeautifulSoup  # not used below
import pandas as pd
import io  # standard library; not used below

url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}
payload = {"download": '',
           "qdate": '105/09/10',
           "selectType": "ALL"}
res = requests.post(url, headers=headers, data=payload)
tbl = pd.read_html(res.text)  # one DataFrame per <table> in the page

deta = tbl[1]  # the second table holds the main data
# Column names: stock code, stock name, then the margin-purchase and short-sale columns
deta.columns = ['股票代號','股票名稱','買進','賣出','現金償還','前日餘額','今日餘額','限額','買進','賣出','現券償還','前日餘額','今日餘額','限額','資券互抵','註記']
deta.to_csv('test1.csv')

TOP

By the way, I think this part could be written as a loop, the way VBA uses i = 1 to 14,
I just do not know how to do it yet.

for ta in soup.select('tr')[3:]:
    print(ta.select("td")[0].text,ta.select("td")[1].text,ta.select("td")[2].text,ta.select("td")[3].text,ta.select("td")[4].text,ta.select("td")[5].text,ta.select("td")[6].text,ta.select("td")[7].text,ta.select("td")[8].text,ta.select("td")[9].text,ta.select("td")[10].text,ta.select("td")[11].text,ta.select("td")[12].text,ta.select("td")[13].text,ta.select("td")[14].text)
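
A loop version could look like the sketch below. `first_cells` is a hypothetical helper, and `sample_cells` is a made-up stand-in for the cell texts, so that the example runs without fetching the page.

```python
def first_cells(cells, count=15):
    """Return the first `count` cell texts joined by single spaces."""
    # range(count) yields 0, 1, ..., count - 1,
    # much like VBA's For i = 0 To count - 1.
    picked = [cells[i] for i in range(count)]
    return " ".join(picked)

# Made-up stand-in for [td.text for td in ta.select("td")].
sample_cells = [str(n) for n in range(100, 116)]  # 16 values
line = first_cells(sample_cells)
print(line)
```

With the real page, the call inside the outer loop would be something like `first_cells([td.text for td in ta.select("td")])`.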

TOP

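Reply to 104# c_c_lai

Why is it [1] in soup.select('table')[1]?
It takes the second 'table' in the page source.

And the [3:] in soup.select('tr')[3:]?
It takes the 'tr' rows from the fourth one onward (index 3), skipping the first three.

The first and second ones are headings. I import straight into SQL, so I do not need those two items and did not extract them~~
z has posted a newer reply and his code is more correct. My code is all the quick-and-simple kind: as long as it gets the data and can load it into SQL, that is enough.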
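
The indexing and slicing rules above can be checked on plain lists. The lists here are made-up stand-ins for the results of `soup.select('table')` and `soup.select('tr')`.

```python
# Made-up stand-ins for the lists returned by soup.select(...).
tables = ["table0", "table1", "table2"]
rows = ["tr0", "tr1", "tr2", "tr3", "tr4"]

second_table = tables[1]  # indexes start at 0, so [1] is the second item
data_rows = rows[3:]      # everything from index 3 (the fourth item) onward

print(second_table)
print(data_rows)
```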

TOP

Reply to 102# c_c_lai
Here is another approach for your reference (parse the page, process the data, then save it as CSV):
import requests
from bs4 import BeautifulSoup
import csv

headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'

myDate = '105/09/08'

payload = {'download': '',
           'qdate': myDate,
           'selectType': 'ALL'}

res = requests.post(url, headers=headers, data=payload)
soup = BeautifulSoup(res.text, 'lxml')

trs = soup.select('table tr')

myList = []
subList = []

# Two fixed header rows: group titles first, then the per-column names
header1 = ['股票代號','股票名稱','融資(單位: 交易單位)','','','','','',
           '融券(單位: 交易單位)','','','','','','資券互抵','註記']

header2 = ['','','買進','賣出','現金償還','前日餘額','今日餘額','限額',
           '買進','賣出','現券償還','前日餘額','今日餘額','限額','','']

for i, tr in enumerate(trs):
    if i == 6:
        subList = header1   # row 6 is replaced by the first header row
    elif i == 7:
        subList = header2   # row 7 is replaced by the second header row
    else:
        for td in tr.find_all('td'):
            subList.append(td.text)

    myList.append(subList)
    subList = []
    if i == 5:
        myList.append(subList)  # insert an empty row after row 5

# newline='' lets the csv module handle line endings itself
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    f.write('\ufeff')  # byte-order mark so that Excel detects UTF-8
    w = csv.writer(f)
    for sub in myList:
        w.writerow(sub)
If appearance does not matter, you can use pandas directly:
import requests
import pandas as pd

headers = {"User-Agent":"Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'

myDate = '105/09/07'

payload = {'download': '',
           'qdate': myDate,
           'selectType': 'ALL'}

res = requests.post(url, headers=headers, data=payload)

dfs = pd.read_html(res.text)  # one DataFrame per <table> in the page

# In IPython, dfs[1] is enough to pull out the main data
dfs[1]
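
The `qdate` values used in this thread (for example '105/09/07') appear to be ROC-calendar dates, where the year is the Gregorian year minus 1911. Assuming that reading is right, a small hypothetical helper could build the string from an ordinary date:

```python
from datetime import date

def to_roc_date(d):
    """Format a Gregorian date as ROC 'yyy/mm/dd' (ROC year = Gregorian year - 1911)."""
    return "{}/{:02d}/{:02d}".format(d.year - 1911, d.month, d.day)

my_date = to_roc_date(date(2016, 9, 7))
print(my_date)
```

The result could then be assigned to `myDate` in place of the hard-coded string.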

TOP

Reply to 103# koshi0413
You managed to fetch the data in such a short time; you pick things up really fast.
Trial and error is normal with BeautifulSoup. I often need several tries myself before I find the right element.

TOP
