
[Original] Python: downloading the daily report of the three major institutional investors' net buy/sell for listed and OTC stocks

Reply to 79# c_c_lai
Strange, mine works fine. So yours displays correctly in ipython, but opening it in Notepad doesn't work?
Encoding problems have been a constant headache for me too. I use utf-8 because the data contains some characters that aren't in big5, like the 「瀬」 in 「平瀬義樹 牧師」.
Try changing the encoding in the final with open to encoding='utf-8-sig' and see.
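Something like this, as a minimal sketch (the filename and rows here are placeholders, not the actual script's):

    import csv

    # 'utf-8-sig' prepends a UTF-8 BOM to the file, which lets Notepad
    # and Excel detect the encoding correctly.
    rows = [['平瀬義樹 牧師', 'example']]
    with open('out.csv', 'w', encoding='utf-8-sig', newline='') as f:
        csv.writer(f).writerows(rows)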


Reply to 81# c_c_lai

Mine displays fine, so it may be a problem with your system or somewhere else. I have some errands and need to head out; tomorrow I'll find another computer and test on it.


Reply to 83# c_c_lai
The later part I can understand: it's a problem with the default encoding Excel uses when reading csv. For versions from 2007 on, writing a BOM (byte order mark) should solve it.
Appending sig to encoding=utf-8 earlier was precisely to add the BOM, so I don't know why it doesn't work in 2010. When you have time, try changing it back to plain utf-8,
then add a line 「f.write('\ufeff')」 below it to write the BOM manually and see.
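Roughly like this, a sketch of the manual-BOM variant (file name and content are illustrative):

    # Plain utf-8, then write the BOM yourself before any data;
    # '\ufeff' is the same byte sequence 'utf-8-sig' would prepend.
    with open('out.csv', 'w', encoding='utf-8', newline='') as f:
        f.write('\ufeff')
        f.write('some,data\n')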
As for the earlier problem, I really can't figure out the cause: the downloaded data sometimes displays fine in Notepad and sometimes as garbage?
Is there any way to reproduce the error? Check whether the garbage appears only under specific conditions or on specific pages; that would at least give us a direction.
As far as I know, pandas is mainly used for the later stage of data analysis; preparing the data usually still needs other tools to help.
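For that analysis stage, a minimal pandas sketch (the filename is a placeholder for whatever csv the scraping step produced):

    import pandas as pd

    # Load a prepared csv and peek at it; 'utf-8-sig' transparently
    # skips the BOM discussed above if one was written.
    df = pd.read_csv('out.csv', encoding='utf-8-sig')
    print(df.head())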


Reply to 90# c_c_lai
Great! If even that hadn't worked, it would have been a real headache.
The original post mentioned wanting to scrape county by county. The site itself already provides links grouped by county/city, e.g.:
http://church.oursweb.net/slocation.php?w=1&c=TW&a=台中市&t=
In fact, just swapping in that link and adjusting the total page count lets the existing code pull down that region's data.
But what if we want to let the user choose? Suppose we've already followed the user's choice into that district's page; how do we then determine the total number of pages?
Take Taichung City, for example (attachment: Screenshot_1.png): can we extract the 31 inside the blue box in the image?


Reply to 92# c_c_lai
I wouldn't dare call it guidance; I'm also taking this opportunity to learn as I go.
Actually, how easy something is to scrape depends a lot on how the page's html is laid out. I brought this number up because I tried it myself and found it a bit fiddly to get.

If you're using chrome, you can right-click on the data you want to grab and choose 「Inspect」; you should see something like the screenshot (sometimes you need to expand the tree structure).
Looking at this structure, apart from the class="tb_pages" on the table above it, I don't see anything easy to use, so I'd pick that table as the starting reference point.
(There are usually many different ways to locate a target; whichever one finds it is fine.) The code below gets the whole sentence:
    import requests
    from bs4 import BeautifulSoup

    url = 'http://church.oursweb.net/slocation.php?w=1&c=TW&a=台中市&t='
    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'lxml')

    # The first cell of the pager table holds the sentence with the page count.
    target = soup.select('.tb_pages td')[0].text
    print(target)
But the number is still buried inside it, which is a bit annoying.
At this point I use a regular expression to pull it out.
    import re

    # Match the '/ <total>' fragment after the current page number,
    # then drop the '/ ' to keep only the total page count.
    total_num = re.search(r'/\s\d{1,3}', target).group().replace('/ ', '')
    print(total_num)
Another relatively simple way is to use lxml directly, along the lines of the sketch below.
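A minimal sketch of that route (untested against this page; the xpath is my assumption based on the table structure described above):

    import requests
    from lxml import html

    url = 'http://church.oursweb.net/slocation.php?w=1&c=TW&a=台中市&t='
    res = requests.get(url)
    res.encoding = 'utf-8'
    tree = html.fromstring(res.text)

    # Same starting point as before: the first cell of the tb_pages table.
    target = tree.xpath('//table[@class="tb_pages"]//td')[0].text_content()
    print(target)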


Just thought about it a bit more: for this page, you actually don't even need BeautifulSoup; re alone will do.
    import requests
    import re

    url = 'http://church.oursweb.net/slocation.php?w=1&c=TW&a=台中市&t='
    res = requests.get(url)

    # Apply the pattern to the raw page text; no parsing step needed.
    total_num = re.findall(r'/\s\d{1,3}', res.text)[0].replace('/ ', '')
    print(total_num)


Reply to 95# clianghot546
It depends on what kind of result you want and what processing is needed. For ordinary calculations I still put everything into excel.
In theory python also has plenty of packages for every kind of analysis, but for everyday use I still find excel's worksheets and cells friendlier.


Reply to 96# c_c_lai
r'/\s\d{1,3}': the r in front of the string marks it as a raw string; it tells python not to interpret special characters and to pass the string into the re module exactly as written.
The rest is the re pattern itself. Take 「Total 5704 records 《《Prev page Page 1 / 286 Next page》》」 as an example: 「/」 sits between the two numbers (1 and 286),
and after 「/」 there is one space (「\s」 in re); 「\d」 then stands for a digit, and 「{1,3}」 means one to three of the preceding item (here, 「\d」).
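To make that concrete, a small demo against that pager text (the string is just the sample quoted above):

    import re

    target = 'Total 5704 records 《《Prev page Page 1 / 286 Next page》》'
    m = re.search(r'/\s\d{1,3}', target)   # matches '/ 286'
    print(m.group().replace('/ ', ''))     # prints '286'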
BeautifulSoup doesn't have to use lxml as its parser; if the page is well-formed, it makes little difference which one you use (lxml may be somewhat faster).
See https://www.crummy.com/software/BeautifulSoup/bs4/doc.zh/#id49
soup.select('.tb_pages td') returns a list, and list[0] simply means the list's first element.
Regular expressions are quite fast, and every language should have a corresponding module; I just use them less, partly because I'm not fluent with them, and partly because when a pattern is broad I worry it will match unexpected data.
python won't know that the number is the total page count; that still takes human observation. python can only tell you what the number at a given position (or matching a given re pattern) is.


Reply to 99# c_c_lai
That one does provide a csv download; grabbing the csv directly, the way this thread started out, should be the fastest approach.
    import requests

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"}

    url = 'http://www.twse.com.tw/ch/trading/exchange/MI_MARGN/MI_MARGN.php'

    # Ask the server for the csv version directly; qdate is in ROC-calendar format.
    payload = {'download': 'csv',
               'qdate': '105/09/07',
               'selectType': 'ALL'}

    res = requests.post(url, headers=headers, data=payload, stream=True)

    # Stream the response to disk in 1 KB chunks.
    with open('test.csv', 'wb') as f:
        for chunk in res.iter_content(1024):
            f.write(chunk)


Reply to 103# koshi0413
You're already pulling down data after such a short time; you've picked this up really fast.
Trial and error with BeautifulSoup is normal; I too often need several tries before I find the right thing.

