Scraping | Baidu Real-Time Hot Topics Ranking

<?php
header('Content-type:text/html;charset=gb2312');
error_reporting(E_ERROR | E_WARNING | E_PARSE);
set_time_limit(0); // no execution time limit
ini_set('memory_limit','200m'); // raise the memory limit
 
// Baidu real-time hot topics ranking
$url ='http://top.baidu.com/buzz?b=1';

// Fetch the page
$str = get_str($url);

// Each ranking entry sits in a <td class="keyword"> cell
$block_rule ='/<td class="keyword">(.*?)<\/td>/si'; 
preg_match_all($block_rule,$str,$fenlei);
// preg_match_all always fills $fenlei, so test the capture group itself
if(!empty($fenlei[1])){
  $cat_rule='/<a class="list-title" target="_blank" href="(.*?)" href_top="(.*?)">(.*?)<\/a>/si';
  $count = count($fenlei[1]);
  for($i=0;$i<$count;$i++){
    preg_match_all($cat_rule,$fenlei[1][$i],$cats);
    // Skip cells that contain no title link
    if(!empty($cats[1])){
      $url_r = $cats[1][0];
      $name_r = $cats[3][0];
      $num = $i+1; 
      echo $num.'&nbsp;&nbsp;<a href="'.$url_r.'" target="_blank">'.$name_r.'</a><br> ';
    }
  }
}
 
// Fetch page content with cURL
function get_str($url){
  $UserAgent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 3.5.21022; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';  
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_URL, $url);  
  curl_setopt($curl, CURLOPT_HEADER, 0);  
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);   
  curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);  
  curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);  
  curl_setopt($curl, CURLOPT_ENCODING, ''); 
  curl_setopt($curl, CURLOPT_USERAGENT, $UserAgent);  
  curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);  
  curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10); // connect timeout in seconds; 0 = no limit

  $data = curl_exec($curl); // page body, or false on failure
  curl_close($curl); 
  return $data;
}

Tested and working as of 2018-10-09.
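
The two-step regex extraction in the script above can also be tried without any network access. The following is a minimal sketch, not the author's code: the function name `extract_hot_topics` and the sample markup are hypothetical, and the markup only imitates the structure the script's regular expressions expect (a `<td class="keyword">` cell wrapping an `<a class="list-title">` link). Unlike the original loop, it numbers only the cells that actually contain a link.

```php
<?php
// Hypothetical helper, not part of the original script.
// Returns rows of rank, url and title from HTML shaped like the
// markup that the script's regular expressions assume.
function extract_hot_topics(string $html): array
{
    $topics = [];
    // Step 1: isolate each keyword cell.
    if (!preg_match_all('/<td class="keyword">(.*?)<\/td>/si', $html, $cells)) {
        return $topics; // no cells found (0) or regex error (false)
    }
    $rule = '/<a class="list-title" target="_blank" href="(.*?)" href_top="(.*?)">(.*?)<\/a>/si';
    foreach ($cells[1] as $cell) {
        // Step 2: pull the link target and the title out of the cell.
        if (preg_match($rule, $cell, $m)) {
            $topics[] = [
                'rank'  => count($topics) + 1, // sequential over matched cells
                'url'   => $m[1],
                'title' => trim(strip_tags($m[3])),
            ];
        }
    }
    return $topics;
}

// Made-up sample markup, for illustration only.
$sample = '<table><tr><td class="keyword">'
    . '<a class="list-title" target="_blank" href="http://example.com/a" href_top="./x">Topic A</a>'
    . '</td></tr><tr><td class="keyword"><span>no link here</span></td></tr>'
    . '<tr><td class="keyword">'
    . '<a class="list-title" target="_blank" href="http://example.com/b" href_top="./y">Topic B</a>'
    . '</td></tr></table>';

foreach (extract_hot_topics($sample) as $t) {
    echo $t['rank'] . ' ' . $t['title'] . ' ' . $t['url'] . PHP_EOL;
}
```

Separating parsing from fetching this way makes it easier to notice when the page markup changes: if the function returns an empty array for a freshly fetched page, the regular expressions most likely need updating.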

Reproduction without permission is prohibited: 王明昌博客 » Scraping | Baidu Real-Time Hot Topics Ranking