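I want to pass a URL in through the address bar, for example www.xxx.com/http://v.youku.com/v_show/id_XMzE0ODExMjE2.html
Then I want to take the passed-in http://v.youku.com/v_show/id_XMzE0ODExMjE2.html and hand it as a parameter to http://www.flvcd.com/parse.php?kw= so that it becomes http://www.flvcd.com/parse.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html, then scrape the links from that page and output the resulting link address.
How should I write this? Could someone please help?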
Solutions »
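Then just take the address that follows http:// and assemble it into
www.flvcd.com/parse.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html
and you're done.
How do I scrape the links from an address like that?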
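The first step described above (pulling the embedded address out of the request path and assembling the flvcd URL) might look roughly like this. It is only a sketch: extract_target_url() and build_flvcd_url() are hypothetical helper names, and it assumes the web server is configured to route every request to this script and may have collapsed "//" into "/" along the way.

```php
<?php
// Hypothetical helper: pull the target URL out of a path-style request URI
// such as "/http://v.youku.com/v_show/id_XMzE0ODExMjE2.html".
function extract_target_url($requestUri)
{
    // Drop the leading slash that separates the host from the embedded URL.
    $candidate = ltrim($requestUri, '/');
    // Assumption: some setups collapse "//" into "/", so restore "scheme://".
    $candidate = preg_replace('#^(https?):/+#i', '$1://', $candidate);
    // Accept only http(s) URLs with a host part; reject everything else.
    if (!preg_match('#^https?://[^/\s]+#i', $candidate)) {
        return null;
    }
    return $candidate;
}

// Hypothetical helper: append the target as the kw parameter, URL-encoded
// so that characters like "&" or "?" in the target do not break the query.
function build_flvcd_url($targetUrl)
{
    return 'http://www.flvcd.com/parse.php?kw=' . urlencode($targetUrl);
}

$target = extract_target_url('/http://v.youku.com/v_show/id_XMzE0ODExMjE2.html');
if ($target !== null) {
    echo build_flvcd_url($target), "\n";
}
```

In a real script the argument would come from $_SERVER['REQUEST_URI'] rather than a hard-coded string.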
Fetch the page with curl, then match the address out of it and print it.
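<?php
// This is curl.php. Put it under localhost and request a URL like the one you described:
// http://localhost/curl.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html
// If the URL you want to fetch contains no ? ; & or similar characters, you can pass it as is.
// If it does, urlencode() it when building the link (PHP decodes $_GET values automatically).
$url = isset($_GET['kw']) ? $_GET['kw'] : '';
// Once you have the URL, a simple curl request fetches the page; I'm not sure why you say you can't get it.
$ch = curl_init( );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_TIMEOUT, 15);
// Without CURLOPT_RETURNTRANSFER, curl_exec() prints the page, so capture it in an output buffer.
ob_start( );
curl_exec( $ch );
// $contents is the fetched page content
$contents = ob_get_contents( );
ob_end_clean( );
curl_close( $ch );
?>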
I wrote a piece myself, take a look:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head><body>
<form action="" method="post">
<input type="text" name="url" value="Enter a Youku URL" onclick="this.value=''" />
<input type="submit" value="提交" />
</form>
<?php
$url = $_GET['url'];
//$url=$_POST["url"];
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
// display file
echo $file_contents;
?>
</body>
</html>
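It does fetch the page, but the output is garbled; only the English text displays correctly. I also haven't managed to pass the parameter in through the address bar. How should I do that? And how do I extract the address it parses out? I'm not familiar with regular expressions and don't know how to approach it. I hope you can help, or contact me on QQ 547468084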
Change <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
to <meta http-equiv="Content-Type" content="text/html; charset=gb2312" /> and that will do!
......
$file_contents = curl_exec($ch);
curl_close($ch);
$file_contents = mb_convert_encoding($file_contents,'utf-8','gb2312');
echo $file_contents;
.....
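To see why the conversion above fixes the garbling, here is a small sketch that converts two GB2312-encoded characters to UTF-8 and prints the resulting bytes. The helper name gb2312_to_utf8() is made up for illustration, and it assumes at least one of the mbstring or iconv extensions is available.

```php
<?php
// Hypothetical helper: convert a GB2312-encoded string to UTF-8,
// preferring mbstring and falling back to iconv if mbstring is missing.
function gb2312_to_utf8($bytes)
{
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($bytes, 'UTF-8', 'GB2312');
    }
    return iconv('GB2312', 'UTF-8//IGNORE', $bytes);
}

// The two Chinese characters for "Chinese text", as GB2312 bytes.
$gb = "\xD6\xD0\xCE\xC4";
$utf8 = gb2312_to_utf8($gb);

// Print the UTF-8 bytes as hex so the result is visible in any terminal.
echo bin2hex($utf8), "\n";
```

Either approach works on its own: declare gb2312 in the page and leave the bytes alone, or keep the page in utf-8 and convert the fetched bytes. Doing both would garble the text again.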
<?php
$url=$_POST["url"];
$ch = curl_init();
$timeout = 5; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
// display file
//echo $file_contents;
$file_contents= iconv("gb2312", "utf-8",$file_contents);
//echo htmlspecialchars($file_contents);
$pat = '/<a(.*?)href="(.*?)"(.*?)>(.*?)<\/a>/i';
preg_match_all($pat, $file_contents, $m);
for($i=0;$i<count($m[2]) ;$i++ ){
    echo '<li><a href="'.$_SERVER['PHP_SELF'].'?url='.$m[2][$i].'">'.$m[4][$i].'</a>';
}
<?php
header("Content-type: text/html;charset=gbk");
$url = 'http://www.flvcd.com/parse.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html';
$timeout = 30;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
// Narrow the page down to the region we want to scrape
if(preg_match('/<table\s+width=100%\s+border=0>(.*?)<form/is',$contents,$arr)){
    if(!empty($arr[1])){
        if(preg_match_all('/<a\s+href\s+=\s+"(.*?)"[^>]*>(.*?)<\/a>/i',$arr[1],$sarr)){
            //echo '<pre />';
            //print_r($sarr[1]);
            foreach($sarr[1] as $i => $v){
                echo '<li><a href="'.$_SERVER['PHP_SELF'].'?url='.$sarr[1][$i].'">'.trim($sarr[2][$i]).'</a></li>';
            }
        }
    }
}
?>
<?php
header("Content-type: text/html;charset=utf-8");
$url = 'http://www.flvcd.com/parse.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html';
$timeout = 30;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
$contents= mb_convert_encoding($contents,'utf-8','gb2312');
// Narrow the page down to the region we want to scrape
if(preg_match('/<table\s+width=100%\s+border=0>(.*?)<form/is',$contents,$arr)){
    if(!empty($arr[1])){
        if(preg_match_all('/<a\s+href\s+=\s+"(.*?)"[^>]*>(.*?)<\/a>/i',$arr[1],$sarr)){
            //echo '<pre />';
            //print_r($sarr[1]);
            foreach($sarr[1] as $i => $v){
                echo '<li><a href="'.$_SERVER['PHP_SELF'].'?url='.$sarr[1][$i].'">'.trim($sarr[2][$i]).'</a></li>';
            }
        }
    }
}
?>
For a link like http://f.youku.com/player/getFlvPath/sid/00_00/st/flv/fileid/03000203004E......48909728272909, printing it from the page source should give the complete address. Could you give that a try?
<?php
header("Content-type: text/html;charset=utf-8");
$url = 'http://www.flvcd.com/parse.php?kw=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html';
$timeout = 30;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
$contents= mb_convert_encoding($contents,'utf-8','gb2312');
if(preg_match('/<table\s+width=100%\s+border=0>(.*?)<form/is',$contents,$arr)){
    if(!empty($arr[1])){
        if(preg_match_all('/<a\s*href\s*=\s*"(.*?)"[^>]*>/i',$arr[1],$sarr)){
            //echo '<pre />';
            //print_r($sarr[1]);
            foreach($sarr[1] as $i => $v){
                echo '<li><a href="'.$_SERVER['PHP_SELF'].'?url='.$v.'">'.$v.'</a></li>';
            }
        }
    }
}
?>
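The extraction step in the scripts above can be tried without touching the network. The sketch below wraps it in a made-up extract_links() helper and runs it on a hand-written sample string. The sample markup and its URLs are invented for illustration and are not flvcd's real output.

```php
<?php
// Hypothetical helper: collect href values from <a> tags, optionally
// keeping only those that contain $needle (for example 'youku').
function extract_links($html, $needle = '')
{
    $links = array();
    // \s* around "=" tolerates both href="..." and href = "...".
    if (preg_match_all('/<a\s[^>]*?href\s*=\s*"([^"]*)"/i', $html, $m)) {
        foreach ($m[1] as $href) {
            // Attribute values in HTML source carry entities such as &amp;.
            $href = html_entity_decode($href, ENT_QUOTES, 'UTF-8');
            if ($needle === '' || strpos($href, $needle) !== false) {
                $links[] = $href;
            }
        }
    }
    return $links;
}

// Invented sample markup, standing in for the fetched page.
$sample = '<table><tr><td>'
    . '<a href="http://f.youku.com/player/getFlvPath/demo?a=1&amp;b=2" target="_blank">part 1</a>'
    . '<a href = "http://www.example.com/about">about</a>'
    . '</td></tr></table>';

print_r(extract_links($sample, 'youku'));
```

When echoing the extracted addresses back into HTML, passing them through htmlspecialchars() (and urlencode() when they go into a query string) keeps a "&" in the address from being misread.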
if (isset($_GET['q'])){
    $url=$_GET['q'];
}else{
    echo 'No URL submitted.';
    exit();
}
$url = "http://www.flvcd.com/parse.php?kw={$url}";
$timeout = 30;
$ch = curl_init();
...
The & character seems to stop working!! How can I solve this?
Simple: handle it with urlencode and urldecode.
For example, run the address through urlencode before passing it in:
$url = 'http://v.youku.com/v_show/id_XMjEwNjkxMDQw.html&format=hight';
$url = urlencode($url);
echo $url;
The output is: http%3A%2F%2Fv.youku.com%2Fv_show%2Fid_XMjEwNjkxMDQw.html%26format%3Dhight
urldecode() decodes a string that was encoded with urlencode(). For your site, you could do this:
Linking page:
$url = 'http://v.youku.com/v_show/id_XMjEwNjkxMDQw.html&format=hight';
$url = 'www.mvtop.info/?q='.urlencode($url);
Receiving page:
if (isset($_GET['q'])){
    $url=urldecode($_GET['q']);
}
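A quick way to see what the "&" does, and why encoding helps, is to parse the two query strings by hand. This sketch uses parse_str(), which decodes a query string the same way PHP fills $_GET. The variable names are only for illustration.

```php
<?php
$target = 'http://v.youku.com/v_show/id_XMjEwNjkxMDQw.html&format=hight';

// Unencoded: the "&" ends the q parameter and starts a separate one.
parse_str('q=' . $target, $raw);

// Encoded: the whole address, "&format=hight" included, stays inside q.
parse_str('q=' . urlencode($target), $encoded);

var_dump($raw['q']);       // the address cut off before "&format"
var_dump($raw['format']);  // "format" became its own parameter
var_dump($encoded['q'] === $target);
```

Note that values in $_GET arrive already decoded, so the extra urldecode() on the receiving page should not be needed for a link built with urlencode().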
I tried the conversion method you described, but it had no effect.
Code:
<?php
header("Content-type: text/html;charset=utf-8");
if (isset($_GET['q'])){
    $url=urldecode($_GET['q']);
}else{
    echo 'No URL submitted.';
    exit();
}
$url = 'http://www.flvcd.com/parse.php?kw='.urlencode($url);//
$timeout = 30;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
$contents= mb_convert_encoding($contents,'utf-8','gb2312');
if(preg_match('/<table\s+width=100%\s+border=0>(.*?)<form/is',$contents,$arr)){
    if(!empty($arr[1])){
        if(preg_match_all('/<a\s*href\s*=\s*"(.*?)"[^>]*>/i',$arr[1],$sarr)){
            //echo '<pre />';
            //print_r($sarr[1]);
            foreach($sarr[1] as $i => $v){
                //echo '<li><a href="'.$_SERVER['PHP_SELF'].'?url='.$v.'">'.$v.'</a></li>';
                $map =$sarr[1];
                foreach ($map as $k =>$v) {
                    if (strpos($v, 'youku') !== false) {
                        echo "$v<br />";
                    }
                }
            }
        }
    }
}
Whether I append &format=super or &format=high, or nothing at all, the result is the same:
http://www.mvtop.info/crul.php?q=http://v.youku.com/v_show/id_XMzE0ODExMjE2.html&format=super
$Enurl=urlencode('http://v.youku.com/v_show/id_XMzE0ODExMjE2.html&format=super');
//echo $Enurl;
Build the address:
http://www.mvtop.info/crul.php?q=$Enurl;
Does that make the usage clear? Then use $YoukuUrl=urldecode($_GET['q']);
to get the address back, and that should do it.
<?php
header("Content-type: text/html;charset=utf-8");
if (isset($_GET['q'])){
    $Enurl=urldecode($_GET['q']);
}else{
    echo 'No URL submitted.';
    exit();
}
$url = "http://www.flvcd.com/parse.php?kw=$Enurl";
I changed it to this, but it doesn't seem to work. It's probably down to my own lack of skill.
"Then use $YoukuUrl=urldecode($_GET['q']);
to get the address back, and that should do it." I don't quite understand this sentence.
Whichever page needs the address is the page where you decode the encoded address: that is the decode step.
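One thing worth checking in the receiving script: once the address has been decoded, it has to be encoded again before it is appended to the flvcd URL, otherwise the "&" splits the query a second time. A minimal sketch, with build_parse_request() as a made-up helper name. Whether flvcd honors format inside kw or expects it as a separate parameter is not stated in this thread, so that part is left open.

```php
<?php
// Hypothetical helper: build the flvcd request from the already-decoded
// address, letting http_build_query() re-encode the value.
function build_parse_request($decodedTarget)
{
    return 'http://www.flvcd.com/parse.php?'
        . http_build_query(array('kw' => $decodedTarget));
}

// Simulate what PHP would place in $_GET['q'] for ...?q=<urlencoded address>.
parse_str(
    'q=' . urlencode('http://v.youku.com/v_show/id_XMzE0ODExMjE2.html&format=super'),
    $get
);

echo build_parse_request($get['q']), "\n";
```

The printed URL carries "%26format%3Dsuper" inside kw, so the whole address reaches the next page as a single parameter.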