Scraping Douban movie reviews with Python. Topics covered: bs4, requests, time, random
2024-01-08 20:31:51
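The title lists requests, time, and random among the topics, but the post itself only shows the resulting page source. As a minimal sketch of how such a page might be fetched, the function below sends one GET request after a random pause. The `fetch_html` name, the `HEADERS` value, the delay bounds, and the injectable `session` and `sleep` parameters are assumptions made for illustration; they are not taken from the original post.

```python
import random
import time

import requests

# Hypothetical request headers; the original post does not show which ones it used.
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}


def fetch_html(url, session=None, min_delay=1.0, max_delay=3.0, sleep=time.sleep):
    """Fetch one page and return its HTML text, pausing a random interval first."""
    session = session or requests.Session()
    # A random pause before each request keeps the crawl rate low and irregular.
    sleep(random.uniform(min_delay, max_delay))
    response = session.get(url, headers=HEADERS, timeout=10)
    # Raise on 4xx/5xx so an error page is never mistaken for real content.
    response.raise_for_status()
    # The page declares charset=utf-8 in its meta tag, so decode it as UTF-8.
    response.encoding = "utf-8"
    return response.text
```

Calling `fetch_html("https://movie.douban.com/review/best/")` would be one way to obtain source like the dump in this post, although the site may reject or redirect requests it considers automated.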
Page source:
<!DOCTYPE html>
<html lang="zh-CN" class="ua-windows ua-webkit">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="renderer" content="webkit">
<meta name="referrer" content="always">
<meta name="google-site-verification" content="ok0wCgT20tBBgo9_zat2iAcimtN4Ftf5ccsh092Xeyw" />
<title>
豆瓣最受欢迎的影评
</title>
<meta name="baidu-site-verification" content="cZdR4xxR7RxmM4zE" />
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="Expires" content="Sun, 6 Mar 2005 01:00:00 GMT">
<meta name="keywords" content="影评,热门影评,最新影评"/>
<meta name="description" content="豆瓣最受欢迎的影评,发表你的影评"/>
<link rel="alternate" href="/feed/review/best" type="application/rss+xml" title="RSS">
<link href="https://img1.doubanio.com/f/vendors/02814fbb5bee25484516bd0a642af695f7ec5a83/css/douban.css" rel="stylesheet" type="text/css">
<link href="https://img1.doubanio.com/f/vendors/ee6598d46af0bc554cecec9bcbf525b9b0582cb0/css/separation/_all.css" rel="stylesheet" type="text/css">
<link href="https://img1.doubanio.com/f/zerkalo/4b7b75331a86c9c8275ac6b7306e820fc072e09a/css/init.css" rel="stylesheet" type="text/css">
<link rel="stylesheet" href="https://img1.doubanio.com/misc/mixed_static/610438fbda6eb614.css">
<style type="text/css"></style>
<script type="text/javascript">var _head_start = new Date();</script>
<script type="text/javascript" src="https://img1.doubanio.com/f/vendors/6931d89467c7bd3bb6cd748c05cae22368989aea/js/jquery-1.9.1.min.js"></script>
<script type="text/javascript" src="https://img1.doubanio.com/f/vendors/aa9559674f2476cdc16f755b3cdc4ebc478db669/js/douban.js"></script>
<script type="text/javascript" src="https://img1.doubanio.com/f/vendors/e38c65a87555287f5fb7c997e41b908d72ff9731/js/lib/moreurl.js"></script>
<script type="text/javascript" src="https://img1.doubanio.com/f/vendors/b0d3faaf7a432605add54908e39e17746824d6cc/js/separation/_all.js"></script>
<script type="text/javascript" src="https://img1.doubanio.com/f/zerkalo/8f98eaec1c9c779076c24b46fe052ee9c2dd52d8/dist/js/base.js"></script>
<script type="text/javascript"></script>
<link rel="shortcut icon" href="https://img1.doubanio.com/favicon.ico" type="image/x-icon">
</head>
<body>
<script type="text/javascript">var _body_start = new Date();</script>
<link href="//img3.doubanio.com/dae/accounts/resources/ded47ae/shire/bundle.css" rel="stylesheet" type="text/css">
<div id="db-global-nav" class="global-nav">
<div class="bd">
<div class="top-nav-info">
<a href="https://accounts.douban.com/passport/login?source=main" class="nav-login" rel="nofollow">登录/注册</a>
</div>
<div class="top-nav-doubanapp">
<a href="https://www.douban.com/doubanapp/app?channel=top-nav" class="lnk-doubanapp">下载豆瓣客户端</a>
<div id="doubanapp-tip">
<a href="https://www.douban.com/doubanapp/app?channel=qipao" class="tip-link">豆瓣 <span class="version">6.0</span> 全新发布</a>
<a href="javascript: void 0;" class="tip-close">×</a>
</div>
<div id="top-nav-appintro" class="more-items">
<p class="appintro-title">豆瓣</p>
<p class="qrcode">扫码直接下载</p>
<div class="download">
<a href="https://www.douban.com/doubanapp/redirect?channel=top-nav&direct_dl=1&download=iOS">iPhone</a>
<span>·</span>
<a href="https://www.douban.com/doubanapp/redirect?channel=top-nav&direct_dl=1&download=Android" class="download-android">Android</a>
</div>
</div>
</div>
<div class="global-nav-items">
<ul>
<li class="on">
<a href="https://www.douban.com" data-moreurl-dict='{"from":"top-nav-click-main","uid":"0"}'>豆瓣</a>
</li>
<li class="">
<a href="https://book.douban.com" target="_blank" data-moreurl-dict='{"from":"top-nav-click-book","uid":"0"}'>读书</a>
</li>
<li class="">
<a href="https://movie.douban.com" target="_blank" data-moreurl-dict='{"from":"top-nav-click-movie","uid":"0"}'>电影</a>
</li>
<li class="">
<a href="https://music.douban.com" target="_blank" data-moreurl-dict='{"from":"top-nav-click-music","uid":"0"}'>音乐</a>
</li>
<li class="">
<a href="https://www.douban.com/location" target="_blank" data-moreurl-dict='{"from":"top-nav-click-location","uid":"0"}'>同城</a>
</li>
<li class="">
<a href="https://www.douban.com/group" target="_blank" data-moreurl-dict='{"from":"top-nav-click-group","uid":"0"}'>小组</a>
</li>
<li class="">
<a href="https://read.douban.com/?dcs=top-nav&dcm=douban" target="_blank" data-moreurl-dict='{"from":"top-nav-click-read","uid":"0"}'>阅读</a>
</li>
<li class="">
<a href="https://fm.douban.com/?from_=shire_top_nav" target="_blank" data-moreurl-dict='{"from":"top-nav-click-fm","uid":"0"}'>FM</a>
</li>
<li class="">
<a href="https://time.douban.com/?dt_time_source=douban-web_top_nav" target="_blank" data-moreurl-dict='{"from":"top-nav-click-time","uid":"0"}'>时间</a>
</li>
<li class="">
<a href="https://market.douban.com/?utm_campaign=douban_top_nav&utm_source=douban&utm_medium=pc_web" target="_blank" data-moreurl-dict='{"from":"top-nav-click-market","uid":"0"}'>豆品</a>
</li>
</ul>
</div>
</div>
</div>
<script>
;window._GLOBAL_NAV = {
DOUBAN_URL: "https://www.douban.com",
N_NEW_NOTIS: 0,
N_NEW_DOUMAIL: 0
};
</script>
<script src="//img3.doubanio.com/dae/accounts/resources/ded47ae/shire/bundle.js" defer="defer"></script>
<link href="//img3.doubanio.com/dae/accounts/resources/ded47ae/movie/bundle.css" rel="stylesheet" type="text/css">
<div id="db-nav-movie" class="nav">
<div class="nav-wrap">
<div class="nav-primary">
<div class="nav-logo">
<a href="https://movie.douban.com">豆瓣电影</a>
</div>
<div class="nav-search">
<form action="https://search.douban.com/movie/subject_search" method="get">
<fieldset>
<legend>搜索:</legend>
<label for="inp-query">
</label>
<div class="inp"><input id="inp-query" name="search_text" size="22" maxlength="60" placeholder="搜索电影、电视剧、综艺、影人" value=""></div>
<div class="inp-btn"><input type="submit" value="搜索"></div>
<input type="hidden" name="cat" value="1002" />
</fieldset>
</form>
</div>
</div>
</div>
<div class="nav-secondary">
<div class="nav-items">
<ul>
<li ><a href="https://movie.douban.com/cinema/nowplaying/"
>影讯&购票</a>
</li>
<li ><a href="https://movie.douban.com/explore"
>选电影</a>
</li>
<li ><a href="https://movie.douban.com/tv/"
>电视剧</a>
</li>
<li ><a href="https://movie.douban.com/chart"
>排行榜</a>
</li>
<li ><a href="https://movie.douban.com/review/best/"
>影评</a>
</li>
<li ><a href="https://movie.douban.com/annual/2023/?fullscreen=1&source=navigation"
>2023年度榜单</a>
</li>
<li ><a href="https://c9.douban.com/app/standbyme-2023/?autorotate=false&fullscreen=true&hidenav=true&monitor_screenshot=true&source=web_navigation"
target="_blank"
>2023年度报告</a>
</li>
</ul>
</div>
<a href="https://movie.douban.com/annual/2023/?fullscreen=1&source=movie_navigation" class="movieannual"></a>
</div>
</div>
<script id="suggResult" type="text/x-jquery-tmpl">
<li data-link="{{= url}}">
<a href="{{= url}}" onclick="moreurl(this, {from:'movie_search_sugg', query:'{{= keyword }}', subject_id:'{{= id}}', i: '{{= index}}', type: '{{= type}}'})">
<img src="{{= img}}" width="40" />
<p>
<em>{{= title}}</em>
{{if year}}
<span>{{= year}}</span>
{{/if}}
{{if sub_title}}
<br /><span>{{= sub_title}}</span>
{{/if}}
{{if address}}
<br /><span>{{= address}}</span>
{{/if}}
{{if episode}}
{{if episode=="unknow"}}
<br /><span>集数未知</span>
{{else}}
<br /><span>共{{= episode}}集</span>
{{/if}}
{{/if}}
</p>
</a>
</li>
</script>
<script src="//img3.doubanio.com/dae/accounts/resources/ded47ae/movie/bundle.js" defer="defer"></script>
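The source shown above stops before the review list itself, so the review selectors cannot be recovered from it. As a hedged illustration of the bs4 step, the sketch below parses only what the dump does contain: the page `<title>` and the links inside `div.nav-items`. The `parse_nav_links` name is hypothetical, and extracting real reviews would need selectors taken from the full page.

```python
from bs4 import BeautifulSoup


def parse_nav_links(html):
    """Return the page title and the (text, href) pairs from the secondary nav bar."""
    soup = BeautifulSoup(html, "html.parser")
    # The <title> text is wrapped in newlines in the source, so strip it.
    title = soup.title.get_text(strip=True) if soup.title else ""
    links = []
    # "div.nav-items" matches the secondary navigation block only; the global
    # bar uses the separate class "global-nav-items" and is not selected.
    for a in soup.select("div.nav-items a"):
        links.append((a.get_text(strip=True), a.get("href")))
    return title, links
```

Passing in the HTML text of the page should give the title `豆瓣最受欢迎的影评` and one `(text, href)` pair per entry of the secondary navigation bar.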
Source: https://blog.csdn.net/jolinoy/article/details/135464797