Core method: Baidu keyword optimization tool

优采云 Published: 2022-11-03 13:35


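  This tool, long a secret weapon of SEO companies in China, is now publicly available. Anyone who needs it is welcome to try it for free and suggest changes.

  The software uses an embedded search algorithm and automated web simulation to monitor keywords online and optimize a website's keyword rankings.

  Main features:

  1. Want to raise your site's keyword rankings on Baidu? Choose this software.

  2. Want your site's click count to soar overnight? Choose this software.

  3. Want to optimize your site's ranking further? Choose this software.

  4. Want to track the latest appearances of the keywords you follow across the major search engines? Choose this software.

  5. The program is clean and simple to operate, and any keyword can be set up for monitoring.

  There are many more features. Users are welcome to submit good suggestions, which are rewarded with points!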

  

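  Version upgrade notes

  2013.6.3.1

  1. Added Baidu keyword detection, which gives users whose keywords rank low a quick check so the system can identify and optimize them quickly.

  2. Added a ranking points feature. The higher the score, the higher the optimized keyword ranks.

  3. With Baidu keyword detection in place, keyword search is faster: a keyword search used to cover the first 15 result pages and now covers only the first 5.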

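  Complete solution: identifying CMSs with Python and extending to batch asset collection

  If a site is built from a template, its structure is highly repetitive, which is bad for SEO.

  Back to the topic of this article. In penetration testing, checking whether a target URL is built on a CMS template and determining which CMS it uses is an important step. You can look up what has been publicly disclosed about that CMS online, and if the CMS is open source you can also download its source code for a code audit.

  Drawing on personal experience, this article summarizes five ways to identify a site's CMS type. All of them work by examining characteristics of page content or site files; they are split into five categories by the kind of page involved, purely to make them easier to follow. Calling online CMS-identification services from a crawler saves time and effort but is outside the scope of this article. It will be covered later in the crawler section of the video course.

  Identifying the CMS type from page content

  Many CMSs put identifying keywords on the site's home page. WordPress, for example, is a blogging platform written in PHP that you can set up on any server supporting PHP and MySQL. A site built with it shows such keywords on its home page. The following target URL is an example of a site built with WordPress:

  The strategy is this: once the site is identified as WordPress, go on to confirm the WordPress version number, then search for vulnerabilities affecting that version to support the rest of the penetration test.

  So all you need is a fingerprint library: request the target site, compare the response against the library, and if a keyword hits, report the corresponding CMS type for that URL.

  The fingerprint database file needs to be collected and updated continuously.

  In code, the rough structure looks like this: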

# -*- coding: utf-8 -*-
import requests

# Home page fingerprint library: keyword -> CMS name
body = {
    'content="WordPress': 'WordPress',
    'wp-includes': 'WordPress',
    'pma_password': 'phpMyAdmin',
    'hexo': 'hexo',
}

def CheckCmsFromBody(url):
    try:
        r = requests.get(url, timeout=5)
        # Use the charset declared in the page; fall back to the detected
        # encoding when the page declares none
        encodings = requests.utils.get_encodings_from_content(r.text)
        encoding = encodings[0] if encodings else (r.apparent_encoding or 'utf-8')
        url_content = r.content.decode(encoding, 'replace')
        # Report every fingerprint keyword found in the home page
        for k, v in body.items():
            if k in url_content:
                print('Target URL: {} identified CMS: {}'.format(url, v))
    except Exception as e:
        print(e)

url = 'http://www.langzi.fun'
CheckCmsFromBody(url)

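  Roughly, the code fetches the home page and then checks whether any keyword from the fingerprint library appears in it to determine the CMS type.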
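  As a follow-up to the version-number step mentioned earlier, the sketch below pulls a WordPress version out of home page HTML. It assumes the site still emits the default generator meta tag, which many hardened sites remove; the function name `wordpress_version` and the sample HTML are illustrative and not part of the original code.

```python
import re

# Matches the default generator tag, e.g. content="WordPress 6.4.2".
# Assumption: the site has not stripped or altered this tag.
GENERATOR_RE = re.compile(r'content="WordPress\s+([0-9][0-9.]*)"', re.IGNORECASE)

def wordpress_version(html):
    """Return the version string from the generator tag, or None."""
    match = GENERATOR_RE.search(html)
    return match.group(1) if match else None

# Hypothetical page fragment, for illustration only
sample = '<head><meta name="generator" content="WordPress 6.4.2" /></head>'
print(wordpress_version(sample))
```

  A missing tag returns None, in which case the version has to be inferred some other way.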

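  Identifying the CMS type from response headers

  Besides the page content described above, relevant information can also be taken from the HTTP headers the server sends back; some CMSs return a unique signature there. In the figure below, the target site is built with PowerEasy (动易), and the signature in the headers is enough to determine the CMS in use.

  So during the information-gathering phase of a penetration test, leave nothing unexamined; careful digging turns up more information.

  The code for this is so simple that readers who already know how will not bother to read it, and readers without the basics will not understand it anyway, so I will skip it.

  The main difficulty is collecting and updating the header fingerprint library.

  Note that the headers come back as dictionary-like data, which must be converted to a string before searching it for fingerprint keywords.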
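  The article skips the code for this step, so here is a minimal sketch of the header check. The three fingerprints are taken from the header library used later in the article; the function names and the sample headers are assumptions made for illustration.

```python
# A few entries from the article's header fingerprint library:
# substring -> CMS name
HEAD_FINGERPRINTS = {
    'X-Pingback': 'WordPress',
    'wordpress_test_cookie': 'WordPress',
    'X-Powered-Cms: Twilight CMS': 'TwilightCMS',
}

def headers_to_text(headers):
    """Flatten a header mapping into 'Name: value' lines."""
    return '\n'.join('{}: {}'.format(name, value) for name, value in headers.items())

def match_headers(headers, fingerprints=HEAD_FINGERPRINTS):
    """Return the sorted, de-duplicated CMS names whose keyword occurs in the headers."""
    text = headers_to_text(headers)
    return sorted({cms for keyword, cms in fingerprints.items() if keyword in text})

# Hypothetical response headers, for illustration only
sample_headers = {'Server': 'nginx', 'X-Powered-Cms': 'Twilight CMS'}
print(match_headers(sample_headers))
```

  With the requests library, the mapping to pass in would be `requests.get(url).headers`. Flattening to "Name: value" lines is a deliberate choice: calling `str()` on the header mapping yields a Python dict representation with quotes around names and values, so a fingerprint such as `X-Powered-Cms: Twilight CMS` would never match it.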

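  Identifying the CMS type from the robots.txt file

  In penetration testing, robots.txt is a file everyone tries to access. The robots protocol, also known as robots.txt, is an ASCII text file stored in the site's root directory that tells search engine crawlers which directories may and may not be visited. So do the directories that the site administrator does not want search engines to visit contain sensitive information?

  The answer is yes. Some CMSs disallow crawling of sensitive directories such as the admin backend, and they also put unique signatures in the file, so keywords in its text can be used for identification. The target site in the figure below is built with PHPCMS v9; fetching the robots file and comparing signatures easily identifies the CMS in use.

  So the CMS can also be identified from the contents of the robots file.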
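  The robots.txt check can be sketched as two small functions: one looks for fingerprint keywords in the file, the other lists the disallowed paths that are worth a closer look. The keywords come from the article's robots fingerprint list; the sample file is hypothetical and the function names are assumptions.

```python
# A few keywords from the article's robots.txt fingerprint list
ROBOTS_FINGERPRINTS = ['Tncms', 'phpvod', 'jieqi', 'DedeCMS']

def match_robots(robots_text, fingerprints=ROBOTS_FINGERPRINTS):
    """Return the first fingerprint keyword found in robots.txt, or None."""
    for keyword in fingerprints:
        if keyword in robots_text:
            return keyword
    return None

def disallowed_paths(robots_text):
    """Return the non-empty paths listed in Disallow lines."""
    paths = []
    for line in robots_text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop comments
        if line.lower().startswith('disallow:'):
            path = line.split(':', 1)[1].strip()
            if path:
                paths.append(path)
    return paths

# Hypothetical robots.txt content, for illustration only
sample = """# robots.txt for a hypothetical DedeCMS site
User-agent: *
Disallow: /admin/
Disallow: /data/   # runtime data
Disallow:
"""
print(match_robots(sample))
print(disallowed_paths(sample))
```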

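  Identifying the CMS type from the MD5 of site files

  MD5 fingerprints (file fingerprint verification): when you download software from the internet and want to make sure it has not been modified (for example, by adding a virus or unofficial plugins) or corrupted during the download, you can confirm it with file fingerprint (MD5) verification.

  How it works:

  An algorithm is run over the file to produce a checksum of 32 hexadecimal digits (128 bits). The file name and extension can change without affecting the result. Any small change to the original content, however, changes the MD5 result drastically. So if the recomputed MD5 value differs from the one published by the software's download site or official website, the file can be considered altered.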
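  The two properties just described can be checked with a few lines of standard-library Python. This is a sketch only; `file_md5`, `demo`, and the file names are made up for illustration.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def demo():
    """Show that renaming keeps the digest and a one-byte edit changes it."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, 'setup.exe')
        renamed = os.path.join(tmp, 'renamed.bin')
        with open(original, 'wb') as f:
            f.write(b'hello')
        before = file_md5(original)
        os.rename(original, renamed)
        after_rename = file_md5(renamed)
        with open(renamed, 'ab') as f:
            f.write(b'!')  # append a single byte
        after_edit = file_md5(renamed)
    return before, after_rename, after_edit

before, after_rename, after_edit = demo()
print(len(before), before == after_rename, before == after_edit)
```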

  Note that different files have different MD5 values. Back to the topic of this article: a complete CMS distribution ships with CSS, font, and icon files, some of which are unique to that CMS. If such a file exists on the target, fetch it and compute its MD5 value; if the value matches the fingerprint library, the site's CMS type is determined. This method is more accurate than the previous ones.

  The fingerprint library format is as follows:

  site file|CMS type|MD5 value of the site file
  /static/image/admincp/cloud/qun_op.png|DISCUZ|AB35FA459B0BB01D31BA8FAD0953FCC9|
  /widget/images/thumbnail.jpg|ECSHOP|7BB50E4281FA02758834A2E9D7BA9FB9|
  /js/calendar/active-bg.gif|ECSHOP|F8FB9F2B7428C94B41320AA1BC9CF601|
  /phpcms/libs/data/font/Vineta.ttf|PHPCMS|E6E557BAD69B09533827D9652E0C11AB|
  /statics/images/admin_img/arrowhead-y.png|PHPCMS|6C34F70BD2A05C8C5DDEBB369B5B9509|
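  A library in this format has to be parsed before use. The sketch below splits each line into its fields and compares digests case-insensitively, since the sample entries store MD5 values in upper case while Python's `hexdigest()` returns lower case. The function names are assumptions made for illustration.

```python
import hashlib

# Two entries copied from the sample library above
SAMPLE_LIBRARY = """\
/static/image/admincp/cloud/qun_op.png|DISCUZ|AB35FA459B0BB01D31BA8FAD0953FCC9|
/widget/images/thumbnail.jpg|ECSHOP|7BB50E4281FA02758834A2E9D7BA9FB9|
"""

def parse_fingerprints(text):
    """Parse 'path|CMS|MD5|' lines into (path, cms, lowercase_md5) tuples."""
    rules = []
    for line in text.splitlines():
        parts = line.strip().split('|')
        if len(parts) < 3 or not parts[0]:
            continue  # skip blank or malformed lines
        rules.append((parts[0], parts[1], parts[2].lower()))
    return rules

def md5_matches(data, expected_md5):
    """Compare the MD5 of raw bytes with an expected digest, ignoring case."""
    return hashlib.md5(data).hexdigest() == expected_md5.lower()

rules = parse_fingerprints(SAMPLE_LIBRARY)
print(rules[0])
```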

  To use the library, append the file path to the site's URL, compute the MD5 value of the file you get back, and check whether it matches the value in the fingerprint library; if it does, the site uses that CMS. A Python implementation of this check takes only four or five lines. Let's assume a case with the following fingerprint library:

  upload/my_girl.jpg|HEXO|587c7132e6043a1de24e03ededa8980d

  Then identify it in code:

import hashlib
import requests

URL = 'http://www.langzi.fun'
# Fetch the file named in the fingerprint library
req = requests.get(url=URL + '/upload/my_girl.jpg', timeout=5)
# Hash the raw bytes of the response body
md5 = hashlib.md5()
md5.update(req.content)
if md5.hexdigest() == '587c7132e6043a1de24e03ededa8980d':
    print('Target URL CMS type: HEXO')

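  The result of running the test code is as follows:

  Identifying the CMS type from the content of specific pages

  This method builds on the fingerprint library idea above: a CMS ships with certain default pages or text files, and they are still there after the site is built. Request those pages or files and check whether specific keywords appear in them; if they do, the CMS type is confirmed.

  The following target site is built with EmpireCMS (帝国网站管理系统). Request this file on the site:

  e/admin/adminstyle/1/page/about.htm

  The page exists and its content hits a keyword, so the result is that the site was built with EmpireCMS.

  The keyword in the dictionary was matched, and the site is identified as EmpireCMS.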
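  A minimal sketch of this check follows. The rules use the `path|CMS name|keyword|` format and two entries from the specific-page library shown later in the article. The `fetch` parameter, the function names, and the fake site used in place of a real HTTP client are assumptions made for illustration.

```python
def parse_rule(rule):
    """Split 'path|CMS name|keyword|' into its three fields."""
    path, cms_name, keyword = rule.split('|')[:3]
    return path, cms_name, keyword

def check_page_rules(base_url, rules, fetch):
    """Return the CMS name of the first rule whose keyword occurs in its page.

    `fetch` is any callable that takes a URL and returns the page text,
    or None when the page cannot be retrieved.
    """
    for rule in rules:
        path, cms_name, keyword = parse_rule(rule)
        text = fetch(base_url.rstrip('/') + path)
        if text is not None and keyword in text:
            return cms_name
    return None

# Stand-in for a real HTTP client: a hypothetical site with one known page
FAKE_SITE = {
    'http://example.com/data/flashdata/default/cycle_image.xml': '<!-- ecshop flash data -->',
}

rules = [
    '/robots.txt|EmpireCMS|EmpireCMS|',
    '/data/flashdata/default/cycle_image.xml|ecshop|ecshop|',
]
print(check_page_rules('http://example.com/', rules, FAKE_SITE.get))
```

  Passing the fetch function in as a parameter keeps the matching logic testable without network access; in real use it would wrap an HTTP request.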

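  Additional notes

  Besides the five approaches above, which all fetch different files and determine the CMS type from a fingerprint library, information can sometimes be obtained by requesting error pages, although such requests may be blocked by a firewall.

  Some sites also change their directory layout. In that case, adjust the scan to the actual situation, for example by adding a second-level directory to the scan and comparing the results against the fingerprint library, which can uncover a lot of sensitive information.

  Of the five methods, ranked by accuracy from high to low:

  MD5 checksum > specific page content > robots.txt > response headers > home page content

  Ranked by scan time from low to high:

  home page content = response headers = robots.txt < specific page content < MD5 checksum

  Ranked by overall cost-effectiveness from high to low:

  robots.txt > response headers > home page content > specific page content > MD5 checksum

  Home page content, response header, and robots.txt identification each need only a single request to the site and generate no heavy scanning; success depends mainly on your fingerprint library. Scanning specific pages and verifying MD5 values generates more requests, possibly a great many, which easily triggers a firewall and causes a lot of trouble. These problems can be mitigated by lowering the scan frequency, automatically rotating proxy IPs at random, distributed low-frequency scanning, and similar measures.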
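  Of the measures just listed, lowering the scan frequency is the simplest to sketch. The class below enforces a minimum interval between requests; the name `Throttle` and its parameters are assumptions, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time

class Throttle:
    """Enforce a minimum number of seconds between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep just long enough to keep calls min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

  In a scanner, `throttle.wait()` would be called immediately before each HTTP request.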

  

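  Back to the topic of this article. Identifying the CMS type depends above all on a strong fingerprint library; you could even say it depends entirely on the library's data. The five methods are only a way of categorizing the idea, and whether they produce results depends on the library. As for how to update and organize it: fingerprints can be collected from GitHub, and kind people share them on the major forums. You can also download several CMS packages, work out which files are unique to each CMS, record their MD5 values or distinctive keywords, and add the cleaned data to your fingerprint database.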
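  For the last suggestion, a sketch of generating MD5 fingerprint lines from an unpacked CMS package is shown below. It writes entries in the same `path|CMS|MD5|` format used earlier. The function name, the default extension list, and the choice to hash only static files are assumptions; deciding which files are truly unique to one CMS still requires comparing the output across several packages.

```python
import hashlib
import os

def fingerprint_tree(root, cms_name,
                     extensions=('.png', '.gif', '.jpg', '.css', '.ttf', '.txt')):
    """Return 'path|CMS|md5|' lines for the static files under root."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue  # skip scripts and other non-static files
            full = os.path.join(dirpath, name)
            with open(full, 'rb') as f:
                digest = hashlib.md5(f.read()).hexdigest()
            # Store the path the way it would appear in a URL
            rel = '/' + os.path.relpath(full, root).replace(os.sep, '/')
            lines.append('{}|{}|{}|'.format(rel, cms_name, digest))
    return lines
```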

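  Engineering design, implementation, and extension of CMS identification in Python

  The whole project may look complicated, but once it is broken down into functional modules each one is easy to implement. Complete each function separately, then assemble them into a single pipeline. The demo code below shows the overall flow and is meant to be easy to read and understand.

  Recall the ranking by overall cost-effectiveness, from high to low:

  robots.txt > response headers > home page content > specific page content > MD5 checksum

  So the flow is: request the home page just once and compare both its content and its response headers against the library, fetch the robots.txt file and check it, and finally scan the specific pages for keywords and verify MD5 values.

  Detailed flow chart:

  The code is as follows: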

  # -*- coding:utf-8 -*-<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />import requests,hashlib<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 指纹库信息,因为我收集了差不多5000条指纹,如果全部放在代码中,则文章就写不下了<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 本代码仅作演示案例的工程结构<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 关于优化以及拓展,详细会在文章中介绍<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /><br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 首页内容指纹库<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />body = {<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'content="WordPress':'WordPress',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'wp-includes':'WordPress',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'pma_password':'phpMyAdmin',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'hexo':'hexo',<br 
style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'TUTUCMS':'tutucms',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'Powered by TUTUCMS':'tutucms',  'Powered by 1024 CMS':'1024 CMS',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />'Discuz':'Discuz',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '1024 CMS (c)':'1024 CMS',  'Publish By JCms2010':'捷点 JCMS',}<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 请求头信息指纹库<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />head = {'X-Pingback':'WordPress',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'xmlrpc.php':'WordPress',  'wordpress_test_cookie':'WordPress',  'phpMyAdmin=':'phpMyAdmin=',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'adaptcms':'adaptcms',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'SS_MID&squarespace.net':'squarespace建站',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'X-Mas-Server':'TRS MAS',<br style="box-sizing: 
border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'dr_ci_session':'dayrui系列CMS',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'http://www.cmseasy.cn/service_1.html':'CmsEasy',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'Osclass':'Osclass',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'clientlanguage':'unknown cms rcms',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'X-Powered-Cms: Twilight CMS':'TwilightCMS',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'IRe.CMS':'irecms',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'DotNetNukeAnonymous':'DotNetNuke',}<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># robots文件指纹库<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />robots = [<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'Tncms', '新为软件E-learning管理系统', '贷齐乐系统', '中企动力CMS', '全国烟草系统', 'Glassfish', 'phpvod', 'jieqi', '老Y文章管理系统',<br style="box-sizing: border-box;font-size: inherit;color: 
inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 'DedeCMS']<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># MD5指纹库<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />cms_rule = [<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/images/admina/sitmap0.png|08cms|e0c4b6301b769d596d183fa9688b002a|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/install/images/logo.gif|建站之星|ac85215d71732d34af35a8e69c8ba9a2|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/jiaowu/hlp/Images/node.gif|qzdatasoft强智教务管理系统|70ee6179b7e3a5424b5ca22d9ea7d200|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/theme/admin/images/upload.gif|sdcms|d5cd0c796cd7725beacb36ebd0596190|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/themes/README.txt|drupal|5954fc62ae964539bb3586a1e4cb172a|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/view/resource/skin/skin.txt|未知政府采购系统|61a9910d6156bb5b21009ba173da0919|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> 
'/theme/admin/images/upload.gif|sdcms|d5cd0c796cd7725beacb36ebd0596190|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/images/logout/topbg.jpg|TurboMail邮箱系统|f6d7a10b8fe70c449a77f424bc626680|',]<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /># 特定网页指纹库<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />body_rule = [<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/robots.txt|EmpireCMS|EmpireCMS|', '/images/css.css.lnk|KesionCMS(科讯)|kesioncms|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/data/flashdata/default/cycle_image.xml|ecshop|ecshop|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/admin/SouthidcEditor/Include/Editor.js|良精|southidc|', '/plugin/qqconnect/bind.html|PHP168(国徽)|php168|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/SiteServer/Themes/Language/en.xml|SiteServer|siteserver|', '/system/images/fun.js|KingCMS|kingcms|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /> '/INSTALL.mysql.txt|Drupal(水滴)|drupal|', '/themes/default/style.css|ecshop|ECSHOP|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit 
!important;word-break: inherit !important;" /> '/hack/gather/template/addrulesql.htm|qiboSoft(齐博)|qiboSoft|',<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />'/phpcms/templates/default/wap/header.html|phpcms|phpcms']<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /><br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /><br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />def GetContent(url):<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    '''<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    这个函数功能是:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        接受一个传入的网址,返回传入网址的  (请求头信息,原始网页数据,解码成中文后的网页内容)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        当然前提是访问成功了<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /><br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        如果访问失败,则直接返回None了<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit 
!important;word-break: inherit !important;" />    '''<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'}<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    try:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        r = requests.get(url,timeout=5,headers=headers)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        encoding = requests.utils.get_encodings_from_content(r.text)[0]<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        url_content = r.content.decode(encoding, 'replace')<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        return (str(r.headers),r.content,url_content)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    except:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        pass<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" /><br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: 
inherit !important;word-break: inherit !important;" />def CheckCMS(url):<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    # 根据robots文件判定<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    url_r = url+'/robots.txt'<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    res = GetContent(url_r)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    if res != None:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        for robot in robots:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />            if robot in res[2]:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />                return '{}-->CMS类型为:{}'.format(url, robot)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    # 然后根据 网页内容和请求头信息判定<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    res = GetContent(url)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    if res != None:<br style="box-sizing: border-box;font-size: inherit;color: 
inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        for k,v in head.items():<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />            if k in res[0]:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />                return '{}-->CMS类型为:{}'.format(url,v)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        for k,v in body.items():<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />            if k in res[2]:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />                return '{}-->CMS类型为:{}'.format(url, v)<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    # 然后根据特定网址的内容判定<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />    for x in body_rule:<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        cms_prefix = x.split('|', 3)[0]<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: inherit !important;" />        cms_name = x.split('|', 3)[1]<br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;overflow-wrap: inherit !important;word-break: 
inherit !important;" />        cms_md5 = x.split('|', 3)[2]
        url_c = url + cms_prefix
        res = GetContent(url_c)
        if res is not None:
            if cms_md5 in res[2]:
                return '{}-->CMS类型为:{}'.format(url, cms_name)
    # Finally, decide by MD5 value. If none of the earlier checks matched,
    # this scan is unlikely to add much.
    for x in cms_rule:
        cms_prefix = x.split('|', 3)[0]
        cms_name = x.split('|', 3)[1]
        cms_md5 = x.split('|', 3)[2]
        url_s = url + cms_prefix
        res = GetContent(url_s)
        if res is not None:
            # res[1] holds the raw response bytes
            md5 = hashlib.md5()
            md5.update(res[1])
            rmd5 = md5.hexdigest()
            if cms_md5 == rmd5:
                return '{}-->CMS类型为:{}'.format(url, cms_name)

if __name__ == '__main__':
    url = 'http://www.langzi.fun'
    print(CheckCMS(url))
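  The MD5 loop above splits each rule string three times and hashes the response inline. As a small sketch of the same check in isolation, the rule can be unpacked once and the comparison exercised offline. The names `Rule`, `parse_rule` and `md5_matches` are made up for illustration and are not part of the original script, and the sample bytes and the `ExampleCMS` rule are invented test data.

```python
import hashlib
from typing import NamedTuple


class Rule(NamedTuple):
    path: str   # URL path to probe, e.g. '/themes/README.txt'
    name: str   # CMS name reported on a match
    value: str  # expected MD5 hex digest (or keyword, for body rules)


def parse_rule(line: str) -> Rule:
    """Split a 'path|name|value|' fingerprint line into its three fields."""
    path, name, value = line.split('|')[:3]
    return Rule(path, name, value)


def md5_matches(rule: Rule, raw: bytes) -> bool:
    """Return True when the MD5 of the raw response bytes equals the rule's digest."""
    return hashlib.md5(raw).hexdigest() == rule.value.lower()


# Offline demonstration with made-up bytes instead of a real HTTP response.
sample = b'hypothetical file content'
line = '/static/logo.gif|ExampleCMS|{}|'.format(hashlib.md5(sample).hexdigest())
rule = parse_rule(line)
print(rule.name, md5_matches(rule, sample))
```

  Keeping the comparison in a pure function means the fingerprint library can be checked without sending any requests.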

  Test results:

  That is roughly how CMS type identification works in code. Readers with some coding background can extend it themselves. Here are some ideas:

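  Add WAF detection that runs before the page-content and MD5 checks; if a WAF is found, skip the target. The WAF detection code is on the author's WeChat official account, where two articles include the source.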

  Add random request headers, random proxy IPs, and similar measures so that requests look like they come from a browser.

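  Crawl the site's directories, fetch the pages found, and compare their content against the fingerprint keywords. This sometimes produces many more hits, because some keywords never appear on the home page and show up only on error or 404 pages.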

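  Use a thread pool to scan URLs in batches, identify each CMS type, and save the results to a text file. This also makes bulk scanning possible, but that practice arguably falls into black-market (HC) territory, so do not do it unless there is a real business need.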

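  Find several online CMS identification APIs and write a crawler that queries them together. For long-term use, check regularly that the APIs still work.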

  Judging the CMS type from home-page keywords alone produces many false positives, so combine it with other signals such as the robots file.
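  One way to act on that last point is to require agreement between independent signals before reporting a CMS. The sketch below is only an illustration: the function `vote_cms`, the weights in `WEIGHTS`, and the threshold are assumptions, not part of the original script. It takes text that has already been fetched (robots.txt, the response-header string, the page body) and tallies keyword hits per CMS.

```python
from collections import Counter
from typing import Dict, Optional

# Hypothetical weights: a header or robots.txt hit is treated as stronger
# evidence than a keyword that merely appears somewhere in the page body.
WEIGHTS = {'robots': 2, 'head': 2, 'body': 1}


def vote_cms(texts: Dict[str, str],
             fingerprints: Dict[str, Dict[str, str]],
             threshold: int = 2) -> Optional[str]:
    """Return the CMS with the highest weighted score, or None below threshold.

    texts:        source name -> fetched text, e.g. {'body': '<html>...'}
    fingerprints: source name -> {keyword: cms_name}
    """
    scores = Counter()
    for source, rules in fingerprints.items():
        text = texts.get(source, '')
        # Count each CMS at most once per source so that several keywords
        # for the same CMS do not inflate its score.
        hits = {cms for keyword, cms in rules.items() if keyword in text}
        for cms in hits:
            scores[cms] += WEIGHTS.get(source, 1)
    if not scores:
        return None
    cms, score = scores.most_common(1)[0]
    return cms if score >= threshold else None


fingerprints = {
    'head': {'X-Pingback': 'WordPress'},
    'body': {'wp-includes': 'WordPress', 'hexo': 'hexo'},
}
# A page that only mentions "hexo" in its text stays below the threshold.
print(vote_cms({'head': '', 'body': 'my notes about hexo'}, fingerprints))
# Header and body evidence together pass it.
print(vote_cms({'head': 'X-Pingback: http://example.com/xmlrpc.php',
                'body': 'src="/wp-includes/x.js"'}, fingerprints))
```

  The trade-off is that a site exposing only one weak signal goes unreported, which is usually preferable to a wrong label when the next step is version-specific research.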

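  You can also combine this with other techniques to build something fun, mainly as a way to broaden your thinking. For example, pair it with Flask to make a simplified web version of the CMS identifier; the extra code is only a dozen lines or so. I had the itch and wrote one, so have fun with it.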

  The code is as follows:

# coding:utf-8
from flask import Flask, request
import requests
import hashlib

app = Flask(__name__)

# Keyword found in the page body -> CMS name
body = {
    'content="WordPress': 'WordPress',
    'wp-includes': 'WordPress',
    'pma_password': 'phpMyAdmin',
    'hexo': 'hexo',
    'TUTUCMS': 'tutucms',
    'Powered by TUTUCMS': 'tutucms',
    'Powered by 1024 CMS': '1024 CMS',
    'Discuz': 'Discuz',
    '1024 CMS (c)': '1024 CMS',
    'Publish By JCms2010': '捷点 JCMS',
}

# Keyword found in the response headers -> CMS name
head = {
    'X-Pingback': 'WordPress',
    'xmlrpc.php': 'WordPress',
    'wordpress_test_cookie': 'WordPress',
    'phpMyAdmin=': 'phpMyAdmin=',
    'adaptcms': 'adaptcms',
    'SS_MID&squarespace.net': 'squarespace建站',
    'X-Mas-Server': 'TRS MAS',
    'dr_ci_session': 'dayrui系列CMS',
    'http://www.cmseasy.cn/service_1.html': 'CmsEasy',
    'Osclass': 'Osclass',
    'clientlanguage': 'unknown cms rcms',
    'X-Powered-Cms: Twilight CMS': 'TwilightCMS',
    'IRe.CMS': 'irecms',
    'DotNetNukeAnonymous': 'DotNetNuke',
}

# CMS names that may appear in robots.txt
robots = [
    'Tncms', '新为软件E-learning管理系统', '贷齐乐系统', '中企动力CMS', '全国烟草系统',
    'Glassfish', 'phpvod', 'jieqi', '老Y文章管理系统', 'DedeCMS',
]

# 'path|CMS name|MD5 of the file at that path|'
cms_rule = [
    '/images/admina/sitmap0.png|08cms|e0c4b6301b769d596d183fa9688b002a|',
    '/install/images/logo.gif|建站之星|ac85215d71732d34af35a8e69c8ba9a2|',
    '/jiaowu/hlp/Images/node.gif|qzdatasoft强智教务管理系统|70ee6179b7e3a5424b5ca22d9ea7d200|',
    '/theme/admin/images/upload.gif|sdcms|d5cd0c796cd7725beacb36ebd0596190|',
    '/themes/README.txt|drupal|5954fc62ae964539bb3586a1e4cb172a|',
    '/view/resource/skin/skin.txt|未知政府采购系统|61a9910d6156bb5b21009ba173da0919|',
    '/images/logout/topbg.jpg|TurboMail邮箱系统|f6d7a10b8fe70c449a77f424bc626680|',
]

# 'path|CMS name|keyword expected in the content at that path|'
body_rule = [
    '/robots.txt|EmpireCMS|EmpireCMS|',
    '/images/css.css.lnk|KesionCMS(科讯)|kesioncms|',
    '/data/flashdata/default/cycle_image.xml|ecshop|ecshop|',
    '/admin/SouthidcEditor/Include/Editor.js|良精|southidc|',
    '/plugin/qqconnect/bind.html|PHP168(国徽)|php168|',
    '/SiteServer/Themes/Language/en.xml|SiteServer|siteserver|',
    '/system/images/fun.js|KingCMS|kingcms|',
    '/INSTALL.mysql.txt|Drupal(水滴)|drupal|',
    '/themes/default/style.css|ecshop|ECSHOP|',
    '/hack/gather/template/addrulesql.htm|qiboSoft(齐博)|qiboSoft|',
    '/phpcms/templates/default/wap/header.html|phpcms|phpcms',
]


def GetContent(url):
    '''
    Fetch a URL and return a tuple of
    (response headers as a string, raw page bytes, decoded page text).
    Return None if the request fails.
    '''
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'}
    try:
        r = requests.get(url, timeout=5, headers=headers)
        # Use the charset declared in the page when there is one; otherwise
        # fall back to the detected encoding (robots.txt and images declare none).
        declared = requests.utils.get_encodings_from_content(r.text)
        encoding = declared[0] if declared else (r.apparent_encoding or 'utf-8')
        url_content = r.content.decode(encoding, 'replace')
        return (str(r.headers), r.content, url_content)
    except Exception:
        return None


def CheckCMS(url):
    # First, check the robots file
    url_r = url + '/robots.txt'
    res = GetContent(url_r)
    if res is not None:
        for robot in robots:
            if robot in res[2]:
                return '{}-->CMS类型为:{}'.format(url, robot)
    # Next, check the response headers and the page content
    res = GetContent(url)
    if res is not None:
        for k, v in head.items():
            if k in res[0]:
                return '{}-->CMS类型为:{}'.format(url, v)
        for k, v in body.items():
            if k in res[2]:
                return '{}-->CMS类型为:{}'.format(url, v)
    # Then check the content of CMS-specific paths
    for x in body_rule:
        cms_prefix, cms_name, cms_md5 = x.split('|')[:3]
        res = GetContent(url + cms_prefix)
        if res is not None:
            if cms_md5 in res[2]:
                return '{}-->CMS类型为:{}'.format(url, cms_name)
    # Finally, compare MD5 values of known static files
    for x in cms_rule:
        cms_prefix, cms_name, cms_md5 = x.split('|')[:3]
        res = GetContent(url + cms_prefix)
        if res is not None:
            if cms_md5 == hashlib.md5(res[1]).hexdigest():
                return '{}-->CMS类型为:{}'.format(url, cms_name)

# (The source listing is cut off inside CheckCMS; the remainder of the function
# is taken from the earlier script, and the Flask route is not preserved.)
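
  CheckCMS builds each probe address by plain concatenation (`url + cms_prefix`), so a base URL that ends with a slash yields a doubled slash in the request. A minimal sketch of a safer join follows; the helper name `probe_url` is hypothetical and does not appear in the original code.

```python
def probe_url(base: str, path: str) -> str:
    """Join a base URL and a fingerprint path with exactly one slash between them."""
    return base.rstrip('/') + '/' + path.lstrip('/')


# Both spellings of the base URL give the same probe address.
print(probe_url('http://example.com', '/robots.txt'))
print(probe_url('http://example.com/', '/robots.txt'))
```

  Unlike `urllib.parse.urljoin`, this keeps any sub-directory in the base URL, which matches what the concatenation in CheckCMS does for a CMS installed under a path such as `/blog`.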
