最近看了新周刊的一篇推送,有关地铁名字的分析,链接如下。
我们分析了3447个地铁站,发现了中国城市地名的秘密
于是乎也想着自己去获取数据,然后进行分析一番。
当然分析水平不可能和他们的相比,毕竟文笔摆在那里,也就那点水平。
大家看着乐呵就好,能提高的估摸着也就只有数据的准确性啦。
文中所用到的地铁站数据并没有去重,对于换乘站,含有大量重复。
即使作者一直在强调换乘站占比很小,影响不是很大。
但于我而言,去除重复数据还是比较简单的。
然后照着人家的路子去分析,多学习一下。
/ 01 / 获取分析
地铁信息获取从高德地图上获取。
上面主要获取城市的「id」,「cityname」及「名称」。
用于拼接请求网址,进而获取地铁线路的具体信息。
找到请求信息,获取各个城市的地铁线路以及线路中站点详情。
/ 02 / 数据获取
具体代码如下。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> json<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> requests<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> bs4 <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> BeautifulSoup<br /><br />headers = {<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'user-agent'</span>: <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'</span>}<br /><br /><br /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">get_message</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(ID, cityname, name)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 地铁线路信息获取<br /> """</span><br /> url = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'http://map.amap.com/service/subway?_1555502190153&srhdata='</span> + ID + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'_drw_'</span> + cityname + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'.json'</span><br /> response = requests.get(url=url, headers=headers)<br /> html = response.text<br /> result = json.loads(html)<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> result[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'l'</span>]:<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'st'</span>]:<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 判断是否含有地铁分线</span><br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">if</span> len(i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>]) > <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>:<br /> print(name, i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'('</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">')'</span>, j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>])<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">with</span> open(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a+'</span>, encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>) <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> f:<br /> f.write(name + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'('</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">')'</span> + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>)<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">else</span>:<br /> print(name, i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>], j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>])<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">with</span> open(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a+'</span>, encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>) <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> f:<br /> f.write(name + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>)<br /><br /><br /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">get_city</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">()</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 城市信息获取<br /> """</span><br /> url = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'http://map.amap.com/subway/index.html?&1100'</span><br /> response = requests.get(url=url, headers=headers)<br /> html = response.text<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 编码</span><br /> html = html.encode(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ISO-8859-1'</span>)<br /> html = html.decode(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'utf-8'</span>)<br /> soup = BeautifulSoup(html, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'lxml'</span>)<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市列表</span><br /> res1 = soup.find_all(class_=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"city-list fl"</span>)[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>]<br /> res2 = soup.find_all(class_=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"more-city-list"</span>)[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>]<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> res1.find_all(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a'</span>):<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市ID值</span><br /> ID = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'id'</span>]<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市拼音名</span><br /> cityname = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'cityname'</span>]<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市名</span><br /> name = i.get_text()<br /> get_message(ID, cityname, name)<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> res2.find_all(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a'</span>):<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市ID值</span><br /> ID = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'id'</span>]<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市拼音名</span><br /> cityname = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'cityname'</span>]<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市名</span><br /> name = i.get_text()<br /> get_message(ID, cityname, name)<br /><br /><br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">if</span> __name__ == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'__main__'</span>:<br /> get_city()<br /></p>
最后成功获取数据。
包含换乘站数据,一共3541个地铁站点。
/ 03 / 数据可视化
先对数据进行清洗,去除重复的换乘站信息。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> wordcloud <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> WordCloud, ImageColorGenerator<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> pyecharts <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> Line, Bar<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> plt<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> pandas <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> pd<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> numpy <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> np<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> jieba<br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 设置列名与数据对齐</span><br />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.unicode.ambiguous_as_wide'</span>, <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.unicode.east_asian_width'</span>, <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 显示10行</span><br />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.max_rows'</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>)<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 读取数据</span><br />df = pd.read_csv(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, header=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">None</span>, names=[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>], encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>)<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 各个城市地铁线路情况</span><br />df_line = df.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]).count().reset_index()<br />print(df_line)<br /></p>
通过城市及地铁线路进行分组,得到全国地铁线路总数。
一共183条地铁线路。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_map</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 绘制地图</span><br /> value = [i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]]<br /> attr = [i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]]<br /> geo = Geo(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"已开通地铁城市分布情况"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'0'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>, title_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#fff"</span>, background_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#404a59"</span>, )<br /> geo.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, value, is_visualmap=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, visual_range=[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">25</span>], visual_text_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#fff"</span>, symbol_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>)<br /> geo.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"已开通地铁城市分布情况.html"</span>)<br /><br /><br /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_line</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 生成城市地铁线路数量分布情况<br /> """</span><br /> title_len = df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]<br /> bins = [<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">5</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">20</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">25</span>]<br /> level = [<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'0-5'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'5-10'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'10-15'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'15-20'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'20以上'</span>]<br /> len_stage = pd.cut(title_len, bins=bins, labels=level).value_counts().sort_index()<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 生成柱状图</span><br /> attr = len_stage.index<br /> v1 = len_stage.values<br /> bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"各城市地铁线路数量分布"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br /> bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br /> bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"各城市地铁线路数量分布.html"</span>)<br /><br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 各个城市地铁线路数</span><br />df_city = df_line.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>)<br />print(df_city)<br />create_map(df_city)<br />create_line(df_city)</p>
已经开通地铁的城市数据,还有各个城市的地铁线路数。
一共32个城市开通地铁,其中北京、上海线路已经超过了20条。
城市分布情况。
大部分都是省会城市,还有个别经济实力强的城市。
线路数量分布情况。
可以看到大部分还是在「0-5」这个阶段的,当然最少为1条线。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 哪个城市哪条线路地铁站最多</span><br />print(df_line.sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br /></p>
探索一下哪个城市哪条线路地铁站最多。
北京10号线第一,重庆3号线第二。
还是蛮怀念北京1张票,2块钱地铁随便做的时候。
可惜好日子一去不复返了。
去除重复换乘站数据。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 去除重复换乘站的地铁数据</span><br />df_station = df.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]).count().reset_index()<br />print(df_station)<br /></p>
一共包含3034个地铁站,相较新周刊中3447个地铁站数据。
减少了近400个地铁站。
接下来看一下哪个城市地铁站最多。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 统计每个城市包含地铁站数(已去除重复换乘站)</span><br />print(df_station.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br /></p>
32个城市,上海第一,北京第二。
没想到的是,武汉居然有那么多地铁站。
现在来实现一下新周刊中的操作,生成地铁名词云。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_wordcloud</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 生成地铁名词云<br /> """</span><br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 分词</span><br /> text = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">''</span><br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> line <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]:<br /> text += <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">' '</span>.join(jieba.cut(line, cut_all=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br /> text += <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">' '</span><br /> backgroud_Image = plt.imread(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'rocket.jpg'</span>)<br /> wc = WordCloud(<br /> background_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'white'</span>,<br /> mask=backgroud_Image,<br /> font_path=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'C:WindowsFonts华康俪金黑W8.TTF'</span>,<br /> max_words=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1000</span>,<br /> max_font_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">150</span>,<br /> min_font_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>,<br /> prefer_horizontal=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>,<br /> random_state=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">50</span>,<br /> )<br /> wc.generate_from_text(text)<br /> img_colors = ImageColorGenerator(backgroud_Image)<br /> wc.recolor(color_func=img_colors)<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 看看词频高的有哪些</span><br /> process_word = WordCloud.process_text(wc, text)<br /> sort = sorted(process_word.items(), key=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">lambda</span> e: e[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>], reverse=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br /> print(sort[:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">50</span>])<br /> plt.imshow(wc)<br /> plt.axis(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'off'</span>)<br /> wc.to_file(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁名词云.jpg"</span>)<br /> print(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'生成词云成功!'</span>)<br /><br /><br />create_wordcloud(df_station)<br /></p>
词云图如下。
广场、大道、公园占了前三,和新周刊的图片一样,说明分析有效。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;">words = []<br /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> line <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]:<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> line:<br /> <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 将字符串输出一个个中文</span><br /> words.append(i)<br /><br /><br /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">all_np</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(arr)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 统计单字频率<br /> """</span><br /> arr = np.array(arr)<br /> key = np.unique(arr)<br /> result = {}<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> k <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> key:<br /> mask = (arr == k)<br /> arr_new = arr[mask]<br /> v = arr_new.size<br /> result[k] = v<br /> <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">return</span> result<br /><br /><br /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_word</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(word_message)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 生成柱状图<br /> """</span><br /> attr = [j[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>] <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> word_message]<br /> v1 = [j[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>] <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> word_message]<br /> bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"中国地铁站最爱用的字"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br /> bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br /> bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"中国地铁站最爱用的字.html"</span>)<br /><br /><br />word = all_np(words)<br />word_message = sorted(word.items(), key=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">lambda</span> x: x[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>], reverse=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)[:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>]<br />create_word(word_message)<br /></p>
统计一下,大家最喜欢用什么字来命名地铁。
路最多,在此之中上海的占比很大。
不信往下看。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取上海的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'上海'</span>]<br />print(df1)<br /></p>
统计上海所有的地铁站,一共345个。
选取包含路的地铁站。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取上海地铁站名字包含路的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'路'</span>)]<br />print(df2)<br /></p>
有210个,约占上海地铁的三分之二,路的七分之二。
看来上海对路是情有独钟的。
具体缘由这里就不解释了,详情见新周刊的推送,里面还是讲解蛮详细的。
武汉和重庆则是对家这个词特别喜欢。
标志着那片土地开拓者们的籍贯与姓氏。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取武汉的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'武汉'</span>]<br />print(df1)<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取武汉地铁站名字包含家的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'家'</span>)]<br />print(df2)<br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取重庆的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'重庆'</span>]<br />print(df1)<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取重庆地铁站名字包含家的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'家'</span>)]<br />print(df2)</p>
武汉共有17个,重庆共有20个。
看完家之后,再来看一下名字包含门的地铁站。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_door</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(door)</span>:</span><br /> <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br /> 生成柱状图<br /> """</span><br /> attr = [j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> door[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>][:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">3</span>]]<br /> v1 = [j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> door[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>][:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">3</span>]]<br /> bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁站最爱用“门”命名的城市"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br /> bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, yaxis_max=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">40</span>)<br /> bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁站最爱用门命名的城市.html"</span>)<br /><br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取地铁站名字包含门的数据</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 对数据进行分组计数</span><br />df2 = df1.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>)<br />print(df2)<br />create_door(df2)<br /></p>
一共有21个城市,地铁站名包含门。
其中北京,南京,西安作为多朝古都,占去了大部分。
具体的地铁站名数据。
<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取北京的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'北京'</span>]<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取北京地铁站名字包含门的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br />print(df2)<br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取南京的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'南京'</span>]<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取南京地铁站名字包含门的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br />print(df2)<br /><br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取西安的地铁站</span><br />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'西安'</span>]<br /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取西安地铁站名字包含门的数据</span><br />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br />print(df2)<br /></p>
输出如下。
需要源码,点击阅读原文获取
本篇文章来源于: 菜鸟学Python
本文为原创文章,版权归知行编程网所有,欢迎分享本文,转载请保留出处!
你可能也喜欢
- ♥ python如何打开文件10/03
- ♥ 何时使用 Python 类12/10
- ♥ 10分钟,小白也能用Django做个小App!08/04
- ♥ python中的比较操作有哪些01/03
- ♥ python方法的绑定和解除绑定01/05
- ♥ Python if else条件语句详解11/25
内容反馈