知行编程网知行编程网  2022-03-28 20:37 知行编程网 隐藏边栏 |   抢沙发  13 
文章评分 1 次,平均分 5.0

Python分析3034个地铁站,发现中国地铁名字的秘密。


最近看了新周刊的一篇推送,有关地铁名字的分析,链接如下。


我们分析了3447个地铁站,发现了中国城市地名的秘密


于是乎也想着自己去获取数据,然后进行分析一番。


当然分析水平不可能和他们的相比,毕竟文笔摆在那里,也就那点水平。


大家看着乐呵就好,能提高的估摸着也就只有数据的准确性啦。


文中所用到的地铁站数据并没有去重,对于换乘站,含有大量重复。


即使作者一直在强调换乘站占比很小,影响不是很大。


但于我而言,去除重复数据还是比较简单的。


然后照着人家的路子去分析,多学习一下。



/ 01 / 获取分析


地铁信息获取从高德地图上获取。


Python分析3034个地铁站,发现中国地铁名字的秘密。


上面主要获取城市的「id」,cityname名称


用于拼接请求网址,进而获取地铁线路的具体信息。


Python分析3034个地铁站,发现中国地铁名字的秘密。


找到请求信息,获取各个城市的地铁线路以及线路中站点详情。



/ 02 / 数据获取


具体代码如下。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> json<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> requests<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> bs4 <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> BeautifulSoup<br  /><br  />headers = {<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'user-agent'</span>: <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'</span>}<br  /><br  /><br  /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">get_message</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(ID, cityname, name)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    地铁线路信息获取<br  />    """</span><br  />    url = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'http://map.amap.com/service/subway?_1555502190153&srhdata='</span> + ID + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'_drw_'</span> + cityname + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'.json'</span><br  />    response = requests.get(url=url, headers=headers)<br  />    html = response.text<br  />    result = json.loads(html)<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> result[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'l'</span>]:<br  />        <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'st'</span>]:<br  />            <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 判断是否含有地铁分线</span><br  />            <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">if</span> len(i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>]) > <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>:<br  />                print(name, i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'('</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">')'</span>, j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>])<br  />                <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">with</span> open(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a+'</span>, encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>) <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> f:<br  />                    f.write(name + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'('</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'la'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">')'</span> + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>)<br  />            <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">else</span>:<br  />                print(name, i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>], j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>])<br  />                <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">with</span> open(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a+'</span>, encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>) <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> f:<br  />                    f.write(name + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ln'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">','</span> + j[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>] + <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'n'</span>)<br  /><br  /><br  /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">get_city</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">()</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    城市信息获取<br  />    """</span><br  />    url = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'http://map.amap.com/subway/index.html?&1100'</span><br  />    response = requests.get(url=url, headers=headers)<br  />    html = response.text<br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 编码</span><br  />    html = html.encode(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'ISO-8859-1'</span>)<br  />    html = html.decode(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'utf-8'</span>)<br  />    soup = BeautifulSoup(html, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'lxml'</span>)<br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市列表</span><br  />    res1 = soup.find_all(class_=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"city-list fl"</span>)[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>]<br  />    res2 = soup.find_all(class_=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"more-city-list"</span>)[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>]<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> res1.find_all(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a'</span>):<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市ID值</span><br  />        ID = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'id'</span>]<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市拼音名</span><br  />        cityname = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'cityname'</span>]<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市名</span><br  />        name = i.get_text()<br  />        get_message(ID, cityname, name)<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> res2.find_all(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'a'</span>):<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市ID值</span><br  />        ID = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'id'</span>]<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市拼音名</span><br  />        cityname = i[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'cityname'</span>]<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 城市名</span><br  />        name = i.get_text()<br  />        get_message(ID, cityname, name)<br  /><br  /><br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">if</span> __name__ == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'__main__'</span>:<br  />    get_city()<br  /></p>


最后成功获取数据。


Python分析3034个地铁站,发现中国地铁名字的秘密。


包含换乘站数据,一共3541个地铁站点。



/ 03 /  数据可视化


先对数据进行清洗,去除重复的换乘站信息。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> wordcloud <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> WordCloud, ImageColorGenerator<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">from</span> pyecharts <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> Line, Bar<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> matplotlib.pyplot <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> plt<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> pandas <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> pd<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> numpy <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">as</span> np<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">import</span> jieba<br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 设置列名与数据对齐</span><br  />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.unicode.ambiguous_as_wide'</span>, <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br  />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.unicode.east_asian_width'</span>, <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 显示10行</span><br  />pd.set_option(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'display.max_rows'</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>)<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 读取数据</span><br  />df = pd.read_csv(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'subway.csv'</span>, header=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">None</span>, names=[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>], encoding=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'gbk'</span>)<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 各个城市地铁线路情况</span><br  />df_line = df.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]).count().reset_index()<br  />print(df_line)<br  /></p>


通过城市及地铁线路进行分组,得到全国地铁线路总数。


Python分析3034个地铁站,发现中国地铁名字的秘密。


一共183条地铁线路。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_map</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 绘制地图</span><br  />    value = [i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]]<br  />    attr = [i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]]<br  />    geo = Geo(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"已开通地铁城市分布情况"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'0'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>, title_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#fff"</span>, background_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#404a59"</span>, )<br  />    geo.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, value, is_visualmap=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, visual_range=[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">25</span>], visual_text_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"#fff"</span>, symbol_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>)<br  />    geo.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"已开通地铁城市分布情况.html"</span>)<br  /><br  /><br  /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_line</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    生成城市地铁线路数量分布情况<br  />    """</span><br  />    title_len = df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>]<br  />    bins = [<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">5</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">20</span>, <span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">25</span>]<br  />    level = [<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'0-5'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'5-10'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'10-15'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'15-20'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'20以上'</span>]<br  />    len_stage = pd.cut(title_len, bins=bins, labels=level).value_counts().sort_index()<br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 生成柱状图</span><br  />    attr = len_stage.index<br  />    v1 = len_stage.values<br  />    bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"各城市地铁线路数量分布"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br  />    bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br  />    bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"各城市地铁线路数量分布.html"</span>)<br  /><br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 各个城市地铁线路数</span><br  />df_city = df_line.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>)<br  />print(df_city)<br  />create_map(df_city)<br  />create_line(df_city)</p>


已经开通地铁的城市数据,还有各个城市的地铁线路数。


Python分析3034个地铁站,发现中国地铁名字的秘密。


一共32个城市开通地铁,其中北京、上海线路已经超过了20条。


城市分布情况。


Python分析3034个地铁站,发现中国地铁名字的秘密。


大部分都是省会城市,还有个别经济实力强的城市。


线路数量分布情况。


Python分析3034个地铁站,发现中国地铁名字的秘密。


可以看到大部分还是在「0-5」这个阶段的,当然最少为1条线。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 哪个城市哪条线路地铁站最多</span><br  />print(df_line.sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br  /></p>


探索一下哪个城市哪条线路地铁站最多。


Python分析3034个地铁站,发现中国地铁名字的秘密。


北京10号线第一,重庆3号线第二。


Python分析3034个地铁站,发现中国地铁名字的秘密。

Python分析3034个地铁站,发现中国地铁名字的秘密。


还是蛮怀念北京1张票,2块钱地铁随便做的时候。


可惜好日子一去不复返了。


去除重复换乘站数据。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 去除重复换乘站的地铁数据</span><br  />df_station = df.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>, <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]).count().reset_index()<br  />print(df_station)<br  /></p>


一共包含3034个地铁站,相较新周刊中3447个地铁站数据。


减少了近400个地铁站。


Python分析3034个地铁站,发现中国地铁名字的秘密。


接下来看一下哪个城市地铁站最多。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 统计每个城市包含地铁站数(已去除重复换乘站)</span><br  />print(df_station.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br  /></p>


32个城市,上海第一,北京第二。


没想到的是,武汉居然有那么多地铁站。


Python分析3034个地铁站,发现中国地铁名字的秘密。


现在来实现一下新周刊中的操作,生成地铁名词云。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_wordcloud</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(df)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    生成地铁名词云<br  />    """</span><br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 分词</span><br  />    text = <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">''</span><br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> line <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]:<br  />        text += <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">' '</span>.join(jieba.cut(line, cut_all=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>))<br  />        text += <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">' '</span><br  />    backgroud_Image = plt.imread(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'rocket.jpg'</span>)<br  />    wc = WordCloud(<br  />        background_color=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'white'</span>,<br  />        mask=backgroud_Image,<br  />        font_path=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'C:WindowsFonts华康俪金黑W8.TTF'</span>,<br  />        max_words=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1000</span>,<br  />        max_font_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">150</span>,<br  />        min_font_size=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">15</span>,<br  />        prefer_horizontal=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>,<br  />        random_state=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">50</span>,<br  />    )<br  />    wc.generate_from_text(text)<br  />    img_colors = ImageColorGenerator(backgroud_Image)<br  />    wc.recolor(color_func=img_colors)<br  />    <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 看看词频高的有哪些</span><br  />    process_word = WordCloud.process_text(wc, text)<br  />    sort = sorted(process_word.items(), key=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">lambda</span> e: e[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>], reverse=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br  />    print(sort[:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">50</span>])<br  />    plt.imshow(wc)<br  />    plt.axis(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'off'</span>)<br  />    wc.to_file(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁名词云.jpg"</span>)<br  />    print(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'生成词云成功!'</span>)<br  /><br  /><br  />create_wordcloud(df_station)<br  /></p>


词云图如下。


Python分析3034个地铁站,发现中国地铁名字的秘密。


广场、大道、公园占了前三,和新周刊的图片一样,说明分析有效。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;">words = []<br  /><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> line <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> df[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>]:<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> i <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> line:<br  />        <span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 将字符串输出一个个中文</span><br  />        words.append(i)<br  /><br  /><br  /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">all_np</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(arr)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    统计单字频率<br  />    """</span><br  />    arr = np.array(arr)<br  />    key = np.unique(arr)<br  />    result = {}<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> k <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> key:<br  />        mask = (arr == k)<br  />        arr_new = arr[mask]<br  />        v = arr_new.size<br  />        result[k] = v<br  />    <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">return</span> result<br  /><br  /><br  /><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_word</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(word_message)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    生成柱状图<br  />    """</span><br  />    attr = [j[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">0</span>] <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> word_message]<br  />    v1 = [j[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>] <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> word_message]<br  />    bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"中国地铁站最爱用的字"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br  />    bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)<br  />    bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"中国地铁站最爱用的字.html"</span>)<br  /><br  /><br  />word = all_np(words)<br  />word_message = sorted(word.items(), key=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">lambda</span> x: x[<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">1</span>], reverse=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>)[:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">10</span>]<br  />create_word(word_message)<br  /></p>


统计一下,大家最喜欢用什么字来命名地铁。


Python分析3034个地铁站,发现中国地铁名字的秘密。


路最多,在此之中上海的占比很大。


不信往下看。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取上海的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'上海'</span>]<br  />print(df1)<br  /></p>


统计上海所有的地铁站,一共345个。


Python分析3034个地铁站,发现中国地铁名字的秘密。


选取包含路的地铁站。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取上海地铁站名字包含路的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'路'</span>)]<br  />print(df2)<br  /></p>


有210个,约占上海地铁的三分之二,路的七分之二。


Python分析3034个地铁站,发现中国地铁名字的秘密。


看来上海对是情有独钟的。


具体缘由这里就不解释了,详情见新周刊的推送,里面还是讲解蛮详细的。


武汉和重庆则是对这个词特别喜欢。


标志着那片土地开拓者们的籍贯与姓氏。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取武汉的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'武汉'</span>]<br  />print(df1)<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取武汉地铁站名字包含家的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'家'</span>)]<br  />print(df2)<br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取重庆的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'重庆'</span>]<br  />print(df1)<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取重庆地铁站名字包含家的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'家'</span>)]<br  />print(df2)</p>


武汉共有17个,重庆共有20个。


Python分析3034个地铁站,发现中国地铁名字的秘密。

Python分析3034个地铁站,发现中国地铁名字的秘密。


看完家之后,再来看一下名字包含的地铁站。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-function" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit;line-height: inherit;word-wrap: inherit !important;word-break: inherit !important;">def</span> <span class="hljs-title" style="font-size: inherit;line-height: inherit;color: rgb(165, 218, 45);word-wrap: inherit !important;word-break: inherit !important;">create_door</span><span class="hljs-params" style="font-size: inherit;line-height: inherit;color: rgb(255, 152, 35);word-wrap: inherit !important;word-break: inherit !important;">(door)</span>:</span><br  />    <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"""<br  />    生成柱状图<br  />    """</span><br  />    attr = [j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> door[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>][:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">3</span>]]<br  />    v1 = [j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">for</span> j <span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">in</span> door[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>][:<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">3</span>]]<br  />    bar = Bar(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁站最爱用“门”命名的城市"</span>, title_pos=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'center'</span>, title_top=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'18'</span>, width=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">800</span>, height=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">400</span>)<br  />    bar.add(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">""</span>, attr, v1, is_stack=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, is_label_show=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">True</span>, yaxis_max=<span class="hljs-number" style="font-size: inherit;line-height: inherit;color: rgb(174, 135, 250);word-wrap: inherit !important;word-break: inherit !important;">40</span>)<br  />    bar.render(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">"地铁站最爱用门命名的城市.html"</span>)<br  /><br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取地铁站名字包含门的数据</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 对数据进行分组计数</span><br  />df2 = df1.groupby([<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>]).count().reset_index().sort_values(by=<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'line'</span>, ascending=<span class="hljs-keyword" style="font-size: inherit;line-height: inherit;color: rgb(248, 35, 117);word-wrap: inherit !important;word-break: inherit !important;">False</span>)<br  />print(df2)<br  />create_door(df2)<br  /></p>


一共有21个城市,地铁站名包含门。


Python分析3034个地铁站,发现中国地铁名字的秘密。


其中北京,南京,西安作为多朝古都,占去了大部分。


Python分析3034个地铁站,发现中国地铁名字的秘密。


具体的地铁站名数据。


<p style="overflow-wrap: break-word;line-height: 15px;font-size: 11px;word-spacing: -3px;letter-spacing: 0px;font-family: Consolas, Inconsolata, Courier, monospace;border-radius: 0px;color: rgb(169, 183, 198);background: rgb(40, 43, 46);padding: 0.5em;margin-left: 8px;margin-right: 8px;display: block !important;word-wrap: normal !important;word-break: normal !important;overflow: auto !important;"><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取北京的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'北京'</span>]<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取北京地铁站名字包含门的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br  />print(df2)<br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取南京的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'南京'</span>]<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取南京地铁站名字包含门的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br  />print(df2)<br  /><br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取西安的地铁站</span><br  />df1 = df_station[df_station[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'city'</span>] == <span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'西安'</span>]<br  /><span class="hljs-comment" style="font-size: inherit;line-height: inherit;color: rgb(128, 128, 128);word-wrap: inherit !important;word-break: inherit !important;"># 选取西安地铁站名字包含门的数据</span><br  />df2 = df1[df1[<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'station'</span>].str.contains(<span class="hljs-string" style="font-size: inherit;line-height: inherit;color: rgb(238, 220, 112);word-wrap: inherit !important;word-break: inherit !important;">'门'</span>)]<br  />print(df2)<br  /></p>


输出如下。


Python分析3034个地铁站,发现中国地铁名字的秘密。


需要源码,点击阅读原文获取

本篇文章来源于: 菜鸟学Python

本文为原创文章,版权归所有,欢迎分享本文,转载请保留出处!

知行编程网
知行编程网 关注:1    粉丝:1
这个人很懒,什么都没写

发表评论

表情 格式 链接 私密 签到
扫一扫二维码分享