├── travel_data.xls ├── image ├── growth_bar.JPG ├── export_polar.JPG ├── inbound_map.JPG ├── inbound_top.JPG ├── outbound_map.JPG ├── outbound_top.JPG ├── receipts_bar.JPG ├── data_cleaning1.JPG ├── data_cleaning2.JPG ├── expenditure_bar.JPG └── outbound_population_compare.JPG ├── README.md ├── page3.html ├── js └── function.js ├── page4.html ├── css ├── stylesheet2.css ├── stylesheet1.css └── stylesheet.css ├── index.html ├── page2.html ├── page1.html └── project3_ChenChen.py /travel_data.xls: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/travel_data.xls -------------------------------------------------------------------------------- /image/growth_bar.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/growth_bar.JPG -------------------------------------------------------------------------------- /image/export_polar.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/export_polar.JPG -------------------------------------------------------------------------------- /image/inbound_map.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/inbound_map.JPG -------------------------------------------------------------------------------- /image/inbound_top.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/inbound_top.JPG -------------------------------------------------------------------------------- /image/outbound_map.JPG: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/outbound_map.JPG -------------------------------------------------------------------------------- /image/outbound_top.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/outbound_top.JPG -------------------------------------------------------------------------------- /image/receipts_bar.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/receipts_bar.JPG -------------------------------------------------------------------------------- /image/data_cleaning1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/data_cleaning1.JPG -------------------------------------------------------------------------------- /image/data_cleaning2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/data_cleaning2.JPG -------------------------------------------------------------------------------- /image/expenditure_bar.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/expenditure_bar.JPG -------------------------------------------------------------------------------- /image/outbound_population_compare.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ChenChenDS/Tourism-data-analysis/HEAD/image/outbound_population_compare.JPG -------------------------------------------------------------------------------- /README.md: 
--------------------------------------------------------------------------------
1 | # DATS6103-project3
2 | ## This is a data mining project about world tourism analysis.
3 |
4 | The international travel & tourism sector is growing at a fast pace. The World Travel and Tourism Council (WTTC) report found that the travel & tourism industry generated 10.4 percent of all global economic growth. Tourism is not only a form of leisure; it is also essential to a country's economy. In this analysis, I investigate how the tourism sector has evolved and identify the top tourist-sending countries and the countries that are most attractive to international tourists.
5 |
6 | ### Language
7 | python
8 | #### packages
9 | pandas
10 | numpy
11 | plotly
12 | plotly_express
13 | 14 | ### Files 15 | Excel
16 | ipynb
17 | html
18 |
19 | ### Visualization link:
20 | https://chenchends.github.io/Tourism-data-analysis/
21 |
22 | ### Instructions
23 | • Put the Excel file and the Python script in the same folder
24 | • A Plotly account is needed to run the code
25 | • Google Chrome is recommended
26 |
27 | ### Author: Chen Chen
28 |
29 |
-------------------------------------------------------------------------------- /page3.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    9 |

    World tourism analysis

    10 |
    11 | 12 |
    13 | Home page 14 | inbound tourism analysis 15 | outbound tourists analysis 16 | GDP and tourism 17 | Summary 18 |
    19 | 20 |
    21 |
    22 |
    23 |

    Correlation between GDP and inbound tourists (number of people)

    24 | 25 |

    There is a positive correlation between GDP and the number of inbound tourists, which suggests that development of the tourism industry benefits a country's GDP.
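
    A minimal sketch of how such a correlation could be quantified, assuming the GDP and arrival figures are available as two aligned arrays. The numbers below are synthetic and chosen only for illustration, and the names `gdp`, `arrivals`, and `r` are hypothetical. The values are log-transformed first, matching the log-log axes of the scatter plot.

```python
import numpy as np

# Synthetic values for illustration only, not taken from travel_data.xls.
gdp = np.array([1e9, 1e10, 1e11, 1e12])
arrivals = np.array([2e5, 1e6, 9e6, 4e7])

# Pearson correlation on a log scale, matching the log-log scatter plot.
r = np.corrcoef(np.log10(arrivals), np.log10(gdp))[0, 1]
print(f"correlation of log(arrivals) and log(GDP): {r:.3f}")
```

    A coefficient close to 1 indicates a strong positive linear relationship on the log scale; it does not by itself show which variable drives the other.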

    26 |
    27 |
    28 |
    29 |
    30 |

    Correlation between GDP and outbound tourists (number of people)

    31 | 32 |

    There is a positive correlation between GDP and the number of outbound tourists, which suggests that more people choose to travel abroad as the economy grows.

    33 |
    34 |
    35 |
    36 | 37 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /js/function.js: -------------------------------------------------------------------------------- 1 | function myFunction() { 2 | var x = document.getElementById("myDIV"); 3 | if (x.style.display == "none") { 4 | x.style.display = "block"; 5 | } else { 6 | x.style.display = "none"; 7 | } 8 | } 9 | 10 | function myFunction1() { 11 | var x = document.getElementById("myDIV1"); 12 | if (x.style.display == "none") { 13 | x.style.display = "block"; 14 | } else { 15 | x.style.display = "none"; 16 | } 17 | } 18 | 19 | function myFunction2() { 20 | var x = document.getElementById("myDIV2"); 21 | if (x.style.display == "none") { 22 | x.style.display = "block"; 23 | } else { 24 | x.style.display = "none"; 25 | } 26 | } 27 | 28 | function myFunction3() { 29 | var x = document.getElementById("myDIV3"); 30 | if (x.style.display == "none") { 31 | x.style.display = "block"; 32 | } else { 33 | x.style.display = "none"; 34 | } 35 | } 36 | 37 | function myFunction4() { 38 | var x = document.getElementById("myDIV4"); 39 | if (x.style.display == "none") { 40 | x.style.display = "block"; 41 | } else { 42 | x.style.display = "none"; 43 | } 44 | } 45 | 46 | function myFunction5() { 47 | var x = document.getElementById("myDIV5"); 48 | if (x.style.display == "none") { 49 | x.style.display = "block"; 50 | } else { 51 | x.style.display = "none"; 52 | } 53 | } 54 | 55 | function myFunction6() { 56 | var x = document.getElementById("myDIV6"); 57 | if (x.style.display == "none") { 58 | x.style.display = "block"; 59 | } else { 60 | x.style.display = "none"; 61 | } 62 | } 63 | 64 | function myFunction7() { 65 | var x = document.getElementById("myDIV7"); 66 | if (x.style.display == "none") { 67 | x.style.display = "block"; 68 | } else { 69 | x.style.display = "none"; 70 | } 71 | } 72 | -------------------------------------------------------------------------------- /page4.html: 
-------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    9 |

    World tourism analysis

    10 |
    11 | 12 |
    13 | Home page 14 | inbound tourism analysis 15 | outbound tourists analysis 16 | GDP and tourism 17 | Summary 18 |
    19 | 20 |
    21 |
    22 |
    23 |

    Summary

    24 |

  • Europe, North America, and East Asia are the top tourist destinations, and people in these regions also travel abroad the most.
  • 25 |

  • Small coastal or island countries with attractive scenery are more likely to be tourism-oriented.
  • 26 |

  • The total number of people traveling abroad from China has increased dramatically, but in per capita terms Hong Kong stands out the most.

    27 |

  • GDP is highly correlated with both inbound and outbound tourist numbers. The tourism industry is essential to a country's economy.
  • 28 |
    29 |
    30 |

    Reference

    31 |
    Mexico plans for long-term tourism growth. Oxford Business Group. Available at: https://oxfordbusinessgroup.com/overview/looking-ahead-authorities-are-laying-foundation-long-term-growth
    32 |
    Turkey's failed coup attempt: All you need to know. (2017). Al Jazeera. Available at: https://www.aljazeera.com/news/2016/12/turkey-failed-coup-attempt-161217032345594.html
    33 |
    Leposa, A. (2019). Stats: Travel Industry Second-Fastest Growing Sector in the World. Travel Agent Central. Available at: https://www.travelagentcentral.com/running-your-business/stats-travel-industry-second-fastest-growing-sector-world
    34 |
    35 |
    36 |
    37 | 38 | 42 | 43 | 44 | 45 | -------------------------------------------------------------------------------- /css/stylesheet2.css: -------------------------------------------------------------------------------- 1 | * { 2 | box-sizing: border-box; 3 | } 4 | body { 5 | font-family: Arial; 6 | padding: 20px; 7 | background: #f1f1f1; 8 | } 9 | 10 | /* Header/Blog Title */ 11 | .header { 12 | padding: 30px; 13 | text-align: center; 14 | background: #ADD8E6; 15 | } 16 | 17 | .header h1 { 18 | font-size: 50px; 19 | } 20 | 21 | /* Style the top navigation bar */ 22 | .topnav { 23 | background-color: #708090; 24 | overflow: hidden; 25 | position: -webkit-sticky; 26 | position: sticky; 27 | top: 0; 28 | } 29 | 30 | /* Style the topnav links */ 31 | .topnav a { 32 | float: left; 33 | display: block; 34 | color: #FFDEAD; 35 | text-align: center; 36 | padding: 14px 16px; 37 | text-decoration: none; 38 | border-right: 1px solid white; 39 | } 40 | 41 | .topnav a:last-child { 42 | border-right: none; 43 | } 44 | 45 | /* Change color on hover */ 46 | .topnav a:hover { 47 | background-color: #ddd; 48 | color: black; 49 | } 50 | 51 | /* Create two unequal columns that floats next to each other */ 52 | /* Left column */ 53 | .leftcolumn { 54 | float: left; 55 | width: 100%; 56 | } 57 | 58 | /* Right column */ 59 | .rightcolumn { 60 | float: left; 61 | width: 0%; 62 | background-color: #f1f1f1; 63 | padding-left: 20px; 64 | } 65 | 66 | /* Add a card effect for articles */ 67 | .card { 68 | background-color: white; 69 | padding: 20px 100px 20px 100px; 70 | margin-top: 20px; 71 | } 72 | /* Clear floats after the columns */ 73 | .row:after { 74 | content: ""; 75 | display: table; 76 | clear: both; 77 | } 78 | 79 | /* Footer */ 80 | .footer { 81 | padding: 20px; 82 | text-align: center; 83 | background: #E6B0AA; 84 | margin-top: 20px; 85 | color: #571B7E ; 86 | } 87 | 88 | /* Responsive layout - when the screen is less than 800px wide, make the two columns stack on top of each other 
instead of next to each other */ 89 | @media screen and (max-width: 800px) { 90 | .leftcolumn, .rightcolumn { 91 | width: 100%; 92 | padding: 0; 93 | } 94 | } 95 | 96 | /* Responsive layout - when the screen is less than 400px wide, make the navigation links stack on top of each other instead of next to each other */ 97 | @media screen and (max-width: 400px) { 98 | .topnav a { 99 | float: none; 100 | width: 100%; 101 | } 102 | } -------------------------------------------------------------------------------- /css/stylesheet1.css: -------------------------------------------------------------------------------- 1 | * { 2 | box-sizing: border-box; 3 | } 4 | body { 5 | font-family: Arial; 6 | padding: 20px; 7 | background: #f1f1f1; 8 | } 9 | 10 | /* Header/Blog Title */ 11 | .header { 12 | padding: 30px; 13 | text-align: center; 14 | background: #ADD8E6; 15 | } 16 | 17 | .header h1 { 18 | font-size: 50px; 19 | } 20 | 21 | /* Style the top navigation bar */ 22 | .topnav { 23 | background-color: #708090; 24 | overflow: hidden; 25 | position: -webkit-sticky; 26 | position: sticky; 27 | top: 0; 28 | } 29 | 30 | /* Style the topnav links */ 31 | .topnav a { 32 | float: left; 33 | display: block; 34 | color: #FFDEAD; 35 | text-align: center; 36 | padding: 14px 16px; 37 | text-decoration: none; 38 | border-right: 1px solid white; 39 | } 40 | 41 | .topnav a:last-child { 42 | border-right: none; 43 | } 44 | 45 | /* Change color on hover */ 46 | .topnav a:hover { 47 | background-color: #ddd; 48 | color: black; 49 | } 50 | 51 | /* Create two equal columns that floats next to each other */ 52 | /* Left column */ 53 | .leftcolumn { 54 | float: left; 55 | width: 50%; 56 | padding-left: 20px; 57 | } 58 | 59 | /* Right column */ 60 | .rightcolumn { 61 | float: left; 62 | width: 50%; 63 | background-color: #f1f1f1; 64 | padding-left: 20px; 65 | } 66 | 67 | /* Add a card effect for articles */ 68 | .card { 69 | background-color: white; 70 | padding: 30px; 71 | margin-top: 20px; 72 | } 
73 | /* Clear floats after the columns */ 74 | .row:after { 75 | content: ""; 76 | display: table; 77 | clear: both; 78 | } 79 | 80 | /* Footer */ 81 | .footer { 82 | padding: 20px; 83 | text-align: center; 84 | background: #E6B0AA; 85 | margin-top: 20px; 86 | color: #571B7E ; 87 | } 88 | 89 | /* Responsive layout - when the screen is less than 800px wide, make the two columns stack on top of each other instead of next to each other */ 90 | @media screen and (max-width: 800px) { 91 | .leftcolumn, .rightcolumn { 92 | width: 100%; 93 | padding: 0; 94 | } 95 | } 96 | 97 | /* Responsive layout - when the screen is less than 400px wide, make the navigation links stack on top of each other instead of next to each other */ 98 | @media screen and (max-width: 400px) { 99 | .topnav a { 100 | float: none; 101 | width: 100%; 102 | } 103 | } -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    9 |

    World tourism analysis

    10 |
    11 | 12 |
    13 | Home page 14 | inbound tourism analysis 15 | outbound tourists analysis 16 | GDP and tourism 17 | Summary 18 |
    19 | 20 |
    21 |
    22 |
    23 |

    Why study tourism?

    24 |

    The international travel & tourism sector is growing at a fast pace. The World Travel and Tourism Council (WTTC) report found that the travel & tourism industry generated 10.4 percent of all global economic growth. Tourism is not only a form of leisure; it is also essential to a country's economy. In this analysis, we investigate how the tourism sector has evolved and identify the top tourist-sending countries and the countries that are most attractive to international tourists.

    25 |
    26 |
    27 |

    Data Source

    28 |
    29 |

    The dataset comes from the World Bank. Datalink

    30 |
  • World Tourism Organization, Yearbook of Tourism Statistics, Compendium of Tourism Statistics and data files.
  • 31 |
    32 |

    The database includes data from more than 200 countries for the period 1995-2017

    33 |
    34 |

    There are seven datasets

    35 |
  • Number_of_arrival
  • 36 |
  • Number_of_departure
  • 37 |
  • Expenditure for travel item
  • 38 |
  • Receipts for travel items
  • 39 |
  • Receipts (% of total exports)
  • 40 |
  • Population
  • 41 |
  • GDP
  • 42 |
    43 |
    44 |
    45 |
    46 |

    Data cleaning

    47 |
    48 |
      49 |

    1. Select the data sheets to be used in the analysis and combine them into one Excel file (using Excel)
    2. Define two functions to select the relevant data and reshape the data frame
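
    The two cleaning functions can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's exact code: the tiny data frames, the country codes `AAA`, `BBB`, and `AGG`, and the function names `select_data` and `reshape_long` are hypothetical. It uses `melt` rather than the `stack` call in the project script, and it drops missing values instead of replacing them with 0, so that a missing year is not mistaken for zero tourists.

```python
import pandas as pd

# Hypothetical stand-ins for the "country code" sheet and one indicator sheet.
country_list = pd.DataFrame(
    {"Country Name": ["Country A", "Country B"], "Country Code": ["AAA", "BBB"]}
)
arrivals = pd.DataFrame(
    {
        "Country Code": ["AAA", "BBB", "AGG"],  # "AGG" stands for an aggregate row
        "2016": [10.0, 8.0, 100.0],
        "2017": [12.0, None, 110.0],
    }
)


def select_data(df, countries, first_year, last_year):
    """Keep only listed countries and the year columns in the given range."""
    merged = countries.merge(df, on="Country Code", how="left")
    merged = merged.set_index(["Country Name", "Country Code"])
    return merged.loc[:, first_year:last_year]


def reshape_long(df):
    """Turn wide year columns into one row per country and year."""
    long_df = df.reset_index().melt(
        id_vars=["Country Name", "Country Code"],
        var_name="Year",
        value_name="Amount",
    )
    # Drop missing observations explicitly instead of filling them with 0.
    return long_df.dropna(subset=["Amount"]).reset_index(drop=True)


tidy = reshape_long(select_data(arrivals, country_list, "2016", "2017"))
print(tidy)
```

    The long format, with one row per country and year, is the shape that the animated map and line charts expect as input.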
    55 |
    56 |
    57 |
    58 | 59 | 63 | 64 | 65 | 66 | -------------------------------------------------------------------------------- /page2.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    8 |

    World tourism analysis

    9 |
    10 | 11 |
    12 | Home page 13 | inbound tourism analysis 14 | outbound tourists analysis 15 | GDP and tourism 16 | Summary 17 |
    18 | 19 |
    20 |
    21 |
    22 |

    World outbound tourists distribution

    23 | 24 | 25 |
    26 | 27 |

    The world map shows that from 2007 to 2017, people in Europe, North America, and East Asia traveled abroad the most. China had the fastest-growing number of outbound travelers.

    28 |
    29 | 30 |
    31 |

    Top countries where people like to travel

    32 | 33 | 34 |
    35 | 36 |

    The line chart plots the 10 countries whose residents travel abroad the most. Again, 6 of the 10 are European countries. The number of people traveling abroad from China tripled within 10 years, owing to the relaxation of tourist visa rules for Chinese citizens. Germany and Hong Kong alternate between second and third place.

    37 |
    38 | 39 |
    40 |

    Outbound tourist numbers compared with each country's population

    41 | 42 | 43 |
    44 | 45 |

    The multi-plot compares the number of outbound tourists with each country's population. Although China has the largest number of outbound tourists, that number is only 10% of its total population. For the UK, Poland, Hong Kong, and Germany, the number of outbound tourists exceeds the population, which means that the number of outbound trips per capita in these countries is over 1.
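
    The per capita comparison amounts to dividing departures by population for each country. A minimal sketch, assuming both quantities are held in pandas Series indexed by country; the figures and the names `departures`, `population`, and `trips_per_capita` are illustrative and not taken from the dataset.

```python
import pandas as pd

# Illustrative figures only, not values from travel_data.xls.
departures = pd.Series({"Country A": 140_000_000, "Country B": 90_000_000})
population = pd.Series({"Country A": 1_400_000_000, "Country B": 80_000_000})

# Series are aligned on their index labels before the division.
trips_per_capita = departures / population

# Countries whose residents take more than one outbound trip per year on average.
frequent_travellers = trips_per_capita[trips_per_capita > 1].index.tolist()
print(trips_per_capita.round(3).to_dict(), frequent_travellers)
```

    A ratio above 1 does not mean every resident travels; it only means total trips exceed the head count, which is typical where short cross-border trips are common.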

    46 |
    47 | 48 |
    49 |

    Tourists from which countries spend the most (excluding international transportation)

    50 | 51 | 52 |
    53 | 54 |

    The stacked bar plot shows per capita spending by travelers from the top countries. Before 2014, French and American travelers spent the most; after 2014, Chinese travelers spent the most. Travelers from Poland and Hong Kong spent much less than those from the other countries. Recalling the previous plot, in which these two had very high numbers of trips per capita, this suggests that people there take many short-distance trips.

    55 |
    56 |
    57 |
    58 | 59 | 63 | 64 | 65 | 66 | -------------------------------------------------------------------------------- /page1.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    9 |

    World tourism analysis

    10 |
    11 | 12 |
    13 | Home page 14 | inbound tourism analysis 15 | outbound tourists analysis 16 | GDP and tourism 17 | Summary 18 |
    19 | 20 |
    21 |
    22 |
    23 |

    World inbound tourists distribution

    24 | 25 | 26 |
    27 | 28 |

    The world map shows that from 2007 to 2017, the major destinations chosen by tourists did not change much. Europe, North America, and China were popular destinations throughout these ten years.

    29 |
    30 |
    31 |

    Top countries that attract tourists

    32 | 33 | 34 |
    35 | 36 |

    The line chart shows the top 10 destinations chosen by tourists during these ten years; 6 of the 10 countries are in Europe. The most popular country is France, which is not surprising since its historical heritage and romantic stories are famous all over the world. The number of people traveling to Spain and the US increased steadily, and in 2017 Spain surpassed the US to become the second most popular destination.

    37 |
    38 |
    39 |

    Top attractive countries inbound tourists yearly growth rate

    40 | 41 | 42 |
    43 | 44 |

    The grouped bar plot shows the yearly percentage growth of inbound tourists. In 2008 and 2009 most countries experienced negative growth because of the global financial crisis of 2007 to 2009. In 2016, Turkey's inbound tourist numbers dropped significantly because of the coup attempt. The growth of Mexico in 2014 was due to domestic policy uncertainty.
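
    The yearly growth rate is the change from the previous year divided by the previous year's value, times 100. A minimal sketch of that calculation, assuming a wide table with one column per year; the numbers and the helper name `yearly_growth` are made up for illustration.

```python
import pandas as pd

# Made-up arrival counts (in millions) for illustration only.
arrivals = pd.DataFrame(
    {"2007": [100.0, 50.0], "2008": [90.0, 55.0], "2009": [99.0, 55.0]},
    index=["Country A", "Country B"],
)


def yearly_growth(wide):
    """Percentage change of each year column relative to the column before it."""
    previous = wide.shift(axis=1)  # each column now holds the previous year's value
    growth = (wide - previous) / previous * 100
    return growth.iloc[:, 1:]  # the first year has no predecessor


growth = yearly_growth(arrivals)
print(growth.round(1))
```

    Negative values correspond to the bars below zero in the grouped bar plot.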

    45 |
    46 |
    47 |

    Top countries ranked by tourism share of exports

    48 | 49 | 50 |
    51 | 52 |

    The polar bar chart shows the top tourism-oriented countries, that is, countries whose exports depend mainly on the tourism industry. Most of them are small coastal countries. In recent years their exports have relied increasingly on tourism: in every one of them, tourism receipts account for over 67% of exports, and in the top three, China (Macao), Grenada, and Maldives, they account for over 83%.

    53 |
    54 |
    55 |
    56 | 57 | 61 | 62 | 63 | 64 | -------------------------------------------------------------------------------- /css/stylesheet.css: -------------------------------------------------------------------------------- 1 | * { 2 | box-sizing: border-box; 3 | } 4 | body { 5 | font-family: Arial; 6 | padding: 20px; 7 | background: #f1f1f1; 8 | } 9 | 10 | /* Header/Blog Title */ 11 | .header { 12 | padding: 30px; 13 | text-align: center; 14 | background: #ADD8E6; 15 | } 16 | 17 | .header h1 { 18 | font-size: 50px; 19 | } 20 | 21 | /* Style the top navigation bar */ 22 | .topnav { 23 | background-color: #708090; 24 | overflow: hidden; 25 | position: -webkit-sticky; 26 | position: sticky; 27 | top: 0; 28 | } 29 | 30 | /* Style the topnav links */ 31 | .topnav a { 32 | float: left; 33 | display: block; 34 | color: #FFDEAD; 35 | text-align: center; 36 | padding: 14px 16px; 37 | text-decoration: none; 38 | border-right: 1px solid white; 39 | } 40 | 41 | .topnav a:last-child { 42 | border-right: none; 43 | } 44 | 45 | /* Change color on hover */ 46 | .topnav a:hover { 47 | background-color: #ddd; 48 | color: black; 49 | } 50 | 51 | /* Create column */ 52 | column { 53 | float: left; 54 | width: 100%; 55 | background-color: #f1f1f1; 56 | padding-left: 20px; 57 | } 58 | 59 | /* Add a card effect for articles */ 60 | .card { 61 | background-color: white; 62 | padding: 30px; 63 | margin-top: 20px; 64 | } 65 | /* Clear floats after the columns */ 66 | .row:after { 67 | content: ""; 68 | display: table; 69 | clear: both; 70 | } 71 | 72 | /* button */ 73 | #myDIV { 74 | display:none; 75 | width: 100%; 76 | padding: 20px; 77 | text-align:center; 78 | background-color: white; 79 | color:#A6ACAF; 80 | margin-top: 0px; 81 | } 82 | 83 | #myDIV1 { 84 | display:none; 85 | width: 100%; 86 | padding: 20px; 87 | text-align:center; 88 | background-color: white; 89 | color:#A6ACAF; 90 | margin-top: 0px; 91 | } 92 | 93 | #myDIV2 { 94 | display:none; 95 | width: 100%; 96 | padding: 20px; 97 | 
text-align:center; 98 | background-color: white; 99 | color:#A6ACAF; 100 | margin-top: 0px; 101 | } 102 | 103 | #myDIV3 { 104 | display:none; 105 | width: 100%; 106 | padding: 20px; 107 | text-align:center; 108 | background-color: white; 109 | color:#A6ACAF; 110 | margin-top: 0px; 111 | } 112 | 113 | #myDIV4 { 114 | display:none; 115 | width: 100%; 116 | padding: 20px; 117 | text-align:center; 118 | background-color: white; 119 | color:#A6ACAF; 120 | margin-top: 0px; 121 | } 122 | 123 | #myDIV5 { 124 | display:none; 125 | width: 100%; 126 | padding: 20px; 127 | text-align:center; 128 | background-color: white; 129 | color:#A6ACAF; 130 | margin-top: 0px; 131 | } 132 | 133 | #myDIV6 { 134 | display:none; 135 | width: 100%; 136 | padding: 20px; 137 | text-align:center; 138 | background-color: white; 139 | color:#A6ACAF; 140 | margin-top: 0px; 141 | } 142 | 143 | #myDIV7 { 144 | display:none; 145 | width: 100%; 146 | padding: 20px; 147 | text-align:center; 148 | background-color: white; 149 | color:#A6ACAF; 150 | margin-top: 0px; 151 | } 152 | 153 | #myDIV8 { 154 | display:none; 155 | width: 100%; 156 | padding: 20px; 157 | text-align:center; 158 | background-color: white; 159 | color:#A6ACAF; 160 | margin-top: 0px; 161 | } 162 | 163 | #show-code{ 164 | background: #FAD7A0; 165 | color: black ; 166 | display: block; 167 | width: 135px; 168 | font-size: 15px; 169 | padding: 5px; 170 | text-align:left; 171 | margin:5px; 172 | cursor: pointer; 173 | } 174 | 175 | /* Footer */ 176 | .footer { 177 | padding: 20px; 178 | text-align: center; 179 | background: #E6B0AA; 180 | margin-top: 20px; 181 | color: #571B7E ; 182 | } 183 | 184 | /* Responsive layout - when the screen is less than 800px wide, make the two columns stack on top of each other instead of next to each other */ 185 | @media screen and (max-width: 800px) { 186 | .leftcolumn, .rightcolumn { 187 | width: 100%; 188 | padding: 0; 189 | } 190 | } 191 | 192 | /* Responsive layout - when the screen is less than 
400px wide, make the navigation links stack on top of each other instead of next to each other */ 193 | @media screen and (max-width: 400px) { 194 | .topnav a { 195 | float: none; 196 | width: 100%; 197 | } 198 | } 199 | -------------------------------------------------------------------------------- /project3_ChenChen.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # In[1]: 5 | 6 | 7 | import pandas as pd 8 | import numpy as np 9 | import plotly 10 | import plotly.plotly as py 11 | import plotly.graph_objs as go 12 | import warnings 13 | #!pip install plotly_express 14 | import plotly_express as px 15 | 16 | 17 | # In[2]: 18 | 19 | 20 | #sign in plotly 21 | py.sign_in("name", "APIkey") 22 | # set all the data to 2 decimal 23 | pd.set_option("display.float.format", lambda x: "%.2f" % x) 24 | #ignore warning message 25 | warnings.filterwarnings("ignore") 26 | 27 | 28 | # # World inbound toursits from other country 29 | 30 | # In[3]: 31 | 32 | 33 | #import inbound tourist data set 34 | inbound_tourists = pd.read_excel("travel_data.xls", sheet_name="number_of_arrival", header=3, index_col=0) 35 | inbound_tourists.head() 36 | 37 | 38 | # In[4]: 39 | 40 | 41 | #import world country list data 42 | country_list = pd.read_excel("travel_data.xls", sheet_name="country code") 43 | country_list.head() 44 | 45 | 46 | # In[5]: 47 | 48 | 49 | #define a function to select rows and columns 50 | def data_selection(df): 51 | #merge the data frame with country list data frame by county Code, so we can drop irrelevant rows 52 | df = pd.merge(country_list, df, left_on = "Country Code", right_on = "Country Code", how="left") 53 | df = df.set_index("Country Name") 54 | #select columns from year 2007 to 2017 55 | df_new = pd.concat([df["Country Code"],df.loc[:,"2007":"2017"]], axis=1) 56 | #replace missing value by 0 57 | df_new = df_new.fillna(0) 58 | return df_new 59 | 60 | 61 | # In[6]: 62 | 63 | 
64 | #use "data_selection" function to select inbound tourists data 65 | inbound_tourists_clean = data_selection(inbound_tourists) 66 | inbound_tourists_clean.head() 67 | 68 | 69 | # In[7]: 70 | 71 | 72 | #define function to reshape the data frame 73 | def reshape_data(df): 74 | df_new = df[:] 75 | df_new.set_index(["Country Code"], inplace = True, append = True) 76 | #use stack to change the data frame shape 77 | df_new = df_new.stack() 78 | df_new = pd.DataFrame(df_new) 79 | df_new = df_new.reset_index() 80 | #rename column names 81 | df_new = df_new.rename(columns = {"level_2":"Year",0:"Amount"}) 82 | return df_new 83 | 84 | 85 | # In[8]: 86 | 87 | 88 | #use "reshape_data" function to reshape the inbound_touists_clean data frame 89 | inbound_tourists_stack = reshape_data(inbound_tourists_clean) 90 | inbound_tourists_stack.head() 91 | 92 | 93 | # In[9]: 94 | 95 | 96 | #make the map plot of the inbound tourists use plotly_express package 97 | inbound_map = px.choropleth(inbound_tourists_stack, 98 | locations="Country Code", 99 | color="Amount", 100 | hover_name="Country Name", 101 | #set the animation slide bar 102 | animation_frame="Year", 103 | color_continuous_scale=px.colors.sequential.BuGn, 104 | #set the map type 105 | projection="natural earth") 106 | inbound_map 107 | #plotly.offline.plot(inbound_map, filename="inbound_map") 108 | 109 | 110 | # # Top countries that attract tourists 111 | 112 | # In[10]: 113 | 114 | 115 | #find the top countries that attract tourists 116 | top_country_inbound = inbound_tourists_clean.sort_values("2017", ascending=False).head(10) 117 | top_inbound_country_list = list(top_country_inbound.index) 118 | top_inbound_country_list 119 | 120 | 121 | # In[11]: 122 | 123 | 124 | #select the top countries' data 125 | top_inbound_data = inbound_tourists_stack.loc[inbound_tourists_stack["Country Name"].isin(top_inbound_country_list)] 126 | top_inbound_data.head() 127 | 128 | 129 | # In[12]: 130 | 131 | 132 | #make the line plot for top 
countries 133 | inbound_top = px.line(top_inbound_data, 134 | x="Year", 135 | y="Amount", 136 | color="Country Name", 137 | line_group="Country Name", 138 | line_shape="linear", 139 | title="Top countries that attract tourists") 140 | inbound_top 141 | #plotly.offline.plot(inbound_top, filename="inbound_top") 142 | 143 | 144 | # In[13]: 145 | 146 | 147 | top_country_inbound 148 | 149 | 150 | # In[14]: 151 | 152 | 153 | # define function to calculate the yearly growth of inbound tourists 154 | def growth_inbound(df): 155 | for year in range(2007, 2017,1): 156 | # add growth columns 157 | df["growth"+ str(year+1)] = (df[str(year+1)] - df[str(year)])/df[str(year)]*100 158 | df_new = df.iloc[:, -10:] 159 | return df_new 160 | 161 | 162 | # In[15]: 163 | 164 | 165 | growth_inbound_data = growth_inbound(top_country_inbound) 166 | growth_inbound_data = growth_inbound_data.stack() 167 | growth_inbound_data = growth_inbound_data.reset_index() 168 | growth_inbound_data = growth_inbound_data.rename(columns = {"level_1":"Year",0:"Amount"}) 169 | growth_inbound_data.head() 170 | 171 | 172 | # In[16]: 173 | 174 | 175 | #make the bar plot of the country's inbound tourists growth 176 | growth_bar = px.bar(growth_inbound_data, 177 | x="Year", 178 | y="Amount", 179 | color="Country Name", 180 | hover_name="Country Name", 181 | #draw the bars side by side (grouped) 182 | barmode="group", 183 | title="Top attractive countries inbound tourists yearly growth rate") 184 | growth_bar 185 | #plotly.offline.plot(growth_bar, filename="growth_bar") 186 | 187 | 188 | # # Top countries with the highest receipts from tourists 189 | 190 | # In[17]: 191 | 192 | 193 | #import tourists receipts data set 194 | receipts = pd.read_excel("travel_data.xls", sheet_name="receipts for travel items", header=3, index_col=0) 195 | receipts.head() 196 | 197 | 198 | # In[18]: 199 | 200 | 201 | #use "data_selection" function to select tourism receipts data 202 | receipts_clean = data_selection(receipts) 203 | 
receipts_clean.head()


# In[19]:


# find the 10 countries with the highest tourism receipts in 2017
top_country_receipts = receipts_clean.sort_values("2017", ascending=False).head(10)
top_receipts_country_list = list(top_country_receipts.index)
top_receipts_country_list


# In[20]:


# select the top countries' data
top_receipts_data = receipts_clean.loc[receipts_clean.index.isin(top_receipts_country_list)]
top_receipts_data.head()


# In[21]:


# use the "reshape_data" function to reshape the top_receipts_data data frame
receipts_clean_stack = reshape_data(top_receipts_data)
receipts_clean_stack.head()


# In[22]:


# make the bar plot of the top countries with the highest receipts from tourism
receipts_bar = px.bar(
    receipts_clean_stack,
    x="Year",
    y="Amount",
    color="Country Name",
    hover_name="Country Name",
    title="Top countries with highest receipts",
)
receipts_bar
#plotly.offline.plot(receipts_bar, filename="receipts_bar")


# # Top countries with the highest tourism share of exports (i.e. tourism-oriented export)

# In[23]:


# import the data set of tourism receipts as a percentage of total exports
percent_export = pd.read_excel("travel_data.xls", sheet_name="receipts (% of total exports)", header=3, index_col=0)
percent_export.head()


# In[24]:


# use the "data_selection" function to select the receipts export data
percent_export_clean = data_selection(percent_export)
percent_export_clean.head()


# In[25]:


# find the 10 countries with the highest tourism share of exports in 2017
top_country_export = percent_export_clean.sort_values("2017", ascending=False).head(10)
top_export_country_list = list(top_country_export.index)
top_export_country_list


# In[26]:


# use the "reshape_data" function to reshape the percent_export_clean data frame
percent_export_stack = reshape_data(percent_export_clean)
percent_export_stack.head()


# In[27]:


# select the top countries' data (this rebinds top_country_export to the long-format frame)
top_country_export = percent_export_stack.loc[percent_export_stack["Country Name"].isin(top_export_country_list)]
top_country_export.head()


# In[28]:


# make the polar bar plot of the countries whose exports depend most on tourism
export_polar = px.bar_polar(
    top_country_export,
    # radius column
    r="Amount",
    # angle column
    theta="Country Name",
    color="Country Name",
    # slider animation column
    animation_frame="Year",
    title="Top countries ranked by tourism share of exports",
)
export_polar
#plotly.offline.plot(export_polar, filename="export_polar")


# # GDP and tourists number correlation

# In[29]:


# import the countries' GDP data set
gdp = pd.read_excel("travel_data.xls", sheet_name="GDP", header=3, index_col=0)
gdp.head()


# In[30]:


# select the relevant years
gdp = gdp.loc[:, "2007":"2017"]
# reshape the data frame from wide to long format
gdp_data = pd.DataFrame(gdp.stack()).reset_index()
# rename the columns
gdp_data = gdp_data.rename(columns={"level_1": "Year", 0: "GDP"})
gdp_data.head()


# In[31]:


# merge the inbound tourists data and the GDP data by country name and year
gdp_inbound = inbound_tourists_stack.merge(gdp_data, on=["Country Name", "Year"])
gdp_inbound.head()


# In[32]:


# make the scatter plot to check the relation between GDP and the number of inbound tourists
gdp_inbound_plot = px.scatter(
    gdp_inbound,
    x="Amount",
    y="GDP",
    # set the color of the plot
    color_discrete_sequence=px.colors.qualitative.Vivid,
    hover_name="Country Name",
    # use logarithmic axes
    log_x=True,
    log_y=True,
    # labels must be a dict that maps column names to display labels
    labels={"Amount": "Amount (number of people)"},
    title="Correlation between GDP and number of inbound tourists",
)
gdp_inbound_plot
#plotly.offline.plot(gdp_inbound_plot, filename="gdp_inbound_plot")


# # World outbound tourists

# In[33]:


# import the outbound tourists data set
outbound_tourists = pd.read_excel("travel_data.xls", sheet_name="number_of_departure", header=3, index_col=0)
outbound_tourists.head()


# In[34]:


# use the "data_selection" function to select the outbound tourists data
outbound_tourists_clean = data_selection(outbound_tourists)
outbound_tourists_clean.head()


# In[35]:


# use the "reshape_data" function to reshape the outbound_tourists_clean data frame
outbound_tourists_stack = reshape_data(outbound_tourists_clean)
outbound_tourists_stack.head()


# In[36]:


# make the map plot of world outbound tourists
outbound_map = px.scatter_geo(
    outbound_tourists_stack,
    locations="Country Code",
    color="Amount",
    hover_name="Country Name",
    size="Amount",
    # set the animation column
    animation_frame="Year",
    color_continuous_scale=px.colors.sequential.Aggrnyl,
    projection="natural earth",
)
outbound_map
#plotly.offline.plot(outbound_map, filename="outbound_map")


# In[37]:


# find the 10 countries with the most outbound tourists in 2017
top_country_outbound = outbound_tourists_clean.sort_values("2017", ascending=False).head(10)
top_outbound_country_list = list(top_country_outbound.index)
top_outbound_country_list


# In[38]:


top_outbound_data = outbound_tourists_stack.loc[outbound_tourists_stack["Country Name"].isin(top_outbound_country_list)]
top_outbound_data.head()


# In[39]:


# make the line plot for the top outbound countries
outbound_top = px.line(
    top_outbound_data,
    # x axis column
    x="Year",
    # y axis column
    y="Amount",
    color="Country Name",
    line_group="Country Name",
    hover_name="Country Name",
    # line type
    line_shape="linear",
    title="Top countries where people like to travel",
)
outbound_top
#plotly.offline.plot(outbound_top, filename="outbound_top")


# # Relationship between population and the number of outbound tourists

# In[40]:


# import the population data set
population = pd.read_excel("travel_data.xls", sheet_name="population", header=3, index_col=0)
population.head()


# In[41]:


# use the "data_selection" function to select the population data
population_clean = data_selection(population)
population_clean.head()


# In[42]:


# define a function that concatenates the population data with the outbound tourists data
# for the top outbound countries in a given year (column 0: population, column 1: tourists)
def population_outbound(year):
    df1 = population_clean.loc[population_clean.index.isin(top_outbound_country_list)].loc[:, year]
    df2 = outbound_tourists_clean.loc[outbound_tourists_clean.index.isin(top_outbound_country_list)].loc[:, year]
    df = pd.concat([df1, df2], axis=1)
    return df


# In[43]:


population_outbound("2017")


# In[44]:


# make the combined line and bar plot for 2017
# compute the data once instead of calling the function for every trace
pop_out_2017 = population_outbound("2017")
# line trace: population (left y axis)
trace1 = go.Scatter(
    x=pop_out_2017.index,
    y=pop_out_2017.iloc[:, 0],
    name="population",
)
# bar trace: outbound tourists (right y axis)
trace2 = go.Bar(
    x=pop_out_2017.index,
    y=pop_out_2017.iloc[:, 1],
    name="tourists",
    yaxis="y2",
    # transparency level
    opacity=0.6,
    # bar width (a single number applies to every bar)
    width=0.5,
)
data = [trace1, trace2]
layout = go.Layout(
    title="Outbound tourists number vs country's population",
    yaxis=dict(title="population"),
    yaxis2=dict(
        # title.font replaces the deprecated titlefont attribute
        title=dict(text="number of people", font=dict(color="rgb(148, 103, 189)")),
        tickfont=dict(color="rgb(148, 103, 189)"),
        overlaying="y",
        side="right",
    ),
)
outbound_population_compare = go.Figure(data=data, layout=layout)
py.iplot(outbound_population_compare)
#plotly.offline.plot(outbound_population_compare, filename="outbound_population_compare")


# In[45]:


# merge the outbound tourists data and the GDP data by country name and year
gdp_outbound = outbound_tourists_stack.merge(gdp_data, on=["Country Name", "Year"])
gdp_outbound.head()


# In[46]:


# make the scatter plot to check the relationship between the number of outbound tourists and GDP
gdp_outbound_plot = px.scatter(
    gdp_outbound,
    x="Amount",
    y="GDP",
    hover_name="Country Name",
    # use logarithmic axes
    log_x=True,
    log_y=True,
    title="Correlation between GDP and outbound tourists",
)
gdp_outbound_plot
#plotly.offline.plot(gdp_outbound_plot, filename="gdp_outbound_plot")


# In[47]:


# import the expenditure data set
expenditure = pd.read_excel("travel_data.xls", sheet_name="expenditure for travel item", header=3, index_col=0)
expenditure_clean = data_selection(expenditure)
expenditure_clean.head()


# In[48]:


# calculate the expenditure per outbound tourist for each year
expenditure_p = expenditure_clean.iloc[:, 1:12] / outbound_tourists_clean.iloc[:, 1:12]
# drop the rows that contain nan or inf values
expenditure_p = expenditure_p[~expenditure_p.isin([np.nan, np.inf, -np.inf]).any(axis=1)]
expenditure_p.head()


# In[49]:


# select the top countries' data
top_expenditure_p_data = expenditure_p.loc[expenditure_p.index.isin(top_outbound_country_list)].sort_values("2017", ascending=False)
top_expenditure_p_data.head()


# In[50]:


# reshape the top_expenditure_p_data data frame from wide to long format
top_expenditure_p_stack = top_expenditure_p_data.stack().reset_index()
top_expenditure_p_stack = top_expenditure_p_stack.rename(columns={"level_1": "Year", 0: "Per capita expenditure"})
top_expenditure_p_stack.head()


# In[51]:


# make the bar plot of the top outbound countries' travel spending per tourist
expenditure_bar = px.bar(
    top_expenditure_p_stack,
    x="Year",
    y="Per capita expenditure",
    color="Country Name",
    hover_name="Country Name",
    title="Top countries per capita expenditure in travel(exclude international transportation)",
)
expenditure_bar
#plotly.offline.plot(expenditure_bar, filename="expenditure_bar")

--------------------------------------------------------------------------------