Net Promoter Score
With over 210 hotels in 43 countries, the Hyatt Corporation is a leader in the hotel and resort industry. Hyatt places its hotels not just in major cities, but also in smaller cities, near airports, and in major vacation destinations. The Hyatt Corporation operates many different hotel brands, offering a different experience for each guest, whether they are traveling for business or pleasure. As one of the major hotel chains in the world, Hyatt is concerned with making sure that every guest who stays at one of its hotels receives a distinctive experience and will recommend the hotel to others. The Hyatt Corporation’s mission statement says it best: “Every day we care for our guests. Care is at the heart of our business, and it’s this distinct guest experience that makes Hyatt one of the world’s best hospitality brands.” (https://about.hyatt.com/en.html)
Upon departure, every guest who stays at a Hyatt property is asked to fill out a survey. This project analyzes those guest surveys and the likelihood that a guest would recommend a Hyatt Corporation property. Several factors are compared, including amenities and service, in order to recommend how hotels can improve their survey results.
Project Scope and Objective
The scope of the project centers on the locations of Hyatt hotels in the United States and the time frame in which the data set was collected. The data set had been preselected for analysis and trimmed for the current analysis. During the initial phase, a decision was made to limit the scope to only the three states with the highest return of completed surveys. These states are: California, Florida, and Texas.
The overall objective is to make actionable recommendations based on the likelihood to recommend and the satisfaction of the customers who stay at a Hyatt Corporation property. The project will also ensure that the recommendations meet or exceed the goals of the Net Promoter Score. It includes descriptive statistics, several modeling techniques that show relationships between variables, and answers to the business questions.
The following business questions will be answered through analysis of the Hyatt data set.
A. What is the primary driver of the Net Promoter Score? Over the course of the project a deeper dive will be taken to determine the exact variable that drives the Net Promoter Score.
B. Is there a correlation between Net Promoter Score and Revenue? Looking at the data set for Hyatt hotels in the United States, revenue needs to be maximized as well as the Net Promoter Score. Running a correlation between the two will determine if there is a relationship between the two variables.
C. Are we able to predict if a guest would be a promoter based on the amenities of the hotel, customer service, and/or condition of the hotel? Looking at the variables that include all the amenities, we hope to see what would predict if a person was a promoter. There might also be other factors that might indicate if a person is a promoter, such as guest room condition, condition of the overall hotel and the customer service of the hotel staff.
The data used for this project was provided and pre-filtered to exclude international properties, as well as some variables. The final data file provided contained 118 variables with 3 million observations.
A quality assessment was performed to reduce the number of variables and observations and make the data more manageable. The assessment revealed that a large number of observations had no data in several key variables, so these observations were removed from the data set. The scope of the project was also limited to the three states with the highest number of completed surveys.
Once the data set was loaded, a data frame was created, along with a function to convert null values to NA. Below is a sample of the code that was used to achieve this data cleansing:
The columns were separated into numeric and non-numeric groups. All nulls and empty strings were set to NA, and the columns were then recombined. NA values in the numeric columns were replaced with the mean of the column, which allowed the affected rows to remain in the analysis. Commas were removed from the numeric columns so that only the digits remained, which made the values easier to work with. California, Texas, and Florida were chosen as the states to focus on, as they are the states with the most completed surveys.
The variables that contained binary information were recoded so that yes equals 1 and no equals 0. Rows that did not contain any information for these variables were set to 0, representing no. This kept the information usable, under the assumption that a flag left blank on a survey was a no response.
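As an illustration of the cleansing steps described above, a minimal Python sketch might look like the following. It is not the project's original code: the sample rows, the `NUMERIC_COLUMNS` list, and the helper names `to_number` and `clean` are all hypothetical, and only the column names come from the report. The sketch also removes commas before averaging and filters to the three states first, so that the column mean is computed on parsed numbers from the rows that are kept.

```python
from statistics import mean

# Hypothetical rows standing in for the survey file; the values are made up.
rows = [
    {"State_PL": "California", "Likelihood_Recommend_H": "9", "REVENUE_USD_R": "1,250.50"},
    {"State_PL": "Texas", "Likelihood_Recommend_H": "", "REVENUE_USD_R": "980"},
    {"State_PL": "Florida", "Likelihood_Recommend_H": "7", "REVENUE_USD_R": "NULL"},
    {"State_PL": "Nevada", "Likelihood_Recommend_H": "10", "REVENUE_USD_R": "400"},
]
NUMERIC_COLUMNS = ["Likelihood_Recommend_H", "REVENUE_USD_R"]  # assumed subset
TOP_STATES = {"California", "Florida", "Texas"}


def to_number(raw):
    """Return a float, or None when the cell is empty or a null marker."""
    text = (raw or "").strip()
    if text == "" or text.upper() in {"NULL", "NA"}:
        return None
    return float(text.replace(",", ""))  # drop thousands separators


def clean(rows):
    """Keep the three states in scope and mean-impute missing numeric cells."""
    kept = [dict(r) for r in rows if r["State_PL"] in TOP_STATES]  # copies
    for col in NUMERIC_COLUMNS:
        values = [to_number(r[col]) for r in kept]
        observed = [v for v in values if v is not None]
        col_mean = mean(observed) if observed else None
        for r, v in zip(kept, values):
            r[col] = col_mean if v is None else v
    return kept


cleaned = clean(rows)
```

Mean imputation keeps every row usable, at the cost of pulling the imputed columns toward their average.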
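The binary recoding can be sketched in a few lines of Python. The accepted spellings of "yes" below are an assumption, since the report does not say how the flags are stored in the raw file, and `flag_to_int` is a hypothetical helper name.

```python
def flag_to_int(raw):
    """Map a yes/no survey flag to 1/0; blank or missing cells count as no."""
    text = (raw or "").strip().lower()
    # Assumed spellings of an affirmative answer; anything else becomes 0.
    return 1 if text in {"y", "yes", "1", "true"} else 0


convention_flags = [flag_to_int(v) for v in ["Yes", "N", "", None, "Y"]]
```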
After crafting our business questions, the following variables were chosen to direct our focus.
Likelihood_Recommend_H: Likelihood to recommend metric; value on a 1 to 10 scale
Guest_Room_H: Guest room satisfaction metric; value on a 1 to 10 scale
Condition_Hotel_H: Condition of hotel metric; value on a 1 to 10 scale
Customer_SVC_H: Quality of customer service metric; value on a 1 to 10 scale
State_PL: State in which the hotel is located
Convention_PL: Flag indicating if the hotel has a convention space
NPS_Type: Indicates if the guest’s HySat responses mark them as a promoter, a passive, or a detractor
Restaurant_PL: Flag indicating if the hotel has an onsite restaurant
Length_Stay_H: Length of stay
REVENUE_USD_R: Total revenue in USD
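To make the link between Likelihood_Recommend_H and the Net Promoter Score concrete, here is a small Python sketch. The promoter cutoff of 9 or 10 is the one used in this report; treating 7 and 8 as passive and 6 or below as detractor follows the usual NPS convention and is an assumption here, since the report does not state how NPS_Type was derived. The function names and sample scores are hypothetical.

```python
def nps_type(score):
    """Classify one likelihood-to-recommend score."""
    if score >= 9:
        return "Promoter"
    if score >= 7:   # assumed passive band (7-8)
        return "Passive"
    return "Detractor"  # assumed detractor band (6 or below)


def net_promoter_score(scores):
    """Percent of promoters minus percent of detractors."""
    if not scores:
        raise ValueError("at least one score is required")
    types = [nps_type(s) for s in scores]
    promoters = types.count("Promoter")
    detractors = types.count("Detractor")
    return 100.0 * (promoters - detractors) / len(types)


sample_scores = [10, 9, 8, 7, 6, 3, 10, 9]  # made-up responses
sample_nps = net_promoter_score(sample_scores)  # 4 promoters, 2 detractors of 8
```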
Simple descriptive statistics were run to find the average length of stay per state and the average likelihood to recommend score, and to count the likelihood to recommend scores by state and list the states in descending order. The count of scores by state is what led us to work with only the top three states’ data.
A correlation analysis was performed to examine our hypothesis visually. The hypothesis states that the likelihood to recommend a Hyatt Corporation property depends on a few amenities and on how well the hotel’s condition is maintained. The following correlation was produced with the DataExplorer package. It shows a strong correlation between the variables Likelihood_Recommend_H, Guest_Room_H, Condition_Hotel_H and Customer_SVC_H.
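The descriptive statistics described above can be sketched in Python as follows. This is an illustration rather than the project's code, and the survey rows are made up; only the column names come from the report.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical survey rows; the values are illustrative only.
surveys = [
    {"State_PL": "California", "Length_Stay_H": 2, "Likelihood_Recommend_H": 9},
    {"State_PL": "California", "Length_Stay_H": 3, "Likelihood_Recommend_H": 8},
    {"State_PL": "California", "Length_Stay_H": 1, "Likelihood_Recommend_H": 10},
    {"State_PL": "California", "Length_Stay_H": 2, "Likelihood_Recommend_H": 9},
    {"State_PL": "Florida", "Length_Stay_H": 5, "Likelihood_Recommend_H": 10},
    {"State_PL": "Florida", "Length_Stay_H": 3, "Likelihood_Recommend_H": 9},
    {"State_PL": "Florida", "Length_Stay_H": 4, "Likelihood_Recommend_H": 8},
    {"State_PL": "Texas", "Length_Stay_H": 2, "Likelihood_Recommend_H": 7},
    {"State_PL": "Texas", "Length_Stay_H": 4, "Likelihood_Recommend_H": 9},
    {"State_PL": "Nevada", "Length_Stay_H": 1, "Likelihood_Recommend_H": 6},
]

# Count completed surveys per state and keep the three largest counts.
counts = Counter(row["State_PL"] for row in surveys)
top_three = [state for state, _ in counts.most_common(3)]

# Group both metrics by state, then average each group.
stay_by_state = defaultdict(list)
score_by_state = defaultdict(list)
for row in surveys:
    stay_by_state[row["State_PL"]].append(row["Length_Stay_H"])
    score_by_state[row["State_PL"]].append(row["Likelihood_Recommend_H"])

avg_stay = {state: mean(v) for state, v in stay_by_state.items()}
avg_score = {state: mean(v) for state, v in score_by_state.items()}
```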
This correlation analysis shows that there is a high correlation between likelihood to recommend and guest room, condition of the hotel and customer service. This indicates that a person will more than likely recommend a Hyatt hotel if they receive wonderful customer service and the hotel is in pristine condition as well as the guest room in which they stay. The NPS type of promoter has a positive correlation with the same variables, with the highest correlation to the guest room.
A correlation was also run between the likelihood to recommend and revenue. The two variables turned out to be negatively correlated, but only weakly.
The average likelihood to recommend by state was mapped (coded by size, with a bigger dot meaning more likely to recommend) to see which state scored highest. The map indicates that Florida has the highest average likelihood to recommend.
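The correlations discussed above are assumed here to be Pearson correlations, which is the common default; the report does not name the coefficient. A minimal Python sketch of the calculation, with a hypothetical `pearson` helper and made-up inputs, is:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and root sums of squares of the centered values.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / (spread_x * spread_y)


# Made-up values: ratings that move together, and revenue that does not.
likelihood = [10, 9, 7, 4, 8]
customer_service = [10, 9, 6, 5, 8]
revenue = [300.0, 950.0, 410.0, 620.0, 500.0]

r_service = pearson(likelihood, customer_service)
r_revenue = pearson(likelihood, revenue)
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one, which is how the revenue result above is read.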
We zoomed in on each state to see where the local scores were highest.
Several modeling techniques were used while assessing the data. We created these models in order to create visual tools that could help answer our business questions, as well as to see any patterns and predict how and which variables affect the Net Promoter Score.
We first tried to predict whether having a convention center or restaurant affects the likelihood to recommend. When the customer service variable was added, the model predicted the likelihood to recommend better than it did with the amenities alone. We can predict whether someone is a promoter using customer service and the condition of the room and hotel. Amenities appear to add little when predicting a recommendation. Overall, the linear models had a very high level of error and did not seem to be a useful tool for analysis in this case.
We ran a Naive Bayes model to predict whether or not a guest would be a promoter, defined as a guest who gave a likelihood to recommend value of 9 or 10. The model using hotel condition, customer service, and room condition was 87% accurate at predicting whether a person was a promoter. This combination of variables had the highest prediction accuracy (higher than models using Restaurant_PL and Convention_PL) and the lowest root mean square error (RMSE).
We also ran a KSVM model to predict the same outcome, and it proved to be the better model, with a higher prediction rate and a lower RMSE.
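For readers who want to see what fitting a linear model and measuring its error involves, here is a minimal Python sketch of a one-predictor least-squares fit with RMSE. It is a simplified stand-in, not the models from this project: the helper names and the numbers are hypothetical, and the actual models may have used several predictors.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Made-up ratings: customer service as predictor, likelihood as response.
customer_service = [4, 6, 7, 9, 10]
likelihood = [3, 6, 8, 9, 10]

slope, intercept = fit_line(customer_service, likelihood)
fitted = [slope * x + intercept for x in customer_service]
error = rmse(likelihood, fitted)
```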
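The idea behind the Naive Bayes classifier can be sketched in Python as follows. This assumes the Gaussian variant, which treats each rating as normally distributed within a class; the report does not say which variant or library was used, so this is an illustration and not the project's model. The training rows are made up and the function names are hypothetical.

```python
from math import log, pi
from statistics import mean, pvariance


def fit_gaussian_nb(X, y):
    """Store the prior and per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, target in zip(X, y) if target == label]
        prior = len(rows) / len(X)
        # Small constant keeps the variance positive for constant features.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
        model[label] = (prior, stats)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(label):
        prior, stats = model[label]
        total = log(prior)
        for value, (mu, var) in zip(x, stats):
            total += -0.5 * log(2 * pi * var) - (value - mu) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def accuracy(model, X, y):
    """Share of rows whose predicted class matches the true class."""
    hits = sum(predict(model, x) == target for x, target in zip(X, y))
    return hits / len(y)


# Made-up rows: (Condition_Hotel_H, Customer_SVC_H, Guest_Room_H).
X = [(10, 10, 9), (9, 9, 10), (9, 10, 9), (10, 9, 10),
     (5, 6, 5), (6, 5, 7), (7, 6, 6), (4, 5, 5)]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = promoter (score of 9 or 10), 0 = not

model = fit_gaussian_nb(X, y)
high_ratings = predict(model, (9, 9, 9))
low_ratings = predict(model, (5, 5, 6))
```

In practice, accuracy should be measured on rows held out from training rather than on the rows used to fit the model.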
This prediction plot shows that as customer service and hotel room condition ratings increase, the likelihood of the guest being a promoter goes up.
Based on the analysis of the data, the models, and the descriptive statistics, it is recommended that the hotels provide more staff training to improve customer service. Ensuring that staff members can handle every situation that affects a guest’s experience is vital to whether guests recommend Hyatt Corporation hotels. It is also recommended that more housekeeping staff be hired to improve the overall cleanliness of the rooms. The guest room conditions need to live up to the customers’ expectations.
Business Questions Revisited
A. What is the primary driver of the Net Promoter Score? The primary drivers of the Net Promoter Score in the Hyatt Corporation surveys are customer service and guest room condition. Both are reflected in the recommendations: better customer service training, and making sure that the guest rooms and the overall condition of each hotel are superb.
B. Is there a correlation between Net Promoter Score and Revenue? The correlation analysis shows a negative correlation that is not very strong. This indicates that factors other than revenue drive the likelihood to recommend and determine the Net Promoter Score.
C. Are we able to predict if a guest would be a promoter based on the amenities of the hotel, customer service, and/or condition of the hotel? We were able to predict if a guest would be a promoter based on the variables customer service and condition of hotel. We also found that the condition of the guest room played an important role in indicating if someone would be a promoter. However, we were not able to predict with high accuracy whether a guest would be a promoter from the amenities alone.