About Bellabeat
Bellabeat is a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but they have the potential to become a larger player in the global smart device market. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their health and habits. Since it was founded in 2013, Bellabeat has grown rapidly and quickly positioned itself as a tech-driven wellness company for women. By 2016, Bellabeat had opened offices around the world and launched multiple products. Bellabeat products became available through a growing number of online retailers in addition to their own e-commerce channel on their website.
Bellabeat Products
Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to their line of smart wellness products.
Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip. The Leaf tracker connects to the Bellabeat app to track activity, sleep, and stress.
Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide insights into daily wellness.
Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that users are appropriately hydrated throughout the day. The Spring bottle connects to the Bellabeat app to track user hydration levels.
Bellabeat membership
Bellabeat also offers a subscription-based membership program for users. Membership gives users 24/7 access to fully personalized guidance on nutrition, activity, sleep, health, beauty, and mindfulness based on their lifestyle and goals.
Business Task:
To analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices, and then to select one Bellabeat product to apply these insights to.
Key Stakeholders
Urška Sršen: Bellabeat’s cofounder and Chief Creative Officer.
Sando Mur: Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team.
Bellabeat marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.
Data Analysis Process
Ask
1. What are some trends in smart device usage?
2. How could these trends apply to Bellabeat customers?
3. How could these trends help influence Bellabeat marketing strategy?
Prepare
I used public data that explores smart device users’ daily habits: the Fitbit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius). This Kaggle dataset contains personal fitness tracker data from thirty Fitbit users. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about activity, steps, weight, and heart rate that can be used to explore user habits.
Process
The zip files were downloaded locally, and a copy of the CSV files was stored in a new folder named Bellabeat project.
The CSV files were opened in Excel, and a copy of the relevant datasets was stored in a folder on the desktop. Each dataset was then inspected.
The activity, calories, intensities, and steps datasets have no duplicates. The sleep and heart rate datasets had duplicates, which were removed. The weight dataset has no duplicates, but rows where IsManualReport was FALSE were filtered out, and a new column, USERTYPE, was created based on BMI classification.
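The cleaning above was done in Excel. For readers who prefer to keep every step in R, the same three operations (dropping duplicate rows, keeping only manual reports, and labelling rows by BMI) could be sketched roughly as follows. The toy data frames are made up for illustration, and the BMI cut-offs (18.5, 25, 30) are an assumption based on the commonly used adult BMI bands, not something stated in the dataset.

```r
# Toy sleep log with one exact duplicate row (illustrative values only).
sleep_raw <- data.frame(
  Id = c(1, 1, 2),
  SleepDay = c("4/12/2016", "4/12/2016", "4/12/2016"),
  TotalMinutesAsleep = c(327, 327, 412)
)
# Drop exact duplicate rows.
sleep_dedup <- sleep_raw[!duplicated(sleep_raw), ]

# Toy weight log (illustrative values only).
weight_raw <- data.frame(
  Id = c(1, 2, 3, 3),
  BMI = c(22.65, 27.45, 24.00, 31.20),
  IsManualReport = c(TRUE, TRUE, FALSE, TRUE)
)
# Keep only rows flagged as manual reports.
weight_manual <- weight_raw[weight_raw$IsManualReport, ]

# Label each row by BMI band. The cut-offs are an assumption (see lead-in).
# right = FALSE makes each band closed on the left, e.g. [18.5, 25).
weight_manual$USERTYPE <- cut(
  weight_manual$BMI,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  labels = c("underweight", "normal", "overweight", "obese"),
  right = FALSE
)
weight_manual
```

Doing the cleaning in a script rather than in a spreadsheet also avoids Excel silently reformatting date columns when a CSV file is saved.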
RStudio Code
# installing packages
install.packages("tidyverse")
install.packages("lubridate")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("tidyr")
install.packages("here")
install.packages("skimr")
install.packages("janitor")
# loading libraries
library(tidyverse)
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)
library(here)
library(skimr)
library(janitor)
# Working Directory
setwd("C:/Users/H/Desktop")
> d_Activity <- read.csv("daily_Activity.csv") # 1
> d_calories <- read.csv("daily_calories.csv") # 2
> d_intensities <- read.csv("daily_intensities.csv") # 3
> d_steps <- read.csv("daily_steps.csv") # 4
> d_weight <- read.csv("cleanbmi.csv") #5
> d_sleep <- read.csv("cleansleep.csv") #6
Analyse
# working with d_sleep dataset
> str(d_sleep)
'data.frame': 410 obs. of 7 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ SleepDay : chr "04/12/2016 00:00" "4/13/2016 12:00:00 AM" "4/15/2016 12:00:00 AM" "4/16/2016 12:00:00 AM" ...
$ TotalSleepRecords : int 1 2 1 2 1 1 1 1 1 1 ...
$ TotalMinutesAsleep: int 327 384 412 340 700 304 360 325 361 430 ...
$ TotalSleepHours : chr "5:27" "6:24" "6:52" "5:40" ...
$ TotalTimeInBed : int 346 407 442 367 712 320 377 364 384 449 ...
$ TotalBedHours : chr "5:46" "6:47" "7:22" "6:07" ...
> summary(d_sleep)
Id SleepDay TotalSleepRecords TotalMinutesAsleep
Min. :1.504e+09 Length:410 Min. :1.00 Min. : 58.0
1st Qu.:3.977e+09 Class :character 1st Qu.:1.00 1st Qu.:361.0
Median :4.703e+09 Mode :character Median :1.00 Median :432.5
Mean :4.995e+09 Mean :1.12 Mean :419.2
3rd Qu.:6.962e+09 3rd Qu.:1.00 3rd Qu.:490.0
Max. :8.792e+09 Max. :3.00 Max. :796.0
TotalSleepHours TotalTimeInBed TotalBedHours
Length:410 Min. : 61.0 Length:410
Class :character 1st Qu.:403.8 Class :character
Mode :character Median :463.0 Mode :character
Mean :458.5
3rd Qu.:526.0
Max. :961.0
> n_distinct(d_sleep)
[1] 410
# creating usertype based on TotalMinutesAsleep
> user<-d_sleep %>%
+ mutate(user_type=case_when(
+ TotalMinutesAsleep <360 ~ "SSS",
+ TotalMinutesAsleep >=360 & TotalMinutesAsleep <540 ~ "NORMAL",
+ TotalMinutesAsleep >=540 ~ "OVERSLEEP"
+ ))
# convert user_type chr to factor user_type
d_user <-mutate(user,user_type=as.factor(user_type))
glimpse(d_user)
Rows: 410
Columns: 8
$ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 15039…
$ SleepDay <chr> "04/12/2016 00:00", "4/13/2016 12:00:00 AM", "4/15/20…
$ TotalSleepRecords <int> 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ TotalMinutesAsleep <int> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 277…
$ TotalSleepHours <chr> "5:27", "6:24", "6:52", "5:40", "11:40", "5:04", "6:0…
$ TotalTimeInBed <int> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 323…
$ TotalBedHours <chr> "5:46", "6:47", "7:22", "6:07", "11:52", "5:20", "6:1…
$ user_type <fct> SSS, NORMAL, NORMAL, SSS, OVERSLEEP
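case_when() returns NA for any value that matches none of its conditions, so the class boundaries need to cover every possible value exactly once. As a cross-check, the same three sleep classes can be sketched with base R's cut(), where the intervals are contiguous by construction. The input vector below is made up for illustration and deliberately includes the boundary values 360 and 540.

```r
# Toy sleep durations in minutes (illustrative values, not from the dataset).
minutes_asleep <- c(327, 360, 539, 540, 700)

# right = FALSE makes each interval closed on the left:
# [0, 360) -> SSS, [360, 540) -> NORMAL, [540, Inf) -> OVERSLEEP
user_type <- cut(
  minutes_asleep,
  breaks = c(0, 360, 540, Inf),
  labels = c("SSS", "NORMAL", "OVERSLEEP"),
  right = FALSE
)

# Every value should land in exactly one class.
stopifnot(!anyNA(user_type))
data.frame(minutes_asleep, user_type)
```

cut() already returns a factor, so no separate as.factor() conversion is needed with this approach.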
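# color is a fixed value, so it goes outside aes(); inside aes() it would be treated as a data column
ggplot(data = d_user)+
  geom_smooth(mapping = aes(x=TotalMinutesAsleep,y=TotalTimeInBed))+
  geom_point(mapping = aes(x=TotalMinutesAsleep,y=TotalTimeInBed),color="orange")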
# share of sleep records in each user_type
d_user %>%
  count(user_type, name = "total") %>%
  mutate(Percent = total / sum(total)) %>%
  ggplot(aes(x = user_type, y = Percent, fill = user_type)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Usertype", x = NULL) +
  theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))
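The shares drawn in the bar chart can also be checked numerically before plotting. A minimal base R sketch, using a made-up vector of labels rather than the real d_user data:

```r
# Made-up labels for illustration; in the analysis these would come from d_user$user_type.
user_type <- c("SSS", "NORMAL", "NORMAL", "SSS", "OVERSLEEP", "NORMAL", "NORMAL", "NORMAL")

counts <- table(user_type)     # number of records per class
shares <- prop.table(counts)   # each count divided by the total

# Shares as percentages, rounded to one decimal place.
round(100 * shares, 1)
```

Because prop.table() divides by the overall total, the shares always sum to 1, which is a quick sanity check on the chart.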
# Working with weight data
> str(d_weight)
'data.frame': 41 obs. of 6 variables:
$ Id : num 1.50e+09 1.50e+09 2.87e+09 2.87e+09 4.32e+09 ...
$ WeightKg : num 52.6 52.6 56.7 57.3 72.4 ...
$ WeightPounds : num 116 116 125 126 160 ...
$ BMI : num 22.6 22.6 21.5 21.7 27.5 ...
$ USERTYPE : chr "normal" "normal" "normal" "normal" ...
$ IsManualReport: logi TRUE TRUE TRUE TRUE TRUE TRUE ...
> summary(d_weight)
Id WeightKg WeightPounds BMI
Min. :1.504e+09 Min. :52.60 Min. :116.0 Min. :21.45
1st Qu.:4.559e+09 1st Qu.:61.20 1st Qu.:134.9 1st Qu.:23.89
Median :6.962e+09 Median :61.50 Median :135.6 Median :24.00
Mean :6.074e+09 Mean :62.41 Mean :137.6 Mean :24.39
3rd Qu.:6.962e+09 3rd Qu.:62.10 3rd Qu.:136.9 3rd Qu.:24.24
Max. :6.962e+09 Max. :72.40 Max. :159.6 Max. :27.46
USERTYPE IsManualReport
Length:41 Mode:logical
Class :character TRUE:41
Mode :character
> n_distinct(d_weight)
[1] 22
> #change USERTYPE chr.to factor
> d_w <-mutate(d_weight,USERTYPE=as.factor(USERTYPE))
> glimpse(d_w)
Rows: 41
Columns: 6
$ Id <dbl> 1503960366, 1503960366, 2873212765, 2873212765, 431970357…
$ WeightKg <dbl> 52.6, 52.6, 56.7, 57.3, 72.4, 72.3, 69.7, 70.3, 69.9, 69.…
$ WeightPounds <dbl> 115.9631, 115.9631, 125.0021, 126.3249, 159.6147, 159.394…
$ BMI <dbl> 22.65, 22.65, 21.45, 21.69, 27.45, 27.38, 27.25, 27.46, 2…
$ USERTYPE <fct> normal, normal, normal, normal, overweight, overweight, o…
$ IsManualReport <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
# color is a fixed value, so it goes outside aes()
ggplot(data = d_w)+
  geom_smooth(mapping = aes(x=WeightKg,y=BMI))+
  geom_point(mapping = aes(x=WeightKg,y=BMI),color="orange")
# share of weight records in each USERTYPE
d_w %>%
  count(USERTYPE, name = "total") %>%
  mutate(Total_Percent = total / sum(total)) %>%
  ggplot(aes(x = USERTYPE, y = Total_Percent, fill = USERTYPE)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "USERTYPE", x = NULL) +
  theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))
# working with activity, calories, intensities, steps datasets
> #How many unique participants are there in each dataframe?
> n_distinct(d_Activity$Id)
[1] 33
> n_distinct(d_calories$Id)
[1] 33
> n_distinct(d_intensities$Id)
[1] 33
> n_distinct(d_steps$Id)
[1] 33
> #How many observations are there in each dataframe?
> nrow(d_Activity)
[1] 940
> nrow(d_calories)
[1] 940
> nrow(d_intensities)
[1] 940
> nrow(d_steps)
[1] 940
str(d_Activity)
'data.frame': 940 obs. of 15 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ ActivityDate : chr "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ TotalSteps : int 13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
$ TotalDistance : num 8.5 6.97 6.74 6.28 8.16 ...
$ TrackerDistance : num 8.5 6.97 6.74 6.28 8.16 ...
$ LoggedActivitiesDistance: num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveDistance : num 1.88 1.57 2.44 2.14 2.71 ...
$ ModeratelyActiveDistance: num 0.55 0.69 0.4 1.26 0.41 ...
$ LightActiveDistance : num 6.06 4.71 3.91 2.83 5.04 ...
$ SedentaryActiveDistance : num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveMinutes : int 25 21 30 29 36 38 42 50 28 19 ...
$ FairlyActiveMinutes : int 13 19 11 34 10 20 16 31 12 8 ...
$ LightlyActiveMinutes : int 328 217 181 209 221 164 233 264 205 211 ...
$ SedentaryMinutes : int 728 776 1218 726 773 539 1149 775 818 838 ...
$ Calories : int 1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
> str(d_calories)
'data.frame': 940 obs. of 3 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ ActivityDay: chr "4/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ Calories : int 1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
> str(d_intensities)
'data.frame': 940 obs. of 10 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ ActivityDay : chr "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ SedentaryMinutes : int 728 776 1218 726 773 539 1149 775 818 838 ...
$ LightlyActiveMinutes : int 328 217 181 209 221 164 233 264 205 211 ...
$ FairlyActiveMinutes : int 13 19 11 34 10 20 16 31 12 8 ...
$ VeryActiveMinutes : int 25 21 30 29 36 38 42 50 28 19 ...
$ SedentaryActiveDistance : num 0 0 0 0 0 0 0 0 0 0 ...
$ LightActiveDistance : num 6.06 4.71 3.91 2.83 5.04 ...
$ ModeratelyActiveDistance: num 0.55 0.69 0.4 1.26 0.41 ...
$ VeryActiveDistance : num 1.88 1.57 2.44 2.14 2.71 ...
> str(d_steps)
'data.frame': 940 obs. of 3 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ ActivityDay: chr "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ StepTotal : int 13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
>
> #all datasets have the 'Id' field in common.
> #all datasets except d_Activity have ActivityDay in common. We can rename ActivityDate to ActivityDay
# rename the ActivityDate column of d_Activity to ActivityDay
d_Activity <- rename( d_Activity,
ActivityDay = ActivityDate)
# now we can merge 4 dataset by Id and ActivityDay
> merge_1 <- merge(d_Activity, d_calories, by= c("Id", "ActivityDay"))
> merge_2 <- merge(d_intensities,d_steps, by= c("Id","ActivityDay"))
> All_merge <- merge(merge_1, merge_2, by = c("Id","ActivityDay","SedentaryMinutes",
+ "LightlyActiveMinutes","FairlyActiveMinutes",
+ "VeryActiveMinutes", "SedentaryActiveDistance",
+ "LightActiveDistance", "ModeratelyActiveDistance",
+ "VeryActiveDistance"))
glimpse(All_merge)
Rows: 578
Columns: 17
$ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366,…
$ ActivityDay <chr> "4/13/2016", "4/14/2016", "4/15/2016", "4/16/20…
$ SedentaryMinutes <int> 776, 1218, 726, 773, 539, 1149, 775, 818, 838, …
$ LightlyActiveMinutes <int> 217, 181, 209, 221, 164, 233, 264, 205, 211, 13…
$ FairlyActiveMinutes <int> 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21, 5, 1…
$ VeryActiveMinutes <int> 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 41, 39,…
$ SedentaryActiveDistance <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
$ LightActiveDistance <dbl> 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.03, 4.24,…
$ ModeratelyActiveDistance <dbl> 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.32, 0.48,…
$ VeryActiveDistance <dbl> 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.53, 1.96,…
$ TotalSteps <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ TotalDistance <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ TrackerDistance <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Calories.x <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ Calories.y <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ StepTotal <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
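The merged frame has 578 rows although each input has 940. A likely cause, judging from the str() output above (d_Activity shows "04/12/2016" where d_calories shows "4/12/2016"), is that the date strings are not written the same way in every file, so merge() treats them as different keys and drops those rows. The sketch below reproduces the effect on three rows taken from that str() output and shows one possible remedy: parsing the column to Date before merging. This is an assumption about the cause, not a verified diagnosis of the full dataset.

```r
# Three rows copied from the str() output; only the first date string differs between files.
activity <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366),
  ActivityDay = c("04/12/2016", "4/13/2016", "4/14/2016"),
  TotalSteps = c(13162, 10735, 10460)
)
calories <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366),
  ActivityDay = c("4/12/2016", "4/13/2016", "4/14/2016"),
  Calories = c(1985, 1797, 1776)
)

# Merging on the raw strings drops the row whose text differs.
merged_raw <- merge(activity, calories, by = c("Id", "ActivityDay"))
nrow(merged_raw)

# Parsing to Date first makes "04/12/2016" and "4/12/2016" the same key.
activity$ActivityDay <- as.Date(activity$ActivityDay, format = "%m/%d/%Y")
calories$ActivityDay <- as.Date(calories$ActivityDay, format = "%m/%d/%Y")
merged_dates <- merge(activity, calories, by = c("Id", "ActivityDay"))
nrow(merged_dates)
```

Comparing nrow() before and after a merge is a cheap way to notice silently dropped rows.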
> # convert ActivityDay chr to date format
> d_merge <- mutate(All_merge, ActivityDay = as.Date(ActivityDay, format= "%m/%d/%Y"))
> class(d_merge$ActivityDay)
[1] "Date"
> # convert date to weekday
d_merge$Day <- weekdays(d_merge$ActivityDay)
d_merge$Day <- factor(d_merge$Day,levels = c('Sunday','Monday',
'Tuesday','Wednesday','Thursday','Friday','Saturday'))
glimpse(d_merge)
Rows: 578
Columns: 18
$ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366,…
$ ActivityDay <date> 2016-04-13, 2016-04-14, 2016-04-15, 2016-04-16…
$ SedentaryMinutes <int> 776, 1218, 726, 773, 539, 1149, 775, 818, 838, …
$ LightlyActiveMinutes <int> 217, 181, 209, 221, 164, 233, 264, 205, 211, 13…
$ FairlyActiveMinutes <int> 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21, 5, 1…
$ VeryActiveMinutes <int> 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 41, 39,…
$ SedentaryActiveDistance <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
$ LightActiveDistance <dbl> 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.03, 4.24,…
$ ModeratelyActiveDistance <dbl> 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.32, 0.48,…
$ VeryActiveDistance <dbl> 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.53, 1.96,…
$ TotalSteps <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ TotalDistance <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ TrackerDistance <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Calories.x <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ Calories.y <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ StepTotal <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ Day <fct> Wednesday, Thursday, Friday, Saturday, Sunday, …
# Summary statistics
> n_distinct(d_merge$Id)
[1] 33
> nrow(d_merge)
[1] 578
> summary(d_merge)
Id ActivityDay SedentaryMinutes LightlyActiveMinutes
Min. :1.504e+09 Min. :2016-04-13 Min. : 2 Min. : 0.0
1st Qu.:2.347e+09 1st Qu.:2016-04-17 1st Qu.: 738 1st Qu.:135.0
Median :4.445e+09 Median :2016-04-21 Median :1070 Median :202.5
Mean :4.882e+09 Mean :2016-04-21 Mean :1004 Mean :198.3
3rd Qu.:6.962e+09 3rd Qu.:2016-04-26 3rd Qu.:1232 3rd Qu.:271.0
Max. :8.878e+09 Max. :2016-04-30 Max. :1440 Max. :518.0
FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
Min. : 0.00 Min. : 0.00 Min. :0.000000
1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.:0.000000
Median : 7.00 Median : 5.00 Median :0.000000
Mean : 13.73 Mean : 22.06 Mean :0.001609
3rd Qu.: 20.00 3rd Qu.: 31.75 3rd Qu.:0.000000
Max. :113.00 Max. :210.00 Max. :0.110000
LightActiveDistance ModeratelyActiveDistance VeryActiveDistance TotalSteps
Min. : 0.000 Min. :0.0000 Min. : 0.000 Min. : 0
1st Qu.: 2.002 1st Qu.:0.0000 1st Qu.: 0.000 1st Qu.: 3992
Median : 3.430 Median :0.2600 Median : 0.270 Median : 7640
Mean : 3.425 Mean :0.5652 Mean : 1.552 Mean : 7787
3rd Qu.: 4.827 3rd Qu.:0.8175 3rd Qu.: 2.150 3rd Qu.:10778
Max. :10.710 Max. :5.1200 Max. :21.660 Max. :29326
TotalDistance TrackerDistance LoggedActivitiesDistance Calories.x
Min. : 0.000 Min. : 0.000 Min. :0.0000 Min. : 0
1st Qu.: 2.683 1st Qu.: 2.683 1st Qu.:0.0000 1st Qu.:1862
Median : 5.335 Median : 5.335 Median :0.0000 Median :2138
Mean : 5.592 Mean : 5.575 Mean :0.1101 Mean :2340
3rd Qu.: 7.728 3rd Qu.: 7.718 3rd Qu.:0.0000 3rd Qu.:2794
Max. :26.720 Max. :26.720 Max. :4.9421 Max. :4900
Calories.y StepTotal Day
Min. : 0 Min. : 0 Sunday :64
1st Qu.:1862 1st Qu.: 3992 Monday :64
Median :2138 Median : 7640 Tuesday :64
Mean :2340 Mean : 7787 Wednesday:97
3rd Qu.:2794 3rd Qu.:10778 Thursday :97
Max. :4900 Max. :29326 Friday :97
Saturday :95
#Plotting a few explorations for d_merge dataframe
#Relation between StepTotal and TotalDistance # positive relation
ggplot(data=d_merge)+
  geom_smooth(mapping = aes(x=StepTotal, y=TotalDistance)) +
  geom_point(mapping = aes(x=StepTotal, y=TotalDistance), color="orange")
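The positive relation seen in the scatter plot can be backed by a number. A small sketch using only the first five rows shown in the str(d_Activity) output above; on the full data the same call would be cor(d_merge$StepTotal, d_merge$TotalDistance), and the value may differ.

```r
# First five TotalSteps / TotalDistance values from the str(d_Activity) output.
steps    <- c(13162, 10735, 10460, 9762, 12669)
distance <- c(8.50, 6.97, 6.74, 6.28, 8.16)

# Pearson correlation: values near +1 mean a strong positive linear relation.
r <- cor(steps, distance)
r

# Slope of a simple linear fit: approximate distance covered per step.
fit <- lm(distance ~ steps)
coef(fit)[["steps"]]
```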
# Relation between Day and TotalDistance
# (one boxplot per weekday; geom_smooth() cannot fit a curve to a factor such as Day)
ggplot(data = d_merge) + geom_boxplot(mapping = aes(x=Day,y=TotalDistance),color="orange")
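Alongside the plot, the relation between weekday and distance can be summarised as a mean per day. The sketch below uses a made-up four-row frame and derives the weekday from as.POSIXlt()$wday instead of weekdays(), because weekdays() returns names in the system locale and a factor with English levels would then contain only NA on a non-English system. The column names mirror d_merge; the values are illustrative.

```r
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")

# Toy data (illustrative values only).
toy <- data.frame(
  ActivityDay = as.Date(c("2016-04-13", "2016-04-14", "2016-04-20", "2016-04-21")),
  TotalDistance = c(6.97, 6.74, 5.00, 7.00)
)

# $wday is 0 for Sunday through 6 for Saturday, regardless of locale.
toy$Day <- factor(day_levels[as.POSIXlt(toy$ActivityDay)$wday + 1],
                  levels = day_levels)

# Mean distance per weekday; weekdays with no rows come back as NA.
mean_by_day <- tapply(toy$TotalDistance, toy$Day, mean)
mean_by_day[!is.na(mean_by_day)]
```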
#grouping of daily records into four categories based on their activity distance
#records that match none of the four rules get NA and are removed by drop_na()
data_by_usertype_d <- d_merge %>%
  mutate(
    user_type = factor(case_when(
      SedentaryActiveDistance > mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Sedentary",
      SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance > mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Light",
      SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance > mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Moderate",
      SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance > mean(VeryActiveDistance) ~ "Very"
    ), levels = c("Sedentary", "Light", "Moderate", "Very"))
  ) %>%
  select(Id, Calories.x, user_type) %>%
  drop_na()
# viz
data_by_usertype_d %>%
  count(user_type, name = "total") %>%
  mutate(Total_Percent = total / sum(total)) %>%
  ggplot(aes(x = user_type, y = Total_Percent, fill = user_type)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "User Type Distribution", x = NULL) +
  theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))
Here is the link to the presentation:
https://docs.google.com/presentation/d/1uxYA5tgKC6lr2Tk6hRA1uR2K_2cV8MSxEK8Yy8auhlU/edit?usp=sharing
Act
Top Three Recommendations
1. Motivate users to walk farther, and give guidance on establishing a healthy sleep pattern.
2. Provide rewards and gifts for people who reach their daily goals and follow recommended diets.
3. Make the Leaf product appealing and comfortable so that women can wear it in many settings.
Limitations
The data covers only a small number of distinct users; a larger sample is needed.
Demographic information such as age, gender, and occupation is missing; it is needed for a better understanding of the users.
BMI records whose manual report flag was FALSE were left out; more records with a TRUE manual report flag are needed.
