Thursday, March 16, 2023

Case Study: How Can a Wellness Technology Company Play It Smart?

About Bellabeat


Bellabeat is a high-tech manufacturer of health-focused products for women. It is a successful small company with the potential to become a larger player in the global smart device market. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their health and habits. Since it was founded in 2013, Bellabeat has grown rapidly and quickly positioned itself as a tech-driven wellness company for women. By 2016, Bellabeat had opened offices around the world and launched multiple products. Its products became available through a growing number of online retailers in addition to the company’s own e-commerce channel on its website.

Bellabeat Products


Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to the company’s line of smart wellness products.

Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip. The Leaf tracker connects to the Bellabeat app to track activity, sleep, and stress.

Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide insights into daily wellness.

Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that users are appropriately hydrated throughout the day. The Spring bottle connects to the Bellabeat app to track user hydration levels.

Bellabeat membership
Bellabeat also offers a subscription-based membership program for users. Membership gives users 24/7 access to fully personalized guidance on nutrition, activity, sleep, health, beauty, and mindfulness based on their lifestyle and goals.

Business Task:

To analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices, then select one Bellabeat product to apply these insights to.

Key Stakeholders


Urška Sršen: Bellabeat’s cofounder and Chief Creative Officer.

Sando Mur: Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team.

Bellabeat marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.

Data Analysis Process


Ask 


1. What are some trends in smart device usage? 
2. How could these trends apply to Bellabeat customers? 
3. How could these trends help influence Bellabeat marketing strategy? 


Prepare

I used public data that explores smart device users’ daily habits: the Fitbit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius). This Kaggle dataset contains personal fitness tracker data from thirty eligible Fitbit users who consented to the submission of their personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. It includes information about activity, steps, weight, and heart rate that can be used to explore user habits.

Process

The zip files were downloaded locally, and a copy of the data was stored as .csv files in a new folder named Bellabeat project.
The .csv files were opened in Excel, and copies of the relevant datasets were stored in a folder on the desktop. Each file was then inspected.
The activity, calories, intensities, and steps datasets have no duplicates. The sleep and heart rate datasets had duplicates, which were removed. The weight dataset has no duplicates, but some of its rows have IsManualReport set to FALSE; those rows were filtered out, and a new column, USERTYPE, was created based on BMI classification.
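
The cleaning above was done in Excel. For readers who prefer to stay in R, the same two steps (dropping duplicate rows and deriving a BMI class) could look roughly like the sketch below. The toy data frame weight_raw and the cut-offs 18.5 / 25 / 30 (the commonly used WHO adult BMI classes) are assumptions; the post does not say which thresholds were used for USERTYPE.

```r
# Toy stand-in for the weight log; values are illustrative, not the real file.
weight_raw <- data.frame(
  Id  = c(1, 1, 2, 3, 3),
  BMI = c(22.65, 22.65, 21.45, 27.45, 31.00)
)

# Drop exact duplicate rows (similar in spirit to Excel's "Remove Duplicates").
weight_clean <- weight_raw[!duplicated(weight_raw), ]

# Classify BMI with assumed WHO adult cut-offs.
# right = FALSE makes every interval closed on the left: [18.5, 25), [25, 30), ...
weight_clean$USERTYPE <- cut(
  weight_clean$BMI,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  labels = c("underweight", "normal", "overweight", "obese"),
  right  = FALSE
)

print(weight_clean)
```

Using cut() with explicit breaks keeps the class boundaries in one place, which makes them easy to audit or change later.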

RStudio Code

# installing packages
install.packages("tidyverse")
install.packages("lubridate")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("tidyr")
install.packages("here") 
install.packages("skimr") 
install.packages("janitor")

# loading libraries 
library(tidyverse)
library(lubridate)
library(dplyr)
library(ggplot2)
library(tidyr)
library(here)
library(skimr)
library(janitor)

# Working Directory 
setwd("C:/Users/H/Desktop")
> d_Activity <- read.csv("daily_Activity.csv")           # 1
> d_calories <- read.csv("daily_calories.csv")            # 2
> d_intensities <- read.csv("daily_intensities.csv")    # 3
> d_steps <- read.csv("daily_steps.csv")                    # 4
> d_weight <- read.csv("cleanbmi.csv")                    #5
> d_sleep <-  read.csv("cleansleep.csv")                   #6
 

Analyse


# working with d_sleep dataset
> str(d_sleep)
'data.frame': 410 obs. of  7 variables:
 $ Id                : num  1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
 $ SleepDay          : chr  "04/12/2016 00:00" "4/13/2016 12:00:00 AM" "4/15/2016 12:00:00 AM" "4/16/2016 12:00:00 AM" ...
 $ TotalSleepRecords : int  1 2 1 2 1 1 1 1 1 1 ...
 $ TotalMinutesAsleep: int  327 384 412 340 700 304 360 325 361 430 ...
 $ TotalSleepHours   : chr  "5:27" "6:24" "6:52" "5:40" ...
 $ TotalTimeInBed    : int  346 407 442 367 712 320 377 364 384 449 ...
 $ TotalBedHours     : chr  "5:46" "6:47" "7:22" "6:07" ...
> summary(d_sleep)
       Id              SleepDay         TotalSleepRecords TotalMinutesAsleep
 Min.   :1.504e+09   Length:410         Min.   :1.00      Min.   : 58.0     
 1st Qu.:3.977e+09   Class :character   1st Qu.:1.00      1st Qu.:361.0     
 Median :4.703e+09   Mode  :character   Median :1.00      Median :432.5     
 Mean   :4.995e+09                      Mean   :1.12      Mean   :419.2     
 3rd Qu.:6.962e+09                      3rd Qu.:1.00      3rd Qu.:490.0     
 Max.   :8.792e+09                      Max.   :3.00      Max.   :796.0     
 TotalSleepHours    TotalTimeInBed  TotalBedHours     
 Length:410         Min.   : 61.0   Length:410        
 Class :character   1st Qu.:403.8   Class :character  
 Mode  :character   Median :463.0   Mode  :character  
                    Mean   :458.5                     
                    3rd Qu.:526.0                     
                    Max.   :961.0                     
> n_distinct(d_sleep)
[1] 410
# creating user_type based on TotalMinutesAsleep
> user<-d_sleep %>% 
+   mutate(user_type=case_when(
+   TotalMinutesAsleep <360 ~ "SSS", 
+   TotalMinutesAsleep >=360 & TotalMinutesAsleep <540 ~ "NORMAL", 
+   TotalMinutesAsleep >=540 ~ "OVERSLEEP"   # >= so that exactly 540 minutes is not left as NA
+   ))
# convert user_type chr to factor user_type
d_user <-mutate(user,user_type=as.factor(user_type))
glimpse(d_user)
Rows: 410
Columns: 8
$ Id                 <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 15039…
$ SleepDay           <chr> "04/12/2016 00:00", "4/13/2016 12:00:00 AM", "4/15/20…
$ TotalSleepRecords  <int> 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ TotalMinutesAsleep <int> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 277…
$ TotalSleepHours    <chr> "5:27", "6:24", "6:52", "5:40", "11:40", "5:04", "6:0…
$ TotalTimeInBed     <int> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 323…
$ TotalBedHours      <chr> "5:46", "6:47", "7:22", "6:07", "11:52", "5:20", "6:1…
$ user_type          <fct> SSS, NORMAL, NORMAL, SSS, OVERSLEEP
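
As a cross-check on the sleep categories, the same bucketing can be written with base R's cut(), which states the interval boundaries explicitly. This is only a sketch: the vector minutes_asleep is made up from a few values shown above plus the boundary value 540, and placing exactly 540 minutes in OVERSLEEP is an assumption.

```r
# A few values from the output above, plus the boundary value 540.
minutes_asleep <- c(327, 384, 700, 360, 540)

# Left-closed intervals: [0, 360) SSS, [360, 540) NORMAL, [540, Inf) OVERSLEEP.
sleep_type <- cut(
  minutes_asleep,
  breaks = c(0, 360, 540, Inf),
  labels = c("SSS", "NORMAL", "OVERSLEEP"),
  right  = FALSE
)

print(data.frame(minutes_asleep, sleep_type))

# Every value should land in exactly one class.
stopifnot(!anyNA(sleep_type))
```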


# color is a fixed value, so it is set outside aes()
ggplot(data = d_user, mapping = aes(x = TotalMinutesAsleep, y = TotalTimeInBed)) +
  geom_point(color = "orange") +
  geom_smooth()

# share of sleep records in each user_type
d_user %>%
  count(user_type, name = "total") %>%
  mutate(Percent = total / sum(total)) %>%
  ggplot(aes(x = user_type, y = Percent, fill = user_type)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Usertype", x = NULL) +
  theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))
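
The percentages behind the bar chart can also be checked numerically. A minimal base R sketch, using only the five user_type labels visible in the glimpse() output above (so the shares printed here are not the shares of the full dataset):

```r
# The five labels visible in the glimpse() output; not the full column.
user_type <- factor(
  c("SSS", "NORMAL", "NORMAL", "SSS", "OVERSLEEP"),
  levels = c("NORMAL", "OVERSLEEP", "SSS")
)

counts <- table(user_type)     # records per category
shares <- prop.table(counts)   # counts divided by their total

print(round(100 * shares, 1))

# Shares of a complete partition must add up to 1.
stopifnot(isTRUE(all.equal(sum(shares), 1)))
```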



 # Working with weight data
> str(d_weight)
'data.frame': 41 obs. of  6 variables:
 $ Id            : num  1.50e+09 1.50e+09 2.87e+09 2.87e+09 4.32e+09 ...
 $ WeightKg      : num  52.6 52.6 56.7 57.3 72.4 ...
 $ WeightPounds  : num  116 116 125 126 160 ...
 $ BMI           : num  22.6 22.6 21.5 21.7 27.5 ...
 $ USERTYPE      : chr  "normal" "normal" "normal" "normal" ...
 $ IsManualReport: logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
> summary(d_weight)
       Id               WeightKg      WeightPounds        BMI       
 Min.   :1.504e+09   Min.   :52.60   Min.   :116.0   Min.   :21.45  
 1st Qu.:4.559e+09   1st Qu.:61.20   1st Qu.:134.9   1st Qu.:23.89  
 Median :6.962e+09   Median :61.50   Median :135.6   Median :24.00  
 Mean   :6.074e+09   Mean   :62.41   Mean   :137.6   Mean   :24.39  
 3rd Qu.:6.962e+09   3rd Qu.:62.10   3rd Qu.:136.9   3rd Qu.:24.24  
 Max.   :6.962e+09   Max.   :72.40   Max.   :159.6   Max.   :27.46  
   USERTYPE         IsManualReport
 Length:41          Mode:logical  
 Class :character   TRUE:41       
 Mode  :character                 
                                                          
> n_distinct(d_weight)
[1] 22
> #change USERTYPE chr.to factor 
> d_w <-mutate(d_weight,USERTYPE=as.factor(USERTYPE))
> glimpse(d_w)
Rows: 41
Columns: 6
$ Id             <dbl> 1503960366, 1503960366, 2873212765, 2873212765, 431970357…
$ WeightKg       <dbl> 52.6, 52.6, 56.7, 57.3, 72.4, 72.3, 69.7, 70.3, 69.9, 69.…
$ WeightPounds   <dbl> 115.9631, 115.9631, 125.0021, 126.3249, 159.6147, 159.394…
$ BMI            <dbl> 22.65, 22.65, 21.45, 21.69, 27.45, 27.38, 27.25, 27.46, 2…
$ USERTYPE       <fct> normal, normal, normal, normal, overweight, overweight, o…
$ IsManualReport <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…

# color is a fixed value, so it is set outside aes()
ggplot(data = d_w, mapping = aes(x = WeightKg, y = BMI)) +
  geom_point(color = "orange") +
  geom_smooth()

# share of weight records in each USERTYPE
d_w %>%
    count(USERTYPE, name = "total") %>%
    mutate(Total_Percent = total / sum(total)) %>%
    ggplot(aes(x = USERTYPE, y = Total_Percent, fill = USERTYPE)) +
    geom_col() +
    scale_y_continuous(labels = scales::percent) +
    labs(title = "USERTYPE", x = NULL) +
    theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))


# working with activity, calories, intensities, steps datasets
> #How many unique participants are there in each dataframe? 
> n_distinct(d_Activity$Id)
[1] 33
> n_distinct(d_calories$Id)
[1] 33
> n_distinct(d_intensities$Id)
[1] 33
> n_distinct(d_steps$Id)
[1] 33
> #How many observations are there in each dataframe?
> nrow(d_Activity)
[1] 940
> nrow(d_calories)
[1] 940
> nrow(d_intensities)
[1] 940
> nrow(d_steps)
[1] 940
str(d_Activity)
'data.frame': 940 obs. of  15 variables:
 $ Id                      : num  1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
 $ ActivityDate            : chr  "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
 $ TotalSteps              : int  13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
 $ TotalDistance           : num  8.5 6.97 6.74 6.28 8.16 ...
 $ TrackerDistance         : num  8.5 6.97 6.74 6.28 8.16 ...
 $ LoggedActivitiesDistance: num  0 0 0 0 0 0 0 0 0 0 ...
 $ VeryActiveDistance      : num  1.88 1.57 2.44 2.14 2.71 ...
 $ ModeratelyActiveDistance: num  0.55 0.69 0.4 1.26 0.41 ...
 $ LightActiveDistance     : num  6.06 4.71 3.91 2.83 5.04 ...
 $ SedentaryActiveDistance : num  0 0 0 0 0 0 0 0 0 0 ...
 $ VeryActiveMinutes       : int  25 21 30 29 36 38 42 50 28 19 ...
 $ FairlyActiveMinutes     : int  13 19 11 34 10 20 16 31 12 8 ...
 $ LightlyActiveMinutes    : int  328 217 181 209 221 164 233 264 205 211 ...
 $ SedentaryMinutes        : int  728 776 1218 726 773 539 1149 775 818 838 ...
 $ Calories                : int  1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
> str(d_calories)
'data.frame': 940 obs. of  3 variables:
 $ Id         : num  1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
 $ ActivityDay: chr  "4/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
 $ Calories   : int  1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
> str(d_intensities)
'data.frame': 940 obs. of  10 variables:
 $ Id                      : num  1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
 $ ActivityDay             : chr  "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
 $ SedentaryMinutes        : int  728 776 1218 726 773 539 1149 775 818 838 ...
 $ LightlyActiveMinutes    : int  328 217 181 209 221 164 233 264 205 211 ...
 $ FairlyActiveMinutes     : int  13 19 11 34 10 20 16 31 12 8 ...
 $ VeryActiveMinutes       : int  25 21 30 29 36 38 42 50 28 19 ...
 $ SedentaryActiveDistance : num  0 0 0 0 0 0 0 0 0 0 ...
 $ LightActiveDistance     : num  6.06 4.71 3.91 2.83 5.04 ...
 $ ModeratelyActiveDistance: num  0.55 0.69 0.4 1.26 0.41 ...
 $ VeryActiveDistance      : num  1.88 1.57 2.44 2.14 2.71 ...
> str(d_steps)
'data.frame': 940 obs. of  3 variables:
 $ Id         : num  1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
 $ ActivityDay: chr  "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
 $ StepTotal  : int  13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
> # all datasets have the 'Id' field in common.
> # all datasets except d_Activity have ActivityDay in common. We can rename ActivityDate to ActivityDay.

# rename d_Activity data ActivityDate col to ActivityDay col
d_Activity <- rename( d_Activity,
                      ActivityDay = ActivityDate)
# now we can merge 4 dataset by Id and ActivityDay                                                                                         
>    merge_1 <- merge(d_Activity, d_calories, by= c("Id", "ActivityDay"))
>    merge_2 <- merge(d_intensities,d_steps, by= c("Id","ActivityDay"))
>    All_merge <- merge(merge_1, merge_2, by = c("Id","ActivityDay","SedentaryMinutes",
+                                                "LightlyActiveMinutes","FairlyActiveMinutes",
+                                                "VeryActiveMinutes", "SedentaryActiveDistance", 
+                                                "LightActiveDistance", "ModeratelyActiveDistance", 
+                                                "VeryActiveDistance"))

glimpse(All_merge)
Rows: 578
Columns: 17
$ Id                       <dbl> 1503960366, 1503960366, 1503960366, 1503960366,…
$ ActivityDay              <chr> "4/13/2016", "4/14/2016", "4/15/2016", "4/16/20…
$ SedentaryMinutes         <int> 776, 1218, 726, 773, 539, 1149, 775, 818, 838, …
$ LightlyActiveMinutes     <int> 217, 181, 209, 221, 164, 233, 264, 205, 211, 13…
$ FairlyActiveMinutes      <int> 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21, 5, 1…
$ VeryActiveMinutes        <int> 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 41, 39,…
$ SedentaryActiveDistance  <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
$ LightActiveDistance      <dbl> 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.03, 4.24,…
$ ModeratelyActiveDistance <dbl> 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.32, 0.48,…
$ VeryActiveDistance       <dbl> 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.53, 1.96,…
$ TotalSteps               <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ TotalDistance            <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ TrackerDistance          <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Calories.x               <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ Calories.y               <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ StepTotal                <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
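
The merged frame has 578 rows, while each input has 940. One plausible contributor (an assumption, not verified against the files) is that ActivityDay is still text at merge time and is not formatted consistently: the str() output above shows "04/12/2016" in d_Activity but "4/12/2016" in d_calories, and merge() only matches identical strings. The sketch below uses two toy data frames, activity and calories, to show the effect and how parsing the dates before merging avoids it.

```r
# Toy frames with the same day written two ways ("04/12/2016" vs "4/12/2016").
activity <- data.frame(
  Id          = c(1, 1),
  ActivityDay = c("04/12/2016", "4/13/2016"),
  TotalSteps  = c(13162, 10735)
)
calories <- data.frame(
  Id          = c(1, 1),
  ActivityDay = c("4/12/2016", "4/13/2016"),
  Calories    = c(1985, 1797)
)

# Merging on text keys: the differently formatted day does not match.
merged_text <- merge(activity, calories, by = c("Id", "ActivityDay"))
print(nrow(merged_text))

# Parse both keys to Date first, then merge: both days match.
activity$ActivityDay <- as.Date(activity$ActivityDay, format = "%m/%d/%Y")
calories$ActivityDay <- as.Date(calories$ActivityDay, format = "%m/%d/%Y")
merged_date <- merge(activity, calories, by = c("Id", "ActivityDay"))
print(nrow(merged_date))
```

Comparing nrow() before and after a merge is a cheap way to notice silently dropped rows.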

> #  convert ActivityDay chr to date format
>    d_merge <- mutate(All_merge, ActivityDay = as.Date(ActivityDay, format= "%m/%d/%Y"))
>    class(d_merge$ActivityDay)
[1] "Date"
> #  convert date to weekday 
  d_merge$Day <- weekdays(d_merge$ActivityDay)
  d_merge$Day <- factor(d_merge$Day,levels = c('Sunday','Monday',
              'Tuesday','Wednesday','Thursday','Friday','Saturday'))

glimpse(d_merge)
Rows: 578
Columns: 18
$ Id                       <dbl> 1503960366, 1503960366, 1503960366, 1503960366,…
$ ActivityDay              <date> 2016-04-13, 2016-04-14, 2016-04-15, 2016-04-16…
$ SedentaryMinutes         <int> 776, 1218, 726, 773, 539, 1149, 775, 818, 838, …
$ LightlyActiveMinutes     <int> 217, 181, 209, 221, 164, 233, 264, 205, 211, 13…
$ FairlyActiveMinutes      <int> 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21, 5, 1…
$ VeryActiveMinutes        <int> 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 41, 39,…
$ SedentaryActiveDistance  <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,…
$ LightActiveDistance      <dbl> 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.03, 4.24,…
$ ModeratelyActiveDistance <dbl> 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.32, 0.48,…
$ VeryActiveDistance       <dbl> 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.53, 1.96,…
$ TotalSteps               <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ TotalDistance            <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ TrackerDistance          <dbl> 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.88, 6.68,…
$ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Calories.x               <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ Calories.y               <int> 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786,…
$ StepTotal                <int> 10735, 10460, 9762, 12669, 9705, 13019, 15506, …
$ Day                      <fct> Wednesday, Thursday, Friday, Saturday, Sunday, …
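
Note that weekdays() returns day names in the session's locale, so the English factor levels used above only match on an English-locale machine. A locale-independent sketch, assuming the same Sunday-first ordering, derives the weekday from the numeric index instead (the vector dates and the helper names are illustrative):

```r
# Three of the dates that appear in the output above.
dates <- as.Date(c("2016-04-13", "2016-04-16", "2016-04-17"))

day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# POSIXlt stores the weekday as an integer: 0 = Sunday, ..., 6 = Saturday.
wday_index <- as.POSIXlt(dates)$wday

# Look up the English name by index; +1 because R vectors are 1-based.
day <- factor(day_names[wday_index + 1], levels = day_names)

print(data.frame(dates, day))
```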

# Summary statistics
>   n_distinct(d_merge$Id)
[1] 33
>   nrow(d_merge)
[1] 578
>  summary(d_merge)
       Id             ActivityDay         SedentaryMinutes LightlyActiveMinutes
 Min.   :1.504e+09   Min.   :2016-04-13   Min.   :   2     Min.   :  0.0       
 1st Qu.:2.347e+09   1st Qu.:2016-04-17   1st Qu.: 738     1st Qu.:135.0       
 Median :4.445e+09   Median :2016-04-21   Median :1070     Median :202.5       
 Mean   :4.882e+09   Mean   :2016-04-21   Mean   :1004     Mean   :198.3       
 3rd Qu.:6.962e+09   3rd Qu.:2016-04-26   3rd Qu.:1232     3rd Qu.:271.0       
 Max.   :8.878e+09   Max.   :2016-04-30   Max.   :1440     Max.   :518.0       
                                                                               
 FairlyActiveMinutes VeryActiveMinutes SedentaryActiveDistance
 Min.   :  0.00      Min.   :  0.00    Min.   :0.000000       
 1st Qu.:  0.00      1st Qu.:  0.00    1st Qu.:0.000000       
 Median :  7.00      Median :  5.00    Median :0.000000       
 Mean   : 13.73      Mean   : 22.06    Mean   :0.001609       
 3rd Qu.: 20.00      3rd Qu.: 31.75    3rd Qu.:0.000000       
 Max.   :113.00      Max.   :210.00    Max.   :0.110000       
                                                              
 LightActiveDistance ModeratelyActiveDistance VeryActiveDistance   TotalSteps   
 Min.   : 0.000      Min.   :0.0000           Min.   : 0.000     Min.   :    0  
 1st Qu.: 2.002      1st Qu.:0.0000           1st Qu.: 0.000     1st Qu.: 3992  
 Median : 3.430      Median :0.2600           Median : 0.270     Median : 7640  
 Mean   : 3.425      Mean   :0.5652           Mean   : 1.552     Mean   : 7787  
 3rd Qu.: 4.827      3rd Qu.:0.8175           3rd Qu.: 2.150     3rd Qu.:10778  
 Max.   :10.710      Max.   :5.1200           Max.   :21.660     Max.   :29326  
                                                                                
 TotalDistance    TrackerDistance  LoggedActivitiesDistance   Calories.x  
 Min.   : 0.000   Min.   : 0.000   Min.   :0.0000           Min.   :   0  
 1st Qu.: 2.683   1st Qu.: 2.683   1st Qu.:0.0000           1st Qu.:1862  
 Median : 5.335   Median : 5.335   Median :0.0000           Median :2138  
 Mean   : 5.592   Mean   : 5.575   Mean   :0.1101           Mean   :2340  
 3rd Qu.: 7.728   3rd Qu.: 7.718   3rd Qu.:0.0000           3rd Qu.:2794  
 Max.   :26.720   Max.   :26.720   Max.   :4.9421           Max.   :4900  
                                                                          
   Calories.y     StepTotal            Day    
 Min.   :   0   Min.   :    0   Sunday   :64  
 1st Qu.:1862   1st Qu.: 3992   Monday   :64  
 Median :2138   Median : 7640   Tuesday  :64  
 Mean   :2340   Mean   : 7787   Wednesday:97  
 3rd Qu.:2794   3rd Qu.:10778   Thursday :97  
 Max.   :4900   Max.   :29326   Friday   :97  
                                Saturday :95  
# Plotting a few explorations for the d_merge dataframe
# Relation between StepTotal and TotalDistance: positive relation
  ggplot(data = d_merge, mapping = aes(x = StepTotal, y = TotalDistance)) +
    geom_point(color = "orange") +
    geom_smooth()
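
The positive relation seen in the scatter plot can be backed with a number. A minimal sketch using cor() on the first five StepTotal and TotalDistance values shown in the glimpse() output (a tiny sample used only to illustrate the call; on the real data one would pass the full columns of d_merge):

```r
# First five rows of StepTotal and TotalDistance from the glimpse() output.
step_total     <- c(10735, 10460, 9762, 12669, 9705)
total_distance <- c(6.97, 6.74, 6.28, 8.16, 6.48)

# Pearson correlation: values near +1 indicate a strong positive linear relation.
r <- cor(step_total, total_distance)
print(round(r, 3))
```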


# Relation between Day and TotalDistance
# Day is a factor, so a boxplot per weekday is used instead of a smoothed line
    ggplot(data = d_merge) + geom_boxplot(mapping = aes(x = Day, y = TotalDistance), fill = "orange")


# grouping daily records into four categories based on activity distance
# (each row is compared with the overall mean of each distance column)
  data_by_usertype_d <- d_merge %>%
    transmute(
      Id, Calories.x,
      user_type = factor(case_when(
        SedentaryActiveDistance > mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Sedentary",
        SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance > mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Light",
        SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance > mean(ModeratelyActiveDistance) & VeryActiveDistance < mean(VeryActiveDistance) ~ "Moderate",
        SedentaryActiveDistance < mean(SedentaryActiveDistance) & LightActiveDistance < mean(LightActiveDistance) & ModeratelyActiveDistance < mean(ModeratelyActiveDistance) & VeryActiveDistance > mean(VeryActiveDistance) ~ "Very"
      ), levels = c("Sedentary", "Light", "Moderate", "Very"))
    ) %>%
    drop_na()   # rows that match none of the four patterns are dropped
# viz
data_by_usertype_d %>%
    count(user_type, name = "total") %>%
    mutate(Total_Percent = total / sum(total)) %>%
    ggplot(aes(x = user_type, y = Total_Percent, fill = user_type)) +
    geom_col() +
    scale_y_continuous(labels = scales::percent) +
    labs(title = "User Type Distribution", x = NULL) +
    theme(legend.position = "none", text = element_text(size = 20), plot.title = element_text(hjust = 0.5))


Share 

Here is the link to the presentation.

Act 


Top Three Recommendations

1. Motivate users to walk farther, and give guidance on establishing a healthy sleep pattern.

2. Provide rewards and gifts for users who reach their daily goals and follow recommended diets.

3. Make the Leaf product appealing and comfortable so that women can wear it in many settings.



Limitations

The findings apply to a small number of distinct users; a larger sample is needed.

Demographic information such as age, gender, and occupation is missing; it is needed for a better understanding of the users.

Weight records where the manual-report flag was false were left out; more manually reported BMI data is needed.


