## Case Study

Assume Apple Inc. is researching a new feature for next year’s iPhone upgrade, and you are one of the data scientists on the research team. Your team wants to survey people in the most populous states in the US: California, Texas, Florida, and New York. Suppose your team has decided to hold a large outdoor event for this research, randomly picking people and asking them whether they like the new feature. Each respondent falls into one of three groups: they currently use an iPhone; they use a different phone and don’t want to switch phones ever; or they currently use another phone but hope to switch to an iPhone in the future.

Of those who are currently using an iPhone, 80% liked the new feature. Of those who are currently using a different phone and don’t want to switch phones ever, 52% liked the new feature. Of those who are currently using another phone but hoping to switch to an iPhone in the future, 48% liked the new feature.

1. Notation.
a. Use capital letters to denote the different events in this question.
b. Specify all probabilities mentioned in the problem using conditional probability notation and general probability notation.

Here is the answer to #1:
L: Like the new feature
I: Currently using an iPhone
N: Not using an iPhone and don’t want to switch phones ever
S: Currently using another phone but hoping to switch to an iPhone in the future

P(L|I): Probability of liking the new feature given that a person is currently using an iPhone
P(L|N): Probability of liking the new feature given that a person is not using an iPhone and doesn’t want to switch phones ever
P(L|S): Probability of liking the new feature given that a person is using another phone but hoping to switch to an iPhone in the future

P(I): Probability of using an iPhone currently
P(N): Probability of not using an iPhone and not wanting to switch phones ever
P(S): Probability of using another phone but hoping to switch to an iPhone in the future
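
With this notation, the quantity the later parts ask for (the probability that a person is currently using an iPhone given that they like the feature) is P(I|L). As a sketch, assuming I, N, and S are mutually exclusive and together cover every respondent (the survey description suggests this but does not state it outright), Bayes’ theorem relates it to the probabilities listed above:

```latex
\[
P(I \mid L) =
\frac{P(L \mid I)\,P(I)}
     {P(L \mid I)\,P(I) + P(L \mid N)\,P(N) + P(L \mid S)\,P(S)}
\]
```

From the problem statement, P(L|I) = 0.80, P(L|N) = 0.52, and P(L|S) = 0.48, while P(I), P(N), and P(S) depend on the state being surveyed.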

2. These probabilities are only for the survey conducted in New York, but you need to be able to find the probabilities easily for any given state. Therefore, write an R function (in RStudio) that uses simulation to find the probability that a person is currently using an iPhone given that they liked the feature. The function should take inputs that the user enters when calling it. It should create a new random sample of the phenomenon using the sample statistics from the description above, i.e., the function is almost like re-running the survey but with random number generation (using the R “sample” function).

3. Using the function you wrote, plug in the values from the New York data. This should be just one line of code: a call to your function with the New York data to find the probability that a person is currently using an iPhone given that they liked the feature. In New York, among the people you randomly picked, 340 were currently using an iPhone, 185 were using a different phone and don’t want to switch phones ever, and 232 were using another phone but hoping to switch to an iPhone in the future.
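
One possible sketch for parts 2 and 3 follows. It is not the only way to set this up: the function name `simulate_iphone_given_like`, its argument names, and the default of 10,000 simulated respondents are all hypothetical choices made for illustration. Each simulated person is first assigned a group with `sample`, weighted by the observed group sizes, and then a like/dislike answer with `sample`, weighted by that group’s like rate.

```r
# Hypothetical helper (name and arguments are illustrative, not from the assignment).
# Re-runs the survey with random numbers and estimates P(I | L).
simulate_iphone_given_like <- function(n_iphone, n_no_switch, n_switch,
                                       p_like_iphone, p_like_no_switch,
                                       p_like_switch, n_sim = 10000) {
  counts <- c(I = n_iphone, N = n_no_switch, S = n_switch)
  like_probs <- c(I = p_like_iphone, N = p_like_no_switch, S = p_like_switch)

  # Step 1: draw each simulated respondent's group, weighted by group size
  group <- sample(names(counts), size = n_sim, replace = TRUE,
                  prob = counts / sum(counts))

  # Step 2: within each group, draw like (TRUE) or dislike (FALSE)
  likes <- logical(n_sim)
  for (g in names(like_probs)) {
    idx <- which(group == g)
    likes[idx] <- sample(c(TRUE, FALSE), size = length(idx), replace = TRUE,
                         prob = c(like_probs[[g]], 1 - like_probs[[g]]))
  }

  # Step 3: among simulated respondents who like the feature,
  # the share who are iPhone users estimates P(I | L)
  sum(group == "I" & likes) / sum(likes)
}

# Part 3: one call with the New York data (the seed is only for reproducibility)
set.seed(123)
simulate_iphone_given_like(340, 185, 232, 0.80, 0.52, 0.48)
```

Because the result is a random estimate, it changes from run to run unless a seed is set; larger values of `n_sim` should give estimates that vary less.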

4. Write a function to calculate the probability directly, without using a simulation, and use that function to compute the probability. The result should be very similar to your answer to part 3.

Here is the answer to number 4:

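    # Direct calculation of P(I | L) from the expected number of people
    # in each group who like the feature.
    prob <- function(iphone_users_like, diff_phone_no_switch_like, diff_phone_switch_like) {
      # Everyone who likes the feature, across all three groups
      total_likes <- iphone_users_like + diff_phone_no_switch_like + diff_phone_switch_like
      # Share of those people who are current iPhone users
      prob_iphone <- iphone_users_like / total_likes
      return(prob_iphone)
    }

    # New York: like rate times group size
    NYiphone_users_like <- 0.80 * 340
    NYdiff_phone_no_switch_like <- 0.52 * 185
    NYdiff_phone_switch_like <- 0.48 * 232

    prob(NYiphone_users_like, NYdiff_phone_no_switch_like, NYdiff_phone_switch_like)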

0.5671866
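
This value agrees with Bayes’ theorem applied to the New York sample. If P(I), P(N), and P(S) are estimated as each group’s share of the 757 people surveyed (340 + 185 + 232), the common factor 1/757 cancels from numerator and denominator, which is why the function can work with counts alone:

```latex
\[
P(I \mid L)
= \frac{0.80 \cdot 340}{0.80 \cdot 340 + 0.52 \cdot 185 + 0.48 \cdot 232}
= \frac{272}{272 + 96.2 + 111.36}
= \frac{272}{479.56}
\approx 0.5672
\]
```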

5. Now use Monte Carlo simulation and compare your answer to the probability found in part 4. Here, write a function that can calculate the probability from part 3 (the probability that a person is currently using an iPhone given that they liked the feature) for many different numbers of iterations (as in Monte Carlo simulation).

6. What would be a rough minimum “large enough” number of iterations needed to get a good estimate of the probability? Use the New York data to answer this question. (Hint: You can use a graph to answer this question.)
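
A minimal sketch of how parts 5 and 6 could be approached is shown here. The names `simulate_once` and `monte_carlo_iphone_given_like`, the vector-based arguments, and the particular grid of iteration counts are hypothetical. For brevity the like/dislike step uses `runif` rather than a second call to `sample`; this is a substitution, not something the assignment prescribes.

```r
# Hypothetical compact simulator: counts and like_probs are vectors
# ordered as (iPhone, no switch, hopes to switch).
simulate_once <- function(counts, like_probs, n_sim) {
  # Draw group membership (1, 2, or 3), weighted by group size
  group <- sample(seq_along(counts), size = n_sim, replace = TRUE,
                  prob = counts / sum(counts))
  # A respondent likes the feature with their group's like rate
  likes <- runif(n_sim) < like_probs[group]
  # Share of likers who are iPhone users (group 1); NaN if nobody likes it
  mean(group[likes] == 1)
}

# Run the simulation once for each iteration count in n_values
monte_carlo_iphone_given_like <- function(counts, like_probs, n_values) {
  estimates <- sapply(n_values, function(n) simulate_once(counts, like_probs, n))
  data.frame(n_sim = n_values, estimate = estimates)
}

# New York data from parts 3 and 4
ny_counts <- c(340, 185, 232)
ny_like <- c(0.80, 0.52, 0.48)
n_values <- c(10, 50, 100, 500, 1000, 5000, 10000, 50000, 100000)

set.seed(42)  # only for reproducibility
results <- monte_carlo_iphone_given_like(ny_counts, ny_like, n_values)

# Direct (non-simulated) value from part 4, for comparison
exact <- sum((ny_counts * ny_like)[1]) / sum(ny_counts * ny_like)

# Estimates against iteration count, with the direct value as a dashed line
plot(results$n_sim, results$estimate, log = "x", type = "b",
     xlab = "Number of iterations (log scale)",
     ylab = "Estimated P(I | L)")
abline(h = exact, lty = 2)
```

Reading the plot, a rough “large enough” number of iterations is the point beyond which the estimates stay close to the dashed line. Re-running with different seeds helps check that the choice is not an accident of one run, since Monte Carlo error generally shrinks in proportion to one over the square root of the number of iterations.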

Data: Among the people you randomly picked, 610 were currently using an iPhone, 580 were using a different phone and don’t want to switch phones ever, and 330 were using another phone but hoping to switch to an iPhone in the future. Of those who are currently using an iPhone, 92% liked the new feature. Of those who are currently using a different phone and don’t want to switch phones ever, 48% liked the new feature. Of those who are currently using another phone but hoping to switch to an iPhone in the future, 67% liked the new feature.

7. Assuming you do this survey again in California, find the probability that a person is currently using an iPhone given that they like the feature.
a. Use your function from part 5 and the function you wrote in part 2 to find the above probability.
b. Use the function you wrote in part 4 to calculate the above probability without using a simulation.
c. Compare the probabilities you got from the simulation with the probability you calculated directly (Q7, parts a and b). What can you conclude?
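
For the direct calculation in question 7, a sketch along the lines of the part 4 answer might look like the following. It assumes that the “Data:” paragraph above (610, 580, and 330 people; 92%, 48%, and 67% like rates) is the California sample, which the text implies but does not state. The `CA`-prefixed variable names are hypothetical, chosen to mirror the New York ones.

```r
# Same arithmetic as the part 4 function, repeated so the snippet runs on its own
prob <- function(iphone_users_like, diff_phone_no_switch_like, diff_phone_switch_like) {
  total_likes <- iphone_users_like + diff_phone_no_switch_like + diff_phone_switch_like
  iphone_users_like / total_likes
}

# ASSUMPTION: these figures are the California data
CAiphone_users_like <- 0.92 * 610
CAdiff_phone_no_switch_like <- 0.48 * 580
CAdiff_phone_switch_like <- 0.67 * 330

ca_direct <- prob(CAiphone_users_like, CAdiff_phone_no_switch_like, CAdiff_phone_switch_like)
ca_direct
```

Under that assumption the direct value is roughly 0.529, which is the number the simulated estimates from part a would be expected to approach as the number of iterations grows.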
