Interview Question
Data Scientist Interview
-
MetaGiven the following tables how would you know who has the most friends REQUESTS date | sender_id | accepter_id ACCEPTED accepted_at | accepter_id | sender_id
Interview Answers
14 Answers
Since if two people become friends, the request has to be accepted. We may use the accepted table only for the question how many friends each id has. However, one person can either send or accept friend, we will need to remove the duplication. select a.accepter_id, count(*) as cnt from (select distinct accepter_id, send_id from accepted union select distinct send_id as accepter_id, accepter_id as send_id from accepted ) a group by accpeter_id order by cnt limit 1;
guest on
In the vein of answers 6 and 7: SELECT a.user, COUNT(DISTINCT a.friend) AS friend_count FROM ( (SELECT accepter_id AS user, sender_id AS friend FROM ACCEPTED) UNION (SELECT sender_id AS user, acceptor_id AS friend FROM ACCEPTED) ) a GROUP BY a.user ORDER BY friend_count LIMIT 1;
Anonymous on
I think this would be simpler select requester_id from request_accepted union all select accepter_id from request_accepted) t group by 1 order by count desc limit 1
Anonymous on
Believe answer number 6 is correct, but, not sure whether or not you need to count distinct, as the union should remove duplicates for you. Union all would leave duplicates, but simple union removes them for you.
DG on
Explaining the ^ code. We want 2 account for 2 cases 1) Where the sender in the request table is the sender in the accepted table 2) Where the sender in the request table is the accepter in the request table So this will both account for the cases where a person both sends and accepts friend requests Now we do a UNION ALL and not a UNION because if both queries have an ID appearing in both, we do not want to remove the duplicate cases as that may affect the count
Anonymous on
Reposting: We want 2 account for 2 cases 1) Where the sender in the request table is the sender in the accepted table 2) Where the sender in the request table is the accepter in the request table So this will both account for the cases where a person both sends and accepts friend requests Now we do a UNION ALL and not a UNION because if both queries have an ID appearing in both, we do not want to remove the duplicate cases as that may affect the count WITH CTE AS ( SELECT R.sender_id AS ID FROM REQUESTS R LEFT JOIN ACCEPTED A ON R.sender_id = A.sender_id WHERE accepted_at IS NOT NULL UNION ALL SELECT A.accepter_id AS ID FROM REQUESTS R LEFT JOIN ACCEPTED A ON R.sender_id = A.accepter_id WHERE accepted_at IS NOT NULL) SELECT ID, count(DISTINCT ID) As total_friends GROUP BY 1 ORDER BY 2 DESC LIMIT 1
Anonymous on
SELECT top 1 t.requester_id as id, sum(t.ct)as num FROM ( select requester_id, count(*) ct from request_accepted group by requester_id union all select accepter_id as requester_id, count(*) from request_accepted group by accepter_id )as t group by t.requester_id order by sum(t.ct) desc
MS SQL SERVER SOLUTION - Super Easy! on
A Pythonic answer. Again, using just the 'request_accepted' table (pd.DataFrame( np.concatenate( [request_accepted[['accepter_id','sender_id']], request_accepted[['sender_id','accepter_id']]] ) , columns=['user_id','friend_id'] ) .drop_duplicates() .groupby('user_id') .size() .sort_values(ascending=False) .to_frame() )
Richard on
select user_id , sum(num_firends) from (select sender_id as user_id, count(accepter_id) as num_friend from ACCEPTED group by sender_id union select accepter_id as user_id, count(sender_id) as num_friend from ACCEPTED group by accepter_id) T2 group by user_id order by sum(num_firends) desc ;
Titi on
You will need to do an self join (IMO) to count number of times a user was the accepter and number of times they were a sender
Anonymous on
Select distinct User_id From ( Select *, rank() over(partition by user_id order by friend_count desc) as friend_count_rank from ( Select distinct User_id, Count(distinct R.acceptor_id) + count(distinct A.sender_id) as friend_count From ( Select distinct sender_id as user_id from REQUESTS R UNION Select distinct accept_id as user_id from ACCEPTS A ) USERS Left join REQUESTS R on R.sender_id = users.user_id Left join ACCEPTS A on A.acceptor_id = users.user_id Where R.acceptor_id is not null And A.accept_date is not null Group by user_id ) Z ) Y Where friend_count_rank = 1
Trevor Daniel on
The above response seems overly complicated.... why can't you just: SELECT r.sender_id, count(a.accepter_id || a.sender_id) as count_friends FROM requests as r INNER JOIN accepted as a on r.sender_id = a.sender_id AND r.accepter_id = a.accepter_id order by 2 desc limit 1;
jc on
Try counting the number of times a user has sent and was accepted, then count the times the same user has accepted, then add them up. SELECT c.sender_id, SUM(c.accepter_sum,d.sender_sum) as total FROM (SELECT b.sender_id, COUNT(b.accepter_id) as accepter_sum FROM REQUESTS a RIGHT JOIN ACCEPTED b ON a.sender_id = sender_id GROUP BY b.sender_id) c FULL OUTER JOIN (SELECT accepter_id, COUNT(sender) as sender_sum FROM ACCEPTED GROUP BY sender_id) d ON c.sender_id = d.accepter_id ORDER BY SUM(c.accepter_sum,d.sender_sum) DESC
Jesse on
select sender_id, sum(s) as Total_Friends from ( select sender_id, count(accepter_id) as s from requests group by sender_id union all select accepter_id, count(sender_id) as s from accepted group by accepter_id ) as A group by sender_id
Mansi on