1. Problem Description:
You are given serveral tables as follows:
- Table:
Movies
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| title | varchar |
+---------------+---------+
movie_id is the primary key for this table.
title is the name of the movie.
- Table:
Users
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| name | varchar |
+---------------+---------+
user_id is the primary key for this table.
- Table:
Movie_Rating
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| user_id | int |
| rating | int |
| created_at | date |
+---------------+---------+
(movie_id, user_id) is the primary key for this table.
This table contains the rating of a movie by a user in their review.
created_at is the user ' s review date.
Your task is to write the Query Scripts to complete two subtasks:
Find the name of the user who has rated the greatest number of the movies.
In case of a tie, return lexicographically smaller user name.Find the movie name with the highest average rating in February 2020.
In case of a tie, return lexicographically smaller movie name.
Example
Movies table:
+-------------+--------------+
| movie_id | title |
+-------------+--------------+
| 1 | Avengers |
| 2 | Frozen 2 |
| 3 | Joker |
+-------------+--------------+
Users table:
+-------------+--------------+
| user_id | name |
+-------------+--------------+
| 1 | Daniel |
| 2 | Monica |
| 3 | Maria |
| 4 | James |
+-------------+--------------+
Movie_Rating table:
+-------------+--------------+--------------+-------------+
| movie_id | user_id | rating | created_at |
+-------------+--------------+--------------+-------------+
| 1 | 1 | 3 | 2020-01-12 |
| 1 | 2 | 4 | 2020-02-11 |
| 1 | 3 | 2 | 2020-02-12 |
| 1 | 4 | 1 | 2020-01-01 |
| 2 | 1 | 5 | 2020-02-17 |
| 2 | 2 | 2 | 2020-02-01 |
| 2 | 3 | 2 | 2020-03-01 |
| 3 | 1 | 3 | 2020-02-22 |
| 3 | 2 | 4 | 2020-02-25 |
+-------------+--------------+--------------+-------------+
Result table:
+--------------+
| results |
+--------------+
| Daniel |
| Frozen 2 |
+--------------+
Daniel and Maria have rated 3 movies (“Avengers”, “Frozen 2” and “Joker”) but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is smaller lexicographically.
Prepared SQL KnowLedge
Since we need to write two indepent qeury and union each result to get the result table. In SQL, There are two union operators,which are
UNION
andUNION ALL
, They all combines the result set of two or more SELECT statements. However,UNION ALL
allows duplicates rows, whileUNION
will only contains distinct values. ReferenceIn this subtask 1, we should first sort by the number of ratings and if any two are the same, we should sort by its user name lexicographically. So, In order to sort in this way, Order cause should be written as follows:
# ORDER BY column1 DESC, column2
This sorts everything by column1 (descending) first,
and then by column2 (ascending, which is the default)
whenever the column1 fields for two or more rows are equal.
Solutions:
- For the first task, we could
GROUP BY
the user_id, andCOUNT
the number of movies (represented by movie_id) which the user rated. Sort by the number, and its lexicographically smaller user name. USELIMIT
to get the first record.
Query script for your reference.
SELECT u.name, COUNT(m.movie_id) as max_rating_num
FROM Movie_Rating AS m
LEFT JOIN Users AS u
ON m.user_id = u.user_id
GROUP BY m.user_id
ORDER BY max_rating_num DESC,u.name
LIMIT 1
- For the second task, its quite similar to the first query. Notice that Since what we really concern about is the ratings in February 2020. So Before we group the rows and compute the
AVE
, we should filter the dta in February 2020 byWHERE
CAUSE. Still, the moving_rating table is the primary table, so we use the left join with the movie table to get its name.
Query script for your reference.
SELECT m.title, AVG(r.rating) as movie_rating
FROM Movie_Rating AS r
LEFT JOIN Movies AS m
ON m.movie_id = r.movie_id
WHERE MONTH(r.created_at) = 2 AND YEAR(r.created_at) = 2020
GROUP BY r.movie_id
ORDER BY movie_rating DESC, m.title
LIMIT 1
Final Answer:
Combination with UNION
keyword.
(SELECT u.name AS results
FROM Movie_Rating AS m
LEFT JOIN Users AS u
ON m.user_id = u.user_id
GROUP BY m.user_id
ORDER BY COUNT(m.movie_id) DESC,u.name
LIMIT 1)
UNION
(SELECT m.title AS results
FROM Movie_Rating AS r
LEFT JOIN Movies AS m
ON m.movie_id = r.movie_id
WHERE MONTH(r.created_at) = 2 AND YEAR(r.created_at) = 2020
GROUP BY r.movie_id
ORDER BY AVG(r.rating) DESC, m.title
LIMIT 1)