P roblem 1:
In this problem, you have to write SQL and RA queries for an online music app database, similar to Spotify. Here are the tables:
User (u id, uname, profiletext)
Artist (artistId, artistName, nationality) Album (albumId, albumName, releaseTime)
Song (sid, sname, artistId, language, style, duration, releaseTime)
Playlist (playlistId, playlistName, uid, createTime, description) SongInAlbum (s id, albumId)
SongInPlaylist (sid, playistId, addTime)
PlayHistory (uid, sid, startTime, endTime)
The music streaming app provides various songs in the system, and users can listen to songs using the app. They can also create their own playlists with name and description. After a playlist is created, one or more songs can be added to it. When a user listens to one song at some time, the system will keep a record of the play history, including the user, the song, start time, and end time (since the user may stop listening early). Every song has a unique id, a name, a creator, language, style, duration, and release date. One song is released one time, but can be added to multiple albums. Those albums may have different names and may be released at different dates. So the release time for the song and the album can be different. Also a song may not belong to any album. But an album must contain at least one song.
Note that there could be many different songs with the same title (sname) but different sids. These could be different recordings of the same song by the same artist, or other artists recording the same song, or completely different songs that have the same title — we do not keep track of which case it is. Also, each record in song has only one style, e.g., “Classic”, “Pop”, “Rock”, “Jazz”, etc.
- Draw an ER diagram that models this relational schema. Identify any weak entities, and the cardinalities of all the
- Create the above schema in a database system, choose appropriate attributes types, and define primary keys, foreign keys and other constraints (you may use any relational database system, as long as it supports basic SQL, views, and some sort of triggers).
Write the create table statements and describe how you loaded the data into the database. You can find the .csv file under the Resources tab on NYU classes.
- Write the following SQL queries and execute them on your database. Show the queries and the results:
- List the id and name of any song that does not belong to an
- List the id and name of every song that is saved in a playlists where the playlist contains the word “sleep” (only output each song once).
- List the id and name of any user who has played at least one song created by a French
- List any multi-language album with album name and release
- For each song, list the sid, sname, artist name, and the number of times it was played during 2020. The result should be sorted in descending order based on how often a song has been played (startTime should be in the year 2020).
- Output the id, name, and nationality of the artist who released the largest number of songs of style “Jazz” during
- Output the id and name of any user who has played every song by Justin Bieber at least
- List the id, and name of any album that contains at least ten songs of style “Folk”.
- Write expressions in Relational Algebra for queries (iv) to (viii).
- Write SQL statements to perform the following updates to the database:
- The user with id 3 adds the song named “Hold On” from playlist “Motivation” to playlist “Rock” (assume there is only one such song).
- The user with id 5 deletes all his(/her) play history within the last 30
- The user with id 10 creates a playlist named “Reading” with the description “Soft piano music to accompany reading”.
P roblem 2:
In this problem, you need to create views and triggers given the relational schema about the grocery purchase database given below. You do not need to consider returns or exchange of items in this grocery, and we do not store any information about the customers.
Item (iid, item_name, category, price, origin, description)
Purchase (p id, p_timestamp, total_amount, payment_method, card_number, status) Purchase_items (pid, iid, quantity, total_price)
Some explanation for the schema:
For each item, we have a unique id, the item name, the category of the item (e.g. food, furniture, snack…), the price (e.g. $3 for each pound of pork, $10 for one chair), the origin country, and some description of this item. For each purchase, we record the unique purchase id, when this purchase happened, the total amount of the purchase including tax, the payment_method (e.g.
credit card, debit, cash…), the card number if using a credit or debit card, and the status (e.g. fail, pending, finished). For each purchase, we also need to record which and how many items are included in this purchase. To do so, we record the purchase id, the item id, the count of items purchased, and the price of all items in another table Purchase_items.
- Define a view that stores, for each id of item, the item id, the date, and the total count of items purchased on that
- Using this view, output for each month in 2020, the id of the item that was purchased the most during that month (if you buy the same item five times in one purchase, it should be treated as 5, not 1).
- Write a trigger that rejects any insertion of a new purchase with a credit card for which there is either a failed or at least 3 pending purchases in the
- Create a new table “star_item_each_day”, containing the attributes date, iid, item name and total purchase value, for the item(s) that had the highest sales (in $) on that date. Write a query to create and initialize this table given the current database. Then write a trigger that keeps this table up to date as new measurements are inserted into the database. If there is a tie between several items on a given day, you need to output several iids and item_names. Also, keep in mind that when a new day arrives, you may have to insert a new record into the table on the first purchase that
P roblem 3:
Although you always wanted to be a professional gamer, somehow you followed your other dream and ended up becoming an expert in databases. Your old love is still there, so you decide to build a Esports website for game lovers. The core of this website is a database with a schema that captures all the information about Esports.
- The database should keep information about video game companies that develop games, such as their unique names, the dates they were founded, headquarters,
- You will also include some hot Esports games developed by these companies in your database. Each game is uniquely identified by its name, and you also want to record the developer company, release date, and genre (MOBA, fighting, shooting, etc). Each game has exactly one developer company, and the developer company must also be in your
- You personally are a huge fan of several professional players, so you want to add them into your database as well. Each player is uniquely identified by an ID, and their names, nationality, and date of birth must be recorded. To be included in our database, a player must play at least one game that is contained in our dataset. Also, for every game that a player plays, there must be exactly one team that the player represents for that
- Every gaming team has a unique name, and you also want to record its founding date, where it is based, the owner of the team (you can assume that the owner is not a
player). A gaming team could be competing in multiple games. E.g., Team Liquid
competes in games such as Hearthstone, Rocket League, Super Smash Bros., etc.
- Some of the players might be playing multiple games and competing for multiple teams at the same time. For example, the player Daryl Koh Pei Xiang, also known as iceiceice, plays StarCraft II for the team FXOpen e-Sports. He’s also competing in the game Dota2 representing the team Evil Geniuses. Your database should keep track of the team that the player currently plays for, what game the player represents the team to play, his/her role on that team such as “support” or “carry”.
- Each gaming team will participate in various tournaments, and your database should store information about them. A tournament represents a competition between teams playing the same game, and is associated with one game only. Each tournament can be uniquely identified by the name and the season, and you should also record the game of the tournament, start date, location that it is being held at, prize pool (the total prize money). You should also store the result of the tournaments by recording the top 3 winning teams, their ranks, and the prize that they won (e.g. 1 million dollars).
- Since this is an initial draft of your database design, you decide to make it simple. Therefore, you do not care about the individual match in the tournament, but only how many teams are competing in a tournament, and the final result of that
- Design a database for the above scenario using the ER model. Draw the ER diagram, show the cardinalities of all relationships, and identify primary keys and any weak entities. List any assumptions you have as
- Convert your ER diagram into a relational schema. Identify all tables, attributes, primary keys, and foreign
- Write SQL queries for the following questions. If you cannot answer a query using your schema, then you have to modify your solutions in (a) and (b)
- For each game developer company, list their name together with the number of the games they have developed for each genre. The result should be first sorted in ascending order by company name, and if there is a tie, break the tie by sorting by the number of games for each genre in descending order. (output schema: company name, game genre, count number)
- For each team that plays the game PUBG, output the average age of the team members that play the role “Attacker”.
- For each team that plays the game Dota2, but has never participated in any Dota2
tournament in 2020, output the name of all its team members without duplicates.
- Output the name of the gaming team(s) that has ranked in the top 3 for every single tournament it has participated in during
- Create tables in the database system, and insert some sample data (5-10 tuples per table, but choose an interesting and meaningful data set, so that queries do not output empty results). Submit screenshots of what your tables look like after you inserted the
data. Then execute the queries in (c) and submit the screenshots of the queries and outputs.