- Finding similarities between users 1, 2, and 3 based on the ratings entered into the system
Now that you have a grasp of what's happening in the `datacamp_application` database, let's go ahead and write up a query for that database.
The goal is to get a feeling for the data in this exercise. You'll get the rating data for three sample users and then use a predefined helper function, `print_user_comparison()`, to compare the sets of course ids these users rated.
- Complete the connection URI. The database is called `datacamp_application`. The host is `localhost` with port `5432`. The username is `repl` and the password is `password`.
- Select the ratings of users with id: `4387`, `18163` and `8770`.
- Fill in `print_user_comparison()` with the three users you selected.
# Complete the connection URI
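The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the exercise's own solution: the URI follows the common `dialect://user:password@host:port/database` layout, an in-memory SQLite table stands in for the PostgreSQL database so the snippet runs anywhere, the column name `user_id` and the sample rows are made up, and `user_overlaps()` is a hypothetical substitute for `print_user_comparison()`, whose real implementation is not shown here.

```python
import sqlite3
from urllib.parse import urlsplit

# Connection URI built from the values listed in the instructions.
connection_uri = "postgresql://repl:password@localhost:5432/datacamp_application"
parts = urlsplit(connection_uri)
print(parts.username, parts.hostname, parts.port, parts.path.lstrip("/"))

# Stand-in for the PostgreSQL database: an in-memory SQLite table
# holding a few made-up ratings (not the real course data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (user_id INTEGER, course_id INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO rating VALUES (?, ?, ?)",
    [(4387, 6, 5), (4387, 7, 4), (18163, 6, 3), (18163, 7, 5), (8770, 81, 4)],
)


def rated_course_ids(user_id):
    """Return the set of course ids that one user has rated."""
    rows = conn.execute(
        "SELECT course_id FROM rating WHERE user_id = ?", (user_id,)
    )
    return {course_id for (course_id,) in rows}


def user_overlaps(user1, user2, user3):
    """Hypothetical comparison helper: pairwise overlap of three sets."""
    return {
        "User 1 and User 2": user1 & user2,
        "User 1 and User 3": user1 & user3,
        "User 2 and User 3": user2 & user3,
    }


overlaps = user_overlaps(*(rated_course_ids(u) for u in (4387, 18163, 8770)))
for pair, shared in overlaps.items():
    print(f"{pair} overlap: {shared}")
```

Against the real database, the same queries would go through an engine created from `connection_uri` rather than the SQLite stand-in.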
Course id overlap between users:
================================
User 1 and User 2 overlap: {32, 96, 36, 6, 7, 44, 95}
User 1 and User 3 overlap: set()
User 2 and User 3 overlap: set()
In this exercise, you'll complete a transformation function `transform_avg_rating()` that aggregates the rating data using the pandas DataFrame's `.groupby()` method. The goal is to get a DataFrame with two columns, a course id and its average rating:
course_id | avg_rating |
---|---|
123 | 4.72 |
111 | 4.62 |
… | … |
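To make the target shape concrete, here is a small sketch of the aggregation on made-up data. The sample values are invented, and naming the result column `avg_rating` through `reset_index(name=...)` is one possible way to match the table above, not necessarily what the exercise expects.

```python
import pandas as pd

# Made-up ratings; the real rows would come from the rating table.
ratings = pd.DataFrame({
    "course_id": [123, 123, 111, 111],
    "rating": [5, 4, 4, 5],
})

# Group by course, take the mean of the rating column, then turn the
# resulting Series back into a two-column DataFrame.
avg_rating = (
    ratings.groupby("course_id")["rating"]
    .mean()
    .reset_index(name="avg_rating")
)
print(avg_rating)
```

Note that `.groupby()` sorts the group keys by default, so the rows come back ordered by `course_id` rather than by rating.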
You'll complete this transformation function and apply it to raw rating data extracted via the helper function `extract_rating_data()`, which extracts course ratings from the `rating` table.
- Complete the `transform_avg_rating()` function by grouping by the `course_id` column, and taking the mean of the `rating` column.
- Use `extract_rating_data()` to extract raw ratings data. It takes in as argument the database engine `db_engines`.
- Use `transform_avg_rating()` on the raw rating data you've extracted.
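Put together, the extract and transform steps might look like the sketch below. Several parts are assumptions made for illustration: `db_engines` is treated as a dict keyed by database name, an in-memory SQLite connection with invented rows stands in for the real engine, and the body of `extract_rating_data()` is a guess at what the provided helper does.

```python
import sqlite3

import pandas as pd


def extract_rating_data(db_engines):
    """Hypothetical stand-in: read the whole rating table into a DataFrame."""
    return pd.read_sql("SELECT * FROM rating", db_engines["datacamp_application"])


def transform_avg_rating(rating_data):
    """Group by course_id and take the mean of the rating column."""
    avg_rating = rating_data.groupby("course_id")["rating"].mean()
    return avg_rating.reset_index(name="avg_rating")


# Stand-in database with made-up rows (assumed columns: user_id, course_id, rating).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (user_id INTEGER, course_id INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO rating VALUES (?, ?, ?)",
    [(1, 111, 4), (2, 111, 5), (1, 123, 5), (3, 123, 5)],
)
db_engines = {"datacamp_application": conn}

# Extract the raw ratings, then aggregate them per course.
rating_data = extract_rating_data(db_engines)
avg_rating_data = transform_avg_rating(rating_data)
print(avg_rating_data)
```

Keeping extraction and transformation in separate functions means the aggregation can be checked on a small hand-made DataFrame without touching the database.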