Description

Using the Cypher statements given in the next section, you will create, in Neo4j Desktop, a graph database describing a fictitious social network whose users share information about the books they have read. In this network, users can follow other users and rate the books they have read. The database contains information about the users (username and age), the books (id, title, genre, author and publisher), the "follows" relationships among the users, the "read" relationships between the users and the books, and the ratings that users have given to the books they have read.

For the first exercise, you will write Cypher statements that retrieve information from the database, as described in the exercise. For the second exercise, you will devise some simple recommendation algorithms that recommend users to follow, or books to read, to a user of the network, and implement these algorithms as Cypher statements.

Your report must contain:
– The Cypher statements that you created for Exercise 1.
– The Cypher statements that you created for Exercise 2, along with a brief description of the recommendation algorithms they implement.

Setup

Before you attempt the exercises, follow the steps below to create the graph database in Neo4j Desktop. Examine the csv files and the database that you created, to verify that the database has been built correctly and to familiarise yourself with its structure.

1. Using Neo4j Desktop, create a new graph database and name it Book Graph.
2. Download the files books.csv, users.csv, followers.csv and ratings.csv from Moodle and copy them to the import folder of your database.
3. Run the following Cypher statements (one at a time) to populate the database with data about books, the users of a social network, the relationships among the users, and the relationships between the users and the books.

a.
    LOAD CSV WITH HEADERS FROM 'file:///books.csv' AS book
    CREATE (:Book {bookID:book.BookId, title:book.Title, genre:book.Genre, author:book.Author, publisher:book.Publisher})

This should create 101 nodes with label Book, each with a bookID, a title, a genre, an author and a publisher property.

b.
    LOAD CSV WITH HEADERS FROM 'file:///users.csv' AS user
    CREATE (:User {username:user.Username, age:toFloat(user.Age)})

This should create 26 nodes with label User, each with a username and an age property.

c.
    LOAD CSV WITH HEADERS FROM 'file:///followers.csv' AS fol
    MATCH (u1:User {username:fol.User1}), (u2:User {username:fol.User2})
    CREATE (u1)-[:FOLLOWS]->(u2)

This should create 100 :FOLLOWS relationships among users.

d.
    LOAD CSV WITH HEADERS FROM 'file:///ratings.csv' AS rat
    MATCH (u:User {username:rat.User}), (b:Book {bookID:rat.Book})
    CREATE (u)-[:READ {rating:toInteger(rat.Rating)}]->(b)

This should create 199 :READ relationships between users and books, each with a rating property.

Exercise 1

Create Cypher queries to:

1. List the titles of the books that have been read both by Charles and by a user whose age is more than 20, and that have received a rating greater than 2 from both.
2. List the titles and authors of the books that have been published by MIT Press, Penguin, Springer or Wiley and whose genre is fiction, history, mathematics or economics. Show the results in alphabetical order of the titles.
3. For each pair of users such that one follows the other, list the titles and the publishers of the books that they have both read.
4. List the names of the users who follow Fiona and have read more than 10 books. For each such user, also show the number of books they have read.
5. List all publishers such that the average rating of the books they have published is higher than the average rating of the books published by Pearson.
6. List the nodes in the shortest path from Adam to Lilly.
7. Show the maximum distance from a user to a science book, where the distance from node A to node B is the length of the shortest path from A to B.
8. List the titles of the books for which the publisher is not known and, for each of them, the list of names of the users who have read it. For each such book, add the label UknownPublisher.
9. List the names of the users who are followed by Fiona and, if they have read any nonfiction books, the list of titles of those books.
10. A book is considered popular if it has been read by more than 4 users and has received at least two ratings that are greater than 3. List the titles of the popular books.

Exercise 2

1. Write down two algorithms that provide recommendations for users to follow, using the available information about users and books, and implement each of them as a Cypher statement. The statement should create new RECOMMENDED_USER relationships, each connecting a user with a recommended user to follow. The recommended users should not include those that the user already follows.
2. Write down two algorithms that provide recommendations for books, using the available information about users and books, and implement each of them as a Cypher statement. The statement should create new RECOMMENDED_BOOK relationships, each connecting a user with a recommended book. The recommended books should not include those that the user has already read.

Marking Criteria and Procedure

This set of exercises counts as 50% of the total course assessment. Exercise 1 is worth 30% (marks are divided equally among its subquestions) and Exercise 2 is worth 20% (marks are divided equally among its subquestions).

Marks will be awarded according to:
– whether the answers are technically correct (i.e. the syntax of the Cypher statements is correct and the statements produce the correct results)
– whether the answers given are as straightforward as possible and not more complicated than necessary
– whether the answers are set out clearly and in good style
– (for Exercise 2) whether the recommendation algorithms are clearly described and correctly implemented, and whether the recommendations they produce are reasonable
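
A note on the Setup step that asks you to verify the database: one possible way to do this, offered only as a sketch and not as part of the required answers, is a single query that counts each kind of node and relationship created in step 3. It assumes the labels and relationship types are exactly those used in the setup statements (Book, User, FOLLOWS, READ); the column names item and total are illustrative choices, not names taken from the brief.

```cypher
// Count each kind of node and relationship created by setup statements a-d.
// "item" and "total" are illustrative column names, not part of the brief.
MATCH (b:Book)
RETURN 'Book nodes' AS item, count(b) AS total
UNION ALL
MATCH (u:User)
RETURN 'User nodes' AS item, count(u) AS total
UNION ALL
MATCH (:User)-[f:FOLLOWS]->(:User)
RETURN 'FOLLOWS relationships' AS item, count(f) AS total
UNION ALL
MATCH (:User)-[r:READ]->(:Book)
RETURN 'READ relationships' AS item, count(r) AS total
```

If the import worked as the setup steps describe, the four rows should match the counts stated there: 101 Book nodes, 26 User nodes, 100 FOLLOWS relationships and 199 READ relationships. Larger counts usually mean that a setup statement was run more than once, because CREATE adds new nodes and relationships on every run.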