Published September 15, 2020 | Version v1
Dataset Open

User Groups for Robustness of Meta Matrix Factorization Against Decreasing Privacy Budgets

  • 1. Know-Center GmbH
  • 2. Graz University of Technology

Description

This dataset comprises subsets of rating data from five datasets: Douban [1], Hetrec-MovieLens [2], MovieLens 1M [3], Ciao [4], and Jester [5]. Each subset contains rating data from three distinct user groups: users with few ratings (low), users with a medium number of ratings (med), and users with many ratings (high). Each row in the user files contains a user's id and that user's number of ratings. Each row in the ratings files has the format (user_id, item_id, rating). For more details, we refer to our publication at https://rd.springer.com/chapter/10.1007/978-3-030-72240-1_8.
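As a minimal sketch of how rows in the (user_id, item_id, rating) format could be read and aggregated, the following Python snippet parses a few in-memory sample rows and counts ratings per user. The comma delimiter, the absence of a header row, the sample values, and the helper names `read_ratings` and `ratings_per_user` are assumptions made for illustration; they are not taken from the archive or from the authors' code, so check the actual files before relying on them.

```python
import csv
import io
from collections import Counter


def read_ratings(handle, delimiter=","):
    """Yield (user_id, item_id, rating) tuples from an open text handle.

    Assumes one rating per row, no header row, and the given delimiter.
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        user_id, item_id, rating = row
        yield user_id.strip(), item_id.strip(), float(rating)


def ratings_per_user(ratings):
    """Count how many ratings each user contributed."""
    return Counter(user_id for user_id, _, _ in ratings)


# Hypothetical sample rows standing in for a ratings file.
sample = io.StringIO("1,10,4.0\n1,11,3.5\n2,10,5.0\n")
ratings = list(read_ratings(sample))
counts = ratings_per_user(ratings)
```

The per-user counts computed this way correspond to the second column described for the user files, so they can serve as a consistency check between a ratings file and its user file.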

Douban
* 375 users (i.e., 125 users per user group)
* 32,191 items
* 266,517 ratings

Hetrec-MovieLens
* 318 users (i.e., 106 users per user group)
* 9,553 items
* 207,943 ratings

MovieLens 1M
* 906 users (i.e., 302 users per user group)
* 3,613 items
* 275,119 ratings

Ciao
* 1,107 users (i.e., 369 users per user group)
* 60,132 items
* 107,807 ratings

Jester
* 11,013 users (i.e., 3,671 users per user group)
* 100 items
* 618,768 ratings
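
The statistics listed above can be cross-checked and summarized with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the lists above, while the `stats` dictionary layout and the `summarize` helper are hypothetical names introduced here. Density is computed as the number of ratings divided by users times items.

```python
# (users, users per group, items, ratings), copied from the lists above.
stats = {
    "Douban": (375, 125, 32191, 266517),
    "Hetrec-MovieLens": (318, 106, 9553, 207943),
    "MovieLens 1M": (906, 302, 3613, 275119),
    "Ciao": (1107, 369, 60132, 107807),
    "Jester": (11013, 3671, 100, 618768),
}


def summarize(users, items, ratings):
    """Return average ratings per user and rating-matrix density."""
    return {
        "ratings_per_user": ratings / users,
        "density": ratings / (users * items),
    }


summary = {}
for name, (users, per_group, items, ratings) in stats.items():
    # Each subset consists of three equally sized user groups.
    assert users == 3 * per_group
    summary[name] = summarize(users, items, ratings)

for name, values in summary.items():
    print(f"{name}: {values['ratings_per_user']:.1f} ratings/user, "
          f"density {values['density']:.4f}")
```

Because Jester has only 100 items, its rating matrix is far denser than those of the other four subsets, which is worth keeping in mind when comparing results across datasets.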

The Python code for generating and using this dataset can be found at https://github.com/pmuellner/RobustnessOfMetaMF.

This work is supported by the H2020 project TRUSTS (GA: 871481) and the "DDAI" COMET Module within the COMET – Competence Centers for Excellent Technologies Programme, funded by the Austrian Federal Ministry for Transport, Innovation and Technology (bmvit), the Austrian Federal Ministry for Digital and Economic Affairs (bmdw), the Austrian Research Promotion Agency (FFG), the province of Styria (SFG) and partners from industry and academia. The COMET Programme is managed by FFG.

[1] Hu, L., Sun, A., Liu, Y.: Your neighbors affect your ratings: on geographical neighborhood influence to rating prediction. In: SIGIR’14 (2014)
[2] Cantador, I., Brusilovsky, P., Kuflik, T.: Second international workshop on information heterogeneity and fusion in recommender systems (hetrec2011). In: RecSys’11 (2011)
[3] Harper, F. M., Konstan, J. A.: The movielens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TIIS) 5(4), 1–19 (2015)
[4] Guo, G., Zhang, J., Thalmann, D., Yorke-Smith, N.: Etaf: An extended trust antecedents framework for trust prediction. In: ASONAM’14 (2014)
[5] Goldberg, K., Roeder, T., Gupta, D., Perkins, C.: Eigentaste: A constant time collaborative filtering algorithm. Information Retrieval 4(2), 133–151 (2001)

Files

User Groups.zip

Files (10.5 MB)

User Groups.zip, 10.5 MB, md5:b2451356dd5480f0a2239acf83ba0712