• DOI: 10.22146/IJCCS.62716
  • Corpus ID: 234087134

Recommendation System for Thesis Topics Using Content-based Filtering

  • Hans Satria Kusuma, Aina Musdholifah
  • Published in IJCCS (Indonesian Journal of… 31 January 2021
  • Computer Science, Education
  • IJCCS (Indonesian Journal of Computing and Cybernetics Systems)


Scientific paper recommendation systems: a literature review of recent publications

Christin Katharina Kreutz

1 Cologne University of Applied Sciences, Cologne, Germany

Ralf Schenkel

2 Trier University, Trier, Germany

Scientific writing builds upon already published papers. Manual identification of publications to read, cite or consider as related papers relies on a researcher’s ability to identify fitting keywords or initial papers from which a literature search can be started. The rapidly increasing number of papers has called for automatic measures to find the desired relevant publications, so-called paper recommendation systems. As the number of publications increases, so does the number of paper recommendation systems. Former literature reviews focused on discussing the general landscape of approaches throughout the years and highlighting the main directions. We refrain from this perspective; instead, we consider only a comparatively small time frame but analyse it fully. In this literature review we discuss the methods, datasets, evaluations and open challenges encountered in all works first released between January 2019 and October 2021. The goal of this survey is to provide a comprehensive and complete overview of current paper recommendation systems.

Introduction

The rapidly increasing number of publications leads to a large quantity of possibly relevant papers [ 6 ] for more specific tasks such as finding related papers [ 28 ], finding ones to read [ 109 ] or literature search in general to inspire new directions and understand the state-of-the-art approaches [ 46 ]. Overall, researchers typically spend a large amount of time searching for relevant related work [ 7 ]. Keyword-based search options are insufficient to find relevant papers [ 9 , 52 , 109 ], as they require some form of initial knowledge about a field. Oftentimes, users’ information needs are not explicitly specified [ 56 ], which impedes this task further.

To close this gap, a plethora of paper recommendation systems have been proposed recently [ 37 , 39 , 88 , 104 , 117 ]. These systems should fulfil different functions: for junior researchers, systems should recommend a broad variety of papers; for senior ones, the recommendations should align more with their already established interests [ 9 ] or help them discover relevant interdisciplinary research [ 100 ]. In general, paper recommendation approaches positively affect researchers’ professional lives as they enable finding relevant literature more easily and faster [ 50 ].

As there are many different approaches, their objectives and assumptions are also diverse. A simple problem definition of a paper recommendation system could be the following: given one paper, recommend a list of papers fitting the source paper [ 68 ]. This definition would not fit all approaches, as some specifically do not require any initial paper to be specified but instead observe a user as input [ 37 ]. Some systems recommend sets of publications fitting the queried terms only if these papers are all observed together [ 60 , 61 ], whereas most of the approaches suggest a number of single publications as their result [ 37 , 39 , 88 , 117 ], such that any single one of these papers satisfies the information need of a user fully. Most approaches assume that all required data to run a system is present already [ 37 , 117 ], but some works [ 39 , 88 ] explicitly crawl general publication information or even abstracts and keywords from the web.

In this literature review we observe papers recently published in the area of scientific paper recommendation between and including January 2019 and October 2021 1 . We strive to give comprehensive overviews on their utilised methods as well as their datasets, evaluation measures and open challenges of current approaches. Our contribution is fourfold:

  • We propose a multidimensional characterisation of current paper recommendation approaches.
  • We compile a list of recently used datasets in evaluations of paper recommendation approaches.
  • We compile a list of recently used evaluation measures for paper recommendation.
  • We analyse existing open challenges and identify current novel problems in paper recommendation which could be specifically helpful for future approaches to address.

In the following Sect.  2 we describe the general problem statement for paper recommendation systems before we dive into the literature review in Sect.  3 . Section  4 gives insight into datasets used in current work. In the following Sect.  5 different definitions of relevance, relevance assessment as well as evaluation measures are analysed. Open challenges and objectives are discussed in detail in Sect.  7 . Lastly Sect.  8 concludes this literature review.

Problem statement

Over the years, different formulations for a problem statement of a paper recommendation system have emerged. In general they should specify the input for the recommendation system, the type of recommendation results, the point in time when the recommendation will be made and which specific goal an approach tries to achieve. Additionally, the target audience should be specified.

As input we can either specify an initial paper [ 28 ], keywords [ 117 ], a user [ 37 ], a user and a paper [ 5 ] or more complex information such as user-constructed knowledge graphs [ 109 ]. Users can be modelled as a combination of features of papers they interacted with [ 19 , 21 ], e.g. their clicked [ 26 ] or authored publications [ 22 ]. Papers can, for example, be represented by their textual content [ 88 ].

As types of recommendation we could either specify single (independent) papers [ 37 ] or a set of papers which is to be observed completely to satisfy the information need [ 61 ]. A study by Beierle et al. [ 18 ] found that existing digital libraries recommend between three and ten single papers; in their case the optimal number of suggestions to display to users was five to six.

As for the point in time, most work focuses on immediate recommendation of papers. Only a few approaches also consider delayed suggestion 2 , for example via newsletter [ 56 ].

In general, recommended papers should be relevant in one way or another to achieve certain goals. The intended goal of authors of papers could, e.g., either be to recommend papers which should be read [ 109 ] by a user or to recommend papers which are simply somehow related to an initial paper [ 28 ], by topic, citations or user interactions.

Different target audiences, for example junior or senior researchers, have different demands from paper recommendation systems [ 9 ]. Usually paper recommendation approaches target single users, but there are also works which strive to recommend papers for sets of users [ 110 , 111 ].

Literature review

In this chapter we first clearly define the scope of our literature review (see Sect.  3.1 ) before we conduct a meta-analysis on the observed papers (see Sect.  3.2 ). Afterwards our categorisation or lack thereof is discussed in depth (see Sect.  3.3 ), before we give short overviews of all paper recommendation systems we found (see Sect.  3.5 ) and some other relevant related work (see Sect.  3.6 ).

To the best of our knowledge the literature reviews by Bai et al. [ 9 ], Li and Zou [ 58 ] and Shahid et al. [ 92 ] are the most recent ones targeting the domain of scientific paper recommendation systems. They were accepted for publication or published in 2019 so they only consider paper recommendation systems up until 2019 at most. We want to bridge the gap between papers published after their surveys were finalised and current work so we only focus on the discussion of publications which appeared between January 2019 and October 2021 when this literature search was conducted.

We conducted our literature search on the following digital libraries: ACM 3 , dblp 4 , GoogleScholar 5 and Springer 6 . Titles of considered publications had to contain either "paper", "article" or "publication" as well as some form of "recommend". Papers had to be written in English to be observed. We judged relevance of retrieved publications by observing titles, and additionally abstracts if the title alone did not suffice to assess their topical relevance. In addition to these papers found by systematically searching digital libraries, we also considered their referenced publications if they were from the specified time period and of topical fit. For all papers, their date of first publication determines their publication year, which decides if they lie in our observed time frame or not. For example, for journal articles we consider the point in time when they were first published online instead of the date on which they were published in an issue; for conference articles we consider the date of the conference instead of a later date when they were published online. Figure 1 depicts the PRISMA [ 79 ] workflow for this study.
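The inclusion criteria above can be sketched as a small filter. This is only an illustration of the stated rule, not the authors' actual search tooling: the function names are hypothetical, and treating plural forms such as "papers" as matches is an assumption.

```python
import re

# Assumed reading of the rule: the title must mention one of the document
# words (plural forms accepted as an assumption) AND some form of "recommend".
DOC_WORDS = re.compile(r"\b(paper|article|publication)s?\b", re.IGNORECASE)
RECOMMEND = re.compile(r"recommend", re.IGNORECASE)


def title_matches(title: str) -> bool:
    """Return True if the title satisfies both keyword conditions."""
    return bool(DOC_WORDS.search(title)) and bool(RECOMMEND.search(title))


def in_time_frame(year: int, month: int) -> bool:
    """January 2019 through October 2021, both inclusive."""
    return (2019, 1) <= (year, month) <= (2021, 10)
```

A title such as "News recommender system" would be rejected by `title_matches` because it names none of the three document words, which mirrors the exclusion of general news recommendation from the review.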

Figure 1: PRISMA workflow of our literature review process

We refrain from including works in our study which do not identify as scientific paper recommendation systems, such as Wikipedia article recommendation [ 70 , 78 , 85 ] or general news article recommendation [ 33 , 43 , 103 ]. Citation recommendation systems [ 72 , 90 , 124 ] are also out of scope of this literature review. Even though citation and paper recommendation can be regarded as analogous [ 45 ], we argue the differing functions of citations [ 34 ] and tasks of these recommendation systems [ 67 ] should not be mixed with the problem of paper recommendation. Färber and Jatowt [ 32 ] also support this view by stating that both are disjoint, with paper recommendation pursuing the goal of providing papers to read and investigate while incorporating user interaction data, and citation recommendation supporting users with finding citations for given text passages. 7 We also consciously refrain from discussing the plethora of more area-independent recommender systems which could be adapted to the domain of scientific paper recommendation.

Our literature research resulted in 82 relevant papers. Of these, three were review articles. We found 14 manuscripts which do not present paper recommendation systems but are relevant works for the area nonetheless; they are discussed in Sect. 3.6. This left 65 publications describing paper recommendation systems for us to analyse in the following.

Meta analysis

For papers within our scope, we consider their publication year as stated in the citation information for this meta-analysis. This could affect the publication year of papers compared to the former definition of which papers are included in this survey. For example, for journal articles we do not set the publication year as the point in time when they were first published online; instead, for consistency (this data is present in the citation information of papers), we use the year in which the issue containing the article was published. Of the 65 relevant system papers, 21 were published in 2019, 23 were published in 2020 and 21 were published in 2021. On average each paper has 4.0462 authors (std. dev. = 1.6955) and 12.4154 pages (std. dev. = 9.2402). 35 (53.85%) of the papers appeared as conference papers, 27 (41.54%) papers were published in journals and there were two preprints (3.08%) which have not yet been published otherwise. There was one master’s thesis (1.54%) within scope. The most common venues for publications are depicted in Table 1. Some papers [ 74 – 76 , 93 , 94 ] described the same approach without modification or extension of the actual paper recommendation methodology, e.g. by providing evaluations 8 . This left us with 62 different paper recommendation systems to discuss.
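The shares and totals reported above follow from simple arithmetic, which the following sketch reproduces. The grouping of [ 74 – 76 ] and [ 93 , 94 ] into two merged approaches is an assumption inferred from how these references are listed together; the variable names are illustrative.

```python
# Counts taken from the text above.
per_year = {2019: 21, 2020: 23, 2021: 21}
per_type = {"conference": 35, "journal": 27, "preprint": 2, "master's thesis": 1}

total = sum(per_year.values())

# Percentage share of each publication type, rounded to two decimals.
shares = {kind: round(100 * n / total, 2) for kind, n in per_type.items()}

# Assumption: [74-76] (3 papers) and [93, 94] (2 papers) each describe one
# approach, so each group counts once among the distinct systems.
duplicate_groups = [3, 2]
distinct_systems = total - sum(size - 1 for size in duplicate_groups)

print(total, shares, distinct_systems)
```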

Table 1: Most common venues where relevant papers were published, together with their type and number of papers (#p). Other venues had only one associated paper

Type        Venue           #p
Journal     IEEE Access     5
Journal     Scientometrics  2
Journal     PeerJ CS        2
Conference  WWW             2
Conference  ChineseCSCW     2
Conference  CSCWD           2

Categorisation

Former categorisation

The already mentioned three most recent [ 9 , 58 , 92 ] and one older but highly influential [ 16 ] literature reviews in scientific paper recommendation utilise different categorisations to group approaches. Beel et al. [ 16 ] categorise observed papers by their underlying recommendation principle into stereotyping, content-based filtering, collaborative filtering, co-occurrence, graph-based, global relevance and hybrid models. Bai et al. [ 9 ] only utilise the classes content-based filtering, collaborative filtering, graph-based methods, hybrid methods and other models. Li and Zou [ 58 ] use the categories content-based recommendation, hybrid recommendation, graph-based recommendation and recommendation based on deep learning. Shahid et al. [ 92 ] label approaches by the criterion they identify relevant papers with: content, metadata, collaborative filtering and citations.

The four predominant categories thus are content-based filtering, collaborative filtering, graph-based and hybrid systems. Most of these categories are defined precisely but graph-based approaches are not always characterised concisely: Content-based filtering (CBF) methods are said to be ones where user interest is inferred by observing their historic interactions with papers [ 9 , 16 , 58 ]. Recommendations are composed by observing features of papers and users [ 5 ]. In collaborative filtering (CF) systems the preferences of users similar to a current one are observed to identify likely relevant publications [ 9 , 16 , 58 ]. Current users’ past interactions need to be similar to similar users’ past interactions [ 9 , 16 ]. Hybrid approaches are ones which combine multiple types of recommendations [ 9 , 16 , 58 ].

Graph-based methods can be characterised in multiple ways. A very narrow definition only encompasses ones which observe the recommendation task as a link prediction problem or utilise random walk [ 5 ]. Another, less strict definition identifies these systems as ones which construct networks of papers and authors and then apply some graph algorithm to estimate relevance [ 9 ]. Another definition specifies this class as one using graph metrics such as random walk with restart, bibliographic coupling or co-citation inverse document frequency [ 106 ]. Li and Zou [ 58 ] abstain from clearly characterising this type of system directly but give examples which hint that, in their understanding of graph-based methods, some type of graph information, e.g. bibliographic coupling or co-citation strength, should be used somewhere in the recommendation process. Beel et al. [ 16 ] as well as Bai et al. [ 9 ] follow a similar line: they characterise graph-based methods broadly as ones which build upon the existing connections in a scientific context to construct a graph network.

When trying to classify approaches by their recommendation type, we encountered some problems:

Table 2: Labels with which works describe their own type of paper recommendation system, with an indication of whether the label is a commonly used one (c)

Work  Label  c
[ ]Knowledge-based
[ ]Hybrid
[ ]Deep learning-based
[ ]Unified model
[ ]Graph-based
[ ]User-specific
[ ]Hybrid
[ ]Graph-based
[ ]Active one-shot learning
[ ]Collaborative filtering
[ ]Hybrid
[ ]Hybrid
[ ]Hybrid
[ ]Hybrid
[ ]hybrid
[ ]Hybrid
[ ]Network-based
[ ]Content-based
[ ]Graph-based
[ ]Neuro-collaborative filtering
[ ]Meta-path based
[ ]Heterogeneous graph representation based
[ ]Social network-based
[ ]Hybrid
[ ]Content-based
[ – ]Content-based
[ ]Hybrid
[ ]Content-based
[ ]Collaborative filtering
[ ]Hybrid
[ , ]In-text citation frequencies-based
[ ]Hybrid
[ ]content-based
[ ]Hybrid
[ ]Graph-based
[ ]Hybrid
[ ]Knowledge-aware path recurrent network
[ ]Graph-based
[ ]Hybrid
[ ]Hybrid
[ ]Hybrid
[ ]Network
[ ]Hybrid
  • When considering the broadest definition of graph-based methods, many recent paper recommendation systems tend to belong to the class of hybrid methods. Most of the approaches [ 5 , 46 , 48 , 49 , 57 , 88 , 105 , 117 ] utilise some type of graph structure information as part of the approach, which would classify them as graph-based, but they also utilise historic user-interaction data or descriptions of paper features, which would render them either CF or CBF as well (see, e.g. Li et al. [ 57 ], who describe their approach as network-based while using a graph structure, textual components and user profiles).

Thus we argue the former categories do not suffice to classify the particularities of current approaches in a meaningful way. So instead, we introduce more dimensions by which systems could be grouped.

Current categorisation

Recent paper recommendation systems can be categorised in 20 different dimensions by general information on the approach (G), already existing data directly taken from the papers used (D) and methods which might create or (re-)structure data, which are part of the approach (M):

  • (G) Personalisation (person.): The approach produces personalised recommendations. The recommended items depend on the person using the approach; if personalisation is not considered, the recommendation solely depends on the input keywords or paper. This dimension is related to the existence of user profiles.
  • (G) Input: The approach requires some form of input, either a paper (p), keywords (k), user (u) or something else, e.g. an advanced type of input (o). Hybrid forms are also possible. In some cases the input is not clearly specified throughout the paper so it is unknown (?).
  • (D) Title: The approach utilises titles of papers.
  • (D) Abstract (abs.): The approach utilises abstracts of papers.
  • (D) Keyword (key.): The approach utilises keywords of papers. These keywords are usually explicitly defined by the authors of papers, contrasting key phrases.
  • (D) Text: The approach utilises some type of text of papers which is not clearly specified as titles, abstracts or keywords. In the evaluation this approach might utilise specified text fragments of publications.
  • (D) Citation (cit.): The approach utilises citation information, e.g. numbers of citations or co-references.
  • (D) Historic interaction (inter.): The approach uses some sort of historic user-interaction data, e.g. previously authored, cited or liked publications. An approach can only include historic user-interaction data if it also somehow contains user profiles.
  • (M) User profile (user): The approach constructs some sort of user profile or utilises profile information. Most approaches using personalisation also construct user profiles but some do not explicitly construct profiles but rather encode user information in the used structures.
  • (M) Popularity (popul.): The approach utilises some sort of popularity indication, e.g. CORE rank, numbers of citations 9 or number of likes.
  • (M) Key phrase (KP): The approach utilises key phrases. Key phrases are not explicitly provided by authors of papers but are usually computed from the titles and abstracts of papers to provide a descriptive summary, contrasting keywords of papers.
  • (M) Embedding (emb.): The approach utilises some sort of text or graph embedding technique, e.g. BERT or Doc2Vec.
  • (M) Topic model (TM): The approach utilises some sort of topic model, e.g. LDA.
  • (M) Knowledge graph (KG): The approach utilises or builds some sort of knowledge graph. This dimension surpasses the mere incorporation of a graph which describes a network of nodes and edges of different types. A knowledge graph is a sub-category of a graph.
  • (M) Graph: The approach actively builds or directly uses a graph structure, e.g. a knowledge graph or scientific heterogeneous network. Utilisation of a neural network is not considered in this dimension.
  • (M) Meta-path (path): The approach utilises meta-paths. They usually are composed from paths in a network.
  • (M) Random Walk (with Restart) (RW): The approach utilises Random Walk or Random Walk with Restart.
  • (M) Advanced machine learning (AML): The approach utilises some sort of advanced machine learning component in its core, such as a neural network. Utilisation of established embedding methods which themselves use neural networks (e.g. BERT) is not considered in this dimension. We do not consider traditional and simple ML techniques such as k-means in this dimension but rather mention methods explicitly defining a loss function, using multi-layer perceptrons or GCNs.
  • (M) Crawling (crawl.): The approach conducts some sort of web crawling step.
  • (M) Cosine similarity (cosine): The approach utilises cosine similarity at some point.
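To make the 20 dimensions listed above easier to work with, they could be encoded as a small record type. The following is a minimal sketch under stated assumptions: the class name `Characterisation`, its field names and the validation rules are hypothetical and not part of the survey; only the dimension abbreviations and input codes come from the list above.

```python
from dataclasses import dataclass

# Dimension abbreviations as introduced in the list above.
GENERAL = ("person.", "input")
DATA = ("title", "abs.", "key.", "text", "cit.", "inter.")
METHODS = ("user", "popul.", "KP", "emb.", "TM", "KG", "graph",
           "path", "RW", "AML", "crawl.", "cosine")
INPUT_CODES = set("pkuo")  # paper, keywords, user, other


@dataclass(frozen=True)
class Characterisation:
    """One system described along the general, data and method dimensions."""
    personalised: bool
    input_codes: str                  # e.g. "p", "ku", or "?" if unknown
    data: frozenset = frozenset()
    methods: frozenset = frozenset()

    def __post_init__(self):
        # Reject empty input codes and codes outside p/k/u/o (unless "?").
        if self.input_codes != "?":
            if not self.input_codes or not set(self.input_codes) <= INPUT_CODES:
                raise ValueError(f"unknown input codes: {self.input_codes!r}")
        # Reject dimension names that are not in the lists above.
        unknown = (self.data - set(DATA)) | (self.methods - set(METHODS))
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")


# A purely hypothetical system, not one of the surveyed approaches.
example = Characterisation(
    personalised=True,
    input_codes="pu",
    data=frozenset({"title", "abs."}),
    methods=frozenset({"user", "emb.", "cosine"}),
)
```

Keeping the data and method dimensions as sets makes it straightforward to group or filter systems, for instance by selecting all records whose `methods` contain both "graph" and "RW".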

Of the observed paper recommendation systems, six were general systems or methods which were only applied to the domain of paper recommendation [ 3 , 4 , 24 , 60 , 118 , 121 ]. Two targeted explicit set-based recommendation of publications where only all papers in the set together satisfy users’ information needs [ 60 , 61 ], and two recommend multiple papers [ 42 , 71 ] (e.g. on a path [ 42 ]); all the other approaches focused on recommendation of k single papers. Only two approaches focus on recommendation of papers to user groups instead of single users [ 110 , 111 ]. Only one paper [ 56 ] supports subscription-based recommendation of papers; all other approaches solely regarded a scenario in which papers were suggested straight away.

Table 3 classifies the observed approaches according to the dimensions discussed above.

Table 3: Indications whether works utilise the specific data or methods. Papers describing the same approach without extension of the methodology (e.g. only describing more details or an evaluation) are regarded in combination with each other

Work   General          Data                                      Methods
       Person. Input    Title Abs. Key. Text Citat. Inter.        User Popul. KP Emb. TM KG Graph Path RW AML Crawl. Cosine
[ ] u
[ ]p
[ ] u
[ ] u
[ ] pu
[ ] u
[ ] u
[ ] u
[ ]k
[ ] k
[ ] ku
[ ]u
[ ]p
[ ]p
[ ] pu
[ ] u
[ ]p
[ ]p
[ ] u
[ ]p
[ ]p
[ ]p
[ ]p
[ ]p
[ ] u
[ ] u
[ ] k
[ ] u
[ ]k
[ ]k
[ ]k
[ ] u
[ ] u
[ ] u
[ ] pu
[ ] u
[ ] p
[ – ] u
[ ] u
[ ]p
[ ]p
[ ]p
[ , ]p
[ ]p
[ ]p
[ ]k
[ ] u
[ ]p
[ ]p
[ ]p
[ ] u
[ ] ko
[ ] u
[ ] u
[ ] ku
[ ]pk
[ ]k
[ ] u
[ ] o
[ ]?
[ ]?
[ ] u

Comparison of paper recommendation systems in different categories

In this section, we describe the scientific directions associated with the categories presented in the previous section across the 65 relevant publications. We focus only on the methodological categories and describe how they are incorporated in the respective approaches.

User profile

32 approaches construct explicit user profiles. They utilise different components to describe users. We differentiate between profiles derived from user interactions and ones derived from papers.

Most user profiles are constructed from users’ actual interactions: unspecified historical interaction [ 30 , 37 , 56 , 57 , 64 , 118 ], the mean of the representation of interacted with papers [ 19 ], time decayed interaction behaviour [ 62 ], liked papers [ 69 , 123 ], bookmarked papers [ 84 , 119 ], read papers [ 111 , 113 ], rated papers [ 3 , 4 , 110 ], clicked on papers [ 24 , 26 , 49 ], categories of clicked papers [ 1 ], features of clicked papers [ 104 ], tweets [ 74 – 76 ], social interactions [ 65 ] and explicitly defined topics of interest tags [ 119 ].

Some approaches derived user profiles from users’ written papers: authored papers [ 5 , 21 , 22 , 55 , 63 , 74 – 76 , 116 ], a partitioning of authored papers [ 27 ], research fields of authored papers [ 41 ] and referenced papers [ 116 ].
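Two of the profile constructions named above, the mean of the representations of interacted-with papers and a time-decayed variant, can be sketched as follows. This is an illustration only: the exponential half-life weighting, the function names and the cosine-based ranking step are assumptions and do not reproduce any specific surveyed system.

```python
import math


def mean_profile(paper_vectors):
    """User profile as the element-wise mean of interacted-with paper vectors."""
    n = len(paper_vectors)
    dim = len(paper_vectors[0])
    return [sum(v[i] for v in paper_vectors) / n for i in range(dim)]


def decayed_profile(interactions, now, half_life):
    """Weighted mean where older interactions count less.

    interactions: list of (vector, timestamp) pairs.
    The half-life weighting is an assumed decay scheme.
    """
    weights = [0.5 ** ((now - t) / half_life) for _, t in interactions]
    total = sum(weights)
    dim = len(interactions[0][0])
    return [
        sum(w * vec[i] for w, (vec, _) in zip(weights, interactions)) / total
        for i in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(profile, candidates):
    """Order candidate paper ids by similarity to the profile, best first."""
    return sorted(candidates, key=lambda pid: cosine(profile, candidates[pid]),
                  reverse=True)
```

With a profile built from vectors `[1.0, 0.0]` and `[1.0, 1.0]`, a candidate pointing along the first axis ranks above one pointing along the second, since the profile itself leans towards the first axis.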

Popularity

We found 13 papers using some type of popularity measure. Those can be defined on authors, venues or papers.

For author-based popularity measures we found unspecified ones [ 65 ] such as authority [ 116 ] as well as ones regarding the citations an author received: citation count of papers [ 22 , 96 , 108 , 119 ], change in citation count [ 25 , 26 ], annual citation count [ 26 ], number of citations related to papers [ 59 ], h-index [ 26 ]. We found two definitions of author’s popularity using the graph structure of scholarly networks, namely the number of co-authors [ 41 ] and a person’s centrality [ 108 ].

For venue-based popularity measures, we found an unspecific reputation notion [ 116 ] as well as incorporation of the impact factor [ 26 , 117 ].

For paper-based popularity measures we encountered some citation-based definitions such as vitality [ 117 ], citation count of papers [ 22 ] and their centrality [ 96 ] in the citation network. Additionally, some approaches incorporated less formal interactions: number of downloads [ 56 ], social media mentions [ 119 ] and normalised number of bookmarks [ 84 ].

Key phrase

Only four papers use key phrases in some shape or form: Ahmad and Afzal [ 2 ] construct key terms from preprocessed titles and abstracts using tf-idf to represent papers. Collins and Beel [ 28 ] use the Distiller Framework [ 12 ] to extract uni-, bi- and tri-gram key phrase candidates from tokenised, part-of-speech tagged and stemmed titles and abstracts. Key phrase candidates were weighted and the top 20 represent candidate papers. Kang et al. [ 46 ] extract key phrases from CiteSeer to describe the diversity of recommended papers. Renuka et al. [ 86 ] apply rapid automatic keyword extraction.

In summary, different length key phrases usually get constructed from titles and abstracts with automatic methods such as tf-idf or the Distiller Framework to represent the most important content of publications.
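As a rough illustration of the tf-idf route mentioned above, the following sketch ranks the terms of each document by a plain tf-idf weight. It assumes whitespace tokenisation, unigram terms and an unsmoothed logarithmic idf; none of these choices are taken from the cited works, and the function name is hypothetical.

```python
import math
from collections import Counter


def tfidf_key_terms(docs, top_k=3):
    """Return the top_k highest-weighted terms for each document.

    docs: list of strings, e.g. concatenated titles and abstracts.
    Ties are broken alphabetically so the output is deterministic.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenised for term in set(tokens))
    result = []
    for tokens in tokenised:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result.append(ranked[:top_k])
    return result
```

Terms that occur in every document receive a weight of zero under this idf and therefore sink to the bottom of each ranking, which is the behaviour that makes tf-idf useful for picking out descriptive terms.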

Embedding

We found many approaches utilising some form of embedding based on existing document representation methods. We distinguish between embeddings of papers, embeddings of users and papers, and more sophisticated embeddings learned by the proposed approaches themselves.

The most common use of these methods was their application to papers: in an unspecified representation [ 30 , 119 ], Word2Vec [ 19 , 37 , 44 , 45 , 55 , 104 , 113 ], Word2Vec of LDA top words [ 24 , 107 ], Doc2Vec [ 21 , 28 , 48 , 62 , 63 , 107 ], Doc2Vec of word pairs [ 109 ], BERT [ 123 ] and SBERT [ 5 , 19 ]. Most times these approaches do not mention which part of the paper to use as input, but some specifically mention the following parts: titles [ 37 ], titles and abstracts [ 28 , 45 ], titles, abstracts and bodies [ 48 ], keywords and paper [ 119 ].

Few approaches observed user profiles and papers; here Word2Vec [ 21 ] and NPLM [ 29 ] embeddings were used.

Several approaches embed the information in their own model embedding: a heterogeneous information network [ 5 ], a two-layer NN [ 37 ], a scientific social reference network [ 41 ], the TransE model [ 56 ], node embeddings [ 63 ], paper, author and venue embedding [ 116 ], user and item embedding [ 118 ], a GRU and association rule mining model [ 71 ], a GCN embedding of users [ 104 ] and an LSTM model [ 113 ].

Topic model

Eight approaches use some topic modelling component. Most of them use LDA to represent papers’ content [ 3 , 5 , 24 , 27 , 107 , 117 ]. Only two of them do not follow this method: Subathra and Kumar [ 98 ] use LDA on papers to find their top n words, then they use LDA again on these words’ Wikipedia articles. Xie et al. [ 115 ] use a hierarchical LDA adaptation on papers, which introduces a discipline classification.

Knowledge graph

Only six of the observed papers incorporate knowledge graphs. Only one uses a predefined one, the Watson for Genomics knowledge graph [ 95 ]. Most of the approaches build their own knowledge graphs, and only one asks users to construct the graphs: Wang et al. [ 109 ] build two knowledge graphs, one in-domain and one cross-domain graph. The graphs are user-constructed and include representative papers for the different concepts.

All other approaches do not rely on users building the knowledge graph: Afsar et al. [ 1 ] utilise an expert-built knowledge base as a source for their categorisation of papers, which are then recommended to users. Li et al. [ 56 ] employ a knowledge graph-based embedding of authors, keywords and venues. Tang et al. [ 104 ] link words with high tf-idf weights from papers to LOD and then merge this knowledge graph with the user-paper graph. Wang et al. [ 113 ] construct a knowledge graph consisting of users and papers.

Graph

In terms of graphs, we found 33 approaches explicitly mentioning the graph structure they were utilising. We can describe which graph structure is used and which algorithms or methods are applied on the graphs.

Of the observed approaches, most specify some form of (heterogeneous) graph structure. Only a few of them are unspecific and mention an undefined heterogeneous graph [ 63 – 65 ] or a multi-layer [ 48 ] graph. Most works clearly define the type of graph they are using: author-paper-venue-label-topic graph [ 5 ], author-paper-venue-keyword graph [ 56 , 57 ], paper-author graph [ 19 , 29 , 55 , 104 ], paper-topic graph [ 29 ], author-paper-venue graph [ 42 , 121 , 122 ], author graph [ 41 ], paper-paper graph [ 42 , 49 ], citation graph [ 2 , 44 – 46 , 88 , 89 , 106 , 108 , 117 ] or undirected citation graph [ 60 , 61 ]. Some approaches specifically mention usage of co-citations [ 2 , 45 ], bibliographic coupling or both [ 88 , 89 , 96 , 108 ].

As for algorithms or methods used on these graphs, we encountered usage of centrality measures in different graph types [ 41 , 96 , 108 ], knowledge graphs (see Sect. 3.4.6), meta-paths (see Sect. 3.4.8), random walks, e.g. in the form of PageRank or hubs and authorities (see Sect. 3.4.9), construction of Steiner trees [ 61 ], usage of the graph as input for a GCN [ 104 ], BFS [ 113 ], clustering [ 117 ] or calculation of a closeness degree [ 117 ].
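Co-citation and bibliographic coupling, both mentioned above as signals derived from citation graphs, can be computed directly from a mapping of papers to their reference lists. The sketch below uses plain pair counts as strengths; the function names and the input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def bibliographic_coupling(cites):
    """Number of shared references for each pair of citing papers.

    cites: dict mapping a paper id to the set of paper ids it references.
    Two papers are coupled if their reference lists overlap.
    """
    strength = {}
    for a, b in combinations(sorted(cites), 2):
        shared = len(cites[a] & cites[b])
        if shared:
            strength[(a, b)] = shared
    return strength


def co_citation(cites):
    """Number of papers that cite both members of each pair of cited papers."""
    counts = defaultdict(int)
    for refs in cites.values():
        for a, b in combinations(sorted(refs), 2):
            counts[(a, b)] += 1
    return dict(counts)
```

Bibliographic coupling looks backwards (what two papers cite in common), whereas co-citation looks forwards (how often two papers are cited together), so the two measures can disagree for the same pair.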

Meta-path

We found only four approaches incorporating meta-paths. Hua et al. [ 42 ] construct author-paper-author and author-paper-venue-paper-author paths by applying beam search. Papers on the most similar paths are recommended to users. Li et al. [ 57 ] construct meta-paths of a max length between users and papers and use random walk on these paths. Ma et al. [ 63 , 64 ] use meta-paths to measure the proximity between nodes in a graph.
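To illustrate what following a meta-path means, the sketch below counts instances of the author-paper-author path starting from one author. It is a simple exhaustive traversal under assumed input data, not the beam search or random walk procedures of the cited works, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def apa_neighbours(author_papers, author):
    """Count author-paper-author path instances starting at `author`.

    author_papers: dict mapping an author to the set of papers they wrote.
    Returns a Counter of reached authors; a higher count means more shared
    papers and hence more path instances connecting the two authors.
    """
    # Invert the mapping to get the authors of each paper.
    paper_authors = defaultdict(set)
    for a, papers in author_papers.items():
        for p in papers:
            paper_authors[p].add(a)

    counts = Counter()
    for p in author_papers.get(author, ()):
        for other in paper_authors[p]:
            if other != author:
                counts[other] += 1
    return counts
```

The number of path instances between two nodes is one simple way to express proximity along a meta-path; longer paths such as author-paper-venue-paper-author could be handled by chaining further lookup steps in the same manner.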

Random walk (with restart)

We found twelve approaches using some form of random walk in their methodology. We differentiate between ones using random walk, random walk with restart and algorithms using a random walk component.

Some methods use random walk on heterogeneous graphs [ 29 , 65 ] and weighted multi-layer graphs [ 48 ]. A few approaches use random walk to identify [ 42 , 57 ] or determine the proximity between [ 64 ] meta-paths.

Three approaches explicitly utilise random walk with restart. They determine similarity between papers [ 106 ], identify papers to recommend [ 44 ] or find most relevant papers in clusters [ 117 ].
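The general idea of random walk with restart can be sketched by power iteration: at every step the walker follows a random edge with probability 1 - restart and jumps back to the seed node otherwise. The sketch below is a generic textbook formulation under assumed parameters (restart probability, iteration count, toy graph), not the variant used by any of the cited approaches.

```python
def random_walk_with_restart(graph, seed, restart=0.15, iterations=100):
    """Approximate random-walk-with-restart scores by power iteration.

    graph maps every node to a list of out-neighbours. Probability mass
    that reaches a node without out-neighbours returns to the seed.
    """
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbours = graph[node]
            if neighbours:
                share = (1.0 - restart) * mass / len(neighbours)
                for neighbour in neighbours:
                    nxt[neighbour] += share
                nxt[seed] += restart * mass
            else:
                nxt[seed] += mass
        scores = nxt
    return scores


# Hypothetical undirected toy graph (each edge listed in both directions).
graph = {
    "p1": ["p2", "p3"],
    "p2": ["p1", "p3"],
    "p3": ["p1", "p2", "p4"],
    "p4": ["p3"],
}
rwr = random_walk_with_restart(graph, seed="p1")
ranking = sorted(rwr, key=rwr.get, reverse=True)
```

The resulting scores sum to one and can be read as proximity of each node to the seed paper.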

Some approaches use algorithms which incorporate a random walk component: PageRank [ 107 ] and the identification of hubs and authorities [ 122 ] with PageRank [ 121 ].
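For readers unfamiliar with hubs and authorities, the following sketch shows the classic iterative computation on a toy directed citation graph: a node's authority score sums the hub scores of nodes pointing to it, and its hub score sums the authority scores of nodes it points to. The edge list, the iteration count and the function name are assumptions for illustration only.

```python
import math


def hits(edges, iterations=50):
    """Compute hub and authority scores for directed (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum hub scores of all nodes linking to a node.
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {node: v / norm for node, v in auth.items()}
        # Hub update: sum authority scores of all nodes a node links to.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {node: v / norm for node, v in new_hub.items()}
    return hub, auth


# Hypothetical toy citation edges: (citing paper, cited paper).
edges = [("a", "c"), ("b", "c"), ("d", "c"), ("a", "e")]
hub, auth = hits(edges)
```

In this toy graph, paper c collects the most citations and therefore receives the highest authority score, while paper a cites the most papers and is the strongest hub.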

Advanced machine learning

29 approaches utilised some form of advanced machine learning. We encountered different methods being used and some papers specifically presenting novel machine learning models. All of these papers surpass mere usage of a topic model or typical pre-trained embedding method.

We found a multitude of machine learning methods being used, from multi-armed bandits [ 1 ], LSTM [ 24 , 37 , 113 ], multi-layer perceptrons [ 62 , 96 , 104 ], (bi-)GRU [ 37 , 69 , 71 , 123 ], matrix factorisation [ 4 , 62 , 69 , 110 , 111 ], gradient ascent or descent [ 41 , 57 , 63 , 116 ], some form of simple neural network [ 30 , 37 , 56 ], some form of graph neural network [ 19 , 49 , 104 ], autoencoder [ 4 ], neural collaborative filtering [ 62 ], learning methods [ 30 , 123 ] to DTW [ 48 ]. Three approaches ranked the papers to recommend [ 56 , 57 , 118 ] with, e.g. Bayesian Personalized Ranking. Two of the observed papers proposed topic modelling approaches [ 3 , 115 ].

Several papers proposed models: a bipartite network embedding [ 5 ], heterogeneous graph embeddings [ 29 , 42 , 48 , 63 ], a scientific social reference network [ 41 ], a paper-author-venue embedding [ 116 ] and a relation prediction model [ 64 ].

We found nine papers incorporating a crawling step as part of their approach. PDFs are oftentimes collected from CiteSeer [ 38 , 46 ] or CiteSeerX [ 2 , 93 , 94 ]; in some cases [ 39 , 88 , 110 ] the sources are not explicitly mentioned. Less frequently used data sources are Wikipedia for articles explaining the top words from papers [ 98 ] or papers from ACM, IEEE and EI [ 109 ]. Some approaches explicitly mention the extraction of citation information [ 2 , 38 , 39 , 46 , 88 , 93 , 94 ], e.g. to identify co-citations.

Cosine similarity

Some form of cosine similarity was encountered in most (31) paper recommendation approaches. It is often applied between papers, between users, between users and papers and in other forms.

For application between papers we encountered the possibility of using unspecified embeddings: unspecified word or vector representations of papers [ 30 , 48 , 107 , 110 ], papers’ key terms or top words [ 2 , 98 ] and key phrases [ 46 ]. We found some approaches using vector space model variants: unspecified [ 59 ], tf vectors [ 39 , 88 ], tf-idf vectors [ 42 , 95 , 111 ], dimensionality reduced tf-idf vectors [ 86 ] and lastly, tf-idf and entity embeddings [ 56 ]. Some approaches incorporated more advanced embedding techniques: SBERT embeddings [ 5 ], Doc2Vec embeddings [ 28 ], Doc2Vec embeddings with incorporation of their emotional score [ 109 ] and NPLM representations [ 29 ].
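As a minimal sketch of the most common setting above, the following code builds tf-idf vectors for a few toy papers and compares them with cosine similarity. The tokenisation, the specific tf-idf weighting (relative term frequency times the logarithm of the inverse document frequency) and the example titles are assumptions; the surveyed approaches use various weighting schemes and preprocessing steps.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (dicts) from tokenised documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if a norm is zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy papers, represented by whitespace-tokenised titles.
papers = [
    "graph neural network recommendation".split(),
    "graph embedding recommendation".split(),
    "protein folding simulation".split(),
]
vecs = tfidf_vectors(papers)
sim_01 = cosine(vecs[0], vecs[1])
sim_02 = cosine(vecs[0], vecs[2])
```

Papers sharing no terms obtain a similarity of zero, so the first paper is closer to the second than to the third.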

Cosine similarity was used between preferences or profiles of users and papers in the following ways: unspecified representations [ 63 , 84 , 113 , 115 ], Boolean representation of users and keywords [ 60 ], tf-idf vectors [ 21 , 74 – 76 ], cf-idf vectors [ 74 – 76 ] and hcf-idf vectors [ 74 – 76 ].

For application of cosine similarity between users, we found unspecified representations [ 41 ] and time-decayed Word2Vec embeddings of the keywords of users’ papers [ 55 ].

Other applications include the usage between input keywords and paper clusters [ 117 ] and between nodes in a graph represented by their neighbouring nodes [ 121 , 122 ].

Paper recommendation systems

The 65 relevant works identified in our literature search are described in this section. We deliberately refrain from trying to structure the section by classifying papers by an arbitrary dimension and instead point to Table 3, where readers can identify the dimensions they are interested in and use them to navigate the following short descriptions. The works are ordered by the surname of the first author and ascending publication year. An exception to this rule are papers presenting extensions of previous approaches with different first authors. These papers are placed directly after the approaches they extend.

Afsar et al. [ 1 ] propose KERS, a multi-armed bandit approach for patients to help with medical treatment decision making. It consists of two phases: first an exploration phase identifies categories users are implicitly interested in. This is supported by an expert-built knowledge base. Afterwards an exploitation phase takes place where articles from these categories are recommended until a user’s focus changes and another exploration phase is initiated. The authors strive to minimise the exploration efforts while maximising users’ satisfaction.

Ahmedi et al. [ 3 ] propose a personalised approach which can also be applied to more general recommendation scenarios which include user profiles. They utilise Collaborative Topic Regression to mine association rules from historic user interaction data.

Alfarhood and Cheng [ 4 ] introduce Collaborative Attentive Autoencoder, a deep learning-based model for general recommendation targeting the data sparsity problem. They apply probabilistic matrix factorisation while also utilising textual information to train a model which identifies latent factors in users and papers.

Ali et al. [ 5 ] construct PR-HNE, a personalised probabilistic paper recommendation model based on a joint representation of authors and publications. They utilise graph information such as citations as well as co-authorships, venue information and topical relevance to suggest papers. They apply SBERT and LDA to represent author embeddings and topic embeddings respectively.

Bereczki [ 19 ] models users and papers in a bipartite graph. Papers are represented by their contents’ Word2Vec or BERT embeddings, users’ vectors consist of representations of papers they interacted with. These vectors are then aggregated with simple graph convolution.

Bulut et al. [ 22 ] focus on current user interest in their approach which utilises k-Means and KNN. Users’ profiles are constructed from their authored papers. Recommended papers are the most highly cited ones from the cluster most similar to a user. In a subsequent work with an extended research group, they again address the same domain. Bulut et al. [ 21 ] again focus on users’ features. They represent users as the sum of features of their papers. These representations are then compared with all papers’ vector representations to find the most similar ones. Papers can be represented by TF-IDF, Word2Vec or Doc2Vec vectors.

Chaudhuri et al. [ 25 ] use indirect features derived from direct features of papers in addition to direct ones in their paper recommendation approach: keyword diversification, text complexity and citation analysis. In an extended group Chaudhuri et al. [ 26 ] later propose usage of more indirect features such as quality in paper recommendation. Users’ profiles are composed of their clicked papers. Subsequently they again worked on an approach in the same area but in a slightly smaller group. Chaudhuri et al. [ 24 ] propose the general Hybrid Topic Model and apply it on paper recommendation. It learns users’ preferences and intentions by combining LDA and Word2Vec. They compute user’s interest from probability distributions of words of clicked papers and dominant topics in publications.

Chen and Ban [ 27 ] introduce CPM, a recommendation model based on topically clustered user interests mined from their published papers. They derive user need models from these clusters by using LDA and pattern equivalence class mining. Candidate papers are then ranked against the user need models to identify the best-fitting suggestions.

Collins and Beel [ 28 ] propose the usage of their paper recommendation system Mr. DLib as a recommender-as-a-service. They compare representing papers via Doc2Vec with a key phrase-based recommender and TF-IDF vectors.

Du et al. [ 29 ] introduce HNPR, a heterogeneous network method using two different graphs. The approach incorporates citation information, co-author relations and research areas of publications. They apply random walk on the networks to generate vector representations of papers.

Du et al. [ 30 ] propose Polar++, a personalised active one-shot learning-based paper recommendation system where new users are presented articles to vote on before they obtain recommendations. The model trains a neural network by incorporating a matching score between a query article and the recommended articles as well as a personalisation score dependent on the user.

Guo et al. [ 37 ] recommend publications based on papers initially liked by a user. They learn semantics between titles and abstracts of papers on word- and sentence-level, e.g. with Word2Vec and LSTMs to represent user preferences.

Habib and Afzal [ 38 ] crawl full texts of papers from CiteSeer. They then apply bibliographic coupling between input papers and clusters of candidate papers to identify the most relevant recommendations. In a subsequent work Afzal again used a similar technique. Ahmad and Afzal [ 2 ] crawled papers from CiteSeerX. Cosine similarity of TF-IDF representations of key terms from titles and abstracts is combined with co-citation strength of paper pairs. This combined score then ranks the most relevant papers the highest.

Haruna et al. [ 39 ] incorporate paper-citation relations combined with contents of titles and abstracts of papers to recommend the most fitting publications for an input query corresponding to a paper.

Hu et al. [ 41 ] present ADRCR, a paper recommendation approach incorporating author-author and author-paper citation relationships as well as authors’ and papers’ authoritativeness. A network is built which uses citation information as weights. Matrix decomposition helps to learn the model.

Hua et al. [ 42 ] propose PAPR which recommends relevant paper sets as an ordered path. They strive to overcome recommendation merely based on similarity by observing topics in papers changing over time. They combine similarities of TF-IDF paper representations with random-walk on different scientific networks.

Jing and Yu [ 44 ] build a three-layer graph model which they traverse with random-walk with restart in an algorithm named PAFRWR. The graph model consists of one layer with citations between papers’ textual content represented via Word2Vec vectors, another layer modelling co-authorships between authors and the third layer encodes relationships between papers and topics contained in them.

Kanakia et al. [ 45 ] build their approach upon the MAG dataset and strive to overcome the common problems of scalability and cold-start. They combine TF-IDF and Word2Vec representations of the content with co-citations of papers to compute recommendations. Speedup is achieved by comparing papers to clusters of papers instead of all other single papers.

Kang et al. [ 46 ] crawl full texts of papers from CiteSeer and construct citation graphs to determine candidate papers. Then they compute a combination of section-based citation and key phrase similarity to rank recommendations.

Kong et al. [ 48 ] present VOPRec, a model combining textual components in form of Doc2vec and Paper2Vec paper representations with citation network information in form of Struc2vec. Those networks of papers connect the most similar publications based on text and structure. Random walk on these graphs contributes to the goal of learning vector representations.

L et al. [ 49 ] base their recommendation on recently accessed papers of users as they assume future accessed papers are similar to recently seen ones. They utilise a sliding window to generate sequences of papers. On those sequences they construct a GNN to aggregate neighbouring papers and identify users’ interests.

Li et al. [ 56 ] introduce a subscription-based approach which learns a mapping between users’ browsing history and their clicks in the recommendation mails. They learn a re-ranking of paper recommendations by using the papers’ metadata, recency, word representations and entity representations by knowledge graphs as input for a neural network. Their defined target audience are new users.

Li et al. [ 55 ] present HNTA, a paper recommendation method utilising heterogeneous networks and changing user interests. Paper similarities are calculated with Word2Vec representations of words recommended for each paper. Changing user interest is modelled with the help of an exponential time decay function on word vectors.
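The effect of an exponential time decay on vector representations can be sketched as follows: each vector is weighted by exp(-decay_rate * age) before the vectors are aggregated into a user profile, so older interests contribute less. The decay rate, the aggregation by weighted average and the toy vectors are assumptions for illustration and not the exact formulation of the cited approach.

```python
import math


def decayed_profile(items, now, decay_rate=0.5):
    """Aggregate (timestamp, vector) pairs into one profile vector.

    Each vector is weighted by exp(-decay_rate * age). decay_rate is a
    hypothetical tuning parameter; items must not be empty.
    """
    dim = len(items[0][1])
    profile = [0.0] * dim
    total = 0.0
    for timestamp, vector in items:
        weight = math.exp(-decay_rate * (now - timestamp))
        total += weight
        for i, value in enumerate(vector):
            profile[i] += weight * value
    return [value / total for value in profile]


# Hypothetical keyword vectors of a user's papers with publication years.
history = [
    (2015, [1.0, 0.0]),  # older interest
    (2020, [0.0, 1.0]),  # recent interest
]
profile = decayed_profile(history, now=2021)
```

With a decay rate of zero all papers are weighted equally, whereas larger rates shift the profile towards the most recent papers.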

Li et al. [ 57 ] utilise user profiles with a history of preferences to construct heterogeneous networks where they apply random walks on meta-paths to learn personalised weights. They strive to discover user preference patterns and model preferences of users as their recently cited papers.

Lin et al. [ 59 ] utilise authors’ citations and years they have been publishing papers in their recommendation approach. All candidate publications are matched against user-entered keywords; the two factors of authors of these candidate publications are then combined to identify the overall top recommendations.

Liu et al. [ 60 ] explicitly do not require all recommended publications to fit the query of a user perfectly. Instead they state the set of recommended papers fulfils the information need only in the complete form. Here they treat paper recommendation as a link prediction problem incorporating publishing time, keywords and author influence. In a subsequent work, part of the previous research group again observes the same problem. In this work Liu et al. [ 61 ] propose an approach utilising numbers of citations (author popularity) and relationships between publications in an undirected citation graph. They compute Steiner trees to identify the sets of papers to recommend.

Lu et al. [ 62 ] propose TGMF-FMLP, a paper recommendation approach focusing on the changing preferences of users and novelty of papers. They combine category attributes (such as paper type, publisher or journal), a time-decay function, Doc2Vec representations of the papers’ content and a specialised matrix factorisation to compute recommendations.

Ma et al. [ 64 ] introduce HIPRec, a paper recommendation approach on heterogeneous networks of authors, papers, venues and topics specialised on new publications. They use the most interesting meta-paths to construct significant meta-paths. With these paths and features from these paths they train a model to identify new papers fitting users. Together with another researcher Ma further pursued this research direction. Ma and Wang [ 63 ] propose HGRec, a heterogeneous graph representation learning-based model working on the same network. They use meta-path-based features and Doc2Vec paper embeddings to learn the node embeddings in the network.

Manju et al. [ 65 ] attempt to solve the cold-start problem with their paper recommendation approach coding social interactions as well as topical relevance into a heterogeneous graph. They incorporate belief propagation into the network and compute recommendations by applying random walk.

Mohamed Hassan et al. [ 69 ] adopt an existing tag prediction model which relies on a hierarchical attention network to capture semantics of papers. Matrix factorisation then identifies the publications to recommend.

Nair et al. [ 71 ] propose C-SAR, a paper recommendation approach using a neural network. They input GloVe embeddings of paper titles into their Gated Recurrent Unit model to compute probabilities of similarities of papers. The resulting adjacency matrix is input to an Apriori association rule mining algorithm which generates the set of recommendations.

Nishioka et al. [ 74 , 75 ] state serendipity of recommendations as their main objective. They incorporate users’ tweets to construct profiles in hopes to model recent interests and developments which did not yet manifest in users’ papers. They strive to diversify the list of recommended papers. In more recent work Nishioka et al. [ 76 ] explained their evaluation more in depth.

Rahdari and Brusilovsky [ 84 ] observe paper recommendation for participants of scientific conferences. Users’ profiles are composed of their past publications. Users control the impact of features such as publication similarity and popularity of papers and their authors to influence the ordering of their suggestions.

Renuka et al. [ 86 ] propose a paper recommendation approach utilising TF-IDF representations of automatically extracted keywords and key phrases. They then either use cosine similarity between vectors or a clustering method to identify the most similar papers for an input paper.

Sakib et al. [ 89 ] present a paper recommendation approach utilising second-level citation information and citation context. They strive to not rely on user profiles in the paper recommendation process. Instead they measure similarity of candidate papers to an input paper based on co-occurred or co-occurring papers. In a follow-up work with a bigger research group Sakib et al. [ 88 ] combine contents of titles, keywords and abstracts with their previously mentioned collaborative filtering approach. They again utilise second-level citation relationships between papers to find correlated publications.

Shahid et al. [ 94 ] utilise in-text citation frequencies and assume a reference is more important to a referencing paper the more often it occurs in the text. They crawl papers from CiteSeerX to retrieve the top 500 citing papers. In a follow-up work with a partially different research group Shahid et al. [ 93 ] evaluate the previously presented approach with a user study.
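The notion of in-text citation frequency can be illustrated with a small sketch that counts how often each numeric reference marker occurs in a full text. The regular expression assumes bracketed numeric markers such as [3] or [2, 3] and ignores ranges; real full texts require a proper citation parser, and the function name and the example text are hypothetical.

```python
import re
from collections import Counter


def in_text_citation_frequency(text):
    """Count how often each numeric reference marker such as [12] occurs."""
    counts = Counter()
    # Match bracketed groups consisting of digits, commas and whitespace.
    for group in re.findall(r"\[([\d,\s]+)\]", text):
        for ref in re.findall(r"\d+", group):
            counts[int(ref)] += 1
    return counts


# Hypothetical excerpt of a citing paper's full text.
text = "Prior work [1] used graphs [2, 3]. We extend [1] and compare with [3]."
frequency = in_text_citation_frequency(text)
```

Under the stated assumption, references 1 and 3 would be considered more important to this citing paper than reference 2.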

Sharma et al. [ 95 ] propose IBM PARSe, a paper recommendation system for the medical domain to reduce the number of papers to review for keeping an existing knowledge graph up-to-date. Classifiers identify new papers from target domains, named entity recognition finds relevant medical concepts before papers’ TF-IDF vectors are compared to ones in the knowledge graph. New publications most similar to already relevant ones with matching entities are recommended to be included in the knowledge base.

Subathra and Kumar [ 98 ] constructed a paper recommendation system which applies LDA on Wikipedia articles twice. Top related words are computed using pointwise mutual information before papers are recommended for these top words.

Tang et al. [ 104 ] introduce CGPrec, a content-based and knowledge graph-based paper recommendation system. They focus on users’ sparse interaction history with papers and strive to predict papers on which users are likely to click. They utilise Word2Vec and a Double Convolutional Neural Network to emulate users’ preferences directly from paper content as well as indirectly by using knowledge graphs.

Tanner et al. [ 106 ] consider relevance and strength of citation relations to weigh the citation network. They fetch citation information from the parsed full texts of papers. On the weighted citation networks they run either weighted co-citation inverse document frequency, weighted bibliographic coupling or random walk with restart to identify the highest scoring papers.

Tao et al. [ 107 ] use embeddings and topic modelling to compute paper recommendations. They combine LDA and Word2Vec to obtain topic embeddings. Then they calculate most similar topics for all papers using Doc2Vec vector representations and afterwards identify the most similar papers. With PageRank on the citation network they re-rank these candidate papers.

Waheed et al. [ 108 ] propose CNRN, a recommendation approach using a multilevel citation and authorship network to identify recommendation candidates. From these candidates, the papers to recommend are chosen by combining centrality measures and authors’ popularity. In closely related but independent work, Shi et al. [ 96 ] present AMHG, an approach utilising a multilayer perceptron. They also construct a multilevel citation network as described before with added author relations. Here they additionally utilise vector representations of publications and recency.

Wang et al. [ 113 ] introduce a knowledge-aware path recurrent network model. An LSTM mines path information from the knowledge graphs incorporating papers and users. Users are represented by their downloaded, collected and browsed papers; papers are represented by TF-IDF representations of their keywords.

Wang et al. [ 109 ] require users to construct knowledge graphs to specify the domain(s) and enter keywords for which recommended papers are suggested. From the keywords they compute initially selected papers. They apply Doc2Vec and emotion-weighted similarity between papers to identify recommendations.

Wang et al. [ 110 ] regard paper recommendation targeting a group of people instead of single users and introduce GPRAH_ER. They employ a two-step process which first individually predicts papers for users in the group before recommended papers are aggregated. Here users in the group are not considered equal, different importance and reliability weights are assigned such that important persons’ preferences are more decisive of the recommended papers. Together with a different research group two authors again pursued this definition of the paper recommendation problem. Wang et al. [ 111 ] recommend papers for groups of users in an approach called GPMF_ER. As with the previous approach they compute TF-IDF vectors of keywords of papers to calculate most similar publications for each user. Probabilistic matrix factorisation is used to integrate these similarities in a model such that predictive ratings of all users and papers can be obtained. In the aggregation phase the number of papers read by a user is determined to replace the importance component.
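The two-step process for group recommendation can be sketched as follows: individual scores are predicted per user first and then aggregated into group scores, with more important users weighted more strongly. A normalised weighted average is used here purely as an assumed aggregation function; the user names, scores and weights are hypothetical and the cited approaches define their own importance and reliability weights.

```python
def aggregate_group_scores(predictions, weights):
    """Combine per-user predicted scores into one group score per paper.

    predictions: user -> {paper: predicted score}
    weights: user -> non-negative importance weight (hypothetical values)
    """
    total_weight = sum(weights[user] for user in predictions)
    group = {}
    for user, scores in predictions.items():
        for paper, score in scores.items():
            contribution = weights[user] * score / total_weight
            group[paper] = group.get(paper, 0.0) + contribution
    return group


# Hypothetical individual predictions for a two-person group.
predictions = {
    "supervisor": {"p1": 0.9, "p2": 0.2},
    "student": {"p1": 0.3, "p2": 0.8},
}
weights = {"supervisor": 3.0, "student": 1.0}
group_scores = aggregate_group_scores(predictions, weights)
top_paper = max(group_scores, key=group_scores.get)
```

Because the supervisor's weight dominates, the group ranking follows the supervisor's preference for p1 even though the student prefers p2.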

Xie et al. [ 116 ] propose JTIE, an approach incorporating contents, authors and venues of papers to learn paper embeddings. Further, directed citation relations are included into the model. Based on users’ authored and referenced papers personalised recommendations are computed. They consider explainability of recommendations. In a subsequent work part of the researchers again work on this topic. Xie et al. [ 115 ] focus on the recommendation of papers from different areas for user-provided keywords or papers. They use hierarchical LDA to model evolving concepts of papers and citations as evidence of correlation in their approach.

Yang et al. [ 117 ] incorporate the age of papers and impact factors of venues as weights in their citation network-based approach named PubTeller. Papers are clustered by topic; the most popular ones from the clusters most similar to the query terms are recommendation candidates. In this approach, LDA and TF-IDF are used to represent publications.

Yu et al. [ 118 ] propose ICMN, a general collaborative memory network approach. User and item embeddings are composed by incorporating papers’ neighbourhoods and users’ implicit preferences.

Zavrel et al. [ 119 ] present the scientific literature recommendation platform Zeta Alpha, which bases its recommended papers on examples tagged in user-defined categories. The approach includes these user-defined tags as well as paper content embeddings, social media mentions and citation information in an ensemble learning approach to recommend publications.

Zhang et al. [ 121 ] propose W-Rank, a general approach weighting edges in a heterogeneous author, paper and venue graph by incorporating citation relevance and author contribution. They apply their method on paper recommendation. Network- (via citations) and semantic-based (via AWD) similarity between papers is combined for weighting edges between papers, harmonic counting defines weights of edges between authors and papers. A HITS-inspired algorithm computes the final authority scores. In a subsequent work in a slightly smaller group they focus on a specialised approach for paper recommendation. Here Zhang et al. [ 122 ] strive to emulate a human expert recommending papers. They construct a heterogeneous network with authors, papers, venues and citations. Citation weights are determined by semantic- and network-level similarity of papers. Lastly, recommendation candidates are re-ranked while combining the weighted heterogeneous network and recency of papers.
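Harmonic counting distributes the credit for a paper over its authors by byline position. Assuming the standard formulation, in which the author at position i of N receives (1/i) divided by the sum of 1/k for k from 1 to N, the weights can be sketched as below. Whether the cited approach uses exactly this variant is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction


def harmonic_credit(num_authors):
    """Return the harmonic-counting credit share per author position."""
    total = sum(Fraction(1, k) for k in range(1, num_authors + 1))
    return [Fraction(1, i) / total for i in range(1, num_authors + 1)]


# Credit shares for a hypothetical paper with three authors.
credits = harmonic_credit(3)
```

The shares decrease with byline position and always sum to one, so they can serve directly as edge weights between a paper and its authors.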

Zhao et al. [ 123 ] present a personalised approach focusing on diversity of results which consists of three parts. First LFM extracts latent factor vectors of papers and users from the users’ interaction history with papers. Then BERT vectors are constructed for each word of the papers. With those vectors as input and the latent factor vectors as labels, a BiGRU model is trained. Lastly, diversity and a user’s rating weights determine the ranking of recommended publications for the specific user.

Other relevant work

We now briefly discuss some papers which did not present novel paper recommendation approaches but are relevant in the scope of this literature review nonetheless.

Surrounding paper recommendation

Here we present two works which could be classified as ones to use on top of or in combination with existing paper recommendation systems: Lee et al. [ 51 ] introduce LIMEADE, a general approach for opaque recommendation systems which can for example be applied on any paper recommendation system. They produce explanations for recommendations as a list of weighted interpretable features such as influential paper terms.

Beierle et al. [ 18 ] use the recommendation-as-a-service provider Mr. DLib to analyse choice overload in user evaluations. They report several click-based measures and discuss effects of different study parameters on engagement of users.

(R)Evaluations

The following four works can be grouped as ones which provide (r)evaluations of already existing approaches. Their results could be useful for the construction of novel systems: Ostendorff [ 77 ] suggests considering the context of paper similarity in background, methodology and findings sections instead of undifferentiated textual similarity for scientific paper recommendation.

Mohamed Hassan et al. [ 68 ] compare different text embedding methods such as BERT, ELMo, USE and InferSent to express semantics of papers. They perform paper recommendation and re-ranking of recommendation candidates based on cosine similarity of titles.

Le et al. [ 50 ] evaluate the already existing paper recommendation system Mendeley Suggest, which provides recommendations with different collaborative or content-based approaches. They observe different usage behaviours and state that utilisation of paper recommendation systems does positively affect users’ professional lives.

Barolli et al. [ 11 ] compare similarities of paper pairs utilising n-grams, tf-idf and a transformer based on BERT. They model cosine similarities of these pairs into a paper connection graph and argue for the combination of content-based and graph-based methods in the context of COVID-19 paper recommendation systems.

Living labs

Living labs help researchers conduct meaningful evaluations by providing an environment, in which recommendations produced by experimental systems are shown to real users in realistic scenarios [ 14 ]. We found three relevant works for the area of scientific paper recommendation: Beel et al. [ 14 ] proposed a living lab for scholarly recommendation built on top of Mr. DLib, their recommender-as-a-service system. They log users’ actions such as clicks, downloads and purchases for related recommended papers. Additionally, they plan to extend their living lab to also incorporate research grant or research collaborator recommendation.

Gingstad et al. [ 36 ] propose ArXivDigest, an online living lab for explainable and personalised paper recommendations from arXiv. Users can either be suggested papers while browsing their website or via email as a subscription-type service. Different approaches can be hooked into ArXivDigest, the recommendations generated by them can then be evaluated by users. A simple text-based baseline compares user-input topics with articles. Target values of evaluations are users’ clicked and saved papers.

Schaer et al. [ 91 ] held the Living Labs for Academic Search (LiLAS) where they hosted two shared tasks: dataset recommendation for scientific papers and ad-hoc multi-lingual retrieval of most relevant publications regarding specific queries. To overcome the gap between real-world and lab-based evaluations they allowed integrating participants’ systems into real-world academic search systems, namely LIVIO and GESIS Search.

Multilingual/cross-lingual recommendation

The previous survey by Li and Zhou [ 58 ] identifies cross-language paper recommendation as a future research direction. The following two works could be useful for this aspect: Keller and Munz [ 47 ] present their results of participating in the CLEF LiLAS challenge where they tackled recommendation of multilingual papers based on queries. They utilised a pre-computed ranking approach, Solr and pseudo-relevance feedback to extend queries and identify fitting papers.

Safaryan et al. [ 87 ] compare different already existing techniques for cross-language recommendation of publications. They compare word by word translation, linear projection from a Russian to an English vector representation, VecMap alignment and MUSE word embeddings.

Related recommendation systems

Some recommendation approaches are slightly out of scope of pure paper recommendation systems but could still provide inspiration or relevant results: Ng [ 73 ] proposes CBRec, a children’s book recommendation system utilising matrix factorisation. His goal is to encourage good reading habits of children. The approach combines readability levels of users and books with TF-IDF representations of books to find ones which are similar to ones which a child may have already liked.

Patra et al. [ 80 ] recommend publications relevant for datasets to increase reusability. Those papers could describe the dataset, use it or be related literature. The authors represent datasets and articles as vectors and use cosine similarity to identify the best fitting papers. Re-ranking them with usage of Word2Vec embeddings results in the final recommendation.

As the discussed paper recommendation systems utilise different inputs or components of scientific publications and pursue slightly different objectives, datasets to experiment on are also of diverse nature. We do not consider datasets of approaches which do not contain an evaluation [ 60 , 119 ] or do not evaluate the actual paper recommendation [ 2 , 25 , 38 , 84 , 86 ] such as the cosine similarity between a recommended and an initial paper [ 2 , 86 ], the clustering quality on the constructed features [ 25 ] or the Jensen Shannon Divergence between probability distributions of words between an initial and recommended papers [ 38 ]. We also do not discuss datasets where only the data sources are mentioned but no remarks are made regarding the size or composition of the dataset [ 21 , 104 ] or ones where we were not able to identify actual numbers [ 65 ]. Table 4 gives an overview of datasets used in the evaluation of the discussed methods. Many of the datasets are unavailable only a few years after publication of the approach. Most approaches utilise their own modified version of a public dataset which makes exact replication of experiments hard. In the following the main underlying data sources and publicly available datasets are discussed. Non-publicly available datasets are briefly described in Table 5.

Table 4. Overview of datasets utilised in most recent related work with (unofficial) names, public availability of the possibly modified dataset which was used (A?), and a list of papers it was used in. Datasets are grouped by their underlying data source if possible

Name | A? | Used by
DBLP + Citations v1 [ ] [ ]
DBLP + Citations v8 [ ] [ , ]
DBLP + Citations v11 [ ]
dblp + IEEE + ACM + Pubmed [ ]
DBLP paths [ ]
DBLP-Citation-network f. AMiner [ ]
dblp [ ]
DBLP-REC [ ]
dblp + AMiner KG [ ]
dblp + AMiner + venue [ ]
SPRD_Senior [ ]
SPRD [ ] [ , , ]
Citeulike-a [ ] [ , , , , , , , ]
Citeulike-t [ ] [ ]
Citeulike_huge [ ]
Citeulike_medium [ ]
Citeulike_tiny [ ]
ACM paths [ ]
ACM citation network V8 [ – ]
Scopus_tiny [ , ]
ScienceDirect+Scopus [ ]
Scopus [ ]
AMiner [ ]
AMiner + Wanfang [ ]
AMiner_tiny [ ]
AMiner_huge [ ]
ACM C-D [ ]
AAN_original [ ] [ ]
AAN_modified [ , ]
AAN_tiny [ ]
Sowiport [ ]
RARD_tiny [ ]
CiteSeer [ ]
CiteSeer_tiny [ ]
CiteSeer_medium [ ]
Patents_tiny [ ]
Patents [ ]
ACM H-I [ ]
Hep-TH graph [ ]
arXiv Hep-TH [ ]
MSA [ ]
MAG 2017 [ ]
MAG 2018 [ ]
BBC [ ]
PRSDataset [ , ]
Physical review A [ ]
ACL selection network [ ]
Prostate cancer [ ]
Peltarion [ ]
Jabref [ ]
DM [ ]
Graphs [ ]
SCHOLAT [ ]
IEEE Xplore [ ]
KGs [ ]
Wanfang [ ]
Watson™for Genomics [ ]
Wikipedia [ ]
LibraryThing [ ]

Table 5. Description of private datasets utilised in most recent related work with (unofficial) names. Datasets are grouped by their underlying data source if possible

Name | Used by | Description
DBLP + Citations v8 [ ] | [ , ] | 2,133 from 20 from 2000 to 2016, 39,530 , 15,708 topics
dblp + IEEE + ACM + Pubmed | [ ] | Sources: dblp, IEEE, ACM, Pubmed. 3,394,616 (titles), , publication years, keywords,
DBLP paths | [ ] | 1,782,700 (titles, abstracts, keywords), 2,052,414 , 18,936 , 100,000 , 9,590,600
DBLP-Citation-network f. AMiner | [ ] | 63,469 from 2013 to 2019, 152,586
dblp | [ ] | 2,126,267 , 8686 , 1,221,259 , 256,214 , 3765 relations
DBLP-REC | [ ] | DBLP-Citation-network v11 + ScienceDirect + IEEE, 3,590,853 , 3,276,803 , 35,254,530
dblp + AMiner KG | [ ] | KG with 223,431 , 337,561 , 5578 , 1179 keyword nodes, 16,328,642
dblp + AMiner + venue | [ ] | 3,056,388 (titles, abstracts, keywords), 1,752,401 , 354,693 keywords, 11,397 , , discipline labels
Citeulike_huge | [ ] | 210,137 , 3,039 , 284,960 from Nov 2004 to Dec 2007
Citeulike_medium | [ ] | 2,065 users, 718 groups, 85,542
Citeulike_tiny | [ ] | 1,659 users, 718 groups, 82,376 , 198,744
ACM paths | [ ] | 2,385,057 (titles, abstracts, keywords), 2,004,398 , 269,467 , 61,618 , 12,048,682
ACM citation network V8 | [ – ] | 1,669,237 (titles, abstracts), ,
Scopus_tiny | [ , ] | 2,000
ScienceDirect + Scopus | [ ] | ’s browsed prior to first email from ScienceDirect, metadata from Scopus, 4,392 recommendation sessions (emails with clicks on , ’ browsing history)
Scopus | [ ] | 528,224 , , , discipline tags
Scopus + venue | [ ] | 1,304,907 (titles, abstracts, keywords), 482,602 , 127,630 keywords, 7653 , , discipline labels
AMiner | [ ] | 2,070,699 , 263,250 , 1,557,147 , 735,059 , 9398 relations
AMiner + Wanfang | [ ] | 4 million . 3 sets: data from 2018 and 2019 (221,076 , 503,945 ), mathematical analysis (98,702 , 117,183 ), image processing (49,098 , 107,290 )
AMiner_tiny | [ ] | 188 input , 10 candidate for each input
AMiner_huge | [ ] | 2,092,356 , 1,712,433 , 8,024,869 , 4,258,615 co-authorships
ACM C-D | [ ] | 43,380 from AMiner, , ACM CSS tags
AAN_modified | [ , ] | 21,455 from 312 from NLP, 17,342 , 113,367
AAN_tiny | [ ] | 2082 (ids, titles, publication year), 8194 , avg. 7.87 per , ,
Sowiport | [ ] | data from Mar 2017 to Oct 2018, 0.1% click-through rate
RARD_tiny | [ ] | 800 input from Related-Article Recommendation Dataset from Sowiport [ ]
CiteSeer | [ ] | 1,100 , 10 sets of relevant
CiteSeer_tiny | [ ] | 400 -pairs, 1,230 contexts
CiteSeer_medium | [ ] | 10 , 226 -pairs
Patents_tiny | [ ] | 67 input patents, 20 candidate patents for each input
Patents | [ ] | 182,260 patents, 73,974
ACM H-I | [ ] | 70,090 patents with ownership from 2017, , ACM CSS tags
Hep-TH graph | [ ] | graph with 8,721 (keywords)
arXiv Hep-TH | [ ] | 29,000 , 350,000 , 14,909 , 428 journals
MSA | [ ] | 101,205 , 190,146 in 300 conferences
MAG 2017 | [ ] | Based on data until 2017, area: intrusion detection in cyber security, 6428 , 94,887 , 18,890 , 6428 journals
MAG 2018 | [ ] | Based on MAG Azure database from Oct 2018, 206,676,892
Physical Review A | [ ] | 393 from 2007 to 2009 with 2,664 from American Physical Society
ACL selection network | [ ] | 18,718 (titles, summaries) from ACL proceedings
Prostate cancer | [ ] | 500 tagged with 5 categories
Peltarion | [ ] | 290 , from Dec 2018 to May 2021 of of Peltarion Knowledge Center who have read 5
Jabref | [ ] | data from Mar 2017 to Oct 2018, 0.22% click-through rate
DM | [ ] | 8,301 from journals: DMKD, TKDE + conferences: KDD, ICDM, SDM
Graphs | [ ] | Cora (1 graph, 2.7k nodes), TU-IMDB (1.5k graphs, 13 nodes each), TU-MUTAG (188 molecules, 18 nodes)
SCHOLAT | [ ] | 34,518 (titles, abstracts, keywords),
IEEE Xplore | [ ] | 3 (keywords), , appeared in IEEE between 2010 and 2017
KGs | [ ] | Knowledge graphs, 600 from information retrieval + machine learning
Wanfang | [ ] | 500 , 5 sets of relevant
Watson™ for Genomics | [ ] | 15,320 from top 10 percentile genomics journals from Jun 2016
Wikipedia | [ ] | 1000 from Wikipedia, 20 topics
LibraryThing | [ ] | 120,150 books (titles, abstracts), , 185,210 favourites records, 150,216 ratings, 139,530 reviews of 12,350

We used the following abbreviations: user(s) u , paper(s) p , interaction(s) i , author(s) a , venue(s) v , reference(s) r , citation(s) c , term(s) t

dblp-based datasets

The dblp computer science bibliography (dblp) is a digital library offering metadata on authors, papers and venues from the area of computer science and adjacent fields [ 54 ]. They provide publicly available short-time stored daily and longer-time stored monthly data dumps 10 .

The dblp + Citations v1 dataset [ 105 ] builds upon a dblp version from 2010 mapped on AMiner. It contains 1,632,442 publications with 2,327,450 citations.

The dblp + Citations v11 dataset 11 builds upon dblp. It contains 4,107,340 papers, 245,204 authors, 16,209 venues and 36,624,464 citations.

These datasets do not contain supervised labels provided by human annotators even though the citation information could be used as interaction data.

SPRD-based datasets

The Scholarly Paper Recommendation Dataset (abbreviation: SPRD) 12 was constructed by collecting publications written by 50 researchers of different seniority from the area of computer science which are contained in dblp from 2000 to 2006 [ 58 , 101 , 102 ]. The dataset contains 100,351 candidate papers extracted from the ACM Digital Library as well as citations and references for papers. Relevance assessments by the 50 researchers, marking the papers relevant to their current interests, are also included.

A subset of SPRD, SPRD_Senior , which contains only the data of senior researchers can also be constructed [ 99 ].

These datasets specifically contain supervised labels provided by human annotators in the form of sets of papers, which researchers found relevant for themselves.

CiteULike-based datasets

CiteULike [ 20 ] was a social bookmarking site for scientific papers. It contained papers and their metadata. Users were able to include priorities, tags or comments for papers on their reading list. There were daily data dumps available from which datasets could be constructed.

Citeulike-a [ 112 ] 13 contains 5,551 users, 16,980 papers from 2004 to 2006 and 204,986 interactions between users and papers. Papers are represented by their title and abstract.

Citeulike-t  [ 112 ] 14 contains 7,947 users, 25,975 papers and 134,860 user-paper interactions. Papers are represented by their pre-processed title and abstract.

These datasets contain labelled data as they build upon CiteULike, which provides bookmarked papers of users.

ACM-based datasets

The ACM Digital Library (ACM) is a semi-open digital library offering information on scientific authors, papers, citations and venues from the area of computer science 15 . They offer an API to query for information. Datasets building upon this source do not contain supervised labels provided by annotators even though the citation information could be used as interaction data.

Scopus-based datasets

Scopus is a semi-open digital library containing metadata on authors, papers and affiliations in different scientific areas 16 . They offer an API to query for data. Datasets building upon this source usually do not contain labels provided by annotators.

AMiner-based datasets

ArnetMiner (AMiner) [ 105 ] is an open academic search system modelling the academic network consisting of authors, papers and venues from all areas 17 . They provide an API to query for information. Datasets building upon this source usually do not contain labelled user interaction data.

AAN-based datasets

The ACL Anthology Network (AAN) [ 81 – 83 ] is a networked database containing papers, authors and citations from the area of computational linguistics 18 . It consists of three networks representing paper-citation relations, author-collaboration relations and the author-citation relations. The original dataset contains 24,766 papers and 124,857 citations [ 71 ]. Datasets building upon this source usually do not contain labelled user interaction data even though the paper-citation, author-collaboration or author-citation relationships could be utilised to replace this data.

Sowiport-based datasets

Sowiport was an open digital library containing information on publications from the social sciences and adjacent fields [ 15 , 40 ]. The dataset linked papers by their attributes such as authors, publishers, keywords, journals, subjects and citation information. Via author names, keywords and venue titles the network could be traversed by triggering them to start a new search [ 40 ]. Sowiport co-operated with the recommendation-as-a-service system Mr. DLib [ 28 ]. Datasets building upon this source usually contain labelled user interaction data, the clicked papers of users.

CiteSeerX-based datasets

CiteSeerX [ 35 , 114 ] is a digital library focused on metadata and full-texts of open access literature 19 . It is the overhauled form of the former digital library CiteSeer. Datasets building upon this source usually do not inherently contain labelled user interaction data.

Patents-based datasets

The Patents dataset provides information on patents and trademarks granted by the United States Patent and Trademark Office 20 . Datasets building upon this source usually do not contain labelled user interaction data.

Hep-TH-based datasets

The original unaltered Hep-TH  [ 53 ] dataset 21 stems from the area of high energy physics theory. It contains papers in a graph which were published between 1993 and 2003. It was released as part of KDD Cup 2003. Datasets building upon this source usually do not contain labelled user interaction data.

MAG-based datasets

The Microsoft Academic Graph (MAG) [ 97 ] was an open scientific network containing metadata on academic communication activities 22 . Their heterogeneous graph consists of nodes representing fields of study, authors, affiliations, papers and venues. Datasets building upon this source usually do not contain labelled user interaction data besides citation information.

The following datasets have no common underlying data source: The BBC 23 dataset contains 2,225 BBC news articles which stem from 5 topics. This dataset does not contain labelled user interaction data.

PRSDataset 24 contains 2,453 users, 21,940 items and 35,969 pairs of users and items. This dataset contains user-item interactions.

The performance of a paper recommendation system can be quantified by measuring how well a target value has been approximated by the recommended publications. Relevancy estimations of papers can come from different sources, such as human ratings or datasets. Different interactions derived from clicked or liked papers determine the target values which a recommendation system should approximate. The quality of the recommendation can be described by evaluation measures such as precision or MRR. For example, a dataset could provide information on clicked papers, which are then deemed relevant. The target value which the recommender system should approximate is then this set of clicked papers, and the percentage of the recommendations which are contained in the clicked papers could be reported as the system’s precision.
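The clicked-papers example can be made concrete with a minimal sketch. The paper identifiers and the function name are hypothetical; clicked papers are treated as the only relevant ones, as in the example above.

```python
def precision(recommended, relevant):
    """Fraction of recommended papers that are contained in the relevant set."""
    if not recommended:
        return 0.0  # convention chosen here for an empty recommendation list
    relevant = set(relevant)
    hits = sum(1 for paper in recommended if paper in relevant)
    return hits / len(recommended)


# Hypothetical data: papers a user clicked and papers a system recommended.
clicked = {"p2", "p5", "p9"}
recommended = ["p1", "p2", "p5", "p7"]

system_precision = precision(recommended, clicked)
```

Two of the four recommended papers were clicked, so the precision in this toy case is 0.5.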

Due to the vast differences in approaches and datasets used to apply the methods, there is also a spectrum of used evaluation measures and objectives. In this section, first we observe different notions of relevance of recommended papers and individual assessment strategies for relevance. Afterwards we analyse commonly used evaluation measures and list ones which are only rarely encountered in evaluation of paper recommendation systems. Lastly we shed light on the different types of evaluation which authors conducted.

In this discussion we again only consider paper recommendation systems which also evaluate their actual approach. We disregard approaches which evaluate other properties [ 2 , 25 , 38 , 84 , 86 , 122 ] or contain no evaluation [ 60 , 119 ]. Thus we observe 54 different approaches in this analysis.

Relevance and assessment

Relevance of recommended publications can be evaluated against multiple target values: clicked papers [ 24 , 56 , 104 ], references [ 44 , 115 ], references of recently authored papers [ 57 ], papers an author interacted with in the past [ 49 ], degree-of-relevancy which is determined by citation strength [ 94 ], a ranking based on future citation numbers [ 121 ] as well as papers accepted [ 26 ] or deemed relevant by authors [ 39 , 88 ].

Assessing the relevance of recommendations can also be conducted in different ways: the top n papers recommended by a system can be judged by either a referee team [ 109 ] or single persons [ 26 , 74 , 75 ]. Other options for relevance assessment are the usage of a dataset with user ratings [ 39 , 88 ] or emulation of users and their interests [ 1 , 57 ].

Table  6 holds information on utilised relevance indicators and target values which indicate relevance for the 54 discussed approaches. Relevancy describes the method that defines which of the recommended papers are relevant:

  • Human rating: The approach is evaluated using assessments of real users of results specific to the approach.
  • Dataset: The approach is evaluated using some type of assessment of a target value which is not specific to the approach but from a dataset. The assessment was either conducted for another approach and re-used or it was collected independent of an approach.
  • Papers: The approach is evaluated by some type of assessment of a target value which is directly generated from the papers contained in the dataset such as citations or their keywords.

The target values in Table  6 describe the entities which the approach tried to approximate:

  • Clicked: The approximated target value is derived from users’ clicks on papers.
  • Read: The approximated target value is derived from users’ read papers.
  • Cited: The approximated target value is derived from cited papers.
  • Liked: The approximated target value is derived from users’ liked papers.
  • Relevancy: The approximated target value is derived from users’ relevance assessment of papers.
  • Other user: The approximated target value is derived from other entities associated with a user input, e.g. acceptance of users, users’ interest and relevancy of the recommended papers’ topics.
  • Other automatic: The approximated target value is automatically derived from other entities, e.g. user profiles, papers with identical references, degree-of-relevancy, keywords extracted from papers, papers containing the query keywords in the optimal Steiner tree, neighbouring (cited and referencing) papers, included keywords, the classification tag, future citation numbers and an unknown measure derived from a dataset. We refrain from trying to introduce sub-categories for this broad field.

Only three approaches evaluate against multiple target values [ 21 , 30 , 104 ]. Six approaches (11.11%) utilise clicks of users, and only one approach (1.85%) uses read papers as target value. Even though cited papers are not the main objective of paper recommendation systems but rather of citation recommendation systems, this target was approximated by 13 (24.07%) of the observed systems. Ten approaches (18.52%) evaluated against liked papers, 15 (27.78%) against relevant papers and 13 (24.07%) against some other target value, either user input (three, 5.56%) or automatically derived (ten, 18.52%).
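The reported shares follow directly from the counts and the 54 observed approaches. The following sketch recomputes them; the dictionary keys are shorthand labels chosen here, not names from the original table.

```python
TOTAL_APPROACHES = 54

# Counts per target value as stated in the text above.
target_counts = {
    "clicked": 6,
    "read": 1,
    "cited": 13,
    "liked": 10,
    "relevancy": 15,
    "other": 13,
}


def share(count, total=TOTAL_APPROACHES):
    """Percentage of all observed approaches, rounded to two decimal places."""
    return round(100 * count / total, 2)


shares = {name: share(count) for name, count in target_counts.items()}

# The counts sum to more than 54 because some approaches
# evaluate against more than one target value.
surplus_assignments = sum(target_counts.values()) - TOTAL_APPROACHES
```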

Table 6. Indications whether approaches utilise the specified relevancy definitions, target values of evaluations and evaluation measures

Work | Relevancy: Human rating, Dataset, Papers | Target value: Clicked, Read, Cited, Liked, Relevancy, Other user, Other automatic | Measures: Precision, Recall, F1, nDCG, MRR, MAP, Other
[Table body: one row per observed approach (54 rows); cell contents not recoverable]

Evaluation measures

We differentiate between commonly used and rarely used evaluation measures for the task of scientific paper recommendation. They are described in the following sections. Table 6 holds indications of utilised evaluation measures for the 54 discussed approaches. Measures are the methods used to evaluate an approach’s ability to approximate the target value; they can be precision, recall, F1, nDCG, MRR, MAP or another measure.

Out of the observed systems, twelve 25 approaches [ 1 , 28 , 30 , 49 , 59 , 64 , 69 , 71 , 74 – 76 , 107 , 115 , 116 ] (22.22%) only report one single measure, all others report at least two different ones.

Commonly used evaluation measures

Bai et al. [ 9 ] identify precision (P), recall (R), F1, nDCG, MRR and MAP as evaluation features which have been used regularly in the area of paper recommendation systems. Table 7 gives usage percentages of each of these measures in observed related work.

Table 7. Common evaluation measures and percentage of observed evaluations of paper recommendation systems in which they were applied

Measure | P | R | F1 | nDCG | MRR | MAP
% | 48.15 | 24.07 | 50 | 25.92 | 27.78 | 22.22

Percentages are rounded to two decimal places
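As a reminder of what the common measures compute, the following sketch implements them for binary relevance. It reflects common textbook definitions and is not taken from any of the surveyed papers; in particular, average precision is normalised by the number of relevant papers here, and other normalisations exist. All identifiers and toy data are hypothetical.

```python
import math


def precision_recall_f1(recommended, relevant):
    """Precision, recall and their harmonic mean (F1) for one list."""
    hits = len(set(recommended) & set(relevant))
    p = hits / len(recommended) if recommended else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1


def reciprocal_rank(recommended, relevant):
    """1 / rank of the first relevant paper, 0 if none is recommended."""
    for rank, paper in enumerate(recommended, start=1):
        if paper in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(recommended, relevant):
    """Mean of the precision values at the ranks of relevant papers."""
    hits, total = 0, 0.0
    for rank, paper in enumerate(recommended, start=1):
        if paper in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg(recommended, relevant):
    """DCG with binary gains, normalised by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, paper in enumerate(recommended, start=1)
              if paper in relevant)
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Two hypothetical users: (recommended list, set of relevant papers).
users = [
    (["p1", "p2", "p5", "p7"], {"p2", "p5", "p9"}),
    (["p3", "p4"], {"p3"}),
]

p, r, f1 = precision_recall_f1(*users[0])
# MRR and MAP average the per-user values over all users.
mrr = sum(reciprocal_rank(rec, rel) for rec, rel in users) / len(users)
mean_ap = sum(average_precision(rec, rel) for rec, rel in users) / len(users)
ndcg_first_user = ndcg(*users[0])
```

Precision, recall and F1 ignore the order of recommendations, whereas MRR, MAP and nDCG reward relevant papers that appear early in the list.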

Alfarhood and Cheng [ 4 ] argue against the use of precision when utilising implicit feedback. If a user gives no feedback for a paper, it could mean either that the user is not interested or that the user does not know of the existence of the specific publication.

Rarely used evaluation measures

We found a plethora of rarely used evaluation measures which have either been utilised only by the work they were introduced in or to evaluate few approaches. Our analysis in this aspect might be highly influenced by the narrow time frame we observe. Novel measures might require more time to be adopted by a broader audience. Thus we differentiate between novel rarely used evaluation measures and ones where authors do not explicitly claim they are novel. A list of rare but already defined evaluation measures can be found in Table 8 . In total 25 approaches (46.3%) used an evaluation measure not considered common.

Table 8. Overview of rare existing measures used in evaluations of observed approaches

Measure | Used by | Description
Average precision | [ ] | Area under precision-recall curve
Receiver operating characteristic | [ ] | Plot of true positives against false positives
AUC | [ , ] | Area under receiver operating characteristic curve
Computation time | [ , ] | Time to compute recommendation list
DCG | [ ] | Summed up relevancy divided by logarithm of rank + 1
Click-through-rates | [ , ] | Percentage of clicks on recommendations
Reward | [ , ] | Weighted sum of interactions of users with recommendations, e.g. clicked and saved papers
Spearman correlation coefficient | [ , ] | Correlation between ranks of paper lists
Hit ratio | [ , , ] | Percentage of relevant items in top recommendations
Accuracy | [ , , ] | Percentage of relevant papers which the approach identified
Specificity | [ ] | True negative rate
Mean absolute error | [ ] | Average difference between real and predicted values
Root mean square error | [ ] | Square root of the expected squared difference between real and predicted values
Fallout | [ ] | Percentage of irrelevant recommendations out of all irrelevant papers
Support | [ ] | Frequency of occurrences of set
TopN | [ ] | Probability that target keywords are encountered in first n recommended papers
FindN | [ ] | Number of target keywords which are encountered in first n recommended papers
Coverage | [ ] | Method’s ability to discover the long tail of papers
Popularity | [ ] | Average logarithm of the number of ratings of papers in recommendation, indicates novelty of results
Average paper popularity | [ ] | Paper popularity divided by number of recommendations
Intra-list similarity | [ ] | Dissimilarity between recommended papers, smaller value indicates more diverse recommendation
Serendipity score | [ – ] | Summed up usefulness divided by unexpectedness of recommended papers
Success rate | [ ] | Number of recommendations number of keywords
Number of recommended papers | [ ] | Size of set of recommended papers
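A few of the rarer measures listed above can be sketched in a handful of lines. The implementations follow the short descriptions in the table and standard definitions; they are illustrative assumptions and need not match the exact variants used in the cited works. All data are hypothetical.

```python
import math


def dcg(relevances):
    """Graded relevance at each rank divided by log2(rank + 1), summed up."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def mean_absolute_error(actual, predicted):
    """Average absolute difference between real and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def fallout(recommended, relevant, all_papers):
    """Share of all irrelevant papers that were nevertheless recommended."""
    irrelevant = set(all_papers) - set(relevant)
    if not irrelevant:
        return 0.0
    return len(set(recommended) & irrelevant) / len(irrelevant)


def specificity(recommended, relevant, all_papers):
    """True negative rate, the complement of fallout."""
    return 1.0 - fallout(recommended, relevant, all_papers)


# Hypothetical toy data.
example_dcg = dcg([3, 2, 0, 1])
mae = mean_absolute_error([4, 3, 5], [3, 3, 3])
rmse = root_mean_square_error([4, 3, 5], [3, 3, 3])

all_papers = ["p1", "p2", "p3", "p4", "p5", "p6"]
relevant = {"p1", "p2"}
recommended = ["p1", "p3", "p4"]
example_fallout = fallout(recommended, relevant, all_papers)
example_specificity = specificity(recommended, relevant, all_papers)
```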

Novel rarely used Evaluation Measures. In our considered approaches we only encountered three novel evaluation measures: Recommendation quality as defined by Chaudhuri et al. [ 26 ] is the acceptance of recommendations by users rated on a Likert scale from 1 to 10.

TotNP_EU is a measure defined by Manju et al. [ 65 ] specifically introduced for measuring performance of approaches regarding the cold start problem. It indicates the number of new publications suggested to users with a prediction value above a certain threshold.

TotNP_AVG is another measure defined by Manju et al. [ 65 ] for measuring performance of approaches regarding the cold start problem. It indicates the average number of new publications suggested to users with a prediction value above a certain threshold.
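One possible reading of these two cold-start measures is sketched below. This is an assumption based solely on the short descriptions above, not the definition given by Manju et al.; the data layout, function names and the threshold value are hypothetical.

```python
def new_papers_above_threshold(predictions, threshold):
    """Per user: number of new papers with a prediction value above threshold.

    predictions maps user id -> {new paper id: predicted value} (assumed layout).
    """
    return {
        user: sum(1 for score in scores.values() if score > threshold)
        for user, scores in predictions.items()
    }


def total_new_papers(predictions, threshold):
    """Total count over all users (one reading of TotNP_EU)."""
    return sum(new_papers_above_threshold(predictions, threshold).values())


def average_new_papers(predictions, threshold):
    """Average count per user (one reading of TotNP_AVG)."""
    counts = new_papers_above_threshold(predictions, threshold)
    return sum(counts.values()) / len(counts) if counts else 0.0


# Hypothetical prediction values for newly published papers n1..n3.
predictions = {
    "u1": {"n1": 0.9, "n2": 0.4},
    "u2": {"n1": 0.7, "n3": 0.8},
    "u3": {"n2": 0.1},
}
total = total_new_papers(predictions, threshold=0.5)
average = average_new_papers(predictions, threshold=0.5)
```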

Evaluation types

Evaluations can be classified into different categories. We follow the notion of Beel and Langer [ 17 ] who differentiate between user studies, online evaluations and offline evaluations. They define user studies as ones where users’ satisfaction with recommendation results is measured by collecting explicit ratings. Online evaluations are ones where users do not explicitly rate the recommendation results; relevancy is derived from e.g. clicks. In offline evaluations a ground truth is used to evaluate the approach.

From the 54 observed approaches we found four using multiple evaluation types [ 29 , 46 , 92 , 94 , 109 ]. Twelve (22.22%) conducted user studies which describe the size and composition of the participant group 26 . Only two approaches [ 28 , 65 ] (3.7%) in the observed papers were evaluated with an online evaluation. We found 44 approaches (81.48%) providing an offline evaluation. That offline evaluations are the most common form of evaluation is unsurprising, as this tendency has also been observed in an evaluation of general scientific recommender systems [ 23 ]. Offline evaluations are fast and do not require users [ 23 ]. Nevertheless, the margin by which this form of evaluation dominates could be considered rather surprising.

A distinction between lab-based and real-world user studies can be made [ 16 , 17 ]. User studies where participants rate recommendations according to some criteria and are aware of the study are lab-based; all others are considered real-world studies. Living labs [ 14 , 36 , 91 ], for example, enable real-world user studies. On average the lab-based user studies were conducted with 17.83 users. Table 9 holds information on the number of participants for all studies as well as the composition of groups in terms of seniority.

Table 9. For all observed works with user studies we list their number of participants (#P) and their composition

Work | #P | Composition
Bulut et al. [ ] | 50 | PhD students studying in Turkey in 2019
Bulut et al. [ ] | 10 + 30 | Researchers
Chaudhuri et al. [ ] | 50 | NA
Chaudhuri et al. [ ] | 45 | From 9 different areas, different seniority levels: 12 faculty members, 20 postgraduate students, 13 undergraduate students
Du et al. [ ] | NA | College students or patent analysis experts
Hua et al. [ ] | 10 | Experts
Kanakia et al. [ ] | 40 | Full-time computer science researchers at Microsoft Research
Kang et al. [ ] | 12 | Postgraduates
Nishioka et al. [ – ] | 22 | Seniority based on highest degree: 2 Master’s, 13 PhD, 7 lecturers/professors; 2 female, 20 male; 17 working in academia, 3 working in industry
Shahid et al. [ ] | 20 | Post-graduate students
Waheed et al. [ ] | 20 | Researchers
Wang et al. [ ] | 5 | 1 doctoral supervisor, 2 master supervisors, 2 graduate students

NA indicates that #P or compositions were not described in a specific user study

Offline evaluations can rely on an explicit ground truth given by a dataset containing user rankings, on an implicit one derived from user interactions such as liked or cited papers, or on an expert one with manually collected expert ratings [ 17 ]. We found 22 explicit offline evaluations (40.74%) corresponding to ones using datasets to estimate relevance (see Table 6 ) and 21 implicit offline evaluations (38.89%) corresponding to ones using paper information to identify relevant recommendations (see Table 6 ). We did not find any expert offline evaluations.

Changes compared to 2016

This chapter briefly summarises some of the changes in the set of papers we observed when compared to the study by Beel et al. [ 16 ]. Before we start the comparison, we want to point to the fact that we observed papers from two years in which the publication process could have been massively affected by the COVID-19 pandemic.

Number of papers per year and publication medium

Beel et al. [ 16 ] studied works published between 1998 and 2013 (inclusive), while we observed works which appeared between January 2019 and October 2021. The previous study included all 185 papers (of which 96 were paper recommendation approaches) in its discussion of papers published per year on the topic of paper or citation recommendation, but later studied only 62 papers in an in-depth review. In contrast, we studied only the 65 publications which present novel paper recommendation approaches (see Sect. 3.5 ). Compared to the time frame observed in the previous literature review, we encountered fewer papers per year on the actual topic of scientific paper recommendation. In the former work, the number of published papers was rising, reaching 40 in 2013. We found this number to remain at a constant level of between 21 and 23 in the three years we observed. This could hint at interest in the topic varying over time, with a current decline or the trend to work in this area having passed its zenith.

While Beel et al. [ 16 ] found that 59% of papers were conference papers and 16% journal articles, we found 54.85% conference papers and 41.54% journal articles. The shift to journal articles could stem from a general shift towards journal articles in computer science 27 .

Classification

While Beel et al. [ 16 ] found 55% of their studied 62 papers applying methods from content-based filtering, we found only 7.69% (five) of our 65 papers identifying as content-based approaches. Beel et al. [ 16 ] report that 18% of approaches applied collaborative filtering. We encountered 4.62% (three) having this component as part of their self-defined classification. As for graph-based recommendation approaches, Beel et al. [ 16 ] found 16% while we only encountered 7.69% (five) of papers with this description. In terms of hybrid approaches, Beel et al. [ 16 ] encountered five (8.06%) truly hybrid ones. In our study, we found 18 approaches (27.69%) labelling themselves as hybrid recommendation systems 28 .

Table  10 shows the comparison of the distributions of the different types of evaluations between our study observing 54 papers with evaluations and the one conducted by Beel et al. [ 16 ], which regards 75 papers for this aspect. The percentage of quantitative user studies (User quant) is comparable for both studies. A peculiar difference is the percentage of offline evaluations, which is much higher in our current time frame.

Table 10. Percentage of studies using the different methods. Some studies utilised multiple methods, thus the percentages do not add up to 100%

Study | Offline | Online | User quant. | User qual.
Beel et al. [ 16 ] | 71 | 7 | 25 | 3
Current | 81.48 | 3.7 | 24.07 | 0

When observing the evaluation measures, we found some differences compared to the previous study. While 48.15% of papers with an evaluation report precision in our case, 72% of approaches with an evaluation report this value in the study by Beel et al. [ 16 ]. In contrast, we found 50% of papers reporting F1 while only 11% of papers reported this measure according to Beel et al. [ 16 ]. This might hint at a shift away from precision (which Beel et al. [ 16 ] described as a problematic measure) towards also incorporating recall into the quality assessment of recommendation systems.

In general, the two reviews regard different time frames. We encounter non-marginal differences in the three dimensions discussed in this section. A more precise comparison could be made if a time slice of equal length were regarded for both studies, such that the research output and shape could be observed over three years each. We cannot clearly identify emerging trends (as with the offline evaluation), as we do not know whether offline evaluation has been conducted in this percentage of papers since the 2010s or whether it only recently became a more widespread form of evaluation.

Open challenges and objectives

All paper recommendation approaches which were considered in this survey could have been improved in some way or another. Some papers did not conduct evaluations which would satisfy a critical reader, others could be more convincing if they compared their methods to appropriate competitors. The possible problems we encountered within the papers can be summarised as different open challenges which papers should strive to overcome. We separate our analysis and discussion of open challenges into those which have already been described by previous literature reviews (see Sect. 7.1 ) and ones we identify as new or emerging problems (see Sect. 7.2 ). Lastly we briefly discuss the presented challenges (see Sect. 7.3 ).

Challenges highlighted in previous works

In the following we will explain possible shortcomings which were already explicitly discussed in previous literature reviews [ 9 , 16 , 92 ]. We regard these challenges in light of current paper recommendation systems to identify problems which are nowadays still encountered.

Neglect of user modelling

Neglect of user modelling has been described by Beel et al. [ 16 ] as identification of target audiences’ information needs. They describe the trade-off between specifying keywords which brings recommendation systems closer to search engines and utilising user profiles as input.

Currently only some approaches consider users of systems to influence the recommendation outcome. As seen in Table 3 , users are not always part of the input to systems. Instead, many paper recommendation systems assume that users do not state their information needs explicitly but only enter keywords or a paper. For paper recommendation systems where users are not considered, the problem of neglecting user modelling still holds.

Focus on accuracy

Focus on accuracy as a problem is described by Beel et al. [ 16 ]. They state that equating users’ satisfaction with recommendations with the accuracy of approaches does not depict reality. More factors should be considered.

Only slightly more than one fourth of current approaches report not only precision or accuracy but also more diversity-focused measures such as MMR. We also found usage of less widespread measures to capture different aspects such as popularity, serendipity or click-through-rate.

Translating research into practice

The missing translation of research into practice is described by Beel et al. [ 16 ]. They mention the small percentage of approaches which are available as prototype as well as the discrepancy between real world systems and methods described in scientific papers.

Only five of the approaches we observed must definitely have been available online at some point in time [ 28 , 45 , 65 , 84 , 119 ]. We did not encounter any of the more complex approaches being used in widespread paper recommendation systems.

Persistence and authority

Beel et al. [ 16 ] describe the lack of persistence and authority in the field of paper recommendation systems as one of the main reasons why research is not adopted in practice.

The analysis of this possible shortcoming of current work could be highly affected by the short time period from which we observed works. We found several groups publishing multiple papers as seen in Table  11 which corresponds to 29.69% of approaches. The most papers a group published was three so this amount still cannot fully mark a research group as authority in the area.

Table 11 Overview of research groups with multiple papers

| Group | Papers |
| --- | --- |
| Capital University of Science and Technology | [ , ] |
| Fırat University | [ , ] |
| IIT Kharagpur | [ – ] |
| Qufu Normal University | [ , ] |
| Kyoto-Kiel-Essex | [ – ] |
| University of Malaya-Bayero University | [ , ] |
| Pakistan | [ , ] |
| Hefei University of Technology | [ , ] |
| Shandong University | [ , ] |
| Australia | [ , ] |

Cooperation

Problems with cooperation are described by Beel et al. [ 16 ]. They state that even though approaches have been proposed by multiple authors, building upon prior work is rare. Cooperation between different research groups is also only encountered sporadically.

Here again we want to point to the fact that our observed time frame of less than three years might be too short to make substantive claims regarding this aspect. Table  12 holds information on the different numbers of authors for papers and the percentage of papers out of the 64 observed ones which are authored by groups of this size. We only encountered little cooperation between different co-author groups (see Haruna et al. [ 39 ] and Sakib et al. [ 88 ] for an exception). There were several groups not extending their previous work [ 121 , 122 ]. We refrain from analysing citations of related previous approaches as our considered period of less than three years is too short for all publications to have been able to be recognised by the wider scientific community.

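Table 12 Percentage of the 64 considered papers with different numbers of authors (#). Publications with 1 and 10 authors were encountered only once (1.56% each)

| # | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| % | 14.06 | 31.25 | 14.06 | 23.44 | 7.81 | 3.13 | 3.13 |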
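As a quick consistency check, the rounded percentages in the table above can be converted back into paper counts. The following is a minimal sketch; the variable names are illustrative and the counts are derived here by rounding, they are not reported in the survey itself.

```python
# Percentages from the table, keyed by number of authors.
# The 1- and 10-author entries come from the caption (1.56% each).
TOTAL_PAPERS = 64
share_by_authors = {
    1: 1.56, 2: 14.06, 3: 31.25, 4: 14.06, 5: 23.44,
    6: 7.81, 7: 3.13, 8: 3.13, 10: 1.56,
}

# Convert each rounded percentage back to a whole number of papers.
papers_by_authors = {
    n: round(pct / 100 * TOTAL_PAPERS) for n, pct in share_by_authors.items()
}

print(papers_by_authors)
print(sum(papers_by_authors.values()))  # should equal TOTAL_PAPERS
```

If the derived counts sum to 64, the percentages are consistent with the stated number of considered papers.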

Information scarcity

Information scarcity is described by Beel et al. [ 16 ] as researchers’ tendency to only provide insufficient detail to re-implement their approaches. This leads to problems with reproducibility.

Many of the approaches we encountered did not provide sufficient information to make a re-implementation possible: with Afsar et al. [ 1 ] it is unclear how the knowledge graph and categories were formed, Collins and Beel [ 28 ] do not describe their Doc2Vec enough, Liu et al. [ 61 ] do not specify the extraction of keywords for papers in the graph and Tang et al. [ 104 ] do not clearly describe their utilisation of Word2Vec. In general oftentimes details are missing [ 3 , 4 , 60 , 117 ]. Exceptions to these observations are e.g. found with Bereczki [ 19 ], Nishioka et al. [ 74 – 76 ] and Sakib et al. [ 88 ].

We did not find a single paper whose code was provided, e.g. as a link to GitHub.

Cold start

Pure collaborative filtering systems encounter the cold start problem as described by Bai et al. [ 9 ] and Shahid et al. [ 92 ]. For new users no historical data is available, so they cannot be compared to other users to find relevant recommendations.

While this problem still persists, most current approaches are not pure collaborative filtering based recommendation systems (see Sect. 3.3.1). Systems using deep learning could overcome this issue [ 58 ]. Some approaches specifically target this problem [ 59 , 96 ], and one of them [ 59 ] also introduced specific evaluation measures (totNP_EU and avgNP_EU) to quantify systems’ ability to overcome the cold start problem.
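The cold start problem can be illustrated with a toy user-based collaborative filtering step. This is a minimal sketch under assumed data; the user names, ratings and the `neighbours` helper are hypothetical and not taken from any surveyed approach.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no ratings on one side: nothing to compare
    return dot / (norm_a * norm_b)

# Hypothetical user -> {paper: rating} histories.
ratings = {
    "alice": {"p1": 5, "p2": 4},
    "bob": {"p1": 4, "p3": 5},
    "new_user": {},  # cold start: no history yet
}

def neighbours(user):
    """Other users whose rating history overlaps with that of `user`."""
    return [
        other for other in ratings
        if other != user and cosine(ratings[user], ratings[other]) > 0
    ]

print(neighbours("alice"))     # alice and bob both rated p1
print(neighbours("new_user"))  # no neighbours, so no basis for a recommendation
```

A system that also uses paper content or other side information can still produce suggestions for `new_user`, which is one reason why non-pure collaborative filtering systems are less affected.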

Sparsity or reduce coverage

Bai et al. [ 9 ] state that the user-paper matrix is sparse for collaborative filtering based approaches. Shahid et al. [ 92 ] also mention this problem as the reduce coverage problem. This trait makes it hard for approaches to learn the relevancy of infrequently rated papers.

Again, while this problem is still encountered, current approaches mostly are no longer pure collaborative filtering-based systems but instead utilise more information (see Sect.  3.3.1 ). Using deep learning in the recommendation process might reduce the impact of this problem [ 58 ].
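Sparsity of a user-paper matrix is commonly quantified as the fraction of empty cells. The sketch below uses hypothetical toy interactions; the function name and data are assumptions made for illustration only.

```python
def sparsity(interactions, n_users, n_papers):
    """Fraction of empty cells in the user-paper matrix."""
    observed = len(set(interactions))  # distinct (user, paper) cells with a rating
    return 1 - observed / (n_users * n_papers)

# Hypothetical toy data: (user, paper) pairs with a recorded rating.
interactions = [("u1", "p1"), ("u1", "p2"), ("u2", "p1"), ("u3", "p4")]

# 4 users x 5 papers = 20 cells, of which 4 are filled.
print(sparsity(interactions, n_users=4, n_papers=5))
```

In this toy matrix, paper p4 has a single rating and papers p3 and p5 have none, which is the situation in which relevancy is hard to learn.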

Scalability

The problem of scalability was described by Bai et al. [ 9 ]. They state paper recommendation systems should be able to work in huge, ever expanding environments where new users and papers are added regularly.

A few approaches [ 38 , 46 , 88 , 109 ] contain a web crawling step which directly tackles challenges related to outdated or missing data. Some approaches [ 26 , 61 ] evaluate the time it takes to compute paper recommendations which also indicates their focus on this general problem. But most times scalability is not explicitly mentioned by current paper recommendation systems. There are several works [ 42 , 45 , 96 , 108 , 116 ] evaluating on bigger datasets with over 1 million papers and which thus are able to handle big amounts of data. Sizes of current relevant real-world data collections exceed this threshold many times over (see, e.g. PubMed with over 33 million papers 29 or SemanticScholar with over 203 million papers 30 ). Kanakia et al. [ 45 ] explicitly state scalability as a problem their approach is able to overcome. Instead of comparing each paper to all other papers they utilise clustering to reduce the number of required computations. They present the only approach running on several hundred million publications. Nair et al. [ 71 ] mention scalability issues they encountered even when only considering around 25,000 publications and their citation relations.
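The effect of comparing papers only within clusters, instead of comparing each paper to all others, can be estimated by counting comparisons. This sketch only counts pairs; the cluster sizes are hypothetical and it does not reproduce the clustering used by Kanakia et al.

```python
def all_pairs(n):
    """Number of unordered pairs when every paper is compared to every other."""
    return n * (n - 1) // 2

def within_cluster_pairs(cluster_sizes):
    """Number of comparisons when papers are only compared inside their own cluster."""
    return sum(all_pairs(size) for size in cluster_sizes)

n_papers = 1_000_000
clusters = [1_000] * 1_000  # hypothetical: 1,000 equally sized clusters

print(all_pairs(n_papers))             # exhaustive comparison
print(within_cluster_pairs(clusters))  # clustered comparison
```

With these assumed sizes the clustered variant needs roughly a thousandth of the comparisons, at the price of missing related papers that fall into different clusters.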

Privacy

The problem of privacy in personalised paper recommendation is described by Bai et al. [ 9 ]. Shahid et al. [ 92 ] also mention this as a problem occurring in collaborative filtering approaches. An issue is encountered when sensitive information such as habits or weaknesses that users might not want to disclose is used by a system. This leads to users having negative impressions of systems. Keeping sensitive information private should therefore be a main goal.

In the current approaches, we did not find any discussion of privacy concerns. Some approaches even explicitly utilise likes [ 84 ] or association rules [ 3 ] of other users while failing to mention privacy altogether. In approaches not incorporating any user data, this issue does not arise at all.

Serendipity

Serendipity is described by Bai et al. [ 9 ] as an attribute often encountered in collaborative filtering [ 16 ]. Usually paper recommender systems focus on identification of relevant papers even though also including not obviously relevant ones might enhance the overall recommendation. Junior researchers could profit from stray recommendations to broaden their horizon, senior researchers might be able to gain knowledge to enhance their research. The ratio between clearly relevant and serendipitous papers is crucial to prevent users from losing trust in the recommender system.

A main objective of the works of Nishioka et al. [ 74 – 76 ] is serendipity. Other approaches do not mention this aspect.

Unified scholarly data standards

Differing data formats of data collections are mentioned as a problem by Bai et al. [ 9 ]. They mention digital libraries containing relevant information which needs to be unified in order to use the data in a paper recommendation system. Additionally, the combination of datasets could also lead to problems.

Many of the approaches we observed do not consider data collection or preparation as part of the approach; they often only mention the combination of different datasets as part of the evaluation (see e.g. Du et al. [ 29 ], Li et al. [ 56 ] or Xie et al. [ 115 ]). Exceptions to this general rule are systems which contain a web crawling step for data (see e.g. Ahmad and Afzal [ 2 ] or Sakib et al. [ 88 ]). Even in these approaches, the combination of datasets and their diverse data formats is not identified as a problem.

Synonymy

Shahid et al. [ 92 ] describe the problem of synonymy encountered in collaborative filtering approaches. They define this problem as different words having the same meaning.

Even though there are still approaches (not necessarily CF ones) utilising basic TF-IDF representations of papers [ 2 , 42 , 86 , 95 ], nowadays this problem can be bypassed by using a text embedding method such as Doc2Vec or BERT.
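Why purely term-based representations suffer from synonymy can be seen in a small example. The sketch below uses raw term counts rather than full TF-IDF weighting, which is a simplification; the helper name and example phrases are assumptions. IDF weights only rescale shared terms, so they would not change the zero result for texts without common words.

```python
import math
from collections import Counter

def bow_cosine(text_a, text_b):
    """Cosine similarity of two texts represented as bags of lower-cased words."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Same meaning, no shared term: the term-based similarity is zero.
print(bow_cosine("paper recommendation", "article suggestion"))

# Shared terms: the similarity is high.
print(bow_cosine("paper recommendation", "paper recommendation system"))
```

An embedding-based representation would be expected to place the first pair close together, which is the bypass described above.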

Gray sheep

Gray sheep is a problem described by Shahid et al. [ 92 ] as an issue encountered in collaborative filtering approaches. They describe it as some users not consistently (dis)agreeing with any reference group.

We did not find any current approach mentioning this problem.

Black sheep

Black sheep is a problem described by Shahid et al. [ 92 ] as an issue encountered in collaborative filtering approaches. They describe it as some users not (dis)agreeing with any reference group.

Shilling attack

Shilling attacks are described by Shahid et al. [ 92 ] as a problem encountered in collaborative filtering approaches. They define this problem as users being able to manually enhance visibility of their own research by rating authored papers as relevant while negatively rating any other recommendations.

Although we did not find any current approach mentioning this problem, we assume it is no longer highly relevant as most approaches are no longer pure collaborative filtering ones. Additionally, none of the considered collaborative filtering approaches explicitly stated that relevance ratings are fed back into the system.
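The mechanism behind a shilling attack can be sketched with a toy ranking by mean rating. All ratings and paper names below are hypothetical, and ranking by mean rating is an assumed stand-in for a system that feeds relevance ratings back into its recommendations.

```python
from statistics import mean

# Hypothetical 1-5 relevance ratings per paper from genuine users.
ratings = {"own_paper": [2, 3, 2], "rival_paper": [4, 5, 4]}

def ranking(r):
    """Papers ordered by mean rating, best first."""
    return sorted(r, key=lambda p: mean(r[p]), reverse=True)

print(ranking(ratings))  # genuine ratings only

# The attacker up-rates their own paper and down-rates the other one.
attacked = {
    "own_paper": ratings["own_paper"] + [5] * 5,
    "rival_paper": ratings["rival_paper"] + [1] * 5,
}
print(ranking(attacked))  # the order flips
```

A system that never feeds ratings back, as in the approaches considered here, offers no such lever.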

Emerging challenges

In addition to the open challenges discussed in former literature reviews by Bai et al. [ 9 ], Beel et al. [ 16 ] and Shahid et al. [ 92 ] we identified the following problems and derive desirable goals for future approaches from them.

User evaluation

Paper recommendation is always targeted at human users. But oftentimes an evaluation with real users to quantify users’ satisfaction with recommended publications is simply not conducted [ 84 ]. Conducting huge user studies is not feasible [ 38 ]. So sometimes user data to evaluate with is fetched from the presented datasets [ 39 , 88 ] or user behaviour is artificially emulated [ 1 , 19 , 57 ]. Noteworthy counter-examples 31 are the studies by Bulut et al. [ 22 ] who emailed 50 researchers to rate relevancy of recommended articles or Chaudhuri et al. [ 26 ] who asked 45 participants to rate their acceptance of recommended publications. Another option to overcome this issue is utilisation of living labs as seen with ArXivDigest [ 36 ], Mr. DLib’s living lab [ 14 ] or LiLAS for the related tasks of dataset recommendation for scientific publications and multi-lingual document retrieval [ 91 ].

Desirable goal: Paper recommendation systems targeted at users should always contain a user evaluation with a description of the composition of participants.

Target audience

Current works mostly fail to clearly characterise the intended users of a system, and the varying interests of different types of users are not examined in their evaluations. There are some noteworthy counter-examples: Afsar et al. [ 1 ] mention cancer patients and their close relatives as intended target audience. Bereczki [ 19 ] identifies new users as a special group they want to recommend papers to. Hua et al. [ 42 ] consider users who start diving into a topic which they have not researched before. Sharma et al. [ 95 ] name subject matter experts incorporating articles into a medical knowledge base as their target audience. Shi et al. [ 96 ] clearly state use cases for their approach which always target users who are unaware of a topic but already have one interesting paper from the area. They strive to recommend more papers similar to the first one.

User characteristics such as registration status are already mentioned by Beel et al. [ 16 ] as a factor which is disregarded in evaluations. We want to extend this point and highlight the oftentimes missing or inadequate descriptions of the intended users of paper recommendation systems. Traits of users and their information needs are not only important for experiments but should also be regarded in the construction of an approach. The targeted audience of a paper recommendation system should influence its suggestions. Bai et al. [ 9 ] highlight the different needs of junior researchers, who should be recommended a broad variety of papers as they still have to figure out their direction. They state that recommendations for senior researchers should be more in line with their already established interests. Sugiyama and Kan [ 100 ] describe the need to help discover interdisciplinary research for this experienced user group. Most works do not recognise that paper recommendation systems may serve different functions for users depending on their level of seniority. If papers include an evaluation with real persons, they mix, e.g., Master’s students with professors but do not address their different goals or expectations of paper recommendation [ 74 ]. Chaudhuri et al. [ 26 ] have junior, experienced and expert users as participants of their study and give individual ratings but do not calculate evaluation scores per user group. In some studies the exact composition of test users is not even mentioned (see Table 9).
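Reporting evaluation scores per user group requires little more than grouping the individual ratings. The following is a minimal sketch with hypothetical ratings and group labels; it does not use data from any of the cited studies.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical study results: (participant group, relevance rating on a 1-5 scale).
responses = [
    ("junior", 4), ("junior", 5),
    ("experienced", 3), ("experienced", 4),
    ("expert", 2), ("expert", 3),
]

# Collect the ratings of each group separately.
by_group = defaultdict(list)
for group, rating in responses:
    by_group[group].append(rating)

overall = mean(rating for _, rating in responses)
per_group = {group: mean(values) for group, values in by_group.items()}

print(overall)    # a single score hides the differences between groups
print(per_group)  # per-group scores make them visible
```

In this made-up example the overall mean looks moderate, while the per-group means show that the assumed junior participants are far more satisfied than the assumed experts.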

Desirable goal: Definition and consideration of a specific target audience for an approach and evaluation with members of this audience. If there is no specific group of persons a system should suit best, this should be discussed, executed and evaluated accordingly.

Recommendation scenario

Papers suggested by an approach should either be ones to read [ 44 , 109 ], ones to cite, or ones fulfilling another specified information need such as helping patients in cancer treatment decision making [ 1 ]. Most works do not clearly state which is the case. Instead, recommended papers are only said to be related [ 4 , 28 ], relevant [ 4 , 5 , 26 , 27 , 38 , 42 , 45 , 48 , 56 , 57 , 105 , 115 , 117 ], satisfactory [ 42 , 61 ], suitable [ 21 ], or appropriate and useful [ 22 , 88 ], or a description of the tackled scenario is skipped altogether [ 3 , 37 , 39 , 84 ].

Even if the recommendation scenario is mentioned, in rare cases it does not perfectly fit the evaluated scenario. This can, e.g., be seen in the work of Jing and Yu [ 44 ], who propose paper recommendation for papers to read but evaluate on papers which were cited. Cited papers should always be ones which have been read beforehand, but the decision to cite papers can be influenced by multiple aspects [ 34 ].

Desirable goal: The clear description of the recommendation scenario is important for comparability of approaches as well as the validity of the evaluation.

Fairness/diversity

Anand et al. [ 8 ] define fairness as the balance between relevance and diversity of recommendation results. Only focusing on fit between the user or input paper and suggestions would lead to highly similar results which might not be vastly different from each other. Having diverse recommendation results can help cover multiple aspects of a user query instead of only satisfying the most prominent feature of the query [ 8 ]. In general more diverse recommendations provide greater utility for users [ 76 ]. Ekstrand et al. [ 31 ] give a detailed overview of current constructs for measuring algorithmic fairness in information access and describe possibly arising problems in this context.

Most of the current paper recommendation systems do not consider fairness but some approaches specifically mention diversity [ 26 , 74 – 76 ] while striving to recommend relevant publications. Thus these systems consider fairness.

Over one fourth of considered approaches with an evaluation report MMR as a measure of their system’s quality. This at least seems to show researchers’ awareness of the general problem of diverse recommendation results.
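The balance between relevance and diversity described in this subsection can be sketched as a greedy re-ranking in the style of maximal marginal relevance. This assumes that MMR here denotes maximal marginal relevance; the function, the parameter `lam` and all scores are hypothetical and do not reproduce any surveyed system.

```python
def mmr_rerank(relevance, similarity, k, lam=0.7):
    """Greedily pick k items, trading relevance against similarity to items already picked."""
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(item):
            # Redundancy: highest similarity to any already selected item.
            redundancy = max((similarity[item][s] for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * redundancy
        best = max(sorted(candidates), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical relevance scores and pairwise similarities; a and b are near-duplicates.
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}
similarity = {
    "a": {"a": 1.0, "b": 0.95, "c": 0.1},
    "b": {"a": 0.95, "b": 1.0, "c": 0.2},
    "c": {"a": 0.1, "b": 0.2, "c": 1.0},
}

print(mmr_rerank(relevance, similarity, k=2, lam=1.0))  # relevance only
print(mmr_rerank(relevance, similarity, k=2, lam=0.5))  # relevance and diversity
```

With `lam=1.0` the two near-duplicates are returned; with `lam=0.5` the second slot goes to the less relevant but dissimilar paper.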

Desirable goal: Diversification of suggested papers to ensure fairness of the approach.

Complexity

Paper recommendation systems tend to become more complex, convoluted or composed of multiple parts. We observed this trend by regarding the classification of current systems compared to previous literature reviews (see Sect. 3.3.1). While systems’ complexity increases, users’ interaction with the systems should not become more complex. If an approach requires user interaction at all, it should be as simple as possible. Users should not be required to construct sophisticated knowledge graphs [ 109 ] or enter multiple rounds of keywords for an approach to learn their user profile [ 24 ].

Desirable goal: Maintain simplicity of usage even if approaches become more complex.

Explainability

Confidence in the recommendation system has already been mentioned by Beel et al. [ 16 ] as an example of what could enhance users’ satisfaction but is overlooked in approaches in favour of accuracy. This aspect should be considered with more vigour as the general research area of explainable recommendation has gained immense traction [ 120 ]. Gingstad et al. [ 36 ] regard explainability as a core component of paper recommendation systems. Xie et al. [ 116 ] mention explainability as a key feature of their approach but do not state how they achieve it or whether their explanations satisfy users. Suggestions of recommendation systems should be explainable to enhance their trustworthiness and make them more engaging [ 66 ]. Here, different explanation goals such as effectiveness, efficiency, transparency or trust and their influence on each other should be considered [ 10 ]. If an approach uses neural networks [ 24 , 37 , 49 , 56 ], it is oftentimes impossible to explain why the system learned that a specific suggested paper might be relevant.

Lee et al. [ 51 ] introduce a general approach which could be applied to any paper recommendation system to generate explanations for recommendations. Even though this option seems to help solve the described problem it is not clear how valuable post-hoc explanations are compared to systems which construct them directly.

Desirable goal: The conceptualisation of recommendation systems which comprehensibly explain to their users why a specific paper is suggested.

Public dataset

Current approaches utilise many different datasets (see Table  4 ). A large portion of them are built by the authors such that they are not publicly available for others to use as well [ 1 , 30 , 111 ]. Part of the approaches already use open datasets in their evaluation but a large portion still does not seem to regard this as a priority (see Table  5 ). Utilisation of already public data sources or construction of datasets which are also published and remain available thus should be a priority in order to support reproducibility of approaches.

Desirable goal: Utilisation of publicly available datasets in the evaluation of paper recommendation systems.

Comparability

Of the approaches we observed, many identified themselves as paper recommendation approaches but only evaluated against systems which are more general recommendation systems or ones utilising some of the same methodologies but not from the sub-domain of paper recommendation (seen with e.g. Guo et al. [ 37 ], Tanner et al. [ 106 ] or Yang et al. [ 117 ]). While some of the works might claim to only be applied to paper recommendation while being of more general applicability (see, e.g. the works by Ahmedi et al. [ 3 ] or Alfarhood and Cheng [ 4 ]), we state that they should still be compared to ones which mainly identify as paper recommendation systems, as seen in the work of Chaudhuri et al. [ 24 ]. Only if a more general approach is compared to a paper recommendation approach can its usefulness for the area of paper recommendation be fully assessed.

Several times, the baselines to evaluate against are not even other works but artificially constructed ones [ 2 , 38 ] or no other approach at all [ 22 ].

Desirable goal: Evaluation of paper recommendation approaches, even those which are applicable in a wider context, should always be against at least one paper recommendation system to clearly report relevance of the proposed method in the claimed context.

Discussion and outlook

Several of the already existing problems are still encountered in current paper recommendation approaches. Users are not always part of the approaches, so users are not always modelled, but this also prevents privacy issues. Accuracy still seems to be the main focus of recommendation systems. Novel techniques proposed in papers are not available online or applied by existing paper recommendation systems. Approaches do not provide enough details to enable re-implementation. Providing the code online or in a living lab environment could help overcome many of these issues.

Other problems mainly encountered in pure collaborative filtering systems, such as the cold start problem, sparsity, synonymy, gray sheep, black sheep and shilling attacks, do not seem to be as relevant anymore. We observed a trend towards hybrid models; this recommendation system type can overcome these issues. These hybrid models should also be able to produce serendipitous recommendations.

Unifying data sources is conducted often but nowadays it does not seem to be regarded as a problem. We encountered the same with scalability. Approaches are oftentimes able to handle millions of papers; they do not specifically mention scalability as a problem they overcome, but they also mostly do not consider huge datasets with several hundreds of millions of publications.

Due to the limited scope of our survey we are not able to derive substantive claims regarding cooperation and persistence. We found around 30% of approaches published by groups which authored multiple papers and very few collaborations between different author groups.

As for the newly introduced problems, part of the observed approaches conducted evaluations with users, on publicly available datasets and against other paper recommendation systems. Many works considered a low complexity for users. Even though user evaluations are desirable, they come with high costs. Usage of evaluation datasets with real human annotations could help overcome this issue partially; another straightforward solution would be incorporation in a living lab. The second option would also help with comparability of approaches. Usage of available datasets can become increasingly complicated if approaches use new data which is currently not contained in existing datasets. 32

Target audiences in general were rarely defined, and the recommendation scenario was mostly not described. Diversity was considered by few. Overall, the explainability of recommendations was disregarded. The first two of these issues could be fixed or addressed comparatively easily in the papers without changing the approach. As for diversity and explainability, the approaches would need to be modelled specifically such that these attributes could be satisfied.

To conclude, there are many challenges which are not consistently considered by current approaches. They define the requirements for future works in the area of paper recommendation systems.

This literature review of publications targeting paper recommendation between January 2019 and October 2021 provided comprehensive overviews of their methods, datasets and evaluation measures. We showed the need for a richer multi-dimensional characterisation of paper recommendation as former ones no longer seem sufficient in classifying the increasingly complex approaches. We also revisited known open challenges in the current time frame and highlighted possibly under-observed problems which future works could focus on.

Efforts should be made to standardise or better differentiate between the varying notions of relevancy and recommendation scenarios when it comes to paper recommendation. Future work could try to re-evaluate already existing methods with real humans and against other paper recommendation systems. This could for example be realised in an extendable paper recommendation benchmarking system similar to the living lab environments ArXivDigest [ 36 ], Mr. DLib’s living lab [ 14 ] or LiLAS [ 91 ] but with the additional property that it also provides built-in offline evaluations. As fairness and explainability of current paper recommendation systems have not been tackled widely, those aspects should be further explored. Another direction could be the comparison of multiple rare evaluation measures on the same system to help identify those which should be focused on in the future. As we observed a vast variety in datasets utilised for evaluation of the approaches (see Table 4), construction of publicly available and widely reusable ones would be worthwhile.

Funding Information

Open Access funding enabled and organized by Projekt DEAL.

1 The most recent surveys [ 9 , 58 , 92 ] focusing on scientific paper recommendation appeared in 2019 such that this time frame is not yet covered.

2 Non-immediate variants allow using methods which require more time to compute recommendations. Temporal patterns of user behaviour could be incorporated in the recommendation process to identify a fitting moment to present new recommendations to a user. The moment a recommendation is presented to a user influences their interest, as the delayed recommendation might no longer be relevant or does not fit the current task of a user.

3 https://dl.acm.org/ .

4 https://dblp.uni-trier.de/ .

5 https://scholar.google.com/ .

6 https://link.springer.com/ .

7 For a survey of current trends in citation recommendation refer to Färber and Jatowt [ 32 ].

8 These papers could either be a demo paper and a later published full paper or the conference and journal version of the same approach, which is then slightly extended by more experiments. These paper clusters are no exact duplicates or fraudulent publications.

9 The number of citations can be regarded both as an input data as well as a method to denote popularity.

10 https://dblp.uni-trier.de/xml/ .

11 https://www.aminer.org/citation .

12 (shortened) http://shorturl.at/cIQR1 .

13 https://github.com/js05212/citeulike-a .

14 https://github.com/js05212/citeulike-t .

15 https://dl.acm.org/ .

16 https://www.scopus.com/home.uri .

17 https://www.aminer.org/ .

18 https://aan.how/download/ .

19 https://citeseerx.ist.psu.edu/index .

20 https://bulkdata.uspto.gov/ .

21 https://snap.stanford.edu/data/cit-HepTh.html .

22 (shortened) http://shorturl.at/orwXY .

23 http://mlg.ucd.ie/datasets/bbc.html .

24 https://sites.google.com/site/tinhuynhuit/dataset .

25 One approach is described in three papers.

26 Shi et al. [ 96 ] also conduct a user study but do not describe their participants.

27 Compare the 99,363 journal articles and 151,617 conference papers published in 2013 to the 187,263 journal articles and 157,460 conference articles in 2021 in dblp.

28 Note that not all approaches classified their type of paper recommendation and several papers did not classify themselves in the wide-spread categorisation (see Sect.  3.3.1 ).

29 https://pubmed.ncbi.nlm.nih.gov/ .

30 https://www.semanticscholar.org/product/api .

31 For a full list of approaches conducting user studies see Table  9 .

32 We did not encounter many papers utilising types of data as part of their approach, which is not typically included in existing datasets; one of the noteworthy exceptions could be the approach by Nishioka et al. [ 74 – 76 ], which utilised Tweets of users.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Christin Katharina Kreutz

Ralf Schenkel

How to Write Recommendations in Research | Examples & Tips

Published on September 15, 2022 by Tegan George . Revised on July 18, 2023.

Recommendations in research are a crucial component of your discussion section and the conclusion of your thesis , dissertation , or research paper .

As you conduct your research and analyze the data you collected, perhaps there are ideas or results that don’t quite fit the scope of your research topic. Or maybe your results suggest further implications, or causal relationships between previously studied variables, beyond what is covered in extant research.

Recommendations for future research should be:

  • Concrete and specific
  • Supported with a clear rationale
  • Directly connected to your research

Overall, strive to highlight ways other researchers can reproduce or replicate your results to draw further conclusions, and suggest different directions that future research can take, if applicable.

Relatedly, when making these recommendations, avoid:

  • Undermining your own work, but rather offer suggestions on how future studies can build upon it
  • Suggesting recommendations actually needed to complete your argument, but rather ensure that your research stands alone on its own merits
  • Using recommendations as a place for self-criticism, but rather as a natural extension point for your work

There are many different ways to frame recommendations, but the easiest is perhaps to follow the formula of research question → conclusion → recommendation. Here’s an example.

Conclusion An important condition for controlling many social skills is mastering language. If children have a better command of language, they can express themselves better and are better able to understand their peers. Opportunities to practice social skills are thus dependent on the development of language skills.

As a rule of thumb, try to limit yourself to only the most relevant future recommendations: ones that stem directly from your work. While you can have multiple recommendations for each research conclusion, it is also acceptable to have one recommendation that is connected to more than one conclusion.

These recommendations should be targeted at your audience, specifically toward peers or colleagues in your field that work on similar subjects to your paper or dissertation topic . They can flow directly from any limitations you found while conducting your work, offering concrete and actionable possibilities for how future research can build on anything that your own work was unable to address at the time of your writing.

See below for a full research recommendation example that you can use as a template to write your own.

Recommendation in research example

While it may be tempting to present new arguments or evidence in your thesis or dissertation conclusion, especially if you have a particularly striking argument you’d like to finish your analysis with, you shouldn’t. Theses and dissertations follow a more formal structure than this.

All your findings and arguments should be presented in the body of the text (more specifically, in the discussion section and results section). The conclusion is meant to summarize and reflect on the evidence and arguments you have already presented, not introduce new ones.

The conclusion of your thesis or dissertation should include the following:

  • A restatement of your research question
  • A summary of your key arguments and/or results
  • A short discussion of the implications of your research

For a stronger dissertation conclusion, avoid including:

  • Important evidence or analysis that wasn’t mentioned in the discussion section and results section
  • Generic concluding phrases (e.g. “In conclusion …”)
  • Weak statements that undermine your argument (e.g., “There are good points on both sides of this issue.”)

Your conclusion should leave the reader with a strong, decisive impression of your work.

In a thesis or dissertation, the discussion is an in-depth exploration of the results, going into detail about the meaning of your findings and citing relevant sources to put them in context.

The conclusion is shorter and more general: it concisely answers your main research question and makes recommendations based on your overall findings.

Cite this Scribbr article

George, T. (2023, July 18). How to Write Recommendations in Research | Examples & Tips. Scribbr. Retrieved August 19, 2024, from https://www.scribbr.com/dissertation/recommendations-in-research/

Scientific paper recommendation systems: a literature review of recent publications

  • Stergiopoulos, V., Vassilakopoulos, M., Tousidou, E., Corral, A. (2024). An academic recommender system on large citation data based on clustering, graph modeling and deep learning. Knowledge and Information Systems 66(8), 4463-4496. Online publication date: 1-Aug-2024. https://dl.acm.org/doi/10.1007/s10115-024-02094-7
  • Li, W., Xie, Y., Jiang, H., Sun, Y. (2023). Differentiable Topics Guided New Paper Recommendation. Neural Information Processing, 44-56. Online publication date: 20-Nov-2023. https://dl.acm.org/doi/10.1007/978-981-99-8076-5_4

Index Terms

Information systems

Information retrieval

Retrieval tasks and goals

Document filtering

Information extraction

Information systems applications

Recommendations

Acknowledgments in scientific publications: presence in Spanish science and text patterns across disciplines

The acknowledgments in scientific publications are an important feature in the scholarly communication process. This research analyzes funding acknowledgment presence in scientific publications and introduces a novel approach for discovering text ...

Retracted publications in the biomedical literature with authors from mainland China

The number of retracted articles with Chinese authors has raised much attention, but no systematic study has specifically explored the retraction of academic publications by researchers from mainland China. Here, we determined the characteristics of ...

Scholarly recommendation systems: a literature survey

A scholarly recommendation system is an important tool for identifying prior and related resources such as literature, datasets, grants, and collaborators. A well-designed scholarly recommender significantly saves the time of researchers and can ...

Publisher: Springer-Verlag, Berlin, Heidelberg

Author tags

  • Paper recommendation system
  • Publication suggestion
  • Literature review
  • Research-article

Funding Sources

  • Technische Hochschule Köln (3331)

Recommendation System for Thesis Topics Using Content-based Filtering

Aina Musdholifah

When pursuing a bachelor's degree, every student is required to complete a thesis in order to graduate from their major. However, during the process, students face several difficulties in choosing their thesis topics. Therefore, a recommendation system is needed to classify thesis topics based on the students’ interests and abilities. This study developed a recommendation system for thesis topics using content-based filtering, in which students are asked to choose the courses they are interested in along with their grades. After collecting the required data, the recommendation system processes it and then shows the titles and abstracts of publications that fit the criteria. Two datasets are used in this research: lecturer publications from a three-year period and syllabus data for Computer Science courses at UGM. The experiments found that the recommendation system has an average running time of 7.46 seconds. It was also ...

Related Papers

Business and Economics Journal

Dalvinder singh Grewal

IRJMETS Publication

International Research Journal of Modernization in Engineering Technology and Science (IRJMETS)

Nowadays, students want to join their dream colleges according to the branches and facilities they require for their studies. However, due to a lack of information about colleges, they face difficulties in choosing a course and a college after completing their standard education. We are creating a recommendation system that helps students choose the course and college they are looking for. We collect information about the colleges, including placements, courses, and facilities, from former and current students. We rank the colleges based on NAAC, NBA, and placements. This helps students who are looking for the best colleges to join. The objective of the project is to enable students to find colleges using the proposed recommendation system, which uses a collaborative filtering algorithm. When a student searches for colleges, the recommendation system shows a list of colleges matching the student's criteria. The student can then choose the college in which he or she is interested in seeking admission. The project is implemented using the RapidMiner tool.

THE IJES Editor

The choice Master's degree students have to make regarding elective subject selection in a chosen specialization is really decisive. A wrong decision may affect their personal and academic goals and may impact negatively on their future professional direction. Making bad choices on the elective subjects to offer at this stage may lead to loss of interest by the student, which can result to dropping out of the higher degree programme. It is therefore important that students are given support so as to make the right choices regarding elective courses in a chosen specialization, using decision support systems. There are records of successful use of recommender systems to suggest items to users in several domains, like education, e-commerce, entertainment domains, and the like. In this work, 5 supervised machine learning algorithms are evaluated to determine the most efficient for the training and prediction on elective courses to be offered by a Master's Degree student based on the student's background knowledge of undergraduate courses, and on the academic record of previous Master's students. This research employed an experimental approach, and a Python software application is modeled with the Object-Oriented Analysis and Design (OOAD) using the standards notations and techniques of the Unified Modelling Language (UML) to build the recommender system. The result from the evaluation of the five machine learning models shows that the Naïve Bayes and Decision Tree algorithms have equal accuracy value of 99.409%, which is the highest and an equal F1 Score of 0.918, which is equally the highest. Decision tree was selected to be the classifier model for the recommender system.

International Journal of Modern Education and Computer Science

Agis Susanto

Indonesian Journal of Computing and Cybernetics Systems

annisaa utami

Case Based Reasoning (CBR) is a method that aims to resolve a new case by adapting the solutions contained in previous cases that are similar to the new case. The system built in this study is the CBR system to make recommendations on the topic of student thesis concentration. This study used data from undergraduate students of Informatics Engineering IST AKPRIND Yogyakarta with a total of 115 data consisting of 80 training data and 35 test data. This study aims to design and build a Case Based Reasoning system using the Nearest Neighbor and Manhattan Distance Similarity Methods, and to compare the results of the accuracy value using the Nearest Neighbor Similarity and Manhattan Distance Similarity methods. The recommendation process is carried out by calculating the value of closeness or similarity between new cases and old cases stored on a case basis using the Nearest Neighbor Method and Manhattan Distance. The features used in this study consisted of GPA and course grades. The ...

Journal of Physics: Conference Series https://iopscience.iop.org/issue/1742-6596/1899/1

Aeri Rachmad , Eka Sari

The thesis topic is an inseparable part of tertiary education, and determining it is a problem for students. The choice of thesis topic tends to follow topic trends in the development of computer science, and it often ignores the student's own abilities. Ideally, the record of student grades, which is contained in the transcript, should be an important variable in deciding topics for students. Therefore, this study uses the Support Vector Machine (SVM) method to recommend thesis topics by classifying selected subject groups that students have taken. SVM is a supervised classification method: it requires training data to build the model and testing data to evaluate its predictions. SVM provides an optimal model, namely the solution with the maximum margin between the data and the hyperplane. The test results show an accuracy of 80%.

International Journal for Research in Applied Science & Engineering Technology

IJRASET Publication

In the digital world of online information, a huge number of Massive Open Online Courses (MOOCs) are available across different categories and domains. Multiple online courses are available on different platforms, and finding an appropriate course among them is difficult for students. Recommender systems play a vital role in finding appropriate courses for students. Managing a massive amount of information and identifying individual users' choices and behavior has become a tedious task, so the aim of a recommender system is to suggest relevant courses to a student based on user behavior and similarity with other courses. Several recommender system techniques are in use, such as content-based, collaborative, and knowledge-based. This paper aims to build a hybrid approach using collaborative filtering with content-based filtering. The system's recommendations are based on course descriptions and ratings. Experiments were conducted on real datasets to measure the overall performance of the proposed system.

I. INTRODUCTION

With the ever-growing volume of online information, recommender systems are an efficient strategy to overcome such information overload. Recommender systems are designed to recommend things to the user based on various factors. Companies like YouTube, Netflix, and Amazon use recommender systems to recommend the right products, videos, or movies to their users. Advances in technology have changed the way of education. MOOCs give many learners access to courses over the web. A recommender system (RS) is a computerized system that suggests items to a user. The number of MOOCs and the number of students registered in them are growing every year. In 2018, more than 900 universities were offering MOOCs with 11,400 courses available, and around 101 million students had registered in them (Shah, 2018), providing learners with a wide variety of choices. With such a high number of courses available, learners now face the problem of choosing courses without being overwhelmed. With the rise in e-commerce and online business, the number of users interested in online Web services has increased. Both MOOC providers and online businesses advertise their courses and services while learners look for courses that match their interests and needs. In these situations, recommender systems play a crucial role and have attracted the attention of researchers.

Recommender systems are algorithms and techniques that suggest matching and related courses or services to learners based on their preferences, knowledge of which comes from learner profiles and system-gathered histories. They help MOOC providers grow and help learners find more appropriate and customized services based on their personalities and interests. Recommender systems discover patterns in large datasets to learn the preferences of different users and predict items that match their needs. Recommender systems are divided into two broad categories: collaborative filtering recommender systems and content-based recommender systems. Collaborative filtering recommender systems base recommendations on the assumption that users who have had similar tastes in the past will make similar choices in the future. Content-based recommender systems consider the profiles of users and items. Online course recommendation systems suggest to students the best courses in which they are interested. This paper presents a recommendation methodology that recommends courses to students based on the similarity between courses taken by the target student and by other students. It aims to provide effective course recommendations using multiple techniques. The students are clustered into groups using traditional data-mining (DM) techniques, and collaborative filtering is then applied using a knowledge base.

International Journal of Advanced Computer Science and Applications

mustafa man

Turkish Journal of Electrical Engineering & Computer Sciences

Miftahul Jannat Mokarrama

Recommender system (RS) is a knowledge discovery and decision-making system that has been extensively used in a myriad of applications to assist people in making distinct choices from vast sources. This paper proposes a recommendation system that will help the prospective students of Bangladesh in choosing the most suitable private universities for getting admission. Since selecting the best private university does not depend merely on a few criteria or choices and making a decision considering all those criteria is not an easy task, a recommendation system can be of great assistance in this scenario for the prospective students. In this proposed recommendation system a list of top-K private universities is recommended to the students who are willing to get admitted to the private universities using content-based filtering technique. To attain this goal we considered six parameters, namely grade point average (GPA) of secondary school certificate (SSC) examination, GPA of higher secondary certificate (HSC) examination, total GPA, tuition fees, university ratings, and university rankings. Finally, we evaluated the system with a total of 947 real feedback from prospective students and obtained the accuracies of 89.05%, 95.85%, 48%, 92.32%, and 71.93% using 5 different performance metrics: precision, recall, specificity, F1 score, and balanced accuracy, respectively.

Journal of Physics: Conference Series

Dede Kurniadi

This article proposes a framework for an Intelligent Recommender System (IRS) for students in higher education institutions. This conceptual framework covers predicting student performance and the possibility of graduating on time, and recommending subjects according to performance and career interests, which is useful for assisting pedagogical interventions in future student development. The successful development and implementation of the proposed IRS framework depends on using data mining and machine learning techniques for prediction and recommendation. Data analysis consisted of clustering techniques, association rules, and classification using Support Vector Machine (SVM), Naïve Bayes, and k-Nearest Neighbour (k-NN). These techniques are used to solve problems related to students and to provide appropriate recommendations. The result is an IRS conceptual framework for college students that can be used as smart agents to provide s...

  • Survey paper
  • Open access
  • Published: 03 May 2022

A systematic review and research perspective on recommender systems

  • Deepjyoti Roy (ORCID: orcid.org/0000-0002-8020-7145)
  • Mala Dutta

Journal of Big Data, volume 9, Article number: 59 (2022)

Recommender systems are efficient tools for filtering online information, which is widespread owing to the changing habits of computer users, personalization trends, and emerging access to the internet. Even though the recent recommender systems are eminent in giving precise recommendations, they suffer from various limitations and challenges like scalability, cold-start, sparsity, etc. Due to the existence of various techniques, the selection of techniques becomes a complex work while building application-focused recommender systems. In addition, each technique comes with its own set of features, advantages and disadvantages which raises even more questions, which should be addressed. This paper aims to undergo a systematic review on various recent contributions in the domain of recommender systems, focusing on diverse applications like books, movies, products, etc. Initially, the various applications of each recommender system are analysed. Then, the algorithmic analysis on various recommender systems is performed and a taxonomy is framed that accounts for various components required for developing an effective recommender system. In addition, the datasets gathered, simulation platform, and performance metrics focused on each contribution are evaluated and noted. Finally, this review provides a much-needed overview of the current state of research in this field and points out the existing gaps and challenges to help posterity in developing an efficient recommender system.

Introduction

Recent advancements in technology, along with the prevalence of online services, have made it possible to access a huge amount of online information faster. Users can post reviews, comments, and ratings for various types of services and products available online. However, recent advancements in pervasive computing have resulted in an online data overload problem. This data overload complicates the process of finding relevant and useful content over the internet. The recent establishment of several procedures with lower computational requirements can, however, guide users to the relevant content much more easily and quickly. Because of this, the development of recommender systems has recently gained significant attention. In general, recommender systems act as information filtering tools, offering users suitable and personalized content or information. Recommender systems primarily aim to reduce the user’s effort and time required for searching relevant information over the internet.

Nowadays, recommender systems are being increasingly used for a large number of applications such as web [1, 67, 70], books [2], e-learning [4, 16, 61], tourism [5, 8, 78], movies [66], music [79], e-commerce, news, specialized research resources [65], television programs [72, 81], etc. It is therefore important to build high-quality and exclusive recommender systems for providing personalized recommendations to the users in various applications. Despite the various advances in recommender systems, the present generation of recommender systems requires further improvements to provide more efficient recommendations applicable to a broader range of applications. Further investigation of the latest works on recommender systems that focus on diverse applications is therefore required.

There is hardly any review paper that has categorically synthesized and reviewed the literature of all the classification fields and application domains of recommender systems. The few existing literature reviews in the field cover just a fraction of the articles or focus only on selected aspects such as system evaluation. Thus, they do not provide an overview of the application field, algorithmic categorization, or identify the most promising approaches. Also, review papers often neglect to analyze the dataset description and the simulation platforms used. This paper aims to fulfil this significant gap by reviewing and comparing existing articles on recommender systems based on a defined classification framework, their algorithmic categorization, simulation platforms used, applications focused, their features and challenges, dataset description and system performance. Finally, we provide researchers and practitioners with insight into the most promising directions for further investigation in the field of recommender systems under various applications.

In essence, recommender systems deal with two entities, users and items, where each user gives a rating (or preference value) to an item (or product). User ratings are generally collected by using implicit or explicit methods. Implicit ratings are collected indirectly from the user through the user’s interaction with the items. For example, a website may obtain implicit ratings for different items based on clickstream data or from the amount of time a user spends on a webpage. Explicit ratings, on the other hand, are given directly by the user by picking a value on some finite scale of points or labelled interval values. Most recommender systems gather user ratings through both explicit and implicit methods. The feedback or ratings provided by the user are arranged in a user-item matrix called the utility matrix, as presented in Table 1.

The utility matrix often contains many missing values. The problem of recommender systems is mainly focused on finding the values which are missing in the utility matrix. This task is often difficult as the initial matrix is usually very sparse because users generally tend to rate only a small number of items. It may also be noted that we are interested in only the high user ratings because only such items would be suggested back to the users. The efficiency of a recommender system greatly depends on the type of algorithm used and the nature of the data source—which may be contextual, textual, visual etc.
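As a rough illustration of the sparsity described above, the following Python sketch builds a small utility matrix and measures how many cells are missing. The user names, item names, and ratings are hypothetical and are not taken from Table 1; `None` is an assumed convention for a missing rating.

```python
# Hypothetical utility matrix: rows are users, columns are items.
# None marks a rating the user has not given (a missing value).
utility = {
    "user_a": {"item_1": 5, "item_2": None, "item_3": 1, "item_4": None},
    "user_b": {"item_1": None, "item_2": 4, "item_3": None, "item_4": None},
    "user_c": {"item_1": 4, "item_2": None, "item_3": None, "item_4": 2},
}


def sparsity(matrix):
    """Return the fraction of user-item cells that have no rating."""
    cells = [r for row in matrix.values() for r in row.values()]
    missing = sum(1 for r in cells if r is None)
    return missing / len(cells)


def unrated_items(matrix, user):
    """Items whose rating a recommender would have to predict for `user`."""
    return [item for item, r in matrix[user].items() if r is None]


if __name__ == "__main__":
    print(f"sparsity: {sparsity(utility):.2f}")
    print("to predict for user_b:", unrated_items(utility, "user_b"))
```

In this toy matrix 7 of the 12 cells are empty; real utility matrices are typically far sparser, which is why predicting the missing entries is the core task.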

Types of recommender systems

Recommender systems are broadly categorized into three types: content-based recommender systems, collaborative recommender systems and hybrid recommender systems. A diagrammatic representation of the different types of recommender systems is given in Fig. 1.

figure 1

Content-based recommender system

In content-based recommender systems, all the data items are collected into different item profiles based on their description or features. For example, in the case of a book, the features will be author, publisher, etc. In the case of a movie, the features will be the movie director, actor, etc. When a user gives a positive rating to an item, then the other items present in that item profile are aggregated together to build a user profile. This user profile combines all the item profiles whose items are rated positively by the user. Items present in this user profile are then recommended to the user, as shown in Fig. 2.

figure 2

One drawback of this approach is that it demands in-depth knowledge of the item features for an accurate recommendation. This knowledge or information may not be always available for all items. Also, this approach has limited capacity to expand on the users' existing choices or interests. However, this approach has many advantages. As user preferences tend to change with time, this approach has the quick capability of dynamically adapting itself to the changing user preferences. Since one user profile is specific only to that user, this algorithm does not require the profile details of any other users because they provide no influence in the recommendation process. This ensures the security and privacy of user data. If new items have sufficient description, content-based techniques can overcome the cold-start problem i.e., this technique can recommend an item even when that item has not been previously rated by any user. Content-based filtering approaches are more common in systems like personalized news recommender systems, publications, web pages recommender systems, etc.
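The content-based idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any system surveyed here: the item names and feature strings are hypothetical, the user profile is assumed to be a simple count of features over liked items, and cosine similarity is one common scoring choice among several.

```python
import math
from collections import Counter

# Hypothetical item profiles: each item is described by a set of features.
item_profiles = {
    "book_1": {"author:smith", "genre:scifi"},
    "book_2": {"author:smith", "genre:fantasy"},
    "book_3": {"author:jones", "genre:scifi"},
    "book_4": {"author:lee", "genre:history"},
}


def build_user_profile(liked_items):
    """Aggregate the features of positively rated items into feature counts."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_profiles[item])
    return profile


def cosine(profile, features):
    """Cosine similarity between a feature-count profile and a binary item profile."""
    dot = sum(profile[f] for f in features)
    norm_p = math.sqrt(sum(v * v for v in profile.values()))
    norm_i = math.sqrt(len(features))
    return dot / (norm_p * norm_i) if norm_p and norm_i else 0.0


def recommend(liked_items, top_n=2):
    """Rank the items the user has not rated by similarity to the user profile."""
    profile = build_user_profile(liked_items)
    candidates = [i for i in item_profiles if i not in liked_items]
    ranked = sorted(candidates, key=lambda i: cosine(profile, item_profiles[i]), reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    print(recommend(["book_1", "book_2"], top_n=1))
```

Note that only the target user's own ratings are used, and an item with no ratings at all (such as `book_4`) can still be scored as long as it has a feature description, which mirrors the privacy and cold-start points made above.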

Collaborative filtering-based recommender system

Collaborative approaches make use of the measure of similarity between users. This technique starts by finding a group of users X whose preferences, likes, and dislikes are similar to those of user A. X is called the neighbourhood of A. The new items which are liked by most of the users in X are then recommended to user A. The efficiency of a collaborative algorithm depends on how accurately the algorithm can find the neighbourhood of the target user. Traditionally, collaborative filtering-based systems suffer from the cold-start problem and from privacy concerns, as there is a need to share user data. However, collaborative filtering approaches do not require any knowledge of item features for generating a recommendation. Also, this approach can help to expand on the user’s existing interests by discovering new items. Collaborative approaches are again divided into two types: memory-based approaches and model-based approaches.

Memory-based collaborative approaches recommend new items by taking into consideration the preferences of its neighbourhood. They make use of the utility matrix directly for prediction. In this approach, the first step is to build a model. The model is equal to a function that takes the utility matrix as input.

Model = f (utility matrix)

Then recommendations are made based on a function that takes the model and user profile as input. Here we can make recommendations only to users whose user profile belongs to the utility matrix. Therefore, to make recommendations for a new user, the user profile must be added to the utility matrix, and the similarity matrix should be recomputed, which makes this technique computation heavy.

Recommendation = f (defined model, user profile) where user profile  ∈  utility matrix

Memory-based collaborative approaches are again sub-divided into two types: user-based collaborative filtering and item-based collaborative filtering. In the user-based approach, the user rating of a new item is calculated by finding other users from the user neighbourhood who have previously rated that same item. If a new item receives positive ratings from the user neighbourhood, the new item is recommended to the user. Figure 3 depicts the user-based filtering approach.

figure 3

User-based collaborative filtering
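The user-based procedure described above can be illustrated with a minimal sketch. The toy utility matrix, the choice of cosine similarity over co-rated items, the neighbourhood size `k`, and all function names are assumptions made for illustration; they are not taken from the reviewed papers.

```python
from math import sqrt

# Hypothetical toy utility matrix: user -> {item: rating}; absent items are unrated.
ratings = {
    "A": {"i1": 5, "i2": 4},
    "B": {"i1": 5, "i2": 4, "i3": 5},
    "C": {"i1": 4, "i2": 5, "i3": 4},
    "D": {"i1": 1, "i2": 2, "i3": 1},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def neighbourhood(target, k=2):
    """Return the k (similarity, user) pairs most similar to the target user."""
    sims = [(cosine(ratings[target], r), u) for u, r in ratings.items() if u != target]
    return sorted(sims, reverse=True)[:k]

def predict_user_based(target, item, k=2):
    """Similarity-weighted average of the neighbours' ratings for the item."""
    pairs = [(s, ratings[u][item]) for s, u in neighbourhood(target, k) if item in ratings[u]]
    total = sum(s for s, _ in pairs)
    return sum(s * r for s, r in pairs) / total if total else None

# User A has not rated i3; the prediction is driven by A's two nearest neighbours.
print(neighbourhood("A"))
print(predict_user_based("A", "i3"))
```

Because the whole utility matrix is scanned for every prediction, adding a user means recomputing similarities, which is the computational burden noted earlier for memory-based methods.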

In the item-based approach, an item-neighbourhood is built consisting of all items similar to the new item which the user has rated previously. The user’s rating for the new item is then predicted by calculating the weighted average of the ratings in that item-neighbourhood, as shown in Fig. 4.

figure 4

Item-based collaborative filtering
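As a rough sketch of the item-based variant, the example below weights a user’s existing ratings by the similarity between each rated item and the new item. The data, the use of cosine similarity between item columns, and the function names are illustrative assumptions.

```python
from math import sqrt

# Hypothetical toy utility matrix: user -> {item: rating}.
ratings = {
    "A": {"i1": 5, "i2": 4},
    "B": {"i1": 5, "i2": 4, "i3": 5},
    "C": {"i1": 4, "i2": 5, "i3": 4},
    "D": {"i1": 1, "i2": 2, "i3": 1},
}

def item_vector(item):
    """Column of the utility matrix: user -> rating given to this item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def item_similarity(a, b):
    """Cosine similarity between two items over the users who rated both."""
    va, vb = item_vector(a), item_vector(b)
    common = set(va) & set(vb)
    if not common:
        return 0.0
    dot = sum(va[u] * vb[u] for u in common)
    norm_a = sqrt(sum(va[u] ** 2 for u in common))
    norm_b = sqrt(sum(vb[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict_item_based(user, new_item):
    """Weighted average of the user's own ratings, weighted by item similarity."""
    rated = ratings[user]
    weights = {i: item_similarity(i, new_item) for i in rated}
    total = sum(weights.values())
    if total == 0:
        return None
    return sum(weights[i] * rated[i] for i in rated) / total

# Predict user A's rating for the unseen item i3 from A's ratings of i1 and i2.
print(predict_item_based("A", "i3"))
```

In practice the item-item similarities are often precomputed, since item relationships tend to change more slowly than user neighbourhoods.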

Model-based systems use various data mining and machine learning algorithms to develop a model for predicting the user’s rating for an unrated item. They do not rely on the complete dataset when recommendations are computed but extract features from the dataset to compute a model. Hence the name, model-based technique. These techniques also need two steps for prediction—the first step is to build the model, and the second step is to predict ratings using a function (f) which takes the model defined in the first step and the user profile as input.

Recommendation = f (defined model, user profile) where user profile  ∉  utility matrix

Model-based techniques do not require adding the user profile of a new user into the utility matrix before making predictions. We can make recommendations even to users that are not present in the model. Model-based systems are more efficient for group recommendations. They can quickly recommend a group of items by using the pre-trained model. The accuracy of this technique largely relies on the efficiency of the underlying learning algorithm used to create the model. Model-based techniques are capable of solving some traditional problems of recommender systems such as sparsity and scalability by employing dimensionality reduction techniques [ 86 ] and model learning techniques.
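The two-step pattern (build a model, then score a profile that is not in the utility matrix) can be sketched with Weighted Slope One. This is a simple model-based method chosen here only for illustration and is not one of the techniques surveyed in this section. The data and function names are hypothetical.

```python
from collections import defaultdict

def fit_slope_one(utility):
    """Step 1: learn average rating differences between every pair of items."""
    diff = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for profile in utility.values():
        for j, rj in profile.items():
            for i, ri in profile.items():
                if i != j:
                    diff[j][i] += rj - ri
                    count[j][i] += 1
    dev = {j: {i: diff[j][i] / count[j][i] for i in diff[j]} for j in diff}
    return dev, count

def predict(model, profile, item):
    """Step 2: score an item for any profile, including users unseen in training."""
    dev, count = model
    num = den = 0.0
    for i, ri in profile.items():
        n = count.get(item, {}).get(i, 0)
        if n:
            num += (dev[item][i] + ri) * n
            den += n
    return num / den if den else None

# Hypothetical utility matrix used only to build the model.
utility = {
    "B": {"i1": 5, "i2": 3, "i3": 2},
    "C": {"i1": 3, "i2": 4},
    "D": {"i2": 2, "i3": 5},
}
model = fit_slope_one(utility)

# A new user whose profile is NOT part of the utility matrix.
new_user = {"i1": 4, "i2": 3}
print(predict(model, new_user, "i3"))
```

The model is computed once from the utility matrix; scoring the new profile only reads the stored deviations, so nothing has to be recomputed when an unseen user arrives.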

Hybrid filtering

A hybrid technique is an aggregation of two or more techniques employed together for addressing the limitations of individual recommender techniques. The incorporation of different techniques can be performed in various ways. A hybrid algorithm may incorporate the results achieved from separate techniques, or it can use content-based filtering in a collaborative method or use a collaborative filtering technique in a content-based method. This hybrid incorporation of different techniques generally results in increased performance and increased accuracy in many recommender applications. Some of the hybridization approaches are meta-level, feature-augmentation, feature-combination, mixed hybridization, cascade hybridization, switching hybridization and weighted hybridization [ 86 ]. Table 2 describes these approaches.
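Two of the hybridization approaches named above, weighted and switching, can be sketched as follows. The score dictionaries, the blending weight `alpha`, and the `min_ratings` threshold are assumptions for illustration; real systems would obtain the scores from trained content-based and collaborative components.

```python
def weighted_hybrid(content_scores, collab_scores, alpha=0.5):
    """Weighted hybridization: blend two score sets; alpha weights the content side."""
    items = set(content_scores) | set(collab_scores)
    return {
        i: alpha * content_scores.get(i, 0.0) + (1 - alpha) * collab_scores.get(i, 0.0)
        for i in items
    }

def switching_hybrid(content_scores, collab_scores, n_user_ratings, min_ratings=5):
    """Switching hybridization: fall back to content-based scores for sparse users."""
    return content_scores if n_user_ratings < min_ratings else collab_scores

def top_n(scores, n=2):
    """Highest-scoring items, ties broken alphabetically for determinism."""
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]

# Hypothetical per-item scores from the two underlying recommenders.
content = {"a": 0.9, "b": 0.2, "c": 0.5}
collab = {"a": 0.1, "b": 0.8, "d": 0.6}

print(top_n(weighted_hybrid(content, collab, alpha=0.7)))
print(top_n(switching_hybrid(content, collab, n_user_ratings=1)))
```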

Recommender system challenges

This section briefly describes the various challenges present in current recommender systems and offers different solutions to overcome these challenges.

Cold start problem

The cold start problem appears when the recommender system cannot draw any inference because the existing data is insufficient. Cold start refers to a condition in which the system cannot produce efficient recommendations for cold (or new) users who have not rated any item or have rated very few items. It generally arises when a new user enters the system or new items (or products) are inserted into the database. Some solutions to this problem are as follows: (a) Ask new users to explicitly mention their item preference. (b) Ask a new user to rate some items at the beginning. (c) Collect demographic information (or meta-data) from the user and recommend items accordingly.
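Solution (c) can be illustrated with a small sketch that recommends the items most often liked by existing users in the same demographic group. The profile structure, the `group` field, and the function name are hypothetical.

```python
from collections import Counter

def cold_start_recommend(user_group, profiles, n=2):
    """Recommend the items most frequently liked within the new user's group."""
    counts = Counter(
        item
        for p in profiles
        if p["group"] == user_group
        for item in p["liked"]
    )
    # Sort by descending count, then by name, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]

# Hypothetical existing users with a demographic attribute and liked items.
profiles = [
    {"group": "student", "liked": ["book1", "book2"]},
    {"group": "student", "liked": ["book2", "film1"]},
    {"group": "retired", "liked": ["film2"]},
]

print(cold_start_recommend("student", profiles))
```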

Shilling attack problem

This problem arises when a malicious user fakes his identity and enters the system to give false item ratings [ 87 ]. Such a situation occurs when the malicious user wants to either increase or decrease some item’s popularity by causing a bias on selected target items. Shilling attacks greatly reduce the reliability of the system. One solution to this problem is to detect the attackers quickly and remove the fake ratings and fake user profiles from the system.

Synonymy problem

This problem arises when similar or related items have different entries or names, or when the same item is represented by two or more names in the system [ 78 ]. For example, “babywear” and “baby cloth” may refer to the same item. Many recommender systems fail to recognize such equivalences, which reduces their recommendation accuracy. To alleviate this problem, methods such as demographic filtering, automatic term expansion and Singular Value Decomposition are used [ 76 ].

Latency problem

The latency problem is specific to collaborative filtering approaches and occurs when new items are frequently inserted into the database. This problem is characterized by the system’s failure to recommend new items. This happens because new items must be reviewed before they can be recommended in a collaborative filtering environment. Using content-based filtering may resolve this issue, but it may introduce overspecialization, increase the computing time and decrease system performance. To increase performance, the calculations can be done in an offline environment and clustering-based techniques can be used [ 76 ].

Sparsity problem

Data sparsity is a common problem in large scale data analysis, which arises when certain expected values are missing in the dataset. In the case of recommender systems, this situation occurs when the active users rate very few items. This reduces the recommendation accuracy. To alleviate this problem several techniques can be used such as demographic filtering, singular value decomposition and using model-based collaborative techniques.
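One common way to quantify sparsity, not given in the original text, is the fraction of empty cells in the utility matrix. The sketch below assumes a dictionary-based utility matrix and a known catalogue size; both are illustrative.

```python
def sparsity(utility, n_items):
    """Fraction of user-item cells that hold no rating."""
    observed = sum(len(profile) for profile in utility.values())
    return 1.0 - observed / (len(utility) * n_items)

# Hypothetical matrix: 3 users, 4 catalogue items, 5 observed ratings.
utility = {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i3": 4},
    "u3": {"i1": 2, "i4": 1},
}

print(sparsity(utility, n_items=4))
```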

Grey sheep problem

The grey sheep problem is specific to pure collaborative filtering approaches where the feedback given by one user does not match any user neighbourhood. In this situation, the system fails to accurately predict relevant items for that user. This problem can be resolved by using pure content-based approaches where predictions are made based on the user’s profile and item properties.

Scalability problem

Recommender systems, especially those employing collaborative filtering techniques, require large amounts of training data, which causes scalability problems. The scalability problem arises when the amount of data used as input to a recommender system increases quickly. In this era of big data, more and more items and users are rapidly getting added to the system and this problem is becoming common in recommender systems. Two common approaches used to solve the scalability problem are dimensionality reduction and clustering-based techniques, which search for similar users within small clusters instead of the complete database.
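The clustering idea can be sketched with a small k-means routine: users are grouped once, and the neighbour search for a target user is then restricted to the nearest cluster. The dense rating vectors, the fixed initial centroids, and the function names are assumptions made for illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )

def kmeans(points, centroids, iterations=10):
    """Plain k-means on named points; returns cluster memberships and centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for name, vec in points.items():
            clusters[nearest(vec, centroids)].append(name)
        updated = []
        for members, old in zip(clusters, centroids):
            if members:
                dims = range(len(old))
                updated.append([sum(points[n][d] for n in members) / len(members) for d in dims])
            else:
                updated.append(old)  # keep an empty cluster's centroid unchanged
        centroids = updated
    return clusters, centroids

# Hypothetical dense rating vectors (one value per item) for four users.
users = {
    "u1": [5, 5, 1, 1],
    "u2": [4, 5, 1, 2],
    "u3": [1, 1, 5, 4],
    "u4": [2, 1, 4, 5],
}
clusters, centroids = kmeans(users, [list(users["u1"]), list(users["u3"])])

# Only the members of the closest cluster are considered as neighbour candidates.
target = [5, 4, 2, 1]
candidates = clusters[nearest(target, centroids)]
print(candidates)
```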

Methodology

The purpose of this study is to understand the research trends in the field of recommender systems. The nature of research in recommender systems is such that it is difficult to confine each paper to a specific discipline. This is evident from the fact that research papers on recommender systems are scattered across journals in fields such as computer science, management, marketing, information technology and information science. Hence, this literature review is conducted over a wide range of electronic journals and research databases such as ACM Portal, IEEE/IEE Library, Google Scholar and Science Direct [ 88 ].

The search for online research articles was performed using 6 descriptors: “Recommender systems”, “Recommendation systems”, “Movie Recommend*”, “Music Recommend*”, “Personalized Recommend*”, “Hybrid Recommend*”. The following types of papers were excluded from our research:

News articles.

Master’s dissertations.

Non-English papers.

Unpublished papers.

Research papers published before 2011.

We have screened a total of 350 articles based on their abstracts and content. However, only research papers that described how recommender systems can be applied were chosen. Finally, 60 papers were selected from top international journals indexed in Scopus or E-SCI in 2021. We now present the PRISMA flowchart of the inclusion and exclusion process in Fig.  5 .

figure 5

PRISMA flowchart of the inclusion and exclusion process. *Abstract and content not suitable to the study. **The use or application of the recommender system is not specified.

Each paper was carefully reviewed and classified into 6 categories in the application fields and 3 categories in the techniques used to develop the system. The classification framework is presented in Fig.  6 .

figure 6

Classification framework

The largest single share of relevant articles comes from Expert Systems with Applications (23%), followed by IEEE (17%) and Knowledge-Based Systems (17%); the remaining 43% come from other journals. Table 3 depicts the article distribution by journal title and Table 4 depicts the sector-wise article distribution.

Both forward and backward searching techniques were implemented to establish that the review of 60 chosen articles can represent the domain literature. Hence, this paper can demonstrate its validity and reliability as a literature review.

Review on state-of-the-art recommender systems

This section presents a state-of-the-art literature review followed by a chronological review of the various existing recommender systems.

Literature review

In 2011, Castellano et al. [ 1 ] developed a “NEuro-fuzzy WEb Recommendation (NEWER)” system for exploiting the possibility of combining computational intelligence and user preference for suggesting interesting web pages to the user in a dynamic environment. It considered a set of fuzzy rules to express the correlations between user relevance and categories of pages. Crespo et al. [ 2 ] presented a recommender system for distance education over internet. It aims to recommend e-books to students using data from user interaction. The system was developed using a collaborative approach and focused on solving the data overload problem in big digital content. Lin et al. [ 3 ] have put forward a recommender system for automatic vending machines using Genetic algorithm (GA), k-means, Decision Tree (DT) and Bayesian Network (BN). It aimed at recommending localized products by developing a hybrid model combining statistical methods, classification methods, clustering methods, and meta-heuristic methods. Wang and Wu [ 4 ] have implemented a ubiquitous learning system for providing personalized learning assistance to the learners by combining the recommendation algorithm with a context-aware technique. It employed the Association Rule Mining (ARM) technique and aimed to increase the effectiveness of the learner’s learning. García-Crespo et al. [ 5 ] presented a “semantic hotel” recommender system by considering the experiences of consumers using a fuzzy logic approach. The system considered both hotel and customer characteristics. Dong et al. [ 6 ] proposed a structure for a service-concept recommender system using a semantic similarity model by integrating the techniques from the view of an ontology structure-oriented metric and a concept content-oriented metric. The system was able to deliver optimal performance when compared with similar recommender systems. Li et al. 
[ 7 ] developed a Fuzzy linguistic modelling-based recommender system for assisting users to find experts in knowledge management systems. The developed system was applied to the aircraft industry where it demonstrated efficient and feasible performance. Lorenzi et al. [ 8 ] presented an “assumption-based multiagent” system to make travel package recommendations using user preferences in the tourism industry. It performed different tasks like discovering, filtering, and integrating specific information for building a travel package following the user requirement. Huang et al. [ 9 ] proposed a context-aware recommender system through the extraction, evaluation and incorporation of contextual information gathered using the collaborative filtering and rough set model.

In 2012, Chen et al. [ 10 ] presented a diabetes medication recommender model by using “Semantic Web Rule Language (SWRL) and Java Expert System Shell (JESS)” for aggregating suitable prescriptions for the patients. It aimed at selecting the most suitable drugs from the list of specific drugs. Mohanraj et al. [ 11 ] developed the “Ontology-driven bee’s foraging approach (ODBFA)” to accurately predict the online navigations most likely to be visited by a user. The self-adaptive system is intended to capture the various requirements of the online user by using a scoring technique and by performing a similarity comparison. Hsu et al. [ 12 ] proposed a “personalized auxiliary material” recommender system by considering the specific course topics, individual learning styles, complexity of the auxiliary materials using an artificial bee colony algorithm. Gemmell et al. [ 13 ] demonstrated a solution for the problem of resource recommendation in social annotation systems. The model was developed using a linear-weighted hybrid method which was capable of providing recommendations under different constraints. Choi et al. [ 14 ] proposed one “Hybrid Online-Product rEcommendation (HOPE) system” by the integration of collaborative filtering through sequential pattern analysis-based recommendations and implicit ratings. Garibaldi et al. [ 15 ] put forward a technique for incorporating the variability in a fuzzy inference model by using non-stationary fuzzy sets for replicating the variabilities of a human. This model was applied to a decision problem for treatment recommendations of post-operative breast cancer.

In 2013, Salehi and Kmalabadi [ 16 ] proposed an e-learning material recommender system by “modelling of materials in a multidimensional space of material’s attribute”. It employed both content and collaborative filtering. Aher and Lobo [ 17 ] introduced a course recommender system using data mining techniques such as simple K-means clustering and Association Rule Mining (ARM) algorithm. The proposed e-learning system was successfully demonstrated for “MOOC (Massively Open Online Courses)”. Kardan and Ebrahimi [ 18 ] developed a hybrid recommender system for recommending posts in asynchronous discussion groups. The system was built combining both collaborative filtering and content-based filtering. It considered implicit user data to compute the user similarity with various groups, for recommending suitable posts and contents to its users. Chang et al. [ 19 ] adopted a cloud computing technology for building a TV program recommender system. The system designed for digital TV programs was implemented using Hadoop Fair Scheduler (HFC), K-means clustering and k-nearest neighbour (KNN) algorithms. It was successful in processing huge amounts of real-time user data. Lucas et al. [ 20 ] implemented a recommender model for assisting a tourism application by using associative classification and fuzzy logic to predict the context. Niu et al. [ 21 ] introduced “Affivir: An Affect-based Internet Video Recommendation System” which was developed by calculating user preferences and by using spectral clustering. This model recommended videos with similar effects, which was processed to get optimal results with dynamic adjustments of recommendation constraints.

In 2014, Liu et al. [ 22 ] implemented a new route recommendation model for offering personalized and real-time route recommendations for self-driven tourists to minimize the queuing time and traffic jams in famous tourist places. Recommendations were carried out by considering the preferences of users. Bakshi et al. [ 23 ] proposed an unsupervised learning-based recommender model for solving the scalability problem of recommender systems. The algorithm used transitive similarities along with the Particle Swarm Optimization (PSO) technique for discovering the global neighbours. Kim and Shim [ 24 ] proposed a recommender system based on “latent Dirichlet allocation using probabilistic modelling for Twitter” that could recommend the top-K tweets for a user to read, and the top-K users to follow. The model parameters were learned from an inference technique by using the differential Expectation–Maximization (EM) algorithm. Wang et al. [ 25 ] developed a hybrid-movie recommender model by aggregating a genetic algorithm (GA) with improved K-means and Principal Component Analysis (PCA) technique. It was able to offer intelligent movie recommendations with personalized suggestions. Kolomvatsos et al. [ 26 ] proposed a recommender system by considering an optimal stopping theory for delivering books or music recommendations to the users. Gottschlich et al. [ 27 ] proposed a decision support system for stock investment recommendations. It computed the output by considering the overall crowd’s recommendations. Torshizi et al. [ 28 ] have introduced a hybrid recommender system to determine the severity level of a medical condition. It could recommend suitable therapies for patients suffering from Benign Prostatic Hyperplasia.

In 2015, Zahálka et al. [ 29 ] proposed a venue recommender: “City Melange”. It was an interactive content-based model which used the convolutional deep-net features of the visual domain and the linear Support Vector Machine (SVM) model to capture the semantic information and extract latent topics. Sankar et al. [ 30 ] have proposed a stock recommender system based on the stock holding portfolio of trusted mutual funds. The system employed the collaborative filtering approach along with social network analysis for offering a decision support system to build a trust-based recommendation model. Chen et al. [ 31 ] have put forward a novel movie recommender system by applying the “artificial immune network to collaborative filtering” technique. It computed the affinity of an antigen and the affinity between an antibody and antigen. Based on this computation a similarity estimation formula was introduced which was used for the movie recommendation process. Wu et al. [ 32 ] have examined the technique of data fusion for increasing the efficiency of item recommender systems. It employed a hybrid linear combination model and used a collaborative tagging system. Yeh and Cheng [ 33 ] have proposed a recommender system for tourist attractions by constructing the “elicitation mechanism using the Delphi panel method and matrix construction mechanism using the repertory grids”, which was developed by considering the user preference and expert knowledge.

In 2016, Liao et al. [ 34 ] proposed a recommender model for online customers using a rough set association rule. The model computed the probable behavioural variations of online consumers and provided product category recommendations for e-commerce platforms. Li et al. [ 35 ] have suggested a movie recommender system based on user feedback collected from microblogs and social networks. It employed the sentiment-aware association rule mining algorithm for recommendations using the prior information of frequent program patterns, program metadata similarity and program view logs. Wu et al. [ 36 ] have developed a recommender system for social media platforms by aggregating the technique of Social Matrix Factorization (SMF) and Collaborative Topic Regression (CTR). The model was able to compute the ratings of users to items for making recommendations. For improving the recommendation quality, it gathered information from multiple sources such as item properties, social networks, feedback, etc. Adeniyi et al. [ 37 ] put forward a study of automated web-usage data mining and developed a recommender system that was tested in both real-time and online for identifying the visitor’s or client’s clickstream data.

In 2017, Rawat and Kankanhalli [ 38 ] have proposed a viewpoint recommender system called “ClickSmart” for assisting mobile users to capture high-quality photographs at famous tourist places. Yang et al. [ 39 ] proposed a gradient boosting-based job recommendation system for satisfying the cost-sensitive requirements of the users. The hybrid algorithm aimed to reduce the rate of unnecessary job recommendations. Lee et al. [ 40 ] proposed a music streaming recommender system based on smartphone activity usage. The proposed system benefitted by using feature selection approaches with machine learning techniques such as Naive Bayes (NB), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), Instance-based k-Nearest Neighbour (IBK), and Random Forest (RF) for performing the activity detection from the mobile signals. Wei et al. [ 41 ] have proposed a new stacked denoising autoencoder (SDAE) based recommender system for cold items. The algorithm employed deep learning and a collaborative filtering method to predict the unknown ratings.

In 2018, Li et al. [ 42 ] have developed a recommendation algorithm using Weighted Linear Regression Models (WLRRS). The proposed system was put to experiment using the MovieLens dataset and it presented better classification and predictive accuracy. Mezei and Nikou [ 43 ] presented a mobile health and wellness recommender system based on fuzzy optimization. It could recommend a collection of actions to be taken by the user to improve the user’s health condition. Recommendations were made considering the user’s physical activities and preferences. Ayata et al. [ 44 ] proposed a music recommendation model based on the user emotions captured through wearable physiological sensors. The emotion detection algorithm employed different machine learning algorithms like SVM, RF, KNN and decision tree (DT) algorithms to predict the emotions from the changing electrical signals gathered from the wearable sensors. Zhao et al. [ 45 ] developed a multimodal learning-based, social-aware movie recommender system. The model was able to successfully resolve the sparsity problem of recommender systems. The algorithm developed a heterogeneous network by exploiting the movie-poster image and textual description of each movie based on the social relationships and user ratings.

In 2019, Hammou et al. [ 46 ] proposed a Big Data recommendation algorithm capable of handling large scale data. The system employed random forest and matrix factorization through a data partitioning scheme. It was then used for generating recommendations based on user rating and preference for each item. The proposed system outperformed existing systems in terms of accuracy and speed. Zhao et al. [ 47 ] have put forward a hybrid initialization method for social network recommender systems. The algorithm employed denoising autoencoder (DAE) neural network-based initialization method (ANNInit) and attribute mapping. Bhaskaran and Santhi [ 48 ] have developed a hybrid, trust-based e-learning recommender system using cloud computing. The proposed algorithm was capable of learning online user activities by using the Firefly Algorithm (FA) and K-means clustering. Afolabi and Toivanen [ 59 ] have suggested an integrated recommender model based on collaborative filtering. The proposed model “Connected Health for Effective Management of Chronic Diseases”, aimed for integrating recommender systems for better decision-making in the process of disease management. He et al. [ 60 ] proposed a movie recommender system called “HI2Rec” which explored the usage of collaborative filtering and heterogeneous information for making movie recommendations. The model used the knowledge representation learning approach to embed movie-related information gathered from different sources.

In 2020, Han et al. [ 49 ] have proposed an Internet of Things (IoT)-based cancer rehabilitation recommendation system using the Beetle Antennae Search (BAS) algorithm. It presented the patients with a solution for the problem of optimal nutrition program by considering the objective function as the recurrence time. Kang et al. [ 50 ] have presented a recommender system for personalized advertisements in Online Broadcasting based on a tree model. Recommendations were generated in real-time by considering the user preferences to minimize the overhead of preference prediction and using a HashMap along with the tree characteristics. Ullah et al. [ 51 ] have implemented an image-based service recommendation model for online shopping based on random forest and Convolutional Neural Networks (CNN). The model used JPEG coefficients to achieve an accurate prediction rate. Cai et al. [ 52 ] proposed a new hybrid recommender model using a many-objective evolutionary algorithm (MaOEA). The proposed algorithm was successful in optimizing the novelty, diversity, and accuracy of recommendations. Esteban et al. [ 53 ] have implemented a hybrid multi-criteria recommendation system concerned with students’ academic performance, personal interests, and course selection. The system was developed using a Genetic Algorithm (GA) and aimed at helping university students. It combined both course information and student information for increasing system performance and the reliability of the recommendations. Mondal et al. [ 54 ] have built a multilayer, graph data model-based doctor recommendation system by exploiting the trust concept between a patient-doctor relationship. The proposed system showed good results in practical applications.

In 2021, Dhelim et al. [ 55 ] have developed a personality-based product recommending model using the techniques of meta path discovery and user interest mining. This model showed better results when compared to session-based and deep learning models. Bhalse et al. [ 56 ] proposed a web-based movie recommendation system based on collaborative filtering, using Singular Value Decomposition (SVD) and cosine similarity (CS), for addressing the sparsity problem of recommender systems. It suggested a recommendation list by considering the content information of movies. Similarly, to solve both sparsity and cold-start problems, Ke et al. [ 57 ] proposed a dynamic goods recommendation system based on reinforcement learning. The proposed system was capable of learning from the reduced entropy loss error on real-time applications. Chen et al. [ 58 ] have presented a movie recommender model combining various techniques like user interest with category-level representation, neighbour-assisted representation, user interest with latent representation and item-level representation using Feed-forward Neural Network (FNN).

Comparative chronological review

A comparative chronological review to compare the total contributions on various recommender systems in the past 10 years is given in Fig.  7 .

figure 7

Comparative chronological review of recommender systems under diverse applications

This review puts forward a comparison of the number of research works proposed in the domain of recommender systems from the year 2011 to 2021 using various deep learning and machine learning-based approaches. Research articles are categorized based on the recommender system classification framework as shown in Table 5 . The articles are ordered according to their year of publication. There are two key concepts: Application fields and techniques used. The application fields of recommender systems are divided into six different fields, viz. entertainment, health, tourism, web/e-commerce, education and social media/others.

Algorithmic categorization, simulation platforms and applications considered for various recommender systems

This section analyses different methods like deep learning, machine learning, clustering and meta-heuristic-based-approaches used in the development of recommender systems. The algorithmic categorization of different recommender systems is given in Fig.  8 .

figure 8

Algorithmic categorization of different recommender systems

Categorization is done based on content-based, collaborative filtering-based, and optimization-based approaches. In [ 8 ], a content-based filtering technique was employed for increasing the ability to trust other agents and for improving the exchange of information by trust degree. In [ 16 ], it was applied to enhance the quality of recommendations by taking into account the attributes of the material. It achieved better performance in terms of F1-score, recall and precision. In [ 18 ], this technique was able to capture the implicit user feedback, increasing the overall accuracy of the proposed model. The content-based filtering in [ 30 ] was able to increase the accuracy and performance of a stock recommender system by using the “trust factor” for making decisions.

Different collaborative filtering approaches are utilized in recent studies, which are categorized as follows:

Model-based techniques

Neuro-Fuzzy [ 1 ] based technique helps in discovering the association between user categories and item relevance. It is also simple to understand. K-Means Clustering [ 2 , 19 , 25 , 48 ] is efficient for large scale datasets. It is simple to implement and gives a fast convergence rate. It also offers automatic recovery from failures. The decision tree [ 2 , 44 ] technique is easy to interpret. It can be used for solving the classic regression and classification problems in recommender systems. Bayesian Network [ 3 ] is a probabilistic technique used to solve classification challenges. It is based on the theory of Bayes theorem and conditional probability. Association Rule Mining (ARM) techniques [ 4 , 17 , 35 ] extract rules for projecting the occurrence of an item by considering the existence of other items in a transaction. This method uses the association rules to create a more suitable representation of data and helps in increasing the model performance and storage efficiency. Fuzzy Logic [ 5 , 7 , 15 , 20 , 28 , 43 ] techniques use a set of flexible rules. It focuses on solving complex real-time problems having an inaccurate spectrum of data. This technique provides scalability and helps in increasing the overall model performance for recommender systems. The semantic similarity [ 6 ] technique is used for describing a topological similarity to define the distance among the concepts and terms through ontologies. It measures the similarity information for increasing the efficiency of recommender systems. Rough set [ 9 , 34 ] techniques use probability distributions for solving the challenges of existing recommender models. Semantic web rule language [ 10 ] can efficiently extract the dataset features and increase the model efficiency. Linear programming-based approaches [ 13 , 42 ] are employed for achieving quality decision making in recommender models. Sequential pattern analysis [ 14 ] is applied to find suitable patterns among data items. 
This helps in increasing model efficiency. The probabilistic model [ 24 ] is a well-known tool to handle uncertainty in risk computations and performance assessment. It offers better decision-making capabilities. The K-nearest neighbours (KNN) [ 19 , 37 , 44 ] technique provides faster computation time, simplicity and ease of interpretation. It is good for classification and regression-based problems and offers more accuracy. Spectral clustering [ 21 ] is also called graph clustering or similarity-based clustering, which mainly focuses on reducing the space dimensionality in identifying the dataset items. The stochastic learning algorithm [ 26 ] solves the real-time challenges of recommender systems. Linear SVM [ 29 , 44 ] efficiently solves the high dimensional problems related to recommender systems. It is a memory-efficient method and works well with a large number of samples having relative separation among the classes. This method has been shown to perform well even when new or unfamiliar data is added. The Relational Functional Gradient Boosting [ 39 ] technique efficiently works on the relational dependency of data, which is useful for statistical relational learning for collaborative-based recommender systems. Ensemble learning [ 40 ] combines the forecasts of two or more models and aims to achieve better performance than any of the single contributing models. It also helps in reducing overfitting problems, which are common in recommender systems.
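To make the association rule idea mentioned above concrete, the following sketch computes the support and confidence of a single rule over a handful of transactions. The transactions and item names are invented for illustration, and a full ARM algorithm such as Apriori would additionally enumerate candidate itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical learner sessions in an e-learning system.
transactions = [
    ["e-book", "course"],
    ["e-book", "course", "quiz"],
    ["e-book"],
    ["quiz"],
]

# Rule: learners who open an e-book also enrol in the course.
print(support(transactions, ["e-book", "course"]))
print(confidence(transactions, ["e-book"], ["course"]))
```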

SDAE [41] learns non-linear transformations with different filters to find suitable data, which helps increase the performance of recommender models. Multimodal network learning [45] is efficient for multi-modal data, producing a combined representation of diverse modalities. Random forest [46, 51] is a commonly used classifier that has been shown to increase accuracy when handling big data. It is a collection of decision trees, each trained on a different data sample, which reduces variance. ANNInit [47] is an artificial neural network-based technique capable of self-learning and of generating efficient results. It is independent of the data type and can learn data patterns automatically. HashMap [50] gives faster access to elements through hashing, which decreases data processing time and increases system performance. The CNN technique [51] can automatically extract the significant features of a dataset without supervision. It is computationally efficient, provides accurate recommendations, and is simple and fast to implement. The multilayer graph data model [54] is efficient for real-time applications: it minimizes access time by mapping correlations as edges among nodes and provides superior performance. Singular Value Decomposition [56] can simplify the input data and increase the efficiency of recommendations by eliminating noise present in the data. Reinforcement learning [57] is efficient in practical recommender scenarios with large data sizes and can boost model accuracy even on large-scale datasets. FNN [58] is an artificial neural network technique that can learn non-linear and complex relationships between items. It has demonstrated good performance gains when employed in different recommender systems. Knowledge representation learning [60] systems aim to simplify model development by increasing acquisition efficiency, inferential efficiency, inferential adequacy and representation adequacy. User-based approaches [2, 55, 59] specialize in detecting user-related metadata, which is used to increase overall model performance. They are well suited to real-time applications, where they can capture user feedback and use it to improve the user experience.
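As a rough illustration of the KNN idea described above, the following sketch predicts a missing rating from the ratings of the most similar users. It is a minimal example under stated assumptions, not the method of any cited paper: the rating matrix layout, the use of cosine similarity, the convention that 0 means "not rated", and the function name `predict_rating` are all illustrative choices.

```python
import numpy as np


def predict_rating(ratings, user, item, k=2):
    """Predict ratings[user, item] from the k most similar users who rated the item.

    ratings: 2-D float array, rows = users, columns = items, 0 = not rated (assumption).
    """
    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(ratings, axis=1)
    sims = ratings @ ratings[user] / (norms * norms[user] + 1e-12)
    sims[user] = -np.inf  # never use the target user as their own neighbour

    # Walk users from most to least similar, keeping those who rated the item.
    ranked = np.argsort(sims)[::-1]
    neighbours = [u for u in ranked if np.isfinite(sims[u]) and ratings[u, item] > 0][:k]
    if not neighbours:
        return 0.0  # no evidence available for this item

    # Similarity-weighted average of the neighbours' ratings.
    weights = sims[neighbours]
    return float(weights @ ratings[neighbours, item] / (weights.sum() + 1e-12))
```

With `k=1` the prediction is simply the rating given by the single most similar user; larger `k` trades some of that specificity for robustness to noisy neighbours.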

Optimization-based techniques

The Foraging Bees technique [11] enables both functional and combinatorial optimization for random searching in recommender models. Artificial bee colony [12] is a swarm-based meta-heuristic technique that offers a fast convergence rate, the ability to handle stochastic objectives, ease of combination with other algorithms, few control parameters, strong robustness, high flexibility and simplicity. Particle Swarm Optimization [23] is a computational optimization technique that offers good computational efficiency and robustness to control parameters, and is simple to implement in recommender systems. The portfolio optimization algorithm [27] is a subclass of optimization algorithms that finds application in stock investment recommender systems. It works well in real time and helps diversify the portfolio for maximum profit. The artificial immune system [31] is a computationally intelligent machine learning technique that can learn new patterns in the data and optimize the overall system parameters. Expectation maximization (EM) [32, 36, 38] is an iterative algorithm for finding maximum-likelihood parameter estimates when some variables are unobserved. The Delphi panel and repertory grid [33] offer efficient decision making by addressing the dimensionality and data sparsity issues of recommender systems. The Firefly algorithm (FA) [48] provides fast results and increases recommendation efficiency. It can reduce the number of iterations required to solve specific recommender problems, and it provides both local and global sets of solutions. Beetle Antennae Search (BAS) [49] offers superior search accuracy with low time complexity, which improves recommendation performance. The many-objective evolutionary algorithm (MaOEA) [52] is applicable to real-time, multi-objective, search-related recommender systems. The introduction of a local search operator increases its convergence rate and yields suitable results. Genetic Algorithm (GA) [2, 22, 25, 53] based techniques are used to solve the multi-objective optimization problems of recommender systems. They employ probabilistic transition rules and are simple to operate, which provides better recommender performance.
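To make the swarm-based idea above more concrete, here is a minimal sketch of Particle Swarm Optimization minimizing a generic objective. It is an assumption-laden illustration rather than the procedure of any reviewed paper: the function name `pso`, the inertia and acceleration values (`w`, `c1`, `c2`), the search bound, and the fixed random seed are all hypothetical choices. In a recommender setting, the objective could stand in for the validation error of a model with a few tunable weights.

```python
import random


def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5,
        bound=5.0, seed=0):
    """Minimize `objective` over `dim` real parameters with a basic particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]

    # Global best is the best personal best so far.
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val
```

The few control parameters visible in the signature reflect why such methods are often described as easy to implement; in practice they would still need tuning for a given recommender objective.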

Features and challenges

The features and challenges of the existing recommender models are given in Table 6 .

Simulation platforms

The various simulation platforms used for developing different recommender systems with different applications are given in Fig.  9 .

Fig. 9 Simulation platforms used for developing different recommender systems

Here, the Java platform is used in 20% of the contributions, MATLAB in 7%, cross-validation with different numbers of folds in 8%, Python in 7% and R in 3%, while TensorFlow, Weka and Android environments each account for 1%. Other simulation platforms such as Facebook, web user interfaces (UI) and real-time environments are used in 50% of the contributions. Table 7 describes some simulation platforms commonly used for developing recommender systems.

Application focus and dataset description

This section analyses the application domains targeted by a set of recent recommender systems and gives details of their datasets.

An analysis of recent recommender systems found that 11% of the contributions focus on healthcare, 10% on movie recommender systems, 5% on music, 6% on e-learning, 8% on online product recommendation, 3% on book recommendation, and 1% on job and knowledge management recommender systems. A further 5% concentrate on social network recommender systems, 10% on tourist and hotel recommender systems, 6% on stock recommender systems, and 3% on video recommender systems. The remaining 12% are miscellaneous recommender systems such as Twitter and venue-based recommender systems. Similarly, different datasets are gathered for recommender systems depending on the application type. A detailed description is provided in Table 8.

Performance analysis of state-of-the-art recommender systems

The performance evaluation metrics used to analyse the different recommender systems are listed in Table 9. Among the reviewed works, 41% consider precision, 35% use recall, 31% apply accuracy, 30% analyse the F1-measure, 16% employ Mean Absolute Error (MAE), 11% use Root Mean Square Error (RMSE) and 6% employ a coverage measure to validate recommender performance. A few applications also consider additional measures.
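For readers less familiar with these measures, the sketch below shows how the most frequently used ones are commonly computed. It uses the standard textbook definitions; the function names and the choice to treat recommendations as unordered sets are assumptions made for illustration, and individual papers may use variants (for example, precision at a cut-off k).

```python
import math


def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall and F1 for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items that are truly relevant
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def mae(actual, predicted):
    """Mean Absolute Error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error; penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

Precision, recall and F1 judge the quality of a recommended list, whereas MAE and RMSE judge the accuracy of predicted ratings, which is one reason results reported with different measures are hard to compare directly.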

Research gaps and challenges

Over the past decade, recommender systems have performed well in addressing information overload and have become a suitable tool in multiple areas such as psychology, mathematics and computer science [80]. However, current recommender systems face a variety of challenges, which are listed here and discussed below:

Deployment challenges such as cold start, scalability, sparsity, etc. are already discussed in Sect. 3.

Challenges faced when employing different recommender algorithms for different applications.

Challenges in collecting implicit user data.

Challenges in handling real-time user feedback.

Challenges faced in choosing the correct implementation techniques.

Challenges faced in measuring system performance.

Challenges in implementing recommender systems for diverse applications.

Numerous recommender algorithms have been proposed on novel emerging dimensions which focus on addressing the existing limitations of recommender systems. A good recommender system must increase the recommendation quality based on user preferences. However, a specific recommender algorithm is not always guaranteed to perform equally for different applications. This encourages the possibility of employing different recommender algorithms for different applications, which brings along a lot of challenges. There is a need for more research to alleviate these challenges. Also, there is a large scope of research in recommender applications that incorporate information from different interactive online sites like Facebook, Twitter, shopping sites, etc. Some other areas for emerging research may be in the fields of knowledge-based recommender systems, methods for seamlessly processing implicit user data and handling real-time user feedback to recommend items in a dynamic environment.

Other research areas, such as deep learning-based recommender systems, demographic filtering, group recommenders, cross-domain techniques for recommender systems, and dimensionality reduction techniques, also require further study [83]. Deep learning-based recommender systems have recently gained much popularity. Future research in this field can integrate well-performing deep learning models with new variants of hybrid meta-heuristic approaches.

During this review, it was observed that even though recent recommender systems have demonstrated good performance, there is no single standardized criterion or method for evaluating the performance of all recommender systems. System performance is generally measured with different evaluation metrics, which makes systems difficult to compare. The use of recommender systems in real-time applications is growing, and user satisfaction and personalization play a very important role in their success. New evaluation criteria are needed that can measure the level of user satisfaction in real time. New research should focus on capturing real-time user feedback and using it to adapt the recommendation process, which will help increase the quality of recommendations.

Conclusion and future scope

Recommender systems have attracted the attention of researchers and academicians. In this paper, we have identified and carefully reviewed research papers on recommender systems focusing on diverse applications, published between 2011 and 2021. The review gathers details such as application fields, techniques, simulation tools, performance metrics, datasets, system features, and challenges of different recommender systems. Further, research gaps and challenges were put forward to explore future research perspectives on recommender systems. Overall, this paper provides a comprehensive understanding of trends in recommender systems research and gives researchers insight and future direction. The results of this study have several practical and significant implications:

Based on recent publication rates, we expect research on recommender systems to grow significantly in the future.

A large number of research papers were identified in movie recommendations, whereas health, tourism and education-related recommender systems were identified in very few numbers. This is due to the availability of movie datasets in the public domain. Therefore, it is necessary to develop datasets in other fields also.

There is no standard measure of recommender system performance. Among 60 papers, 21 used recall, 10 used MAE, 25 used precision, 18 used F1-measure, 19 used accuracy and only 7 used RMSE to calculate system performance. Very few systems were found to excel in two or more metrics.

Java and Python (with a combined contribution of 27%) are the most common programming languages used to develop recommender systems. This is due to the availability of a large number of standard Java and Python libraries which aid in the development process.

Recently, a large number of hybrid and optimization techniques have been proposed for recommender systems. The performance of a recommender system can be greatly improved by applying optimization techniques.

There is considerable scope for research on neural network and deep learning-based methods for developing recommender systems. Systems developed using these methods are found to achieve high accuracy.
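The metric counts given in the list above can be converted into shares of the 60 reviewed papers with simple arithmetic. The short sketch below does this; the dictionary and function names are illustrative only. Truncating each share to a whole number reproduces the percentages quoted in the performance analysis section (35, 16, 41, 30, 31 and 11), and the shares add up to more than 100% because a single paper may report several metrics.

```python
# Number of papers (out of 60) that used each metric, as stated in the list above.
METRIC_COUNTS = {
    "recall": 21,
    "MAE": 10,
    "precision": 25,
    "F1-measure": 18,
    "accuracy": 19,
    "RMSE": 7,
}
TOTAL_PAPERS = 60


def shares(counts, total):
    """Return each count as a percentage of `total`."""
    return {name: 100 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in sorted(shares(METRIC_COUNTS, TOTAL_PAPERS).items(),
                            key=lambda kv: -kv[1]):
        print(f"{name:<11} {pct:5.1f}%")
```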

This research will provide a guideline for future research in the domain of recommender systems. However, this research has some limitations. Firstly, due to the limited amount of manpower and time, we have only reviewed papers published in journals focusing on computer science, management and medicine. Secondly, we have reviewed only English papers. New research may extend this study to cover other journals and non-English papers. Finally, this review was conducted based on a search on only six descriptors: “Recommender systems”, “Recommendation systems”, “Movie Recommend*”, “Music Recommend*”, “Personalized Recommend*” and “Hybrid Recommend*”. Research papers that did not include these keywords were not considered. Future research can include adding some additional descriptors and keywords for searching. This will allow extending the research to cover more diverse articles on recommender systems.

Availability of data and materials

Not applicable.

References

Castellano G, Fanelli AM, Torsello MA. NEWER: A system for neuro-fuzzy web recommendation. Appl Soft Comput. 2011;11:793–806.

Crespo RG, Martínez OS, Lovelle JMC, García-Bustelo BCP, Gayo JEL, Pablos PO. Recommendation system based on user interaction data applied to intelligent electronic books. Computers Hum Behavior. 2011;27:1445–9.

Lin FC, Yu HW, Hsu CH, Weng TC. Recommendation system for localized products in vending machines. Expert Syst Appl. 2011;38:9129–38.

Wang SL, Wu CY. Application of context-aware and personalized recommendation to implement an adaptive ubiquitous learning system. Expert Syst Appl. 2011;38:10831–8.

García-Crespo Á, López-Cuadrado JL, Colomo-Palacios R, González-Carrasco I, Ruiz-Mezcua B. Sem-Fit: A semantic based expert system to provide recommendations in the tourism domain. Expert Syst Appl. 2011;38:13310–9.

Dong H, Hussain FK, Chang E. A service concept recommendation system for enhancing the dependability of semantic service matchmakers in the service ecosystem environment. J Netw Comput Appl. 2011;34:619–31.

Li M, Liu L, Li CB. An approach to expert recommendation based on fuzzy linguistic method and fuzzy text classification in knowledge management systems. Expert Syst Appl. 2011;38:8586–96.

Lorenzi F, Bazzan ALC, Abel M, Ricci F. Improving recommendations through an assumption-based multiagent approach: An application in the tourism domain. Expert Syst Appl. 2011;38:14703–14.

Huang Z, Lu X, Duan H. Context-aware recommendation using rough set model and collaborative filtering. Artif Intell Rev. 2011;35:85–99.

Chen RC, Huang YH, Bau CT, Chen SM. A recommendation system based on domain ontology and SWRL for anti-diabetic drugs selection. Expert Syst Appl. 2012;39:3995–4006.

Mohanraj V, Chandrasekaran M, Senthilkumar J, Arumugam S, Suresh Y. Ontology driven bee’s foraging approach based self-adaptive online recommendation system. J Syst Softw. 2012;85:2439–50.

Hsu CC, Chen HC, Huang KK, Huang YM. A personalized auxiliary material recommendation system based on learning style on facebook applying an artificial bee colony algorithm. Comput Math Appl. 2012;64:1506–13.

Gemmell J, Schimoler T, Mobasher B, Burke R. Resource recommendation in social annotation systems: A linear-weighted hybrid approach. J Comput Syst Sci. 2012;78:1160–74.

Choi K, Yoo D, Kim G, Suh Y. A hybrid online-product recommendation system: Combining implicit rating-based collaborative filtering and sequential pattern analysis. Electron Commer Res Appl. 2012;11:309–17.

Garibaldi JM, Zhou SM, Wang XY, John RI, Ellis IO. Incorporation of expert variability into breast cancer treatment recommendation in designing clinical protocol guided fuzzy rule system models. J Biomed Inform. 2012;45:447–59.

Salehi M, Kmalabadi IN. A hybrid attribute–based recommender system for e–learning material recommendation. IERI Procedia. 2012;2:565–70.

Aher SB, Lobo LMRJ. Combination of machine learning algorithms for recommendation of courses in e-learning System based on historical data. Knowl-Based Syst. 2013;51:1–14.

Kardan AA, Ebrahimi M. A novel approach to hybrid recommendation systems based on association rules mining for content recommendation in asynchronous discussion groups. Inf Sci. 2013;219:93–110.

Chang JH, Lai CF, Wang MS, Wu TY. A cloud-based intelligent TV program recommendation system. Comput Electr Eng. 2013;39:2379–99.

Lucas JP, Luz N, Moreno MN, Anacleto R, Figueiredo AA, Martins C. A hybrid recommendation approach for a tourism system. Expert Syst Appl. 2013;40:3532–50.

Niu J, Zhu L, Zhao X, Li H. Affivir: An affect-based Internet video recommendation system. Neurocomputing. 2013;120:422–33.

Liu L, Xu J, Liao SS, Chen H. A real-time personalized route recommendation system for self-drive tourists based on vehicle to vehicle communication. Expert Syst Appl. 2014;41:3409–17.

Bakshi S, Jagadev AK, Dehuri S, Wang GN. Enhancing scalability and accuracy of recommendation systems using unsupervised learning and particle swarm optimization. Appl Soft Comput. 2014;15:21–9.

Kim Y, Shim K. TWILITE: A recommendation system for twitter using a probabilistic model based on latent Dirichlet allocation. Inf Syst. 2014;42:59–77.

Wang Z, Yu X, Feng N, Wang Z. An improved collaborative movie recommendation system using computational intelligence. J Vis Lang Comput. 2014;25:667–75.

Kolomvatsos K, Anagnostopoulos C, Hadjiefthymiades S. An efficient recommendation system based on the optimal stopping theory. Expert Syst Appl. 2014;41:6796–806.

Gottschlich J, Hinz O. A decision support system for stock investment recommendations using collective wisdom. Decis Support Syst. 2014;59:52–62.

Torshizi AD, Zarandi MHF, Torshizi GD, Eghbali K. A hybrid fuzzy-ontology based intelligent system to determine level of severity and treatment recommendation for benign prostatic hyperplasia. Comput Methods Programs Biomed. 2014;113:301–13.

Zahálka J, Rudinac S, Worring M. Interactive multimodal learning for venue recommendation. IEEE Trans Multimedia. 2015;17:2235–44.

Sankar CP, Vidyaraj R, Kumar KS. Trust based stock recommendation system – a social network analysis approach. Procedia Computer Sci. 2015;46:299–305.

Chen MH, Teng CH, Chang PC. Applying artificial immune systems to collaborative filtering for movie recommendation. Adv Eng Inform. 2015;29:830–9.

Wu H, Pei Y, Li B, Kang Z, Liu X, Li H. Item recommendation in collaborative tagging systems via heuristic data fusion. Knowl-Based Syst. 2015;75:124–40.

Yeh DY, Cheng CH. Recommendation system for popular tourist attractions in Taiwan using delphi panel and repertory grid techniques. Tour Manage. 2015;46:164–76.

Liao SH, Chang HK. A rough set-based association rule approach for a recommendation system for online consumers. Inf Process Manage. 2016;52:1142–60.

Li H, Cui J, Shen B, Ma J. An intelligent movie recommendation system through group-level sentiment analysis in microblogs. Neurocomputing. 2016;210:164–73.

Wu H, Yue K, Pei Y, Li B, Zhao Y, Dong F. Collaborative topic regression with social trust ensemble for recommendation in social media systems. Knowl-Based Syst. 2016;97:111–22.

Adeniyi DA, Wei Z, Yongquan Y. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method. Appl Computing Inform. 2016;12:90–108.

Rawat YS, Kankanhalli MS. ClickSmart: A context-aware viewpoint recommendation system for mobile photography. IEEE Trans Circuits Syst Video Technol. 2017;27:149–58.

Yang S, Korayem M, Aljadda K, Grainger T, Natarajan S. Combining content-based and collaborative filtering for job recommendation system: A cost-sensitive Statistical Relational Learning approach. Knowl-Based Syst. 2017;136:37–45.

Lee WP, Chen CT, Huang JY, Liang JY. A smartphone-based activity-aware system for music streaming recommendation. Knowl-Based Syst. 2017;131:70–82.

Wei J, He J, Chen K, Zhou Y, Tang Z. Collaborative filtering and deep learning based recommendation system for cold start items. Expert Syst Appl. 2017;69:29–39.

Li C, Wang Z, Cao S, He L. WLRRS: A new recommendation system based on weighted linear regression models. Comput Electr Eng. 2018;66:40–7.

Mezei J, Nikou S. Fuzzy optimization to improve mobile health and wellness recommendation systems. Knowl-Based Syst. 2018;142:108–16.

Ayata D, Yaslan Y, Kamasak ME. Emotion based music recommendation system using wearable physiological sensors. IEEE Trans Consum Electron. 2018;64:196–203.

Zhao Z, Yang Q, Lu H, Weninger T. Social-aware movie recommendation via multimodal network learning. IEEE Trans Multimedia. 2018;20:430–40.

Hammou BA, Lahcen AA, Mouline S. An effective distributed predictive model with matrix factorization and random forest for big data recommendation systems. Expert Syst Appl. 2019;137:253–65.

Zhao J, Geng X, Zhou J, Sun Q, Xiao Y, Zhang Z, Fu Z. Attribute mapping and autoencoder neural network based matrix factorization initialization for recommendation systems. Knowl-Based Syst. 2019;166:132–9.

Bhaskaran S, Santhi B. An efficient personalized trust based hybrid recommendation (TBHR) strategy for e-learning system in cloud computing. Clust Comput. 2019;22:1137–49.

Han Y, Han Z, Wu J, Yu Y, Gao S, Hua D, Yang A. Artificial intelligence recommendation system of cancer rehabilitation scheme based on IoT technology. IEEE Access. 2020;8:44924–35.

Kang S, Jeong C, Chung K. Tree-based real-time advertisement recommendation system in online broadcasting. IEEE Access. 2020;8:192693–702.

Ullah F, Zhang B, Khan RU. Image-based service recommendation system: A JPEG-coefficient RFs approach. IEEE Access. 2020;8:3308–18.

Cai X, Hu Z, Zhao P, Zhang W, Chen J. A hybrid recommendation system with many-objective evolutionary algorithm. Expert Syst Appl. 2020. https://doi.org/10.1016/j.eswa.2020.113648 .

Esteban A, Zafra A, Romero C. Helping university students to choose elective courses by using a hybrid multi-criteria recommendation system with genetic optimization. Knowledge-Based Syst. 2020;194:105385.

Mondal S, Basu A, Mukherjee N. Building a trust-based doctor recommendation system on top of multilayer graph database. J Biomed Inform. 2020;110:103549.

Dhelim S, Ning H, Aung N, Huang R, Ma J. Personality-aware product recommendation system based on user interests mining and metapath discovery. IEEE Trans Comput Soc Syst. 2021;8:86–98.

Bhalse N, Thakur R. Algorithm for movie recommendation system using collaborative filtering. Materials Today: Proceedings. 2021. https://doi.org/10.1016/j.matpr.2021.01.235 .

Ke G, Du HL, Chen YC. Cross-platform dynamic goods recommendation system based on reinforcement learning and social networks. Appl Soft Computing. 2021;104:107213.

Chen X, Liu D, Xiong Z, Zha ZJ. Learning and fusing multiple user interest representations for micro-video and movie recommendations. IEEE Trans Multimedia. 2021;23:484–96.

Afolabi AO, Toivanen P. Integration of recommendation systems into connected health for effective management of chronic diseases. IEEE Access. 2019;7:49201–11.

He M, Wang B, Du X. HI2Rec: Exploring knowledge in heterogeneous information for movie recommendation. IEEE Access. 2019;7:30276–84.

Bobadilla J, Serradilla F, Hernando A. Collaborative filtering adapted to recommender systems of e-learning. Knowl-Based Syst. 2009;22:261–5.

Russell S, Yoon V. Applications of wavelet data reduction in a recommender system. Expert Syst Appl. 2008;34:2316–25.

Campos LM, Fernández-Luna JM, Huete JF. A collaborative recommender system based on probabilistic inference from fuzzy observations. Fuzzy Sets Syst. 2008;159:1554–76.

Funk M, Rozinat A, Karapanos E, Medeiros AKA, Koca A. In situ evaluation of recommender systems: Framework and instrumentation. Int J Hum Comput Stud. 2010;68:525–47.

Porcel C, Moreno JM, Herrera-Viedma E. A multi-disciplinar recommender system to advice research resources in University Digital Libraries. Expert Syst Appl. 2009;36:12520–8.

Bobadilla J, Serradilla F, Bernal J. A new collaborative filtering metric that improves the behavior of recommender systems. Knowl-Based Syst. 2010;23:520–8.

Ochi P, Rao S, Takayama L, Nass C. Predictors of user perceptions of web recommender systems: How the basis for generating experience and search product recommendations affects user responses. Int J Hum Comput Stud. 2010;68:472–82.

Olmo FH, Gaudioso E. Evaluation of recommender systems: A new approach. Expert Syst Appl. 2008;35:790–804.

Zhen L, Huang GQ, Jiang Z. An inner-enterprise knowledge recommender system. Expert Syst Appl. 2010;37:1703–12.

Göksedef M, Gündüz-Öğüdücü S. Combination of web page recommender systems. Expert Syst Appl. 2010;37(4):2911–22.

Shao B, Wang D, Li T, Ogihara M. Music recommendation based on acoustic features and user access patterns. IEEE Trans Audio Speech Lang Process. 2009;17:1602–11.

Shin C, Woo W. Socially aware tv program recommender for multiple viewers. IEEE Trans Consum Electron. 2009;55:927–32.

Lopez-Carmona MA, Marsa-Maestre I, Perez JRV, Alcazar BA. Anegsys: An automated negotiation based recommender system for local e-marketplaces. IEEE Lat Am Trans. 2007;5:409–16.

Yap G, Tan A, Pang H. Discovering and exploiting causal dependencies for robust mobile context-aware recommenders. IEEE Trans Knowl Data Eng. 2007;19:977–92.

Meo PD, Quattrone G, Terracina G, Ursino D. An XML-based multiagent system for supporting online recruitment services. IEEE Trans Syst Man Cybern. 2007;37:464–80.

Khusro S, Ali Z, Ullah I. Recommender systems: Issues, challenges, and research opportunities. Inform Sci Appl. 2016. https://doi.org/10.1007/978-981-10-0557-2_112 .

Blanco-Fernandez Y, Pazos-Arias JJ, Gil-Solla A, Ramos-Cabrer M, Lopez-Nores M. Providing entertainment by content-based filtering and semantic reasoning in intelligent recommender systems. IEEE Trans Consum Electron. 2008;54:727–35.

Isinkaye FO, Folajimi YO, Ojokoh BA. Recommendation systems: Principles, methods and evaluation. Egyptian Inform J. 2015;16:261–73.

Yoshii K, Goto M, Komatani K, Ogata T, Okuno HG. An efficient hybrid music recommender system using an incrementally trainable probabilistic generative model. IEEE Trans Audio Speech Lang Process. 2008;16:435–47.

Wei YZ, Moreau L, Jennings NR. Learning users’ interests by quality classification in market-based recommender systems. IEEE Trans Knowl Data Eng. 2005;17:1678–88.

Bjelica M. Towards TV recommender system: experiments with user modeling. IEEE Trans Consum Electron. 2010;56:1763–9.

Setten MV, Veenstra M, Nijholt A, Dijk BV. Goal-based structuring in recommender systems. Interact Comput. 2006;18:432–56.

Adomavicius G, Tuzhilin A. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng. 2005;17:734–49.

Symeonidis P, Nanopoulos A, Manolopoulos Y. Providing justifications in recommender systems. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans. 2009;38:1262–72.

Zhan J, Hsieh C, Wang I, Hsu T, Liau C, Wang D. Privacy preserving collaborative recommender systems. IEEE Trans Syst Man Cybernet. 2010;40:472–6.

Burke R. Hybrid recommender systems: survey and experiments. User Model User-Adap Inter. 2002;12:331–70.

Gunes I, Kaleli C, Bilge A, Polat H. Shilling attacks against recommender systems: a comprehensive survey. Artif Intell Rev. 2012;42:767–99.

Park DH, Kim HK, Choi IY, Kim JK. A literature review and classification of recommender systems research. Expert Syst Appl. 2012;39:10059–72.

Acknowledgements

We thank our colleagues from Assam Down Town University who provided insight and expertise that greatly assisted this research, although they may not agree with all the interpretations and conclusions of this paper.

No funding was received to assist with the preparation of this manuscript.

Author information

Authors and affiliations.

Department of Computer Science & Engineering, Assam Down Town University, Panikhaiti, Guwahati, 781026, Assam, India

Deepjyoti Roy & Mala Dutta

Contributions

DR carried out the review study and analysis of the existing algorithms in the literature. MD has been involved in drafting the manuscript or revising it critically for important intellectual content. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Deepjyoti Roy .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Roy, D., Dutta, M. A systematic review and research perspective on recommender systems. J Big Data 9 , 59 (2022). https://doi.org/10.1186/s40537-022-00592-5

Received : 04 October 2021

Accepted : 28 March 2022

Published : 03 May 2022

DOI : https://doi.org/10.1186/s40537-022-00592-5

  • Recommender system
  • Machine learning
  • Content-based filtering
  • Collaborative filtering
  • Deep learning

Scholar Commons

Engineering Ph.D. Theses

Deep learning for recommender systems.

Travis Akira Ebesu

Date of Award

Document type.

Dissertation

Santa Clara : Santa Clara University, 2019

Degree Name

Doctor of Philosophy (PhD)

Computer Engineering

First Advisor

The widespread adoption of the Internet has led to an explosion in the number of choices available to consumers. Users begin to expect personalized content in modern E-commerce, entertainment and social media platforms. Recommender Systems (RS) provide a critical solution to this problem by maintaining user engagement and satisfaction with personalized content.

Traditional RS techniques are often linear, limiting the expressivity required to model complex user-item interactions, and require extensive handcrafted features from domain experts. Deep learning has demonstrated significant breakthroughs in solving problems that have eluded the artificial intelligence community for many years, advancing state-of-the-art results in domains such as computer vision and natural language processing.

The recommender domain consists of heterogeneous and semantically rich data such as unstructured text (e.g. product descriptions), categorical attributes (e.g. genre of a movie), and user-item feedback (e.g. purchases). Deep learning can automatically capture the intricate structure of user preferences by encoding learned feature representations from high dimensional data.

In this thesis, we explore five novel applications of deep learning-based techniques to address top-n recommendation. First, we propose Collaborative Memory Network, which unifies the strengths of the latent factor model and neighborhood-based methods inspired by Memory Networks to address collaborative filtering with implicit feedback. Second, we propose Neural Semantic Personalized Ranking, a novel probabilistic generative modeling approach to integrate deep neural network with pairwise ranking for the item cold-start problem. Third, we propose Attentive Contextual Denoising Autoencoder augmented with a context-driven attention mechanism to integrate arbitrary user and item attributes. Fourth, we propose a flexible encoder-decoder architecture called Neural Citation Network, embodying a powerful max time delay neural network encoder augmented with an attention mechanism and author networks to address context-aware citation recommendation. Finally, we propose a generic framework to perform conversational movie recommendations which leverages transfer learning to infer user preferences from natural language. Comprehensive experiments validate the effectiveness of all five proposed models against competitive baseline methods and demonstrate the successful adaptation of deep learning-based techniques to the recommendation domain.

Recommended Citation

Ebesu, Travis Akira, "Deep Learning for Recommender Systems" (2019). Engineering Ph.D. Theses. 22. https://scholarcommons.scu.edu/eng_phd_theses/22

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons


List of recommender system dissertations

Recommender systems related doctoral dissertations by year. For other types of theses see:

  • List of recommender systems master's theses
  • List of recommender systems bachelor's theses

Dissertations within a year are sorted alphabetically by title.

  • Adversarial Machine Learning in Recommender Systems - Felice Antonio Merra
  • Offline Approaches to Recommendation with Online Success - Olivier Jeunen
  • Music Recommender Systems: Taking Into Account The Artists’ Perspective - Andres Ferraro
  • Ratings in Recommender Systems: Decision Biases and Explainability - Ludovik Coba
  • Supporting energy-efficient choices using Rasch-based recommender interfaces - Alain Starke
  • From behavior-centered to user-centered : incorporating psychological knowledge and user feedback in personalization - Mark Graus
  • Algorithms for Sequence-aware Recommender Systems - Massimo Quadrana
  • Improving the User Experience of Music Recommender Systems Through Personality and Cultural Information - Bruce Ferwerda
  • A Weighted User Similarity Model for Cold Start Recommendations - Parivash Pirasteh
  • Collaborative Filtering Approaches for Single-domains and Cross-domain Recommender Systems - Rohit Parimi
  • Novelty and Diversity Evaluation and Enhancement in Recommender Systems - Saúl Vargas
  • Towards Effective Research-Paper Recommender Systems and User Modeling based on Mind Maps - Joeran Beel
  • A Model-Based Music Recommendation System for Individual Users and Implicit User Groups - Yajie Hu
  • Advancing Recommender Systems from the Algorithm, Interface and Methodological Perspective - Marco Rossetti
  • Aggregating Information from the Crowd: ratings, recommendations and predictions - Florent Garcin
  • Collaborative Filtering Based Social Recommender Systems - Xiwang Yang
  • Cross-domain Recommendations based on semantically-enhanced User Web Behavior - Julia Hoxha
  • Cryptographically-Enhanced Privacy for Recommender Systems - Arjan Jeckmans
  • Database management system support for collaborative filtering recommender systems - Mohamed Sarwat
  • Empirical Evaluation of Active Learning Strategies in Collaborative Filtering - Mehdi Elahi
  • Dynamic Generation of Personalized Hybrid Recommender Systems - Simon Dooms
  • Enhancing Discovery in Geoportals: Geo-Enrichment, Semantic Enhancement and Recommendation Strategies for Geo-Information Discovery - Bernhard Vockner
  • Exploiting Distributional Semantics for Content-Based and Context-Aware Recommendation - Victor Codina
  • Exploiting Implicit User Activity for Media Recommendation - Michele Trevisiol
  • Information Aggregation in Quantized Consensus, Recommender Systems, and Ranking - Shang Shang
  • More Usable Recommendation Systems for Improving Software Quality - Yoonki Song
  • Next Generation of Recommender Systems: Algorithms and Applications - Lei Li
  • Privacy-enabled scalable recommender systems - Andrés Moreno
  • SmartParticipation: A Fuzzy-Based Recommender System for Political Community-Building - Luis Fernando Terán Tamayo
  • Towards Recommender Engineering: Tools and Experiments for Identifying Recommender Differences - Michael Ekstrand
  • User Factors in Recommender Systems: Case Studies in e-Commerce, News Recommending, and e-Learning - Juha Leino
  • A conceptual model and a software framework for developing context aware hybrid recommender systems - Tim Hussein
  • Effective tag recommendation system based on topic ontology - V Subramaniyaswamy
  • Estrategias de recomendación basadas en conocimiento para la localización personalizada de recursos en repositorios educativos (Spanish) - Almudena Ruiz-Iniesta
  • Evaluating the Accuracy and Utility of Recommender Systems - Alan Said
  • Evaluation in Audio Music Similarity - Julián Urbano
  • Improved online services by personalized recommendations and optimal quality of experience parameters - Toon De Pessemier
  • Integrating Content and Semantic Representations for Music Recommendation - Ben Horsburgh
  • Latent feature models for dyadic prediction - Aditya Krishna Menon
  • Living Analytics Methods for the Social Web - Ernesto Diaz-Aviles
  • Ranking and Context-awareness in Recommender Systems - Yue Shi
  • Recommender Systems and Time Context: Characterization of a Robust Evaluation Protocol to Increase Reliability of Measured Improvements - Pedro Campos
  • Session Aware Recommender System in E-Commerce - Jian Wang
  • Social contextuality and conversational recommender systems - Eoin Hurrell
  • Trace-Based Reasoning for User Assistance and Recommendations - Raafat Zarka
  • Trust-Based User Profiling - Nima Dokoohaki
  • Understanding and supporting mobile application usage - Matthias Böhmer
  • User Controllability in a Hybrid Recommender System - Denis Parra
  • Design and User Perception Issues for Personality-Engaged Recommender Systems - Rong Hu
  • Enhanced Vector Space Models for Content-based Recommender Systems - Cataldo Musto
  • Hybrid Tag Recommendation in Collaborative Tagging Systems - Marek Lipczak
  • Interaction Methods for Large Scale Graph Visualization Systems --- Using Manipulation to Aid Discovery - Brynjar Gretarsson
  • Leveraging Recommender Systems for the Creation and Maintenance of Structure within Collaborative Social Media Platforms - Eva Zangerle
  • Leveraging Tagging Data for Recommender Systems - Fatih Gedikli
  • More like this: machine learning approaches to music similarity - Brian McFee
  • On some Challenges for Online Trust and Reputation Systems - Mozhgan Tavakolifard
  • On the Use of Language Models and Topic Models in the Web: New Algorithms for Filtering, Classification, Ranking, and Recommendation - Ralf Krestel
  • Performance prediction and evaluation in Recommender Systems: an Information Retrieval perspective - Alejandro Bellogin
  • Integrating collaborative filtering and matching-based search for product recommendation - Noraswaliza Abdullah
  • Personalized Recommendation Based on Collaborative Tagging Techniques for an E-learning System - Aleksandra Klašnja‐Milićević
  • Recommender systems in industrial contexts - Frank Meyer
  • Supervised Machine Learning Methods for Item Recommendation - Zeno Gantner
  • Understanding Consistency of Recommender Systems: Behavioral and Algorithmic Perspectives - Jingjing Zhang
  • Bayesian Recommender Systems: Models and Algorithms - Shengbo Guo
  • Context-Aware Collaborative Filtering Recommender Systems - Linas Baltrunas
  • Contextualize Your Listening: The Playlist as Recommendation Engine - Ben Fields
  • Deployment of Recommender Systems: Operational and Strategic Issues - Abhijeet Ghoshal
  • Effective and Efficient Collaborative Filtering - Yi Ding
  • Effective Fusion-based Approaches for Recommender Systems - Xin Xin
  • Formal Concept Analysis and Tag Recommendations in Collaborative Tagging Systems - Robert Jäschke
  • Improved Trust-Aware Recommender System using Small-Worldness of Trust Networks - Weiwei Yuan
  • Personalized Recommendation in Social Network Sites - Jilin Chen
  • Predicting and using social tags to improve the accuracy and transparency of recommender systems - Sharon Givon
  • Recognition and usage of emotive parameters in recommender systems - Marko Tkalčič
  • Swarm intelligence for clustering dynamic data sets for web usage mining and personalization - Esin Saka
  • User profiling based on folksonomy information in Web 2.0 for personalized recommender systems - Huizhi Liang
  • Trust-based automated recommendation making - Touhid Bhuiyan
  • Using Data Mining for Facilitating User Contributions in the Social Semantic Web - Maryam Ramezani
  • Visualization of music relational information sources for analysis, navigation, and discovery - Justin Donaldson
  • A Domain-Independent Framework for Intelligent Recommendations - Jörn David
  • Content-Based Music Recommender Systems: Beyond simple Frame-Level Audio Similarity - Klaus Seyerlehner
  • Context-Aware Ranking with Factorization Models - Steffen Rendle
  • Evaluating Collaborative Filtering Over Time - Neal Lathia
  • Graph-Based Recommendation in Broad Folksonomies - Robert Wetzker
  • Methods and Applications for Ontology-based Recommender Systems - Tuukka Ruotsalo
  • Personalised video retrieval: application of implicit feedback and semantic user profiles - Frank Hopfgartner
  • Recommender Systems for Social Tagging Systems - Leandro Balby Marinho
  • Recommender Systeme für produktbegleitende Dienstleistungen - Margarethe Frohs
  • Relational clustering models for knowledge discovery and recommender systems - Tao Li
  • Trust networks for recommender systems - Patricia Victor
  • User session and history modeling for collaborative visualization - Fanhai Yang
  • Eine offene Architektur für die agentenbasierte, adaptive, personalisierte Informationsfilterung (German) - Andreas Lommatzsch
  • Explaining recommendations - Nava Tintarev
  • Factorization-Based Large Scale Recommendation Algorithms - István Pilászy
  • Nurturing Tagging Communities - Shilad Sen
  • Recommender systems and market diversity - Daniel M. Fleder
  • Recommender Systems for Social Bookmarking - Toine Bogers
  • Towards Efficient Music Similarity Search, Ranking, and Recommendation - Maria Magdalena Ruxanda
  • Evaluating recommender systems : an evaluation framework to predict user satisfaction for recommender systems in an electronic programme guide context - Joost de Wit
  • Exploiting the Conceptual Space in Hybrid Recommender Systems: a Semantic-based Approach - Iván Cantador
  • Information Enrichment for Quality Recommender Systems - Li-Tung Weng
  • Information-seeking on the Web with Trusted Social Networks – from Theory to Systems - Tom Heath
  • Missing Data Problems in Machine Learning - Benjamin Marlin
  • Music Recommendation and Discovery in the Long Tail - Òscar Celma
  • Recommender System based on Personality Traits - Maria Augusta Silveira Netto Nunes
  • Studies on Hybrid Music Recommendation Using Timbral and Rhythmic Features - Kazuyoshi Yoshii
  • Towards Metadata-aware Algorithms for Recommender Systems - Karen H. L. Tso-Sutter
  • User Decision Improvement and Trust Building in Product Recommender Systems - Li Chen
  • Bayesian Inference for Latent Variable Models - Ulrich Paquet
  • Mining Influence in Recommender Systems - Al Mamunur Rashid
  • Building Trustworthy Recommender Systems - Sheng Zhang
  • Designing social interactions with animated avatars and speech output for product recommendation agents in electronic commerce - Lingyun Qiu
  • Meeting User Information Needs in Recommender Systems - Sean McNee
  • A Market-Based Approach to Recommender Systems - Yan Zheng Wei
  • Explanet: a learning tool and hybrid recommender system for student-authored explanations - Jessica Masters
  • Hybrid recommendation techniques based on user profiles - Pasquale Lops
  • Information Transmission and Recommender Systems - Deran Özmen
  • Supporting People In Finding Information: Hybrid Recommender Systems and Goal-Based Structuring - Mark van Setten
  • Towards Decentralized Recommender Systems - Cai-Nicolas Ziegler
  • Use of Discrete Choice Models with Recommender Systems - Bassam H. Chaptini
  • Collaborative recommender agents based on case-based reasoning and trust - Miquel Montaner
  • The Impact of Internalization and Familiarity on Trust and Adoption of Recommendation Agents - Sherrie Komiak
  • Toward a Personal Recommender System - Bradley N. Miller
  • Recommendation as classification and recommendation as matching: two information-centered approaches to recommendation - Chumki Basu
  • Dynamic Information Filtering - Patrick Baudisch
  • MetaLens: A Framework for Multi-source Recommendations - Ben Schafer
  • Sparsity, scalability, and distribution in recommender systems - Badrul Munir Sarwar
  • Recommender Systems for Problem Solving Environments - Naren Ramakrishnan

External links

  • PhD Theses and Doctoral Dissertations Related to Music Information Retrieval
  • Mendeley collection of Recommender System dissertations (based on this page)

  • This page was last edited on 31 January 2022, at 18:44.

A systematic literature review on educational recommender systems for teaching and learning: research trends, limitations and opportunities

  • Published: 14 September 2022
  • Volume 28, pages 3289–3328 (2023)

  • Felipe Leite da Silva   ORCID: orcid.org/0000-0002-2799-3503 1 ,
  • Bruna Kin Slodkowski   ORCID: orcid.org/0000-0002-9028-366X 1 ,
  • Ketia Kellen Araújo da Silva   ORCID: orcid.org/0000-0003-4722-8072 1 &
  • Sílvio César Cazella   ORCID: orcid.org/0000-0003-2343-893X 1 , 2  

Recommender systems have become one of the main tools for personalized content filtering in the educational domain. Those that support teaching and learning activities, in particular, have gained increasing attention in the past years. This growing interest has motivated the emergence of new approaches and models in the field. In spite of this, there is a gap in the literature about the current trends on how recommendations have been produced and how recommenders have been evaluated, as well as what the research limitations and opportunities for advancement in the field are. In this regard, this paper reports the main findings of a systematic literature review covering these four dimensions. The study is based on the analysis of a set of primary studies (N = 16 out of 756, published from 2015 to 2020) included according to defined criteria. Results indicate that the hybrid approach has been the leading strategy for recommendation production. Concerning the purpose of the evaluation, the recommenders were evaluated mainly with regard to accuracy, and few studies were found that investigated their pedagogical effectiveness. This evidence points to a potential research opportunity for the development of multidimensional evaluation frameworks that effectively support the verification of the impact of recommendations on the teaching and learning process. We also identify and discuss the main limitations to clarify current difficulties that demand attention in future research.

1 Introduction

Digital technologies are increasingly integrated into different application domains. Particularly in education, there is a vast interest in using them as mediators of the teaching and learning process. In such a task, the computational apparatus serves as an instrument to support human knowledge acquisition from different educational methodologies and pedagogical practices (Becker, 1993).

In this sense, Educational Recommender Systems (ERS) play an important role for both educators and students (Maria et al., 2019). For instructors, these systems can contribute to their pedagogical practices through recommendations that improve their planning and assist in filtering educational resources. As for the learners, by recognizing preferences and educational constraints, recommenders can contribute to their academic performance and motivation by indicating personalized learning content (Garcia-Martinez & Hamou-Lhadj, 2013).

Despite the benefits, there are known issues with the use of recommender systems in the educational domain. One of the main challenges is to find an appropriate correspondence between the expectations of users and the recommendations (Cazella et al., 2014). Difficulties arise from differences in learners' educational interests and needs (Verbert et al., 2012). The variety of individual student factors that can influence the learning process (Buder & Schwind, 2012) is one of the matters that makes this challenge difficult to overcome. From a recommender standpoint, this reflects a diversity of inputs with the potential to tune recommendations for users.

From another perspective, a technological and artificial intelligence one, ERS are likely to suffer from issues already known in general-purpose recommenders, such as the cold start and data sparsity problems (Garcia-Martinez & Hamou-Lhadj, 2013). Furthermore, some problems are related to the approach used to generate recommendations. For instance, overspecialization is inherently associated with the way that content-based recommender systems handle data (Iaquinta et al., 2008; Khusro et al., 2016). These issues make it difficult to design recommenders that best suit the user's learning needs and that avoid user dissatisfaction in the short and long term.

From an educational point of view, issues emerge on how to evaluate ERS effectiveness. A usual strategy to measure the quality of educational recommenders is to apply the traditional recommender evaluation methods (Erdt et al., 2015). This approach determines system quality based on performance properties, such as its precision and prediction accuracy. Nevertheless, in the educational domain, system effectiveness needs to take into account the students' learning performance. This dimension brings new complexities on how to successfully evaluate ERS.

As the ERS topic has gradually attracted more attention from the scientific community (Zhong et al., 2019), extensive research has been carried out in recent years to address these issues (Manouselis et al., 2010; Manouselis et al., 2014; Tarus et al., 2018; George & Lal, 2019). ERS has become a field of application and combination of different computational techniques, such as data mining, information filtering and machine learning, among others (Tarus et al., 2018). This scenario indicates a diversity in the design and evaluation of recommender systems that support teaching and learning activities. Nonetheless, research is dispersed in the literature and there is no recent study that encompasses the current scientific efforts in the field and reveals how such issues are addressed in current research. Reviewing evidence and synthesizing findings on how ERS produce recommendations, how ERS are evaluated, and what the research limitations and opportunities are can provide a panoramic perspective of the research topic and support practitioners and researchers in implementation and future research directions.

From the aforementioned perspective, this work aims to investigate and summarize the main trends and research opportunities on the ERS topic through a Systematic Literature Review (SLR). The study was conducted based on publications from the last six years, particularly those regarding recommenders that support the teaching and learning process.

Main trends refer to recent research directions in the ERS field. They are analyzed with regard to how recommender systems produce recommendations and how they are evaluated. As mentioned above, these are significant dimensions related to current issues of the area. Specifically for recommendation production, this paper provides a three-axis analysis centered on the systems' underlying techniques, input data and results presentation.

Additionally, research opportunities in the field of ERS as well as their main limitations are highlighted. Because current comprehension of these aspects is fragmented in literature, such an analysis can shed light for future studies.

The SLR was carried out using the Kitchenham and Charters (2007) guidelines. The SLR is the main method for summarizing evidence related to a topic or a research question (Kitchenham et al., 2009). The Kitchenham and Charters (2007) guidelines, in turn, are one of the leading orientations for reviews on information technology in education (Dermeval et al., 2020).

The remainder of this paper is structured as follows. In Section 2, the related works are presented. Section 3 details the methodology used in carrying out the SLR. Section 4 covers the SLR results and related discussion. Section 5 presents the conclusion.

2 Related works

In the field of education, there is a growing interest in technologies that support teaching and learning activities. For this purpose, ERS are strategic solutions to provide a personalized educational experience. Research in this sense has attracted the attention of the scientific community and there has been an effort to map and summarize different aspects of the field in the last 6 years.

In Drachsler et al. (2015), a comprehensive review of technology enhanced learning recommender systems was carried out. The authors analyzed 82 papers published from 2000 to 2014 and provided an overview of the area. Different aspects were analyzed about recommenders' approach, source of information and evaluation. Additionally, a categorization framework is presented and the study includes the classification of selected papers according to it.

Klašnja-Milićević et al. (2015) conducted a review on recommendation systems for e-learning environments. The study focuses on the requirements, challenges and (dis)advantages of techniques in the design of this type of ERS. Collaborative tagging systems and their integration in e-learning platform recommenders are also discussed.

Ferreira et al. (2017) investigated particularities of research on ERS in Brazil. Papers published between 2012 and 2016 in three Brazilian publication venues were analyzed. Rivera et al. (2018) presented a big picture of the ERS area through a systematic mapping. The study covered a larger set of papers and aimed to detect global characteristics in ERS research. Aiming at the same focus, but with different questions and a different combination of repositories, Pinho et al. (2019) performed a systematic review on ERS. These works share the common concern of providing insights about the systems' evaluation methods and the main techniques adopted in the recommendation process.

Nascimento et al. (2017) carried out a SLR covering learning objects recommender systems based on the user's learning styles. Learning objects metadata standards, learning style theoretical models, e-learning systems used to provide recommendations and the techniques used by the ERS were investigated.

Tarus et al. (2018) and George and Lal (2019) concentrated their reviews on ontology-based ERS. Tarus et al. (2018) examined the distribution of research in the period from 2005 to 2014 according to year of publication. Furthermore, the authors summarized the techniques, knowledge representation, ontology types and ontology representations covered in the papers. George and Lal (2019), in turn, update the contributions of Tarus et al. (2018), investigating papers published between 2010 and 2019. The authors also discuss how ontology-based ERS can be used to address traditional recommender system issues, such as the cold start problem and rating sparsity.

Ashraf et al. (2021) directed their attention to investigating course recommendation systems. Through a comprehensive review, the study summarized the techniques and parameters used by this type of ERS. Additionally, a taxonomy of the factors taken into account in the course recommendation process was defined. Salazar et al. (2021), on the other hand, conducted a review on affectivity-based ERS. The authors presented a macro analysis, identifying the main authors and research trends, and summarized different recommender system aspects, such as the techniques used in affectivity analysis, the source of affectivity data collection and how to model emotions.

Khanal et al. (2019) reviewed e-learning recommendation systems based on machine learning algorithms. A total of 10 papers from two publication venues, published between 2016 and 2018, were examined. The focal point of the study was to investigate four categories of recommenders: those based on collaborative filtering, content-based filtering, knowledge and a hybrid strategy. The dimensions analyzed were the machine learning algorithms used, the recommenders' evaluation process, the characterization of inputs and outputs, and the recommender challenges addressed.

2.1 Related works gaps and contribution of this study

The studies presented in the previous section have a diversity of scope and dimensions of analysis; in general, however, they can be classified into two distinct groups. The first focuses on specific subjects of the ERS field, such as similar methods of recommendation (George & Lal, 2019; Khanal et al., 2019; Salazar et al., 2021; Tarus et al., 2018) and the same kind of recommendable resources (Ashraf et al., 2021; Nascimento et al., 2017). This type of research scrutinizes the particularities of the recommenders and highlights aspects that are difficult to identify in reviews with a broader scope. Despite that, most of the reviews concentrate on analyses of recommenders' operational features and have limited discussion of cross-cutting issues, such as ERS evaluation and presentation approaches. Khanal et al. (2019), specifically, make contributions regarding evaluation, but the analysis is limited to four types of recommender systems.

The second group is composed of wider-scope reviews and includes recommendation models based on a diversity of methods, inputs and output strategies (Drachsler et al., 2015; Ferreira et al., 2017; Klašnja-Milićević et al., 2015; Pinho et al., 2019; Rivera et al., 2018). Due to the very nature of systematic mappings, the research conducted by Ferreira et al. (2017) and Rivera et al. (2018) does not cover some topics in depth; for example, the data synthesized on the evaluations of the ERS are limited to indicating only the methods used. Ferreira et al. (2017), in particular, aim to investigate only Brazilian recommendation systems, offering partial contributions to an understanding of the state of the art of the area. The same limitation of systematic mappings is noted in Pinho et al. (2019). The review was reported with a restricted number of pages, making it difficult to detail the findings. On the other hand, Drachsler et al. (2015) and Klašnja-Milićević et al. (2015) carried out comprehensive reviews that summarize specific and macro dimensions of the area. However, the papers included in their reviews were published until 2014 and there is a gap regarding the advances and trends in the field in the last 6 years.

Given the above, as far as the authors are aware, there is no wide-scope secondary study that aggregates the research achievements on recommendation systems that support teaching and learning in recent years. Moreover, a review in this sense is necessary since personalization has become an important feature in the teaching and learning context and ERS are one of the main tools to deal with the different educational needs and preferences that affect individuals' learning process.

In order to widen the frontiers of knowledge in the field of research, this review aims to contribute to the area by presenting a detailed analysis of the following dimensions: how recommendations are produced and presented, how recommender systems are evaluated, and what the studies' limitations and research opportunities are. Specifically, to summarize the current knowledge, a SLR was conducted based on four research questions (Section 3.1). The review focused on papers published from 2015 to 2020 in scientific journals. A quality assessment was performed to select the most mature systems. The data found on the investigated topics are summarized and discussed in Section 4.

3 Methodology

This study is based on the SLR methodology for gathering evidence related to the research topic investigated. As stated by Kitchenham and Charters (2007) and Kitchenham et al. (2009), this method provides the means to aggregate evidence from current research, prioritizing the impartiality and reproducibility of the review. Therefore, a SLR is based on a process that entails the development of a review protocol that guides the selection of relevant studies and the subsequent extraction of data for analysis.

Guidelines for SLR are widely described in the literature and the method can be applied for gathering evidence in different domains, such as medicine and social science (Khan et al., 2003; Pai et al., 2004; Petticrew & Roberts, 2006; Moher et al., 2015). Particularly for the informatics in education area, the Kitchenham and Charters (2007) guidelines have been reported as one of the main orientations (Dermeval et al., 2020). Their approach appears in several studies (Petri & Gresse von Wangenheim, 2017; Medeiros et al., 2019; Herpich et al., 2019), including mappings and reviews in the ERS field (Rivera et al., 2018; Tarus et al., 2018).

As mentioned in Section 1, the Kitchenham and Charters (2007) guidelines were used in the conducted SLR. They are based on three main stages: the first for planning the review, the second for conducting it and the last for reporting the results. Following these orientations, the review was structured in three phases with seven main activities distributed among them, as depicted in Fig. 1.

Fig. 1 Systematic literature review phases and activities

The first was the planning phase. The identification of the need for a SLR about teaching and learning support recommenders and the development of the review protocol occurred in this stage. In activity 1, a search for SLRs with the intended scope of this study was performed. The search did not return papers compatible with this review's scope. The papers identified are described in Section 2. In activity 2, the review process was defined. The protocol was elaborated through rounds of discussion by the authors until consensus was reached. The outputs of activity 2 were the research questions, search strategy, paper selection strategy and data extraction method.

The next was the conducting phase. At this point, activities for the identification (activity 3) and selection (activity 4) of relevant papers were executed. In activity 3, searches were carried out in seven repositories indicated by Dermeval et al. (2020) as relevant to the area of informatics in education. The authors applied the search string in these repositories' search engines; however, due to the large number of results returned, they established a limit of 600 to 800 papers to be analyzed. Thus, three repositories whose sum of search results was within the established limits were chosen. The potential repositories considered for this review and the selected ones are listed in Section 3.1. The search string used is also shown in Section 3.1.
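The repository selection constraint described above can be illustrated with a short sketch. It enumerates every group of three repositories whose combined number of search results falls within the 600 to 800 limit. The repository names and result counts below are hypothetical placeholders, not the figures from this review, and the source does not state how the authors chose among groups that satisfy the limit.

```python
from itertools import combinations


def candidate_selections(result_counts, size=3, low=600, high=800):
    """Return every group of `size` repositories whose total hits lie in [low, high]."""
    selections = []
    for group in combinations(result_counts, size):
        total = sum(result_counts[name] for name in group)
        if low <= total <= high:
            selections.append((group, total))
    return selections


# Hypothetical result counts for seven repositories (illustrative values only).
hypothetical_counts = {
    "repo_a": 5200,
    "repo_b": 300,
    "repo_c": 250,
    "repo_d": 200,
    "repo_e": 1400,
    "repo_f": 100,
    "repo_g": 2100,
}

selections = candidate_selections(hypothetical_counts)
for group, total in selections:
    print(group, total)
```

With real counts, more than one group may qualify, so a further criterion (for example, relevance of the repositories to the area) would still be needed to pick one.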

In activity 4, studies were selected in two steps. In the first, inclusion and exclusion criteria were applied to each identified paper. Accepted papers had their quality assessed in the second step. Parsifal (see Footnote 1) was used to manage the data of the planning and conducting phases. Parsifal is a web system that adheres to the Kitchenham and Charters (2007) guidelines and supports the conduction of SLRs. At the end of this step, relevant data were extracted (activity 5) and registered in a spreadsheet. Finally, in the reporting phase, the extracted data were analyzed in order to answer the SLR research questions (activity 6) and the results were recorded in this paper (activity 7).

3.1 Research question, search string and repositories

Teaching and learning support recommender systems have particularities of configuration, design and evaluation method. Therefore, the research questions listed in Table 1 were elaborated in an effort to synthesize this knowledge, as well as the main limitations and research opportunities in the field, from the perspective of the most recent studies.

Regarding the search strategy, papers were selected from three digital repositories (Table 2 ). For the search, “Education” and “Recommender system” were defined as the keywords and synonyms were derived from them as secondary terms (Table 3 ). From these words, the following search string was elaborated:

("Education" OR "Educational" OR "E-learning" OR "Learning" OR "Learn") AND ("Recommender system" OR "Recommender systems" OR "Recommendation system" OR "Recommendation systems" OR "Recommending system" OR "Recommending systems")
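As a minimal sketch, the search string above can be assembled programmatically from the two keyword groups. The variable and function names are illustrative assumptions, and individual repository search engines may require their own query syntax.

```python
# Illustrative sketch: build the boolean search string from the keyword groups.
# The names education_terms, recommender_terms and or_group are hypothetical.
education_terms = ["Education", "Educational", "E-learning", "Learning", "Learn"]
recommender_terms = [
    "Recommender system", "Recommender systems",
    "Recommendation system", "Recommendation systems",
    "Recommending system", "Recommending systems",
]

def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

# The two groups are combined with AND, as in the string reported in the text.
search_string = " AND ".join([or_group(education_terms), or_group(recommender_terms)])
print(search_string)
```

Keeping the terms in lists makes it easier to adapt the same query to each repository without retyping the synonyms.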

3.2 Inclusion and exclusion criteria

The first step in the selection of papers was performed through the application of objective criteria; thus, a set of inclusion and exclusion criteria was defined. The approved papers formed the group of primary studies with potential relevance to the scope of the SLR. Table 4 lists the defined criteria. The description column of Table 4 states each criterion, and the id column identifies it with a code. The code was formed by appending an index, following the sequence of the list, to an abbreviation of the respective kind of criterion (IC for Inclusion Criteria and EC for Exclusion Criteria). The id is used to reference the corresponding criterion in the rest of this document.

Since the focus of this review is on the analysis of recent ERS publications, only studies from the past 6 years (2015–2020) were screened (see IC1). Targeting mature recommender systems, only full papers from scientific journals that present an evaluation of the recommendation system were considered (see IC2, IC4 and IC7). Also, only works written in English were selected, because they are the most numerous and are within the reading ability of the authors (see IC3). The search string was checked against the papers’ titles, abstracts and keywords to ensure that only studies related to the ERS field were screened (see IC5). IC6, specifically, delimited the subject of the selected papers and aligned it with the scope of the review. Additionally, it prevented the selection of secondary studies in the process (e.g., other reviews or systematic mappings). Conversely, exclusion criteria were defined to clarify that papers contrasting with the inclusion criteria should be excluded from the review (see EC1 to EC8). Finally, duplicate search results were marked and, when all criteria were met, only the latest was selected.

3.3 Quality evaluation

The second step in the study selection activity was the quality evaluation of the papers. A set of questions was defined, with answers of different weights, to estimate the quality of the studies. The objective of this phase was to select studies with higher: (i) validity; (ii) detail about the context and implications of the research; and (iii) description of the proposed recommenders. Research that detailed the configuration of the experiment and carried out an external validation of the ERS obtained a higher weight in the quality assessment. Hence, the weights of the questions related to recommender evaluation (QA8 and QA9) ranged from 0 to 3, while those of the other questions ranged from 0 to 2. The questions and their respective answers are presented in Table 7 (see Appendix). Each evaluated paper had a total weight calculated according to Formula 1:

The total weight of a paper ranges from 0 to 10. Only works that reached the minimum weight of 7 were accepted.

3.4 Screening process

The paper screening process occurred as shown in Fig. 2. Initially, three authors carried out the identification of the studies. In this activity, the search string was applied in the search engines of the repositories, along with the inclusion and exclusion criteria through filtering settings. Two searches were undertaken on the three repositories at distinct moments, one in November 2020 and another in January 2021. The second was performed to ensure that all papers published in 2020 in the repositories were counted. A total of 756 preliminary primary studies were returned and their metadata were registered in Parsifal.

Fig. 2 Flow of papers search and selection

Following the protocol, the selection activity was initiated. At the start, the duplicate detection feature of Parsifal was used. A total of 5 duplicate papers were returned and the oldest copies were ignored. Afterwards, papers were divided into groups and distributed among the authors. Inclusion and exclusion criteria were applied by reading titles and abstracts. In cases in which it was not possible to determine the eligibility of a paper based on these two fields, the body of the text was read until all criteria could be applied accurately. Finally, 41 studies remained for the next step. Once more, papers were divided into three groups and each set of works was evaluated by one author. Studies were read in full and weighted according to each quality assessment question. At any stage of this process, when questions arose, the authors defined a solution through consensus. As a final result of the selection activity, 16 papers were approved for data extraction.
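The counts reported for the screening process can be cross-checked with simple differences. The sketch below is only a bookkeeping aid: the variable names are hypothetical, and it assumes that each of the 5 duplicates flagged corresponds to exactly one record removed before the criteria were applied.

```python
# Counts taken from the text; the derived values are plain subtractions.
identified = 756       # preliminary primary studies returned by the searches
duplicates = 5         # duplicates flagged by Parsifal (assumed: one record removed each)
after_criteria = 41    # studies remaining after inclusion/exclusion criteria
accepted = 16          # papers approved after quality assessment

screened = identified - duplicates                 # records checked against the criteria
excluded_by_criteria = screened - after_criteria   # removed in the first selection step
rejected_by_quality = after_criteria - accepted    # removed in the quality step

print(screened, excluded_by_criteria, rejected_by_quality)
```

Under that assumption, 751 records were screened, 710 were excluded by the criteria and 25 were rejected in the quality assessment.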

3.5 Procedure for data analysis

Data from selected papers were extracted in a data collection form that registered general information and specific information. The general information extracted was: reviewer identification, date of data extraction and title, authors and origin of the paper. General information was used to manage the data extraction activity. The specific information was: recommendation approach, recommendation techniques, input parameters, data collection strategy, method for data collection, evaluation methodology, evaluation settings, evaluation approaches, evaluation metrics. This information was used to answer the research questions. Tabulated records were interpreted and a descriptive summary with the findings was prepared.

4 Results and discussion

In this section, the SLR results are presented. First, an overview of the selected papers is introduced. Next, the findings are analyzed from the perspective of each research question in a respective subsection.

4.1 Selected papers overview

Each selected paper presents a distinct recommendation approach that advances the ERS field. An overview of these studies is provided below.

Sergis and Sampson ( 2016 ) present a recommendation system that supports educators’ teaching practices through the selection of learning objects from educational repositories. It generates recommendations based on the level of instructors’ proficiency on ICT Competences. In Tarus et al. ( 2017 ), the recommendations are targeted at students. The study proposes an e-learning resource recommender based on both user and item information mapped through ontologies.

Nafea et al. ( 2019 ) propose three recommendation approaches. They combine item ratings with student’s learning styles for learning objects recommendation. Klašnja-Milićević et al. ( 2018 ) present a recommender of learning materials based on tags defined by the learners. The recommender is incorporated in Protus e-learning system.

In Wan and Niu (2016), a recommender based on mixed concept mapping and immunological algorithms is proposed. It produces sequences of learning objects for students. In a different approach, the same authors incorporate self-organization theory into ERS. Wan and Niu (2018) deal with the notion of self-organizing learning objects. In this research, resources behave as individuals who can move towards learners. This movement results in recommendations and is triggered based on students’ learning attributes and actions. In Wan and Niu (2020), in turn, self-organization refers to students approaching one another, motivated by their learning needs. The authors propose an ERS that recommends self-organized cliques of learners and, based on these, recommends learning objects.

Zapata et al. (2015) developed a learning object recommendation strategy for teachers. The study describes a method based on a collaborative methodology and voting aggregation strategies for group recommendations. This approach is implemented in the Delphos recommender system. In a similar research line, Rahman and Abdullah (2018) present an ERS that recommends Google results tailored to students’ academic profiles. The proposed system classifies learners into groups and, according to the similarity of their members, indicates web pages related to shared interests.

Wu et al. (2015) propose a recommendation system for e-learning environments. In this study, the complexity and uncertainties related to user profile data and learning activities are modeled through tree structures combined with fuzzy logic. Recommendations are produced from matches of these structures. Ismail et al. (2019) developed a recommender to support informal learning. It suggests Wikipedia content taking into account unstructured textual platform data and user behavior.

Huang et al. (2019) present a system for recommending optional courses. The system’s recommendations rely on the time constraints of the student’s curriculum and on the similarity of academic performance between the student and senior students. The time that individuals dedicate to learning is also a relevant factor in Nabizadeh et al. (2020). In this research, a learning path recommender that includes lessons and learning objects is proposed. The system estimates the learner’s good performance score and, based on that, produces a learning path that satisfies the learner’s time constraints. The recommendation approach also indicates auxiliary resources for those who do not reach the estimated performance.

Fernández-García et al. (2020) deal with recommendations of disciplines using a small and sparse dataset. The authors developed a model based on several data mining and machine learning techniques to support students’ decisions in choosing subjects. Wu et al. (2020) create a recommender that captures students’ mastery of a topic and produces a list of exercises with a level of difficulty adapted to them. Yanes et al. (2020) developed a recommendation system, based on different machine learning algorithms, that provides appropriate actions to assist teachers in improving the quality of teaching strategies.

4.2 How do teaching and learning support recommender systems produce recommendations?

The process of generating recommendations is analyzed along two axes. The underlying techniques of the recommender systems are discussed first, and then the input parameters are covered. Details of the studies are provided in Table 5.

4.2.1 Techniques approaches

The analysis of the selected papers shows that hybrid recommendation systems are predominant. Such recommenders are characterized by computing predictions through a set of two or more algorithms in order to mitigate or avoid the limitations of pure recommendation systems (Isinkaye et al., 2015). Of the sixteen analyzed papers, thirteen (p = 81.25%) are based on hybridization. This tendency seems to be related to the support that the hybrid approach provides for the development of recommender systems that must meet multiple educational needs of users. For example, Sergis and Sampson (2016) proposed a recommender based on two main techniques: fuzzy sets to deal with uncertainty about the teacher’s competence level, and Collaborative Filtering (CF) to select learning objects based on neighbors who may have similar competences. In Tarus et al. (2017), student and learning resource profiles are represented as ontologies. The system calculates predictions based on them and recommends learning items through a mechanism that applies collaborative filtering followed by a sequential pattern mining algorithm.

Moreover, the hybrid approach that combines CF and Content-Based Filtering (CBF), although traditional (Bobadilla, Ortega, Hernando and Gutiérrez, 2013), does not seem to be popular in research on teaching and learning support recommender systems. Among the selected papers, only Nafea et al. (2019) have a proposal in this regard. Additionally, the extracted data indicate that a significant number of hybrid recommendation systems (p = 53.85%, n = 7) have been built by combining methods for treating or representing data, such as ontologies and fuzzy sets, with methods for generating recommendations. For example, Wu et al. (2015) structure user profile data and learning activities through fuzzy trees. In such structures, the values assigned to the nodes are represented by fuzzy sets. The fuzzy tree data model and users’ ratings feed a tree-structured data matching method and a CF algorithm for similarity calculation.

The collaborative filtering recommendation paradigm, in turn, plays an important role in the research. Nearly a third of the studies that propose hybrid recommenders (p = 30.77%, n = 4) include a CF-based strategy. In fact, this is the most frequent pure technique in the research set. In total, 31.25% (n = 5) of the studies are based on an adapted version of CF or combine it with other approaches. CBF-based recommenders, in contrast, have not shared the same popularity. This technique is an established recommendation approach that produces results based on the similarity between items known to the user and other recommendable items (Bobadilla et al., 2013). Only Nafea et al. (2019) propose a CBF-based recommendation system.

Also, the user-based CF variant is widely used in the analyzed research. In this version, predictions are calculated from the similarity between users, as opposed to the item-based version, where predictions are based on item similarities (Isinkaye et al., 2015). All CF-based recommendation systems identified, whether pure or combined with other techniques, use this variant.
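The user-based variant described above can be illustrated with a minimal sketch. The toy ratings, the choice of cosine similarity over co-rated items and the function names are assumptions made for illustration; they are not taken from any of the reviewed systems.

```python
from math import sqrt

# Toy data, invented for illustration: user -> {learning object: rating}.
ratings = {
    "ana":   {"lo1": 5, "lo2": 3, "lo3": 4},
    "bruno": {"lo1": 4, "lo2": 2, "lo3": 5, "lo4": 4},
    "carla": {"lo1": 1, "lo2": 5, "lo4": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of the ratings given to item by other users."""
    neighbours = [v for v in ratings if v != user and item in ratings[v]]
    weights = [cosine_similarity(user, v) for v in neighbours]
    total = sum(weights)
    if total == 0:
        return None  # no usable neighbour: a cold-start situation
    return sum(w * ratings[v][item] for w, v in zip(weights, neighbours)) / total

prediction = predict("ana", "lo4")
print(round(prediction, 2))
```

An item-based variant would instead compare the rating vectors of learning objects and weight the target user's own ratings by item similarity.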

The above findings seem to be related to the growing perception, in the education domain, of the relevance of a student-centered teaching and learning process (Krahenbuhl, 2016; Mccombs, 2013). Recommendation approaches based on users’ profiles, such as interests, needs, and capabilities, naturally fit this notion and are more widely used than those based on other information, such as the characteristics of the recommended items.

4.2.2 Input parameters approaches

In regard to the inputs consumed in the recommendation process, the collected data show that the main parameters are attributes related to users’ educational profile. Examples are ICT competences (Sergis & Sampson, 2016), learning objectives (Wan & Niu, 2018; Wu et al., 2015), learning styles (Nafea et al., 2019), learning levels (Tarus et al., 2017) and different academic data (Yanes et al., 2020; Fernández-García et al., 2020). Only 25% (n = 4) of the systems apply item-related information in the recommendation process. Furthermore, with the exception of the CBF-based recommender of Nafea et al. (2019), these systems are based on a combination of item and user information. A complete list of the identified input parameters is provided in Table 5.

Academic information and learning styles feature more prominently in the research than other parameters. They appear, respectively, in 37.5% (n = 6) and 31.25% (n = 5) of the papers. Students’ scores (Huang et al., 2019), academic background (Yanes et al., 2020), learning categories (Wu et al., 2015) and subjects taken (Fernández-García et al., 2020) are some of the academic data used. Learning styles, in turn, are predominantly based on the Felder (1988) theory. Wan and Niu (2016), exceptionally, combine Felder (1988), Kolb et al. (2001) and Betoret (2007) to build a specific notion of learning styles. This notion is also used in two other studies carried out by the same authors, and has a questionnaire also developed by them (Wan & Niu, 2018, 2020).

Regarding the way inputs are captured, it was observed that explicit feedback is prioritized over other data collection strategies. In this approach, users have to directly provide the information that will be used in the process of preparing recommendations (Isinkaye et al., 2015). Half of the analyzed studies are based only on explicit feedback. The use of graphical interface components (Klašnja-Milićević et al., 2018), questionnaires (Wan & Niu, 2016) and manual entry of datasets (Wu et al., 2020; Yanes et al., 2020) are the main methods identified.

Only 18.75% (n = 3) of the ERS rely solely on gathering information through implicit feedback, that is, when inputs are inferred by the system (Isinkaye et al., 2015). This type of data collection appears to be more popular when applied together with an explicit feedback method to enhance the prediction tasks. Recommenders that combine both approaches occur in 31.25% (n = 5) of the studies. The implicit data collection methods identified are tracking of users’ usage data, such as access, browsing and rating history (Rahman & Abdullah, 2018; Sergis & Sampson, 2016; Wan & Niu, 2018), data extraction from another system (Ismail et al., 2019), monitoring of users’ session data (Rahman & Abdullah, 2018) and data estimation (Nabizadeh et al., 2020).

The aforementioned results indicate that, in the context of teaching and learning support recommender systems, the implicit collection of data has usually been explored as a complement to the explicit one. A possible rationale is that the inference of information is noisy and less accurate (Isinkaye et al., 2015) and, therefore, the recommendations produced from it are more complex to adjust to users’ expectations (Nichols, 1998). This makes it difficult to apply the strategy in isolation, and it may cause greater user dissatisfaction than the burden of providing inputs under the explicit strategy.

4.3 How do teaching and learning support recommender systems present recommendations?

From the analyzed papers, two approaches for presenting recommendations are identified. The majority of the proposed ERS are based on a list of items ranked according to a per-user prediction calculation (p = 87.5%, n = 14). This strategy is applied in all cases where the supported task is to find good items that assist users in teaching and learning tasks (Ricci et al., 2015; Drachsler et al., 2015). The second approach is based on the generation of a learning pathway. In this case, recommendations are displayed as a series of linked items tied by prerequisites. Only 2 recommenders use this approach. In them, the sequence is established by the association attributes of learning objects (Wan & Niu, 2016) and by a combination of the user’s prior knowledge, the time available to the user and a learning score (Nabizadeh et al., 2020). These ERS are associated with the item sequence recommendation task and are intended to guide users who wish to acquire specific knowledge (Drachsler et al., 2015).
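The two presentation approaches can be contrasted with a short sketch. The predicted scores, item names and prerequisite relations are invented for illustration, and the use of a topological ordering for the pathway is an assumption; the reviewed systems build their sequences with their own methods. The example relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Approach 1: a ranked list of the top-n items by predicted score (made-up scores).
predicted = {"lo1": 3.2, "lo2": 4.7, "lo3": 4.1, "lo4": 2.5}

def top_n(scores, n):
    """Return the n items with the highest predicted score, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

ranked = top_n(predicted, 3)

# Approach 2: a learning pathway in which each item appears after its prerequisites.
# Mapping: item -> set of items that must come before it (hypothetical topics).
prerequisites = {
    "fractions": {"integers"},
    "ratios": {"fractions"},
}
pathway = list(TopologicalSorter(prerequisites).static_order())

print(ranked)
print(pathway)
```

The ranked list treats items independently, whereas the pathway encodes an order that the user is expected to follow.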

On further examination, it is observed that more than half of the papers (62.5%, n = 10) do not present details of how the recommendation list is presented to the end user. In Huang et al. (2019), for example, there is a vague description of the production of predicted scores for students and of a list of the top-n optional courses, but it is not specified how this list is displayed. This may be related to the fact that most of these recommenders do not report an integration into another system (e.g., learning management systems) or the purpose of making them available as a standalone tool (e.g., a web or mobile recommendation system). The absence of such requirements reduces the need to develop a refined presentation interface. Only Tarus et al. (2017), Wan and Niu (2018) and Nafea et al. (2019) propose recommenders that are incorporated in an e-learning system and yet do not detail the way in which the results are exhibited. Of the six papers that provide insights about recommendation presentation, two (33.33%) have a graphical interface that explicitly seeks to capture the attention of a user who may be performing another task in the system. This approach highlights recommendations and is common in commercial systems (Beel, Langer and Genzmehr, 2013). In Rahman and Abdullah (2018), a panel entitled “recommendations for you” is used. In Ismail et al. (2019), a pop-up box with suggestions is displayed to the user. The remaining studies exhibit organic recommendations, i.e., naturally arranged items for user interaction (Beel et al., 2013).

In Zapata et al. (2015), after the user defines some parameters, a list of recommended learning objects is returned, similarly to a search engine result. As for the aggregation methods, another item recommended by the system, only the strategy that best fits the interests of the group is recommended. The result is visualized through a five-star Likert scale that represents the users’ consensus rating. In Klašnja-Milićević et al. (2018) and Wu et al. (2015), the recommenders’ results are listed in the main area of the system. In Nabizadeh et al. (2020), the learning path occupies a panel on the screen and the items associated with it are displayed as the user progresses through the steps. The view of the auxiliary learning objects is not described in the paper. These last three recommenders do not include filtering settings and distance themselves from the archetype of a search engine.

Also, a significant number of studies focus on learning object recommendations (p = 56.25%, n = 9). Other recommendable items identified are learning activities (Wu et al., 2015), pedagogical actions (Yanes et al., 2020), web pages (Ismail et al., 2019; Rahman & Abdullah, 2018), exercises (Wu et al., 2020), aggregation methods (Zapata et al., 2015), lessons (Nabizadeh et al., 2020) and subjects (Fernández-García et al., 2020). None of the studies relates the way of displaying results to the type of recommended item. This is a topic that needs further investigation to answer whether there are more appropriate ways to present specific types of items to the user.

4.4 How are teaching and learning support recommender systems evaluated?

In ERS, there are three main evaluation methodologies (Manouselis et al., 2013). One of them is the offline experiment, which is based on the use of pre-collected or simulated data to test recommenders’ prediction quality (Shani & Gunawardana, 2010). The user study is the second approach. It takes place in a controlled environment where information related to real interactions of users is collected (Shani & Gunawardana, 2010). This type of evaluation can be conducted, for example, through questionnaires and A/B tests (Shani & Gunawardana, 2010). Finally, the online experiment, also called real-life testing, is one in which recommenders are used under real conditions by the intended users (Shani & Gunawardana, 2010).

In view of these definitions, the analyzed studies report only user studies and offline experiments. Each of these methods was identified in 68.75% (n = 11) of the papers. Note that they are not mutually exclusive and therefore the sum of the percentages is greater than 100%. For example, Klašnja-Milićević et al. (2018) and Nafea et al. (2019) assessed the quality of ERS predictions through dataset analysis and also asked users to use the systems to investigate their attractiveness. Both evaluation methods are carried out jointly in 37.5% (n = 6) of the papers. Considering exclusive usage, each method is conducted alone in 31.25% (n = 5) of the papers. Therefore, the two methods seem to have a balanced popularity. Real-life tests, in contrast, although they are the ones that best demonstrate the quality of a recommender (Shani & Gunawardana, 2010), are the most avoided, probably due to the high cost and complexity of execution.
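The reported figures are mutually consistent, as a short inclusion-exclusion check shows. The variable and function names below are illustrative; the counts are those stated in the text.

```python
# Counts stated in the text for the 16 selected papers.
total_papers = 16
user_studies = 11   # papers reporting a user study
offline = 11        # papers reporting an offline experiment
both = 6            # papers reporting both methods

only_user_study = user_studies - both   # user study alone
only_offline = offline - both           # offline experiment alone
covered = only_user_study + only_offline + both  # papers using at least one method

def percentage(count):
    """Share of the selected papers, in percent, rounded to two decimals."""
    return round(100 * count / total_papers, 2)

print(percentage(user_studies), percentage(both), percentage(only_offline), covered)
```

The overlap of 6 papers explains why the two shares of 68.75% add up to more than 100% while still covering exactly the 16 papers.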

An interesting finding concerns the user study methods used in the research. When associated with offline experiments, the user satisfaction assessment is the most common (p = 80%, n = 5). Of these, only Nabizadeh et al. (2020) performed an in-depth evaluation combining a satisfaction questionnaire with an experiment to verify the pedagogical effectiveness of their recommender. Wu et al. (2015), in particular, do not include a satisfaction survey. They conducted a qualitative investigation of user interactions and experiences.

Although questionnaires assist in identifying valuable information from users, they are sensitive to respondents’ intentions and can be biased by erroneous answers (Shani & Gunawardana, 2010). Papers that present only user studies, in contrast, have a higher rate of experiments that result in direct evidence about the recommender’s effectiveness in teaching and learning. All papers in this group have some investigation in this sense. Wan and Niu (2018), for example, verified whether the recommender influenced the academic score of students and their time to reach a learning objective. Rahman and Abdullah (2018) investigated whether the recommender impacted the time students took to complete a task.

Regarding the purpose of the evaluations, ten distinct research goals were identified. Fig. 3 shows that accuracy investigation occurred more often than the others. Only 1 study did not carry out experiments in this regard. Different traditional metrics were identified for measuring the accuracy of recommenders. The Mean Absolute Error (MAE), in particular, has the highest frequency. Table 6 lists the main metrics identified.

Fig. 3 Evaluation purpose of recommender systems in selected papers
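Since MAE is the accuracy metric reported most often, a minimal sketch of its computation may be useful. The function name and the toy ratings are assumptions for illustration; MAE is the mean of the absolute differences between actual and predicted ratings.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("both lists must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy example with invented ratings on a 1-5 scale.
actual_ratings = [4, 3, 5, 2]
predicted_ratings = [3.5, 3, 4, 3]
mae = mean_absolute_error(actual_ratings, predicted_ratings)
print(mae)
```

Lower values indicate predictions closer to the observed ratings; a value of 0 would mean every rating was predicted exactly.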

The system attractiveness analysis, through the verification of user satisfaction, has the second highest occurrence. It is present in 62.5% (n = 10) of the studies. The pedagogical effectiveness evaluation of the ERS is less frequent and occurs in only 37.5% (n = 6) of them. Experiments to examine recommendation diversity, accuracy of user profile elicitation, evolution process, user experience and interactions, entropy, novelty, and perceived usefulness and ease of use were also identified, albeit to a lesser extent.

Also, 81.25% (n = 13) of the papers presented experiments with multiple purposes. For example, in Wan and Niu (2020) an evaluation is carried out to investigate the recommender’s pedagogical effectiveness, student satisfaction, accuracy, diversity of recommendations and entropy. Only Huang et al. (2019), Fernández-García et al. (2020) and Yanes et al. (2020) evaluated a single recommender system dimension.

The evidence above suggests an engagement of the scientific community in demonstrating, through multidimensional analysis, the quality of the recommender systems developed. However, offline experiments and user studies, particularly those based on questionnaires, are the most adopted and can lead to incomplete or biased interpretations. Thus, these data also signal the need for a greater effort to conduct real-life tests and experiments that lead to an understanding of the real impact of recommenders on the teaching and learning process. Research that synthesizes and discusses the empirical possibilities of evaluating the pedagogical effectiveness of ERS can help to increase the popularity of these experiments.

The analysis of the papers also shows that the results of offline experiments are usually based on a greater amount of data than those of user studies. Among offline experiments, 63.64% (n = 7) of the evaluation datasets have records of more than 100 users. In user studies, on the other hand, sets of up to 100 participants predominate (72.72%, n = 8). In general, the offline assessments with smaller datasets are those that occur in association with a user study. This is because the data for both experiments usually come from the same subjects (Nafea et al., 2019; Tarus et al., 2017). The cost (e.g., time and money) of surveying participants for the experiment is possibly a determining factor in defining appropriate samples.

Furthermore, most offline experiments split the data 70%/30% between training and testing. Nguyen et al. (2021) give some insights in this sense, arguing that this is the most suitable ratio for training and validating machine learning models. Further details on the evaluation approaches and metrics of the recommendation systems are presented in Table 6.
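A 70%/30% split of this kind can be sketched with the standard library alone. The function name, the fixed seed and the use of a random shuffle are assumptions for illustration; the reviewed papers may partition their data differently (e.g., per user or by time).

```python
import random

def split_70_30(records, seed=0):
    """Shuffle the records reproducibly and split them 70% / 30%."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # a fixed seed keeps the split repeatable
    cut = len(shuffled) * 7 // 10           # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Toy example: 100 invented rating records identified by an index.
train_set, test_set = split_70_30(range(100))
print(len(train_set), len(test_set))
```

The model is fitted on the training portion only, and metrics such as MAE are then computed on the held-out portion.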

4.5 What are the limitations and research opportunities related to the teaching and learning support recommender systems field?

The main limitations observed in the selected papers are presented below. They are based on the articles’ explicit statements and on the authors’ formulations. In this section, only those common to the majority of the studies are listed. Next, a set of research opportunities for future investigations is pointed out.

4.5.1 Research limitations

Research limitations are factors that hinder current progress in the ERS field. Knowing these factors can help researchers cope with them in their studies and mitigate the possibility of stagnation in the area, that is, when newly proposed recommenders do not truly generate better outcomes than the baselines (Anelli et al., 2021; Dacrema et al., 2021). As a result of this SLR, research limitations were identified in three strands, which are presented below.

Reproducibility restriction

The majority of the papers report a specifically collected dataset to evaluate the proposed ERS. The main reason for this is the scarcity of public datasets suited to the research needs, as highlighted by some authors (Nabizadeh et al., 2020; Tarus et al., 2017; Wan & Niu, 2018; Wu et al., 2015; Yanes et al., 2020). Such an approach restricts the feasibility of reproducing experiments and makes it difficult to compare recommenders. In fact, this is an old issue in the ERS field. Verbert et al. (2011) observed, at the beginning of the last decade, the need to improve reproducibility and comparison in ERS in order to provide stronger conclusions about their validity and generalizability. Although there was an effort in this direction in the following years, based on broad sharing of educational datasets, most of the known ones (Çano & Morisio, 2015; Drachsler et al., 2015) are currently retired, and the remaining ones have proved insufficient to meet current research demands. Of the analyzed studies, only Wu et al. (2020) use public educational datasets.

Because dataset sharing plays an important role in reproducing and comparing recommender models under the same conditions, this finding highlights the need for a research community effort to create means of meeting this need (e.g., the development of public repositories) in order to mitigate the current reproducibility limitation.

Dataset size / number of subjects

As can be observed in Table 6, few experimental results are based on a large amount of data. Only five studies have information from 1000 or more users. In particular, the offline evaluation conducted by Wu et al. (2015), despite having an extensive dataset, uses MovieLens records and is not based on real information related to teaching and learning. Another limitation concerns the origin of the data, which usually come from a single source (e.g., a class at a college).

Although experiments based on small datasets can reveal the relevance of an ERS, an evaluation based on a large-scale dataset should provide stronger conclusions on recommendation effectiveness (Verbert et al., 2011 ). Experiments based on larger and more diverse data (e.g., users from different areas and domains) would contribute to most generalizable results. On another hand, scarcity of public dataset may be impairing the quantity and diversity of data used on scientific experiments in the ERS field. As reported by Nabizadeh et al. ( 2020 ), the increasement of the size of the experiment is costly in different aspects. If more public dataset were available, researchers would be more likely to find the ones that could be aligned to their needs and, naturally, increasing the size of their experiment. In this sense, they could be favored by reducing data acquisition difficulty and cost. Furthermore, the scientific community would access users’ data out of their surrounding context and could base their experiments on diversified data.

Lack of in-depth investigation of the impact of known issues in the recommendation system field

Cold start, overspecialization and sparsity are some known challenges in the field of recommender systems (Khusro et al., 2016). They are mainly related to a reduced and unequally distributed amount of user feedback or item descriptions used for generating recommendations (Kunaver & Požrl, 2017). These issues also permeate the ERS field. For instance, Cechinel et al. (2011) report that, in a sample of more than 6000 learning objects from the Merlot repository, only a small number of user ratings over items was observed. Cechinel et al. (2013), in turn, observed, in a dataset from the same repository, a pattern in which a few users rated several resources while the vast majority rated 5 or fewer. Since such issues directly impact the quality of recommendations, teaching and learning support recommenders should be evaluated with these issues in mind to clarify to what extent they can be effective in real-life situations. However, in this SLR, we detected a substantial number of papers (43.75%, n = 7) that do not analyze or discuss how the recommenders behave under, or handle, these issues, even partially. Studies that rely on experiments to examine such aspects would elucidate more details of the quality of the proposed systems.
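As an illustration of how these issues could be quantified in a dataset before a recommender is evaluated, the following minimal Python sketch computes the sparsity of a rating matrix, the share of users with few ratings, and the items that have no ratings at all (a simple proxy for item cold start). The function names, the (user, item, rating) tuple layout and the toy data are assumptions made for illustration and are not taken from any of the reviewed studies; the default threshold of five ratings only mirrors the pattern reported by Cechinel et al. (2013).

```python
from collections import Counter


def rating_sparsity(ratings, n_users, n_items):
    """Fraction of user-item pairs that have no rating."""
    observed = len({(user, item) for user, item, _ in ratings})
    return 1.0 - observed / (n_users * n_items)


def share_of_low_activity_users(ratings, threshold=5):
    """Share of rating users who rated `threshold` items or fewer."""
    per_user = Counter(user for user, _, _ in ratings)
    low = sum(1 for count in per_user.values() if count <= threshold)
    return low / len(per_user)


def cold_start_items(ratings, all_items, min_ratings=1):
    """Items with fewer than `min_ratings` ratings, sorted by identifier."""
    per_item = Counter(item for _, item, _ in ratings)
    return sorted(item for item in all_items if per_item[item] < min_ratings)


# Hypothetical toy data: (user, learning object, rating).
toy_ratings = [
    ("u1", "a", 5), ("u1", "b", 4), ("u1", "c", 3),
    ("u2", "a", 2),
    ("u3", "b", 5),
]
toy_items = ["a", "b", "c", "d"]

print(rating_sparsity(toy_ratings, n_users=3, n_items=len(toy_items)))
print(share_of_low_activity_users(toy_ratings, threshold=1))
print(cold_start_items(toy_ratings, toy_items))
```

Reporting figures of this kind alongside accuracy results would make it easier to judge how a recommender behaves when feedback is scarce or unevenly distributed.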

4.5.2 Research opportunities

From the analyzed papers, a set of research opportunities was identified. They are based on gaps related to the subjects explored through the research questions of this SLR. The identified opportunities provide insights into under-explored topics that need further investigation, given their potential to contribute to the advancement of the ERS field. Research opportunities were identified in three strands, which are presented below.

Study of the potential of overlooked user attributes

The papers examined present ERS based on a variety of inputs. Preferences, prior knowledge, learning style, and learning objectives are some examples (Table 5 has the complete list). As reported by Chen and Wang (2021), this is aligned with a current research trend of investigating the relationships between individual differences and personalized learning. Nevertheless, evidence from this SLR also confirms that “some essential individual differences are neglected in existing works” (Chen & Wang, 2021). The sample of papers suggests a lack of studies that incorporate other notably relevant information into the recommendation model, such as the emotional state and cultural context of students (Maravanyika & Dlodlo, 2018; Salazar et al., 2021; Yanes et al., 2020). This indicates that further investigation is needed to clarify the true contributions and the complexities of collecting, measuring and applying these other parameters. In this sense, an open research opportunity is the investigation of these other user attributes in order to explore the impact of such characteristics on the quality of ERS results.

Increased study of the application of ERS in informal learning situations

Informal learning refers to a type of learning that typically occurs outside an educational institution (Pöntinen et al., 2017). In it, learners do not follow a structured curriculum or have a domain expert to guide them (Pöntinen et al., 2017; Santos & Ali, 2012). Such aspects influence how ERS can support users. For instance, in informal settings, content can come from multiple providers and, as a consequence, it can be delivered without taking into account a proper pedagogical sequence. ERS targeting this scenario, in turn, should concentrate on organizing and sequencing recommendations to guide users’ learning process (Drachsler et al., 2009).

Although the literature highlights the existence of significant differences in the design of educational recommenders for formal and informal learning circumstances (Drachsler et al., 2009; Okoye et al., 2012; Manouselis et al., 2013; Harrathi & Braham, 2021), this SLR found that current studies tend not to report this characteristic explicitly. This makes it difficult to obtain a clear picture of the current situation of the field in this dimension. Nonetheless, judging by the characteristics of the proposed ERS, current research seems to be concentrated on the formal learning context. This is because recommenders from the analyzed papers usually use data that are maintained by institutional learning systems. Moreover, recommendations predominantly do not provide pedagogical sequencing to support self-directed and self-paced learning (e.g., recommendations that build a learning path leading to specific knowledge). At the same time, informal learning has increasingly gained the attention of the scientific community with the emergence of the coronavirus pandemic (Watkins & Marsick, 2020).

In view of this, the lack of studies of ERS targeting informal learning settings opens a research opportunity. Specifically, further investigation is needed on the design and evaluation of recommenders that take into consideration different contexts (e.g., location or device used) and that guide users through a learning sequence to achieve specific knowledge. Such recommenders would be particularly valuable given the less structured format that informal learning has in terms of learning objectives and learning support.
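To make the idea of pedagogical sequencing concrete, the sketch below orders a set of recommended learning resources so that prerequisites come before the resources that depend on them. It is only an illustration under stated assumptions: the prerequisite map, the topic names and the function `sequence_recommendations` are hypothetical, and the use of a topological sort is one possible way to build a learning path, not a method drawn from the reviewed papers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sequence_recommendations(recommended, prerequisites):
    """Order recommended items so that prerequisites appear first.

    `prerequisites` maps each item to the set of items that should be
    studied before it. Recommended items missing from the map keep
    their original relative order and are placed at the end.
    """
    # static_order() raises graphlib.CycleError on circular prerequisites.
    order = list(TopologicalSorter(prerequisites).static_order())
    position = {item: index for index, item in enumerate(order)}
    # sorted() is stable, so unknown items keep their relative order.
    return sorted(recommended, key=lambda item: position.get(item, len(order)))


# Hypothetical prerequisite structure for an introductory programming topic.
toy_prerequisites = {
    "loops": {"variables"},
    "functions": {"loops"},
    "recursion": {"functions"},
}

# Items as they might come out of a recommender, in relevance order.
toy_recommended = ["recursion", "variables", "functions"]

print(sequence_recommendations(toy_recommended, toy_prerequisites))
```

In an informal setting, where content comes from multiple providers, the prerequisite relations themselves would first have to be inferred or curated, which is part of the open problem described above.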

Studies on the development of multidimensional evaluation frameworks

Evidence from this study shows that the main purpose of ERS evaluation has been to assess recommender accuracy and user satisfaction (Section 4.4). This result, together with Erdt et al. (2015), reveals two decades of evaluation predominantly based on these two goals. Even though other evaluation purposes have received less attention in research, they are also critical for measuring the success of ERS. Moubayed et al. (2018), for example, highlight two aspects of e-learning system evaluation: one is concerned with how to properly evaluate student performance, and the other refers to measuring learners’ learning gains through system usage. Tahereh et al. (2013) identify stakeholders and indicators associated with technological quality as relevant to consider in educational system assessment. From the perspective of the recommender systems field, there are also important aspects to be analyzed when these systems are applied in the educational domain, such as novelty and diversity (Pu et al., 2011; Cremonesi et al., 2013; Erdt et al., 2015).

In this context, it is noted that, although evaluating recommender accuracy and user satisfaction gives insights into the value of an ERS, these are not sufficient to fully indicate the quality of the system in supporting the learning process. Other factors reported in the literature are also relevant to take into consideration. However, to the best of our knowledge, there is no framework that identifies and organizes the factors to be considered in an ERS evaluation, which makes it difficult for the scientific community to be aware of them and incorporate them in studies.

Because the evaluation of ERS needs to be a joint effort between computer scientists and experts from other domains (Erdt et al., 2015), further investigation should be carried out toward the development of a multidimensional evaluation framework that encompasses evaluation requirements based on a multidisciplinary perspective. Such studies would clarify the different dimensions that have the potential to contribute to better ERS evaluation and could even identify which ones should be prioritized to truly assess learning impact at reduced cost.
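As a hedged illustration of how novelty and diversity could be measured alongside accuracy, the sketch below uses two formulations that are common in the recommender systems literature: intra-list diversity as the average pairwise Jaccard distance between the tag sets of recommended items, and novelty as the mean self-information of the recommended items. The function names, the tag-based item representation and the toy data are assumptions made for illustration; the reviewed papers may define these measures differently.

```python
import math
from itertools import combinations


def intra_list_diversity(rec_list, item_tags):
    """Average pairwise Jaccard distance between tag sets of the items."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0

    def jaccard_distance(a, b):
        tags_a, tags_b = item_tags[a], item_tags[b]
        union = tags_a | tags_b
        if not union:
            return 0.0
        return 1.0 - len(tags_a & tags_b) / len(union)

    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def mean_self_information(rec_list, interaction_counts, n_users):
    """Novelty as the mean of -log2(p), with p the share of users who
    interacted with the item (each item needs at least one interaction)."""
    total = sum(-math.log2(interaction_counts[item] / n_users)
                for item in rec_list)
    return total / len(rec_list)


# Hypothetical learning objects described by topic tags.
toy_tags = {"a": {"x", "y"}, "b": {"x", "y"}, "c": {"z"}}
# Hypothetical number of distinct users who interacted with each item.
toy_counts = {"a": 4, "b": 1}

print(intra_list_diversity(["a", "b", "c"], toy_tags))
print(mean_self_information(["a", "b"], toy_counts, n_users=4))
```

A list of near-identical resources scores close to zero on the first measure, and a list made only of the most popular resources scores close to zero on the second, which are behaviours that accuracy metrics alone do not expose.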

5 Conclusion

In recent years, there has been an extensive scientific effort to develop recommenders that meet different educational needs; however, research is dispersed in the literature and there is no recent study that encompasses the current scientific efforts in the field.

Given this context, this paper presents an SLR that aims to analyze and synthesize the main trends, limitations and research opportunities related to the area of teaching and learning support recommender systems. Specifically, this study contributes to the field by providing a summary and an analysis of the currently available information on teaching and learning support recommender systems in four dimensions: (i) how the recommendations are produced; (ii) how the recommendations are presented to the users; (iii) how the recommender systems are evaluated; and (iv) what the limitations and opportunities for research in the area are.

The evidence is based on primary studies published from 2015 to 2020 in three repositories. This review provides an overarching perspective of current evidence-based practice in ERS in order to support practitioners and researchers in implementation and in defining future research directions. Research limitations and opportunities are also summarized in light of current studies.

The findings, in terms of current trends, show that hybrid techniques are the most used in the field of teaching and learning support recommender systems. Furthermore, approaches that naturally fit a user-centered design (e.g., techniques that allow students’ educational constraints to be represented) have been prioritized over those based on other aspects, such as item characteristics (e.g., the CBF technique). Results show that these approaches have been recognized as the main means to support users with recommendations in their teaching and learning process, and they provide directions for practitioners and researchers who seek to base their activities and investigations on evidence from current studies. On the other hand, this study also reveals that techniques that feature prominently in the broader field of general recommender systems, such as bandit-based and deep learning ones (Barraza-Urbina & Glowacka, 2020; Zhang et al., 2020), have been underexplored, implying a mismatch between the areas. Therefore, the result of this systematic review indicates that a greater scientific effort should be made to investigate the potential of these underexplored approaches.

With respect to recommendation presentation, the organic display is the most used strategy. However, most studies tend not to report details of the approach used, which makes it difficult to understand the state of the art in this dimension. Furthermore, among other results, it is observed that the majority of ERS evaluations are based on the accuracy of recommenders and on user satisfaction analysis. Such a finding opens a research opportunity for the scientific community in the development of multidimensional evaluation frameworks that effectively support the verification of the impact of recommendations on the teaching and learning process.

Lastly, the limitations identified indicate that difficulties in obtaining data to carry out evaluations of ERS have persisted for more than a decade (Verbert et al., 2011) and call for the scientific community’s attention to this situation. Likewise, the lack of in-depth investigation of the impact of known issues in the recommendation system field, another limitation identified, points to the importance of aspects that must be considered in the design and evaluation of these systems in order to better elucidate their potential application in a real scenario.

With regard to research limitations and opportunities, some of this study’s findings indicate the need for a greater effort in conducting evaluations that provide direct evidence of the systems’ pedagogical effectiveness, and the development of a multidimensional evaluation framework for ERS is suggested as a research opportunity. Also, a scarcity of public dataset usage was observed in current studies, which leads to limitations in terms of reproducibility and comparison of recommenders. This seems to be related to the restricted number of public datasets currently available, and it may also be affecting the size of the experiments conducted by researchers.

In terms of limitations of this study, the first refers to the number of data sources used for paper selection. Only the repositories mentioned in Section 3.1 were considered. Thus, the scope of this work is restricted to evidence from publications indexed by these platforms. Furthermore, only publications written in English were examined; thus, results of papers written in other languages are beyond the scope of this work. Also, the research limitations and opportunities presented in Section 4.5 were identified based on the data extracted to answer the research questions of this SLR and are therefore limited to their scope. As a consequence, limitations and opportunities of the ERS field that go beyond this context were neither identified nor discussed in this study. Finally, the SLR was directed to papers published in scientific journals and, because of this, the results obtained do not reflect the state of the area from the perspective of conference publications. Future research is intended to address these limitations.

Data availability statement

The datasets generated during the current study correspond to the papers identified through the systematic literature review and the quality evaluation results (see Section 3.4). They are available from the corresponding author on reasonable request.

http://parsif.al/

References

Anelli, V. W., Bellogín, A., Di Noia, T., & Pomo, C. (2021). Revisioning the comparison between neural collaborative filtering and matrix factorization. Proceedings of the Fifteenth ACM Conference on Recommender Systems , 521–529. https://doi.org/10.1145/3460231.3475944

Ashraf, E., Manickam, S., & Karuppayah, S. (2021). A comprehensive review of course recommender systems in e-learning. Journal of Educators Online, 18 , 23–35. https://www.thejeo.com/archive/2021_18_1/ashraf_manickam__karuppayah

Barraza-Urbina, A., & Glowacka, D. (2020). Introduction to Bandits in Recommender Systems. Proceedings of the Fourteenth ACM Conference on Recommender Systems , 748–750. https://doi.org/10.1145/3383313.3411547

Becker, F. (1993). Teacher epistemology: The daily life of the school (1st ed.). Editora Vozes.

Beel, J., Langer, S., & Genzmehr, M. (2013). Sponsored vs. Organic (Research Paper) Recommendations and the Impact of Labeling. In T. Aalberg, C. Papatheodorou, M. Dobreva, G. Tsakonas, & C. J. Farrugia (Eds.), Research and Advanced Technology for Digital Libraries (Vol. 8092, pp. 391–395). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-40501-3_44

Betoret, F. (2007). The influence of students’ and teachers’ thinking styles on student course satisfaction and on their learning process. Educational Psychology, 27 (2), 219–234. https://doi.org/10.1080/01443410601066701

Bobadilla, J., Serradilla, F., & Hernando, A. (2009). Collaborative filtering adapted to recommender systems of e-learning. Knowledge-Based Systems, 22 (4), 261–265. https://doi.org/10.1016/j.knosys.2009.01.008

Bobadilla, J., Ortega, F., Hernando, A., & Gutiérrez, A. (2013). Recommender systems survey. Knowledge-Based Systems, 46 , 109–132. https://doi.org/10.1016/j.knosys.2013.03.012

Buder, J., & Schwind, C. (2012). Learning with personalized recommender systems: A psychological view. Computers in Human Behavior, 28 (1), 207–216. https://doi.org/10.1016/j.chb.2011.09.002

Çano, E., & Morisio, M. (2015). Characterization of public datasets for Recommender Systems. 2015 IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI) , 249–257. https://doi.org/10.1109/RTSI.2015.7325106

Cazella, S. C., Behar, P. A., Schneider, D., Silva, KKd., & Freitas, R. (2014). Developing a learning objects recommender system based on competences to education: Experience report. New Perspectives in Information Systems and Technologies, 2 , 217–226. https://doi.org/10.1007/978-3-319-05948-8_21

Cechinel, C., Sánchez-Alonso, S., & García-Barriocanal, E. (2011). Statistical profiles of highly-rated learning objects. Computers & Education, 57 (1), 1255–1269. https://doi.org/10.1016/j.compedu.2011.01.012

Cechinel, C., Sicilia, M. -Á., Sánchez-Alonso, S., & García-Barriocanal, E. (2013). Evaluating collaborative filtering recommendations inside large learning object repositories. Information Processing & Management, 49 (1), 34–50. https://doi.org/10.1016/j.ipm.2012.07.004

Chen, S. Y., & Wang, J.-H. (2021). Individual differences and personalized learning: A review and appraisal. Universal Access in the Information Society, 20 (4), 833–849. https://doi.org/10.1007/s10209-020-00753-4

Cremonesi, P., Garzotto, F., & Turrin, R. (2013). User-centric vs. system-centric evaluation of recommender systems. In P. Kotzé, G. Marsden, G. Lindgaard, J. Wesson, & M. Winckler (Eds.), Human-Computer Interaction – INTERACT 2013, 334–351. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-40477-1_21

Dacrema, M. F., Boglio, S., Cremonesi, P., & Jannach, D. (2021). A troubling analysis of reproducibility and progress in recommender systems research. ACM Transactions on Information Systems, 39 (2), 1–49. https://doi.org/10.1145/3434185

Dermeval, D., Coelho, J.A.P.d.M., & Bittencourt, I.I. (2020). Mapeamento Sistemático e Revisão Sistemática da Literatura em Informática na Educação. Metodologia de Pesquisa Científica em Informática na Educação: Abordagem Quantitativa . Porto Alegre.  https://jodi-ojs-tdl.tdl.org/jodi/article/view/442

Drachsler, H., Hummel, H. G. K., & Koper, R. (2009). Identifying the goal, user model and conditions of recommender systems for formal and informal learning. Journal of Digital Information, 10 (2), 1–17. https://jodi-ojs-tdl.tdl.org/jodi/article/view/442

Drachsler, H., Verbert, K., Santos, O. C., & Manouselis, N. (2015). Panorama of Recommender Systems to Support Learning. In F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp. 421–451). Springer. https://doi.org/10.1007/978-1-4899-7637-6_12

Erdt, M., Fernández, A., & Rensing, C. (2015). Evaluating recommender systems for technology enhanced learning: A quantitative survey. IEEE Transactions on Learning Technologies, 8 (4), 326–344. https://doi.org/10.1109/TLT.2015.2438867

Felder, R. (1988). Learning and teaching styles in engineering education. Journal of Engineering Education, 78 , 674–681. Washington.

Fernandez-Garcia, A. J., Rodriguez-Echeverria, R., Preciado, J. C., Manzano, J. M. C., & Sanchez-Figueroa, F. (2020). Creating a recommender system to support higher education students in the subject enrollment decision. IEEE Access, 8 , 189069–189088. https://doi.org/10.1109/ACCESS.2020.3031572

Ferreira, V., Vasconcelos, G., & França, R. (2017). Mapeamento Sistemático sobre Sistemas de Recomendações Educacionais. Proceedings of the XXVIII Brazilian Symposium on Computers in Education , 253–262. https://doi.org/10.5753/cbie.sbie.2017.253

Garcia-Martinez, S., & Hamou-Lhadj, A. (2013). Educational recommender systems: A pedagogical-focused perspective. Multimedia Services in Intelligent Environments. Smart Innovation, Systems and Technologies, 25 , 113–124. https://doi.org/10.1007/978-3-319-00375-7_8

George, G., & Lal, A. M. (2019). Review of ontology-based recommender systems in e-learning. Computers & Education, 142 , 103642–103659. https://doi.org/10.1016/j.compedu.2019.103642

Harrathi, M., & Braham, R. (2021). Recommenders in improving students’ engagement in large scale open learning. Procedia Computer Science, 192 , 1121–1131. https://doi.org/10.1016/j.procs.2021.08.115

Herpich, F., Nunes, F., Petri, G., & Tarouco, L. (2019). How Mobile augmented reality is applied in education? A systematic literature review. Creative Education, 10 , 1589–1627. https://doi.org/10.4236/ce.2019.107115

Huang, L., Wang, C.-D., Chao, H.-Y., Lai, J.-H., & Yu, P. S. (2019). A score prediction approach for optional course recommendation via cross-user-domain collaborative filtering. IEEE Access, 7 , 19550–19563. https://doi.org/10.1109/ACCESS.2019.2897979

Iaquinta, L., Gemmis, M. de, Lops, P., Semeraro, G., Filannino, M., & Molino, P. (2008). Introducing serendipity in a content-based recommender system. Proceedings of the Eighth International Conference on Hybrid Intelligent Systems , 168–173. https://doi.org/10.1109/HIS.2008.25

Isinkaye, F. O., Folajimi, Y. O., & Ojokoh, B. A. (2015). Recommendation systems: Principles, methods and evaluation. Egyptian Informatics Journal, 16 (3), 261–273. https://doi.org/10.1016/j.eij.2015.06.005

Ismail, H. M., Belkhouche, B., & Harous, S. (2019). Framework for personalized content recommendations to support informal learning in massively diverse information Wikis. IEEE Access, 7 , 172752–172773. https://doi.org/10.1109/ACCESS.2019.2956284

Khan, K. S., Kunz, R., Kleijnen, J., & Antes, G. (2003). Five steps to conducting a systematic review. Journal of the Royal Society of Medicine, 96 (3), 118–121. https://doi.org/10.1258/jrsm.96.3.118

Khanal, S. S., Prasad, P. W. C., Alsadoon, A., & Maag, A. (2019). A systematic review: Machine learning based recommendation systems for e-learning. Education and Information Technologies, 25 (4), 2635–2664. https://doi.org/10.1007/s10639-019-10063-9

Khusro, S., Ali, Z., & Ullah, I. (2016). Recommender Systems: Issues, Challenges, and Research Opportunities. In K. Kim & N. Joukov (Eds.), Lecture Notes in Electrical Engineering (Vol. 376, pp. 1179–1189). Springer. https://doi.org/10.1007/978-981-10-0557-2_112

Kitchenham, B. A., & Charters, S. (2007). Guidelines for performing Systematic Literature Reviews in Software Engineering. Technical Report EBSE 2007–001 . Keele University and Durham University Joint Report. https://www.elsevier.com/data/promis_misc/525444systematicreviewsguide.pdf .

Kitchenham, B., Pearl Brereton, O., Budgen, D., Turner, M., Bailey, J., & Linkman, S. (2009). Systematic literature reviews in software engineering – A systematic literature review. Information and Software Technology, 51 (1), 7–15. https://doi.org/10.1016/j.infsof.2008.09.009

Klašnja-Milićević, A., Ivanović, M., & Nanopoulos, A. (2015). Recommender systems in e-learning environments: A survey of the state-of-the-art and possible extensions. Artificial Intelligence Review, 44 (4), 571–604. https://doi.org/10.1007/s10462-015-9440-z

Klašnja-Milićević, A., Vesin, B., & Ivanović, M. (2018). Social tagging strategy for enhancing e-learning experience. Computers & Education, 118 , 166–181. https://doi.org/10.1016/j.compedu.2017.12.002

Kolb, D., Boyatzis, R., & Mainemelis, C. (2001). Experiential Learning Theory: Previous Research and New Directions. Perspectives on Thinking, Learning and Cognitive Styles , 227–247.

Krahenbuhl, K. S. (2016). Student-centered Education and Constructivism: Challenges, Concerns, and Clarity for Teachers. The Clearing House: A Journal of Educational Strategies, Issues and Ideas, 89 (3), 97–105. https://doi.org/10.1080/00098655.2016.1191311

Kunaver, M., & Požrl, T. (2017). Diversity in recommender systems – A survey. Knowledge-Based Systems, 123 , 154–162. https://doi.org/10.1016/j.knosys.2017.02.009

Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H., & Koper, R. (2010). Recommender systems in technology enhanced learning. In F. Ricci, L. Rokach, B. Shapira, & P. Kantor (Eds.), Recommender Systems Handbook (pp. 387–415). Springer. https://doi.org/10.1007/978-0-387-85820-3_12

Manouselis, N., Drachsler, H., Verbert, K., & Santos, O. C. (2014). Recommender systems for technology enhanced learning . Springer. https://doi.org/10.1007/978-1-4939-0530-0

Manouselis, N., Drachsler, H., Verbert, K., & Duval, E. (2013). Challenges and Outlook. Recommender Systems for Learning , 63–76. https://doi.org/10.1007/978-1-4614-4361-2

Maravanyika, M., & Dlodlo, N. (2018). An adaptive framework for recommender-based learning management systems. Open Innovations Conference (OI), 2018 , 203–212. https://doi.org/10.1109/OI.2018.8535816

Maria, S. A. A., Cazella, S. C., & Behar, P. A. (2019). Sistemas de Recomendação: conceitos e técnicas de aplicação. Recomendação Pedagógica em Educação a Distância , 19–47, Penso.

McCombs, B. L. (2013). The Learner-Centered Model: Implications for Research Approaches. In J. Cornelius-White, R. Motschnig-Pitrik, & M. Lux (Eds.), Interdisciplinary Handbook of the Person-Centered Approach , 335–352. https://doi.org/10.1007/978-1-4614-7141-7_23

Medeiros, R. P., Ramalho, G. L., & Falcao, T. P. (2019). A systematic literature review on teaching and learning introductory programming in higher education. IEEE Transactions on Education, 62 (2), 77–90. https://doi.org/10.1109/te.2018.2864133

Moher, D., Shamseer, L., Clarke, M., Ghersi, D., Liberati, A., Petticrew, M., Shekelle, P., Stewart, L. A., PRISMA-P Group. (2015). Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Systematic Reviews, 4 (1), 1. https://doi.org/10.1186/2046-4053-4-1

Moubayed, A., Injadat, M., Nassif, A. B., Lutfiyya, H., & Shami, A. (2018). E-Learning: Challenges and research opportunities using machine learning & data analytics. IEEE Access, 6 , 39117–39138. https://doi.org/10.1109/access.2018.2851790

Nabizadeh, A. H., Gonçalves, D., Gama, S., Jorge, J., & Rafsanjani, H. N. (2020). Adaptive learning path recommender approach using auxiliary learning objects. Computers & Education, 147 , 103777–103793. https://doi.org/10.1016/j.compedu.2019.103777

Nafea, S. M., Siewe, F., & He, Y. (2019). On Recommendation of learning objects using Felder-Silverman learning style model. IEEE Access, 7 , 163034–163048. https://doi.org/10.1109/ACCESS.2019.2935417

Nascimento, P. D., Barreto, R., Primo, T., Gusmão, T., & Oliveira, E. (2017). Recomendação de Objetos de Aprendizagem baseada em Modelos de Estilos de Aprendizagem: Uma Revisão Sistemática da Literatura. Proceedings of XXVIII Brazilian Symposium on Computers in Education- SBIE, 2017 , 213–222. https://doi.org/10.5753/cbie.sbie.2017.213

Nguyen, Q. H., Ly, H.-B., Ho, L. S., Al-Ansari, N., Le, H. V., Tran, V. Q., Prakash, I., & Pham, B. T. (2021). Influence of data splitting on performance of machine learning models in prediction of shear strength of soil. Mathematical Problems in Engineering, 2021 , 1–15. https://doi.org/10.1155/2021/4832864

Nichols, D. M. (1998). Implicit rating and filtering. Proceedings of the Fifth Delos Workshop: Filtering and Collaborative Filtering , 31–36.

Okoye, I., Maull, K., Foster, J., & Sumner, T. (2012). Educational recommendation in an informal intentional learning system. Educational Recommender Systems and Technologies , 1–23. https://doi.org/10.4018/978-1-61350-489-5.ch001

Pai, M., McCulloch, M., Gorman, J. D., Pai, N., Enanoria, W., Kennedy, G., Tharyan, P., & Colford, J. M., Jr. (2004). Systematic reviews and meta-analyses: An illustrated, step-by-step guide. The National Medical Journal of India, 17 (2), 86–95.

Petri, G., & Gresse von Wangenheim, C. (2017). How games for computing education are evaluated? A systematic literature review. Computers & Education, 107 , 68–90. https://doi.org/10.1016/j.compedu.2017.01.00

Petticrew, M., & Roberts, H. (2006). Systematic reviews in the social sciences a practical guide. Blackwell Publishing . https://doi.org/10.1002/9780470754887

Pinho, P. C. R., Barwaldt, R., Espindola, D., Torres, M., Pias, M., Topin, L., Borba, A., & Oliveira, M. (2019). Developments in educational recommendation systems: a systematic review. Proceedings of 2019 IEEE Frontiers in Education Conference (FIE) . https://doi.org/10.1109/FIE43999.2019.9028466

Pöntinen, S., Dillon, P., & Väisänen, P. (2017). Student teachers’ discourse about digital technologies and transitions between formal and informal learning contexts. Education and Information Technologies, 22 (1), 317–335. https://doi.org/10.1007/s10639-015-9450-0

Pu, P., Chen, L., & Hu, R. (2011). A user-centric evaluation framework for recommender systems. Proceedings of the fifth ACM conference on Recommender systems , 157–164. https://doi.org/10.1145/2043932.2043962

Rahman, M. M., & Abdullah, N. A. (2018). A personalized group-based recommendation approach for web search in E-Learning. IEEE Access, 6 , 34166–34178. https://doi.org/10.1109/ACCESS.2018.2850376

Ricci, F., Rokach, L., & Shapira, B. (2015). Recommender Systems: Introduction and Challenges. In F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook , 1–34. https://doi.org/10.1007/978-1-4899-7637-6_1

Rivera, A. C., Tapia-Leon, M., & Lujan-Mora, S. (2018). Recommendation Systems in Education: A Systematic Mapping Study. Proceedings of the International Conference on Information Technology & Systems (ICITS 2018) , 937–947. https://doi.org/10.1007/978-3-319-73450-7_89

Salazar, C., Aguilar, J., Monsalve-Pulido, J., & Montoya, E. (2021). Affective recommender systems in the educational field. A systematic literature review. Computer Science Review, 40 , 100377. https://doi.org/10.1016/j.cosrev.2021.100377

Santos, I. M., & Ali, N. (2012). Exploring the uses of mobile phones to support informal learning. Education and Information Technologies, 17 (2), 187–203. https://doi.org/10.1007/s10639-011-9151-2

Sergis, S., & Sampson, D. G. (2016). Learning object recommendations for teachers based on elicited ICT competence profiles. IEEE Transactions on Learning Technologies, 9 (1), 67–80. https://doi.org/10.1109/TLT.2015.2434824

Shani, G., & Gunawardana, A. (2010). Evaluating recommendation systems. In F. Ricci, L. Rokach, B. Shapira, & P. Kantor (Eds.), Recommender Systems Handbook (pp. 257–297). Springer. https://doi.org/10.1007/978-0-387-85820-3_8

Tahereh, M., Maryam, T. M., Mahdiyeh, M., & Mahmood, K. (2013). Multi dimensional framework for qualitative evaluation in e-learning. 4th International Conference on e-Learning and e-Teaching (ICELET 2013), 69–75. https://doi.org/10.1109/icelet.2013.6681648

Tarus, J. K., Niu, Z., & Yousif, A. (2017). A hybrid knowledge-based recommender system for e-learning based on ontology and sequential pattern mining. Future Generation Computer Systems, 72 , 37–48. https://doi.org/10.1016/j.future.2017.02.049

Tarus, J. K., Niu, Z., & Mustafa, G. (2018). Knowledge-based recommendation: A review of ontology-based recommender systems for e-learning. Artificial Intelligence Review, 50 (1), 21–48. https://doi.org/10.1007/s10462-017-9539-5

Verbert, K., Manouselis, N., Ochoa, X., Wolpers, M., Drachsler, H., Bosnic, I., & Duval, E. (2012). Context-aware recommender systems for learning: A survey and future challenges. IEEE Transactions on Learning Technologies, 5 (4), 318–335. https://doi.org/10.1109/TLT.2012.11

Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., & Duval, E. (2011). Dataset-Driven Research for Improving Recommender Systems for Learning. Proceedings of the 1st International Conference on Learning Analytics and Knowledge , 44–53. https://doi.org/10.1145/2090116.2090122

Wan, S., & Niu, Z. (2016). A learner oriented learning recommendation approach based on mixed concept mapping and immune algorithm. Knowledge-Based Systems, 103 , 28–40. https://doi.org/10.1016/j.knosys.2016.03.022

Wan, S., & Niu, Z. (2018). An e-learning recommendation approach based on the self-organization of learning resource. Knowledge-Based Systems, 160 , 71–87. https://doi.org/10.1016/j.knosys.2018.06.014

Wan, S., & Niu, Z. (2020). A hybrid E-Learning recommendation approach based on learners’ influence propagation. IEEE Transactions on Knowledge and Data Engineering, 32 (5), 827–840. https://doi.org/10.1109/TKDE.2019.2895033

Watkins, K. E., & Marsick, V. J. (2020). Informal and incidental learning in the time of COVID-19. Advances in Developing Human Resources, 23 (1), 88–96. https://doi.org/10.1177/1523422320973656

Wu, D., Lu, J., & Zhang, G. (2015). A Fuzzy Tree Matching-based personalized E-Learning recommender system. IEEE Transactions on Fuzzy Systems, 23 (6), 2412–2426. https://doi.org/10.1109/TFUZZ.2015.2426201

Wu, Z., Li, M., Tang, Y., & Liang, Q. (2020). Exercise recommendation based on knowledge concept prediction. Knowledge-Based Systems, 210 , 106481–106492. https://doi.org/10.1016/j.knosys.2020.106481

Yanes, N., Mostafa, A. M., Ezz, M., & Almuayqil, S. N. (2020). A machine learning-based recommender system for improving students learning experiences. IEEE Access, 8 , 201218–201235. https://doi.org/10.1109/ACCESS.2020.3036336

Zapata, A., Menéndez, V. H., Prieto, M. E., & Romero, C. (2015). Evaluation and selection of group recommendation strategies for collaborative searching of learning objects. International Journal of Human-Computer Studies, 76 , 22–39. https://doi.org/10.1016/j.ijhcs.2014.12.002

Zhang, S., Yao, L., Sun, A., & Tay, Y. (2020). Deep learning based recommender system. ACM Computing Surveys, 52 (1), 1–38. https://doi.org/10.1145/3285029

Zhong, J., Xie, H., & Wang, F. L. (2019). The research trends in recommender systems for e-learning: A systematic review of SSCI journal articles from 2014 to 2018. Asian Association of Open Universities Journal, 14 (1), 12–27. https://doi.org/10.1108/AAOUJ-03-2019-0015

Author information

Authors and affiliations.

Centro de Estudos Interdisciplinares em Novas Tecnologias da Educação, Universidade Federal do Rio Grande do Sul, Porto Alegre, Rio Grande do Sul, Brazil

Felipe Leite da Silva, Bruna Kin Slodkowski, Ketia Kellen Araújo da Silva & Sílvio César Cazella

Departamento de Ciências Exatas e Sociais Aplicadas, Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, Rio Grande do Sul, Brazil

Sílvio César Cazella

Contributions

Felipe Leite da Silva: Conceptualization, Methodology approach, Data curation, Writing – original draft. Bruna Kin Slodkowski: Data curation, Writing – original draft. Ketia Kellen Araújo da Silva: Data curation, Writing – original draft. Sílvio César Cazella: Supervision and Monitoring of the research; Writing – review & editing.

Corresponding author

Correspondence to Felipe Leite da Silva .

Ethics declarations

Informed consent.

This research does not involve human participants as research subjects; therefore, research subject consent does not apply.

The authors consent to the content presented in the submitted manuscript.

Financial and non-financial interests

The authors have no relevant financial or non-financial interests to disclose.

Research involving human participants and/or animals

This research does not involve an experiment with human or animal participation.

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 241 KB)

Rights and permissions.

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

da Silva, F.L., Slodkowski, B.K., da Silva, K.K.A. et al. A systematic literature review on educational recommender systems for teaching and learning: research trends, limitations and opportunities. Educ Inf Technol 28 , 3289–3328 (2023). https://doi.org/10.1007/s10639-022-11341-9

Received : 05 November 2021

Accepted : 05 September 2022

Published : 14 September 2022

Issue Date : March 2023

DOI : https://doi.org/10.1007/s10639-022-11341-9

  • Educational recommender systems
  • E-learning recommendation
  • Systematic literature review
  • Computer-mediated teaching and learning

Student Guide

New MyStudies functionality for master's thesis management launched on 19 August 2024

Photo: Unto Rautio

Master’s thesis management will change in August 2024 with the introduction of a new thesis functionality in the MyStudies portal. When starting a master’s thesis, students will enter the basic information of the thesis, such as the topic, supervisor and advisors, in MyStudies and apply for the supervisor's approval on the platform. The supervisor approves the topic and advisors. In the field of technology (the schools of CHEM, ENG, ELEC and SCI), the topic, supervisor and advisors are no longer approved by the degree programme committees. Students can start a thesis process and apply for the necessary approvals at any time, taking into account their degree programme's instructions on when to start and how to schedule the thesis.

The final thesis is submitted to MyStudies for evaluation, approval and archiving. 

The new process brings the basic information and administrative steps of a thesis onto a single platform and allows better support and monitoring of the thesis process.

The eAge applications for master’s theses were closed on 1 August 2024, with the exception of the School of Business. 

Below you can see a more detailed timetable for the implementation of the new functionality: 

| School | Starting a thesis process on MyStudies, from | Submitting a thesis on MyStudies, from | eAge forms close for master’s theses |
|--------|----------------------------------------------|----------------------------------------|--------------------------------------|
| ARTS | 19.8.2024 | 19.8.2024 (next deadline 18.11.) | 1.8.2024 |
| BIZ | 19.8.2024 | October 2024 (preliminary) | 1.10.2024 (preliminary) |
| CHEM | 19.8.2024 | 19.8.2024 (next deadline 30.9.) | 1.8.2024 |
| ENG | 19.8.2024 | 19.8.2024 (next deadline 30.9.) | 1.8.2024 |
| ELEC | 19.8.2024 | 19.8.2024 (next deadline 30.9.) | 1.8.2024 |
| SCI | 19.8.2024 | 19.8.2024 (next deadline 30.9.) | 1.8.2024 |

If you are currently working on your Master's thesis, you will receive an email with more detailed instructions on how to submit your thesis for evaluation on  MyStudies .  

MyStudies is a digital portal for students. On MyStudies you will find your success team, a dedicated team of experts supporting you in your studies. If team members offer appointments on the platform, you can book them there. Advisors may write notes on the appointments and share them with you in MyStudies. The portal also brings together the most important student services, some of which include appointment booking, as well as direct links to the most frequently used pages.

For more information, contact Learning Services ( [email protected] ) or Project Manager Elsa Kivi-Koskinen ( [email protected] ). 

MyStudies (URL) (external link)

MyStudies is a daily desktop for Aalto students. Log in to MyStudies here.

Find and contact your study support team on MyStudies

Connect with Study Coordinators and Academic Advisors in MyStudies.

  • Published: 19.8.2024
  • Updated: 19.8.2024

Purdue University Graduate School

File(s) under embargo

until file(s) become available

Digital Inertia Programming

Vibration is ubiquitous in the modern world, making it a topic that cannot be avoided during design, manufacture, and maintenance. Systems, such as civil structures and suspension of cars, are normally designed to stay in the attenuation zone to avoid harsh vibrations. Designing and manufacturing systems with the desired natural frequency distribution is easy. However, it is much harder to maintain the frequency response since materials keep aging as time goes by. To counter the effect of aging and attenuate vibrations, this thesis designed a meta-material that is capable of reprogramming its natural frequency distribution by inserting various masses at different locations. This ability to specifically adjust the system's natural frequency distribution is what we define as "Digital Inertia Programming".

The model consists of 12 identical unit cells, with each unit cell comprising two types of springs. By determining whether to insert a mass into the unit cell at various locations, the model achieves its programmability to adjust its natural frequency distribution. A "Binary Representation" is used to label the patterns of mass inserted in the model. Each unit cell is represented by a binary bit and a total of 12 bits are used to indicate the presence of mass in each unit cell. In the thesis, we mainly discuss bilaterally symmetrical patterns to avoid unwanted twisting. For the 12 unit cells, we can obtain a total of 128 bilaterally symmetrical patterns, resulting in 896 independent natural frequencies for the model. The number of patterns and independent natural frequencies will increase exponentially with the increase of the number of unit cells in the model.

An ideal one-dimensional analytical metamaterial model is developed. Lagrange's method is used to determine the system's mass matrix and stiffness matrix directly from the kinetic energy and potential energy equations. The natural frequencies and mode shapes are then calculated from the eigenvalue equation. Based on free response analysis and sensitivity analysis, the model successfully showed great programmability on frequency distribution by varying the insert patterns, as well as changing the value of the variables in the model, such as the weight of the inserts, the weight of the top mass, the stiffness of the unit cell wall spring and the stiffness of the connecting spring. When continuously varying the parameter, the model's natural frequency distribution also changes continuously, giving a possibility to adjust the natural frequency distribution by carefully adjusting the weight of the mass inserted at each location. Lastly, a forced-response analysis is performed, and the amplitude of the model's frequency response is plotted. This provides a straightforward view of the changes in the band gaps and the overall stiffness of the model by altering the patterns with two inserts.
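The eigenvalue step described above can be illustrated with a minimal sketch. It is not the thesis's model: it assumes a plain fixed-free chain of point masses joined by single springs (the thesis's unit cells use two spring types), and the names `natural_frequencies`, `masses_for_pattern`, `base_mass`, and `insert_mass` are hypothetical. Given a mass matrix M and stiffness matrix K, the natural frequencies are the square roots of the eigenvalues of the generalized problem K φ = ω² M φ.

```python
import numpy as np

def masses_for_pattern(pattern, base_mass=1.0, insert_mass=0.5):
    """Map a binary insert pattern to per-cell masses (illustrative values only)."""
    bits = np.asarray(pattern, dtype=float)
    return base_mass + insert_mass * bits

def natural_frequencies(masses, stiffnesses):
    """Natural frequencies (rad/s) of a fixed-free spring-mass chain.

    stiffnesses[i] is the spring joining mass i to mass i-1,
    or to the fixed wall when i == 0.
    """
    m = np.asarray(masses, dtype=float)
    k = np.asarray(stiffnesses, dtype=float)
    n = len(m)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] += k[i]                 # spring on the wall side of mass i
        if i + 1 < n:
            K[i, i] += k[i + 1]         # spring on the far side of mass i
            K[i, i + 1] -= k[i + 1]
            K[i + 1, i] -= k[i + 1]
    # M is diagonal, so K phi = w^2 M phi reduces to a symmetric
    # standard problem for M^(-1/2) K M^(-1/2).
    inv_sqrt_m = 1.0 / np.sqrt(m)
    A = K * np.outer(inv_sqrt_m, inv_sqrt_m)
    eigvals = np.linalg.eigvalsh(A)     # ascending eigenvalues w^2
    return np.sqrt(np.clip(eigvals, 0.0, None))

if __name__ == "__main__":
    springs = [1.0, 1.0, 1.0, 1.0]
    for pattern in ([0, 0, 0, 0], [1, 0, 0, 1]):
        freqs = natural_frequencies(masses_for_pattern(pattern), springs)
        print(pattern, np.round(freqs, 4))
```

In this toy chain, switching an insert on can only lower (or leave unchanged) each natural frequency, which is one simple way to see how a binary insert pattern shifts the frequency distribution.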

A two-dimensional model is developed based on the one-dimensional model. The model retains the same 12 unit cells setup as the one-dimensional model. Aiming to ensure stability, the rectangular-shaped unit cell is now configured as a combination of two triangles. Taylor expansion and small angle approximation are used to eliminate nonlinear terms and trigonometric function terms in the stiffness matrix respectively. The model again shows its programmability by adjusting the variables of the model. Since the results of asymmetrical patterns are bounded by the results of symmetrical patterns, including the asymmetrical patterns increases the model's precision. However, the symmetrical patterns already provide a good representation of the model. The rotational motion is added to the inserts in the model, which further increases the model's complexity. In the model, the mode shapes are characterized by the rotational motion of inserts and the horizontal motion of inserts, which correspond to a zero strain mode of the model. A linear regression model is trained based on 100 bilaterally symmetrical patterns to predict the second lowest natural frequencies of the two-dimensional model for both symmetrical and asymmetrical patterns. The success of the linear regression model indicates the potential for applying machine learning algorithms to the design of metamaterials in the future.

Degree Type

  • Master of Science
  • Mechanical Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Additional committee member 2

Additional committee member 3

Usage metrics

  • Dynamics, vibration and vibration control

CC BY 4.0

How Did Mpox Become a Global Emergency? What’s Next?

The virus is evolving, and the newest version spreads more often through heterosexual populations. Sweden reported the first case outside Africa.

A doctor in yellow protective gear and white gloves examines the head of a young boy in a makeshift tent.

By Apoorva Mandavilli

Apoorva Mandavilli covered the 2022 mpox outbreak and the Covid-19 pandemic.

Faced once again with a rapidly spreading epidemic of mpox, the World Health Organization on Wednesday declared a global health emergency. The last time the W.H.O. made that call was in 2022, when the disease was still called monkeypox.

Ultimately the outbreak affected nearly 100,000 people worldwide, primarily gay and bisexual men, including more than 32,000 in the United States.

The W.H.O.’s decision this time was prompted by an escalating crisis of mpox concentrated in the Democratic Republic of Congo. It recently spread to a dozen other African countries. If it is not contained, the virus again may rampage all over the world, experts warned.

On Thursday, Sweden reported the first case of a deadlier form of mpox outside Africa , in a person who had traveled to the continent. “Occasional imported cases like the current one may continue to occur,” the country’s public health agency warned.

“There’s a need for concerted effort by all stakeholders, not only in Africa, but everywhere else,” Dr. Dimie Ogoina, a Nigerian scientist and chair of the W.H.O.’s mpox emergency committee, said on Wednesday.

Congo alone has reported 15,600 mpox cases and 537 deaths, most of them among children under 15, indicating that the nature of the disease and its mode of spread may have changed.

IMAGES

  1. (PDF) Recommendation System for Thesis Topics Using Content-based Filtering

  2. Traditional book recommendation process based on collaborative...

  3. CHAPTER 5 SUMMARY, FINDINGS

  4. Recommendation Thesis

  5. Recommendation Letter For Phd Admission Collection

  6. Phd. Thesis : Temporal Recommendation

COMMENTS

  1. (PDF) Recommendation System for Thesis Topics Using ...

    Therefore, a recommendation system is needed to classify thesis topics based on the students' interest and abilities. This study developed a recommendation system for thesis topics using ...

  2. A Comparative Study of Recommendation Systems

    Objective Function of a Recommendation System The three main steps in building a Recommendation System are: loading and formatting the data, calculating the similarity between the users or between the items, and predicting the unknown ratings for the users. As discussed earlier, the data can be collected.

  3. Research-paper recommender systems: a literature survey

    In 1998, Giles et al. introduced the first research-paper recommender system as part of the CiteSeer project []. Since then, at least 216 articles relating to 120 research-paper recommendation approaches were published [2-217]. The amount of literature and approaches represents a problem for new researchers: they do not know which of the articles are most relevant, and which recommendation ...

  4. (PDF) Recommender Systems: An Overview, Research Trends, and Future

    Abstract: Recommender system (RS) has emerged as a major research interest that aims to help users to find items online by providing suggestions that closely match their interest. This paper ...

  5. [PDF] Recommendation System for Thesis Topics Using Content-based

    This study developed a recommendation system for thesis topics using content-based filtering, where students are asked to choose the courses they are interested in, along with their grades. After getting all the required data, the recommendation system will process the data and then show the title and the abstract of publication ...

  6. Scientific paper recommendation systems: a literature review of recent

    A simple problem definition of a paper recommendation system could be the ... (3.08%) which have not yet been published otherwise. There has been one master's thesis (1.54%) within scope. ... in their discussion of papers per year which were published in the area of the topic paper or citation recommendation but later on only studied 62 ...

  7. Scientific paper recommendation systems: a literature review ...

    In this chapter we first clearly define the scope of our literature review (see Sect. 3.1) before we conduct a meta-analysis on the observed papers (see Sect. 3.2). Afterwards our categorisation or lack thereof is discussed in depth (see Sect. 3.3), before we give short overviews of all paper recommendation systems we found (see Sect. 3.5) and some other relevant related work (see Sect. 3.6).

  8. How to Write Recommendations in Research

    Recommendations for future research should be: Concrete and specific. Supported with a clear rationale. Directly connected to your research. Overall, strive to highlight ways other researchers can reproduce or replicate your results to draw further conclusions, and suggest different directions that future research can take, if applicable.

  9. Scientific paper recommendation systems: a literature review of recent

    Master's thesis, KTH, School of Electrical Engineering and Computer Science (EECS) (2021) ... A scholarly recommendation system is an important tool for identifying prior and related resources such as literature, datasets, grants, and collaborators. ... Differentiable Topics Guided New Paper Recommendation Neural Information Processing 10. ...

  10. (PDF) Thesis topic recommendation using simple multi ...

    The designed system runs well and can provide recommendations on the selection of thesis topics. This recommendation provided an overview of the strength of competencies that are ...

  11. PDF Scientific paper recommendation systems: a literature review ...

    3.3.2 Current categorisation. Recent paper recommendation systems can be categorised in 20 different dimensions by general information on the approach (G), already existing data directly taken from the papers used (D) and methods which might create or (re-)structure data, which are part of the approach (M):

  12. (PDF) Recommendation System for Thesis Topics Using Content-based

    The system built in this study is the CBR system to make recommendations on the topic of student thesis concentration. This study used data from undergraduate students of Informatics Engineering IST AKPRIND Yogyakarta with a total of 115 data consisting of 80 training data and 35 test data.

  13. Recommender systems: Trends and frontiers

    Recent research work on topics such as multistakeholder recommendation, system biases, fairness and various potentially negative effects of recommender systems started to address these important questions (Abdollahpouri et al. 2020; Deldjoo et al. 2021; Ekstrand et al. 2021). However, still too often these problems are mainly addressed from a ...

  14. PDF Dissertation Recommender System: Design & Development

    University (IHU) has inspired the dissertation recommender system that is being proposed in this Thesis. The topic assignment to students is roughly a three-step process. First, staff with teaching assignments (referred to as advisors in the Thesis), local or adjunct, announce dissertation topics. Then, the students submit a ranked list of up ...

  15. PDF Recommendation Systems on E-Learning and Social Learning: A ...

    recommendation system can take many forms based on many concepts: 2.1.1 Content-based approach (Mobasher, 2007; Wang et al., 2018) This type of recommendation system is mainly based on content analyzing documents, resources and objects

  16. A systematic review and research perspective on ...

    Recommender systems are efficient tools for filtering online information, which is widespread owing to the changing habits of computer users, personalization trends, and emerging access to the internet. Even though the recent recommender systems are eminent in giving precise recommendations, they suffer from various limitations and challenges like scalability, cold-start, sparsity, etc. Due to ...

  17. "Deep Learning for Recommender Systems" by Travis Akira Ebesu

    Ebesu, Travis Akira, "Deep Learning for Recommender Systems" (2019). Engineering Ph.D. Theses. 22. The widespread adoption of the Internet has led to an explosion in the number of choices available to consumers. Users begin to expect personalized content in modern E-commerce, entertainment and social media platforms.

  18. PDF Recommendation System for Thesis Topics Using Content ...

    3.3 System Display The recommendation system is implemented as a web application. The front end is written in native JavaScript, and the back end in Python with the Flask framework. The ...

  19. PDF A Multi-objective Recommendation System a Thesis Submitted to The

    Approval of the thesis: A MULTI-OBJECTIVE RECOMMENDATION SYSTEM submitted by MAKBULE GÜLÇİN ÖZSOY in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Engineering Department, Middle East Technical University by, Prof. Dr. Gülbin Dural Ünver Dean, Graduate School of Natural and Applied Sciences

  20. List of recommender system dissertations

    Dissertation within a year are sorted alphabetically by title. Contents. 1 2020s. 1.1 2022; 1.2 2021; ... A Model-Based Music Recommendation System for Individual Users and Implicit User Groups - Yajie Hu; ... Effective tag recommendation system based on topic ontology - V Subramaniyaswamy;

  21. A systematic literature review on educational recommender ...

    This study is based on the SLR methodology for gathering evidence related to the research topic investigated. As stated by Kitchenham and Charters and Kitchenham et al. (), this method provides the means to aggregate evidence from current research, prioritizing the impartiality and reproducibility of the review. Therefore, a SLR is based on a process that entails the development of a review ...

  22. New MyStudies functionality for master's thesis management launched on

    The Master's thesis management will change in August 2024 with the introduction of a new thesis functionality in MyStudies portal. When starting a master's thesis, students will enter the basic information of the thesis, such as the topic, supervisor and advisors, in MyStudies and apply for the approval of the supervisor on the platform. The supervisor approves the topic and advisors.

  23. Digital Inertia Programming

    Vibration is ubiquitous in the modern world, making it a topic that cannot be avoided during design, manufacture, and maintenance. Systems, such as civil structures and suspension of cars, are normally designed to stay in the attenuation zone to avoid harsh vibrations. Designing and manufacturing systems with the desired natural frequency distribution is easy. However, it is much harder to ...

  24. Systematic Review of Recommendation Systems for Course Selection

    Department of Computer Science, University of Idaho, Moscow, ID 83843, USA; [email protected]. * Correspondence: [email protected]. Abstract: Course recommender systems play an ...

  25. Enhancing vulnerability prioritization with asset context and EPSS

    Figure 2. Exposed devices with their criticality level in the recommendation object. You can also use the critical devices filter to display only recommendations that involve critical assets, as shown in figure 3. Figure 3. Capability to filter and display only recommendations that involve critical assets.

  26. How Did Mpox Become a Global Emergency? What's Next?

    Officials have not solved the problems that hobbled the response in 2022, including poor uptake of the vaccine and "a shockingly underfunded S.T.I. public health system," Mr. Harvey said.