Twitter Recommendation Algorithm
• There are many areas of the app where Tweets are recommended: Search, Explore, and Ads.
• The recommendation system selects a small set of top Tweets from the approximately 500 million Tweets posted every day and displays them on your device's For You timeline.
Source: https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm

Recommender System Setup
Like most recommender systems, Twitter's recommendation system consists of 3 main stages:
1. Candidate Sourcing: finding the best Tweets from different recommendation sources.
2. Ranking using an ML model.
3. Applying heuristics and filters, such as removing Tweets from blocked or muted users, balancing the mix of Tweets, and respecting any negative feedback the user has given, to produce the final feed.
(Diagram: Tweets, Candidate Generators, Ranker, Home page)

Candidate Sources
• Candidate Sourcing retrieves 1,500 Tweets: 50% from people you follow (in-network) and 50% from people you don't follow (out-of-network).

In-Network Source Tweets
• The In-Network source uses a logistic regression model and the Real Graph (Twitter's graph model) to rank Tweets from users you follow.
• The logistic regression model efficiently ranks Tweets by relevance, while the Real Graph model predicts the likelihood of engagement between two users and helps prioritize Tweets from the people you have a stronger connection with.

Out-of-Network Source Tweets (Social Graph)
• Twitter uses two approaches to find relevant Tweets outside a user's network: Social Graph and Embedding Spaces.
• Social Graph estimates relevance by analyzing the engagements of people you follow or of people with similar interests, using GraphJet (Twitter's graph processing engine). It generates candidate Tweets from recent engagements and ranks them using a logistic regression model.
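To make the Social Graph idea concrete, here is a minimal sketch in Python. It is not Twitter's implementation: the function name, the data layout, and the scoring rule (counting how many followed accounts engaged with a Tweet) are all assumptions chosen for illustration, standing in for the GraphJet traversal and the logistic regression ranker.

```python
from collections import Counter


def social_graph_candidates(user, follows, engagements, authors, limit=3):
    """Return out-of-network Tweet ids that accounts followed by `user` engaged with.

    follows:     dict mapping a user to the set of users they follow
    engagements: iterable of (user, tweet_id) pairs for recent engagements
    authors:     dict mapping tweet_id to the Tweet's author
    All three structures are hypothetical stand-ins for the real graph.
    """
    followed = follows.get(user, set())
    counts = Counter()
    # De-duplicate so one account engaging twice with a Tweet counts once.
    for engager, tweet_id in set(engagements):
        author = authors[tweet_id]
        # Keep Tweets surfaced by followed accounts but written by someone
        # the user does not follow (out-of-network) and not by the user.
        if engager in followed and author not in followed and author != user:
            counts[tweet_id] += 1
    # Assumed scoring: more engaging followees ranks higher; ties are
    # broken by tweet id so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tweet_id for tweet_id, _ in ranked[:limit]]


follows = {"alice": {"bob", "carol"}}
engagements = [
    ("bob", "t1"), ("carol", "t1"),  # two followed accounts engaged with t1
    ("bob", "t2"),                   # one followed account engaged with t2
    ("dave", "t3"),                  # dave is not followed by alice
    ("carol", "t4"),                 # t4 is written by bob (in-network)
]
authors = {"t1": "erin", "t2": "frank", "t3": "erin", "t4": "bob"}

candidates = social_graph_candidates("alice", follows, engagements, authors)
print(candidates)
```

In this toy data only t1 and t2 qualify as out-of-network candidates for alice. A real system would replace the count with a learned relevance score.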
Out-of-Network Source Tweets (Embedding Spaces)
• Embedding Spaces generate numerical representations of users and Tweets in a community space. Twitter's SimClusters algorithm discovers communities anchored by influential users using a custom matrix factorization algorithm.
• Users and Tweets are then embedded into this community space.
• Tweets are embedded into these communities by looking at their current popularity in each community.
• Similarly, users are embedded into communities based on their affinity towards each community. A user can be interested in dozens of communities.

Ranking using ML Model
• The 1,500 sourced candidates are ranked by a ~48M-parameter neural network that is continuously trained on Tweet interactions. It takes thousands of features into account to predict the probability of each type of engagement (e.g. Likes, Retweets, and Replies).

Filtering & Blending
• After ranking, heuristics and filters are applied to remove Tweets from blocked or muted users, balance the mix of Tweets, etc.
• As the last step in the process, the system blends Tweets with non-Tweet content such as Ads, Follow Recommendations, and Onboarding prompts, and returns the result to the user's device for display.
• The Home page pipeline runs 5 billion times per day and completes in under 1.5 seconds on average.

Thank you!!
Check out other videos on the DATATREK channel: https://youtube.com/@datatrek
Video Link: https://youtu.be/IhGq9jgcxFM
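As a supplement to the Embedding Spaces slide, the sketch below shows one way users and Tweets could share a community space. It is an illustration under stated assumptions, not the SimClusters code: the community names, the affinity values, and the helper names `tweet_embedding` and `cosine` are hypothetical, and cosine similarity is an assumed choice of relevance measure.

```python
import math
from collections import defaultdict


def tweet_embedding(engaging_users, user_embeddings):
    """Build a Tweet's community vector by summing the community vectors
    of the users who engaged with it (a proxy for its popularity in each
    community)."""
    embedding = defaultdict(float)
    for user in engaging_users:
        for community, affinity in user_embeddings[user].items():
            embedding[community] += affinity
    return dict(embedding)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user affinities towards two made-up communities.
user_embeddings = {
    "alice": {"ml": 0.9, "football": 0.1},
    "bob": {"ml": 0.8},
    "carol": {"football": 1.0},
}

ml_tweet = tweet_embedding(["bob"], user_embeddings)        # liked by bob
sports_tweet = tweet_embedding(["carol"], user_embeddings)  # liked by carol

alice = user_embeddings["alice"]
ml_score = cosine(alice, ml_tweet)
sports_score = cosine(alice, sports_tweet)
print(ml_score > sports_score)
```

Because alice's affinity is concentrated in the "ml" community, the Tweet that is popular there scores higher for her than the one popular in "football", even though she follows neither author.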