Methods for Finding Related Reddit Subreddits with Simple Set Theory (2024)

I recently wrote a post on how to visualize network graphs of Reddit subreddits.

One of the reasons I’ve been researching the topic is to find a good way to facilitate discovery of lesser-known subreddits, as Reddit is doing a terrible job at it (although they have been trying a few new experiments very recently). As it turns out, invoking graph theory is overkill. Even fancy machine learning approaches like collaborative filtering, while powerful, may not be required to help Redditors discover new things.

Let’s say we have two sets: Set A, the set of active users in a given subreddit, and Set B, the set of active users in another subreddit. The intersection of Sets A and B (A ∩ B) represents users who are active in both subreddits. In the formulas below, (A), (B), and (A ∩ B) denote the number of users in each set.
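
To make the set notation concrete, here is a tiny Python sketch. The usernames are hypothetical and exist only for illustration; they are not taken from the Reddit data.

```python
# Hypothetical usernames, purely for illustration.
active_a = {"alice", "bob", "carol", "dave"}  # active users of subreddit A
active_b = {"carol", "dave", "erin"}          # active users of subreddit B

# A ∩ B: users who are active in both subreddits.
shared = active_a & active_b

print(sorted(shared))  # ['carol', 'dave']
print(len(shared))     # 2
```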

Using BigQuery, I can get the comment data from ALL public Reddit subreddits; this technique would not work well on any smaller subset. The network graph edgelist, obtained as described in my previous post, conveniently gives (A ∩ B): it contains the number of shared active users for every pair of subreddits (defining “active users” as users who have commented in at least 5 unique threads in a given subreddit within the past 6 months).
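
The "active user" definition can be sketched in plain Python. This is not the BigQuery query from the post; it is a small stand-in that assumes the comments have already been restricted to the 6-month window and arrive as hypothetical (author, subreddit, thread_id) tuples. The function name and its parameters are illustrative.

```python
from collections import defaultdict

def active_users(comments, min_threads=5):
    """Map each subreddit to the set of authors who commented in at
    least `min_threads` distinct threads of that subreddit."""
    threads = defaultdict(set)  # (subreddit, author) -> distinct thread ids
    for author, subreddit, thread_id in comments:
        threads[(subreddit, author)].add(thread_id)

    active = defaultdict(set)
    for (subreddit, author), ids in threads.items():
        if len(ids) >= min_threads:
            active[subreddit].add(author)
    return dict(active)

# Hypothetical data: u1 comments in 5 distinct threads, u2 in only 4
# (the repeated comment in thread t0 does not count twice).
comments = (
    [("u1", "aww", f"t{i}") for i in range(5)]
    + [("u2", "aww", f"t{i}") for i in range(4)]
    + [("u2", "aww", "t0")]
)
active = active_users(comments)
print(active)  # {'aww': {'u1'}}
```

Counting distinct threads rather than raw comments keeps one long argument in a single thread from making someone look like a regular.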

In this case, we can filter the edgelist to keep only intersections with at least 10 shared active users; this excludes dead and personal subreddits.

We can run another similar query to get the number of active users for each subreddit.
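
As a rough sketch of these two steps, the following Python builds a filtered edgelist and the per-subreddit counts from a mapping of subreddit to active-user set. The function names, the integer user ids, and the input data are all hypothetical. The all-pairs loop is quadratic in the number of subreddits, so it is only meant to illustrate the logic, not to replace a query over the full dataset.

```python
from itertools import combinations

def build_edgelist(active, min_shared=10):
    """Return {(sub_a, sub_b): shared active users} for every pair of
    subreddits sharing at least `min_shared` active users."""
    edges = {}
    for a, b in combinations(sorted(active), 2):
        n_shared = len(active[a] & active[b])
        if n_shared >= min_shared:
            edges[(a, b)] = n_shared
    return edges

def subreddit_sizes(active):
    """Return {subreddit: number of active users}."""
    return {sub: len(users) for sub, users in active.items()}

# Hypothetical active-user sets; integers stand in for user ids.
active = {
    "aww": set(range(0, 30)),
    "cats": set(range(15, 40)),
    "golf": set(range(28, 50)),
}
edges = build_edgelist(active)
sizes = subreddit_sizes(active)
print(edges)  # {('aww', 'cats'): 15, ('cats', 'golf'): 12}
print(sizes)  # {'aww': 30, 'cats': 25, 'golf': 22}
```

The aww/golf pair shares only 2 users here, so the threshold of 10 drops it from the edgelist.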

After that, for a given subreddit A, find:

(A ∩ B) / (B)

for all subreddits B where (A ∩ B) > 0 (i.e. only neighbors of A). This computation takes less than a second. Additionally, the output is always a percentage between 0% and 100%. For the visualizations, we plot the Top 15 subreddits with the highest overlap with the specified subreddit A (and color the bars with a nice viridis palette to provide another easy way to perceive the relative magnitude of relatedness).
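
A minimal Python sketch of this ranking step, assuming the edgelist is a dict keyed by subreddit pairs and the sizes are a dict of active-user counts. Both structures, the function name, and the numbers are hypothetical, not the post's actual code or data.

```python
def related_subreddits(target, shared, sizes, top_n=15):
    """Rank the neighbors B of `target` (A) by (A ∩ B) / (B)."""
    scores = {}
    for (a, b), n_shared in shared.items():
        if target not in (a, b) or n_shared <= 0:
            continue  # not a neighbor of the target
        other = b if a == target else a
        scores[other] = n_shared / sizes[other]
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]

# Hypothetical counts for illustration only.
shared = {
    ("aww", "kittens"): 40,
    ("aww", "pics"): 500,
    ("aww", "corgi"): 30,
    ("golf", "pics"): 20,
}
sizes = {"aww": 10000, "kittens": 50, "pics": 8000, "corgi": 60, "golf": 300}

for name, score in related_subreddits("aww", shared, sizes):
    print(f"{name}: {score:.1%}")
# kittens: 80.0%
# corgi: 50.0%
# pics: 6.2%
```

Because the denominator is the size of B, a small subreddit whose users are mostly also active in A scores high, which is why this measure tends to surface niche communities.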

The methodology may sound arbitrary, but the results are very interesting. Here’s a chart of the top related subreddits for /r/aww, one of the most popular places on the internet for cat pictures.

I have honestly never heard of any of these subreddits before. Yet by analyzing public user activity alone, I found a few new places to get more cute pics.

This methodology is excellent for finding subreddit-specific subsubreddits which may not be documented. The related subreddits for /r/buildapc offer more places to get PC building advice.

Related subreddits for sport-specific subreddits, like /r/cfb (college football), include the corresponding teams.

/r/food related subreddits list a surprising number of subreddits dedicated to specific foods.

There is a surprising amount of depth to the /r/me_irl network.

The chart for /r/programming can tell you which subreddits exist for specific programming languages and technologies.

The methodology can also reveal a lack of related subreddits, by the large contrast between subreddits with high relatedness and low relatedness. For example, while /r/cfb may have large numbers of obviously-related subreddits as a sports subreddit, /r/golf has only 2.

You can view Related Subreddit charts for the Top 200 Subreddits in this GitHub repository.

Finding Similar Subreddits

Another method for finding related subreddits would be to find subreddits with similar communities. An academic approach to finding similarity between sets is the Jaccard Index. Using the same set A and set B definitions above, the formula now becomes:

(A ∩ B) / [(A) + (B) - (A ∩ B)]

which outputs the Jaccard Index, between 0 and 1. This formula only requires a few tweaks to the original code. The results from this computation tell a different story.
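
The formula translates directly into a few lines of Python. This is a sketch working from counts rather than the sets themselves; the function name and the numbers are hypothetical.

```python
def jaccard(size_a, size_b, n_shared):
    """Jaccard Index from counts: (A ∩ B) / [(A) + (B) - (A ∩ B)]."""
    union = size_a + size_b - n_shared
    return n_shared / union if union else 0.0

# Hypothetical counts: a large subreddit A, a small subreddit B,
# and 40 users active in both.
size_a, size_b, n_shared = 10000, 50, 40

overlap_ratio = n_shared / size_b              # (A ∩ B) / (B) from earlier
similarity = jaccard(size_a, size_b, n_shared)

print(f"{overlap_ratio:.3f}")  # 0.800
print(f"{similarity:.3f}")     # 0.004
```

The same pair that looks strongly related under the first measure gets a tiny Jaccard Index, because the denominator now includes every active user of the large subreddit as well. Two communities score high only when they are of comparable size and share most of their users.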

Here are the most-similar subreddits to /r/aww:

In this implementation, the default Reddit subreddits must be removed from the results, as the communities of default subreddits are largely similar to most others by design. Even former defaults like /r/adviceanimals and /r/technology still have large numbers of holdout users who skew the results. As /r/aww is a mass-appeal subreddit, it makes sense that its community is similar to those of other mass-appeal subreddits.
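
One way to apply that exclusion is to skip a blocklist of subreddits while ranking. The sketch below assumes the same hypothetical dict layout for the edgelist and sizes; the `excluded` set and all counts are made up for illustration and are not the actual list of defaults.

```python
def most_similar(target, shared, sizes, excluded=frozenset(), top_n=15):
    """Rank neighbors of `target` by Jaccard Index, skipping `excluded`."""
    scores = {}
    for (a, b), n_shared in shared.items():
        if target not in (a, b):
            continue  # not a neighbor of the target
        other = b if a == target else a
        if other in excluded:
            continue  # default (or former default) subreddit
        union = sizes[target] + sizes[other] - n_shared
        scores[other] = n_shared / union
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]

# Hypothetical counts for illustration only.
sizes = {"aww": 1000, "pics": 900, "adviceanimals": 800, "cats": 200}
shared = {
    ("aww", "pics"): 400,
    ("adviceanimals", "aww"): 300,
    ("aww", "cats"): 100,
}
excluded = {"pics", "adviceanimals"}  # assumed blocklist

print([name for name, _ in most_similar("aww", shared, sizes)])
# ['pics', 'adviceanimals', 'cats']
print([name for name, _ in most_similar("aww", shared, sizes, excluded)])
# ['cats']
```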

The magnitude of the Jaccard Index measures the strength of the similarity. Most subreddit relationships have a low Jaccard Index, but the relative magnitudes among a subreddit’s neighbors still allow comparison of potential related subreddits (this is also the reason why the x-axis is not fixed across plots). The subreddit relationship with the highest absolute similarity is /r/arrow and /r/flashtv at 0.345, which makes sense given the massive overlap between the two CW television shows.

The Jaccard Index is more useful for finding similar subreddits to niche subreddits. Let’s try a few of the subreddits mentioned previously and see how the results changed.

/r/buildapc is a niche subreddit, and the output identifies well-established subreddits, unlike the results from the previous related-subreddit methodology.

The subreddit most similar to /r/cfb (college football) is /r/collegebasketball!

The subreddit most similar to /r/food is /r/cooking!

The subreddit most similar to /r/programming is /r/linux! (of course)

You can view the Similar Subreddit charts for the Top 200 Subreddits in this GitHub repository.

Again, Reddit has significantly better internal data for identifying user activity between subreddits, such as voting patterns and clickthrough tracking. But the results shown using these two set methodologies are pretty good for using public data. In fact, these two set approaches can theoretically work with any set of categorized, settable data, which may give me a few ideas for new blog posts in the future.

And there are still the fancy machine learning approaches to try.

As always, the full code used to process the comment data and generate the visualizations is available in this Jupyter notebook, open-sourced on GitHub.

If you do find any other interesting trends in the related/similar charts of other subreddits and write about them, it would be greatly appreciated if proper attribution is given back to this post and/or myself. Thanks!
