MyKpopLists: Building a Scalable Social Platform for K-pop Enthusiasts

K-pop Fans Deserve Better Than Five Separate Apps

The K-pop fandom is one of the most active communities online, but the tooling doesn’t match. Keeping up with comebacks, maintaining bias lists, connecting with other fans who have the same taste, and discovering new releases requires jumping between Reddit, Twitter, Spotify, and fan wikis. Nothing talks to anything else. There’s no single place to put all of it.

MyKpopLists is my attempt to build that place. It’s a full-stack social platform where fans can discover music, track artists, create ranked lists, write reviews, and connect with people who share their taste — all with the privacy controls and community structure that make an online community actually liveable.

What the Platform Does

The core of the site is an interconnected database of K-pop groups, solo artists, albums, songs, and variety shows. Users can follow artists, explore discographies with Spotify integration, and track member lineups over time — including join and leave dates, which matters a lot when groups go through lineup changes. Content discovery runs through an intelligent tagging system that uses Google Gemini to automatically categorize imported posts from r/kpop, keeping the community feed populated without requiring manual curation.

The social layer is built around a friend system with real privacy controls. Not “privacy controls” as a settings page that nobody reads — granular per-user visibility settings for lists, reviews, posts, activity feed, and friend list. You decide what’s public, what’s friends-only, and what’s private. The platform enforces these consistently across every feature that touches user content.
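To make the three visibility levels concrete, here is a minimal TypeScript sketch of a per-category visibility check. It is an illustration only: the type names, the `canView` function, and the rule that owners always see their own content are assumptions, not the platform's actual PrivacyService.

```typescript
// Hypothetical types; the real schema is not shown in this post.
type Visibility = "public" | "friends" | "private";
type ContentKind = "lists" | "reviews" | "posts" | "activity" | "friendList";

interface PrivacySettings {
  ownerId: number;
  visibility: Record<ContentKind, Visibility>;
}

// Returns true when the viewer may see the owner's content of the given kind.
// viewerId is null for logged-out visitors.
function canView(
  settings: PrivacySettings,
  kind: ContentKind,
  viewerId: number | null,
  friendIdsOfOwner: ReadonlySet<number>,
): boolean {
  // Assumption: owners can always see their own content.
  if (viewerId !== null && viewerId === settings.ownerId) return true;

  const level = settings.visibility[kind];
  if (level === "public") return true;
  if (level === "friends") {
    return viewerId !== null && friendIdsOfOwner.has(viewerId);
  }
  return false; // "private"
}
```

Keeping the check in one function is what makes consistent enforcement practical: every feature asks the same question instead of re-implementing the rule.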

Community engagement runs through a scoring system that tracks contributions: post likes, review votes, comment engagement, content creation. The leaderboard shows all-time leaders and monthly risers, with scores updated in real time through Laravel observers. Real-time notifications via WebSockets (Laravel Reverb) keep users in the loop without page refreshes.

Stack

Backend: Laravel 12 with PHP 8.3, PostgreSQL, Redis, Laravel Sanctum, Laravel Fortify with 2FA

Frontend: Vue.js 3 with Composition API, Inertia.js 2, TypeScript, Tailwind CSS 4, Reka UI

Infrastructure: Docker multi-stage builds, Nginx, Supervisor for process management, Laravel Reverb for WebSockets, server-side rendering for SEO

Integrations: Google Gemini for content tagging, Reddit API for aggregation, Spotify API for music metadata, Google OAuth, AWS S3 for file storage

Why Inertia.js Instead of a Traditional REST API

For a social platform with complex, nested data requirements, Inertia.js removes most of the impedance mismatch between backend and frontend. Traditional REST APIs require maintaining separate endpoint contracts, handling serialization/deserialization, and managing state synchronization. Inertia lets controllers pass Eloquent models and collections to Vue components as page props (it serializes them to JSON for you), cutting boilerplate significantly while keeping authorization where it belongs: at the controller level, before data ever reaches the client. Because routing stays on the server, every request passes through the same policies.

Why PostgreSQL

The data model involves extensive many-to-many relationships (users-groups, posts-tags, group-idol memberships) and polymorphic associations (comments on posts, reviews, and songs; likes on multiple content types). The activity feed query alone joins across six tables with privacy filters. PostgreSQL’s query planner handles this well; I didn’t expect MySQL to cope as gracefully. The JSON column support also covers flexible metadata storage for content types that don’t fit neatly into relational rows.

Why Redis

Social platforms have tiered caching needs. User profiles must reflect real-time friendship status. Activity feeds need sub-second freshness. Static content, such as artist bios and album details, can be cached for longer. Redis handles this through different TTLs per content type. The SocialCacheService uses pattern-based cache invalidation via Redis SCAN, allowing surgical cache updates when friendships change without flushing unrelated data. Laravel’s file and database cache drivers offer no comparable pattern-based invalidation.
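The idea of per-tier TTLs plus pattern-based invalidation can be sketched in a few lines of TypeScript. This is not the SocialCacheService itself: it uses an in-memory Map as a stand-in for Redis, the class name, method names, and key format (`user:1:profile`) are hypothetical, and the TTL values are the four mentioned later in this post. In real Redis, `invalidate` would iterate with SCAN and a MATCH pattern, then delete the matching keys.

```typescript
// In-memory stand-in for Redis, used only to illustrate the idea.
type Tier = "profile" | "feed" | "friends" | "static";

// TTLs in seconds, taken from the tiers named later in the post.
const TTL_SECONDS: Record<Tier, number> = {
  profile: 300,
  feed: 60,
  friends: 600,
  static: 3600,
};

interface CacheEntry {
  value: unknown;
  expiresAt: number; // epoch milliseconds
}

class TieredCache {
  private entries = new Map<string, CacheEntry>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(tier: Tier, key: string, value: unknown): void {
    this.entries.set(key, { value, expiresAt: this.now() + TTL_SECONDS[tier] * 1000 });
  }

  get(key: string): unknown {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries
      return undefined;
    }
    return entry.value;
  }

  // Deletes every key matching a glob-style pattern, where "*" matches any
  // run of characters. Returns the number of keys removed.
  invalidate(pattern: string): number {
    const escaped = pattern
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, ".*");
    const regex = new RegExp(`^${escaped}$`);
    let removed = 0;
    for (const key of Array.from(this.entries.keys())) {
      if (regex.test(key)) {
        this.entries.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```

With keys namespaced by user, a friendship change for user 1 becomes `cache.invalidate("user:1:*")`, which leaves every other user's entries untouched.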

The Hard Parts

Privacy-Aware Activity Feeds

The friends’ activity feed is the most complex feature in the platform. It has to answer: “Show me what my friends have been doing, but only if they’ve made their activity public, and only show content types they’ve chosen to share, and filter out activities on content I don’t have permission to see.” Those three conditions have to apply across every record in the result set.

The solution applies filtering at the database level: the query joins through the privacy settings table to exclude users who haven’t opted into activity sharing, then applies content-type-specific filters based on each friend’s individual settings. Polymorphic relationships compound the complexity, because a “like” activity could reference a post, comment, review, song, or album. The activity transformation layer resolves these relationships, loading the appropriate related models and presenting them consistently regardless of the underlying type. Eager loading avoids N+1 queries, Redis caches frequently accessed privacy settings, and composite indexes on (user_id, activity_type, created_at) keep the feed query under 200ms even for users with hundreds of friends.
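The three feed conditions are easier to see as code. The sketch below applies them to in-memory arrays in TypeScript; the platform does this in SQL, so treat it as a statement of the rules rather than the implementation. All type and field names are hypothetical, and users with no privacy record are assumed to be hidden by default.

```typescript
type ActivityKind = "like" | "review" | "list" | "comment";
type Visibility = "public" | "friends" | "private";

interface Activity {
  id: number;
  actorId: number;              // the friend who did something
  kind: ActivityKind;
  targetOwnerId: number;        // owner of the content that was acted on
  targetVisibility: Visibility; // visibility of that content
  createdAt: number;
}

interface ActivityPrivacy {
  shareActivity: boolean;                 // has the user opted in at all?
  sharedKinds: ReadonlySet<ActivityKind>; // which activity types they share
}

type FriendGraph = ReadonlyMap<number, ReadonlySet<number>>;

// Condition 3: may the viewer see the content the activity points at?
function canSeeTarget(viewerId: number, a: Activity, friendsOf: FriendGraph): boolean {
  if (a.targetOwnerId === viewerId) return true;
  if (a.targetVisibility === "public") return true;
  if (a.targetVisibility === "friends") {
    return friendsOf.get(a.targetOwnerId)?.has(viewerId) ?? false;
  }
  return false;
}

function buildFriendFeed(
  viewerId: number,
  friendsOf: FriendGraph,
  privacyByUser: ReadonlyMap<number, ActivityPrivacy>,
  activities: readonly Activity[],
): Activity[] {
  const myFriends = friendsOf.get(viewerId) ?? new Set<number>();
  return activities
    .filter((a) => {
      if (!myFriends.has(a.actorId)) return false;
      const privacy = privacyByUser.get(a.actorId);
      // Condition 1: the friend must have opted into sharing (hidden by default).
      if (privacy === undefined || !privacy.shareActivity) return false;
      // Condition 2: the friend must share this type of activity.
      if (!privacy.sharedKinds.has(a.kind)) return false;
      return canSeeTarget(viewerId, a, friendsOf);
    })
    .sort((x, y) => y.createdAt - x.createdAt); // newest first
}
```

Every record must pass all three checks, which is why pushing them into the query's joins and WHERE clauses matters: filtering after the fetch would load rows the viewer is never allowed to see.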

Getting AI Tagging Right

The Reddit integration automatically fetches posts from r/kpop and uses Google Gemini to categorize them with relevant artists and songs. A single post might mention multiple groups, reference specific songs, and discuss variety show appearances — all of which need accurate tagging for discoverability.

The service constructs a prompt with the post content and a curated list of all groups, idols, songs, and variety shows currently in the database. Gemini returns structured JSON. The challenge is mapping those natural language responses back to database IDs. I implemented fuzzy matching to handle variations in artist names — stage names versus birth names, romanization differences, group name abbreviations. Confidence thresholds ensure tags only get created when Gemini is reasonably certain, which keeps false positives low.
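As a rough illustration of the mapping step, the TypeScript sketch below resolves a name returned by the model against known entities and their aliases. The post does not say which fuzzy-matching algorithm is used, so normalized Levenshtein distance here is an assumption, as are the `ModelTag` response shape, the function names, and the threshold values.

```typescript
interface Entity {
  id: number;
  names: string[]; // canonical name plus known aliases
}

// Hypothetical shape of one tag in the model's JSON response.
interface ModelTag {
  name: string;
  confidence: number; // 0..1, as reported by the model
}

// Lowercase and strip whitespace and common punctuation.
function normalize(s: string): string {
  return s.toLowerCase().replace(/[\s\-_.!'()]+/g, "");
}

// Classic dynamic-programming edit distance, keeping two rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 0 : 1 - levenshtein(x, y) / longest;
}

// Best-matching entity for a mention, or null if nothing clears the threshold.
function resolveEntity(
  mention: string,
  entities: readonly Entity[],
  minSimilarity = 0.8,
): { id: number; score: number } | null {
  let best: { id: number; score: number } | null = null;
  for (const entity of entities) {
    for (const name of entity.names) {
      const score = similarity(mention, name);
      if (best === null || score > best.score) best = { id: entity.id, score };
    }
  }
  return best !== null && best.score >= minSimilarity ? best : null;
}

// Keep a tag only if the model is confident AND the name resolves to an entity.
function acceptTags(
  tags: readonly ModelTag[],
  entities: readonly Entity[],
  minConfidence = 0.7,
  minSimilarity = 0.8,
): number[] {
  const ids = new Set<number>();
  for (const tag of tags) {
    if (tag.confidence < minConfidence) continue;
    const match = resolveEntity(tag.name, entities, minSimilarity);
    if (match !== null) ids.add(match.id);
  }
  return Array.from(ids);
}
```

The two thresholds guard against different failures: the confidence cutoff drops tags the model itself is unsure about, while the similarity cutoff drops names that do not correspond to anything in the database.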

Rate limiting adds another layer. Reddit allows 60 requests per minute; Gemini has daily quotas. The solution spreads processing across 12 hours via Laravel’s job queue, with exponential backoff on failures. Each job is idempotent — it checks for an existing reddit_id before creating a post — and includes retry logic with jitter to handle transient API failures.
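The three mechanisms in that paragraph (spreading work over a window, backing off with jitter, and idempotent creation) are small enough to sketch. The TypeScript below is illustrative only: the function names, the base and cap values, and the choice of "full jitter" (a random delay between zero and the exponential ceiling) are assumptions, and the real jobs run on Laravel's queue rather than in TypeScript.

```typescript
// Evenly spaced start offsets for jobCount jobs across a time window.
function spreadOffsetsMs(jobCount: number, windowMs: number): number[] {
  const offsets: number[] = [];
  for (let i = 0; i < jobCount; i++) {
    offsets.push(Math.floor((i * windowMs) / jobCount));
  }
  return offsets;
}

// Exponential backoff with full jitter: a random delay in
// [0, min(capMs, baseMs * 2^attempt)). The random source is injectable
// so the function can be exercised deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 60000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

interface RedditPost {
  redditId: string;
  title: string;
}

// Idempotent import: a post is stored only if its redditId is new, so a
// retried job cannot create duplicates. Returns true if it was inserted.
function importPost(store: Map<string, RedditPost>, post: RedditPost): boolean {
  if (store.has(post.redditId)) return false;
  store.set(post.redditId, post);
  return true;
}
```

Jitter matters because many jobs that fail at the same moment would otherwise retry at the same moment too. Idempotency is what makes retrying safe in the first place.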

Keeping the Leaderboard Accurate in Real Time

A naive approach would recalculate scores on every page load, which would kill the database. The hybrid approach I landed on uses Laravel’s observer pattern for incremental updates combined with scheduled nightly batch reconciliation.

When a post receives a like, the PostLikeObserver immediately updates the author’s score by the appropriate point value, so the leaderboard stays current without delay. But observers can’t catch everything: deleted content, cascading relationship changes, and data inconsistencies all require periodic recalculation. The nightly command processes users in chunks to avoid memory exhaustion, recalculating scores from scratch to correct any drift. The monthly increase calculation compares current scores against a snapshot from 30 days ago, with careful handling of new users and deleted accounts.
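Here is a minimal TypeScript sketch of the hybrid pattern: an incremental update on each event, plus a chunked recomputation that corrects drift. The point values, function names, and the `loadLiveEvents` callback are hypothetical, and treating a user with no 30-day snapshot as having started from zero is an assumption about how new users are handled.

```typescript
type EventKind = "post_like" | "review_vote" | "comment";

// Hypothetical point values; the real ones are not given in this post.
const POINTS: Record<EventKind, number> = { post_like: 2, review_vote: 3, comment: 1 };

interface ScoreEvent {
  userId: number; // the user who earns the points
  kind: EventKind;
}

// Observer path: adjust the stored score as soon as an event happens.
function applyEvent(scores: Map<number, number>, userId: number, kind: EventKind): void {
  scores.set(userId, (scores.get(userId) ?? 0) + POINTS[kind]);
}

// Nightly path: recompute scores from the events that still exist,
// one chunk of users at a time. Returns how many scores were corrected.
function reconcile(
  scores: Map<number, number>,
  userIds: readonly number[],
  loadLiveEvents: (chunk: readonly number[]) => ScoreEvent[],
  chunkSize = 500,
): number {
  let corrected = 0;
  for (let start = 0; start < userIds.length; start += chunkSize) {
    const chunk = userIds.slice(start, start + chunkSize);
    const fresh = new Map<number, number>();
    for (const id of chunk) fresh.set(id, 0);
    for (const event of loadLiveEvents(chunk)) {
      fresh.set(event.userId, (fresh.get(event.userId) ?? 0) + POINTS[event.kind]);
    }
    fresh.forEach((score, id) => {
      if ((scores.get(id) ?? 0) !== score) {
        scores.set(id, score);
        corrected++;
      }
    });
  }
  return corrected;
}

// Monthly riser metric. Assumption: no snapshot means the user is new,
// so their whole current score counts as the increase.
function monthlyIncrease(current: number, snapshot: number | undefined): number {
  return current - (snapshot ?? 0);
}
```

The incremental path keeps reads cheap, and the batch path bounds how long any drift can persist: at most one night.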

What I Learned

Building privacy controls into the foundation rather than retrofitting them later turned out to be the right call. Establishing the PrivacyService early and routing all content access through it consistently means privacy enforcement is uniform across dozens of controllers. The lesson I’d carry forward: treat privacy as a cross-cutting concern from day one.

Effective caching also required actual domain knowledge, not generic advice. The SocialCacheService has seven different TTL tiers based on content volatility, including profile data (5 minutes), activity feeds (1 minute), friend data (10 minutes), and static content (1 hour). That granularity came from observing real usage patterns, not from intuition. You can’t design a caching strategy without understanding what changes frequently versus what doesn’t.

AI integration requires defensive programming throughout. Working with Gemini meant assuming nothing about response format, handling partial failures gracefully, and building fallback strategies. JSON validation, fuzzy matching for entity resolution, confidence thresholds, and manual override capabilities — all of it is load-bearing. AI is a useful tool, but it needs to be wrapped in robust error handling.

What’s Coming

The WebSocket infrastructure is already in place through Laravel Reverb, but largely underutilized. Live collaborative list editing and real-time comment threads would make it earn its place — both are features that fit naturally given the fandom context where group curation is a social activity.

The platform collects rich behavioral data — what users follow, like, review, and list — but doesn’t yet use this for recommendations. Collaborative filtering based on similar users’ preferences would meaningfully improve artist discovery, which is a core use case the current version only partially addresses.

The current responsive design works on mobile but isn’t optimized for it. Converting to a PWA with offline support and push notifications would significantly improve mobile engagement for a user base that’s primarily on phones.