Skip to content

Conversation

@Suman1108-beep
Copy link

@Suman1108-beep Suman1108-beep commented Dec 25, 2025

Summary

This PR introduces the foundational engagement metrics pipeline for creators.

What’s included

  • Standardized Pydantic schemas for engagement metrics
  • Service layer to compute likes, comments, shares, averages, and engagement rate
  • New FastAPI endpoint: POST /engagement/compute
  • Uses existing UserPost.engagement_metrics (no DB schema changes)

Notes

  • No ML, embeddings, or external APIs added
  • No database migrations
  • Handles empty data safely

Testing

  • Tested locally via Swagger UI and curl
  • Endpoint returns 200 OK with structured metrics

Closes #22 (partial – engagement groundwork)
Tested via Swagger UI; happy to add screenshots if needed.

Summary by CodeRabbit

  • New Features

    • Added engagement metrics API endpoint to compute and return engagement statistics.
    • Application startup now performs automatic database initialization with retry logic.
  • Bug Fixes

    • Improved database connection reliability via better host resolution.
    • Improved handling of database credentials and reduced exposure in startup logs.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 25, 2025

📝 Walkthrough

Walkthrough

Database connection handling now URL-encodes passwords and resolves HOST to an IPv4 address. Application startup adds a retry-enabled database initialization sequence. A new engagement API endpoint and supporting schemas and service compute aggregated engagement metrics from stored user posts.

Changes

Cohort / File(s) Summary
Database connection
Backend/app/db/db.py
URL-encodes PASSWORD with quote_plus, resolves HOST -> HOST_IP via socket.getaddrinfo, rebuilds DATABASE_URL to use HOST_IP, adds masked debug print, retains SSL in engine init.
App startup / lifecycle
Backend/app/main.py
Adds async def startup_sequence() to create tables and seed data; wraps startup in retry loop (up to 5 attempts, 2s delay); imports asyncio; registers engagement router.
Engagement route
Backend/app/routes/engagement.py
New router = APIRouter(prefix="/engagement", tags=["Engagement"]) with POST /engagement/compute endpoint: accepts EngagementRequest, queries UserPost by creator_id, uses DI get_db, returns EngagementMetrics.
Engagement schemas
Backend/app/schemas/engagement.py
New Pydantic models: EngagementRequest(creator_id: str, follower_count: Optional[int]) and EngagementMetrics(likes, comments, shares, avg_likes, avg_comments, avg_shares, engagement_rate, total_posts, follower_count).
Engagement service logic
Backend/app/services/engagement_service.py
New compute_engagement_metrics(posts, follower_count) aggregates likes/comments/shares, computes per-post averages and engagement rate (handles zero/missing data), returns metrics dict compatible with response model.

Sequence Diagram

sequenceDiagram
    autonumber
    participant Client
    participant API as FastAPI / Route
    participant DB as Async DB Session
    participant Service as Engagement Service

    Client->>API: POST /engagement/compute (EngagementRequest)
    API->>DB: get_db() (dependency)
    DB-->>API: AsyncSession
    API->>DB: query UserPost by creator_id
    DB-->>API: List[UserPost]
    API->>Service: compute_engagement_metrics(posts, follower_count)
    Service->>Service: aggregate likes/comments/shares\ncompute averages\ncompute engagement_rate
    Service-->>API: metrics dict
    API-->>Client: 200 OK (EngagementMetrics)
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through code, sniffed hosts and seeds,
Encoded the secrets and followed leads.
I counted likes, comments, shares in a breeze,
With retries and routes, metrics now please —
A twitch, a wiggle, deployment succeeds! 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR partially addresses issue #22 by implementing engagement metrics computation and a dedicated endpoint, but does not cover data fetching, niche analysis, or sentiment analysis requirements. This PR implements only the engagement metrics calculation component (#22). Future PRs should address data fetching from social media APIs, niche analysis via NLP, and sentiment analysis on comments.
Out of Scope Changes check ⚠️ Warning Database password encoding in db.py and HOST IP resolution changes are security improvements not covered by issue #22's objectives. Database connection improvements should be addressed in a separate PR or issue focused on database security. Consider extracting these changes into a dedicated security hardening issue.
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add creator engagement metrics groundwork' clearly and concisely describes the main change—establishing the foundational infrastructure for computing creator engagement metrics.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
Backend/app/main.py (1)

40-51: Retry logic is reasonable, but consider logging the exception type.

The retry mechanism with 5 attempts and 2-second delays is appropriate for transient DB connection issues. The fallback to continue without DB initialization is a reasonable choice for development, but could mask serious configuration errors in production.

Consider using logging.warning instead of print for consistency with the rest of the application's logging patterns.

Backend/app/routes/engagement.py (1)

4-4: Unused import.

Optional is imported but not used in this file.

🔎 Proposed fix
-from typing import Optional
Backend/app/services/engagement_service.py (2)

3-8: Use TYPE_CHECKING for conditional type imports.

The try/except pattern for imports is fragile and unconventional. For type-only imports, use typing.TYPE_CHECKING which is the standard pattern and avoids runtime overhead.

🔎 Proposed fix
-from typing import List, Optional, Dict, Any
+from typing import TYPE_CHECKING, List, Optional, Dict, Any
 
-try:
-    from app.models.models import UserPost  # for typing clarity
-except Exception:
-    # Fallback for type hints if import is not available at runtime
-    class UserPost:  # type: ignore
-        engagement_metrics: Dict[str, Any]
+if TYPE_CHECKING:
+    from app.models.models import UserPost

Then update the function signature to use a string annotation or Any:

def compute_engagement_metrics(posts: List["UserPost"], follower_count: Optional[int]) -> Dict[str, Any]:

26-35: Defensive metric extraction is good, but int() could raise on invalid data.

The int(metrics.get("likes", 0) or 0) pattern handles None and missing keys well. However, if engagement_metrics contains a non-numeric string (e.g., "likes": "invalid"), int() will raise a ValueError.

Consider wrapping in a helper or try/except if the data source isn't fully trusted:

🔎 Optional: safer int extraction
def safe_int(value, default=0):
    try:
        return int(value) if value else default
    except (ValueError, TypeError):
        return default
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a3be437 and 067ea80.

📒 Files selected for processing (5)
  • Backend/app/db/db.py
  • Backend/app/main.py
  • Backend/app/routes/engagement.py
  • Backend/app/schemas/engagement.py
  • Backend/app/services/engagement_service.py
🧰 Additional context used
🧬 Code graph analysis (2)
Backend/app/routes/engagement.py (3)
Backend/app/db/db.py (1)
  • get_db (49-51)
Backend/app/services/engagement_service.py (2)
  • UserPost (7-8)
  • compute_engagement_metrics (11-64)
Backend/app/schemas/engagement.py (2)
  • EngagementRequest (5-7)
  • EngagementMetrics (10-21)
Backend/app/main.py (1)
Backend/app/db/seed.py (1)
  • seed_db (6-57)
🪛 Ruff (0.14.10)
Backend/app/services/engagement_service.py

5-5: Do not catch blind exception: Exception

(BLE001)

Backend/app/routes/engagement.py

15-15: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

Backend/app/main.py

38-38: Unused function argument: app

(ARG001)


46-46: Do not catch blind exception: Exception

(BLE001)

Backend/app/db/db.py

24-24: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (8)
Backend/app/main.py (2)

32-34: Clean extraction of startup logic.

Bundling create_tables() and seed_db() into a single async function improves readability and makes the retry logic in lifespan cleaner.


72-72: Engagement router correctly registered.

The router is properly included alongside existing routes.

Backend/app/db/db.py (2)

13-13: Good addition of URL encoding for password.

Using quote_plus handles special characters in passwords that would otherwise break the connection URL.


28-29: Password masking in debug output is good security practice.

Printing the masked URL helps with debugging while avoiding accidental credential exposure in logs.

Backend/app/routes/engagement.py (1)

14-20: Endpoint implementation is clean and functional.

The endpoint correctly:

  • Uses dependency injection for the DB session
  • Queries posts by creator_id
  • Delegates computation to the service layer
  • Returns properly typed response

One consideration: returning empty metrics (all zeros) when creator_id doesn't exist or has no posts is a valid design choice, but you may want to add a check to return a 404 if the creator doesn't exist at all. This would help distinguish between "creator with no posts" vs "invalid creator_id".

Backend/app/schemas/engagement.py (2)

5-7: Request schema is well-defined.

Using str for creator_id aligns with the UUID string format used elsewhere in the codebase.


10-21: Response schema covers all computed metrics.

The schema properly includes totals, averages, and the optional engagement rate. The types align with the service layer output.

Consider adding Field constraints in a future iteration (e.g., ge=0 for counts) to enforce data validity at the schema level.

Backend/app/services/engagement_service.py (1)

39-52: Edge case handling is correct.

The logic properly handles:

  • Empty posts list (returns 0.0 averages)
  • Missing or zero follower_count (returns None for engagement_rate)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
Backend/app/db/db.py (4)

14-14: Consider validating required environment variables.

The fallback to an empty string for a missing password will create an invalid connection string that fails later at engine creation. While the error is caught, validating required credentials upfront would provide clearer error messages at startup.

🔎 Proposed validation approach

Add validation after loading environment variables:

# Fetch database credentials
USER = os.getenv("user")
PASSWORD_RAW = os.getenv("password")
HOST = os.getenv("host")
PORT = os.getenv("port")
DBNAME = os.getenv("dbname")

# Validate required credentials
required_vars = {"user": USER, "password": PASSWORD_RAW, "host": HOST, "port": PORT, "dbname": DBNAME}
missing = [k for k, v in required_vars.items() if not v]
if missing:
    raise ValueError(f"Missing required database environment variables: {', '.join(missing)}")

PASSWORD = quote_plus(PASSWORD_RAW)

20-33: Host resolution guard addresses previous review concern.

The if HOST and PORT: guard successfully addresses the past review comment about potential None HOST causing a TypeError. The fallback logic is sound.

However, consider refining the exception handling for better diagnostics:

🔎 Optional refinement for exception handling
 # Resolve IPv4 address for HOST to avoid intermittent DNS issues on Windows
 if HOST and PORT:
     try:
         addrinfo = socket.getaddrinfo(
             HOST,
             int(PORT),
             socket.AF_INET,
             socket.SOCK_STREAM,
         )
         HOST_IP = addrinfo[0][4][0] if addrinfo else HOST
-    except Exception:
+    except (ValueError, socket.gaierror, OSError) as e:
+        print(f"⚠️ DNS resolution failed for {HOST}, using hostname directly: {e}")
         HOST_IP = HOST
 else:
     HOST_IP = HOST

This catches the specific exceptions (ValueError from invalid PORT, socket.gaierror from DNS failures, OSError from network issues) and provides better visibility when resolution fails.


37-37: Consider making debug output conditional.

The debug print statement is helpful for troubleshooting but always executes. Consider making it conditional based on a debug flag or log level for production environments.

🔎 Example with environment-based debug flag
 DATABASE_URL = f"postgresql+asyncpg://{USER}:{PASSWORD}@{HOST_IP}:{PORT}/{DBNAME}"
-print(f"DB URL: postgresql+asyncpg://{USER}:***@{HOST_IP}:{PORT}/{DBNAME}")
+if os.getenv("DEBUG", "").lower() in ("true", "1"):
+    print(f"DB URL: postgresql+asyncpg://{USER}:***@{HOST_IP}:{PORT}/{DBNAME}")

42-44: Consider making SQL echo configurable.

The echo=True parameter enables verbose SQL logging, which is useful for development but can be excessive in production. Consider making it configurable via an environment variable.

🔎 Example with environment-based echo flag
     engine = create_async_engine(
         DATABASE_URL,
-        echo=True,
+        echo=os.getenv("SQL_ECHO", "false").lower() in ("true", "1"),
         connect_args={"ssl": "require"},
     )
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 067ea80 and 0017635.

📒 Files selected for processing (1)
  • Backend/app/db/db.py
🧰 Additional context used
🪛 Ruff (0.14.10)
Backend/app/db/db.py

30-30: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (1)
Backend/app/db/db.py (1)

6-7: LGTM! Imports are correctly placed.

The new imports for quote_plus (password URL-encoding) and socket (DNS resolution) are properly positioned at the module level and are used appropriately later in the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extract Creator’s Audience Niche, Sentiment & Engagement from Social Media

1 participant