@mbani01 mbani01 commented Dec 30, 2025

This pull request introduces significant improvements to the repository integration and synchronization logic within the IntegrationService in backend/src/services/integrationService.ts. The main focus is on ensuring that repository mappings between source integrations and the unified public.repositories table are consistent, robust, and transactional across different platforms (GitHub, GitLab, Gerrit, and direct Git). The changes add validation, transactional safety, and better handling of edge cases when mapping, restoring, or soft-deleting repositories.

Key changes include:

Repository Mapping and Synchronization Enhancements

  • Added a new method mapUnifiedRepositories, which handles inserting, restoring, and soft-deleting repositories in the unified public.repositories table in a transactional manner. This includes robust validation to prevent remapping repositories across different integrations and ensures consistency across platforms.
  • Introduced helper methods like validateRepoIntegrationMapping and buildRepositoryPayloads to support the main mapping logic, including building payloads with correct associations (segment, integration, insights project, forkedFrom, etc.).
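
A minimal sketch of how the insert/restore/soft-delete split and the cross-integration validation described above might be planned. The names planRepositorySync, RepoRow, and SyncPlan are hypothetical and do not come from the PR; the real mapUnifiedRepositories runs inside a database transaction, which this pure function leaves out.

```typescript
// Hypothetical row shape; the real public.repositories table has more columns.
interface RepoRow {
  url: string
  sourceIntegrationId: string
  deletedAt: string | null
}

interface SyncPlan {
  toInsert: string[]
  toRestore: string[]
  toSoftDelete: string[]
}

// Decide what to do with each URL, given the rows already stored for the
// desired URLs plus the rows currently owned by this integration.
function planRepositorySync(
  integrationId: string,
  desiredUrls: string[],
  existing: RepoRow[],
): SyncPlan {
  const desired = new Set<string>(desiredUrls)
  const byUrl = new Map<string, RepoRow>()
  existing.forEach((row) => byUrl.set(row.url, row))

  const plan: SyncPlan = { toInsert: [], toRestore: [], toSoftDelete: [] }

  for (const url of Array.from(desired)) {
    const row = byUrl.get(url)
    if (!row) {
      plan.toInsert.push(url)
    } else if (row.sourceIntegrationId !== integrationId) {
      // Assumed validation: refuse to remap a repo owned by another integration.
      throw new Error(`Repository ${url} is already mapped to another integration`)
    } else if (row.deletedAt !== null) {
      plan.toRestore.push(url)
    }
  }

  // Active rows owned by this integration that are no longer desired.
  for (const row of existing) {
    const ownedAndActive = row.sourceIntegrationId === integrationId && row.deletedAt === null
    if (ownedAndActive && !desired.has(row.url)) {
      plan.toSoftDelete.push(row.url)
    }
  }

  return plan
}
```

Computing the plan before touching the database keeps the validation failure ahead of any write, so a rejected remap would leave nothing to roll back.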

Platform-Aware Repository Synchronization

  • Updated the gitConnectOrUpdate method and all relevant integration flows (GitHub, GitLab, Gerrit, direct Git) to call mapUnifiedRepositories with the correct platform and integration context, ensuring that repository synchronization is handled appropriately for each platform.
  • Added a sourcePlatform parameter to gitConnectOrUpdate to control when unified repository mapping should be triggered, allowing for more flexible integration flows.

Integration with Data Access Layer

  • Expanded imports from the data access layer to support new repository operations, such as fetching, inserting, restoring, and soft-deleting repositories.

Improved Transaction Management

  • Ensured that all repository mapping operations are performed within a transaction, with proper commit and rollback handling to maintain data integrity in case of errors.

Platform-Specific Handling in Integration Flows

  • Updated integration flows for GitHub, GitLab, Gerrit, and direct Git to pass the appropriate platform type and ensure that repository synchronization logic is platform-aware.

These changes make the repository integration process more reliable, maintainable, and scalable across different source platforms.


Note

Introduces a unified repository-mapping flow across code platforms (GitHub, GitLab, Gerrit, Git) backed by public.repositories, plus a single API to fetch mappings.

  • Adds GET /integration/:id/repositories and removes platform-specific repo GETs; frontend now uses IntegrationService.fetchIntegrationRepositories for GitHub, GitLab, and Git
  • Major IntegrationService refactor: new mapUnifiedRepositories (insert/restore/soft-delete with validations and mirrored-repo handling), getIntegrationRepositories, and enhanced deletion logic to preserve/clean up repos correctly; gitConnectOrUpdate gains sourcePlatform and syncs only owned repos
  • Data layer: new repositories module (insertRepositories, restoreRepositories, softDeleteRepositories, getRepositoriesBy*, getIntegrationReposMapping), and syncRepositoriesToGitV2 now reuses IDs from github/gitlab/git where available
  • Frontend UX: Git settings visualize and protect mirrored repos (disabled inputs, tooltips); array-input supports disabled; Gerrit UI defaults enableGit to true and hides the toggle

Written by Cursor Bugbot for commit 442550f. This will update automatically on new commits.

@mbani01 mbani01 requested review from themarolt and ulemons December 30, 2025 16:50
@mbani01 mbani01 marked this pull request as ready for review December 30, 2025 16:50
@github-actions

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

}),
},
segmentOptions,
PlatformType.GITHUB,

Missing repository sync for GitHub and GitLab integrations

The PR adds PlatformType.GITHUB and PlatformType.GITLAB parameters to gitConnectOrUpdate calls, which causes the internal mapUnifiedRepositories call to be skipped (due to the if (!sourcePlatform) check at line 1422). However, unlike githubNangoConnect (which adds an external call at line 961) and gerritConnectOrUpdate (which adds an external call at line 1789), neither mapGithubRepos nor mapGitlabRepos adds the required external mapUnifiedRepositories call. As a result, when these methods are called directly via API endpoints (githubMapRepos.ts, gitlabMapRepos.ts) or from updateGithubIntegrationSettings, repositories won't be synced to public.repositories, breaking the PR's stated goal of consistent repository mapping across all platforms.
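
The control flow at issue can be sketched as follows. This is a simplified, synchronous stand-in rather than the code in integrationService.ts: the function names mirror the ones in the comment, but their signatures and the writes array are assumptions made purely to show which table each call would touch.

```typescript
type Platform = 'git' | 'github' | 'gitlab' | 'gerrit'

// Stand-in: records that a call would write to public.repositories.
function mapUnifiedRepositories(platform: Platform, writes: string[]): void {
  writes.push(`public.repositories (${platform})`)
}

// Stand-in for gitConnectOrUpdate with the guard described in the review.
function gitConnectOrUpdate(writes: string[], sourcePlatform?: Platform): void {
  writes.push('git.repositories')
  // Only a direct Git connection (no sourcePlatform) syncs from in here.
  if (!sourcePlatform) {
    mapUnifiedRepositories('git', writes)
  }
}

// A caller that passes a platform has to sync on its own afterwards.
function mapGithubRepos(): string[] {
  const writes: string[] = []
  gitConnectOrUpdate(writes, 'github')
  mapUnifiedRepositories('github', writes) // the call the review reports as missing
  return writes
}
```

Without the second call in mapGithubRepos, only git.repositories would be recorded, which is the gap this comment describes.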


const txOptions = { ...this.options, transaction }
const txService = new IntegrationService(txOptions)
await txService.mapUnifiedRepositories(PlatformType.GERRIT, integration.id, mapping)

Gerrit repository sync fails when Git disabled

The mapUnifiedRepositories call for Gerrit is placed outside the if (integrationData.remote.enableGit) block, meaning it executes unconditionally. However, buildRepositoryPayloads requires a Git integration to exist - it calls IntegrationRepository.findByPlatform(PlatformType.GIT, ...) and throws Error400 if not found. When enableGit is false, no Git integration is created by gitConnectOrUpdate, so the subsequent mapUnifiedRepositories call will fail with "Git integration not found for segment". The mapUnifiedRepositories call should be inside the if (enableGit) block.
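
A small sketch of the suggested placement, assuming a hypothetical helper gerritSyncSteps that only lists the calls that would run. The enableGit flag comes from the comment above; everything else is illustrative.

```typescript
// Hypothetical shape of the Gerrit remote settings referenced in the comment.
interface GerritRemote {
  enableGit: boolean
}

// Lists the calls that would run for a Gerrit connect or update.
function gerritSyncSteps(remote: GerritRemote): string[] {
  const steps: string[] = []
  if (remote.enableGit) {
    // The Git integration is created here, so the unified mapping that
    // depends on it is kept inside the same branch.
    steps.push('gitConnectOrUpdate')
    steps.push('mapUnifiedRepositories')
  }
  return steps
}
```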

const [gitRepoIdMap, sourceIntegration] = await Promise.all([
// TODO: after migration, generate UUIDs instead of fetching from git.repositories
getGitRepositoryIdsByUrl(qx, urls),
isGitHubPlatform ? IntegrationRepository.findById(sourceIntegrationId, txOptions) : null,

GitLab forkedFrom data not synced to public.repositories

In buildRepositoryPayloads, the sourceIntegration is only fetched for GitHub platforms (line 3118 uses a conditional that returns null for non-GitHub). The forkedFromMap at lines 3157-3166 is populated exclusively from sourceIntegration?.settings?.orgs, which is GitHub-specific. For GitLab repositories, forkedFromMap remains empty, causing all GitLab repos to have forkedFrom: null in public.repositories. However, GitLab's forkedFrom data IS available and passed to gitConnectOrUpdate (lines 2873-2875, 2884-2886), so the information exists but isn't propagated to the unified repositories table.
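
One platform-independent way to populate the map would be to read forkedFrom from the remotes already passed to gitConnectOrUpdate. The sketch below assumes the { url, forkedFrom } remote shape used elsewhere in this PR; buildForkedFromMap is a hypothetical helper, not a function from the codebase.

```typescript
// Remote shape assumed from the gitConnectOrUpdate calls in this PR.
interface Remote {
  url: string
  forkedFrom: string | null
}

// Build a url -> forkedFrom lookup from the remotes themselves, so it does
// not depend on GitHub-specific integration settings.
function buildForkedFromMap(remotes: Remote[]): Map<string, string> {
  const forkedFromMap = new Map<string, string>()
  for (const remote of remotes) {
    if (remote.forkedFrom) {
      forkedFromMap.set(remote.url, remote.forkedFrom)
    }
  }
  return forkedFromMap
}
```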

remotes: repositories.map((url) => ({ url, forkedFrom: null })),
},
txOptions,
platform,

Missing repository sync in integration update method

The update method now passes platform to gitConnectOrUpdate which skips the internal mapUnifiedRepositories call. However, the update method doesn't add its own mapUnifiedRepositories call afterward. This creates a regression where updating code platform integrations syncs to git.repositories but no longer syncs to public.repositories, breaking the unified repository table consistency.

const gitIntegration = await IntegrationRepository.findByPlatform(
PlatformType.GIT,
segmentOptions,
)

Integration destroy fails without Git integration

The call to IntegrationRepository.findByPlatform at line 433 lacks error handling. The original code wrapped this in a try-catch with shouldUpdateGit fallback to gracefully handle missing Git integrations. The new code assumes a Git integration always exists for GitHub/GitLab integrations. If a Git integration is missing (due to partial creation failures, manual deletion, or legacy data), findByPlatform throws an error, causing the entire destroy operation to fail and roll back instead of proceeding with cleanup.
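
The fallback described here could be restored along these lines. This is a synchronous sketch under assumptions: resolveGitIntegration and its return shape are hypothetical, and the lookup is passed in as a callback instead of calling IntegrationRepository.findByPlatform directly.

```typescript
interface GitLookupResult {
  gitIntegrationId: string | null
  shouldUpdateGit: boolean
}

// Wrap the lookup so a missing Git integration disables the Git cleanup
// step instead of aborting the whole destroy operation.
function resolveGitIntegration(findGitIntegration: () => { id: string }): GitLookupResult {
  try {
    const gitIntegration = findGitIntegration()
    return { gitIntegrationId: gitIntegration.id, shouldUpdateGit: true }
  } catch (err) {
    // Assumption: any lookup failure is treated as "no Git integration".
    return { gitIntegrationId: null, shouldUpdateGit: false }
  }
}
```

A stricter variant would rethrow anything other than a not-found error so that unrelated database failures still surface.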

SELECT id, url FROM gitlab_repos
UNION ALL
SELECT id, url FROM git_repos
) combined

DISTINCT ON query without ORDER BY is non-deterministic

The SQL query in syncRepositoriesToGitV2 uses DISTINCT ON (url) without a corresponding ORDER BY clause. When the same URL exists in multiple source tables (github_repos, gitlab_repos, git_repos) with potentially different IDs, PostgreSQL will return an arbitrary row. This could lead to inconsistent repository ID selection across different executions, though in practice this may only manifest if there's existing data inconsistency between the tables.
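
In PostgreSQL the usual remedy is an ORDER BY whose leading expression is url, followed by a tiebreaker that decides which source wins. The TypeScript sketch below emulates that selection in memory to make the rule explicit. The priority order (github, then gitlab, then git) and the names pickRepoIdPerUrl and SOURCE_PRIORITY are assumptions for illustration, not taken from the PR.

```typescript
type Source = 'github' | 'gitlab' | 'git'

interface SourceRepo {
  id: string
  url: string
  source: Source
}

// Assumed tiebreak: lower number wins when a URL appears in several sources.
const SOURCE_PRIORITY: Record<Source, number> = { github: 0, gitlab: 1, git: 2 }

// Pick exactly one id per URL, independent of the input row order.
function pickRepoIdPerUrl(rows: SourceRepo[]): Map<string, string> {
  const sorted = rows.slice().sort((a, b) => {
    const byPriority = SOURCE_PRIORITY[a.source] - SOURCE_PRIORITY[b.source]
    if (byPriority !== 0) return byPriority
    // Final tiebreak on id keeps the result stable within one source.
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0
  })

  const idByUrl = new Map<string, string>()
  for (const row of sorted) {
    if (!idByUrl.has(row.url)) {
      idByUrl.set(row.url, row.id)
    }
  }
  return idByUrl
}
```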

@mbani01 mbani01 force-pushed the feat/integrationService_sync_to_repositories branch from 8f348bc to be61340 Compare January 6, 2026 10:20
@mbani01 mbani01 changed the title from "feat: integration service sync to repositories" to "feat: integration service sync to repositories [CM-865]" Jan 6, 2026
@mbani01 mbani01 force-pushed the feat/integrationService_sync_to_repositories branch from 644a90c to 442550f Compare January 6, 2026 11:05
@mbani01 mbani01 self-assigned this Jan 6, 2026
@mbani01 mbani01 removed the request for review from themarolt January 6, 2026 11:39