Commit f90db80

docs and minor tweaks
1 parent 0e527f4 commit f90db80

18 files changed: 196 additions and 39 deletions
docs/docs.json

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
 "docs/features/code-navigation",
 "docs/features/analytics",
 "docs/features/mcp-server",
+"docs/features/permission-syncing",
 {
   "group": "Agents",
   "tag": "experimental",

docs/docs/configuration/config-file.mdx

Lines changed: 16 additions & 14 deletions
@@ -33,17 +33,19 @@ Sourcebot syncs the config file on startup, and automatically whenever a change

 The following are settings that can be provided in your config file to modify Sourcebot's behavior

-| Setting | Type | Default | Minimum | Description / Notes |
-|-------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
-| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
-| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
-| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
-| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
-| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
-| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
-| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
-| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
-| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
-| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
-| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
-| `enablePublicAccess` **(deprecated)** | boolean | false || Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
+| Setting | Type | Default | Minimum | Description / Notes |
+|-------------------------------------------------|---------|------------|---------|----------------------------------------------------------------------------------------|
+| `maxFileSize` | number | 2 MB | 1 | Maximum size (bytes) of a file to index. Files exceeding this are skipped. |
+| `maxTrigramCount` | number | 20 000 | 1 | Maximum trigrams per document. Larger files are skipped. |
+| `reindexIntervalMs` | number | 1 hour | 1 | Interval at which all repositories are re‑indexed. |
+| `resyncConnectionIntervalMs` | number | 24 hours | 1 | Interval for checking connections that need re‑syncing. |
+| `resyncConnectionPollingIntervalMs` | number | 1 second | 1 | DB polling rate for connections that need re‑syncing. |
+| `reindexRepoPollingIntervalMs` | number | 1 second | 1 | DB polling rate for repos that should be re‑indexed. |
+| `maxConnectionSyncJobConcurrency` | number | 8 | 1 | Concurrent connection‑sync jobs. |
+| `maxRepoIndexingJobConcurrency` | number | 8 | 1 | Concurrent repo‑indexing jobs. |
+| `maxRepoGarbageCollectionJobConcurrency` | number | 8 | 1 | Concurrent repo‑garbage‑collection jobs. |
+| `repoGarbageCollectionGracePeriodMs` | number | 10 seconds | 1 | Grace period to avoid deleting shards while loading. |
+| `repoIndexTimeoutMs` | number | 2 hours | 1 | Timeout for a single repo‑indexing run. |
+| `enablePublicAccess` **(deprecated)** | boolean | false || Use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead. |
+| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the repo permission syncer should run. |
+| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 | Interval at which the user permission syncer should run. |

docs/docs/configuration/environment-variables.mdx

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ The following environment variables allow you to configure your Sourcebot deploy
 | `AUTH_EE_OKTA_ISSUER` | `-` | <p>The issuer URL for Okta SSO authentication.</p> |
 | `AUTH_EE_GCP_IAP_ENABLED` | `false` | <p>When enabled, allows Sourcebot to automatically register/login from a successful GCP IAP redirect</p> |
 | `AUTH_EE_GCP_IAP_AUDIENCE` | - | <p>The GCP IAP audience to use when verifying JWT tokens. Must be set to enable GCP IAP JIT provisioning</p> |
+| `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` | `false` | <p>Enables [permission syncing](/docs/features/permission-syncing).</p> |


 ### Review Agent Environment Variables

docs/docs/connections/github.mdx

Lines changed: 5 additions & 1 deletion
@@ -196,4 +196,8 @@ To connect to a GitHub host other than `github.com`, provide the `url` property

 <GitHubSchema />

-</Accordion>
+</Accordion>
+
+## See also
+
+- [Syncing GitHub access permissions to Sourcebot](/docs/features/permission-syncing#github)

docs/docs/features/agents/overview.mdx

Lines changed: 3 additions & 3 deletions
@@ -3,9 +3,9 @@ title: "Agents Overview"
 sidebarTitle: "Overview"
 ---

-<Warning>
-Agents are currently an experimental feature. Have an idea for an agent that we haven't built? Submit a [feature request](https://github.com/sourcebot-dev/sourcebot/issues/new?template=feature_request.md) on our GitHub.
-</Warning>
+import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'
+
+<ExperimentalFeatureWarning />

 Agents are automations that leverage the code indexed on Sourcebot to perform a specific task. Once you've set up Sourcebot, check out the
 guides below to configure additional agents.
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+---
+title: "Permission syncing"
+sidebarTitle: "Permission syncing"
+tag: "experimental"
+---
+
+import LicenseKeyRequired from '/snippets/license-key-required.mdx'
+import ExperimentalFeatureWarning from '/snippets/experimental-feature-warning.mdx'
+
+<LicenseKeyRequired />
+<ExperimentalFeatureWarning />
+
+# Overview
+
+Permission syncing allows you to sync Access Control Lists (ACLs) from a code host to Sourcebot. When configured, users signed into Sourcebot (via the code host's OAuth provider) will only be able to access repositories that they have access to on the code host. Practically, this means:
+
+- Code Search results will only include repositories that the user has access to.
+- Code navigation results will only include repositories that the user has access to.
+- Ask Sourcebot (and the underlying LLM) will only have access to repositories that the user has access to.
+- File browsing is scoped to the repositories that the user has access to.
+
+Permission syncing can be enabled by setting the `EXPERIMENT_EE_PERMISSION_SYNC_ENABLED` environment variable to `true`.
+
+```bash
+docker run \
+    -e EXPERIMENT_EE_PERMISSION_SYNC_ENABLED=true \
+    /* additional args */ \
+    ghcr.io/sourcebot-dev/sourcebot:latest
+```
+
+## Platform support
+
+We are actively working on supporting more code hosts. If you'd like to see a specific code host supported, please [reach out](https://www.sourcebot.dev/contact).
+
+| Platform | Permission syncing |
+|:----------|------------------------------|
+| [GitHub (GHEC & GHES)](/docs/features/permission-syncing#github) | ✅ |
+| GitLab | 🛑 |
+| Bitbucket Cloud | 🛑 |
+| Bitbucket Data Center | 🛑 |
+| Gitea | 🛑 |
+| Gerrit | 🛑 |
+| Generic git host | 🛑 |
+
+# Getting started
+
+## GitHub
+
+Prerequisite: [Add GitHub as an OAuth provider](/docs/configuration/auth/providers#github).
+
+Permission syncing works with **github.com**, **GitHub Enterprise Cloud**, and **GitHub Enterprise Server**. For organization-owned repositories, users that have **read-only** access (or above) via the following methods will have their access synced to Sourcebot:
+- Outside collaborators
+- Organization members that are direct collaborators
+- Organization members with access through team memberships
+- Organization members with access through default organization permissions
+- Organization owners
+
+**Notes:**
+- A GitHub OAuth provider must be configured to (1) correlate a Sourcebot user with a GitHub user, and (2) list repositories that the user has access to for [User driven syncing](/docs/features/permission-syncing#how-it-works).
+- OAuth tokens must have the `repo` scope in order to use the [List repositories for the authenticated user API](https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-the-authenticated-user) during [User driven syncing](/docs/features/permission-syncing#how-it-works). Sourcebot **will only** use this token for **reads**.
+
+# How it works
+
+Permission syncing works by periodically syncing ACLs from the code host(s) to Sourcebot to build an internal mapping between Users and Repositories. This mapping is hydrated in two directions:
+- **User driven**: fetches the list of all repositories that a given user has access to.
+- **Repo driven**: fetches the list of all users that have access to a given repository.
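
The two sync directions described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `PermissionMap`, `applyRepoSync`, `applyUserSync`, `canAccess`, and `filterVisible` are hypothetical names, not Sourcebot's API, and the real mapping is stored in Sourcebot's database rather than in memory. The sketch also assumes each sync replaces the previous result for that user or repository, which the docs do not state.

```typescript
// Hypothetical in-memory model of the user <-> repository mapping.
// All names here are illustrative assumptions, not Sourcebot's implementation.
type UserId = string;
type RepoId = number;

class PermissionMap {
    private usersByRepo = new Map<RepoId, Set<UserId>>();

    // Repo driven: record every user that can read one repository.
    applyRepoSync(repoId: RepoId, userIds: UserId[]): void {
        this.usersByRepo.set(repoId, new Set(userIds));
    }

    // User driven: record every repository one user can read.
    applyUserSync(userId: UserId, repoIds: RepoId[]): void {
        const allowed = new Set(repoIds);
        // Assumption: a repository missing from the new list means access was revoked.
        this.usersByRepo.forEach((users, repoId) => {
            if (!allowed.has(repoId)) {
                users.delete(userId);
            }
        });
        allowed.forEach((repoId) => {
            const users = this.usersByRepo.get(repoId) ?? new Set<UserId>();
            users.add(userId);
            this.usersByRepo.set(repoId, users);
        });
    }

    canAccess(userId: UserId, repoId: RepoId): boolean {
        return this.usersByRepo.get(repoId)?.has(userId) ?? false;
    }

    // Scope a list of repositories (for example, search hits) to what the user can read.
    filterVisible(userId: UserId, repoIds: RepoId[]): RepoId[] {
        return repoIds.filter((repoId) => this.canAccess(userId, repoId));
    }
}

// Usage: one repo driven sync followed by one user driven sync.
const permissions = new PermissionMap();
permissions.applyRepoSync(1, ["alice", "bob"]);
permissions.applyUserSync("alice", [2]);
const visibleToBob = permissions.filterVisible("bob", [1, 2]);
```

Hydrating from both directions means a change on the code host can be picked up by whichever syncer runs first.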
+
+User driven and repo driven syncing occur every 24 hours by default. These intervals can be configured using the following settings in the [config file](/docs/configuration/config-file):
+| Setting | Type | Default | Minimum |
+|-------------------------------------------------|---------|------------|---------|
+| `experiment_repoDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
+| `experiment_userDrivenPermissionSyncIntervalMs` | number | 24 hours | 1 |
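
Both settings are expressed in milliseconds. The sketch below shows how such an interval could decide whether a record is due for another sync, following the `thresholdDate` computation in `repoPermissionSyncer.ts` from this commit. `PermissionSyncSettings`, `defaultSyncSettings`, and `isDueForSync` are hypothetical names, and treating a never-synced record as due is an assumption.

```typescript
const HOUR_MS = 1000 * 60 * 60;

// Hypothetical subset of the settings object; the two keys match the config file.
interface PermissionSyncSettings {
    experiment_repoDrivenPermissionSyncIntervalMs: number;
    experiment_userDrivenPermissionSyncIntervalMs: number;
}

// Defaults as documented: 24 hours for both directions.
const defaultSyncSettings: PermissionSyncSettings = {
    experiment_repoDrivenPermissionSyncIntervalMs: 24 * HOUR_MS,
    experiment_userDrivenPermissionSyncIntervalMs: 24 * HOUR_MS,
};

// A record is due when its last sync is older than (now - interval).
// Assumption: a record that has never been synced is always due.
function isDueForSync(
    lastSyncedAt: Date | null,
    intervalMs: number,
    now: Date = new Date(),
): boolean {
    if (lastSyncedAt === null) {
        return true;
    }
    const thresholdDate = new Date(now.getTime() - intervalMs);
    return lastSyncedAt.getTime() < thresholdDate.getTime();
}

// Usage: override one interval to 6 hours and check a repo synced 12 hours ago.
const customSyncSettings: PermissionSyncSettings = {
    ...defaultSyncSettings,
    experiment_repoDrivenPermissionSyncIntervalMs: 6 * HOUR_MS,
};
const referenceNow = new Date("2024-01-02T00:00:00Z");
const syncedTwelveHoursAgo = new Date("2024-01-01T12:00:00Z");
const dueWithCustom = isDueForSync(
    syncedTwelveHoursAgo,
    customSyncSettings.experiment_repoDrivenPermissionSyncIntervalMs,
    referenceNow,
);
```

A shorter interval reduces the window in which Sourcebot's view of permissions can lag behind the code host, but it also means more API calls to the code host.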
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+
+<Warning>
+This is an experimental feature. Certain functionality may be buggy or incomplete, and breaking changes may ship in non-major releases. Have feedback? Submit an [issue](https://github.com/sourcebot-dev/sourcebot/issues) on GitHub.
+</Warning>

docs/snippets/schemas/v3/index.schema.mdx

Lines changed: 20 additions & 0 deletions
@@ -69,6 +69,16 @@
     "deprecated": true,
     "description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
     "default": false
+  },
+  "experiment_repoDrivenPermissionSyncIntervalMs": {
+    "type": "number",
+    "description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
+    "minimum": 1
+  },
+  "experiment_userDrivenPermissionSyncIntervalMs": {
+    "type": "number",
+    "description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
+    "minimum": 1
   }
 },
 "additionalProperties": false
@@ -195,6 +205,16 @@
     "deprecated": true,
     "description": "This setting is deprecated. Please use the `FORCE_ENABLE_ANONYMOUS_ACCESS` environment variable instead.",
     "default": false
+  },
+  "experiment_repoDrivenPermissionSyncIntervalMs": {
+    "type": "number",
+    "description": "The interval (in milliseconds) at which the repo permission syncer should run. Defaults to 24 hours.",
+    "minimum": 1
+  },
+  "experiment_userDrivenPermissionSyncIntervalMs": {
+    "type": "number",
+    "description": "The interval (in milliseconds) at which the user permission syncer should run. Defaults to 24 hours.",
+    "minimum": 1
   }
 },
 "additionalProperties": false

packages/backend/src/constants.ts

Lines changed: 3 additions & 1 deletion
@@ -15,7 +15,9 @@ export const DEFAULT_SETTINGS: Settings = {
     maxRepoGarbageCollectionJobConcurrency: 8,
     repoGarbageCollectionGracePeriodMs: 10 * 1000, // 10 seconds
     repoIndexTimeoutMs: 1000 * 60 * 60 * 2, // 2 hours
-    enablePublicAccess: false // deprecated, use FORCE_ENABLE_ANONYMOUS_ACCESS instead
+    enablePublicAccess: false, // deprecated, use FORCE_ENABLE_ANONYMOUS_ACCESS instead
+    experiment_repoDrivenPermissionSyncIntervalMs: 1000 * 60 * 60 * 24, // 24 hours
+    experiment_userDrivenPermissionSyncIntervalMs: 1000 * 60 * 60 * 24, // 24 hours
 }

 export const PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES = [

packages/backend/src/ee/repoPermissionSyncer.ts

Lines changed: 11 additions & 4 deletions
@@ -9,7 +9,7 @@ import { Job, Queue, Worker } from 'bullmq';
 import { Redis } from 'ioredis';
 import { env } from "../env.js";
 import { createOctokitFromConfig, getUserIdsWithReadAccessToRepo } from "../github.js";
-import { RepoWithConnections } from "../types.js";
+import { RepoWithConnections, Settings } from "../types.js";
 import { PERMISSION_SYNC_SUPPORTED_CODE_HOST_TYPES } from "../constants.js";
 import { hasEntitlement } from "@sourcebot/shared";

@@ -28,6 +28,7 @@ export class RepoPermissionSyncer {

     constructor(
         private db: PrismaClient,
+        private settings: Settings,
         redis: Redis,
     ) {
         this.queue = new Queue<RepoPermissionSyncJob>(QUEUE_NAME, {
@@ -50,7 +51,7 @@ export class RepoPermissionSyncer {

         return setInterval(async () => {
             // @todo: make this configurable
-            const thresholdDate = new Date(Date.now() - 1000 * 60 * 60 * 24);
+            const thresholdDate = new Date(Date.now() - this.settings.experiment_repoDrivenPermissionSyncIntervalMs);

             const repos = await this.db.repo.findMany({
                 // Repos need their permissions to be synced against the code host when...
@@ -166,8 +167,14 @@ export class RepoPermissionSyncer {
             const config = connection.config as unknown as GithubConnectionConfig;
             const { octokit } = await createOctokitFromConfig(config, repo.orgId, this.db);

-            // @nocheckin - need to handle when repo displayName is not set.
-            const [owner, repoName] = repo.displayName!.split('/');
+            // @note: this is a bit of a hack since the displayName _might_ not be set..
+            // however, this property was introduced many versions ago and _should_ be set
+            // on each connection sync. Let's throw an error just in case.
+            if (!repo.displayName) {
+                throw new Error(`Repo ${id} does not have a displayName`);
+            }
+
+            const [owner, repoName] = repo.displayName.split('/');

             const githubUserIds = await getUserIdsWithReadAccessToRepo(owner, repoName, octokit);
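
The hunk above guards against a missing `displayName` before splitting it into `owner` and `repoName`. As a rough sketch of that guard in isolation, the hypothetical helper below (`parseOwnerAndRepo` is not part of the commit) also rejects values that are not exactly `owner/repo`. That is stricter than the committed code, which only checks that `displayName` is set.

```typescript
// Hypothetical helper, not part of the commit: splits a GitHub display name
// of the form "owner/repo" and fails loudly on anything else.
function parseOwnerAndRepo(
    displayName: string | null | undefined,
): { owner: string; repoName: string } {
    if (!displayName) {
        throw new Error("Repo does not have a displayName");
    }

    const parts = displayName.split("/");
    const owner = parts[0];
    const repoName = parts[1];

    // Stricter than the committed code: require exactly two non-empty segments.
    if (parts.length !== 2 || !owner || !repoName) {
        throw new Error(`Unexpected displayName format: ${displayName}`);
    }

    return { owner, repoName };
}

// Usage
const parsedRepo = parseOwnerAndRepo("sourcebot-dev/sourcebot");
```

Failing early keeps a malformed name from reaching the GitHub API call as an `undefined` owner or repository.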