Using Perforce depots with Sourcegraph

Sourcegraph supports Perforce Helix depots using p4-fusion. This creates an equivalent Git repository from a Perforce depot, which can then be indexed by Sourcegraph.

Add a Perforce code host connection

Perforce depots can be added to a Sourcegraph instance by adding the appropriate code host connection.

To enable Perforce code host connections, a site admin must:

  1. Go to Site admin > Manage code hosts > Add code host.

  2. Scroll down the list of supported code hosts and select Perforce.

  3. Configure which depots are mirrored/synchronized as Git repositories to Sourcegraph:

    • depots

      A list of depot paths that can be either a depot root or an arbitrary subdirectory. Note: Only "local" type depots are supported.

    • p4.user

      The user to authenticate with the p4 CLI. This user must be able to run:

      • p4 login
      • p4 trust
      • and any p4 commands involved with git p4 clone and git p4 sync for listed depots.

      If repository permissions are mirrored, the user also needs the "super" access level to run the following commands:

      • p4 protects
      • p4 groups
      • p4 group
      • p4 users

    • p4.passwd

      The ticket used to authenticate p4.user. We recommend placing the user in a group whose tickets never expire. Run p4 -u <p4.user> login -p -a to obtain a ticket value.

    • See the configuration documentation below for other fields you can configure.

  4. Configure fusionClient:

    JSON
    { "fusionClient": { "enabled": true, "lookAhead": 2000 } }

    NOTE: The fusionClient configuration is optional, but without it the code host connection uses git p4, which has performance issues. We strongly recommend p4-fusion.

  5. Click Add repositories.

Sourcegraph will now talk to the Perforce host and sync the configured depots to the Sourcegraph instance.
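
The ticket for p4.passwd can also be fetched with a small script. The following is a minimal Python sketch around the p4 -u <p4.user> login -p -a command mentioned above. The function names are hypothetical, and the assumption that the ticket is the last non-empty line of output is illustrative only; verify it against your own p4 client before relying on it.

```python
import subprocess


def ticket_command(user, port=None):
    """Build the argv for `p4 -u <user> login -p -a`.

    `port` is optional and maps to p4's global `-p` option (P4PORT).
    """
    argv = ["p4"]
    if port:
        argv += ["-p", port]
    argv += ["-u", user, "login", "-p", "-a"]
    return argv


def fetch_ticket(user, password, port=None):
    """Run the login command and return the printed ticket.

    Assumption: the password is read from stdin and the ticket is the
    last non-empty line of stdout.
    """
    result = subprocess.run(
        ticket_command(user, port),
        input=password + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("p4 login printed no ticket")
    return lines[-1]
```

The returned value would then be pasted into the p4.passwd field of the code host connection.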

Keep the following in mind about this process:

  • When syncing depots, p4-fusion converts Perforce depots into Git repositories so that Sourcegraph can index them.
  • Renaming a Perforce depot, whether by changing the depot on the Perforce server or by changing the repositoryPathPattern config option, causes the depot to be re-imported.
  • Unless permissions syncing is enabled, Sourcegraph is not aware of the depot permissions, so it cannot enforce access restrictions.

Perforce labels

Perforce labels are converted to Git tags, but only under the following conditions:

  • The depot is fully contained within one of the label's views (e.g. the depot is located at //path/to/depot/... and the label's view is //path/to/depot/...).
  • The label has a Revision field that matches a single revision (e.g. @4521).

Perforce label names are more flexible than Git tag names, so incompatible characters are replaced with underscores (e.g. v1:2:3 becomes v1_2_3).

This behaviour can be disabled by setting noConvertLabels to true in the fusion client configuration.
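
The conversion rules above can be sketched in Python as follows. This is an illustration, not the p4-fusion implementation: the function names are hypothetical, the set of replaced characters is assumed to be the characters Git rejects in ref names, and a "single revision" is assumed to be a changelist specifier of the form @<number>.

```python
import re

# Characters that Git rejects in ref names (see `git check-ref-format`):
# ASCII control characters, space, ~ ^ : ? * [ and backslash.
# Assumption: the real converter may replace a different set.
_INVALID_TAG_CHARS = re.compile(r"[\x00-\x20\x7f~^:?*\[\\]")

# Assumption: a "single revision" is a changelist specifier such as "@4521".
_SINGLE_REVISION = re.compile(r"^@\d+$")


def view_contains(view, depot_path):
    """True if a label view such as //path/to/depot/... covers depot_path."""
    if not view.endswith("/..."):
        return False
    prefix = view[: -len("...")]  # keep the trailing slash
    return (depot_path.rstrip("/") + "/").startswith(prefix)


def label_to_tag(name, views, revision, depot_path):
    """Return the Git tag name for a label, or None if it is not converted."""
    if not _SINGLE_REVISION.match(revision or ""):
        return None
    if not any(view_contains(view, depot_path) for view in views):
        return None
    return _INVALID_TAG_CHARS.sub("_", name)
```

For example, label_to_tag("v1:2:3", ["//path/to/depot/..."], "@4521", "//path/to/depot/") yields v1_2_3, while a label whose views do not cover the depot yields None.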

Repository permissions (Beta)

To enforce file-level permissions for Perforce depots using the Perforce protects file, include the authorization field in the configuration of the Perforce code host connection you created above:

JSON
{ "authorization": {} }

Adding the authorization field to the code host connection configuration will enable partial parsing of the protects file. Learn more about the partial support of protects file parsing.

Syncing subdirectories to match permission boundaries

By default, Sourcegraph only supports repository-level permissions and does not match the granularity of the Perforce protects file.

If you don't enable file-level permissions, sync subdirectories of a depot using the depots configuration so that each entry is the most specific path of a permissions boundary.

For example, if your Perforce depot //depot/Talkhouse has different permissions for //depot/Talkhouse/main-dev and for the subdirectories //depot/Talkhouse/rel1.0/front and //depot/Talkhouse/rel1.0/back, we recommend setting the following depots:

JSON
{ "depots": [ "//depot/Talkhouse/main-dev/", "//depot/Talkhouse/rel1.0/front/", "//depot/Talkhouse/rel1.0/back/" ] }

By configuring each subdirectory that has unique permissions, Sourcegraph is able to recognize and enforce permissions for the subdirectories. You cannot define the depots like this:

JSON
{ "depots": [ "//depot/Talkhouse/main-dev/", "//depot/Talkhouse/rel1.0/", "//depot/Talkhouse/rel1.0/back/" ] }

A configuration like this would override the permissions for the //depot/Talkhouse/rel1.0/back depot, because that depot is nested inside //depot/Talkhouse/rel1.0/.
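
A quick way to catch nested entries before saving the configuration is to check whether any depot path is a prefix of another. The helper below is a hypothetical Python sketch; it assumes every path ends with a slash, as the depots examples in this document do.

```python
def overlapping_depots(depots):
    """Return (parent, child) pairs where one depot path contains another.

    Assumes each path ends with "/", so a prefix match always falls on a
    directory boundary.
    """
    pairs = []
    for parent in depots:
        for child in depots:
            if parent != child and child.startswith(parent):
                pairs.append((parent, child))
    return pairs
```

Run against the second list above, it reports the pair ("//depot/Talkhouse/rel1.0/", "//depot/Talkhouse/rel1.0/back/"); the recommended list produces no pairs.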

Wildcards

File-level permissions can handle wildcards in the protects file. If file-level permissions are not enabled, Sourcegraph provides only limited support for * and ... paths, so you need to work around this by adding the sub-folders for paths that use wildcards as separate repositories.

File-level permissions

File-level permissions eliminate the need for syncing subdirectories to match permission boundaries.

To enable file-level permissions:

  1. Add the following entry to your site configuration file:

    JSON
    { "experimentalFeatures": { "subRepoPermissions": { "enabled": true } } }

  2. Enable the feature in the code host configuration by adding subRepoPermissions to the authorization object:

    JSON
    { "authorization": { "subRepoPermissions": true } }

  3. Save the configuration.

Permissions will be synced in the background based on your Perforce protects file.

Handling IP-based rules

Perforce's protects table allows administrators to define fine-grained access controls based on user identities and host IP addresses. By default, Sourcegraph applies all rules from the protects table without considering host-specific restrictions, effectively treating all host rules as the wildcard *. This behavior can lead to users having unintended access to repositories or files that should be restricted based on their IP addresses.

If your Perforce environment relies heavily on host-based permissions, it's crucial to configure Sourcegraph appropriately to respect these restrictions. This documentation provides detailed instructions on how to enforce or ignore host rules in Sourcegraph when integrating with Perforce.

Default Behavior

By default, Sourcegraph:

  • Applies all rules in the Perforce protects table.
  • Ignores host-specific restrictions, treating all host fields as *.

Implication: Users may gain access to resources that should be restricted based on their IP addresses.

Configuration Options

To ensure Sourcegraph handles host rules according to your requirements, you have two additional options:

  1. Enforce Host Rules: Configure Sourcegraph to respect and enforce IP-based restrictions defined in the protects table.
  2. Ignore Host-Specific Rules: Configure Sourcegraph to disregard any rules with a host value other than *.

Enforcing host rules

If you want Sourcegraph to enforce host-specific permissions, you need to enable IP restriction enforcement in your site configuration:

JSON
{ "experimentalFeatures": { "subRepoPermissions": { "enabled": true, "enforceIPRestrictions": true } } }

When enforceIPRestrictions is set to true, Sourcegraph will use the user's IP address to apply Perforce permissions at the user level. It uses the final X-Forwarded-For header in the request to identify the user's IP. Note that this header can be easily spoofed, so ensure your load balancer or proxy handles X-Forwarded-For headers securely.

Ignore rules with host

To ignore rules that have a host value other than *, set ignoreRulesWithHost to true in your code host configuration:

JSON
{ "authorization": { "subRepoPermissions": true, "ignoreRulesWithHost": true } }

With this setting, Sourcegraph will ignore any rules with a host other than *, treating them as if they do not exist.
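
The effect of ignoreRulesWithHost can be pictured as a filter over the protections table. The Python sketch below is illustrative only and is not Sourcegraph's implementation; the ProtectRule type and function names are hypothetical, and each line is assumed to follow the protections-table layout of access level, user or group, name, host, and path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectRule:
    level: str  # e.g. "read" or "write"
    kind: str   # "user" or "group"
    name: str
    host: str   # "*" or an IP/host pattern
    path: str


def parse_rule(line):
    """Parse one protections-table line: level, user|group, name, host, path."""
    level, kind, name, host, path = line.split(None, 4)
    return ProtectRule(level, kind, name, host, path)


def drop_host_rules(rules):
    """Keep only rules whose host field is the wildcard `*`."""
    return [rule for rule in rules if rule.host == "*"]
```

Given the rules "read user alice * //depot/..." and "write group dev 192.168.1.* //depot/secret/...", only the first one survives the filter.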

Notes about permissions

  • Sourcegraph users are mapped to Perforce users based on their verified email addresses.
  • As long as a user has been granted at least Read permissions in Perforce they will be able to view content in Sourcegraph.
  • As a special case, commits in which a user does not have permissions to read any files are hidden. If a user can read a subset of files in a commit, only those files are shown.
  • File-level permissions must be disabled for Batch Changes to work.
  • Setting authz.enforceForSiteAdmins to true in the site configuration enforces permissions for admin users. They may not be able to see repositories and their contents if the email on their Sourcegraph user account does not match their email on the Perforce server.
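
The commit visibility rule in the notes above can be expressed as a small filter. This is a hypothetical Python sketch: visible_files and the can_read callback are illustrative names, not Sourcegraph APIs.

```python
def visible_files(commit_files, can_read):
    """Return the files of a commit the user may see, or None if the commit is hidden.

    `can_read` is a caller-supplied predicate (assumed here) that reports
    whether the user has at least read access to a path.
    """
    readable = [path for path in commit_files if can_read(path)]
    return readable or None
```

A commit touching src/a.go and secret/b.go shows only src/a.go to a user who can read src/ alone, and is hidden entirely from a user who can read neither path.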

Configuration

admin/code_hosts/perforce.schema.json

JSON
{ "$id": "perforce.schema.json#", "$schema": "http://json-schema.org/draft-07/schema#", "additionalProperties": false, "allowComments": true, "description": "Configuration for a connection to Perforce Server.", "properties": { "authorization": { "description": "If non-null, enforces Perforce depot permissions.", "properties": { "ignoreRulesWithHost": { "default": false, "description": "Ignore host-based protection rules (any rule with something other than a wildcard in the Host field).", "type": "boolean" }, "subRepoPermissions": { "default": false, "description": "Experimental: infer sub-repository permissions from protection rules.", "type": "boolean" } }, "title": "PerforceAuthorization", "type": "object" }, "depots": { "description": "Depots can have arbitrary paths, e.g. a path to depot root or a subdirectory.", "examples": [ [ "//Sourcegraph/", "//Engineering/Cloud/" ] ], "items": { "pattern": "^\\/[\\/\\S]+\\/$", "type": "string" }, "type": "array" }, "fusionClient": { "additionalProperties": false, "description": "Configuration for the experimental p4-fusion client", "properties": { "cacheLabels": { "default": false, "description": "Whether to cache Perforce labels on disk to avoid unnecessary roundtrips to the Perforce server.", "type": "boolean" }, "enabled": { "default": false, "description": "DEPRECATED. p4-fusion is always enabled.", "type": "boolean" }, "fsyncEnable": { "default": false, "description": "Enable fsync() while writing objects to disk to ensure they get written to permanent storage immediately instead of being cached. This is to mitigate data loss in events of hardware failure.", "type": "boolean" }, "includeBinaries": { "default": false, "description": "Whether to include binary files", "type": "boolean" }, "lookAhead": { "default": 2000, "description": "How many CLs in the future, at most, shall we keep downloaded by the time it is to commit them", "minimum": 1, "type": "integer" }, "maxChanges": { "default": -1, "description": "How many changes to fetch during initial clone. The default of -1 will fetch all known changes", "type": "integer" }, "networkThreads": { "default": 12, "description": "The number of threads in the threadpool for running network calls. Defaults to the number of logical CPUs.", "minimum": 1, "type": "integer" }, "networkThreadsFetch": { "default": 12, "description": "The number of threads in the threadpool for running network calls when performing fetches. Defaults to the number of logical CPUs.", "minimum": 1, "type": "integer" }, "noConvertLabels": { "default": false, "description": "Disable Perforce label to git tag conversion.", "type": "boolean" }, "printBatch": { "default": 100, "description": "The p4 print batch size", "minimum": 1, "type": "integer" }, "refresh": { "default": 1000, "description": "How many times a connection should be reused before it is refreshed", "minimum": 1, "type": "integer" }, "retries": { "default": 10, "description": "How many times a command should be retried before the process exits in a failure", "minimum": 1, "type": "integer" } }, "type": "object" }, "p4.client": { "description": "Client specified as an option for p4 CLI (P4CLIENT, also enables '--use-client-spec')", "type": "string" }, "p4.passwd": { "description": "The ticket value for the user (P4PASSWD). You can get this by running `p4 login -p` or `p4 login -pa`. It should look like `6211C5E719EDE6925855039E8F5CC3D2`.", "type": "string" }, "p4.port": { "description": "The Perforce Server address to be used for p4 CLI (P4PORT). It's recommended to specify the protocol prefix (e.g. tcp: or ssl:) as part of the address.", "examples": [ "ssl:111.222.333.444:1666", "tcp:111.222.333.444:1666" ], "type": "string" }, "p4.user": { "description": "The user to be authenticated for p4 CLI (P4USER).", "examples": [ "admin" ], "type": "string" }, "repositoryPathPattern": { "default": "{depot}", "description": "The pattern used to generate the corresponding Sourcegraph repository name for a Perforce depot. In the pattern, the variable \"{depot}\" is replaced with the Perforce depot's path.\n\nFor example, if your Perforce depot path is \"//Sourcegraph/\" and your Sourcegraph URL is https://src.example.com, then a repositoryPathPattern of \"perforce/{depot}\" would mean that the Perforce depot is available on Sourcegraph at https://src.example.com/perforce/Sourcegraph.\n\nIt is important that the Sourcegraph repository name generated with this pattern be unique to this Perforce Server. If different Perforce Servers generate repository names that collide, Sourcegraph's behavior is undefined.", "type": "string" } }, "required": [ "p4.port", "p4.user", "p4.passwd" ], "title": "PerforceConnection", "type": "object" }
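
Two constraints from the schema are easy to check before saving a connection: the required fields and the pattern for depots entries. The following Python sketch does only that and is not a full JSON Schema validator; check_connection is a hypothetical helper name, while the regular expression and the required keys are copied from the schema above.

```python
import re

# Pattern copied from the "depots" items in the schema.
DEPOT_PATTERN = re.compile(r"^\/[\/\S]+\/$")
# Keys listed under "required" in the schema.
REQUIRED_KEYS = ("p4.port", "p4.user", "p4.passwd")


def check_connection(config):
    """Return a list of problems found in a Perforce code host config dict."""
    problems = [
        f"missing required field: {key}"
        for key in REQUIRED_KEYS
        if key not in config
    ]
    for depot in config.get("depots", []):
        if not DEPOT_PATTERN.match(depot):
            problems.append(f"invalid depot path: {depot}")
    return problems
```

A depot written as //Sourcegraph without the trailing slash is reported as invalid, whereas //Sourcegraph/ passes.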

Configuration Notes

  • p4-fusion Recommended: Use the fusionClient configuration for better performance compared to git p4.
  • Depot Types: Only "local" type depots are supported for synchronization.
  • User Permissions: The p4.user must have appropriate permissions for basic operations, and "super" access level if repository permissions are mirrored.
  • Ticket Authentication: Use p4 -u <p4.user> login -p -a to obtain ticket values for p4.passwd.
  • Repository Path Pattern: The {depot} variable in repositoryPathPattern is replaced with the Perforce depot path.
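
The repositoryPathPattern expansion can be illustrated with a one-line helper. This Python sketch is an approximation: repo_name is a hypothetical name, and trimming the surrounding slashes from the depot path is an assumption inferred from the schema's example, where //Sourcegraph/ with perforce/{depot} becomes perforce/Sourcegraph.

```python
def repo_name(pattern, depot):
    """Expand {depot} in a repositoryPathPattern.

    Assumption: leading and trailing slashes are trimmed from the depot
    path, matching the schema's example.
    """
    return pattern.replace("{depot}", depot.strip("/"))
```

With the default pattern {depot}, the depot //Engineering/Cloud/ would map to the repository name Engineering/Cloud under this assumption.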

Security Considerations

  • Ticket Management: Store Perforce tickets securely and create tickets that never expire for service accounts.
  • User Permissions: Grant minimal required permissions to the Perforce user account used by Sourcegraph.
  • SSL Connections: Use SSL connections (ssl: prefix) for Perforce server connections when possible.
  • IP Restrictions: Configure IP-based rules carefully when using file-level permissions with enforceIPRestrictions.
  • Permission Boundaries: Align depot configurations with permission boundaries to maintain security.

Common Examples

Basic Perforce Configuration

JSON
{ "p4.port": "ssl:perforce.example.com:1666", "p4.user": "sourcegraph-service", "p4.passwd": "6211C5E719EDE6925855039E8F5CC3D2", "depots": [ "//Sourcegraph/", "//Tools/" ], "fusionClient": { "enabled": true, "lookAhead": 2000 } }

Configuration with Permissions

JSON
{ "p4.port": "ssl:perforce.example.com:1666", "p4.user": "admin", "p4.passwd": "6211C5E719EDE6925855039E8F5CC3D2", "depots": [ "//Depot/Frontend/", "//Depot/Backend/", "//Depot/Common/" ], "authorization": { "subRepoPermissions": true }, "repositoryPathPattern": "perforce/{depot}" }

Subdirectory Permission Boundaries

JSON
{ "p4.port": "tcp:perforce.example.com:1666", "p4.user": "readonly", "p4.passwd": "ABCDEF1234567890", "depots": [ "//Project/main-dev/", "//Project/release/frontend/", "//Project/release/backend/" ], "fusionClient": { "maxChanges": 5000 } }

Best Practices

  • Use SSL Connections: Always use SSL (ssl: prefix) for production Perforce connections.
  • Enable p4-fusion: Configure fusionClient with appropriate lookAhead values for better sync performance.
  • Plan Permission Boundaries: Design depot configurations around permission boundaries to avoid complex sub-repository permissions.
  • Monitor Changelist Mapping: When using experimental changelist ID features, monitor background mapping jobs.
  • Regular Ticket Rotation: Establish procedures for rotating Perforce tickets without service interruption.
  • Test Permissions: Verify that repository permissions work correctly after configuration changes.

Batch Changes does not support repositories that use sub-repo permissions. To use Batch Changes with Perforce depots, the code host connection must not enable file-level permissions.

When a batch change is published, it is sent as a shelved changelist to the Perforce server configured in the code host connection. The changelist ID is displayed in the UI so that the user can manage the shelved changelist.