dulwich.repo module

Repository access.

This module contains the base class for git repositories (BaseRepo) and an implementation which uses a repository on local disk (Repo).

class dulwich.repo.BaseRepo(object_store: BaseObjectStore, refs: RefsContainer)

Bases: object

Base class for a git repository.

This base class is meant to be used for Repository implementations that e.g. work on top of a different transport than a standard filesystem path.

object_store

Dictionary-like object for accessing the objects

refs

Dictionary-like object with the refs in this repository

Open a repository.

This shouldn’t be called directly, but rather through one of the subclasses, such as MemoryRepo or Repo.

Parameters
  • object_store – Object store to use

  • refs – Refs container to use

do_commit(message: Optional[bytes] = None, committer: Optional[bytes] = None, author: Optional[bytes] = None, commit_timestamp=None, commit_timezone=None, author_timestamp=None, author_timezone=None, tree: Optional[bytes] = None, encoding: Optional[bytes] = None, ref: bytes = b'HEAD', merge_heads: Optional[List[bytes]] = None, no_verify: bool = False, sign: bool = False)

Create a new commit.

If not specified, committer and author default to get_user_identity(…, ‘COMMITTER’) and get_user_identity(…, ‘AUTHOR’) respectively.

Parameters
  • message – Commit message

  • committer – Committer fullname

  • author – Author fullname

  • commit_timestamp – Commit timestamp (defaults to now)

  • commit_timezone – Commit timestamp timezone (defaults to GMT)

  • author_timestamp – Author timestamp (defaults to commit timestamp)

  • author_timezone – Author timestamp timezone (defaults to commit timestamp timezone)

  • tree – SHA1 of the tree root to use (if not specified the current index will be committed).

  • encoding – Encoding

  • ref – Optional ref to commit to (defaults to current branch)

  • merge_heads – Merge heads (defaults to .git/MERGE_HEAD)

  • no_verify – Skip pre-commit and commit-msg hooks

  • sign – GPG-sign the commit. Defaults to False; pass True to use the default GPG key, or pass a str containing a key ID to use a specific GPG key.

Returns

New commit SHA1
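As a rough illustration of how staging and committing fit together, the sketch below creates a repository with Repo.init, stages one file with stage, and commits it with do_commit. The helper name commit_one_file, the file name hello.txt, and the example identity are hypothetical. The sketch assumes dulwich is installed, and the import sits inside the function so that defining the helper has no side effects.

```python
import os


def commit_one_file(repo_dir):
    """Create a repository in repo_dir, stage one file, and commit it."""
    # Imported here so that dulwich is only needed when the helper is called.
    from dulwich.repo import Repo

    repo = Repo.init(repo_dir)
    try:
        with open(os.path.join(repo_dir, "hello.txt"), "wb") as f:
            f.write(b"hello\n")
        # Paths passed to stage() are relative to the repository path.
        repo.stage(["hello.txt"])
        # Identities are passed explicitly so the result does not depend on
        # environment variables or on any git configuration.
        return repo.do_commit(
            b"Add hello.txt",
            committer=b"Example User <user@example.com>",
            author=b"Example User <user@example.com>",
        )
    finally:
        # Release any file handles held by the repository object.
        repo.close()
```

Calling commit_one_file on an empty directory should return the new commit's SHA1 as a bytestring.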

fetch(target, determine_wants=None, progress=None, depth=None)

Fetch objects into another repository.

Parameters
  • target – The target repository

  • determine_wants – Optional function to determine what refs to fetch.

  • progress – Optional progress function

  • depth – Optional shallow fetch depth

Returns: The local refs

fetch_objects(determine_wants, graph_walker, progress, get_tagged=None, depth=None)

Fetch the missing objects required for a set of revisions.

Parameters
  • determine_wants – Function that takes a dictionary with heads and returns the list of heads to fetch.

  • graph_walker – Object that can iterate over the list of revisions to fetch and has an “ack” method that will be called to acknowledge that a revision is present.

  • progress – Simple progress function that will be called with updated progress strings.

  • get_tagged – Function that returns a dict of pointed-to sha -> tag sha for including tags.

  • depth – Shallow fetch depth

Returns: iterator over objects, with __len__ implemented

fetch_pack_data(determine_wants, graph_walker, progress, get_tagged=None, depth=None)

Fetch the pack data required for a set of revisions.

Parameters
  • determine_wants – Function that takes a dictionary with heads and returns the list of heads to fetch.

  • graph_walker – Object that can iterate over the list of revisions to fetch and has an “ack” method that will be called to acknowledge that a revision is present.

  • progress – Simple progress function that will be called with updated progress strings.

  • get_tagged – Function that returns a dict of pointed-to sha -> tag sha for including tags.

  • depth – Shallow fetch depth

Returns: count and iterator over pack data

generate_pack_data(have: List[bytes], want: List[bytes], progress: Optional[Callable[[str], None]] = None, ofs_delta: Optional[bool] = None)

Generate pack data objects for a set of wants/haves.

Parameters
  • have – List of SHA1s of objects that should not be sent

  • want – List of SHA1s of objects that should be sent

  • ofs_delta – Whether OFS deltas can be included

  • progress – Optional progress reporting method

get_config() -> ConfigFile

Retrieve the config object.

Returns: ConfigFile object for the .git/config file.

get_config_stack() -> StackedConfig

Return a config stack for this repository.

This stack accesses the configuration for both this repository itself (.git/config) and the global configuration, which usually lives in ~/.gitconfig.

Returns: Config instance for this repository

get_description()

Retrieve the description for this repository.

Returns: String with the description of the repository as set by the user.

get_graph_walker(heads: Optional[List[bytes]] = None) -> ObjectStoreGraphWalker

Retrieve a graph walker.

A graph walker is used by a remote repository (or proxy) to find out which objects are present in this repository.

Parameters

heads – Repository heads to use (optional)

Returns: A graph walker object

get_named_file(path: str) -> Optional[BinaryIO]

Get a file from the control dir with a specific name.

Although the filename should be interpreted as a filename relative to the control dir in a disk-based Repo, the object returned need not be pointing to a file in that location.

Parameters

path – The path to the file, relative to the control dir.

Returns: An open file object, or None if the file does not exist.

get_object(sha: bytes) -> ShaFile

Retrieve the object with the specified SHA.

Parameters

sha – SHA to retrieve

Returns: A ShaFile object

Raises

KeyError – When the object cannot be found

get_parents(sha: bytes, commit: Optional[Commit] = None) -> List[bytes]

Retrieve the parents of a specific commit.

If the specific commit is a graftpoint, the graft parents will be returned instead.

Parameters
  • sha – SHA of the commit for which to retrieve the parents

  • commit – Optional commit matching the sha

Returns: List of parents

get_peeled(ref: bytes) -> bytes

Get the peeled value of a ref.

Parameters

ref – The refname to peel.

Returns: The fully-peeled SHA1 of a tag object, after peeling all intermediate tags; if the original ref does not point to a tag, this will equal the original SHA1.

get_refs() -> Dict[bytes, bytes]

Get dictionary with all refs.

Returns: A dict mapping ref names to SHA1s

get_shallow() -> Set[bytes]

Get the set of shallow commits.

Returns: Set of shallow commits.

get_walker(include: Optional[List[bytes]] = None, *args, **kwargs)

Obtain a walker for this repository.

Parameters
  • include – Iterable of SHAs of commits to include along with their ancestors. Defaults to [HEAD]

  • exclude – Iterable of SHAs of commits to exclude along with their ancestors, overriding includes.

  • order – ORDER_* constant specifying the order of results. Anything other than ORDER_DATE may result in O(n) memory usage.

  • reverse – If True, reverse the order of output, requiring O(n) memory.

  • max_entries – The maximum number of entries to yield, or None for no limit.

  • paths – Iterable of file or subtree paths to show entries for.

  • rename_detector – diff.RenameDetector object for detecting renames.

  • follow – If True, follow path across renames/copies. Forces a default rename_detector.

  • since – Timestamp to list commits after.

  • until – Timestamp to list commits before.

  • queue_cls – A class to use for a queue of commits, supporting the iterator protocol. The constructor takes a single argument, the Walker.

Returns: A Walker object
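A minimal sketch of using the walker to produce a log-style listing follows. The helper name log_messages is hypothetical. The sketch assumes that each entry yielded by the walker exposes the commit through a commit attribute, which comes from dulwich's walk module and is not described in this section.

```python
def log_messages(repo, limit=None):
    """Return the messages of commits reachable from HEAD, newest first."""
    # With no include argument the walk starts at HEAD; max_entries caps
    # the number of entries yielded (None means no limit).
    walker = repo.get_walker(max_entries=limit)
    return [entry.commit.message for entry in walker]
```

The repo argument can be any BaseRepo implementation, for example a Repo or a MemoryRepo that already has at least one commit on HEAD.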

head() -> bytes

Return the SHA1 pointed at by HEAD.

open_index() -> Index

Open the index for this repository.

Raises

NoIndexPresent – If no index is present

Returns: The matching Index

parents_provider() -> ParentsProvider

set_description(description)

Set the description for this repository.

Parameters

description – Text to set as description for this repository.

update_shallow(new_shallow, new_unshallow)

Update the list of shallow objects.

Parameters
  • new_shallow – Newly shallow objects

  • new_unshallow – Newly no longer shallow objects

exception dulwich.repo.InvalidUserIdentity(identity)

Bases: Exception

User identity is not of the format ‘user <email>’

class dulwich.repo.MemoryRepo

Bases: BaseRepo

Repo that stores refs, objects, and named files in memory.

MemoryRepos are always bare: they have no working tree and no index, since those have a stronger dependency on the filesystem.

Open a repository.

This shouldn’t be called directly, but rather through one of the subclasses, such as MemoryRepo or Repo.

Parameters
  • object_store – Object store to use

  • refs – Refs container to use

get_config()

Retrieve the config object.

Returns: ConfigFile object.

get_description()

Retrieve the description for this repository.

Returns: String with the description of the repository as set by the user.

get_named_file(path, basedir=None)

Get a file from the control dir with a specific name.

Although the filename should be interpreted as a filename relative to the control dir in a disk-based Repo, the object returned need not be pointing to a file in that location.

Parameters

path – The path to the file, relative to the control dir.

Returns: An open file object, or None if the file does not exist.

classmethod init_bare(objects, refs)

Create a new bare repository in memory.

Parameters
  • objects – Objects for the new repository, as iterable

  • refs – Refs as dictionary, mapping names to object SHA1s
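Because a MemoryRepo has no index, a commit has to name its tree explicitly. The sketch below shows one way this might look: it starts from an empty in-memory repository, adds a blob and a tree to the object store, and commits with the tree argument of do_commit. The helper name build_memory_repo and the example identity are hypothetical; Blob and Tree come from dulwich.objects, which is outside this section, and the sketch assumes dulwich is installed.

```python
def build_memory_repo():
    """Return (repo, commit_id) for a one-commit in-memory repository."""
    # Imported here so that dulwich is only needed when the helper is called.
    from dulwich.objects import Blob, Tree
    from dulwich.repo import MemoryRepo

    # No initial objects and no initial refs.
    repo = MemoryRepo.init_bare([], {})

    blob = Blob.from_string(b"hello\n")
    tree = Tree()
    tree.add(b"hello.txt", 0o100644, blob.id)
    repo.object_store.add_object(blob)
    repo.object_store.add_object(tree)

    # There is no index to commit from, so the tree SHA1 is passed directly.
    commit_id = repo.do_commit(
        b"Initial commit",
        committer=b"Example User <user@example.com>",
        author=b"Example User <user@example.com>",
        tree=tree.id,
    )
    return repo, commit_id
```

After the call, repo.head() should equal commit_id, and repo.get_object(commit_id) returns the stored commit.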

open_index()

Fail to open index for this repo, since it is bare.

Raises

NoIndexPresent – Raised when no index is present

set_description(description)

Set the description for this repository.

Parameters

description – Text to set as description for this repository.

class dulwich.repo.ParentsProvider(store, grafts={}, shallows=[])

Bases: object

get_parents(commit_id, commit=None)

class dulwich.repo.Repo(root: str, object_store: Optional[BaseObjectStore] = None, bare: Optional[bool] = None)

Bases: BaseRepo

A git repository backed by local disk.

To open an existing repository, call the constructor with the path of the repository.

To create a new repository, use the Repo.init class method.

Note that a repository object may hold on to resources such as file handles for performance reasons; call .close() to free up those resources.
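The open-and-close pattern described above might look like the following sketch, which opens an existing repository, reads two pieces of information, and always releases the repository's resources. The helper name describe_repo is hypothetical, and the sketch assumes dulwich is installed.

```python
def describe_repo(path):
    """Open the repository at path and report basic facts about it."""
    # Imported here so that dulwich is only needed when the helper is called.
    from dulwich.repo import Repo

    repo = Repo(path)
    try:
        return {
            "bare": repo.bare,
            "controldir": repo.controldir(),
        }
    finally:
        # Free file handles even if one of the calls above raises.
        repo.close()
```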

path

Path to the working copy (if it exists) or repository control directory (if the repository is bare)

Type

str

bare

Whether this is a bare repository

Type

bool

Open a repository.

This shouldn’t be called directly, but rather through one of the subclasses, such as MemoryRepo or Repo.

Parameters
  • object_store – Object store to use

  • refs – Refs container to use

bare: bool

clone(target_path, mkdir=True, bare=False, origin=b'origin', checkout=None, branch=None, progress=None, depth=None)

Clone this repository.

Parameters
  • target_path – Target path

  • mkdir – Create the target directory

  • bare – Whether to create a bare repository

  • checkout – Whether or not to check-out HEAD after cloning

  • origin – Base name for refs in target repository cloned from this repository

  • branch – Optional branch or tag to be used as HEAD in the new repository instead of this repository’s HEAD.

  • progress – Optional progress function

  • depth – Depth at which to fetch

Returns: Created repository as Repo

close()

Close any files opened by this repository.

commondir()

Return the path of the common directory.

For a main working tree, it is identical to controldir().

For a linked working tree, it is the control directory of the main working tree.

controldir()

Return the path of the control directory.

classmethod create(path, mkdir=False, object_store=None)

Create a new bare repository.

path should already exist and be an empty directory.

Parameters

path – Path to create bare repository in

Returns: a Repo instance

classmethod discover(start='.')

Iterate parent directories to discover a repository.

Return a Repo object for the first parent directory that looks like a Git repository.

Parameters

start – The directory to start discovery from (defaults to ‘.’)
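The upward search can be pictured with the standard-library sketch below. It is not dulwich's implementation: the helper name find_repo_root is hypothetical, and "looks like a Git repository" is simplified here to "contains a .git entry", which ignores bare repositories.

```python
import os


def find_repo_root(start="."):
    """Return the first directory at or above start that contains .git."""
    path = os.path.abspath(start)
    while True:
        # A working tree has a .git directory, or a .git file when it is
        # a linked working tree.
        if os.path.exists(os.path.join(path, ".git")):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding a repository.
            return None
        path = parent
```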

get_blob_normalizer()

Return a BlobNormalizer object.

get_config() -> ConfigFile

Retrieve the config object.

Returns: ConfigFile object for the .git/config file.

get_description()

Retrieve the description of this repository.

Returns: A string describing the repository or None.

get_named_file(path, basedir=None)

Get a file from the control dir with a specific name.

Although the filename should be interpreted as a filename relative to the control dir in a disk-based Repo, the object returned need not be pointing to a file in that location.

Parameters
  • path – The path to the file, relative to the control dir.

  • basedir – Optional argument that specifies an alternative to the control dir.

Returns: An open file object, or None if the file does not exist.

has_index()

Check if an index is present.

index_path()

Return path to the index file.

classmethod init(path: str, mkdir: bool = False) -> Repo

Create a new repository.

Parameters
  • path – Path in which to create the repository

  • mkdir – Whether to create the directory

Returns: Repo instance

classmethod init_bare(path, mkdir=False, object_store=None)

Create a new bare repository.

path should already exist and be an empty directory.

Parameters

path – Path to create bare repository in

Returns: a Repo instance

open_index() -> Index

Open the index for this repository.

Raises

NoIndexPresent – If no index is present

Returns: The matching Index

path: str

reset_index(tree: Optional[bytes] = None)

Reset the index back to a specific tree.

Parameters

tree – Tree SHA to reset to, None for current HEAD tree.

set_description(description)

Set the description for this repository.

Parameters

description – Text to set as description for this repository.

stage(fs_paths: Union[str, bytes, PathLike, Iterable[Union[str, bytes, PathLike]]]) -> None

Stage a set of paths.

Parameters

fs_paths – List of paths, relative to the repository path

unstage(fs_paths: List[str])

Unstage specific files in the index.

Parameters

fs_paths – A list of files to unstage, relative to the repository path

exception dulwich.repo.UnsupportedExtension(extension)

Bases: Exception

Unsupported repository extension.

exception dulwich.repo.UnsupportedVersion(version)

Bases: Exception

Unsupported repository version.

dulwich.repo.check_user_identity(identity)

Verify that a user identity is formatted correctly.

Parameters

identity – User identity bytestring

Raises

InvalidUserIdentity – Raised when identity is invalid
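For orientation, the ‘user <email>’ shape can be checked with a few lines of standard-library code. This is only a sketch of the format, not dulwich's implementation: the helper name looks_like_identity is hypothetical, and it returns a boolean instead of raising InvalidUserIdentity.

```python
def looks_like_identity(identity):
    """Return True if identity has the shape b'user <email>'."""
    name, sep, rest = identity.partition(b" <")
    # Require the " <" separator and a closing ">" somewhere after it.
    return bool(sep) and b">" in rest


print(looks_like_identity(b"Jane Doe <jane@example.com>"))
print(looks_like_identity(b"Jane Doe"))
```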

dulwich.repo.get_user_identity(config: StackedConfig, kind: Optional[str] = None) -> bytes

Determine the identity to use for new commits.

If kind is set, this first checks GIT_${KIND}_NAME and GIT_${KIND}_EMAIL.

If those variables are not set, then it will fall back to reading the user.name and user.email settings from the specified configuration.

If that also fails, then it will fall back to using the current user’s identity as obtained from the host system (e.g. the gecos field, $EMAIL, $USER@$(hostname -f)).

Parameters

kind – Optional kind to return identity for, usually either “AUTHOR” or “COMMITTER”.

Returns

A user identity
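The fallback order described above can be sketched with plain dictionaries standing in for the environment and the git configuration. This is an illustration, not dulwich's code: the helper name resolve_identity and the flat "user.name" / "user.email" keys are hypothetical, and the sketch assumes that the name and the email fall back independently of each other.

```python
def resolve_identity(env, config, system_default, kind=None):
    """Resolve an identity bytestring using the documented fallback order.

    env and config are plain dicts standing in for os.environ and the git
    configuration; system_default is a (name, email) pair from the host.
    """
    name = email = None
    if kind:
        # Step 1: GIT_<KIND>_NAME and GIT_<KIND>_EMAIL.
        name = env.get("GIT_" + kind + "_NAME")
        email = env.get("GIT_" + kind + "_EMAIL")
    # Step 2: user.name / user.email; step 3: the host system default.
    if name is None:
        name = config.get("user.name", system_default[0])
    if email is None:
        email = config.get("user.email", system_default[1])
    return "{} <{}>".format(name, email).encode("utf-8")


print(resolve_identity({}, {"user.name": "Jane"}, ("jane", "jane@host"), "AUTHOR"))
```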

dulwich.repo.parse_graftpoints(graftpoints: Iterable[bytes]) -> Dict[bytes, List[bytes]]

Convert a list of graftpoints into a dict.

Parameters

graftpoints – Iterator of graftpoint lines

Each line is formatted as:

<commit sha1> <parent sha1> [<parent sha1>]*

Resulting dictionary is:

<commit sha1>: [<parent sha1>*]

https://git.wiki.kernel.org/index.php/GraftPoint
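The mapping from graftpoint lines to a dictionary can be illustrated with the short standard-library sketch below. It is not dulwich's implementation: the helper name parse_graft_lines is hypothetical, blank lines are skipped as a convenience, and no validation of the SHA1 values is attempted.

```python
def parse_graft_lines(lines):
    """Map each commit SHA1 to the list of parent SHA1s that follow it."""
    grafts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The first field is the commit; any remaining fields are parents.
        grafts[fields[0]] = fields[1:]
    return grafts


commit, parent = b"a" * 40, b"b" * 40
print(parse_graft_lines([commit + b" " + parent]))
```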

dulwich.repo.read_gitfile(f)

Read a .git file.

The first line of the file should start with “gitdir: “

Parameters

f – File-like object to read from

Returns: A path
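What reading such a file involves can be sketched in a few lines. This is an illustration only: the helper name parse_gitfile and the example path are hypothetical, and the sketch assumes the file is opened in text mode and that the whole content after the prefix is the path.

```python
import io

GITDIR_PREFIX = "gitdir: "


def parse_gitfile(f):
    """Return the path stored in a .git file opened in text mode."""
    contents = f.read()
    if not contents.startswith(GITDIR_PREFIX):
        raise ValueError("expected file to start with 'gitdir: '")
    # Drop the prefix and the trailing newline.
    return contents[len(GITDIR_PREFIX):].rstrip("\n")


example = io.StringIO("gitdir: /srv/project/.git/worktrees/feature\n")
print(parse_gitfile(example))
```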

dulwich.repo.serialize_graftpoints(graftpoints: Dict[bytes, List[bytes]]) -> bytes

Convert a dictionary of grafts into a string.

The graft dictionary is:

<commit sha1>: [<parent sha1>*]

Each line is formatted as:

<commit sha1> <parent sha1> [<parent sha1>]*

https://git.wiki.kernel.org/index.php/GraftPoint
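The serialized form can be illustrated with the standard-library sketch below, which writes one line per commit in the format shown above. It is not dulwich's implementation: the helper name serialize_grafts is hypothetical, and joining the lines with a newline and no trailing newline is an assumption.

```python
def serialize_grafts(grafts):
    """Serialize {commit: [parents]} as one graftpoint line per commit."""
    lines = []
    for commit, parents in grafts.items():
        # "<commit sha1> <parent sha1> ..." or just the commit when it
        # has no graft parents.
        lines.append(b" ".join([commit] + list(parents)))
    return b"\n".join(lines)


commit, parent = b"a" * 40, b"b" * 40
print(serialize_grafts({commit: [parent]}))
```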