[Nix-dev] Fetching only a small part of a huge git repository

zimbatm zimbatm at zimbatm.com
Tue Jul 19 23:45:46 CEST 2016


Another potential approach is to use git's sparse checkout. I don't know
whether it also downloads only a subset of the pack files, though.
https://schacon.github.io/git/git-read-tree.html#_sparse_checkout
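
For reference, a minimal sketch of the sparse checkout mechanism described on that page. It is not something fetchgit does today; the repository layout (ofl/roboto, ofl/other) and file names are made up for the demo, and everything runs against a throwaway local repository. As far as I know, sparse checkout on its own only limits what lands in the working tree, so the clone step still transfers every object.

```shell
#!/bin/sh
set -eu

# Build a throwaway "huge" repository with two font folders (hypothetical layout).
work=$(mktemp -d)
git init -q "$work/upstream"
mkdir -p "$work/upstream/ofl/roboto" "$work/upstream/ofl/other"
echo "roboto" > "$work/upstream/ofl/roboto/Roboto.ttf"
echo "other" > "$work/upstream/ofl/other/Other.ttf"
git -C "$work/upstream" add .
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "add fonts"

# Clone without populating the working tree, then restrict it to one folder.
git clone -q --no-checkout "$work/upstream" "$work/clone"
git -C "$work/clone" config core.sparseCheckout true
mkdir -p "$work/clone/.git/info"
echo "ofl/roboto/" > "$work/clone/.git/info/sparse-checkout"

# Fill the index from HEAD and check out only the paths matching the pattern.
git -C "$work/clone" read-tree -mu HEAD

# Only ofl/roboto should be materialised in the working tree.
ls "$work/clone/ofl"
```

The object database under $work/clone/.git still contains the blobs for ofl/other, which is why this alone would not shrink the download for a fetcher.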

On Tue, 19 Jul 2016 at 22:38 zimbatm <zimbatm at zimbatm.com> wrote:

> It really depends on whether GitHub allows archiving subsets of a repo.
> Have you looked at the API yet?
>
> On Tue, 19 Jul 2016, 22:25 Martijn Vermaat, <martijn at vermaat.name> wrote:
>
>> Doesn't look like anything like this has been implemented. The
>> fetchFromGitHub (or fetchgit actually) machinery is here:
>>
>> https://github.com/NixOS/nixpkgs/tree/master/pkgs/build-support/fetchgit
>>
>> 2016-07-19 16:52 GMT-04:00 Michiel Leenaars <ml.software at leenaa.rs>:
>>
>>> Hi all,
>>>
>>> I was wondering if there is a recommended way to retrieve only a certain
>>> path (or a set of paths) from a git repository as a source for a Nix
>>> closure? The use case is working with a repository that is huge but from
>>> which only a single small folder is needed. A practical example is Google
>>> Fonts on GitHub: hundreds of MB, which is a bit much to store if all that
>>> is needed is a single font such as Roboto, the default font of Android.
>>> fetchFromGitHub and fetchgit seem to always copy the entire repository.
>>>
>>> git has an option for this (git archive, [1]) which seems to do exactly
>>> what I want - a checkout of only a single folder of a larger git
>>> repository - and there are other options like filter-branch or a
>>> sparse checkout. I've grepped nixpkgs to see if there was any usage of
>>> git archive but could not find it. As a user I would probably like to
>>> see something in fetchFromGitHub like:
>>>
>>>  url = fetchFromGitHub {
>>>    owner = "name";
>>>    repo = "reallyhugerepo";
>>>    rev = "revisionnumber";
>>>    paths = [ "tinypath-a" "path-b" ];
>>>    sha256 = "22bfd184608ca00e38515c8d5d01d38460a66a503c75a35d0fea3e33559bb3b6";
>>>  };
>>>
>>> That seems like a reasonable and declarative way to specify what can be
>>> discarded (a path in git archive can be a file or a folder with all the
>>> files in it btw).
>>>
>>> Probably I'm not the first to think he needs this, and perhaps such a
>>> thing is currently already possible? If so, I'd love any pointers ... If
>>> this is a bad or redundant idea, please also let me know ...
>>>
>>> Best,
>>> Michiel
>>>
>>> [1] https://www.kernel.org/pub/software/scm/git/docs/git-archive.html
>>> _______________________________________________
>>> nix-dev mailing list
>>> nix-dev at lists.science.uu.nl
>>> http://lists.science.uu.nl/mailman/listinfo/nix-dev
>>>
>