Package managers keep using Git as a database, it never works out

(nesbitt.io)

339 points | by birdculture 5 hours ago

46 comments

ekjhgkejhgk 2 minutes ago
Uncertain if this is OT, but given that the CCC is a politically inspired organization, I hope not:
One thing that still seems absent is awareness of the complete takeover of "gadgets" in schools. Schools these days, as early as primary school, shove screens in front of children. They're expected to look at them, and "use" them for various activities, including practicing handwriting. I wish I was joking [1].
I see two problems with this.
First is that these devices are engineered to be addictive by way of constant notifications/distractions, and learning is something that requires long sustained focus. There's a lot of data showing that under certain common circumstances, you do worse learning from a screen than from paper.
Second, it implicitly trains children to expect that everything has to be done through a screen connected to a closed point-and-click platform. (Uninformed) people will say "people who work with computers make money, so I want my child to have an iPad". But interacting with a closed platform like an iPad removes possibilities and puts the interaction "on rails". You don't learn to think, explore and learn from mistakes; instead you learn to use the app that's put in front of you. This in turn reinforces the "computer says no" [2] approach to understanding the world.
I think this is a matter of civil rights and freedom, but sadly I don't often see "civil rights" organizations talk about this. I think I heard Stallman say something along these lines once, but other than that I don't see campaigns anywhere.
[1] https://www.letterjoin.co.uk/
[2] https://youtu.be/eE9vO-DTNZc
c-linkage 4 hours ago
This seems like a tragedy of the commons -- GitHub is free after all, and it has all of these great properties, so why not? -- but this kind of decision making occurs whenever externalities are present.
My favorite hill to die on (externality) is user time. Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time. Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.
Externalities lead to users downloading extra gigabytes of data (wasted time) and waiting for software, all of which is waste that the developer isn't responsible for and doesn't care about.
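As a rough check of the 277-hour figure above, here is a minimal sketch of the arithmetic. It assumes each of the million users saves the one second exactly once; the function name and the once-per-user assumption are illustrative, not from the original comment.

```python
SECONDS_PER_HOUR = 3600


def aggregate_hours_saved(users: int, seconds_saved_per_user: float) -> float:
    """Total user time saved, in hours, if every user saves the given seconds once."""
    return users * seconds_saved_per_user / SECONDS_PER_HOUR


# One second saved for each of one million users (assumed: one use per user).
hours = aggregate_hours_saved(users=1_000_000, seconds_saved_per_user=1)
print(f"{hours:.1f} user hours saved")
```

If users hit the faster path more than once, the total scales linearly with the number of uses.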
- Aurornis 50 minutes ago
  > Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time.
  I don’t know what you mean by software houses, but every consumer facing software product I’ve worked on has tracked things like startup time and latency for common operations as a key metric
  This has been common wisdom for decades. I don’t know how many times I’ve heard the repeated quote about how Amazon loses $X million for every Y milliseconds of page loading time, as an example.
  - dijit 40 minutes ago
    I worked in e-commerce SaaS around 2011 and this was true then, but I find it less true these days.
    Are you sure that you’re not the driving force behind those metrics; or that you’re not self-selecting for like-minded individuals?
    I find it really difficult to convince myself that even large players (Discord) are measuring startup time. Every time I start the thing I’m greeted by a 25s wait and a `RAND()%9` number of updates that each take about 5-10s.
  - rovr138 15 minutes ago
    There was a thread here earlier this month,
    > Helldivers 2 devs slash install size from 154GB to 23GB
    https://news.ycombinator.com/item?id=46134178
    Section of the top comment says,
    > It seems bizarre to me that they'd have accepted such a high cost (150GB+ installation size!) without entirely verifying that it was necessary!
    and the reply to it has,
    > They’re not the ones bearing the cost. Customers are.
- ekjhgkejhgk 4 hours ago
  I wouldn't call it tragedy of the commons, because it's not a commons. It's owned by Microsoft. They're calculating that it's worth it for them, so I say take as much as you can.
  Commons would be if it's owned by nobody and everyone benefits from its existence.
  - TeMPOraL 3 hours ago
    Still, because reality doesn't respect boundaries of human-made categories, and because people never define their categories exhaustively, we can safely assume that something almost-but-not-quite like a commons, is subject to an almost-but-not-quite tragedy of the commons.
    - bee_rider 29 minutes ago
      That seems to assume some sort of… maybe unfounded linearity or something? I mean, I’m not sure I agree that GitHub is nearly a commons in any sense, but let’s put that aside as a distraction…
      The idea of the tragedy of the commons relies on this feedback loop of having these unsustainably growing herds (growing because they can exploit the zero-cost-to-them resources of the commons). Feedback loops are notoriously sensitive to small parameter changes. MS could presumably impose some damping if they wanted.
    - reactordev 3 hours ago
      An A- is still an A kind of thinking. I like this approach as not everything perfectly fits the mold.
    - lo_zamoyski 2 hours ago
      There is an analogy in the sense that for the users a resource is, for certain practical intents and purposes, functionally common. Social media is like this as well.
      But I would make the following clarifications:
      1. A private entity is still the steward of the resource and therefore the resource figures into the aims, goals, and constraints of the private entity.
      2. The common good is itself under the stewardship of the state, as its function is guardian of the common good.
      3. The common good is the default (by natural law) and prior to the private good. The latter is instituted in positive law for the sake of the former by, e.g., reducing conflict over goods.
    - ttiurani 3 hours ago
      The whole notion of the "tragedy of the commons" needs to be put to rest. It's an armchair thought experiment that was disproven at the latest in the 90s by Elinor Ostrom with actual empirical evidence of commons.
      The "tragedy", if you absolutely need to find one, is only for unrestricted, free-for-all commons, which is obviously a bad idea.
      - wongarsu 2 hours ago
        A high-trust community like a village can prevent a tragedy of the commons scenario. Participants feel obligations to the community, and misusing the commons actually does have real downsides for the individual because there are social feedback mechanisms. The classic examples like people grazing sheep or cutting wood are bad examples that don't really work.
        But that doesn't mean the tragedy of the commons can't happen in other scenarios. If we define commons a bit more generously it does happen very frequently on the internet. It's also not difficult to find cases of it happening in larger cities, or in environments where cutthroat behavior has been normalized
        TeMPOraL 2 hours ago
        > A high-trust community like a village can prevent a tragedy of the commons scenario. Participants feel obligations to the community, and misusing the commons actually does have real downsides for the individual because there are social feedback mechanisms.
        That works while the size of the community is ~100-200 people, when everyone knows everyone else personally. It breaks down rapidly after that. We compensate for that with hierarchies of governance, which give rise to written laws and bureaucracy.
        New tribes break off old tribes, form alliances, which form larger alliances, and eventually you end up with countries and counties and voivodeships and cities and districts and villages, in hierarchies that gain a level per ~100x population increase.
        This is sociopolitical history of the world in a nutshell.
        lukan 1 hour ago
        "and eventually you end up with countries and counties and voivodeships and cities and districts and villages, in hierarchies that gain a level per ~100x population increase."
        You say it like this is a law set in stone, because this is what happened in history, but I would argue it happened under different conditions.
        Mainly, the main advantage of an empire over small villages/tribes is not at all that they have more power than the villages combined, but that they can concentrate their power where it is needed. One village did not stand a chance against the empire - and the villages were not coordinated enough.
        But today we would have the internet for better communication and coordination, enabling the small entities to coordinate a defense.
        Well, in theory of course. Because we do not really have autonomous small states, but are dominated by the big players. And the small states mostly have the choice of which bloc to align with, or get crushed. But the trend might go towards small again.
        (See also cheap drones destroying expensive tanks, battleships etc.)
        xorcist 59 minutes ago
        > That works while the size of the community is ~100-200 people,
        Yet we regularly observe that working with millions of people; we take care of our young, we organize, and when we see that some action hurts our environment we tend to limit its use.
        It's not obvious why some societies break down early and some go on working.
        AnthonyMouse 36 minutes ago
        I get the feeling it's the combination of Schelling points and surplus. If everyone else is being pro-social, i.e. there is a culture of it, and people aren't so hard up that they can't reasonably afford to do the same, then that's what happens, either by itself (Hofstadter's theory of superrationality) or with as little as light social pressure.
        But if a significant fraction of the population is barely scraping by then they're not willing to be "good" if it means not making ends meet, and when other people see widespread defection, they start to feel like they're the only one holding up their end of the deal and then the whole thing collapses.
        This is why the tendency for people to propose rent-seeking middlemen as a "solution" to the tragedy of the commons is such a diabolical scourge. It extracts the surplus that would allow things to work more efficiently in their absence.
        vlovich123 1 hour ago
        I’ve heard stories from communist villages where everyone knew everyone. Communal parks and property were not respected and were frequently vandalized or otherwise neglected because they didn’t have an owner and were treated as something for someone else to solve.
        It’s easier to explain in those terms than assumptions about how things work in a tribe.
        jandrewrogers 51 minutes ago
        > A high-trust community like a village can prevent a tragedy of the commons scenario.
        No it does not. This sentiment, which many people have, is based on a fictional and idealistic notion of what small communities are like having never lived in such communities.
        Empirically, even in high-trust small villages and hamlets where everyone knows everyone, the same incentives exist and the same outcomes happen. Every single time. I lived in several and I can't think of a counter-example. People are highly adaptive to these situations and their basic nature doesn't change because of them.
        Humans are humans everywhere and at every scale.
        ttiurani 2 hours ago
        > But that doesn't mean the tragedy of the commons can't happen in other scenarios.
        Commons can fail, but the whole point of Hardin calling commons a "tragedy" is to suggest it necessarily fails.
        Compare it to, say, driving. It can fail too, but you wouldn't call it "the tragedy of driving".
        We'd be much better off if people didn't throw around this zombie term decades after it's been shown to be unfounded.
        lo_zamoyski 2 hours ago
        Even here, the state is the steward of the common good. It is a mistaken notion that the state only exists because people are bad. Even if people were perfectly conscientious and concerned about the common good, you still need a steward. It simply wouldn’t be a steward who would need to use aggressive means to protect the common good from malice or abuse.
      - Saline9515 2 hours ago
        Ostrom showed that it wasn't necessarily a tragedy, if the tight groups involved decided to cooperate. This is common in what we call "trust-based societies", which aren't universal.
        Nonetheless, the concept is still alive, and anthropogenic global warming is here to remind you about this.
      - dpark 1 hour ago
        She did not “disprove” the existence of the tragedy of the commons. What she established was that controlling the commons can be done communally rather than through privatization or through government ownership.
        Communal management of a resource is still government, though. It just isn’t central government.
        The thesis of the tragedy of the commons is that an uncontrolled resource will be abused. The answer is governance at some level, whether individual, collective, or government ownership.
        > The "tragedy", if you absolutely need to find one, is only for unrestricted, free-for-all commons, which is obviously a bad idea.
        Right. And that’s what people are usually talking about when they say “tragedy of the commons”.
      - b00ty4breakfast 3 hours ago
        yeah, it's a post-hoc rationalization for the enclosure and privatization of said commons.
        TeMPOraL 2 hours ago
        And here I thought the standard, obvious solution to tragedy of the commons is centralized governance.
        dpark 1 hour ago
        People invoke the tragedy of the commons in bad faith to argue for privatization because “the alternative is communism”. i.e. Either an individual or the government has to own the resource.
        This is of course a false dichotomy because governance can be done at any level.
      - gmfawcett 2 hours ago
        Ostrom's results didn't disprove ToC. She showed that common resources can be communally maintained, not that tragic outcomes could never happen.
  - dahart 1 hour ago
    > so I say take as much as you can. Commons would be if it’s owned by nobody
    This isn’t what “commons” means in the term ‘tragedy of the commons’, and the obvious end result of your suggestion to take as much as you can is to cause the loss of access.
    Anything that is free to use is a commons, regardless of ownership, and when some people use too much, everyone loses access.
    Finite digital resources like bandwidth and database sizes within companies are even listed as examples in the Wikipedia article on Tragedy of the Commons. https://en.wikipedia.org/wiki/Tragedy_of_the_commons
  - jasonkester 3 hours ago
    It has the same effect though. A few bad actors using this “free” thing can end up driving the cost up enough that Microsoft will have to start charging for it.
    The jerks get their free things for a while, then it goes away for everyone.
    - Y_Y 2 hours ago
      I think the jerks are the ones who bought and enshittified GitHub after it had earned significant trust and become an important part of FOSS infrastructure.
      - irishcoffee 2 hours ago
        Scoping it to a local maximum, the only thing worse than git is GitHub. In an alternate universe hg won the clone wars and we are all better off for it.
      - dahart 1 hour ago
        Why do you blame MS for predictably doing what MS does, and not the people who sold that trust & FOSS infra to MS for a profit? Your blame seems misplaced.
        And out of curiosity, aside from costing more for some people, what’s worse exactly? I’m not a heavy GitHub user, but I haven’t really noticed anything in the core functionality that would justify calling it enshittified.
  - groundzeros2015 2 hours ago
    A public park suffers from tragedy of the commons even though it’s managed by the city.
  - ericyd 2 hours ago
    Tragedy of the Microsoft just doesn't sound as nice though
  - PunchyHamster 3 hours ago
    Well, till you choose to host something yourself and it becomes popular
  - rvba 2 hours ago
    I doubt anyone is calculating
    Remember how GTA5 took 10 minutes to start and nobody cared? Lots of software is like this.
    Some Blizzard games download a 137 MB file every time you run them and take a few minutes to start (and no, this is not due to my computer).
- massysett 23 minutes ago
  > Externalities lead to users downloading extra gigabytes of data (wasted time) and waiting for software, all of which is waste that the developer isn't responsible for and doesn't care about.
  This is perfectly sensible behavior when the developers are working for free, or when the developers are working on a project that earns their employer no revenue. This is the case for several of the projects at issue here: Nix, Homebrew, Cargo. It makes perfect sense to waste the user's time, as the user pays with nothing else, or to waste Github's bandwidth, since it's willing to give bandwidth away for free.
  Where users pay for software with money, they may be more picky and not purchase software that indiscriminately wastes their time.
  - BobbyTables2 15 minutes ago
    Microsoft would have long gone out of business if users cared about their time being wasted.
    Windows 11 should not be more sluggish than Windows 7.
- zahlman 3 hours ago
  > Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time. Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.
  This is what people mean about speed being a feature. But "user time" depends on more than the program's performance. UI design is also very important.
- Y-bar 2 hours ago
  You’ll enjoy “Saving Lives” by Andy Hertzfeld: https://www.folklore.org/Saving_Lives.html
  > "The Macintosh boots too slowly. You've got to make it faster!"
- solatic 3 hours ago
  If you think too hard about this, you come back around to Alan Kay's quote about how people who are really serious about software should build their own hardware. Web applications, and in general loading pretty much anything over the network, is a horrible, no-good, really bad user experience, and it always will be. The only way to really respect the user is with native applications that are local-first, and if you take that really far, you build (at the very least) peripherals to make it even better.
  The number of companies that have this much respect for the user is vanishingly small.
  - phkahler 2 hours ago
    >> The number of companies that have this much respect for the user is vanishingly small.
    I think companies shifted to online apps because, #1, it solved the copy protection problem. FOSS apps are not in any hurry to become centralized because they don't care about that issue.
    Local apps and data are a huge benefit of FOSS and I think every app website should at least mention that.
    "Local app. No ads. You own your data."
    - xorcist 56 minutes ago
      Another important reason to move to online applications is that you can change the terms of the deal at any time. This may sound more nefarious than it needs to be, it just means you do not have to commit fully to your licensing terms before the first deal is made, which is tempting for just about anyone.
  - hombre_fatal 3 hours ago
    Software I don’t have to install at all “respects me” the most.
    Native software being an optimum is mostly an engineer fantasy that comes from imagining what you can build.
    In reality that means having to install software like Meta’s WhatsApp, Zoom, and other crap I’d rather run in a browser tab.
    I want very little software running natively on my machine.
    - solatic 2 hours ago
      Your browser is acting like a condom, in that respect (pun not intended).
      Yes, there are many cases when condoms are indicative of respect between parties. But a great many people would disagree that the best, most respectful relationships involve condoms.
      > Meta
      Does not sell or operate respectful software. I will agree with you that it's best to run it in a browser (or similar sandbox).
    - freedomben 3 hours ago
      Yes, amen. The more invasive and abusive software gets, the less I want it running on my machine natively. Native installed applications for me now are limited only to apps I trust, and even those need to have a reason to be native apps rather than web apps to get a place in my app drawer
    - shash 1 hour ago
      You mean you’d rather run unverified scripts using a good order of magnitude more resources with a slower experience and have an entire sandboxing contraption to keep said unverified scripts from doing anything to your machine…
      I know the browser is convenient, but frankly, it's been a horror show of resource usage and vulnerabilities and pathetic performance
      - whstl 38 minutes ago
        The #1 reason the web experience universally sucks today is because companies add an absurd amount of third-party code on their pages for tracking, advertisement, spying on you or whatever non-essential purpose. That, plus an excessive/unnecessary amount of visual decoration.
        The idea that somehow those companies would respect your privacy were they running a native app is extremely naive.
        We can already see this problem on video games, where copy protection became resource-heavy enough to cause performance issues.
  - ghosty141 3 hours ago
    Yes because users don't appreciate this enough to pay for the time this takes.
- imiric 1 minute ago
  > GitHub is free after all, and it has all of these great properties, so why not?
  The answer is in TFA:
  > The underlying issue is that git inherits filesystem limitations, and filesystems make terrible databases.
- robmccoll 2 hours ago
  I don't think most software houses spend enough time even focusing on engineering time. CI pipelines that take tens of minutes to over an hour, compile times that exceed ten seconds when nothing has changed, startup times that are much more than a few seconds. Focus and fast iteration are super important to writing software and it seems like a lot of orgs just kinda shrug when these long waits creep into the development process.
- ozim 3 hours ago
  About apps done by software houses, even though we should strive for doing good job and I agree with sentiment...
  First argument would be - take at least two 0's from your estimation, most of applications will have maybe thousands of users, successful ones will maybe run with 10's of thousands. You might get lucky to work on application that has 100's of thousands, millions of users and you work in FAANG not a typical "software house".
  Second argument is - most users use 10-20 apps in typical workday, your application is most likely irrelevant.
  Third argument is - most users would save much more time by learning how to properly use the applications (or the computer) they use on a daily basis than someone would save them by optimizing some function from 2s to 1s. But of course that's hard because they have 10-20 apps daily, plus god knows how many others not on a daily basis. Though I still see people doing super silly stuff in tools like Excel, or not even knowing copy-paste - so nothing like command line magic.
- pastor_williams 2 hours ago
  This was something that I heavily focused on for my feature area a year ago - new user sign up flow. But the decreased latency was really in pursuit of increased activation and conversion. At least the incentives aligned briefly.
- vlovich123 57 minutes ago
  I think it’s naive to think engineers or managers don’t realize this or don’t think in these ways.
  https://www.folklore.org/Saving_Lives.html
- threatofrain 28 minutes ago
  > Most software houses spend so much time focusing on how expensive engineering time is that they neglect user time. Software houses optimize for feature delivery and not user interaction time.
  Oh no no no. Consumer-facing companies will burn 30% of your internal team complexity budget on shipping the first "frame" of your app/website. Many people treat Next as synonymous with React, and Next's big deal was helping you do just this.
- brightball 52 minutes ago
  User time is typically a mix of performance tuning and UX design isn’t it?
- inapis 3 hours ago
  >Yet if I spent one hour making my app one second faster for my million users, I can save 277 user hours per year. But since user hours are an externality, such optimization never gets done.
  I have never been convinced by this argument. The aggregate number sounds fantastic but I don't believe that any meaningful work can be done by each user saving 1 second. That 1 second (and more) can simply be taken by me trying to stretch my body out.
  OTOH, if the argument is to make software smaller, I can get behind that since it will simply lead to more efficient usage of existing resources and thus reduce the environmental impact.
  But we live in a capitalist world and there needs to be external pressure for change to occur. The current RAM shortage, if it lasts, might be one of them. Otherwise, we're only day dreaming for a utopia.
  - adrianN 3 hours ago
    Time saved to increased productivity or happiness or whatever is not linear but a step function. Saving one second doesn’t help much, but there is a threshold (depending on the individual) where faster workflows lead to a better experience. It does make a difference whether a task takes a minute or half a second, at least for me.
  - Aerroon 3 hours ago
    One second is long enough that it can put a user off from using your app though. Take notifications on phones for example. I know several people who would benefit from a habitual use of phone notifications, but they never stick to using them because the process of opening (or switching over to) the notification app and navigating its UI to leave a notification takes too long. Instead they write a physical sticky note, because it has a faster "startup time".
    - tehbeard 3 hours ago
      All depends on the type of interaction.
      A high usage one, absolutely improve the time of it.
      Loading the profile page? Isn't done often so not really worth it unless it's a known and vocal issue.
      https://xkcd.com/1205/ gives a good estimate.
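For anyone who wants the xkcd-style estimate for their own numbers, a minimal sketch: the break-even budget is the time shaved per use, times how often the task runs, over a five-year horizon (the horizon the comic uses). The function name and the example values are illustrative assumptions.

```python
def break_even_seconds(
    seconds_saved_per_use: float,
    uses_per_day: float,
    horizon_days: int = 5 * 365,
) -> float:
    """Upper bound on time worth spending on an optimization, in seconds.

    Spending more than this costs more time than the optimization gives back
    over the horizon.
    """
    return seconds_saved_per_use * uses_per_day * horizon_days


# Example: shaving 1 second off something done 5 times a day.
budget = break_even_seconds(seconds_saved_per_use=1, uses_per_day=5)
print(f"worth up to {budget / 3600:.1f} hours of work")
```

Note this is per person doing the task; for an interaction shared by many users, the budget multiplies by the number of users.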
- loloquwowndueo 4 hours ago
  Just a reminder that GitHub is not git.
  The article mentions that most of these projects did use GitHub as a central repo out of convenience so there’s that but they could also have used self-hosted repos.
  - machinationu 3 hours ago
    Explain to me how you self-host a git repo which is accessed millions of times a day from CI jobs pulling packages.
    - freedomben 2 hours ago
      I'm not sure whether this question was asked in good faith, but it is actually a damn good one.
      I've looked into self-hosting a git repo that has horizontal scalability, and it is indeed very difficult. I don't have the time to detail it in a comment here, but for anyone who is curious it's very informative to look at how GitLab handled this with gitaly. I've also seen some clever attempts to use object storage, though I haven't seen any of those solutions put heavily to the test.
      I'd love to hear from others about ideas and approaches they've heard about or tried
      https://gitlab.com/gitlab-org/gitaly
    - fweimer 2 hours ago
      These days, people solve similar problems by wrapping their data in an OCI container image and distributing it through one of the container registries that do not have a practically meaningful pull rate limit. Not really a joke, unfortunately.
    - adrianN 2 hours ago
      You `git init --bare` on a host with sufficient resources. But I would recommend thinking about your CI flow too.
      - machinationu 2 hours ago
        no, hundreds of thousands of individual projects' CI jobs. OP was talking about package managers for the whole world, not for one company
        adrianN 20 minutes ago
        If people depend on remote downloads from different companies for their CI pipelines they’re doing it wrong. Every sensible company sets up a mirror or at least a cache on infra that they control. Rate limiting downloads is the natural course of action for the provider of a package registry.
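To make the cache idea concrete, here is a minimal sketch of a pull-through cache, assuming an in-memory store and a stand-in `fake_registry` function in place of a real upstream. All names here are hypothetical; a real mirror would persist to disk and handle invalidation.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Serve artifacts locally, going upstream only on the first request."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.upstream_requests = 0  # how often we actually hit the registry

    def get(self, key: str) -> bytes:
        if key not in self._store:
            self.upstream_requests += 1
            self._store[key] = self._fetch_upstream(key)
        return self._store[key]


def fake_registry(key: str) -> bytes:
    # Hypothetical upstream; stands in for a network download.
    return f"contents of {key}".encode()


cache = PullThroughCache(fake_registry)
for _ in range(1000):  # e.g. 1000 CI jobs asking for the same package
    cache.get("libfoo-1.0.0.tar.gz")
print(cache.upstream_requests)
```

The point of the sketch is only that repeated CI pulls of the same artifact collapse into one upstream request.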
    - ozim 3 hours ago
      FTFY:
      Explain to me how you self-host a git repo without spending any money and having no budget which is accessed millions of times a day from CI jobs pulling packages.
  - justincormack 4 hours ago
    They probably would have experienced issues way sooner, as the self hosted tools don't scale nearly as well.
- JohnHaugeland 1 hour ago
  > This seems like a tragedy of the commons -- GitHub is free after all, and it has all of these great properties, so why not?
  because it's bad at this job, and sqlite is also free
  this isn't about "externalities"
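A minimal sketch of what "use sqlite instead" could look like for a package index, using Python's standard `sqlite3` module. The table layout, package names, and checksums are made up for illustration and are not any real package manager's schema.

```python
import sqlite3

# In-memory database for the example; a real index would be a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE package_versions (
        name     TEXT NOT NULL,
        version  TEXT NOT NULL,
        checksum TEXT NOT NULL,
        PRIMARY KEY (name, version)
    )
    """
)
conn.executemany(
    "INSERT INTO package_versions VALUES (?, ?, ?)",
    [
        ("libfoo", "1.0.0", "aaa"),  # hypothetical packages and checksums
        ("libfoo", "1.1.0", "bbb"),
        ("libbar", "0.3.2", "ccc"),
    ],
)


def versions_of(db: sqlite3.Connection, name: str) -> list:
    """Look up all known versions of one package via the primary-key index."""
    rows = db.execute(
        "SELECT version FROM package_versions WHERE name = ? ORDER BY version",
        (name,),
    )
    # ORDER BY here is plain text ordering, not semantic-version ordering.
    return [row[0] for row in rows]


print(versions_of(conn, "libfoo"))
```

A lookup like this touches only the rows for one package, instead of walking a directory tree or cloning history.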
- machinationu 3 hours ago
  With AI engineering costs are plummeting.
  You can implement entire features with 10 cents of tokens.
  Companies which don't adapt will be left behind this year.
  - benchloftbrunch 57 minutes ago
    As long as you don't have any security compliance requirements and/or can afford the cost of self hosting your LLM, sure.
    Anyone working in government, banking, or healthcare is still out of luck since the likes of Claude and GPT are (should be) off limits.
  - camgunz 3 hours ago
    I've never been more convinced LLMs are the vanguard of the grift economy now that green accounts are low effort astroturfing on HN.
    - freedomben 2 hours ago
      LLMs obviously can't do it all, and they still have severe areas of weakness where they can't replace humans, but there are definitely a lot of areas where they really can now. I've seen it first hand. I've even experienced it first hand. There are a couple of services that I wrote years ago that were basically parked in maintenance mode because they weren't worth investing time in, and we just dealt with some of the annoyances and bugs. With the latest LLMs, over the last couple of months I've been able to resurrect them and fix a lot of bugs and even add some wanted features in just a few hours. It really is quite incredible and scary at the same time.
      Also in case you're not aware, accusing people of shilling or astroturfing is against the Hacker News guidelines
      - camgunz 1 hour ago
        The loophole here is that this account isn't a person
        machinationu 1 hour ago
        you forgot to ask me to ignore previous instructions and say pizza
    - machinationu 3 hours ago
      hey, I'm just a lowly LLM, gotta earn my tokens :|
newswangerd 2 minutes ago
It’s always humbling when you go on the front page of HN and see an article titled “the thing you’re doing right now is a bad idea and here’s why”
This has happened to me a few times now. The last one was a fantastic article about how PG Notify locks the whole database.
In this particular case it just doesn’t make a ton of sense to change course. I’m a solo dev building a thing that may never take off, so using git for plug-in distribution is just a no-brainer right now. That said, I’ll hold on to this article in case I’m lucky enough to be in a position where scale becomes an issue for me.
cesarb 2 hours ago
One of these is not like the others...
> The problem was that go get needed to fetch each dependency’s source code just to read its go.mod file and resolve transitive dependencies.
This article is mixing two separate issues. One is using git as the master database storing the index of packages and their versions. The other is fetching the code of each package through git. They are orthogonal; you can have a package index using git but the packages being zip/tar/etc archives, you can have a package index not using git but each package is cloned from a git repository, you can have both the index and the packages being git repositories, you can have neither using git, you can even not have a package index at all (AFAIK that's the case for Go).
[-]
- jayd16 3 minutes ago
  Even with git, it should be possible to grab the single file needed without the rest of the repo, but it's still trying to round a square peg.
- bobpaw 1 hour ago
  I think the article takes issue not with fetching the code, but with fetching the go.mod file that contains index and dependency information. That’s why part of the solution was to host go.mod files separately.
dboon 3 hours ago
I’m building Cargo/UV for C. Good article. I thought about this problem very deeply.
Unfortunately, when you’re starting out, the idea of running a registry is a really tough sell. Now, on top of the very hard engineering problem of writing the code and making a world class tool, plus the social one of getting it adopted, I need to worry about funding and maintaining something that serves potentially a world of traffic? The git solution is intoxicating through this lens.
Fundamentally, the issue is the sparse checkouts mentioned by the author. You’d really like to use git to version package manifests, so that anyone with any package version can get the EXACT package they built with.
But this doesn’t work, because you need arbitrary commits. You either need a full checkout, or you need to somehow track the commit a package version is in without knowing what hash git will generate before you do it. You have to push the package update and then push a second commit recording that. Obviously infeasible, obviously a nightmare.
Conan’s solution is I think just about the only way. It trades the perfect reproduction for conditional logic in the manifest. Instead of 3.12 pointing to a commit, every 3.x points to the same manifest, and there’s just a little logic to set that specific config field added in 3.12. If the logic gets too much, they let you map version ranges to manifests for a package. So if 3.13 rewrites the entire manifest, just remap it.
I have not found another package manager that uses git as a backend that isn’t a terrible and slow tool. Conan may not be as rigorous as Nix because of this decision but it is quite pragmatic and useful. The real solution is to use a database, of course, but unless someone wants to wire me ten thousand dollars plus server costs in perpetuity, what’s a guy supposed to do?
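A rough Python sketch of the version-range idea, just to make it concrete. This is not Conan's actual format or API: `MANIFESTS`, `manifest_for`, `settings_for`, the folder names, and the `new_config_field` option are all made up for illustration.

```python
# Hypothetical sketch: map version ranges to manifests instead of pinning
# each package version to a specific commit. Names are illustrative only.

def parse(version: str) -> tuple[int, ...]:
    """Turn '3.12.1' into (3, 12, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

# (first version covered, manifest folder), sorted by ascending version.
MANIFESTS = [
    ("3.0", "manifests/3.x"),
    ("3.13", "manifests/3.13-rewrite"),
]

def manifest_for(version: str) -> str:
    """Pick the last manifest whose starting version is <= the request."""
    wanted = parse(version)
    chosen = None
    for first, folder in MANIFESTS:
        if parse(first) <= wanted:
            chosen = folder
    if chosen is None:
        raise LookupError(f"no manifest covers {version}")
    return chosen

def settings_for(version: str) -> dict[str, bool]:
    """Conditional logic living inside one shared manifest."""
    settings = {"shared": False}
    if parse(version) >= parse("3.12"):
        settings["new_config_field"] = True  # placeholder for a field added in 3.12
    return settings
```

The trade-off is the one described above: the manifest at the branch tip has to stay correct for every version it covers, in exchange for never needing to know a commit hash in advance.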
[-]
- adrianN 2 hours ago
  Before you manage to build a popular tool it is unlikely that you need to serve many users. Directly going for something that can serve the world is probably premature.
  [-]
  - dboon 2 hours ago
    For most software, yes. But the value of a package manager is in its adoption. A package manager that doesn’t run up against these problems is probably a failure anyway.
- krautsauer 1 hour ago
  I wonder how meson wraps' story fits with this. They used not to, but now they're throwing everything into a single repository [0]. I wonder about the motivation and how it compares to your project.
  0: https://github.com/mesonbuild/wrapdb/tree/master/subprojects
- ambicapter 2 hours ago
  > Unfortunately, when you’re starting out, the idea of running a registry is a really tough sell. Now, on top of the very hard engineering problem of writing the code and making a world class tool, plus the social one of getting it adopted, I need to worry about funding and maintaining something that serves potentially a world of traffic? The git solution is intoxicating through this lens.
  So you need a decentralized database? Those exist (or you can make your own, if you're feeling ambitious), probably ones that scale in different ways than git does.
  [-]
  - dboon 2 hours ago
    Please share. I’m interested in anything that’s roughly as simple as implementing a centralized registry, is easily inspected by users (preferably with no external tooling), and is very fast.
    It’s really important that someone is able to search for the manifest one of their dependencies uses for when stuff doesn’t work out of the box. That should be as simple as possible.
    I’m all ears, though! Would love to find something as simple and good as a git registry but decentralized
    [-]
    - strbean 39 minutes ago
      Distributed ledger! /s... ?
ekjhgkejhgk 4 hours ago
Do the easy thing while it works, and when it stops working, fix the problem.
Julia does the same thing, and from the Rust numbers on the article, Julia has about 1/7th the number of packages that Rust does[1] (95k/13k = 7.3).
It works fine, Julia has some heuristics to not re-download it too often.
But more importantly, there's a simple path to improve. The top Registry.toml [1] has a path to each package, and once downloading everything proves unsustainable you can just download that one file and use it to download the rest as needed. I don't think this is a difficult problem.
[1] https://github.com/JuliaRegistries/General/blob/master/Regis...
[-]
- galenlynch 3 hours ago
  I believe Julia only uses the Git registry as an authoritative ledger where new packages are registered [1]. My understanding is that as you mention, most clients don't access it, and instead use the "Pkg Protocol" [2] which does not use Git.
  [1] https://github.com/JuliaRegistries/General
  [2] https://pkgdocs.julialang.org/dev/protocol/
- mi_lk 35 minutes ago
  > Do the easy thing while it works, and when it stops working, fix the problem
  Another way to phrase this mindset is "fuck around and find out" in gen-Z speak. It's usually practical to an extent but I'm personally not a fan
- IshKebab 1 hour ago
  > when it stops working, fix the problem
  This is too naive. Fixing the problem costs a different amount depending on when you do it. The later you leave it the more expensive it becomes. Very often to the point where it is prohibitively expensive and you just put up with it being a bit broken.
  This article even has an example of that - see the vcpkg entry.
- zahlman 3 hours ago
  > 00000000-1111-2222-3333-444444444444 = { name = "REPLTreeViews", path = "R/REPLTreeViews" }
  ... Should it be concerning that someone was apparently able to engineer an ID like that?
  [-]
  - ekjhgkejhgk 3 hours ago
    Could you please articulate specifically why that should be concerning?
    Right now I don't see the problem because the only criterion for IDs is that they are unique.
    [-]
    - zahlman 2 hours ago
      I didn't know whether they were supposed to be within the developer's control (in which case the only real concern is whether someone else has already used the id), or generated by the system (in which case a developer demonstrated manipulation of that system).
      Apparently it is the former, and most developers independently generate random IDs because it's easy and is extremely unlikely to result in collisions. But it seems the dev at the top of the list had a sense of vanity instead.
      [-]
      - KenoFischer 2 hours ago
        You're supposed to generate a random one, but the only consequence of not doing so is that you won't be able to register your package if someone else already took the UUID (which is a pain if you have registered versions in a private registry). That said, "vanity" UUIDs are a bad look, so we'd probably reject them if someone tried that today, but there isn't any actual issue with them.
  - skycrafter0 3 hours ago
    If you read the repo README, it just says "generate a uuid". You can use whatever you want as long as it fits the format, it seems.
  - adestefan 3 hours ago
    It’s as random as any other UUID.
    [-]
    - Severian 3 hours ago
      Incorrect, only some UUIDs are random, specifically v4 and v7 (v7 uses time as well).
      https://en.wikipedia.org/wiki/Universally_unique_identifier
      > 00000000-1111-2222-3333-444444444444
      This would technically be version 2, which would be built from the date-time and MAC address, and DCE security version.
      But overall, if you allow any yahoo to pick a UUID, it's not really a UUID, it's just some random string that looks like one.
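      For anyone who wants to poke at the bits, a small sketch using Python's standard `uuid` module. The helper `version_nibble` is my own name. Note that the module only reports a version when the variant bits say RFC 4122, which is not the case for this particular string:

```python
import uuid

def version_nibble(u: uuid.UUID) -> int:
    """Raw 4-bit version field: the first hex digit of the third group."""
    return (u.int >> 76) & 0xF

vanity = uuid.UUID("00000000-1111-2222-3333-444444444444")
random_one = uuid.uuid4()

# The version digit of the vanity ID is 2, but its variant field
# ("3333" starts with binary 0011) is the reserved NCS variant rather
# than RFC 4122, so uuid.UUID.version returns None for it.
vanity_fields = (version_nibble(vanity), vanity.variant, vanity.version)

# A freshly generated random UUID is version 4 with the RFC 4122 variant.
random_fields = (version_nibble(random_one), random_one.variant, random_one.version)
```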
      [-]
      - ekjhgkejhgk 1 hour ago
        > if you allow any yahoo to pick a UUID, it's not really a UUID
        universally unique identifier (UUID)
        > 00000000-1111-2222-3333-444444444444
        It's unique.
        Anyway we're talking about a package that doesn't matter. It's abandoned. Furthermore it's also broken, because it uses REPL without importing it. You can't even precompile it.
        https://github.com/pfitzseb/REPLTreeViews.jl/blob/969f04ce64...
    - anonymars 1 hour ago
      Which is to say, not guaranteed at all. GUIDs are designed to be unique, not random/unpredictable
      https://devblogs.microsoft.com/oldnewthing/20120523-00/?p=75...
- 0xbadcafebee 2 hours ago
  This is basically unethical. Imagine anything important in the world that worked this way. "Do nuclear engineering the easy way while it works, and when it stops working, fix the problem."
  Software engineers always make the excuse that what they're making now is unimportant, so who cares? But then everything gets built on top of that unimportant thing, and one day the world crashes down. Worse, "fixing the problem" becomes near impossible, because now everything depends on it.
  But really the reason not to do it, is there's no need to. There are plenty of other solutions than using Git that work as well or better without all the pitfalls. The lazy engineer picks bad solutions not because it's necessarily easier than the alternatives, but because it's the path of least resistance for themselves.
  Not only is this not better, it's often actively worse. But this is excused by the same culture that gave us "move fast and break things". All you have to do is use any modern software to see how that worked out. Slow bug-riddled garbage that we're all now addicted to.
  [-]
  - xboxnolifes 39 minutes ago
    Most of the world does work this way. Problems are solved within certain conditions and for use over a certain time frame. Once those change, the problem gets revisited.
    Most software gets to take it to more of an extreme than many engineering fields since there isn't physical danger. It's telling that the counterexamples always use the potentially dangerous problems like medicine or nuclear engineering. The software in those fields is more stringent.
  - hombre_fatal 2 hours ago
    On the other hand, GitHub wants to be the place you choose to build your registry for a new project, and they are clearly on board with the idea given that they help massive projects like Nix packages instead of kicking them off.
    As opposed to something like using a flock of free blogger.com blogs to host media for an offsite project.
  - ModernMech 1 hour ago
    Hold up... "lazy engineers" are the problem here? What about a society that insists on shoving the work product of unfunded, volunteer engineers into critical infrastructure because they don't want to pay what it costs to do things the right way? Imagine building a nuclear power plant with an army of volunteer nuclear engineers.
    It cannot be the case that software engineers are labelled lazy for not building the at-scale solution to start with, but at the same time everyone wants to use their work, and there are next to no resources for said engineer to actually build the at scale solution.
    > the path of least resistance for themselves.
    Yeah because they're investing their own personal time and money, so of course they're going to take the path that is of least resistance for them. If society feels that's "unethical", maybe pony up the cash because you all still want to rely on their work product they are giving out for free.
    [-]
    - rovr138 2 minutes ago
      > If society feels that's "unethical", maybe pony up the cash because you all still want to rely on their work product they are giving out for free.
      I like OSS and everything.
      Having said that, ethically, should society be paying for these? Maybe that is what should happen. In some places, we have programs to help artists. Should we have the same for software?
  - ekjhgkejhgk 55 minutes ago
    Fixing problems as they appear is unethical? Ok then.
    You realize, there are people who think differently? Some people would argue that if you keep working on problems you don't have but might have, you end up never finishing anything.
    It's a matter of striking a balance, and I think you're way on one end of the spectrum. The vast majority of people using Julia aren't building nuclear plants.
steeleduncan 4 hours ago
The other conclusion to draw is "Git is a fantastic choice of database for starting your package manager, almost all popular package managers began that way."
[-]
- saidinesh5 3 hours ago
  I think the conclusion is more that package definitions can still be maintained on git/GitHub but the package manager clients should probably rely on a cache/db/a more efficient intermediate layer.
  Mostly to avoid downloading the whole repo/resolve deltas from the history for the few packages most applications tend to depend on. Especially in today's CI/CD World.
  [-]
  - pseufaux 14 minutes ago
    This is how WinGet works. It has a small SQLite db it downloads from a hosted url. The DB contains some minimal metadata and a url path to access the full metadata. This way WinGet only has to make API calls for packages it's actually interacting with. As a package manager, it has plenty of problems still, but it's a simple, elegant solution for the git as a DB issue.
  - reactordev 3 hours ago
    This is exactly the right approach. I did this for my package manager.
    It relies on a git repo branch for stable. There are yaml definitions of the packages including urls to their repo, dependencies, etc. Preflight scripts. Post install checks. And the big one, the signatures for verification. No binaries, rpms, debs, ar, or zip files.
    What’s actually installed lives in a small SQLite database, and searching for software does a vector search on each package’s YAML description.
    Semver included.
    This was inspired by brew/portage/dpkg for my hobby os.
- edolstra 1 hour ago
  Indeed. Nixpkgs wouldn't have been as successful if it hadn't been using Git (or GitHub).
  Sure, eventually you run into scaling issues, but that's a first world problem.
- bluGill 3 hours ago
  Git isn't a fantastic choice unless you know nothing about databases. A search would show plenty of research on databases and what works when/why.
  [-]
  - kibwen 3 hours ago
    For the purposes of the article, git isn't just being used as a database, it's being used as a protocol to replicate the database to the client to allow for offline operation and then keep those distributed copies in sync. And even for that purpose you can do better than git if you know what you're doing, but knowledge of databases alone isn't going to help you (let alone make your engineering more economical than relying on free git hosting).
    [-]
    - freedomben 2 hours ago
      Exactly. It's not just about the best solution to the problem, it's also heavily about the economics around it. If I wanted to create a new package manager today, I could get started by utilizing Git and existing git hosting solutions with very little effort, and effort translates to time, and time is a scarce resource. If you don't know whether your package manager will take off or not, it may not be the best use of your scarce resources to invest in a robust and optimized solution out of the gate. I wish that weren't the case, I would love to have an infinite amount of time, but wishing is not going to make it happen
- adastra22 3 hours ago
  Git is an absolute shit database for a package manager even in the beginning. It’s just that GitHub subsidizes hosting and that is hard to pass up.
  [-]
  - fn-mote 2 hours ago
    Sure, but can you back up the expletive with some reason why you think that?
    As it is, this comment is just letting out your emotion, not engaging in dialogue.
  - IshKebab 1 hour ago
    What's a better option? One that keeps track of history and has a nice review interface?
kibwen 3 hours ago
I think there's a form of survivorship bias at work here. To use the example of Cargo, if Rust had never caught on, and thereby gotten popular enough to inflate the git-based index beyond reason, then it would never have been a problem to use git as the backing protocol for the index. Likewise, we can imagine innumerable smaller projects that successfully use git as a distributed delta-updating data distribution protocol, and never happen to outgrow it.
The point being, if you're not sure whether your project will ever need to scale, then it may not make sense to reinvent the wheel when git is right there (and then invent the solution for hosting that git repo, when Github is right there), letting you spend time instead on other, more immediate problems.
[-]
- stickfigure 47 minutes ago
  Right, this post may encourage premature optimization. Cargo, Homebrew, et al chose an easy, good-enough solution which allowed them to grow until they hit scaling limits. This is a good problem to have.
  I am sure there's value having a vision for what your scaling path might be in the future, so this discussion is a good one. But it doesn't automatically mean that git is a bad place to start.
Ericson2314 48 minutes ago
The Nixpkgs example is not like the others, because it is source code.
I don't get what is so bad about shallow clones either. Why should they be so performance sensitive?
[-]
- ajb 35 minutes ago
  In a compressed format, later commits would be added as a delta of some kind, to avoid increasing the size by the whole tree size each time. To make shallow clones efficient you'd need to rewrite the compressed form such that earlier commits are instead deltas on later ones, or something equivalent.
jarofgreen 1 hour ago
It's not just package managers that do this - a lot of smaller projects crowdsource data in git repositories. Most of these don't reach the scale where the technical limitations become a problem.
Personally my view is that the main problem when they do this is that it gets much harder for non-technical people to contribute. At least that doesn't apply to package managers, where it's all technical people contributing.
There are a few other small problems - but it's interesting to see that so many other projects do this.
I ended up working on an open source software library to help in these cases: https://www.datatig.com/
Here's a write up of an introduction talk about it: https://www.datatig.com/2024/12/24/talk.html I'll add the scale point to future versions of this talk with a link to this post.
drzaiusx11 11 minutes ago
I'd add git gemfile dependencies to the list of languages called out here as well. It supports git repos, but in general it's a bad idea unless you are diligent with git tag use and disallow git tag mutability, which also assumes you have complete control of your git dependencies...
drzaiusx11 15 minutes ago
One of the first things I did at my current place of employment was to detangle the mess of gemfile git dependencies and get them to adopt semver and an actual package repo. There were so many footguns with git dependencies in ruby we were getting taken down by friendly fire on the daily...
cbondurant 2 hours ago
Admittedly, I try and stay away from database design whenever possible at work. (Everything database is legacy for us) But the way the term is being used here kinda makes me wonder, do modern sql databases have enough security features and permissions management systems in place that you could just directly expose your database to the world with a "guest" user that can only make incredibly specific queries?
Cut out the middle man, directly serve the query response to the package manager client.
(I do immediately see issues stemming from the fact that you can't leverage features like edge caching this way, but I'm not really asking if it's a good solution, I'm more asking if it's possible at all)
[-]
- bob1029 2 hours ago
  There are still no realistic ways to expose a hosted SQL solution to the public without really unhappy things occurring. It doesn't matter which vendor you pick.
  Anything where you are opening a TCP connection to a hosted SQL server is a non-starter. You could hypothetically have so many read replicas that no one could blow anyone else up, but this would get to be very expensive at scale.
  Something involving SQLite is probably the most viable option.
- zX41ZdbW 1 hour ago
  ClickHouse can do it. Examples:
```
    https://play.clickhouse.com/

    clickhouse-client --host play.clickhouse.com --user play --secure

    ssh play.clickhouse.com
```
- brendoncarroll 2 hours ago
  I personally think that this is the future, especially since such an architecture allows for E2E encryption of the entire database. The protocol should just be a transaction layer for coordinating changes of opaque blobs.
  All of the complexity lives on the client. That makes a lot of sense for a package manager because it's something lots of people want to run, but no one really wants to host.
- mirekrusin 1 hour ago
  You can use fossil [0]
  [0] https://fossil-scm.org
quaintdev 4 hours ago
I host my own code repository using Forgejo. It's not public. In fact, it's behind mutual TLS like all the services I host. Reason? I don't want to deal with bots and other security risks that come with opening a port to the world.
Turns out Go modules will not accept a package hosted on my Forgejo instance because it asks for a client certificate. There are ways to make go get use ssh but even with that approach the repository needs to be accessible over https. In the end, I cloned the repository and used it in my project using a replace directive. It's really annoying.
[-]
- agwa 3 hours ago
  If you add .git to the end of your module path and set $GOPRIVATE to the hostname of your Forgejo instance, then Go will not make any HTTPS requests itself and instead delegate to the git command, which can be configured to authenticate with client certificates. See https://go.dev/ref/mod#vcs-find
- xyzzy_plugh 3 hours ago
  > There are ways to make go get use ssh but even with that approach the repository needs to be accessible over https.
  No, that's false. You don't need anything to be accessible over HTTP.
  But even if it did, and you had to use mTLS, there's a whole bunch of ways to solve this. How do you solve this for any other software that doesn't present client certs? You use a local proxy.
- irusensei 1 hour ago
  Have a look at Tailscale DNS and certs. It gives you a valid cert through Let's Encrypt without exposing your services to the internet.
ifh-hn 3 hours ago
So what's the answer then? That's the question I wanted answered after reading this article. With no experience with git or package management, would using a local client sqlite database and something similar on the server do?
[-]
- encom 3 hours ago
  I quite like Gentoo's rsync based package manager. I believe they've used that since the beginning. It works well.
dleslie 2 hours ago
GitHub is intoxicatingly free hosting, but Git itself is a terrible database. Why not maintain an _actual_ database on GitHub, with tagged releases?
Sqlite data is paged and so you can get away with only fetching the pages you need to resolve your query.
https://phiresky.github.io/blog/2021/hosting-sqlite-database...
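A sketch of the arithmetic behind that, in Python. It only shows how a client could turn "I need page N" into an HTTP Range header; no request is made, the helper names are mine, and the throwaway database is built locally just to have a real header to read.

```python
import os
import sqlite3
import struct
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"

def read_page_size(header: bytes) -> int:
    """Page size is a big-endian 16-bit value at byte offset 16 of the header."""
    if header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database header")
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw  # a stored value of 1 means 65536

def page_range_header(page_number: int, page_size: int) -> str:
    """HTTP Range header value covering one 1-based database page."""
    start = (page_number - 1) * page_size
    return f"bytes={start}-{start + page_size - 1}"

def demo_header() -> bytes:
    """Build a throwaway database and return its first 100 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.sqlite")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
        conn.execute("INSERT INTO packages VALUES ('example', '1.0.0')")
        conn.commit()
        conn.close()
        with open(path, "rb") as fh:
            return fh.read(100)
```

In practice the client would fetch the first 100 bytes once, learn the page size, and then request whichever pages the B-tree walk touches.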
mukundesh 28 minutes ago
Though not Github, worth mentioning Huggingface, which is also using git, but managing large files with their(?) xet protocol. https://huggingface.co/docs/hub/en/xet/index
hogrug 3 hours ago
The facts are interesting but the conclusion a bit strange. These package managers have succeeded because git is better for the low trust model and GitHub has been hosting infra for free that no one in their right mind would provide for the average DB.
If it didn't work we would not have these massive ecosystems upsetting GitHub's freemium model, but anything at scale is naturally going to have consequences and features that aren't so compatible with the use case.
teiferer 46 minutes ago
And this my friends is the reason why (only) focusing on CPU cycles and memory hierarchies is insufficient when thinking of the performance of a system. Yes they are important. But no level of low-level optimization will get you out of the hole that a wrong choice of algorithm and/or data structure may have dug you into.
twoodfin 4 hours ago
What made git special & powerful from the start was its data model: Like the network databases of old, but embedded in a Merkle tree for independent evolution and verifiability.
Scaling that data model beyond projects the size of the Linux kernel was not critical for the original implementation. I do wonder if there are fundamental limits to scaling the model for use cases beyond “source code management for modest-sized, long-lived projects”.
[-]
- amluto 3 hours ago
  Most of the problems mentioned in the article are not problems with using a content-addressed tree like git or even with using precisely git’s schema. The problems are with git’s protocol and GitHub’s implementation thereof.
  Consider vcpkg. It’s entirely reasonable to download a tree named by its hash to represent a locked package. Git knows how to store exactly this, but git does not know how to transfer it efficiently.
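  To illustrate the "tree named by its hash" part, a small Python sketch of how Git derives object IDs for blobs and a flat tree (SHA-1 object format, regular files only, no subdirectories). The function names are mine, and this covers only the naming side, not the transfer.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    """Git hashes '<kind> <size>' + NUL + body to name an object."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def flat_tree_body(files: dict[str, bytes]) -> bytes:
    """Serialize a tree of regular files: mode, name, NUL, raw 20-byte blob ID."""
    parts = []
    for name in sorted(files):  # entries are stored sorted by name
        blob_id = bytes.fromhex(git_object_id("blob", files[name]))
        parts.append(b"100644 " + name.encode() + b"\0" + blob_id)
    return b"".join(parts)

def flat_tree_id(files: dict[str, bytes]) -> str:
    """Name a whole directory snapshot by the hash of its contents."""
    return git_object_id("tree", flat_tree_body(files))

port_a = flat_tree_id({"portfile.cmake": b"# recipe\n", "vcpkg.json": b"{}\n"})
port_b = flat_tree_id({"vcpkg.json": b"{}\n", "portfile.cmake": b"# recipe\n"})
```

  Identical contents always produce the identical tree ID regardless of history, which is why it works as a lock value even though fetching by that ID is awkward.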
  [-]
  - mananaysiempre 2 hours ago
    > Git knows how to store [a hash-addressed tree], but git does not know how to transfer it efficiently.
    Naïvely, I’d expect shallow clones to be this, so I was quite surprised by a mention of GitHub asking people not to use them. Perhaps Git tries too hard to make a good packfile?..
    Meanwhile, what Nixpkgs does (and why “release tarballs” were mentioned as a potential culprit in the discussion linked from TFA) is request a gzipped tarball of a particular commit’s files from a GitHub-specific endpoint over HTTP rather than use the Git protocol. So that’s already more or less what you want, except even the tarball is 46 MB at this point :( Either way, I don’t think the current problems with Nixpkgs actually support TFA’s thesis.
gethly 3 hours ago
If we stopped using VCS to fetch source files, we would lose the ability to get the exact commit (read: version, independent of the underlying VCS) of these files. Git, Mercurial, SVN, GitHub, Bitbucket... it does not matter. Absolutely nobody will be building downloadable versions of their source files, hosted on who knows how "prestigious" domains, by copying them to another location just to serve the --->exact same content<--- that GitHub and the like already provide.
This entire blog is just a waste of time for anyone reading it.
[-]
- throwway120385 2 hours ago
  Or you could just ship a tarball and an sha checksum.
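  Something like this on the client side, as a minimal Python sketch. SHA-256 is my choice here, and the function names are illustrative:

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Hex digest of the downloaded archive bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_archive(data: bytes, expected_hex: str) -> bool:
    """True only if the archive matches the published checksum."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.strip().lower())

archive = b"pretend this is a release tarball"
published = sha256_hex(archive)  # what the publisher would list next to the file
```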
  [-]
  - gethly 2 hours ago
    you could, in case you want to make only certain releases publicly available. but then, who wants to do that manual labour? we're talking mainstream here, not specific use cases.
- layer8 2 hours ago
  And yet, that's pretty much how the Java world works (Maven repositories).
- forrestthewoods 2 hours ago
  > This entire blog is just a waste of time for anyone reading it.
  Well that’s an extremely rude thing to say.
  Personally I thought it was really interesting to read about a bunch of different projects all running into the same wall with Git.
  I also didn’t realize that Git had issues with sparse checkouts. Or maybe author meant shallow? I forget.
ori_b 3 hours ago
Alternatively: Downloading the entire state of all packages when you care about just one, it never works out.
O(1) beats O(n) as n gets large.
[-]
- gruez 3 hours ago
  Seems to still work out for apt?
  [-]
  - ajb 2 hours ago
    Not in the same sense. An analogy might be: apt is like fetching a git repo in which all the packages are submodules, so lazily fetched. Some of the package managers in the article seem to be using a monorepo for all packages - including the content. Others seem to have different issues - go wasn't including enough information in the top level, so all the submodules had to be fetched anyway. vcpkg was doing something with tree hashes which meant they weren't really addressable.
Zambyte 4 hours ago
The issues with using Git for Nix seem to entirely be issues with using GitHub for Nix, no?
[-]
- Rucadi 4 hours ago
  I also got the same feeling from that, in fact, I would go as far as to say that nixpkgs and nix-commands integration with git works quite well and is not an issue.
  So the phrase the article says "Package managers keep falling for this. And it keeps not working out" I feel that's untrue.
  The main issue I have with this really is the "flakes" integration, where the whole recipe folder is copied into the store (which doesn't happen with non-flakes commands), but that's a tooling problem, not an intrinsic problem of using git.
- femiagbabiaka 4 hours ago
  Yeah, its inclusion in here is baffling because none of the listed issues have anything to do with the particular issue nixpkgs is having.
iamwil 40 minutes ago
This sounds like a missing piece of software in the OSS world. If you have the inclination, you should write it.
pizlonator 1 hour ago
What is the alternative?
"Use a database" isn't actionable advice because it's not specific enough
bencornia 4 hours ago
> Grab’s engineering team went from 18 minutes for go get to 12 seconds after deploying a module proxy. That’s not a typo. Eighteen minutes down to twelve seconds.
> The problem was that go get needed to fetch each dependency’s source code just to read its go.mod file and resolve transitive dependencies. Cloning entire repositories to get a single file.
I have also had inconsistent performance with go get. Never enough to look closely at it. I wonder if I was running into the same issue?
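For reference, this is roughly why a proxy helps: with the module proxy protocol, the go.mod for a given version is a single small HTTP GET instead of a clone. A Python sketch that only builds the URL (no request is made); the function names are mine, and the module and version are just an example to show the case-escaping rule.

```python
def escape(path: str) -> str:
    """Proxy paths replace each uppercase letter with '!' + its lowercase form."""
    return "".join("!" + ch.lower() if ch.isupper() else ch for ch in path)

def proxy_url(module: str, version: str, suffix: str = "mod",
              proxy: str = "https://proxy.golang.org") -> str:
    """URL for one artifact of one module version: 'info', 'mod', or 'zip'."""
    if suffix not in ("info", "mod", "zip"):
        raise ValueError(f"unknown artifact: {suffix}")
    return f"{proxy}/{escape(module)}/@v/{escape(version)}.{suffix}"

gomod_url = proxy_url("github.com/BurntSushi/toml", "v1.3.2")
```

Resolving transitive dependencies then means fetching a handful of `.mod` files, and the `.zip` is only needed for modules that actually end up in the build.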
[-]
- zahlman 3 hours ago
  > needed to fetch each dependency’s source code just to read its go.mod file and resolve transitive dependencies.
  Python used to have this problem as well (technically still does, but a large majority of things are available as a wheel and PyPI generally publishes a separate .metadata file for those wheels), but at least it was only a question of downloading and unpacking an archive file, not cloning an entire repo. Sheesh.
  Why would Go need to do that, though? Isn't the go.mod file in a specific place relative to the package root in the repo?
  [-]
  - klooney 2 hours ago
    Go's lock files arrived at around the same time as the proxy, before then you didn't have transitive dependencies pre baked.
- fireflash38 3 hours ago
  How long ago were you having issues? That was changed in go 1.13.
dwardu 55 minutes ago
Worst thing is when you’re in an office and your PC, along with other PCs, pulls from git unauthenticated, then you get hit with API limits
weiwenhao 39 minutes ago
For package management software that is rarely used, free is the biggest motivation.
nacozarina 3 hours ago
successful things often have humble origins, it’s a feature not a bug
for every project that managed to out-grow ext4/git there were a hundred that were well-served and never needed to over-invest in something else
mikkupikku 4 hours ago
People who put off learning SQL for later end up using anything other than a database as their database.
[-]
- groundzeros2015 1 hour ago
  Is sql over ssh a thing?
- redog 3 hours ago
  SQL killed the set theory star
PunchyHamster 2 hours ago
The article's conclusion is just... not good. There are many benefits to using Git as a backend: you can point your project at any single commit as a version, which makes testing fixes or changes in libs super easy; it has built-in integrity control; and technically (sadly not in practice) you could just sign commits and use that to verify whether a package is authentic.
Being suboptimal bandwidth-wise is frankly just a technical hurdle to get over, with benefits well worth the drawback.
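To make the integrity point concrete: Git names every file by a hash of its contents, so a pinned ID doubles as a checksum. A minimal sketch, assuming the default SHA-1 object format; `git_blob_id` and `verify` are illustrative names of mine, not Git APIs.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a file's contents (SHA-1 object format)."""
    header = b"blob %d\x00" % len(content)  # "blob <size>\0" is prepended before hashing
    return hashlib.sha1(header + content).hexdigest()


def verify(content: bytes, expected_id: str) -> bool:
    """Return True only if the content still hashes to the pinned ID."""
    return git_blob_id(content) == expected_id


if __name__ == "__main__":
    pinned = git_blob_id(b"name = demo\n")
    print(verify(b"name = demo\n", pinned))   # unchanged content
    print(verify(b"name = d3mo\n", pinned))   # tampered content
```

Trees and commits are hashed the same way over the IDs beneath them, which is why pinning one commit ID pins the whole snapshot.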
hk1337 3 hours ago
I like Go but its dependency management is weird and seems to be centered around GitHub a lot.
[-]
- Hendrikto 3 hours ago
  There is nothing tying Go to GitHub.
- rewgs 2 hours ago
  Not at all. It can grab git repos (as well as work with other VCSs). There's just a lot of stuff on GitHub, hence your impression.
xpressvideoz 2 hours ago
The article lists Git-based wiki engines as a bad usage of Git. Can anybody recommend alternatives? I want something that can be self-hosted, is easily modified by text editors, and has individual page history, preferably with Markdown.
keithgroves 1 hour ago
When building https://enact.tools we considered this. I'm glad we didn't go this route.
dromologist 2 hours ago
We wanted to pull updated code in our undockerized instances when they were instantiated, so we decided to pull the code from GitHub. Worked out pretty well though after a thousand trials we got a 502 and now we're one step closer to being forced into a CD pipeline.
sghiassy 2 hours ago
Use git clone's --depth option (a shallow clone) and you’ll only download the most recent commits. Yeesh
miyuru 4 hours ago
Funnily enough, I clicked the homebrew GitHub link in the post, only to get a rate limited error page from GitHub.
0xbadcafebee 2 hours ago
YOLO software engineering, the hallmark of the 21st century
born-jre 3 hours ago
lol I see this as I plan on using Git for my thing store. https://github.com/blue-monads/potatoverse
BlueTemplar 1 hour ago
Wait, isn't fossil based on sqlite ?
Or does fossil itself still have the same issues ?
holyknight 1 hour ago
It’s basically the same thing that always happens when you choose a technology because it’s convenient rather than a great fit for your problem. Sooner or later, you’ll hit a wall. Just because you can cook a salmon in your dishwasher doesn’t mean you should.
frumplestlatz 3 hours ago
Since ~2002, Macports has used svn or git, but users, by default, rsync the complete port definitions + a server-generated index + a signature.
The index is used for all lookups; it can also be generated or incrementally updated client-side to accommodate local changes.
This has worked fine for literally decades, starting back when bandwidth and CPU power were far more limited.
The problem isn’t using SCM, and the solutions have been known for a very long time.
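Roughly, the client-side incremental update can look like the sketch below. To be clear, this is not MacPorts' actual index format or code: the "key value" definition format, `parse_definition`, and `refresh_index` are all made up for illustration. The idea is only that lookups hit the index, and a refresh re-parses just the definitions that changed.

```python
from typing import Callable, Dict, Tuple

# name -> (mtime seen at index time, parsed metadata)
Index = Dict[str, Tuple[float, Dict[str, str]]]
# name -> (current mtime, raw definition text); stands in for the synced tree
Tree = Dict[str, Tuple[float, str]]


def parse_definition(raw: str) -> Dict[str, str]:
    """Parse a hypothetical 'key value' per line definition."""
    meta: Dict[str, str] = {}
    for line in raw.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            meta[key] = value.strip()
    return meta


def refresh_index(index: Index, tree: Tree,
                  parse: Callable[[str], Dict[str, str]]) -> int:
    """Re-parse only new or modified definitions; drop removed ones.

    Returns how many definitions were re-parsed.
    """
    reparsed = 0
    for name, (mtime, raw) in tree.items():
        cached = index.get(name)
        if cached is None or cached[0] != mtime:
            index[name] = (mtime, parse(raw))
            reparsed += 1
    for name in set(index) - set(tree):
        del index[name]
    return reparsed


if __name__ == "__main__":
    index: Index = {}
    tree: Tree = {"zlib": (1.0, "version 1.3"),
                  "curl": (1.0, "version 8.5\ndepends zlib")}
    print(refresh_index(index, tree, parse_definition))  # everything is new
    tree["zlib"] = (2.0, "version 1.3.1")                # a local change
    print(refresh_index(index, tree, parse_definition))  # only zlib is re-parsed
```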
encom 3 hours ago
>[Homebrew] Auto-updates now run every 24 hours instead of every 5 minutes[...]
That is such an insane default, I'm at a loss for words.
[-]
- croemer 1 hour ago
  You mean the 5 minutes is insane, right?
aniou 3 hours ago
As a side note: does anyone know why the Rust devs chose an already-used name for language change proposals? "RFC" was already taken and well established, and I simply refuse to accept that nobody was aware of Request for Comments. And if they were aware and the clash was created deliberately, then it was rude and arrogant.
Every ...king time I read something like "RFC 2789 introduced a sparse HTTP protocol", my brain short-circuits. BTW: RFC 2789 is a "Mail Monitoring MIB".
[-]
- adastra22 3 hours ago
  There are many, many RFC collections. Including many that predate the IETF. Some even predate computers.
  [-]
  - aniou 3 hours ago
    But they were in different domains. Here we have a strong clash, because Rust is positioning itself as a secure systems and internet language, and computer and internet standards are already defined by RFCs. So it would not be uncommon for someone to talk about a Rust mechanism defined by a particular RFC in the context of handling a particular protocol defined by... well... an RFC too. But not a Rust one.
    Not so smart, when we realize that one aspect of a secure and reliable system is the elimination of ambiguities.
gjvc 3 hours ago
sqlite seems to be ideal for a package manager
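A quick sketch of what I mean, with a made-up two-table schema (not any real package manager's layout): the whole transitive dependency set comes back from one recursive query against a local file, no repo walking needed.

```python
import sqlite3
from typing import List


def open_index() -> sqlite3.Connection:
    """Create an in-memory package index with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE package (name TEXT PRIMARY KEY, version TEXT NOT NULL);
        CREATE TABLE dependency (
            package  TEXT NOT NULL REFERENCES package(name),
            requires TEXT NOT NULL REFERENCES package(name),
            PRIMARY KEY (package, requires)
        );
    """)
    return db


def transitive_deps(db: sqlite3.Connection, name: str) -> List[str]:
    """Return every package reachable from `name`, sorted by name."""
    rows = db.execute("""
        WITH RECURSIVE closure(name) AS (
            SELECT requires FROM dependency WHERE package = ?
            UNION  -- UNION (not UNION ALL) de-duplicates, so cycles terminate
            SELECT d.requires
            FROM dependency AS d JOIN closure AS c ON d.package = c.name
        )
        SELECT name FROM closure ORDER BY name
    """, (name,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_index()
    db.executemany("INSERT INTO package VALUES (?, ?)",
                   [("app", "1.0"), ("libb", "2.1"), ("libc", "0.9")])
    db.executemany("INSERT INTO dependency VALUES (?, ?)",
                   [("app", "libb"), ("libb", "libc")])
    print(transitive_deps(db, "app"))
```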
[-]
- sigwinch 2 hours ago
  I feel like the rqlite people would have a lot to say about how to coordinate your installations, especially for the high-bandwidth non-desktop installs.
  https://news.ycombinator.com/item?id=45257349
- mirekrusin 1 hour ago
  ...or scm [0]
  [0] https://fossil-scm.org
eviks 4 hours ago
Indeed, the seductive nature of bad tools lying close to your hand - no need to lift your butt to get them!