I don't get why people use Docker for "small SaaS apps" where "scale is not an issue". You can run everything without Docker to remove an extra level of indirection. I've been doing that exact thing with a very similar Django stack for years. Not once have I missed Docker. On the other hand, I have taken over SaaS projects where the (unnecessary) usage of Docker made it more expensive to host, decidedly more difficult to debug and maintain, and tests were slow, making me less productive.
As the reason for using Docker, the author writes "The benefits of matching your development environment to your production one cannot be overstated". Again in my personal experience, this simply isn't an issue. Just use the same Python version and a roughly similar PostgreSQL version in development and production, and you're good to go.
In short, I feel OP would be better off without Docker. Once you know the rest of the stack well enough, you can set it up in dev and prod in such a way that you can trust that what works in dev will also work on prod. I've been doing this without problems for years.
This is a pretty common sentiment on HN and I’m sure you have your reasons for it, but to me personally it just seems bonkers. Docker is an absolutely game-changing tool for me. I can’t count the number of times it’s saved me from completely screwed up system libraries, tools installing config files in weird places, conflicting versions of this or that, or other forms of system pollution. I write a simple docker-compose.yml and I’m good to go - Docker takes care of sandboxing, networking, mounts - the works.
As I said before, you must have your reasons for your view, but my own experience is the polar opposite.
I'm a bit of a docker cynic, but if you tell me I have to deploy any of Ruby, Python, Node.js, or Perl, I'm using docker to manage the dependencies. My opinion is that this is actually docker's primary value-add. Yes, it enables a whole host of orchestration benefits, but it also makes it actually possible to just package these runtimes in a reasonable-ish way.
For platforms which support more self-contained binaries, I'm going to keep trying to avoid docker as much as possible.
Sure, we're all just talking about our personal experiences. I just haven't run into the problems you describe. I use virtual environments to keep my system installation clean, and PostgreSQL / Nginx are so stable that the version I happen to have on my dev machine usually works with all my projects dating back to 2014 (but kept up-to-date).
> Sure, we're all just talking about our personal experiences. I just haven't run into the problems you describe. I use virtual environments to keep my system installation clean, and PostgreSQL / Nginx are so stable that the version I happen to have on my dev machine usually works with all my projects dating back to 2014 (but kept up-to-date).
One of the most important things Docker provides is an abstraction between the application and the OS. If you don't need the abstraction, then it's only going to look like cost.
On the other hand, being able to write some code in Python 3.9 and deploy it anywhere across a heterogeneous production environment without fighting with the OS about how many versions of Python it has installed is a useful thing.
Do you by any chance often have cause to compile and use tools or libraries written by academics? Or regularly use projects that aren’t as big and well-developed as Postgres and Nginx?
I find it really useful for development. For example, when I configure nginx + a service and it works in my local env, I only have to run one command to get the same setup in prod, and I don't have to debug on the server. It's also useful for keeping track of the OS dependencies.
But it ends up being a matter of preference. I could still do without it, but I'm used to it now, and for some reason I feel that it makes my systems more "reproducible" in a simpler way.
Explain? I'm aware that giving a host user permission to control Docker is equivalent to root, but that's no worse than wheel/sudo in most cases, and not a "sandbox" failure. So I assume you're thinking of container escape, which I was given to believe is actually hard these days?
Docker containers share resources such as the host kernel. The Linux kernel is a big, ugly C mess (compared to IncludeOS), and someone can probably find a good enough kernel exploit to escape the container.
That's why a VM provides much better security. VM escape exploits exist too, but they are at least much harder than a Docker-level escape.
He was referring to sandboxing in terms of not polluting the installed packages/libraries on the server and not from a security perspective. But even from a security perspective, there are solutions to this problem like running containers as an unprivileged user using podman.
At work I had to set up a server with django, postgres, SSL, rabbitmq, celery and a few other things. I did it in both ansible and docker-compose, the ansible version was 1500 lines and docker-compose version was 100 lines. A lot of the ansible stuff is dealing with users, groups, services, supervisord, virtualenvs and repositories - none of this is required in docker as containers restart themselves, users are already set up, you don't need virtualenv, and using containers avoids the need to use repositories to get specific software versions.
Also the docker-compose version was much faster to pick up changes, vs minutes for ansible.
I don't find Docker that useful for developing on a day-to-day basis, however; it just gets in the way, and generally you're not going to need a webserver, SSL, celery, rabbitmq in development anyway. It is useful to debug things locally though.
The other reason to use docker is state: while using ansible we got bitten by old dependencies hanging around because ansible doesn't exactly mirror your config, it can really only add stuff. Whereas docker actually wipes the slate clean every single deploy so your server is an exact mirror of your docker-compose file. This is really important.
The cost-benefit trade-offs of Docker will fit some and not others. For smaller projects with less change and few, standard components, Docker is probably overkill. In other cases Ansible is too fiddly or slow.
Still important to keep in mind those Docker images must be maintained too.
You can tell Ansible that a package should be removed. You just have to remember to do it. If there was ever a statement in your playbook that a package should be added then you can’t assume it will disappear automatically - you have to tell Ansible to remove it.
Most of the stuff you would be doing with Ansible happens inside the Dockerfiles. So comparing lines of code of your Ansible codebase with just the docker-compose file is not fair towards Ansible.
I'm not trying to pick on you personally here but this comment, like many I see on HN, seems overly adversarial and could have made the same arguments in a less combative way.
There's a lot of emphasis on your own perspective: "*I* don't get", "*I've* been doing that", "Not once have *I* missed Docker", "making *me* less productive", "*my* personal experience"
And an assumption that it applies to others: "Just use the same [...]", "I feel OP would be better off [...]".
The comment could provoke a more productive, thoughtful discussion by focusing on understanding rather than on pushing particular views and practices.
For example:
- "I don't get why people use Docker" -> "Could anybody share their reasoning for using Docker?"
- "You can run everything without Docker to remove an extra level of indirection" -> "What benefits do you see in exchange for this extra level of indirection?"
- "Not once have I missed Docker." -> "I haven't found much use for Docker and would like to understand those who have"
- "I have taken over [...] making me less productive" -> "I've only encountered downsides in my own experience with Docker, could anybody share what the upsides may have been for others involved?"
- "Again in my personal experience, this simply isn't an issue" -> "I'd be interested to hear more about the benefits as I haven't encountered issues before"
- "In short, I feel OP would be better off without Docker" -> "I wonder if the OP really needs Docker, perhaps they could be more productive without it"
If we all wrote like this (and I'm certainly not perfect myself), HN could be a much friendlier, more welcoming place.
Denigrating any subjective experience because it's "too direct" is bordering on passive aggressive. Every suggested edit you've made to the comment completely alters the meaning of the text: you're basically saying that if you're not looking to learn and completely disregard the veracity of your own experience in every statement you do, you're better off not saying anything at all.
It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
People have subjective opinions and experiences; this does not necessarily make their statements true. This should be obvious and I don't see why it's the sender's responsibility to preface everything they say with that fact to avoid stepping on the toes of whoever reads it. Opinions don't kill debate, they further it.
> you're basically saying that if you're not looking to learn and completely disregard the veracity of your own experience in every statement you do, you're better off not saying anything at all.
That's really not what I'm trying to say at all. If I had to condense it down to one sentence it would be: "if there's a difference in opinion between two reasonable people, it's probably because you've had different experiences; you'll have a more productive discussion by finding out what they are than simply stating your opinions".
> It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
If you're not interested in somebody's opinion, what value could there possibly be in having a discussion with them?
> If you're not interested in somebody's opinion, what value could there possibly be in having a discussion with them?
Not every comment on the Internet is an invitation to discussion, and the discussion is not exclusively between two people. A lot of comments are just responses about why someone disagrees or agrees with what was just said, which is allowed.
Is it OK if I am direct with you and say this is absurd and unhelpful?
OP’s post was clear and to the point. It was entirely focused on the content of their frustrating experience with Docker. And it wasn’t at all “adversarial”. They had a bad experience with software and are sharing it.
I’m glad they didn’t try to wrap it in softening language just in case someone might find their communication style “violent.” I hate this trend. I see so many comments like this in tech and honestly it’s mostly from people who have nothing to say or who become indignant if you say anything they disagree with.
I understand your concern and agree that I am being extremely direct. It's not out of spite. On the one hand, it comes from my frustration with what I as an engineer perceive to be wrong advice. It feels to me like OP is using Docker because he read a blog post about it somewhere, never seriously thought about alternatives, then stuck with it and now recommends it to others, repeating a cycle where software gets worse and worse. And on the other hand, being direct is more concise. We all have little time and I feel that to get a point across on the internet, one has to get it across quickly and succinctly. I have no problem with my views being criticised and I'll be happy to concede where I was wrong if someone makes a clear and harsh point. But either way, I'm sorry if I offended you in particular or others who did not comment; It was not my intention. Having read this thread, I think much of the difference of opinion comes from different contexts. People who like Docker often seem to be working in larger teams, which is totally fine and they're probably right to use Docker. I just couldn't help but disagree with a statement that I felt pertains to an area I work in, which is small SaaS of tiny teams with usually no scalability requirements.
I get where you're coming from and you didn't offend me at all. I'd just been looking at the top comments of a bunch of HN threads and thinking "why does this place seem so unfriendly". Your comment isn't anywhere close to being the worst, it just happened to be the one I responded to.
And being direct isn't a bad thing either, my point here wasn't "don't be direct or concise, you'll offend people" but "assuming everyone here is reasonable, disagreements are likely due to differences in experience and you'll have a more productive discussion by figuring out the differences than simply stating opinions".
You are making judgments about OP using Docker. But from my point of view you are a true pain to work with because you do not containerise your work. Anytime I'd have to work with one of your creations I'd have to go figure out what you are using, set it up and make sure it works. That's a waste of time, and you are promoting that very approach because you were unable to find any use for it. I doubt OP has only "read some blog post about it somewhere". It's literally in wide use at a lot of companies. "People who like Docker often seem to be working in larger teams, which is totally fine and they're probably right to use Docker." -> It's actually about that: it is useful in scaling, that's true, but it's also very useful just for sharing the env your software should work with. And I'd dare to say that providing a Dockerfile is now standard practice.
You're judging my code without ever having seen it, nor knowing the context of my projects. You won't have to work with my creations because, as I explained, my projects usually involve just me. As I also explained above, I think Docker can make sense in companies. And re "standard practice", I like to make decisions on what's best for each project, not by dogma.
What if you want to run the code on a different architecture? In that case isn't it better to document which "pip install" etc. you have to perform to get it running? And from there do you really still need docker?
> I'm not trying to pick on you personally here but this comment, like many I see on HN, seems overly adversarial.
Ironically I find the first comment non-adversarial and this comment overly-adversarial, but maybe that’s just me.
People are allowed to disagree and share their own experiences, and OP didn’t say anything nasty about anyone. I would prefer a space where people share their honest experiences with reasoning over one where everything has to be coated in sugar to make it digestible. And I’m sure the author of the post can handle hearing someone else’s differing opinion, it’s a discussion forum after all - not a blind agreement forum.
I think using "I" statements and referring to their own experience makes the comments subjective. They may be expressing a strong opinion, but they aren't stating it as objective truth.
> If we all wrote like this (and I'm certainly not perfect myself), HN could be a much friendlier, more welcoming place.
This is a well-intentioned thought but I'm not sure it actually makes sense. The things that make or detract from HN being a friendly and welcoming place have a lot to do with its size and heterogeneity -- ideology included. You're trying to neuter the disagreement and rephrase it in terms of a question. That's not always a bad thing, but in some cases you really do just disagree. If that's going to be the case, why not be upfront about it, so long as you can be civil? (GP's point does seem to be phrased in a pretty civil manner IMO)
It’s too fucking hard to setup python with all its pyenv, anaconda, poetry, pip requirements.txt and what-not, that’s the best use case for Docker here. Get it to work once and forget about it.
I also dislike all those tools you mention. So often, I feel like devs use a tool "because it's cool" but fail to appreciate the complexity it adds - not just for themselves but for others. The frustration expressed by your comment is a perfect example of this.
In my ideal world, everybody would just use what comes with vanilla Python:
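A minimal sketch of such a workflow, using only the `venv` and `pip` modules that ship with Python. The `setup_env` helper, the `.venv` directory name and the `requirements.txt` file are assumptions for illustration, not something the comment spells out:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an isolated environment using only the venv
# and pip modules bundled with Python.
setup_env() {
    env_dir="${1:-.venv}"             # assumed environment directory
    req_file="${2:-requirements.txt}" # assumed dependency list

    # Create the virtual environment with the standard-library venv module.
    python3 -m venv "$env_dir" || return 1

    # Use the environment's own interpreter so nothing touches the system Python.
    if [ -f "$req_file" ]; then
        "$env_dir/bin/python" -m pip install -r "$req_file" || return 1
    fi

    echo "Activate with: . $env_dir/bin/activate"
}
```

Calling `setup_env` from the project root and then sourcing the printed activate script would give a project-local interpreter and package set.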
I prefer my system for running youtube-dl in an isolated environment without having to install all of its dependencies on the host. Ansible writes a bash script at ~/bin/youtube-dl containing:
#!/usr/bin/env bash
set -e
IMAGE_NAME="grepular/youtube-dl"
# Build the image if it does not exist
if [[ $(podman images --filter "reference=$IMAGE_NAME" -q) == "" ]]; then
podman build -t "$IMAGE_NAME" -<<EOF >&2
FROM python:3-slim
RUN python3 -m pip --no-cache-dir install youtube_dl
ENTRYPOINT ["youtube-dl"]
EOF
fi
podman run --net host -i --rm -v "$PWD:/app" -w /app "$IMAGE_NAME" "$@"
Nope. I'm using podman rather than docker so they're owned by the user that runs the script. The root user isn't involved. If I were using docker, I'd add `-u "$(id -u):$(id -g)"` to the docker args to deal with that issue.
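For comparison, a sketch of how the last line of the script might look under Docker, with the container process running as the calling user so that files written to the mounted directory are not owned by root. The function names are hypothetical; the image name and the other flags are taken from the script above:

```shell
#!/usr/bin/env bash
IMAGE_NAME="grepular/youtube-dl"

# Print the "uid:gid" pair that docker's -u (--user) flag expects.
current_user_spec() {
    echo "$(id -u):$(id -g)"
}

# Hypothetical Docker equivalent of the podman invocation.
run_as_current_user() {
    docker run -u "$(current_user_spec)" --net host -i --rm \
        -v "$PWD:/app" -w /app "$IMAGE_NAME" "$@"
}
```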
Venv is great for keeping dependencies separate but if your local machine doesn’t have the same version of Python as your server then you’ll need pyenv.
Not necessarily: you can use a CI pipeline to verify that your project can be built with many Python versions. This is the workflow I use here[0]. It makes catching and fixing breaking changes easier. Plus, I'm confident the core team is not likely to introduce a painful breaking change (think Python 2 -> 3) soon[1]
A few hours of annoyance per developer in the beginning vs. a few minutes every time you run tests or when you debug. Also add in the overhead of configuring a debugging system where you work in an IDE and debug code running in a Docker image. Doable (though many of my fellow devs have just given up) but it also takes time, probably more than it takes to run pip and get psycopg2 installed on Macs. Also, the majority of engineers probably don't even know Docker that well (definitely far worse than they think they do, just like with k8s). The number of times I've seen devs running webservers as root in Docker containers is amazing.
My team insists on using Docker and, against my better personal judgement, I let them, but I set up the code to run locally without it. If I were calling the shots, I would absolutely not use Docker.
Because I have, my colleagues have, in fact we probably spent more than an hour and just decided it's not worth it. I decided it's not worth it because I had a way to debug, while others decided they will just code without proper debugging tools because they only want to work through docker.
My (limited) experience debugging Typescript/Node, Typescript/Browser, PHP using IntelliJ and Docker has been pretty much like this:
1. Open a port on the container for debugging
2. Tell the debugger in the container what's the port it should use (if you're not using the default one)
3. Tell the IDE what port it can use to connect to the debugger (if you're not using the default).
4. Debug it!
For things like Typescript you might need some extra trickery because it's all transpiled, so you'll need to make sure your sourcemaps are set up correctly, but that's not overly difficult.
It did take me several hours to work everything out and write a README so that every time we get a new hire / someone sets up a new PC they can just follow the instructions. I'd say it was time well spent.
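The steps above can be sketched for a Node service roughly as follows. The image name and entry point are hypothetical, and the sketch assumes an image that has `node` on its path and no custom ENTRYPOINT:

```shell
#!/usr/bin/env bash
DEBUG_PORT="${DEBUG_PORT:-9229}"          # 9229 is Node's default inspector port
IMAGE="${IMAGE:-my-node-app}"             # hypothetical image name
APP_ENTRY="${APP_ENTRY:-dist/server.js}"  # hypothetical entry point

# Step 2: tell the Node debugger which address and port to listen on.
# It has to bind to 0.0.0.0 inside the container; a debugger listening only
# on the container's 127.0.0.1 is not reachable through a published port.
inspect_flag() {
    echo "--inspect=0.0.0.0:${DEBUG_PORT}"
}

# Step 1: publish the debug port, on the host's loopback interface only.
run_with_debugger() {
    docker run --rm -p "127.0.0.1:${DEBUG_PORT}:${DEBUG_PORT}" \
        "$IMAGE" node "$(inspect_flag)" "$APP_ENTRY"
}

# Step 3: point the IDE's "attach to Node.js" configuration at
# localhost:$DEBUG_PORT, then set breakpoints as usual (step 4).
```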
Same. The only issue I had with virtualenv was when I copied one to a different directory and it didn't work. It turns out you can't do that. Everything else has always worked fine, and I've been using it professionally for 10 years.
No it's not. It's literally just two commands to get started with poetry, and after that, one command to add a dependency and it's pretty easy to run simple projects locally.
Ah, see, this is the perfect example of why Docker helps, especially with Python: consistency.
The team would no longer have to make these deployment decisions and argue about which tool is a better fit. They'd make the decision once, hopefully follow best Docker practices (unprivileged user, multi-stage builds, etc.), and have documentation available for how to integrate the setup with IDEs, work with volumes, etc.
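As a rough illustration of the two practices named above (multi-stage build, unprivileged user), here is a sketch that writes out a Dockerfile. The `write_dockerfile` helper, the `appuser` account and the `myapp` module are all hypothetical, and the layout assumes dependencies are listed in a `requirements.txt`:

```shell
#!/usr/bin/env bash
# Write a hypothetical multi-stage Dockerfile to the given path.
write_dockerfile() {
    cat > "${1:-Dockerfile}" <<'EOF'
# Stage 1: build wheels here, so any build tooling added to this stage
# does not end up in the final image.
FROM python:3-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN python3 -m pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Stage 2: runtime image with only the prebuilt wheels and the app code.
FROM python:3-slim
COPY --from=builder /wheels /wheels
RUN python3 -m pip install --no-cache-dir --no-index --find-links=/wheels /wheels/*.whl \
    && useradd --create-home appuser
WORKDIR /home/appuser/app
COPY --chown=appuser:appuser . .
# Drop root before the application starts.
USER appuser
CMD ["python3", "-m", "myapp"]
EOF
}
```

Running `write_dockerfile` in a project directory followed by `docker build -t myapp .` would produce an image whose process does not run as root.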
Once this initial adoption hurdle is overcome, IME the productivity gains are greater than the issues of dealing with Docker. It becomes trivial to set up CI/CD, onboard new developers and integrate the app into other workflows.
Docker and containers in general have become mature enough to prove their use case and benefits, so the cargo cult argument doesn't hold weight for me.
> You can run everything without Docker to remove an extra level of indirection.
... and add questions like: are all paths correct for this machine, are binary lib dependencies the same as in production, are the OS versions compatible between dev and prod, is my local runtime version compiled with same options?
The great example where the indirection adds value is: every developer has the same environment and the CI is the same, regardless of personal preferences in systems.
If you have to answer such questions then yes, Docker is probably a good solution for them. I find that in my pretty vanilla web development, I am not faced with those questions. Most of my stuff is just Python, Django, PostgreSQL, Nginx and a few not very exciting Python dependencies. (This is in several independent projects, working as a sole developer, since 2014 and let's say ~100,000 users of my SaaS apps.)
In my experience you'll just run into them one day by accident. My last one was actually about differences in postgres drivers talking to sqlalchemy between Debian and Fedora. I agree it's likely a "pick your poison" situation, but these days I prefer defaulting to containers (and investing time upfront) for just about everything to prevent those problems rather than debug them once they happen.
I could tell you were a solo developer before getting to this comment.
A lot of the problems Docker solves show up as soon as you need to fit out a team with a mix of environments and split ops from dev
I still use it for individual projects because I never plan on maintaining everything forever - makes it easier to grow to a 1+1 team or hand the project to someone else or solicit contributions
Also builds your experience for when you do work in teams
What do you use for the frontend part? I'm using Vue or React + django-drf, but it always feels like it could be much simpler for most SaaS applications. Using Django forms feels too limited, on the other hand.
Docker uses the host kernel. To get everything identical you need to use a VM anyways.
Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod. For others (Python, Ruby) that use many system libraries, containers may add value
> To get everything identical you need to use a VM anyways.
Yes, which is where you can reach for firecracker for example. But docker gets you 95% there. If kernel makes a difference you can make the decision about the other 5%.
> Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod.
As long as they're static, vendored dependencies. Otherwise you're still likely to run into pulling something very common like openssl, zlib or libuv from the system.
Docker is useful for reasons besides scale. It allows you to (kind of) declaratively define your environment and spin it up in any number of different scenarios.
It allows you to ship all your dependencies with your app and not worry about getting the host machine properly configured beyond setting up Docker and/or Kubernetes.
Have other apps you want to host on the same set of machines? No problem.
When you pair that with something like Alpine Linux, you’re getting a whole lot almost for free.
For running applications in production, Alpine Linux is harmful and should be avoided. Getting a smaller container image is not worth trading glibc for musl.
As a Docker outsider, this implies to me that Alpine for testing and non-Alpine for production is not harmful. Is that right? Wasn't having a unified environment half the point of using Docker? Don't two different base systems just open you up to a load of headaches? If so, then isn't it more of an "Alpine Linux as a Docker base is harmful" situation?
Thank you for sharing that! It has never occurred to me but that makes sense.
For my Go/Haskell binaries, I usually do need to make changes in order to get them working. The smaller image size is essential in my use case though so I pay that penalty.
How many scenarios do you typically need to spin up your environment in?
I achieve the same (defining the environment) by pinning Python dependencies via requirements.txt and virtual environments. And I have an installation script, like a Dockerfile but just an executable bash script, that installs PostgreSQL etc, pulls the code from GitHub and starts up the server. I can upload this to a clean Debian installation to set up the server, with a well-defined stack, in minutes.
> Have other apps you want to host on the same set of machines
I don't. I just use one $5 per month Linode for each SaaS. (Or bigger Linodes as the projects get more users. My biggest single box currently serves ~100,000 users.)
> I don't. I just use one $5 per month Linode for each SaaS
Well then you're basically using VMs as your containerization mechanism, with install scripts replacing Dockerfiles.
So from your perspective, don't think of Docker as a really fat binary. Think of it as a stripped-down VM that doesn't cost a minimum of $5 per instance, and comes in a standardized format with a standardized ecosystem around it (package registries, monitoring dashboards, etc.)
It's a really fat binary that adds unnecessary layers in both dev and prod. In dev, it's faster to run tests and easier to debug without Docker. And in prod, why use a vm inside a vm?
> In dev, it's easier to debug without Docker. And in prod, why use a vm inside a vm?
On the contrary, it's easier to debug with Docker - it eliminates dangling system level libraries / old dependencies / cache, and everything is self-contained. If you have the problem in dev, you'll have it in prod as well.
Docker is not a VM; it's basically a big wrapper around chroot and cgroups, and the performance hit is minimal. The advantage is that, again, it's self-contained, so there's little risk that some OS-level or dangling library muddies the waters (that's a big risk in Python especially: OS-level Python library installations are a thing, and many Python libraries depend on system-level C libraries, which you can't manage through Python tooling). It's also idempotent (thus making rollbacks easier) and declarative.
I've never encountered this "dangling libraries" problem you describe in the past 7 years of developing web apps. I suspect you are working in different, maybe more specialised, environments than me. For me, it usually is really just pretty standard libraries and dependencies.
Containerisation software adds so much more than just a binary with layers of indirection.
Easy fine grain control of individual application memory, namespace isolation (container to container communication on a need to know basis), A-B testing, ease of on-boarding, multi-os development, CD pipelines, ease of container restarting etc.
This is absolutely the case. IMO the main benefit provided by containerization is reproducibility. Scaling and other positives are just a product of this.
If you know Docker, and use it daily - it’s a no brainer. For me it’s an easy choice - I dev on a Desktop PC but when I’m on the go I’m on a MacBook. No need to customize environment config with docker. What’s your disaster recovery plan if your server tips over? With docker it’s relatively easy. I’m not saying it’s for everyone, but in my experience it makes my life simpler.
How do you debug (in dev or prod) or, in production, do things such as inspecting log files, monitoring system resources, or running necessary commands if there is something urgent? In my experience, Docker just makes everything more tedious without tangible benefits. I personally lose nothing by not using Docker and am much faster in everything I do.
How much experience do you have with Docker? I am not being snarky -- all the things you say sound like something I would have said until I pulled myself together and learned how to use it properly. And granted, that did take a bit of effort but now my perspective is that Docker is extraordinarily easy to work with.
A Dockerfile is almost exactly like your requirements.txt file, only it works for everything. Need Imagemagick installed on your server? Just add it to the Dockerfile. Need to run a pre-start tool that was written in Ruby? A line or two in the Dockerfile can add that. And if it turns out that you don't want it after all, you just remove the lines and feel confident that no trace is left behind.
I don't have a lot of experience with Docker, but it's not what's holding me back. Here are some of my pain points:
1) Having to "docker exec" to get a shell in a running container vs just running the command (in dev) or sshing into the server (in prod).
2) Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twiddling my thumbs.
3) Having to even spend time learning how to debug Python code inside a locally running docker container.
4) Having to deal with disk space issues caused by old docker images on my 500gb hdd.
All while not deriving any value from it in my projects.
> 2) Tests taking 3:30 min to run with docker vs 30 secs without.
This is not a problem with docker, but your environment. Tests should run at the same speed and initial time with or without docker (with an exception for changes that update dependency list - that will take the time for the initial build).
Things to check: Are you installing dependencies before adding the app? For development are you mounting the app instead of building a new image each time?
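Both checks can be sketched as follows, assuming a Python project with a `requirements.txt`. The helper names and the `myapp-dev` image name are hypothetical, and `python3 manage.py test` assumes a Django project:

```shell
#!/usr/bin/env bash
IMAGE="${IMAGE:-myapp-dev}"   # hypothetical image name

# Write a Dockerfile that installs dependencies before adding the app.
write_dev_dockerfile() {
    cat > "${1:-Dockerfile}" <<'EOF'
FROM python:3-slim
WORKDIR /app
# Dependencies first: this layer is rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN python3 -m pip install --no-cache-dir -r requirements.txt
# Application code last: editing it leaves the cached layers above untouched.
COPY . .
EOF
}

# Mount the working tree over /app so tests see the current code
# without building a new image each time.
run_tests() {
    docker run --rm -v "$PWD:/app" -w /app "$IMAGE" python3 manage.py test "$@"
}
```

With this layout the image only needs rebuilding when the dependency list changes; a code edit followed by `run_tests` should start about as quickly as the container itself.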
Give Docker a go on a project and you'll see its utility:
1. If you're doing this, then you're likely using docker incorrectly. Your container should be run from images that are automatically deployed. If there's an issue, fix local and deploy image.
2. I agree, that sounds dreadful. We use pyenv and poetry for local development and docker for deployments. That would address your issue. We of course do not use pyenv or poetry inside the docker image. Hopefully that helps clarify potential real world use.
3. You should not be debugging a live deployment. Debugging a local container is straightforward in VSCode.
4. This shouldn't be an issue. Prune weekly, and aim for smallest images where possible, e.g., 60MB for python microservice.
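A sketch of what weekly pruning (point 4) could look like. The one-week cutoff and the cron schedule are assumptions chosen for illustration:

```shell
#!/usr/bin/env bash
# Remove images that no container uses and that were created more than a
# week ago. --all includes tagged images, not just dangling ones;
# --force skips the confirmation prompt.
prune_old_images() {
    docker image prune --all --force --filter "until=168h"
}

# Hypothetical crontab entry running the same command every Sunday at 03:00.
CRON_LINE='0 3 * * 0 docker image prune --all --force --filter "until=168h"'

# Print the entry so it can be pasted into `crontab -e`.
print_cron_line() {
    printf '%s\n' "$CRON_LINE"
}
```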
The value is in the simplicity of deployment. Need to host a well-known piece of software? It can be up or down in no time.
As I wrote at the very top of this thread, I have used Docker in a project that was forced on me, and I still use it in other client projects where I don't have a choice. I don't need to "give it a go" to learn what it's like.
1. First, the infra to automatically deploy again introduces complexity. And waiting for the Docker push to complete and then until the new container is started takes away from my time.
2. Yes, it is a pain.
3. I do what's necessary to fix problems. Sometimes that means looking at production. I use PyCharm, not VS Code, and because I don't find value in Docker for my projects I have no incentive to look into how to set up local debugging.
4. "Prune weekly" you say. But it's just another complication I have to deal with when using Docker. What for?
> The value is the simplicity of deployment
I would argue my deployments are simpler than yours. Give me git and ssh and I'm good to go. No need for Docker, pushing to some registry, looking at a dashboard or waiting for the image to be deployed. And my setup is much easier to debug.
But it sounds to me like you are coming from a more enterprise-y environment. There, the things you say probably make sense. In my case, I'm a sole developer. Any unnecessary process or tool slows me down and incurs a risk of bugs due to added complexity.
> I would argue my deployments are simpler than yours. Give me git and ssh and I'm good to go.
FYI: the wait time is marginal at best to deploy images and containers. My workflow is the same as yours most of the time for solo projects: SSH into server, git pull (from a "stack" branch that has docker-compose files containing DB and microservices), docker-compose up -d, and because of cached images it takes minimal time to deploy.
--
I agree that as a sole developer it can add initial complexity. As a sole developer myself on prior projects and now on personal projects, I use Docker primarily to streamline my deployment practices.
Another excellent use case of Docker not mentioned elsewhere in this thread is the simplicity of running databases locally, e.g., mongodb, postgres, etc.
A difference like that in test run time implies a misconfiguration or a very different setup of the test pipeline. You shouldn't blame it on Docker before investigating the issue.
> Having to "docker exec" to get a shell in a running container vs just running the command (in dev) or sshing into the server (in prod).
If this bothers you, take a look at a third-party Docker management UI such as Portainer (vanilla Docker) or k9s (Kubernetes). These tools let you navigate to a container and launch a shell in it quickly. Very useful if you have tons of apps running on your node.
> Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twisting my thumbs.
In my case, I don't use docker in development phase. I test in local environment and only build the images when it's ready for deployment.
> Having to even spend time learning how to debug Python code inside a locally running docker container.
I never have to do this anymore. I just hook up Sentry or New Relic and they log exception stack traces that I can use to figure out the issue without live-debugging the app.
> Having to deal with disk space issues caused by old docker images on my 500gb hdd.
Yes, disk usage is one of the drawbacks of using Docker. Pruning images especially sucks on busy servers with spinning rust. On SSDs, pruning is not as slow, though.
> or in production, do things such as inspecting log files
Logs should be sent out of the container, either directly (mount a /logs folder, push them to Sentry / ELK / etc.) or by simply logging to stdout and having Docker send the logs where you want.
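For a Python app, the "just log to stdout" option can be sketched roughly like this. It uses only the standard library; the function name `configure_stdout_logging` and the format string are my own assumptions for illustration, not something taken from the article.

```python
import logging
import sys


def configure_stdout_logging(level: int = logging.INFO) -> logging.Logger:
    """Send all log records to stdout so the container runtime can collect them."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root = logging.getLogger()
    root.handlers.clear()  # drop any handlers (e.g. file handlers) set up earlier
    root.addHandler(handler)
    root.setLevel(level)
    return root


if __name__ == "__main__":
    configure_stdout_logging()
    logging.getLogger("app").info("service started")
```

With something like this in place, `docker logs` (or whatever log driver is configured) picks up the output and nothing is written inside the container.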
> monitoring system resources
Docker processes are still processes. They show up in `top` and friends just fine.
> running necessary commands if there is something urgent...?
`docker exec my_container <insert command here>`.
Although I've never even considered doing that. For the past few years the resolution to a production bug has always been "rollback to the previous image, then if any data got borked fix it in the database".
Everything you write is possible, but - and again in every single step of your daily dev work - more tedious than doing things directly. It's just not worth it for me.
But it also sounds like you are operating within a larger organization. So your requirements may be different from mine.
Because it codifies things and ensures people don't make a giant mess of intertwined crap that takes days/months to separate due to random global dependencies such as a cron/script/mount/etc.
It's not tedious at all and it saves potentially blowing up your system with crap splattered all over your filesystem. We have tons of WSL2 devs and Linux people and it takes care of the "it works on my machine" problem once and for all.
This insistence that the old way was good enough, and that nobody should have to bother learning five new invocations for a new tool, just doesn't line up with the needs of today.
Docker pull, docker exec, docker ps, docker logs and you've pretty much got what you need for ninety percent of your job.
This stuff is not hard. You make it hard for yourself by digging in.
"New is good" is also no general justification. It always depends on the context. Several replies in this thread mention very good use cases where Docker makes sense. From what you wrote, it sounds to me like it also makes sense in your environment.
Yes I mean I don't want to have to learn things that don't bring me value. But it's not about learning the commands. It's about having to repeat them over and over again in my daily work. About the associated mental burden "am I in the container now? Is it running?" And about everything taking longer, be it running tests in Docker or pushing to a registry and waiting for the new container to be spawned. As a single dev, and I feel this is where your and my requirements differ, it simply is not worth it.
I echo this sentiment entirely. Multi-dev-environment, multi-machine, multi-OS setups really just work. Docker et al. also really shine with onboarding: the new recruit can literally get up and running in minutes.
I would not want to go back to the old ways of doing things.
Docker on Mac and Windows is so slow when used for a development environment but there are enough differences that it makes it worth using. It’s more portable than VMs (vagrant and such) and if you’re using your CI pipeline to build an image then deployment is a snap.
I much prefer building a Docker image and pushing it somewhere compared to tarballs of the repo or repo access. I would rather build rpms or debs than just push the repo around. Container tooling makes that kind of stuff nice in my opinion.
You agree that Docker is "so slow" and point to its benefits for preventing differences between dev and prod. But you don't mention what those differences are. What are they?
I also don't see why you'd use a CI pipeline for simple projects where you are typically the only developer. Run tests locally, if everything is fine push to prod.
I use git with a small script to roll out to production. When I execute the script on the server, it stops all services, pulls the latest code from Git, applies Django migrations, and starts everything up again. I use desk [1] to make this as simple as a single `release` command in my local shell.
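A release script along those lines could look something like the sketch below. This is not the commenter's actual script: the service names, the use of systemd and the `--ff-only` pull are all assumptions made for illustration.

```python
import subprocess
from typing import List, Sequence

# Hypothetical service names; replace with whatever units run your app.
SERVICES = ["gunicorn", "celery"]


def release_steps(services: Sequence[str] = SERVICES) -> List[List[str]]:
    """Return the commands in the order described: stop, pull, migrate, start."""
    steps = [["sudo", "systemctl", "stop", name] for name in services]
    steps.append(["git", "pull", "--ff-only"])
    steps.append(["python", "manage.py", "migrate", "--noinput"])
    steps += [["sudo", "systemctl", "start", name] for name in services]
    return steps


def release(dry_run: bool = False) -> None:
    """Run every step in order; with dry_run=True only print the commands."""
    for cmd in release_steps():
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # abort on the first failing step


if __name__ == "__main__":
    release(dry_run=True)
```

Because `check=True` stops at the first failure, a failed migration leaves the services stopped instead of starting old code against a half-migrated database. Whether that is the behaviour you want is a judgment call.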
For multi tenant deployments. Each tenant gets its own docker-compose in a directory on prod and voila: 100% same code base (same Docker image) and good separation between tenants
Multi-tenant deployments with Docker make sense _only_ if you trust all of the tenants, since it is trivial to take control of the host if you have write access to Docker socket.
You may say that it can be mitigated with some wrapper scripts with limited commands, but then you have to maintain them and we can all agree that homebrew security is very hard to do correctly.
Using Docker will provide you a complete runtime env which will in most cases 'just work' on a variety of hosts. It's easy to reproduce and easy to distribute. You can easily set up other dependencies and connect everything together using docker compose, creating a portable 'production env'. And that's important when there's > 1 person working on a project.
And it won't break the host machine because someone ran sudo pip install -r requirements.txt by mistake.
Yep. I agree. Docker has its place, but nothing wrong with keeping everything one layer short. It's how I run my side-projects. Just a small Bash setup.sh script and you are good to go. Docker solves the annoyance of system dependencies, but you still have to deal with it in your Docker build.
As I always say, know what you are doing and know how to debug everything. If somebody is more comfortable with Docker, I think there's no problem there either :). Sometimes it solves a real pain, sometimes it doesn't (having an Elixir release or a Java jar means I don't really need containers).
If you are on Docker, I recommend looking into Podman (maybe even together with runc -> today trending on HN). Don't run your containers under root and run your app inside the container under a specific user as well.
Shameless plug: I am now writing a book on application deployment and will show both approaches, Docker and Docker-less[0].
For me it’s actually more about isolation and security. I touch and build a lot of random code and docker gives me some good assurances that a transient dependency of a dependency somewhere in the pits of NPM install hell won’t do things like install malware on my computer.
Sandboxing/isolation is a very good mitigation against software supply chain worries IMO
With docker, I can move or upgrade the host server without worrying about compatibility of my apps. Before using docker, every time I migrate to a new version of os or database (which I do every 3-4 years per host), I would often run into small problems in some old apps that must be fixed, which took time because I had to dig into the app again and re-learn the context before I can patch the small issue. After moving everything to docker, it's no longer a problem. As long as the host can run docker (be it docker compose or kubernetes) my apps would run on it.
You still need to do those updates inside your container. I deal with a number of clients who like Docker because it gives them “a stable environment” but they don't plan image upgrades.
I work on products based on the official Python Docker image, but the base image containing the version of Python they want was released in April and hasn't received updates since. Developers forget that releasing software as Docker images means that they are now responsible for patch management.
Yes, but now I can do the update in a separate schedule instead of being forced to do it right there and then when performing host upgrade or migrating the app to a different host.
It's not about docker per-se but repeatable, immutable infrastructure, a great boon to productivity and portability to local or cloud environment. "Cattle vs. pets"
Sure, you can do similar with ansible or puppet, but use something.
SSHing into boxes to fix things wasn't good enough in the 90's, one reason I looked into building .debs back before the turn of the century. cough
Well, maybe you are careful and meticulous and are the only person to deploy. But when you are working in a team where anybody can deploy, maybe not everybody is as meticulous as you. Using Docker makes it easy for anyone to roll back anything (yeah, yeah, you know, the kind of issues that should not happen but still do...), and that's a game changer.
I had great use of Docker even in my personal RPI projects, as Python modules can be difficult to get compiled and in a usable state on raw Raspbian. In projects at work I personally only recommend Docker when we need Hyper-V isolation or K8s is hosting the product (these are .NET projects, which work really well when published as self-contained binaries).
If you're not using docker then you're building in production instead of in CI. Also, that means you're maintaining a deployment instead of building a fresh one every time you deploy. This means leftover files will pile up and that is certainly not tested upstream at any time: tests run on fresh ephemeral deployments in general.
Thank you! I wish more projects had things like Ansible modules or similar. You can run it for your local dev on Vagrant images and deploy it on the server directly. Fewer layers. Fewer differences between environments. No messing with Docker bypassing ufw and other system configs.
Docker makes deployments easy. In almost all situations I've encountered, it's easier to 'git pull foo && cd foo && docker-compose up' than it is to install and configure a mishmash of individual services to work in tandem.
A Dockerfile may be complex for any piece of the stack. But now this complexity is not part of deploying the application. Each component has its own build process which is decoupled from the others. And even a really complex installation process results in a deployment process that works the same way.
If you can't see these benefits then maybe you haven't had to deal with complex software installation. Or maybe you haven't had to run the same stack on your laptop that you run in production. But if you ever get to the point that you want to deploy something in more than one place, or more than once to a single place, it really makes things much easier.
Docker itself isn't going to increase your hosting costs. If you rent a server for $20 a month, it's still $20 a month with or without Docker being installed.
Disagree. Not using docker I have to learn about the specifics of how the hosting company runs my software. With docker it's 30 lines I can copy from an earlier project and I have full control.
Docker compose and docker machine make deployments so simple. For development I don’t want all my side projects to share the same database, similar to them using the same python install.
I can run disparate environments across multiple language runtimes without having to fight version issues or wrangle multiple VMs. That’s why, for me at least.
"Just use the same Python version and a roughly similar PostgreSQL version in development and production, and you're good to go" -> And that's more time than it would take to re-use that Dockerfile and docker-compose that i literally wrote once and go to once in a while.
I get your point but don't think you're getting mine. In a bigger project / organization? Yes, let's have those processes and tools. But for simple apps as described in the article? Use a correspondingly simple solution. As always in software development, it's all about context.
I get your point, and I disagree on opinion. I and many others have had success using containerized devenvs on projects both large and small, and have likewise felt some pain with respect to repeatability when not - especially with the Python stack. Containers are synonymous with repeatability. Your future self is just another collaborator, and they’ll appreciate it down the line when they’ve got a new laptop and new env, for example ;)
I agree. With you and with the other opinions (why should only one perspective be right?).
I've been the lead developer on teams where I introduced Docker to solve consistency/reproducibility issues in AWS and Azure.
I've also done smaller applications in DotNet Core, Go, Node, Python, and Ruby. In those cases I've used other alternatives, including:
- Known Linux version, with git push-to-deploy (my favourite)
- Packer (the server), with embedded codebase (still quite simple)
- Docker (for most non-trivial deployments)
- Known Linux version, with chef or ansible (as an alternative to Docker)
- Terraform the machine, upload the codebase, run scripts on the server (ugh)
Every method had its place, time, and reason. If possible, for simplicity, I'd go with the first option every time and then the others in that order.
The thing is, though, I may have an order of preference but that is totally overridden by the requirements of the project and whether or not the codebase is ever to be shared.
For solo projects and small sites, I've not benefited from Docker as I have never had any server/OS issues (and I've been doing dev stuff for decades).
However the moment there was a need for collaborators or for pulling in extra dependencies (and here is the crunch point for me) such as headless browsers or other such 'larger' packages, then I would move on to either Packer/Terraform for fairly slow-changing deployment targets or Docker for fast-changing targets, as otherwise I inevitably started to find subtle issues creeping in over time.
In other words keep it simple while (a) you can and (b) you don't need to share code, but complexity inevitably changes things.
In my personal experience, there are a few reasons for using Docker:
1) By installing packages on the system directly, you can't be sure that a future update won't break it. For example, when using Tomcat or a similar application server, new updates sometimes deprecate or disable functionality in the older versions, or add new configuration parameters that the configuration from the older versions won't have, thus leading to weird behaviour. This will be especially noticeable if you set up development, test or production environments some time apart, or in different locations, where the available packages may not be 100% consistent (I've had situations where package mirrors are out of date for a while).
2) Furthermore, if you maintain the software long term, it's likely that your environments will have extremely long lists of configuration changes, managing which will be pretty difficult. This results in the risk of either losing some of this configuration in new environments, or even losing some of the knowledge over time, if your approach to change management isn't entirely automated (e.g. Ansible with the config in a Git repo and read-only access to the servers), or you don't explain why each and every change is done.
3) Also, it's likely that if you install system packages (e.g. Tomcat from standard repositories instead of unzipping it after downloading it manually), it'll be pretty difficult for you to tell where the system software ends and where the stuff needed by your app begins. This will probably make migrating to newer releases of the OS harder, as well as will complicate making backups or even moving environments over to other servers.
4) If your application needs to scale horizontally, that means that you'll need multiple parallel instances of it, which will once again necessitate all of the configuration that your application needs to be present and equal for all of them. You can of course do this with Ansible, but if you don't invest the time necessary for it, then it's likely that inconsistencies will crop up. In the case of Knight Capital, this caused them to lose more than 400 million dollars in less than an hour: https://dealbook.nytimes.com/2012/08/02/knight-capital-says-...
Edit: It was stated in this case that "scale is not an issue" and therefore this point could be ignored. But usually scale isn't an issue, until it suddenly is.
5) Also, you'll probably find that it'll be somewhat difficult to have multiple similar applications deployed on the server at the same time, should they have any inconsistencies amidst them, such as needing a specific port, or needing a specific runtime on the system. For example, if you've tested application FOO against Python 3.9.1, then you'll probably need to run it against it in production, whereas if you have application BAR that's only tested with Python 3.1.5 and hasn't been updated due to a variety of complex socioeconomical factors, then you'll probably need to run it against said older runtime.
6) Then there's the question of installing dependencies which are not in the OSes package repositories, but rather are available only in the npm (for Node.js), pip (for Python) or elsewhere, where you're also dealing with different mechanisms for ensuring consistency and making sure that you're running with exactly the version that you have tested the application on can be a bit of a pain. For an example of this going really wrong, see the left-pad incident: https://www.theregister.com/2016/03/23/npm_left_pad_chaos/
Essentially, it's definitely possible to live without containers and still have your environments be mostly consistent (for example, by using Ansible), but in my experience it's just generally harder than to tell an orchestrator (preferably Docker Swarm/Hashicorp Nomad, because Kubernetes is a can of worms for simple deployments) that you'd like to run application FOO on servers A, B and C with a specific piece of configuration, resource limits, storage options and some exposed ports.
I feel like this is largely an issue of tooling and our development approaches. NixOS attempts to solve this, but I can't comment on how successful it is: https://nixos.org/
For sure as far as MVC frameworks go. Those are two of the best. I enjoyed Django when I worked on it. It was enjoyable to work in and had a lot of power. It was clearly a great framework. Also tons of big sites use it. And if a boss somewhere said they wanted the team to use it at some point in the future I would be totally happy about that decision. You cannot go wrong using Django in the least as far as I'm concerned.
You’ve just dismissed the entire article and its details by adding nothing useful to discussion. By your logic, let’s all dismiss HN - the best knowledge is what you already have.
I think you misunderstood the intent. Constantly optimizing your toolset instead of building a startup with what you already know is a very common form of bike shedding and is one of the hardest things for some developer-founders to get past.
The goal of a startup is to build things quickly, so actually being able to build things quickly is important. If you know a set of tools well you can work quicker than using a perhaps better set of tools that you do not know well. That is an important point to make to people like me who like playing with new toys
I build things quite fast with flask and bootstrap, perhaps now with fastapi. If I have spare time I'd rather spend the time validating ideas or tech in new domains rather than learning more new tech that's just similar to this.
I don't see it that way. After multiple comments in one direction it was healthy to have someone say, "Wait a minute. Let's not forget why we're here and what we're here to do."
IntercoolerJS / htmx looks like a really cool solution for server-side rendered HTML with client-side interactivity. All the hype is currently with https://hotwire.dev/ but htmx has more features, seems more solid and even supports older browsers like IE11.
The libraries cover a similar ground, but there are some fundamental differences. With Turbo, by default all your links/forms will use XHR to fetch new pages which do a full body swap (unless you want to use frames/streams or opt out entirely), whereas with htmx you have to be specific which links/forms and targets you want - there is an hx-boost attribute that does something similar but it lacks a corresponding data-turbo-permanent attribute (at least in the current production version) when you want to exempt certain elements. Both are valid approaches but it depends on what suits your requirements better.
In addition to the founder, there are at least 3 active contributors with significant contributions to HTMX. It may have started as a one man show, however, the knowledge is now spread over quite a few people who are exceptionally helpful.
I feel like an idiot reading these kinds of posts. If this is a “simple” stack, then I’ll be jiggered. Why do you need a “control plane” like ZeroTier? What problem does it solve? Certainly not a problem I've ever had. And I have a real SaaS business which people pay real money for. I've been developing enterprise SaaS for 10 years.
My go to stack for small projects where scale is not an issue is Laravel, Laravel Forge for deployment, Vue or jQuery for interactivity, SQLite for database, Redis for cache/queue and...that’s it. No Docker (because Laravel has a super simple dev environment setup with Valet), not a single YAML configuration file to be found anywhere, and this kind of setup on a single $20 DigitalOcean server can literally serve 100k users without a hitch. How many apps have more than 100k users?
The explosion in stack choices means no one can know them all. Also, even simple apps can have one or two areas where things get a bit complicated. "Simple" is relative and inadvertently implies things the author is familiar with.
I'd guess sqlite (for example) is fine for very simple apps with one user at a time, but that bar is pretty low. Yaml itself is not an issue, and intercooler/htmx is simpler than your JS picks. shrug
Great writeup! I think it's useful to talk more about complete stacks instead of focusing on individual parts without mentioning the "glue" between them so I really like these kinds of posts. Just two questions:
1. If you're not going for scalability why Postgres and not SQLite?
2. What does your monitoring look like? I've read something about Grafana and Hetzner dashboards in the comments but what exactly do you use and where do you run that? Also, do you have anything for intrusion detection specifically? (I'd be extremely paranoid about that.)
Wow, lots of good questions coming out of this :-)
1. I chose Postgres because I have the need to remotely manage as well as pull metrics into grafana and metabase. That is simply not possible with SQLite.
I realize this post should be a series of posts where I go deep on all parts of the SimpleCTO stack, so to speak.
2. My monitoring is quite limited. Generally I use netdata to get a quick overview on health, and I export that into grafana. I rely on Hetzner for actual healthcheck and uptime monitoring. Sentry tells me if my code starts throwing too many exceptions or errors.
The main reason for me is that Postgres can be backed up while online, and I can scale up to multiple servers if I suddenly need to. Not so much with SQLite where if a small project suddenly hit the front page of whatever I would suddenly need to get really creative.
I'm not sure how well the architecture described in the article lends itself to horizontal scaling since everything assumes it's on the same machine (I think?). Granted, if you build for horizontal scaling I agree, a client/server database is the way to go.
As for backups, SQLite also supports backups while another process is using the database.
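From Python this is exposed as `sqlite3.Connection.backup` (available since Python 3.7). Below is a minimal sketch; the file names, the `notes` table and the helper names `backup_sqlite` and `demo` are made up for the example.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(source_path: str, dest_path: str) -> None:
    """Copy a live SQLite database to dest_path using the online backup API."""
    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies the database page by page; other connections may keep
        # using the source database while this runs.
        source.backup(dest)
    finally:
        dest.close()
        source.close()


def demo() -> list:
    """Back up a database while a second connection to it is still open."""
    workdir = tempfile.mkdtemp()
    live = os.path.join(workdir, "app.db")
    copy = os.path.join(workdir, "backup.db")

    app_conn = sqlite3.connect(live)  # stands in for the running app
    app_conn.execute("CREATE TABLE notes (body TEXT)")
    app_conn.execute("INSERT INTO notes VALUES ('hello')")
    app_conn.commit()

    backup_sqlite(live, copy)  # app_conn is still open at this point

    check = sqlite3.connect(copy)
    rows = check.execute("SELECT body FROM notes").fetchall()
    check.close()
    app_conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

This only shows that a backup can be taken while another connection is open. It says nothing about how long a backup takes on a large, busy database.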
This brings to mind a question: would it be possible to simply run multiple instances of SQLite on independent hosts that are backed by single shared filesystem, such as AWS's EBS? Of course they'd be blocked on writes but it does seem that SQLite3 supports the read concurrency.
Thanks for the kind words. I should update the part about the queues. I started using this pattern for SKIP LOCKED and it works well for multiple workers (search engine crawlers).
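For readers who haven't seen it, a SKIP LOCKED job queue can be sketched as below. This is not the article's code: it assumes PostgreSQL 9.5+ and a hypothetical `jobs` table with `id`, `payload` and `status` columns, and `conn` is any DB-API connection (for example from psycopg2).

```python
from typing import Any, Optional, Tuple

# Table and column names (jobs, id, payload, status) are hypothetical.
CLAIM_JOB_SQL = """
UPDATE jobs
SET status = 'running'
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'queued'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""


def claim_next_job(conn: Any) -> Optional[Tuple[Any, ...]]:
    """Claim one queued job and return (id, payload), or None if none is free.

    Rows that another worker has locked are skipped instead of waited on,
    so several workers can poll the same table without blocking each other.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_JOB_SQL)
        row = cur.fetchone()
        conn.commit()  # the job is now marked 'running' for everyone else
        return row
    finally:
        cur.close()
```

This variant commits right after claiming, so a worker that crashes leaves its job stuck in 'running'. The alternative is to keep the transaction open while the job is processed, which releases the row automatically on a crash at the cost of long-lived transactions.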
I instantly thought of something similar when I looked at your code.
I like the simplicity!
I used to combine RabbitMQ and Celery for async tasks. Mostly because it's what I learned to do since it was already in use at my first job. But Celery is such a pain -- or, at least, it was at that job. So many configuration options, different places where they were stored in different versions[1]. Weird errors. Poor documentation (at least at the time?)... I just started going for something simpler: rq and rq-scheduler with redis. For most of my use cases it's more than enough.
I've got to say that your approach has gotten me thinking of maybe simplifying everything even more. We'll see where I end up. In about 4 weeks I'll have to introduce asynchronous tasks in our current project, and though I was thinking of going the rq-way, your article has given me food for thought.
Other than that, your backend stack is mostly like what we use for our projects. We also use plain old docker + docker-compose, with the small difference that we have a somewhat hacked-together system I built with bash several years ago to extend docker-compose's functionality a bit and make every component somewhat more reusable between projects and easier to fine-tune on a "per-environment" (development, staging, production, etc) basis. We also use nginx, but your article has convinced me to look into alternatives.
Once again, thank you for your articles, they're a joy to read and think about!
[1] To be fair, that job had several aging Django codebases and I know most of them are still stuck with Python 2 and outdated Celery and django-channels versions. I constantly kept pushing for us to get rid of technical debt, but we never got to it...it's part of the reason why me and a mate left it for our own endeavors together.
Interesting that you talk about gluing together some things around docker-compose. I've done something similar but on the command-line. Specifically, I use Makefile as a task-runner [1]. It is in there that I keep a lot of my custom settings for deployments, hosts, and one-off tasks.
I’ve never been able to use rq with redis without problems. For one: restarting the app requires wiping the related redis data because rq will refuse to launch because there’s still entries for the workers, which it apparently doesn’t clean up correctly.
Have you looked at django-fsm for managing your state transitions? Overkill for small projects - invaluable for larger ones where state changes can be literally anywhere.
Not really. It just enforces that specific properties (status field) can only be manipulated with methods decorated by an FSM transition. Allows enforcement of what actions are taken during a state transition so that some random view doesn’t change to “in progress” without the necessary actions occurring.
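To make the idea concrete without pulling in Django, here is a plain-Python sketch of the same principle. This is not django-fsm's API or implementation; the `Order` class, the state names and the decorator are hypothetical stand-ins.

```python
from functools import wraps


class TransitionNotAllowed(Exception):
    """Raised when a method is called from a state it does not accept."""


def transition(source: str, target: str):
    """Allow the decorated method only in `source`; move to `target` afterwards."""

    def decorator(method):
        @wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.status != source:
                raise TransitionNotAllowed(
                    f"{method.__name__}: {self.status!r} -> {target!r}"
                )
            result = method(self, *args, **kwargs)  # side effects run first
            self._status = target  # only updated if the method succeeded
            return result

        return wrapper

    return decorator


class Order:
    def __init__(self) -> None:
        self._status = "new"
        self.notified = False

    @property
    def status(self) -> str:
        # Read-only: there is no setter, so `order.status = ...` raises
        # AttributeError and callers must go through a transition method.
        return self._status

    @transition(source="new", target="in progress")
    def start(self) -> None:
        self.notified = True  # the "necessary action" tied to this transition
```

A view that tries `order.status = "in progress"` fails immediately, and calling `order.start()` twice raises `TransitionNotAllowed`, which is the kind of enforcement described above.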
I’ve built an Open Source Django starter template which uses a similar stack. It implements the full payment flow, basically ready to be deployed: https://getlaunchr.com/
It’s fairly new, all feedback greatly appreciated.
Hey, just came across this, it is nice and easy to use. My comparison was with Laravel spark, which seems like a similar way to get all the obvious things running, but I'd prefer to use python. Have you found much traction with this? It is very neat and seems easy to use.
The thing I like from spark for B2B products is the Team type roles & team billing
He says "docker, just plain docker" but in reality uses docker-compose - in my view a significant addition.
However it aligns well with my own sentiments - for anything below large scale complexity or scalability needs, docker-compose hits a beautiful simplicity vs power tradeoff.
You can also just ignore that and get Docker from your package manager. At least on Debian it's in the official APT repositories and Docker.com hosts one as well.
The version in the official OS-provided repository is often very old, even obsolete. But yeah, just use the Docker provided repository instead, that gets you updates afterwards.
Well, I guess it's subjective - but if you didn't have docker-compose I would have a whole lot more questions about how you are managing all the running services, storing configuration, defining networks, monitoring health of containers, defining volumes, port mappings etc etc. It may be "just a python package" but so is Django and docker is "just a go package" :-)
Can anyone answer why python is a choice anymore instead of e.g. golang for saas applications like this? I previously worked heavily with ruby and I cannot imagine building beyond >1000 LoC in interpreted, dynamically typed languages. Aside from developer familiarity which trumps all other points.
It's also significantly easier to dockerize a generated binary from golang or rust, etc. Some of my worst docker headaches have been from trying to debug something like mayan-edbms, which uses python and celery workers.
I just don't get it unless you specifically need some libraries that no other language has.
> I cannot imagine building beyond >1000 LoC in interpreted, dynamically typed languages
Many have imagined and built valuable stuff in interpreted, dynamically typed languages before you, by being more focused on overall structure and making sure it's strict and resilient. One really doesn't have to search far for successful applications that are certainly way beyond 1000 LoC and still iterate pretty quickly for their size.
> significantly easier to dockerize a generated binary from golang or rust
Not sure the verboseness of either golang or rust would be worth it if you compare it with python. If you're just launching, your focus should not be on performance or how easy it is to dockerize but to figure out who your user really is. Scaling and deployment issues will happen much further down the road, and iteration speed is more important in the beginning as you have many changes. Your architecture needs to reflect this too, and dynamically typed languages (arguably) makes it easier to change things, as long as you know what you're doing.
But together with that, you are eventually gonna have to start leveraging more languages, as you have programmers working on different levels of the overall architecture. Some fit for some tasks, and when it comes to quickly launching and iterating on SaaS businesses, dynamically typed languages are a pretty good fit.
In the end, I don't really think dynamic/static typing is the most important consideration, that's such a small part of what you have to think about overall.
That is a double edged sword. Sure, a dynamically typed language like python or ruby might be quick to develop in, but the resulting code often contains a lot of surprising bugs.
Compilers catch a lot of bugs before your software is in production.
I don’t know. I keep seeing this argument. I’ve been developing on the Django stack for the past 10+ years. I have fixed many bugs of mine and of others. It has been extremely rare, if at all, that they were bugs caused by an incorrect type. The closest I can think of is a typo in a method or class name, but is it really a bug if you catch it instantly? These are just not the types of bugs that end up creeping into the code.
In the meantime Django provides productivity like you can’t get many other places. It’s just a pleasure to work with and aside from a couple of setting-related places its code is a pleasure to read.
If not, this is the first time I've heard someone say it's a pleasure to read/write.
I've done lots of Python over the years (Zope3, Django, flask, aiohttp, custom stuff...), and if there is one thing I wouldn't like to come back to, it's Django.
Sure, it's a do-everything toolset, but the syntax is an abomination (imho, at least, though I hope that doesn't need to be stressed every time).
To contrast ORM syntax, look at old SQLObject, Canonical's Storm or both SQLAlchemy's declarative and core implementations.
It has not but it has expanded and been polished up. I used to be pretty anti-ORM because I felt like they were leaky incomplete abstractions. Having used SQLAlchemy for a major project in the past (used it on top of Django instead of the built in one) and then coming back to Django’s, I can say that Django’s is to me easier and nicer to use. The performance is now on par with the SQLA declarative implementation. I never really loved the core implementation for being as verbose as it was. I would rather write SQL directly at that point.
About the only place where I dislike Django’s ORM is stuff around aggregation and annotation. The fact that the order in which you declare your annotations matters is annoying af and while I understand the need for it, I do wish there was a better way. Having said that, I have simply started structuring my models in a way that doesn’t require complex aggregation and that made my life a lot easier. If I have an instance of a Book model I know what fields to expect it to have and don’t worry about whether this particular time it has some specific annotation attached to it or not. That has made the rest of the code a lot cleaner.
Basically, I think to each their own, but I don’t see the Django ORM as a negative. Give it a try and see how it works for you especially if you can avoid going against the grain with it.
I used Python on the backend for years and have taken to Go, the verbosity feels like a minor complaint to me, productivity in either language feels about equal to me but the end product is just better in Go. Better performance, better packaging and deployment, etc. I still use Python a lot for other things.
> Many have imagined and built valuable stuff in interpreted, dynamically typed languages before you, by being more focused on overall structure and making sure it's strict and resilient. One really doesn't have to search far for successful applications that are certainly way beyond 1000 LoC and still iterate pretty quickly for their size.
I once tried to get into a large Python project, and even the IDE (PyCharm) had trouble "guessing" the function parameter types. It's absolutely not scalable. I don't want to be reading a function and guessing "what the hell is the type of this?".
This would be nice if everyone used that. When I looked at the source code for mayan-edms, they weren't using that.
Apache airflow? Doesn't use it. Jupyterhub? Doesn't use it.
I was happy to see projects like Zulip using this, but if it's optional then I can't rely on people actually using it. It's the same thing with ruby/sorbet.
That’s shifting the goalposts a bit. You can use optional typing in your project as it scales, and there are plenty of projects that have been stable for what they do through other testing. Pick dependencies carefully and use typing as needed.
I'd say typing is even more important when you are using other people's code with your code.
It's also a huge productivity boost when the relevant methods and types are available to you; the editor almost writes the code itself!
The projects that don't use it probably need to maintain compatibility with Python 2, for whatever reason. I expect that more and more codebases will be at least somewhat typed as Python 2 is deprecated everywhere.
Weird of you to assume that I don't work in Python. Do you just not interact with the scientific Python community? Your anecdata that is blind to large swathes of the community is no more valid than my firsthand experiences. This comes off as defensive, not informed.
It's an objective fact that Python 3 was not adopted quickly (given various EOL deadline extensions) and that plenty of libraries are stuck on 2, so your point that types have been around for 10 years is not really an argument that I can actually expect to be able to use types when working within the ecosystem.
I work in web development, Django moved to 3 exclusively versions ago, and I drop support for Python 2 pretty easily in the libraries I maintain. numpy requires at least 3.7, same for pandas, and sklearn is at least 3.6. When you say "half stuck in Python 2", do you mean you have a lot of legacy code you don't want to update? Because I don't think that's Python's fault.
Ah, so the answer to "do you interact with scientists" is no, even though scientists are responsible for a huge amount of the python code that exists. Got it, maybe consider that your experience isn't universal.
I don't know why you think dismissing legacy code is valid! More code is old than new in the world, and the idea that you're only going to run into codebases with the most up to date versions of software seems pretty naive to me.
Serverless runtimes have historically been behind on Python versions, Airflow had a hell of a time with dependency issues and Python 2, etc etc.
I also don't really care whose fault it is - the question isn't "is Python morally wrong", the question is "can I rely on the technology that you offer as the solution to a problem as being actually used by the community and therefore actually a viable solution to the problem." Python type annotations are not popular.
> I don't know why you think dismissing legacy code is valid!
Because everyone everywhere has legacy code. In that sense, Python isn't split between versions any more than any other language, making that statement moot.
If you want to use type annotations, you can, for years now. If you want to use macros to generate macros in Rust, you can, it's not Rust's fault if you're stuck on 1.20 and don't want to upgrade.
I agree that all languages have legacy code, that was sort of my point :)
> If you want to use type annotations, you can, for years now.
Yes! And it's a cool feature. Can I expect most libraries to provide types so that I know I'm calling them correctly? Is it going to be an uphill battle for my team to use them, because the community at large doesn't find them necessary? The default for a majority of Python codebases is to not use types, and that's totally fine, but that also means that rather than type annotations solving the super-grand-OP's concern, they're going to potentially add friction.
It's likely, and that's FINE, that you're going to be in dynamic-land when you're working in Python, due to the preferences of the community, and it's cool that you can use annotations if you yourself want to get some nice signature checking.
But you don't need the libraries to have type annotations for them to be useful to you. You can use the libraries' classes as types, and type signatures are already very useful even if you only use them in your code. I agree that they will be even better when more libraries declare them too, but I find that type annotations are a huge benefit in my own code either way.
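A minimal sketch of what using library classes as types can look like, with only standard-library classes in the annotations. `Invoice` and `total_due` are made-up names for illustration, and the checking would come from a tool such as mypy, since Python itself does not enforce annotations at runtime.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Sequence


@dataclass
class Invoice:
    # Hypothetical model: the field types are plain library classes.
    number: str
    amount: Decimal
    due: date


def total_due(invoices: Sequence[Invoice], today: date) -> Decimal:
    """Sum the amounts of invoices that are due on or before `today`."""
    return sum(
        (inv.amount for inv in invoices if inv.due <= today),
        Decimal("0"),
    )


invoices = [
    Invoice("A-1", Decimal("10.50"), date(2021, 1, 1)),
    Invoice("A-2", Decimal("4.25"), date(2021, 6, 1)),
]
print(total_due(invoices, date(2021, 3, 1)))

# A static checker would flag the call below, because "2021-03-01" is a str
# and the signature asks for a date:
# total_due(invoices, "2021-03-01")
```

Neither `decimal` nor `datetime` had to do anything special for this to work; the annotations live entirely in the calling code.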
I think we can agree on that - my point primarily is that:
1. In my own code, they are super helpful
2. I still get really frustrated when I'm calling other libraries (I had this experience when the various typed JSes were fighting it out as well) - in fact for a lot of web-style work that I do MOST of my code is calling a library, which is where I'd get most of the value.
Sure, I agree with that. More libraries supporting types would be great. I don't know why they don't, it's not like they need to support any Python versions that don't support types... I guess most just haven't gotten around to it.
People always forget that none of the things we stress about (code cleanliness, architecture, UX,...) make or break a product.
It's some weird combination of solving an actual problem decently enough (iow, it does need to work most of the time :)), good timing and some arbitrary pick up by the masses.
Like Google succeeded because they provided a no-frills search that had no ads, and geeks like us started pushing it down the throats of our non-geeky friends as The One True Search.
I ain't saying that Python leads to bad code (because you can have bad code in any language, just like you can have good untyped Python), but that there's this talk of scalability like it's some magic thing that's impossible if all the stars are not aligned.
I agree that it's harder as the project gets bigger, but at the beginning you don't have to worry about that. By the time these problems become real limitations, you hopefully have a business and enough revenue to deal with them, which could include rewriting parts of or the entire application in a statically typed language if necessary.
"If you're just launching, your focus should not be on performance or how easy it is to dockerize but to figure out who your user really is."
I agree here but I should note that my main focus is spending less time debugging errors at runtime and in production, and avoiding errors in the first place. This is primarily why I use rust.
"Not sure the verboseness of either golang or rust"
Verbosity has nothing at all to do with implementing good, robust code. You could do it in python, rust, or another language. With certain languages you don't have to spend time writing guard code because it's handled natively by the compiler, type system, or both.
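To make "guard code" concrete, here is a rough Python sketch (the function names are invented for this example). The first version carries its type checks by hand at runtime; the second leans on annotations, which only help if you actually run a checker such as mypy, since Python does not enforce them.

```python
def apply_discount_guarded(price, percent):
    # Hand-written guards: nothing stops a caller passing a str or None,
    # so the function has to check at runtime.
    if not isinstance(price, (int, float)) or isinstance(price, bool):
        raise TypeError("price must be a number")
    if not isinstance(percent, (int, float)) or isinstance(percent, bool):
        raise TypeError("percent must be a number")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


def apply_discount(price: float, percent: float) -> float:
    # With annotations, a static checker can reject apply_discount("20", 10)
    # before the code runs. The range check is a constraint on the value,
    # not on the type, so it stays either way.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)
```

In a compiled, statically typed language the first two checks would not be written at all, which is the point being made here; the value check would still be needed.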
" dynamically typed languages (arguably) makes it easier to change things,"
They make it easier to change things, not necessarily correctly. With modern IDEs refactoring is not an issue.
"Many have imagined and built valuable stuff in interpreted, dynamically typed languages before you, by being more focused on overall structure and making sure it's strict and resilient. One really doesn't have to search far for successful applications that are certainly way beyond 1000 LoC and still iterate pretty quickly for their size."
I am aware of this, but I can't help but wonder if they would be able to iterate much more quickly knowing that the compiler and type system eliminate entire classes of bugs. Sure, you might be able to get to running code a little bit faster, but if you have to spend time guarding it and debugging runtime errors... is it really faster?
===
There seems to be this mindset that being a fledgling startup means no time for silly things like types or compilation, etc, only iteration. If one actually does their market research and figures out who their users are and what they want, one should be able to iterate just as quickly in i.e. golang or another compiled language with a healthy ecosystem.
I have not worked at a FAANG company or a company even remotely the size of those companies, but I have created and maintained somewhat large ruby on rails codebases. Debugging runtime errors became my bane, especially when I could not guarantee that the persons writing the ruby code were following modern practices (or any practices at all). Not that I think compilers are perfect, but they -do- catch so many errors, some of which may have made it into runtime.
> If one actually does their market research and figures out who their users are and what they want, one should be able to iterate just as quickly in i.e. golang or another compiled language with a healthy ecosystem.
That's the key thing. You don't know until you put people in front of what you're building, to understand if you're actually solving the problem. Or even if the problem you're aiming for is the right problem. Hence you want a non-verbose and simple program that is easy to adapt to the changes you want to make. Verbose languages that force you to be very strict don't lend themselves to change as much as non-strict ones. You're right that it's more error-prone, as you don't have the same guarantees. But in the beginning of launching something new to the world, you are gonna need to focus more on what needs to change, rather than how correct something is.
I agree with you. I've worked at several startups and they start with 'productive' languages like python, ruby, and PHP. These are great for getting something up and running easily.
Everything looks great at the beginning. They have features, customers and a growing codebase full of technical debt. Debt that I think would be greatly reduced if the code was compiled.
Maybe it is the price you need to pay to be successful but I doubt it.
Django drives the choice here. Batteries included, huge community support, and AMAZING documentation. Half the software I write is from stackoverflow. LOL
I don't see dockerizing as a problem. Once copied out from my template I never think about it. I do admit that familiarity with the underlying Image OS (Debian in this case) makes debugging / fussing about a non-issue.
For local development that debian container is fine but for production you want something small like scratch or alpine. The fewer binaries in the container, the better and more secure.
This is one of the benefits of golang. You compile it into a binary and copy just the binary into your scratch container. Maybe it is 20MB in size.
This feels very hand-wavey to me. I can patch and ship my docker images the same way I might patch a server.
apt-get update && apt-get upgrade
rinse, repeat.
Secondly, all this cargo-culting around small containers is fine if you are cramming resources. I am not.
The reality is that my projects and many others are really, really over-provisioned. Looking at my graphs right now, I am on the front page of HN and my server load is still only 20%.
You can do this exactly the same in alpine, with "apk update" and "apk upgrade".
"Secondly, all this cargo-culting around small containers is fine if you are cramming resources. I am not."
The default docker python image is 885MB. python:buster-slim is 114MB which is much more reasonable. Even if you're not "cramming resources", pushing 885MB vs 114MB vs 20MB across the wire does add up over time.
In my case, we do quickly iterate and pull docker images so it does make a difference for my colleagues if the image is 30MB vs 500MB. Even if we're all on gigabit internet.
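For a rough sense of scale, here is a back-of-the-envelope calculation using the image sizes quoted in this thread. It assumes an ideal 1 Gbit/s link with no protocol overhead, no compression and no layer caching, and the 20 pulls per day figure is made up, so treat the results as best-case transfer times and not as measurements.

```python
LINK_MBIT_PER_S = 1000  # assumed: ideal gigabit link
PULLS_PER_DAY = 20      # assumed: hypothetical team-wide pull count


def transfer_seconds(size_mb: float, link_mbit_per_s: float = LINK_MBIT_PER_S) -> float:
    """Best-case seconds to move size_mb megabytes over the link."""
    return size_mb * 8 / link_mbit_per_s


images = [
    ("python (default)", 885),
    ("python slim", 114),
    ("Go binary on scratch", 20),
]

for name, size_mb in images:
    per_pull = transfer_seconds(size_mb)
    per_day_gb = size_mb * PULLS_PER_DAY / 1000
    print(f"{name:22s} {per_pull:5.2f} s per pull, {per_day_gb:5.2f} GB per day")
```

Under those assumptions the full image costs about 7 seconds per pull against well under a second for the smaller ones. Real pulls are usually slower than the ideal link, which makes the gap more noticeable, not less.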
"Cargo culting" implies doing it without knowing why we're doing it. The benefits of reducing the size of docker images are apparent to everyone, although there are diminishing returns. With dive[0] it is very easy to figure out where the waste is.
"The reality is that my projects and many others are really, really over-provisioned. Looking at my graphs right now, I am on the front page of HN and my server load is still only 20%. "
Optimizing docker images has nothing to do with being over-provisioned.
I will concede that it's very different as a solo developer vs even a small team.
> Yes, you can patch them but with a compiled binary like go, you don't have to.
> You don't have to watch security lists for vulnerabilities.
These two statements are incompatible. You have fewer things to watch but you’re definitely still going to track your dependencies. Static linking still means you have to do that, and nobody else can do it for you.
Not exactly. You have to watch your libraries for vulnerabilities no matter what language you use. Linking or apt-getting the library will pull in a vulnerability either way.
I am more concerned about pulling in Linux binaries that are full of vulnerabilities.
This is a valid concern but in general you should be most worried about code you actually run. If you have curl in a container but your app only uses it during the startup process, the fact that it has an issue with, say, FTP almost certainly has no effect on you. OTOH, if you’re using something like libjpeg your Go code needs to be recompiled either way and you might have to manually backport a patch just like a Linux distribution will.
It's not only about the resources allocated to your project. Bigger images also lead to longer deployment times, which lengthens the feedback loop for issues. Not to mention increased costs associated with storage, bandwidth, layer caching, etc.
I'm quite opinionated here, but these things only matter at significant scale. Most one-person SaaS businesses or SMEs never really graduate out of the first pricing tier for this stuff.
* It uses musl libc. It's not glibc, it's different. Not necessarily better or worse.
* You can't use manylinux1 wheels with alpine, so if you've got python/c extensions, you're going to be building them instead of installing upstream binaries. So cue the need for a dev toolchain, and a much longer install time.
And once you have all that, you're running a different version in dev/prod, which is one of those things that docker is supposed to be good at fixing.
I didn't mention any specific distro. However, lots of ppl use ubuntu as a base image, or other non-slim images. It's not what I do but it's a reality.
You have to reach massive scale before you’re going to see a significant performance benefit from Go, but you’ll see productivity wins from Django from day one. If you’re not targeting that kind of public scale and don’t have a large team, there’s a lot to be said for picking a mature stack on a high productivity language.
I used to think the same way and still kind of do. But you're neglecting another performance benefit. You have ~100 ms to respond to a request before the user perceives slowness. With Go you can do 10-100x more stuff in that timeframe.
Or 0 to -100x more stuff. The vast, vast majority of code is limited by the application’s workload and architecture, not the language runtime. I remember a colleague spending 3 months trying to beat Python using Go, eking out a single digit increase because it turns out that the Python stdlib C code is not easy to beat. For web apps, this is especially common: if 95% of your runtime is in the database or other external services, Amdahl’s law reigns supreme.
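To put a number on that: Amdahl's law says that if a fraction p of the request time is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). A quick sketch, taking the 95% figure from the previous paragraph (so p = 0.05 is spent in your own code) and assuming, purely for illustration, that a rewrite makes that part 100x faster:

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the runtime becomes s times faster."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1 / ((1 - p) + p / s)


# 95% of the request is spent in the database or other services,
# so only 5% is affected by the language switch.
overall = amdahl_speedup(p=0.05, s=100)
print(f"overall speedup: {overall:.3f}x")

# Even an infinitely fast runtime is capped at 1 / (1 - p).
print(f"upper bound:     {1 / (1 - 0.05):.3f}x")
```

In other words, with those numbers a 100 ms request drops to roughly 95 ms, and no amount of runtime speed gets it below that.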
The other experience point which a lot of people forget the first few times is maintenance: if using Go means that you have notably more code to write and integration to support, you’re shipping fewer features and have less time to optimize the architecture. That Go code I mentioned earlier was much larger and required time to identify, integrate, and debug third-party libraries for things which are in the Python standard library.
This isn’t to say that Go is a bad choice but again a cue to make sure you’re solving the problems you actually have rather than someone else’s situation. If you’re still exploring the business case you should hesitate before copying a decision made by someone with a well understood problem, large scale, and more people working on it.
In this thread’s context, I would highly recommend focusing on keeping the architecture easy to support and replace components when the business expands to the point where you really need to hammer specific optimizations. This is especially true in the 2020s where a large fraction of problems are either never exceeding single server scale or can be handled by trivially autoscaling containers on-demand for less than it’d cost in developer time to optimize them beyond the level you can hit in a language like Python or Node.
Again, that doesn’t mean there’s anything wrong with using Go – it’s a perfectly fine choice and has some great libraries — but it usually won’t be transformative the way some advocates claim. When you hear about huge benefits from rewrites look for how often they mention rearchitecting based on what they’d learned from the first system.
Exactly this. Developers, their managers and the CTO must accept that their work/company most likely does not operate at a huge scale.
Once you internalize that, your priorities change from all the tech hype (k8s, react, whatever) to actually working on the business problems. Not solving technical ones.
> I cannot imagine building beyond >1000 LoC in interpreted, dynamically typed languages
That would include Ruby, Python, PHP, and JS. GitHub, Shopify, Wordpress, Instagram, Pinterest, Reddit, significant parts of Netflix (Node), Facebook, just off the top of my head.
And you can’t imagine writing > 1000 LOC of a startup SaaS product in any of them? Really?
> After working on large rails codebases I get nauseous even thinking about having to safely maintain that level of code in dynamically typed languages.
I can’t speak to Rails but this is easy in Python - especially because the addition of typing years back now means that a few annotations will cover most of the common mistakes before leaving your editor. In the Django community there’s a healthy distrust for the heavy levels of magic which are harder to test, which helps a lot.
I've worked on a few sizable (~100+ tables) Django Rest applications and have yet to run into all these fabled type safety issues that would prevent refactoring.
Serializers and hygienic procedures seem to take care of whatever might have been or maybe I'm just really used to the setup.
What's the Django/Rails equivalent in Go? The value from Python isn't Python itself (though it might be easier to write in - trading off reliability for ease of use compared to a compiled language) but the ecosystem.
Fully-featured frameworks such as Django, Rails, etc give you lots of functionality out of the box which is very valuable especially at the early stages of the project where performance isn't a concern yet.
I think if you're using Htmx/Intercooler with Golang, you'd probably want to take a different approach from Django anyway. The backend design approach has to change slightly. You're focused less on whole pages and more on individual user interactions.
There's room for a tool that will scaffold a Go backend from an Htmx-ified web page IMHO.
Honestly, frameworks like Rails or Django are an anti-pattern for Golang.
The standard library makes it very easy to build web services and there’s a couple of popular packages for making routes and dealing with HTTP requests/responses simple.
There’s projects like Micro that try to do a little more batteries included but I wouldn’t consider them to be pervasive in the way Django or Rails are.
Beyond that, it’s just app logic and the typical Golang boilerplate.
Source: I’ve built a few big Golang services for a couple companies.
>Honestly, frameworks like Rails or Django are an anti-pattern for Golang.
Which is, IMHO, the biggest thing holding Go back. The standard library does a lot but a lot is also missing. Authentication, db migration, admin panels, ORM, Asset pipeline and probably more. It's a lot of custom work to do or pulling in various libraries.
I have mixed feelings on this. I go back and forth on whether I agree or disagree.
For example, once an application becomes sufficiently complex, I've found ORMs to be more of a hindrance. The abstraction gets leaky and I end up having to tune custom queries to get around some pathological edge case.
That said - most of my experience is with Rails and a lot of the implicit nature of it is problematic. On the other hand, I've not had this experience with Laravel.
Which one is right? I dunno. Lately though, writing sizable web applications in Golang is working out well so I guess I'll keep doing that.
A good ORM has a clean escape handle for tuning hot spots and otherwise saves you time with boilerplate queries.
Add a new field? Update the class / structs, generate a migration and move on. Without the ORM you write the migration yourself, you have to find and update all the queries and if you’re centralizing all your query building then you’re just using your own ORM-light without the battle hardened trials of other users using something open source and popular.
I insisted we use sql alchemy and alembic in our latest Go project — and I’m eyeballing Buffalo’s Soda and Fizz as a replacement.
It's generally just nerdcore navel gazing to insist on raw sql across a project in my opinion. Especially web apps / SaaS where you churn tables a lot. Simple migrations from simple models is what I love about Django and Flask/FastAPI with SQLAlchemy.
>once an application becomes sufficiently complex,
Perhaps, but most applications don't start off complex. Most of them start off pedal to the metal develop as fast as you can.
It's also a gentler introduction to Go. I've wanted to use it on past side projects but the upfront cost of learning the language + figuring out how to get everything else was too steep of a curve.
I would have chosen user authentication as an example. Better to have a team looking over this stuff than to write it up on your own, unless you know exactly what you are doing. You might end up with your users' passwords publicly readable in an elastic bean stack...
Contrary perspective: only avoid frameworks when you can be assured your project is (and will remain) so narrow that it has no need of validation libraries and the like.
Gin-gonic and a few others try to do that job (and provide the flexibility to plug the missing parts), but a lot of people say the stdlib is sufficient. Depending on the case, either can work great.
Sounds like you may not be familiar with Django, but it's honestly an amazing option if you want something that's mature, stable, well documented and full stack (the latter being key - one of the only ORMs I don't hate).
The biggest problem for Django and Python in general to me is more in the line of the scalability (performance) question. You are simply forced to put one or even two layers in front of it with a multi-process architecture just to get it to a baseline deployable state (in this case, traefik + gunicorn). But the good news is there are so many people doing that it really isn't a problem. And it turns out you probably want that architecture anyway (probably best if your application server isn't also doing your TLS etc).
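For anyone who hasn't set this up, the gunicorn side is small. Gunicorn reads its settings from a Python file, so a sketch of a `gunicorn.conf.py` might look like the following. The bind address and timeout are assumptions for illustration, not the author's actual config; the worker count follows the (2 x cores) + 1 rule of thumb from the gunicorn docs.

```python
# gunicorn.conf.py (hypothetical values, adjust for your own deployment)
import multiprocessing

# Listen on localhost only; the reverse proxy in front (traefik in this
# case) terminates TLS and forwards requests here.
bind = "127.0.0.1:8000"

# Multi-process model: each worker is a separate Python process.
workers = multiprocessing.cpu_count() * 2 + 1

# A worker that stays silent for longer than this many seconds
# is killed and restarted.
timeout = 30
```

You would then start it with something like `gunicorn -c gunicorn.conf.py myproject.wsgi:application`, where `myproject` is a placeholder for your own Django project name.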
Developing apps in Go is not a high productivity experience, compared to Rails or Django. The framework ecosystem isn't comparable, and you DIY a lot of stuff.
I'm saying this in complete self contradiction: my current stack is Next.js with Go/gRPC Web/sqlboiler
I've been writing python for a living for about 13 years now. I've dabbled in go, but found it incredibly verbose and of a much lower level than python. The error handling alone is enough to turn me off completely. So much boilerplate, so much code and words to express even simple things.
No thanks, I'll stick with python, for me it is the perfect balance between expressiveness and being easy to read.
Also Django itself and the python ecosystem are hard to beat. It gets out of my way so that I can focus on the business logic, instead of reinventing wheels.
I'm playing devil's advocate here. But Python has a ton of useful libraries that other languages don't have. I'm fully aware you can access those libraries in a serverless function or via docker. But some people may just want all of that native to the code to keep things super streamlined.
Honestly, django is the only thing that keeps me on python for backend service. With microservice being all the rage these days, I don't see any battery-included framework that can rival django emerging in golang, rust, swift and other new and hip languages.
If compute performance is not a requirement, python is a lot faster to develop in. Once the design has solidified, a few simple techniques can improve reliability to an acceptable degree.
To the extent of django's or rails' maturity and features, likely not, although I don't know what features you use django for. I guess this answers my question to an extent.
Last time I worked with golang I used gin and xorm, but there were some things I had to manually do. That being said, I find rails to be an enormous beast that provides a lot of things that one may or may not need.
In my experience, working with golang and rust, which are compiled and statically/strongly typed, has saved me headaches and time even with having to hand-roll some features.
It’s often better to have something you may not need than to not have something you may need, especially for projects you intend to grow, iterate on, and possibly pivot. If my experience has taught me anything, I can never assume what I don’t need in the future. Most features come with Django and Rails almost for free (the only downside being some additional deployment size), so why not.
Regarding ZeroTier, these days I recommend using tailscale [1] instead. Much simpler setup, easier to understand, and likely more secure since it's just built on top of Wireguard (vs. layers of custom protocols). You do trade-off some of the more advanced features, but those are hardly used for this use-case anyway.
Firstly, while dedicated servers are extremely good value (by my calculations they can be 100x cheaper for performance heavy workloads), for most SaaS they're not worth it due to operational requirements. I would, however, definitely look at them for cases where performance matters and the savings are really substantial, which for an early stage SaaS product, they aren't.
I'd recommend using Heroku level solutions over more complex stacks to start with. For most use cases it works great, scales easily and the provider takes care of loads of underlying stuff. This won't work when you have a lot of developers and many apps and services to run, but it's premature optimisation to do anything more than that at the start and often in my experience ends up causing downtime due to overly complex infrastructure. You don't even need to use Docker to start with.
Next, I definitely would not recommend hosting Postgres yourself at an early stage. Use a hosted provider for this! It's worth paying slightly more for someone else to manage one level of backups, the underlying OS and version patching for you! I have seen so many data loss incidents that wouldn't have happened on managed DB services. In one case it basically killed the business overnight.
Even worse, it is hard to scale if you get hit with a load of traffic unexpectedly. You will not be able to provision VMs, configure read replicas etc quickly enough if you suddenly get hit with a load of users from some news article or blog. If it is hosted you have autoscale which will handle this for you.
Finally - I would recommend using "CI/CD" from the start (I find it odd that maintaining CI is too much of a hassle, but running dedicated servers isn't!). There are many reasons for this, but one is that you will be able to make small edits even from your phone (I'm not proposing big changes). Small copy changes, for example, can be really urgent, but you may not have your laptop to hand. If you can edit a source file, commit to SCM, and have it build for you, it can save your arse more often than you'd think.
I really like the author's philosophy. Too many developers use a bunch of tools, and you get the impression that it's just for the sake of tooling, or because that's how you're expected to do things. You can keep scaling a single server and deploying via simple git hooks for years until you need a fancier setup.
Especially for side projects, you want to use the minimum amount of tooling that gets out of your way so you can focus on building a great product. To me, a weekend spent doing auxiliary or unessential stuff like configuring Kubernetes or whatever feels really fatiguing and demotivating when you don't have a lot of time to work on the project otherwise.
Using tools and services is the easiest way to profit off the work of other people (stand on the shoulders of giants), and choosing the right tools for your use case can save you SO MUCH work.
With side projects, maybe there is a certain thing I want to get out of it - eg I want to focus on design more. With the myriad of tools out there, I can then choose something that makes development less involved, and lets me focus on design. Or vice versa - if I only care about the dev side, I can import eg tailwind or buy a web components lib and call it a day.
It is fun to learn, but it sucks to yak-shave [1] all the time.
The simplicity of the stack lets me walk away for days (or years) at a time, come back, and easily ship changes. No need to remember esoteric commands, configs, or get bitten by strange infrastructure upgrades.
I'm lazy, and I'm just looking for ways to be lazier. :-)
Did you mean to expose your postgres port to the world?
Adding the port declaration to the Dockerfile only makes it visible to other containers on the same network/machine. Adding it under ports in docker-compose publishes it on the host, which by default exposes it to the world.
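To make the difference concrete, here is a rough sketch of how you could check which host address a docker-compose ports entry ends up bound to. It assumes the short syntax `[HOST_IP:][HOST_PORT:]CONTAINER_PORT[/PROTO]` with an IPv4 host address and Docker's default bind address of 0.0.0.0. The function names `host_binding` and `is_loopback_only` are made up for illustration and are not part of any Docker tooling.

```python
import ipaddress


def host_binding(port_spec: str) -> str:
    """Return the host IP a short-syntax compose `ports:` entry binds to.

    Assumption: IPv4 only, short syntax only (no long mapping form).
    """
    spec = port_spec.split("/", 1)[0]   # drop a trailing "/tcp" or "/udp"
    parts = spec.split(":")
    if len(parts) == 3:                 # explicit host IP was given
        return parts[0]
    return "0.0.0.0"                    # default: every host interface


def is_loopback_only(port_spec: str) -> bool:
    """True if the published port listens on a loopback address only."""
    return ipaddress.ip_address(host_binding(port_spec)).is_loopback
```

So an entry like "5432:5432" listens on every interface, while "127.0.0.1:5432:5432" stays reachable from the host only. If Postgres is only used by the other containers, leaving the ports entry out entirely is probably the simplest option.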
Since they are using traefik as a reverse proxy, the container is likely not exposed to the internet directly: as long as traefik is there, it manages all incoming connections.
What I find lacking in this post is tips on how to best configure systemd to run docker/podman containers and/or docker-compose. Any tips on that would be great!
Thank you! At no point do I have to touch systemd, unless I want to change the default behavior of the docker daemon (which I don't).
You can simply apt-get install docker-compose. I am not familiar with other distributions. I suppose you could also pip install docker-compose.
Fortunately most of my systems remain in the default state. All the magic happens in docker/docker-compose.
The way I see it, there's a level of complexity above which SSH becomes too painful to administer. So now what? Well, unless you find something in between, you are in for setting up a full VPN and managing the complexity and security of that. And if your network needs are quite fluid, that itself is going to be a pain. Zerotier to me looks like that "something in between". But I'd be interested in Sam's comments.
Lots of discussion around Docker here. I use some glue shell scripts to deploy Docker containers to prod as I think Kubernetes is overkill. I have however been considering podman and starting the containers from systemd which would be a very small leap.
Or you could just use Dokku [0]? I don't mean to criticise the author's stack, but if you really just want to focus on your application and spend less time figuring out/learning about deployment, then adopt the 12 factor [1] approach and use Dokku with Fabric [2] for your server setup. I have a sample fabfile [3] I use for my initial server setup that I can easily modify to suit different project needs.
Ultimately, it comes down to personal preference, but since Dokku is installable on your development machine (via virtualbox), you get a very similar environment to production.
You probably want your references without leading spaces, just add another line feed to get them on separate lines and avoid the "code block" that doesn't parse links:
OP here: I already tried the Dokku way of things, and it was not to my taste. I like being a bit closer to the native docker commands if/when things go sideways.
If I have the competency and like it, then why not? Also consider this is in the context of personal projects and small operations where scale is not the issue.
I get way more bang for the buck in terms of compute, storage, and memory with self-hosted/dedicated.
A dedicated server at hetzner is about 30-35 dollars / month. That will get you a ryzen5/64 gigs/512 RAID-1 NVME on linux.
Heroku at that spec really starts to hit the wallet.
More than enough for a lot of applications. Hell, I ran an e-commerce company with 20MM USD/year in sales on a dedicated server from 2005-2013.
Most things are plain vanilla and rock stable. I can go from new server to running my first services in less than an hour. Ansible or cloud-init go a long way to turning my pets into cattle, so to speak.
I spend nearly no time "managing it" -- I have grafana dashboards and Hetzner monitoring the critical things.
I see Hetzner mentioned often due to their great value, but given their EU location I’d imagine the latency isn’t ideal. I’m curious what your experience is.
When I’m looking for a cheap dedi box I use OVH or one of their related companies since they have Canada DCs which are closer.
Obviously if you are in the EU, Hetzner and scaleway are a great option.
I'm in Europe, but most of my users are based in the US. The ping seems to be in the low 100s for my users and the price is really low compared to AWS/GCP/Azure (about $450 vs $2500). I think it's pretty much worth it.
I'm from the US and my personal website has been hosted through hetzner for 3+ years. From the US, I typically got around 120-130ms ping before I put it behind cloudflare.
Maybe I am using the wrong terminology, but I am interested in whether this provides for zero-downtime deploys with health/readiness checks and rollbacks. This enables safe and non-disruptive deploys. It is a common feature of many software platforms.
The issue I had with this setup is that without systemd or some other daemon, you might need to firefight and manually restart your containers each time there is an issue.
To make it more realistic, let’s assume that you use an exotic pg extension. Some morning you read about a huge security issue in django. You see the patch comes with library upgrades that are incompatible with your exotic extension. So what do you do?
With two separate containers, you can patch one and leave the other one untouched. Saying that the extension shouldn’t be so brittle is true but doesn’t solve the issue because without it you wouldn’t have the job.
(For a real scenario look at the python 2 deprecated thread and arcgis)
Docker monitors only the main process. If a side process fails, Docker will do nothing about it. That means if you want to put multiple apps inside a single container, you have to add a process manager (e.g. supervisord) to keep an eye on all the processes.
Plus, with separate containers the layer cache is reused better.
Why do you need Docker to monitor it? I would think it will only recognize a problem if the process dies. But you need to monitor your db and application for other types of failures anyhow, right? I cannot remember the last time I had a problem with the DB process dying.
That's the default behavior. Docker tracks the main process. If it dies, and your database is not the main process, then the entire container will be restarted (depending on configuration). That means your database will be forcefully killed. At some point you might run into a data corruption incident.
If you make the database your main process, then Docker will be blind to your application's death.
That's why I mentioned that if you want to put multiple applications inside a single container, you have to put them under supervisord or an alternative. That way neither would be the main process.
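For anyone wondering what such a manager actually does, here is a deliberately minimal sketch in Python. It is not how supervisord is implemented (supervisord can restart children, handles logging, and much more). It only shows the core idea: one parent process starts every app, watches all of them, and shuts the rest down cleanly as soon as any one exits, so that the container's main process ends and Docker's restart policy can kick in. The name `supervise` and the two stand-in commands are hypothetical.

```python
import subprocess
import sys
import time


def supervise(commands, poll_interval=0.1):
    """Start every command, wait until one exits, then stop the rest.

    Returns (index, exit_code) of the first child that exited.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for index, proc in enumerate(procs):
                code = proc.poll()        # None while the child still runs
                if code is not None:
                    return index, code
            time.sleep(poll_interval)
    finally:
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()          # polite stop (SIGTERM on POSIX)
        for proc in procs:
            proc.wait()                   # reap children, avoid zombies


if __name__ == "__main__":
    # Hypothetical stand-ins for a long-lived database and a crashing app.
    database = [sys.executable, "-c", "import time; time.sleep(60)"]
    app = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise([database, app]))
```

The point is that the database gets a SIGTERM and a chance to shut down properly instead of being killed along with the container. With one process per container you get the same effect for free, which is why I'd still go with separate containers.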
> Is there a benefit of running Postgres and Django in different containers
You can use the official optimised containers (especially for complex things like Postgres, that's great)
> By putting them in the same container, things could be a good bit simpler. And I don't see any downside here
The downsides are that you get fat containers that are impossible to scale independently (one day you might want two Django containers, or to set up a replica PgSQL for load balancing, HA, etc.)
Putting them in the same container means I have to manage the processes myself with something like supervisord. A docker-compose file makes this trivial because it is just a matter of a few lines of YAML that link the services together. It also means I can ship updates to my application and restart it without touching the rock-solid Postgres image.
Typically the process to update an application is to push up a new image and then tell docker to stop the old one and load the new one in its place (yes, I know there are 0-downtime ways, but let's keep it simple).
If you put both processes in the same container, then you cannot simply upload a new image and restart the container. This is where it gets very complicated, error-prone, and not in line with what I understand to be current best practices with docker.
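As a rough illustration of that simple (not zero-downtime) update flow, here is a sketch using plain docker commands driven from Python. The helper names `redeploy_commands` and `redeploy`, and the image and container names in the usage note, are made up for this example. With docker-compose the equivalent steps would look different, so treat this as one possible shape rather than the setup described in the post.

```python
import subprocess


def redeploy_commands(image, name):
    """Commands for a stop-and-replace update of a single container."""
    return [
        ["docker", "pull", image],        # fetch the new image first
        ["docker", "stop", name],         # stop the old container
        ["docker", "rm", name],           # free up the container name
        ["docker", "run", "--detach", "--name", name, image],
    ]


def redeploy(image, name, run=subprocess.run):
    """Run the update steps in order, aborting on the first failure."""
    for cmd in redeploy_commands(image, name):
        run(cmd, check=True)
```

Calling `redeploy("registry.example.com/myapp:latest", "myapp")` would only touch the application container. A Postgres container running next to it is never stopped, which is the whole argument for keeping the two apart.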