I don't get why people use Docker for "small SaaS apps" where "scale is not an issue". You can run everything without Docker to remove an extra level of indirection. I've been doing that exact thing with a very similar Django stack for years. Not once have I missed Docker. On the other hand, I have taken over SaaS projects where the (unnecessary) usage of Docker made it more expensive to host, decidedly more difficult to debug and maintain, and tests were slow, making me less productive.
As the reason for using Docker, the author writes "The benefits of matching your development environment to your production one cannot be overstated". Again in my personal experience, this simply isn't an issue. Just use the same Python version and a roughly similar PostgreSQL version in development and production, and you're good to go.
In short, I feel OP would be better off without Docker. Once you know the rest of the stack well enough, you can set it up in dev and prod in such a way that you can trust that what works in dev will also work on prod. I've been doing this without problems for years.
This is a pretty common sentiment on HN and I’m sure you have your reasons for it, but to me personally it just seems bonkers. Docker is an absolutely game-changing tool for me. I can’t count the number of times it’s saved me from completely screwed up system libraries, tools installing config files in weird places, conflicting versions of this or that, or other forms of system pollution. I write a simple docker-compose.yml and I’m good to go - Docker takes care of sandboxing, networking, mounts - the works.
As I said before, you must have your reasons for your view, but my own experience is the polar opposite.
I'm a bit of a docker cynic, but if you tell me I have to deploy any of Ruby, Python, Node.js, or Perl, I'm using docker to manage the dependencies. My opinion is that this is actually docker's primary value-add. Yes, it enables whole hosts of orchestration benefits, but it also makes it actually possible to just package these runtimes in a reasonable-ish way.
For platforms which support more self-contained binaries, I'm going to keep trying to avoid docker as much as possible.
Sure, we're all just talking about our personal experiences. I just haven't run into the problems you describe. I use virtual environments to keep my system installation clean, and PostgreSQL / Nginx are so stable that the version I happen to have on my dev machine usually works with all my projects dating back to 2014 (but kept up-to-date).
> Sure, we're all just talking about our personal experiences. I just haven't run into the problems you describe. I use virtual environments to keep my system installation clean, and PostgreSQL / Nginx are so stable that the version I happen to have on my dev machine usually works with all my projects dating back to 2014 (but kept up-to-date).
One of the most important things docker provides is an abstraction between the application and the OS. If you don't need the abstraction, then it's only going to look like cost.
On the other hand, being able to write some code in Python 3.9 and deploy it anywhere across a heterogeneous production environment without fighting with the OS about how many versions of Python it has installed is a useful thing.
Do you by any chance often have cause to compile and use tools or libraries written by academics? Or regularly use projects that aren’t as big and well-developed as Postgres and Nginx?
I find it really useful for development. For example when I configure nginx + a service, if it works in my local env, I have to run one command and I'll have the same in prod, and I don't have to debug on the server. It's also useful to keep track of the OS dependencies.
But it ends up being a matter of preference, I could still do it without it, but I'm used to it now, for some reason I feel that it makes my systems more "reproducible" in a simpler way.
Explain? I'm aware that giving a host user permission to control Docker is equivalent to root, but that's no worse than wheel/sudo in most cases, and not a "sandbox" failure. So I assume you're thinking of container escape, which I was given to believe is actually hard these days?
Docker uses shared resources like the kernel. The Linux kernel is a big, ugly C mess (compared to IncludeOS), and one can probably find a good enough kernel exploit and then escape Docker.
That's why a VM provides much better security. Well, VM escape exploits exist, but they are at least much harder than, say, a Docker-level escape.
He was referring to sandboxing in terms of not polluting the installed packages/libraries on the server and not from a security perspective. But even from a security perspective, there are solutions to this problem like running containers as an unprivileged user using podman.
At work I had to set up a server with django, postgres, SSL, rabbitmq, celery and a few other things. I did it in both ansible and docker-compose, the ansible version was 1500 lines and docker-compose version was 100 lines. A lot of the ansible stuff is dealing with users, groups, services, supervisord, virtualenvs and repositories - none of this is required in docker as containers restart themselves, users are already set up, you don't need virtualenv, and using containers avoids the need to use repositories to get specific software versions.
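For illustration, a heavily trimmed sketch of what that kind of docker-compose file can look like (the service names, images and commands here are placeholders, not the actual setup described above):

version: "3.8"
services:
  web:
    build: .
    command: gunicorn myproject.wsgi --bind 0.0.0.0:8000
    env_file: .env
    depends_on: [db, rabbitmq]
  worker:
    build: .
    command: celery -A myproject worker
    env_file: .env
    depends_on: [db, rabbitmq]
  db:
    image: postgres:13
    volumes: ["pgdata:/var/lib/postgresql/data"]
  rabbitmq:
    image: rabbitmq:3
volumes:
  pgdata:

Restart policies, users and process supervision are handled by the container runtime, which is exactly the boilerplate the Ansible version had to spell out.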
Also the docker-compose version was much faster to pick up changes, vs minutes for ansible.
I don't find docker that useful for developing on a day to day basis however, it just gets in the way and generally you're not going to need a webserver, SSL, celery, rabbitmq in development anyway. It is useful to debug things locally though.
The other reason to use docker is state: while using ansible we got bitten by old dependencies hanging around because ansible doesn't exactly mirror your config, it can really only add stuff. Whereas docker actually wipes the slate clean every single deploy so your server is an exact mirror of your docker-compose file. This is really important.
The cost-benefit trade-offs of Docker will fit some and not others. For smaller projects with less change and a few standard components, Docker is probably overkill. In other cases Ansible is too fiddly or slow.
Still important to keep in mind those Docker images must be maintained too.
You can tell Ansible that a package should be removed. You just have to remember to do it. If there was ever a statement in your playbook that a package should be added then you can’t assume it will disappear automatically - you have to tell Ansible to remove it.
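For reference, a minimal sketch of such a task (the module is real, the package name is a placeholder):

- name: ensure the obsolete package is gone
  ansible.builtin.apt:
    name: somepackage
    state: absent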
Most of the stuff you would be doing with Ansible happens inside the Dockerfiles. So comparing lines of code of your Ansible codebase with just the docker-compose file is not fair towards Ansible.
I'm not trying to pick on you personally here but this comment, like many I see on HN, seems overly adversarial and could have made the same arguments in a less combative way.
There's a lot of emphasis on your own perspective: "*I* don't get", "*I've* been doing that", "Not once have *I* missed Docker", "making *me* less productive", "*my* personal experience*"
And an assumption that it applies to others: "Just use the same [...]", "I feel OP would be better off [...]".
The comment could provoke a more productive, thoughtful discussion by focusing on understanding rather than on pushing particular views and practices.
For example:
- "I don't get why people use Docker" -> "Could anybody share their reasoning for using Docker?"
- "You can run everything without Docker to remove an extra level of indirection" -> "What benefits do you see in exchange for this extra level of indirection?"
- "Not once have I missed Docker." -> "I haven't found much use for Docker and would like to understand those who have"
- "I have taken over [...] making me less productive" -> "I've only encountered downsides in my own experience with Docker, could anybody share what the upsides may have been for others involved?"
- "Again in my personal experience, this simply isn't an issue" -> "I'd be interested to hear more about the benefits as I haven't encountered issues before"
- "In short, I feel OP would be better off without Docker" -> "I wonder if the OP really needs Docker, perhaps they could be more productive without it"
If we all wrote like this (and I'm certainly not perfect myself), HN could be a much friendlier, more welcoming place.
Denigrating any subjective experience because it's "too direct" is bordering on passive-aggressive. Every suggested edit you've made to the comment completely alters the meaning of the text: you're basically saying that if you're not looking to learn and completely disregard the veracity of your own experience in every statement you make, you're better off not saying anything at all.
It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
People have subjective opinions and experiences; this does not necessarily make their statements true. This should be obvious, and I don't see why it's the sender's responsibility to preface everything they say with that fact to avoid stepping on the toes of whoever reads it. Opinions don't kill debate; they further it.
> you're basically saying that if you're not looking to learn and completely disregard the veracity of your own experience in every statement you make, you're better off not saying anything at all.
That's really not what I'm trying to say at all. If I had to condense it down to one sentence it would be: "if there's a difference in opinion between two reasonable people, it's probably because you've had different experiences; you'll have a more productive discussion by finding out what they are than simply stating your opinions".
> It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
If you're not interested in somebody's opinion, what value could there possibly be in having a discussion with them?
> If you're not interested in somebody's opinion, what value could there possibly be in having a discussion with them?
Not every comment on the Internet is an invitation to discussion, and the discussion is not exclusively between two people. A lot of comments are just responses about why someone disagrees or agrees with what was just said, which is allowed.
Is it OK if I am direct with you and say this is absurd and unhelpful?
OP’s post was clear and to the point. It was entirely focused on the content of their frustrating experience with Docker. And it wasn’t at all “adversarial”. They had a bad experience with software and are sharing it.
I’m glad they didn’t try to wrap it in softening language just in case someone might find their communication style “violent.” I hate this trend. I see so many comments like this in tech and honestly it’s mostly from people who have nothing to say or who become indignant if you say anything they disagree with.
I understand your concern and agree that I am being extremely direct. It's not out of spite. On the one hand, it comes from my frustration with what I as an engineer perceive to be wrong advice. It feels to me like OP is using Docker because he read a blog post about it somewhere, never seriously thought about alternatives, then stuck with it and now recommends it to others, repeating a cycle where software gets worse and worse. And on the other hand, being direct is more concise. We all have little time and I feel that to get a point across on the internet, one has to get it across quickly and succinctly. I have no problem with my views being criticised and I'll be happy to concede where I was wrong if someone makes a clear and harsh point. But either way, I'm sorry if I offended you in particular or others who did not comment; it was not my intention. Having read this thread, I think much of the difference of opinion comes from different contexts. People who like Docker often seem to be working in larger teams, which is totally fine and they're probably right to use Docker. I just couldn't help but disagree with a statement that I felt pertains to an area I work in, which is small SaaS of tiny teams with usually no scalability requirements.
I get where you're coming from and you didn't offend me at all. I'd just been looking at the top comments of a bunch of HN threads and thinking "why does this place seem so unfriendly". Your comment isn't anywhere close to being the worst, it just happened to be the one I responded to.
And being direct isn't a bad thing either, my point here wasn't "don't be direct or concise, you'll offend people" but "assuming everyone here is reasonable, disagreements are likely due to differences in experience and you'll have a more productive discussion by figuring out the differences than simply stating opinions".
You are making judgments about OP using Docker. But from my point of view, you are a true pain to work with because you do not containerise your work. Any time I had to work with one of your creations, I'd have to go figure out what you are using, set it up and make sure it works. That's a waste of time, and you are promoting that very fact because you were unable to find any use for it. I doubt OP has only "read some blog post about it somewhere"; it's literally in wide use by a lot of companies. "People who like Docker often seem to be working in larger teams, which is totally fine and they're probably right to use Docker." -> it's not only about that: it is useful for scaling, that's true, but it's also very useful just to share the env your software should work with. And I'd dare to say that providing a Dockerfile is now standard practice.
You're judging my code without ever having seen it, nor knowing the context of my projects. You won't have to work with my creations because, as I explained, my projects usually involve just me. As I also explained above, I think Docker can make sense in companies. And re "standard practice", I like to make decisions based on what's best for each project, not by dogma.
What if you want to run the code on a different architecture? In that case isn't it better to document which "pip install" etc. you have to perform to get it running? And from there do you really still need docker?
> I'm not trying to pick on you personally here but this comment, like many I see on HN, seems overly adversarial.
Ironically I find the first comment non-adversarial and this comment overly-adversarial, but maybe that’s just me.
People are allowed to disagree and share their own experiences, and OP didn’t say anything nasty about anyone. I would prefer a space where people share their honest experiences with reasoning over one where everything has to be coated in sugar to make it digestible. And I’m sure the author of the post can handle hearing someone else’s differing opinion, it’s a discussion forum after all - not a blind agreement forum.
I think using "I" statements and referring to their own experience makes the comments subjective. They may be expressing a strong opinion, but they aren't stating it as objective truth.
> If we all wrote like this (and I'm certainly not perfect myself), HN could be a much friendlier, more welcoming place.
This is a well-intentioned thought but I'm not sure it actually makes sense. The things that make or detract from HN being a friendly and welcoming place have a lot to do with its size and heterogeneity -- ideology included. You're trying to neuter the disagreement and rephrase it in terms of a question. That's not always a bad thing, but in some cases you really do just disagree. If that's going to be the case, why not be upfront about it, so long as you can be civil? (GP's point does seem to be phrased in a pretty civil manner IMO)
It’s too fucking hard to set up Python with all its pyenv, anaconda, poetry, pip, requirements.txt and what-not; that’s the best use case for Docker here. Get it to work once and forget about it.
I also dislike all those tools you mention. So often, I feel like devs use a tool "because it's cool" but fail to appreciate the complexity it adds - not just for themselves but for others. The frustration expressed by your comment is a perfect example of this.
In my ideal world, everybody would just use what comes with vanilla Python: a virtual environment created with venv, and pip with a pinned requirements.txt.
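That is, roughly this and nothing more (the exact commands are my illustration):

python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt   # versions pinned in requirements.txt, e.g. Django==3.2.12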
I prefer my system for running youtube-dl in an isolated environment without having to install all of its dependencies on the host. Ansible writes a bash script at ~/bin/youtube-dl containing:
#!/usr/bin/env bash
set -e
IMAGE_NAME="grepular/youtube-dl"
# Build the image if it does not exist
if [[ $(podman images --filter "reference=$IMAGE_NAME" -q) == "" ]]; then
podman build -t "$IMAGE_NAME" -<<EOF >&2
FROM python:3-slim
RUN python3 -m pip --no-cache-dir install youtube_dl
ENTRYPOINT ["youtube-dl"]
EOF
fi
podman run --net host -i --rm -v "$PWD:/app" -w /app "$IMAGE_NAME" "$@"
Nope. I'm using podman rather than docker so they're owned by the user that runs the script. The root user isn't involved. If I were using docker, I'd add `-u "$(id -u):$(id -g)"` to the docker args to deal with that issue.
Venv is great for keeping dependencies separate but if your local machine doesn’t have the same version of Python as your server then you’ll need pyenv.
Not necessarily, you can use a CI pipeline to verify that your project can be built with many Python versions. This is the workflow I use here[0]. It makes catching and fixing breaking changes easier. Plus, I'm confident the core team is not likely to introduce a painful breaking change (think Python 2 -> 3) soon[1]
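For illustration (a generic sketch in GitHub Actions syntax, not the linked workflow; any CI with a version matrix works similarly):

# .github/workflows/test.yml
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -r requirements.txt
      - run: python -m pytest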
A few hours of annoyance per developer in the beginning vs. a few minutes every time you run tests or debug. Also add in the overhead of configuring a debugging setup where you work in an IDE and debug code running in a docker image. Doable (though many of my fellow devs have just given up), but it also takes time, probably more than it takes to run pip and get psycopg2 installed on Macs. Also, the majority of engineers probably don't even know docker that well (definitely far worse than they think they do, just like with k8s). The number of times I've seen devs running webservers as root in docker containers is amazing.
My team insists on using Docker and, against my better personal judgement, I let them, but I set up the code to run locally without it. If I called the shots, I absolutely would not use Docker.
Because I have, my colleagues have, in fact we probably spent more than an hour and just decided it's not worth it. I decided it's not worth it because I had a way to debug, while others decided they will just code without proper debugging tools because they only want to work through docker.
My (limited) experience debugging Typescript/Node, Typescript/Browser, PHP using IntelliJ and Docker has been pretty much like this:
1. Open a port on the container for debugging
2. Tell the debugger in the container what's the port it should use (if you're not using the default one)
3. Tell the IDE what port it can use to connect to the debugger (if you're not using the default).
4. Debug it!
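As a concrete, hypothetical example for a Node.js service, steps 1-3 amount to little more than an inspector flag and a port mapping (the PHP/Xdebug setup is analogous):

# docker-compose.yml, sketch only
services:
  api:
    build: .
    command: node --inspect=0.0.0.0:9229 dist/server.js
    ports:
      - "3000:3000"
      - "9229:9229"   # debug port the IDE attaches to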
For things like Typescript you might need some extra trickery because it's all transpiled, so you'll need to make sure your sourcemaps are set up correctly, but that's not overly difficult.
It did take me several hours to work everything out and write a README so that every time we get a new hire / someone sets up a new PC they can just follow the instructions. I'd say it was time well spent.
Same. The only issue I had with virtualenv was when I copied one to a different directory and it didn't work. It turns out you can't do that. Everything else has always worked fine, and I've been using it professionally for 10 years.
No it's not. It's literally just two commands to get started with poetry, and after that, one command to add a dependency and it's pretty easy to run simple projects locally.
Ah, see, this is the perfect example of why Docker helps, especially with Python: consistency.
The team would no longer have to make these deployment decisions and argue about which tool is a better fit. They'd make the decision once, hopefully follow best Docker practices (unprivileged user, multi-stage builds, etc.), and have documentation available for how to integrate the setup with IDEs, work with volumes, etc.
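A rough sketch of what those practices look like for a Python app (image tags, paths and the start command are placeholders, not a recommendation):

# build stage: install dependencies into an isolated prefix
FROM python:3.11-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# runtime stage: copy only what is needed and drop root
FROM python:3.11-slim
WORKDIR /app
COPY --from=build /install /usr/local
COPY . .
RUN useradd --create-home appuser
USER appuser
CMD ["gunicorn", "myproject.wsgi", "--bind", "0.0.0.0:8000"]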
Once this initial adoption hurdle is overcome, IME the productivity gains are greater than the issues of dealing with Docker. It becomes trivial to set up CI/CD, onboard new developers and integrate the app into other workflows.
Docker and containers in general have become mature enough to prove their use case and benefits, so the cargo cult argument doesn't hold weight for me.
> You can run everything without Docker to remove an extra level of indirection.
... and add questions like: are all paths correct for this machine, are binary lib dependencies the same as in production, are the OS versions compatible between dev and prod, is my local runtime version compiled with the same options?
The great example where the indirection adds value is: every developer has the same environment and the CI is the same, regardless of personal preferences in systems.
If you have to answer such questions then yes, Docker is probably a good solution for them. I find that in my pretty vanilla web development, I am not faced with those questions. Most of my stuff is just Python, Django, PostgreSQL, Nginx and a few not very exciting Python dependencies. (This is in several independent projects, working as a sole developer, since 2014 and let's say ~100,000 users of my SaaS apps.)
In my experience you'll just run into them one day by accident. My last one was actually about differences in postgres drivers talking to sqlalchemy between Debian and Fedora. I agree it's likely a "pick your poison" situation, but these days I prefer defaulting to containers (and investing time upfront) for just about everything to prevent those problems rather than debug them once they happen.
I could tell you were a solo developer before getting to this comment.
A lot of the problems docker solves show up as soon as you need to fit out a team with a mix of environments and split ops from dev
I still use it for individual projects because I never plan on maintaining everything forever - makes it easier to grow to a 1+1 team or hand the project to someone else or solicit contributions
Also builds your experience for when you do work in teams
What do you use for the frontend part? I'm using Vue or React + django-drf, but it always feels like it could be much simpler for most SaaS applications. Using Django forms feels too limited, on the other hand.
Docker uses the host kernel. To get everything identical you need to use a VM anyways.
Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod. For others (Python, Ruby) that use many system libraries, containers may add value
> To get everything identical you need to use a VM anyways.
Yes, which is where you can reach for firecracker for example. But docker gets you 95% there. If kernel makes a difference you can make the decision about the other 5%.
> Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod.
As long as they're static, vendored dependencies. Otherwise you're still likely to run into pulling something very common like openssl, zlib or libuv from the system.
Docker is useful for reasons besides scale. It allows you to (kind of) declaratively define your environment and spin it up in any number of different scenarios.
It allows you to ship all your dependencies with your app and not worry about getting the host machine properly configured beyond setting up Docker and/or Kubernetes.
Have other apps you want to host on the same set of machines? No problem.
When you pair that with something like Alpine Linux, you’re getting a whole lot almost for free.
For running applications in production, Alpine Linux is harmful and should be avoided. Getting a smaller container image is not worth trading glibc for musl.
To a Docker outsider, to me this implies that Alpine for testing and non-Alpine for production is not harmful. Is that right? Wasn't having a unified environment half the point of using Docker? Doesn't two different base systems just open you up to a load of headaches? If so, then isn't it more of a "Alpine Linux as a Docker base is harmful" situation?
Thank you for sharing that! It has never occurred to me but that makes sense.
For my Go/Haskell binaries, I usually do need to make changes in order to get them working. The smaller image size is essential in my use case though so I pay that penalty.
How many scenarios do you typically need to spin up your environment in?
I achieve the same (defining the environment) by pinning Python dependencies via requirements.txt and virtual environments. And I have an installation script, like a Dockerfile but just an executable bash script, that installs PostgreSQL etc, pulls the code from GitHub and starts up the server. I can upload this to a clean Debian installation to set up the server, with a well-defined stack, in minutes.
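A condensed sketch of what such a script does (package list, repo URL and paths are simplified placeholders):

#!/usr/bin/env bash
set -euo pipefail
apt-get update && apt-get install -y nginx postgresql python3-venv git
git clone https://github.com/example/myapp.git /srv/myapp   # placeholder repo and path
cd /srv/myapp
python3 -m venv .venv
.venv/bin/pip install -r requirements.txt
.venv/bin/python manage.py migrate
cp deploy/myapp.service /etc/systemd/system/ && systemctl enable --now myapp
cp deploy/nginx.conf /etc/nginx/sites-enabled/myapp && systemctl reload nginx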
> Have other apps you want to host on the same set of machines
I don't. I just use one $5 per month Linode for each SaaS. (Or bigger Linodes as the projects get more users. My biggest single box currently serves ~100,000 users.)
> I don't. I just use one $5 per month Linode for each SaaS
Well then you're basically using VMs as your containerization mechanism, with install scripts replacing Dockerfiles.
So from your perspective, don't think of Docker as a really fat binary. Think of it as a stripped-down VM that doesn't cost a minimum of $5 per instance, and comes in a standardized format with a standardized ecosystem around it (package registries, monitoring dashboards, etc.)
It's a really fat binary that adds unnecessary layers in both dev and prod. In dev, it's faster to run tests and easier to debug without Docker. And in prod, why use a vm inside a vm?
> In dev, it's easier to debug without Docker. And in prod, why use a vm inside a vm?
On the contrary, it's easier to debug with Docker - it eliminates dangling system level libraries / old dependencies / cache, and everything is self-contained. If you have the problem in dev, you'll have it in prod as well.
Docker is not a VM; it's basically a big wrapper around chroot and cgroups, so the performance hit is minimal. The advantage is that, again, it's self-contained, so there's little risk some OS / dangling library muddies the waters (especially in Python that's a great risk - OS-level Python library installations are a thing, and many a Python library depends on C libraries at the system level, which you can't manage through Python tooling). It's also idempotent (thus making rollbacks easier) and declarative.
I've never encountered this "dangling libraries" problem you describe in the past 7 years of developing web apps. I suspect you are working in different, maybe more specialised, environments than me. For me, it usually is really just pretty standard libraries and dependencies.
Containerisation software adds so much more than just a binary with layers of indirection.
Easy fine grain control of individual application memory, namespace isolation (container to container communication on a need to know basis), A-B testing, ease of on-boarding, multi-os development, CD pipelines, ease of container restarting etc.
This is absolutely the case. IMO the main benefit provided by containerization is reproducibility. Scaling and other positives are just a product of this.
If you know Docker, and use it daily - it’s a no brainer. For me it’s an easy choice - I dev on a Desktop PC but when I’m on the go I’m on a MacBook. No need to customize environment config with docker. What’s your disaster recovery plan if your server tips over? With docker it’s relatively easy. I’m not saying it’s for everyone, but in my experience it makes my life simpler.
How do you debug (in dev or prod), or in production, do things such as inspecting log files, monitoring system resources, running necessary commands if there is something urgent...? In my experience, Docker just makes everything more tedious without tangible benefits. I personally lose nothing by not using Docker and am much faster in everything I do.
How much experience do you have with Docker? I am not being snarky -- all the things you say sounds like something I would have said until I pulled myself together and learned how to use it properly. And granted, that did take a bit of effort but now my perspective is that Docker is extraordinarily easy to work with.
A Dockerfile is almost exactly like your requirements.txt file, only it works for everything. Need Imagemagick installed on your server? Just add it to the Dockerfile. Need to run a pre-start tool that was written in Ruby? A line or two in the Dockerfile can add that. And if it turns out that you don't want it after all, you just remove the lines and feel confident that no trace is left behind.
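For instance, something like this line in the Dockerfile (the packages are just examples, assuming a Debian-based base image):

RUN apt-get update \
    && apt-get install -y --no-install-recommends imagemagick ruby \
    && rm -rf /var/lib/apt/lists/*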
I don't have a lot of experience with Docker, but it's not what's holding me back. Here are some of my pain points:
1) Having to "docker exec" to get a shell in a running container vs just running the command (in dev) or sshing into the server (in prod).
2) Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twiddling my thumbs.
3) Having to even spend time learning how to debug Python code inside a locally running docker container.
4) Having to deal with disk space issues caused by old docker images on my 500gb hdd.
All while not deriving any value from it in my projects.
> 2) Tests taking 3:30 min to run with docker vs 30 secs without.
This is not a problem with docker, but with your environment. Tests should run at the same speed, and take the same time to start, with or without docker (with an exception for changes that update the dependency list - those will take the time of the initial build).
Things to check: Are you installing dependencies before adding the app? For development are you mounting the app instead of building a new image each time?
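For the second point, a common approach is a dev-only compose override that mounts the source into the container (a sketch; the service name and path are assumptions):

# docker-compose.override.yml
services:
  web:
    volumes:
      - .:/app   # code changes are picked up without rebuilding the image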
Give Docker a go on a project and you'll see its utility:
1. If you're doing this, then you're likely using docker incorrectly. Your container should be run from images that are automatically deployed. If there's an issue, fix local and deploy image.
2. I agree, that sounds dreadful. We use pyenv and poetry for local development and docker for deployments. That would address your issue. We of course do not use pyenv or poetry inside the docker image. Hopefully that helps clarify potential real world use.
3. You should not be debugging a live deployment. Debugging a local container is straightforward in VSCode.
4. This shouldn't be an issue. Prune weekly, and aim for smallest images where possible, e.g., 60MB for python microservice.
The value is in the simplicity of deployment. Need to host well-known software? It can be brought up or torn down in no time.
As I wrote at the very top of this thread, I have used Docker in a project where it was forced on me, and I still use it in other client projects where I don't have a choice. I don't need to "give it a go" to learn what it's like.
1. First, the infra to automatically deploy again introduces complexity. And waiting for the Docker push to complete, and then for the new container to start, takes away from my time.
2. Yes, it is a pain.
3. I do what's necessary to fix problems. Sometimes that means looking at production. I use PyCharm, not VS Code, and because I don't find value in Docker for my projects, I have no incentive to look into how to set up local debugging.
4. "Prune weekly" you say. But it's just another complication I have to deal with when using Docker. What for?
> The value is the simplicity of deployment
I would argue my deployments are simpler than yours. Give me git and ssh and I'm good to go. No need for Docker, pushing to some registry, looking at a dashboard or waiting for the image to be deployed. And my setup is much easier to debug.
But it sounds to me like you are coming from a more enterprise-y environment. There, the things you say probably make sense. In my case, I'm a sole developer. Any unnecessary process or tool slows me down and incurs a risk of bugs due to added complexity.
> I would argue my deployments are simpler than yours. Give me git and ssh and I'm good to go.
FYI: the wait time is marginal at best to deploy images and containers. My workflow is the same as yours most of the time for solo projects: SSH into server, git pull (from a "stack" branch that has docker-compose files containing DB and microservices), docker-compose up -d, and because of cached images it takes minimal time to deploy.
--
I agree that as a sole developer it can add initial complexity. As a sole developer myself on prior projects and now on personal projects, I use Docker primarily to streamline my deployment practices.
Another excellent use case of Docker not mentioned elsewhere in this thread is the simplicity of running databases locally, e.g., mongodb, postgres, etc.
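e.g. a throwaway local Postgres is a single command (the name, password and image tag are illustrative):

docker run --rm -d --name dev-pg -p 5432:5432 -e POSTGRES_PASSWORD=devpass postgres:15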
Tests running with this kind of difference implies a misconfiguration or a whole lot of difference in how the test pipeline is set up. You shouldn't blame it on docker before investigating the issue.
> Having to "docker exec" to get a shell in a running container vs just running the command (in dev) or sshing into the server (in prod).
If this bothers you, take a look at some 3rd party docker management UI such as portainer (vanilla docker) or k9s (kubernetes). These tools will let you navigate and launch a shell in your container quickly. Very useful if you have tons of apps running on your node.
> Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twiddling my thumbs.
In my case, I don't use docker in development phase. I test in local environment and only build the images when it's ready for deployment.
> Having to even spend time learning how to debug Python code inside a locally running docker container.
I never have to do this anymore. I just hook up sentry or newrelic and they'll log exception stack traces that I can use to figure out the issue without live-debugging the app.
> Having to deal with disk space issues caused by old docker images on my 500gb hdd.
Yes, disk usage is one of the drawbacks of using docker. Pruning images is an especially big pain on busy servers with spinning rust. On SSDs, pruning is not as slow, though.
> or in production, do things such as inspecting log files
Logs should be sent out of the container, either directly (mount a /logs folder, push them to Sentry / ELK / etc.) or by simply logging to stdout and having Docker send the logs where you want.
> monitoring system resources
Docker processes are still processes. They show up in `top` and friends just fine.
> running necessary commands if there is something urgent...?
`docker exec my_container <insert command here>`.
Although I've never even considered doing that. For the past few years the resolution to a production bug has always been "rollback to the previous image, then if any data got borked fix it in the database".
Everything you write is possible, but - and again in every single step of your daily dev work - more tedious than doing things directly. It's just not worth it for me.
But it also sounds like you are operating within a larger organization. So your requirements may be different from mine.
Because it codifies things and ensures people don't make a giant mess of intertwined crap that takes days/months to separate due to random global dependencies such as a cron/script/mount/etc.
It's not tedious at all and it saves potentially blowing up your system with crap splattered all over your filesystem. We have tons of WSL2 devs and Linux people and it takes care of the "it works on my machine" problem once and for all.
This insistent push that the old way was good, and that the new thing isn't worth learning five new invocations for, just doesn't line up with the needs of today.
Docker pull, docker exec, docker ps, docker logs and you've pretty much got what you need for ninety percent of your job.
This stuff is not hard. You make it hard for yourself by digging in.
"New is good" is also no general justification. It always depends on the context. Several replies in this thread mention very good use cases where Docker makes sense. From what you wrote, it sounds to me like it also makes sense in your environment.
Yes I mean I don't want to have to learn things that don't bring me value. But it's not about learning the commands. It's about having to repeat them over and over again in my daily work. About the associated mental burden "am I in the container now? Is it running?" And about everything taking longer, be it running tests in Docker or pushing to a registry and waiting for the new container to be spawned. As a single dev, and I feel this is where your and my requirements differ, it simply is not worth it.
I echo this sentiment entirely. Multi-dev-environment, multi-machine / OS things really just work. Docker et al. also really shine with onboarding; the new recruit can literally get up and going in minutes.
I would not want to go back to the old ways of doing things.
Docker on Mac and Windows is so slow when used for a development environment but there are enough differences that it makes it worth using. It’s more portable than VMs (vagrant and such) and if you’re using your CI pipeline to build an image then deployment is a snap.
I much prefer building a Docker image and pushing it somewhere compared to tarballs of the repo or repo access. I would rather build rpms or debs than just push the repo around. Container tooling makes that kind of stuff nice in my opinion.
You agree that Docker is "so slow" and point to its benefits for preventing differences between dev and prod. But you don't mention what those differences are. What are they?
I also don't see why you'd use a CI pipeline for simple projects where you are typically the only developer. Run tests locally, if everything is fine push to prod.
I use git with a small script to roll out to production. When I execute the script on the server, it stops all services, pulls the latest code from Git, applies Django migrations, and starts everything up again. I use desk [1] to make this as simple as a single `release` command in my local shell.
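The script itself is nothing fancy; a rough sketch (service names and commands assumed for illustration, not quoted from the actual script):

#!/usr/bin/env bash
set -e
systemctl stop gunicorn   # placeholder service name
git pull
.venv/bin/pip install -r requirements.txt
.venv/bin/python manage.py migrate
.venv/bin/python manage.py collectstatic --noinput
systemctl start gunicorn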
For multi tenant deployments. Each tenant gets its own docker-compose in a directory on prod and voila: 100% same code base (same Docker image) and good separation between tenants
Multi-tenant deployments with Docker make sense _only_ if you trust all of the tenants, since it is trivial to take control of the host if you have write access to Docker socket.
You may say that it can be mitigated with some wrapper scripts with limited commands, but then you have to maintain them and we can all agree that homebrew security is very hard to do correctly.
Using Docker will provide you a complete runtime env which will in most cases 'just work' on a variety of hosts. It's easy to reproduce and easy to distribute. You can easily setup other dependencies and connect everything together using docker compose, creating a portable 'production env'. And that's important when there's > 1 person working on a project.
And it won't break the host machine because someone ran sudo pip install -r requirements.txt by mistake.
Yep. I agree. Docker has its place, but nothing wrong with keeping everything one layer short. It's how I run my side-projects. Just a small Bash setup.sh script and you are good to go. Docker solves the annoyance of system dependencies, but you still have to deal with it in your Docker build.
As I always say, know what you are doing + know how to debug everything. If somebody is more comfortable with Docker, I think there's no problem there either :). Sometimes it solves a real pain, sometimes it doesn't (having an Elixir release or a Java jar means I don't really need containers).
If you are on Docker, I recommend looking into Podman (maybe even together with runc -> today trending on HN). Don't run your containers under root and run your app inside the container under a specific user as well.
Shameless plug: I am now writing a book on application deployment and will show both approaches, Docker and Docker-less[0].
For me it’s actually more about isolation and security. I touch and build a lot of random code, and docker gives me some good assurances that a transitive dependency of a dependency somewhere in the pits of NPM install hell won’t do things like install malware on my computer.
Sandboxing/isolation is a very good mitigation against software supply chain worries IMO
With docker, I can move or upgrade the host server without worrying about compatibility of my apps. Before using docker, every time I migrate to a new version of os or database (which I do every 3-4 years per host), I would often run into small problems in some old apps that must be fixed, which took time because I had to dig into the app again and re-learn the context before I can patch the small issue. After moving everything to docker, it's no longer a problem. As long as the host can run docker (be it docker compose or kubernetes) my apps would run on it.
You still need to do those updates inside your container. I deal with a number of clients who like Docker because it gives them "a stable environment", but they don't plan image upgrades.
I work on products based on the official Python Docker image, but the base image containing the version of Python they want was released in April and hasn't received updates since. Developers forget that releasing software as Docker images means that they are now responsible for patch management.
Yes, but now I can do the update in a separate schedule instead of being forced to do it right there and then when performing host upgrade or migrating the app to a different host.
It's not about docker per-se but repeatable, immutable infrastructure, a great boon to productivity and portability to local or cloud environment. "Cattle vs. pets"
Sure, you can do similar with ansible or puppet, but use something.
SSHing into boxes to fix things wasn't good enough in the 90's, one reason I looked into building .debs back before the turn of the century. cough
Well, maybe you are careful and meticulous and are the only person to deploy. But when you are working in a team where anybody can deploy, maybe not everybody is as meticulous as you. Using Docker makes it easy for anyone to roll back anything (yeah, yeah, you know those kinds of issues that should not happen but still do ...), and that's a game changer.
I had great use of Docker even in my personal RPI projects, as Python modules can be difficult to get compiled and in a usable state on raw Raspbian. In projects at work I personally only recommend Docker when we need Hyper-V isolation or K8s is hosting the product (these are .NET projects, which work really well when published as self-contained binaries).
If you're not using docker then you're building in production instead of in CI. Also, that means you're maintaining a deployment instead of building a fresh one every time you deploy. This means leftover files will pile up and that is certainly not tested upstream at any time: tests run on fresh ephemeral deployments in general.
Thank you! I wish more projects had things like Ansible modules or similar. You can run it for your local dev on vagrant images and deploy it on the server directly. Fewer layers. Fewer differences between environments. No messing with docker bypassing ufw and other system configs.
Docker makes deployments easy. In almost all situations I've encountered, it's easier to 'git pull foo && cd foo && docker-compose up' than it is to install and configure a mishmash of individual services to work in tandem.
A Dockerfile may be complex for any piece of the stack. But now this complexity is not part of deploying the application. Each component has its own build process which is decoupled from the others. And even a really complex installation process results in a deployment process that works the same way.
If you can't see these benefits then maybe you haven't had to deal with complex software installation. Or maybe you haven't had to run the same stack on your laptop that you run in production. But if you ever get to the point that you want to deploy something in more than one place, or more than once to a single place, it really makes things much easier.
Docker itself isn't going to increase your hosting costs. If you rent a server for $20 a month, it's still $20 a month with or without Docker being installed.
Disagree. Not using docker I have to learn about the specifics of how the hosting company runs my software. With docker it's 30 lines I can copy from an earlier project and I have full control.
Docker compose and docker machine make deployments so simple. For development I don’t want all my side projects to share the same database, similar to them using the same python install.
I can run disparate environments across multiple language runtimes without having to fight version issues or wrangle multiple VMs. That’s why, for me at least.
"Just use the same Python version and a roughly similar PostgreSQL version in development and production, and you're good to go" -> And that's more time than it would take to re-use that Dockerfile and docker-compose that i literally wrote once and go to once in a while.
I get your point but don't think you're getting mine. In a bigger project / organization? Yes, let's have those processes and tools. But for simple apps as described in the article? Use a correspondingly simple solution. As always in software development, it's all about context.
I get your point, and I disagree on opinion. I and many others have had success using containerized devenvs on projects both large and small, and have likewise felt some pain with respect to repeatability when not - especially with the Python stack. Containers are synonymous with repeatability. Your future self is just another collaborator, and they’ll appreciate it down the line when they’ve got a new laptop and new env, for example ;)
I agree. With you and with the other opinions (why should only one perspective be right?).
I've been the lead developer on teams where I introduced Docker to solve consistency/reproducibility issues in AWS and Azure.
I've also done smaller applications in DotNet Core, Go, Node, Python, and Ruby. In those cases I've used other alternatives, including:
- Known Linux version, with git push-to-deploy (my favourite)
- Packer (the server), with embedded codebase (still quite simple)
- Docker (for most non-trivial deployments)
- Known Linux version, with chef or ansible (as an alternative to Docker)
- Terraform the machine, upload the codebase, run scripts on the server (ugh)
Every method had its place, time, and reason. If possible, for simplicity, I'd go with the first option every time and then the others in that order.
The thing is, though, I may have an order of preference but that is totally overridden by the requirements of the project and whether or not the codebase is ever to be shared.
For solo projects and small sites, I've not benefited from Docker as I have never had any server/OS issues (and I've been doing dev stuff for decades).
However the moment there was a need for collaborators or for pulling in extra dependencies (and here is the crunch point for me) such as headless browsers or other such 'larger' packages, then I would move on to either Packer/Terraform for fairly slow-changing deployment targets or Docker for fast-changing targets, as otherwise I inevitably started to find subtle issues creeping in over time.
In other words keep it simple while (a) you can and (b) you don't need to share code, but complexity inevitably changes things.
In my personal experience, there are a few reasons for using Docker:
1) By installing packages on the system directly, you can't be sure that a future update won't break it. For example, when using Tomcat or a similar application server, new updates sometimes deprecate or disable functionality from the older versions, or add new configuration parameters that the configuration from the older versions won't have, thus leading to weird behaviour. This will be especially noticeable if you set up development, test or production environments some time apart, or in different locations, where the available packages may not be 100% consistent (I've had situations where package mirrors were out of date for a while).
2) Furthermore, if you maintain the software long term, it's likely that your environments will have extremely long lists of configuration changes, managing which will be pretty difficult. This results in the risk of either losing some of this configuration in new environments, or even losing some of the knowledge over time, if your approach to change management isn't entirely automated (e.g. Ansible with the config in a Git repo and read-only access to the servers), or you don't explain why each and every change is done.
3) Also, it's likely that if you install system packages (e.g. Tomcat from standard repositories instead of unzipping it after downloading it manually), it'll be pretty difficult for you to tell where the system software ends and where the stuff needed by your app begins. This will probably make migrating to newer releases of the OS harder, as well as will complicate making backups or even moving environments over to other servers.
4) If your application needs to scale horizontally, that means that you'll need multiple parallel instances of it, which will once again necessitate all of the configuration that your application needs to be present and equal for all of them. You can of course do this with Ansible, but if you don't invest the time necessary for it, then it's likely that inconsistencies will crop up. In the case of Knight Capital, this caused them to lose more than 400 million dollars in less than an hour: https://dealbook.nytimes.com/2012/08/02/knight-capital-says-...
Edit: It was stated in this case that "scale is not an issue" and therefore this point could be ignored. But usually scale isn't an issue, until it suddenly is.
5) Also, you'll probably find that it'll be somewhat difficult to have multiple similar applications deployed on the server at the same time, should they have any inconsistencies amidst them, such as needing a specific port, or needing a specific runtime on the system. For example, if you've tested application FOO against Python 3.9.1, then you'll probably need to run it against that in production, whereas if you have application BAR that's only tested with Python 3.1.5 and hasn't been updated due to a variety of complex socioeconomic factors, then you'll probably need to run it against said older runtime.
6) Then there's the question of installing dependencies which are not in the OS's package repositories, but rather are available only via npm (for Node.js), pip (for Python) or elsewhere, where you're also dealing with different mechanisms for ensuring consistency, and making sure that you're running exactly the version that you tested the application against can be a bit of a pain. For an example of this going really wrong, see the left-pad incident: https://www.theregister.com/2016/03/23/npm_left_pad_chaos/
Essentially, it's definitely possible to live without containers and still have your environments be mostly consistent (for example, by using Ansible), but in my experience it's just generally harder than to tell an orchestrator (preferably Docker Swarm/Hashicorp Nomad, because Kubernetes is a can of worms for simple deployments) that you'd like to run application FOO on servers A, B and C with a specific piece of configuration, resource limits, storage options and some exposed ports.
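For illustration, telling such an orchestrator about application FOO can be as small as this compose snippet, deployed with `docker stack deploy` (the image, ports and limits are placeholders):

version: "3.8"
services:
  foo:
    image: registry.example.com/foo:1.2.3
    ports:
      - "8080:8080"
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "0.50"
          memory: 256M
    volumes:
      - foo-data:/data
volumes:
  foo-data: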
I feel like this is largely an issue of tooling and our development approaches. NixOS attempts to solve this, but I can't comment on how successful it is: https://nixos.org/