Opinions on starting new backend software projects


I’ve been writing software professionally for more than five years now. Each time I git init a new project, lay out a directory structure, and deploy it, I’m standing on the shoulders of previous knowledge. I like to think that each time I do it better.

Here are my thoughts on starting new code projects, the ones I wish I had when I started. Consider it an ever-growing argument with myself about “software best practices” that you have the privilege to witness.

Assumptions and Scope

This is limited in scope to what I have experience with, so this article is making some assumptions about what kind of software you are writing:

  • You are trying to write software to solve general business cases, not more tightly scoped projects like a fast json parser, vim plugin, or physics engine.
  • You or a single small team will be doing most of the work across frontend, backend, and infra; and you are fully or partially responsible for making decisions about its backend architecture and infra.
  • You want to ship quickly and do not have much administrative overhead (ie not public sector or regulated fields).
  • You are optimistic that your codebase will continue to grow for years, and that more and more people will join the team to help you.

This is based on what I have learned from experience and have had success putting into practice. Much of the advice here is based on mistakes I made and subsequent solutions that made things easier. Even if you are operating outside my assumptions, most of this should still be useful.

General Opinions

Minimize complexity

Software is really hard! It’s one of the most complex things to work with. Libraries and frameworks change constantly, and we don’t have the luxury of centuries of experience like bridge-builders designing civic infrastructure. Every dependency you introduce adds more layers to the call stack, reaching across millions of lines of code. A project can interact with a browser engine, GPU rendering, database drivers, consensus algorithms, interpreters, ancient astrophysics libraries, and constantly-patched npm modules.

Do your brain a favor and make the system simple so you can focus on the functionality and user experience.

Don’t confuse “simplicity” with code golf or spending hours and hours refactoring to make a perfectly polished codebase. Think more about the quantity and complexity of services and modules required for the project to work.

Code/Repo Structure

A Modular Monolith

A single codebase and a single deployment should be the default. It can reach out to external services, but until you have a good reason otherwise, that “reach” should come from one deployment.

Start with a monolith until you need something else. If you have a few engineers, you can go very far with just one or two git repos. When you start scaling to multiple teams of engineers pushing hundreds of commits a day, it may become valuable to create a shared library and import it into more discrete services running in their own runtimes.

In my experience, each microservice ends up being a separate git repo, each requiring its own CI/CD pipeline, docker containers, maybe package releases, deployments, etc, which can get complicated. If you develop a modular monolith, you can either split these out later as needed or create a slightly more complex build process that deploys modules separately.

If you have a traditional frontend-backend split, like a kotlin backend and a react frontend, it may make sense to have a codebase for each. But you could still hypothetically do a monorepo, which is what I do for a personal project – a svelte codebase embedded in the project structure as a module, with a build step in my Dockerfile to pnpm install and pnpm run build.

There are downsides for larger projects, the biggest being bloat. If you need numpy/tensorflow for data jobs, this will increase the build time for all services and you may run into import conflicts where two libraries require different versions of dependencies. For languages that require a larger rebuild before each run, you may be waiting for it to build large dependencies. But in my experience the increased build time and image size is worth it for the simplicity it offers, and you can always split out later if needed or have individual builds per major module.

Additionally, keeping both backend and frontend together makes it less trivial to set up CI/CD, for example if you are trying to get a next.js project onto Vercel and a go project deployed to k8s.

A caveat is a shared services library or package that you will be publishing and others will be importing. In this case a very pared-down, clean structure with as few imports as possible is advised.

In 2015 when Building Microservices was published this may have been a hot take, but I think people are coming around to the advantages of monoliths.

Finally, the biggest reason I like monoliths is that I am creating a world of my own. I can write code for any purpose, share it between modules, and deploy it with no friction. Any tooling for this personal universe can be contained within a single directory. This is a powerful concept for the practice of creating code.

“Bucket of top-level modules”

Organize code at its highest level by service, not by technology, tech abstraction, or your org chart. A lot of starter project structures will tell you to sort things into an MVC model: a folder for views, a folder for models, and a folder for controllers. So you would have payments split up across views/payments.py, models/payments.py, and controllers/payments.py. It feels tidy, but it’s better to organize by domain.

For my personal python project, I have a series of top-level modules. I like to think of it as a giant bucket of projects. For example, if I’m trying to write some quick ETL job for scraping real estate listings, I can create a realestate module. It has a view.py file with fastapi endpoints that are imported into the main fastapi router, a workflows/ folder containing workflows that I can import into the workflow worker, and the real estate scraping logic can import from scrape/string_utils.py. Super clean! It’s super quick to prototype things and import from other modules.
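To make that concrete, here is a rough sketch of that layout. The realestate and scrape modules are the ones from my example; everything else is illustrative:

    src/
        main.py                # entrypoint: builds the fastapi app and workers
        realestate/
            view.py            # fastapi endpoints, imported into the main router
            workflows/         # workflows, imported into the workflow worker
        scrape/
            string_utils.py    # shared scraping helpers other modules import from
        ...                    # more top-level domain modules as they come up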

Service Name Pattern

At its core, every service, daemon, program, etc is a single main() function. That function might spawn multiple processes, but it starts somewhere.

A pattern I like to use with monoliths or other software that has multiple services is what I’ll call the “Service Name” approach. I first saw this in temporal. Essentially, you have a codebase that may contain multiple services – ETL worker, api webserver, cli, a frontend webserver, etc – and you assign each a service name. Then, in your main function, you take a service name or list of service names from an environment variable or other external configuration, and start each service specified.

This leads to a single binary or Dockerfile/image that can be started for a variety of uses. If for example you need to host separate kubernetes deployments for api/workers or run a job, you are simply working from a single codebase and single docker image and can manage which services are starting via configuration. Everything can share a CI/CD pipeline and single repo and you can choose what you’re deploying via an environment variable.
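A minimal sketch of the pattern in Python; the SERVICE_NAMES variable and the two service functions are hypothetical stand-ins for whatever your project actually runs:

    import asyncio
    import os


    async def run_api() -> None:
        ...  # start the api webserver here


    async def run_worker() -> None:
        ...  # start the ETL / workflow worker here


    SERVICES = {
        "api": run_api,
        "worker": run_worker,
    }


    async def main() -> None:
        # e.g. SERVICE_NAMES="api,worker" starts both services in one process
        names = os.environ.get("SERVICE_NAMES", "api").split(",")
        await asyncio.gather(*(SERVICES[name.strip()]() for name in names))


    if __name__ == "__main__":
        asyncio.run(main())

The same image can then back an api deployment, a worker deployment, or a one-off job, differing only in that one environment variable.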

Be OK with being a little messy (within modules)

Don’t worry as much about code quality, repeating code snippets, or correct naming of tokens like classes or filenames. Worry more about the boundaries between the code you write, the larger abstractions, and the interfaces the code has with the outside world. You can always refactor the small stuff! But the larger abstractions and interfaces are important to define well.

There is a tendency to create a util module that holds static functions like string utils, model utils, shared objects, etc. This is definitely an antipattern – ideally these would go in a module more related to their domain. “Util” is a technological concept, not a service or domain. But you know what? If it’s easy to refactor, just put it there and be ready to move it.

What’s more important is getting intra-project dependencies right, delineating responsibilities, keeping endpoint paths stable, and defining the names and scope of services and public functionality.

Obviously paying attention to code quality and “best practices” is important. But I think time is better spent working on providing value, or thinking carefully about things that are much harder to change.

Considering the Entrypoint

An example of the larger structure that is important to get right early is how dependencies are specified or injected into an application context.

It’s as simple as creating a function that starts your application, and passing dependencies into it.

For example, if you have an api application that aggregates routers, which in turn aggregate logic, having each of these layers accept its dependencies (rather than just instantiating clients globally within files) is a clean approach that avoids circular imports and surprises. But the clearest benefit is the ability to mock every level of the application stack during testing.
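As a sketch (assuming fastapi; the Dependencies fields and the payments router are placeholders for whatever your app actually needs):

    from dataclasses import dataclass

    from fastapi import APIRouter, FastAPI


    @dataclass
    class Dependencies:
        db: object      # e.g. an engine / session factory
        cache: object   # e.g. a redis or in-memory client


    def build_payments_router(db: object) -> APIRouter:
        router = APIRouter(prefix="/payments")

        @router.get("/")
        async def list_payments():
            # query via the injected db handle instead of a module-level global
            ...

        return router


    def create_app(deps: Dependencies) -> FastAPI:
        app = FastAPI()
        app.include_router(build_payments_router(deps.db))
        return app


    # in tests: create_app(Dependencies(db=FakeDb(), cache=FakeCache()))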

This isn’t hard to put into practice, but is annoying to implement after 100 commits. So while it’s ok to be messy, things like this are easy to start with and have long-lasting implications on application structure.

External Services, or “Someone else’s code”

Define the boundaries between runtimes carefully and generally

Abstract the connection between your business logic and state in a general way. That way, if you switch from say postgres to mongo, you only need to change one layer of the code – the data layer – rather than chasing changes through the rest of the codebase.

This is one of the places where you are doing your future self and future team a favor. Elsewhere in this article I tell you to minimize the number of external services you use, and I’ll recommend storing analytics data better suited to an OLAP db in an OLTP database like postgres. That advice holds on the condition that you handle this connection in a layer that can be migrated more easily in the future, when tens of developers might depend on it.
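One lightweight way to draw that boundary in Python is a small interface that business logic depends on, with the storage details hidden behind it. The names here are illustrative, not a prescription:

    from typing import Protocol


    class EventStore(Protocol):
        """The only surface business logic sees; the backing store can change."""

        async def record(self, name: str, payload: dict) -> None: ...
        async def recent(self, limit: int = 100) -> list[dict]: ...


    class PostgresEventStore:
        """Postgres-specific details live here and nowhere else."""

        def __init__(self, pool) -> None:
            self._pool = pool

        async def record(self, name: str, payload: dict) -> None:
            ...  # INSERT INTO events ...

        async def recent(self, limit: int = 100) -> list[dict]:
            ...  # SELECT ... ORDER BY created_at DESC LIMIT ...
            return []

Swapping postgres for something OLAP-shaped later means writing one new class, not touching every caller.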

Choosing boring libraries and services

Prefer boring libraries and services unless you have a great reason to use something more unstable. A safe bet is preferring the tech with ten times the github forks/stars/commits even if it doesn’t look as sexy.

This is especially true for the crucial software and services that are used all over your code and are hard to change: databases, ORMs, loggers, web frameworks, job/scheduling systems, cloud providers.

The word “boring” imo is overloaded and kind of faddish. But after enough time working with enough dependencies on enough projects, I truly believe that adopting any technology is adopting a complex liability which should be carefully considered by the entire team, and that “boring” is a positive signal.

I first learned this from an article Dan McKinley wrote during his time at etsy, which introduced the concept of ‘innovation tokens’. Read it in lieu of me explaining it here.

In the past I was burned by this on a project whose ORM was TortoiseORM rather than SQLAlchemy. SQLAlchemy has top-tier documentation, 6k+ github stars, more than a hundred active contributors, and a history of usage by other companies, while TortoiseORM has one guy (long2ice) doing most of the maintenance. We’ve run into lots of issues around its migration workflow where we really needed to dive into the code to get things working. It’s good software, but it doesn’t have the support that more mature tools like SQLAlchemy have. I think it was chosen originally because Django did not have async support, but now it does :)

There are obviously exceptions, like if you are trying to learn something new in a personal project or get a competitive advantage from something more greenfield. This is how I’ve been feeling about using Temporal over Airflow for workflow management: Airflow is objectively a more mature and industry-standard tool for batch workflows. It has a great UI. It’s got a bunch of integrations with other providers. But as the guy getting one of the two installed on a kubernetes cluster, I found temporal’s monolithic design a lot easier to deploy and run locally, and we had a lot of problems with Airflow’s dev usability. It was much easier to integrate into our codebase and start writing activities that we could put into workflows and add to workers.

For our case, we made the decision to use temporal and I think we made the right call. When we first chose it the temporal python sdk was in beta, but since we’ve continued using it we have seen adoption of temporal by other large companies like Yum Brands, coinbase, and datadog. Beyond ETL we can use it for lightweight async processing, like enriching data or populating analytics.

It was a gamble that we think is paying off. The project could have been abandoned, forcing us to migrate to something else a few years later. But it has enough support that we aren’t worrying about it.

Nevertheless this decision spent one of our innovation tokens! Luckily the rest of our stack is boring.

If you want to make this decision in a more data-driven way, you could look at some developer surveys. For example, if you were thinking about choosing a frontend javascript framework you could reference the State of JS survey’s Frontend Frameworks Ratios over Time section. It shows react in an obvious lead far above the others, with angular and vue in a group below it. If you’re unfamiliar with javascript but need to use it, the easy, boring, smart choice is react. If your project grows beyond your wildest dreams you will certainly be able to hire people who can work with this framework.

For my personal project I chose Svelte even though it was not in the top three of this list, mainly because I like Vue-like syntax but wanted something compiled (vs a virtual DOM, which I’m not a fan of) and component-based, and my use-case was browser only so I didn’t need to consider react native. Another reason was that Svelte is gaining traction compared to Vue and Angular, but that’s like betting on individual stocks and isn’t a great reason when choosing something for a larger, more important project. But I’m fine spending this innovation token since I’m happy with the framework, and since it’s a personal project I only have myself to consider.

The following are some questions I ask whenever considering a new technology. It isn’t a checklist, but all of these should be considered when choosing to add another critical technology or service for your business as a dependency:

  • Can existing tech in the stack handle it? Ie doing unstructured object storage in postgres instead of adopting mongodb, or writing a dumb cache table in postgres vs redis. This is the core question, and determining if you truly need a new technology comes from its answer.
  • Does it have wide adoption, ie a few books written about it? Bonus points if AWS or other cloud providers have a managed solution for it ala opensearch/elasticsearch or airflow. Another way to determine this is doing a search like “[tech] jobs”, which shows you which companies are using it in their real business.
  • If it does not have wide adoption, would you put your reputation on the line that it will have future adoption and support? For example in that innovation tokens article written in 2015 he uses NodeJS as an example of a new risky tech, but it would now be considered stable and a smart choice for many projects.
  • If it is a very very small project with few active contributors, would you feel comfortable forking it if you needed to?
  • Did it emerge from a clear business need in a successful company? For example Airflow came from the early days of airbnb to handle their workflow automation, or kubernetes which was created to manage google’s compute scale. Bonus points if it is being actively developed by a successful company, like Apple and Cassandra.
  • Has it been adopted by or partnered with a software foundation? For example Tomcat being maintained by the Apache Software Foundation, ArgoCD in CNCF, or Let’s Encrypt’s collaboration with the Linux Foundation. This doesn’t guarantee eternal support but is an extremely strong signal.
  • Has it already gone through a few successful major versions? To me this indicates that the people working on it are comfortable making potentially-breaking changes.
  • Is it written in a mature language, ideally that aligns with the current stack and developer competencies?
  • Is it easy to hire for? For example it’s a lot easier to find people with experience with sql/postgres than mongodb or cassandra.
  • If it’s seen as an improvement on something widely used or stable (eg Airflow to Temporal for non-data-specific workflow orchestration), are you seeing successful adoption and migration from one to the other?
  • Can you find multiple live conference talks on youtube about it?

Remember that even if a library or service met every single one of these criteria, it would still add entropy to your project. It might also just not be the right tech for your needs. Pick and choose wisely!

Use services familiar to you

Beyond “choosing boring tech” just choose what works and gets the job done. If you have a ton of experience with Cassandra just choose that.

Infra / Deployment

I’m defining this as “what computers the runtime is executing on, and how it interacts with external services”. Another way to think of it is “everything that happens outside of my local development environment” or even “the cloud”.

For starting a small project, my advice for this is “go with the simplest solution for you, and don’t worry about scaling as long as most decisions are reversible”. This could be hosting it on a spare laptop at home behind dynamic DNS in a tmux session or using the Vercel wizard to connect a next.js gitlab repo to a domain.

Secrets and configuration

The code will probably need certain values, and you need to find a way to provide these values to the runtime without storing them in code. This could be because they should be secrets, or because the code might run in multiple circumstances that need different values.

Configuration can be as simple as some global “config” file that does a case-switch on a single environment variable.
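A minimal sketch of that idea; APP_ENV and the config fields are placeholders for whatever your project needs:

    import os
    from dataclasses import dataclass


    @dataclass(frozen=True)
    class Config:
        database_url: str
        debug: bool


    def load_config() -> Config:
        # the single switch everything else hangs off of
        env = os.environ.get("APP_ENV", "local")
        if env == "local":
            return Config(database_url="postgresql://localhost/dev", debug=True)
        if env == "production":
            return Config(database_url=os.environ["DATABASE_URL"], debug=False)
        raise ValueError(f"unknown APP_ENV: {env}")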

For secrets, I’d recommend using whatever you have in your cloud environment, ie gcp or aws secrets manager. If you’re self hosting or want to keep it even simpler, something like ansible vault can work too, where you have secrets encrypted at rest that are decrypted and applied in CI/CD.

But configuring secrets can really ruin my flow when I’m prototyping things. Needing to muck around in the secrets manager slows me down when I’m most in the flow of implementing a new feature. So what I like to do to keep my velocity going is to at first hard-code everything in variables like DANGEROUS_POSTGRES_PASSWORD while I’m prototyping, then when I properly deploy it I generate a new secret and get it in my secrets manager. It’s also a good opportunity to do a test-run of setting and rotating the secrets in a new environment.

Kubernetes

I like kubernetes. In my opinion, it strikes a good balance between usability and complexity, providing primitives for the usual clustering problems.

I’m biased though – my first job was managing VMs with puppet, ansible, and a bit later docker-compose, and when I later used kubernetes it felt like “all that VM functionality but with a standard api across cloud providers”. Using containers forces me to keep deployments stateless which mitigates an entire class of issues, and makes rollbacks or managing networking a lot easier when it’s all via one interface.

I can fire up k9s and get an interface to all my pods, port forward services to my local desktop, tail all the pods in a deployment, or shell into a container to run a migration, see why some service is misbehaving, or execute some code in a REPL in the production environment. Any decent language will have a client for the kubernetes api for simple infrastructure as code.

But it’s complex. And expensive – the control plane will inevitably cost money, and having a container continuously running costs more than triggered code like lambda functions. So my recommendation is not to use it until you are running into problems with scaling, configuration, and networking.

If you’re making some simple project and don’t already have a cluster, forget kubernetes until you need it. Consider it an innovation token. Use an EC2 instance, vercel, supabase, AWS Amplify, google app engine, or anything simple.

There’s nothing too wrong with deploying like it’s 1999 – manually packaging a binary, copying it from your dev machine to a debian stable EC2 instance via SFTP, editing a config file using vi in an ssh session, and then restarting the service with sudo systemctl restart is perfectly acceptable, and it could be automated with a script on your local machine.

When starting a project, just choose the simplest solution that works for you. But if you already have an EKS cluster lying around, why not use it; and as you scale to having more developers besides yourself doing deployments, it is worth investing in a simpler CI/CD process and migrating to a more advanced solution as it makes sense.

Multiple Environments

When starting out, don’t worry about dev/staging/prod environments. Think about how you would architect it, but it’s not necessary until you need it.

Personally I find it important to have a local development environment for working on projects in situations without internet, plus a production environment that interacts with any external services. When getting started, the ability to run code both in “airplane mode” and against production services is valuable.

When you eventually need a staging or dev environment, put in some effort to create that second environment using something like terraform or ansible (or at least document the process) so the abstraction of an “environment” is programmable and you could hypothetically spin one up with little work. A bucket of k8s yaml files is ok, but I always wish I had spent more time on infra as code.

Data / Databases

Use postgres until you can’t

Persist all data in a single postgres database. This goes along the lines of “choose boring technology” because SQL is ancient and postgres is feature-rich.

As you scale, you may be tempted to use something like elasticsearch for performing larger searches, storing objects in mongo for more advanced document retrieval, a custom vector database for LLM reference, or using a time series database for time series data. But unless you are at massive scale (which you probably aren’t yet) it’s ok to put it in postgres for now. You will know when you’ve reached the limits and need to migrate data elsewhere. Besides, postgres has enough extensions to go far.
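For example, document-style storage that might tempt you toward mongo can often just be a JSONB column. A sketch with SQLAlchemy; the table is hypothetical:

    import uuid

    from sqlalchemy import Column, DateTime, func
    from sqlalchemy.dialects.postgresql import JSONB, UUID
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()


    class Listing(Base):
        __tablename__ = "listings"

        id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
        created_at = Column(DateTime(timezone=True), server_default=func.now())
        # arbitrary document payload; GIN-indexable and queryable with JSON operators
        payload = Column(JSONB, nullable=False)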

Some exceptions I could imagine:

  • You have a very specific use case where you have big data, need a relatively low latency for operations, and you have complex analytical queries that don’t work well in a relational model. It may make sense to use something more column-oriented (OLAP) like clickhouse. But you can still go very very far with postgres!
  • You need to store very large objects. You could put this in a postgres blob, but a simple s3-like object storage solution is mature and industry-standard. It wouldn’t take much work to integrate. This is especially useful if you are storing files that will be downloaded semi-publicly, like uploading images that will be embedded in webpages or zip files that will be downloaded by users.
  • Your use case has immediate latency requirements that require you to keep data on the edge near your users. This could be accomplished by sharded postgres but there may be a simpler SaaS solution that abstracts the sysops of setting up data stores all over the world.
  • Short-lived, cached data may work better somewhere like redis.

I’d even go so far as to say “Use postgres until you know better.” If you have battlefield experience with postgres in a specific domain and all the problems were fixed by moving to something else, you know better than I.

ORMs are useful but not required

You can go very far just writing SQL. There are complex projects that do just fine by manually specifying sql calls for database operations within code, and manually writing migrations. The matrix homeserver implementation Synapse is a good example.

But personally I find ORMs useful for their migration tooling such as automatically generating migrations and scripts to move between migrations easily. It’s also great to use classes to both define the schema for the database and work with the schema in the code.

As for the benefit of being able to use multiple types of databases with an ORM: I don’t think this is really compelling, since most people won’t switch databases, and relying on database-specific functionality such as postgres extensions is worth the cost of sticking to one type of database.

Deleting strategies

Inevitably, you will need to recover deleted data. So define what you mean by “delete” early. This might even mean having multiple definitions of “delete”.

Make sure you are planning for this from the start, even if you don’t implement it immediately. At the very least make sure you can manually recover things from backups and data is not permanently lost. And when I say “make sure” I mean trying it out at least once.

One way to do this is making sure every object has deleted_at and modified_at columns via a mixin. This way, in the data access / repository layer, all your get() or collect() methods can include a WHERE deleted_at IS NULL.

Make sure you are writing tests that ensure this case because it’s very easy to forget to add this WHERE clause and accidentally fetch deleted items. It helps to bake this into a base orm object which includes this field and create abstract methods to enforce deleted_at for retrieval or deletion methods.
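A sketch of what the mixin and a “live rows only” helper can look like with SQLAlchemy; these names are mine, not a standard API:

    from datetime import datetime, timezone

    from sqlalchemy import Column, DateTime, select


    class SoftDeleteMixin:
        deleted_at = Column(DateTime(timezone=True), nullable=True)
        modified_at = Column(DateTime(timezone=True), nullable=True)


    def select_live(model):
        """Base query for 'not deleted' so the WHERE clause can't be forgotten."""
        return select(model).where(model.deleted_at.is_(None))


    def soft_delete(obj) -> None:
        # mark the row instead of issuing a DELETE
        obj.deleted_at = datetime.now(timezone.utc)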

If you don’t want to manage deleted_at in this way, another option is to store the object somewhere else before actually deleting it. This could be as simple as marshalling the db object into a json object and dumping it in a deleted_objects table. But this can be problematic if the schema changes, and it requires you to keep a reference to the resource type / table name.

I haven’t found a great solution for this yet.

Do not use sequential integer ids for database objects, or at least store them as strings

Use UUID4 or ulid. Using sequential integer ids can cause problems:

  • Attackers can arbitrarily guess ids when interacting with the api if they are sequential.
  • If the sequence is ever broken, there may be collisions for newly created environments with external services that associate based on an id.
  • Code assumptions around integers vs strings can bite you; for example, in datastores without type enforcement like mongodb, the string '88' and the integer 88 are different values and require casting.

If your solution needs sequential ids, consider ulid, which is a lexicographically sortable type of unique id.
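A minimal sketch of a UUID primary key with SQLAlchemy; the model is illustrative, and a ulid library’s generator could be dropped in as the default instead:

    import uuid

    from sqlalchemy import Column
    from sqlalchemy.dialects.postgresql import UUID
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()


    class Account(Base):
        __tablename__ = "accounts"

        # generated app-side, never guessable, safe to expose in api paths
        id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)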

There are exceptions to using uuid/ulid, for example ticketing systems where you may want a human-readable, sequential ID for tickets.

Use constraints carefully

Instead of defining column constraints in the code, try to do it on the database level. This will protect you if you forget about it later and add functionality to the data classes or interact with the database in other languages.

Personally I would draw the line at the well-known constraints, like UNIQUE, NOT NULL, and the obvious PRIMARY KEY. But I would not do more advanced usages of CHECK, which would be better represented in business logic unless it is a fundamental assumption of your domain model.

However, when you introduce soft deletion, you may run into problems. For example, if you have a like object for liking content and set a unique constraint over resource_id, resource_kind, and account_id, then when a like is deleted and re-added the insert would fail, since there is already a soft-deleted row occupying that unique combination. There are ways around this which may involve not using a plain unique constraint.
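One such workaround is a partial unique index that only applies to rows that haven’t been soft-deleted. A sketch with SQLAlchemy and postgres, using the columns from the example above:

    from sqlalchemy import Column, DateTime, Index, String
    from sqlalchemy.dialects.postgresql import UUID
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()


    class Like(Base):
        __tablename__ = "likes"

        id = Column(UUID(as_uuid=True), primary_key=True)
        account_id = Column(UUID(as_uuid=True), nullable=False)
        resource_id = Column(UUID(as_uuid=True), nullable=False)
        resource_kind = Column(String, nullable=False)
        deleted_at = Column(DateTime(timezone=True), nullable=True)

        __table_args__ = (
            # uniqueness is only enforced for rows that are still "live"
            Index(
                "uq_like_live",
                "account_id",
                "resource_id",
                "resource_kind",
                unique=True,
                postgresql_where=deleted_at.is_(None),
            ),
        )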

Tooling

For repetitive things like packaging, deploying, db migrations, tests, etc it’s usually worth writing some tooling to do it quicker. This also adds a layer of abstraction in the case that you need to change something like the test client or certain default arguments.

If you’re using a monolith with the service name pattern, I like to create a cli service that incorporates a popular cli library for your language.
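For example, with click (the commands here are placeholders for whatever your project repeats often):

    import click


    @click.group()
    def cli() -> None:
        """Developer tooling for the monolith: migrations, deploys, test helpers."""


    @cli.command()
    @click.option("--revision", default="head", help="Target migration revision.")
    def migrate(revision: str) -> None:
        """Run database migrations up to the given revision."""
        click.echo(f"migrating to {revision}")
        # e.g. shell out to / call your migration tool here


    if __name__ == "__main__":
        cli()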

As always, consult XKCD 1205 to see if it’s worth your time. Or just do it anyways because it’s more fun to write something than remembering command arguments ;)

Code Quality

It’s important to have clear, readable code that ideally doesn’t repeat itself. But if you’re trying to move quick, this should come second to shipping great code. In most cases it’s possible to achieve both :)

Lead by Example

In terms of balancing perfect code against shipping quickly, my heuristic is to keep the quality at the level I would want to see when joining a project. When someone with a fresh pair of eyes inevitably starts contributing to the codebase, they will use the code you have already written as a reference. If it’s clean and aggressively linted, they will likely do the same.

It’s easy to say “I’ll design this in a suboptimal way to save time but fix it later,” but if you don’t fix it then a future person could come in and use the same pattern, leading to messier and messier code.

Keep an eye on the stack size

If you are debugging code, take a brief look at the call stack. A deeper stack is typically worse and might indicate you should refactor.

Auto-Linting Enforces Code Quality

Binding an autofixing linter to my Actions on Save in IntelliJ was one of the best things I did for my coding workflow. It strictly and immediately enforces code quality without interrupting my flow.

If you’re using python, I’d recommend fully embracing ruff for both linting and formatting: specify as many rules as possible in the pyproject.toml, have the linter and formatter run on save, and run a check as a post-commit hook. It should be the same story for other languages.

When you’re fighting with a specific rule, be quick about adding a noqa directive, or disable the rule globally if you deem it not super helpful. You can also specify files the linter skips certain rules on, like forbidding asserts in production code but not in tests.

Staying in the flow is more important than nitpicky linting in most cases, especially if there’s an easy way to enable the rule later.

Make Testing Easy

I like tests. They make me more confident to make large changes to the codebase. Even a simple test that makes sure the project runs makes me feel warm and safe.

Through friends’ and my own experience I know of successful, enterprise codebases that do a lot and make good money which don’t have many (if any!!) tests. This scares me! But it works for them I guess.

It can be annoying/intimidating adding tests when you start a project from scratch, but a good way to jump in would be to unit test really simple functions that have clear expected inputs and outputs that you can parametrize. These kinds of tests are fun and simple to write and will get you started. Once you have some of these, getting mocking and test dependencies set up is a shorter step.
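For example, a parametrized pytest against a small pure function. The slugify import is a stand-in for whatever simple helper your codebase already has:

    import pytest

    # hypothetical: any small pure function in your codebase
    from myproject.scrape.string_utils import slugify


    @pytest.mark.parametrize(
        ("raw", "expected"),
        [
            ("Hello World", "hello-world"),
            ("  spaces  ", "spaces"),
            ("", ""),
        ],
    )
    def test_slugify(raw: str, expected: str) -> None:
        assert slugify(raw) == expected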

Getting tests into CICD can be more complicated when you have external services, but it’s probably worth it for larger projects with more people and more customers. But if you have just a few developers and a good release process to prevent bad commits from getting to prod, you can probably just trust people to run them locally and blame people when tests break.

Tests probably go best in their modules (like /project/src/payments/tests/ vs /project/src/tests/payments), which is something I see happening in rust and go. This makes it easier to split out modules when scaling and aids simple code navigation. Personally I do the latter out of habit but will start switching to the former for ease of refactoring.

References and Inspiration for Software Design Opinions

These opinions were formed mainly through three factors: Experience developing and scaling services, interacting with open-source libraries, and technical longform text (books and articles/blogs I find on hackernews). The following are some things that I found useful when constructing my opinions and this article.

See also