Why Great Programmers Obsess Over Data, Not Code

How to shift from a code-first to a data-first mindset and excel at programming

Jan 19, 2025

A misconception I had early in my career was obsessing over writing the 'cleanest' code possible.

I'd spend hours writing elegant functions, choosing the perfect variable names, and structuring code that would stand the test of time. I thought good code meant future-proof code.

But I was wrong.

My turning point came when I had to rewrite a CMS (content management system) due to the poor design of its data model and data structures.

This quote by Eric S. Raymond, author of The Cathedral and the Bazaar, resonated with me completely at the time:

Smart data structures and dumb code works a lot better than the other way around.

I now begin with “How should this data be structured?” instead of “How do I write this feature?”. I've found that the code often becomes obvious once the data structures are right.

In this post, I'll share how this mindset shift changed my approach to software development, and why it might be a game-changer for you too. I want to clarify that in the context of this post, data refers to databases, data models and data structures.

Linus Torvalds is the creator of the Linux kernel and Git (Image credit: Aditya Patange)

Why Data Matter More Than Code

1. Poor data design creates maintenance chaos

You may have heard this popular quote by Fred Brooks, the author of The Mythical Man-Month (1975):

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious.

The quote is still relevant after 50 years because well-designed data are the foundation of maintainable software. If you get the data model and its relationships wrong, no amount of clean code can make up for it.

As I mentioned in the intro, I had to rewrite a CMS because of its poor data model. It was painful both to maintain (for developers) and to use (for end users).

Every new feature required increasingly complex code to work around the limitations of our data model. Users had to update data in 4-5 different places just to change product data.

2. Poor data structure makes changes expensive

Whether it is a database, a data model or a data structure, changing the shape of the data costs more than changing code. The reason is simple: when a data structure changes, all the code that interacts with it has to change too, and any existing data has to be migrated.

For example, I once worked on a project where encoded images were stored as text in the database. This not only increased the database cost (an encoded image takes up a lot of space) but also created a bottleneck in data transfer (50+ images were transferred over the API).

This was clearly not a good data model for our use case, so we finally decided to move the images to a CDN and store only the path in the database. The data migration was costly, though: it took us almost two weeks.
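To make the difference concrete, here is a minimal sketch of the two record shapes, assuming Node.js. The field names (`imageData`, `imagePath`), the CDN URL and the choice of Base64 as the encoding are assumptions for illustration, not the schema of the actual project.

```javascript
// Hypothetical record shapes; field names and the CDN URL are illustrative only.

// Stand-in for a 300 KB image file (real bytes would come from an upload).
const imageBytes = Buffer.alloc(300 * 1024, 1);

// Before: the encoded image travels with every row and every API response.
const productBefore = {
  id: 1,
  name: "Sample product",
  imageData: imageBytes.toString("base64"), // assumes Base64 encoding
};

// After: the row holds only a path; the image itself is served by the CDN.
const productAfter = {
  id: 1,
  name: "Sample product",
  imagePath: "https://cdn.example.com/products/1.jpg",
};

// Compare how many bytes each shape adds to a JSON payload.
const sizeBefore = Buffer.byteLength(JSON.stringify(productBefore));
const sizeAfter = Buffer.byteLength(JSON.stringify(productAfter));

console.log(`with encoded image: ${sizeBefore} bytes`);
console.log(`with path only: ${sizeAfter} bytes`);
```

Base64 inflates binary data by roughly a third, so the first shape ends up larger than the image file itself, and that cost is paid again for every image in a response.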

3. Poor data modelling increases code complexity

I’ve noticed that a poorly designed data structure forces you to write complex code for simple operations. For example, consider the following case, where each article is stored as a string:

const articles = [
    "title:JavaScript Tips;author:1;tags:js,coding;status:draft",
    "title:Data First;author:2;tags:architecture;status:published"
];

If you need to get all published articles, the logic gets complicated. You have to iterate over the array, split each string on semicolons, split each piece again on the colon, and only then compare the status.
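A rough sketch of what that logic might look like is below. The `parseArticle` helper is a hypothetical name introduced for this example, and the `articles` array is repeated so the snippet runs on its own.

```javascript
const articles = [
  "title:JavaScript Tips;author:1;tags:js,coding;status:draft",
  "title:Data First;author:2;tags:architecture;status:published",
];

// Hypothetical helper: turns "key:value;key:value" into a plain object.
function parseArticle(raw) {
  const article = {};
  for (const pair of raw.split(";")) {
    // Split on the first colon only, so a value may itself contain colons.
    const separator = pair.indexOf(":");
    const key = pair.slice(0, separator);
    const value = pair.slice(separator + 1);
    article[key] = key === "tags" ? value.split(",") : value;
  }
  return article;
}

const published = articles
  .map(parseArticle)
  .filter((article) => article.status === "published");

console.log(published.map((article) => article.title)); // [ 'Data First' ]
```

Even this version is fragile: a title that contains a semicolon would silently break the parsing.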

However, when you change the data structure to an array of objects, the operation becomes simpler. You can simply use the array filter method to get the published articles.

const articles = [ 
  { 
    title: "JavaScript Tips", 
    author: { id: 1, name: "John" }, 
    tags: ["js", "coding"], 
    status: "draft" 
  }, 
  { 
    title: "Data First", 
    author: { id: 2, name: "Jane" }, 
    tags: ["architecture"], 
    status: "published" 
  }
];
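For comparison, the lookup against this structure could be sketched as follows. The `articles` array is repeated in compact form so the snippet runs on its own, and `published` is just an illustrative variable name.

```javascript
const articles = [
  { title: "JavaScript Tips", author: { id: 1, name: "John" }, tags: ["js", "coding"], status: "draft" },
  { title: "Data First", author: { id: 2, name: "Jane" }, tags: ["architecture"], status: "published" },
];

// One filter call replaces all the string splitting.
const published = articles.filter((article) => article.status === "published");

console.log(published.map((article) => article.title)); // [ 'Data First' ]
```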

The second approach is not just cleaner - it's more maintainable, easier to debug, and simpler to extend.

How to implement a data-first mindset?

Whether you’re working on a new project or simply adding a feature, it’s important to approach the task with a data-first mindset.

The following steps help you build a data-first mindset into your everyday workflow:

1. Start with data modelling, not coding

This is a crucial step because it forces you to deeply understand the problem domain before implementing the solutions.

The following guide (taken from AWS) provides four clear steps for data modelling:

Identify the entities and their attributes
Define the relationship between entities
Use data modelling techniques (Entity-relationship diagram, Object-oriented data modelling, Relational data modelling, Graph data modelling)
Optimise and iterate
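
As a rough illustration of the first two steps, here is how the entities and relationships of a small CMS might be written down as plain JavaScript data. The entity names, attributes and IDs are hypothetical; this is not the model from the CMS I rewrote.

```javascript
// Step 1: entities and their attributes (all names are illustrative).
const authors = [
  { id: 1, name: "John" },
  { id: 2, name: "Jane" },
];

const tags = [
  { id: 10, label: "js" },
  { id: 11, label: "architecture" },
];

// Step 2: relationships are expressed as references to IDs.
// An article has one author (many-to-one) and any number of tags (many-to-many).
const articles = [
  { id: 100, title: "JavaScript Tips", authorId: 1, tagIds: [10], status: "draft" },
  { id: 101, title: "Data First", authorId: 2, tagIds: [11], status: "published" },
];

// A quick integrity check: every reference must point at an existing entity.
const authorIds = new Set(authors.map((author) => author.id));
const tagIdSet = new Set(tags.map((tag) => tag.id));

const brokenArticles = articles.filter(
  (article) =>
    !authorIds.has(article.authorId) ||
    article.tagIds.some((tagId) => !tagIdSet.has(tagId))
);

console.log(brokenArticles.length); // 0
```

Writing the model down like this, before any feature code exists, makes missing attributes and awkward relationships visible early.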

You can use paper, a whiteboard or any online tool for data modelling. The process matters more than the tools here. However, make sure to document the result.

2. Consider how data will be accessed and queried

This step is critical because it bridges the gap between theory and practice. A perfect data model that performs terribly under real-world access patterns is useless.

For example, if users need to frequently find all orders for a customer, but your data model makes this expensive, you've failed regardless of how elegant the model looks.
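
A small sketch of the idea, assuming in-memory data and hypothetical `orders` records: if "all orders for a customer" is the common query, the structure can be shaped around it up front. In a relational database, the equivalent would typically be an index on the customer column.

```javascript
// Hypothetical order records.
const orders = [
  { id: 1, customerId: "c1", total: 40 },
  { id: 2, customerId: "c2", total: 15 },
  { id: 3, customerId: "c1", total: 25 },
];

// Flat array: every lookup scans all orders.
function findOrdersByScan(customerId) {
  return orders.filter((order) => order.customerId === customerId);
}

// Shaped for the access pattern: group once, then look up by key.
const ordersByCustomer = new Map();
for (const order of orders) {
  const existing = ordersByCustomer.get(order.customerId) ?? [];
  existing.push(order);
  ordersByCustomer.set(order.customerId, existing);
}

function findOrdersByIndex(customerId) {
  return ordersByCustomer.get(customerId) ?? [];
}

console.log(findOrdersByIndex("c1").map((order) => order.id)); // [ 1, 3 ]
```

Both functions return the same orders; the difference is that the second one does not get slower for each lookup as the total number of orders grows.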

In our CMS case, we stored article content, metadata, and relationships separately, thinking it would be more "organised". Instead, it made simple features like "show all articles by this author" painfully complex.

3. Review how data flows through your application

Our job as software engineers is to transform data, from binary into a user-friendly format. Understanding how data flows through the application helps you avoid common pitfalls such as over-fetching, unnecessary transformations and data integrity issues.

For example, consider the following flow:

Frontend -> API -> Database -> API -> Frontend

1. Frontend sends: "2024-01-19"
2. API converts to Date object for validation
3. Database stores as timestamp: 1705622400
4. API reads and converts back to Date object
5. Frontend receives and converts to "2024-01-19" again

The flow above converts the date between 3 different formats (string → Date → timestamp) and back again. Each conversion wastes processing time and introduces a chance of bugs.
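
Written out in JavaScript, the round trip might look like the sketch below. It assumes the date is meant as midnight UTC and that the database stores Unix seconds; the variable names are illustrative.

```javascript
// 1. Frontend sends a string.
const sent = "2024-01-19";

// 2. API converts it to a Date (date-only ISO strings are parsed as UTC).
const parsed = new Date(sent);

// 3. Database stores Unix seconds.
const stored = Math.floor(parsed.getTime() / 1000);

// 4. API reads the timestamp and converts it back to a Date.
const loaded = new Date(stored * 1000);

// 5. Frontend receives it and converts to a string again.
const received = loaded.toISOString().slice(0, 10);

console.log(stored);   // 1705622400
console.log(received); // 2024-01-19
```

This version happens to round-trip cleanly because every step stays in UTC. If step 5 used local-time getters such as `getDate()` instead, a user in a timezone behind UTC would see 2024-01-18, which is exactly the kind of bug each extra conversion invites.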

It would be a lot simpler to use a consistent ISO date format throughout the entire flow.

Frontend -> API -> Database -> API -> Frontend

1. Frontend sends: "2024-01-19"
2. API validates string directly
3. Database stores as: "2024-01-19"
4. API reads and passes through
5. Frontend uses "2024-01-19" directly
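
Step 2 of this flow could be sketched like this. `isValidIsoDate` is a hypothetical helper written for this example, not part of any library.

```javascript
// Hypothetical helper: checks that a value is a real calendar date in YYYY-MM-DD form.
function isValidIsoDate(value) {
  if (typeof value !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    return false;
  }
  // Round-trip through Date to reject impossible dates such as 2024-02-30.
  const parsed = new Date(value);
  return (
    !Number.isNaN(parsed.getTime()) &&
    parsed.toISOString().slice(0, 10) === value
  );
}

console.log(isValidIsoDate("2024-01-19")); // true
console.log(isValidIsoDate("2024-02-30")); // false
console.log(isValidIsoDate("19/01/2024")); // false
```

The Date object is used only inside the check; the value that moves between the frontend, the API and the database stays the same string throughout.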

Balancing data design with code quality

While data design is crucial, it doesn’t mean we should completely ignore code quality. The key is to understand where to invest your effort.

Here’s the priority order:

Get the data model (data relationship in the database) and data structure right
Make the data flow in the program clear
Focus on software design and code organisation

Code still matters when implementing complex business logic, handling edge cases, building reusable components, and so on.

The goal isn’t to choose between data or code quality - it’s to understand that good data design naturally leads to simpler code. When you get the data right, you spend less time fighting complexity and more time writing straightforward code.

I hope you find the post helpful. If you haven’t subscribed yet, please subscribe to read more articles like this.

The Driven Dev
