How I Work (Pt 1): Models & Database Design

Inspired by Jeffrey Way's Laracasts series 'Whatcha Working On', I wanted to write a series of articles not on some technical aspect of building websites & APIs, but on the decisions that shape how they're built. In this article I talk about identifying models and how to settle upon a relational database structure.

During the course of my career so far, I can’t begin to count the amount of time I’ve spent just thinking about how I’m going to build a website or piece of functionality. Decisions range from low-level code style choices, to picking one tool or framework over another. Now, especially in the context of new work, I spend more time thinking than I do writing code.

With this in mind, I wanted to write about problems I’ve overcome that I’m sure my less-experienced self would have appreciated knowing; It’s why this series of articles will be by design, highly opinionated. I don’t expect readers to my words as face value, nor do I want them to, only feel better equipped and more aware of the decisions they’ll likely have to make.

Models

A huge part of building the websites I work on involves choosing which data I need, how it’ll be sorted, and how it’s returned. In WordPress, data can be represented as a Custom Post Type but in MVC frameworks like Laravel we use Models.

If you need to brush up on your knowledge of MVC, a popular design pattern, there’s no shortage of online resources explaining the fundamentals. While the rest of this article won’t focus on Laravel specifically, I’d recommend reading the Laravel docs for more of an overview of MVC in the context of PHP.

An abstraction of database tables, models help us organise the way data is stored and requested in our applications. As the backbone of our applications, we should be able to strip back and simplify all functionality until we’re left with just its models. Identifying what they should be is something I see a lot of developers starting out stumble on (and certainly something I struggled with), but identfying how they relate to one another is even harder.

Forms first

Predominantly, we collect information from a website’s users through forms, the input of which we process and store. Building out these forms is a great way to think about which data we need, and how it should be represented in a database but most importantly how it’s going to be used.

Displaying fields on a page forces us to visualise how the data we’re collecting relates to the core actions of the website, in other words, what we’re trying to get our users to do. The same principle applies to much of the work we do in the back end; Building an interface that consumes data is a great way to conceive how to collect it.

Naming conventions

Something I initially struggled with was choosing a name for my models that best encapsulated their meaning. There are of course synonyms for many of the words in our vocabulary, but when naming models it’s important to choose a word whose meaning cannot be simplified, in other words, a word for which there are as few synonyms as possible.

Furthermore, it’s important to describe a model as a single instance of the ‘thing’ we’re describing, for example, a ‘product’ not ‘products’. I’d also advise deciding what all the models for an application or website will be before beginning, so as to not discover any conflicts in their meaning later on.

Relation me crazy

Having chosen the models that represent the data in our websites and applications, we often need to relate pieces of information to one another, like a comment to a blog post (each of which would be represented as their own models). Defining these relationships before we write any code helps us inform how to design our database. Once they are defined, we can more easily query our database and group together related data.

The most common kind of relationship is a ‘One to Many’ where one model ‘owns’ multiple instances of another model. An example might be the way a blog post ‘owns’ multiple comments, but a comment can only belong to a single blog post. Typically we define that relationship on both models, but different frameworks have different ways of doing this.

To think about how my models relate to one another, I usually first identify the ones without which my application would not exist. In, for example a blogging platform, this would be the posts themselves (‘Post’) and users (‘User’). First I identify the link between these intial models: A user can have many posts but a post can only belong to one user. I then begin to add the models that make up other parts of the application, like comments. With each new model, I define how it relates to the existing ones. Sometimes I find that in undergoing this process, my understanding of my existing models develops and the relationships have to be tweaked.

Database design

When thinking about how an application retrieves information from a database, I’m typically thinking about the type of data I want to return: primarily strings and integers, but sometimes I might need a date or Unix timestamp.

Planning is key

As much as going into a project with a strong understanding of what the core models will be is important (at least in MVC) it’s very useful to have a clear idea of what information specifically it will require initially. This reduces the likelihood that we need to change a table’s structure mid or post-project which might be confusing to those working on it and produce inconsistent result sets. For example, deciding on whether to use Unix or Datetime timestamps is a decision best made at the start.

The need for speed

While MySQL and other relational databases have functions we can use to manipulate data in queries, setting the correct data type for each column in a table allows us to query it in the most efficient way.

In MySQL for example, we sometimes may feel inclined to store dates in a VARCHAR column because of the flexibility of both Unix and Datetime values. However, to order a result set by a column whose date were stored in a string, one has to use MySQL’s STR_TO_DATE or FROM_UNIXTIME which is far slower to perform on larger results sets than it would to simply query a DATETIME or TIMESTAMP column. Similarly, querying or joining tables on an INTEGER data type is far more performant than using a string and we can order results based on their numerical value not length.


If it’s not already apparent, my closing advice to those learning about the role of database design in websites and web apps would be to plan ahead. It’s a much better idea to build on the foundation of a solid database design than to have to amend a project later when you discover conflicts. I’d also recommend playing around with data types and comparing the results of some basic queries.