Putting your strings to work: Make your SKU strings more meaningful

At Known Decimal, we work with a lot of e-commerce companies. Some are small (for now) and some are rather big. Even though they differ in size, they all have one thing in common: they’re successful in selling products.

Whatever the product might be, it can come in different variations: different colors, different sizes, or even variants that are only used to produce another product. Each of these specific variants has a unique identifier: a SKU (Stock Keeping Unit).

In a database, a SKU is often stored as a simple, unique String and is indexed properly for fast lookups. However, this simple string value contains way more information than just a unique identifier. That SKU string can contain information about the type of product, about the region the product is sold in, whether or not it’s the final product, etc. Enter the Primitive Obsession code smell.

With Primitive Obsession, we depend too much on a primitive (e.g. a string or an integer) to store information, and the meaning hidden inside that primitive is lost to the rest of the code. When working with a SKU in a codebase, we don’t want this information to be lost.

In this post, we’ll look at how to turn a simple String value into something more valuable, how to fix the Primitive Obsession code smell, and how to query these rich data objects in the database.

What does a SKU look like?

Before we start, let’s give some examples of what a SKU might look like:

  • p1234 for a product that is sold in any of the worldwide stores.
  • p1234eu for a product that is sold in the European Union (EU).
  • p1234copy for a product that’s being used to produce the final product (e.g. p1234).

As the SKU is a unique identifier for a specific variant, we will store the SKU on the Variant class, which might look like this:

# app/models/variant.rb
# 
# == Schema Info
#
# Table name: variants
#
#  id                  :integer(11)    not null, primary key
#  sku                 :string         not null
class Variant < ApplicationRecord
end

When extracting information from a SKU, one might be tempted to add the extraction methods to the Variant model, resulting in calls like variant.eu_sku?.

However, this would pollute the Variant model with a lot of methods that are not core to the Variant model in the first place. Let’s take a look at how we can fix that.

From a string to a PORO

We’re going to turn the sku into a PORO (a Plain Old Ruby Object) for a more natural API and to avoid polluting our Variant model. Turning an attribute into a PORO is fairly easy. However, we need to remember to fetch the raw value from the attributes hash. If we called sku inside the sku method instead, the method would keep calling itself.

# app/models/variant.rb
class Variant < ApplicationRecord
  def sku
    Variant::Sku.new(attributes.fetch('sku'))
  end
end

# app/models/variant/sku.rb
class Variant::Sku
  def initialize(sku)
    @sku = sku.to_s
  end
end

Now that we’ve turned our String into a PORO, we will see something funny. Rendering Variant#sku in our interface (e.g. <%= @variant.sku %> in a view) now outputs the PORO’s default object representation instead of the SKU itself.

Let’s quickly fix that by adding one method to our PORO.

# app/models/variant/sku.rb
class Variant::Sku
  # ... 

  def to_s
    @sku
  end
end

If we now render Variant#sku in, for example, a view, it simply outputs the variant’s SKU as a string, as expected.
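To see why a single to_s method is enough for rendering, here is a small Rails-free sketch. The empty Variant class is only a stand-in for the ActiveRecord model, and the sketch uses ERB from Ruby's standard library rather than a real Rails view, so treat it as an illustration rather than a description of how ActionView works internally.

```ruby
require "erb"

# Stand-in for the ActiveRecord model, so the sketch runs without Rails.
class Variant; end

class Variant::Sku
  def initialize(sku)
    @sku = sku.to_s
  end

  def to_s
    @sku
  end
end

sku = Variant::Sku.new("p1234eu")

# String interpolation calls #to_s on the interpolated object.
label = "SKU: #{sku}"

# ERB's <%= %> tag converts the expression's result with #to_s as well.
html = ERB.new("<span><%= sku %></span>").result_with_hash(sku: sku)

puts label
puts html
```

One thing to_s does not cover is comparison: without a custom == on the PORO, an expression like variant.sku == "p1234eu" compares two different objects and is false. Calling to_s first (or defining == on Variant::Sku) takes care of that.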

Extract information out of a SKU

As we now have an easily testable and isolated data model of a SKU, it’s time to turn the PORO into something more useful and extract information from the SKU.

# app/models/variant/sku.rb
class Variant::Sku
  # ...

  COPY_SUFFIX = 'copy'
  EU_SUFFIX = 'eu'

  def copy?
    @sku.end_with?(COPY_SUFFIX)
  end

  def eu?
    @sku.end_with?(EU_SUFFIX)
  end

  def retail?
    !copy? && !eu?
  end

  def to_copy
    return @sku if copy?

    [@sku, COPY_SUFFIX].join
  end

  def to_eu
    return @sku if eu?

    [@sku, EU_SUFFIX].join
  end
end

By adding these methods to the Variant::Sku model, we’re enhancing the maintainability and the readability of our code in several ways:

  1. We’re not polluting the Variant model with extra methods regarding the SKU.
  2. We can test the code for the SKU in isolation from the Variant.
  3. We have a more natural API for checking what type of SKU we’re dealing with (e.g. variant.sku.copy?).
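As a quick illustration of points 2 and 3, the PORO can be exercised without loading Rails or touching the database. The sketch below repeats the class so it runs on its own; the empty Variant class and the describe helper are hypothetical stand-ins added only for this example.

```ruby
# Stand-in for the ActiveRecord model, so the sketch runs without Rails.
class Variant; end

class Variant::Sku
  COPY_SUFFIX = 'copy'
  EU_SUFFIX = 'eu'

  def initialize(sku)
    @sku = sku.to_s
  end

  def to_s
    @sku
  end

  def copy?
    @sku.end_with?(COPY_SUFFIX)
  end

  def eu?
    @sku.end_with?(EU_SUFFIX)
  end

  def retail?
    !copy? && !eu?
  end

  def to_copy
    return @sku if copy?

    [@sku, COPY_SUFFIX].join
  end

  def to_eu
    return @sku if eu?

    [@sku, EU_SUFFIX].join
  end
end

# Hypothetical helper: names the kind of SKU we are dealing with.
def describe(sku)
  return "copy" if sku.copy?
  return "eu" if sku.eu?

  "retail"
end

# Classify the three example SKUs from the start of the post.
kinds = %w[p1234 p1234eu p1234copy].map do |raw|
  [raw, describe(Variant::Sku.new(raw))]
end.to_h

kinds.each { |raw, kind| puts "#{raw}: #{kind}" }

worldwide = Variant::Sku.new("p1234")
puts worldwide.to_eu
puts worldwide.to_copy
```

Note that to_copy and to_eu return plain strings rather than Variant::Sku instances. If you want to keep asking questions of the result, wrap it again with Variant::Sku.new.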

Querying the PORO in ActiveRecord

By adding the Variant::Sku#to_s method, we enabled the entire codebase to continue using the Variant#sku as usual. Every single view in our Rails application continues to render the SKU as a string.

However, we also broke the way that we perform queries on the database using ActiveRecord.

$ variant = Variant.first
=> <Variant @id=1 @sku=p1234 />

$ Order.where(variant: { sku: variant.sku }).count
=> TypeError: Cannot visit Variant::Sku
from /usr/local/bundle/gems/arel-5.0.1.20140414130214/lib/arel/visitors/visitor.rb:28:in `rescue in visit'
Caused by NoMethodError: undefined method `visit_Variant_Sku' for #<Arel::Visitors::DepthFirst:0x0000aaaaf66f2b50>
Did you mean?  visit_Arel_Attributes_Boolean
from /usr/local/bundle/gems/arel/lib/arel/visitors/visitor.rb:22:in `visit'

The problem is that Arel, which ActiveRecord uses to build its queries, doesn’t know how to handle our newly created SKU PORO. Arel knows how to handle standard types like a string, an integer, or an array, but not our PORO.
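The method name in the error message, visit_Variant_Sku, hints at what is going on: the visitor looks for a method named after the class of the value it is given. The sketch below is a toy visitor, not Arel's actual implementation (which is considerably more involved). ToyVisitor and its two visit_ methods are hypothetical and exist only to mimic that lookup.

```ruby
# Stand-in for the ActiveRecord model, so the sketch runs without Rails.
class Variant; end

class Variant::Sku
  def initialize(sku)
    @sku = sku.to_s
  end

  def to_s
    @sku
  end
end

# Hypothetical, heavily simplified visitor that dispatches on the class name.
class ToyVisitor
  def visit(object)
    method_name = "visit_#{object.class.name.gsub('::', '_')}"
    unless respond_to?(method_name, true)
      raise TypeError, "Cannot visit #{object.class}"
    end

    send(method_name, object)
  end

  private

  # The visitor only knows about a couple of built-in types.
  def visit_String(value)
    "'#{value}'"
  end

  def visit_Integer(value)
    value.to_s
  end
end

visitor = ToyVisitor.new
sku = Variant::Sku.new("p1234")

# A plain String is a known type, so this works.
quoted = visitor.visit(sku.to_s)

# The PORO has no matching visit_ method, so this raises.
error_message =
  begin
    visitor.visit(sku)
  rescue TypeError => e
    e.message
  end

puts quoted
puts error_message
```

This also suggests a stopgap: passing variant.sku.to_s to the query means the query builder only ever sees a plain String. It works, but it pushes the conversion onto every caller.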

Creating an ActiveRecord type

Luckily, Rails gives us an easy way to query our newly created PORO through the ActiveRecord::Type class. With this technique, we can (1) keep using the PORO in the application, (2) query the database as if the SKU PORO were a string, and (3) clean up the Variant class even more, because the hand-written sku method is no longer needed.

Credit where credit is due: Bart de Water reminded me about the powerful ActiveRecord::Type API. It was a crucial piece in coming up with this more usable API for SKUs. Thank you, Bart!

First, we will need to implement our own ActiveRecord::Type. This ActiveRecord::Type needs to have two instance methods: cast and serialize.

  • cast is called whenever a value is assigned to the attribute and, through the default deserialize implementation, whenever the data is fetched from the database. In our case, we want to use this to turn our simple SKU into a PORO without having to define it on the model.
  • serialize is called whenever the data is being persisted to the database or whenever it’s being used to query data in the database. In our case, we want to turn our PORO into a string again.

# app/models/variant/sku_type.rb
class Variant::SkuType < ActiveRecord::Type::String
  def cast(value)
    Variant::Sku.new(value)
  end

  def serialize(value)
    value.to_s
  end
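end

NOTE: One could be tempted to leave out the serialize implementation, as we’re already inheriting from ActiveRecord::Type::String. As far as I can tell, though, the inherited method only stringifies a few built-in values (such as numbers and booleans) and hands other objects over unchanged, so for the blog post, I wanted to be explicit.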
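One detail worth spelling out: cast does not only receive raw strings. When you assign one variant's SKU to another (variant.sku = other_variant.sku), it receives a value that is already a Variant::Sku. Because the PORO's initializer calls to_s on its argument, both cases work. The sketch below shows the round trip without Rails. SkuTypeSketch is a hypothetical stand-in that has the same two methods as Variant::SkuType but does not inherit from ActiveRecord::Type::String.

```ruby
# Stand-in for the ActiveRecord model, so the sketch runs without Rails.
class Variant; end

class Variant::Sku
  def initialize(sku)
    @sku = sku.to_s
  end

  def to_s
    @sku
  end

  def eu?
    @sku.end_with?('eu')
  end
end

# Hypothetical stand-in: same two methods as Variant::SkuType, no Rails parent.
class SkuTypeSketch
  def cast(value)
    Variant::Sku.new(value)
  end

  def serialize(value)
    value.to_s
  end
end

type = SkuTypeSketch.new

from_string = type.cast("p1234eu")   # a raw string value
from_sku    = type.cast(from_string) # an already-cast PORO

puts from_sku.eu?
puts type.serialize(from_sku)  # PORO back to a plain string
puts type.serialize("p1234eu") # plain strings pass through unchanged

# nil is turned into a PORO wrapping an empty string.
blank = type.cast(nil)
puts blank.to_s.empty?
```

The last lines show an edge case to be aware of: with this implementation, nil is cast to a SKU wrapping an empty string and serialized as "" rather than NULL. That is harmless for our not-null sku column, but for a nullable column you would probably want cast and serialize to let nil through.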

Now that we’ve created a new type, we need to make the model aware of this. We can do this in two ways:

  1. By initializing the Variant::SkuType on the Variant class itself.
  2. By registering the Variant::SkuType as a new type globally in an initializer.

As this Variant::SkuType is very specific to the Variant model, it makes sense to initialize it on the Variant model itself. However, if the new type were more broadly applicable, one could register the type globally (see “Creating Custom Types”).

# app/models/variant.rb
class Variant < ApplicationRecord
  attribute :sku, Variant::SkuType.new
end
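For completeness, the second option (registering the type globally) could look roughly like the sketch below. The initializer file name and the :sku symbol are my own choices for illustration. Depending on your Rails version and autoloading setup, referencing an application class from an initializer may need some extra care.

```ruby
# config/initializers/types.rb (hypothetical file name)
# Registers the type under a symbol so any model can refer to it.
ActiveRecord::Type.register(:sku, Variant::SkuType)

# app/models/variant.rb
class Variant < ApplicationRecord
  # Looks up the registered type by its symbol instead of instantiating it here.
  attribute :sku, :sku
end
```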

Now, we’ve come full circle. We’ve turned a simple string value into a rich PORO while (1) maintaining its previous querying abilities, (2) isolating all the code related to a SKU into its own space, (3) retaining a SKU's information and avoiding the Primitive Obsession code smell, and (4) making our new code easy to test.

Not bad at all for a "string"!

Want to dig a little bit deeper into the ActiveRecord Attributes API? Check out the wonderful documentation: https://api.rubyonrails.org/classes/ActiveRecord/Attributes/ClassMethods.html