Why taxonomy is the most effective growth tactic for large catalogue Shopify stores
I have been in eCommerce for almost 20 years. In that time, and collectively across our team, we have worked with hundreds - if not thousands - of stores.
Without a doubt, one of the most important lessons that we have learned is that the most overlooked lever for sustainable growth isn’t higher ad spend, better creative, or a new app.
Instead, the most fundamental and powerful driver of scale is the structure of the store itself: taxonomy.
While the word sounds like a dry exercise for 18th-century biologists, it's actually the most human part of eCommerce. Taxonomy is the practice of identifying, naming, and grouping things based on shared characteristics - creating a mental shorthand for the infinite complexity of the products we sell.
Most brands treat their navigation like a filing cabinet - a place to put things away. But your customers don't want a filing cabinet; they want a personal shopper. When your taxonomy is broken, you aren't just "disorganised" - you’re actively hiding products from people who want to buy.
This is why, for a large-catalogue Shopify brand, getting taxonomy right is the foundation for growing overall revenue by at least 20% year on year.
We'll cover the precise impact on revenue in detail later in the article, but first, let's look at what taxonomy really is and why it's so important.
1. The world’s biggest brands are built on depth of taxonomy
Some of the most successful brands in history were not built on unique products, but on the superior categorisation of existing ones. Platforms like eBay, Amazon and Walmart have achieved success because they mastered the ability to help people find exactly what they are looking for with minimal friction.
Take this example from eBay's developer documentation. The first line is stating how buyers typically drill down through categories to find what they're looking for.
And here's an example of Amazon's nested menu structure. Notice how deep it goes and how easy it is to get to the broad category that you're looking for in just a few clicks.
In these billion-dollar organisations, this is the work of dedicated merchandising teams and data scientists who treat taxonomy as a core product feature, not an afterthought. They understand that if a customer can’t find it in a few clicks, it doesn’t exist.
However, the principles remain the same whether you are a global conglomerate or a boutique store with 250 products. Over the years, we’ve worked with brands at every stage of this journey - from those just starting out to teams who were early employees at giants like ASOS and Gymshark.
That experience has given us a unique under-the-hood look at how the biggest names in the business organise data. What we’ve learned is that scale is relative, but logic is universal.
2. The history of taxonomy is a journey from nicknames to standardised data
There is an inherent tension between the human and machine sides of eCommerce (and this is only getting bigger now that agentic commerce is emerging), but taxonomy is the bridge that resolves it.
In a literal sense, taxonomy is the practice of classification: identifying, naming, and grouping things based on shared characteristics. It's how we create a mental shorthand for the infinite complexity of the products we sell.
The Linnaean System was nature's first "database"
The most famous example of this is the Linnaean System, a biological hierarchy created by Carl Linnaeus in the mid-1700s. Before Linnaeus, naming a plant or animal was a descriptive mess of nicknames that was impossible to scale.
He solved this by creating a structured, "machine-readable" language for nature - an inverted pyramid that moved from broad kingdoms down to specific species.
If you are interested in the deeper - and often messy - history of how we have tried to put the world into boxes for centuries, this podcast episode is a fantastic deep dive.
In modern eCommerce, we use the same logic to move from a broad "department" down to a specific SKU. This hierarchy transforms your store from a flat list of products into a structured database.
Why this matters for the future of AI search
The reason we look at this history is that it perfectly mirrors the direction of the next decade of search. In traditional SEO, we focused on "top-down" optimisation - stitching keywords onto pages. However, AI discovery requires a bottom-up approach. Instead of fixing the page at the end of the line, we are optimising the product data at the source.
Just as Linnaeus replaced nicknames with data, we are moving away from "keyword updates" toward technical merchandising. AI engines don't just "match keywords"; they attempt to "reason".
When an AI agent is asked to find "the best value dog food," it isn't looking for a slogan; it’s looking at your ingredients list and price-per-gram to build a justification for its recommendation. Taxonomy is the map that allows that machine to understand exactly where your product fits in the world.
3. Fixing your taxonomy has a real impact on revenue
We’ve established that taxonomy is the logical foundation of your store. But for a business owner or marketing director, the most important question is: What is the actual ROI of being more organised?
To understand this, we have to look at Shopify differently. We treat Shopify as the single source of truth - effectively, the data warehouse for your entire product catalogue.
When you optimise your taxonomy within Shopify, you aren't just changing a menu on a website. You are enriching the data that flows downstream to every other channel you use to grow.
Taxonomy is the first domino. When it's poorly structured, it triggers a chain reaction of inefficiencies and lost opportunities. On the other hand, when it's fixed, those same channels begin to compound in your favour.
As you can see in the visualisation above, a lack of granularity in Shopify doesn't stay in Shopify. It poisons the well for everything else:
- Paid media & social: Poor data leads to Merchant Center disapprovals, catalogue rejections on Meta, and mismatched price/stock data. This results in ineffective ad targeting and wasted spend.
- SEO: Without clear categorisation, you face crawlability issues and "collection bloat," which prevents you from ranking for the specific terms your customers actually use.
- Operations & automation: If your data is messy, you can’t automate your workflows. You end up relying on manual tagging and unscalable processes, leading to high operational costs.
- The future of AI search: AI search engines rely on structured data to "reason." Weak taxonomy leads to "hallucinated" details or, worse, your products being excluded from AI-generated answers entirely.
The financial formula for growth
We know from working on hundreds of similar projects that the impact of fixing this structure is predictable. Once you have fixed your tracking to ensure you have a clean baseline of data, we typically look at two key benchmarks:
- The organic baseline: For a healthy Shopify store, organic revenue should represent roughly 20% of the overall revenue mix.
- The growth forecast: By moving from a "flat" structure to a deep, granular taxonomy, it's entirely reasonable to forecast a 100% improvement in organic revenue.
The result? Doubling a channel that makes up 20% of revenue adds roughly 20% to overall revenue, purely from the improved impact of organic search.
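The arithmetic behind that figure can be sketched in a few lines. This is a minimal illustration, assuming every non-organic channel stays flat; the `overall_uplift` function and the example revenue figure are hypothetical, not taken from a real store.

```python
def overall_uplift(organic_share: float, organic_growth: float) -> float:
    """Overall revenue uplift when only the organic channel grows.

    organic_share: organic revenue as a fraction of total revenue (0.20 = 20%).
    organic_growth: fractional growth in organic revenue (1.0 = 100%).
    Assumes every other channel stays flat.
    """
    return organic_share * organic_growth


# Hypothetical store turning over 1,000,000 a year.
baseline_revenue = 1_000_000
uplift = overall_uplift(organic_share=0.20, organic_growth=1.0)
new_revenue = baseline_revenue * (1 + uplift)

print(f"Overall uplift: {uplift:.0%}")
print(f"New revenue: {new_revenue:,.0f}")
```

Changing either input shows how sensitive the forecast is: a store where organic is only 10% of the mix would see half the overall uplift from the same organic growth.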
The compounding effect
Here is the most exciting part: the 20% figure is actually a conservative estimate. That calculation only accounts for the gains in organic search. It doesn't factor in the reduced wasted spend in PPC, the higher conversion rates from better internal search, the lower bounce rates from improved navigation, or the massive reduction in "manual work" for your team.
4. The problem with most taxonomies: data quality and the lack of granularity
If taxonomy is so powerful, why do so many stores get it wrong? In our experience, it usually boils down to two things: poor quality product data and a failure to go deep enough.
Most brands fail to realise that their store structure is only as good as the data feeding it. If your product information is thin - missing materials, specific intents, or technical attributes - you cannot build a granular structure. You are forced to stay "shallow" because your data doesn't support anything else.
We find a useful way of thinking about this struggle is through the lens of a "desire path."
In urban planning, architects design formal paved roads, but people often ignore them to take the most efficient route, wearing dirt trails into the grass. These trails are the physical manifestation of human intent.
Most Shopify stores are built as rigid, shallow maps. We create broad collections like "Men’s Footwear" because that’s how we think the world should be organised. However, your customers are walking on the grass. They are searching for "comfortable shoes for standing all day" or "wedding shoes that won't scuff". When you don't have the granularity in your product data to "pave" those paths, you lose the customer.
- The map (your current store): Rigid navigation menus that don't reflect how people actually look for products.
- The territory (the reality): Search engines now infer meaning rather than just matching keywords. If your data is brittle, a machine can't "reason" its way to your products.
5. Build for the gaps before the data catches up
Where merchants often stall is waiting for "enough data" before they build a new category. We argue for the opposite: build the structure based on logic, even if the products aren't all there yet.
We can learn a lot from the history of the periodic table. Before 1869, chemistry was a disconnected mess of facts.
When Dmitri Mendeleev organised the elements, he didn't just list what was already known. He understood the underlying logic so deeply that he famously left gaps in his table.
He prioritised the structural pattern over the missing data, even predicting the properties of elements like "Eka-silicon" fifteen years before they were discovered.
| Atomic weight | Symbol | Element |
| --- | --- | --- |
| 28.1 | Si | Silicon |
| ? | ? | Eka-Silicon |
| 118.7 | Sn | Tin |
When we build a robust taxonomy for a Shopify store, we aim for something similar. By defining clear, granular attributes - material, weight, intent, and persona - we create a structure that is bigger than the sum of its parts.
- A well-structured taxonomy helps search engines predict the right result even for specific trait combinations you haven't explicitly named.
- By focusing on the logic of how people think, you build a foundation that lasts, rather than constantly reacting to messy, surface-level metrics.
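As a rough illustration of why granular attributes go further than a fixed list of collections, the sketch below filters a tiny, made-up catalogue by any combination of traits, including combinations nobody built a collection for. The product data and the `find_products` helper are hypothetical.

```python
from typing import Any

# Hypothetical catalogue: each product carries granular attributes
# instead of relying on a single broad collection.
CATALOGUE: list[dict[str, Any]] = [
    {"title": "Oxford Leather Shoe", "material": "leather", "intent": "wedding", "persona": "men"},
    {"title": "Cushioned Work Trainer", "material": "mesh", "intent": "standing all day", "persona": "men"},
    {"title": "Patent Court Heel", "material": "patent leather", "intent": "wedding", "persona": "women"},
]


def find_products(catalogue: list[dict[str, Any]], **traits: str) -> list[str]:
    """Return titles of products whose attributes match every requested trait."""
    return [
        product["title"]
        for product in catalogue
        if all(product.get(key) == value for key, value in traits.items())
    ]


# A trait combination that no predefined collection covers.
print(find_products(CATALOGUE, intent="wedding", persona="men"))
```

The point of the sketch is that nothing called "men's wedding shoes" had to exist in advance; the combination falls out of the attributes, much like the gaps in Mendeleev's table.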
6. Putting this into practice for a large catalogue Shopify store
While all of this makes sense in principle, the real question is: how do we put it into practice?
Before looking at the specific workflows, there are a few central principles that are important to cover. The first, as we have touched on, is the idea of using Shopify as your point of truth.
When you optimise your product data at the source, you are making sure that every downstream channel - from SEO to your Merchant Center feed to your internal site search - is working from the same high-quality foundation.
To maintain that point of truth, we follow three core rules.
1. Stick to the default structure wherever possible
The closer you stay to Shopify’s native logic, the easier it is for your data to flow. For example, Shopify has its own Standard Product Taxonomy.
By mapping your items to these global categories, you are using a language that is already "machine-readable".
Keeping to the defaults means knowing where your data belongs. This ensures that information is fed correctly into AI platforms and marketing feeds. For instance:
- Category structure: We use the native Shopify category fields whenever possible to ensure data flows predictably between channels.
- Product Type: This should always be filled out accurately, as it is a primary signal used by AI platforms to understand what you sell.
- Metafields vs. Tags: As our Head of Delivery, Lauren Harris, notes in her guide, you should use Metafields for the permanent foundation (the facts of the product) and Tags only for merchandising (temporary labels like "Summer Sale").
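One way to picture that split is as a product record where permanent facts and temporary labels live in separate places. The sketch below is illustrative only: the `Product` class, its field names, and the metafield keys are hypothetical and do not mirror Shopify's actual API objects, and the category path is just an example value.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    title: str
    category: str      # path in a standard taxonomy (example value only)
    product_type: str
    metafields: dict[str, str] = field(default_factory=dict)  # permanent facts
    tags: set[str] = field(default_factory=set)               # temporary merchandising labels


boot = Product(
    title="Men's Brown Leather Boot",
    category="Apparel & Accessories > Shoes > Boots",
    product_type="Boots",
    metafields={"material": "leather", "colour": "brown"},
    tags={"Summer Sale"},
)

# When the sale ends, only the tag changes; the facts are untouched.
boot.tags.discard("Summer Sale")
print(boot.metafields)
print(boot.tags)
```

Keeping the two apart means a seasonal campaign can be added or removed without ever touching the data that feeds and search engines depend on.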
2. Map and transform your existing data
Once you have aligned with the default structure, the task becomes a mapping and transformation exercise.
If the data is present but in the wrong place, such as buried in product descriptions rather than metafields, this is where AI becomes a genuine asset. It can act as a transformation engine, scanning your existing content to extract the data and mapping it into the correct place.
On the other hand, if the data is fundamentally missing, this becomes an operational challenge. We often find that critical information isn't "lost" - it just isn't being pulled through from a PIM correctly, or it lives exclusively in the heads of your product team.
These are specific challenges that need to be addressed before any technical work begins. You cannot prompt-engineer your way out of a missing fact; if the information isn't in your business, it won't be in your database.
Part of our process is identifying these gaps and helping you build the rigour needed to capture that data at the source.
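To make the mapping step concrete, here is a deliberately simple sketch. It uses plain regular expressions as a stand-in for the AI extraction described above, so it is not how a production transformation engine would work; the `extract_attributes` function, the material vocabulary, and the attribute keys are all hypothetical.

```python
import re

# Hypothetical vocabulary of values we expect to find in descriptions.
KNOWN_MATERIALS = ("leather", "suede", "canvas", "mesh")


def extract_attributes(description: str) -> dict[str, str]:
    """Pull structured facts out of free-text copy; missing facts stay missing."""
    attributes: dict[str, str] = {}
    text = description.lower()

    # Record the first known material mentioned as a whole word.
    for material in KNOWN_MATERIALS:
        if re.search(rf"\b{material}\b", text):
            attributes["material"] = material
            break

    # Matches weights written like "450g" or "450 g".
    weight = re.search(r"\b(\d+)\s?g\b", text)
    if weight:
        attributes["weight_grams"] = weight.group(1)

    return attributes


print(extract_attributes("A hand-stitched suede boot, weighing just 450g."))
print(extract_attributes("Our best-selling boot."))
```

The second call returns an empty dictionary, which is the point made above: if the fact was never written down, no extraction step can recover it.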
3. Customise only where the platform has limitations
We only step away from Shopify’s native features when the platform's default logic starts to act as a ceiling. While we keep to standard structures for your data wherever possible, there is one specific area that requires a more custom-engineered approach: its flat URL structure.
Shopify was designed for simplicity, which is excellent for smaller catalogues. However, it lacks native sub-categories. This creates a technical bottleneck as a brand scales. Here's why:
- By default, when a customer applies a filter, the URL changes, but the page content - the headings, descriptions, and metadata - stays exactly the same. Search engines cannot "see" the specific context (like "Men’s Brown Boots"), meaning these high-intent pages never get indexed.
- Because these filtered views do not create unique, indexable pages, you are effectively blocked from targeting the specific, granular keywords your customers are actually searching for.
To overcome this, we build the kind of parent-child category structure you would expect from other platforms. We bridge this structural gap using a few specific technical solutions:
- Secondary navigation & related collections: We build custom modules that allow you to link parent and child categories directly. This goes back to the "desire paths" idea we covered earlier - here you're effectively creating these paths for both users and Google. You can find a deeper dive on this in our guide to Related Collections.
- Custom breadcrumb solutions: Since Shopify does not natively track a hierarchical path, we implement custom breadcrumbs. This gives search engines a clear map of exactly where a product sits within your taxonomy.
- Custom schema: We use custom schema to explicitly define these relationships in the code. This helps AI-driven search engines "see" the hierarchy that a flat URL structure usually hides.
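As a sketch of how the breadcrumb and schema pieces fit together, the example below walks a parent-child map and emits schema.org `BreadcrumbList` markup. It is an assumption-laden illustration rather than our production implementation: the collection handles, titles, `example.com` domain, and helper functions are all hypothetical, and in a real store the output would be rendered by the theme rather than a Python script.

```python
import json

# Hypothetical parent-child map: child collection handle -> parent handle.
PARENTS = {
    "mens-brown-boots": "mens-boots",
    "mens-boots": "mens-footwear",
    "mens-footwear": None,
}
TITLES = {
    "mens-footwear": "Men's Footwear",
    "mens-boots": "Men's Boots",
    "mens-brown-boots": "Men's Brown Boots",
}
BASE_URL = "https://example.com/collections/"  # placeholder domain


def breadcrumb_trail(handle: str) -> list[str]:
    """Walk up the parent map and return handles from root to leaf."""
    trail = []
    current = handle
    while current is not None:
        trail.append(current)
        current = PARENTS[current]
    return list(reversed(trail))


def breadcrumb_schema(handle: str) -> dict:
    """Build a schema.org BreadcrumbList for the given collection."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,
                "name": TITLES[item],
                "item": BASE_URL + item,
            }
            for position, item in enumerate(breadcrumb_trail(handle), start=1)
        ],
    }


print(json.dumps(breadcrumb_schema("mens-brown-boots"), indent=2))
```

The same parent-child map can drive the visible breadcrumb, the related-collection links, and the structured data, so the hierarchy is defined once and reused everywhere.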
Final thoughts
Technical limitations aside, the most important shift you can make is a mental one. Taxonomy isn't an administrative chore; it's a discipline. It's the practice of being granular, consistent, and organised enough to respect how your customers actually think.
If you can move away from being a digital librarian - simply filing things away where they "belong" - and become a technical merchandiser, you unlock a level of growth that ad spend alone can’t buy. You aren't just building a menu; you are building a paved road that leads directly to your checkout.
By matching your store’s data to the "desire paths" of human search intent, you ensure that your products are not just "on the site," but are actively discoverable. In an era where AI is starting to do the shopping for us, having a structured, logical, and deep taxonomy is no longer a "nice to have" - it’s the only way to stay on the map.