Data Tier Routing Patterns

Data tier routing is the deterministic mapping of index shards to specific node roles across an index’s operational life. In production OpenSearch, routing is not a passive allocation heuristic; it is an actively enforced constraint matrix governed by Index State Management (ISM) phase transitions, shard allocation deciders, and Cross-Cluster Replication (CCR) topology boundaries. When routing drifts from intent, the symptoms are expensive and visible: unbalanced disk utilization on the hot tier, ISM transitions that stall mid-flight, UNASSIGNED shards after a tier migration, and CCR followers that fall permanently behind their leader. This guide operationalizes routing enforcement — exact node attributes, index templates, ISM allocation actions, verification commands, and Python automation for drift detection and remediation — building directly on the OpenSearch ISM Architecture & Fundamentals execution model.

Tier, hardware, and routing attribute alignment

Routing decisions only pay off when the node attribute you route to is backed by the right hardware. A shard tagged data: cold that lands on an NVMe hot node wastes premium storage; a hot shard forced onto HDD collapses ingest throughput. Align each tier to a storage profile, a compute ratio, and a single canonical routing attribute before writing any policy. The physical topology behind this table is covered in depth under Node Role Allocation.

Tier	Storage profile	vCPU : RAM ratio	Routing attribute	Primary workload
Hot	Local NVMe SSD	1 : 4 (compute-heavy)	`node.attr.data: hot`	Active ingest, rollover, real-time search
Warm	SATA/SAS SSD	1 : 6	`node.attr.data: warm`	Recent history, shrunk indices, occasional query
Cold	High-density HDD	1 : 8 (storage-heavy)	`node.attr.data: cold`	Infrequent access, force-merged segments
Frozen	Object storage / snapshots	1 : 8 (minimal compute)	`node.attr.data: frozen`	Searchable snapshots, archival retention

The attribute name is a convention, not a fixed keyword: node.attr.data is idiomatic, but node.attr.box_type or node.attr.temp work identically as long as the value referenced in every index template and ISM allocation action matches the value declared in opensearch.yml. Unlike Elasticsearch, OpenSearch does not provide dedicated data_hot/data_warm/data_cold/data_frozen node roles, so custom node attributes are the tiering mechanism; ISM’s allocation action targets index-level require/include/exclude filters, which key off those attributes.

How the allocation deciders resolve a shard

OpenSearch routes shards through an ordered decider pipeline evaluated synchronously during every cluster state update. Explicit allocation filters are matched against node attributes (node.attr.data, node.attr.zone, node.attr.rack_id) before placement is committed. Production deployments should prefer index.routing.allocation.require for strict tier enforcement, reserving include and exclude for flexible, capacity-driven placement.

During a tier transition the routing filter must shift atomically, or OpenSearch will thrash shards between nodes. Passing the FilterAllocationDecider is necessary but not sufficient: every decider in the chain can veto placement, so a shard can satisfy its require filter and still stay UNASSIGNED if cluster.routing.allocation.enable is set to primaries during a maintenance window (which holds back replicas) or if the target node has crossed its low disk watermark. Knowing which decider said no is what prevents allocation storms during peak ingestion, and it is the same decider chain that Hot-Warm-Cold Tier Design tunes for capacity. When a require filter can match no eligible node, Fallback Routing Strategies determine whether the shard degrades gracefully or blocks the transition outright.
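The veto behavior described above can be sketched as a toy model. This is an illustration only, not OpenSearch's decider code: the `Node`, `Shard`, and `veto_reasons` names are hypothetical, only three deciders are modeled, and the 0.85 low watermark is the stock default rather than a value from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    attrs: dict
    disk_used: float  # fraction of disk in use, 0.0 to 1.0


@dataclass
class Shard:
    index: str
    primary: bool
    require: dict = field(default_factory=dict)


def veto_reasons(shard, node, enable="all", low_watermark=0.85):
    """Return the reasons this simplified model would refuse placement."""
    reasons = []
    # Enable decider: "primaries" lets primaries through and holds replicas back.
    if enable == "none" or (enable == "primaries" and not shard.primary):
        reasons.append("enable")
    # Filter decider: every require attribute must match the node exactly.
    for key, expected in shard.require.items():
        if node.attrs.get(key) != expected:
            reasons.append(f"filter:{key}")
    # Disk decider: a node above the low watermark takes no new shards.
    if node.disk_used >= low_watermark:
        reasons.append("disk")
    return reasons


warm = Node("os-data-warm-01", {"data": "warm"}, disk_used=0.90)
replica = Shard("logs-prod-2026.07", primary=False, require={"data": "warm"})
print(veto_reasons(replica, warm, enable="primaries"))  # ['enable', 'disk']
```

The replica matches the warm filter yet is still refused twice, which is the situation _cluster/allocation/explain is meant to expose on a real cluster.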

As an index ages, ISM rewrites the allocation.require attribute at each phase, walking the shard down the tiers. In the policy used throughout this guide, an index starts with require.data: hot from its template, is restamped warm on entering the warm state and cold on entering the cold state, and is finally deleted; each stamped value is the contract the deciders above enforce on the next evaluation cycle. The order in which routing changes relative to shrink and force_merge is what keeps the migration clean; that sequencing is grounded in Index Lifecycle Basics.
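As a rough sketch, the age-to-state mapping implied by this guide's policy (transitions at 3, 14, and 90 days) can be written as a lookup. The `expected_state` helper is hypothetical and ignores ISM's job interval and action run time, so a real index can lag the state it returns.

```python
# Age thresholds copied from the transitions in this guide's policy,
# ordered from oldest to youngest so the first match wins.
STATE_THRESHOLDS = [(90, "delete"), (14, "cold"), (3, "warm")]


def expected_state(age_days: float) -> str:
    """Return the ISM state an index of this age should have reached."""
    for min_age, state in STATE_THRESHOLDS:
        if age_days >= min_age:
            return state
    return "hot"


for age in (0.5, 3, 20, 120):
    print(age, expected_state(age))
```

Comparing this expectation with the state reported by the ISM explain API is one way to spot an index that has stalled in an earlier phase.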

Step-by-step routing configuration

The four steps below move a logs-prod-* index set from hot to cold under enforced routing. Apply them in order: attributes on the nodes, a template so new indices start correct, a policy so aging indices migrate, and a verification pass.

1. Node configuration

Declare the tier attribute on every data node in opensearch.yml. The value here is the exact string every template and policy will reference — a single typo (warm with a trailing space, Warm with different case) produces silent allocation failures.

YAML

# opensearch.yml — one line per node, value matches its physical tier
node.name: os-data-hot-01
node.roles: [ data, ingest ]
node.attr.data: hot          # hot | warm | cold | frozen
node.attr.zone: us-east-1a   # optional: pair with allocation awareness

Confirm the attribute is live on the running node before proceeding:

Shell

# Every data node must report a data attribute; blanks mean the shard can never route there
curl -s "https://<cluster>:9200/_cat/nodeattrs?v&h=node,attr,value&s=attr" | grep -E "data|zone"
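The same check can be scripted against the JSON form of the endpoint (`_cat/nodeattrs?format=json&h=node,attr,value`), which avoids the case and whitespace mistakes that are easy to miss in a terminal. This is a minimal sketch: the `find_attribute_problems` helper, the `VALID_TIERS` set, and the sample rows are assumptions for illustration, and the list of expected data nodes has to come from your own inventory.

```python
VALID_TIERS = {"hot", "warm", "cold", "frozen"}


def find_attribute_problems(rows, data_nodes, attr="data"):
    """Flag data nodes whose tier attribute is missing or not an exact tier name.

    rows: parsed JSON rows, each with "node", "attr", and "value" keys.
    data_nodes: names of the nodes expected to carry the attribute.
    """
    seen = {row["node"]: row["value"] for row in rows if row["attr"] == attr}
    problems = {}
    for node in data_nodes:
        value = seen.get(node)
        if value is None:
            problems[node] = "missing attribute"
        elif value not in VALID_TIERS:
            problems[node] = f"unexpected value {value!r}"
    return problems


sample_rows = [
    {"node": "os-data-hot-01", "attr": "data", "value": "hot"},
    {"node": "os-data-hot-01", "attr": "zone", "value": "us-east-1a"},
    {"node": "os-data-warm-01", "attr": "data", "value": "Warm"},  # wrong case
]
expected_nodes = ["os-data-hot-01", "os-data-warm-01", "os-data-cold-01"]
print(find_attribute_problems(sample_rows, expected_nodes))
```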

2. Index template

Bake the baseline require filter into a template so freshly created (or rolled-over) indices land on the hot tier without waiting for ISM’s first evaluation cycle. This closes the window where a new index would otherwise allocate wherever capacity happens to exist.

HTTP

PUT _index_template/logs-prod-template
{
  "index_patterns": ["logs-prod-*"],
  "template": {
    "settings": {
      "index.routing.allocation.require.data": "hot",   // start on NVMe
      "index.number_of_shards": 3,
      "index.number_of_replicas": 1,
      "plugins.index_state_management.policy_id": "log-routing-policy"
    }
  }
}

3. ISM policy JSON

The ISM allocation action rewrites index.routing.allocation.require during each phase transition, so placement follows age without a single manual PUT _settings. Sequence it deliberately with the other actions in the state: shrink already gathers shard copies onto one node, so running it alongside the routing change avoids a second relocation wave, while force_merge merges segments in place and does not move shards, so it belongs after the allocation action, ideally once the shards have settled on the target tier.

HTTP

PUT _plugins/_ism/policies/log-routing-policy
{
  "policy": {
    "description": "Production log ingestion routing policy",
    "default_state": "hot",
    "ism_template": [
      { "index_patterns": ["logs-prod-*"], "priority": 100 }
    ],
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "rollover": {
              "min_index_age": "1d",
              "min_primary_shard_size": "30gb"   // roll before shards get unwieldy
            }
          }
        ],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "3d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "allocation": { "require": { "data": "warm" } } },  // route to SSD tier
          { "shrink": { "num_new_shards": 1 } }                 // relocation absorbs the routing shift
        ],
        "transitions": [
          { "state_name": "cold", "conditions": { "min_index_age": "14d" } }
        ]
      },
      {
        "name": "cold",
        "actions": [
          { "allocation": { "require": { "data": "cold" } } },  // route to HDD tier
          { "force_merge": { "max_num_segments": 1 } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ]
      }
    ]
  }
}
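Before uploading a policy like the one above, a small lint pass can catch the two mistakes that most often break routing: a require value that matches no tier label, and a transition that points at an undefined state. The `lint_policy` function below is a hypothetical helper written for this guide, not part of ISM, and it inspects only the inner "policy" object.

```python
def lint_policy(policy, valid_tiers):
    """Return human-readable problems found in an ISM policy body."""
    states = policy.get("states", [])
    names = {state["name"] for state in states}
    problems = []
    if policy.get("default_state") not in names:
        problems.append(f"default_state {policy.get('default_state')!r} is not defined")
    for state in states:
        # Every allocation.require value must be a known tier label.
        for action in state.get("actions", []):
            require = action.get("allocation", {}).get("require", {})
            for key, value in require.items():
                if value not in valid_tiers:
                    problems.append(
                        f"{state['name']}: require.{key}={value!r} is not a known tier"
                    )
        # Every transition must target a state that exists in the policy.
        for transition in state.get("transitions", []):
            if transition["state_name"] not in names:
                problems.append(
                    f"{state['name']}: transition to undefined state "
                    f"{transition['state_name']!r}"
                )
    return problems


broken = {
    "default_state": "hot",
    "states": [
        {"name": "hot", "actions": [], "transitions": [{"state_name": "warm"}]},
        {
            "name": "warm",
            "actions": [{"allocation": {"require": {"data": "Warm"}}}],
            "transitions": [{"state_name": "cold"}],
        },
    ],
}
for problem in lint_policy(broken, {"hot", "warm", "cold", "frozen"}):
    print(problem)
```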

4. Verification

Never assume the allocation action took effect — confirm both the declared setting and the actual node placement, because they diverge exactly when something is wrong.

Shell

# a) Confirm ISM wrote the expected require attribute for the current phase
curl -s "https://<cluster>:9200/logs-prod-2026.07/_settings" \
  | grep -o '"require":{[^}]*}'

# b) Confirm the shards physically sit on nodes of that tier
curl -s "https://<cluster>:9200/_cat/shards/logs-prod-2026.07?v&h=index,shard,prirep,state,node"

# c) Ask ISM where the index is in its lifecycle
curl -s "https://<cluster>:9200/_plugins/_ism/explain/logs-prod-2026.07?pretty"

Cross-cluster replication routing boundaries

CCR adds a second routing surface because follower indices inherit allocation settings from the leader. In multi-cluster deployments, routing patterns must account for network topology and storage-tier parity across clusters. If a leader index routes to data: hot, the follower must either mirror that routing or explicitly override it via index.routing.allocation.require on the follower side. The credentials and endpoint scoping that make cross-cluster calls safe are governed by Security & Access Boundaries.

Misconfigured CCR routing produces persistent UNASSIGNED shards whenever follower nodes lack matching attributes. To enforce deterministic placement in replicated environments:

Define identical node.attr.data labels across leader and follower clusters wherever tier parity exists.
Apply index templates on the follower cluster to inject require filters before replication initializes.
Avoid cross-tier replication where network latency exceeds the acceptable window for write acknowledgment.
Audit follower routing by checking index.routing.allocation.require on follower indices so overrides never conflict with leader metadata propagation.
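The parity check in the list above reduces to a set comparison between the tiers leader indices require and the tier labels present on follower nodes. The sketch below assumes both inputs have already been collected (for example from the leader's index settings and the follower's node attributes); `followers_at_risk` and the sample index names are hypothetical.

```python
def followers_at_risk(leader_require, follower_tiers):
    """Return leader indices whose required tier has no node on the follower cluster.

    leader_require: mapping of index name to its require.data value on the leader.
    follower_tiers: tier labels declared on the follower cluster's data nodes.
    """
    available = set(follower_tiers)
    return {
        index: tier
        for index, tier in sorted(leader_require.items())
        if tier not in available
    }


leader = {"logs-prod-2026.07": "hot", "logs-prod-2026.06": "warm"}
follower_labels = ["warm", "cold"]
print(followers_at_risk(leader, follower_labels))  # {'logs-prod-2026.07': 'hot'}
```

Any index returned here needs an explicit require override on the follower before replication starts, or its shards will stay UNASSIGNED.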

Python automation for routing drift

Manual routing audits do not scale in a dynamic cluster. The opensearch-py service below compares each managed index’s declared require filter against the tier its primary shards actually occupy, then remediates any drift. It uses structured logging, transport-level retries, and SSL verification for production use.

Python

import os
import logging
from opensearchpy import OpenSearch
from opensearchpy.exceptions import TransportError

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)


class RoutingAuditor:
    def __init__(self, host: str, port: int = 9200, auth: tuple = None):
        self.client = OpenSearch(
            hosts=[{"host": host, "port": port}],
            http_auth=auth,
            use_ssl=True,
            verify_certs=True,
            timeout=30,
            max_retries=3,
            retry_on_timeout=True,
        )

    def get_managed_index_patterns(self) -> list:
        """Retrieve index patterns from every active ISM policy."""
        try:
            response = self.client.transport.perform_request("GET", "/_plugins/_ism/policies")
            patterns = []
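            for entry in response.get("policies", []):
                # ism_template can be absent or null when a policy is attached manually
                for template in entry.get("policy", {}).get("ism_template") or []:
                    patterns.extend(template.get("index_patterns", []))
            # De-duplicate while preserving order so each pattern is audited once
            return list(dict.fromkeys(patterns))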
        except TransportError as exc:
            logger.error("Failed to fetch ISM policies: %s", exc)
            return []

    def audit_routing_drift(self, index_pattern: str) -> list:
        """Compare declared routing against actual primary-shard placement."""
        drift_report = []
        try:
            settings_resp = self.client.indices.get_settings(index=index_pattern)
            for idx, idx_cfg in settings_resp.items():
                routing_require = (
                    idx_cfg.get("settings", {})
                    .get("index", {})
                    .get("routing", {})
                    .get("allocation", {})
                    .get("require", {})
                )
                if not routing_require:
                    continue

                shards = self.client.cluster.state(index=idx)["routing_table"]["indices"][idx]["shards"]
                for shard_id, shard_list in shards.items():
                    for shard in shard_list:
                        if shard.get("primary") and shard.get("state") == "STARTED":
                            node_id = shard.get("node")
                            node_info = self.client.nodes.info(node_id)
                            node_attrs = node_info["nodes"][node_id].get("attributes", {})
                            for key, expected in routing_require.items():
                                if node_attrs.get(key) != expected:
                                    drift_report.append({
                                        "index": idx,
                                        "shard": shard_id,
                                        "expected": {key: expected},
                                        "actual_node": node_id,
                                        "actual_attr": {key: node_attrs.get(key)},
                                    })
        except TransportError as exc:
            logger.warning("Skipping pattern %s: %s", index_pattern, exc)
        return drift_report

    def remediate_drift(self, drift_report: list) -> None:
        """Re-apply the declared require filter so the decider relocates the shard."""
        corrected = set()
        for entry in drift_report:
            idx = entry["index"]
            if idx in corrected:
                continue
            try:
                self.client.indices.put_settings(
                    index=idx,
                    body={"index.routing.allocation.require": entry["expected"]},
                )
                logger.info("Applied routing correction to %s: %s", idx, entry["expected"])
                corrected.add(idx)
            except TransportError as exc:
                logger.error("Failed to remediate %s: %s", idx, exc)


if __name__ == "__main__":
    auditor = RoutingAuditor(
        host=os.getenv("OPENSEARCH_HOST", "localhost"),
        port=int(os.getenv("OPENSEARCH_PORT", "9200")),
        auth=(os.environ["OPENSEARCH_USER"], os.environ["OPENSEARCH_PASS"]),  # fail fast; no default credentials
    )

    for pattern in auditor.get_managed_index_patterns():
        drift = auditor.audit_routing_drift(pattern)
        if drift:
            logger.warning("Detected %d routing drift(s) for '%s'. Remediating...", len(drift), pattern)
            auditor.remediate_drift(drift)
        else:
            logger.info("Pattern '%s': routing alignment verified.", pattern)

Schedule this as a CI/CD job or cron worker so configuration divergence is caught before it becomes a relocation storm. For advanced connection pooling and async execution, see the official Python client documentation.

Operational guardrails

Routing enforcement depends on capacity headroom the deciders can see. Disk watermarks gate every allocation, so a tier at capacity will reject shards even when the require filter is correct. The usable headroom on a tier before the flood stage locks writes is:

$$H_{\text{tier}} = C_{\text{total}} \times (w_{\text{flood}} - w_{\text{current}})$$

where $C_{\text{total}}$ is the tier’s aggregate disk, $w_{\text{flood}}$ is the flood-stage watermark, and $w_{\text{current}}$ is the tier’s current disk utilization, both expressed as fractions. Keep $H_{\text{tier}}$ larger than the size of the next index scheduled to migrate in, or the transition stalls at the boundary. The settings below are the ones that most directly govern routing stability.

Setting	Recommended value	Effect on routing
`cluster.routing.allocation.disk.watermark.low`	`82%`	Stops new shards routing to a filling node
`cluster.routing.allocation.disk.watermark.high`	`88%`	Triggers relocation off the node
`cluster.routing.allocation.disk.watermark.flood_stage`	`93%`	Forces indices read-only; blocks migration in
`cluster.routing.allocation.node_concurrent_recoveries`	`3` (NVMe) / `1–2` (HDD)	Caps parallel relocations per node
`cluster.routing.allocation.enable`	`all` (revert after maintenance)	`primaries` blocks replica routing

HTTP

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "82%",
    "cluster.routing.allocation.disk.watermark.high": "88%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "93%",
    "cluster.routing.allocation.node_concurrent_recoveries": 3
  }
}
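The headroom formula from earlier in this section is easy to check numerically before a migration. The figures below are illustrative assumptions (a 120 TB cold tier at 81% utilization), paired with the 93% flood-stage watermark from the settings above; `tier_headroom` and `can_accept` are hypothetical helper names.

```python
def tier_headroom(c_total_tb: float, w_flood: float, w_current: float) -> float:
    """H_tier = C_total * (w_flood - w_current), in the same unit as c_total_tb."""
    return c_total_tb * (w_flood - w_current)


def can_accept(next_index_tb: float, c_total_tb: float, w_flood: float, w_current: float) -> bool:
    """True when the tier's headroom exceeds the next index scheduled to migrate in."""
    return tier_headroom(c_total_tb, w_flood, w_current) > next_index_tb


headroom = tier_headroom(120, 0.93, 0.81)
print(f"headroom: {headroom:.1f} TB")            # headroom: 14.4 TB
print(can_accept(10, 120, 0.93, 0.81))           # True
print(can_accept(20, 120, 0.93, 0.81))           # False
```

With roughly 14.4 TB of headroom, a 10 TB index fits while a 20 TB index would stall the transition.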

During rolling upgrades or node decommissioning, setting cluster.routing.allocation.enable to primaries prevents replica allocation storms, but revert it promptly, or fault tolerance silently degrades. Always confirm node-attribute consistency before scaling: a node missing node.attr.data: warm is invisible to every warm-tier require filter, and if no node carries the attribute, warm-tier placement blocks entirely.

Troubleshooting routing failures

Each failure mode below pairs a diagnosis command with the corrective action.

Shards stuck UNASSIGNED after a tier transition. The require attribute has no eligible node. Diagnose the exact decider that rejected placement, then confirm the attribute exists on some node:

Shell

curl -s -X POST "https://<cluster>:9200/_cluster/allocation/explain" \
  -H "Content-Type: application/json" \
  -d '{"index":"logs-prod-2026.07","shard":0,"primary":true}'
# Fix: add the missing attribute to a node, or point the require filter at a tier that has nodes
curl -s -X PUT "https://<cluster>:9200/logs-prod-2026.07/_settings" \
  -H "Content-Type: application/json" \
  -d '{"index.routing.allocation.require.data":"warm"}'

ISM transition stalls with a failed allocation action. The action exhausted its retries and the index is parked. Read the failure, then retry the managed index:

Shell

curl -s "https://<cluster>:9200/_plugins/_ism/explain/logs-prod-2026.07?pretty"
curl -s -X POST "https://<cluster>:9200/_plugins/_ism/retry/logs-prod-2026.07"

Cold allocation blocked by disk watermark. The cold-tier nodes are above watermark.low, so the disk decider refuses new shards. Identify the pressured node, then expand capacity or reduce the incoming volume:

Shell

curl -s "https://<cluster>:9200/_cat/allocation?v&h=node,disk.percent,disk.avail&s=disk.percent:desc"
# Fix: raise capacity, or temporarily route the migration to another tier

Node-attribute drift after a config change. A node came back with a different attribute value than the index templates and policies expect, so shards route away from it:

Shell

curl -s "https://<cluster>:9200/_cat/nodeattrs?v&h=node,attr,value" | grep data
# Fix: correct node.attr.data in opensearch.yml and restart the node

CCR follower shards UNASSIGNED. The follower inherited a require attribute that no node in the follower cluster carries. Override routing on the follower index:

Shell

curl -s -X PUT "https://<follower>:9200/logs-prod-2026.07-follower/_settings" \
  -H "Content-Type: application/json" \
  -d '{"index.routing.allocation.require.data":"warm"}'

Frequently asked questions

Should I use require, include, or exclude for tier routing?

Use require for strict tier enforcement — a shard must land on a node matching every listed attribute. Use include when any of several tiers is acceptable (capacity-driven placement), and exclude to drain a specific node or zone. ISM’s allocation action writes require by default, which is the right choice for deterministic hot-warm-cold movement.
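The three filter types can be contrasted with a small matching function. This is a simplified sketch of the semantics described above, not OpenSearch's implementation: `node_matches` is a hypothetical helper, require values are compared exactly, include and exclude accept comma-separated lists, and wildcard matching is left out.

```python
def node_matches(node_attrs, require=None, include=None, exclude=None):
    """Decide whether a node passes the index-level allocation filters."""
    def values(spec):
        return [value.strip() for value in spec.split(",")]

    # require: every listed attribute must match.
    for key, expected in (require or {}).items():
        if node_attrs.get(key) != expected:
            return False
    # include: at least one listed attribute/value pair must match.
    if include and not any(node_attrs.get(k) in values(v) for k, v in include.items()):
        return False
    # exclude: no listed attribute/value pair may match.
    if any(node_attrs.get(k) in values(v) for k, v in (exclude or {}).items()):
        return False
    return True


warm_node = {"data": "warm", "zone": "us-east-1a"}
print(node_matches(warm_node, require={"data": "warm"}))        # True
print(node_matches(warm_node, include={"data": "warm,cold"}))   # True
print(node_matches(warm_node, exclude={"zone": "us-east-1a"}))  # False
```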

Why does my shard stay UNASSIGNED even though the require filter matches a node?

Matching the filter is necessary but not sufficient: every allocation decider can veto placement. If cluster.routing.allocation.enable is primaries (which holds back replicas), or the target node is above its low disk watermark, the shard is rejected even though its attribute matches. Run _cluster/allocation/explain to see which decider said no.

Does the ISM allocation action move shards immediately?

No. The allocation action rewrites the index setting; physical relocation happens on the next cluster state evaluation, subject to node_concurrent_recoveries throttling. Pairing allocation with shrink in the same state lets the shrink’s own relocation carry the routing change; force_merge does not move shards, so it should follow the allocation action.

Related

Node Role Allocation: declare and verify the tier attributes routing filters target.
Hot-Warm-Cold Tier Design: capacity ratios and watermark tuning behind each tier.
Fallback Routing Strategies: graceful degradation when a require filter matches no node.
Index Lifecycle Basics: how phase transitions sequence routing against shrink and force_merge.
Security & Access Boundaries: scoping the credentials that drive cross-cluster routing.
