OpenSearch ISM Architecture & Fundamentals

OpenSearch Index State Management (ISM) is a policy-driven state machine that automates index rollover, tiered shard allocation, segment optimization, snapshotting, and deletion across a distributed cluster without external cron jobs. This guide is written for search and log-platform engineers, data-platform operators, and Python automation builders who need deterministic index lifecycle control that integrates cleanly with Cross-Cluster Replication (CCR) and fine-grained access control.

Policy definition feeds the scheduler; the evaluation loop reads live index metadata and, when a transition condition is met, dispatches actions that fan out across the data tiers, the snapshot repository, and the audit-trail history index.

ISM decouples policy definition from policy execution. You declare the desired lifecycle as a directed graph of states and transitions; an OpenSearch-native background scheduler then reconciles every managed index toward that graph on a fixed interval. Because execution is idempotent and audit-logged, the same policy behaves predictably across rolling upgrades, node failures, and cross-cluster replication. The sections below map each moving part — the execution engine, the storage topology it routes across, the policy schema it evaluates, the access boundaries it must respect, the automation that drives it, and the APIs that observe it.

State Machine Execution & Policy Evaluation

The ISM execution engine runs as a scheduled background job inside the OpenSearch cluster, polling the .opendistro-ism-config system index for pending policy evaluations. Each policy defines a finite state machine: a set of named states, the ordered actions inside each state, and the transitions that move an index from one state to the next. The scheduler evaluates every managed index against its assigned policy at a configurable cadence, plugins.index_state_management.job_interval (default: 5 minutes). On each run it works through the current state's actions in order and moves the index to the next state only once a transition condition is satisfied.

Conditions are evaluated against live index metadata rather than wall-clock guesses. Transitions accept min_index_age, min_rollover_age, min_doc_count, min_size, and cron expressions for calendar-aligned moves; the rollover action carries its own thresholds, including min_primary_shard_size. A rollover in the hot state, for example, fires when the write index crosses either an age or a primary-shard-size boundary, whichever comes first, so that shard sizing stays bounded regardless of ingest spikes.
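The "whichever comes first" behavior of a rollover can be sketched as a small predicate. This is an illustration only: RolloverConditions and rollover_due are hypothetical names, and the real check runs inside the ISM plugin against live index stats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloverConditions:
    """Hypothetical container mirroring a rollover action's thresholds."""
    min_index_age_hours: float
    min_primary_shard_size_gb: float


def rollover_due(age_hours: float, largest_primary_shard_gb: float,
                 conditions: RolloverConditions) -> bool:
    """Return True as soon as either threshold is reached (OR semantics)."""
    return (age_hours >= conditions.min_index_age_hours
            or largest_primary_shard_gb >= conditions.min_primary_shard_size_gb)


# Thresholds matching a 1d / 50gb rollover.
conditions = RolloverConditions(min_index_age_hours=24, min_primary_shard_size_gb=50)

quiet_day = rollover_due(age_hours=25, largest_primary_shard_gb=12, conditions=conditions)
ingest_spike = rollover_due(age_hours=3, largest_primary_shard_gb=51, conditions=conditions)
too_early = rollover_due(age_hours=3, largest_primary_shard_gb=12, conditions=conditions)
```

The quiet_day and ingest_spike cases both trigger a rollover for different reasons, which is what keeps shard size bounded when ingest volume is uneven.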

ISM maintains strict idempotency. If an action fails — a snapshot repository is briefly unreachable, or a target tier is at its disk watermark — the index remains in its current state and the engine retries on the next evaluation cycle, or immediately when you call the _plugins/_ism/retry/<index> endpoint. Every transition is written to the .opendistro-ism-managed-index-history-* indices, giving you an immutable audit trail for debugging stuck transitions or policy drift. Configure each action’s retry block (count, backoff, delay) explicitly so transient network partitions or disk-pressure events degrade into bounded retries instead of cascading failures. The mechanics of how indices progress through their states are covered in depth in Index Lifecycle Basics; for the plugin’s full parameter reference, consult the OpenSearch Index State Management documentation.
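To reason about how long a retry block can hold an index in a failing action, it can help to model the waits it implies. The sketch below is an assumption-laden model, not the plugin's implementation: retry_delays is a hypothetical helper, and the multipliers chosen for each backoff mode are illustrative guesses rather than values taken from the ISM source.

```python
from datetime import timedelta
from typing import List


def retry_delays(count: int, backoff: str, delay: timedelta) -> List[timedelta]:
    """Model the wait before each of `count` retries.

    Assumed semantics (illustrative, not read from the plugin):
      constant    -> delay, delay, delay, ...
      linear      -> delay*1, delay*2, delay*3, ...
      exponential -> delay*1, delay*2, delay*4, ...
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    multipliers = {
        "constant": lambda attempt: 1,
        "linear": lambda attempt: attempt + 1,
        "exponential": lambda attempt: 2 ** attempt,
    }
    if backoff not in multipliers:
        raise ValueError(f"unknown backoff: {backoff!r}")
    return [delay * multipliers[backoff](attempt) for attempt in range(count)]


# A retry block of count=3, backoff=exponential, delay=1h under this model.
delays = retry_delays(count=3, backoff="exponential", delay=timedelta(hours=1))
worst_case_wait = sum(delays, timedelta())
```

Under these assumptions the retries are bounded at a few hours in total, which is the property that matters: a transient outage ends in a finite number of attempts and then a visible failed state.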

The evaluation cycle is idempotent: unmet conditions loop back to wait for the next interval, while a met condition executes exactly one action, records it to the history index, and transitions the index to its next state.

The evaluation loop is deliberately conservative: it processes a bounded number of indices per run (plugins.index_state_management.coordinator.sweep_period governs how often the coordinator re-scans for newly managed indices), and it never runs two actions on the same index concurrently. This is why an OpenSearch cluster with tens of thousands of managed indices should tune job_interval up rather than down — a shorter interval increases scheduler pressure without accelerating any individual transition beyond its condition boundary.

Storage Topology, Node Roles & Disk Watermarks

ISM routes shards across performance and cost boundaries, so it can only be as good as the OpenSearch cluster topology beneath it. This guide labels the tiers data_hot, data_warm, data_cold, and data_frozen. Unlike Elasticsearch's data tiers, OpenSearch attaches no built-in allocation behavior to those names: a tier exists only because its nodes carry a matching custom attribute (for example node.attr.data: hot) that allocation filters can target. The Node Role Allocation model determines how OpenSearch’s cluster manager assigns primary and replica shards based on those node.attr tags and disk watermark thresholds. ISM’s allocation action moves an index between tiers by writing index.routing.allocation.require, include, or exclude filters that match node attributes.
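The filters an allocation action writes are ordinary index settings. As a rough sketch, the helper below (allocation_filter_settings is a hypothetical name) expands require/include/exclude maps into the flat setting keys named above; it assumes the nodes carry a custom attribute called data.

```python
from typing import Dict, Optional


def allocation_filter_settings(
    require: Optional[Dict[str, str]] = None,
    include: Optional[Dict[str, str]] = None,
    exclude: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Expand allocation filters into flat index.routing.allocation.* settings."""
    settings: Dict[str, str] = {}
    for kind, attrs in (("require", require), ("include", include), ("exclude", exclude)):
        for attribute, value in (attrs or {}).items():
            settings[f"index.routing.allocation.{kind}.{attribute}"] = value
    return settings


# Equivalent of an allocation action with "require": {"data": "warm"}.
warm_settings = allocation_filter_settings(require={"data": "warm"})
```

Seeing the flat keys makes stalled transitions easier to debug: if no node advertises the attribute value in the setting, the shard has nowhere to go.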

The table below aligns each tier with a storage profile and the routing attribute ISM targets when it transitions an index into that tier:

Tier	Node role	Storage profile	Routing attribute	Primary workload
Hot	`data_hot`	NVMe SSD, high IOPS	`require.data: hot`	Active ingest, real-time search, aggregations
Warm	`data_warm`	SATA SSD, moderate IOPS	`require.data: warm`	Recent history, read-mostly search
Cold	`data_cold`	High-capacity HDD	`require.data: cold`	Compliance retention, infrequent queries
Frozen	`data_frozen`	Object storage (S3-backed searchable snapshots)	`require.data: frozen`	Archival, rare investigative access

Choosing where each tier boundary sits — how long data stays hot before it moves warm, and how many replicas each tier carries — is the subject of Hot-Warm-Cold Tier Design. The physical relocation of shards during a transition is governed by the shard allocation deciders detailed in Data Tier Routing Patterns, which explains why a transition can stall in a WAITING state when the target tier lacks capacity.

Disk watermarks are the single most common cause of stalled ISM transitions, so calibrate them deliberately. OpenSearch stops allocating new shards to a node once its used-disk fraction crosses the low watermark, starts relocating shards away at the high watermark, and enforces a read-only-allow-delete block at the flood stage. In formal terms, a shard relocation into a warm node is admitted only while:

$$
\text{disk}_{\text{used}} + \text{shard}_{\text{size}} \le \text{watermark}_{\text{high}} \times \text{capacity}_{\text{node}}
$$
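The inequality is simple enough to check by hand before a migration window. The sketch below restates it as a function; relocation_admitted is a hypothetical helper and the node figures are made up for illustration.

```python
def relocation_admitted(disk_used_gb: float, shard_size_gb: float,
                        capacity_gb: float, high_watermark: float = 0.90) -> bool:
    """Apply the inequality above: used + incoming shard must stay within
    the high-watermark fraction of the node's capacity."""
    return disk_used_gb + shard_size_gb <= high_watermark * capacity_gb


# Hypothetical 2000 GB warm node that already holds 1700 GB.
small_shard_fits = relocation_admitted(1700, 50, 2000)            # 1750 vs limit 1800
large_shard_fits = relocation_admitted(1700, 150, 2000)           # 1850 vs limit 1800
tighter_watermark = relocation_admitted(1700, 50, 2000, 0.88)     # 1750 vs limit 1760
```

Running the same numbers against a candidate watermark shows how much headroom a tier loses or gains before the setting is changed on a live cluster.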

For multi-tier clusters, the default watermarks (85% / 90% / 95%) often leave too little headroom during migration windows. Lowering the low and high watermarks, for example to 82% and 88%, and the flood stage to 93% reserves headroom so that an ISM allocation action does not push a warm node into flood stage mid-transition. Configure them cluster-wide:

HTTP

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "82%",
    "cluster.routing.allocation.disk.watermark.high": "88%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "93%",
    "cluster.routing.allocation.disk.threshold_enabled": true
  }
}

When a tier genuinely runs out of capacity, ISM should not simply hang. Fallback Routing Strategies define how transitions and CCR sync cycles degrade gracefully — spilling to an adjacent tier or holding ingest — instead of triggering an uncontrolled shard storm.

Policy Design: States, Actions & Transition Conditions

A policy is a JSON document with a default_state and an ordered list of states. Each state carries an actions array and a transitions array. Actions are the operations ISM performs — rollover, allocation, replica_count, force_merge, shrink, snapshot, close, open, and delete — while transitions declare the condition that advances the index to the next state. The ism_template block lets a policy claim newly created indices by pattern, so you rarely attach policies by hand in production.

The skeleton below expresses a canonical hot → warm → cold → delete lifecycle for time-series log data. Rollover bounds shard size in the hot state; the warm transition drops replicas and force-merges to reclaim space; the cold transition relocates to HDD-backed nodes; and the terminal state deletes after the retention window:

JSON

{
  "policy": {
    "description": "Time-series log lifecycle: hot -> warm -> cold -> delete",
    "default_state": "hot",
    "ism_template": [
      { "index_patterns": ["logs-app-prod-*"], "priority": 100 }
    ],
    "states": [
      {
        "name": "hot",
        "actions": [
          { "rollover": { "min_index_age": "1d", "min_primary_shard_size": "50gb" } }
        ],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "allocation": { "require": { "data": "warm" }, "wait_for": true } },
          { "replica_count": { "number_of_replicas": 1 } },
          { "force_merge": { "max_num_segments": 1 } }
        ],
        "transitions": [
          { "state_name": "cold", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "cold",
        "actions": [
          { "allocation": { "require": { "data": "cold" }, "wait_for": true } },
          { "snapshot": { "repository": "s3-archive", "snapshot": "cold-{{ctx.index}}" } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [
          { "retry": { "count": 3, "backoff": "exponential", "delay": "1h" },
            "delete": {} }
        ],
        "transitions": []
      }
    ]
  }
}

Two design rules keep policies stable in production. First, always set wait_for: true on an allocation action so ISM blocks until shards physically land on the target tier before proceeding — otherwise a downstream force_merge can run on the wrong hardware. Second, attach policies through the ism_template block rather than the imperative _plugins/_ism/add call whenever possible; template-based attachment survives index recreation, while manually attached policies must be reattached after a template rebuild.
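Both rules are easy to check mechanically before a policy is deployed. The following is a minimal sketch of such a lint; lint_policy and its messages are hypothetical, and it only inspects the fields discussed above.

```python
from typing import Any, Dict, List


def lint_policy(policy_doc: Dict[str, Any]) -> List[str]:
    """Return human-readable findings for the two design rules above."""
    policy = policy_doc.get("policy", {})
    findings: List[str] = []
    # Rule 2: prefer template-based attachment.
    if not policy.get("ism_template"):
        findings.append("no ism_template: indices must be attached manually")
    # Rule 1: every allocation action should block until shards have moved.
    for state in policy.get("states", []):
        for action in state.get("actions", []):
            allocation = action.get("allocation")
            if allocation is not None and allocation.get("wait_for") is not True:
                findings.append(
                    f"state '{state.get('name')}': allocation without wait_for: true"
                )
    return findings


risky_policy = {"policy": {"default_state": "warm", "states": [
    {"name": "warm",
     "actions": [{"allocation": {"require": {"data": "warm"}}}],
     "transitions": []},
]}}
safe_policy = {"policy": {
    "default_state": "warm",
    "ism_template": [{"index_patterns": ["logs-app-prod-*"], "priority": 100}],
    "states": [
        {"name": "warm",
         "actions": [{"allocation": {"require": {"data": "warm"}, "wait_for": True}}],
         "transitions": []},
    ],
}}

risky_findings = lint_policy(risky_policy)
safe_findings = lint_policy(safe_policy)
```

A check like this fits naturally in a pre-merge step, where an empty findings list is the pass condition.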

Security & Access Control for ISM Endpoints

ISM actions mutate cluster state and index metadata, so the _plugins/_ism/* API surface must be locked down with fine-grained access control (FGAC). The Security & Access Boundaries model scopes which service accounts may create policies, trigger manual transitions, or read execution history. The governing principle is that only platform automation roles get write access to lifecycle policies; application teams receive read-only visibility into index state through explain.

A minimal role definition grants the automation service account the exact cluster and index permissions ISM needs, and nothing more:

JSON

{
  "cluster_permissions": [
    "cluster:admin/opendistro/ism/policy/write",
    "cluster:admin/opendistro/ism/policy/get",
    "cluster:admin/opendistro/ism/managedindex/add",
    "cluster:admin/opendistro/ism/managedindex/explain"
  ],
  "index_permissions": [
    {
      "index_patterns": ["logs-*", ".opendistro-ism-*"],
      "allowed_actions": ["indices:admin/settings/update", "indices:monitor/settings/get"]
    }
  ]
}

Application-facing roles should be granted only explain and get, never policy/write or retry, so a service consumer can observe why an index is stuck without being able to force a transition. Because every action is recorded in the .opendistro-ism-managed-index-history-* indices, restrict read access to those history indices as well — they reveal snapshot repository names, tier topology, and retention policy, which are useful reconnaissance for an attacker. Rotate the automation account’s credentials out of band and never embed them in policy JSON.
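The read-only rule for application-facing roles can be audited from the role definition itself. The sketch below is one possible check, with hypothetical names throughout; it inspects only literal cluster_permissions strings and does not expand action groups.

```python
from typing import Any, Dict, List

ISM_PREFIX = "cluster:admin/opendistro/ism/"

# The only ISM permissions an application-facing role should hold.
READ_ONLY_ISM_PERMISSIONS = frozenset({
    "cluster:admin/opendistro/ism/policy/get",
    "cluster:admin/opendistro/ism/managedindex/explain",
})


def excess_ism_permissions(role: Dict[str, Any]) -> List[str]:
    """List ISM cluster permissions that go beyond get/explain."""
    return sorted(
        permission
        for permission in role.get("cluster_permissions", [])
        if permission.startswith(ISM_PREFIX)
        and permission not in READ_ONLY_ISM_PERMISSIONS
    )


automation_role = {"cluster_permissions": [
    "cluster:admin/opendistro/ism/policy/write",
    "cluster:admin/opendistro/ism/policy/get",
    "cluster:admin/opendistro/ism/managedindex/explain",
]}
app_role = {"cluster_permissions": [
    "cluster:admin/opendistro/ism/policy/get",
    "cluster:admin/opendistro/ism/managedindex/explain",
]}

automation_excess = excess_ism_permissions(automation_role)
app_excess = excess_ism_permissions(app_role)
```

A wildcard such as cluster:admin/opendistro/ism/* is also reported, because it is not one of the two allowed strings.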

Python Automation & CI/CD Integration

At scale, ISM is managed as code. The opensearch-py client provides a programmatic interface for attaching policies, verifying state, and handling retries idempotently. The example below attaches a policy to an index pattern and reads back its managed-index state with typed signatures and structured error handling suitable for a CI/CD job:

Python

import logging
from typing import Dict, Any
from opensearchpy import OpenSearch, ConnectionError, NotFoundError, TransportError

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger(__name__)


def attach_ism_policy(
    client: OpenSearch,
    index_pattern: str,
    policy_id: str,
    timeout: int = 30,
) -> Dict[str, Any]:
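    """Attach an ISM policy to every index matching index_pattern.

    Safe to re-run: indices that already carry a policy are expected to be left
    unchanged and reported under "failed_indices" rather than raising."""
    try:
        response = client.transport.perform_request(
            method="POST",
            url=f"/_plugins/_ism/add/{index_pattern}",
            body={"policy_id": policy_id},
            # request_timeout is consumed by the client; it is not sent to the
            # server as a query parameter.
            params={"request_timeout": timeout},
        )
        if response.get("failures"):
            logger.warning("Policy '%s' not attached to some indices: %s",
                           policy_id, response.get("failed_indices"))
        else:
            logger.info("Policy '%s' attached to pattern '%s'", policy_id, index_pattern)
        return response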
    except ConnectionError as e:
        logger.error("Cluster connection failed during policy attachment: %s", e)
        raise
    except TransportError as e:
        logger.error("ISM plugin transport error: %s", e.info)
        raise


def verify_ism_state(client: OpenSearch, index_name: str) -> Dict[str, Any]:
    """Retrieve current ISM metadata for a specific index via the explain API."""
    try:
        return client.transport.perform_request(
            method="GET", url=f"/_plugins/_ism/explain/{index_name}"
        )
    except NotFoundError:
        logger.warning("Index '%s' does not exist in cluster.", index_name)
        return {}


if __name__ == "__main__":
    opensearch_client = OpenSearch(
        hosts=[{"host": "opensearch-master.internal", "port": 9200}],
        http_auth=("automation_svc", "REDACTED"),
        use_ssl=True,
        verify_certs=True,
        maxsize=10,
        retry_on_timeout=True,
        max_retries=3,
    )

    attach_ism_policy(opensearch_client, "logs-app-prod-*", "default_retention_policy")
    state = verify_ism_state(opensearch_client, "logs-app-prod-000012")
    logger.info("Managed-index state: %s", state)

In a CI/CD pipeline, version-control policy JSON alongside your Terraform or Kubernetes manifests and validate its structure (state names, transition targets, cron expressions) before deployment, so a malformed state graph or invalid cron expression never reaches production. The attach step is safe to run on every deploy: indices that already carry a policy are left as they are and reported back in the response instead of raising an error, so check that response rather than assuming every index was updated. The broader family of orchestration patterns, including bulk attachment, drift detection, and rollover automation, is developed in ISM Policy Implementation & Python Automation, which extends this attach/verify skeleton into full production frameworks.
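A pre-deployment check for a malformed state graph can be sketched in a few lines. This is a minimal structural validator under stated assumptions: validate_state_graph is a hypothetical helper, it covers only state names and transition targets, and it does not replace validation by the plugin itself.

```python
from collections import deque
from typing import Any, Dict, List


def validate_state_graph(policy_doc: Dict[str, Any]) -> List[str]:
    """Check default_state, transition targets, and reachability."""
    policy = policy_doc.get("policy", {})
    states = policy.get("states", [])
    # Map each state name to the names its transitions point at.
    edges = {
        state.get("name"): [t.get("state_name") for t in state.get("transitions", [])]
        for state in states
    }
    errors: List[str] = []
    if len(edges) != len(states):
        errors.append("duplicate state names")
    default_state = policy.get("default_state")
    if default_state not in edges:
        errors.append(f"default_state '{default_state}' is not a defined state")
        return errors
    for name, targets in edges.items():
        for target in targets:
            if target not in edges:
                errors.append(f"state '{name}' transitions to undefined state '{target}'")
    # Breadth-first walk from default_state to find states no index can reach.
    reachable = set()
    queue = deque([default_state])
    while queue:
        current = queue.popleft()
        if current in reachable:
            continue
        reachable.add(current)
        queue.extend(t for t in edges[current] if t in edges)
    for name in edges:
        if name not in reachable:
            errors.append(f"state '{name}' is unreachable from default_state")
    return errors


# Deliberately broken example: 'cold' is never defined, 'delete' is orphaned.
broken_policy = {"policy": {"default_state": "hot", "states": [
    {"name": "hot", "actions": [], "transitions": [{"state_name": "warm"}]},
    {"name": "warm", "actions": [], "transitions": [{"state_name": "cold"}]},
    {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
]}}
graph_errors = validate_state_graph(broken_policy)
```

Failing the pipeline on a non-empty error list catches the most common authoring mistake, a renamed state whose inbound transitions were not updated.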

Cross-Cluster Replication Integration

ISM and Cross-Cluster Replication (CCR) intersect in ways that surprise teams new to OpenSearch. Policy attachments do not propagate over replication: a policy attached on the leader cluster has no effect on the follower, because replication carries index data and settings, not ISM metadata. You must attach a follower-appropriate policy independently on the follower cluster.

Follower policies should be deliberately narrower than leader policies. A follower index is read-only while replication is active, so any action that mutates the write index — chiefly rollover — will conflict with the replication engine and can produce split-brain state where the leader and follower disagree on which index is current. Design follower policies to prioritize delete and snapshot for retention and archival, and stop replication cleanly before allowing the follower to take over write duties in a failover. When a follower is promoted, detach the replication relationship first, then let the follower’s own ISM policy resume rollover. Routing constraints inherited from the leader are covered under Node Role Allocation, which shows how to override inherited allocation tags when the follower cluster uses a different tier layout.
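Before attaching a policy on a follower cluster, it is worth scanning it for actions that write-index semantics make unsafe. The sketch below flags only rollover, the action named above; the set and function names are hypothetical, and the set can be widened to suit your own rules.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Actions treated as unsafe on a follower; only rollover by default.
FOLLOWER_UNSAFE_ACTIONS: FrozenSet[str] = frozenset({"rollover"})


def follower_unsafe_actions(
    policy_doc: Dict[str, Any],
    unsafe: FrozenSet[str] = FOLLOWER_UNSAFE_ACTIONS,
) -> List[Tuple[str, str]]:
    """Return (state name, action name) pairs that should not run on a follower."""
    hits: List[Tuple[str, str]] = []
    for state in policy_doc.get("policy", {}).get("states", []):
        for action in state.get("actions", []):
            # An action object holds the action name plus optional keys such as "retry".
            for key in action:
                if key in unsafe:
                    hits.append((state.get("name"), key))
    return hits


leader_style = {"policy": {"default_state": "hot", "states": [
    {"name": "hot", "actions": [{"rollover": {"min_index_age": "1d"}}], "transitions": []},
    {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
]}}
follower_style = {"policy": {"default_state": "retain", "states": [
    {"name": "retain", "actions": [{"snapshot": {"repository": "s3-archive",
                                                 "snapshot": "follower"}}], "transitions": []},
    {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
]}}

leader_hits = follower_unsafe_actions(leader_style)
follower_hits = follower_unsafe_actions(follower_style)
```

Running this against a leader policy before reuse makes the narrowing step explicit instead of relying on a reviewer to spot a stray rollover.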

Operational Monitoring & Alerting

Observing ISM in production comes down to three API families. Use _cat/indices for a fast inventory of index sizes, doc counts, and health; use _plugins/_ism/explain to see the exact state, action, and any failure reason for a managed index; and use _cluster/allocation/explain to diagnose why a shard will not move during a tier transition.

Shell

# Fast inventory: size, docs, health per index
curl -s "https://<cluster>:9200/_cat/indices/logs-*?v&h=index,health,docs.count,store.size&s=index"

# Why is this index stuck? — state, action, and failure reason
curl -s "https://<cluster>:9200/_plugins/_ism/explain/logs-app-prod-000012?pretty"

# Why can't this shard allocate to the target tier?
curl -s -X POST "https://<cluster>:9200/_cluster/allocation/explain" \
  -H "Content-Type: application/json" \
  -d '{"index": "logs-app-prod-000012", "shard": 0, "primary": true}'

Wire the explain output into alerting. The signal that matters is an index whose ISM action reports failed: true, or whose info.message has not changed across two evaluation cycles; either one points to a transition stuck on watermarks, a missing snapshot repository, or a routing mismatch. Alert on the count of managed indices in a failed action state, not on individual transitions, so a single transient retry does not page anyone. Cross-reference stuck transitions against disk watermark metrics and unassigned-shard counts to distinguish a capacity problem from a policy problem.
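Counting failed indices from an explain response might look like the sketch below. The response shape is an assumption (one object per index with an action.failed flag, plus scalar summary fields), and the function names and threshold are hypothetical; adjust the field access to what your cluster actually returns.

```python
from typing import Any, Dict, List


def failed_managed_indices(explain_response: Dict[str, Any]) -> List[str]:
    """Return the names of indices whose current action reports failed: true."""
    failed: List[str] = []
    for index_name, details in explain_response.items():
        if not isinstance(details, dict):
            continue  # skip scalar summary fields such as a total count
        action = details.get("action") or {}
        if action.get("failed") is True:
            failed.append(index_name)
    return sorted(failed)


def should_alert(explain_response: Dict[str, Any], threshold: int = 2) -> bool:
    """Alert on the count of failed indices, not on any single transition."""
    return len(failed_managed_indices(explain_response)) >= threshold


# Hand-written sample in the assumed shape, not captured from a real cluster.
sample_response = {
    "logs-app-prod-000011": {"action": {"name": "allocation", "failed": True},
                             "info": {"message": "example failure"}},
    "logs-app-prod-000012": {"action": {"name": "rollover", "failed": False},
                             "info": {"message": "example pending"}},
    "logs-app-prod-000013": {"action": None},
    "total_managed_indices": 3,
}
failed_names = failed_managed_indices(sample_response)
page_on_call = should_alert(sample_response)
```

With a threshold above one, a single failed index stays on the dashboard without paging anyone, which matches the count-based alerting described above.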

Frequently Asked Questions

How often does ISM evaluate a policy, and can I make transitions happen faster?

ISM evaluates every managed index once per plugins.index_state_management.job_interval (default 5 minutes). Shortening the interval does not make an individual transition fire before its condition (min_index_age, min_size, and similar) is met — it only increases scheduler load. To make a transition happen sooner, relax the condition or call _plugins/_ism/retry/<index> after fixing whatever blocked it.

Why is my index stuck in a WAITING or failed state?

The most common causes are a target tier over its disk high watermark, a missing or unreachable snapshot repository, or a routing attribute that no node matches. Run _plugins/_ism/explain/<index> to read the failure reason, then _cluster/allocation/explain if it is an allocation problem. Fix the underlying cause and call the retry endpoint.

Do ISM policies replicate to CCR follower clusters?

No. Replication carries index data and settings but not ISM policy attachments. Attach a follower-specific policy on the follower cluster, and avoid rollover on follower indices while replication is active to prevent split-brain state.

Should I attach policies with ism_template or the _ism/add API?

Prefer the ism_template block inside the policy so new indices are claimed automatically by pattern and the attachment survives index recreation. Use _plugins/_ism/add only for one-off backfills of already-existing indices; those attachments must be reapplied after an index template rebuild.

Related Guides

Node Role Allocation — how OpenSearch’s cluster manager places primary and replica shards across data_hot/warm/cold/frozen roles.
Hot-Warm-Cold Tier Design — sizing tier boundaries, replica counts, and hardware profiles for time-series data.
Data Tier Routing Patterns — the shard allocation decider chain and how transitions move shards between tiers.
Index Lifecycle Basics — how indices progress through hot, warm, cold, and delete states.
Security & Access Boundaries — FGAC scoping for the _plugins/_ism/* endpoints and history indices.
Fallback Routing Strategies — graceful degradation when a target tier runs out of capacity.
ISM Policy Implementation & Python Automation — production frameworks for bulk attachment, drift detection, and rollover automation.

