When massive datasets, fragmented infrastructure, and high-throughput workloads collide, they place enormous strain on traditional storage systems. This is especially true for data-heavy environments like AI, HPC, media, and research, where managing the flow of data across tiers, sites, and clouds becomes a major bottleneck.
Ngenea solves this by placing intelligent automation between your applications and storage layers. It unifies your heterogeneous storage under a global namespace, enables transparent tiering, and orchestrates lifecycle actions without disrupting workflows or compromising performance. When used with DataCore Pixstor, Ngenea enhances its capabilities with smart data mobility, extending the high-performance file system into a dynamic, policy-driven data fabric.
- Extends existing storage with orchestration; no rip-and-replace
- Supports file, object, and cloud targets: NAS, S3, tape, and more
- Creates a global namespace with file stubbing for seamless data access
- Automates data placement, recall, migration, and archive
- Integrates with job schedulers and hybrid workflows
- Enables site-to-site sync and recall across locations
- Provides full metadata indexing and API-driven control
- Deploys on-premises, in the cloud, or hybrid environments
Cas d’usage
Hybrid HPC environments that require seamless data movement between compute, scratch, and archive tiers
AI model training pipelines requiring fast ingest, pre-staging, and checkpoint archival
Media workflows needing seamless movement between production, render, archive, and cloud tiers
Global collaboration with remote access to data through a unified view and smart recall
Cloud bursting and DR with intelligent caching, prefetch, and cost-efficient data movement
Multi-site teams with distributed data access and lifecycle governance
Fonctionnalités
Ngenea abstracts all storage into a single logical view, eliminating silos while enabling automated, policy-driven tiering and recall across tiers and locations.
- Stub-based namespace with seamless recall on access
- Policy-based data movement based on age, path, tags, access, usage
- Tiering between NVMe, SSD, HDD, object, tape, cloud, NAS
- File hydration on-demand or prefetch via job scheduler hooks
- Maintains original metadata, permissions, and paths
- Works with legacy storage via reverse stubbing
- No reconfiguration of apps or users required
Automate the movement of data between performance and capacity layers based on real usage, policies, and workflows without impacting access or breaking paths.
- Rule-based tiering by age, size, owner, location, tags, or activity
- Transparent data movement between NVMe, SSD, HDD, object, tape, and cloud
- Preserves file paths, permissions, and metadata
- On-demand or policy-driven recall (hydration) of files as needed
- Integrates with workflow managers and job schedulers
- Reverse stubbing enables legacy data access, reducing migration
- Align tiering to performance, cost, and retention goals
- Granular control for staging, archiving, or bursting scenarios
- Capacity analysis tools provide insights into usage patterns to plan tiering decisions more effectively
Ngenea tracks and indexes metadata across all connected storage, providing complete visibility and actionability regardless of physical location.
- Central index of file names, paths, tags, timestamps
- Supports content metadata and custom metadata for data discoverability
- GUI and API access to search, filter, and retrieve results
- Preview proxies available for archived or cold data
- Integrated with Pixstor Search and external pipelines
- Export search results to action workflows
Ngenea accelerates movement and replication across environments without the complexity of manual transfers or dedicated infrastructure.
- Efficient sync with minimal overhead
- Bi-directional and dehydrated sync (stub only) options
- Supports POSIX↔POSIX, POSIX↔S3, S3↔S3 sync types
- Bandwidth-aware transfers and per-path control
- DR-friendly reverse recall: only fetch what’s needed
- Can migrate billions of files with ease
Ngenea integrates with your compute workflows to automate data prep, staging, and post-processing actions.
- REST and Python APIs for pre/post job triggers
- Supports Slurm, Portable Batch System (PBS), and CI/CD toolchains
- File-based and event-based triggers for full automation
- Stage data to scratch before a job; archive it after
- Action cart model for scheduled movement or cleanup
Ngenea ensures secure, policy-bound access across all tiers and locations, without impacting user productivity.
- Role-based access control (RBAC) at all layers
- Supports AD, LDAP, OAuth2 authentication
- Audit logging for recall, movement, and access events
- Retention enforcement via snapshot and tiering policies
- Encryption in transit (TLS/SSH) and optionally at rest
- Site-specific rules, recall limits, and DR availability
From dashboards to programmatic control, Ngenea provides full visibility into where your data is, how it is used, and what it costs.
- Central dashboard for policy status, recall queue, and usage
- Event and system logging via Logstash, Elasticsearch
- Email alerts for key thresholds and system events
- Automation hooks for lifecycle and policy actions
- REST APIs for integration and orchestration
- Custom triggers for ingest, sync, migration, and DR
Ngenea extends orchestration to any destination—be it hyperscale cloud, deep archive, or edge ingest site.
- Integrates with AWS, Azure, GCP, and private object stores
- Integrates with DataCore Swarm for on-premises S3 archive
- Supports tape via BlackPearl and NAS via NFS/S3
- Stub-based transparent recall and indexing across all targets
- Policy-controlled cloud caching and archival
- Policy-based tiering based on cost, usage, or workflows
- Supports hybrid and multi-cloud environments
Avantages
Automate The Complex Stuff
- Schedule and trigger data workflows with rules and policies: no manual intervention
- Orchestrate data from ingest to archive without user disruption
- Reduce operational overhead across locations
Optimize Data Placement & Cost
- Ensure hot data stays on fast storage; archive cold data automatically
- Enforce cost-aware policies to avoid unexpected cloud spend
- Leverage existing storage systems and expand with free choice of hardware: no vendor lock-in required
Enable Global Collaboration
- Remote teams work as one through unified namespace
- Stub access ensures files are always discoverable
- Data is recalled on-demand or pre-staged as needed
- Role-based access ensures users see only what they need
Accelerate Workloads
- Data is where it needs to be even before compute jobs start
- Pre-stage, cache, and hydrate without delays
- Simplify workflows from core to edge to cloud
Protect & Recover with Precision
- Stub-based DR with low footprint
- Policy-driven sync and recall across storage tiers and sites
- Fast recovery without full dataset replication
Future-Proof Your Storage Strategy
- Easily integrate new hardware, storage, or sites
- Tier to cloud or tape without disrupting workflows
- Scale operations and locations without reinventing access
Architecture
DataCore Ngenea and Pixstor Reference Architecture for HPC/AI Environments
Déploiement
Ngenea and Pixstor can be deployed on-premises, in the cloud, or as a hybrid solution, giving you the flexibility to support any workflow or infrastructure strategy.
Licence
Ngenea is licensed per server. A single server license enables data movement across any number of storage targets—including scratch, home, archive, and cloud—with no capacity limits and no performance-based fees. This eliminates hidden costs, avoids capacity-based charges as the namespace expands, and allows hardware to be retired or repurposed without incurring additional licensing costs. Ngenea is available exclusively with Pixstor.