VISION Supercomputer Beta Test Phase
Overview
The VISION Beta Test Phase (April 20 - June/July 2026) is a critical "Shakedown and Expansion" period designed to stress-test the supercomputer under real-world conditions before general availability, when the system is expected to be fully operational.
VISION is onboarding a limited number of diverse users to validate different aspects of the platform. Beta participants are expected to provide feedback that directly informs platform development. All beta users must meet the technical requirements.
Is Beta Right for Your Project?
Apply for beta if:
- Your project is defined and can begin shortly after onboarding
- Your team can operate with limited support and incomplete documentation
- You will provide structured feedback throughout participation
- Your workload requires a large fraction of VISION resources concurrently
- Your project can produce benchmarking or comparative performance results
- Your project meets all technical requirements
Wait for general availability if:
- Your project is still in planning
- You do not meet the technical requirements
- Your work depends on features or access not yet available
- You require a stable, fully supported production environment
Expectations for Beta Participants
Participant Guidelines
Beta users are expected to:
- Avoid critical work - Don't run anything you can't afford to lose; save important tasks for general availability
- Save your work often - Decide how much work you can afford to lose, and save at least that frequently
- Report all issues - Let us know about anything unusual, even small problems
- Vary workloads - Try a variety of workloads and tasks to help test the system
- Experiment broadly - Run varied and diverse experiments that test the system in different ways
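The "save your work often" guideline can be turned into a simple rule: pick the amount of work you can afford to lose, expressed as a time budget, and checkpoint whenever that much unsaved time has accumulated. The following is a minimal Python sketch of that idea, not a VISION-provided tool. The `Checkpointer` class, its method names, and the JSON state format are all hypothetical and chosen for illustration; real workloads would usually rely on the checkpoint facilities of their own framework.

```python
import json
import os
import tempfile
import time


class Checkpointer:
    """Hypothetical helper: save state once the loss budget is used up."""

    def __init__(self, path, max_loss_seconds):
        self.path = path
        # The most work (in seconds) you are willing to lose to a failure.
        self.max_loss_seconds = max_loss_seconds
        self._last_save = time.monotonic()

    def save(self, state):
        # Write to a temporary file in the same directory, then rename it
        # over the old checkpoint, so a crash mid-write never replaces the
        # last good checkpoint with a truncated one.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as handle:
                json.dump(state, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp_path, self.path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
        self._last_save = time.monotonic()

    def maybe_save(self, state):
        """Save only if the loss budget has elapsed; return True if saved."""
        if time.monotonic() - self._last_save >= self.max_loss_seconds:
            self.save(state)
            return True
        return False

    def load(self, default):
        """Return the last saved state, or `default` if none exists."""
        try:
            with open(self.path) as handle:
                return json.load(handle)
        except FileNotFoundError:
            return default
```

In a job script, one would typically call `load` once at startup to resume, call `maybe_save` inside the main loop, and call `save` once more before exiting. A budget such as `max_loss_seconds=600` would mean accepting the loss of at most roughly ten minutes of work, plus the duration of one loop iteration.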
Feedback Requirements
Feedback is mandatory. Beta testers must:
- Report issues with performance, usability, and workflows through designated support channels
- Identify errors or gaps in documentation
- Provide results or summaries that demonstrate VISION's capabilities (when applicable)
Technical Requirements
Access & Authentication
- TAMU System–affiliated identity
- Duo multi-factor authentication
- Cloudflare WARP (Zero Trust) client configured for VISION
Connectivity
- SSH access using public/private key authentication
- Familiarity with SSH clients for your operating system (e.g., native terminal, WSL, PuTTY, MobaXterm)
Security & Compliance
- Completed data classification and software impact assessments
- Compliance with VISION security and network access policies
Expected Proficiencies
- Linux command-line operations
- Batch job submission and scheduling
- Containers or compiled software environments (as applicable)
- Data transfer tools (e.g., Globus)
Beta Test Limitations
- Onboarding — Invitation-based; manually coordinated
- Support — Limited; users should expect to work independently
- External users — Non-TAMU System affiliates may not be eligible
- Network access — Some internal and external connections are restricted
- Documentation — Incomplete; still under development
Beta Test Phase Objectives and Strategies
Core Philosophy
The beta phase operates on a fundamental principle: failures are expected and valuable. This phase recognizes that complex systems reveal emergent behaviors only under realistic operational conditions. Every issue discovered during beta is essentially "a production incident that didn't happen."
Primary Objectives
- System Integration Testing - Validating how individual components behave as a unified system beyond vendor installation tests
- Stress Testing Under Load - Identifying failure modes through intensive, varied workloads
- Documentation of Deficiencies - Compiling a comprehensive list of issues requiring correction
- Crew Training - Allowing users and administrators to learn the system's specific characteristics
The Ultimate Goal
The beta phase aims to identify and eliminate failure modes before general availability, transitioning VISION from "proving the system works, to learning how it fails, to delivering reliable, transformative science at scale." Participants aren't merely tolerating failures—they're actively hunting them to build a more robust production system.
Beta Testing Strategies
The phase uses several approaches to maximize learning while minimizing lost work:
- Graduated scaling validation: Testing progresses from single nodes → single SU (~32 nodes) → multiple SUs → full system
- Checkpoint/Restart protocols: Treating checkpoints like "watertight compartments" to contain losses
- Fault injection testing: Deliberately causing failures under controlled circumstances
- Coordinated batch testing: Organizing themed testing days (storage intensive, LLM training, chaos testing)
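As an illustration of graduated scaling validation, the sketch below generates a sequence of node counts that a participant might step through when scaling a workload: one node, one SU, growing multiples of an SU, and finally the full allocation. It is only a sketch under stated assumptions. The function name `scaling_ladder`, the doubling rule between steps, and the example total of 256 nodes are hypothetical; only the approximate SU size of 32 nodes comes from the text above.

```python
def scaling_ladder(total_nodes, su_size=32):
    """Return node counts for a graduated scaling run.

    Steps: 1 node -> one SU -> doubling numbers of SUs -> all nodes.
    `total_nodes` and the doubling rule are illustrative assumptions;
    `su_size` defaults to the ~32 nodes per SU mentioned in the text.
    """
    if total_nodes < 1 or su_size < 1:
        raise ValueError("total_nodes and su_size must be positive")
    ladder = [1]
    nodes = su_size
    while nodes < total_nodes:
        # Skip any step that would not be larger than the previous one.
        if nodes > ladder[-1]:
            ladder.append(nodes)
        nodes *= 2
    # Always finish at the full allocation, even if it is not a power-of-two
    # multiple of the SU size.
    if ladder[-1] != total_nodes:
        ladder.append(total_nodes)
    return ladder
```

For a hypothetical 256-node allocation, `scaling_ladder(256)` returns `[1, 32, 64, 128, 256]`. Validating results and performance at each step before moving to the next keeps a failure at scale from being confused with a problem that already existed on a single node.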