Managing File Storage with Lustre File Systems

Lustre Quotas

You can use the Lustre file system storage quota feature (lfs setquota) to set user-based, group-based, and project-based quotas on the file system itself. Quotas limit end-user capacity utilization and help optimize cost.
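
As a rough illustration of how a quota call is put together, the following Python sketch assembles an lfs setquota argument list without running it. The helper name build_setquota_command, the mount point /mnt/lustre, the user name alice, and the limit values are all hypothetical. The flags shown (-u, -g, -p for the quota target; -b and -B for block soft and hard limits; -i and -I for inode soft and hard limits) follow the standard Lustre lfs setquota usage, but check the lfs version on your clients before relying on them.

```python
def build_setquota_command(mount_point, *, user=None, group=None, project=None,
                           block_soft=None, block_hard=None,
                           inode_soft=None, inode_hard=None):
    """Return the argv list for an `lfs setquota` call. Nothing is executed."""
    # Exactly one quota target (user, group, or project ID) must be chosen.
    targets = [("-u", user), ("-g", group), ("-p", project)]
    chosen = [(flag, value) for flag, value in targets if value is not None]
    if len(chosen) != 1:
        raise ValueError("specify exactly one of user, group, or project")

    flag, value = chosen[0]
    command = ["lfs", "setquota", flag, str(value)]

    # Append only the limits that were supplied.
    limits = [("-b", block_soft), ("-B", block_hard),
              ("-i", inode_soft), ("-I", inode_hard)]
    for limit_flag, limit in limits:
        if limit is not None:
            command += [limit_flag, str(limit)]

    command.append(mount_point)
    return command


# Hypothetical example: soft limit 900G, hard limit 1T for user "alice".
example = build_setquota_command("/mnt/lustre", user="alice",
                                 block_soft="900G", block_hard="1T")
print(" ".join(example))
```

To apply the quota, the resulting list could be passed to subprocess.run on a client that has the file system mounted and sufficient privileges.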

To limit File Storage with Lustre resources, see Limits on File Storage with Lustre Resources.

Required IAM Policy

To use Oracle Cloud Infrastructure, an administrator must be a member of a group granted security access in a policy by a tenancy administrator. This access is required whether you're using the Console or the REST API with an SDK, CLI, or other tool. If you get a message that you don't have permission or are unauthorized, verify with the tenancy administrator what type of access you have and which compartment your access works in.

If you're new to policies, see Getting Started with Policies and Policy Builder Policy Templates.

The following IAM policy statement allows a group of administrators to manage File Storage with Lustre resources:

allow group <lustre-admin-group> to manage lustre-file-family in compartment <file_system_compartment>

Because file systems use network resources, users must also have "use" permissions for VNICs, private IPs, private DNS zones, and subnets to create or delete a file system. The following policy grants a group of administrators the ability to use network resources:

allow group <lustre-admin-group> to use virtual-network-family in compartment <network_resource_compartment>

The following required policy allows the File Storage with Lustre service to attach Lustre server hosts to subnets in your tenancy:

allow service lustrefs to use virtual-network-family in tenancy

Important

Without the preceding policies, File Storage with Lustre can't use the network resources necessary to function.

If you're planning to encrypt file systems using your own keys, see the policies in Updating File System Encryption.

For more information, see File Storage with Lustre Policies.

Limitations and Considerations

  • The minimum capacity for a file system is 31.2 TiB. File system capacity can be increased in specific increments, but it can't be decreased.
  • The default maximum capacity for a file system is 200 TiB. If you need more than 200 TiB of capacity, contact support.
  • The maximum aggregate throughput per tenancy is 200 Gbps (gigabits per second). This allotment can be used across multiple file systems. For example, you can create a 72.8 TiB file system at 125 MB/s/TiB (for an aggregate of 72.8 Gbps) and a 41.6 TiB file system at 250 MB/s/TiB (for an aggregate of 83.2 Gbps).
  • Don't use /25 or smaller subnets for file system creation because they don't have enough available IP addresses.
    • For file systems with capacity between 120 TiB and 215 TiB, use /24 or larger subnets.
    • For file systems with capacity between 200 TiB and 400 TiB, use /23 or larger subnets.
  • You can't attach a file system to a public subnet in a VCN. File systems must be attached to private subnets.
  • Each file system has a weekly maintenance schedule. For more information, see Finding a File System's Maintenance Schedule.
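
The throughput example in the list above can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 Gbps = 1,000 Mbps, so 125 MB/s equals 1 Gbps), which is what the figures in the example imply. The function and variable names are illustrative only.

```python
TENANCY_LIMIT_GBPS = 200.0  # maximum aggregate throughput per tenancy


def aggregate_gbps(capacity_tib, mb_per_s_per_tib):
    """Convert capacity (TiB) and performance tier (MB/s/TiB) to Gbps."""
    megabytes_per_second = capacity_tib * mb_per_s_per_tib
    return megabytes_per_second * 8 / 1000  # 8 bits per byte, 1000 Mb per Gb


# (capacity in TiB, performance tier in MB/s/TiB) for each planned file system
planned = [(72.8, 125), (41.6, 250)]

total = sum(aggregate_gbps(capacity, tier) for capacity, tier in planned)
remaining = TENANCY_LIMIT_GBPS - total

for capacity, tier in planned:
    print(f"{capacity} TiB at {tier} MB/s/TiB: {aggregate_gbps(capacity, tier):.1f} Gbps")
print(f"Total: {total:.1f} Gbps, remaining allotment: {remaining:.1f} Gbps")
```

Together the two file systems use 156 Gbps of the 200 Gbps allotment, leaving 44 Gbps for other file systems in the tenancy.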

Understanding Maintenance Schedules

File Storage with Lustre file systems require a maintenance schedule for patching and updates. A maintenance schedule has two components, DayOfWeek and StartTime. For example: MONDAY:04:00.
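
To make the DayOfWeek and StartTime components concrete, the following sketch parses a value such as MONDAY:04:00 and finds the next matching weekly slot after a given moment. It is only an illustration: the service itself determines the actual next planned maintenance, the DAY:HH:MM text form is inferred from the sample above, and treating StartTime as UTC is an assumption made here.

```python
from datetime import datetime, timedelta, timezone

DAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
        "FRIDAY", "SATURDAY", "SUNDAY"]


def parse_schedule(text):
    """Split 'MONDAY:04:00' into (weekday index, hour, minute)."""
    day, hour, minute = text.split(":")
    return DAYS.index(day.upper()), int(hour), int(minute)


def next_window(text, now):
    """Return the first weekly slot strictly after `now` (assumed UTC)."""
    weekday, hour, minute = parse_schedule(text)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the scheduled weekday (0 to 6 days ahead).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that slot has already passed this week, use next week's slot.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)  # a Wednesday
print(next_window("MONDAY:04:00", now))  # the following Monday at 04:00 UTC
```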

The service might use the weekly maintenance schedule to perform routine maintenance operations on the Lustre file system. Normally, only one of the weekly maintenance windows is used per month. Maintenance can take from 30 minutes to four hours for file systems that are smaller than 1 petabyte (PB). Larger file systems might take longer.

During an active maintenance operation, the file system remains available but can have degraded performance. In rare cases, the file system might become unavailable for part of the maintenance.

Here are some considerations when working with maintenance schedules:

  • You can specify a maintenance schedule when you first create the file system, and you can also update it later.
  • If you don't specify a maintenance schedule, the system assigns it for you. See Finding a File System's Maintenance Schedule.
  • Based on the maintenance schedule, the system determines the next planned maintenance, which is the date and time when the next maintenance starts.
  • If needed, you can override the next planned maintenance of your File Storage with Lustre file system and change it to a new date and time. The override doesn't affect the recurring maintenance schedule. See Overriding a File System's Next Planned Maintenance.
  • You can monitor maintenance activity using File Storage with Lustre metrics:
    • HoursUntilMaintenance indicates the number of hours remaining until the next scheduled maintenance. Configure an alarm or notification on this metric to alert you when the value falls below a chosen threshold, providing advance notice of upcoming maintenance.
    • MaintenanceActivity indicates whether maintenance is currently in progress for the file system. A value of 1 means maintenance has started and is ongoing; a value of 0 means maintenance has completed. No data points are emitted when no maintenance activity is occurring. Configure an alarm or notification on this metric to be informed when a maintenance cycle starts (value transitions to 1) and when it ends (value transitions to 0).

    For full details on available metrics and creating alarms, see File Storage with Lustre Metrics.
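
The MaintenanceActivity semantics described above (1 while maintenance is ongoing, 0 when it completes, no data points otherwise) can be sketched as a small transition detector. The input shape, a list of (timestamp, value) pairs, and the function name are hypothetical. In practice you would configure alarms in the monitoring service rather than process data points yourself.

```python
def maintenance_events(points):
    """Report start and completion transitions from (timestamp, value) pairs.

    `value` is 1 while maintenance is ongoing and 0 once it has completed.
    Periods with no maintenance activity produce no data points at all.
    """
    events = []
    previous = None
    for timestamp, value in points:
        if value == 1 and previous != 1:
            events.append((timestamp, "started"))
        elif value == 0 and previous == 1:
            events.append((timestamp, "completed"))
        previous = value
    return events


# Hypothetical data points from one maintenance cycle.
samples = [("04:00", 1), ("04:05", 1), ("04:10", 1), ("04:15", 0)]
print(maintenance_events(samples))
# [('04:00', 'started'), ('04:15', 'completed')]
```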

Applying Tags

Apply tags to resources to help organize them according to your business needs. You can apply tags when you create a resource, and you can update a resource later to add, revise, or remove tags. For general information about applying tags, see Resource Tags.

Monitoring File System Usage

To ensure optimal performance and avoid potential service disruptions, keep your File Storage with Lustre file system usage below 85% of capacity. When the file system becomes completely full, its volumes (OST/MDT) can run out of space, which might impact service availability and performance.

We recommend monitoring your space usage using the File Storage with Lustre metric FileSystemCapacity with the dimension available. Set up alarms based on this metric to proactively manage usage thresholds. For full details on available metrics and creating alarms, see File Storage with Lustre Metrics.
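
The 85% guideline translates into a simple check on available space. This sketch assumes that you know the file system's total capacity and that the available value is reported in the same unit (bytes here). Both are assumptions for illustration, as are the function names and sample numbers.

```python
USAGE_THRESHOLD_PERCENT = 85.0  # recommended ceiling from the guidance above


def usage_percent(total_bytes, available_bytes):
    """Return the percentage of capacity currently in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (total_bytes - available_bytes) / total_bytes


def needs_attention(total_bytes, available_bytes):
    """True when usage has reached or passed the recommended ceiling."""
    return usage_percent(total_bytes, available_bytes) >= USAGE_THRESHOLD_PERCENT


TIB = 1024 ** 4
total = 100 * TIB      # hypothetical file system capacity
available = 12 * TIB   # hypothetical value of the available dimension

print(f"Usage: {usage_percent(total, available):.1f}%")
print("Above recommended threshold:", needs_attention(total, available))
```

An equivalent alarm would fire when available space drops to 15% of total capacity or less.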