This explanation is based on AWS documentation and best practices but is paraphrased, not a literal extract.
The company wants to move from a manual EC2 and EBS-based workflow to a containerized application on Amazon EKS and automate data movement. The solution must:
Support automated transfer of raw and processed data.
Offer multiprotocol support.
Be directly usable from the EKS cluster as a mounted volume.
Minimize operational effort by using managed services where possible.
AWS DataSync is a managed service designed to move data between on-premises storage and AWS storage services or between AWS storage services. It can run transfers on a schedule or on demand with minimal operational overhead. For storage accessible from Amazon EKS, a shared file system that supports mounting as a volume is appropriate.
Amazon FSx for NetApp ONTAP provides a fully managed file system with multiprotocol support, including NFS and SMB, and supports features such as snapshots and storage efficiencies. Because it supports multiple protocols, it satisfies the requirement for multiprotocol access and can be mounted by applications running in Amazon EKS using standard Kubernetes persistent volume mechanisms.
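As a rough illustration of the "standard Kubernetes persistent volume mechanisms" mentioned above, the sketch below builds a static NFS PersistentVolume manifest as a Python dictionary and prints it as JSON, which kubectl accepts. It assumes the FSx for NetApp ONTAP volume is exported over NFS. The volume name, server DNS name, junction path, and size are hypothetical placeholders, not values from the scenario, and a CSI driver with dynamic provisioning would be an equally valid approach.

```python
import json


def nfs_persistent_volume(name, server, path, size="100Gi"):
    """Build a static NFS PersistentVolume manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # ReadWriteMany lets several pods share the same file system.
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


manifest = nfs_persistent_volume(
    name="analytics-data",                                        # hypothetical
    server="svm-example.fs-example.fsx.us-east-1.amazonaws.com",  # hypothetical
    path="/analytics",                                            # hypothetical junction path
)
print(json.dumps(manifest, indent=2))
```

A PersistentVolumeClaim bound to this volume could then be referenced from the pod specification of the analytics containers.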
In the correct solution (option C), DataSync is used to copy raw data from the on-premises environment to FSx for NetApp ONTAP. The FSx for NetApp ONTAP file system is then mounted as a volume in the EKS cluster, allowing the containerized analytics processing logic to read and write data directly. After processing, DataSync is again used to copy processed data from FSx for NetApp ONTAP to Amazon S3 for long-term storage. This leverages DataSync’s native integration with both FSx for NetApp ONTAP and Amazon S3, and avoids the need to run or manage custom upload tooling.
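The two-stage flow described above can be sketched as a pair of DataSync tasks. This is a minimal sketch, not a definitive implementation: the function takes any object with a create_task method shaped like the boto3 DataSync client, and the task names, ARNs, and schedule expression are hypothetical. The RecordingClient stand-in exists only so the sketch runs without AWS credentials.

```python
def create_transfer_tasks(datasync, onprem_arn, ontap_arn, s3_arn, schedule=None):
    """Create the ingest and export tasks and return their ARNs.

    `datasync` is assumed to behave like a boto3 DataSync client.
    """
    def make_task(name, source_arn, destination_arn):
        params = {
            "Name": name,
            "SourceLocationArn": source_arn,
            "DestinationLocationArn": destination_arn,
        }
        if schedule:
            # Optional schedule so the task runs without manual triggering.
            params["Schedule"] = {"ScheduleExpression": schedule}
        return datasync.create_task(**params)["TaskArn"]

    return {
        # Stage 1: raw data from on premises to FSx for NetApp ONTAP.
        "raw_ingest": make_task("raw-to-ontap", onprem_arn, ontap_arn),
        # Stage 2: processed data from FSx for NetApp ONTAP to Amazon S3.
        "processed_export": make_task("processed-to-s3", ontap_arn, s3_arn),
    }


class RecordingClient:
    """Stand-in client (hypothetical) that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def create_task(self, **kwargs):
        self.calls.append(kwargs)
        return {"TaskArn": f"arn:aws:datasync:example:task/{len(self.calls)}"}


client = RecordingClient()
tasks = create_transfer_tasks(
    client,
    onprem_arn="arn:aws:datasync:example:location/onprem",  # hypothetical
    ontap_arn="arn:aws:datasync:example:location/ontap",    # hypothetical
    s3_arn="arn:aws:datasync:example:location/s3",          # hypothetical
    schedule="rate(1 hour)",                                # hypothetical
)
print(tasks)
```

With a real client, the location ARNs would come from DataSync location resources created beforehand for the on-premises share, the FSx for NetApp ONTAP file system, and the S3 bucket.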
Option A uses Amazon EFS, which supports NFS but not SMB, so it does not meet the multiprotocol requirement. It also introduces AWS Transfer for SFTP for the processed data upload, which adds a separate managed endpoint and an SFTP-based flow, increasing complexity relative to using DataSync end-to-end.
Option B uses Amazon FSx for Lustre, which is optimized for high-performance compute workloads and integrates well with S3. However, it is not a multiprotocol file system: clients access it through the Lustre client rather than NFS or SMB, so it does not meet the stated multiprotocol requirement.
Option D uses FSx for NetApp ONTAP, which supports multiprotocol access, but relies on AWS Transfer for SFTP to move processed data to S3. While this can work, it adds another managed endpoint and requires SFTP client configuration and management. Using DataSync directly from FSx for NetApp ONTAP to Amazon S3 (as in option C) is more straightforward, better suited for automated large-scale transfers, and involves less operational overhead.
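The elimination of options A, B, and D can be sketched as a small requirements check. The option properties below are taken from this explanation only. The field names and the rejection wording are illustrative, and None marks a detail the explanation does not state.

```python
# Properties as described in the explanation; None marks an unstated detail.
OPTIONS = {
    "A": {"file_system": "Amazon EFS", "multiprotocol": False,
          "processed_upload": "AWS Transfer for SFTP"},
    "B": {"file_system": "Amazon FSx for Lustre", "multiprotocol": False,
          "processed_upload": None},
    "C": {"file_system": "Amazon FSx for NetApp ONTAP", "multiprotocol": True,
          "processed_upload": "AWS DataSync"},
    "D": {"file_system": "Amazon FSx for NetApp ONTAP", "multiprotocol": True,
          "processed_upload": "AWS Transfer for SFTP"},
}


def rejection_reasons(option):
    """Return the reasons an option falls short; an empty list means it passes."""
    reasons = []
    if not option["multiprotocol"]:
        reasons.append("file system is not multiprotocol")
    if option["processed_upload"] == "AWS Transfer for SFTP":
        reasons.append("SFTP endpoint adds operational overhead")
    return reasons


results = {name: rejection_reasons(opt) for name, opt in OPTIONS.items()}
selected = [name for name, reasons in results.items() if not reasons]

for name, reasons in results.items():
    print(name, "->", "; ".join(reasons) or "meets all requirements")
```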
Therefore, option C meets all the requirements with the least operational effort by using DataSync with FSx for NetApp ONTAP and S3.
References:
AWS documentation on AWS DataSync for automated, scheduled data transfers between on-premises storage, FSx file systems, and Amazon S3.
AWS documentation on Amazon FSx for NetApp ONTAP, including its multiprotocol support (NFS and SMB) and integration with Kubernetes and Amazon EKS.