Airflow architecture:
https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/overview.html
Airflow with Docker Compose (official docs):
https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html
Airflow in a distributed environment using Celery:
https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/celery.html
Yes, your idea of a master server plus worker nodes matches the distributed architecture that Apache Airflow supports. Airflow can run in a distributed mode where tasks execute on multiple worker nodes while the scheduler and web server run on a single node (or on separate nodes). This is made possible by the CeleryExecutor, which lets Airflow distribute task execution across multiple worker nodes.
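In `airflow.cfg` (or the equivalent `AIRFLOW__...` environment variables), enabling this mode boils down to selecting the executor and pointing Airflow at a broker and result backend. A minimal sketch, assuming a Redis broker and a Postgres result backend; the hostnames here are placeholders you would replace with your own:

```ini
[core]
executor = CeleryExecutor

[celery]
# Placeholder hostnames - point these at your actual broker and database.
broker_url = redis://redis-host:6379/0
result_backend = db+postgresql://airflow:airflow@db-host/airflow
```

The same three settings can also be supplied as environment variables (e.g. `AIRFLOW__CORE__EXECUTOR=CeleryExecutor`), which is common in Docker Compose setups.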
Your CPU server could serve as the Airflow master node, hosting the Airflow Webserver, the Scheduler, and optionally Celery Flower (a web-based tool for monitoring and administering Celery clusters). All of these components can run on the same machine if resources permit, or they can be distributed across several machines. The Webserver provides the UI, the Scheduler schedules tasks and sends them to the queue, and Flower monitors the Celery workers.
The GPU servers, meanwhile, would act as your worker nodes, executing the tasks sent from the master node.
Here is a simple way to think about the architecture:
CPU Server (Master Node):
Airflow Webserver: Provides the UI for Airflow.
Airflow Scheduler: Parses your DAGs, schedules ready tasks, and puts them in the message queue.
Celery Flower: (Optional) A tool for monitoring your Celery cluster.
Redis or RabbitMQ: These are popular choices for the message queue that holds tasks ready to be executed.
GPU Servers (Worker Nodes):
Celery Workers: Pull tasks from the message queue and execute them.
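The flow above can be sketched with Python's standard library: a plain in-process `Queue` stands in for Redis/RabbitMQ, and threads stand in for Celery workers on the GPU nodes. The names are illustrative, not Airflow's API:

```python
import queue
import threading

# A plain in-process queue stands in for Redis/RabbitMQ.
task_queue = queue.Queue()
results = []

def scheduler(tasks):
    """Master node: put ready tasks on the message queue."""
    for t in tasks:
        task_queue.put(t)

def worker(name):
    """Worker node: pull tasks from the queue and execute them."""
    while True:
        try:
            task = task_queue.get(timeout=0.5)
        except queue.Empty:
            return  # queue drained, worker exits
        results.append(f"{name} ran {task}")
        task_queue.task_done()

scheduler(["train_model", "evaluate_model", "export_weights"])
workers = [threading.Thread(target=worker, args=(f"gpu-{i}",)) for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(results))  # -> 3: every task ran, on whichever worker grabbed it first
```

The key point the sketch illustrates: the scheduler never talks to a worker directly, it only enqueues; any worker that can reach the queue can pick up work, which is exactly why the GPU servers can live on separate machines.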
Regarding your last question: yes, you can manage remote servers through a master server in Airflow. Tasks are serialized and sent to the worker nodes for execution via the message queue, so the worker nodes can live on any server the master node can reach, provided they are correctly configured and networked.
Just make sure that the following are properly configured:
Network accessibility between all nodes.
Same Airflow version and Python libraries across all nodes.
A common/shared file system accessible by all nodes, master and workers alike, so that every node can read the DAG files and any other files your tasks need.
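A quick way to check the version requirement is to run the same sanity check on every node and compare the output. This is a hedged sketch, not an official tool: it reports the Python version via the standard library and shells out to Airflow's real `airflow version` CLI command, falling back gracefully on machines where Airflow is not installed:

```python
import platform
import subprocess

def node_report():
    """Collect the version info that must match across all nodes."""
    report = {"python": platform.python_version()}
    try:
        # `airflow version` is Airflow's CLI command; it is absent on
        # machines where Airflow is not installed.
        out = subprocess.run(["airflow", "version"],
                             capture_output=True, text=True)
        report["airflow"] = out.stdout.strip()
    except FileNotFoundError:
        report["airflow"] = "not installed"
    return report

print(node_report())
```

Running this on the master and each worker and diffing the results catches version drift before it surfaces as confusing task failures.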