Hardening - Security
This section describes additional setup that would be needed to make Corridor secure. All the below are recommended but optional, and can be configured as needed.
Data Storage Security
Section titled “Data Storage Security”Corridor saves data in the following locations:
- Data Lake
- File Management System
- Metadata Database
- Jupyter Content Manager (For notebooks)
Any data stored in them should be encrypted and backups should be maintained as needed.
Network Security
Section titled “Network Security”The following network connections are created in Corridor, and should be secured:
- Web Application ↔︎ API Server: HTTPS connection
- API Server / API - Celery ↔︎ File Management: FTPS connection (if using FTP)
- API Server / API - Celery ↔︎ Metadata Database: SSL connections to Database
- Spark - Celery / Jupyter ↔︎ Spark: Kerberos
- Corridor Package ↔︎ API Server: HTTPS connection
- Jupyter ↔︎ Web Application: HTTPS connection
- End User ↔︎ Web Application / Jupyter: HTTPS / WSS connection
Securing each component
Section titled “Securing each component”This section describes the steps to follow for each of the Corridor components to ensure it can be accessed securely.
Common
Section titled “Common”- Ensure that all configuration and installation files are readable only by the user that the process is running with
Web Application Server
Section titled “Web Application Server”- Ensure that the application is served using a standard web server like Nginx or Apache httpd in front of the WSGI server
- Setup the secure HTTPS protocol at the WSGI Server or the Web Server using an SSL certificate
- When setting up HTTPS, also set the JWT_COOKIE_SECURE configuration to ensure JWT cookies are sent in a secure manner
- Set a strong and unique SECRET_KEY
- It should use a reliable Authentication Provider (Avoid using the inbuilt authentication provider)
API Server
Section titled “API Server”- Ensure that the application is served using a standard web server like Nginx or Apache httpd in front of the WSGI server
- Setup the secure HTTPS protocol at the WSGI Server or the Web Server using an SSL certificate
- Set a strong and unique SECRET_KEY
- Ensure API Keys are set to ensure only authorized access to the APIs
API - Celery worker
Section titled “API - Celery worker”No specific steps are required for the API - Celery workers as no other component connects to it directly.
Spark - Celery worker
Section titled “Spark - Celery worker”- Ensure that the cluster is Kerberized
- Ensure the standard security practices for Spark are followed as described in the Spark - Security documentation.
Jupyter Notebook
Section titled “Jupyter Notebook”- Ensure that the application is served using a standard web server like Nginx or Apache httpd in front of the WSGI server
- Setup the secure HTTPS protocol at the WSGI Server or the Web Server using an SSL certificate
- Ensure the standard security practices for Jupyter are followed as described in the Jupyter - Security documentation.
File Management
Section titled “File Management”For Local File System: Ensure that the Hard disk being used is encrypted.
For FTP: Ensure the FTPS protocol is being used and the underlying data is encrypted
Metadata Database (SQL RDBMS)
Section titled “Metadata Database (SQL RDBMS)”- Ensure that the connection to the SQL database is secured using any of the authentication methods available to the RDBMS.
- Ensure that the Database is encrypted.
- Ensure the standard security practices for the RDBS are followed as described in its documentation.
Authentication Provider
Section titled “Authentication Provider”- For LDAP: Ensure the secure LDAPS is used to create connections
- For SAML: Ensure a valid x509 certificate is used to authenticate messages being sent/received