System Design
Top-Level Design
The following diagram shows the top-level design of Artemis, which is decomposed into an application client (running as an Angular web app in the browser) and an application server (based on Spring Boot). For programming exercises, the application server connects to a version control system (VCS) and a continuous integration system (CIS). Authentication is handled by an external user management system (UMS).
While Artemis includes generic adapters to these three external systems with a defined protocol that can be instantiated to connect to any VCS, CIS, or UMS, it also provides three concrete implementations for these adapters to connect to:
VCS: Atlassian Bitbucket Server
CIS: Atlassian Bamboo Server
UMS: Atlassian JIRA Server (more specifically Atlassian Crowd on the JIRA Server)
Deployment
The following UML deployment diagram shows a typical deployment of the Artemis application server and application client. Student, Instructor, and Teaching Assistant (TA) computers are all equipped equally, with the Artemis application client running in the browser.
The Continuous Integration Server typically delegates the build jobs to local build agents within the university infrastructure or to remote build agents, e.g. hosted in the Amazon Cloud (AWS).
Data Model
The Artemis application server uses the following (simplified) data model in the MySQL database. It supports multiple courses with multiple exercises. Each student in the participating student group can participate in an exercise by clicking the Start Exercise button. A repository and a build plan are then created and configured for the student (User). The initialization state helps to track the progress of this complex operation and allows recovering from errors. A student can submit multiple solutions, either by committing and pushing changes to the provided example code to the version control system or by using the user interface. The continuous integration server automatically tests each submission and notifies the Artemis application server when a new result exists. In addition, teaching assistants can assess student solutions and “manually” create results.
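To make the structure more tangible, the following is a heavily simplified, hypothetical sketch of how the central entities could be modelled as JPA classes. The class names, field selection, and table name are assumptions for illustration and do not mirror the actual Artemis entities exactly.

```java
import jakarta.persistence.*;  // assuming a recent Spring Boot / Jakarta Persistence setup
import java.util.Set;

@Entity
class Course {
    @Id @GeneratedValue Long id;
    String title;
    @OneToMany(mappedBy = "course") Set<Exercise> exercises;
}

@Entity
class Exercise {
    @Id @GeneratedValue Long id;
    String title;
    @ManyToOne Course course;
    @OneToMany(mappedBy = "exercise") Set<Participation> participations;
}

// Tracks the progress of repository and build plan creation and allows recovering from errors.
enum InitializationState { UNINITIALIZED, REPO_COPIED, BUILD_PLAN_CONFIGURED, INITIALIZED }

@Entity
class Participation {
    @Id @GeneratedValue Long id;
    @ManyToOne Exercise exercise;
    @ManyToOne User student;
    @Enumerated(EnumType.STRING) InitializationState initializationState;
    @OneToMany(mappedBy = "participation") Set<Submission> submissions;
}

@Entity
class Submission {
    @Id @GeneratedValue Long id;
    @ManyToOne Participation participation;
    @OneToOne Result result;  // created automatically by the CIS or manually by a teaching assistant
}

@Entity
class Result {
    @Id @GeneratedValue Long id;
    Double score;
}

@Entity
@Table(name = "jhi_user")  // assumed table name; "user" is a reserved word in many databases
class User {
    @Id @GeneratedValue Long id;
    String login;
}
```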
Please note that the actual database model is more complex. The UML class diagram above omits some details for readability (e.g. lectures, student questions, exercise details, static code analysis, quiz questions, exam sessions, submission subclasses, etc.).
Server Architecture
The following UML component diagram shows more details of the Artemis application server architecture and its REST interfaces to the application client.
Integrated Code Lifecycle
Artemis supports an integrated version control system (VCS) and continuous integration system (CIS). If you use Integrated Code Lifecycle, the architecture differs from the architecture with external VC and CI systems. An exemplary deployment with Integrated Code Lifecycle (without using an external user management system), consisting of one main application server and three build agent servers, looks like this:
Employing the Integrated Code Lifecycle, administrators and developers can set up Artemis without the need for dedicated VCS and CIS installations. This architecture simplifies the setup process, reduces dependencies on external systems, and streamlines maintenance for both developers and administrators. Developers have fewer applications to run in parallel, which translates into lower system requirements. See Integrated Code Lifecycle Setup on how to set up a single-node environment for development purposes. TODO: Additional reference to the production setup.
Version Control Subsystem
The following diagram shows an overview of the components in the version control subsystem:
The Local VC Service implements the VersionControlService interface and thus contains the methods that the exercise management subsystem and the exercise participation subsystem need to interact with the VC system. For example, the createRepository() method creates a repository on the file system.
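As an illustration, a reduced sketch of what this could look like: the interface is trimmed down to a single method, the base path is an assumption, and the JGit call merely stands in for whatever the actual implementation does to initialise the repository.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;

// Trimmed-down sketch of the VCS abstraction; the real interface offers many more methods.
interface VersionControlService {
    void createRepository(String projectKey, String repositorySlug) throws IOException;
}

// File-system-backed implementation in the spirit of the Local VC Service.
class LocalVCService implements VersionControlService {

    private final Path localVCBasePath = Path.of("/opt/artemis/local-vcs-repos"); // assumed base path

    @Override
    public void createRepository(String projectKey, String repositorySlug) throws IOException {
        // With the integrated VCS, a repository is just a bare Git repository on the host file system.
        Path repositoryPath = localVCBasePath.resolve(projectKey).resolve(repositorySlug + ".git");
        Files.createDirectories(repositoryPath);
        try (Git ignored = Git.init().setBare(true).setDirectory(repositoryPath.toFile()).call()) {
            // Nothing else to do for this sketch.
        }
        catch (GitAPIException e) {
            throw new IOException("Could not create bare repository at " + repositoryPath, e);
        }
    }
}
```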
For users to be able to access the repositories using their integrated Git client, the integrated VC subsystem contains a Git Server component. It responds to fetch and push requests from Git clients, enabling instructors and students to interact with their repositories the way they are used to. It encompasses all the logic for implementing the Git HTTP protocol server-side. This includes extracting the command and parameters from the client request and executing the Git commands on the server-side repository, provided the repository exists and the user has the requisite permissions. It reads objects and refs from the repository, updates the repository for push requests, and formats the results of the Git commands it executes into a response that it sends back to the client. This could involve sending objects and refs to the client in a packfile, or transmitting error messages.
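Assuming the Git Server builds on JGit (a common choice on the JVM; the exact classes Artemis uses are not shown here), the two core smart-protocol operations could be served roughly like this. The surrounding HTTP handling, authentication, and authorization are omitted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.transport.ReceivePack;
import org.eclipse.jgit.transport.UploadPack;

// Sketch: serving the two Git smart-HTTP operations for an already resolved repository.
class GitProtocolHandlerSketch {

    // Handles fetch/clone: sends the requested objects and refs to the client as a packfile.
    void serveFetch(Repository repository, InputStream in, OutputStream out) throws IOException {
        UploadPack uploadPack = new UploadPack(repository);
        uploadPack.upload(in, out, null); // the third argument is an optional side-band message stream
    }

    // Handles push: receives the client's packfile, updates the refs, and reports the result.
    void servePush(Repository repository, InputStream in, OutputStream out) throws IOException {
        ReceivePack receivePack = new ReceivePack(repository);
        receivePack.receive(in, out, null);
    }
}
```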
The Git Server delegates all logic connected to Artemis to the Local VC Servlet Service. This service resolves the repository from the file system depending on the repository URI. It also handles user authentication (only Basic Auth for now) and authorization. For authorization (e.g. “is the requesting user the owner of the repository?”, “has the due date already passed?”), it uses the logic outsourced to the RepositoryAccessService that the existing online editor also uses. For push requests, the Local VC Servlet Service calls the processNewProgrammingSubmission() method of the Programming Submission Service to create a new submission and finally calls the integrated CI subsystem to trigger a new build.
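Put together, the push handling could be sketched as follows. Apart from processNewProgrammingSubmission() and triggerBuild, the interfaces and signatures below are simplified assumptions, not the actual Artemis code.

```java
// Minimal stand-ins so the sketch is self-contained; the real services have richer signatures.
interface RepositoryAccessService {
    void checkAccess(String repositoryUri, String username);
}

interface ProgrammingSubmissionService {
    long processNewProgrammingSubmission(String repositoryUri, String commitHash);
}

interface ContinuousIntegrationTriggerService {
    void triggerBuild(long submissionId);
}

// Hypothetical sketch of the push handling in the Local VC Servlet Service.
class LocalVCServletServiceSketch {

    private final RepositoryAccessService repositoryAccessService;
    private final ProgrammingSubmissionService programmingSubmissionService;
    private final ContinuousIntegrationTriggerService ciTriggerService;

    LocalVCServletServiceSketch(RepositoryAccessService repositoryAccessService,
            ProgrammingSubmissionService programmingSubmissionService,
            ContinuousIntegrationTriggerService ciTriggerService) {
        this.repositoryAccessService = repositoryAccessService;
        this.programmingSubmissionService = programmingSubmissionService;
        this.ciTriggerService = ciTriggerService;
    }

    void onPush(String repositoryUri, String username, String commitHash) {
        // 1. Authorization: is the user allowed to push to this repository right now?
        repositoryAccessService.checkAccess(repositoryUri, username);
        // 2. Create a new submission for the pushed commit.
        long submissionId = programmingSubmissionService.processNewProgrammingSubmission(repositoryUri, commitHash);
        // 3. Trigger a build for the new submission on the integrated CI subsystem.
        ciTriggerService.triggerBuild(submissionId);
    }
}
```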
Integrating the VC system into the Artemis server application improves performance. For instance, when an instructor creates a new programming exercise, Artemis needs to copy the template source code to the template repository. Using the integrated VCS, Artemis merely needs to communicate with the host file system, copying the files from one location in the file system to another, which is faster than communicating with the external VCS through the network.
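For example, copying a template repository with the integrated VCS essentially reduces to a recursive file-system copy. The following is a simplified sketch using java.nio; the actual implementation handles bare repositories, locking, and error cases more carefully.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class RepositoryCopyUtil {

    // Recursively copies a template repository to the new repository location on the same file system.
    static void copyRepository(Path templateRepository, Path targetRepository) throws IOException {
        try (Stream<Path> paths = Files.walk(templateRepository)) {
            for (Path source : (Iterable<Path>) paths::iterator) {
                Path target = targetRepository.resolve(templateRepository.relativize(source));
                if (Files.isDirectory(source)) {
                    Files.createDirectories(target);
                }
                else {
                    Files.copy(source, target);
                }
            }
        }
    }
}
```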
Continuous Integration Subsystem
The following diagram shows an overview of the components in the integrated continuous integration subsystem:
The integrated CIS consists of two further subsystems: the CI Management and the Build Agent. Both subsystems are decoupled and can, but do not have to, be deployed on separate servers. This allows for flexible scaling of the system, as we can deploy multiple build agent instances to handle a high number of build jobs.
CI Management
The following diagram shows an overview of the components in the CI Management subsystem:
The CI Management prepares the information for build jobs and adds them to the distributed Hazelcast queue. It has complete access to the distributed data structures related to the CI system and provides endpoints so users can interact with these data structures, such as viewing and cancelling build jobs. It also receives the build job results, grades them, and notifies the user. The CI Management has access to the database and the file system.
The CI Management subsystem implements the ContinuousIntegrationTriggerService interface in the LocalCITriggerService, which provides the triggerBuild method. This method gets called whenever a repository needs to be tested, i.e. after creating a programming exercise or when a student submits code.
When the triggerBuild method is called, all information necessary to execute the build job is prepared and used to create a LocalCIBuildJobQueueItem object. The object contains, among other things, the repository URIs, the build configuration, a user-defined build script (prepared by the LocalCIScriptService), and a priority value.
This object is then added to the job queue, from where a build agent retrieves it to execute the build job. The following diagram shows the structure of the LocalCIBuildJobQueueItem:
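For illustration, the queue item could be modelled roughly as follows. The field selection, the queue name, and the record form are simplified assumptions rather than the exact structure shown in the diagram.

```java
import java.io.Serializable;
import java.time.ZonedDateTime;
import com.hazelcast.collection.IQueue;
import com.hazelcast.core.HazelcastInstance;

// Simplified sketch of the data carried by a queued build job.
record LocalCIBuildJobQueueItem(
        long participationId,
        String assignmentRepositoryUri,
        String testRepositoryUri,
        String buildScript,          // user-defined build script, prepared by the LocalCIScriptService
        int priority,
        ZonedDateTime submissionDate) implements Serializable {
}

class BuildJobQueueExample {

    // Adding a prepared job to the distributed Hazelcast queue, from where a build agent picks it up.
    static void enqueue(HazelcastInstance hazelcastInstance, LocalCIBuildJobQueueItem item) {
        IQueue<LocalCIBuildJobQueueItem> buildJobQueue = hazelcastInstance.getQueue("buildJobQueue");
        buildJobQueue.add(item);
    }
}
```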
The CI Management subsystem consists of two additional services: the SharedQueueManagementService and the LocalCIResultProcessingService.
The SharedQueueManagementService has direct access to the job queue as well as to other Hazelcast data structures: a map for currently running build jobs, a map for build agent information, and a topic for cancelled build jobs.
The service provides the functionality for an Artemis user to interact with build jobs and build agents. Build jobs can be viewed and cancelled; build agents can currently only be viewed.
The user can access this functionality in the UI over a set of endpoints provided by a REST API.
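A reduced sketch of how the service could access these distributed data structures follows. The structure names, element types, and the cancellation logic are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import com.hazelcast.collection.IQueue;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.topic.ITopic;

// Sketch of the distributed data structures behind the SharedQueueManagementService.
class SharedQueueManagementServiceSketch {

    private final IQueue<String> queuedJobs;           // queued build job ids (simplified to Strings here)
    private final IMap<String, String> processingJobs; // build job id -> build agent currently processing it
    private final IMap<String, String> buildAgents;    // build agent name -> agent information
    private final ITopic<String> canceledBuildJobsTopic;

    SharedQueueManagementServiceSketch(HazelcastInstance hazelcastInstance) {
        this.queuedJobs = hazelcastInstance.getQueue("buildJobQueue");
        this.processingJobs = hazelcastInstance.getMap("processingJobs");
        this.buildAgents = hazelcastInstance.getMap("buildAgentInformation");
        this.canceledBuildJobsTopic = hazelcastInstance.getTopic("canceledBuildJobsTopic");
    }

    // Viewing: a REST endpoint could expose the currently queued jobs like this.
    List<String> getQueuedJobs() {
        return new ArrayList<>(queuedJobs);
    }

    // Cancelling: remove the job from the queue if it is still queued; otherwise notify the build
    // agents via the topic so the agent running the job can abort it.
    void cancelBuildJob(String buildJobId) {
        boolean removed = queuedJobs.removeIf(jobId -> jobId.equals(buildJobId));
        if (!removed) {
            canceledBuildJobsTopic.publish(buildJobId);
        }
    }
}
```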
The LocalCIResultProcessingService retrieves the build job results that the build agents generated from the result queue. It is responsible for grading the build job results, notifying the user, and persisting information on the build job execution in the database.
Build Agent
The following diagram shows an overview of the components in the Build Agent subsystem:
The build agent is a separate subsystem that is responsible for executing build jobs. It can be run as a standalone application or as part of the main Artemis application. The build agent implements multiple services responsible for retrieving queued build jobs and executing them.
The SharedQueueProcessingService has direct access to the job queue and detects newly added build jobs. A job is then taken from the queue if the build agent currently has the capacity to execute it.
The service then makes an asynchronous method call to the BuildJobManagementService that eventually results in either a LocalCIBuildJobResult or an exception if something went wrong during the build job processing.
Either way, a ResultQueueItem object containing all necessary information about the build job execution is created and added to the result queue.
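Under these assumptions, the agent-side processing could look roughly like this. The queue names, the capacity check, and the String-based job and result types are simplifications, not the actual Artemis types.

```java
import java.util.concurrent.CompletableFuture;
import com.hazelcast.collection.IQueue;
import com.hazelcast.core.HazelcastInstance;

// Simplified sketch of how a build agent could take jobs from the shared queue and report results.
class SharedQueueProcessingServiceSketch {

    private final IQueue<String> buildJobQueue;   // queued jobs, simplified to job ids
    private final IQueue<String> resultQueue;     // results or error messages, simplified to Strings
    private final BuildJobManagementServiceSketch buildJobManagementService;
    private final int maxConcurrentBuildJobs = 2; // assumed capacity of this agent
    private int runningBuildJobs = 0;

    SharedQueueProcessingServiceSketch(HazelcastInstance hazelcastInstance,
            BuildJobManagementServiceSketch buildJobManagementService) {
        this.buildJobQueue = hazelcastInstance.getQueue("buildJobQueue");
        this.resultQueue = hazelcastInstance.getQueue("buildResultQueue");
        this.buildJobManagementService = buildJobManagementService;
    }

    // Called when the agent notices a newly added job (e.g. via a Hazelcast item listener).
    synchronized void checkAvailabilityAndProcessNextBuild() {
        if (runningBuildJobs >= maxConcurrentBuildJobs) {
            return; // no capacity, leave the job in the queue for another agent
        }
        String buildJobId = buildJobQueue.poll();
        if (buildJobId == null) {
            return;
        }
        runningBuildJobs++;
        // Asynchronous execution: the result or the error message is put into the result queue either way.
        CompletableFuture<String> futureResult = buildJobManagementService.executeBuildJob(buildJobId);
        futureResult.whenComplete((result, exception) -> {
            synchronized (this) {
                runningBuildJobs--;
            }
            resultQueue.add(exception == null ? result : "Build failed: " + exception.getMessage());
        });
    }
}

// Minimal stub so the sketch is self-contained; the real service is described below.
class BuildJobManagementServiceSketch {
    CompletableFuture<String> executeBuildJob(String buildJobId) {
        return CompletableFuture.completedFuture("result of " + buildJobId);
    }
}
```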
The BuildJobManagementService contains the logic for managing build jobs. It prepares a build task in the form of a lambda function and submits this task to the ExecutorService.
The ExecutorService encapsulates the low-level logic for handling the queue and the concurrency when running multiple build jobs on the build agent at a time. As soon as a build job finishes, the ExecutorService returns the result of the task execution to the BuildJobManagementService.
The ExecutorService makes sure that errors happening during the build job execution are propagated to the BuildJobManagementService, so it can handle all errors in one spot.
To improve the reliability of the system, the BuildJobManagementService implements a timeout mechanism. Administrators can configure the maximum amount of time that build jobs can run by setting the artemis.continuous-integration.timeout-seconds environment variable; the default value is 120 seconds.
If a build job times out, the BuildJobManagementService interrupts the build job. This is crucial to prevent jobs that require an abnormally high amount of time from clogging up the system and reducing overall system performance.
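A minimal sketch of such a timeout mechanism, assuming the build task is submitted as a Callable to a standard java.util.concurrent executor; apart from the property name and the 120-second default mentioned above, the details are illustrative.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BuildJobTimeoutSketch {

    // Corresponds to artemis.continuous-integration.timeout-seconds; 120 seconds by default.
    private final int timeoutSeconds = 120;

    private final ExecutorService executorService = Executors.newFixedThreadPool(2);

    String runWithTimeout(Callable<String> buildTask) throws Exception {
        Future<String> future = executorService.submit(buildTask);
        try {
            // Wait at most timeoutSeconds for the build job to finish.
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        }
        catch (TimeoutException e) {
            // Interrupt the build job so it cannot clog up the build agent.
            future.cancel(true);
            throw new IllegalStateException("Build job timed out after " + timeoutSeconds + " seconds", e);
        }
    }
}
```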
The BuildJobExecutionService has the method runBuildJob, which contains the actual logic for executing a build job.
A basic build job for the purpose of providing automated assessment in Artemis consists of the following steps (sketched in code after the list):
Check out the relevant repositories.
Configure the Docker container.
Start a Docker container for the build job.
Copy the repositories into the container.
Execute the build script in the container.
Retrieve the test results from the container.
Stop the container.
Parse the test results.
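The following sketch mirrors this sequence of steps. The ContainerService interface and its methods are hypothetical stand-ins for the BuildJobContainerService and the Docker API calls it wraps; result parsing is omitted.

```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical container service; the real BuildJobContainerService uses the Docker API and has
// different signatures. This interface only mirrors the sequence of steps listed above.
interface ContainerService {
    String startContainer(String dockerImage);
    void copyIntoContainer(String containerId, Path hostDirectory, String containerPath);
    void executeBuildScript(String containerId, String buildScript);
    String retrieveTestResults(String containerId);
    void stopContainer(String containerId);
}

class BuildJobExecutionSketch {

    private final ContainerService containerService;

    BuildJobExecutionSketch(ContainerService containerService) {
        this.containerService = containerService;
    }

    List<String> runBuildJob(Path assignmentRepository, Path testRepository, String dockerImage, String buildScript) {
        // 1.-3. The repositories are already checked out on disk here; start a container for this job.
        String containerId = containerService.startContainer(dockerImage);
        try {
            // 4. Copy the repositories into the container.
            containerService.copyIntoContainer(containerId, assignmentRepository, "/var/build/assignment");
            containerService.copyIntoContainer(containerId, testRepository, "/var/build/tests");
            // 5. Execute the user-defined build script in the container.
            containerService.executeBuildScript(containerId, buildScript);
            // 6. and 8. Retrieve and parse the test results (parsing omitted in this sketch).
            String rawResults = containerService.retrieveTestResults(containerId);
            return List.of(rawResults);
        }
        finally {
            // 7. Stop and discard the ephemeral container, regardless of success or failure.
            containerService.stopContainer(containerId);
        }
    }
}
```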
To address potential security risks associated with executing student code during automated assessment, we run the build job in a container that the BuildJobContainerService creates and starts just for this purpose. This container functions as an isolated environment: if a student submits potentially malicious code, the container confines its execution, preventing it from directly affecting the host system or other containers. The ephemeral nature of Docker containers allows the BuildJobExecutionService to quickly remove them and the data they produced during the build when a build job finishes.
Finally, when the build ran through successfully, the SharedQueueProcessingService puts the build result into the result queue so it can then be processed by the CI Management. If there were any errors, the BuildJobManagementService stops the Docker container and the SharedQueueProcessingService relays the exception message to the CI Management via the result queue.