Virtual File System: 2 of 3
This is the second in a three-part series about the virtual file system (VFS) that underlies the Codenvy platform.
Codenvy’s platform is different from most cloud IDEs. A developer’s workspace is virtualized across different physical resources that are used to service different IDE functions. Dependency management, build, runners and code assistants can execute on different clusters of physical nodes. In order to properly virtualize access to all of these resources, we needed to implement a VFS that underpins the services and physical resources but also has a native understanding of IDE behavior.
In our previous post, we explored our VFS requirements, how we implement it, user access and what the organizational structure looks like. In this post, we’ll cover how a VFS can help with managing data, including access, modification and manipulation. We’ll also talk about downloading and uploading information.
JSON-based Virtual File System
All file discovery, loading and access take place over a custom API. This API uses JSON to pass parameters back and forth between IDE clients. Compare the file access in a cloud IDE to that of a desktop IDE. In a desktop IDE, the application has local access to the disk drive and uses native commands to manipulate the files and defers to the operating system to provide critical functions around locking, seeking and other forms of access.
But in a cloud environment, there are many IDEs operating simultaneously distributed across a number of physical nodes. The IDEs are coordinating using a set of code assistants, builders and runners that are also on distributed nodes. A workspace may be accessed by multiple developers simultaneously, running in different IDEs, also on different nodes. The role of a VFS, then, is not only to provide access to the files, but also to provide distributed, controlled access to the files.
By using a RESTful API with JSON, we are able to standardize the techniques used by different types of clients, whether those clients are running within our infrastructure or directly accessed by a browser. We needed to take the core operating system functions relating to file manipulation and access and package them up into this format.
The rest of this article goes into some of the API structure details related to navigating the tree, identifying special nodes (e.g., Projects), handling file modification tactics such as copy/remove, modifying the core contents of a file and downloading/uploading a file.
Codenvy's VFS API provides methods for:
Navigating the resource tree step by step
Accessing a particular resource directly by using its unique identifier (UID) or Path
You can access data via its root Folder.
Descriptions of child resources can be retrieved in JSON format by sending a GET request to the <vfsURI>/children/{UID} method. The request may include:
Pagination parameters, such as the number of items to return and the index of the first item, for convenient output
An optional filter to retrieve resources of a particular type (File, Folder or Project)
An optional filter to retrieve resources with particular properties
An optional filter to retrieve resources based on whether they contain permission info
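As a rough illustration of the options listed above, the following Python sketch builds the URL for a children request. The article does not name the query parameters, so `maxItems`, `skipCount`, `itemType`, `propertyFilter` and `includePermissions` are hypothetical placeholders, as are the example host and UID. Percent-encoding the UID is also an assumption.

```python
from urllib.parse import quote, urlencode


def children_url(vfs_uri, folder_uid, max_items=None, skip_count=None,
                 item_type=None, property_filter=None,
                 include_permissions=None):
    """Build a GET URL for <vfsURI>/children/{UID}.

    All query parameter names below are hypothetical; the real API
    may spell them differently.
    """
    params = {
        "maxItems": max_items,                      # page size (assumed name)
        "skipCount": skip_count,                    # index of first item (assumed name)
        "itemType": item_type,                      # File, Folder or Project filter
        "propertyFilter": property_filter,          # properties filter
        "includePermissions": include_permissions,  # permission info filter
    }
    # Only send the options the caller actually set.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{vfs_uri.rstrip('/')}/children/{quote(folder_uid, safe='')}"
    return f"{url}?{query}" if query else url


# Third page of 20 files in a folder with the made-up UID "abc123".
print(children_url("https://example.com/vfs", "abc123",
                   max_items=20, skip_count=40, item_type="FILE"))
```

Leaving every option unset yields the bare `<vfsURI>/children/{UID}` URL, which matches the idea that all of the filters are optional.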
Descriptions of all descendants, for the entire structure or for an individual subtree, can be retrieved in JSON format by sending a GET request to the <vfsURI>/tree/{UID} method. The request may include:
The depth to which children should be discovered. A value of -1 retrieves all children at all levels
An optional filter to retrieve resources with particular properties
An optional filter to retrieve resources based on whether they contain permission info
A single resource can be accessed by its UID or Path by calling either the <vfsURI>/item/{UID} or the <vfsURI>/itembypath/{Path:.*} method. Both methods accept:
An optional filter to retrieve resources with particular properties
An optional filter to retrieve resources based on whether they contain permission info
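The two addressing styles differ mainly in how the identifier is placed in the URL. The sketch below is one possible way to build both forms in Python. It assumes that a UID is an opaque token that should be fully percent-encoded, while a Path keeps its `/` separators. Neither assumption is confirmed by the API description, and the host name is a placeholder.

```python
from urllib.parse import quote


def item_url(vfs_uri, uid):
    """URL for <vfsURI>/item/{UID}; the UID is treated as an opaque token."""
    return f"{vfs_uri.rstrip('/')}/item/{quote(uid, safe='')}"


def item_by_path_url(vfs_uri, path):
    """URL for <vfsURI>/itembypath/{Path:.*}.

    '/' separators are kept; other unsafe characters (spaces, for
    example) are percent-encoded.
    """
    return f"{vfs_uri.rstrip('/')}/itembypath/{quote(path.lstrip('/'))}"


print(item_url("https://example.com/vfs", "abc123"))
print(item_by_path_url("https://example.com/vfs", "/folder01/My Doc.txt"))
```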
Here is an example of a JSON response for a single resource description as part of a response to any of the methods mentioned above:
{
"id":"/folder01/DOCUMENT01.txt",
"path":"/folder01/DOCUMENT01.txt",
"creationDate":1292574268440,
"contentType":"text/plain",
"lastModificationDate":1292574268440
}
Like the resource description, the file content can be obtained with a GET request to either the <vfsURI>/content/{UID} method or the <vfsURI>/contentbypath/{Path:.*} method. Each of them will return the requested file content in the response body with an appropriate content type header.
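On the client side, a resource description like the sample above can be parsed with any JSON library. The Python sketch below reuses the sample's field names and assumes, based only on the magnitude of the values, that the two date fields are milliseconds since the Unix epoch. The `parse_item` helper is illustrative and not part of the VFS API.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_item(raw):
    """Parse a resource description and convert its timestamps.

    Assumption: creationDate and lastModificationDate are expressed
    in milliseconds since the Unix epoch.
    """
    item = json.loads(raw)
    for key in ("creationDate", "lastModificationDate"):
        if key in item:
            item[key] = EPOCH + timedelta(milliseconds=item[key])
    return item


sample = """{
  "id": "/folder01/DOCUMENT01.txt",
  "path": "/folder01/DOCUMENT01.txt",
  "creationDate": 1292574268440,
  "contentType": "text/plain",
  "lastModificationDate": 1292574268440
}"""

item = parse_item(sample)
print(item["path"], item["contentType"], item["creationDate"].isoformat())
```

The `contentType` field of the description can then be compared with the content type header returned by the content methods.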
Data Modification Methods
There are three resource types:
Files, which can be categorized differently and which have bodies with useful (indexable, searchable) content
Folders, the standard structure unit
Projects, a special type of Folder with a set of properties that help identify that project’s nature, appropriate actions, views, etc.
This hierarchical organization manages the VFS’s structure in the following ways:
Only Projects are allowed to be top-level resources (i.e., have the workspace’s root folder as a parent).
Projects may have Files, Folders or other Projects (for multi-module Projects) as child resources.
Folders may have Files or Folders as child resources.
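These containment rules are compact enough to express as a lookup table. The Python sketch below is only an illustration of the rules as stated. The type names and the `can_contain` helper are made up for this example, and the rule that Files have no children is implied rather than stated.

```python
ROOT, PROJECT, FOLDER, FILE = "ROOT", "PROJECT", "FOLDER", "FILE"

# Which child types each parent type may hold, per the rules above.
ALLOWED_CHILDREN = {
    ROOT: {PROJECT},                   # only Projects at the top level
    PROJECT: {FILE, FOLDER, PROJECT},  # nested Projects for multi-module Projects
    FOLDER: {FILE, FOLDER},
    FILE: set(),                       # assumed: Files cannot have children
}


def can_contain(parent_type, child_type):
    """Return True if a parent of parent_type may hold a child of child_type."""
    return child_type in ALLOWED_CHILDREN.get(parent_type, set())


print(can_contain(ROOT, FOLDER))      # a plain Folder directly under the root
print(can_contain(PROJECT, PROJECT))  # a module inside a multi-module Project
```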
To launch data modification methods, send a POST request to the following methods in order:
Doing this will return a JSON description of the newly created resource. All the methods should also return the name of the new item. For a File, the client should pass the content type (MIME) of the file being created and, optionally, its initial content. Additionally, the client may pass Project properties when creating a Project.
To make specific modifications, send a POST request to the following methods:
<vfsURI>/delete/{resourceUID} to delete a resource
<vfsURI>/content/{fileUID} to update files
<vfsURI>/item/{resourceUID} to update resource properties
New content and JSON-serialized property sets are passed in the request’s body. If the VFS supports Locking and the item is locked, use lockToken to unlock it.
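The three modification calls can be sketched with Python's standard `urllib.request.Request`. The requests are only constructed here, not sent. Passing `lockToken` as a query parameter is an assumption, as are the shape of the property set and the example host and UIDs.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def _url(vfs_uri, method, uid, lock_token=None):
    url = f"{vfs_uri.rstrip('/')}/{method}/{quote(uid, safe='')}"
    if lock_token is not None:
        # Assumption: the lock token travels as a query parameter.
        url += "?" + urlencode({"lockToken": lock_token})
    return url


def delete_request(vfs_uri, uid, lock_token=None):
    """POST <vfsURI>/delete/{resourceUID}."""
    return Request(_url(vfs_uri, "delete", uid, lock_token),
                   data=b"", method="POST")


def update_content_request(vfs_uri, file_uid, content, content_type,
                           lock_token=None):
    """POST <vfsURI>/content/{fileUID} with the new content as the body."""
    return Request(_url(vfs_uri, "content", file_uid, lock_token),
                   data=content, headers={"Content-Type": content_type},
                   method="POST")


def update_properties_request(vfs_uri, uid, properties, lock_token=None):
    """POST <vfsURI>/item/{resourceUID} with a JSON-serialized property set."""
    body = json.dumps(properties).encode("utf-8")
    return Request(_url(vfs_uri, "item", uid, lock_token),
                   data=body, headers={"Content-Type": "application/json"},
                   method="POST")


req = update_content_request("https://example.com/vfs", "f1",
                             b"hello", "text/plain", lock_token="tok")
print(req.get_method(), req.full_url)
```

A request built this way could be sent with `urllib.request.urlopen(req)` against a real VFS endpoint.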
Data Manipulation Methods (Copy, Move, Rename)
To copy or move a resource identified by {resourceUID}, send a POST request to <vfsURI>/copy/{resourceUID} or <vfsURI>/move/{resourceUID}. In either case, the UID of the new parent should be passed as a query parameter; for the move method, the lockToken should be supplied as well if the item is locked (as described above). To rename a resource or to change a File’s content type, send a POST request to <vfsURI>/rename/{resourceUID} with the new resource name, the new content type and the lock token.
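The following Python sketch shows how the copy, move and rename URLs might be assembled. Only `lockToken` is named in the text; the other query parameter names (`parentId`, `newname`, `mediaType`) are hypothetical stand-ins, and the host and UIDs are placeholders.

```python
from urllib.parse import quote, urlencode


def _build(vfs_uri, method, uid, **params):
    # Skip parameters the caller left unset.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{vfs_uri.rstrip('/')}/{method}/{quote(uid, safe='')}"
    return f"{url}?{query}" if query else url


def copy_url(vfs_uri, uid, parent_uid):
    """POST target for copying a resource under a new parent."""
    return _build(vfs_uri, "copy", uid, parentId=parent_uid)


def move_url(vfs_uri, uid, parent_uid, lock_token=None):
    """POST target for moving a resource; the lock token is optional."""
    return _build(vfs_uri, "move", uid,
                  parentId=parent_uid, lockToken=lock_token)


def rename_url(vfs_uri, uid, new_name=None, new_media_type=None,
               lock_token=None):
    """POST target for renaming a resource or changing its content type."""
    return _build(vfs_uri, "rename", uid, newname=new_name,
                  mediaType=new_media_type, lockToken=lock_token)


print(move_url("https://example.com/vfs", "f1", "p2", lock_token="tok"))
print(rename_url("https://example.com/vfs", "f1", new_name="Main.java",
                 new_media_type="text/x-java"))
```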
Mass Update Methods (Downloading/Uploading)
There are four methods (two complementary pairs) for uploading and downloading zipped trees of resources. The first pair is mostly used by simple HTTP clients. These methods pass application/zip content back and forth using this sequence:
Call the GET method <vfsURI>/export/{folderUID} to download a zipped resource tree.
Call the POST method <vfsURI>/import/{parentUID} to upload a zipped resource tree to the parent Folder (or Project).
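For a simple HTTP client, the import half of this pair amounts to posting a zip archive as the request body. The Python sketch below packs a small tree in memory and prepares, without sending, a POST to the import method. The helper names, host and parent UID are made up for the example, and the archive layout the server expects is not specified in this article.

```python
import io
import zipfile
from urllib.request import Request


def zip_tree(files):
    """Pack {relative_path: bytes} into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, data in files.items():
            archive.writestr(path, data)
    return buffer.getvalue()


def unzip_tree(zip_bytes):
    """Unpack an archive (e.g., an export response body) into a dict."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


def import_request(vfs_uri, parent_uid, zip_bytes):
    """POST <vfsURI>/import/{parentUID} with application/zip content."""
    return Request(f"{vfs_uri.rstrip('/')}/import/{parent_uid}",
                   data=zip_bytes,
                   headers={"Content-Type": "application/zip"},
                   method="POST")


tree = {"src/Main.java": b"class Main {}", "README.txt": b"demo"}
payload = zip_tree(tree)
request = import_request("https://example.com/vfs", "p1", payload)
print(request.get_method(), request.full_url, len(payload), "bytes")
```

The same `unzip_tree` helper could be applied to the body returned by the export method.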
The second pair is mostly used in web browsers and follows this sequence:
Call the GET method <vfsURI>/downloadzip/{folderUID} to download the zipped folder. As with the first pair, the application/zip content is in the response body. However, the response must contain a 'Content-Disposition' header to force the browser to save the file.
Call the POST method <vfsURI>/uploadzip/{folderUID} to upload the zipped folder. The content is expected to be sent from an HTML form, so the zip content is part of a 'multipart/form-data' request.
There are also methods for uploading and downloading file content:
Call the GET method <vfsURI>/downloadfile/{fileUID} to download file content. To force the web browser to save the file, the response must contain a 'Content-Disposition' header.
Call the POST method <vfsURI>/uploadfile/{fileUID} to upload file content. The file’s content is part of a 'multipart/form-data' request (e.g., content sent from an HTML form).
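A browser builds the 'multipart/form-data' body automatically when a form is submitted, but it can help to see what such a body looks like. The Python sketch below assembles a single-file body by hand. The form field name `file` is a guess, since the article does not say which field name the upload methods expect, and file names containing quotes are not escaped here.

```python
import uuid


def multipart_file_body(field_name, filename, content, content_type):
    """Return (body, content_type_header) for a one-file form upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


# "file" is a hypothetical field name.
body, header = multipart_file_body("file", "notes.txt", b"hello", "text/plain")
print(header)
print(body.decode("utf-8"))
```

The returned header value would be sent as the request's Content-Type when posting the body to the uploadfile or uploadzip method.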
In the next article, we'll talk about how to use search in the workspace, along with access control, locking, versioning and observation within the VFS.